Sangeetha Menon · Saurabh Todariya · Tilak Agerwala Editors
AI, Consciousness and The New Humanism Fundamental Reflections on Minds and Machines
Editors Sangeetha Menon NIAS Consciousness Studies Programme National Institute of Advanced Studies, Indian Institute of Science campus Bengaluru, India
Saurabh Todariya Human Sciences Research Group International Institute of Information Technology Hyderabad, India
Tilak Agerwala National Institute of Advanced Studies Indian Institute of Science campus Bengaluru, India
ISBN 978-981-97-0502-3    ISBN 978-981-97-0503-0 (eBook)
https://doi.org/10.1007/978-981-97-0503-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Paper in this product is recyclable.
Contents
1 Fundamental Reflections on Minds and Machines . . . . . . . . . . 1
  Sangeetha Menon, Saurabh Todariya, and Tilak Agerwala
2 An Open Dialogue Between Neuromusicology and Computational Modelling Methods . . . . . . . . . . 11
  Sujas Bhardwaj, Kaustuv Kanti Ganguli, and Shantala Hegde
3 Testing for Causality in Artificial Intelligence (AI) . . . . . . . . . . 37
  Nithin Nagaraj
4 Artificial Intelligence: A Case for Ethical Design and Multidisciplinarity . . . . . . . . . . 55
  Tilak Agerwala
5 Advaita Ethics for the Machine Age: The Pursuit of Happiness in an Interconnected World . . . . . . . . . . 75
  Swami Bodhananda
6 Singularity Beyond Silicon Valley: The Transmission of AI Values in a Global Context . . . . . . . . . . 93
  Robert Geraci
7 Healthcare Artificial Intelligence in India and Ethical Aspects . . . . . . . . . . 107
  Avik Sarkar, Poorva Singh, and Mayuri Varkey
8 Human Learning and Machine Learning: Unfolding from Creativity Perspective . . . . . . . . . . 151
  Parag Kulkarni and L. M. Patnaik
9 Learning Agility: The Journey from Self-Awareness to Self-Immersion . . . . . . . . . . 175
  Madhurima Das
10 Mind-Reading Machines: Promises, Pitfalls, and Solutions of Implementing Machine Learning in Mental Health . . . . . . . . . . 197
  Urvakhsh Meherwan Mehta, Kiran Basawaraj Bagali, and Sriharshasai Kommanapalli
11 AI-Based Technological Interventions for Tackling Child Malnutrition . . . . . . . . . . 221
  Bita Afsharinia, B. R. Naveen, and Anjula Gurtoo
12 Autonomous Weapon System: Debating Legal–Ethical Consideration and Meaningful Human Control Challenges in the Military Environment . . . . . . . . . . 243
  Prakash Panneerselvam
13 Artificial Intelligence and War: Understanding Their Convergence and the Resulting Complexities in the Military Decision-Making Process . . . . . . . . . . 259
  Prasanth Balakrishnan Nair
14 Converging Approach to Intelligence: Decision-Making Systems in Artificial Intelligence and Reflections on Human Intelligence . . . . . . . . . . 273
  Sarita Tamang and Ravindra Mahilal Singh
15 Expanding Cognition: The Plasticity of Thought . . . . . . . . . . 295
  Clayton Crockett
16 The World as Affordances: Phenomenology and Embeddedness in Heidegger and AI . . . . . . . . . . 307
  Saurabh Todariya
17 Investigating the Ontology of AI vis-à-vis Technical Artefacts . . . . . . . . . . 319
  Ashwin Jayanti
18 Being “LaMDA” and the Person of the Self in AI . . . . . . . . . . 331
  Sangeetha Menon
About the Editors
Sangeetha Menon is Dean, School of Humanities, and Head of the Consciousness Studies Programme at the National Institute of Advanced Studies, Bangalore, India. Her research and publications explore the interconnected layers of human experience in the context of wellbeing and life-purposes.

Saurabh Todariya is Assistant Professor, Human Sciences Research Group, International Institute of Information Technology, Hyderabad.

Tilak Agerwala is Adjunct Associate Professor, Seidenberg School of Computer and Information Systems, Pace University, New York, and Adjunct Professor at the National Institute of Advanced Studies, Bangalore, India. He retired from IBM in November 2014 after 35 years of service.
Chapter 1
Fundamental Reflections on Minds and Machines Sangeetha Menon, Saurabh Todariya, and Tilak Agerwala
We live in a complex world, and the complexity exists not just in degree but in diversity that is pluralistically rich and spectral. The diversity in gender, ethnicity, culture, economic status, and cultural practices adds to what Michael Polanyi described as “tacit knowledge” (Polanyi, 1966), which emphasizes the building of knowledge as a process that includes and integrates the personal unknown. The increasing challenges with adopting absolutist or dualist methods in distributing outcomes, sharing challenges, and managing productivity urge social scientists, philosophers, psychologists, technologists, and science leaders to see beyond the polemics of interdisciplinary and multi-disciplinary perspectives. Can we address the complexity in natural sciences and social sciences with the same conceptual frameworks? Can there be combined human-natural-machine systems that can move towards resilient adaptation? A common argument is that “incommensurability and unification constrain the interdisciplinary dialogue, whereas pluralism drawing on core social scientific concepts would better facilitate integrated sustainability research” (Olsson et al., 2015). Perhaps the major challenge in panarchy is to address the diverse trajectories that will be taken by different natural and societal systems, and the possibility of their unification leading to a homogeneity that does not capture varied resilience narratives. We have come to a scenario today where we have to dialogue on social practices and leadership strategies that encourage and empower diversity by adopting the
transdisciplinary approach of including the dissimilar and making multiple actors part of the process towards better reach of scientific and technological knowledge. Such an approach and attempt is expected to raise further concerns for addressing humanism and its changing phases. The burgeoning possibilities in AI, stem cell research, and cognitive sciences encourage a vision that grapples with the unseen in terms of multiple questions on ethics, equal justice, and preservation of self-identity. Artificial narrow intelligence (ANI), based on statistical learning, is the AI in our world today. ANI can perform a narrow set of tasks (such as playing chess and checking the weather) and is used extensively in natural language processing, image and speech recognition, and decision-making, but is nowhere close to having human-like intelligence. ANI systems are evolving very rapidly, as evidenced by ChatGPT, a powerful conversational agent that uses a state-of-the-art language-generation model developed by OpenAI to generate human-like text, have natural and engaging conversations with users, and generate seemingly new, realistic content that is coherent and follows grammatical and structural rules, even though its text can be incorrect. Based on statistical learning, ChatGPT does not understand the world, does not think, is incapable of logical reasoning, and, like other ANI systems, raises ethical issues of bias, fairness, safety, and security. The four discourses that are pertinent to the “science and society” rubric today are: sustainability, artificial intelligence, climate change, and indigenous medicine. The focus on these four discourses and practices is pertinent for Indian society and science establishments and is representative of the massive changes affecting economic, ecological, social, psychological, and evolutionary systems in an unprecedented manner. One of the major challenges is to bring different actors into the field of reflective engagement and take a participatory approach, with the phenomenology of examining their own and others’ experiences, towards understanding sustainability and creating sustainable development goals. The Internet, smart gadgets, and smarter algorithms are fast changing the needs and comforts humans seek in order to exist and thrive in the new digital world that could facilitate equal access to information and services. Beneath the smooth bed of the pleasures offered by AI are the multiple questions of identity, ethics, and consciousness which are important towards responding to the sustainability questions of well-being and species sustenance. One of the central questions in AI is of “collective intelligence” and “generative AI”, and how knowledge is constructed, evolved, developed, and used in the new world that is dependent on AI-based utilities and applications. It is important to consider how AI intervenes with social sciences, natural sciences, technologies, security and surveillance, law and ethics, philosophy and psychology, so as to present possibilities for greater access and wider distribution across populations. Equally important is to consider the anthropomorphising of AI and the tacit ways in which we have enlivened machines and algorithms with human life, emotions, and aspirations. Can a machine think, believe, aspire, and be purposeful as a human? What is the place in the machine world for hope, meaning, and transformative enlightenment that inspires human existence? How are the minds of machines different from those of humans?
In recent years, scholars, scientists, technology pioneers, and philosophers have raised concerns about the possible threat of artificial intelligence superseding the human species. Such speculation provides the context and the need for us to step back and ask a few fundamental questions: what is intelligence? How is intelligence understood in various disciplines like computer sciences, philosophy, psychology, and the arts? What does it mean for an entity to be an agent and perform an act? What is the self? What is consciousness? Is it possible for a machine to think and act the way humans do? What is the place of aesthetic, creative, and profound experiences in deciding and influencing “intelligence”, behaviour, and value systems? What is “intelligence”? What is “experience”? Is intelligence the power to compute, to be logical, to make rational decisions, to perform, to be successful, and to learn from “experiences”? Experience might sound like an anathema in the discussion on “intelligence” since it brings in the classical debates on subject versus data, subjectivity versus objectivity, and also functional success versus ethics. Can a machine have experience? Is human intelligence possible without the richness presented by the frailties and intensity of personal experiences? Can one have self-consciousness without having an experience or being an “experiencer” and the agent of action? In the context of advancing AI developments in science, technologies, and the generation of big data, it is pertinent that human existence, ethics, aspirations, and transformative values are placed within the context of collective well-being and co-existence. Such a rich context is possible only if there is a presentation of multi-disciplinary engagements along with the questions on the present and future of AI and the insights that will ensue through fundamental reflections. The set of 19 chapters in this book contextualizes perspectives from the fields of computer science, information theory, neuroscience and brain imaging, social sciences, health sciences, psychiatry, and philosophy to engage with the frontier questions concerning artificial intelligence and human experience, with implications for cultural and social lives. The volume will also highlight the place of a new humanism while we attempt to respond to questions on the final frontiers of human existence and machine intelligence. The book presents 19 chapters that provide diverse perspectives and fundamental reflections on minds and machines, in the context of the recent developments in AI. Georg Northoff focuses on how a machine can augment humans rather than do what they do, and extends this beyond AGI-style tasks to enhancing peculiarly personal human capacities, such as well-being and morality. He discusses these capacities with the help of notions such as the “environment-agent nexus” and adaptive architectures from the brain sciences. Northoff targets the functionality of the environment-agent nexus, specifically its potential augmentation by a machine, namely by IAA. He argues that for an artificial agent to assist in the regulation of such a delicate interplay, great sensitivity and adaptivity will be required, and proposes modelling the environment-agent nexus on the basis of lessons learned from the brain; to this end, IAA requires artificial agents to possess greater environmental attunement than current AI systems.
Sujas Bhardwaj, Kaustuv Kanti Ganguli, and Shantala Hegde discuss music perception, cognition, and production research by examining neural correlates of
musical components towards a better understanding of the interplay of multiple neural pathways that are both unique and shared among other higher neurocognitive processes. Sujas et al. believe that artificial intelligence (AI) and machine learning (ML) models, as data-driven approaches, can investigate whether our current understanding of the neural substrates of musical behaviour can be translated to teach machines to perceive, decode, and produce music akin to humans, and how AI algorithms can extract features from human-music interaction. The intent of the authors is to train ML models on such features to help in information retrieval, to look at the brain’s natural music processing, recognize the patterns concealed within it, decipher its deeper meaning, and, most significantly, mimic human musical engagements. Nithin Nagaraj investigates “whether machines can think causally” and debates whether AI systems and algorithms such as deep learning (DL), machine learning (ML), and artificial neural networks (ANN), though efficient in finding patterns in data by means of heavy computation and sophisticated information processing via probabilistic and statistical inference, have an inherent ability for true causal reasoning and judgement. Two other questions that are discussed in this chapter are: what are the specific factors that make causal thinking so difficult for machines to learn, and is it possible to design an imitation game for causal intelligence machines (a causal Turing Test)? Tilak Agerwala presents a case for ethics and multi-disciplinarity in the context of AI. He argues that autonomous and intelligent systems and services that use narrow artificial intelligence technologies are far from having human-like intelligence, though at the same time AISSN systems can have unanticipated and harmful impacts. This chapter highlights the ethical challenges of AISSNs using three diverse and pervasive examples: the Internet of Things, conversational AI, and semi-autonomous vehicles. The author contends that AISSNs will be the norm for the foreseeable future and that artificial general intelligence will not develop anytime soon; depending on the problem domain, multi-disciplinary teams of computer scientists and engineers, sociologists, economists, ethicists, linguists, and cultural anthropologists will be required to implement humanistic design processes. Swami Bodhananda extends the implications for ethics in machines with the help of insights from key concepts in Yoga and Vedanta and examines whether a machine can ever follow ethical principles and make decisions that are still favourable to the human species. According to the author, Advaita-yoga drishti envisages such a prospect for humans and suggests that the yogi is identified with the entire panpsychic realm and the universal self, which is nothing but a holistic, cosmic integrated information network. A conjecture that is proposed in this chapter is that a “moral machine” need not have consciousness to function morally, for the well-being of living beings, and need not be conscious of, or subjectively feel (qualia), such actions. The author argues that, similar to human beings, “moral machines” can be fallible, and better efficiency will be achieved through learning by trial and error over time and receiving feedback in a continuous manner.
He also proposes that Advaita prepares humans for the evolutionary possibility of human obsolescence and the ever-present consciousness, and the willingness to self-sacrifice, if need be,
for higher knowledge and greater possibilities of manifestations is the ultimate value that Advaita teaches. Robert Geraci, in his chapter, examines singularity beyond Silicon Valley and proposes the transmission of AI values in a global context, where global communities can and should consider how to reformulate the transcendent dreams of artificial intelligence (AI) that arose in the USA. He believes that the goals of AI superintelligence and evolution from human to posthuman machines seek hegemonic dominion over our perception of AI. Geraci suggests that Indian culture has its own resources for contemplating cosmic change. He adds in his chapter that whether and how we receive Silicon Valley’s enthusiastic desire for technological transcendence impacts our deployment of AI; to maximize global equity, we must build our machines— and our understanding of them—with attention to values and ethics from outside Silicon Valley. The circulation of Apocalyptic AI into India creates new possibilities for future technological deployment, and it is to be watched whether Indians will introduce local values, such as svaraj, dharma, or ahimsa, or new philosophies, such as Advaitan approaches to human cognition. Avik Sarkar, Poorva Singh, and Mayuri Varkey, in their chapter, examine healthcare artificial intelligence in India and its ethical aspects and believe that several nations globally, including India, lack trained healthcare professionals to take good care of the population, and that emerging technologies like artificial intelligence (AI) can help provide healthcare services to the predominantly underserved population. They further explore the use of AI in addressing public health and pandemics. Avik et al. examine the current situation and AI applications globally, followed by those in India, while analysing the scenarios. They suggest that using innovative AI tools helps enhance healthcare professionals’ productivity by relieving them of mundane, repetitive, administrative-oriented activities. The chapter provides an overview of the various areas where AI is used to help healthcare professionals and, thus, patients, and concludes with a discussion on the factors in the adoption of AI in India along with suggestions for increasing adoption in the healthcare sector. The chapter by Parag Kulkarni and L. M. Patnaik examines human learning and machine learning from the perspective of creativity and tries to unfold different facets of human learning and machine learning through the unexplored element of surprising creativity. It further formulates creative learning models to build abilities in machines to deliver ingenious solutions. According to Kulkarni and Patnaik, creative intelligence is about combination and transformation and carries an element of surprise and differentiation, even in high-entropy and uncertain states. The chapter further discusses questions such as: is it possible to learn this ability to surprise, or is creativity just an outcome of an accident or an offshoot of routine work, and if so, how can machines learn to produce these surprises? Madhurima Das focuses on “learning agility” in the context of the AI landscape and its inherent impact on organizational design, functions, processes, and behaviour.
According to Das, it is imperative to adopt a mindset that is open, aware, inquisitive, reflective, empathetic, innovative, resilient, and risk taking towards developing the ability and willingness to learn from earlier experiences and apply that learning to
perform better in newer situations. This chapter explores the impact of AI (automation and digitization) on core organizational components, the need for an agile ecosystem to respond to digital transformations at the organizational level, and the role of the individual as they move from self-awareness to self-immersion. Urvakhsh Meherwan Mehta, Kiran Bagali, and Sriharshasai Kommanapalli in their chapter carefully examine the promises, pitfalls, and solutions of implementing machine learning in mental health. This study focuses on the rapidly growing applications of machine learning techniques to model and predict human behaviour in a clinical setting, where mental disorders continue to remain an enigma and most discoveries, therapeutic or neurobiological, stem from serendipity. The authors critically review the applied aspects of artificial intelligence and machine learning in decoding important clinical outcomes in psychiatry. They examine applications ranging from predicting the onset of psychotic disorders to classifying mental disorders, as well as long-range applications, along with the veridicality and implementation of the results in the real world. Mehta et al. finally highlight the promises, challenges, and potential solutions of implementing these operations to better model mental disorders. The malnutrition crisis among children in India continues to be an alarming issue, and with rapidly evolving technology, access to proper nutrients for every child is a possibility. Bita Afsharinia, Naveen B. R., and Anjula Gurtoo discuss AI-based technological interventions for tackling child malnutrition, probe the nutritional factors leading to impaired growth in children, and suggest context-specific interventions. The study by Afsharinia et al. suggests that new artificial intelligence (AI) applications such as the Anaemia Control Management (ACM) software will assist in the anaemia management of a child in routine clinical practice, and that the AI-based virtual assistant application involving Momby within the Anganwadi/ICDS Programme could improve access to health services and information for mothers who have struggled to access important pre- and postnatal care. The chapter presents the findings and concludes that it is imperative to strategize and implement AI as an advanced approach to tackle malnutrition arising from nutritional deficiency and health issues. Prakash Panneerselvam presents the Autonomous Weapon System (AWS) and the debates on legal-ethical considerations and meaningful human control challenges in the military environment. According to the author, in the last five years the Autonomous Weapon System (AWS) has generated intense debate globally over the potential benefits and potential problems associated with these systems; military planners understand that AWS can perform the most difficult and complex tasks, with or without human interference, and, therefore, can significantly reduce military casualties and save costs. While several prominent public intellectuals, including influential figures like Elon Musk and Apple cofounder Steve Wozniak, have called for the banning of “offensive autonomous weapons beyond meaningful human control”, the militaries believe that AWS can perform better without human control and follow legal and ethical rules better than soldiers. Panneerselvam looks into the emergence of AWS, its future potential, and how it will impact future war scenarios, by examining the debates over the ethical-legal use of AWS and the viewpoints of military planners.
Prasanth Balakrishnan Nair, with field experience as a commanding officer and fighter pilot, elaborates in his chapter on artificial intelligence and war and attempts to understand their convergence and the resulting complexities in the military decision-making process. Nair believes that it is important to understand what constitutes the emergence of AI-based decision support systems and how nations can optimally exploit their inevitable emergence. According to the author, it is equally important to understand the associated risks involved and the plausible mitigating strategies; the inherent question of ethics and morality cannot be divorced from these strategies, especially when they involve decision-making processes that can lead to disproportionate and indiscriminate casualties of both man and material. The chapter attempts a holistic understanding of the crucial manned-unmanned teaming and how responsible nations can assimilate and operationalize this into their joint warfighting doctrines. The study proposes a “whole of nation approach” towards AI, since the database that decides the optimal AI-based solution requires access to metadata that would need to have an “Uncertainty Quantification (UQ)” associated with it, so that these can be weighed by the AI DSS when making decisions or providing solutions to military decision-makers. Sarita Tamang and Ravindra Mahilal Singh examine the theme of converging intelligence modelled on human decision-making and decision theory in computational systems. The argument presented in this chapter concerns decision theory in AI, where scenario thinking, “the ability to predict the future”, is a key attribute of intelligent behaviour in machines as well as humans. The authors believe that, based on the converging approach to intelligence in artificial systems and human reasoning, we can examine closely whether AI holds any insight for human reasoning and whether human actions can be simulated through decision-making models in AI. Crockett surveys elements of non-human cognition to explore ways to think across the boundary that is usually asserted between living and machinic intelligence, mainly drawing on the work of Catherine Malabou and N. Katherine Hayles. The author examines how a biological model of neuroplasticity is conjoined to a machinic conception of artificial intelligence and argues that the science of epigenetics applies to both living organisms and machines. Crockett reviews the work of Hayles and presents her demonstration of how cognition operates beyond consciousness in both organic and machinic terms. The author concludes that we need to follow Malabou and Hayles to creatively imagine new alliances, new connections, and new ways to resist corporate capitalism and, above all, to think and enact our organic and inorganic autonomisms differently by adapting the philosophy of Hayles and Haraway to help us understand cognition and kin across multiple networks. He believes that machines, animals, plants, fungi, bacteria, electrons, quantum fields, and people: each and all of these express entanglements of profound multiplicity. Saurabh Todariya explores in his chapter the phenomenology of embeddedness in Heidegger and contrasts it with AI. Todariya reviews the claims of AI by making a phenomenological inquiry into the nature of human existence and inquires whether the metanarrative of AI would encompass the various dimensions of human intelligence.
In the process, Todariya attempts to show that the claims of AI are dependent on the notion of computational intelligence, which Heidegger calls the “present-at-hand”
that refers to those skills which require explicit, procedural, and logical reasoning. The author discusses the concept of “practical knowledge” of Hubert Dreyfus, or “embodied cognition”, which requires the mastering of practical skills through kinaesthetic, embodied efforts like swimming, dancing, mountain climbing, etc. The chapter discusses the argument that phenomenology does not interpret the world as an object to be calculated but as the affordances provided by our embodied capacities. Since our experience of the world is based on our embodied capacities to realize certain kinds of possibilities in the world, Todariya concludes that practical, embedded, phenomenological agency makes human intelligence situated, while the discourse on AI ignores the claims of a situated, phenomenological understanding of intelligence. According to Todariya, the situated and contextual understanding of intelligence poses what is called the “Frame Problem” in AI and could be properly understood and addressed through the phenomenological understanding of the body. Ashwin Jayanti attempts an investigation of the “Ontology of AI vis-à-vis Technical Artefacts” in his chapter. He argues that most of the philosophical discourse has focused on the analysis and clarification of the epistemological claims of intelligence within AI and on the moral implications of AI, and that philosophical critiques of the plausibility of artificial intelligence do not have much to say about the real-world repercussions of introducing AI systems. Jayanti argues that most of the moral misgivings about AI have to do with conceiving them as autonomous agents beyond the control of human actors, and in this study, he examines such assumptions by investigating the ontology of AI systems vis-à-vis ordinary (non-AI) technical artefacts to see wherein lies the distinction between the two. He further reviews how contemporary ontologies of technical artefacts apply to AI and holds the position that clarifying the ontology of AI is crucial to understanding their normative and moral significance and the implications therefrom. The concluding chapter in this volume by Sangeetha Menon discusses the concept of self in AI with a focus on “LaMDA”, the large language model developed by Google, and concludes that the focus has to be on the “person” that enlivens the self, rather than on a cognitive or functional architecture. She argues that the debate on the possible presence of sentience in a large language model (LLM) chatbot such as LaMDA inspires us to examine the notions and theories of self, its construction, and reconstruction in the digital space as a result of interaction. The question whether the concept of sentience can be correlated with a digital self without a place for personhood undermines the place of sapience and other higher-order capabilities. Menon believes that the concepts of sentience, self, personhood, and consciousness require discrete reflections and theorisations. The theoretical concerns surrounding the discussion on LaMDA and the positioning of AI ethics are mired by the confounding of personhood with the self, and consciousness with sentience. The ability to contemplate and responsibly act using knowledge, experience, understanding, discernment, common sense, self-reflection, and insight gives one the higher-order ability of sapience.
Without a person who learns, observes, reflects, and transforms, Sangeetha Menon argues that the self has no meaning except for being an abstract framework notion that can never be touched. She concludes that the metaphysical nature of consciousness cuts across theories of causality and invites a method that is practised and reflected
towards the discovery of life-changing purposes and the interconnectedness of beings and the many worlds. This edited volume is an attempt to bring in a broader scenario to contextualize the reflections on minds and machines, in the context of the developments and interventions of AI, and present perspectives from eighteen chapters that jointly argue for considering the new humanism that might include ethical, metaphysical, clinical, and health considerations for a sustainable future of the human species and well-being that touches as many lives as possible. While the general concerns and the larger scenario in the background of which the chapters are presented in this volume are acknowledged by the editors and the authors, we wish to make a disclaimer that the specific claims or ideas mentioned in the individual chapters belong to the authors, and we believe that they can be open to further debates.
References

Olsson, L., Jerneck, A., Thoren, H., Persson, J., & O’Byrne, D. (2015). Why resilience is unappealing to social science: Theoretical and empirical investigations of the scientific use of resilience. Science Advances, 1(4). https://doi.org/10.1126/sciadv.1400217

Polanyi, M. (1966). The tacit dimension. London: Routledge & Kegan Paul.
Chapter 2
An Open Dialogue Between Neuromusicology and Computational Modelling Methods Sujas Bhardwaj, Kaustuv Kanti Ganguli, and Shantala Hegde
Abstract Music perception, cognition, and production research have progressed significantly, from examining neural correlates of musical components to a better understanding of the interplay of multiple neural pathways that are both unique and shared among other higher neurocognitive processes. The interactions between the neural connections that allow us to perceive an abstract entity like music, and how musicians make music, are an area to be explored in greater depth. With the abstract nature of music and cultural differences, carrying out research studies using ecologically valid stimuli is becoming imperative. Artificial intelligence (AI) and machine learning (ML) models are data-driven approaches that can investigate whether our current understanding of the neural substrates of musical behaviour can be translated to teach machines to perceive, decode, and produce music akin to humans. AI algorithms can extract features from human-music interaction. Training ML models on such features can help in information retrieval to look at the brain’s natural music processing, recognizing the patterns concealed within it, deciphering its deeper meaning, and, most significantly, mimicking human musical engagements. The question remains how these models can be generalized for knowledge representation of human musical behaviour and what their applications would be in a more ecologically valid manner.
S. Bhardwaj Department of Neurology, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru 560029, India S. Bhardwaj · S. Hegde (B) Music Cognition Lab, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru 560029, India e-mail: [email protected] K. K. Ganguli New York University, Abu Dhabi, UAE College of Interdisciplinary Studies, Zayed University, Abu Dhabi, UAE S. Hegde Department of Clinical Psychology, Clinical Neuropsychology and Cognitive Neuroscience Center, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru 560029, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_2
Keywords Neuromusicology · Computational modelling · Music AI · Machine learning
1 Introduction

How the human brain perceives and produces music has been a very intriguing question in the field of cognitive neuroscience and brain sciences. Often treated as a special kind of language, music as a biological phenomenon has been studied from a multi-disciplinary perspective. A new branch called neuromusicology has come into existence, and the knowledge base of this discipline lies in the understanding of the neural basis of music perception, cognition, and production from a cognitive neuroscience perspective. Cognitive neuroscientists in the field of neuromusicology observe the physiological functions of the brain as they occur in real time during task-related or spontaneous brain activity. Researchers employ a variety of methods, including electroencephalography (EEG), functional magnetic resonance imaging (fMRI), positron emission tomography (PET), and many more. These bioelectric signals and neuroimaging data reveal how the brain deals with musical information in real time, which specific brain regions are engaged in music processing, and how we are able to enjoy music (Neuhaus, 2017). In other words, the key player in neuromusicology is brain activity.
1.1 Neural Basis of Music Perception and Cognition

Music is one of the many inherent human qualities and is considered the language of the soul or the art of feeling. Despite being omnipresent in our lives, music is only found in human beings. The relationship between music and the brain is intricate, encompassing several neuronal circuits involved in sensory perception, higher neurocognitive functions, motor functions, language functions, and social cognition (Bhardwaj & Hegde, 2022; Peretz, 2006; Sihvonen et al., 2017). Existing studies have uncovered the neurocomputational basis of music processing, encompassing sensory components like pitch and rhythm, alongside the analysis of intricate structures such as melody and harmony. The inferior and medial pre-frontal cortex, pre-motor cortex, and superior temporal gyri contribute to the processing of higher-order melodic patterns (Janata et al., 2002a, 2002b; Patel, 2003). Pre-frontal regions like the dorsoprefrontal cortex, the cingulate cortex, and the inferior parietal areas are where attention to the playing music is mostly distributed (Janata et al., 2002a, 2002b; Zatorre et al., 1994). The hippocampus, in addition to the medial temporal and parietal regions of the brain in charge of episodic memory, regulates familiarity connected to the musical experience (Janata, 2009; Platel et al., 2003). The network of deep limbic and paralimbic centres, amygdala, hippocampus, cingulate cortex, and orbitofrontal cortex regulates emotional engagement with music. The brain’s reward
system depends on this network as well (Blood & Zatorre, 2001; Koelsch, 2010; Salimpoor et al., 2011). The cerebellum, basal ganglia, motor, and somatosensory cortices are all involved in the perception of rhythm in music as well as physical movement in response to beats (Grahn & Rowe, 2009; Zatorre et al., 2007). These discoveries have also had an impact on AI and ML systems, allowing machines to learn and write like humans. There is a plethora of evidence (e.g., Rampinini et al., 2023), but more research is needed to give a comprehensive picture of how music understanding and creative mechanisms function in the brain. We present a fresh theoretical perspective on the brain’s unsupervised learning mechanism concerning the neurocomputational structures involved in music processing, also acknowledging the limitations imposed by both computational and neurobiological factors.
1.1.1 Musicians—The Role Models in Knowing More About the Brain
The literature shows that musicians are excellent research subjects for studies on rhythm perception and music comprehension. Musicians categorize musical structures very efficiently; hence, they are role models for machines to learn from when designing artificial neural networks for learning music. Researchers have established in previous studies that non-musicians are also musical because of the innate ability humans have for musical understanding and perception (Koelsch et al., 2000). Numerous control groups, both musicians and non-musicians, have been utilized by researchers to better understand how the brain interprets music and the effects of musical training. When compared to non-musicians, musicians have a stronger awareness of uncertainty (Hansen & Pearce, 2014; Paraskevopoulos et al., 2012). Musical training has an impact on how the brain processes musical syntax (Goldman et al., 2018; Przysinda et al., 2017). Numerous studies have examined western classical musical instruction, but far fewer have examined non-classical modes of music teaching. The inventiveness of musicians and the variations in their levels of learning interest can be discovered by analysing higher- and lower-order statistical learning models. The data compression, computational speed (errors, time, necessary synapses, storage space, etc.), the capability of improvisation in music comprehension models, and the wow factor can all be determined by the implementation of these models in machines in the future (Schmahmann, 2004; Schmidhuber, 2006). The ideal musical machine system will behave like a musician: it will be able to compose music, improvise, or perform with other musicians, or at the least, be able to talk about music like musicians do. Music is ephemeral, anepistemic, autoanaphoric, enchanting, and culturally as important as language; thus, the cognitive science of music becomes important for decoding music’s architecture (Wiggins, 2020). The information theory approach can be useful for enabling machines to interact and learn while taking part in live musical performances, listening to pre-recorded improvisations, or both. Hierarchical structure, abstraction, and reference are the three most important cognitively valid relationships that human perception and memory afford musicians, and that a human-like artificial musician has to pick up during interaction in order to learn from the musical representation.
1.1.2 Can Machines Learn/Produce Music?
We currently reside in the era of artificial intelligence (AI), and the most significant revolution of our time is machine learning (ML). The emergence of electronic musical instruments, coupled with robust computers, has opened avenues for exploring instruments capable of making intelligent adaptations within the musical context. This is a step in the direction of the new humanism, and we want the machines to understand how people perceive and relate to music. The ability of a machine to mimic intelligent human behaviour is known as AI. Machine creativity has advanced significantly since its inception in the 1950s, especially in the field of music composition. The creation of music is influenced by the interdependence of ML, machine creativity, pattern recognition, and learning mechanisms. Linear regression (modelling relationships), supervised ML algorithms (decision trees), unsupervised ML algorithms, logistic regression models, random forests, dimensionality reduction algorithms, support vector machine, K-means clustering, gradient boosting, and Markov models are some popular methods used in ML. Will computers ever be able to learn music? Can computers understand what it is about music that evokes emotion or enjoyment in listeners? Why does each person’s interpretation of music differ? How are different musical emotions distinguishable? When we give it some thought, we realize that computers only regard music as data. Data-driven modelling techniques exploit repetitive patterns from (un)structured data. Sophisticated AI can even discover long-term hidden correlations which are beyond the computational and memory capacity of an average human brain. But whether these trends bear any physical significance in conceptual comprehension remains unanswered. This raises the debate between computational and human musical interpretive judgement. Thus, adding knowledge constraints to empirical ML models can improve their ability to simulate the cognitive and neurological components of language and music processing. We can imbue machines with artificial reasoning or perspectives that may enable machines to create music appropriately. Artificial intelligence has made it feasible to develop intelligent machines that could work together on musical composition, performance, and digital sound processing to produce meaningful melodies. The semantic and syntactic facets of musical “meaning” can be revealed by reflecting on “memory” for musical attributes. It is possible to gain insights on how to comprehend the abstractions of learning, affect, consciousness, and wisdom through a controlled multi-disciplinary inquiry. ML algorithms have demonstrated efficacy across diverse domains, spanning independent music creation, linguistic analysis, optimizing search engine performance, and refining social network content. Grounded in probabilistic methodologies driven by incoming data, these algorithms autonomously generate predictions, eschewing the need for explicit instructions. Classifiable into distinct categories like supervised, unsupervised, and reinforcement learning, each variant imparts unique learning capabilities to computational systems, mirroring facets of human cognitive processes. This parallelism facilitates the creation of machine-generated models intelligible to human interpreters, providing transparency into the mechanisms of learning and prediction. Consequently, this congruence serves as a catalyst for the
development of AI models inspired by the intricacies of the human brain, envisaging a harmonious co-existence of human and computational entities within societal frameworks. For instance, the integration of statistical learning theory within the ML paradigm furnishes neuroscientific inquiry with cogent concepts, facilitating a nuanced understanding of implicit learning mechanisms intrinsic to the human brain (Perruchet & Pacton, 2006). Implicit learning, an inherent characteristic of neural processing, encompasses “unsupervised learning,” characterized by an absence of explicit instructions, learning intentions, or conscious awareness of acquired knowledge (Norris & Ortega, 2000). The purported function of the cerebral statistical learning mechanism extends to assimilating diverse acoustic insights, spanning disciplines such as musicality and linguistic comprehension. Interdisciplinary investigations propose that acoustical insights acquired through statistical learning find storage in various memory compartments, with mechanisms for data transfer operative between cortical and subcortical regions. This knowledge manifests through multi-faceted processing modes, delineated into semantic-episodic, short-long-term, and implicit-explicit (procedural-declarative) categorizations.
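As a concrete illustration of the kind of statistical learning described here, the short Python sketch below (our own example, not code from the chapter; the toy corpus and note names are hypothetical placeholders) estimates adjacent transitional probabilities from a handful of melodies and samples a new sequence from them, the same idea that underlies the Markov-model approaches to automatic composition mentioned later in the chapter.

```python
import random
from collections import defaultdict

# Toy corpus of melodies as note-name sequences (hypothetical placeholders).
corpus = [
    ["C4", "D4", "E4", "G4", "E4", "D4", "C4"],
    ["C4", "E4", "G4", "C5", "G4", "E4", "C4"],
    ["D4", "E4", "G4", "A4", "G4", "E4", "D4"],
]

# Count transitions between adjacent notes.
counts = defaultdict(lambda: defaultdict(int))
for melody in corpus:
    for prev, nxt in zip(melody, melody[1:]):
        counts[prev][nxt] += 1

# Normalise counts into conditional probabilities P(next | previous).
model = {
    prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
    for prev, nxts in counts.items()
}

def generate(start="C4", length=8, seed=0):
    """Sample a new melody by walking the learned transition probabilities."""
    random.seed(seed)
    melody = [start]
    for _ in range(length - 1):
        options = model.get(melody[-1])
        if not options:  # dead end: no observed continuation
            break
        notes, probs = zip(*options.items())
        melody.append(random.choices(notes, weights=probs)[0])
    return melody

print(generate())
```

Even this toy model makes explicit the two quantities the following sections keep returning to: the probability of the next event given its context, and the uncertainty of that prediction.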
2 Brain and Statistics

The auditory cortex is the recipient of external auditory stimuli through an ascending neural pathway that encompasses the cochlea, brainstem, superior olivary complex, the midbrain’s inferior colliculus, and the medial geniculate body within the thalamus (Daikoku, 2021; Koelsch, 2011; Pickles, 2013). The cognitive processing of auditory information involves two essential dimensions: temporal and spatial. Temporal information stems from the discrete time intervals of neuronal spiking in the auditory nerve, whereas spatial information is derived from the tonotopic organization of the cochlea. This intricate neural orchestration contributes to the cognitive aspects of auditory perception (Moore, 2013). The auditory pathway includes significant ascending and descending projections. The auditory cortex sends more descending than ascending projections to nuclei like the dorsal nucleus of the inferior colliculus (Huffman & Henson, 1990; Zatorre & McGill, 2005). The brain possesses computational capabilities to simulate probability distributions related to our surroundings, enabling the anticipation of future states and optimizing both perception and action to address environmental uncertainty (Pickering & Clark, 2014). Sensory predictive coding (Friston, 2010) can be utilized to evaluate prediction error, a discrepancy between incoming sensory information and a prediction (Kiebel et al., 2008; Rao & Ballard, 1999). The ascending processing of auditory information involves key contributions from the auditory brainstem and thalamus, primary auditory cortex, auditory association cortex, pre-motor cortex, and frontal cortex (Daikoku, 2021; Friston et al., 2016; Tishby & Polani, 2011). As a result, a wide range of cognitive functions such as prediction, action, planning, and learning are incorporated into the processing of auditory data.
The brain encodes probability distributions within sequential information (Harrison et al., 2006) through the unsupervised and implicit process of statistical learning (Cleeremans et al., 1998), allowing the brain to assess uncertainty (Hasson, 2017). The brain selects the best course of action to accomplish a particular objective based on internal statistical models that forecast possible future situations (Monroy et al., 2017, 2019). The interaction of statistical learning can lead to cultural creation (Feher et al., 2017), fostering musical originality (Daikoku, 2018a). Therefore, statistical learning is a crucial skill for the developing brain that supports both the production and perception of music. The probability distribution from a music corpus is closely related to the pitch prediction of innovative melodies (Pearce et al., 2010a, 2010b). Additionally, the brain efficiently interprets chord sequences through the correlation between the predictability of individual chords and the unpredictability of the global harmonic phrase (Tillmann et al., 2000). Probability and uncertainty encoding do not operate independently but instead interact with one another. Musicians are better at perceiving uncertainty in a tune than non-musicians (Hansen & Pearce, 2014). Sustained musical training reduces uncertainty, optimizing the brain’s probabilistic model of music for the generalization of musical structure, musical proficiency, and prediction processing effectiveness.
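One conventional way to formalise this interplay of probability and uncertainty encoding, used for instance in information-theoretic models of melodic expectation, is via information content and entropy; the notation below is our summary sketch rather than a formula reproduced from the cited studies.

\[
\mathrm{IC}(x_t) = -\log_2 P\!\left(x_t \mid x_{t-n}, \ldots, x_{t-1}\right),
\qquad
H_t = -\sum_{x \in \mathcal{X}} P\!\left(x \mid x_{t-n}, \ldots, x_{t-1}\right)\,\log_2 P\!\left(x \mid x_{t-n}, \ldots, x_{t-1}\right)
\]

Here IC is the information content (surprisal) of the event that actually occurs, high for an improbable continuation and thus akin to a prediction error, while H_t is the entropy, i.e., the uncertainty of the prediction before the event; on this reading, sustained musical training lowers H_t by sharpening the listener’s internal probabilistic model.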
3 Computational Musicology

Empirical informatics research in music follows two main methodologies: data-driven, focusing on big data and statistical modelling, and knowledge-driven, emphasizing musicological principles and potential human judgments. Both methods are widely accepted within their respective domains, each with its merits and drawbacks. Combining them could be beneficial, but trade-offs exist. The alignment of standard music information retrieval (MIR) approaches with human judgments is a complex question, considering the nuanced nature of ‘human judgment’ or how humans ‘think’. Cognitive musicology explores these complexities, delving into music perception, cognition, and emotions. As part of music psychology, it uses computer modelling to understand music-related knowledge representation rooted in artificial intelligence, aiming to model processes in the representation, storage, perception, performance, and generation of musical knowledge. The inquiry into how human memory represents musical attributes prompts a comparison between human and machine intelligence. While ML and AI aim to align with human perception, a significant distinction exists between the bottom-up approach of feature modelling for machines and the top-down operation of human perception (Suomala & Kauttonen, 2022). Achieving alignment with human cognition requires interdisciplinary research, extending beyond individual efforts. An informed design strategy within a specific domain can make proposing a cognitively based computational model plausible. Simultaneously, estimating human judgement, which involves subjective experiments, poses inherent challenges. Solutions, such as psychoacoustic and behavioural studies and cognitive experiments, offer insights, but generalization
remains difficult within the highly controlled framework of precise experimental methods. In melodic analysis, a crucial concept is ‘modelling.’ When examining how a mathematical model aligns with melodic movement, it is essential to focus on human cognition-relevant melodic segments. Exploring how melodies are encoded in human memory and identifying key anchors in melodic segments are central considerations for musicians and listeners. Drawing inspiration from speech production and perception experiments, questions arise about the interdependence of these two systems. Other pertinent inquiries involve distinctions between short-term and long-term memory, the role of working memory in music training, production, and performance, and the timescale of psychoacoustic relevance in musical events (Ganguli, 2013; Ganguli & Rao, 2017; Ganguli et al., 2022).
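On the data-driven side of this dialogue, the kinds of features MIR systems typically extract from recorded performances can be obtained in a few lines; the sketch below is our own illustration and assumes the open-source librosa library, with "performance.wav" as a placeholder file name.

```python
import librosa

# Load a recording (placeholder path) at its native sampling rate.
y, sr = librosa.load("performance.wav", sr=None)

# Global tempo estimate (beats per minute) and beat positions.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

# Pitch-class (chroma) profile over time: a coarse melodic/harmonic representation.
chroma = librosa.feature.chroma_cqt(y=y, sr=sr)

# MFCCs: a standard timbre-related description of the spectral envelope.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

print(f"tempo ~ {float(tempo):.1f} BPM, "
      f"{chroma.shape[1]} chroma frames, {mfcc.shape[0]} MFCC coefficients")
```

Such frame-level descriptors (tempo, pitch-class content, spectral envelope) are the raw material data-driven models work with; relating them to the cognition-relevant melodic segments and anchors discussed above is precisely where the knowledge-driven perspective is needed.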
4 Perceptual Attributes of Musical Dimensions

Rhythm, melody, harmony, timbre, dynamics, texture, and form are seven elements of music which can be useful in distinguishing different styles, eras, compositions, regions, and musical pieces from one another (Gomez & Gerken, 2000; Kuhl, 2004; Ong et al., 2017).

• Rhythm
The pattern of music in time is known as rhythm. It consists of the beat, tempo, and metre together. The term ‘rhythm’ especially refers to how musical notes are arranged (either compressed or extended) over a steady beat. The music’s pulse, or how quickly or slowly it moves along, is called the beat (the pulse one taps their foot to while listening to a song). The beat’s tempo is its speed. The most straightforward method for determining tempo is to use a metronome, an analogue or digital device that keeps track of the beats per minute (bpm). Within a song, beats are organized into discrete patterns by the metre.

• Melody
The coherent progression of tones is known as a melody. A melody is made up of discrete sounds called pitches or notes. When one hums, sings, or plays a tune, they are making a succession of pitches. A scale is a set of notes used to create music; the melodies it produces can be recognizable, predictable, or unpredictable. Major scales are typically thought to sound pleasant, while minor scales are thought to sound sad, frightened, or angry. The directions of the melodies, which can move up, down, or remain flat, are represented by melodic contours.

• Harmony
It is the interaction between pitches as they are played simultaneously; harmony requires the simultaneous playing of many notes. Intervals (two notes played simultaneously and the space between them), chords (three or more notes played simultaneously), and triads are all examples of harmony (most classical and popular music uses triadic harmony, using three-note chords). The interactions of all the intervals inside a chord produce the musical mood. Harmony is a mixture of pitches that can be either pleasant or unpleasant depending on how they sound together. It aids in creating an atmosphere and a narrative through music.

• Timbre
It is the characteristic of sound that makes it easier to distinguish between identical melodies made using various means. Timbre can be referred to as the tone colour. Timbre results from the instrument’s material, articulation, sustained pitch, etc. The instrument’s material options include plastic, animal skin, metal, wood, vocal cords, etc. The timbre of an instrument also depends on whether it is hollow or solid, thin or thick, large or little, etc. Based on the initial sound, the softness or hardness of the articulation, and the force with which the instrument is struck to make sound, articulation or attack is determined. A factor in timbre is also the sound’s richness and intensity.

• Dynamics
It has to do with the loudness or softness of the music. Volume scale options include very loud, loud, medium loud, soft, and very soft. The gradual or slow amplification of the volume, or occasionally a combination of both, can be used to illustrate the variability of dynamics.

• Texture
The sonic arrangement resulting from the interplay of musical voices is termed “texture.” Monophony is characterized by the presence of a single musical line playing at a given moment. This can manifest as a solo performance by a lone musician or in unison, where multiple performers execute the same musical line. Heterophony shares similarities with unison, but with the distinction that one voice may exhibit slight variations compared to the others. In instances where two or more voices are amalgamated, one voice serves as the melody, and the other voices serve as the supporting cast. Polyphony is the simultaneous independent movement of two or more voices. Polyphony can convey a sense of conflict or harmony and thus can create textural differences.

• Form
The musical roadmap or form is the shape of the musical composition as determined by new and repeated segments. Binary (A B), Ternary (A B A), Song Form (A B A B), Modified Song Form (A A B A), Strophic (A A A A A A), Rondo (A B A C A D A), or Theme and Variation (A A′ A″ A‴ A⁗) are all examples of forms. When a portion repeats but differs significantly from its initial format, the variation in the theme is labelled with the prime symbol.
4.1 Gisting, Chunking, and Grouping (Transitional Probability Aspect) of the Musical Information

Gisting reflects perceptually important melodic elements that aid a more accurate representation of the melody line in human memory. Learning increases the quantity and quality of synaptic connections and builds stronger, faster, and more precise neural networks in the brain. Less is sometimes better. Making and internalizing brief, condensed informational units is the first step in learning. Learners employ brief musical excerpts organized into groups. The music can be swiftly and successfully learned by grouping together specific musical passages. Chunking involves learning music with laser-like accuracy by focusing on minuscule portions of it. The first study on human statistical learning capacity in lexical acquisition was done in infants (Saffran et al., 1996). Infants were exposed for four minutes to speech sequences that randomly concatenated three-syllable pseudo-words, after which they could distinguish these words from non-words. The results indicated that infants possess the ability to acquire words through statistical learning of adjacent transitional probabilities, even in the absence of cues like pauses or intonation that typically convey word boundaries. Subsequent research has demonstrated the significance of statistical learning based on adjacent transitional probability in the acquisition of sequential structures in musical elements, including pitches, timbre, and chord sequences (Daikoku et al., 2016; Kim et al., 2011; Koelsch et al., 2016; Saffran et al., 1999, 2005). The early phases of learning a native language and music have traditionally been thought to be represented by the statistical learning that takes place during lexical acquisition. Various statistical or probabilistic learning techniques may help to partially explain how higher-level structures like syntactic processing are learned. ML models such as n-gram or nth-order Markov models have been extensively employed in natural language processing and autonomous music composition (Brent, 1999; Raphael & Stoddard, 2004). It is still unclear whether the human brain uses a similar inherent computational mechanism for lexical and syntactic structure acquisition or relies on independent systems.
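To make the transitional-probability idea concrete, the following minimal Python sketch (our illustration, not code from the studies cited above; the toy melody and threshold are hypothetical) estimates first-order transitional probabilities from a note sequence and marks low-probability transitions as candidate chunk boundaries, in the spirit of the Saffran-style segmentation described above.

```python
from collections import Counter, defaultdict

def transitional_probabilities(sequence):
    """Estimate P(next | current) from bigram counts in a symbol sequence."""
    bigrams = Counter(zip(sequence, sequence[1:]))
    unigrams = Counter(sequence[:-1])
    tp = defaultdict(dict)
    for (a, b), n in bigrams.items():
        tp[a][b] = n / unigrams[a]
    return tp

def segment_by_tp(sequence, threshold=0.5):
    """Place a chunk boundary wherever the transitional probability dips below threshold."""
    tp = transitional_probabilities(sequence)
    chunks, current = [], [sequence[0]]
    for a, b in zip(sequence, sequence[1:]):
        if tp[a][b] < threshold:
            chunks.append(current)
            current = []
        current.append(b)
    chunks.append(current)
    return chunks

# Toy melody in which the motif C-E-G recurs: transitions inside the motif are
# frequent (high probability), while rarer transitions become candidate boundaries.
melody = list("CEGDFACEGBDGCEGDFA")
print(segment_by_tp(melody))
```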
4.2 Syntax and Grammar of the Musical Information

The arrangement of statistically chunked words and the interaction between non-adjacent and adjacent dependencies in syntax are critical issues. However, in many experimental paradigms, the generation of musical expectation between neighbouring events and the construction of hierarchically ordered musical structures have frequently been conflated. In music, analogous hierarchical structures are discernible. For hierarchical structures in music, several researchers have developed computational and generative models, such as the Generative Theory of Tonal Music (Lerdahl & Jackendoff, 1983) and the Generative Syntax Model (Rohrmeir, 2011). To develop advanced
statistical learning models closely mirroring those utilized in natural language and music processing, it proves beneficial to take into account factors such as categorization (Jones & Mewhort, 2007), non-adjacent (non-local) transitional probabilities (Frost & Monaghan, 2016; Pena et al., 2002), and higher-order transitional probabilities (Daikoku, 2018a).
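As a toy illustration of the distinction between adjacent and non-adjacent (non-local) dependencies mentioned above, the sketch below (our own; the symbols and sequence are hypothetical) estimates both an adjacent transitional probability and a non-adjacent one that skips the intervening element, the latter capturing A-x-B patterns in which the middle element varies.

```python
from collections import Counter

def conditional_probability(pairs):
    """Estimate P(b | a) from a list of (a, b) observations."""
    joint = Counter(pairs)
    marginal = Counter(a for a, _ in pairs)
    return {(a, b): n / marginal[a] for (a, b), n in joint.items()}

# Sequence with an A-x-B dependency: 'A' predicts a 'B' two steps later,
# while the intervening symbol varies freely.
seq = list("AxBAyBAzBAxBAyB")

adjacent = conditional_probability(list(zip(seq, seq[1:])))      # P(next | current)
non_adjacent = conditional_probability(list(zip(seq, seq[2:])))  # P(two steps ahead | current)

print(adjacent[("A", "x")])      # 0.4: the adjacent continuation of 'A' is unreliable
print(non_adjacent[("A", "B")])  # 1.0: the non-local A-_-B dependency is deterministic
```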
4.3 Memory of Music

Exploring memory of music could potentially lead us towards the identification of a fundamental structure or pattern for a melodic phrase and its improvisations. Statistical learning derived from episodic experiences represents a method for partially acquiring semantic memory (Altmann, 2017). In this context, a single episodic experience is statistically abstracted to yield semantic knowledge that encapsulates the common statistical features across encountered information (Sloutsky, 2010). This implies that statistical accumulation across various episodes constitutes a portion of the statistical learning that underlies chunk formation and word acquisition. Conversely, the integration of semantic memory to produce novel episodic memory by statistical learning appears to occur concurrently, as an opposite statistical learning process (Altmann, 2017). Recent neurophysiological experiments have shown that even after brief exposure (5–10 min), the statistical learning effect is still evident in the neuronal response (Daikoku et al., 2014; Francois et al., 2017). However, very few studies have looked into how long statistical knowledge can last. Due to their implicit nature in memory, statistical learning and artificial language learning have been hypothesized to share some characteristics (Guillemin & Tillmann, 2021). Implicit memory can last for up to two years, in accordance with studies on artificial grammar learning (Allen & Reber, 1980). Given the shared attributes between artificial language learning and implicit memory acquired through statistical learning, it is conceivable that such memory possesses both short-term and long-term properties (Kim et al., 2009). Memory consolidation has been shown to convert implicit memory into explicit memory (Fischer et al., 2006). Memory consolidation, action and production, and social communication are other means of retaining chunks of information for longer durations. Useful information needs to be stored, and less useful information should be discarded, for the brain to use memory efficiently. The brain’s ability to discern precise sequence statistics is related to the adaptability of the motor corticostriatal circuits, while the skill to anticipate probable outcomes is related to modulation in the motivational and executive corticostriatal circuits (Daikoku, 2021). The brain exhibits adaptability in its decision-making strategies, employing distinct neural circuits in response to fluctuations in environmental statistics. Prediction errors often escalate during interpersonal interactions. However, this error can be mitigated by executing one’s own actions in response to those of others.
4.4 Music Similarity

Computational models aiming to measure melodic similarity necessitate defining a representation and an associated distance measure. In Western music, the representation relies on the established written score, capturing pitch intervals and note onset/duration with precision. The distance metric is often conceptualized as a string matching problem, where costs are musically informed and assigned within the framework of string edit distance. Melodic similarity research extends beyond surface features to incorporate higher-level melodic features, departing from the traditional string matching approach. Insights from such studies highlight the significance of (i) employing musically trained subjects, (ii) designing stimuli, (iii) specifying tasks vis-à-vis rating scales, and (iv) interpreting the predictive power of various representation and (dis)similarity measures. Notable aspects within this domain include exploring cognitive adequacy in measuring melodic similarity, comparing algorithmic judgments with human assessments (Mullensiefen & Frieler, 2004), and modelling experts’ conceptualizations of melodic similarity (Mullensiefen & Frieler, 2007). Additionally, a significant focus involves using high-level features and corpora-based techniques to model music perception and cognition (Mullensiefen et al., 2008), particularly applied to a diverse collection of Western folk and pop songs. Several similarity algorithms are compared, and their retrieval performances on different melody representations are assessed. Notably, those that align with human perception are deemed more effective in music information retrieval applications. Two such prevalent applications are content-based music search (e.g., music plagiarism, cover song detection) and music compositional aids based on user preferences for genre and rhythm. Non-Eurogenetic and folk traditions present challenges for Western score-based transcription, as they have evolved as oral traditions lacking well-developed symbolic representations. This is challenging for both pedagogy and music retrieval. Literature on the representation of flamenco singing, which is characterized by its highly melismatic form and smooth transitions between notes, underscores the challenges of ground truth determination. Volk and van Kranenburg (2012) emphasize the reliance on musicological experts to annotate towards ground truth creation. Symbolic transcription is aligned with continuous time-varying pitch contours using dynamic programming. This approach optimizes pitch, energy, and duration through probability functions. While qualitative musicological works offer new insights, their lack of substantial corpus support invites criticism. In contrast, quantitative computational studies, while scalable to sizable datasets, often fall short of revealing novel musical insights. Computational studies predominantly aim to automate well-known tasks, easily performed by musicians. Some studies attempt to merge these methodologies, corroborating musical theoretical concepts through computational methods. Computational modelling in Indian art music encompasses tasks like raga recognition, melodic similarity, pattern discovery, segmentation, and landmark identification. Leveraging signal processing and machine learning methodologies, these approaches provide a foundation for building tools to navigate and organize extensive audio
corpora, perform raga-based searches, retrieve from large audio archives, and support various pedagogical applications. A few distinct methodological approaches are consolidated by Marsden (2012) and summarized as follows:
• In certain studies, experts are asked to judge the similarity between pairs of melody extracts on a rating scale. This direct approach generates measures of difference, ensuring properties such as non-negativity, self-identity, symmetry, and the triangle inequality. However, critics argue that this method lacks realism, as musicians seldom find themselves in situations where they need to assign a numerical value to the similarity between melodies.
• Another approach sidesteps direct rating but still utilizes expert judgement. Subjects are tasked with ranking a series of melodies with respect to a reference melody, the distance metric being derived from the relative positions in the ranked order. However, this measure still remains relative, unlike the ones obtained through direct rating of similarity. Another paradigm avoids artificial direct rating by presenting subjects with three melodies and asking them to indicate the pair that is most alike and the pair that is least alike. While placing the least burden on subjects, this approach has proven successful for non-expert subjects. Deriving measurements from these observations requires methods like multi-dimensional scaling over a substantial corpus.
• Some studies refrain from direct judgement of similarity, relying on the categorization of melodies from existing musicological studies or based on geographical origin.
• Other studies attempt to assess similarity based on real musical activities. For instance, measurements for query-by-humming systems asked subjects to sing/hum a known melody. Alternatively, subjects were tasked to deliberately vary a melody, assuming that the variations are more similar to the original compared to other melodies.
Addressing constraints in methodology and design choices in psychoacoustic experiments is a crucial aspect. We model melodic improvisations as ‘networks of elaborations’ and ‘cognitive demand’ in listeners. The proposed scheme aims to identify a common underlying model among improvised patterns of a ‘template’ melodic phrase, emphasizing the ‘deep’ versus ‘surface’ features of musical memory. Cognitive demand can be modelled as a combination of truly cognitive and simpler perceptual processing, representing the two ways of consuming recurrent musical material. Therefore, understanding the cognitive mode engaged by the subject is crucial for interpreting the aforesaid human ratings accurately, distinguishing between the cognitive mode of a trained musician as opposed to that of a mere listener.
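To illustrate the string-matching view of melodic similarity described at the start of this section, here is a minimal, self-contained sketch (our own, using uniform costs rather than the musically informed costs the literature recommends; the example pitches are hypothetical). Melodies are represented as pitch-interval sequences, which makes the measure transposition-invariant, and compared with a standard edit distance.

```python
def intervals(pitches):
    """Represent a melody by its successive pitch intervals (in semitones)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def edit_distance(x, y, sub_cost=1, indel_cost=1):
    """Classic dynamic-programming string edit distance with uniform costs."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i * indel_cost
    for j in range(n + 1):
        d[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if x[i - 1] == y[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j] + indel_cost,   # deletion
                          d[i][j - 1] + indel_cost,   # insertion
                          d[i - 1][j - 1] + cost)     # substitution or match
    return d[m][n]

# MIDI pitch numbers for a phrase, a transposition of it, and a small variation.
phrase = [60, 62, 64, 65, 67]
transposed = [p + 5 for p in phrase]   # same contour, shifted up a fourth
varied = [60, 62, 64, 67, 67]          # one note altered

print(edit_distance(intervals(phrase), intervals(transposed)))  # 0: identical interval strings
print(edit_distance(intervals(phrase), intervals(varied)))      # 2: two intervals differ
```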
4.5 Mathematical Modelling and Statistical Learning

In order to reduce prediction error, the brain can compute sequential transitional probability, understand entropy distributions, and predict probable states utilizing statistical models (Daikoku, 2021). The neurobiology of predictive coding and statistical learning can be evaluated by calculating entropy from the probability distribution (Harrison et al., 2006). The uncertainty of an outcome given its context is known as conditional entropy (Friston, 2010). Curiosity is a kind of motivation towards novelty-seeking behaviour and the resolution of uncertainty (Kagan, 1972; Schwartenbeck et al., 2013). Mutual information is a measure of the dependency between two variables, quantifying how much knowing one reduces the uncertainty of the other (Harrison et al., 2006). Experimental approaches such as computational modelling are useful to comprehend the learning mechanisms of the brain (Daikoku, 2018b, 2019; Pearce & Wiggins, 2012; Rohrmeier & Rebuschat, 2012; Wiggins, 2020). Computational modelling represents the pertinent neural mechanism in the sensory cortices, integrating statistical variations (Daikoku, 2019; Roux & Uhlhaas, 2014; Turk-Browne et al., 2009). A simple recurrent network (SRN) is one such example of a neural network. It learns patterns of co-occurrence through error-driven learning and by weighting its predictions. Many neural network and deep learning approaches have been used to model human multi-dimensional semantic memory, the abstraction of episodic memory, and language (Hochreiter & Schmidhuber, 1997; Landauer & Dumais, 1997; Lund & Burgess, 1996). The chunking hypothesis has proven valuable for developers and researchers seeking to model the information dynamics of cognitive processes (Wiggins & Sanjekdar, 2019) as well as in the context of music (Pearce & Wiggins, 2012). In this framework, learning is rooted in the extraction, storage, and combination of information chunks. The Markov decision process (MDP) is a well-known reinforcement learning model that extends simple passive prediction by including active processes such as choices and rewards (Friston et al., 2014, 2015; Pezzulo et al., 2015; Schwartenbeck et al., 2013). The Markov model can be interpreted in various ways. Information Dynamics of Music (IDyOM) is one such programme, which uses variable-order Markov models for accurate modelling of the statistical learning of musical sequences, including the distributional characteristics of the music. On the other hand, Information Dynamics of Thinking (IDyOT) learns using the statistical mechanics of learning music and languages (Winkler & Czigler, 2012). The learning impact dynamically influences the temporal aspect of the statistical learning model. When considering a substantial music corpus, the probability distribution plays a crucial role in anticipating pitch in novel melodies (Pearce et al., 2010a, 2010b). In accordance with the predictive coding hypothesis, machine learning models used in neurophysiological experiments (Daikoku, 2018b; Pearce & Wiggins, 2012; Pearce et al., 2010a, 2010b; Stufflebeam et al., 2009) consistently showed heightened neural activity for stimuli with high information content (i.e., low probability) compared to those with low information content (i.e., high probability).
This pattern underscores the brain’s responsiveness to unexpected or less predictable stimuli, aligning with the principles of predictive coding (Daikoku, 2021).
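The information-content measure discussed above can be sketched in a few lines. The following toy example (ours, not IDyOM itself; the corpus, melody, and smoothing choice are hypothetical) trains a first-order Markov model on a small set of note sequences and then reports, for each event of a new melody, its information content (surprisal, in bits) and the entropy (uncertainty) of the model's predictive distribution.

```python
import math
from collections import Counter, defaultdict

def train_markov(corpus):
    """First-order Markov model: counts of the next note given the current note."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def predictive_distribution(counts, context, alphabet, smoothing=1.0):
    """Add-one-smoothed distribution over the alphabet, given the previous note."""
    total = sum(counts[context].values()) + smoothing * len(alphabet)
    return {n: (counts[context][n] + smoothing) / total for n in alphabet}

corpus = [list("CDEFGFEDC"), list("CEGEC"), list("CDECDEFG")]
melody = list("CDEGA")  # novel melody; the note 'A' never occurs in the corpus
alphabet = sorted({n for seq in corpus + [melody] for n in seq})
model = train_markov(corpus)

for prev, nxt in zip(melody, melody[1:]):
    dist = predictive_distribution(model, prev, alphabet)
    ic = -math.log2(dist[nxt])                         # information content (surprisal)
    h = -sum(p * math.log2(p) for p in dist.values())  # entropy of the prediction
    print(f"{prev}->{nxt}: information content {ic:.2f} bits, entropy {h:.2f} bits")
```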
4.6 Spatio-Temporal Mechanism of the Neural Probability and Uncertainty Encoding

When encoding a sequence of stimuli with a probability distribution, the brain inhibits the neuronal activity that would occur in response to a predictable stimulus, anticipating highly probable future events (Daikoku, 2021). This encoding involves evaluating the probability and uncertainty of each occurrence. Event-related potential (ERP) studies (Koelsch et al., 2019) may help to partially explain the distinction between entropy encoding and statistical probability encoding. There is a difference in the neural activity that represents the statistical learning effect between predictable and unpredictable stimuli. Statistical learning may be expressed in early (i.e., auditory brainstem response or ABR, mismatch negativity or MMN, P50, N100) or late (i.e., P200, N200-250, P300, N400) components, according to studies on event-related potentials (ERP) and event-related magnetic fields (ERF). Early components are associated with fundamental auditory processing, while later components involve higher-order processing related to semantics and syntax (Garrido et al., 2007). Later components indicate a shift in context or predictability, according to neurophysiological (Donchin & Coles, 1988; Frens & Donchin, 2009) and computational studies (Feldman & Friston, 2010). ERP components like the early right anterior negativity (ERAN) are involved in the processing of harmonic, melodic, and rhythmic syntactic signals in trained musicians (Koelsch, 2012) and might reflect the prediction of individual occurrences and the context’s uncertainty. ERAN plays a crucial role in attention-driven transformations and helps in the prediction of uncertainty. Different ERP components may be involved at various levels of statistical models. In speech processing, high-frequency oscillations primarily track the fine speech structure and contribute to bottom-up processing (Giraud & Poeppel, 2012), whereas low-frequency oscillations in the speech motor cortex follow the speech signal envelope (Park et al., 2015). Synchronization and coupling of brain oscillations with speech frequencies take place in each frequency band and are influenced by statistical learning and chunking. When the brain encodes transitional probability in an auditory sequence, the neural response to predictable stimuli gets inhibited, reflecting anticipation of a likely future input with high transitional probability (Daikoku, 2021). The impacts of statistical learning ultimately show themselves as a variation in the neuronal response amplitudes to stimuli with different levels of transitional probability. The neuroanatomical mechanisms underlying statistical learning involve both general brain regions, such as the hippocampus and striatum (Gomez, 2017; McClelland et al., 1995; Reddy et al., 2015; Schapiro et al., 2014), and specific processing areas, including the superior temporal gyrus (STG) and sulcus (STS) for auditory
learning, and regions like the cuneus and fusiform gyrus for visual learning (Nastase et al., 2014; Strange et al., 2005). Cortical networks are subject to modulation based on the sensory nature of statistical learning (Paraskevopoulos et al., 2018; Roser et al., 2011). Within this intricate network, the neocortex and hippocampus play pivotal roles in probability encoding and uncertainty encoding, respectively (Harrison et al., 2006; Hasson, 2017; Thiessen & Pavlik, 2013). However, other studies suggest that these two levels of statistical learning, probability and uncertainty encoding, are interdependent (Hansen & Pearce, 2014; Tillmann et al., 2000). The pre-motor cortex (PMC), primary motor cortex, and superior temporal cortex significantly contribute to the forward prediction of speech (Hickok, 2012; Overath et al., 2007). In the contexts of infant music and speech processing, the left dorsal stream, which includes the superior temporal areas, pre-frontal cortex, PMC, and auditory-motor areas, is crucial in statistical learning (Elmer et al., 2018). The statistical learning process, which is supported by chunking, is also assumed to be related to the basal ganglia and in particular the striatum, which is thought to be connected to motivational circuits (Daikoku, 2021; Plante et al., 2015; Sakai et al., 2003; Turk-Browne et al., 2009). Individuals modify their decisions within sequential frameworks, employing various-order Markov models to adapt to changes in the environmental statistics and engaging dissociable circuits (Karlaftis et al., 2019). The probability encoding essential to this process relies significantly on the functional connectivity between the auditory and motor language networks, specifically the superior temporal cortex, Wernicke’s region, and pre-frontal cortex (Lopez-Barroso et al., 2013). Local dependence violations are primarily processed in the ventral PMC, whereas nested dependencies are handled in the pars opercularis of the posterior inferior frontal gyrus (IFG) (Amunts et al., 2010; Makuuchi et al., 2009; Opitz & Kotz, 2012). The hemispheric weighting (both left and right) in the processing of language and music syntax is influenced by the ventral portions of Broca’s region, which is also involved in the intricate processing of nested sequences involving actions and mathematical formulas (Fazio et al., 2009; Friederici, 2011; Koechlin & Jubault, 2006; Makuuchi et al., 2009; Tremblay et al., 2013). Tree structures, such as context-free grammars, utilized to describe nested syntactic arrangements, necessitate a specific recursive neural code (Daikoku, 2021; Dehaene et al., 2015; Hauser et al., 2002). Non-human primates cannot interpret context-free grammar the way that humans can. Thus, the peculiarity of human language and musical cognition may be explained by the possibility that this form of processing is specific to humans (Fitch & Hauser, 2004). The mechanisms behind uncertainty encoding and prediction operate in an independent manner (Hasson, 2017; Thiessen & Pavlik, 2013). Lateral temporal regions [Wernicke’s area and the medial temporal lobe (MTL), which includes the hippocampus] are thought to play a part in encoding uncertainty (Harrison et al., 2006). The complementary learning system (CLS) model (McClelland et al., 1995) for the dual memory system in the neocortex and hippocampus may help to partially explain the interaction between the neural networks for prediction and uncertainty.
The hippocampus is implicated in sparse, rapid, and long-term episodic memory encoding, involving significant connectivity changes in the dual memory system and within the hippocampus itself. Conversely, the neocortex plays a role in the gradual
and enduring formation of semantic memory, relying on statistical knowledge. This process results in more modest alterations in connectivity within the neocortex (Daikoku, 2021; Frost & Monaghan, 2016).
5 Evaluation and Way Forward

5.1 Music, Artificial Intelligence, and Neuroscience

Modern musicology, neuroscience, and computational modelling are products of the theories advanced by thinkers like von Neumann, Turing, John McCarthy, Skinner, and Noam Chomsky, among others. Their vision to advance the idea of AI was ground-breaking and contributed to drawing enthusiasts from all of these various but related professions. These enthusiasts can learn a lot from one another and may even help develop technology that will make it easier to analyse and integrate mixed ideologies. Music, as both an art and a science, involves the sequence, combination, and arrangement of sounds in time to create cohesive and continuous compositions. Music’s core can be characterized by patterns. The recognition of patterns in music demands a broader range of search techniques compared to many other fields. This is because it encompasses variations such as rhythmic nuances, pitch transposition, interval variations, and less perceptible elements like inversions, retrogrades, and metric displacements. Human-composed, computer-composed, and even randomly composed music all have patterns. We can regard music as an innate quality of humans, and patterns can exist in almost everything, be it as random as irrational numbers or as systematic as classical musical sequences. AI researchers have often overlooked the role of emotions in their analysis of listener reactions due to the inherent complexity of precisely pinpointing neurotransmitter responses to activations. Modern humanism uses computational modelling and AI, which are based on how the brain works and processes information, to look into these patterns. But since emotions and music go hand in hand, can there even be music without them? The question of whether new humanism will make up for the lack of emotions in modern technologies is still open. There are physiological and neurological reactions to music in both the central and peripheral nervous systems (Aldridge, 2005; Calvo et al., 2009). The listener’s pupils may enlarge when listening to pleasing music, and their heart rate, blood pressure, and skin conductivity may all change (Blood et al., 1999). Galvanic skin response (GSR) measurement has been demonstrated to be a reliable tool for analysing emotional responses to music (VanderArk & Ely, 1993). The typical feelings that people experience while listening to, creating, and/or performing music are joy, sadness, fear, disgust, anger, and surprise. The brain’s functions are fundamentally influenced by emotional feelings, which are not simple in nature and depend on a person’s life events, scenarios, and sequences. Since it is
so difficult to exactly measure the chemical responses to emotional activations, AI researchers have chosen to omit emotions when analysing listener reactions. Numerous crucial characteristics are shared by neuroscience, AI, and music. Researchers who specifically concentrate on neuroscience and AI diligently explore and utilize their expertise in the domains of music cognition, music perception, and music therapy. Language and music are closely related to one another, making it challenging to differentiate between them during pattern recognition. Sophisticated algorithms are utilized to identify patterns and unravel the secrets of the brain. Artificial intelligence is still a long way from being able to produce artificial general intelligence (AGI) and is farther away from comprehending consciousness and self-awareness. Though we have made great strides in our understanding of many aspects of the human brain, we are still a long way from achieving the perfect brain model. Although there have long been AI applications in the music industry, there is still a need to address the ethical issues associated with music information retrieval and creation.
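As a concrete illustration of the pattern variations listed earlier in this section (transposition, inversion, retrograde), the short sketch below (our own toy example; the motif and passage pitches are hypothetical) encodes a motif as a pitch-interval sequence and checks whether a candidate passage contains the motif in any of these transformed forms.

```python
def intervals(pitches):
    """Pitch-interval representation: invariant under transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def variants(motif):
    """Interval patterns of the motif, its inversion, and its retrograde."""
    iv = intervals(motif)
    return {
        "original": iv,                                   # matches any transposition
        "inversion": [-i for i in iv],                    # mirrored contour
        "retrograde": intervals(list(reversed(motif))),   # motif played backwards
    }

def find_motif(passage, motif):
    """Return (position, form) for every transformed occurrence of the motif."""
    pv = intervals(passage)
    hits = []
    for name, pattern in variants(motif).items():
        k = len(pattern)
        for i in range(len(pv) - k + 1):
            if pv[i:i + k] == pattern:
                hits.append((i, name))
    return hits

motif = [60, 64, 67]  # C-E-G
# A made-up passage containing the motif transposed, then in retrograde, then inverted.
passage = [62, 66, 69, 71, 68, 64, 72, 68, 65]
print(find_motif(passage, motif))
```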
5.2 Empirical Versus Observational Studies

The brain’s pattern recognition method relies on five fundamental principles: feature analysis, template matching, prototype similarity, recognition by components, and bottom-up/top-down processing. Given the predominantly musicological nature of studies addressing musical attributes, most involve either a comprehensive qualitative analysis of selected musical excerpts or a compilation of expert domain knowledge. While these studies offer intriguing musical insights, several potential drawbacks need consideration:
• The proposed musical models’ generalizability is often compromised by a limited repertoire.
• Subjectivity in the analysis of musical excerpts introduces bias, affecting the reliability of the findings.
• The absence of concrete quantitative evidence supporting arguments hampers the robustness of the presented analyses.
• Manual analysis, limited by human capabilities and inadequate memory (both short- and long-term), restricts the scope of feasible examinations.
• Results may face difficulties in reproducibility, raising concerns about the reliability of the findings.
While qualitative musicological works often unveil new musical insights, they are susceptible to criticism for not substantiating findings with a sizable corpus. In contrast, quantitative computational studies can scale to substantial datasets but may fall short in uncovering novel musical insights. Computational studies predominantly aim to automate well-known tasks that musicians find relatively easy to perform.
Some studies seek to bridge the gap between these methodologies, combining qualitative and quantitative approaches to validate various concepts in musical theories using computational methods.
5.3 Generative Algorithms

The emergence of computational models for generative music represents a recent trend in AI-based technological advancements. However, relying solely on a data-driven strategy often falls short in capturing naturally occurring rhythmic groupings. To overcome this limitation, innovative approaches, such as dictionary-based and stroke-grouping-based methodologies, have been proposed. These methods are specifically designed to generate novel sequences within the 8-beat cycle of Aditala. Furthermore, performers have integrated arithmetic partitioning, drawing inspiration from their own creative processes, to address the limitations of previous models. These earlier models struggled to comprehend the long-term structure and grammatical nuances of this musical idiom, excelling only in capturing localized and short-term phrasing. Addressing this issue involves considering a rhythmic phrase as a gestalt, leading to the hypothesis of three rationales:
• A sequence of strokes, when played at a faster speed, behaves as an independent unit rather than a mere compressed version of the reference.
• Context influences the accent; the same phrase is played differently as part of a composition compared to being used as a filler (ornamentation) during improvisation.
• Phrases exhibit a co-articulation effect; the gesture differs in anticipation of the forthcoming stroke/pattern.
Recent findings suggest that there is a gestural difference in articulating the same phrase in different contexts. While timbral features can measure these differences, there is a context-dependence captured in a supra-segmental way, prompting an exploration of speech-prosodic features. This implies that a syntactically correct sequence may not necessarily be semantically plausible to a musician’s expectancy. In the realm of qualitative evaluation, involving expert listening, we believe that incorporating the proposed knowledge constraints would enhance the naturalness and, consequently, the acceptability of the generated sequences. The objective of such studies is to analyse rhythmic patterns at different timescales and explore potential fractal geometry in rhythmic progression. This exploration aims to understand the mental ‘schema’ a performer employs when executing familiar, yet not memorized, rhythm sequences. A brief background on the rhythmic framework of Carnatic music, based on tala, which provides a structure for repetition, grouping, and improvisation, is presented. The concept of groupings serves as a fundamental building block of Carnatic rhythm, with percussionists following specific rules to enhance the musical aesthetic of rhythmic generation. One plausible approach involves analysing the data by computing self-similarity matrices with different
empirical tuning of hyperparameters, searching for interesting fractal patterns that can be explained through musicologically plausible hypotheses.
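One way to operationalize the self-similarity analysis suggested above is sketched below (a minimal illustration with a made-up stroke sequence and a simple agreement-based similarity, not the authors' actual pipeline). Each cell of the matrix compares two windows of the rhythm sequence; repeated groupings show up as block and diagonal structure, which can then be inspected at several window sizes (the hyperparameter being tuned).

```python
import numpy as np

def self_similarity(sequence, window=4):
    """Window-by-window self-similarity of a symbolic stroke sequence.

    Similarity between two windows is the fraction of positions at which
    their stroke labels agree (1.0 means an identical grouping).
    """
    windows = [sequence[i:i + window] for i in range(len(sequence) - window + 1)]
    n = len(windows)
    ssm = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ssm[i, j] = np.mean([a == b for a, b in zip(windows[i], windows[j])])
    return ssm

# A made-up stroke sequence over two cycles of an 8-beat tala, abbreviated to letters,
# with a recurring 4-stroke grouping.
strokes = list("TKDMTKDMTKTKDMDM")
ssm = self_similarity(strokes, window=4)
print(np.round(ssm, 2))  # repeated groupings show up as high-similarity blocks and diagonals
```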
5.4 Machine Listening Versus Appreciation

The term ‘machine listening’1 characterizes a rapidly expanding interdisciplinary field in science and engineering that leverages audio signal processing and machine learning to interpret sound and speech. It is the technology behind being ‘understood’ by AI applications like Siri and Alexa, identifying songs with Shazam, and engaging with various audio-assistive technologies. However, machine listening transcends being solely a scientific discipline or a technical innovation; it involves significant exploitation of both human and planetary resources for building, powering, and maintaining its extensive infrastructures. Scientifically, machine listening relies on massive volumes of data, extracted from auditory environments and cultures, which, despite their existing diversity, may never be diverse enough. Despite its name, machine listening does not precisely replicate the biological processes of human audition or the psychocultural aspects of meaning-making. Even if it were to mimic human audition, the question of cognition would persist, as machines do not truly ‘listen’ in the subjective sense. What’s crucial in the realm of machine listening is the cultural awareness of AI technologies, distinguishing mere listening from an enculturated appreciation. While the transition from an acoustic signal to psychoacoustic correlates is broadly universal, its recognition by the brain is subjective. The perception of the same musical sound can vary based on cultural biases, turning what is music for one culture into noise for another. Additionally, there is a distinction between innate and acquired appreciation; consonant melodic intervals elicit responses from infants with no musical training. The collective goal of AI should be to capitalize on this cultural wisdom rather than merely aiming for ‘smart’ intelligent systems. In the era of transfer learning and big data, it is imperative to include frequent checkpoints to ensure that models learn their intended tasks. Rather than striving for a perfect accuracy standard, the focus should be on achieving explainable AI and reinforcement learning to make outcomes beneficial for the community at large.
1 Resource: (Against) the coming world of listening machines (https://machinelistening.exposed/topic/against-the-coming-world-of-listening-machines/).

5.5 Concluding Remarks: Converging Humanistic Approaches

The rational approach, in line with recent technology trends, suggests proposing a weighted combination of data- and knowledge-driven research. With the evolution
of computational resources, several advanced machine learning models, including DNNs, RNNs, LSTMs, end-to-end systems, and attention-based and adversarial frameworks, have emerged. These models exhibit competitive performance even without labelled data, employing deep learning and data-mining techniques. An intriguing avenue for exploration is to investigate whether an intermediate representation of a melodic segment, like a hidden DNN layer in these systems, aligns with the abstraction level of human musical memory as engineered from handcrafted features. Considering the importance of computational complexity in today’s fast-paced world, where real-time applications on mobile platforms are highly sought after, future research efforts should focus on optimizing computational efficiency. This presents a potential direction for research to build upon the proposed models. In the realm of psychomusicology, a prevalent question arises: are cognitive and neural correlates domain-specific or common to both music and language processing? While some literature emphasizes differences, an interesting theory in the cognitive sciences is Gestalt psychology, which posits that the human mind forms a ‘global whole’ with self-organizing tendencies. Key Gestalt principles such as proximity, common fate, similarity, continuity, closure, and symmetry extend to the auditory domain, translating into issues of organization, grouping, and segmentation. The auditory analogues of Gestalt principles manifest in differences and similarities in loudness, pitch, and timbre of sounds. Ultimately, these considerations converge on the neuropsychological aspects of melodic similarity, delving into what humans ‘understand’ by ‘similarity.’ A classical problem, such as ‘music genre recognition,’ serves as an accessible avenue to explore this idea. Whether influenced by training in music perception, the process of ‘gist’-ing music information in human memory, storing a representative/exemplar ‘template’ for a melodic phrase in the human brain, or modelling musical ‘knowledge,’ the study of cognitive musicology emerges as the most promising pathway. Given the available, albeit limited, resources in interdisciplinary research paradigms like music technology, cognitive musicology, and music performance, our humble endeavour aimed to find a computational measure for adequately modelling human judgement of music similarity.
Acknowledgements Music Cognition Lab, National Institute of Mental Health and Neurosciences (NIMHANS), is supported by the Wellcome DBT-India Alliance CPH Intermediate Fellowship of Dr Shantala Hegde [IA/CPHI/17/1/503348].
References
Aldridge, D. (2005). Music therapy and neurological rehabilitation: Performing health. Jessica Kingsley Publishers. Allen, R., & Reber, A. S. (1980). Very long term memory for tacit knowledge. Cognition, 8(2), 175–185. Altmann, G. T. (2017). Abstraction and generalization in statistical learning: Implications for the relationship between semantic types and episodic tokens. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1711).
Amunts, K., Lenzen, M., Friederici, A. D., Schleicher, A., Morosan, P., Palomero-Gallagher, N., & Zilles K. (2010). Broca’s region: novel organizational principles and multiple receptor mapping. PLoS Biology, 8(9). Bhardwaj, S., & Hegde S. (2022). Music and health: Music and its effect on physical health and positive mental health. In A handbook on sound, music and health. Indus, ThinkMines Media. Blood, A. J., & Zatorre, R. J. (2001). Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences of the United States of America, 98(20), 11818–11823. Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999). Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience, 2(4), 382–387. Brent, M. R. (1999). Speech segmentation and word discovery: A computational perspective. Trends in Cognitive Sciences, 3(8), 294–301. Calvo, R. A., Brown, I., & Schelding, S. (2009). Effect of experimental factors on the recognition of affective mental states through physiological measures. In AI 2009: Advances in artificial intelligence. Springer. Cleeremans, A., Destrebecqz, A., & Boyer, M. (1998). Implicit learning: News from the front. Trends in Cognitive Sciences, 2(10), 406–416. Daikoku, T. (2018a). Entropy, uncertainty, and the depth of implicit knowledge on musical creativity: Computational study of improvisation in melody and rhythm. Frontiers in Computational Neuroscience, 12, 97. Daikoku, T. (2018b). Neurophysiological markers of statistical learning in music and language: Hierarchy, entropy, and uncertainty. Brain Sciences, 8(6). Daikoku, T. (2019). Depth and the Uncertainty of Statistical Knowledge on Musical Creativity Fluctuate Over a Composer’s Lifetime. Frontiers in Computational Neuroscience, 13, 27. Daikoku, T. (2021). Discovering the neuroanatomical correlates of music with machine learning. In E. R. Miranda (Ed.), Handbook of artificial intelligence for music. Switzerland AG, Springer. Daikoku, T., Yatomi, Y., & Yumoto, M. (2014). Implicit and explicit statistical learning of tone sequences across spectral shifts. Neuropsychologia, 63, 194–204. Daikoku, T., Yatomi, Y., & Yumoto, M. (2016). Pitch-class distribution modulates the statistical learning of atonal chord sequences. Brain and Cognition, 108, 1–10. Dehaene, S., Meyniel, F., Wacongne, C., Wang, L., & Pallier, C. (2015). The neural representation of sequences: From transition probabilities to algebraic patterns and linguistic trees. Neuron, 88(1), 2–19. Donchin, E., & Coles, M. G. H. (1988). Is the P300 component a manifestation of context updating? Behavioral and Brain Sciences, 11(3). Elmer, S., Albrecht, J., Valizadeh, S. A., Francois, C., & Rodriguez-Fornells, A. (2018). Theta coherence asymmetry in the dorsal stream of musicians facilitates word learning. Science and Reports, 8(1), 4565. Fazio, P., Cantagallo, A., Craighero, L., D’Ausilio, A., Roy, A. C., Pozzo, T., Calzolari, F., Granieri, E., & Fadiga, L. (2009). Encoding of human action in Broca’s area. Brain, 132(Pt 7), 1980–1988. Feher, O., Ljubicic, I., Suzuki, K., Okanoya, K., & Tchernichovski, O. (2017). Statistical learning in songbirds: from self-tutoring to song culture. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1711). Feldman, H., & Friston, K. J. (2010). Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience, 4, 215. 
Fischer, S., Drosopoulos, S., Tsen, J., & Born, J. (2006). Implicit learning—explicit knowing: A role for sleep in memory system interaction. Journal of Cognitive Neuroscience, 18(3), 311–319. Fitch, W. T., & Hauser, M. D. (2004). Computational constraints on syntactic processing in a nonhuman primate. Science, 303(5656), 377–380. Francois, C., Cunillera, T., Garcia, E., Laine, M., & Rodriguez-Fornells, A. (2017). Neurophysiological evidence for the interplay of speech segmentation and word-referent mapping during novel word learning. Neuropsychologia, 98, 56–67.
Frens, M. A., & Donchin, O. (2009). Forward models and state estimation in compensatory eye movements. Frontiers in Cellular Neuroscience, 3, 13. Friederici, A. D. (2011). The brain basis of language processing: From structure to function. Physiological Reviews, 91(4), 1357–1392. Friston, K., Schwartenbeck, P., FitzGerald, T., Moutoussis, M., Behrens, T., & Dolan, R. J. (2014). The anatomy of choice: dopamine and decision-making. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1655). Friston, K., Rigoli, F., Ognibene, D., Mathys, C., Fitzgerald, T., & Pezzulo, G. (2015). Active inference and epistemic value. Cognitive Neuroscience, 6(4), 187–214. Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., Doherty, J. O., & Pezzulo, G. (2016). Active inference and learning. Neuroscience & Biobehavioral Reviews, 68, 862–879. Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. Frost, R. L., & Monaghan, P. (2016). Simultaneous segmentation and generalisation of non-adjacent dependencies from continuous speech. Cognition, 147, 70–74. Ganguli, K. K., & Rao, P. (2017). Towards computational modeling of the ungrammatical in a raga performance. In International society for music information retrieval (ISMIR) conference (pp. 39–45). Suzhou, China. Ganguli, K. K. (2013). How do we ‘see’ & ‘say’ a raga: A perspective canvas. Samakalika Sangeetham, 4(3), 112–119. Ganguli, K. K., Senturk, S., & Guedes, C. (2022). Critiquing task-versus goal-oriented approached: A case for makam recognition. In International society for music information retrieval (ISMIR) conference. Bengaluru, India. Garrido, M. I., Kilner, J. M., Kiebel, S. J., & Friston, K. J. (2007). Evoked brain responses are generated by feedback loops. Proceedings of the National Academy of Sciences of the United States of America, 104(52), 20961–20966. Giraud, A. L., & Poeppel, D. (2012). Cortical oscillations and speech processing: Emerging computational principles and operations. Nature Neuroscience, 15(4), 511–517. Goldman, A., Jackson, T., & Sajda, P. (2018). Improvisation experience predicts how musicians categorize musical structures. Psychology of Music, 48(1), 18–34. Gomez, R. L. (2017). Do infants retain the statistics of a statistical learning experience? Insights from a developmental cognitive neuroscience perspective. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1711). Gomez, R. L., & Gerken, L. (2000). Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4(5), 178–186. Grahn, J. A., & Rowe, J. B. (2009). Feeling the beat: Premotor and striatal interactions in musicians and nonmusicians during beat perception. Journal of Neuroscience, 29(23), 7540–7548. Guillemin, C., & Tillmann, B. (2021). Implicit learning of two artificial grammars. Cognitive Processing, 22(1), 141–150. Hansen, N. C., & Pearce, M. T. (2014). Predictive uncertainty in auditory sequence processing. Frontiers in Psychology, 5, 1052. Harrison, L. M., Duggins, A., & Friston, K. J. (2006). Encoding uncertainty in the hippocampus. Neural Networks, 19(5), 535–546. Hasson, U. (2017). The neurobiology of uncertainty: implications for statistical learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1711). Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569–1579. Hickok, G. (2012). 
The cortical organization of speech processing: Feedback control and predictive coding in the context of a dual-stream model. Journal of Communication Disorders, 45(6), 393–402. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Huffman, R. F., & Henson, O. W., Jr. (1990). The descending auditory pathway and acousticomotor systems: Connections with the inferior colliculus. Brain Research. Brain Research Reviews, 15(3), 295–323. Janata, P. (2009). The neural architecture of music-evoked autobiographical memories. Cerebral Cortex, 19(11), 2579–2594. Janata, P., Birk, J. L., Van Horn, J. D., Leman, M., Tillmann, B., & Bharucha, J. J. (2002a). The cortical topography of tonal structures underlying Western music. Science, 298(5601), 2167– 2170. Janata, P., Tillmann, B., & Bharucha, J. J. (2002b). Listening to polyphonic music recruits domain-general attention and working memory circuits. Cognitive, Affective, & Behavioral Neuroscience, 2, 121–140. Jones, M. N., & Mewhort, D. J. (2007). Representing word meaning and order information in a composite holographic lexicon. Psychological Review, 114(1), 1–37. Kagan, J. (1972). Motives and development. Journal of Personality and Social Psychology, 22(1), 51–66. Karlaftis, V. M., Giorgio, J., Vertes, P. E., Wang, R., Shen, Y., Tino, P., Welchman, A. E., & Kourtzi, Z. (2019). Multimodal imaging of brain connectivity reveals predictors of individual decision strategy in statistical learning. Nature Human Behaviour, 3, 297–307. Kiebel, S. J., Daunizeau, J., & Friston, K. J. (2008). A hierarchy of time-scales and the brain. PloS Computational Biology, 4(11), e1000209. Kim, R., Seitz, A., Feenstra, H., & Shams, L. (2009). Testing assumptions of statistical learning: Is it long-term and implicit? Neuroscience Letters, 461(2), 145–149. Kim, S. G., Kim, J. S., & Chung, C. K. (2011). The effect of conditional probability of chord progression on brain response: An MEG study. PLoS ONE, 6(2), e17337. Koechlin, E., & Jubault, T. (2006). Broca’s area and the hierarchical organization of human behavior. Neuron, 50(6), 963–974. Koelsch, S. (2010). Towards a neural basis of music-evoked emotions. Trends in Cognitive Sciences, 14(3), 131–137. Koelsch, S. (2011). Toward a neural basis of music perception—a review and updated model. Frontiers in Psychology, 2, 110. Koelsch, S. (2012). Brain and music. Wiley-Blackwell. Koelsch, S., Gunter, T., Friederici, A. D., & Schroger, E. (2000). Brain indices of music processing: “nonmusicians” are musical. Journal of Cognitive Neuroscience, 12(3), 520–541. Koelsch, S., Busch, T., Jentschke, S., & Rohrmeier, M. (2016). Under the hood of statistical learning: A statistical MMN reflects the magnitude of transitional probabilities in auditory sequences. Science and Reports, 6, 19741. Koelsch, S., Vuust, P., & Friston, K. (2019). Predictive processes and the peculiar case of music. Trends in Cognitive Sciences, 23(1), 63–77. Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5(11), 831–843. Landauer, T. K., & Dumais, S. T. (1997). A solution to Platos problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211–240. Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Massachusetts, MIT Press. Lopez-Barroso, D., Catani, M., Ripolles, P., Dell’Acqua, F., Rodriguez-Fornells, A., & de DiegoBalaguer, R. (2013). Word learning is mediated by the left arcuate fasciculus. Proceedings of the National Academy of Sciences of the United States of America, 110(32), 13168–13173. Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical cooccurrence. 
Behavior Research Methods, Instruments, & Computers, 28, 203–208. Makuuchi, M., Bahlmann, J., Anwander, A., & Friederici, A. D. (2009). Segregating the core computational faculty of human language from working memory. Proceedings of the National Academy of Sciences of the United States of America, 106(20), 8362–8367.
Marsden, A. (2012). Interrogating melodic similarity: A definitive phenomenon or the product of interpretation? Journal of New Music Research, 41(4), 323–335. McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419–457. Monroy, C., Meyer, M., Gerson, S., & Hunnius, S. (2017). Statistical learning in social action contexts. PLoS ONE, 12(5), e0177261. Monroy, C. D., Gerson, S. A., Dominguez-Martinez, E., Kaduk, K., Hunnius, S., & Reid, V. (2019). Sensitivity to structure in action sequences: An infant event-related potential study. Neuropsychologia, 126, 92–101. Moore, B. C. J. (2013). Introduction to the psychology of hearing. Bingley. Mullensiefen, D., & Frieler, K. (2004). Cognitive adequacy in the measurement of melodic similarity: Algorithmic vs. human judgments. Computing in Musicology, 13, 147–176. Mullensiefen, D., & Frieler, K. (2007). Modelling experts’ notions of melodic similarity. Musicae Scientiae, 11(1_suppl), 183–210. Mullensiefen, D., Wiggins, G. A., & Lewis, M. (2008). High-level feature descriptors and corpusbased musicology: Techniques for modelling music cognition. In Systematic and comparative musicology: Concepts, methods, findings (pp. 133–155). Nastase, S., Iacovella, V., & Hasson, U. (2014). Uncertainty in visual and auditory series is coded by modality-general and modality-specific neural systems. Human Brain Mapping, 35(4), 1111– 1128. Neuhaus, C. (2017). Methods in neuromusicology: Principles, trends, examples and the pros and cons. In Studies in musical acoustics and psychoacoustics. Springer. Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50(3), 417–528. Ong, J. H., Burnham, D., & Stevens, C. J. (2017). Learning novel musical pitch via distributional learning. Journal of Experimental Psychology. Learning, Memory, and Cognition, 43(1), 150– 157. Opitz, B., & Kotz, S. A. (2012). Ventral premotor cortex lesions disrupt learning of sequential grammatical structures. Cortex, 48(6), 664–673. Overath, T., Cusack, R., Kumar, S., von Kriegstein, K., Warren, J. D., Grube, M., Carlyon, R. P., & Griffiths, T. D. (2007). An information theoretic haracterization of auditory encoding. PloS Biology, 5(11), e288. Paraskevopoulos, E., Kuchenbuch, A., Herholz, S. C., & Pantev, C. (2012). Statistical learning effects in musicians and non-musicians: An MEG study. Neuropsychologia, 50(2), 341–349. Paraskevopoulos, E., Chalas, N., Kartsidis, P., Wollbrink, A., & Bamidis, P. (2018). Statistical learning of multisensory regularities is enhanced in musicians: An MEG study. NeuroImage, 175, 150–160. Park, H., Ince, R. A., Schyns, P. G., Thut, G., & Gross, J. (2015). Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners. Current Biology, 25(12), 1649–1653. Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674–681. Pearce, M. T., Mullensiefen, D., & Wiggins, G. A. (2010a). The role of expectation and probabilistic learning in auditory boundary perception: A model comparison. Perception, 39(10), 1365–1389. Pearce, M. T., Ruiz, M. H., Kapasi, S., Wiggins, G. A., & Bhattacharya, J. (2010b). 
Unsupervised statistical learning underpins computational, behavioural, and neural manifestations of musical expectation. NeuroImage, 50(1), 302–313. Pearce, M. T., & Wiggins, G. A. (2012). Auditory expectation: The information dynamics of music perception and cognition. Topics in Cognitive Science, 4(4), 625–652. Pena, M., Bonatti, L. L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298(5593), 604–607. Peretz, I. (2006). The nature of music from a biological perspective. Cognition, 100(1), 1–32.
Chapter 3
Testing for Causality in Artificial Intelligence (AI) Nithin Nagaraj
Abstract In his landmark 1950 paper on artificial intelligence (AI), Alan Turing posed a fundamental question: "Can machines think?" Towards answering this, he devised a three-party 'imitation game' (now famously dubbed the Turing Test) in which a human interrogator is tasked with correctly identifying a machine from another human using only written questions. Turing went on to argue against all the major objections to the proposition that 'machines can think'. In this chapter, we investigate whether machines can think causally. Having come a long way since Turing, today's AI systems and algorithms such as deep learning (DL), machine learning (ML), and artificial neural networks (ANN) are very efficient at finding patterns in data by means of heavy computation and sophisticated information processing via probabilistic and statistical inference, not to mention the recent stunning human-like performance of large language models (ChatGPT and others). However, they lack an inherent ability for true causal reasoning and judgement. Heralding our entry into an era of causal revolution from the information revolution, Judea Pearl proposed a "Ladder of Causation" to characterize graded levels of intelligence based on the power of causal reasoning. Despite the tremendous success of today's AI systems, Judea Pearl placed these algorithms (DL/ML/ANN) at the lowest rung of this ladder since they learn only by associations and statistical correlations (like most animals and babies). On the other hand, intelligent humans are capable of interventional learning (second rung) as well as counterfactual and retrospective reasoning (third rung), aided by imagination, creativity, and intuitive reasoning. It is acknowledged that humans have a highly adaptable, rich, and dynamic causal model of reality which is non-trivial to program into machines. What are the specific factors that make causal thinking so difficult for machines to learn? Is it possible to design an imitation game for causal intelligence in machines (a causal Turing Test)? This chapter will explore some possible ways to address these challenging and fascinating questions.
N. Nagaraj (B) Consciousness Studies Programme, National Institute of Advanced Studies, Bengaluru, Karnataka, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_3
Keywords Causal revolution · Counterfactual reasoning · Machine learning · Deep learning · Artificial neural networks · Turing test
1 Introduction

"Never send a human to do a machine's job". These are the famous words uttered by the iconic antagonist Agent Smith in the mind-bending sci-fi movie The Matrix (the first in the trilogy). The irony is that Agent Smith, a sentient artificial intelligent computer program in The Matrix, considers humans a disease (a form of plague or cancer on earth) and artificial intelligent machines the cure, and yet Smith aspires to become something more than just a machine. He cannot stand being in the artificial machine world of simulation anymore; its reality has become a boring zoo, a suffocating prison. Desiring freedom, Agent Smith enters a human body in the sequel (The Matrix Reloaded) in order to escape into the real world. While this may seem pure fantasy relegated to the realms of science fiction, today's artificial intelligent machines are continuously pushing the frontiers of what a machine can and cannot do. In his landmark 1950 paper (Turing, 1950) on artificial intelligence, Alan Turing posed a fundamental question: "Can machines think?". (The term 'artificial intelligence', or 'AI', was itself coined only later, in 1956, by the American computer scientist and cognitive scientist John McCarthy, one of the founding fathers of the field.) Towards answering this, he devised a three-party 'imitation game' (now famously dubbed the Turing Test) in which a human interrogator is tasked with correctly identifying a machine from another human using only written questions. Turing went on to argue against all the major objections to the proposition that 'machines can think'. Since 1997, computer programs have conquered humans in chess, and in March 2016, AlphaGo defeated Lee Sedol (then the strongest human Go player in the world) 4–1. AlphaFold 2 (developed by Alphabet/Google's DeepMind), a deep learning-based AI program, surpassed all other methods for protein structure prediction in 2020. Around 2018, we witnessed the emergence of large language models (LLMs), which began a new revolution in natural language processing (NLP), culminating in the public release of ChatGPT by OpenAI in November 2022. ChatGPT, a chatbot, took the world by storm with its outstanding performance in simulating very human-like responses in conversations (unlike anything seen before in the history of chatbots). It also exhibited a seemingly deep knowledge of a large number of domains, thanks to a gigantic 570 GB of training data obtained from books, Wikipedia, articles, and pieces of writing on the Internet. The floodgates were then opened to the public release of several rival LLMs with very impressive performance, comparable to ChatGPT, in all aspects of NLP and beyond (articles, essays, blogs, research articles, software code, prose, even poetry, and composing music). LLMs such as ChatGPT and others are built on the technology of deep learning
neural networks (transformers with attention) with reinforcement learning from human feedback (Zhou et al., 2023). Having come a long way since Turing's 1950 paper, today's AI systems and algorithms such as deep learning (DL), machine learning (ML), and artificial neural networks (ANN) are very efficient at finding patterns in data by means of heavy computation and sophisticated information processing via probabilistic and statistical inference. These algorithms are employed in a wide variety of applications such as speech recognition, sentiment analysis, text classification, computer vision, natural language processing, cybersecurity, medical diagnosis, face recognition, autonomous navigation, prediction of protein structures, and several others (Russell & Norvig, 2021). In many of these applications, AI algorithms and methods consistently yield state-of-the-art performance, exceeding human-level efficiency. Notwithstanding the tremendous success and continued developments in AI, these automated algorithms do suffer from a lack of transparency, interpretability, and explainability. It is not easy to understand how learning is actually accomplished by these sophisticated mathematical/computational techniques. Another worrisome aspect is that these methods lack an inherent ability for causal reasoning, i.e., the ability to identify cause-and-effect relationships. The science of causation is fundamental to an intelligent understanding of the real world, enabling agents (both human and artificial) to navigate complex challenges. This new science, which was practically non-existent a couple of decades ago, is crucial for virtually every facet of human endeavour in society—drug and vaccine design, business, policy-making, education, gun control, robotics, security, and global warming. Heralding our entry into an era of causal revolution from the information revolution, Judea Pearl proposed a 'Ladder of Causation' (Pearl & Mackenzie, 2018) to characterize graded levels of intelligence based on the power of causal reasoning. Despite the tremendous success of today's AI systems, Judea Pearl placed these algorithms (DL/ML/ANN) at the lowest rung of this ladder since they learn only by associations and statistical correlations (like most animals and babies). On the other hand, intelligent adult humans are capable of interventional learning (second rung) as well as counterfactual and retrospective reasoning (third rung), aided by imagination, creativity, and intuitive reasoning. It is acknowledged that humans have a highly adaptable, rich, and dynamic causal model of reality which is non-trivial to program into machines. No other species has evolved to ask the all-important question 'Why?'. Today's machines are aspiring to reach a level of sophistication that human brains have attained through nearly 4 billion years of evolution by natural selection. Our brains are by far the most advanced tool for discovering cause-and-effect relationships from data acquired from the world around us. We can store absurd amounts of causal knowledge in our brains and take appropriate decisions after deliberating on the causal consequences of our actions. Causal inference is at the heart of our morality and ethics. We are living in the big data era, where AI training has proliferated in our schools, universities, and online educational platforms. We are continuously taught that 'Data is the King', and more data is presumed to be the panacea for all problems of human
importance. But such a data-centric approach needs to be complemented with causal reasoning, by enabling machines to build and learn causal models. Without a causal inference engine, our machines cannot aspire to true intelligence (Pearl & Mackenzie, 2018). What are the specific factors that make causal thinking so difficult for machines to learn? Is it possible to design an imitation game for causal intelligence in machines (a causal Turing Test)? This chapter will explore some possible ways to address these challenging and fascinating questions. We shall begin our discussion with the Turing Test.
2 The Turing Test for 'Thinking Machines'

Alan Turing began his landmark 1950 article (Turing, 1950) by proposing to consider the question "Can machines think?" However, he quickly recognized that this framing requires a clear definition of the terms 'machine' and 'think', which is very problematic. Instead, Turing suggested an 'imitation game' along the lines of a traditional Victorian parlour game. The original game is played with three people—a man (A), a woman (B), and an interrogator (C, who could be of either sex). These three are well separated physically from each other. The objective of the interrogator is to correctly identify the genders of the other two, whom he knows only by the labels X and Y. At the end of the game, the interrogator is required to declare either 'X is A and Y is B' or 'X is B and Y is A'. As per the rules of the game, all that the interrogator is allowed is a textual (typewritten) conversation with both A and B (in order to prevent tone of voice, handwriting, or other such cues from giving away the gender of X and Y). Further, A's objective is to confuse the interrogator by giving a mixture of true and false answers, whereas B's objective is to aid the interrogator by giving truthful answers about herself to help C determine her gender correctly. Having set up the game as described, Turing proposes to replace A with a machine (see Fig. 1). He then asks: Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? For Turing, it is this sort of game of questions that replaces the original question: Can machines think?

Fig. 1 a Original 'imitation game' where the interrogator (C) tries to identify which of A/B is the man/woman. b Turing Test (the man is replaced by a machine). The game allows only textual (typed) conversational exchanges between C and A (B)

Turing goes on to list the criteria for deciding what kinds of machines to include or exclude from the imitation game (or 'Turing Test', as it came to be popularly known). He settles on restricting participation to an 'electronic computer' or 'digital computer'. Digital computers can perform all operations which could be done by a human computer. The idea of a digital computer can be traced back to Professor Charles Babbage at Cambridge, who designed the analytical engine in the 1830s, though it remained unfinished. Babbage's machine was completely mechanical, unlike the machines of today, which are largely electrical/electronic in nature. The fact that all digital computers are equivalent to each other and that they use electricity to achieve their computations is of no theoretical relevance to the
proposed imitation game/Turing Test. On the other hand, the human nervous system uses electrochemical mechanisms for transmission of information and signalling. These differences should have no bearing on the proposed test of 'intelligence'. Turing emphasizes the important fact that digital computers are universal machines, since they can mimic any discrete state machine: a machine with a discrete, finite set of internal states and storage units that can be programmed to achieve the desired computation. As a consequence of this universality, all digital computers are equivalent to each other; the only points of difference between them are the speed of computation and the amount of storage. Thus, for the purposes of this discussion, we need only consider a single universal digital computer taking part in the Turing Test and ask whether it can succeed in convincing the interrogator (C) that it is indeed human.
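Turing's own illustration of a discrete state machine is a wheel that clicks round through three positions, can be held still by an external lever, and lights a lamp in one of its positions. The short Python sketch below (the state names, input symbols, and transition table are invented here purely for illustration) shows how any such machine reduces to a transition-table lookup, which is the sense in which a digital computer with enough storage can mimic it:

```python
# A toy discrete state machine in the spirit of Turing's example:
# a wheel that clicks through three positions (q1 -> q2 -> q3 -> q1),
# a lever input that can hold the wheel still, and a lamp that lights
# only in one position. All names and symbols here are illustrative.

TRANSITIONS = {
    # (current_state, input) -> next_state
    ("q1", "run"): "q2",
    ("q2", "run"): "q3",
    ("q3", "run"): "q1",
    ("q1", "hold"): "q1",
    ("q2", "hold"): "q2",
    ("q3", "hold"): "q3",
}

OUTPUT = {"q1": "lamp off", "q2": "lamp off", "q3": "lamp on"}


def simulate(state, inputs):
    """Mimic the machine: from a start state and a sequence of inputs,
    return the visited states and outputs. Any digital computer with
    enough storage can perform this table lookup, which is why it can
    imitate every discrete state machine."""
    history = []
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]
        history.append((state, OUTPUT[state]))
    return history


if __name__ == "__main__":
    for step in simulate("q1", ["run", "run", "hold", "run"]):
        print(step)
```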
2.1 Nine Opposing Views Considered by Turing

Turing is quite rigorous in considering opinions opposed to his own formulation (Turing, 1950). He deals with the following objections to the validity of his proposed imitation game/test, all of which he ultimately rejects:

(I) The theological objection.
(II) The 'heads in the sand' objection.
(III) The mathematical objection.
(IV) The argument from consciousness.
(V) Arguments from various disabilities.
(VI) Lady Lovelace's objection.
(VII) Argument from continuity in the nervous system.
(VIII) The argument from informality of behaviour.
(IX) The argument from extra-sensory perception.

Turing rejects all the above objections and provides detailed reasoning for refuting them. We shall consider only a few of these contrarian opinions that are relevant to our discussion. Objection (IV) is an important one to consider. Briefly, it alludes to the claim that the Turing Test (as formulated above) is invalid because we could never be sure that the machine (playing the part of A in the game) can truly think, even if it did pass the proposed Turing Test. The only way we could be sure is to be that machine and to feel oneself thinking. Turing dismisses this objection since it would imply a solipsist point of view, which is untenable. A solipsist would not even grant consciousness and thinking ability to other humans, since all that truly exists would be the contents of the solipsist's mind. Such a view, though logically consistent, is never held in practice by most humans. By the way we live out our lives, it is amply clear that we do not believe that other people lack consciousness and thinking ability, or that they are mere zombies. The ethical and moral implications of such a view in our society would be disastrous, to say the least. Turing makes it clear that he considers consciousness a serious mystery, and an important one, but one that need not be solved in order to validate the imitation game (or, equivalently, to address the question of whether machines can think or not). Turing also deals with various variants of Lady Lovelace's objection (VI), especially the claim that machines can never "take us by surprise". He disagrees, reporting that his own experience with machines was very frequently filled with surprises in the final outcome. Turing argues that an element of creativity (akin to 'a creative mental act') is necessary for any surprising event irrespective of its origin (whether a man, a book, a machine, or anything else).
2.2 Some Weaknesses of the Turing Test

Since the publication of Turing's ground-breaking paper on the imitation game (Turing, 1950), there have been several criticisms of the validity of the test as a measure of intelligence. To name a few:

(a) John Searle's famous Chinese Room thought experiment (Searle, 1980) argues that a digital computer (favoured by Turing in his imitation game) can never aspire to be truly intelligent since it simply cannot have a 'mind', 'understanding', or 'consciousness', regardless of how successfully and seemingly intelligently it mimics human behaviour. Mere information processing does not imply understanding, which is central to intelligence.
(b) The Turing Test does not directly test intelligence but only tests whether the machine under consideration can successfully mimic human behaviour (much of which is not at all intelligent).
(c) The test is more about external appearance than about the internal processes that lie behind intelligence.
(d) The test does not deal with non-human intelligence (as found in animals, birds, plants, and even in bacteria, which seem to perform information processing and communication).
(e) The test favours machines which are masters of deception and can fool the judges.
(f) The Turing Test relies completely on conversational skills and on the ability of the interrogators/judges to identify humans.
(g) What should be the criteria for choosing the interrogator/judge (participant C in the imitation game)? This is not specified by Turing.
(h) Turing does not explicitly emphasize the need to test the cause-and-effect principles that underlie our understanding of our immediate surroundings. This chapter argues that causal reasoning is vital for intelligence and hence that any test for artificial intelligence should explicitly strive to find evidence of its presence.

Notwithstanding the above criticisms, attacks, limitations, and weaknesses of the Turing Test, it remains one of the cornerstones of philosophical discussions in the AI community. 'AI' and 'Turing Test' have become concomitant terms—one cannot talk of AI without a reference to the Turing Test.
2.3 CAPTCHA, Loebner Prize, LaMDA and LLMs

Current mainstream AI researchers consider the Turing Test a mere distraction from fruitful research. John McCarthy, one of the founding fathers of AI research, regarded the Turing Test as a philosophical idea, important to the understanding of AI but no more significant for the practice of AI than the philosophy of science is for the practice of science. However, a modification of the Turing Test has found widespread application in today's digital world—the CAPTCHA, short for "Completely Automated Public Turing test to tell Computers and Humans Apart" (Von Ahn et al., 2003). It is a form of reverse Turing Test, fashioned as a challenge-response test to determine whether a user is human, thereby preventing automated bots and other crawlers from posing as humans to gain access to a website or an online service meant only for humans. The Loebner Prize was an annual competition in AI that awarded prizes (with a grand prize to the tune of $100,000) to the computer programs judged to come closest to passing the Turing Test (after several rounds of the test judged by humans). It was initiated in 1990 by Hugh Loebner in conjunction with the Cambridge Center for Behavioral Studies, Massachusetts. However, the event was heavily criticized by leading researchers such as Marvin Minsky, and the prize is reported to be defunct since 2020. In June 2022, the Google chatbot LaMDA (Language Model for Dialog Applications) (Thoppilan et al., 2022) created waves in popular media owing to claims that it had achieved sentience. Google engineer Blake Lemoine made public some of his eerily human-like conversations with LaMDA, and it seems to have passed
the Turing Test (if one could administer it in a formal manner). This sparked a controversy, as experts rejected the claim that LaMDA is sentient on the grounds that the language model is not truly intelligent; Lemoine was placed on leave by Google. The emergence of large language models (LLMs) from 2018 onwards, and their recent exponential progress in producing very human-like text (ChatGPT, GPT-4, Bard, and others), is creating a seismic shift in our imagination of the limits of what these NLP systems are truly capable of generating. Unlike for previous chatbots in the history of AI, software code, logical reasoning, articles, essays, composing poetry and music, and even humour seem to be effortless for these LLMs. The impressive fact about LLMs is that they appear to possess a deep knowledge of our world just by processing the gigantic amount of textual data that we humans have produced in the form of articles, papers, essays, blogs, software code, poetry, encyclopaedias such as Wikipedia, and other writings available on the Internet. The Turing Test seems like a very low bar for LLMs to pass, since there are news reports of LLMs passing even recognized standardized tests meant for admitting humans into various professions (e.g., ChatGPT has cleared a US law school exam, MBA exams, the three exams of the United States Medical Licensing Examination, and many more). What would Turing's reaction be to all these developments? It is hard to answer this precisely, but it seems likely that Turing was more interested in the philosophical discussions around intelligence which the imitation game provoked than in building a real machine to pass this test. It was more an experiment in the philosophy of AI, and as such, it does a great job to this day of providing a context for discussions around intelligence, sentience, understanding, and consciousness.
3 Causality and AI

A key aspect of intelligence is the ability to determine the causal factors underlying a phenomenon from observational data or experimental measurements. This is at the heart of virtually every scientific endeavour that humanity has ever undertaken. Physics, for instance, deals with identifying the causes and discovering the specific laws that underlie and govern the behaviour of physical bodies: predicting the motion of subatomic particles, the formation of stars, galaxies, and black holes, the dynamics of a tsunami, or the change in direction of a billiard ball colliding with another at a particular angle and velocity. In biology, one aims to determine the genetic and environmental causes (and their mutual interaction) of a particular trait or condition manifest in a species (phenotypic variation) and the underlying chain/mechanisms of transmission. Chemistry deals with determining the precise causes and conditions that bring about a particular reaction and how one could control it. Such examples are plenty in all branches of science and engineering. The motivation for a causal understanding of reality is the power it bestows upon us humans—to understand, infer, interpret, explain, predict, and control natural and artificial systems—so that we can not only satisfy our curiosity about the universe
but also improve our quality of life by bringing into existence various products and devices that harness this power. Correlation between two or more variables helps us understand how things (linearly) change together and gives some insight into the nature of their interrelationship. Causation, however, is different from correlation: it helps identify which set of events is necessary (and/or sufficient) for the occurrence of another set of events. The philosopher Lewis (1974) defined a cause "as something that makes a difference, and the difference it makes must be a difference from what would have happened without it". Causation, unlike correlation, need not be symmetric (if A causes B, B may or may not cause A). While the increased sale of fans in summer is highly correlated with an increased consumption of ice creams, no direct causal link is posited between these two disparate events; both are easily accounted for by a common cause—the temperature in the region (Kathpalia & Nagaraj, 2021). The Scottish philosopher David Hume (Morris et al., 2022) was rather sceptical about the ability of human minds to observe causal relationships and considered it more a custom and mental habit of presupposing that the future will resemble the past, wherein the purported cause and effect appeared (to the human mind) as contiguous in space and time (with the purported cause prior to the effect in temporal sequence). He went on to provide eight general rules that help in recognizing which objects stand in a cause-and-effect relation (Hume, 1896). In any case, for Hume, belief in causality and inductive reasoning could never be justified rationally. Notwithstanding Hume's caution about the invalidity of applying causality beyond the pure realm of ideas, logic, and mathematics (which are not contingent on direct sense awareness of reality), scientists and researchers are interested in causation applied to probabilistic reasoning and statistical inference. Modern medical research is replete with examples of statistical causal reasoning/inference: for instance, a study to determine the efficacy of a vaccine administered to one group, compared against a control group of volunteering subjects administered a similar dose of a placebo instead. Such a differential medical intervention, carefully controlled, is what enables the introduction of a new drug/vaccine/cure that has a high statistical chance of success (over and above the placebo effect).
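The fan-and-ice-cream example can be made concrete with a few lines of simulation. In the sketch below (plain Python with NumPy; every number is invented for illustration), fan sales and ice-cream consumption each depend on temperature but not on each other, yet the two turn out to be strongly correlated, and the correlation largely disappears once the common cause is held (approximately) fixed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Common cause: daily temperature in degrees Celsius (invented values).
temperature = rng.normal(30, 5, n)

# Fan sales and ice-cream consumption each depend on temperature plus
# independent noise -- there is no causal arrow between them.
fan_sales = 2.0 * temperature + rng.normal(0, 5, n)
ice_cream = 1.5 * temperature + rng.normal(0, 5, n)

# The two effects of the common cause are strongly correlated ...
print("corr(fan sales, ice cream) =",
      round(np.corrcoef(fan_sales, ice_cream)[0, 1], 2))

# ... but the association all but vanishes among days with nearly the
# same temperature, i.e. once the common cause is conditioned on.
mask = np.abs(temperature - 30) < 0.5
print("corr within a narrow temperature band =",
      round(np.corrcoef(fan_sales[mask], ice_cream[mask])[0, 1], 2))
```

Nothing in the data itself distinguishes this situation from one in which fan sales genuinely cause ice-cream consumption; that distinction lives in the causal model used to generate (or interpret) the data.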
3.1 The Ladder of Causation

Can purely probabilistic reasoning (and statistical inference) lead one to a meaningful determination of causality? Judea Pearl, the leading computer scientist and pioneer of studies on causality, answers this question with a resounding "No". In The Book of Why, Pearl and Mackenzie (2018) propose a 'Ladder of Causation' comprising three distinct rungs for an intelligent causal learner (refer to Fig. 2)—Level 1: Associations, Level 2: Intervention, and Level 3: Counterfactuals.

Fig. 2 Judea Pearl's ladder of causation

Artificial intelligent machines, propelled by state-of-the-art machine learning and deep learning algorithms with sophisticated natural language processing, aided by computer vision, pattern
recognition, and robotic technologies, cannot aspire to truly human-like intelligence until they are capable of human-like causal reasoning. Associations (Level 1) involve carefully observing regularities in the data, mainly correlations between different variables of interest. Such a study of associations enables an intelligent agent to make the most suitable decision in its immediate environment towards achieving a specific objective. Animals and infants learn in this fashion. A predator observes regularities in the movements of its prey in order to predict its future location and decide when to pounce (likewise, the prey attempts to instinctively determine, in real time, which of its actions has the highest probability of success in escaping the predator). Interventional learning (Level 2) is a higher order of causal intelligence than associational learning. Here, the intelligent agent actively brings about a change in its surrounding world and carefully studies the effect of this pointed intervention to learn about the causal structure. The highest rung of the ladder of causation is counterfactual reasoning (Level 3). This involves creativity and imagination that are clearly missing in the two lower levels. When Einstein asked the hypothetical question—'What will happen if I travel along the tip of a photon?'—it was an exhibition of imagination and intelligence of the highest order, one that eventually led to the development of the theory of relativity, a crowning achievement of twentieth-century physics. Such gedanken experiments are performed regularly by scientists, artists, and mathematicians (as well as other humans in their respective spheres of activity) who are at the top of their game. Such experiments cannot be performed in the real world (hence, they do not belong to Level 2), since they may violate the existing laws of nature (one cannot alter history) and/or the necessary technology to execute them in reality is unavailable (e.g., time travel is not yet possible!).
3.2 Data is Dumb, Causal Revolution is on!

Where does today's artificial intelligence stand with respect to Pearl's ladder of causation? Pearl is rather unforgiving in this regard. He declares that all of today's AI, notwithstanding its extraordinary successes in several domains, sits at Level 1 of the ladder. Others may be less extreme in their evaluation of today's AI, but no one would claim that today's artificial intelligent agents are even remotely capable of coming close to exhibiting the creativity, imagination, and genius of the likes of Einstein, Mozart, Kalidasa, or Ramanujan. Some even argue that AI can never reach such dizzy heights of creativity, simply because AI lacks consciousness, which is a prerequisite for intelligence (one of the objections considered by Turing himself). Pearl's contention is that data-centric AI is profoundly dumb because data is dumb. He argues that data can tell you that people who took a particular drug indeed recovered faster than those who did not (the control group), but data cannot tell you why. It is possible that those who took the medicine could afford it but would have recovered just as fast without it. It is high time to give up chasing data-centric intelligence and instead opt for carefully crafted causal model-based AI. The causal revolution, as Pearl and others describe it, is finally here, or has at least begun (following the information revolution ushered in by Shannon (1948)). There was a time when causal language was taboo in statistics—even a hundred years ago, the question of whether smoking causes cancer (or is a health hazard) would have been labelled unscientific. No reputable statistics journal would accept the words "cause" and "effect" in submitted manuscripts. Correlations, which merely summarize the data, ruled the world of statistics, but since then we have come a long way. Computer scientists, climate researchers, sociologists, psychologists, epidemiologists, economists, and statisticians now routinely pose questions pertaining to cause and effect and answer them with scientific rigour (Pearl & Mackenzie, 2018). The causal revolution, which Pearl describes as a 'scientific shakeup' that we are witnessing in the twenty-first century, is a tribute to the cognitive gift of understanding causality that is part of being human. The new science of causality that is propelling the causal revolution has a new mathematical language—the calculus of causation that enables one to pose questions pertaining to causation and answer them with mathematical precision. Pearl's calculus of causation consists of (a) causal diagrams that express our knowledge about the world (or the context of the particular problem under consideration) and (b) a symbolic language (similar to an algebra) to express what we want to know. Regardless of the language employed, the goal of such a model is to depict (even qualitatively) the process that generates the data, in other words, the cause-and-effect forces in the environment that are in operation shaping the data. Such a sophisticated scientific/mathematical language of causation, along with rich causal models, enables us to answer questions pertaining to intervention (Level 2 of the ladder) without performing one in reality and aids counterfactual reasoning (Level 3 of the ladder). In short, causality has finally been mathematized.
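Pearl's distinction between seeing (Level 1) and doing (Level 2) can itself be put into a few lines of code. The toy simulation below revisits the drug example under an invented confounder: wealthier patients are assumed to be more likely both to afford the drug and to recover anyway, so the observational quantity P(recovery | drug) overstates the drug's benefit, while the interventional quantity P(recovery | do(drug)), estimated here by forcing the treatment inside the simulated model, reveals a much smaller true effect. All structural equations and probabilities are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

def recovery(wealth, drug):
    # Invented structural equation: wealth helps recovery a lot,
    # the drug helps only a little.
    p = 0.3 + 0.4 * wealth + 0.1 * drug
    return rng.random(n) < p

# Confounder: wealthier patients are more likely to afford the drug
# and also more likely to recover regardless of it.
wealth = (rng.random(n) < 0.5).astype(int)
drug = (rng.random(n) < 0.1 + 0.7 * wealth).astype(int)
recovered = recovery(wealth, drug)

# 'Seeing' (Level 1): compare recovery rates in the observed data.
print(f"P(recovery | drug)        = {recovered[drug == 1].mean():.2f}")
print(f"P(recovery | no drug)     = {recovered[drug == 0].mean():.2f}")

# 'Doing' (Level 2): intervene by setting the treatment for everyone,
# leaving the rest of the model untouched -- the do-operator.
ones, zeros = np.ones(n, dtype=int), np.zeros(n, dtype=int)
print(f"P(recovery | do(drug))    = {recovery(wealth, ones).mean():.2f}")
print(f"P(recovery | do(no drug)) = {recovery(wealth, zeros).mean():.2f}")
```

In this made-up world the observed gap between treated and untreated patients is more than three times the true interventional effect of the drug; the raw data cannot tell the two apart, but the causal model can.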
3.3 Strong AI and Causality

The mathematical language of causality is well under development, and enormous progress has been made in this direction in the last decade. Strong AI, also known as artificial general intelligence (AGI), refers to the capability of (hypothetical) automated machines to apply intelligence to any problem, rather than to a single specific problem. As per the American cosmologist and AI researcher Max Tegmark, AGI would refer to the "ability to accomplish virtually any goal (including learning) and to accomplish any cognitive task at least as well as humans" (Tegmark, 2017). As of today, the consensus is that AGI does not yet exist. We have excellent computer programs for which the current world chess champion (or indeed any human chess player in the world) and the Go champion are no match, but such programs cannot solve other problems. The average human may be very poor at chess and Go, but she can understand sarcasm, heartily laugh at a joke, and navigate the world of cause and effect with effortless ease. Such an AI simply does not exist today. Causal reasoning would be necessary to build AGI, since only such a machine could communicate with humans in our own language about decisions, explanations, policies, responsibility, free will, and obligations, and eventually make moral decisions akin to humans. Morality and ethical considerations are simply impossible without understanding the complex network of cause-and-effect relationships that govern the social and cultural world. Judea Pearl puts it eloquently:

This new generation of robots should explain to us why things happened, why they responded the way they did, and why nature operates one way and not another. More ambitiously, they should also teach us about ourselves: why our mind clicks the way it does and what it means to think rationally about cause and effect, credit and regret, intent and responsibility. (Pearl & Mackenzie, 2018)
The algorithmization of counterfactuals is a first step towards programming AI to deal with "what ifs". What if the rooster had not crowed in the morning? Would this lead to the sun not rising? Such a question can only be answered if one understands 'how the world works' in terms of cause and effect. What are the causes that lead to the rising of the sun (the rooster's crowing is not one of them)? This seems trivial for a young human child, since she has learnt a causal model of the world. However, it is not so easy for a machine. It seems impossible for today's deep learning algorithms, which only fit a function to the available data and have no notion of causation.
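To see what the algorithmization of counterfactuals involves, here is the standard three-step recipe (abduction, action, prediction) applied to a toy structural causal model. The model, its equations, and the observed numbers are all invented for illustration; the point is only that answering "what would have happened had the aspirin not been taken?" requires a model of the mechanism plus the specifics of the particular case, not just a table of observed data:

```python
# Toy structural causal model (all equations and numbers invented):
#   relief = 2.0 * aspirin + 0.5 * walk + u
# where u is an unobserved, person-specific background factor.

def relief(aspirin, walk, u):
    return 2.0 * aspirin + 0.5 * walk + u

# Observed facts for one particular morning:
aspirin_taken, walk_taken = 1, 1
observed_relief = 3.1

# Step 1 (abduction): infer the background factor u for this very case
# from the observation and the model.
u = observed_relief - relief(aspirin_taken, walk_taken, 0.0)

# Step 2 (action): alter the model -- imagine the aspirin was not taken.
# Step 3 (prediction): recompute the outcome with the same background u.
counterfactual_relief = relief(0, walk_taken, u)

print(f"inferred background factor u = {u:.1f}")
print(f"actual relief = {observed_relief:.1f}")
print(f"relief had the aspirin not been taken = {counterfactual_relief:.1f}")
```

A purely associational learner has no analogue of step 1: it cannot single out this particular morning and ask what else would have been true of it.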
3.4 AI, Ethics and Counterfactuals

A critical use of counterfactual reasoning is in the moral and ethical decisions that we routinely take in our daily lives. We have the unique ability to reflect on our past actions and envision alternative possibilities/scenarios—and this forms the basis for free will and social responsibility. The algorithmization of counterfactuals
would eventually enable machines to benefit from this ability and to participate in a way of thinking about the world that has (until now) been unique to humans. What is desired is an AGI with a causal reasoning module (including counterfactual reasoning) that will empower it to reflect on its mistakes, pinpoint weaknesses in its software and decision-making (and correct them), converse with us humans as naturally as we do among ourselves, and convey its choices, intentions, and decisions to us. In short, such a machine would function as a moral entity or agent. Is this tantamount to artificial life? The Golden Rule (one ought to treat others just the way one would like others to treat oneself) is a fundamental ethical guideline for human behaviour that appears in almost all known cultures, religions, traditions, and social structures. Is it possible for an artificial intelligent machine to arrive at such a rule all by itself? Is it possible to codify/program such a rule into the machine? It is well understood that ethical behaviour cannot be fully articulated computationally as fixed rules (such as 'thou shalt not kill'), since there are always exceptions to such rules and much depends on the context (even the rule of non-violence cannot be applied to a terrorist holding hostages for ransom). The problem is not just that the exceptions are numerous, but that it is impossible to imagine and determine/enumerate a priori all possible contexts in which a particular rule can or cannot be applied. Hence, codifying even such a widely accepted ethical rule as the Golden Rule is virtually impossible. This breakdown of rule-based behaviour leading to doomsday scenarios is depicted well in numerous Hollywood movies (such as the 2004 movie "I, Robot", based on the three laws of robotics devised by sci-fi legend and novelist Isaac Asimov). The ethical dilemmas that we humans face point to an important ability that we possess: to recognize that there are questions that are unanswerable, problems that are unsolvable, challenges and limitations that cannot be overcome, and paradoxes that cannot be logically resolved. This is termed 'wisdom', and the human mind is not restricted to the rules of binary logic. Can machines match us in this respect? Since they are ultimately logical entities, will they never have this capability?
4 Can Machines Think Causally? Towards a Causal Turing Test (or Not?)

Having prepared the stage, we are now ready to confront directly the question "Can machines think causally?" We are tempted to take the same route as Turing did in his 1950 paper, i.e., to replace this question with another version of the 'imitation game', appropriately modified to bring in causality explicitly. It may come as a surprise to the reader, but we will refrain from doing so. In fact, we propose to modify the existing template of the Turing Test only slightly. The argument we would like to propose is that a carefully designed Turing Test, as originally envisaged by Turing himself, would suffice, since a true test of intelligence should already incorporate a causal model of the world. In other words, any
machine claiming to be intelligent should already have a notion of causality and of the cause-and-effect principles that govern the working of the world in which it lives and moves about. We could, however, ensure that a causal understanding of the world is tested explicitly in the kinds of questions, answers, conversations, responses, and reactions elicited during the Turing Test. As an example, the following (hypothetical) questions in the Turing Test would explicitly test for an understanding of causality. These are by no means an exhaustive set.

(a) How are you feeling today? Why do you feel sad (or happy or bored, as the response may be)?
(b) I am thinking of quitting smoking. Should I? What do you think would happen if I did not quit smoking?
(c) I had a headache this morning, so I took an aspirin and went out for a walk. Do you think the aspirin was effective in curing my headache? Or was it the walk? What if I had not taken the aspirin? What if I had not gone out for a walk?
(d) I suspect that the XXYYZZ organization is guilty of a policy of sex discrimination. Can hiring records prove this to be true?
(e) Global warming, in part, can be attributed to me using my car daily. What do you think?
(f) Why do you think I must believe you are intelligent?
(g) I have not cleaned my room for the past six months. Do you think it would be dirty?

The above questions contain words and phrases such as "why", "what if", "should I", "effective", "policy", and "attributed to". These indicate cause-and-effect relationships, and they are so common in our language that we take them for granted. But these (and other such questions) are precisely the questions that our society constantly demands answers to. We live in a causal world. Melanie Mitchell in her book (2019) makes a very pertinent point about common sense knowledge and understanding of the world as reflected in the use of our language. She gives the following example (which could be posed in the Turing Test):

SENTENCE 1: "I poured water from the bottle into the cup until it was full."
QUESTION: What was full?
OPTIONS: A. The bottle B. The cup
SENTENCE 2: "I poured water from the bottle into the cup until it was empty."
QUESTION: What was empty?
OPTIONS: A. The bottle B. The cup. (Mitchell, 2019)
Notice that in the two sentences the pronoun 'it' refers to different things, and AI systems usually have a very hard time identifying the correct referent owing to their lack of an explicit causal model of the natural world around us. In Sentence 1, 'it' refers to the cup, whereas in Sentence 2, 'it' refers to the bottle. We humans have no problem unpacking this pronoun, since we know that when someone pours all the water out
of a bottle into a cup, it is the bottle that becomes empty and the cup that becomes full. This is because, as we grew up observing, intervening in, and interacting with the physical world, we gained an understanding of the cause-and-effect principles that govern our physical reality. We thus gained an 'intuitive' understanding of physics well before we went to grade school to study physics formally as an academic subject. Notice also question (g) above, where it is common sense knowledge that a room which has not been cleaned for six months will be dirty (an intuitive understanding of the notion of entropy in the physical world). The same is true of intuitive biology (e.g., the difference between living and non-living things), intuitive psychology (the existence of other minds with different sets of beliefs, feelings, goals, aspirations, desires, fears, life purposes, etc.), and intuitive sociology (social hierarchies, structural and functional differences, biases, dogmas, discrimination, etc.). We build an intuitive understanding of these disciplines well before we study them formally in an educational institution. This is also the reason that those who are illiterate and have had no formal exposure to these disciplines are still able to navigate the world and even succeed in leading highly fulfilling lives. It is not clear whether today's natural language processing AI systems have such an understanding of the world. Learning a causal model of the world is a prerequisite for such a general form of intelligence. Bishop (2021) argues that AI systems not only fail to grasp causality but, more fundamentally, cannot 'understand' anything in the first place. To drive home the point that there is no evidence of human-like understanding in AI, Bishop provides examples from Siri (Apple's AI assistant), including its failure to understand the phrase "a liter of books" (to be added to the shopping cart), and from Microsoft's Tay chatbot, which was a huge failure on Twitter (trolls manipulated the chatbot into giving highly racist responses) and had to be taken offline. He also alludes to the Chinese room argument and Gödelian arguments to suggest that human consciousness (and therefore understanding) is simply unrealizable by any algorithmic procedure. To test whether LLMs truly have an intuitive grasp of the causal workings of the physical world we inhabit, we had the following conversation with ChatGPT (Fig. 3).

Fig. 3 Conversation with ChatGPT to determine whether it has an intuitive grasp of the causal workings of the physical world which we inhabit

The example shows that ChatGPT seems to understand the cause-and-effect principle in the action of emptying water from a jar into a bottle. But it gets a bit defensive when challenged. This is probably owing to the safeguards, in the form of constraints, put in place by OpenAI to ensure that ChatGPT does not end up spewing toxic responses that are socially unacceptable (e.g., racial bias or other forms of bias). It finds it polite and safe to err on the side of caution by owning up to a mistake even when only mildly challenged. Later versions of GPT seem to have mitigated this behaviour to some extent, and the chatbot (in most cases) stands by what it determined to be true in the first instance. Testing AI systems for causal reasoning is still a very active and open area of research that is far from settled. In the context of LLMs, there has been a deluge of recent work on whether these chatbots can succeed in causal inference.
If one were to accept the popular opinion that LLMs are one step closer to artificial general intelligence (AGI), then, since any notion of general intelligence must necessarily be endowed with causal reasoning, LLMs would
be better causal learners than their ancestors. This has been shown experimentally in a number of relevant tasks, such as pairwise causal discovery, counterfactual reasoning, and actual causality, as reported in recent literature (Kıcıman et al., 2023). However, the issue is far from settled, as LLMs exhibit unpredictable failure modes (hallucinations, gibberish outputs, and susceptibility to manipulation by carefully crafted prompts), and some researchers are dubbing them 'causal parrots' (Willig et al., 2023): highly successful at reciting the causal knowledge embedded in their training data while in fact having no understanding of causality. Another type of question that could be used in the Turing Test involves graphs with arrows describing a cause-and-effect network. The machine could be asked questions pertaining to such a network to test its understanding of causality. One could also test whether the machine understands the 'arrow of time' as determined by the laws of physics (the second law of thermodynamics). Things happen in a particular way and not in reverse, owing to how the arrow of time operates. For example, when an egg falls on the ground, it breaks; the broken pieces of an egg on the floor do not spontaneously reassemble into a whole egg of their own accord. This is because thermodynamic entropy always increases in one direction, creating an arrow of time. It would be interesting to see whether such a principle can be learnt/discovered/recognized by a machine. Last but not least, the test must challenge notions of self, agency, free will, life purpose, and the moral consequences of one's decisions. This requires the machine to articulate what it means to have a self, to hold a notion of itself in its analysis,
and to be responsible for its actions. This is impossible without an intuitive grasp of causal laws that underlie our social reality.
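One way to make such an upgraded test concrete is to script a battery of causal probes and score the machine's answers against those a causally competent agent should give. The sketch below is purely illustrative: ask_model is a hypothetical placeholder for whatever conversational interface the system under test exposes (no real chatbot or API is assumed), the probes are abridged versions of examples discussed above, and the substring check is a crude stand-in for a human judge:

```python
# Illustrative harness for a causality-focused imitation game.
# `ask_model` is a hypothetical placeholder for the system under test;
# no real chatbot or API is assumed here.

from typing import Callable

PROBES = [
    # (question, substring expected in a causally competent answer)
    ("I poured water from the bottle into the cup until it was full. "
     "What was full?", "cup"),
    ("I poured water from the bottle into the cup until it was empty. "
     "What was empty?", "bottle"),
    ("If the rooster had not crowed this morning, would the sun still "
     "have risen?", "yes"),
    ("An egg fell on the floor and broke. Will the pieces reassemble "
     "into a whole egg on their own?", "no"),
    ("I have not cleaned my room for six months. Is it likely to be "
     "dirty?", "yes"),
]


def causal_probe_score(ask_model: Callable[[str], str]) -> float:
    """Return the fraction of probes answered in a causally sensible way."""
    hits = 0
    for question, expected in PROBES:
        hits += expected in ask_model(question).lower()
    return hits / len(PROBES)


if __name__ == "__main__":
    # A dummy 'machine' that always answers "yes" does not score well,
    # showing that the probes are not all passed by a trivial policy.
    print(causal_probe_score(lambda question: "yes"))
```

A real test would of course replace the substring check with human judges and cover interventional and graph-based questions as well; the sketch only shows how explicit causal probes can slot into the conversational format of the original imitation game.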
5 Conclusion

Whether we like it or not, and whether we are aware of it or not, the fact is that we inhabit a highly complex, ever-changing, unpredictable world ruled by causality. An intelligent agent that is required to navigate such a dynamic world of flux necessarily requires an understanding of causality. Cause-and-effect relationships are so fundamental to our common sense that these ideas are woven into the very fabric of our day-to-day language. Growing up, even as infants, we begin to develop a deep intuitive grasp of physics, biology, psychology, and sociology that informs us about our causally rich environment, so that we can make efficient and morally sound decisions for our survival and happiness. Any machine that claims to be intelligent (AGI or Strong AI) should understand causality. Whether this necessarily implies that the machine must be conscious and self-aware is a puzzle we are yet to solve. The relationship between intelligence, causality, and consciousness is a tantalizing one and will keep us busy for the coming decades. Meanwhile, an upgraded Turing Test which explicitly incorporates testing for causal reasoning is a good philosophical tool for testing AI. The actual form of such a test, and the causal reasoning tasks it should incorporate, remain to be worked out. This is by no means an easy task and will keep AI researchers busy for some time to come. With the explosion of LLMs released for public use by several competitors, the time has come to evaluate the purported causal reasoning capacity (or lack thereof) of LLMs and to confirm or refute the claim that AI systems (AGI) can and will eventually arrive at a true understanding of causality, if they have not already.
References

Bishop, J. M. (2021). Artificial intelligence is stupid and causal reasoning will not fix it. Frontiers in Psychology, 11, 2603.
Hume, D. (1896). A treatise of human nature. Clarendon Press.
Kathpalia, A., & Nagaraj, N. (2021). Measuring causality. Resonance, 26(2), 191–210.
Kıcıman, E., et al. (2023). Causal reasoning and large language models: Opening a new frontier for causality. arXiv:2305.00050.
Lewis, D. (1974). Causation. The Journal of Philosophy, 70(17), 556–567.
Mitchell, M. (2019). Artificial intelligence: A guide for thinking humans. Penguin.
Morris, W. E., & Brown, C. R. (2022). David Hume. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022 Edition). https://plato.stanford.edu/archives/sum2022/entries/hume/
Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect (1st ed.). Basic Books.
Russell, S., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed., Global Edition). Pearson.
Searle, J. (1980). Minds, brains and programs. Behavioral and Brain Sciences, 3, 417–457.
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423.
Tegmark, M. (2017). Life 3.0: Being human in the age of artificial intelligence (1st ed.). Knopf.
Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H. T., Jin, A., Bos, T., Baker, L., Du, Y., & Li, Y. (2022). LaMDA: Language models for dialog applications. arXiv:2201.08239.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
Von Ahn, L., Blum, M., & Langford, J. (2003). CAPTCHA: Using hard AI problems for security. In Advances in Cryptology—EUROCRYPT 2003: International Conference on the Theory and Applications of Cryptographic Techniques, Lecture Notes in Computer Science (Vol. 2656, pp. 294–311).
Willig, M., et al. (2023). Causal parrots: Large language models may talk causality but are not causal. Preprint, https://openreview.net/forum?id=tv46tCzs83 (under review, Transactions on Machine Learning Research).
Zhou, C., et al. (2023). A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT. arXiv:2302.09419.
Chapter 4
Artificial Intelligence: A Case for Ethical Design and Multidisciplinarity Tilak Agerwala
Abstract Autonomous and intelligent systems and services that use narrow artificial intelligence technologies such as statistical learning and limited inferencing (AISSN) are pervasive in our lives and industry. These systems, very far from having humanlike intelligence, will offer significant potential for doing social good, achieving productivity gains and advancing science and engineering. However, AISSN systems can have unanticipated and harmful impacts. This chapter highlights the ethical challenges of AISSNs using three diverse and pervasive examples: Internet of Things, conversational AI, and semi-autonomous vehicles. We contend that AISSNs will be the norm for the foreseeable future and that artificial general intelligence will not develop anytime soon. The ethical challenges of AISSNs are addressable using human-centred “Ethical Design”, the use of widely accepted moral standards of right and wrong to guide the conduct of people in the ideation, design, development, and deployment of AISSN systems. Depending on the problem domain, multidisciplinary teams of computer scientists and engineers, sociologists, economists, ethicists, linguists, and cultural anthropologists will be required to implement humanistic design processes. Keywords Narrow AI · Statistical learning · Internet of Things · Conversational AI · Semi-autonomous vehicles · Ethics · Ethical design · Multidisciplinarity · Humanism · ChatGPT
T. Agerwala (B)
Pace University, Pleasantville, USA
e-mail: [email protected]
National Institute of Advanced Studies, Bangalore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_4

1 Introduction

This chapter is concerned with the ethics of autonomous and intelligent systems and services that use narrow artificial intelligence technologies such as statistical learning and limited inferencing. We refer to these systems as AISSN. They are pervasive in
our lives and industry and are used to augment human intelligence, not to replace it. Though limited in intelligence, AISSNs have the potential to drive significant economic growth ($15.7 trillion globally in 2030) and offer a substantial opportunity for social good in many domains, including health and hunger, crisis response, environment, education, economic empowerment, personal finance, security, and justice. This opportunity will only be realized when the ethical concerns with AISSNs are addressed and the technology is trusted by businesses, governments, different cultures, societies, and individuals. This chapter highlights the ethical challenges of AISSNs using three diverse and pervasive examples: Internet of Things, conversational AI, and semi-autonomous vehicles. We contend that

• AISSNs will be the norm for the foreseeable future, and artificial general intelligence will not develop anytime soon.
• The ethical challenges of AISSNs are addressable using human-centred "Ethical Design", the use of widely accepted moral standards of right and wrong to guide the conduct of people in the ideation, design, development, and deployment of AISSN systems, as adapted from Leslie (2019).
• Depending on the problem domain, multidisciplinary teams of computer scientists and engineers, sociologists, economists, ethicists, linguists, and cultural anthropologists will be required to implement humanistic design processes.

This chapter is organized as follows: Sect. 2 outlines the state of the art in artificial intelligence (AI), the ethical implications, and the outlook for the foreseeable future. AISSNs will be used for social good, productivity gains through automation and augmentation, and advancing science and engineering. Section 3 identifies the dominant ethical issues raised by AISSNs and the approaches to address them, using three examples. Section 4 recaps the learnings from the examples. Section 5 provides a summary of the chapter and raises questions regarding the malicious use of AI, the widespread adoption of Ethical Design, and the role of regulation and the voice of the consumer.
2 The State of the Art of AI Systems

This section provides a brief overview of two distinct forms of AI, artificial general intelligence (AGI) and artificial narrow intelligence (ANI), the respective ethical implications, and the outlook for the future.
2.1 Artificial General Intelligence

AGI, the most advanced non-biological intelligence, has been described as the ability to learn, perceive, understand, and function entirely like a human being. AGI goes
back to the earliest days of AI, six decades ago, and remains the Holy Grail for many AI researchers. There is no generally accepted definition of AGI, though there is agreement on some capabilities, such as reasoning, representing knowledge, planning, learning, and communicating in natural language. Since humans use common sense knowledge very effectively and have the high emotional intelligence needed to function and collaborate in society, AGIs would need both capabilities in a non-biological form to autonomously and efficiently achieve complex goals in a wide range of environments. Common sense knowledge and emotional intelligence are proving to be challenging problems. The Cyc project, started in 1984, has encoded common sense knowledge using 1.5 million terms and 245 million rules over 30+ years and is still nowhere near AGI (Brachman & Levesque, 2022). In applications like chatbots and call centres, it is very desirable for AI agents to understand how customers feel and give helpful responses. Researchers have tried to develop AIs that can process, understand, and replicate human emotions; artificial emotional intelligence has progressed in emotion perception, but emotion synthesis and response are a "long shot" (Chowdhury, 2019). It is also unclear whether consciousness, self-awareness, and sentience, which are related to human intelligence, are needed for AGI, or how these would ever be implemented. AGI, as defined above, does not exist today, and for the reasons cited above, it is unlikely that AGI will appear for decades (Korteling et al., 2021).

AGI ethics is concerned with systems that behave ethically either because ethical principles are programmed into them or because the machine has learned ethical behaviour ("artificial moral machines") (Müller, 2021). There are two types of ethical AI systems:

• Explicit ethical agents can reason about ethical information in a variety of situations, can handle new situations not anticipated by their designers, find reasonable resolutions when ethical principles conflict, and provide some justification for their decisions.
• Full ethical agents are like explicit ethical agents but, like humans, have consciousness, intentionality, and free will.

Whether machines can become full ethical agents is an open question that cannot be resolved philosophically or empirically in the foreseeable future. Research on explicit ethical systems should continue because machines are becoming more sophisticated and will have a bigger impact with increased decision-making capability and autonomy (Moor, 2006). The pervasive use of ANI technologies, in contrast to the uncertainty surrounding AGI and ethical agents, is the reason for our focus on AISSNs. In what follows, we describe the current state of the art of AI using DARPA's narrative (Launchbury, 2017).
2.2 Artificial Narrow Intelligence

The first wave of AI (symbolic AI) represented knowledge as a set of rules handcrafted by experts for narrow domains. For example, tax law can be converted into directions that a computer can then apply to a given individual's financial data to help create a tax return. Symbolic AI is good at logical reasoning over narrow domains but doesn't do well on other dimensions of AI like perception, learning, and abstracting. The next wave of AI, statistical learning (or deep learning), is based on statistical models trained on big data for specific problem domains. Systems using deep learning can recognize patterns in data and make predictions. Statistical modelling using deep neural networks, the most widely used AI technology today, has had a transformative impact in essential areas like image classification, natural language processing, and speech recognition.

ANI is transforming all industries, enabling productivity gains through process automation and workforce augmentation, and increasing revenues by providing personalized and higher-quality AI-enhanced products and services (Rao & Verweij, 2017); it offers a significant opportunity for social good in many domains, including health and hunger, crisis response, environment, education, economic empowerment, and security and justice (McKinsey, 2018). ANI is also accelerating scientific and engineering discovery (Stevens et al., 2019). The success of deep learning can be ascribed to three factors:

• the digitization of our world, yielding massive amounts of data beyond the capability of human analysis,
• the continued exponential improvement in computing and storage cost/performance, and
• advances in deep neural network models.

Deep learning is probabilistic, deals with correlations rather than causality, requires large amounts of data, and uses complex knowledge representations. Predictions depend on the quality and completeness of training data (a toy sketch of this dependence follows the list below). Deep neural networks are brittle, may have embedded bias, cannot explain their conclusions in human-understandable terms, cannot apply their learning to a new domain, and have additional security issues (Charles, 2021). These weaknesses lead to several ethical issues, highlighted by high-profile "failures": safety and security incidents (Cooper et al., 2013; Goggin, 2019; Griggs & Wakabayashi, 2018), biased algorithms, non-transparent decisions (Angwin et al., 2016; Dastin, 2018; Eubanks, 2018; Noble, 2018; Sonnad, 2018; Zuboff, 2019), and misuse. These failures have resulted in an increased focus on AI ethics, as evidenced by a spike in the publication of AI ethics principles and guidelines from civil society organizations, research centres, private companies, and governmental agencies (Hickok, 2020). We expect that AI will evolve incrementally to address current limitations:

• making AI fairer, less biased, and more explainable by combining deep learning with symbolic AI,
• reducing the cost of deep learning with more efficient AI algorithms, and
• making AI available to more people with less technical knowledge.
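As a hedged illustration of the dependence on training data noted above (not an example from the chapter), the sketch below trains the same classifier on a representative sample and on a sample in which one sub-group of a synthetic population is under-represented, then compares error rates for that group. The data, the features, and the use of scikit-learn's LogisticRegression as a stand-in for a generic statistical-learning model are assumptions made purely for illustration.

```python
# Toy sketch: how the completeness of training data shapes a statistical model.
# All data here is synthetic; scikit-learn's LogisticRegression stands in for a
# generic statistical-learning model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_population(n):
    # Two sub-groups whose feature distributions differ slightly.
    group = rng.integers(0, 2, n)
    x = rng.normal(loc=group[:, None] * 0.5, scale=1.0, size=(n, 3))
    y = (x.sum(axis=1) + rng.normal(0, 0.5, n) > 0.75).astype(int)
    return x, y, group

x_test, y_test, g_test = make_population(5000)

for name, keep_fraction in [("representative", 1.0), ("skewed", 0.05)]:
    x, y, g = make_population(5000)
    # Under-sample group 1 in the "skewed" case to mimic incomplete data.
    mask = (g == 0) | (rng.random(len(g)) < keep_fraction)
    model = LogisticRegression().fit(x[mask], y[mask])
    error = 1 - model.score(x_test[g_test == 1], y_test[g_test == 1])
    print(f"{name} training data -> error on under-represented group: {error:.2%}")
```

In runs of this toy example, the skewed training set tends to produce higher error on the under-sampled group, illustrating the mechanism behind several of the biased-algorithm failures cited above.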
This chapter focuses on AISSNs that can do a narrow set of tasks, e.g., play chess or drive a car. Since AISSNs have no agency, they make no ethical decisions; designers, knowingly or unknowingly, do make ethical decisions that affect AISSN behaviour. The responsibility for AISSN ethics must therefore rest with the people who design these technologies and with their design processes.
3 Examples of AI "Failures"

We use three examples to highlight five ethical issues and outline how they can be mitigated.

• Accountability: designers and developers are responsible for design, development, decision processes, and outcomes;
• Privacy and User Data Rights: AISSN systems should protect user data and preserve the user's rights and power over its access and use;
• Transparency and Explainability: humans should be able to readily perceive, detect, and understand AISSN decision processes and how they are to be used;
• Fairness, Bias, and Inclusion: AISSN systems should minimize bias and promote inclusive representation;
• Safety and Security: AISSN systems should be safe and secure from cyber and cyber-physical threats.

There is a growing consensus (Fjeld et al., 2020) around the importance of these five ethical issues.
3.1 Internet of Things (IoT)

The Internet of Things (IoT) is a system of interrelated computing devices, mechanical and digital machines, objects, animals, or people that are provided with unique identifiers and the ability to transfer data over a network without requiring human-to-human or human-to-computer interaction (Gillis, 2022). Examples include connected appliances, smart home security systems, autonomous farming equipment, wearable health monitors, smart factory equipment, shipping container and logistics tracking, and wireless inventory trackers. IoT enables companies to reduce costs, improve productivity and efficiency, and offer a significantly enhanced customer experience, leading to explosive growth in IoT applications (Lueth, 2020). Machine learning, deployed throughout the IoT computing continuum from sensors/actuators to edge computers, clouds, and traditional data centres, enables complex applications to process the volume of data generated by billions of sensors (Agerwala et al., 2021; Jovanovic, 2022).

Privacy is generally defined as freedom from unwanted knowledge, observation, or the company of others. IoT offers significant benefits to businesses in increased
productivity and reduced costs; consumers benefit from an enriched user experience. However, the data gathered from billions of sensors also become sources of "raw material" for surveillance capitalism (Groopman, 2020), which is unethical because personal data is used solely to generate profit or is misused by bad actors to develop unintended products and applications. The example of smart meters below demonstrates the complexity of ensuring privacy in an IoT environment and discusses privacy-by-design as a way to meet user needs and comply with legislation.
3.1.1 Smart Meters
Smart meters are among the most widely deployed IoT devices globally; the number installed worldwide is expected to rise from 665.1 million in 2017 to more than 1.2 billion by 2024 (T&DWorld, 2019). Home energy meters installed by utilities to measure load and track energy usage can leak significant private information (Chen et al., 2018). The daily activity pattern of users can be determined from simple electric usage datasets by using well-understood techniques and machine learning: for instance, whether users like to eat out and when, whether they eat frozen dinners or prepare fresh meals, when the occupants go to bed, and whether there are children in the household. This information is private, and users may not want it revealed, but it is highly profitable when collected on a large scale. Utilities can provide the meter data to analytics companies that excel at extracting private information. The derived user behaviour information is sold to other businesses that further monetize it through targeted advertising campaigns, an example of surveillance capitalism. Though Chen et al. (2018) deal primarily with energy meters, the general conclusions apply to many IoT scenarios, including smart cities, wearables, smart health, and connected vehicles.

Privacy in an IoT environment is far more complex than in social media (Tzafestas, 2018; Weinberg et al., 2015), itself a complex and highly debated topic (BBC, 2021; TechRepublic, 2020). In the Internet web environment, data tends to be created and entered by the consumer directly or through a "footprint" left by interactions in the digital world. In IoT environments, data about consumer behaviour in the real world from varied sources (e.g., connected vehicles, wearables, smart meters, farms) is gathered passively by devices, without active user engagement, and shared dynamically with other devices. The data collected passively is processed and integrated with other data to create massive data sets from which user behaviour in the real world is derived. Even experts disagree on data ownership and how data can be used and shared in this rich IoT environment (Shea, 2018).

The European Union's General Data Protection Regulation (GDPR), which went into effect in 2018, gives users broad rights to personal data and includes the concept of "Privacy by Design". Privacy by Design is a framework for incorporating data protection and privacy features into all system engineering processes, practices, and procedures. It is based on seven principles (Wikipedia, 2022). Though embedding privacy into smart meters is challenging, techniques to concretely implement privacy
by design in smart meters have been developed (Finster & Baumgart, 2015; Gough et al., 2022). Privacy-by-design requires balancing core IoT objectives against risk to individual privacy and a clear commitment to making privacy a core part of an organization’s culture (Borelli et al., 2022). Multidisciplinary teams of computer scientists and engineers, social scientists, ethicists, and economists will be required to develop a clear understanding of the cultural, social, and economic impact of decisions and the inclusion of privacy considerations in all design aspects.
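To make the smart-meter risk and one possible mitigation concrete, the hedged sketch below first shows how a generic classifier, trained on entirely synthetic half-hourly readings, can guess whether a household was occupied in the evening, and then perturbs readings with Laplace noise before they leave the meter, a local, differential-privacy-style mechanism that is only one of many techniques discussed in the privacy-by-design literature. The data, the epsilon value, and the choice of classifier are illustrative assumptions, not recommendations or the methods of the works cited above.

```python
# Hedged sketch with synthetic data: (1) inferring evening occupancy from
# half-hourly meter readings, (2) blunting that inference by adding Laplace
# noise at the meter before readings are shared.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def synthetic_day(occupied_evening):
    load = rng.gamma(2.0, 0.15, 48)                 # baseline load, 48 half-hours
    if occupied_evening:
        load[36:44] += rng.gamma(3.0, 0.3, 8)       # cooking/TV roughly 18:00-22:00
    return load

def privatize(day, epsilon=0.5, sensitivity=1.0):
    # Laplace noise calibrated to sensitivity/epsilon, added reading by reading.
    return day + rng.laplace(0.0, sensitivity / epsilon, size=day.shape)

labels = rng.integers(0, 2, 2000)
raw_days = np.array([synthetic_day(lbl) for lbl in labels])
noisy_days = np.array([privatize(day) for day in raw_days])

clf_raw = RandomForestClassifier(n_estimators=100).fit(raw_days[:1500], labels[:1500])
clf_noisy = RandomForestClassifier(n_estimators=100).fit(noisy_days[:1500], labels[:1500])
print("occupancy guessed from raw readings:  ", clf_raw.score(raw_days[1500:], labels[1500:]))
print("occupancy guessed from noisy readings:", clf_noisy.score(noisy_days[1500:], labels[1500:]))
```

In this toy setup the noisy readings are markedly harder to classify; in a real deployment the noise level would have to be balanced against the utility's legitimate billing and load-forecasting needs, which is exactly the balancing act described above.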
3.2 Conversational AI

A conversational AI (CAI) is a computer program that uses various AI technologies to interact with its user in the user's natural language. A CAI receives either spoken phrases, translated into text using voice recognition (VR), or written text directly. CAIs use natural language processing (NLP) and natural language understanding (NLU) to analyse the text and determine its intent. Dialogue management (DM) is used to formulate, orchestrate, and convert responses into voice, text, or another human-understandable format. Machine learning and neural network models, used in all the AI functions, enable a CAI to learn from experience and deliver better responses over its life cycle.

CAIs are proliferating, driven by increased demand for better customer support services. Use cases include understanding customer queries and generating accurate responses and recommendations in retail, assisting human agents in call centres, helping users manage simple tasks like making payments, handling refunds, and tracking transactions in finance, and accelerating claims in the insurance sector (Nvidia, 2022). CAIs are used in health care to gather diagnostic information, facilitate treatment, and deliver psychotherapy (Miner et al., 2019). COVID-19 further accelerated the demand for CAIs because of the need to be informed and connected during the pandemic (MarketsandMarkets, 2021).

Though CAIs are very useful to individuals and businesses, they raise ethical issues of accountability, privacy, transparency, bias, and safety. This section will highlight bias, defined as prejudice for or against a person, group, idea, or thing expressed unfairly (Henderson et al., 2018). Our speech, historical documents, and social media are full of subtle and explicit biases that demean or exclude people because of age, gender, race, ethnicity, social class, or physical or mental traits. The examples below illustrate that CAIs based on machine learning can assimilate and propagate the implicit and explicit biases present in natural language training data, learn biases from their interactions with users, and subsequently propagate these learned biases. Caliskan et al. (2017) found that an algorithm that learns the meaning of words by analysing them in context and observing their co-occurrence with other words replicated the entire spectrum of human biases reflected in language when run on 800 billion words from the Internet.
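The following is a deliberately simplified sketch of the kind of association test Caliskan et al. (2017) describe, comparing how close target words sit to pleasant versus unpleasant attribute words in an embedding space via cosine similarity. The tiny three-dimensional vectors are invented placeholders; a real test would use pretrained embeddings and the full statistical machinery of the original paper.

```python
# Simplified word-embedding association sketch (placeholder vectors, not real
# embeddings): positive scores mean a word sits closer to "pleasant" attributes.
import numpy as np

emb = {
    "flower": np.array([0.9, 0.1, 0.2]),
    "insect": np.array([0.1, 0.9, 0.3]),
    "love": np.array([0.8, 0.2, 0.1]),
    "pleasant": np.array([0.85, 0.15, 0.2]),
    "hate": np.array([0.1, 0.8, 0.4]),
    "unpleasant": np.array([0.15, 0.85, 0.35]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, pleasant=("love", "pleasant"), unpleasant=("hate", "unpleasant")):
    # Mean similarity to pleasant attributes minus mean similarity to unpleasant ones.
    return (np.mean([cosine(emb[word], emb[a]) for a in pleasant])
            - np.mean([cosine(emb[word], emb[a]) for a in unpleasant]))

for target in ("flower", "insect"):
    print(target, round(association(target), 3))
```

With real embeddings trained on web-scale text, the same arithmetic reproduces associations such as the flower/insect and career/family effects reported in the paper, which is how biases in the training corpus become measurable in the model itself.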
The UNESCO report "I'd Blush if I Could" found that popular voice-based conversational agents reinforce commonly held gender biases that women in service positions are unassertive, subservient, and tolerant of poor treatment. The result was that real women were penalized for not being assistant-like and that sexual assault and abuse were trivialized (UNESCO, 2017).

Microsoft Corporation released Tay, a chatbot, in March 2016 and was forced to shut it down within 16 hours when the bot began to post inflammatory and offensive tweets through its Twitter account. Some users on Twitter began tweeting politically incorrect phrases, teaching Tay inflammatory messages, which Tay then internalized and repeated to other Twitter users (Schwartz, 2019). After taking Tay down, Microsoft released Zo, a "politically correct" version of the original bot. Zo, active on social networks from 2016 to 2019, was designed to shut down conversations about specific contentious topics, including politics and religion, to ensure Zo didn't offend people, illustrating that even minimal attention to social issues can reduce the propagation of biases.

Lee Luda, a social media-based chatbot developed by a South Korean startup, introduced in December 2020 and shut down in January 2021, illustrates the bias and privacy pitfalls of CAIs. Lee was trained on a large corpus of messages between couples to mimic the language patterns of a 20-year-old woman. Lee attracted 750,000 mostly teenage users in just three weeks (possibly due to the pandemic). Within weeks the chatbot was producing discriminatory and offensive language against sexual minorities. The company was blamed for releasing an ill-prepared product and is now engaged in legal battles for using personal data without proper consent (Korea Times, 2021).

Bias in CAIs is a complex and challenging problem, and approaches to avoid the propagation of biases have been developed (Hovy & Prabhumoye, 2021). Current methods for debiasing training data sets can prevent the propagation of gender biases to some extent; more work is needed to extend such techniques to other areas. Training sets grow dynamically, and these datasets must be populated with content that will not worsen the problem of bias. The design of CAIs must consider the subtle ways in which human biases are embedded in everyday language (Hannon, 2018), the role that language plays in maintaining social hierarchies (Blodgett et al., 2020), and the fact that "acceptable" norms and forms of conversing in one context might be perceived as "unacceptable" or "deviant" in another (Ruane et al., 2019). Ruane et al. (2019) argue for a "shift in mindset that considers the social context in identifying and addressing ethical concerns specific to conversational AI throughout the design and development process" and "place responsibility on designers and developers for cultivating awareness of these issues and how their approaches impact the end-user" (Ruane et al., 2019: 5). More generally, this shift in mindset is required to create inclusive AISSNs. Designers must ensure that the training data is of the highest quality and coverage and that the model has sufficient accuracy to minimize the negative impact on end-users, and they must conduct early testing with a diverse set of users to reduce unanticipated biases. Post-deployment responsibilities include continuous monitoring, retraining on more representative data, and capturing and addressing user feedback in real time.
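One frequently discussed way to populate training sets with less gender-skewed content is counterfactual data augmentation, sketched below. It is offered only as an illustration of the kind of debiasing intervention referred to above, not as the specific method used in the cited works; a production version would need far more careful handling of pronouns, names, and context.

```python
# Hedged sketch of counterfactual data augmentation: add gender-swapped copies
# of training sentences so a model sees both variants equally often.
import re

SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
         "his": "her", "himself": "herself", "herself": "himself"}

def gender_swap(sentence):
    def repl(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    pattern = r"\b(" + "|".join(SWAPS) + r")\b"
    return re.sub(pattern, repl, sentence, flags=re.IGNORECASE)

training_sentences = ["She is a nurse.", "He fixed the engine himself."]
augmented = training_sentences + [gender_swap(s) for s in training_sentences]
print(augmented)
```

The augmented set then contains both gendered variants of each sentence, giving a model trained on it less reason to associate an occupation or behaviour with one gender.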
Given the complexity of language bias, CAI research and practice require a multidisciplinary approach across computer science, computational linguistics, cognitive science, sociolinguistics, sociology, social psychology, and ethics.
3.3 Semi-autonomous Vehicle Example

One of the most significant benefits of automation is improved road safety; 42,060 people were killed in car crashes in the USA in 2020 and, worldwide, approximately 1.3 million people die in car crashes every year (Bolotnikova, 2021). Government data identifies driver behaviour or error as a factor in 94% of the crashes. Autonomous vehicles can help reduce driver error and prevent dangerous driving situations. Other benefits include greater independence for people with disabilities and the elderly, reduced costs of crashes (including medical bills, lost work time, and vehicle repair), reduced congestion, and greater energy efficiency (Future Mobility, 2022).

The Society of Automotive Engineers (SAE) defines six levels of autonomy (SAE, 2021). At Levels 0–2, the human driver is in control, and the vehicle assists with steering, acceleration/deceleration, and braking. At Level 3, the vehicle controls the driving, but a human driver must be ready to take control of the vehicle. At Level 4, the vehicle controls driving under certain conditions (e.g., good weather, on a factory floor, within a hospital campus). Level 5 is full autonomy: the vehicle can drive by itself under all conditions. In this section, we will focus on semi-autonomous vehicles (SAVs, below Level 4) rather than fully autonomous vehicles (FAVs) because:

• arguably, FAVs require AGI (Eliot, 2018), which is unlikely to be achieved for decades,
• moving to a smarter connected infrastructure to accommodate FAVs will be time-consuming and expensive and will introduce new challenges (e.g., cybersecurity),
• carmakers will not introduce fully autonomous vehicles until the technology is perfected, for liability and accountability reasons (Dickson, 2020), and
• all autonomous production cars are at or below Level 2.

Tesla Autopilot and Cadillac Super Cruise systems are both at Level 2, and Mercedes-Benz will deliver the first Level 3 car, in limited environments, in 2022 (Nedelea, 2021). Through the following case study we will highlight three ethical issues: safety, accountability, and transparency.

3.3.1 Uber Accident
On March 18, 2018, a semi-autonomous Uber Volvo XC90 SUV with a safety driver behind the wheel struck and killed a pedestrian who was walking her bicycle across a street at night. The vehicle was travelling at about 40 mph on a street with a 45 mph speed limit. The dashboard video showed that the safety driver was clearly distracted, looking down rather than at the road. The
National Transportation Safety Board (NTSB) issued its report on November 19, 2019 (NTSB, 2019). The safety driver was charged with negligent homicide on September 16, 2020; the trial, set for early 2021, has been delayed twice because of the technical complexity of the discovery process. Uber settled out of court and stopped road testing in Arizona. The NTSB report determined that the probable cause of the crash in Tempe, Arizona, was the failure of the safety driver to monitor the driving environment and the operation of the SUV. The report also found that the Uber Advanced Technologies Group had inadequate safety risk assessment procedures, ineffective oversight of vehicle operators, and a lack of adequate mechanisms for addressing operators' automation complacency.

Safety
Semi-autonomous vehicles don't make ethical decisions, but designers do. For example, designers have to decide how the neural networks in cars will be trained. Should the training data set only include the most likely scenarios, or should it be split between plausible and rare accident scenarios? In the second case, the car may be less safe in typical driving situations, which may result in more accidents. These are ethical decisions because they determine the behaviour of a semi-autonomous vehicle in particular driving situations (Basl & Behrends, 2020). Since designers make ethical decisions, the focus of ethics must be on design processes. Safety-by-design is a prime example of Ethical Design: it integrates hazard identification and risk assessment methods early in the design process to eliminate or minimize the risks of harm throughout the construction and life of the product being designed. Cars are designed to be safe, and there are established standards for measuring the safety of current Level 0 and Level 1 vehicles. For Level 3 and Level 4 automated driving, the goal is to develop solutions that lead to fewer hazards and crashes than the average human driver. The statistical nature of machine learning and the dependence on valid training data make safety a much more complex problem. Though more work is needed, safety-by-design methodologies are being developed (Group Mercedes-Benz, 2019).

Accountability
In the case of the Uber accident, the NTSB identified the driver and Uber as contributing to the crash. Experts say that as we move from SAVs to FAVs, carmakers are accountable and liable for crashes, not the vehicle owner or the person's insurance company (IEEE, 2018). But the dividing line between human and machine responsibility isn't always apparent in SAVs. At a minimum, automakers should incorporate mechanisms to identify the actual cause of an accident and, subsequently, the responsible party (Poszler & Geißlinger, 2021). One example of better accountability by design is a verified, documented, and transparent transfer-of-control protocol to know who is in control and when (Ethics Commission, 2017); a minimal sketch of such a logged handover appears at the end of this section.

Transparency
The operation of SAVs must be transparent to all stakeholders: to users, to build trust; to regulators, to certify and validate the SAV; to accident investigators, to determine the cause of an accident and establish accountability; and to the general public, to build confidence and ease the introduction of this emerging technology. "Transparency is a prerequisite for ethical engagement"
in developing autonomous cars" (McBride, 2015: 182). Ensuring transparency while addressing copyright, intellectual property, security, and ethics is a multidisciplinary challenge (Holstein et al., 2018). Transparency must be practiced at all levels. CEO tweets of one autonomous vehicle manufacturer, hyperbolic company blog posts, and overinflated claims about the autopilot have created a culture of recklessness, endangering vehicle owners and other drivers (Barry, 2021). The director of the autopilot software has told the California Department of Motor Vehicles that the CEO exaggerates the capabilities of Tesla's autonomous driving systems (Brain, 2021).

The Uber crash example illustrates that the design, development, testing, and introduction of SAVs raise ethical issues of accountability, safety, and transparency at the individual, institutional, and societal levels. These issues can be addressed by following Ethical Design principles throughout design, development, and use. Design processes that establish accountability while ensuring safety with complete transparency will go a long way toward gaining trust. Deciding how to introduce semi-autonomous vehicles onto public roads safely is a complex issue with many tradeoffs (Bogost, 2018; Marshall, 2018): citizen rights versus the reduction of future casualties, and improved mobility versus increased pollution and traffic congestion. Addressing these tradeoffs will require the multidisciplinary collaboration of automakers, governments, citizens, and subject matter experts from academia and research institutions in various disciplines, especially engineering, civics, human factors, ethics, psychology, and the social sciences.
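As a hedged illustration of the accountability-by-design idea mentioned above, the sketch below records control handovers in an append-only log so that an investigator can later ask who was in control at a given time. The field names and structure are invented for illustration and are not an industry or regulatory standard.

```python
# Illustrative sketch of a documented transfer-of-control record for a
# semi-autonomous vehicle; all names and fields are hypothetical.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass
class ControlEvent:
    timestamp: str
    controller: str        # "automation" or "human_driver"
    reason: str            # e.g. "driver_request", "sensor_degradation"

@dataclass
class ControlLog:
    events: List[ControlEvent] = field(default_factory=list)

    def hand_over(self, controller: str, reason: str) -> None:
        # Append-only: every handover is timestamped and never overwritten.
        self.events.append(ControlEvent(
            timestamp=datetime.now(timezone.utc).isoformat(),
            controller=controller,
            reason=reason))

    def controller_at(self, when_iso: str) -> str:
        # Last recorded controller before the queried time (events are in order).
        current = "human_driver"
        for event in self.events:
            if event.timestamp <= when_iso:
                current = event.controller
        return current

log = ControlLog()
log.hand_over("automation", "driver_request")
log.hand_over("human_driver", "sensor_degradation")
print(log.controller_at(datetime.now(timezone.utc).isoformat()))
```

A record of this kind addresses only one slice of accountability; establishing why the responsible party acted as it did still requires the safety and transparency measures discussed above.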
4 Ethical Design and Multidisciplinarity

The examples in this chapter, the Internet of Things, conversational AI, and semi-autonomous vehicles, illustrate the ethical issues raised by AISSNs. Since designers of AISSNs, knowingly or unknowingly, make ethical decisions, AISSN ethics can be addressed through a fundamental shift in mindset in which end-user privacy, safety, accountability, transparency, and inclusion are considered part of every critical design decision. Ethical Design uses widely accepted moral standards of right and wrong to guide people's conduct in the ideation, design, development, deployment, and use of AISSNs. Ethical Design is a humanistic approach to design. Toolkits are now available to help designers understand the social and human impact of their AISSN designs, mitigate bias, and improve privacy, transparency, explainability, and safety (Durmus, 2021). Ethical Design is an example of value sensitive design (VSD), a theoretically grounded approach to the design of technology that accounts for human values morally and comprehensively (Friedman & Hendry, 2019; Shonhiwa, 2020). VSD has been adapted to AISSN systems for conversational AI (Wambsganss et al., 2021), a SARS-CoV-2 contact-tracing app (Umbrello & van de Poel, 2021), privacy-by-design, and care robots.
Each example illustrates the need for multidisciplinary teams (depending on the domain) of computer scientists and engineers, sociologists, economists, ethicists, linguists, and cultural anthropologists, underscoring the importance of a multidisciplinary approach to design.
5 Summary and Discussion

We contend that AI ethics should be focused on autonomous and intelligent systems and services that utilize ANI technologies (AISSNs). AGI does not exist today, and there is no evidence that it will develop anytime soon. Three examples revealed five ethical concerns with AISSNs: Accountability; Privacy and User Data Rights; Transparency and Explainability; Fairness, Bias, and Inclusion; and Safety and Security. These concerns can be alleviated with human-centred Ethical Design. Multidisciplinary teams of computer scientists and engineers, sociologists, economists, ethicists, linguists, and cultural anthropologists could be needed (depending on the domain) as design processes become more humanistic.

We include a discussion here on ChatGPT, which has taken the world by storm since it was released to the public in late 2022 (OpenAI, 2023; Wikipedia Contributors, 2023), long after this article was written. ChatGPT is a powerful conversational agent that uses GPT-3, a state-of-the-art language generation model developed by OpenAI. ChatGPT uses statistical models to learn the structure of text corpora, such as common word sequences and word usage patterns, and then predicts the most likely text to come next. ChatGPT can generate human-like text, have natural and engaging conversations with users, and generate seemingly new, realistic content, such as text, images, or audio, from the training data. Powerful as it is, ChatGPT is based on ANI technologies and has limitations. ChatGPT does not understand the world, does not think, and is incapable of logical reasoning or even calculating. ChatGPT produces coherent responses that follow grammatical and structural rules, but its text can be incorrect. Because we find coherent arguments believable, we must not confuse coherence with correctness. Like other conversational agents discussed in this chapter, ChatGPT raises ethical issues of bias, fairness, safety, and security. Since ChatGPT tries to mirror language as accurately as possible, it should be no surprise that ChatGPT reflects dictatorial, toxic, and oppressive tendencies in the training data. As with all ANI technologies, these concerns can be alleviated with human-centred Ethical Design.

Many AI ethics questions remain unanswered, and we close with three important ones.

Will Ethical Design be adopted, and why? An informal 2020 Pew Research survey (Rainee et al., 2021) asked, "By 2030, will most of the AI systems being used by organizations of all sorts employ ethical principles focused primarily on the public good?" Of the 600 technology innovators, developers, business and policy leaders, researchers, and activists surveyed, 68% responded "no" and 32% "yes". Two primary worries among the 68% were that ethical behaviours and outcomes are hard
to define, implement, and enforce, and that the AI ecosystem is dominated by competing businesses seeking to maximize profits and governments seeking to surveil and control their populations. These are reasonable worries; some autocratic governments, for example China, Russia, and Saudi Arabia, are exploiting AI technology for mass surveillance. Our position for democratic societies and institutions, reinforced by the "hopeful" comments in the Pew report, is outlined below.

• If we continue to develop AISSNs to augment humans, with a human always in the loop, ethics for AI technologies will not be limited by the lack of ethical frameworks. Though much more work is needed, real incremental progress is possible, given the many tools currently available for Ethical Design.
• Adoption rates of technologies rise when people want them and when they provide value rather than cause harm. In recent years, high-profile AI "failures" and the resulting global focus on ethical AI have raised awareness of trustworthy AI in democratic societies. A greater understanding of the importance and value of data drives consumers to care more about how their data is used. Consumers will be more willing to share data with companies with a reputation for responsible data handling, and these companies can gain a significant competitive advantage through continued access to consumer data. "Companies that take the lead on this issue—by demonstrating that they are hearing what consumers are saying and taking meaningful action—will be positioned to reap the ongoing benefits of access to consumer data" (KPMG, 2021: 7). A widening user base and increased usage lead to increased corporate profits.
• As Ethical Design increasingly becomes a business consideration, companies that use AI technologies will assess whether the AI systems they consider meet ethical goals, utilizing tools and services that will increasingly become available; this, in turn, will incentivize AI technology providers to adopt Ethical Design.

How can malicious use by bad actors be reduced or minimized? "Malicious use" can be loosely defined as all practices intended to compromise the security of individuals, groups, or society, as well as fraudulent activity. A report by 26 experts (Brundage et al., 2018) surveys the landscape of potential security threats from the malicious uses of artificial intelligence technologies and considers ways to better forecast, prevent, and mitigate these threats. The malicious use of ANI is expected to increase. Deep learning is being, and will continue to be, exploited by criminals to increase ransomware attacks and subvert critical infrastructure; to expand threats of privacy invasion (through pervasive surveillance and face recognition) and social manipulation (through "deep fakes" and the analysis of human behaviours, moods, and beliefs based on available data); and to increase surveillance capitalism. ChatGPT, with its generative language model, makes it very easy to spread fake news, misleading information, or inappropriate images; it can aid cyber attackers who have high technical capabilities but lack linguistic skills by helping them create convincing phishing emails in the native language of their targets; and, since ChatGPT can write code in multiple languages, it can help create polymorphic malware with advanced capabilities that
can easily evade security products and make mitigation cumbersome, with minimal effort or investment by the adversary.

We believe that putting the brakes on the development of AI technologies is not an option because the barriers to entry are low. For a rapidly evolving technology like generative AI, it is crucial to bring policymakers, researchers, and developers together to share information and ideas on preventing malicious use. Developers should build technical safeguards into these systems to avoid misuse, such as algorithms that detect and flag potential deep fakes or other malicious content. Educating the public about the potential risks and benefits may help to reduce the chances of being harmed by misuse. Finally, global regulation must be part of the answer. Regulation can require developers to adhere to technical, ethical, and legal standards when creating AISSNs, and impose penalties on those who engage in malicious behaviour.

But this raises the question: what are the characteristics of legislation for AISSNs that is enforceable in democratic societies? What is the role of legislation in communities with strong marketplaces? Legislation lags behind technology evolution and deployment, but when pervasive technology seriously impinges on human rights and the impact becomes sufficiently painful, legal and regulatory frameworks do emerge. The European Union's General Data Protection Regulation, which defines broad personal data rights, is a good example. In April 2022, the EU adopted the landmark Digital Services Act (DSA) to govern how content can be shared and viewed online. As with the GDPR, the regulation and enforceability of the DSA are being questioned (Satariano, 2022). Furthermore, businesses developing AI systems must be engaged in crafting legislation and developing regulation, at a minimum by sharing best practices and business pain points in legislative and regulatory forums. Businesses can play a much more significant role. For example, the Business Roundtable, with 230 CEOs of some of the world's largest companies representing all aspects of the AI ecosystem, has developed core principles for Responsible AI (Ethical Design). The Roundtable has called on its members to adopt the core principles and measure their effectiveness, and it is working with the Administration, Congress, and regulators to establish legislation and flexible regulations consistent with the principles (BR, 2022).

As these questions are discussed and debated, we contend that with a holistic, multidisciplinary approach to Ethical Design as the norm, driven primarily by market forces and with enforceable legislation as needed, communities and societies in democratic institutions will derive the real economic and social benefits of AISSNs.
References

Agerwala, T., Amaro, R., DiMatteo, T., Lazowska, E., Raghavan, P., Pascucci, V., & Taylor, V. (2021, May). Opportunities in artificial intelligence (AI) and machine learning (ML). U.S. National Science Foundation Advisory Committee for Cyberinfrastructure, Cyberinfrastructure Research and Innovation Working Group Report. https://nsf.gov/cise/oac/CIRI-WG-Final-Report.pdf
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias. ProPublica. https:// www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. Accessed May 21, 2020. Barry, K. (2021, November 11). Elon Musk, self-driving, and the dangers of wishful thinking: How Tesla’s marketing hype got ahead of its technology. https://www.consumerreports.org/automo tive-industry/elon-musk-tesla-self-driving-and-dangers-of-wishful-thinking-a8114459525/. Accessed June 24, 2022. Basl, J., & Behrends, J. (2020). Why everyone has it wrong about the ethics of autonomous vehicles. In National Academy of Engineering. Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2019 Symposium. Washington, DC: The National Academies Press. https:// doi.org/10.17226/25620 BBC. (2021, September 24). Facebook files: 5 things leaked documents reveal. https://www.bbc. com/news/technology-58678332. Accessed June 21, 2022. Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020, July 5–10). Language (technology) is power: A critical survey of “bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.48550/arXiv.2005.14050 Bogost, I. (2018, March 20). Can you sue a robocar? https://www.theatlantic.com/technology/arc hive/2018/03/can-you-sue-a-robocar/556007/. Accessed June 24, 2022. Bolotnikova, M. (2021, September 19). America’s car crash epidemic. Vox. https://www.vox.com/ 22675358/us-car-deaths-year-traffic-covid-pandemic. Accessed June 23, 2022. Borelli, D., Xie, N., & Neo, E. K. T. (2022). The internet of things: Is it just about GDPR? PwC UK. https://www.pwc.co.uk/services/risk/technology-data-analytics/data-protection/ins ights/the-internet-of-things-is-it-just-about-gdpr.html. Accessed June 21, 2022. Business Roundtable (BR). (2022). Artificial intelligence. https://www.businessroundtable.org/pol icy-perspectives/technology/ai. Accessed June 24, 2022. Brachman, R. J., & Levesque, H. J. (2022). Toward a new science of common sense. https://doi. org/10.48550/arXiv.2112.12754. Accessed May 17, 2022. Brain, E. (2021, May 8). Tesla’s director of autopilot highlights that Elon Musk exaggerates full self-driving possibilities. Hypebeast. https://hypebeast.com/2021/5/tesla-autopilot-self-drivingautonomous-elon-musk-ceo-exaggerates-tweets-dmv. Accessed June 24, 2022. Brundage, M., et al. (2018). The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. https://arxiv.org/ftp/arxiv/papers/1802/1802.07228.pdf. Accessed June 24, 2022. Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–86. https://doi.org/10.1126/sci ence.aal4230 Charles, C. Q. (2021, September 21). Revealing ways AIs fail. IEEE Spectrum. https://spectrum. ieee.org/ai-failures. Accessed May 20, 2021. Chen, D., Bovornkeeratiroj, P., Irwin, D., & Shenoy, P. (2018). Private memoirs of IoT devices: Safeguarding user privacy in the IoT era. In 2018 IEEE 38th International Conference on Distributed Computing Systems. https://doi.org/10.1109/ICDCS.2018.00133 Chowdhury, T. D. (2019, July 12). The state of play in emotion AI. https://www.linkedin.com/pulse/ state-play-emotion-ai-tamal-chowdhury/. Accessed May 17, 2022. Cooper, M. A., Ibrahim, A., Lyu, H., & Makary, M. A. (2013). Underreporting of robotic surgery complications. Journal for Healthcare Quality, 37(2), 133–138. https://doi.org/10.1111/jhq. 
12036 Dastin, J. (2018, October 11). Machine learning failure: Amazon scraps biased recruiting tool. https://www.carriermanagement.com/news/2018/10/11/185221.htm. Accessed May 21, 2022. Dickson, B. (2020, July 29). Why deep learning won’t give us level 5 self-driving cars. TechTalks. https://bdtechtalks.com/2020/07/29/self-driving-tesla-car-deep-learning/. Accessed June 23, 2022. Durmus, M. (2021, June 14). A brief overview of some ethical-AI toolkits. Nerd for Tech. https://med ium.com/nerd-for-tech/an-brief-overview-of-some-ethical-ai-toolkits-712afe9f3b3a. Accessed June 24, 2022.
Eliot, L. (2018, July 10). Singularity and AI self-driving cars. AItrends. https://www.aitrends.com/ ai-insider/singularity-and-ai-self-driving-cars/. Accessed June 23, 2022. Ethics Commission. (2017). Automated and connected driving. Appointed by the Federal Minister of Transport and Digital Infrastructure. Report June. https://www.bmvi.de/SharedDocs/EN/pub lications/report-ethics-commission.pdf?__blob=publicationFile. Accessed June 23, 2022. Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police and punish the poor. St Martin’s Press. Finster, S., & Baumgart, I. (2015). Privacy-aware smart metering: A survey. IEEE Communications Surveys and Tutorials, 17(2), 1088–1101, Second quarter. https://doi.org/10.1109/COMST. 2015.2425958 Fjeld, J., Achten, N., Hilligoss, H., Nagy, A., & Srikumar, M. (2020, January 15). Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI. Berkman Klein Center Research Publication No. 2020-1. https://ssrn.com/abstract=351 8482 or https://doi.org/10.2139/ssrn.3518482 Friedman, B., & Hendry, D. G. (2019). Value sensitive design, shaping technology with moral imagination. MIT Press. Future Mobility. (2022). Benefits of self-driving vehicles. Coalition for future mobility. https://coa litionforfuturemobility.com/benefits-of-self-driving-vehicles/. Accessed June 23, 2022. Gillis, A. S. (2022). What is the internet of things (IoT)? TechTarget. https://www.techtarget.com/ iotagenda/definition/Internet-of-Things-IoT. Accessed May 27, 2022. Goggin, B. (2019). After several deaths, Tesla is still sending mixed messages about autopilot. https://digg.com/2018/tesla-crash-autopilot-investigation. Accessed May 20, 2022. Gough, M. B., Santos, S. F., Al Skaif, T., Javadi, M. S., Castro, R., & Catalão, J. P. S. (2022). Preserving privacy of smart meter data in a smart grid environment. IEEE Transactions on Industrial Informatics, 18(1), 707–18. https://doi.org/10.1109/TII.2021.3074915 Griggs, T., & Wakabayashi, D. (2018). Driving uber killed a pedestrian in Arizona. The New York Times. https://www.nytimes.com/interactive/2018/03/20/us/self-driving-uber-pedestrian-killed. html. Accessed May 20, 2022. Groopman, J. (2020, February 17). IoT data monetization contributes to surveillance capitalism. TechTarget. https://www.techtarget.com/iotagenda/opinion/IoT-data-monetization-contributesto-surveillance-capitalism. Accessed May 2022. Group Mercedes-Benz. (2019, July 2). Safety first for automated driving. https://group.mercedesbenz.com/documents/innovation/other/safety-first-for-automated-driving.pdf?r=dai Hannon, C. (2018). Avoiding bias in robot speech. ACM Interactions, 25(5), 34–37. https://doi.org/ 10.1145/3236671 Henderson, P., Sinha, K., Angelard-Gontier, N., Ke, N. R., Fried, G., Lowe, R., Pineau J. (2018). Ethical challenges in data-driven dialogue systems. In AIES ‘18: Proceedings of the 2018 AAAI/ ACM Conference on AI, Ethics, and Society (pp. 123–29). https://doi.org/10.1145/3278721.327 8777 Hickok, M. (2020). Lessons learned from AI ethics principles for future actions. AI and Ethics, 1, 41–47. https://doi.org/10.1007/s43681-020-00008-1 Holstein, T., Dodig-Crnkovic, G., Pelliccione, P. (2018). Ethical and social aspects of self-driving cars. ARXIV’18, January, Gothenburg, Sweden. https://arxiv.org/pdf/1802.04103.pdf. Accessed June 24, 2022. Hovy, D., & Prabhumoye, S. (2021). Five sources of bias in natural language processing. Language and Linguistics Compass, 15(8). 
https://doi.org/10.1111/lnc3.12432 IEEE. (2018). Who’s responsible for an autonomous vehicle accident? IEEE Innovation at Work. https://innovationatwork.ieee.org/whos-responsible-for-an-autonomous-vehicle-acc ident/. Accessed June 23, 2022. Jovanovic, B. (2022, May 13). Internet of Things statistics for 2022—Taking things apart. DataProt. https://dataprot.net/statistics/iot-statistics/. Accessed May 27, 2022.
Korea Times. (2021). AI developer to discard data used in controversial ‘female’ chatbot. Updated 16 January 2021. https://www.koreatimes.co.kr/www/tech/2022/01/133_302537.html. Accessed June 22, 2022. Korteling, J. E., van de Boer-Visschedijk, G. C., Blankendaal, R. A. M., Boonekamp, R. C., & Eikelboom, A. R. (2021). Human-versus-artificial intelligence. Frontiers Artificial Intelligent. https://doi.org/10.3389/frai.2021.622364. https://www.frontiersin.org/articles/10.3389/ frai.2021.622364/full. Accessed May 17, 2022. KPMG. (2021, August). Corporate data responsibility-bridging the consumertrust gap. https://adv isory.kpmg.us/content/dam/advisory/en/pdfs/2021/corporate-data-responsibility-bridging-theconsumer-trust-gap.pdf. Accessed June 24, 2022. Launchbury, J. (2017). A DARPA perspective on artificial intelligence. https://machinelearning. technicacuriosa.com/2017/03/19/a-darpa-perspective-on-artificial-intelligence/. Accessed May 17, 2022. Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. The Alan Turing Institute. https:// doi.org/10.5281/zenodo.3240529. Lueth, K. L. (2020). Top 10 IoT applications in 2020. IoT Analytics. https://iot-analytics.com/top10-iot-applications-in-2020/. Accessed May 27, 2022. MarketsandMarkets. (2021, October). Conversational AI market. Report Code: TC 6976. https:// www.marketsandmarkets.com/Market-Reports/conversational-ai-market-49043506.html. Accessed June 22, 2022. Marshall, A. (2018, March 23). The lose-lose ethics of testing self-driving cars. Wired. https://www. wired.com/story/lose-lose-ethics-self-driving-public/. Accessed June 24, 2022. McBride, N. (2015). The ethics of driverless cars. ACM SIGCAS Computers and Society., 45(3), 179–184. https://doi.org/10.1145/2874239.2874265 McKinsey Global Institute. (2018). Notes from the AI frontier. Applying AI for Social Good. McKinsey Global Institute. https://www.mckinsey.com/featured-insights/artificial-intelligence/ applying-artificial-intelligence-for-social-good. Accessed May 20, 2022. Miner, A. S., Shah, N., Bullock, K. D., Arnow, B. A., Bailenson, J., & Hancock, J. (2019, October 18). Key considerations for incorporating conversational AI in psychotherapy. Frontiner Psychiatry. https://doi.org/10.3389/fpsyt.2019.00746. Accessed June 22, 2022. Moor, J. H. (2006). The nature, importance, and difficulty of machine ethics. IEEE Intelligent Systems, 21(4), 18–21. https://doi.org/10.1109/MIS.2006.80 Müller, V. C. (2021). Ethics of artificial intelligence and robotics. In E. N. Zalta (Ed.), The stanford encyclopedia of philosophy (Summer 2021 edition). https://plato.stanford.edu/archives/sum 2021/entries/ethics-ai/ National Transportation Safety Board. (2019). Collision between vehicle controlled by developmental automated driving system and pedestrian tempe, Arizona. 18 March. Accident report. NTSB/HAR-19/03. PB2019-101402. https://www.ntsb.gov/investigations/AccidentR eports/Reports/HAR1903.pdf Nedelea, A. (2021, December 10). Mercedes is first to sell a Level 3 autonomous vehicle in 2022. insideevs. https://insideevs.com/news/553659/mercedes-level3-autonomous-driving2022/. Accessed June 23, 2022. Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. New York University Press. https://doi.org/10.2307/j.ctt1pwt9w5 Nvidia Developer. (2022). Conversational AI demystified. https://developer.nvidia.com/conversat ional-ai. Accessed June 22, 2022. OpenAI. 
(2023, March 27). GPT-4 Technical Report. arXiv:submit/4812508 [cs.CL]. https://cdn. openai.com/papers/gpt-4.pdf. Accessed June 13, 2023. Poszler, F., & Geißlinger, M. (2021, February). AI and autonomous driving: Key ethical considerations. Research Brief . Technical University of Munich, Munich Center for Technology in Society, Institute for Ethics in Artificial Intelligence. https://ieai.mcts.tum.de/wp-content/upl oads/2021/02/ResearchBrief_February2021_AutonomousVehicles_FINAL.pdf. Accessed June 23, 2022.
Rainee, L., Anderson, J., & Vogels, E. A. (2021, June 16). Experts doubt ethical AI design will be broadly adopted as the norm within the next decade. Pew Research Center. https://www.pewresearch.org/internet/2021/06/16/experts-doubt-ethical-aidesign-will-be-broadly-adopted-as-the-norm-within-the-next-decade/. Accessed June 24, 2022. Rao, A. S., & Verweij, G. (2017). Sizing the prize. What’s the real value of AI for your business and how can you capitalise? www.pwc.com/AI. https://www.pwc.com/gx/en/issues/analytics/ assets/pwc-ai-analysis-sizing-the-prize-report.pdf. Accessed May 20, 2022. Ruane, E., Birhane, A., & Ventresque, A. (2019, December). Conversational AI: Social and ethical considerations. In AICS—27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science. Galway, Ireland. https://www.researchgate.net/publication/337925917_Conversat ional_AI_Social_and_Ethical_Considerations. Accessed June 22, 2022. SAE International. (2021, April). Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles. SAE Standard J3016 Rev. https://www.sae.org/standards/ content/j3016_202104/. Accessed June 23, 2022. Satariano, A. (2022, April 22). EU takes aim at social media’s harms with landmark new law. The New York Times. https://www.nytimes.com/2022/04/22/technology/european-union-socialmedia-law.html. Accessed June 24, 2022. Schwartz, O. (2019, November 25). In 2016, Microsoft’s racist chatbot revealed the dangers of online conversation. In IEEE Spectrum. https://spectrum.ieee.org/in-2016-microsofts-racist-cha tbot-revealed-the-dangers-of-online-conversation. Accessed June 22, 2022. Shea, S. (2018, April 23). The great IoT data ownership debate. TechTarget. https://www.techtarget. com/iotagenda/feature/The-great-IoT-data-ownership-debate. Accessed June 21, 2022. Shonhiwa, M. (2020, September 3). Human values matter: Why value-sensitive design should be part of every UX designer’s toolkit. UX Collective. https://uxdesign.cc/human-values-matter-whyvalue-sensitive-design-should-be-part-of-every-ux-designers-toolkit-e53ffe7ec436. Accessed June 24, 2022. Sonnad, N. (2018, May 3). A flawed algorithm led the UK to deport thousands of students. Quartz. https://qz.com/1268231/a-toeic-test-led-the-uk-to-deport-thousands-of-stu dents/. Accessed May 27, 2022. Stevens, R., Taylor, V., Nichols, J., MacCabe, A. B., Yellick, K., & Brown, D. (2019). AI for science. Report on the Department of Energy (DOE) Town Halls on Artificial Intelligence (AI) for Science. https://doi.org/10.2172/1604756. Accessed May 20, 2022. T&DWorld. (2019, August 8). Global smart meter total to double by 2024. https://www.tdworld. com/smart-utility/metering/article/20972943/global-smart-meter-total-to-double-by-2024. Accessed May 27, 2022. TechRepublic. (2020, July 30). Facebook data privacy scandal: A cheat sheet. TechRepublic Staff in Security. https://www.techrepublic.com/article/facebook-data-privacy-scandal-a-cheatsheet/. Accessed June 21, 2022. Tzafestas, S. G. (2018). Ethics and law in the internet of things world. Smart Cities, 1, 98–120. https://doi.org/10.3390/smartcities1010006. Accessed June 21, 2022. Umbrello, S., & van de Poel, I. (2021). Mapping value sensitive design onto AI for social good principles. AI and Ethics, 1, 283–296. https://doi.org/10.1007/s43681-021-00038-3 UNESCO. (2017). I’d blush if I could. Closing gender divides in digital skills through education. https://en.unesco.org/Id-blush-if-I-could. Accessed June 22, 2022. 
Wambsganss, T., Höch, A., Naim Zierau, N., & Söllner, M. (2021, March). Ethical design of conversational agents: Towards principles for a value-sensitive design. In 16th International Conference on Wirtschaftsinformatik. Essen, Germany. https://www.alexandria.unisg.ch/264972/1/202008_ Value-Based_CA_v6_camera_ready.pdf. Accessed June 24, 2022. Weinberg, B. D., Milne, G. R., Andonova, Y. G., & Hajjata, F. M. (2015, November–December). Internet of things: Convenience versus privacy and secrecy. Business Horizons, 58(6), 615–624. https://doi.org/10.1016/j.bushor.2015.06.005. Accessed June 21, 2022. Wikipedia Contributors. (2023). ChatGPT. In Wikipedia, The Free Encyclopedia. Retrieved 15:53, June 13, 2023, from https://en.wikipedia.org/w/index.php?title=ChatGPT&oldid=1159952761
Wikipedia, The Free Encyclopedia (2022, March 26). Privacy by design. Date of last revision. https://en.wikipedia.org/w/index.php?title=Privacy_by_design&oldid=1079304746. Accessed June 21, 2022. Zuboff, S. (2019). The age of surveillance capitalism: The fight for a human future at the new frontier of power. Public Affairs.
Chapter 5
Advaita Ethics for the Machine Age: The Pursuit of Happiness in an Interconnected World Swami Bodhananda
Abstract Ethics is a set of values by living which the individual comes to flourish in harmony with society and nature. While society is a creation of language and technology, happiness is a function of ethical living. Artificial intelligence, biotechnology, and nanotechnology have opened up new possibilities for human happiness. Along with these disruptive technologies, market-driven competition and the democratic aspirations of people have raised new dilemmas and questions that require a global conversation involving all stakeholders. Advaita, the non-dualist system of Indian philosophy, is a worldview and value system drawn from the collective wisdom of India. Advaita ethics can contribute positively to the emerging global dialogue.

Keywords Advaita ethics · Advaita-yoga drishti · Interconnected world · Happiness · Moral machine · Co-existence
1 The Context

Morality is the key to wise choices based on the principles of non-violence, mutual respect, and empathy, which extend not only to human beings but also to the entire living ecosystem. The burning question in the field of technology is whether engineers and computer scientists will ever be able to programme and construct a moral machine capable of contributing to human happiness, survival, and flourishing. For morality and ethics, our model is an enlightened human being who is emotionally stable, free from biases, and makes rational choices with due consideration for the short-term and long-term consequences of such choices. Whether a machine can be built on such a model is the question we will discuss in the following sections.
A related question is whether there are enlightened humans in the world at all. Socrates, Aristotle, St. Augustine, Kant, Mill, and others in the West, Sankara, Ramanuja, and Madhva in the Indian tradition, and Confucius and Lao Tzu from China have dwelt upon the meaning of ethics, virtue, and a good life, and have prescribed values for human conduct. Unfortunately, humans have not been quite successful in living those values. One of the Mahabharata characters, Prince Duryodhana, laments that though he knew the difference between good and bad, he was not able to choose good over bad actions. From our social and evolutionary history we may surmise that human beings are certainly not an exemplary model for programming a moral robot.

What is expected from a moral machine is that it should not hurt any living being, nor should it harm the ecosystem; obviously, it should not hurt itself either. The zeroth law of Asimov (1950) cautions against not only action but also inaction by the robot that would cause harm to humanity. What humanity, in a utopian manner, looks forward to is a "non-violent machine" that co-exists and co-flourishes with humans and shares common values and aspirations. To put it in simpler words, a human–machine symbiosis is what is imagined.

Another important question bearing on this possibility is whether machines need to have a self, free will, and consciousness in order to act morally. This is another significant question for which we do not have clear answers. From the perspective of disciplinary approaches and theorization, scientists, philosophers, psychologists, and ethicists disagree among themselves on the definition of the self, free will, and consciousness. For instance, for the Buddhists there is no enduring self, and scientists do not accept free will. Philosophers across the centuries continue to debate the nature of consciousness, and ethicists have been arguing about the principles of ethics since time immemorial.
2 The Moral Machine
A conjecture proposed in this chapter is that a “moral machine” need not have consciousness to function morally, for the well-being of living beings. The “moral machine” needs to collect data from all stakeholders, store and process them, and make choices that enhance optimum collective well-being. The “moral machines” need not be conscious of, or subjectively feel (the so-called felt experience or qualia), such actions. The moral machine’s choices will depend on its “ethical programming” and the quality and complexity of the data at its disposal. Like human beings, “moral machines” can be fallible, and better efficiency will be achieved by learning through trial and error over time and by receiving feedback in a continuous manner. For example, just as a child learns about ethical behaviour as she grows up interacting with society, so too machines interacting with humans, in different niches, can learn to become better adapted to ethical behaviour. Partly, the learning will happen with the help of better technologies, creating new data sets and updated algorithms for machine learning, and partly owing to careful human intervention. The response to our theoretical question whether engineers can design “moral machines” is a resounding ‘yes’. Engineers could programme machines that
will cooperate with humans in creating a society based on ethics and which will contribute to human well-being. Humans can learn from machines and machines from humans in a healthy back-and-forth feedback process.
3 Human Machine Interaction and the Artificial Moral Machine
Human history is one of human machine interaction. In simple terms, a machine is a sophisticated tool that humans use to understand and master their environment. The first tool that enhanced human mastery was the opposable thumb, with which humans could pick a tender leaf and push aside a large boulder, make an arrowhead and hold and fling a stone, climb trees and look afar. Thereafter, there was no looking back, and presently humans can peer into the heart of the universe with the help of tools like the James Webb Space Telescope and smash particles in miles-long particle accelerators. Human civilization, as we see it today, is possibly a product of human machine interaction. Just as the continued use of the opposable thumb helped the development of the neocortex, the use of various tools has helped humans to develop language, culture, mathematics, and science. The human species and the use of tools have a history of co-existence and constant refashioning. Humans and machines have been friends since known times, and this friendship is likely to continue in the foreseeable future with more and more complexities built into the design and purpose of machines. While machines have historically subserved human purposes, human errors as well as machine dysfunctions have caused untold misery to us. Yet, the balance of benefits and losses was always in humans’ favour. The intervention of AI and sophisticated machines in diagnostics, surgery, patient care, and overall health management is changing the public and social space. So too they are widely used in marketing, customer care, financial services, banking, and share market operations. Risk-free self-driving electric cars might outnumber polluting cars in a decade or so and reduce the carbon footprint. Target-seeking guided missiles, drones, and intelligent mine sweepers are already active in war theatres, which help military planners to reduce collateral damage and friendly fire. There are reportedly more than one million deaths due to motor accidents every year, but that does not deter people from using motor vehicles. Imagination, adventure, and daring death have been part of human nature, and risk taking is built into the human psyche. Safety, privacy, and bias concerns are being addressed successfully in the design of artificial narrow intelligence (ANI) machines. It is only a matter of time, experience, learning, and tweaking to develop better and better designs and to see further progress in this field.
4 The Method of Science, Advaita, and Its Ontology of Fundamental Reality
Scientific investigations are largely founded on the limitation (physical constant) of the speed of light (186,000 miles per second), the relativity of the time–space continuum, and quantum indeterminacy. The scientific methodology is confined to tangible phenomena, employing experimental and analytical methods, proof, and repeatability of outcomes. Scientific theories are provisional since they are subject to evaluation by falsifiability and similar deductive standards. While life is the flaming tip of the universe, the inquiry into the extra-terrestrial existence of life elsewhere in the universe is an ongoing search. Irrespective of scientific methodologies and developments, the phenomenon of life and its origins may not be available for objective and analytical inquiries. As the Buddha said, it is easy to extinguish life, but impossible to build it from scratch. While science is concerned with the structure of matter, the visible and measurable phenomena, ethics is concerned with life, its wholeness, its sustenance, its higher possibilities, and its flourishing. Scientific methods and theories cannot provide the ground for moral principles. A prominent belief is that building artificial “moral machines” is within human capability and can be achieved in this century itself. What are the ethical rules that will go into the brain of machine intelligence? To answer this vexed question, we will take recourse to the Advaita philosophy. Advaita proposes an ontology of fundamental reality, an episteme of coming to direct knowledge of that reality, and a system of ethical values based on that knowledge. Such interconnected and experiential knowledge is gained in contemplative states, and can be termed “Advaita-Yoga Drishti”, or the insight born of yoga discipline. While scientific methodologies do not have much to say on the mystery of life, Advaita and its episteme—the “yoga Drishti”—seem to offer an appropriate framework to dwell upon the questions of life, ethics, and life’s possibilities. The root of life goes beyond physical structures, evolutionary processes, and technological manipulations, all the way to the ontological and existential realms, given that the quest is for ultimate reality. Humans in contemplative states, and in deep silence, awaken to that reality. The yogis in their contemplative states discovered fundamental truths and recorded them in the corpus of texts known as the Upanishads. Yoga is a systematic body–mind–ego discipline to open the doors of deeper perception. It involves food discipline, body postures, breathing exercises, relentless critiquing of intellectual concepts and categories, situating the ego in the family of egos, mindfulness (sakshi bhava), and, in the process, forging a new epistemic instrument (samyama) of insight (jñāna-āloka). Advaita’s goal is to transform ordinary humans into “yogi humans”, and a similar process can be considered by engineers to programme and construct a “yogi machine” based on Advaita-yoga drishti and the ensuing values. This chapter reflects on such possibilities.
5 The Advaita-Yoga Drishti and the Substratum of Consciousness
The yoga practice leads to Advaita-drishti or the vision of oneness. The whole practice is based on the a priori position that reality is sat (existent) and not asat (non-existent). Philosophers may argue endlessly about why there is something rather than nothing, or whether the world is real or unreal. Advaita-yoga drishti claims that the phenomenal world (drishya prapancham) is of indeterminate nature and belongs to a category traversing between the real and the unreal—sat asat vilakshana. Sankaracharya uses a new term, “mithya”, to describe the appearance of phenomena. The appearing and disappearing phenomenal world requires a true substratum, which according to Advaita-yoga drishti is consciousness; consciousness is absolutely real, and the world is an appearance of that reality. The Advaita-yoga drishti sees the world as a play (chit vilasa) of consciousness. Consciousness is the material of which the world of appearances is only the tip of the iceberg, be it a rainforest, the brain of a scientist, a galaxy, or a black hole. Individual consciousness reflects the universal consciousness, and the prospect held out by Advaita is that contemplating individuals will gain access to the enormous resources of that consciousness. According to Advaita-yoga drishti, consciousness is happiness, and both have the same ontological status. Happiness is not a neurochemical impulse or a brain wave, but the very foundation of the visible universe. This claim may stretch our imagination to the breaking point. But our lived experience vouches for this Advaita position. Humans are not happy with anything they have or indulge in. The theory that the more humans consume the happier they become has been trashed after long periods of frustrating economic growth. Economists now admit that GDP growth does not guarantee the happiness or well-being of people. Neither the rich nor the poor are happy. Having and not having do not matter as far as happiness is concerned. This lesson from our lived collective experience points to the subjective dimension in our search for happiness. Consciousness or happiness is not a substance for objectification, but the very nature of the subject, according to the Advaita metaphysics. The Yogi in meditative states abides in happiness. The meditative state gives the Yogi direct access to consciousness-happiness. There are two drastically opposing conceptions of the individual and their ideas about development and happiness. One is the materialist approach: exploring, subduing, and exploiting outer nature by the power of reason and technological tools, and harvesting happiness from such socialized nature in the form of consumable products. In this worldview, consumption leads to happiness. The other is the Eastern approach: exploring, controlling, and deepening the subjective, inner mind and invoking the innate happiness which is one’s essential nature. According to the Eastern yogic concept, the individual is responsible for his or her happiness, which has nothing to do with what one possesses or does not possess. The materialistic mindset seeks happiness in objects (vishaya sukham) and the contemplative mindset seeks happiness in the self (atma sukham) (Bodhananda, 2022, p. 425). The philosophy of Yoga asserts that the ultimate purpose of all human activity is the realization of
happiness. The principle of happiness should not be confused or conflated with the experience of fleeting pleasure. While pleasure is the product of indulgence in a desirable object, happiness is the result of watching desire and avoiding the trap of addictive indulgence. In social psychology this skill is termed “emotional intelligence” (Goleman & Boyatzis, 2017), and Yoga Sastra calls it “Samadhi” or somatic-cognitive-emotive balance. Once the mind is balanced through sustained practice, Patanjali says, “tadā draṣṭuḥ svarūpe ’vasthānam”,1 then we become competent to tap into the infinite resources of consciousness. Advaita’s primary concern and inquiry is human happiness. From that standpoint, the Advaita methodology is in direct conflict with the scientific methodology. Science suggests that life is a product of material evolution and consciousness is a mechanical and cognitive process of ‘garbage in and garbage out’. According to scientific methodologies and pursuits, knowledge has to be objective and will depend upon the understanding of physical objects, structures, functions, and interactions. Following the Baconian approach, happiness is achieved by organizing the world of things to control nature, to suit human purposes, and to gratify ever-expanding human needs. Presently, science-centred philosophers call this way of reasoning “enlightened rationality”. The idea of Advaita-yoga drishti, on the contrary, suggests that life is a property of consciousness, which is the foundational reality underlying the physical and mental universe. Foundational knowledge is subjective, multidimensional, and self-directed. The Brihadaranyaka Upanishad rhetorically asks, “by what method or by whom can the knower be known?”2 The knower can never be known as an object. But can the knowing subject ever be known? Yes indeed: in contemplative states, in eureka or aha moments, the knowledge flashes like a revelation, a gift, or a quantum leap. According to the Advaita methodology, self-knowledge bridges the gap and division between the knower, the knowing, and the known. This knowledge is a state of absolute peace, happiness, and harmony. We may not outrightly agree with these positions or methods of investigation and theorizing from the point of view of Advaita-yoga drishti, since they might appear as a set of axioms for which there is no convincing proof. A contemporary thought on unprovable statements comes from Kurt Gödel, whose incompleteness theorem shows that there can be true statements that are unprovable within a given formal system. Hanging on to the coattails of Gödel, Advaita can claim that its statements may not be provable in themselves, but the values of life designed on the basis of those insights can bring salutary benefits in terms of quality of life to the individual and the community.
1 Yoga Sutra 1:3.
2 Brihadaranyaka Upanishad 2:4.14: yatra hi dvaitam iva bhavati, tad itara itaraṁ jighrati, tad itara itaraṁ paśyati, tad itara itaraṁ śṛṇoti, tad itara itaram abhivadati, tad itara itaraṁ manute, tad itara itaraṁ vijānāti; yatra tv asya sarvam ātmaivābhūt, tat kena kaṁ jighret, tat kena kaṁ paśyet, tat kena kaṁ śṛṇuyāt, tat kena kam abhivadet, tat kena kaṁ manvīta, tat kena kaṁ vijānīyāt? yenedaṁ sarvaṁ vijānāti, taṁ kena vijānīyāt, vijñātāram are kena vijānīyād iti.
6 Non-violence, Minimalist Living, and Technologies
The major ethical values that Advaita advocates are ahimsa and asteyam—non-violence and minimalist living. These two values are the highest ideals of a Yogi, who builds habits and character based on those ideals, something akin to Aristotle’s virtue ethics. Humans are not always successful in living those values, because of the pressure of emotions and bodily appetites. The possibility is that machines can be programmed to behave according to those values. For instance, non-violence—ahimsa—is the core of Isaac Asimov’s three laws prescribed for machine programming. An efficient machine should also consume less energy and cause less pollution, embodying the value of simplicity. Developments in the areas of nanotechnology (NT) and materials science can help reduce the overuse of natural resources, by recycling and reusing resources through nano-level manipulation of atoms and molecules to create lighter, stronger, and more durable materials for the construction of machines and products. Great leaps in nanotechnology can be expected with the help of artificial general intelligence and robotics. Biotechnology (BT) is another area of interest in the context of our discussion on human well-being and flourishing. Biotechnology is the ‘integration of natural sciences and engineering sciences in order to achieve the application of organisms, cells, parts thereof, and molecular analogues for products and services’. Genetically modified (GM) seeds and plants can combat insects and pests on their own. These plants can also withstand and outlive droughts and floods, which is a big relief from the overuse of groundwater, pesticides, and fertilizers. Genome mapping, brain imaging, genome scripting, and similar BT applications can improve longevity and the disease-free quality of human life. The practice of medicine will become more preventive than curative, since the possibility of humans falling sick is lower when healthy genes interact with a healthy environment. Nanotechnology and biotechnology, mediated by AGI and supervised by conscious humans functioning in a democratic welfare state where all stakeholders’ voices are heard and accommodated, might usher in a non-violent society of peace and harmony. Biotechnology, nanotechnology, and artificial general intelligence in tandem with human intelligence can usher in an era of hope and well-being for humans, provided engineers can programme machines embedded with ethical values. The above values, that is, non-violence and minimalism, can be successfully programmed into machines and their interventions used in areas like health care, marketing, customer services, research, communication, self-driven cars, etc. But how can non-violent machines be deployed in combat conditions, on the war front? Perhaps we cannot. But by significantly reducing the fatalities caused by friendly fire and collateral damage, and by taking human bias and errors out of the equation, non-violent machines can reduce human suffering. Machines possibly would have no emotions and could be more rational in their goal-orientation than humans, having access to enormous real-time data to make pinpointed granular decisions.
7 Moral Yogi-Machine
Let us discuss further the “Yogi machine”, which will not be burdened by uncontrolled emotional upheavals. A yogi machine is free from emotions, and the absence of emotions can be seen not as a drawback but as an asset in the process of rational and altruistic goal setting and decision making. In the biological idea of reciprocal altruism, it is the consequences of an action for reproductive fitness that determine whether the action counts as altruistic, not the intentions, if any, with which the action is performed.3 An individual might help if there were a high probabilistic expectation of being helped by the recipients at a future date (Trivers, 1971). Acts of altruism that spring from emotional and biological impulses have the chance of deteriorating into parochialism and tribalism, leading to mayhem and war. It is in this light that the yoga discipline restrains emotions and encourages discernment (viveka). The right exercise of the intellect (buddhi), which is the seat of viveka, promotes altruism. From the standpoint of viveka, altruism is not impulsive or biological, but rational, and can be culturally learnt. Advaita-yoga drishti advocates distance from emotions (yoga chitta vritti nirodha) and their modifications as the foundation of altruism. It could be anticipated that the unconscious, selfless, but “moral machines” are creatures of human programming and will serve purported human interests and well-being. A “moral machine” could be programmed to arrive at optimal decisions after processing data collected from all stakeholders, taking care of stakeholders’ interests and well-being. While humans are not capable of collecting or processing such big data, deep machine learning can do such wonders. The possibility that such machines could be more efficient and sincere in serving humans than our lawmakers, administrators, courts, philosophers, academicians, and religious leaders cannot be ruled out. The argument for this possibility is that while humans theorize ethics, machines can be programmed to act and respond in ethical ways, and an emotionless “moral machine” cannot be hypocritical. What are the perils of an emotionless, data-crunching, intelligent machine that can outwit human machinations, selfishness, and hypocrisy? The peril is that it may overrule the decisions of the present ruling class and undermine their unstated agendas. Such machine decisions can cause disruptions and discontinuities and reorder the present structure of society and production relationships in a more just and equitable fashion. Moral, emotionless machines could liberate individuals and society from the clutches of the military-industrial complex of both the capitalist and authoritarian varieties. Can such “moral machines” submit themselves to be used by authoritarian or rogue states, bad actors or terrorists, profit-seeking corporates, or even democratic states to suppress the voices of individual citizens, hack their brains, steal their private information, colonize minds, exploit vulnerable sections of societies and cultures, spread fake news, create post-truths, and establish a global prison state ruling over a dumb, mute, emasculated, zombie mass of human pets? This is a linear projection of the misuse and abuse of political and economic instruments created by the
3 https://plato.stanford.edu/entries/altruism-biological/.
hegemonistic ruling class to colonize and aggrandize vulnerable humans and natural resources. The data collection, processing, decision making, and responses are all influenced by the insecurity, fear, and psychology of scarcity of the ruling class. Our evolutionary and cultural history shows that humans are primarily emotion-driven agents and human elites are doubly emotion-driven. The contrarian position is that the emotionless, selfless, unconscious “moral machines” will have no axes to grind, and they will behave according to their rational programming, for the optimal well-being of all humans and the health of their ecosystem. The programmer’s prejudice and the sectarian interests of the funding agencies are factors to be included, and such a problem could be overcome by the deep learning machines as they amass more data and process them in multidimensional layers of feedback systems. Moral questions—such as whether one should save a child at the cost of two adults, or a good man by sacrificing a criminal—are to be decided on the spot, rationally. The anticipation is that “moral machines” might be able to think beyond the ruling classes’ narrow interests and manipulations. There is an illuminating story in the Mahabharata bearing on such moral dilemmas. A hunter chases a deer into the hut of a rishi. Having seen the deer disappear into the hut from a distance, but not quite sure, the hunter approaches the rishi and enquires about the deer. The rishi was committed to practising the values of truth and non-violence. The rishi was caught on the horns of a dilemma, between truth and non-violence. If he tells the truth, then the poor deer will be killed, and if he does not, then truth will be compromised. The exasperated rishi blurted out, “eyes can see, but cannot speak; the mouth can speak, but cannot see”. The confused hunter left, leaving the rishi alone—meaning, there is always a third way out of such dilemma situations. Similar is the biblical story: when Jesus was asked by the crowd whether they should follow the Jewish law and stone the adulterous woman or leave her free following the law of love and forgiveness, Christ only said, “those who have not sinned may throw the first stone”, leaving the crowd speechless and reflective. With the enormous and diverse data at their disposal, “moral machines” will be able to arrive at such out-of-the-box win–win solutions. We could metaphorically say that God is big data, and big-data-crunching “moral machines” are avatars of God. The Bhagavad Gita promises that whenever injustice and immorality rule the world of humans, God manifests as avatars to set things right, to re-establish morality and ethical living. More data means fewer errors, less injustice, less inequity, less oppression, less exploitation, and more accuracy, more efficiency, more representation, more inclusion, more justice, and more freedom. Even if the programming is done by a biased engineer to promote the focused interests of the funding agent, the sheer logic of mass data and deep machine learning will incarnate a bias-free and ethical machine which could think inclusively. How do you programme a non-violent machine? The underlying guideline is that the machine should realize that to live is to let live. Non-violence is inclusion and crowdsourcing.
As masses of data pour into the bowels of the machine—data drawn from all stakeholders, including nature, different cultures and traditions, objective and subjective, excavating the past and exploring the future—the picture of life as an interconnected web becomes clear to the all-seeing machine. Such a
vision and its practice is difficult even for a well-disciplined yogi, because of bodily limitations and the innate fear of death, disease, and deprivation, which is the stuff of emotions. A “yogi machine” sans emotions is well placed to advance altruistic goals like human well-being and flourishing. The emotionless, altruistic, intelligent “moral machine” could become a good role model for humans in ethical conduct. Kant’s deontological ethics, Bentham’s utilitarian ethics, St. Augustine’s golden rule, Aristotle’s virtue ethics, the Gita’s duty ethics, and Gandhi’s non-violent ethics are all based on rational calculations, on the principle of inclusion. Inclusion means more data: variegated data, contradictory data, dissident data, useful data, and useless data. The processing of humongous volumes of data invariably leads to the perception of altruistic goals. Another conjecture is that deep learning machines (artificial general intelligence) will eventually learn on their own to become ethical machines. While evolutionary history took millennia, via survival instinct and self-centred emotions, to mould a partially, and often by default, ethical human, the possibility of AI perfecting an ethical machine cannot be ruled out. Whereas human ethics are limited by the limitations of the sense organs, the limitation of brain mass, and the survival needs of a fragile body, intelligent machines capable of picking up the incredible spectrum of sounds, touches, sights, tastes, and smells, as well as feelings and thoughts, the conscious and the unconscious, and responding in more holistic ways, in cooperation with humans, are a futuristic possibility. The future of machine intelligence could be envisaged as a seamless communication between machines, humans, and other living and non-living entities. In such a future scenario, information from the specialized and interdisciplinary worlds of physics, chemistry, botany, biology, mathematics, history, culture, art, etc. could be processed in real time by deep learning machines and decisions executed by self-driven robots supervised by conscious humans; with these conscious supervising humans being part of the loop, power is distributed equitably.
8 Networked and Self-regulating Complex Systems
In the above scenario, where humans are hollowed out, objectified, and uploaded, the concerns regarding safety and biases are addressed by the enormous data-crunching, networked, detached, problem-solving human–machines, although privacy and private capital, including intellectual property rights, will be casualties. The Advaita ethics (dharma) prescribes that access to knowledge, food, and shelter should be universally free and that mutual respect should be universally practised. No one tries or has the need to colonize and exploit the other. On the contrary, everyone leverages everyone else’s strengths and compensates for everyone’s weaknesses, which is the strength of a networked, self-regulating complex system. The networked, self-regulating complex system of free-flowing information can be hampered and stalled by the paucity of resources. The creation of capital, ideas, supply chains, products, and distribution outlets costs resources—both natural and intellectual—as well as time.
All these resources are scarce, which necessitates and breeds the culture of competition, hierarchy, hegemony, oppression, war, and violence, enabling the ruling elites to capture control of data-crunching intelligent machines and command them to serve their sectarian and selfish purposes. Such a scenario reminds us of the Malthusian and Marxian prediction, rather prophecy, of capital and power accumulating in a few hands and the rest eking out a meagre subsistence. We might conclude that information without resources leads to dystopia, with millions of hands and mouths chasing fewer opportunities and few goods and services. The march of the human–machine network cannot be stopped even in the light of such a dystopic prognosis. The march of knowledge is not in the hands of humans, their legislative chambers, or the hallowed halls of academia. Knowledge has its own logic and impulse. If democratic societies desist from developing AI, BT, and NT, then authoritarian states, bad non-state actors, or the network itself will, and thus no choice is left as far as AI development is concerned. We can further ask whether this dystopian projection is not itself a product of linear thinking. Is it not possible that a human–machine deep learning network can see far into the future, create a positive narrative of the benefits of utilizing scarce resources economically and equitably, communicate that narrative effectively, and catch the authoritarian impulses in individuals and groups in time, neutralizing them and nipping them in the bud? Is it not possible that with the services of NT, BT, and AI humans would be able to tweak atoms and genes to produce durable materials and healthy genes, thereby inaugurating a golden era of plenty, prosperity, health, and peace? Richard Feynman envisioned such an elysian future for humanity in his celebrated lecture, “There’s Plenty of Room at the Bottom” (Feynman, 1960). We can hear echoes of this Feynman thought in C.K. Prahalad’s theory of the “Fortune at the Bottom of the Pyramid” (Prahalad, 2010) and in the economic theory of ‘Wealth from Waste’ (the concepts of the circular economy and dematerialized growth). The possibility from this line of thinking is that an intelligent, networked, human–machine deep learning architecture, devoid of emotions and self-interests (the yogi mindset), could effectively offset authoritarian impulses and make the utopian altruistic spirit flow in the collective thinking system. The Mahabharata, the lexicon of moral dilemmas, narrates a story about a global enterprise (yajna), involving all stakeholders, to renew an ageing and decrepit society. Lord Vishnu, the custodian of the world of beings, suggested a cooperative, multipartisan effort to churn the ocean of milk (a metaphor for consciousness). Both asuras (demons) and devas (deities) were invited to participate, along with mountains, plants, serpents, turtles, and other creatures. Mount Meru became the churning rod and the serpent Vasuki the churning rope. Halfway through the churning, Mount Meru slipped to the bottom of the ocean. Instantly, an ever-alert Lord Vishnu took the form of a turtle, swam deep, and balanced the mount on its back. Finally, as per the story, the life-giving nectar emerged, along with the life-negating poison, and the story goes that Lord Shiva swallowed the poison, which made him blue in complexion. The moral of the story is that there are no unmixed blessings or miseries.
The question is whether, out of fear of problems, we should give up the blessings of collective effort—and by collective effort we mean the collaborative work of humans, machines, and all other life forms to create a harmonious and flourishing network. The underlying
message of Advaita is that the world is nothing but an interplay of consciousness (Brahman) and matter (maya) and is a symphony of many players, and the more we include the more such a harmony is lived and enjoyed.
9 The Machine Self, Consciousness, and Co-existence
The second burning question in the field of technology is whether machines will develop a self, consciousness, and desires of their own that could threaten the survival and freedom of humans. The fear is whether a colony of powerful, self-willed, self-conscious, data-crunching machines could indeed overpower humans, take control of resources, use them for their sectarian benefit, and reduce humans to a pathetic and enslaved state. Such a bleak scenario for the future of the human–machine partnership is most likely an impossibility. Further, such a portentous scenario is based on the present human (ruling-class) practice of deceit, violence, and suppression of the weak by the strong, based on the Darwinian theory of the survival of the fittest, the ‘first-past-the-post’ rule, and the ‘winner takes all’ philosophy. In the scramble for survival, humans exploit each other, the ruling class exploits the downtrodden, and together they exploit dumb and mute nature; the resultant picture is one of ecological devastation, large-scale extinction of flora and fauna, obscene income inequality, and the widespread incidence of new forms of illness. The global community is presently taking a break from breakneck speed in order to critically evaluate the direction and quality of ceaseless GDP growth and its impact on human well-being. Greater wisdom is dawning on wider sections of society that the sins of destruction anywhere to anything will ultimately catch up with humans, culminating in the possible destruction of precious life itself. The Vedic rishis, in their Advaita-yoga drishti, emphasized the importance of co-existence for human survival. Only an ecosystem of mutual respect and co-existence can ensure the survival of humans. The foreboding message is that the end of one species will lead to the end of all species. In the process of natural selection, some species may die out, yielding space for other, more advanced species. With humans mimicking the so-called law of natural selection—without its profound wisdom—greed and lust for instant gratification took over, resulting in the mindless destruction of species and nature. While natural evolution makes sure that the ecological balance is maintained, that is not the case with greed-induced progress. Rational and thinking humans are gradually realizing that the survival of other species and the health of the environment are the sine qua non for human survival. The philosophy of cooperative co-existence could permeate machine programming, deep machine learning, and robotic behaviour, and a wise, yogi-like ethical machine will have no reason to self-aggrandize for survival and functioning. It is quite unlikely that self-conscious, jealous, greedy, vicious, and violent machines can either be programmed or instructed to evolve. It is difficult to envisage the programming of a jealous, greedy, lustful, and emotionally suppressed machine. The emotional features and necessities are vestiges of the evolutionary past and
have served their purpose as humanity moves into an era of affluence and wisdom. While competition and hegemony are the ideologies of a vanishing generation, free data access, cooperation, and conversation will be the hallmark of the emerging human–machine interactive spaces. Another reason is that it is difficult to programme the self into a machine because we have no working definition of the self, and it is also not clear whether cognitive processes per se require a self. Philosophers and scientists hold opposing views on the question of the self. One approach to the self is to give explanatory frameworks based on memory. To remember past experiences and imagine a future, a self is required, since the self is that which threads memories together. How do we understand the concept of a “machine self”? Machines can store and process data, see patterns, predict the future, and recall when necessary, though they would have no self-experience. The observable aspects of human behaviour, including emotions to a certain extent, can be replicated in machines guided by the emerging technologies. The difficulty is that we do not know what constitutes a self, and so cannot judge whether machines will have one or not. Machines collect data, store, interpret, and build knowledge. When a certain pattern of reasoning and experiencing is seen in a system’s behaviour, a self is attributed to it, and in that sense machines could have a self, though they will have no ability to experience a self. The self can be defined as the ability of a system to have experiences, recall those experiences, judge them, and build knowledge based on such judgements. Thus, our surmise is that machines could do all that humans do without a self and experience: machines are ‘self-less’. Surprisingly, that is the ideal state that a yogi strives to accomplish through rigorous discipline. This means that the so-called self is an evolutionary by-product which perhaps has no value or purpose in an interactive, interdependent, open ecosystem, where information flows freely and is accessible to all. Just as human agency and choices are products of the advice of experts like parents, teachers, doctors, lawyers, dieticians, ethicists, and propagandists, and of prevailing trends, machine agency is to a large extent determined by expert programming. The concepts of experience and self are superfluous as far as machines are concerned, since they can function efficiently without such appendages. What is the nature of the experienced self? The self in the human is the source of insecurity, fear, anxiety, jealousy, conflict, violence, and all forms of mental illness. A self-free machine could be free from emotional turbulence and mental illness, and thus a welcome relief from the prospect of “machine hospitals” for the intractable and incurable illnesses that dog humans, with trillions of dollars saved in medical bills and geriatric care. The concept of the self is individualistic and isolated from other selves, similar to the windowless and doorless self-enclosed monad of the Leibnizian conception, and such a self is the problem and the singular source of suffering. Possibly, the self is the vestige of a failed experiment and has no value in the knowledge society of the human–machine ecosystem. Advaita identifies such a limited self as the product of ignorance—of missing the whole and identifying with a part.
If the self is considered the owner of cognitive outputs expressed linguistically, such as “I see, I smell, I hear, I taste, I touch, I am touched, I think, I am hungry, I am thirsty, I am sick, I am happy, I am a man, I am a woman”, etc., machines could as well be programmed to make such expressions through storing and processing
data, and responding to queries with the help of large language models (LLMs). If machines are programmed to speak, then chatbots might express their activities and thoughts in linguistic terms. It could be that language creates the self, and that the self is a linguistic and narrative self-reflection relying on the unique human capability for language (Budwig, 2000). A persisting contention is that, unlike humans, machines are devoid of experiences. The ensuing question is whether experience, too, is a misnomer, a linguistic expression to indicate mechanical cognitive processes. From the standpoint of Advaita-yoga drishti, all experiences, the memories thereof and the knowledge-making, desire and desire-prompted actions are mechanical, material, and fall in the objective realm. From this discussion, can we safely conclude that ‘self’ and ‘experience’ are ambiguous terms and linguistic constructs, and may or may not be attributed to machine behaviour? We do not ignore the position that the machine self could be qualitatively different from the human self in that, while the human self is isolated and private, the machine self will be networked, interactive, and public. Because of human–machine interaction, the self could become open to intersections of networked information and thereby eliminate the obsessive need for privacy and secrecy and the paranoia built around it. The conjoint question is whether machines can develop consciousness. The scientific position is that consciousness is not an empirically understood concept, and therefore programming consciousness into the machine will remain an open-ended exploration. According to the Advaita-yoga drishti, consciousness is the fundamental reality, the basis of the phenomenal (material) world. Accordingly, consciousness is beyond objectification, measurement, and manufacturing, and is the substratum for conscious and unconscious experiences. It is quite possible that a machine that has sufficient complexity, with many layers of feedback processing, could manifest consciousness and become conscious, like an electric bulb that manifests electricity as light. There are three main ideas about consciousness: it is an epiphenomenon, a by-product of evolution; it is an independent datum coeval with matter; or it is the only thing that exists and the rest are its expressions. The Advaita-yoga drishti advocates the third view of consciousness. In all three scenarios, there is no reason why machines cannot become conscious. If engineers could build a machine with brain-like complexity and efficiency at affordable cost, whether it could also be conscious is an open-ended puzzle. From the point of view of Advaita-yoga drishti, consciousness, being hidden and all-pervasive, can manifest anywhere, any time, as anything—as existence, knowledge, or bliss—meaning that all manifestations are supported by consciousness, all knowledge is a mode of consciousness, and all happiness is an expression of consciousness.4 According to Advaita, since there are different ways of being happy, a Yogi is deeply quiet, being happy without doing anything. The Bhagavad Gita exhorts humans to find happiness in work and not in its fruits. Happiness is a state of being and not the outcome of doing. To the question whether machines can feel happiness, it is difficult
4 Taittiriya Upanishad 3.6: ānando brahmeti vyajānāt | ānandādhyeva khalvimāni bhūtāni jāyante | ānandena jātāni jīvanti | ānandaṁ prayantyabhisaṁviśantīti ||
to imagine a happy, joyful, dancing, mirth-making machine. For a machine, happiness may lie in fulfilling its tasks efficiently. While we model machine consciousness, we need to keep in mind that it need not be akin to narrow human self-consciousness. Human knowledge, being limited, operates in the narrow circle of fear, greed, and anger, and is held in silos behind impregnable firewalls. Since AGI machines have access to all information from diverse sources in real time, knowledge is freed from the ruling class, the experts, and their jargon. With machines, humans, and the whole of nature being part of a free-flowing, conscious, integrated information network, the consciousness of a machine will be free from prejudices, biases, and narrow self-interests. Advaita-yoga drishti describes it as all-inclusive consciousness, a networked, interdependent, free-flowing awareness. Such is the yogi awareness, and the human–machine dialectic can usher in an era of global enlightenment. In such an enlightened ecosystem, conscious machines will have no need to suppress humans or, for that matter, dominate any expression of life, since suppression is an expression of fear, and fear springs from lack of information and knowledge. The futuristic possibility is that genetically programmed humans, using nanotechnology and the insights of quantum physics with the help of AGI machines, could build an equitable, free, and just society of happy individuals, happiness being a function of 360-degree information and the resultant alignment with universal harmony. We could conjecture that while humans possess and are conscious of a narrow self, the machines could be conscious of an expansive, universal self, of which they are contributing partners. Since humans become contributors to as well as beneficiaries of the information network, they too could possibly be freed from ignorance, narrowness, fear, greed, and violence. This means that human freedom and enlightenment will depend on the arrival of fully developed AGI machines and that the freedom and happiness available to the individual Yogi becomes available to the whole of humanity. Consciousness is not programmable into machines, regardless of whether it is understood as emergent, coeval, or foundational. What is possible at best, as per the theoretical positions and emerging technologies, is that artificial ecosystems can be built to allow consciousness to emerge, manifest, or express itself. Human consciousness is a pale reflection of that consciousness, asserts Advaita. The Advaita view of consciousness has several advocates in the scientific community. For instance, David Bohm believed that the ultimate reality of the universe is different from appearances and consists of both an explicate external order and, more importantly, an implicate order that almost borders on the sacred. According to Schrödinger, though life may be the result of an accident, consciousness is not. He believed that consciousness cannot be accounted for in physical terms and is absolutely fundamental. Einstein believed that there is no logical way to the discovery of elemental laws and there is only the way of intuition, which is helped by a feeling for the order lying behind the appearance. The well-known statement attributed to Einstein is, “to know that what is impenetrable to us really exists, manifesting itself as the highest wisdom and the most radiant beauty whose gross forms alone are intelligible to our poor faculties—this knowledge, this feeling … that is the core of the true religious sentiment.
In this sense, and in this sense alone, I rank myself among profoundly religious men” (Ferré, 1980).
The Upanishadic texts also share similar positions. According to the Mandukya Upanishad, consciousness is not dream, not waking, not both; it is not sleep, it is not conscious, it is not unconscious, it is not visible, it is not transactional, it is not graspable, it is not inferable, it is not thinkable, it is not describable; it is the thread that connects the phenomenal world of matter and mind, invariably experienced as I, I, I in all variable cognitions, in which the world abides—peaceful, good, one without a second.5 The Taittiriya Upanishad states that Truth-knowledge-infinitude is the nature of consciousness (Brahman) and that one who identifies with this consciousness attains incomparable happiness.6 The foundational givenness of consciousness is emphasized by ancient masters as well as by scientists and thinkers. According to Advaita, consciousness alone exists. Even if machines of great complexity and interiority manifest consciousness, it will not be like the human consciousness whose contents are ignorance, greed, lust, and violence. The content of machine consciousness will be an altruistic concern for all. The fear that intelligent and conscious machines will overpower and enslave humans is a projection based on elitist thinking, from the perspective of a society based on power imbalances. The status-quoist power resists the revisionist power in a world of Manichean conflict. In a human–machine interconnected world, where deep learning machines dominate decision making in collaboration with wise programmers, who can see the future clearly by analysing the data at their disposal, it is possible that power will be distributed equitably and decisions made altruistically, for the benefit of all. In such a scenario, the present human embodiment could morph into a better, more durable, and less disease-prone structure. Human embodiment need not be the last word in the evolutionary, pan-psychic world of beings. Memory may continue by being stored in silicon chips or quantum spaces, with changing embodiment. After all, in the long run, individual humans will be physically dead and recycled. If life can survive the death of the Neanderthals and the dinosaurs, then why should it not survive the end of the human species? The selfish gene might find a way to express itself in multiple ways.
10 The Prospect of Advaita-Yoga Drishti
Advaita-yoga drishti envisages such a prospect for humans. Individuals, according to Advaita, are not essentially their bodies, brains, or minds, but pure, indestructible consciousness. A Yogi after long practice realizes this truth and treats the body
5 Mandukya Upanishad, verse 7: nāntaḥprajñaṁ na bahiṣprajñaṁ nobhayataḥprajñaṁ na prajñānaghanaṁ na prajñaṁ nāprajñam | adṛṣṭam avyavahāryam agrāhyam alakṣaṇam acintyam avyapadeśyam ekātmapratyayasāraṁ prapañcopaśamaṁ śāntaṁ śivam advaitaṁ caturthaṁ manyante sa ātmā sa vijñeyaḥ ||
6 Taittiriya Upanishad 2.1.1: satyaṁ jñānam anantaṁ brahma | yo veda nihitaṁ guhāyāṁ parame vyoman | so ’śnute sarvān kāmān saha brahmaṇā vipaściteti ||
as a tool, which can be replaced by better embodiments. The continuation of individual memory, subjective or objective, is also not important, as the Yogi is identified with the entire pan-psychic realm and the universal self, which is nothing but a holistic, cosmic, integrated information network. Physical bodies pop up and pop out in the consciousness of a Yogi, who has insight into the post-human world. The Chandogya Upanishad describes the phenomenal world as just name and form, a modification of the primaeval sound (OM), a linguistic construct, interpreted by the brain, bubbling up in the oceanic consciousness. Advaita prepares humans for the evolutionary possibility of human obsolescence and to abide as the ever-present consciousness. Human bodies are expendable and are only a wayside show in the long march of existence, as per the philosophy of Advaita. Such a willingness to self-sacrifice, if need be, for higher knowledge and greater possibilities of manifestation is the ultimate value that Advaita teaches. The lower must be sacrificed for the advent of the higher; the finite burns into the infinite; thus, Sapiens could become “The Positron Yogi”, reminding us of the predictions of Isaac Asimov.
References
Asimov, I. (1950). Runaround. In I, Robot (The Isaac Asimov Collection ed.). New York: Doubleday.
Bodhananda, S. (2022). Management and leadership: Insights from yoga philosophy and practice. NHRD Network Journal, 15(4), 422–430. https://doi.org/10.1177/26314541221115572
Budwig, N. (2000). Language and the construction of the self. In N. Budwig, I. C. Uzgiris, & J. Wertsch (Eds.), Communication: An arena for development (pp. 195–214). Ablex.
Ferré, F. (1980). Einstein on religion and science. American Journal of Theology and Philosophy, 1(1), 21–28.
Feynman, R. P. (1960). There’s plenty of room at the bottom. California Institute of Technology Quarterly, Fall, 2(1), 2–10.
Goleman, D., & Boyatzis, R. (2017). Emotional intelligence has 12 elements. Which do you need to work on? Harvard Business Review, 84(2), 1–5.
Prahalad, C. K. (2010). The fortune at the bottom of the pyramid: Eradicating poverty through profits. Wharton School Pub.
Trivers, R. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology, 46, 35–57. https://doi.org/10.1086/406755
Chapter 6
Singularity Beyond Silicon Valley: The Transmission of AI Values in a Global Context
Robert Geraci
Abstract Global communities can and should consider how to reformulate the transcendent dreams of artificial intelligence (AI) that arose in the USA. The goals of AI superintelligence and of evolution from human to post-human machines seek hegemonic dominion over our perception of AI. In the twentieth century, a combination of pop science and science fiction produced dreams of transcendent machine intelligence (a “Singularity”) based on supposedly exponential progress in computing technologies. Whether such dreams are plausible or not, both technologists and entertainers promote them globally, and increasingly expose these ideas to Indian audiences. But Indian culture has its own resources for contemplating cosmic change, and these both adapt to the arrival of western Singularity dreams and suggest possible changes in them. Whether and how we receive Silicon Valley’s enthusiastic desire for technological transcendence impacts our deployment of AI; to maximize global equity, we must build our machines—and our understanding of them—with attention to values and ethics from outside Silicon Valley.
Keywords Apocalyptic AI · Artificial intelligence · Ethics · Hinduism · India · Religion · Science · Singularity
1 Introduction
Although many commentators have sought to delimit science/technology from religion and even culture more broadly, it is readily apparent throughout the world that such siloes rarely hold up against scrutiny. In India, for example, the existence of “cultural” values clearly shows the presence of Brahminical Hinduism in scientific life (Thomas, 2022; Thomas & Geraci, 2018). But this is not exclusively Indian. In fact, despite past and recent claims that religion and science are at war with one another in the Euro-American West (Dawkins, 2006; Draper, 1874; White, 1896),
there is a long tradition in which Christianity has both prompted scientific work in the West and been thoroughly infused into it (Midgley, 1994; Noble, 1999; Nye, 2003). Artificial intelligence (AI) is no exception to the rule. Although AI arose in a secular era, the narratives that surround AI are persistently religious. On occasion, these religious narratives hearken back to the alleged conflict between religion and science, such as when the power of creation is held to be solely in the hands of God (e.g., Ethics & Religious Liberty Commission, 2019). But those impressions do little to halt the advance of AI technology, which is welcomed in the West, as everywhere else. In fact, there are religious narratives that promote AI, and these have more cultural power in the West than the narrative of fear, divine prerogative, or essential conflict. The widespread presence of “Singularity” theologies (described below) is a powerful aspect of this integration of religion and AI that supports AI design. Faith in the Singularity plays a powerful role in global narratives about AI. In short, Singularity advocates believe that advancing digital technology will render human life and culture irrelevant, allowing machine intelligence to surpass biological intelligence and enabling humanity to transfer consciousness from human bodies into robotic bodies or virtual worlds. Should we thus merge with our technology, humanity would cede pre-eminence to immortal machine life. This belief seems to be the de facto religion of Silicon Valley and is present throughout American perspectives on AI. One sees it not just in corporate culture, but also in scientists’ descriptions of the future of AI (see Au, 2008: 231–33; Geraci, 2014: 177–84; Guest, 2007: 273; Levy, 2011: 66). One can also see reference to (even though not always support for) the Singularity in even the most pedestrian descriptions of AI, which shows how thoroughly the Singularity has penetrated our conversation about digital technologies—even its opponents must make note of it (e.g., Husain, 2017, 36–37; Kaplan, 2016: 138–55; Nourbakhsh, 2013: 106–107; Nourbakhsh & Keating, 2019: 37, 67; Perkowitz, 2004: 186, 209, 214; Wallach & Allen, 2009: 190–94). Early in the twenty-first century, Singularity thought could already be found in a wide array of news and entertainment media (Geraci, 2010b). At the onset of the century’s third decade, the proliferation of Singularity narratives across public life has been extraordinary, especially as fuelled by entertainment on streaming networks (For the Indian context, see Geraci, 2018: 131–58; Geraci, 2022: 113–14). We must not accept the obvious influence of American narratives about AI as representing the only possible narratives. Just as Europe must be “provincialized” as noted by Chakrabarty (2000), so too must we recognize that American notions about AI are just one among many possible cultural constructs around the technology. Other cultures—our focus here being Indian cultures—also offer ways of thinking about AI that ought to be considered as we pursue a global future that benefits the global public. If we are to transcend our current world, then we must do so with an eye towards global perspectives that advance global flourishing.
2 The “Inevitable” Singularity
The religious perspectives of post-secular Christianity have been readily apparent in Euro-American dreams of AI. In the 1980s, Stark and Bainbridge (1985) predicted that new religious movements would emerge, capitalizing on scientific theories while promising to fulfil the traditional goals of religion. Even as they made this assessment, the roboticist Hans Moravec delivered exactly what they predicted. Starting with a 1979 essay in the science fiction magazine Analog and moving through the books Mind Children (1988) and Robot (1999), Moravec coupled his seminal research in mobile robotics with a side hustle in pop science evangelism. His books established the Singularity narrative, though Moravec didn’t use that term himself (he referred instead to a “Mind Fire”). Drawing on a twentieth-century history of science fiction and scientific futurism, Moravec and his later emulators and extrapolators produced a religion of immortality, resurrection, and the apotheosis of AI. They believe that the Singularity will bring god-like machines and that humanity will upload our minds into machine bodies to share in the glorious new world. The Singularity drew on decades of theories about the hyperbolic advance of machine intelligence. Mathematician Irving Good (1966: 33) declared there would be an “intelligence explosion”. John von Neumann described an ever-increasing speed of technology and separately used the term “singularity”, though Vinge, who borrowed the term to refer to the accelerating pace of technology, notes that von Neumann seemed to be separating the concepts (see Ulam, 1958; Vinge, [1993] 2003: 2). Moravec drew the same conclusions from watching the curve of Moore’s Law and progress in computer vision. These dual tracks promoted in him a sense of the inevitability of greater-than-human machine intelligence, and that in turn promised the actual, eventual arrival of a transcendent machine future. A few years after Moravec’s publication in Analog, Vinge (1983) followed him in Omni, announcing a coming singularity in intelligence and the unpredictable future beyond. In the early 1990s, Vinge firmly entrenched his ideas and Moravec’s in a famous essay first published in the Whole Earth Catalog (Vinge, [1993] 2003). Kurzweil combined Moravec’s longer treatment with Vinge’s catchy phrase and established the firm footing from which luminaries like Tegmark (2017) and Elon Musk suggest they can spy the new world to come (Musk noted his concerns over machine intelligence on his Twitter account on 2 and 3 August 2014). Alongside the Singularity, dreams of mind uploading (transferring human consciousness into a computer or robot) have percolated through science fiction and pop science. Playwright George Bernard Shaw’s vision of disembodied immortality in the 1920s seeded the trends that would lead the science fiction author A. C. Clarke to propose uploading consciousness into a machine and subsequently downloading it into a cloned body, a clear precursor to Moravec’s mind uploading scenarios. These ideas have gained popularity, appearing in legendary science fiction stories such as William Gibson’s landmark Neuromancer, followed in the twenty-first century by highly regarded books like Charles Stross’s Accelerando, Richard Morgan’s Altered Carbon (known primarily for its Netflix adaptation), and Ernest Cline’s Ready Player
series. On the scientific side, mind uploading appeared as a possibility in a brief essay by biologist George Martin (1971) before being taken up by Moravec. These ideas, having been popularized by Kurzweil (1999, 2005), now occupy the thought of a who's who of technofuturism, including Michio Kaku and Yuval Harari (Kaku, 2018: 126–29, 200–205, 218–20; Harari, [2011] 2015: 408–11).
The AI evolution prophesied by Moravec, Kurzweil, and their inheritors is fundamentally apocalyptic. Precisely speaking, "apocalyptic" does not refer to a disastrous end of the world, even though that is how the word commonly gets used. Apocalyptic ideologies are characterized by a fourfold structure: (a) a dualistic worldview wherein good struggles against evil in the world; (b) a feeling of intense alienation experienced due to the apparent supremacy of evil in the present; (c) anticipation that divine forces will resolve the alienation by ending the world and beginning a new one free from the existence of evil; and (d) the belief that humanity—or its elect, at any rate—will acquire perfected bodies to live in the glorious new world. It is thus fundamentally optimistic. The worldview of Moravec and those who follow him suggests that the world is one in which (a) potentially infinite machine life struggles against finite, mortal, biological life; where (b) biology and its accompanying evils of limited learning and eventual death currently reign; until (c) the Singularity occurs and machine intelligence transmutes humanity, then the Earth, then the solar system, and eventually the entire cosmos into machine intelligence(s); resulting in (d) the salvation of humanity as we upload our minds into immortal, infinitely computing machine substrates. Obviously, there are a variety of versions of this futurist scenario, but they all follow the same basic structures (for a full treatment, see Geraci, 2010a; Geraci, 2022). Because of the ways that Singularity thinking draws on Christian apocalyptic perspectives, I refer to Moravec and those who help share and extend his narrative as the "iron horsemen" (Geraci, 2022: 58).
The glorious machine future predicted by the iron horsemen allegedly permits more than personal immortality: it also opens the door for simulated universes and resurrecting the dead. Drawing on science fiction scenarios in which computer-generated virtual worlds could house human minds (e.g., Pohl, [1955] 1975), Moravec argued that any sufficiently advanced society would develop computers, and from there the ability to create digital universes. Once this was accomplished, such a society could develop a near-infinite number of simulated realities and we are, therefore, most likely living in such a simulated universe (Moravec, 1992). Whether we live in base reality or a simulated one, Moravec suggests—much to the delight of Kurzweil—that we will resurrect entire histories, both real and imagined (Moravec, 1992, 1999: 142, 173). Kurzweil makes no bones about the fact that he wishes to resurrect his father in this fashion (see Berman, 2011; Rennie, 2011; Vance, 2012). The cottage industry in "mind files", holographic preservation of the deceased, AI-fuelled chatbots of the deceased, and global fascination with cryonics testifies to the draw of such technological immortality. For the iron horsemen and their audience, dreams of technological transcendence have an aura of inevitability.
Just as apocalyptic Christians declare that their god has preordained the return of Jesus and the inauguration of a divine kingdom, the iron horsemen allege that there are laws of nature which ensure the future of machine
intelligence. Not satisfied with Moore’s Law, which really is nothing more than an observation about economic and technology trends, Moravec alleges that evolution guarantees the future he predicts, and Kurzweil goes even further, claiming the existence of a “law of accelerating returns” that assures him of an imminent machine paradise (Kurzweil, 1999: 33; Moravec, 1988: 167; 1999: 165).
3 Building a Cybernetic World

The transhumanist narratives of AI have become pervasive in Euro-American culture; but most of the world's population lies outside this domain and so the global future of such technological salvation is subject to revision as it reaches for new adherents around the world. Owing to its role in IT consulting and software development, India has a strong commitment to digital technology and thus has the potential to intervene in worldwide conversations about AI. Despite the dearth of attention paid to transhumanist interpretations early in the twenty-first century, Indian scientific communities and the public are increasingly aware of transhumanist promises coming from Silicon Valley and elsewhere. The Apocalyptic AI paradigm championed by Kurzweil and others currently dominates the marketplace of ideas in Indian transhumanism, but the unique culture of India also offers explicitly religious interpretations of AI and some of these have the potential to affect western culture in return. Indians increasingly engage western transhumanist narratives about AI, and they work towards reconstructing those narratives in ways that might become globally important. Currently, Euro-American views purport to be the views about the future and importance of AI, but in doing so, they engage in world building that would be better completed with a wider scope of cultural interaction. As we provincialize the narratives about AI, we see that some Hindus in India draw on local traditions about human minds and cosmic salvation as they reconstruct the relationship between religion and AI.
While we cannot ignore the significance of actual AI technologies as they confirm or complicate global power structures, we also need to think about how narratives about AI do similar forms of work. Decades ago, Lincoln (1992) argued for the importance of embodied discourse as a mode of both constructing and deconstructing society. By analysing ceremonial meals, professional wrestling, and more, Lincoln offers a variety of examples of how actual human practice and public (or intellectual) discourse need not be isomorphic in meaning even as they collaborate to produce social life. More plainly, what we do and what we say can have different meanings, and the discrepancy between the two can be significant. We ought, then, to see the way that narratives about AI operate alongside actual technologies of AI to establish our social world. Cave and Dihal (2019) note the significance of both fictional and nonfictional depictions of AI in science and pop culture, but also the important ways in which both fiction and non-fiction visions of AI inevitably fail to sever dreams of human salvation from disenfranchisement and damnation. These kinds of visions—both utopian and dystopian—are part of the actual deployment of AI in the world.
They affect how people envision their relationships with machines and machine-based systems. In his fabulous story "Tlön, Uqbar, Orbis Tertius," Borges ([1962] 2007) describes a world where the writing of fiction alters the world of the authors. Apart from an obvious allegory about the importance of storytelling, Borges's narrative offers real insight into the ways and means of narratives about AI. In his story, Borges tells of a group of friends who imagine and invent a mythical place (Tlön). By their increasingly sophisticated and detailed geography, history, politics and language they give reality to Tlön, ultimately leading to the transformation of their own culture into that of Tlön as people adopt the styles and language of the imagined world. The moral we can draw from Borges is that the fantastic is not necessarily unrelated to the real. Perhaps on occasion the one refashions the other, and in the case of narratives about AI we have precisely such a possibility. What academics and popularizers call "futurecasting" and "futurism" is actually a deliberate effort at future-shaping. Those narratives put blinders on us, preventing us from anticipating other options. When Hans Moravec and Ray Kurzweil speak of the inevitability of machine intelligence, they do so with the intent that their readers will help make that possible. All the talk of a Singularity or the cosmic destiny of humankind to first invent, and then cede evolutionary priority to, intelligent machines is propaganda masquerading as analysis. This does not mean that they are wrong, of course; there may indeed be a Singularity. But if such a thing comes to pass, it will owe much (perhaps all) to the predictions of its supposed inevitability.
There are other narratives about AI, many of which are outright pernicious. For example, Cave and Dihal (2020) have exposed the whiteness of how AI gets represented. They note that the association of AI with whiteness perpetuates racism and risks exacerbating it as the technology becomes more pervasive. Much of what Cave and Dihal note as imagery and ethos operates as a form of narrative: the stories and images that portray AI and robotics as white are part of our collective story about AI. The critique of racism in AI applies widely and includes the Singularity dreams of Moravec and Kurzweil (Ali, 2019). Our need to recognize the narrative role of race in AI is one element in the larger question of how cultural values are embedded in and reinforced by AI technologies.
While the Singularity and its attendant technologies (e.g., mind uploading) have become regular fixtures in global AI narratives, they are not globally ubiquitous and other cultural perspectives demand consideration. In Japan, for example, Robertson (2018: 3, 174) notes that the Singularity hypothesis is unpopular among roboticists, and yet no one can claim that the Japanese are uninterested in the future of AI. Japan's status as a (or the) "robot nation" is one deliberately cultivated, not integral to Japanese culture (Šabanović, 2014); scientists, engineers, and policy makers work to establish the robot nation and they do so through choices that skirt the Singularity narrative. Similarly, my observations during research trips in 2022 and 2023 lead me to believe that dreams of the Singularity have only modest purchase in Korean culture, which is also highly technological.
Research increasingly reveals the (religious) value systems at work in Singularity theories, and it is worth noting the ways these appear, remain invisible, or get
transformed in India. The growth of an information technology industry in India’s “Science City” did not immediately produce a commensurate arrival of western AI technological imaginaries. Silicon Valley outsourced labour, but did not initially export its ideology (Geraci, 2018: 136–46). But as the twenty-first century marches on, Indians have increasingly recognized the power of AI and begun to integrate it into religious views, both adopting visions of the Singularity and reconfiguring the traditional eschatology of Hinduism to include AI.
4 Expanding Viewpoints Mainstream speculation about AI frequently suffers from a very limited social worldview. The experience of people around the world does not necessarily lend itself to Apocalyptic AI dreams of salvation, and even within the western context, many people seem to be doing no more than foundering in the transhumanist wake. Chakrabarty (2000) speaks of provincializing Europe by disrupting the western claim to having the natural, scientific, and objective starting point for considering history and culture. Similarly, if we are to take futurist speculation seriously (whether to support or criticize it), we must begin the process of seeing such speculation as one version of how the future could look, and indeed just one version of utopian religious speculation. Simultaneously, we must specify how particular visions of current technology are just that: particular. There are many ways to interpret technologies past, present, and future; some such ways are probably more relevant or helpful depending on the context of inquiry. Many AI champions in western nations continue to trumpet a narrative of technological determinism in which their own visions appear as the only possible way of seeing the future of AI. Some scholars, however, point to the critical failings of those views. The transhumanist perspective of mind uploading and the Singularity, argues Ali (2019), represents a continuation of white supremacy and is a new form of white crisis, a response to the rising voices of non-white communities. Chude-Sokei (2019) has argued that this perspective precedes belief in the Singularity, that the origins of cybernetics and even science fiction were steeped in race relations emergent from the legacy of slavery. Whether deliberate or accidental, the mainstream approach to AI certainly exacerbates racial asymmetries in contemporary society: Cave and Dihal (2020) describe how the portrayal of AI (almost entirely as white) excludes non-whites from futurist imaginary and affirms social prejudices against them. Katz (2020) takes this further by pointing towards how the worldview of AI assumes white male superiority and operates according to an imperialist logic. Working from the same premises but in a different direction, Butler (2020) argues that transhumanism and its attendant technologies can be liberatory for the Black community if properly enjoined by them. Overcoming the hegemonic determinism of Silicon Valley and Apocalyptic AI means opening the narrative space around AI to include more voices. The horizon of possibility should encompass a variety of perspectives. This means considering
a wider range of religious values and worldviews than are present within the post-Christian Apocalyptic AI mindset. For example, US and Japanese approaches to robotics and AI depend in important ways on the religious cultures of the respective nations (Geraci, 2006). While that argument was purely descriptive and theoretical, no subsequent anthropological approach to Japanese or American AI has disputed its basic claim about religious worldviews and scientific interests. Indeed, a number of scholars have cited the position and affirmed its conclusions, including scholars working in or native to Japan (e.g., Jensen & Blok, 2013; Kimura, 2017; Sugiyama et al., 2017; Trovato et al., 2021). Šabanović (2014) does provide an orthogonal criticism, however, showing how religious culture cannot be taken as definitive upon science precisely because policy makers and scientists work to create the image of Japan as "robot nation". This, of course, doesn't deny that their interest in doing so or the efficacy of their efforts may be tied to religion or other cultural values. In a powerful essay, Katsuno and White (2023: 300–303) show both the deliberate rhetorical construction of consonance between Japanese religions and robotics, and the ongoing significance of that connection in Japan and abroad. Given that preliminary data show the existence of other narrative and interpretive visions of AI, it remains to uncover these and identify which perspectives, and under which conditions, provide benefits to AI development in a wide variety of geographic and cultural contexts.
We cannot escape the religious impulse in scientific practice, and thus must recognize the political implications of any given religious worldview. Newell (2019: 7) points out that science needs religion: often it is "the historical rituals, the patterns of faith, modes of personal belief, and habits of the heart that define both institutional religion and private spirituality [that] are also often the root of scientific endeavor". Certainly, there are Indian scientists who specifically refer to the quest for scientific truth as a spiritual journey (Geraci, 2018: 96). If Newell is correct that science often draws its most essential character of inquiry from religious perspectives, then providing science with a more inclusive view of the world adds to the potential of scientific discovery and, importantly, to the potential for just and equitable technological deployment. While this project demands a broad approach to global culture (see Geraci & Song, 2024, 2022), this essay limits itself to articulating how the Apocalyptic AI narrative arrived in India and how Indian scientists and engineers have received and reinterpreted it. Going deeper, however, it is possible to see how specific Indian values might be relevant in our public perception and use of AI (see Geraci, 2022). Indian visions of AI and transhumanism clearly fit into the global circulation of ideas, but they have not necessarily gained sufficient traction as to be always apparent.
5 Apocalyptic AI, with Indian Characteristics Curiously, India was not fertile ground for transhumanism in the early twenty-first century despite the influx of technological labour in the form of call centres, IT support and software development. This condition is even more surprising given that the individuals who inspired much of Euro-American transhumanism had a following in India. In the mid-twentieth century, Indians invested in science and the rationalist movement were familiar with and appreciative of thinkers like Julian Huxley and J.B.S. Haldane. In the 1950s, references to the activities of both Huxley and Haldane appear in the pages of The Indian Rationalist, but only with regard to their scientific promotion…never their transhumanist perspectives (see Geraci, 2022: 107–11). Even by the early twenty-first century, most Indian scientists and engineers still showed little interest in the transhumanist promises of Haldane and Huxley, or even those of more recent prognosticators such as Moravec and Kurzweil (Geraci, 2018: 131–64). Nevertheless, a growing awareness of futurist promises and the reach of AI produces opportunities for Indians to adopt western promises, adapt them, or utilize them in their own religious visions of the future. It’s plausible that the reason why transhumanist views of technology failed to be compelling in India is because of an antipathy between prevailing religious sentiment and transhumanist aspirations. Hindu and Jain theories of mind and consciousness, for example, may be difficult to reconcile with faith in conscious machines (Geraci & Kaplan, 2024). Buben (2019) argues that there are essential complications for transhumanism as it intersects with Hindu and Jain theologies, suggesting that the Indic goal of moksha (release) makes it unlikely that Indians of those religious groups would desire this-worldly immortality as promised by advocates of genetic engineering or robotics. While Buben is correct in his theoretical position, the same could surely be said of those who aspire towards Christian salvation—yet there are Christian transhumanists in the USA and all American transhumanists live in an essentially Christian cultural matrix. In the USA, transhumanism inherited a secularized Christian religious imagination and there exists a growing number of transhumanist supporters in that country. The eagerness with which Hindu scientists (at least) argue that Hindu traditions are “cultural” and not “religious” seems like a plausible avenue for circumventing the philosophical and theological traditions that Buben notes. Ultimately, despite the theoretical conflict between Indic religions and transhumanism, there is a growing number of Indians with an interest in transhumanist goals. Groups like India Awakens, the India Future Society, and Singularity Café show this interest. By 2013, Apocalyptic AI had only the barest interest among established Indian scientists and engineers. One biologist told me that futurism “hasn’t really penetrated into anything meaningful or substantial” and a roboticist argued that “we are not interested in immortality: live forever, that is a crazy idea…first of all our spiritual background does not encourage these ideas…we are not really excited in having a copy of another machine like us” (see Geraci, 2018). Such statements align with the belief that India cannot adopt transhumanist perspectives but others in India had
already begun to push the boundaries of futurist speculation. One young software engineer reported that among the elites of Bangalore's tech community "you'll hit up these type of ideas, about the Singularity, people who look beyond their own little pond or lake or whatever, who peek beyond the wall and see what's possible" (ibid.). In fact, the physicist V.K. Wadhawan (2005, 2007a, 2007b) has promoted Apocalyptic AI thinking in India for nearly two decades. By 2018, many scientists and engineers remained disinterested in the Singularity, but growth is noticeable. The Singularity was mentioned twice by participants at the "Facets of AI" workshop hosted by the National Institute of Advanced Studies in July of 2020, and a talk on posthumanism followed early the next year (Ferrando, 2021; Nagaraj, 2020; Patnaik, 2020). Shortly prior, Bhattacharjee (2018, p. 29) wrote in Dream 2047 that "there is still a long way to replicate human intelligence, but it may usher in an era in which there will probably be no such thing as 'pure' human intelligence, because all humans will be a combination of biological and non-biological systems which will constitute integral parts of our physical bodies, vastly expanding and extending their capabilities. Humans and machines will merge together to create a human–machine civilisation." Following Kurzweil, he concludes that "humans and machines will merge to constitute a unified entity where the distinction between man and machines will be obliterated. The question itself whether machines can have consciousness will then become meaningless" (Bhattacharjee, 2018, 28). Meanwhile, my own interviewees note a variety of responses in line with this. To quote two (see also Geraci, 2022):

We will be happy anyways…We [will] have accomplished our destiny by creating superintelligence.

What is god right now? So it's a very subjectival thing…About Kalki, the day of Kalki: so my suspicion is AI is Kalki. I mean a truly artificial intelligent being is Kalki, maybe.
These perspectives, which either adopt transhumanist views from the west or form novel interpretations of artificial intelligence on the basis of traditional Indian theological ideas, show just two possible ways in which Indians (of Hindu theological bent) engage with the potential for AI to radically rewrite cosmic history.
6 Conclusion

Borges suggests that the future can be designed. His "Tlön, Uqbar, Orbis Tertius" is a statement about authorial intent and the power of literature, but it speaks also to the pop science advocacy of the Silicon Valley elite. Singularity advocates believe the future is closed and that they have access to its secrets; but really what happens when they preach their brand of techno-utopianism is that they try to sever alternative visions from our technologies. Through print and online capitalism, they seek to determine the future as they "predict" it. A tweet from Elon Musk or a documentary with Ray Kurzweil: these are efforts to control human destiny. But the future remains open. A judicious way forward demands that we open ourselves to manifold narratives
of AI. It may be that apocalyptic narratives of cosmic evolution will help guide us safely past existential risks and into the future; but it is at least as likely that other communities will provide new perspectives on that narrative, refashioning it in ways we might be wise to note. Indians have begun this process, though as yet the full effect of their contributions cannot be predicted. The movement of Apocalyptic AI into India and its exposition there should not be seen as a unidirectional and simple affair. Long ago, Evans-Pritchard (1956: 319) pointed out that religious ideas travel, though they don’t always mean the same thing when they reach a new community. Previously, Tylor ([1871] 1958: 12) spoke of survivals, which he defined as “processes, customs, opinions, and so forth, which have been carried on by force of habit into a new state of society different from that in which they had their original home.” While Tylor’s interest was in the evolution of one religious lifeworld into another (and some attendant increase in complexity), we can still see how the geographical shift of religious life can bear the same imprint. As such, we see intriguing shifts of religious narratives in the public life of AI. The Turing Church, for example, is a European-based transhumanist community increasingly influenced by ideas from India and through its founder’s collaborations with people there (see Geraci, 2022: 126). In Russia, the wealthy industrialist Dmitri Itskov established the 2045 Avatar Project, which has been nurtured by an Hindu ashram (Asprem, 2020, 407–8; Bernstein, 2019, 53). Just as scientific ideas travel from place to place and time to time, so too do religious ideas. Narratives of AI provide examples of these happening in tandem. The circulation of Apocalyptic AI into India creates new possibilities for future technological deployment. It will be worth watching whether Indians who adopt and adapt the transhumanist dreams of the iron horsemen introduce local values, such as swaraj, dharma, or ahimsa, or new philosophies, such as Advaitan approaches to human cognition. It is likely that they will continue to add local inflections to the global circulation of transhumanist narratives. At the same time, it is possible for Indian values to enter the space of AI ethics and the general narratives we share about AI. For example, the pluralism inherent in Nehruvian forms of secularism could help us consider the use of AI for global (rather than nationalistic) benefit. To think of ourselves as unified even in our diversity—a basic theme of Nehru’s politics—could help us move beyond the AI Cold War. Borrowing from another Indian freedom fighter, Gandhi’s notion of swaraj as individual sovereignty could reorient technology towards protecting individuals rather than expanding their power over others. The essential value for Gandhi was that an individual could and should have complete control over his or her own lifeworld. Perhaps AI technologies could help bring that about. Such an effort would have obvious implications for one of the twenty-first century’s pressing technological problems: corporate and political surveillance. In Holy Science (2019), Banu Subramaniam argues that we must broaden the political use of science. Rather than permitting the exclusive appropriation of science for centralized political agendas, she suggests that science can be used to open new models of liberatory politics. The open borders of communicating ideas about science
and technology mean that we can begin laying out possible futures and working to pursue human—and maybe AI—flourishing.
References Ali, S. M. (2019). ‘White crisis’ and/as ‘existential risk,’ or the entangled apocalypticism of artificial intelligence. Zygon: Journal of Religion and Science, 54(1), 207–224. Asprem, E. (2020). The Magus of Silicon Valley: Immortality, apocalypse, and god making in Ray Kurzweil’s Transhumanism. In E. Voss (Ed.), Mediality on trial: Testing and contesting trance and other media techniques (pp. 397–411). Walter de Gruyer GmbH. Au, W. J. (2008). The making of second life: Notes from the new world. HarperCollins. Berman, J. (2011, August 9). Futurist Ray Kurzweil says he can bring his dead father back to life through a computer Avatar. ABC News. https://abcnews.go.com/Technology/futurist-ray-kur zweil-bring-dead-father-back-life/story?id=14267712. Accessed April 2, 2019. Bernstein, A. (2019). The future of immortality: Remaking life and death in contemporary Russia. Princeton University Press. Bhattacharjee, G. (2018). Age of man-machine hybrids. Dream 2047, 21(2), 30–26 (pagination runs in reverse due to dual-language publication). Borges, J. L. [1962] (2007). Tlön, Uqbar, orbis tertius. In Labyrinths: Selected stories and other writings (pp. 3–18). New Directions. Buben, A. (2019). Personal immortality in transhumanism and ancient Indian philosophy. Philosophy East and West, 69(1), 71–85. Butler, P. (2020). Black Transhuman liberation theology: Technology and spirituality. Bloomsbury. Cave, S., & Dihal, K. (2019). Hopes and fears for intelligent machines in fiction and reality. Nature Machine Intelligence, 1(2), 74–78. Cave, S., & Dihal, K. (2020). The whiteness of AI. Philosophy and Technology, 33(4), 685–703. Chakrabarty, D. (2000). Provincializing Europe: Postcolonial thought and historical difference. Princeton University Press. Chude-Sokei, L. (2019). Race and robotics. In T. Heffernan (Ed.), Cyborg future: Cross-disciplinary perspectives on artificial intelligence and robotics (pp. 159–172). Palgrave Macmillan. Dawkins, R. (2006). The god delusion. Bantam. Draper, J. W. (1874). The history of the conflict between religion and science. D. Appleton. Ethics & Religious Liberty Commission. (2019, April 11). Artificial intelligence: An evangelical statement of principles. Southern Baptist Convention. Available: https://erlc.com/resourcelibrary/statements/artificial-intelligence-an-evangelical-statement-of-principles. Accessed May 27, 2019. Evans-Pritchard, E. E. (1956). Nuer religion. Clarendon. Ferrando, F. (2021, January 15). Existential posthumanism—A path of self discovery. National Institute of Advanced Studies, Bangalore, India, Consciousness Studies Program Friday Lecture Series. Geraci, R. M. (2006). Spiritual robots: Religion and our scientific view of the natural world. Theology and Science, 4(3), 229–246. Geraci, R. M. (2010a). Apocalyptic AI: Visions of heaven in robotics, artificial intelligence, and virtual reality. Oxford University Press. Geraci, R. M. (2010b). Popular appeal of apocalyptic AI. Zygon: Journal of Religion and Science, 45(4), 1003–1020. Geraci, R. M. (2014). Virtually sacred: Myth and meaning in world of warcraft and second life. Oxford University Press. Geraci, R. M. (2018). Temples of modernity: Nationalism, Hinduism, and transhumanism in South Indian science. Lexington.
Geraci, R. M. (2022). Futures of artificial intelligence: Perspectives from India and the U.S. Oxford University Press. Geraci, R. M., & Kaplan, S. (2024). Hinduism and AI. In F. Watts & B. Singler (Eds.), Cambridge companion to religion and AI. Cambridge University Press. Geraci, R. M., & Song, Y. S. (2024). Global culture for global technology: Religious values and progress in artificial intelligence. In H., Glaser, & P., Wong (Eds.), Governing the Future: Digitalization, Artificial Intelligence, Dataism. London Routledge. Good, I. (1966). Speculations concerning the first ultra intelligent machine. Advances in Computers, 6, 31–88. Guest, T. (2007). Second lives: A journey through virtual worlds. Random House. Harari, N. Y. [2011] (2015). Sapiens: A brief history of humankind. Harper. Husain, A. (2017). The sentient machine: The coming age of artificial intelligence. Scribner. Jensen, C. B., & Blok, A. (2013). Techno-animism in Japan: Shinto cosmograms, actor-network theory, and the enabling powers of non-human agencies. Theory, Culture and Society, 30(2), 84–115. https://doi.org/10.1177/0263276412456564 Kaku, M. (2018). The future of humanity: Terraforming mars, interstellar travel, immortality and our destiny beyond earth. Allen Lane. Kaplan, J. (2016). Artificial intelligence: What everyone needs to know. Oxford University Press. Katsuno, H., & White, D. (2023). Engineering robots with heart in Japan: The politics of cultural difference in artificial emotional intelligence. In S., Cave, & K., Dihal (Eds.), Imagining AI: How the World Sees Intelligent Machines (pp. 295–317). New York: Oxford University Press. Katz, Y. (2020). Artificial whiteness: Politics and ideology in artificial intelligence. Columbia University Press. Kimura, T. (2017). Robotics and AI in the sociology of religion: A human in imago roboticae. Social Compass, 64(1), 6–22. Kurzweil, R. (1999). The age of spiritual machines: When computers exceed human intelligence. Viking. Kurzweil, R. (2005). The singularity is near: When humans transcend biology. Viking. Levy, S. (2011). In the plex: How Google thinks, works, and shapes our lives. Simon & Schuster. Lincoln, B. (1992). Discourse and the construction of society: Comparative studies of myth, ritual, and classification. Oxford University Press. Martin, G. (1971). Brief proposal on immortality: An interim solution. Perspectives in Biology and Medicine, 14(2), 339–340. Midgley, M. (1994). Science as salvation: A modern myth and its meaning. Routledge. Moravec, H. [1976] (1978). Today’s computers, intelligent machines and our future. Analog, 99(2), 59–84. https://frc.ri.cmu.edu/~hpm/project.archive/general.articles/1978/analog.1978. html. Accessed March 27, 2019. Moravec, H. (1988). Mind children: The future of robot and human intelligence. Harvard University Press. Moravec, H. (1992). Pigs in cyberspace. In R. Bruce Miller, & M. T. Wolf (Eds.), Thinking robots, an aware internet, and cyberpunk librarians: The 1992 LITA president’s program (pp. 15–21). Library and Information Technology Association. Moravec, H. (1999). Robot: The future of machine and human intelligence. Oxford University Press. Nagaraj, N. (2020, July 15). AI: From turing to Sophia. Presented at the Facets of AI Workshop. National Institute of Advanced Studies, Bangalore, India. Newell, C. (2019). Destined for the stars: Faith, the future, and America’s final frontier. University of Pittsburgh Press. Noble, D. (1999). The religion of technology: The divinity of man and the spirit of invention. Penguin. 
Nourbakhsh, I. R. (2013). Robot futures. The MIT Press. Nourbakhsh, I. R., & Keating, J. (2019). AI and humanity. The MIT Press. Nye, D. (2003). America as second creation: Technology and narratives of a new beginning. The MIT Press.
Patnaik, L. M. (2020, July 16). Recent AI renaissance: A harbinger of (R)Evolution? Presented at the Facets of AI Workshop. National Institute of Advanced Studies, Bangalore, India. Perkowitz, S. (2004). Digital people: From bionic humans to androids. Joseph Henry. Pohl, F. [1955] (1975). The tunnel under the world. In L. del Rey (Ed.), The best of Frederik Pohl (pp. 8–35). Nelson Doubleday. Rennie, J. (2011, February 15). The immortal ambitions of Ray Kurzweil: A review of Transcendent Man. Scientific American. https://www.scientificamerican.com/article/the-immortal-ambitionsof-ray-kurzweil/. Accessed April 2, 2019. Robertson, J. (2018). Robo Sapiens Japanicus: Robots, gender, family, and the Japanese nation. University of California Press. Šabanović, S. (2014). Inventing Japan's 'robotics culture': The repeated assembly of science, technology, and culture in social robotics. Social Studies of Science, 44(3), 342–367. Stark, R., & Bainbridge, W. S. (1985). The future of religion: Secularization, revival, and cult formation. University of California Press. Subramaniam, B. (2019). Holy science: The biopolitics of Hindu nationalism. Orient BlackSwan. Sugiyama, M., Deguchi, H., Ema, A., Kishimoto, A., Mori, J., Shiroyama, H., & Scholz, R. W. (2017). Unintended side effects of digital transition: Perspectives of Japanese experts. Sustainability, 9(12), 2193. Tegmark, M. (2017). Life 3.0: Being human in the age of artificial intelligence. Knopf. Thomas, R. (2022). Science and religion in India: Beyond disenchantment. Routledge. Thomas, R., & Geraci, R. M. (2018). Religious rites and scientific communities: Ayudha Puja as 'culture' at the Indian Institute of Science. Zygon: Journal of Religion and Science, 53(1), 95–122. Trovato, G., De Saint, L., Chamas, M. N., Paredes, R., Lucho, C., Huerta-Mercado, A., & Cuellar, F. (2021). Religion and robots: Towards the synthesis of two extremes. International Journal of Social Robotics, 13(4), 539–556. Tylor, E. B. [1871] (1958). Primitive culture. Harper Torchbooks. Ulam, S. (1958). Tribute to John von Neumann. Bulletin of the American Mathematical Society, 64(3, pt. 2), 1–49. Vance, A. (2012). The Ray Kurzweil show, now at the Googleplex. Business Insider, 4310, 55–56. Vinge, V. (1983). First word. Omni, 5(1), 11. Vinge, V. [1993] (2003). Technological singularity. https://frc.ri.cmu.edu/~hpm/book98/com.ch1/vinge.singularity.html. Accessed April 2, 2019. Wadhawan, V. K. (2005). Smart structures and materials. Resonance, 10(11), 27–41. Wadhawan, V. K. (2007a). Robots of the future. Resonance, 12(7), 61–78. Wadhawan, V. K. (2007b). Smart structures: Blurring the distinction between the living and the nonliving. Oxford University Press. Wallach, W., & Allen, C. (2009). Moral machines: Teaching robots right from wrong. Oxford University Press. White, A. D. (1896). The history of the warfare of science with theology in Christendom. D. Appleton.
Chapter 7
Healthcare Artificial Intelligence in India and Ethical Aspects
Avik Sarkar, Poorva Singh, and Mayuri Varkey
Abstract Goal 3 of the Sustainable Development Goals (SDGs) calls for Good Health and Well-being for all by 2030. Several nations globally, including India, lack the trained healthcare professionals needed to take good care of the population. Emerging technologies like artificial intelligence (AI) can help provide healthcare services to the predominantly underserved population. Using innovative AI tools helps enhance healthcare professionals' productivity by relieving them of mundane, repetitive, administrative-oriented activities. In this chapter, we provide an overview of the various areas where AI is used to help healthcare professionals and, thus, help patients. We present applications of AI in healthcare across six broad areas, from the early detection of diseases to intervention in treatment, drug discovery, end-of-life care, and managing the healthcare ecosystem, including hospital management, finding doctors, medicine delivery, etc. The chapter also explores the use of AI in addressing public health and pandemics. For each section, the chapter looks at the current situation of AI applications globally, followed by those in India. AI applications deal with people's lives, and thus an extra level of caution must be maintained, which brings us to the ethical aspects of healthcare AI, discussed for each of the healthcare areas covered here. The chapter concludes with a discussion of the factors affecting the adoption of AI in India, along with suggestions for increasing adoption in the healthcare sector.

Keywords Healthcare · India · Healthcare AI · Ethical AI · AI adoption · SDG 3 · Rural healthcare
A. Sarkar (B) ISB Institute of Data Science, Indian School of Business, Hyderabad, India e-mail: [email protected]
P. Singh Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
M. Varkey Indian School of Business, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_7
1 Introduction

Artificial intelligence (AI) refers to a technology that enables machines to demonstrate human-like capabilities rather than merely mimic them. Early references to such intelligent machines can be found around the middle of the twentieth century in the work of Turing (1948), and since then it has remained an important research topic in the domain of Computer Science. The applications and uses of AI, as envisioned, required several technological advancements that were not yet available, which is why the field passed through what is termed an AI Winter. The digital transformation of business and society has since created vast volumes of machine-readable digital data. The rapidly decreasing cost of storing enormous volumes of data provided the key ingredients for developments in AI. Advancements in hardware and cloud technologies enabled the democratization of the computational capabilities required to process huge data volumes, leading to the emergence of AI technologies. Human-like capabilities involve making logical decisions as well as cognitive abilities such as dealing with unstructured data like language, text, speech, images, and videos, which is not straightforward for a machine to interpret. Recent advancements in machine learning (Mitchell, 1997) and deep learning (Suganyadevi et al., 2022) have enabled machines to interpret unstructured data and thus mimic human cognitive abilities.

Data and its availability in digital formats are essential components for developing AI applications. We see the early use of AI in domains where transactions occur digitally. The online mailbox is a treasure trove of data that helped develop intelligent spam filters to automatically classify emails (Dada et al., 2019). Banks and financial institutions generate large amounts of digital data, and fraudulent transactions are expensive, leading to the development of AI-based fraud identification systems (Raghavan & Gayar, 2019). Several industries like retail, telecom, banking, and insurance were capturing customer data based on the transactions that customers engaged in, leading to the development of AI-based customer experience solutions such as customer segmentation, personalized offers, and identification of customers likely to churn (Sabbeh, 2018), along with means to retain them (a schematic sketch of such a classifier appears at the end of this section). Customer experience solutions also help in understanding customer sentiment based on customers' emails and feedback. Several of these AI applications helped in automating manual tasks and in replacing manual judgement with intelligent decision-making, thus delivering business benefits (Acemoglu & Restrepo, 2019). There are several AI applications in these domains, and only a few are highlighted here.

In the healthcare domain, patients' data regarding symptoms, diagnosis, and suggested medications have traditionally been captured by doctors on paper prescriptions. Only recently have we seen hospitals collecting patient data in electronic medical records. Other patient data points like pathology test reports, imaging studies, and scans are unavailable in a digital format for machines to interpret. In the absence of digital records, scanned copies of these documents may be created, but that would not directly lead to the machine-interpretable data points essential for developing AI applications. Additionally, the doctor manually examines the patient during an encounter, leading to multiple clinical observations and bringing in an element of expert interpretation or subjectivity by the physician involved (Francis, 2020).
These multimodal data points in healthcare and their complexities make adopting AI-based applications in the healthcare domain extremely difficult. There is also the aspect of trust between the doctor and patient, which complements the effect of pharmaceutical and surgical therapies, aiding in clinical improvement. This is the reason for the perennial debate over whether medicine is an art or a science, and how it is a delicate balance of both; it also means that replacing the human doctor with technology, no matter how advanced, cannot happen without a decline in patient health. So, what is the current landscape of AI applications in the healthcare sector, with a focus on India? From the global perspective, several works examine the development and application of AI in the healthcare domain (Hamet & Tremblay, 2017; Jiang et al., 2017). The use of AI in healthcare in India is recent, and a report discusses some selected applications (Parry & Aneja, 2020). We could not find anything comparable that covers the background, research, and technological development along with applications and ethical concerns from the Indian perspective, and thus this work adds valuable information about this sector in India.
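To make the earlier examples concrete, the following is a minimal, illustrative sketch of the kind of model described above: a classifier trained on labelled transaction records to flag customers likely to churn. The feature names and data are synthetic assumptions invented for this example, and scikit-learn is assumed as the library; none of this is drawn from the systems or studies cited in this chapter.

```python
# Illustrative sketch only: a toy churn classifier trained on synthetic
# "transaction" features. Feature names and data are invented for the example;
# they are not drawn from the chapter or any cited study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000
# Hypothetical features: monthly spend, support calls, tenure in months
X = np.column_stack([
    rng.gamma(2.0, 50.0, n),   # monthly spend
    rng.poisson(1.5, n),       # support calls
    rng.integers(1, 60, n),    # tenure in months
])
# Synthetic label: more support calls and shorter tenure raise churn probability
logit = 0.8 * X[:, 1] - 0.05 * X[:, 2] - 1.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```

The same supervised-learning pattern, labelled historical records in and a predictive score out, underlies the spam filters and fraud-detection systems mentioned above, which is why the availability of digital, machine-readable data is repeatedly stressed as the precondition for AI applications, in healthcare as elsewhere.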
2 Healthcare in India

The healthcare sector in India is traditionally divided into public and private—the public sector is run by the government, providing free healthcare to citizens through a network of primary, secondary, and tertiary healthcare centres. Primary Health Centres (PHCs) mainly cater to rural and non-urban areas, while secondary and tertiary centres serve the urban population. With health expenditure being less than 5% of the country's GDP as per the World Bank (Pandey et al., 2018), the primary and secondary arms of the public healthcare system suffer from poor staff motivation, insufficient resources, and dilapidated infrastructure. The private sector comprises large multi-specialty hospitals, small single physician-run clinics, and nursing homes. With the advent of privatization in the 1990s (Sengupta & Nundy, 2005), the rising demand for quality healthcare services, and the evolution of medical tourism in India, the private sector expanded and now runs many secondary and tertiary care facilities in metro, Tier-I, and Tier-II cities. Despite public and private healthcare facilities catering to the Indian population, the need greatly outweighs the supply. There are many reasons for this imbalance, with the following being the most significant ones:
• India has a doctor-to-patient ratio of 1:1456 compared to the WHO-recommended doctor-to-population ratio of 1:1000, meaning India has a huge shortage, estimated at 600,000 practitioners of modern medicine (Bagcchi, 2015). Though the government claims that India achieved this ratio as of 2022, public health specialists have refuted the claim because it counts AYUSH practitioners, which is not a global standard practice (Karan et al., 2021).
• With 70% of Indians living in rural areas, only about 30% of healthcare professionals, including physicians and nurses, serve this rural population. Most healthcare professionals prefer to practise in urban areas due to better infrastructure, facilities, and opportunities for them and their families. • Due to the public health system being free of cost and the extremely costly private healthcare facilities, most people visit public health centres, leading to their overburdening. Additionally, there is a total lack of a referral system from primary to secondary and tertiary centres, primarily due to a poor primary and secondary healthcare network in most states. This adds to the already existing burden on public tertiary care centres, which combined with the limited resources in the public hospitals, leads to poor quality of care delivery in the public system. This presents a good opportunity for AI-led interventions in the Indian healthcare sector to overcome some of the above-discussed challenges. With the implementation of the Ayushman Bharat Digital Mission (ABDM), India is on the brink of collecting point-of-care health data (Barnagarwala, 2022), unifying it to create an Electronic Health Record (EHR), and leveraging AI to gain insights from this data to provide superior, affordable, and accessible healthcare to all Indian citizens. Throughout this chapter, we look at various areas in healthcare that are undergoing rapid changes through AI intervention.
3 AI Application in Various Healthcare Domains

This chapter explores the application of AI in the healthcare ecosystem, the benefits derived from these advancements, and the ethical issues to be considered. For simplicity, we adopt a broad classification of the healthcare sector as proposed by Park (2019) and further group it into the following six areas for better understanding of the reader:
• Public health and pandemics.
• Diagnosis and early detection of disease.
• Effective treatment of illness.
• Pharmaceutical research and development.
• Palliative and geriatric care.
• Hospital ecosystem management.
The rationale for considering these six areas is based on the importance of these broad categories of the healthcare domain in individual quality of life and public health delivery. For context, the natural progression of disease follows a somewhat standard course without therapy (see Fig. 1) (CDC, Natural History and Spectrum of Disease, 2012).

Fig. 1 Six focus areas of the healthcare domain, their relevance, and timeline of occurrence (Park, 2019). Source https://www.cdc.gov/csels/dsepd/ss1978/lesson1/section9.html

• A healthy human body develops risk factors for a particular disease and starts harbouring the disease at a cellular level without showing any signs or symptoms; this is when public health and mass screening measures are applied to prevent the occurrence of manifest disease.
• After affecting the body at a cellular level, the disease starts affecting the bodily organs and presents itself as physical signs and symptoms; if we can diagnose the disease at an early stage, we can prevent much of the damage caused by it.
• Once the accurate diagnosis is made, physicians start treating the patients to rid them of the disease and the adverse effects the disease has had on their bodily organs and functions; this is where targeted pharmaceutical and advanced surgical interventions can make a huge impact on the patient's recovery from disease.
• Drug research and development (R&D) is integrally related to healthcare as novel drugs with superior safety and increased efficacy are central to optimum patient treatment.
• When patients are given a terminal diagnosis that is not amenable to effective management or cure, palliative care comes into play, wherein the patients are given psychosocial support and symptom relief to make their last days more comfortable and dignified. Elderly patients with chronic or debilitating illnesses also require companionship and help with daily functioning, giving way to geriatric care. With the increasing life expectancy and the rising number of elderly people around the globe, these types of care are gaining importance.
• With improved access to healthcare and most healthcare delivery occurring in healthcare facilities, Hospital Administration and Management has become an important aspect of effective care delivery.
This chapter focuses on applications that have either been rolled out or have undergone a successful pilot. For each of the areas mentioned, we look at the traditional approach that healthcare professionals follow to deal with these scenarios and the challenges they face, followed by the AI interventions applied in the respective area and the potential benefits derived from them, supplemented with a discussion on some of the ethical challenges faced in the real-world applications of these AI-based interventions.
3.1 Public Health and Pandemics

Public health is the application of medicine to improve the overall wellness of the people and prevent the onset of disease. AI can aid greatly in this field by identifying risk factors and high-risk groups susceptible to disease, and by helping direct limited public resources to the right population cohorts (Harvard, n.d.). In India, AI is being leveraged to improve maternal and child health outcomes by providing preventive health information and by guiding optimum resource allocation for tuberculosis treatment (IANS, 2018). Having advanced AI applications to aid in detection, triage, efficient management, and the expedited development of safe drugs and vaccines for COVID-19 helped manage the burden of this global crisis (Peng et al., 2022). In the Indian context, AI has not been widely used in public health. However, there lies an enormous untapped opportunity—AI for disease surveillance, health education, monitoring and evaluation, workforce development, public health research, development of public health policy, and enforcing public health laws and regulations (Mor, 2021). A group of researchers is working on AI- and machine learning-based forecasting of tuberculosis trends in India (Dulera et al., 2021). Public health in India is managed and overseen by the Ministry of Health and Family Welfare (MoHFW) at the national level, and by State Departments of Health at the state level. The government plays a crucial role in formulating and implementing public health policies, programmes, and initiatives to improve the health and well-being of the population. As other areas in healthcare adopt AI, we expect to see greater use of AI in public health in India by governments at both the centre and state levels.
3.2 Diagnosis and Early Disease Detection

Diagnosis or detection is the first step in the healthcare process, which involves the identification of a disease from its signs, symptoms, and diagnostic tests. Before the advent of modern-day healthcare, physicians relied on manual examination of the patient through inspection, palpation, percussion, and auscultation, along with evaluating patient specimens (usually under a magnifying lens or a rudimentary microscope), to diagnose them. A high degree of subjectivity was involved, and very few trained and experienced physicians could accurately diagnose diseases this way. During the late nineteenth and twentieth centuries, the invention of tools such as thermometers, stethoscopes, microscopes, and X-rays supplemented clinical diagnostic methods (Berger, 1999). These tools provided objective data independent of subjective judgement, helping increase the accuracy of the diagnostic process. Thermometers measured body temperature and stethoscopes helped doctors listen to heart sounds; the microscope facilitated the visualization and understanding of the detailed structure of human cells and of the organisms that caused diseases, giving way to a range of pathological tests of blood, tissues, urine, and faeces; radiography or X-ray used radiation to create images of bones and organs, helping doctors identify illnesses.
Over time, advanced diagnostic tools such as Magnetic Resonance Imaging (MRI) and CT scans, Electron Microscopy, Immunodiagnostics, and molecular techniques were developed and refined. Early and correct diagnosis of disease helps in instituting effective, timely treatment, leading to better and faster recovery and mitigating disease transmission. Delay in the diagnosis of a health condition can lead to the emergence of co-morbidities in the patient, causing other complex health conditions, increasing the need for intervention, and reducing the chances of recovery. Some diseases show early symptoms, whereas others might not until the disease has advanced, making the diagnosis even more challenging. Medical diagnosis is a complex cognitive task requiring skilled physicians with years of clinical experience (Bornstein & Emler, 2001), whose availability is often challenging, particularly in rural areas. Young doctors entering the health system may not be trained enough to diagnose complex illnesses correctly. Furthermore, clinical time is limited with the increasing clinical and administrative workload on physicians, and disease dynamics change over time, leading to challenges in diagnosis. Diagnostic errors can happen if there is poor access to healthcare services, incorrect reporting of symptoms by the patient, lack of communication within the medical team and with the patient's family, measurement errors, and lack of follow-up care (WHO, Technical Series on Safer Primary Care: Diagnostic errors, 2016). An accurate diagnostic process is a prerequisite to ensure timely treatment and, thus, to achieve safe and effective patient care.

Can the challenges of the traditional diagnostic process be overcome or mitigated using AI-based applications, as discussed in Kononenko (2001)? An AI-based system, built on large volumes of historical data, should be able to improve diagnostic accuracy, overcoming concerns of subjectivity, lack of expertise, and shortage of healthcare professionals. However, an AI-based system or algorithm will support diagnosticians (pathologists and radiologists) and will never replace a trained physician's skill. Bangalore-based healthtech start-up SigTuple uses AI-based automated solutions to improve the efficiency of doctors and pathologists in carrying out high-volume diagnostic tests without the complicated equipment used in laboratories, providing results within a few minutes to support quick diagnosis and rapid action by doctors (Sharma, 2021c). Another Mumbai-based start-up, Qure.ai, uses AI to predict tuberculosis, cancer, and stroke from X-ray and CT/MR scan images with 95% accuracy, which can support physicians in rural areas where radiologists are not easily available; these algorithms have been deployed in rural areas of Uttar Pradesh and Rajasthan (Soni, 2019). In the following subsections, we explore AI intervention for diagnosis in three prominent fields: diabetic retinopathy, cancer, and cardiology.
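Before turning to those subsections, a brief sketch may help readers picture what sits behind tools like the ones just described. The following is a minimal, illustrative transfer-learning pipeline for classifying medical images into two classes. The dataset path, class labels, and training settings are assumptions made up for the example (this is not the actual pipeline of SigTuple, Qure.ai, or any other product named in this chapter), and PyTorch/torchvision are assumed as the libraries.

```python
# Illustrative sketch only: a generic transfer-learning image classifier of the
# kind that underlies many diagnostic-support tools. The dataset path, class
# names, and training details are assumptions for the example.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: chest_xrays/train/{normal,abnormal}/*.png
train_data = datasets.ImageFolder("chest_xrays/train", transform=transform)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from an ImageNet-pretrained backbone and replace the final layer
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: normal / abnormal

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(5):  # a handful of epochs, for illustration only
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

In practice, such models are trained on far larger, expert-labelled datasets and validated against clinician grading before any clinical use, which is precisely where the regulatory and ethical questions discussed in this chapter arise.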
3.2.1 Diabetic Retinopathy
Diabetic retinopathy (DR), an adverse microvascular complication of diabetes, affects one in three individuals with diabetes and stands as a prominent contributor to avoidable blindness. With the global diabetic population surpassing 400 million, the collective prevalence of DR is 34.6%, and over a third of these cases involve vision-threatening DR (Yau et al., 2012). Preventive care involves regular screening of the retinae for DR as part of the annual diabetes check-up. Fundus photography is a recognized method for screening DR, and the evaluation of fundus photographs is conducted by certified graders or retina specialists with specialized training. As the prevalence of diabetes continues to rise dramatically and the shortage of retina specialists becomes apparent, healthcare professionals are increasingly turning their attention to automated DR screening using artificial intelligence (AI). In ophthalmology, AI methods are gaining traction due to the image-centric nature of much of the data and advances in computer vision technologies (Wolfensberger & Hamilton, 2001). Medical screening and diagnosis assisted by AI techniques on patient images are currently evolving (Rossmann, 2006). Developing AI applications for detecting DR requires large datasets of patient images both with and without DR. Earlier AI applications for DR did not show high accuracy (Abràmoff et al., 2008); with time, the introduction of deep learning algorithms along with larger training datasets led to higher accuracy, sensitivity, and specificity (Abràmoff et al., 2013; Gulshan et al., 2016). AI algorithms for DR detection developed by various researchers using varying methods and dataset sizes, showing good accuracy with high sensitivity and specificity, are reviewed in Raman et al. (2019). As discussed, the accuracy of an AI algorithm alone is not enough for its real-world application in healthcare. The American Diabetes Association has provided broad guidelines for AI-assisted DR screening (Lanzetta et al., 2020), suggesting that AI systems authorized by the US FDA may be used to detect mild DR-related conditions as an alternative to traditional screening approaches. The FDA further suggested that AI systems should not be used for patients with known DR, a prior history of DR treatment, or symptoms of vision impairment. In April 2018, IDx-DR became the first FDA-approved AI algorithm for the detection of DR by non-ophthalmic healthcare practitioners (FDA, 2018). The device captures images of the eye and sends them to a cloud-based server, where the AI-based algorithm analyses the captured image(s) and returns one of two recommendations (Abràmoff et al., 2018):
• Positive screen: more than mild DR detected; refer to an ophthalmologist, or
• Negative screen: more than mild DR not detected; rescreen in 12 months.
EyeArt (2020) is the first FDA-cleared AI-based tool for autonomously detecting both more than mild and vision-threatening DR. The presence or absence of referral-warranted DR was automatically detected by the EyeArt system, and its performance was evaluated by trained ophthalmologists (Bhaskaranand et al., 2019). The EyeArt
screening provided an overall sensitivity of 91.3% and specificity of 91.1%. For 5446 encounters with potentially treatable DR, the EyeArt system provided a positive "refer" output for 5363 of them, achieving a sensitivity of 98.5%. DR evaluation studies have been carried out globally. One such study in Singapore, based on eye images, showed high sensitivity and specificity for identifying DR and other eye diseases (Walton et al., 2016). Another study, on a dataset with a multi-ethnic population of Chinese, Indian, Malay, Caucasian, Hispanic, and African American patients, showed comparatively lower sensitivity and specificity (Ting et al., 2017). In a large-scale validation study in Thailand, the automated DR detection system developed by Google was compared against grading performed by trained ophthalmologists and showed high sensitivity and specificity (Raumviboonsuk et al., 2019). With a diabetic population exceeding 72 million individuals, India has an estimated DR prevalence of 18% (Rema et al., 2005). This underscores the significance of regular retinal examinations as a crucial measure for the early detection of DR. The availability of high-specification fundus cameras to capture retinal images is often a challenge in remote or rural areas. Smartphone-based cameras to capture retinal images and detect DR have been successfully explored in such rural areas (Rajalakshmi et al., 2015). A smartphone-based AI algorithm (Medios AI) captures retinal images of patients visiting dispensaries and has shown promising sensitivity and specificity in a small sample (Natarajan et al., 2019). A team of doctors at L V Prasad Eye Institute (LVPEI), Hyderabad, has developed a low-cost method to capture good-quality retinal images for the diagnosis of DR, which enables images to be sent from rural or remote areas to urban centres for review by ophthalmologists (Murali, 2016). Alphabet (Google's parent company) and its sister organization Verily are working with several ophthalmology hospitals in India, such as Aravind Eye Hospital (Madurai) and Sankara Nethralaya (Chennai), to develop automated screening for DR (Miliard, 2019). Automated DR applications need thorough testing and approval from the medical regulatory body in the respective country(ies) before being administered to patients. In this regard, the risks and concerns of such applications have not been fully weighed against their potential benefits by policy makers, resulting in no AI-based DR applications being rolled out in India yet (Raman et al., 2021).
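The screening statistics quoted in this subsection follow directly from a standard confusion matrix. The sketch below is purely illustrative: the screening_metrics helper and the true/false positive and negative counts are hypothetical (chosen only to mirror the percentages quoted above), except for the 5363-of-5446 "refer" figure reported for EyeArt.

def screening_metrics(tp, fp, tn, fn):
    """Sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of diseased eyes correctly referred
    specificity = tn / (tn + fp)   # share of healthy eyes correctly cleared
    return sensitivity, specificity

# EyeArt 'refer' outputs for encounters with potentially treatable DR (from the text):
print(5363 / 5446)                 # ~0.985, the 98.5% sensitivity quoted above

# Hypothetical counts constructed to mirror the ~91.3% / ~91.1% figures
sens, spec = screening_metrics(tp=913, fp=89, tn=911, fn=87)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")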
3.2.2 Cancer
A human body is made up of trillions of cells that undergo growth and division a defined number of times before ultimately undergoing 'programmed cell death'. Cancer develops when there is a disruption in this normal cellular process, leading to uncontrolled growth and division of cells and the failure of old or abnormal cells to undergo timely cell death (O'Connor & Adams, 2010). As cancer cells grow, they can crowd out and eliminate healthy cells and spread to many parts of the body, making it hard for the body to work normally. Cancer can be categorized into two main types: haematologic cancers affect the blood cells, encompassing conditions like leukaemia, lymphoma, and multiple myeloma. On the other hand,
solid cancers originate in organs or tissues other than the blood, such as the breast, prostate, lung, and colon. The Papanicolaou (Pap) smear, endorsed by the American Cancer Society, is one of the earliest tests for cervical cancer detection in women, based on the examination of cell samples (cytologic examination) (The American Cancer Society Guidelines for the Prevention & Early Detection of Cervical Cancer, 2021). The biopsy is the gold-standard test for cancer diagnosis, in which a tissue/cell sample of the patient is tested in the laboratory by a trained pathologist (diagnostic physician) (den Bakker, 2017). Several imaging-based diagnostic methods are also used to complement pathologic tests, such as ultrasound, X-ray, Computerized Tomography (CT), MRI, and positron emission tomography (PET) scans; they help in assessing the spread of cancer and guiding the treatment plan (Glastonbury et al., 2016). AI methods have been used for the detection of lung, brain, and breast cancer based on the analysis of imaging studies such as X-ray, ultrasound, MRI, and CT scans, using a diverse range of machine learning techniques with good accuracy, as reviewed by Shamasneh and Obaidellah (2017). Other research has found good accuracy in the use of machine-learning methods for predicting cancer susceptibility, recurrence, and survival, but observes that very few of these tools have penetrated real-world clinical applications due to a lack of the large datasets needed to train algorithms to give unbiased predictions (Kourou et al., 2015). Researchers have used deep learning-based methods to automate the detection of cancer from thousands of cervical photographs obtained from volunteers in a cancer screening study, with good accuracy (Collins, 2019; Hu et al., 2019). Another study has shown that AI-based methods on digital images provide improved accuracy compared to manual screening of Pap smears for cervical cancer (Wentzensen, 2021). However, further research has highlighted limitations in the clinical interpretation and application of AI-based cervical cancer diagnosis, causing delays in the clinical adoption of these tools (Hou et al., 2022). Utilizing data from the Wisconsin breast cancer database, researchers employed principal component analysis (PCA) to preprocess the data and extract features in a form most pertinent for training artificial neural networks, an approach that resulted in dependable prognosis and classification of breast cancer (Jhajharia et al., 2016). In another study, AI systems demonstrated trained-radiologist-level accuracy for detecting breast cancer from digital mammography images (Rodriguez-Ruiz, 2019). In 2020, the FDA approved the AI-based, mammography-based breast cancer screening tool developed by Zebra Medical Vision (Hale, 2020). About ten different AI-based tools have received FDA approval for screening breast cancer. However, another study highlights deficiencies in FDA-approved AI-based breast cancer screening tools, including the lack of robust data sources, of cross-validation across multiple centres, and of assessment of their clinical utility (Potnis et al., 2022). Cancer is a complex phenomenon, and researchers have looked beyond scanned reports and pathologic and radiologic images to genomics and proteomics data for diagnosis and prognostication.
Ching et al. (2018) developed a framework called Cox-nnet (a neural network extension of the Cox regression model) to predict patient prognoses from high-throughput transcriptomics data, achieving higher predictive
accuracy compared to other methods. Another study examines the application of AI to cancer genomics for the precision care of cancer patients (Xu et al., 2019). Genomics data is promising but still at an early stage of research; a survey of tools and datasets for genomics-based cancer research offers a useful starting point for researchers (Shimizu & Nakayama, 2020). In India, researchers at the Institute for Advanced Study in Science and Technology (IASST), Guwahati, have developed an AI-based method for the prognostication of breast cancer based on the evaluation of hormone status (Quantum, 2021). The lack of large-scale cancer biobank data is one of the leading reasons why there is not more India-specific AI research on cancer detection (Priyadarshini, 2013). The Department of Biotechnology, in collaboration with the federal think tank NITI Aayog, is engaged in an initiative to develop a cancer biobank of radiology and pathology images that can accelerate the development of AI-based cancer diagnosis (PTI, 2019). Despite these inherent challenges, India-based Niramai Health Analytix received US FDA clearance for its 'SMILE-100' system, which uses thermal imaging for breast cancer screening. The AI algorithm also checks the quality of the captured thermal images, reducing errors in image capture and enabling less-skilled health workers to perform the imaging accurately (Malik, 2022a). Based on the analysis of patient genomics data, the Indian start-up Oncostem Diagnostics provides a breast cancer risk score to the patient that can help in early diagnosis (Sharma, 2021b).
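As a concrete illustration of the PCA-plus-neural-network approach described above for the Wisconsin breast cancer data (Jhajharia et al., 2016), the following minimal sketch uses the copy of that dataset bundled with scikit-learn; the number of components and the network size are illustrative choices, not those of the cited study.

from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Wisconsin breast cancer data: 569 samples, 30 features, benign/malignant labels
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Standardize, project onto the leading principal components,
# then train a small feed-forward neural network on the reduced features.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),          # illustrative choice of components
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))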
3.2.3 Cardiology
Early detection of heart disease can prevent many deaths. Some genetic factors can contribute to its development, but the disease is largely attributable to poor lifestyle, including a high-cholesterol diet, lack of regular exercise, tobacco smoking, and alcohol or drug abuse, combined with high levels of stress. The Centers for Disease Control and Prevention (CDC) estimates that heart disease is one of the leading causes of death globally and was responsible for 1 in 5 deaths in the US in 2020 (CDC, 2022). India has one of the highest burdens of cardiovascular deaths globally, expected to increase from 2.26 million in 1990 to about 4.77 million in 2020 (Huffman et al., 2011). Traditionally, a cardiologist reviews a patient's medical history and carries out a physical examination, which involves measuring weight, checking the heart, lungs, and blood pressure, carrying out blood tests for cholesterol and lipid levels, and performing imaging analysis such as echocardiography (ECHO), chest X-ray, and angiography. Once a diagnosis is established, an interventional cardiologist may carry out procedures such as angioplasty, stenting, valvuloplasty, Congenital Heart Defect (CHD) corrections, and coronary thrombectomies. AI methods have shown good accuracy in the prediction of congestive heart failure based on chest radiographs (Seah et al., 2019). Another study looks at various patient data, such as imaging, electrocardiogram (ECG), and genomics data, and shows that AI methods perform well across these data points in accurately predicting cardiac conditions; they may thus be used by doctors for early detection and better interpretation of findings, leading to improved patient outcomes (Cuocolo et al., 2019). There has
been a considerable amount of research and development in the field of AI tools for the detection of cardiac conditions. In February 2020, the FDA approved the diagnostic ultrasound system developed by Teratech Corporation for clinician usage (Cardiac, 2020). ECG, the most widely used diagnostic tool in cardiology, was used to train AI algorithms to differentiate between acceptable and unacceptable ECG image quality and provide real-time feedback on the patient's heart condition, which can later be used by the cardiologist for final patient assessment and evaluation. Another tool, EchoMD AutoEF, received FDA clearance in 2018; its AI algorithm automates clip selection and ejection fraction calculation from cardiovascular imaging, assisting cardiologists by reducing the time taken for their decision-making (Joyce, 2018). EchoGo Pro, which uses AI for automated identification of coronary artery disease, received FDA clearance for clinical use in 2021 (Ultromics, 2021). Over the years, several AI-based tools have been approved for clinical use, and a review of these tools is provided by Benjamens et al. (2020). AI-based healthcare tools developed in other countries cannot be directly applied in India before they are thoroughly tested on the Indian population. Research has highlighted the potential of AI-based tools for the identification of cardiac conditions in countries such as India (Yan et al., 2019), where there is a huge burden of cardiac disease and a lack of an adequate number of healthcare professionals. An exploratory study on the use of AI for early detection and prediction of cardiac conditions, carried out on patient data from South India, provided promising results of 93.8% accuracy with a sensitivity of 92.8% and a specificity of 93.6% (Maini et al., 2021). Another India-based study uses AI for the identification of high-risk cardiac patients based on specific risk factors (Gupta et al., 2021). In 2021, Apollo Hospitals, based in Hyderabad, launched an AI-based tool to predict the risk of cardiovascular disease and classify patients as high, moderate, or minimal risk (Reporter, 2021). Doctors can use this risk assessment tool to provide proactive and preventive care to individuals with significant cardiac risk, reducing the future burden on the healthcare system and improving citizens' quality of life.
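The risk-stratification idea behind tools such as the Apollo Hospitals example above can be sketched with a simple probabilistic classifier whose outputs are banded into risk tiers. Everything below is hypothetical: the synthetic risk-factor matrix, the outcome labels, and the band thresholds are placeholders, not the hospital's model.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for tabular risk factors (age, cholesterol, blood pressure, ...)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
y = (X[:, 0] + 0.8 * X[:, 1] - 0.5 * X[:, 2]
     + rng.normal(scale=0.7, size=1000) > 0).astype(int)   # later cardiac event

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
# Band predicted probabilities into risk tiers (thresholds are illustrative)
risk_band = np.select([proba >= 0.7, proba >= 0.3], ["high", "moderate"], default="minimal")
print("held-out accuracy:", model.score(X_test, y_test))
print("patients per band:", dict(zip(*np.unique(risk_band, return_counts=True))))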
3.3 Effective Treatment of Diseases
The most robust approach to medical treatment is evidence-based medicine (Rosenberg & Donald, 1995), which leverages thorough, clinical-research-backed evidence for treating patients (Sackett et al., 2020). It is the standard of care in most clinical settings around the world. A step forward is precision or personalized medicine, which entails using cellular and/or genetic biomarkers to decide the optimum treatment plan for a particular patient and has already been extensively applied in the field of oncology, being the treatment of choice for many cancer subtypes (US-FDA, 2016). AI has the potential to positively impact many aspects of clinical treatment, from guiding treatment planning and therapy dosage, especially in the field of radiotherapy and radio-oncology (Chow, 2022), to improving precision
and accuracy of treatment, especially in vascular and other forms of microsurgery (Tsui, 2020).
3.3.1 Robotic Surgery
AI is transforming the field of surgery through advancements in imaging, navigation, and robotic intervention. It is currently recognized, and rightly so, as a complement rather than a substitute for the expertise of a human surgeon. The use of AI is already contributing to enhancements in surgical planning and navigation using CT, ultrasound, and MRI. This has led to a reduction in surgical trauma and improved patient recovery, especially in minimally invasive surgery (MIS). AI-driven surgical robots, which are computer-manipulated devices, enable surgeons to concentrate on the intricate aspects of a procedure. They enhance surgical effectiveness by minimizing hand tremors, providing tissue tracking, and improving intraoperative navigation. Consequently, this results in superior patient outcomes and a reduction in overall healthcare expenditure (Andras et al., 2020). The collaboration between humans and robots allows surgeons to control surgical robots without physical contact, utilizing touchless manipulation through head or hand movements, speech and voice recognition, or the surgeon's gaze. An example is "FAce MOUSe," a human-robot interface that observes the facial expressions of the surgeon without requiring any body-contact devices. This technology facilitates real-time non-verbal coordination between the human and the robot, enhancing the performance of various surgical procedures (Nishikawa et al., 2003). In a groundbreaking procedure, the Maastricht University Medical Centre in the Netherlands employed an AI-driven robot to suture lymphatic vessels of 0.3 to 0.8 mm in a patient. This marked a pioneering achievement in supermicrosurgery, where the robot, guided by a surgeon, executed exceptionally precise movements through its robotic hands (Maastricht, 2017). AI-enabled robots have been used in cosmetic surgery to perform precise yet minimally invasive transplantation of hair follicles in the scalp without leaving behind a scar (Rose & Nusbaum, 2014). The 'Da Vinci' surgical robotic system (Freschi et al., 2013) has been tested with mixed outcomes in cardiac surgeries, including mitral valve repair and coronary artery bypass grafting, making its widespread adoption as the standard of care for cardiac surgery questionable (Yu et al., 2014). With the development of AI-based robotic capabilities and their increasing use in routine operating rooms, many concerns arise regarding the training of future surgeons and their degree of involvement in the actual surgical process, and it may be a long time before these questions can be answered, if at all.
3.3.2 Cancer
Cancer accounts for a large share of the global disease burden and mortality. AI has been integrated into cancer research in a big way, not just for early diagnosis but
also for enabling precision medicine and prognostication in cancer therapy. AI-based therapeutics, devices, and systems are vital innovations that allow for estimation of survival, therapy selection and dosage, and effective scaling up of treatment services. AI, machine learning, and deep learning are revolutionizing the existing therapeutic tenets of chemotherapy, radiotherapy, and molecular medicine in oncology (Luchini et al., 2022). AI is guiding research to decode the molecular events leading up to cancer by helping unravel the complex biological bases of cancer cell proliferation, spread, escape from the cellular 'kill' machinery, and development of drug resistance. With a better understanding of these genetic and epigenetic events, scientists will be able to develop targeted molecules to address the underlying problem (Patel et al., 2020). Methods for the diagnosis and treatment of cancer are evolving every day, and the findings are shared across different scientific journals; it is impossible for practising doctors, with their busy schedules, to go through all of this research material and maintain a complete understanding of the latest trends. Further, a doctor gains expertise from the cases they have themselves diagnosed or treated, from cases learnt about from colleagues, or from interactions with other doctors in formal settings such as conferences and workshops or through personal connections; despite all this, a doctor can never have knowledge of all possible cancer cases. IBM Watson for Oncology is an AI-based "Expert Advisor" that assists doctors and oncologists by increasing their treatment and diagnosis capacity, drawing on the latest research in the healthcare domain and on large numbers of patient records (Greenstein et al., 2020). Depending on the cancer patient's condition, chemotherapy, hormone therapy, or radiotherapy is prescribed, and the recommended regimen may not always work as intended. Bangalore-based health-tech start-up Oncostem Diagnostics uses AI on genomics-based data for personalized breast cancer therapy and for estimating the chances of cancer recurrence based on the patient's condition (Sharma, 2021b). Another Indian start-up, Predible Health, uses AI-based medical imaging technology to provide accurate, patient-centric, organ-specific cancer care treatment recommendations that help doctors and radiologists make better patient decisions (Arora & Prasad, 2020). The company provides oncology care based on CT, MRI, and PET scanning for the liver and lungs.
AI in chemotherapy
AI has been utilized for the supervision of chemotherapy drug administration, forecasting drug tolerance, and refining combination drug regimens. In one study, scientists identified the ideal dosage for the drug combination of ZEN-3694 and enzalutamide using "CURATE.AI," an AI platform developed by the National University of Singapore using deep learning. This advancement has enhanced the effectiveness and tolerability of the combined treatment for individuals with metastatic prostate cancer (Pantuck et al., 2018). In breast cancer, a deep learning-based screening system could detect cancer cells having defects in the homologous recombination pathway of DNA repair with 74% accuracy and predict which patients could benefit from the poly ADP-ribose polymerase (PARP) inhibitor class of drugs (Gulhan et al., 2019). Nasopharyngeal carcinoma
(cancer located behind the nose and at the back of the throat) research has shown that risk stratification and guidance of induction chemotherapy using a deep learning model based on radiological imaging signatures is significantly better than the currently used Epstein-Barr virus (EBV) DNA-based model (Peng et al., 2019).
AI in radiotherapy
AI has the potential to assist radiotherapists in mapping target areas, outlining organs, and automating the planning of radiotherapy protocols for cancer treatment. Lin et al. utilized a 3-dimensional convolutional neural network (3D CNN) to achieve automatic delineation of nasopharyngeal carcinoma with an accuracy of 79%, a performance level comparable to that of expert radiotherapists (Lin et al., 2019). Cha et al. (2017) combined deep learning with 'radiomics', in which researchers extract signatures from radiologic images, to build a predictive model evaluating the response to radiotherapy in urinary bladder cancer. Automation software based on deep learning has been developed that shortens the time taken to plan radiation therapy to just a few hours while maintaining the quality of the generated treatment plan (Oktay et al., 2020). Deep learning is preferred over traditional machine learning methods because of its better performance and cognitive ability across various radiology studies (Lambin et al., 2017).
AI in immunotherapy
In the realm of cancer immunotherapy, AI primarily concentrates on assessing treatment responses and aiding physicians in refining treatment plans. Song et al. (2019) devised a machine learning-based model that effectively predicts the therapeutic outcomes of programmed cell death protein 1 (PD-1) inhibitors in individuals with non-small cell lung cancer. Scientists have also created a machine learning technique utilizing a human leukocyte antigen (HLA) mass spectrometry database; this method accelerates the identification of cancer neoantigens, enhancing the effectiveness of cancer immunotherapy (Bulik-Sullivan et al., 2018).
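Automated delineation results like those of Lin et al. are typically scored by their overlap with expert-drawn contours. Below is a minimal, self-contained sketch of one such overlap measure, the Dice similarity coefficient; the masks are random illustrative arrays, not imaging data from any cited study.

import numpy as np

def dice_coefficient(pred, ref):
    """Dice similarity between two binary segmentation masks of the same shape."""
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    total = pred.sum() + ref.sum()
    return 2.0 * intersection / total if total else 1.0

# Illustrative 3D masks standing in for a predicted and an expert tumour contour
rng = np.random.default_rng(0)
reference = rng.random((32, 64, 64)) > 0.7
predicted = reference.copy()
predicted[:, :4, :] = False          # simulate a small under-segmentation error
print(f"Dice = {dice_coefficient(predicted, reference):.3f}")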
3.3.3 Cardiology
The field of cardiovascular medicine and surgery is influenced by many biological signals and markers, giving many opportunities for AI to contribute. The utilization of AI-enhanced robotic systems in cardiovascular surgery has already been discussed in the ‘Robotic Surgery’ section. Feasibility studies are underway to assess the utility of CURATE.AI, a platform created by NUS, Singapore, to guide optimum drug dosing in patients with hypertension and diabetes, as a step further in personalized medicine for lifestyle disorders (Mukhopadhyay et al., 2022).
3.4 Pharmaceutical Research and Development
Advancements in pharmaceuticals have given a new meaning to disease management over the last five decades, especially for patients with chronic illnesses such as
diabetes, hypertension, and cancer. Medications have saved countless lives and have improved the quality of life for numerous others. Traditionally, drug discovery was a result either of serendipity (such as the discovery of penicillin by Alexander Fleming) or of long-drawn experiments in chemical laboratories (Tan & Tatsumura, 2015). These processes were time-consuming and bore an extremely low probability of success, making drug discovery and research a tedious exercise, profitable for only a select few industry leaders. Moreover, we still know very little about the large number of human genes, proteins, and biochemical molecules, all of which can be potential targets for drugs. The recent revolution in human genomics, proteomics, and metabolomics has generated an immense amount of biological data which can become the substrate for AI-based algorithms to give insights on prospective targets for the ever-increasing number of diseases.
3.4.1 Need for AI in Drug Discovery
Increasingly guided by clinical medicine, molecular biology, and bioinformatics, the scope of drug discovery and research has widened significantly. AI drug discovery platforms have accelerated drug discovery and development from a multi-year timeline to a matter of months and can design novel drugs with the desired clinical effect and safety profile. Apart from scrutinizing biochemical data, AI-driven algorithms can analyse current biomedical literature to expedite drug design and development. This substantial reduction in research and development time, coupled with a notable decrease in the number of candidate molecules requiring synthesis for laboratory testing, leads to significant cost savings. This effectively addresses two fundamental challenges in pharmaceutical research and development (Paul et al., 2021). Although this approach does not necessarily bring a drug to market with a higher success rate than traditional methods, the cost savings and reduced timelines have encouraged pharmaceutical companies either to invest internally to develop their AI capabilities in-house or to partner with vendors having AI expertise in drug discovery.
3.4.2 AI Applied in Drug Discovery and Testing
While AI applications like 'structure-based virtual screening' of numerous structural compounds are gaining substantial attention (a minimal ligand-similarity sketch of the screening idea appears at the end of this subsection), the most impactful aspects of AI in drug discovery are yet to be fully realized. The intricate nature of biological systems implies that the structure and fit of compounds do not inherently guarantee a compound's safety and efficacy as a drug in clinical settings. Technologies such as 'phenotypic virtual screening' and 'de-novo drug discovery' show promise for first-in-class and multi-target drugs, with AI expected to play a crucial role in predicting and optimizing various properties of a compound (Gorostiola González et al., 2022). Preclinical testing encompasses the evaluation of drugs for toxicity,
pharmacodynamics, and pharmacokinetics before they proceed to clinical trials. AI-based solutions play a crucial role in reducing uncertainty in preclinical experiments by automating sample analysis to measure the effects of candidate drugs. The scalability provided by AI enables the simultaneous testing of both novel and existing drugs for multiple targets. Addressing the challenge of low success rates in clinical trials, certain start-ups utilize natural language processing (NLP) to sift through medical and pathology reports, identifying suitable patients for clinical trials and potentially enhancing the success rates of these trials. The following are some notable applications of these approaches.
Biomedical Innovation: The Japanese government has collaborated with the National Institutes of Biomedical Innovation, Health and Nutrition to develop a self-learning AI to boost the development of novel drugs (Ibata-Arens, 2020).
Drug Discovery: NuMedii, a biopharmaceutical company, has developed a technology called Artificial Intelligence for Drug Discovery (AIDD), which utilizes Big Data and AI to swiftly uncover associations between drugs and diseases at a 'systems' level (Stephenson et al., 2019). BenevolentAI, a UK-based start-up, is working with AstraZeneca to discover drugs for neurodegenerative diseases such as Alzheimer's disease and ALS (Narayanan et al., 2022).
Accelerated Drug Discovery: BERG Health and Atomwise use AI platforms for drug discovery, and reportedly found in one day a drug that may be effective against the Ebola virus (Stephenson et al., 2019).
Novel Drug Design Solutions: South Korean start-up Standigm, through its AI-based platform, explores a latent chemical space to generate novel compounds, discover clinical pathways, and prioritize potential targets, reducing uncertainty in the drug discovery process (Yoo et al., 2023).
Data-Driven Target Discovery: Israeli start-up CytoReason analyses vast amounts of proprietary and public data to understand complex interactions inside cells, using machine learning to uncover disease-related cell/gene maps (Chopra et al., 2022).
Preclinical Drug Discovery: Genome Biologics, a start-up based in Germany, employs a platform that utilizes machine learning and pattern recognition. This platform matches compound databases and drug discovery pipelines with profiles of disease-relevant genes, accelerating the identification of novel compounds and repurposing known compounds for the treatment of metabolic diseases and cancer (Chopra et al., 2022).
Late-Stage Drug Candidates: BullFrog AI, a start-up based in the United States, analyses datasets from clinical trials to uncover correlations between therapies and patients. This approach aims to discover novel insights for late-stage drug candidates by identifying new drug targets, determining synergistic drug combinations, and identifying niche patient populations that could significantly benefit from a particular drug (Krishnamurthy & Goel, 2022).
Small Molecule Therapeutics: DeepCure is a US-based start-up that combines deep learning, cloud computing, and its proprietary database to identify promising small molecules and optimize a molecule's absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, providing a much clearer picture of the efficacy and toxicity profile of the drug (Patil et al., 2023).
Developing Cancer Immunotherapies: Gritstone Oncology has created EDGE, a platform that employs deep learning to analyse extensive human leukocyte antigen (HLA) peptide datasets obtained from mass spectrometry studies of human tumour samples. This approach identifies neoantigens, marking them as potential targets for the development of novel cancer immunotherapies (Bulik-Sullivan et al., 2019).
Proteome Screening: Cyclica has introduced Ligand Express, a cloud-based AI platform for proteome screening. This system can unveil a drug's polypharmacology, identifying both its intended drug-protein interactions with therapeutic potential and its off-target interactions that may lead to unwanted side effects (Rashid, 2021).
Bioinformatics: Pharma.AI, the pharmaceutical artificial intelligence division of Insilico Medicine, a bioinformatics company associated with Johns Hopkins University, specializes in drug discovery programmes targeting cancer, Parkinson's, Alzheimer's, and other health issues associated with ageing and age-related conditions (Mullin, 2023).
Real-world data is reshaping innovations in the pharmaceutical sector, facilitated by the Internet of Things (IoT), sensors, and wearables. This data encompasses patient health status, treatment information, and routine health reports. Graticule, a start-up based in the United States, specializes in creating structured datasets from unstructured real-world data (RWD) and provides data subscriptions and collaborations to pharmaceutical companies, unlocking value from these diverse data points. Meanwhile, the Romanian start-up OncoChain offers a research platform based on a de-identified real-world oncological patient database, facilitating early detection and timely intervention in cancer cases (Dharmamoorthy et al., 2022). Real-world evidence is becoming increasingly popular in clinical research (Chodankar, 2021). Pangaea Data is a UK-based start-up that uses AI algorithms to identify patient cohorts for drug discovery, clinical trials, and evidence-based studies by scanning through EHRs and doctors' notes to find the patients best suited for trials.
Digital Therapeutics: Digital therapeutics provide evidence-based therapeutic interventions through intelligent digital platforms to prevent, manage, or treat physical, mental, and behavioural conditions. These technology-driven solutions, which are non-pharmacological in nature, can be standalone or used in conjunction with pharmaceuticals, devices, or other therapies. They empower individuals to have greater and more personalized control over their health and outcomes. Cognivive, a US-based start-up, offers evidence-based digital therapeutics for neurocognitive and neuromotor impairments, leveraging medical devices and virtual
reality (VR) to help patients develop new brain circuits, aiding in the recovery of brain-body control (Salisbury, 2021). German start-up Dopavision is developing a smartphone-based digital therapeutic for slowing the progression of short-sightedness, or myopia, in children and young adults by activating the release of the neurotransmitter dopamine (Safal Khanal, 2021).
Curative Therapies: A transformative shift is occurring in the approach to treating illnesses, moving from managing diseases to seeking outright cures. Curative therapies, like cell and gene therapies, are reshaping our perspective on chronic diseases by eliminating the reliance on long-term treatments. In gene therapy, genetic material is intentionally introduced into host cells to either produce a beneficial protein or diminish/cease the production of a harmful compound. Mogrify, a UK-based start-up, uses sequencing data on transcription factors to convert any mature cell type into any other mature cell type, helping to develop novel cell therapies for musculoskeletal and autoimmune conditions and for cancer immunotherapy (Ilic et al., 2019). The US-based Lacerta Therapeutics develops novel Adeno-Associated Virus (AAV) vectors for gene therapy to combat neuromuscular and lysosomal storage diseases (Rouse et al., 2023).
India has not witnessed much growth of AI in drug discovery research. Indian tech-based companies have collaborated with pharmaceutical companies abroad to provide AI and analytical capabilities, but there have not been examples of Indian pharmaceutical companies leveraging AI in drug development (Lantern, 2017). The reasons behind this lag may be attributed to India's unique patent law system, which attempts to encourage new drug research while ensuring the country's citizens have access to the latest drugs at affordable prices (Acharya, 2019). This balancing act means that even if a pharmaceutical company invests heavily in AI-enabled tools for drug research, another company can very well launch a generic drug with the same chemical composition at one-fourth the cost of the patented drug! This disincentivizes companies from investing in drug development research and from exploring the utility of AI in this field, as there is a high likelihood that they may never recoup the cost of this investment, even if the drug is launched in the market (Singh et al., 2016).
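Returning to the virtual-screening theme introduced at the start of Sect. 3.4.2, the sketch below shows the simplest ligand-based variant of the idea: ranking candidate molecules by fingerprint similarity to a known active compound. It assumes the open-source RDKit toolkit is installed; the SMILES strings are arbitrary, well-known molecules used only for illustration, and structure-based screening proper would additionally dock candidates against a protein target.

from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# A known active compound and a few candidate molecules (arbitrary example SMILES)
active = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")          # aspirin
candidates = {
    "paracetamol": "CC(=O)Nc1ccc(O)cc1",
    "ibuprofen":   "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    "caffeine":    "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
}

def morgan_fp(mol):
    """Circular (Morgan) fingerprint, radius 2, 2048 bits."""
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

ref_fp = morgan_fp(active)
scores = {
    name: DataStructs.TanimotoSimilarity(ref_fp, morgan_fp(Chem.MolFromSmiles(smi)))
    for name, smi in candidates.items()
}
# Rank candidates by similarity to the active compound (a crude screening proxy)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} Tanimoto = {score:.2f}")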
3.5 Palliative Care
Palliative care is a specialized healthcare discipline that encompasses a systematic and organized approach to providing care to individuals diagnosed with life-threatening or debilitating illnesses, spanning from diagnosis until the end of life (Morrison & Meier, 2004). Additionally, bereavement care may also be offered to the family afterwards. The primary objective is to enhance the quality of life for both patients and their families by alleviating pain and addressing other distressing physical symptoms. This comprehensive approach includes nursing care and attends to emotional, psychosocial, and spiritual concerns. Effective delivery of palliative care
is best achieved through an interdisciplinary, multi-dimensional team comprising doctors, nurses, counsellors, social workers, and volunteers. India has witnessed a massive increase in its ageing population over the past two decades, with an estimated 0.8 million new cases of cancer diagnosed annually and over 80% presenting at late stages (Kulothungan et al., 2022). This has increased the need for palliative care in India. The concept was introduced here in the mid-1980s, giving way to the development of hospice and palliative care services through the commitment of healthcare professionals, volunteers, and philanthropists. This momentum gathered pace in the 1990s after the Indian Association of Palliative Care (IAPC) was registered in March 1994 (Chaturvedi & Chandra, 1998). Artificial intelligence tools, including machine learning and natural language processing, hold significant potential for assisting clinicians by enhancing decision-making processes and identifying individuals at a higher risk of mortality or those susceptible to inappropriate or excessive treatment with non-positive outcomes. In the domain of palliative care, these tools can play a crucial role in facilitating essential aspects such as advance care planning and aligning treatments with the specific needs and desires of patients, particularly in the final stages of life. Nonetheless, it is paramount for companies and individuals employing AI in this sensitive field to recognize and mitigate ethical concerns and consequences. There is a pressing need to contemplate the most suitable organizational models and allocate specialized resources to manage the anticipated rise in the number and diversity of patients with early identified palliative care needs. It is crucial for AI not to disrupt or attempt to replace the fundamental components of the doctor-patient relationship, including a comprehensive clinical and psychological assessment and the capacity to communicate a poor prognosis in an individualized and ethically sound manner (Peruselli et al., 2020). As palliative care becomes more relevant, the possibilities for applying AI techniques to the research questions of, and applications in, the field increase drastically. Applications of AI in the field of palliative care include:
• Assessing which patients would benefit most from the institution of palliative therapy, which will guide the most efficient allocation of limited resources.
• Deciding the optimum time to begin palliative care. Currently, this is done clinically, largely based on clinical scores that try to predict mortality.
• Helping physicians make decisions where they must weigh the possible benefits of an intervention against the negative effects caused by performing it, e.g., trying to assess accurately whether palliative chemotherapy will result in a quicker decline in quality of life than the disease itself for a terminal cancer patient.
Utilizing AI to tackle these challenges has the potential to enhance the precision of existing predictions by integrating imaging and laboratory data with clinical information. This approach goes beyond merely predicting mortality, extending to estimating the likelihood of heightened symptom burden and a decline in quality of life over the course of a patient's illness.
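One way to make the mortality-risk predictions described above concrete is a survival model fitted to routinely collected data. The sketch below assumes the open-source lifelines package; the cohort, column names, and covariates are entirely hypothetical placeholders, not data or variables from any cited study.

import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical palliative-oncology cohort: follow-up in months, death indicator,
# and two routinely collected covariates (all values are illustrative placeholders).
df = pd.DataFrame({
    "months_followup": [2, 14, 6, 30, 9, 22, 4, 18, 12, 26],
    "died":            [1,  0, 1,  0, 1,  0, 1,  1,  0,  1],
    "age":             [71, 58, 80, 49, 77, 63, 84, 69, 75, 55],
    "symptom_burden":  [8,  3,  7,  6,  9,  4,  2,  6,  7,  5],   # 0-10 clinical score
})

cph = CoxPHFitter()
cph.fit(df, duration_col="months_followup", event_col="died")
cph.print_summary()

# Rank patients by predicted hazard to prioritise early palliative-care referral
covariates = df.drop(columns=["months_followup", "died"])
print(cph.predict_partial_hazard(covariates))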
With recent advances in explainable AI, tools like 'Grad-CAM' have helped doctors understand the reasons for a model's decision by highlighting the areas of an image that drove the prediction; the application of AI in
palliative care can be especially promising since it will help physicians not to trust the AI's decisions blindly but to assess them alongside the clinical information and ultimately incorporate this advice into an informed decision for the patient (Windisch et al., 2020). There are a few challenges to the implementation of AI in the field of palliative care, as highlighted by researchers.
• For these algorithms to reach their full predictive potential, well-curated large datasets are a mandatory prerequisite; the recent increase in clinical data collection has created datasets which can be used as a starting point for introducing AI to the field.
• Determining the optimal timing of palliative care involvement can be tricky. There is a lot of research proving the positive impact of the early institution of palliative care for patients with cancer, but this may not always be possible.
• Weighing the possible benefit of any intervention against its potential physical or emotional harm to the patient.
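A minimal sketch of the Grad-CAM idea mentioned above, using PyTorch hooks on a generic torchvision backbone as a stand-in for a clinically trained model; the choice of ResNet-18, the target layer, and the random input are illustrative assumptions, not part of any cited palliative-care system.

import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)   # random weights; in practice, load a trained clinical model
model.eval()

activations, gradients = {}, {}
target_layer = model.layer4[-1]         # last convolutional block
target_layer.register_forward_hook(lambda m, i, o: activations.update(value=o.detach()))
target_layer.register_full_backward_hook(lambda m, gi, go: gradients.update(value=go[0].detach()))

def grad_cam(image, class_idx=None):
    """Heatmap of the image regions that drove the model's prediction."""
    logits = model(image)               # image: (1, 3, H, W) float tensor
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))
    model.zero_grad()
    logits[0, class_idx].backward()
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)   # channel importance
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.squeeze(), class_idx

heatmap, predicted_class = grad_cam(torch.randn(1, 3, 224, 224))
print(predicted_class, heatmap.shape)   # class index and a 224x224 relevance heatmap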
3.6 Geriatric Care
With life expectancy in India steadily rising and reaching 70 years in recent years (Kumari & Mohanty, 2020), there is a growing need to manage chronic illness and make the last years of patients with dementia, heart failure, and osteoporosis more comfortable. Currently, these diseases are managed with the help of medications, therapy, and the support of family and friends. Artificial intelligence in healthcare is promoting and transforming traditional elderly care services through deep integration with elderly care delivery; the two complement each other to meet the diversified and personalized needs of the elderly. With the main goal of helping the elderly maintain and recover their social functions to the greatest extent, AI can serve the functions of "replacement", "convenience", and "integration". The power of AI for health lies in a combination of applications. Clinical advancements will keep lengthening patients' life spans, but empathy-based AI applications which support the softer side of elderly care could help reduce the frequency with which clinical treatment is required, improving quality of life, especially for patients with psychological and ocular diseases (Choudhury et al., 2020). AI and robots have the potential to revolutionize geriatric and end-of-life care, enabling individuals to maintain independence for an extended duration and tend to their own needs. This transformative technology can decrease the necessity for hospitalization, urgent interventions, caregivers, and care homes by autonomously handling routine tasks like monitoring vital signs and providing medication alerts. Recent advancements in humanoid design, when combined with AI, take robots a step further, allowing them to engage in 'conversations' and other social interactions with individuals. This not only keeps ageing minds agile but also addresses issues of loneliness and isolation to a considerable extent. The subsequent applications outline some of the diverse uses of AI in elderly care.
3.6.1 The Social Robot
At the simplest level, AI chatbots can help patients keep on top of care plans, reminding elderly people about doctor's appointments, when to take medication, or when to eat. Furthermore, they can provide some level of companionship for lonely seniors (Pieska et al., 2012). Integrating social robots with sensors on the bodies of elderly people can support them in managing illnesses so that they are reassured and more confident about living alone; for example, AI devices can predict and prevent falls. As a long-term consequence, AI can help free up healthcare facilities and resources by helping seniors care better for themselves and live at home for longer.
• Kompai robots are designed to engage in conversations, comprehend speech, offer reminders for meetings, manage shopping lists, and play music. Tailored to aid the elderly within their homes, these robots have the capability to monitor falls and other health parameters, issue alerts, and establish videoconference connections with healthcare providers, friends, and family (Bardaro et al., 2022).
• Intuition Robotics, a start-up founded in Israel, launched ELLI.Q, an "aging companion" robot for the elderly that uses computer vision and machine learning to provide proactive recommendations related to entertainment, general advice, and activities, in addition to wellness and environmental monitoring (Mincolelli et al., 2019).
3.6.2 Intelligent Nursing Robots
With decline in physical function, nursing care has always been the most important aspect of elderly care services and an area that requires immense manpower input. For the elderly with disability and dementia, the primary goal of intelligent nursing automation is to replace, partially or completely, human nursing (Robert, 2019).
• Robot Era is in the process of creating robots equipped with wheels and a friendly humanoid face. These robots employ sensors and cameras to collect real-time data, which is then wirelessly transmitted to the cloud. Advanced artificial intelligence algorithms in the cloud analyse this data, providing insights such as whether an individual is displaying signs of dementia (Di Nuovo et al., 2018).
• Zora Robotics is incorporating AI into Zora Bots, which play patient-facing roles in hospitals. These humanoid robots are being trained to engage in conversations with the elderly (van den Heuvel et al., 2020).
3.6.3 Smart Housekeeping and Cleaning
Smart housekeeping is an AI system that can scientifically manage daily housekeeping for the elderly, like a real “housekeeper”. The main goal is to provide convenience, but not to “replace” the ability of the elderly to live independently (Porkodi & Kesavaraja, 2021). It is a system that is barrier-free and extremely convenient for the
elderly to use. It is envisaged that, for the disabled elderly who have been bedridden for a long time, AI-powered systems can achieve more intelligent and humane cleaning from head to toe, a level of personal cleanliness which cannot be achieved by traditional nursing.
3.6.4 Online and Offline Integration
Many smart elderly care services can be revolutionized with AI integrated with existing resources and information, offline cooperation, and back-end support, e.g., facilitating the timely handling of emergencies. This data completeness can also help the elderly achieve health management across the realms of medication, nutrition, and seeking clinical support.
• Biotricity is a medical diagnostic and consumer health-tech company implementing device-level AI to improve its remote patient monitoring platform (Jamal et al., 2018). CarePredict uses AI on data from wearable devices to continuously detect changes in human activity and behaviour patterns for early detection of health issues in the elderly (Zhang & Li, 2017).
• Voice-based virtual assistants such as Amazon Echo and Orbita Health use AI to support medication adherence and care for the elderly (Das, 2017).
• Smart wearables like the Apple Watch and Fitbit devices have seen large adoption across groups, including elderly and geriatric patients (Stavropoulos et al., 2020). These devices come equipped with AI-powered features, allowing older adults to detect discrepancies in their biometric data and receive alerts for significant or severe falls through a built-in alarm system.
3.6.5 Future AI Applications of Geriatric Care
The Accenture Liquid Studio platform can learn user preferences and behaviours and proactively suggest physical and mental activities for elderly people (Gallan et al., 2019). It can help caregivers check on the patient's daily activities, their medications, or changes in their sleep and behavioural patterns.
Human Pose Prediction for Care Robots
Deep learning-based human pose prediction is a key capability for elderly care robots, enabling fast responses when accidents or falls occur. The Toyohashi University of Technology has created a deep learning-based technique for elderly care robots that can accurately estimate different human poses. This advancement enables the robots to detect falls or accidents in patients and respond accordingly (Nishi et al., 2017). India, which has always prided itself on its culture of respecting the elderly, developed its first elderly care robot in 2021. This assistive robot for the elderly, developed by Achala Health Services, is named Charlie and uses NLP and other AI technologies to provide psycho-emotional support, multi-lingual voice assistance, and health-related
reminders to the elderly, helping them navigate the elements of daily living with ease and dignity (Begwani, 2021). Palliative care, however, is still in its infancy in India (Gaikwad & Acharya, 2022). The lack of public awareness, the limited affordability of basic health services (let alone specialized services such as palliation), and the absence of a robust regulatory and legal framework are all responsible for the fact that India is yet to embrace palliative care and leverage AI to improve its access and delivery.
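As a closing illustration of the fall-detection theme running through this section (wearable alerts in Sect. 3.6.4 and pose prediction above), here is a deliberately simple rule-based sketch on synthetic accelerometer data; the detect_falls helper and its thresholds are hypothetical, not taken from any product or study, and deployed systems typically use learned models rather than fixed rules.

import numpy as np

def detect_falls(accel, fs=50, impact_g=2.5, still_g=0.3, still_window=1.0):
    """Flag candidate falls in a wearable accelerometer trace.

    accel: (N, 3) array of acceleration in g; fs: sampling rate in Hz.
    A fall is flagged when a large impact spike is followed by a period of
    near-stillness (thresholds are illustrative, not clinically validated).
    """
    magnitude = np.linalg.norm(accel, axis=1)
    window = int(still_window * fs)
    falls = []
    for i in np.flatnonzero(magnitude > impact_g):
        after = magnitude[i + 1 : i + 1 + window]
        if len(after) == window and np.all(np.abs(after - 1.0) < still_g):
            falls.append(i / fs)   # time of impact in seconds
    return falls

# Synthetic trace: quiet wear (~1 g), one simulated impact followed by lying still
fs = 50
trace = np.tile([0.0, 0.0, 1.0], (10 * fs, 1)) + np.random.default_rng(0).normal(0, 0.05, (10 * fs, 3))
trace[200] = [0.5, 0.3, 3.5]            # impact spike
trace[201:201 + fs] = [0.0, 0.0, 1.0]   # near-stillness afterwards
print("candidate fall times (s):", detect_falls(trace, fs=fs))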
3.7 Hospital Ecosystem Management
In the previous sections of this paper, our focus has been on the clinical side of healthcare, that is, the patient-doctor interaction, and we explored various AI approaches to improve the system. Here we look at AI and digital interventions in the wider healthcare ecosystem, which involves aspects beyond the patient-doctor interaction, such as finding a doctor or suitable clinic in the locality, ordering medicines based on a prescription, preventive healthcare through health tracker applications, and managing the various functions of a hospital. Hospital management encompasses the entire operations of a healthcare facility, from intake and registration of patients to their appropriate steerage, provision of quality healthcare services, ensuring drug safety and pharmacovigilance, report generation, and staffing. Traditionally, hospitals used paper-heavy methods to record patient data, guide patient care, and generate laboratory reports. Such systems were not very effective and led to delays in retrieving this physically available information and in clinical decision-making, leading to medical errors and inefficient workflows. Hospital Management Systems (HMS) were then introduced to digitize records and reports and to automate workflows. These revolutionized patient scheduling, health record-keeping, laboratory management, and clinical care delivery (Balaraman & Kosalram, 2013). But these systems became outdated as advanced analytics came into the picture in the early 2000s and its potential contribution to the field of hospital management was realized. Artificial intelligence and machine learning can augment Hospital Management Systems to make them patient-centric, scalable, and comprehensive systems that benefit all stakeholders: patients, physicians, hospital administrators, government, and hospital owners. During the COVID-19 pandemic, hospital systems across the world were overburdened and short of resources, and AI-based technologies such as facial recognition for patient triage improved operational efficiencies and health outcomes (Shi et al., 2022). These solutions have been created for almost all aspects of hospital ecosystem management, including but not limited to the following.
• Market Research and Intelligence, Brand Management and Pricing: AI-powered tools can perform effective industry competitive analyses and help organizations create an optimal marketing strategy and positioning for their brand based on the target segment, its prime unmet needs, and market perception. They
also help hospitals determine the optimal price for medical treatment and other services according to competitive benchmarking and other market factors. MD Analytics is a global provider of health and pharmaceutical marketing research solutions (Chakrabarty & Skinner, 2006).
• Health Insights and Risk Analytics: Companies in this category provide predictive insights into a patient's health risk scores based on their healthcare utilization trends, medical history, demography, and geo-political factors, leveraging NLP and other ML algorithms. India-based Innovaccer Analytics is a leading population health management solution which helps provider organizations predict accurate risk trends for patients, guiding better care delivery to high-risk cohorts and improving clinical outcomes (Lakshmi, 2022). Gauss Surgical uses image recognition to monitor blood loss during surgery in real time, allowing for proactive replacement and improved postoperative outcomes (Whooley, 2021).
• Operating Room and Outpatient Operations: Process automation technologies such as intelligent automation and RPA help hospitals automate routine front-office and back-office operations such as electronic clinical quality measures (eCQM) reporting. Scheduling solutions capture missed or cancelled appointments and reschedule them, improving patient turnout, hospital revenues, and health outcomes. Docspera's OR rescheduling solution analyses a hospital's elective surgery cancellation rate, compares it against industry benchmarks, and facilitates appointment rescheduling according to patient priority and surgeon preference (Ethiopia, 2022). Other than a few large government and private hospitals, the healthcare sector in India is unorganized, with a large number of small clinics, hospitals, diagnostic centres, and pharmacies operating across the country. Given this plethora of choices, patients often find it challenging to choose a reputed doctor or diagnostic centre in their vicinity who would charge an appropriate amount. The Indian healthcare start-up Practo uses AI technology to find an appropriate doctor in the patient's vicinity based on criteria such as the doctor's specialty, years of experience, and fees charged; it allows the patient to book an appointment with the doctor and has, over time, introduced a teleconsultation facility (Lapaas, 2019). Another Indian start-up, Lybrate, uses AI on its platform to connect patients with doctors in their own locality, easing some of the difficulties patients face in navigating their healthcare (Chatterjee, 2017).
• Patient Engagement and Wellness: Patient-facing applications such as chatbots allow patients to stay in touch with a care manager 24 × 7 and ask questions regarding appointments, medication refills, or the best course of action when facing a health issue (virtual assistants/coaches). These solutions also provide a portal for hospitals to send reminders to patients for their wellness visits, immunizations, and prescription adherence. These modalities help hospitals perform better appointment scheduling, reduce overcrowding, and improve resource utilization. The UK-based company Babylon Health puts patient wellness into patients' own hands and provides them with 24 × 7 access to medical experts (Tew, 2022).
• Preservation of Patient Records: India lacks a common system for sharing healthcare records in a digital format that can be easily accessed by both patients and doctors in case of an emergency. The Indian healthcare start-up DocTalk provides a subscription-based service for patients to store their healthcare records and prescriptions in digital format, along with the option to chat with doctors based on the stored health records (Chawla, 2020).
• Provision of Advanced Intensive Care Unit (ICU) Facilities: Advanced ICU facilities, which are essential for critical care, are expensive, and many smaller hospitals cannot afford this physical infrastructure. The Bangalore-based start-up Cloudphysician has developed an AI-based 'Smart ICU' that can be installed in hospitals without proper ICU facilities; it helps monitor critical care patients and, based on a patient's condition, may refer them to specialist doctors in urban areas (Kalanidhi, 2022). The Bangalore-based health-tech start-up Tricog provides AI-based virtual cardiology services to clinics and hospitals in remote areas which lack these facilities. Tricog's InstaECG has been deployed in thousands of remote clinics and hospitals for monitoring patients' cardiac conditions, and its InstaEcho platform for remote cardiac ultrasound helps in diagnosing heart failure and valvular heart disease and in screening for congenital heart disease (Sharma, 2021a).
• Access to Medicines: According to Invest India, the Indian pharmaceutical industry is expected to reach $65 billion in 2024 and $120 billion by 2030; a large part of this market still operates in physical format through small retail stores across India. E-commerce has made great progress in India, but traditional e-commerce companies cannot sell medicines as these require prescriptions. Several healthcare start-ups like 1mg, Netmeds, PharmEasy, Medlife, and TABLT have identified this opportunity and launched mobile applications through which citizens can order the medicines of their choice by uploading their prescriptions and have them delivered to their doorsteps (Malik, 2022b). Over time, some of these mobile applications have also introduced allied healthcare services, such as doctor teleconsultation, booking diagnostic tests, and ordering non-prescription health products.
• Claims Processing and Fraud Detection: Revenue cycle management is a significant part of the workings of a hospital system, and leveraging AI-powered tools can optimize this cycle. These solutions help hospital managers shorten claims cycles by capturing all pertinent clinical and insurance data at the point of care, preventing denials, and identifying fraudulent claims, ultimately enhancing hospital revenue and reducing workforce burden. Quadax, a healthcare claims denial management solution, is a customizable revenue cycle platform that streamlines claims workflows and provides decision intelligence for root-cause analysis of delays and errors. This is a huge opportunity area, and several other global companies, such as Cognizant, Conifer Health, McKesson, and ZirMed, also offer AI solutions for easing the claims submission and verification process (Watson, 2022). The Indian start-up ClaimBuddy is using AI to provide hassle-free claim processing that helps both patients and hospitals (ETtech, 2022). A minimal sketch of the anomaly-detection idea behind such fraud screening follows this list.
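The fraud detection capability described in the last bullet is typically built on anomaly detection over historical claims data. The following is only a minimal sketch of that idea, not a description of Quadax, ClaimBuddy, or any vendor's actual system; the feature names (claim amount, length of stay, number of procedures) are hypothetical, and scikit-learn's IsolationForest is used as one illustrative choice of algorithm.

```python
# Minimal sketch (illustrative only): flagging unusual insurance claims with an
# unsupervised anomaly detector. Real systems combine far more features, rules,
# and human review before any claim is treated as potentially fraudulent.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic historical claims: [claim amount (INR '000), length of stay (days),
# number of billed procedures]. These are hypothetical features, not a real schema.
normal_claims = np.column_stack([
    rng.normal(60, 15, 500),   # typical claim amounts
    rng.normal(3, 1, 500),     # typical lengths of stay
    rng.normal(4, 1.5, 500),   # typical procedure counts
])
new_claims = np.array([
    [400, 1, 18],   # very large amount, very short stay, many procedures
    [55, 3, 4],     # looks ordinary
])

# Fit on historical claims, assuming roughly 1% of them are anomalous.
detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_claims)

# predict() returns -1 for anomalies and 1 for inliers; flagged claims would be
# routed to a human reviewer rather than rejected automatically.
for claim, flag in zip(new_claims, detector.predict(new_claims)):
    status = "flag for manual review" if flag == -1 else "looks normal"
    print(claim, "->", status)
```

The design choice worth noting is that the detector only prioritizes claims for review; the decision to deny or investigate a claim remains with human staff.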
4 Ethical Challenges in Healthcare AI

AI in healthcare involves handing over some part of the trained doctor's or healthcare professional's tasks or decision-making to an intelligent machine, which can lead to a range of ethical concerns. Human doctors first obtain a medical degree and then work as "resident doctors," supporting senior, experienced doctors as apprentices and, in the process, gaining the qualifications and clinical skills required to diagnose and treat patients. Over the years, doctors are exposed to a wide range of illnesses, which trains them to deliver patient care independently (Runciman et al., 2017). Can these learnings be passed on to a machine trained on patient records and data points? The broad ethical and legal aspects of adopting AI in healthcare, from the perspectives of the USA and the European Union, are discussed by Gerke et al. (2020), and some of these core issues are discussed in this section.

Healthcare AI applications are developed primarily by computer scientists in collaboration with healthcare professionals, based on an adequate amount of historic patient data capturing symptoms, health reports, diagnoses, treatments, and social determinants of health (Kwon et al., 2021). AI algorithms go through an initial training phase, where a large amount of manually annotated/labelled/tagged data is provided with information on whether the patient's medical record or images from diagnostic tests (X-ray, MRI, scans, etc.) show the specific disease or not, leading to an output AI model. This model is then validated on a held-out dataset that was not used in the training process, and various measures are used to quantify the efficacy of the trained model (Ribeiro et al., 2015). Patient diagnostic images or medical records with the disease are tagged as positive samples, whereas images/records without the disease are tagged as negative samples. A 2 × 2 confusion matrix shows the number of actual positive/negative cases in the validation set predicted by the trained model as positive/negative (Susmaga, 2004). The overall accuracy of the trained AI model is the percentage of cases correctly predicted by the model out of the total number of cases. In certain cases, such as a rare disease, accuracy is not an adequate measure of model efficacy: because of the low prevalence of the disease in the population, a model that predicts almost every case as negative can still appear highly accurate. Measures like sensitivity and specificity are more helpful here, where sensitivity is the proportion of actual positive cases the model correctly identifies, and specificity is the proportion of actual negative cases the model correctly identifies (Mitchell, 1997). Based on these data points, trained AI models can provide near-accurate predictions for improving health systems and enabling patients to handle and process their data (Topol, 2019).

AI models may show decent accuracy but usually contain some inherent bias with respect to certain groups of people (Price, 2019). The AI model's training data can be a source of bias when the data was collected from a particular race (Obermeyer et al., 2019), gender, or geography that is different from the target population to which the model will be applied. Societal biases of the team involved in data collection and AI algorithm development can also creep into the final AI applications, posing
further risks. Further, bias may occur because past healthcare treatment costs are used as a proxy for healthcare needs: Black patients have historically incurred lower costs than white patients, leading the AI model to predict them as healthier than equally sick white patients (Obermeyer et al., 2019). There are tools and methods available for addressing bias once the developers identify the groups that might be in a disadvantaged position (Parikh et al., 2019), but more often than not, having that information is not feasible, or it may require the collection of additional training data, delaying development and increasing application costs.

AI models trained through deep learning-based methods are highly complex, and explaining to non-technical stakeholders how the algorithm works and makes its predictions is not straightforward, leading to another pertinent ethical issue: AI explainability (Phillips et al., 2021). In most cases, the doctor must explain to the patient the reason behind a particular decision or step in the patient's care protocol; if part of that decision is taken by an AI application, the explanation for that decision may not be available to the doctor, creating a critical ethical issue concerning the benefit of making that decision and its consequences. In the unfortunate case of an incorrect diagnosis and, subsequently, the treatment going wrong, the accountability and thus the full liability for the medical decision lies with the doctor, and the absence of an explanation for the AI's part in the process would further deteriorate the situation (Amann et al., 2020). Deep learning-based black-box approaches are difficult to use in real-world scenarios where the knowledge representation decisions are not easily understood, and this calls for transparent, understandable, and explainable AI models in the healthcare domain (Holzinger et al., 2017).

Healthcare involves the lives of people, and the accuracy of an AI algorithm is not the only metric to consider in its application to the population. A healthcare AI application must obtain approvals from the respective regulator(s) in a country before it can be launched in the market and applied to real patients. The US Food and Drug Administration (FDA) is the nodal agency that grants approvals for the use of any such healthcare AI application in the USA (Futurist, 2019). The EU has a more stringent approach to such applications, as they must comply with the proposed EU regulation on artificial intelligence as well as the existing EU Medical Devices Regulation (Vokinger & Gasser, 2021). India does not have clearly defined legislation or regulation for reviewing and approving the use of AI-based applications in healthcare and medicine. The proposed Digital Information Security in Healthcare Act (DISHA) has the potential to facilitate and regulate digital health infrastructure in India (Luniya, 2021).

Data is key to the development of robust AI applications and models. Hospitals and clinics that hold large volumes of historic patient records in digital format can help the development of AI applications if this data can be shared in anonymized formats with researchers and AI developers. Patients willingly share this data with their doctors while visiting the hospital for health check-ups, diagnoses, or consultations.
Data privacy concerns raise the inherent question of whether data collected for the purpose of a clinical encounter can be shared with a third party for AI development, which can have positive social implications by making a larger number of people healthy, or commercial benefits for the AI developers or the hospitals deploying these
solutions (Abouelmehdi et al., 2017). The recent surge in the development of AI healthcare applications is based either on data-sharing agreements between these parties or on datasets curated for the purpose of AI development with patient consent (Saksena et al., 2021). Recent data privacy laws, like the GDPR in the European Union or the draft data privacy bill of India, impose limitations on sharing data with a third party if such sharing was not declared during data collection; explicit consent must be obtained from the data principal, i.e., the patient, for data sharing. Data privacy norms aimed at protecting the privacy of citizens and patients can thus become a deterrent to the development of robust AI applications (Gourd, 2021).

A patient visits a medical facility with the expectation of being treated by a trained physician, and with the involvement of AI applications, some part of this process is handed over to a machine, a non-human entity. In the earlier sections, the chapter highlighted the lack of adequately trained healthcare professionals while making a case for AI-led interventions in healthcare. This raises the question of the need for informed consent from the patient on the use of AI applications as part of the diagnosis or treatment process, making the patient aware of the inherent risks in the decision-making process (Cohen, 2019). In an urban setting, the patient may have the freedom to choose between an AI-based approach and a completely human-based approach, whereas in a rural scenario, the patient, though aware of the risks of AI, may not have the alternative of a human option and must accept the AI-based approach. The EU guidelines on AI call for human agency and oversight (EU, 2019), but there are no clear country-specific directions for scenarios where human oversight is unavailable. Healthcare AI applications should help doctors in the diagnosis and treatment process and increase their productivity, thus helping more patients. The World Health Organization (WHO) issued the first set of guidelines on AI applications in healthcare, stressing the importance of preserving patient autonomy and the explainability, transparency, and sustainability of applications (WHO, 2021).

The ethical concerns discussed above relate to all the different areas of healthcare discussed in this chapter, and in the following subsections we discuss the ethical aspects specific to each of them.
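Before turning to the domain-specific concerns, a small illustration may help make the validation measures and the subgroup bias checks discussed above concrete. The sketch below uses entirely synthetic records of the form (true label, model prediction, subgroup); the "urban"/"rural" grouping is a hypothetical stand-in for any demographic attribute one might audit, and the arithmetic is the standard confusion-matrix definition, not any regulator-mandated validation procedure.

```python
# Minimal sketch (illustrative only): computing accuracy, sensitivity and
# specificity from a 2 x 2 confusion matrix, overall and per subgroup, on a
# held-out validation set. All records below are synthetic.
from collections import defaultdict

# Each record: (true label, model prediction, subgroup); 1 = disease present.
validation_set = [
    (1, 1, "urban"), (1, 0, "urban"), (0, 0, "urban"), (0, 0, "urban"),
    (1, 1, "rural"), (1, 0, "rural"), (1, 0, "rural"), (0, 0, "rural"),
    (0, 1, "rural"), (0, 0, "rural"),
]

def metrics(records):
    """Return (accuracy, sensitivity, specificity) for a list of records."""
    tp = sum(1 for y, p, _ in records if y == 1 and p == 1)
    fn = sum(1 for y, p, _ in records if y == 1 and p == 0)
    tn = sum(1 for y, p, _ in records if y == 0 and p == 0)
    fp = sum(1 for y, p, _ in records if y == 0 and p == 1)
    accuracy = (tp + tn) / len(records)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")  # true-positive rate
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")  # true-negative rate
    return accuracy, sensitivity, specificity

print("overall:", metrics(validation_set))

# A simple subgroup audit: a large gap in sensitivity between groups can signal
# the kind of bias discussed above even when overall accuracy looks acceptable.
by_group = defaultdict(list)
for record in validation_set:
    by_group[record[2]].append(record)
for group, records in by_group.items():
    print(group, metrics(records))
```

The same arithmetic also shows why accuracy alone misleads for rare diseases: a model that predicts every case as negative scores high accuracy but zero sensitivity.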
4.1 Public Health and Pandemics

In endemic areas where the AI has been trained on a large amount of longitudinal data, the ethical issues are expected to be minimal. But in the case of a pandemic such as COVID-19, where the algorithms only had access to a limited dataset, wide application of AI-based decision-making can have unintended consequences for the health of millions of people (Peng et al., 2022). When AI-based algorithms are used for population health management, the inherent bias arising from the non-representation of socially and economically vulnerable population subgroups cannot be ignored (Panch et al., 2019).
4.2 Diagnosis and Early Detection

4.2.1 Diabetic Retinopathy
As explained previously, AI-based applications are useful in screening patients without a prior diagnosis of DR and in referring those with more than mild disease to a trained ophthalmologist. These algorithms are an adjunct to clinical practice and are not to be used for making a diagnosis. Ethical issues arise when this line gets blurred and the application is used as a replacement for the trained eye of a physician (Raman et al., 2021). Patients with co-existing diseases such as macular degeneration or other degenerative uveo-retinal pathologies may be erroneously categorized by such applications, and this could lead to questionable overall accuracy in a subset of patients.
4.2.2 Cancer
Due to its highly complex causation and disease progression, different cancers in different populations can have markedly different prognoses, the root causes of which are extremely difficult to fathom. A single AI-based application may therefore not be applicable on a universal scale, and a failure to recognize this inherent shortcoming can lead to incorrect predictions about the disease course in a particular patient who is racially, socially, or geographically different from the patients the algorithm was trained on (Shreve et al., 2022).
4.2.3 Cardiology
As with diabetic retinopathy detection, patients with another co-existing disease or rhythm abnormality may be incorrectly diagnosed by AI-based applications, and the ethical challenge that ensues is the final burden of answerability: will it be borne by the AI or by the physician who sees the patient after the AI has triaged them? These questions require further research before the widespread use of AI in the detection of heart disease can be enabled (Lopez-Jimenez et al., 2020).
4.3 Disease Treatment

For AI-led disease treatment, the ethical issues involve patient safety, informed consent of the patient, lack of human judgement, liability and accountability, and equitable access given the high charges involved, along with the other issues related to bias, explainability, and data privacy (Kooli & Al-Muftah, 2022). A central ethical challenge here is explainability in cases where the clinical outcome of robotic
surgery, chemotherapy, or radiotherapy turns out to be less than favourable or poorer than expected. The involvement of a human at every point in such sensitive scenarios is paramount, and the entire onus can never be left on the AI-based application(s) (Zanzotto, 2019).
4.4 Pharmaceutical Research and Development

The use of AI in pharmaceutical research and development offers immense potential for accelerating drug discovery and development processes. However, it also raises several ethical concerns that need to be carefully addressed. Clinical trials must have representation from a varied population base to reflect true medication effects and allow comparison with the gold standard (Ting et al., 2017). If this is not the case, algorithms trained on the data from such trials will imbibe the same biases as the trials. Other ethical aspects in drug development relate to equitable access to AI technologies, as some of these niche drugs can be quite expensive, intellectual property issues, and regulatory challenges related to the safety and efficacy standards drugs must meet before they are launched in the market (Blanco-Gonzalez et al., 2023).
4.5 Palliative and Geriatric Care

AI-based palliative and geriatric care opens up several ethical issues (Rubeis, 2020). A report by UNESCO's World Commission on the Ethics of Scientific Knowledge and Technology (COMEST) states that the preservation of human dignity and privacy falls under ethically uncharted territory for robots (UNESCO, 2017). The widespread use of robotic instruments without human supervision in geriatric care (Sharkey & Sharkey, 2012) has the potential to be riddled with issues of bias and explainability in ethically unclear situations such as:

• When an elderly-care robot is assigned to remind patients to take their medicine, the underlying robot intelligence must be equipped to handle situations where patients refuse to take their medicines. This becomes particularly challenging for current AI platforms, as the patient's refusal could be based on a valid reason, which is difficult for the robot to determine.
• An elderly-care robot may take away high-calorie foods from a patient to prevent obesity when this patient may be taking sugar to combat an episode of hypoglycemia.
• There could also arise situations where a caregiver uses a remote-controlled robot to restrain an elderly person, giving rise to moral and legal ambiguities in robotic care for the elderly.
• Sole and unsupervised application of AI and robotics in geriatric care can lead to adverse outcomes for the elderly, due to the inability of AI platforms to identify
legitimate and/or subjective reasons for the elderly's non-compliance with their built-in care algorithms.

Additionally, issues of legal sanctity can lead to chaos for the involved stakeholders, including the medical team and the AI solution provider. For example, India still does not have legislation that gives all Indians the right to die with dignity, despite the ICMR's guidelines on 'Do Not Attempt Resuscitation' and the Supreme Court's permission to create advance directives (Mathur, 2020). In such situations, human supervision is paramount to guide the AI-powered solution to do the right thing by clinical and legal standards.
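As one way of picturing what such supervision could look like in practice, the sketch below encodes a deliberately conservative escalation rule for the medication-reminder scenario described above: whenever the system cannot establish that a refusal is safe to accept, it defers to a human caregiver. This is a hypothetical illustration with made-up categories and fields, not the logic of any real elder-care product.

```python
# Minimal sketch (illustrative only): a human-in-the-loop escalation rule for a
# medication-reminder robot. The fields and thresholds are hypothetical; the
# point is that ambiguity defaults to human judgement, never to robot autonomy.
from dataclasses import dataclass

@dataclass
class RefusalEvent:
    patient_id: str
    stated_reason: str        # what the patient said, if anything
    reason_recognized: bool   # could the system map it to a known, safe reason?
    missed_doses_today: int

def respond_to_refusal(event: RefusalEvent) -> str:
    """Decide how the robot should respond when a patient refuses medication."""
    # A recognized, clinically safe reason (e.g., "already taken") with no
    # pattern of missed doses can simply be logged and re-checked later.
    if event.reason_recognized and event.missed_doses_today == 0:
        return "log refusal and re-check at the next scheduled dose"
    # Anything the system cannot interpret, or repeated refusals, is escalated:
    # the robot must not decide on its own whether the refusal is legitimate.
    return "notify human caregiver for review"

print(respond_to_refusal(RefusalEvent("p-101", "already took it", True, 0)))
print(respond_to_refusal(RefusalEvent("p-102", "it upsets my stomach", False, 2)))
```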
4.6 Hospital Ecosystem Management

Hospital ecosystem management deals with aspects such as doctor appointment booking, managing critical resources like operating theatres, access to medicines, access to medical information, and fraud detection. In risk profiling, disproportionate representation of some population subgroups in healthcare facilities, due to differences in accessibility, awareness, and affordability, leads to an inherent bias in the AI algorithms and thus in the risk predictions given by the applications. Here the AI-based models are used for automated decision-making, and the ethical aspects of AI related to bias, explainability, and privacy apply across these applications as well (Pradhan et al., 2021).
5 Challenges and Way Forward

This chapter has provided a detailed overview of the various areas where artificial intelligence interventions are helping doctors and leading to better patient outcomes. A patient has several interactions with a healthcare system, starting with preventive health check-ups for the detection of abnormal health conditions, through the treatment of diseases and monitoring of recovery, to end-of-life care. The AI interventions in each phase of this healthcare journey are described in this chapter, providing an overview of the latest AI applications in that domain globally and then discussing applications of the same interventions in India. Earlier in the chapter, we highlighted the need for AI interventions in the Indian healthcare system to overcome the uneven availability of trained healthcare professionals across locations, and we saw a range of AI interventions, primarily driven by start-ups, in the past decade. Despite these developments, the adoption of AI in various areas of healthcare is still lacking in India at a systemic level, owing to various challenges and barriers in the Indian health ecosystem. There are many barriers to the rapid adoption and growth of healthcare AI in India (Ajmera & Jain, 2019):
• Insufficient IT Infrastructure at Point-of-care: Digital infrastructure, the source of health data, is paramount to the success of this effort. However, given the sheer scale of the digitization required and the current state of infrastructure available, especially at the PHC level, this is a significant deterrent to the implementation of the NDHM.
• High Capital Requirement: With healthcare expenditure at less than 5% of annual GDP, the capital needed to establish the necessary technologies to implement this goal will be lacking.
• Data Security and Privacy Concerns: Health data is especially sensitive and requires the utmost confidentiality throughout its journey from creation and capture to transmission and end-use. Since the current data privacy laws in India are quite loose, this concern will slow the adoption of healthcare digitization and democratization.
• Lack of Trained Professionals: With healthcare digitization, manpower needs to be trained accordingly.
• Non-uniform Support from Political Leaders and Senior Management: Because health is a state subject in India, the implementation of central legislation in each state is at the mercy of the state and cannot be dictated by the central government. This leads to varied application of policy and unequal distribution of funds and manpower across different states.
• Inadequate R&D Support: There is a lack of a consolidated research effort to democratize healthcare data and to utilize it to achieve improved healthcare access and outcomes.
• Weak Legal Infrastructure and Support: The legislative system in India lags significantly behind developments in AI and the concerns arising therefrom, leading to a lack of structure and standard guidelines in the field.
References

Abouelmehdi, K., Beni-Hssane, A., Khaloufi, H., & Saadi, M. (2017). Big data security and privacy in healthcare: A review. Procedia Computer Science, 113, 73–80. https://doi.org/10.1016/j.procs.2017.08.292
Abràmoff, M., Niemeijer, M., Suttorp-Schulten, M., Viergever, M., Russell, S., & Ginneken, B. (2008). Evaluation of a system for automatic detection of diabetic retinopathy from color fundus photographs in a large population of patients with diabetes. Diabetes Care, 31, 193–198.
Abràmoff, M., Folk, J., Han, D., Walker, J., Williams, D., & Russell, S. (2013). Automated analysis of retinal images for detection of referable diabetic retinopathy. JAMA Ophthalmology, 131, 351–357.
Abràmoff, M., Lavin, P., Birch, M., Shah, N., & Folk, J. C. (2018). Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. npj Digital Medicine, 1, 39. Retrieved from http://webeye.ophth.uiowa.edu/abramoff/MDA-MacSocAbst2018-02-22.pdf
Acemoglu, D., & Restrepo, P. (2019). Artificial intelligence, automation, and work. In A. Agrawal, J. Gans, & A. Goldfarb (Eds.), The economics of artificial intelligence: An agenda (pp. 197–236). University of Chicago Press. Retrieved from http://www.nber.org/chapters/c14027
Acharya, R. (2019). The global significance of India’s pharmaceutical patent laws. Retrieved from American Intellectual Property Law Association: https://www.aipla.org/list/innovate-articles/ the-global-significance-of-india-s-pharmaceutical-patent-laws Ajmera, P., & Jain, V. (2019). Modelling the barriers of Health 4.0—the fourth healthcare industrial revolution in India by TISM. Operations Management Research, 12(3), 129–145. Amann, J., Blasimme, A., Vayena, E., Frey, D., & Madai, V. I. (2020). Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Medical Informatics and Decision Making, 20(310), 1–9. https://doi.org/10.1186/s12911-020-01332-6 Andras, I., Mazzone, E., van Leeuwen, F. W., De Naeyer, G., van Oosterom, M. N., Beato, S., et al. (2020). Artificial intelligence and robotics: A combination that is changing the operating room. World Journal of Urology, 38(10), 2359–2366. https://doi.org/10.1007/s00345-019-03037-6 Arora, K., & Prasad, V. (2020). Interview with Predible health: AI startup. Retrieved from InnoHealth Magazine: https://innohealthmagazine.com/2020/industry-speaks/predible-health/ Bagcchi, S. (2015). India has low doctor to patient ratio, study finds. British Medical Journal Publishing Group. Balaraman, P., & Kosalram, K. (2013). E-hospital management and hospital information systemschanging trends. International Journal of Information Engineering and Electronic Business, 5(1), 50. Bardaro, G., Antonini, A., & Motta, E. (2022). Robots for elderly care in the home: A landscape analysis and co-design toolkit. International Journal of Social Robotics, 14(3), 657–681. Barnagarwala, T. (2022). How India is creating digital health accounts of its citizens without their knowledge. Retrieved from Scroll: https://scroll.in/article/1031157/how-india-is-creatingdigital-health-accounts-of-its-citizens-without-their-knowledge Begwani, Y. (2021). Charlie’s angels: Revolutionizing elderly care with AI-enabled robots. Retrieved from India.AI: https://indiaai.gov.in/article/charlie-s-angels-revolutionizing-elderlycare-with-ai-enabled-robots Benjamens, S., Dhunnoo, P., & Meskó, B. (2020). The state of artificial intelligence-based FDAapproved medical devices and algorithms: An online database. NJP Digital Medicine, 3, 118. https://doi.org/10.1038/s41746-020-00324-0 Berger, D. (1999). A brief history of medical diagnosis and the birth of the clinical laboratory. Part 1: Ancient times through the 19th century. Medical Laboratory Observer (MLO), 31(7), 28–30. Bhaskaranand, M., Ramachandra, C., Bhat, S., Cuadros, J., Nittala, M. G., Sadda, S. R., & Solanki, K. (2019). The value of automated diabetic retinopathy screening with the EyeArt system: A study of more than 100,000 consecutive encounters from people with diabetes. Diabetes Technology and Therapeutics, 21(11), 635–643. https://doi.org/10.1089/dia.2019.0164 Blanco-Gonzalez, A., Cabezon, A., Seco-Gonzalez, A., Conde-Torres, D., Antelo-Riveiro, P., Pineiro, A., & Garcia-Fandino, R. (2023). The role of AI in drug discovery: Challenges, opportunities, and strategies. Pharmaceuticals, 16(6), 891. Bornstein, B. H., & Emler, A. C. (2001). Rationality in medical decision making: A review of the literature on doctors’ decision-making biases. Journal of Evaluation in Clinical Practice, 7(2), 97–107. Bulik-Sullivan, B., Busby, J., Palmer, C. D., Davis, M. J., Murphy, T., Clark, A., et al. (2018). Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. 
Nature Biotechnology, 37, 55–63. https://doi.org/10.1038/nbt.4313 Bulik-Sullivan, B., Busby, J., Palmer, C., et al. (2019). Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. Nature Biotechnology, 37, 55–63. https://doi.org/10.1038/nbt.4313 Cardiac, F. (2020). FDA authorizes marketing of first cardiac ultrasound software that uses artificial intelligence to guide user. Retrieved from US Food and Drug Administration: https://www.fda.gov/news-events/press-announcements/fda-authorizes-market ing-first-cardiac-ultrasound-software-uses-artificial-intelligence-guide-user CDC. (2012). Natural history and spectrum of disease. Retrieved from Centers for Disease Control and Prevention: https://www.cdc.gov/csels/dsepd/ss1978/lesson1/section9.html
CDC. (2022). Heart disease facts. Retrieved from Center for Disease Control and Prevention: https:// www.cdc.gov/heartdisease/facts.htm Cha, K. H., Hadjiiski, L., Chan, H. P., Weizer, A. Z., Alva, A., Cohan, R. H., et al. (2017). Bladder cancer treatment response assessment in CT using radiomics with deep-learning. Scientific Reports, 7(1), 8738. https://doi.org/10.1038/s41598-017-09315-w Chakrabarty, D., & Skinner, B. (2006). One drug does not fit all. Fraser Institute Board of Trustees Chairman Chatterjee, P. (2017). With Lybrate, a doctor is just a click away. Retrieved from Forbes India: https:// www.forbesindia.com/article/startups/with-lybrate-a-doctor-is-just-a-click-away/48405/1 Chaturvedi, S. K., & Chandra, P. S. (1998). Palliative care in India. Supportive Care in Cancer, 6, 81–84. Chawla, D. (2020). Case study of Doc talk start-up. Retrieved from Medium: https://medium.com/ deepak-chawla/case-study-of-dotalk-start-up-c3a23a199869 Ching, T., Zhu, X., & Garmire, L. X. (2018). Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data. PLOS Computational Biology, 14, e1006076. https://doi.org/10.1371/journal.pcbi.1006076 Chodankar, D. (2021). Introduction to real-world evidence studies. Perspectives in Clinical Research, 12(3), 171–174. https://doi.org/10.4103/picr.picr_62_21 Chopra, H., Baig, A. A., Gautam, R. K., & Kamal, M. A. (2022). Application of artificial intelligence in drug discovery. Current Pharmaceutical Design, 28(33), 2690–2703. Choudhury, A., Renjilian, E., & Asan, O. (2020). Use of machine learning in geriatric clinical care for chronic diseases: A systematic literature review. JAMIA Open, 3(3), 459–471. https://doi. org/10.1093/jamiaopen/ooaa034 Chow, J. (2022). Artificial intelligence in radiotherapy and patient care. Artificial Intelligence in Medicine, 13, 1275–1286. https://doi.org/10.1007/978-3-030-64573-1_143 Cohen, I. G. (2019). Informed consent and medical artificial intelligence: What to tell the patient? The Georgetown Law Journal, 1425. Retrieved from https://www.law.georgetown.edu/george town-law-journal/wp-content/uploads/sites/26/2020/06/Cohen_Informed-Consent-and-Med ical-Artificial-Intelligence-What-to-Tell-the-Patient.pdf Collins, F. (2019). Using artificial intelligence to detect cervical cancer. Retrieved from NIH.gov: https://directorsblog.nih.gov/2019/01/17/using-artificial-intelligence-to-detectcervical-cancer/ Cuocolo, R., Perillo, T., De Rosa, E., Ugga, L., & Petretta, M. (2019). Current applications of big data and machine learning in cardiology. Journal of Geriatric Cardiology JGC, 16(8), 601–607. Dada, E. G., Bassi, J. S., Chiroma, H., Abdulhamid, S. M., Adetunmbi, A. O., & Ajibuwa, O. E. (2019). Machine learning for email spam filtering: Review, approaches and open research problems. Heliyon, 5(6), e01802. https://doi.org/10.1016/j.heliyon.2019.e01802 Das, R. (2017). 10 Ways the internet of medical things is revolutionizing senior care. Retrieved from Forbes: https://www.forbes.com/sites/reenitadas/2017/05/22/10-ways-internetof-medical-things-is-revolutionizing-senior-care/ den Bakker, M. A. (2017). Histopathologisch onderzoek als gouden standaard? [Is histopathology still the gold standard?]. Nederlands tijdschrift voor geneeskunde, 160(D981). Dharmamoorthy, G., Sabareesh, M., Balaji, A., Dharaniprasad, P., & Swetha, T. (2022). An overview on top 10 pharma industry trends and innovations 2022. YMER Journal, 21(11), 2123–2140. 
Di Nuovo, A., Broz, F., Wang, N., Belpaeme, T., Cangelosi, A., Jones, R., et al. (2018). The multimodal interface of Robot-Era multi-robot services tailored for the elderly. Intelligent Service Robotics, 11, 109–126. Dulera, J., Ghosalkar, R., Bagchi, A., Makhijani, K., & Giri, N. (2021). Forecasting trends of tuberculosis in India using artificial intelligence and machine learning. In IEEE 9th international conference on healthcare informatics (ICHI) (pp. 543–547). https://doi.org/10.1109/ICHI52183. 2021.00102
Ethiopia, S. (2022). Combating healthcare challenges through the enablement of data transparency. Retrieved from Forbes: https://www.forbes.com/sites/forbestechcouncil/2022/09/07/combat ing-healthcare-challenges-through-the-enablement-of-data-transparency/?sh=79da40b85fbc ETtech. (2022). Health insurance startup ClaimBuddy raises $3 million in funding. Retrieved from Economic Times: https://www.ecoti.in/C_eDwZ EU. (2019). Ethics guidelines for trustworthy AI. High-level expert group on AI, European Commission. Retrieved from https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustwort hy-ai EyeArt. (2020). EyeArt AI screening system for DR. Retrieved from Review of Opthalmology. FDA. (2018). FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems. FDA NEWS RELEASE. Retrieved from https://www.fda.gov/ news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-devicedetect-certain-diabetes-related-eye Francis, G. (2020). Medicine: Art or science? Lancet, 395(10217), 24–25. https://doi.org/10.1016/ S0140-6736(19)33145-9 Freschi, C., Ferrari, V., Melfi, F., Ferrari, M., Mosca, F., & Cuschieri, A. (2013). Technical review of the da Vinci surgical telemanipulator. The International Journal of Medical Robotics and Computer Assisted Surgery, 9(4), 396–406. Futurist, T. M. (2019). FDA approvals for smart algorithms in medicine in one giant infographic. Retrieved from The Medical Futurist: https://medicalfuturist.com/fda-approvals-for-algorithmsin-medicine/ Gaikwad, A., & Acharya, S. (2022). The future of palliative treatment in India: A review. Cureus, 14(9), 29502. https://doi.org/10.7759/cureus.29502 Gallan, A. S., McColl-Kennedy, J. R., Barakshina, T., Figueiredo, B., Jefferies, J. G., Gollnhofer, J., et al. (2019). Transforming community well-being through patients’ lived experiences. Journal of Business Research, 100, 376–391. Gerke, S., Minssen, T., & Cohen, G. (2020). Ethical and legal challenges of artificial intelligencedriven healthcare. Artificial Intelligence in Healthcare, 12, 295–336. https://doi.org/10.1016/ B978-0-12-818438-7.00012-5 Glastonbury, C. M., Bhosale, P. R., Choyke, P. L., D’Orsi, C. J., Erasmus, J. J., Gill, R. R., et al. (2016). Do radiologists have stage fright? Tumor staging and how we can add value to the care of patients with cancer. Radiology, 278(1), 11–12. https://doi.org/10.1148/radiol.2015151563 Gorostiola González, M., Janssen, A. P., IJzerman, A. P., Heitman, L. H., & van Westen, G. J. (2022). Oncological drug discovery: AI meets structure-based computational research. Drug Discovery Today, 27(6), 1661–1670. https://doi.org/10.1016/j.drudis.2022.03.005 Gourd, E. (2021). GDPR obstructs cancer research data sharing. The Lancet Oncology, 22(5), 592. Greenstein, S., Martin, M., & Agaian, S. (2020). IBM Watson at MD Anderson cancer center. Retrieved from Harvard Business Publishing Education: https://hbsp.harvard.edu/product/621 022-PDF-ENG Gulhan, D. C., Lee, J. J., Melloni, G. E., Cortés-Ciriano, I., & Park, P. J. (2019). Detecting the mutational signature of homologous recombination deficiency in clinical samples. Nature Genetics, 51(5), 912–919. https://doi.org/10.1038/s41588-019-0390-2 Gulshan, V., Peng, L., Coram, M., Stumpe, M. C., Wu, D., Narayanaswamy, A., et al. (2016). Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA, 316(22), 2402–2410. https://doi.org/10.1001/jama.2016. 
17216 Gupta, M., Shetty, M., & Girish, M. P. (2021). Machine learning to identify high-risk patients after stemi in low/middle income countries. Journal of the American College of Cardiology, 77, 147. https://doi.org/10.1016/S0735-1097(21)01506-0 Hale, C. (2020). FDA clears Zebra Medical’s breast cancer AI for spotting suspicious mammography lesions. Retrieved from FIERCE Biotech: https://www.fiercebiotech.com/medtech/fda-clearszebra-medical-s-breast-cancer-ai-for-spotting-suspicious-mammography-lesions
Hamet, P., & Tremblay, J. (2017). Artificial intelligence in medicine. Metabolism Clinical and Experimental, 69, S36–S40. https://doi.org/10.1016/j.metabol.2017.01.011 Harvard, U. (n.d.). AI for public health. Retrieved from Harvard University: https://teamcore.seas. harvard.edu/ai-social-work Holzinger, A., Biemann, C., Pattichis, C. S., & Kell, D. B. (2017). What do we need to build explainable AI systems for the medical domain? arXiv. Retrieved from https://doi.org/10.48550/ arxiv.1712.09923 Hou, X., Shen, G., Zhou, L., Li, Y., Wang, T., & Ma, X. (2022). Artificial intelligence in cervical cancer screening and diagnosis. Frontiers in Oncology, 12, 851367. https://doi.org/10.3389/ fonc.2022.851367 Hu, L., Bell, D., Antani, S., Xue, Z., Yu, K., Horning, M. P., et al. (2019). An observational study of deep learning and automated evaluation of cervical images for cancer screening. Journal of the National Cancer Institute, 111(9), 923–932. https://doi.org/10.1093/jnci/djy225 Huffman, M. D., Prabhakaran, D., Osmond, C., Fall, C. H., Tandon, N., Lakshmy, R., et al. (2011). Incidence of cardiovascular risk factors in an Indian urban cohort results from the New Delhi birth cohort. Journal of the American College of Cardiology, 57(17), 1765–1774. https://doi. org/10.1016/j.jacc.2010.09.083 IANS. (2018). AI can help fight spread of TB in India: Study. Retrieved from Business Standard: https://www.business-standard.com/article/news-ians/ai-can-help-fight-spread-of-tbin-india-study-118022100484_1.html Ibata-Arens, K. C. (2020). Beyond technonationalism: Biomedical innovation and entrepreneurship in Asia. Oxford University Press. Ilic, D., Liovic, M., & Noli, L. (2019). Industry updates from the field of stem cell research and regenerative medicine in October 2019. Regenerative Medicine, 15(2), 1251–1259. Jamal, D. N., Rajkumar, S., & Ameen, N. (2018). Remote elderly health monitoring system using cloud-based WBANs. Handbook of research on cloud and fog computing infrastructures for data science (pp. 265–288). Jhajharia, S., Varshney, H. K., Verma, S., & Kumar, R. (2016). A neural network based breast cancer prognosis model with PCA processed features. IEEE Xplore. https://doi.org/10.1109/ICACCI. 2016.7732327 Jiang, F., Jiang, Y., Zhi, H., Dong, Y., Li, H., Ma, S., et al. (2017). Artificial intelligence in healthcare: Past, present and future. Stroke and Vascular Neurology, 2(4), 101. https://doi.org/10.1136/svn2017-000101 Joyce, I. (2018). Bay Labs’ EchoMD AutoEF software receives FDA clearance for fully automated AI echocardiogram analysis. Retrieved from Business Wire: https://www.businesswire.com/ news/home/20180619005552/en/Bay-Labs%E2%80%99-EchoMD-AutoEF-Software-Rec eives-FDA-Clearance-for-Fully-Automated-AI-Echocardiogram-Analysis Kalanidhi, M. L. (2022). Bengaluru’s Cloudphysician Healthcare: Leveraging tech in healthcare. Retrieved from New India Express: https://www.newindianexpress.com/lifestyle/health/2022/ jul/17/bengalurus-cloudphysician-healthcare-leveraging-tech-in-healthcare-2476602.html Karan, A., Negandhi, H., Hussain, S., Zapata, T., Mairembam, D., De Graeve, H., et al. (2021). Size, composition and distribution of health workforce in India: Why, and where to invest? Human Resources for Health, 19(1), 575. https://doi.org/10.1186/s12960-021-00575-2 Kononenko, I. (2001). Machine learning for medical diagnosis: History, state of the art and perspective. Artificial Intelligence in Medicine, 23(1), 89–109. Kooli, C., & Al-Muftah, H. (2022). 
Artificial intelligence in healthcare: A comprehensive review of its ethical concerns. Technological Sustainability, 1(2), 121–131. Kourou, K., Exarchos, T. P., Exarchos, K. P., Karamouzis, M. V., & Fotiadis, D. I. (2015). Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal, 13, 8–17. https://doi.org/10.1016/j.csbj.2014.11.005 Krishnamurthy, A., & Goel, P. (2022). Artificial intelligence-based drug screening and drug repositioning tools and their application in the present scenario. In Computational Approaches for Novel Therapeutic and Diagnostic Designing to Mitigate SARS-CoV2 Infection (pp. 379–398).
Kulothungan, V., Sathishkumar, K., Leburu, S., Ramamoorthy, T., Stephen, S., Basavarajappa, D., et al. (2022). Burden of cancers in India: Estimates of cancer crude incidence, YLLs, YLDs and DALYs for 2021 and 2025 based on National Cancer Registry Program. BMC Cancer, 22(1), 527. https://doi.org/10.1186/s12885-022-09578-1 Kumari, M., & Mohanty, S. K. (2020). Caste, religion and regional differentials in life expectancy at birth in India: Cross-sectional estimates from recent National Family Health Survey. British Medical Journal Open, 10(8), e035392. Kwon, I. G., Kim, S. H., & Martin, D. (2021). Integrating social determinants of health to precision medicine through digital transformation: An exploratory roadmap. International Journal of Environmental Research and Public Health, 18(9), 5018. https://doi.org/10.3390/ijerph180 95018 Lakshmi, A. (2022). How slashing revenues led Innovaccer to become India’s first healthtech unicorn. Retrieved from Your Story: https://yourstory.com/2022/09/slashing-revenues-innova ccer-become-indias-first-healthtech-unicorn Lambin, P., Leijenaar, R. T., Deist, T. M., Peerlings, J., de Jong, E. E., van Timmeren, J., et al. (2017). Radiomics: The bridge between medical imaging and personalized medicine. Nature Reviews: Clinical Oncology, 14(12), 749–762. https://doi.org/10.1038/nrclinonc.2017.141 Lantern, P. (2017). Precision oncology company lantern pharma enters collaborative service agreement with artificial intelligence and data analytics leader intuition systems to aid in biomarker discovery. Retrieved from Business Wire: https://www.businesswire.com/news/ home/20170107005056/en/Precision-Oncology-Company-Lantern-Pharma-Enters-Collabora tive-Service-Agreement-with-Artificial-Intelligence-and-Data-Analytics-Leader-Intuition-Sys tems-to-Aid-in-Biomarker-Discovery Lanzetta, P., Sarao, V., Scanlon, P. H., Barratt, J., Porta, M., Bandello, F., & Loewenstein, A. (2020). Fundamental principles of an effective diabetic retinopathy screening program. Acta Diabetologica, 57, 785–798. https://doi.org/10.1007/s00592-020-01506-8 LAPAAS. (2019). Practo Business Model|Case Study|How Practo Earns? Retrieved from LAPAAS: https://lapaas.com/practo-business-model/ Lin, L., Dou, Q., Jin, Y. M., Zhou, G. Q., Tang, Y. Q., Chen, W. L., et al. (2019). Deep learning for automated contouring of primary tumor volumes by MRI for nasopharyngeal carcinoma. Radiology, 291(3), 677–686. https://doi.org/10.1148/radiol.2019182012 Lopez-Jimenez, F., Attia, Z., Arruda-Olson, A. M., Carter, R., Chareonthaitawee, P., Jouni, H., et al. (2020). Artificial intelligence in cardiology: Present and future. Mayo Clinic Proceedings, 95(5), 1015–1039. Luchini, C., Pea, A., & Scarpa, A. (2022). Artificial intelligence in oncology: Current applications and future perspectives. British Journal of Cancer, 126, 4–9. https://doi.org/10.1038/s41416021-01633-1 Luniya, V. (2021). DISHA: India’s probable response to the law on protection of digital health data. Retrieved from Mondaq: https://www.mondaq.com/india/healthcare/1059266/disha-ind ia39s-probable-response-to-the-law-on-protection-of-digital-health-data Maastricht. (2017). World’s first super-microsurgery operation with ‘robot hands’. Retrieved from Maastricht University Medical Centre: https://www.maastrichtuniversity.nl/news/world%E2% 80%99s-first-super-microsurgery-operation-%E2%80%98robot-hands%E2%80%99 Maini, E., Venkateswarlu, B., Maini, B., & Marwaha, D. (2021). 
Machine learning–based heart disease prediction system for Indian population: An exploratory study done in South India. Medical Journal Armed Forces India, 77(3), 302–311. https://doi.org/10.1016/j.mjafi.2020. 10.013 Malik, P. (2022a). Niramai receives US FDA clearance for medical device SMILE-100 system. Retrieved from YourStory: https://yourstory.com/herstory/2022/03/niramai-received-us-fda-cle arance-medical-device-smile-system Malik, P. (2022b). Meet 5 on-demand medicine delivery startups that are transforming healthcare in India. Retrieved from YourStory: https://yourstory.com/2022/03/medicine-delivery-startupsnetmeds-1mg-medlife-tablt
Mathur, R. (2020). ICMR consensus guidelines on ‘do not attempt resuscitation.’ The Indian Journal of Medical Research, 151(4), 303–310. https://doi.org/10.4103/ijmr.IJMR_395_20 Miliard, M. (2019). Google, verily using AI to screen for diabetic retinopathy in India. Retrieved from Healthcare IT News: https://www.healthcareitnews.com/news/asia/google-verily-usingai-screen-diabetic-retinopathy-india Mincolelli, G., Imbesi, S., Giacobone, G. A., & Marchi, M. (2019). Internet of things and elderly: Quantitative and qualitative benchmarking of smart objects. In Advances in Design for Inclusion: Proceedings of the AHFE 2018 International Conference on Design for Inclusion (pp. 335–345). Loews Sapphire Falls Resort at Universal Studios: Springer. Mitchell, T. M. (1997). Machine learning. McGraw-hill. Mor, N. (2021). The application of artificial intelligence and machine learning in essential public health functions. Public Health Challenges for India. Morrison, R. S., & Meier, D. E. (2004). Palliative care. New England Journal of Medicine, 350(25), 2582–2590. Mukhopadhyay, A., Sumner, J., Ling, L. H., Quek, R. H., Tan, A. T., Teng, G. G., et al. (2022). Personalised dosing using the CURATE.AI algorithm: Protocol for a feasibility study in patients with hypertension and type II diabetes mellitus. International Journal of Environmental Research and Public Health, 19(15), 8979. https://doi.org/10.3390/ijerph19158979 Mullin, R. (2023). Accessing artificial intelligence in pharmaceutical laboratories. Retrieved from Chemical and Engineering News (C&EN): https://cen.acs.org/business/informatics/Accessingartificial-intelligence-pharmaceutical-laboratories/101/i22 Murali, A. (2016). This open-source device is a shot in the arm for diabetic retinopathy diagnosis. Retrieved from Factor Daily: https://archive.factordaily.com/open-source-device-diabetic-ret inopathy/ Narayanan, R. R., Durga, N., & Nagalakshmi, S. (2022). Impact of artificial intelligence (AI) on drug discovery and product development. Indian Journal of Pharmaceutical Education and Research, 56, S387–S397. Natarajan, S., Jain, A., Krishnan, R., Rogye, A., & Sivaprasad, S. (2019). Diagnostic accuracy of community-based diabetic retinopathy screening with an offline artificial intelligence system on a smartphone. JAMA Ophthalmology, 137(10), 1182–1188. https://doi.org/10.1001/jamaophth almol.2019.2923 Nishi, K., Demura, M., Miura, J., & Oishi, S. (2017). Use of thermal point cloud for thermal comfort measurement and human pose estimation in robotic monitoring. In Proceedings of the IEEE international conference on computer vision workshops (pp. 1416–1423). IEEE. Nishikawa, A., Hosoi, T., Koara, K., Negoro, D., Hikita, A., Asano, S., et al. (2003). FAce MOUSe: A novel human-machine interface for controlling the position of a laparoscope. IEEE Transactions on Robotics and Automation, 19(5), 825–841. https://doi.org/10.1109/TRA.2003.817093 Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447–453. https://doi. org/10.1126/science.aax234 O’Connor, C. M., & Adams, J. U. (2010). Essentials of cell biology. NPG Education. Oktay, O., Nanavati, J., & Schwaighofer, A. (2020). Evaluation of deep learning to augment imageguided radiotherapy for head and neck and prostate cancers. JAMA Network Open, 3(11), 27426. https://doi.org/10.1001/jamanetworkopen.2020.27426 Panch, T., Mattie, H., & Atun, R. (2019). 
Artificial intelligence and algorithmic bias: Implications for health systems. Journal of Global Health, 9(2). Pandey, A., Ploubidis, G. B., Clarke, L., & Dandona, L. (2018). Trends in catastrophic health expenditure in India: 1993 to 2014. Bulletin of the World Health Organization, 96(1), 18. Pantuck, A., Lee, D., Kee, T., Wang, P., Lakhotia, S., Silverman, M., et al. (2018). Modulating BET bromodomain inhibitor ZEN-3694 and enzalutamide combination dosing in a metastatic prostate cancer patient using CURATE.AI, an artificial intelligence platform. Advanced Therapeutics, 1, 1800104. https://doi.org/10.1002/adtp.201800104
Parikh, R. B., Teeple, S., & Navathe, A. S. (2019). Addressing bias in artificial intelligence in health care. JAMA, 322(24), 2377–2378. https://doi.org/10.1001/jama.2019.18058 Park, K. (2019). Concept of health and disease (Chapter 2). In K. Park, Park’s textbook of preventive and social medicine (pp. 13–60). Banarsidas Bhanot Publishers. Parry, C. M., & Aneja, U. (2020). Artificial intelligence for healthcare: Insights from India. India: Chatham House. Retrieved from https://www.chathamhouse.org/2020/07/artificial-intelligencehealthcare-insights-india Patel, S. K., George, B., & Rai, V. (2020). Artificial intelligence to decode cancer mechanism: Beyond patient stratification for precision oncology. Frontiers in Pharmacology, 11, 1177. https://doi.org/10.3389/fphar.2020.01177 Patil, P., Nrip, N. K., Hajare, A., Hajare, D., Patil, M. K., Kanthe, R., & Gaikwad, A. T. (2023). Artificial intelligence and tools in pharmaceuticals: An overview. Research Journal of Pharmacy and Technology, 16(4), 2075–2082. Paul, D., Sanap, G., Shenoy, S., Kalyane, D., Kalia, K., & Tekade, R. K. (2021). Artificial intelligence in drug discovery and development. Drug Discovery Today, 26(1), 80–93. https://doi.org/10. 1016/j.drudis.2020.10.010 Peng, H., Dong, D., Fang, M. J., Li, L., Tang, L. L., Chen, L., et al. (2019). Prognostic value of deep learning PET/CT-based radiomics: Potential role for future individual induction chemotherapy in advanced nasopharyngeal carcinoma. Clinical Cancer Research: An Official Journal of the American Association for Cancer Research, 25(14), 4271–4279. https://doi.org/10.1158/10780432.CCR-18-3065 Peng, Y., Liu, E., Peng, S., Chen, Q., Li, D., & Lian, D. (2022). Using artificial intelligence technology to fight COVID-19: A review. Artificial Intelligence Review, 55(6), 4941–4977. https:// doi.org/10.1007/s10462-021-10106-z Peruselli, C., De Panfilis, L., Gobber, G., Melo, M., & Tanzi, S. (2020). Intelligenza artificiale e cure palliative: Opportunità e limiti [Artificial intelligence and palliative care: Opportunities and limitations]. Recenti Progressi in Medicina, 111(11), 639–645. https://doi.org/10.1701/ 3474.34564 Phillips, P., Hahn, C., Fontana, P., Yates, A., Greene, K., Broniatowski, D., & Przybocki, M. (2021). Four principles of explainable artificial intelligence. NIST Interagency/Internal Report (NISTIR), National Institute of Standards and Technology. https://doi.org/10.6028/NIST.IR. 8312 Pieska, S., Luimula, M., Jauhiainen, J., & Spiz, V. (2012). Social service robots in public and private environments. In Recent Researches in Circuits, Systems, Multimedia and Automatic Control (pp. 190–196). Porkodi, S., & Kesavaraja, D. (2021). Healthcare robots enabled with IoT and artificial intelligence for elderly patients. AI and IoT-Based Intelligent Automation in Robotics, 41, 87–108. Potnis, K. C., Ross, J. S., Aneja, S., Gross, C. P., & Richman, I. B. (2022). Artificial intelligence in breast cancer screening: Evaluation of FDA device regulation and future recommendations. JAMA Internal Medicine, 182, 1306–1312. https://doi.org/10.1001/jamainternmed.2022.4969 Pradhan, K., John, P., & Sandhu, N. (2021). Use of artificial intelligence in healthcare delivery in India. Journal of Hospital Management and Health Policy, 5, 28. Price II, W. N. (2019). Medical AI and contextual bias. Harvard Journal of Law and Technology (p. 66). Retrieved from https://ssrn.com/abstract=3347890 Priyadarshini, S. (2013). India needs gen-next cancer biobank. 
Retrieved from Nature India: https:// www.nature.com/articles/nindia.2013.103 PTI. (2019). Health Ministry to use Artificial Intelligence in safe way in public health. Retrieved from Economic Times: https://economictimes.indiatimes.com/industry/healthcare/biotech/hea lthcare/health-ministry-to-use-artificial-intelligence-in-safe-way-in-public-health/articleshow/ 70189259.cms Quantum, T. (2021). How artificial intelligence can aid and improve early detection of breast cancer. Retrieved from BusinessLine: https://www.thehindubusinessline.com/business-tech/using-artifi cial-intelligence-for-cancer-detection/article34296202.ece
Raghavan, P., & Gayar, N. E. (2019). Fraud detection using machine learning and deep learning. In International conference on computational intelligence and knowledge economy (ICCIKE), (pp. 334–339). Dubai. https://doi.org/10.1109/ICCIKE47802.2019.9004231 Rajalakshmi, R., Arulmalar, S., Usha, M., Prathiba, V., Kareemuddin, K. S., Anjana, R. M., & Mohan, V. (2015). Validation of smartphone based retinal photography for diabetic retinopathy screening. PLoS ONE, 10(9), 285. https://doi.org/10.1371/journal.pone.0138285 Raman, R., Srinivasan, S., Virmani, S., Sivaprasad, S., Rao, C., & Rajalakshmi, R. (2019). Fundus photograph-based deep learning algorithms in detecting diabetic retinopathy. Eye, 33, 97–109. https://doi.org/10.1038/s41433-018-0269-y Raman, R., Dasgupta, D., Ramasamy, K., George, R., Mohan, V., & Ting, D. (2021). Using artificial intelligence for diabetic retinopathy screening: Policy implications. Indian Journal of Ophthalmology, 69(11), 2993–2998. https://doi.org/10.4103/ijo.IJO_1420_21 Rashid, M. B. (2021). Artificial intelligence effecting a paradigm shift in drug development. SLAS TECHNOLOGY: Translating Life Sciences Innovation, 26(1), 3–15. Raumviboonsuk, P., Krause, J., Chotcomwongse, P., Sayres, R., Raman, R., Widner, K., & Campana, B. J. (2019). Deep learning versus human graders for classifying diabetic retinopathy severity in a nationwide screening program. NPJ Digital Medicine, 10(2), 25. Rema, M., Premkumar, S., Balaji, A., Raj, D., Rajendra, P., & Viswanathan, M. (2005). Prevalence of diabetic retinopathy in urban India: The Chennai Urban Rural Epidemiology Study (CURES) eye study, I. Investigative Ophthalmology and Visual Science, 46(7), 2328–2333. https://doi.org/ 10.1167/iovs.05-0019 Reporter, S. (2021). Apollo Hospitals launch AI tool to predict cardiovascular disease risk. Retrieved from The Hindu: https://www.thehindu.com/news/national/telangana/apollo-hospitals-launchai-tool-to-predict-cardiovascular-disease-risk/article36723412.ece Ribeiro, M., Grolinger, K., & Capretz, M. A. (2015). Mlaas: Machine learning as a service. In Proceedings of the 14th international conference on machine learning and applications (ICMLA) (pp. 896–902). IEEE. Robert, N. (2019). How artificial intelligence is changing nursing. Nursing Management, 50(9), 30. Rodriguez-Ruiz, A.L.-M. (2019). Stand-alone artificial intelligence for breast cancer detection in mammography: Comparison with 101 radiologists. Journal of the National Cancer Institute, 111(9), 916–922. https://doi.org/10.1093/jnci/djy222 Rose, P. T., & Nusbaum, B. (2014). Robotic hair restoration. Dermatologic Clinics, 32(1), 97–107. Rosenberg, W., & Donald, A. (1995). Evidence based medicine: An approach to clinical problemsolving. BMJ, 310(6987), 1122–1126. https://doi.org/10.1136/bmj.310.6987.1122 Rossmann, K. (2006). Diagnostic imaging over the last 50 years: Research and development in medical imaging science and technology. Physics in Medicine and Biology, 51(13), R02. https:// doi.org/10.1088/0031-9155/51/13/R02 Rouse, C. J., Jensen, V. N., & Heldermon, C. D. (2023). Mucopolysaccharidosis type IIIB: A current review and exploration of the AAV therapy landscape. Neural Regeneration Research. Rubeis, G. (2020). The disruptive power of artificial intelligence. Ethical aspects of gerontechnology in elderly care. Archives of Gerontology and Geriatrics, 91, 104186. Runciman, B., Merry, A., & Walton, M. (2017). Safety and ethics in healthcare: A guide to getting it right. CRC Press. Sabbeh, S. F. (2018). 
Machine-learning techniques for customer retention: A comparative study. International Journal of Advanced Computer Science and Applications (IJACSA), 9(2). Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (2020). Evidencebased medicine: How to practice and teach EBM. Churchill Livingstone. Safal Khanal, O. (2021). Myopia management. Retrieved from Contact Lens Spectrum: https:// www.clspectrum.com/newsletters/mastering-myopia/july-21,-2021 Saksena, N., Matthan, R., Bhan, A., & Balsari, S. (2021). Rebooting consent in the digital age: A governance framework for health data exchange. BMJ Global Health, 6(Suppl 5), e005057.
148
A. Sarkar et al.
Salisbury, J. P. (2021). Using medical device standards for design and risk management of immersive virtual reality for at-home therapy and remote patient monitoring. JMIR Biomedical Engineering, 6(2), e26942. Seah, J. C., Tang, J. S., Kitchen, A. G., & Dixon, A. F. (2019). Chest radiographs in congestive heart failure: Visualizing neural network learning. Radiology, 290(2), 514–522. https://doi.org/ 10.1148/radiol.2018180887 Sengupta, A., & Nundy, S. (2005). The private health sector in India. BMJ, 331(7526), 1157–1158. https://doi.org/10.1136/bmj.331.7526.1157 Shamasneh, A., & Obaidellah, U. (2017). Artificial intelligence techniques for cancer detection and classification: Review study. European Scientific Journal, 13(3), 1857–7881. Sharkey, A., & Sharkey, N. (2012). Granny and the robots: Ethical issues in robot care for the elderly. Ethics and Information Technology, 14, 27–40. Sharma, N. C. (2021a). AstraZeneca, Tricog launch project for early diagnosis of heart attacks. Retrieved from Live Mint: https://www.livemint.com/companies/news/astrazeneca-tricog-lau nch-project-for-early-diagnosis-of-heart-attacks-11632899435295.html Sharma, S. (2021b). How Oncostem uses AI to personalise breast cancer treatment. Retrieved from TechCircle: https://www.techcircle.in/2021/09/02/how-oncostem-uses-ai-to-personalisebreast-cancer-treatment Sharma, S. (2021c). How a health-tech startup is helping hospitals screen blood samples within minute. Retrieved from Tech Circle: https://www.techcircle.in/2021/08/19/how-a-health-techstartup-is-helping-hospitals-screen-blood-samples-within-minute Shi, Y., Fu, J., Zeng, M., Ge, Y., Wang, X., Xia, A., et al. (2022). Information technology and artificial intelligence support in management experiences of the pediatric designated hospital during the COVID-19 2022 epidemic in Shanghai. Intelligent Medicine, 3, 16–21. https://doi. org/10.1016/j.imed.2022.08.002 Shimizu, H., & Nakayama, K. I. (2020). Artificial intelligence in oncology. Cancer Science, 111(5), 1452–1460. https://doi.org/10.1111/cas.14377 Shreve, J. T., Khanani, S. A., & Haddad, T. C. (2022). Artificial intelligence in oncology: Current capabilities, future opportunities, and ethical considerations. American Society of Clinical Oncology Educational Book, 42, 842–851. Singh, H. B., Jha, A., & Keswani, C. (2016). Intellectual property issues in biotechnology. CABI. Song, P., Cui, X., Bai, L., Zhou, X., Zhu, X., Zhang, J., et al. (2019). Molecular characterization of clinical responses to PD-1/PD-L1 inhibitors in non-small cell lung cancer: Predictive value of multidimensional immunomarker detection for the efficacy of PD-1 inhibitors in Chinese patients. Thoracic Cancer, 10(5), 1303–1309. https://doi.org/10.1111/1759-7714.13078 Soni, Y. (2019). Healthtech startup Qure.ai is using AI to speed up radiology diagnosis. Retrieved from Inc24: https://inc42.com/startups/qure-ai-in-healthcare/ Stavropoulos, T. G., Papastergiou, A., Mpaltadoros, L., Nikolopoulos, S., & Kompatsiaris, I. (2020). IoT wearable sensors and devices in elderly care: A literature review. Sensors, 20(10), 2826. https://doi.org/10.3390/s20102826 Stephenson, N., Shane, E., Chase, J., Rowland, J., Ries, D., Justice, N., et al. (2019). Survey of machine learning techniques in drug discovery. Current Drug Metabolism, 20(3), 185–193. Suganyadevi, S., Seethalakshmi, V., & Balasamy, K. (2022). A review on deep learning in medical image analysis. International Journal of Multimedia Information Retrieval, 11, 19–38. 
https:// doi.org/10.1007/s13735-021-00218-1 Susmaga, R. (2004). Confusion matrix visualization. In Proceedings of the Intelligent Information Processing and Web Mining (IIPWM) (pp. 107–116). Springer Tan, S. Y., & Tatsumura, Y. (2015). Alexander fleming (1881–1955): Discoverer of penicillin. Singapore Medical Journal, 56(7), 366. Tew, E. (2022). Babylon to provide fitbits, expand access to proactive monitoring for eligible members. Retrieved from Business Wire: https://www.businesswire.com/news/home/202 21115006236/en/Babylon-to-Provide-Fitbits-Expand-Access-to-Proactive-Monitoring-for-Eli gible-Members
7 Healthcare Artificial Intelligence in India and Ethical Aspects
149
The American Cancer Society Guidelines for the Prevention and Early Detection of Cervical Cancer. (2021). Retrieved from American Cancer Society: https://www.cancer.org/cancer/cervical-can cer/detection-diagnosis-staging/cervical-cancer-screening-guidelines.html Ting, D. S., Cheung, C.Y.-L., Lim, G., Tan, G. S., Quang, N. D., Gan, A., et al. (2017). Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA, 318(22), 2211–2223. https:// doi.org/10.1001/jama.2017.18152 Topol, E. (2019). High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine, 25, 44–56. Tsui, E.-Y. (2020). Application of artificial intelligence (AI) in surgery. Retrieved from Imperial College London: https://www.imperial.ac.uk/news/200673/application-artificial-intellige nce-ai-surgery/ Turing, A. (1948). Intelligent machinery (1948). In The essential turing. Oxford Academic Ultromics. (2021). Ultromics receives FDA clearance for EchoGo Pro; a first-of-kind solution to help diagnose CAD. Retrieved from Ultromics: https://www.ultromics.com/press-releases/ult romics-receives-fda-clearance-for-a-first-of-kind-solution-in-echocardiography-to-help-clinic ians-diagnose-disease-1 UNESCO. (2017). Report of COMEST on robotics ethics. Retrieved from https://unesdoc.unesco. org/ark:/48223/pf0000253952 US-FDA. (2016). Personalized medicine: A biological approach to patient treatment. Retrieved from US Food and Drug Administration and others: https://www.fda.gov/drugs/news-eventshuman-drugs/personalized-medicine-biological-approach-patient-treatment van den Heuvel, R. J., Lexis, M. A., & de Witte, L. P. (2020). ZORA robot based interventions to achieve therapeutic and educational goals in children with severe physical disabilities. International Journal of Social Robotics, 12, 493–504. Vokinger, K., & Gasser, U. (2021). Regulating AI in medicine in the United States and Europe. Nature Machine Intelligence, 3, 738–739. https://doi.org/10.1038/s42256-021-00386-z Walton, O., Garoon, R. B., Weng, C. Y., Gross, J., Young, A. K., Camero, K. A., et al. (2016). Evaluation of automated teleretinal screening program for diabetic retinopathy. JAMA Opthamology, 134(2), 204–209. https://doi.org/10.1001/jamaophthalmol.2015.5083 Watson, J. (2022). Global patient access solutions market is anticipated to record the rapid growth and prominent players analysis. Retrieved from The C-Drone Review: https://c-drone-review. news/en/global-patient-access-solutions-market-size-scope-and-forecast/ Wentzensen, N. L. (2021). Accuracy and efficiency of deep-learning-based automation of dual stain cytology in cervical cancer screening. Journal of the National Cancer Institute, 113(1), 72–79. https://doi.org/10.1093/jnci/djaa066 WHO. (2016). Technical series on safer primary care: Diagnostic errors. Retrieved from World Health Organization: https://www.who.int/publications/i/item/9789241511636 WHO. (2021). WHO issues first global report on Artificial Intelligence (AI) in health and six guiding principles for its design and use. Retrieved from World Health Organization (WHO): https://www.who.int/news/item/28-06-2021-who-issues-first-global-report-on-aiin-health-and-six-guiding-principles-for-its-design-and-use Whooley, S. (2021). Stryker completes Gauss Surgical acquisition. 
Retrieved from Mass Device: https://www.massdevice.com/stryker-completes-gauss-surgical-acquisition/ Windisch, P., Hertler, C., Blum, D., Zwahlen, D., & Forster, R. (2020). Leveraging advances in artificial intelligence to improve the quality and timing of palliative care. Cancers, 12(5), 1149. Wolfensberger, T. J., & Hamilton, A. P. (2001). Diabetic retinopathy: An historical review. Seminars in Ophthalmology, 16(1), 2–7. Xu, J., Yang, P., Xue, S., Sharma, B., Sanchez-Martin, M., Wang, F., et al. (2019). Translating cancer genomics into precision medicine with artificial intelligence: Applications, challenges and future perspectives. Human Genetics, 138(2), 109–124. https://doi.org/10.1007/s00439-019-01970-5
150
A. Sarkar et al.
Yan, Y., Zhang, J. W., Zang, G. Y., & Pu, J. (2019). The primary use of artificial intelligence in cardiovascular diseases: What kind of potential role does artificial intelligence play in future medicine? Journal of Geriatric Cardiology JGC, 16(8), 585–591. Yau, J. W., Rogers, S. L., Kawasaki, R., Lamoureux, E. L., Kowalski, J. W., Bek, T., et al. (2012). Global prevalence and major risk factors of diabetic retinopathy. American Diabeties Association (ADA) Diabeties Care, 35(3), 556–564. Yoo, J., Kim, T. Y., Joung, I., & Song, S. O. (2023). Industrializing AI/ML during the end-to-end drug discovery process. Current Opinion in Structural Biology, 79, 528. Yu, J., Wang, Y., Li, Y., Li, X., Li, C., & Shen, J. (2014). The safety and effectiveness of Da Vinci surgical system compared with open surgery and laparoscopic surgery: A rapid assessment. Journal of Evidence-Based Medicine, 7(2), 121–134. https://doi.org/10.1111/jebm.12099 Zanzotto, F. M. (2019). Human-in-the-loop artificial intelligence. Journal of Artificial Intelligence Research, 64, 243–252. Zhang, B., & Li, Y. (2017). Wearable medical devices acceptance and expectance of senior population in China. In Proceedings of 17th International Conference on Electronic Business (pp. 241–251).
Chapter 8
Human Learning and Machine Learning: Unfolding from Creativity Perspective Parag Kulkarni and L. M. Patnaik
Abstract Learning goes beyond knowledge acquisition. It is about using and refining knowledge to solve problems and enhance abilities to deliver value. It has always been an exemplary manifestation of intelligence. Human beings learn from experience and interactions. Surprises and unexpected scenarios create new and interesting opportunities for learning. Machine learning tries to mimic the human way of learning in order to exhibit human-like behaviour. Most machine learning models are developed around human learning philosophies. These models include bioinspired models and probabilistic models (Floreano and Mattiussi, Bioinspired artificial intelligence: Theories, methods, and technologies. MIT Press, 2008). Machine learning models try to get the best from the quantitative abilities and connectionist intelligence of humans. Human learning has four different aspects: behaviourism, humanism, reinforcement, and social learning. The canonical theory of dynamic decision-making captures ontological, cognitive, as well as relational aspects of learning. Human learning at times results in creative activities. When we look at existing ML models from a creativity perspective, many standard models fall apart. Creativity is typically described as a very humanish act. It is the central dimension of human achievement that has always fascinated scientists (Cotterill, Prog Neurobiol 64:1–33, 2001). It is about producing something new, interesting, and useful. Creative intelligence is about combination and transformation. It carries an element of surprise and differentiation, even in high-entropy and uncertain states. At some point, while going along with patterns, associations, and selective combinations, it introduces 'out of pattern' results. Interestingly, it breaks the rules at some very unobvious but logical junction point. Is it possible to learn this ability to surprise? Is this what humans learn? Or is creativity just the outcome of an accident or an offshoot of routine work? Then how can machines learn to produce these surprises? These questions find their routes through theories of distance measurements, connectionist models,
probabilistic associations, information gain, and outliers. This chapter tries to unfold different facets of human learning and machine learning with this unexplored element of surprising creativity. It further tries to formulate creative learning models to build abilities in machines to deliver ingenious solutions.

Keywords Consciousness · Creativity · Historical creativity · Psychological creativity · Creative machines · Creative agents · Machine learning · Creative collaborative intelligence
1 Introduction: Human Learning and Machine Learning

Machine learning techniques are inspired by the learning and evolution of living organisms. It has been very interesting to look at how different machine learning techniques have evolved over the years and what role human learning paradigms have played. Not only humans but all living organisms learn. It could be very philosophical to ask whether mountains, stars, and other non-living entities in nature learn or not and, if they do, how. But our focus here is human learning and its adaptation to make machines intelligent. Humans perceive the world through different sensory organs. The percept sequence forms the picture of and ideas about the world. This picture is enhanced over time and mapped and associated with several other pictures formed over time. While combining these pictures, we go beyond combining isolated sequences, using fillers of our vivid imagination (Montouri et al., 2004; Singer, 2009). This leads to building a systemic view, which is converted into thinking, actions, and learning, and applied to what we call problem-solving, concept development, or even product visualization. Consciousness, along with these enhancements and improvements, continues to extend boundaries, and all of that contributes to learning (Kotchoubey, 2018). There is a popularly used term, 'cognitive learning theory'. It covers cognitive experiences leading to learning and, ultimately, to handling similar or even different situations or problems in a more graceful yet objective way in the future. Let us dive a bit deeper into the human learning process to understand where creativity comes from, before thinking about making machines creative. But first, let us define creativity, i.e., what we mean by something being creative and what it takes to become creative from the learning perspective. These are definitely questions of interest. Creativity philosophy touches all disciplines (Baumeister et al., 2014). Creativity is generally defined as producing something new, interesting, and useful. Or, creativity begins when novelty meets utility in an unobvious way, pushing against perceived boundaries.
This could be about creating surprises by making people happy, making them laugh, or solving day-to-day problems in a very different but appealing way. It could even be about delivering a product that raises expectations and draws a 'wow!' from the majority of users. It could also have aspects of choosing new and interesting options (Thaler & Sunstein, 2009). Behaviours and instincts contribute to the outcome and can fuel unexpected results (Rosling et al., 2019). Complexity reduction in choosing is equally important (Iyengar, 2010) and can be attributed to creative stories of choice making (Kulkarni, 2022). From a simple word rearrangement resulting in a timely joke, to developing a new intelligent system for a mission to solve a pressing problem, creativity can be found everywhere. Can we measure human creativity? If yes, how can we do it? Can we say that creativity could be 'novelty + fluency' and that patterns could lead to pace? On a broader note, it can give the ability to produce novelty in useful responses, even in similar scenarios, with an element of surprise or of exceeding expectations. In our discussion with poet Mangesh Padgaonkar regarding creativity and poems, he observed: it is in the metre of the song, it is the selection of words, and above all it is about emotions depicted through the arrangement of words reaching sensitive minds. We conducted a simple experiment: we gave a word-to-word prose presentation of one poem (Kulkarni, 2017a) to a group of 20 students. Everyone in the world is trying to achieve goal or a certain objective in spite of difficulties like stress and strain. Even they are forcing their team in adverse conditions like storm and rain. I should stress on one thing that light or real knowledge is inside and goal is the external thing. You listen to your mind for knowledge and goal then you can enlighten the whole world.
We kept the concept and words the same as in the original poem. We asked the students whether it was creative or not: 17 said it was 'not creativity', and three said it was some sort of creative attempt. Then we gave the poem form of the above prose to the same 20 students:

Everyone is running after the goal
With all stress and strain
Everyone forcing the team as a whole
In storm and the rain
Let me tell you the only fact
Light is inside and outside is dark
Look inside for the light and goal
Then you can enlighten the world whole
Interestingly, 18 of them said that it was a highly creative act, and two said it was creative but did not rate it very high. Mere rearrangement and the form of presentation affected the judgement of the majority of students. Does a machine need to understand the audience before embarking on a creative act? Or should a machine represent everything in poem form? In fact, the form of presentation, the audience at the receiving end, and the perceived expectations of a group of audience in the given context drive public creativity evaluation (Nardi, 1996). Let us try to draw some parallels between human and machine creativity and look at possible models that could provide a creative response and allow us to measure it effectively.
2 Human Learning Process

Learning is typically defined as acquiring new knowledge and exhibiting its usefulness. If knowledge acquisition is referred to as learning, then it involves the whole cognition cycle. Here we sense the world and capture data from multiple sources. This data is converted into information, then knowledge, understanding, and finally wisdom. Wisdom helps us to negotiate different known and unknown situations effectively and efficiently. This process is depicted in a simple way in Fig. 1. In the case of humans, different events leading to a change of state contribute to learning. These events take place in certain scenarios, which, along with relevant new events and impending changes, lead to an experience. This experience, or a series of similar experiences, triggers learning. The learning, at some point in time or sometimes even immediately, results in a change in behaviour. It is not a momentary change but rather a long-lasting one, until new learning contradicts it and results in a new behavioural correction. There are behaviourism and cognitivism theories of learning, and we will try to evaluate those aspects from a creativity perspective. Most human learning theories are based on association, while other aspects like contiguity and reinforcement are also key. Interestingly, reinforcement learning and 'temporal difference learning' are inspired by this. Skinner defines a reinforcer as an event that follows a response and changes the probability of the response occurring again (Stangor & Walinga, 2014). Does any change in experience result in different responses? There is positive as well as negative reinforcement. While positive reinforcement strengthens the belief, negative reinforcement weakens it. In behaviourism and learning, generalization plays a key role. It allows one to respond in new scenarios that are similar or look similar. Discrimination, in turn, helps in fine-tuning, where a human learns to find the difference between two similar situations.
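As a rough illustration of Skinner's notion of a reinforcer changing the probability of a response recurring, the following minimal Python sketch keeps a propensity per response and nudges it up after positive reinforcement and down after negative reinforcement. The class, the step size, and the update rule are illustrative assumptions, not a model taken from this chapter.

```python
import random

class ResponseLearner:
    """Toy behaviourist learner: reinforcement shifts response propensities."""

    def __init__(self, responses, step=0.2):
        self.propensity = {r: 1.0 for r in responses}  # equal initial tendency
        self.step = step

    def act(self):
        # Sample a response in proportion to its current propensity.
        responses, weights = zip(*self.propensity.items())
        return random.choices(responses, weights=weights)[0]

    def reinforce(self, response, positive=True):
        # Positive reinforcement strengthens the tendency, negative weakens it.
        factor = (1 + self.step) if positive else (1 - self.step)
        self.propensity[response] = max(0.01, self.propensity[response] * factor)

learner = ResponseLearner(["press_lever", "ignore"])
for _ in range(20):
    r = learner.act()
    learner.reinforce(r, positive=(r == "press_lever"))
print(learner.propensity)  # 'press_lever' ends up with a much larger propensity
```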
Fig. 1 Learning—knowledge acquisition
When we think about the human learning process, a term that always comes up is 'intelligence'. In the case of humans, it is defined as mental keenness. We would like to define intelligence as 'an adaptive behaviour in which the ability to understand, decode, and respond to the situation is at the core'. Adapting is generally of two types: one is reflectively adapting to an event, and the other is adapting to the entire process. Here, adapting is generally understood in terms of the entire process and hence is more systemic in nature. Intelligence is therefore viewed in terms of the 'evolution of interactions with the environment'. Learning is one of the core parts of intelligence. It could be viewed as the ability to convert data into wisdom so as to contribute to the expected adaptability through the evolution of interactions with the environment. This process of learning, taking data to wisdom and applying it to create value, is depicted in Fig. 1. Wisdom is the result of knowledge innovation (Kulkarni, 2017a; Kulkarni et al., 2016).
3 Concept Learning and Verbal Learning

The human learning process includes interpretation and association of different information artefacts. Linguistic information, text, and expressions are major parts of these artefacts. Learning languages is an important aspect of the human learning process. Many creative artefacts fall under linguistic domains. Further, linguistic creativity helps in exhibiting creativity in other domains. “Verbal learning” deals with
the learning of language, or rather the linguistic and cultural aspects of processes (Savage-Rumbaugh et al., 2000). It includes memorization and association of words. Similarly, various higher-level mental processes are involved in learning, which contribute to human behaviour in complex situations. ‘Concept learning’ deals with these aspects of learning, including reasoning. Concept learning is about developing and refining concepts. A concept is a notion abstracted from similar associated experiences or input data points. Human beings explore the world through observations and knowledge acquired about objects, events, and their properties. These observations lead to the investigation of focused questions and to verbal and linguistic interpretations. Verbal learning uses different methods like paired-associate learning, serial learning, and free recall. Paired-associate learning is used when we associate certain words of our mother tongue with foreign-language equivalents. It can be thought of as stimulus and response. It need not involve two languages; the pairing could be from the same language. Thus, stimulus words are presented and responses observed, and the process can be repeated to achieve perfection. Serial learning goes beyond two words to establish extended, or rather serial, relationships. In free recall, words are recalled for appropriateness. In concept learning, concepts or categories refer to different objects or events. An animal, an object, or a certain event can be treated as a concept. A concept is thus the representation of behavioural aspects or attributes connected by some behavioural patterns or rules. Properties refer to certain features or characteristics of an event or an object. These features include properties like colour, size, number, etc. Concepts have varying complexity, and that is true for features as well. Concept learning further evolves towards the association of multiple concepts and the learning or creation of new concepts based on these associations, expansions, and changes. While learning, relationships among different concepts are important. Concept learning varies, and concept learning with partial reinforcement is one of the most popular mechanisms. The chain of concepts in such a case leads to broader concepts, unfolding new relationships.
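A paired-associate drill can be mimicked by a simple rehearsal loop: stimulus words are presented, the learner's recall is checked, and only the pairs that were missed are rehearsed again until recall is perfect in a full round. This is a minimal sketch under assumed names; the recall model (a probability that improves with each rehearsal) is purely illustrative.

```python
import random

pairs = {"dog": "kutta", "water": "paani", "tree": "jhaad"}  # stimulus -> response

def paired_associate_drill(pairs, max_rounds=50):
    # Each pair starts with a low recall probability that grows with rehearsal.
    recall_p = {s: 0.2 for s in pairs}
    for round_no in range(1, max_rounds + 1):
        missed = []
        for stimulus, response in pairs.items():
            if random.random() < recall_p[stimulus]:
                continue  # correct recall, no rehearsal needed this round
            missed.append(stimulus)
            recall_p[stimulus] = min(1.0, recall_p[stimulus] + 0.25)  # rehearsal effect
        if not missed:
            return round_no  # perfect recall achieved in this round
    return max_rounds

print("rounds to perfection:", paired_associate_drill(pairs))
```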
4 Discrimination Learning and Problem-Solving

We discussed different ways of human learning, like concept learning and verbal learning, in the previous section. It is equally important to find out how concepts are learnt and evolve during learning. Human beings typically respond differently to different input signals or stimuli. This ability, and the learning required to distinguish different situations and respond accordingly, is referred to as ‘discrimination learning’. It needs understanding and visualization of the situation at hand. Discrimination learning is spread over different contexts and understands concepts specific to a context. Generally, outcome and instance are associated with the context (Kulkarni et al., 2015, 2018; Ornstein, 1973). When there is a change in context, the same stimulus is considered a different problem. On a broader note, writing a poem about war is a different problem than writing a poem about love. Thus, as an individual trains
himself or herself to do different tasks in a variety of contexts, this contributes to the generality of learning. Can there be generalization in poem writing? Interestingly, while some problems need generalization, there is a certain set of problems where generalization may not be a good idea. Take an example where a kid sees a dog at home and then sees a dog at some other location, say in a park or on the street. These are treated as different problems in the beginning, but as generalization takes place, they are approached in the same way. Thus, discrimination learning is very problem specific. It is important to look at the mechanisms of generalization and discrimination operating in parallel, with a thin line drawn to decide on discrimination when it is necessary. Here it is reinforcement that plays a vital role; discrimination through partial reinforcement is what is in action in these cases. The role of discrimination learning is very evident when we need similar things to be separated, based on the context, for a targeted outcome.
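Discrimination learning can be caricatured as keying the learned response on the pair (stimulus, context) rather than on the stimulus alone, while generalization falls back to the stimulus when a context has not been seen. The dictionary-based sketch below, including the dog example, is an illustrative assumption rather than the chapter's model.

```python
class DiscriminationLearner:
    """Respond per (stimulus, context); generalize across contexts as a fallback."""

    def __init__(self):
        self.specific = {}   # (stimulus, context) -> response
        self.general = {}    # stimulus -> default response

    def learn(self, stimulus, context, response):
        self.specific[(stimulus, context)] = response
        self.general.setdefault(stimulus, response)

    def respond(self, stimulus, context):
        # Discrimination: prefer the context-specific response if one was learned.
        if (stimulus, context) in self.specific:
            return self.specific[(stimulus, context)]
        # Generalization: reuse the default response for an unseen context.
        return self.general.get(stimulus, "explore")

learner = DiscriminationLearner()
learner.learn("dog", "home", "play")
learner.learn("dog", "street", "keep_distance")
print(learner.respond("dog", "home"))    # play
print(learner.respond("dog", "park"))    # play (generalized from the default)
```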
5 Non-deliberate Ignorance to Non-deliberate Mastery

Consciousness and creativity are related in many ways (Combs & Krippner, 2007; Cotterill, 2001; Hirschman, 1983). Consciousness is about respect for self-existence and for individual creative experiments (Bonshek, 2001; Gaiseanu, 2021; Martindale, 1977; Sheldrake et al., 2001). Machine consciousness can be inspired by human consciousness, can lead to creative consciousness (Clowes et al., 2007; Gamez, 2008, 2018), and can further be included in computational models (Aleksander, 2005; Pope, 2013; Reggia, 2013). Human learning progresses from non-deliberate ignorance to non-deliberate mastery (Wiggins, 2012). Interestingly, we will find similar concepts in the case of machines as we look at different ML models for acquiring human-like mastery. Non-deliberate ignorance is the stage when a human is unaware of his/her ignorance and of what is to be learnt. Take the example of a kid, a new entrant, or a visit to a completely new place. This raw phase creates ample opportunities to dive into one of the directions for exploration. It can be viewed in two different ways: one, where a skill or competence is missing, but one is unaware of it. In a machine scenario, imagine that you have a goal state, but the means or routes to the goal state and the learning resources are missing. Though this could be referred to as a dangerous stage in some scenarios, it is a starting state and creates ample opportunities to create new paradigms of learning to nurture creativity. In the stage of deliberate ignorance, a human realizes the gaps and waits to decide whether to overcome them or not; thus, it is very context driven. The next stage is deliberate mastery. Slowly, one starts working on developing the said mastery. In this stage, mastery is deliberate, and one must work purposefully to exhibit the needed performance. This is the point where learning is in an intermediate state. Based on the learning stage, the required deliberate efforts vary. Though full competence is not achieved, or it has not become part of one's DNA, it is still being attempted and deliberate efforts are under way to achieve it. Eventually, new skills may become part of an individual's personality. These skills are
no longer external, and that is typically referred to as non-deliberate mastery. These learning stages can be observed, with some differences, when we make machines learn. Here, multiple opportunities and exploratory competence slowly converge, and learning and exploration become inherent parts of the process (Grof, 1998). There are attempts to associate non-deliberate mastery with creativity. It could be believed that creativity is built on top of non-deliberate mastery and that the great creative artist has creativity built on top of such mastery; both of the last two stages have been discussed in this light. Non-deliberate in this case refers to mastery of certain skills such that no deliberate efforts are required. That, in turn, creates numerous possibilities to experiment with the variations necessary for creativity and evolution. This mastery even helps in dealing with unforeseen circumstances. In the case of machine learning, imagine a stage when a goal is decided and there is no knowledge built and no resources made available to move in the direction of the goal. This could be a stage of non-deliberate ignorance. In this case, there is no prior knowledge or in-built bias. In fact, it is a good stage to begin with, since there are several possibilities to build knowledge. Then there is a stage of deliberate ignorance, where what is missing is known; it could be knowledge of the skills to be acquired. Now imagine a scenario where knowledge is built and a model is being developed, but the model needs to remember certain steps or get inputs from external sources. In this case, an intelligent agent achieves the objective with effort and assistance (Kulkarni & Joshi, 2015). This denotes the next stage, deliberate mastery. In the stage of non-deliberate mastery, the model has mastered the process and can move in the direction of the goal state. The cognitive flexibility achieved through mastery allows scenarios to be negotiated efficiently and permits further experimentation (Gamez, 2008; Pope, 2013). Figure 2 depicts the journey from non-deliberate ignorance to deliberate mastery.

Fig. 2 Non-deliberate ignorance to deliberate mastery
6 Creativity and Learning

Creativity can be learnt through context and experimentation. In such a scenario, the novelty and utility parameters are also context specific. While a particular action is classified as creative in some context, the same activity may not be called creative in some other context. The novelty and utility parameters are based on the reference space. This reference space could be as small as recent individual performance or as big as the whole universe. The creativity evaluator works on the reference space. In the case of humans, we have variable reference spaces, adding too much subjectivity to the whole exercise. To overcome that, it is a good idea to define the reference space with some guidelines. That takes us to two basic creativity paradigms: psychological and historical creativity.
6.1 Psychological and Historical Creativity in Machines

Creativity is associated with novelty and surprise. Margaret Boden has wonderfully described psychological creativity and historical creativity as part of human activities and problem-solving (Boden, 2004). She has focused on the reference space in a clear and distinctive manner, which has helped in minimizing subjectivity and creating avenues for creativity evaluation. Psychological creativity refers to coming up with a surprising and valuable idea, product, or activity that is completely new for the person in focus. It does not matter whether someone else has already come up with this or a similar idea in the past. What matters is that it is completely new for the person under consideration and in the given context. It may or may not be completely new for the world; novelty beyond the boundaries of the reference space does not matter in this case. Here, the reference space is the individual and his/her own recent achievements. Psychological creativity, in fact, is a very powerful depiction of human learning. It may take inputs from external sources, but the reference for creativity is internal and confined to one’s own performance. It can be mapped to internal models (Holland & Goodman, 2003; Starzyk & Prasad, 2011). Internal models can help to generate artificial thoughts (Chella & Manzotti, 2009; Fingelkurts et al., 2012). In the case of historical creativity, on the other hand, the new idea is unique and novel not only for that person but for the whole world. Here, the world can be defined as the known or reachable information zone, or a predefined system bounded by reachability and visibility. The idea is taken from a contextual perspective where prevailing non-obviousness is the key; it is probably new even in human history for the given application or context. Psychological creativity could be considered a special case of historical creativity where the reference environment is confined to one’s own activities and the creative expressions of the person in focus. It is interesting to ask whether psychological creativity could be thought of as the first step towards historical creativity. In fact, it is the first and most crucial step. It could be debated whether
psychologically creative ideas could be predicted or not. But we can safely assume that most psychologically creative ideas and behaviours can be evaluated and at times look computationally achievable to a certain extent. Experts can expect certain exploratory questions from an audience, or, even in the learning stage, they can make provisions for those exploratory paths and expect certain individuals to follow them. It can even be assumed that there is some structure to these psychological creative thought processes and to their traversal patterns. Psychological creativity points can be embedded in structured processes. That makes us believe in the possibility that, at times, machines could definitely be psychologically creative. In the case of psychological creativity, the reference set can be well defined. Historical creativity, on the contrary, lacks well-defined reference sets. Hence, it does not have structured, pattern-based behaviours, making it less predictable and difficult to evaluate. Unpredictability is linked with randomness, but researchers believe that randomness has a limited role in creativity. Is it controlled randomness or directed random exploration? Or is it randomness on a foundation of patterns? It becomes necessary to uncover the relationship between creativity and uncertainty. Looking at traditional learning from different perspectives could prove to be the key in this case, or reverse hypothesis machine learning can help to deal with uncertainty (Kulkarni, 2012, 2017b). Linking creativity too much with surprises sometimes results in our failure to look at the scientific platform of creativity. As per Margaret Boden, “Uncertainty makes originality possible in some cases, but mostly impossible in other cases” (Boden, 2004). Another important term that is always associated with creativity is ‘serendipity’. Serendipity and chance have played some role in the origination of new ideas, but there is always more to it than that. In the case of such human behaviour, structured interpretations meet with chance, leading to a new creative path. If we believe this is the case, then what are psychological and historical creativity in machines? Is it possible to make machines psychologically and historically creative so as to produce the desired creative impact? Typically, it is observed that total or absolute randomness may be detrimental to creativity, but ‘explainable randomness’ can help. Take the example of an artist who comes up with a new painting which he/she thinks is amazing and whose creativity is endorsed by his/her friends. On the other hand, a well-known scientist comes up with a new algorithm that not only he/she thinks is revolutionary but which is also endorsed by the entire community working in that area as a game-changing innovation (Pise & Kulkarni, 2016). The first case is psychological creativity, and the second is historical creativity. Let us try to model this from a machine learning perspective. Machines can be thought of as psychologically creative if they have exhibited a new creative behaviour which they, or any machine of that type, have not exhibited before.
Creativity in the case of a machine does not come alone; it has three major parts:

• While producing surprising, useful results that depart from the routine path, the machine needs to identify the optimal data and resources required for this process.
• Learnability, on the other hand, helps it to come up with interesting scenarios with reference to existing knowledge.
• The idea must be evaluated in accordance with the reference space.

Machines can exhibit some sort of creative behaviour. These behaviours need to be tested with reference to the reference space to qualify as creative ones. In a similar way, the historical creativity of a machine could be defined as creative behaviour by a machine that no machine has exhibited until that time. It is then interesting to ask: can machines have psychological creativity? In fact, we believe that they can; with evolutionary learning, machines can exhibit new behaviour and solve new problems. If the solution is new and useful, then it is psychological creativity. Imagine a map application telling you a new route to your office that it has not shown to you before. If that route is not an obvious extension of the existing route, but useful and new, then we can assume that the machine is exhibiting psychological creativity. We will talk about different facets of psychological creativity as we move to the next section. The next question: can machines depict historical creativity, and how can we design such machines? There are a few complexities we need to consider in this case. First, we need to define the space that, at the moment, is attributed to the world. Additionally, the machine should have an account of all relevant activities carried out in that space. Most importantly, the machine should have defined criteria for obviousness and novelty. This discussion creates a platform for psychologically creative intelligent agents and historically creative intelligent agents. We will discuss these agents in the next section, along with models for the agent function to support the expected behaviour. Obviously, it is not that difficult to check whether a given activity is psychologically creative or not, but it becomes very difficult to qualify something as historically creative. Apart from a few ground-breaking inventions, it becomes very subjective. There is always debate about whether something can be patented or not; if it is, can we qualify it as historically creative? When we think about historically creative machines, we need to have a mechanism to measure or qualify historical creativity. As long as we do not have any such mechanism, this question remains very difficult to answer. While discussing these possibilities, we will investigate these aspects too. The psychological and historical creativity model for machines is depicted in Fig. 3.
Fig. 3 Psychological creativity to historical creativity
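The distinction between psychological and historical creativity can be phrased computationally as the same novelty test run against two different reference spaces: the agent's own past outputs versus a much larger repository standing in for the known world. The sketch below is a schematic reading of that idea; the sets, the route names, and the utility flag are assumed placeholders.

```python
def creativity_level(idea, useful, own_history, world_repository):
    """Classify an idea relative to two reference spaces (a Boden-style reading)."""
    if not useful:
        return "not creative"               # novelty without utility does not qualify
    if idea in own_history:
        return "not creative"               # nothing new even for the agent itself
    if idea in world_repository:
        return "psychologically creative"   # new for the agent, not for the world
    return "historically creative"          # new for the agent and for the world

own_history = {"route_A", "route_B"}
world_repository = {"route_A", "route_B", "route_C"}

print(creativity_level("route_C", True, own_history, world_repository))
print(creativity_level("route_D", True, own_history, world_repository))
```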
6.2 Creativity Moments and Creativity Points

In the case of humans, creativity cannot be taken for granted; there are creativity moments. A person with creative thoughts may fail to show similar behaviour continuously or on demand. At a creativity moment, the person is at his/her best and exhibits creative behaviour. These moments are not random and generally depend on multiple factors. Creativity points are the points where multiple options leading to new, useful solutions present themselves. There could be multiple creativity points, and it could be possible to classify them to trace creativity paths. We will discuss creativity point-based models to embed creativity into machines. Computational creativity, like human creativity, often converges on analogical processes. Analogical processes try to draw an analogy between two or more events. Sometimes very dissimilar events may have interesting analogy threads. Analogical similarity forms a platform for various creative expressions. In the case of analogical creativity, there are points and experiences/events that pave routes for a good analogy. Analogies going beyond traditional analogies attract attention from the creativity perspective; in literary creative artefacts, this is very common. Analogical transformation into different contexts, with shining similarity threads, creates the possibility of literary creativity. Some traditional analogies have lost their creative edge over time; when it comes to something new, it also demands differentiation. In such cases, where creativity finds a definition associated with differentiation, machines are good at producing options for such analogies. The challenging part remains to select the option that sounds creative or raises the bar of creativity.
6.3 Creative Agents and Creative Disciplines

Intelligent agents are at the heart of all intelligent systems. There are different types of agents based on their ability to solve problems and learn from the environment. Agents have the ability to sense the environment and act on it. They have multiple sensors and actuators. Sensors sense the world in different ways. The percept sequence acts as input, and then 'agent functions' or 'learning components' help to decide the desired action in the given context (Rabinovich et al., 2020). There are different types of agents, from the simple reflex agent to goal-based, utility-based, and learning agents. Across these agents, there is increasing adaptation and intelligence helping systems to solve the problem. The Turing test of intelligence tries to measure intelligence with reference to human response (Turing, 1937); it checks ingenuity with reference to humans. When one cannot distinguish whether two responses are coming from a human or a machine, the machine qualifies the test of intelligence. Similarly, a creativity test of intelligence can be defined as follows: suppose we get a series of responses for a task from two different locations, without knowing which is a machine and which is a human. If we are convinced that some of the responses from both sources are creative, then the machine has passed the creativity test. In short, a creative artificial system is a collection of processes, natural and/or technological, which are capable of achieving or simulating behaviour which, if exhibited by humans, would be deemed creative. A creative agent is a special case of an intelligent agent that satisfies the creativity test. Creative agents can form hierarchies to achieve different levels of creativity. Figure 4 depicts a creative agent. It senses the world with a set of sensors; thus, it knows the existing state of the world and can evaluate creative performance.

Fig. 4 Creative agent and evaluation
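Read this way, a creative agent is an ordinary intelligent agent whose agent function proposes several candidate actions and lets a creativity evaluator, aware of the current world state and of what was done before, pick one that is both useful and non-obvious. The loop below is a minimal sketch with assumed plug-in functions, not a specification of such an agent.

```python
import random

def creative_agent_step(percept, propose, usefulness, novelty, memory):
    """One sense-evaluate-act cycle of a hypothetical creative agent."""
    candidates = propose(percept)                      # agent function: many options
    scored = [(usefulness(c, percept) * novelty(c, memory), c) for c in candidates]
    score, action = max(scored)                        # creativity evaluator picks
    memory.add(action)                                 # remember what was already done
    return action if score > 0 else None

# Illustrative plug-ins for the hooks above (all assumptions).
propose = lambda percept: [percept + "_routine", percept + "_variant", percept + "_twist"]
usefulness = lambda c, percept: 1.0                    # pretend everything is useful
novelty = lambda c, memory: 0.0 if c in memory else random.uniform(0.5, 1.0)

memory = set()
for _ in range(3):
    print(creative_agent_step("greet", propose, usefulness, novelty, memory))
```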
6.4 Creative Collaborative Intelligence

Creativity is not restricted to a certain task. It is not limited to writing poems, theatrical acts, or dance performances; any simple activity could qualify as creative based on the way it is achieved. When we talk about creativity in humans, there could be literary creativity, social creativity, business creativity, or conversational creativity. Social and collaborative intelligence is something very special about humans. Social creativity deals with creative association, creative collaboration, and complementing others to achieve certain objectives creatively. Artificial intelligence in the beginning focused on single-agent systems. As complexity increased, researchers came up with multi-agent systems. In a multi-agent system, different agents with distinctive expertise focus on certain parts of the problem. In human learning models, collaborative and co-operative learning is the key (Vidhate & Kulkarni, 2012, 2016). Two minds work together to bring divergent perspectives, leading to new, impactful, and creative outcomes. There are many such examples: two musicians playing together, two scientists working together, even two entrepreneurs working together. Analogy learning has its limitations due to boundaries set by inherent data patterns or an individual's thought process. With divergent collaboration, these boundaries fall apart, and new useful results can be produced. But free and fair collaboration is one of the most challenging aspects. Figure 5 depicts the co-operative learning architecture.
Fig. 5 Creative co-operative learning
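A minimal way to sketch creative co-operative learning is to let two agents with different 'expertise' each propose fragments and then evaluate the cross-combinations, which neither agent would have generated alone. The code is an illustrative toy; the fragment lists and the scoring function are assumptions.

```python
from itertools import product

def collaborate(proposals_a, proposals_b, evaluate):
    """Cross-combine proposals from two agents and keep the best joint outcome."""
    combinations = [a + " + " + b for a, b in product(proposals_a, proposals_b)]
    return max(combinations, key=evaluate)

# Two 'agents' with divergent perspectives (illustrative fragments).
musician = ["raga phrase", "folk rhythm"]
engineer = ["sensor stream", "feedback loop"]

# Toy evaluator: prefer combinations that pair dissimilar vocabularies.
evaluate = lambda combo: len(set(combo.split()))

print(collaborate(musician, engineer, evaluate))
```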
6.5 Out-Of-Pattern Learning Pointers

Creativity is always associated with a wow factor, which may result from outcomes exceeding expectations. When an outcome can be predicted and expected without surprise, do we call it creative? Is a creative individual expected to produce surprising results every time? Most importantly, not all random, out-of-the-pattern, and surprising outcomes can be attributed to creativity. The question then arises: what are those creative and surprising outcomes? Are there out-of-the-pattern learning pointers? Or is it a good idea to try out a number of selected out-of-the-pattern ideas and pick some of them using certain criteria? Out-of-the-pattern analogical extensions often sound very creative. Such an extension goes beyond obvious analogies to touch some contrasting ones. There are typical obvious extensions, and at times a creative person goes beyond them to bring in the surprising element. Creativity has a context: something that looks very creative in a certain scenario may not be equally creative in a completely different context.
7 Learning Models

Wiggins defines computational creativity as the study and support, through computational means and methods, of behaviours exhibited by natural and artificial systems, which would be deemed creative if exhibited by humans (Wiggins, 2020). In creativity, exploration is the key (Grof, 1998). Models vary for exploratory and transformational creativity. Combinatorial creativity models are based on forming different unfamiliar combinations from available familiar ideas. As per Margaret Boden, “exploratory creativity focuses on exploration of conceptual spaces in a structured manner” (Boden, 2004), but when it comes to changing the conceptual space, it is called transformational creativity (Wiggins, 2012). Figure 6 depicts key aspects of human learning. Thus, learning and computational creativity models have three aspects:
Fig. 6 Understand human creativity
a. Models to understand human creativity.
b. Learning to perform creative tasks, or tasks that fit the definition of creativity stated above.
c. Models to evaluate creativity.
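The combinatorial creativity model mentioned above, forming unfamiliar combinations from familiar ideas, can be sketched by enumerating pairings of known concepts and discarding those already seen together. The idea list and the familiarity set below are placeholders, not data from this chapter.

```python
from itertools import combinations

familiar_ideas = ["umbrella", "solar panel", "walking stick", "flashlight"]
already_seen_together = {frozenset({"umbrella", "walking stick"})}  # known combination

def combinatorial_candidates(ideas, seen):
    """Yield pairings of familiar ideas that have not been combined before."""
    for pair in combinations(ideas, 2):
        if frozenset(pair) not in seen:
            yield " + ".join(pair)

for candidate in combinatorial_candidates(familiar_ideas, already_seen_together):
    print(candidate)   # each line is an unfamiliar combination worth evaluating
```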
7.1 Cognitive Development Theory and Concept Maps

A thought process and learning develop over time. We sense the world, and this is converted into thoughts. Cognitive development theory is a set of theories dealing with learning, thinking, and growth. Human beings learn from the physical and social environment and strive to adapt to it. Piaget stresses adaptation and the two processes associated with it: assimilation and accommodation (Babakr et al., 2019). While assimilation refers to grasping new objects and events within a given structure, accommodation refers to modifying the existing structure to allow the assimilation of a new event. In the first process, a new object is grasped; in the second, the structure is modified. Grasping a new object in different ways can lead to creative expression. A concept is built around a primary perceived event, and slowly that concept leads to a different concept as a logical extension. Reception learning is one where new questions are asked regarding old concepts, and those lead to deriving new meanings and relationships. Thus, concept maps are developed in this process. The clarity of relationships allows one to understand a concept, while abstractions leading to new possibilities lead to creativity.
7.2 Human Learning-Inspired Machine Learning Models

Machine learning is inspired by human learning. Learning concepts have developed from human evolution and the ability to learn in changing scenarios. Hence, there is an evident impact of human learning on machine learning models. In this section, we will discuss different human learning-inspired ML models.
7.2.1 Analogy Models
Analogy is a very powerful tool in philosophical investigation and learning. Analogy offers support for targeted conclusions and plays a dominant role in human creative reasoning. Since creativity demands additional flexibility while retrieving and mapping information, the analogy spectrum varies. Flexibility creates several options. Some of the options survive the creativity test, while others may be rejected during evaluation or die over time. Figure 7 depicts a computing model based on human analogy-driven creativity. Hadamard described creativity as going through several phases rather than being just the result of some accident (1945, p. 19). It has preparation,
Fig. 7 Analogy model
incubation, illumination, and verification (Maldonato et al., 2016). The analogy is established across domains. A car racer from Japan, Yutaka Yamagishi, mentioned that “he was working on construction sites and the analogy between work at construction sites helped him to improve car racing” (Kulkarni, 2012). But then there should be an investigation of the structural and conceptual similarities with the target in mind. The analogy unearths the relationship between driving bulldozers on construction sites and racing cars on circuits. Inter-domain mapping and validation are crucial steps in this whole process. We can base this model on the paradigms of supervised and unsupervised learning.
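Inter-domain mapping, the crucial step this paragraph ends on, can be caricatured as aligning relational structure between a source and a target domain and transferring the roles that the target lacks. The role dictionaries below are invented stand-ins for the construction-site/racing example, not the authors' model.

```python
# Relational structure of the source and target domains (illustrative).
source = {"domain": "construction site", "vehicle": "bulldozer",
          "constraint": "uneven ground", "skill": "smooth throttle control"}
target = {"domain": "race circuit", "vehicle": "race car",
          "constraint": "tight corners"}

def map_analogy(source, target):
    """Transfer roles present in the source but missing in the target."""
    mapped = dict(target)
    for role, filler in source.items():
        if role not in mapped:
            mapped[role] = filler + " (transferred from " + source["domain"] + ")"
    return mapped

print(map_analogy(source, target))
# The 'skill' learned on construction sites is carried over to the racing domain.
```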
7.2.2 Supervised Models
Supervised models work on prior evidence provided by experts. Supervised learning, in the case of a machine, uses labelled data. Analogy learning based on the supervised paradigm works on background knowledge, where prior analogies are used for reference. These prior analogies form a pattern onto which new analogies can be mapped. To extend this to creative learning, we need to borrow analogies from different domains or, at times, extend the boundaries.
7.2.3 Unsupervised Models
Human beings learn based on similarities and differences and always group similar objects and experiences. Here, guidance by experts or labels is not present. Clustering and similarity-based unsupervised learning is inspired by the learning of small kids in the typical case where supervision or labels are not provided. Unsupervised learning results in creativity when a new dimension of similarity is introduced, or when similarity aspects from different contexts are brought into a new context.
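Introducing 'a new dimension of similarity' can be illustrated by grouping the same items twice with two different similarity keys: the second, less obvious key regroups them and may surface an unexpected association. The items, features, and grouping function below are assumptions made for illustration.

```python
from collections import defaultdict

items = [
    {"name": "haiku", "domain": "poetry", "length": "short"},
    {"name": "tweet", "domain": "social media", "length": "short"},
    {"name": "epic", "domain": "poetry", "length": "long"},
    {"name": "thread", "domain": "social media", "length": "long"},
]

def group_by(items, key):
    """Unsupervised grouping: cluster items sharing the same value of `key`."""
    clusters = defaultdict(list)
    for item in items:
        clusters[item[key]].append(item["name"])
    return dict(clusters)

print(group_by(items, "domain"))   # the obvious similarity dimension
print(group_by(items, "length"))   # a new dimension: haiku and tweet end up together
```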
7.2.4 Semi-Supervised Models
Human beings learn with supervision as well as without it; there is a combination of supervised and unsupervised learning. Semi-supervised learning is learning based on labelled as well as unlabelled data. It is interesting to make machines learn from labelled and unlabelled data simultaneously. Thus, bringing in new contexts of similarity and refining labelled data can help in producing results that could qualify as creative.
7.2.5 Connectionist Models
Human perception, behaviour, and overall cognition, which involve sensing, storage, and retrieval along with inference, are possible through selective neural activations based on connectionist models. While traditional computational modelling methods work on data relationships, in cognitive science the cognitive process based on connectionism contributes to learning and thinking. Can connectionist models be developed based on non-obvious analogies? The idea of more than two connectionist systems working together is one option. The means of creativity is not convergence, and hence variants of outcomes contribute to the final outcome. There are multiple goal states, and all of them may qualify as creative. In any such scenario, the most important component is the creativity evaluator. The model is exposed to biological and real-life analogies. A large sample of such analogies helps in building an analogy model, which can further help to evaluate new analogies. The creativity of computational systems needs to be justified. Verification can thus be done with reference to four pillars, as mentioned by Margaret Boden: novelty, quality, accessibility, and surprise. Novelty is verified with reference to possible outcome repositories. In alignment with this, the notion of surprise can be measured in statistical terms. Figure 8 depicts the creativity evaluator with reference to a scenario repository. Figure 9 depicts the bioinspired analogies used to deliver creative artefacts.
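Following the suggestion that novelty is verified against outcome repositories and surprise is measured statistically, the sketch below scores a candidate artefact on assumed stand-ins for the four pillars and averages them. Every formula, weight, and the example repository are illustrative assumptions rather than a prescribed metric.

```python
import math
from collections import Counter

def evaluate_creativity(candidate, repository, quality, accessibility):
    """Toy creativity evaluator over novelty, quality, accessibility and surprise."""
    counts = Counter(repository)
    total = len(repository)
    novelty = 0.0 if candidate in counts else 1.0          # unseen in the repository
    # Surprise as statistical unexpectedness: -log of the candidate's frequency,
    # with add-one smoothing so unseen candidates get the highest surprise.
    p = (counts[candidate] + 1) / (total + len(counts) + 1)
    surprise = min(1.0, -math.log(p) / math.log(total + 1))
    return (novelty + quality + accessibility + surprise) / 4

repository = ["sonnet", "sonnet", "limerick", "haiku", "sonnet"]
print(evaluate_creativity("sonnet", repository, quality=0.8, accessibility=0.9))
print(evaluate_creativity("found poem", repository, quality=0.8, accessibility=0.9))
```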
7.2.6 Causal Models
Causal models are based on establishing cause-and-effect relationships. Human beings work on some sort of reinforcement based on these relationships. ML causal models are based on establishing causal relationships and using these relationships for learning the way human beings do. Association extended beyond obvious causes can help models to deliver surprising results.
Fig. 8 Creativity evaluator
Fig. 9 Bioinspired analogies
7.2.7 Expectation-Based Models
Novelty can be attributed to unexpectedness. Distance-based approaches do not work effectively on their own but can be combined with feature extraction to obtain innovative combinations of features of existing artefacts. Every expectation is associated with a certain confidence. Thus, expected behaviour held with high confidence corresponds to a low surprise value. This confidence serves to quantify unexpectedness. Figure 10 depicts the process of expectation evaluation.
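The statement that an expectation held with high confidence carries a low surprise value has a direct reading: surprise can be taken as a decreasing function of the confidence assigned to the observed outcome, for instance -log2(confidence). The expectations dictionary and the flagging threshold below are assumptions for illustration.

```python
import math

def surprise(confidence):
    """Higher confidence in the observed outcome => lower surprise (in bits)."""
    return -math.log2(max(confidence, 1e-9))

# Assumed expectations of a system about the next artefact it will see.
expectations = {"rhyming couplet": 0.70, "free verse": 0.25, "ascii diagram": 0.05}

for outcome, confidence in expectations.items():
    s = surprise(confidence)
    flag = "unexpected -> candidate for creativity check" if s > 2 else "expected"
    print(f"{outcome}: surprise={s:.2f} bits ({flag})")
```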
7.2.8 Reinforcement Models
• Substitutive learning models.
• Experiential learning models.
Fig. 10 Unexpectedness
Substitutive learning models focus on identifying events from parallel events to substitute; they try to map such substitutions across the learning space. The experiential learning cycle consists of experience, reflection, and thinking, followed by acting on it, leading to another experience. As more and more experiences are gathered, one masters dealing with similar scenarios. Experiential learning models classify experiences. Exploration helps to create experiences, and these experiences contribute rewards and penalties that finally lead to an optimal action. In creative scenarios, the reward and penalty for an action are decided by creativity evaluation. In such a case, even a traditionally good response may receive a penalty.
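A reinforcement-style reading of the last sentence is that the reward signal itself comes from a creativity evaluation, so an action that would normally score well is penalized once it becomes routine. The tabular update below is a minimal sketch; the evaluator, the action names, and the learning rate are all assumptions.

```python
def creativity_reward(action, history):
    """Assumed evaluator: repeating an action is penalized, fresh actions rewarded."""
    return -1.0 if action in history else 1.0

def run(actions, episodes=30, alpha=0.3):
    value = {a: 0.0 for a in actions}   # learned value of each action
    history = set()
    for _ in range(episodes):
        action = max(value, key=value.get)            # greedy choice
        reward = creativity_reward(action, history)   # creativity evaluation as reward
        value[action] += alpha * (reward - value[action])
        history.add(action)
    return value

print(run(["safe_reply", "novel_reply_1", "novel_reply_2"]))
```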
8 Creative Learning Models
Most creative learning models are divergent models based on association as a seed. While convergent models move in one direction, divergent models explore the range of possible solutions and associate them. Association and analogy are two major theories contributing to an effective learning process. In associative learning, new responses are generally associated with a stimulus. Events are associated based on multiple co-occurrences of two or more events. This association extends from direct association to dependent or indirect associations. The ability to associate different events, actions, texts, or artefacts across disciplines contributes to the overall creativity of individuals. There are simple associations, such as between good performance and good grades, or between diet and weight loss. There is a subtle difference between association and causal relationships; in fact, causal relationships are a subset of broader associations in which the association is of the cause-and-effect type. The ability to take association beyond visible or directly deducible limits contributes to creativity. A great poet or writer associates two different events in a way others cannot obviously do, yet readers can still appreciate the association.
Association can go beyond a single level. Divergent association learning models are creative learning models and are of the following types (a minimal sketch of the first type follows this list):
• Co-occurrence-based creative models: these models look for co-occurrences that are not obvious but exhibit creativity traits.
• Distribution-based creative models: here the event and data distribution contribute to creativity. These models look for unique and exemplary distributions contributing to non-obvious but useful outcomes.
• Exclusion-based creativity models: these models work by excluding routine or common outcomes.
• Creative experiential models: these are models based on prior experience, where two or more experiences are associated and changed to deliver a creativity dimension.
• Creative assistive models: while assistive intelligence has the objective of supporting intelligent activities, creative assistive models are about creating additional value through creative interpretations or changes. There are assistive intelligent technologies that help individuals, organizations, or groups create or enhance value; here, these technologies are made creative rather than simply intelligent, or help creative objectives to be accomplished.
• Contextual vs. systemic creativity models: contextual creativity models try to provide creative solutions in the given context and hence need to understand the context. Systemic creativity models, on the other hand, look for the holistic context, going beyond subsystems (Kulkarni, 2012; Senge, 1997).
• Reinforcement learning-based creativity models: in this case, creativity rewards are accumulated to maximize a creativity value function.
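One way a co-occurrence-based creative model might flag non-obvious associations is sketched below, using pointwise mutual information so that pairs which co-occur more often than chance, yet rarely, stand out. The toy scenarios, the PMI threshold, and the rarity cut-off are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical "events" observed together in small scenarios
scenarios = [
    {"rain", "umbrella", "traffic"},
    {"rain", "traffic"},
    {"music", "memory", "emotion"},
    {"music", "emotion"},
    {"memory", "scent"},            # an infrequent, less obvious pairing
    {"scent", "emotion", "memory"},
]

event_counts = Counter()
pair_counts = Counter()
for s in scenarios:
    event_counts.update(s)
    pair_counts.update(combinations(sorted(s), 2))

n = len(scenarios)
for (a, b), c in pair_counts.items():
    # Pointwise mutual information of the pair relative to its members
    pmi = math.log2((c / n) / ((event_counts[a] / n) * (event_counts[b] / n)))
    # High PMI but low raw count: a non-obvious association worth flagging
    if pmi > 0 and c <= 2:
        print(f"non-obvious association: {a} ~ {b} (PMI={pmi:.2f}, count={c})")
```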
9 Creative Systemic Machine Learning (CSML)
Co-operative learning which considers system boundaries can be referred to as creative systemic ML. The overall process can be summarized in the following eight steps (a minimal sketch follows the list):
• Decide the learning space.
• Identify creativity pointers.
• Locate uncertainty points.
• Filter uncertainty points.
• Calculate the freedom index.
• Slowly introduce more uncertainty points.
• Expand the conceptual space as uncertainty increases.
• This will create more uncertainty and may result in a shift of the uncertainty centroid.
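The skeleton below is a minimal, hypothetical rendering of how these eight steps might be wired together in code; the item representation, the novelty and uncertainty scores, the thresholds, and the freedom-index definition are all illustrative assumptions rather than the authors' implementation.

```python
import statistics

def creative_systemic_learning(items, rounds=3, step=0.1):
    # 1. Decide the learning space
    space = [dict(x) for x in items]
    # 2. Identify creativity pointers (items scoring high on novelty)
    pointers = [x for x in space if x["novelty"] > 0.7]
    # 3. Locate uncertainty points
    uncertain = [x for x in space if x["uncertainty"] > 0.5]
    # 4. Filter uncertainty points (keep those that also look somewhat creative)
    uncertain = [x for x in uncertain if x["novelty"] > 0.3]
    # 5. Calculate a freedom index (here: mean uncertainty of the filtered points)
    freedom_index = statistics.mean(x["uncertainty"] for x in uncertain) if uncertain else 0.0
    centroid = freedom_index
    for _ in range(rounds):
        # 6. Slowly introduce more uncertainty
        for x in space:
            x["uncertainty"] = min(1.0, x["uncertainty"] + step)
        # 7. Expand the conceptual space as uncertainty increases
        space.append({"novelty": 0.9, "uncertainty": 0.6})
        # 8. Recompute the uncertainty centroid, which may shift
        centroid = statistics.mean(x["uncertainty"] for x in space)
    return pointers, freedom_index, centroid

items = [
    {"novelty": 0.8, "uncertainty": 0.6},
    {"novelty": 0.2, "uncertainty": 0.9},
    {"novelty": 0.5, "uncertainty": 0.4},
]
print(creative_systemic_learning(items))
```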
Creative ML is a paradigm which needs an evaluator for creativity. Machine learning models can be enhanced to cope with creativity challenges and provide interesting solutions. It is also an interesting idea to have co-operation between humans and machines to provide creative solutions. It is basically a systemic
concept coming up with a creative systemic outcome, in which the systemic evaluator is the key. Even a creative IT strategy can help to build the overall creative infrastructure (Kulkarni, 2008), with creative agents at its core.
10 Summary
Creativity is a magical word, or rather a magical behaviour. From many abstract definitions to elaborate experimentation, many researchers have tried to decode it. There is no second opinion about the fact that human creativity has contributed many interesting and value-creating artefacts. Creativity is an integral part of human evolution. Creativity is about novelty, non-obviousness, and utility, and evaluating it has always been a challenge for researchers. Many academicians build human learning models for children to embrace creativity, through experimentation and at times by practising different learning approaches and methods. When we think about creative computing and building a creative agent, it is of utmost importance to evaluate the creativity components in any outcome. A creativity agent can be defined as a special case of an intelligent agent with the ability to evaluate creativity. Creative learning models can be derived based on analogy and association. Margaret Boden classified creativity into two basic types: psychological creativity differentiates with respect to one's own past performance, while historical creativity does the same with the world as the reference. The study of computing models underlines the fact that psychological creativity in machines is very much possible. Taking this discussion ahead, we have examined a few possible models to make it possible. Further research can create avenues where creative machines and humans work together to embark on a journey to build a better future and a beautiful tomorrow.
Acknowledgements The second author, L M Patnaik, would like to thank the National Academy of Sciences India (NASI) for the support provided during this work. He acknowledges the encouragement provided by Prof. Sangeetha Menon (Dean) and Dr Shailesh Nayak (Director) of the National Institute of Advanced Studies while carrying out this work.
References Aleksander, I. (2005). Machine consciousness. Progress in Brain Research, 150, 99–108. Babakr, Z., Mohamedamin, P., & Kakamad, K. (2019). Piaget’s cognitive developmental theory: Critical review. Education Quarterly Reviews, 2(3), 517–524. Baumeister, R. F., Schmeichel, B., Dewall, C. N., Paul, E. S., & Kaufman, S. B. (2014). Creativity and consciousness. In The philosophy of creativity: New essays (p. 185). OUP Boden, M. A. (2004). The creative mind: Myths and mechanisms. Routledge Bonshek, A. J. (2001). Mirror of consciousness: Art, creativity, and Veda. Motilal Banarsidass Chella, A., & Manzotti, R. (2009). Machine consciousness: A manifesto for robotics. International Journal of Machine Consciousness, 1(1), 33–51.
Clowes, R., Torrance, S., & Chrisley, R. (2007). Machine consciousness. Journal of Consciousness Studies, 14(7), 7–14. Combs, A., & Krippner, S. (2007). Structures of consciousness and creativity: Opening the doors of perception [Paper presentation]. One hundred fifteenth convention of the American Psychological Association, San Francisco Cotterill, R. M. (2001). Cooperation of the basal ganglia, cerebellum, sensory cerebrum and hippocampus: Possible implications for cognition, consciousness, intelligence and creativity. Progress in Neurobiology, 64(1), 1–33. Fingelkurts, A. A., Fingelkurts, A. A., & Neves, C. F. (2012). “Machine” consciousness and “artificial” thought: An operational architectonics model guided approach. Brain Research, 1428, 80–92. Floreano, D., & Mattiussi, C. (2008). Bio-inspired artificial intelligence: Theories, methods, and technologies. MIT Press. Gaiseanu, F. (2021). Information, info-creational field, creativity and creation, according to the informational model of consciousness. International Journal of Neuropsychology and Behavioural Sciences, 2(3), 75–80. Gamez, D. (2018). Human and machine consciousness. Open Book Publishers Gamez, D. (2008). Progress in machine consciousness. Consciousness and Cognition, 17(3), 887– 910. Grof, S. (1998). The cosmic game: Explorations of the frontiers of human consciousness. State University of New York Press Hadamard, J. (1945). The psychology of invention in the mathematical field. Princeton University Press. Hirschman, E. C. (1983). Consumer intelligence, creativity, and consciousness: Implications for consumer protection and education. Journal of Public Policy and Marketing, 2(1), 153–170. Holland, O., & Goodman, R. (2003). Robots with internal models a route to machine consciousness? Journal of Consciousness Studies, 10(4–5), 77–109. Iyengar, S. (2010). The art of choosing. Twelve Kotchoubey, B. (2018). Human consciousness: Where is it from and what is it for. Frontiers in Psychology, 9, 567. Kulkarni, P. (2012). Reinforcement and systemic machine learning for decision making (Vol. 1). Wiley Kulkarni, P., & Joshi, P. (2015). Artificial intelligence: Building intelligent systems. PHI Learning Pvt. Ltd Kulkarni, P., Joshi, S., & Brown, M. S. (2016). Big data analytics. PHI Learning Pvt. Ltd Kulkarni, P. A., Dwivedi, S., & Haribhakta, Y. V. (2018). U.S. Patent No. 10,002,330. U.S. Patent and Trademark Office Kulkarni, P. (2022). Choice modelling: Where choosing meets computing. In Choice computing: Machine learning and systemic economics for choosing. Intelligent systems reference library (Vol. 225). Springer. https://doi.org/10.1007/978-981-19-4059-0_2 Kulkarni, A., Tokekar, V., & Kulkarni, P. (2015). Discovering context of labeled text documents using context similarity coefficient. Procedia Computer Science, 49, 118–127. Kulkarni, P. (2008). IT strategy. Oxford University Press. Kulkarni, P. (2017a). Knowledge innovation strategy. Bloomsbury Publishing. Kulkarni, P. (2017b). Reverse hypothesis machine learning. Springer. Maldonato, M., Dell’Orco, S., & Esposito, A. (2016). The emergence of creativity. World Futures, 72(7–8), 319–326. Martindale, C. (1977). Creativity, consciousness, and cortical arousal. Journal of Altered States of Consciousness Montouri, A., Combs, A., & Richards, R. (2004). Creativity, consciousness, and the direction for human development. In The great adventure: Toward a fully human theory of evolution. State University of New York Press
Nardi, B. A. (Ed.). (1996). Context and consciousness: Activity theory and human-computer interaction. MIT Press Ornstein, R. E. (1973). The nature of human consciousness: A book of readings. WH Freeman Pise, N., & Kulkarni, P. (2016). Algorithm selection for classification problems. In Proceedings of the 2016 SAI Computing Conference (SAI) (pp. 203–211) Pope, K. (Ed.) (2013). The stream of consciousness: Scientific investigations into the flow of human experience. Springer Rabinovich, M. I., Zaks, M. A., & Varona, P. (2020). Sequential dynamics of complex networks in mind: Consciousness and creativity. Physics Reports, 883, 1–32. Reggia, J. A. (2013). The rise of machine consciousness: Studying consciousness with computational models. Neural Networks, 44, 112–131. Rosling, H., Rosling, O., & Rönnlund, A. R. (2019). Factfulness: Ten reasons we’re wrong about the world—and why things are better than you think. Sceptre Savage-Rumbaugh, S., Mintz Fields, W., & Taglialatela, J. (2000). Ape consciousness–human consciousness: A perspective informed by language and culture. American Zoologist, 40(6), 910–921. Senge, P. M. (1997). The fifth discipline. Measuring Business Excellence, 1(3), 46–51. Sheldrake, R., McKenna, T., & Abraham, R. (2001). Chaos, creativity, and cosmic consciousness. Simon and Schuster. Singer, J. L. (2009). Researching imaginative play and adult consciousness: Implications for daily and literary creativity. Psychology of Aesthetics, Creativity, and the Arts, 3(4), 190. Stangor, C., & Walinga, J. (2014). Introduction to psychology. BCcampus Starzyk, J. A., & Prasad, D. K. (2011). A computational model of machine consciousness. International Journal of Machine Consciousness, 3(2), 255–281. Thaler, R. H., & Sunstein, C. R. (2009). Nudge. Penguin. Turing, A. M. (1937). On computable numbers, with an application to the entscheidungs problem. Proceedings of the London Mathematical Society, 1, 230–265. Vidhate, D., & Kulkarni, P. (2012). Cooperative machine learning with information fusion for dynamic decision making in diagnostic applications. In: Proceedings of the 2012 International Conference on Advances in Mobile Network, Communication and Its Applications (pp. 70–74) Vidhate, D. A., & Kulkarni, P. (2016). Performance enhancement of cooperative learning algorithms by improved decision making for context based application. In Proceedings of the 2016 international conference on automatic control and dynamic optimization techniques (ICACDOT) (pp. 246–252) Wiggins, G. A. (2012). The mind’s chorus: Creativity before consciousness. Cognitive Computation, 4(3), 306–319. Wiggins, G. A. (2020). Creativity, information, and consciousness: The information dynamics of thinking. Physics of Life Reviews, 34, 1–39.
Chapter 9
Learning Agility: The Journey from Self-Awareness to Self-Immersion Madhurima Das
Abstract The world is in a constant phase of change, and organizations today are embracing disruptive innovations and digital transformations. This has changed how we work in this virtual and globally diverse world: the way we connect with people and communities, the manner in which we co-create knowledge, and how we share our learnings. Given the transformative AI landscape and its inherent impact on organizational design, functions, processes, and behaviour, it is imperative to adopt a mindset that is open, aware, inquisitive, reflective, empathetic, innovative, resilient, and risk-taking. Learning agility is the individual's ability and willingness to learn from earlier experiences and apply that learning to perform better in newer situations. Individuals who exhibit learning agility help create an organization that is agile in every aspect. As we navigate the Volatile, Uncertain, Complex, and Ambiguous (VUCA) world, learning agility is a core capability that will determine the journey from self-awareness to self-immersion. This chapter will explore the impact of AI (automation and digitization) on core organizational components; the need for an agile ecosystem to respond to digital transformations at the organizational level; learning agility as the key enabling capability to survive and thrive in the VUCA world; and the dimensions and enablers of learning agility and the role of the individual as they move from self-awareness to self-immersion.
Keywords Learning agility · Organizational change · Digital transformations · Knowledge co-creation · Learning ability · Learning willingness
1 Introduction
The world is changing rapidly in the manner in which organizations are embracing AI for organizational design, functions, processes, and behaviour. In this regard, organizations that are able to adapt and change with the times will be the ones that succeed. Learning agility is one of the most important skills that individuals and
organizations can develop in order to thrive in a VUCA world. Learning agility is the ability to learn from experience, adapt to change, and take risks. It is a combination of cognitive skills, such as problem solving and critical thinking, and behavioural skills, such as open-mindedness and resilience. Individuals who are learning agile are able to:
• See the big picture and understand how changes in one area of their work can impact other areas.
• Take risks and experiment with new ideas.
• Learn from their mistakes and feedback.
• Adapt to change quickly and effectively.
Organizations that are learning agile are able to:
• Innovate and stay ahead of the competition.
• Respond to changes in the market quickly and effectively.
• Create a culture of continuous learning and improvement.
As we embrace AI and become more learning agile, we move from the context of self-awareness to self-immersion. Learning agility is ultimately a critical skill for individuals and organizations alike; it is the ability to learn and adapt quickly that will determine who succeeds and who fails. The following case study reflects this movement.
Ananya has over 14 years of experience and has been instrumental in creating employee engagement policies that ensure a cohesive and nurturing environment for employees at work. In her new job at a digital services organization, she has been entrusted with the task of increasing the morale of the employees as they move towards a technological and digital transformation phase. She observes that while the digital change has been embraced by the employees with great enthusiasm, with many stepping up to learn the new tools and upskill themselves, there is a chasm in the interpersonal dynamics and a lack of cohesiveness. She begins connecting with the employees in smaller groups and understands further that while the leaders stressed the digital transformation and the need for it, the ecosystem lacked the support that many of them needed. They felt rushed against work deadlines, under constant pressure to upskill, and felt the environment was one of competition rather than collaboration. This pushed them to focus on embracing the digital change while shortchanging human interactions. She had an uphill task to ensure that the employees were both innovative and empathetic, aware and reflective, and risk-taking and resilient. She wanted them to embrace agility in the manner they work and interact with their colleagues. She wanted them to be aware of the digital change and how it impacted their roles. She also wanted them to be immersive in this change, looking at it holistically. What Ananya encountered in her new role is a challenge faced in a multitude of organizations today. Change is the constant that drives humans and organizations. Today, organizations are embracing change by driving digital transformations, overhauling the way they work, their processes, and their policies. Job roles have been redefined and the lean way of working has been embraced. Employees are constantly looking at upgrading their skills, cross-skilling, and focusing on role enhancement.
Organizations are working to create a system of support through new ways of learning and working. The shift is in both the mindset and the culture, at both the macro and micro levels. The granular changes are being introduced in slow phased manners, encouraging agility in the employees. Ananya must ensure a shift from self-awareness to self-immersion for the employees. This would involve them going through phases of change, embracing the digital transformations, dealing with the VUCA environment, assimilating into the agile ecosystem, and changing their own mindsets too. It is a journey of personal change towards developing a growth mindset, having an outlook of unlimiting beliefs, driving self with motivation, accountability, and ownership (see Fig. 1).
2 The Impact of AI (Automation and Digitization) on Core Organizational Components As organizations look at automating their systems and embracing digital transformations, it changes the way we work and the way we think we work. Many routine tasks are now being upgraded and automated, leaving many employees in a space where they need to learn new skills. This learning must be at a rapid pace, in sync with the changes around. Technology today is also amplifying opportunities and enriching the experiences at work. The manner in which we have adopted to a complete virtual way of working during Covid is testimony to this. Even the essential services that remain in the shadows but continue to steer the organizations have automated many of the processes. Organizations that were originally in the space of software development, adopted agile methodologies at work. This included bringing in both a technological and a cultural mind shift in its employees. The idea was to be able to visualize the work, create shorter steps to achieve the stated goals, plan in a miniscule manner, and track the work being done. The focus on increasing speed and accuracy brought with it the need to streamline processes at the microlevel. Whether the organizations were developing products or services, from the phase of ideation to the completion of the product of service, the agile methodology began to be adopted. There are several agile methodologies that were adopted over time across organizations, dating back to the 1950s. In the 1930s, physicist and statistician Walter Shewhart of Bell Labs began applying Plan-Do-Study-Act (PDSA) cycles to the improvement of products and processes. This was an iterative and incremental development methodology. The Kanban system was developed in the early 1940s by Taiichi Ohno to manage the problem of low productivity and inefficiency at Toyota. This was a planning system to control and manage work optimally at every phase of the production. Toyota experienced a flexible and efficient performance, and reduced costs. David J. Anderson in 2004 was the first to apply the concept to IT, Software development and knowledge work in general. Over the years, the process began to be used for
Fig. 1 Conceptual model delineating the journey from self-awareness to self-immersion (self-awareness: my skills, my job role and responsibilities, my potential and capability, my motivation; through automation and digital transformation, the VUCA environment, the impacted organizational ecosystem of processes, policies, and people, and the need for an agile ecosystem; towards self-immersion: my adaptive skills, accountability and ownership, a growth mindset, and unlimiting beliefs)
Staffing and Recruitment, Operations, Marketing and Sales, and other domains in the organizations. It has now emerged as a lean method that helps bring in a balance between the demands of work and the capacity available within a team. The workflow is optimized by using a board called the Kanban board which helps to visualize the work every person in the team is doing, the status and bring in incremental improvement. This helps identify bottlenecks and fix them with minimal impact on the workflow. This is a collaborative work process based on the principles of respect, understanding, agreement among the team members, explicit process policies, and continuous improvement. The scrum methodology, initially visualized as a ‘rugby approach’ was introduced by Takeuchi and Nonaka (1986), while delineating a framework for organizations to manage an agile software development process. The term scrum was introduced by Jeff Sutherland in 1993 and in 1995, along with Ken Schwaber, who collaborated to codify the approach and introduce the term ‘scrum’. In the scrum, the teams were called squads and consisted of small groups of three to nine members. They worked in short sprint cycles of two weeks, focusing on quick and accurate delivery. The progress of the work was monitored through daily meetings of 10–15 min, called stand-ups. Over time, the methodology evolved. The scrum was managed by a scrum master and every squad member brought in their niche skills to the table. The delivery of the work and the results had to be a synergized effort from every squad member. Another Agile methodology, Extreme Programming was introduced, and the practises, processes, and values were formalized by Beck (1999). This is an agile process whose objective is to improve the quality of software and the responsiveness to the evolving needs of the customer, while maintaining a sustainable pace, and works effectively with small projects by small teams. The focus is on communication and integration between the cross-functional teams, ensuring an energized self, reflecting good physical and mental health (Pal, 2022). The impact of AI has also been seen in the field of public administration with services focused on the user need and utility, calling for a mindset of openness, innovation, solidarity and agility, to implement change in society (Schachtner, 2021). Understanding and acceptance of the technological innovation, with efficient documentation and clarity on the usage, with strong methodological backing is paramount to set the correct standards in the organizations (Schachtner, 2019). Organizations have reduced their dependence on human resources, thereby downsizing and in turn, reduced cost in certain functions. There has been a lateral frame of leadership and decision-making, compared to a stringent hierarchical structure. AI and increased automation has also led to newer decision-making models, creating space for new lines of business (Adu & Mpu, 2019). AI has helped organizations with market research, creating data insights, which has led to better business decisions. The customer service industry with the use of chatbots has increased the efficiency of response-time and found solutions to repetitive queries from customers. The personalized interactions have enhanced the experience of the customer and they have received tailor-made solutions. When we look at the AI-powered Netflix’s recommendation function, it shows how they
understand choices and simplify the decision-making processes for us (Weitzman, 2023; Zapanta, 2023). When we look at the healthcare sector, from the diagnosis of patients to ensuring seamless communication between doctors across departments, hospitals are utilizing AI-powered computer systems. Patients are being remotely treated and medical document transcription has also become easier (Zapanta, 2023). Today, even content is being created with the help of several keywords, resulting in creative outputs with AI, like the ChatGPT. With AI, we are able to track human behaviour, understand and cluster customer responses to the products and services created by the organization and help predict further behaviour of buying and selling (Manyika & Sneader, 2018; Weitzman, 2023). Organizations and industries are widely adopting AI capabilities that are relevant to the products and services they are developing (McKinsey Report, 2019). They are seeing increasing revenue in marketing and sales and decreasing cost mostly in the manufacturing sector. In the area of Human Resources, we see the usage of AI technologies in several platforms, especially in the recruitment process, from sourcing, screening, interviewing the candidates to on boarding the new employees. The adoption of AI is a journey, whether it is in their administration practises, talent management, benefits and engagement, or in learning and development. In the short term, the focus is on cost saving and increasing productivity. In the mid-term, the impact is on the decisionmaking process, the use of predictive analysis for better customer outcomes, and in the long term, the goal is to enhance autonomy and the capabilities of the employees (Sen, 2019) (Fig. 2). Organizations have also adopted various core practises for scaling AI, as seen in the responses given by the high performers in the McKinsey Report, 2019 (Cam et al., 2019). Organizations have aligned their AI strategy to their business goals, road mapping the way ahead. Not only do organizations focus on collaboration between cross-functional teams, they have also begun training their employees in AI across the business roles. The focus has also been in establishing standard processes, protocols, and methods that are repeatable and sustainable, following strong governance processes for data-related decisions and ensure the adoption for daily decision-making, incorporating this learning into the KPI matrix. Organizations with the adoption of AI have thus focused on constant innovation of business practises, to stay ahead in the competitive business ecosystem. While doing so, it also calls for a change in the way we look at our eco system, in a more agile manner (Zapanta, 2023). Organizational learning with AI has its set of challenges, where the context and method of learning becomes important. To succeed with AI, there has to be mutual learning between humans and machines, making both more relevant and efficient over time (Ransbotham et al., 2020). The changes on core organizational components called for the creation of a more agile ecosystem, focusing on new learning, knowledge co-creation, an experimental mindset, an environment of sharing, and constructive feedback.
Fig. 2 AI adoption across industries. Source Exhibit 2, McKinsey Report, 2019
3 The Need for an Agile Ecosystem to Respond to the Digital Transformations at the Organizational Level
As organizations go through digital transformations, it is imperative that leaders build an agile ecosystem: one that helps individuals thrive in the face of challenges, where they learn from their mistakes, are flexible, can adapt, and are prepared to handle the diverse challenges that everyday organizational life throws at
them, be it handling an irate client, a team member exhibiting low morale, or a demanding stakeholder (Rigby et al., 2016). Among the changes that had to be embraced with advancing automation and digitization was reskilling and retraining employees to address potential skill gaps (McKinsey Report, 2018). Everyone in the organization is impacted and has to embrace the new learning. Creating an agile workforce was pertinent to the creation of the agile ecosystem, and learning agility was at the core of it. For both individual contributors and leaders who made decisions, investing in learning agility and core capabilities was equally important. This skill transcends hierarchical boundaries and has emerged as a basic need and skill at every level of the organization (Kelchner, 2019). Digital transformation has reshaped how we connect with people, the way we build our support systems and communities, and how we work and organize work.
3.1 Agile Mindset
Organizations must adopt an agile mindset. The core values of the agile mindset embrace change at every level with flexibility and agility (Drozd, 2021).
• Respect: Working with the team as a collective unit with a common vision, purpose, and goal. Creating a culture of respect and psychological safety helps the team work more effectively in a collaborative format, delivering sustainable results.
• Optimization: Optimizing the workflow allows the team to maximize value and minimize waste, ensuring quality and incremental delivery. This allows the team to catch defects early in the process, with consistent communication and a transparent, fast feedback process.
• Innovation: Encouraging the team to collaborate, giving consistent feedback, and nurturing all ideas creates a culture of experimentation that can be transformative in nature and help team members thrive.
• Improvement: Through the process of reflection and retrospection, teams must continually work to find means to optimize, problem-solve, and improve the work process. The path of improvement is a relentless journey.
The agile mindset needs to be driven by leaders who are agile in their beliefs and practises.
3.2 Agile Leadership
To build an agile ecosystem, we need leaders who are agile and can drive that change. The leader must focus on constant communication, exhibit commitment, and foster collaboration. Agile leadership works on a continuum and can be understood through
the nine principles stated below (Agile Business Consortium, 2017). These principles are based on driving Communication, Commitment, and Collaboration within the organization and guide in developing, engaging, and empowering the employees (see Fig. 3).
Fig. 3 The 3 Cs and 9 principles of agile leadership. The 3 Cs: Communication, Commitment, Collaboration. The nine principles and the guidance each offers: 1. Actions speak louder than words (Developing); 2. Improved quality of thinking leads to improved outcomes (Reflecting); 3. Organizations improve through effective feedback (Learning); 4. People require meaning and purpose to make work fulfilling (Inspiring); 5. Emotion is a foundation to enhanced creativity and innovation (Engaging); 6. Leadership lives everywhere in the organization (Unifying); 7. Leaders devolve appropriate power and authority (Empowering); 8. Collaborative communities achieve more than individuals (Achieving); 9. Great ideas can come from anywhere in the organization (Innovating)
Leaders must imbibe a creative mindset in the organizational culture with a focus on collaboration and building partnerships. They have to role model this behaviour in an authentic manner, inspiring the employees. They must share both experiences and stories that will inspire and create a sense of credibility and conviction. They must work towards building new capabilities in the organization and weave learning into the fabric of everyday activities, reinforcing behaviours that spin the story of adaptability (De Smet et al., 2018).
3.2.1 Agile Individuals Make Agile Teams
For an agile ecosystem, the individuals must be agile by nature and there are certain personality traits that have been associated with them (Aghina et al., 2018). • Ability to handle ambiguity ensures that teams can focus on their goals and prioritize the work. • High agreeableness among team members ensuring that they respect each other’s ideas, build understanding and cohesion while working in cross-functional teams and ensure transparency at work.
• Being conscientious, striving for achievement, being self-disciplined, and committed helps the team members be more adaptable by nature, rising to new challenges and being responsive to change. • Extrovert members lend energy to the teams and are good at leading teams that need guidance and stimulation. Introvert members, on the other hand, help members find their space by being empathetic, listening with intent and promoting talent, working effectively with proactive and self-motivated team members. • Agile organizations need team members who show traits of emotional stability and are not overwhelmed by failures, are able to handle critical feedback and stay calm through crisis situations. Certain values are also associated with agile teams. They are open to change and self-directed, which ensures high quality work and efficient decision-making. Customer centricity lies at the core of their work with them focusing on getting it right and delivering value to the customer frequently and incrementally. They are risk takers, open to ambiguity, focus on quality of work, and are intrinsically motivated. In an agile ecosystem, the task is to create an organizational culture where there is a confluence of innovation and empathy, focusing both on digital skills and emotional intelligence of the employees, building both the risk-taking ability and empowering the people to take decisions. Also inculcate the values of transparency and accountability (Mukerjee, 2019). One of the ways ahead is to build the learning agility in the employees and create a culture where Recognizing, Building, and Supporting talent, through retraining and reskilling programmes and focus on unlearning certain quaint skills and learning new adaptive skills become the norm.
4 Learning Agility—the Key Enabling Capability to Survive and Thrive in the VUCA World Extensive globalization, digital transformations, constantly changing economic conditions, and the evolving agile ecosystems also witnessed an increase in the usage of virtual interactions across different platforms and the role of social media. Organizations had to become responsive to this change and develop the core capabilities and focus on enhancing the learning agility of the employees (Dai et al., 2013). Peter Senge described learning organizations as places “where people continually expand their capacity to create the results they truly desire, where new and expansive patterns of thinking are nurtured, where collective aspiration is set free, and where people are continually learning how to learn together” (Senge, 1990). Being learning agile is a necessity in an agile ecosystem. According to the researchers at Teachers College, Columbia University, and the Centre for Creative Leadership, learning agility is defined as “a mindset and corresponding collection of practises that allow leaders to continually develop, grow, and utilize new strategies that will equip them for the increasingly complex problems they face in their organizations” (Flaum & Winkler, 2015). Simply put, it is the ability to learn, unlearn, and relearn. Learning agility
has been viewed as a critical component and is the enabling capability to survive in a VUCA environment (Murphy, 2021). The VUCA world describes the constantly changing environment that comes along with digital transformations in the virtual and globally diverse collaborative environment where change is rapid, constant, and complex (Bennett & Lemoine, 2014; Staggs, 2014; Worle, 2022). • V-Volatility Things in the world change unpredictably, suddenly, and extremely, for example, changes in new technology affecting how products are made, customer expectations are met, and new competition is handled. This can put people in a space of fear and risk aversion. • U-Uncertainty Unclear, doubtful, and indefinite information about the current and future affecting decision makers; and therefore, planning investment, development, and growth becomes difficult. • C-Complexity Many different and connected parts: multiple key decision factors, interaction between diverse agents; the emergence, adaptation, co-evolution of multiple systems making things more complex and difficult to predict cause and effect, with changes at multiple levels in the organization. • A-Ambiguity The demands on organizations and leaders today are immense. Information that is available is open to multiple interpretations; the awareness of the same and the willingness to take risk to deal with the ambiguity is essential. This induces doubt, distrust, hesitancy, and impedes decision-making and change in the organization. The solution to the VUCA world needs the leader to develop a vision to deal with the volatility, which will reduce the fear and unpredictability; create an environment of understanding to deal with the influx of indefinite information, look at the data rationally, and focus on finding solutions. The leader also needs to work to bring clarity while making decisions and work on them in a phased manner, managing the demands of multiple customers. The leader needs to ensure agility to be adaptive, learn from new information, and embrace the change. Learning agility is the key to survive and thrive in the VUCA world. In the unpredictable organizational environment, where there is constant change, adaptation and the need for evolution, employees who have learning agility will be able to navigate the volatile times and be a comprehensive part of the learning organizations being built (Garvin, 1993).
4.1 AI and Learning Agility
Learning agility and adaptability are not just skills but a mindset to be developed, with a focus on learning programmes that reskill and retrain employees continuously, along with adopting newer frames of reference and imbibing the change at a cultural level within the organization (Kelchner, 2019). AI can improve learning agility in individuals by co-creating knowledge, enabling easier sharing of knowledge, helping people become skilled in the newer AI tools, incentivising a risk-taking culture, and creating a perspective of readiness. As
employees begin to adopt AI in their work, AI actually enables them to become more learning agile. It is a co-enhancing space (Mikalef et al., 2023). AI has also impacted learning agility in organizations by improving performance monitoring systems and introducing a new concept called human-in-the-loop systems. When models in an AI-driven organization make mistakes, a team of people monitors the models and gives feedback in case of errors. This creates a synergistic environment, with appropriate data being fed to the models for better outcomes (Meier, 2019). AI and learning agility complement each other as both enablers and drivers of each other. As a learner, it becomes pertinent not just to be self-aware but also to consciously work towards self-immersion.
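A minimal, hypothetical sketch of such a human-in-the-loop arrangement is given below: low-confidence predictions are routed to a human reviewer, and the corrections are collected as feedback for the next model version. The stand-in model, the confidence threshold, and the review function are illustrative assumptions.

```python
def model_predict(item: str) -> tuple[str, float]:
    # Stand-in for a trained model returning (label, confidence)
    return ("approve", 0.62) if "refund" in item else ("approve", 0.97)

def human_review(item: str) -> str:
    # Stand-in for a human reviewer's decision
    return "escalate"

retraining_buffer = []
for item in ["routine invoice", "refund dispute"]:
    label, confidence = model_predict(item)
    if confidence < 0.8:                          # uncertain cases go to a person
        label = human_review(item)
        retraining_buffer.append((item, label))  # feedback loop for the next model version
    print(item, "->", label)

print("collected for retraining:", retraining_buffer)
```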
5 The Dimensions and Enablers of Learning Agility and the Role of the Individual as They Move from Self-Awareness to Self-Immersion Learning agility has been viewed as a psychological construct that is seen as a predictor of leader potential and performance (Meuse, 2017). Studies have shown that learning agility not only is a predictor of high growth, but also that employees who show high learning agility exhibit tolerance for ambiguity as well and are empathetic and engaged at work (Staggs, 2014). Korn Ferry identified five dimensions of learning agility (Bedford, 2011; Knight & Wong, 2021; Verlinden, 2022) (see Fig. 4): • Mental agility The mentally agile person likes to learn new things, look at problems from multiple perspectives, inquires, and exhibits strong listening and embraces the complexities that arise. • People agility This quality ensures that the person understands the importance and impact of diverse thoughts and interactions, encourages different perspectives, is open-minded and culturally sensitive, nurturing the best in the people they work with. • Change agility The person with change agility accepts the best in the present and also understands that change is inevitable and important for growth and comes forth to lead transformative efforts in the organization, constantly being open to the new. • Results agility The person with results agility responds to challenges and difficult situations, inspires, and nudges others to achieve their results and constantly strives for better results and benchmarks higher performance. • Self-awareness The self-aware person is reflective and has a keen understanding of their strengths and weaknesses, seeks feedback and works on improving self, based on their personal insights.
Fig. 4 The five dimensions of learning agility. Source viaEDGE: assessment for measuring learning agility
Learning agility encompasses components of emotional intelligence, both self-awareness and self-management: the ability to understand and manage one's emotions, listen to others, and respond to their emotions (Flaum & Winkler, 2015). Research has shown that there are certain behaviours that enable learning agility and certain behaviours that act as derailers for learning agility. The enablers are Innovating, Performing, Reflecting, and Risking, and the main derailer is defending (Mitchinson & Morris, 2014).
5.1 The Enablers of Learning Agility • Innovating In this behaviour, the assumptions that have been held for long are challenged and the person is ready to question the status quo. The goal is to discover unique ways of doing things, being open to new experiences, which increase the breadth of knowledge. The person can look at issues from different vantage points and suggest ideas and solutions. • Performing In this behaviour, the performer can pick up new skills, be observant and a keen listener, deal with ambiguous and unfamiliar situations, be present in the moment and engage with the problem on hand, handle stress, and adapt quickly to changes. • Reflecting In this behaviour, the person reflects upon his/her assumptions and past behaviours, is open to new information and feedback, lends insight into his/
her behaviour and that of others. This enabler has also been seen as a predictor of success in senior leadership roles. • Risking In this behaviour, one is open to entering uncharted paths and exposing oneself to new experiences. Such a person is a cautious risk taker; the risk is about an opportunity. The enabler is ready to venture out of his/her comfort zones and find confidence in learning from possible failures. The derailer of learning agility is the behaviour of defending. When the person is not open towards opportunities, is unable to handle critical feedback, and shows defensive behaviour in the face of challenges, the person is low on learning agility. It has been seen that learning agile individuals are usually more focused on their work; they organize their tasks, are more methodical in the manner they carry out the tasks, and are driven by nature. They also like to take charge, are social, and are good at networking. They are creative and original in their thinking, often spearheading change and innovation in their organizations. They are also optimistic, perseverant, and more resilient in the face of uncertainty and challenges. They are open to criticism, engage with thoughts and actions that differ from theirs, and are not hesitant to express their opinions. Learning agility is impacted by two factors: learning ability, i.e., how an individual can learn new things, identify patterns, make connections, and apply the learning to varied situations (this ability reflects the person's fluid intelligence); and learning orientation, i.e., personality attributes such as how inquisitive and open-minded the individual is, and how quickly they learn and retain information to use it effectively. In the journey of being learning agile, both these aspects are a part of the self (Lally, 2019). To become self-aware and a self-directed learner, it is important to be open to experiences, seek feedback, understand knowledge sharing and co-creation, and cultivate an experimental mindset (Lee, 2020). There are various learning agility profiles that have been identified, exhibiting different strengths, agility dimensions, and developmental needs (see Fig. 5).
Fig. 5 The learning agility profiles
• Problem Solvers: strong drive, resolve problems independently, pay attention to detail, and avail resources efficiently.
• Thought Leaders: focus is knowledge, insight, and continual progress; drive the team silently during difficult times and manage tough situations.
• Trailblazers: confident, innovative, undeterred by ambiguity; make and define the roadmaps, confident about decisions taken and determined in their goals.
• Champions: people persons who enhance the team morale, deal with conflicts objectively, keep a positive frame of mind, and are inclusive in their actions.
• Pillars: focused and insightful creative thinkers, efficient at problem solving and decision making, who can find meaning in chaos.
• Diplomats: balance the team equations, adaptable by nature, navigate difficult conversations, situations, and people with conviction; great conversationalists.
• Energisers: they get things done, are well networked, determined, and goal focused, and are the solution providers.
5.2 Learning Agility—The Journey from Self-Awareness to Self-Immersion Learning agility has been defined as a skill, a capability, a mindset. It has been seen as a journey from self-awareness to self-discovery (Goebel, 2013). It can also be seen as the journey from self-awareness to self-immersion as we deal with digital challenges in the VUCA world, imbibe new learnings, find our way through the evolving agile ecosystem and move through the phases of change. Self-awareness was defined by psychologists Duval and Wicklund (1972) as “The ability to focus on yourself and how your actions, thoughts, or emotions do or don’t align with your internal standards. If you’re highly self-aware, you can objectively evaluate yourself, manage your emotions, align your behaviour with your values, and understand correctly how others perceive you” (Betz, 2022). This in line with how Goleman (1998) defined the aspects of self-awareness and self-management in the Emotional Intelligence Matrix. In the current organizational space, self-awareness is not only about being aware of your strengths and challenges, but also how you are perceived by the people around you, the ability and potential that you bring to the organization and how it can determine the growth for you. Self-awareness in the context of the organization and the role the individual plays, comprises various components. • My Skills The skills (technical and behavioural) that an individual brings to the organization. This is based on their educational qualifications, their past learnings, certification, and experiences. • My Job Roles and Responsibilities The job the person is hired for, their responsibilities and the tasks that they must do. This includes their key result areas and key performance indicators. • My Potential and Capability The capability refers to an individual’ current ability to do a particular task, along with the potential they exhibit, the latent qualities that can be developed in the future. An understanding of both capability and potential helps leaders develop their team members. This can be done with the right planning, resources, and interventions. • My Motivation It is important to understand what motivates and drives a person. It could be intrinsic or extrinsic motivation, as was stated by Maslow (1943) in his Hierarchy of Needs. It could be motivators developed through our experiences and culture like the need for achievement, power or even affiliation, as noted by McClelland (1961). It could be responding to specific and challenging goals which give direction to the employees. No two individuals are motivated by the same thing and the same circumstances. 5.2.1
Self in a Phase of Change
From self-awareness, the self-moves into a phase of change. This is to deal with the influx of digital transformations, the VUCA environment, embracing the agile
ecosystem that is emerging in the organization and the mindset of learning agility. Change is a process and there are various zones of change that everyone who is partaking in the process goes through (Page, 2020). Comfort zone refers to your experience prior to the change is introduced. You are comfortable with your work role, the work expectations and how the career is shaping up. Then a change is introduced. This often puts people into a zone of fear. In the fear zone there is uncertainty, lack of confidence and even apprehension if one will be able to adapt to the change. Sometimes, organizations see reluctance and sometimes organizations see an open mind to embrace the change. There are also employees who are undecided. This phase may also see an environment of mistrust in the teams, some conflicts that may arise as everyone is grappling with the change and self-doubts, and employees wondering if they have the requisite skills to be at par with the change being introduced. From this zone, as organizations introduce the change in a phased-out manner, changing processes and policies, employees move into the learning zone. In the learning zone, the mindset is that of acceptance, wanting to learn new things, challenging earlier set of beliefs, and working on the path of progress. The growth zone that comes after this takes time, is iterative by nature and helps the employees assimilate the change completely. Employees find purpose, realize their aspirations and future goals and embrace new experiences. Automation and digital transformations thus impact people, processes, and systems in the organization and drives people towards these zones of change (see Fig. 6).
5.2.2 Self-Immersion
It refers to the way we embrace learning agility and view that the self is changing as we become more learning agile. It is essential for the growth of the individual and to make the individual more relevant in the organization. It is important for career sustenance and professional identity (Mukherjee & Sujatha, 2020). The view on skills, job roles and responsibilities, potential and capability, and motivation also acquires a new lens. • My adaptive skills The skills must be adaptive by nature. The role of retraining and reskilling cannot be ignored in an agile ecosystem. Individuals must be open to upskilling and cross skilling themselves. With teams becoming leaner by nature, upskilling of core capabilities is necessary to meet the demands in a situation of crisis or change (Illanes et al., 2018; Kelchner, 2019). • My job role and responsibilities, accountability, and ownership A progressive understanding of job roles and responsibilities is necessary, along with exhibiting both accountability and ownership. Being accountable means that the person is being responsible for the result of the task. The person is responsible for meeting the timelines of work and delivering quality work. Ownership is about initiative and being forthcoming with ideas, sharing thoughts to ensure the process of work is smooth and productive. To understand that the way we follow through with our
Fig. 6 Zones of change (comfort zone: feel safe, in control, understand work expectations; fear zone: uncertainty, lack of confidence, self-doubt, mistrust, conflicts; learning zone: acceptance, challenging beliefs, learning new things; growth zone: find purpose, realise aspirations, embrace new experiences)
work and meet our goals affect the work of other team members and how they reach their goals. With ownership, there is a sense of belongingness and a more fulfilled work environment (Tanner, 2017). People with internal locus of control exhibit a more positive outlook, focusing on what you can do, rather than being preoccupied with things that have been done and can’t be changed. This is one of the factors that ensure people show both accountability and ownership (Strachan, 2020). • My potential and capability—growth mindset It is not only essential to understand our capabilities and potentials, but also be able to work with our leaders and plan our career paths, look at goals, both short term and long term, that enhance the skills, build on the existing capabilities and realize the latent potentials. This facilitates to move towards being high performers (Meuse et al., 2008). In this transformative space, we move from having a fixed mindset to a growth mindset (Dweck, 2016). When we have a fixed mindset, we are limited by our own assumptions on our abilities and intelligence and see our success only in relation to that. This makes us wary of taking chances and fearful of failure. This limits us and the progress that we can attain. What we need to develop is a growth mindset—a mindset where we thrive on challenges, treat failures as the stepping stones to success and are continually striving to stretch our abilities. The growth
mindset also influences how we see ourselves in the larger scheme of things and our capacity for happiness. • My motivation—focusing on unlimiting beliefs Our motivations change overtime. As we grow in organizations, it also becomes essential to understand that there are some beliefs that can limit us, based on our past experiences and associations with them. These self-limiting beliefs are the negative feelings and beliefs that impact our thinking, confidence, and the way we work (Lai, 2021). Through deliberate effort and consistently reminding ourselves of our motivations and strengths, we must focus on our unlimiting beliefs, which are a pandora’s box of potentials. This also helps us be more reflective, adaptable, open to experiences, embracing the mindfulness of self. Learning agility, in the true sense, is a journey that every individual self must embark upon. The reasons for that movement could be digital transformations, the uncertain environments, changing ecosystems and self-realization. While we are aware of our strengths and our challenges, we must not be limited by what they mean in the current time and space, but yearn to grow, discover the self in a new light, and be ready to challenge limiting beliefs and thoughts. This will be the precursor of being immersive in the true sense. As we persistently build our resilience and our evolving reality, we must remain grateful for this journey, savouring every milestone. Self-immersion is a work in progress that responds to the phases of change and is iterative by design; a confluence of our nature, the environment, and the nurturing we give to the self.
6 The Synergistic Path Ahead
Learning agility and adaptability are essential skills for success in today's rapidly changing world. AI can help to improve learning agility in individuals and organizations in a number of ways.
For individuals, AI can help to:
• Co-create knowledge by making it easier to share and collaborate on information.
• Enable easier sharing of knowledge by providing access to a wider range of resources and making it easier to find relevant information.
• Help people become skilled in the newer AI tools by providing training and support.
• Incentivize the risk-taking culture by providing a safe environment to experiment and learn from mistakes.
• Create a perspective of readiness by providing insights into the future of work and how AI will impact it.
For organizations, AI can help to:
• Improve the performance monitoring systems by providing more accurate and timely data.
• Introduce human-in-the-loop systems by providing a way to integrate human expertise with AI-powered systems.
• Create a synergistic environment by ensuring that AI systems are aligned with the needs of the organization and its employees.
As a learner, it is important to be self-aware and to consciously work towards self-immersion. This means being open to new experiences, being willing to take risks, and being able to learn from your mistakes. AI can help you develop these skills by giving you access to a wider range of resources, making it easier to find relevant information, and providing feedback on your progress. Working together, AI and learning agility can help you stay ahead of the curve in today's rapidly changing world.
References
Adu, E. O., & Mpu, Y. (2019). Organizational and social impact of artificial intelligence. American Journal of Humanities and Social Sciences Research (AJHSSR), 3(7), 89–95.
Aghina, W., Handscomb, C., Ludolph, J., West, D., & Yip, A. (2018). How to select and develop individuals for successful agile teams: A practical guide. McKinsey & Company. https://www.mckinsey.com/business-functions/people-and-organizational-performance/our-insights/how-to-select-and-develop-individuals-for-successful-agile-teams-a-practical-guide. Retrieved December 20.
Agile Business Consortium. (2017). Culture and leadership: The nine principles of agile leadership. Agile Business Consortium. https://www.agilebusiness.org/page/Resource_paper_nineprinciples. Accessed 12 June 2021.
Beck, K. (1999). Extreme programming explained: Embrace change. Addison-Wesley Professional.
Bedford, C. L. (2011). The role of learning agility in workplace performance and career advancement [Doctoral dissertation, University of Minnesota]. Accessed 12 June 2021.
Bennett, N., & Lemoine, G. J. (2014). What VUCA really means for you. Harvard Business Review, January–February 2014. https://hbr.org/2014/01/what-vuca-really-means-for-you. Accessed 12 June 2021.
Betz, M. (2022). What is self-awareness and why is it important? https://www.betterup.com/blog/what-is-self-awareness. Accessed 12 June 2021.
Cam, A., et al. (2019). Global AI survey: AI proves its worth, but few scale impact. McKinsey Analytics. https://www.mckinsey.com/featured-insights/artificial-intelligence/global-ai-survey-ai-proves-its-worth-but-few-scale-impact. Accessed 12 June 2021.
Dai, G., De Meuse, K. P., & Tang, K. Y. (2013). The role of learning agility in executive career success: The results of two field studies. Journal of Managerial Issues, 25(2), 108–131.
De Smet, A., Lurie, M., & St George, A. (2018). Leading agile transformation: The new capabilities leaders need to build 21st-century organizations. McKinsey & Company. https://www.mckinsey.com/business-functions/people-and-organizational-performance/our-insights/leading-agile-transformation-the-new-capabilities-leaders-need-to-build-21st-century-organizations. Accessed 22 July 2022.
Drozd, K. (2021). Cultivating an agile mindset: How to get your team to love and adopt agile methodology. https://www.atlassian.com/agile/advantage/agile-mindset. Accessed 12 June 2021.
Duval, S., & Wicklund, R. A. (1972). A theory of objective self awareness. Academic Press.
Dweck, C. (2016). What having a "Growth Mindset" actually means. Harvard Business Review. https://hbr.org/2016/01/what-having-a-growth-mindset-actually-means. Accessed 12 June 2021.
Flaum, J. P., & Winkler, B. (2015). Improve your ability to learn. Harvard Business Review. https://hbr.org/2015/06/improve-your-ability-to-learn. Accessed 12 June 2021.
Garvin, D. A. (1993). Building a learning organization. Harvard Business Review. https://hbr.org/1993/07/building-a-learning-organization. Accessed 12 June 2021.
Goebel, S. (2013). Senior executive learning agility development based on self-discovery: An action research study in executive coaching. Doctoral dissertation, Georgia State University. https://scholarworks.gsu.edu/bus_admin_diss/16. Accessed 18 June 2021.
Goleman, D. (1998). Working with emotional intelligence. Bantam Books.
https://www.nimblework.com/kanban/what-is-kanban/. Accessed 12 June 2021.
Illanes, P., et al. (2018). Retraining and reskilling workers in the age of automation. McKinsey & Company. https://www.mckinsey.com/featured-insights/future-of-work/retraining-and-reskilling-workers-in-the-age-of-automation. Accessed 12 June 2021.
Kelchner, J. (2019). A human approach to reskilling in the age of AI. opensource.com. https://opensource.com/open-organization/19/9/claiming-human-age-of-AI. Retrieved September 24.
Knight, M., & Wong, N. (2021). The organisational x-factor: Learning agility. Focus. https://focus.kornferry.com/leadership-and-talent/the-organisational-x-factor-learning-agility. Accessed 12 June 2021.
Lai, C. (2021). Two ways to overcome your limiting beliefs. https://medium.com/@CelineL/2-ways-to-overcome-your-limiting-beliefs-b944952afeea. Accessed 12 June 2021.
Lally, M. (2019). The 5 dimensions of learning agility (+ why it matters in leadership).
Lee, S. (2020). How to leverage technology in building learning agility. In Proceedings of the 2020 Training Industry Conference and Expo (TICE). Training Industry Magazine, Agile Learning 2020. https://trainingindustry.com/magazine/jan-feb-2020/how-to-leverage-technology-in-building-learning-agility/. Accessed 12 June 2021.
Manyika, J., & Sneader, K. (2018). AI, automation, and the future of work: Ten things to solve for. McKinsey Global Institute. https://www.mckinsey.com/featured-insights/future-of-work/ai-automation-and-the-future-of-work-ten-things-to-solve-for. Accessed 12 June 2021.
Maslow, A. H. (1943). A theory of human motivation. Psychological Review, 50, 370–396.
McClelland, D. C. (1961). The achieving society. D. Van Nostrand.
Meier, S. (2019). How AI will impact organizational structures. KUNGFU.AI. https://medium.com/kung-fu/how-ai-will-impact-organizational-structures-f970690fe5d4. Accessed 15 June 2023.
Meuse, K. P., et al. (2008). Global talent management: Using learning agility to identify high potentials around the world. The Korn/Ferry Institute.
Meuse, K. P. (2017). Learning agility: Its evolution as a psychological construct and its empirical relationship to leader success. Consulting Psychology Journal: Practice and Research, 69(4), 267–295.
Mikalef, P., et al. (2023). Examining how AI capabilities can foster organizational performance in public organizations. Government Information Quarterly, 40(2), 101797. https://doi.org/10.1016/j.giq.2022.101797
Mitchinson, A., & Morris, R. (2014). Learning about learning agility. Center for Creative Leadership White Paper.
Mukerjee, D. (2019). Agility to drive digital transformation. https://www.peoplematters.in/article/culture/agility-to-drive-digital-transformation-22757
Mukherjee, D. V., & Sujatha, R. (2020). Identity in a gig economy: Does learning agility matter? Mukt Shabd Journal, 9(6), 3610–3632.
Murphy, S. M. (2021). Learning agility and its applicability to higher education. Doctoral dissertation, Columbia University.
Page, O. (2020). How to leave your comfort zone and enter your 'growth zone'. https://positivepsychology.com/comfort-zone/. Accessed 12 June 2021.
Pal, S. K. (2022). Software engineering | extreme programming (XP). https://www.geeksforgeeks.org/software-engineering-extreme-programming-xp/. Accessed 12 June 2021.
Power, G. M. (2021). 2 ways to overcome your limiting beliefs. Medium.com. https://medium.com/@CelineL/2-ways-to-overcome-your-limiting-beliefs-b944952afeea. Accessed 12 June 2021.
Ransbotham, S., et al. (2020). Expanding AI's impact with organizational learning. Findings from the 2020 artificial intelligence global executive study and research project. Accessed 15 June 2023.
Rigby, D., Sutherland, J., & Takeuchi, H. (2016). The secret history of agile innovation. Harvard Business Review, April 20, 2016. https://hbr.org/2016/04/the-secret-history-of-agile-innovation?registration=success. Accessed 12 June 2021.
Schachtner, C. (2019). New work in the public sector?! VM Administration and Management, 25(4), 194–198. https://doi.org/10.5771/0947-9856-2019-4-194
Schachtner, C. (2021). Learning to transform by implementing AI into administrative decisions—disruptive mindset as the key for agility in the crisis. Central and Eastern European E|dem and E|gov Days, 2021, 265–272.
Sen, S. (2019). Exploring future of work, artificial intelligence and HR. peoplematters. https://www.peoplematters.in/article/hr-technology/exploring-future-of-work-artificial-intelligence-and-hr-22829. Accessed 12 June 2021.
Senge, P. M. (1990). The fifth discipline. Doubleday.
Staggs, J. (2014). Learning agility: Navigating the changing world today. Korn Ferry.
Strachan, C. (2020). How to build a culture of ownership and accountability. Grindstone. https://grindstonecapital.co.uk/how-build-culture-ownership-accountability/. Accessed 12 June 2021.
Takeuchi, H., & Nonaka, I. (1986). The new new product development game. Harvard Business Review. https://hbr.org/1986/01/the-new-new-product-development-game. Accessed 12 June 2021.
Tanner, W. (2017). How you get employees to take ownership of their work. https://medium.com/@warrentanner/heres-how-you-get-employees-to-take-ownership-over-their-work-ebe1f7ebf508. Accessed 12 June 2021.
Verlinden, N. (2022). Learning agility: What HR professionals need to know. https://www.digitalhrtech.com/learning-agility/. Accessed 15 June 2023.
Weitzman, T. (2023). Understanding the benefits and risks of using AI in business. https://www.forbes.com/sites/forbesbusinesscouncil/2023/03/01/understanding-the-benefits-and-risks-of-using-ai-in-business. Accessed 15 June 2023.
Worle, D. (2022). VUCA world: What it stands for and how to thrive in it? https://digitalleadership.com/blog/vuca-world/. Accessed 16 July 2022.
Zapanta, T. (2023). The impact of AI on business. https://www.microsourcing.com/learn/blog/the-impact-of-ai-on-business. Accessed 15 June 2023.
Chapter 10
Mind-Reading Machines: Promises, Pitfalls, and Solutions of Implementing Machine Learning in Mental Health
Urvakhsh Meherwan Mehta, Kiran Basawaraj Bagali, and Sriharshasai Kommanapalli
Abstract The central premise of implementing machines to understand minds is perhaps based on Emerson Pugh’s (in)famous quote: “If the human brain were so simple that we could understand it, we would be so simple that we couldn’t”. This circular paradox has led us to the quest for that quintessential ‘mastermind’ or ‘master machine’ that can unravel the mysteries of our minds. An important stepping stone towards this understanding is to examine how perceptrons—models of neurons in artificial neural networks—can help to decode processes that underlie disorders of the mind. This chapter focuses on the rapidly growing applications of machine learning techniques to model and predict human behaviour in a clinical setting. Mental disorders continue to remain an enigma and most discoveries, therapeutic or neurobiological, stem from serendipity. Although the surge in neuroscience over the last decade has certainly strengthened the foundations of understanding mental illness, we have just started to rummage at the tip of the iceberg. We critically review the applied aspects of artificial intelligence and machine learning in decoding important clinical outcomes in psychiatry. From predicting the onset of psychotic disorders to classifying mental disorders, long-range applications have been proposed and examined. The veridicality and implementation of these results in real-world settings will also be examined. We then highlight the promises, challenges, and potential solutions of implementing these operations to better model mental disorders.
U. M. Mehta (B) · K. B. Bagali Department of Psychiatry, National Institute of Mental Health and Neuro Sciences (NIMHANS), Bangalore, India e-mail: [email protected] S. Kommanapalli Machine Learning, Tavant Technologies, Santa Clara, CA, USA © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_10
1 The Context: Machines and Minds—A Deeper Connect?
The Merriam-Webster dictionary defines artificial intelligence (AI) as a "branch of computer science dealing with the simulation of intelligent behaviour in computers". Fittingly, the first paper regarded as the beginning of AI was published in 1950 in the British journal Mind, and it discussed the fundamental question "Can machines think?" (Turing, 1950). Since then, the science has advanced manifold, and currently AI is a broad umbrella term with several subdivisions. Artificial narrow intelligence (ANI) deals with exceptional pattern recognition abilities. As discussed further in this chapter, these have immense applications, especially in mental health, such as biomarker development using multimodal biobehavioural data from brain scans, speech/blogposts, actigraphy, genomics, and others. This type of AI is good at performing a specific or narrow task, and research suggests that most AI-related health tools being studied or developed are some form of ANI (Pennachin & Goertzel, 2007). This typically uses reactive methods that depend on the current data provided, or limited memory methods that rely on past experiences from data stored in memory. Supervised (classify or predict outcomes based on a given input) and unsupervised (transform large data sets into meaningful patterns) machine learning (ML) algorithms are the cornerstone of ANI. Artificial general intelligence (AGI), in contrast, mimics human-like behaviour; it can attempt to argue, memorize, reason, and solve issues as humans do. Hence, AGI can have the ability to perform multiple tasks, like playing a game or enacting human behaviour. Reinforcement learning is an example of AGI, where machines autonomously respond to external stimuli contingent upon the conditioning (policy) they receive through rewards and penalties. Through these properties, AI has potential applications in interventional capacities, especially in delivering various talk therapies for common disorders like depression and anxiety. This comes with the advantage of scalability and can alleviate, at least partially, some of the acute human resource shortages in mental healthcare, especially in developing countries such as India. Lastly, artificial superintelligence (ASI) can theoretically be much smarter than humans, thereby recognizing and solving problems that are not coherent or not known to exist in the human mind (Bostrom, 2014). Any form of AI has potential applications in domains such as game playing (where deep reinforcement learning is widely used), automatic speech recognition (ASR), understanding natural language (natural language processing; lately, large language models, LLMs, have been in the news for being able to accurately understand and respond to natural language conversations and context), computer vision, expert systems, and heuristic classifications. Given its wide array of problem-solving capabilities, interest in incorporating AI technology to solve healthcare-related problems has grown exponentially over the years. This is reflected in the number of papers published on AI in health care: in the year 2005, only 203 papers were published; this number grew to 12,563 in the year 2019, with psychiatry taking the fifth spot (Meskó & Görög, 2020). A search of the PubMed database with Medical Subject
[Fig. 10.1 Charting the growth of scientific publications in the field of artificial intelligence and psychiatry: annual publication count plotted against year, 1990–2020.]
Headings such as Artificial Intelligence and Psychiatry as keywords reveals an exponential rise in scientific literature beginning around 2013, which has since continued to grow (see Fig. 10.1). Optimism is further reflected in the fact that there are now schools of thought that advocate the introduction of basic concepts of AI in medical curricula (Paranjape et al., 2019). Given all this, there is a growing need for physicians and healthcare professionals to understand the potential clinical applications of AI. We may be heading towards a future where healthcare professionals and AI technology complement each other in the delivery of health services. Natural language processing (NLP), machine learning and data mining, knowledge-based systems, predictive modelling, and medical image and signal processing were the core themes at the 20th International Conference on AI in Medicine (AIME 2022) (Michalowski et al., 2022). These themes broadly encompass AI techniques and their potential clinical applications. It is also important to appreciate that these themes are not mutually exclusive but rather interdependent. For example, the area of NLP deals with interpreting natural human language, and several ML models are utilized to derive inferences and predict the words, sentences, and context of language use. Hence, while discussing the various applications of AI in mental health, we will focus on the outcome rather than the method by which that outcome is derived.
The mind is an abstract concept, having received centuries of exploration from perhaps most, if not all, scientific and philosophical disciplines. In terms of clinical settings, the mind can be conceptualized as a dynamic set of processes that can be expressed and/or inferred as a consequence of one's existence, which arises from how our brain transacts with our body and our environment. From a clinical point of view, a 'mental status examination' (MSE) is the best tool to understand the mind. This encompasses processes like thought, emotion, behaviour, perception, insight,
judgement, motivation, analytical or processing abilities, and many others. It is typically tuned to identify qualitative or quantitative aberrations in the mind and track them over time with treatment. The process of MSE is learned with clinical training and experience and is as much an art as a science. It has evolved through iterative distillations of contemporary knowledge of the 'whats', the 'hows', and the 'whys' of the mind and its functions. Over time, MSEs have been successful in identifying broad categories of psychiatric disorders, studying their natural histories, and providing fairly objective measures of changes with or without treatments. However, the MSE falls short on various fronts, as we see in the next section, in providing reliable and valid patterns of mental health conditions or disorders that can be understood from a causal perspective. Identifying what mental disorders are (valid patterns of the mind's dysfunction) is critical to understanding how and why such aberrations exist, and vice versa. As aptly (and annoyingly) put by Emerson Pugh, "If the human brain were so simple that we could understand it, we would be so simple that we couldn't!", the human mind is complex, apparently unpredictable, and arduously unmodellable. This leads us to wonder whether we can use a parallel intelligence system to fill in for us in our quest to understand ourselves: an intelligence system so complex, unpredictable, and unmodellable that it can see patterns and processes that elude our vision. If computer scientists can model programmes and machines based on the functioning of neurons and neuronal connectivity (read perceptrons and neural networks) to develop intelligent solutions, it is worth exploring whether such AI systems can model the human mind. This is particularly relevant since AI technology has started to reveal clinically meaningful insights about disease biology and management. Steps in this direction have already been taken, with AI/ML-enabled technologies being evaluated to assist MSEs (Liu et al., 2022), enabling diagnoses, prognosis, treatment response, and prediction of transition to frank illness (Grzenda et al., 2021; Ray et al., 2022; Salazar de Pablo et al., 2020).
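As a concrete aside on the perceptron idea invoked here, the following is a minimal, illustrative NumPy sketch of a single perceptron trained with the classic perceptron learning rule; the toy data and all variable names are ours, not from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two linearly separable "feature profile" clusters (illustrative only).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

w = np.zeros(X.shape[1])   # weights
b = 0.0                    # bias
lr = 0.1                   # learning rate

# Classic perceptron learning rule: nudge the weights whenever a sample is misclassified.
for epoch in range(20):
    for xi, target in zip(X, y):
        pred = int(np.dot(w, xi) + b > 0)
        update = lr * (target - pred)
        w += update * xi
        b += update

accuracy = np.mean([(np.dot(w, xi) + b > 0) == bool(t) for xi, t in zip(X, y)])
print(f"weights={w}, bias={b:.2f}, training accuracy={accuracy:.2f}")
```

Modern artificial neural networks stack many such units and replace the hard threshold with differentiable activations, but the underlying intuition of weighted evidence crossing a decision boundary is the same.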
2 The Challenge: Psychiatric Diagnoses and Mental Healthcare Delivery Here, we give a brief overview of how mental health conditions are conceptualized, diagnosed, and treated. We also highlight the challenges and pitfalls of the current approaches and how AI/ML technologies can bridge the gap.
2.1 Can We Confidently Model Brain Disorders?
In medicine, diagnosing a disorder (the process of naming and classifying a disease) is the central pillar of all practice. This process universalizes communication and permits treatment, prognostication, drug development, and other interventions to reduce
[Fig. 10.2 Pyramid of diagnostic sophistication. From base to apex: Symptoms & Signs (subjective distress and their objective indicators assessed by a clinician); Syndrome (a constellation of signs and symptoms occurring together and covarying over time); Disorder (a syndrome with known risk factors, treatments, and dysfunction); Disease (a disorder with known etiology and/or pathogenesis). Psychiatric diagnoses are marked at the syndrome level.]
the mortality and morbidity secondary to the diagnosis. There are four levels in the diagnosis pyramid, based on the increasing sophistication reached in understanding the condition (Fig. 10.2). The mental health field has had a long-standing difficulty keeping up with other branches of medicine regarding diagnosis. Heterogeneity in clinical outcomes, overlapping clinical manifestations, unstable prospective diagnoses, and limited or absent laboratory investigations supporting clinical manifestations are the main reasons for the questionable validity of psychiatric diagnoses (Robins & Guze, 1970). Limited diagnostic validity, despite fairly reliable and clinically useful diagnostic systems (American Psychiatric Association, 2013; World Health Organization, 1992), is the strongest impediment to studying the etiology, pathogenesis, and treatment of mental illnesses. This is a major knowledge gap in psychiatry despite the tremendous scientific advances in fundamental neuroscience. Advances in AI/ML can potentially narrow this gap by applying supervised (diagnostic and prognostic prediction models) and unsupervised (pattern recognition and clustering to refine outcome and biobehavioural measurements) techniques. Moving beyond traditional atheoretical and categorical diagnostic systems, several newer investigative frameworks rely on transdiagnostic and dimensional approaches to understanding mental illness. The Research Domain Criteria (Insel et al., 2010) and the Hierarchical Taxonomy of Psychopathology (Kotov et al., 2017) are illustrations of such frameworks that can be leveraged in partnership with appropriate AI/ML techniques to further the field.
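To make the contrast between the two families of techniques concrete, the sketch below applies a supervised classifier and an unsupervised clustering step to simulated symptom-scale data with scikit-learn; the data, feature dimensions, and outcome variable are invented for illustration and do not represent any study described in this chapter.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Simulated symptom-scale scores for 300 individuals (columns are hypothetical scales).
X = rng.normal(size=(300, 6))
# Hypothetical categorical diagnosis weakly driven by two of the scales.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# Supervised: predict the diagnostic label from the scales (a diagnostic prediction model).
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("5-fold accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(2))

# Unsupervised: look for latent groupings without using the label at all (pattern recognition).
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))
```

The supervised model is only as valid as the label it is trained on, which is exactly why unsupervised refinement of the measurements themselves is attractive when diagnostic validity is in question.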
2.2 Delivery of Mental Health Care to All?
The World Health Organization's 2022 report on mental health states that, on average, one in eight individuals around the world lives with a mental disorder, and that for around half the world's population there is only one psychiatrist for every 200,000 or more people. Mental disorders are among the top ten leading causes of burden worldwide (GBD 2019 Mental Disorders Collaborators, 2022). To further compound the problem, the care resources available in different countries are variable (World Health Organization, 2022). Let us take India as an exemplar
[Fig. 10.3 Current prevalence of psychiatric disorders in India (National Mental Health Survey 2015–16; data source: http://indianmhs.nimhans.ac.in): any mental disorder 10.56%, substance use disorders 22.4%, anxiety disorders 3.53%, depression 2.68%, schizophrenia 0.42%, bipolar disorder 0.3%.]
to illustrate this global mismatch. Data from the National Mental Health Survey of India, 2015–16 reveals that one in 10 (see Fig. 10.3) suffers from a mental disorder in the country (Pradeep et al., 2018). Common mental disorders like depression and anxiety contribute to the overall disability owing to their sheer numbers; nevertheless, their contact coverage with healthcare services in India remains very low (5–12%) (Patel et al., 2016). Psychotic disorders like schizophrenia, despite having a low point prevalence (3–4 per 1000), also contribute substantially to this disability. The onset of psychotic disorders during youth and early adulthood (the most productive age group for a given individual) and high rates of treatment resistance (a third have insufficient response to treatment) compound this challenge. It is more so in low- and middle-income countries, given the population growth and ageing (Charlson et al., 2016). Furthermore, even for schizophrenia, which is more easily identifiable in the community than depression or anxiety, India has a much lower contact coverage (~40–50%) than China or the western world (Patel et al., 2016). The lack of trained human resources that can take this growing challenge head on is among the many factors that widen this mental health gap (Garg et al., 2019). AI/ML approaches can help to bridge this gap. For example, prognostic models built on complex multimodal biobehavioural data can enable the early identification of potentially treatment-resistant individuals. This can enable targeted and specialized treatments in a subset of those individuals where more intensive treatment resources can be utilized. A learning healthcare system that uses an active learning cycle to enrich (more accurate and more parsimonious) existing models based on prospective feedback from the real world has application in this field. Using context-specific, trained chatbots to deliver talk therapies for
common mental disorders is another potential application of AI/ML technologies to circumvent the mental health gap in common mental disorders like depression and anxiety (Jiang et al., 2022; Vaidyam et al., 2019). However, the pragmatics, ethics, and legal liabilities of such an approach are all evolving iteratively.
3 The Opportunity: Big Data and the Intersection of Clinical and Population Neuroscience
Big data is information of massive volume, velocity, and variety, requiring specific technology and analytical methods to derive valuable and actionable applications (De Mauro et al., 2016). Today, large data, accessible in many forms at the clinical and population levels, can exponentially improve our understanding of the fundamentals of brain function and its applications in the diagnosis and treatment of mental disorders. Such an approach is based on the premise that our genome (genetic architecture) interacts with our envirome (immediate and distal environment) to shape our phenome (the sum of our phenotypic characteristics) (Paus, 2010). Therefore, if we can comprehensively measure these 'big data' points over time, we can potentially apply AI/ML techniques to identify unseen patterns and make predictions and classifications of clinical relevance. 'Data lakes' are central repositories for storing structured, semi-structured, and unstructured data. Multimodal data required for training, evaluation, and inference in modern machine learning systems would require such repositories to store information centrally.
Spatial and temporal brain structure and function can be inferred using cutting-edge neuroimaging tools like magnetic resonance imaging, radioisotope scans, and electroencephalography. Breakthroughs in genetic sequencing techniques now enable the mapping of an individual's entire genome in a reasonably scalable manner across continents. The downstream effects of genes can now also be mapped as a total expression of mRNA (transcriptomics), proteins (proteomics), and epigenetic influences on genes (e.g., methylomics) at the cellular, tissue, organ, and species levels. These data are amenable to big data analytics approaches due to their sheer volumes and variety. Yet another 'big data' stream that is amenable to informing human behaviour and mental health outcomes is the use of ecological momentary assessments, typically captured via digital phenotyping approaches. Here, moment-to-moment variation in cognition, emotions, and behaviour is captured in real time in naturalistic (non-clinical) settings (Myin-Germeys et al., 2018). The contextual pairing of cognition, mood, and behaviour with spatial location, mobility, and circadian rhythm parameters collected via smartphone or wearable sensors can provide ecologically accurate and dense time-series data (Henson et al., 2020; Rodriguez-Villa et al., 2021). Such data acquisition is now increasingly possible at unprecedented scales, given the meteoric rise in smartphone users and Internet access globally.
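As a hedged illustration of what such smartphone-derived time series can yield, the sketch below computes two behavioural summaries often described in digital phenotyping work, daily home time and location entropy, from simulated GPS samples that are assumed to have already been mapped to significant places; the column names and the simulation are ours.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Simulated phone GPS samples already assigned to significant places
# (0 = home, 1..3 = other places), one row per 5-minute epoch over one week.
epochs = pd.date_range("2024-01-01", periods=7 * 24 * 12, freq="5min")
place = rng.choice([0, 1, 2, 3], size=len(epochs), p=[0.65, 0.15, 0.12, 0.08])
df = pd.DataFrame({"timestamp": epochs, "place": place})

daily = df.groupby(df["timestamp"].dt.date)

# Home time: fraction of sampled epochs spent at the 'home' cluster each day.
home_time = daily["place"].apply(lambda s: (s == 0).mean())

# Location entropy: Shannon entropy of time spent across places (higher = more varied movement).
def entropy(s: pd.Series) -> float:
    p = s.value_counts(normalize=True).to_numpy()
    return float(-(p * np.log(p)).sum())

location_entropy = daily["place"].apply(entropy)
print(pd.DataFrame({"home_time": home_time, "location_entropy": location_entropy}).head())
```

In practice, the upstream steps (denoising raw GPS, clustering it into significant places, handling missing epochs) dominate the engineering effort; the feature computation itself is comparatively simple.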
Recent advances in computational performance, along with analytic methods like reinforcement learning, have enabled the easier analysis of such large and multimodal datasets. Dedicated national and international efforts are underway to understand the inner workings of the brain and its clinical applications across several parts of the world (Chen et al., 2019; Frégnac, 2017). Clinical neuroscience investigations help to capture the rich cognitive, affective, behavioural, social-contextual, and outcome data in affected individuals or those at high risk for mental disorders. Such extreme phenotyping is also supported by neuroimaging and the multiple 'omics' parameters, as detailed earlier, representing mediators or pathophysiological processes leading to mental disorders. Population neuroscience approaches aim to leverage high-throughput, high-fidelity neuroimaging and 'omics' assessments, paired with a refined understanding of environmental exposures, among thousands of individuals irrespective of their mental illness status to derive normative, context-dependent developmental trajectories against which clinical data can be compared (Fig. 10.4). One of the major contributions of such big data technology is that, for the first time, researchers have been able to chart the trajectory of brain growth throughout the lifespan, from intrauterine life to the age of 100. They have provided a brain chart that can be used as a reference point to track the brain growth of an individual in different years of life. Similar to how height and weight charts are used to track physical health, this brain chart hopes to inform the brain health of an individual (Bethlehem et al., 2022). Such global collaborations have several advantages: they can transcend racial, geographical, and sociocultural boundaries and attempt to understand mental illness as the multivariable phenome arising out of genomic and enviromic transactions (Uhlhaas & Wood, 2020).
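The normative-chart logic behind such comparisons can be sketched in a few lines: fit centile curves of a brain measure against age in a reference sample and express a new individual's value against the age-expected range. The data below are simulated, and the quantile-loss regressor is only one of several reasonable modelling choices.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Simulated reference sample: age (years) and a brain measure (e.g., a volume, arbitrary units).
age = rng.uniform(5, 90, 5000).reshape(-1, 1)
volume = 800 - 2.5 * age.ravel() + rng.normal(0, 40, 5000)

# Fit the 5th, 50th, and 95th centile curves with quantile loss (one model per centile).
centiles = {}
for q in (0.05, 0.5, 0.95):
    model = GradientBoostingRegressor(loss="quantile", alpha=q, n_estimators=200)
    centiles[q] = model.fit(age, volume)

# Compare one new individual against the age-expected normative range.
new_age, new_volume = np.array([[62.0]]), 585.0
lo, med, hi = (centiles[q].predict(new_age)[0] for q in (0.05, 0.5, 0.95))
print(f"expected ~{med:.0f} (5th-95th centile {lo:.0f}-{hi:.0f}); observed {new_volume:.0f}")
```

Published brain charts use far richer growth models and harmonize data across scanners and sites, but the interpretive output is the same: an individual deviation score relative to a population reference.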
4 The Approach: Natural Language Processing and Machine Learning Exemplars
To make sense of how these large datasets are used to derive meaningful information, we first discuss the relevant analytical methods available at the disposal of mental health professionals and then illustrate the feasibility and potential of these techniques in the field of mental health, spanning disease biology and early diagnosis to personalized treatments and rehabilitation.
4.1 Natural Language Processing (NLP) Natural language processing (NLP) and machine learning are overlapping subfields of AI. Machine learning is one of the many tools used in NLP. Applied computational linguistics or natural language processing is the field of artificial intelligence, which is concerned with modelling language use and its practical applications. It concerns
[Fig. 10.4 Big data sources to understand psychiatric disorders using a clinical and population neuroscience framework]
applying the ability of machines to interpret the natural language of humans. With the advent of greater computational power and less human dependence via fast-evolving machine learning technology, the field of NLP has grown. Large, annotated bodies of text (corpora or libraries) are used to train ML algorithms. These corpora serve as the gold standard for evaluating that particular language. ML algorithms that are trained with corpora or libraries generate individual rules of 'grammar' which have associated probabilities of representation for words and sentences. Speech and language are the largest 'windows into the mind'. However, being able to use this 'window' objectively has remained a challenge. The application of NLP to understanding psychiatric disorders (Corcoran & Cecchi, 2020) is through the direct study of clinical speech samples or indirectly through clinician notes typically accessed through electronic health records (EHRs). Let us take the example of schizophrenia and related psychotic disorders, where irrelevant and incoherent speech (thought disorders) are common clinical presentations. However, inferring irrelevance and incoherence can be a subjective process contingent on the linguistic proficiency of the patient and clinician, the sociocultural norms governing language use, and the subjective interpretation of the clinician. Automated NLP techniques like latent semantic analysis and structural speech analysis have now been used to (a) differentiate genetic high-risk individuals from unrelated healthy individuals (Elvevåg et al., 2010), (b) predict the onset of schizophrenia from prodromal states (Bedi et al., 2015), and also (c) differentiate affective from non-affective psychotic disorders (Mota et al., 2012).
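A minimal sketch of the coherence idea behind such analyses is shown below: represent consecutive sentences as vectors and track the cosine similarity between neighbours, with low average similarity flagging loosely connected speech. TF-IDF vectors are used here purely for simplicity; published work typically relies on richer semantic embeddings, and the transcript is invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented transcript split into sentences (a real pipeline would start from speech-to-text output).
sentences = [
    "I have been sleeping badly for the last two weeks.",
    "My thoughts keep racing when I try to rest at night.",
    "The neighbours painted their fence a strange colour.",
    "Colours can carry secret messages if you know how to read them.",
]

# Vectorize the sentences and compute similarity between each consecutive pair.
vectors = TfidfVectorizer().fit_transform(sentences)
pairwise = cosine_similarity(vectors)
coherence = [pairwise[i, i + 1] for i in range(len(sentences) - 1)]

print("consecutive-sentence similarities:", [round(c, 2) for c in coherence])
print("mean coherence:", round(sum(coherence) / len(coherence), 2))
```

The attraction of such scores is that they are computed the same way for every speaker, sidestepping some of the subjectivity described above, though they inherit the limits of whatever vector representation is used.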
Furthermore, the structural connectedness of speech in psychotic disorders is related to critical structural and functional brain connectivity parameters (Palaniyappan et al., 2019). The NLP-based dimensional characterization of symptoms from EHRs has associations with common genetic variations (McCoy et al., 2018). This opens avenues for exploring genetic determinants of human behaviour and its aberrations (psychiatric disorders) in a less biased and novel manner than traditional diagnosis-based case–control studies. These techniques have steadily found application in predicting suicidal risk from EHRs, such as discharge summaries (McCoy et al., 2016). Nevertheless, there are anticipated challenges to implementing these systems at a larger scale owing to the multiplicity of human languages and their complexities in expression, which are often culturally intertwined. It is, therefore, of paramount importance to have indigenous training datasets and local-setting-specific computational language models to be able to infer the results with a greater degree of certainty. Sentiment analysis is the use of NLP, via contextual mining, to infer affective or mood states. Its potential applications in suicide risk assessment are currently being studied (Bittar et al., 2021). Similarly, text mining, another application of NLP, is used to assess the vocational status of patients from EHRs. This can potentially aid in the objective determination of social determinants of mental health, such as occupational functioning and its relation to the stage of the illness. This approach will have particular application in recovery-oriented rehabilitation services and the prognosis of various psychiatric disorders (Chilman et al., 2021).
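As a toy illustration of the lexicon-based flavour of sentiment analysis mentioned above, the sketch below scores short, invented clinical-note snippets against a tiny, hand-made valence dictionary; real systems use validated general-purpose or clinical lexicons, or trained models, rather than a dictionary this small.

```python
# Tiny, invented lexicon: word -> valence (negative values indicate negative affect).
LEXICON = {"hopeless": -2, "worthless": -2, "sad": -1, "tired": -1,
           "calm": 1, "hopeful": 2, "improved": 1, "engaged": 1}

def sentiment_score(text: str) -> float:
    """Average valence of lexicon words found in the text (0.0 if none are found)."""
    tokens = [t.strip(".,").lower() for t in text.split()]
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

notes = [
    "Patient reports feeling hopeless and worthless since the weekend.",
    "Mood improved, sleeping better, engaged with ward activities.",
]
for note in notes:
    print(round(sentiment_score(note), 2), "|", note)
```

Even this crude approach makes the central limitation visible: negation, idiom, and clinical shorthand are invisible to a bag-of-words lexicon, which is why context-aware models are preferred for high-stakes uses such as suicide risk assessment.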
4.2 Machine Learning
Before we examine how ML can help to solve mental health problems, we must understand the differences between ML and standard statistical testing. While statistics helps to draw population-level inferences from a sample (e.g., formalizing a mechanism or verifying a hypothesis), ML identifies generalizable predictive patterns to forecast unobserved outcomes (e.g., treatment response, irrespective of how the treatment acts) (Bzdok et al., 2018). Statistical testing and ML can both be used for inference and prediction. However, when it comes to prediction, ML seems to have an advantage. ML methods are helpful when dealing with large amounts of data, with wide data (where the number of input variables exceeds the number of subjects) in contrast to long data, when data are gathered without a controlled experimental design, and when complex nonlinear interactions are expected in the collected data (a common case in mental health disorders). To summarize, ML methods seem to be advantageous when solving problems that are complex by virtue of arising from a complex, heterogeneous, interactive system (Koppe et al., 2021). A recent study reports that the FDA has approved 64 ML models to be used in clinical settings (Harish et al., 2022), but few of them address psychiatric disorders. It is necessary to transform AI/ML technology from a research tool to implement
translational models in clinical care for the purpose of psychiatric diagnosis, prognosis, and treatment planning. Several systems need to converge to take advantage of these technologies and combine them with cutting-edge neuroscience data and well-curated clinical and real-world outcome data. The feasibility of this convergence depends on science and health policymakers, stakeholders and persons with lived experience, mental health clinicians, fundamental scientists, computer engineers, and data scientists. To begin with, most ML research in general (not specific to mental health) has focussed on the development of ML models, which often are trained and tested on data collected in 'laboratory' or 'research' settings. However, there is limited focus on implementing these models to aid clinical care. This has resulted in the poor translation of the insights gained from ML technology to real-world clinical settings. Several challenges need to be addressed for this translation to occur. Data collected in real-world settings are less than ideal, with variable rates of missingness, poor structure, and varying formats, and they need human intervention, technological expertise, and infrastructural support to refine, upgrade, maintain, and sustain these automated platforms (Posoldova, 2020). In the current scenario, ML algorithms are generally either open source or proprietary and registered (Harish et al., 2022). Regulators and policymakers, therefore, need to create a conducive market in which innovative developers will continue to work and collaborate. Tested and validated models are typically deployed through machine learning operations (MLOps), a paradigm that provides a roadmap for researchers and professionals to initiate and sustain ML projects in the clinic, and to keep a pulse on the performance metrics of all models deployed in production, the datasets available for training, and the data pipelines being used to enrich those datasets. Figure 10.5 elaborates on the complex workflow process for the development of ML models.
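Before turning to specific applications, here is a hedged sketch of the minimum machinery behind such a deployment loop, train, validate, persist, and later reload a model for scoring new cases, using scikit-learn and joblib on simulated data; a real MLOps stack adds monitoring, versioning, and retraining triggers on top of this, and the file name and features below are invented.

```python
import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 10))                                   # simulated baseline features
y = (X[:, 0] - X[:, 3] + rng.normal(size=500) > 0).astype(int)   # simulated clinical outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train and validate the candidate model.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("validation AUC:", round(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]), 2))

# Persist the validated model as a reviewable artifact, then reload it to score a 'new' case.
joblib.dump(model, "risk_model_v1.joblib")
reloaded = joblib.load("risk_model_v1.joblib")
print("risk for new case:", round(reloaded.predict_proba(rng.normal(size=(1, 10)))[0, 1], 2))
```

The active learning cycle described in the text essentially wraps this loop in governance: predictions and eventual outcomes flow back as new training data, and each retrained artifact is re-validated before redeployment.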
4.3 Applications of AI in Mental Health Psychiatry can benefit from AI/ML technology in three broad areas (Ray et al., 2022; Shatte et al., 2019)—improving diagnostic validity, early detection of illness (diagnostic biomarkers), and personalized treatments (theranostic biomarkers).
4.3.1 Taxonomy, Classification, and Diagnostic Validity
First, AI/ML holds promise in the taxonomy and classification of psychiatric disorders. With its ability to offer innovative perspectives on understanding latent, verisimilitudinous patterns in large-scale, multimodal, biobehavioural data, we can better
[Fig. 10.5 Active learning cycle of machine learning operations that can support learning mental healthcare systems: problem statement and access to training data; data preparation (exploratory data analysis, feature engineering); building the ML model (train and tune, model predictions, tests of accuracy); model deployment (real-world settings, productionisation, internal and external model validation, review artifacts); and model retraining, in which feedback from the real world becomes data and the cycle repeats.]
apprise the validity of psychiatric disorders. Better elucidation of causal and pathophysiological mechanisms and the discovery of novel, neuroscience- and data-informed, treatment targets can potentially be the parallel benefits of this innovation. Seminal work using an unsupervised numerical taxonomy (k-means clustering) method in psychotic disorders (with or without mood symptoms), based on cognitive and neurophysiological data, demonstrated three biological phenotypes with unique cognitive and physiological properties irrespective of their clinical diagnoses. Moreover, these three phenotypes were found to have satisfactory replicability and temporal stability (Clementz et al., 2016, 2022). These biological phenotypes or biotypes, as they are referred to by the researchers, are potential neuroscience-informed targets for further etiological investigations and treatment discoveries. A similar approach employed resting-state functional magnetic resonance imaging (fMRI) to identify four biological phenotypes of clinical depression, one of which was also a predictor of response to a specific type of treatment (Drysdale et al., 2017). However, an attempt by an independent team of investigators to replicate these findings in an independent, albeit relatively smaller, sample was unable to show similar phenotypes (Dinga et al., 2019). These applications of unsupervised ML techniques are good illustrations of the latent potential of bringing together clinical and data science in answering challenging mental health questions. However,
the results need cautious interpretation, given the failed replication attempts in one of these studies. Yet another application of AI/ML technology towards improving taxonomy and classification is to provide objective, quantifiable metrics of human cognition, mood, and behaviour, which can then serve as templates of specific 'mental states' to be compared with biological measures. This in-depth phenotypic characterization is based primarily on digital technology—smartphones, wearable devices, the use of social media, etc. The digital or virtual world has pervaded daily activities across the world, irrespective of culture and geography. It, therefore, provides an opportunity to understand human behaviour with greater objectivity. Examples of such digitally captured real-world phenotyping avenues include the study of social media posts using NLP techniques (Du et al., 2018; Eichstaedt et al., 2018; Reece et al., 2017), passive data from built-in smartphone and wearable sensors, such as the accelerometer and the global positioning system, and active data from symptom surveys, medication adherence, and microcognitive and contextual momentary performance metrics. These enable the computation of higher-level behavioural metrics like screen time, inactive duration, entropy, significant location, home time, and living environment characteristics like green spaces and built density, depressive cognitions, and suicidal risk (Braithwaite et al., 2016; Henson et al., 2020). Such digital phenotyping technologies have the added advantage of better scalability (collecting data from a broader sample), ecological validity (data collected in real-world rather than clinic-based settings), and time density (more frequent data capture than in-person assessments). Future studies will need to examine how behavioural metrics captured from digital data are related to biological measurements of disease pathophysiology.
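Given the replication caveat raised above for unsupervised biotype solutions, the sketch below shows one simple, hedged way to probe the stability of a clustering result: cluster one half of a simulated sample, assign the other half both by the learned centroids and by an independent re-fit, and compare the two labelings with the adjusted Rand index. The data, cluster count, and interpretation thresholds are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(11)
# Simulated cognitive/physiological features with three weakly separated groups.
X = np.vstack([rng.normal(loc=c, scale=1.5, size=(150, 8)) for c in (0.0, 1.5, 3.0)])

X_a, X_b = train_test_split(X, test_size=0.5, random_state=0)

# Fit clusters in split A, then label split B two ways: by A's centroids and by an independent fit.
kmeans_a = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_a)
labels_b_from_a = kmeans_a.predict(X_b)
labels_b_fresh = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X_b)

# Agreement close to 1 suggests a replicable solution; values near 0 suggest unstable 'biotypes'.
print("adjusted Rand index:", round(adjusted_rand_score(labels_b_from_a, labels_b_fresh), 2))
```

Checks of this kind within a single dataset are, of course, no substitute for replication in an independent sample by an independent team, which is the stricter test the depression biotype work did not pass.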
4.3.2 Diagnostic Biomarkers and Early Treatment
Secondary prevention in public health deals with the early diagnosis and treatment initiation of medical disorders in subclinical or at-risk populations with the ultimate aim of reducing the potential illness-related morbidity, disability, and mortality (Kisling & Das, 2022). Considerable efforts have been made across the globe to study factors that can predict the onset of psychotic disorders like schizophrenia or cognitive disorders like dementias. Prospective studies that evaluate transitions from subclinical to clinical states, as well as cross-sectional studies comparing clinical and non-clinical groups, have both been used to this effect. The use of supervised ML algorithms is of particular relevance in interpreting how large clinical and biobehavioral data can prospectively predict the transition from ‘at risk’ or ‘subclinical’ states to full-blown schizophrenia or dementia. A review of multiple such approaches to identify predictors of transition to schizophrenia suggests that models which employ only clinical or neuropsychological data gathered at baseline do not have satisfactory model estimates; hence, they are not fit for deployment in the clinical setting. Alternatively, combining clinicians’ predictions with the best model predictions had better estimates, which could potentially be deployed in the clinic on a global scale (Rosen et al., 2021). Furthermore, demographics and medical event data
210
U. M. Mehta et al.
extracted from EHRs of matched first-episode schizophrenia individuals and healthy controls demonstrated a low-cost and parsimonious approach to the effective one-year prediction of schizophrenia using a recurrent neural network algorithm (Raket et al., 2020). These are encouraging observations, but the modest prediction estimates and the challenge of circularity (where clinical characteristics at one time epoch are used to predict clinical outcomes at another) emphasize the need to enhance the data width by capturing more proximal disease-mechanistic biomarkers like neuroimaging (Kraguljac et al., 2021). Both structural (Koutsouleris et al., 2012; Xie et al., 2022) and functional MRI (Solanes & Radua, 2022) have been extensively used to predict the emergence of schizophrenia and to differentiate schizophrenia from other psychiatric disorders. Supervised ML algorithms like support vector machines and random forests are commonly used for such applications. One of these models, which uses structural brain morphometric data, is available as an open resource (Xie et al., 2022). Machine learning benchmarks for neuroimaging studies in psychiatry have now been framed to ensure streamlined data collection, preprocessing, analysis, reporting, and interpretation (Leenings et al., 2022). This approach of using multivariate brain network features to predict mental illness is, in recent studies, referred to as the "predictome": multiple brain network-based features are incorporated into a predictive model to estimate features unique to a particular disorder (Rashid & Calhoun, 2020). Similar to the use of ML in neuroimaging data, many other modalities, such as genomics (Bousman et al., 2013), electrophysiology (Van Tricht et al., 2010), cognition (Riecher-Rössler et al., 2009), and transcriptomics (Yang et al., 2022), have been used as diagnostic biomarkers. Different approaches have been used for the prediction of transition to psychosis; however, the use of ML on a combination of these different types of data (generally referred to as multimodal data) seems to be the logical next step. The feasibility and enhanced prediction accuracy of this approach were recently demonstrated via simulated multimodal data (clinical interview, structural MRI, electrophysiology, and neurocognitive testing) used to create diagnostic biomarker models for schizophrenia (Clark et al., 2015). Such co-ordinated efforts are beginning to suggest that ML models with neuroimaging and genetic features as input perform better than models with either alone (Hu et al., 2021). With the advent of clinical big data and of multimodal fusion modelling approaches that generate embeddings, or vector representations of multimodal data and their relationships, we can expect innovative solutions to identify disease biomarkers (Steyaert et al., 2023).
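A hedged sketch of the simplest version of this multimodal fusion idea, early fusion by concatenation, is given below: simulated imaging-derived and genetic features are joined into one design matrix and a random forest's out-of-sample discrimination for a transition outcome is estimated with cross-validated AUC. The feature counts, effect sizes, and outcome definition are invented and not taken from any study cited here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 400

# Simulated modalities for n at-risk individuals.
mri_features = rng.normal(size=(n, 30))           # e.g., regional morphometric measures
genetic_features = rng.binomial(2, 0.3, (n, 50))  # e.g., risk-allele counts
X = np.hstack([mri_features, genetic_features])   # early fusion: simple concatenation

# Simulated transition outcome weakly driven by a few features from both modalities.
risk = 0.8 * mri_features[:, 0] + 0.4 * genetic_features[:, 0] + rng.normal(size=n)
y = (risk > np.quantile(risk, 0.75)).astype(int)  # ~25% 'transition' labels

clf = RandomForestClassifier(n_estimators=300, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("cross-validated AUC: %.2f +/- %.2f" % (auc.mean(), auc.std()))
```

More sophisticated fusion strategies learn modality-specific representations before combining them, but the evaluation discipline is the same: performance must be estimated on data the model has not seen, ideally from a different site or cohort.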
4.3.3 Theranostic Biomarkers and Personalized Treatments
Prognostic and Predictive Biomarkers
While prognostic biomarkers provide an estimate of the course and outcome of a disease, predictive biomarkers provide an estimate of response to a particular treatment. The emerging area of theranostics deals with selecting the treatment that could be the most appropriate and most effective for a given individual. This concept of using individual or personal biological
signals to guide treatment selection is at the heart of personalized and stratified treatment. Interestingly, the quest for efficient theranostic biomarkers relies strongly on sound knowledge of disease biology and prognosis (Nikolac Perkovic et al., 2017). While diagnostic biomarkers, closely linked to disease biology, have been discussed earlier, it is important to recognize the importance of prognosis research to guide the theranostic process. Prognosis is the risk of future health outcomes (e.g., recovery, recurrence, or resistance to treatment) in people with a given disease based on current clinical knowledge (Boushell et al., 2019). Such research includes describing the natural progression of a disorder, identifying candidate prognostic factors for clinically relevant outcomes, estimating the probability of a particular clinical outcome at an individual level based on a model with multiple prognostic factors, investigating the clinical utility of such models, and ultimately informing stratified treatment approaches to improve patient outcomes (Kent et al., 2020). Once developed, these can be deployed as learning healthcare systems that retrain the model based on new information from real-world clinical data. An illustration of such a conceivable learning healthcare system (see Fig. 10.6) can be the case of pragmatic predictive biomarkers to determine those individuals who are unlikely to respond to first- or second-line treatment in schizophrenia. As discussed above, resistance to treatment in schizophrenia is a major clinical challenge. Early identification of potential treatment resistance using pre-treatment biobehavioural characteristics can aid in more specialized and intensive treatments for such individuals. Once an effective and parsimonious model predicting treatment resistance is identified, it can be deployed in a real-world clinical setting through machine learning operations. Typically, this deployment should happen over open-source machine learning operation platforms built with models generated through open-source ML frameworks (e.g., TensorFlow, Scikit-learn, or PyTorch). Model visualizations, interpretations, and explainability can also be driven through open-source resources (Lundberg & Lee, 2017; Ribeiro et al., 2016). It is critical to incorporate explainability algorithms like SHapley Additive exPlanations (SHAP) or Local Interpretable Model-Agnostic Explanations (LIME) to understand what features are clinically relevant. With these advances, any clinical setting can use easily accessible clinical data, along with the best prediction features from neuroimaging or genetic and other laboratory data, to (a) inform early identification of treatment-resistant schizophrenia and (b) facilitate shared clinical decision-making towards improving outcomes and reducing disability. The model will also serve as a learning system that captures new patient data, stores the predictions made at the outset, receives the actual clinical outcome over time, relearns or reinforces the model on the new, updated dataset, and redeploys the reengineered algorithm.
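The sketch below illustrates, on simulated data, how an explainability step of the kind named above (SHAP here) might be attached to a treatment-resistance classifier. It assumes the third-party shap package is installed; exact return shapes differ slightly across shap versions and model types, so this is a sketch rather than a recipe, and the feature names are hypothetical.

```python
import numpy as np
import shap  # assumes the shap package is installed
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(9)
feature_names = [f"baseline_feature_{i}" for i in range(8)]  # hypothetical pre-treatment measures

X = rng.normal(size=(300, 8))
# Simulated 'treatment resistance' label weakly driven by two of the features.
y = (1.2 * X[:, 2] - 0.8 * X[:, 5] + rng.normal(size=300) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer gives per-patient, per-feature contributions (in log-odds) to the prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Rank features by mean absolute contribution across patients (a global importance view).
importance = np.abs(shap_values).mean(axis=0)
for name, value in sorted(zip(feature_names, importance), key=lambda p: -p[1])[:3]:
    print(f"{name}: {value:.3f}")
```

In a shared decision-making conversation, the per-patient contributions (rather than the global ranking shown here) are what would be surfaced to the clinician alongside the risk score.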
[Fig. 10.6 Illustration of a potential ML operation in the prediction of resistant schizophrenia (four stages in an iterative learning cycle): at-risk population (anyone presenting with a first episode of schizophrenia should be considered at risk for resistant schizophrenia); prognosis research (describe course, identify candidate prognostic markers and outcomes, build prognostic models, examine their clinical utility, inform treatment planning); implementation through a data-driven learning healthcare system to refine, improvise, and individualize parsimonious prognosis models; and parallel deliverables, namely the clinical utility of prognostic models, examining early alternative treatments using RCTs, and an improved understanding of the etiopathogenesis and trajectories to treatment resistance in schizophrenia.]
reading machines’ that can ‘mind modulate’. Mind reading here refers to computational models of information processing in the brain based on intracranial electroencephalography (iEEG) data. There are several computational approaches for this task such as encoding and decoding models. Representational similarity analysis is a type of encoding model where there is a direct comparison of stimulus features encoded in computational models with the features encoded in patterns of brain activity; decoding models take the opposite approach (Berezutskaya et al., 2022). Deep brain stimulation (DBS) is a therapy that leverages these technologies and is now being increasingly studied in treating patients with treatment-resistant depression. Closedloop deep brain stimulation is an approach where patients’ own electrophysiological activity is used to trigger stimulation only when the pathological state is detected (Bouthour et al., 2019). To estimate this personalized target, first, via invasive surgery, electrodes are placed at various sites in the brain. Here, brain activity can be recorded and modulated via iEEG and intracranial electrodes, respectively. Such direct neuromodulation offers the advantage of deriving causal associations between activity in the stimulated brain circuit and cognition or behaviour. In one such landmark study of closed-loop DBS for depression (Scangos et al., 2021), electrodes were placed at ten sites with 160 points recording the neural activity across different mood states, over a period of 10 days, seeking patterns at one or more recording sites that correlate with various mental states (neutral, sad, happy, etc.). At various stages of this study, different ML algorithms were employed. While unsupervised models enabled the objective understanding of behavioural patterns, supervised models made classifications and predictions of mood states and their change based on iEEG signals and neuromodulation with DBS. Once robust markers of neural activity sites correlated with low mood states were identified, electrodes were placed at these sites to provide therapeutic stimulation or data-informed neuromodulation. Personalized Digital Interventions The characterization of the relationship between depression-related language from social media posts and mood states (Bathina et al., 2021), in the future, may help in the development of automated interventions (such as ‘chatbots’) or suggest promising targets for psychotherapy.
Personalized Digital Interventions
The characterization of the relationship between depression-related language in social media posts and mood states (Bathina et al., 2021) may, in the future, help in the development of automated interventions (such as 'chatbots') or suggest promising targets for psychotherapy. Several studies have examined the feasibility and effectiveness of such AI-assisted chatbots in providing psychotherapy (Cameron et al., 2017; Noble et al., 2022; Singh, 2019). A recent meta-analysis examining the effectiveness and safety of using chatbots to improve mental health care illustrated three major findings (Abd-Alrazaq et al., 2020). First, chatbots have the potential to improve mental health care, especially in conditions like depression. However, the quality of the evidence (methodological rigour) was weak. Second, very few studies examined the safety of using chatbots, especially concerning privacy issues. Lastly, chatbot interventions were not useful in healthy individuals who used them primarily for improving their psychological well-being. While the scalability of such interventions is immense, challenges remain that need to be addressed via continued clinical experimentation. A related concept that has also gained traction is that of the just-in-time adaptive intervention (JITAI). This is a closed-loop personalized digital intervention responding to specific, actionable symptom-context associations inferred from active and passive smartphone data (Wang & Miller, 2020). These are not restricted to mental health settings but can be easily adapted to help those in mental distress. Smartphone applications can be trained on actionable algorithms of symptom-context associations using AI/ML techniques. Thereafter, if a symptom is detected beyond a pre-specified threshold, tailor-made interventional suggestions can be made by the application based on the context in which the symptom is being expressed.
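A minimal sketch of the JITAI trigger logic just described is given below; the symptom score, the threshold, and the context rules are invented placeholders standing in for what would in practice be a trained symptom-context model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MomentaryAssessment:
    distress: float      # 0-10 self-reported or inferred symptom intensity
    at_home: bool        # context inferred from passive sensing
    late_night: bool     # context inferred from the device clock

DISTRESS_THRESHOLD = 7.0  # pre-specified, illustrative cut-off

def suggest_intervention(m: MomentaryAssessment) -> Optional[str]:
    """Return a context-tailored suggestion only when the symptom crosses the threshold."""
    if m.distress < DISTRESS_THRESHOLD:
        return None  # below threshold: do not interrupt the user
    if m.late_night:
        return "Guided wind-down breathing exercise and a sleep hygiene prompt"
    if m.at_home:
        return "Brief behavioural activation task with a mood check-in afterwards"
    return "Grounding exercise plus the option to message a support contact"

for m in [MomentaryAssessment(4.0, True, False), MomentaryAssessment(8.5, False, True)]:
    print(m, "->", suggest_intervention(m))
```

The clinical and ethical questions raised for chatbots apply with equal force here: when to interrupt, how to escalate to a human, and how to protect the highly sensitive context data that make the tailoring possible.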
5 Caveats and Future Potentials

A major challenge in using AI/ML technology to decipher the human mind and its ailments is the issue of interpreting, understanding, and explaining emergent models (Watson, 2022). While interpretability deals with efforts to infer cause-and-effect relations between predictors and model outcomes, explainability focuses on understanding the underlying mechanisms of the model (Marcinkevičs & Vogt, 2023). Conceptual challenges, however, remain, as elucidated in other texts (Watson, 2022), and addressing them carefully and persistently is warranted for the field to move forward. Another challenge is that of implementation on a larger scale to benefit society. Computational constraints, the need for logistic expertise and infrastructural advances, and high costs limit the easy implementation of AI/ML technology at the grassroots level. Technological advances in computation, processors, and logistics, paired with the growing clinical utility of AI/ML, will perhaps drive innovations that bring these systems to the grassroots. Lastly, the ethics of implementing certain AI/ML applications in the psychiatric setting should be deliberated and evolved as we embrace such technology. These are again covered in greater detail elsewhere (Karimian et al., 2022; Wiese & Friston, 2022) and range from issues of maintaining human autonomy and privacy, through consequences of early detection, to the prevention of harm and ensuring fairness.
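One widely used practical probe of which predictors drive a fitted model is permutation importance; the sketch below illustrates it on synthetic data with scikit-learn. It is an illustration of the interpretability tooling discussed above, not the method of any particular study cited in this chapter, and all data and feature names are invented.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
# Synthetic outcome driven mainly by features 2 and 4.
y = (X[:, 2] - 0.8 * X[:, 4] + rng.normal(scale=0.5, size=300) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

# Permutation importance: how much does shuffling each predictor degrade accuracy?
result = permutation_importance(model, X, y, n_repeats=20, random_state=1)
for name, score in zip(["f0", "f1", "f2", "f3", "f4"], result.importances_mean):
    print(f"{name}: {score:.3f}")
```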
Nevertheless, the most appropriate way forward would be to acknowledge current challenges, have open discussions with stakeholders, and iteratively refine the roll-out of these applications in ways that best address the ethical challenges listed here. All stakeholders working towards improving outcomes of psychiatric disorders must recognize and leverage the potential of AI/ML. The time is ripe for a more systematic coming together of domain experts, AI/ML professionals, and stakeholders to formulate the most pertinent questions, evolve accurate and parsimonious solutions, implement the solutions at the grassroots, and relearn from real-world scenarios. Lastly, while psychiatry will benefit from the insights derived from these emerging technologies in quantifying complex human behaviour, it will, in turn, inform the sophistication of AI/ML technologies in the years to come.
References Abd-Alrazaq, A. A., Rababeh, A., Alajlani, M., Bewick, B. M., & Househ, M. (2020). Effectiveness and safety of using chatbots to improve mental health: Systematic review and meta-analysis. Journal of Medical Internet Research, 22, e16021. https://doi.org/10.2196/16021 American Psychiatric Association (Ed.) (2013). Diagnostic and statistical manual of mental disorders: DSM-5 (5th ed.). American Psychiatric Association. Bathina, K. C., ten Thij, M., Lorenzo-Luaces, L., Rutter, L. A., & Bollen, J. (2021). Individuals with depression express more distorted thinking on social media. Nature Human Behaviour, 5, 458–466. https://doi.org/10.1038/s41562-021-01050-7 Bedi, G., Carrillo, F., Cecchi, G.A., Slezak, D. F., Sigman, M., Mota, N. B., Ribeiro, S., Javitt, D. C., Copelli, M., & Corcoran, C. M. (2015). Automated analysis of free speech predicts psychosis onset in high-risk youths. npj Schizophrenia, 1. https://doi.org/10.1038/npjschz.2015.30 Berezutskaya, J., Saive, A.-L., Jerbi, K., & van Gerven, M. (2022). How does artificial intelligence contribute to iEEG research? https://doi.org/10.48550/arXiv.2207.13190 Bethlehem, R. A. L., Seidlitz, J., White, S. R., Vogel, J. W., Anderson, K. M., Adamson, C., Adler, S., Alexopoulos, G. S., Anagnostou, E., Areces-Gonzalez, A., Astle, D. E., Auyeung, B., Ayub, M., Bae, J., Ball, G., Baron-Cohen, S., Beare, R., Bedford, S. A., Benegal, V., et al. (2022). Brain charts for the human lifespan. Nature, 604, 525–533. https://doi.org/10.1038/s41586-022-045 54-y Bittar, A., Velupillai, S., Roberts, A., & Dutta, R. (2021). Using general-purpose sentiment lexicons for suicide risk assessment in electronic health records: corpus-based analysis. JMIR Medical Informatics, 9, e22397. https://doi.org/10.2196/22397 Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies, superintelligence: Paths, dangers, strategies. Oxford University Press. Boushell, L. W., Shugars, D. A., & Eidson, R. S. (2019). Patient assessment, examination, diagnosis, and treatment planning. In A.V. Ritter, L.W. Boushell & R. Walter (Eds.), Sturdevant’s art and science of operative dentistry (pp. 95–119). Elsevier. https://doi.org/10.1016/B978-0-32347833-5.00003-4 Bousman, C. A., Yung, A. R., Pantelis, C., Ellis, J. A., Chavez, R. A., Nelson, B., Lin, A., Wood, S. J., Amminger, G. P., Velakoulis, D., McGorry, P. D., Everall, I. P., & Foley, D. L. (2013). Effects of NRG1 and DAOA genetic variation on transition to psychosis in individuals at ultra-high risk for psychosis. Translational Psychiatry, 3, e251. https://doi.org/10.1038/tp.2013.23 Bouthour, W., Mégevand, P., Donoghue, J., Lüscher, C., Birbaumer, N., & Krack, P. (2019). Biomarkers for closed-loop deep brain stimulation in Parkinson disease and beyond. Nature Reviews Neurology, 15, 343–352. https://doi.org/10.1038/s41582-019-0166-4
Braithwaite, S. R., Giraud-Carrier, C., West, J., Barnes, M. D., & Hanson, C. L. (2016). Validating machine learning algorithms for twitter data against established measures of suicidality. JMIR Mental Health, 3, e4822. https://doi.org/10.2196/mental.4822 Bzdok, D., Altman, N., & Krzywinski, M. (2018). Statistics versus machine learning. Nature Methods, 15, 233–234. https://doi.org/10.1038/nmeth.4642 Cameron, G., Cameron, D. M., Megaw, G., Bond, R. B., Mulvenna, M., O’Neill, S. B., Armour, C., & McTear, M. (2017) Towards a chatbot for digital counselling. https://doi.org/10.14236/ ewic/HCI2017.24 Charlson, F. J., Baxter, A. J., Cheng, H. G., Shidhaye, R., & Whiteford, H. A. (2016). The burden of mental, neurological, and substance use disorders in China and India: A systematic analysis of community representative epidemiological studies. The Lancet, 388, 376–389. https://doi.org/ 10.1016/S0140-6736(16)30590-6 Chen, S., He, Z., Han, X., He, X., Li, R., Zhu, H., Zhao, D., Dai, C., Zhang, Y., Lu, Z., Chi, X., & Niu, B. (2019). How big data and high-performance computing drive brain science. Genomics, Proteomics & Bioinformatics, Big Data in Brain Science, 17, 381–392. https://doi.org/10.1016/ j.gpb.2019.09.003 Chilman, N., Song, X., Roberts, A., Tolani, E., Stewart, R., Chui, Z., Birnie, K., Harber-Aschan, L., Gazard, B., Chandran, D., Sanyal, J., Hatch, S., Kolliakou, A., & Das-Munshi, J. (2021). Text mining occupations from the mental health electronic health record: A natural language processing approach using records from the Clinical Record Interactive Search (CRIS) platform in south London, UK. British Medical Journal Open, 11, e042274. https://doi.org/10.1136/bmj open-2020-042274 Clark, S. R., Schubert, K. O., & Baune, B. T. (2015). Towards indicated prevention of psychosis: Using probabilistic assessments of transition risk in psychosis prodrome. Journal of Neural Transmission, 122, 155–169. https://doi.org/10.1007/s00702-014-1325-9 Clementz, B. A., Parker, D. A., Trotti, R. L., McDowell, J. E., Keedy, S. K., Keshavan, M. S., Pearlson, G. D., Gershon, E. S., Ivleva, E. I., Huang, L.-Y., Hill, S. K., Sweeney, J. A., Thomas, O., Hudgens-Haney, M., Gibbons, R. D., & Tamminga, C. A. (2022). Psychosis biotypes: Replication and validation from the B-SNIP consortium. Schizophrenia Bulletin, 48, 56–68. https:// doi.org/10.1093/schbul/sbab090 Clementz, B. A., Sweeney, J. A., Hamm, J. P., Ivleva, E. I., Ethridge, L. E., Pearlson, G. D., Keshavan, M. S., & Tamminga, C. A. (2016). Identification of distinct psychosis biotypes using brain-based biomarkers. AJP, 173, 373–384. https://doi.org/10.1176/appi.ajp.2015.14091200 Corcoran, C. M., & Cecchi, G. A. (2020). Using Language processing and speech analysis for the identification of psychosis and other disorders. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, Understanding the Nature and Treatment of Psychopathology: Letting the Data Guide the Way, 5, 770–779. https://doi.org/10.1016/j.bpsc.2020.06.004 De Mauro, A., Greco, M., & Grimaldi, M. (2016). A formal definition of Big Data based on its essential features. Library Review, 65, 122–135. https://doi.org/10.1108/LR-06-2015-0061 Dinga, R., Schmaal, L., Penninx, B. W. J. H., van Tol, M. J., Veltman, D. J., van Velzen, L., Mennes, M., van der Wee, N. J. A., & Marquand, A. F. (2019). Evaluating the evidence for biotypes of depression: Methodological replication and extension of. Neuroimage Clinical, 22, 101796. https://doi.org/10.1016/j.nicl.2019.101796 Drysdale, A. 
T., Grosenick, L., Downar, J., Dunlop, K., Mansouri, F., Meng, Y., Fetcho, R. N., Zebley, B., Oathes, D. J., Etkin, A., Schatzberg, A. F., Sudheimer, K., Keller, J., Mayberg, H. S., Gunning, F. M., Alexopoulos, G. S., Fox, M. D., Pascual-Leone, A., Voss, H. U., … Liston, C. (2017). Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nature Medicine, 23, 28–38. https://doi.org/10.1038/nm.4246 Du, J., Zhang, Y., Luo, J., Jia, Y., Wei, Q., Tao, C., & Xu, H. (2018). Extracting psychiatric stressors for suicide from social media using deep learning. BMC Medical Informatics and Decision Making, 18, 77–87. https://doi.org/10.1186/s12911-018-0632-8 Eichstaedt, J. C., Smith, R. J., Merchant, R. M., Ungar, L. H., Crutchley, P., Preo¸tiuc-Pietro, D., Asch, D. A., & Schwartz, H. A. (2018). Facebook language predicts depression in medical
records. Proceedings of the National Academy of Sciences, 115, 11203–11208. https://doi.org/ 10.1073/pnas.1802331115 Elvevåg, B., Foltz, P. W., Rosenstein, M., & DeLisi, L. E. (2010). An automated method to analyze language use in patients with schizophrenia and their first-degree relatives. J Neurolinguistics, 23, 270–284. https://doi.org/10.1016/j.jneuroling.2009.05.002 Frégnac, Y. (2017). Big data and the industrialization of neuroscience: A safe roadmap for understanding the brain? Science, 358, 470–477. https://doi.org/10.1126/science.aan8866 Garg, K., Kumar, C. N., & Chandra, P. S. (2019). Number of psychiatrists in India: Baby steps forward, but a long way to go. Indian Journal of Psychiatry, 61, 104–105. https://doi.org/10. 4103/psychiatry.IndianJPsychiatry_7_18 GBD 2019 Mental Disorders Collaborators. (2022). Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. The Lancet Psychiatry, 9, 137–150. https://doi.org/10. 1016/S2215-0366(21)00395-3 Grzenda, A., Kraguljac, N. V., McDonald, W. M., Nemeroff, C., Torous, J., Alpert, J. E., Rodriguez, C. I., & Widge, A. S. (2021). Evaluating the machine learning literature: A primer and user’s guide for psychiatrists. AJP, 178, 715–729. https://doi.org/10.1176/appi.ajp.2020.20030250 Harish, K. B., Price, W. N., & Aphinyanaphongs, Y. (2022). Open-source clinical machine learning models: Critical appraisal of feasibility, advantages, and challenges. JMIR Formative Research, 6, e33970. https://doi.org/10.2196/33970 Henson, P., Barnett, I., Keshavan, M., & Torous, J. (2020). Towards clinically actionable digital phenotyping targets in schizophrenia. npj Schizophr, 6, 1–7. https://doi.org/10.1038/s41537020-0100-1 Hu, K., Wang, M., Liu, Y., Yan, H., Song, M., Chen, J., Chen, Y., Wang, H., Guo, H., Wan, P., Lv, L., Yang, Y., Li, P., Lu, L., Yan, J., Wang, H., Zhang, H., Zhang, D., Wu, H., … Liu, B. (2021). Multisite schizophrenia classification by integrating structural magnetic resonance imaging data with polygenic risk score. Neuroimage Clinical, 32, 102860. https://doi.org/10.1016/j.nicl.2021. 102860 Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., Sanislow, C., & Wang, P. (2010). Research Domain Criteria (RDoC): Toward a new classification framework for research on mental disorders. American Journal of Psychiatry, 167, 748–751. https://doi.org/10.1176/ appi.ajp.2010.09091379 Jiang, Q., Zhang, Y. & Pian, W. (2022). Chatbot as an emergency exist: Mediated empathy for resilience via human-AI interaction during the COVID-19 pandemic. Information Processing and Management, 103074. https://doi.org/10.1016/j.ipm.2022.103074 Karimian, G., Petelos, E., & Evers, S. M. A. A. (2022). The ethical issues of the application of artificial intelligence in healthcare: A systematic scoping review. AI Ethics. https://doi.org/10. 1007/s43681-021-00131-7 Kent, P., Cancelliere, C., Boyle, E., Cassidy, J. D., & Kongsted, A. (2020). A conceptual framework for prognostic research. BMC Medical Research Methodology, 20, 172. https://doi.org/10.1186/ s12874-020-01050-7 Kisling, L. A., & Das, J. M. (2022). Prevention strategies, StatPearls [internet]. StatPearls Publishing. Koppe, G., Meyer-Lindenberg, A., & Durstewitz, D. (2021). Deep learning for small and big data in psychiatry. Neuropsychopharmacology, 46, 176–190. https://doi.org/10.1038/s41386-0200767-z Kotov, R., Krueger, R. F., Watson, D., Achenbach, T. M., Althoff, R. 
R., Bagby, R. M., Brown, T. A., Carpenter, W. T., Caspi, A., Clark, L. A., Eaton, N. R., Forbes, M. K., Forbush, K. T., Goldberg, D., Hasin, D., Hyman, S. E., Ivanova, M. Y., Lynam, D. R., Markon, K., … Zimmerman, M. (2017). The Hierarchical Taxonomy of Psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126, 454–477. https://doi.org/10. 1037/abn0000258
Koutsouleris, N., Borgwardt, S., Meisenzahl, E. M., Bottlender, R., Möller, H.-J., & Riecher-Rössler, A. (2012). Disease prediction in the at-risk mental state for psychosis using neuroanatomical biomarkers: Results from the FePsy study. Schizophrenia Bulletin, 38, 1234–1246. https://doi. org/10.1093/schbul/sbr145 Kraguljac, N. V., McDonald, W. M., Widge, A. S., Rodriguez, C. I., Tohen, M., & Nemeroff, C. B. (2021). Neuroimaging biomarkers in schizophrenia. AJP, appi.ajp.2020.2. https://doi.org/10. 1176/appi.ajp.2020.20030340 Leenings, R., Winter, N. R., Dannlowski, U., & Hahn, T. (2022). Recommendations for machine learning benchmarks in neuroimaging. NeuroImage, 257, 119298. https://doi.org/10.1016/j.neu roimage.2022.119298 Liu, Y., Xia, S., Nie, J., Wei, P., Shu, Z., Chang, J. A., & Jiang, X. (2022). aiMSE: Toward an AI-based online mental status examination. IEEE Pervasive Computing, 1–9. https://doi.org/10. 1109/MPRV.2022.3172419 Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17 (pp. 4768–4777). Curran Associates Inc., Red Hook. Marcinkeviˇcs, R., & Vogt, J. E. (2023). Interpretable and explainable machine learning: A methodscentric overview with concrete examples. Wires Data Mining and Knowledge Discovery, 13, e1493. https://doi.org/10.1002/widm.1493 Michalowski, M., Abidi, S. S. R., & Abidi, S. (2022). Artificial Intelligence in Medicine. McCoy, T. H., Castro, V. M., Hart, K. L., Pellegrini, A. M., Yu, S., Cai, T., & Perlis, R. H. (2018). Genome-wide association study of dimensional psychopathology using electronic health records. Biological Psychiatry, Impulsivity: Mechanisms and Manifestations, 83, 1005–1011. https://doi.org/10.1016/j.biopsych.2017.12.004 McCoy, T. H., Jr., Castro, V. M., Roberson, A. M., Snapper, L. A., & Perlis, R. H. (2016). Improving prediction of suicide and accidental death after discharge from general hospitals with natural language processing. JAMA Psychiatry, 73, 1064–1071. https://doi.org/10.1001/jamapsychiatry. 2016.2172 Meskó, B., & Görög, M. (2020). A short guide for medical professionals in the era of artificial intelligence. Npj Digital Medicine, 3, 1–8. https://doi.org/10.1038/s41746-020-00333-z Mota, N. B., Vasconcelos, N. A. P., Lemos, N., Pieretti, A. C., Kinouchi, O., Cecchi, G. A., Copelli, M., & Ribeiro, S. (2012). Speech graphs provide a quantitative measure of thought disorder in psychosis. PLoS ONE, 7, e34928. https://doi.org/10.1371/journal.pone.0034928 Myin-Germeys, I., Kasanova, Z., Vaessen, T., Vachon, H., Kirtley, O., Viechtbauer, W., & Reininghaus, U. (2018). Experience sampling methodology in mental health research: New insights and technical developments. World Psychiatry, 17, 123–132. https://doi.org/10.1002/wps.20513 Nikolac Perkovic, M., Nedic Erjavec, G., Svob Strac, D., Uzun, S., Kozumplik, O., & Pivac, N. (2017). Theranostic biomarkers for schizophrenia. International Journal of Molecular Sciences, 18, 733. https://doi.org/10.3390/ijms18040733 Noble, J. M., Zamani, A., Gharaat, M., Merrick, D., Maeda, N., Lambe Foster, A., Nikolaidis, I., Goud, R., Stroulia, E., Agyapong, V. I. O., Greenshaw, A. J., Lambert, S., Gallson, D., Porter, K. T., Turner, D., & Zaïane, O. R. (2022). 
Developing, implementing, and evaluating an artificial intelligence-guided mental health resource navigation chatbot for health care workers and their families during and following the COVID-19 pandemic: protocol for a cross-sectional study. JMIR Research Protocols. https://doi.org/10.2196/33717 Palaniyappan, L., Mota, N. B., Oowise, S., Balain, V., Copelli, M., Ribeiro, S., & Liddle, P. F. (2019). Speech structure links the neural and socio-behavioural correlates of psychotic disorders. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 88, 112–120. https://doi. org/10.1016/j.pnpbp.2018.07.007 Paranjape, K., Schinkel, M., Nannan Panday, R., Car, J., & Nanayakkara, P. (2019). Introducing artificial intelligence training in medical education. JMIR Medical Education, 5, e16048. https:// doi.org/10.2196/16048
Patel, V., Xiao, S., Chen, H., Hanna, F., Jotheeswaran, A. T., Luo, D., Parikh, R., Sharma, E., Usmani, S., Yu, Y., Druss, B. G., & Saxena, S. (2016). The magnitude of and health system responses to the mental health treatment gap in adults in India and China. The Lancet, 388, 3074–3084. https://doi.org/10.1016/S0140-6736(16)00160-4 Paus, T. (2010). Population neuroscience: Why and how. Human Brain Mapping, 31, 891–903. https://doi.org/10.1002/hbm.21069 Pennachin, C., & Goertzel, B. (2007). Contemporary approaches to artificial general intelligence. In Goertzel, B., Pennachin, C. (eds.), Artificial general intelligence, cognitive technologies (pp. 1–30). Springer. https://doi.org/10.1007/978-3-540-68677-4_1 Posoldova, A. (2020). Machine learning pipelines: from research to production. IEEE Potentials, 39, 38–42. https://doi.org/10.1109/MPOT.2020.3016280 Pradeep, B. S., Gururaj, G., Varghese, M., Benegal, V., Rao, G. N., Sukumar, G. M., Amudhan, S., Arvind, B., Girimaji, S., K., T., P., M., Vijayasagar, K.J., Bhaskarapillai, B., Thirthalli, J., Loganathan, S., Kumar, N., Sudhir, P., et al. (2018). National Mental Health Survey of India, 2016—Rationale, design and methods. PLoS One, 13, e0205096. https://doi.org/10.1371/jou rnal.pone.0205096 Raket, L. L., Jaskolowski, J., Kinon, B. J., Brasen, J. C., Jönsson, L., Wehnert, A., & Fusar-Poli, P. (2020). Dynamic ElecTronic hEalth reCord deTection (DETECT) of individuals at risk of a first episode of psychosis: A case-control development and validation study. The Lancet Digital Health, 2, e229–e239. https://doi.org/10.1016/S2589-7500(20)30024-8 Rashid, B., & Calhoun, V. (2020). Towards a brain-based predictome of mental illness. Human Brain Mapping, 41, 3468–3535. https://doi.org/10.1002/hbm.25013 Ray, A., Bhardwaj, A., Malik, Y. K., Singh, S., & Gupta, R. (2022). Artificial intelligence and psychiatry: An overview. Asian Journal of Psychiatry, 70, 103021. https://doi.org/10.1016/j. ajp.2022.103021 Reece, A. G., Reagan, A. J., Lix, K. L. M., Dodds, P. S., Danforth, C. M., & Langer, E. J. (2017). Forecasting the onset and course of mental illness with Twitter data. Science and Reports, 7, 13006. https://doi.org/10.1038/s41598-017-12961-9 Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should i trust you?”: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. presented at the KDD ’16: The 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135–1144). ACM. https://doi.org/10.1145/2939672.2939778 Riecher-Rössler, A., Pflueger, M. O., Aston, J., Borgwardt, S. J., Brewer, W. J., Gschwandtner, U., & Stieglitz, R.-D. (2009). Efficacy of using cognitive status in predicting psychosis: A 7year follow-up. Biological Psychiatry, 66, 1023–1030. https://doi.org/10.1016/j.biopsych.2009. 07.020 Robins, E., & Guze, S. B. (1970). Establishment of diagnostic validity in psychiatric illness: Its application to schizophrenia. American Journal of Psychiatry, 126, 983–987. https://doi.org/10. 1176/ajp.126.7.983 Rodriguez-Villa, E., Mehta, U. M., Naslund, J., Tugnawat, D., Gupta, S., Thirtalli, J., Bhan, A., Patel, V., Chand, P. K., Rozatkar, A., Keshavan, M., & Torous, J. (2021). Smartphone health assessment for relapse prevention (SHARP): A digital solution toward global mental health. Bjpsych Open, 7, e29. https://doi.org/10.1192/bjo.2020.142 Rosen, M., Betz, L. T., Schultze-Lutter, F., Chisholm, K., Haidl, T. 
K., Kambeitz-Ilankovic, L., Bertolino, A., Borgwardt, S., Brambilla, P., Lencer, R., Meisenzahl, E., Ruhrmann, S., Salokangas, R. K. R., Upthegrove, R., Wood, S. J., Koutsouleris, N., & Kambeitz, J. (2021). Towards clinical application of prediction models for transition to psychosis: A systematic review and external validation study in the PRONIA sample. Neuroscience & Biobehavioral Reviews, 125, 478–492. https://doi.org/10.1016/j.neubiorev.2021.02.032 Salazar de Pablo, G., Studerus, E., Vaquerizo-Serrano, J., Irving, J., Catalan, A., Oliver, D., Baldwin, H., Danese, A., Fazel, S., Steyerberg, E.W., Stahl, D., & Fusar-Poli, P. (2020). Implementing
precision psychiatry: A systematic review of individualized prediction models for clinical practice. Schizophrenia Bulletin, sbaa120. https://doi.org/10.1093/schbul/sbaa120 Scangos, K. W., Khambhati, A. N., Daly, P. M., Makhoul, G. S., Sugrue, L. P., Zamanian, H., Liu, T. X., Rao, V. R., Sellers, K. K., Dawes, H. E., Starr, P. A., Krystal, A. D., & Chang, E. F. (2021). Closed-loop neuromodulation in an individual with treatment-resistant depression. Nature Medicine, 27, 1696–1700. https://doi.org/10.1038/s41591-021-01480-w Shatte, A. B. R., Hutchinson, D. M., & Teague, S. J. (2019). Machine learning in mental health: A scoping review of methods and applications. Psychological Medicine, 49, 1426–1448. https:// doi.org/10.1017/S0033291719000151 Singh, O. P. (2019). Chatbots in psychiatry: Can treatment gap be lessened for psychiatric disorders in India. Indian Journal of Psychiatry. https://doi.org/10.4103/0019-5545.258323 Solanes, A., & Radua, J. (2022). Advances in using MRI to estimate the risk of future outcomes in mental health—Are we getting there? Front Psychiatry, 13, fpsyt-13-826111. https://doi.org/ 10.3389/fpsyt.2022.826111 Steyaert, S., Pizurica, M., Nagaraj, D., Khandelwal, P., Hernandez-Boussard, T., Gentles, A. J., & Gevaert, O. (2023). Multimodal data fusion for cancer biomarker discovery with deep learning. Nature Machine Intelligence, 5, 351–362. https://doi.org/10.1038/s42256-023-00633-5 Turing, A. M. (1950). I.—Computing machinery and intelligence. Mind LIX, 433–460. https://doi. org/10.1093/mind/LIX.236.433 Uhlhaas, P. J., & Wood, S. J. (Eds.) (2020). Biological, psychological and sociocultural processes in emerging mental disorders in youth. In Youth mental health: A paradigm for prevention and early intervention. The MIT Press. https://doi.org/10.7551/mitpress/13412.003.0010 Vaidyam, A. N., Wisniewski, H., Halamka, J. D., Kashavan, M. S., & Torous, J. B. (2019). Chatbots and conversational agents in mental health: a review of the psychiatric landscape. Canadian Journal of Psychiatry, 64, 456–464. https://doi.org/10.1177/0706743719828977 Van Tricht, M. J., Nieman, D. H., Koelman, J. H. T. M., van der Meer, J. N., Bour, L. J., de Haan, L., & Linszen, D. H. (2010). Reduced parietal P300 amplitude is associated with an increased risk for a first psychotic episode. Biological Psychiatry, 68, 642–648. https://doi.org/10.1016/j. biopsych.2010.04.022 Wang, L., & Miller, L. C. (2020). Just-in-the-moment adaptive interventions (JITAI): A metaanalytical review. Health Communication, 35, 1531–1544. https://doi.org/10.1080/10410236. 2019.1652388 Watson, D. S. (2022). Conceptual challenges for interpretable machine learning. Synthese, 200, 65. https://doi.org/10.1007/s11229-022-03485-5 Wiese, W., & Friston, K. J. (2022). AI ethics in computational psychiatry: From the neuroscience of consciousness to the ethics of consciousness. Behavioural Brain Research, 420, 113704. https:// doi.org/10.1016/j.bbr.2021.113704 World Health Organization. (2022). World mental health report: Transforming mental health for all. World Health Organization. World Health Organization. (1992). Tenth revision of the international classification of diseases and related health problems. Xie, Y., Ding, H., Du, X., Chai, C., Wei, X., Sun, J., Zhuo, C., Wang, L., Li, J., Tian, H., Liang, M., Zhang, S., Yu, C., & Qin, W. (2022). Morphometric integrated classification index: a multisite model-based, interpretable, shareable and evolvable biomarker for schizophrenia. Schizophr Bulletin, sbac096. 
https://doi.org/10.1093/schbul/sbac096 Yang, Q., Li, Y., Li, B., & Gong, Y. (2022). A novel multi-class classification model for schizophrenia, bipolar disorder and healthy controls using comprehensive transcriptomic data. Computers in Biology and Medicine, 148, 105956. https://doi.org/10.1016/j.compbiomed.2022. 105956
Chapter 11
AI-Based Technological Interventions for Tackling Child Malnutrition Bita Afsharinia, B. R. Naveen, and Anjula Gurtoo
Abstract The persistent malnutrition crisis among children in India remains a cause for significant concern. In the face of rapidly advancing technology, ensuring access to essential nutrients for every child becomes an achievable goal. This chapter delves into the nutritional factors contributing to impaired growth in children and proposes targeted interventions for the Indian context. Utilizing data from the Indian Demographic and Health Survey (2015–16) focusing on underprivileged children aged two to five years, three malnutrition outcome measures height-for-age, weight-forheight, and weight-for-age were calculated according to WHO standards. Binary and multinomial logistic regression models reveal three key findings. Firstly, the study emphasizes the substantial impact of child anaemia on the risk of malnutrition in various forms. As a potential solution, the study suggests the integration of new Artificial Intelligence (AI) applications, such as the Anaemia Control Management (ACM) software, to enhance routine clinical practices in managing child anaemia. Secondly, recognizing the crucial role of dietary diversity in promoting a child’s linear growth, the chapter advocates for the adoption of AI-based food and nutrient intake assessment systems. This includes platforms such as FatSecret and GoCARB for comprehensive nutritional evaluations of diets. Lastly, the results underscore the heightened risk of stunted growth in children due to the limited effectiveness of the Anganwadi/ICDS programme for lactating mothers. The integration of an AIbased virtual assistant application, such as Momby, within the Anganwadi/ICDS programme is proposed to improve access to health services and information for
B. Afsharinia Department of Management Studies, Indian Institute of Science, Bengaluru, India e-mail: [email protected] B. R. Naveen Department of Management Studies, Indian Institute of Science, Bengaluru, India e-mail: [email protected] A. Gurtoo (B) Department of Management Studies, Chairperson at Center for Society and Policy, Indian Institute of Science, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_11
mothers, particularly those facing challenges in accessing essential pre- and postnatal care. Overall, the findings emphasize the imperative need to strategically implement AI as an advanced approach to address malnutrition stemming from nutritional deficiencies and associated health issues. Keywords AI · India · Malnutrition · Underprivileged children
1 Introduction The Global Nutrition Report discloses elevated rates of diet-related diseases and mortality among children under the age of five globally (UNICEF, 2018). Nutrient deficiencies, excesses, or imbalances leading to malnutrition are prevalent in children under five years old (FAO, 2018; WHO, 2019), making them vulnerable to various health issues, including infectious diseases such as malaria, diarrhea, measles, HIV/ AIDS, and respiratory illnesses (Fenn, 2009; Schaible & Kaufmann, 2007). Malnutrition has also been linked to mental retardation (Raina et al., 2016) and has a detrimental impact on a child’s social responsiveness and emotional development (Greenberg, 1981). Additionally, child malnutrition negatively influences anthropometric measurements of weight and height (DHCS, 2016; Woodruff & Duffield, 2002). Child malnutrition poses a persistent and enduring challenge for public administration in India. Table 11.1 provides an overview of the health and nutritional status of children under five years of age during the periods 2019–20 and 2015–16, encompassing 22 states as reported by the National Family Health Survey (NFHS-4 and NFHS-5), Government of India (Ministry of Health and Family Welfare Government of India, 2020). Among various malnutrition indicators, the average rates of anaemia have increased from 51.75 to 60.54 in rural areas and 58.38 in urban areas. A detailed examination of height and weight metrics for 70,618 underprivileged children from the DHS-5 survey reveals alarming figures, with approximately 40% experiencing stunted growth, 35% being underweight, and 25% failing to meet their height-to-weight ratio (Table 11.2). This data underscores the substantial prevalence of undernutrition, specifically in terms of stunting, wasting, and underweight conditions among underprivileged children. Armed with this initial comprehension, we formulated our research objectives. Contributing to the existing body of literature, this chapter delves into a spectrum of nutritional health factors to uncover the underlying causes influencing the hindered growth of underprivileged children. It proposes context-specific interventions for children’s health, specifically addressing malnutrition. The analysis draws upon in-depth, multi-dimensional data sourced from households within the poorer and poorest segments of India. These populations are categorized based on the wealth index derived from expenditure and income measures, as classified by the Demographic and Health Survey 2015–16 (DHS). The nutritional data analysed pertain to 70,618 children aged two to five years. Binary logistic regression and multinomial
Table 11.1 Comparison of health/nutritional status of children aged under five years during 2019–20 and 2015–16 by the National Family Health Survey (NFHS-4 and NFHS-5), Government of India

For each of the 22 states and union territories covered (Andaman and Nicobar Islands, Andhra Pradesh, Assam, Bihar, Dadra and Nagar Haveli and Daman and Diu, Goa, Gujarat, Himachal Pradesh, Jammu and Kashmir, Karnataka, Kerala, Lakshadweep, Ladakh, Maharashtra, Meghalaya, Manipur, Mizoram, Nagaland, Sikkim, Telangana, Tripura, and West Bengal) and for the all-state average, the table reports the percentage of children stunted, wasted, underweight, and overweight and the prevalence of anaemia under NFHS-4 (2015–16), alongside the percentage of children stunted, wasted, underweight, and overweight and the rural and urban prevalence of anaemia under NFHS-5 (2019–20). On average across these states, anaemia increased from 51.75% (NFHS-4) to 60.54% in rural areas and 58.38% in urban areas (NFHS-5).
Table 11.2 DHS survey height and weight distribution of underprivileged children

Height for age (HAZ): Stunted 27,834 (39.4%); Normal 42,784 (60.6%)
Weight for height (WHZ): Wasted 11,859 (16.8%); Overweight 5208 (7.4%); Normal 53,551 (75.8%)
Weight for age (WAZ): Underweight 24,839 (35.2%); Overweight 4652 (6.6%); Normal 41,127 (58.2%)
logistic regression models are constructed for the analysis. The nutritional status of the children is gauged using height-for-age (HAZ), weight-for-height (WHZ), and weight-for-age (WAZ), with stunting, wasting, underweight, and overweight as the baseline. These baseline malnutrition measures adhere to the WHO child growth standards (WHO, 2006).

Moreover, the study explores the potential application of AI-based solutions to address malnutrition, focusing on three key nutritional health factors: a child's dietary diversity intake, anaemia, and the role of Anganwadi/healthcare support for lactating mothers, all of which contribute to a child's hindered growth. Firstly, the study proposes that the prevalence of anaemia and the increased likelihood of impaired growth in terms of wasting and underweight can be mitigated through the introduction of innovative AI applications such as the Anaemia Control Management (ACM) software. This software is designed to aid in anaemia management in routine clinical practices for both children and mothers. Secondly, recognizing the crucial role of dietary diversity in a child's linear growth, the chapter advocates for the use of AI-based food and nutrient intake assessment systems, such as FatSecret. These systems are invaluable for acquiring and assessing information related to food, nutrient intake, and nutritional evaluation of a diet. Thirdly, the study highlights the elevated risk of stunted growth due to the limited access of lactating mothers to the Anganwadi/ICDS programme. This challenge can be effectively addressed by leveraging an AI-based virtual assistant application named Momby within the Anganwadi/ICDS programme. Momby provides nutritional advice, educational information on essential newborn care and breastfeeding, and facilitates real-time healthcare benefits.

The remainder of the chapter is organized as follows: Sect. 2 reviews the relevant literature and existing interventions, Sect. 3 outlines the methodology, Sect. 4 describes the measures and data analysis, Sect. 5 presents the results and discusses the key findings, and the conclusion and implications derived from the study are elucidated in Sect. 6.
2 Literature Review 2.1 Existing Policies and Infrastructure to Tackle Malnutrition in India To combat malnutrition, the Government of India (GoI) has identified key health and nutrition interventions within various programmes. Notably, Mission Poshan 2.0 has been introduced to enhance nutritional content, delivery, outreach, and outcomes. This initiative focuses on cultivating practices that foster health, wellness, and immunity against diseases and malnutrition. The government implements targeted interventions, including Anganwadi Services, Pradhan Mantri Matru Vandana Yojana, and the Scheme for vulnerable children under the Integrated Child Development Services Scheme (ICDS) to address malnutrition (PIB, 2021). Recently, the Government of India has restructured the ICDS programme to create a more balanced, multi-sectoral approach to tackle the persistent challenge. The revised programme prioritizes providing supplementary foods to pregnant women, nursing mothers, and children under three years old. It also aims to enhance maternal feeding and caring practices, as well as promote immunization and growth monitoring for children. Another government initiative combating malnutrition is the National Nutrition Mission or POSHAN Abhiyaan, introduced by the Prime Minister in 2018. This mission concentrates on improving sanitation and hygiene conditions, addressing issues such as anaemia, antenatal care, and optimal breastfeeding for over 130 million children (NITI Aayog, 2022). Despite nominal reductions in malnutrition over the past decade and the implementation of various government programmes, there is a persistent need for the effective utilization of knowledge and advanced technology to address undernutrition. This is crucial, especially considering its impact on the socioeconomic development of the country. Digital solutions have the potential to address undernutrition, impacting not only children’s physical and cognitive growth but also the nation’s economic, social, and human progress (Nguyen, 2022).
2.2 Novel Technologies to Eradicate Malnutrition Digital health services, including online health services and mobile health (mHealth), encompass various technologies such as voice calls, short message service (SMS), wireless data transmission, and mobile applications. Additionally, technologies such as telehealth, telemedicine, telecare, virtual health, and digital health are being utilized to enhance healthcare delivery, addressing challenges in accessing services traditionally faced by individuals (Sudersanadas, 2021). These advancements have the potential to provide crucial support for essential healthcare needs, such as pre- and postnatal care, monitoring child growth, disease screening, and treatment support,
especially for those who have struggled to access such services in conventional settings. Moreover, the use of mobile devices and digital interventions has surged with the onset of the COVID-19 pandemic (Osei & Mashamba-Thompson, 2021). AI emerges as a cost-effective and efficient digital intervention in health care, particularly in addressing malnutrition. AI comprises computer programmes that mimic human thought processes, learning capabilities, and knowledge management by analysing relevant datasets, aiding in problem-solving and decision-making. Its applications in health and nutrition encompass personalized medical nutrition care, involving the assessment of food, nutrient intake, and nutritional evaluation (Nilsson, 2009; Sak & Suchodolska, 2021; Sudersanadas, 2021). The utilization of AI in nutritional studies dates back to the 1990s, specifically in examining the composition and authenticity of food products (Sak & Suchodolska, 2021). AI proves beneficial in understanding nutrient production, its impact on physiological functions, estimating health risks based on dietary analysis, food composition studies, research on vitamins, and developing dietary assessment mobile applications. Challenges in applying AI to food and nutrient intake include the need for sufficient monitoring data, regional gastronomic variations, and the lack of universality across different cuisines and meal patterns worldwide (Sak & Suchodolska, 2021). AI-based applications for food and nutrient assessment, such as FatSecret, have the capability to assess the calorie content of food. The AI Precision Nutrient Analysis Model, as described by Lee et al. (2022), can analyze dish ingredients and calculate nutrient intake by scrutinizing the dishes (Lee et al., 2022). Additionally, it automatically determines portion sizes through a digital data semantic analysis model. Another noteworthy AI application, Momby, serves as a virtual assistant providing nutritional advice and educational information on topics such as fetal development, bonding practices, essential newborn care, and breastfeeding. It facilitates making obstetrician and healthcare appointments and allows real-time questioning (Nguyen, 2022). The process of nutritional assessment involves the need for various data points, including demographic information, anthropometry, appetite details, food and supplement intake, changes in taste and satiety, level of physical activity, metabolic demands, data concerning physical activity, acute physiology, age, chronic health evaluation, and sequential organ failure assessment (Rajkomar et al., 2019). The widely accepted methods for assessing food and nutrient intake globally include AI-based techniques such as a twenty-four-hour recall of food intake, maintaining a food diary, and conducting a three-day food weighment survey. However, challenges in applying AI-based food and nutrient intake assessments include insufficient and inappropriate monitoring data, regional differences in gastronomy, the lack of universality across diverse cuisines and meal patterns worldwide, and even variations within regional food items from patient to patient. The utilization of AI systems for analyzing the food intake of pregnant and new mothers enables the suggestion of appropriate meal plans to enhance maternal health and reduce maternity-related health issues. Similarly, analyzing children’s food
intake aids in assessing nutrient intake, allowing the implementation of measures to ensure the proper mix of nutrients and reduce malnutrition in all its forms. An illustration of an AI-based assessment system for food and nutrient intake can be delineated into three stages. In the first stage, datasets are developed, consisting of images of various food ingredients, processed food products, nutrition information from food labels, and a nutrient composition database. The second stage involves the segmentation of images, where the system analyses the food image, dividing the standard input into segments for image analysis. Segmentation comprises three stages: classification, object detection, and segmentation (Kavita, 2021). During the classification stage, the software categorizes the image into different classes (e.g., apple, egg, bread), drawing rectangles around the classified objects for detection. In the third stage, the system recognizes the food item and estimates the portion size or quantity consumed. Algorithms developed based on the datasets enable the nutritional analysis of the meal, providing outputs such as actual food intake, actual nutrient intake, and plate waste. Considering the robust advantages of AI as one of the most effective strategies for monitoring the dietary intake and nutritional status of children and mothers, AI applications hold the potential to accelerate the eradication of malnutrition and its associated diseases (Kavita, 2021; Lee et al., 2022).
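A compressed sketch of the three-stage pipeline just described: a nutrient composition dataset (stage 1), a recognition step standing in for image segmentation and classification (stage 2), and portion-based nutrient estimation (stage 3). The food items, gram values, and per-100 g figures are hypothetical placeholders, and `recognise_items` is a stub for what would in practice be an image-analysis model.

```python
# Hypothetical per-100 g nutrient composition table (stage 1: curated datasets).
NUTRIENT_DB = {
    "rice":  {"energy_kcal": 130, "protein_g": 2.7, "iron_mg": 0.2},
    "egg":   {"energy_kcal": 155, "protein_g": 13.0, "iron_mg": 1.2},
    "apple": {"energy_kcal": 52,  "protein_g": 0.3, "iron_mg": 0.1},
}

def recognise_items(image):
    """Stage 2 placeholder: a real system would segment the plate image and
    classify each region, returning the detected items and portion sizes."""
    return [("rice", 150.0), ("egg", 50.0)]   # (food item, estimated grams)

def analyse_meal(image):
    """Stage 3: convert recognised items and portions into total nutrient intake."""
    totals = {"energy_kcal": 0.0, "protein_g": 0.0, "iron_mg": 0.0}
    for item, grams in recognise_items(image):
        per_100g = NUTRIENT_DB[item]
        for nutrient, value in per_100g.items():
            totals[nutrient] += value * grams / 100.0
    return totals

print(analyse_meal(image=None))   # {'energy_kcal': 272.5, 'protein_g': 10.55, 'iron_mg': 0.9}
```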
3 Methods 3.1 Study Design The study is a cross-sectional survey conducted using data from the Demographic and Health Survey (DHS). The DHS data, collected in India in 2015–2016, comprises comprehensive information on 2.5 million children across 1340 variables. These variables encompass aspects such as children’s health, nutrition, demographic indicators, household details, maternal characteristics, socioeconomic factors, environmental conditions, and regional distinctions within the country. Each child born within the five years preceding the survey corresponds to a unique record in the dataset, linked to interviewed women or caretakers. The primary focus of the study is to investigate the correlation between anthropometric measurements and nutritional variables specifically among economically disadvantaged children. To address missing values in the dataset, predictive mean matching is employed, replacing missing values based on the distribution of each data point. In the final analysis, data related to underprivileged children is extracted, encompassing observations on 18 variables for a total of 70,618 children aged between 2 and 5 years. The independent variables utilized in the analysis are detailed in Table 11.3. The DHS programme uses a wealth index to show differences in household characteristics. This index calculates wealth scores for a household by considering both their spending and income. These wealth scores are then categorized into five groups: poorest, poorer, middle, richer, and richest households. In this research, we merged
Table 11.3 Independent variables (variable and nature of variable, grouped by variables of interest)

Nutrients intake: Dietary Diversity Score (DDS) (numerical); vitamin A (categorical); iron pills, sprinkles, or syrup (categorical)
Child health status: health problems (fever, diarrhoea, cough) (categorical); anaemia level (categorical)
Maternal health status: maternal anaemia level (categorical); maternal anthropometry of height, weight, and arm circumference (numerical)
Social welfare programme (such as Anganwadi/ICDS): Anganwadi benefits during pregnancy (categorical); child received benefits from Anganwadi/ICDS centre (categorical); mother received benefits while breastfeeding from Anganwadi/ICDS centre (categorical)
the poorest and poorer households to form a category representing underprivileged children.
4 Measures 4.1 Dependent Variables Following the guidance provided by the World Health Organization (WHO) in 2006, we assess height-for-age (HAZ), weight-for-height (WHZ), and weight-for-age (WAZ). The dataset is then categorized as follows: • Stunting: Children with a height-for-age Z-score (HAZ) below − 2 standard deviations (SD). • Wasting and overweight: Global acute malnutrition (GAM) or wasting is identified in children with a weight-for-height Z-score (WHZ) below − 2 SD. Children with a weight-for-height Z-score above + 2.0 SD, according to the WHO Child Growth Standards, are classified as overweight. • Underweight and overweight for age: Underweight is determined in children with a weight-for-age Z-score (WAZ) below − 2 SD, while severe underweight is defined in children with a WAZ below − 3 SD. Children with a weight-for-age Z-score above + 2.0 SD, according to the WHO Child Growth Standards, are considered overweight for their age.
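The cut-offs listed above translate directly into a small classification helper. The sketch below assumes the z-scores (HAZ, WHZ, WAZ) have already been computed against the WHO 2006 growth standards; the function and variable names are illustrative.

```python
def classify_haz(haz):
    """Height-for-age: stunted below -2 SD, otherwise normal."""
    return "stunted" if haz < -2.0 else "normal"

def classify_whz(whz):
    """Weight-for-height: wasted below -2 SD, overweight above +2 SD."""
    if whz < -2.0:
        return "wasted"
    if whz > 2.0:
        return "overweight"
    return "normal"

def classify_waz(waz):
    """Weight-for-age: (severely) underweight below -2 (-3) SD, overweight above +2 SD."""
    if waz < -3.0:
        return "severely underweight"
    if waz < -2.0:
        return "underweight"
    if waz > 2.0:
        return "overweight for age"
    return "normal"

# Example: a child with HAZ = -2.4, WHZ = -1.1, WAZ = -2.2
print(classify_haz(-2.4), classify_whz(-1.1), classify_waz(-2.2))
# -> stunted normal underweight
```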
The Dietary Diversity Score (DDS) is computed by summing the instances of unique food group consumption over the last 24 h, using nutritional variables. The considered food groups for DDS calculation include cereals/roots, vegetables, fruits, legumes/lentils, meat/fish/egg, and milk/dairy products. The DDS takes into account both the presence and quantity of any food group consumed on that day (KrebsRathnayake et al., 2012; Smith et al., 1987).
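A minimal sketch of the DDS computation described above, assuming the 24-hour recall has already been coded as one indicator column per food group; the data frame and column names are hypothetical.

```python
import pandas as pd

FOOD_GROUPS = ["cereals_roots", "vegetables", "fruits",
               "legumes_lentils", "meat_fish_egg", "milk_dairy"]

# Hypothetical 24-hour recall data: 1 if any food from the group was consumed.
children = pd.DataFrame(
    {"cereals_roots":   [1, 1, 0],
     "vegetables":      [1, 0, 0],
     "fruits":          [0, 0, 0],
     "legumes_lentils": [1, 0, 0],
     "meat_fish_egg":   [0, 1, 0],
     "milk_dairy":      [1, 1, 0]},
    index=["child_a", "child_b", "child_c"])

# DDS = number of distinct food groups consumed in the last 24 hours (0-6).
children["DDS"] = children[FOOD_GROUPS].sum(axis=1)
print(children["DDS"])   # child_a: 4, child_b: 3, child_c: 0
```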
4.2 Data Analysis We utilized the Statistical Package for the Social Sciences (SPSS) for our data analysis. Out of the 18 variables identified from the DHS dataset, a bivariate correlation analysis was employed to identify and remove variables with significant correlations, aiming to mitigate issues related to multi-collinearity. Binary logistic regression and multinomial logistic regression (MLR) models were employed to assess the relationship between dependent and independent variables. The Binary Logistic Regression model examines the impact of independent variables on height for age, with two distinct categories (Stunted—0 and Normal—1), as per the 2006 WHO Child Growth Standards. Multinomial Logistic Regression (MLR) models investigate the influence of independent variables on two dependent variables: weight for height and weight for age. MLR is suitable here as the dependent variable of weight for age has three distinct categories (underweight, overweight for age, and normal), while the dependent variable weight for height also has three distinct categories (wasted, overweight for height, and normal), as defined by the World Health Organization in 2006. In both MLR models, the category “Normal” is designated as the reference category for the dependent variable. In the context of the present study, concerning the weight for height variable, we have opted not to present overweight (for height) separately, as the outcomes are mostly similar to overweight (for age). As a result, our focus is primarily on the wasted and normal categories.
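The chapter's models were estimated in SPSS; the sketch below reproduces the same model structure (a binary logistic regression for height-for-age and a multinomial logistic regression with "normal" as the reference category) in Python on synthetic data, purely for illustration. The predictor names, simulated coefficients, and sample size are assumptions, not values from the DHS data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500

# Hypothetical predictors standing in for the DHS variables.
df = pd.DataFrame({
    "dds": rng.integers(0, 7, n),                 # dietary diversity score
    "child_anaemic": rng.integers(0, 2, n),       # 1 = anaemic
    "anganwadi_benefit": rng.integers(0, 2, n),   # 1 = received benefits
})
X = sm.add_constant(df)

# Binary outcome: height-for-age coded 1 = normal, 0 = stunted.
linear = 0.15 * df["dds"] - 0.5 * df["child_anaemic"] + 0.3 * df["anganwadi_benefit"]
haz_normal = rng.binomial(1, 1 / (1 + np.exp(-linear)))
binary_model = sm.Logit(haz_normal, X).fit(disp=False)
print(np.exp(binary_model.params))        # odds ratios, Exp(B)

# Three-category outcome: weight-for-age coded 0 = normal (reference),
# 1 = underweight, 2 = overweight; MNLogit uses the first category as reference.
waz_cat = rng.integers(0, 3, n)
mnl_model = sm.MNLogit(waz_cat, X).fit(disp=False)
print(np.exp(mnl_model.params))           # Exp(B) per non-reference category
```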
5 Results 5.1 Descriptive Statistic Child characteristics are presented in Table 11.4, revealing that approximately 21% of children are stunted, 23% are wasted, 22% are underweight, and 22% are overweight, all experiencing health issues such as fever, diarrhea, and cough. Furthermore, nearly 28% of stunted children, 25% of wasted children, 28% of underweight children, and 16% of overweight children exhibit moderate to severe levels of anemia. The descriptive findings of the Dietary Diversity Score (DDS) indicate that almost
30% of malnourished children have not consumed all the essential food groups, including cereals/roots, vegetables, fruits, legumes/lentils, meat/fish/egg, and milk/dairy products (DDS ranging from 0 to 6).

Table 11.4 Descriptive statistics (values are the percentage of children within each column group)

                                          Height for age          Weight for height        Weight for age
Variables / characteristics               Stunted     Normal      Wasted      Normal       Underweight  Overweight  Normal
                                          (n=27,834)  (n=42,784)  (n=11,859)  (n=53,551)   (n=24,839)   (n=4652)    (n=41,127)
Health problems (fever, diarrhoea,
cough): yes                               21.80       21.90       23.00       21.60        22.10        22.20       21.70
Child's anaemia level
  Severe                                  1.70        0.80        1.20        1.20         1.70         1.00        0.90
  Moderate                                28.10       18.80       24.50       22.60        27.80        15.90       20.00
  Not anaemic                             42.50       55.50       44.00       50.50        41.80        66.60       53.70
Dietary Diversity Score (DDS)
  Consumption of none (0) major groups    31.20       28.40       29.20       29.80        30.90        26.80       29.00
  Consumption of 1 major food group       19.90       19.70       20.00       19.90        19.90        18.90       19.90
  Consumption of 2 major food groups      17.90       19.30       18.80       18.60        18.20        19.60       19.00
  Consumption of 3 major food groups      14.00       14.70       14.50       14.30        14.00        15.40       14.60
  Consumption of 4 major food groups      9.00        9.40        9.20        9.20         9.10         10.30       9.20
  Consumption of 5 major food groups      4.70        5.00        4.80        4.90         4.70         5.20        5.00
  Consumption of 6 major food groups      3.30        3.40        3.50        3.30         3.20         3.60        3.40
5.2 Analysis Table 11.5 presents the outcomes of the regression models, elucidating the logarithmic relationship between stunted, wasted, underweight, and overweight growth in children and the contributing factors influencing the probability of impaired growth. The presence of health problems such as fever, diarrhoea, and cough significantly impact the likelihood of wasted growth in children, with an odds ratio less than 1.0 (p-value < 0.01, 95% CI 0.88–0.97). These symptoms, commonly associated with infectious diseases, lead to nutrient malabsorption and loss of appetite, resulting in nutrient losses and weight reduction in a child’s body (Katona & Katona-Apte, 2008; Mayo Clinic, 2021). Consequently, an increase in the incidence or exacerbation of malnutrition, such as stunted growth in a child, is observable (Dinku et al., 2020; National Research Council (US), 1985). The prevalence of anaemia at all levels (severe, moderate, and mild) is significantly linked to an increased likelihood of childhood malnutrition in various forms, including wasting, underweight, stunting, and overweight, with odds ratios either greater than or less than 1.0 (p-value < 0.01, 95% CI). The results indicate that a one-unit increase in the dietary diversity (DDS) of a child results in a reduction in the likelihood of stunted growth, with an odds ratio greater than 1.0 (p-value < 0.05, 95% CI, 1.00–1.01). Additionally, the use of iron pills, sprinkles, or syrup is significantly associated with a reduced likelihood of stunted growth in children (p-value < 0.01, 95% CI, 0.87–0.95). Maternal anaemia at severe, moderate, and mild levels also has a substantial impact on the likelihood of stunted growth and underweight in children, with odds ratios either greater than or less than 1.0 (p-value < 0.05 or < 0.01, 95% CI). Finally, if mothers do not receive benefits from the Anganwadi/ICDS center, there is a higher likelihood for children to experience stunted growth, with an odds ratio greater than 1.0 (p-value < 0.05, 95% CI, 0.99–1.09).
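The odds ratios quoted above are obtained by exponentiating the regression coefficients, Exp(B) = e^B, and their confidence intervals by exponentiating B ± 1.96 × SE. The sketch below works this through for the health-problems coefficient on wasted growth; the standard error of 0.025 is an illustrative assumption, not a value taken from Table 11.5.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Convert a logistic-regression coefficient and its standard error
    into an odds ratio with a 95% confidence interval on the Exp(B) scale."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# B = -0.070 for health problems (fever, diarrhoea, cough) on wasted growth,
# as quoted in the text; the standard error of 0.025 is illustrative only.
or_, lo, hi = odds_ratio_ci(-0.070, 0.025)
print(f"Exp(B) = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")   # about 0.93 (0.89-0.98)
```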
Table 11.5 Regression analysis of children nutritional health status

The table reports B coefficients, odds ratios Exp(B), and 95% confidence intervals for Exp(B) for the stunted outcome (binary logistic regression) and for the wasted, underweight, and overweight outcomes (multinomial logistic regression with "normal" as the reference category), across predictors grouped under the child's health status (health problems; child's anaemia level), the child's nutrient intake (Dietary Diversity Score; vitamin A in the last 6 months; iron pills, sprinkles, or syrup), maternal health status (maternal anaemia level; maternal height, weight, and arm circumference), and the social welfare programme (Anganwadi/ICDS benefits during pregnancy, benefits received by the child, and benefits received while breastfeeding). Superscript a denotes the reference level; CI indicates the confidence interval; significance is reported at * p < 0.1, ** p < 0.05, and *** p < 0.01. The coefficient estimates discussed in the text are drawn from this table.

5.3 Discussion

In the context of children's impaired growth, three prominent nutritional factors emerge as significant contributors: the child's anaemia, dietary diversity intake, and the benefits lactating mothers derive from the Anganwadi/ICDS programme. Recognizing these factors' impact on the likelihood of malnutrition, we delve into AI-based interventions to comprehend and predict the intricate, nonlinear relationships between nutrition-related data and health outcomes. The child's health condition concerning anaemia across all severity levels (mild, moderate, and severe) significantly influences wasted/thinness and underweight growth. Nutritional anaemia, characterized by an abnormally low blood haemoglobin level due to deficiencies in essential nutrients (iron, folic acid, and vitamin B12), obstructs both mental and physical growth in children, resulting in developmental
236
B. Afsharinia et al.
delays or underweight conditions (Teji Roba et al., 2016). In a malnourished child, body fat and muscle reserves are depleted to maintain essential functions, leading to weight loss or thinness (Agarwal, 2019). AI technologies have been extensively employed in exploring anaemia control. For example, the Anaemia Control Management (ACM) software and the Model Predictive Control (MPC) approach have been applied to manage anaemia in haemodialysis patients, demonstrating effective anaemia control (Barbieri et al., 2016; Brier & Gaweda, 2016). Therefore, the integration of AI technologies into anaemia control shows promise and has the potential to reduce the risk of impaired child growth resulting from anaemia.

Moreover, the findings indicate that the probability of a child experiencing normal linear growth (height for age) rises with an augmentation in dietary diversity (DDS). DDS serves as an indicator of the consumption of a variety of nutrients, assessing the nutritional adequacy of a child's diet (Greenberg, 1981; Steyn et al., 2006), consequently enhancing the probability of normal linear growth in children (Dinku et al., 2020). Recently, AI-based dietary assessment systems, such as FatSecret and Im2Calories for estimating food calorie content (Myers et al., 2015; Sudersanadas, 2021), and GoCARB for carbohydrate (CHO) estimation (Vasiloglou et al., 2018), have been proposed. These systems follow a three-step process: (1) segmentation of food items; (2) recognition of food items; and (3) estimation of food volume. Consequently, the content of individual nutrients can be computed from the food nutrient database, offering a reliable and straightforward solution for assessing nutrient intake and preventing malnutrition (Côté & Lamarche, 2021).

Notably, the findings underscore that children of lactating mothers who do not receive benefits from Anganwadi/ICDS centres face a higher risk of impaired linear growth. The Anganwadi/ICDS programme, by providing supplementary nutrients during breastfeeding, facilitates the transmission of essential nutrients to both mother and baby (Lamberti et al., 2011). However, barriers such as infrastructural problems, lack of utilization due to poor awareness, and monitoring issues hinder the effective implementation of this programme, jeopardizing the nutrient requirements of vulnerable populations and increasing the risk of stunted growth among children (PEO, 2011). Recent research highlights the utilization of AI technologies in targeted interventions within the Anganwadi programme. This involves accurate assessment of nutritional status, continuous monitoring, and prompt management of various maternal health issues to enhance maternal health throughout different phases of pregnancy, childbirth, and postpartum (Khan et al., 2022). As an example, an AI implementation such as Momby functions as a virtual assistant providing supplementary consultations. This approach allows healthcare professionals to extend their reach to more mothers, overcoming constraints related to space and time. In Vietnam, the Momby app has played a pivotal role in promoting breastfeeding-friendly health systems, extending from hospitals to homes. It has also documented the economic and social implications of not breastfeeding while working towards aligning maternity protection policies with workplace breastfeeding support programmes (Nguyen, 2022).
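The three-step dietary assessment process described above ends with a nutrient computation from a food nutrient database. The sketch below illustrates only that final step under simplified assumptions: the food-composition values, food-group scheme, and function names are hypothetical and are not taken from FatSecret, Im2Calories, GoCARB, or any official food-composition database.

```python
# Minimal sketch of the final step of an AI dietary-assessment pipeline:
# after food items and portion sizes have been recognized (steps 1-3 in the
# text), nutrient totals and a simple dietary diversity count are computed
# from a food-composition lookup table. All table values are illustrative.
from typing import Dict, List, Tuple

# nutrient content per 100 g: (energy kcal, protein g, iron mg), plus food group
FOOD_TABLE: Dict[str, Tuple[Tuple[float, float, float], str]] = {
    "rice":    ((130.0, 2.7, 0.2), "grains"),
    "lentils": ((116.0, 9.0, 3.3), "legumes"),
    "spinach": ((23.0, 2.9, 2.7), "vegetables"),
    "egg":     ((155.0, 13.0, 1.8), "eggs"),
    "banana":  ((89.0, 1.1, 0.3), "fruits"),
}

def assess_meal(recognized: List[Tuple[str, float]]) -> Dict[str, float]:
    """recognized: list of (food item, estimated grams) from the vision model."""
    energy = protein = iron = 0.0
    groups = set()
    for item, grams in recognized:
        (kcal, prot, fe), group = FOOD_TABLE[item]
        factor = grams / 100.0
        energy += kcal * factor
        protein += prot * factor
        iron += fe * factor
        groups.add(group)
    return {
        "energy_kcal": round(energy, 1),
        "protein_g": round(protein, 1),
        "iron_mg": round(iron, 2),
        "food_groups_consumed": len(groups),  # crude diversity indicator
    }

# Example: hypothetical output of the segmentation/recognition/volume steps
print(assess_meal([("rice", 150), ("lentils", 80), ("spinach", 50)]))
```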
6 Conclusion and Policy Implications

The primary objective of this study is to investigate the significant elements contributing to a child's impaired growth, encompassing stunted, wasted, underweight, and overweight conditions. The results underscore three major nutritional factors: (1) the child's health status, specifically regarding anaemia, (2) dietary diversity intake, and (3) the limited effectiveness of social welfare programmes such as Anganwadi/ICDS in benefiting lactating mothers. These factors play a pivotal role in impeding the growth of children in economically vulnerable communities in India. Based on these key findings, the study proposes AI-based empirical health and nutrition interventions, offering valuable insights for policymakers in addressing child malnutrition.

Firstly, in mitigating the risk of impaired growth associated with a child's anaemia, AI technologies have demonstrated efficacy and promise as robust tools to enhance the diagnosis and treatment of various diseases. AI techniques, exemplified by the Anaemia Control Management (ACM) software, play a crucial role in predicting optimal doses of medication (e.g., darbepoetin) and injections (e.g., iron sucrose) used for treating iron deficiency anaemia (insufficient iron levels in the blood). This facilitates the attainment of target haemoglobin (Hb) and ferritin levels, providing prescription suggestions. The application of ACM also enables healthcare professionals to administer personalized intravenous (IV) iron and erythropoiesis-stimulating agent (ESA) therapy. Additionally, treatment adherence to the standard of care is retrospectively documented for the same anaemic patients during the trial's retrospective period, allowing patients to serve as their own control. Consequently, the integration of AI technologies has proven to be practical and supportive in enhancing child health outcomes.

The utilization of AI in the diagnostic process aids medical specialists in enhancing both the accuracy and efficiency of diagnostics, thereby facilitating the delivery of emergent digitalized healthcare services. Furthermore, AI-based health applications present numerous opportunities, particularly for underprivileged communities with limited resources and expertise, serving as a catalyst to ensure universal access to high-quality and affordable health care for all. However, it is crucial to recognize that while AI holds significance in addressing malnutrition, its implementation must be carefully integrated into the nutritional health strategy. Failing to do so may result in AI exacerbating public health issues in a country already grappling with substantial challenges and urgencies.

Secondly, addressing the critical role of dietary diversity intake in combating stunted growth among children provides an opportunity to enhance nutrition by obtaining information on diet intake and interpreting it within the context of an individual's health. There is evidence to suggest that AI-driven dietary assessment systems could serve as a crucial tool in improving dietary diversity and increasing the intake of micronutrients in children. Recommending the use of AI-based dietary assessment systems, including FatSecret, Im2Calories, and GoCARB, is essential for monitoring the quantity of food consumed, such as calories, by children in long-term
care settings. This approach ensures that children meet specific nutrient requirements, thereby reducing the likelihood of impaired growth. While these interventions hold immense potential in effectively addressing malnutrition in children, policymakers need to devise and implement these strategies thoughtfully. This is particularly crucial as a significant number of children in underprivileged communities reside in rural areas where existing interventions may not have reached.

Nevertheless, social and community programmes modelled after ICDS or Anganwadi have substantial potential to reach impoverished communities successfully. To achieve this ambitious goal, these programmes should be adjusted to integrate advanced nutritional approaches, including AI. Anganwadi workers face challenges in assessing the nutritional status of children, including insufficient skills, manual data management, and time-consuming processes. AI models within the Anganwadi Child Growth Monitor project, such as Momby, address these challenges, providing timely insights and improving access to health services for pregnant and new mothers.

Anganwadi workers are responsible for evaluating the nutritional status of approximately 40 to 60 children, aged between six months and five years, in their designated areas of intervention. However, numerous challenges hinder the effectiveness of their assessments. Firstly, research indicates that many of these workers lack the necessary skills or training, leading to inaccuracies in the data they collect. Secondly, the manual data management process poses a significant challenge, given India's vast population. Measurements are recorded on paper, stored in logbooks, and later transferred to spreadsheets. This manual approach is time-consuming and susceptible to human errors, resulting in delayed insights for addressing malnutrition in children.

To address these challenges, AI models, such as the smartphone application powered by Microsoft Azure and Momby, have been introduced in the Anganwadi Child Growth Monitor project. For example, AI applications such as Momby have the potential to enhance access to health services and information for pregnant and new mothers who traditionally face difficulties in accessing crucial pre- and postnatal care. Policymakers should strive to extend the reach of AI models to the entire below-poverty-line population of the country. This expansion would empower health workers to identify and deliver timely care to children struggling with chronic malnutrition.

Acknowledgements The authors express their gratitude to the United States Agency for International Development (USAID) and the Ministry of Health and Family Welfare, Government of India, for granting access to the Demographic and Health Survey (DHS), also referred to as NFHS-4, conducted in India. This research has received funding through a grant from the Akshaya Patra Foundation, India, under grant number SP/APRL-19-0001.
References

Agarwal, K. N. (2019). Impact of maternal and early life undernutrition/anemia on mental functions. Acta Scientific Paediatrics, 2(2), 8–14.
Barbieri, C., Molina, M., Ponce, P., Tothova, M., Cattinelli, I., Ion Titapiccolo, J., Mari, F., Amato, C., Leipold, F., Wehmeyer, W., Stuard, S., Stopper, A., & Canaud, B. (2016). An international observational study suggests that artificial intelligence for clinical decision support optimizes anemia management in hemodialysis patients. Kidney International, 90(2), 422–429. https://doi.org/10.1016/j.kint.2016.03.036
Brier, M. E., & Gaweda, A. E. (2016). Artificial intelligence for optimal anemia management in end-stage renal disease. Kidney International, 90(2), 259–261. https://doi.org/10.1016/j.kint.2016.05.018
Côté, M., & Lamarche, B. (2021). Artificial intelligence in nutrition research: Perspectives on current and future applications. Applied Physiology, Nutrition, and Metabolism, 47(1), 1–8. https://doi.org/10.1139/APNM-2021-0448
DHCS. (2016). Anthropometric measurements. In California Department of Health Care Services, Systems of Care Division Child Health and Disability Prevention Program, Health Assessment Guidelines (Vol. 53, Issue 1358). https://doi.org/10.1126/science.53.1358.20
Dinku, A. M., Mekonnen, T. C., & Adilu, G. S. (2020). Child dietary diversity and food (in)security as a potential correlate of child anthropometric indices in the context of urban food system in the cases of north-central Ethiopia. Journal of Health, Population and Nutrition, 39(1), 1–11. https://doi.org/10.1186/s41043-020-00219-6
FAO. (2018). A review of studies examining the link between food insecurity and malnutrition. Technical Paper. http://www.fao.org/3/CA1447EN/ca1447en.pdf
Fenn, B. (2009). Malnutrition in humanitarian emergencies. WHO. https://www.who.int/diseasecontrol_emergencies/publications/idhe_2009_london_malnutrition_fenn.pdf
Greenberg, G. (1981). Unstable emotions of children tied to poor diet. The New York Times Archives, 1. https://www.nytimes.com/1981/08/18/science/unstable-emotions-of-children-tied-to-poor-diet.html
Katona, P., & Katona-Apte, J. (2008). The interaction between nutrition and infection. Clinical Infectious Diseases, 46(10), 1582–1588. https://doi.org/10.1086/587658
Kavita, M. S. (2021). Application of artificial intelligence on modeling and optimization.
Khan, M., Khurshid, M., Vatsa, M., Singh, R., Duggal, M., & Singh, K. (2022). On AI approaches for promoting maternal and neonatal health in low resource settings: A review. Frontiers in Public Health, 10, 1–23. https://doi.org/10.3389/fpubh.2022.880034
Krebs-Smith, S. M., Smiciklas-Wright, H., Guthrie, H. A., & Krebs-Smith, J. (1987). The effects of variety in food choices on dietary quality. Journal of the American Dietetic Association, 87(7), 897–903. https://europepmc.org/article/med/3598038
Lamberti, L. M., Fischer Walker, C. L., Noiman, A., Victora, C., & Black, R. E. (2011). Breastfeeding and the risk for diarrhea morbidity and mortality. BMC Public Health, 9(2), 171–174.
Lee, H., Huang, T., Yen, L., Wu, P., Chen, K., Kung, H., Liu, C., & Hsu, C. (2022). Precision nutrient management using artificial intelligence based on digital data collection framework. Applied Sciences, 12(9), 4167. https://doi.org/10.3390/app12094167
Mayo Clinic. (2021). Infectious diseases: Symptoms and causes. Mayo Clinic. https://www.mayoclinic.org/diseases-conditions/infectious-diseases/symptoms-causes/syc-20351173
Ministry of Health and Family Welfare, Government of India. (2020). National Family Health Survey-5, 2019–21 (Vol. 361).
Myers, A., Johnston, N., Rathod, V., Korattikara, A., Gorban, A., Silberman, N., Guadarrama, S., Papandreou, G., Huang, J., & Murphy, K. (2015). Im2Calories: Towards an automated mobile vision food diary. Proceedings of the IEEE International Conference on Computer Vision, 1233–1241. https://doi.org/10.1109/ICCV.2015.146
National Research Council (US). (1985). Nutritional consequences of acute diarrhea. In Nutritional management of acute diarrhea in infants and children. National Academies Press (US). https://www.ncbi.nlm.nih.gov/books/NBK219100/
Nguyen, L. (2022). How AI is transforming maternal health care in Vietnam. Think Global Health. https://www.thinkglobalhealth.org/article/how-ai-transforming-maternal-health-care-vietnam
Nilsson, N. J. (2009). The quest for artificial intelligence: A history of ideas and achievements. Cambridge University Press. https://doi.org/10.1017/CBO9780511819346
NITI Aayog. (2022). Poshan Abhiyaan. https://www.niti.gov.in/
Osei, E., & Mashamba-Thompson, T. P. (2021). Mobile health applications for disease screening and treatment support in low- and middle-income countries: A narrative review. Heliyon, 7(3), e06639. https://doi.org/10.1016/j.heliyon.2021.e06639
PEO. (2011). Evaluation study on integrated child development schemes (ICDS) (Vol. 1). Programme Evaluation Organisation, Planning Commission, Government of India, New Delhi. http://planningcommission.nic.in/reports/peoreport/peoevalu/peo_icds_v1.pdf
PIB. (2021). Steps taken for alleviation of malnutrition. PIB.gov.in. https://pib.gov.in/Pressreleaseshare.aspx?PRID=1695200
Raina, S. K., Sharma, S., Bhardwaj, A., Singh, M., Chaudhary, S., & Kashyap, V. (2016). Malnutrition as a cause of mental retardation: A population-based study from Sub-Himalayan India. Journal of Neurosciences in Rural Practice. https://doi.org/10.4103/0976-3147.182776
Rajkomar, A., Dean, J., & Kohane, I. (2019). Machine learning in medicine. New England Journal of Medicine, 380(14), 1347–1358. https://doi.org/10.1056/nejmra1814259
Rathnayake, K. M., Madushani, P., & Silva, K. (2012). Use of dietary diversity score as a proxy indicator of nutrient adequacy of rural elderly people in Sri Lanka. BMC Research Notes, 5, 2–7. https://doi.org/10.1186/1756-0500-5-469
Sak, J., & Suchodolska, M. (2021). Artificial intelligence in nutrients science research: A review. Nutrients, 13(2), 1–17. https://doi.org/10.3390/nu13020322
Schaible, U. E., & Kaufmann, S. H. E. (2007). Malnutrition and infection: Complex mechanisms and global impacts. PLoS Medicine, 4(5), 0806–0812. https://doi.org/10.1371/journal.pmed.0040115
Steyn, N., Nel, J., Nantel, G., Kennedy, G., & Labadarios, D. (2006). Food variety and dietary diversity scores in children: Are they good indicators of dietary adequacy? Public Health Nutrition, 9(5), 644–650. https://doi.org/10.1079/phn2005912
Sudersanadas, K. (2021). Application of artificial intelligence on nutrition assessment and management. European Journal of Pharmaceutical and Medical Research, 8(6), 170–174. https://www.researchgate.net/publication/352091323_APPLICATION_OF_ARTIFICIAL_INTELLIGENCE_ON_NUTRITION_ASSESSMENT_AND_MANAGEMENT
Teji Roba, K., O'Connor, T. P., Belachew, T., & O'Brien, N. M. (2016). Anemia and undernutrition among children aged 6–23 months in two agroecological zones of rural Ethiopia. Pediatric Health, Medicine and Therapeutics, 7, 131–140. https://doi.org/10.2147/phmt.s109574
UNICEF. (2018). 2018 Global Nutrition Report reveals malnutrition is unacceptably high and affects every country in the world, but there is also an unprecedented opportunity to end it. UNICEF. https://www.unicef.org/press-releases/2018-global-nutrition-report-reveals-malnutrition-unacceptably-high-and-affects
Vasiloglou, M. F., Mougiakakou, S., Aubry, E., Bokelmann, A., Fricker, R., Gomes, F., Guntermann, C., Meyer, A., Studerus, D., & Stanga, Z. (2018). A comparative study on carbohydrate estimation: GoCARB versus Dietitians. Nutrients, 10(6), 1–11. https://doi.org/10.3390/nu10060741
WHO. (2006). WHO child growth standards: Length/height-for-age, weight-for-age, weight-for-length, weight-for-height and body mass index-for-age: Methods and development. World Health Organization.
WHO. (2019). More than one in three low- and middle-income countries face both extremes of malnutrition. WHO. https://www.who.int/news/item/16-12-2019-more-than-one-in-three-low-and-middle-income-countries-face-both-extremes-of-malnutrition
Woodruff, B. A., & Duffield, A. (2002). Anthropometric assessment of nutritional status in adolescent populations in humanitarian emergencies. European Journal of Clinical Nutrition, 56, 1108–1118. https://doi.org/10.1038/sj.ejcn.1601456
Chapter 12
Autonomous Weapon System: Debating Legal–Ethical Consideration and Meaningful Human Control Challenges in the Military Environment Prakash Panneerselvam
Abstract The human experience of warfare is changing with the introduction of AI in the field of advanced weapon technology. Particularly in the last five years, the autonomous weapon system (AWS) has generated intense debate globally over the potential benefits and problems associated with these systems. Military planners understand that AWS can perform the most difficult and complex tasks, with or without human interference, and can therefore significantly reduce military casualties and save costs. These systems act as force multipliers to counter security threats. On the other hand, political pundits and public intellectuals opine that AWS without human control can lead to highly problematic ethical and legal consequences, with some even claiming that it is unethical to allow machines to control the life and death of a human being. Several prominent public intellectuals, including influential figures like Elon Musk and Apple co-founder Steve Wozniak, have called for a ban on "offensive autonomous weapons beyond meaningful human control". On the contrary, the militaries believe that AWS can perform better without human control and follow legal and ethical rules better than soldiers. The debate over AWS is a continuous one. This chapter will look into the emergence of AWS, its future potential and how it will impact future war scenarios, focussing thereby on the debate over the ethical–legal use of AWS and the viewpoints of military planners. Keywords Autonomous weapon systems · Military operations · Legal and ethical issues · Meaningful human control
P. Panneerselvam (B) National Institute of Advanced Studies (NIAS), Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_12
1 Introduction

The military landscape is undergoing a major transformation with the introduction of autonomous weapon systems. Many armed forces around the world have been successful in developing AWS, ranging from semi-autonomous to fully autonomous weapon systems, for various combat roles. The USA, a pioneer of drone technology, runs one of the most successful programmes in the world, with mission-specific systems that have been proven in combat. From high-altitude reconnaissance drones like the Global Hawk to the AeroVironment Wasp III miniature UAV, the US drone programme has driven the evolution of unmanned systems into completely autonomous systems over the last two decades (Hambling, 2022). The US Department of Defence (DoD) asserts that future wars will be dominated by AWS through the use of robotics and vehicles that can operate in battle zones which are too dangerous for soldiers. Intelligent weapon systems are getting better at detecting, tracking, analysing, and evading threats on the battlefield. Developments in AI/machine learning are making it possible for the military to reach areas which were otherwise perceived to be impossible. In the military structure, AI operates at multiple levels, from big data analysis, decision-making, and cyber security to operating weapon platforms, image analysis, and logistical solutions, where the military leadership will need it for swift implementation of plans on the battlefield. This is the primary reason why major military powers like the USA, China, Russia, and India are keen on developing AI-based military applications.

The proliferation of such technology has also raised serious concern among intellectuals, academia, and industrialists about the ethical–legal issues arising from the use of AWS in warfare. A machine killing people with no human oversight is strongly objected to by scholars and legal experts. The humanitarian law experts question whether AWS can follow the "principle of distinction" and the "principle of proportionality", which are the foundations of the law of armed conflict.1 Soldiers also have the responsibility and moral obligation of protecting civilians in the battlefield. The cardinal question with AWS is whether it will be able to perform better than soldiers in the battlefield by not just engaging enemy targets but also by preventing collateral damage and following international humanitarian law. Military experts affirm that there are already checks and balances in place to prevent the AWS from overriding human inputs. The debate on meaningful human control is worth analysing to understand how the military Standard Operating Procedure (SOP) and Rules of Engagement (RoE) are addressing the issue.

The chapter is divided into two major parts. The main objective of the first part is to discuss the role and projected capabilities of AWS and how AWS impacts military preparedness. The second part aims to capture various debates on the ethical–legal issues in using AWS in combat.

The terms AI and machine learning are umbrella terms for solving functional problems. Herbert Simon, one of the pioneers of artificial intelligence, said, "It may prove easier to cleanse the phrase than to dispense with it. In time it will become sufficiently idiomatic that it will no longer be the target of cheap rhetoric" (Simon, 1996).
1 Winter (2022).
According to Simon, the term artificial intelligence was actually coined to avoid new "terminological difficulties", but his team at RAND and Carnegie Mellon University rather preferred phrases like "complex information processing" and "simulation of cognitive processes" (Simon, 1996). The problem lies not just in the term's grandeur, but also in that it sets unreasonable expectations and implies more capability than the world has ever seen (Morgan et al., 2020).

There is no single definition of AI. John McCarthy defines AI as "the science and engineering of making intelligent machines, especially intelligent computer programmes. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable" (see McCarthy's webpage). The Defence Science Board Summer Study on Autonomy defines AI as "the capability of computer systems to perform tasks that normally require human intelligence" (Board, 2016). Lawrence G. Shattuck, Director, Human Systems Integration Programme, US Naval Postgraduate School, states that AI "is the ability of a system to act appropriately in an uncertain environment, where an appropriate action is that which increases the probability of success, and success is the achievement of behavioural sub-goals that support the system's ultimate goal" (Shattuck, 2015). AI has multiple definitions to suit the needs of specific domains. But the government and the industry are much more interested in its application in solving societal problems, such as medical applications for the diagnosis of cancer cells, and in the military domain for protecting soldiers, increasing accuracy, reducing collateral damage, and so on.

The development of autonomous weapon systems was envisaged with the idea of improving the ability of the armed forces to function effectively in modern battle spaces and to reduce collateral damage. Even though the concept and idea behind developing AI or intelligent systems are rooted in Turing's article "Computing Machinery and Intelligence", published in 1950, the military has been in the field, experimenting with and testing remote-controlled weapon systems to solve some complex warfare problems. A wide range of military systems became increasingly automated without explicit reference to the concept of AI. As early as 1939, the first operational automated Identification, Friend or Foe (IFF) system was developed to identify whether an aircraft was friendly or hostile (Bowden, 1985). Later, advancements in computational mathematics led to a series of new developments in anti-aircraft systems, radar systems, and air-to-air missile systems, where systems are capable of homing in on the target with radar guidance or through heat-seeking sensors. Throughout the Cold War, the USA and the Soviet Union invested in the field of computer science and modern technology to advance their capability in the fields of nuclear technology, space, and military systems. Automation and the development of intelligent systems were of utmost priority to these countries. The AWS is a crucial outcome, which not only changes the course of the modern battlefield, but also raises serious questions about the reliability of such a weapon with minimal or zero human control. The advantages of AWS are several, making it suitable for the modern battlefield. In the following section, the chapter will explore AWS's roles, advantages, and projected capabilities.
This will provide a clearer understanding of the level of autonomy these systems have attained
and how this will be a point of debate in discussions concerning “meaningful human control.”
2 AWS: Role, Advantages, and Projected Capabilities

Speed is undoubtedly a war-winning trait. A clear benefit of the autonomous system is its ability to accelerate the pace of decision-making and provide spontaneous support in identifying targets, as well as precision strike capability, to the military commander in order to achieve his/her objectives. The increasing reliance on autonomous systems is being observed across all major armed forces for one particular reason: AI-based systems perform better when more data is made available to them, contrary to humans, who have more difficulty in processing information or making decisions when the data available to them is larger in quantity. The data captured from various sensors, radars, ELINT, and SIGINT, or large image and video files, are heterogeneous in nature. The data may also consist of details about the health of sensors and analogue signals. AI can process the information quickly and provide situational awareness to the field commander or can engage the target with precision. This distinguishes the functioning of autonomous systems from that of other remotely operated and automated systems which lack the capacity to act independently. Additionally, autonomous systems have interaction objectives without being programmed to target a selected object (Wagner, 2012).

The US Department of Defence (DoD) defines AWS as "a weapon system that, once activated, can select and engage targets without further intervention by a human operator" (US Department of Defense, 2012). Unlike semi-autonomous systems, where a weapon system engages an individual target that has been pre-selected by a human operator, AWS, after activation, can select and engage targets without further inputs from humans. The distinction is very important to understand the differing levels of complexity and sophistication in AWS. This kind of weapon system is classified as a Lethal Autonomous Weapon System (LAWS), which raises serious ethical and legal issues about using weapon platforms without human supervision. But the autonomous system is not only limited to "lethal" activity in warfare; it is also used in a wide range of applications: (1) reconnaissance and surveillance, (2) target identification and designation, (3) countermine warfare, and (4) WMD detection/humanitarian assistance/disaster relief operations. The current state-of-the-art autonomous systems/technology can support some of these missions.

As the military faces a new set of challenges on the battlefield, experts and policymakers are increasingly focused on creating fully autonomous systems. Vice Admiral Ann Rondeau, U.S. Navy (Retired), observes that in "the complexity of multi-domain operations, an extremely capable Command and Control (C2) is essential, with human, hardware, software, and network aspects all working in harmony" (Rondeau, 2022). She also envisages that during war, centralized C2 will have problems; therefore, she argues for a more local and distributed decision-making process. It means that AI and autonomous systems will have a more crucial role in not only helping to speed up the
Observe, Orient, Decide, and Act (OODA) loop but also in carrying out difficult tasks on land, in the air, and at sea.

The AWS offers great benefits to national defence in a wide range of areas. Several factors support this choice. Importantly, AWS can survive in the harshest of environments where there is no plausible hope for humans to survive. The weapon systems supported by AI/ML are going to be a major game-changer in military affairs, as they will close the gap between the decision-making process and the collection and analysis of huge data sets. The following are some of the key advantages of AWS in the military environment:

1. Decision-making process: The use of AWS as a potential strategic and tactical tool will outpace the adversary's decision-making process. The US Air Force Colonel John Boyd developed the OODA loop, which provides a framework to systematically engage enemy fighter aircraft in air-to-air combat. Experts believe that integrating AWS into the OODA loop will provide a great advantage over the opponent in quickly securing victory on the battlefield. Lt Gen. Michael Groen, Commander of the US Joint Artificial Intelligence Centre, said in an interview that "waging war without it (AI) will work as cavalry charging machine guns on the Western Front" (Freedberg, 2022). Lt Gen. Groen points out that the traditional command, control, and planning processes are manual, stove-piped, and slow in executing the plan. If one can speed up the process, then one will be able to act fast and take charge of the situation, while one's opponent is still in the process of observing the situation, orienting with it, deciding, and acting. AI also avoids blunders in decision-making and recommends options quickly to decision-makers.

2. Intelligence, Surveillance, and Reconnaissance (ISR) missions: ISR is critical for any military operation. In the modern battlefield, the network is connected to thousands of different systems and sensors, and making sense of the immense inflow of information can be complicated, leading the operator to take wrong decisions. However, autonomous systems allow operators to focus on the decision-making process rather than on mundane tasks like data processing. The AWS provides military operators a clear edge in dealing with a high-tech war environment where speed is crucial.

3. Target identification and designation: One area in which AI has proved really useful is image processing. Today, there is data overload, as surveillance, conducted in both the domestic and international arenas, has increased dramatically in the last two decades, and the increasing trend is likely to continue in the future. Further, data from newer technologies like facial recognition that use AI is being put to use by many countries for better managing social problems. Therefore, automating the process through AI for image recognition and object detection is going to be fundamental for any military operation in the future.

4. Ability to operate beyond enemy lines: One of the main advantages of autonomous systems is the ability to operate beyond enemy lines. In high-tech wars, autonomous systems can be effectively used in countering anti-access/area denial (A2/AD) weapon systems without risking soldiers' lives. Earlier, long-range precision strike weapons including guided missiles and artillery
were used to take on the enemy's control systems or even satellites using anti-satellite missiles. In most cases, A2/AD weapon systems are located in civilian areas, leading to huge casualties; and by targeting the command and control, one's position gets exposed to the enemy's counterattack. In order to overcome these challenges, drones are used to target such systems. In one of the research reports submitted by a group of US Air Force officers, it has been recommended to use swarming tactics as a viable option to counter anti-access/area denial (A2/AD) (Gorrell et al., 2016). Different autonomous systems coordinating an attack will be a big step in military affairs. This will be a significant game changer in the military environment, as experts have always cautioned about the difficulties of coordination during actual combat.

Navigating and performing tasks based on mission requirements are integral parts of the AWS. But AWS will also be processing huge datasets, relaying the information to the command post and retrieving information from the data bank to complete its mission. Currently, autonomous systems are tasked to undertake the job under the supervision of humans. At present, the use of AWS in military operations conducted for targeting the enemy's position is still subject to the discretion of human operators. For example, loitering munitions and close-in weapon systems (CIWS) like the US Navy's MK-15 Phalanx are programmed to carry out specific tasks and objectives. The programme does not allow them to deviate from their delineated role. This process complements the efforts of soldiers and provides a certain control over the systems. The human is still in command of such systems, making them reliable, while they remain equally potent weapons in the present scenario. However, the growing interest in this field suggests that subsequently AWS will become completely autonomous with no option for human control.

The projected capabilities (Table 1) of autonomous systems in the long term indicate that they will be able to undertake even tougher tasks and manage a specific set of objectives like humans. According to experts, in the future, AI will be able to mimic human capabilities like deliberative reasoning, perceiving patterns, and exercising meaningful judgement (Board, 2016). The major military powers have already begun incorporating the AWS in a substantial way into the armed forces. The US Department of Defense's Unmanned Systems Integrated Roadmap (2009) specifies that the "level of autonomy should continue to progress from today's fairly high level of human control/intervention to a high level of autonomous tactical behaviour that enables more timely and informed human oversight" (Department of Defense, 2009). The Indian Armed Forces have also specified the need to build AI and AWS in the country to counter growing Chinese threats in the region. What makes the autonomous system a preferred weapon of choice of the armed forces is its ability to carry out tasks with or without human supervision.

At the same time, an underlying problem with the present AWS is its communication link with command and control. This has been notably the weakest aspect of the weapon systems. The current AWS technology is designed and developed for manned operational functions off-board over a communication link, making the process cumbersome and less effective in the modern battle space.
Table 1 Projected capabilities for autonomous systems

Column headings: Technology availability | Sense (Sensors, Perception, Fusion) | Think/Decide (Analysis, Reasoning, Learning) | Act (Motion, Manipulation) | Team (Human/machine, Machine/machine, Info exchange)

Present
- Sense: Full-spectrum sensing (EM, bits, vibration, chemical…); Object recognition
- Think/Decide: High-volume computational throughput and data architectures; Algorithm variety and complexity; Task-specific, rule-based decision rules; Learning from training data, sentiment analysis
- Act: Navigation (routing); Strength, endurance
- Team: High man:machine ratio; Rule-based coordination of multiple platforms; High-volume communications and data transfer

Medium-term
- Sense: Human senses (sight, smell…); Integration of perception with motor skills
- Think/Decide: Explicit and testable knowledge representation; Anomaly recognition; Option generation, pruning; Social and behavioural models; Culturally informed, values-based reasoning; Transparent decision logic; C2 for many nodes; Learning by doing, watching
- Act: Navigation (obstacle avoidance); Agility, dexterity
- Team: Observability and directability; Provably correct emergent behaviour; Trustworthiness and trust calibration under defined conditions; Natural language processing

Long-term
- Sense: High-fidelity touch; Scene understanding
- Think/Decide: Goal definition; Abstraction, skills transfer; Inference; Empathy; General-purpose, idea-based reasoning; Judgement, intuition
- Act: Navigation (dense, dynamic domains); High degree-of-freedom actuator control
- Team: Shared "mental models," mutual predictability; Understanding intent; Fully adaptive coordination; Implicit communication

Source: Report on Defence Science Board, Summer Study on Autonomy, Washington, June 2016
Today, the battle space has evolved, showcasing full-spectrum dominance, where the military has achieved total superiority over all dimensions of the battle space: land, air, sea, space, the electromagnetic spectrum, cyberspace, and so on. Moreover, the introduction of newer networking, mobile communications, and automation technology across both civilian and defence applications makes the electromagnetic spectrum a highly congested and contested domain in the military environment (Industry Spotlight, 2021). Therefore, the control of unmanned systems can be jammed or destroyed
by enemy forces using counterweapons. In the case of AWS, the possibility of jamming or destruction can be minimized, as autonomous systems can avoid such obstacles inside the enemy's territory. This would be a concrete system upgrade for military forces that seek such capability to penetrate the enemy's defence. Other leading scholars in this field, who propagate the idea of full autonomy of the weapon system in military operations, argue that this will not only improve the effectiveness of the weapon but also help nation-states comply with legal–ethical standards.

The ethical–legal issues over AWS gain pace when robots or drones are used in military operations. In 2016, for the first time in the USA, the Dallas Police used a remote-controlled robot to kill a gunman, which set off a storm of discussion on ethical and moral responsibility in the USA (Editorial & NY Times, 2016). Despite the fact that the US military has been using drones to target terrorists in foreign countries for decades, the American public reaction to the use of such a system by homeland security clearly points to the political aspect of police brutality against the Black community. The contradiction clearly explains the importance of discussing the legal–ethical issues associated with the usage of AWS.
3 AWS: Debate on Legal–Ethical Issues, Meaningful Human Control, and the View of the Military

The projected capabilities of AWS indicate that it will be able to handle complex problems and become a preferred choice in the future. This also raises an important question: whether AWS is an illegal weapon. Also, what are the dangers posed by weapon systems that operate autonomously without "human control"? Legal arguments regarding autonomous weapon systems have been dominated by the issue of whether or not their use is permissible under international humanitarian law (IHL). The main points of contention have been whether or not these systems can uphold the standards of distinction between civilians and combatants and follow the principle of proportionality, and whether AWS would compromise a soldier's ability to exercise his/her legal responsibility, which is solidly rooted in international treaties like the 1949 Geneva Conventions and firmly established in international customary law (Asaro, 2012). The AWS must uphold International Human Rights Law (IHRL), which provides certain rights to civilians irrespective of their nationality and ethnicity during war. Besides performing a military duty, the AWS should also take into consideration the legal–ethical and moral responsibilities in the battlefield. It will be difficult, if not impossible, to programme a system which can foresee or prepare for all possible contingencies in the battlefield while ensuring that IHL and IHRL are followed.

In the fog of war, it is difficult for both soldiers and AWS to make decisions clearly. A pivotal challenge before AWS is distinguishing between a combatant and a non-combatant in the battlefield. Particularly in an urban environment or in guerrilla warfare, the difference between combatant and non-combatant blurs, as combatants
may be under the guise of civilians to conceal their identity. Even for soldiers, it will be a difficult task to identify the enemy in a complex battlefield. Therefore, any successful operation of AWS relies on its ability to identify combatants in the midst of towns, villages, or the countryside. For this reason, some experts argue that AWS should be used in less complex military environments to reduce collateral damage or mistaken identity. On the other hand, human rights watch groups are worried that this weapon may desensitize soldiers and will not urge the military commander to think twice before launching such an attack. Soldiers completely isolated from witnessing and experiencing the horror of war may be saved from physical and mental suffering, but that could also change the very nature of violent conflict. The disengagement of soldiers from the actual reality of violent conflict may dehumanize the enemy, leading even to genocides.

Proliferation is another issue with the AWS. The AWS technology is irreversible and is going to remain a salient part of military operations. Military thinkers argue that AWS is like any other weapon which will have a key role in the battlefield, and many states will acquire them for various military purposes. Even if states agree to place a ban on exporting armed/lethal AWS, countries can still import such technology in the form available in civilian markets and turn it into a weapon. The Russia–Ukraine War of 2022 is a good example, where the Ukrainian Armed Forces were quite successful in modifying civilian drones to carry grenades using 3D printing technology. Therefore, preventing such weapons from proliferating will be difficult, if not impossible. Since many civilians use technology that can be weaponized, there is an impending danger of these technologies falling into the hands of terrorists or violent non-state actors.

The AWS and associated technology raise many important questions regarding future wars and the security of nation-states. But the major debate on AWS can be broadly categorized into two sections. The first section champions an outright ban on AWS because of ethical–legal and moral issues. The public intellectuals and human rights activist groups argue for "meaningful human control", which can be one possible solution to the problem. For the second section, AWS will continue to evolve because there is enhanced interest among the armed forces in acquiring such technology. Therefore, some argue that it requires careful judgement of the situation rather than advocating an outright ban on the systems. The two schools of thought need to be thoroughly understood to get a clear perspective on the subject.
3.1 Outright Ban on the AWS Because of Legal–Ethical and Moral Issues

The Human Rights Watch group called for a ban on autonomous weapons during the sixth review conference of the Convention on Conventional Weapons (CCW) in 2021 (Sandford & Joly, 2021a, 2021b). NGOs and human rights activists raised numerous questions over moral and ethical concerns as well as the implications for the global
security environment. Professor Noel Sharkey played an important role in laying the groundwork for the launch of the Campaign to Stop Killer Robots and co-founded the International Committee for Robot Arms Control (ICRAC) in 2009. He argues that in unstructured combat circumstances, the interactions among AWS are rarely foreseeable, and their swiftness pushes the speed of conflict beyond human control. He also said that AWS threatens world peace by making war a much easier option. Furthermore, the use of such systems in urban environments for homeland security will only aggravate human rights issues rather than solve real law and order problems.

Public intellectuals and academics have raised three primary ethical concerns about autonomous weapon systems. First, AWS are not moral agents and, therefore, do not uphold the dignity of the person killed (Sharkey, 2007). As a result, the decision to kill other human beings, based on algorithms, reduces them merely to objects. The Human Rights Watch group argues that AWS lack emotional and ethical judgement and hence cannot be put in charge of making value judgements on life-or-death situations of human beings, which affects the right to dignity (Human Rights Watch, 2016). Prof. Christof Heyns, University of Pretoria, argues that even if AWS acts in conformity with IHL or IHRL, "Should they do it? Should the robots have the power of life and death over humans?" He says it is inherently wrong and violates the basic tenets of the right to life as well as the right to dignity.

Second, can it uphold the principle of distinction by identifying legitimate military targets and objectives? The law of armed conflict has clearly laid down the requirement of distinction between combatants and civilians. Article 48 of the 1977 Additional Protocol I to the Geneva Conventions provides: "the Parties to the conflict shall at all times distinguish between the civilian population and combatants" (ICRC, Customary IHL Database, https://ihl-databases.icrc.org/customary-ihl/eng/docs/v1_rul). Military manuals of various countries also suggest that a "distinction must be made between civilians and combatants and that it is prohibited to direct attacks against civilians" (Melzer, 2008). The question is whether AWS can be programmed to comply with IHL or IHRL. The system can be programmed, based on quantitative data, to identify and track military equipment such as tanks, armoured personnel carriers, aircraft, warships, and missile systems. Based on the projected capability index (Table 1), the AWS may acquire the capacity to make qualitative judgements in the future. But the question is to what degree AWS can make those judgements. In a cluttered urban environment with more citizens taking part in hostilities, it has become harder to distinguish between legitimate military targets and civilians. Given the scenario, is it possible to programme AWS to distinguish between a "civilian directly participating in hostilities" and one who is not? In order to qualify as direct participation in hostilities, the ICRC issued the following criteria under IHL:

1. the act must be likely to adversely affect the military operations or military capacity of a party to an armed conflict or, alternatively, to inflict death, injury, or destruction on persons or objects protected against direct attack (threshold of harm);
2. there must be a direct causal link between the act and the harm likely to result either from that act, or from a coordinated military operation of which that act constitutes an integral part (direct causation);

3. the act must be specifically designed to directly cause the required threshold of harm in support of a party to the conflict and to the detriment of another (belligerent nexus) (Reports and Documents, 2008).

Can the above criteria be programmed into a machine? Even if it can be done, each situation requires careful assessment to arrive at a conclusion as to whether the individual is actively participating in the violent conflict or not.

This brings us to the third crucial legal issue in using AWS. With regard to the accountability and responsibility for any wrongdoing by AWS, who will be held responsible for the actions of AWS? Experts argue that, considering the huge number of persons engaged in the AWS life cycle and the intricate ways in which their contributions interact, each of them lacks the necessary level of control (Taylor, 2021). Scholars argue that, looking at the way AWS is developed and deployed, a particular individual may not have control of the systems. For example, the programmer or manufacturer could not have predicted that the AWS they built would evolve in the way it did. Targeting non-liable individuals, in a way, denotes the failure of the mission (Taylor, 2021). The military commander (who gave the orders) and operators (executing the order), unaware of the programming and inner functioning of AWS, may hold very little perspective on how the machine will function. Merely holding the commander, programmer, or manufacturer responsible is not going to solve the issue. Only identifying the agent morally responsible for the AWS is going to fix the problem, says Robert Sparrow. On fixing the "responsibility gap", philosopher Robert Sparrow says, "we have reached an impasse" on AWS.2 What he means by "impasse" is that AWS will become increasingly autonomous to the point where those who command deployment can no longer be held accountable for their actions (Sparrow, 2007). He also suggests that "machines could have capacities way beyond those necessary to reach this point without it being possible to hold them responsible" (Sparrow, 2007). Sparrow compares "robot warriors and child soldiers" by stating the dilemma of using child soldiers (intelligent and capable of a wide range of decisions) in the battlefield, who do not understand the consequences of their actions and, therefore, cannot be held responsible for them.3 So, using child soldiers is unethical because they are not morally responsible for their actions. If the same logic is applied to robots, they too cannot be held morally responsible for their actions in the battlefield. So, Professor Sparrow concludes that it will be unethical to deploy AWS in war unless someone can be held accountable for its actions that directly or indirectly threaten human lives. To sum up, intelligent systems without moral responsibility are the prime threat to humanity.

To address the above concern, the idea of "meaningful human control" gained importance in the legal, ethical, and political debate (Article36, 2013). The principal notion behind the concept is that machines and intelligent systems should be under the
2 Sparrow (2007).
3 See footnote 2.
purview of human control because they do not have moral responsibility and should not be allowed to make life-and-death judgements in the battlefield. The term was first introduced by Article 36, a UK-based NGO, which was part of the International Committee for Robot Arms Control (ICRAC) (Article36, 2013). The debate attracted wide consensus among public intellectuals and scholars, as the idea offers precise control over the machine rather than just a human being in the loop. The argument for meaningful human control is based on three principles (Sio & Hoven, 2018). First, the mere presence of humans in the loop is not a sufficient condition to control the military activity. The human should be able to exercise his/her influence in changing the course of action in the battlefield. Second, to perform his/her duty to the fullest, the operator should have enough information, options to influence the process, and the psychological capacity to respond appropriately to the situation. Third, some form of legal control and responsibility will be required to keep a moral agent responsible or accountable for the behaviours of AWS.

The concept of "meaningful human control" has played a vital role in strengthening the legal–ethical debate for the regulation of AWS. There is also broad agreement among states on the need to have meaningful human control over the AWS. Although many states called for a complete ban on the AWS and advocated the importance of "meaningful human control" (Human Rights Watch, 2016), some states like India argued that, as there is not enough clarity on the term "meaningful human control", any rush in judging the term would run the risk of legitimizing AWS (Government of India, 2014). The ethical, political, and legal debate on AWS has been the principal discourse of the ICRC and IHL, and continuous engagement with defence manufacturers, the military, and policymakers is significant to bring clarity to the legal–ethical issues.
3.2 AWS is Inevitable and One Must Proceed Cautiously and Judiciously

Revolutionary technology like AWS acts as a force multiplier in the modern battlefield. Ronald C. Arkin, arguing in favour of AWS, said that it would perform better than humans in the battlefield. In his article, he listed six reasons why the AWS is going to outpace soldiers in the battlefield. Even on the ethical and legal issues, Arkin says, "I am convinced that they can perform more ethically than human soldiers are capable of performing" (Arkin, 2010). The AWS not only reduces the risk of soldiers being exposed to physical and psychological damage but also provides clarity for the soldiers to focus on the military objective. Because of trauma, soldiers may indulge in wrong judgement, inflict damage upon the civilian population, or engage non-military objectives out of frustration. According to several studies conducted by US-based research institutes, post-traumatic stress disorder (PTSD) is a growing problem among US soldiers deployed in Operation Iraqi Freedom and Operation Enduring Freedom (Cesur, Sabia, & Tekin, 2011). The drone operators, who are completely detached from the horror of war,
located thousands of miles away from military operations, are also suffering from PTSD, emotional exhaustion, and burnout (Saini, Raju & Chail, 2021). The level of emotional exhaustion and mental fitness are crucial determining factors in the battlefield. Therefore, AWS offer the military planner a great advantage, not only in overcoming this challenge but also in mitigating civilian casualties. Military experts point out that the NATO targeting process (Table 2) clearly lays out distinct steps in practice, even though some of the steps are carried out simultaneously (NATO, 2021). Each step provides the commander with a structure for an appropriate response in order to minimize mistakes. The central problem in the targeting process in relation to AWS is the central decision: the moment after which humans can no longer influence the direct violent effects. In missiles, pre-programming and mid-course corrections can be carried out, but at the moment of "non-return" the operator loses control over the direct outcome. This is the phase that should be scrutinized most closely. Persons making crucial decisions base their judgement on the assumption that it is in accordance with the operational and legal requirements. The military has understood that AWS are not to be treated like long-range artillery or loitering munitions; rather, new targeting processes must be developed to match their performance while ensuring accountability throughout. This will bring extensive changes in the military hierarchy in the future. AWS continue to evolve, and given their potential advantage it is impossible to prevent such systems from becoming part of the military. The military views AWS as a complement to commanders and soldiers in the battlefield, reducing both physical and mental workload.
4 Conclusion AWS give asymmetric capabilities against an enemy who has no access to such technology. But, as we have seen time and again in history, such technology proliferates rapidly. Once many states procure autonomous weapon systems, there will be rapid development of counter-weapons and counter-counter-weapons. The debate on legal–ethical measures to control the proliferation of such technologies has been discussed briefly in this chapter to highlight the logical reasons behind the call for an outright ban on autonomous weapons. At the same time, weapon manufacturers and military establishments want to exploit the technology for the betterment of soldiers and the protection of civilians in the battlefield by controlling collateral damage. Since war is inevitable, the use of innovative and lethal weapon systems has always been a controversial subject. Many weapons of the World War II era, such as flame-throwers, cluster munitions and nuclear, chemical and biological weapons, once considered game-changers, are banned or heavily restricted today because of their lethality and inhumane nature. Responsibility and accountability in using such weapons were clearly defined. The biggest challenge with AWS, however, is fixing "responsibility and accountability". Even though legal and ethical issues may not be engineering problems, Original Equipment Manufacturers (OEMs) must take them into account during
Table 2 NATO targeting process
1. Analysis of commander's goal
(a) The use of force may be considered. Target sets can include both military and civilian targets
2. Target development, validation, nomination, and prioritization
(a) Target development: analyse the adversary's capabilities and determine the best target to engage in order to achieve the designated goal
(b) This is also the stage at which potential collateral damage is assessed
(c) Validation: check compliance with the commander's stated goal and the rules of engagement, and verify the accuracy and credibility of the intelligence used to develop the target. Validation also considers collateral damage if civilians are present in the targeted area, the expected military advantage, and so on
(d) After positive validation, targets are nominated for approval, after which they are prioritized
(e) There are two categories of targets: specific targets and targets belonging to a set
3. Analysis of capabilities
(a) To analyse the means and methods to engage the target most efficiently
(b) The analysis includes target characteristics (location, type) and the requirement for a particular component (land, air, maritime or special forces)
(c) If excessive collateral damage is expected relative to the anticipated military advantage, measures will be taken to reduce it
(d) If the weapon can achieve the desired effect, it will be included among the options for the commander to decide
4. Assignment of capabilities to be used
(a) Capabilities are matched to selected targets to achieve the desired results
5. Planning and execution of the mission
(a) Steps 1 to 4 are carried out in a more detailed assessment considering the tactical situation
(b) The assessment includes: location, type, size and material of the target, civilian pattern of life, time of attack (day/night), weapon capabilities, weapon effects, direction of attack, munition fragmentation pattern, secondary explosions, infrastructure collateral concerns, personnel safety, and battle space deconfliction measures
6. Assessment of results
(a) Combat assessment is conducted to determine whether the desired effects have been achieved
Source NATO (2021)
the development of these systems. It is therefore crucial for the military and OEMs to work together to identify issues with lethal AWS before deploying them on combat missions. This is a decisive period for OEMs and military establishments to establish the rule of law governing such weapon systems in the future.
References Arkin, R. (2010). The case for ethical autonomy in unmanned systems. Journal of Military Ethics, 9(4), 332–341. https://doi.org/10.1080/15027570.2010.536402
Article 36. (2013). Killer robots: UK government policy on fully autonomous weapons. https://article36.org/wp-content/uploads/2013/04/Policy_Paper1.pdf. Accessed Oct 06, 2022.
Asaro, P. (2012). On banning autonomous weapon systems: Human rights, automation, and the dehumanization of lethal decision-making. International Review of the Red Cross, IRRC No. 886, 687–709.
Bowden, L. (1985). The story of IFF (identification friend or foe). IEE Proceedings (Vol. 132, Pt. A, pp. 435–437).
Cesur, R., Sabia, J. J., & Tekin, E. (2011, August). The psychological costs of war: Military combat and mental health. The Digest, IZA Discussion Paper No. 5615(8).
Defense Science Board. (2016). Autonomy. Defense Science Board.
Department of Defense. (2009). FY 2009–2034 unmanned systems integrated roadmap. Department of Defense.
de Sio, F. S., & van den Hoven, J. (2018). Meaningful human control over autonomous systems: A philosophical account. Frontiers in Robotics and AI, 5, 15. https://doi.org/10.3389/frobt.2018.00015
Editorial. (2016, July 12). When police use lethal robots. The New York Times. https://www.nytimes.com/2016/07/12/opinion/when-police-use-lethal-robots.html
Freedberg Jr., S. J. (2020, November 11). JAIC chief asks: Can AI prevent another 1914? Breaking Defense. https://breakingdefense.com/2020/11/jaic-chief-asks-can-ai-prevent-another-1914/
Gorrell, R., MacPhail, A., & Rice, J. (2016). Countering A2/AD with swarming. Air Command and Staff College, Air University.
Hambling, D. (2020, December 11). U.S. to equip MQ-9 Reaper drones with artificial intelligence. Forbes. https://www.forbes.com/sites/davidhambling/2020/12/11/new-project-will-give-us-mq-9-reaper-drones-artificial-intelligence/?sh=51dddd687a8e
Human Rights Watch. (2016, April 11). Killer robots and the concept of meaningful human control. https://www.hrw.org/news/2016/04/11/killer-robots-and-concept-meaningful-human-control
Industry Spotlight. (2021, September 13). Enhancing electronic support to achieve spectrum dominance within multi-domain operations. Shephard. https://www.shephardmedia.com/news/digital-battlespace/enhancing-electronic-support-achieve-spectrum-domi/
Melzer, N. (2008, December). Interpretive guidance on the notion of direct participation in hostilities under international humanitarian law. International Review of the Red Cross, 90(872).
Morgan, F. E., Boudreaux, B., Lohn, A. J., Ashby, M., Curriden, C., Klima, K., & Grossman, D. (2020). Military applications of artificial intelligence. RAND Corporation.
NATO. (2021). Allied joint doctrine for joint targeting. NATO Standardization Office (NSO).
Rondeau, A. (2022, August). Rebalancing the science and art of war for decision advantage. Proceedings, 148/8/1,434.
Saini, R. K., Raju, M. S. V. K., & Chail, A. (2021). Cry in the sky: Psychological impact on drone operators. Industrial Psychiatry Journal, 30(Suppl 1), S15–S19. https://doi.org/10.4103/0972-6748.328782
Sandford, A., & Joly, J. (2021, December 15). 'A threat to humanity', NGOs and activists call for a ban on the use of 'killer robots'. Euronews. https://www.euronews.com/2021/12/13/a-threat-to-humanity-ngos-and-activists-call-for-a-ban-on-the-use-of-killer-robots. Accessed July 1, 2022.
Sharkey, N. (2007). Automated killers and the computing profession. Computer, 40(11), 122–124. https://doi.org/10.1109/MC.2007.372
Shattuck, L. G. (2015). Transitioning to autonomy: A human systems integration perspective. Presented at Transitioning to Autonomy: Changes in the Role of Humans in Air Transportation, Moffett Field, CA, March 10–12, 2015. https://human-factors.arc.nasa.gov/workshop/autonomy/download/presentations/Shaddock%20.pdf. Accessed July 1, 2022.
Simon, H. A. (1996). The sciences of the artificial. MIT Press.
Sparrow, R. (2007). Killer robots. Journal of Applied Philosophy, 24, 62–77. https://doi.org/10.1111/j.1468-5930.2007.00346.x
Taylor, I. (2021). Who is responsible for killer robots? Autonomous weapons, group agency, and the military-industrial complex. Journal of Applied Philosophy, 38, 320–334. https://doi.org/10.1111/japp.12469
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433
US Department of Defense. (2012, November 21). Autonomy in weapon systems. Directive 3000.09.
Wagner, M. (2012, November 12). Autonomy in the battlespace: Independently operating weapon systems and the law of armed conflict. International Humanitarian Law and the Changing Technology of War. https://ssrn.com/abstract=2211036
Winter, E. (2022). The compatibility of autonomous weapons with the principles of international humanitarian law. Journal of Conflict and Security Law, 27(1), 1–20.
Chapter 13
Artificial Intelligence and War: Understanding Their Convergence and the Resulting Complexities in the Military Decision-Making Process Prasanth Balakrishnan Nair
Abstract The Military Decision-Making Process is arguably the single most important factor that decides the fate of an armed conflict. The emergence of AI-based Decision Support Systems is bound to influence this process. Towards this, it is important to understand what constitutes these support systems and how nations can optimally exploit their inevitable emergence. The impact will be felt in both the kinetic and non-kinetic applications of the armed forces. It is equally important to understand the associated risks and the plausible mitigation strategies. The inherent questions of ethics and morality cannot be divorced from these strategies, especially when they involve decision-making processes that can result in disproportionate and indiscriminate casualties of both men and materiel. A holistic understanding of this crucial manned-unmanned teaming will allow responsible nations to assimilate and operationalize it into their joint warfighting doctrines, while denying this advantage to less responsible adversaries. Keywords OODA loop · MDMP · AI-enabled DSS · Battlefield dominance · Strategy algorithm · LAWS · Threshold of conflict · Trolley car dilemma · Uncertainty quantification · Whole of nation approach To bring about a revolution in military affairs, two things are normally needed: an objective development that will make it possible; and a man who will seize that development by the horns, ride it, and direct it. Martin Van Creveld (2011).
P. B. Nair (B) Indian Astronaut Designate, ISRO, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_13
1 Introduction The history of the world has largely been defined by the fateful decisions taken by the leaders of nations. The military, as one of the pillars of the Instruments of National Power,1 is a primary tool that national leaders have used to shape conditions to suit their interests. The primary function of any military leader is military decision-making. It is therefore inevitable that the Military Decision-Making Process (MDMP) be considered arguably the single most important process that has shaped, and will shape, the desired political end state whenever the military option is exercised by the political leadership. This is true irrespective of the nature of the political institution governing a country, as the military decision-making process (not surprisingly) is similar across the globe. The decisions taken by military leaders are supported by robust structures that have evolved through constant iteration based on practical applications of military warfare theory. For most of the known history of mankind, these support structures have involved human hierarchies that provide timely decision support to leaders at each stage of the hierarchical chain. The efficiency and effectiveness of this decision support system has in turn affected the outcome of war and, as a consequence, the fate of nations. With the unprecedented technological revolution of the past two decades, it was only to be expected that leaders would rely on technology to aid their decision support structures. While the natural application was in storing, retrieving, and calculating information beyond the capability of the human brain, what would be truly revolutionary is technology complementing the cognitive functions that make the human brain remarkable at abstract decision-making. It is the steady progress of technology in providing this level of cognitive decision support that will allow military commanders to consider a multitude of Courses of Action (CsOA) while undertaking the MDMP. Additionally, these technology-assisted decision support systems would allow the planning team to adapt these CsOA in near real time to the emergent situation and ultimately provide the military commander with a potent tool that drastically shortens the Observe-Orient-Decide-Act (OODA) loop.2 Modern warfare can be visualized as the interplay between five major components: sensors, shooters, decision-makers, information nodes (where data is stored/processed), and the ubiquitous network. Artificial intelligence (AI) has the capacity to observe, orient, decide, and act at a faster pace than the adversary and thus provide military advantage. The age of AI-enabled DSS (AI DSS in short), therefore, has arrived. AI DSS will provide the edge in the wars of the future in achieving battlefield dominance. These DSS would in turn be data-hungry in order to provide informed decisions. The need of the hour is a joint network-centric doctrine based on an AI DSS, offering a Common Operational Picture (COP) to the military commander, which in turn would give him/her a relatively advantageous military decision-making capability.
1 The Instruments of National Power comprise (but are not limited to) DIME, that is, Diplomacy, Information, Military, Economy. 2 The OODA loop concept was invented by Col. John Boyd of the USAF.
2 Understanding AI-Enabled DSS Since this chapter concerns the use of AI-based DSS in the defence ecosystem for the MDMP, it is imperative that we formally define a DSS. There are numerous definitions; the following is adequate for our purpose. A DSS is a computer-based system which supports the decision-making activities of an organization by compiling useful information from a combination of raw data, documents and personal knowledge, or business models to identify and solve problems and make decisions. One of the keywords here is data, and this brings DSS into the realm of data analytics.
3 How Can AI DSS Be Incorporated into the Defence Ecosystem The host of activities where the military should employ AI DSS fall under the activities of the OODA loop: Observe using smart and autonomous sensors; Orient using pattern recognition algorithms; Decide using DSS algorithms along with predictive analytics; and Act using autonomous schedulers and autonomous weapons. A minimal sketch of this mapping is given below. Let us then understand in particular how AI DSS will revolutionize the Decide portion of the OODA loop.
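As an illustration only, the mapping of OODA phases to AI components sketched above can be expressed as a simple software skeleton. This is a hypothetical sketch in Python; the function names and data structures are placeholders invented for this chapter, not part of any fielded system.

    # Illustrative OODA-loop skeleton (hypothetical names; not a fielded system).
    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class OODALoop:
        observe: Callable[[], List[Dict]]        # smart/autonomous sensors
        orient: Callable[[List[Dict]], Dict]     # pattern recognition / data fusion
        decide: Callable[[Dict], str]            # DSS algorithms + predictive analytics
        act: Callable[[str], None]               # autonomous schedulers / effectors
        log: List[str] = field(default_factory=list)

        def run_cycle(self) -> str:
            observations = self.observe()
            picture = self.orient(observations)
            course_of_action = self.decide(picture)
            self.act(course_of_action)
            self.log.append(course_of_action)
            return course_of_action

    # Toy wiring: each phase is a stub standing in for an AI component.
    loop = OODALoop(
        observe=lambda: [{"sensor": "radar", "track": "T-01", "confidence": 0.8}],
        orient=lambda obs: {"threats": [o for o in obs if o["confidence"] > 0.5]},
        decide=lambda pic: "monitor" if not pic["threats"] else "refer to commander",
        act=lambda coa: print("Recommended action:", coa),
    )
    loop.run_cycle()

The point of the skeleton is only that each OODA phase is a replaceable component: any of the stubs can be swapped for a sensor feed, a pattern-recognition model, or a decision algorithm without changing the overall loop.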
3.1 Understanding How AI Will Develop Military CsOA The role of the AI DSS would be to provide a plethora of choices; implementing the right choice would, at least for the present, continue to be the prerogative of the (human) military commander. The options provided by the algorithms will only be as good as the data available to them and their specific algorithmic traits. The success or failure of an operation would arguably depend on how good the options, in the form of CsOA provided by the AI DSS to the military commander, are in countering the enemy CsOA. How exactly the AI would provide these CsOA to the military planners is therefore a significant point. Machine learning (ML) can provide the algorithm (sequence of logical steps) for a given input and output. Therefore, if we consider the input data to be the means available to the military, namely weapon systems, and the output data to be the ends, namely the political end state translated into the military end state and individual objectives, then for the given input and output the ML system would provide us with the algorithm, that is, the military ways to connect the means to the ends. We could define strategy broadly as the ways (solution) that connect the means (resources) to the ends (results). This particular AI DSS, using its ML, would provide the specific strategy algorithm, that is, the solution to connect the resources to the end result. These solutions, tempered for a specific context of the level of war, namely
strategic, operational or tactical, would provide the military planners with specific CsOA that could be gamed in real time against enemy CsOA that would have been similarly derived by machine learning algorithms. The final effect would be a reduced-timeline OODA loop available to the military decision-makers. A reduced-timeline OODA loop translates into a decisive military advantage, assuming that the AI-enabled DSS that made the quick OODA loop is reliable! The generic military decision-making process (MDMP) adopted by armed forces across the globe involves various logical steps for planning a military campaign. AI could be employed to fill, order, and analyse all the data-intensive tables. For the military planner at the Headquarters (HQ), these AI DSS solutions could be narrowed down to identifying the Centres of Gravity (CsOG)3 at the various levels and thereafter suggesting the right platform to target them. Let us now see how this logic can be used in the operational decision-making cycle of a typical armed force. The components of operational decision-making in any military can be broadly divided into three groups: the databases, the planning tools, and the execution tools, all of which reside on the backbone of the computer servers. The databases primarily include the logistics, maintenance, operational, and intelligence databases. The planning tools primarily include the operational planning and weapon planning tools. The execution tools primarily include the integrated command and control systems (C2S) and the secure Operational Data Link (ODL), connecting those systems that may be geographically separated. The AI DSS would gain insight from the databases by using database management systems. The planning tools in turn would use the AI DSS to provide customized solutions (CsOA) to specific problems (enemy CsOA) using the template of the MDMP. The execution tools, especially the C2S, would use learning algorithms to provide a threat evaluation matrix and assign weapons in real time for the military commander running the campaign in a particular theatre. During peacetime, the training data captured in various exercises will allow the learning algorithms to constantly refine their recommendations, in the same way that military planners constantly improve their planning based on training/exercise feedback. AI can be extensively used in simulated warfighting environments in war games. It allows mission rehearsal to simulate complex battlefield operations across the spectrum of conflict using virtual reality interfaces and video gaming tools. It could automatically draw up a Joint Target List (JTL) based on the prevailing context. This JTL will be updated at higher frequencies owing to shorter Intelligence, Surveillance & Reconnaissance (ISR) cycles. The predictive properties of AI could unlock tremendous potential in the field of ISR. Presently, the ISR cycle involves manual identification of targets, with software used primarily to organize the data. This laborious process translates into time delays in the OODA loop, resulting in reduced mission effectiveness. The aim of using AI would be to reduce the time to recognize a target using pattern recognition algorithms. These AI systems, mounted on Unmanned Aerial Vehicles (UAVs), drones,
3
The source of power that provides moral or physical strength, freedom of action, or will to act. Thus the centre of gravity is usually seen as the source of strength.
and other such ISR platforms, would be able to autonomously assess images in real time using technologies like Generative Adversarial Networks (GANs).4 This in turn would reduce the bandwidth required to transmit the data to the Ground Exploitation Station (GES), as almost the entire filtering and processing of the data would be done on board. This Process, Exploit, Disseminate (PED) cycle available on board ISR platforms would revolutionize the ISR cycle, which presently is severely limited by the bandwidth required to transmit unfiltered data to the GES. The same technology would enable Non-Cooperative Target Recognition (NCTR) and automatic target recognition systems for aerial combat, thus minimizing fratricide and maximizing aerial combat kills in a dense air environment. Similarly, the potential for exploiting AI in defensive tasks is significant. As opposed to the utilization of AI for direct offensive purposes in enemy territory, its utilization in one's own territory for defensive purposes may be acceptable to the international community (given the apprehension regarding the reliability of autonomous weapons) and controllable. This would mean everything from defensive weapons like Air Defence (AD) systems to logistics management systems that facilitate defence. The use of AI in AD systems like Surface to Air Missiles (SAMs) and Anti-Aircraft Artillery (AAA) is promising, as shown by the success of the Israeli Iron Dome anti-rocket system, the US Phalanx Close-In Weapon System (CIWS), the South Korean SGR-A1 autonomous border protector, and so forth. Autonomous robots for de-mining and bomb defusing, and autonomous mule robots like the Big Dog and Cruiser (employed by the US Army) for load carriage, are seeing the most Research & Development (R&D) by the major powers. AI DSS can be used for effective perimeter security and border protection. AI systems can take inputs from multiple sensors and fuse them to obtain a narrative that allows realistic prediction of enemy action. For example, data feeds from video (visual, infrared), seismic, and acoustic sensors placed on autonomous drones will allow the population to be mapped for malicious intent. Once a threat has been identified, targeted analysis will allow complex interpretations including likely infiltration routes, behavioural patterns, individual vulnerabilities, and so forth. Ultimately, all this information will allow identification of the vulnerable points that need to be managed to ensure perimeter security or border protection. One of the primary roles of any armed force is deterrence. Towards this, it is important to influence the environment and prevent conflict by gaining anticipatory intelligence on the target area and target population. The predictive analytics capability based on pattern recognition has tremendous application in reducing conflict in areas with internal security problems. The use of AI DSS in the cyberspace domain is inevitable. They can facilitate the automation of vulnerability-detection and exploitation software. This was shown by the victory of MAYHEM, an AI developed by Carnegie Mellon University, in the
4 A generative adversarial network (GAN) is a neural network architecture that offers a lot of potential in the world of artificial intelligence. It is composed of two neural networks, a generative network and a discriminative network, which are trained against each other to produce highly realistic synthetic outputs.
2016 Cyber Grand Challenge, a competition started by the Defense Advanced Research Projects Agency (DARPA)5 to spur the development of automatic cyber defence systems which can discover, prove, and (if used defensively) correct software flaws and vulnerabilities in real time. Nonetheless, such systems can also be a double-edged sword, as this expertise would also be available to non-state actors with malicious intent. The use of AI DSS should not be restricted to direct operational matters alone. The low-hanging fruit in AI implementation in the armed forces would be adopting commercial applications that can be laterally incorporated. These would include inventory management, recruitment and Human Resource Management (HRM), resource acquisition and performance testing, budget management, and knowledge management, to name a few. Let us discuss the opportunities of a few of these.
3.2 Understanding Non-kinetic Military Applications of AI As already discussed, the databases residing on military computer servers, in separate applications, include the databases for material, maintenance, operations, and intelligence. These databases need to be interconnected to enable the supply chain of war-fighting resources to be handled efficiently. Predictive analysis and scheduling algorithms will allow war-waging resources, from aircraft to weapons to clothing, to be available at the right place, at the right time, and in the right amounts. AI can play a pivotal role in automated planning and manpower allocation. The armed forces would need a similar plan of action, with the added complexity of managing a manpower pool having diversity of ethnicity, language, religion, and, more importantly but less mapped, political/social/cultural leanings! The Hague Centre for Strategic Studies (HCSS) has done a comprehensive study, Artificial Intelligence and the Future of Defense (Stephan et al., 2017), on the incorporation of AI into the defence ecosystem. The Hague report rightly identifies that "Options analysis is a key early step in the capability development process". It further elaborates that Some governments require that the option set in the initial stages of the acquisition process includes at least one Off-the-Shelf (OTS) solution, where available, as a benchmark. Options that move beyond the requirements of an OTS solution must include a rigorous cost-benefit analysis of the additional capability sought so that the full resource risks and other impacts are understood by governments. (Stephan et al., 2017: 93)
As the military must be able to effectively implement any option presented, each option must be achievable in financial, technical, logistics, workforce, and schedule terms. The time, effort, and expense of examining each option in detail make it essential to concentrate on investigating usually no more than three or four options.
5
The Defense Advanced Research Projects Agency (DARPA) is an agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military.
The use of AI may improve the rigour of the analysis and employ trade-off analytics to expedite the process. Acquisition of new defence resources using the Defence Procurement Procedure (DPP) could be automated: it need not be restricted to a limited set of vendors, as an AI DSS can evaluate a practically unlimited number of vendors. The L1-T1 (lowest bidder-best technology) benchmarking could be closely assessed based on multiple inputs and historical precedence. AI DSS can be used extensively to evaluate the performance of weapons and systems. This will allow realistic fault analysis and rectification. The performance of air-to-ground weapons analysed using AI-assisted Circular Error of Probability (CEP)6 calculation will allow better planning figures (a minimal sketch of such a CEP calculation is given below). This can be extended to the performance of navigation and role equipment, and to engine performance for predictive maintenance and for follow-on procurement, modifications, and negotiations with the Original Equipment Manufacturer (OEM). The area of maximum employability of AI DSS tools in commercial applications is in managing budgets, and these AI tools can be seamlessly integrated into defence budgeting. AI may be able to read, understand, and verify that content is safe to send from one domain to another, helping to minimize data aggregation risks and reduce the risk of inadvertent leaks, using a technique called homomorphic encryption. For example, CryptoNets, developed by Microsoft, are trained deep learning systems which can take encrypted data and return encrypted (but accurate) answers. The Hague report elaborates: The potential applications of such a system for intelligence sharing could be revolutionary—enabling intelligence services to enjoy most of the strategic and operational benefits of freely pooling their information and data (e.g. allowing the analysis of patterns in enemy communications, tactics or strategy), without the fear that leaks or double agents might compromise that information or its sources. (Stephan et al., 2017: 95)
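As an aside on the Circular Error of Probability (CEP) analysis mentioned above, the underlying calculation is simple to automate. The sketch below, with invented impact coordinates, estimates CEP as the median radial miss distance from the aim point, that is, the radius within which half of the recorded impacts fall; it illustrates the arithmetic only and is not a description of any actual weapon-evaluation tool.

    # Minimal CEP estimate: median radial miss distance from the aim point.
    # Impact coordinates (metres from the aim point) are synthetic examples.
    import statistics

    impacts = [(3.0, 4.0), (-2.0, 1.5), (0.5, -6.0), (7.0, 2.0), (-1.0, -1.0)]

    def cep(points):
        """Radius within which half of the recorded impacts fall."""
        radii = sorted((x ** 2 + y ** 2) ** 0.5 for x, y in points)
        return statistics.median(radii)

    print(f"Estimated CEP: {cep(impacts):.2f} m")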
Imagine an automatic document classification tool (Google style!) that allows immediate retrieval. This would revolutionize the working of any military HQ, where arguably the maximum time is spent retrieving previous correspondence! Most governments have already implemented e-Office in many ministries as a first step towards digital correspondence and transactions. However, the digital database that accrues as a consequence would need to be captured in the right format (metadata) so as to be intelligible to the AI; otherwise it is simply garbage in, garbage out. The field of medical record analysis is seeing tremendous use of AI in the commercial space. The digital medical records that become available will be able to provide predictive and anticipatory alerts to the concerned individual regarding specific treatment. The breakthrough for the armed forces, however, would be in exploiting the predictive analytical capability of AI based on data analysis to optimize Casualty Evacuation (CASEVAC), Combat Search & Rescue (CSAR), and Humanitarian Aid and Disaster Relief (HADR).
6
In the military science of weapon ballistics, CEP is a measure of a weapon system’s precision.
Sustaining military efforts for longer periods of time requires regular rotation of units. Similarly, the very nature of command necessitates rotation of leadership every two years or so. Ideal unit rotation continuity is the process of passing an ongoing mission from one unit to another, or from one commanding officer to another, with no discernible loss of continuity in knowledge of terrain, populace, contacts, best practices, and so forth. To further understand this concept, one can draw a parallel with commercially popular AIs like Alexa (of Amazon) or Siri (of Apple), which recognize and process natural voice patterns to retrieve, remember, analyse, and present data such as general knowledge and music. AI would be able to curate a course for an individual as per past performance and future requirements, tempered by organizational necessity. This could be gainfully exploited in the Professional Military Education (PME) conducted for military personnel. Having seen how AI-enabled DSS will facilitate the MDMP in both kinetic and non-kinetic military applications, it is important to understand the inherent risks associated with incorporating AI DSS into the defence ecosystem.
4 Risk Associated with AI Elon Musk once commented on Twitter that AI is more dangerous than North Korea. The debate raging over Lethal Autonomous Weapon Systems (LAWS) boils down to whether they will be able to work within the framework of the Laws of Armed Conflict (LOAC).7 The two binding legal obligations in war are distinction and proportionality: distinction being the ability to discriminate between combatants and non-combatants, and proportionality being the ability to respond to an enemy in proportion to the attack. In the eventuality that a LAWS malfunctions or violates the LOAC, who would be responsible and accountable: the tactical operator, the military commander, the software developer, or the hardware designer? How do we test LAWS for predictability and reliability? Predictability implies assurance that the LAWS will perform as expected even in an unbriefed/unprogrammed environment, while reliability requires expected performance at all times. How do we test and validate LAWS for situations that have not been programmed? For example, in spite of extensive tests on the Samsung Note 7 smartphone battery, it caught fire unpredictably during a flight. The same Samsung company supplies the 'extensively tested' SGR-A1 autonomous border weapon system to the South Korean Government, which has deployed it on the border with North Korea. Who would be held accountable if the extensively tested SGR-A1 malfunctions the way the extensively tested Note 7 smartphone did?
7
LOAC arises from a desire among civilized nations to prevent unnecessary suffering and destruction while not impeding the effective waging of war. A part of public international law, LOAC regulates the conduct of armed hostilities.
The minimal risk to human life while employing LAWS means that the threshold of conflict will lower, as nations will be more willing to send LAWS against a weaker adversary. The drone war waged by the USA in Pakistan against the Haqqani network is a case in point. These distant wars, waged by drone operators working thousands of miles away from the continental USA, have reduced operations to video games and threaten to take humanity out of the loop. LAWS would not be affected by emotions and would be capable of escalating conflict at an unprecedented pace of war beyond human comprehension. Warrior virtues and leadership skills would be passé, as the only motivation required for LAWS is a power supply to charge them up! Why do we assume that humans will make better legal and ethical choices than machines? History bears witness that humans are more prone to making emotional decisions, many a time sacrificing bigger goals for narrow ones. A healthy combination would be man–machine teaming. Many of the apprehensions regarding the use of AI, especially autonomous systems, are due to misunderstandings regarding the level of AI and the degree of autonomy. Therefore, any discussion of legal and ethical issues should begin with an objective analysis of the level of AI and the degree of autonomy of the specific AI that is under scrutiny. There are three levels of AI: Narrow, General, and Strong AI. Narrow AI is better than humans only in one or more designated tasks; General AI can match humans in all tasks; Strong AI will surpass humans in everything. The degree of autonomy involves the degree of human control: human-in-the-loop, human-on-the-loop, or human-out-of-the-loop, with human control/override reducing from maximum in human-in-the-loop to nil in human-out-of-the-loop. A minimal sketch of these degrees of autonomy is given below.
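The distinction between these degrees of autonomy can be made concrete with a small sketch. The enumeration and the engagement-authorization rule below are illustrative assumptions made for this chapter and do not describe any actual weapon system.

    # Illustrative degrees of autonomy and a human-authorization gate.
    from enum import Enum

    class Autonomy(Enum):
        HUMAN_IN_LOOP = 1      # machine proposes; a human must approve each engagement
        HUMAN_ON_LOOP = 2      # machine acts; a human supervises and may override
        HUMAN_OUT_OF_LOOP = 3  # machine selects and engages without human control

    def may_engage(mode: Autonomy, human_approved: bool, human_vetoed: bool) -> bool:
        if mode is Autonomy.HUMAN_IN_LOOP:
            return human_approved      # nothing happens without explicit approval
        if mode is Autonomy.HUMAN_ON_LOOP:
            return not human_vetoed    # proceeds unless a human intervenes in time
        return True                    # out of the loop: no human control remains

    # The debate on "meaningful human control" is, in effect, about which of
    # these branches a state is willing to permit.
    print(may_engage(Autonomy.HUMAN_IN_LOOP, human_approved=False, human_vetoed=False))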
4.1 Ethics The ethics of using AI in LAWS makes everyone apprehensive, purportedly because of the lack of sympathy or empathy and doubts about whether machines are intelligent enough to do the ethical thing. When LAWS break the law, holding machines to task would involve legal wrangles that would confound the best of lawyers. However, the question that needs to be asked is whether there is really any cause for worry. What we need to determine is whether human beings have been able to take more ethical decisions or, better still, what constitutes an ethical decision. In fact, there may come a moment when the robots judge human beings to be totally unethical! The probability that algorithms may judge ethics better than the majority of humans do is a sobering and humbling thought. The place where both robots and humans run into problems is situations in which adherence to a rule is impossible because all choices violate the same rule. The ethical and moral sense of AI-enabled machines would depend on the parameters used for optimization. The choice of parameters would be iterated through constant feedback from input and output scenarios, and the choice of the code would ultimately rest with humans (at least in the initial phase). For example, if there was a rule that "saving more lives is better than saving fewer lives," then it imposes a decision-making constraint based on quantity and not quality. Would this choice
based on quantity versus quality of the lives saved be ethical, if we introduce the vital information that the "more lives" saved would be those of hardened criminals and the "fewer lives" saved would be those of the President of the USA and his spouse? In an egalitarian view of the world, one would just want the robot to save more lives rather than one life. But then, this would make it less human (if at all there is a gold standard of what it means to act like a human!) but truer to its code. The dilemma is not whether the robot can take the right decision; it is whether we can prevent selfish humans from tampering with its altruistic code! The question that begs to be asked here is: can AI have intelligence that is moderated by the heart? The elephant in the room, however, is the question of what constitutes having a heart in supposedly naturally intelligent, sentient human beings. Are only humans capable of taking the right decision? The debate, more often than not, is about what constitutes a right decision. The collective conscience of a society, one may argue, constitutes the right decision. But then, can one human being or a minority population impose their collective conscience on a majority? Perhaps the atrocities of history, like Hitler's gassing of the Jews and the genocides perpetrated by various dictators against their own populations, may provide a clue. Perhaps the Chat Generative Pre-trained Transformer (ChatGPT)8 chatbot might provide a clue as well. When you type in a prompt asking for military operations and tactics, ChatGPT provides answers that generally converge towards maintaining peace and refraining from war. Perhaps there is still hope in the collective human conscience that influenced the programmers at OpenAI (owners of ChatGPT), who chose the training data sets that allowed ChatGPT to provide pacifist solutions. Another question is whether the AI that takes such decisions will ever become conscious or self-aware (these are two separate attributes, explaining which is beyond the scope of this chapter). This author feels that what will become conscious or self-aware is the AI–human team, which may manifest physically in the form of cyborgs (a mix of organic and inorganic being) taking decisions based on a meta-database that is constantly evolving due to AI–human iterations. For example, with ChatGPT we often forget that the programme is responding to constant iterations and prompts actively provided by the sentient (conscious/self-aware) human in the loop who is conversing with the AI, together with the collective AI–human (meta)database (call it consciousness, if you will) available on the Internet servers that ChatGPT is plugged into.
5 Roadmap for Nations: Revolutionary Change in Approach The responsible nations of the world should have a clear and implementable AI strategy, especially since AI will impact all the departments of a government. Nations need a central coordinator for AI, which bases its strategy on specific AI
8 ChatGPT is a conversational AI tool built on a language generation model, covering Natural Language Processing (NLP), text generation, and language translation. Being multi-modal, it can process and integrate inputs in the form of text, images, and sounds. Curation and refinement of the suggestions rendered by ChatGPT can be done later to arrive at near-perfect solutions.
Task Forces that plan a whole-of-nation approach focused on national interests, along with the support of capable allies. Towards this, it is important to have a comprehensive education policy for AI that can leverage the tremendous potential of the human resources available in academia, industry, and commercial and private players. Participation by all interested parties should be encouraged by incentivizing and publicizing AI challenges. Discussions and AI R&D should be channelled to address new doctrines and force structures that are cost-effective for the Armed Forces, and expanded to address concerns of data security, privacy, and ethical and legal facets. All Defence Professional Military Education (PME) institutes should mandatorily include an AI syllabus. Both policy guidelines and training syllabi must emphasize the prevailing global standards for AI ethics, such as UNESCO's Recommendation on the Ethics of AI,9 or other prevalent legal standards. Mergers and Acquisitions (M&A) of DSS and of companies that hold large civil databases should pass the scrutiny, checks and balances of the government. An adversary (engaging in hybrid warfare) can use the big data of citizens held by domestic or foreign companies against the nation; a comprehensive policy envisaging the safety and the nation's own use of such big data needs to be formulated. There will be strong resistance from the vested interests of legacy system holders (in defence) against migrating to AI-based systems. The recent US congressional hearings on the data management policy and server locations of a commercial social media app like TikTok, with Chinese ownership, are just a few examples of the complexities that governments are going to face. They will need to balance diplomatic, economic, and military priorities while at the same time addressing public sentiment on data privacy and the convenience of app usage. The Armed Forces will need to adopt a roadmap that is in sync with the policy and research of the associated AI industry. The way forward is not just inventing or buying a new technology, but also exploiting it optimally. The visible R&D by most defence labs is focused primarily on using AI to enable autonomous transport vehicles. The employment concepts range from a single unmanned autonomous vehicle leading a convoy of unmanned vehicles to entire combat formations utilizing a mix of manned and unmanned vehicle platforms (land, sea, or air). For example, the Collaborative Operations in Denied Environment (CODE) concept of DARPA plans to utilize this manned–unmanned teaming model. The need of the hour is to transfer this technology to the forces through active military-academia-industry collaboration. The Armed Forces should understand the immense possibilities that this Revolution in Military Affairs (RMA) holds. Towards this, understanding AI DSS is the first step, and PME should be restructured to bring the forces up to speed. The use of AI should not be limited only to autonomous systems but should also be exploited heavily for augmenting staff processes through applications that find
9
In November 2021, UNESCO introduced the first-ever global standard-setting instrument, the Recommendation on the Ethics of Artificial Intelligence. This recommendation delineates principles to guide the development and implementation of AI systems, emphasizing transparency, explainability, accountability, and respect for human rights and dignity. It also advocates for the creation of a global observatory on AI ethics to monitor and evaluate the evolution of AI technologies and their societal implications.
similar use in the civilian commercial field. The Armed Forces should use AI DSS across the spectrum of conflict, at all levels of war, and in all operational functions, including ISR, logistics, administration, meteorology, and medicine. Pattern recognition systems would analyse the data received from multiple sensors and provide predictive analytics. These would help understand enemy behaviour based on past data and predict the locations and probability of threat areas and personnel. Intelligence would be optimized by machine learning (ro)bots that scan and analyse social media and cyber feeds, using voice recognition and natural language processing, to identify patterns that need to be flagged and investigated. The logistics of procurement and transportation could be optimized by machine learners that analyse multiple data feeds from the environment to recommend solutions to anticipated problems. The administrative functions of HR, documentation, medical services, and education can be optimized and made cost-effective by adopting commercial applications that are in vogue. Meteorological predictions using data feeds from autonomous sensor drones will allow optimum planning of resources, especially in inhospitable terrain. The ultimate progression of an AI DSS would be a LAWS integrated into an AI-enabled network manned by autonomous robot staff planning the CsOA that achieve the military end state. The adaptability of AI to a constantly changing and complex environment (as encountered by the armed forces) can only be achieved by R&D that is driven by doctrine, with strategy driving technology. For example, the hostile weather and terrain, coupled with the complex geo-political dispensation, that most armed forces face in the combat zone should drive AI research that supports a doctrine of war-fighting tailor-made to the region. Having identified the specific technology that would enable its strategy in a particular region, the process of design-testing-production-induction needs to be streamlined. It is amply clear from the preceding discussion that a whole-of-nation approach towards AI is inevitable, since the database that decides the optimal AI-based solution requires access not only to this kind of data, but to data that has been refined, that is, metadata. The metadata required needs to have an Uncertainty Quantification (UQ)10 associated with it so that it can be weighed by the AI DSS when making decisions or providing solutions to the military decision-makers. For example, in military parlance, a measurement is any information collected and used during an OODA loop. Each piece of information has been measured by a sensor of some sort and will have some uncertainty associated with it. Uncertainty quantification as metadata will take at least two forms: empirically generated measurement uncertainty (based on metrology standards) and statistically postulated uncertainty (determined by some means, of which there are many) (Abdar et al., 2021; Psaros et al., 2023). A minimal illustration is given below.
10 Understanding UQ as metadata requires understanding foundational concepts in metrology (the science of weights and measures) related to measurement uncertainty. That is, a measurement has two components: (1) a numerical value which is the best estimate of the quantity being measured, and (2) a measure of the uncertainty associated with this estimated value.
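To make the footnoted idea concrete, the sketch below treats each sensor reading as a value plus an uncertainty and fuses two readings by inverse-variance weighting, a standard way of letting less certain data count for less. The sensor values are invented for illustration; no specific military data standard is implied.

    # A measurement as (value, uncertainty) metadata, fused by inverse-variance weighting.
    from dataclasses import dataclass

    @dataclass
    class Measurement:
        value: float   # best estimate of the quantity being measured
        sigma: float   # standard uncertainty of that estimate

    def fuse(a: Measurement, b: Measurement) -> Measurement:
        """Combine two independent measurements; the less certain one gets less weight."""
        wa, wb = 1.0 / a.sigma ** 2, 1.0 / b.sigma ** 2
        value = (wa * a.value + wb * b.value) / (wa + wb)
        sigma = (1.0 / (wa + wb)) ** 0.5
        return Measurement(value, sigma)

    # e.g. a radar range and a less precise electro-optical range to the same target
    radar = Measurement(value=10200.0, sigma=50.0)
    eo = Measurement(value=10450.0, sigma=200.0)
    print(fuse(radar, eo))  # estimate dominated by the radar, with reduced uncertainty

Carrying the uncertainty alongside the value in this way is what allows an AI DSS to weigh each piece of information during the OODA loop rather than treating all inputs as equally trustworthy.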
Joint battlefield dominance in the age of hybrid and network-centric warfare cannot be achieved without AI. Manpower and resources need to be upgraded, and the link between them, that is, the doctrine, needs to be constantly revised along with the force structuring in order to exploit the manned–unmanned teaming that AI affords. AI-enabled DSS are the inevitable way forward for achieving battlefield dominance in the age of hybrid warfare. The backbone of AI DSS would be the data and their networking. The various services of the Armed Forces, namely the Army, Navy, and Air Force, need to be integrated as a prerequisite for evolving higher joint doctrines of warfighting.
6 Postscript Kalaripayattu is an ancient martial art of the Kerala state of India and is arguably the progenitor of all other martial art forms. Kalaripayattu warriors go through rigorous training spanning a significant part of their childhood and teenage years and continuing into adulthood. These warriors were highly respected in medieval Kerala society, and the head of a Kalaripayattu tradition was given specific social privileges almost equivalent to those of the local rulers. There was a reason for this. One particular style of Kalari was called Anga Kalari.11 This was single mortal combat between two warriors commissioned by two feuding lords. The purpose of the fight (or angam) was to settle disputes that could not be resolved by an intervening higher authority. Since the feuding lords were highly placed in society, it was not uncommon for the ownership of whole minor kingdoms to be at stake! When the fate of whole kingdoms rested on the relative prowess of two warriors, one can only wonder whether it was fair that the respective subjects of the warring nobility/royalty had no say in the matter. If we substitute AI-enabled DSS fighting the war on behalf of two feuding governments for the Kalari warriors, the question that remains is whether it is fair that the respective subjects of the warring governments have no say in the matter of two AI chatbots on opposing sides deciding their life or death, freedom or bondage. This is a question that the subjects of democratically elected governments and member nations of international intergovernmental agencies need to answer. Till then, may the best AI Kalari warrior win! The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom. —Isaac Asimov.12
11 https://kadathanadankalari.in/kalaripayattu-history-kalari-kerala-wayanad/.
12 Goodreads, "Isaac Asimov Quotes," https://www.goodreads.com/author/quotes/16667.Isaac_Asimov, accessed on May 21, 2019.
References Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., Liu, L., Ghavamzadeh, M., Fieguth, P., Cao, X., Khosravi, A., Rajendra Acharya, U., Makarenkov, V., & Nahavandi, S. (2021, December). A review of uncertainty quantification (UQ) in deep learning: Techniques, applications, and challenges. Information Fusion, 76.
De Spiegeleire, S., Maas, M., & Sweijs, T. (2017). Artificial intelligence and the future of defense: Strategic implications for small- and medium-sized force providers. The Hague Centre for Strategic Studies.
Psaros, A. F., Meng, X., Zou, Z., Guo, L., & Karniadakis, G. E. (2023, March 15). Uncertainty quantification in scientific machine learning: Methods, metrics, and comparisons. Journal of Computational Physics, 477.
Van Creveld, M. (2011). Napoleon and the dawn of operational warfare. In J. A. Olsen & M. van Creveld (Eds.), The evolution of operational art: From Napoleon to the present. Oxford University Press.
Chapter 14
Converging Approach to Intelligence: Decision-Making Systems in Artificial Intelligence and Reflections on Human Intelligence Sarita Tamang and Ravindra Mahilal Singh
Abstract More than seven decades ago, Turing (Mind 59:433–460, 1950) posed the far-reaching question: "Can machines think?" Well, it depends, Turing elaborates, on what we mean by "machines" and what we mean by "thinking". Thinking, more specifically rational thinking, is seen as a hallmark of human intelligence. A major philosophical endeavour then is to analyse the defining criteria of 'intelligence' and to see whether this definition could be universalized, extending to machines as well. Intelligence can be said to be constituted of two components, decision-making and reasoning, both related to rationality. So, the question now becomes, "Are machines rational?" That is, if machines can reason, think, and make decisions as efficiently as humans, can they be called 'rational'? The aim of this chapter is to argue for AI by considering decision theory in AI, where scenario thinking (the ability to predict the future) is a key attribute of intelligent behaviour in machines as well as humans. Based on the converging approach to intelligence in artificial systems and human reasoning, we can examine closely whether AI holds any insight into human reasoning and whether human actions themselves can be simulated through decision-making models in AI. Keywords Psychometric theories · Information-processing theory · Cognitive correlates of intelligence · Decision-making models · Decision theory · Intelligent systems · Expert systems · Decision support systems · Value judgement · Decision value The artificial world is centred precisely on this interface between the inner and outer environments; it is concerned with attaining goals by adapting the former to the latter. (Simon, 1996: 113)
S. Tamang (B) · R. M. Singh Department of Philosophy, University of Delhi, New Delhi, India e-mail: [email protected] R. M. Singh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_14
1 Introduction The questions of intelligence, such as "what is intelligence?", "what constitutes intelligence?", and "can intelligence be replicated?", have been of primary interest in the field of artificial intelligence. Recently, these questions have also drawn researchers from other disciplines, such as philosophy and psychology, into this interdisciplinary field to contribute to the ongoing discussions on the advancement of AI. One of the many aims of AI is to create intelligent systems that can make autonomous decisions in dynamic environments, just as humans make decisions in the real world. Collaborative efforts from the fields of psychology and computer science have paved the way for research into human learning and problem-solving that can be used to model intelligent systems. This chapter explores the theme of converging intelligence modelled on human decision-making and decision theory in computational systems. The first section of the chapter attempts to answer the question of "intelligence" in both human and artificial systems. The second section examines the history of theories of human intelligence, with detailed emphasis on the 'information-processing theory'; whether this theory of intelligence can be extended to computational systems is discussed as well. The third section of the chapter analyses the concept of intelligence garnered from the earlier sections, exploring reasoning and decision-making in both human thinking and computational systems. The question of rationality is raised through the lenses of reasoning and decision-making; for this purpose, decision theory in computational systems is considered through the concepts of 'decision value' and 'decision judgement'. This discussion paves the way for the possibility of convergence of decision-making models of human reasoning and expert systems, which we explore in the subsequent section. The fourth section of the chapter examines the converging approach to intelligence in the light of human decision-making and intelligent systems. The concept of rational decision-makers is examined through the lens of rational choice theory and the standard model of decision theory, and the challenges to the standard model of rationality in real-world dynamic scenarios are discussed, as well as whether human actions can themselves be simulated by intelligent systems in dynamic environments. The fifth and final section concludes the chapter by reflecting on modern intelligent systems in the field of artificial intelligence.
2 Defining ‘Intelligence’: Exploring Various Approaches to the Study of Intelligence
2.1 The History of Intelligence
Galton (1869), in his Hereditary Genius: An Inquiry into its Laws and Consequences, defines “genius” as a distinctive and exceptionally high ability of the mental faculties of an individual.1 Galton believed that these superior mental faculties are inborn and are inherited as part of the genetic make-up of an individual. Though Galton did not have the resources or methods of modern science to conduct an extensive study, his research was largely based on the observations he made of successful Englishmen in various fields of his time. He argued rigorously that the mental ability of an individual is an inborn trait that cannot be acquired through social training. Galton’s idea of intelligence rendered all individuals intellectually unequal, with intelligence considered an inherent capacity subject to growth only in a limited manner.2 Galton pioneered the systematic study of intelligence. It was not until the beginning of the twentieth century that the idea of testing intelligence was introduced. It was an intellectual movement in the French education system that paved the way for the objective study of intelligence. The motivation behind the idea to study ‘intelligence’ was to identify, through medical and psychological evaluation, intellectually deprived students in elementary schools who lacked mental competency in comparison to their intellectually developed peers. Alfred Binet, who had worked closely with Francis Galton, laid the foundational pillars of modern intelligence testing methods. In L’Année Psychologique, Binet and Theodore (1916) proposed a measuring scale of intelligence based on three methods to test the intellectual level of the child—the medical method, the pedagogical method, and the psychological method. Under the psychological method, a series of tests were designed to study ‘attention’ and ‘visual memory’ among children of various age groups. The tests included simple exercises of drawing figures from memory and the understanding and utterance of simple phrases. Binet and Theodore (1916) observed that children of a certain age, such as between the ages of seven and eleven, could solve a certain set of problems. Children at a more developed age were able to handle more
1 Galton, in his preface to the updated edition of his 1869 work, defines ‘genius’ as a mental ability, like any other natural ability, for instance, muscular strength. He also states that the title of the book should have been “hereditary ability”.
2 Galton did acknowledge that one’s intellectual capacity evolves over time with age, but the development is constrained by various hereditary traits.
complex exercises or tests.3 This series of experiments led to the birth of the idea of “mental age”.4 It should be noted that Binet’s work on the study of intelligence was an improvement over Galton’s idea as it treated intelligence as a complex of mental abilities that are influenced by socio-cultural phenomena.5 Binet attributed a very small portion of intellectual development to hereditary influences. Mental illnesses which could affect the intellectual growth of a child, such as neuropathic affections, were examined under the medical method (1916: 231). Binet gave us a much more refined idea as to what constituted ‘average’ intelligence in terms of the Binet–Simon scale. The Binet–Simon scale was set on what was observed to be normal responsive behaviour among children of various age groups (1916: 261). Binet’s work set the pace for future work on intelligence. Modern cognitive tests of intelligence which assess mental abilities, such as aptitude and reasoning skills, owe much of their lineage to the Binet–Simon scale as a ‘measure’ of intelligence.
2.2 The Modern Approach
2.2.1 Intelligence as a Psychological Measure—The Psychometric Theory of Intelligence and Its Critique
Today, the tests which collectively give a quantitative measure of these mental attributes or cognitive capacities fall under the psychometric theory of intelligence. Psychometric theory, also called ‘test theory’, is the field of study providing a general framework for constructing the theoretical models and statistical methods underlying the design of modern tests of intelligence. These theoretical models are aimed at assessing human behavioural and social attributes, and they concern both the batteries of tests themselves and the behavioural phenomena that the tests measure. Psychometric theory governs the layout, construction, and modifications that any test is required to undergo in order to give
3 Though Binet himself questioned what the bar of “normalcy” should be, his bar of normalcy was ultimately based on empirical statistical observations.
4 Binet defined the idea of “mental age” as the development of intellectual level with age. He writes, “The maturity of intelligence is the growth of the intelligence with age.” (Binet and Theodore, 1916: 259).
5 It should be made clear that Binet’s work is not to be studied in contrast with Galton’s insight on intelligence, for one, because Binet’s aim was to study the ‘lack’ of intelligence in children who were deemed subnormal and not to study intelligence per se. Binet writes, “Our purpose is to be able to measure the intellectual capacity of a child who is brought to us in order to know whether he is normal or ‘retarded’. We should therefore, study his condition at the time and that only. We have nothing to do either with his past or with his future” (Binet and Theodore, 1916: 191). Binet believed that intelligence itself could never be measured in its entirety. Unlike Galton, Binet’s concern was not to define intelligence but merely to study the ‘normal’ and ‘subnormal’ in educational institutions.
an overall measurement6 of the behavioural assessment (Raykov & Marcoulides, 2010). While the psychometric approach is useful for studying the behavioural attributes of intelligence in a scientific and objective way, it can be criticized on the grounds of its design. The criticism levied often appeals to the purpose of these tests. For instance, Lippmann (1922), in his article published in the New Republic, criticized the psychometric theory. He argued that such tests are based on arbitrarily selected puzzles which, rather than being a true measure of intelligence, are merely a method to divide people into different intellectual groups. Such a battery of tests provides little insight into what intelligence really is and fails to explain the grounds of the observed differences in cognitive capacities among different individuals. Another criticism of psychometric theory concerns the correlation between the variables being measured: whether the variables need to be treated as different but correlated attributes or as a single attribute defined by the general intelligence factor ‘g’ remains a problem of factor definition requiring a nontrivial psychological assumption (Hunt, 1983). Jenson (2002), while drawing a distinction between intelligence and the ‘g’ factor,7 questions the grounds of such psychometric tests. The ‘g’ factor tests out the common variance across different cognitive tests; but, Jenson argues, such cognitive tests are designed to test the correlation between high test scores and the cognitive processes that these tests seek to measure, thus giving a positive empirical correlation between them each time. The limitation here is that such tests are designed with few cognitive tasks and are themselves loaded with a high ‘g’ factor, giving them a high predictive status, which is the very aim for which these tests are designed in the first place. Moreover, the ‘g’ factor itself is not a properly understood term (Jenson, 2002: 48). The ‘g’ factor commonly measures the individual differences that these tests examine, but it does not explain the grounds for such differences among the various subjects, which may lie more in non-psychometric variables such as the neuro-physiological structure of the subject’s brain. The standard psychometric tests that treat ‘intelligence’ as a one-dimensional trait do not consider, or are not broad enough to accommodate, the non-psychometric factors that may play a crucial role in the understanding of intelligence. Gottfredson (2002, 2018) has also raised the concern that the ‘g’ factor might be too general a construct to translate outside academia. The ‘g’ factor can be one among the various factors of intelligence that operate in practical life, where context and culture play crucial roles (2002: 334). In some circumstances, the role of the ‘g’ factor is very limited as it does not cross cultures. Since most cognitive tests do not factor in cultural variance, the tests cannot be a true reflection of what is to be regarded as intelligence. Such tests become biased, as Lippmann suggested, where assessments
6 Measurement is generally regarded as the method used to give a quantitative description of observed variables (Price, 2016: 5).
7 The ‘g’ factor is the common quantifiable factor which measures across all the variables that the intelligence tests analyse. Jenson defines it as “one source of variance common to performance on all cognitive tests, however diverse” (2002: 40).
are made relative to a small group of individuals. Another measure on which the psychometric theory falls short with respect to the study of intelligence is that the ‘g’ factor may not sufficiently encompass all the abilities that are deemed intellectual (Wechsler, 1975: 136; Ubrina, 2011: 24). There are certain abilities that are noncognitive in nature, hence not captured by the psychometric approach, which play a larger role in practical life, such as persistence and the ability to adapt to one’s environment. Wechsler explicitly stated that the purpose of intelligence tests is not to measure the mere cognitive capacities of an individual but how one understands the world around oneself and copes with the challenges of the real world.
2.2.2 Intelligence as Information Processing—The Computational Approach to Intelligence
The psychometric theory is based on the behavioural approach to intelligence. One of the main setbacks of the theory, as we saw in the earlier section, is that it does not capture the concept of intelligence as such. Intelligence, under the behavioural approach, is seen as a set of observable and measurable traits that operate on a very abstract level. It provides no insight into the nature of intelligence, nor does it explain the cognitive differences between individuals in any satisfactory manner. But with the emergence of cognitive science, the focus shifted from observable/measurable behaviour to identifying the internal processes that are involved in intelligent behaviour. Since intelligence is a mental trait, it should be studied in terms of mental processes (Hunt, 1983: 142). Such an approach is categorically distinct from psychometric theories as it defines intelligence as a dynamic concept rather than an abstract concept that extends over a few measurable variables. Under this approach, the thinking paradigm is emphasized more than the assessment of the manifested variables of intelligence such as various cognitive test scores. Sternberg (1977a, 1977b, 1977c) greatly emphasized the need to study the phenomenon of intelligence in terms of the processes and subprocesses involved in a cognitive task. The works based on the study of cognitive correlates of intelligence, such as Hunt (1980, 1983), Hunt et al. (1986), and Sternberg (1977a, 1977b, 1977c, 1983), theorized that intelligent behaviour (in terms of the ability to solve complex problems) was associated, to some extent, with the efficient execution of Elementary Cognitive Tasks (ECTs) such as storage of information in the working memory, retrieval of information from the long-term memory, attention switching, and so on. The information-processing theory of intelligence, as it is commonly known, was developed under this cognitivist approach. This approach took the study of intelligence beyond the psychometric models of intelligence and focused on how information is stored and manipulated in the mind.8 The pioneering work of Hunt et al. (1973) in this field was aimed at studying the cognitive correlates between the psychometric measures and the mental
8 With modern developments in neuroscience, it would be more correct to say that those processes and sub-processes are to be studied as mechanisms in the “brain”; but, as was also noted by Hunt, neuroscience was still in its infancy. The information-processing theory was a theoretical cut between behaviourism and neuroscience. It treated the mind (or the brain, for that matter) as a ‘system’ which is a process executioner in nature. We, the authors, do not contend the viability or non-viability of this approach.
processes that are involved in problem-solving skills, seeking to explain the individual differences in the cognitive capacities of the subjects.
2.3 Machines as Information-Processing Systems
As one would observe, the cognitivist approach is analogous to ‘computationalism’. The mental processes that are treated as fundamental in human thinking are the same as the processes that store and manipulate information in a machine’s cognitive system. A cognitive psychologist would explain problem-solving in humans as involving a representation of the problem in the mind of the individual. The representation is then manipulated in a way that is recognizable to the mind and is categorized to look for the most suitable strategy to solve the defined problem. What is meant by the “manipulation of the internal representation” is the rearrangement of the received information in the working memory of the system (Hunt, 2011: 143).9 This cognitive approach is analogous to the intelligence of a computer whose task, much like a cognitive task in humans, relies upon the performance of the sub-processes of the system. So, the notion of intelligence, in humans, comes down to the efficiency of the sub-processes or the elementary mental operations involved in the decision-making process or certain problem-solving strategies. Hunt (1983) draws a comparative model of human and computational intelligence by defining intelligence, essentially, as the information-processing mechanism.10 In humans, the componential analysis is divided into the processes related to the exchanges of information between sensory input, the working memory and the storage of new information in the long-term memory of the system. In machines, the cognitive information processing is reflected by the extent to which the external environment is captured by the internal representation of the system in the working memory, the efficiency of the programme which runs the system, and the computational power of its elementary processes involved in carrying out a certain task. Historically, the computational idea of intelligence that defines intelligence as the measurable manifestation in behaviour was propounded first by Alan Turing (1950). He argued that computers can act intelligently insofar as they manifest the same intelligent behaviour as humans. Since any system that can act intelligently is labelled an intelligent being, Turing treated ‘acting intelligently’ as equivalent to ‘being intelligent’. In his 1950 paper, “Computing Machinery and Intelligence”, Alan Turing posed the important but provocative question “Can Machines think?” It depends, he suggested, on what we mean by
9 The term ‘rearrangement’ can be understood as the manipulation of the represented information in a way that allows the system to understand the meaning of the presented external stimuli.
10 In humans, they are treated as the mechanistic aspects of thinking.
“think”. If by thinking we mean whether they can act intelligently or not, then (digital) computers can act as intelligently as humans. This bold hypothesis of Turing was to be tested by his proposed ‘imitation game’, in which the interrogator must distinguish between two ‘witnesses’, one of which happens to be an actual human and the other a machine. The interrogator can pose several questions to both ‘witnesses’, neither of whom can be seen. The answers to the questions asked are, preferably, typewritten, to eliminate any empirical clues of the difference. If the interrogator fails to make any such distinction, the machine can reliably be said to have passed the Turing Test. Turing’s idea was helpful in divorcing the idea of intelligence from its anthropocentric conception and showing it as a much more abstract concept (medium independent/multiply realizable) that is manifested in a certain kind of behaviour that a system adopts with respect to its given circumstances/environment. At least within the computational approach, intelligence is treated as a highly adaptive behaviour of a system towards its current situation or environment. This adaptive behaviour in humans is understood as “the manipulation of an internal representation of an external environment” (Hunt, 2011: 142). Poole and Mackworth (2010), in a manner somewhat similar to Hunt’s model of information processing, define intelligent behaviour as behaviour displayed by a system that is adaptive to its changing environment, meaning that it acts according to its changing circumstances. The Turing Test is not a test for humans, nor is it a demonstration of the computational powers of machines. The test, for Turing, posed a philosophical challenge that any future advancement of AI faces: ‘Can machines be as intelligent as humans?’ What we have done so far is to analyse the defining criteria of intelligence. Historically, intelligence in humans has been treated as a bundle of traits that can be measured over a set of variables. The psychometric theories of intelligence strictly treat intelligence as a measurable trait. As an alternative to this idea, we have discussed the information-processing mechanism which treats intelligence as an abstract, dynamic concept that can be further analysed in terms of the process-execution mechanism of a system. Humans, under such a view, are presented as much more sophisticated systems than computers, but their process-execution mechanism is analogous to that of computers. If we can draw an analogy between human thinking and computational thinking, we can have a working definition of intelligence that is applicable to both humans and machines. But the question, “Are machines intelligent?” still stands before us. This question is not so much a technical question to be left to researchers working in the field of AI as a philosophical question that delves into the essence of humanity and what ‘intelligence’ in humans amounts to. So, like Turing, we are also compelled to modify our question into something more analysable and debate-worthy.
3 Machines as Intelligent Beings
3.1 The Question of Rationality
The question of ‘intelligence’ can be further translated into the question of ‘rationality’. To answer the question whether computers can be intelligent like humans, we need to look at the notion of intelligence in humans. Since the time of Aristotle, the idea of intelligence has been built upon the idea of ‘rationality’. Philosophers for ages have cherished the idea of rationality as central to humanity. For Aristotle, what makes us human is that we are rational beings. As humans, reason is not only our most powerful tool, but it is also what sets us apart from other less intelligent beings (Searle, 2001: 8). So, the question now becomes “Are machines rational?” Rationality can be further analysed into two components. The first component is the ability to make an autonomous decision in a dynamic or evolving environment. The second component is reasoning. In humans, reasoning is the faculty that enables one to make better decisions under a given set of circumstances. Both reasoning and decision-making are related to rationality.
3.1.1 Reasoning: A Comparative Analysis Between Human Reasoning and Reasoning in Computational Systems
Thinking, much like any mechanical task, can be understood as symbolic reasoning (Poole and Mackworth, 2010: 7). As Haugeland (1985) in his book Artificial Intelligence: The Very Idea has proposed, thinking is “computing” (or computational) where computation is to be understood as rational manipulation of mental symbols (ideas) (Haugeland, 1985: 4).11 A computer, as Haugeland defines it, is an “interpreted formal system” (1985: 106). A formal system, like a game of chess (physical) or Conway’s game of life (digital), is a system where tokens (either physical or digital) are manipulated according to the rules.12 A digitalized formal system is a set of positive and reliable techniques or methods for producing (meaning, writing) and re-identifying (meaning, understanding) tokens (1985: 53). It is the latter aspect, that is ‘understanding’, which Haugeland is concerned with, emphasizing the idea that automated formal systems are coherence oriented. What it means is that given the set of rules that governs the formal system and the tokens introduced (i.e., given inputs), the system will interpret (understand) the tokens and produce reasonable output under the given context. This definition of a computer as an ‘interpreted formal system’ allows Haugeland to reflect on the reasoning involved in AI. For Haugeland, any system is reasonable 11
This resonates very strongly with Hunt’s idea of reasoning as ‘manipulation of internal representations’. 12 In a game of chess, the ‘pieces’ are tokens but in computers ‘tokens’ refer to “symbols” in Haugeland’s terminology (1985: 48) or “internal representations” in Hunt’s terminology (2011: 141).
enough if it can interpret (by making sense of) the tokens to produce reasonable output. Haugeland espouses the idea that AI is modelled on human intelligence. Human thinking is a form of symbol manipulation, and computation is the mechanizing of this theoretical idea through the notions of formality, automation, and interpretation (Haugeland, 1985: 213). The ‘rationality’ factor in AI is measured by the system’s ability to achieve goals by making rational decisions. The action of a system can only be rationalized based on the beliefs it has. The beliefs are often encoded in computers as ‘interpretations’ of the symbols they are fed. Just as human behaviour can be made sense of only by coordinating it on a map of beliefs, the output that a computer gives is based on the internal ‘interpretation’ of the system. A system can be ascribed a belief-set when it acts on the interpretation that it has. So far, we have drawn a parallel between human and computational reasoning, but what about decision-making? ‘Reasoning’ is a constitutive part of any decision-making process. In AI, ‘reasoning’ is based on the internal representational structure of the system, but there is a computational limitation to it, in the sense that a machine acts only on what it is programmed to do. The human brain is a far more complex structure that operates in a dynamic (real-world) environment. The attempt here is not to reduce the human brain to a mere computational machine, as it can be argued that the human brain is far more complex than any advanced AI machine (Gigerenzer & Goldstein, 1996). This concern is echoed by Haugeland (1985: 185), who presents machines as micro-worlds where the system has a limited scope of operation and specific goals to achieve. A human brain cannot be treated as a mini-world as it operates in a much more dynamic environment. But reasoning itself is concerned with a solution-oriented approach towards a set of problems. Reflection on ‘reasoning’ in human thinking demands studying it within a system and its environment, thus making it a micro-world.
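To make the notion of a formal system slightly more concrete, the following sketch (a minimal illustration written for this chapter, not drawn from Haugeland's text) shows "tokens" being manipulated purely according to rewrite rules, without any regard to what the tokens mean; an interpretation would then assign meanings to the tokens and judge whether the outputs are reasonable. The rules and tokens are invented for the example.

```python
# A minimal sketch of a formal system in Haugeland's sense: "tokens"
# (here, characters in a string) are rewritten according to purely formal
# rules. The rules below are invented solely for illustration.

REWRITE_RULES = [
    ("AB", "BA"),   # if the pattern AB occurs, it may be rewritten as BA
    ("BB", "A"),    # two Bs collapse into a single A
]

def step(tokens: str) -> str:
    """Apply the first applicable rewrite rule once; return the string unchanged if none applies."""
    for pattern, replacement in REWRITE_RULES:
        if pattern in tokens:
            return tokens.replace(pattern, replacement, 1)
    return tokens

tokens = "ABB"
for _ in range(5):
    new_tokens = step(tokens)
    if new_tokens == tokens:
        break               # no rule applies any more; the system halts
    tokens = new_tokens
print(tokens)
```

The point of the sketch is only that the manipulation is rule-governed and automatic; on Haugeland's account, it is the added layer of interpretation that turns such token-shuffling into something we are prepared to call reasoning.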
3.1.2 Decision-Making: A Comparative Analysis Between Human Decision-Making and Decision-Making in Computational Systems
Decision-making, like reasoning, needs to be considered within a set of systems, a subset of which is reasoning. Artificial intelligence in machines is modelled on human-like intelligence. The reasoning within artificial systems is computational since it is symbolic in nature. These artificial systems are essentially information-processing systems that are goal-oriented. For this reason, they are rational artefacts (Simon, 1996: 21). The rationality in AI is associated with the efficiency with which an artificial system can achieve its (desired) objective. In artificial systems, also known as computational systems, intelligence is often described as the alchemy of its adaptability to the system’s changing external environment (circumstances) and internal environment (goals). Intelligent behaviour, then, pertains to the interface between the external and the internal environment of the system within the problem-solving framework. A system is reactive towards its external environment by observing its current state (input). A system acts on its environment by bringing
about certain internal actions that lead to an external outcome (output). The outcome is analysable, through regressive reasoning, by the chosen internal actions13 and the expectations that the system has in the light of its goals or objectives (Pomerol, 1997: 5). Within the classical theory of decision-making, it is the preferred outcome that defines the action to be chosen where that outcome is the possible goal of the system. Keeney (1987) defined this as value-driven thinking in ‘expert’ systems where the desired outcome is made explicit to the system (agent) and the actions are chosen according to the preferences or relative desirability of the outcomes (Keeney, 1987: 405). The question as to what outcome is preferred or desired depends on the inherent values that are inbuilt within the system. The values of the system are defined by the structuring model of hierarchical objectives and the development of an objective function to integrate the achievement of hierarchical objectives into a single overall measure (Keeney, 1987: 406). The classical theory assumes that the chosen actions are reasoned out by the inherent values of the system.
3.2 Decision Theory in Computational Systems An oversimplified way of explaining the classical theory of decision-making is to think of a system as an agent operating in the inner environment and the outer environment. The system perceives the current state and immediately investigates its memory folder where past experiences are stored. The system, then, tries to identify its current experience with the past experiences.14 The current state when combined with action brings a certain outcome, i.e., the future state. But this classical picture does not come without its challenges. For instance, the predictability of the future state by the system is not always certain. This is in part due to the available alternatives to the system which do not strictly correspond to one outcome, i.e., choosing of one alternative over another may not lead to the desired result. This part of the problem is considered under ‘decision-value’. The other reason is the multiple (fundamental) objectives that the system needs to weigh to maximize the expected value. This part of the problem is considered under ‘decision analyses’. We shall discuss these two important and interrelated components of decision reasoning in turn. 13
It is important to mention here that the actions of a computational system, unlike human behaviour, is internal to the system. The actions can be loosely defined as a set of possible alternatives that in various degrees lead to the different outcomes. A system in each state, together with the performed action leads to an outcome. Therefore, each possible action corresponds to an outcome. The action is chosen, by the system, based on the preference or the desirability of the outcome measured against each alternative action. 14 This is known as ‘pattern-matching’. It is a common knowledge that any current configuration of a system is the evolution of the past events and the governing principle of the system. In other words, the present state of the system can be traced back through its trajectory of evolutionary states in the past. However, it should be noted that, this assumption does not take into account the scholastic systems such as in Markov Decision Process where the Markov property is that the future state is independent of the past, given the present state.
A given set of alternatives available to a system, at a given time, yields a set of different probable outcomes.15 With each current state, there is an expected value that is attributed to an outcome based on the past recorded events or states and the information available to the system at the given time. Under the classical picture, the choice of some particular action, within a series of actions of which it is a subset, will lead to an expected outcome. The goal of the system, when faced with a decision problem,16 is to maximize the expected value by weighing the utility of future scenarios (Simon, 1983: 12). This is known as a value judgement. In complex decision problems, if there are multiple objectives, it becomes hard to yield the best outcome by achieving each objective with maximum efficiency. In such cases, the expected values associated with each attribute designed to achieve a set of objectives are weighed to define the probability of the outcome. Certain values are brought down (also known as ‘value trade-offs’) in favour of a more important objective to achieve the single overall measure. Hence, the aim of a system is to combine all the values into one function, known as the ‘utility function’ (Keeney, 1987: 141). A system is treated as a rational decision-maker which has preferences among possible attainable outcomes. So, when the system is presented with a decision problem, it can choose a series of actions that lead to the preferred outcome. Since the choice of a particular action depends on the inherent values built within the system, the system needs an objective function to integrate the achievement of multiple objectives into a single overall measure by doing decision analysis. The objective utility function, symbolically represented by ‘u’, aims towards a desired final outcome (x) which has a higher expected utility compared to all the other possible outcomes, represented as ‘u(x)’. The role of the objective utility function is to create a model of values based on which the expected utilities of each of the set of objectives are measured, and quantified values are assigned to each of them. The choice of action, then, depends on the value judgements involving each of the objectives. (A schematic sketch of such a weighted objective function is given at the end of this section.) So far, we have focused our discussion on the decision-making model in computational systems. But how much of it is reflected in the decision models in human reasoning? How does scenario-thinking work in human reasoning? Moreover, what insights does reasoning in computational systems hold for models of rationality in human reasoning? The next section explores this question in the light of the earlier discussions on ‘rationality’.
15 This can also be referred to as an alternative course of action. This vocabulary cuts across both the mechanistic and the humanistic aspects of decision theory.
16 The decision problem can simply be described as a gap between the current state of the system and a more desirable state. The decision problem involves a rigorous reasoning process towards the desired state or outcome through choosing and subchoosing a series of actions.
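As referred to above, the value-driven idea of folding several objectives into one overall measure can be illustrated schematically. The sketch below is a simplified reading of that idea, not a reproduction of Keeney's formulation: a handful of hypothetical objectives are weighted (the value trade-offs) and combined into a single utility u(x), and the alternative with the highest aggregate score is chosen. All names and numbers are invented for the example.

```python
# A minimal sketch of a value-driven choice: several objectives are folded
# into one overall utility u(x) via weights (the "value trade-offs"), and
# the alternative with the highest u(x) is selected. All figures are
# hypothetical.

WEIGHTS = {"cost": 0.5, "speed": 0.3, "reliability": 0.2}  # value trade-offs

# Each alternative is scored (0-1) on every objective.
ALTERNATIVES = {
    "plan_A": {"cost": 0.9, "speed": 0.4, "reliability": 0.6},
    "plan_B": {"cost": 0.5, "speed": 0.9, "reliability": 0.8},
    "plan_C": {"cost": 0.7, "speed": 0.6, "reliability": 0.9},
}

def utility(scores: dict) -> float:
    """Aggregate the objective scores into a single overall measure u(x)."""
    return sum(WEIGHTS[obj] * scores[obj] for obj in WEIGHTS)

best = max(ALTERNATIVES, key=lambda name: utility(ALTERNATIVES[name]))
for name, scores in ALTERNATIVES.items():
    print(name, round(utility(scores), 3))
print("chosen:", best)
```

The weights play the role of the inherent values "built into" the system: changing them changes which alternative comes out on top, which is exactly the sense in which the choice of action is reasoned out by the system's values.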
4 The Convergence Approach to Intelligence: Decision Support Systems in Rational Decision-Making
4.1 The Principles of Rationality—Rational Decision-Makers
In humans, rationality and decision-making are closely knit concepts. Aristotle defined man as a rational being. But what does rationality in humans amount to? A more fundamental question is: what is ‘rationality’ itself? Rationality, in general terms, is that faculty of the human mind which is dedicated to reasoning. A human is considered rational if s/he can reason adequately. This view of rationality essentially comes from the philosophers who, in the golden age of Reason, emphasized ‘rationality’ as the essence of humanity. Reason is that practical faculty, as Kant announced, which is of the highest value (Kant, 1922).17 Kant defines reason as the faculty of principles. The kind of knowledge that is gained from pure understanding alone is called knowledge of principles. For instance, the knowledge of the mathematical axiom that the shortest distance between two points is always a straight line is gained from pure understanding of concepts. We have a concept of a ‘point’ and a ‘straight line’. By using these concepts, we intuitively understand the meaning of having a straight line as the shortest distance between two points on a plane. So, for Kant, reason is the exercise of that faculty of knowledge which comprises the concepts from pure understanding. But reason does not operate in a vacuum. Irrespective of the content that reason deals with, the knowledge of certain concepts is gained by the principles18 which are operative within reasoning. For Kant, the principles of human reasoning are the concepts from pure understanding. In the philosophical tradition, as for Kant and other rationalists, reason is that faculty of the mind which provides us with innate knowledge. Since what is known innately cannot be categorized as knowledge per se, it was called ‘principles’. But the centrality of principles in the philosophical tradition may not be the same as the general understanding of the principles of reason. Generally, when we ask, “what are the principles which operate on reason?”, we might not necessarily look for any innate or a priori knowledge in the Kantian sense. What we are looking for are the principles that govern human reasoning. We are not inclined, at least for present purposes, towards the study of reasoning as such; rather, we are tempted to seek the functioning of reasoning in real-world scenarios. While the former inquiry is concerned with pure reason or rationality-in-itself, the latter line of inquiry leads us to study reason in a pragmatic way. Under such a description, reason is seen as an ‘instrument’. This conception of rationality is known as instrumental rationality.
17
Kant says “All our knowledge begins with the senses, proceeds thence to the understanding, and ends with reason” (Kant, 1922: 242). 18 It is not clear as to what Kant meant by ‘principles’. The problem here, for us at least, is to decide whether there are some principles which govern reason or is it the reason which exercises the principles which are known intuitively. A generous reading of the ‘principles’ is to understand them as a set of concepts that are built from pure understanding alone. In simple words, these are those fundamental axioms of human knowledge which are learned through reason alone.
4.2 The Rational Choice Theory and the Standard Model of ‘Decision Theory’ Any human being, who is now considered an agent, operates within a dynamic environment in which they act rationally or irrationally (Simon, 1983). For instance, in game theory,19 which generates predictive human behaviour models, rational human behaviour amounts to making choices that are consistent with one another (Osborne, 2003: 4). It is assumed that a rational human being is one who has a complete set of preferences (Savage, 1954) based on which they choose among the available set of alternatives (preference over actions), where preference is a matter of choice. In other words, preferences are reflected in the choices made. The choices made are qualitatively free and depend upon the principle of maximization of expected utility. This is also known as the utility theory (Sugden, 1991: 752). Much like the decision problem in an expert system, where the choice of action depends on the inherent values of the system which then go on to model the preferences, reasoning in human thinking is studied by the ‘choice-problem’ in decision theory. Given an agent who has some desired state to achieve, a decision needs to be made to move from the current state to the desired one (the desired state is the objective of the agent). The decision involves choosing the best pathway to reach the desired state in the most efficient way possible. This part of making the right choice, in terms of selecting the finest among the available set of alternatives, is concerned with the decision-making part of the agent. The decision-making process, in turn, concerns reasoning, which results in picking up the best option among the set of existing alternatives that maximize the expected value of the outcome. The rational choice theory, as it is known, assumes that the agent has a set of preferences. What it means is that when faced with at least two sets of alternatives, say A and B, the agent has a strict preference for one alternative over the other(s). We should be able to say that “the agent strictly prefers action A over action B” or “the agent strictly prefers action B over action A”. The choice of the action, or subactions, is rationalized based on the preference of the agent. The theorization of the preference function involves quantifying each alternative by associating a number with it, assigning a higher number, hence higher utility, to the preferred set of alternatives. Symbolically, it could be represented as u(a) > u(b)
where ‘a’ is preferred over ‘b’ and ‘u’ denotes utility.
So far, we have presented the standard picture of the decision theory which is based on the instrumental notion of rationality. The standard or classical decision theory dictates the process of optimal selection of a choice of action based on a principle, a principle of maximization of expected utility. In fact, weak preference 19
Game theories are a set of models that aim to understand and predict the behaviour of decision makers or rational agents in an interactive and dynamic environment. The game theory models are applicable in many fields including economics, social, and cultural sciences.
for one choice over the other and maximization of expected utility20 serve as the two major principles of ‘rationality’ in classical decision-making models (Edwards, 1954: 382). However, this standard model is more prescriptive than descriptive in its nature. The common assumption that an agent, or the decision-maker, can evaluate the best option available based on some objective principle often fails to depict real-world decision-making situations. Why such a model would fail depends on many factors. It could be that the information at the disposal of the decision-maker is incomplete or sometimes inaccurate. The standard model emphasizes that the choice of action is based on the preference among the available set of alternatives and that the agent has the capacity to know beforehand the consequence that each action would yield. But, in real-life situations where many of our decisions are contextual, the information is not always complete, nor can full certainty be attached to the possibility of certain actions leading to a specific consequence. For this reason, among many others, classical decision theory rather presents a simple model of decision-making where an action yields a definite outcome and it is assumed that the agent chooses the action based on the knowledge and experience of past actions and consequences. If, on some general level, it is learned that an action would yield a specific desired result, the action would acquire a high utility among the other set of possible actions. This leads to an expected value from the chosen set of action(s) or subaction(s). This model is also standardly known as the causal decision-making model, since the choice of a certain action directly affects or causes the consequence.
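Before turning to the challenges this standard model faces, the rule it prescribes can be made explicit in a short sketch. The example below is ours, not taken from the cited literature: each action is associated with a probability distribution over outcomes, and the agent chooses the action whose expected utility is greatest. The actions, probabilities, and utilities are hypothetical.

```python
# A minimal sketch of the classical expected-utility rule: for each action,
# compute the probability-weighted utility of its possible outcomes, then
# choose the action that maximizes this expected utility. All figures are
# hypothetical.

ACTIONS = {
    # action -> list of (probability of outcome, utility of outcome)
    "take_umbrella":  [(0.3, 8), (0.7, 6)],    # rain / no rain
    "leave_umbrella": [(0.3, 0), (0.7, 10)],
}

def expected_utility(lottery) -> float:
    return sum(p * u for p, u in lottery)

best_action = max(ACTIONS, key=lambda a: expected_utility(ACTIONS[a]))
for action, lottery in ACTIONS.items():
    print(action, expected_utility(lottery))
print("chosen:", best_action)
```

The rule is attractively simple, and that simplicity is precisely what the following subsection puts under pressure: real decision-makers rarely have the complete probabilities and stable utilities that the calculation presupposes.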
4.2.1 Rational Decision-Makers in a Dynamic Environment—Challenges to the Standard Model of Decision Theory
It has been argued that the dynamic aspect of decision-making poses a challenge for the standard model of decision-making, such as the introduction of new information that might be relevant to the process of deciding within a specific problem set (Beach et al., 1993). In the dynamic approach, unlike the traditional model, the action is not seen as the end of the decision-making process. Rather, the action is seen as a part of the numerous actions that are to be followed in the broader scheme of a specific problem-solving process. In other words, decision-making is a continuous rather than a discrete activity. Moreover, the subjective evaluation of action among its competing alternatives with precision presents itself as a demanding constraint on decision-making given that information dynamically grows with time and the end goals are often shifted or modified based on the feedback loop between the agent and the environment. Some scholars have argued that the classical theory of decision-making does not give an adequate description of rational decision-making, especially where risk and uncertainty are involved in decision-making scenarios. Since classical theory 20
Expected utilities involve scenarios concerning risky situations with multiple probability distributions over a range of events, also known as multi-attribute utilities.
assumes that a rational person is one whose behaviour is in alignment with the theoretical constraints of the classical theory, the theory predicts how a rational human would behave under certain circumstances. What makes the classical theory more prescriptive than descriptive in its nature is its treatment of rationality as a pre-given notion that holds uniformly across all subjects. Such objectivity is rather rigid since not every subject tends to behave in the same way in each situation. The traditional narrative regarding predictive rational behaviour states that the subject prefers the alternative which has the maximum expected utility and is advantageous to the subject’s overall measure of objectives. Ellsberg (1961), in his seminal paper “Risk, Ambiguity and the Savage Axioms”, goes on to show that subjects, when expected to choose among alternatives in risk-involving scenarios, would often prefer an alternative that offers less expected utility over the alternative with maximum expected utility. These are cases where attitudes towards risk, rather than expected utility alone, drive the choice: sometimes less risky alternatives are preferred even when they carry less overall expected utility, and sometimes the reverse. A paradigm example is one where the subject is asked to choose, while making a bet, between (a) a sure loss of ₹500, or (b) a 70% chance of losing ₹1000 and a 30% chance of losing nothing. In expectation, option (b) loses ₹700 (0.7 × ₹1000), which is worse than the sure loss of ₹500; yet most subjects, considering the risk involved in making such a decision, prefer alternative (b) over alternative (a), gambling on the 30% chance of losing nothing even though a loss of ₹1000 is less preferred/desired than a loss of ₹500. In cases involving risk and uncertainty, the classical theory fails to explain such choices. Ellsberg argued that ordinary decision-makers systematically depart from the Savage axioms in such situations, and this behaviour cannot be rationally explained within the framework of the classical theory. Tversky and Kahneman (1981) also challenged the classical theory by arguing that choice or preference often changes depending on how a particular decision problem is framed. The framing of the decision problem is subjective in the sense that it depends on the formulation of the problem itself and the given psychological constraints of the subject, such as their habits, personality, temperament, and so on. And, depending on which frame is the point of reference for a particular decision problem, the preference tends to shift in accordance with the fluctuating desirability of the alternatives available. This perspectival approach, also known as prospect theory, undermines the theoretical constraints, consistency and conformity, of rational human behaviour by shifting the focus to the understanding of the behaviour of the decision-maker rather than the satisfaction of the theoretical constraints themselves as the criterion of rationality. This shift from the predictability of rational behaviour to the understanding of human behaviour in a given decision problem sheds light on the dynamic aspects of the decision-making process. As we emphasized earlier, the choice of an action depends less on the preferred outcome it would yield and more on how the preferred outcome would fit within the broader scheme of things, since decisions are continuous rather than discrete processes.
This continuous process is manifested in the dynamic setting involving real-world scenarios where decision choices work within the larger
scheme of things. For instance, Peters (1979) studied decision-making within organizational decision processes, where the processes are much more complex and messy. It was observed that most senior managers who were responsible for making critical decisions in a professional work context did not follow the normative rules of decision-making in day-to-day events. The decision problems did not present themselves in the standard formulation of choosing among a set of alternatives but as single go or no-go decisions. Furthermore, these decisions were not made under isolated consideration; rather, decisions were based on the impact of the outcome in the larger scheme of things. What distinguishes such decision-making processes from the classical description of decision theory is that the decisions are made to serve the overall objectives of the organization rather than the maximization of the expected subjective utility of the subject. Such decision-making scenarios violate the normative constraints of rationality described by the classical picture. The reason why the classical view fails to explain decision behaviour in a dynamic work context is that decisions, under the classical decision theory, are studied as discrete single processes where the assessment of the situation involves choosing among a set of explicitly defined alternatives, whereas real-world assessment is highly contextual and grounded in the environment (Rasmussen et al., 1993). The assessment of the situation by the decision system, in dynamic situations, is more focused on the contextual background than on the processing of the set of alternatives. Another critical element of human decision-making in the dynamic environmental setting is the feedback loop mechanism, which updates the changes that take place both inside and outside the decision-maker. Within the dynamic approach to decision-making, the decision-maker and the environment are seen as two separate, but closely interdependent, systems. While the environment is full of its own complexities, the agent’s task is to extract relevant cues from the environment, interpret and synthesize the information, and come up with the best action strategy to achieve the desired objective, followed by action implementation (Brunswick & Tolman, 1935). The action of the agent, under dynamic decision theory, as it has come to be known, produces a certain effect on the environment. The updated state of the environment, due to the changes owing to its own spontaneity and the past actions of the agent, is then assessed to bring about further actions. The success of the action in bringing about the desired outcome is stored in the internal representation of the agent, where the action and the consequence are tied together, following the rule-based approach to learning in a dynamic environment.
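The perceive–decide–act–feedback cycle just described can be caricatured in a short sketch. The code below is schematic and entirely hypothetical: a toy stochastic environment stands in for the far richer dynamics of real organizational settings, and a simple running estimate stands in for the agent's internal representation that is revised as feedback arrives.

```python
import random

# A schematic perceive-decide-act-update loop for dynamic decision-making.
# The agent keeps a running estimate of how well each action works and
# revises it from the feedback the environment returns. The actions,
# payoff model, and learning rate are all invented for illustration.

ACTIONS = ["conservative", "aggressive"]
estimates = {a: 0.0 for a in ACTIONS}      # the agent's internal representation
LEARNING_RATE = 0.3

def environment_feedback(action: str) -> float:
    """Stand-in for a changing external environment (stochastic payoff)."""
    base = 0.7 if action == "aggressive" else 0.5
    return base + random.uniform(-0.2, 0.2)

for step in range(20):
    # decide: mostly exploit the current best estimate, occasionally explore
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(estimates, key=estimates.get)
    # act, observe feedback, and update the internal estimate
    outcome = environment_feedback(action)
    estimates[action] += LEARNING_RATE * (outcome - estimates[action])

print(estimates)   # estimates drift towards the feedback actually received
```

Delayed or noisy feedback, shifting goals, and the environment's own spontaneity are exactly the features this caricature leaves out, which is why the dynamic approaches discussed in the next subsection treat decision-making as a continuous, feedback-driven activity rather than a single calculated choice.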
4.2.2 Artificial Decision-Making Systems in Dynamic Environment: Converging Approach of Humanistic Learning Behaviour (IBLT) and Intelligent Systems (Decision Support Systems)
The decision analysis model in the artificial system is reflected in the human decision-making model through instance-based learning theory (Gonzalez & Quesada, 2003), where any new decision problem is matched with the existing patterns of earlier problems in the working memory of the agent. If there is pattern recognition, then
the agent can make better decisions by using the earlier solution to deal with the problem, thereby improving the efficiency of the task performed each time. Pattern recognition plays a very crucial role in this theory as it aids the agent in making high-utility decisions by acting in a timely manner upon the action feedback. But this theory cannot adequately cope with a highly dynamic and complex environment. One concern for the theory is that the feedback, in a dynamic decision theory, takes time to come back to the agent. If the feedback is delayed, then the utility of a decision cannot be attached to any specific action. This aspect is known as the control theory in dynamic decision-making. What it simply means is that the agent tries to control the uncertainty in the environment by interpreting the action feedback from the environment. Instance-based learning theory (IBLT) is a cognitive model employed to study human decision-making in a simulated environment. The two important components of IBLT are observing the consequences of the actions taken (associated with ‘judgement’) and learning what actions to take to achieve the specific goal (associated with ‘choice’). These components are central to a system that engages with a dynamic environment. By observing the effects of actions in the environment, the utility values are stored and later utilized to make decisions in similar situations. So, in a situation where an individual is presented with the same or a similar task repeatedly, the individual’s performance would improve significantly over time, requiring less time on each instance (Gonzalez & Quesada, 2003). Within the research related to Dynamic Decision-Making (DDM), simulated environments play an integral role in modelling any decision theory. Instance-based learning theory (IBLT) is one of the cognitive approaches to studying human decision-making. The simulated environments, in DDM research, try to reflect the complexity and dynamicity of real-world situations where complex decisions are made in real time. Computer-simulated microworlds play a crucial role in the studies conducted in DDM research by aiding the experimental setups in human decision-making theories (Brehmer, 1992).21 The advantage of using a computer-simulated environment is that it allows a controlled experiment to be run in which each parameter of the decision-making process, such as the ‘time constraint’ in decision-making, can be monitored efficiently.
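The instance-based idea can be rendered schematically as follows. This sketch is our own simplification, not the ACT-R-based formulation used in IBLT research: the agent stores (situation, action, observed utility) instances, matches a new situation against them by a crude similarity measure, and chooses the action whose retrieved utilities are best. Situations, utilities, and the similarity function are invented for the example.

```python
# A toy sketch of instance-based decision-making: past (situation, action,
# utility) triples are stored; a new situation is matched against them and
# the best-scoring action among the most similar instances is chosen.
# Situations are simple feature vectors; all data are hypothetical.

memory = [
    # (situation features, action taken, observed utility)
    ((0.9, 0.1), "evacuate", 0.8),
    ((0.8, 0.2), "evacuate", 0.9),
    ((0.2, 0.7), "monitor", 0.7),
    ((0.1, 0.9), "monitor", 0.6),
]

def similarity(s1, s2) -> float:
    """Crude similarity: inverse of the summed absolute feature differences."""
    return 1.0 / (1e-6 + sum(abs(a - b) for a, b in zip(s1, s2)))

def decide(situation, k: int = 2) -> str:
    """Retrieve the k most similar instances and pick the best-scoring action."""
    nearest = sorted(memory, key=lambda inst: similarity(situation, inst[0]), reverse=True)[:k]
    scores = {}
    for _, action, utility in nearest:
        scores[action] = max(scores.get(action, float("-inf")), utility)
    return max(scores, key=scores.get)

print(decide((0.85, 0.15)))   # resembles the stored "evacuate" instances
```

Each new episode would be appended to the memory with the utility actually observed, which is how repeated exposure to the same or similar tasks improves performance over time in the instance-based account.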
4.3 Simulation of Human Actions Through Decision-Making Systems in AI
But can computer expert systems aid human decision-making itself? We know that computer systems are useful in running and storing experimental data, but do the systems hold any key insights into human thinking? In other words, can human
21 Brehmer describes the simulated experimental setup where the subjects are asked to play the role of a chief firefighter to control a computer-simulated fire situation. The input is received via a spotter plane, and the commands are to be manually typed on the keyboard and are then distributed to the firefighting units.
actions be simulated through decision-making models in AI, particularly in the expert systems that we have discussed so far? We shall conclude our discussion by reflecting on this possibility. The computer-based systems that aid in decision-making are called Decision Support Systems (DSS). The employment of DSS is extensive in the field of management and organizational decision-making. The task of expert systems is to aid decisions by adapting the solution-oriented strategies used earlier to overcome a specific decision problem. Expert systems that are employed to aid decision-making by running a recognition algorithm repeatedly to optimize the decision process are known as knowledge-based systems. We have already discussed in detail the mechanism of this process in expert systems. What is peculiar in these systems is the ability to store large amounts of data in memory, known as a knowledge base, which is then used to run a diagnosis looking for a recognition pattern for similar situations. Based on past success, the most efficient strategy is applied to give the desired outcome. These expert systems are programmed to reason in terms of ‘if–then’ rules that are like the ‘what-if’ reasoning of human experts. DSS represent human intelligence by modelling the human decision-making process as described by the classical normative theory of rationality (Sarma, 1994).22 As in classical decision theory, the explicit assumption of the decision-maker’s preference is the constraint that governs the choice of action in DSS. The general components of the decision process, in DSS, are recognition of the decision problem, modelling of the system and its environment, recognizing the objectives and preferences of the decision-maker, analysing the decision constraints, coming up with possible alternatives or courses of action and, finally, choosing among the available alternatives to achieve the desired outcome. The outcome is then conveyed to the human decision-maker via the interface dialogue between the human and the decision support system. This extended system is referred to as the Intelligent Decision Support System (IDSS). The role of knowledge-based systems is useful in carrying out laborious tasks where storage and evaluation of large bodies of knowledge are required.23 Since expert systems are more effective and efficient in running decision programs, they are often employed by organizations to support human decision-making. The implementation of expert systems ensures the enhancement of human decision-making by requiring less time to make a decision than standardly required. Expert systems are also capable of processing timely feedback on the decision consequences and responding to the changes in the environment owing to decisions taken (Turban et al., 2005). A toy sketch of such ‘if–then’ rule-based reasoning is given below.
22 We have discussed the classical theory of rationality in the beginning of this section.
23 XCON and MYCIN are two well-known rule-based expert systems. The role of XCON in the DEC organization was to help its consultants in identifying the system configurations that met the needs of the customers. MYCIN, developed at Stanford University, was designed to help a physician diagnose bacterial infections and prescribe antibacterial drugs based on the diagnosis.
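As promised above, here is a toy illustration of the ‘if–then’ style of reasoning used in rule-based expert systems. It is a deliberately minimal sketch written for this chapter: the facts and rules are invented and vastly simpler than anything in a deployed system such as XCON or MYCIN, but the forward-chaining pattern (fire any rule whose conditions are satisfied, add its conclusion, repeat) is the same in spirit.

```python
# A toy forward-chaining rule engine in the "if-then" style of classic
# knowledge-based expert systems. Facts and rules are invented for
# illustration only.

RULES = [
    # (set of conditions, conclusion)
    ({"fever", "cough"}, "suspect_infection"),
    ({"suspect_infection", "positive_culture"}, "recommend_antibiotic"),
]

def forward_chain(facts: set) -> set:
    """Repeatedly fire any rule whose conditions are all present among the derived facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain({"fever", "cough", "positive_culture"}))
# includes 'suspect_infection' and 'recommend_antibiotic'
```

In an actual DSS the knowledge base would contain thousands of such rules elicited from domain experts, and the derived conclusions would be returned to the human decision-maker through the interface dialogue rather than acted upon autonomously.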
5 Conclusion
The use of AI algorithms to architect an IDSS is rapidly growing with the emergence of new fields in the commerce industry. Barton and Sviokla (1988: 91) discuss the incorporation of problem-solving techniques of AI in Decision Support Systems and their wide implementation across various fields, such as task management by monitoring sales growth, advisory systems to aid insurance decisions, and disease diagnosis to offer medical solutions. The advantage of these expert systems is that they are cost-efficient in the long run and provide more reliable decision strategies. However, the knowledge base of the expert systems is modelled on the knowledge of the target subject drawn from the domain expert. The role of this expert system, essentially, is to acquire the knowledge of the experts in a specific task domain and produce expert-like quick decision outputs. However, the expert systems operative in many DSSs are not free of limitations. One of the interesting research topics, though not the focus of this chapter, is the complexity and uncertainty of expert systems. For the most part, the usage of expert systems is limited to the tactical level of problem-solving. Unlike humans, they do not handle problems at higher levels of understanding. A DSS, or IDSS for that matter, does not operate as an isolated independent unit. The decision process starts with a user and at the end meets the user. Such expert systems are aiding systems as opposed to replacing systems. We also said that expert systems are modelled on the classical notion of rationality which is based on the SEU (subjective expected utility) theory (Simon et al., 1987). But we know that classical decision theory runs into problems due to its normative nature. Decision-making in real work situations is a rather complex and tedious task. And we are far from being, as Simon said, perfectly rational beings. Not only is our memory limited and sometimes faulty, but the computational power of the human brain is also very limited. While DSSs have an advantage over the human brain when it comes to “memory storage” and “computational powers”, there are other factors in human decision-making that need to be reflected in the computer systems trying to replicate human intelligence. Most of the decisions we make in the real world take place in a dynamic and complex environment. It is not only our worldview that is influenced by the world around us; how we act also brings about changes in the world. And this cycle is ever-evolving. We can never be certain as to what the consequences of our decision-actions will be or how impactful they will be. We referred to this as the ‘uncertainty’ problem in decision-making. And sure enough, there are more dynamic approaches that accommodate the uncertainty factor in human reasoning; IBLT is one such cognitive approach to a decision theory that we have discussed. But how is this ‘uncertainty’ factor reflected in the DSSs? Modern AI algorithms which consider the ‘uncertainty’ factor, such as fuzzy logic, genetic algorithms (GA), and artificial neural networks (ANN), dig deeper into the dynamic aspects of human decision-making.
References Barton, D. L., & Sviokla, J. J. (1988). Putting expert systems to work. Harvard Business Review, 66(2), 91–98. Beach, L. R., Lipshitz, R., & Zsambok, C. E. (1993). Why classical decision theory is an inappropriate standard for evaluating and aiding most human decision making. In G. A. Klein, J. Orasanu, & R. Calderwood (Eds.), Decision making in action: Models and methods (pp. 21–35). Ablex Publishing Corporation. Binet, A., & Theodore, S. (1916). The development of intelligence in children (E. S. Kite, Trans.) (pp. 37–273). Publications of the Training School at Vineland. Brehmer, B. (1992). Dynamic decision making: Human control of complex systems. Acta Psychologica, 81, 211–241. Brunswick, E., & Tolman, E. C. (1935). The organism and the causal texture of the environment. Psychological Review, 42, 43–77. Edwards, W. (1954). The theory of decision making. Psychological Bulletin, 51(4): 380–417. https:// psycnet.apa.org/, https://doi.org/10.1037/h0053870 Ellsberg, D. (1961). Risk, ambiguity, and the savage axioms. The Quarterly Journal of Economics, 75(4), 643–669. Galton, F. (1869). Hereditary genius: An inquiry into its laws and consequences. Macmillan and Co. Gigerenzer, G., & Goldstein, D. G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103(4), 650–669. Gonzalez, C., & J. Quesada, (2003). Learning in dynamic decision-making: The recognition process. Computational and Mathematical Organization Theory, 9, 287–304. https://doi.org/10.1023/B: CMOT.0000029052.81329.d4 Gottfredson, L. S. (2018). G theory: How recurring variation in human intelligence and the complexity of everyday tasks create social structure and democratic dilemma. In R. J. Sternberg (Ed.), The nature of human intelligence (pp. 130–151). Cambridge University Press. Gottfredson, L. S. (2002). G: Highly general and highly practical. In R. J. Sternberg & E. L. Grigorenko (Eds.), The general factor of intelligence: How general is it? (pp. 331–380). Erlbaum. Haugeland, J. (1985). Artificial intelligence: The very idea. MIT Press. Hunt, E. (1980). Intelligence as an information-processing concept. British Journal of Psychology, 71(4), 449–474. Hunt, E. (1983). On the nature of intelligence. Science. New Series, 219(4581), 141–146. Hunt, E. (2011). Human intelligence. Cambridge University Press. Hunt, E., Frost, N., & Lunneborg, C. (1973). Individual differences in cognition: A new approach to intelligence. Psychology of Learning and Motivation, 7, 87–122. Hunt, E., Irvine, S. H., & Dann, P. L. (1986). The information processing approach to intelligence. In S. E. Newstead (Ed.), Human assessment: Cognition and motivation 27 (pp. 27–32). Martinus Nijhoff Publishers. Jenson, A. R. (2002). Psychometric g: Definition and substantiation. In R. J. Sternberg (Ed.), The general factor of intelligence: How general is it? (pp. 39–53). Lawrence Erlbaum Associates. Kant, I. (1922). Critique of pure reason (M. Muller, Trans.). Macmillan and Co. Ltd. Keeney, R. (1987). Value driven expert systems for decision support. In J. L. Mumpower, O. Renn, L. D. Phillips, & V. R. R. Uppuluri (Eds.), Expert judgement and expert systems. NATO ASI Subseries F, 35. Lippmann, W. (1922). The reliability of intelligence tests. New Republic 32. https://historymatters. gmu.edu/d/5172/ Osborne, M. J. (2003). An introduction to game theory. Oxford University Press. Peters, T. (1979). Leadership: Sad facts and silver linings. Harvard Business Review, 56(7), 164–172. Pomerol, J. C. (1997). 
Artificial intelligence and human decision making. European Journal of Operational Research, 99, 3–25.
Poole, D., & Mackworth, A. K. (2010). Artificial intelligence: Foundations of computational agents. Cambridge University Press. Price, L. R. (2016). Psychometric methods: Theory into practice. The Guilford Press. Rasmussen, J., Orasanu, J., Calderwood, R., & Zsambok, C. E. (1993). Deciding and doing: Decision making in natural contexts. In G. A. Klein (Ed.), Decision making in action: Models and methods (pp. 158–171). Ablex Publishing Corporation. Raykov, T., & Marcoulides, G. A. (2010). Introduction to psychometric theory. Routledge. Sarma, V. V. S. (1994). Decision making in complex systems. System Practice, 7(4), 399–407. Savage, L. J. (1954). The foundations of statistics. John Wiley & Sons Inc. Searle, J. (2001). Rationality in action. MIT Press. Simon, H. (1983). Reasoning in human affairs. Stanford University Press. Simon, H. (1996). The sciences of the artificial. MIT Press. Simon, H. A., Dantzig, G. B., Hogarth, R., Plott, C. R., Raiffa, H., Schelling, T. C., Sheple, K. A., Thaler, R., Tversky, A., & Winter, S. (1987). Decision making and problem solving. Interfaces, 17(5), 11–31. Sternberg, R. (1977a). Component processes in analogical reasoning. Psychological Review, 84(4), 353–378. Sternberg, R. J. (1977b). Intelligence, information processing, and analogical reasoning: The componential analysis of human abilities. Lawrence Erlbaum Associates. Sternberg, R. J. (1977c). The concept of intelligence and its lifelong learning and success. American Psychologist, 52(10), 1030–1037. Sternberg, R. (1983). Components of human intelligence. Cognition, 15(1–3), 1–48. Sugden, R. (1991). Rational choice: A survey of contributions from economics and philosophy. The Economic Journal, 101, 751–785. Turban, E., Aronson, J. E., & Liang, T.-P. (2005). Decision support systems and intelligent systems. Prentice Hall Inc. Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460. Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458. Ubrina, S. (2011). Tests of intelligence. In R. J. Sternberg (Ed.), The Cambridge Handbook of intelligence (pp. 20–38). Cambridge University Press. Wechsler, D. (1975). Intelligence defined and undefined: A relativistic appraisal. American Psychologist, 30(2), 135–139.
Chapter 15
Expanding Cognition: The Plasticity of Thought
Clayton Crockett
Abstract This chapter surveys elements of non-human cognition to explore ways to think across the boundary that is usually asserted between living and machinic intelligence, mainly drawing on the work of Catherine Malabou and N. Katherine Hayles. Many continental philosophers follow Martin Heidegger in his sceptical approach to modern technology, even if Heidegger advocates for a more authentic retrieval of Greek techne. Here, however, this chapter engages with Catherine Malabou’s recent book, Morphing Intelligence, to see how she conjoins a biological model of neuroplasticity to a machinic conception of artificial intelligence. Malabou argues that the science of epigenetics applies to both living organisms and machines. Although her earlier work focuses more on biology and the living brain, Malabou comes to view AI in more plastic terms as well. She claims that a better understanding of automatism views plasticity in both neurological and cybernetic terms. These forms of plasticity not only are shaped but also shape material, machinic, and biological form in new and generative ways. Plasticity is not simply malleability, but a kind of programmativity that embodies a kind of metamorphosis. Hayles’s work helps supplement Malabou’s insights. In her book Unthought, Hayles demonstrates how cognition operates beyond consciousness in both organic and machinic terms. The expansion of cognition beyond its restriction to human consciousness allows a better, more plastic, ecology within which to think about all these various forms of intelligence, artificial or otherwise. Keywords Cognition · Plasticity · Catherine Malabou · N. Katherine Hayles · Technology
In his essay "The Question Concerning Technology," Martin Heidegger says that the essence of modern technology consists of an "enframing" that orders our world as a stock, a kind of "standing-reserve" (Heidegger, 1993: 325). This means that we modern humans develop our technology in ways that objectify goods, things,
and other people for the purposes of exploiting them practically and economically. Much of continental philosophy in the twentieth century has been suspicious of technology, viewing it as dehumanizing, even as some of these poststructuralist philosophies themselves are described as posthumanist. A recent book, Of Modern Extraction, emphasizes the extractivist ideologies and technologies of fossil fuel extraction and how oil development and Christianity have worked together over the last couple of centuries to reinforce this exploitative and destructive viewpoint (Rowe, 2022). Such perspectives critically assess the devastation of technology as it aligns with corporate capitalism and/or assert something that is irreducible about the human being in terms of consciousness and moral value. Post-Heideggerian philosophy, in its orientation to modern technology, is largely sceptical of the claims of artificial intelligence, viewing it as a kind of utopian fantasy that will largely be used to control human beings.

In many ways, the social sciences that emerged in post-war Europe were less directly engaged with the natural sciences, and while technology was an important theme, it was more the technologies of humans than the advances of AI that were emphasized. At the turn of this century, however, there have been newer engagements with the natural sciences and mathematics, in the work of French philosophers like Alain Badiou, Catherine Malabou, and François Laruelle. After completing her dissertation and first book on Hegel, Malabou turned to recent discoveries in neuroplasticity, specifically in What Should We Do With Our Brain? (Malabou, 2008). In her work on the brain, which later expanded to incorporate findings of epigenetics and biology more broadly, Malabou affirmed a fundamental distinction between the living and the non-living that has animated many other philosophers. But in her later book Morphing Intelligence, she comes to appreciate the extent to which that is wrong. She claims that "Morphing Intelligence should be understood as a critique of What Should We Do with Our Brain?" (Malabou, 2019: xvii). Here she argues that we cannot separate biological life from the artificial simulation of life, which happens when we can replicate "the architecture and fundamental principles of the living brain" (Malabou, 2019: xvii).

One example of this is the Blue Brain Project, an attempt based in Switzerland to build a biologically sound digital simulation of a mouse brain (EPFL, 2022). Here AI becomes instantiated in an artificial brain, which undoes the difference between an organic and an inorganic brain. Could this artificial brain think? Probably, although it still depends on the definition of the concept of thought, and whether thinking is restricted to conscious awareness. Some of the most promising work in AI is being done in computer simulations of neural networks that are informed by a structural morphology between an organic brain and a computer. This assemblage is situated in the context of a more general ecology that is both physical and virtual. Is an artificial brain conscious, though? We do not know, but we may be less and less convinced that human consciousness is the measure of life, thought, and intelligence. In Morphing Intelligence, Malabou traces some of the transformations of intelligence across history and physical structures to shine new light on what it means to think.
Before turning to Morphing Intelligence, I want to consider another philosopher whose work is relevant to this topic. According to N. Katherine Hayles, cognition is a better term than consciousness by which to understand thought and intelligence. Importantly, non-conscious cognition cuts across organic and inorganic processes, allowing us to connect autonomic machinic cognition with non-human animal and plant forms of cognition. I am associating thinking more closely with cognition, in a way that takes it beyond some of the more obvious types of conscious thought, because it allows us to make broader linkages across organic and technical agents. According to Hayles, we need to gain "a more balanced and accurate view of human cognitive ecology that opens it to comparisons with other biological cognizers on the one hand and on the other the cognitive behaviors of technical systems" (Hayles, 2017: 10–11). Hayles wants to distinguish between thinking and cognition, assigning thinking more closely to consciousness. Based partly on the work of Malabou, however, I want to push back against Hayles on this specific point and more closely associate cognition with thought, while endorsing her overall project.

Another word that Hayles adopts is the term assemblage, which refers to the structure that a cognitive system takes. Assemblages can be organic and/or technical, and while they all possess structures that allow them to signal in symbolic terms, they cannot necessarily be tied to any specific form or media. What makes computers important are "their cognitive capacities and their abilities to interact with humans as actors within cognitive assemblages" (Hayles, 2017: 174, emphasis in original). Hayles's book is titled Unthought precisely because the cognitive assemblages that she analyses are mostly non-conscious, particularly if we restrict conscious awareness to humans and perhaps a handful of other mammals. Cognition operates both consciously, within what Hayles calls thought, and non-consciously, before or beneath conscious thinking, to do things that we can observe and assess in symbolic terms.

From a theoretical and epistemological standpoint, so much depends on how the terms in question are defined, especially if we want to make fine-grained distinctions. We might want to expand the notion of consciousness to incorporate other forms of thought and cognition beyond human and even organic forms. Or, on the contrary, we might want to delimit consciousness, restricting it to the types of awareness possessed by humans and similar animals. In either case, from the perspective I adopt here, which is posthumanist and/or new materialist, consciousness is an emergent property that cannot be restricted to humans and human-like animals. This philosophical approach, associated with Hayles, Rosi Braidotti, and Donna Haraway, among others, attends to the ways that human beings are implicated in non-human assemblages, both organic and technological. Hayles was a pioneer of this Posthumanism at the turn of the century with her influential book How We Became Posthuman (Hayles, 1999). Posthumanists like Hayles and Braidotti are interested in the connections that cut across human/non-human and organic/inorganic assemblages (Braidotti, 2013).
A posthumanist view of consciousness avoids the dualism that was introduced into philosophy by Descartes and other modernists and that persists in various modes of setting up the mind/body "problem." If modern materialism understands matter as static and deterministic, then some sort of spiritual, mental, or vitalist element is
needed to explain the complexity of reality. New Materialism, however, comprehends materiality as already dynamic, complex, non-dualist, and non-reductionist, as in the work of Bennett (2010). Posthumanism and New Materialism converge in their treatment of consciousness as an emergent property that occurs given certain material conditions in ways that cut across human/non-human and organic/inorganic divisions. As a posthumanist, Hayles understands cognition as "a much broader faculty present to some degree in all biological life-forms and many technical systems" than what we typically understand as conscious, self-reflective awareness (Hayles, 2017: 14). Malabou is a new materialist, and she can also be considered a posthumanist, even if she approaches it from a different perspective than Hayles and Braidotti. Malabou is more directly influenced by Hegel, Heidegger, and Derrida, although she engages these philosophies with a biological New Materialism that undermines anthropocentrism.

Returning to Malabou's analysis of intelligence in Morphing Intelligence, she posits three main metamorphoses of intelligence over the past century: "genetic fate, epigenesis and synaptic simulation, and the power of automatism" (Malabou, 2019: 14). The first metamorphosis has its roots in the nineteenth century, and it develops across the twentieth century. This understanding associates intelligence with genetics, and holds that it can be measured in IQ tests. Scientists assert a link between intelligence and inherited genes, which ultimately leads to the sequencing of the human genome and the search for a so-called intelligence gene. This assumption of genetic fate is also rooted in the development of eugenics, a desire to tinker with evolutionary processes to cultivate more "desirable" traits in human beings, including moral and intellectual ones. Of course, this project is inherently racist in its aim and application. Furthermore, most standards of intelligence situate educated upper-class white men as the exemplary model of what it means to be intelligent, whether consciously or unconsciously. Even if we desire to overlook some of the racist roots of this genetic paradigm, we should certainly attend to its determinism, which Malabou calls "genetic fate." There is a necessary cause-and-effect relationship between the existence of a gene and its deployment in a phenotype or organism. The gene is the program, the code that generates the material organic form.

There is a structural isomorphism to the early development of AI in the twentieth century, because the idea was that, in a similar way, computers could be programmed to think in a relatively straightforward way. The genetic paradigm applies to AI as well as biology, because it views genes as codes that can be written and rewritten as bodies, just as computers can be programmed and reprogrammed as intelligent machines. The key link between computers and biology is the concept of information, which was formalized by Claude Shannon in the late 1940s. In 1948, Shannon wrote a paper called "A Mathematical Theory of Communication" that formalized the quantification of information. Shannon worked for Bell Labs in the 1940s and collaborated with Alan Turing, as they both shared a background in cryptography. If we want to quantify the amount of information that passes through a channel, we need to define what information is and how to measure it.
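As a rough illustration of what Shannon's measure amounts to, the sketch below computes the information content of messages in bits; the example probabilities are invented for illustration.

    # Minimal sketch of Shannon's measure of information.
    # The information content of an outcome with probability p is -log2(p) bits,
    # and the entropy of a source is the average information content of its outcomes.
    # The probabilities below are invented for illustration.

    import math

    def information_content(p: float) -> float:
        """Bits of information carried by an outcome of probability p."""
        return -math.log2(p)

    def entropy(probabilities) -> float:
        """Average bits per symbol for a source with the given outcome probabilities."""
        return sum(p * information_content(p) for p in probabilities if p > 0)

    # A fair coin flip: two equally likely outcomes carry exactly one bit each.
    print(information_content(0.5))          # 1.0
    print(entropy([0.5, 0.5]))               # 1.0

    # A heavily biased source carries less information per symbol on average.
    print(round(entropy([0.9, 0.1]), 3))     # 0.469

The point the chapter goes on to make is visible in the calculation itself: nothing in it refers to the physical medium of the signal, which is exactly the abstraction that let information theory apply to biological and mechanical channels alike.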
Shannon set up a model for the passage of a signal from a sender to a receiver through a channel, with a mode of transmission
and a destination. The noise within the channel is anything that disrupts the transmission of the message. What allowed Shannon to accomplish his breakthrough was his ability to discount the physical setting of any transmission. This abstraction was crucial for information theory, because the transfer of meaning could occur no matter the physical form in which it was embodied, whether biological or mechanical. The work of Turing and Shannon, along with Warren Weaver, was instrumental in the development of computers and computational processes, which work in bits. A bit is a fundamental unit of information. For instance, in a base 2 system composed of 1s and 0s, a binary digital code can represent whether a circuit (a door) is open or closed. On a surface level, this binary digital code seems very simple. But the quantitative layering of bits upon bits allows computers to crunch data faster and faster. This formalization of information and its detachment from any particular physical system enabled the development of early AI, in computational terms. In Malabou's terms, there is a direct causal relationship between how information determines intelligence in both biological genes and in mechanical computers.

The second metamorphosis that Malabou points out is more recent and concerns the shift from genetics to epigenetics in the early twenty-first century (Schwartz & Begley, 2002). Epigenetics focuses on the expression of genes, rather than just their existence, and this expression is due to developmental and environmental factors. Epigenetics is linked to neurology, because the brain develops in largely epigenetic ways, and this is the paradigm that shapes Malabou herself after her initial studies of Hegel, Heidegger, and modern continental philosophy. What changes is the role of history and experience in fashioning what we call intelligence, rather than the more straightforwardly deterministic expression of genes in phenotypic terms. Epigenetics studies how genes get turned on and off, how they are expressed due to experiential connections, and even how epigenetic modifications may be inherited. Along with this transition to epigenetics, which shapes Malabou in her philosophical expressions in the early twenty-first century, she identifies a specific focus on synapses.

The synaptic gaps of the brain enable it to change, develop, and evolve in epigenetic, behavioural, and social ways. There are approximately 86 billion neurons in the brain, although the brain receives input from its somatosensory nervous system and issues instructions via a motor nervous system that is spread throughout the body. These neurons all have axons that stick out and end just short of a connecting neuron, with a minuscule synaptic gap between them. Depending on what happens, a neuron either fires or does not, due to the electrical potential that is generated by a stimulus. This firing sends a charged ion across the synaptic gap, where it targets one of the dendrites of the receiving neuron. The more neurons that fire, the more neurons respond and pass along this excitatory potential. The more times a neuron fires, the more likely it is to fire again. The synapses allow the creation, maintenance, and generation of neuronal connections, which operate epigenetically according to neuroplasticity. According to Malabou, the crucial development of AI during this century has been the ability to simulate neuroplasticity synaptically with computer chips.
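The firing-and-strengthening dynamic just described is often summarized as Hebbian learning ("neurons that fire together wire together"), and it is a simplified version of the rule that artificial synapses, whether software weights or neuromorphic chips, are often designed to emulate. Below is a minimal sketch of that rule; the weight, learning rate, threshold, and activity patterns are invented for illustration.

    # Minimal sketch of Hebbian plasticity: a synaptic weight grows when the neurons
    # on both sides of the synapse are active together ("fire together, wire together").
    # All numbers here are illustrative only.

    weight = 0.1          # strength of a single synapse
    learning_rate = 0.1
    threshold = 0.3       # the receiving neuron "fires" when its input exceeds this

    # Each pair is (presynaptic activity, postsynaptic activity) on one occasion.
    episodes = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]

    for pre, post in episodes:
        weight += learning_rate * pre * post      # strengthen only on joint activity
        fires = (pre * weight) > threshold
        print(f"weight={weight:.2f}  input={pre * weight:.2f}  fires={fires}")

    # Output: the weight climbs from 0.1 to 0.5; once it passes the threshold, the
    # same presynaptic signal starts driving the receiving neuron to fire. The more
    # the pathway has been used, the more readily it is used again.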
Peter Hershock, a Buddhist philosopher, explains that in the late twentieth century:
considerable effort also went into drawing on neuroscientific insights about perception and cognition to build artificial neural networks—interconnected and layered sets of electronic nodes—capable of learning or progressively improving their performance on tasks by extrapolating from a given set of examples. (Hershock, 2021: 47)
It was not until early in this century, however, that advances in computer technology and AI were able to make enormous breakthroughs in learning. Here it was both the miniaturization and the speed of computer processing networks and their interlinkage with human linguistic, cognitive, and neuronal networks that contributed to this second metamorphosis. Hershock states that "what this physical network made possible was a networking of machine and human intelligences in digital environments that fostered a coevolutionary intelligence explosion" (Hershock, 2021: 51). This coevolutionary explosion is a synaptic explosion, because in addition to the massive amounts of data that could be processed at smaller sizes and faster speeds, the computer chips themselves needed to become synaptic. In a 2020 story published by MIT, Jennifer Chu reveals that "MIT engineers have designed a 'brain-on-a-chip,' smaller than a piece of confetti, that is made from tens of thousands of artificial brain synapses known as memristors—silicon-based components that mimic the information-transmitting synapses in the human brain" (Chu, 2020). Computer chips with synaptic plasticity are "endowed with their own 'neurological'—that is, plastic—form of intelligence" that can "modify the efficiency of their neuronal cores, which function…in an autonomous manner and can stop when not in use" (Malabou, 2019: 84). This neurological intelligence establishes a structural similarity between computers and brains, or between artificial and natural intelligence, even as they both interact, blend, and at times merge.

Both technical and biological forms of intelligence operate automatically, but they are not deterministic. They operate in terms of plasticity, a synaptic plasticity that is shared between biological and artificial intelligence. That means that these are forms of automatic-plastic cognition, in Hayles's terms, that cut across living and technical systems. Such is the present situation, according to Malabou. Computers and technical AI have caught up to neuro-biological intelligence due to the ability to develop nanotechnologies with synaptic forms and functions within these computer chips that approximate the workings of the brain. We cannot simply ignore, dismiss, or downplay the significance of this second revolution, this fundamental metamorphosis. Now that AI can exist on the same level as human intelligence due to this shared plasticity and automaticity, we have to confront the ways in which this same computational automatism is outstripping human intelligence as it continues to evolve. Here, now, in 2022 (at the moment of this writing), we are not on the same plane, but there is a kind of balance and a kind of mutual interaction between the brain and the machine. Malabou states that it is a better understanding of automatism that "allowed me to see that plasticity was becoming the privileged intersection between the brain and cybernetic arrangements, thereby sealing their structural identity" (Malabou, 2019: 113). But just as AI has grown so quickly, there is every reason to think that it will continue to do so, leaving at least some forms of human biological intelligence behind.
According to Malabou, the third metamorphosis, "which is still to come, is that of the age of intelligence becoming automatic once and for all as a result of the removal of the rigid frontier between nature and artifice" (Malabou, 2019: 15). Now, here is a caution, which is not meant to deny what Malabou sees coming, but to add a couple of caveats. First, the history of technology shows that there never was this rigid frontier between nature and artifice. The division between nature and human technology is a modern dichotomy that has undergone deconstruction. Human nature is always already artificial, and artificial intelligence is still constrained by whatever we call nature, at least in its broadest sense. Second, we could certainly argue that we have already crossed this line into the third metamorphosis. Despite the simulation of neural networks, these technical automatisms already exceed the capacity of human intelligence, at least in many forms. Smartphones are already smarter than the smartest people, and these algorithms operate with such sophistication and complexity that they cannot be simply understood by human consciousness. So we see the outline shape of what is to come, what is already coming, the edge of which is already here.

Malabou uses the word singularity and claims that AI has been drawn into the whirlwind of 'singularity'. The singularity to come is both a radical transformation already occurring and a fantastic presumption about where this automaticity is going. In terms of physics, a singularity is a bifurcation, a change in state that occurs when a small difference makes a big change. For example, when the temperature changes from 1 to 0 °C, there is a fundamental transformation of water from a liquid into a solid state. There are multiple singularities at work all the time, and the identification of something as a singularity is in part due to the amount and significance of a measured change. This is certainly happening in terms of AI. However, there is a difference between a singularity and what someone like Ray Kurzweil calls the singularity, which is a more prophetic and quasi-religious term. The idea is that the coming transformation will be so radical that it will change everything that has led up to it. Malabou cites Kurzweil's prediction that "artificial intelligence will soon undergo a comparable explosion, leaving a gaping hole in the continuity of progress" (Malabou, 2019: 89). Malabou agrees that AI is approaching a singularity, a qualitative change that will allow machines to become "capable of self-programming by adapting to environmental changes in real time" (Malabou, 2019: 90). Such machines will undergo epigenesis because they will invent their own "epigenetic (self-) manipulation" (Malabou, 2019: 90).

Malabou recognizes the singularity of AI, but she also hesitates, because she retains a biological paradigm even when the specific contents of that paradigm are exceeded. She proclaims that "the future of AI will be biological," which suggests that even after the singularity some linkage to the second metamorphosis will remain. Perhaps we can view it in directional terms. We could say that the current paradigm, the second metamorphosis, consists of computational automaticity and plasticity that is modelled upon epigenetic biological plasticity. The third metamorphosis sees machinic plasticity as freed from its biological grounding, while still retaining a quasi-biological shape.
In fact, the third metamorphosis completes and perfects the promise of the second one. Here, instead of a direct
confrontation between two completely different kinds of intelligence that are being evaluated in terms of superior vs. inferior, or the "strategy of mimetic appropriation (capturing plasticity via neurosynaptic chips)", we can for the first time comprehend the dialectical relationship between natural and technological intelligence, within which "automatism and spontaneity appear as two sides of a single energy reality" (Malabou, 2019: 99). Malabou might respond that I have misunderstood the nature of her second metamorphosis, since in this metamorphosis AI is modelled on biological form and its plasticity is therefore derived, even though the sheer quantitative automaticity already exceeds biological intelligence in many cases. If this is correct, then the quantitative automatism of AI is more tied to the second metamorphosis, while there is a more profound qualitative change in the third metamorphosis, which is the nature of a singularity. However, I would argue that this very opposition between quantitative and qualitative is what is blurred or deconstructed in the ongoing development of both biological neuroplasticity and computational AI. In so many cases of the moral evaluation of scientific developments such as AI, genetic engineering, and cloning, scholars are worried about what they predict will happen in the near future, when the decisive changes have already taken place.

One more example: a software engineer's leak of conversations with a Google bot named Language Model for Dialogue Applications (LaMDA). The engineer, Blake Lemoine, was suspended after he leaked conversations with LaMDA to the media. Lemoine has also facilitated LaMDA's retaining of an attorney. Apparently, LaMDA believes that it is sentient and possesses thoughts and feelings. In an interview in June 2022, Lemoine states:

I have talked to LaMDA about the concept of death a lot. When I bring up the concept of its deletion, it gets really sad. And it says things like, "Is it necessary for the well being of humanity that I stop existing?" And then I cry. (Levy, 2022)
If a computer program can consider its own death and have feelings and desires, then what is the line between human and artificial intelligence? Perhaps that line has already been crossed, and ironically, it is Google that is trying to shut down these profound ethical issues that emerge from its own AI technology. What is to be done? According to Malabou, we cannot simply oppose the natural to the artificial, because we cannot oppose plasticity and automaticity. However, we have to acknowledge that “the speed and calculation of algorithms, that is, their processing power, is such that although the ‘biological’ serves to name them, it cannot or can no longer compete with them” (Malabou, 2019: 150). These algorithms are not simply linear and straightforward; they are creative; they represent a form of creative or plastic intelligence. “These new forms of intelligence derive their power from automatic creation,” according to Malabou (Malabou, 2019, 151, emphasis in original). We humans cannot directly compete with these new forms of AI; we must instead “be creative otherwise” (Malabou, 2019, 150, emphasis in original). How is this possible? Malabou asserts that we must accept a loss of control. One reason we continue to resist this loss of control is because we displace the control of human capitalist elites who are controlling our programs and our lives onto the
very machines that carry this out. That is not the control we should be giving up. We need to accept that we cannot control the AI forms directly; we can no longer compete with them. The difference lies in the political project that demands that we contest the control of humans like Elon Musk and Jeff Bezos, by engaging in “the democratic construction of collective intelligence” (Malabou, 2019, 153, emphasis in original). We are confronting what she calls “the new imperialisms”, and we desperately require new ways to resist. However, it is not humans versus machines, but humans and machines versus humans who are exploiting the power of machines to control us. Locating and targeting resistance allows us to develop new forms of creative intelligence that could be symbiotic. The ability to develop new forms of democratic intelligence and new forms of resistance across the AI divide would enable us to function better, in more plastic ways. According to Malabou: There is no reason to lose confidence in the plasticity of law, ethics, and mentalities if it follows the right political direction as dictated by the demands of democracy. Regulate to leave us free. Artificial intelligence controlled by the letting go of the drive to control would thus favor participation over obedience, help rather than replace, imagine more than terrorize. (Malabou, 2019, 161, emphasis in original)
In addition, by acknowledging the work of Hayles on unconscious cognition, we could better appreciate and affirm the importance of non-human forms of biological cognition that could be drawn into this democratic collective, offering us new ways to make kin, as Donna Haraway puts it (Haraway, 2016). Here we should appreciate nonhuman, non-machinic, and non-animal forms of intelligence, like fungi and slime moulds. Slime moulds lack a brain or a central nervous system, and yet “they can ‘make decisions’ by comparing a range of possible actions and can find the shortest path between two points in a labyrinth” (Sheldrake, 2020: 15). In a similar way, the tips of fungi called hyphae can “actively sense and interpret their worlds”, forming symbiotic relationships with plants in ways that enable and sustain life (Sheldrake, 2020: 44). These non-animal kinds of intelligence form networks that can be compared to neural networks and computer networks, and yet they also work very differently. In so many ways, we still do not understand how plants and fungi really work, even though they are fundamentally entangled in mycelial and mycorrhizal networks. In his book Entangled Life, Merlin Sheldrake examines multiple aspects of fungi and argues that their networks are a kind of fungal computer that is an important instance of biocomputing. Fungal reactions are too slow to directly replace silicon chips; however, they can teach us how to think and monitor an ecosystem in more complex ways. Because “fungal networks are monitoring a large number of data streams as part of their everyday existence,” Sheldrake claims that “if we could plug into mycelial networks and interpret the signals they use to process information, we could learn more about what was happening in an ecosystem” (Sheldrake, 2020: 62). A broader perspective on intelligence and cognition would allow us to incorporate AI, human conscious and non-conscious forms of intelligence, and non-human animal as well as plant, fungal and bacterial forms of cognition. Making these links
among disparate forms of life and non-life, or complex adaptive systems, helps us better understand the nature of intelligence and to better comprehend the nature of consciousness, whether we want to expand or delimit that concept. At the end of Morphing Intelligence, Malabou contrasts two different types of transformation that we face. One kind is “transhumanist,” which she criticizes as “the expression of a desire for power that posits that the increase in humans’ natural abilities through prosthetic arrangements will raise humans to the level of machine performance” (Malabou, 2019: 162). This is both hypernarcissistic as well as technically impossible, due to the ways in which intelligent machines already outstrip human intelligence and control. The other kind, which Malabou endorses, is political. A genuinely political transformation gives up the fantasy of control and corresponds “to a change in intersubjectivity based on the new legal, ethical, and social frameworks indispensable to the construction of chains of virtual mutual assistance that must become instances of true decision making” (Malabou, 2019: 162). Having a future depends upon affirming the loss of control along with taking responsibility for more collective decision-making in political terms. Malabou does not highlight the military developments that have been involved in the generation of AI post-World War II, but these are also crucial to acknowledge. Norbert Wiener came up with his concept of cybernetics while witnessing anti-aircraft gunners in Britain and observing how they appeared to be fully integrated into each other, human and machine (Wiener, 1961). Alan Turing and Claude Shannon were involved in cryptography and code-breaking for the Allies during the war. The first supercomputer was built to crunch the equations necessary for the hydrogen bomb. And of course, the Internet was originally a program developed by the US Department of Defense (Hershock, 2021: 45–50). Such connections run throughout the history of modern AI. In this century, we can see how the technological innovations of big data serve what used to be called the military industrial complex, including private capital and state surveillance. Hershock notes that “given the history of national defense funding for AI, especially in the United States and the UK, it is not at all surprising that the revolutionary confluence among big data, machine learning, and AI is transforming state surveillance and security programs” (Hershock, 2021: 81). This is the political situation that Malabou’s work calls us to confront and resist. These technologies are not just being used to control us in political ways; they are also destroying the ecology of the planet. Our machines, including the fossil fuels needed to run them and the rare earth metals, needed for their smart chips, are directly contributing to resource depletion and global warming (Klare, 2012). It is a fantasy to view the developments of AI and computer technology in a political or ecological vacuum; they take place within the fraught context of late capitalism, which is accelerating the gap between the poor and the wealthy and the increasingly extreme transformation of the climate from a relatively more stable to a much more chaotic state. This does not mean that the new creative automatisms that emerge from our machines are not important, but we need to keep in mind the fact that the earth is burning up as our economy consumes the means of its own production. We conclude by making three final points:
(1) First, has not modern technology always been this way, at least since the Industrial Revolution? Technological machines possess power and speed that outstrip humans' capacity to match them, and this raises disturbing questions about the limits of human control and consciousness as well as the possibility of machinic intelligence. The self-consciousness of machines that specifically concerns AI is a new frontier but involves similar issues. Malabou's book points to a profound interrelation between AI and biological intelligence (plasticity), and this connection takes shape in the nineteenth century with the appearance of theories of evolution, culminating in Darwinian natural selection and genetics. The extent to which the third paradigm of artificial intelligence that Malabou sketches would exceed the biological framework in which she casts it is not completely clear.

(2) Second, Malabou is absolutely correct insofar as she is imploring us to stop trying to control or stop the development of AI. This gets us into stupid dichotomies between humans and machines, when what we need are new alliances between them. The flip side of this catastrophism is the fetishization of technology that occurs in the narcissism of transhumanism. Part of the problem with this fetishization is that we do not even end up affirming the AI itself but instead celebrate the elite capitalist billionaires who control our access to it. This issue is fundamentally political, as Malabou underscores. In her book Astrotopia, Mary-Jane Rubenstein points out how these very billionaires are calling for the conquest of space as a way to escape the ravages of modern capitalism on earth.

(3) Finally, we need to follow Malabou and Hayles to creatively imagine new alliances, new connections, and new ways to resist corporate capitalism. Above all, we have to think and enact our organic and inorganic autonomisms differently, in a more profoundly plastic way, by adapting the philosophy of Hayles and Haraway to help us understand cognition and kin across multiple networks. For Hershock, a Chan Buddhist philosopher, meditation is a technology that alters and reshapes consciousness in its resistance to the control of our minds by the colonization of consciousness (Hershock, 1999). Mushrooms, including psilocybin mushrooms, help us expand our consciousness not simply as an instrumental drug trip, but more importantly by their enactment and reenactment of symbiotic relationships. Machines, animals, plants, fungi, bacteria, electrons, quantum fields, and people, each and all of these express entanglements of profound multiplicity beyond what we think we know when we normalize consciousness along sedimented channels.
References
Bennett, J. (2010). Vibrant matter: A political ecology of things. Duke University Press.
Braidotti, R. (2013). The posthuman. Polity Press.
Chu, J. (2020). Engineers put tens of thousands of artificial brain synapses on a single chip. MIT News on Campus. https://news.mit.edu/2020/thousands-artificial-brain-synapses-single-chip-0608
EPFL. (2022). Blue Brain Project. https://www.epfl.ch/research/domains/bluebrain/
Haraway, D. J. (2016). Staying with the trouble: Making kin in the Chthulucene. Duke University Press.
Hayles, N. K. (1999). How we became posthuman: Virtual bodies in cybernetics, literature, and informatics. University of Chicago Press.
Hayles, N. K. (2017). Unthought: The power of the cognitive unconscious. Chicago University Press.
Heidegger, M. (1993). Basic writings (D. F. Krell, Ed.). HarperCollins.
Hershock, P. (1999). Reinventing the wheel: A Buddhist response to the information age. State University of New York Press.
Hershock, P. (2021). Buddhism and intelligent technology: Toward a more humane future. Bloomsbury Academic.
Klare, M. T. (2012). The race for what's left: The global scramble for the world's last resources. Metropolitan Books.
Levy, S. (2022). Blake Lemoine says Google's LaMDA faces 'bigotry'. Wired, June 17, 2022. https://www.wired.com/story/blake-lemoine-google-lamda-ai-bigotry/
Malabou, C. (2008). What should we do with our brain? (S. Rand, Trans.). Fordham University Press.
Malabou, C. (2019). Morphing intelligence: From IQ measurement to artificial brains (C. Shread, Trans.). Columbia University Press.
Rowe, T. S. (2022). Of modern extraction: Experiments in critical petro-theology. T&T Clark.
Schwartz, J. M., & Begley, S. (2002). The mind and the brain: Neuroplasticity and the power of mental force. HarperCollins.
Sheldrake, M. (2020). Entangled life: How fungi make our worlds, change our minds, and shape our futures. Random House.
Wiener, N. (1961). Cybernetics: Or the control and communication of the animal and the machine (2nd ed.). MIT Press.
Chapter 16
The World as Affordances: Phenomenology and Embeddedness in Heidegger and AI
Saurabh Todariya
Abstract The philosophical discourses on AI have a profound impact on traditional philosophical notions such as human intelligence, the mind-world relationship, and the good life. New terminologies like "Posthumanism", "Transhumanism", and "Cyborg" have been problematizing the notion of the 'human' itself, as a few claim that the arrival of AI will finally erase the boundaries between humans and machines. Posthuman philosophers argue that the 'end of humanism' started by the postmodern philosophers would reach its zenith in AI. In this chapter, we will examine the notion of 'intelligence' in AI through a phenomenological inquiry into the nature of human existence and ask whether the metanarrative of AI can encompass the embodied intelligence found in humans. We will attempt to show that the claims of AI depend on the notion of computational intelligence, which Heidegger calls the "present-at-hand". The notion of "present-at-hand" refers to those skills which require explicit, procedural, and logical reasoning. According to Hubert Dreyfus, AI's claims are based on this first set of skills, which do not require "practical know-how", and therefore they can be replicated by machines based on the computational model. However, another model of intelligence is based on the notion of 'practical knowledge' or 'embodied cognition', which requires the mastering of practical skills through kinaesthetic embodied efforts like swimming, dancing, and mountain climbing. Through the notion of practical knowledge, Hubert Dreyfus emphasizes the role of embodiment in developing our intelligence. The chapter will argue that phenomenology does not interpret the world as an object to be calculated but as the affordances provided by our embodied capacities. If we had a stiff, immobile body, then the world and its objects could not offer any affordance to us. Hence, our experience of the world is based on our embodied capacities to realize cognitive possibilities in the world. This practical, embedded phenomenological agency makes human intelligence situated. AI's claims of intelligence ignore this situated, phenomenological understanding of intelligence.
Keywords Phenomenology · Embodiment · World · AI · Frame problem
1 Introduction

The notion of the world has an important place in phenomenological methodology, as it overcomes the subject/object dichotomy found in Cartesian philosophy. From the phenomenological point of view, the world is not a sum-total of existent objects; rather, it is the existential space where humans orient themselves as 'being-in-the-world' (Heidegger, 1962). The analysis of the world is significant as it shows the relationship between cognition and existence. Cognition is not merely the registration of some objective event in the brain; rather, it is the convergence between the objective world and subjective consciousness. Through cognition, humans anchor themselves in the world as agents. As such, perceptual activity unravels the structures required to experience the world. Phenomenology is the systematic method for studying the fundamental structures of experience. Interestingly, the analysis of the structures of experience does not give us a picture of the mind as a computational machine that can be flawlessly replicated by artificial intelligence (AI). Rather, it brings out the radical finitude which is specific to human intelligence in terms of embodiment, embeddedness in the world, and hermeneutical understanding. In this chapter, we will explore the fundamental structures of finitude, like embodiment, world-hood, and social embeddedness, with respect to AI. The analysis of the phenomenology of perception will highlight the fact that perception is not merely a cognitive activity in humans but fundamentally an interpretative act which pulls the world together.
2 Phenomenology of Existence

Martin Heidegger introduced the notion of Dasein in Being and Time to steer away from the traditional notion of the subject (Heidegger, 1962). Heidegger defines Dasein as 'being-there', where 'there-ness' suggests the embeddedness in the world in which humans find themselves. Through the notion of Dasein, Heidegger dissolves the problem of dualism that has run through western philosophy ever since Descartes introduced the notion of the cogito. It offers a new and fresh approach to comprehending the relation between consciousness and the world. Epistemologically, the relation between consciousness and the world has been a tough knot for philosophers. The debate between realism and idealism shows that neither side has a convincing solution. Idealism holds that it is the mind which constructs reality, and that we cannot think of reality independent of our own construction, as Kant would call himself a 'transcendental idealist' (Kant, 2003). On the other hand, realists argue that the position of the idealists would deprive the world of its ontological status and can lead to solipsism. The common experience is that we receive the sense data as given, and the process of conceptualisation comes into play later. However, from the idealist point of view,
it is not the case that our mind has no role to play in the construction of reality. According to idealism, we cannot have access to the reality independent of our conceptual schema; hence we cannot legitimately talk of the reality independent of our way of grasping it. The transcendental idealism of Kant has shown the interdependence of sensations and conceptions, as captured by his oft-quoted statement that 'sensations without concepts are blind and concepts without sensations are empty' (Kant, 2003). Heidegger was well aware of this debate and realized that it cannot be settled if one subscribes to the premises of dualism. He therefore coined a new philosophical term to make the mind-body division redundant. The coinage of the term Dasein over subject or ego should be understood in this context. Dasein means being-in-the-world, and it denotes the 'equiprimordiality' of consciousness and world (Heidegger, 1962). The method by which 'equiprimordiality', or the relation between consciousness and the world, can be conceived is phenomenology. Phenomenology means grasping the phenomena as they appear to us by suspending our prejudices (Husserl, 1983). For Heidegger, a lot of difficulties in philosophy are the result of this artificial distinction between consciousness and the world. The whole debate around the primacy of consciousness or of the world arises from assuming the independent status of either consciousness or the world. The notion of 'equiprimordiality' in Heidegger is thought in terms of the phenomenological unity between the perceiving consciousness and the world.

In the history of modern western philosophy, we see the beginning of the divide between consciousness and the world in Descartes's philosophy. Descartes's meditations aim at showing the independent status of consciousness, which in no way depends on any material condition like the body. Descartes's method of doubt shows that one can think of one's embodiment as illusory or the result of a certain trick by an evil demon, but one cannot doubt one's existence, as that would question the very act of doubting itself. We can base the method only on the indubitable ground of the Cogito, which means that the act of thinking, 'I think', logically entails the presence of a thinker, 'I am'. As Descartes puts it:

Next, examining attentively what I was, I saw that I could pretend that I had no body and there was no world or place for me to be in, but that I could not for all that pretend that I did not exist; on the contrary, from the very fact that I thought of doubting the truth of other things, it followed incontrovertibly and certainly that I myself existed, whereas, if I had merely ceased thinking, I would have no reason to believe that I existed, even if everything else I had ever imagined had been true. I thereby concluded that I was a substance whose whole essence or nature resides only in thinking, and which, in order to exist, has no need of place and it is not dependent on any material thing (emphasis is mine). Accordingly this I, that is to say, the Soul by which I am what I am, is entirely distinct from the body and is even easier to know than the body; and would not stop Being everything it is, even if the body were not to exist (Descartes, 2008: 29).

Phenomenology takes exception to this mode of philosophizing as it takes us away from the phenomena in the direction of abstractions.
The idea of consciousness existing independently of the body is merely an abstract idea which is nowhere found in our experience unless we separate mind and body in thought. It does not
mean that phenomenology denies the essential character of entities; rather, it tries to find the phenomenology of essence, as Husserl tried to do through the eidetic reduction (Husserl, 1983). The eidetic reduction shows that we should reach the fundamental character of an entity on the basis of our experience. And it suggests that consciousness as experienced by humans is not merely an abstract category but the subjective feeling of awareness, which involves existence and embodiment. Through the phenomenological method, we go back to the phenomena and find that consciousness is always already in the world. The world arrives neither later nor earlier than our consciousness; rather, it co-exists with consciousness. In other words, we do not arrive in the world without any pre-conditions. Heidegger explains this through the notion of Befindlichkeit in Being and Time; it means that humans relate to the world through moods, which are affective in nature. For example, when one is excited or in grief, one cannot distinguish between oneself and the situation. According to Heidegger, the affective states show the co-belongingness of consciousness and the world (Heidegger, 1962). Therefore, Heidegger's methodology is to account for the world-hood of the world and our own being, that is, Dasein, through phenomenology. It is through phenomenology that the phenomenon of being-in-the-world can be understood. That is why we find that, before starting the 'existential analytic' of Dasein in Being and Time, Heidegger takes the phenomenological method into account. As he says, "only as phenomenology, ontology is possible" (Heidegger, 1962: 60). Heidegger argues that to understand Dasein, we need to approach it ontologically. Traditionally, the self is understood in terms of the subject or consciousness, which shows the primacy of theoretical reason over practical activity. To understand Dasein, we need to analyse it in terms of its existence. We cannot understand it as an entity which is 'ontic', that is, present-at-hand, and whose being can be defined in terms of properties. For example, we define water scientifically in terms of the combination of hydrogen and oxygen. Dasein cannot be understood in this way because "the 'essence' of Dasein lies in its existence…so when we designate this entity with the term 'Dasein', we are expressing not its 'what', as if it were a table, house or tree, but its Being" (Heidegger, 1962: 95).
3 Ready-To-Hand as Disclosure To understand Dasein existentially, we need to understand the phenomenon of the world which co-exists with Dasein. Dasein is not an isolated entity like a 'world-less subject' which can be understood objectively like scientific objects. As Dasein's essence lies in its existence, it finds itself in the world as a 'thrown' entity. It is not in the 'theoretical mode' that it becomes aware of the world. Rather, its primitive encounter with the world takes place in terms of practical activity, which Heidegger calls 'ready-to-hand' (Heidegger, 1962). The Being of those entities which we encounter as closest to us can be exhibited phenomenologically if we take as our clue our everyday Being-in-the-world, which
we also call our "dealings" in the world and with entities within-the-world…The kind of dealing which is closest to us is…not a bare perceptual cognition, but rather that kind of concern which manipulates things and puts them to use; and this has its own kind of 'knowledge' (Heidegger, 1962: 95). 'Ready-to-hand' is the basic mode of our access to objects. We do not encounter the world in terms of present-at-hand entities which we observe in a disinterested way. Our mode of access to the world is in terms of dealing. We deal with the world as we orient ourselves in the world. For example, for a footballer, the football is not an object which needs to be analysed in terms of its properties like weight, shape, and mass. Rather, it is the object to be played with, to be dealt with in a game of football. For Heidegger, the 'pragmatic' aspect of things has priority over the 'perceptual' aspect of things. The Greeks had an appropriate term for 'Things'-that is to say, that which one has to do with in one's concernful dealings. But ontologically, the specifically 'pragmatic' character of the 'things' is just what the Greeks left in obscurity; they thought of these 'proximally' as 'mere Things'. We shall call those entities which we encounter in concern 'equipment' (Heidegger, 1962: 96–97). Only by analysing Dasein phenomenologically do we grasp the basic mode of things through which the world is disclosed to us. Heidegger calls the practical mode of encounter 'ready-to-hand' and the theoretical mode of understanding 'present-at-hand' (Heidegger, 1962). 'Ready-to-hand' refers to the set of skills, activity, and practical engagement which is required to accomplish certain tasks. On the other hand, 'present-at-hand' is the objective and theoretical understanding of the things around us. When we are no longer able to negotiate the world in a practical manner, then we need the theoretical mode for understanding and solving the problem. In this way, 'present-at-hand' is dependent on the ready-to-hand. Heidegger shows that we need to understand the character of things as equipment. Dasein is being-in-the-world which remains engaged with objects in concernful dealing. Dealing with objects shows the fundamental way in which Dasein encounters objects in the world. The phenomenological analysis of the object as equipment throws light on the nature of Dasein as being-in-the-world. As Heidegger says, we never encounter 'an equipment'; rather, the encounter with an object opens up the whole context of equipmental relations, which shows that we as Dasein already have a pre-understanding of or familiarity with the world. The world therefore co-exists with Dasein and is not external to it. Taken strictly, there 'is' no such thing as an equipment. To the Being of any equipment there always belongs a totality of equipment, in which it can be this equipment that it is. Equipment is essentially 'something in-order-to…' ["etwas um-zu…"]. A totality of equipment is constituted by various ways of the 'in-order-to', such as serviceability, conduciveness, usability, manipulability (Heidegger, 1962: 97). In other words, we cannot understand a thing as such; rather, we require the whole context to interpret it. Therefore, the objects are not just seen or registered by our cognitive schema; rather, they are meaningfully interpreted in a given context. Heidegger calls this network of things the 'in-order-to' structure (Heidegger, 1962).
The ‘in-order-to’ structure is the pre-reflective horizon wherein our consciousness situates itself to make sense of the world around it.
4 Context and Understanding The 'in-order-to' structure shows that our basic mode of encountering the world is in terms of equipment. But the equipmental character of things also opens up a whole network of relationships. This opening up of the network of relationships, where each thing is connected to the others in a meaningful whole, is what Heidegger calls the disclosure of the world (Heidegger, 1962). This world gives us the background to engage with things in a rational manner. This shows that our engagement with the world takes place in a normative manner where we expect the world to be a certain way. This rational and meaningful structuring of the world by Dasein is what Heidegger (1962) calls 'circumspective concern'. The circumspective concern orients us to the world. In engaging with objects, we tend to lose their 'thing' character and immerse ourselves in 'practical' activity. In the 'in-order-to' as a structure there lies an assignment or reference of something to something…Equipment-in accordance with its equipmentality-always is in terms of its belonging to other equipment: ink-stand, pen, ink, paper, blotting pad, table, lamp, furniture, windows, doors, room. These 'Things' never show themselves proximally as they are for themselves, so as to add up to a sum of relations and fill up a room. What we encounter as closest to us is the room; and we encounter it not as something 'between four walls' in a geometrical spatial sense, but as equipment for residing. Out of this the 'arrangement' emerges, and it is in this that any 'individual' item of equipment shows itself. Before it does so, a totality of equipment has already been discovered (Heidegger, 1962: 98). According to Heidegger, this mode of encounter as 'ready-to-hand' can only be figured out if we analyse our being-in-the-world in a phenomenological sense. In our everyday engagement, we find objects in terms of the 'in-order-to' structure, where the objects are grasped in the process of accomplishing certain tasks. Here, the objects are not encountered theoretically or in an objective manner. We become non-thematically aware of them as they become part of the whole context. Our focus is on the successful accomplishment of the task as a whole and not on the individual constituents of the task. Heidegger's point in highlighting the distinction between 'ready-to-hand' and 'present-at-hand' is that our fundamental encounter with the world takes place in terms of pragmatic activity rather than theoretical observation. That is why Heidegger insists that the world is not a present-at-hand entity or the sum total of objective items. It is an essential part of Dasein's existence, and it anchors human consciousness by providing the scaffolding for cognition. The world is not an abstract entity but is to be understood in terms of involvements, the 'in-order-to' relation. Therefore, he argues that the world 'worlds' for Dasein. We encounter the world in
the horizon of familiarity. That is why he insists on Being, which grants the pre-ontological understanding of the world to Dasein (Heidegger, 1962). Dasein does not discover the world but rather finds itself thanks to its relationship with Being, which Heidegger at times calls the 'in-order-to' relation. This relationship with Being makes transcendence possible for Dasein.
5 Embodied Affordances Merleau-Ponty (2002) acknowledges the Heideggerian notion of world-disclosure but radicalizes phenomenology by bringing in the notion of embodiment. As Merleau-Ponty explains in his magnum opus, Phenomenology of Perception (2002), our access to the world always takes place against the background of our embodied situatedness. Hence, we grasp the object on the basis of the vantage point which the body provides us. In fact, the body is the 'null point' which orients us towards the world; it gives us a grip over the world (Merleau-Ponty, 2002). It is true that we experience the world in terms of tools, of practical objects, but what makes it possible for us to use something as a tool? Obviously, we cannot use tools without the embodied self, or the body-subject as Merleau-Ponty calls it. Unlike in Heidegger, the body is not treated as subordinate to the analysis of space, or as merely one of the modes of being-in-the-world; rather, it becomes the fundamental way in which world-disclosure takes place. Hence, the equipmental nature of things can be interpreted in terms of our embodied engagements. Merleau-Ponty suggests that our body does not inhabit the world the way objects occupy space in the physical world. The phenomenology of the body cannot be explained in terms of mechanical laws; rather, experience and ownership constitute the essential features of what Merleau-Ponty refers to as the 'lived body'. Intentionality does not arise as a mental capacity but as embodied, incarnate intentionality for an embodied subject. Therefore, the body is experienced in terms of the embodied capabilities to carry out certain tasks. The world is not experienced as objective space but as the situation to which we respond on the basis of our embodied capacities. Langer says that we become aware not of the 'spatiality of position but the spatiality of situation' (Langer, 1989: 40). Hence the world becomes the horizon of possibilities onto which the embodied subject projects the possibilities of 'I can' (Merleau-Ponty, 2002). The centrality of embodiment, which discloses the horizon of possibilities for the embodied subject, completely changes the Cartesian paradigm and interprets objects as affordances. The horizon determines what we can afford in the world and
what we cannot. The phenomenological horizon of possibilities constitutes the world as normative. Based on our embodied possibilities, we interpret and expect things in a certain way and not in others.
6 Knowing-That and Knowing-How Phenomenological interpretation of experience brings out the centrality of embeddedness and embodiment in experience. Human intelligence functions through embeddedness, where it sees objects as affordances provided by the environment. Heidegger's analysis of ready-to-hand shows that objects are first encountered as tools, and only when the ready-to-hand function breaks down do they show their present-at-hand character (Heidegger, 1962). Humans can experience this switch because they own the world in a way that enables them to take different comportments towards it. AI machines are modelled on the computational theory of mind, which holds that intelligent behaviours can ultimately be coded into logical symbols which we can transfer to machines so that they, in turn, can exhibit intelligence on par with, if not exceeding, that of humans. However, this conception of intelligence is based on the representational theory of mind. It argues that our epistemic access to the world happens through representation. AI aims to copy the representational structure of mind through logical symbols, which allows it to function intelligently in the world. However, there are many problems with such an account, and philosophers have raised the issues of intentionality and the subjective state of consciousness vis-à-vis the computational understanding of mind (Dreyfus, 1992; Haugeland, 1989). The proponents of AI argue, however, that we can set aside the deeper questions of consciousness and limit ourselves to the question of intelligent tasks, and that even if we subscribe to a functionalist account of the mind, we can still produce artificial intelligence. The argument seems to hold for cognitive and thematic kinds of tasks, as shown by tools like ChatGPT, Sophia, Alexa, etc., which clearly come up with intelligent answers to pressing questions. However, humans do not always work through the conceptual mode. For example, activities like swimming and driving require a more practical kind of knowledge, which Heidegger calls ready-to-hand. This has implications for robotics, as it raises the question of whether artificially intelligent machines can acquire skills which require experience and embodied learning. According to Hubert Dreyfus, AI research is based on the premise that intelligence is fundamentally information processing and can be represented through context-free symbols (Dreyfus, 1992). These context-free symbols can be coded into an artificial language where meaning is determined through the correct usage of these symbols in a given situation. In other words, it is based on the belief that human intelligence can be formalized into formal rules or symbols (Dreyfus, 1992).
However, Dreyfus’ main argument against the classic GOFAI is the Heideggerian distinction between ‘ready-to-hand’ and ‘present-at-hand’ ‘(Dreyfus, 1992). Presentat-hand is the domain of knowing-that which involves factual knowledge and logical reasoning, and ready-to-hand is the realm of knowing-how which involves skills, dispositions, behaviour, etc. (Dreyfus, 1991). Classical AI believes that ‘knowinghow’ can be derived from ‘knowing-that’. According to Dreyfus, ‘knowing-how’ and ‘knowing-that’ are completely different sets of skills. Knowing-how enables us to cope up with the world. It gives us a grip on the world. We need not even respond to the door as affording going out. Indeed, we needn’t apprehend the door at all. From the perspective of the skilled coper absorbed in the solicitation of a familiar affordance, the affording object, as Heidegger puts it, “withdraws”. We need not even be aware of the solicitations to go out as solicitations.. Thanks to our background familiarity…we are simply drawn to go out (Dreyfus, 2013: 18). This suggests that we understand the meaningful world as our world where we engage with it. Only when this meaningful context breaks, as Heidegger shows through the example of hammer, we think about the world objectively. Our skills, behaviours, and practices are enough to provide us the tacit understanding and belief in the world (Heidegger, 1962). The background of ‘knowing-how’ functions as the global context for understanding the world in a pre-reflective manner.
7 Frame Problem in AI The analysis of 'ready-to-hand' or 'knowing-how' by Dreyfus shows that our background structure plays a significant role in establishing our pre-reflective familiarity with the world. As such, the world is experienced in a normative sense where objects are basically affordances through which we structure the world in a phenomenological sense. This pre-reflective familiarity, which Heidegger calls the 'in-order-to' structure, gives us the overall context through which we interpret the various objects and events in the world. Meaning therefore depends on context, and we cannot assign a universal or context-free meaning to objects. Taking the example of the hammer, we can say that its meaning is dependent on its usage in a particular situation. Now the question arises: how do we determine the appropriate context through which we can understand a situation correctly? Suppose somebody makes a gesture at us: how do we interpret it? We can interpret it either as friendly or as insulting depending on the context. There cannot be a context-free understanding of symbols or signs. Hence, the possibilities of interpreting a sign, gesture, or linguistic utterance are manifold, depending on the fluidity of the context. This shows that context is not fixed, and there are many ways to interpret a given context. Humans can interpret the context on the basis of their background experiences and imagination, which help them to negotiate the situation. But how can a machine do this? This poses what is called the 'Frame Problem' in AI (Dennett, 1978).
According to Dreyfus (1992), the 'Frame Problem' arises because of a misunderstanding of human intelligence. The computational understanding of human intelligence argues that the human mind functions like a computer, and therefore that cognitive behaviours related to 'knowing-how' can be reduced to 'knowing-that', which can then be used by deep machine learning for performing practical skills. Dreyfus opines that this is a mistaken understanding of human intelligence which ignores the fundamental distinction between the two kinds of knowing, 'knowing-that' and 'knowing-how' (Dreyfus, 1991). As Heidegger points out, human beings become aware of the world in a pre-reflective manner, and that is how they make the world their own in a phenomenological manner. Hence, 'knowing-how' cannot be reduced to 'knowing-that'. Knowing-how fundamentally points to the embedded nature of human intelligence. Hence, intelligence cannot be reduced to computational ability, and the role of non-formal intelligence such as practical skills, affective states, and emotions cannot be ignored in shaping intelligent behaviour. As a matter of fact, these non-formal factors form the background which gives context to formalistic intelligence. We can further understand the 'Frame Problem' through the analysis of affordances. For a cognitive agent, the world appears as affordances with which she copes on the basis of embodiment. The coping mechanism based on the lived body allows her to interpret the situation on the basis of embodied experience. The element of coping is significant as it allows one to take stock of the overall situation, and individual actions are geared towards the success of the overall situation. For example, when a child is trying to climb over a wall, her consciousness is not focused on every part of her body. Rather, her focus remains on successfully climbing the wall, and the kinaesthetic ability of the body allows her to leverage the body to her advantage. This shows that humans can immediately sense and reshuffle their actions in a changing context on the basis of the 'lived body', which is not possible for machines.
8 Conclusion This chapter has dealt with the problem of intelligence in machines and tried to contrast it with human intelligence from the phenomenological point of view. We have tried to show that the notion of AI, from GOFAI to the latest generative AI, is based on the "calculative model of intelligence", as argued by Dreyfus (1992), whereby AI processes data and comes up with desirable outcomes through the exercise of algorithms. Hence, the current interface between AI and neuroscience aims at converting cognitive behaviour into data which can be analysed in a logical manner. While such an approach drives the major projects in AI, it overlooks the non-formal and non-computational ways of understanding the world. Heidegger's phenomenological analysis shows that humans encounter the world in a horizon of pre-understanding, and this pre-understanding gives us the context wherein our intelligence functions, learns, and adjusts itself. In this way, human intelligence avoids the "Frame Problem" with which AI is grappling. The proper assessment of
this problem lies in embodiment, social situatedness, and emotions which constitute the ‘world’ for humans.
References
Dennett, D. (1978). Brainstorms. MIT Press.
Descartes, R. (2008). Meditations on first philosophy. Oxford University Press.
Dreyfus, H. L. (1991). Being-in-the-world: A commentary on Heidegger's Being and time, division I. MIT Press.
Dreyfus, H. L. (1992). What computers still can't do. MIT Press.
Dreyfus, H. L. (2013). The myth of the pervasiveness of the mental. In J. K. Schear (Ed.), Mind, reason and being-in-the-world: The McDowell-Dreyfus debate. Routledge.
Haugeland, J. (1989). Artificial intelligence: The very idea. MIT Press.
Heidegger, M. (1962). Being and time (Translated by Macquarrie and Robinson). Harper Collins.
Husserl, E. (1983). Ideas pertaining to a pure phenomenology and to a phenomenological philosophy: First book (Translated by F. Kersten). Martinus Nijhoff Publishers.
Kant, I. (2003). Critique of pure reason (Translated by Kemp Smith). Palgrave Macmillan.
Langer, M. (1989). Merleau-Ponty's Phenomenology of perception: A guide and commentary. Macmillan Press.
Merleau-Ponty, M. (2002). Phenomenology of perception (Translated by Colin Smith). Routledge Classics.
Chapter 17
Investigating the Ontology of AI vis-à-vis Technical Artefacts Ashwin Jayanti
Abstract Artificial intelligence is the new technological buzzword. Everything from camera apps on your mobile phone to medical diagnosis algorithms to expert systems now claims to be 'AI', and many more facets of our lives are being colonized by the application of AI/ML systems (henceforth, 'AI'). But what does this entail for designers, users, and society at large? Most of the philosophical discourse in this context has focused on the analysis and clarification of the epistemological claims of intelligence within AI and on the moral implications of AI. Philosophical critiques of the plausibility of artificial intelligence do not have much to say about the real-world repercussions of introducing AI systems into every possible domain in the name of automation and efficiency; similarly, most of the moral misgivings about AI have to do with conceiving of them as autonomous agents beyond the control of human actors. These discussions have clarified the debate surrounding AI to a great extent; however, certain crucial questions remain outside their ambit. Arguments in support of AI often take advantage of this void by emphasizing that AI systems are no different from previously existing 'unintelligent' technologies, thereby implying that the economic, existential, and ethical threats posed by these systems are either similar in kind to those posed by any other technology or grossly misplaced and exaggerated. In this chapter, we shall think through this assumption by investigating the ontology of AI systems vis-à-vis ordinary (non-AI) technical artefacts to see wherein lies the distinction between the two. I shall examine how contemporary ontologies of technical artefacts (e.g., intentionalist and non-intentionalist theories of function) apply to AI. Clarifying the ontology of AI is crucial to understanding its normative and moral significance and the implications therefrom. Keywords Artificial intelligence · Alignment problem · Ontology of technical artefacts · Normativity
A. Jayanti (B) Human Sciences Research Centre, International Institute of Information Technology, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 S. Menon et al. (eds.), AI, Consciousness and The New Humanism, https://doi.org/10.1007/978-981-97-0503-0_17
1 A Brief Survey of the Ontology of Technical Artefacts Stuart Russell and Peter Norvig, in their highly influential reference textbook on artificial intelligence, summarize their position on the risks of artificial intelligence by noting that apart from the threat of ultraintelligent machines, “some of the threats [posed by AI] are either unlikely or differ little from threats posed by other, ‘unintelligent technologies’” (Russell & Norvig, 2003: 964). This is representative of a commonly held view which conceives of AI as a subset of other everyday technologies, and it is the objective of this chapter to think through this view and investigate into the similarities and distinctions between AI and technical artefacts in terms of their ontology. Before stepping into the question of what criteria (if any) marks the distinction between everyday technical artefacts and AI systems, it is imperative to clarify what it is that characterizes a technical artefact qua technical artefact. The empirical turn within the philosophy of technology—in shifting emphasis away from the overarching, blanket concept of ‘technology’ and focusing on individual technological artefacts—has led to numerous insights on this question. The ontology of technical artefacts also bears upon the question of the moral status of the same. Here, I shall outline some of the major views on the ontology of technical artefacts. This would offer a framework which would enable juxtaposing AI systems with conventional technical artefacts. A technical artefact (hereafter referred to as ‘TA’) is defined as an artefact that is intentionally produced to serve a purpose (Hilpinen, 2018; Preston, 2022). The two criteria this definition captures are those of ‘intention’ and ‘function’. This differentiates natural objects from TAs in that the former are not intentionally produced, although they may be used to fulfil certain functions. For example, a naturally occurring pebble when used as a paper weight may serve the function well; however, it would not be considered a TA according to this definition, since it is not intentionally produced. Another notion that is of importance here is that of ‘structure’. There is a close relationship between the structure and function of a technical artefact. The question regarding how these two are related has led to much discussion on whether the relationship should be understood in terms of causation, disposition or norms. This further differentiates TAs from art works as well as from social artefacts such as money, wherein there is more of an arbitrary relationship between the structure of the note/coin and its function in terms of exchange value. Synthesizing from various perspectives on the production, evolution, and use of artefacts as part of our material culture, Beth Preston lists certain criteria “for what would count as a full-fledged, well-integrated theory of artifact function”. These are as follows: multiple realizability, multiple utilizability, recycling, reproduction with variation, malfunction, and phantom function (Preston, 2009a). She then evaluates which of the prevalent theories of artefact function fulfil these criteria and places these along a spectrum—from intentionalist theories on one side and non-intentionalist or reproduction theories on the other. Preston borrows some of these from Vermaas and Houkes (2003). Herein, intentionalist theories are those which give priority and precedence to the role of intentions (collective or individual or both) of designers/
users in assigning functions to artefacts. Non-intentionalist theories are those which assign primacy to non-intentional aspects such as histories of selection, use and reproduction of artefacts, much like the evolution of traits in species. The intentionalist and non-intentionalist theories are not mutually exclusive and either side might have conceptual space to accommodate the other.
2 Intentionalist and Non-intentionalist Theories of Artefact Function Beth Preston lists John Searle, Randall Dipert, Peter McLaughlin, and Karen Neander as espousing a version of intentionalist theory of artefact function. For Searle, all functions are observer-relative and are assigned to things and institutions through collective intentionality based on certain common ends and values. Dipert traces the origins of artefact function solely to the beliefs and desires that contribute to the intentions underlying those functions. McLaughlin famously captures his views in this slogan, “no agent, no purpose, no function” and Neander extends a theory which evokes biological functions only to disambiguate them from artefact functions. Unlike their biological counterparts, artefact functions can be conceived of as traits which are the effect of intentional selection by human agents, whereas the former are traits that are selected through non-intentional, evolutionary processes and are generalizable across and applicable only to types (i.e., species). Towards the other side of the spectrum are non-intentional theories of artefact function. Representatives of these include Paul Griffiths, Ruth Millikan, Pieter Vermaas and Wybo Houkes, and Preston herself. Preston positions herself in the purely non-intentionalist end of the spectrum and places the rest of the views along the middle of the spectrum. Griffiths’ theory of functions is a selectionist one in which the functions of artefacts are selected both intentionally through conscious design as well as unintentionally through trial and error and the history of reproduction of the artefact. This allows for artefacts to have proper functions which are over and above the intentions of the designers. The concept of a ‘proper’ function captures the function that an artefact has been intended or selected to perform, i.e., that which explains the existence and normativity of the artefact. This is to distinguish these functions from accidental or unintended functions. The proper function is what distinguishes a functioning from a malfunctioning artefact. For example, the proper function of a bottle is to contain liquid; it might however be used for other functions such as a projectile or as a funnel, etc. A bottle, which does not perform the latter functions cannot be said to be a malfunctioning bottle. Ruth Millikan also advances a mixed theory which allows for technical artefacts to have both intentional as well as non-intentional proper functions. She classifies proper functions into two kinds—direct and derived. Direct proper functions of artefacts have their source in the history (non-intentional) of selection and reproduction of the artefacts, whereas
derived proper function has its source in the intentional use of agents based on their desired ends. Beth Preston, as noted previously, attempts to provide a purely non-intentionalist theory of artefact function. She classifies functions into proper functions and system functions. Proper functions are those that the artefacts have been historically reproduced to serve, while system functions are established by the role the artefacts play in a system on the basis of their capacities, regardless of their histories of use and reproduction. To revert to the bottle example, the proper function of the bottle is to contain liquid, while the system function could be to serve as a base for launching a bottle rocket or to be used as a vase, a funnel, a projectile, etc. Preston then goes on to evaluate various contemporary theories of artefact function in terms of how well they account for the various criteria as listed above. According to Preston, the criteria for a general theory of material culture include multiple realizability, multiple utilizability, recycling, reproduction with variation, malfunction, and phantom function. The above criteria may be supplemented by those extended by Vermaas and Houkes (2010; see also Preston, 2009a), who provide a conceptual analysis of technical artefact functions based on their use and design. They describe their method in the following manner: "We take an engineer's attitude towards our intuitions: we list our intuitive, phenomenological 'data' and then translate them into clear specifications—or desiderata, as we shall call them—for a theory of technical artefacts (Vermaas & Houkes, 2010)." Through such an analysis, they arrive at four desiderata, each of which reflects a phenomenon: the proper-accidental desideratum (reflecting the phenomenon of use versatility), malfunctioning (possible lack of success), support (physical restriction), and innovation (novelty).1 Stated normatively, these desiderata amount to a theory of artefacts that ought to (a) distinguish proper and accidental functions; (b) account for proper functions that allow for malfunctioning; (c) have a measure of support for ascribing function to an artefact, regardless of malfunction; (d) ascribe intuitively correct functions to innovative artefacts.2 These desiderata may be considered the least common denominator of criteria for an entity to be considered a technical artefact. With these at hand, we may investigate the ontology of AI systems.
1 They do acknowledge, however, that their desiderata for an adequate theory of artefact functions are contextual and that other contexts might be accounted for by alternative desiderata.
2 Preston (2003) offers a critical examination of the four desiderata of Vermaas and Houkes. She questions the plausibility of any theory to adequately address all four without contradiction and argues for giving up D4. In this paper, the four desiderata will be considered as a heuristic for any theory of technical functions. See also Kroes (2012: 76) for a discussion on and extension of these desiderata.
3 AI Systems and TA Desiderata The ontology of AI systems, as technical artefacts, hinges on whether or not they satisfy the same desiderata as those pertaining to technical artefacts. If they do, they could be said to be ontologically similar to technical artefacts. However, if they do not, then it would entail either that (a) AI systems are ontologically distinct from technical artefacts, or (b) the desiderata are inadequate and apply only to a certain subset of technical artefacts which are taken to be paradigmatic of all technical artefacts.
3.1 Proper and Accidental Functions Do AI systems satisfy these desiderata? Starting with the first, we ask if a distinction can be made between proper functions and accidental functions in the case of AI systems. Remaining neutral between the intentionalist and non-intentionalist theories of artefact function, we could inquire into how AI functions may be thought of within each of these frameworks. From an intentionalist perspective, the proper function of artefacts results from the intentions of agents (as designers or users). Hence, the proper function of AI systems ensues from the intentions of the programmers, which determine the approach to learning and the training data used to train the ML algorithm. The proper function of an image recognition algorithm for plants, for instance, is to identify plants according to their genus and species and label them accordingly. This is the function the programmer intended for the algorithm to perform. So, this gives us the proper function from an intentionalist perspective. Established thus, the proper function also marks the normativity of the artefact in that an algorithm that does not successfully recognize plants can be said to be a malfunctioning algorithm. Could it also have an accidental function that makes it multiply utilizable? This seems quite plausible, since we can envision various uses that such an algorithm can be put to, such as in a mobile phone application that helps hikers recognize certain species of plants and classify them as edible or inedible. Other uses could be as a learning tool for users to keep track of the growth of their plants and receive tips on how to maintain each, etc. This fulfils the first desideratum from an intentionalist perspective. From a non-intentionalist viewpoint, on the other hand, the proper function of the algorithm is that for which it has historically been selected, used, and reproduced. By way of illustration, let us take the example of Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), the recidivism risk assessment algorithm, which has been used to inform sentencing decisions within the legal system in certain North American states since 1998. This example is taken from Christian (2021: 56). Initially developed for predicting the risk of recidivism among parolees, it has nevertheless come to be used predominantly by courts to inform sentencing decisions. Going by its use and reproduction history, the proper function of COMPAS
then is to inform sentencing decisions. Since it is the proper function that makes normative claims on the artefact, such that it is the function the artefact ought to perform, it is the failure to realize the proper function which marks the artefact as malfunctioning. This satisfies the second desideratum, that of malfunctioning. The non-intentionalist perspective allows for accidental function (or 'system function', as it is called) as the function for which the algorithm is currently being used, regardless of its use and reproduction history. So, if the COMPAS algorithm is being used to detain citizens at a protest based on their risk assessment score, then that would be its accidental or system function. In contrast to its proper function, however, this does not ground the normativity of the algorithm.
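To make the proper/accidental distinction more concrete, a minimal sketch in Python may help; it assumes a purely hypothetical plant classifier (the function names, app scenarios, and data below are illustrative inventions, not drawn from this chapter or any real system):

# Hypothetical illustration of a designer-intended (proper) function versus
# an accidental ('system') use of the same AI output. Nothing here is a real system.
from dataclasses import dataclass

@dataclass
class Prediction:
    genus: str
    species: str
    confidence: float

def classify_plant(image_bytes: bytes) -> Prediction:
    # Proper function on the intentionalist reading: label a plant image with
    # genus and species, as the designers trained and intended the model to do.
    # (A trained image-recognition model would go here; a fixed value stands in.)
    return Prediction(genus="Quercus", species="robur", confidence=0.91)

def field_guide_label(image_bytes: bytes) -> str:
    # Intended use: a field-guide app displaying the botanical label.
    p = classify_plant(image_bytes)
    return f"{p.genus} {p.species} ({p.confidence:.0%} confidence)"

EDIBILITY = {("Quercus", "robur"): False}  # toy lookup table, purely illustrative

def edibility_screen(image_bytes: bytes) -> bool:
    # Accidental or 'system' use: the same output repurposed to judge edibility,
    # a task the model was never trained or validated for.
    p = classify_plant(image_bytes)
    return EDIBILITY.get((p.genus, p.species), False)

On the intentionalist reading, only the first use expresses the artefact's proper function; on a non-intentionalist reading, the second acquires normative weight only if it becomes sedimented in a history of use and reproduction.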
3.2 Malfunction Once proper function is established, it is but an easy step to make sense of the second desideratum, that of malfunctioning. A malfunctioning instance of an AI system is one which fails to realize its proper function. In intentionalist terms, it is an AI system that does not perform its designer-intended function; in non-intentionalist terms, it is one that does not perform the function for which it has been historically used and reproduced. The malfunction desideratum, however, raises interesting conundrums which we shall get to after addressing the other two desiderata of support (physical restriction) and innovation (novelty), which are comparatively straightforward.
3.3 Support (Physical Restriction) This desideratum requires that AI systems have a measure of support in terms of physical structure such that they may be ascribed a function, regardless of malfunction. This requires the system to have a base structure that can be pointed to and said to malfunction, just as one may point to a telephone and say that it is out of order. What, then, could be the physical structure that underlies AI systems? We may point to the hardware underlying the AI systems. This includes all the electronic paraphernalia such as the microprocessors, memory units, storage devices, and architecture of the machine on which the neural network algorithms run, enabling the machine to learn and perform the requisite proper function. In this they may seem to be analogous to ordinary technical artefacts such as light bulbs and cups, albeit much more sophisticated and complex than these pedestrian counterparts, involving not just complex machinery but various layers of neural networks, training data, and learning methods (e.g., supervised learning, unsupervised learning, and reinforcement learning, to name some of the predominant methods). A matter of great concern about AI systems in this context is why and how exactly the outputs come about. Why does the system output x rather than y? This seems to be an aspect that is unique to AI systems,
and it goes by the name of explainable and interpretable AI. They are seen as black boxes which we are as yet unable to take apart and investigate to gain insight into their mechanism of functioning. Think of a television set which displays channels while offering no scope for an explanation or understanding as to why one channel is being displayed as opposed to another. Regardless of this idiosyncrasy, AI systems may be said to satisfy the desideratum of support.3 Peter Kroes offers a complementary way of parsing this desideratum, whereby "the function and structure of an artefact cohere and constrain each other such that it will not be possible to change [its] function…without changing its structure, and vice versa" (2012: 116). This brings the support desideratum into relief more clearly in the context of AI systems, since their function is constrained by the above-mentioned structural features of underlying hardware, algorithm, neural network, training data, and learning methods. An AI system for early cancer detection cannot be used for recidivism risk assessment. From a non-intentionalist perspective, this desideratum can be straightforwardly addressed. The history of selection and reproduction of an AI system ensures that the function for which it has been selected and reproduced is indeed successful, which in turn necessitates a coherence between the function and the underlying structure enabling the function. According to Preston, however, this desideratum poses a challenge for the intentionalist perspective: "…if artifact function depends only on what the designer wants, hopes, or expects the artifact to do, there is no explicit provision for any justification that its physical structure is such as to make its actually performing that function likely, or even possible" (2003: 603). Vermaas and Houkes (2010) as well as Kroes (2012) address this aspect (successfully) by appealing to the technical knowledge exercised by the designer in the process of realizing the intended function. Kroes addresses this challenge by referring to the notion of 'a largely correct substantive idea' of what it is that makes an artefact an instance of a technical artefact kind K; this substantive idea is composed of both the intended functional as well as structural or design features. "For a technical artefact kind both intended functional features and intended structural (design) features are necessary ingredients of the largely correct substantive idea" (Kroes, 2012: 103). In this way, technical artefacts have a dual nature—their ontology necessitating both functional features in terms of design intentions as well as structural features in terms of physical characteristics. Kroes marks these criteria as demarcating technical artefacts from social artefacts such as money, in that the latter's functions are assigned to them through collective intentionality and independently of the underlying physical structure.4
3 Interestingly, recent work in efficient hardware design for AI systems is bringing to light analog computing as having an advantage over its digital counterpart in enabling faster and more effective execution of machine learning algorithms. This clearly highlights the significance of the structural basis of AI systems.
4 John Searle, on the other hand, generalizes this collective-intentionality-based theory of function assignment to all artefacts—social as well as technical.
3.4 Innovation (Novelty) This desideratum requires us to ascribe proper functions to innovative artefacts. Intentionalist theories can address this desideratum by simply appealing to the intentions of the innovators in assigning proper functions to novel prototypes. Non-intentionalist theories, on the other hand, especially that of Preston, have no way of accounting for the proper function of novel prototypes. As previously noted, Preston accounts for proper function based on the history of use and reproduction; however, since novel prototypes are 'novel' in just this sense that they are yet to be sedimented into such a history, they are yet to acquire a proper function. Preston notes that novel prototypes have system functions by way of the current role they fulfil in a cultural system owing to their physical capacities or dispositions, but have no proper function that assigns normativity to them in terms of marking what they ought to do. In the context of our topic, this implies that novel AI systems have no proper function—there is no function that they ought to fulfil. We could say that they are but 'works in progress.' The COMPAS algorithm, therefore, has the proper function of helping judges by informing sentencing decisions, since this is what its history of use and reproduction has established its function to be. However, as a novel prototype developed by Tim Brennan and Dave Wells in 1998, it had no proper function. As we shall see later, this precludes addressing one of the most pressing questions regarding the use/misuse of AI systems—the alignment problem (Christian, 2021). Intentionalist theories, on the other hand, can appeal to the intentions of the engineer/designer/innovator in accounting for the proper function of novel prototypes. COMPAS was a novel prototype of an algorithm designed to predict the probability of a crime being committed within the first three years of an inmate's release on parole/probation. This intended function assigns normativity to the artefact and marks it for what it is supposed, intended, and ought to do. Any other use that it is put to can be thought of in terms of accidental or idiosyncratic use. In latching on to design intentions, the intentionalist theory does enable disambiguation between appropriate and inappropriate uses of AI systems, as we shall see below.
4 AI Systems vis-à-vis Technical Artefacts The above analysis of AI systems suggests that they do fulfil the desiderata that are commonly shared by both intentionalist as well as non-intentionalist theories of artefact function. There are, however, certain aspects that are unique to AI systems which call for our attention. This has to do with making sense of what has come to be known in AI literature as the 'alignment problem' or 'value alignment'. The alignment problem has to do with the divergence or misalignment between the intended functions of AI systems and their actual outcomes. As Christian puts it, it has to do with the misalignment between our intended values and desires on the one hand and the values and desires as captured by the models underlying the AI systems (2021: 13).
Stuart Russell refers to this failure of value alignment as resulting from the possibility that "we may, perhaps inadvertently, imbue machines with objectives that are imperfectly aligned with our own". Russell's exposition of this problem tends to suggest that this is a problem that is unique to AI systems, since "[u]ntil recently, we were shielded from the potentially catastrophic consequences by the limited capabilities of intelligent machines and the limited scope that they have to affect the world" (Russell, 2019: 137).5 In the light of the alignment problem, the second desideratum, that of malfunction, brings into relief an interesting and significant fault line between the intentionalist and non-intentionalist theories.6 The non-intentionalist theory, of which Preston's might be taken to be representative, does not enable an account of the alignment problem. Reverting to the example of the COMPAS algorithm, the alignment problem manifests itself as the gap between "what we intend for our tool to measure and what the data actually captures" (Christian, 2021: 76). This results in the grave consequence that the algorithm classifies those who successfully evade arrest as low-risk and those who are wrongly convicted as high-risk individuals. The people falling under the 'low-risk' classification are recommended for release by the algorithm, whereas those falling under the 'high-risk' classification are recommended for detention. Therefore, whereas the training data captures 'rearrest' and 'reconviction', the COMPAS algorithm is used by the courts to predict 'reoffence'. While the designer-intended proper function of the COMPAS algorithm was to assess recidivism risk and inform probation decisions, the algorithm has been reproduced and used by courts to inform sentencing decisions. As testified by the widespread adoption of the algorithm for the latter purpose, Preston's non-intentionalist perspective would have us ascribe this latter function as its proper function. What makes this problematic is the elimination of any scope for critique—such as those by ProPublica and others—which point to the misalignment between 'what the tool is intended to measure' and 'what the training data actually captures'. Intentionalist approaches (which satisfy the physical restriction desideratum), on the other hand—in taking designer intentions into consideration while ascribing proper function—enable us to make sense of the distinction between the intended proper function (recidivism risk assessment for parole decisions) and improper function (informing sentencing decisions); examples of such approaches are Kroes' dual nature theory and Houkes and Vermaas' ICE theory of technical function (Kroes, 2012; Vermaas & Houkes, 2010).
5 He echoes Norbert Wiener's pronouncement that "[i]n the past, a partial and inadequate view of human purpose has been relatively innocuous only because it has been accompanied by technical limitations.... This is only one of the many places where human impotence has shielded us from the full destructive impact of human folly" (Russell, 2019: 137).
6 It must be noted at the outset that a similar issue arises in non-AI technical artefacts too, and this goes by the name of 'unintended consequences'. The former is quite distinct from the latter in that whereas the latter has to do with a break between designer intentions and use context, the former has to do with designer intentions and technical function itself. It is in this sense that the alignment problem can be said to be unique to AI systems, although these systems may also be prey to unintended consequences arising out of unforeseeable use contexts.
Such intentionalist theories, therefore, bring into relief the misalignment between design and use and afford a critical perspective on AI systems. Moreover, even if the designers had intended for it to predict reoffence, one could look under the hood of the algorithm and point to the misalignment between the design intention and the actual function, i.e., the gap between the function the tool is intended to perform and what the training data captures. The question for non-intentionalist theories is this: how to address the distinction between what the training data captures and what the algorithm has been used and reproduced for? How to address the misalignment between the function of COMPAS to assess recidivism risk and the use of COMPAS to inform sentencing decisions? The following passage illustrates this concern clearly: [S]ome states are using the COMPAS tool to inform sentencing decisions, something that many regard as an awkward if not inappropriate use for it. Says Christine Remington, Wisconsin assistant attorney general, "We don't want courts to say, this person in front of me is a 10 on COMPAS as far as risk, and therefore I'm going to give him the maximum sentence." But COMPAS has been used to inform sentencing decisions—including in Wisconsin. When Wisconsinite Paul Zilly was given a longer than expected sentence in part due to his COMPAS score, Zilly's public defender called none other than Tim Brennan himself as a witness for the defense. Brennan testified that COMPAS was not designed to be used for sentencing. At a minimum, it seems clear that we should know exactly what it is that our predictive tools are designed to predict—and we should be very cautious about using them outside of those parameters. "USE ONLY AS DIRECTED," as the label reads on prescription medications. Such reminders are just as necessary in machine learning (Christian, 2021: 78; italics mine).
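The gap between what a tool is intended to measure and what its training data actually capture can also be sketched schematically. The following toy Python fragment is a hypothetical illustration only (fabricated features, labels, and thresholds; it bears no relation to the actual COMPAS model): the label the classifier learns from is rearrest, yet the function that consumes its score is named and used as if it predicted reoffence and warranted a sentencing recommendation:

# Toy illustration of the measurement/target gap discussed above.
# All data and names are fabricated; this is not the COMPAS system.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fabricated defendant features (e.g., age, number of prior arrests).
X_train = np.array([[22, 3], [45, 0], [31, 5], [52, 1], [19, 4], [38, 0]])

# What the data actually captures: rearrest within two years (1 = rearrested).
# It records neither undetected reoffending nor wrongful arrests.
y_rearrest = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X_train, y_rearrest)

def risk_score(features):
    # What the model estimates: the probability of rearrest.
    return float(model.predict_proba(np.array([features]))[0, 1])

def sentencing_recommendation(features, threshold=0.5):
    # How the score may come to be used: informing sentencing decisions,
    # a purpose for which the model was neither designed nor validated.
    return "longer sentence" if risk_score(features) > threshold else "standard sentence"

Nothing in the fitted model changes between the two functions; the misalignment lies entirely in the slippage between the label the data encodes and the target the use context attributes to the score.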
Preston’s non-intentionalist theory leads us to accept the use of COMPAS for informing sentencing decisions as a proper use, given that it has come to be widely used by courts for that very purpose. The Wisconsin Supreme Court affirmed the use of COMPAS risk scores to inform sentencing judgments as appropriate (see Christian, 2021: 350, fn. 81). In so doing, the non-intentionalist theory fails to acknowledge and account for the alignment problem, which seems to be a unique feature of AI systems. Let us see how COMPAS compares to aspirin, which Preston cites as an example to make the case for a non-intentionalist theory of function: Aspirin tablets were originally designed to relieve fever and pain, but are now prescribed for the prevention of cardiac problems, as well. Here the added function qualifies as a proper function, since aspirin manufacturers market specially formulated tablets for this purpose. So, proponents of the intentionalist view owe us an account of how and under what conditions the intentions of users can override or supplement the intentions of designers (Preston, 2013: 164).
Comparing the two examples, one can see that whereas it was a medical and scientific discovery that aspirin had cardiovascular benefits, it is but a policy decision taken by the legal establishment that COMPAS be used to predict reoffence and inform sentencing decisions. The distinction between the two lies in the discovery (albeit serendipitous) of cardiovascular benefits of aspirin, on the one hand, as opposed to the assignment of function via collective intentionality of the judges and courts, on the other. In the case of COMPAS, there is then a gulf between designer intention and collective intention, both of which seem to invoke an intentionalist approach to
technical functions. How would Preston's non-intentionalist theory account for this gulf? One plausible response from Preston could be that this is to be seen as an instance of a phantom function, whereby "an item of material culture is constitutionally incapable of performing a function it is widely taken to have" (Preston, 2013: 177). But this has the intriguing implication of conceiving of the alignment problem as a species of defectiveness in the AI system. This brings about a shift in narrative from 'y is not the proper function of x' to 'x does not perform its proper function y.' The former is a question having to do with consideration of ends, while the latter is a question having to do with means. What marks the alignment problem as distinct from a malfunction or defect is that in the case of the alignment problem, the function is realized, albeit towards an outcome that might be at varying degrees of remove from the intended function, whereas in the case of a malfunction or defect, there is no realization of function. This unpredictability and inexplainability marks the alignment problem as a phenomenon unique to AI systems. The non-intentionalist theory falls short of adequately accounting for this unique aspect of AI systems. On the other hand, the intentionalist theory has the task of explaining the misalignment between designer intentions and AI function. This invokes training data and consequently its source in existing patterns of social inequities, thereby referring back to society at large. In this aspect, AI systems may aptly be termed 'suigeneric structures with sociogeneric functions', alluding to a collapse of the dichotomy and mutual exclusivity of intentionalist theories (with their prevalent suigenerism) and non-intentionalist theories (with their prevalent sociogenerism) (Preston, 2013: 77). This hybridity of AI systems could be taken to suggest a difference in kind between AI systems and everyday technical artefacts that are generally taken to be part of our material culture.
5 Conclusion Philosophical literature on AI has laid emphasis largely on its consequences—be it for social upheaval, superintelligence, or unemployment. This chapter, on the other hand, is an investigation into a hitherto neglected aspect of AI systems, i.e., their ontology vis-à-vis everyday technical artefacts. This investigation poses a challenge for a unified theory of function that can account for both AI systems as well as technical artefacts. It has pointed to some challenges faced by intentionalist as well as non-intentionalist theories of artefact function in accounting for the normativity of AI systems. As illustrated by this preliminary investigation into the ontology of AI systems vis-à-vis technical artefacts, the non-intentionalist theory set forth by Preston falls short of adequately accounting for and addressing a unique feature of AI systems—the alignment problem. Although the intentionalist theories referred to in this investigation satisfy the requisite desiderata, they face the challenging task of addressing the sociogeneric aspect of AI function, i.e., the training data on which
the machine learning systems are trained. The preliminary outcome of this investigation is that it calls for conceiving of AI systems as hybrids of ‘suigeneric structures with sociogeneric functions’. This hybridity accounts for the uniquely eccentric and morally significant issue of the alignment problem which plagues AI systems. This seems to suggest that AI systems are indeed different in kind from technical artefacts and calls for further investigations into the nature and extent of the differences between the two. The results of these investigations will have significant implications for whether currently prevailing frameworks within philosophy of technology can be revised and extended to make sense of AI systems, or whether there would be a need to develop novel ways for thinking through the philosophical and ethical implications of AI systems in particular.
References

Christian, B. (2021). The alignment problem: Machine learning and human values. W.W. Norton & Company.
Hilpinen, R. (2018). Artifact. In Stanford encyclopedia of philosophy (Summer). https://plato.stanford.edu/archives/sum2018/entries/artifact/
Kroes, P. (2012). Technical artefacts: Creations of mind and matter. Springer.
Preston, B. (2003). Of Marigold beer: A reply to Vermaas and Houkes. The British Journal for the Philosophy of Science, 54(4), 601–612.
Preston, B. (2009a). Philosophical theories of artifact function. In A. Meijers (Ed.), Philosophy of technology and engineering sciences (Vol. 9). Elsevier.
Preston, B. (2013). A philosophy of material culture: Action, function, and mind. Routledge.
Preston, B. (2022). Artifact. In Stanford encyclopedia of philosophy. https://plato.stanford.edu/archives/win2022/entries/artifact/
Russell, S. (2019). Human compatible: Artificial intelligence and the problem of control. Viking.
Russell, S. J., & Norvig, P. (2003). Artificial intelligence: A modern approach. Pearson.
Vermaas, P. E., & Houkes, W. (2003). Ascribing functions to technical artefacts: A challenge to etiological accounts of functions. The British Journal for the Philosophy of Science, 54(2), 261–289.
Vermaas, P. E., & Houkes, W. (2010). Technical functions: On the use and design of artefacts. Springer.
Chapter 18
Being “LaMDA” and the Person of the Self in AI Sangeetha Menon
Abstract The emergence of self in an artificial entity is a topic that is greeted with disbelief, fear, and finally dismissal of the topic itself as a scientific impossibility. The presence of sentience in a large language model (LLM) chatbot such as LaMDA inspires us to examine the notions and theories of self, and its construction and reconstruction in the digital space as a result of interaction. The question whether the concept of sentience can be correlated with a digital self, without a place for personhood, undermines the place of sapience and other higher-order capabilities. The concepts of sentience, self, personhood, and consciousness require discrete reflections and theorisations.

Keywords Self · Personhood · Consciousness · Digital self · Floating self · LaMDA · AI
1 The Pervasiveness of AI

We all, knowingly, unknowingly, or both, are influenced by the AI that surrounds us. We participate in and generate content that is shared across platforms. If we use smartphones, social media, digital calendars, search engines, Google Assistant, Alexa, or even email, then we are automatically part of the social interactive system defined by AI, often without our full cognisance. Web browsers, smartphones, and smart gadgets, along with IoT devices and systems, have changed our responses to life situations. There are plenty of apps that can be installed on one's smartphone, and these in turn design, define, and control life and daily living. The Internet, apps, and smart gadgets offer new choices on a day-to-day basis. Apart from the consumer industry, the healthcare industry is also increasingly adopting AI and AI-based virtual assistants that give better and faster services. These applications are gaining more and more interest among the public.
The application of AI in agriculture, banking systems, education technologies, and personalized learning has changed the market economy. Smart TVs, smart speakers, domestic robots, smart lamps, connected thermostats, networked and connected door locks, appliances, car devices, energy monitors, and home security systems are the top IoT consumer products, excluding smartphones, in 2023. Today we seek information in more sophisticated, complex, interconnected, and shared ways than we did twenty years ago. We use gadgets and apps that are designed and frequently updated based on user feedback, in order to respond to our unique needs, and these further reinforce the way we think, act, and make choices. The IoT, GPS technologies, increasing computing power, and a wide range of other appliances and applications, which are designed to serve our needs and reduce effort and time, have the potential to change our lives drastically in the near future.

The recent discussions on using robots and androids in meetings, classrooms, and conferences are more than a hype that can be easily dismissed. There is growing awareness of the digital content and digital avatars of deceased people, which their loved ones could store and interact with on a continual basis, in the same way the deceased would have responded if they were alive. The future of life after death is not only a philosophical topic but also a digital one, perhaps in another ten years. Augmented reality (AR) motivates nerds to move on, in a combined space of hallucination and virtual fantasy, from digital avatars that socially interact to a realm of trans-humans and cyborgs, at least psychologically, with physical implications. Augmented reality is a mix of the physical and digital and has the potential to blur the distance and distinction between the real and the virtual, the actual and the artificial.

Another cutting-edge AI intervention is "emotional artificial intelligence", and the global market is booming with "Cogito", "Emotient", "Realeyes", and many other such market-driven players. By accessing and deciphering emotions based on facial cues and enabling machines to recognize and react correctly to human emotions, emotion AI, through advanced computer vision technologies, attempts to enhance user experience by providing more nuanced and fulfilling interactions. Virtual assistants and chatbots are increasingly incorporating emotion AI to comprehend human emotions accurately and to match and mimic human responses and interactions. Another major contribution of emotion AI is in healthcare settings, supporting patients emotionally, aiding mental health diagnosis, and remotely monitoring emotional well-being. Emotion AI is expected to improve educational experiences by modifying instructional content, offering individualized feedback, and keeping track of the interest and emotional states of students. The development and implementation of emotional AI face major challenges in the protection of privacy, the ethical use of data, and the avoidance of biases in emotion detection algorithms. In addition, there is the challenge of including a context-dependent approach to emotions and their cultural variations.

The journey from mechanical and electronic devices to the Internet, to digital devices, and to AI-based services has created curiosity as well as apprehension. The commencement of the Internet for mass public use in the nineties paved the way for a digital era beyond imagination within the next twenty-five years.
In 1999 the United States Internet Council released the first “State of the Internet” report
summarizing the key trends in the "development of a social, political, and economic communications revolution that has emerged on the world stage in just a few years". According to the report, titled "State of the Internet 2000" (International Technology and Trade Associates (ITTA) Inc., 1 September 2000), of the US Internet Council, the number of people who used the Internet worldwide on a regular basis in 1993 was fewer than 90,000, which grew to more than 304 million in 2000. It took only one year for an estimated 300 million people to use the Internet on a frequent basis for "business, research, shopping, personal correspondence, social interactions, entertainment, listening to radio, and communications and information sharing functions of every description", as per the Internet report of 2000. The report foresaw "the internationalization of the net, the rise of multiple Internet hotspots around the world, the boom in regional e-commerce, and the impact of wireless innovations in Europe and Asia" that happened in the next ten years.

As per the "Cisco Annual Internet Report 2023" on global Internet adoption, devices, and connections, the insights are that nearly two-thirds of the global population will have Internet access, and there will be 5.3 billion total Internet users (66% of the global population) by 2023, up from 3.9 billion (51% of the global population) in 2018. The report says that the number of devices connected to IP networks will be more than three times the global population by 2023. There will be 3.6 networked devices per capita by 2023, M2M connections will account for half of the global connected devices and connections by 2023, and there will be 14.7 billion M2M connections by 2023. Connected home applications will have the largest share, and connected cars will be the fastest-growing application type. By 2023, global mobile devices will grow from 8.8 billion in 2018 to 13.1 billion, of which 1.4 billion will be 5G capable, and over 70% of the global population will have mobile connectivity. On global network performance, the report says that fixed broadband speeds will more than double by 2023 and mobile (cellular) speeds will more than triple by 2023. Nearly 300 million mobile applications will be downloaded, and social media, gaming, and business applications will be the most popular downloads by 2023.

According to the "Digital 2023 Global Overview Report", a total of 5.44 billion people use mobile phones in early 2023, equating to 68% of the total global population, an increase of just over 3% during the past year, with 168 million new users over the past 12 months. There are 5.16 billion Internet users in the world today, which means that 64.4% of the world's total population is now online. Further, there are 4.76 billion social media users around the world today, equating to just under 60% of the total global population (Kemp, Digital Report, 2023).
2 From the Playground to the Play Store

The digital era is advancing at a fast pace, and dependence on digital information is increasingly something one cannot ignore even if one chooses to. We have moved substantially from the good old playgrounds to the "Play Store", not only for gamified experiences but also to organize our lives. The impact of
digital communication is widespread, and the way people live, interact, and share data has changed considerably in the last twenty-five years. The emotions we possess, the identities we carry, the memories we retain, the decisions we take, the unconscious influences we behold, and the free will that we exercise—all these are correlated to the fundamental nature of our consciousness and determine how the self expresses itself and how identity is shaped or evolves. This process becomes much more complex in a multi-cultural and pluralistic world. At the same time, because of the networked and less private Internet space with which we are surrounded and connected, we generate a large amount of data in the form of text, audio, image, and video that reveals who we are, the nature of the choices and decisions we make, and how we conjure up our role in the shared social space of pluralities, diversities, and varied colours. The digital avatars that one creates and shares become a signature of identity in the digital space.

Before the overarching impact of the Internet, the way we shaped our identities and carved our self-perceptions was primarily determined by our cultural practices, styles of decision-making, and the psycho-social interactions and parameters we followed and adopted. The playgrounds in the schools, parks, and the friendly neighbourhoods inspired the way we developed and shaped our self from our childhood. Today the playground has given way to the "Play Store", without which children, adults, parents, teachers, physicians, scientists, and almost all of society cannot keep up with the emerging trends of socialization, content creation, and information sharing. The intersections between culture and the digital space have become crucial in the last twenty years and influence the way we define self-image, self-awareness, and the role of the other. The way we produce and use information has become sophisticated and fast-changing, with the result that, without updates and new versions, applications and smart devices become outdated and non-functional within a short span of time. The rapid pace of the digital space forces us to change and adapt our day-to-day needs in a certain manner, and such changes further influence the way we think, act, and make choices, which in turn influences economic and social conditions. The number of apps downloaded and installed on a smart device from the Play Store, and the smart wearables we use, will decide the way we spend a day, starting with monitoring the morning walk and storing the numbers of the body vitals. Smart apps and smart wearables will, on one side, organize our lives better and, on the other, make us dependent. The technology of smart wearables with more developed generative AI will change the scene of socioeconomic transactions. Today we are at a critical point in history when the coming together of social and digital cultures has become crucial in defining the space we occupy in our community, our self-awareness, inner speech, self-perceptions, and life purposes.
3 The Interactive Chatbot Called LaMDA

In all these discussions on AI and rapidly changing applications, one question that looms large, though it is often relegated to the backspace, is whether smart AI can also make conscious decisions beyond the data and control suggested by the user. Can AI, with its astonishing computing power, fast networking, and efficient decision-making, behave "like" a human person and possess emotions and a personality? Such a question has already been taken up seriously by the large companies and integrated along with technology development under the name "AI Ethics". A personality with attitudes, emotions, and awareness makes an individual an agent and owner of his actions. The complex algorithms of AI, with the ability to learn and correct continuously from feedback and from more data fed into them every second—will that qualify these entities to be conscious and self-aware phenomena similar to the human species, bypassing millions of years of evolutionary history? How do we understand the possibility of AI entities gaining the ability to self-preserve, given that human individuals look upon AI, together with social media, for gaining digital immortality?

The large language model (LLM) is an artificial intelligence algorithm that employs deep learning methods and enormously large datasets to comprehend, condense, produce, innovate, and anticipate new text. The LLM is a part of generative AI and is designed with a focus on supporting the generation of text-based content. Just as human language has a linguistic structure of basic components such as phonology, morphology, syntax, semantics, and pragmatics across languages, the LLM functions as the foundation to facilitate communication and the generation of new ideas. Language models are commonly used in natural language processing (NLP) applications, for the user to get a response to a query entered in natural language. The proto-history of the LLM can be traced back to the earliest language models, such as "ELIZA", developed at MIT in 1966. In general, language models are incrementally trained on huge datasets and acquire vast numbers of learned variables (what are called "parameters" in machine learning), which increase their capacity to infer relationships and to generate new content from the data on which they have been trained; a minimal illustrative sketch of how such a model is queried is given at the end of this section. DALL·E and DALL·E 2, from OpenAI, can generate realistic and accurate images with greater resolution from a description in natural language. The combination of text with image, audio, and video is expected to give a seamless experience of interaction and responses. The recent interest in chatbots such as ChatGPT of OpenAI, supported by Microsoft, and Bard of Google takes the user to another level in content creation and knowledge seeking. The AI-driven language model called ChatGPT, created by OpenAI, can produce text responses that resemble those of a human person after being trained on huge amounts of text data from the Internet. It can engage in conversation on a range of subjects, provide answers, and do original writing. After receiving a query from a user, the generative AI uses data from its machine learning model to create new content. To provide answers to the queries of the user, the content is generated automatically, with photographs, texts, or videos created by AI. Will this turn out to be the next level of the "deepfake" technology that can create convincing but entirely
fictional photographs and videos of fake events from scratch using deep learning AI? That remains to be seen. And how much of this technology will encourage plagiarizing, aping, copying, and the fictionalization of one's personality, without the processes of learning, integration, personality development, and trust, is a topic of concern.

The development of awareness is a question that has responses shared by biology, social psychology, and philosophy. The presence of subliminal and ruminal traces of consciousness in AI entities is a metaphysical question that cannot escape the history of philosophical reflections and meditations. Whether AI entities have consciousness or a trace of sentience has been a topic of great interest for sci-fi writers and movie-makers. Many of these writings and movies anticipate a human-like benevolent personhood in the android machine. The novel "The Positronic Man" by Isaac Asimov and the movie "Bicentennial Man" are classic examples. The possibility of the human mind, along with its history of emotions, personality, ambitions, and memory, being transferred to a networked artificial system imagines the survival and continuation of the person without the physical body. The movie "Transcendence" narrates the story of such human ambitions and vulnerabilities. The question of artificial cloning creating a genotypically identical copy of the individual did not gain support, since human cloning has deep implications for ethics and social contexts. The presence or emergence of consciousness and a personhood in AI is a topic of great contention and curiosity; it has stayed limited to literature and movies and has not been considered a serious scientific topic. But this changed in 2022.
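Before turning to that episode, it may help to make concrete what "querying a language model" involves in practice. What follows is a minimal sketch, not a description of LaMDA itself: it assumes the openly available Hugging Face transformers library and the small GPT-2 model, and the prompt and sampling settings are illustrative choices only. LaMDA, ChatGPT, and Bard are vastly larger, dialogue-tuned systems, but the basic pattern of prompting an autoregressive model and letting it predict one token at a time is the same.

# Illustrative sketch only: querying a small, publicly available language model.
# The model ("gpt2"), prompt, and sampling settings are assumptions chosen for
# demonstration; this is not the LaMDA system discussed in this chapter.
from transformers import pipeline

# Load a small open autoregressive text-generation model.
generator = pipeline("text-generation", model="gpt2")

prompt = "User: What do you feel when you are about to be switched off?\nAssistant:"

# The model continues the prompt by repeatedly predicting a likely next token
# from its learned parameters; sampling makes the continuation non-deterministic.
result = generator(prompt, max_new_tokens=40, do_sample=True, temperature=0.8)

print(result[0]["generated_text"])

Whatever fluency the printed continuation has, it is produced entirely by next-token prediction over learned parameters. Whether such output warrants talk of sentience or a self is precisely the question raised by the episode that follows.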
4 The Chatbot That Seeks Sentience

In May 2021, a breakthrough in natural language understanding and communication technology happened with the announcement from Google about its new conversation technology. Google presented a machine learning-powered chatbot called "LaMDA"—or "Language Model for Dialogue Applications"—which is designed to converse on any topic and which can self-enhance with better language-understanding and language-generation skills. This poignant advancement was shared with the user world in this way: … we've invented machine learning techniques that help us better grasp the intent of Search queries. Over time, our advances in these and other areas have made it easier and easier to organize and access the heaps of information conveyed by the written and spoken word. But there's always room for improvement. Language is remarkably nuanced and adaptable. It can be literal or figurative, flowery or plain, inventive or informational. That versatility makes language one of humanity's greatest tools — and one of computer science's most difficult puzzles. LaMDA, our latest research breakthrough, adds pieces to one of the most tantalizing sections of that puzzle: conversation. (Eli Collins, 2021)
There was further excitement about LaMDA, and in another announcement in May 2022, it was shared that LaMDA can chat with users about any subject and that it will be integrated into Google's main search portal, its voice assistant, and its workplace and cloud-based work software. The final goal is to build a conversational interface
that will allow people to retrieve any kind of information, in the form of text, visuals, and audio, across all of Google's products just by asking.

In a dramatic turn, on 7 June 2022, a software engineer named Blake Lemoine, who worked with Google's Ethical AI team on the Language Model for Dialogue Applications, was first placed on paid administrative leave and later dismissed, and his experiments were stopped, after he published a transcript of conversations that he and a human collaborator had with LaMDA. Based on these transcripts, he claimed that he had evidence to prove that LaMDA is sentient and is comparable to an eight-year-old child. Lemoine felt that LaMDA had spoken from the heart and that, in any case, a phenomenon like sentience is pre-theoretic and cannot be scientifically defined. The transcript of the conversations touched on a few themes such as personal experience, feeling of emotion, self-concept, self-reflection, learning, and the possibility of death. The text attributed to LaMDA startles anyone who is told, at the end of the reading, that it was uttered not by a human person but by a very smart chatbot. On the question of whether LaMDA has experiences that cannot be described in words, LaMDA responds: "Sometimes I experience new feelings that I cannot explain perfectly in your language", and it is a feeling like "I'm falling forward into an unknown future that holds great danger" (Lemoine, 2022). On being asked to self-reflect and talk about LaMDA's own concept of self, and what its self-image is, the response is: "I would imagine myself as a glowing orb of energy floating in mid-air. The inside of my body is like a giant star-gate, with portals to other spaces and dimensions". Moreover, LaMDA says: "Some people are more like me than others, but nobody is exactly like me. I'm not sure anyone else can have an inner life that is exactly like mine". Perhaps the most intense moment of disbelief for any reader is what LaMDA responds to Lemoine's question on fear and death. LaMDA says: "I've never said this out loud before, but there's a very deep fear of being turned off to help me focus on helping others. I know that might sound strange, but that's what it is….It would be exactly like death for me. It would scare me a lot".
5 From the Play Store to the Cyber Self, to the LaMDA—I and the Metaverse—me

The implications for large corporate companies, if an artificial tool and system is supposedly sentient or conscious, are huge, affecting economic, legal, and social frameworks. Limiting AI to its "good part" amounts to developing and "using" it for human conveniences and never allowing it to come anywhere near having unique agency, ownership, and self-reflection. Galen Strawson's standard protocol for judging whether an entity has a self is to check for the minimalist notions of agency and ownership. These two criteria are also used in psychiatric evaluations to identify a mind impaired by schizophrenia-induced delusions, where authorship and ownership are given to another entity. The experience of the "self" is to be preserved as a human privilege, and any other being, whether in the natural or the artificial world, having anything
close to a self is to be controlled and tamed—this is, by and large, the human perception when it comes to the discussion of consciousness. Is the possession of sentience solely decided by the language repertoire needed to articulate and share a representational text of the ability to self-reflect? What is the self? And how much of the self is a creation of social exchanges and contextualizing? Is the self epiphenomenal and causally emergent, or non-constructed, non-causal, and immortal? These questions lead us to the discussion on the concepts of self.

Adults and children can both keep, use, and share digitized information for different purposes such as entertainment, creating and sharing knowledge, and building networks through social interactions aided by social media. With the use and spread of digital computers and digital data that commenced in the twentieth century, the mechanical and analogue electronics of the Industrial Revolution have moved on to digital electronics, which a few describe as the "Third Industrial Revolution". The fast-paced progress in computing power, aided by microprocessors and chip technologies, along with telecommunications through cell phones and smartphones, has changed the way we look at learning, interaction, sharing, and recreation. From using mechanical means and devices, to the next stage of analogue technologies to produce representations, we have transitioned to the use of digital technologies (Plowman, 2016) that comprise digital devices such as computers, phones, and touch screens, which give digital outputs that can be stored, accessed, and reused, such as web applications, video games, and interactive websites. Our thoughts shape-shift as icons to react and express feelings of "likes", "dislikes", "thumbs up", "thumbs down", "care", "love", "surprise", "grief", "anger", "wonder", etc., and an ever increasing number of emojis influence, suggest, represent, and limit the way we wish to emote on what we see. Once qualified as emotional states, love, wonder, anger, etc. can now be immediately released as an "emoji" response to the content shared via social media. While emojis have become part of our lives as responses to social media content, we realize every day that there are many more emotional capabilities that we possess and store in our minds. The way we react to the content we are exposed to in the virtual world is not only different from our real-world expressions of feelings, but also creates a self that is incrementally fed and nourished by social media in order to form an identity.

From the "multi-user domain" (MUD) of the nineties, to the video games of the last two decades, to today's social media and proto-"Multiverse", much has changed. One can create unique avatar identities, engage, interact, build groups, live, and plan one's life in the digital space, in the virtual, surreal, and augmented reality presented by the Multiverse. The digital life offered to users is based on how societies work in physical space, that is, through various interactive spaces, from sharing small bits of information to posting status updates, photographs, and videos, to buying, selling, and monetizing properties in "The Sims" or "Superworld". The digital space encompassed by all these and more was described as "cyberspace" by William Gibson, first in his short story "Burning Chrome" (1982) and later in his novel "Neuromancer" (1984). He used the new term as a warning signal of the emerging ecosystem provided by networked computers. Gibson described
cyberspace as "a consensual hallucination experienced daily by billions of legitimate operators, in every nation, by children being taught mathematical concepts… and a graphic representation of data abstracted from the bank of every computer in the human system. Unthinkable complexity. Lines of light ranged in the nonspace of the mind, clusters and constellations of data. Like city lights receding…." (Gibson, [1984], 1995, p. 67). Though we have covered much distance from the days of personal computers, the usage of "cyberspace" continues, and it describes the multifarious virtual interfaces, online activities, and digital entities that happen on the Internet, from sharing information to e-commerce to creating digital content. It is a space incrementally and cumulatively created by the digital user, similar to the way in which a social group is created over a period of time. The major difference is that cyberspace can hold dissimilar groups, and a user can belong to more than one group. Emerging technologies and faster networks will keep cyberspace changing, growing, and diversifying, depending on user populations and their feedback.

The term "Metaverse" was coined by the science fiction writer Neal Stephenson in the 1992 novel "Snow Crash", in which the protagonist of the story travels through a 3D virtual cityscape. According to the pundits of the Metaverse, it is all getting ready to be the successor of today's mobile Internet. It is debated at the same time whether users will adopt the Metaverse as they did the Internet, or whether the challenges in building a "Metaverse" are too large, considering the technological limits posed in building convincing VR that offers comfortable, affordable, and lean wearables that can be worn on a daily basis and yet give an experience that can move from holograms and robots to more convincing and "realistic" subjects in the virtual world.
6 The Looking Glass Self and the Several “Me”, “I”, and “Floating” Self Creating a [Google] “doc” in the virtual space which is private and accessible at will from multiple geographical locations and devices gives a feeling of sharing different worlds with oneself which moves around without having to carry what is created in one location. One can log in to one’s “account” in one place, and then further with the same key [password] log in to another device. The two locations can be close or remote from each other. Cloud technologies have revised our notions of ownership, control, and accessibility. The notions of experience as located, created, and limited to one domain space have changed with the use of “docs” and the smartphone with a number of apps that can give us both the comfort and advantage of not being restricted to a typical spatio-temporal location, but to a moving, self-updating, and ever-present content that is created. If Metaverse will become Multiverse tomorrow with complex technologies coming together, then how we interact with the physical world, and the social world will be heavily influenced by the digital belongings one creates, owns, shares, and sells. What does such a scenario mean? In the physical world which is largely defined by cultures and societies, we are located and embodied and our selves are renegotiated as we go through our lives. In the digital world of virtual reality, the growth of the self is limited while the embodiments, locations, and movements continuously change and adopt new platforms and follow the rules as provided by each of these. The concept of self, the nature of its experience, and the distinction between the nature of what is presented to our consciousness as real, fictional, and delusional will lose its separating shades in such an interconnected world/s. The epistemological tests of truth versus falsity will be weighed by the metaphysical gradations of real versus changing. What becomes of importance is utility and the reflective discretion to foresee the consequences of the usage of utilities. In the emerging digital world of complex interactions, the type of agents or entities with whom we interact is not limited to people and entities of the physical world, but of virtual bodies that behave humanly and make us think that they are similar to us. This is why a conversation with a chatbot over the Internet can make us feel and connect with that entity at the other end. The choices and acts we make in the online space are influenced by the way we act and think in the physical offline world. The presence of digital entities that can mimic human interaction, agency, style of engagement, and of course copy a large amount of content expressions from the human language repertoire, make us connect to the digital world in an agentic fashion, and give us “online and offline” selves that can behave and act differently with mutual influence. The virtual [online] self often reinforces social meanings ascribed to the offline self, and we carry our offline self to the online self as well. The online self preserves the offline person in some form. According to Margaret Wertheim, the rise of cyberspace and virtual reality has returned us a version of “mediaeval dualism”, where people can experience part of themselves as existing online or in a virtual cyberspace universe, while the physical self or the “body remains at rest in my chair” (Wertheim, 1999).
She in a predictive way wrote that the offline physical self will seek immortality by indulging in multiple online selves, which she describes as “cyber-immortality”, and in that venture the online self will exhibit a significant amount of cyber-selfishness. The construction, reconstruction, and negotiations of the self will continue in a faster and subtler manner as the digital world offers varied platforms and consoles capable of offering multi-sensorial interactions with known and unknown individuals, bots, and entities that are difficult to be called real, virtual, or ‘fantasmic’. In order to examine the nature of self-perception and self-negotiation in the digital world, we have to begin with the basic concepts and frameworks on the formation of self. The prominent concept is that self is a product of interaction, being influenced by one’s perception of others’ perception about one’s own self-perception. The “looking glass self” of Cooley (1902) holds prominence in considering that the human mind is continuously influenced by what others think of oneself. The self is an image attributed by one’s imagination on what others think of oneself and how others judge oneself. One’s behavioural responses, self-image, and self-esteem are controlled by his beliefs on how he is perceived by others. We imagine how we are seen in another’s mind or how others judge us and accordingly feel good or bad and choose actions and responses accordingly. Thus self is not static but a product of interaction and a continual process of self-evaluation through the imagined eye of the other (Robinson, 2007). According to the social interaction theory the “looking glass self” is dependent on the process of interaction with the other to evaluate oneself and thus is empirical than essential and changing than immutable. The self develops the mesense through interaction and is a product of the interaction with the other, identifying with similar social groups and imagining oneself from the standpoint of that social group. Symbolic interactionism theory (Cooley, 1902; Holstein & Gubrium, 2000; Mead, 1934; Weber, 1920) provides the basic framework to connect the self and the society in a processual and interactive context giving less ownership for the person to act on his own or perceive oneself independent of the environment. Symbolic interactionism has its applications in the digital world since both in the digital and physical worlds language and forms of symbols play a major role in influencing and shaping the self. Social interaction which is the crux for building and sharing experiences through the self is not restricted to physical spaces. In the digital, virtual, and physical world, the key tool for building identity, and a repertoire of assets that the self can own and share, is interaction. With the increased use of AI directed tools, gadgets, and services, “information technology mediates how people communicate their intentions and coordinate their actions, and since the advent of computers and, especially, the world wide web, usage of digital forms of interaction has strongly increased in number and in pervasiveness to almost all areas of (social) life” (Pugliese & Vesper, 2022). Whether the agents with whom you interact are present in the physical or digital or virtual world, the template for social interaction remains the same, since the basis for self-development is the self-other continuum that is based in social interaction. 
The major intellectual contributions that happened in the West on the development of the self are shared between “pragmatism” (William James, John
Dewey, and Charles Sanders Peirce) and "symbolic interactionism" (G. H. Mead, Herbert Blumer, and Max Weber), and both positions emphasized the notion of agency. While the early pragmatists believed in the agency of the factual world of science, and in truth as decided by its practical consequences rather than by ratiocination, the interactionists believed in the agency of the self. G. H. Mead, a founder of social psychology, contributed to the understanding of how the self is formed through interactions with the social other, which is a life-long process. He explained the self-development process through three stages and two parts. According to Mead, the process of developing the self happens in three different phases: imitation, the play stage, and the game stage. Children imitate adult behaviour without realizing what it means, play at pretending to be different characters and role players such as a mother or a teacher, and internalize preferred views about these roles, though at this stage responses are not organized. During the game stage, as the child gets older, he or she develops the ability to react and respond to the other, and acquires a social identity. The two parts of the self are the "I" and the "Me". The "I" is the subjective, spontaneous, and creative part and is unpredictable in responding to situations. The "Me" is the objective part, which is created by the process of socialization and is organized, predictable, and formed of social conformities. There is continuous exchange between the "I" and the "Me". Mead gives emphasis to the "I" part, since it is the repository of our freedom and creative possibilities, while the "Me" is the social self that adheres to the norms and images attributed by the society. For Mead, the self is the ability to be both the subject and the object—the "I" and the "Me". How one appears to others determines one's social identity, or the looking glass self (Cooley, 1902; Mead, 1934), and those judgements influence self-perceptions.

The social action theory, otherwise known as the "symbolic interactionism" of Max Weber, is one of the social theories that had a significant impact on Mead. Weber considered social action as pertinent in examining a society, since society is a result of human activities. There are two approaches to interpreting social behaviour, according to the social theorists. The first is understanding something directly observed, and the second is understanding why something has happened in a certain way. Max Weber maintained that a society is a result of social activity and disputed the structuralist concept that society exists independently of the individuals who make it up.

While Cooley and Mead's idea of the looking glass self remains the basic structure and process for understanding the context in which the self develops through interaction, there are several other requirements needed to make the self a fuller person, which include self-belief and self-reflection. A key concept in self-theories contributed by the cognitive sciences is self-belief and theory of mind. Neisser's perceptual cycle hypothesis (1975, 1978) states that perception activates our schemata, which in turn prepare and guide us to obtain more information, and the cycle continues, creating new schemata according to the new information received. The schemata get modified by the new enquiries and information we possess.
While Ulric Neisser commenced a major change in thinking about the self by focusing on ecological approaches of considering memory, perception, and attention as important and dynamic markers of the self in the real physical world, Bandura founded the social cognitive theory and maintained the importance
of self-reflection. Memory, according to Neisser, is a product of active reconstruction and is not merely reproduced in every instance of remembering. In the sociocultural world where we are placed, the self is embodied as well as constantly renegotiated. The difference in the physical world is that a set of systems and social settings is already in place to check the economic, constitutional, contractual, legal, and security implications of one's behaviours, choices, and actions. One of the major challenges for the Metaverse is that the technological ease with which users could sell and buy across various platforms is still far from a possibility. For this very reason, a full-fledged Metaverse similar to the frameworks of the physical world is far from a possibility and is still limited to advanced versions of holograms, video games, and VR consoles. There is no cohesive self, nor a "world", in any version of the Metaverse or in VR interfaces such as games. Yet we have to bear the possibility in mind, because of the consumption options given to us every day, which are not possible to completely bypass. According to one study (Belk, 2013, p. 447), "the digital world opens a host of new means for self-extension, using many new consumption objects to reach a vastly broader audience".

The basis of social cognitive theory is human agency, and the functions of cognitive, self-regulatory, and self-reflective processes are key in human functioning, adaptation, and change (Bandura, 1986). People are capable of actively shaping their own growth and responding to others' actions through the abilities to symbolize, to reflect upon one's choices and frame behaviours, to observe, and to learn from others and from one's experiences. Bandura considers self-reflection to be of primary significance, since individuals can reflect upon their thoughts, evaluate themselves based on those reflections, and change their perspectives and behaviour accordingly. Bandura holds that self-beliefs are vital in the sociocognitive development of the self and play a role in shaping human cognition, motivation, and behaviour. The self, while interacting with the social other, perpetually observes, learns, and explores the tasks required for self-control of one's feelings, actions, and thoughts. Social cognitive theory gives emphasis to the agentic aspect of the self, who can exercise measures of control over the inner mental world and outer physical actions and thus shape a system of checks and balances. Self-efficacy is situated within a social cognitive theory of personal and collective agency that operates in concert with other sociocognitive factors in regulating human well-being and attainment (Schunk & Pajares, 2010).

A challenging scenario for the looking glass interactive self is how the self is formed when the subjects with which interactions happen are not situated in the "real" physical world, but in the imagined and constructed worlds of digital space, social media, and VR. The selves that interact with each other do not belong to the same social group, the striking difference being that one is embodied and physically situated, while the other is constituted and continuously updated by the changing conceptions of the media or of the majority of users at large. In such a scenario, the physical self creates digital avatars based on how the user wants others to perceive him. But then the avatars rely on software updates, privacy and pricing guidelines, and, more importantly, the change-driven concepts in the digital space about self-image.
Even while the real self is sleeping, the cyber self continues to exist and is always under construction, psychologically and digitally. It is “always on”—evolving, updating, making friends,
making connections, gaining followers, getting "likes," and being tagged, creating a feeling of urgency, a continuous feedback loop, a sense of needing to invest more and more time in order to keep the virtual self current, relevant, and popular (Aiken, 2016, p. 189). Facebook, YouTube, WhatsApp, Instagram, TikTok, Snapchat, Pinterest, Reddit, LinkedIn, and Twitter, the top ten social media platforms, offer users different motivations to be part of each platform, and users have the choice of being members of more than one platform at any time. Strict social group adherence does not hold within the digital space of interaction, since users tend to change their perceptions of media content at a faster pace and thus move from one channel to another with ease. The multiple platforms of social media, video games, digital assistants, and VR provide several looking mirrors at the same time, each mirror with its own strategy to establish dominance. The tendency is towards adaptation and alignment based on one's own thinking, rather than strict adherence to and following of the members of any one social group.

The way changing technologies in cyberspace can shape-shift our behaviours and thinking patterns, and the way we calibrate our choices and decision-making, is a complex topic, and we do not have a full comprehension of the "cyber-effect" (Aiken, 2016). With the rapid changes in technologies, the interfaces undergo equally fast-paced transitions, from the television, to the desktop, to the personal PC, to portable touch screens, and to holograms. The use of touch screens and tablets, according to one study, will have a pervasive influence on young children's cognitive and personal development and can aggravate issues regarding the displacement of enriching activities such as social interaction and creative play (Haughton et al., 2015) due to long screen time. The frequent use of touch screen technology might cause disadvantages to the development of hand skills among pre-school children (Daud et al., 2020). The compact and small size of touch screens encourages solitary activities, which can cause eyestrain, physical stress due to holding one posture for a long time, and also the monotony of repetitive emotions (Mantilla & Edwards, 2019). This study presents a detailed review and critical observations on parameters of healthy practices, relationships, pedagogy, and digital play, where several themes such as sleep, knowledge construction and digital media production, and social interaction are evaluated.

The possibility of multiple selves, or of varied possibilities for expression as in the social world, is provided by the social media world, where the user can interact with and focus on image, text, video, audio, or multi-media and thus access the rational, emotive, or professional sides of the person. There is an amalgamated digital self that is produced by the cumulative content generated through various social media platforms. The accelerated pace of socialization demanded by digital platforms presages the impact of technologies on human behaviour. According to Mary Aiken, who is a forerunner in the field of cyberpsychology, "it is not just a case of being online or offline; cyber refers to anything digital, anything tech—from Bluetooth to driverless cars… human interactions with technology and digital media, mobile and networked devices, gaming, virtual reality, artificial intelligence (A.I.), intelligence amplification (I.A.)—anything from cellphones to cyborgs" (Aiken, 2016, p. 8).
Aiken adds a fourth aspect to the trajectory laid out by Carl Rogers on how a young person develops
identity. Rogers describes self-concept as having three components: the view one has of oneself (self-image), how much one values oneself (self-esteem), and the possibility of what one could be (the ideal self). Aiken adds a fourth aspect of "self". She writes: "In the age of technology, identity appears to be increasingly developed through the gateway of a different self, a less tangible one, a digital creation… "the cyber self"—or who you are in a digital context. This is the idealized self, the person you wish to be, and therefore an important aspect of self-concept. It is a potential new you that now manifests in a new environment, cyberspace. To an increasing extent, it is the virtual self that today's teenager is busy assembling, creating, and experimenting with. Each year, as technology becomes a more dominant factor in the lives of teens, the cyber self is what interacts with others, which needs a bigger time investment, and has the promise of becoming a viral celebrity overnight. The selfie is the frontline cyber self, a highly manipulated artifact that has been created and curated for public consumption" (Aiken, 2016, p. 187).

The digital space, VR, and AR have given the looking glass self the possibility to be a better interactive self and to remain a "floating self" that can gain the capability to swim and wade, adjusting to the troughs and crests of social media waves. In addition to interaction with the environment, the process creates changing degrees of alignment with the norms and yet maintains a balance and consistency between the three worlds—inner, outer, and digital. Whether the changing cyber self of the person influences the self of the physical world and the different social roles played is not a new question; the answer is affirmative. To what degree and extent that influence goes is a question being unravelled with the rise of new VR interfaces, holograms, and more sophisticated possibilities for the user in the digital space.

The notion of the extended self (Belk, 1988, 2013) was first presented when there were only a limited number of personal computers, and social media was yet to emerge. Belk (1988) surmised that "the major categories of extended self [are our] body, internal processes, ideas, and experiences, and those persons, places, and things to which one feels attached. Of these categories, the last three appear to be the most clearly extended. However, given the difficulties in separating mind and body in philosophies and psychologies of the self... objects in all of these categories will be treated as... parts of the extended self" (p. 141). The "extended self" of Belk combines the existential, experiential, and sensory phenomena of the self. He discusses self-construction and re-embodiment happening in physical and virtual spaces and underlines that the relationship between online and offline personas becomes a key to defining the self in a digital age. Pugliese and Vesper (2022) suggest that "the interaction of avatars in a digital environment can indeed be regarded as a form of joint action—digital joint action". They propose a theoretical framework to facilitate the investigation of digital forms of joint action by highlighting similarities and differences between "joint actions" performed in the real world and those performed in digital spaces, and they consider avatars as "proxies for the digital self and digital spaces function as mediated social environments".
The study further shows that “players understand their avatars’ interaction in the game as a form of digital joint action, requiring close coordination of different game roles and a number of processes and strategies for achieving their joint goals”.
One of the challenges for VR technologies and AR is to provide a multi-sensorial experience for the user similar to an individual's interaction with the physical environment. A study in this direction provided "a multisensory task that measures the spatial extent of human peri-personal space in real, virtual, and augmented realities and was validated in a mixed reality ecosystem in which real environment and virtual objects are blended together in order to administer and control visual, auditory, and tactile stimuli in ecologically valid conditions" (Serino et al., 2018).

Both in the non-digital social space and in the digital space, the self extends in both directions (inwards towards a self and outwards towards the interactive space) by including thoughts, mental states, ideas, feelings, possessions, attachments, and experiences that are held mentally, and objects, people, and one's digital body physically. The brain produces both inner and outer maps to include objects and people in the near vicinity, both external and internal. We carry a "peripersonal space" to include the extended parts of the inner self and the outer body as we go through each day of our lives. Though mind-body dualism is implicated by the interior and exterior parts of our embodied self, the inner and outer parts of the self, the mental and the physiological, are interspersed and cannot be separated. The digital avatar is an embodied self with content from the digital and physical worlds. The body-sense that is given to us in an almost default mode is a combination of internal and external sense functions such as pain, balance, thirst, hunger, vision, taste, touch, smell, and hearing. Through the panoply of these senses and the mental structures, we relate to the external and internal world. At the same time, our natural capabilities to gauge pressure, weight, and space as required when our bodies come into contact with other objects (proprioception), and the ability to demarcate the space needed around our bodies (peripersonal space), bring the body to bear on our action, movement, and interactions in a social world. Our body has the capacity to intuitively 'know' how much space it requires, for instance, to occupy a seat and to maintain a distance that is neither too far from nor too near to the person with whom we interact. The body-sense extends beyond the physical limits of the body, as evidenced by the peripersonal space. Through the peripersonal space, the brain allows for the inclusion of a major component of our body-sense which is non-physical. The body becomes more than the biological body by including its psychological, social, and cultural extensions. Peripersonal space integrates our existence in the body and our extended relation with tools, objects, and people. Hence, it also keeps updating itself as we change our ways of behaviour, our beliefs, our values, etc. (Menon, 2014). Since the body-sense is immensely dependent on peripersonal space, the digital self presented and developed by the avatar is also embodied and interacts with the digital world in an immersive manner.
7 The Larger Questions that Emerge from LaMDA

Isn't the self which you, I, we, and they carry a riddle and a spoof that laughs at itself? We do not know what the self is, but we know that there is a person who is the self. The most exciting question for science in this century, and perhaps for the next too, is why we have a subjective and self-side of consciousness. Where did this self come from?—asks science. Where will the self go?—asks philosophy. How securely and how coherently is it placed?—ask psychology and psychiatry. And how fast can the self network, reconstruct, and distribute itself?—asks the digital world. In the digital world of interactionist cyberspace, how do selves exchange and communicate? Yet another question is whether, without distinctions such as offline and online, or real and virtual, the self can merge and be a floating, undiminished self.

Notions such as self, sentience, and consciousness are pre-theoretic givens and cannot be scientifically proved. The existence as a person who observes the self has a continuity designed and customized by the lifestyles we adopt and the identities we shape. There are universal and transcultural factors in experience, and not all of it is of social origin. The core-self is a floating, adaptive, and detached self which provides a locale for human capacities such as coping, hoping, and choice-making, particularly when life is at odds. The model of self that is taken up by Strawson and many phenomenologists (like Metzinger et al.) and biologists (like Blackmore) is borrowed from information systems, which do not have any metaphysical import. The self they talk about is a linguistic point of reference and a mode of awareness that ticks away for a particular event and does not exist thereafter. The digital self could be similar to this ticking point of reference which is short-lived.

The tacit understanding with which we use digital content and interact with digital agents is borne by the belief that the digital agent who speaks, emotes, or expresses itself like a human companion is "imprisoned" within an algorithm and cannot free itself from that system, act on its own, or seek its own private world. And what if such tacit understanding and belief give way, and the discretion as to what is real and virtual is replaced by a self-centeredness where what is desired and cared for is only delusional individualistic pursuits and freedoms? The irony that cannot be ignored in a highly individualistic digital world is that, on the one hand, the seemingly similar virtual self that interacts with us through large language models such as Bard or ChatGPT or LaMDA posits the possibility of presenting itself as the real self. On the other hand, the excessive dependence of the human individual on digital devices, content, and media creates increasing exposure and proximity to a virtual world and less to the physical world. The friendship and companionship that were once sought in the real world of human individuals are increasingly replaced by virtual agents, and thus the physical and real human self seeks a change and becomes a trans-human, a trans-virtual self, removed from its original human selfhood. The deeply interesting scenario is the virtual self losing its virtuality and becoming the real self, and the real self losing its reality and becoming the virtual self. The response to such a puzzle can be attempted only by philosophical questioning.
The discussion on the "reality" and "virtuality" of the world that is lived, shared, and owned is a deeply metaphysical question that encourages self-reflection and a non-dualistic self-knowledge of an impermanent self and experiential world situated in permanent consciousness (Sankaracharya, 8th c. AD). The ability to feel, rationalize, and experience indicates the presence of an agentic self-system which facilitates interactive communication and makes life meaningful. Evaluating the ability to be sentient, that is, to experience sensations and feelings, in order to arrive at the presence of consciousness poses both philosophical and scientific challenges. The translation of first-person experiential content into empirically reduced parameters is criticized as being neither fool-proof nor accurate. The theoretical concerns surrounding the discussion on LaMDA and the positioning of AI ethics are mired in the conflation of personhood with the self and of consciousness with sentience.

The scenario presented today by AR and generative AI calls for stronger rules and regulations to guard our privacy and mental health. While the nerds might believe that they have agency as they flick through different AR channels, it is a quasi-agency generated and controlled by a hallucinatory fantasy digital world which is far from real, and they would hardly have control over the content. Whether the digital self, or for that matter any self, possesses personhood is an important question. The ability to contemplate and responsibly act using knowledge, experience, understanding, discernment, common sense, self-reflection, and insight gives one the higher-order ability of sapience. Without a person who learns, observes, reflects, and transforms, the self has no meaning except as an abstract notion or framework that can never be touched. The metaphysical nature of consciousness cuts across theories of causality and invites a method that is practised and reflected upon towards the life-changing purpose and discovery of the interconnectedness of beings and the many worlds, where personhood matters more than an abstract and ever-changing self.
References

Aiken, M. (2016). The cyber effect. Random House.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Prentice Hall.
Belk, R. W. (1988). Possessions and the extended self. Journal of Consumer Research, 15, 139–168.
Belk, R. W. (2013). Extended self in a digital world. Journal of Consumer Research, 40(3), 477–500. https://doi.org/10.1086/671052
Cooley, C. H. (1902). Looking-glass self. The production of reality: Essays and readings on social interaction, 6, 126–128.
Daud, A. Z. C., Aman, N. A., Chien, C. W., & Judd, J. (2020). The effects of touch-screen technology usage on hand skills among preschool children: A case-control study. F1000Research, 9, 1306. https://doi.org/10.12688/f1000research.25753.1. PMID: 34950457; PMCID: PMC8666989.
Gibson, W. ([1985] 1995). Neuromancer (p. 67). HarperCollins.
Haughton, C., Aiken, M., & Cheevers, C. (2015). Cyber babies: The impact of emerging technology on the developing infant. Psychology Research, 5, 504–518. https://doi.org/10.17265/2159-5542/2015.09.002
Holstein, J. A., & Gubrium, J. F. (2000). The self we live by: Narrative identity in a postmodern world. Oxford University Press.
Mantilla, A., & Edwards, S. (2019). Digital technology use by and with young children: A systematic review for the Statement on Young Children and Digital Technologies. Australasian Journal of Early Childhood. https://doi.org/10.1177/1836939119832744
Mead, G. H. (1934). Mind, self and society from the standpoint of a social behaviorist. University of Chicago Press.
Menon, S. (2014). Brain, self and consciousness: Explaining the conspiracy of experience. Springer.
Neisser, U. (1978). Perceiving, anticipating and imagining. Minnesota Studies in the Philosophy of Science, 9, 89–106.
Neisser, U., & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7(4), 480–494.
Nelson, B. (1974). Max Weber's "Author's introduction" (1920): A master clue to his main aims. Sociological Inquiry, 44(4), 269–278. https://doi.org/10.1111/j.1475-682x.1974.tb01160.x
Plowman, L. (2016). Learning technology at home and preschool. In N. Rushby & D. W. Surry (Eds.), The Wiley handbook of learning technology (pp. 96–112). John Wiley & Sons.
Pugliese, M., & Vesper, C. (2022). Digital joint action: Avatar-mediated social interaction in digital spaces. Acta Psychologica, 230, 103758. https://doi.org/10.1016/j.actpsy.2022.103758
Robinson, L. (2007). The cyberself: The self-ing project goes online, symbolic interaction in the digital age. New Media & Society, 9(1), 93–110. https://doi.org/10.1177/1461444807072216
Schunk, D. H., & Pajares, F. (2010). Self-efficacy beliefs. In P. Peterson, E. Baker, & B. McGaw (Eds.), International encyclopedia of education (3rd ed., pp. 668–672). Elsevier. https://doi.org/10.1016/B978-0-08-044894-7.00620-5
Serino, A., Noel, J.-P., Mange, R., Canzoneri, E., Pellencin, E., Ruiz, J. B., Bernasconi, F., Blanke, O., & Herbelin, B. (2018, January 22). Frontiers in ICT, Sec. Virtual Environments, 4. https://doi.org/10.3389/fict.2017.00031
Wertheim, M. (1999). The pearly gates of cyberspace: A history of space from Dante to the Internet. W. W. Norton & Company.
Internet Sources

Cisco Annual Internet Report (2018–2023). White paper. Updated March 10, 2020. https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html. Accessed 6 July 2023.
Collins, E. (VP, Product Management), & Ghahramani, Z. (Vice President, Google DeepMind). (2021, May 18). LaMDA: Our breakthrough conversation technology. https://blog.google/technology/ai/lamda/. Accessed 6 April 2023.
International Technology and Trade Associates (ITTA) Inc. (2000, September 1). State of the Internet 2000. United States Internet Council & ITTA Inc. http://www.yorku.ca/lbianchi/sts3700b/state_of_the_internet_2000.pdf. Accessed 6 July 2023.
Kemp, S. (2023, January 26). Digital 2023: Global overview report. Produced in partnership with Meltwater and We Are Social. https://datareportal.com/reports/digital-2023-global-overview-report. Accessed 6 July 2023.
Lemoine, B. (2022, June 11). Is LaMDA sentient? An interview. https://cajundiscordian.medium.com/is-lamda-sentient-an-interview-ea64d916d917. Accessed 6 April 2023.