English Pages 1920 [1921] Year 2023
Lorenzo Magnani Editor
Handbook of Abductive Cognition
With 186 Figures and 54 Tables
Editor: Lorenzo Magnani, Department of Humanities, University of Pavia, Pavia, Italy
ISBN 978-3-031-10134-2
ISBN 978-3-031-10135-9 (eBook)
https://doi.org/10.1007/978-3-031-10135-9
© Springer Nature Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This handbook explores abduction (inference to hypotheses), an important but, at least until 2000, neglected topic in philosophy, logic, and epistemology. In my first book about abduction (Abduction, Reason, and Science. Processes of Discovery and Explanation, 2001), my aim was to integrate philosophical, cognitive, and computational issues, while also discussing some cases of reasoning in science and medicine. The main thesis was that abduction is a significant kind of scientific reasoning, helpful in delineating the first principles of a new theory of science. In the preface of that book, I said that the status of abduction is very controversial. When dealing with abductive reasoning, misinterpretations and equivocations are common. What are the differences between abduction and induction? What are the differences between abduction and the well-known hypothetico-deductive method? What did Peirce mean when he considered abduction a kind of inference? Does abduction involve only the generation of hypotheses or their evaluation too? Are the criteria for the best explanation in abductive reasoning epistemic, pragmatic, or both? How many kinds of abduction are there? In the last decades, abduction has been extensively studied in logic, semiotics, philosophy of science, computer science, artificial intelligence, and cognitive science. The interest in abduction derived largely from the neglect of the logic of discovery by the neopositivists and also by the so-called postpositivists, for example Popper and Kuhn. Research on abduction immediately acquired a strong interdisciplinary character, and this handbook respects this feature: I invite the reader to consider the various parts, each related to a different discipline, in their intertwining with abductive cognition.
Indeed, the debate about abduction, a concept introduced by Aristotle and revivified by Charles Sanders Peirce that can be rudimentarily classified as “reasoning to hypotheses,” has crossed philosophy over the last decades, ranging from the most speculative to the most pragmatic and cognitive outlooks, from logical and computational models to the role played in plenty of disciplines. Recently, novel and rich epistemological and logical perspectives, together with wide philosophical and cognitive insights, have sparked new interdisciplinary studies about how abductive cognition works and is used. The relevance of the discourse about abduction has also transcended the boundaries of philosophy, as it was immensely boosted by the progress made in computation since the 1960s, making the topic of abduction relevant not only to scientists and philosophers, but also to computer
scientists, programmers, and logicians. After all, the first expert systems created in the field of artificial intelligence were devoted to diagnostic reasoning in medicine, that is, to an exemplary case of abductive cognition, engaged in intelligently selecting the correct hypothesis/diagnosis in an already available encyclopedia: in this case, we are dealing with what I called “selective abduction.” Other related fields of study, strictly connected to abduction, were the studies of inferential processes that would go beyond traditional logic and yet play a crucial role in enhancing knowledge about abduction: a lot of non-classical logics were promoted by the need to formally clarify abductive reasoning. To provide an initial definition, we can agree that abduction is something we use in order to gain some benefit in the understanding or explanation of something else, which can be called the starting data. An abduction lets us explain some data, and consequently behave in a way that would not be possible without it. Creativity is surely entangled with abduction: every time humans or computational machines create a new interesting hypothesis, we face a high-level case of what I called “creative abduction.” This definition of abduction should also make it easy to appreciate how situations that we face every day are tackled by making use of abductive cognition: to deal with other people, we make abductions regarding their minds and their intentions; to operate machinery, we make hypotheses about its functioning; in the remote case of trying to escape from wild animals, we make hypotheses about their hunting strategies and perceptual systems; to explore novel environments, we make hypotheses regarding their spatial configurations; to mention only a few. We make use of abductive hypotheses in a wide array of circumstances, but what all abductions actually share is a dimension of practical usefulness.
We create abductive hypotheses (imagine the case of scientific discovery) or make use of abductive hypotheses that have already been generated by other people or machines and are considered reliable. Moreover, abductions can display a “manipulative” nature, since in many cases they are built thanks to an interplay with external, material supports (i.e., by means of artifacts, paper sheets, sound waves, body gestures). However, they can also be built thanks to merely “internal” resources: in the case of mental guesses, they are produced thanks to brain wirings by synapses and chemicals (a mental hypothesis, for instance, is a powerful mental construction activity that can take advantage both of sentential aspects, that is, internalized propositions, and of internalized non-sentential models, visualization, thought experiments, etc.). I myself contributed to the study of abduction with various books,1 which certainly cover a lot of aspects of this interesting topic. However, recent research on abduction has become wide and rich and concerns issues I did not touch: in the past few years, I had been thinking about organizing a big international conference to invite all the researchers on the planet who have contributed to the research on abduction, and to publish informed and variegated proceedings. Already being engaged in the organization of the Ninth Conference on Model-Based Reasoning (MBR023_ROME),2 in December of 2020 I opted for the idea of promoting a huge Handbook of Abductive Cognition, which Springer accepted with enthusiasm. My purpose was to establish an initial remarkable groundswell around a concept that only 20–30 years ago was almost completely unknown to the vast majority of researchers in the fields of philosophy, logic, and computer science, in the broadest sense of this last expression. In order to grasp to the fullest the rich universe of abductive cognition, its relevance in science, and also its role as the general cognitive architecture of certain important kinds of reasoning, I divided the handbook into 14 parts. The first 3 parts can be seen as the ABC of the discourse, providing a philosophical, logical, and theoretical alphabet for the understanding of abduction, while the remaining 11 parts each deal with precise, applied fields in which abduction plays a central role, from medicine and mathematics to architecture and design, from artificial intelligence and visual/diagrammatic cognition to neuroscience, economics, education, and human sciences. A specific chapter deals with a very traditional and central issue, that is, abductive creativity, and another one addresses the recent and challenging study of the role of abduction in adversarial reasoning, in which the interplay between violence, deception, detection, cunning, and ruse in social interactions is at stake.
1 L. Magnani (2022), Discoverability. The Urgent Need of an Ecology of Human Creativity, Springer, Cham, Switzerland; L. Magnani (2017), The Abductive Structure of Scientific Creativity. An Essay on the Ecology of Cognition, Springer, Cham, Switzerland; L. Magnani (2009), Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning, Springer, Berlin/Heidelberg; L. Magnani (2001), Abduction, Reason, and Science. Processes of Discovery and Explanation, Kluwer Academic/Plenum Publishers, New York. (Chinese edition: [意] 洛伦佐·玛格纳尼 / 著; 李大超, 任远 / 译, 《溯因、理由与科学——发现和解释的过程》, 中国广州: 广东人民出版社 2006 年. Translated by Dachao Li and Yuan Ren, Guangdong People’s Publishing House, Guangzhou, 2006).
2 Model-Based Reasoning, Abductive Cognition, Creativity. Inferences & Models in Science, Logic, Language, and Technology, chaired by me, E. Ippoliti, and S. Arfini.
I have myself edited a Miscellaneous part that presents a variety of important studies that, as the Editor-in-Chief, I was not able to insert into the other parts but that have to be offered to the reader: from the role of abduction in Galileo Galilei and Plato and its interplay with phylogenetic inference to its function in psychology and neuroscientific areas, from the problem of abduction in Bayesianism to the importance of the well-known “inference to the best explanation.” In summary, we can say that knowledge about abductive cognition has increased year after year: I think this handbook testifies that abduction has acquired a central status in various disciplines, surely in philosophy, logic, epistemology, and cognitive science, but has also demonstrated its capacity to foster new intelligibility of important issues in many other fields of current research, not only of a scientific character. I do not wish to dwell too long on the description of the content of the chapters, which the reader can easily find in the introduction to each part. Below I provide a list of the various parts together with the editors who shared with me the difficult task of completing the handbook:
1. Philosophy and Abduction, Section Editor: Sami Paavola
2. Theoretical and Cognitive Issues on Abduction and Scientific Inference, Section Editor: Woosuk Park
3. Logics of Hypothetical Reasoning, Abduction, and Evidence, Section Editor: Atocha Aliseda
4. Abduction and Medicine: Diagnosis, Treatment, and Prevention, Section Editors: Daniele Chiffi and Mattia Andreoletti
5. Abduction in Mathematics, Section Editor: Ferdinand D. Rivera
6. Diagrams, Visual Models, and Abduction, Section Editors: Gianluca Caterina and Rocco Gangle
7. Abduction and Computation, Section Editor: Akinori Abe
8. Abduction and Economics, Section Editor: Fernando Tohmé
9. Abduction in Education and Human Sciences, Section Editor: Alger Sans Pinillos
10. Abduction, Creative Cognition, and Discovery, Section Editor: Selene Arfini
11. Abduction and Technological Design, Section Editor: Ehud Kroll
12. Adversarial Abduction, Section Editor: Samuel Forsythe
13. Abduction and Cognitive Neuroscience, Section Editor: Gustavo Cevolani
14. Miscellaneous, Section Editor: Lorenzo Magnani
I prefer to think of this handbook’s extensive theoretical scope as having a parallel in the editorial process. In fact, a lot of decision-making ensued when I decided to serve as general editor of the Handbook of Abductive Cognition. To be able to make a decision, I had to think about what editing a handbook about the special topic of abduction was like. In other words, in order to decide I had to know better, and in order to know better I had to make myself a source of abductions regarding handbook editing. Part of this intricate abductive behavior was in my thoughts, while the other parts were in sketches and emails. It was somewhat deductively inferred from evidence (other handbooks), and it was partially the result of abductive assumptions. Once the hypotheses about the handbook had sufficiently stabilized, providing me with a good forecast of the primary criticalities and some (wishful) scheduling, I began the project, and the handbook hypotheses, which were continually updated as the work progressed, would direct my behavior step by step.
I should also mention that I took on the editing of this handbook because, up until now, the literature on abduction, on the inferential and logical processes that underlie it, and on its philosophy has been widely dispersed in more or less recent (and variably authoritative) collections and monographs, journal articles, and conference proceedings. The purpose of this handbook is also to provide readers with the opportunity to access the foundational information and the state of the art of these studies in a single, trustworthy source written by a group of well-known experts. This handbook is the exemplary fruit of research and collaboration. As Editor-in-Chief, I was able to rely on the formidable team of editors I mentioned above, who took the reins of their parts: Sami Paavola, Woosuk Park, Atocha Aliseda, Daniele Chiffi, Mattia Andreoletti, Ferdinand D. Rivera, Gianluca Caterina, Rocco Gangle, Akinori Abe, Fernando Tohmé, Alger Sans Pinillos, Selene Arfini, Ehud Kroll, Samuel Forsythe, Gustavo Cevolani. They are all outstanding and diligent academics and researchers, and I am really appreciative of them for taking the initiative to get in touch with authors, encourage submissions, and review them
while staying in touch with me. The Part Editors could also count on some of the most well-known and brightest academics in each discipline and field; they are too many to mention here, but my undying recognition and gratitude go to them as well. Along with my acknowledgment, all of the editors and authors certainly have my congratulations and admiration upon the completion of this work. Many of the editors and contributors were already part of the ever-growing MBR (model-based reasoning) community, an enthusiastic collective of philosophers, epistemologists, logicians, cognitive scientists, computer scientists, engineers, and other academics working on the different and multidisciplinary aspects of what is known as “model-based reasoning,” and already focusing on hypothetical-abductive reasoning and its role in scientific rationality. Together with Tommaso Bertolotti I edited a huge Handbook of Model-Based Science, published in 2017 by Springer. The outreach of the present Handbook of Abductive Cognition goes far beyond the theoretical and personal borders of the MBR community, but it can nevertheless be saluted as a further celebration of the 25 years of work and exchange since the first MBR conference was held in Pavia, Italy, in 1998. For me, this handbook also serves as a tribute to the brilliant researchers who joined me or engaged with the MBR community and who have since passed away, but whose contributions will always be recognized and valued. Last but clearly not least, I am most grateful to Springer’s editorial and production team for their constant trust, encouragement, and support. In particular, I wish to thank Lydia Mueller and Leontina Di Cecco, Springer publishing editors, Salmanul Faris Nedum Palli, Project Coordinator, and Jawahar Babu, Project Manager, as their resilient help and collaboration made a difference in achieving this handbook.
Finally, I hope that the Handbook of Abductive Cognition will serve as a valuable resource to draw new scholars and stimulate decades of lively advancement in this exciting multidisciplinary field, beyond its instructive value for our community. The huge successes that have occurred over the past 50 years are excellently brought together in this handbook’s contents in a highly beneficial way. The information and knowledge contained in this handbook will undoubtedly be a helpful resource and guide for those who will carry out the subsequent, even more capable and varied generations of study on abduction.
Pavia, Italy
March 2023
Lorenzo Magnani
Contents

Volume 1

Part I Philosophy and Abduction
1 Introduction to Philosophy and Abduction
  Sami Paavola
2 Peirce’s Abduction
  Francesco Bellucci and Ahti-Veikko Pietarinen
3 Abduction and Semiosis
  Mariana Vitti Rodrigues
4 Abduction as a Logic of Discovery
  Sami Paavola
5 Abduction and Perception in Peirce’s Account of Knowledge
  Aaron Bruce Wilson

Part II Theoretical and Cognitive Issues on Abduction and Scientific Inference
6 Introduction to Theoretical and Cognitive Issues on Abduction and Scientific Inference
  Woosuk Park
7 Discoverability in the Perspective of the EC-Model of Abduction
  Lorenzo Magnani
8 How Abduction Fares in Mathematical Space
  John Woods
9 The Logical Process and Validity of Abductive Inferences
  Gerhard Minnameier
10 Theory-Generating Abduction and Its Justification
  Gerhard Schurz
11 Abduction and Truth
  Ilkka Niiniluoto
12 Imagination, Cognition, and Methods of Science in Peircean Abduction
  Ahti-Veikko Pietarinen and Francesco Bellucci

Part III Logics of Hypothetical Reasoning, Abduction, and Evidence
13 Introduction to Logics of Hypothetical Reasoning, Abduction, and Evidence
  Atocha Aliseda
14 Abduction from a Dynamic Epistemic Perspective: Non-omniscient Agents and Multiagent Settings
  Angel Nepomuceno-Fernández, Fernando Soler-Toscano, and Fernando R. Velázquez-Quesada
15 Abduction and Dialogues
  Cristina Barés Gómez and Matthieu Fontaine
16 Paraconsistency, Evidence, and Abduction
  A. Rodrigues, M. E. Coniglio, H. Antunes, J. Bueno-Soler, and W. Carnielli
17 Qualitative Inductive Generalization and Confirmation
  Mathieu Beirlaen
18 Modeling Hypothetical Reasoning by Formal Logics
  Tjerk Gauderis

Part IV Abduction and Medicine: Diagnosis, Treatment, and Prevention
19 Introduction to Abduction and Medicine: Diagnosis, Treatment, and Prevention
  Daniele Chiffi and Mattia Andreoletti
20 Abduction in Prognostic Reasoning
  Daniele Chiffi and Mattia Andreoletti
21 Abduction, Clinical Reasoning, and Therapeutic Strategies
  Raffaella Campaner and Fabio Sterpetti
22 Abductive Reasoning in Clinical Diagnostics
  Carlo Martini
23 Medical Reasoning and the GW Model of Abduction
  Cristina Barés Gómez and Matthieu Fontaine

Part V Abduction in Mathematics
24 Introduction to Abduction in Mathematics
  F. D. Rivera
25 The Role of Abduction in Mathematics: Creativity, Contingency, and Constraint
  Elizabeth de Freitas
26 Peirce’s Conception of Mathematics as Creative Experimental Inquiry
  Daniel G. Campos
27 Using Abduction for Characterizing the Process of Discovery
  Michael Meyer
28 Abduction and Creativity in Mathematics
  Paul Ernest
29 Abductive Arguments Supporting Students’ Construction of Proofs
  Bettina Pedemonte

Part VI Diagrams, Visual Models, and Abduction
30 Introduction to Diagrams, Visual Models, and Abduction
  Gianluca Caterina and Rocco Gangle
31 Existential Graphs as a Visual Tool of Abductive Cognition in Intuitionistic Logic and Various Sublogics
  Arnold Oostra
32 Visual Semiotics, Abduction, and the Learning Paradox: The Role of Graphic Signs
  Inna Semetsky
33 On the Missing Diagrams in Category Theory
  Eduardo Ochs
34 Abduction in Diagrammatical Reasoning
  Frederik Stjernfelt
35 Peirce’s Diagrammatic Reasoning and Abduction
  Ahti-Veikko Pietarinen
36 Abduction in Diagrammatic Reasoning: A Categorical Approach
  Gianluca Caterina, Rocco Gangle, and Fernando Tohmé

Part VII Abduction and Computation
37 Introduction to Abduction and Computation
  Akinori Abe
38 Humans Reason Skeptically
  Meghna Bhadra, Islam Hamada, Steffen Hölldobler, and Luís Moniz Pereira
39 Default Negation in Normal Logic Programs Considered as Minimal Abduction of Positive Hypotheses
  Alexandre Miguel Pinto and Luís Moniz Pereira
40 Evaluation of Abductive Hypotheses: A Logical Perspective
  Mariusz Urbański
41 Speculative Computation: Application Scenarios
  João Ramos, Tiago Oliveira, Davide Carneiro, Ken Satoh, and Paulo Novais
42 Abductive Logic Programming and Linear Algebraic Computation
  Tuan Quoc Nguyen, Katsumi Inoue, and Chiaki Sakama
43 Acquisition of Feature Concepts Via Open Abductive Communication with Data Jackets
  Yukio Ohsawa, Teruaki Hayashi, Sae Kondo, and Akinori Abe

Volume 2

Part VIII Abduction and Economics
44 Introduction to Abduction and Economics
  Fernando Tohmé
45 Abduction in Economics: A Philosophical View
  Marcelo Auday, Ricardo Crespo, and Fernando Tohmé
46 Abduction in Econometrics
  Fernando Delbianco and Fernando Tohmé
47 C. S. Peirce’s Conception of Abduction and Economics
  James R. Wible
48 Abduction and Economics
  Ramzi Mabsout

Part IX Abduction in Education and Human Sciences
49 Introduction to Abduction in Education and Human Sciences
  Alger Sans Pinillos
50 Abduction in Earth Science Education
  Phil Seok Oh
51 Abductive Inquiry and Education: Pragmatism Coordinating the Humanities, Human Sciences, and Sciences
  John R. Shook
52 Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching of School Scientific Explanation and Argumentation
  Agustín Adúriz-Bravo and Leonardo González Galli
53 Abductive Irradiation of Cultural Values in Shared Spaces: The Case of Social Education Through Public Libraries
  Alger Sans Pinillos

Part X Abduction, Creative Cognition, and Discovery
54 Introduction to Abduction, Creative Cognition, and Discovery
  Selene Arfini
55 Abduction and Creative Theorizing
  Robert Folger, Christopher Stein, and Nicholas Andriese
56 Creativity and Abduction According to Charles S. Peirce
  Sara Barrena and Jaime Nubiola
57 Surprise as the Dawning of Abductive Rationality: Evidence from Children’s Narratives
  Donna E. West
58 Abduction Beyond Representation
  P. D. Bruza and Andrew Gibson
59 The Foundations of Creativity: Human Inquiry Explained Through the Neuro-Multimodality of Abduction
  Jordi Vallverdú and Alger Sans Pinillos

Part XI Abduction and Technological Design
60 Introduction to Abduction and Technological Design
  Ehud Kroll
61 Abductive Reasoning in Creative Design and Engineering: Crossroads of Data-Driven and Model-Based Engineering
  Pieter Pauwels and Vishal Singh
62 Abduction in the Evaluation of Designs
  Andy Dong
63 Logical Processes Underlying Creative and Innovative Design
  Sharifu Ura
64 Abduction and Design Theory: Disentangling the Two Notions to Unbound Generativity in Science
  Ehud Kroll, Pascal Le Masson, and Benoit Weil

Part XII Adversarial Abduction
65 Introduction to Adversarial Abduction
  Samuel Forsythe
66 The Epistemology of Secrecy: The Roles of Abduction in the Investigation of Deep State Issues
  Frederik Stjernfelt
67 Abductive Ruses: The Role of Conjectures in the Epistemology of Deception from High-Level, Reflective Cases to Low-Level, Perceptual Ones
  Francesco Fanti Rovetta
68 Adversarial Abduction: The Logic of Detection and Deception
  Samuel Forsythe
69 Abduction and Violence
  Lorenzo Magnani

Part XIII Abduction and Cognitive Neuroscience
70 Introduction to Abduction and Cognitive Neuroscience
  Gustavo Cevolani
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
  Davide Coraci, Fabrizio Calzavarini, and Gustavo Cevolani
72 Abduction: Theory and Evidence
  Igor Douven
73 Plausible Reasoning in Neuroscience
  Tommaso Costa, Donato Liloia, Mario Ferraro, and Jordi Manuello

Part XIV Miscellaneous
74 Introduction to Miscellaneous
  Lorenzo Magnani
75 Galilean Methodology and Abductive Inference
  Fabio Minazzi
76 Abduction as Phylogenetic Inference: Epistemological Perspectives in Scientific Practices
  Elizabeth Martínez-Bautista
77 Abductive Research Methods in Psychological Science
  Brian D. Haig
78 Motor Simulation of Facial Expressions, But Not Emotional Mirroring, Depends on Automatic Sensorimotor Abduction
  Valentina Cuccio and Fausto Caruana
79 Abduction and Metaphysics
  Maria Regina Brioschi
80 Deduction–Abduction–Induction Chains in Plato’s Phaedo and Parmenides
  Priyedarshi Jetli
81 Reason-Giving-Based Accounts of Abduction
  Paula Olmos
82 Tracking Abductive Reasoning in the Natural Sciences
  Andrés Rivadulla
83 Inference to the Best Explanation: An Overview
  Frank Cabrera
84 The Limits of Subjectivism: On the Relation Between IBE and (Objective) Bayesianism
  Alexandros Apostolidis and Stathis Psillos

Index
About the Editor
Lorenzo Magnani, philosopher, epistemologist, and cognitive scientist, is a professor of Philosophy of Science at the University of Pavia, Italy, and the director of its Computational Philosophy Laboratory. His primary research interests are in the area of philosophy of science, logic, and artificial intelligence. A major objective of his research has been to create a working synthesis between epistemological/historical perspectives and investigations of representations and reasoning practices carried out in the sciences of cognition. He has devoted various studies to the fields of philosophy of science, logic, and cognitive science, also carrying out specific and deep research on abductive cognition (with four monographs), science, technology, and human values, philosophy of computing, critical thinking, non-standard logics, philosophy of medicine, history and philosophy of geometry, violence, morality, and religion. His historical research centered on nineteenth- and twentieth-century geometry and philosophy of geometry. His previous positions have included: visiting researcher (Carnegie Mellon University, 1992; McGill University, 1992–93; University of Waterloo, 1993; and Georgia Institute of Technology, 1998–99); visiting professor (visiting professor of Philosophy of Science and Theories of Ethics at Georgia Institute of Technology, 1999–2003; Weissman Distinguished Visiting Professor of Special Studies in Philosophy: Philosophy of Science at Baruch College, City University of New York, 2003); and visiting professor at Sun Yat-sen University, Canton (Guangzhou), China, from 2006 to 2012. On the occasion of the 50th anniversary of the re-building of the Philosophy Department
of Sun Yat-sen University in 2010, an award was given to him to acknowledge his contributions to the areas of philosophy, philosophy of science, logic, and cognitive science. A Doctor Honoris Causa degree was awarded to Lorenzo Magnani by the Senate of the Ștefan cel Mare University, Suceava, Romania. In 2015 Lorenzo Magnani was appointed a member of the International Academy for the Philosophy of the Sciences (AIPS). He currently directs international research programs in the EU, USA, and China. He is the author of several books and articles: Non-Euclidean Geometries (in Italian, 1978); Applied Epistemology (in Italian, 1991); Introduction to Computational Philosophy (1997); Textbook of Logic: Classical Logic and Logic of Common Sense (with R. Gennari) (in Italian, 1997, reprinted in 2022); Philosophy and Geometry. Theoretical and Historical Issues (2001); and Abduction, Reason, and Science (New York, 2001) (Chinese edition: translated by Dachao Li and Yuan Ren, Guangdong People’s Publishing House, Guangzhou, 2006 [意] 洛伦佐·玛格纳尼 / 著; 李大超, 任远 / 译, 《溯因、理由与科学——发现和解释的过程》, 中国广州: 广东人民出版社, 2006 年), which became a well-respected work in the field of human cognition. The book Morality in a Technological World (Cambridge, 2007) develops a philosophical and cognitive theory of the relationships between ethics and technology from a naturalistic perspective. The book describes how modern technology has brought about consequences of such magnitude that old policies and ethics can no longer contain them. The book Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning aims at increasing knowledge about creative and expert inferences. The study of these high-level methods of abductive cognition is situated at the crossroads of philosophy, logic, epistemology, artificial intelligence, neuroscience, cognitive psychology, animal cognition, and evolutionary theories; that is, at the heart of cognitive science.
The monograph Understanding Violence. The Intertwining of Morality, Religion, and Violence: A Philosophical Stance was published by Springer in 2009 and 2011. A further monograph, published by Springer in 2017, is The Abductive
Structure of Scientific Creativity. An Essay on the Ecology of Cognition. The recent book Eco-Cognitive Computationalism. Cognitive Domestication of Ignorant Entities (2022), published by Springer, offers an entirely new dynamic perspective on the nature of computation. His most recent book, Discoverability. The Urgent Need of an Ecology of Human Creativity, published by Springer in 2022, shows that discoverability is closely related to the sustainability of human creativity from an “eco-cognitive” perspective. He has written 18 monographs and has edited books in Chinese, 18 special issues of international academic journals, and 19 collective books, some of them deriving from international conferences. He is editor-in-chief (with T. Bertolotti) of the Springer Handbook of Model-Based Science, 2017, and editor-in-chief of the present Handbook of Abductive Cognition, Springer, 2023. In 1998, initially in collaboration with Nancy J. Nersessian and Paul Thagard, he created the MBR Conferences on Model-Based Reasoning, which he has promoted ever since and which reached their ninth edition in 2023. Since 2011 he has been the editor of the book series Studies in Applied Philosophy, Epistemology and Rational Ethics (SAPERE), Springer, Heidelberg/Berlin.
Section Editors
Akinori Abe, Chiba University, Inageku, Chiba, Japan
Alger Sans Pinillos, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
Daniele Chiffi, Department of Architecture and Urban Studies, Politecnico di Milano, Milan, Italy
Ehud Kroll, Department of Mechanical Engineering, Braude College of Engineering, Karmiel, Israel
F. D. Rivera, Department of Math & Statistics, San Jose State University, San Jose, California, USA
Fernando Abel Tohmé, Department of Economics and Mathematics, Universidad Nacional del Sur and CONICET Institute at Bahía Blanca (INMABB), Bahía Blanca, Argentina
Gianluca Caterina, Center for Diagrammatic and Computational Philosophy, Endicott College, Beverly, MA, USA
Gustavo Cevolani, MInD Research Group, MoMiLab Research Unit, IMT School for Advanced Studies Lucca, Lucca, Italy
Lorenzo Magnani, Department of Humanities, University of Pavia, Pavia, Italy
Mattia Andreoletti, Department of Health Science and Technology, ETH Zurich, Switzerland
Prof. Atocha Aliseda, Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México, Mexico City, Mexico
Rocco Gangle, Department of Humanities and Center for Diagrammatic and Computational Philosophy, Endicott College, Beverly, Massachusetts, USA
Samuel Forsythe, Department of International Security, Peace Research Institute Frankfurt, Frankfurt, Germany
Prof. Sami Paavola, Faculty of Educational Sciences, University of Helsinki, Helsinki, Finland
Selene Arfini, Department of Humanities – Philosophy Section, and Computational Philosophy Laboratory, University of Pavia, Pavia, Italy
Woosuk Park, School of Digital Humanities and Computational Social Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Contributors
Akinori Abe, Faculty of Letters, Chiba University, Inageku, Chiba, Japan
Agustín Adúriz-Bravo, CONICET/Instituto de Investigaciones CeFIEC, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
Atocha Aliseda, Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México, Mexico City, Mexico
Mattia Andreoletti, Faculty of Philosophy, Università Vita-Salute San Raffaele, Milan, Italy; Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
Nicholas Andriese, University of Central Florida, Orlando, FL, USA
H. Antunes, Department of Philosophy, Federal University of Bahia, Salvador, Brazil
Alexandros Apostolidis, Department of History & Philosophy of Science, National and Kapodistrian University of Athens, Athens, Greece
Selene Arfini, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
Marcelo Auday, Universidad Nacional del Sur (UNS) & IIESS-CONICET, Bahía Blanca, Argentina
Cristina Barés Gómez, Departamento de Filosofía, Lógica y Filosofía de la Ciencia, Universidad de Sevilla, Seville, Spain
Sara Barrena, Departamento de Filosofía, University of Navarra, Pamplona, Spain
Mathieu Beirlaen, Ghent University, Gent, Belgium
Francesco Bellucci, Department of the Arts, University of Bologna, Bologna, Italy; Department of Philosophy and Communication Studies, University of Bologna, Bologna, Italy
Meghna Bhadra, Technische Universität Dresden, Dresden, Germany
Maria Regina Brioschi, Department of Philosophy “Piero Martinetti”, University of Milan, Milan, Italy; Department of Excellence, Ministry of Education, University and Research (MIUR), Milan, Italy
P. D. Bruza, Queensland University of Technology, Brisbane, QLD, Australia
J. Bueno-Soler, Centre for Logic, Epistemology and the History of Science, and School of Technology, Rua Paschoal Marmo, University of Campinas, Limeira, Brazil
Frank Cabrera, Milwaukee School of Engineering, Milwaukee, WI, USA
Fabrizio Calzavarini, Department of Letter, Philosophy, Communication, University of Bergamo, Bergamo, Italy; Center for Logic, Language, and Cognition (LLC), University of Turin, Turin, Italy
Raffaella Campaner, Department of Philosophy and Communication Studies, University of Bologna, Bologna, Italy
Daniel G. Campos, Department of Philosophy, Brooklyn College of The City University of New York, Brooklyn, NY, USA
Davide Carneiro, CIICESI, Escola Superior de Tecnologia e Gestão, Politécnico do Porto, Portugal
W. Carnielli, Centre for Logic, Epistemology and the History of Science, Rua Sérgio Buarque de Holanda, University of Campinas, Campinas, Brazil
Fausto Caruana, Institute of Neuroscience (IN), National Research Council of Italy (CNR), Parma, Italy
Gianluca Caterina, Center for Diagrammatic and Computational Philosophy, Endicott College, Beverly, MA, USA
Gustavo Cevolani, Models, Inference, and Decisions (MInD) Group, MoMiLab Research Unit, IMT School for Advanced Studies Lucca, Lucca, Italy; Center for Logic, Language, and Cognition (LLC), University of Turin, Turin, Italy
Daniele Chiffi, DAStU, Politecnico di Milano, Milan, Italy
M. E. Coniglio, Department of Philosophy and Centre for Logic, Epistemology and the History of Science, University of Campinas, Campinas, Brazil
Davide Coraci, Models, Inference, and Decisions (MInD) Group, MoMiLab Research Unit, IMT School for Advanced Studies Lucca, Lucca, Italy; Institut d’Histoire et de Philosophie des Sciences et des Techniques (IHPST), Paris 1 Panthéon-Sorbonne University, Paris, France
Tommaso Costa, Focus Lab, Department of Psychology, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
Ricardo Crespo, IAE Business School, Universidad Austral & Conicet, Buenos Aires, Argentina
Valentina Cuccio, Department of Ancient and Modern Civilizations (DICAM), Polo universitario “Annunziata”, viale Annunziata, University of Messina, Messina, Italy
Elizabeth de Freitas, Adelphi University, New York, NY, USA
Fernando Delbianco, Universidad Nacional del Sur (UNS) & INMABB-CONICET, Bahía Blanca, Argentina
Andy Dong, School of Mechanical, Industrial, and Manufacturing Engineering, Oregon State University, Corvallis, OR, USA
Igor Douven, IHPST/Panthéon–Sorbonne University/CNRS, Paris, France
Paul Ernest, University of Exeter, Exeter, UK
Francesco Fanti Rovetta, Research Training Group ‘Situated Cognition’, Osnabrück University, Osnabrück, Germany
Mario Ferraro, Department of Physics, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
Robert Folger, Management Department, University of Central Florida, Orlando, FL, USA
Matthieu Fontaine, Departamento de Filosofía, Lógica y Filosofía de la Ciencia, Universidad de Sevilla, Seville, Spain
Samuel Forsythe, Peace Research Institute Frankfurt, Frankfurt am Main, Germany
Rocco Gangle, Center for Diagrammatic and Computational Philosophy, Endicott College, Beverly, MA, USA
Tjerk Gauderis, R&D Into the Trees, Gent, Belgium
Andrew Gibson, Queensland University of Technology, Brisbane, QLD, Australia
Leonardo González Galli, CONICET/Instituto de Investigaciones CeFIEC, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
Brian D. Haig, University of Canterbury, Christchurch, New Zealand
Islam Hamada, Technische Universität Dresden, Dresden, Germany
Teruaki Hayashi, Department of Systems Innovation, School of Engineering, The University of Tokyo, Tokyo, Japan
Steffen Hölldobler, Technische Universität Dresden, Dresden, Germany; North Caucasus Federal University, Stavropol, Russian Federation
Katsumi Inoue, The Graduate University for Advanced Studies, SOKENDAI, Japan; National Institute of Informatics, Tokyo, Japan
Priyedarshi Jetli, Philosophy, University of Delhi, Mumbai, India
Sae Kondo, Department of Architecture, School of Engineering, Mie University, Tsu, Mie, Japan
Ehud Kroll, Department of Mechanical Engineering, ORT Braude College, Karmiel, Israel
Pascal Le Masson, Center of Management Science (CGS) – i3 UMR CNRS 9217, Mines Paris – PSL, Paris, France
Donato Liloia, Focus Lab, Department of Psychology, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
Ramzi Mabsout, Department of Economics, American University of Beirut, Beirut, Lebanon
Lorenzo Magnani, Department of Humanities, Philosophy Section and Computational Philosophy Laboratory, University of Pavia, Pavia, Italy
Jordi Manuello, Focus Lab, Department of Psychology, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
Elizabeth Martínez-Bautista, School of Philosophy and Letters, UNAM, Ciudad de México, Mexico
Carlo Martini, Faculty of Philosophy, Vita-Salute San Raffaele University, Milan, Italy; Center for Philosophy of Social Science, University of Helsinki, Helsinki, Finland
Michael Meyer, Institute for Didactics of Mathematics, University of Cologne, Cologne, Germany
Fabio Minazzi, Dp di Scienze Teoriche e Applicate, Università degli Studi dell’Insubria, Varese, Italy
Gerhard Minnameier, Faculty of Economics and Business, Goethe University Frankfurt, Frankfurt am Main, Germany
Angel Nepomuceno-Fernández, Facultad de Filosofía, Grupo de Lógica, Lenguaje e Información, Universidad de Sevilla, Seville, Spain
Tuan Quoc Nguyen, The Graduate University for Advanced Studies, SOKENDAI, Japan; National Institute of Informatics, Tokyo, Japan
Ilkka Niiniluoto, Department of Philosophy, History, and Art Studies, University of Helsinki, Helsinki, Finland
Paulo Novais, Algoritmi Centre, University of Minho, Braga, Portugal
Jaime Nubiola, Departamento de Filosofía, University of Navarra, Pamplona, Spain
Eduardo Ochs, Universidade Federal Fluminense, Rio das Ostras, Brazil
Phil Seok Oh, Department of Science Education, Gyeongin National University of Education, Anyang, Gyeonggi-do, Republic of Korea
Yukio Ohsawa, Department of Systems Innovation, School of Engineering, The University of Tokyo, Tokyo, Japan
Tiago Oliveira, Algoritmi Centre, University of Minho, Braga, Portugal
Paula Olmos, Linguistics, Modern Languages, Logic and Philosophy of Science, Universidad Autónoma de Madrid, Madrid, Spain
Arnold Oostra, Universidad del Tolima, Ibagué, Colombia
Sami Paavola, Faculty of Educational Sciences, University of Helsinki, Helsinki, Finland
Woosuk Park, Digital Humanities and Computational Social Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Pieter Pauwels, Department of the Built Environment, Eindhoven University of Technology, Eindhoven, The Netherlands
Bettina Pedemonte, Department of Neurology, UCSF Dyslexia Center, UCSF Memory and Aging Center, San Francisco, CA, USA
Luís Moniz Pereira, NOVA-LINCS – Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Lisbon, Portugal
Ahti-Veikko Pietarinen, Ragnar Nurkse Department of Innovation and Governance, Tallinn University of Technology, Tallinn, Estonia
Alexandre Miguel Pinto, Outra Limited, London, UK
Stathis Psillos, Department of History & Philosophy of Science, National and Kapodistrian University of Athens, Athens, Greece
João Ramos, CIICESI, Escola Superior de Tecnologia e Gestão, Politécnico do Porto, Portugal
Andrés Rivadulla, Department of Logic and Theoretical Philosophy, Complutense University of Madrid, Madrid, Spain
F. D. Rivera, Department of Mathematics and Statistics and Ed.D. in Educational Leadership Program, San Jose State University, San Jose, CA, USA
A. Rodrigues, Department of Philosophy, Federal University of Minas Gerais, Belo Horizonte, Brazil
Chiaki Sakama, Wakayama University, Wakayama, Japan
Alger Sans Pinillos, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
Ken Satoh, National Institute of Informatics, Tokyo, Japan
Gerhard Schurz, Department of Philosophy, University of Duesseldorf, Duesseldorf, Germany
Inna Semetsky, Institute for Edusemiotics Studies, Melbourne, VIC, Australia
John R. Shook, Bowie State University, Bowie, MD, USA
Vishal Singh, Centre for Product Design and Manufacturing, Indian Institute of Science, Bengaluru, India
Fernando Soler-Toscano, Facultad de Filosofía, Grupo de Lógica, Lenguaje e Información, Universidad de Sevilla, Seville, Spain
Christopher Stein, Department of Management, School of Business at Siena College, Loudonville, NY, USA
Fabio Sterpetti, Department of Philosophy, Sapienza University of Rome, Rome, Italy
Frederik Stjernfelt, Aalborg University, Copenhagen, Denmark
Fernando Tohmé, Dpto. de Economía, Universidad Nacional del Sur, INMABB-UNS-CONICET, Bahía Blanca, Argentina
Inna Semetsky: deceased.
Sharifu Ura, Division of Mechanical and Electrical Engineering, Kitami Institute of Technology, Kitami, Japan
Mariusz Urbański, Faculty of Psychology and Cognitive Science, Adam Mickiewicz University, Poznań, Poland
Jordi Vallverdú, ICREA Academia – Department of Philosophy, Autonomous University of Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
Fernando R. Velázquez-Quesada, Department of Information Science and Media Studies, Universitetet i Bergen, Bergen, Norway
Mariana Vitti Rodrigues, Department of Philosophy, Faculty of Philosophy and Sciences, São Paulo State University, Marília, Brazil
Benoit Weil, Center of Management Science (CGS) – i3 UMR CNRS 9217, Mines Paris – PSL, Paris, France
Donna E. West, State University of New York at Cortland, Cortland, NY, USA
James R. Wible, Department of Economics, Paul College of Business and Economics, University of New Hampshire, Durham, NH, USA
Aaron Bruce Wilson, South Texas College, McAllen, USA
John Woods, The Abductive Systems Group, Department of Philosophy, University of British Columbia, Vancouver, Canada
Part I Philosophy and Abduction
1 Introduction to Philosophy and Abduction
Sami Paavola
Contents
Abduction and Philosophy
References
Abstract
This chapter presents the background to, and a short summary of, Part I: Philosophy and Abduction. Peirce has provided different kinds of means for answering an “abductive puzzle,” that is, how problems are formulated and solutions take shape in the first phases of inquiry. Abduction has evolved with transformations in philosophy. The four chapters briefly introduced here concentrate on semiotic, methodological, and epistemological issues surrounding abduction.
S. Paavola, Faculty of Educational Sciences, University of Helsinki, Helsinki, Finland; e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_79
Abduction and Philosophy
Analyses of abduction are a rich source for digging into deep philosophical questions. Abduction opens up problem areas in epistemology, methodology, reasoning, logic, creativity, and processes of discovery, among other things. Hintikka (1998) has maintained that Peirce’s importance can be seen in his gift for finding key problems in philosophy. According to Hintikka, Peirce used abduction to point out a central question of contemporary epistemology: what is the nature of ampliative reasoning, and how do human beings discover new knowledge and theories? Even if we do not agree with Peirce’s solutions, he has provided means and tentative hypotheses for answering an “abductive puzzle” (Paavola, 2015). The abductive puzzle refers here to the mysteries surrounding the “first” phases of inquiry, where problems are formulated and solutions take shape, and to how these phases of inquiry can be analyzed and promoted by philosophical means.
Throughout his long career, Peirce maintained that there is a third mode of inference, namely abduction (under different names). But his conceptions of abduction changed and developed, and his ways of solving the abductive puzzle likewise evolved over the years. Peirce’s abduction is a “cluster concept” (see Wilson, this volume) with different, overlapping interpretations. Since Peirce, his various formulations have provided means for further solutions to questions such as the following. What is the basis for people finding productive or true explanations and theories? What kinds of processes are involved in the first stages of inquiry? What is the role of logic and reasoning in creative processes? Novel ways of answering the abductive puzzle have also been developed independently of Peirce.
Besides abduction, Peirce’s broad, systematic, and evolving theory of signs and logic and his classification of sciences have provided means for new interpretations of the abductive puzzle. Peirce is famous for his almost endless trichotomies, such as icon, index, and symbol; grammar, critic, and methodeutic; sign, object, and interpretant; rheme, dicisign, and argument; firstness, secondness, and thirdness; and esthetics, ethics, and logic. These are especially appropriate for analyzing philosophical problems when they facilitate the fine-grained distinctions needed to understand multifarious forms of reasoning, logic, and processes of inquiry.
Abduction has also evolved with more general transformations in philosophy. Peirce’s formulations of abduction were long more or less neglected in the heyday of logical empiricism and for quite a long time after that.
Critics of logical empiricism like Karl Popper shared the logical empiricists’ views when it came to the “context of discovery.” Abduction aimed at formulating areas of logic and reasoning which were thought to be beyond philosophical analysis. Changes in the philosophy of science started to occur with the historical and sociological turn brought about by scholars like N. R. Hanson and Thomas Kuhn, which opened up an avenue for new interest in abductive questions. Hanson was especially influential in advancing issues surrounding abduction and the logic of discovery. Another line of inquiry central to abduction started with Gilbert Harman, who formulated the Inference to the Best Explanation (IBE) model in the 1960s. Harman’s starting point was neither Peirce nor Hanson, but the IBE model has had many overlaps with the formulations and uses of abduction, especially within the philosophy of science. After Hanson and Harman, interpretations of abduction have spread into areas of research like logic, cognitive sciences, semiotics, and methodological discussions of various research disciplines.
Chapters in this section discuss abduction from four points of view. The themes of the chapters are both classic and current in terms of abduction. The chapter by Bellucci and Pietarinen describes crucial elements and phases of Peirce’s original theory of abductive reasoning, distinguishing Peirce’s early, syllogistic view of abduction from the mature theory of abduction as a first step in the process of inquiry.
It also discusses questions on both the validity and the methodology of abductive reasoning. The second chapter, by Wilson, opens up a fundamental quandary of abduction: how perception can be abductive even though it seems that perception is not inferential. Wilson presents a nonessentialist reading of abduction in which abduction can cover both inferential and noninferential (“instinctive”) processes, as well as nonpropositional ones. He connects this analysis to Peirce’s account of inquiry. The third chapter, by Vitti, analyzes abduction in the context of the Peircean theory of signs and semiosis. The chapter highlights the role of iconicity in the structure of abduction, as well as the central meaning of clue-like signs in abduction. Vitti analyzes illustrative examples of abduction both in science and in daily life. In the fourth chapter, Paavola describes the evolution of abduction as a logic of discovery. Hanson’s texts have been important in these discussions. Since Hanson, many commentators have treated abduction as a logic of pursuit, but the chapter also lists various ways of defending abduction as a logic of discovery.
In summary, the chapters concentrate on the character of abduction, its validity and justification, and the semiotic, methodological, and epistemological issues surrounding abduction from several angles. The focus is on Peirce’s accounts of abduction and Peirce-inspired analyses of abduction. There are clearly different kinds of debates on what abduction is and how it should be interpreted. This is how it should be from the Peircean point of view as well. One aim of the research has been to find the “true” nature of abduction as a third main mode of inference while recognizing that there are a multitude of ways of developing interpretations of abduction and solutions to the abductive puzzle.
References
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–533.
Paavola, S. (2015). Deweyan approaches to abduction? In U. Zackariasson (Ed.), Action, belief and inquiry. Pragmatist perspectives on science, society and religion (pp. 230–249). Nordic Studies in Pragmatism 3. Helsinki: Nordic Pragmatism Network.
2 Peirce’s Abduction
Francesco Bellucci and Ahti-Veikko Pietarinen
Contents
Introduction
The Early Theory
The Mature Theory
The Justification of Abduction
The Methodeutic of Abduction
Conclusions
References
Abstract
This paper presents the essentials of Peirce’s original theory of abductive reasoning. It explains the differences between the first phase of Peirce’s thinking on abduction, in which the logical framework is largely syllogistic, and the mature phase, in which abduction becomes the first step in the three-step process of scientific inquiry. The problems of the validity of abductive reasoning and that of the methodology of abductive reasoning are also briefly discussed.
Keywords
Abduction · Deduction · Induction · Peirce · Methodology
F. Bellucci, Department of the Arts, University of Bologna, Bologna, Italy; e-mail: [email protected]
A.-V. Pietarinen, Ragnar Nurkse Department of Innovation and Governance, Tallinn University of Technology, Tallinn, Estonia; e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_7
Introduction
According to Aristotle, reasoning is either deductive or inductive, this dichotomy being exhaustive (APo I.18, 81a40; Eth Nic VI, 1139b24–30; Top I, 105a10–19). This was still the prevailing view in the nineteenth century. For example, John S. Mill declares: “Reasoning [...] is popularly said to be of two kinds: reasoning from particulars to generals, and reasoning from generals to particulars; the former being called Induction, the latter Ratiocination or Syllogism” (Mill, 1843, II, i, §3). Likewise, for William Whewell the “process of deriving truths by the mere combination of general principles in particular hypothetical cases, is called deduction; being opposed to induction, in which [...] a new general principle is introduced at every step” (Whewell, 1840, I, I, i, §11). And Stanley Jevons: “The processes of inference always depend on the one same method of substitution; but they may nevertheless be distinguished according as the results are inductive or deductive” (Jevons, 1874: 13).
Unlike his British predecessors, Peirce considered reasoning or inference to be of three, not two, essentially distinct kinds: along with deduction and induction, he isolated a third form, which he variously called “hypothesis,” “reasoning a posteriori,” “abduction,” “presumption,” and “retroduction.” Like Aristotle and explicitly following him, in a first phase (§1), Peirce conceives of induction and abduction as inversions of the syllogism. Later he would abandon the syllogistic framework of the early theory and would construe abduction as the method of finding and selecting explanatory hypotheses; in the mature theory (§2), abduction is not just another kind of reasoning alongside deduction and induction; it is also a distinct step in the overall process of scientific inquiry, whose typical pattern is abduction → deduction → induction.
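The three-step pattern just mentioned can be pictured with a toy sketch. This is not Peirce’s formalism, only an illustration under loudly stated assumptions: the hypotheses, their consequences, and the observations below are invented, and the function names (`abduce`, `deduce`, `induce`, `inquire`) are hypothetical labels for the three stages.

```python
# Toy sketch of the pattern abduction -> deduction -> induction.
# All hypotheses, consequences, and observations are invented illustration data.

HYPOTHESES = {
    # hypothesis: observable consequences it would entail
    "it rained overnight": {"lawn is wet", "street is wet"},
    "the sprinkler ran": {"lawn is wet"},
    "a pipe burst": {"street is wet", "water pressure is low"},
}


def abduce(surprising_fact):
    """Abduction: propose every hypothesis that would explain the fact."""
    return [h for h, consequences in HYPOTHESES.items()
            if surprising_fact in consequences]


def deduce(hypothesis, already_known):
    """Deduction: derive the further consequences not yet observed."""
    return HYPOTHESES[hypothesis] - already_known


def induce(predictions, observations):
    """Induction: test predictions against observations.

    Returns True when every prediction is observed to hold; a hypothesis
    with no further predictions passes trivially in this toy model.
    """
    return all(observations.get(p, False) for p in predictions)


def inquire(surprising_fact, observations):
    """Run the three stages in order and keep the hypotheses that survive."""
    surviving = []
    for hypothesis in abduce(surprising_fact):
        predictions = deduce(hypothesis, {surprising_fact})
        if induce(predictions, observations):
            surviving.append(hypothesis)
    return surviving


observations = {"lawn is wet": True, "street is wet": False}
result = inquire("lawn is wet", observations)
print(result)
```

In this sketch abduction only proposes candidates; it is the later deductive and inductive stages that eliminate the rain hypothesis, whose predicted consequence (a wet street) fails the test.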
In his last years, Peirce devoted a great deal of thought to two further aspects of abductive logic: the problem of the justification of this form of reasoning, which belongs to the second department of logic or “logical critics” (§3), and its methodology, which belongs to the third department of logic or “methodeutic” (§4). Along with the problem of abduction’s logical form, the critic and the methodeutic of abduction are two central themes of Peirce’s mature theory of abductive reasoning.
The present paper outlines Peirce’s early and mature theories of abduction and presents the problems of the justification and methodology of abductive reasoning. A later paper in this volume, Chap. 12, “Imagination, Cognition, and Methods of Science in Peircean Abduction,” discusses the scientific values of the economy of research, including the uberty of abduction, namely, its creative power of generating new ideas that contribute to scientific discovery.
The Early Theory
Aristotle’s theory of induction, epagōgē, is in APr II.23. Induction is an argument “proving the major term of the middle term by means of the minor” (68b16–17),
where “major,” “minor,” and “middle” terms are to be taken in reference to the syllogism in the first figure. Here is Aristotle’s example:
(1) SYLLOGISM
Every animal without gall (B) is long-lived (A).
Man, horse, and mule (C) are without gall (B).
Therefore, man, horse, and mule (C) are long-lived (A).
(2) INDUCTION
Man, horse, and mule (C) are long-lived (A).
Man, horse, and mule (C) are without gall (B).
Therefore, every animal without gall (B) is long-lived (A).
Induction is the inversion of syllogism, for while syllogism connects the major term (A) with the minor (C) by means of the middle term (B), induction connects the major (A) with the middle (B) by means of the minor (C) (68b33–34). Aristotle sets a condition for the validity of induction: “If, then, C converts with B and the middle term does not reach beyond the extreme, then it is necessary for A to belong to B” (68b23–25). If B and C convert, (2) becomes the first-figure syllogism in (3), which is deductively valid:
(3)
Man, horse, and mule (C) are long-lived (A).
Animals without gall (B) are man, horse, and mule (C).
Therefore, every animal without gall (B) is long-lived (A).
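The role of the convertibility condition can be checked in a small extensional model. The sketch below is only an illustration under stated assumptions: each term is represented by the set of kinds it applies to, “Every S is P” is read as set inclusion, and the species lists (including the supposition that deer are gall-less but not long-lived) are hypothetical data, not zoological claims. The function names are likewise invented for the example.

```python
# Toy extensional model: a term is the set of kinds it applies to.
# Species lists are illustrative assumptions, not zoological claims.

def all_are(subject, predicate):
    """'Every S is P' holds when the extension of S lies inside that of P."""
    return subject <= predicate


def induction_is_safe(A, B, C):
    """Check the inductive pattern (2) in one model.

    Premises: every C is A, and every C is B.
    Conclusion: every B is A.
    Returns False only when the premises hold and the conclusion fails.
    """
    premises = all_are(C, A) and all_are(C, B)
    return (not premises) or all_are(B, A)


C = {"man", "horse", "mule"}        # minor term: the enumerated kinds
A = C | {"elephant"}                # major term: long-lived (hypothetical extension)

# Case 1: B converts with C, i.e., the enumeration is complete.
B_convertible = set(C)

# Case 2: B reaches beyond C, i.e., a gall-less kind was not enumerated
# and (by assumption) is not long-lived.
B_wider = C | {"deer"}

print(induction_is_safe(A, B_convertible, C))
print(induction_is_safe(A, B_wider, C))
```

When B and C have the same extension, the premises cannot be true while the conclusion is false, which is the condition under which (3) is deductively valid; once B reaches beyond C, a counter-model becomes available.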
This, of course, requires that man, horse, and mule are the only animal species without gall. In Part An IV.2, other species without gall are mentioned besides the three comprised under C (deer, dolphin, camel), which would seem to empirically invalidate the condition of convertibility. But it has to be kept in mind that in APr II.23 Aristotle’s concerns are purely logical, and the empirical falsity of the convertibility does not affect his logical point that if such terms are convertible, then induction is deductively valid. As expected, Peirce notices this. Convertibility of B and C makes of Aristotle’s epagōgē an argument from complete enumeration: “Aristotle evidently supposes that a general term is equal to a sum of singulars,” while “the extension of a universal term consists in the total of possible things to which it is applicable and not merely to those that are found to occur” (W 1:263). Rather than as a deductively valid inference based on the convertibility of B and C (the middle and minor terms of the corresponding first-figure syllogism (1)), Peirce construes Aristotle’s epagōgē as a deductively invalid third-figure syllogism. Peirce accepts that induction is an inversion of a syllogism, i.e., an inference of the major proposition of a first-figure syllogism from its minor premise and conclusion. If induction, as Aristotle explained (even though the reference to the third figure is not Aristotelian), is a deductively invalid inference in the third figure, then there must be another deductively invalid inference in the second figure. In later years, Peirce would attribute this idea to Aristotle himself and would identify it with the form of reasoning that Aristotle presents at APr II.25. “Abduction,” apagōgē, is an argument in which “it is clear that the first term belongs to the middle and unclear that the middle belongs to the third, though nevertheless equally convincing (piston)
F. Bellucci and A.-V. Pietarinen
as the conclusion, or more so; or, next, if the middles between the last term and the middle are few” (69a20–24). Apagōgē comes in two species, and it is quite clear that Aristotle has first-figure syllogisms in mind. Yet Peirce claims that the passage in question contains some corrupt readings and proposes to amend it and change the order of the propositions so as to have the minor premise of a syllogism being inferred from the other two, thus in an inference in the second figure (see the editorial notes in Peirce, 1998: 527–528n11). The best reconstruction of Peirce’s later reading of APr II.25 is in Flórez (2014), who argues that Peirce’s reading is untenable; see also Bellucci, 2019 and, for a different take, Magnani, 2017: 96–113. Nevertheless, Peirce’s own doctrine is that something like a deductively invalid inference in the second figure exists which is precisely the form of inference by which a hypothesis explaining facts is put forward. The substance of this doctrine is first published in the paper “On a Natural Classification of Arguments” (Peirce, 1868a). While in induction the major proposition of a first-figure syllogism is inferred from its minor premise and conclusion, in abduction the minor proposition of a first-figure syllogism is inferred from the major and the conclusion. The syllogism of which induction and abduction are inversions is called the “explaining syllogism.” In his Lectures on Logic, Hamilton had proposed to label “sumption” the major premise and “subsumption” the minor of a syllogism, the combination of sumption and subsumption giving the “conclusion” of the syllogism (Hamilton, 1860: 198–201). Peirce seems to have imitated Hamilton’s terminology: already in the ninth Lowell Lecture of 1866 (Peirce, 1982: 471–488), Peirce calls rule, case, and result, respectively, the major premise, the minor premise, and the conclusion of the explaining syllogism.
Given this terminology, induction is said to infer a rule from a case falling under that rule and a result, while abduction infers a case from a rule under which it falls and a result. The same terminology and conceptual framework are found in “Deduction, Induction, and Hypothesis” (Peirce, 1878). Here is how abduction (then still called “hypothesis”) is described in this latter paper: Hypothesis is where we find some very curious circumstance, which would be explained by the supposition that it was a case of a certain general rule, and thereupon adopt that supposition. Or, where we find that in certain respects two objects have a strong resemblance, and infer that they resemble one another strongly in other respects. (Peirce, 1878: 472)
Abduction is supposing that a given surprising fact is a case falling under a rule. Alternatively, it is supposing that things sharing certain characters also share other characters. The two descriptions are equivalent. Take the first of the series of examples of abductive reasoning that Peirce makes in that same paper: I once landed at a seaport in a Turkish province; and, as I was walking up to the house which I was to visit, I met a man upon horseback, surrounded by four horsemen holding a canopy over his head. As the governor of the province was the only personage I could think of who would be so greatly honored, I inferred that this was he. This was an hypothesis. (Peirce, 1878: 472)
Let us start from the description of abduction in terms of shared characters. In the “New List” of 1867, Peirce had described abduction as an argument in which
2 Peirce’s Abduction
the premises are a likeness of the conclusion (Peirce, 1868b: §15). In order to understand this, a step backward into his Harvard and Lowell lectures of 1865–1866 is needed. Like Peirce’s “Hamiltonian” terminology (Rule, Case, Result), the “semiotic” description of abduction in the “New List” is rooted in those lectures. After having introduced the two traditional logical quantities of “connotation” and “denotation” (also termed “comprehension” and “extension”), Peirce divides signs or representations into “copies” (later, “likenesses” and “icons”), which connote without denoting, “signs” in the strict sense (later, “indices”), which denote without connoting, and “symbols,” which both connote and denote and denote in consequence of the connotation; in a symbol, the connotation fixes the denotation (Peirce, 1982: 272). Usually, a combination of symbols is a symbol; yet symbols may also combine in composite symbols that lose either their capacity to denote or their capacity to connote. A composite symbol that has denotation without adequate connotation is an “enumerative term”; a composite symbol that has connotation but no denotation that is adequate to it is a “conjunctive term” (Peirce, 1982: 278–279). Take the composite symbol “man riding a horse surrounded by four horsemen holding a canopy over his head”; each of its components (“ride,” “horse,” “man,” “canopy,” etc.) is a symbol, because each denotes what it does because of its connotation; but their juxtaposition is not a symbol, because while it connotes several characters (the sum of the connotations of its component symbols), yet those characters are collectively unable to fix the total denotation: there may be distinct kinds of things (distinct denotations) satisfying the description (corresponding to the connotation). This composite symbol is a conjunctive term: it has connotation but no denotation that is adequate to it.
Now, since a conjunctive term has connotation but no adequate denotation, it can never appear as the subject of a proposition, because the subject of a proposition denotes and the predicate connotes. Therefore, an inference through a conjunctive term can only be in the second figure, i.e., in the figure in which the middle term is predicate in both premises:
(4) P1. This man is riding a horse surrounded by four horsemen holding a canopy over his head.
P2. The governor of the province rides horses surrounded by four horsemen holding a canopy over his head.
C. Hence, this man is the governor of the province.
This inference can only be in the second figure because its middle term is a conjunctive term, and a conjunctive term can never appear as the subject of a proposition. P1 records data from observations: certain characters are observed together in a certain circumstance; P2 connects those characters to something else in the only way it is possible to connect them, i.e., as a composite predicate. It must be noted that P2 is not convertible; the governor of the province has those characters, but not anything having those characters is the governor of the province; in a conjunctive term, connotation does not determine the denotation. The first description in the passage quoted, according to which abduction is supposing that a given surprising fact is a case falling under a rule, is equivalent to the second. P2 is a rule, and the conclusion C expresses a case falling under it;
a rule and a case falling under it are related as major and minor premise of a first-figure syllogism, respectively, and the result of the falling of the case under the rule is the conclusion of it. Rule and result have the same predicate; if one then infers the case from the rule and the result, the inference is in the second figure. The description of abduction as the inference of a case from a rule and a result is found in the “Natural Classification of Arguments” of 1867 (Peirce, 1868a), in “Some Consequences of Four Incapacities” (Peirce, 1868c), and in “Deduction, Induction, and Hypothesis” (Peirce, 1878); the description of abduction in semiotic terms as the inference through a conjunctive term is found in §15 of the “New List” (Peirce, 1868b) but is more exhaustively discussed in the Harvard and Lowell Lectures of 1865–1866. There, it is explained why deduction is an inference through a symbol, induction an inference through an enumerative term, and abduction (at that time called “hypothesis”) an inference through a conjunctive term (Peirce, 1982: 446–447). §15 of the “New List” succinctly presents a matter that is fully explained in the lectures. The most mature version of this early theory is in the article “A Theory of Probable Inference” in the volume of Studies in Logic edited by Peirce himself (Peirce, 1883); here, as before, Peirce explains induction and abduction as inversions of a valid deductive syllogism.
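The inversion scheme of the early theory can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the `Proposition` type and the functions `deduce`, `induce`, and `abduce` are hypothetical names introduced here, not Peirce's notation, and propositions are reduced to bare subject and predicate pairs. The sketch shows that the three inferences differ only in which two members of the triple (rule, case, result) serve as premises.

```python
from typing import NamedTuple, Optional


class Proposition(NamedTuple):
    """A categorical proposition read as 'subject is predicate' (illustrative)."""
    subject: str
    predicate: str


def deduce(rule: Proposition, case: Proposition) -> Optional[Proposition]:
    """Rule + case -> result: the first-figure explaining syllogism."""
    if case.predicate != rule.subject:
        return None  # the case does not fall under the rule
    return Proposition(case.subject, rule.predicate)


def induce(case: Proposition, result: Proposition) -> Optional[Proposition]:
    """Case + result -> rule: the shared term is subject of both premises."""
    if case.subject != result.subject:
        return None
    return Proposition(case.predicate, result.predicate)


def abduce(rule: Proposition, result: Proposition) -> Optional[Proposition]:
    """Rule + result -> case: the shared term is predicate of both premises."""
    if rule.predicate != result.predicate:
        return None
    return Proposition(result.subject, rule.subject)


# The conjunctive term of the Turkish example, abbreviated for readability.
CANOPY = "rides under a canopy held by four horsemen"

rule = Proposition("the governor of the province", CANOPY)
case = Proposition("this man", "the governor of the province")
result = Proposition("this man", CANOPY)

print(deduce(rule, case) == result)   # rule and case give the result
print(abduce(rule, result) == case)   # rule and result give the case
print(induce(case, result) == rule)   # case and result give the rule
```

Only `deduce` corresponds to a deductively valid form; `induce` and `abduce` return a conclusion that the premises do not necessitate, which is exactly the point of calling them inversions.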
The Mature Theory
In July 1910, Peirce wrote to Paul Carus: “the division of the elementary kinds of reasoning into three heads was made by me in my first lectures and was published in 1869 in Harris’s Journal of Speculative Philosophy [i.e. Peirce, 1868c]. I still consider that it had a sound basis. Only in almost everything I printed before the beginning of this century [ . . . ] I more or less mixed up Hypothesis and Induction” (Peirce, 1933–1958: 8.227). There is no reason to doubt Peirce’s retrospective account. But in what sense were abduction and induction more or less mixed up in Peirce’s pre-1900 writings? Some indications can be gleaned from the Minute Logic of 1902:
Upon this subject [i.e. abduction], my doctrine has been immensely improved since my essay ‘On The Theory of Probable Inference’ was published in 1883. In what I there said about ‘Hypothetic Inference’ I was an explorer upon untrodden ground. I committed, though I half corrected, a slight positive error, which is easily set right without essentially altering my position. But my capital error was a negative one, in not perceiving that, according to my own principles, the reasoning with which I was there dealing could not be the reasoning by which we are led to adopt a hypothesis, although I all but stated as much. But I was too much taken up in considering syllogistic forms and the doctrine of logical extension and comprehension, both of which I made more fundamental than they really are. As long as I held that opinion, my conceptions of Abduction necessarily confused two different kinds of reasoning. (Peirce, 1933–1958: 2.102)
What before 1900 Peirce called “hypothesis” and “abduction” is actually an induction about characters: instead of making an inference about things in extension, as in induction, one makes an inference about characters in comprehension. Instead
of saying that the characters possessed by a sample are possessed by the whole to which the sample belongs, one says that something possessing a sample of the characters of a thing possesses all the characters of that thing (i.e., it is that thing whose characters it possesses). Both induction proper and the 1865–1883 form of reasoning called abduction are inferences from the sample to the whole. In the one, the sample is taken in extension, in the other in comprehension. All inference from the sample to the whole is inductive. Abduction in the sense of 1865–1883, Peirce says, should be called “abductive” or “qualitative” induction. It is one of the three species of induction (crude induction, quantitative induction, qualitative induction) that Peirce identifies in his later writings on inductive logic (see, e.g., Peirce, 1933–1958: 2.755–772; cf. Cheng, 1967; Goudge, 1940; Goudge, 1950: 158ff). So in the second Cambridge Conference of 1898 on “Types of Reasoning,” Peirce declares: “I first gave this theory in 1867, improving it slightly in 1868. In 1878 I gave a popular account of it in which I rightly insisted upon the radical distinction between Induction and Retroduction. In 1883, I made a careful restatement with considerable improvement. But I was led astray by trusting to the perfect balance of logical breadth and depth into the mistake of treating Retroduction as a kind of Induction” (Peirce, 1992: 141). Abduction is not an induction about characters. It is the process of forming an explanatory hypothesis. Abduction, induction, and deduction are not just three distinct kinds of arguments; they are also distinct stages of a typical scientific investigation. One of the first and best descriptions of the three-step process is in the 1901 article “On the Logic of Drawing History from Ancient Documents” (Peirce, 1998: 75–114).
When we are confronted with surprising facts, i.e., facts of observation or experience that are contrary to our expectations, an explanation is required: “the explanation must be such a proposition as would lead to the prediction of the observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis, then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts is what I call abduction. I reckon it as a form of inference, however problematical the hypothesis may be held” (Peirce, 1998: 94–95). The hypothesis is a proposition from which the surprising facts would follow deductively. The “classic” formulation of the logical form of abduction is in the seventh and last Harvard Lecture of 1903 (Peirce, 1933–1958: 5.189):
(5) The surprising fact, C, is observed.
But if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true.
Cast in this form, the 1878 example about the Turkish province becomes:
(6) The surprising fact is observed that a man is riding a horse surrounded by four horsemen holding a canopy over his head.
But if this man were the governor of the province, the fact that he rides horses surrounded by four horsemen holding a canopy over his head would be a matter of course.
Hence, there is reason to suspect that this man is the governor of the province.
“Would be a matter of course” means “would be a deductive consequence of it”: the hypothesis is a proposition which, if true, would necessitate the truth of the surprising fact. It is the antecedent of a (supposedly) true conditional, and the conditional is the explanation of the surprising fact. Making an explanatory hypothesis thus amounts to finding an antecedent. This is the reason why in some of his later writings, Peirce calls this form of reasoning “retroduction”: because it “starts at consequents and recedes to a conjectural antecedent from which these consequents would, or might very likely logically follow” (Peirce 2019–2021, 3.1:348; Bellucci & Pietarinen, 2014). The retroductive process of adopting the hypothesis, i.e., of finding a potential antecedent of which the surprising fact is a consequent, is the first step in inquiry. The second step is to trace necessary, or deductive, consequences from the hypothesis: “the first thing that will be done, as soon as a hypothesis has been adopted, will be to trace out its necessary and probable experiential consequences. This step is deduction” (Peirce, 1998: 95). By abduction, one forms the hypothesis that “this man is the governor of the province.” By deduction, the deductive consequences of the hypothesis are traced:
(7) This man is the governor of the province.
Governors of the province can speak French and wear long shirts under the coat.
Hence, this man can speak French and wears a long shirt under the coat.
The deductive consequences of the hypothesis are experimental predictions that must be selected independently of any knowledge about their truth; it must not yet have been observed, but must be potentially observable, that the man whose identity is conjectured can speak French and wears a long shirt under the coat. Of course, the surprising fact whose explanation is sought is already, in itself, a deductive consequence of the hypothesis, for were it not, the hypothesis could not be said to explain it. But in the second step, deductive consequences of the hypothesis are drawn other than the fact which prompted the adoption of it. That further deductive consequences are drawn, i.e., that predictions are made, is necessary, if the hypothesis is going to be tested; one can only test a hypothesis by testing its predictions. The third step of inquiry consists precisely in the testing of the hypothesis by means of a testing of those predictions:
Having, then, by means of deduction, drawn from a hypothesis predictions as to what the results of experiment will be, we proceed to test the hypothesis by making the experiments and comparing those predictions with the actual results of experiment. [ . . . ] [When] we find that prediction after prediction, notwithstanding a preference for putting the most unlikely ones to the test, is verified by experiment, whether without modification or with a merely quantitative modification, we begin to accord to the hypothesis a standing among scientific results. This sort of inference it is, from experiments testing predictions based on a hypothesis, that is alone properly entitled to be called induction. (Peirce, 1998: 96–97)
The operation of testing a hypothesis by experiment is induction. It consists in considering the predictions from the hypothesis, remarking what conditions should be satisfied in order for those predictions to be fulfilled, causing those conditions to be satisfied by experiment, and noting the result of the experiment. If the predictions
are fulfilled, the hypothesis is inductively given a certain confidence. In example (7), the predictions that are made from the hypothesis are that the man can speak French and wears a long shirt under the coat; I can make an experiment, i.e., I can cause the conditions for those predictions to be fulfilled and note the result. For example, I can address the man in French or somehow make him raise the coat; if the predictions are fulfilled, if he replies in French and a long shirt becomes visible under the coat, the experiment has had a positive result, and the hypothesis that the man is the governor of the province is confirmed (at least, provisionally). The inductive character of the procedure of hypothesis verification derives from the fact that the predictions tested are a sample of all the predictions from the same hypothesis that could be tested (cf. Reilly, 1970: 62). I can conclude that the man is the governor of the province on the basis of the induction that just as these predictions are verified, so must all the predictions drawn from the same hypothesis. The third step of inquiry is the inductive generalization that what is found true of some predictions would be found true of all of them. Abduction is not a form of reasoning that is isolated from the others; it is an essential component of scientific inquiry. A hypothesis is first put forth by abduction and then verified by induction through verification of its observable consequences drawn by deduction. Scientific inquiry is a “dialectic” between abduction and induction, the beginning and the end of hypothetical thinking, connected by the gateway of deduction. In later years, Peirce places an increasing importance on abduction as an “invitation to inquiry,” explaining the very nature of abduction as “reasoning from surprise to inquiry.” Its conclusion C is inferred in an interrogative mood, or as argued in Ma and Pietarinen (2018), as a co-hortative “Is C the case? Let us investigate!”
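The three stages of inquiry described above can be sketched in a few lines of code. This is a toy model under loudly stated assumptions, not a reconstruction of Peirce's logic: the table `CONSEQUENCES`, the rival hypothesis “this man is a bridegroom,” the set `OBSERVED`, and the use of a simple fraction of fulfilled predictions as a stand-in for “a certain confidence” are all invented for illustration.

```python
from typing import Callable, Dict, List, Set

FACT = "rides under a canopy held by four horsemen"

# Hypothetical table: each hypothesis with its deductive consequences.
# The bridegroom rival is invented here purely for contrast.
CONSEQUENCES: Dict[str, Set[str]] = {
    "this man is the governor": {
        FACT,
        "can speak French",
        "wears a long shirt under the coat",
    },
    "this man is a bridegroom": {
        FACT,
        "is followed by a wedding party",
    },
}


def abduce(surprising_fact: str) -> List[str]:
    """Stage 1: recede from a consequent to its conjectural antecedents."""
    return sorted(h for h, cs in CONSEQUENCES.items() if surprising_fact in cs)


def deduce(hypothesis: str, surprising_fact: str) -> List[str]:
    """Stage 2: draw predictions other than the fact that prompted the hypothesis."""
    return sorted(CONSEQUENCES[hypothesis] - {surprising_fact})


def induce(predictions: List[str], experiment: Callable[[str], bool]) -> float:
    """Stage 3: test a sample of predictions; return the share that is fulfilled."""
    if not predictions:
        return 0.0
    return sum(experiment(p) for p in predictions) / len(predictions)


# Hypothetical experimental outcomes (e.g., the man replies in French).
OBSERVED = {"can speak French", "wears a long shirt under the coat"}

support = {
    h: induce(deduce(h, FACT), lambda p: p in OBSERVED)
    for h in abduce(FACT)
}
print(support)
```

Note that `abduce` only proposes candidates; the surprising fact alone cannot discriminate between them, since both have it as a consequence. Only the further predictions drawn in the second stage and tested in the third give one hypothesis a provisional standing over the other.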
The Justification of Abduction
In his mature writings, Peirce divided logic into three departments: speculative grammar, which provides a definition and analysis of logic’s principal objects (propositions and arguments); logical critics, which provides a justification of the three main species of arguments; and speculative rhetoric or methodeutic, which provides both general and specific methodological principles for inquiry. As far as abduction is concerned, logical critics is concerned with its justification, while methodeutic is concerned with its methodology. Let us start with justification. Initially, the problem of justifying abduction was taken by Peirce to be a question of reducing abductions to their corresponding deduction. We saw that in the early theory, abduction and induction are inversions of a first-figure syllogism which is called “explaining syllogism.” If the explaining syllogism is deductively valid, then one of its inversions is inductively valid and the other is abductively valid. For example, the following argument (8) is the first-figure syllogism of which (4) is an inversion, i.e., is its explaining syllogism:
(8) The governor of the province rides horses surrounded by four horsemen holding a canopy over his head.
This man is the governor of the province.
Hence, this man is riding a horse surrounded by four horsemen holding a canopy over his head.
Since (8) is (deductively) valid, (4) is (abductively) valid. Mutatis mutandis, the same would be true of induction. The same theory applies to the “statistical version” of the theory which Peirce had presented in the 1883 Studies in Logic (Goudge, 1950: 203–205). Yet, not only is this theory dependent on the “syllogistic” framework of Peirce’s early theory of abduction and thus destined to be revised in the context of the mature theory; also, it leaves unexplained what it means to be “abductively” or “inductively” valid. We know what it means to be deductively valid and what it means to be deductively invalid, namely, in a deductively valid inference, the truth of the premises necessitates the truth of the conclusion. In this sense, (8) is deductively valid and (4) deductively invalid. But outside deduction, the concept of validity is in need of some further explanation. According to Peirce, the question of the justification of abduction is the “bottom question of logical Critic” (Peirce, 1998: 443). In later years, Peirce attempted to justify abduction by reference to an “instinct” that man possesses for correct reasoning. Such a reference is itself abductive, for it explains the surprising fact that man is able to make correct abductions; and as an abduction, it is verified by induction: “it is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses it will light upon the correct hypothesis. Now inductive experience supports that hypothesis in a remarkable measure” (Peirce, 1998: 108); “Abduction is no more nor less than guessing, a faculty attributed to Yankees. Such validity as this has consists in the generalization that no new truth is ever otherwise reached while some new truths are thus reached. This is a result of Induction” (Peirce 2019–2021, 3.1:282). 
Perhaps the best statement of this idea is the following from a 1909 draft of the “Preface” that Peirce wanted to add to the republication of his Illustrations of the Logic of Science (Peirce, 2014). The logical justification of Retroduction [ . . . ] is as follows. In the first place, we certainly do thoroughly believe and cannot help so believing, do what we may, that some reasonings are sound. For we can free ourselves of a belief only by reasoning ourselves out of it, and to do this is to believe that some reasonings are sound. Now although it is, of course, one thing to believe a proposition, no matter how thoroughly and firmly, and quite another for the proposition to be true, yet practically for the believer they are one and the same. For if his belief is perfect he thinks he is sure it is true and between that and his thinking it is true there is no practical difference. We must and do admit, therefore, that some reasonings are sound. But to say this is to say that some instinct or natural impulse to believe is in conformity with the real nature of things; and the only question is how far that conformity extends. This can only be ascertained by sampling; and the process of sampling will consist in taking Retroduction after Retroduction and testing the truth of each by as large a sample of its consequences as can conveniently be obtained. This justifies Retroduction, which simply puts that process of testing into practice for single Retroductions; and there is nothing in the justification that cannot be learned from indubitable external observation and equally indubitable reasoning. (Peirce, 2014: 250–251)
That some inferences that man makes are valid is indubitable, because to doubt it would be to believe in the validity of some other inference and thus to assume precisely what is thereby doubted. Now, that some inferences are valid, and especially that some abductive inferences are valid, is something that has to be explained, and it is explained by making the hypothesis that man truly has some power of knowing things, that is, that man has a power of abduction. Peirce put this in the strong terms of the hypothesis that man has “some instinct or natural impulse to believe [ . . . ] in conformity with the real nature of things.” Another passage, from the 1908 “Neglected Argument,” reads: “there is a reason, an interpretation, a logic, in the course of scientific advance; and this indisputably proves [ . . . ] that man’s mind must have been attuned to the truth of things in order to discover what he has discovered. It is the very bed-rock of logical truth” (Peirce, 1998: 444). Peirce repeatedly argues that the human mind has evolved an instinct to represent reality correctly more often than would be the case by pure chance. This is in fact a plausible hypothesis that explains why some abductive inferences that man makes are in fact correct. Let us call it the ur-abduction underlying all abductive reasoning. Its logical form is more or less the following:
(9) It is observed that some abductive inferences are correct.
If a human being had a power of abduction, then that some abductive inferences are correct would be a matter of course.
Hence, there is reason to think that a human being has a power of abduction.
Now, the ur-abduction that man has a power of abduction must, like every abduction, be put to test in order to see whether and to what extent it stands up to experimental probing. “The only question,” Peirce states, “is how far that conformity extends.” The ur-abduction that man has a power of abduction must be verified by experience, i.e., inductively. Such an inductive verification of the ur-abduction is provided by the history of science: since the history of science provides us with abundant examples of successful or partly successful abductive inferences, the fundamental abduction or ur-abduction that man has a power of abduction is confirmed at least in some measure and is valid in precisely that measure. The history of science provides, as it were, the material for making the ur-induction that verifies that ur-abduction, just like single inductions verify single abductions. “This justifies Retroduction,” Peirce concludes, as it “simply puts that process of testing into practice for single Retroductions.” The justification of abductive reasoning therefore lies in the fundamental hypothesis, or ur-abduction, that man has a power of abduction. The whole idea of justifying abduction through abduction is circular (Fann, 1970). Its circularity depends on the fact that the three kinds of arguments (deduction, induction, and abduction) are essentially distinct and irreducible, i.e., on what has been called Peirce’s “autonomy thesis” (Kapitan, 1997; Hintikka, 1998). If the autonomy thesis holds, then any account of the justification of abduction has to involve some circularity. We note that Peirce’s account is circular but not viciously circular, because the ur-abduction is independently verified through an argument from the history of science, which is a fundamental induction or ur-induction. Abduction is
therefore directly justified through an ur-abduction, which in its turn is checked inductively: the justification of abduction is for Peirce directly abductive and indirectly inductive (cf. Bellucci & Pietarinen, 2020).
The Methodeutic of Abduction
Once logical critics has provided a justification of abduction, it is up to methodeutic to provide instructions as to the actual practice of abductive logic. Abductions are for Peirce “the only ones in which after they have been admitted to be just, it still remains to inquire whether they are advantageous” (Peirce, 1902), i.e., after they have been critically analyzed as to their validity, it still remains to methodeutically analyze them as to their advantageousness. This means that hypotheses arrived at by abduction which are equivalent from the point of view of their validity may differ as to the relative advantage that selecting one or the other may have for the overall process of scientific inquiry and explanation. In other words, there may be an advantage in selecting one hypothesis over another for the sake of testing, and determining this is the business of the methodeutic of abduction. Unfortunately, Peirce nowhere gives a full picture of the character of his intended methodeutic of abduction, but it at least seems clear that (i) he has offered criteria for judging of the advantageousness of hypotheses and that (ii) these criteria are somehow ordered by preference. Here is a partial reconstruction (cf. Bellucci & Pietarinen, 2021; for an alternative view see Goudge, 1950: 199–201). First, a hypothesis should explain the surprising fact, i.e., it must be an explanatory hypothesis. This criterion is satisfied by any concept or proposition of which the surprising fact is a deductive consequence. Second, among explanatory hypotheses, only those that qualify as “experimental” or those “comparable with experience” should be considered. In fact, hypotheses can be verified only if their deductive consequences, i.e., the predictions made from them, are objects of a possible experience.
In the last of his seven Harvard Lectures of 1903, Peirce famously declared: “If you carefully consider the question of pragmatism you will see that it is nothing else than the question of the logic of abduction. That is, pragmatism proposes a certain maxim which, if sound, must render needless any further rule as to the admissibility of hypotheses to rank as hypotheses” (Peirce, 1998: 234). The pragmatic maxim, considered as a maxim of the methodeutic of abduction, excludes any explanatory hypothesis that cannot be experimentally verified or matched with experience. Third, among experimental hypotheses, one should first select for testing those that are simple. In his later writings on abduction, Peirce often refers to Galileo’s maxim that hypotheses should be as simple as possible. If this maxim is taken in the sense that hypotheses should be as logically simple as possible, i.e., they should add the least to what has been observed, then, he argued, following this maxim to perfection would require that one should content oneself with the very facts observed (Peirce, 1998: 444). If on the contrary Galileo’s maxim is taken to mean that those hypotheses are to be selected which are “more natural and facile for the human mind
than another which renders the facts equally intelligible” (Peirce 2019–2021, 3.1:354), then the maxim becomes perfectly reasonable: “I do not mean that logical simplicity is a consideration of no value at all, but only that its value is badly secondary to that of simplicity in the other sense” (Peirce, 1998: 444–445). Fourth, among equally simple hypotheses, those should be selected whose verification is, in the sense of the economy of research, the “cheapest.” Peirce’s argument for the economic criterion is roughly as follows: the logical validity of abduction presupposes that nature be in principle explainable. This means that to discover is simply to expedite an event that would eventually come to pass. Therefore, the ultimate real service of a logic of abduction must be of the nature of an economy. Economy itself depends on three factors: cost (of money, time, energy, thought), the value of the hypothesis itself, and the effects that it renders upon other projects and hypotheses (Peirce, 1933–1958: 7.164–231). The whole of Peirce’s methodeutic of abduction is summarized in the following fragment: “The recommendations of an explanatory hypothesis are, 1st, verifiability; 2nd, simplicity; 3rd, economy” (Peirce 2019–2021, 1:474). While the qualification “explanatory” gives the first criterion for acceptable abductions, verifiability, simplicity, and economy are three further and successive criteria for selecting explanatory hypotheses that are to be subjected to verification or experimental test. This is, in brief, the substance of Peirce’s methodeutic of abduction. Some further aspects that concern the novelty and truth of the conclusions conjectured by abductive reasoning (such as the quality of abductions being low in “security” but high in “uberty”) are presented in Chap. 12, “Imagination, Cognition, and Methods of Science in Peircean Abduction.”
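The ordering of the four criteria can be pictured as a filter followed by a lexicographic sort. The sketch below is a deliberately crude illustration and rests on assumptions not found in Peirce: the `Hypothesis` record, the integer `naturalness` score (standing in for simplicity in the sense of what is “natural and facile” for the mind), and the single `test_cost` number (standing in for only the first of the three factors of economy) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains_fact: bool  # the surprising fact is a deductive consequence
    verifiable: bool     # its predictions are objects of possible experience
    naturalness: int     # hypothetical score: higher means simpler for the mind
    test_cost: float     # hypothetical cost of testing (money, time, energy, thought)


def select_for_testing(candidates: List[Hypothesis]) -> List[Hypothesis]:
    """Keep explanatory, verifiable hypotheses; order by simplicity, then economy."""
    admissible = [h for h in candidates if h.explains_fact and h.verifiable]
    # Lexicographic key: simplicity first; cost only breaks ties in simplicity.
    return sorted(admissible, key=lambda h: (-h.naturalness, h.test_cost))


candidates = [
    Hypothesis("H1", True, True, naturalness=2, test_cost=5.0),
    Hypothesis("H2", True, True, naturalness=2, test_cost=1.0),
    Hypothesis("H3", True, False, naturalness=3, test_cost=0.5),  # not verifiable
    Hypothesis("H4", False, True, naturalness=3, test_cost=0.5),  # not explanatory
    Hypothesis("H5", True, True, naturalness=1, test_cost=0.1),
]

order = [h.name for h in select_for_testing(candidates)]
print(order)
```

The lexicographic key reflects the reading on which the criteria are “successive”: economy is consulted only among equally simple hypotheses, so the very cheap but less natural H5 is still tested last. Whether Peirce intended so strict an ordering is an interpretive question that the sketch does not settle.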
Conclusions
Abduction was Peirce’s own creation. Nineteenth-century logic had followed Aristotle in dividing all ratiocination into the deductive and the inductive species. Peirce saw better than anyone else that science and scientific progress cannot be due solely to deductive and inductive patterns of inference. He had both a logical critic (i.e., a justificatory account) and a logical methodeutic (i.e., a methodological account) of abduction and justly merits the title of father of abductive logic.
References Bellucci, F. (2019). Abduction in Aristotle. In D. Gabbay et al. (Eds.), Natural arguments. A tribute to John Woods (pp. 551–564). College Publications. Bellucci, F., & Pietarinen, A.-V. (2014). New light on Peirce’s conceptions of retroduction, deduction and scientific reasoning. International Studies in the Philosophy of Science, 28(4), 353–373. Bellucci, F., & Pietarinen, A.-V. (2020). Peirce on the justification of abduction. Studies in History and Philosophy of Science. Part A, 84, 12–19.
Bellucci, F., & Pietarinen, A.-V. (2021). Methodeutic of abduction. In J. Shook & S. Paavola (Eds.), Abduction in cognition and action (pp. 107–127). Springer. Cheng, Z. (1967). A note on Charles Peirce’s theory of induction. Journal of the History of Philosophy, 5(4), 361–364. Fann, K. T. (1970). Peirce’s theory of abduction. Nijhoff. Flórez, J. A. (2014). Peirce’s theory of the origin of abduction in Aristotle. Transactions of the Charles S. Peirce Society, 50, 265–280. Goudge, T. (1940). Peirce’s treatment of induction. Philosophy of Science, 7(1), 56–68. Goudge, T. (1950). The thought of Charles S. Peirce. University of Toronto Press. Hamilton, W. (1860). Lectures on metaphysics and logic. Ed. by H. L. Mansel & J. Veitch. Gould and Lincoln. Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–534. Jevons, S. (1874). The principles of science. A treatise on logic and scientific method. Macmillan. Kapitan, T. (1997). Peirce and the structure of abductive inference. In N. Houser, D. D. Roberts, & J. van Evra (Eds.), Studies in the logic of Charles Peirce (pp. 477–496). Indiana University Press. Ma, M., & Pietarinen, A.-V. (2018). Let us investigate! Dynamic conjecture-making as the formal logic of abduction. Journal of Philosophical Logic, 47(6), 913–945. Magnani, L. (2017). The abductive structure of scientific creativity. Springer. Mill, J. S. (1843). A system of logic, ratiocinative and inductive. Parker. Peirce, C. S. (1868a). On the natural classification of arguments. Proceedings of the American Academy of Arts and Sciences, 7, 261–287. Peirce, C. S. (1868b). On a new list of categories. Proceedings of the American Academy of Arts and Sciences, 7, 287–298. Peirce, C. S. (1868c). Some consequences of four incapacities. Journal of Speculative Philosophy, 2, 140–157. Peirce, C. S. (1878). Deduction, induction, and hypothesis. Popular Science Monthly, 13, 470–482. 
Peirce, C. S. (Ed.). (1883). Studies in logic, by members of the Johns Hopkins University. Little, Brown, and Company. Peirce, C. S. (1902). Carnegie application. RL 75. Peirce Papers, Harvard University. Peirce, C. S. (1933–1958). The collected papers of Charles S. Peirce (8 Vols). Ed. by C. Hartshorne, P. Weiss, & A. W. Burks. Harvard University Press. Peirce, C. S. (1982). Writings of Charles S. Peirce: A chronological edition (Vol. 1). Ed. by the Peirce Edition Project. Indiana University Press. Peirce, C. S. (1992). Reasoning and the logic of things. Ed. by K. L. Ketner. Harvard University Press. Peirce, C. S. (1998). The essential Peirce (Vol. 2). Ed. by the Peirce Edition Project. Indiana University Press. Peirce, C. S. (2014). Illustrations of the logic of science. Ed. by C. de Waal. Open Court. Peirce, C. S. (2019–2021). Logic of the future. Peirce’s writings on existential graphs (3 Vols). Ed. by A.-V. Pietarinen. De Gruyter. Reilly, F. E. (1970). Charles Peirce’s theory of scientific method. Fordham University Press. Whewell, W. (1840). The philosophy of the inductive sciences, founded upon their history. John W. Parker.
3 Abduction and Semiosis
Mariana Vitti Rodrigues
Contents
Introduction
Semiosis and the Logical Signs
Abductive Reasoning in the Context of Peircean Philosophy
Imagining Possible Scenarios Through Iconic Abduction
  Abduction, Insight, and Diagrams
  From Premises to Conclusion: The Iconic Logical Principle of Abduction
  Abduction as the Reasoning from Surprise to Inquiry
Conclusions
References
Abstract
This chapter presents the concept of abduction in the context of the Peircean General Theory of Signs, or Semiotics. It introduces the role of abduction, understood as hypothesis-making reasoning, in the dynamic, collective, and meaning-making process of sign action, or semiosis. In general terms, abduction can be described as an ampliative form of argument which starts from a surprising fact, something unexpected, some state of doubt or curiosity, and concludes provisionally with a reasonable hypothesis that is worthy of further investigation. The chapter describes the semiotic structure of abduction by highlighting the role of iconicity in the elaboration of possible imaginary scenarios through the unveiling of clue-like signs, in order to conceive plausible explanations about the object under investigation. Illustrative examples of abduction, in science and daily life, are given to aid the understanding of the semiotic nature of abductive reasoning.
M. Vitti Rodrigues
Department of Philosophy, Faculty of Philosophy and Sciences, São Paulo State University, Marília, Brazil
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_8
Keywords
Abduction · Semiosis · Meaning-making processes · Pragmaticism · Scientific discovery
Introduction
Accordingly, just as we say that a body is in motion, and not that motion is in a body we ought to say that we are in thought, and not that thoughts are in us. (Peirce CP 5.289, 1864)
Is there a logic that underlies the generation of new ideas? If so, would there be a formal way to proceed in order to reason according to this logic? Peirce answers the first question positively, and, to account for the second question, he develops what he calls Semiotics, “the quasi-necessary, or formal, doctrine of signs.” Semiotics is a normative science which offers a method to approach truth by studying the characteristics of signs, their different aspects, and the relationships among them. In general terms, a sign is characterized as something which stands in the place of something else by embodying partial aspects of the object it represents (Peirce CP 2.227, 1903). In other words, Semiotics is a method to approach an object in its dynamicity, as a set of habits with unfolding possibilities, which determines possible courses of action for those who are willing to learn from experience. In this context, Semiotics is considered an objective and anti-psychologistic science that offers a guide to self-controlled and self-corrective reasoning through which every agent is able to approach reality through experience (Bellucci, 2014; Colapietro, 2003). Peirce understands that reality appears as three different modes of experience which demarcate the conditions of what can be known, what is intelligible, and, as such, what can be the object of Semiotics. The author describes the three modes of experience as: “[b]eing of positive qualitative possibility, the being of actual fact, and the being of law that will govern facts in the future” (Peirce CP 1.23, 1896). The first category describes the experience of real possibility, the quality of feeling, sensations, creative action of unfolding possibilities; the second category comprises the experience of action and reaction, the actual fact, the here and now; the third category presents the experience of habits, law, mediation, and betweenness, which emphasizes the relationship between the first and the second categories.
These three categories are interrelated and interwoven in the dynamicity of feelings, actions, and habits which permeates the world of experience. To illustrate the three modes of experience, think of an untouched field of green grass. The greenness of the field can be experienced as a quality that does not present any specific form, but allows many possibilities to emerge: new species of flowers to grow and new animals to appear. As pure possibility, the grass field can be perceived as an open-ended, but creative, space of quality that, for having “nothing,” has the possibility to become everything. Now, imagine that someone walks through this field and, by doing so, leaves a momentary mark on the grass that
fades away some hours later. This mark, although ephemeral, can be experienced as an existent, a reaction from the contact of the steps on the grass which is time and space dependent. Finally, imagine that many people walk regularly through this field, shaping a path on the grass that resists time and can guide others’ actions. This path creates a form, or habit, which mediates the possibilities of the untouched field and the constant physical contact of passers-by. As a habit, it has some lasting effects; however, it can be dissolved if passers-by stop walking through the green field. The three modes of experience comprise the meaning-making process characterized as semiosis or sign action. Immersed in thoughts that shape the agents’ beliefs and guide their actions, one’s cognition is a result of previous cognitions forming a collective and continuous flow of meaning-making processes of sign interpretation which takes place in space and time. By emphasizing the relationship intrinsic to the dynamicity of experience, the consideration of what is an object, a sign, or its meaning depends on their logical role in the semiotic chain. For instance, an apple can be the object of a sign if it is represented in a drawing; the same apple can be a sign of affection if someone gives it as a gift; furthermore, a woman who sells apples can be a sign which generates the idea of an apple in a person who knows her and sees her. According to Peirce, we only think in signs not because we represent the objects of perception as thoughts inside our minds or brains, but because we are part of semiosis, immersed in the ongoing process of habit change and habit formation. Semiotic agents are able to capture the dynamicity of reality through experience by being in a logical relationship with the object of consideration. As stated by Peircean pragmaticism: “Consider what effects, that might conceivably have practical bearings, we conceive the object of our conception to have.
Then our conception of these effects is the whole of our conception of the object” (Peirce, CP 5.402, 1878). In the meaning-making process, the acts of perceiving, reasoning, and acting are intimately connected, since “[t]he elements of every concept enter into logical thought at the gate of perception and make their exit at the gate of purposive action; and whatever cannot show its passports at both those two gates is to be arrested as unauthorized by reason” (Peirce CP 5.212, 1903). But what can happen when something unexpected is perceived? In the context of doubt, uncertainty, vagueness, incompleteness, surprise, or indecision, one might not habitually derive from perception and former beliefs the possible consequences of a given object. Some effort of thought must be undertaken: the perception of something strange triggers the process of reasoning, a self-controlled, deliberative, and self-corrective practice. As a first stage of scientific investigation, abduction is characterized as “the only kind of reasoning which introduces new ideas,” aiming at the generation of reasonable explanatory hypotheses in order to re-establish the state of belief. This chapter presents the semiotic structure of abduction by highlighting the role of semiosis in the elaboration of possible imaginary scenarios through the unveiling of clue-like signs in the conception of plausible explanations about the object of investigation. In the first section, Peirce’s characterization of the notion of sign is presented, highlighting the semiotic constitution of abduction as an Argument
(i.e., a sign of law). The second section presents the concept of abduction in the context of Peircean philosophy. The third section highlights, through illustrative examples drawn from the history of science, different aspects of the iconicity of abductive reasoning.
Semiosis and the Logical Signs
Symbols grow. They come into being by development out of other signs, particularly from icons, or from mixed signs partaking of the nature of icons and symbols. We think only in signs. (Peirce CP 2.302, 1894)
When diving into Peircean semiotics and the realm of meaning-making processes of sign action, one may start seeing, hearing, touching, and even smelling signs everywhere. But what is a sign? What is the relationship between sign, perception, and abduction? What is semiosis? Which kind of sign constitutes the semiosis of abduction? In simple terms, a sign is a relation between three elements which perform three different logical roles: the object of the sign, which determines the sign-vehicle; the sign-vehicle, which embodies the form of its object and generates the interpretant of the sign; and the interpretant, or a more developed sign, characterized as the possible effects of the sign-vehicle as determined by its object. The relationship among these three elements constitutes the irreducible process of semiosis, or “[ . . . ] an action, or influence, which is, or involves, a cooperation of three subjects, such as a sign, its object, and its interpretant, this tri-relative influence not being in any way resolvable into actions between pairs” (Peirce EP 2.411, 1907). In other words, a sign is a: [ . . . ] Cognizable that, on the one hand, is so determined (i.e., specialized, bestimmt) by something other than itself, called its Object, while, on the other hand, it so determines some actual or potential Mind, the determination whereof I term the Interpretant created by the Sign, that that Interpreting Mind is therein determined mediately by the Object. (CP 8.177, 1909)
Thus, a sign is a relation structured by three elements: the sign-vehicle or representamen, the object, and the interpretant. In this triad, the object directly determines a sign that embodies its form, and indirectly determines its interpretant through the forms conveyed by the sign-vehicle. The interplay among these three elements allows a sign, in a broad sense, to create a semiotic chain through the process of semiosis, i.e., the actions performed by signs that are able to generate interpretants. Note that the interpretant of the sign, or the possible effects of a sign, is different from the actual interpreter of a sign, understood as a human or non-human agent that learns from experience by perceiving, interpreting, and generating signs in the continuous process of semiosis. In this context, a sign can be characterized as a medium that communicates the available forms of an object to a possible interpretant: [ . . . ] a Sign may be defined as a Medium for the communication of a form [ . . . ] As a medium, the Sign is essentially in a triadic relation, to its Object which determines it, and
to its Interpretant which it determines [ . . . ] That which is communicated from the Object through the Sign to the Interpretant is a Form; that is to say, it is nothing like an existent, but is a power, is the fact that something would happen under certain conditions. (Peirce EP2: 544, 1909)
According to this definition, a sign is a medium that embodies the form available in the object to a possible interpretant, constituting a mutual relationship among its three parts. Here, the Peircean concept of Form can be interpreted as regularity, or habit, which allows the reasoner to interpret its functionality as indicative of a particular class of entities, events, facts, and processes: “[ . . . ] the form communicated from the object to the interpretant through the sign is not a thing or a particular shape of a thing, or something alike, but a regularity, a habit which allows a given system to interpret that form as indicative of a particular class of entities, processes, phenomena, and, thus, to answer to it in a similarly regular, lawful way” (El-Hani et al., 2009, p. 93). In the following diagram (Fig. 1), El-Hani et al. (2009) represent the Peircean notion of sign, highlighting the interrelation among its three irreducible elements. In sum, a sign is something that embodies the form of something else to another (possible or existent). Being a general definition, the concept of sign might seem very abstract, but upon some reflection, one may find signs everywhere. Try to think of something that stands in the place of something else and, by doing so, invites a given course of action. For instance, the picture of a friend, as something (an image) which stands in the place of something else (a friend), invites one to revive one’s memories of the friendship (a more developed sign). A huge, dark cloud in the sky can be interpreted as something which stands in the place of something else, the possibility of rain, and invites one to take an umbrella when leaving home. A traffic stop sign is something that stands in the place of something else, a traffic rule, and invites the driver to stop. Different kinds of signs embody different aspects of their object according to their representational ability.
Fig. 1 Diagram of the structure of the sign. (El-Hani et al., 2009, p. 94)
For example, consider the different ways in which a sign can represent the dynamic object fire: a picture, painting, or drawing can represent its object by resemblance, by embodying some visual similarity. This relationship of sign and object is iconic. An icon embodies the properties of its object (existent or not) according to its own capacity of representation. In Peirce’s words, an Icon is “[ . . . ] a sign which refers to the Object that it denotes by virtue of characters
of its own” (CP 2.247, 1903). The smoke resulting from the burning fire can be considered a sign of the object that, standing in the place of the fire, might indicate the presence of fire by the physical connection with its object. This relationship of sign and object is indexical. An Index is a sign “[ . . . ] which refers to the Object that it denotes by virtue of being really affected by that Object,” such as a footprint on the sand (Peirce CP 2.248, 1903). The word “fire” represents the object fire by convention; as a general term, “fire” prompts those who are familiar with the English language to think of fire. This relationship of sign and object is symbolic. A symbol “[ . . . ] is a sign which refers to the Object that it denotes by virtue of a law, usually an association of general ideas, which operates to cause the Symbol to be interpreted as referring to that Object” (Peirce CP 2.249, 1903). Peirce (EP2: 318, 1904) highlights that, although symbols are too general to say something about an existent singular object, they could carry icons which embody possible properties of an object and indexes which indicate the existence of an object: “[ . . . ] a symbol, if sufficiently complete, always involves an index, just as an index sufficiently complete always involves an icon.” In this context, symbols grow in meaning by unveiling the characteristics of a dynamic object through reasoning: “[i]t is the growth of symbols which allows for ideas to become explicit and thus influence the course of evolution of the world” (Stjernfelt, 2014, p. 298). By reasoning in signs, the meaning of general ideas grows, and new layers of reality might be discovered in its conceivable consequences. Peirce claims that a community of inquirers that applies the scientific method aiming to unveil the layers of experience might, in the long run, get closer to the truth of reality.
The author emphasizes that: The real, then, is that which, sooner or later, information and reasoning would finally result in, and which is therefore independent of the vagaries of me and you. Thus, the very origin of the conception of reality shows that this conception essentially involves the notion of a community without definite limits, and capable of a definite increase of knowledge. (CP 5.311, 1868 – author’s highlights excluded)
According to Peirce’s realism, a community of inquirers is capable of approaching a true answer to a genuine question by disclosing the possible consequences of a given conception. Truth, characterized as the end of inquiry, should be considered as an aim to be pursued through the development of the pragmatic maxim (see Misak [2001] 2004; Legg, 2014). Legg (2014, p. 206) stresses that: It is not ‘end’ in the sense of ‘finish’. It is ‘end’ in the teleological sense of ‘aim’ or ‘goal’. Rather than a description of some future time where all questions are settled, Peirce’s explication of truth is an idealised continuation of what scientists are doing now, namely settling questions about which they genuinely doubt.
Within his pragmaticism, Peirce (CP 4.536, 1906) develops a theory of meaning by distinguishing three kinds of interpretant: (i) Immediate Interpretant, which “is ordinarily called the meaning of the sign”; it is the newly generated, more developed sign that allows the continuity of semiosis through the unfolding of the possible implications that a given object may exert in one’s conduct; (ii) Dynamic Interpretant, which is the actual effect of a sign in an actual interpreter (being human
or non-human); and (iii) Final Interpretant “which refers to the manner in which the Sign tends to represent itself to be related to its Object.” Bellucci (2018, p. 315) explains that The immediate interpretant is the sign that a sign aims to produce; the dynamic interpretant is the sign that it actually produces; the normal [i.e. Final] interpretant is the sign that it ought to produce. [ . . . ] the normal interpretant is the “final” representation that sufficient scientific consideration of the sign ought to produce.
The richness of this characterization is that the interpretant of the sign is not only the generated more developed sign responsible for the continuity of the semiosis or the actual effect of the sign in a given interpreter; it also represents the sign-object-interpretant relationship which, if sufficiently developed, guides a community of inquirers towards the end of inquiry. It is possible to investigate the logical consequences of the sign-object relationship in the determination of its immediate interpretant in terms of possibility, existence, and law. For example, the isolated term “red” is a symbol that, in relation to its interpretant, is a sign of pure possibility: it represents a characteristic of a possible object, but it does not make any reference to the object it might describe (Peirce CP 2.250, 1903). If one says “something is red” without specifying what “something” refers to, the implied redness is simply a property that may be applied to a possible object. In relation to its interpretant, the relationship of sign and object is called Rhematic. Rhematic signs have their immediate interpretant as being implicit: the determination of a more developed sign is not present in the sign itself. Rhematic signs are the building blocks of propositions or Dicisigns. Roughly, a Dicisign is a sign that says something about something; it stands in the place of a fact and, as such, mirrors the structure of the fact it represents by being a sign that claims that the relationship sign-object holds (Peirce CP 2.320, 1903). Semiotically speaking, a Dicisign is characterized as a double sign constituted by a rhematic icon and a rhematic index structured by a syntax. The iconic part of a Dicisign embodies the possible properties that can be attributed to a possible object, being the description of the object (its predicate). The indexical part of the Dicisign indicates possible objects of attribution, being the reference of the object (its subject). Stjernfelt (2014, p.
56) explains that the “[ . . . ] doubleness of the Dicisign is what enables it to express truth: it is true in case the predicate actually does apply to the subject – which is the claim made by the Dicisign.” In other words, a Dicisign stands in the place of something other than itself, a fact, by mirroring the structure of the fact it represents in terms of icon-index juxtaposition. An example might be helpful here. Imagine you and your friend are having an evening walk when you see a shining dot moving in the night sky. Observing the object, you realize it is the International Space Station (ISS). You say to your friend, pointing to that object: “Look!” Your friend now has a part of the proposition, corresponding to the indexical reference to an object in the sky. However, she does not know how to recognize the ISS, so she asks: “What is it?” To complete the Dicisign, you offer a description of the effects of the ISS passing in the dark sky. The co-localization of your verbal description plus your finger pointing to the object in
the night sky structures a Dicisign by icon-index juxtaposition which generates an interpretant of existence that claims the truth of the fact it aims to represent. Dicisigns can be considered clue-like signs because they claim that the iconic description of its object is actually connected to the indexical representation of its object. While the index points to an existent object, its iconicity furnishes a superfluous amount of information that, upon reasoning, might be disclosed (cf. section “Imagining Possible Scenarios Through Iconic Abduction”). In other words, being an interpretant of existence, Dicisigns not only claim the actuality of the fact they represent by means of description and reference, they also offer an opportunity to the reasoner to unveil more information than is initially perceived. It is also possible to conceive rhematic icons and indexes as potential clue-like signs, or signs that might convey information about a given state of affairs, since “[a]ny Rheme, perhaps, will afford some information; but it is not interpreted as doing so” (Peirce CP 2.250, 1903). Whereas Dicisigns are structured by rhematic icons and rhematic indexes, Arguments are signs composed of two or more dicisigns structured by a logical principle. In relation to its interpretant, an Argument is a Sign of law (Peirce CP 2.252, 1903). Unlike rhemes and dicisigns, Arguments have their immediate interpretant explicit: its conclusion. While Dicisigns separately indicate their object (which they make explicit by their indexicality), Arguments separately indicate their conclusion, making explicit their immediate interpretant (see Bellucci, 2014; for a thorough account of Peirce’s speculative grammar, see Bellucci, 2018). Abduction, as an Argument, is characterized as a form of valid inference whose logical principle, the relationship between its premises and its conclusion, is iconic.
As a consequence of its iconicity, the conclusion of abduction, its explicit immediate interpretant, resembles aspects of the fact stated in its premises by its own representational abilities. The following section presents the concept of abduction in the broad scope of Peirce’s later works in order to introduce its role in the meaning-making process of generation and interpretation of signs.
Abductive Reasoning in the Context of Peircean Philosophy
[ . . . ] that all human knowledge, up to the highest flights of science, is but the development of our inborn animal instincts. (Peirce CP 2.754, 1883)
During the course of his work, Peirce outlined a variety of different, but not inconsistent, characterizations of the concept of abduction trying to develop a method of scientific investigation (Fann, 1970). In his later writings, Peirce recognizes that abduction constitutes a fundamental stage of inquiry for being the only kind of reasoning to introduce new ideas. He stresses that “[a]bduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value, and deduction merely evolves the necessary consequences of a pure hypothesis” (Peirce
CP 5.171, 1903). Fann (1970, p. 29) explains that “[ . . . ] according to the later view only abductions involve additions to the facts observed,” being intimately connected with the flow of perceiving and interpreting signs (Peirce CP 5.212, 1903). Note that, according to this characterization, induction only determines a value in order to test the explanatory power of an abductive hypothesis in its deduced consequences. As Minnameier (2004, p. 82) highlights, “[ . . . ] even in the simplest inductions we first abduce to some explanation that makes sense for us, and only then do we come to accept (or possibly reject) this explanation” (for a dynamic and recursive account of the interplay among the three kinds of reasoning, see Minnameier, 2004, 2017). In the meaning-making process of semiosis, the fine line between thinking in signs and reasoning through signs lies in the possibility of self-controlled and self-corrective reasoning. The perception of a strange fact, or a surprising perceptual judgment, prompts one to reason in order to explain an unusual observation and return to the state of belief. Perceptual Judgments are the elements of perception that, although fallible, cannot be denied at first glance. By abduction, in contrast to perceptual judgment, one doubts and questions the elements of one’s perception. However, there is no sharp line to demarcate the end of a perceptual judgment and the beginning of abduction: “[ . . . ] that abductive inference shades into perceptual judgment without any sharp line of demarcation between them” (Peirce CP 5.181, 1903). Imagine, for example, that you see a lion on the streets of a big city.
While your heartbeat increases, since you cannot doubt the sensation of the perception of a lion, you almost instantaneously question that observation: “Isn’t it strange to see a lion in the city?” From the moment of the unquestioned perception, characterized here by the notion of perceptual judgment, you enter the realm of abduction, looking for reasonable hypotheses that, if true, would explain the surprising observation. For instance, you formulate two hypotheses: (H1) the strange object is, in fact, a statue of a lion; and (H2) the strange object is a huge dog. Both hypotheses, if true, would explain the surprising observation. Note that the formulation of these hypotheses is only the first stage of inquiry: if you are curious enough to know whether one of your hypotheses is true, you have to test them inductively in their deduced consequences. Luckily, in this fictional scenario, the animal was a golden retriever wearing a lion mane, and the deduced consequences of H2 did not threaten your life. Furthermore, it is important to understand that the analytical description of scientific inquiry in terms of abduction, deduction, and induction might not depict the complexity of the actual processes of reasoning experienced by an agent (see, for instance, Anderson, 1986; Minnameier, 2004). In sum, the perceived surprising fact that shades into abductive reasoning constitutes the major premise in the syllogistic structure of the logical form of an abductive inference:
The surprising fact, C, is observed;
But if [the hypothesis] A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (Peirce CP 5.189, 1903)
Peirce continues: “Thus, A cannot be abductively inferred, or if you prefer the expression, cannot be abductively conjectured until its entire content is already present in the premise, ‘If A were true, C would be a matter of course’” (ibid.). The syllogistic form of abduction has prompted many objections in the literature, such as comparing abductive inference to the fallacy of affirming the consequent or calling the validity of abduction into question. Is abduction an inference or an insight? Is it a rational process or just blind guessing? How does one generate the hypothesis A if its content is already present in the minor premise? (For a detailed discussion of these objections, see Burks, 1946; Fann, 1970; Anderson, 1987; Hintikka, 1998; Paavola, 2004a, b.) Peirce (CP 6.525, 1901, highlights added) advocates the legitimacy of abduction as a valid form of argument, claiming that:
The first starting of a hypothesis and the entertaining of it, whether as a simple interrogation or with any degree of confidence, is an inferential step which I propose to call abduction. This will include a preference for any one hypothesis over others which would equally explain the facts, so long as this preference is not based upon any previous knowledge bearing upon the truth of the hypotheses [i.e., deduction], nor on any testing of any of the hypotheses, after having admitted them on probation [i.e., induction]. I call all such inference by the peculiar name, abduction, because its legitimacy depends upon altogether different principles from those of other kinds of inference.
In this context, Hintikka (1998, p. 512) explains that Peirce “[ . . . ] is going beyond rules of inference that depend on the premise-conclusion relation alone and is considering also rules or principles of inference ‘of an altogether different kind.’” In the semiosis of inquiry, the altogether different logical principle of abduction is iconic, i.e., the sign refers to its conclusion as resembling the truth of the fact stated in its premises, but not asserting it. The abductive conclusion is characterized as an explicit immediate interpretant or investigand, which might be perceived as an interrogation that calls for further inquiry (cf. section “Abduction as the Reasoning from Surprise to Inquiry,” see also Bellucci & Pietarinen, 2020). Hintikka (1998, p. 514) proposes the notion of strategy to account for the logical principle which underlies abduction, explaining that “[ . . . ] strategic rules of inquiry are justified by their propensity to lead the inquirer to new truths when [they are] consistently pursued as a general policy.” Paavola (2012, p. 60, author’s highlights; see also Paavola, 2004a) explains that “[ . . . ] this means that in strategies more than one step or move can and must be taken into account at the same time.” This notion of strategy might be compared to the notion of heuristic search, which describes problem-solving reasoning as a method of introducing reasonable hypotheses that may be true, in contrast to step-by-step reasoning or the trial-and-error method (Schickore, 2018). Abduction, understood as an ampliative form of reasoning based on strategic rules, allows the reasoner to imagine different scenarios which enable and constrain the formulation of reasonable kinds of hypotheses. Paavola (2004a) explains that “[ . . . ] in strategies the reasoner tries to anticipate the counterarguments, and to take into account all the relevant information, and this rules out very ‘wild’ hypotheses, except, when there is no other available.” According to this
framework, abduction can be characterized as a logic of discovery that underlies the generation of new ideas because one can describe reasons to suggest a hypothesis, i.e., one can work at a meta-level of analysis, describing the scientists’ reasons to suggest a certain kind of hypothesis in the first place (Hanson, 1965, p. 50; Paavola, 2004a). In Peirce’s development of a logic of discovery, he also addresses the concept of abduction as an act of insight, as a form of instinct, and as an outcome of the faculty of imagination (Pietarinen & Bellucci, 2016). Admitting that abduction is an inference (i.e., a “[ . . . ] controlled adoption of a belief as a consequence of other knowledge,” Peirce CP 2.442, 1893) and, at the same time, may appear as a flash of insight, Peirce (CP 5.181, 1903) states that:
The abductive suggestion comes to us like a flash. It is an act of insight, although of extremely fallible insight. It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation.
According to the above quotation, an abductive inference can appear as a flash of insight or an “aha-experience” in a momentary glimpse when a diversity of elements come together in a meaningful way. Note that the feeling of insight does not invalidate the logical inference; as Peirce (CP 7.190, 1901) emphasizes, “[ . . . ] the emotion is merely the instinctive indication of the logical situation.” The logical situation lies precisely in the attempt to exert control over signs by self-corrective reasoning grounded in previous cognitions that might be supported by some sort of instinctual abilities to guess right under uncertainty and vagueness. Peirce’s argument for an instinctual account of abduction is grounded in the hypothesis that humankind did not have enough time to proceed blindly or through trial-and-error attempts to guess right in order to consolidate science as we know it: Think of what trillions of trillions of hypotheses might be made of which one only is true; and yet after two or three or at the very most a dozen guesses, the physicist hits pretty nearly on the correct hypothesis. By chance [s]he would not have been likely to do so in the whole time that has elapsed since the earth was solidified. (CP 5.172, 1903)
Moreover, the author inquires: If there are other animals acting and thinking by instinct, why should it be denied to human beings? Peirce (1913, p. 464) answers this question by advocating that reason is nothing but part of our instinctive abilities, which he calls reasoning-power or ratiocination:
Reasoning-power, or Ratiocination [ . . . ] is the power of drawing inferences that tend toward the truth, when their premises or the virtual assertions from which they set out are true. I regard this power as the principal of human intellectual instincts; and in this statement I select the appellation “instinct” in order to profess my belief that the reasoning-power is related to human nature very much as the wonderful instincts of ants, wasps, etc., are related to their several natures.
By suggesting abduction as a different kind of inference (hypothesis-generating rather than truth-preserving), Peirce (CP 6.525, 1901, 1913) allows a logical approach to the process of the creative generation of reasonable explanations.
Abductive Argument guarantees its uberty (or fruitfulness) by providing new explanatory ideas that, if true, would explain the surprising fact, fueling the process of scientific investigation with creative imagination. When characterizing abduction as part of our instinctual power of reasoning that provides insightful explanations, “[ . . . ] it must be remembered that abduction, although it is very little hampered by logical rules, nevertheless is logical inference, asserting its conclusion only problematically or conjecturally, it is true, but nevertheless having a perfectly definite logical form” (Peirce CP 5.188, 1903). In this context, instinct is a necessary but not sufficient condition for the generation of explanatory hypotheses (Anderson, 1987, p. 160), because, “[f]or Peirce, the form of an abduction need only make a hypothesis render a conclusion unsurprising; it is a possibilistic inference whose test is in futuro and it does not give us only one answer to each problem or a set of premises” (ibid, p. 162). Thus, the conclusion of abduction does not depend on the preservation of the truth of the fact stated in its premises, but on its ability to provide fruitful conditions of possibility that support scientific investigation. In sum, the ampliative, fruitful aspect of abduction is not to be understood as an element of psychology, but as a logical principle that permeates scientific work in the conception of reasonable hypotheses or even the generation of “wild hypotheses,” when there are no evident initial clues to explain a given surprising fact. What follows presents a detailed account of the semiosis of abduction, emphasizing the human ability to imagine possible scenarios by gathering clue-like signs (i.e., Dicisigns as the building blocks of Arguments).
Imagining Possible Scenarios Through Iconic Abduction
As presented, Peirce holds that all thinking “is performed upon signs of some kind or other, either imagined or actually perceived.” According to the author, “[t]he best thinking [ . . . ] is done by experimenting in the imagination upon a diagram or other scheme, and it facilitates the thought to have it before one’s eyes” (Peirce NEM 1, 122, 1976). Although Peirce stresses that every reasoning has diagrammatic elements and “[ . . . ] diagrammatic reasoning is the only fertile reasoning” (cf. Peirce CP 4.571, 1906), only recently have more scholars started to consider the fruitfulness of diagrammatic reasoning in the context of abduction (see Paavola, 2011; Pietarinen & Bellucci, 2016; Bellucci & Pietarinen, 2020). In general terms, a diagram is predominantly an iconic sign: it embodies the intelligible relations between the parts of the object it resembles by its own characters. The self-sufficient aspect of diagrams provides a sort of autonomy to the sign which does not impose restrictions on the flow of imagination, allowing the reasoner to create, observe, experiment upon, and manipulate the iconic instance of the object in order to glimpse possible scenarios, making explicit relations that were formerly implicit. Stjernfelt (2007, p. 91) explains that iconic signs can be “[ . . . ] characterized by containing implicit information which in order to appear
must be made explicit by some more or less complicated procedure accompanied by observation.” Although a diagram is predominantly conceived as an iconic sign, it cannot be characterized as a pure icon, but as a sign of mixed character: its indexicality might indicate the existence of the object it represents, as a map is a diagram of the territory; its symbolic character offers a general signification, in that the diagram synthetizes the elements of a general conception whose object is represented as an icon:
A diagram, indeed, so far as it has a general signification, is not a pure icon; but in the middle part of our reasonings we forget that abstractness in great measure, and the diagram is for us the very thing. So in contemplating a painting, there is a moment when we lose the consciousness that it is not the thing, the distinction of the real and the copy disappears, and it is for the moment a pure dream — not any particular existence, and yet not general. At that moment we are contemplating an icon. (Peirce EP 1: 226, 1885)
When contemplating a diagram, an agent might forget, even for a short while, that the diagram stands in the place of the object it represents. The iconic character of a diagram allows the reasoner to derive new properties of the object being represented, when she is able to perceive “[ . . . ] a relation between parts of that diagram that had not entered into the design of its construction” (NEM IV: 353, 893; NEM IV: 275–276, c. 1895). Peirce explains that “[ . . . ] a very extraordinary feature of Diagrams is that they show [ . . . ] that a consequence does follow, and more marvelous yet, that it would follow under all varieties of circumstances accompanying the premises” (Peirce NEM IV 317–318, 1909). Through reasoning with diagrams, an agent might perceptually experience a general concept, or symbol, by contemplating a representation of the concept which synthetizes the relevant elements or properties of the represented concept. Pietarinen and Bellucci (2016, p. 474 – authors’ highlight) emphasize that “[t]he icon-imagination, and the iconic-imaginative moment in reasoning depend on the possibility of directing the construction of a perceptual experience.” In sum, the iconicity of abductive reasoning fuels the faculty of imagination, where the elaboration, observation, and experimentation upon diagrams play a fundamental role in deriving new properties of the object under scrutiny (Campos, 2009; Paavola, 2011; Pietarinen & Bellucci, 2016). In the following, the conceptual characterization of the iconic aspect of abduction is highlighted by presenting examples drawn from the history of science to emphasize the role of diagrams in the semiotic process of imagining possible hypothetical scenarios as strategic steps to explain surprising phenomena.
Abduction, Insight, and Diagrams
But look! What was that? One of the snakes had seized hold of its own tail, and the form whirled mockingly before my eyes. As if by a flash of lightning I awoke; and this time also I spent the rest of the night in working out the consequences of the hypothesis. (Kekulé [1890] 1958, p. 22)
In his dream, Kekulé ([1890] 1958) contemplated before his mind’s eye a diagram, a sign which embodied a possible structure that, if true, would explain the relationship among the parts of the object under his scrutiny: the bonding structure of hydrogens and carbons in a molecule of benzene. By a sudden insight Kekulé (ibid., p. 22) put together pieces of a jigsaw, observing through his mental eye atoms moving in a specific way which formed a structure that he “had never been able to discern” before. Kekulé’s flash of insight was not a simple dream; it was the generation of an abductive hypothesis. As a chemist, he knew that the relationship among the atoms that constitute the molecule of benzene was fundamental to explaining the behavior of the chemical compound. His hypothesis was generated not by systematic step-by-step attempts that considered possible configurations of carbon and hydrogen atoms in an open structure. He conceived a new plausible form of configuration: from an open relationship to a closed ring structure with alternating single and double bonds (see, for instance, Boden, 1994; Carey, 2019). In this example, Kekulé perceived the whirled form before his eyes as an icon of a symbol: the diagram of a circling snake constituted the iconic part of the newly generated hypothesis by disclosing possible properties of the concept of benzene which were formerly implicit. The imaginative character of diagrammatic reasoning was possible because the iconic part of the diagram “[ . . . ] represents other information not explicitly contained in that signification” (Pietarinen & Bellucci, 2016, p. 473). In other words, the perceptual experience is brought about by the iconic representation of the symbol, which allows new information to be derived.
In sum, a diagram (a perceptual experience) embodies a symbol (a concept) and, by not being the concept (by standing in the place of the concept), the iconic feature of the diagram allows the unveiling of possible properties that were implicit in the symbol. In this sense, Pietarinen and Bellucci (2016, p. 474) explain that by reasoning through diagrams “[ . . . ] the act of imagination instantiates the concept: it offers to the senses an empirical image that embodies the concept.” In Kekulé’s example, reasoning through diagrams provided the possibility to imagine different scenarios, which allowed the conception of a closed moving structure that, in turn, contributed to the continuation of semiosis by providing reasons for further inquiry as he became one of the founders of the structural theory of organic chemistry (Kekulé [1858] 1958). The next section explores the syllogistic form of abduction as a diagram, highlighting its iconic logical principle.
From Premises to Conclusion: The Iconic Logical Principle of Abduction
Day and night I was haunted by the image of Kolletschka’s disease and was forced to recognize, ever more decisively, that the disease from which Kolletschka died was identical to that from which so many maternity patients died. (Semmelweis, 1983, p. 88)
Semmelweis was intrigued by the unknown cause of childbed fever, a disease which led many women and newborn babies to death in the maternity hospital. He tried to find a plausible explanation in order to prevent new deaths but was unsuccessful until he realized the striking similarity between the symptoms of childbed fever and the symptoms of blood poisoning which killed his friend, Kolletschka. Semmelweis conjectured that if something, such as a cadaveric agent, were a common cause of both diseases, it would explain the similarity of the symptoms and indicate a plausible solution to prevent new deaths (for a detailed account, see Semmelweis, 1983; for an abductive account of Semmelweis’ discovery, see Paavola, 2006). This example illustrates the iconic character of abduction as an Originary Argument: [ . . . ] which presents facts in its Premise which present a similarity to the fact stated in the Conclusion, but which could perfectly well be true without the latter being so, much more without its being recognized; so that we are not led to assert the Conclusion positively but are only inclined toward admitting it as representing a fact of which the facts of the Premise constitute an Icon. (Peirce CP 2.96, 1902)
In the simplified version of Semmelweis’ example, the premises of abductive reasoning present two facts: Fact 1, childbed fever and its symptoms M1, M2, M3; and Fact 2, blood poisoning and its symptoms M1, M2, M3. By creating a diagram upon which Semmelweis could experiment with possible relationships involved in these two events, he noticed that their properties or symptoms could be seen as similar. In other words, he delineated the relevant aspects of the constellations of similar events gathered from previous perceptual judgements “[ . . . ] omitting all that was accidental, retaining all that was essential, observing suggestive relations between the parts of the diagram, performing diverse experiments upon it [ . . . ]” (Peirce W. 8, p. 290, 1985; Peirce is describing Kepler’s abduction). The conjecture of a possible cadaveric agent can be considered an idea or hypothesis concerning real things (Facts 1 and 2) that went beyond what was given in Semmelweis’ perception. His provisional abductive conclusion introduced a new, more developed sign which synthetizes the range of properties he was trying to understand by embodying the iconic properties of the facts stated in the premises. Although Semmelweis’ hypothesis of the existence of a cadaveric agent was rejected in his lifetime, it gave rise to germ theory, sanitation, and hand hygiene, nourishing the continuity of semiosis towards an ideal end of inquiry. Table 1 summarizes the semiotic approach to a syllogistic structure of abductive reasoning. In sum, as a fruitful kind of inference, abduction can be characterized as an ampliative form of reasoning whose conclusion is a hypothetical statement, a question, a hint, or a guess, inferred from scattered pieces of information which serve as clues for generating possible explanations that resemble the truth of the surprising fact stated in its premises.
Table 1 Summarization of the semiotic approach to a syllogistic structure of abduction

Logical principle: Premises (first premise)
  Syllogistic abduction: The surprising fact, C, is observed
  Iconic semiosis: Recognition of surprising fact(s) through the observation of Perceptual Judgments that shade into abductive reasoning
  Semmelweis’ abduction: Surprising Fact 1: M1, M2, M3 are symptoms of childbed fever. Complementary Fact 2: M1, M2, M3 are symptoms of blood poisoning

Logical principle: Premises (second premise)
  Syllogistic abduction: But if A were true, C would be a matter of course
  Iconic semiosis: Based on the perceived clue-like signs, it is suggested a kind of hypothesis A that, if true, would explain the surprising fact C
  Semmelweis’ abduction: If an unknown cadaveric agent (hypothesis A) would cause childbed fever, C (Fact 1) would be explainable

Logical principle: Conclusion
  Syllogistic abduction: Hence, there is reason to suspect that A is true
  Iconic semiosis: There are reasons to investigate if A is true, because the consequences of the hypothesis A resemble the properties of Fact 1
  Semmelweis’ abduction: Hence, there are reasons to suspect that a cadaveric agent (hypothesis A) might be the cause of childbed fever
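As a rough illustration only, the schema summarized in Table 1 can be sketched in a few lines of Python. The sketch assumes that a surprising fact can be modeled as a set of symptom labels and that a hypothesis can be modeled by the consequences it would make "a matter of course"; the names `Hypothesis`, `abduce`, and `as_question`, as well as the rival hypothesis, are invented for this example and do not come from Peirce or from the chapter. It is not meant as a model of abductive cognition, only as a way of making visible that the conclusion is a question to pursue rather than an assertion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A candidate explanation (the 'A' of Peirce's schema)."""
    statement: str
    # Consequences that would be "a matter of course" if the hypothesis were true.
    expected: frozenset


def abduce(surprising_fact, hypotheses):
    """Keep the hypotheses under which the surprising fact C would be expected.

    Nothing is asserted as true here: the result is only a list of
    candidates that merit further inquiry.
    """
    return [h for h in hypotheses if surprising_fact <= h.expected]


def as_question(hypothesis):
    """Phrase the abductive conclusion in the interrogative mood."""
    return f"Is it the case that {hypothesis.statement}?"


# Fact 1 and Fact 2 from Table 1; M1..M3 are the chapter's symptom placeholders.
childbed_fever = frozenset({"M1", "M2", "M3"})
blood_poisoning = frozenset({"M1", "M2", "M3"})

# The resemblance between the two facts: the symptoms they share.
shared_symptoms = childbed_fever & blood_poisoning

candidates = [
    Hypothesis("a cadaveric agent causes childbed fever",
               frozenset({"M1", "M2", "M3"})),
    # Hypothetical rival, invented for contrast; it would account for M1 only.
    Hypothesis("an invented rival cause is at work", frozenset({"M1"})),
]

suspects = abduce(childbed_fever, candidates)
questions = [as_question(h) for h in suspects]

for q in questions:
    print(q)
```

The filter keeps a hypothesis only when every observed symptom falls among its expected consequences, which mirrors the second premise ("if A were true, C would be a matter of course"). Deciding whether a surviving hypothesis is actually true is left to the deductive and inductive stages described in the chapter.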
Abduction as the Reasoning from Surprise to Inquiry
It is a strange fact, characteristic of the incomplete state of our present knowledge, that totally opposing conclusions are drawn about prehistoric conditions on our planet, depending on whether the problem is approached from the biological or the geophysical viewpoint. (Wegener, 1966, p. 5)
In the quote above, Wegener inquired whether there would be an explanation that, if true, would unify these opposing conclusions in a coherent manner by turning them into clues that indicate a possible course of inquiry. Wegener thought so. While contemplating the map of the world, Wegener noticed the congruence in the shape of the coastlines of the continents and inferred that the continents might have drifted away from each other by moving horizontally, as if they were floating ice in the ocean (Wegener, 1966, p. 37). Although the continental drift hypothesis had been considered by others before him, no scientist had committed to undertaking further research on the topic. This example illustrates that a fruitful abductive hypothesis does not necessarily have to be something completely new in order to provide reasonable explanations to account for surprising phenomena (Anderson, 1987; Paavola, 2004a). The surprising fact noticed by Wegener was the opposing conclusions about the plausibility of the continental drift hypothesis according to different sources of evidence. When stumbling accidentally upon a paleontological report, he started having reasons to suspect that the idea of continental drift could be a reasonable hypothesis worthy of further investigation: “[the] fundamental soundness of the idea took root in my mind.” The surprising congruence of the continents’ coastlines
along with new paleontological, geological, and geophysical evidence provided by several observed facts called for a reasonable explanation. Could the hypothesis according to which the continents are slowly moving horizontally be that explanation? To undertake his investigation, Wegener performed a series of question-answer games, searching for clues which could make the current shape of oceans and continents explainable. Through abduction, Wegener tried to imagine what was not there, exploring a general conception, the continental drift hypothesis, whose conceivable consequences, if true, would synthetize the collected evidence from the past and furnish ways to be tested in the future. Winther (2020, p. 163, author’s highlights) explains that “Wegener’s drift hypothesis required him to move dialectically between visualizations of Earth’s surface today and yesterday, acting as if certain geological, meteorological, and ecological features had to be different in the distant past. He imagined multiple worlds.” Wegener (1966, pp. 76–77, highlights added) stresses that:
It is just as if we were to refit the torn pieces of a newspaper by matching their edges and then check whether the lines of print run smoothly across. If they do, there is nothing left but to conclude that the pieces were in fact joined in this way.
By elaborating, experimenting, manipulating, and contemplating diagrams as cartographic maps, Wegener could conceive movement in static maps and draw possible scenarios from his faculty of imagination (see Fig. 2). He hypothesized that the continents were one big piece of land that started to drift apart long ago. This example illustrates Peirce’s later characterization of scientific investigation, which approximates the idea of abduction to the idea of interrogation, evincing the iconic semiosis of abduction by presenting the role of diagrams, as literal cartographic maps, in scientific thinking. Whereas abduction starts from the perception of a surprising fact, its conclusion should be considered in an interrogative mode, as a hypothesis worthy of further investigation (Peirce CP 6.528, 1901). Bellucci and Pietarinen (2020) cite the following passage, where Peirce characterizes abduction as reasoning from surprise to inquiry, emphasizing its iconic aspect:
The whole operation of reasoning begins with Abduction, which is now to be described. Its occasion is a surprise. That is, some belief, active or passive, formulated or unformulated, has just been broken up. It may be in real experience or it may equally be in pure mathematics, which has its marvels, as nature has. The mind seeks to bring the facts, as modified by the new discovery, into order; that is, to form a general conception embracing them. In some cases, it does this by an act of generalization. In other cases, no new law is suggested, but only a peculiar state of facts that will “explain” the surprising phenomenon; and a law already known is recognized as applicable to the suggested hypothesis, so that the phenomenon, under that assumption, would not be surprising, but quite likely, or even would be a necessary result. This synthesis suggesting a new conception or hypothesis, is the Abduction. It is recognized that the phenomena are like, i.e.
constitute an Icon of, a replica of a general conception, or Symbol. (Peirce EP 2:286, 1903 highlights added; for a detailed analysis see Bellucci & Pietarinen, 2020)
The novelty of this characterization is that the conclusion of abductive reasoning is conceived as an interrogation or investigand:
Fig. 2 Reconstruction of the map of the world for three periods according to the displacement theory (Wegener, 1929, p. 13 – German version is in public domain)
The conclusion is drawn in the interrogative mood (there is such a mood in Speculative Grammar, whether it occur in any human language or not). This conclusion, which is the Interpretant of the Abduction, represents the Abduction to be a Symbol, — to convey a general concept of the truth, —but not to assert it in any measure [ . . . ]. (Peirce EP 2:287, 1903)
According to this characterization, the conclusion of abduction can be understood as a Symbolic Dicisign in the interrogative mood: as a sign which embodies the iconic properties of the facts stated in its premises but does not assert the truth of the fact it expresses. In other words, the hypothetical conception of the truth, stated in the conclusion of abduction, might synthetize the constellations of apparently disconnected facts by resembling the iconic properties of the facts stated in its premises. The abductive conclusion, or the meaning of the Symbolic Dicisign, should be understood as an interrogation: “[ . . . ] what is such an interrogation but first, a sense that we do not know something; second a desire to know it; and third an effort — implying a willingness to labor — for the sake of seeing how the truth may really be. If that interrogation inspires you, you will be sure to examine the instances; while if it does not, you will pass them by without attention” (Peirce EP 2.48, 1898).
It might be helpful to return to Wegener’s example to clarify the given approach to abduction. In its simplified version, his abductive argument can be presented as constituted by three premises as sources of clue-like signs extracted from: (P1) paleontological clues that consisted in fossils of fauna and flora showing a biogeographic map of identical species which were not capable of living in salt water or of travelling that far across the oceans; (P2) glaciation clues constituted by ice traces in distant continents which present similar sedimentary rocks probably formed in the same place and geological time; (P3) geophysical clues that were collected by the measurement of the surface of the earth and the floor of the ocean, which showed different densities of materials, indicating that some parts of the earth are younger than others (Wegener 1999; see also Winther, 2020). To make sense of these constellations of clues and evidence (i.e., by perceiving and gathering Dicisigns which claim the truth of the icon-index colocalization), Wegener imagined a general idea: that the continents were not static but might have been moving horizontally, drifting apart, or colliding. His general idea formed a conception that synthetizes the properties of the collected evidence by embodying the relevant aspects of the facts stated in the premises. Although Wegener could not determine which mechanisms were responsible for continental drift, his research was worthy of further investigation, even though it was not recognized as reasonable in his lifetime. His hypothesis acquired new meaning through the development of measurement technologies and newly collected data, and it was confirmed thanks to Tharp’s creative ability in mapping the ocean floor. By generating, observing, and experimenting upon diagrams, Tharp hypothesized the existence of a valley in the Mid-Atlantic ridge which, if true, would confirm Wegener’s continental drift hypothesis.
In an expedition, Jacques Cousteau filmed the valley, confirming Tharp’s hypothesis, which implied Wegener’s continental drift theory (Tharp, 1999). This example illustrates the growth of signs through collective, aim-directed, self-corrective, and self-controlled investigation in which the strategic aspect of abduction enables and constrains the formulation of reasonable hypotheses that would explain a genuine doubt and guide the community towards an ideal end of inquiry. In sum, abduction is characterized as an ampliative form of argument which starts from the perception of a surprising fact and concludes provisionally with a plausible explanatory hypothesis that calls for further inquiry. In Peirce’s words, abduction is considered a “Reasoning from Surprise to Inquiry” (Peirce to Welby, July 16, 1905, RL 463, see “Manuscripts”).
Conclusions
This chapter has presented a Semiotic account of abduction as a logic that underlies the conception of new ideas. In contrast to a static, truth-preserving, or trial-and-error account of hypothesis generation, Peirce’s semiotics offers a processual, dynamic, and collective approach to scientific investigation which involves the interplay between abduction, deduction, and induction. Whereas abductive reasoning
introduces new ideas by suggesting reasonable explanatory hypotheses, the role of deductive reasoning is to draw the conceivable consequences of a hypothesis given by abduction, to be evaluated by induction. The role of induction is to test, in the long run, the explanatory power of a hypothesis in its deduced predictions. The scientific method of investigation is a lively activity which enables symbols to grow by unfolding the properties of the object under consideration in a continuous process of semiosis. In this context, abduction is characterized as an Argument that presents an iconic logical principle; its conclusion embodies the properties expressed in the fact stated in its premises. Being considered in the interrogative mode, the conclusion of abduction presents a plausible hypothesis that prompts researchers to pursue a course of inquiry. The semiotic process of scientific investigation is like gradually peeling the layers of an onion, unveiling the properties of an object by gathering clue-like signs, filling in the gaps, and creating reasonable hypotheses that direct further investigation. Kekulé’s benzene chemical structure, Semmelweis’ cadaveric agent, and Wegener’s continental drift hypothesis are illustrations of the semiotic process in which symbols grow. In sum, through reasoning in signs, a community of agents that shares the scientific method can approach reality in the dynamicity of experience towards an ideal end of inquiry where the pragmatic maxim is accomplished.
Acknowledgments This work has been made possible by funds provided by the Sao Paulo Research Foundation, FAPESP (Project number 2020/03134-1).
References
Anderson, D. R. (1986). The evolution of Peirce’s concept of abduction. Transactions of the Charles S. Peirce Society, 22(2), 145–164. http://www.jstor.org/stable/40320131
Anderson, D. R. (1987). Creativity and the philosophy of C. S. Peirce. Martinus Nijhoff.
Bellucci, F. (2014). “Logic, considered as semeiotic”: On Peirce’s philosophy of logic. Transactions of the Charles S. Peirce Society, 50(4), 523–547. https://doi.org/10.2979/trancharpeirsoc.50.4.523
Bellucci, F. (2018). Peirce’s speculative grammar: Logic as semiotics. Routledge: Taylor & Francis.
Bellucci, F., & Pietarinen, A.-V. (2020). Icons, interrogations, and graphs: On Peirce’s integrated notion of abduction. Transactions of the Charles S. Peirce Society, 56(1), 43–61. https://doi.org/10.2979/trancharpeirsoc.56.1.03
Boden, M. (1994). What is creativity? In Dimensions of creativity. MIT Press.
Burks, A. W. (1946). Peirce’s theory of abduction. Philosophy of Science, 13(4), 301–306. http://www.jstor.org/stable/185210
Campos, D. G. (2009). Imagination, concentration, and generalization: Peirce on the reasoning abilities of the mathematician. Transactions of the Charles S. Peirce Society, 45(2), 135–156. https://doi.org/10.2979/tra.2009.45.2.135
Carey, F. A. Benzene. Encyclopedia Britannica, 11 Oct 2019. https://www.britannica.com/science/benzene. Accessed 13 Dec 2021.
Colapietro, V. (2003). The space of signs: C.S. Peirce’s critique of psychologism. In D. Jacquette (Ed.), Philosophy, psychology, and psychologism (Philosophical studies series) (Vol. 91). Springer. https://doi.org/10.1007/0-306-48134-0_7
El-Hani, C. N., Queiroz, J., & Emmeche, C. (2009). Genes, information, and semiosis (Tartu semiotics library) (Vol. 8). Tartu University Press.
Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Hanson, N. (1965). The idea of a logic of discovery. Dialogue, 4(1), 48–61. https://doi.org/10.1017/S0012217300033291
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3). http://www.jstor.org/stable/40320712
Kekulé, A. ([1858] 1958). August Kekulé and the birth of the structural theory of organic chemistry in 1858. Translated by O. Benfey. Journal of Chemical Education, 35(1).
Legg, C. (2014). Charles Peirce’s limit concept of truth. Philosophy Compass, 9, 204–213. https://doi.org/10.1111/phc3.12114
Minnameier, G. (2004). Peirce-suit of truth – Why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60, 75–105.
Minnameier, G. (2017). Forms of abduction and an inferential taxonomy. In Magnani & Bertolotti (Eds.), Handbook of model-based science (pp. 175–195). Springer.
Misak, C. ([2001] 2004). Truth and the end of inquiry: A Peircean account of truth. Oxford University Press.
Paavola, S. (2004a). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9, 267–283.
Paavola, S. (2004b). Abduction through grammar, critic, and methodeutic. Transactions of the Charles S. Peirce Society, 40(2), 245–270.
Paavola, S. (2006). Hansonian and Harmanian abduction as models of discovery. International Studies in the Philosophy of Science, 20(1), 93–108.
Paavola, S. (2011). Diagrams, iconicity, and abductive discovery. Semiotica, 186(4), 297–314. https://doi.org/10.1515/semi.2011.057
Paavola, S. (2012). On the origin of ideas: An abductivist approach to discovery (Revised and enlarged edition). Lap Lambert Academic Publishing.
Paavola, S. (2014). Fibers of abduction. In T. Thellefsen & B. Sorensen (Eds.), Charles Sanders Peirce in his own words: 100 years of semiotics, communication and cognition (pp. 365–372). Walter de Gruyter.
Peirce, C. S. (1913). An essay toward improving our reasoning in security and uberty. In The Peirce Edition Project (Eds.), The essential Peirce: Selected philosophical writings. Indiana University, 1998, 2:463–474.
Peirce, C. S. (1958). In C. Hartshorne, P. Weiss (Eds.), The collected papers of Charles Sanders Peirce (Electronic edition. Vols. I–VI, 1931–1935. Vols. VII–VIII) (A. W. Burks, Ed.). Intelex Corporation/Harvard University Press. [Quoted as CP, followed by the volume and paragraph].
Peirce, C. S. (1967). “Manuscripts” in the Houghton Library of Harvard University, as identified by Richard Robin, “Annotated catalogue of the papers of Charles S. Peirce”. University of Massachusetts Press, and in “The Peirce papers: A supplementary catalogue”, Transactions of the C. S. Peirce Society 7(1971): 37–57. Cited as MS followed by manuscript number and, when available, page number. https://archive.org/details/annotatedcatalog0000robi/page/n7/mode/2up
Peirce, C. S. (1976). In C. Eisele (Ed.), The new elements of mathematics by Charles S. Peirce, 4 vols. Mouton. Cited as NEM followed by volume and page number.
Peirce, C. S. (1982). In E. Moore, C. J. W. Kloesel, et al. (Eds.), Writings of Charles S. Peirce: A chronological edition (Vol. 8). Indiana University Press. Cited as W followed by volume and page number.
Peirce, C. S. (1992). In N. Houser & C. Kloesel (Eds.), The essential Peirce: Selected philosophical writings (Vol. 1 (1867–1893)). Indiana University Press. https://doi.org/10.2307/j.ctvpwhg1z
Peirce, C. S. (1998). In Peirce Edition Project (Eds.), The essential Peirce: Selected philosophical writings (Vol. 2 (1893–1913)). Indiana University Press.
Pietarinen, A. V., & Bellucci, F. (2016). The iconic moment. Towards a Peircean theory of diagrammatic imagination. In J. Redmond, O. Pombo Martins, & Á. Nepomuceno Fernández (Eds.), Epistemology, knowledge and the impact of interaction. Logic, epistemology, and the unity of science (Vol. 38). Springer. https://doi.org/10.1007/978-3-319-26506-3_21
Schickore, J. (2018). Scientific discovery. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Summer 2018 Edition). https://plato.stanford.edu/archives/sum2018/entries/scientific-discovery/
Semmelweis, I. ([1861] 1983). The etiology, concept, and prophylaxis of childbed fever (K. Codell Carter, Trans.). University of Wisconsin Press.
Stjernfelt, F. (2007). Diagrammatology: An investigation on the borderlines of phenomenology, ontology, and semiotics. Springer.
Stjernfelt, F. (2014). Natural propositions. The actuality of Peirce’s doctrine of Dicisigns. Docent Press.
Tharp, M. (1999). Connect the dots: Mapping the seafloor and discovering the mid-ocean ridge. In L. Lippsett (Ed.), Lamont-Doherty Earth Observatory of Columbia twelve perspectives on the first fifty years 1949–1999. https://www.whoi.edu/who-we-are/about-us/people/awards-recognition/mary-sears-women-pioneers-in-oceanography-award/award-recipients/marie-tharp-biography/. Accessed Aug 2021.
Wegener, A. (1929). Die Entstehung der Kontinente und Ozeane (4th ed.). Friedrich Vieweg & Sohn.
Wegener, A. (1966). The origin of continents and oceans. Translated from the 4th edition by John Biram. Dover Publications.
Winther, R. G. (2020). When maps become the world. University of Chicago Press. https://doi.org/10.7208/chicago/9780226674865.001.0001
4 Abduction as a Logic of Discovery
Sami Paavola
Contents
Introduction
A Short History of Abduction as a Logic of Discovery
    Peirce’s Formulations
    After Peirce
Logic of Pursuit or Logic of Discovery or both?
    Logic of Pursuit (without Logic of Discovery)
    Abduction as a Logic of Discovery
Conclusion
References
S. Paavola
Faculty of Educational Sciences, University of Helsinki, Helsinki, Finland
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_9

Abstract
The chapter analyzes the evolution of conceptions of abduction as a logic of discovery. Peirce’s original texts have provided the means for various, overlapping interpretations. N. R. Hanson’s texts were crucial in defending and developing the idea that abduction can be interpreted as a logic of discovery. Many commentators since Hanson have been more hesitant about this and more in line with the logical empiricists, maintaining that discovery is not an area for logic and justification. Since Hanson, it has been typical to discern a middle area in between discovery and justification, that is, the context of pursuit. Many commentators have maintained that abduction is the logic for the context of pursuit, and cannot be the logic for the context of discovery, the latter being an area for psychology or heuristics but not for logic. Others maintain that abduction can be a logic of discovery, which means a rational reconstruction of the heuristic processes and dynamics involved in discovery. The chapter presents arguments both for the “logic of pursuit without discovery” interpretation and for various ways of developing abduction as a logic of discovery, and it supports the potential of interpreting abduction as a logic of discovery. Elements for these formulations come from Peirce and Hanson, according to whom abduction is a “weak” form of inference close to guessing, is based on the use of clue-like signs and on iconic relationships, and is aimed at reconstructing dynamic inquiry and search processes of discovery.

Keywords
Abduction · Logic of discovery · Logic of pursuit · Peirce · Hanson
Introduction
The “logic of discovery” is a controversial topic in the philosophy of science. The term has frequently been used in the literature, though often in arguments that discovery cannot be analyzed with the tools of philosophy of science. Abduction is clearly one candidate for forming a basis for the logic of discovery. It has by no means been the only candidate, as there are other, often much older methodological models suggested for this role, like inductivism, the method of analysis and synthesis, and the interrogative model of inquiry. These alternatives are not necessarily separate from each other, and there are clear overlaps and connections between them and with abduction (see Hintikka, 1998; Niiniluoto, 1999).

Abduction as a logic of discovery seems to involve a dilemma similar to many other discussions of the logic of discovery. If abduction is interpreted as a central part of discovery, it seems to be a matter of something other than logic, for example, an instinct, perception, or guessing. If abduction is interpreted more strictly as logic, then it seems that it is no longer a logic of discovery but something else, like a logic of prior appraisal of already discovered hypotheses.

The controversies surrounding abduction as a logic of discovery are also connected with interpretations of “logic” or “logic of discovery” (Aliseda, 2006; Cellucci, 2020). Logic can refer to algorithmic or formal systems or syllogistic reasoning. More strongly, many critics of the “logic of discovery” have interpreted it to mean an algorithmic method that could mechanically produce creative solutions to problems (which is then seen to be impossible), but those who defend the logic of discovery interpret it differently (Cellucci, 2020, p. 870). Already in Peirce’s own work, abduction provided the means for broader interpretations of logic (Fann, 1970).
One challenge is that there are actually many interpretations of the basic meaning of abduction which have provided various conceptions of abduction as a logic of discovery. It is fair to say that the mainstream interpretation within twentieth-century philosophy of science was that there could not be any logic of discovery in any
logical sense. This view was in line with the hypothetico-deductive (HD) model of inquiry (Hempel, 1966). On this view, discovery is a matter of such things as happy guesses, coincidences, or creativity, which can be treated within the empirical sciences (like the psychology of creativity), or perhaps with heuristic rules of thumb, but not as logic or reasoning. This HD view was often contrasted with traditional inductivism, and it was argued that discovery involves qualitative and conceptual changes which require creativity that cannot be explained by inductive methods. Quite firmly held conceptions within the philosophy of science have been suspicious of the “logic of discovery.”

The mainstream interpretation of the logic of discovery (which was an “anti logic of discovery,” i.e., the viewpoint that discovery cannot involve reasoning or logic) was challenged bit by bit during twentieth-century philosophy of science. A big influence on these challenges was a novel interest in Peirce’s philosophy and in various interpretations of abduction. Abduction has been applied and developed in many other research fields besides philosophy of science, such as AI research, cognitive science, medical science, and the methodologies of various fields of empirical research (see Paavola, 2012, pp. 31–45). These fields have been less influenced by discussions of whether a logic of discovery is possible.

In this chapter, I discuss how abduction is interpreted in relation to the logic of discovery, especially within the philosophy of science. I am mainly using “logic” in a broad sense, that is, focusing on abduction as a form of reasoning and as a part of larger methodological processes. First, I present a short history of abduction in relation to discussions of discovery. I start with Peirce’s formulations of abduction and then delineate later interpretations.
Second, I discuss in more detail two prominent ways of interpreting abduction, that is, abduction as a logic of pursuit and abduction as a logic of discovery. I analyze both of these ways of understanding the meaning of abduction in relation to discovery, presenting multiple arguments against a “generative” interpretation as well as different ways of interpreting abduction as a logic of discovery.
A Short History of Abduction as a Logic of Discovery

Peirce’s Formulations
Charles S. Peirce developed abduction as a third main mode of reasoning (besides induction and deduction) in numerous papers and texts with different formulations and problem areas over several decades. Peirce was a system builder who continually developed his system of signs, reasoning, and sciences and applied them in different contexts. It is then no wonder that his texts have left room for various interpretations of the key features of abduction in relation to discovery as well. Besides this, if the aim is not just a historical, exegetic reading of Peirce’s abduction but to take into account later evolutions of it, the number of potential interpretations grows even more (see Niiniluoto, 2018, p. vi). It seems quite clear that there cannot be any one, “correct” interpretation of Peirce’s abduction, but it
does provide inspiration and means for several interpretations both now and in the future. Two main phases in Peirce’s works on abduction are typically discerned, first as an evidencing process, and then as a part of a methodological process (Burks, 1946; Fann, 1970). In his earlier writings, abduction (or “hypothesis”) was treated as an evidencing process, and often by syllogistic means. In his famous formulations, he presented abduction as the inference of the minor premise (case) from the conclusion (result) and the major premise (rule) and contrasted it both to deduction and induction (Peirce CP 5.275–276, 1868; CP 2.623, 1878). In his early works, Peirce often presented abduction as a form of probable reasoning (Peirce CP 2.511, 1867; CP 5.276, 1868), as well as a weak kind of argument for “making a hypothesis” (Peirce CP 2.624, 1878). Peirce was not using the notion of the “logic of discovery” in relation to abduction (in his texts in general), but his many formulations in the early works are often in line with that kind of idea, at least in a broad sense. Abduction is used to explain some circumstances, especially the “very curious circumstance, which would be explained by the supposition that it was a case of a certain general rule, and thereupon adopt that supposition” (Peirce CP 2.624–625, 1878). As an example, Peirce says that numberless documents and monuments referring to Napoleon are explained by the supposition that Napoleon has really existed, or fish and shell fossils in the interior of the country are explained by a supposition that the sea had covered that land at some previous time (ibid.). In his later writings, Peirce did not abandon syllogistic formulations of abduction, but the later “methodological” (or “methodeutical”) interpretation meant that abduction was treated as the first stage of inquiry (see Peirce CP 7.164–255, 1901; CP 6.522–547, 1901; EP2: 258–299, 1903). 
According to this methodological interpretation, ideas are suggested by abduction, made clearer by using deduction, and then tested by using induction (see also CP 6.469–473, 1908; CP 7.202–219, 1901). Abduction is a weak form of inference that “merely suggests that something may be” (CP 5.171–172, 1903). It is in line with Peirce’s thoroughgoing fallibilism (CP 1.171, c. 1893; CP 1.13–14, c. 1897; Niiniluoto, 2018). One often cited formulation of abduction is from this later phase (Peirce EP2: 231, 1903):

The surprising fact, C, is observed;
But if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true.
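
As a rough sketch only, and not in Peirce’s own notation, the schema just quoted can be displayed as an inference rule. The squiggly arrow for the subjunctive “if A were true, C would be a matter of course” and the operator Susp for “there is reason to suspect” are labels assumed here purely for illustration; they do not come from Peirce or from this chapter.

```latex
% Illustrative rendering of the 1903 schema (requires the amssymb package).
% \rightsquigarrow and \mathrm{Susp} are assumed labels, not Peirce's notation.
\[
\frac{C \qquad A \rightsquigarrow C}{\therefore\ \mathrm{Susp}(A)}
\]
```

Read with a material conditional and a categorical conclusion “A”, an argument of this shape would be the deductive fallacy of affirming the consequent, which is one way of seeing why the conclusion is stated only as a reason to suspect.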
The skeleton of the formula is similar to syllogisms, but there are some additions like “the surprising fact . . . is observed,” “would be a matter of course,” and “there is reason to suspect.” Many later characterizations of abduction by Peirce emphasize elements easily related to discovery.

All the ideas of science come to it by the way of Abduction. Abduction consists in studying facts and devising a theory to explain them. (Peirce CP 5.144–145, 1903)
Or:
An originary Argument, or Abduction, is an argument which presents facts in its Premiss which present a similarity to the fact stated in the Conclusion, but which could perfectly well be true without the latter being so, much more without its being recognized; so that we are not led to assert the Conclusion positively but are only inclined toward admitting it as representing a fact of which the facts of the Premiss constitute an Icon. (Peirce CP 2.96, 1903)
Another change between Peirce’s early and later views (apart from the shift from an evidencing process to a methodological process) concerned the role of instinct (or insight) as a basis for abduction. In his early works, Peirce had explicitly denied instinct as a basis for hypothesis (i.e., abduction) (CP 2.749, 1883). He argued that there might be a special adaptation of the human mind to the universe, but that this cannot be a basis for the validity of reasoning.

Others have supposed that there is a special adaptation of the mind to the universe, so that we are more apt to make true theories than we otherwise should be. Now, to say that a theory such as these is necessary to explaining the validity of induction and hypothesis is to say that these modes of inference are not in themselves valid, but that their conclusions are rendered probable by being probable deductive inferences from a suppressed (and originally unknown) premiss. (CP 2.749, 1883)
In his later works, Peirce often emphasized that we must assume that humans have some kind of guessing instinct which is a basis for abduction (or is the same as abduction), otherwise it would be impossible to explain how people have made any progress in science at all (CP 7.220, 1901). The argument was that new ideas cannot be produced by pure chance operations because that would have taken too long compared to human history. Such an instinct is not supposed to be infallible by Peirce, but strong enough that it has helped humans to find right or good hypotheses more effectively than by chance (CP 6.476, 1908). Peirce had different kinds of formulations and names for this kind of guessing instinct in his texts, like an insight (CP 5.173, 1903), or il lume naturale (a natural light) (CP 1.80, c. 1896; CP 1.630, 1898).

Other than his basic formulations, Peirce had several ways of characterizing abduction that have given rise to several interpretations and emphases. In his texts on abduction, he highlighted the elements of (see in more detail: Paavola, 2012, 46–47):
1. guessing (e.g., Peirce CP 7.219, 1901),
2. insight (e.g., CP 5.173, 1903),
3. instinct (e.g., CP 1.80–81, c. 1896),
4. perception and perceptual judgments (e.g., CP 5.180–194, 1903),
5. sensations and emotions (e.g., CP 5.291–292, 1868),
6. conceptions (e.g., CP 2.776, 1901),
7. pattern recognition, or related ideas (e.g., PPM 282–283, 1903),
8. the maxim of pragmatism (e.g., CP 5.195–197, 1903),
9. the economy of research (e.g., CP 7.220, fn 18, 1901),
10. interrogation (e.g., CP 6.525, 528, 1901),
11. an inference of a cause from its effect (e.g., W 1:180, 1865),
12. the category of Firstness (e.g., CP 2.89–102, c. 1902),
13. an inference through an icon (e.g., CP 2.96, c. 1902),
14. pure play, and musement (e.g., CP 6.455–469, 1908).
This list is not supposed to be exhaustive, but it does show that there are several ways of starting to develop abduction as a weak form of inference. Peirce himself combined and emphasized these elements variously in his texts.
After Peirce
During Peirce’s lifetime and for a long time afterwards (Peirce died in 1914), abduction did not arouse much interest. Many researchers from the logical empiricist tradition maintained that Peirce’s views were actually in line with the hypothetico-deductive model (HD), but that he had confused psychological and logical issues, especially with his formulations of abduction.

. . . what he [Peirce] calls “abduction” suffers from an unfortunate obscurity which I must ascribe to his confounding the psychology of scientific discovery with the logical situation of theories in relation to observed facts. (Reichenbach, 1938, p. 36; see also Nagel, 1933, p. 381–3; Braithwaite, 1934, p. 510)
Karl Popper’s ideas on the logic of discovery were very similar to those of the logical empiricists:

. . . my view of the matter . . . is that there is no such thing as a logical method of having new ideas, or a logical reconstruction of this process. My view may be expressed by saying that every discovery contains ‘an irrational element’, or ‘a creative intuition’ in Bergson’s sense. (Popper 1959/2002, p. 8)
Research oriented to Peirce’s philosophy also started to appear slowly. Arthur W. Burks (1946) defended abduction as a logic of discovery by making a distinction between “logica docens” (critical logic) and “logica utens” (instinctive habits of reasoning), arguing that abduction can deal with both of them. Frankfurt (1958) was more critical, maintaining that Peirce held the paradoxical view that abduction concerns both the imaginative faculty and logical inference. Frankfurt also advanced an argument, often repeated since, that abductive inference as usually formulated cannot be a way of originating hypotheses, because the hypothesis already appears in the premises. It seems that the hypothesis is supposed to be discovered by some other means. Other books and papers started to appear, often highlighting abduction from the point of view of creative insight (e.g., Anderson, 1987; Davis, 1972). The Transactions of the Charles S. Peirce Society started to appear in 1965, providing a forum for papers on Peirce’s abduction (e.g., Ayim, 1974; Sharpe, 1970). K. T. Fann’s book Peirce’s Theory of Abduction (Fann, 1970, written originally in 1963 as an MA thesis) was the first thoroughgoing presentation of Peirce’s own theory. A clear defender and developer of abduction as a logic of discovery was N. R. Hanson in the late 1950s and 1960s (he died in an accident in 1967). Hanson
explicitly argued against the mainstream philosophy of science of his time, claiming that it was a “logic of the finished research report” and had neglected what has always been the most important thing in novel scientific research, that is, theory-finding and discovery (Hanson, 1958a, 1961, pp. 20–22). Hanson developed several formulations of abduction as backward reasoning, starting from either an anomaly or data, and directed toward explanatory hypotheses (Hanson, 1963, 1965). He maintained that the reasons for suggesting hypotheses are not only psychological but also logical and, in that sense, can and should be analyzed within philosophy of science (Hanson, 1958b, 1963, 1965). Hanson also emphasized abduction’s proximity to observation and to interpreting the same data in a novel way, that is, seeing new intelligible “patterns” in puzzling data (Hanson, 1958a).

Hanson and abduction as a logic of discovery were given new prominence around the 1980s by “friends of discovery.” The latter term has been used to refer to a new interest in the area of discovery in the late 1970s and 1980s (see Nickles, 1980a, p. 1). Two books edited by Thomas S. Nickles (1980b, 1980c) were especially influential. Many friends of discovery referred explicitly to Hanson and abduction in their writings. They wanted to abandon a strict dichotomy, heralded by logical empiricists, between the empirical (and often mystical) context of discovery and the conceptual or logical context of justification. Many of them defended an “in between” area between these two, that is, a logic of pursuit, or a logic of preliminary analysis (see the next section for more detail). The point was that philosophy of science should not only be interested in the justificatory assessment of scientific results but also in a “heuristic appraisal,” that is, the assessment of the forward-looking fertility of scientific proposals (Nickles, 1989).
Many were still critical of abduction as a logic of discovery, and especially of Hanson’s Peircean formulations of abduction (Achinstein, 1970; Nickles, 1980a, p. 22–25; Kapitan, 1992). It was often maintained that even if abduction cannot treat discovery in a “genuine” sense, it can be about preliminary analysis of the pursuit-worthiness of hypotheses. A separate line of research on nondeductive reasoning was started in the late 1960s by Gilbert Harman (1965, 1968), who formulated “the inference to the best explanation” (IBE) model. Harman’s starting point was not Peirce or abduction but a more general claim that there is an important form of nondeductive reasoning to a hypothesis that best explains the evidence. Later formulations of IBE took Peirce’s formulations of abduction into account more explicitly (Psillos, 2000). Quite confusingly, the Inference to the Best Explanation model is nowadays often called “abduction.” Many researchers leaning on the Peircean interpretation of abduction have argued that abduction and IBE should be seen as different kinds of inferences (Minnameier, 2004; Paavola, 2006; Campos, 2011). On this interpretation, abduction concerns reasons for suggesting hypotheses. This distinction became more complex when later versions of IBE aimed at analyzing processes of discovery with IBE coming closer to Peirce’s and Hanson’s formulations and aims. Peter Lipton’s formulations of IBE in particular have affinities with Peircean abduction by distinguishing actual and potential explanations, on the one hand, and likeliness and loveliness of explanations, on the other (Lipton, 2004). Lipton’s criteria for “loveliness” (mechanism, precision, unification, and the elegance and
simplicity criterion) have commonalities with Peirce’s criteria for good abductive hypotheses (see Paavola, 2006).

Abduction started to be noted in several fields of research in relation to methodological questions and/or reasoning (e.g., Simon, 1977[1968], p. 25–45; Merton, 1968, p. 156–171; Thagard, 1988, p. 52–65; Schum, 2001). In AI research, abduction was often interpreted from the point of view of IBE (Josephson & Josephson, 1994; Pople, 1973). Around the 1980s, many researchers were interested in abduction from the semiotic point of view, and in relation to the methods used by detectives, emphasizing observation of trifles, imagination, and a guessing instinct as a basis for abduction (for a collection of essays, see Eco & Sebeok, 1983). One inspiration was Peirce’s own story about working as a detective. In this text, Peirce himself was not explicitly analyzing abduction but a guessing instinct which human beings seem to have for finding true explanations (Peirce, 1929; some parts published in Peirce CP 7.36–48, 1907).

An additional interest in abduction as a logic of discovery is the connection to the interrogative approach to inquiry. These formulations were often critical of abduction, but they have emphasized the role of explanation-seeking why-questions as a critical part of inquiry processes with close affinity to abduction (Hintikka, 1998; Kleiner, 1983; Sintonen, 2004). Hintikka emphasized that with abduction Peirce propounded a fundamental question of contemporary epistemology, that is, “What is ampliative reasoning like?” (Hintikka, 1998). Hintikka’s own answer to this question was a combination of deductive reasoning and question–answer steps which provide new information on the reasoning process, and skills in using strategic rules of reasoning (Hintikka, 1999).
Abduction as the retroductive inference of a cause from its effects has affinities with the old method of analysis and synthesis, more specifically with the analysis phase (Niiniluoto, 1999). As in Hintikka’s interrogative model of inquiry, abduction is interpreted within deductive reasoning rather than as a separate mode of its own. Since the 1990s especially, the discussions on abduction have expanded into various areas of methodology, and formulations of reasoning and discovery. One important field is the various methods of analysis in empirical research, and methodological approaches where issues on theory generation and processes of discovery are discussed. Abduction is interpreted as a part of “theoretical sensitivity” (Bryant, 2009; Kelle, 2005), or “imaginative theorizing” (Locke et al., 2004, 2008). In the philosophy of science and cognitive sciences, Lorenzo Magnani has been influential in organizing “Model-Based Reasoning” (MBR) conferences (starting in 1998) and has edited several books on abduction. Magnani has suggested “manipulative abduction,” for example, besides theoretical abduction. Manipulative abduction is close to ideas of distributed cognition, emphasizing thinking “through doing, and not only . . . about doing” (Magnani, 2004, p. 229; Magnani, 2001). Practical abduction analyzed in relation to human practices has affinities to this, aimed at enriching working hypotheses instead of testing them in a traditional sense (Paavola, 2021). There are also other deviations from mainstream interpretations of abduction in relation to explanatory accounts. Gabbay and Woods (2005) have argued that abduction should cover transformations of “ignorance problems,” not just search for explanations.
4 Abduction as a Logic of Discovery
In summary, Peirce already provided many ways of interpreting and formulating abduction in relation to discovery, and many more have appeared since. Next I shall distinguish between abduction as a logic of pursuit and abduction as a logic of discovery, because this distinction marks a clear dividing line between differing interpretations of abduction as a logic of discovery.
Logic of Pursuit or Logic of Discovery or both?
As we saw, N. R. Hanson defended and developed the idea that abduction provides a “logic of discovery.” Many commentators since Hanson have credited him with maintaining that the distinction between the “context of justification” (as the area for the philosophy of science) and the “context of discovery” (potentially the area of empirical sciences but not for philosophy, or logic) is not valid. At least there is an in-between area which can be analyzed by philosophical and logical means but which is not just about final justification. At the same time, many maintained that Hanson’s claims for the logic of discovery were confused. In 1959, Donald Schon, while commenting on Hanson’s formulations of the logic of discovery, noted that Hanson seemed to confound the logic of discovery and the logic of preliminary evaluation of a hypothesis (or reasons for suggesting, and entertaining, a hypothesis) (Schon, 1959). Peirce’s formulations of abduction seem to leave room for various interpretations, sometimes emphasizing the logic of discovery and sometimes the logic of pursuit (see also Niiniluoto, 2018, p. 76–85). According to Peirce, abduction is, among other things, not only the way in which all the ideas of science have come about (Peirce CP 5.144–145, 1903) but also about “adopting a hypothesis” (Peirce EP2: 231, 1903), or about “the invention, selection, and entertainment of the hypothesis” (Peirce HP 2:895, 1901). Next I shall present two ways of interpreting Hanson’s (and Peirce’s) abduction: first, as a logic of pursuit (or a preliminary evaluation) in between discovery and justification, on which abduction is not, or cannot be, the logic of discovery in a “generative” sense; second, as a logic of discovery, which can itself be interpreted in different ways. While my own view is in line with the latter, I think that the former interpretation is important in developing abduction as a logic of pursuit.
The former interpretation is problematic if it denies the potential of abduction as a logic of discovery.
Logic of Pursuit (without Logic of Discovery)
Many authors have maintained that Hanson’s formulations of abduction can apply within the context of pursuit but not discovery. Kordig (1978; and Salmon, 1966, p. 114), for example, maintained that there are issues of plausibility in hypotheses prior to an experimental test, and that these issues also differ from the logic of justification. He identified three areas: (1) initial thinking, (2) plausibility, and (3) acceptability, equating Hanson’s formulations of abduction to the issues of plausibility.
S. Paavola
He explicitly denied that logic could concern “initial thinking” (the context of discovery). For Kordig, initial thinking is the area for cultural and psychological factors (see p. 114). When it comes to discovery, he was in line with the logical empiricists (and Popper): “Here [in the area of “initial thinking”] logic is not essential to discovery, as logical empiricists stress” (ibid.). “Friends of discovery” in the 1980s who became interested in discovery and at the same time in Hanson and abduction shared this view. Most maintained that Hanson had successfully shown the need for a “middle” area in between discovery and justification (with names like “prior assessment,” “preliminary evaluation,” and “context of pursuit”). Not all were against the logic of discovery. For example, Curd (1980) emphasized the extended period of time that begins when scientists first think seriously about a problem and ends when a research report is written up. He maintained that abduction can be about inferences scientists make in reasoning about their hypotheses, and why these inferences are reasonable (ibid., p. 203–205, p. 213). At the same time, many others argued that Hanson’s (or Peirce’s) formulations of abduction failed as a logic of discovery (Nickles, 1980a, p. 18–22; Schaffner, 1980, p. 173–179). There were various arguments for this. First, because the hypothesis is supposed to be known already in the premises, abduction does not seem to be the way to generate it (see also Frankfurt, 1958). Second, since the formulations of abduction allow all kinds of “wild” hypotheses, abduction does not seem to be a way to obtain useful or productive hypotheses. Third, since the formulation of abduction does not take the role of background theories into account, it seems to emphasize observational data one-sidedly as a basis for scientific research (see Achinstein, 1970, p. 91–93; Achinstein, 1971, p. 117–119).
One influential paper strongly supporting the idea of the context of pursuit was Larry Laudan’s “Why was the logic of discovery abandoned?” (Laudan, 1980). Laudan argued that the idea of a logic of discovery had been relevant before the nineteenth century, when the search had been for an infallible logic of discovery. In that case, the logic of discovery would function simultaneously as a logic of justification (and vice versa). The situation changed with a strong emphasis on the fallibility of all knowledge in the nineteenth century (promoted, for example, by Herschel and Whewell). Post hoc evaluations of theories then became central. Laudan ends the paper by saying that there is also a heuristic problem about science and the generation of theories but adds: “ . . . before one concludes that the logic of discovery still has a philosophical rationale, one must ask what is specifically philosophical about studying the genesis of theories” (Laudan, 1980, p. 182). Several commentators have questioned Laudan’s abrupt conclusions on the relevance of heuristics and logic of discovery within the philosophy of science (Cellucci, 2015; Shah, 2007). Laudan described the historical shift from infallibilist ideas on the logic of discovery to fallibilism and a clear separation between the context of discovery and the context of justification but advanced rather weak arguments as to why heuristics and the logic of discovery were not important topics for the philosophy of science. Laudan was not referring to logical empiricism or the HD-model in the twentieth century, but his views on discovery were in line with them.
Abduction as a logic of pursuit (with different names) has been developed in several papers. McKaughan (2008) argued that the generative interpretation mischaracterizes Peirce’s view of abduction. According to McKaughan, Peirce’s abduction is noninferential, based on insight or instinctive capacity when it comes to discovery. He agreed with Laudan’s ideas of pursuit-worthiness and Peirce’s discussion on the economy of research (see also Achinstein, 1993). Peirce emphasized the economy of research in various papers on abduction (Peirce CP 7.220, fn 18; CP 5.600). The testing of hypotheses requires time, energy, and money, and “practically grounded comparative recommendations” should be taken into account in the pursuit-worthiness of the hypotheses (McKaughan, 2008, p. 458; Niiniluoto, 2018, p. 80–85). Similarly, Khachab (2013) argued that the focus of Peirce’s abduction is not on how ideas emerge (logic of discovery) but on finding “good” hypotheses. Khachab then emphasizes abduction in relation to Peirce’s maxim of pragmatism and a set of criteria for good hypotheses like clarity, explanatory power, and experimental verifiability. With a similar interpretation of abduction, Mohammadian (2019) has argued for the ranking of hypotheses using the economy of research. On this kind of interpretation, Peircean abduction comes much closer to the Inference to the Best Explanation (IBE) model than is usually understood by Peirce scholars (ibid.; cf. above).
Abduction as a Logic of Discovery
This section presents various approaches to defending the idea that abduction can provide a logic of discovery, in contrast to the criticisms discussed above. At the start, it should be noted that those who are defending the idea of an abductive logic of discovery are usually explicitly arguing against the algorithmic, mechanical, and infallible logic of discovery (Hanson, 1961, p. 20–21; Cellucci, 2020). In line with deep-seated fallibilism, abduction is not supposed to be a mechanical way of finding true explanations but rather heuristic reasoning helping the inquirer in question to find promising and productive hypotheses. It is also supposed to be a rational reconstruction of reasons for suggesting hypotheses, not a manual ensuring creativity. Another remark is that Peirce’s and Hanson’s descriptions of elements or features of abduction are often quite similar to those descriptions of discovery that many critics of the idea of a logic of discovery offer. For example, Reichenbach argued that discovery was guided by guesses, hunches, intuition, and issues of plausibility and creativity (see Reichenbach, 1951, p. 230–231). Quite similarly, Kordig maintained that discovery (or “initial thinking”) concerned how scientists hit upon ideas, and was an area for imagination, guessing, and creatively and poetically intuited hypotheses. For Reichenbach and Kordig, these were reasons for denying the possibility of a logic within discovery, but for the defenders of the abductive logic of discovery, abduction lurks in these imaginative guesses, hunches, and issues concerning plausibility. In any case, there are different but overlapping ways of
explaining how abduction can be inferential and imaginative, or about reasoning and insight at the same time. One interpretation is based on Peirce’s formulations on abduction, especially his later writings. Peirce often highlighted that abduction was a form of reasoning, but at the same time there was an important element of instinct (or observational insight) explaining the success of abductively formed hypotheses. This combination of inferential and instinctual (or insightful) elements of abduction is defended by many later commentators (e.g., Burks, 1946; Mohammadian, 2019). Many critics of a logic of discovery argue that if the success of abduction is based on an instinct, it is not a form of reasoning in a deep sense any more. Another direction of developing abduction as a logic of discovery, overlapping somewhat with the previous one, is to see it as a method used by detectives. This interpretation is backed up by Peirce himself with his narrative about when he himself had solved the theft of his valuable belongings on a boat trip (Peirce, 1929). Peirce maintained that he was able to solve the theft by using a guessing instinct based on the fact that “we often derive from observation strong intimations of truth, without being able to specify what were the circumstances we had observed which conveyed those intimations” (Peirce, 1929, p. 282 [CP 7.46, c. 1907]). Semioticians in the early 1980s were especially interested in this kind of interpretation of abduction where the use of minute clues, insightful guessing, and inference are amalgamated (see Eco & Sebeok, 1983). The role of hunches and clue-like signs has not been developed much further even when one would have imagined that Peirce’s theory of signs provides fertile means for that kind of semiotic interpretation (see Vitti, this volume). Peter Lipton developed the idea that criteria for “loveliness” are central to the generation of good hypotheses (Lipton, 2004). 
These criteria are also interesting from the point of view of clue-like signs (see Paavola, 2006). Tschaepe (2013, 2014) has shown how abductive guessing operates as a part of the inquiry processes. One line of interpretation is to maintain that discovery and creativity in reasoning are based on iconicity, and/or observation. This also concerns creativity in deductive and mathematical reasoning (Cellucci, 2020) but is especially important in abduction where “Firstnesses” and iconicity are also crucial according to Peirce (e.g., Peirce CP 2.89–102, c. 1902). Hanson emphasized an observational element of “seeing that” as a basis for discovery. Both Peirce and Hanson maintained that abduction was close to perception and perceptual judgments (see Peirce CP 5.181, 1903; Wilson, this volume). Reversible figures, i.e., visual data that can be interpreted in several ways, are borderline cases between perceptual judgments and abductive inferences because they show that percepts contain inferential or interpretative elements (see Peirce CP 5.183–184, 1903; Hanson, 1958a). An additional aspect of abduction as a logic used by detectives is to emphasize it from the point of view of strategies of (abductive) reasoning. Critics of abduction as a logic of discovery have often concentrated on definitory rules of abduction, that is, on basic formulations of abduction and how useful and valid these individual formulations seem to be. The emphasis on the skills of using strategies of abductive reasoning and dynamically combining several moves in reasoning provides further
means of answering many standard criticisms of abduction as a logic of discovery (Paavola, 2004). Different strategies of abductive reasoning can guide the search for productive hypotheses even when it is not known what the final premises of abductive reasoning shall be. Paavola (2012, p. 206–211) lists seven productive abductive reasoning strategies: (1) searching for somehow anomalous, surprising, or disturbing phenomena and observations; (2) observing details, little clues, and tones; (3) continuous search for hypotheses and noting their hypothetical status; (4) aiming at finding what kind or type of explanations or hypotheses might be viable to constrain the search in a preliminary way; (5) aiming at finding explanations (or ideas) which themselves can be explained (or be shown to be possible); (6) searching for “patterns” and connections that fit together to constitute a reasonable unity; and (7) paying attention to the process of discovery and its various elements and phases. A basis for such abductive strategies is the idea of reasoning used by detectives. This approach also aims at developing further Hanson’s formulations of a logic of discovery (that is, abduction) as reasoning “backwards” from an anomaly to an explanation. According to Hanson, it “should concern itself with the scientists’ actual reasoning which
A. proceeds retroductively, from an anomaly to
B. the delineation of a kind of explanatory H which
C. fits into an organized pattern of concepts.” (Hanson, 1965, p. 50)
There have also been attempts to develop an abductive logic of discovery with a precise formal treatment (Aliseda, 2004; Niiniluoto, 2018, p. 35–50). For example, Meheus and Batens (2006) interpret abduction in terms of adaptive logic and paraconsistent logics, which allow contradictions (strictly denied in classical logic). Then the fallacy of affirming the consequent (A → B, B; therefore A), which is close to basic formulations of abduction, can be added as an abductive rule of inference, together with rules telling how it can be applied. Another way of interpreting abduction dynamically is an interrogative construal of abduction (Ma & Pietarinen, 2016). Peirce had formulations of abduction with clear interrogative emphasis (e.g., Peirce CP 6.525, 528, 1901; EP 2:287, 1903). Ma and Pietarinen (2016) develop logical notation based on Peirce’s texts which would take this kind of logic in making conjectures or “reasoning from surprise to inquiry” into account more accurately. Their formulations can be interpreted within the logic of pursuit but are also clearly relevant to the logic of discovery in a generative sense. An important part is to interpret an abductive conclusion in an “interrogative” (or “investigand”) mood (instead of a typical indicative mood). Here the abductive conclusion is not just a hypothesis but a conclusion like:
It is to be inquired whether A is not true. (ibid., p. 79–80)
As a continuation of this, Bellucci and Pietarinen (2020) emphasize the iconic character of abduction, maintaining that Peirce was developing and experimenting with the ideas of the interrogative mood and related logical graphs in his later works. These kinds of formulations are rare in Peirce’s texts, but they show that he felt a need and the potential for this kind of logic.
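The difference between asserting a hypothesis and concluding in the interrogative mood can be illustrated with a toy sketch. It is only an illustration under loud assumptions: the `Rule` class, the `abduce` function, and the example rules are hypothetical and do not reproduce the adaptive logic of Meheus and Batens or the notation of Ma and Pietarinen. The sketch takes conditionals of the form A → B and an observed B, and returns every matching A as a question to be investigated rather than as an accepted conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical conditional 'antecedent -> consequent'."""
    antecedent: str
    consequent: str


def abduce(rules, observation):
    """Collect antecedents whose consequent matches the observation.

    Each candidate is returned in the interrogative mood: it is put
    forward for inquiry, not asserted as true.
    """
    candidates = [r.antecedent for r in rules if r.consequent == observation]
    return [f'It is to be inquired whether "{a}" is not true.' for a in candidates]


# Illustrative background knowledge (assumed, not from the source text).
rules = [
    Rule("it rained", "the street is wet"),
    Rule("a water main burst", "the street is wet"),
    Rule("the sun is out", "the street is dry"),
]

questions = abduce(rules, "the street is wet")
for q in questions:
    print(q)
```

Because several antecedents may fit the same observation, the function returns all of them. Choosing which question to pursue first belongs to the separate issue of pursuit-worthiness discussed earlier.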
In summary, there have been various ways of defending and developing abduction explicitly as a logic of discovery. Elements for these formulations come from Peirce’s formulations, according to which abduction is a “weak” form of inference, close to (or even the same as) guessing, is based on uses of clue-like signs and on iconic relationships, and is aimed at reconstructing dynamic search and inquiry processes of discovery. What is missing (or what is at most implicit) in these formulations is the social character of processes of inquiry. Peirce himself emphasized the social nature of scientific research but not in relation to abduction. Peirce stressed that abduction was “il lume naturale” (some kind of an “instinctive insight”), and/or the logical relationship between premises and conclusions, mostly ignoring social interaction and use of cultural resources when it comes to idea generation. The logic of discovery, or a rational reconstruction of it, is not necessarily against abduction interpreted as a logic of pursuit (see also Folger et al., this volume). But the logic of discovery counters the claim that abduction cannot but be a logic of pursuit. Actually there are two ways of interpreting the pursuit-worthiness of a hypothesis. One is in line with the logic of pursuit without the logic of discovery. Here the logic of pursuit means that the hypothesis (which has already been discovered) is somehow evaluated in a preliminary way, for example, as to whether it is promising or worth being tested. But pursuit-worthiness can also be interpreted within the logic of discovery. An idea or a hypothesis can be like a “working hypothesis” that can be modified, reconstructed, and developed further. The focus is then on the pursuit-worthiness of an idea for further development, not on whether it is to be accepted in the form it is in.
It can be argued that a theory is rarely constructed just with one line of argument, or at once as a ready-made theory to be evaluated and tested, but often requires many modifications and versions to work even as a good hypothesis. In this sense, the abductive logic of discovery is not totally distinct from the logic of justification either (cf. Niiniluoto, 2018, p. 84–85; Peirce CP 2.662, 1910). As Peirce noted, the acceptance of an abductive conclusion may have different forms of strength even with the logic of discovery: On account of this Explanation, the inquirer is led to regard his conjecture, or hypothesis, with favor. As I phrase it, he provisionally holds it to be “Plausible”; this acceptance ranges in different cases – and reasonably so – from a mere expression of it in the interrogative mood, as a question meriting attention and reply, up through all appraisals of Plausibility, to uncontrollable inclination to believe. (Peirce CP 6.469, 1908)
Conclusion
This chapter has aimed to present the evolution of the main conceptions of abduction as a logic of discovery. Peirce’s many formulations have left room for various interpretations. Besides this, there is a lot of room for interpretations of abduction that depart from Peirce’s formulations. N. R. Hanson has been an important figure in discussions of abduction as a logic of discovery. Many critics have argued that
Hanson managed to show that abduction is a logic of pursuit but have been more hesitant to interpret abduction as a logic of discovery. It can be argued that Hanson’s program on the logic of discovery was more successful than often thought, but it requires that formulations of abductive inference be elaborated and developed further. This kind of development has characterized various frontiers of abduction. The developmental work on abduction as a logic of discovery is still, however, on its way and is also work in progress in the context of discovery.
References
Achinstein, P. (1970). Inference to scientific laws. In R. H. Stuewer (Ed.), Minnesota Studies in the Philosophy of Science, 5, 87–111.
Achinstein, P. (1971). Law and explanation: An essay in the philosophy of science. Clarendon Press.
Achinstein, P. (1993). How to defend a theory without testing it: Niels Bohr and the “logic of pursuit”. Midwest Studies in Philosophy, 18, 90–120.
Aliseda, A. (2004). Logics in scientific discovery. Foundations of Science, 9(3), 339–363.
Aliseda, A. (2006). Abductive Reasoning. Logical Investigations into Discovery and Explanation. Synthese Library, vol. 330. Springer.
Anderson, D. R. (1987). Creativity and the philosophy of C.S. Peirce. Martinus Nijhoff Publishers.
Ayim, M. (1974). Retroduction: The rational instinct. Transactions of the Charles S. Peirce Society, 10(1), 34–43.
Bellucci, F., & Pietarinen, A. V. (2020). Icons, interrogations, and graphs: On Peirce’s integrated notion of abduction. Transactions of the Charles S. Peirce Society, 56(1), 43–61.
Braithwaite, R. B. (1934). Review of collected papers of Charles Sanders Peirce (Vols. I-IV). Mind, 43, 487–511.
Bryant, A. (2009). Grounded theory and pragmatism: the curious case of Anselm Strauss. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 10(3), Art. 2, http://nbn-resolving.de/urn:nbn:de:0114-fqs090325 [Date of access: March 11, 2022].
Burks, A. W. (1946). Peirce’s theory of abduction. Philosophy of Science, 13, 301–306.
Campos, D. G. (2011). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180(3), 419–442.
Cellucci, C. (2015). Why should the logic of discovery be revived? A reappraisal. In E. Ippoliti (Ed.), Heuristic reasoning (Studies in applied philosophy, epistemology and rational ethics) (Vol. 16). Springer.
Cellucci, C. (2020). Reconnecting logic with discovery. Topoi, 39(4), 869–880.
Curd, M. V. (1980). The logic of discovery: An analysis of three approaches. In T. Nickles (Ed.), Scientific discovery, logic, and rationality (pp. 201–219). D. Reidel Publishing Company.
Davis, W. (1972). Peirce’s epistemology. Martinus Nijhoff.
Eco, U., & Sebeok, T. A. (Eds.). (1983). The sign of three. Dupin, Holmes, Peirce. Indiana University Press.
Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Frankfurt, H. G. (1958). Peirce’s notion of abduction. The Journal of Philosophy, 55(14), 593–597.
Gabbay, D. M., & Woods, J. (2005). The reach of abduction. Insight and trial. A practical logic of cognitive systems (Vol. 2). Elsevier.
Hanson, N. R. (1958a). Patterns of discovery. Cambridge University Press.
Hanson, N. R. (1958b). The logic of discovery. The Journal of Philosophy, 55(25), 1073–1089.
Hanson, N. R. (1961). Is there a logic of scientific discovery? In H. Feigl & G. Maxwell (Eds.), Current issues in the philosophy of science (pp. 20–35). Holt, Rinehart and Winston.
Hanson, N. R. (1963). Retroductive inference. In B. Baumrin (Ed.), Philosophy of science: The Delaware seminar (1961–1962) (Vol. I, pp. 21–37). Interscience Publishers, a division of John Wiley & Sons.
Hanson, N. R. (1965). Notes toward a logic of discovery. In R. J. Bernstein (Ed.), Perspectives on Peirce (pp. 42–65). Yale Univ. Press.
Harman, G. H. (1965). Inference to the best explanation. Philosophical Review, 74, 88–95.
Harman, G. H. (1968). Enumerative induction as inference to the best explanation. The Journal of Philosophy, 65, 529–533.
Hempel, C. G. (1966). Philosophy of natural science. Prentice-Hall.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–533.
Hintikka, J. (1999). Is logic the key to all good reasoning? In J. Hintikka (Ed.), Inquiry as inquiry: A logic of scientific discovery, Jaakko Hintikka selected papers (Vol. 5). Kluwer Academic Publishers.
Josephson, J. R., & Josephson, S. G. (Eds.). (1994). Abductive inference. Computation, philosophy, technology. Cambridge University Press.
Kapitan, T. (1992). Peirce and the autonomy of abductive reasoning. Erkenntnis, 37, 1–26.
Kelle, U. (2005). “Emergence” vs. “forcing” of empirical data? A crucial problem of “grounded theory” reconsidered. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 6(2), Art. 27, http://nbn-resolving.de/urn:nbn:de:0114-fqs0502275.
Khachab, C. E. (2013). The logical goodness of abduction in CS Peirce’s thought. Transactions of the Charles S. Peirce Society, 49(2), 157–177.
Kleiner, S. A. (1983). A new look at Kepler and abductive argument. Studies in History and Philosophy of Science, 14(4), 279–313.
Kordig, C. R. (1978). Discovery and justification. Philosophy of Science, 45(1), 110–117.
Laudan, L. (1980). Why was the logic of discovery abandoned? In T. Nickles (Ed.), Scientific discovery, logic, and rationality (pp. 173–183). D. Reidel Publishing Company.
Lipton, P. (2004). Inference to the best explanation. Second edition, first published 1991. Routledge.
Locke, K., Golden-Biddle, K., & Feldman, M. S. (2004). Imaginative theorizing in interpretative organizational research. In D. H. Nagao (Ed.) Best paper proceedings, 64th annual meeting of the Acad. Management. New Orleans.
Locke, K., Golden-Biddle, K., & Feldman, M. S. (2008). Making doubt generative: Rethinking the role of doubt in the research process. Organization Science, 19(6), 907–918.
Ma, M., & Pietarinen, A. V. (2016). A dynamic approach to Peirce’s interrogative construal of abductive logic. IfCoLog Journal of Logics and their Applications, 3(1), 73–104.
Magnani, L. (2001). Abduction, reason, and science. Processes of discovery and explanation. Kluwer Academic/Plenum Publishers.
Magnani, L. (2004). Model-based and manipulative abduction in science. Foundations of Science, 9(3), 219–247.
McKaughan, D. J. (2008). From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles S. Peirce Society, 44(3), 446–468.
Meheus, J., & Batens, D. (2006). A formal logic for abductive reasoning. Logic Journal of IGPL, 14(2), 221–236.
Merton, R. K. (1968). Social Theory and Social Structure. Enlarged Edition. The Free Press.
Minnameier, G. (2004). Peirce-suit of truth – Why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60, 75–105.
Mohammadian, M. (2019). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S. Peirce Society, 55(2), 138–160.
Nagel, E. (1933). Charles Peirce’s guesses at the riddle. The Journal of Philosophy, 30(14), 365–386.
Nickles, T. (1980a). Introductory essay: Scientific discovery and the future of philosophy of science. In T. Nickles (Ed.), Scientific discovery, logic, and rationality. D. Reidel Publishing Company.
Nickles, T. (Ed.). (1980b). Scientific discovery, logic, and rationality. D. Reidel Publishing Company.
Nickles, T. (Ed.). (1980c). Scientific discovery: Case studies. D. Reidel Publishing Company.
Nickles, T. (1989). Heuristical appraisal: a proposal. Social Epistemology, 3(3), 175–188.
Niiniluoto, I. (1999). Abduction and geometrical analysis: Notes on Charles S. Peirce and Edgar Allan Poe. In L. Magnani, N. J. Nersessian, & P. Thagard (Eds.), Model-based reasoning in scientific discovery. Kluwer Academic/Plenum Publishers.
Niiniluoto, I. (2018). Truth-Seeking by abduction. Synthese Library, Vol. 400. Springer.
Paavola, S. (2004). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9(3), 267–283.
Paavola, S. (2006). Hansonian and Harmanian abduction as models of discovery. International Studies in the Philosophy of Science, 20(1), 91–106.
Paavola, S. (2012). On the origin of ideas. An Abductivist approach to discovery. Revised and enlarged edition. Saarbrücken: Lap Lambert Academic Publishing.
Paavola, S. (2021). Practical abduction for research on human practices: Enriching rather than testing a hypothesis. In J. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice. Springer.
Peirce, C. S. (1929). Guessing. Hound & Horn, 2(3), 267–282.
Peirce, C. S. [CP (volume.paragraph, year)] (1931–1958). Collected Papers of Charles Sanders Peirce, vols. 1–6, Hartshorne, C. and Weiss, P., (Eds.), vols. 7–8, Burks, A. W., (Ed.). Cambridge, Mass: Harvard University Press.
Peirce, C. S. [EP (volume: page numbers, year)] (1992–1998). The Essential Peirce: Selected Philosophical Writings, 2 vols., the Peirce Edition Project (Eds). Bloomington & Indianapolis: Indiana University Press.
Peirce, C. S. [HP (volume: page numbers, year)] (1985). Historical Perspectives on Peirce’s Logic of Science. A History of Science, 2 vols., Carolyn Eisele (Ed.). Berlin: Mouton Publishers.
Peirce, C. S. [PPM (page numbers, year)] (1997). Pragmatism as a Principle and Method of Right Thinking. The 1903 Harvard Lectures on Pragmatism, Ann Turrisi (Ed.). Albany: State University of New York Press.
Peirce, C. S. [W (volume: page numbers, year)] (1982). Writings of Charles S. Peirce: A chronological edition, the Peirce edition project (Eds). Indiana University Press.
Pople, H. E. (1973). On the mechanization of abductive logic. In Proceedings of the Third International Joint Conference on Artificial Intelligence. Stanford, CA.
Popper, K. (2002/1959). The Logic of Scientific Discovery. Logik der Forschung first published 1934. First English edition 1959. Routledge Classics. Routledge: London and New York.
Psillos, S. (2000). Abduction: Between conceptual richness and computational complexity. In P. A. Flach & A. C. Kakas (Eds.), Abduction and induction. Essays on their relation and integration (pp. 59–74). Kluwer Academic Publishers.
Reichenbach, H. (1938). On probability and induction. Philosophy of Science, 5(1), 21–45.
Reichenbach, H. (1951). The rise of scientific philosophy. University of California Press.
Salmon, W. C. (1966). The foundations of scientific inference. University of Pittsburgh Press.
Schaffner, K. F. (1980). Discovery in the biomedical sciences. In T. Nickles (Ed.), Scientific discovery: Case studies. D. Reidel Publishing Company.
Schon, D. (1959). Comment on Mr. Hanson’s “The logic of discovery”. The Journal of Philosophy, 56(11), 500–503.
Schum, D. A. (2001). Species of abductive reasoning in fact investigation in law. Cardozo Law Review, 22, 1645–1681.
Shah, M. (2007). Is it justifiable to abandon all search for a logic of discovery? International Studies in the Philosophy of Science, 21(3), 253–269.
Sharpe, R. (1970). Induction, abduction, and the evolution of science. Transactions of the Charles S. Peirce Society, 6(1), 17–33.
Simon, H. A. (1977). Models of Discovery and Other Topics in the Methods of Science. D. Reidel Publishing Co., The Netherlands. (Boston Studies in the Philosophy of Science, Vol. LIV.)
Sintonen, M. (2004). Reasoning to hypotheses: Where do questions come? Foundations of Science, 9(3), 249–266.
Thagard, P. (1988). Computational philosophy of science. MIT Press.
Tschaepe, M. (2013). Gradations of guessing: Preliminary sketches and suggestions. Contemporary Pragmatism, 10, 135–154.
Tschaepe, M. (2014). Guessing and abduction. Transactions of the Charles S. Peirce Society, 50(1), 115–138.
5 Abduction and Perception in Peirce’s Account of Knowledge
Aaron Bruce Wilson
Contents
Peircean Abduction: A Non-essentialist Account
Peircean Abduction: Instinct, Association, and Selection
Peircean Perception: Its Nature and Abductive Character
Peirce’s Account of Knowledge
Conclusion
References
Abstract
Peirce’s methodeutical logic, or what might be called his “account of knowledge,” is focused on the nature of inquiry; and on that account, it is the nature of inquiry to develop toward a “final opinion” or “final consensus” over the long run. Peirce holds abductive processes to be key to this development in several ways. Not only does he characterize any process through which a hypothesis is introduced as abductive; he also insists that perception is abductive. In the 1903 Harvard lectures, he argues that perception supplies the “first premises” for all our reasonings and inquiries – namely, our perceptual judgments, which include those judgments against which hypotheses are tested. To make sense of how perception can be abductive but yield judgments rather than hypotheses, a non-essentialist reading of (Peircean) abduction is presented that extends what Tomas Kapitan has called Peirce’s “comprehension thesis.” Abduction can include processes resulting in judgments or in hypotheses, it can be inferential or non-inferential (e.g., an “insight”), and it can include nonpropositional associative processes involving only icons and indices (e.g., percepts). But while, in Peirce, “abduction” can cover so many different processes, it remains key to his account of inquiry and to his explanation for how inquiry would develop toward a final result.
Keywords
Peirce · Abduction · Perception · Inquiry · Semiotics
A. B. Wilson, South Texas College, McAllen, USA. e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_10
Peircean Abduction: A Non-essentialist Account
As is well known, Peirce’s concept of abduction is multifaceted. There are forms of inference that he identifies as abductive, a step of inquiry that he characterizes as abductive, and an “instinct” or “insight” that he says is abductive, and he says that perception is abductive. Exactly how the relation between these different facets of Peircean abduction should be understood has been the subject of much of the literature on it.
Significant interest in Peircean abduction arose in the 1990s, perhaps partly motivated by interest in “inference to the best explanation” or “IBE” within mainstream philosophy of science (e.g., Lipton, 1991). However, as readers of Peirce know, abduction and IBE are significantly different: IBE has to do with accepting a theory, while Peircean abduction has to do with accepting a theory provisionally for purposes of experimental testing. Nonetheless, IBE continues to be confused with abduction (see McAuliffe, 2015). Hintikka (1998) was among the first to rail against this conflation of IBE with abduction, arguing that abduction is “ampliative,” introducing new information, while IBE is not.
The ampliative aspect of Peircean abduction is also called the “generative” aspect, and with abductive inference, the generation of some new hypothesis is supposed to occur in the movement from the premises to the conclusion, such that the new hypothesis occurs in the conclusion but not in the premises. To use an example of Peirce’s:
All the beans from this bag are white.
These beans are white.
Therefore, these beans are from this bag. (CP 2.623, 1878)
The conclusion here says something that is not said in either of the premises. While one might argue that the conclusion is implicit in the premises, the inference is not deductively valid, so the conclusion cannot be implicit in the premises in the same general way that it is in deductive inferences. The abductive argument here seems to be analogical, and in fact, in these early writings on logic (pre-1900), Peirce characterizes abductive inference or “hypothetic inference” precisely along the lines of analogical inference. He writes: “[Hypothesis is] where we find that in certain respects two objects have a strong resemblance, and infer that they resemble one another strongly in other respects” (CP 2.624, 1878). In the example above, it is stated that two sets of beans are each white, and from the fact that one set is “from this bag,” it is inferred that the other set is “from this bag” as well.
In writings beginning around 1900, Peirce begins to characterize abduction in a more inclusive way. While in 1878, he had characterized it as an inference “making an hypothesis” (CP 2.623, 1878), in the 1903 Harvard lectures, he characterizes abduction as a whole step of inquiry, marked by the function of introducing an explanatory hypothesis. He says that “[t]his step of adopting a hypothesis as being suggested by the facts, is what I call abduction” (CP 7.202, 1901). He also now characterizes deduction and induction as steps of inquiry, with induction involving “the experimental testing of a theory,” which proceeds from the deduction of a theory’s experimental consequences. However, Peirce continues to regard abduction as having an inferential form – he adds: “I reckon [abduction] as a form of inference, however problematical the hypothesis may be held” (ibid.). He presents the inferential form of abduction as follows:
“The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.” (EP 2:231)
This is the inferential form of abduction as a whole step of inquiry. While it successfully captures abduction as a whole step of inquiry, it does not, as others have pointed out, capture the ampliative or generative aspect of abduction. The hypothesis, A, is contained in the premises, while to be generative, it seems, the hypothesis must appear only in the conclusion.
Although the above inferential form of abduction does not seem ampliative, Peirce continues to regard abduction as ampliative or generative. In the same lectures, he declares that “abduction is the only process by which a new element can be introduced into thought” (EP 2:224, 1903). It might be supposed that the inferential form of the abductive step of inquiry assumes or encompasses more specific inferences that are ampliative, such as the analogical inferences he classified as abductive in earlier work, and that these inferences occur in passing from the first premise to the second premise. There does not seem to be anything in Peirce that contradicts this. Also, as some have argued (e.g., Paavola, 2004), the inference as a whole could still somehow guide the generative process, whether that process is itself inferential or not. While others have argued that if the inferential form of the abductive step of inquiry, as presented above, does not itself bear the generative property, then the abductive step is not generative (e.g., Mohammedian, 2019), so long as the step encompasses inferences or other processes that are generative, there is no reason to contradict Peirce and deny that the abductive step as a whole is generative.
Peirce’s claim that “[a]bduction must cover all the operations by which theories and conceptions are engendered” (CP 5.590, 1903) has been called the comprehension thesis (Kapitan, 1997). On this thesis, any “operation” by which a new conception or theory is generated is abductive. This includes generalizations, or inferences to some general law that would explain the facts (see EP 2:287). While the confirmation of a generalization by particular instances is still inductive, as Peirce characterizes induction as the testing step of inquiry, the suggestion of a generalization by particular instances is abductive, as such suggestion fulfills the abductive function of introducing a hypothesis.
Further, Peirce’s comprehension thesis implies that abduction can include non-inferential “operations.” In the passage above and at other places (e.g., HP 2.898, 1901; EP 2:240, 1903; and EP 2:287, 1903), he describes abduction not specifically as a type of inference but generally as a “process” or “operation,” and there is no obvious indication that he means only inferential operations or processes (while, as will be seen, there are indications that he means to include non-inferential processes). Non-inferential abductive processes would include processes that are not propositional, nor even symbolic. Although any abductive process must at least be semiotic – otherwise, the process is merely psychological and loses its relevance to logic – a semiotic process could operate only upon icons and indices, lacking symbolic elements altogether. Further on, both associative processes and perceptual processes will be identified as types of abductive processes that, in Peirce, operate on non-symbolic signs.
Non-inferential and non-symbolic abductive processes are likely involved in what, in his later writings, Peirce refers to as an abductive “insight,” “instinct,” or “faculty” – an instinct for guessing correctly (e.g., CP 7.219, 1901) or a “faculty of divining the ways of Nature” (CP 5.173, 1903). According to Peirce, we have a natural ability to guess correctly more often than we would by pure chance, and this ability can be called abductive (also see CP 7.679, 1903; EP 2:217-18, 1903). This “abductive instinct” may account for the generative or ampliative character of the whole abductive step of inquiry.
As Paavola (2005) explains: “in [Peirce’s] later writings, a guessing instinct, or an instinct for finding good hypotheses, was an important aspect of abduction, indeed, a central element that made the originary character of abduction understandable” (p. 131). Again, this does not conflict with the claim that abductive inference is generative. As has been seen, the abductive step of inquiry, which may employ the abductive instinct, has itself an inferential form. Moreover, the generative character of the abductive instinct could itself be accounted for by inferences – namely, unconscious inferences – that are generative. While Peirce generally seems to regard inference as a conscious and deliberate act, he allows that we make unconscious inferences, such as in the 1903 Harvard lectures, where he writes: “There are, as I am prepared to maintain, operations of the mind which are logically exactly analogous to inferences excepting only that they are unconscious and therefore uncontrollable” (EP 2:188). However, unconscious inference might not alone account for the generative character of our abductive instinct. As will be explained, non-symbolic semiotic processes may also play a generative role.
The comprehension thesis allows much to be classified as abductive, but even it might not include everything that Peirce seems to allow to be classed as abductive. While the comprehension thesis classifies every semiotic operation by which conceptions and theories are generated as abductive, at some places, Peirce describes abduction simply as the adoption of a hypothesis (e.g., CP 2.96, 1902), and the adoption of a hypothesis need not be generative. The hypothesis may have been generated elsewhere and simply adopted into a new context. For instance, suppose a team of researchers generates and adopts some hypothesis to explain a certain phenomenon, but that research team or project falls apart. Many decades later, a new team forms and researches the same phenomenon, and this new team stumbles upon the hypothesis of the previous team and adopts it. The new team has made an abductive step in adopting the hypothesis, even though they did not generate the hypothesis (and were not even the first to adopt it).
Now, we come to a central conjecture regarding Peircean abduction. It is that either we should recognize more than one concept of abduction in Peirce or else we should recognize Peirce’s concept of abduction as a cluster concept, in that there is no single feature that all abduction must share, excepting extremely general features. While there is a feature characterizing all Peircean abduction – all abduction must introduce some sign (icon, symbol, hypothesis, etc.) into some context – that feature is so general that it leaves little to say about Peircean abduction universally. It becomes more useful to focus on different and more specific types of abduction. But as all these types of abduction are connected, with some being directly involved in others (as the abductive instinct seems involved in the abductive step of inquiry), it seems more accurate to recognize one general concept of abduction in Peirce, but one that is a cluster formed out of more specific concepts. This can be called the non-essentialist reading of Peircean abduction.
One might argue that, even if not all abduction is generative, and even if not all abduction is inferential, all abduction still results in hypotheses. Peirce describes a hypothesis as a “supposed truth” (CP 6.525) which we only provisionally accept for the purposes of testing it. In treating it as a “supposed truth,” we commit ourselves not to its truth but only to its being further tested.
However, once it is admitted that not all abduction is inferential or operates upon propositions, it is easy to see how not all abduction results in hypotheses. Hypotheses need to be truth-apt and, thereby, need to have propositional content; and while an abductive operation that does not operate upon propositions might still result in a proposition, the comprehension thesis seems directly to allow abductive operations to result in “conceptions” rather than hypotheses (CP 5.590, 1903), and it is not evident that “conceptions” are always propositional.
Moreover, in the sixth and seventh lectures of the 1903 Harvard series, Peirce characterizes perception as abductive; and as will be explained further on, he holds perception to result in judgments, and not in hypotheses. He characterizes the “perceptual judgment” as the assertion of a proposition to oneself concerning what one perceives (CP 5.29), and in asserting it, we make ourselves “responsible for its truth” (CP 5.543). Thus, a perceptual judgment, despite being the result of an abductive process, cannot be a hypothesis, as the truth of a hypothesis is only supposed and not asserted. In Peirce, supposing and judging carry very different commitments.
Another reason he cannot regard perceptual judgments as hypotheses is that he regards perceptual judgments as judgments against which our hypotheses are tested during the inductive step of inquiry. If perceptual judgments were effectively hypotheses themselves, we could not turn to any perceptual judgment to confirm or disconfirm a hypothesis, because hypotheses do not have the epistemic standing to confirm or disconfirm other hypotheses. All hypotheses need to be tested against something of a higher epistemic standing than the hypothesis itself. So, although Peirce often defines abduction as the operation or process of forming or adopting an “explanatory hypothesis” (CP 5.171, 5.189), no specific type of propositional attitude necessarily attaches to the result of Peircean abduction most generally.
One might argue that his claim, in the 1903 Harvard lectures, is not that perception is abductive but that perception is like abduction, so that he is only comparing perception to abduction and not subsuming perception under abduction. However, as will be shown later, his claim is that the perceptual process differs from abductive inference only by a matter of degree, not in kind; and as he mostly intends “inference” to refer to a self-controllable operation, the difference between perception and abductive inference is mainly a difference in degree of (self-)controllability, so that the general type of operation is still the same – that general type of operation being abduction. Thus, as a general class of operation that can be deliberate or nondeliberate, abduction can issue in judgments as well as in hypotheses.
However, as both “perceptual judgments” and “adoptions” introduce a proposition to a line of reasoning or inquiry, all abduction can still be characterized as introducing some semiotic element to some new context. It might not be a new line of reasoning or inquiry into which an abductive process introduces a new element. Reasoning and inquiry are always self-controlled processes operating upon propositions, while, upon Peirce’s comprehension thesis, sub-symbolic processes occurring at the uncontrolled level of cognition can also be abductive.
Thus, the “essence” of abduction would have to be broadened to “introduce a new sign into some semiotic process.” This description does characterize all abduction, but it is so broad that it renders abduction in general uninteresting. The individual types of abductions are of more interest, whether it is abduction as a deliberate step of reasoning or abduction as a sub-symbolic, uncontrolled process resulting in a new conception.
Peircean Abduction: Instinct, Association, and Selection
As already indicated, in Peirce, there are two levels at which an inference or a process can be identified as abductive: a deliberate or self-controlled level and a nondeliberate, unself-controlled, or “instinctive” level. A deliberate or self-controlled-level abduction always has an inferential form, either that of the arguments he classifies as abductive in his earlier work or that of the step of reasoning to which, as has been seen, he attributes an inferential form in the 1903 Harvard lectures. At the deliberate or self-controlled level, an inference is abductive if it at least introduces a new proposition to a deliberate inquiry or course of reasoning, either in the form of a hypothesis or a judgment.
Nondeliberate, instinctive abductions are interesting because Peirce repeatedly mentions them but says relatively little about them. As he sees it, the exact nature of the sub-personal processes involved in these abductions is a matter best left to psychology (CP 5.55, 1903). However, some insights relating to their semiotic character or structure can still be drawn from Peirce.
In some writings, Peirce recognizes associative processes of the mind that are neither self-controlled nor necessarily conscious and some of which seem abductive. In the 1890s, he distinguishes between association by contiguity and association by resemblance, and he says that “suggestion by contiguity may be defined as the suggestion by an idea of another, which has been associated with it, not by the nature of thought, but by experience” (CP 7.391), while “resemblance, then, is a mode of association by the inward nature of ideas and of the mind” (CP 7.392). Elsewhere he says: “This sort of association by virtue of which certain kinds of ideas become naturally allied, as crimson and scarlet, is called association by resemblance” (CP 7.498). His remarks on association by resemblance – namely, that these associations occur by “the inward nature of ideas” or because ideas become “naturally allied” – suggest that it is not the conscious perception of a resemblance that causes the association but, rather, something intrinsic to the ideas. Indeed, he explicitly says that “in my opinion, it is not the resemblance which causes the association, but the association which constitutes the resemblance. . . . It is [ideas] clustering together in the Inner World that constitutes what we apprehend and name as their resemblance” (CP 4.157, 1897). We do not deliberately cluster these ideas together; however, when they do naturally cluster, Peirce thinks that we tend to perceive a resemblance.
What Peirce means by an “idea” is not entirely clear and likely changes with context, although, with respect to the association of ideas, he seems to mean feelings (e.g., CP 4.157), especially as he distinguishes between ideas and conceptions, the latter of which he says are habits (CP 7.498).
With respect to association by resemblance, ideas seem to serve as iconic signs, as he says that “I call a sign which stands for something merely because it resembles it, an icon” (CP 3.362, 1885). The “natural clustering together” of ideas or feelings makes for icons, and this is where we seem to have an abductive process. At a few places, Peirce describes abduction as the suggestion of a hypothesis by means of resemblance. For instance, he writes: “The mode of suggestion by which, in abduction, the facts suggest the hypothesis is by resemblance – the resemblance of the facts to the consequences of the hypothesis” (CP 7.218, 1901). Notice that this corresponds to Peirce’s earlier classification of analogical arguments as abductive. While generally he thinks that we are conscious of our ideas or feelings, this process by which ideas or feelings are associated “by resemblance,” or by which they “naturally cluster together,” can be entirely unconscious and, most importantly, nondeliberate.
The association of ideas, particularly association by resemblance, helps to explain how our abductive instinct can be generative. However, our abductive instinct cannot be merely generative, as then the conception or hypothesis that is generated could be anything. There must also be mechanisms either that constrain what conceptions or hypotheses get generated or that select one hypothesis out of the many that are generated. Moreover, these selective mechanisms must be truth-conducive, as, according to Peirce, our minds are able, “in some finite number of guesses, to guess the sole true explanation” (CP 7.219). His account of association by resemblance does not seem to account for this selective feature; at least, it is unclear how ideas being “naturally allied” would tend to produce true hypotheses at a rate greater than random generation.
What Peirce calls “association by contiguity” might explain this selective feature of our abductive instinct. He says that associations by contiguity result from experience, and, in that case, it is reasonable to expect such associations to tend to match the patterns and regularities in experience, upon which true predictions can be made. A problem, however, is that he identifies association by contiguity with induction, not abduction. He writes: “The mode of suggestion by which in induction the hypothesis suggests the facts is by contiguity” (CP 7.218, 1901). So, if our tendency to guess correctly is abductive, yet involves association by contiguity, then it seems that our abductive instinct would have to involve an inductive element – and with that, it might involve a deductive element as well.
One might argue that this violates Peirce’s autonomy thesis, upon which abduction cannot be reduced to any combination of deduction and induction. Peirce writes: “Abductive and Inductive reasoning are utterly irreducible, either to the other or to Deduction, or Deduction to either of them” (CP 5.146, 1903). As Kapitan (1997) describes it, “abduction is, or embodies, reasoning that is distinct from, and irreducible to, either deduction or induction” (p. 478). However, the claim that Peircean abduction has, or can have, deductive and inductive components does not violate this autonomy thesis. So long as some component of the abductive process, such as the purely generative component, cannot be modeled deductively or inductively, abduction remains irreducible to induction or deduction. Thus, our abductive instinct can incorporate associations by contiguity, despite their inductive character, so long as our abductive instinct also incorporates uniquely generative elements.
Allowing for inductive and deductive components in abductive processes can explain not just the selective character of such processes but also how abductive processes tend to select true hypotheses or result in “correct guesses,” at rates much greater than random selection. For instance, the abductive process might involve something like “simulated tests” that weed out all but one or a few of the generated results. While many hypotheses could be randomly generated through association by resemblance, association by contiguity then effectively “tests” each hypothesis against patterns recorded from experience, and those which do not hold up to those patterns are weeded out. Such simulated testing also requires elements of the hypothesis to be drawn out in a manner like deduction. Thus, the selection process in unconscious abductions may occur through cognitive mechanisms that are functionally like (what Peirce describes as) deductive and inductive steps of inquiry – and, thus, like these steps of inquiry, the selection process could tend to result in true conclusions.
Note that the literature on Peircean abduction tends to emphasize the role of economic criteria in the selection of a hypothesis, as Peirce himself emphasizes such criteria in an 1876 note on “the Economy of Research” (W 4:145-151) as well as in a 1901 manuscript on “Drawing History from Ancient Documents” (EP 2:75-114). In the latter work, he identifies three considerations for the selection of a hypothesis: empirical testability, explanatory power (the statistical or necessary deduction of the explanandum from the explanans), and economic considerations (time, money, etc.), the last of which weighs criteria like testability and explanatory power against cost and time. However, while Peirce does emphasize these “economic” selection criteria, they apply mostly to deliberate selections of hypotheses (i.e., deliberate abductions). Regarding nondeliberate or instinctive abductions, we cannot say to what degree such considerations figure into the uncontrolled, unconscious process.
As Kapitan (1990), Niiniluoto (1999), and others have observed, instinctive abductions might involve heuristic processes and biases, such as anchoring (assigning greater weight to certain information because of its place in a sequence), which sometimes direct us away from the truth but at other times direct us toward it. Instinctive abductions may involve various habits. On Peirce’s view, virtually all cognitive semiosis takes place through habits, whether the habits are inherited or acquired (see Wilson, 2016, pp. 123–128). Peirce would agree that we can acquire habits improving our abductive instincts (e.g., CP 5.160, 1903) and that we can train our abductive instincts to identify promising hypotheses in the same general way we can train our perception to identify certain objects in our environment, despite not having direct control over either our instincts or our perception. After all, Peirce considers perception itself to be a form of abduction.
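The generate-then-select picture described in this section can be made concrete with a small toy model. The following Python sketch is only an illustration under stated assumptions, not a reconstruction of any mechanism Peirce himself specifies: the names Hypothesis, generate_by_resemblance, select_by_contiguity, and abduce, as well as the wet-grass data, are hypothetical. Generation keeps every candidate whose consequences resemble the surprising fact (cf. CP 7.218), and selection then weeds out candidates whose drawn-out consequences do not match patterns recorded from experience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A candidate explanation together with its drawn-out consequences."""
    label: str
    consequences: frozenset  # what would be observed if the hypothesis were true


def generate_by_resemblance(surprising_fact, candidates):
    """Generative step (assumed model of association by resemblance):
    keep candidates that have the surprising fact among their consequences."""
    return [h for h in candidates if surprising_fact in h.consequences]


def select_by_contiguity(hypotheses, experience):
    """Selective step (assumed model of association by contiguity):
    a 'simulated test' that keeps only hypotheses all of whose
    consequences match patterns recorded from experience."""
    return [h for h in hypotheses if h.consequences <= experience]


def abduce(surprising_fact, candidates, experience):
    """Generate candidates, then weed out those that fail the simulated test."""
    generated = generate_by_resemblance(surprising_fact, candidates)
    return select_by_contiguity(generated, experience)


# Hypothetical illustration data.
CANDIDATES = [
    Hypothesis("rain", frozenset({"wet grass", "wet street"})),
    Hypothesis("sprinkler", frozenset({"wet grass", "dry street"})),
    Hypothesis("fireworks", frozenset({"loud noise"})),
]
EXPERIENCE = frozenset({"wet grass", "dry street"})

if __name__ == "__main__":
    survivors = abduce("wet grass", CANDIDATES, EXPERIENCE)
    print([h.label for h in survivors])  # → ['sprinkler']
```

In this sketch, the generative step alone returns both "rain" and "sprinkler"; only the simulated test against recorded experience narrows the guess to one. The sketch deliberately leaves out the economic criteria and the heuristic biases mentioned above, which would require further assumptions.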
Peircean Perception: Its Nature and Abductive Character
While Peirce’s account of perception is not given as much attention in the literature as his account of abduction, it is still given much attention. No book focused exclusively on it has been published, but it is the subject of numerous articles, from Bernstein (1964) to Humphreys (2019), and chapters are devoted to perception in several books on Peirce, from Hookway (1985) to Wilson (2016). Along with his account of abduction, Peirce’s account of perception is vital to understanding his account of inquiry and knowledge. Nowhere is this more apparent than in his 1903 Harvard lectures. These lectures show not only that, on his account, perception is crucial to inquiry and knowledge but also that this is because of its abductive character.
Peirce had already declared that all knowledge depends on perception in his 1883 article, “A Theory of Probable Inference” (W 4:481-522). But in the 1903 Harvard lectures, this is explained as meaning that all knowledge depends on perceptual judgments, where a perceptual judgment is “a judgment asserting in propositional form what a character of a percept directly present to the mind is” (CP 5.54). What Peirce means by a “percept” is less clear from these lectures and is generally more controversial. However, both the percept and perceptual judgment are essential to perception and to understanding its abductive character.
To understand Peirce’s account of perception, one must also look to works other than the 1903 Harvard lectures, particularly an unpublished 1903 manuscript, R 881, written shortly after the lectures, and a 1906 article in The Monist, “Prolegomena to an Apology for Pragmaticism,” which makes explicit (if obscure) connections to his mature semiotic, the best-known feature of which is the classification of signs into icons, indices, and symbols. Both this trichotomy and his sign-object-interpretant trichotomy help us to understand, on his account, the nature of perception and the respective roles of the percept and the perceptual judgment.
What Peirce means by a “perceptual judgment” seems clear enough – it is “the first judgment of a person as to what is before his senses” (CP 5.115). It is also clear that he thinks that our perceptual judgments are uncontrollable, in that we cannot directly control what they say and we cannot help but accept what they say. Their roles in inquiry and knowledge are clear as well. Peirce holds that perceptual judgments are the “basic premises” of all our reasonings and inquiries, and as such they are, substantially, the materials from which we build up our knowledge. They are also the judgments which confirm or contradict our theoretical predictions in the inductive step of inquiry and which bring our beliefs into doubt in everyday experience.
However, the answers to some questions about perceptual judgments are not immediately clear from what Peirce says about them. First, exactly what propositional content can perceptual judgments take? To use Peirce’s example, while looking at a yellow chair, is our judgment “the chair is yellow,” or is it “the chair appears yellow”? In R 881, both are referred to as perceptual judgments (CP 7.631-35), yet one suggests that our perceptual judgments are infallible subjective appearance reports, “the chair appears yellow,” while the other suggests that they are fallible reports of objective fact, “the chair is yellow.” Given the interpretative nature of perceptual judgments, as will be explained further on, it seems that perceptual judgments could take either type of proposition. However, another question along the same lines is: can a judgment concerning what a word or sentence means be a perceptual judgment?
After all, in hearing or seeing words and sentences in our native language, we immediately and uncontrollably judge their meaning. As argued elsewhere (Wilson, 2017), Peirce’s account entails that we can perceive semantic or semiotic properties, a claim that is also defended outside the Peirce literature (e.g., Brogaard, 2018). A second question is: how do our minds determine a perceptual judgment, or by what exact processes is a certain perceptual judgment selected over others? We know that the answer to this question concerns abduction, and further on it will be explained how. A third question is: do perceptual judgments constitute a type of knowledge – namely, “perceptual knowledge” – or, given that they are results of abductive processes, can they have only the epistemic status afforded to hypotheses? As explained earlier, because perceptual judgments include those judgments against which our hypothesis are tested, if they had an equal epistemic standing to scientific hypotheses, then perceptual judgments could not serve to falsify hypotheses, as, having such equal epistemic standing, our hypotheses could equally falsify perceptual judgments. While Peirce recognizes perceptual judgments to be fallible, he must also recognize them to have, overall, greater epistemic standing than hypothesis. What Peirce means by a “percept” cannot be as easily discerned at face value as what means by a “perceptual judgment.” At some places, he seems to suggest that
the percept is a sort of sensory image or picture, such as in the lectures, where he says that the perceptual judgment “is as unlike [the percept] as the printed letters in a book, where a Madonna of Murillo is described, are unlike the picture itself” (CP 5.54). However, for Peirce, the percept is not a mental image that passes before the “mind’s eye” and exists entirely “in the head.” He insists that “it is the external world that we directly observe” (EP 2.62, 1901). This is implied at each place he upholds “the doctrine of immediate perception” of commonsense philosophers such as Thomas Reid, which says that we directly perceive external objects (W 2:471, 1871; EP 2:155, 1903; and CP 7.639, 1903). Peirce also says that “the object perceived is the immediate object of the destined ultimate opinion” (8.260, 1905) – i.e., the real. However, the reality perceived is not always an external reality. Peirce classifies hallucinations and dreams as perceptions (CP 7.638 and CP 7.646, 1903), in which case the perceived reality may only be an internal one. While the objects dreamt about are not real, the fact that one is dreaming is real, and, it can be argued, it is that fact that we perceive, where the error lies only with the perceptual judgment (see Wilson, 2016, pp. 205–210 for a Peircean account of perceptual error). In R 881, Peirce explains that “image” would be a misnomer for the percept because the percept does not “profess to represent” anything. However, in saying this, he does not mean that percepts are not signs at all. He means that they are not propositional signs. In 1906, he says that percepts are semes, which he defines broadly as “anything which serves for any purpose as a substitute for an object of which it is, in some sense, a representative or Sign” (4.538). 
But if percepts do not represent things as images or as propositions do – or as icons or as symbols do – then they must represent things as indices do, namely, through direct physical relations. As has been argued elsewhere (Wilson, 2016, pp. 190–204), Peirce regards the percept as a direct perception of whatever inquiry would, at “the final opinion,” find it to be a direct perception of. The percept is not the perceived object itself. It is, rather, the act of perception, only absent any conceptualization or judgment concerning what is perceived. The percept is a cognitive index, and as an index, it only points to the object, directs our attention to it, or “presents” it to us; and the object it presents is always a reality. Although the percept bears phenomenal qualities or “firstness,” it is not those phenomenal qualities to which the percept directs us. Generally, the percept directs us to some external reality, and the “directness” of perception consists just in the percept directing us to some reality. However, the percept presents that reality whole and unconceptualized. The description or analysis of the perceived reality is left to the perceptual judgment, which could always misdescribe the reality. However, indexically, perception is always of something real. What that reality really is, however, might not be determined until “the final opinion.” While the percept is only an index and the perceptual judgment is a symbol (specifically, an internally asserted proposition), the percept is still the “input” of the perceptual judgment. For Peirce, this makes the perceptual judgment an interpretation or interpretant of the percept – an interpretant of the percept as an index of the perceived object. In Peircean semiotics, the interpretant is an interpreting sign of another sign, which makes it a sign of the same object as the sign
it interprets. Peirce says that the perceptual judgment only represents the percept as an index, and it is an index of the percept through its causal relationships with the percept (CP 7.628, 1903). However, the perceptual judgment is a symbol or, more specifically, a proposition of the same object or reality that is presented or indexed by the percept. Thus, while the perceptual judgment points to the percept, the perceptual judgment is true or false, not of the percept but of the reality presented or indexed by the percept. To summarize: the percept indexes the perceived reality but describes nothing; the perceptual judgment indexes the percept but describes the perceived reality, correctly or incorrectly. However, of particular interest here is the process by which the perceptual judgment interprets the percept: the process going from the input of the percept and resulting in a certain perceptual judgment. According to Peirce, introspection cannot guide us here, not only because he denies that we have any such capacity (EP 1.66-67, 1868) but also because he regards the process going from the percept to the perceptual judgment as not subject to self-control. So, any reasoning or any deliberate cognitive action concerning perception can set out only with perceptual judgments. Our perceptual judgments mark the point at which we can begin exercising self-control over our own cognition; thus, any attention one pays to one’s own perceptual experience will simply result in some other perceptual judgment. As Peirce remarks: “As for going back to the first impressions of sense, as some logicians recommend me to do, that would be the most chimerical of undertakings” (CP 2.141, 1902). In the 1903 Harvard lectures, Peirce also remarks that “[y]ou may adopt any theory that seems to you acceptable as to the psychological operations by which perceptual judgments are formed. For our present purpose it makes no difference what that theory is” (CP 5.55).
His “present purpose,” in that context, is to determine the significance of perception for inquiry; and on his view, its significance is that all inquiry depends on perceptual judgments, from the “surprises” that cause doubt to the observations of experimental results. However, Peirce also recognizes that the process by which the perceptual judgment occurs (in response to the percept) is semiotic; otherwise, the perceptual judgment would not be a natural index of the percept. Moreover, the process resulting in a certain perceptual judgment must have a semiotic structure that allows for the perceptual judgment to be, not just an index of the perceived reality but also a true description of it. Peirce needs to explain how some propositional content arises in response to the percept and why that content could be true of the reality indexed by the percept. Peirce surmises that the semiotic process resulting in a certain perceptual judgment can be characterized as abductive. In this way, we understand perceptual judgments, not as incorrigible reports of what we perceive but as conclusions of unconscious inferences or interpretations that are subject both to correction and to conflicts with other interpretations. That is, any perceptual judgment is subject to falsification through inquiry, even though we must always base inquiry upon some perceptual judgments.
In the 1903 Harvard lectures, Peirce introduces the claim that perception is abductive as the third of his three “cotary propositions” of pragmatism – “cotary” referring to whetstones (stones used for sharpening blades) – which he says “put the edge on the maxim of pragmatism” (EP 2.226, 1903). Toward the end of the sixth lecture, after he had defended the first two cotary propositions, he argues: I do not think it is possible fully to comprehend the problem of the merits of pragmatism without recognizing these three truths: first, that there are no conceptions which are not given to us in perceptual judgments, so that we may say that all our ideas are perceptual ideas. This sounds like sensationalism. But in order to maintain this position, it is necessary to recognize, second, that perceptual judgments contain elements of generality, so that Thirdness is directly perceived; and finally, I think it of great importance to recognize, third, that the abductive faculty, whereby we divine the secrets of nature, is, as we may say, a shading off, a gradation of that which in its highest perfection we call perception. (EP 2.223-24)
As presented here, the third “truth” is that our abductive faculty or instinct is continuous with perception, suggesting that perception and our abductive faculty are the same general kind of cognitive operation, having only different degrees of certain features. Peirce further explains this claim in the seventh and final lecture: The third cotary proposition is that abductive inference shades into perceptual judgment without any sharp line of demarcation between them; or, in other words, our first premisses, the perceptual judgments, are to be regarded as an extreme case of abductive inferences, from which they differ in being absolutely beyond criticism. (EP 2.227)
Here Peirce repeats that the conclusions of abductive inferences are on the same spectrum with perceptual judgments, suggesting that they differ only with respect to their controllability and capacity for logical criticism. The shift from “abductive faculty” to “abductive inference” suggests that it is of no consequence to Peirce which one we say is on a spectrum with perception. And it would not be clear how abduction and perception could be on the same spectrum and how this point could be important, unless perception itself could be correctly characterized as abductive. In the next paragraph, Peirce explains better what he means: On its side, the perceptive judgment is the result of a process, although of a process not sufficiently conscious to be controlled, ... If we were to subject this subconscious process to logical analysis, we should find that it terminated in what that analysis would represent as an abductive inference, resting on the result of a similar process which a similar logical analysis would represent to be terminated by a similar abductive inference, and so on ad infinitum. (EP 2.227)
Note that Peirce does not say that the subconscious process forming perceptual judgments consists in abductive inferences, but that “logical analysis” would represent that process as consisting in abductive inferences. In the 1903 Harvard lectures, Peirce seems to use “inference” in the sense of a deliberate step of reasoning, so that one reason, at least, the perceptual process cannot actually consist
in abductive inferences is that the perceptual process is not deliberate or self-controlled. Another reason might be that he thinks the perceptual process does not even have an inferential structure, as a passage from some propositions (premises) to others (conclusions). Indeed, as he emphasizes that the percept is not propositional, and not even a symbol, the percept cannot serve as a literal premise in an abductive inference. Thus, at least some part of the perceptual process cannot have a literal inferential structure (which, again, is not to say it cannot have a semiotic structure). However, two points should be made here. First, as explained earlier, abduction does not trade just in propositions. So even if the perceptual process does not have an inferential structure, it can still be characterized as abductive. As explained earlier, abduction most generally can be characterized by the function of introducing some semiotic content to some semiotic context, and perception certainly qualifies as abductive in this way. Second, Peirce may regard the process resulting in perceptual judgments, or some part of that process, as having an abductive inferential structure; only the individual abductions are not naturally individuated. He says that the perceptual process “does not have to make separate acts of inference, but performs its act in one continuous process” (EP 2.227). He mentions this also where he argues that if the perceptual judgment “were of a nature entirely unrelated to abduction, one would expect that the percept would be entirely free from any characters that are proper to interpretations, while it can hardly fail to have such characters if it be merely a continuous series of what discretely and consciously performed would be abductions” (EP 2.229, my emphasis). In other words, the perceptual process can be said to have an abductive inferential structure; only the inferences are not discrete and are not consciously performed.
In that case too, perception is not merely “like” abduction – perception is abductive, only in an unconscious, uncontrolled, and continuous manner. From what other materials besides the percept does the perceptual process abduce the perceptual judgment? Peirce does not directly specify what they are. However, his account suggests that there is “top-down” processing, whereby cognitive habits such as beliefs can contribute to the interpretation of the percept in the formation of the perceptual judgment. The percept itself is affected by this interpretative process, as Peirce indicates where he introduces the percipuum, or “the percept as it is immediately interpreted in the perceptual judgment” (CP 7.643). The percipuum is the percept, but as interpreted in the perceptual judgment. The percipuum is the perceptual appearance or presentation, the cognitive index (secondness) bearing phenomenal properties (firstness) but with a conceptual structure (thirdness) that makes for the appearance or presentation of conceptually distinct objects and properties. (See Haack, 1994, p. 19, for an alternative interpretation of the “percipuum.”) The abductive process by which the perceptual judgment occurs affects the percept itself (top-down processing), so that, in perception, our attention is directed not just toward a whole perceptual field but to specific and distinct objects and properties. Peirce remarks that “we perceive what we are adjusted for interpreting” (EP 2.229), showing he understands that we are always primed to perceive what we expect to perceive and that what we expect to perceive depends on a whole network of acquired concepts and beliefs. Our concepts and
beliefs bear dynamical, indexical relations to perception, so that certain indexical actions of the percept will connect with certain concepts and beliefs and enter them into the abductive process resulting in the perceptual judgment.
Peirce’s Account of Knowledge
Unlike his famous modern predecessors, such as Descartes and Kant, who were focused on the possibility of empirical knowledge, Peirce seems most interested in the nature of inquiry, where inquiry might not seem to explain the possibility of empirical knowledge but, rather, to presume empirical knowledge to be possible. However, ultimately, Peirce offers an account of the possibility of knowledge that avoids Cartesian foundationalism and Kantian transcendental idealism, and his focus on inquiry – and, hence, perception and abduction – is essential to it. In Peirce, “knowledge” would be best understood in relation to a final, fixed consensus about reality resulting from sufficient inquiry, so that questions concerning the possibility of empirical knowledge are reformulated into questions concerning the possibility of inquiry leading to a final, fixed consensus about reality (see Wilson, 2016 and 2018). His pragmatism or “pragmaticism” (EP 2:334-35, 1905) can be viewed as the groundwork for this account of knowledge, as it requires that the traditional questions be answered in terms of “practical bearings” (EP 1:63). For Peirce, while we can define “truth” as correspondence to reality, such definitions are mere abstractions, achieving only the second grade of clearness. We must seek a third grade of clearness by considering the significance that the objects of our concepts have for our perception and purposeful conduct (their “practical bearings”). The most direct significance that truth and knowledge have for perception and conduct, on Peirce’s view, is that truth and knowledge are the aims and expected products of the observations and conduct constituting inquiry (EP 2:449-50, 1908). In inquiry, we conduct ourselves and make observations in ways aimed at truth and knowledge, and inquiry is not complete until truth and knowledge are obtained.
As Peirce explains in his 1877 and 1878 papers, an “ultimate result of inquiry” is a point or limit at which there is no longer any possibility of genuine doubt to dislodge any opinion or belief, so that all opinion becomes permanently settled, or all belief becomes permanently fixed. On the famous belief-doubt model of inquiry he develops in those papers, beliefs are described as habits of action, and doubt arises when these habits are disturbed by “surprises” or failed expectations; the “irritation of doubt” then instigates inquiry, and inquiry continues until the doubt is eased through the establishment of new “belief-habits” (or the reestablishment of old ones). Thus, the possibility of inquiry having an ultimate result is the possibility of a state of belief avoiding all surprise or failed expectation and, thereby, all genuine doubt. Psychologically, a doubt could still arise (e.g., from a delusion, etc.); but at the final opinion, doubt would never arise from perception or logically correct inference from perception that really conflicts with our beliefs. Inquiry aims for fixed beliefs, or belief that would avoid all such “genuine” doubt by perception
or any correct reasoning therefrom (CP 5.445, 1905). In this respect, we can view inquiry as the attempt to adapt our beliefs to reality, or, at least, to reality as it affects our senses and (through our senses) our beliefs. Truth and knowledge, then, must be understood in terms of such adaptation. Thus, the possibility of empirical knowledge is the possibility of the adaption of belief to reality, and a few things are needed to make that possible. First, there must be a mechanism for belief variation, so that any belief, having some possibility of success, would eventually be “tried.” There must also be a selection mechanism, but that would just be the success a belief could have in avoiding being eliminated in the long run through sufficient doubt. With sufficient variation, we would, at least, chance upon those beliefs setting successful expectations, and eventually these beliefs would be the only ones remaining, as all the others will have been repeatedly doubted, due to failed expectations, and “filtered out.” At least in this superficial way, as involving belief variation and selection, Peirce’s epistemology is evolutionary (as has been observed by other commentators, e.g., Skagestad, 1979). This is how abduction is crucial to his epistemology, as abduction is particularly crucial to the variation part of the process. As has been seen, in Peirce, abduction is a mechanism for belief variation, but it does not generate new beliefs in an entirely random or spontaneous way. It incorporates inductive and deductive elements to filter out propositions that do not align with patterns in our experience. This is compatible with the idea that, by nature, we employ rules or heuristics to make guesses that have a greater chance at being correct than if the guess were completely random. These rules or heuristics would themselves be doubted if the guesses we make upon them tend not to hold up against experience. 
But we should expect that some rules or heuristics have led to such success that they have become strongly engrained. Some might even involve physiological adaptations for sorting through sensory information, resulting from biological evolution and ecological pressures on organisms to avoid unexpected and undesirable consequences. In large part, this is the nature of perception, which, as was explained above, is, in Peirce, the main way that our beliefs clash with reality to result in doubt and inquiry. In scientific inquiry, on his account, our hypotheses clash with reality through perception, either eliminating them or confirming them, where consistent confirmation can transform them into full beliefs (or belief-habits). Thus, on Peirce’s account, perception is particularly crucial to the selection process in the approach toward the final opinion. Perception is also crucial to the variation process in being itself abductive. Peirce regards perception as, essentially, a fundamental biological mechanism for guessing correctly. But it is likely that natural evolution has not stopped with perception. Functioning independently of perception are various “instincts” for guessing correctly – instincts that Peirce calls “abductive” – and which, like perception, are not always correct and perhaps less often correct than perception. From this overview, Peirce’s account of knowledge and his account of abduction might seem “naturalistic,” as they rest on psychological claims and make evolutionary analogies. His approach is, perhaps, more naturalistic than he would have liked to admit.
Although Peirce repudiates psychologism and warns against basing philosophy on any special science, he explicitly advises philosophers to draw ideas from the whole range of human knowledge (EP 1:307, 1891), and he argues that philosophical reasoning can correctly rest on “common experience,” including “common-sense observations concerning the workings of the mind” (CP 5.485, 1907). Peirce’s “anti-naturalism” consists mainly in the claim that our philosophical reasonings should not rest on observations made within special scientific research, which is to say that the direct empirical evidence for scientific theories cannot also serve as direct evidence for philosophical theories. However, this is not to say that our philosophical reasonings cannot rest on scientific ideas and knowledge that have become embedded in our common experience and common knowledge, such as knowledge concerning the basic mechanisms of organic evolution (even though those mechanisms were not quite “common knowledge” during Peirce’s time!). Peirce indicates his methodology early on, in “Some Consequences of Four Incapacities” (Journal of Speculative Philosophy, 1868: 140–57), where he says that “we cannot begin with complete doubt. We must begin with all the prejudices which we actually have when we enter upon the study of philosophy” (EP 1:65). Concerns about circular justification in Peirce are misplaced as they assume that it is necessary to justify the whole of our empirical knowledge, whereas, for Peirce, any attempt at such justification is misguided. We can read Peirce as fundamentally concerned with explanation, and not with justification. On his approach, we should aim at a coherent explanation of knowledge, and there is nothing incoherent or circular about explaining knowledge, as a would-be “final opinion” or final consensus, based on common experience and scientific ideas like evolution.
It is not the assertion of our common experience and scientific ideas as knowledge that explains the nature of knowledge. Our common experience and scientific ideas could have a lesser epistemic status but still do the explanatory work that Peirce requires. Likewise, when we approach the topic of perception in Peirce, we find that he is not fundamentally concerned with justifying the general reliability of our perceptual judgments or with responding to external world skepticism. He is fundamentally concerned with the roles that perception plays in inquiry and in the development of inquiry toward a final consensus. On the one hand, the role of perception, on his account, belongs to the inductive step of inquiry. Perception compels judgments which either reinforce our beliefs or prompt us to doubt our beliefs. On the other hand, perception is abductive, in that the perceptual process involves the generation and selection of an interpretation of the perceived reality.
Conclusion
In the 1900s, Peirce comes to recognize that “abduction” covers a much broader class of phenomena than he originally thought. As he embeds his theory of inquiry into his general semiotic, abduction comes to include not just a type of inference or a step of inquiry but a whole aspect of the semiotic process – the aspect of introducing some sign into some context in some way, whether it is a feeling that becomes a
sign (icon) of another feeling naturally associated with it or is a proposition that is adopted as a leading hypothesis in scientific research. The act of introducing some sign into some context (in some way) can be distinguished from the act of weakening or strengthening the sign (induction) and from the act of drawing out other signs from it (deduction). While such broad semiotic use of “abduction” is not immediately transparent in Peirce, as has been shown here, it is suggested at many places in his writings, not least by his third “cotary proposition” of pragmatism, which directly characterizes perception as abductive. Since perception is not propositional all the way down (to the percept) and since it does not issue in hypotheses, by characterizing perception as abductive, Peirce at least implicitly acknowledges that abduction is a very broad type of semiotic process. Again, while such a broad characterization of abduction might seem to render abduction uninteresting, the more specific types of abduction, such as perception, remain interesting, and it can be helpful to recognize that each type of abductive process fulfills a very broad role of any semiotic or cognitive process. So long as the abductive aspect of a semiotic or cognitive process can be distinguished from what we might discern as its inductive and deductive aspects, abduction remains a distinct aspect of such processes. Thus, abduction remains an essential distinction in any account of knowledge or cognition, and it is particularly essential to Peirce’s dynamical account of cognition and inquiry as tending toward an ultimate result. Indeed, so far as the approach toward an “ultimate result” of inquiry can be regarded as a process of introducing an ultimate result, inquiry itself, in Peirce, can be classed as one great abductive process (only, as with other abductive processes, one having inductive and deductive components).
References
Bernstein, R. (1964). Peirce’s theory of perception. In E. C. Moore & R. S. Robin (Eds.), Studies in the philosophy of Charles Sanders Peirce: Second series (pp. 165–189). University of Massachusetts Press.
Brogaard, B. (2018). In defense of hearing meanings. Synthese, 195, 2967–2983.
Haack, S. (1994). How the critical common-sensist sees things. Histoire Épistémologie Langage, 16(1), 9–34.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–534.
Hookway, C. (1985). Peirce. Routledge and Kegan Paul.
Humphreys, J. (2019). Subconscious inference in Peirce’s epistemology of perception. Transactions of the Charles S. Peirce Society, 55(3), 326–346.
Kapitan, T. (1990). In what way is abductive inference creative? Transactions of the Charles S. Peirce Society, 26(4), 499–512.
Kapitan, T. (1997). Peirce and the structure of abductive inference. In N. Houser, D. D. Roberts, & J. van Evra (Eds.), Studies in the logic of Charles Sanders Peirce (pp. 477–496). Indiana University Press.
Lipton, P. (1991). Inference to the best explanation. Routledge.
McAuliffe, W. H. (2015). How did abduction get confused with inference to the best explanation? Transactions of the Charles S. Peirce Society, 51(3), 300–319.
Mohammadian, M. (2019). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S. Peirce Society, 55(2), 138–160.
Niiniluoto, I. (1999). Defending abduction. Philosophy of Science, 66, S436–S451.
Paavola, S. (2004). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9(3), 267–283.
Paavola, S. (2005). Peircean abduction: Instinct or inference. Semiotica, 153(1/4), 131–154.
Peirce, Charles S. (1857–1914). Manuscripts held at the Houghton Library of Harvard University, as identified in Richard Robin (1967), Annotated Catalogue of the Papers of Charles S. Peirce, Amherst, University of Massachusetts Press, and in Richard Robin (1971), “The Peirce papers: A supplementary catalogue,” Transactions of the Charles S. Peirce Society 7, no. 1 (winter), 37–57. Referred to as R[catalogue#]:[sheet#].
Peirce, Charles S. (1931–58). The collected papers of Charles Sanders Peirce (8 vols. Vols. 1–6 edited by Charles Hartshorne and Paul Weiss. Vols. 7–8 edited by Arthur W. Burks). Harvard University Press. Referred to as CP[volume#].[paragraph#].
Peirce, Charles S. (1982–2010). The writings of Charles S. Peirce: A chronological edition (7 vols. to date. Edited by the Peirce Edition Project). Indiana University Press. Referred to as W.
Peirce, Charles S. (1992–98). The essential Peirce: Selected philosophical writings (2 vols. Vol. 1 edited by Nathan Houser and Christian Kloesel. Vol. 2 edited by the Peirce Edition Project). Indiana University Press. Referred to as EP.
Skagestad, P. (1979). C. S. Peirce on biological evolution and scientific progress. Synthese, 41(1), 85–114.
Wilson, A. B. (2016). Peirce’s empiricism: Its roots and its originality. Lexington Books.
Wilson, A. B. (2017). What do we perceive? How Peirce ‘expands our perception’. In K. Hull & R. K. Atkins (Eds.), Peirce on perception and reasoning: From icons to logic (pp. 1–13). Routledge.
Wilson, A. B. (2018). Peirce’s hypothesis of the final opinion. European Journal of Pragmatism and American Philosophy, 10(2). Online since 11 January 2019.
Part II Theoretical and Cognitive Issues on Abduction and Scientific Inference
6
Introduction to Theoretical and Cognitive Issues on Abduction and Scientific Inference
Woosuk Park
Abstract
Part B of this handbook presents an overview of the most recent research on the foundational and cognitive issues on abduction. Lorenzo Magnani focuses on the problem of discoverability in the context of his own EC-model of abduction. John Woods’ chapter discusses abduction in mathematical space. Gerhard Minnameier analyzes abduction in its overall connection with deduction and induction and then in terms of inferential sub-processes. Gerhard Schurz discusses common cause abductions that explain correlated empirical dispositions in terms of common theoretical causes. Ilkka Niiniluoto argues that successful abduction leads to increasing truthlikeness. Finally, Ahti-Veikko Pietarinen and Francesco Bellucci examine the problem of psychologism about logic in Peirce’s notion of abduction.
In the last century, abduction was extensively studied in logic, semiotics, philosophy of science, computer science, artificial intelligence, and cognitive science. The surge of interest in abduction derived largely from serious reflection on the neglect of the logic of discovery at the hands of logical positivists and Popper, especially their distinction between the context of discovery and the context of justification. At the same time, the desire to recover the rationality of science that has been seriously challenged by the publication of Kuhn’s The Structure of Scientific Revolutions might be another important factor. However, the consensus is that researchers have failed to secure the core meaning of abduction, let alone to cover the full range of its
W. Park, Digital Humanities and Computational Social Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea. e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_81
applications. The controversial status of abduction can be immediately understood if we consider our inability to answer the following questions satisfactorily:
• What are the differences between abduction and induction?
• What are the differences between abduction and the well-known hypothetico-deductive method?
• What does Peirce mean when he says that abduction is a kind of inference?
• Does abduction involve only the generation of hypotheses or their evaluation as well?
• Are the criteria for the best explanation in abductive reasoning epistemic or pragmatic or both?
• How many different kinds of abduction are there?
Fortunately, the situation has improved considerably in the last quarter century. At the very least, some ambitious attempts to attain a unified overview of abduction have been made, e.g., in Gabbay and Woods (2005), Magnani (2001), and Aliseda (2006). Each of these attempts emphasizes its own strengths and achievements. For example, Aliseda’s book represents some logical and computational approaches to abduction quite well. Gabbay and Woods, by introducing the distinction between explanatory and nonexplanatory abductions, adopt a broadly logical approach that encompasses the practical reasoning of real-life logical agents. By introducing his multiple distinctions between different kinds of abduction, i.e., selective/creative, theoretical/manipulative, and sentential/model-based, Magnani (2001, 2009, 2017) develops an eco-cognitive view of abduction, according to which instances of abduction are found not only in science and other human enterprises but also in animals, bacteria, and brain cells. All these serious attempts to understand abduction have influenced each other enormously. Further, as Niiniluoto (2018) testifies, abduction has increasingly become a central issue in philosophy of science, philosophy in general, and even in broader contexts.
The existence of this Handbook of Abductive Cognition itself makes it evident that all these endeavors have turned out to be extremely fruitful. Part B of this handbook presents an overview of the most recent research on the foundational and cognitive issues on abduction inspired by all this. The chapter by Lorenzo Magnani focuses on the problem of discoverability in the context of his own EC-model of abduction. Of particular interest is his novel suggestion to view inferences through the more general concepts of input and output instead of those of premisses and conclusions. John Woods’ chapter discusses abduction in mathematical space. This is a timely attempt to face an important problem, given the special status of mathematics as the superordinate science in Peirce’s classification of sciences. Woods’ approach to the problem as a test case for abduction through the semantics of fiction is of the utmost theoretical interest. Gerhard Minnameier analyzes abduction in its overall connection with deduction and induction and then in terms of inferential sub-processes (i.e., colligation, observation, and judgment). In his timely examination of the GW model, he focuses on the delimitation of the abductive task and the criterion for the validity of abduction. Gerhard Schurz
discusses in depth common cause abductions that explain correlated empirical dispositions in terms of common theoretical causes. Further, he claims that these abductions also play an important role in the justification of metaphysical theories, such as perceptual realism. Starting with Peirce’s suggestion of analyzing the reliability of ampliative inferences by truth frequencies, Ilkka Niiniluoto examines the recent debates between those who emphasize the ignorance-preserving character of abduction and those who emphasize abduction as a powerful method of discovery, probabilistic confirmation, or acceptance of the best explanation. He concludes that successful abduction leads to increasing truthlikeness. Finally, Ahti-Veikko Pietarinen and Francesco Bellucci discuss the thorny issue of the possibility of psychologism about logic in Peirce’s notion of abduction. From their analysis of some of the most pertinent notions allied to abduction, such as instinct, they conclude that we can give these concepts perfectly nonpsychological, natural, and scientific glosses.
References
Aliseda, A. (2006). Abductive reasoning. Logical investigations into discovery and explanation. Springer.
Gabbay, D., & Woods, J. (2005). The reach of abduction: Insight and trial. A practical logic of cognitive systems (Vol. 2). Elsevier.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer.
Magnani, L. (2009). Abductive cognition. The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2017). The abductive structure of scientific creativity. Springer.
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
7
Discoverability in the Perspective of the EC-Model of Abduction: The Centrality of Eco-Cognitive Situatedness
Lorenzo Magnani
Contents
Defending Discoverability: The Urgent Need of an Ecology of Human Creative Abductive Cognition
The Eco-Cognitive Model of Abduction (EC-Model)
What Is the Optimization of Eco-Cognitive Situatedness in the Case of Abductive Cognition?
Irrelevance and Implausibility Are Not Necessarily Offensive to Rationality
AKM-Schema, GW-Schema, and EC-Model of Abduction
Framing Abduction in the Perspective of “General Inferential Problems”
Irrelevance and Implausibility Disculpated
Eco-Cognitive Openness
Explaining the Optimization of Eco-Cognitive Situatedness in a Logical Perspective
Situatedness: The Centrality of “Optimally Positioning” Input and Output
Affordances, Diagnosticability, Discoverability, and Abduction
Conclusion
References
Abstract
Fruitfully approaching the problems of discoverability involves an important intermediate step, which concerns the role of abductive cognition, that is, reasoning to hypotheses, and the logical models of it. To this end, when formalizing abductive reasoning, it is extremely useful to view inferences through the more general concepts of input and output instead of those of premisses and conclusions, which are standardly used to characterize abduction in the syllogistic scheme of the fallacy of “affirming the consequent.” Indeed,
L. Magnani, Department of Humanities, Philosophy Section and Computational Philosophy Laboratory, University of Pavia, Pavia, Italy e-mail: [email protected]; [email protected] © Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_1
from this perspective, abductive inferences can first of all be seen as related to logical processes in which input and output fail to hold each other in an expected relation, with the solution involving the modification of inputs, not that of outputs. Unfortunately, if input and output fail to hold each other in a “good” relation, it is very difficult to solve the related abductive problem: discoverability is jeopardized. In this perspective, and given that science produces and “maximizes” cognition through a process in which affirming truths implies negating truths, the analysis of abductive processes leads us to emphasize the following main aspects: “optimization of eco-cognitive situatedness,” “maximization of changeability” of both input and output, and high “information-sensitiveness.” It will also be illustrated that irrelevance and implausibility are not always offensive to reason, and so they can favor both discoverability and discovery. A final section will be devoted to clarifying the strict relationship between diagnosticability, “affordances” (environmental anchors that allow us to better exploit external resources), and abduction. Diagnosticability especially, but also discoverability, depends on the availability of the appropriate affordances.
Keywords
Abduction · Discoverability · Creativity · EC-Model of abduction · Irrelevance · Implausibility · Eco-cognitive openness · Eco-cognitive situatedness · AKM-schema of abduction · GW-schema of abduction · Affordances
Defending Discoverability: The Urgent Need of an Ecology of Human Creative Abductive Cognition
To further illustrate the importance of sustaining human creativity through the enhancement of discoverability, the example of the recent Encyclical Letter of the Holy Father Francis, head of the Catholic Church, is very useful. In the Encyclical Letter Laudato Si’ of the Holy Father Francis on Care for Our Common Home, the Pope repeatedly stresses the importance of human creativity in solving the many problems of humanity that the letter describes. Creativity, and so abductive cognition, is seen as a fundamental tool necessary to fix the failures of our societies and our lives. To give some examples, the following sentences can be found: “In order to continue providing employment, it is imperative to promote an economy which favours productive diversity and business creativity” (129), “creativity should be shown in integrating rundown neighbourhoods into a welcoming city” (152), and “[. . . ] political and institutional frameworks do not exist simply to avoid bad practice, but also to promote best practice, to stimulate creativity in seeking new solutions and to encourage individual or group initiatives” (177). Moreover, awareness of the fact that creativity is jeopardized in current societies is
exhibited: speaking of the effect of technology, the letter states, for example, that “Our capacity to make decisions, more genuine freedom and the space for each one’s alternative creativity are diminished” (108) and that “Human creativity cannot be suppressed” (131). Chapter 4 of the Encyclical Letter introduces the need of an integral ecology, which is also necessary in education, and illustrates many subtypes of ecology: first of all environmental, economical, and social ecology, but also cultural ecology, ecology of daily life, human ecology, and ecology of man (“man too has a nature that he must respect and that he cannot manipulate at will” – 155). In all these endeavors, related to the emancipation of human lives, human creativity is so fundamental that it is mandatory to strongly suggest integrating the deontological commitments of the Encyclical with what can be called an ecology of human creativity. This proposal of integration rests on accepting that this kind of ecology has priority over the others illustrated in the Encyclical Letter: consequently, a first ecological duty is that of protecting and sustaining human creativity, because it is exactly human creativity that can secure the implementation of the other kinds of ecology quoted from the Encyclical Letter. The ecology of human creativity, that is, an ecology of abductive cognition, has to be considered a conditio sine qua non in the following sense: without creativity and skillful human capacities, all the other envisaged and invoked ecologies, and sustainability in general, sadly tend to fail. The recent book (Magnani, 2022) clearly illustrates the importance of sustaining the so-called epistemic niches (on cognitive and epistemic niches, see the last section of the present chapter), that is, those eco-settings that are the condition of possibility of scientific abductive creativity.
It is necessary to focus attention on scientific creativity because of its importance in our current societies and collectives, not only because of its intrinsic value, so to speak, but also because, in our technological era, full of science-based artifacts of various types, knowledge in general and scientific knowledge in particular are fundamental. In the book Morality in a Technological World. Knowledge as Duty (Magnani, 2007), this issue has already been treated at length, arguing that the maintenance and flourishing of technological societies require a great deal of new scientific and ethical knowledge as well as modern approaches to moral deliberation; to achieve these goals, the book provides compelling analyses of the problems involved and offers a variety of strategies that might be used to solve them. It seems mandatory to hypothesize that producing and applying appropriately recalibrated general, scientific, and moral knowledge has become a duty. It is in this vein that the book analyzed troubling issues such as cyberprivacy, globalization, bad faith, cloning, biotechnologies, and ecological imbalances: the right creative knowledge can manage these challenges and counter many of technology’s ill effects by preserving ownership of our own destinies, encouraging responsibility, and enhancing freedom. However, creative knowledge can fully flourish only when discoverability is sustained, protected, and enhanced. A consequence of the above considerations is the following: if creativity, and in particular scientific abductive creativity, is jeopardized, it is practically impossible to deal acceptably and successfully with our technological societies,
also from an ethical perspective: this means that the need for an ecology of human creativity endorsed here is rooted in the observations just sketched, which draw on that book of 2007.
The Eco-Cognitive Model of Abduction (EC-Model)
Fruitfully approaching the problems of discoverability first of all involves an important intermediate step, which concerns the role of abductive cognition, that is, reasoning to hypotheses, and the logical models of it. To this end, when formalizing abductive reasoning, it is extremely useful to view inferences through the more general concepts of input and output instead of those of premisses and conclusions, which are standardly used to characterize abduction in the syllogistic scheme of the fallacy of “affirming the consequent.” This section describes the so-called eco-cognitive model (EC-Model) of abduction, which was introduced to highlight some aspects of hypothetical reasoning, such as eco-cognitive openness and situatedness, that are very useful for substantiating the problem of discoverability, especially in science. These important concepts will be explained in the following sections of this chapter. Let us begin with a preliminary illustration of the EC-Model of abduction. Readers interested in further details can refer to (Magnani, 2009, 2017). At the center of this perspective on cognition, and consequently of abduction, is the emphasis on the “practical agent,” on the individual agent operating “on the ground,” that is, in the circumstances of real life. In all its contexts, from the most abstractly logical and mathematical to the most roughly empirical, it is always important to emphasize the cognitive nature of abduction. Reasoning is something performed by cognitive systems. At a certain level of abstraction and as a first approximation, a cognitive system is a triple (A, T, R), in which A is an agent, T is a cognitive target of the agent, and R relates to the cognitive resources on which the agent can count in the course of trying to meet the target: information, time, and computational capacity, to name the three most important.
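As a purely illustrative aid, the triple (A, T, R) can be written down as a small data structure. The following Python sketch is not part of the EC-Model itself: the class names, the field names, the units (seconds, reasoning steps), and the example agent are all hypothetical choices made here only to fix ideas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """R: the three resources named in the text (field names are assumptions)."""
    information: frozenset   # items of information available to the agent
    time_budget: float       # hypothetical unit: seconds
    compute_budget: int      # hypothetical unit: reasoning steps


@dataclass(frozen=True)
class CognitiveSystem:
    """The triple (A, T, R): an agent, its cognitive target, its resources."""
    agent: str
    target: str
    resources: Resources

    def can_afford(self, seconds: float, steps: int) -> bool:
        # A reasoning task is affordable only if it fits both budgets.
        return (seconds <= self.resources.time_budget
                and steps <= self.resources.compute_budget)


# Hypothetical example: a practical agent operating "on the ground".
clinician = CognitiveSystem(
    agent="clinician",
    target="explain the patient's symptoms",
    resources=Resources(frozenset({"fever", "cough"}), 60.0, 1000),
)
```

The only point of the sketch is that the target is always pursued under bounded resources, which is what distinguishes the practical agent from an idealized reasoner.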
In this perspective, the agents are also embodied distributed cognitive systems: cognition is embodied, and the interactions between brains, bodies, and external environment are its central aspects. Cognition occurs by taking advantage of a constant exchange of information in a complex distributed system that crosses the boundary between humans, artifacts, and the surrounding environment, where instinctual and unconscious abilities also play an important role. This interplay is especially manifest and clear in various aspects of abductive cognition, that is, in reasoning to hypotheses, and gives birth to the EC-Model quoted above. This perspective adopts the wide Peircean philosophical framework, which approaches “inference” semiotically (and not simply “logically”): Peirce distinctly says that all inference is a form of sign activity, where the word sign includes “feeling, image, conception, and other representation” (Peirce, 1866–1913, 5.283). It is clear that this semiotic view is highly compatible with the description
of cognitive systems as embodied and distributed systems. It is in this perspective that the role of abductive cognition can be fully appreciated, which not only refers to propositional aspects but is also performed in a framework of distributed cognition, in which models, artifacts, internal and external representations, and manipulations also play an important role. Already in a passage concerning abduction (i.e., ἀπαγωγή, “leading away”) of Chapter B25 of the Aristotelian Prior Analytics, some of the currently well-known distinctive characters of abductive cognition can be clearly seen, which are in tune with the eco-cognitive model of abduction (EC-Model), which will be soon described. Aristotle is already pointing to the fundamental inferential role in reasoning of those externalities that substantiate the process of “leading away” (ἀπαγωγή). A new positive perspective about the “constitutive” eco-cognitive character of abduction can be gained, thanks to Aristotle himself: the various philosophical, logical, and cognitive aspects of the Aristotelian ideas concerning abduction are illustrated in detail in (Magnani, 2015a, 2016, 2017). The backbone of this approach can be found in the manifesto of the eco-cognitive model (EC-Model) of abduction in (Magnani, 2009). A reader interested in more details concerning the EC-Model of abduction can refer to (Magnani, 2015a, 2016). It might seem awkward to speak of “abduction of a hypothesis in literature,” but one of the fascinating aspects of abduction is that it can account not only for scientific discovery but for other kinds of creativity as well. Abduction does not necessarily have to be seen as a problem-solving device that sets off in response to a cognitive irritation/doubt: conversely, it could be supposed that esthetic abductions (referring to creativity in art, literature, music, games, etc.)
arise in response to some kind of esthetic irritation that the author (sometimes a genius) perceives in herself or in the public. Furthermore, it is not only esthetic abductions that are free from empirical constraints in order to become the “best” choice: many forms of abductive hypotheses in domains traditionally perceived as rational (such as the setting of initial conditions, or axioms, in physics or mathematics) are relatively free from the need of an empirical assessment. The same could be said of moral judgments: they are eco-cognitive abductions, inferred upon a range of internal and external cues and, as soon as the judgment hypothesis has been abduced, it immediately “can” become prescriptive and “true,” informing the agent’s behavior as such. The kinds of abduction that do not need empirical confirmation have been called “knowledge-enhancing”: Peirce implicitly provides various justifications of the knowledge-enhancing role of abduction, that is, of abduction that is not considered an inference to the best explanation (IBE) in the classical sense of the expression, namely an inference necessarily characterized by an empirical evaluation phase, or inductive phase. In Chapter 3 of Magnani (2017), the example of conventions is illustrated: abducing conventions favors and increases knowledge even if these hypotheses remain evidentially inert, at least in the sense that it is not possible to empirically falsify them. Consequently, abduced conventions are evidentially inert but knowledge-enhancing at the rational level of science. A rich analysis of the various criteria (or, more simply, of the various intermediate criteria for provisionally “judging” abductive results, such as plausibility,
relevance, consistency, simplicity, minimality, likeliness, loveliness, probability, uberty, fruitfulness, etc.) that characterize a “best explanation,” as they are illustrated by the rich literature on the issue produced in recent decades, is offered by Cabrera (2017, 2020). Cabrera proposes a less misleading name for IBE (at least in the case of the merely “explanatory” aspects): “inference to the hypothesis with the optimal combination of explanatory virtues.” This qualification emphasizes the “contextual” character of IBE and the fact that a privileged model of explanation is irrelevant and unnecessary. Cabrera also usefully observes: “Of course, explanationists must tackle other obstacles, such as offering more detailed accounts of the virtues, of why those virtues are epistemically relevant, and of how the virtues are to be agglomerated and compared. Still though, appealing to the explanatory virtues [. . . ], provides an elegant solution to what would otherwise be a devastating objection to IBE” (Cabrera, 2020, p. 21). When the distinction between selective and creative abduction was introduced in (Magnani, 1988), and further explained in (Magnani, 2001), the first case referred to those cognitive processes that are active in diagnostic reasoning, where abduction is merely seen as an activity of “selecting” from an encyclopedia of pre-stored hypotheses. Creative abduction instead refers to the building of new hypotheses, for example, in scientific discovery. A recent and clear analysis of this dichotomy and of other classifications emphasizing different aspects of abduction just described is also given in (Park, 2015, 2017a,b).
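The idea of selective abduction as “selecting” from an encyclopedia of pre-stored hypotheses can be made concrete with a toy example. The Python sketch below is only an illustration under strong simplifying assumptions: the encyclopedia, its entries, and the function name `select_hypotheses` are invented here, and “explaining” is reduced to set inclusion, which is far cruder than any of the criteria discussed above.

```python
# Hypothetical encyclopedia: each pre-stored hypothesis is mapped to the
# findings it would explain. All entries are invented for illustration.
ENCYCLOPEDIA = {
    "flu": {"fever", "cough", "fatigue"},
    "common cold": {"cough", "sneezing"},
    "allergy": {"sneezing", "itchy eyes"},
}


def select_hypotheses(observations, encyclopedia):
    """Return the stored hypotheses whose findings cover every observation."""
    observed = set(observations)
    return sorted(
        name
        for name, findings in encyclopedia.items()
        if observed <= findings  # the hypothesis accounts for all observations
    )


candidates = select_hypotheses({"cough"}, ENCYCLOPEDIA)
narrowed = select_hypotheses({"fever", "cough"}, ENCYCLOPEDIA)
unexplained = select_hypotheses({"rash"}, ENCYCLOPEDIA)
```

In this toy setting, `candidates` contains two hypotheses, `narrowed` contains one, and `unexplained` is empty. The empty result marks the point at which selection from what is already stored gives out and a new hypothesis would have to be built, which is the task the text assigns to creative abduction and which the sketch does not attempt to model.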
What Is the Optimization of Eco-Cognitive Situatedness in the Case of Abductive Cognition?
As anticipated in the previous section, when it is said that abduction can be knowledge-enhancing, the reference is to various types of newly produced knowledge of various novelty levels, in the absence of an empirical evaluation phase, or inductive phase, as Peirce called it. Some cases of new knowledge produced in science (e.g., conventions in physics, such as the principle of conservation of energy or Hamilton’s principle in geometrical optics and in dynamics, and models that mediate various processes of reasoning in creative research settings) are cases of knowledge-enhancing abduction. However, knowledge produced in an artificial game thanks to a smart application of strategies or the invention of new strategies and/or heuristics also has to be seen as the fruit of knowledge-enhancing abduction. This means that abduction is not necessarily ignorance-preserving (in the sense that reached hypotheses would always be “presumptive” and, to be accepted, would always need empirical confirmation), as contended by (Gabbay and Woods, 2005) when explaining their Gabbay-Woods schema of abduction (GW-schema). Abduction can creatively build new knowledge by itself (i.e., as an inference not necessarily characterized by an empirical evaluation phase, or inductive phase), as various examples coming from the history of science and other fields of human cognition clearly show. It is necessary to add that better support for the claim about the knowledge-enhancing character of abduction is present in the recent (Magnani, 2015a, 2016) and that
Woods has recently enriched, modified, and moderated the views on ignorance preservation presented with Gabbay in the book quoted above; see Woods (2017). It is contended that, to reach good selective or creative abductive results, efficient strategies have to be exploited, but it is also necessary to count on a cognitive environment characterized by various degrees of what has been called eco-cognitive situatedness, in which the eco-cognitive openness already envisaged by Aristotle, thanks to his emphasis on “leading away,” is fundamental (Magnani, 2016). To favor good creative and selective abduction, reasoning strategies must not be “locked” in a restricted external eco-cognitive environment, such as a scenario characterized by fixed definitory rules and finite material aspects (e.g., an artificial game such as Go or Chess), which would function as cognitive mediators able to constrain agents’ reasoning (on the role of locked and unlocked strategies, see Magnani (2019) and Chapter 3 of Magnani (2021)). In brief, the optimization of eco-cognitive situatedness concerns the substantial problem of discoverability and diagnosticability, almost totally disregarded in the literature on abduction (and just sketched by Peirce himself). It is important to anticipate that the solution of this problem has to take into account something fundamentally characterized by contextual configurations. Research on abduction has frequently, if “implicitly,” emphasized the fruitful role of cognitive openness. Hendricks and Faye (1999) consider trans-paradigmatic abduction a form of discovery in which a guessed hypothesis transcends the prompt empirical agreement between two paradigms.
The paradigms are presumed to belong to the same field (e.g., physics), where one of the paradigms is well established and the other is emerging (e.g., classical and quantum physics):
New theoretical concepts can be introduced transcending the current body of background knowledge while yet others remain within the given understanding of things. A case in point would be the formulation of the hypothesis of electron spin. Bohr considered the spin conjecture as a welcome supplement to the current magnetic core theory. Pauli remained rather skeptical pertaining to the spin hypothesis due to the fact that it actually required the theory of quantum mechanics for its proper justification, which was not part of the background knowledge at the time of the conjecture. In such cases, two paradigms are competing and the abduction is then dependent upon whether the conjecture is made within the paradigm or outside it. Hence we distinguish between paradigmatic and transparadigmatic abduction (Hendricks and Faye, 1999, p. 287).
Furthermore, people draw on different domains of knowledge to arrive at an abductive conclusion thanks to what (Gibson and Bruza, 2021) call “transepistemic abduction” (TeA), which illustrates how two agents, in order to successfully explain a phenomenon, reason across two very distant cognitive fields (e.g., computational and psychosocial domains) despite each agent being ignorant of the other’s domain knowledge. The authors themselves acknowledge that TeA represents a case that is only partially concerned with the eco-cognitive perspective: “TeA may not necessarily accommodate wider understandings of abduction like the eco-cognitive model proposed by Magnani. For example, TeA may not necessarily encompass perceptions, aesthetic decisions or moral judgements in the way that a eco-cognitive view of abduction might” (Gibson and Bruza, 2021, p. 475).
Another interesting procedure that can refer to higher abductive processes in need of cognitive openness is the chunk-and-permeate process (Brown and Priest, 2004), in which consideration is given to conditions under which mutually incompatible well-grounded theories can interact to bring forth solutions to problems which neither theory can solve on its own. This process introduces a paraconsistent reasoning strategy, in which information is broken up into chunks, and a limited amount of information is allowed to flow between chunks, and it is applied to model the reasoning employed in the original infinitesimal calculus.
Irrelevance and Implausibility Are Not Necessarily Offensive to Rationality
AKM-Schema, GW-Schema, and EC-Model of Abduction
From the perspective of the EC-Model, the available logical views concerning abduction have to be reconsidered. By now this fact is acknowledged by some logicians. Estrada-González (2013, p. 182), for example, by referring to the properties called consistency, minimality, etc. stated in the AKM-schema usefully notes:
We think that overemphasizing the characteristics of abduction as it occurs in scientific practices and daily life scenarios has led to overlook some features that abduction in those circumstances shares with other phenomena in which some given outputs fail to stand in a certain relation with some given inputs and thus a modification on those inputs is in order.
The label AKM-schema was proposed by Gabbay and Woods (2005) and refers to the last names of the authors who promoted the schema: A stands for Aliseda (1997, 2006); K for Kowalski (1979), Kuipers (1999), and Kakas et al. (1993); and M for Magnani (2001) and Meheus et al. (2002). Gabbay and Woods contrast this schema with their own (GW-schema), which has been fully explained in Chapter 2 of Magnani (2009). Indeed, abduction in an eco-cognitive perspective is not circumscribed by indicating particular requirements of each particular field of application, such as scientific discovery, diagnosis, machine discovery in artificial intelligence (AI), and computational creativity. The available logical models of abduction (orchestrated around the properties of classical deduction), the so-called nonmonotonic, adaptive, etc. logics, which restrict their structure to some of the above requirements, are extremely important and provide images of abduction that, even if logically powerful, are partial, limited, and not appropriate to represent general perspectives. Of course, it is necessary to acknowledge that the rich research on abduction already available in the fields of logic, cognitive science, philosophy, and AI has already rehabilitated the “cognitive” importance of the fallacy of “affirming the consequent” (abductive reasoning corresponds to this fallacy in the light of classical logic), traditionally taken as the mistake of having a conditional and its consequent and from this deriving the antecedent. When reframed in the spirit
of the naturalization of logic, this fallacy becomes a form of abduction endowed with a positive cognitive value in most of the real-life reasoning contexts in which it occurs, including diagnosis and creative processes. On the naturalization of logic, a note has to be added: in (Magnani, 2015b), Woods’ program of naturalization of logic, advanced in his recent book Errors of Reasoning: Naturalizing the Logic of Inference (Woods, 2013), is illustrated and commented on. This program aims at bringing logic into a creative rapprochement with cognitive science. This can be achieved by trying to do for logic what over 40 years ago Quine and others attempted for epistemology: similarly, it is necessary to propose a “naturalization” of logic that brings it closer to actual human inference through a process of de-idealization. Abduction as the fallacy of affirming the consequent can exhibit what some writers have called “material validity.” The standard concept of material validity is illustrated in (Brandom, 2000) as a case of semantically valid inference that instantiates an invalid syntactic form. Here the same term is adopted with a slightly modified meaning, to refer to the fact that an invalid form provides a cognitively good semantic outcome. A new logical framework for abduction is needed: an important step in the naturalization of this fallacy is certainly the already widespread acknowledgment of the agent-based nature of abduction but also, as will be stressed below by taking advantage of the EC-Model, of its eco-cognitive situatedness. The already available taxonomy of the various kinds of abduction, proposed and analyzed by many authors, is extremely useful because it shows that abduction is related to cognitive activities that are extremely variegated but also tremendously fruitful, sharing the common aim of guessing hypotheses that can provide various kinds of reliable knowledge, common, scientific, moral, aesthetical, etc.
These abductively established kinds of cognition do not necessarily aim at providing truths in the sense established by the tradition of the epistemology of the natural sciences: abduction certainly works when performed to establish a new “truth” in physics, but also when exploited to make a “good” hypothesis (e.g., about a person) in gossip, with the aim of “explaining” his behavior. In this last case, reaching a truth based on evidence, that is, a truth that counts as such because it is in tune with the spirit of scientific rationality, is usually a secondary concern. For example, in the cognitive moral interplay of gossip, it is the produced hypothesis (about a person or an event) that establishes (constitutes, it can be said) a special kind of “truth” by itself: a truth that is such because it is locally/situationally accepted and efficiently adopted by the human group involved. This is just what it takes to promote the narratives or the actions, sometimes violent, that the relationships between the individuals of the group aim at performing. Readers interested in more details about hypothetical cognition in gossip can refer to (Magnani, 2011) and (Bertolotti and Magnani, 2014). First of all, it can be said that an abductive inference is not different from other inferences; it presents outputs derived from inputs: the AKM and GW schemas quoted above treat abduction in this way. A special perspective can be assumed, which shapes the inferential status of abduction in a smart way. It can
L. Magnani
be anticipated that, in this perspective, abduction appears orthogonal to deduction but not incompatible with it, as it is when abduction is seen as a fallacy in the light of the classical Peircean syllogistic framework: the important difference is that, in the perspective that will soon be illustrated, no direct reference to the fallacious character of abduction is postulated. Hence, with the aim of “naturalizing” abduction in logic, and following the main tenets of the EC-Model, it is provisionally appropriate to adopt a broad logical view of the so-called inferential problems, in which the semantic problem of the fallacious character of abduction and the problem of its presumptive character (stressed by the GW schema) are not primarily taken into account.
Framing Abduction in the Perspective of “General Inferential Problems”
It is necessary to remember that the presumptive character of abduction manifestly stresses that necessity, taken in the received Aristotelian sense, is only one criterion that can characterize the solution of an inferential problem considered from a wide point of view. Plausibility, probability, possibility, sufficiency, and defeasibility are also completely legitimate (Woods, 2013, Chapter 8). A general inferential problem is not inevitably characterized by the “necessity” of classical deduction in the Aristotelian sense; other aspects have to be taken into account and considered legitimate. Let us adopt, following Estrada-González (2013), the more general concepts of input and output instead of premisses and conclusions: as already explained at the beginning of this chapter, this view consists in seeing inference not only in terms of reaching an output or modifying it, as in the standard view of deductive proofs, but also as the process of modifying part of the inputs, or both inputs and outputs, in order to obtain the desired relation between inputs and outputs. From this point of view, deductive and inductive inferences are situated at the same level because they concern the finding of suitable modifications in the outputs (Estrada-González, 2013, pp. 189–191). In this framework, abductive cognition is easily depicted as the situation in which some given outputs fail to stand in a certain relation with some given inputs, so that a modification of those inputs is needed. This is a received and common way of modeling abduction in logic (cf., e.g., Aliseda 2006). Still following Estrada-González, let us consider a simple example: let $\vdash_L$ be the consequence relation of a logic L, let $\alpha$, $\beta$ be formulas of the language of L, and let © be a certain binary connective.
An abductive problem can be illustrated by the following argument, which is incomplete in the sense that the conclusion does not L-follow from the premisses:
$$\alpha \mathbin{©} \beta \nvdash_L \beta \tag{1}$$
The making of an abduction that has just been described represents the logical counterpart of the cognitive (and epistemological) fill-up problem already quoted above, that is, the finding of a suitable improvement of the context (the premisses), here the variable ?, such that the conclusion L-follows from the original context and its enrichment:
$$\alpha \mathbin{©} \beta,\ ? \vdash_L \beta \tag{2}$$
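The fill-up problem in scheme (2) can be made concrete with a minimal sketch. It rests on assumptions that are not in the text: the connective © is read as classical material implication, L is classical propositional logic checked by truth tables, and the atom names `a`, `b`, `c` and the helper names `entails` and `fill_up` are purely illustrative.

```python
from itertools import product

# Hypothetical propositional atoms: a stands for alpha, b for beta, c for gamma.
ATOMS = ("a", "b", "c")


def entails(premises, conclusion):
    """Classical entailment by truth tables: every valuation that
    satisfies all premises must also satisfy the conclusion."""
    for values in product((False, True), repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True


# Formulas are modeled as functions from a valuation to a truth value.
alpha = lambda v: v["a"]
beta = lambda v: v["b"]
gamma = lambda v: v["c"]
# Assumption: the connective of the text is read here as implication.
alpha_c_beta = lambda v: (not v["a"]) or v["b"]

background = [alpha_c_beta]
candidates = {"alpha": alpha, "beta": beta, "gamma": gamma}


def fill_up(background, goal, candidates):
    """Return the names of the candidates that complete the argument,
    that is, the values of '?' for which the goal follows."""
    return [name for name, h in candidates.items()
            if entails(background + [h], goal)]


incomplete = not entails(background, beta)        # scheme (1): beta does not follow
solutions = fill_up(background, beta, candidates)  # scheme (2): admissible values of '?'
print(incomplete, solutions)
```

Under these assumptions both `alpha` and `beta` complete the argument while `gamma` does not, which is why some preferential measure is then needed to choose among the completions.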
A further note has to be added concerning the fill-up problem. It is well-known that there is the problem of finding criteria for hypothesis selection. “But there is the prior problem of specifying the conditions for thinking up possible candidates for selection. The first is a ‘cutdown’ problem. The second is a ‘fill-up problem’; and with the latter comes the received view that it is not a problem for logic” (Woods (2011, p. 243) emphasis added). In Chapter 6 of a recent book on abduction (Magnani, 2017), it is contended that, in a wide eco-cognitive perspective, the cutdown and fill-up problems in abductive cognition appear to be spectacularly contextual. Obviously, there are many ways to complete the argument above. A list of various presumptive and very elementary solutions can easily be provided (e.g., $\alpha$ or $\beta$), showing that some preferential measures are needed to choose among them: this is the cutdown problem. In the scheme (2), according to Estrada-González, rules might be added, and the conclusions may even be rules. What is needed is an input. However, the scheme is a kind of oversimplification: abducing new rules cannot simply consist in adding an input in place of ?. It rather requires some kind of dynamics of the relation expressed by $\vdash_L$. As will be better explained below, when new rules are abduced, what changes is the relation between inputs and outputs, not only the inputs (even if the new rule is some kind of input as well). This is highlighted, for example, by considering a dynamic reading of $\vdash_L$, which should be accounted for by the scheme (3) below, in which $X$ in $\blacktriangleright_L^{X}$ expresses different potential relations between input and output and $\blacktriangleright_L^{\surd}$ expresses the fact that, dynamically, the expected relation is reached. It is necessary to repeat that the simple scheme (2) clearly calls only for an input, whereas the abduction of new rules calls for more than that.
This is a point also addressed by (Barés-Gómez and Fontaine, 2021) and (Fontaine and Barés-Gómez, 2019) in their adaptive dialogical approach to abduction: in dialogues, there are two kinds of rules. The particular rules provide the local semantics of the connectives. The structural rules define the context of argumentation (the underlying logic), thereby the global semantics of the connectives. A change of structural rules involves a change of logic. By articulating different sets of structural rules in a specific kind of game, a dynamic consequence relation may be obtained. This is precisely how Fontaine and Barés conceive abduction of rules in dialogues: while the argumentative partners may be interacting
in the context of a dialogue for paraconsistent logic (with constraints on the use of negation), the hypothesis that negation behaves consistently at some point in the dialogue may be conjectured (together with a certain commitment to defend further conditions). This might appear less general, but generality is here to be understood in terms of the structure of the dialogues, which can be similarly adapted to other dynamic processes. Hence, thanks to the adoption of a dialogical logic, this perspective approaches abduction without disregarding its pragmatic/dialectical dimension. This current logical illustration of the dialectic involved in abduction is able to model argumentative interactions leading to conjectures, through a kind of “dialectification.” It was said above that in principle there might be infinitely many ways to complete the argument indicated in the formula (2). Estrada-González (2013, p. 183) notes that abduction “[. . . ] is usually thought as the construction of a (scientific) hypothesis for adding it to a theory and other previous knowledge (the premises) in order to explain a problematic phenomenon (the conclusion).” If this received schema of abduction is adopted, various solutions in the simple example should be discarded because they are not good options according to some chosen criteria. First, solutions must not be trivial or redundant: for example, if one has a connective K such that $\beta$ is L-implied by both $(\alpha \mathbin{©} \beta) K \beta$ and $\alpha \mathbin{©} \beta$, a solution can obviously be obtained by adding the formula $(\alpha \mathbin{©} \beta) K \beta$. Second, solutions must not lack explanatory power (e.g., the solution $\beta$). Third, a solution that is not simple, and so excessively explanatory, and certainly less simple than another available one, has to be discarded, as in the following cases, in which complicated solutions are added: $\alpha \mathbin{©} \beta, \vartheta, \beta, \alpha, (\alpha \mathbin{©} \beta) K \beta \vdash_L \beta$ or $\alpha \mathbin{©} \beta, (\alpha \mathbin{©} \beta) K \beta \vdash_L \beta$.
Indeed, there is another solution available that is the preferred one: the solution $\alpha$. In sum, by adopting the received logical view of abduction (AKM-schema), our general idea of abduction is restricted, because it is conditioned by special cases (such as the dominant one of medical diagnosis): unfortunately, by restricting our perspective in this way, some good solutions that would be remarkable and productive in other cases (e.g., cases that differ from scientific discovery and medical diagnosis) can be ruled out. To avoid this outcome, the GW-schema and the EC-Model of abduction can be of some help. These schemas do not refer to consistency and minimality as necessary requirements.
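The cutdown step under AKM-style constraints can be sketched as a filter over candidate completions. This is only an illustration under stated assumptions: © is again read as classical implication, entailment is checked by truth tables, and the three filters (consistency, explanatory power, minimality) are one hypothetical way of encoding the criteria discussed above; the names `cutdown`, `satisfiable`, and `strictly_weaker` are not from the source.

```python
from itertools import product

ATOMS = ("a", "b", "c")  # hypothetical atoms for alpha, beta, gamma


def valuations():
    for values in product((False, True), repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premises, conclusion):
    """Truth-table entailment in classical propositional logic."""
    return all(conclusion(v) for v in valuations()
               if all(p(v) for p in premises))


def satisfiable(formulas):
    return any(all(f(v) for f in formulas) for v in valuations())


alpha = lambda v: v["a"]
beta = lambda v: v["b"]
alpha_and_gamma = lambda v: v["a"] and v["c"]
contradiction = lambda v: v["a"] and not v["a"]
alpha_c_beta = lambda v: (not v["a"]) or v["b"]  # assumed reading of the connective

background = [alpha_c_beta]
candidates = {
    "alpha": alpha,
    "beta": beta,
    "alpha_and_gamma": alpha_and_gamma,
    "contradiction": contradiction,
}


def strictly_weaker(h1, h2):
    """h1 is strictly weaker than h2: h2 entails h1 but not conversely."""
    return entails([h2], h1) and not entails([h1], h2)


def cutdown(background, goal, candidates):
    # Keep candidates that solve the problem, are consistent with the
    # background, and do not entail the goal on their own.
    survivors = {
        name: h for name, h in candidates.items()
        if entails(background + [h], goal)
        and satisfiable(background + [h])
        and not entails([h], goal)
    }
    # Minimality: drop a survivor if another survivor is strictly weaker.
    return [name for name, h in survivors.items()
            if not any(strictly_weaker(g, h)
                       for other, g in survivors.items() if other != name)]


preferred = cutdown(background, beta, candidates)
print(preferred)
```

With these filters only `alpha` survives: `contradiction` fails consistency, `beta` fails the explanatory test, and `alpha_and_gamma` fails minimality. Dropping the consistency or minimality filter, as the GW-schema and the EC-Model permit, would let more candidates through.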
Irrelevance and Implausibility Disculpated
It has to be said that many standard perspectives on abduction still demand two properties, which are presented as possessed by “every” kind of solution for an abductive problem:
1. Relevance: the solution, the guessed hypothesis H, should be relevant to the problem. For example, if an agent’s knowledge does not suffice to know why the bartender in Kuala Lumpur has been killed, advancing the true Newtonian hypothesis that the planets move according to the law of gravitation has nothing to do with the given problem: it is not relevant.
2. Plausibility: the abduced hypothesis H should be characterized by some designated degree of plausibility. If an agent’s knowledge does not suffice to know who killed the bartender in Kuala Lumpur, advancing the hypothesis that the killer is the President of the United States has to do with the problem (because, after all, the President is a human being and we know human beings are potential killers) but is too implausible to count as a solution.
Since for many abductive problems there are usually many guessed hypotheses, the abducer needs to reduce this space to one. This means that the abducer has to produce the best choice among the members of the available group: it is extremely difficult to see how this is done, both formally and empirically. There is the problem of finding criteria for hypothesis selection. But there is the prior problem, already explained above, of specifying the conditions for thinking up possible candidates for selection. In particular, some models of abduction aim at producing fragments of classical logic in which instances of abductive or backward reasoning are allowed. It is also interesting to note that recent studies, even if not directly related to the naturalization of the logic of abduction, are strongly concerned with the role of context in reasoning: Zardini emphasizes, in a recent rich formal treatment, the role of simple temporal “intercontextual” logics, which aim at adequately modeling the validity of certain arguments in which the context changes (Zardini, 2014). Indeed, in the eco-cognitive perspective, the relevance of a guessed hypothesis would seem a trivial requirement because it is “hard to see how it might fail to be relevant,” as Estrada-González says (2013, p. 185).
He further adds: “Someone might press the point that what is required is the relevance not of solutions, but of candidates to be solutions. However, I think it might go against all those pleas connecting abduction with creativity, hypothesis generation, guessing, etc.” This approach has to be accepted, and it is exactly the point to be stressed and further explained. First of all, it is necessary to note that relevance is context and time dependent. When Feyerabend (1975) emphasizes the role of what he calls “counterinduction”, he is presenting to us the completely unreasonable and unwarranted character of scientific discovery: the guessed hypothesis could be devoid of relevance to the problem in the framework of the upholders of the rival theory but also, even if not necessarily, in the perspective of the very agent who, paradoxically, guessed the new “strange” hypothesis. Of course, the relevance requirement is related to the current state of knowledge of both parties. However, the new hypothesis can appear “relevant” later on, for example, when it is recognized as a new discovery. To summarize, candidate solutions that seem weird (irrelevant) can soon become relevant if they are recognized as solutions. Something similar can be said in the case of plausibility. First of all, in general, it is impossible to be sure that our guessed hypotheses are plausible (even if it is well-known that looking for plausibility is a good and wise human heuristic); indeed, an implausible hypothesis can later appear plausible. Moreover, when
a hypothesis solves the problem at hand, this is enough for it to count as a solution of the abductive problem (even if not necessarily a good solution or the best solution). If the target is to preserve the property of plausibility, at most it can be contended that in some cases plausibility is just potential, given the time dependency illustrated above. To give an example, the strange Cartesian hypothesis of a plenum of vortices made of particles, destroyed by the Newtonian concept of action at a distance, later appeared more rational and fully compatible with the Einsteinian framework:
Thus Descartes was not so far from the truth when he believed he must exclude the existence of empty space. The notion indeed appears absurd, as long as physical reality is seen exclusively in ponderable bodies. It requires the idea of the field as the representative of reality, in combination with the general principle of relativity, to show the true kernel of Descartes’ idea; there exists no space “empty of field” (Einstein, 2014, pp. 375–376).
In sum, irrelevance and implausibility are not always offensive to reason: to delineate the fill-up problem, neither relevance nor plausibility is necessary; they are just two “typical” smart and fruitful principles that human beings subjectively adopt to look for hypotheses. Unfortunately, they are no longer “typical,” for example, in the case of high-level kinds of cognitive creativity. The GW-schema also acknowledges the fact that relevance and plausibility cannot be taken to be general conditions of hypothesis selection. Scientists would agree that the really surprising and fruitful thought arising from abduction has to challenge prevailing conceptions by suggesting ideas that are prima facie neither relevant nor plausible, and that may even appear to contradict pre-established notions (see, e.g., the book by Livio (2013)).
Eco-Cognitive Openness
It is now useful to provide a short introduction to the concept of eco-cognitive openness from a logical point of view. The new perspective inaugurated by the so-called naturalization of logic (this new project is illustrated in (Magnani, 2015b); see also the previous parts of this section) contends that the normative authority claimed by formal models of ideal reasoners to regulate human practice on the ground is, to date, unfounded. It is necessary to propose a “naturalization” of the logic of human inference. Woods holds a naturalized logic to an adequacy condition of “empirical sensitivity” (Woods, 2013). A naturalized logic is open to the study of many ways of reasoning that are typical of actual human knowers, such as fallacies, which, even if they are not truth-preserving inferences, can nonetheless provide truths and productive results. Of course, one of the best examples is the logic of abduction, where the naturalization of the well-known fallacy of “affirming the consequent” is at play. Gabbay and Woods (2005, p. 81) clearly maintain that Peirce’s abduction, depicted as both (a) a surrender to an idea and (b) a procedure for testing its consequences, perfectly resembles central aspects of practical reasoning but also of creative scientific reasoning. It is useful to refer to recent research on abduction (Magnani, 2016), which stresses the importance in good abductive cognition of various kinds of the already
quoted optimization of situatedness: abductive cognition in a situation of “strong” eco-cognitive openness is, for example, very important in scientific reasoning because it refers to that activity of creative hypothesis generation which characterizes one of the most valued aspects of rational knowledge. It can be said that in creative scientific reasoning, discoverability is only granted by a maximization/optimization of eco-cognitive openness. The study above teaches us that situatedness is related to the so-called eco-cognitive aspects, which refer to the various contexts in which knowledge is “traveling”: to favor the solution of an inferential problem, especially in science but also in the case of other abductive problems such as diagnosis, the richness of the flux of information has to be maximized. It can be guessed that it is exactly the presence of a “maximization/optimization of eco-cognitive openness” that can favor the discoverability of abductive hypotheses characterized by what Peirce called “uberty,” as a potentiality to arrive at undiscovered truths: “I think logicians should have two principal aims: 1st, to bring out the amount and kind of security (approach to certainty) of each kind of reasoning, and 2nd, to bring out the possible and desirable uberty, or value in productiveness, of each kind” (Peirce, 1866–1913, 8.384). As illustrated above, hypotheses of this kind (uberous hypotheses are a typical possible fruit of abduction) are often irrelevant and implausible in the perspective of the contemporary context but have a high potential to favor undiscovered truths. This means that plausibility and relevance are often deceptive guides that can generate a bad outcome: the overlooking of potentially fecund hypotheses. The price of uberty is lost security. The reader interested in more details about uberty in Peirce’s philosophy can refer to McJohn (1993), Campos (2011), Mcauliffe (2015), and Pietarinen (2020).
Explaining the Optimization of Eco-Cognitive Situatedness in a Logical Perspective
A good abductive logical system is classically characterized by the following general distinct levels, which relate to the fill-up and cutdown aspects already explained above:
– a base logic L1 with proof procedures Π;
– an abductive algorithm which deploys Π to look for missing premisses and other formulas to be abduced;
– a logic L2 for deciding which abduced formulas to choose, which criteria of selection apply, etc.
This logic is related to the specification of suitable constraints regarding plausibility, relevance (topical, full-use, redundancy-oriented, probabilistic), etc., and economy, making the ideal agent able to discount and select information that does not resolve the task at hand (Gabbay and Woods, 2005). As illustrated above, it is thanks to the so-called GW-schema that Gabbay and Woods criticize what they call the classic AKM-schema of abduction. A primary
gift provided by this new GW-schema was the opening of the discussion about ignorance preservation but also about nonexplanatory and instrumental abduction, considered as not intrinsically consequentialist; nonexplanatory and instrumental abduction are illustrated in (Magnani, 2009, Chapter 2), which also provides some case studies. In the previous parts of this chapter, relevance and plausibility have been described as context- and time-dependent, so they are not necessary aspects of all potentially successful abductions: this means that, in this perspective too, the eco-cognitive environmental situatedness of abductive cognition matters more than expected. From the perspective of logic, the first consequence is that the fill-up problem should be addressed by a logic that is eco-cognitively disciplined and multimodally built. Also taking advantage of the analysis of the Aristotelian concept of “leading away” (ἀπαγωγή), it was indicated above that, in looking for a naturalization of the logic of abduction, it is fruitful to follow the main tenets of the EC-Model. Consequently, when dealing with the so-called “inferential problem” of abduction, it is appropriate to adopt the more general concepts of input and output instead of those of premisses and conclusions, in the perspective already illustrated above. This brings one more advantage: it is more natural to accept the multimodal character of the inferences involved. In this framework, we expect to have a logic in which the artificial language has an extendable expressive capacity and can itself be composed of icons and not only of symbols, as happens in the case of heterogeneous logics. Some logicians seem to acknowledge this need when dealing with a broad view of the inference problems: “An inference can be seen as an argument completion process, where premises are reduced either to a single formula or to multiple formulas, depending on whether single- or multiple-conclusion logics are used.
But although the conclusion might consist of a single formula, we might need to add rules, theories, etc. to complete an argument, and not just adding formulas in the context. Nonetheless, the conclusions could be also rules (if for example, in a logic we want a rule to be a derived one, we may need to make some input to the original logic), theories, valid arguments, diagrams, etc.” [. . . ] (Estrada-González, 2013, p. 186). To explain and describe why discoverability and diagnosticability are so important in abductive cognition, the reader is asked to be patient and to examine the following characterization of abduction, which highlights and illustrates the problem of the optimization of eco-cognitive situatedness by means of simple logical considerations. Let $\Theta = \{\Gamma_1, \ldots, \Gamma_m\}$ be a background theory; $P = \{\Delta_1, \ldots, \Delta_n\}$ a set of true sentences corresponding, for example, to phenomena to be explained; and $\vdash$ a consequence relation, usually, but not necessarily, the classical one. According to the received AKM schema quoted above, abduction refers to the inference of $P$ from $\Theta$ and $H = \{A_1, \ldots, A_k\}$, a collection of hypotheses, given some further constraints, which basically relate to consistency, minimality, and preference, as repeatedly stated. In this perspective, an abductive problem concerns the finding of a suitable improvement of $A_1, \ldots, A_k$ such that $\Gamma_1, \ldots, \Gamma_m, A_1, \ldots, A_k \vdash_L \Delta_1, \ldots, \Delta_n$ is L-valid. It is obvious that an improvement of the input $\Gamma_1, \ldots, \Gamma_m$ can be reached by additions of a “new”
background theory but also by modification and enrichment of the input already available in the given inferential problem: this process points to the importance of the “optimal position” of both input and output, as will be better described in the following section. In the perspective above in terms of inferential problems, an inference is not only seen as the process that leads to the generation of an output or to the proof of it, as in the traditional and standard view of deductive proofs, where the output has to be obtained from the input. In this broader logical view, an inferential problem is also the process of increasing or modifying part of the input, or both input and output, in order to obtain a desired relation between input and output. Of course, the extension or modification reached is the solution of the inferential problem. It is clear that in the case of abductive cognition, input and output fail to stand in an expected relation to each other, and the solution requires the enrichment of part of the input, not of the output: this enrichment represents the solution of the abductive problem. This process of modification of the input is, as already said, basically multimodal, and various context-dependent conditions have to be fulfilled in order to reach the (best) solution. From this perspective, the general form of an inferential abductive problem can be symbolically rendered as follows (Estrada-González, 2013, p. 189):
$$\Lambda_1, \ldots, \Lambda_i, ?_I \blacktriangleright_L^{X} \Upsilon_1, \ldots, \Upsilon_j \tag{3}$$
in which $\blacktriangleright_L^{X}$ indicates that input and output do not stand in an expected relation to each other and that the generation of the input $?_I$ can provide the solution. $\blacktriangleright_L^{\surd}$ will denote that the expected relation is obtained. In general, in this characterization, the direction is not from evidence/premisses to abductive outputs; rather, the forward fashion is adopted, in which the inferential parameter sets some appropriate logical relationship between an input, which consists of both the abductive guess to be found and a background theory, and an output (evidence/premisses, for example, a novel phenomenon) to be abductively “explained” through facts, rules, or even new theories. The inferential parameter need not be taken to be either semantic entailment or classical derivability. Aliseda observes that it ranges, at least in the case of the main received nonstandard logical accounts, over diverse values such as probable inference, logic programming, or dynamic inference: “abduction is not one specific non-standard logical inference mechanism, but rather a way of using any one of these” (Aliseda, 2006, p. 47); she also stresses that it is logically useful to depict “explanatory” consistent abduction by extracting those sets of “top-down” (van Benthem, 2007, p. 272) basic properties that are called structural rules of inference, such as conditional reflexivity, simultaneous cut, conclusion consistency, modified monotonicity, modified cut, rejection of permutation in dynamic inference, etc., which “fit a logical format” (Aliseda, 2006, p. 150). The structural rules reflect (Aliseda (cit., p. 95) says they “state how abductive explanatory logic behaves”) the fact that abduction can be expressed by
deviant (but still logical) systems, endowed with extra-systematic specific notions of validity, even if, unfortunately, they do not provide methods for generating abductive explanations, as happens instead when abduction is computed by means of logic programming and semantic tableaux. Finally, it has to be remembered that if we consider the generation of the input $?_I$ that provides the solution as subjunctively attained, as in the case of the ignorance-preserving abductions illustrated by the GW-schema, then these computing devices do not appear to be “genuine” abductions. In this perspective, Woods observes that in the case of semantic tableaux, abduction resembles enthymeme resolution and so does not reflect its presumptive character: the task amounts to the rehabilitation of an ailing deduction. In semantic tableaux abduction, “the task is to find a φ that closes a model-connection between a theory and some (usually) empirical data, which also is the repair of a decrepit deduction” (Woods, 2007, p. 310). In the light of the presumptive character of the GW schema, and only in this light, there is nothing abductive about such closures. Indeed, such closures are possible without there being any ignorance problem to which the closure is a response.
Situatedness: The Centrality of “Optimally Positioning” Input and Output
It seems important to note that, to get good abductions, such as the creative ones that are typical of scientific innovation, the input (background theory) and output (evidence, premisses) of the formula $\Lambda_1, \ldots, \Lambda_i, ?_I \blacktriangleright_L^{X} \Upsilon_1, \ldots, \Upsilon_j$ (in which $\blacktriangleright_L^{X}$ indicates that input and output do not stand in an expected relation to each other and that the generation of the input $?_I$ can provide the solution) have to be thought of as optimally positioned. This optimality, in various types of abduction even if not in all, is made possible, at least in scientific discovery, by a maximization of the changeability of both input and output: not only does the input have to be enriched with the possible solution, but, to do that, other aspects of the input usually have to be changed and/or modified (more details are illustrated in (Magnani, 2016, Section three)). Indeed, in the eco-cognitive perspective, an “inferential problem,” given some starting evidence/premisses, can be enriched by the appearance of a new output (i.e., more evidence/premisses) to be accounted for, and the inferential process has to restart. This is exactly the case of abduction, and the cycle of reasoning reflects the well-known nonmonotonic character of abductive reasoning. Abductive consequence is ruptured by new and newly disclosed information and is therefore defeasible. In this perspective, abductive inference is not only the product of the generation of the input $?_I$ (i.e., the reaching of the abductive hypothesis) but, in general, actually involves the intertwined modification of both input (including the background theory to be adopted) and output (the evidence/premisses). Consequently, abductive inferential processes are highly information-sensitive, that is, the flux of information which interferes with them is continuous and systematically human(or machine)-promoted
and enhanced when needed. This is not true of traditional inferential settings, for example, proofs in classical logic, in which the modifications of the input are minimized, proofs are usually carried out with a “given” input, and the burden of proof is dominant and rests on the rules of inference, on the smart choice of them, and on the choice of their appropriate sequence. This changeability first of all refers to a wide psychological/epistemological openness in which (at least in science) knowledge transfer has to be maximized. In sum, considering an abductive “inferential problem” as symbolized in the above formula, a suitably “anthropomorphized” logic of abduction has to take into account a continuous flux of information from the eco-cognitive environment and so the constant modification of both input and output on the basis of both:
1. the new information available,
2. the new information inferentially generated, for example, new inferentially generated input aiming at solving the inferential problem.
To conclude, optimization of situatedness is the main general property of logical abductive inference, which, from a general perspective, overrides other properties such as minimality, consistency, relevance, plausibility, etc. These are special subcases of optimization, related to the quality of the guessed hypotheses, and are intertwined with the kind of situatedness at stake, at the level of the appropriate abductive inference to the new input indicated with a question mark in the above formula. Another matter is the problem of choosing, selecting, and finding the appropriate input (background theory) and output (evidence/premisses), that is, an optimal positioning of the eco-cognitive situatedness, with the aim of “favoring” the required kind of abductive process.
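The restart cycle described above, in which new evidence defeats earlier guesses and the background theory itself may have to be enriched, can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration: the everyday example, the dictionary encoding of hypotheses with `explains` and `excluded_by` fields, and the function name `abduce`.

```python
def abduce(background, targets, evidence):
    """Return the hypotheses that account for every target fact and are
    not defeated by any fact in the currently available evidence."""
    accepted = []
    for name, h in background.items():
        explains_all = targets <= h["explains"]
        defeated = bool(evidence & h["excluded_by"])
        if explains_all and not defeated:
            accepted.append(name)
    return sorted(accepted)


# Input: a small, hypothetical background theory.
background = {
    "rain": {"explains": {"wet_grass"}, "excluded_by": {"dry_street"}},
    "sprinkler": {"explains": {"wet_grass"}, "excluded_by": {"sprinkler_off"}},
}
targets = {"wet_grass"}   # the fact to be abductively explained
evidence = {"wet_grass"}  # output: the evidence collected so far

history = [abduce(background, targets, evidence)]

# New information arrives: the output changes and an earlier guess is defeated.
evidence |= {"dry_street"}
history.append(abduce(background, targets, evidence))

# More information arrives: no available hypothesis survives.
evidence |= {"sprinkler_off"}
history.append(abduce(background, targets, evidence))

# The input itself is modified: the background theory is enriched.
background["dew"] = {"explains": {"wet_grass"}, "excluded_by": set()}
history.append(abduce(background, targets, evidence))

for step, hypotheses in enumerate(history, start=1):
    print(step, hypotheses)
```

The set of accepted hypotheses shrinks as evidence grows, which is the nonmonotonic behavior at issue, and the last step changes the input rather than merely adding a formula to a fixed theory.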
Again, an abductive solution is still related to that Aristotelian “leading away” (ἀπαγωγή), quoted above, that is, to the starting of the application of a supplementary logic implementing an appropriate formal inference engine which can provide a solution. This supplementary logic implements a new inference engine from the output of the formula (3). If the aim is the naturalization of the logic of the abductive processes and its special consequence relation, which has to be strongly “eco-cognitive-sensitive”, $\Vdash_{EC}$, it is necessary at this point to note that a naturalized abductive logic would first of all have to refer to the:
• optimization of situatedness. Situatedness is related to eco-cognitive aspects: to favor the solution of the abductive problem, input and output of the formula above have to be thought of as optimally positioned.

Other accessory aspects which favor the success of abductive cognition have to be illustrated straight away:
1. in scientific discovery (but also in other fields of abductive cognition, such as medical diagnosis), the above-illustrated optimality has to be further characterized as a maximization of changeability of both input and output. Not only do inputs have to be enriched with the possible solution to favor the potential increase of knowledge, even if not satisfactory, so favoring the restarting of a new inferential process, but, to do that, other inputs (background theory) usually have to be built, changed, and/or modified;
2. consequently, abductive inferential processes are highly information-sensitive, that is, the flux of information which interferes with them is continuous and systematically human- (or machine-) promoted and enhanced when needed. This is not true of traditional inferential settings, for example, proofs in classical logic, in which modifications of the input are minimized, proofs are usually carried out with a “given” input, and the burden of proof rests mainly on the rules of inference, on the smart choice of them, and on the choice of their appropriate sequence;
3. indeed, in the eco-cognitive perspective, an “inferential problem” can be enriched by the appearance of a new output to be accounted for, and the inferential process has to restart. This is exactly the case of abduction, and the cycle of reasoning reflects the well-known nonmonotonic character of abductive reasoning. Abductive consequence is ruptured by new and newly disclosed information and is therefore defeasible. In this perspective, abductive inference is not only the outcome of the modification of the input but, in general, involves the intertwined modification of both input and output;
4. contrary to the case of classical demonstrative systems, which are characterized by what has been called maximization of memorylessness (Magnani, 2017), naturalizing a logic of abduction is related to the need to keep a record of the past life of abductive inferential praxes;
5. traditional logical systems are abstract in the sense that they are based on a maximal independence from sensory modality, and so they strongly stabilize experience and common categorization; in a naturalized logic of abduction, on the contrary, this requirement is no longer crucial, and “abstractness” is circumstance-based and so eco-cognitively conditioned. Moreover, it is necessary to repeat that multimodality of formalization is really important: a naturalized logic of abduction is open to a modification of both language and rules;
6. as Peirce says, abduction is “akin to the truth”: “It is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses it will light upon the correct hypothesis” (Peirce, 1866–1913, 7.220). Research on abduction in the last decades has certainly described it as an exceptional example of significant and fruitful truth-generating nondeductive reasoning.
Further, let us remember that in an abductive “inferential problem” as symbolized in (3) (cf. above, section “Explaining the Optimization of Eco-Cognitive Situatedness in a Logical Perspective”), it is extremely important to grant multimodality. The logical inferential process that is involved in the whole modification of input and output has to be understood as strongly multimodal, not only from the point of view of the cognitive devices “represented” (e.g., not merely propositions but also
diagrams) but also from the point of view of the applied rules: model-based but also computational aspects have to be taken into account.
Affordances, Diagnosticability, Discoverability, and Abduction

This final section is devoted to a clarification of the close relationship between discoverability, diagnosticability, “affordances” (as environmental anchors that allow us to better exploit external resources), and abduction. Diagnosticability especially, but also discoverability, is related to the availability of the appropriate affordances. Gibson defines “affordance” as what the environment offers, provides, or furnishes. For instance, a chair affords an opportunity for sitting, air for breathing, water for swimming, stairs for climbing, and so on. By cutting across the subjective/objective frontier, affordances refer to the idea of agent-environment mutuality. Gibson provided not only clear examples but also a list of definitions (Wells, 2002) that may contribute to generating possible misunderstanding:
1. affordances are opportunities for action;
2. affordances are the values and meanings of things which can be directly perceived;
3. affordances are ecological facts;
4. affordances imply the mutuality of perceiver and environment.

The main problem in this section concerns the relationship between abduction and affordances: indeed, human and nonhuman animals can “modify” or “create” affordances by manipulating their cognitive niches, so favoring or impeding certain abductive results. Representational delegations to the external environment that are configured as parts of cognitive niches are those cognitive human actions that transform the natural environment into a cognitive one. Humans have built huge cognitive niches, characterized by informational, cognitive, and, finally, computational processes, as described by the studies in the field of biosciences of evolution by Odling-Smee, Laland, and Feldman (Odling-Smee et al., 2003; Laland and Sterelny, 2006; Laland and Brown, 2006).
Epistemic niches, already mentioned in the first section of the present chapter, are of course built to create those special eco-cognitive environments that are directly related to scientific cognition. Gibson appears convinced that “The hypothesis that things have affordances, and that we perceive or learn to perceive them, is very promising, radical, but not yet elaborated” (Gibson, 1979, p. 403). Let us look more closely at this issue: it can be said that the fact that a chair affords sitting means that some cues (robustness, rigidity, flatness) can be perceived from which a person can easily say “I can sit down.” Now, suppose the same person has another object O. In this case, the person can only perceive its flatness. He/she does not know if it is rigid and robust, for instance. Nevertheless, he/she decides to sit down on it, and does so successfully. Again, the problem at
stake is that of direct and indirect visual perception. It is thanks to the effect of action that new affordances can be detected and stabilized. Now, the point is that two cases can be delineated: in the first one, the cues we come up with (flatness, robustness, rigidity) are highly diagnostic for knowing whether or not we can sit down on the object, whereas in the second case, we eventually decide to sit down, but we do not have any precise clue about it. How many things are there that are flat but that one cannot sit down on? A nail head is flat, but it is not useful for sitting. This example further clarifies two important elements: firstly, finding/constructing affordances certainly involves a (semiotic) inferential activity (Windsor, 2004); secondly, it stresses the relationship between an affordance and the information that specifies it, which arises only in the eco-cognitive interaction between environment and organisms. In this last case, the information is reached through a simple action, in other cases through action and complex manipulations. The term “highly diagnostic” explicitly refers to the abductive framework. It is well known that abduction is classically considered a process of inferring certain facts and/or laws and hypotheses that render some sentences plausible and that explain or discover some (possibly new) phenomenon or observation (Magnani, 2001). The distinction between theoretical and manipulative abduction illustrated above in this chapter extends the application of that concept beyond the internal dimension. Some studies on abduction (e.g., Magnani, 2009) have repeatedly stressed that from Peirce’s philosophical point of view, all thinking is in signs, and signs can be icons, indices, or symbols: moreover, all inference is a form of sign activity, where the word sign includes “feeling, image, conception, and other representation” (Peirce, 1866–1913, 5.283) and, in Kantian words, all synthetic forms of cognition.
That is, a considerable part of the thinking activity is “model-based” and consequently non-sentential. Of course model-based reasoning acquires its peculiar creative relevance when embedded in abductive processes, so that a model-based abduction can be individuated. In the case of diagnostic reasoning in medicine, a physician detects various symptoms (that are signs or clues) in a multimodal way, for instance, cough, chest pain, and fever; then he/she may infer that it is a case of pneumonia. The original Gibsonian notion of affordance especially deals with those situations in which the “perceptual” signs and clues that can be detected prompt or suggest a certain action rather than others. In the original Gibsonian view, the notion of affordance mainly refers to proximal and immediate perceptual chances, which are merely “picked up” by a stationary or moving perceiver. It is important to note that perceiving affordances also involves evolutionary changes and the role of sophisticated and plastic cognitive capacities (further details below in this section). These capacities are already available and belong to the normal adaptation of an organism to a given ecological niche. Nevertheless, if it is acknowledged that environments and organisms’ instinctual and cognitive plastic endowments change, it can be argued that affordances can be related to the variable (degree of) abducibility of a configuration of signs: a chair affords sitting in the sense that the action of sitting is a product of a sign activity in which some physical properties are perceived (flatness, rigidity, etc.), and therefore it is possible to ordinarily “infer”
(in the Peircean sense) that a possible way to cope with a chair is sitting on it. So to speak, in most cases finding affordances is a spontaneous abduction, because this chance is already present in the perceptual and cognitive endowments of human and nonhuman animals. Describing affordances in this way may clarify some puzzling themes proposed by Gibson, especially the claim that affordances are directly perceived and that the value and meaning of a thing are clear at first glance: organisms have at their disposal a standard endowment of affordances (e.g., through their hardwired sensory system), but at the same time, they can extend and modify the scope of what can afford them through suitable cognitive abductive skills. It has to be stressed that not only environments change but also perceptual capacities, enriched through new or higher-level cognitive skills that go beyond those granted by the merely instinctual levels: if affordances are usually stabilized, this does not mean that they cannot be modified and changed or that new ones cannot be formed. René Thom nicely says that “Nature is present in the behavior of inanimate beings. But the animate being is able to exploit natural regularities in order to stabilize connections that would be accidental, not generic, in the inanimate world” (Thom, 1988, p. 217). The reader who is interested in Thom’s catastrophe theory and its consequences for the concept of abduction can refer to Magnani (2009, Chapter 8). First of all, affordances appear durable in human and animal behavior, like kinds of habits, as Peirce would say (Peirce, 1866–1913, 2.170). For instance, that a chair affords sitting is a fair example of what is at stake in this case. This is a case of what may be called stabilized affordances, that is, affordances that are experienced by humans as highly successful.
Once evolutionarily formed, or abductively created/discovered through cognition, they are stored in embodied or explicit cognitive libraries and retrieved upon occasion. They can also be a suitable source of new chances, through analogy. Very different objects that equally afford sitting can be available. For instance, a chair has four legs and a back, and it also stands on its own. The affordances exhibited by a traditional chair may be an analogical source and be transferred to a different, new artifact that presents the affordance of a chair for sitting down (and that to some extent can still be described as a chair). Consider, for instance, the variety of objects that afford sitting without having four legs or even a back. Let us consider a stool: it does not even have a back and, in some cases, it has only one leg or just a pedestal, but it affords sitting just as a chair does. Second, affordances are also subject to changes and modifications. Some of them can disappear because new configurations of the cognitive environmental niche (e.g., new artifacts) are invented that offer more powerful affordances. Consider, for instance, the case of blackboards. Progressively, teachers and instructors have partly replaced them with new artifacts which exhibit affordances brought about by various tools, for example, slide presentations. In some cases, the affordances of blackboards have been totally redirected or reused for more specific purposes and actions. For instance, one may say that a logical theorem is still easier to explain and understand using a blackboard, because of its affordances that
give a temporal, sequential, and at the same time global perceptual depiction to the matter. Of course, in the case of humans, objects can afford different persons in different ways. This is also the case of experts: they take advantage of their advanced knowledge within a specific domain to detect signs and clues that ordinary people cannot detect. For instance, a patient affected by pneumonia affords a physician in a completely different way from the way he/she affords a person without medical training. Being abductive, the process of perceiving affordances mostly relies on a continuous activity of hypothesizing which is cognition-related. That A affords B to C can also be considered from a semiotic perspective as follows: A signifies B to C. A is a sign, B the object signified, and C the interpretant. Having cognitive skills (e.g., knowledge contents and inferential capacities but also suitable pre-wired sensory endowments) about a certain domain enables the interpretant to perform certain abductive inferences from signs (viz., perceiving affordances) that are not available to those who do not possess those apparatuses. To ordinary people, a cough or chest pain is not diagnostic, because they do not know what the symptoms of pneumonia or of other diseases related to cough and chest pain are. Thus, they cannot make any abductive inference of this kind and so cannot perform the subsequent appropriate medical actions. In summary, the cognitive lesson of this last section is that it is also necessary to consider discoverability and diagnosticability as related to the establishment of appropriate affordances. These are environmental anchors that allow us to better exploit external resources in the abductive problem, first of all with respect to diagnosticability but of course also with respect to discoverability.
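The dependence of diagnosticability on the interpreter's knowledge can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the function name `abducible`, the representation of knowledge as a mapping from a hypothesis to the signs treated as diagnostic of it, and the deliberately crude entries, which are in no way clinical guidance.

```python
def abducible(signs, knowledge):
    """Return the hypotheses an interpreter can reach from the perceived signs.

    knowledge: assumed mapping from a hypothesis to the set of signs that
    count as diagnostic of it for this particular interpreter.
    """
    perceived = set(signs)
    return sorted(h for h, diagnostic in knowledge.items()
                  if diagnostic and diagnostic <= perceived)


signs = ["cough", "chest pain", "fever"]

# Hypothetical knowledge bases for two interpreters facing the same signs.
physician = {"pneumonia": {"cough", "chest pain", "fever"}}
layperson = {}

physician_view = abducible(signs, physician)   # the signs are diagnostic for the expert
layperson_view = abducible(signs, layperson)   # the same signs license no hypothesis
```

The same configuration of signs yields a hypothesis for one interpreter and none for the other, which is the sense in which an affordance is tied to the information available in the interaction between agent and environment.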
Conclusion

In this chapter, the fundamental aspects of discoverability have been described, taking advantage of the eco-cognitive model of abduction (EC-Model). To this aim, a logical formalization which adopts the general concepts of input and output instead of the classical ones of premisses and conclusions has been proposed. In this light, an abductive inference can be seen as a logical process in which input and output fail to stand to each other in an appropriate relation, with the solution involving the modification of inputs, not that of outputs. It is thanks to this model that discoverability can be seen as jeopardized when input and output fail to stand to each other in a “good” relation: consequently, the abductive problem at hand cannot be solved. In the case of scientific creative reasoning, which has been described as characterized by a systematic “maximization” of cognition, the related creative abductive processes are more likely when an “optimization of eco-cognitive situatedness” is reached, certainly boosted by the “maximization of changeability” of both input and output, which is a kind of high “information-sensitiveness.” Finally, the role of irrelevance and implausibility in promoting both discoverability and discovery has been vindicated. Plausibility and relevance, even if virtuous cognitive heuristics, are not the only criteria that can guide the generation of abductive
hypotheses. The role of “affordances,” as environmental anchors that allow us to better exploit external resources, has also been examined, first of all in the case of abductive problems, with respect to diagnosticability but of course also with respect to discoverability.

Acknowledgements Research for this chapter was supported by the PRIN 2017 Research 20173YP4N3—MIUR, Ministry of University and Research, Rome, Italy.
References

Aliseda, A. (1997). Seeking Explanations: Abduction in Logic, Philosophy of Science and Artificial Intelligence. PhD thesis, Amsterdam: Institute for Logic, Language and Computation.
Aliseda, A. (2006). Abductive Reasoning. Logical Investigations Into Discovery and Explanation. Heidelberg/Berlin: Springer.
Barés-Gómez, C., & Fontaine, M. (2021). Between sentential and model-based abductions. A dialogical approach. Logic Journal of the IGPL, 29(4), 425–446. Special Issue on “Formal Representations in Model-Based Reasoning and Abduction”, ed. by A. Nepomuceno, L. Magnani, F. Salguero, C. Barés-Gómez, & M. Fontaine.
Bertolotti, T., & Magnani, L. (2014). An epistemological analysis of gossip and gossip-based knowledge. Synthese, 191, 4037–4067.
Brandom, R. (2000). Articulating Reasons: An Introduction to Inferentialism. Cambridge, MA: Harvard University Press/Edinburgh.
Brown, B., & Priest, G. (2004). Chunk and permeate, a paraconsistent inference strategy. Part I: The infinitesimal calculus. Journal of Philosophical Logic, 33(4), 379–388.
Cabrera, F. (2017). Can there be a Bayesian explanationism? On the prospects of a productive partnership. Synthese, 194(4), 1245–1272.
Cabrera, F. (2020). Does IBE require a “model” of explanation? British Journal for the Philosophy of Science, 71(2), 727–750.
Campos, D. G. (2011). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180(3), 419–442.
Einstein, A. (2014). Relativity and the problem of space [1952]. In Ideas and Opinions (S. Bergmann, Trans., pp. 360–377). New York: Crown Publisher.
Estrada-González, L. (2013). Remarks on some general features of abduction. Journal of Logic and Computation, 23(1), 181–197.
Feyerabend, P. (1975). Against Method. London/New York: Verso.
Fontaine, M., & Barés-Gómez, C. (2019). Conjecturing hypotheses in a dialogical logic for abduction. In D. Gabbay, L. Magnani, W. Park, & A. Pietarinen (Eds.), Natural Arguments, A Tribute to John Woods (pp. 379–414). London: College Publications.
Gabbay, D. M., & Woods, J. (2005). The Reach of Abduction. Amsterdam: North-Holland.
Gibson, A., & Bruza, P. (2021). Transepistemic abduction: Reasoning across epistemic domains. Logic Journal of the IGPL, 29(4), 469–482. Special Issue on “Formal Representations in Model-Based Reasoning and Abduction”, ed. by A. Nepomuceno, L. Magnani, F. Salguero, C. Barés-Gómez, & M. Fontaine.
Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Boston, MA: Houghton Mifflin.
Hendricks, F. V., & Faye, J. (1999). Abducting explanation. In L. Magnani, N. J. Nersessian, & P. Thagard (Eds.), Model-Based Reasoning in Scientific Discovery (pp. 271–294). New York: Kluwer Academic/Plenum Publishers.
Kakas, A., Kowalski, R. A., & Toni, F. (1993). Abductive logic programming. Journal of Logic and Computation, 2(6), 719–770.
Kowalski, R. A. (1979). Logic for Problem Solving. New York: Elsevier.
Kuipers, T. A. F. (1999). Abduction aiming at empirical progress or even truth approximation leading to a challenge for computational modelling. Foundations of Science, 4, 307–323.
Laland, K. N., & Brown, G. R. (2006). Niche construction, human behavior, and the adaptive-lag hypothesis. Evolutionary Anthropology, 15, 95–104.
Laland, K. N., & Sterelny, K. (2006). Perspective: Seven reasons (not) to neglect niche construction. Evolution. International Journal of Organic Evolution, 60(9), 4757–4779.
Livio, M. (2013). Brilliant Blunders: From Darwin to Einstein. Colossal Mistakes by Great Scientists That Changed Our Understanding of Life and the Universe. New York: Simon & Schuster.
Magnani, L. (1988). Epistémologie de l’invention scientifique. Communication and Cognition, 21, 273–291.
Magnani, L. (2001). Abduction, Reason, and Science. Processes of Discovery and Explanation. New York: Kluwer Academic/Plenum Publishers.
Magnani, L. (2007). Morality in a Technological World. Knowledge as Duty. Cambridge: Cambridge University Press.
Magnani, L. (2009). Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning. Heidelberg/Berlin: Springer.
Magnani, L. (2011). Understanding Violence. The Intertwining of Morality, Religion, and Violence: A Philosophical Stance. Heidelberg/Berlin: Springer.
Magnani, L. (2015a). The eco-cognitive model of abduction. Ἀπαγωγή now: Naturalizing the logic of abduction. Journal of Applied Logic, 13, 285–315.
Magnani, L. (2015b). Naturalizing logic. Errors of reasoning vindicated: Logic reapproaches cognitive science. Journal of Applied Logic, 13, 13–36.
Magnani, L. (2016). The eco-cognitive model of abduction. Irrelevance and implausibility exculpated. Journal of Applied Logic, 15, 94–129.
Magnani, L. (2017). The Abductive Structure of Scientific Creativity. An Essay on the Ecology of Cognition. Cham: Springer.
Magnani, L. (2019). AlphaGo, locked strategies, and eco-cognitive openness. Philosophies, 4(1), 8.
Magnani, L. (2021). Computational domestication of ignorant entities. Unconventional cognitive embodiments. Synthese, 198, 7503–7532. Special Issue on “Knowing the Unknown” (guest editors L. Magnani & S. Arfini).
Magnani, L. (2022). Discoverability. The Urgent Need of an Ecology of Human Creativity. Cham: Springer.
Mcauliffe, W. H. B. (2015). How did abduction get confused with inference to the best explanation? Transactions of the Charles S. Peirce Society, 51(3), 300–319.
McJohn, S. M. (1993). On uberty: Legal reasoning by analogy and Peirce’s theory of abduction. Willamette Law Review, 29, 191–235.
Meheus, J., Verhoeven, L., Van Dyck, M., & Provijn, D. (2002). Ampliative adaptive logics and the foundation of logic-based approaches to abduction. In L. Magnani, N. J. Nersessian, & C. Pizzi (Eds.), Logical and Computational Aspects of Model-Based Reasoning (pp. 39–71). Dordrecht: Kluwer Academic Publishers.
Odling-Smee, F. J., Laland, K. N., & Feldman, M. W. (2003). Niche Construction. The Neglected Process in Evolution. Princeton, NJ: Princeton University Press.
Park, W. (2015). On classifying abduction. Journal of Applied Logic, 13, 215–238.
Park, W. (2017a). On Lorenzo Magnani’s manipulative abduction. In L. Magnani & T. Bertolotti (Eds.), Handbook of Model-Based Science (pp. 197–213). Berlin: Springer.
Park, W. (2017b). Abduction in Context. The Conjectural Dynamics of Scientific Reasoning. Cham: Springer.
Peirce, C. S. (1866–1913). Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press. Vols. 1–6, C. Hartshorne & P. Weiss, Eds.; Vols. 7–8, A. W. Burks, Ed. (1931–1958).
Pietarinen, A.-V. (2020). Abduction and diagrams. Logic Journal of the IGPL, 29(4), 447–468. Special Issue on “Formal Representations in Model-Based Reasoning and Abduction”, ed. by A. Nepomuceno, L. Magnani, F. Salguero, C. Barés-Gómez, & M. Fontaine.
Thom, R. (1988). Esquisse d’une sémiophysique. Paris: InterEditions. Translated by V. Meyer, Semio Physics: A Sketch. Redwood City, CA: Addison Wesley, 1990.
van Benthem, J. (2007). Abduction at the interface of logic and philosophy of science. Theoria, 60(22/3), 271–273.
Wells, A. J. (2002). Gibson’s affordances and Turing’s theory of computation. Ecological Psychology, 14(3), 141–180.
Windsor, W. L. (2004). An ecological approach to semiotics. Journal for the Theory of Social Behavior, 34(2), 179–198.
Woods, J. (2007). Ignorance and semantic tableaux: Aliseda on abduction. Theoria, 60(22/3), 305–318.
Woods, J. (2011). Recent developments in abductive logic. Studies in History and Philosophy of Science, 42(1), 240–244. Essay Review of Magnani, L. (2009). Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning. Heidelberg/Berlin: Springer.
Woods, J. (2013). Errors of Reasoning. Naturalizing the Logic of Inference. London: College Publications.
Woods, J. (2017). Reorienting the logic of abduction. In L. Magnani & T. Bertolotti (Eds.), Springer Handbook of Model-Based Science (pp. 137–150). Cham: Springer.
Zardini, E. (2014). Context and consequence. An intercontextual substructural logic. Synthese, 191, 3473–3500.
8 How Abduction Fares in Mathematical Space

John Woods
Contents

Peirce’s Abductive Schema
The Kempson Rule
The Three Faces of Logical Consequence
Logic in Peirce’s Narrow Sense
Logic Daze
Cognitive Economies
Habit
A Data-Respecting Epistemology
Theorematic Proof and Kant’s Problem
Trial by Combat
Conclusion
References
[I]t is not to be expected that any rational opinion about logic will become prevalent among philosophers within a generation, at least.
C. S. Peirce (1896)

It is sometimes said that the highest philosophical gift is to invent important new philosophical problems. If so, Peirce is a major star [in] the firmament of philosophy. By thrusting the notion of abduction to the forefront of philosophers’ consciousness he has created a problem which – I will argue – is the central one in contemporary epistemology.
Jaakko Hintikka (1998)
J. Woods
The Abductive Systems Group, Department of Philosophy, University of British Columbia, Vancouver, Canada

© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_2
Abstract
Cognitive space is a metaphor for the environments in which beings like us lead our intellectual lives. Mathematical space is one of its most interesting precincts and one of its slipperiest customers. In Peirce’s philosophy, abduction and mathematics are logical practices in which hypotheses are invoked and consequences drawn. Peirce took mathematics to be the superordinate science to which all the rest of science owes its justification. This was not the received view then, nor is it now, but Peirce was not deterred from further unorthodoxies about mathematical hypotheses, which he saw as figures of the theorist’s own creative making-up. Thus a topological hypothesis would have no more claim to truth-evaluability than a line from a detective story or a poem. Taken at face value, the epistemic devastation of all science speaks for itself. The objective of this chapter is to show that even on the face-value reading, mathematics could have a prosperous Peircean future, but not before some adroit adaptation of the semantics of fiction. It is a test case for abduction.

Keywords
Abduction · Axiomatics · Cognitive economics · Implicity · Hypothesis · Mathematics · Tacity · Truth-making
Peirce’s Abductive Schema

Imagine that a question has arisen whose answer escapes you and lies beyond your timely means to find. You might decide to pay the matter no further mind, but if the question presses, you could be prompted to entertain a hypothetical answer, and reflect on the ways in which it might be put to the test. This is the spirit in which three lines were penned which nearly everyone who thinks that Peirce matters for abductive logic could paraphrase from memory:

The surprising fact C is observed.
But if A were true, C would be a matter of course.
Hence there is reason to suspect that A is true.
Peirce, 1931–1958 (CP 5.189)
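For readers who like to see the three lines laid out operationally, the following is a rough sketch only. It assumes, purely for illustration, that the subjunctive of the second line can be stood in for by a lookup table from hypotheses to the facts they would make a matter of course; the function `peirce_schema`, its parameters, and the example sentences are hypothetical and form no part of Peirce's text.

```python
def peirce_schema(surprising_fact, conditionals, expected):
    """Return the hypotheses A there is 'reason to suspect', given a fact C.

    conditionals: assumed mapping from each hypothesis A to the set of facts
    that would be a matter of course if A were true.
    expected: facts the reasoner already took for granted (so not surprising).
    """
    # First line: C has to be surprising, otherwise no abduction is triggered.
    if surprising_fact in expected:
        return []
    # Second and third lines: each A that would make C a matter of course is
    # returned as a suspect; nothing here asserts that any A is true.
    return sorted(a for a, consequences in conditionals.items()
                  if surprising_fact in consequences)


conditionals = {
    "it rained overnight": {"the lawn is wet", "the street is wet"},
    "the sprinkler ran": {"the lawn is wet"},
    "the window was left open": {"the room is cold"},
}
suspects = peirce_schema("the lawn is wet", conditionals, expected={"the sun rose"})
```

The sketch returns candidates for suspicion and stops there, which matches the modest force of “there is reason to suspect”; selecting among the suspects and putting one to the test belong to later stages.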
On this reading, abduction is a two-stage affair. In the first, a hypothesis-search is made and a candidate selected for second-stage involvement, in which the hypothesis is submitted to experimental test (CP 5.599; 6.469–473; 7.202–219). When a candidate-hypothesis arrives for trial, the search-phase gives way to tests by inductive methods. If the hypothesis lacks empirical content, inductive testing would have to be structured in a way that gives it the requisite accommodation. There is nothing to be gained by sending it to the controlled-experiment labs of the Centre for High Arctic Diseases. Such was a prospect to which Peirce was certainly favorably inclined. Writing of the methods of inductive diagramming, he emphasizes that:
The difference between setting down spots in a diagram and making new dots for the creation of logical thought is huge. (CP 3. 424; emphases mine)
This, one might think, is itself rather surprising to say about induction, and something to return to in taking up again the question of how mathematical hypotheses might win their experimental spurs at trial. It is also advisable to take early notice of the key expressions which are invoked without explanation in the Peirce schema: “surprising,” “matter of course,” and “suspect,” none of which plays a fixed role in a more general Peirce-like account. The subjunctive “were-would” of line (2) is unexplained here but remains a key theoretical fixture. Indeed, it was Peirce himself who may have broken the waters of a possible worlds approach to semantics in his 1896 Monist paper on “regenerated” logic (Peirce, 2010 PM, 11–14; cf. Starr, 2021). Also unspecified is the intended interpretation in line (3) of the conclusion indicator “Hence.” Further evidence of Peirce’s openness to a plurality of abduction triggers can be seen a bit later in a passage in which “abduction” is replaced by “retroduction,” not as the name of a new concept, but as an intendedly better name for a concept already to hand (Peirce, 1966, MS 856). There is some scholarly disagreement about what an abductively spurred question asks. There is no set answer to this. Questions vary depending on subject-matter, context, and the asker’s interests (Niiniluoto, 1993, 2018; Minnameier, 2004; Schurz, 2008; Park, 2017a, b; Woods, 2020b; Douven, 2021). For present purposes it suffices to stick with the basic idea as set out in this section’s own opening lines. There are two quite different ways for an abducer to handle the hypothesis-selection stage of the abductive process. He can guess that some proposition whose truth-value is presently unknown to him will do well in the experimental-test stage. Or he can float an idea of his own creative conjecture in hopes of a prosperous sometime future.
These guesses answer different questions: “What is the fact that would solve my problem?” and “What is the as-yet undeveloped possibility which might mature into something that solves my problem?” It could be of some interest here that when anglophones speak of framing a hypothesis, they use the English translation of the Latin root “fingere” for “fabricate,” a word carrying normatively at-odds meanings in English. Peirce is clearly open to each of these ways of guessing, but, rather remarkably, his view of the second is that at the point of conjecture, the hypothesis does not and cannot possess a truth-value. What, then, are we to make of line (3) for hypotheses of this creative kind? How can there be reason to suspect a nontruth-valued thing to be true? Also corresponding to the two different senses of “hypothesis” are two different implicit semantics, one for the statements of the empirical sciences and another for those of the abstract non-empirical sciences. However, the distinction between empirical and abstract science does not match the fact-guessing v. hypothesis-making distinction. It actually sits athwart it. In each of these disciplinary kinds of enquiry, there is ample evidence of both kinds of guessing. Concerning any mathematical hypothesis, Peirce takes a very tough line, insisting that it:
is not [even] a metaphysical proposition, because it is no proposition at all, but only an imitation proposition . . . . A proposition cannot predicate a character not capable of sensuous presentation; nor can it refer to anything with which experience does not connect us. Peirce, 1985 (HP 1, 734; cf. Atkins, 2021, 1.2, 1.3)
Lacking, as they do, empirical referents, the hypotheses of mathematics have nothing for their predicates to be true of. They are, so to say, semantically barren. This is Peirce at his semantically most ungenerous, in which mathematics is reduced to a grim, syntactic instrumentalism. Except for some skeptically minded philosophers, wholesale semantic barrenness has no market-share in mathematical practice, and it could be that Peirce has overstated and over-generalized a more defensible position. Some sign of this can be found at MS 773, 2–3, where Peirce insists that while the referents of made-up concepts lack existence, they are as real as any existent referent. It remains to be seen whether the inconsistency can be worked out. To help keep track of things, the first claim is the semantic barrenness thesis and the second is the real-but-nonexistent thesis. The challenge is to find contexts in which the two theses might co-obtain.
Of course, that mathematical hypotheses are truth-valueless is not a known fact. It is a hypothesis of Peirce’s own framing. This turns out to matter. For if in turn it is framed in the creative manner of hypothesizing, what Peirce is saying of mathematical hypotheses is in and of itself something that lacks a truth-value. It is not true, therefore, that mathematical hypotheses are truth-valueless, and Peirce is hoist on his own pétard. Later on, an effort will be made to unpierce Peirce from his uncomfortable perch. But before moving on, it would be advisable to sound the self-hoisting warning in a more general form:
• The self-hoisting peril: To have condemned a human practice in the terms and manner of that very practice is a self-defeating thing to have done.
The Kempson Rule
In the late 1990s at a one-day workshop on abductive logic at King’s College London, sponsored by the Group on Logic and Computation and organized by Dov Gabbay and Donald Gillies, the noted linguist Ruth Kempson remarked on how odd it was that reasoning at its most common and well-practiced was its least common subject of theoretical enquiry. Some attendees remarked that this theoretical neglect reflected a lingering positivist disdain for the logic of discovery. More than 20 years later, logic’s attention to abduction has intensified to the extent that recent developments justify a Handbook of the large size and wide scope of the present one. As became clear that London day, the study of abduction is a multi-sourced enterprise encompassing philosophy, mathematics, computer science, cognitive systems, and the engineering and social sciences. Of the workshop’s attendees, roughly half were non-philosophers, a number of whom had been unaware of the historical link to Peirce. Even within philosophy proper, not everyone adopts Peirce’s approach or gives him much mind at all. This observation can
be made without complaint, for since the Turing breakthrough of 1950 and its AI and neuroscientific adaptations, the spurt of probability logics in real-number frameworks, and the rise of theoretical linguistics and cognitive systems, one sees that the traditional bifurcation of knowledge into the deductively attained and the inductively justified had become somewhat careworn. Theoretical intuitions about “third-way reasoning” started taking hold and hypotheses were proposed, followed by influential work in fuzzy, circumscription, defeasible, nonmonotonic, default, relevant, paraconsistent, and autoepistemic reasoning. It would only be a matter of time for abduction to find a theoretical home in the third-way writings. For a quick sample, see Simon (1957), Zadeh (1965), Priest (1979), McCarthy (1980), Howard (1980), Reiter (1987), Girard (1987), and Gabbay et al. (1994). See also Bruza et al. (2000), Magnani (2001), Gabbay and Woods (2003), Woods (2003), Makinson (2005), d’Avila Garcez et al. (2007), Bruza et al. (2009), and Woods (2013), especially the latter’s chapter 7, “Third-way reasoning,” chapter 8, “Following from,” and chapter 11, “Abducing.”
Given the disciplinary spread of this third-way interest, perhaps it is not all that surprising that in its entry on abduction in the Stanford Encyclopedia of Philosophy Peirce is mentioned but not cited. When SEP added a one-page supplement on abduction’s link to Peirce, there was slightly more of him but not much. In the recently published Handbook of Rationality (Knauff & Spohn, 2021), of the six indexed citations of abduction, none mentions Peirce, and of the eight indexed citations of Peirce, none mentions abduction.
Clearly, there is no Grand Abbey of Abduction into whose great arches is sculpted the decree, “Enter not into abductive enquiry, ye who are unlettered in Peirce!” Even so, Aristotle to one side for the present, it cannot be denied that before its rise to late twentieth-century attention, no one was more alive to abduction’s peculiarities than Peirce, and it would seem the sheerest folly to grant him historical notice while denying him theoretical weight. Putting these proprietary matters to one side, Ruth Kempson had taken note of something of the first importance. Comparatively speaking, most of a human being’s perfectly correct reasoning is deductively invalid, some of it having generalized from small samples or single cases, and the rest of it successful in the best of the distinctive ways in which abduction is good. How odd it is, then, that from its beginnings to the present day, logic should have been so preoccupied with deduction. It may have something to do with the fact that logic’s founding motivation was deductive, and its greatest achievements to date have been deductively wrought. And certainly on those occasions in which truth-preservation is our inferential target, all steps necessary to get its theory right must be taken. In the large and many-latticed thing that Peirce eventually took logic in its broad sense to be – concerning which, see section “Logic in Peirce’s narrow sense” – the abduction-deduction-induction triple is itself a highly ranked component, with abduction in transitively superordinate position to deduction, and deduction in turn to induction. Of these three, it is abduction that reigns supreme, and in so saying Peirce supplied the theoretical counterpart of Kempson’s observation nearly a century before she made it. Whereas Kempson had noted the primacy of abduction in inferential practice, Peirce had accorded the same ranking to the theory of inferential
practice. But because in all things that matter in the sciences of human behavior, practice precedes theory, a tip of the hat is owed to Ruth Kempson. It would be appropriate to elevate her important observation to some greater formal dignity, calling it “Kempson’s abduction-first rule,” or more briefly, the Kempson Rule. It is an important data-specification rule for any serious theory of human inference to gather up into systematic account. The term “data for theory” is intended in the usual way, that is, as denoting items of pre-theoretical consensus introduced to guide the theory’s search for the as-yet unknown disclosures awaiting discovery. It begins with the fact, once observed by Quine, that all enquiry begins in medias res. Data for theory are taken as defaults against which the theory in its further developments must not transgress except for principled cause.
Aside from five uncertain years of one-year contracts to teach mathematical logic in the Johns Hopkins mathematics department, Peirce had no academic preferment and wrote the work for which we know him during his after-hours as a scientist for various departments of the US government. Apart from an occasional foundational grant, Peirce wrote his philosophy on his own time and his own dime (the latter of which was at times terribly wanting). His range as a working scientist was extraordinary, and he made groundbreaking contributions in many of his many spheres of interest. Apart from his pioneering work in topology and the transfinite (Peirce, 1976; NEM 3, 39–63; 101–115; 121–124), Peirce was a well-experienced practitioner of the methods of empirical science, both human and social. For that to be so, he had to have been fully seized of the critical importance of data for theory. The data for theory prescribed by the Kempson Rule are primarily these two:
• Most of human inference is structured abductively.
• Abductive reasoning is something that we are good at, albeit not perfect.
(We are only human, after all.) The Kempson Rule is distally endorsed by Jaakko Hintikka in our second epigraph, and we might also note how nicely it sits as a corollary of the What Actually Happens Rule for the logic of human inference:
• To see what [reasoning] agents should do, look first to what they actually do. Then repair the account if there are particular reasons to do so. (Woods, 2005, p. 734; cf. Woods, 1999).
The most important of Peirce’s insights into abduction are two, one being that in matters of human reasoning, abduction takes behavioral priority over deduction and induction, and the other being that abduction’s most flourishing home is mathematics. A further feature of Peirce’s understanding of abductive inferences is that, even at their most successfully drawn, they provide no reason to believe the abduced hypotheses to be true (RLT, 178); that is to say, well-abduced hypotheses are justificationally null and evidentially inert. Against this, Woods (2017, 6.2.3) sketches an epistemology that respects What Actually Happens and provides for the
possibility that by line (3) of the Peirce schema, the reasoner could actually have come to know his abduced hypothesis to be true. More of this later.
The Three Faces of Logical Consequence
This is the right place at which to give notice of an overlooked distinction in the logic of deduction, failure to attend to which has occasioned wholly avoidable but widespread confusion and error. The core of deductive logic is the relation of deductive consequence, and its converse deductive entailment. From antiquity onwards, it has been understood that when some proposition S is a deductive consequence of some propositions S1, . . . , Sn, then S follows of necessity from the Si together. It is easy to see that consequence manifests itself in three linked but different ways. In the first way, some consequence is had by (or entailed by) some propositions. In the second, some proposition is spotted as a consequence had by those propositions. In its third manifestation, some proposition is drawn as a consequence of those others. Consequence’s three faces are having, spotting, and drawing. They are nested in the manner of Russian matryoshka dolls, with having nested in spotting and spotting nested in drawing, and linked by backwards transitivity. That is to say, there is no drawing without a spotting, and no spotting without a having.
In matters of logical structure, having is the simplest of the three. It is a binary relation over truth-evaluable items (propositions in my parlance), and when it obtains it does so in “logical space,” a space unpopulated by people. Spotting is a ternary relation over propositions and a third relatum, the cognitive agent who does the spotting. When spottings occur, they occur in “psychological space,” or more specifically “cognitive space,” a space abundantly stocked with people on the cognitive make. Drawing is likewise a ternary relation over propositions and agents. When a consequence is drawn, it happens in the conclusional subspace of cognitive space. This is of particular Peircean interest when it happens in the mathematical subspace of cognitive space.
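The nesting of having, spotting, and drawing can be given a rough illustration in a few lines of Python. What follows is a sketch under stated assumptions, not a rendering of anything in Peirce: the names `PROPS`, `has`, and `Agent` are hypothetical, and consequence-having is modelled, purely for illustration, as classical truth-table entailment over two atoms.

```python
from itertools import product

# Assumption: a toy language with two atoms and a handful of named propositions.
ATOMS = ("p", "q")
PROPS = {
    "p": lambda v: v["p"],
    "q": lambda v: v["q"],
    "p -> q": lambda v: (not v["p"]) or v["q"],
    "not q": lambda v: not v["q"],
}


def has(premisses, conclusion):
    """Having: a binary relation over propositions. No agent appears here."""
    for values in product((True, False), repeat=len(ATOMS)):
        valuation = dict(zip(ATOMS, values))
        premisses_true = all(PROPS[name](valuation) for name in premisses)
        if premisses_true and not PROPS[conclusion](valuation):
            return False  # a counter-valuation exists, so no consequence is had
    return True


class Agent:
    """A hypothetical cognitive agent; spotting and drawing need one."""

    def __init__(self, name):
        self.name = name
        self.spotted = set()
        self.drawn = set()

    def spot(self, premisses, conclusion):
        """Spotting: ternary. It succeeds only if the consequence is had."""
        if not has(premisses, conclusion):
            return False
        self.spotted.add((tuple(premisses), conclusion))
        return True

    def draw(self, premisses, conclusion):
        """Drawing: ternary. It succeeds only if this agent has already spotted it."""
        key = (tuple(premisses), conclusion)
        if key not in self.spotted:
            return False
        self.drawn.add(key)
        return True
```

On this toy model, an agent who tries to draw `q` from `p` and `p -> q` before spotting it fails, and no agent can spot `not q` as a consequence of `p`, since it is not had. The backwards transitivity of the three faces is thereby enforced by construction, which is a design choice of the sketch rather than a claim about how cognition works.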
Each of these manifestations is subject to different conditions. There are conditions under which havings obtain, but none on whether they obtain in the normatively correct way. Spottings and drawings occur. They are subject to occurrence conditions and are also sensitive to considerations of correctness or otherwise. It is prudent to keep in mind that it is propositions that “do” the entailing, and it is we who do the spotting and inferring. Although we can’t spot or draw consequences that aren’t “there,” these doings are nonetheless subject to further conditions on successful performance. At a minimum, infinitely many truth-preserving consequences are had by any cluster of true propositions, but any effort to spot or draw them all would bespeak a barking mad malfunction of the agent’s belief-making devices. Given that consequence-having is an unpeopled relation, we should be able to proceed with it at some remove from what people do or fail to do when they reason. Spotting and drawing are different. In the absence of people, they are
nothing whatever. Accordingly, a full-service logic of consequence must cater for all three domains. To start with, there is no place for psychological considerations in the logic of having, whereas for spotting and marshalling of consequences they cannot be denied a presence, and therewith an open question: To whose theoretical hands is the handling of cognitive considerations best assigned? Some are of the opinion that it falls to psychologists to deliver the goods, but Peirce thinks that in matters of logic psychological deliverance is best assigned to philosophically minded logicians. His reason for saying so lies in the classification of the sciences in which psychology defers to metaphysics and metaphysics in turn defers to logic (Peirce, 1992; EP 2, 258–262). Be that as it may, the more important consideration is whether psychological facts have a place in deductive reasoning. Given that spottings and drawings occur in cognitive space thanks to beings who make their way in life by using their heads, the question answers itself. Affirmatively. Peirce himself confirms this:
Formal logic must not be too formal; it must represent a fact of psychology, or else it is in danger of degenerating into a mathematical recreation. (CP 2. 710)
We have it therefore that the logic of deduction partitions into the people-free theory of deductive entailment and the people-based theory of right deductive reasoning, making this the right place to showcase the:
• First law of proof-theory: Apart from truth-preservation, no property intrinsic to deductive entailment is dispositive for right deductive reasoning.
By “first” is meant first in importance. It is not a discretionary law, and it is certainly not to be trifled with. Any theorem-prover who scants the law is a standing occasion for category-mistake. In deductive contexts, it is an elementary condition on proof rules that they be truth-preserving. Every other condition is a selection-for-use condition, highly sensitive to context and resistant to summary generalization. This has been an open secret since its decisive exposure by Harman (1970). Consider a simple case. That 1 is a number is an elementary axiom of Peano arithmetic. That “1 is a number or Nice is nice in November” is deductively entailed by that axiom is an elementary fact of logic. Yet no one thinks that the entailed sentence is a theorem of Peano arithmetic, or that it is in any way a mathematical truth. The entailment is solid, but the proof is badly flawed. It takes us to where no working mathematician would ever want to go.
Among the constant companions of proof-making are premiss-searches and rule-searches in quest of profitable reciprocities. Proving has three characteristics of particular importance. It is goal-oriented, tendentious, and distraction-intolerant. Peirce is alive to this. The main “business of the mathematician is to discover new theorems,” and he recommends “leaving the grinding of them down into corollaries to the logician” (NEM 4, 289).
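The Peano example can be checked mechanically, which helps to separate the two questions the first law keeps apart. The following Python sketch is illustrative only. The function `entails` tests truth-preservation by brute-force truth table, and `fit_for_arithmetic` is a hypothetical and deliberately crude selection-for-use filter (vocabulary containment) introduced here as an assumption; it is not offered as the correct account of what makes a sentence a theorem of arithmetic.

```python
from itertools import product

# Assumption: two atomic sentences stand in for the example in the text.
ATOMS = ("one_is_a_number", "nice_is_nice_in_november")


def entails(premiss, conclusion, atoms=ATOMS):
    """Truth-preservation: no valuation makes the premiss true and the conclusion false."""
    for values in product((True, False), repeat=len(atoms)):
        valuation = dict(zip(atoms, values))
        if premiss(valuation) and not conclusion(valuation):
            return False
    return True


def axiom(valuation):
    """'1 is a number', treated here as an atomic sentence."""
    return valuation["one_is_a_number"]


def disjunction(valuation):
    """'1 is a number or Nice is nice in November'."""
    return valuation["one_is_a_number"] or valuation["nice_is_nice_in_november"]


# Hypothetical selection-for-use condition: only arithmetical vocabulary is admitted.
ARITHMETIC_VOCABULARY = {"one_is_a_number"}


def fit_for_arithmetic(atoms_used):
    """Crude stand-in for a context-sensitive selection-for-use condition."""
    return set(atoms_used) <= ARITHMETIC_VOCABULARY
```

Here `entails(axiom, disjunction)` holds, so the entailment is solid, while `fit_for_arithmetic(ATOMS)` fails for the disjunction's vocabulary. The second verdict owes nothing to the entailment relation itself; it comes from a condition imposed by the prover's goals, which is the point of the law.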
If a prover felt himself bound to draw every consequence that could be got from his opening premisses or from the lines that follow, he would, in matters of logic, be in the same fix that Tristram Shandy was in in matters of autobiography, each beached at the outset on the shoals
of unreachable goals. Here, too, there is much to-ing and fro-ing and a good deal of trial and error. Over-all experience, lots of it, is the best teacher, and it teaches some useful lessons. Here is one of the more important:
• A diagnostic caution: If a prover deputizes an entailment borne by one or more of his premisses and doesn’t like the result, or cannot abide by it, then cancel the license. But do not go into crisis-management mode. And do not seek solace in magical invocations of the normatively ideal. If the rule displeases the prover notwithstanding its truth-preserving provisions, the failure is his, not its.
It might be noted in passing that relevant and paraconsistent logicians routinely transgress this caution (see, e.g., Beall & Restall, 2006; Brown, 2007; Priest, 2007; cf. Woods, 2003). Proof-goals vary with a prover’s interests, and should not be mixed up one with another. There are discovery-proofs, confirmation-proofs, explicitizing proofs, knowledge-enlarging proofs, problem-solving proofs, and exploratory and manipulative proofs. Proof-theorists do a good job in alerting us to structural differences – direct proof, indirect proof, conditional proof, reductio proof, and so on. But, at some real cost, the goal-orienting differences are overlooked. We will see more of this ahead when Peirce responds to Kant’s complaint about deduction.
In the glory days of nineteenth-century mathematics, there was a good deal of consternation over the state mathematical proof was then in (Frege, 1884; Stein, 1988; Gray, 2004; Ferreirós, 2007). In twenty-first-century logical circles, research is abundant and many windows left open. In an impressive study which still has legs, Beklemishev and Visser report thirty-nine unsolved problems in the logic of proof, the first five of which pertain to “informal concepts of proof” (p. 38).
The purpose in mentioning this here is to emphasize that it is not foreclosed that a full-service logic of deduction would direct its theory of proof to the precincts of cognitive space in which the psychological realities of abductive support are given all the sway that is due them. To the extent that it honors these requirements, a full-service deductive logic would count as logical theory in naturalized form (concerning which, see Woods, 2016a, 2020a, b; Magnani, 2018). In the coming sections, it will be proposed that the logic of spotting and having is best served in two stages. At stage one the observable regularities found in human cognitive experience are sent to what is sometimes called “cognitive economics.” Cognitive economics serves in turn as a staging theory for a causal-response epistemology of logic. Meanwhile, there is more to be learned from Peirce.
Logic in Peirce’s Narrow Sense
Peirce is the independent co-founder with Frege of mathematical logic, although their respective treatments of quantifiers differ from one another markedly. First in print was Frege’s Begriffsschrift (1879) followed in 1883 by Peirce’s “The logic of relatives” (CP 3. 328–358). Neither of these logics makes theoretical mention of
abduction, and they are clearly not the place to look for instruction on that matter. Peirce, however, draws a distinction between logic in the “strict” or “narrower” sense and logic in the “broad” sense (CP 4. 373), but he is not as clear as he should be in demarcating their respective domains. Some scholars identify logic in the strict sense with formal logic, but Peirce says of formal logic that “it is nothing but mathematics applied to logic” (CP 4. 263), and yet Peirce leaves us in no doubt that if abduction is to have a flourishing theoretical home, it can only be found in logic in the broad sense. Since Peirce’s writings on logic are many and scattered, it is a considerable advantage to have at hand some good expository scholarship, as with Engel-Tiercelin (1991), Hilpinen (2004), Pietarinen (2006), and Bellucci and Pietarinen (2015, 2016), for example. One of the merits of the secondary literature on an especially difficult thinker is the heavy lifting it does on its readers’ behalf. But there is nearly always value to be added by going directly to the primary sources. Writing in 1898, Peirce announces that his:
. . . proposition is that logic, in the strict sense of the term, has nothing to do with how you think . . . . Logic in the narrower sense is a science which concerns itself primarily with distinguishing reasonings into good and bad reasonings, and with distinguishing probable reasonings into strong and weak reasonings. (RLT, 143)
A page later he adds that: . . . it is plain, that the question of whether any deductive argument, be it necessary or probable, is sound is simply a question of the mathematical relation between . . . one hypothesis and . . . another hypothesis. (RLT, 144)
Of course, if Peirce’s strict logic were a logic of consequence-having, that is, of deductive entailment, his opening line would be tautological. However, the antecedent of this conditional being untrue, it could appear that the opening line is trouble for Peirce. For, as is well-known, Peirce learned from his mathematically eminent father, Benjamin, that mathematics is “the science which draws necessary conclusions” (CP 3. 558; 4. 229), and the last thing one can say about drawing conclusions is that it has nothing to do with how we reason. So, then, Peirce’s logic in the strict sense is not a theory of the entailment relation. Coming back now to Peirce’s broad and decidedly odd view of mathematics, it is not only mathematics as the science of specified subject-matters and infrastructures – topology, for example – that Peirce means by mathematics. It is defined not by subject-matter but rather by the a priori character of drawing deductive consequences from premisses of whatever subject-matter (CP 2. 532; 4. 239). Logic, on the other hand, is “the science of drawing necessary conclusions,” a theoretical enquiry into the ways and means of a priori reasoning (CP 4. 124; 4. 242). The purpose of logic is “to analyze reasoning and see what it consists of” (CP 2. 582). The logician examines the relations between the premisses and conclusions of reasoning (CP 4. 239; 4. 370; 4. 481; 4. 533). Yet, in “The nature of mathematics,” Peirce parts with his father Benjamin, writing:
On the other hand, it is an error to make mathematics consist exclusively in the tracing out of necessary consequences. For the framing of the hypothesis of the two-way spread of imaginary quantity, and the hypothesis of Riemann surfaces were certainly mathematical achievements. (PM, Art. 3)
But against the grain of the non-truth-valued thesis, a question presses. In what do these mathematical achievements inhere and in what manner are they manifest? A further curiosity of Peirce’s philosophy of logic is how logic is situated in relation to mathematics. The long-received view is that logic is the reigning science, with mathematics running a close second. It is not this way with Peirce. Mathematics is self-regulating and self-certifying reasoning, whose valid inferences are “more evident than any such [= logical] theory could be” (CP 2. 120). In Peirce’s terms, mathematics is an “acritical” endeavor. When he speaks of it as a science which draws necessary conclusions, he means by “science” any form of enquiry into anything whatever that is open to any form of enquiry in whose exercise the reasoning is a priori. So conceived of, if geometry and set theory are mathematics, so too is any episode of reasoning, whatever its subject-matter, that is deductively sound. It could be about peaches and cream or the best place to the sun in the grips of winter. It is important to be clear about this. When Peirce declares for the independence of mathematics from logic, he is not lording topology over the syllogistic or over his own logic of relatives either. Even so, given that logic is the science that examines our consequence-drawings and hypothesis-framings (NEM 4, 289), what pray is it examining them for? By “how you think” Peirce appears to mean how we do in fact think, with respect to which he has two points of contrast. Logic in the strict sense is not merely descriptive of how things actually happen when a human being does the thinking that leads him to a conclusion. Logic’s strict concern is with how such thinking should be done. As formulated, Peirce’s suggestion is that left to its own devices, reasoning as it actually happens stands in need of instruction about how it should be done (CP 1. 577) and, as a normative science, is a branch of ethics (CP 1. 
611; 1. 573; 1. 575; 5. 35; 5. 130). Having done nothing at all to show this to be so, the burden remains with Peirce. He should be careful in discharging it and be mindful of the perils of self-hoisting. Over its many centuries, establishing its own normative authority is something that logic is least good at. Peirce inherits the problem and compounds it greatly with his independence thesis. If mathematics in Peirce’s broad sense of the term is acritical, that is, is a self-regulating and self-certifying form of practice, what has it to learn from logic, and in what respects do logic’s normative presumptions hit mathematically corrective paydirt? This, too, must be looked into further.
Logic Daze
In his latter period, Peirce classified logic in the broad sense as what might be conceived of today as a combination of formal semantics, speech-act theory, and communication studies. Peirce’s name of choice was “semeiotic” (with the second “e” his) for the general theory of signs. He subdivides semeiotic into (1) Speculative Grammar, which investigates the analysis, definition, and further subdivision of signs suitable for productive employment in science; (2) Logical Critic, which is the study of the validity and justification associated with each kind of scientific
reasoning; and (3) Methodeutic, or Speculative Rhetoric (or in plainer words, methodology), which is the science of scientific enquiry of which, as it happens, Aristotle was the systematic founder. A nice recent treatment can be found in Bellucci and Pietarinen (2021a, b). Subordination is a relation of dependency, and Peirce clearly holds that a subordinate science depends on a superordinate one for its justification. A remarkable feature of the three components of Logical Critic is that each is self-justifying and is also justified deductively. Of the three, only deduction has a sole-sourced justification, namely, deduction itself (CP 2. 786; MS 293). Induction is self-justifying but also deductively justified (Peirce, 1982; W 1, 280–285; MS 328). Retroduction is justified in all three ways, self (W 1, 280–281; MS 328), deductively (CP 5. 146; MS 293), and inductively (CP 2. 726; MS 630). Of the three, it is abduction that is most highly credentialed and sits highest in the ways of reasoning, and is so in spite of the fact that:
While Abductive and Inductive reasoning are utterly irreducible, either to the other, or to Deduction or Deduction to either of them, yet the only rationale of these methods is essentially Deductive or Necessary. If then we can state wherein the validity of Deductive reasoning lies, we shall have defined the foundation of logical goodness of whatever kind. (CP 5. 146)
In the 1898 Cambridge Conferences Lectures, Peirce sees it this way:
As for Induction and Retroduction, I have shown that they are nothing but apagogical [= abductive, JW] transformations of deduction and by that method the question of the value of any such reasoning is at once reduced to the question of the accuracy of Deduction. (Peirce, 1992a; RLT, p. 145)
Sometimes, it must be said, Peirce has the sweep and force of Grand Opera: Implausible. Noisy. Magnificent. Preposterous. To modern eyes we have in the remarkable entanglements above a fugue of imaginary and unexplained justificatory imposition, leaving us in no doubt of the weight assigned by Peirce to the justification-condition in the entire reach of his logic. The section entitled “The three faces of logical consequence” introduced the difficulties of rule-selection for the correct management of deductive proof. One of the difficulties, as mentioned, is that a reasoner’s proof-objectives are so highly contextualized that a general-purpose users’ manual is too much to hope for. Boiled down to its essentials, what is wanted from proof-theory is some finite specification of what we should not want from a truth-preserving transformation and of ways in which to avoid them. It is not always easy. In a charming exchange between Felix Klein and Hermann Weyl on the ups and downs of axiomatic proof, Klein retorts, “Suppose I have solved a problem; I have taken a hurdle or jumped a ditch. Then you axiomaticians come around and ask: Can you still do it after tying a chair to your leg?” (Weyl, 1953, p. 225). A search of the daze of Peirce’s logic in the broad sense discloses much talk of justificational obligations, but next to nothing specific. As it happens, however, such specifications are known and ready to hand. They lie in the parts of Aristotle’s Posterior Analytics that Peirce overlooked (and Frege, too, for that matter). Further details in section “Trial by combat”.
In “The nature of mathematics,” Peirce re-asserts the independence of mathematics from logic, writing that: in the perspicuous and absolutely cogent reasonings of mathematics . . . appeals [to logic] are altogether unnecessary. (PM, Art. 4)
This is an important concession. As far back as Plato, the idea has circulated that for a discipline to attain the standing of a true science, it must be subject to axiomatic reconstruction. In such arrangements, axioms are the first principles of those sciences, true and necessary but insusceptible to and unneedful of proof. Peirce is no enemy of this idea, yet as already indicated, Peirce has trouble coloring within the lines of his own distinctions. But whether a priori reasoning in general or the reasonings of mathematics and science, Peirce shows himself at ease with validities that speak for themselves and owe nothing to the proof-powers of logic. By whatever name or none in particular, Peirce concedes the existence of primitive truths, first principles or, in an older meaning, axioms. What is more, although hypotheses are essential to mathematics:
It cannot be said that all framing of hypotheses is mathematics. For that would not distinguish between the mathematician and the poet . . . Detective stories and the like [also] have an unmistakable mathematical element. (ibid., Art. 2)
It would be wise to bookmark this remarkable passage. Fiction’s sentences might be hypothesis-framing, but how could consequences be drawn from them if they lacked truth-values? Still, that Peirce should liken the mathematician’s creative power to the novelist’s or poet’s betokens a semantic sensibility which could be used to advantage in removing the sting from semantic barrenness without the necessity of killing it outright. It suggests a way of performing what we could call the Semantic Recovery Task. The current state of the logic of fiction cannot itself be said to enjoy wide reflective equilibrium (Armour-Garb & Woodbridge, 2015; Jacquette, 1996; Parsons, 1980; Sainsbury, 2005; Walton, 1990; Woods, 1974, 2018). But it is beyond dispute that if the statements of fiction are denied the capacity to refer, all hope of bearing truth-values is lost, and no one would read novels or discuss them over coffee or establish university departments of literature. It was noted earlier that Peirce is of two minds about whether hypotheses are capable of reference to the nonexistent. In the present instance his overall project would be better served if he sided on this question with the Ayes and agreed that there are some important things that don’t actually exist, and have no need to. Certainly, if the discernible regularities of worldwide readerly practice are anything to go on, fiction is not only a truth-bearing medium of cognitive exchange, but a truth-making medium as well. Everyone familiar with Sherlock Holmes knows it to be true of him that in the 1880s he abided in London’s Baker Street and false that he resided in Cheyne Walk or the 16th arrondissement of Paris. Everybody who knows Holmes knows that these facts were brought to pass by the pennings of Sir Arthur Conan Doyle. Everyone familiar with literature knows the importance of character development. The Holmes we meet at the beginning of A Study in Scarlet is but a portent of the man we find at its finish.
The parallels between fiction, so
J. Woods
understood, and mathematics are altogether striking. When a key difference between the two is flagged, there is a way of seeing the making of real truth in mathematics by the communal efforts of its practitioners. The difference is this: For every Doyle-dependent truth about Holmes, there is an opposite truth, made so by the extra-fictional world. There is no such barrier for the facts of creative mathematics. It is simply not true that every statement of mathematics is contradicted by a fact about the world. Think here of the binomial theorem. There is in this a difference of fundamental importance for what matters here. It gives us the Making-up Making-true Paradigm. It is paradigmatic of fiction’s intuitive semantics. The question it raises is whether it travels in good order to the semantics of mathematics. If it does, it does so under some interesting travel restrictions. Roughly speaking, the truths of fiction are made so by the authors of their objects and events. The hypotheses of mathematics are also made by their authors, but they are made true or false by subsequent communal means. Saying so leaves some obvious questions to be taken up. See “Conclusion”. Fiction to one side, it is clear that Peirce has landed mathematics in a parlous mess. Nothing is true in mathematics, and nothing to be known of its objects, least of all that it even has objects. And bearing in mind that mathematics is the superordinate science to whose truths all of science must defer, either mathematics has nothing at all to do with science or all of science must come crashing down, an absurdity in either case. To make matters more pinching, Peirce also thinks that abduction’s future as a tenable mode of reasoning is strongly bound up with the good it does for mathematics in all its various manifestations. News this bad raises the possibility of false attribution or misbegotten interpretation.
So we should pause to ascertain just how much of a semantic anomalist Peirce actually is, that is, how steadfast his opposition to true negative existentials. In “The nature of mathematics” he writes that: the mathematician is not concerned with real truth, but only studies the substance of his hypothesis. (PM, Art. 2)
Indeed: in framing mathematical hypotheses no logic is required, since it is indifferent from a mathematical point of view, how far the hypothesis agrees with the observed facts. (Art. 3)
Nor is this a matter of mere indifference. As we saw: I reply that it [= the mathematical hypothesis] is not a metaphysical proposition, because it is no proposition at all, but only an imitation proposition . . . . A proposition cannot predicate a character not capable of sensuous presentation; nor can it refer to anything with which experience does not connect us. (HP 1, p. 734)
This is problematic. Not only are mathematical hypotheses incapable of reference, they are also incapable of truth-bearing: Truth belongs exclusively to propositions . . . . Truth is the conformity of a representation to its object . . . . (EP 2, 379–380)
Inference, on the other hand, is intimately tied up with belief. Any inference of any of the three logical stripes is “a passage from one belief to another” (CP 4. 53). Indeed, it is the: conscious and deliberate adoption of a belief as a consequence of another cognition. (CP 2. 442)
And: What particularly distinguishes a general belief or opinion, such as in an inferential conclusion from other habits, is that it is active in the imagination (CP 2. 148)
On top of all that, while hypotheses are essential to mathematics and it is to mathematics that all science must defer: what is properly and usually called belief . . . has no place in science at all. (RLT, p. 112)
Indeed: There is no proposition at all in science which answers to the conception of belief. (idem.)
It is clear from the omitted words of the first of the passages quoted just now that what Peirce takes belief to be in the “proper” and “usual” sense is a state of mind which is immune from rational reconsideration. That, of course, is not the natural way of things with the likes of us, who maintain their beliefs until there is reason not to. And with this comes a problem. If, as Peirce says, inference is a passage from one belief to another, then inference itself hardly ever happens in science. What, then does Peirce think mathematicians are doing when they prove mathematical theorems, and to what good end do they bother? How does it come to pass that all of science is bound by these ersatz inferences of mathematics? As things now stand, there are two anomalies in Peirce’s philosophy of mathematics. One is that it is to mathematics that logic and all the rest of science owes its justification. The other is the semantic barrenness thesis. Taken together we have a third by implication: • The scientific wasteland implication: There can be no knowledge to be gleaned from science. All of science is an epistemic nullity. Something has gone seriously awry here. The two theses of mathematical sovereignty and semantic barrenness are hypotheses (not propositions) of Peirce’s own advancement, and the scientific-wasteland implication is drawn from them by inference. But without belief there is no inference, and without propositions there are no truth-values. And since mathematics is the science that draws inferences from hypotheses, and mathematics reigns supreme, Peirce has remounted unawares his own semantic pétard. In ancient times, Aristotle coined the term “paralogismos” for what we call the fallacy of mistaking an improperly constructed argument as properly constructed. In the present case, when we take a non-truth-valued thing for true and do so in a truth-telling frame of mind, we seem to have committed a semantic variation of a fallacy. 
We have mistaken a non-truth-value-bearing thing for a truth-value
bearer. With fallacies there is no prospect of deductive recovery, and if the parallel held here, there would be no prospect of semantic recovery either. Against this it can be proposed that a logic of conceptual change leaves it open that in contexts under present consideration, non-truth-valued talk could in some manner have the cognitive value of true talk. It remains to make the case for the why and the how of such possibilities, in section “A data-respecting epistemology”. The phenomenon of conceptual change is a large and constant presence in human cognitive life. It is the locus classicus of the logic of invention in its best-understood sense. Consider in this regard the concept of elliptic functions in the nineteenth century, and the concept of proof in our own. The first has finally attained its maturity, but the second still has a way to go. Indeed, the concept of classical first-order logic wasn’t full-grown until the late 1960s (Lindström, 1969). As it happens, moreover, this way of proceeding has Peircean roots. The facts which the logician cannot ignore: come within the range of every man’s normal experience, and for the most part in every waking hour of his life. (CP 1. 241)
They constitute: the universal data of experience that we cannot suppose a man not to know and yet be making enquiries. (CP 4. 116)
If Peirce means this, the implications for logical space are as large as they are unmistakable.
Cognitive Economies
Whatever the details of one’s acquaintance with topology or quantum computing, there are truths to tell in science, and objects and relations of which those truths hold true. The domain of thought is a mother lode of human knowledge, and that alone is a fact to stir an abducer’s soul. We are all abducers here. We are organic beings, denizens of the natural world and subjects of the causal order. In the stark terms of evolutionary survival, there is not much good to be said for us. Among fellow upright vertebrates, we are not strong, not swift, not well-protected from the weather’s ravages. We do, however, have an advantage that makes all the difference. We have the brains that nature has fashioned for humans. All natural beings are causally linked to their environments, that is to say, to their habitats and to one another, and they in turn to other sectors of the natural world. These arrangements form ecologies, and whatever is to come to pass with us comes to pass within ecological compass. All beings in the natural order are faced with ecologically sourced coordination-problems, each in its own way a maker of things to happen and the recipient of what others make to happen. Whatever the story of humanity’s command of mathematics, it will have to be a story of ecological systems dubbed by Lorenzo Magnani as “eco-cognitively open”: This special kind of ‘openness’ is physically rooted in the fundamental character of the human brain as an open system constantly coupled with the environment (that is, an ‘open’
or ‘dissipative’ system): its activity is the uninterrupted attempt to achieve equilibrium with the environment in which it is embedded, and this interplay can never be switched off without producing severe damage to the brain. (2022, p. 1; cf. 2009; 2015)
The permanent attachment of the brain to the ever-changing environment subjects its own to a ceaseless causal wash which, upon arrival, is dispersed to the relevant causal-processing units. The constancy of causal update enables the system to stabilize and maintain ecological equilibrium. When a causal input carries information, it is often dispersed to the system’s information-processors in linguistically expressed semantic form – “Watch out for the truck!” But, by far, the largest share of it is processed with the potential not the reality of syntactico-semantic realization. And much of that is stored in the unconscious for possible realization in propositional form under the right cues. There is a common saying that catches the drift of this idea: “Good heavens, how odd! I knew that all along and didn’t realize it until now.” Here is something else to bookmark. The widely distributed experience of not knowing what one knows until something nudges the awareness of it is a fact so large, so representative and so common as to warrant some principled attention from epistemologists. Call this the Unknowing Knowing datum. It is also instructive to bear in mind how much a human being will have come to know even by the onset of speech, and yet how little of it he carries around in the front of his mind. There is no reason to think that we retain all the information that washes over us, but there is every reason to think that we retain and store very large quantities of it in memory. Studies of the dissipative brain suggest an analogue with quantum field theory. Here, for example, is Giuseppe Vitiello: In the dissipative quantum model of the brain the vacuum code is taken to be the memory code. Again memory is represented by a given degree of ordering. A huge number of memory records can be thus stored, each one in a vacuum of a given code . . . In the dissipative model all the vacua are available for memory printing. (2012, p. 316)
In “The nature of mathematics,” Peirce concludes by saying that “Psychics and Physics are widely separated, and influence one another little” (PM, Art. 4). Note well: not “never” but “little” and sometimes to good avail. There are aspects of Peirce’s synechist understanding of continuity that dispose him to the actual existence of infinities even in the finite-looking reaches of nature. But all we need take from dissipative studies is that the memory is enormously large and carries with it a heavy demand for storage capacity which poses, in turn, a large challenge to our memory-retrieval mechanisms. These observations can be marked as serious data for a theory of abduction. A good way to think of our causally open, and coordination-problem-solving ecological arrangements is in economic terms. The bulleted pretheoretic data in Sect. 2 – abduction leaves the largest footprint on human reasoning; and humans are dab hands at abductive reasoning – set the stage for what in Woods (2012) is called “cognitive economics.” Like a money economy, which is a system for the generation and circulation of wealth, a cognitive economy is a system for the generation and circulation of knowledge. From an ecological perspective, cognitive economies are prior in ordo essendi. In the money case, the medium of exchange is money, and in the
cognitive case it is information. Parties to economies of both kinds are subject to resource-limitations – capital flow, time, information, computational power, talent, opportunity, and so on. In a quite general way, action in both places is subject to estimates of effort in relation to resource-draw against expected outcome-value. In each case, an economic system is an interactive and multi-agent cooperative driven by both individual and shared ends and regulated by convention, that is, by shared patterns of conduct for the resolution of co-ordination problems. Economies, both cognitive and moneyed, are the principal instruments of ecological well-being. A money economy does best when exchanges are made freely and competitively. A cognitive economy is like this, too. It flourishes best in free-markets of competing ideas. It is easy to see that a free-market-of-ideas cognitive economy is a standing check on whatever catches notice in the relevant sectors of the market. When it is an economy of public-announcement, market-checks are a matter of course. But short of special dispensation to a willing hand, there will be good ideas that go unnoticed and unchecked. Some good ideas receive no notice because there is no one who understands them. Gauss, who was Riemann’s doctoral supervisor and mentor, didn’t understand the 1851 breakthrough of the Riemann manifold, a respect in which Gauss was not alone. Yet before long, Riemann helped things along with a successful application of his ideas to Abelian functions (Riemann, 1857). In this paper Riemann was able to obtain results which Weierstrass had previously got using the more standard algorithmic or algebraic methods – long and painstaking ones. Riemann’s paper caught Weierstrass’ notice and led him to rethink his approach to the theory of functions in his famous Berlin lectures. For a nice account readers could profit from Bottazzini and Gray (2013).
There is a good deal of hit and miss in the cognitive economy, and sometimes it takes an enemy to make of a rival a star, although posthumously so in this case. The central point to draw from this is that a cognitive economy is a thoroughgoingly competitive enterprise. It is easy to see the appeal that these developments would come to have for evolutionary game-theoretic logicians. For in each case, economic activity means playing the market (Pietarinen, 2006; Hintikka, 2007; van Benthem, 2011). We can take the natural workings of competitive markets of ideas as another datum of cognitive economics for epistemological accounting. Here are a further seven:
• We are organisms with a drive to know things. (Plato was right. Knowledge is seductive.)
• We know lots and lots of things about lots and lots of different things.
• We make lots and lots of mistakes about a lot of different things.
• We are effective resource-managers, constantly mindful of the drag of costs against the pull of benefits. While error-making is always a local cost, across-the-board error-avoidance is a cost-benefit disaster.
• Overall, it is more to our cost-benefit advantage to make, detect, and correct errors than to avoid making them no matter what.
• We are beings who learn from our mistakes. Error-making is a profitable spur to knowing.
• The Enough Already Datum: All things considered, we are right enough enough of the time about enough of the right enough things to survive, biologically prosper and, from time to time, build great civilizations.
Readers might question whether everything on this list qualifies as a pre-theoretical fact or even as a generally agreed pretheoretical default, which latter is the spirit in which they have been advanced here. Those of contrary opinion can take as fact what they see as fact, and treat the remainders as hypotheses abductively stirred by them. So seen, they are submitted to the competitive markets of abductive logic in quest of a market-share that could keep them in contention and give them room to stretch and grow. In money economies, parties prosper when their goods achieve large market-share. In some cases, goods do well enough on the demand side to dominate the market. Here, as elsewhere, there can be too much of a good thing. If a product manages to corner the market, there can be a downwards ripple effect on competition. And without competition, healthy markets wither. Much the same is true of the cognitive economy. An idea – that is, a proposition, doctrine, or theory – does well when it has the requisite number of confident professional backers. We could liken a healthy cognitive market to a competitive system in wide reflective equilibrium, and an idea with good market-share to a proposition stably situated in such an arrangement. The expression “wide reflective equilibrium” was coined by John Rawls, but the concept it expresses originated with Nelson Goodman’s classical paper on how logic’s principles are (and are not) justified. Roughly put, a logical principle is justified to the extent that regular practice comports with it, and reasoning is justified to the extent that it heeds the logical principles. Goodman’s solution has been deftly adapted by William Lycan for enquiry into the management of philosophical disputes.
See here Rawls (1971), Goodman (1955); and Lycan (2019). Actually the idea predates Goodman by two millennia. One finds it in the Nicomachean Ethics at 1145b 2–7, and can paraphrase it thusly: • An account of some disputed matter holds fast when it solves all the questions left open by received opinion and yet preserves as many of the expert opinions as possible. Further details in Woods (2001), chapter 8, and (2021a, b, c). When a mathematician manages to prove an antecedently presumed mathematical fact, the normal thing to do is to tell people about it or submit it to editorial consideration. This marks the onset of the proof’s non-empirical market-test and, the greater the circulation of the journal that publishes it, the larger and more vigorous the testing becomes. When a mathematician arrives at some mathematical hypothesis whose proof, if it exists, exceeds his timely (or lifetime) reach, it goes to the market unproved. Consider a far from common case. An abducer goes to market without proof and fails to find one then or long afterwards or never. Some of these offerings quickly drop from sight. Some await until such time as they chance to be spotted for their overlooked value. Such, as some see it, has been the (so far)
undeserved fate of Mooney (2000). It is but one example of many others. Those that remain could not have done without some requisite degree of market-share. Market-share, recall, is reflected in the width of the wide reflective equilibrium within which a hypothesis has standing. As market-behavior makes clear, there are two reasons to send a hypothesis to market. One is to find confirmation there. The other is to find fruitful, if provisional, employment pending future confirmation-possibilities (Tappenden, 2008). Cognitive economics, the mode of enquiry, is the study of cognitive economies. It is a purely descriptive science which, in all matters cognitive, focuses on the empirically discernible behavioral regularities of beings like us. If, and when, normative considerations begin to press, an orderly transition could be considered from cognitive economics to an epistemology that best respects the behavioral findings currently at hand. Its purpose would be the normative licensing of how reasoning is normally done and/or the correction of its perceived shortcomings. In making the transition, we should take care not to misunderstand the contrast between a descriptive account of human behavior and a normative theory overseeing what the descriptions provide. It is not uncommon to mark this contrast by distinguishing between the kinds of terms that are appropriate for these respective modes of enquiry – the first containing descriptive terms only, with normative terms reserved for the follow-up theory. This is a mistake. No descriptively adequate account of human cognitive behavior can redact the speech regularities that attend human behavior. Accordingly, the right descriptive account must have a normatively stocked lexicon with which to capture the regularities which ripple through our epistemic assessments of ourselves and others. This creates an important burden of proof for any normative theory which seeks to override communal assessment at the local level.
It must produce adequate grounds for doing it. Under business-as-usual conditions, all of us in our cognitive moments engage in normatively self-regulating practices. It falls to the tribunals of epistemology to issue or withhold certificates of normative validation from the metatheoretical on high. Seen this way, a normative epistemology functions in the manner of a Parliamentary Budget Office. It is not an easy task, for without persuasive authority, theoretical override of locally accepted practice carries a high risk of question-begging or behavioral indifference, or both. Peirce himself was well-attuned to the importance of economic factors in human reasoning, and writes of them approvingly (CP 5. 196; MS 690; CP 7. 164–231). Related discussion can be found in Rescher (1976), Hausman (1993). Still, there are problems. They show up in Peirce’s classification of the sciences in which logic outranks the descriptive sciences. Logic is a normative science in which the rightness of a line of reasoning has nothing to do with how the mind works. If logic is the science of the necessities of reality, and unconcerned with the necessities of thought (EP 2, 242–257), where, then, is to be found the Peircean value of the established cross-cultural regularities of the cognitive economy? One of the attractions of normatively self-regulating cognitive economies is the presence there of a behavioral regularity which could be called normalcy-normativity convergence. It suggests that in matters of cross-cultural, non-ideological and apolitical aspect, reasoners of the human kind are disposed to accept that reasoning as it should be done is reasoning as it is usually done. N–N convergence works as a default, and it is left open that, concerning some matters and given certain externalities, how we do reason is how to reason incorrectly. Any decently functioning cognitive economy will be well-experienced in the detection and repair of exceptions to the N–N convergence default. Should a cognitive economy lose its way in the management of its own normative affairs, one could see the attraction of a further body whose function would be to serve as a Court of Normative Appeal. That, certainly, has been the function historically attributed to epistemology, the philosophy of human understanding and knowledge. Often overlooked, however, is epistemology’s remit as the methodology of the deductive sciences, whose principal function was to explain how knowledge is produced as the causal outcome of demonstrative proof. That would be an objective for which cognitive economics could be a helpful preparation. If knowledge is the sort of thing that is brought to be by causal means, there could be merit in finding an epistemology in which this causal connection is adequately captured and plausibly generalized to all knowledge. If, in so proceeding, normative doubts claimed our notice, well and good. One would judge the matter on its merits and amend practice or amend the norm as best one could. In any case, it is helpful to bear in mind that, in matters of logical inference, normativity is a domestic product, not an export from some mythical neverland. We are now at the point at which to raise two questions that matter for Peirce’s logic of abduction.
Having stirred the pot of cognitive economics with a spoon of Peircean disposition, it is fitting that we ask how the economy’s observed regularities fare in an epistemology designed to pay them some respectful mind on their merits; and having done so, to ask how well or otherwise hypothetical reasoning of broadly Peircean stripe comes out there? A second question asks of the epistemological findings how well or otherwise they fit further particulars of Peirce’s logic? It bears repeating that epistemology is no stranger to logic. Logic was founded as the epistemology of the deductive sciences. Its purpose was not to deconstruct the concept of knowledge, but rather to lay out the conditions under which a scientist’s scientific knowledge is produced. Produced where, we might ask? In unpeopled provinces of logical space and, if so to what avail? Or in the heavily trafficked byways of cognitive commerce in terra firma?
Habit
We come now to a marked change of tone in Peirce’s abductive writings, and a lessening emphasis on alleged semantic shortcomings, justificatory dependencies, and presumptuous normative authority. In his huge oeuvre – especially when writing of the influence on logical reasoning of instinct, habit, the practical, and the creative impulse – Peirce is not at all out of sympathy with what actually happens in the heavily trafficked byways of human life. In these moments, we see large flashes
of Peirce as cognitive economist avant la lettre. Of the great logicians, aside from Aristotle, few match Peirce’s sharp eye for the observed regularities of cognitive practice. And, as it fell to him to see, wherever we have the regular we have the habitual. Perhaps the clearest indication of Peirce’s openness to cognitive economics as a prelude to logic in its more normatively prescriptive sense is to be found in the distinction between logica utens and logica docens. The distinction comes from the thirteenth century, and Peirce is likely to have derived it from Duns Scotus. It is not a clear distinction, but it is full of promise – or, as Peirce might have said, gravid with unspoken truth. Before digging in, one should notice, in each occurrence, the ambiguity of “logica.” It matches the ambiguity of “logic” as between the name of a kind of practice and the name of a kind of theory of that kind of practice. In yet a third sense, “logic” denotes the wherewithal for logical practice and logical theorizing. For even further ambiguities, readers could consult Beziau (2010). Peirce himself sees logica as a kind of faculty. Our word “utensil” derives from the Latin, as does “docent” from “docens.” If there were logics of these things, one might be a logic of tools and another a logic of its outcomes. Also nearby is the contrast between being taught by experience and being taught at school. If we took the basic idea as contrasting doing what comes naturally with doing what requires some tutelage, there would be no attraction in seeing them as sealed off from one another, and no indication in Peirce’s writings of thinking otherwise. Some very good work has been done by Peirce scholars on the linkages of habit to the utens-docens distinction (see, e.g., Pietarinen, 2005). But since the focus here is on abduction, there would be some economic advantage in tracking Peirce’s understanding of habit by itemizing the uses he makes of it in the relevant writings.
Of particular interest, and no little importance, are patterns of cognitive behavior to which we have become habituated by nature or thanks to some profitable congress with nurture. On the hypothesis-selection side, Peirce emphasizes our capacity for guessing, an instinctive capacity, sometimes innate and sometimes learned from experience. Peirce acknowledges that we are good enough at guessing to make scientific progress possible, and he does not shy away from saying that we owe our success to the “spontaneous conjectures of instinctive reason” (CP 2. 443), that is, to the possession of “lune rationale” (MS 873 13–15) and “a certain power of divining the truth” (ibid., 638, 14–15). Good guessing is not a lucky stab in the dark. When a hypothesis presents itself, it does so in something like the way in which an observed object does, that is, with force majeure. If a duck sits in the center of your field of vision, like it or not you are seeing a duck. Abduction is like this too. When in a state of having guessed well and having surrendered to the Insistence of an Idea, the hypothesis, “as the Frenchman says, c’est fort comme moi” (CP 5. 181). When an idea comes to one in a flash, it does so at its bidding, not one’s own. There is no doubting the truth that lies in these forceful metaphors. But if reckoned-up from an explanatory point of view, they would tell us no more about the workings of hypothesis-selection than the dormitive virtue tells us about
the workings of sleep. Good guessing, like a talent for scouting future greatness in 14-year-old Minor Hockey League players, is itself as much the occasion of abductive enquiry as its own prior abductive deliverance. Peirce is alert to such reflexivities. He sees methodology as one of logic’s chief operating parts, and, since methodology is the science of scientific enquiry, it is therefore one of its own objects. This is not a matter on which Peirce is always able to steer a straight course, but helpful guidance can be had from Bellucci and Pietarinen (2021a, b). Certainly in common sense terms there is room to wonder whether instinct is a way of reasoning at all. See, for example, Paavola (2005) and Magnani (2011). The present author’s own view, to be a bit previous about it, is that the more we can show even our most consciously deliberative cases of inference to have a causal character, the easier it will be to invest certain kinds of instinctual practice with cognitive significance. If a successful abduction provides no grounds for believing the abduced hypothesis to be true, how can there be reason to believe that our good guesses actually are good? What, indeed, would the good of them be? What are we to make of the Insistence of an Idea if not to believe it? Peirce’s implied answer is that while the hypothesis is not itself worthy of belief, it is however worthy of its day in court. But why? The answer is to be found in the fact that, in one of its meanings, a suspicion is a hunch and, unlike an idle stab in the dark, a hunch can compel conviction. It is widely accepted in policing contexts that among the most talented Serious Crimes investigators are those with strong hunches underlain by weak evidence. The last thing a huncher would volunteer to do is throw in the towel for want of evidence.
Rather the huncher will move mountains to see the matter through to a final conclusion, suggesting that one way of reading the inference schematized by Peirce is that a well-abduced hypothesis is attended by high, if inconclusive, confidence. Here to this same effect is Peirce: It is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses, it will light upon the correct hypothesis. (CP 7. 220)
The postulated finitude is encouraging, but the finite can still be overwhelmingly large. Researchers have yet to arrive at the right abduction about why abduction is as successful as it is. The answer may lie in the nature of abduction’s grasp of a selected hypothesis, that is to say, in its sensitivity to abduction’s two-stagedness. In selecting a hypothesis the abducer must be satisfied that it preserves the truth of the schema’s subjunctive conditional at line (2), and must also be satisfied that, once selected, the hypothesis is likely to have a decent shot at trial. This, in fact, is what Peirce does appear to think: Proposals for hypothesis inundate us in an overwhelming flood while the process of verification to which each must be subject before it can count at all an item even of likely knowledge, is so very costly in time, energy, and money, that Economy here, would override every other consideration even if there were other serious considerations. In fact there are no others. (CP 5. 602)
Indeed if the abducer: examines all the foolish theories he might imagine, he never will . . . light upon the true one. (CP 2. 776)
Peirce has had a slip of the pen here. It is simply not true that hypotheses flood in and inundate us. In lots of cases, hypotheses are notoriously elusive. It is true that information floods in, but even then it doesn’t as a matter of course inundate us (we have filtration systems designed to minimize the impact of noise, information overload, semantic junk, and so on). Still, Peirce has had an essential insight into the workings of abduction. Abduced hypotheses are the fruits of hypothesis-searches whose success lies in no small part in our talent for selecting candidates that test well. And here Peirce says something rather telling: The hypothesis suggested by the present writer is that all laws are results of evolution, that underlying all other laws is the only tendency which can grow by its own virtue, the tendency of all things to take habits. (CP 6. 101)
Concerning Peirce’s thinking about the foundations of continuity, Joseph Dauben (1977) observes that: For Peirce, such ideas justified themselves as a matter of instinct, of common sense. (p. 132; emphasis mine)
Like any other working mathematician, Peirce thought it obvious that there are basic mathematical certainties and, as a philosopher of mathematics, he accepts that unconscious reasoning: . . . forms the bedrock of fundamental mathematical statements and the rules on which the truths of mathematical propositions hang. The validities of statements and assertions that this faculty produces, however mathematical, formal or informal, appear to be beyond any doubt. (Pietarinen, 2005, p. 359; cf. CP 2. 182).
By “this faculty,” Pietarinen is referring to what Peirce calls logica utens. And so? Well, as we have it now, one of the benefits of cognitive economics is the role its findings can play in a follow-up epistemology. An economic finding can be a datum for the philosophical theory of knowledge, or it can be a hypothesis to which is attached an invitation for the epistemologist to take in charge. Peirce’s writings on the cognitive habits of human beings can be seen as contributions to cognitive economics, hence either as data for epistemology to respect or as hypotheses for it to deal with. Come back now to CP 7. 220. If “the human mind is akin to the truth,” that would be reason to suspect that:
• Insistence as a mark of knowledge: When we yield to the Insistence of an Idea and have no independent means of support and no independent grounds for resistance, it can be hypothesized that well-made Insistent belief will take the yielder to what he might already know subconsciously.
That, in turn would shed some light on the Unknowing Knowing phenomenon. • A source of unknowing knowing is knowing the thing in question subconsciously, and with it comes a question. Just what does it take to make something an instance of subconscious knowing? We turn to that now in hopes of some headway with the semantic recovery task.
A Data-Respecting Epistemology
It is well worth repeating that in the beginning logic was the epistemology of axiomatic science. It is we who are the beneficiaries of understanding and knowledge and who are, in turn, beings hardwired to acquire the means to search it out. So it would be a prudent first step into the philosophy that accounts for knowledge to remind its theorists of the empirically discernible regularities of knowledge-seeking life. This is the role of cognitive economics, and it comes into play wearing two hats. Under the one, its findings are presented as motivating data for epistemology. Under the other, they are occasion for epistemology to do them theoretical justice and correct such missteps as may come to principled notice. Otherwise, they should not be trifled with. In Woods (2017, 6.2.3) and elsewhere it is proposed that the epistemology that fits this bill most naturally and most agreeably is a causal-response epistemology (CR), assembled in the ways recommended in Woods (2013). Contra Goldman (1967, 1979), it is an account which distinguishes between conditions under which a belief is justified or well-evidenced and conditions under which it is well-caused. It rejects the presumption that if, on some occasion, a belief lacks evidential backing and yet is believed, then one’s belief-forming devices are out of order, or anyhow out of kilter on that occasion. That rejection is the fundamental insight of CR-epistemology, and it presses the question of what is to be made of it in greater detail. The basic idea is that knowledge is the causal product of information-processing under the right conditions. For example:
• At some time t let X be some cognitive agent, S be some true proposition, and I be some good information. Suppose now that, in processing I, X’s processing devices induced his belief-making devices to produce in X the belief that S. 
Accordingly, X knows that S at t if information I is good, that is, accurate, up-to-date and uncluttered by irrelevancy, spottable inconsistencies, and other forms of semantic junk; X’s belief-forming devices are in sound working order and operating here as they should (in the manner for which nature has made them) and, finally, there are no negative externalities. For example, X is not hallucinating. Similar conditions can be laid out for CR-inference. For example, the basic idea behind deductive or truth-preserving inference can be sketched as follows:
• When in processing some I, X’s processing devices induce his belief-forming devices to produce the compound belief that S follows from S1 , . . . , Sn and also that the Si are true, then in believing in turn that S is true, X has correctly inferred that S from the Si provided that I is good information, his belief-forming devices are in good working order and operating here as they should, and there are no negative externalities. In each case, the examples are reserved for conscious knowing and inferring, but these passages convey the basic idea and enough of the general hang of things to be getting on with in our further reflections. In this framework, belief is a state of mind. Any belief-capable being of any stripe must have a mind for its beliefs to be states of. The CR-model calls upon knowers to be in the requisite state of mind. There are two alternatives to consider. In the one, the knower not only thinks that something is true, but harbors no doubt at all about whether it is indeed the case. In the other, the state the knower is in is one in which he experiences himself as knowing for a cold hard fact that the thing in question is true. It is a state in which the knowledge of it has taken possession of him. If we want a theory of knowledge which honors the economic facts, it will interpret the belief-condition on knowledge in a way that accommodates the observed facts of unknowing knowing. It is easy to see belief in the first sense just above as the more accommodating conception. But if the second sense prevailed, even self-experienced knowings need not be the real thing. Either way, knowing without knowing is something for epistemologists to take into account. Not all information-processing occurs in the cognitive up-above, that is, in the full light of conscious awareness. Most of the information that a human organism will process is processed in the cognitive down-below. 
Information-processing down-below has in varying degrees all or most of the following properties: It is mechanism-centered, unconscious, automatic, inattentive, involuntary, non-linguistically structured, semantically inert, deep-down, parallel, and computationally luxuriant. Information-processing up-above also has in varying degrees all or most of the following properties: It is agent-centered, conscious, controlled, attentive, voluntary, linguistically structured, semantically loaded, surfacely contextualized, linear, and computationally puny (Schiffrin, 1997). Reasoning down-below, and knowledge too, is interesting. When it happens, it happens, as was said, out of sight of the mind’s eye, beyond the reach of the heart’s command, and unnegotiable by tongue or pen (or keystroke). It is important to emphasize that when these properties are instantiated, they need not be instantiated in equal degree. For example, as we might expect, it is true that we are conscious of anything that’s caught our attention, but we don’t pay attention to all the things we are conscious of (Mole, 2008). Much of the causal traffic in the human down-below is entirely subcognitive, a matter of energy-to-energy transductions. At some level, the causal flow converts energy to information, at which point primitive cognitive contact is made. Once information enters the picture, the causal flow starts to take on a productive potential for inference and, in time, produces tacit and implicit knowledge. Under the right
conditions some of it breaks the surface into the light of conscious day. It is an understatement to say that there is much yet to learn of the transitions from energy-to-energy transductions to energy-to-information conversions and from thence to the knowledge produced by Andrew Wiles’ elliptic curves proof (Wiles, 1995). One of the unsolved puzzles is that knowledge is an information-thirsty state to be in, but for most of what we know at t there isn’t room in the conscious mind at t. Of course, no one thinks that the holding capacity of the conscious mind kills the knowledge that’s not presently in it. So somehow the knowledge is preserved and stored in the down-below. This brings us to a central claim of CR-epistemology. It draws an analogy with a key notion of classical thermodynamics.
The phase-transition thesis: Information down below is subject to phase-transitions from one state to a qualitatively different state up above, and also capable of reverse phase-transition back down. In the passage upwards, information loses properties and gains opposite ones. On the way down properties acquired on the way up are lost and their opposites regained. In a more antique formulation, when information is in phase-transition, it retains its haecceity and loses its prior quiddity in acquiring a different one (Callen, 1985; chapter 9).
There is a link between Aristotle’s concept of potentiality (dunamis) and the phase-transitions of modern physics. Dunamis is a thing’s capacity to take on a new form without losing its identity, that is, without undergoing a substantial change (Aristotle, 1984, Metaphysics, Book 17 1 1046a 12, 1048a 25, 27). At this point it might be protested that all this phase-transitions speculation would be lost on Peirce even if he knew of it. It isn’t in fact so: The soul [= mind] then certainly does act dynamically on matter. It does not follow that it acts directly upon matter, because there may be involved an endless series of transformations of energy from motion of one fluid to another, all these fluids being spiritual [= mental], followed by a beginningless series of transformations of energy in one fluid to energy in another, all these fluids being material. (NEM 3, 897; emphases his)
Since knowledge is a storable commodity, it must have features, beyond unconsciousness, that enable it to be stored. It is known that in thermodynamically closed systems, consciousness is an information-suppressor. In the sensorium, the intersection of the five senses, ≈ 11 million bits of information are processed each second. If the information is processed consciously, the count plummets to ≈ 40 bits per second. If linguistically formulated, the count drops to ≈ 16. (Zimmerman, 1989, 166–175). No one thinks that the human information system is thermodynamically closed. But the fact remains that there is massively more that we know than we can ever get our conscious minds around at any given time. What is it about consciously held knowledge that makes it so hard to keep at the front of our minds? If we don’t store knowledge in the conscious sector of the head, what is it about its unconscious sector that makes for its more capacious storage capacity? It is not the comparative sizes of the head’s two sectors. The human unconscious is not an amazon.com warehouse. One might ask whether there can be attention down below without consciousness. The answer is yes. Biological organisms are responsive to calls or signals from their
environments, including other organisms. Doing so facilitates their management of environmental complexity. On the other hand, it hardly seems possible that all biological organisms are subject to the phase-transitions that human knowers are subject to; see here Orzumi et al., 2018. After a bit of reflection it becomes clear that a difference as stark as that between all that we know at t and all we can be aware of at t cannot be explained by limitations of size. The better explanation is that properties borne by knowledge when it is consciously had are erased when it is stored down below. We know some of the properties of consciously held knowledge. We know that the information that underlies it is semantically structured, truth-evaluable, and linguistically expressible. It could be argued that properties such as these are storage-busters. It is not that the conscious mind is so small, but rather that what’s in it is too costly to store in quantities that match its own size. By “costly” is meant that full-presence would crash the system. On that hypothesis, these are the properties that don’t make the cut in the passage to the vaults below, with obvious implications for tacit and implicit (TI)-knowledge. It is here that the phase-transition thesis earns its keep and adds further value to the quest for Semantic Recovery. We are looking for a way to breathe some life into subconscious knowing without transgressing the well-attested regularities of the cognitive economy. If a storage cost of knowledge currently at conscious hand is the loss of its syntactico-semantic structure, the phase-transition mechanisms assure its recovery when resurfacing to the conscious up-above. Given Peirce’s insistence that truth-value possession is the exclusive preserve of the syntactico-semantically structured, what is to be said of the truth-value recovery upon resurfacing? The answer is implicit in the phase-transition thesis. 
If the true proposition that Caesar defeated the Gauls loses syntactico-semantic structure upon deposit in the vaults of down-below and recovers it upon resurfacing, it re-acquires its pre-descent properties, one of which is the property of being true. But, one could ask, wouldn’t the predicate “true” also lose its syntactico-semantic form in the down-below? And reacquire it upon return? The answer is “yes” each time. In the first instance, the item that is semantically stripped down in its descent from up-above is virtually the proposition it was beforehand. And in the second instance, the same can be said. The item stripped down in its descent from up-above is virtually the truth-value the Caesar proposition had beforehand. This gives us a way to lend some clarity to the Unknowing Knowing phenomenon. Come back to the occasion when it strikes someone that his knowledge of S is something that he has known all along without realizing it. What makes such a thing happen, we asked. The CR-answer, albeit a conjectural one, is that the things we have known all along without realizing it until now are things of which we have had tacit and implicit knowledge all along, knowledge which has been triggered to surface by some new informational cue. Logicians of a many-valued disposition may see in this scenario an inviting occasion for further truth-values. One advantage of so seeing is that for well over 100 years, logicians have been building theoretical formal frameworks for the accommodation of such ideas. While the virtuality project could be handed over to many-valued hands, there is no
need to. There is no need to reify virtual truth as a companion entity of truth itself, and there is no need to make up a semantic story about what makes a nonclassical value “dedicated.” The economic benefits of the cognitive down-below are so substantial as to make a strong case for adaptive advantage. See again Vitiello (2001). There isn’t room enough in consciousness for the storage of all we know, all the memories of life’s experiences, all the background information that undergirds successful enquiry. We can’t pack all we’ve come to know in just the past hour in the front of our minds or even at the back. Some of it, no doubt, will soon dissipate without trace, but some of it will be retained where the storage costs are affordable and the storage capacity ample. This, the cognitive down-below, is the home and native land of tacit and implicit knowledge, the large depository of background information, and the principal center of action-guiding mental life. The means are now at hand to remove the sting of semantic barrenness without the necessity of its wholesale abandonment. The doctrine of propositional immaturity and future growth to truth-evaluability makes as much sense as the fact of conceptual evolution. Peirce is one of the many who take concepts as constituting elements of propositions. It is clear that at birth concepts take time and ecological pressure to attain maturity. That being so, any proposition embodying a conceptually immature constituent will itself be semantically immature. But if, like its conceptual ingredient, it is headed for future prosperity, it too will in time succor the fruits of bivalence. Fixity here is a key consideration. If a half-baked concept leaves some slack of referential determinacy – think here of Sinn und Bedeutung – how could half-baked propositions not leave some semantic slack as well? 
Peirce is theoretically well-prepared for this in his explorations of n-valued logics (NEM 3, 739–768), and his fallibilism is a concurrent encouragement. If reality is constantly on the move, and it is never nontrivially the case that our most rationally secure beliefs are wholly immune from revision for cause, it only stands to reason that a young proposition’s semantic prospects could await the rewards of growing up. One of the large benefits of the CR-model of knowledge is the home it provides for tacit and implicit knowledge in the cognitive down-below. If, as the CR-model provides, a condition on prosperity in the tacit and implicit down-below is the attendant phase-transitions of syntactico-semantic structures up-above to the reversible virtualities of subconscious knowing, wouldn’t that afford a welcome theoretical home not only for Peirce’s n-valued leanings but also for his reservations about the semantic maturity of young hypotheses newly made up? In the JTB approach, knowledge depends on what’s true and on successful case-making. In the CR approach knowledge depends on what’s true and what one’s belief-mechanisms endow one with. In the former case, beliefs are needful of case-management. In the latter they are states of mind that wax or wane. Of the two, the CR-arrangement is the more primitive, widely spread and evolutionarily prior. It reflects the priority of causation in the natural order. Case-making awaited the emergence of some of the higher faculties of cognition, and for successful case-making we had to wait a bit longer. So a key question presses:
• Genesis: Just when in the grand arch of a human life does knowing things make its first appearance? The question is rhetorical. Justification is case-making, and case-making is a forensic skill, later acquired and, in any event, difficult to master. Well after we have acquired some varied ken of things, we learn to give conversational expression to differences of opinion. It might be true as Mercier and Sperber (2011) aver that in the absence of hearing and seeing reason-giving speech in action, young children could not have learned to speak their mother-tongues in the way in which they do. Opinions differ as to how good the average human is at winning agreements on their merits (Woods, 2016a, b), but there isn’t the slightest doubt that meritorious case-making does not mark the onset of human knowledge or set a precondition of it. This settles the hash of JTB as the model of knowledge in general, but it retains as ample a particular future as it can manage to colonize and domesticate. One test of an earned JTB-prosperity is proof-theory in application to the deductive sciences. And since in Peirce’s way of thinking mathematics is tethered to abduction, there are special reasons to keep an eye on how the tethering works out. Now that the CR model is trending well with the data for theory, it would be useful to record some further data for epistemology. • Truth-telling: Left to their own devices, beings like us have a strong and widespread disposition towards truthfulness. This is a matter of great Peircean insistence: A scientific man must be single-minded and sincere with himself. Otherwise his love of truth will melt away, at once. He can, therefore, hardly be otherwise than an honest, fairminded man . . . It is quite natural, therefore that a young man who might develop into a scientific man should be a well-conducted person. (CP 1. 149)
The benefits that redound to this ingrained habit are enormous, especially in respect of the circulation-costs of knowledge-spread. Its principal instrument by far is attestation, that is, by being told things, and wasn’t it Peirce who reminded us that we know who our parents are by hearsay? Peirce sees truthfulness not as a courtesy but as a necessary of human life. He sees one’s individual self as undetachable from its social dimension on which depends one’s capacity for consciousness and communication. These are critical junctures for Peirce. For without the impulse of truthfulness, communication will wither, and without communication, social being will lose all cohesion and its moorage will be lost. That could mark the end of the species homo sapiens (CP 7. 57; 1. 172; 7. 569). Peirce knows as well as any of us the virtues of properly occasioned untruthfulness, and the large disposition of people on the make to lie their heads off. Even so, the default position is that humans are truth-tellers and, that being so, it bears weight that if in the general case someone tells you something as true you will believe it too, in default of considerations to opposite effect.
Since this datum holds widely enough to qualify as a default, it is appropriate that we remind ourselves that, by and large, our belief-forming mechanisms are in good working order and, when called upon, operate in the way that they are supposed to. We now have a CR-endorsement of:
• Told-knowledge: Most by far of what a human knows, or ever will, he knows because he was told it as true under conditions that qualify his belief as knowledge.
This is as true at the local coffee shop as it is at the newspaper’s city desk and the Fields Institute. It is a systemwide epistemic phenomenon. Not all or even most of what we’ve been told face-to-face, or read in the library or searched out online originates with the tellers. Tellings rather are the producers of chains of telling among agents and multi-agents alike (MI6) in manifold clusters of epistemic circulation. The Wiles’ proof of 1995 is nearly 120 pages long and encompasses mathematical developments ensuing from the Pythagorean triples of Babylonian times. To bring these reflections to a restful pause, we note that since one’s belief-inducing mechanisms are causally triggered by our information-processing devices, we also have it that:
• Telling as causal: In the general case, if one believes the thing one’s been told on account of being told it, not only does he know it, but the telling was knowledge-causing.
A striking feature of the Told-Knowledge phenomenon, and a source of its value, is its range and inclusiveness. This bears considerable emphasis. And wherever Told-Knowledge takes effect, it displaces evidence as a necessary condition. In his “The fixation of belief,” Peirce is rightly critical of unearned say-so authority, and certainly not everything we are told is something that will cause us to know it to be true. But the Told-Knowledge phenomenon is a fixed feature of human life, concerning which the efficacy of persuasive authority cannot plausibly be gainsaid. 
By the Actually Happens Rule, this is not a phenomenon to be slighted after the fact by normative tribunes. But it leaves a fundamental question for CR-epistemology:
• How do we tell? In what are we to find the difference between tellings that are knowledge-makers and those that fall short?
There is a rough regularity that runs through our positive responsiveness to being told things as true. We don’t respond positively unless we are sufficiently inclined to take our teller to be in a position to know the thing he tells and his readiness to tell it as an indication that he knows whereof he speaks. Whatever the details of how we know, perhaps one should say what Peirce says about abduction. We are good, though not perfect, at discerning knowledge-causing tellings. Our selective knowledge-inducing causal-responsiveness to being told things as true is yet another
feature of cognitive life in which we proceed with a success ratio that indicates adaptive advantage and an instinctive readiness to yield to. Here, too, the EnoughAlready datum holds court. It would be wrong to leave the impression that the CR and JTB orientations are rivals in all respects. It is true that they compete for recognition as a general condition of knowledge, but no one of CR-persuasion would deny the particulars that contextualize the need for justification in most dimensions of cognitive life. Any close observance of cognitive practice – both everyday and as specialized as it gets – would disclose a smooth overall alliance between the two approaches to knowledge acquisition, not least that evidenced in Peirce’s own large and varied example.
Theorematic Proof and Kant’s Problem
Wired into our overall success with hypothesis-selection is our capacity for what Peirce calls “Originary” or creative thinking. He writes that abduction: is Originary in respect to being the only kind of argument which starts a new idea. (CP 5. 171)
How, we ask, is this done? Peirce answers: It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation. (CP 5. 181)
Moreover:
• For every symbol is a living thing, in a very strict sense that is no mere figure of speech. The body of a symbol changes slowly, but its meaning inevitably incorporates new elements and throws off old ones. (CP 2. 22)
Some will see this as pure Kant. Kant is famous for the distinction between analysis and synthesis. Analysis, says Kant, is the business of making our concepts clear, where synthesis is the business of making clear concepts, the capacity for which, like “the schematism of our understanding, . . . is an art concealed in the depths of the human soul” (Kant, 1933 A 141-B). While not rejecting Kant’s distinction, Peirce was not, however, synthetically minded about the apriorities of mathematics. Like Frege (but not on his account), Peirce finds for the analyticity of mathematics. To be analytic, he says, is to be a definition or logically derivable from definitions (CP 6. 595). (This, by the way, is almost word-for-word in Frege, but again not on Frege’s account in Peirce.) Here is Peirce on what really matters. We have met it before. It is the place where Peirce changes his mind about negative existentials, allowing as he now does for truth-bearing reference to the nonexistent.
• Almost all the theoric inferences are positively creative, that is, they create, not existent things, but entia rationis which are quite as real. (MS 773, 2–3).
This is a vital concession. Peirce is saying, and is right to, that there are real things that do not exist, things that are just as real as those that do exist, and that those things are free creations of the human mind. This plays a foundational role for truth-making in mathematical contexts. There is a growing and important literature about the ways and means and upshots of scientific creativity. The press of time puts their discussion here out of reach. See, for example, Thagard (1992), Nersessian (2008), Magnani (2017), Woods (2018), Aliseda (2021) and Shook (2021). Bearing on this directly is the difference between what Peirce calls corollarial and theorematic proofs. Corollarial deduction is reasoning: where it is only necessary to imagine any case in which the premisses are true in order to perceive immediately that the conclusion holds in that case. (MS L 75)
Theorematic deduction: is deduction in which it is necessary to experiment in the imagination upon the image of the premiss in order from the result of such experiment to make corollarial deductions to the truth of the conclusion.
Corollarial proof is the object of Kant’s nonampliative complaint. Theorematic proof is Peirce’s (and he thinks the only) way of getting round it. In MS 617 and MS 201, Peirce makes it clear that he takes his treatment of theorematic inference to be of the first importance to his philosophy of logic. Theorematic reasoning in turn is broken down into two parts. Its first part is analysis, a technique by which the concepts embedded in the premisses are explicated. In part two, the deduction proceeds by laying out what follows logically from those explicated premisses (MS 842 35, 43; MS 843 44). In some of the analytic phases, provisions are made for the introduction of lines which are discharged before the final outcome, rather in the manner of indirect proofs from days of yore. In other respects, analysis is more aggressive. Analysis in this sense marks a significant departure in our understanding of deductive inference, for Peirce is saying that the deducer is not only free, but has a duty, to reconceptualize the premisses in ways that load the input with new information that is passed on to the output. By these means, Peirce wants to free valid deductive inference from the charge of triviality. Frege did the same with “constructive” definitions and ideographic proof. In each case, it was a fevered response to a longstanding charge that there is nothing new to be learned from deductively derived conclusions. Kant’s complaint against truth-preserving proofs with premisses already known to be true is that their conclusions cannot tell us more than we knew at the outset. There are two basic but inequivalent ways in which corollarial proofs can be understood to function. In the one, conclusions explicitize the knowledge left implicit in the premisses. In the other, conclusions actualize the premisses’ epistemic potential. In the first case, what is made known at the end has been unknowingly known all along. 
In the second, it is left open that what the conclusion makes known hasn’t in any way been known beforehand. Seen the first way, Kant’s observation is correct but his complaint unfounded. Seen the second way, he is simply mistaken. Besides, mathematics brims with proofs that look nothing like these.
J. Woods
We might find it odd that Peirce would have shared Kant’s dissatisfaction. In Peirce’s philosophy of mathematics, the sole role of proof lies in the tracing of consequences of hypotheses of one’s own antecedent fabrication. Such proofs are purely exploratory; they are proofs that spot consequences, not establish their truth. Still, to traditional eyes, a theorematic proof is defective. It has the look of a fallacy of equivocation. Assuming that hypotheses have conceptual components, when we move from these lines in first occurrence to recurrence somewhere later in the proof, some prior concept has been made-over or some new concept under the old name has been thought-up and introduced without proof as an extra line. It is quite true that some theorematic proofs bear some resemblance to some kinds of indirect proof, and give no offense in that regard. But a plain old proof by contradiction is not, just so, a theorematic proof; so we should not make too much of fragmentary resemblances. Theorematic proof is always proof with a palmed card up its sleeve. Of equal importance in Peirce’s philosophy of mathematics is the employment of iconic measures for proof-making (NEM 3, 405–446, 491; NEM 4, 38, 47–48, 176, 238). There is a rich literature on diagrammatic proof, whose origins predate Euclid, in which we find clear precedents for the value that Peirce attaches to the theorematic. Diagrams perform several different functions, from visual support of an instruction-manual to the exploratory fiddlings in the upper divisions of abstract science, the fiddlings of “manipulative abduction” (Magnani, 2017). But theorematically motivated diagramming assigns to its iconic processes the ways and means of conceptual change in medias res a given proof. Peirce’s attachment to this feature is as steadfast as Frege’s embrace of the ideography he announced in 1879. 
We have here a distinction between diagrams as visual aids and diagrams as instruments of conceptual change and premissory expansion in purpose-built artificial languages. For the ancient background of this usage, Macbeth (2010) provides solid insight. For its impact on the revolutionary stages of the late nineteenth century, see also Macbeth (2018). Perhaps there would be some value in pausing briefly with Frege (1967). At page 6 of Begriffsschrift, Frege likens his formula language for concept-writing to a microscope, which hooks up nicely with Peirce’s insistence that logic is the study of drawing necessary conclusions by diagrammatic reasoning. The key is that although we can’t see with the naked eye what the microscope reveals about the deep structure of a visible object, its images (representations) can be seen as set out in considerable diagrammatic detail. So if under the microscope of Frege’s Begriffsschrift the depth grammar of language itself can be gleaned, it too can be diagrammatically rendered and made available to the surface-level reasoner. If the connections revealed in the down-below shape the conditions under which pure thought is formed, Begriffsschrift can diagram those too. Of course, Begriffsschrift lacks for practical employment at the corner store, Gramercy Grill, and even at the Fields Institute. All the same, every indicative declarative sentence of any human language has a depth grammar that regulates semantic flow and for which, in principle, a Begriffsschrift might be provided. If all this held true, then for every truth-valued sentence of Begriffsschrift there would be a counterpart sentence of natural speech in whose depth grammar that truth-value would be preserved.
8 How Abduction Fares in Mathematical Space
Here, too, there is a large and growing – and very fine – literature on Peirce’s iconic and graphic achievements (Pietarinen, 2006, 2019; Bellucci & Pietarinen, 2016; Peirce, 2020; Gangle et al., 2021; Bellucci & Pietarinen, 2021a, b). But, for what concerns me here, we would be better served by actually giving it a miss. Here is why. When Kant condemns truth-preserving proof as epistemically stunted, he appears to be faulting the entailment relation for what might not be in it to provide. It can be conceded without strain that the relation of entailment is truth-preserving, but it is a good deal less clear whether it is knowledge-preserving. More generally speaking, let V be any property that a cognitive agent might value in the conclusions he derives from premisses possessing the same value for him. It would be several bridges too far to suppose that any such property is intrinsic to truth-preservation, or at least its constant companion. In matters of proof, what can be got from what is a function of the prover’s interests, of the semantic content (or otherwise) of his premisses, of their subject-matter and the particulars of the context in which the desire for a proof arose in the first place.
Consider now the case of a theorem-prover named Zenon, and consider the case of knowledge-producing proofs for any science that tolerates the presence (and the importance) of first principles, for unproved truths that speak for themselves. (Aristotle is also in that number, as are Frege and Peirce.) Indeed, given Peirce’s openness to principles that speak for themselves without the aid of logic and his large attachment to the foundational hierarchies of science, this is an idea that sings on the street where he lives. Suppose, in addition to truth-preservation, that Zenon has imposed on the system’s proof-rules the following constraints:
• Intelligibility: Axioms are known to be true by anyone who understands the language, and are eligible for insertion at any stage of the proof.
• Structural dependency: Save for axioms, every line of the proof must contain some nonlogical terms of the preceding lines.
• Nonmonotonicity: Once the proof’s premisses have been lodged, subsequent nonaxiomatic addition to the premiss-pool is disallowed.
• Subject-matter preservation: Every line of the proof must share the subject-matter of some preceding lines.
When these conditions are met, we have three further derived ones:
• Theorem-generation: If the proof’s premisses are axioms, any subsequent line is a theorem of that same system.
• Knowledge-production: The proof is so constructed that anyone who is able to follow it and know its premisses to be true will be caused to know that its conclusion is true.
• Consistency: Any system abiding by the primary constraints is inconsistency-proof.
Mindful of the importance of the first law of proof-theory, we should note that the constraints are properties of present proof and not facts about entailment.
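Three of Zenon’s primary constraints can be illustrated with a small mechanical check. What follows is only a sketch under stated assumptions, not a rendering of anything in Aristotle or Peirce: the `Line` record, the hand-annotated `terms` and `topic` fields, and the reading of “lodged” as “closed once the first derived line appears” are all hypothetical choices made for illustration. Truth-preservation and intelligibility are not tested; the check simply takes the labelling of each line as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    text: str
    terms: frozenset  # nonlogical terms in the line (annotated by hand)
    topic: str        # subject-matter label (annotated by hand)
    kind: str         # "axiom", "premiss", or "derived"


def violations(proof):
    """Return (line_index, constraint_name) pairs for each breach found."""
    found = []
    premiss_pool_closed = False
    for i, line in enumerate(proof):
        earlier = proof[:i]
        # Assumption: the premiss-pool counts as lodged once deriving begins.
        if line.kind == "derived":
            premiss_pool_closed = True
        elif line.kind == "premiss" and premiss_pool_closed:
            found.append((i, "nonmonotonicity"))
        if not earlier:
            continue  # the opening line has nothing to depend on
        # Structural dependency: axioms are exempt, all other lines must
        # share at least one nonlogical term with some preceding line.
        if line.kind != "axiom" and not any(line.terms & e.terms for e in earlier):
            found.append((i, "structural dependency"))
        # Subject-matter preservation: some preceding line has the same topic.
        if not any(line.topic == e.topic for e in earlier):
            found.append((i, "subject-matter preservation"))
    return found


barbara = [
    Line("All men are mortal", frozenset({"man", "mortal"}), "zoology", "premiss"),
    Line("All Greeks are men", frozenset({"Greek", "man"}), "zoology", "premiss"),
    Line("All Greeks are mortal", frozenset({"Greek", "mortal"}), "zoology", "derived"),
]
late_premiss = barbara + [
    Line("All triangles are figures", frozenset({"triangle", "figure"}), "geometry", "premiss"),
]

print(violations(barbara))       # no breaches reported
print(violations(late_premiss))  # the late, off-topic premiss is flagged three times
```

The point of the toy is that each check inspects the lines of a particular proof. None of them consults the entailment relation, which is in keeping with the remark that the constraints are properties of proof and not facts about entailment.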
Summing up, here is what is now on hand: truth-preservation, true starting-points, meaning-overlap relevance, premiss nonredundancy, and topical relevance. What relevant and paraconsistent logicians want for entailment, Zenon wants for knowledge-producing proof, and in so doing abides by the first law of proof. This is Zenon’s answer to Kant, and he can thank Aristotle for having provided it. Each of the properties listed here is a property of features that define Aristotle’s concept of syllogism. Syllogisms, it must be said, are highly constrained truth-preserving structures, made so by defining conditions that exhibit the proof-conditions that Zenon has just now called upon. But he does not invoke the syllogistic conditions from which his proof-properties have been extracted. He doesn’t want the constraints; he wants the properties of four of them. And he wants the properties he wants from them to be free to travel. A syllogism is a valid truth-preserving structure entirely constructed from categorical propositions – propositions in the form “All – are –,” “No – are –,” “Some – are –,” and “Some aren’t –.” The Aristotelian details needn’t detain us here, but readers could consult Prior Analytics A 24b 19–24 and Topics 100a 25–27, and some follow-up discussion can be found in Woods (2014), chapter 3. But the point to be emphasized is that the present conditions on knowledge-producing proofs are not the conditions that define syllogisms. Syllogisms are too confining. Peirce’s iconic provisions were shaped as an answer to Kant, but Kant, as we now see, is mistaken. So for the purposes of knowledge-producing proof, the diagrams can also be omitted. True, Peirce says that mathematics has no need or place for corollarial proofs. In that respect he, too, is mistaken. It is true, in any case, that proofs of this sort are wanted for the consequence-drawing dimension of deductive consequence and, as framed here, Zenon’s proof-rules will fill the bill.
And although Peirce is steadfast in his conviction that mathematics cannot advance without theorematic proof, that, one should note, is an empirical claim with little to support it among the observed regularities of mathematical practice. See, for example, Ferreirós (2015).
Trial by Combat
Logic’s first provisions for the testing and licensing of unproved propositions are to be found in Aristotle’s essay On Sophistical Refutations and Book VIII of his Topics. On a quick reading, Aristotle is setting out some rather stylized rules for prevailing against an opponent in a debate. In fact he is doing something far more substantial. He is doing some of the basic engineering of devices for testing by nonempirical means the unprovable principles of the deductive sciences. The basic idea is that you test the bona fides of an unproved but well-embedded proposition of science by declaring war against it. If it goes down, it stays down. If it stays up, it prevails. In a stripped-down example, there are two participants: the party who advances a thesis and then responds to questions, and the opponent who puts those questions to him. If, as matters proceed, the questioner is able to derive from the thesis-holder’s own answers the contradictory of that thesis, then the questioner has prevailed. The core ingredient of the contest is what Aristotle dubs a proof ad hominem. It is not, he
says, a proof in the full sense, but rather against the opponent who holds the thesis in question (Soph. Ref. 22, 178b 17, 170a 13, 17–18, 20, 177b 33–34, 183a 22, 24; Top. 161a 21; Metaphysics K5). What the ad hominem proves is not the falsity of the holder’s thesis, but rather that its holder has made an inconsistent defense of it. Suppose now that the theses of these contests were reserved for endoxa, that is, for opinions held by everyone whomsoever, or by the many or great majority of people, or by the wise or the top experts in the discipline within which the thesis has arisen. Opinions of this kind have come to be called “dialectical” opinions, and ad hominem proofs have come to be called “dialectical” arguments, and there are various streams of dialectical logic making the rounds to this day. But by whatever name, what matters here is that ad hominem arguments are difficult to win given the large toehold of dialectical opinion and the high standard for dialectical refutation. The difficulty multiplies for the holder when the object of the exercise is to find propositions that withstand all efforts of the best minds in the field to trip the trap of an inconsistent defense. The polished version of dialectical testing is set forth in Posterior Analytics Book A. It is a contribution to the logic of abduction (epagōgē), framed in response to what Aristotle takes to be a fact. Although the first principles or axioms of a deductive science are true, indispensable, most clearly understood, and neither susceptible to nor needful of proof, they are not in fact self-announcing. How, then, is it confirmed that the received first principles of a science are the real thing? The short answer is that they are received, that is, taken to be the ones that count. And it is here that Aristotle brings to the fore the dialectical instruments forged in the foundry of On Sophistical Refutations.
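The logical core of the ad hominem contest can be sketched in a few lines. The sketch assumes, purely for illustration, that thesis and concessions are expressible in classical propositional logic, and it encodes each as a Python function from a valuation to a truth-value. The names `entails` and `refuted`, the atoms `p` and `q`, and the sample concessions are hypothetical, not drawn from Aristotle’s text. The questioner prevails when the holder’s concessions entail the contradictory of the thesis, which is the same as saying that thesis and concessions cannot all be true together.

```python
from itertools import product


def entails(premisses, conclusion, atoms):
    """Truth-table test: no valuation makes every premiss true and the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premisses) and not conclusion(v):
            return False
    return True


def refuted(thesis, concessions, atoms):
    """The questioner wins if the concessions entail the contradictory of the thesis."""
    return entails(concessions, lambda v: not thesis(v), atoms)


atoms = ["p", "q"]
thesis = lambda v: v["p"]                 # the holder's thesis: p
concessions = [
    lambda v: (not v["p"]) or v["q"],     # conceded under questioning: if p then q
    lambda v: not v["q"],                 # conceded under questioning: not q
]

print(refuted(thesis, concessions, atoms))      # True: the defense is inconsistent
print(refuted(thesis, concessions[:1], atoms))  # False: one concession is not enough
```

A positive result shows only that the holder’s answers and the thesis cannot stand together. It is silent on which of them is false, which matches the observation that the ad hominem does not establish the falsity of the thesis itself.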
As Aristotle rightly assumes, it lies in the routine management of the cognitive economies in which first principles gain ascendancy that such principles will be the products of dialectical exhaustion. They will have survived all efforts of refutation from the top experts in the field to show them incapable of consistent defense. Accordingly, we can say of a principle that has traction in an economy in wide reflective equilibrium that it is inductively supported to the extent that it is ad hominem-resistant and ad ignorantiam-inviting. A first principle therefore has cornered the market, and flourishes in the absence of competitors. The successful candidate will also have a good ad ignorantiam record as indicated by the fact that if there were something wrong with it, the market would have known it by now. After all, at some point a statute of limitations must weigh in. What Aristotle would have us take from this is abductively wrought. What best accounts for a proposition that exhausts all competition is that the principle is in fact true, indispensable, and neither needful nor susceptible of proof. As Peirce once said, induction is justified abductively. What Aristotle has done in effect is describe in a somewhat stylized way the workings of the cognitive economies of deductive science. For some surprising convergence with Peirce, Frege and Russell, readers could consult Woods (2020b, 2021b). Unlike deduction, induction is not a world-closing thing. Things happen; telescopes are invented; and the best of minds sometimes change. Aristotle is a fallibilist of Peircean cast. Things are as our best principles take them to be, except for such
dialectical objections as might come to arise, in which case qualifications would have to be entertained (Metaphysics, 1011b 14–20). Although Aristotle’s inductive test was framed for the certification of scientific axioms, it is easily adaptable to Peircean abductive measures for testing mathematical hypotheses. The testing is administered without the need or the resources for endless ad hominem attack case by case. Administered by the business-as-usual workings of the mathematical economy, tests are supplemented by autoepistemic considerations. Once an idea hits the market and remains there for a spell, if it has yet to be revealed as a loser, its survival chances are good. Its refutation-resistance reward speaks well for it. It is not world-closing assurance. Things happen in mathematics. Riemann’s manifold was one, Cantor’s diagonalization was another, and Mochizuki’s inter-universal Teichmüller theory is knocking at the door (Woods, 2021d, p. 3362).
Conclusion
As the present volume makes clear, the role of abduction in human knowledge is now the subject of a large, growing, and multi-faceted research effort, most of whose results have yet to achieve settled market-share. It is to the large credit of abductive logic’s modern founder that abduction’s indispensability to mathematics should now have made footfall in various of the subject’s contemporary research programs. Among its still-open questions is the relationship between making up new concepts in mathematics and making true the propositions in which those concepts are constituents. Another open problem, a direct inheritance from Peirce, is sorting out abduction’s operational role in the deductive architecture of deductive proof. Bridging from this – especially in light of the remarkable biography of the Wiles proof of Fermat’s Last Theorem – is coming to sound explanatory terms with the large and indispensable role of multiagency, both in science and everyday cognitive life. By far the most prominent example of this is the free market of ideas in the cognitive economies of mankind. It is not a research-problem reserved for abduction. But to the extent that abduction interweaves profitably with deduction and induction in a solo-agent’s cognitive doings, it is to that extent and more so the case that these same arrangements will take multi-cognitive effect. One of the still-unanswered questions concerning abduction’s role in the philosophy of mathematics is the Making-up, Making-true paradigm bruited in Sect. 4. What is arguably the oddest semantical feature of natural language is the capacity to create real but nonexistent objects by writing things down and concurrently making captivating things true of them in that same way. Not all semanticists of fiction acknowledge the reality of nonexistent objects, but if we stick with MS 773, 2–3, Peirce is not in that number, nor is he alone in that respect (Woods, 1974, 2018).
As already remarked, it is a standing fact about any author-dependent truth of fiction about a non-existent object that some nonfictional fact of the world about some or other existent object stands against it. That is one big difference between fiction
and mathematics. The other is that the truths of fiction are author-created, whereas made-up mathematics is made by individuals, but made true by market forces. For this to work, various considerations must come to pass. One is that a hypothesis whose constitutive concepts have matured is now a proposition that has grown into bivalence. Another is that, if these transitions were effected by market-forces, then as much as we succeed in making up mature concepts, we also make propositions bivalent. From this an abductive inference awaits the taking up. Since market-forces serve as hypothesis-testers and pause when bivalence is achieved, what best accounts for this is that, there being no current reason to find the hypothesis false, we have it autoepistemically by argumentum ad ignorantiam that what we’ve now got hold of is a proposition made true by the market’s creative ambit. This is something that must be looked into further.
Acknowledgments
For their generous assistance, good example, and good judgment in the preparation of this work, I warmly thank in alphabetical order Atocha Aliseda, Richard Atkins, Jean-Yves Beziau, Lorenzo Magnani, Woosuk Park, Ahti-Veikko Pietarinen, and Alirio Rosales. A second round of thanks to Professors Magnani and Park for skillful editorial assistance and to project coordinator Salmanul Faris Nedum Palli for technical support.
References
Aliseda, A. (2021). The place of logic in creative reason. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 149–160). Springer Verlag.
Alward, P. (2012). Empty revelations: An essay on talk about and attitudes toward fiction. McGill-Queen’s University Press.
Aristotle. (1984). The complete works of Aristotle: The revised Oxford translation, two volumes, Jonathan Barnes, Ed. Princeton University Press.
Armour-Garb, B., & Woodbridge, J. A. (2015). Pretense and pathology: Fictionalism and its applications. Cambridge University Press.
Atkins, R. K. (2021). Peircean examination of Gettier’s two cases. Synthese, 199, 12945–12961.
Beall, J. C., & Restall, G. (2006). Logical pluralism. Oxford University Press.
Bellucci, F. (2019). Abduction in Aristotle. In D. M. Gabbay, L. Magnani, W. Park, & A.-V. Pietarinen (Eds.), Natural arguments: A tribute to John Woods (pp. 551–564). College Publications.
Bellucci, F., & Pietarinen, A.-V. (2015). Charles Sanders Peirce: Logic. In Internet encyclopedia of philosophy (pp. 1–40). https://eiep.utm.edu
Bellucci, F., & Pietarinen, A.-V. (2016). Existential graphs as an instrument of logical analysis. Part I: Alpha. The Review of Symbolic Logic, 9, 209–287.
Bellucci, F., & Pietarinen, A.-V. (2021a). Methodeutic of abduction. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 107–127). Springer Verlag.
Bellucci, F., & Pietarinen, A.-V. (2021b). An analysis of existential graphs – Part 2: Beta. Synthese, 199, 7705–7726.
Beziau, J.-Y. (2010). Logic is not logic. Abstracta, 6, 73–102.
Bottazzini, U., & Gray, J. (2013). Hidden harmony – Geometric fantasies: The rise of complex function theory. Springer.
Brown, B. (2007). Preservationism: A short history. In D. M. Gabbay & J. Woods (Eds.), The many valued and nonmonotonic turn in logic (Vol. 7, pp. 95–127). Elsevier Science.
Bruza, P., Cole, R., Bari, A., & Song, D. (2000). Towards operational abduction from a cognitive perspective. Logic Journal of the IGPL, 14, 161–177.
Bruza, P., Barros, A., & Kaiser, M. (2009). Augmenting web service discovery by cognitive semantics and abduction. In Proceedings 2009 IEEE/WIC/ACM international joint conference on web intelligence and intelligent agent technology (pp. 403–410). IEEE Press.
Callen, H. B. (1985). Thermodynamics and an introduction to thermostatistics (2nd ed.). Wiley.
D’Avila Garcez, A. S., Gabbay, D. M., Ray, O., & Woods, J. (2007). Abductive reasoning in neural-symbolic systems. Topoi, 26, 37–49.
Dauben, J. W. (1977). C. S. Peirce’s philosophy of infinite sets. Mathematics Magazine, 50, 123–135.
Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Summer 2021 Edition). https://plato.Stanford.edu/archives/sum2021.entries/abduction/
Engel-Tiercelin, C. (1991). Peirce’s semiotic version of the semantic tradition in formal logic. In N. Cooper & P. Engel (Eds.), New inquiries into meaning and truth (pp. 187–213). London: St. Martins Press.
Ferreirós, J. (2007). Labyrinth of thought: A history of modern set theory. Birkhäuser. First edition 1999.
Ferreirós, J. (2015). Mathematical knowledge and the interplay of practices. Princeton University Press.
Flach, P. A., & Kakas, A. C. (Eds.). (2000). Abduction and induction: Essays on their relation and interpretation. Kluwer.
Frege, G. (1950). Die Grundlagen der Arithmetik, eine logisch mathematische Untersuchung über den Begriff der Zahl, Breslau: W. Koebner, 1884. Translated as The Foundations of Arithmetic: A Logico-Mathematical Enquiry into the Concept of Number by J. L. Austin, Oxford: Basil Blackwell.
Frege, G. (1967). Begriffsschrift, a formula language, modeled upon that of arithmetic. In J. van Heijenoort (Ed.), From Frege to Gödel: A sourcebook in mathematical logic 1879–1931 (pp. 5–82). Harvard University Press.
Gabbay, D. M., & Woods, J. (2003). Agenda relevance: A study in formal pragmatics, volume 1 of a practical logic of cognitive systems. North-Holland.
Gabbay, D. M., & Woods, J. (2005). The reach of abduction: Insight and trial, volume 2 of a practical logic of cognitive systems. North-Holland.
Gabbay, D. M., Hogger, C. J., & Robinson, J. A. (Eds.). (1994). Handbook of logic in artificial intelligence and logic programming, volume 3, nonmonotonic reasoning and uncertain reasoning. Oxford University Press.
Gangle, R., Caterina, G., & Tohmé, F. (2021). Abductive spaces: Modelling concept framework revisions with category theory. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 49–73). Springer Verlag.
Girard, J.-Y. (1987). Linear logic. Theoretical Computer Science, 50, 1–102.
Goldman, A. (1967). A causal theory of knowing. Journal of Philosophy, 64, 357–372.
Goldman, A. (1979). What is justified belief? In G. Pappas (Ed.), Justification and knowledge (pp. 1–23). Reidel.
Gray, J. J. (2004). Anxiety and abstraction in nineteenth century mathematics. Science in Context, 17, 27–47.
Harman, G. (1970). Induction: A discussion of knowledge to the theory of induction. In M. Swain (Ed.), Induction, acceptance and rational belief (pp. 83–99). Dordrecht: Reidel.
Hausman, J. A. (1993). Contingent valuation: A critical assessment. Amsterdam: Elsevier.
Hilpinen, R. (2004). Peirce’s logic. In D. M. Gabbay & J. Woods (Eds.), The rise of modern logic: From Leibniz to Frege (pp. 611–658). Amsterdam: Elsevier.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34, 503–533.
Hintikka, J. (2007). Socratic epistemology: Explorations of knowledge-seeking by questioning. Cambridge University Press.
Howard, W. A. (1980). The formulae-as-types notion of construction in 1969. In J. Seldin & R. Hindley (Eds.), To H. B. Curry: Essays in combinatory logic, lambda calculus and formalism (pp. 479–490). Academic.
Jacquette, D. (1996). Meinongian logic: The semantics of existence and nonexistence. De Gruyter.
Kant, I. (1933). Critique of pure reason, 1781, 1787. Macmillan.
Knauff, M., & Spohn, W. (Eds.). (2021). The handbook of rationality. MIT Press.
Lindström, P. (1969). On extensions of elementary logic. Theoria, 35, 1–11.
Lycan, W. G. (2019). On evidence in philosophy. New York: Oxford University Press.
Macbeth, D. (2010). Diagrammatic reasoning in Euclid’s Elements. In B. Van Kerkhove, J. De Vuyst, & J.-P. Van Bendegem (Eds.), Philosophical perspectives on mathematical practice (pp. 235–267). London: College Publications.
Macbeth, D. (2018). Logical form, mathematical practice, and Frege’s Begriffsschrift. Annals of Pure and Applied Logic, 169, 1419–1436.
Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Kluwer/Plenum.
Magnani, L. (2009). Abductive cognition. The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2011). Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning. Heidelberg: Springer.
Magnani, L. (2011). Understanding Violence: The Intertwining of Morality, Religion and Violence: A Philosophical Stance. Berlin: Springer.
Magnani, L., & Bertolotti, T. (Eds.). (2017). Handbook of Model-Based Science. Springer.
Magnani, L. (2018). The abductive structure of scientific creativity: An essay on the ecology of cognition. Heidelberg: Springer.
Makinson, D. (2005). Bridges from classical to nonmonotonic logic. College Publications.
McCarthy, J. (1980). Circumscription – A form of non-monotonic reasoning. Artificial Intelligence, 13, 27–39.
Mercier, H., & Sperber, D. (2011). Why do humans reason? Arguments for an argumentative theory. Behavioral and Brain Sciences, 34(2), 57–74.
Minnameier, G. (2004). Peirce-suit of truth: Why inference to the best explanation and abduction should not be confused. Erkenntnis, 60, 75–105.
Mole, C. (2008). Attention and consciousness. Consciousness Studies, 15, 86–104.
Mooney, R. J. (2000). Integrating abduction and induction in machine learning. In P. A. Flach & A. C. Kakas (Eds.), Abduction and induction (pp. 181–191). Kluwer.
Moore, M. E. (Ed.). (2010). Philosophy of mathematics: Selected writings of Charles Sanders Peirce. Indiana University Press.
Nersessian, N. (2008). Creating scientific concepts. MIT Press.
Niiniluoto, I. (1993). Peirce’s theory of statistical explanation. In E. C. Moore (Ed.), Charles S. Peirce and the philosophy of science: Papers from the Harvard Sesquicentennial congress (pp. 186–207). University of Alabama Press.
Niiniluoto, I. (2018). Truth-seeking by abduction, volume 400 in the Synthese library. Springer.
Paavola, S. (2005). Peircean abduction: Instinct or inference? Semiotica, 153, 131–154.
Park, W. (2017a). Abduction in context: The conjectural dynamics of scientific reasoning. Springer.
Park, W. (2017b). Magnani’s manipulative abduction. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 197–213). Springer.
Park, W. (2021). On abducing the axioms of mathematics. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 161–175). Springer.
Parsons, T. (1980). Nonexistent objects. Yale University Press.
Peirce, C. S. (1896). Regenerated logic. The Monist, 7, 19–40.
Peirce, C. S. (1931–1958). Collected papers, eight volumes. C. Hartshorne, & P. Weiss (Eds.), volumes 1–6, A. W. Burks, editor, volumes 7–8. Harvard University Press. Cited here as CP + volume number + pages.
Peirce, C. S. (1966). The Charles S. Peirce papers, microfilm edition. Harvard University Photographic Service. References here to the Annotated Catalogue of the Papers of Charles S. Peirce, Amherst: University of Massachusetts Press, 1967, as emended by Richard S. Robin in The Peirce Papers: A Supplementary Catalogue, Transactions of the Charles S. Peirce Society, 7 (1971), 37–57. Cited here as MS + Robin’s catalogue number + pages.
Peirce, C. S. (1976). In C. Eisele (Ed.), The new elements of mathematics, by Charles S. Peirce, four volumes. Mouton. Cited here as NEM + volume + pages.
Peirce, C. S. (1982). In M. H. Fisch et al. (Eds.), Writings of Charles S. Peirce: A chronological edition, four volumes to date. Indiana University Press. Cited here as W + volume + pages.
Peirce, C. S. (1985). In C. Eisele (Ed.), Historical perspective on Peirce’s logic of science, 2 volumes. Mouton. Cited here as HP + page number.
Peirce, C. S. (1992a). In K. L. Ketner (Ed.), Reasoning and the logic of things: The Cambridge conference lectures of 1898. Harvard University Press. Cited here as RLT + pages.
Peirce, C. S. (1992b). In N. Houser & C. Kloesel (Eds.), The essential Peirce: Selected philosophical writings, two volumes. Indiana University Press, 1998. Cited here as EP + volume + pages.
Peirce, C. S. (2010). In M. E. Moore (Ed.), Philosophy of mathematics: Selected writings of Charles Sanders Peirce. Indiana University Press. Cited here as PM + pages.
Peirce, C. S. (2020). In A.-V. Pietarinen (Ed.), Logic of the future: Writings on existential graphs, history and applications (Vol. I). De Gruyter.
Pietarinen, A.-V. (2005). Cultivating the habits of reason: Peirce and the logica utens versus logica docens distinction. History of Philosophy Quarterly, 22, 357–372.
Pietarinen, A.-V. (2006). Signs of logic: Peircean themes on the philosophy of language, games and communication. Springer.
Pietarinen, A.-V. (2019). Logic of the Future. Berlin: De Gruyter.
Priest, G. (1979). The logic of paradox. Journal of Philosophical Logic, 8, 219–241.
Priest, G. (2007). Paraconsistency and dialetheism. In D. Gabbay & J. Woods (Eds.), Handbook of the history of logic (Vol. 7, pp. 179–204). North Holland.
Reiter, R. (1987). Nonmonotonic reasoning. Annual Review of Computer Science, 2, 147–186.
Rescher, N. (1976). Plausible Reasoning: An Introduction to the Theory and Practice of Plausibilistic Inference. Amsterdam: Van Gorcum.
Riemann, B. (1857). Theorie der Abel’schen Functionen. Journal für die reine und angewandte Mathematik, 54, 101–155.
Sainsbury, M. (2005). Reference without referents. Clarendon Press.
Shiffrin, R. M. (1997). Automatism and consciousness. In J. D. Cohen & J. W. Schooler (Eds.), Scientific approaches to consciousness (pp. 49–64). Mahwah, NJ: Erlbaum.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Shook, J. R. (2021). Abduction, the logic of scientific creativity, and scientific realism. In J. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 207–227). Springer Verlag.
Simon, H. (1957). Models of man. Wiley.
Starr, W. B. (2021). Conditionals and counterfactual logic. In M. Knauff & W. Spohn (Eds.), The handbook of rationality (pp. 147–186). MIT Press.
Stein, H. (1988). Logos, logic and logistiké: Some philosophical remarks on the nineteenth-century transformations of mathematics. In W. Aspray & P. Kitcher (Eds.), History and philosophy of modern mathematics, Minnesota studies in the philosophy of science, XI (pp. 238–259). University of Minnesota Press.
Tappenden, J. (2008). Mathematical concepts: Fruitfulness and naturalness. In P. Mancosu (Ed.), The philosophy of mathematics and mathematical practice (pp. 276–301). Oxford University Press.
Thagard, P. (1992). Conceptual revolutions. Princeton University Press.
van Benthem, J. (2011). Logical dynamics of information and interaction. Cambridge University Press.
Vitiello, G. (2001). My Double Unveiled: The dissipative quantum model of the brain. Amsterdam: John Benjamins.
Vitiello, G. (2012). The dissipative brain. In G. Globus, K. H. Pribram, & G. Vitiello (Eds.), Brain and Being: At the Boundary between Science, Philosophy, Language and Arts (pp. 37–338). Amsterdam: John Benjamins.
Walton, K. (1990). Mimesis as make-believe. Harvard University Press; Magnani, 2009. Weyl, H. (1953). Über der Mathematik und mathematischem Physik. Studium Generale, 6, 219– 258. Wiles, A. (1995). Modular elliptical curves and Fermat’s last theorem. Annals of Mathematics, 141, 433–551. Wilson, A. B. (2017). The Peircean solution to non-existence problems: Immediate and dynamical objects. Transactions of the Charles S. Peirce Society, 53, 528–552. Woods, J. (1974). The logic of fiction: A philosophical sounding of deviant logic (1st ed.). Mouton; 2nd edition with a Foreword by Nicholas Griffin, volume 23 of Studies in Logic, College Publications, 2009. Woods, J. (1999). Peirce’s abductive enthusiasms. Protosociology, 13, 117–125. Woods, J. (2001). Aristotle’s earlier logic (1st ed.). Hermes Science. Woods, J. (2003). Paradox and Paraconsistency: Conflict resolution in the abstract sciences. Cambridge University Press. Woods, J. (2005). Epistemic bubbles, in Sergei Artemov. In H. Barringer, A. d’Avila Garcez, L. C. Lamb, & J. Woods (Eds.), We will show them: Essays in honour of Dov Gabbay (Vol. 2, pp. 731–774). College Publications. Woods, J. (2012). Cognitive economics and the logic of abduction. Review of Symbolic Logic, 5, 148–161. Woods, J. (2013). Errors of reasoning: Naturalizing the logic of inference., 2nd edition revised and extended, volume 45 of studies in logic. College Publications. Woods, J. (2014). Aristotle’s earlier logic (Studies in logic series) (Vol. 53, 2nd ed.). College Publications. Woods, J. (2016a). Logic naturalized. In J. Redmond, O. Pombo-Martins, & Á. N. Fernandéz (Eds.), Epistemology, knowledge and the impact of interaction (pp. 403–432). Springer. Woods, J. (2016b). Does changing the subject from A to B really enlarge our understanding of A? Logic Journal of IGPL, 24, 456–480. Woods, J. (2017). Reorienting the logic of abduction. In L. Magnani & T. Bertolotti (Eds.), Springer handbook on model-based science (pp. 138–150). Springer. 
Woods, J. (2018). Truth in fiction: Rethinking its logic, volume 391 in the Synthese library. Springer. Woods, J. (2020a). Abduction and inference to the best explanation. In J. A. Blair (Ed.), Studies in critical thinking (2nd ed., pp. 329–349). Windsor. Woods, J. (2020b). Russell and Aristotle on first principles: A surprising concurrence. In J. A. Blair & C. W. Tindale (Eds.), Rigour and reason: Essays in honour of Hans Vilhelm Hansen (pp. 52–86). Windsor. Woods, J. (2021a). Peirce, Russell and abductive regression. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action: Logical reasoning, scientific inquiry, and social practice (pp. 129–145). Springer Verlag. Woods, J. (2021b). What did Frege take Russell to have proved? Synthese, 198, 3949–3977. published online 22 July 2019. Woods, J. (2021c). The role of the common in cognitive prosperity: Our command of the unspeakable and unwriteable. Logica Universalis, 15, 399–433. Woods, J. (2021d). The role of the common in cognitive prosperity: Our command of the unspeakable and the unwriteable, Logica Universalis, 15, 399–443. Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353. Zimmerman, M. (1989). The nervous system and the context of information theory. In R. F. Schmidt & G. Thews (Eds.), Human Physiology (pp. 166–175). Berlin: Springer-Verlag.
9
The Logical Process and Validity of Abductive Inferences Gerhard Minnameier
Contents
Introduction
The GW Model of Abduction
  Presentation of the GW Model
  Critical Aspects of the GW Model
Abduction in the Context of the Inferential Triad
  Abduction, Deduction, and Induction
  Problems with Abductive Problems
  The Triad of Colligation, Observation, and Judgment
On Coherence and the Reach of Abduction
  The Proper Epistemological Role of Coherence
  Coherence and Unification
  Coherence, Perception, and Intuition
Conclusions
References
Abstract
Abduction is the process of generating hypotheses for explanation-seeking phenomena (and possibly also non-explanatory problems). While this process has often been taken as no more than spontaneous impulse, Peirce held that it was an inference. Today, apart from Peirce’s original statement of the abductive inference, the Gabbay-Woods model (or simply the GW model) is presumably the most prominent account of it. This model is introduced and evaluated. One main problem is seen in the delimitation of the abductive task and the criterion for the validity of abduction. Here, the GW-model and other accounts appear to
G. Minnameier
Faculty of Economics and Business, Goethe University Frankfurt, Frankfurt am Main, Germany
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_3
be vague, and this vagueness might be due to the all too well-known confusion of abduction with inference to the best explanation (IBE). In what follows, abduction is first analyzed in its overall connection with deduction and induction, then in terms of inferential sub-processes (i.e., colligation, observation, and judgment), and finally with respect to the validity criterion for abduction, which is seen in “coherence.” In the section that follows, the concept of coherence is discussed, especially against the backdrop of it being taken as a criterion for “truth.” Here, the processes of generating hypotheses and (dis-)confirming them have to be kept as strictly apart as abduction and IBE. Based on the idea that abduction basically consists in making incoherent pieces of information coherent, the final part discusses how far non-deliberative cognitive functioning can be understood in terms of abduction, particularly in terms of abductive inferences, and what this means for the overall project of pragmatism.

Keywords
Abduction · Inference · Coherence · Unification · Perception · Intuition · Naturalization of logic
Introduction

Why is abduction such a fascinating topic? One reason might be that the very concept appears to be both ambiguous and multifaceted. It is well known that some equate abduction with IBE (i.e., inference to the best explanation), an idea that goes back to Gilbert Harman (1965) but that also corresponds to the early Peirce’s original notion of abduction (CP 2.623, 1878), then labeled “hypothesis.” However, it is commonplace today that abduction in the sense of the mature Peirce concerns the generation of theories, or ideas in general, rather than their confirmation (see, e.g., Hintikka, 1998; Minnameier, 2004, 2017; Paavola, 2006; Tiercelin, 2005; Iranzo, 2007; Niiniluoto, 2018; Pietarinen & Bellucci, 2015; Yu & Zenker, 2018). (Recently, McKaughan (2008), Campos (2011), Mackonis (2013), and Urbański and Klawiter (2018) have argued in favor of a wide notion of IBE, including abduction, although they endorse the distinction between the two modes (as does Lipton, 2004, p. 148). Hence, the differentiation as such seems to be beyond any doubt.) Nonetheless, even in this specific sense of abduction as the inference from surprising facts to explanatory theories, different forms and viewpoints abound (see also Minnameier, 2017, 2019).

Another reason might be that abduction somehow fills a kind of “logical void” by reconstructing as “logical” a specific process of reasoning that has hitherto been conceived as fortuitous and spontaneous, which, in turn, contributes to the concept’s ambiguity. Recall, for instance, that in a famous book with the telling title The Logic of Scientific Discovery, Karl Popper explains that “(t)he initial stage, the act of conceiving or inventing a theory, seems . . . neither to call for logical analysis nor to be susceptible of it” (Popper, 1935/2002, p. 36). And he goes on to say that
“(t)he question how it happens that a new idea occurs to a man . . . may be of great interest to empirical psychology; but it is irrelevant to the logical analysis of scientific knowledge” (ibid.). In contrast, abduction is meant to describe just this initial stage of inquiry, but as an inferential process rather than the “happy guesses” (Hempel, 1966, p. 15) that Popper and many scholars of his time had in mind (for a broader and deeper account of the historical context of abduction, see Niiniluoto, 2018, chapters 1 and 5). However, even Peirce himself claimed that “abduction is, after all, nothing but guessing” (CP 7.219, c. 1901). In contrast to pure guessing, however, Peirce maintained that humans possess a certain “instinct” for coming up with fruitful hypotheses. In particular, he points out that:

. . . it is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses it will light upon the correct hypothesis. ( . . . ) For if there were no tendency of that kind, if when a surprising phenomenon presented itself in our laboratory, we had to make random shots at the determining conditions, trying such hypotheses as that the aspect of the planets had something to do with it, or what the dowager empress had been doing just five hours previously, if such hypotheses had as good a chance of being true as those which seem marked by good sense, then we never could have made any progress in science at all. ( . . . ) We cannot go so far as to say that high human intelligence is more often right than wrong in its guesses; but we can say that . . . it has been, and no doubt will be, not very many times more likely to be wrong than right. (CP 7.220, c. 1901)
This guessing instinct is said to be fallible, yet immensely more accurate than pure chance operations could ever be. Peirce compared it with the adaptive instincts of animals (CP 5.591, 1903) and also held “that we often derive from observation strong intimations of truth, without being able to specify what were the circumstances we had observed which conveyed those intimations” (CP 7.46, c. 1907). (See Paavola (2005) for a comprehensive account of the Peircean guessing instinct.) So, on the one hand, abduction appears as a mental operation driven by intuition and spontaneous inspiration. On the other hand, Peirce stressed “that abduction, although it is very little hampered by logical rules, nevertheless is logical inference, asserting its conclusion only problematically or conjecturally, it is true, but nevertheless having a perfectly definite logical form” (CP 5.188, 1903). The passage just cited is followed by Peirce’s famous statement of the abductive inference:

The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (CP 5.189, 1903)

He goes on to say that “A cannot be abductively inferred . . . until its entire content is already present in the premiss” (ibid.). However, against this account, Kapitan (1992) argued that because the hypothesis was already included in the premises, it could not possibly be inferred. If we took this at face value, we would have to conclude either that abduction is no inference or that the above inference is no abduction.
However, there is a way out of this purported predicament for abduction. As a starting point, the fairly recent GW model of abduction will be considered, which is named after Dov Gabbay and John Woods, who first suggested it in Gabbay and Woods (2005). They reconstructed abduction in terms of a dynamical epistemic logic. This allows us to look at abduction as a process of reasoning in which various steps can be individuated. However, as we shall see, the GW model seems to cover slightly too much and hence to blur somewhat what might be taken as the essence of abduction. Against this backdrop, the structure of abduction will be laid out in terms of the three characteristic sub-processes of which any kind of inference consists. Peirce himself distinguished these as “colligation,” “observation,” and “judgment.” Based on this triad, the described confusion can be dissolved. The solution consists of an analysis of (1) the inferential process of abduction, (2) the validity of this particular kind of inference, and (3) its role at different levels of cognition.
The GW Model of Abduction

Presentation of the GW Model

The GW model of abduction was introduced as an advancement of the so-called AKM model, named after Atocha Aliseda (2006, 2017), Antonis C. Kakas (Flach & Kakas, 2000), Theo A. F. Kuipers (1999), and Lorenzo Magnani (2001, 2009). While the AKM model focuses mainly on the notions of an abductive problem and an abductive solution, the GW model aims at reconstructing the cognitive process of abduction as a whole. Both accounts are based on Peirce’s famous statement of the abductive inference as mentioned above. In the AKM model, “if A were true, C would be a matter of course” is interpreted in the sense that C is deducible from A (see also Niiniluoto, 2007). Woods represents the AKM model as follows (for an alternative formalization, see Aliseda, 2017, p. 225):

1. E
2. K ↛ E
3. H ↛ E

Then an abduction is the derivation of H from three further facts:

4. K(H) is consistent.
5. K(H) is minimal.
6. K(H) → E.

Accordingly,

7. H.

(Woods, 2013, pp. 376–7).
Here, “E” is the explanandum; “K,” the relevant background knowledge; “→,” a consequence relation; “H,” a hypothesis; and “K(H),” the update of K by adding H. Conditions 1 and 2 denote the abductive problem that K does not entail E. Condition 3 is meant to prevent self-explanations and ad hoc hypotheses that are unrelated to the relevant background knowledge. Condition 4 requires K(H) not to contain contradictions. Condition 5 is meant to prevent overly redundant explanations, e.g., in the form of a proliferation of hypotheses based on deductive closure. Condition 6 denotes that the updated knowledge base entails E, and condition 7 contains the respective conclusion. Woods criticized this view for a number of reasons that relate to a set of propositions he draws from Peirce (Woods, 2013, p. 375). His critique is summarized as follows (ibid., p. 377):

The standard schema imposes no requirement that E be surprising, or that successful abduction be non-probative or evidentially inert, or that the sentence K(H) → E be in subjunctive mood, or that the conclusion of an abduction be simply that H can plausibly be conjectured, or that the schema’s conclusional operator mark the inference as defeasible. Neither is there any reflection on what it is for something to be a matter of course.
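The AKM conditions described above can be illustrated with a small propositional sketch. This is only a toy reading, under several assumptions that are not in the source: the consequence relation is taken to be classical entailment checked by truth table, formulas are encoded as Python functions over truth assignments, the helper names (`entails`, `consistent`, `akm_solution`) and the rain example are hypothetical, and the minimality condition 5 is left out because its meaning is exactly what is in dispute.

```python
from itertools import product


def models(formulas, atoms):
    """Yield every truth assignment over `atoms` that satisfies all formulas."""
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(f(world) for f in formulas):
            yield world


def entails(premises, conclusion, atoms):
    """Classical entailment: no model of the premises falsifies the conclusion."""
    return all(conclusion(w) for w in models(premises, atoms))


def consistent(formulas, atoms):
    """A set of formulas is consistent if it has at least one model."""
    return any(True for _ in models(formulas, atoms))


def akm_solution(K, H, E, atoms):
    """Check AKM conditions 2, 3, 4 and 6; condition 5 (minimality) is omitted."""
    return (
        not entails(K, E, atoms)          # 2. K does not entail E
        and not entails([H], E, atoms)    # 3. H alone does not entail E
        and consistent(K + [H], atoms)    # 4. K(H) is consistent
        and entails(K + [H], E, atoms)    # 6. K(H) entails E
    )


# Hypothetical example data (assumption, for illustration only).
atoms = ["rain", "wet"]
K = [lambda w: (not w["rain"]) or w["wet"]]   # background: rain -> wet
E = lambda w: w["wet"]                        # explanandum: the lawn is wet
H_rain = lambda w: w["rain"]                  # candidate hypothesis
H_wet = lambda w: w["wet"]                    # self-explanation

print(akm_solution(K, H_rain, E, atoms))  # True
print(akm_solution(K, H_wet, E, atoms))   # False: blocked by condition 3
```

Note that nothing in this check requires E to be surprising or marks the conclusion as conjectural, which is the point of the critique quoted above.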
This critique might not be entirely justified (see, e.g., Magnani, 2009, pp. 70–71). However, it is obvious that both the abductive problem, i.e., the aspect of and the reason for surprise, and what is required from an abductive solution still need some further clarification. As for the abductive solution, the minimality condition seems to be particularly problematic. Gabbay and Woods ask, for instance: “Does the condition require that H be the least modification of K that delivers the intended goods? Or, does it require that H modify the least class of K that delivers the goods? Or does it mean both?” (2005, p. 55). Furthermore, Aliseda held that the minimality condition “aims at capturing either a criterion of best explanation by which minimal may be interpreted as selecting the weakest explanation (e.g., not equal to Θ → ϕ) or a preferred explanation (which requires a predefined preference ordering)” (2017, p. 225). (Here, Θ represents the background knowledge K and ϕ the explanandum E.) Whatever the criterion for selection might be, it could lead to a confusion of abduction with IBE, as has been argued elsewhere (Minnameier, 2016, 2017, 2019). The GW model, however, has been introduced to compensate for these deficiencies. Here it is in the most recent version (Woods, 2017, p. 139, with a slight correction). (The brackets in lines 11 and 12 read “sub-conclusion, 1–7” and “conclusion, 1–8,” respectively. This seems to have been taken over from an earlier version in Woods (2013, p. 379). However, in the present schema, there is one more line, which is why the sub-conclusion should relate to lines 1–8 and the conclusion to lines 1–9. For an earlier version of the GW model, see Gabbay and Woods (2005, p. 47).)

1. T ! E [The ! operator sets T as an epistemic target with respect to some state of affairs E]
2. −R(K, T) [fact]
3. Subduance is not presently an option [fact]
4. Surrender is not presently an option [fact]
5. H ∉ K [fact]
6. H ∉ K∗ [fact]
7. −R(H, T) [fact]
8. −R(K(H), T) [fact]
9. H ⇝ R(K(H), T) [fact]
10. H meets further conditions S1, . . ., Sn [fact]
11. Therefore, C(H) [sub-conclusion, 1–8]
12. Therefore, Hc [conclusion, 1–9]
The epistemic target “T” implies an element of surprise, and “!” denotes the agent’s willingness to attain it. Condition 2 states that the attainment relation (that K hits the target T) is currently not established. “Subduance” and “surrender” would constitute easy ways out and are therefore excluded (“Subduance” means to evade the problem by asking someone who would know or search the Internet rather than racking one’s brain). According to conditions 5 and 6, the wanted hypothesis is neither in the agent’s knowledge set (which is trivial) nor in the immediate successor knowledge set K∗ , i.e., obvious and easily accessible information that might do the job. Since H is neither part of K nor of K∗ , the wanted attainment relation R does not hold either for H or the updated knowledge base K(H) (lines 7 and 8). Nonetheless, H provides us with a tentative solution, which is denoted by the subjunctive conditional relation in line 9. This captures the idea that “if A were true, C would be a matter of course.” Based on the ignorance documented in lines 1–8, H is conjectured (line 11), and finally activated (line 12), while line 10 points to conditions like consistency and minimality as discussed above in the context of the AKM-model. Woods admitted that “(s)pecifying the Si is perhaps the hardest open problem for abductive logic” (2013, p. 379). The final parts of the schema are perhaps difficult to grasp. First, line 9 introduces the subjunctive relation between H and R(K(H), T). Second, H has to undergo some evaluative process at the end of which it is activated, which obviously means that H is used in some sense or for some further purpose. Third, however, despite this evaluation, abduction as a whole is ignorance-preserving (as required not only by Woods but also by Peirce). Woods makes this very clear (2013, p. 
380):

When I say that an abduction involves the activation of a hypothesis in a state of ignorance, it is not at all necessary or frequent, that the abducer be wholly in the dark, that his ignorance be total. It need not be the case, and typically isn’t, that the abducer’s choice of a hypothesis is a blind guess, or that nothing positive can be said of it beyond the role it plays in the subjunctive attainment of the abducer’s original target (although sometimes this is precisely so). Abduction isn’t mysticism. In particular, it is not foreclosed that there might be evidence that lends a hypothesis a positive degree of likelihood. But when the evidence is insufficient for activation, sometimes explanatory force is the requisite “top-up”. Abduction is often a deal-closer (albeit provisionally) for what induction cannot bring off on its own.
On this account, we can summarize that abduction is a conjecture quite different from a “blind guess,” so that a result is produced in a goal-directed process of reasoning and evaluated, to some extent, within that very process rather than only later in the course of deduction and induction.
Critical Aspects of the GW Model

Although the GW model clearly improves on the AKM model with respect to the criticisms leveled at the latter, a few critical aspects remain. Specifically, three issues deserve to be discussed. First, the notion of the “target” for abduction points to a specific abductive problem, which, however, seems to be in need of further clarification, because deduction and induction also each have a certain purpose and thus their own epistemic target. Second, both the AKM and the GW model imply that not just any hypothesis capable of explaining the facts yields a valid abduction but that certain further conditions would have to be met to sanction the selection of a given hypothesis. This carries the risk of confounding abduction with IBE. Third, since the GW model claims to be an important step toward the naturalization of logic (Woods, 2013, 2016, 2017), it should give us a clear picture of the cognitive process of abduction in terms of the cognitive states the abducer passes through and how these are linked with each other. The schema includes important insights in this respect but focuses more on the judgmental than on the processual aspect of abduction.

As for the target, it is to Gabbay and Woods’s credit that they have highlighted not only that there is more to an abductive problem than that some E cannot be derived from K but also that there are more than just explanatory problems (2005; also Magnani, 2009). In particular, non-explanatory abduction includes strategic and normative reasoning (Minnameier, 2017). So, to state that valid abductions have to deliver explanations for phenomena that cannot as yet be explained is only part of the story. This raises the question of what the common core element of the epistemic targets of abduction is, as opposed to those of deduction and induction. A sufficient answer to this question would also allow us to specify what makes abductions valid or invalid.
This takes us straight to the second question of how far abduction goes and, in particular, whether the selection of a certain hypothesis belongs to it or not. At this point, two notions of “selection” have to be clearly distinguished in the context of abduction. One relates to Magnani’s opposition of creative abduction and selective abduction (Magnani, 2009; see also Schurz, 2008, 2017). The former applies to novel abductions and the latter to abduction in the context of knowledge application. Ordinary medical diagnoses are a case in point. When patients report their symptoms to practitioners, the latter make abductive inferences from those symptoms to possible diseases the respective patients might have. So, they generate possible explanations (or, broadly speaking, theories) from their background knowledge and select (some of) them for further scrutiny. Note that there are two quite different aspects of selection in Magnani’s approach: the selection of elements Hi for which H ⇝ R(K(H), T) holds and the selection of a subset of one or more hypotheses from the ones considered in the first place. The first one is unproblematic and relates to lines 9 and 11 of the GW schema, i.e., the statement that H ⇝ R(K(H), T) and the respective conjecture C(H). The second one refers to line 10, in particular, i.e., the further requirements a hypothesis should fulfill, and the selection based on these might go beyond abduction and
confound it with IBE. An open question is also what the final activation Hc ought to yield above and beyond the abductive (sub-)conclusion C(H) and, again, whether this transcends abduction as such. Gabbay and Woods stated that “C(H) is read ‘It is justified (or reasonable) to conjecture that H’. Hc denotes the discharge of H. H is discharged when it is forwarded assertively and labelled in ways that reflect its conjectural origins. (Here the label is ‘c’ in superscript position)” (2005, p. 47). If we take “activation” in line 12 as the agent’s decision to act upon H, however tentatively and cautiously, it would have to be interpreted in the sense of what Peirce called “abductory induction.” This is an inductive inference that sets off some action or choice in cases in which the agent is not really sure but has to make a choice under uncertainty. Peirce gives us the famous example of the (purported) Catholic priest:

“(S)uppose we wish to test the hypothesis that a man is a Catholic priest, that is, has all the characters that are common to Catholic priests and peculiar to them. ( . . . ) I might say to myself, let me think of some other character that belongs to Catholic priests, beside those that I have remarked in this man, a character which I can ascertain whether he possesses or not. All Catholic priests are more or less familiar with Latin pronounced in the Italian manner. If, then, this man is a Catholic priest, and I make some remark in Latin which a person not accustomed to the Italian pronunciation would not at once understand, and I pronounce it in that way, then if that man is a Catholic priest he will be so surprised that he cannot but betray his understanding of it. I make such a remark; and I notice that he does understand it. But how much weight am I to attach to that test? After all, it does not touch an essential characteristic of a priest or even of a Catholic.
It must be acknowledged that it is but a weak confirmation, and all the more so, because it is quite uncertain how much weight should be attached to it. Nevertheless, it does and ought to incline me to believe that the man is a Catholic priest. It is an induction, because it is a test of the hypothesis by means of a prediction, which has been verified. But it is only an abductory induction, because it was a sampling of the characters of priests to see what proportion of them this man possessed, when characters cannot be counted, nor even weighed, except by guesswork. It also partakes of the nature of abduction in involving an original suggestion; while typical induction has no originality in it, but only tests a suggestion already made.” (CP 6.526 [c. 1901]).
Similarly, Woods argued that “when the evidence is insufficient for activation, sometimes explanatory force is the requisite ‘top-up’. Abduction is often a deal-closer (albeit provisionally) for what induction cannot bring off on its own” (2013, p. 380; see also 2012, p. 154). This would also explain why in line 10 the GW model requires H to satisfy certain further conditions. However, Woods was imprecise and, as it seems, also unclear about how to determine these conditions (and what would finally justify full-blown activation), when he admitted that “(o)f course, the devil is in the details. Specifying the Si is perhaps the hardest open problem for abductive logic. ( . . . ) The Si are conditions on hypothesis selection. I have no very clear idea about how this is done, and I cannot but think that my ignorance is widely shared” (2017, pp. 139–40; see also 2013, p. 379, n. 349). He discussed the aforementioned consistency and minimality constraints of the AKM model in this respect and suggested that they are perhaps too strong and that minimality is vague in itself.
Another way to interpret the activation Hc of a hypothesis is that it is ready for use in subsequent inferences, in particular as the major premise for a subsequent deduction. However, if this were true, we would be talking not about the final step of abduction but about the initial step of deduction. Hence, in whatever way we wish to understand the “activation” of a hypothesis, it would belong either to deduction or to induction, not to abduction. This puts abduction in the context of the overall connection of abduction, deduction, and induction. But before we look at this, let us briefly consider the third critique, namely that the GW model does not deliver a complete account of the process of abduction. It states the problem (lines 1 to 8) and then presents the solution. This is all right if one only aims at determining whether an abduction is valid. However, in terms of the process, it only marks the beginning and the end, while the typically much longer process of thinking about the abductive problem and searching for a solution, possibly in vain, is cut off.
Abduction in the Context of the Inferential Triad

Abduction, Deduction, and Induction

Based on the above critique, we should take very seriously Peirce’s claim that there are three kinds of inference that are entirely distinct from each other. In the following passage, he made this claim while at the same time pointing out that abduction and induction can easily be confused.

Nothing has so much contributed to present chaotic or erroneous ideas of the logic of science as failure to distinguish the essentially different characters of different elements of scientific reasoning; and one of the worst of these confusions, as well as one of the commonest, consists in regarding abduction and induction taken together (often mixed also with deduction) as a simple argument. Abduction and induction have, to be sure, this common feature, that both lead to the acceptance of a hypothesis because observed facts are such as would necessarily or probably result as consequences of that hypothesis. But for all that, they are the opposite poles of reason . . . (CP 8.218 [c. 1901])
When Peirce stated that abduction and induction both lead, in the positive case, to the acceptance of hypotheses but are nonetheless opposite poles, he meant that via abduction a hypothesis is accepted for probation, while via induction it is accepted as true (in the empirical and the pragmatist sense of truth). In other words, abduction introduces the hypothesis, and induction eventually confirms it. However, abduction and induction are not just poles between which reason oscillates; rather, abduction, deduction, and induction form an overall coherent reasoning process. Furthermore, this triadic relationship is recursive, because induction leads to a projection of the theory’s content onto all its cases, not only past and present but also future or as yet undiscovered ones (Fig. 1; see also Minnameier, 2004, 2010, 2017). Accordingly, this projection feeds back into abduction in the sense of selective abduction (see above) and into the whole cycle, so that each and every application of an accepted theory implicitly probes it again and may lead to further confirmation or possibly also to the theory’s refutation. This is the very essence of pragmatism. It not only delivers a coherent account of the connection of the three inferences in terms of a meta-abduction (Pietarinen & Bellucci, 2015) but also matches Peirce’s own well-known description:

“Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value, and deduction merely evolves the necessary consequences of a pure hypothesis. Deduction proves that something must be; Induction shows that something actually is operative; Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction, and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about.” (CP 5.171 [1903])

Fig. 1 The dynamical interaction of abduction, deduction, and induction (diagram labels: Surprising Facts, Theory, Necessary Consequences, Induction)
Based on this recursive cycle, we can also say more about the abductive problem, because one may wonder where it comes from. Typically it is just stated, as in “The surprising fact C is observed . . . ”. However, we can reconstruct it as the negative result of an induction, in which an established belief gets shaken or even downright refuted, leading to a projection of the negation of a previously confirmed hypothesis based on novel evidence. Hence, an abductive problem first has to be induced in this way before it can be stated as the premise of an abduction; this premise then comprises the “surprising facts” that call for an explanation. Abduction then describes the process of developing potential explanations, which can subsequently be examined by deduction (derivation of testable necessary consequences) and evaluated by induction (based on actual empirical testing).
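The division of labour just described can be illustrated with a minimal Python sketch. It is only a toy model under strong assumptions: hypotheses are reduced to finite sets of predicted observations, accounting for a fact is reduced to predicting it, and all names (`Hypothesis`, `abduce`, `deduce`, `induce`, the lawn scenario and its observations) are hypothetical choices made for illustration, not part of Peirce’s or this chapter’s own apparatus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hypothesis:
    name: str
    # What would be "a matter of course" if the hypothesis were true.
    predictions: Dict[str, bool]


def abduce(surprising_fact: str, candidates: List[Hypothesis]) -> List[Hypothesis]:
    """Abduction: admit, for probation only, candidates that would account for the fact."""
    return [h for h in candidates if h.predictions.get(surprising_fact) is True]


def deduce(hypothesis: Hypothesis, surprising_fact: str) -> Dict[str, bool]:
    """Deduction: derive further testable consequences beyond the fact to be explained."""
    return {k: v for k, v in hypothesis.predictions.items() if k != surprising_fact}


def induce(consequences: Dict[str, bool], evidence: Dict[str, bool]) -> Optional[bool]:
    """Induction: compare the derived consequences with the available evidence.

    Returns True (confirmed so far), False (refuted), or None (nothing testable yet).
    """
    tested = [k for k in consequences if k in evidence]
    if not tested:
        return None
    return all(consequences[k] == evidence[k] for k in tested)


# Hypothetical scenario: it rains, yet the lawn is dry.
fact = "lawn_dry_despite_rain"
candidates = [
    Hypothesis("tarpaulin", {fact: True, "tarpaulin_visible": True}),
    Hypothesis("tree_canopy", {fact: True, "ground_under_trees_dry": True}),
    Hypothesis("broken_sprinkler", {"sprinkler_silent": True}),  # does not address the fact
]
evidence = {"tarpaulin_visible": False, "ground_under_trees_dry": True}

verdicts = {
    h.name: induce(deduce(h, fact), evidence)
    for h in abduce(fact, candidates)
}
print(verdicts)
```

In this sketch a `False` verdict plays the role of a negative induction: the refuted expectation could itself become the surprising fact that opens a new round of the cycle.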
Problems with Abductive Problems
At this point, let us consider Aliseda’s notion of an “abductive problem.” On her account, abductive problems can take two forms, given a background theory Θ and a formula ϕ (2017, p. 225; see also Nepomuceno-Fernández et al., 2017):
1. Θ ⊭ ϕ and Θ ⊭ ¬ϕ (novel abductive problem).
2. Θ ⊭ ϕ and Θ ⊨ ¬ϕ (anomalous abductive problem).
In (1) the pair (Θ, ϕ) constitutes a novel abductive problem, because the background theory Θ entails neither ϕ nor ¬ϕ, while in (2) it constitutes an anomalous abductive problem, because Θ entails ¬ϕ. However, as will be shown below, her examples for novel abductive problems reveal that they do not constitute an abductive problem at all, while it seems that the “anomalous” version describes the “regular” case of abductive problems. Aliseda discusses the following issue as an example of novel abduction (2006, pp. 29 and 48):
All you know is that the lawn gets wet either when it rains, or when the sprinklers are on. You wake up in the morning and notice that the lawn is wet. Therefore you hypothesize that it rained during the night, or that the sprinklers have been on. (p. 29)
Plainly speaking, this does not seem to be an abductive problem at all, but rather an inductive one. To be more precise, it could be called an inverse inductive problem (see Minnameier, 2017, on inverse inferences; for a critical remark, especially on the suggestion to understand Peirce’s “theorematic deduction” as an inverse deduction, see Niiniluoto, 2018, p. 27), because we are actually confronted with two competing hypotheses (for the surprising fact that the lawn is wet, possibly in a dry region) and have to find evidence that confirms one and disconfirms the other (or both). What is required, therefore, is the performance of a decisive experiment, an experimentum crucis (in the simplest case perhaps by asking the gardener), to find out the truth.
We might ask whether there could be any such thing as a novel abductive problem in the sense Aliseda described. Consider the following example: You do not know whether we are going to have a particularly hot summer (ϕ) or a particularly cold one (¬ϕ). So, whichever happens, apart from, say, a normal summer, would be something new and perhaps even surprising (as we would not predict either ϕ or ¬ϕ). However, in this case both would be surprising only in that they deviate from what we deem likely and therefore expect; neither would be surprising in an epistemic sense. For instance, if Jones wins a huge amount of money in a lottery, this was not likely a priori, but it is not surprising that he won inasmuch as someone had to be the lucky one. As Olsson has pointed out, “(w)hat is needed in addition to low likelihood for an event to be surprising is the presence of an alternative hypothesis that would make the event likely” (2005, p. 190).
Take Aliseda’s example for anomalous abductive problems (2006, pp. 29 and 48):
You know that rain causes the lawn to get wet, and that it is indeed raining. However, you observe that the lawn is not wet. How could you explain this anomaly? (p. 28)
Here the hypothesis, based on the knowledge that it is actually raining, is that the lawn would normally have to be wet. However, it just is not. This is an anomaly, indeed, and it is also surprising. Therefore, Olsson seems to be right when he claimed that “all surprising events are arguably anomalous” (2005, p. 191). Hence,
truly abductive problems would be equivalent to Aliseda’s class of anomalous abductive problems.
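Aliseda’s two-way classification can be made concrete with a small propositional sketch. The following Python code is an illustration only: it assumes a finite set of atoms and checks entailment by brute-force truth tables, and the encoding of the lawn examples as material implications (`rain → wet`, `sprinklers → wet`) as well as the names `entails` and `classify` are assumptions introduced here, not taken from Aliseda’s own formal treatment.

```python
from itertools import product
from typing import Callable, Dict, List

Valuation = Dict[str, bool]
Formula = Callable[[Valuation], bool]


def entails(theory: List[Formula], phi: Formula, atoms: List[str]) -> bool:
    """True if every valuation satisfying the whole theory also satisfies phi."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in theory) and not phi(v):
            return False
    return True


def classify(theory: List[Formula], phi: Formula, atoms: List[str]) -> str:
    """Classify the pair (theory, phi) according to the two entailment patterns."""
    if entails(theory, phi, atoms):
        return "no abductive problem"  # phi already follows from the theory
    if entails(theory, lambda v: not phi(v), atoms):
        return "anomalous"             # the theory entails the negation of phi
    return "novel"                     # the theory entails neither phi nor its negation


atoms = ["rain", "sprinklers", "wet"]

# Assumed encoding of "the lawn gets wet when it rains or when the sprinklers are on".
novel_theory = [
    lambda v: (not v["rain"]) or v["wet"],
    lambda v: (not v["sprinklers"]) or v["wet"],
]
# Assumed encoding of "rain causes the lawn to get wet, and it is raining".
anomalous_theory = [
    lambda v: (not v["rain"]) or v["wet"],
    lambda v: v["rain"],
]

novel_case = classify(novel_theory, lambda v: v["wet"], atoms)
anomalous_case = classify(anomalous_theory, lambda v: not v["wet"], atoms)
print(novel_case, anomalous_case)
```

The sketch captures only the entailment relations. Whether a case is surprising in the epistemic sense discussed above is a further question that such a classification does not settle.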
The Triad of Colligation, Observation, and Judgment
As already mentioned in the Introduction section, Peirce reminds us that despite its weak conclusion, which only yields possible solutions to an abductive problem, abduction is nonetheless a “logical inference . . . having a perfectly definite logical form” (CP 5.188, 1903). The question still is how to determine this logical form. As to the process of abductive reasoning, which shall be analyzed in the present section, a starting point consists in the fact that any inference must have a definite beginning and a definite end (unless no conclusion is reached). It begins with a question that calls for an answer in the form of the respective conclusion. According to Peirce, this process can – and supposedly has to – be subdivided into three distinct steps, namely, “colligation,” “observation,” and “judgment” (see also Minnameier, 2010, for more detail):
The first step of inference usually consists in bringing together certain propositions which we believe to be true, but which, supposing the inference to be a new one, we have hitherto not considered together, or not as united in the same way. This step is called colligation. (CP 2.442, c. 1893)
The next step of inference to be considered consists in the contemplation of that complex icon . . . so as to produce a new icon. ( . . . ) It thus appears that all knowledge comes to us by observation. (CP 2.443-444, c. 1893)
A few mental experiments – or even a single one . . . – satisfy the mind that the one icon would at all times involve the other, that is, suggest it in a special way . . . Hence the mind is not only led from believing the premiss to judge the conclusion true, but it further attaches to this judgment another – that every proposition like the premiss, that is having an icon like it, would involve, and compel acceptance of, a proposition related to it as the conclusion then drawn is related to that premiss. [This is the third step of inference.] (CP 2.444, c. 1893)
Peirce concluded that the “three steps of inference are, then, colligation, observation, and the judgment that what we observe in the colligated data follows a rule” (CP 2.444). Accordingly, any inference starts from the colligation of certain premises. These premises are observed and perhaps manipulated so as to produce an answer to the question inherent in the colligated premise and thus hit the target. However, every such result must, as such, spring to mind spontaneously in the process of observation, so that it cannot be taken at face value but has to be followed by a judgment in order to exclude spurious thoughts and make the result the conclusion of an inference rather than of mere association. As to the question of a target, we now see that any inference must have one, for without a target it would be pointless to engage in inferring at all. The target of abduction is to remove surprise and produce coherence (this will be explained in more detail further below). That of deduction is to draw necessary consequences by revealing implications of the premises. The target of induction is what Woods calls “deal-closing” in that it aims at some kind of determination,
whether in the sense of establishing knowledge (or forming a belief), or in the sense of determining a certain course of action (however uncertain the agent and however pressing the circumstances may be).
In the context of the GW schema of abduction, the colligated premise consists of the surprising fact and the background knowledge that makes the fact surprising (i.e., that leads to a contradiction or incoherence within our overall web of beliefs). This is expressed in lines 1 and 2 of the model. The rejection of subduance and surrender (lines 3 and 4) might be read as setting off observation to develop a hypothesis of one’s own. What actually happens in the course of observation is not part of the GW model, but the result (H) is included in the lines that follow. Lines 5 to 8 merely establish that H is ignorance-preserving, i.e., it is not knowledge and hence not part of either K or K∗, whereas line 9 contains the crucial insight that H might explain the surprising fact, hence hitting the target. In line 11, then, the final conclusion is drawn. As for the “further conditions” mentioned in line 10, it should be noted that Peirce also mentions “certain conditions” in (CP 5.189, 1903). So let us look at this passage:
Long before I first classed abduction as an inference it was recognized by logicians that the operation of adopting an explanatory hypothesis – which is just what abduction is – was subject to certain conditions. Namely, the hypothesis cannot be admitted, even as a hypothesis, unless it be supposed that it would account for the facts or some of them. The form of inference, therefore, is this:
The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.
If the analysis presented above is right, however, these conditions are not entirely the same as the ones alluded to in the GW model. As explained above, the consistency requirement certainly has to be met by any valid abduction. Apart from this, however, the only and indeed the main criterion for the validity of abduction is coherence. To account for the facts means to make them cohere together with and in the light of the suggested hypothesis. Moreover, since coherence requires consistency, coherence can be taken as the criterion for the validity of abduction. A crucial point with respect to the GW model and the process of abduction is how precisely we have to read the subjunctive conditional H ⇝ R(K(H), T) in line 9. Woods qualified it as a “fact,” which could mean that it ought to be read in the sense of “if A were true, C would be a matter of course.” This would directly sanction the conjecture that “there is reason to suspect that A is true” and support the critique of the GW model. However, if H ⇝ R(K(H), T) were to be read in the sense of “H might do the job,” it would merely express the spontaneous idea that is the outcome of observation. In this case, line 9 would mark the end of observation, and lines 10 to 12 would denote the abductive judgment, where line 10 would state that the coherence requirement is met, line 11 would correspond to the conjecture that “if A were true, C would be a matter of course,” and line 12 to the final conclusion that “there is reason to suspect that A is true.”
(Of course, this is merely an interpretation of the GW model and one that also modifies the author’s earlier reading of it; see Minnameier, 2016, 2019.) Taken together, the results of this analysis can be summarized in the following propositions:
– Proposition 1: Peirce’s famous statement of the abductive inference describes the final judgment, not the overall process, of abduction. It expresses the acceptance of the result of the observational step and establishes that the hypothesis, in fact, removes the initial surprise.
– Proposition 2: The conditions mentioned by Peirce are to be read as consistency and coherence. Since the latter includes the former, coherence is the central condition on which the validity of abduction depends.
– Proposition 3: Lines 1 and 2 of the GW model describe the colligated premise (also referring back to a preceding negative induction by which the incoherence of the observed facts with the epistemic subject’s background knowledge is established, which makes the observed facts surprising in the first place).
– Proposition 4: Line 9 captures the result of observation, i.e., the idea that “comes to us like a flash” (CP 5.181, 1903) upon observing the colligated premise.
– Proposition 5: Lines 10 to 12 capture the final judgment based on the colligated premise and establish that the coherence criterion is met (line 10), which sanctions the conjecture that “if A were true, C would be a matter of course” in line 11 and the conclusion that “there is reason to suspect that A is true” in line 12.
On Coherence and the Reach of Abduction
The Proper Epistemological Role of Coherence
In the preceding section, coherence was suggested as the criterion for the validity of abduction. This may appear as a matter of course, on the one hand, yet as a matter of surprise, on the other hand. Of course, explanations are meant to offer a coherent account of the explananda, and this may make the statement almost self-evident. However, “coherence” has so far been mainly assigned the role of a “truth-maker” in philosophy of science or a criterion for “knowledge” in epistemology. In other words and in Peircean terms, it has been put in the context of induction rather than that of abduction. Therefore, the very meaning of “coherence” should be discussed in more detail. Moreover, if coherence is understood in the sense of unification, as will be expounded further below, it not only allows us to (re)construct higher-order forms of knowledge but equally lower-order forms in terms of perception, tacit knowing, and intuition.
There are different strands of coherence theories of truth, in particular an older idealistic account, which is based on the view that beliefs themselves are the foundations of knowledge rather than some reality that is something other than a belief, and a more modern epistemological account that is based on skepticism about correspondence theories of truth (Walker, 2018; Young, 2018). Since the former is
neither very prominent today nor relevant in the present context, we shall only be concerned with the latter. The basic argument in favor of coherentism consists in the epistemological insight that we cannot take a position outside our beliefs and compare propositions to objective facts, because everything we take as a fact is also a belief. The coherence theory of truth, which rests on a coherence theory of epistemic justification, however, has been criticized on the grounds that (1) a proposition might still correspond to reality, even if it cannot be known that it does, and (2) coherence might be no more and no less than an indication of correspondence. The first objection can be rejected quite easily, because referring to a proposition being true without anybody knowing that it is true employs the notion of absolute truth, which is unacceptable insofar as it is unattainable. Hence, the argument is vain. However, the second point cuts deeper. Thagard, for instance, argued that “if there is a world independent of representations of it, as historical evidence suggests, then the aim of representation should be to describe the world, not just to relate to other representations. My argument does not refute the coherence theory, but shows that it implausibly gives minds too large a place in constituting truth” (2007, pp. 29–30). Of course, it could be objected that the historical evidence to which Thagard referred consists of beliefs, too. And this can hardly be denied. Nonetheless, the content of those beliefs refers to something that is not a belief. Accordingly, the main critique of coherence theories in general has been that they lack grounding. Concepts have a meaning (intension) and a reference (extension), but if the reference is but another belief, coherent webs of belief might be something like castles in the air.
This has led Haack (1993) to call for “foundherentism” as a combination of foundationalism and coherentism that combines the ideas of correspondence and coherence. The question is just how to combine them. While this is not the place to discuss this issue broadly, it can rightfully be claimed that Peircean pragmatism delivers just that. As we have seen, abduction is the process that basically produces coherence by eliminating elements of surprise, which, as such, introduce incoherence. While abduction makes previously incoherent beliefs coherent, it is – to quote Woods once again – ignorance-preserving. Abduction is not meant to yield truth, and therefore, it is not an inference that can contribute anything to a “coherence theory of truth” as such. However, induction is the process that according to Peirce delivers truth, and it does so by testing hypotheses empirically. Of course, truth in this sense cannot be established once and for all, but because hypotheses are always tested on the basis of experience gained in their light, such tests establish and secure correspondence. This understanding sheds new light on the following passage, which was already cited above: “Abduction and induction have . . . this common feature, that both lead to the acceptance of a hypothesis because observed facts are such as would necessarily or probably result as consequences of that hypothesis. But for all that, they are the opposite poles of reason . . . ” (CP 8.218 [c. 1901]). If we accept that coherence is no criterion for truth in the sense of confirmation, i.e., if we do not take coherence in the sense of a “coherence theory of truth,” we get a clearer picture of what it achieves and where it belongs. It is the criterion for valid abduction, while
induction delivers the grounding. In this very sense, they are truly the two opposite poles of reason.
Coherence and Unification
The first accounts of coherence focused on logical relations between the propositions in a system of beliefs, i.e., logical relations other than mere consistency. For instance, while Ewing (1934) required that every element be entailed by all the other elements of the set taken together, C. I. Lewis (1946) required only that the antecedent probability of each single proposition be increased on the assumption of the other propositions (see Olsson, 2005, pp. 13–14). However, even BonJour in his seminal work (BonJour, 1985, pp. 97–99) did not get beyond describing relations between individual beliefs in a system and toward an answer to the question of what actually makes beliefs coherent that have previously been incoherent. The notion of coherence that fits with abduction, however, is what Thagard (2007) called “explanatory coherence” and what Schurz (1999) discussed under the appropriate label of “unification”. Schurz pointed out that neither the nomic expectability approach (as incorporated in the DN model of explanation) nor the causality approach, which aims at a list of all causes or causally relevant factors for the explanation-seeking fact, yields satisfactory explanations. As an illustration, he provided the following instructive example:
For instance, “Peter is flying past the window in the third floor, because one second ago he was flying past the window in the fifth floor” is not only a predictively but even a causally adequate argument. Still it is not adequate as an explanation, because the cause is here just as much in need of explanation as the effect. (Schurz, 1999, p. 97)
To be sure, the events in the example are clearly coherent in the sense of the earlier accounts of coherence mentioned above. However, they fail to provide a satisfying explanation because they give no answer as to why Peter is flying past the window anyway. Explanation, on Schurz’s account, means to give reasons why the explanation-seeking events happen rather than predicting them or mentioning causal factors that are themselves in need of explanation. Unification is what Schurz suggests as the main criterion in this respect, and it means that the facts that obviously fail to cohere with the reasoner’s background knowledge are made coherent in the sense of eliminating or at least reducing surprise. (It should be noted, however, that Schurz (2008) reserves unification for his analysis of higher-order forms of abduction.) Abduction always unifies in this sense because it merges incoherent premises into a literally unitary account. However, it does so by integrating the premises at a higher level of abstraction. And this unifying aspect of “explanatory coherence” goes beyond the early notions of coherence. It is crucial that this notion of coherence is divorced from coherence theories of truth or of epistemic justification, which concern the inductive rather than the abductive context (see also Niiniluoto, 2018, pp. 12–15, with respect to the differentiation of criteria for abduction and IBE). (To be sure, Niiniluoto (ibid., p. 17) here refers to a broader notion of IBE which would not be restricted to induction. After all, since explanations can also be abductively better
or worse, there could be something like an abductively best explanation, too, which would, accordingly, be the most coherent one in the sense described in this section). Thagard seemed to hit the point when he expounded that coherent explanations not only yield more breadth by explaining more phenomena than previous explanations but also more depth by describing and integrating layers of mechanisms. Here Thagard argued that “the world is in fact organized in terms of parts, from organisms down to subatomic particles, and layers of mechanisms, from viral infection down to chemical bonding” (2007, p. 43; see also Mackonis, 2013, p. 985). However, scientific theories not only have to integrate layers of mechanisms that can be observed in the real world but also (layers of) theories in the world of science. As Levi has pointed out, theories typically not only have to explain the particular phenomena that give rise to them, but they also have to explain and integrate their predecessors (1991, p. 154; see also Olsson, 2005, p. 191). Hence, coherence in the sense of the criterion suggested for the validity of abduction is equivalent to unification and reconstructs the facts to be explained at a higher level of abstraction at which those facts are made coherent with the background knowledge with which they failed to cohere before. For instance, a “theory” that Sherlock Holmes (or any other detective) develops integrates singular events but also reconstructs them at a level that includes causal and conditional relations not present in the simple presentation of the events so reconstructed. What counts as a solution for fictional and real-life detectives is at the same time an explanatory problem for psychologists who try to find out what drives criminals or produces certain kinds of social conflicts that may lead to crimes. This does not need to be discussed any further here. For more details on abduction and explanatory levels, see Minnameier (2017).
The important result in the present context is (1) that coherence is to be taken in this narrow sense of unification and (2) that this kind of coherence is what abduction is all about in that it takes the epistemic subject from lower levels of cognition to higher ones.
Coherence, Perception, and Intuition
If coherence in the sense described above is the main ingredient of valid abductions, and if it might yield a cascade of higher levels of cognition and reasoning, the perspective could also be directed downward toward pre-linguistic forms of cognition. This might allow us to consider forms of abduction beyond the confines Peirce himself had set to it. It is well known that Peirce was grappling with perception and perceptual judgments. On the one hand, he found “that abductive inference shades into perceptual judgment without any sharp line of demarcation between them . . . (that) are to be regarded as an extreme case of abductive inferences, from which they differ in being absolutely beyond criticism” (CP 5.181, 1903), but on the other hand, perceptual judgment was still thought to be different. The reason is that “(o)n its side, the perceptive judgment is the result of a process, although of a process not sufficiently conscious to be controlled, or, to state it more truly, not controllable and therefore not fully conscious” (ibid.). Apparently, however, Peirce went wrong in his claim that perceptual judgment is uncritical and therefore lacks an important inferential component. After all, he
called it “judgment.” So the individual must judge in some way. And he discussed creative perceptions when looking at drawings or when confronted with ambiguous geometric shapes, such as cubes or stairs, which we can imagine from different three-dimensional perspectives (CP 5.183, 1903). He claimed that we cannot help but imagine these figures in one way or another, and the fact that we cannot help but see them this way is seen as a failure of cognitive control. This is a category mistake, however, because it refers to cognitive control at a higher level of consciousness than that of perception. After all, reflection on perception is a different activity from perceiving as such.
Consider the following example: You are walking through a forest at night and see only shadows, just as much as the sparse moonlight allows you to see. Suddenly you notice something that could be a human or a large animal, but you are not sure. These preliminary perceptions are based on immediate sensory stimuli that you integrate to form meaningful impressions. In this process, you make contingent sense stimuli coherent by perceiving a human being or something else. At this stage, it is merely a bold hypothesis to make sense of your sensory stimuli. In the following stage, you may ask aloud if there is someone. This means that you infer from your preliminary perception that if it were a person, they should be able to respond appropriately to your question. And then you apply this as a test and collect the data. If no one replies, you might either believe that there is nobody and that you were mistaken or you might become all the more cautious, fearing that someone might lie in ambush. Whatever is the case, this reveals that perception delivers coherent accounts of sense stimuli in the first place, which are then tested based on predictions. A correctly understood perception therefore fulfills all necessary conditions for abduction.
Even more, it also fulfills the necessary conditions of the whole inferential triad. Therefore, perception does not differ in any way from abduction. There are only different levels of abstraction to be distinguished (see Minnameier, 2017). Moreover, analyzing perception in this way is a cornerstone of pragmatism. It connects human reasoning and inference to more basic processes of life. Peirce seemed to have seen this, but perhaps failed to fully appreciate it: On its side, the perceptual judgement is the result of a process, although of a process not sufficiently conscious to be controlled, or, to state it more truly, not controllable and therefore not fully conscious. If we were to subject this subconscious process to logical analysis, we should find that it terminated in what that analysis would represent as an abductive inference, resting on the result of a similar process which a similar logical analysis would represent to be terminated by a similar abductive inference, and so on ad infinitum. (CP 5.181, 1903)
Therefore, we should just take it that perceptual judgments are “plainly nothing but the extremest case of Abductive Judgments” (CP 5.185, 1903). Peirce also stated that “(e)ven after the percept is formed there is an operation which seems to me to be quite uncontrollable. It is that of judging what it is that the person perceives. A judgement is an act of formation of a mental proposition combined with an adoption of it or act of assent to it” (CP 5.115, 1903). So, on the one hand, he denied that perception is controllable, yet on the other hand, he clearly spoke of an “adoption of it or act of assent to it,” which requires at least a minimum of controllability.
Peirce obviously thought of cognitive control in terms of a higher process of reflection, which, however, concerns a different act, namely, that of reflection on perceptions rather than that of making them. Hence, on the one hand, he held that “the perceptive judgment is the result of a process, although of a process not sufficiently conscious to be controlled, or, to state it more truly, not controllable and therefore not fully conscious” (CP 5.181, 1903). On the other hand, however, he judged that “(i)f the percept or perceptual judgment were of a nature entirely unrelated to abduction, one would expect that the percept would be entirely free from any characters that are proper to interpretations” (CP 5.184, 1903). Peirce thus attached great importance to the observation that neither percepts nor perceptual judgments are merely “given” but rather are generated by cognitive processes in the interaction with the environment (see also Hookway, 2012, pp. 15–18 as well as 149–164). This analysis of perception seems to be of great relevance for pragmatism in general for two main reasons. One reason is that perception is just one out of many processes in the wider field of intuition (see, e.g., Fridland, 2021; Magnani, 2009; Park, 2014; Viola, 2016). Most of the acts we perform are neither planned nor reflected upon. Nonetheless they are certainly intelligent and also controlled in the sense explained above. Any such act depends on a coherent interpretation of the situation one is in and the consequences one derives from it. Recent theories on “predictive processing” (Clark, 2015, 2017) or “active inference” (Friston, 2010; Friston et al., 2015, 2016) have claimed that even pre-conscious organismic behavior is to be understood in terms of inferences made on the basis of sensory inputs that allow organisms to make predictions and try to reduce prediction errors (and that they do this permanently).
Regarding human cognition, this extension of the reach of abduction – and by the same token deduction and induction – allows us to integrate dual-process theories of cognition (e.g., Kahneman, 2012; Stanovich, 2012; Evans & Stanovich, 2013), in particular “system 1 thinking” and tacit knowing (Polanyi, 1958/1998; Polanyi, 1966; see also Hermkes, 2016). The second reason is that extending the reach of abduction and inferential functioning in general in this way reinforces the foundational aspect of pragmatism and the inferential triad. Based on this view, every cognitive content, however simple or elementary, is produced in an act of making external stimuli – as received by the sensory receptors – coherent via abduction, and it is given a secure foundation by making predictions and, in particular, correcting prediction errors. This seems to be crucial for the overall project of pragmatism (see also Hookway, 2012, p. 18) and the naturalization of logic (e.g., Magnani, 2019, pp. 276–279).
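The idea of making predictions and correcting prediction errors can be illustrated by a deliberately simple Python sketch. It is not the formalism of predictive processing or active inference; it is a plain error-correction rule in which an expectation is revised by a fraction of each error, and the function name `track`, the learning rate, and the constant input signal are hypothetical choices made for illustration.

```python
from typing import Iterable, List


def track(inputs: Iterable[float], estimate: float = 0.0, rate: float = 0.5) -> List[float]:
    """Revise an expectation by a fraction of each prediction error.

    Returns the absolute prediction error made at every step.
    """
    errors = []
    for observed in inputs:
        error = observed - estimate   # mismatch between input and expectation
        estimate += rate * error      # move the expectation toward the input
        errors.append(abs(error))
    return errors


# A constant (hypothetical) sensory signal: the error is halved at every step.
errors = track([10.0] * 6)
print(errors)
```

With a stable input the error shrinks step by step, which is the minimal sense in which an expectation can be said to be secured by ongoing testing against experience.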
Conclusions
Taken together, the main results of the analysis in this chapter can be summarized as follows: First, we considered the possibility that new concepts and theories are brought about by mere chance and rejected it. There obviously has to be some process by which those concepts and theories are inferred from what is already at hand and constitutes the abductive problem. Also, abduction is not to be confused with IBE.
Next, two models of abduction were presented and subjected to critique: the AKM and the GW model. Whereas the former basically reconstructs Peirce’s famous statement of the abductive inference, the latter widens the perspective and includes further important aspects of the overall process of abduction and its requirements. However, it also contains important ambiguities that were highlighted. Based on a reconstruction of the overall connection of the three inferences and their subdivision into inferential sub-processes, i.e., colligation, observation, and judgment, the GW model was reanalyzed in terms of these specific inferential steps. One important result of this reanalysis was that coherence came out as the one and only central requirement for the validity of abduction. In a way this result appears surprising, because coherence is mostly discussed in the context of establishing truth, while truth is clearly not what abduction, as such, yields. It only identifies possible candidates for truth. However, if coherence is taken only as part of the answer, that given by abduction, it still plays a central role in finding the truth about surprising facts, which is complemented by deduction and induction, in particular a foundational justification via induction. However, the notion of coherence that is relevant for abduction is explanatory coherence in the sense of unification, which implies a higher-order representation of the content colligated in the abductive premise (at which this previously incoherent content can be made coherent). If coherence in this sense is what abduction is all about, the reach of abduction can be extended into realms that Peirce thought were excluded from abduction for a lack of cognitive control, in particular perception, but generally speaking all possible forms of intuition, and perhaps even below the level of consciousness. 
This would give pragmatism a strong naturalistic foundation and the necessary grounding, especially if abduction and induction are taken as the two “opposite poles of reason” (CP 8.218 [c. 1901]), one concerned with the construction of coherent concepts, broadly considered, the other with their confirmation and ongoing evaluation.
References
Aliseda, A. (2006). Abductive reasoning: Logical investigation into discovery and explanation. Springer.
Aliseda, A. (2017). The logic of abduction: An introduction. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 219–230). Springer.
BonJour, L. (1985). The structure of empirical knowledge. Harvard University Press.
Campos, D. G. (2011). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180, 419–442.
Clark, A. (2015). Radical predictive processing. Southern Journal of Philosophy, 53(S1), 3–27.
Clark, A. (2017). Busting out: Predictive brains, embodied minds, and the puzzle of the evidentiary veil. Noûs, 51(4), 727–753.
Evans, J., & Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspectives on Psychological Science, 8, 223–241.
Ewing, A. C. (1934). Idealism: A critical survey. Methuen.
Flach, P., & Kakas, A. (Eds.). (2000). Abduction and induction: Essays on their relation and integration. Kluwer Academic.
Fridland, E. (2021). Skill and strategic control. Synthese, 199, 5937–5964.
9 The Logical Process and Validity of Abductive Inferences
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews – Neuroscience, 11, 127–138.
Friston, K., Rigoli, F., Ognibene, D., Mathys, C., Fitzgerald, T., & Pezzulo, G. (2015). Active inference and epistemic value. Cognitive Neuroscience, 6(4), 187–224.
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., O’Doherty, J., & Pezzulo, G. (2016). Active inference and learning. Neuroscience and Biobehavioral Reviews, 68, 862–879.
Gabbay, D. M., & Woods, J. (2005). A practical logic of cognitive systems, Vol. 2: The reach of abduction – Insight and trial. Elsevier.
Haack, S. (1993). Evidence and inquiry: Towards reconstruction in epistemology. Blackwell.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74(1), 88–95.
Hempel, C. G. (1966). Philosophy of natural science. Prentice-Hall.
Hermkes, R. (2016). Perception, abduction, and tacit inference. In L. Magnani & C. Casadio (Eds.), Model-based reasoning in science and technology – Logical, epistemological, and cognitive issues (pp. 399–418). Springer.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–533.
Hookway, C. (2012). The pragmatic maxim: Essays on Peirce and pragmatism. Oxford University Press.
Iranzo, V. (2007). Abduction and inference to the best explanation. Theoria: An International Journal for Theory, History and Foundations of Science, 22(3), 339–346.
Kahneman, D. (2012). Thinking: Fast and slow. Allen Lane.
Kapitan, T. (1992). Peirce and the autonomy of abductive reasoning. Erkenntnis, 37(1), 1–26.
Kuipers, T. (1999). Abduction aiming at empirical progress or even truth approximation leading to a challenge for computational modelling. Foundations of Science, 4(3), 307–323.
Levi, I. (1991). The fixation of belief and its undoing. Cambridge University Press.
Lewis, C. I. (1946). An analysis of knowledge and valuation. Open Court.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). Routledge.
Mackonis, A. (2013). Inference to the best explanation, coherence and other explanatory virtues. Synthese, 190, 975–995.
Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Kluwer/Plenum.
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2019). Errors of reasoning exculpated: Naturalizing the logic of abduction. In D. Gabbay, L. Magnani, W. Park, & A. V. Pietarinen (Eds.), Natural arguments: A tribute to John Woods (pp. 269–308). College Publications.
McKaughan, D. J. (2008). From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles Sanders Peirce Society, 44, 446–468.
Minnameier, G. (2004). Peirce-suit of truth – Why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60, 75–105.
Minnameier, G. (2010). The logicality of abduction, deduction, and induction. In M. Bergman, S. Paavola, A.-V. Pietarinen, & H. Rydenfelt (Eds.), Ideas in action: Proceedings of the applying Peirce conference (pp. 239–251). Nordic Pragmatism Network.
Minnameier, G. (2016). Abduction, selection, and selective abduction. In L. Magnani & C. Casadio (Eds.), Model-based reasoning in science and technology – Logical, epistemological, and cognitive issues (pp. 309–318). Springer.
Minnameier, G. (2017). Forms of abduction and an inferential taxonomy. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based reasoning (pp. 175–195). Springer.
Minnameier, G. (2019). Re-reorienting the logic of abduction and the naturalization of logic. In D. Gabbay, L. Magnani, W. Park, & A. V. Pietarinen (Eds.), Natural arguments: A tribute to John Woods (pp. 353–373). College Publications.
Nepomuceno-Fernández, A., Soler-Toscano, F., & Velázquez-Quesada, F. (2017). Abductive reasoning in dynamic epistemic logic. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 269–293). Springer.
Niiniluoto, I. (2007). Structural rules for abduction. Theoria: An International Journal for Theory, History and Foundations of Science, 22(3), 325–329.
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Olsson, E. J. (2005). Against coherence: Truth, probability, and justification. Clarendon Press.
Paavola, S. (2005). Peircean abduction: Instinct or inference? Semiotica, 153(1), 131–154.
Paavola, S. (2006). Hansonian and Harmanian abduction as models of discovery. International Studies in the Philosophy of Science, 20, 93–108.
Park, W. (2014). How to learn abduction from animals? – From Avicenna to Magnani. In L. Magnani (Ed.), Model-based reasoning in science and technology – Theoretical and cognitive issues (pp. 207–220). Springer.
Pietarinen, A.-V., & Bellucci, F. (2015). New light on Peirce’s conceptions of retroduction, deduction, and scientific reasoning. International Studies in the Philosophy of Science, 28(4), 353–373.
Polanyi, M. (1958/1998). Personal knowledge: Towards a post-critical philosophy. Routledge.
Polanyi, M. (1966). The tacit dimension. Routledge & Paul.
Popper, K. R. (1935/2002). The logic of scientific discovery. Routledge.
Schurz, G. (1999). Explanation as unification. Synthese, 120, 95–114.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Schurz, G. (2017). Patterns of abductive inference. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 151–173). Springer.
Stanovich, K. E. (2012). On the distinction between rationality and intelligence: Implications for understanding individual differences in reasoning. In K. J. Holyoak & R. G. Morrison (Eds.), Oxford library of psychology. The Oxford handbook of thinking and reasoning (pp. 433–455). Oxford University Press.
Thagard, P. (2007). Coherence, truth and the development of scientific knowledge. Philosophy of Science, 74, 26–47.
Tiercelin, C. (2005). Abduction and the semiotics of perception. Semiotica, 153(1), 389–412.
Urbański, M., & Klawiter, A. (2018). Abduction: Some conceptual issues. Logic and Logical Philosophy, 27(4), 583–597.
Viola, T. (2016). Peirce on abduction and embodiment. In M. Jung & R. Madzia (Eds.), Pragmatism and embodied cognitive science: From bodily intersubjectivity to symbolic articulation (pp. 251–268). de Gruyter.
Walker, R. C. S. (2018). The coherence theory of truth. In M. Glanzberg (Ed.), The Oxford handbook of truth (pp. 219–237). Oxford University Press.
Woods, J. (2012). Cognitive economics and the logic of abduction. Review of Symbolic Logic, 5(1), 148–161.
Woods, J. (2013). Errors of reasoning: Naturalizing the logic of inference. College Publications.
Woods, J. (2016). Logic naturalized. In J. Redmond, O. Pombo Martins, & A. Nepomuceno Fernández (Eds.), Epistemology, knowledge and the impact of interaction. Logic, epistemology, and the unity of science (Vol. 38). Springer.
Woods, J. (2017). Reorienting the logic of abduction. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 137–150). Springer.
Young, J. O. (2018). The coherence theory of truth. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. https://plato.stanford.edu/archives/fall2018/entries/truth-coherence/
Yu, S., & Zenker, F. (2018). Peirce knew why abduction isn’t IBE: A scheme and critical questions for abductive argument. Argumentation, 32(4), 569–587.
Theory-Generating Abduction and Its Justification
10
Gerhard Schurz
Contents
Introduction: Abduction and Inference to the Best Explanation
Unification and Independent Testability: Two Rationality Criteria for Theory-Generating Abduction
Common Cause Abduction in Science
Instrumentalistic Justification of Theory-Generating Abduction
Justifying Theory-Generating Abduction by Causality
Abductive Justification of Perceptual Realism: Normal World Versus Brain in the Vat
Abductive Justification of Causality
Conclusions
References
Abstract
Theory-generating abductions introduce new (theoretical) concepts into their conclusion. This form of abduction underlies all uncertain inferences from (singular or general) empirical facts to theoretical hypotheses that explain these facts by unobserved or unobservable entities and properties, expressed by theoretical concepts. Theory-generating abductions are discriminated from speculative post-facto abductions by two scientific rationality criteria: unification and independent testability. A particularly important form of theory-generating abduction in science is common cause abduction, which explains correlated empirical dispositions in terms of common theoretical causes. These abductions also play an important
G. Schurz
Department of Philosophy, University of Duesseldorf, Duesseldorf, Germany
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_4
role in the justification of metaphysical theories, such as perceptual realism. A justification of theory-generating abductions is possible based on a weak principle of causality that is in turn justified by a noncausal abduction.
Keywords
Abduction and IBE · Theory-generating abduction · Speculative abduction · Unification · Independent testability · Common cause abduction
Introduction: Abduction and Inference to the Best Explanation
Abduction comprises a whole family of inference patterns. Examples of abductions range from the everyday inference from a track in the sand to a man who walked there, to the epistemological inference from internal sense experiences to an external reality that causes them. Abductive inferences have the following structure in common (cf. Peirce, 1903, 5.189):
General Pattern of Abduction:
Premise 1: A singular or general fact E that is in need of explanation.
Premise 2: A system S of background beliefs, which implies that a certain hypothesis H is a (most) plausible potential explanation for E available in S (“potential” in the sense that if H were true, it would explain E).
Conclusion: H is conjectured to be true or at least close to the truth.
Abductive inferences go back to Peirce (1903, 5.170, 5.189), who understood them as inferences to a conjectured cause or explanation of observed facts. They have been reconstructed by Harman (1965) as the “inference to the best explanation” (IBE). At least in some places (Peirce, 1901, 6.525), Peirce already understood abductive inferences as inferences to preferred explanatory hypotheses that are more plausible than possible competitors. So, Peirce’s understanding of abductions can be conceived as approximately similar to Harman’s IBEs, whence a variety of philosophers have proposed a loose equivalence of abductive inferences and IBEs (e.g., Harman, 1965; Lipton, 1991; Josephson & Josephson, 1994, 5; Walton, 2001; Psillos, 2000; Schurz, 2008; Douven, 2011; Williamson, 2016; Niiniluoto, 2018, 17 f.). However, caution is warranted here, since the relation between abduction and IBE is controversial in the literature. In what follows, some important reasons why abduction and IBE cannot be rigidly separated are presented.
Several authors have argued for a separation of abduction and IBE (e.g., Hintikka, 1998; Minnameier, 2004; Gabbay & Woods, 2005; Yu & Zenker, 2018). They argue that abductions merely generate possible or promising explanatory hypotheses, but do not confirm them and do not make them probably true. In contrast, the latter operation is characteristic of IBEs and more generally of inductive inferences in the broad understanding of “induction” as confirmation. This idea of separating
abduction from induction is also found in many (though not all) of Peirce’s writings on abduction (see the excellent compilation in Minnameier, 2017). I think, however, that Peirce’s idea of separating abduction from IBE and from induction in the broad confirmational sense does not work, for the following reasons. Assume that a surprising fact E₁ (or set of facts) is given, and abduction creates a hypothesis H that explains E₁ (given the background knowledge K). Next, further consequences are derived from H, and these consequences are confirmed by further facts E₂. Then, according to the Peircean separation proposal, the latter operation is no longer an abduction, but an IBE, which has to be considered as an inductive confirmation, i.e., as an induction (see Minnameier, 2017, 176). This view is untenable, however: it follows from probability theory that because H explains the original fact E₁ (the fact for which H was abducted), H is inductively confirmed by E₁, i.e., P(H|E₁) > P(H) holds, with “P” for probability. The proof is simple: a minimal condition for H to explain E₁ is that H raises E₁’s probability, i.e., P(E₁|H) > P(E₁); so P(H|E₁) > P(H) follows, since by Bayes’ theorem P(H|E₁)/P(H) = P(E₁|H)/P(E₁), whereby it is assumed, as usual, that P(H) and P(E₁) are nonzero. Of course, the set of explained facts {E₁, E₂} conveys a much stronger confirmation to H than the original fact E₁ alone. Moreover, as will be seen in section “Unification and Independent Testability: Two Rationality Criteria for Theory-Generating Abduction”, if H contains theoretical parameters that have been fitted toward E₁, then an independent confirmation of H by E₂ is indispensable. But this does not change the fact that E₁ already provides some confirmation for H. This leads to the conclusion that a strict separation of abduction from induction in the broad confirmational sense is impossible.
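The symmetry used in this proof can be checked numerically. The following Python sketch is an illustration only: the function name `confirmation_ratios` and the joint probabilities are hypothetical and were chosen merely so that H raises the probability of the evidence; they are not taken from the chapter.

```python
from fractions import Fraction


def confirmation_ratios(p_h_e, p_h_not_e, p_not_h_e, p_not_h_not_e):
    """Return (P(E|H)/P(E), P(H|E)/P(H)) for a joint distribution over H and E."""
    total = p_h_e + p_h_not_e + p_not_h_e + p_not_h_not_e
    if total != 1:
        raise ValueError("joint probabilities must sum to 1")
    p_h = p_h_e + p_h_not_e  # marginal probability P(H)
    p_e = p_h_e + p_not_h_e  # marginal probability P(E)
    if p_h == 0 or p_e == 0:
        raise ValueError("P(H) and P(E) must be nonzero")
    explanatory_ratio = (p_h_e / p_h) / p_e   # P(E|H) / P(E)
    confirmatory_ratio = (p_h_e / p_e) / p_h  # P(H|E) / P(H)
    return explanatory_ratio, confirmatory_ratio


# Hypothetical joint distribution (an assumption for illustration):
# P(H and E) = 3/10, P(H and not-E) = 1/10,
# P(not-H and E) = 2/10, P(not-H and not-E) = 4/10.
explanatory, confirmatory = confirmation_ratios(
    Fraction(3, 10), Fraction(1, 10), Fraction(2, 10), Fraction(4, 10)
)
print(explanatory, confirmatory)
```

With these assumed values both ratios coincide and exceed 1, so H raises the probability of the evidence exactly to the degree to which the evidence raises the probability of H.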
It is for this reason that the loose concept of induction as any nondeductive and probability-conveying argument is avoided in this chapter, because otherwise all forms of IBEs and abductions would be inductive in nature, and a clear separation of induction and abduction would be impossible. Induction is always understood here in the narrow enumerative sense, as a projection of a property or a pattern from observed instances to unobserved instances (inductive prediction) or to the entire domain (inductive generalization). This narrow notion of induction is clearly separable from abduction viewed as inference to a (causal or noncausal) explanation. These arguments do not exclude that a gradual or blurred distinction between the abductive step of hypothesis generation and the confirmatory steps of hypothesis testing can be useful on other grounds. Note, however, that in scientific practice, too, these two steps can hardly be disentangled; Raftopoulos (2016) illustrates this point using Newton’s Optics.
Even if abductions and IBEs are loosely identified, one should keep in mind that the (predominantly logical-cognitive) literature on abduction and the (predominantly philosophical) literature on IBE have focused on different and complementary aspects. While the philosophical literature has focused on the problem of the truth conduciveness and probabilistic justification of IBEs in general, the logical and cognitive approaches have concentrated on explicating patterns of abduction as search strategies for generating promising explanatory hypotheses.
Schurz (2008) proposes a classification of different patterns of abduction, distinguishing (among others) between fact abductions, law abductions, and existential abductions. Theory-generating abductions, the topic of this chapter, are a particular kind of existential abduction. In theory-generating abductions, new theoretical concepts, also called “latent” variables, are abducted together with theoretical laws that connect them with each other and with empirical concepts and that explain the empirical facts stated in the premises. Newton, for example, abducted the theoretical concept of force together with mechanical force laws, in order to explain the elliptic orbits of planets. It was an important insight of the post-positivistic philosophy of science that theoretical concepts cannot be reduced to observable concepts via definitions (Carnap, 1956; Hempel, 1951; French, 2008; Stegmüller, 1976; Schurz, 2014, Sect. 5.1–3); they are rather inferred by abduction as the best available explanation of empirical facts discovered by observation and (enumerative) induction. Because of their ability to introduce new concepts, theory-generating abductions have also been called creative in the literature, as opposed to selective abductions that choose a most plausible explanation or “cause” from a multitude of potential causes, according to a background system of explanatory laws that are already known and confirmed elsewhere (Magnani, 2001, 20; Schurz, 2008, Sect. 1; Niiniluoto, 2018, 13). However, being “creative” is a rather general and vague property that in a sense also applies to other forms of abduction; therefore the more precise designation of “theory-generating” abduction is preferred here.
The ability of abductive inferences to introduce new terms (individual constants or predicates) relevantly into their conclusion is a particular property of (theory-generating) abductions that distinguishes them from deductive or inductive inferences (in the narrow enumerative sense, as explained). This was already seen by Peirce, who once wrote that induction “never can originate any idea whatever. Nor can deduction. All the ideas of science come to it by way of abduction” (1903, CP 5.145). Going beyond Peirce, a brief proof of this fact will now be presented. For this purpose, some logical terminology is introduced: small letters (a, b, ..., possibly indexed) stand for individual constants (denoting individuals); capital letters (F, G, ...) stand for predicates; “Fa” stands for “individual a has property F”; the logical symbols are ¬ (negation), ∧ (conjunction), ∨ (disjunction), → (material implication), ∀ (universal quantifier), and ∃ (existential quantifier); P(p) stands for the probability of a proposition p and P(p|q) for the conditional probability of p given q. With this in the background, a nonlogical term τ in the conclusion (individual constant or predicate) is defined as new iff it is not contained in the premises, and it is defined as relevant iff whenever τ is replaced in some of its occurrences by an arbitrary type-equivalent new term, the so modified conclusion is entailed (i) neither by the premises (ii) nor by the original conclusion. Clause (i) prevents the new symbols “b” and “G” introduced by irrelevant deductive inferences based on disjunctive weakening, such as Fa/Fa ∨ Gb, from counting as relevant (here “/” is the inference stroke, meaning “therefore”), and clause (ii) prevents these symbols from counting as relevant in an inductive inference with a disjunctively weakened conclusion, as in “Fa₁, …, Faₙ / Faₙ₊₁ ∨ Gb.” In both cases these symbols
are replaceable by an arbitrary new type-equivalent symbol without destroying the validity of the inference. Given this definition, the following is easily provable:
(i) In deductive inferences it is not possible that the conclusion contains singular terms or predicates that are simultaneously new and relevant (which is a consequence of the theory of uniform substitution for individual constants and for predicates).
(ii) In inductive inferences new singular terms are relevantly introduced into the conclusion. For example, in the inductive inference “Fa₁, …, Faₙ / Faₙ₊₁,” the individual constant aₙ₊₁ is new and relevant, since Fb (for arbitrary b) is (i) neither entailed by Fa₁, …, Faₙ (ii) nor by Faₙ₊₁. In contrast, inductive inferences cannot contain new relevant predicates in their conclusion (since by definition “Fa₁, …, Faₙ / Gaₙ₊₁” is not a correct inductive inference).
(iii) Finally, in theory-generating abductive inferences, new predicates are relevantly introduced into the conclusion, typically theoretical predicates designating unobserved or unobservable entities. Various examples will be presented in the following.
Analogical abductions are a special case of theory-generating abductions. Here, an empirical regularity is explained by a theoretical model that is established by analogy to another model in a different domain of application (cf. Hesse, 1976; Gentner, 1983). An example is the abductive explanation of the empirical laws of the propagation of sound (cf. Thagard, 1988, 67). The phenomenon of diffraction (i.e., the audibility of sound behind a wall) excludes a theoretical explanation of sound by moving particles because the particles would be stopped by the wall. To explain diffraction, the theoretical model of sound waves was created, based on an analogy to water waves. Analogical inferences are important as search heuristics. However, per se, analogies do not necessarily offer confirmational support; often they lead to error.
For example, electromagnetic waves turned out to have different properties than mechanical waves; in spite of the analogy, they do not possess a medium in which they “swing.” In a global historical perspective, analogical abductions have led humans more often into error than to the truth; the explanation of natural phenomena such as the rising sun by intentional agents or “Gods” is a nice analogy, but is clearly false from a scientific viewpoint. An independent confirmation of the analogy is needed. In conclusion, analogical abductions face epistemological challenges similar to those of theory-generating abductions in general; therefore they are treated here as a subcase of theory-generating abductions.
Unification and Independent Testability: Two Rationality Criteria for Theory-Generating Abduction
The core problem of the justification of theory-generating abductions is their discrimination from pure speculations. The problem lies in the fact that by using post-facto speculations postulating suitable hidden “powers,” one can explain any
empirical facts whatsoever. For this reason, several contemporary empiricists (e.g., van Fraassen, 1989, part II) have skeptical reservations about abductive inferences. In this section, it is proposed to handle the discrimination problem by using two rationality criteria. For this purpose, the notion of “theory” is understood in a wide sense, covering all sorts of hypotheses about unobservable entities, including scientific theories as well as speculations.
A necessary condition for a hypothesis H to explain a fact E is that H entails E or increases E’s probability. However, not every hypothesis H that entails E or makes E probable is a reasonable potential explanation that can be genuinely confirmed by E. It may also be a pseudo-explanation with a corresponding pseudo-confirmation. Lipton (1991, 58) already emphasized that the best available “explanation” may not be good enough to be rationally acceptable. There are two major kinds of pseudo-explanations or pseudo-confirmations: the first kind of “pseudo” is logical, and the second one is epistemic.
Logical pseudo-explanations: They are produced by so-called tacking by conjunction (Lakatos, 1970, 128; Glymour, 1981, 67). Here (in the simplest case), the hypothesis is formed by “tacking” an irrelevant conjunct X onto a given evidence E, for example, “E ∧ Aliens have occupied earth.” E∧X pseudo-explains E, because P(E|E∧X) = 1 > P(E), and E pseudo-confirms the hypothesis E∧X in the standard Bayesian sense because it raises its probability, i.e., P(E∧X|E) > P(E∧X). But obviously, this is neither a genuine explanation (see already Hempel & Oppenheim, 1948, 275) nor a genuine confirmation. For a genuine confirmation, it is required that E also raises the probability of those content parts of the hypothesis H that “transcend E,” i.e., are not logically entailed by E. However, the only E-transcending content part of X∧E is X, and E is probabilistically irrelevant to X, i.e., P(X|E) = P(X).
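The probabilistic profile of tacking by conjunction can be made concrete with a small Python sketch. It is an illustration under stated assumptions: the function name `tacking_report` and the values P(E) = 1/2 and P(X) = 1/10 are hypothetical, and X is assumed to be probabilistically independent of E.

```python
from fractions import Fraction


def tacking_report(p_e, p_x):
    """Probabilities for H = E∧X when X is assumed independent of E."""
    if not (0 < p_e < 1 and 0 < p_x < 1):
        raise ValueError("P(E) and P(X) must lie strictly between 0 and 1")
    p_h = p_e * p_x  # P(E∧X) under the independence assumption
    return {
        "P(E|H)": p_h / p_h,  # E is entailed by E∧X, so this equals 1
        "P(E)": p_e,
        "P(H|E)": p_h / p_e,  # P(E∧X|E)
        "P(H)": p_h,
        "P(X|E)": p_h / p_e,  # P(X∧E)/P(E), which reduces to P(X)
        "P(X)": p_x,
    }


# Hypothetical values chosen for illustration only.
report = tacking_report(Fraction(1, 2), Fraction(1, 10))
for name, value in report.items():
    print(name, "=", value)
```

Under these assumptions E∧X raises the probability of E and E raises the probability of E∧X, yet the probability of the tacked-on conjunct X is left unchanged by E.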
For the same reason, X∧E is not a genuine explanation of E, because in a genuine explanation, the E-transcending content part of the explanans must be explanatorily relevant to E (but X is explanatorily irrelevant to E).
Epistemic pseudo-explanations: They are what typically underlies speculative abductions in the history of ideas and in common sense; therefore they are the focus of this section. Their structure is more subtle than that of logical pseudo-explanations (whose “pseudo” nature is obvious). In an epistemic pseudo-explanation, the pseudo-connection between the explanandum and the hypothesis is generated by a theoretical parameter or “variable” whose values are fitted toward E post facto, i.e., by using the information E. The problem is that by post-facto fitting of sufficiently many hidden variables, one may construct a speculative pseudo-explanation for any constellation of facts whatsoever. (Note that a “variable” is understood here not as a logical variable but as a mathematical variable, i.e., a function from a logical variable ranging over a domain of entities into a space of values; an example is “Color(x) = blue”; here “x” is the logical variable, “Color” is the mathematical variable or parameter, and “blue” a value of the mathematical variable.) Post-facto speculations cannot be captured by criteria of logical irrelevance. The crucial challenge for a workable account of theory-generating abduction is a reasonable identification method for post-facto speculations. For this purpose, two criteria are proposed:
(i) Unification – the hypothesis H must unify empirical facts.
(ii) Independent testability – the hypothesis H must be independently testable by use-novel evidence in the sense of Worrall (2006). H is said to be independently testable iff H entails or makes probable some empirical facts that were not used in the construction of H; so these facts constitute potentially predictive content of H, as they could have figured as predictions. Here, a “prediction” is not understood in the temporal sense of future predictions, but in the more general epistemic sense: a “prediction” is a consequence that was not already known before it was inferred from the hypothesis.
First, these criteria are illustrated by way of examples. In the simplest kind of a speculative abduction, the hypothesis “explains” any given fact, singular or general, by postulating a special power that caused this fact – for example, “God” or some more mundane conspiratorial power. In what follows, read “ψ(X)” as “some power of kind ψ intends that X happens” (thus “ψ” is a second-order property, and “X” a logical second-order variable ranging over propositions). Depending on whether X expresses an observed fact or an observed empirical regularity, there are two kinds of speculative abductions:
Speculative fact abduction:
Explanandum E: Ca
Abductive conjecture H: ψ(Ca) ∧ ∀X(ψ(X) → X)
Example: We have a Corona pandemic. God wanted that we have a Corona pandemic, and whatever God wants, happens.
Speculative fact abductions have been performed by our human ancestors since the earliest times. All sorts of unexpected events can be pseudo-explained by speculative fact abductions. They do not achieve unification, because for every event (E) a special hypothetical “wish” of God (ψ(E)) has to be postulated (Schurz & Lambert, 1994, 86). For the same reason, such pseudo-explanations are entirely post hoc and do not entail any use-novel predictions by which they could be independently tested. The theoretical parameter ψ(X), for “God wanted that X,” can be fitted toward any possible experience E whatsoever, simply by inserting “E” for “X.”
In speculative law abductions, an empirical regularity, frequently clothed in an empirical disposition, is pseudo-explained by a hidden power:
Speculative law abduction:
Explanandum E: ∀x(Ox → Dx)
Abductive conjecture H: ∀x(Ox → ψ(Dx)) ∧ ∀x(ψ(Dx) → Dx)
Example: Opium (O) has the disposition D to make people sleepy (after consuming it). Opium has a special power (a “virtus dormitiva”) ψ(Dx) that causes the disposition D.
Speculative law abductions were especially common in the Middle Ages. The example of the “virtus dormitiva” was ironically commented on by Molière, who pointed out that this “power” does not explain, but merely amounts to a redundant multiplication of causes (cf. Mill, 1865, book 5, Chap. 7, Sect. 2). Speculative law abductions do not offer unification because for every elementary empirical law, one has to introduce two elementary hypothetical laws to explain it (Schurz & Lambert, 1994, 87). For the same reason, the abductive conjecture has no predictive power that goes beyond the predictive power of the explained law. The theoretical variable “ψ(Dx)” can be fitted toward any given disposition D.
One should not diminish the value of speculations: they can be regarded as a predecessor of scientific inquiry. Humans have an inborn instinct to search for explanations (cf. Sperber et al., 1995, ch. 3; Lipton, 1991, 130), and speculative abductions can be regarded as the idling of humans’ inborn explanatory search when a proper explanation is out of reach. It is nevertheless important to understand why speculative beliefs lack objective justification. This is achieved by the rationality criteria of unification and independent testability. To make these notions precise, a further logical notion is introduced: that of a content element or content part. A content element of a hypothesis or set of hypotheses H is a logical consequence C of H that is (i) conjunctively elementary, in the sense that C is not L(ogically) equivalent with a conjunction C₁∧C₂ of conjuncts both of which are shorter than C, and (ii) a relevant logical consequence of H, in the sense that no predicate (or propositional variable) in C is replaceable on some of its occurrences by an arbitrary new predicate with the same place number, salva validitate of the entailment of C by H.
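For the propositional case, the relevance condition (ii) can be tested mechanically. The following Python sketch is a simplified illustration, not the definition used in the cited works: it covers only clause (ii), ignores conjunctive elementariness, decides entailment by truth tables, and represents formulas as nested tuples. All names (`FRESH`, `entails`, `is_relevant_consequence`, and the example atoms) are hypothetical.

```python
from itertools import combinations, product

FRESH = "_new"  # stands for an arbitrary new propositional variable


def atoms(formula):
    """Collect the propositional variables occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))


def evaluate(formula, valuation):
    """Evaluate a formula built from 'not', 'and', 'or' under a valuation."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], valuation)
    if op == "and":
        return all(evaluate(a, valuation) for a in args)
    if op == "or":
        return any(evaluate(a, valuation) for a in args)
    raise ValueError(f"unknown connective: {op}")


def entails(premise, conclusion):
    """Truth-table test of classical propositional entailment."""
    names = sorted(atoms(premise) | atoms(conclusion))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if evaluate(premise, valuation) and not evaluate(conclusion, valuation):
            return False
    return True


def occurrences(formula, atom, path=()):
    """Return the tree positions at which `atom` occurs."""
    if isinstance(formula, str):
        return [path] if formula == atom else []
    found = []
    for i, sub in enumerate(formula[1:], start=1):
        found.extend(occurrences(sub, atom, path + (i,)))
    return found


def replace_at(formula, positions, path=()):
    """Replace the atoms at the given positions by the fresh variable."""
    if isinstance(formula, str):
        return FRESH if path in positions else formula
    return (formula[0],) + tuple(
        replace_at(sub, positions, path + (i,))
        for i, sub in enumerate(formula[1:], start=1)
    )


def is_relevant_consequence(premise, conclusion):
    """Clause (ii) only: no atom of the conclusion is replaceable salva validitate."""
    if not entails(premise, conclusion):
        return False
    for atom in atoms(conclusion):
        positions = occurrences(conclusion, atom)
        for k in range(1, len(positions) + 1):
            for chosen in combinations(positions, k):
                if entails(premise, replace_at(conclusion, set(chosen))):
                    return False
    return True


# Hypothetical example: a conjunct is relevant, a disjunctive weakening is not.
hypothesis = ("and", "philosopher", "footballer")
print(is_relevant_consequence(hypothesis, "philosopher"))
print(is_relevant_consequence(hypothesis, ("or", "philosopher", "moon_blue")))
```

The sketch assumes that the name `_new` does not already occur in the formulas; a predicate-logical version would additionally have to handle quantifiers and place numbers.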
This definition of a content element (or “relevant element”) was first introduced in Schurz (1991, (21), (35)) (see also Schurz and Weingartner 2010, def. 4; and Schippers and Schurz 2017, def. 4.2). The shortness criterion involved in the condition (i) of conjunctive elementariness is related to the concept of minimum description length in machine learning (Grünwald, 2000); it is relativized to a framework with ¬, ∧, ∨, →, ∀, ∃ as primitive logical symbols (defined symbols being eliminated). According to the relevance condition (ii), irrelevant disjunctive weakenings such as the step from p to p∨q (where q is the replaceable component) do not belong to the content of a proposition, but conjunctive components do. For example, “Peter is a philosopher or the moon is blue” is not a content element of “Peter is a philosopher and a football player,” but “Peter is a philosopher” is. Other formal philosophers have proposed related accounts of elementary content parts with technically different properties; examples are Friedman’s (1974) “independently acceptable elements,” Gemes’ (1994) “content parts,” and Fine’s (2017) “verifiers.”
The notion of unification is characterized as follows. Informally speaking, a hypothesis or theory H is unificatory iff H implies many elementary facts E1, ..., En by a few elementary theoretical principles H1 ∧ ... ∧ Hm. The idea of unification underlying definition 9.3 can be found in Mach (1883, 586f.), Feigl (1970, 12), Friedman (1974), and Schurz and Lambert (1994). Using the notion of content element, this is precisely defined by saying that H is unificatory iff (i) H is logically equivalent with the conjunction H1 ∧ ... ∧ Hm of content elements of H, (ii) E1, ..., En are all content elements of H (or of H’s high-probability consequences if H is a
10 Theory-Generating Abduction and Its Justification
probabilistic theory), and (iii) m ≪ n. The quotient n/m is regarded as a measure of unification. Note that without the notion of content elements, the idea of measuring unification by counting consequences would be untenable: it would be immediately susceptible to the “conjunction paradox” first mentioned by Hempel (1965, 273, fn. 36), according to which the mere conjunction E1 ∧ ... ∧ En of a set of facts Ei “pseudo-unifies” these facts.
The characterization of independent testability is more involved. This condition becomes important for theoretical hypotheses H that have been constructed by exploiting the information contained in a particular evidence E. In the typical case, there is an “unfitted” general hypothesis Hunfit, and the hypothesis H (= Hfit) has been obtained by fitting the values of certain “free parameters” in Hunfit to the given evidence E, so that E follows from H or is made probable by H. In such a case, H’s independent testability requires that H implies further evidence E* that is use-novel, i.e., was not used for fitting the free parameters to their “right values.” In application to the above speculative fact abduction, the hypothesis H (i.e., Hfit) is the abducted hypothesis “God wanted E (the Corona pandemic) and whatever God wants, happens”; the underlying unfitted hypothesis, Hunfit, is the existential quantification over the freely adjustable theoretical variable X of God’s wishes, “∃X(God wants X) and whatever God wants, happens”; and the fitting process consists in removing the existential quantifier in Hunfit and replacing the logical variable X by the observed evidence E post facto.
Independent testability is a stronger rationality condition than unification: the former implies the latter, but not vice versa.
To see this, note that if H (= Hfit) has been obtained from Hunfit by fitting its free parameters to E, and H is independently confirmed, then H must imply further evidence E* that was not used for fitting; thus H unifies E* and E. In the other direction, if H unifies several facts E, E*, this does not necessarily imply that one of them is use-novel because H could have been obtained by a simultaneous fitting of H toward E and E*. But the more H unifies, the more likely H will be found to entail some use-novel facts within the accepted background knowledge. On the other hand, there are areas at the borderline between science and speculation, for example, in metaphysics or in some parts of contemporary theoretical physics, where one finds hypotheses that satisfy the condition of unification to a high degree but, at least until today, do not possess independent confirmation by use-novel evidence.
The criterion of use-novelty can be probabilistically justified as follows. Recall the unfitted hypothesis in the example of the God abduction: “God wants something X and whatever God wants, happens.” This hypothesis could be fitted to any possible evidence E′ (by replacing X with E′), in particular to E′ = ¬E. Therefore, the probability of Hunfit cannot be raised by the particular explanandum E, since P(E|Hunfit) = P(¬E|Hunfit) holds, which implies that P(Hunfit|E) = P(Hunfit). It follows that Hunfit is not confirmed by E in the Bayesian sense. However, Hunfit is a logical consequence of Hfit, whence P(Hfit|E) ≤ P(Hunfit|E) = P(Hunfit), and since the prior probability P(Hunfit) is low, it follows that P(Hfit|E) is forced to remain at a low value. (This implies that Hfit is not “genuinely” confirmed in the sense of Schippers & Schurz, 2020). Only if Hfit has been independently tested and
confirmed by a second use-novel piece of evidence E* to which Hunfit has not been fitted, then Hunfit can be reasonably said to be confirmed via the confirmation of Hfit by E and E*. For obviously it is not possible to fit Hunfit to a given evidence E and then confirm the so-obtained Hfit by any other evidence E* whatsoever. In this case, Hfit’s and Hunfit’s probability grows and may come arbitrarily close to 1, relative to an increasing amount of conditionally independent pieces of evidence.
The use-novelty criterion goes back to Worrall (2006). However, it is by no means a purely philosophical invention. The selection among models with freely adjustable parameters is called model selection (Burnham & Anderson, 2002). An important domain is curve fitting applied to statistically correlated variables (X, Y with values xi, yi). Here, one approximates a finite set of data points E = {⟨xi, yi⟩: 1 ≤ i ≤ n} by an optimal curve Y = f(X) with a remainder dispersion around it as small as possible. It is a well-known fact that any set of data points can be approximated by fitting the variable parameters ci of a polynomial function Y = c0 + c1·X + ... + cn·X^n of a sufficiently high degree n. This function plays the role of Hunfit. Merely fitting the parameters of Hunfit to the data set E is not enough for confirming it. The approximation success of a high-degree polynomial may also be due to overfitting the data, i.e., Hunfit may have been fitted on random accidentalities of the sample (cf. Hitchcock & Sober, 2004). Only if the curve Hfit with its parameters fitted toward E successfully approximates a use-novel data set E*, one to which its parameters have not been adjusted, then it is genuinely confirmed by E and E*. A well-known confirmation method corresponding to the use-novelty criterion is cross-validation (Shalev-Shwartz & Ben-David, 2014, Sect. 11.2).
Here one starts with just one (big) data set E, splits E randomly into two disjoint data sets E1 and E2, fits the unfitted hypothesis to E1, and tests the fitting result at E2. By repeating this procedure and calculating the average probability of E2 conditional on H-fitted-to-E1, one obtains a highly reliable confirmation score. Two related methods are the BIC (Bayesian information criterion) and the AIC (Akaike information criterion). Both are based on the expectation value of the likelihood of a curve optimally fitted to some set E1, in regard to another independently chosen data set E2. Among curves with equally good approximation accuracy this likelihood is the greater, the fewer freely variable parameters the curve has, assuming a constant dispersion with Gaussian error distribution (Burnham & Anderson, 2002, 63). The outcome of m-out-of-n cross-validation converges to the BIC score when n grows large (similarly, 1-out-of-n cross-validation converges to the AIC score; cf. Shao, 1997).
Use-novelty also has important applications in philosophy of science. As pointed out by Lakatos (1970, 47ff.), if a theory’s predictions conflict with experience, it is always possible to save the theory core by enriching or revising the theory’s periphery using auxiliary assumptions that postulate the existence of disturbing factors. As long as such a hypothesis merely avoids the conflict with experience but does not produce new empirical content, it is called ad hoc. As long as a hypothesis is ad hoc, it does not have use-novel content and is not genuinely confirmed. For example, when around 1856 Le Verrier observed a divergence of the planet Mercury from its predicted orbit, he postulated the existence of a disturbing planet close to
Mercury, named Vulcan, too small to be observable. This hypothesis was ad hoc and remained ad hoc, since the existence of Vulcan could never be independently confirmed. Later Mercury’s divergence was explained by relativity theory (cf. Grünbaum, 1976, 332-5, 358; Lakatos, 1970, 39f, 88). An extreme variant of ad hocery is conspiracy theories: they save an extremely implausible “theory core” against multiple conflicting experiences by a cascade of ad hoc hypotheses. In conclusion, unification and independent testability are necessary criteria for the rational justifiability of theory-generating abduction. If an abducted theory does not satisfy these two criteria, it is post hoc and counts as a pure speculation that is not rationally justified. In the next section, it is shown how theory-generating abductions in science satisfy these two criteria. Thereafter, it will be seen that a philosophical abduction, the inference to external reality, follows the same pattern as scientific abductions.
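The use-novelty criterion for curve fitting discussed in this section can be made concrete with a toy computation. The sketch below is an illustration under stated assumptions, not a reconstruction of any cited method: the data are artificial and deterministic (a linear law with alternating deviations of ±0.5), the high-degree polynomial is the interpolating polynomial through all points of E in Lagrange form, and a single hold-out set E* replaces the repeated random splits of cross-validation. All names are hypothetical.

```python
def fit_line(points):
    """Least-squares straight line through (x, y) points (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_interpolating_polynomial(points):
    """Polynomial of degree len(points) - 1 passing through every point
    (Lagrange form); it plays the role of an overfitted curve."""
    def curve(x):
        total = 0.0
        for j, (xj, yj) in enumerate(points):
            basis = 1.0
            for m, (xm, _) in enumerate(points):
                if m != j:
                    basis *= (x - xm) / (xj - xm)
            total += yj * basis
        return total
    return curve

def mean_squared_error(curve, points):
    """Average squared deviation of the curve from the data points."""
    return sum((curve(x) - y) ** 2 for x, y in points) / len(points)

def true_law(x):
    """Assumed underlying regularity of the toy example."""
    return 2.0 * x + 1.0

# E: evidence used for fitting (true law plus alternating "accidental" deviations).
E = [(float(i), true_law(i) + (0.5 if i % 2 == 0 else -0.5)) for i in range(10)]
# E*: use-novel evidence at new x-values, not used for fitting.
E_star = [(i + 0.5, true_law(i + 0.5)) for i in range(9)]

line = fit_line(E)
polynomial = fit_interpolating_polynomial(E)

for name, curve in [("line", line), ("degree-9 polynomial", polynomial)]:
    print(name,
          "| error on E:", mean_squared_error(curve, E),
          "| error on E*:", mean_squared_error(curve, E_star))
```

In this toy setting the polynomial reproduces E exactly but deviates strongly on E*, while the straight line fits E only approximately and remains close to E*. Only the second curve would count as confirmed by use-novel evidence in the sense described above.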
Common Cause Abduction in Science
A unificatory explanation of facts E1, E2, ... is only possible if these facts are mutually correlated, since otherwise each fact would have an independent cause (Barnes, 1995). Correlated singular facts are, at a first level, explained by empirical laws. For this reason, the primary explanandum of theory-generating abductions in science is not singular events, but empirical regularities that have been discovered by induction. Chemical theories, for example, explain all chemical reactions of a given kind, not just this or that reaction. The empirical regularities that are explained by theories are typically temporal regularities that are exhibited by certain but not by other kinds of objects or systems. These regularities describe empirical dispositions of these objects or systems.
The notion of “disposition” that is employed in this chapter is the notion of dispositions as a conditional (or functional) property: That an object x has a disposition D means that whenever certain initial or “trigger” conditions C are (or would be) satisfied for x, then a certain reaction R of x will (or would) take place. For example, that x has the disposition of being soluble in water (D) means that whenever x is put into water (C), x will dissolve (R). This understanding of dispositions is in accordance with the “received view” (cf. Prior et al., 1982). Formally, a strict disposition is defined by the condition ∀x(Dx ↔ ∀t(Cxt →n Rxt)), and a probabilistic disposition by ∀x(Dx ↔ (Pt(Rxt|Cxt) = high)), where “t” is the time variable, “→n” a nomological conditional, and “Pt” a statistical probability operator that binds the variable “t.”
Dispositional properties are contrasted with categorical properties that are characterized in terms of “occurrent” intrinsic structures or states (Earman, 1986, 94). Dispositional properties can have categorical properties such as molecular structures as their causes, but they are not identical with them.
The alternative view (e.g., Mumford, 1998) identifies dispositional with categorical properties. This alternative view stands in conflict with the fact that different dispositions can have the same molecular structure as their causal basis (see the example in Fig. 1).
[Figure 1 (diagram): Empirically given kinds of substances (sugar, salt, natron carbonate, copper sulfate) are linked by a unificatory abduction to a common explanation (common intrinsic “nature”: hydrophilic; theoretical model: dipolar molecules), which accounts for the correlated empirical dispositions of these substances: (special) x is soluble in water; (special) x is non-soluble in oil; (general) x is soluble in water-similar solvents (ammonia, ...); (general) x is not soluble in oil-similar solvents (benzene, ...); x has an increased melting point; x-solutions conduct electricity; x absorbs light at characteristic wavelengths.]
Fig. 1 Unificatory abduction of the hydrophilic nature of a substance as the common explanation of dispositions that are correlated with the solubility of a substance in water. Happened in the seventeenth century
Recall from section “Unification and Independent Testability: Two Rationality Criteria for Theory-Generating Abduction” that the introduction of one new entity or property merely for the purpose of explaining one empirical disposition is always speculative and ad hoc. Only if the postulated entity or property explains many (analytically independent but empirically correlated) dispositions and in this way yields an explanatory unification is it a legitimate scientific abduction that is worth further investigation. This is typically the case in scientific abductions to common “causes.” Schurz (2016) argues that the most important kind of theory-generating abduction in science is the explanation of correlated dispositions by common “causes.” For the time being, the notion of “cause” is taken in a loose sense, equivalent to “explanation”; a specification will be undertaken in later sections.
An illustrative example is the chemical explanation of the solubility of substances in water or other fluids. Already premodern (al)chemists recognized that certain kinds of chemical substances, despite different visual appearances, share certain chemical dispositions. For example, sugar, salt, natron carbonate, copper sulfate, and other substances are soluble in water. Their solubility in water is strongly correlated with many other empirical dispositions: solubility in all water-similar solvents, insolubility in oil, increased melting point, electric conductivity, characteristic light spectrum, etc. (see Fig. 1). Thus, if a substance has one of the dispositions in Fig. 1, then it has all of them. Chemists in the seventeenth century explained these correlated dispositions by the abduction of a common
intrinsic nature of these substances, which they called their hydrophilic nature. Having a hydrophilic nature figured as a unificatory explanation of these correlated dispositions. If m substances (S1, ..., Sm) have n dispositions (D1, ..., Dn) in common, then there are m·n elementary empirical laws of the form ∀x(Si x → Dj x) (1 ≤ i ≤ m, 1 ≤ j ≤ n). The hypothesis of their hydrophilic nature, abbreviated as ψH, explains them by m + n theoretical laws: ∀x(Si x → ψH x) for 1 ≤ i ≤ m and ∀x(ψH x → Dj x) for 1 ≤ j ≤ n. So the unificatory abduction reduces m·n empirical regularities to m + n theoretical laws.
Common cause abductions also satisfy the second rationality condition, independent testability by use-novel empirical content. The presence of merely one of the correlated dispositions functions as an indicator for the presence of the theoretical cause (at least under certain conditions). Thus, if a new empirical kind of substance has one of the correlated dispositions, then one can infer that it possesses the theoretical property and hence predict that it possesses the other dispositions. For example, if a new substance was observed to dissolve in water, one can predict that it will not dissolve in oil without ever having put it into oil before. If these novel predictions are observed to be true, this counts as strong confirmational success.
The “hydrophilic nature” of a substance was a natural kind concept of premodern chemistry, while the inner constitution of this property was hardly understood. The introduction of a new (theoretical) kind term is typically the first step in the development of a research program in the sense of Lakatos (1970). The next step consisted in the attempt to construct a theoretical model of the postulated theoretical “nature” that explains how the hydrophilic nature of a substance brings it about that this substance dissolves in water and in all other water-similar solvents.
A satisfying answer to this question had to wait until the atomic and molecular model of matter was established in the nineteenth century. The molecular model that explains the hydrophilic nature of a substance consists in its dipolar structure, i.e., the fact that the molecules have a positively and a negatively charged end. Since water molecules also possess such a dipolar structure, the latter may insert themselves easily between the molecules of the given substance, the plus-pole pointing toward the minus-pole and vice versa, and so the substance gets distributed at the molecular level, i.e., dissolves in water. The molecules of all “water-similar” fluids possess this polar structure. In contrast, the molecules of all oil-similar fluids as well as the molecules of substances that dissolve in oil are nonpolar. The dipolar structure provides straightforward explanations of the other correlated dispositions (electric conductivity, increased melting point, etc.; cf. Mortimer, 1986, ch. 12). Many more examples of scientific common cause abduction could be given, for example, abductions of chemical kind concepts such as metals, acids, bases, and salts, or Newton’s abduction of the gravitational force (cf. Schurz, 2016). Note that in principle, even a theory about God’s wishes could be enriched so that it acquires use-novel content, for example, by letting it predict that God allows only good things to happen (etc.). Empirically testable versions of creationism are possible; their problem, however, was that in the history of religion, they have been refuted by the empirical facts.
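The counting argument for the hydrophilic example (m·n empirical laws reduced to m + n theoretical laws) and the use-novel prediction for a new substance can be sketched in a few lines of Python. This is a minimal illustration under assumptions: laws are encoded as simple pairs, the lists of substances and dispositions are taken from the example above, and the helper names are hypothetical. The inference from an observed disposition to the theoretical property is the abductive step and is treated here as an assumption, not as a logical consequence.

```python
from itertools import product

substances = ["sugar", "salt", "natron carbonate", "copper sulfate"]   # m = 4
dispositions = [                                                        # n = 6
    "soluble in water",
    "soluble in water-similar solvents",
    "not soluble in oil-similar solvents",
    "increased melting point",
    "solutions conduct electricity",
    "absorbs light at characteristic wavelengths",
]

# m * n elementary empirical laws of the form "all S_i have D_j".
empirical_laws = set(product(substances, dispositions))

# m + n theoretical laws: "all S_i are hydrophilic", "all hydrophilic x have D_j".
kind_laws = {(s, "hydrophilic") for s in substances}
nature_laws = {("hydrophilic", d) for d in dispositions}

def derive_empirical_laws(kind_laws, nature_laws):
    """Chain S -> psi and psi -> D to obtain the entailed laws S -> D."""
    return {(s, d)
            for s, psi in kind_laws
            for psi2, d in nature_laws
            if psi == psi2}

def predict_dispositions(observed, nature_laws):
    """Abductive step (assumed): an observed disposition indicates the
    theoretical nature, which in turn predicts the remaining dispositions."""
    natures = {psi for psi, d in nature_laws if d == observed}
    return {d for psi, d in nature_laws if psi in natures and d != observed}

n_empirical = len(empirical_laws)
n_theoretical = len(kind_laws) + len(nature_laws)
print("empirical laws:", n_empirical)
print("theoretical laws:", n_theoretical)
print("unification quotient:", n_empirical / n_theoretical)
print("predictions for a new water-soluble substance:",
      sorted(predict_dispositions("soluble in water", nature_laws)))
```

With four substances and six dispositions, 24 empirical laws are recovered from 10 theoretical laws. The last call shows the use-novel content: from one observed disposition of a new substance, the remaining five are predicted before being tested.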
In the above examples, the simplest case of common cause abduction was discussed. If dispositions are merely statistically correlated, they often possess not only one but several common causes. A statistical generalization of common cause abduction that goes in this direction is statistical factor analysis. Here a large number of correlated empirical variables are unified by a small number of hypothetical factors, each of them “explaining” a certain amount of the variance between the variables. However, these factors are not always interpreted causally; often a purely instrumentalistic interpretation of them is preferred (for details cf. Haig, 2005; Schurz, 2016, Sect. 6).
Instrumentalistic Justification of Theory-Generating Abduction
In this section the epistemological status of the inferred theoretical “causes” is discussed. For the time being, “common causes” are understood in the loose sense of “common explanations,” since so far a precise understanding and justification of the cause-effect relation has not been established (this will be the topic of the sections “Justifying Theory-Generating Abduction by Causality” and “Abductive Justification of Causality”). It is not clear whether these common theoretical explanations may only be justified instrumentalistically, by meta-induction over their empirical success, or realistically as unobservable causes of the correlated empirical dispositions. In the instrumentalistic justification, one is not directly evaluating the success of the hypothesis, but the success of its empirical consequences, assessed in regard to the two rationality conditions of unified explanation and confirmation by independent (use-novel) tests. What a superior instrumentalistic success unproblematically implies are high truth chances of the empirical predictions of the hypotheses, but not necessarily high truth chances of the theoretical part of the hypothesis.
The empirical success of a theoretical model does not force us to believe in the truth of the theoretical part of the model. It leaves room for the instrumentalistic position in philosophy of science, exemplified by the empiricism of van Fraassen (1989). According to this position, one is warranted to believe in the empirical adequacy of scientific theories (their past and future predictive success), but not to believe their realistic truth. For example, an instrumentalistic physicist will believe in the reality of planets and their trajectories in our solar system, but not in the reality of gravitational forces.
Although these forces figure as a common explanation of the trajectories of planets, they are not considered as “real causes” but merely as efficient instruments for predicting planetary orbits.
Is the instrumentalistic justification all that an empiricist abductivism can provide for theories? For the justification of the realistic truthlikeness of successful theories, stronger principles or assumptions than that of instrumentalistic success evaluation are needed. A well-known justification attempt is Putnam’s no miracle argument (1975, 73), which says that without the assumption of its realistic truth, a theory’s instrumental success would be as improbable as a miracle. There are many objections to the miracle argument. One is Laudan’s pessimistic meta-induction (Laudan, 1981), pointing out that most theories in the history of science
that were highly successful at their time have later turned out to be false in their theoretical part. Therefore on meta-inductive grounds, one should not expect that the theoretical part of presently successful theories is close to the truth (cf. Carrier, 2003, Sect. 7–8; Schurz, 2009, Sect. 2).
There is even an instrumentalistic method of designing an empirically equivalent quasi-theory that is equally successful and simple as a given theory T, but does not postulate any proper theoretical concepts at all. This is the instrumentalistic Ramsey sentence, R(T). It is obtained from a theory T(ψ1, ..., ψk) with theoretical concepts ψ1, ..., ψk by existentially quantifying over them, that is, by replacing T with R(T) =def ∃X1 ... ∃Xk T(X1, ..., Xk), where the existential quantifiers range solely over mathematical entities (Ramsey, 1931, 212–215). R(T) is provably empirically equivalent to T (Ketland, 2004, 293, th. 3), but it does not postulate any real entities or properties going beyond the observable; it attributes mere conceptual existence to T’s theoretical constructs.
The Ramsey sentence seems to be a victory of the instrumentalistic position. What can be said against it? One may object that Ramsey instrumentalism is a sort of cheating because Ramsey instrumentalists use an isomorphic copy of the realistic model; they just don’t attribute real existence to its non-experiential concepts. Ramsey instrumentalists can reply that this objection ignores the crucial difference that the conceptual entities of the Ramsey sentence are not causally connected with the subject’s experiences; the concept of causality is missing in the instrumentalistic worldview. This brings us to the next section, in which the instrumentalistic challenge will be met by introducing a weak principle of causality.
Justifying Theory-Generating Abduction by Causality
To defend theory realism against the Ramsey sentence, it has to be asked: What objective reasons can be given at all to believe in the realistic truthlikeness of a theory’s theoretical part? An answer comes from the theory of causality. Indeed, in the Enlightenment philosophy before David Hume, it was common to justify the inference from perceptual impressions to real objects by a principle of deterministic causality that was called the principle of sufficient reason: everything must have a sufficient cause; so if there cannot be found a sufficient cause among the observables, an unobservable sufficient cause must be postulated.
In the period that followed, causality had a tough time. The principle of causality was undermined by Hume’s critique (1748, ch. 4, 6), which argued that causality is a metaphysical illusion. All that one can observe is the temporal succession of two events, but not their connection as cause and effect: nothing in our observation shows that the first event “causes,” “produces,” or “necessitates” the second event. From a contemporary view, one can resist Hume’s objection by understanding causality as a theoretical concept, but this view was out of reach in Hume’s time.
Later the principle of deterministic causality was refuted by the development of quantum physics. The principle of sufficient reason assumes (wrongly) that every event must have a sufficient or deterministic cause, but quantum physics showed
that there are genuinely indeterministic processes in nature, whose outcomes have merely statistical causes. An example is radioactive decay: it is objectively undetermined when a Cesium-137 atom will decay; only its half-life of 30 years is determined, within which it decays with a probability of 0.5. However, the refutation of determinism does not imply the rejection of causality altogether. There is a widely accepted causality principle in contemporary sciences: the Markov principle of causality. This principle does not require for each event a sufficient cause but hooks in at a more general level. It says, roughly, that every (nonaccidental) statistical or deterministic correlation is the result of a causal connection. To justify theory-generating abduction, only the following uncontroversial weak consequence of the full Markov principle is needed. It is explicated in terms of “variables” in the explained mathematical sense of this word, i.e., functions from a domain into a value space ranging over properties or events, e.g., “the-color-of-something,” “the-mass-of-something,” etc.
Causality principle (C): If two variables X, Y are correlated, then either one is a cause of the other variable, or both variables are the effects of a common cause.
That “X is a cause of Y” only means that X is a partial cause of Y, i.e., a causally relevant factor, but not necessarily the “full” cause of Y. Partial cause-effect relations are displayed by means of directed arrows between variables, X → Y. They are understood as real directed relations between these variables, but their specific ontological nature (forces, processes, or whatever) is left open. The full Markov principle says that every variable X is probabilistically independent of its non-effects, conditional on its parents (Spirtes et al., 2000, Sect. 3.4.1–2; Pearl, 1997, Sect. 3.2–3). A controversial part of the full Markov principle is its implication that common causes must always screen off their effects.
For so-called “interactive” common causes that occur in quantum systems or decay processes, this condition may be violated (cf. Cartwright, 2007, 12; Schurz, 2017; Hitchcock & Rédei, 2021, Sect. 5). However, the principle (C) does not entail that screening off must always be obtained; it also admits interactive causes. Let us turn to the justification of theory-generating common cause abduction. Principle (C) implies that if two correlated variables X1 and X2 cannot be related in the form of a direct or indirect cause-effect relation (via a directed chain X1 → . . . → X2 or X2 → . . . → X1 ), then they must be the joint effects of a common cause. Precisely this is the case if the correlated variables designate dispositions, as in most common cause abductions. Recall that a dispositional property D(x) is a (non-accidental) temporal regularity, saying that whenever an object x is put in certain initial or trigger conditions A, x exhibits a certain reaction R, abbreviated as A → R (reference to x and to time indices is omitted). Now, because the trigger conditions of two dispositions are (at least normally) mutually independent, they cannot be related as cause and effect. So they must be related by a common cause, and since no such cause is observed, it must be an unobserved or unobservable cause expressed by a theoretical concept ψ.
To explain this point, it has to be clarified what could be meant by saying that a disposition D1 is “caused” by a theoretical property ψ or by another disposition D2 . The disposition Di is itself a regularity relying on a causal connection, designated as Ai → Ri (since Ri can be brought about by producing Ai , it follows from the Markov condition that this is the only possible causal interpretation of the Ai -Ri regularity). But only events, not causal connections, can be the direct object of a causal relation. That a theoretical property ψ causes the disposition Di , short for Ai → Ri , can only mean that ψ together with Ai causes Ri . The causal graph underlying this situation is depicted in Fig. 2a; the symbol “&” indicates that the two causal arrows are conjunctively connected. Now, what could it mean that a disposition D1 (A1 → R1 ) causes a disposition D2 (A2 → R2 )? The only way in which A1 → R1 can make A2 → R2 happen is by means of the causal chain A2 → A1 → R1 → R2 , which is displayed in Fig. 2b. However, the antecedent conditions of dispositions are independent and can be independently varied by human interventions. For example, that a substance is put into water, given into oil, and set under electric voltage are independent conditions that are not connected by causal arrows. Therefore, the causal structure in Fig. 2b is impossible (which is indicated by the big cross “×”). Thus, if the two correlated dispositions have independent antecedent conditions, the causality principle (C) entails that they must be causally connected by way of a hidden common cause, ψ, which is present in all those objects that possess the correlated dispositions. The causal structure underlying this situation is illustrated in Fig. 3. In conclusion, if the causality principle (C) is accepted, then the abductive inference of a theoretical common cause has a rigorous foundation. 
It is no longer a matter of intuition or search for unification, but is entailed by principle (C) together with the independence of the dispositions’ triggering conditions. The causality principle (C) justifies merely the abduction to the existence of a cause or network of causes explaining correlated dispositions. Thereby this principle answers the challenge of the instrumentalists, why one should postulate theoretical entities at all, in order to explain what is observable.
Fig. 2 Explaining the causation relation for dispositions. (a) Intrinsic property ψ causes a disposition Di that stands for Ai → Ri (“&” for conjunctive connection). (b) Disposition D1 causes a disposition D2; this is impossible, since A1 and A2 are independent variables
Fig. 3 Explanation of two correlated dispositions A1 → R1 and A2 → R2 by a common cause ψ
Fig. 4 More complex common cause explanations of correlated dispositions
Principle (C), however, does not tell whether one should choose this rather than that causal model. Instead of one common cause, as in Fig. 3, one may also postulate a network of causes. For example, in Fig. 4a, two correlated dispositions are explained by two independent common causes, and in Fig. 4b, three correlated dispositions are explained by two common causes that have themselves a common cause (the interaction between the arrows pointing toward Ri would be more complicated in these cases, except that Ai would always be a necessary condition). In a similar vein Quine (1960, 141ff.) has shown that given an empirically successful theory T, one can always construct theories T′, T″, ... that are empirically equivalent with T but have a (radically) different theoretical superstructure. This problem will be studied in the next section, where common cause abduction is applied to the justification of perceptual realism.
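The common cause structure of Fig. 3 can be checked numerically on a toy model. The following Python sketch rests on assumptions that go beyond the text: the dispositions are strict (Ri occurs exactly when ψ and Ai are both present), and ψ, A1, and A2 are independent binary variables with probability 1/2 each. These numbers, and all names, are purely illustrative. Exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction
from itertools import product

PSI, A1, A2, R1, R2 = range(5)   # positions within a world tuple
HALF = Fraction(1, 2)            # assumed probability of psi, A1, A2

def joint_distribution():
    """Map each world (psi, a1, a2, r1, r2) to its probability, assuming
    the structure of Fig. 3 with strict dispositions R_i = psi & A_i."""
    dist = {}
    for psi, a1, a2 in product([0, 1], repeat=3):
        world = (psi, a1, a2, psi & a1, psi & a2)
        dist[world] = HALF * HALF * HALF
    return dist

def probability(dist, event, given=lambda w: True):
    """Conditional probability P(event | given) over the finite distribution."""
    numerator = sum(p for w, p in dist.items() if given(w) and event(w))
    denominator = sum(p for w, p in dist.items() if given(w))
    return numerator / denominator

dist = joint_distribution()

def r1(w): return w[R1] == 1
def r2(w): return w[R2] == 1
def both_reactions(w): return r1(w) and r2(w)
def both_triggers(w): return w[A1] == 1 and w[A2] == 1
def has_psi(w): return w[PSI] == 1

# The trigger conditions are independent of each other ...
triggers_independent = (
    probability(dist, both_triggers)
    == probability(dist, lambda w: w[A1] == 1) * probability(dist, lambda w: w[A2] == 1)
)
# ... yet the reactions are correlated ...
reactions_correlated = (
    probability(dist, both_reactions)
    != probability(dist, r1) * probability(dist, r2)
)
# ... and the correlation vanishes once the common cause psi is held fixed.
screened_off = (
    probability(dist, both_reactions, has_psi)
    == probability(dist, r1, has_psi) * probability(dist, r2, has_psi)
)

print(triggers_independent, reactions_correlated, screened_off)  # True True True
```

In this toy model the common cause happens to screen off its effects. As noted above, principle (C) itself does not require this; interactive common causes would violate the last equality.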
Abductive Justification of Perceptual Realism: Normal World Versus Brain in the Vat
The importance of theory-generating common cause abduction in science is undoubted. More controversial is the transfer of this type of inference to problems in philosophy. While some philosophers accept theory-generating abductions only in the natural sciences but not in metaphysics (Beebee, 2018; Ladyman, 2012; Saatsi, 2017), the author of this chapter belongs to those who advocate this transfer (e.g., Paul, 2012; Armstrong, 1983; Williamson, 2016). Thereby, the notion of a “theory” is extended from science to metaphysics. Metaphysical theories are similar to theories in science in going beyond the observable to offer unified explanations of
the observable. What distinguishes metaphysical concepts from theoretical concepts in science is their more general and transdisciplinary nature. For example, while the notions of mass and force belong to physics, the more general notions of reality and cause belong to metaphysics. In this section, the abductive inference to common causes is applied to the justification of a fundamental metaphysical theory: perceptual realism (for like-minded arguments cf. Moser, 1989, 98; Vogel, 1990; Niiniluoto, 2018, 152). The explanandum of this abduction comprises the internal regularities of our sense experiences, in particular our visual experiences. A (stationary) object can be perceived from potentially infinitely many perspectives, each leading to the experience of a two-dimensional visual image. These regularities of our perceptions can be considered as dispositions of certain locations in our internal visual field, having the form “If I look from this-and-that viewpoint, a visual 2D impression of this-and-that shape appears.” These 2D images are mutually correlated depending on the viewer’s position and the angle of gaze direction. Now, all of these mutually correlated visual dispositions have a common cause explanation in terms of a subject-external three-dimensional object causing the visual impressions by the laws of perspectival projection. Of course, the inferences from visual 2D impressions to external 3D objects are mainly unconscious; only in certain situations is the constructive character of these processes revealed.
Nevertheless, they are theory-generating abductions at the lowest epistemological level: While their premises contain only introspective concepts (“from visual position p, I experience at point x of my visual field the 2D shape G2”), their conclusion asserts a realistic proposition (“at point x of the external space, there is the 3D figure G3, such that G2 is the perspectival projection of G3 onto the image plane behind the eye point p along the vector $\overrightarrow{xp}$”). The explanation of the regularities of our visual sense experiences by common 3D objects is not only highly unifying, but it is also rich in independently testable predictive content. By the perspectival projections of the abduced 3D objects, I can predict from my visual 2D image of the object from one or a few viewpoints how the object will appear from any other viewpoint, and I can compute how I have to move to get to the object or sidestep it. Moreover, if the object itself is moving, a few perceptions of its time-dependent position are enough to predict the object’s trajectory and its viewer-dependent visual appearance at later times. The visual cortex of our brain is a powerful prediction machine that computes these abductions in split seconds (Clark, 2013). In the contemporary era of AI, the underlying mathematical algorithms are employed in the computer programming of animated fantasy films. A second important basis for abduction to the external reality is the intersensual correlations between different sensual experiences, in particular between visual and tactile perceptions. Grasping the seen objects is the baby’s primary mechanism for discerning its own body from the outside world and for coordinating its visual perceptions and motor activities. Even in adulthood, when humans have the visual appearance of a supposed object whose reality they are not sure of, they go to the object and try
to touch it: if the “touching test” succeeds, their realistic desires are satisfied, but if it does not, they become scared by ghost fantasies. The third basis for the abduction toward external reality is the intersubjective correlations between visual observations, i.e., the fact that different subjects have the same visual experiences when looking at the same location at the same time. In conclusion, the hypotheses of external 3D objects (making up perceptual realism) appear to be the best explanation of the intrasensual, intersensual, and intersubjective correlations between our position-dependent visual experiences. Even more, they seem to be the only plausible explanation of these experiential facts. Can this claim be maintained? The only possible alternative explanations of minimal plausibility would be those that assume some other subject-external cause of the introspective regularities – one that has a rather different structure than the normal world (NW) hypothesis of perceptual realism. A famous example of this sort is the brain-in-a-vat (BIV) scenario. According to this scenario, we are brains in a vat wired to a supercomputer. The supercomputer sends signals to our brain causing in us exactly those experiences that make us believe the NW hypothesis, i.e., that we are human beings walking around, etc. Since the BIV and the NW hypotheses are empirically equivalent, neither of the two can be epistemically preferable, so the skeptical argument goes. Does this mean that an empirically oriented abductivist should remain agnostic with respect to the truth of common sense realism? If the instrumentalistic stance is applied to the epistemology of perception, then it leads to the viewpoint of positivism. Indeed, positivism suspends judgment with respect to the truth of perceptual realism and the decision between the normal world and the BIV explanation.
The positivist believes in the predictions of subject-internal experiences, but suspends judgment in regard to the truth of realism. Several epistemologists have argued that the NW hypothesis is simpler than the BIV hypothesis (BonJour, 2003, 94; Vogel, 1990, 662; Shogenji, 2018, ch. 7). Unfortunately, the concept of simplicity is notoriously vague; it has to be made precise. Since entities or statements can themselves be simple or complex, merely counting them cannot produce a robust notion of simplicity. In this chapter it is proposed to base the simplicity measure on the representation of a hypothesis H by its elementary axioms; they are given by a set of (nonredundant) content elements of H that is logically equivalent with H and figures as an axiomatization of H. Based on this notion, a hypothesis counts as simpler the fewer elementary axioms are needed for its axiomatization and the shorter these axioms are. Should this simplicity measure be employed as a ceteris paribus preference criterion? In other words, should one attribute the highest degree of justification to the simplest hypothesis among all empirically equivalent hypotheses? Simplicity can certainly figure as a preference criterion for instrumentalistic justification. However, for a realistic justification – one that is oriented toward the truth chances of the theoretical part of the hypothesis or model – the simplicity criterion just explained seems too weak. A robust probabilistic preference among competing hypotheses with equal empirical content does not follow from the simplicity criterion, because the hypotheses’ posterior probability depends, besides their likelihoods, on their prior
probabilities, and the latter ones are largely subjective: why should all elementary axioms have the same prior probability? Even if hypothesis H1 has 10 and H2 has 12 elementary axioms (all of equal length), the proponent of H2 may find H2 more probable because its axioms are more intuitive. A different situation is given if the elementary axioms of the simpler hypothesis H1 are a subset of those of H2, i.e., if Ax(H2) = Ax(H1) ∪ X. In this case, it holds for every nondogmatic prior probability distribution P that P(H1) > P(H2) (since P(Ax(H2)) = P(Ax(H1))·P(X|Ax(H1)) and P(X|Ax(H1)) < 1). However, this case is not given in the comparison between the NW and BIV hypotheses because both speak about different entities. Nevertheless the situation is closely related. If one considers how the supercomputer manages to produce in our brain the same sense impressions as those that ordinary reality produces in us, then the only plausible explanation seems to be that the computer generates 3D objects, stored as bit sequences, that are structurally identical to those that we believe we see in the normal world and computes their perspectival projections (that it sends to the brain) in the same way as they are cast by light rays onto the retinas of the eyes in the normal world (cf. Vogel, 1990, 664; Chalmers, 2005). Thus, the BIV hypothesis is parasitic on the NW hypothesis: it contains an approximately isomorphic copy of the NW hypothesis but makes additional assumptions about wiring our brain to the supercomputer. This makes it plausible to ascribe to the BIV hypothesis a much lower prior probability than to the NW hypothesis. More generally, what these considerations imply is that if one theoretical model is only mildly superior to another model with an incompatible theoretical superstructure, it may indeed be wise to suspend judgment regarding the realistic truth question.
In some scientific areas, this situation indeed seems to be the case, for example, in some domains of contemporary theoretical physics. A realistic justification of a particular theoretical model can only be given if this model is by far superior to all accessible alternative theoretical models, in the sense that the alternatives either involve structurally similar submodels loaded with additional complications (lowering their prior probability), or they have a significantly inferior success (or likelihood). This precondition is satisfied in many scientific theories, as, for example, (i) the theory of the atomic structure in physics, (ii) the theory of the periodic system in chemistry, (iii) the theory of bondings and molecules in chemistry, (iv) the theory of evolution in biology, etc. This precondition is satisfied even more clearly for everyday perceptual realism, when one compares the normal world model of perceptual realism with the brain-in-a-vat model.
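The probabilistic point about the subset case, Ax(H2) = Ax(H1) ∪ X, can be checked with a few lines of arithmetic. The sketch below uses purely hypothetical numbers (a prior of 0.2 for Ax(H1), a conditional prior of 0.01 for the additional axioms X, and a shared likelihood of 0.9); none of these values comes from the chapter. It also assumes, for simplicity, that H1 and H2 are the only rivals under consideration, so that the posterior is normalized over these two alone. It shows that for empirically equivalent hypotheses the likelihoods cancel, so the posterior ratio equals the prior ratio.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of rival hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}


# Hypothetical numbers, chosen only for illustration.
p_ax_h1 = 0.2        # P(Ax(H1)): prior of the simpler hypothesis
p_x_given_h1 = 0.01  # P(X | Ax(H1)) < 1: prior of the extra axioms X
p_ax_h2 = p_ax_h1 * p_x_given_h1  # P(Ax(H2)) = P(Ax(H1)) * P(X | Ax(H1))

priors = {"H1": p_ax_h1, "H2": p_ax_h2}
likelihoods = {"H1": 0.9, "H2": 0.9}  # empirically equivalent: equal likelihoods

post = posterior(priors, likelihoods)
prior_ratio = priors["H1"] / priors["H2"]
posterior_ratio = post["H1"] / post["H2"]

print("prior ratio     P(H1)/P(H2)     =", prior_ratio)
print("posterior ratio P(H1|E)/P(H2|E) =", posterior_ratio)
```

The size of the gap depends entirely on the assumed value of P(X|Ax(H1)), which is exactly the quantity the chapter describes as largely subjective outside the strict subset case.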
Abductive Justification of Causality
The causality principle (C) is essential to get the realistic justification of theory-generating abduction started: if we observe correlated phenomena whose correlation cannot be explained by a directed causal relation among them, then there must be “hidden” (theoretical) common causes. However, the causality principle (C) is not a priori. Thus, the pressing question becomes how causality can be justified. To
withstand Hume’s skeptical challenge, one has to answer the question of why cause-effect relations are needed at all, instead of simply accepting lawlike regularities as primitive facts. Different answers to this question have been given in the literature that cannot be treated here (cf. Beebee et al., 2009). Schurz and Gebharter (2016) have developed an abductive justification that considers causality as a theoretical concept, similar to the physical concept of force, with the difference being that causality does not belong to a physical but to a transdisciplinary metaphysical theory (Schurz, 2021). On pain of circularity, the abductive justification of causality must remain noncausal, being solely based on the general rationality conditions of unification and independent testability. It follows that theory-generating abductions must not be restricted to causal abductions. This consequence may seem unsatisfying. Nevertheless, it seems to be an achievement if it can be shown that the acceptance of merely one fundamental abduction, that of principle (C), provides a justificatory basis for theory-generating abductions of all other sorts. The crucial question that an abductive justification of causality must answer is: What does causality explain? The answer cannot be that every empirical regularity is explained by a corresponding causal power. As follows from Section 2, such an “explanation” could neither achieve unification nor generate use-novel empirical content. Causality is also not needed to explain why observed regularities are inductively projectable, as some philosophers have suggested (cf. Fales, 1990, ch. 4). The inductive projectability of regularities is already explained by assuming that they are backed up by lawlike connections. Causality goes beyond inductive projectability or lawlikeness. The regularities that connect the joint effects of a common cause are perfectly lawlike too, although they are noncausal.
An example is the correlation between lightning and thunder: neither is lightning the cause of the thunder nor vice versa, but both are the joint effects of an electric discharge in the atmosphere. According to Schurz and Gebharter (2016), cause-effect relations yield the best available explanation for two otherwise mysterious (in)stability properties of correlations in regard to conditionalization: screening off and linking up.
Explaining screening off: Often two kinds of events X, Y are probabilistically dependent (or correlated), but they become probabilistically independent if one conditionalizes their probability distribution on some fixed but arbitrary value of a third variable Z, i.e., one considers only individuals that have the same Z-value. In this case, the third variable Z is said to screen off the two variables X and Y from each other.
Two examples of correlated variables X and Y being screened off by a third variable Z:
1. Barometer reading (X), Storm coming (Y), Atmospheric pressure (Z).
2. Light switch (X), Light bulb (Y), Electric current (Z).
A low barometer reading makes a coming storm highly probable, but if the atmospheric pressure is fixed, for example, to the value “high,” then the correlation between X and Y disappears, and a low barometer reading merely indicates a nonfunctioning barometer instead of a coming storm. Likewise, a light switch in the on-position correlates with the lighting of the bulb, but if the flow of electric current is fixed, for example, to the value “not-flowing,” then the bulb is not lit even if the switch is on. Intuitively we believe we “know” that the screening off occurs in (1) because Z is a common cause of X and Y and in (2) because Z is an intermediate cause in between X and Y. However, to achieve a philosophical justification of causality, we must free our minds from prefabricated causal intuitions. But then we are confronted with a riddle: Why does the X-Y correlation disappear when Z’s values are fixed? The best available explanation of robust screening off phenomena – the only good explanation we can think of – is the following: the correlations between Z and X and between Z and Y reflect direct causal connections (“direct” in relation to the set of variables {X,Y,Z}), but the correlation between X and Y is not direct but mediated (or transmitted) by Z. The situation is depicted in Fig. 5.
Fig. 5 Explanation of screening off by binary causal relations (“···” stands for “probabilistic dependence” and “−” for “direct causal connection”)
In the full domain, different X-values correlate with different Y-values, but if one conditionalizes on a subdomain with constant Z-values, different X-values will no longer correlate with different Y-values, since the probabilistic dependence is no longer transmitted from X to Y. To make this abductive justification strong, one must show that the given explanation is the “only plausible” one; alternative attempts at explaining screening off either fail completely or involve enormous complications. Arguments to this effect are given in Schurz (2015); here it is sufficient to demonstrate the failure of duplication accounts. They come in two versions: (i) Humean-reductionistic (causality is “nothing but” correlation) and (ii) naive-metaphysical (every correlation is “backed up” by a causal connection). In Fig.
5, duplication accounts would also postulate a direct causal connection between X and Y, but then the fact that Z screens off X from Y must remain mysterious. Only the assumption that not all correlations correspond to direct causal connections can explain screening off. Explaining screening off requires merely the assumption of an undirected binary “causal” dependence relation; no direction of causation is needed so far. Directed causation is required for differentiating screening off from linking up, which is the topic of the next paragraph. Explaining linking up. In this case, three variables X, Y, and Z are characterized by a probability distribution that features exactly the opposite (in)stability properties of screening off: X and Y are probabilistically independent, but they become probabilistically dependent when one conditionalizes on the values of the third variable Z. This phenomenon is called “linking up”:
Example of two uncorrelated variables X and Y linked up by a third variable Z: Angle of the sun (X), Height of a tower (Y), Length of its shadow (Z).
The angles of the sun (at different time points) and the heights of towers are obviously uncorrelated, but they become correlated if the shadow is fixed. For instance, if the tower’s shadow is long, then one can infer that the solar altitude must be low if the tower is short. If we put aside prefabricated causal intuitions, then we face a second riddle: Why do two formerly independent variables X and Y become correlated when we conditionalize on certain Z-values? To explain linking up, Z must again act as a mediator between X and Y. So the undirected causal relations in the linking up scenario must be the same as in the screening off scenario in Fig. 5. But the underlying causal structure cannot possibly be the same, because screening off and linking up feature opposite probabilistic (in)stability effects. Thus undirected causal relations cannot explain both screening off and linking up. The best available explanation for screening off and linking up – again the only good explanation we can think of – is to assume that causal relations are directed. Let “X → Y” express that X exerts a direct causal influence on Y; then screening off and linking up can be simultaneously explained as follows. In both cases, Z mediates between X and Y. With directed causal arrows, there are three possible directed causal structures having X-Z-Y as undirected causal graph. The first two structures explain screening off, and the third one explains linking up:
(a) X → Z → Y (or X ← Z ← Y), i.e., Z is an intermediate cause (between X and Y). Corresponding explanation: Here, Y correlates with X because changes of X-values cause changes of Z-values that in turn cause changes of Y-values.
(b) X ← Z → Y, i.e., Z is a common cause (of X and Y). Corresponding explanation: Here, X correlates with Y because changes of X-values are caused by changes of Z-values, which also cause changes of Y-values. In both cases (a) and (b), the X-Y-correlation vanishes by conditionalization on Z.
(c) X → Z ← Y, i.e., Z is a common effect (of X and Y). Corresponding explanation: Here, a change of X-values causes a change of Z-values, which, however, is not accompanied by a change of Y-values, because value changes are not transmitted from an effect to its cause (only from a cause to its effect). Thus Y is not correlated with X. Fixing Z to certain values will render X and Y correlated, as explained in the sun-tower-shadow example.
Schurz and Gebharter (2016) demonstrate that the generalization of these three explanatory principles leads to the theory of causal Bayes nets, as axiomatized by the causal Markov condition plus the condition of minimality. This theory offers highly unified explanations of the statistical phenomena of screening off and linking up. The authors also investigate the question of independently testable empirical content, with the result that if the core axioms of causality are enriched by additional principles such as faithfulness, temporal forward directedness, or intervention conditions, then the metaphysical theory of causality will acquire rich
use-novel empirical content that allows its independent testability. In this way, the metaphysical theory of causality receives a justification meeting scientific standards.
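The opposite (in)stability profiles of screening off and linking up can be reproduced numerically. The following sketch is an illustration under simplifying assumptions that are not taken from the chapter or from Schurz and Gebharter (2016): all variables are binary, a cause transmits its value to its effect with a hypothetical 10% flip probability, and in the common effect structure Z is simply the logical OR of X and Y. The function names (`pearson`, `noisy_copy`, `sample`, `correlations`) are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def noisy_copy(rng, value, flip=0.1):
    """Return the binary value, flipped with probability `flip`."""
    return value ^ int(rng.random() < flip)


def sample(structure, rng):
    """Draw one (x, y, z) triple from a directed structure over X-Z-Y."""
    if structure == "chain":            # (a) X -> Z -> Y
        x = rng.randint(0, 1)
        z = noisy_copy(rng, x)
        y = noisy_copy(rng, z)
    elif structure == "common_cause":   # (b) X <- Z -> Y
        z = rng.randint(0, 1)
        x = noisy_copy(rng, z)
        y = noisy_copy(rng, z)
    elif structure == "common_effect":  # (c) X -> Z <- Y
        x = rng.randint(0, 1)
        y = rng.randint(0, 1)
        z = x | y
    else:
        raise ValueError(structure)
    return x, y, z


def correlations(structure, n=20000, seed=0):
    """Return (corr(X, Y), corr(X, Y | Z = 1)) estimated from n samples."""
    rng = random.Random(seed)
    data = [sample(structure, rng) for _ in range(n)]
    fixed = [(x, y) for x, y, z in data if z == 1]  # conditionalize on Z = 1
    overall = pearson([d[0] for d in data], [d[1] for d in data])
    given_z = pearson([x for x, _ in fixed], [y for _, y in fixed])
    return overall, given_z


if __name__ == "__main__":
    for s in ("chain", "common_cause", "common_effect"):
        overall, given_z = correlations(s)
        print(f"{s:14s} corr(X,Y)={overall:+.2f}  corr(X,Y | Z=1)={given_z:+.2f}")
```

In structures (a) and (b) the X-Y correlation is clearly positive in the full sample and close to zero once Z is held fixed, whereas in structure (c) it is close to zero in the full sample and becomes clearly negative once Z is held fixed. The sketch conditionalizes only on Z = 1, because with Z defined as a logical OR the subsample with Z = 0 contains no variation in X or Y.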
Conclusions
Theory-generating abductions introduce new concepts into their conclusion, i.e., concepts that are not contained in their premises. This kind of abduction underlies all uncertain inferences from empirical facts or regularities to theoretical hypotheses that explain these facts or regularities in terms of unobserved or unobservable entities and properties, expressed by theoretical concepts nomologically related to the observed facts. In section “Unification and Independent Testability: Two Rationality Criteria for Theory-Generating Abduction” the difficult challenge of discriminating theory-generating abductions from post-facto speculations without testable content was taken up. A discrimination was proposed based on two scientific rationality criteria: unification and independent testability. In section “Common Cause Abduction in Science” a form of theory-generating abduction that is particularly important in science was introduced: common cause abductions that explain correlated empirical dispositions in terms of common “theoretical causes.” In section “Instrumentalistic Justification of Theory-Generating Abduction” the challenge of the instrumentalistic position in philosophy of science was taken up. According to this position, the “theoretical causes” are mere instrumentalistic devices for a unified description and systematization of empirical regularities, but the assumption of their realistic reference is not warranted. This challenge was answered in the section that followed, where the real existence of common causes of correlated dispositions was justified by a weak principle of causality. In section “Abductive Justification of Perceptual Realism: Normal World Versus Brain in the Vat” common cause abduction was applied to the metaphysical theory of perceptual realism. The problem of establishing a robust preference criterion for choosing between empirically equivalent causal models was discussed by means of the comparison of the normal world model and the brain-in-a-vat model. Finally, in section “Abductive Justification of Causality”, it was proposed that the principle of causality could itself be justified by a noncausal abduction based on the explanation of two statistical phenomena: screening off and linking up.
Acknowledgments This work was supported by the DFG (Deutsche Forschungsgemeinschaft), research unit FOR 2495.
References
Armstrong, D. M. (1983). What is a law of nature? Cambridge University Press.
Barnes, E. (1995). Inference to the loveliest explanation. Synthese, 103, 251–277.
Beebee, H. (2018). Philosophical skepticism and the aims of philosophy. Proceedings of the Aristotelian Society, 118(1), 1–24.
Beebee, H., Hitchcock, C., & Menzies, P. (Eds.). (2009). The Oxford handbook of causation. Oxford University Press.
BonJour, L. (2003). A version of internalist foundationalism. In L. BonJour & E. Sosa (Eds.), Epistemic justification (pp. 3–96). B. Blackwell.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference (2nd ed.). Springer.
Carnap, R. (1956). The methodological character of theoretical concepts. In H. Feigl & M. Scriven (Eds.), Minnesota studies in the philosophy of science (Vol. I, pp. 38–76). University of Minnesota Press.
Carrier, M. (2003). Experimental success and the revelation of reality: The miracle argument for scientific realism. In P. Blanchard et al. (Eds.), Science, society and reality (pp. 137–161). Springer.
Cartwright, N. (2007). Hunting causes and using them. Cambridge University Press.
Chalmers, D. J. (2005). The matrix as metaphysics. In C. Grau (Ed.), Philosophers explore the matrix (pp. 132–176). Oxford University Press.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.
Douven, I. (2011). Abduction. In Stanford encyclopedia of philosophy (March 09, 2011). http://plato.stanford.edu/entries/abduction/
Earman, J. (1986). A primer on determinism. Reidel.
Fales, E. (1990). Causation and universals. Routledge.
Feigl, H. (1970). The orthodox view of theories: Remarks in defense as well as critique. In Minnesota studies in the philosophy of science (Vol. IV). University of Minnesota Press.
Fine, K. (2017). A theory of truthmaker content. Journal of Philosophical Logic, 46, 625–674 (Part I), 675–702 (Part II).
Van Fraassen, B. (1989). Laws and symmetry. Clarendon Press.
French, S. (2008). The structure of theories. In S. Psillos & M. Curd (Eds.), The Routledge companion to philosophy of science (pp. 269–280). Routledge.
Friedman, M. (1974). Explanation and scientific understanding. Journal of Philosophy, 71, 5–19.
Gabbay, D. M., & Woods, J. (2005). The reach of abduction. North Holland.
Gemes, K. (1994). A new theory of content I: Basic content. Journal of Philosophical Logic, 23, 595–620.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.
Glymour, C. (1981). Theory and evidence. Princeton University Press.
Grünbaum, A. (1976). Ad hoc auxiliary hypotheses and falsificationism. The British Journal for the Philosophy of Science, 27, 329–362.
Grünwald, P. (2000). Model selection based on minimal description length. Journal of Mathematical Psychology, 44, 133–152.
Haig, B. (2005). Exploratory factor analysis, theory generation, and scientific method. Multivariate Behavioral Research, 40(3), 303–329.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.
Hempel, C. G. (1951). The concept of cognitive significance: A reconsideration. Reprinted in Hempel (1965), ch. II.1.
Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. Free Press.
Hempel, C., & Oppenheim, P. (1948). Studies in the logic of explanation. Reprinted in Hempel (1965), pp. 245–290.
Hesse, M. (1976). Models and analogies in science. University of Notre Dame Press.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles Sanders Peirce Society, XXXIV(3), 503–533.
Hitchcock, C., & Rédei, M. (2021). Reichenbach’s common cause principle. In The Stanford encyclopedia of philosophy (Summer 2021 Edition). plato.stanford.edu/archives/sum2021/entries/physics-Rpcc/
Hitchcock, C., & Sober, E. (2004). Prediction versus accommodation and the risk of overfitting. British Journal for the Philosophy of Science, 55, 1–34.
Hume, D. (1748). In S. Butler (Ed.), An inquiry concerning human understanding. Echo Library. 2006.
Josephson, J., & Josephson, S. (Eds.). (1994). Abductive inference. Cambridge University Press.
Ketland, J. (2004). Empirical adequacy and ramsification. British Journal for the Philosophy of Science, 55, 287–300.
Ladyman, J. (2012). Science, metaphysics and method. Philosophical Studies, 160(1), 31–51.
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. Reprinted in Lakatos, I., Philosophical papers (Vol. 1, pp. 8–101). Cambridge University Press, 1978 (quoted therefrom).
Laudan, L. (1981). A confutation of convergent realism. Reprinted in D. Papineau (1997, ed.), The philosophy of science (pp. 107–138), Oxford University Press.
Lipton, P. (1991). Inference to the best explanation. Routledge.
Mach, E. (1883). The science of mechanics. BiblioBazaar. 2009.
Magnani, L. (2001). Abduction, reason, and science. Kluwer.
Mill, J. S. (1865). System of logic (6th ed.). Parker, Son, and Bourn.
Minnameier, G. (2004). Peirce-suit of truth – Why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60, 75–105.
Minnameier, G. (2017). Forms of abduction and an inferential taxonomy. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 175–195). Springer.
Mortimer, C. E. (1986). Chemistry. Wadsworth Publishing Company.
Moser, P. K. (1989). Knowledge and evidence. Reidel.
Mumford, S. (1998). Dispositions. Oxford University Press.
Niiniluoto, I. (2018). Truth seeking by abduction. Springer.
Paul, L. A. (2012). Metaphysics as modeling: The handmaiden’s tale. Philosophical Studies, 160, 1–29.
Pearl, J. (1997). Probabilistic reasoning in intelligent systems. Morgan Kaufmann.
Peirce, C. S. (1901). Scientific metaphysics. In C. Hartshorne & P. Weiss (Eds.), 1931–1955 Collected papers of Charles S. Peirce, Vol. VI.
Peirce, C. S. (1903). Lectures on pragmatism. In C. Hartshorne & P. Weiss (Eds.), Collected papers of Charles S. Peirce, Vol. V.
Prior, E. W., Pargetter, R., & Jackson, F. (1982). Three theses about dispositions. American Philosophical Quarterly, 19, 251–257.
Psillos, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press.
Putnam, H. (1975). What is mathematical truth? In H. Putnam (Ed.), Mathematics, matter and method (pp. 60–78). Cambridge University Press.
Quine, W. V. O. (1960). Word and object. MIT Press.
Raftopoulos, A. (2016). Abduction, inference to the best explanation, and scientific practise: The case of Newton’s optics. In L. Magnani & C. Casadio (Eds.), Model-based reasoning in science and technology (pp. 259–277). Springer.
Ramsey, F. P. (1931). The foundations of mathematics. Kegan Paul. (reprint 1978).
Saatsi, J. (2017). Explanation and explanationism in science and metaphysics. In M. Slater & Z. Yudell (Eds.), Metaphysics and the philosophy of science: New essays (pp. 162–191). Oxford University Press.
Schippers, M., & Schurz, G. (2017). Genuine coherence as mutual confirmation between content elements. Studia Logica, 105, 299–329.
Schippers, M., & Schurz, G. (2020). Genuine confirmation and tacking by conjunction. British Journal for the Philosophy of Science, 71(1), 321–352.
Schurz, G. (1991). Relevant deduction. Erkenntnis, 35, 391–437.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Schurz, G. (2009). When empirical success implies theoretical reference: A structural correspondence theorem. British Journal for the Philosophy of Science, 60, 101–133.
Schurz, G. (2014). Philosophy of science: A unified approach. Routledge.
Schurz, G. (2015). Causality and unification. Theoria, 30(1), 73–95.
Schurz, G. (2016). Common cause abduction: The formation of theoretical concepts and models in science. Logic Journal of the IGPL, 24(4), 494–509.
Schurz, G. (2017). Interactive causes: Revising the Markov condition. Philosophy of Science, 84(3), 456–479.
Schurz, G. (2021). Abduction as a method of inductive metaphysics. Grazer Philosophische Studien, 98, 50–74.
Schurz, G., & Gebharter, A. (2016). Causality as a theoretical concept: Explanatory warrant and empirical content of the theory of causal nets. Synthese, 193, 1071–1103.
Schurz, G., & Lambert, K. (1994). Outline of a theory of scientific understanding. Synthese, 101(1), 65–120.
Schurz, G., & Weingartner, P. (2010). Zwart and Franssen’s impossibility theorem holds for possible-world-accounts but not for consequence-accounts to verisimilitude. Synthese, 172, 415–436.
Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning. From theory to algorithms. Cambridge University Press.
Shao, J. (1997). An asymptotic theory for linear model selection. Statistica Sinica, 7, 221–264.
Shogenji, T. (2018). Formal epistemology and Cartesian skepticism: In defense of belief in the natural world. Routledge.
Sperber, D., Premack, D., & Premack, A. J. (Eds.). (1995). Causal cognition. Clarendon Press.
Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, prediction, and search. MIT Press.
Stegmüller, W. (1976). The structure and dynamics of scientific theories. Springer.
Thagard, P. (1988). Computational philosophy of science. MIT Press.
Vogel, J. (1990). Cartesian skepticism and inference to the best explanation. Journal of Philosophy, 87(11), 658–666.
Walton, D. (2001). Abductive, presumptive and plausible arguments. Informal Logic, 21, 141–169.
Williamson, T. (2016). Abductive philosophy. The Philosophical Forum, 47(3–4), 263–280.
Worrall, J. (2006). Theory-confirmation and history. In C. Cheyne & J. Worrall (Eds.), Rationality and reality (pp. 31–61). Springer.
Yu, S., & Zenker, F. (2018). Peirce knew why abduction isn’t IBE – A schema and critical questions for abductive argument. Argumentation, 32, 569–587.
Abduction and Truth
11
Ilkka Niiniluoto
Contents
Three Types of Inferences: Deduction, Induction, and Abduction
Ampliative Inference and Fallibilism
Abduction and Bayesian Confirmation
Inference to the Best Explanation
Abduction and Truthlikeness
Conclusion
References
Abstract
Charles S. Peirce distinguished, during his long career between 1865 and 1914, three types of inferences: deduction, induction, and abduction. While deduction is necessarily truth-preserving, induction and abduction are ampliative or content-increasing. Peirce suggested that the reliability of ampliative inferences can be analyzed by truth-frequencies, but his successors and commentators are divided about the relation of abduction and truth. Abductive skeptics argue that explanatory virtues are irrelevant to truth, only informational, or ignorance-preserving, whereas others have defended abduction as a powerful method of discovery, probabilistic confirmation, or acceptance of the best explanation. This chapter concludes that successful abduction leads to increasing truthlikeness, so that it is a valuable guide in truth-seeking and scientific progress.
I. Niiniluoto, Department of Philosophy, History, and Art Studies, University of Helsinki, Helsinki, Finland
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_5
Keywords
Abduction · Confirmation · Discovery · Induction · Inference · Explanation · Probability · Scientific progress · Truth · Truthlikeness
Three Types of Inferences: Deduction, Induction, and Abduction
Charles S. Peirce’s early lectures in the 1860s on the three main types of scientific inference – deduction, induction, and hypothesis – were expressed within the framework of Aristotle’s syllogistics. In the next two decades, he reformulated these patterns of reasoning in probabilistic terms. When Peirce developed his account of deductive logic independently of the Aristotelian tradition, he was also ready to revise in the late 1890s his general model of “hypothesis” or “abduction” as the “operation of adopting an explanatory hypothesis.”
Peirce’s starting point was the observation that a paradigm example of deduction, the Barbara syllogism of the first figure, can be inverted in two different ways. In Aristotle’s own formulation, the two premises and the conclusion of Barbara are affirmative universal sentences of the form “G belongs to every F” (e.g., “All men are mortal”), but the inference is valid also when the minor premise is singular (e.g., “Socrates is a man”). Using modern logical notation with ∴ as a general sign for inference and allowing singular terms, the latter case of Barbara looks as follows:
(∀x)(Fx → Gx)
Fb
∴ Gb    (1)
Induction is the inference of the major premise (rule) from the minor premise (case) and the conclusion (result):
Fb
Gb
∴ (∀x)(Fx → Gx)    (2)
Hypothesis is the inference of the minor premise from the major premise and the conclusion:
(∀x)(Fx → Gx)
Gb
∴ Fb    (3)
Thus, hypothesis leads from the rule and the result to the case (see CP 2.623). Simple variants of these schemata allow the case to refer to more than one instance b.
Then the inversion (2), which Peirce called “crude induction” (CP 2.757, 6.473), expresses a typical inductive generalization based on several instances:
Fb₁ & ... & Fbₙ
Gb₁ & ... & Gbₙ
∴ (∀x)(Fx → Gx)    (4)
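The inversions (1)–(4) can be made concrete in a small sketch. The Python code below is only an illustration under simplifying assumptions: properties are plain strings, a "world" is a hypothetical dictionary from individuals to the properties they have, and the names `Rule`, `deduce`, `abduce`, `induce`, `fact_holds`, and `rule_holds` are introduced here for the example and come neither from Peirce nor from the literature cited. It shows that deduction from true premises yields a true conclusion, whereas both inversions can lead from true premises to a false conclusion.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """Universal rule (∀x)(Fx → Gx): every individual with `f` has `g`."""
    f: str
    g: str


Fact = tuple[str, str]  # (individual, property), e.g. ("socrates", "man")


def deduce(rule: Rule, case: Fact) -> Optional[Fact]:
    """Schema (1): the rule and the case Fb yield the result Gb."""
    b, prop = case
    return (b, rule.g) if prop == rule.f else None


def abduce(rule: Rule, result: Fact) -> Optional[Fact]:
    """Schema (3): the rule and the result Gb suggest the case Fb."""
    b, prop = result
    return (b, rule.f) if prop == rule.g else None


def induce(cases: list[Fact], results: list[Fact]) -> Optional[Rule]:
    """Schema (4): cases Fb1..Fbn and results Gb1..Gbn suggest the rule."""
    fs = {prop for _, prop in cases}
    gs = {prop for _, prop in results}
    same_individuals = {b for b, _ in cases} == {b for b, _ in results}
    if cases and len(fs) == 1 and len(gs) == 1 and same_individuals:
        return Rule(fs.pop(), gs.pop())
    return None


def fact_holds(world: dict[str, set[str]], fact: Fact) -> bool:
    """True if the individual really has the property in the toy world."""
    b, prop = fact
    return prop in world.get(b, set())


def rule_holds(world: dict[str, set[str]], rule: Rule) -> bool:
    """True if every F in the toy world is also G."""
    return all(rule.g in props for props in world.values() if rule.f in props)


# Hypothetical toy world, used only for illustration.
world = {
    "socrates": {"man", "mortal"},
    "fido": {"dog", "mortal"},
    "swan1": {"swan", "white"},
    "swan2": {"swan", "white"},
    "swan3": {"swan", "black"},
}
all_men_mortal = Rule("man", "mortal")

deduced = deduce(all_men_mortal, ("socrates", "man"))   # true in the world
abduced = abduce(all_men_mortal, ("fido", "mortal"))    # false in the world
induced = induce([("swan1", "swan"), ("swan2", "swan")],
                 [("swan1", "white"), ("swan2", "white")])  # false in the world
```

In this toy world the abduced case ("fido is a man") and the induced rule ("all swans are white") are both false although their premises are true, which is the sense in which the two inversions are ampliative and fallible.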
Thus, deduction is typically reasoning from cause to effect, induction is reasoning from particulars to the general law, and hypothesis is reasoning from effect to cause (CP 2.536). Further, Peirce stated already in 1865 that hypothesis is an inference to an explanation, and the resulting inference (1) is then an “explanatory syllogism” (W 1:267, 428, 440). The same holds of examples where Barbara involves only universal statements:
(∀x)(Fx → Gx)
(∀x)(Bx → Fx)
∴ (∀x)(Bx → Gx)    (5)
In this case, the hypothetical inversion of (5) takes the form
(∀x)(Fx → Gx)
(∀x)(Bx → Gx)
∴ (∀x)(Bx → Fx)    (6)
For example, the supposition that light (B) is ether waves (F) explains why light gives certain peculiar fringes or is polarizable (G), given the law that ether waves give these fringes (W 1:267). Hence, Peirce’s account of deduction covers what Carl G. Hempel in 1948 called the deductive-nomological (DN) explanation of singular facts (1) and the DN explanation of laws (5) (see Hempel, 1965). Peirce’s account of hypothesis, as an inversion of deduction, covers singular abduction (3) and general abduction (6) (see Schurz, 2008).
When the causal law (∀x)(Fx → Gx) involves a temporal succession from F to G, its hypothetical inversion from G to F may be called retroduction (CP 1.68). Such retroductive inferences may lead from present observations to past objects (e.g., the existence of Napoleon Bonaparte) or unobservable theoretical entities:
The great difference between induction and hypothesis is, that the former infers the existence of phenomena such as we have observed in cases which are similar, while hypothesis supposes something of a different kind from what we have directly observed, and frequently something which it would be impossible for us to observe directly. (CP 2.640.)
Thus, besides “horizontal” inferences leading to observable facts, hypothetical reasoning is typically “vertical” (see Psillos, 2011).
In 1878, Peirce replaced the universal premise “All F are G” in (1) by the statistical statement “Most F are G.” In 1883, he further formalized probabilistic arguments by replacing the major premise by a precise statistical statement “The proportion r of Fs are G.” Then the schema of simple probable deduction as a variant of deduction (1) is obtained:
The proportion r of the Fs are G;
b is an F;
It follows, with probability r, that b is a G.    (7)
(CP 2.695). Here the conclusion can be taken to be “b is a G,” and the probability r indicates “the modality with which this conclusion is drawn and held to be true” (CP 2.720). Moreover, Peirce generalized (7) to the schema of statistical deduction:
The proportion r of the Fs are G,
b′, b″, b‴, etc. are a numerous set, taken at random from among the Fs;
Hence, probably and approximately the proportion r of the b’s are G.    (8)
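Schema (8) can be illustrated by a small simulation. The sketch below rests on assumptions that are not in Peirce's text: the class of Fs is treated as so large that each random draw is G independently with chance r, and the function name `sample_proportion`, the value r = 0.7, and the sample sizes are hypothetical choices made for the example. It only shows the sense of "probably and approximately": a numerous random sample tends to reproduce the proportion r, while a small one may not.

```python
import random


def sample_proportion(r: float, n: int, seed: int = 0) -> float:
    """Proportion of Gs among n Fs drawn at random, each being G with chance r."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    hits = sum(1 for _ in range(n) if rng.random() < r)
    return hits / n


small = sample_proportion(0.7, 20)       # a small sample may deviate noticeably
large = sample_proportion(0.7, 10_000)   # a numerous sample stays close to r
```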
(Note that (7) and (8) are not deductive truth-preserving arguments.) What Peirce called “quantitative induction” from a random sample to the population can now be obtained by inverting the argument (8). The hypothetical inversion of (7) leads to simple probabilistic abduction:
The proportion r of the Fs are G.
Gb
∴ Probably, Fb.    (9)
This is a generalization of the singular hypothetical inference (3). It is significant that Peirce called the statistical arguments (7) and (8) “explanations” (CP 2.716). Thus, he anticipated C. G. Hempel’s 1962 model of inductive-probabilistic explanation of particular facts – and even the explanation of statistical facts (see Niiniluoto, 1981).
In his Cambridge Lectures of 1898, Peirce said that induction is what Aristotle called epagoge, while hypothesis or retroduction is what Aristotle called apagoge. The general form of such abduction is as follows:
If μ were true, π, π′, π″ would follow as miscellaneous consequences.
But π, π′, π″ are in fact true.
∴ Provisionally, we may suppose that μ is true.    (10)
It is also called “adopting a hypothesis for the sake of its explanation of known facts.” In his 1903 Harvard lectures, Peirce expressed the general form of abduction or the “operation of adopting an explanatory hypothesis” as follows (CP 5.189, EP 2:231):
The surprising fact C is observed;
But if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true.    (11)
This schema, which has become Peirce’s best known or canonical formulation of abduction, indicates how a hypothesis can be “abductively conjectured” if it accounts “for the facts or some of them.” In Hempel’s (1965) terms, C is the known explanandum, which raises an explanation-seeking why-question “why C?”, A is the explanans, and the conclusion gives reason to suspect that the potential explanation of C by A is true or actual. The schema (11) is obviously a generalization of the original patterns (3) and (6) of hypothetical inference: the emphasis that the fact C is surprising (and therefore in need of explanation) has been added, and there are no restrictions on the logical complexity of A. As here A may be a general theory (together with some initial conditions), it might be said to express theoretical abduction (in contrast to singular abduction (3)). The idea of explanation is maintained in the second premise, but it is no longer explicitly associated with the relation of cause and effect.
Already in 1878 Peirce argued that hypotheses have to be put into “fair and unbiased” tests by comparing their predictions with observations (CP 2.634). In his papers and lectures in 1901–1903, Peirce defined induction in a new way as “the operation of testing a hypothesis by experiment” (CP 6.526). This is in harmony with the method of hypothesis, but quite different from the inductive generalization (4). Abduction is an “inferential step” which is “the first starting of a hypothesis and the entertaining of it, whether as a simple interrogation or with any degree of confidence” (CP 6.525). Here abduction, deduction, and induction are successive steps in scientific inquiry:
Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value and deduction merely evolves the necessary consequences of a pure hypothesis. Deduction proves that something must be, Induction shows that something actually is operative, Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about. (EP 2:216.)
Serious discussion about Peirce’s abduction started only after World War II. Norwood Russell Hanson (1958) argued that Peirce’s abduction gives a basis for studying the logic of discovery (cf. Paavola, 2006). An alternative interpretation linked abduction to the motives for investigating or pursuing test-worthy hypotheses (Laudan, 1980; Curd, 1980). These interpretations restrict the role of abduction to
the suggestion or generation of hypotheses for further investigation. A broader view allows that abduction has a function in the selection and evaluation of explanatory hypotheses which belongs to the context of justification. Peirce himself pointed out that in some cases abductive inference is “compelling” (e.g., perception in ordinary circumstances) or “perfectly certain” (e.g., the existence of Napoleon Bonaparte) (EP 2:54). Howard Smokler (1968) suggested that abductive inference is a method of confirming explanatory hypotheses on the basis of empirical evidence:
(AC) If hypothesis H explains evidence E, then E confirms H.
Gilbert Harman (1965) formulated abduction as a rule of acceptance. His schema of inference to the best explanation (IBE) recommends the tentative acceptance of a hypothesis which is a better explanation of known facts than its rivals:
(IBE) If hypothesis H is the best explanation of evidence E, then conclude for the time being that H is true.
Even though some Peirce scholars argue that abduction should be restricted to discovery and pursuit, without “confusing” it with reasoning to the best explanation (see McAuliffe, 2015), for the purposes of this chapter, it is natural to follow many studies in logic (Aliseda, 2006), AI (Josephson and Josephson, 1994), and philosophy of science (see Douven, 2011; Psillos, 2011; Kuipers, 2019) by treating IBE as a special case of the broader notion of abduction (see Niiniluoto, 2018, 2022). This approach can be motivated also by noting that in some “inverse problems” abductive reasoning from effects to causes serves discovery and justification at the same time (Niiniluoto, 2018, Chap. 4).
Ampliative Inference and Fallibilism
To understand the relation of abduction with truth, it should be placed in the context of Peirce’s fallibilism, which has largely become the mainstream view in epistemology and philosophy of science. Fallibilism is the doctrine that “people cannot attain absolute certainty concerning questions of fact” (CP 1.149). The view that human knowledge claims are fallible or liable to error, and therefore always corrigible by further evidence, should be distinguished from the pessimist view of skepticism:
There is nothing, then, to prevent our knowing outward things as they really are, and it is most likely that we do thus know them in numberless cases, although we can never be absolutely certain of doing so in any special case. (CP 5.311)
Science “does not consist so much in knowing, nor even in ‘organized knowledge’, but rather in the pursuit of finding out” (CP 1.44). The “established truths” in science are nothing more than “propositions into which the economy of endeavor prescribes that, for the time being, further inquiry shall cease” (CP 5.589). In his The Fixation of Belief (1877), Peirce described the acquisition of beliefs by a
dynamical cycle, where acting on a belief may lead to unexpected and disappointing consequences, and this surprise irritates our doubt until a settlement of opinion, a new belief, is attained through inquiry. By the application of this critical method, science is self-corrective (CP 5.575), so that a fallibilist can be an optimist about truth in the long run. According to Peirce’s pragmatist theory of truth, “the opinion which is fated to be ultimately agreed to by all who investigate, is what we mean by truth, and the object represented in this opinion is the real” (CP 5.407). His claim that “truth is that concordance of an abstract statement with the ideal limit towards which endless investigation would lead scientific belief” (CP 5.565) is not in conflict with the classical correspondence theory of truth. It should not be understood as a definition of truth but rather as an expression of the idea that the method of science is able to approach the truth in the long run (see Niiniluoto, 2018, p. 9).
Peirce divided inferences into analytic (deductive) and synthetic (induction and hypothesis) (CP 2.624). Analytic inferences are “explicative,” while synthetic inferences are “ampliative” or content-increasing. He was well aware that deduction is necessarily truth-preserving, so that from true premises only true consequences can be derived by deduction. In “On the Algebra of Logic” (1880), Peirce characterized the inference P ∴ C by a condition which looks like Alfred Tarski’s model-theoretical definition in the 1930s: “every state of things in which a proposition of the class P is true is a state of things in which the corresponding propositions of the class C are true” (EP 1:203). But in ampliative inferences, the conclusion goes beyond the premises, so that reasoning even from true premises may result in falsity.
Every single application of inductive generalization (4) is fallible: after thousands of observed white swans, the generalization “All swans are white” was falsified by the discovery of black swans in Australia. Similarly, every single application of abduction (3) is fallible: when I hear noise from the window and conclude that a car has passed my home on the street, it is still possible that the sound was fake or just my own hallucination.
In spite of their fallibility, ampliative inferences are needed for the increase of real knowledge. Therefore, Peirce endeavored to show that by their self-corrective nature they are guaranteed to approach the truth in the long run. According to him, induction can be justified by the fact that it “pursues a method which, if duly persisted in, must, in the very nature of things, lead to a result indefinitely approximating to the truth in the long run” (CP 2.781). Here the persistence of induction means that larger and larger random samples of instances are collected. In the case of abduction, it is feasible likewise to enrich the basic fact C to be explained in order to reach a better conclusion: the fact that a person has suddenly died allows numerous alternative explanations, but a more detailed and deeper medical diagnosis by postmortem may exclude most of them and leave only a unique cause of death (cf. Peirce’s own remark about “enlarging the field of facts” to be explained, EP 2:114). Further, persistence of abduction may also mean repeatedly carrying out the whole cycle of tentative explanations, their testing by deductive predictions, and corrected explanations. In this spirit, Peirce praised the role of abduction in the approach to the truth:
Abduction is reasoning, which professes to be such, that in the case there is any ascertainable truth concerning the matter in hand, the general method of this reasoning though not necessarily each special application of it must eventually approximate the truth. (Peirce, 1976, p. 37)
But this argument is not quite conclusive, since Peirce did not provide a general mechanism for revising hypotheses which have failed some tests (cf. Aliseda, 2006, for abduction and belief revision).
In his early work, Peirce suggested that the trustworthiness of synthetic inference can be measured by a number which tells how often its pattern yields true conclusions from true premises. For ordinary deduction, which is necessarily truth-preserving, this truth-frequency is one. For simple probable deduction (7), this value is r, since by drawing random members from the class of Fs and applying repeatedly this schema a true consequence is obtained in the proportion r of all cases. However, the truth-frequency q of the probabilistic hypothetical inference (9) in general is not equal to r, since in this inverse inference q equals the proportion of Fs in the class of Gs, and this truth-frequency is not fixed by r. Indeed, it is possible that it is close to zero. Peirce was well aware of this issue, as in 1882 he pointed out that “a very small proportion of calves may be monstrosities, and yet a very large proportion of monstrosities may be calves” (CP 2.729). On the other hand, sometimes the probabilistic validity of abductive reasoning (9) may be high. In his later work in 1902, Peirce concluded that “probability proper has nothing to do with the validity of Abduction” (CP 2.102). Yet, Peircean truth-frequencies have later found their place in the error probabilities of the Neyman-Pearson statistical tests of significance and in Alvin Goldman’s (1986) epistemological program of “reliabilism.”
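The gap between the two truth-frequencies can be checked on Peirce's calves example with a few lines of arithmetic. The sketch below uses invented counts (10 monstrous calves, 2 monstrosities that are not calves, 990 normal calves) and a hypothetical helper `truth_frequencies`; only the qualitative point is Peirce's. With F = "is a monstrosity" and G = "is a calf", the truth-frequency r of probable deduction (7) is high while the truth-frequency q of the inverse inference (9) is close to zero.

```python
from fractions import Fraction


def truth_frequencies(both: int, f_only: int, g_only: int) -> tuple[Fraction, Fraction]:
    """Return (r, q) from counts of individuals that are F and G, F only, G only.

    r = proportion of Fs that are G: truth-frequency of probable deduction (7).
    q = proportion of Gs that are F: truth-frequency of probabilistic abduction (9).
    """
    r = Fraction(both, both + f_only)
    q = Fraction(both, both + g_only)
    return r, q


# Hypothetical herd: F = "is a monstrosity", G = "is a calf".
r, q = truth_frequencies(both=10, f_only=2, g_only=990)
# r is 5/6: most monstrosities are calves.
# q is 1/100: inferring "b is a monstrosity" from "b is a calf" is rarely right.
```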
Abduction and Bayesian Confirmation
Inspired by John Venn, Peirce was a frequentist in the theory of probability – until 1910 when he started to support the dispositional propensity interpretation of objective probability (CP 2.666). Therefore, he was sharply critical of the classical Bayesian theory of probabilistic inference (see, e.g., EP 2:215) and thought that induction does not lend probability to its conclusion (CP 2.780). Yet, Bayesianism provides a useful framework for studying induction and abduction as forms of ampliative reasoning, since its personal probabilities are designed to express the uncertainty of non-demonstrative inference. Jaakko Hintikka’s system of inductive logic shows how inductive generalization (4) can be treated so that non-zero probabilities are assigned to universal laws and their posterior probability approaches the limit one when the number of positive instances approaches infinity (see Niiniluoto, 2011).
Bayesianism also provides a probabilistic link between abduction and truth (see Niiniluoto, 1999). The Bayesians interpret the probability P(H/E) as the rational degree of belief in the truth of hypothesis H given evidence E. Bayes’s theorem
\[
P(H/E) = \frac{P(H)\,P(E/H)}{P(E)} = \frac{P(H)\,P(E/H)}{P(H)\,P(E/H) + P({\sim}H)\,P(E/{\sim}H)} \tag{12}
\]
shows how the posterior probability P(H/E) or the credence of H can be calculated on the basis of the prior (initial) probability P(H) of H and the likelihood P(E/H) of H relative to E. If P(H/E) > P(H), evidence E confirms H in the sense that it increases the belief in the truth of H. This condition of positive relevance (PR) is equivalent to P(E/H) > P(E) and P(H/E) > P(H/∼E). Now if H logically entails E, then P(E/H) = 1. Assuming that H is consistent or epistemically possible (P(H) > 0) and E is non-tautological (P(E) < 1), (12) immediately implies that P(H/E) = P(H)/P(E) > P(H). Hence,
If H logically entails E, and if P(H) > 0 and P(E) < 1, then P(H/E) > P(H).    (13)
This conclusion, where all probabilities can be relativized to some background knowledge B, is completely general in the sense that it is valid for all epistemic probability measures P and for all non-zero prior probabilities P(H) > 0. It allows H to be a strong and informative theory with theoretical concepts. Hence,
(CE) If H and E are contingent statements, and H deductively explains E, then E confirms H.
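A quick numerical check of (12) and (13) may help. The following sketch assumes the two-element partition {H, ∼H}; the function name `posterior` and all numerical values are hypothetical, chosen only to give exact fractions. When H entails E, so that P(E/H) = 1, the posterior exceeds the prior for every tested prior below one and every tested value of P(E/∼H) below one.

```python
from fractions import Fraction


def posterior(prior_h: Fraction, e_given_h: Fraction, e_given_not_h: Fraction) -> Fraction:
    """Bayes's theorem (12) over the partition {H, ~H}."""
    p_e = prior_h * e_given_h + (1 - prior_h) * e_given_not_h
    return prior_h * e_given_h / p_e


# H entails E, so P(E/H) = 1; the remaining numbers are hypothetical.
prior = Fraction(1, 10)
post = posterior(prior, Fraction(1), Fraction(1, 5))
# Here P(E) = 1/10 + (9/10)(1/5) = 7/25, so P(H/E) = P(H)/P(E) = 5/14 > 1/10.
```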
More generally, as positive relevance PR is a symmetrical relation, it is sufficient for the confirmation of H by E that H is positively relevant to E. For example, if an infection H increases the probability of fever E, then the fever supports the hypothesis of infection. If inductive explanation is defined by the positive relevance condition, i.e., H inductively explains E if and only if P(E/H) > P(E), then
(CE′) If H is a positively relevant inductive explanation of E, then E confirms H.
Combination of these conclusions gives the following:
If H deductively or inductively explains E, then E confirms H.    (14)
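The symmetry of positive relevance behind (14) can be verified on a small joint distribution. In the sketch below the four joint probabilities for infection (H) and fever (E) are invented for illustration, and `relevance_checks` is a hypothetical helper; it evaluates the three equivalent formulations of PR, namely P(H/E) > P(H), P(E/H) > P(E), and P(H/E) > P(H/∼E), and they agree in each example.

```python
from fractions import Fraction


def relevance_checks(p_he, p_h_not_e, p_not_h_e, p_not_h_not_e):
    """Evaluate the three PR conditions from the four joint probabilities.

    Assumes 0 < P(H) and 0 < P(E) < 1 so that all conditionals are defined.
    """
    assert p_he + p_h_not_e + p_not_h_e + p_not_h_not_e == 1
    p_h = p_he + p_h_not_e
    p_e = p_he + p_not_h_e
    h_given_e = p_he / p_e
    e_given_h = p_he / p_h
    h_given_not_e = p_h_not_e / (1 - p_e)
    return (h_given_e > p_h, e_given_h > p_e, h_given_e > h_given_not_e)


F = Fraction
# Hypothetical numbers: H = infection, E = fever.
infection_fever = relevance_checks(F(8, 100), F(2, 100), F(9, 100), F(81, 100))
# H and E probabilistically independent: no confirmation either way.
independent = relevance_checks(F(1, 4), F(1, 4), F(1, 4), F(1, 4))
```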
Thus, the Bayesian approach immediately justifies the idea that explanatory success is confirmatory or credence-increasing (see Niiniluoto, 1999). This is a generalization and justification of Smokler’s (1968) qualitative principle AC of abductive confirmation. By the same arguments, a similar principle PC for predictive confirmation can be established:
If H deductively or inductively predicts E, then E confirms H.    (15)
The increase of probability by explanatory or predictive success can be calculated by the difference
\[
P(H/E) - P(H) = \frac{P(H)}{P(E)}\,\bigl(P(E/H) - P(E)\bigr) \tag{16}
\]
This value increases when P(E) decreases. As Peirce knew, experimental tests should be directed to the “most unlikely” predictions deducible from a hypothesis (CP 7.182), and for the same reason it is advisable to explain “surprising facts” (cf. (11)).
These arguments show that Peirce’s canonical schema (11) is truth-conducive in the sense that explanatory success increases the probability that the explanatory theory is true. This remark about confirmation is weaker than any claim about the acceptability of the conclusions of abduction – such a stronger sense of justification will be discussed in connection with inference to the best explanation (see section “Inference to the Best Explanation”). But even though confirmation is a weak form of justification, it goes beyond the relatively popular pursuit interpretation, which claims that abduction is concerned only with the selection of hypotheses for further investigation and testing (see Kapitan, 1997; McKaughan, 2008; Davies and Coltheart, 2020). Further, theorem (14) is sufficient to challenge attacks against the “explanationist” view, such as “confirmation is logically independent of explanation” (Salmon, 2001, p. 88) and “explanatoriness is evidentially irrelevant” (Roche and Sober, 2013). Note that this kind of abductive skepticism is compatible with the view that successful predictions of novel facts can confirm theoretical hypotheses. Conclusion (14) also counts against Bas van Fraassen’s (1989) thesis, essential to his constructive empiricism against scientific realism, that explanatory power is only an “informational” rather than “confirmational” virtue.
One way of questioning the application of theorem (13) would be to claim that P(E) = 1 if the explanandum E is already known or part of the “old evidence”. But a straightforward answer is to formulate the abductive situation so that P(E) is not one: E is not only a contingent statement but also a “surprising fact” in Peirce’s terms.
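The dependence of the increase P(H/E) – P(H) on how surprising E is can be tabulated directly from (16). The sketch below assumes that H entails E, so that P(E/H) = 1, and uses a hypothetical prior P(H) = 1/10; the function name `confirmation_gain` is likewise an assumption of the example. With P(E) = 1, as in the "old evidence" objection, the gain is zero, and it grows as P(E) decreases toward its lower bound P(H)P(E/H).

```python
from fractions import Fraction


def confirmation_gain(prior_h: Fraction, e_given_h: Fraction, p_e: Fraction) -> Fraction:
    """Difference P(H/E) - P(H), computed by formula (16)."""
    return prior_h * (e_given_h - p_e) / p_e


prior = Fraction(1, 10)   # hypothetical prior P(H)
entails = Fraction(1)     # H entails E, so P(E/H) = 1
# P(E) cannot be smaller than P(H)P(E/H) = 1/10 in this setting.
evidence_probs = (Fraction(1), Fraction(9, 10), Fraction(1, 2), Fraction(1, 5))
gains = [confirmation_gain(prior, entails, p_e) for p_e in evidence_probs]
# gains is [0, 1/90, 1/10, 2/5]: the more surprising E is, the larger the gain.
```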
Comparison of (14) and (15) shows that one need not make an essential difference between explanation (accommodation) and prediction as methods of confirmation (cf. Howson and Urbach, 1989, pp. 280–284). Another way of challenging theorem (13) would be to claim that P(H) = 0 for scientific theories H (van Fraassen, 1989). This would imply by (12) that P(H/E) = 0 for any evidence E, so that the empirical confirmation of H is impossible. But this kind of general a priori assumption means that van Fraassen’s constructive empiricism is dogmatically committed to theoretical skepticism (Niiniluoto, 2018, p. 155). However, in special cases scientific theories may be idealizations which are known to be false, so that the degree of belief in their truth is zero. Such cases need new tools where probabilities are replaced by the notion of truthlikeness (see section “Abduction and Truthlikeness”).
A special form of abductive skepticism is presented in the GW model, developed by Dov Gabbay and John Woods (2005) (cf. Magnani and Bertolotti, 2017). An important feature of the GW model is its thesis that abduction is “ignorance-preserving.” Its starting point is an “ignorance problem” with respect to some proposition T, where the available knowledge base is insufficient to “attain” T. An abductive hypothesis H attains the cognitive target T only subjunctively or presumptively, and in “partial abduction” H can be presented as a conjecture and in “full abduction” it will be activated in action. Thus, in abduction the agent’s “ignorance remains.” This is clearly different from Peirce’s abductive schema (11) which starts from our knowing a surprising fact C but being ignorant why C is the case. When a hypothesis H is introduced to explain C, the why-question has a presumptive answer – a potential explanation in Hempel’s (1965) sense, as the truth value of H is not known for certain. Here conclusion (14) affirms that the ability of H to explain C gives some support to H, so that abduction is credence-increasing instead of ignorance-preserving. Lorenzo Magnani (2017) gives a different defense of the knowledge-enhancing character of abduction by appealing to Peirce’s thesis that we have a natural disposition, Galileo’s il lume naturale, of “guessing right” (CP 6.530, 7.223).
Jaakko Hintikka (1998) gives abduction a prominent but somewhat peculiar place in his interrogative model of inquiry. In this model, all inference steps are deductive, so that induction and abduction as ampliative patterns of reasoning have no function. But abduction has a crucial strategic role in truth-seeking, which happens through questions addressed to “oracles” (like nature or other sources of information).
While it is attractive to associate abduction with interrogative steps in inquiry, especially with why-questions and how possible-questions, Hintikka’s model could be improved by allowing the “Socratic” questioning strategies to be in interplay with three kinds of inference – deduction, induction, and abduction (Niiniluoto, 2018, pp. 48–50).
Inference to the Best Explanation
In some cases it may be difficult to find even one satisfactory explanation of surprising facts, but Peirce was well aware that sometimes “the possible explanations of our facts may be strictly innumerable” (EP 2:107). In 1901, Peirce discussed criteria for the choice of good hypotheses, which included explanatory power and testability but also economic factors like cost, caution, breadth (unification), and incomplexity (simplicity) (see CP 7.220-221, EP 2:106-114; cf. Thagard, 1978). Such criteria are needed by a Bayesian as well, since abductive confirmation is a weak form of justification which allows the explanandum C to have several alternative explanations: by theorem (14) every hypothesis H which is positively relevant to C receives some confirmation or support from C.
The conclusions of section “Abduction and Bayesian Confirmation” can be strengthened by introducing quantitative measures for the explanatory power of H
with respect to E and the degree of confirmation of H by E and by defining K to be a better explanation of E than H if and only if K has more explanatory power with respect to E than H. Then for suitable pairs of such measures, one can prove the comparative theorem:
If K is a better explanation of E than H, then E confirms K more than H.    (17)
(see Niiniluoto, 2018, Chap. 6.3). The next step is to conclude that among rival explanations the best explanation of E has the highest degree of confirmation, which gives an explication of the acceptance rule IBE:
(IBE) Hypothesis H may be inferred from evidence E when H is the best explanation of E.
(See ibid., Chap. 7.1.) Its special case is inference to the only explanation:
(IOE) Hypothesis H may be inferred from evidence E when H is the only available explanation of E.
For a fallibilist, IBE and IOE cannot prove that hypothesis H is true, but they involve the tentative acceptance of H as true, with the proviso that an accepted hypothesis is open to revision by further evidence.
Most measures of the explanatory power expl(H,E) of theory H with respect to facts E are normalizations of the difference P(E/H) – P(E), so that they have a positive value if and only if H is an inductive explanation of E (see section “Abduction and Bayesian Confirmation”). They can be motivated by noting that expl(H,E) measures the amount of information that H transmits about E (Hintikka, 1968). These measures have the maximum value if H logically entails E, so that they cannot make a difference between deductive explanations. Further, H has more explanatory power than K with respect to E if the likelihood condition P(E/H) > P(E/K) is satisfied. These positive relevance measures satisfy the comparative conclusion (17), if degrees of confirmation are defined by posterior probability P(H/E) or ratio P(H/E)/P(H). An exception is Hempel’s (1965) measure syst(H,E) = P(∼H/∼E) of systematic power, which implies that H is better than K if
P(H)(1 – P(E/H)) < P(K)(1 – P(E/K)).    (18)
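Condition (18) follows from rewriting Hempel's measure as syst(H,E) = 1 – P(H)(1 – P(E/H))/(1 – P(E)), and a small computation makes the contrast with likelihood-based measures visible. The sketch below is an illustration only: it uses the unnormalized difference P(E/H) – P(E) as a stand-in for expl, and the priors and likelihoods of the rivals H and K are invented. Both hypotheses have the same likelihood, so the difference measure ties, while syst prefers the hypothesis with the lower prior.

```python
from fractions import Fraction


def expl_difference(e_given_h: Fraction, p_e: Fraction) -> Fraction:
    """Unnormalized explanatory power P(E/H) - P(E) (a simplifying stand-in)."""
    return e_given_h - p_e


def syst(p_h: Fraction, e_given_h: Fraction, p_e: Fraction) -> Fraction:
    """Hempel's systematic power P(~H/~E) = 1 - P(H)(1 - P(E/H)) / (1 - P(E))."""
    return 1 - p_h * (1 - e_given_h) / (1 - p_e)


F = Fraction
p_e = F(1, 2)                      # hypothetical P(E)
prior_h, lik_h = F(1, 10), F(9, 10)   # bold hypothesis H
prior_k, lik_k = F(3, 10), F(9, 10)   # more probable rival K

syst_h = syst(prior_h, lik_h, p_e)    # 49/50
syst_k = syst(prior_k, lik_k, p_e)    # 47/50
lhs = prior_h * (1 - lik_h)           # left side of (18): 1/100
rhs = prior_k * (1 - lik_k)           # right side of (18): 3/100
```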
Hempel’s measure thus combines the requirement of high likelihood with Popper’s (1963) preference for theories with low prior probability or high empirical content cont(H) = 1 – P(H). If the acceptance of H on E gives us the gain syst(H,E), when H is true, and the negative loss –syst(∼H,E), when H is false, then the expected value of systematic power matches the difference measure of confirmation P(H/E) – P(H) in (17). Hintikka (1968) makes an important difference between local theorizing, where the researchers are interested in finding an explanation of a particular fact or event E,
and global theorizing, where evidence E is only a stepping-stone to a more generally applicable theory. For the local case, measures expl lead to the rule of maximum likelihood, well known from statistical estimation problems: accept hypothesis H which maximizes P(E/H). A similar distinction can be made between the contexts where abductive inference is applied. In many everyday examples, we focus on particular cases with explanation-seeking or cause-seeking why-questions: Why did my car break down today? Seddon (2021) provides neurophysiological evidence for the claim that human perception operates with “the Best Guess Theory.” (Recall that for Peirce, perception, as “not fully conscious” reasoning from sensible effects to external causes, is an extreme or limiting special case of abduction, CP 5.181.) The answers or conclusions provide us sufficiently convincing beliefs which are sufficiently reliable to guide our practical action. Such local abductive problems may be selective in the sense that their potential answers are sought from ready-made lists, such as causes of death in medical diagnosis (Magnani, 2001, p. 20). Local hypotheses are significant also in historical sciences: What caused the extinction of dinosaurs? Who killed Olof Palme? For the study of local abductions, it is often advisable to enrich the evidence (see section “Ampliative Inference and Fallibilism”): as abductive reasoning is non-monotonic, the conclusion from E&E´ may be different (and more correct) than from E alone, if E´ gives additional relevant information about the explanandum. For global cases, abduction is used as a tool of theory formation in science. Here a good hypothesis H is expected to be able to explain and predict also other phenomena than the triggering fact E. In this case, unification of different and independent facts is an important virtue of H. 
Search for such global theories is usually creative in the sense that it requires the introduction of new theoretical concepts. Bayesian probabilities are usually defined over an “ultimate partition” (Levi, 1967), i.e., a set of mutually exclusive and jointly exhaustive statements Hi , i = 1, . . . ,n. In the simplest case, this set has only two alternatives {H, ∼H}. The standard Bayesian approach measures the success of a hypothesis H by its posterior probability P(H/E) given the available total evidence E. As P(H/E) is the expected truth value of H, this approach can be motivated by assuming that truth is the only “epistemic utility” of science (Niiniluoto, 2018, p. 112). Levi (1967) criticizes the high probability rule by its “conservative” nature, since it favors trivial tautologies t (since P(t) = 1) and logical consequences of evidence (since P(E/E) = 1), but in the abductive context, these choices are excluded: a tautology t does not explain anything (P(E/t) = P(E) for all E) and self-explanations are excluded (E does not explain E). The Bayesian comparison of two rival hypotheses H and K amounts to the requirement that the likelihood ratio exceeds a threshold value defined by the ratio of prior probabilities: P (H/E) > P (K/E) iff
$$\frac{P(E/H)}{P(E/K)} > \frac{P(K)}{P(H)}. \qquad (19)$$
If P(H) = P(K), (19) reduces to the likelihood condition
$$P(E/H) > P(E/K). \qquad (20)$$
If P(E/H) = P(E/K), then (19) holds if and only if P(H) > P(K). In particular, if both H and K are deductive explanations of E, then the comparison (19) recommends the hypothesis with a higher prior probability. Depending on the definition of priors, this comparison may mean that a more frequent cause H is preferred to a less common one K or that the preference for H depends on its “plausibility” (see Salmon, 2001).
Karl Popper (1963) argued that science does not aim at weak probable hypotheses but rather informative truths backed by evidence. This idea is especially relevant in the context of global theorizing. Isaac Levi (1967) has formalized it by taking truth and information as the epistemic utilities of scientific inference, which leads to the rule
Given evidence E, accept the theory H which maximizes P(H/E) – P(H). (21)
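A small numerical sketch may help to see how the Bayesian comparison (19) and the rule of maximizing P(H/E) – P(H) can come apart. The partition {H, K, R}, with R standing for a weak catch-all rival, and all numbers are hypothetical assumptions chosen for illustration only.

```python
def posteriors(priors: dict, likelihoods: dict) -> dict:
    """Bayes's theorem over mutually exclusive, jointly exhaustive hypotheses."""
    p_e = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / p_e for h in priors}


def levi_gains(priors: dict, likelihoods: dict) -> dict:
    """The quantity maximized by rule (21): P(H/E) - P(H)."""
    post = posteriors(priors, likelihoods)
    return {h: post[h] - priors[h] for h in priors}


# Hypothetical partition: a bold theory H, a rival K, and a weak catch-all R.
priors = {"H": 0.1, "K": 0.3, "R": 0.6}
likelihoods = {"H": 0.9, "K": 0.2, "R": 0.4}

post = posteriors(priors, likelihoods)
gains = levi_gains(priors, likelihoods)

# Comparison (19): P(H/E) > P(K/E) iff P(E/H)/P(E/K) > P(K)/P(H).
ratio_condition = likelihoods["H"] / likelihoods["K"] > priors["K"] / priors["H"]

best_by_posterior = max(post, key=post.get)
best_by_gain = max(gains, key=gains.get)

print("posteriors:", post)
print("gains P(H/E) - P(H):", gains)
print("(19) favors H over K:", ratio_condition)
print("highest posterior:", best_by_posterior)
print("highest gain:", best_by_gain)
```

With these assumed numbers the probable but uninformative R has the highest posterior, while the low-prior hypothesis H receives the largest increase in probability, which is the kind of case where the two rules give different recommendations.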
For some special cases, this rule behaves in the same way as the Bayesian rule (19). According to conclusion (16), if P(E/H) = P(E/K), H is better than K by rule (21) if and only if P(H) > P(K), so that high prior probability is favored again (in contrast to (18)). But it is important that (21) makes sense of the simultaneous search for the truth (in terms of high posterior probability) and information content (in terms of low prior probability): by (19) P(H/E) > P(K/E) holds if the likelihood ratio is larger than P(K)/P(H) which in this case is greater than 1.
In the global context, a good hypothesis H should explain and predict diverse facts. This kind of unification can happen in two quite different ways (Schurz, 2015; Niiniluoto, 2018, Chap. 6.4). One of them is linking up: two empirical phenomena E and E′ are independent from each other (P(E/E′) = P(E)), but given H they become positively relevant to each other (1 > P(E/E′&H) > P(E/H)). Then it can be shown that the confirmation of H by E&E′ can be expressed as the sum of three factors: the confirmation of H by E, the confirmation of H by E′, and the degree of unification of E and E′ achieved by H (see Myrvold, 2003). The second one is screening off: two empirical phenomena are positively relevant to each other, but given H they become independent of each other. Examples include William Whewell’s “consilience of inductions” (e.g., Newton’s theory deductively explains Kepler’s and Galileo’s laws) and Hans Reichenbach’s probabilistic common causes. Also in this case the unifying theory or common cause receives additional empirical confirmation.
The conclusions of this section show that theoretical hypotheses H with nonzero priors receive confirmation from their empirical (explanatory and predictive) successes. When these confirmations cumulate, the posterior probability P(H/E) will be high.
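The three-factor decomposition mentioned for the linking-up case can be written out explicitly. The following is a sketch under the assumption that confirmation is measured by the logarithm of the ratio P(H/E)/P(H); this choice of measure is made here for illustration, and Myrvold (2003) should be consulted for the actual account. Here E and E′ denote the two phenomena to be unified.

```latex
\begin{align*}
\log\frac{P(H/E\,\&\,E')}{P(H)}
  &= \log\frac{P(E\,\&\,E'/H)}{P(E\,\&\,E')}
     && \text{(Bayes's theorem)} \\
  &= \underbrace{\log\frac{P(E/H)}{P(E)}}_{\text{confirmation of } H \text{ by } E}
   + \underbrace{\log\frac{P(E'/H)}{P(E')}}_{\text{confirmation of } H \text{ by } E'}
   + \underbrace{\left[\log\frac{P(E\,\&\,E'/H)}{P(E/H)\,P(E'/H)}
     - \log\frac{P(E\,\&\,E')}{P(E)\,P(E')}\right]}_{\text{unification of } E \text{ and } E' \text{ by } H}
\end{align*}
```

In the linking-up case, independence gives P(E&E′) = P(E)P(E′), so the second logarithm in the bracket vanishes, while P(E/E′&H) > P(E/H) makes the first one positive. On these assumptions the unification term is strictly positive and adds to the two separate confirmations.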
But H is acceptable by a rule like IBE only if its posterior probability is definitely higher than the probability of its alternatives. As Lipton (2004) remarks, the
acceptable best hypothesis has to be “good enough.” As the comparative conclusion (19) and Bayes’s theorem (12) show, the posterior probability of H is high if the probability of evidence P(E/∼H) is small. This is a strong requirement, since in the general case P(E/∼H) is the weighted average of the likelihoods P(E/Hi) for all alternatives Hi to H – and a fallibilist cannot be certain that all the relevant potential explanations have already been found. But the best strategy in truth-seeking is to tentatively accept the best explanation so far and keep on searching for stronger and more complete explanations. So when P(E/∼H) is sufficiently small, P(H/E) is comparatively strong enough for such tentative acceptance. In the limit, if P(E/∼H) = 0, then P(H/E) = 1 and H is the only available explanation of E by IOE.
It is a matter of debate whether explanatory success alone is enough for acceptance or whether a hypothesis should always receive additional support from successful predictions. In normal cases both kinds of successes are needed. But for the defenders of abduction, it is interesting to note that most contemporary physicists are ready to accept the existence of dark matter and dark energy (and unwilling to reject Newtonian or relativistic mechanics), since this hypothesis explains away some anomalies in the observed movements of galaxies, even though dark matter is unobservable and so far untestable by experiments.
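The dependence of P(H/E) on P(E/∼H), and the fallibilist worry about undiscovered rivals, can be illustrated with a short sketch. The priors, likelihoods, and function names below are hypothetical assumptions; the second scenario simply redistributes the prior mass of ∼H to include a newly conceived alternative that explains E well.

```python
def likelihood_given_not_h(alt_priors, alt_likelihoods):
    """P(E/~H) as the prior-weighted average of the likelihoods P(E/Hi)
    of the alternatives Hi to H."""
    total = sum(alt_priors)
    weighted = sum(p * l for p, l in zip(alt_priors, alt_likelihoods))
    return weighted / total


def posterior_h(prior_h, lik_h, lik_not_h):
    """Bayes's theorem for the two-cell partition {H, ~H}."""
    joint_h = prior_h * lik_h
    return joint_h / (joint_h + (1.0 - prior_h) * lik_not_h)


PRIOR_H, LIK_H = 0.2, 0.9  # hypothetical values for the best explanation so far

# Scenario 1: two known rivals, both explaining E poorly.
lik_not_h_known = likelihood_given_not_h([0.5, 0.3], [0.05, 0.1])
post_known = posterior_h(PRIOR_H, LIK_H, lik_not_h_known)

# Scenario 2: a previously unconsidered rival that explains E well.
lik_not_h_enlarged = likelihood_given_not_h([0.4, 0.2, 0.2], [0.05, 0.1, 0.9])
post_enlarged = posterior_h(PRIOR_H, LIK_H, lik_not_h_enlarged)

# Limiting case: no alternative makes E possible, so H is the only explanation.
post_only = posterior_h(PRIOR_H, LIK_H, 0.0)

print(lik_not_h_known, post_known)
print(lik_not_h_enlarged, post_enlarged)
print(post_only)
```

Under these assumptions the posterior of H drops once a strong rival enters the partition, and it reaches 1 only in the limiting case where P(E/∼H) = 0.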
Abduction and Truthlikeness
Peirce’s treatment of abduction covered reasoning to deductive and probabilistic explanations, but it should be extended to approximate explanations as well. Theory H approximately explains E if H entails (or makes more probable) a statement E′ which is close to E. For example, Newton’s theory approximately explains Kepler’s laws about planetary orbits (Niiniluoto, 2018, Chap. 8.1). Similarly, idealized theories which are false but close to the truth may also give satisfactory approximate explanations of observations and measurements. When Peirce introduced the notion of abduction as the inference from surprising facts to their explanation, in his canonical formulation (11) abduction gives a reason to suspect that a successful explanatory theory is true. But if the best available explanation H of evidence E is approximate, it may be more appropriate to tentatively conclude that H is truthlike or approximately true (see Niiniluoto, 1999, 2018; Kuipers, 1999, 2019). This principle can be called inference to the best approximate explanation:
(IBAE) If the best available explanation H of evidence E is approximate, conclude for the time being that H is approximately true.
This kind of modification of the schema (11) is in harmony with Peirce’s passages where he advocates a strong form of fallibilism: scientific knowledge is not only uncertain but also “inexact” and “indeterminate” (CP 1.141). Kuipers proposes an alternative to IBE which he calls inference to the best theory:
(IBT) If a theory has so far proven to be the best among the available theories, then conclude for the time being that it is the closest to the truth of the available theories.
The best theory is here allowed to be inconsistent with evidence, so that IBT may involve abductive revision in Aliseda’s (2006) sense. An alternative to IBT is comparative abduction, which uses comparative statements both in premises and conclusion: (IBTc ) If theory Y is a better explanation of the available evidence E than theory X, then conclude for the time being that Y is more truthlike than X.
This kind of principle could not be formulated for IBE, since the notion of truth does not have a comparative form “truer than.” Rules IBT and IBTc are natural generalizations of IBAE, since their premise does not restrict the explanation to approximate explanation. They are applicable to cases, where the consequences of highly abstract theories are only approximately verified in experiments. Such a discrepancy between H and E may be due to the fact that observational errors make evidence statement E incorrect, so that theory H offers a corrective explanation of E, but even in such situations it may be reasonable to conclude cautiously only that the theory is truthlike. Even though Newton’s theory can give a corrective explanation of Kepler’s and Galileo’s empirical laws, quantum theory and Einstein’s relativity theory show that Newton’s mechanics is at best approximately true. The notion of truthlikeness was introduced to modern philosophy of science by Karl Popper. His version of fallibilism owes much more to Peirce than he ever acknowledged, but it is interesting to note that there is a historical continuity about the concept of truthlikeness. Popper made his first attempt at definition in 1960 after reading Quine’s (1960, p. 23) criticism that Peirce’s talk about approach to the truth involves “a faulty use of numerical analogy,” since the notion “nearer than” is defined for numbers and not for theories (see Popper, 1963, p. 231). Popper’s comparative definition of “closer to the truth” failed, as it could not be applied to pairs of false theories. Another motivation for Popper came from his anti-inductivist wish to replace the Bayesian notion of probability by an alternative concept which combines the aims of truth and information. (Popper never spoke about abduction, since his anti-inductivism denied any kind of inference from observations to explanatory theories.) After Popper several approaches have been developed. 
The similarity approach employs a framework of mutually exclusive and jointly exhaustive constituents (complete states of affairs) C1 , . . . , Cn and distance d(Ci ,Cj ) between constituents. Among the constituents there is a unique true constituent C*. All theories can be expressed in a normal form as a disjunction of constituents. A theory is approximately true if its minimum distance from C* is sufficiently small. A theory is truthlike if its overall distance from the target C* is sufficiently small. This distance has been measured by the average distance of the disjuncts d(Ci ,C*) in the normal form (see Oddie, 2014) or by the weighted sum of the minimum distance and the normalized total sum of the distances (see Niiniluoto, 1987).
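The distance measures of the similarity approach can be sketched on a toy model. Everything concrete below is an assumption made for illustration: constituents are indexed 0 to n-1 and placed on a line, d(Ci,Cj) = |i - j|/(n - 1), a theory is the set of indices of its disjuncts, and the weights gamma and gamma_prime of the weighted-sum measure are free parameters set arbitrarily to 0.5.

```python
def d(i: int, j: int, n: int) -> float:
    """Hypothetical normalized distance between constituents Ci and Cj."""
    return abs(i - j) / (n - 1)


def min_distance(theory, true_index: int, n: int) -> float:
    """Distance of the closest disjunct from the true constituent C*."""
    return min(d(i, true_index, n) for i in theory)


def average_distance(theory, true_index: int, n: int) -> float:
    """Average distance of the disjuncts from C*."""
    disjuncts = list(theory)
    return sum(d(i, true_index, n) for i in disjuncts) / len(disjuncts)


def min_sum_distance(theory, true_index: int, n: int,
                     gamma: float = 0.5, gamma_prime: float = 0.5) -> float:
    """Weighted sum of the minimum distance and the normalized total sum."""
    total_all = sum(d(i, true_index, n) for i in range(n))
    normalized_sum = sum(d(i, true_index, n) for i in theory) / total_all
    return gamma * min_distance(theory, true_index, n) + gamma_prime * normalized_sum


N, TRUE = 5, 2
theories = {
    "complete truth": {2},
    "true but weak": {1, 2, 3},
    "false and precise": {4},
    "tautology": set(range(N)),
}

for name, theory in theories.items():
    print(name,
          min_distance(theory, TRUE, N),
          average_distance(theory, TRUE, N),
          min_sum_distance(theory, TRUE, N))
```

A theory counts as approximately true when its minimum distance is small, so the tautology is trivially approximately true in this sense, yet its overall distance from C* is not small on either of the two overall measures.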
In order to justify rules like IBAE and IBT, one should be able to generalize the conclusions of section “Inference to the Best Explanation” which establish a probabilistic link between explanatory power and truth: theorem (14) shows that the explanatory success of theory H increases the posterior probability P(H/E) of H given evidence E, and posterior probability is the most reliable indicator of the truth of H. For IBAE and IBT, one needs a corresponding link between explanatory success and truthlikeness. For this purpose, the counterpart of epistemic probability P(H/E), which is equal to the expected truth value of H, is the expected verisimilitude of H given E:
$$\mathrm{ver}(H/E) = \sum_{i=1}^{n} P(C_i/E)\,\mathrm{Tr}(H, C_i), \qquad (22)$$
where the sum goes over all i = 1, ..., n. Here C1, ..., Cn are the alternative constituents expressible in some language L, and the degree of truthlikeness of theory H would be Tr(H,Ci) if Ci were the true state. The real truthlikeness is the degree Tr(H,C*) for the unique true constituent C* (see Niiniluoto, 1987, for real and estimated truthlikeness). If all constituents Ci are equally probable given E, then also their ver-values are equal. But if evidence E entails one of them, say C1, then ver(H/E) = Tr(H,C1). By the same token, if P(C1/E) → 1 with increasing evidence, then ver(H/E) → Tr(H,C1). It is important that the value ver(H/E) generally differs from the probability P(H/E). For example, the ver-value of a tautology t is less than maximal, even though P(t/E) = 1. Further, unlike probability P(H/E), ver(H/E) may be non-zero even when the evidence E refutes H. Still, ver(H/E) is the best indicator of real truthlikeness, and some conditions guarantee convergence to the truth in the sense that ver(H/E) approaches the real degree Tr(H,C*) for increasing evidence E (Niiniluoto, 2018, p. 144).
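The behavior of expected verisimilitude, computed as the sum over i of P(Ci/E) times Tr(H,Ci), can be checked on a toy model. The sketch below rests on assumptions that are not the chapter's own definitions: constituents lie on a line with distance |i - j|/(n - 1), and Tr(H,Ci) is taken to be one minus the average distance of the disjuncts of H from Ci. The posterior distribution over constituents is likewise hypothetical.

```python
def tr(theory, target: int, n: int) -> float:
    """Hypothetical truthlikeness: 1 minus the average normalized distance
    of the disjuncts of the theory from the target constituent."""
    disjuncts = list(theory)
    avg = sum(abs(i - target) / (n - 1) for i in disjuncts) / len(disjuncts)
    return 1.0 - avg


def ver(theory, posterior, n: int) -> float:
    """Expected verisimilitude: sum over i of P(Ci/E) * Tr(H, Ci)."""
    return sum(posterior[i] * tr(theory, i, n) for i in range(n))


N = 5
posterior = [0.0, 0.1, 0.7, 0.2, 0.0]   # hypothetical P(Ci/E); E refutes C0 and C4
tautology = set(range(N))               # P(t/E) = 1
refuted = {0}                           # P(H/E) = 0, since P(C0/E) = 0
favored = {2}                           # the most probable constituent

ver_tautology = ver(tautology, posterior, N)
ver_refuted = ver(refuted, posterior, N)
ver_favored = ver(favored, posterior, N)

# If the evidence entails one constituent, ver(H/E) reduces to Tr(H, Ci).
certain = [0.0, 0.0, 1.0, 0.0, 0.0]
ver_certain = ver(refuted, certain, N)

print(ver_tautology, ver_refuted, ver_favored, ver_certain)
```

On these assumptions the tautology has probability 1 but a ver-value below the maximum, the refuted theory has probability 0 but a positive ver-value, and a step from the refuted theory to the favored one would count as estimated progress because ver increases.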
With these logical tools, it is possible to show that at least on some conditions theorem (14) can be generalized to the following principles:
If H is a better explanation of E than K, then ver(H/E) > ver(K/E). (23)
If H is the best explanation of E, then E increases the expected verisimilitude ver(H/E) of H. (24)
(See Niiniluoto, 2018, pp. 142–143.) Theo Kuipers (2019) defends the rule IBT within his own framework of “nomic truth approximation.” He defines “closer to the truth” by a comparative condition and “the best theory” by the comparison of correct consequences and counterexamples. His “Rule of Success” states that it is rational to favor a theory which has so far proven to be empirically more successful than its rivals.
The connection between explanatory success and increasing truthlikeness shows that Peirce was right in presenting abduction as the crucial ingredient of the self-corrective method of science. This can be made explicit by defining scientific progress as increasing truthlikeness (see Popper, 1963; Niiniluoto, 2014). More
precisely, with the functions Tr and ver, one may distinguish between real and estimated progress:
(RP) Step from theory H to theory K is progressive if and only if Tr(H,C*) < Tr(K,C*).
(EP) Step from theory H to theory K seems progressive on evidence E if and only if ver(H/E) < ver(K/E).
Successful abduction gives rise to increasing estimated progress, which is the best indicator of real progress in science. In this sense, abduction is a powerful method of truth-seeking in science (Niiniluoto, 2018).
The no miracle argument (NMA) concludes that scientific theories are truthlike from the premise that truthlikeness is the best explanation of the empirical success of science. This argument for scientific realism is abductive – more precisely, it involves the schema IBT. If the comparative IBTc is used instead of IBT, then the comparative NMAc concludes that K is more truthlike than H from the premise that K is empirically more successful than its rival H (Niiniluoto, 2018, p. 159). Thus, comparative NMAc supports the realist thesis that science has in fact progressed by increasing verisimilitude: the growing empirical success of science indicates that historical sequences of theories have approached the truth. This optimistic realism gives a response to the antirealist “pessimistic meta-induction” about science.
The concepts and conclusions of this section have been formulated with reference to a fixed target, i.e., complete truth C* in some language L. But it is important that abduction can lead to scientific progress also in cases where the language L changes with the introduction of new theoretical concepts, so that the set of relevant candidates of explanation is expanded as well. (This is one of the reasons why van Fraassen’s (1989) “bad lot argument” is not fatal to abduction.) A nice illustration was given by Peirce, when he argued that Kepler’s discovery of the orbit of the planet Mars in 1609 was “the greatest piece of Retroductive reasoning ever performed” (CP 1.71-74). Astronomers ever since antiquity had tried “to save the phenomena” by systems which assume only circular movements.
In his first law, Kepler broke away from this traditional restriction: starting from Tycho Brahe’s observations of the apparent places of Mars at different times and from the vague retroduction that the sun must have something to do with causing the planets to move in their orbits, Kepler implemented successive steps where his theory was approximately true but needed modifications “in such a way as to render it more rational or closer to the observed fact.” Finally he reached abductively the radically novel conclusion that the orbit of Mars is an ellipse (instead of a circle) with the sun at one focus. The next inductive step was to generalize this law to the orbits of all planets in the solar system.
Conclusion
Philosophical debates about the role of abduction often concentrate, in a misleading way, on the question whether abductive reasoning produces a true conclusion in a single application of Peirce’s canonical schema (11) or its special case IBE. As
abduction is a fallible form of ampliative inference, it is no wonder that we cannot prove the truth of its conclusion. But we have seen in this chapter that there are no good reasons to agree with the skeptical view that abduction is irrelevant to truth or merely ignorance-preserving. Besides guiding our actions in local everyday circumstances, finding an explanation of known facts and regularities may help to discover new scientific laws and theories, narrow down the space of test-worthy hypotheses for further investigation, and give justification to new conclusions in a weaker sense (probabilistic confirmation) or stronger sense (tentative acceptance). Repeating the schema of abduction with enlarging evidence or with the unification of independent explanations is a powerful method of approaching the truth. This dynamic truth-seeking is clearly the main function of Peirce’s abduction, which means that abduction has a crucial role in scientific progress.
References
Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanation. Springer.
Curd, M. (1980). The logic of discovery: An analysis of three approaches. In Nickles (pp. 201–219).
Davies, M., & Coltheart, M. (2020). A Peircean pathway from surprising facts to new beliefs. Transactions of the Charles S. Peirce Society, 56, 400–426.
Douven, I. (2011). Abduction. In E. Zalta (Ed.), Stanford encyclopedia of philosophy. Stanford University. http://plato.stanford.edu/archives/spr2011/entries/abduction/
Gabbay, D. M., & Woods, J. (2005). The reach of abduction: Insight and trial. Elsevier.
Goldman, A. (1986). Epistemology and cognition. Harvard University Press.
Hanson, N. R. (1958). Patterns of discovery. Cambridge University Press.
Harman, G. (1965). Inference to the best explanation. The Philosophical Review, 74, 88–95.
Hempel, C. G. (1965). Aspects of scientific explanation. The Free Press.
Hintikka, J. (1968). The varieties of information and scientific explanation. In B. van Rootselaar & F. Staal (Eds.), Logic, methodology, and philosophy of science (pp. 51–171). North-Holland.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34, 503–533.
Howson, C., & Urbach, P. (1989). Scientific reasoning: The Bayesian approach. Open Court.
Josephson, J., & Josephson, S. (Eds.). (1994). Abductive inference. Cambridge University Press.
Kapitan, T. (1997). Peirce and the structure of abductive inference. In N. Houser, D. D. Roberts, & J. van Evra (Eds.), Studies in the logic of Charles Peirce (pp. 477–496). Indiana University Press.
Kuipers, T. (1999). Abduction aiming at empirical progress or even truth approximation leading to a challenge for computational modelling. Foundations of Science, 4, 307–323.
Kuipers, T. (2019). Nomic truth approximation revisited. Springer.
Laudan, L. (1980). Why was the logic of discovery abandoned? In Nickles (pp. 173–183).
Levi, I. (1967). Gambling with truth. Alfred A. Knopf.
Lipton, P. (2004). Inference to the best explanation. Routledge.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer and Plenum.
Magnani, L. (2017). The abductive structure of scientific creativity: An essay on the ecology of cognition. Springer.
Magnani, L., & Bertolotti, T. (Eds.). (2017). Springer handbook of model-based science. Springer.
McAuliffe, W. (2015). How did abduction get confused with inference to the best explanation? Transactions of the Charles S. Peirce Society, 51, 300–319.
McKaughan, D. (2008). From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles S. Peirce Society, 44, 446–468.
Myrvold, W. C. (2003). A Bayesian account of the virtue for unification. Philosophy of Science, 70, 399–423.
Nickles, T. (Ed.). (1980). Scientific discovery, logic, and rationality. D. Reidel.
Niiniluoto, I. (1981). Statistical explanation reconsidered. Synthese, 48, 437–472.
Niiniluoto, I. (1987). Truthlikeness. D. Reidel.
Niiniluoto, I. (1999). Defending abduction. Philosophy of Science (Proceedings), 66, S436–S451.
Niiniluoto, I. (2011). The development of the Hintikka Program. In D. Gabbay, S. Hartmann, & J. Woods (Eds.), Handbook of the history of logic, vol. 10 (Inductive Logic) (pp. 311–356). North-Holland.
Niiniluoto, I. (2014). Scientific progress as increasing verisimilitude. Studies in History and Philosophy of Science, 46, 73–77.
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Niiniluoto, I. (2022). Peirce’s abduction and its interpretations. In C. de Waal (Ed.), The Oxford handbook of C. S. Peirce. Oxford University Press.
Oddie, G. (2014). Truthlikeness. In E. Zalta (Ed.), The Stanford encyclopedia of philosophy. http://plato.stanford.edu/archives/spr2011/entries/abduction/.
Paavola, S. (2006). Hansonian and Harmanian abduction as models of discovery. International Studies in the Philosophy of Science, 20, 91–106.
Peirce, C. S. (1931–1935, 1958). Collected papers 1–6, edited by Hartshorne, C., & Weiss, P., 7–8, edited by A. Burks. Harvard University Press. (CP)
Peirce, C. S. (1976). The new elements of mathematics, edited by Eisele, C. Mouton.
Peirce, C. S. (1982). Writings of Charles S. Peirce: A chronological edition, edited by M. Fisch et al. Indiana University Press. (W1)
Peirce, C. S. (1998). The essential Peirce, vol. 2 (1893–1913), edited by the Peirce Edition Project. Indiana University Press. (EP2)
Popper, K. R. (1963). Conjectures and refutations. Routledge and Kegan Paul.
Psillos, S. (2011). An explorer upon untrodden ground: Peirce on abduction. In D. Gabbay, S. Hartmann, & J. Woods (Eds.), Handbook of the history of logic, vol. 10: Inductive logic (pp. 117–151). North-Holland.
Quine, W. V. O. (1960). Word and object. The MIT Press.
Roche, W., & Sober, E. (2013). Explanatoriness is evidentially irrelevant; or, inference to the best explanation meets Bayesian confirmation theory. Analysis, 73, 659–668.
Salmon, W. (2001). Explanation and confirmation: A Bayesian critique of inference to the best explanation. In G. Hon & S. S. Rakover (Eds.), Explanation: Theoretical approaches and applications (pp. 61–91). Kluwer.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Schurz, G. (2015). Causality and unification: How causality unifies statistical regularities. Theoria, 30, 73–95.
Seddon, P. B. (2021). Nature chose abduction: Support from brain research for Lipton’s theory of inference to the best explanation. Foundations of Science. https://doi.org/10.1007/s10699-021-09811-3.
Smokler, H. (1968). Conflicting conceptions of confirmation. The Journal of Philosophy, 65, 300–312.
Thagard, P. (1978). The best explanation: Criteria for theory choice. The Journal of Philosophy, 75, 76–92.
van Fraassen, B. (1989). Laws and symmetry. Oxford University Press.
Imagination, Cognition, and Methods of Science in Peircean Abduction
12
Ahti-Veikko Pietarinen and Francesco Bellucci
Contents
Introduction
Methods of Science
Hope to Believe
Economy of Research
Uberty
Seductive Abduction
Instinct, Insight, Imagination
Conclusions
References
Abstract
This chapter canvasses some of the most pertinent notions that Peirce allied to his conception of abduction: divination, trustworthiness, hope, investigand, economy, uberty, seduction, instinct, insight, and imagination. While such notions may suggest Peirce sliding into psychologism about logic, Peirce’s own texts show that we can give these concepts perfectly non-psychological, natural, and scientific glosses – namely those that integrate their logical, linguistic, cognitive, and semiotic meanings. Thus attempts to characterize abduction as the “heuristics” of the operation of one’s psychology or cognitive architecture would leave the issue of the full nature and reach of abduction wanting.
A.-V. Pietarinen
Ragnar Nurkse Department of Innovation and Governance, Tallinn University of Technology, Tallinn, Estonia
F. Bellucci
Department of Philosophy and Communication Studies, University of Bologna, Bologna, Italy
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_6
Keywords
Peirce · Abduction · Scientific method · Cognition · Imagination · Hope · Investigand · Economy · Uberty
Introduction
Peirce’s original writings and formulations about abductive reasoning harbor a number of notions and qualifications that have not been addressed in the secondary literature to the extent they deserve. This is partly due to the unavailability of his writings, especially those from the later period of his life. But the skewed conception of Peirce’s original meanings owes much to the biased selections of texts that have appeared, as well as to a manifest lack of consensus on what constitutes the central corpus of relevant texts that bear on abduction, retroduction, and related matters concerning Peirce’s lifelong investigations into the history and logic of science. This chapter goes through some of the most pertinent notions that Peirce allied to his conception of abduction: divination, trustworthiness, hope, the investigand modality, economy, uberty, seduction, instinct, insight, and imagination. While such notions may suggest Peirce sliding into psychologism about logic, Peirce’s own texts show that we can give these concepts perfectly non-psychological, natural, and scientific glosses – namely those that integrate their logical, linguistic, cognitive, and semiotic meanings. Thus attempts to characterize abduction as the “heuristics” of the operation of one’s psychology or of a cognitive architecture of a machine would leave the issue of the true nature and reach of abduction wanting.
Methods of Science
Skillful guesses have been shown by experience to be correct above chance level. As discussed in Chap. 2, “Peirce’s Abduction,” in this volume, it is part of the justification of abduction to appeal to our experience of past abductions and then to infer by induction that abductions generally perform better than random guesses. Where does this power of divination draw its hits, which throughout the history of science appear to outnumber its misses? We can look for answers in two domains: the nature of the mind and the nature of the methods and their development that has taken place in the sciences. For the purposes of the present exposition, the nature of the mind could further be divided into questions that have to do with the imaginative abilities of the mind and the nature and properties of intellectual cognition. First, a terminological clarification is in order. By “power of divination,” Peirce meant a specific form of instinctive reasoning akin to other and perhaps more familiar types of instinct, explaining it in terms such as these: “the human mind possesses, in some degree, a power of divining the truth, which is no more, at its
utmost, than to have some endowment of instinct such as many species of birds, insects, and other creatures possess” (R 652, p. 14, 1910). Second, this naturalism about abduction’s power to approach truth agrees with a realist theory of what science and its methods have accomplished in the past. A famous argument for realism is the “no-miracles” argument, namely that scientific explanations are the only ones that do not make scientific progress look like a miracle. In the business of science, one finds better, more detailed and more encompassing arguments, reasons, proofs, and experimental designs that will provide deeper and broader explanations of one’s scientific observations. Niiniluoto (2018, p. 157) notes that the “no-miracle” argument is itself abductive. In fact this observation was made by Peirce long before the “no-miracles” argument acquired its name. He argued that lessons from the history of science are borne out by the relatively small proportion of good retroductions that have turned out to be quite false. “This statistical argument is itself retroductive,” Peirce states and adds that we must be on our guard against a begging of the question: it is a source of support, “though by no means the principal support, of my doctrine that the human mind has a power of divination” (R 652, pp. 23–24, 1910). That is, by 1910 Peirce had argued that the statistical argument from the history of science is in fact abductive (“is itself retroductive”), not inductive. Peirce insists that the frequency argument, i.e., the inductive argument, while it is one of the supports for the justification of abduction, is “by no means the principal support.” The question thus arises as to what Peirce thought the principal support for abduction to be. In a manuscript dating from c. 1906, Peirce had written on the same question in the following words:
This kind of reasoning [abduction/retroduction] is justified by two propositions taken together. One is that man’s mind which is a natural product formed under the influences which have developed Nature (here understood as including all that is artificial), has a natural tendency to think as Nature tends to be. This must be so if man is ever to attain any truth not directly given in perception; and that he is to attain some such truth he cannot consistently, nor at all, deny. The other proposition is that no other process of deriving one judgment from another can ever give any substantial addition to his knowledge; so that, if he is to reason at all, we must assume that this kind of reasoning succeeds often enough to make it worth while; since it certainly is not worth while to leave off reasoning altogether. (R 876, p. 3, c. 1906)
In this passage the justification of abduction is shown to repose on two propositions taken together. The first is that the human mind has a natural tendency to reason correctly about natural phenomena. The reason is that it is a product of the same influences, forces and drivers as natural evolution. Either we must reason abductively or we do not reason at all. Second, the trustworthiness of abduction grows out of the irritation of living doubt when that irritation would cause thought to act merely out of desperation. Such irritation has to be appeased by the belief that we can have substantial additions to our knowledge by investment in future inquiry. It is better – and precisely better for the sake of the perseverance of the human condition and well-being – to rely on the truth of our many guesses than never to rely on them at all.
A.-V. Pietarinen and F. Bellucci
There is thus a twofold origin to the trustworthiness of abductive reasoning. First, the cause of the validity of abduction is that the human mind has developed, as an outgrowth of natural evolution, an instinct for guessing correctly. We could explain much of such causes and their emergence by the modern co-evolutionary theories of ontogenic development including those of instinctive powers of active inference to sustain the survival of organisms (Beni & Pietarinen, 2021; Pietarinen & Beni, 2021). Second, the reason for clinging to this hope as steadfastly as possible is that no other kind of reasoning would allow us to put questions to nature with an expectation that nature responds. This, in turn, relates to Peirce’s doubt-belief epistemology according to which we desire to avoid the vacuum of doubt; instead, we strive to replace that void with conjectures, surmises, informed guesses and ultimately beliefs with varying degrees of credence on the basis of which we then act (Ma & Pietarinen, 2015, 2018a). We either reason abductively or stop reasoning altogether. Opting for the cessation of reasoning would mean the progress of science coming to a halt. In sum, the justification of abduction is that abduction “is the result of a method that must lead to the truth if it is possible to attain the truth. Namely we must assume the human mind has a power of divining the truth, since if not it is hopeless even [to perform reasoning]” (R 276, p. 9). The reason for the correctness of abduction is that “to say we really believe in the truth of any proposition is no more than to say we have a controlling disposition to behave as if it were true” (R 652, p. 14). This belief is “sufficient reason for believing that we have such power” (ibid.) of guessing right (Bellucci & Pietarinen, 2020, 2021).
Hope to Believe
Peirce draws the lesson that “we must therefore be guided by the rule of hope, and consequently we must reject every philosophy or general conception of the universe, which could ever lead to the conclusion that any given general fact is an ultimate one,” as he writes in an important long letter to William James on Christmas day 1909 (EP2, pp. 501–502). Here the spestic modality of “hope,” which Peirce frequently introduces in relation to the power of abduction to divine the state of the world, is clearly an allusion to the argument from desperation. Instead of beliefs or knowledge, abductive conclusions signal rational hope that eventually nature will come out the way she is, and that inquirers can participate in those ways through interrogations of the kind suggested by the abductive schema. Our observations concern answers that nature, or perhaps our own future self, comes up with when subjected to experimental and logical questioning. From those answers we then gather “our hope that the question which happens to occupy our minds is capable of final decision,” Peirce explains to his friend Francis C. Russell (September 6, 1894, RL 387). In the same year Peirce wrote that the followers of science are “animated by a cheerful hope that the process of investigation, if only pushed far enough, will give one certain solution to each question to which they apply it” (R 422, 1894). Our cheerful, rational hope – the spestic modality checked
by experience – relies on the prospect that at least some of those question-answer pairs in our interrogative pursuits are on the right track. (And so one can credit them with the status of pre-belief modalities, studied with precision in Ma & Pietarinen, 2018a.) Such facets decidedly look away from abduction as an epistemological conundrum (Hintikka, 2007). Rather, they make abduction look as if it had little or no bearing on the inquirer’s states of knowledge and belief, and concerned merely the effect of inquiry in its probationary states. Peirce famously described abduction as scientists’ “guessing which subsequently they come to believe” (R 905, 1907). A belief may follow, provided that the hopes we entertain at the crucial moments of drawing those scientific guesses were rightly present and guiding our rational minds in their orientation:
The entire fabric of science has to be built up out of surmises at truth. (CP 7.87)
Retroduction . . . depends on our hope, sooner or later, to guess at the conditions under which a given kind of phenomena will present itself. (RL 477)
This regulative principle of investigation guided Peirce’s own investigations too. A year before his Harvard and Lowell Lectures, he had stated that
When we discuss a vexed question, we hope that there is some ascertainable truth about it, and that the discussion is not to go on forever and to no purpose. (CP 2.113, 1902)
His 1903 Lowell Lectures were precisely titled Some Topics of Logic Bearing on Questions Now Vexed (Peirce, 2019-2022 Volume 2/2). There he offered numerous solutions to questions such as the nature of propositions, principles of sound reasoning, and higher-order logic as the foundation of mathematics, among countless others. The mood of abductive conclusions is neither interrogative nor imperative alone but rather a mixture which resembles what linguists call co-hortative or jussive moods: moods that capture both the important idea of “pursuit-worthiness” of abduced conclusions and the “rational hopes” of our guesses to turn out in the way our minds or machines predict. Investigands are invitations to investigate conjectures further. Those investigations start at the level of pre-beliefs. It is in the nature of the logic of abduction that some reasons are found why its conclusions are worthy of further investment.
Economy of Research
To see that abductions are not idle forms of reasoning that infer a weak modality of whatever “may-be,” in 1905 and later, Peirce would emphasize that the conclusion of abduction is an interrogation (CP 6.528, 1901) that can be expressed in a sentence in the interrogative mood (EP2: 287, 1903). In an unsent draft letter to Victoria Welby he writes:
[The] “interrogative mood” does not mean the mere idle entertainment of an idea. It means that it will be wise to go to some expense, dependent upon the advantage that would accrue
from knowing that Any/Some S is M, provided that expense would render it safe to act on that assumption supposing it to be true. This is the kind of reasoning called reasoning from consequent to antecedent. [ . . . ] Instead of “interrogatory”, the mood of the conclusion might more accurately be called “investigand”, and be expressed as follows: It is to be inquired whether A is not true. The reasoning might be called “Reasoning from Surprise to Inquiry”. (Peirce to Welby, July 16, 1905, RL 463)
Abduction does more than conclude an abundance of ideas about possible explanations. The conclusion of abduction advances a hypothesis neither as true nor as a mere idea or possibility devoid of any scientific or intellectual value whatsoever. Abduction concludes an assertion (Chiffi & Pietarinen, 2020) which, as inquiry proceeds, is worth investigating further in order to determine its status. The grammatical expression of such conclusions is that of an assertion in the investigand mood, meaning that the propositional content of the sentence (the hypothesis) is qualified by a peculiar illocutionary force which Peirce expressed as “It is to be inquired whether H is true or not.” This illocutionary force is not exactly the interrogatory force of linguistic questions. Rather, it is what Peirce describes as the “investigand” force that scientific hypotheses are endowed with. That mood communicates a certain urge or seduction for inquiry that a scientific mind observes some hypotheses to possess, so much so that it is those and only those hypotheses that are irresistibly recommended to be subjected to further investigation that are accepted by abductive reasoning as its conclusions. In their investigand mood, abductive conclusions represent the upshot of the Socratic questioning method. Peirce elaborates the method into a scheme in which conclusions are investigands and take into account values that make conclusions worthy of further investment and investigation. Illocutionary forces in abductive assertions thus draw from economic and not only epistemic values. An important part of that investigand mood is that the accepted proposition is also accepted as conveying the interrogative as something that is positively put forward as a “good question to ask” (R 478, p. 101). So what are the good questions to be asked? How do we delineate or classify good ones so as to distinguish them from a host of other not-so-good questions?
Peirce’s answer is that in science, good questions are those that are accepted as “making an explanatory hypothesis” (R 478, p. 101, emphasis in the original). At least some of our abductions are accepted as good because, in asserting them, we pronounce investigands that suggest hypotheses able to explain surprising, anomalous, or confounding phenomena. The normative and value-laden question of what makes some questions good and some others bad has to be addressed from the point of view of the economy of research (Chiffi et al., 2019). Peirce distinguished three components in his argument for economy of research. First, hypotheses possess the quality of caution: according to it, hypotheses are to be broken down into their smallest logical components. Big questions are to be divided into series of small questions, why-questions into series of yes-no questions. Second, according to the quality of breadth, the value of a hypothesis is evaluated by its applicability in other but related subjects across a multiplicity of contexts and circumstances. Explanations of the same phenomena should be evaluated according to their consequences. Finally, the quality of incomplexity (the
absence of complexity, simplicity or artlessness) states that, because hardly any hypothesis is optimal anyway, they ought at least to “give a good leave” (EP 2:110). Refuting a hypothesis sets an example of good conduct to be followed, as we should attempt as large a “break” as possible from a hypothesis that is soon to fail us, for reasons of economic contingency: “we must always consider what will happen when the hypothesis proposed breaks down” (CP 7.220). A leading motivation that carried Peirce to the restatement of the logical schema of abduction in late 1903 was the realization of the importance of economic factors that influence our reasoning in discovery and innovation. Those factors need to be integrated into the logic of abduction and their reasons submitted to logical analysis. Peirce puts the case as strongly as the following passage reveals:
The principles upon which abduction ought to be conducted ought to be determined exclusively by considerations of what purpose it subserves and how it may best subserve that purpose. Since, therefore, in scientific investigation abduction can subserve no other purpose than economy, it follows that the rules of scientific abduction ought to be based exclusively upon the economy of research. (CP 7.220 fn)
Of permanent importance among questions of economy of research is the question of the allocation of resources. A closer analysis of abductive reasoning reveals the presence of what can be called Peirce’s maxim of novelty: “the new money should mainly go to opening up new fields,” the reason being that “new fields will probably be more profitable, and, at any rate, will be profitable longer” (CP 7.160). Chances of discovery are highest in frontier and interdisciplinary research, as it is there that the constraints that may otherwise be imposed upon finding the right antecedents of the subjunctive conditionals are deliberately and decidedly turned off. Finding new logical methods and tools of analysis is equally profitable and is part and parcel of blue sky research; those methods are moreover cheap to pursue while at the same time being of broad applicability across sciences and without limits (cf. CP 7.161). Curiosity-driven, exploratory ideas must reign free in order to maximize chances of hitting upon new hypotheses, those that Peirce characterized in terms of the uberty of abduction.
Uberty
Abduction concludes nothing about the truth of the hypothesis: “to assert the truth of its conclusion ever so dubiously would be too much” (R 692). But this is not to claim that truth has no role whatsoever to play in abductive reasoning. Truth is valued in the process in another sense. Our abductive guesses track another, related notion. Hypotheses prompted by economic considerations are to be ranked higher than those that are not, along the quality of good hypotheses which Peirce terms their “uberty.” Our intellectual guesses are to be valued according to that quality. The meaning of “uberty” is not explained in Peirce’s surviving writings in so many words. A rather exhaustive account of what he said about it can be provided by a collation of a couple of key sentences: “[U]berous . . . greatly wanting in security”
(R 684, p. 3); “[U]ber, or udder, is sure to be often gravid with actually existing nutritious food” (R 692); “I am going to insist upon the superiority of Uberty over Security in the sense in which gold is more useful than iron, though the latter is more useful in some respects. And also that the art of making explanatory hypotheses is the supreme branch of logic” (Peirce to Josiah Royce, 30 June 1913, Harvard University Archives). That is, uberous reasoning is of great value even though it is lacking in the security (soundness) of deduction, and its value may be even greater than that of security. And perhaps most importantly, in one of the last pieces he ever wrote, in November 1913, we find Peirce explaining abduction and uberty as follows:
[November 6] I think logicians should have two principal aims: First, to bring out the amount and kind of security (approach to certainty) of each kind of reasoning, and second, to bring out the possible and esperable uberty, or value in productiveness, of each kind. I have always, since early in the sixties, recognized three different types of reasoning, viz: First, Deduction which depends on our confidence in our ability to analyze the meanings of the signs in or by which we think; second, Induction, which depends upon our confidence that a run of one kind of experience will not be changed or cease without some indication before it ceases; and third, Retroduction, or Hypothetic Inference, which depends on our hope, sooner or later, to guess at the conditions under which a given kind of phenomenon will present itself. Each of these three types occurs in different forms requiring special studies. From the first type to the third the security decreases greatly, while the uberty as greatly increases. [Interruption] [November 13] I don’t think the adoption of a hypothesis on probation can properly be called induction; and yet it is reasoning and though its security is low, its uberty is high. (CSP to J. H. Woods, November 6–13, 1913; RL 384).
From these excerpts we learn that uberous abductions nurture discovery even when they lack soundness that characterizes trustworthy inferences under deductive modes of inference. There is a trade-off between high security and high uberty: both cannot be had at the same time. Peirce continues his letter to Royce by claiming that the term “‘uberty’ covers two quite distinct virtues.” Unfortunately, what follows in the text trails off and does not explain what those virtues are. One can suppose that it is only after getting clear about the hierarchies and properties of the whole range of such sub-belief modalities that the very problem of what Peirce might have meant by the uberty of abductive conclusions can be properly addressed. Even so, uberty is a prime quality of scientific guesses upon which scientists rely when formulating their assertions about promising future hypotheses; those are the hypotheses that are “gravid with young truth” (R 682). They have “value in productiveness,” as Peirce explained in the letter to Woods. That is, uberty is a scientific value, just as economy is. Yet uberty is not mere “fruitfulness.” Peirce denies that non-uberous hypotheses could be taken to be as fruitful as we like and yet be conclusively overturned whenever compelling evidence emerges from experiments, simulations and analyses of data that refutes them. In contemporary terms, with economy and uberty Peirce emphasizes the importance of non-epistemic values in science, and recognizes that those are not altogether distinct from epistemic values.
In another draft of R 682 (reproduced in EP 2), Peirce also specifies that “uberty” is not the same as “fruitfulness”:
I can hardly be supposed to have selected the unusual word “uberty” instead of “fruitfulness” merely because it is spelled with half as many letters. Observations may be as fruitful as you will, but they cannot be said to be gravid with young truth in the sense in which reasoning may be, not because of the nature of the subject it considers, but because of the manner in which it is supported by the ratiocinative instinct. (R 682, EP 2, p. 472)
Scientific pursuits combine observation and reasoning. Their results are constituted by the “processes of collecting and grouping results of Observation and of Reasoning [ . . . ] so that Science itself, when this word is used in the sense of that sort of information that it is the function of men of science to supply to practical men, will consist in what those men have concluded from their reasonings about observations” (R 682, EP 2, p. 471). Now, observation itself is probably the most fruitful part of the scientific enterprise, in the sense that much of the information out of which a scientific theory emerges comes from observation. But abduction – one of the modes of reasoning through which a scientific theory is generated on the basis of observation – is not simply fruitful: it is “uberous.” Uberty, then, concerns informativeness peculiar to abductive reasoning, as opposed to the informativeness peculiar to empirical observation. It is the latter that Peirce calls “fruitfulness.” What is nonetheless important is that the uberty of conclusions suggests that non-epistemic values, too, can be logically analyzed. Recall Peirce’s statement that “I think logicians should have two principal aims: first, to bring out the amount and kind of security (approach to certainty) of each kind of reasoning, and second, to bring out the possible and esperable uberty, or value in productiveness, of each kind” (RL 384). It needs to be at least conceivable that the nature of those values could be logically analyzed alongside epistemic values. (One way of doing so is by the application of non-normal modal logics of conjecture-making, which have provided a theory of abductive reasoning at the level of pre-beliefs; Ma & Pietarinen, 2018.) There is a connection between uberty and our human hopes that things will turn out the way anticipated.
It springs from uberty that the searching questions of science upon which our rational hopes are built are amenable to final resolutions as to how to act upon them. In one of his last essays from late 1913, “Essay toward Improving Our Reasoning in Security and in Uberty,” Peirce takes abductive conjectures in science to be distinguishable from other and less uberous formulations of interrogative moods, in the sense that the former are “actually gravid with living and prolific truth” (R 683, p. 8). Peirce is not talking about truth per se but “living truth”: truth that might arise to our view in case our inquiries were to be pushed to their utmost limits. It is those conjectures, prolific in truth, that are of “value in increasing knowledge” (ibid.). Peirce does not base his late revisions of abduction directly on the shortcomings of his earlier, 1903 Harvard schema, and he does not claim that conjectures increase knowledge. The revision concerns how abduction, skeletally an inverse reasoning from effects to causes, can result in conclusions in the investigand mood. The increase in knowledge follows from the imperative part of the investigand. Epistemic import is not derived from interrogatives alone. Questions isolated from
the larger context of research agendas and other collective and institutional pursuits are insufficient to cash out the benefit in terms of knowledge. The advances that Peirce refers to draw not from abduction as such but from the scientific values that its imperative mood, when activated, has toward the resolution of pertinent research questions, given the entire tripartite cycle of the three stages of reasoning: abduction, deduction, and, imperatively, the experimental testing of their outcomes by induction. Indeed the action of induction is to conclude “from the results of [abduction and deduction] to what extent it will be safe to rely upon the hypothesis” (R 478, p. 102, 1903). Peirce’s post-1903 investigations conclude that abduction does not conclude by asserting the truth of its conclusions. It shows truth as a general concept, as an idea of it. That idea can be observed in what the antecedent of the subjunctive conditional expresses in the abductive schema. It would be misleading to assume antecedents of such conditionals to be endowed with truth-values. Thus abductions conclude, in co-hortative or jussive moods, what, scientifically speaking, is possible and worthy of further investment in subsequent stages of investigation.
Seductive Abduction
In later years, Peirce also mentions that abduction is “seductive.” Is this not a sea change in Peirce’s thought that subjects abduction to psychological investigation? Peirce offers similar characterizations also of deduction and induction, speaking of deduction as “compulsive” and of induction as “appealing to rational minds”:
I have hitherto defined [deduction] as necessary reasoning; and no doubt much, perhaps most, possibly all deduction is necessary. But on reviewing the subject for this talk, it seems to me more correct to define Deduction as compulsive reasoning. Retroduction seduces you. Induction appeals to you as a reasonable being. But Deduction first points to the premises and their relation, and then shakes its fist in your face and tells you “Now by God, you’ve got to admit the conclusion”. I beg your pardon, with all my heart, I meant to say, “Now by the eternal world forces spiritual and personal [illeg.]. Necessary reasoning is reasoning from the truth of whose premises it not only follows that the conclusion is true, but that it would be so under all circumstances. (R 754, 1907)
What does Peirce mean by saying that abduction, induction and deduction are, respectively, seductive, reasonable and compulsive kinds of inferences? First, the “seduction” of abduction is the quality of conjectured hypotheses that replaces the vacuum of doubt and wonder and sets the mind in motion by prompting further inquiry. Conjectures are seductive precisely because, in light of the evolutionary development of the mind in continuous affinity with the world, the leading principle of abduction comes to be that the world is explainable. One has no choice other than to investigate further whether abductively conjectured assertions are indeed matters of course. Of the three types of reasoning, abduction is not only seductive but also persuasive because we are “compelled to begin with [it] if we are ever to discover a law or the rationale of any phenomenon” (R 843, p. 41, c.1907). Peirce talks about the “irresistible persuasiveness” that abductive conclusions have in scientific discovery:
You will find that our rational instinct often prompts us to reasoning such that no conceivable mass of similar data would render its conclusion either certain or, in the strict sense above defined, probable, and it appears to be evident on examination that it is impossible absolutely to prove that these arguments have any value whatever. Nevertheless, it seems that many of them have an almost, if not quite, irresistible persuasiveness, that many of them have caused great discoveries and apparent great advances in science; and finally the most decisive circumstance of all in their favour is that unless these arguments have do tend [sic] to carry us toward new truth in the whole, we must abandon all hope of penetrating further into the secrets of the universe than we have done already. (R 652, pp. 12–13, July 12, 1910)
Peirce draws an analogy from the game of Whist, in which players may be led to situations which “full warrant a player for acting on the hopeful hypothesis” (ibid., p. 14). This prompts abduction, which is reasoning “which from a consequent and a consequence infers an antecedent” (ibid., p.15). It is “infers” that Peirce accentuates here: abduction is inverse inference and has a certain logical form (Pietarinen, 2018). Second, induction is an appeal to our rationality in the sense that the tests it tries can decide whether the hypotheses (refined by deduction into predictions) are on the right track. In that sense, inductions add to the “concrete reasonableness” (CP 2.34, 1902) of our naturally evolving faculties for scientific beliefs by accumulating and updating our hypotheses about the world. Third, deduction, as the mediating stage of reasoning between abduction and induction, is described in the above quotation as compulsive. Far from being a mechanical routine of calculating conclusions from premises, Peirce places deduction too within the broader framework of scientific discovery and inquiry; it is better characterized not strictly speaking as necessary reasoning but as the mode of reasoning that admits of various other non-alethic modal characterizations. It is compulsive rather than necessary precisely in the sense in which no living doubt remains when inferring deductively that the conclusion follows from the premises. Skeptical Tortoises notwithstanding, deductive conclusions are logically sound (that from the truth of the premises it follows that the conclusion is true) and logically valid (that they so follow “under all circumstances”). Thus in his preferred late reframing of abductive, deductive, and inductive reasoning, Peirce is not talking in the lingo of psychology. As far as deduction is concerned, for example, he is restating what he by 1903 in his Lowell Lectures drafts had already concluded (R 454; Peirce, 2019-2022 LoF 2/2: 526). 
Reasoning in our elementary logic is both sound (everything deducible is logically true, that is, the system of elementary reasoning possesses the meta-logical property of soundness) and semantically complete (everything true is deducible in the system, that is, the system of elementary reasoning has the meta-logical property of completeness). Logical consequence means that a proposition holds in all the models; in Peirce’s words, it is true “under all circumstances.” Logical reasoning is moreover evident to reason, because it is based on observational facts about its representations and their transformations. Peirce set up precisely such systems of logical representations and rules and argued that these properties hold in the logic of existential graphs (Peirce, 2019-2022 LoF 2/1–2; see Chap. 34, “Abduction in Diagrammatical Reasoning” in this volume).
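The two meta-logical properties just mentioned can be stated compactly in modern notation. The following is only a sketch: the symbols (a premise set Γ, a conclusion φ, the turnstiles) are standard textbook conventions assumed here for illustration and are not Peirce’s own notation.

```latex
% Assumed notation (not Peirce's): \Gamma is a set of premises, \varphi a conclusion,
% \vdash means "deducible in the system", \models means "true in all models".
\begin{align*}
\text{Soundness:}    &\quad \Gamma \vdash \varphi \;\Longrightarrow\; \Gamma \models \varphi \\
\text{Completeness:} &\quad \Gamma \models \varphi \;\Longrightarrow\; \Gamma \vdash \varphi
\end{align*}
```

On this reading, Peirce’s phrase “under all circumstances” corresponds to the quantification over all models that the semantic turnstile abbreviates.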
Instinct, Insight, Imagination
There are three related but distinct notions at play behind Peirce’s logical theory of abduction: instinct, insight, and imagination. Abduction cannot be grounded on insight or instinct as such; imagination, by contrast, is a significant driver of abductive reasoning (Pietarinen & Bellucci, 2016). Mere instinct or instinctive reasoning does not suffice to explain why abductive reasoning functions in the way it does and why it has been successful in the sciences, because instinctive reasoning is not logical. Reasoning proceeds from premises to conclusions according to some general reason or principle. Whatever one might mean by “instinct,” it fails to refer to any such reason according to which we would be licensed, by voluntary and controlled means of behavior, to infer certain conclusions from given premises. Instinctive action does not possess a logical form. Resorting to instinct means admitting that no explanation was given at all why humans should generate and attune to certain peculiar hypotheses rather than some others, which is exactly what a logic of discovery or, in Peirce’s terms, a theory of abduction is supposed to provide. In some places Peirce suggests that the generation of abductive hypotheses acts like a “flash of insight”:
the abductive suggestion comes to us like a flash. It is an act of insight, although of extremely fallible insight. It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation (CP 5.181, 1903).
Some have taken such insight to be a special property in the faculty of the mind that seeks unity and coherence in experience. However, it seems that what actually happens in abduction cannot solely be based on some such singular mental desires of seeking unity in experience.
It is true that, according to Peirce, abduction involves, and actually begins with, a “colligation . . . of a variety of separately observed facts about the subject of a hypothesis” (CP 5.581, 1898). But organization and colligation of facts is of the nature of an inductive moment, which is involved or embedded in abductive reasoning, but abductive reasoning is not reducible to organization and colligation of facts. Although sudden acts of insight have repeatedly been reported in scientists’ testimonies in relation to feelings and experiences during moments of creative discovery, flashes of insight do not seem to capture well the totality of moments of discovery, or analyze in detail the complex inferences that have led the inquirer to those experiences. No singular act of sudden flashes of insight suffices to explain what is going on in the process of colligation of observed facts. In abduction, one overcomes facts by reaching beyond them by adding information. The logical process of reasoning goes beyond colligation and looks further than what any collection of facts can deliver. How to look beyond facts is what abduction is intended to facilitate. Abduction is that mode of reasoning particularly suited for the cases in which the facts themselves have run out and one must look for other, collateral sources of data in order to settle upon some compelling hypotheses, by the process of guessing under uncertainty (Pietarinen, 2014).
12 Imagination, Cognition, and Methods of Science in Peircean Abduction
If the insight is, on the other hand, meant to account for what is peculiar to the creative aspect of any discovery, the act-of-insight story fails to take into account the usual preconditions of creative discovery, such as the tedious and necessary groundwork needed for the insights to arise in the first place, or the equally important consideration of familiar experiences and common sense that guides the preferences among insights. Sudden acts of insight are hardly usable in analyzing creative discovery, because those acts of insight are already assumed to consist of creative moments. In other words, the acts of insight alone are meaningless in accounting for creative discovery, since those acts readily involve abductive reasoning. The creation of those very insights, as well as the selection of those insights that would carry the investigation further than many alternative insights would, is what needs to be explained by abduction; abduction is not to be explained by acts of insight. What we are left with then is the concept of imagination, which has fewer shortcomings than appeals to instinct and insight have. How does imagination connect with scientific reasoning? Peirce says that the importance of imagination in scientific investigation is in supplying an inquirer with “an inkling of truth” (CP 1.46, 1896). Since the limit notion of truth precludes gaining any direct insight into the truth, in rational inquiry the question of what the truth may be needs to be tackled by imagination.
It is here that the inquirers, blessed with that precious capacity of imagining the truth, will commence the process that “dreams of explanations and laws.” Imagination becomes a crucial part of the method for attaining truth, that is, of the logic of science and scientific inquiry, so much so that Peirce is led to pronounce that “next after the passion to learn there is no quality so indispensable to the successful prosecution of science as imagination” (CP 1.47). Scientific hypotheses have consequences and experimental effects. Their selection is preordained by abductive reasoning that follows certain principles such as economy of research (Woods, 2012). It suffices that the consequences that various hypotheses have are experienceable, that is, that they could be produced in scientific imagination capable of observation. Far from being products of fictional imagination, experienceable hypotheses producible in scientific imagination must concur with the rules of logic, laws of nature and other facts, relationships, and constraints established by previously accepted theories. But there need not be anything directly sensible in those statements of conceivable consequences and conditional resolutions to act in certain ways in certain kinds of circumstances. Imagination is a recurrent theme in Peirce’s discussions of scientific inquiry. A pioneer in experimental psychology and physiology, he was naturally fascinated by the empirical working of imagination. For Peirce as a logician and theorist of science, however, imagination transcends psychology, as it is an element of logical thought: “the operation of the imagination [ . . . ] is most important in all but the lowest kind of thinking” (R 1114, 1; W4, 43, 1879); “in reasoning of the best kind, an imaginary experiment is performed” (NEM 4, 375, c.1890); “A pretty wild play of the imagination is, it cannot be doubted, an inevitable and probably even a useful prelude to science proper” (CP 1.235).
Not surprisingly for a Kantian thinker as Peirce was, imagination was taken to have a crucial role in all reasoning. This is
A.-V. Pietarinen and F. Bellucci
because reasoning is fundamentally schematic and diagrammatic in its nature, and that “the best thinking [ . . . ] is done by experimenting in the imagination upon a diagram or other scheme, and it facilitates the thought to have it before one’s eyes” (NEM 1, 122). The product of pure imagination is what Kant called a schema and Peirce a diagram: all scientific imagination is diagrammatic, because all reasoning is so, directly or indirectly (R 293, 1907). Deduction is directly diagrammatic, ampliative reasoning indirectly so. Also the converse is true: all diagrams are logical, and all diagrammatic thinking is scientific thinking (i.e., “logical” in Peirce’s sense). Not all diagrammatic thinking is visual thinking, however; “imaginative” and “visual” are pretty distinct conceptual categories. A diagram is for Peirce “a concrete but possibly changing mental image of such a thing as it represents. A drawing or model may be employed to aid the imagination; but the essential thing to be performed is the act of imagining” (R 616; Peirce, 2019-2022 LoF 1; NEM 4.219n1, 1906). The essential nature of a diagram is not that of a picture. To employ models to aid the imagination is a highly pertinent early observation of the kind of model-based reasoning Peirce identifies as the core of scientific inquiry. How are the models discovered in the first place? How do iconic forms, congenial to the process of creation of hypotheses, aid the discovery of models? Note that models can be very concrete yet have generality just as diagrams do, as both are constructed according to general rules and precepts that govern a multiplicity of phenomena. Models represent conceivable states of affairs that can be stated in hypothetical conditional forms. Models have the status of the hypotheses, but imagination is needed to produce them in forms that facilitate discovery. 
Frigg and Hartmann (2012) have stated that “no theory of iconicity for models has been formulated yet.” But in Peirce’s comments on models that aid the imagination in which we discover new relationships and patterns we do have the crucial elements of a theory of iconic models that facilitate discovery. The value of schematic and diagrammatic representations in imagination is that they provide optimal conditions for the enablement of scientific reasoning beyond facts and data currently available. Their value is not reducible to heuristic devices or placeholders for modal-type thought-experiments. When Norton (1996) proposes that thought-experiments are disguised inductive, deductive, or enthymematic arguments, he overlooks the promising vista that in performing thought-experiments scientists are in fact following the principles of abductive reasoning. Diagrams are cognitive and analytic instruments embodying what is essential in the reasoning of the mind and in the laws of its behavior.
Conclusions
The timing of some of the most important revisions to Peirce’s earlier abductive schemas in 1905 is no accident. His mature interpretation is related to contemporaneous developments and revisions both of his theory of signs and of his mature theory of logic. Peirce can now add illocutionary forces to the classification
of signs and produce ever-richer taxonomies of sign classes. This revisionary project then necessitated the addition of special modalities to his logic (Ma & Pietarinen, 2018b), bringing the three legs of semiotics, logic, and the method of science into increasingly close contact in his late writings. Hence the logical, linguistic, semiotic, economic, and cognitive sides of the problem of abduction need to be addressed in unison (Gabbay & Woods, 2006; Magnani, 2009, 2017; Park, 2017). The common framework to represent and explore abduction, just as that of deduction, is not a heuristic or a psychology of instinct and insight, but the logical framework of imaginative thought performed in diagrams. (The diagrammatic nature of creative reasoning is addressed further in Chap. 34, “Abduction in Diagrammatical Reasoning,” this volume.)
References
Bellucci, F., & Pietarinen, A.-V. (2020). Peirce on the justification of abduction. Studies in History and Philosophy of Science. Part A, 84, 12–19.
Bellucci, F., & Pietarinen, A.-V. (2021). Methodeutic of abduction. In J. Shook & S. Paavola (Eds.), Abduction in cognition and action (pp. 107–127). Springer.
Beni, M., & Pietarinen, A.-V. (2021). Aligning the free-energy principle with Peirce’s logic of science. European Journal for Philosophy of Science, 11, 94.
Chiffi, D., & Pietarinen, A.-V. (2020). Abductive inference within a pragmatic framework. Synthese, 197, 2507–2523.
Chiffi, D., Pietarinen, A.-V., & Proover, M. (2019). Anticipation, abduction and the economy of research: A normative stance. Futures, 115, 102471.
Frigg, R., & Hartmann, S. (2012). Models in science. In E. N. Zalta (Ed.), Stanford encyclopedia of philosophy. Stanford University.
Gabbay, D. M., & Woods, J. (2006). Advice on abductive logic. Logic Journal of the IGPL, 14, 189–219.
Hintikka, J. (2007). Socratic epistemology: Knowledge: Explorations of knowledge-seeking through questioning. Cambridge University Press.
Ma, M., & Pietarinen, A.-V. (2015). A dynamic approach to Peirce’s interrogative construal of abductive logic. IfCoLog Journal of Logics and Their Applications, 3, 73–104.
Ma, M., & Pietarinen, A.-V. (2018a). Let us investigate! Dynamic conjecture-making as the formal logic of abduction. Journal of Philosophical Logic, 47(6), 913–945.
Ma, M., & Pietarinen, A.-V. (2018b). Gamma graph calculi for modal logics. Synthese, 195, 3621.
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2017). The abductive structure of scientific creativity. Springer.
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Norton, J. (1996). Are thought experiments just what you thought? Canadian Journal of Philosophy, 26, 333–366.
Park, W. (2017). Abduction in context: The conjectural dynamics of scientific reasoning. Springer.
Peirce, C. S. (1933–1958). In C. Hartshorne, P. Weiss, & A. W. Burks (Eds.), The collected papers of Charles S. Peirce, 8 vols. Harvard University Press. Cited as CP.
Peirce, C. S. (1982). In Peirce Edition Project (Ed.), Writings of Charles S. Peirce: A chronological edition (Vol. 1). Indiana University Press. Cited as W.
Peirce, C. S. (1998). In Peirce Edition Project (Ed.), The essential Peirce (Vol. 2). Indiana University Press. Cited as EP.
Peirce, C. S. (2019–2022). In A.-V. Pietarinen (Ed.), Logic of the future. Peirce’s writings on existential graphs, 3 vols. De Gruyter. Cited as LoF.
Pietarinen, A.-V. (2014). The science to save us from philosophy of science. Axiomathes, 25, 149–166.
Pietarinen, A.-V. (2018). Conjectures and abductive reasoning in games. IfCoLog Journal of Logics and Their Applications, 5(5), 1121–1144.
Pietarinen, A.-V., & Bellucci, F. (2016). The iconic moment: Towards a Peircean theory of scientific imagination and abductive reasoning. In O. Pombo, A. Nepomuceno, & J. Redmond (Eds.), Epistemology, knowledge, and the impact of interaction (pp. 463–481). Springer.
Pietarinen, A.-V., & Beni, M. (2021). Active inference and abduction. Biosemiotics, 14, 499–517.
Woods, J. (2012). Cognitive economics and the logic of abduction. Review of Symbolic Logic, 5, 148–161.
Part III Logics of Hypothetical Reasoning, Abduction, and Evidence
Introduction to Logics of Hypothetical Reasoning, Abduction, and Evidence
13
Atocha Aliseda
Contents
Introduction
Abstract
This section offers an overview of some formal models for ampliative reasoning found in the philosophy of science and the logical literature: adaptive logics, dynamic epistemic logic, dialogical logic, and logics of evidence and truth. Each of these chapters proposes innovative ways to extend these frameworks to account for ampliative reasoning, for (enumerative) induction, and for abduction. The chapters are technical, but they all provide the intuition and rationale behind the notions presented. The diversity of frameworks displayed in this section shows the wide variety of formal tools for modeling inductive and abductive reasoning, a logical perspective for modeling cognition.
Keywords
Induction · Abduction · Paraconsistency · Adaptive logics · Dynamic epistemic logic · Dialogical logic · Evidence · Justification
A. Aliseda, Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México, Mexico City, Mexico
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_82
Introduction
Chapters in this section review some logical models for scientific inquiry, those concerned with ampliative reasoning, in which the conclusion expands the given information. This kind of reasoning manifests itself in inferences such as induction and abduction, and it is opposed to deduction, in which conclusions are certain but add nothing new to the given. A salient aspect of ampliative reasoning is the tentative epistemic status of the produced conclusions, something which makes them defeasible. That is, given additional information, it may no longer be warranted to draw a previously valid conclusion. More particularly, the focus in this section will be on the tentative status of the produced conclusion, on its being hypothetical. A hypothetical statement is, at the very best, potential knowledge. It is neither true nor false, but holds a hypothetical epistemic status, one that may be settled later as true (when the hypothesis is corroborated) or as false (when it is falsified). Hypothetical reasoning is understood here as a type of reasoning to explanations. One case of hypothetical reasoning is “enumerative induction,” also known as “inductive generalization,” in which the inferential process at stake obtains a universal statement as a conclusion (“all ravens are black”) from a set of individual ones (“the first raven is black” . . . “the nth raven is black”). A generalization from instances is a case of ampliative reasoning because it expands what is stated in the instances by advancing a defeasible prediction (“the next raven will be black”). That is, the generalization may fail when a further instance falsifies the conclusion (“the (n+1)th raven is grey”). As is well known, inductive generalization assumes the “uniformity of nature”; the world is uniform and therefore it seems safe to draw generalizations out of instances, though they may fail at some point.
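The defeasibility of enumerative induction can be illustrated with a minimal sketch. It assumes a deliberately naive acceptance rule (accept the generalization while at least one instance has been observed and no counterexample has been seen); the function names `generalizes` and `is_black` are hypothetical and are not taken from any of the chapters in this section.

```python
def generalizes(observations, predicate):
    """Tentatively accept 'all instances satisfy predicate'.

    Naive rule (an assumption of this sketch): accept while there is at
    least one observed instance and none of them is a counterexample.
    """
    return len(observations) > 0 and all(predicate(o) for o in observations)


def is_black(raven_colour):
    return raven_colour == "black"


ravens = ["black"] * 5                  # the first n ravens are black
before = generalizes(ravens, is_black)  # "all ravens are black" is accepted

ravens.append("grey")                   # the (n+1)th raven is grey
after = generalizes(ravens, is_black)   # the same conclusion is withdrawn

print(before, after)
```

The point of the sketch is only that adding a premise removes a previously drawn conclusion, which is what distinguishes this kind of inference from deduction.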
Another case of hypothetical reasoning that shall be reviewed in depth is that of abduction. As is well known, abduction is related both to the construction of hypotheses and to their selection, and two approaches to abduction are found: abduction as argument versus abduction as inference to the best explanation. This is a familiar distinction in the philosophy of science, where abduction is closely connected with issues of scientific explanation. In a more in-depth view of logical abduction, three characterizations may be identified, namely as a logical inference, as a computational process, and as a process for epistemic change. Each one of these views highlights one relevant aspect of abductive reasoning: its logical structure, its underlying computational mechanism, and its role in the dynamics of belief revision. Indeed, there are several ways to characterize this reasoning type, and it may be more appropriate to characterize abductive patterns rather than trying to define abduction as a single concept.
In the chapter by Mathieu Beirlaen, the author offers several adaptive logics for inductive generalization, each of which is analyzed as a criterion of confirmation and compared with Hempel’s satisfaction criterion and the hypothetico-deductive model of confirmation. The adaptive criteria proposed in this chapter offer an interesting perspective on (qualitative) confirmation theory in the philosophy of science.
Adaptive logics are a proof-theoretical framework designed to model ampliative reasoning and dynamic information. For the case of inductive generalization, the defeasibility of the conclusion is dealt with in two respects. On the one hand, each of these proposed logics uses a criterion to assert a generalization as a statement within the proof. On the other hand, these logics implement a strategy, a specific way by which a generalization is refuted and therefore marked in the proof, so that it is no longer considered as part of the derivation (until it is unmarked, due to new information).
In the chapter by Tjerk Gauderis, also framed within the adaptive logics framework, the author offers an interesting discussion regarding the feasibility of the project of modeling hypothetical reasoning by means of formal logics, exploring the assumptions one must hold to accept or reject this endeavor. The author then puts forward four patterns of hypothetical reasoning, showing that not even one can be easily modeled by formal means. Abduction of a singular fact is the one pattern that has received most attention in the logical literature; the author offers a review and a detailed description of two adaptive logics devised for this pattern, showing that although it is the simplest of the four patterns, it already poses some challenges for formal modeling.
In the chapter by Angel Nepomuceno-Fernández, Fernando Soler-Toscano, and Fernando R. Velázquez-Quesada, the authors develop their proposal under the dynamic epistemic logic framework, which is largely based on a semantic perspective of modal logic and is an ideal tool to represent an agent’s state of knowledge (and belief) together with the dynamics of epistemic change. Cases of abduction for non-logically omniscient agents and in multi-agent scenarios are displayed by analyzing abductive reasoning as an epistemic process that involves both an agent’s information and the actions that modify it.
This chapter exhibits a combination of the inferential and epistemic characterizations of abduction. In the chapter by Cristina Barés-Gómez and Matthieu Fontaine, the authors offer an interesting discussion in favor of a reconciliation between argumentation theory and formal logic, one in which their selected logical framework, dialogical logic, is the formal model for scientific inquiry. More in particular, reasoning is modeled via a dialectical interaction in a game-like scenario between the Proponent of a thesis and an Opponent to it. The authors take this framework beyond deduction and develop it further for abductive reasoning. Several abductive types are put forward in which hypotheses are about rules of interactions, from which arise the inferential level and logical notions such as validity. This chapter follows the argumentative approach to abduction, extending this view with a dialectical interaction and combining aspects of the inferential and epistemic characterizations. In the last chapter of this section, Abilio Rodrigues, Marcelo E. Coniglio, Henrique Antunes, Juliana Bueno-Soler, and Walter Carnielli take a broader view on scientific inquiry by dealing with the problem of inconsistent information. The approach to paraconsistency is developed within the framework of logics of evidence and truth (LETs) via two other logics put forward, LETF and LETK. Evidence conveyed is either positive or negative and may be conclusive or non-conclusive. The notions of evidence and justification are treated interchangeably and therefore
a connection to abduction naturally follows. In this regard authors of this chapter review the case when there are no obvious explanations, when there is conflicting evidence and still a meaningful explanation can be constructed (one that could not be produced in a classical setting). This chapter mainly follows the argumentative approach to abduction and is a combination of the computational and epistemic characterizations. By virtue of their being formal models of scientific reasoning, the chapters to follow are technical; each one of them offers an original contribution to the field, but at the same time, they all provide the intuition and rationale behind the notions presented. The diversity of formal frameworks displayed in this section shows the wide variety of formal tools for ampliative reasoning modeling. These tools have proved useful to philosophers, logicians, and computer scientists alike and may be so as well to anyone who would like to make use of the potential of formal tools to model scientific inquiry at large.
Abduction from a Dynamic Epistemic Perspective: Non-omniscient Agents and Multiagent Settings
14
Angel Nepomuceno-Fernández, Fernando Soler-Toscano, and Fernando R. Velázquez-Quesada
Contents
Introduction
Brief Recap: ‘Classical’ Abduction
A Dynamic Epistemic Perspective
Formal Tools: Representing Knowledge and Beliefs
  Language and Models
  Operations on Models
Abductive Problem and Solution
Abductive Reasoning in Non-omniscient Agents
  Adding Reasoning to the Picture
  Explaining Explicit Information
  A Semantic Model
  An Example
Multi-agent Abduction
Conclusions
References
Abstract
This chapter studies abductive reasoning as an epistemic process that involves both an agent’s information and the actions that modify it. More precisely, it proposes and discusses definitions of an abductive problem and an abductive
A. Nepomuceno-Fernández · F. Soler-Toscano, Facultad de Filosofía, Grupo de Lógica, Lenguaje e Información, Universidad de Sevilla, Seville, Spain
F. R. Velázquez-Quesada, Department of Information Science and Media Studies, Universitetet i Bergen, Bergen, Norway
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_25
solution in terms of an agent’s information (in particular, her knowledge and beliefs) and the epistemic actions that affect it (in particular, observation and belief revision). The discussion is formalized with tools from dynamic epistemic logic, studying the properties of the given definitions, introducing an epistemic action representing the application of an abductive step, and providing illustrative examples. Two particular cases are explored: abduction for non-ideal (i.e., non-logically omniscient) agents and abduction in multiagent scenarios.
Keywords
Abductive reasoning · Dynamic epistemic logic · Knowledge · Beliefs · Non-omniscient agents · Multiagent systems
Introduction
Abductive reasoning (see, e.g., Aliseda 2006; Magnani 2001; Douven 2021) is typically understood as the process of looking for an explanation for a surprising observation. Many intellectual tasks belong to this category (e.g., medical and fault diagnosis, scientific discovery, legal reasoning, natural language understanding), thus making abduction one of the most important reasoning processes. Within logic, abductive reasoning has been studied mainly from a purely syntactic perspective. Indeed, definitions of an abductive problem and an abductive solution are typically given in terms of a theory and a formula. Therefore, most of the formal logical work on the subject has focused on (i) discussing what a theory and a formula should satisfy to constitute an abductive problem and what a formula should satisfy to be an abductive solution (Aliseda, 2006), (ii) proposing algorithms to find abductive solutions (Kakas et al., 1992; Mayer & Pirri, 1993; Mayer and Pirri, 1995; Reyes-Cabello et al., 2006; Klarman, 2008), and (iii) analyzing the structural properties of abductive consequence relations (Lobo & Uzcátegui, 1997; Aliseda, 2003; Walliser et al., 2004). In all these studies, which follow the AKM-schema of abduction (Aliseda-Kakas/Kowalski/Kuipers-Magnani/Meheus; see, e.g., Park 2015), explanationism and consequentialism are considered, but the epistemic character of abductive reasoning seems to have been pushed into the background.
This manuscript takes an epistemic and dynamic approach to abductive reasoning. The epistemic aspect is, in spirit, close to the ideas of Levesque (1989), Boutilier and Becher (1995), Aliseda (2000), and Magnani (2009): it stresses the key role that epistemic agents play within the abductive reasoning scenario.
In this sense, this approach is akin to the GW-schema (Gabbay-Woods; see Park 2015 again), which is based on the ignorance problem that arises when an agent has a cognitive target she cannot reach with what she currently knows. The dynamic aspect is the novel one: the text makes explicit the actions involved in the abductive process. A more precise discussion of what this dynamic epistemic perspective entails can be found in Nepomuceno-Fernández et al. (2017).
Making precise what abductive reasoning means in this text. Given its importance, abductive reasoning has been discussed in various fields. This has led to different ideas of what abduction should consist of (see, e.g., Flach and Kakas 2000). For example, while certain authors claim that there is an abductive problem only when neither the observed χ nor its negation follows from a theory (Kakas et al., 1992), others say that there is also an abductive problem when, though χ does not follow, its negation does (Aliseda, 2006), a situation that has been typically called a belief revision problem. There are also several opinions of what an abductive solution is. Most of the work on strategies for finding abductive solutions focuses on formulas that are already part of the system (the aforementioned Kakas et al. 1992; Mayer & Pirri 1993; Mayer and Pirri 1995; Reyes-Cabello et al. 2006; Klarman 2008), while some others take a broader view, allowing not only changes in the underlying logical consequence relation (Soler-Toscano et al., 2012) but also the creation and modification of concepts (Quilici-Gonzalez & Haselager, 2005). The present proposal focuses on a simple account: abductive reasoning will be understood as a reasoning process that goes from a single unjustified fact to its abductive solutions/explanations, where an explanation is a formula of the system that satisfies certain properties. Still, similar epistemic and dynamic approaches can be followed to study other interpretations of abduction, such as those that involve the creation of new concepts or changes in awareness (van Benthem & Velázquez-Quesada, 2010; Hill, 2010).
Brief Recap: ‘Classical’ Abduction
According to the most famous of Peirce’s formulations of abductive reasoning (Hartshorne & Weiss, 1935, paragraph 189), abduction is a process that is triggered when a surprising fact is observed. The result of an abductive inference is understood then as an explanatory hypothesis whose truth is conjectured only as plausible. This makes abduction an inferential process of a nonmonotonic character: the result is rather a provisional proposal that can be revised in the light of new information. When formalized within logical frameworks, the key concepts in abductive reasoning have traditionally taken the following form. The notion of surprising observation is taken to be relative to a background theory. Then, it is said that an abductive problem arises when there is a formula that does not follow from this theory.
Definition 1 (Abductive problem). Let Φ and χ be a theory and a formula, respectively, in some language L. Let $\vdash$ be a consequence relation on L.
• The pair (Φ, χ) is said to be a (novel) abductive problem when neither χ nor ¬χ is a consequence of Φ, i.e., when
$$\Phi \nvdash \chi \quad \text{and} \quad \Phi \nvdash \neg\chi$$
• The pair (Φ, χ) is said to be an anomalous abductive problem when, though χ is not a consequence of Φ, ¬χ is, i.e., when
$$\Phi \nvdash \chi \quad \text{and} \quad \Phi \vdash \neg\chi$$
It is typically assumed that the theory Φ is a set of formulas closed under logical consequence and that $\vdash$ is a truth-preserving consequence relation.
Consider then a novel abductive problem. The observation of a χ for which the theory Φ lacks an opinion shows that Φ is incomplete. To solve the problem, one needs further information that “completes” Φ by making χ one of its consequences; with this, the theory is strong enough to explain χ. Consider now an anomalous abductive problem. The observation of a χ whose negation is entailed by the theory shows that the theory contains a mistake. Now two steps are needed. First, perform a theory revision so ¬χ is not a consequence of Φ anymore; this turns the anomalous problem into a novel one. Then, search for further information that “completes” the revised theory, so χ is a consequence of it.
Definition 2 (Abductive solution).
• Given a novel abductive problem (Φ, χ), a formula η ∈ L is said to be an abductive solution when
$$\Phi \cup \{\eta\} \vdash \chi$$
• Given an anomalous abductive problem (Φ, χ), the formula η is an abductive solution when it is possible to perform a theory revision to get a novel problem (Φ′, χ) for which η is a solution.
This definition of an abductive solution is often considered too weak: η can take many “trivial” forms, including anything that contradicts Φ (as then everything, including χ, follows from Φ ∪ {η}) and even χ itself (as $\Phi \cup \{\chi\} \vdash \chi$). Further conditions can be imposed on the previous definition to obtain more satisfactory solutions (Aliseda, 2006).
Definition 3 (Classification of abductive solutions). Let (Φ, χ) be an abductive problem. An abductive solution η is
• Consistent if and only if $\Phi, \eta \nvdash \bot$;
• Explanatory if and only if $\eta \nvdash \chi$;
• Minimal if and only if, for every other solution ζ, if $\eta \vdash \zeta$, then $\zeta \vdash \eta$.
The consistency requirement rules out solutions that are inconsistent with the theory. In a similar way, the explanatory requirement discards those explanations that would justify the problem by themselves. Finally, the minimality requirement looks for the simplest explanation (Occam’s razor style): a solution is minimal when it is in fact equivalent to any other solution it implies.
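The classical definitions above can be made concrete with a small sketch. It assumes classical propositional logic over three hypothetical atoms (`rain`, `sprinkler`, `wet`), checks entailment by brute-force truth tables, and represents the theory by a finite list of axioms rather than a deductively closed set. All function names are illustrative and are not part of the chapter.

```python
from itertools import product

# Hypothetical propositional atoms, chosen only for illustration.
ATOMS = ("rain", "sprinkler", "wet")

def atom(name):
    return lambda v: v[name]

def neg(f):
    return lambda v: not f(v)

def conj(f, g):
    return lambda v: f(v) and g(v)

def implies(f, g):
    return lambda v: (not f(v)) or g(v)

def falsum(v):
    return False

def entails(premises, conclusion):
    """Classical entailment by truth tables: no valuation makes all
    premises true and the conclusion false."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

def problem_kind(theory, chi):
    """Definition 1: 'novel', 'anomalous', or None if chi already follows."""
    if entails(theory, chi):
        return None
    return "anomalous" if entails(theory, neg(chi)) else "novel"

def is_solution(theory, chi, eta):
    """Definition 2 (novel case): theory plus eta entails chi."""
    return entails(theory + [eta], chi)

def is_consistent(theory, eta):
    return not entails(theory + [eta], falsum)

def is_explanatory(chi, eta):
    return not entails([eta], chi)

def is_minimal(theory, chi, eta, candidates):
    """Definition 3, relative to a finite pool of candidate formulas."""
    others = [z for z in candidates
              if z is not eta and is_solution(theory, chi, z)]
    return all(entails([z], eta) for z in others if entails([eta], z))

rain, sprinkler, wet = atom("rain"), atom("sprinkler"), atom("wet")
both = conj(rain, sprinkler)
absurd = conj(rain, neg(rain))
THEORY = [implies(rain, wet), implies(sprinkler, wet)]
CANDIDATES = [rain, sprinkler, both, wet, absurd]

print(problem_kind(THEORY, wet))
for name, eta in [("rain", rain), ("rain & sprinkler", both),
                  ("wet", wet), ("absurd", absurd)]:
    print(name,
          is_solution(THEORY, wet, eta),
          is_consistent(THEORY, eta),
          is_explanatory(wet, eta),
          is_minimal(THEORY, wet, eta, CANDIDATES))
```

In this toy setting, `wet` and the contradiction both count as solutions under Definition 2, which is the weakness the text points out; the consistency and explanatory requirements filter them out, and minimality discards the conjunction in favour of its conjuncts. Minimality is checked only against the listed candidates, a simplification of the quantification over all solutions.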
A Dynamic Epistemic Perspective
The traditional definition essentially understands abductive reasoning as a process that modifies a theory whenever there is a formula that is not entailed by the theory under some particular consequence relation. The epistemic and dynamic aspects this text wants to highlight give us a different perspective. Here, abductive reasoning is understood rather as a process that changes an agent’s information whenever, due to some epistemic action, the agent has come to know or believe a fact that she could not have predicted otherwise. Similarly, one typically understands an abductive solution as a formula that, when added to the theory, makes the abductive problem a consequence of the extended theory. Here, instead, an abductive solution will be a piece of information that, if it had been received before the surprising observation, would have allowed the agent to infer (and hence, in a sense, predict) the abductive problem. Here is, with more precision, what the epistemic and dynamic perspective turns the abductive process into.
(1) What is an abductive problem? There are two important points in defining an abductive problem. The first is what a formula χ should satisfy to be an abductive problem. The second is the action that turns a formula χ into an abductive problem. For the first, a formula is an abductive problem when it is surprising. There are different ways of defining “a surprising observation” in a dynamic epistemic setting (see, e.g., Lorini and Castelfranchi 2007), but in most cases, the basic requirement is that the formula does not follow from the agent’s knowledge. A probably more important point is deciding at which stage this condition is required. One cannot ask for it after observation: an agent would typically come to know χ after observing it. Rather, one needs to look at the stage before the observation.
Thus, it will be said that a formula χ is surprising for an agent whenever, before observing it, she could not have come to know it on her own. More precisely, χ is surprising for an agent whenever, before the observation, χ did not follow from the agent’s knowledge. For the second, the action that triggers an abductive problem χ is typically assumed to be the observation of χ itself. Here a more general idea will be considered: the action that triggers the abductive problem will be the observation of some formula ψ. Thus, though ψ should indeed be related to χ (after all, the agent comes to know χ by observing ψ), the agent will not be restricted to look for explanations of the formula that has been observed: she may also look for explanations of any χ she came to know through the observation but could not have come to know by herself otherwise.
A. Nepomuceno-Fernández et al.
Here is, then, the epistemic and dynamic definition of an abductive problem. Let s1 represent an epistemic state of the agent, and let s2 be the epistemic state that results from the agent observing some given ψ at s1. A formula χ is an abductive problem for the agent at s2 whenever she knows χ at s2 but, at s1, χ was not a consequence of the agent’s knowledge.
(2) What is an abductive solution? In scientific contexts, most notions of explanation rely on a consequence (entailment) relation: “explanation” and “consequence” tend to go hand in hand. Then, given the definition of an abductive problem, a solution/explanation for it is any formula η that would have allowed the agent to infer χ. More precisely:

Let χ be an abductive problem for the agent at some stage s2. A formula η is an abductive solution if, had it been received at the previous stage s1, it would have allowed the agent to infer χ.
Note then how abductive solutions are looked for not after the surprising observation, but rather at the stage immediately before. Thus, η is a solution when, had it been known before, it would have allowed the agent to predict/expect χ.

(3) How is “the best” explanation selected? Under the definition above, several formulas might turn out to be explanations for a given abductive problem. Finding suitable and reasonable criteria for selecting the best explanation has constituted a fundamental problem in abductive reasoning, and in fact many authors consider it to be the heart of the subject (Harman, 1965; Lipton, 2004; Hintikka, 1998). Many approaches are based on logical criteria, but beyond requisites to avoid triviality (Definition 3) and restrictions on the syntactic form, the definition of suitable criteria is still an open problem.

Approaching abductive reasoning from an epistemic point of view provides a different perspective. A solution/explanation for an abductive problem does not depend on how the problem could have been predicted, but rather on how the agent could have predicted it. Instead of looking for criteria to select the best explanation, the goal should be a criterion to select the agent’s best explanation. Still, in a setting in which the agent has only knowledge, this is not possible. The agent does not know that any of the solutions is the case (otherwise, she would have been able to infer the abductive problem), and there are no further tools for picking one not-known formula over another. However, knowledge is not the only attitude an agent has toward information. In fact, given how difficult it is to be 100% certain about anything, most agents make decisions based not on what they know, but rather on what they believe. The agent cannot guarantee that any of the solutions holds, but she might consider some of them more plausible than the others. These are precisely the ones she will choose.
It could be argued that this criterion is not “logical” in the classic sense: it is not based exclusively on the deductive relationship between the abductive problem and its several explanations. Nevertheless, it is logical in a broader sense: it depends on the agent’s knowledge and, crucially, her beliefs.
14 Abduction from a Dynamic Epistemic Perspective: Non-omniscient. . .
257
(4) How is “the best” explanation incorporated into the agent’s information? Once the best explanation has been selected, it has to be incorporated into the agent’s information. As mentioned above, abductive reasoning is nonmonotonic: the chosen explanation does not need to be true and in fact can be discarded in the light of further information. Thus, the chosen explanation cannot be assimilated as knowledge, a “hard” form of information which is not subject to modification. Luckily, the agent also has a “soft” form of information that can be revised as many times as needed: beliefs. Once the agent has chosen her best abductive solution η, she can perform some belief change action to come to believe that η is the case.
Formal Tools: Representing Knowledge and Beliefs

A natural framework for formalizing the discussed ideas is that of dynamic epistemic logic (DEL; van Ditmarsch et al. 2007; van Benthem 2011), an extension of epistemic logic (EL; Hintikka 1962; Fagin et al. 1995) that works on explicit representations of the actions that change an agent’s information. In particular, the plausibility models of Baltag & Smets (2008) (cf. van Benthem 2007) can represent an agent’s knowledge and beliefs as well as acts of observation and belief revision, all of them crucial to the understanding of the abductive process stated above. This section introduces the basic DEL tools, which are then used in the next section to formalize the ideas discussed in the previous one.
Language and Models

Definition 4 (Language L). Given a set of atomic propositions P, formulas ϕ of the language L are given by

ϕ ::= p | ¬ϕ | ϕ ∨ ϕ | ⟨≤⟩ϕ | ⟨∼⟩ϕ

where p ∈ P. Formulas of the form ⟨≤⟩ϕ express that “there is a world at least as plausible (as the current one) where ϕ holds,” and those of the form ⟨∼⟩ϕ express that “there is a world epistemically indistinguishable (from the current one) where ϕ holds.” Other Boolean connectives (∧, →, ↔) as well as the “box” modalities [≤] and [∼] are defined as usual ([≤]ϕ := ¬⟨≤⟩¬ϕ and [∼]ϕ := ¬⟨∼⟩¬ϕ for the latter).

The modalities ⟨≤⟩ and ⟨∼⟩ are used, respectively, for defining the notions of belief and knowledge (see below). But first, here is the semantic structure in which formulas of L are interpreted.

Definition 5 (Plausibility Model). Let P be a set of atomic propositions. A plausibility model is a tuple M = ⟨W, ≤, V⟩ where (i) W is a non-empty set of objects called possible worlds; (ii) ≤ ⊆ (W × W) is a locally connected and
conversely well-founded preorder, the agent’s subjective plausibility ordering over possible worlds (w ≤ u is read as “the agent considers u at least as plausible as w”) (Recall: a relation R ⊆ (W × W) is (1) locally connected if and only if every two elements that are R-comparable to a third are also R-comparable; (2) conversely well-founded if and only if there is no infinite R^<-ascending chain of elements in W, where R^<, the strict version of R, is defined as R^<wu iff Rwu and not Ruw; (3) a preorder if and only if it is reflexive and transitive.); and (iii) V : W → ℘(P) is an atomic valuation function, indicating the atoms in P that are true at each possible world. A pointed plausibility model (M, w) is a plausibility model with a distinguished world w ∈ W.

The key idea behind a plausibility model is that an agent’s beliefs can be defined as what is true in the most plausible worlds from the agent’s perspective. The modality for the plausibility relation ≤ will be used in this definition. To define the agent’s knowledge, the approach is to assume that two worlds are epistemically indistinguishable for the agent if and only if she considers one of them at least as plausible as the other (i.e., if and only if they are comparable via ≤). The epistemic indistinguishability relation ∼ can therefore be defined as the union of ≤ and its converse, that is, as ∼ := ≤ ∪ ≥. Thus, ∼ is the symmetric closure of ≤ (hence ≤ ⊆ ∼). Moreover, since ≤ is reflexive, transitive, and locally connected, ∼ is reflexive and transitive; being also symmetric by construction, it is an equivalence relation, thus satisfying the standard EL requirements for a relation defining knowledge. This epistemic indistinguishability relation ∼ should not be confused with the equal plausibility relation, denoted by ≃ and defined as the intersection of ≤ and ≥.

The two modalities ⟨≤⟩ and ⟨∼⟩ are interpreted via their respective relations in the standard modal way. Here is the formal definition.

Definition 6 (Semantic interpretation).
Let (M, w) be a pointed plausibility model with M = ⟨W, ≤, V⟩. Atomic propositions and Boolean operators are interpreted as usual. For the remaining cases,

(M, w) ⊩ ⟨≤⟩ϕ iff there is u ∈ W such that w ≤ u and (M, u) ⊩ ϕ
(M, w) ⊩ ⟨∼⟩ϕ iff there is u ∈ W such that w ∼ u and (M, u) ⊩ ϕ.

A formula ϕ is
• True at (M, w) when (M, w) ⊩ ϕ,
• Satisfiable when it is true in some pointed plausibility model (M, w),
• Valid (notation: ⊩ ϕ) when it is true in every pointed plausibility model.

Valid formulas are important, as they describe the general behavior of the language’s operators.
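The truth conditions of Definition 6 can be tried out on a small finite model. The following is only a sketch under stated assumptions: the tuple encoding of formulas, the helper names (`holds`, `box`, `order_from_ranks`), and the two-world model are illustrative choices that do not come from the chapter.

```python
from itertools import product

# Formulas are nested tuples (a hypothetical encoding, not the chapter's):
#   ("atom", "p"), ("not", f), ("or", f, g),
#   ("dia_leq", f) for <≤>f and ("dia_sim", f) for <∼>f.

def holds(model, w, f):
    """Return True iff formula f is true at world w of model (W, leq, V)."""
    W, leq, V = model
    op = f[0]
    if op == "atom":
        return f[1] in V[w]
    if op == "not":
        return not holds(model, w, f[1])
    if op == "or":
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if op == "dia_leq":   # some world at least as plausible as w satisfies f
        return any((w, u) in leq and holds(model, u, f[1]) for u in W)
    if op == "dia_sim":   # some world comparable with w satisfies f
        return any(((w, u) in leq or (u, w) in leq) and holds(model, u, f[1])
                   for u in W)
    raise ValueError(f"unknown operator: {op}")

def box(kind, f):
    """[≤]f or [∼]f, defined as the dual of the corresponding diamond."""
    return ("not", (kind, ("not", f)))

def order_from_ranks(rank):
    """Build w ≤ u from integer ranks (higher rank = more plausible)."""
    return {(w, u) for w, u in product(rank, repeat=2) if rank[w] <= rank[u]}

# Two worlds; w2 is strictly more plausible than w1.
rank = {"w1": 0, "w2": 1}
M = (set(rank), order_from_ranks(rank), {"w1": {"p"}, "w2": {"p", "q"}})
p, q = ("atom", "p"), ("atom", "q")

print(holds(M, "w1", ("dia_leq", q)))            # is q true somewhere above w1?
print(holds(M, "w2", ("dia_sim", ("not", q))))   # is ¬q epistemically possible?
print(holds(M, "w1", box("dia_sim", p)))         # is p true in all comparable worlds?
```

Encoding ≤ as an explicit set of pairs keeps the evaluator close to the definition; for larger models a rank function would be more economical.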
Defining knowledge and beliefs

Knowledge in plausibility models is defined by means of the epistemic indistinguishability relation in the standard way: the agent knows ϕ at some world w if and only if ϕ is the case in every world she considers to be epistemically possible from w. The modality [∼] can be used to this end. For beliefs, the idea is that the agent believes ϕ at a given w if and only if ϕ is the case in the most plausible worlds from w. Thanks to the properties of the plausibility relation, ϕ is true in the most plausible (technically, the ≤-maximal) worlds from w if and only if, in accordance with the plausibility order, from some moment onwards, every world “above” in the plausibility ordering satisfies ϕ. The modalities ⟨≤⟩ and [≤] can be used to this end. Summarising,

The agent knows ϕ: Kϕ := [∼]ϕ
The agent believes ϕ: Bϕ := ⟨≤⟩[≤]ϕ
Note how Kϕ → Bϕ is valid (as ≤ ⊆ ∼), so knowledge implies belief. However, the converse is not valid: the agent might believe some ϕ without knowing it. The duals of these notions, epistemic possibility and most likely possibility, can be defined as the corresponding modal duals:

K̂ϕ := ⟨∼⟩ϕ
B̂ϕ := [≤]⟨≤⟩ϕ.
For a more detailed description of this framework, a number of the epistemic notions that can be defined within it, its technical details, and its axiom system, the reader is referred to Baltag & Smets (2008).
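As a small illustration of the definitions of K and B, the sketch below evaluates both on a three-world model. It assumes, for brevity, that all worlds are mutually comparable (a single indistinguishability class), so that an integer rank can stand in for ≤; the names `K`, `B`, `B_top` and the model itself are hypothetical.

```python
# Assumption: every world is comparable with every other, so ≤ is encoded
# by integer ranks (higher rank = more plausible). Illustrative model only.
rank = {"w1": 0, "w2": 1, "w3": 1}
V = {"w1": {"p"}, "w2": {"p", "q"}, "w3": {"p", "q"}}

def K(prop):
    """Kϕ := [∼]ϕ; with a single class, ϕ must hold in every world."""
    return all(prop(V[u]) for u in rank)

def B(prop, w):
    """Bϕ := <≤>[≤]ϕ at w: from some world u above w onwards, ϕ always holds."""
    return any(all(prop(V[v]) for v in rank if rank[v] >= rank[u])
               for u in rank if rank[u] >= rank[w])

def B_top(prop):
    """Belief read directly as truth in the most plausible worlds."""
    top = max(rank.values())
    return all(prop(V[u]) for u in rank if rank[u] == top)

has_p = lambda val: "p" in val
has_q = lambda val: "q" in val

print(K(has_p), B(has_p, "w1"))   # p is known, hence also believed
print(K(has_q), B(has_q, "w1"))   # q is believed without being known
```

On this finite model the modal reading `B` and the direct reading `B_top` agree, which is what converse well-foundedness is meant to guarantee.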
Operations on Models

The main idea in DEL is that changes in an agent’s information (here, knowledge and beliefs) can be represented as operations that change the underlying semantic structure representing these notions. Here is a quick review of the operations depicting the actions of observation (external communication) and belief revision.
Update, Also Known as Observation

Here is a first natural operation: reducing the domain.

Definition 7 (Update: operation and modality). Let M = ⟨W, ≤, V⟩ be a plausibility model; let ψ be a formula in L. The update operation yields the plausibility model Mψ! = ⟨W′, ≤′, V′⟩ where W′ := {w ∈ W | (M, w) ⊩ ψ}, ≤′ := ≤ ∩ (W′ × W′), and, for every w ∈ W′, V′(w) := V(w). For describing this operation’s effect, one adds a modality of the form ⟨ψ!⟩ for each formula ψ. Here is its semantic interpretation:

(M, w) ⊩ ⟨ψ!⟩ϕ iff (M, w) ⊩ ψ and (Mψ!, w) ⊩ ϕ
An update reduces the domain of the model, preserving only those worlds where the given ψ was true. Since a submodel is obtained, the operation preserves the properties of the plausibility relation, hence preserving plausibility models: if M is a plausibility model, then so is Mψ!. For the new modalities, an update formula ⟨ψ!⟩ϕ holds at (M, w) if and only if ψ is the case (i.e., the evaluation point will survive the operation) and, after the update, ϕ is the case. The modality [ψ!] is defined as the modal dual of ⟨ψ!⟩, that is, as [ψ!]ϕ := ¬⟨ψ!⟩¬ϕ.

An update has a straightforward epistemic interpretation: it represents a public announcement (Plaza, 1989; Gerbrandy & Groeneveld, 1997) or, as it will be called here, an observation. Indeed, after observing ψ, an agent will discard those epistemically possible worlds that fail to satisfy it, and only worlds that satisfied ψ before the operation will remain. More details on this operation and its modalities (including an axiom system) can be found in the cited papers and in the textbooks van Ditmarsch et al. (2007) and van Benthem (2011).
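The update operation can be sketched in a few lines. The sketch assumes that ≤ is encoded by integer ranks over mutually comparable worlds and that ψ is a propositional condition on a world’s valuation (in the full language ψ may itself be modal and would be evaluated in the original model); `update`, `knows`, and the model are hypothetical names and data.

```python
def update(rank, V, psi):
    """Keep only the worlds satisfying psi; order and valuation are restricted."""
    kept = [w for w in rank if psi(V[w])]
    return {w: rank[w] for w in kept}, {w: V[w] for w in kept}

def knows(rank, V, prop):
    """K over a single indistinguishability class: prop holds in every world."""
    return all(prop(V[w]) for w in rank)

# Illustrative model: higher rank = more plausible.
rank = {"w1": 0, "w2": 1, "w3": 2}
V = {"w1": {"p", "q"}, "w2": {"p"}, "w3": set()}
has_p = lambda val: "p" in val

new_rank, new_V = update(rank, V, has_p)
print(sorted(new_rank))                 # worlds surviving the observation of p
print(knows(rank, V, has_p))            # knowledge of p before the observation
print(knows(new_rank, new_V, has_p))    # knowledge of p after the observation
```

The function returns a new model instead of mutating its input, so the state before the observation stays available for comparison.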
Upgrade, Also Known as Belief Revision

Here is another natural operation over plausibility models: changing the plausibility ordering. The new order can be defined in several ways. The following option, taken from van Benthem (2007), is one of the many possibilities.

Definition 8 (Upgrade: operation and modality). Let M = ⟨W, ≤, V⟩ be a plausibility model; let ψ be a formula in L. The upgrade operation yields the plausibility model Mψ⇑ = ⟨W, ≤′, V⟩, with its plausibility order given by

≤′ := {(w, u) | w ≤ u and (M, u) ⊩ ψ}
    ∪ {(w, u) | (M, w) ⊩ ¬ψ and w ≤ u}
    ∪ {(w, u) | (M, w) ⊩ ¬ψ, w ∼ u and (M, u) ⊩ ψ}

For describing this operation’s effect, one adds a modality of the form ⟨ψ⇑⟩ for each formula ψ. Here is its semantic interpretation:

(M, w) ⊩ ⟨ψ⇑⟩ϕ iff (Mψ⇑, w) ⊩ ϕ
After an upgrade with ψ, “all ψ-worlds become more plausible than all ¬ψ-worlds, and within the two zones the old ordering remains” (van Benthem, 2007). More precisely, a world u will be at least as plausible as a world w in the new order, w ≤′ u, if and only if they already are of that order and u satisfies ψ, or they already are of that order and w satisfies ¬ψ, or they are comparable, w satisfies ¬ψ and u satisfies ψ. This operation preserves the properties of the plausibility relation, hence preserving plausibility models. For the modality, an upgrade formula ⟨ψ⇑⟩ϕ holds at (M, w) if and only if ϕ is the case after an upgrade with ψ. The modality [ψ⇑] is defined as the modal dual of ⟨ψ⇑⟩, as in the update case.

An upgrade also has a natural epistemic interpretation. The plausibility ordering defines the agent’s beliefs, so changing it might cause a belief change (van
Ditmarsch, 2005; van Benthem, 2007; Baltag & Smets, 2008). In particular, a belief revision with ψ (Hansson, 2017) can be represented by an upgrade with ψ. (More generally, each strategy for defining a new ordering with some ψ-worlds above some ¬ψ-worlds can be seen as a belief revision policy; see, e.g., Rott 2009.) Details on the operation and its modalities (including an axiom system) can be found in the referred papers or in the textbook van Benthem (2011).
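The three clauses of the upgrade order in Definition 8 translate almost literally into a set comprehension. The sketch below is illustrative only: ψ is taken to be a propositional condition on valuations, `most_plausible` assumes all worlds are mutually comparable, and the function names and model are not from the chapter.

```python
from itertools import product

def upgrade(W, leq, V, psi):
    """Return the new order of Definition 8 as a set of pairs (w, u), read w ≤' u."""
    sat = {w for w in W if psi(V[w])}

    def comparable(w, u):
        return (w, u) in leq or (u, w) in leq

    return {
        (w, u) for w, u in product(W, repeat=2)
        if ((w, u) in leq and u in sat)                       # old order, u is a ψ-world
        or (w not in sat and (w, u) in leq)                   # old order, w is a ¬ψ-world
        or (w not in sat and comparable(w, u) and u in sat)   # ψ-worlds jump above ¬ψ-worlds
    }

def most_plausible(W, rel):
    """Worlds at least as plausible as every world (assumes one comparability class)."""
    return {u for u in W if all((w, u) in rel for w in W)}

# Illustrative model: w3 starts as the most plausible world; only w1 satisfies q.
rank = {"w1": 0, "w2": 1, "w3": 2}
W = set(rank)
leq = {(w, u) for w, u in product(W, repeat=2) if rank[w] <= rank[u]}
V = {"w1": {"q"}, "w2": {"p"}, "w3": set()}

new_leq = upgrade(W, leq, V, lambda val: "q" in val)
print(most_plausible(W, leq))       # most plausible worlds before the upgrade
print(most_plausible(W, new_leq))   # most plausible worlds after upgrading with q
```

Since the domain is untouched, knowledge is unaffected by the upgrade; only the worlds that determine belief change.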
Abductive Problem and Solution

This section uses the DEL tools to formalize the discussed intuitions.

Abductive problem

Here is the definition of what an abductive problem is.

Definition 9 (Abductive problem). Let (M, w) be a pointed plausibility model, and consider (Mψ!, w), the pointed plausibility model that results from observing a given ψ at (M, w). A formula χ is an abductive problem at (Mψ!, w) if and only if it is known at such stage, but it was not known before, that is, if and only if

(Mψ!, w) ⊩ Kχ and (M, w) ⊩ ¬Kχ

or, equivalently, if and only if (M, w) ⊩ ¬Kχ ∧ [ψ!]Kχ.
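Definition 9 can be checked mechanically on a finite model. The sketch below assumes a single indistinguishability class with integer ranks, propositional conditions for ψ and χ, and a hypothetical two-world model with atoms `smoke` and `fire`; it follows the first formulation, which requires the evaluation point to survive the observation.

```python
def update(model, psi):
    """Observation of psi: keep only the worlds that satisfy it."""
    rank, V = model
    kept = [w for w in rank if psi(V[w])]
    return {w: rank[w] for w in kept}, {w: V[w] for w in kept}

def knows(model, prop):
    """K over a single indistinguishability class."""
    rank, V = model
    return all(prop(V[w]) for w in rank)

def is_abductive_problem(model, w, psi, chi):
    """chi is known after observing psi at w, but was not known before."""
    rank, V = model
    if not psi(V[w]):
        return False   # the evaluation point must survive the observation
    return (not knows(model, chi)) and knows(update(model, psi), chi)

# Hypothetical model: smoke only occurs together with fire.
M = ({"w1": 0, "w2": 1}, {"w1": {"smoke", "fire"}, "w2": set()})
smoke = lambda val: "smoke" in val
fire = lambda val: "fire" in val

print(is_abductive_problem(M, "w1", smoke, smoke))  # the observed formula itself
print(is_abductive_problem(M, "w1", smoke, fire))   # something learned through it
```

The second call illustrates the generalisation discussed earlier: the problem χ need not be the observed formula ψ.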
Although the definition of an abductive problem has been given in terms of the agent’s knowledge, it can also be given in terms of her beliefs (simply replace K by B, and the action of observation by that of belief revision). One can even provide a definition that combines knowledge and beliefs (e.g., she does not know χ at some stage but believes it after a belief revision with ψ). In these alternatives, the only difference is how attached the agent is to the problematic χ before and after the chosen epistemic action. But even if one sticks to the provided knowledge-based definition, one can also find a further classification criterion: one can look at the agent’s weaker epistemic attitudes, such as that of belief. Here is an example.

Definition 10 (Expected, novel, and anomalous problems). Suppose χ is an abductive problem at (Mψ!, w), as stated in Definition 9. Then, χ is said to be
• An expected abductive problem if and only if (M, w) ⊩ Bχ.
• A novel abductive problem if and only if (M, w) ⊩ ¬Bχ ∧ ¬B¬χ.
• An anomalous abductive problem if and only if (M, w) ⊩ B¬χ.
On the one hand, the second and third cases match those of Definition 1. On the other hand, many would not consider the first case an abductive problem: the
observation is a confirmation rather than a surprise, and thus it should not trigger any further epistemic action. Nevertheless, the case shows how this proposal allows for such situations to be considered. The classification can be refined even further by considering other attitudes, such as the safe beliefs of Baltag & Smets (2008) or the strong beliefs of Baltag & Smets (2009) (both definable within L).

Abductive solutions

Here is a definition of an abductive solution that uses only the agent’s knowledge.

Definition 11 (Abductive solution). Let (M, w) be a pointed plausibility model, and consider (Mψ!, w), the pointed plausibility model that results from observing ψ at (M, w). If at (Mψ!, w) the formula χ is an abductive problem, then η is an abductive solution if and only if, had the agent known η before the surprising observation, she would have known χ, that is, if and only if (Mη!, w) ⊩ Kχ or, equivalently, if and only if (M, w) ⊩ [η!]Kχ.
Just as in the abductive problem case, one can define an abductive solution in terms of weaker notions. For example, while a very strict agent accepts η as explanation only when it leads her to know χ, a more relaxed one would accept it when it leads only to a belief in χ (i.e., (Mη!, w) ⊩ ¬Kχ ∧ Bχ).

Classifying solutions

As mentioned before, it is common to classify abductive solutions according to their properties. For example, given an abductive problem χ, an abductive solution η is
• Plain when it is a solution,
• Consistent when it does not contradict the agent’s information,
• Explanatory when it does not explain χ by itself.

Similar properties can be described in this dynamic epistemic setting. The plain property simply states that η is an abductive solution (Definition 11). More interesting are the consistent and explanatory requirements. For the first, one asks for the solution to be compatible with the agent’s information.

Definition 12 (Consistent solution). Let χ be an abductive problem at (Mψ!, w), with η one of its abductive solutions. It is said that η is a consistent solution if and only if the agent considers it possible at (Mψ!, w), that is, (Mψ!, w) ⊩ K̂η (that is, ⟨∼⟩η).
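Definitions 10 to 12 can be combined in one small sketch. Everything specific here is an assumption made for illustration: worlds are mutually comparable and ranked by integers, formulas are propositional conditions on valuations, and the kitchen-style atoms (`chicken`, `toast`, `smoke`) and helper names are hypothetical. Because K is global under the single-class assumption, the solution test does not need an evaluation point.

```python
def update(model, psi):
    rank, V = model
    kept = [w for w in rank if psi(V[w])]
    return {w: rank[w] for w in kept}, {w: V[w] for w in kept}

def knows(model, prop):
    rank, V = model
    return all(prop(V[w]) for w in rank)   # vacuously true on an empty model

def believes(model, prop):
    rank, V = model
    top = max(rank.values())               # requires a non-empty model
    return all(prop(V[w]) for w in rank if rank[w] == top)

def classify_problem(model, chi):
    """Definition 10, evaluated at the stage before the observation."""
    if believes(model, chi):
        return "expected"
    if believes(model, lambda val: not chi(val)):
        return "anomalous"
    return "novel"

def is_solution(model, eta, chi):
    """Definition 11: after receiving eta, chi would be known."""
    return knows(update(model, eta), chi)

def is_consistent(model_after_observation, eta):
    """Definition 12: eta is still epistemically possible after the observation."""
    rank, V = model_after_observation
    return any(eta(V[w]) for w in rank)

# Hypothetical model: w3 (nothing burning, no smoke) is the most plausible world.
M = ({"w1": 1, "w2": 0, "w3": 2},
     {"w1": {"chicken", "smoke"}, "w2": {"toast", "smoke"}, "w3": set()})
smoke = lambda val: "smoke" in val
chicken = lambda val: "chicken" in val
toast = lambda val: "toast" in val
impossible = lambda val: "chicken" in val and "smoke" not in val

M_obs = update(M, smoke)                       # the surprising observation
print(classify_problem(M, smoke))              # attitude toward smoke beforehand
print(is_solution(M, chicken, smoke), is_solution(M, toast, smoke))
print(is_solution(M, impossible, smoke), is_consistent(M_obs, impossible))
```

The last line shows why the consistency requirement matters: a formula the agent already knows to be false counts as a plain solution only vacuously.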
Note, then, how a solution is consistent when it is epistemically possible after the epistemic action that triggered the abductive problem.

For the explanatory requirement, the idea in the classic setting is to avoid solutions that imply the problematic χ on their own, such as χ itself or any formula logically equivalent to it. In the current epistemic setting, this idea can be understood in a different way: a solution η is explanatory when accepting it changes the agent’s information. As discussed on page 257, it makes sense for the agent to integrate the chosen solution not as part of her knowledge but rather as part of her beliefs. Thus, this requirement boils down to comparing the models (Mψ!, w) (which results from the surprising observation) and ((Mψ!)η⇑, w) (which results from integrating the solution).

Definition 13 (Explanatory solution). Let χ be an abductive problem at (Mψ!, w), with η one of its abductive solutions. It is said that η is an explanatory solution if and only if its acceptance changes the agent’s information, that is, if and only if (Mψ!, w) and ((Mψ!)η⇑, w) represent different epistemic states.

It is only left to make precise what “different epistemic states” means. Taking it to mean “different plausibility models” is not enough: the same knowledge and beliefs can be represented by different plausibility models. Moreover, taking it to mean “different knowledge and beliefs” is not good enough either: two pointed models can represent the same knowledge and beliefs and yet differ in some other epistemic attitude. One reasonable alternative is to use the modal notion of bisimulation (see, e.g., Blackburn et al. 2001, Chapter 2.2), stating that two pointed plausibility models represent different epistemic states if there is no bisimulation between them. With this definition of an explanatory solution, neither χ nor formulas logically equivalent to it nor contradictions (to the agent’s knowledge, or logical contradictions) are explanatory.
Indeed, if η stands for any of the mentioned formulas, the pointed models (Mψ!, w) and ((Mψ!)η⇑, w) are exactly the same, as the revision operation with η will not change (Mψ!, w): for χ and its logical equivalents because such formulas are true in every epistemic possibility (the agent knows χ), and for contradictions because such formulas are false in every epistemic possibility. In fact, with this definition, one characterizes explanatory solutions not in terms of their form, as is typically done, but rather in terms of their effect: accepting them changes the agent’s epistemic state.
Abductive Reasoning in Nonomniscient Agents

The presented epistemic and dynamic approach to abduction has made some assumptions for the sake of simplicity. One of the most important is the fact that agents whose information is represented within the plausibility framework are “ideal”: their knowledge and beliefs are closed under logical consequence. This supposition is not exclusive to this approach; the classic logical definitions of abductive reasoning often assume not only that the given set of formulas Φ,
the theory, is closed under logical consequence, but also that the consequence relation involved is logical consequence itself. The present proposal highlights the epistemic nature of abductive reasoning, and so it is natural to ask how such a reasoning process works for different kinds of agents, in particular, for those whose information does not need to have “ideal” properties and who thus are, in that sense, closer to real computational agents with limited resources (and also closer to us human beings). This section briefly discusses some ideas. Further developments in this direction can be found in Soler-Toscano and Velázquez-Quesada (2014).
Adding Reasoning to the Picture

Suppose that Karl is in his dining room and sees smoke coming out of the kitchen. Karl does not understand why there would be smoke, but then he realizes that the chicken he put on the fire has been there for a long time, and it should be burnt by now. Though initially Karl did not have any explanation for the smoke, he did not need any additional information in order to find a reason for it: a simple reasoning step was more than enough. This case does not correspond to any of the abductive problems described above, and the reason is that Karl is not an omniscient agent: he does not have all logical consequences of his information, and therefore he did not realize that the information he had before seeing the smoke was enough to predict it (i.e., to infer that there would be smoke). This shows that nonomniscient agents can face new kinds of abductive problems.

To provide formal definitions for abductive problems and solutions involving nonomniscient agents, we need to distinguish between the information the agent actually has, her explicit information (InfEx), and what follows logically from it, her implicit information (InfIm) (see, e.g., Levesque 1984; Vardi 1986). The relation between both kinds of information is given by InfEx ϕ → InfIm ϕ. Based on this distinction, we can say that, for a nonomniscient agent to have a χ-abductive problem, χ should not be part of her explicit information, but she may or may not have implicit information about χ and possibly (implicit or explicit) information about ¬χ. We can also make a further distinction: we may distinguish between what follows logically from the agent’s explicit information, the objectively implicit information (InfIm), and what the agent can actually derive, the subjectively implicit information (InfDer).
We can make further reasonable assumptions: explicit information is derivable (by the do-nothing action), and derivable information is also implicit (this assumes that the agent’s inferential tools are sound). Then we have

InfEx ϕ → InfDer ϕ and InfDer ϕ → InfIm ϕ.
One can also consider actions that modify the agent’s information. In the following classification of abductive problems and solutions, ⟨Addψ⟩ represents an action that incorporates ψ into the agent’s explicit information (which may be modeled as the update in Definition 7) and ⟨Remψ⟩ a revision that removes ψ from the agent’s implicit information. The modality ⟨α⟩ represents a reasoning step using the rule α.

With the only restrictions given by the above relations among InfEx ϕ, InfIm ϕ, and InfDer ϕ, several kinds of abductive problems can be considered in which the agent has no explicit information about χ (¬InfEx χ). But from the agent’s point of view, the interesting distinctions are those that she can detect. For example, the agent does not need to solve an anomaly that she cannot discover (suppose both InfIm χ ∧ InfIm ¬χ and ¬InfDer χ ∧ ¬InfDer ¬χ). Following this observation, from the agent’s perspective, we get the six abductive problems in Table 1.

The first part contains those situations in which the agent has explicit information neither about χ nor about its negation ¬χ (the novel problem of Definition 1). This situation is divided into four cases, depending on what happens with the agent’s subjective implicit information (i.e., what she can derive with the inferential tools available to her): it might be the case that neither χ nor ¬χ is derivable (1.a), or that one of them is derivable but the other is not (1.b and 1.c), or that both are derivable (1.d). The second part contains those situations in which the agent does not have explicit information about χ, but does have explicit information about its negation (the anomalous problem of Definition 1). Here, her subjective implicit information gives us only two cases (recall that InfEx ϕ → InfDer ϕ), as χ might or might not be derivable. Note how in abductive problems (1.d) and (2.d) the agent can reach an inconsistent information state, as in both cases she can derive both χ and ¬χ.
When the agent’s information is required to be truthful or consistent, these cases can be ruled out. For the other abductive problems, Table 2 presents the way certain epistemic actions can turn each problem into another until the problem is solved. (Note: if one assumes that χ becomes explicit information after being observed, the abductive problems (1.c) and (2.c) also lead to inconsistent states, as ¬χ can be derived. This would rule out anomalous abductive problems completely.)

The truly novel abductive problem is (1.a), where the agent cannot get explicit information about χ, even after using her derivation tools. Thus, she needs to extend her knowledge, and Table 2 presents two possibilities. She may incorporate some information ψ that makes χ explicit knowledge (⟨Addψ⟩InfEx χ), so the problem is solved. Alternatively, the addition of ψ might make χ only derivable (⟨Addψ⟩InfDer χ), thus moving the agent to abductive problem (1.b), which is solved by performing an inference that makes explicit what was only implicit before (⟨α⟩InfEx χ). Problem (1.c) is a derivable anomaly: the agent may start by reasoning to get explicit information of the anomaly (⟨α⟩InfEx ¬χ), which moves her to the situation in (2.c). To solve this kind of abductive problem, the agent may start by revising her information to remove ¬χ from her implicit information (⟨Rem¬χ⟩¬InfIm ¬χ), which brings her to (1.a).

Table 1 Abductive problems for nonomniscient agents

Explicit information        Derivable information        Case
¬InfEx χ ∧ ¬InfEx ¬χ        ¬InfDer χ ∧ ¬InfDer ¬χ       (1.a)
¬InfEx χ ∧ ¬InfEx ¬χ        InfDer χ ∧ ¬InfDer ¬χ        (1.b)
¬InfEx χ ∧ ¬InfEx ¬χ        ¬InfDer χ ∧ InfDer ¬χ        (1.c)
¬InfEx χ ∧ ¬InfEx ¬χ        InfDer χ ∧ InfDer ¬χ         (1.d)
¬InfEx χ ∧ InfEx ¬χ         ¬InfDer χ ∧ InfDer ¬χ        (2.c)
¬InfEx χ ∧ InfEx ¬χ         InfDer χ ∧ InfDer ¬χ         (2.d)

Table 2 Abductive solutions for nonomniscient agents

Action                      (1.a)     (1.b)     (1.c)     (2.c)
⟨Addψ⟩ InfEx χ              Solved    –         –         –
⟨Addψ⟩ InfDer χ             (1.b)     –         –         –
⟨α⟩ InfEx χ                 –         Solved    –         –
⟨α⟩ InfEx ¬χ                –         –         (2.c)     –
⟨Rem¬χ⟩ ¬InfIm ¬χ           –         –         –         (1.a)
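The case analysis of Table 1 and the transitions of Table 2 can be written down as a lookup. This is a minimal sketch: the Boolean encoding of the agent’s state and the action labels (`add_explicit`, `add_derivable`, `infer_chi`, `infer_not_chi`, `remove_not_chi`) are hypothetical names standing for the five rows of Table 2.

```python
def classify(ex_chi, ex_not_chi, der_chi, der_not_chi):
    """Return the Table 1 case for the given information about χ and ¬χ."""
    if ex_chi:
        return None                          # χ is already explicit: no problem
    der_not_chi = der_not_chi or ex_not_chi  # explicit information is derivable
    if ex_not_chi:
        return "2.d" if der_chi else "2.c"
    cases = {(False, False): "1.a", (True, False): "1.b",
             (False, True): "1.c", (True, True): "1.d"}
    return cases[(der_chi, der_not_chi)]

# Table 2 as a transition map: (case, action) -> resulting case.
TRANSITIONS = {
    ("1.a", "add_explicit"): "solved",    # <Add ψ> InfEx χ
    ("1.a", "add_derivable"): "1.b",      # <Add ψ> InfDer χ
    ("1.b", "infer_chi"): "solved",       # <α> InfEx χ
    ("1.c", "infer_not_chi"): "2.c",      # <α> InfEx ¬χ
    ("2.c", "remove_not_chi"): "1.a",     # <Rem ¬χ> ¬InfIm ¬χ
}

def run(case, actions):
    """Apply a sequence of actions; a KeyError means Table 2 has no such entry."""
    for action in actions:
        case = TRANSITIONS[(case, action)]
    return case

start = classify(ex_chi=False, ex_not_chi=False, der_chi=False, der_not_chi=True)
print(start)
print(run(start, ["infer_not_chi", "remove_not_chi", "add_derivable", "infer_chi"]))
```

The example path follows the derivable anomaly (1.c) through (2.c), (1.a), and (1.b), as described in the paragraph above.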
Explaining Explicit Information

Abduction is usually defined as the problem of explaining a surprising observation. Novelty is an important characteristic of the fact to explain. However, in some cases, we may wonder whether certain information we explicitly know follows from the rest of our information. For example, the fifth postulate is an obvious piece of explicit information in Euclidean geometry. However, can it be proved from the first four postulates? This generates a variant of an abductive problem that does not depend on recent observations but rather on unjustified information. The agent identifies a piece of explicit information she has and, when she realizes that it cannot be supported by the rest of her information, she tries to find an explanation for it.

One way of stating abductive problems of this form is the following. We introduce the modality ⟨Disϕ⟩, representing the action through which the formula ϕ is discarded from the agent’s explicit information. Note how this action differs from the action Remϕ discussed before: while Remϕ removes ϕ from the agent’s implicit information, making ϕ inaccessible without further external interaction, Disϕ removes ϕ only from the agent’s explicit information. Then, ϕ will remain objectively derivable, and the agent will be able to derive it if she has the proper tools to do it. This “discarding” action is thus intended to satisfy InfEx ϕ → [Disϕ](InfIm ϕ ∧ ¬InfEx ϕ).

With this modality, we can now make a further distinction in the agent’s explicit information, splitting it into what she knows and can actually justify and what she knows but could not derive had she not observed it. More precisely, we say
that ϕ is observed explicit information if and only if it is explicit information but, after being discarded, becomes not derivable: InfEx ϕ ∧ ⟨Disϕ⟩¬InfDer ϕ. On the other hand, we say that ϕ is entailed explicit information if and only if it is explicit and, after being discarded, the agent can still derive it: InfEx ϕ ∧ ⟨Disϕ⟩InfDer ϕ.

The abductive problems in Table 1 for which ¬InfDer χ is the case (problems (1.a), (1.c) and (2.c)) can be adapted for observed explicit information. If Ω is the formula that represents one of these abductive problems, then InfEx χ ∧ ⟨Disχ⟩Ω is the formula representing the version for observed explicit information of the same problem. The solutions of these problems start with Disχ and then proceed as in Table 2.
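The distinction between observed and entailed explicit information can be illustrated with a toy inference engine. The sketch makes strong simplifying assumptions that are not in the chapter: explicit information is a set of atoms, the agent’s inferential tools are a list of modus-ponens-style rules, and the discarding action is modeled as set difference; all names and the example data are hypothetical.

```python
def derivable(explicit, rules):
    """Close a set of atoms under rules of the form (premises, conclusion)."""
    known = set(explicit)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def kind_of_explicit(phi, explicit, rules):
    """Classify explicit phi: can it still be derived after being discarded?"""
    if phi not in explicit:
        raise ValueError("phi must be explicit information")
    remaining = set(explicit) - {phi}        # stands in for the effect of Dis phi
    return "entailed" if phi in derivable(remaining, rules) else "observed"

# Hypothetical information state, loosely inspired by Karl's kitchen.
explicit = {"chicken_left_on_fire", "smoke"}
rules = [(("chicken_left_on_fire",), "smoke")]

print(kind_of_explicit("smoke", explicit, rules))
print(kind_of_explicit("chicken_left_on_fire", explicit, rules))
```

Only the second piece of information would give rise to the adapted abductive problem, since nothing else the agent has supports it.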
A Semantic Model

The plausibility models of Baltag & Smets (2008) can be adapted to address the agent’s inferential tools by following the approach of Velázquez-Quesada (2014), which extends plausibility models with ideas from Fagin and Halpern (1988) and Jago (2006) in order to deal with the notions of implicit and explicit information.

Definition 14 (PA language). Given a set of atomic propositions P, formulas ϕ of the plausibility-access (PA) language LPA are given by

ϕ ::= p | A ϕ | ¬ϕ | ϕ ∨ ϕ | ⟨∼⟩ϕ | ⟨≤⟩ϕ

where p is an atomic proposition in P.
The language LPA extends L (Definition 4) with formulas of the form A ϕ, read as “the agent has access to formula ϕ.” The intuitive idea is that a nonomniscient agent may not have access to all formulas that are true at each world. Then, explicit knowledge/beliefs are defined in terms of what the agent actually has access to (cf. Fagin and Halpern 1988; Jago 2006).

Definition 15 (PA model (Velázquez-Quesada, 2014)). A plausibility-access (PA) model is a tuple M = ⟨W, ≤, V, A⟩ where ⟨W, ≤, V⟩ is a plausibility model (Definition 5) and A : W → ℘(LPA) is the access set function, indicating the set of
A. Nepomuceno-Fernández et al.
formulas the agent can access at each possible world. A pointed PA model $(M, w)$ is a PA model with a distinguished world $w \in W$.
For the semantic interpretation, the "access" formulas $A\,\varphi$ simply look at the $A$-set of the evaluation point.
Definition 16 (Semantic interpretation). Let $(M, w)$ be a pointed PA model with $M = \langle W, \leq, V, A \rangle$. Operators in $\mathcal{L}_{PA}$ that are also in $\mathcal{L}$ are interpreted as in Definition 6. For $A$,
\[ (M, w) \Vdash A\,\varphi \quad \text{iff} \quad \varphi \in A(w) \]
Now, implicit and explicit knowledge and beliefs can be defined. For implicit knowledge, the classic approach is used: the agent knows $\varphi$ implicitly iff $\varphi$ is true in all the worlds she considers possible from the evaluation point. For $\varphi$ to be explicitly known, the agent also needs to have access to it in all such worlds:
The agent knows implicitly the formula $\varphi$: $K_{Im}\,\varphi := [\sim]\,\varphi$
The agent knows explicitly the formula $\varphi$: $K_{Ex}\,\varphi := [\sim]\,(\varphi \wedge A\,\varphi)$
The notion of implicit belief is the same as in plausibility models: an agent believes $\varphi$ implicitly iff $\varphi$ is true in the most plausible worlds under the agent's plausibility order. For explicit belief, the agent is additionally required to have access to $\varphi$ in these maximal worlds:
The agent believes implicitly the formula $\varphi$: $B_{Im}\,\varphi := \langle \leq \rangle [\leq]\,\varphi$
The agent believes explicitly the formula $\varphi$: $B_{Ex}\,\varphi := \langle \leq \rangle [\leq]\,(\varphi \wedge A\,\varphi)$
An Example
We will now work with the following variation of a classic example (Aliseda, 2006). Mary arrives late at her apartment. She presses the light switch but the light does not turn on. Knowing that the electric line is outdated, Mary assumes that it might have failed.
In the PA model below, each possible world shows the atomic propositions that are true at it, with its $A$-set shown below. In the model, Mary knows explicitly both that the light does not turn on ($l$) and that if the electric line fails ($e$), then there will be no light ($e \to l$). Nevertheless, Mary believes neither explicitly nor implicitly that the electric line fails. The formulas on the right of the diagram express all this; they are true in every world in the model, so no evaluation point is specified.
[Figure: PA model with two worlds. World $w_1$: atoms $l$, $A$-set $\{l, e \to l\}$. World $w_2$: atoms $e, l$, $A$-set $\{l, e \to l\}$. Formulas true in every world: $K_{Ex}\,l \wedge K_{Ex}(e \to l)$; $\neg B_{Ex}\,e$; $\neg B_{Im}\,e$.]
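The definitions of implicit and explicit knowledge and belief can be checked mechanically on this two-world model. The following Python sketch makes several assumptions that go beyond the text: both worlds belong to a single epistemic class, plausibility is encoded by a numeric `rank` (lower is more plausible), belief is read as truth in the lowest-ranked worlds, and $w_1$ is taken to be more plausible than $w_2$ (any order in which $w_1$ is among the most plausible worlds would give the same results). All function names are hypothetical.

```python
# Sketch of K_Im, K_Ex, B_Im, B_Ex on the two-world PA model of the example.
# ASSUMPTIONS: one epistemic class; `rank` encodes plausibility (lower = more
# plausible); w1 is assumed at least as plausible as w2.

RULE = ("->", "e", "l")

MODEL = {
    "w1": {"atoms": {"l"}, "access": {"l", RULE}, "rank": 0},
    "w2": {"atoms": {"e", "l"}, "access": {"l", RULE}, "rank": 1},
}


def holds(f, atoms):
    """Truth of an atom or an implication ("->", p, q) under a valuation."""
    if isinstance(f, str):
        return f in atoms
    if f[0] == "->":
        return (not holds(f[1], atoms)) or holds(f[2], atoms)
    raise ValueError(f"unsupported formula: {f!r}")


def most_plausible(model):
    best = min(w["rank"] for w in model.values())
    return [w for w in model.values() if w["rank"] == best]


def k_im(model, f):   # [~] f
    return all(holds(f, w["atoms"]) for w in model.values())


def k_ex(model, f):   # [~] (f and A f)
    return all(holds(f, w["atoms"]) and f in w["access"] for w in model.values())


def b_im(model, f):   # f true in the most plausible worlds
    return all(holds(f, w["atoms"]) for w in most_plausible(model))


def b_ex(model, f):   # f true and accessible in the most plausible worlds
    return all(holds(f, w["atoms"]) and f in w["access"]
               for w in most_plausible(model))


print(k_ex(MODEL, "l"), k_ex(MODEL, RULE))   # explicit knowledge of l and e -> l
print(b_im(MODEL, "e"), b_ex(MODEL, "e"))    # no belief in e, implicit or explicit
```

With these assumptions the sketch reproduces the three formulas listed beside the diagram.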
If we understand information as knowledge, this is a case of an abductive problem with explicit knowledge of the observation $l$, as explained in the section on "Explaining Explicit Information". The modality $\langle \mathrm{Dis}_{\chi} \rangle$ represents an action that removes $\chi$ from the explicit (but not from the implicit) knowledge, so we can define it in the PA framework as an operation that removes $\chi$ from the $A$-sets of the worlds in the model (cf. the dropping operation of van Benthem & Velázquez-Quesada 2010). Then, by applying $\mathrm{Dis}_{l}$ to the PA model above, we get a similar model in which the $A$-set of every world is $\{e \to l\}$, thus representing a situation in which, though Mary does not have explicit knowledge of $l$, she has implicit knowledge of it. Now, if we understand inference as the application of explicitly known rules with explicitly known premises (Grossi & Velázquez-Quesada, 2015), then Mary cannot make $l$ explicit by inference alone (i.e., only by applying the rule $e \to l$) because, even though she knows the rule explicitly, she does not know the premise $e$ explicitly. Hence, $l$ is objectively ($\mathrm{Inf}_{Im}\,l$) but not subjectively ($\neg \mathrm{Inf}_{Der}\,l$) implicit knowledge. In summary, the abductive problem she faces is given by
\[ \langle \mathrm{Dis}_{l} \rangle (\neg \mathrm{Inf}_{Ex}\,l \wedge \neg \mathrm{Inf}_{Ex}\,\neg l \wedge \neg \mathrm{Inf}_{Der}\,l \wedge \neg \mathrm{Inf}_{Der}\,\neg l \wedge \mathrm{Inf}_{Im}\,l \wedge \neg \mathrm{Inf}_{Im}\,\neg l) \]
This corresponds to case (1.a) in Table 1 with the modification introduced in the section on "Explaining Explicit Information" ($\chi$ is replaced by $l$). According to Table 2, together with that modification, for $e$ to be a solution it should satisfy $\langle \mathrm{Dis}_{l} \rangle \langle \mathrm{Add}_{e} \rangle \langle e \to l \rangle\, \mathrm{Inf}_{Ex}\,l$. This is indeed the case: by applying $\mathrm{Dis}_{l}$ (i.e., removing $l$ from the $A$-sets) and then $\mathrm{Add}_{e}$ (interpreted as an update with $e$ together with the addition of $e$ to the $A$-sets) to the initial PA model, we get
[Figure: resulting PA model with the single world $w_2$: atoms $e, l$, $A$-set $\{e, e \to l\}$.]
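Now Mary's information state is given by
\[ (\neg \mathrm{Inf}_{Ex}\,l \wedge \mathrm{Inf}_{Der}\,l \wedge \mathrm{Inf}_{Im}\,l) \wedge (\neg \mathrm{Inf}_{Ex}\,\neg l \wedge \neg \mathrm{Inf}_{Der}\,\neg l \wedge \neg \mathrm{Inf}_{Im}\,\neg l), \]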
which corresponds to the abductive problem (1.b) of Table 1. Mary now has subjectively implicit information of $l$; hence she only needs to apply her reasoning abilities to make $\mathrm{Inf}_{Ex}\,l$ true. The action $\alpha$ in Table 2 is now specified as the application of the rule $e \to l$, which can be interpreted as the addition of $l$ to the $A$-sets containing both $e \to l$ and $e$. Thus, $e$ is indeed an abductive solution for the problem.
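The three steps of this example (discarding $l$, adding $e$, applying the rule $e \to l$) can be traced with a short Python sketch. It is only an illustration under stated assumptions: worlds are dictionaries with an `atoms` valuation and an `access` set, all worlds form a single epistemic class, and the functions `dis`, `add`, and `apply_rule` are hypothetical names for the operations as described in the text, not the chapter's formal definitions.

```python
# Sketch of Dis_l, Add_e and rule application on the example's PA model.
# ASSUMPTIONS: single epistemic class; operations follow the informal
# descriptions in the text (names and encoding are hypothetical).

RULE = ("->", "e", "l")

INITIAL = {
    "w1": {"atoms": {"l"}, "access": {"l", RULE}},
    "w2": {"atoms": {"e", "l"}, "access": {"l", RULE}},
}


def dis(model, f):
    """Dis_f: remove f from the A-set of every world."""
    return {n: {"atoms": w["atoms"], "access": w["access"] - {f}}
            for n, w in model.items()}


def add(model, atom):
    """Add_atom: keep only worlds where atom holds, and add it to the A-sets."""
    return {n: {"atoms": w["atoms"], "access": w["access"] | {atom}}
            for n, w in model.items() if atom in w["atoms"]}


def apply_rule(model, rule):
    """Add the conclusion to every A-set containing the rule and its premise."""
    _, premise, conclusion = rule
    out = {}
    for n, w in model.items():
        access = set(w["access"])
        if rule in access and premise in access:
            access.add(conclusion)
        out[n] = {"atoms": w["atoms"], "access": access}
    return out


def knows_explicitly(model, atom):
    return all(atom in w["atoms"] and atom in w["access"] for w in model.values())


step1 = dis(INITIAL, "l")          # l is now only implicit
step2 = add(step1, "e")            # only w2 survives; e becomes explicit
step3 = apply_rule(step2, RULE)    # modus ponens makes l explicit again

print(sorted(step2))                       # surviving worlds after Add_e
print(knows_explicitly(step1, "l"))        # False: l was discarded
print(knows_explicitly(step3, "l"))        # True: l recovered by inference
```

The intermediate model `step2` matches the single-world model described above, with $A$-set $\{e, e \to l\}$.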
Multi-agent Abduction
So far, abductive reasoning has been discussed as a single-agent enterprise: an agent makes a surprising observation and, in order to explain it, she looks into what she had before to find some additional piece that would have allowed her to predict the observation. Still, most of the intellectual tasks that involve abductive reasoning can also be performed (more successfully, one might argue) by a collection of agents. For example, medical and fault diagnosis can be performed by groups of experts, scientific discovery is carried out by communities of scientists, and natural language understanding sometimes involves more than one person.
Example 1. Suppose that Mary is joined by a friend, Gaby. They both observe that the light does not turn on ($l$). Mary knows that a power surge ($s$) produces a failure in the electric line ($e$); Gaby knows that a failure in the electric line produces lack of light. How can they explain, together, that the light does not turn on?
This section extends some ideas about abductive reasoning to multiagent scenarios. In order to formalize the discussion, the formal framework (the plausibility models of Definition 5; the formal language of Definition 4) needs to be expanded into a multiagent setting.
Definition 17 (Multiagent framework). Let $A$ be a finite non-empty set of agents. A multiagent plausibility model $\langle W, \{\leq_a\}_{a \in A}, V \rangle$ is a plausibility model (Definition 5) in which each agent $a \in A$ has her own plausibility relation $\leq_a$. Then, the language $\mathcal{L}_A$ is defined as $\mathcal{L}$ (Definition 4), with the modalities $\langle \leq \rangle$ and $\langle \sim \rangle$ replaced by modalities $\langle \leq_a \rangle$ and $\langle \sim_a \rangle$ for each agent $a \in A$. Each one of these modalities is semantically interpreted with respect to its matching relation (cf. Definition 6). Finally, formulas expressing knowledge and belief for each agent $a \in A$ are defined, respectively, as $K_a\,\varphi := [\sim_a]\,\varphi$ and $B_a\,\varphi := \langle \leq_a \rangle [\leq_a]\,\varphi$.
In the multiagent setting, while a formula of the form $K_m(s \to e)$ expresses that Mary ($m$) knows $s \to e$, a formula of the form $K_g(e \to l)$ indicates that Gaby ($g$) knows $e \to l$. But moving to a multiagent setting involves more than just adding subscripts to relations and modalities: it also involves working with epistemic notions for groups of agents. Here are the definitions of three such notions: distributed knowledge, general knowledge, and common knowledge.
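Definition 18 (Multiagent concepts). Let $\langle W, \{\leq_a\}_{a \in A}, V \rangle$ be a multiagent plausibility model; take any $B \subseteq A$. Define the following relations:
\[ \sim^{D}_{B} \; := \bigcap_{b \in B} \sim_b, \qquad \sim^{E}_{B} \; := \bigcup_{b \in B} \sim_b, \qquad \sim^{C}_{B} \; := (\sim^{E}_{B})^{+}. \]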
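(Recall: given a relation $R \subseteq (W \times W)$, the relation $R^{+}$ is $R$'s transitive closure. Note that, while $\sim^{D}_{B}$ and $\sim^{C}_{B}$ are equivalence relations, $\sim^{E}_{B}$ is reflexive and symmetric, but it might not be transitive.) Then, for each such $B$, add to the language three knowledge modalities: $D_B$, $E_B$, and $C_B$. Their respective semantic interpretation is as follows:
\[
\begin{array}{lcl}
(M, w) \Vdash D_B\,\varphi & \text{iff} & \text{for all } u \in W, \text{ if } w \sim^{D}_{B} u \text{ then } (M, u) \Vdash \varphi \\
(M, w) \Vdash E_B\,\varphi & \text{iff} & \text{for all } u \in W, \text{ if } w \sim^{E}_{B} u \text{ then } (M, u) \Vdash \varphi \\
(M, w) \Vdash C_B\,\varphi & \text{iff} & \text{for all } u \in W, \text{ if } w \sim^{C}_{B} u \text{ then } (M, u) \Vdash \varphi
\end{array}
\]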
Formulas of the form $D_B\,\varphi$ are read as "agents in $B$ have distributed knowledge of $\varphi$," those of the form $E_B\,\varphi$ are read as "agents in $B$ have general knowledge of $\varphi$," and those of the form $C_B\,\varphi$ are read as "agents in $B$ have common knowledge of $\varphi$." Intuitively, $\varphi$ is distributively known by a set of agents $B$ when the information allowing the knowledge of $\varphi$ is disseminated among $B$'s members. Thus, $w \sim^{D}_{B} u$ (given $w$, the group considers $u$ distributively possible) if and only if $w \sim_b u$ for every $b \in B$ (everybody considers $u$ possible or, in other words, nobody has discarded $u$). The notion of general knowledge indicates that every agent in $B$ knows $\varphi$ (i.e., $E_B\,\varphi \leftrightarrow \bigwedge_{b \in B} K_b\,\varphi$, as the set of agents is finite). Thus, $w \sim^{E}_{B} u$ (given $w$, the group considers $u$ possible) if and only if $w \sim_b u$ for some $b \in B$ (somebody considers $u$ possible or, in other words, not everybody has discarded it). Finally, common knowledge requires not only that all agents in $B$ know $\varphi$ but also that all of them know that all of them know $\varphi$, and so on ad infinitum. In fact, $C_B\,\varphi$ can be seen as the infinite conjunction $E_B\,\varphi \wedge E_B E_B\,\varphi \wedge E_B E_B E_B\,\varphi \wedge \cdots$ (although this is not technically correct because such a conjunction is infinite and thus not a formula of the language). Hence, $w \sim^{C}_{B} u$ (given $w$, the group considers $u$ commonly possible) if and only if there is a (possibly empty) collection of worlds $\{v_1, \ldots, v_n\}$ such that $w \sim^{E}_{B} v_1 \sim^{E}_{B} v_2 \sim^{E}_{B} \cdots \sim^{E}_{B} v_{n-1} \sim^{E}_{B} v_n \sim^{E}_{B} u$ (so some agent in $B$ considers it possible that some agent in $B$ considers it possible that $\cdots$ that some agent in $B$ considers $u$ possible). Note how common knowledge implies general knowledge, which in turn implies distributed knowledge (i.e., $C_B\,\varphi \to E_B\,\varphi$ and $E_B\,\varphi \to D_B\,\varphi$). However, the other directions fail: the agents might know $\varphi$ distributively without $\varphi$ being general knowledge, and they can have general knowledge of $\varphi$ without it being common
knowledge. Finally, observe how, when the set of agents is a singleton, individual, distributed, and general knowledge coincide ($K_i\,\varphi \leftrightarrow E_{\{i\}}\,\varphi$ and $K_i\,\varphi \leftrightarrow D_{\{i\}}\,\varphi$). With these tools at hand, here is the initial situation of Example 1.
Example 2. The model $M$ below represents Mary and Gaby's knowledge about $s$, $e$, and $l$ at the starting point of the example (i.e., before they observe that there is no light). For simplicity, at this stage all possible worlds are equally plausible for both agents; thus, for both of them, the plausibility relation coincides with the epistemic relation (i.e., $\leq_m \,=\, \sim_m$ and $\leq_g \,=\, \sim_g$). In the diagram, edges represent the plausibility relations $\leq_m$ and $\leq_g$ (reflexivity and transitivity assumed).
[Figure: model $M$ with eight worlds and edges labelled $m$, $g$, or $m, g$. Atoms true at each world: $w_1$: $s, e$; $w_2$: $s, e, l$; $w_3$: $e, l$; $w_4$: $s, l$; $w_5$: $e$; $w_6$: $l$; $w_7$: none; $w_8$: $s$. Formulas listed beside the diagram: $K_m(s \to e) \wedge \neg K_m(e \to l)$; $\neg K_g(s \to e) \wedge K_g(e \to l)$; $D_{\{m,g\}}(s \to l)$; $\neg K_m\,l \wedge \neg K_g\,l \wedge \neg D_{\{m,g\}}\,l$.]
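The group relations of Definition 18 are easy to compute on a finite model. The Python sketch below evaluates individual and distributed knowledge at $w_2$ in the model of Example 2. The valuation is the one listed for the diagram; the indistinguishability classes are an assumption of this sketch, chosen so that each agent cannot tell apart the worlds satisfying the implication she knows (Mary: $s \to e$; Gaby: $e \to l$), with the remaining worlds forming a second class. Function names are hypothetical.

```python
from itertools import product

# Atoms true at each world of model M (Example 2).
VAL = {
    "w1": {"s", "e"}, "w2": {"s", "e", "l"}, "w3": {"e", "l"}, "w4": {"s", "l"},
    "w5": {"e"}, "w6": {"l"}, "w7": set(), "w8": {"s"},
}

# ASSUMPTION: epistemic classes reconstructed from the stated knowledge.
# Mary's first class holds the worlds satisfying s -> e; Gaby's, e -> l.
BLOCKS = {
    "m": [{"w1", "w2", "w3", "w5", "w6", "w7"}, {"w4", "w8"}],
    "g": [{"w2", "w3", "w4", "w6", "w7", "w8"}, {"w1", "w5"}],
}


def equivalence(blocks):
    """Equivalence relation (set of pairs) induced by a partition."""
    return {(u, v) for block in blocks for u, v in product(block, repeat=2)}


REL = {agent: equivalence(blocks) for agent, blocks in BLOCKS.items()}


def distributed(group):          # intersection of the agents' relations
    return set.intersection(*(REL[a] for a in group))


def general(group):              # union of the agents' relations
    return set.union(*(REL[a] for a in group))


def common(group):               # transitive closure of the union
    closure = set(general(group))
    while True:
        extra = {(u, z) for (u, v) in closure for (y, z) in closure if v == y}
        if extra <= closure:
            return closure
        closure |= extra


def implies(a, b):
    return lambda atoms: a not in atoms or b in atoms


def atom(a):
    return lambda atoms: a in atoms


def box(rel, world, prop):
    """prop holds at every world related to `world` by `rel`."""
    return all(prop(VAL[v]) for (u, v) in rel if u == world)


GROUP = ("m", "g")
print(box(REL["m"], "w2", implies("s", "e")), box(REL["m"], "w2", implies("e", "l")))
print(box(REL["g"], "w2", implies("s", "e")), box(REL["g"], "w2", implies("e", "l")))
print(box(distributed(GROUP), "w2", implies("s", "l")))
print(box(distributed(GROUP), "w2", atom("l")))
```

Because the three relations are nested (intersection inside union inside closure), the implications $C_B\,\varphi \to E_B\,\varphi \to D_B\,\varphi$ follow directly from the shape of `box`.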
Note: $(M, w_2) \Vdash K_m(s \to e) \wedge \neg K_m(e \to l)$ and $(M, w_2) \Vdash \neg K_g(s \to e) \wedge K_g(e \to l)$, so each agent knows one of the implications in $\{s \to e, e \to l\}$ but not the other. This means that, if they were to share their information, they would know $s \to l$ (among other things), which can be succinctly expressed by stating that $(M, w_2) \Vdash D_{\{m,g\}}(s \to l)$. Still, $(M, w_2) \Vdash \neg K_m\,l \wedge \neg K_g\,l$, so no one knows that there is no light. In fact, $(M, w_2) \Vdash \neg D_{\{m,g\}}\,l$, so putting their individual information together would not be enough for them to know $l$.
The first step in defining abductive reasoning in a multiagent setting consists in defining what an abductive problem is for a group of agents $B$.
Definition 19 (Multiagent abductive problem). A formula $\chi$ is an abductive problem for the group $B$ (a $(\chi, B)$-abductive problem) at $(M_{\psi!}, w)$ if and only if $\chi$ is general knowledge among agents in $B$ at that stage, but it was not at the previous stage. In other words, there is a $(\chi, B)$-abductive problem at $(M_{\psi!}, w)$ if and only if
\[ (M_{\psi!}, w) \Vdash E_B\,\chi \quad \text{and} \quad (M, w) \Vdash \neg E_B\,\chi . \]
Thus, $\chi$ is an abductive problem for $B$ when, after the observation of $\psi$, all agents in $B$ know a formula $\chi$ that some of them did not know before. This shows how this definition of an abductive problem for a group takes all agents in $B$ to be important: if everybody in $B$ knows $\chi$ after the observation, but there is some $b$ in the group that did not know it before (so $(M, w) \Vdash \neg K_b\,\chi$, thus making $\chi$ a single-agent abductive problem for $b$ at $(M_{\psi!}, w)$), then $\chi$ is an abductive problem for the whole
group (as $(M, w) \Vdash \neg K_b\,\chi$ for $b \in B$ implies $(M, w) \Vdash \neg E_B\,\chi$). Moreover, when the group of agents is a singleton, $\chi$ is an abductive problem for $\{b\}$ at $(M_{\psi!}, w)$ (Definition 19) if and only if, in the single-agent case, $\chi$ is an abductive problem for this lone agent $b$ at $(M_{\psi!}, w)$ (Definition 9).
The definition of a $(\chi, B)$-abductive problem relies on the notion of general knowledge $E$. One can then look at the different cases that arise according to what happens to the other epistemic notions for groups: common knowledge $C$ and distributed knowledge $D$.
Start by considering the first requirement, that $E_B\,\chi$ holds after the observation. This already implies that, at that stage, the agents also have distributed knowledge of $\chi$ (i.e., $D_B\,\chi$ holds too). For common knowledge, however, there are two possibilities: the agents might also have common knowledge of $\chi$ (i.e., $C_B\,\chi$), or they might not (i.e., $\neg C_B\,\chi$). In the first case, the agents in $B$ have the strongest possible group knowledge about the abductive problem $\chi$: everybody in the group knows $\chi$, everybody in the group knows that everybody knows $\chi$, and so on. Thus, there is a truly public challenge to find an explanation. The second case is weaker: all agents in $B$ know $\chi$ (as Definition 19 requires), but some of them might not know that the others know.
Now consider the second requirement in a $(\chi, B)$-abductive problem: that $E_B\,\chi$ fails before the observation. For common knowledge, note that $\neg E_B\,\varphi \to \neg C_B\,\varphi$ (which follows from the $C_B\,\varphi \to E_B\,\varphi$ observed before). Then, the definition of a $(\chi, B)$-abductive problem already implies $(M, w) \Vdash \neg C_B\,\chi$: the agents did not have common knowledge about the abductive problem before the surprising observation. Considering distributed knowledge is more interesting, as it gives rise to two possibilities: either $(M, w) \Vdash \neg E_B\,\chi \wedge D_B\,\chi$ or $(M, w) \Vdash \neg E_B\,\chi \wedge \neg D_B\,\chi$.
In the first case, the group already had enough information to "predict" the surprising $\chi$; in a sense, the only issue was the "lack of communication" among them. In the second case, even putting all their information together would not have allowed agents in $B$ to "predict" $\chi$. Thus, their collective knowledge was incomplete, and they truly needed additional (external) information.
These two possibilities on each of the two requirements yield four different kinds of abductive problems, as shown in Table 3. In each case, the boldfaced formulas are either basic requirements (Definition 19) or consequences of them. Cases (2) and (4) are those in which the group faces the abductive problem in a weak sense (everybody knows $\chi$, but $\chi$ is not common knowledge); cases (1) and (3) are those in which the abductive problem is, in a sense, strong ($\chi$ is common knowledge after the observation). Cases (1) and (2) are those in which communication would have allowed the agents to know $\chi$ (the formula was distributed knowledge); cases (3) and (4) are those in which additional external information was needed ($\chi$ was not distributively known).
Table 3 The four kinds of abductive problems for a group of agents $B$
(1) $(M, w) \Vdash [\psi!]\,(\mathbf{E_B\,\chi} \wedge \mathbf{D_B\,\chi} \wedge C_B\,\chi) \wedge (\mathbf{\neg E_B\,\chi} \wedge \mathbf{\neg C_B\,\chi} \wedge D_B\,\chi)$
(2) $(M, w) \Vdash [\psi!]\,(\mathbf{E_B\,\chi} \wedge \mathbf{D_B\,\chi} \wedge \neg C_B\,\chi) \wedge (\mathbf{\neg E_B\,\chi} \wedge \mathbf{\neg C_B\,\chi} \wedge D_B\,\chi)$
(3) $(M, w) \Vdash [\psi!]\,(\mathbf{E_B\,\chi} \wedge \mathbf{D_B\,\chi} \wedge C_B\,\chi) \wedge (\mathbf{\neg E_B\,\chi} \wedge \mathbf{\neg C_B\,\chi} \wedge \neg D_B\,\chi)$
(4) $(M, w) \Vdash [\psi!]\,(\mathbf{E_B\,\chi} \wedge \mathbf{D_B\,\chi} \wedge \neg C_B\,\chi) \wedge (\mathbf{\neg E_B\,\chi} \wedge \mathbf{\neg C_B\,\chi} \wedge \neg D_B\,\chi)$
Example 3. Recall the model $M$ of Example 2, representing Mary and Gaby's situation before observing that the light does not turn on ($l$). The model below, $M_{l!}$ (again, reflexivity and transitivity assumed), represents their epistemic state after they both made the surprising observation.
[Figure: model $M_{l!}$ with four worlds and edges labelled $g$ or $m, g$. Atoms true at each world: $w_2$: $s, e, l$; $w_3$: $e, l$; $w_4$: $s, l$; $w_6$: $l$. Formulas listed beside the diagram: $K_m(s \to e) \wedge K_m(e \to l)$; $\neg K_g(s \to e) \wedge K_g(e \to l)$; $K_m\,l \wedge K_g\,l \wedge C_{\{m,g\}}\,l$.]
After the observation, Gaby still knows $e \to l$ but not $s \to e$, while Mary now knows both implications ($e \to l$ holds trivially, since $l$ is true in every remaining world). Moreover, they both know $l$. In fact, since $l$ has been essentially "publicly announced," $l$ is actually common knowledge among $\{m, g\}$. Thus, $l$ is an abductive problem (of type (3), according to Table 3) for the group formed by Mary and Gaby.
Now, what constitutes an abductive solution for a group? In the single-agent case, an abductive solution is a piece of information $\eta$ that, if communicated to the agent before the surprising observation, would have allowed her to "predict" the abductive problem $\chi$. In multiagent scenarios, the idea is analogous.
Definition 20 (Multiagent abductive solution). Suppose there is a $(\chi, B)$-abductive problem at $(M_{\psi!}, w)$. The formula $\eta$ is one of its multiagent abductive solutions if and only if receiving $\eta$ before the observation of $\psi$ would have made the abductive problem $\chi$ distributed knowledge: $(M, w) \Vdash [\eta!]\,D_B\,\chi$.
Thus, a solution $\eta$ for a $(\chi, B)$-abductive problem is further information that puts $\chi$ within the group's reach. Just as in the case of abductive problems, one can also look at the different types of abductive solutions that arise from considering the effect the solution $\eta$ would have had on the other group epistemic notions, general and common knowledge. From this perspective, the weakest solution is one that makes the abductive problem reachable and yet gives nothing else (i.e., $(M, w) \Vdash [\eta!]\,(D_B\,\chi \wedge \neg E_B\,\chi \wedge \neg C_B\,\chi)$). Thus, the agents would still need to communicate before all of them can "see" that $\chi$ is indeed the case. In such cases, one can also make a further classification of
these solutions in terms of the type of epistemic action (more precisely, the amount and/or type of communication) that is required to reach general and/or common knowledge. A slightly stronger abductive solution is one thanks to which all agents in the group get to know $\chi$ (i.e., $(M, w) \Vdash [\eta!]\,(D_B\,\chi \wedge E_B\,\chi \wedge \neg C_B\,\chi)$). Still, $\chi$ does not become common knowledge, and thus some agents in $B$ might not know that the others know $\chi$. Finally, the strongest solution is one thanks to which $\chi$ becomes common knowledge, so every agent knows it, everybody knows that everybody knows it, and so on (i.e., $(M, w) \Vdash [\eta!]\,(D_B\,\chi \wedge E_B\,\chi \wedge C_B\,\chi)$). These three kinds of abductive solutions are shown in Table 4. Again, boldfaced formulas are basic requirements (Definition 20).
Example 4. As indicated above, $l$ is an abductive problem for $\{m, g\}$ at $M_{l!}$ (Example 3), as they all know $l$ now but not all (in fact, none) of them knew it at the previous stage (model $M$ of Example 2).
• Note how $e$ is an abductive solution (of type (1), according to Table 4). Indeed, had the agents observed $e$ at the original model $M$, their epistemic state would have been as in the model $M_{e!}$ below (reflexivity and transitivity as before).
[Figure: model $M_{e!}$ with four worlds and edges labelled $m$ or $m, g$. Atoms true at each world: $w_1$: $s, e$; $w_2$: $s, e, l$; $w_3$: $e, l$; $w_5$: $e$. Formulas listed beside the diagram: $K_m(s \to e) \wedge \neg K_m(e \to l) \wedge \neg K_m\,l$; $\neg K_g(s \to e) \wedge K_g(e \to l) \wedge K_g\,l$; $D_{\{m,g\}}\,l \wedge \neg E_{\{m,g\}}\,l \wedge \neg C_{\{m,g\}}\,l$.]
As the model shows, while Mary still does not know $l$, Gaby knows $l$ now (as she knew $e \to l$ before). Thus, $l$ is distributed knowledge among them, which means that only communication is needed to guarantee that both agents know it (and, eventually, to make it common knowledge).
• Note how $s$ is also an abductive solution. Indeed, had the agents observed $s$ at the original model $M$, their epistemic state would have been as in the model $M_{s!}$ below (reflexivity and transitivity assumed again).
[Figure: model $M_{s!}$ with four worlds and edges labelled $m$ or $g$. Atoms true at each world: $w_1$: $s, e$; $w_2$: $s, e, l$; $w_4$: $s, l$; $w_8$: $s$. Formulas listed beside the diagram: $K_m(s \to e) \wedge \neg K_m(e \to l) \wedge \neg K_m\,l$; $\neg K_g(s \to e) \wedge K_g(e \to l) \wedge \neg K_g\,l$; $D_{\{m,g\}}\,l \wedge \neg E_{\{m,g\}}\,l \wedge \neg C_{\{m,g\}}\,l$.]
As the model shows, neither Mary nor Gaby knows $l$, but nevertheless $l$ is distributed knowledge among them. Although $s$ is also an abductive solution of type (1) (Table 4), it can be distinguished from $e$ above in that here no agent in the group knows $l$. This suggests that the notion of somebody knows (Ågotnes & Wáng, 2021) could be used to make a further refinement of the cases.
Table 4 The three kinds of abductive solutions
(1) $(M, w) \Vdash [\eta!]\,(\mathbf{D_B\,\chi} \wedge \neg E_B\,\chi \wedge \neg C_B\,\chi)$
(2) $(M, w) \Vdash [\eta!]\,(\mathbf{D_B\,\chi} \wedge E_B\,\chi \wedge \neg C_B\,\chi)$
(3) $(M, w) \Vdash [\eta!]\,(\mathbf{D_B\,\chi} \wedge E_B\,\chi \wedge C_B\,\chi)$
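The classifications of Tables 3 and 4 can be reproduced for Examples 3 and 4 with a short Python sketch. As before, this is an illustration under assumptions: the valuation is the one listed for model $M$, the agents' indistinguishability classes are reconstructed from the knowledge attributed to them (an assumption of this sketch), a public announcement of an atom is modelled as restriction to the worlds where it holds, and only atomic problems are handled. The names `announce`, `group_knowledge`, `problem_type`, and `solution_type` are hypothetical.

```python
from itertools import product

VAL = {
    "w1": {"s", "e"}, "w2": {"s", "e", "l"}, "w3": {"e", "l"}, "w4": {"s", "l"},
    "w5": {"e"}, "w6": {"l"}, "w7": set(), "w8": {"s"},
}
# ASSUMPTION: classes reconstructed so that Mary knows s -> e and Gaby e -> l.
BLOCKS = {
    "m": [{"w1", "w2", "w3", "w5", "w6", "w7"}, {"w4", "w8"}],
    "g": [{"w2", "w3", "w4", "w6", "w7", "w8"}, {"w1", "w5"}],
}
GROUP = ("m", "g")
ALL = set(VAL)


def announce(worlds, a):
    """Public announcement of atom a: keep only the worlds where a holds."""
    return {w for w in worlds if a in VAL[w]}


def rel(agent, worlds):
    """The agent's relation restricted to the surviving worlds."""
    return {(u, v) for block in BLOCKS[agent]
            for u, v in product(block & worlds, repeat=2)}


def closure(r):
    r = set(r)
    while True:
        extra = {(u, z) for (u, v) in r for (y, z) in r if v == y}
        if extra <= r:
            return r
        r |= extra


def group_knowledge(worlds, w, a):
    """Truth values of D, E and C for atom a at world w."""
    rels = [rel(agent, worlds) for agent in GROUP]
    union = set.union(*rels)

    def box(r):
        return all(a in VAL[v] for (u, v) in r if u == w)

    return {"D": box(set.intersection(*rels)), "E": box(union),
            "C": box(closure(union))}


def problem_type(before, after):
    """Row of Table 3, or None if Definition 19 is not met."""
    if not after["E"] or before["E"]:
        return None
    if before["D"]:
        return 1 if after["C"] else 2
    return 3 if after["C"] else 4


def solution_type(after):
    """Row of Table 4, or None if Definition 20 is not met."""
    if not after["D"]:
        return None
    if after["C"]:
        return 3
    return 2 if after["E"] else 1


before = group_knowledge(ALL, "w2", "l")
after = group_knowledge(announce(ALL, "l"), "w2", "l")
print(problem_type(before, after))                                     # Example 3
print(solution_type(group_knowledge(announce(ALL, "e"), "w2", "l")))   # solution e
print(solution_type(group_knowledge(announce(ALL, "s"), "w2", "l")))   # solution s
```

Under these assumptions the sketch agrees with the text: $l$ is a problem of type (3), and both $e$ and $s$ are solutions of type (1).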
Conclusions
This chapter has proposed an epistemic and dynamic approach to abduction, understanding this form of reasoning as a process that (i) is triggered by an epistemic action through which the agent comes to know or believe a certain $\chi$ that otherwise she would not have been able to know or believe, (ii) looks for explanations for $\chi$ in the set of formulas that could have helped the agent to come to know or believe $\chi$, and (iii) incorporates the chosen explanation as a part of the agent's beliefs. Besides providing formal definitions of what an abductive problem and an abductive solution are in terms of an agent's knowledge and beliefs, the present proposal has discussed (i) the process of abductive reasoning in non-omniscient agents, allowing novel types of abductive problems (e.g., relative to information that the agent has only implicitly and needs to reason about to make it explicit) and (ii) abductive reasoning in multiagent scenarios, where communication between different agents may produce abductive solutions that none of the agents can reach alone. In these two respects, this chapter represents a new step with respect to Nepomuceno-Fernández et al. (2017). Crucial for all these contributions has been the use of plausibility models and, in general, the DEL guidelines, which put emphasis on the representation of both epistemic attitudes and the actions that affect them.
It is worthwhile to compare, albeit briefly, the present proposal to other epistemic approaches to abductive reasoning. Besides immediate differences in the respective semantic models (while other approaches follow the classic Alchourrón et al. (1985) belief revision, using a set of formulas for representing the agent's information, here possible worlds are used), there are two main points that distinguish the presented ideas from other proposals.
First, here several epistemic attitudes are taken into account, thus making a clear difference between what the agent holds with full certainty (knowledge) and what she considers very likely but still cannot guarantee (beliefs); this makes it possible to distinguish between the certainty of both the previous information and the surprising observation, and the mere plausibility of the chosen solution. Second, this approach goes one step further by making explicit
the different stages of the abductive process, thus also making explicit the epistemic actions involved. This highlights the importance of actions such as belief revision, commonly understood in epistemic approaches to abduction as the one triggered by the abductive problem (Aliseda, 2000; van Benthem, 2009), and also such as observation, understood here as the one that triggers the abductive process.
In light of these considerations, this logical approach takes into account the dynamic aspects of logical information processing, among them abductive inference, one of the most important forms of inference in scientific practice. The aforementioned multiagent scenarios make it possible to model concrete practices, particularly those that develop a methodology based on observation, verification, and systematic formulation of provisional hypotheses, as in the empirical sciences, the social sciences, and clinical diagnosis.
The epistemological repercussions of this DEL approach lie in the conceptual resources that it offers, which are useful for modelling several aspects of explanatory processes. Whereas known theories of belief revision ultimately say nothing about the context of discovery, DEL makes this context more accessible to rational epistemological and logical analysis, going beyond the classical logical treatment of abduction. From the perspective of game-theoretic semantics, for example, it is now easier to determine which rules are strategic and which are operational when abductive steps are taken. Applications to certain philosophical problems should also be considered. For example, abductive scenarios within multiagent settings can be used to study the implications of different forms of communication within scientific communities. Our proposal also focuses on some aspects of the logical treatment of abduction that are not contemplated in the classic AKM model or in the GW scheme.
The most prominent aspects of cognitive abduction in Magnani (2009) (mainly the dynamic character of abductive inference, the not strictly explanatory aspects, the character of the evidence in cognitive environments, with the peculiar way of assimilating mathematical objects along the lines pointed out by Gödel, the precise sense of plausibility, and so on) are also addressed, at least minimally, especially when we consider multiagent systems with cognitive individuals who are not omniscient (i.e., with real agents). It is possible to develop this line further to seek a better representation of scientific practices within the framework of what, in Lakatos's terms, is the metaphysical research program known as "logical dynamics of information and representation."
References
Ågotnes, T., & Wáng, Y. N. (2021). Somebody knows. In M. Bienvenu, G. Lakemeyer, & E. Erdem (Eds.), Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning, KR 2021, Online event, 3–12 Nov 2021 (pp. 2–11).
Alchourrón, C., Gärdenfors, P., & Makinson, D. (1985). On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50(2), 510–530.
Aliseda, A. (2000). Abduction as epistemic change: A Peircean model in artificial intelligence. In Flach and Kakas (2000) (pp. 45–58).
Aliseda, A. (2003). Mathematical reasoning vs. abductive reasoning: A structural approach. Synthese, 134(1–2), 25–44.
Aliseda, A. (2006). Abductive Reasoning. Logical Investigations into Discovery and Explanation (Synthese Library Series, Vol. 330). Springer.
Baltag, A., & Smets, S. (2008). A qualitative theory of dynamic interactive belief revision. In G. Bonanno, W. van der Hoek, & M. Wooldridge (Eds.), Logic and the Foundations of Game and Decision Theory (LOFT7) (Texts in Logic and Games, Vol. 3, pp. 13–60). Amsterdam University Press.
Baltag, A., & Smets, S. (2009). Learning by questions and answers: From belief-revision cycles to doxastic fixed points. In H. Ono, M. Kanazawa, & R. de Queiroz (Eds.), Logic, Language, Information and Computation (Lecture Notes in Computer Science, Vol. 5514, pp. 124–139). Berlin/Heidelberg: Springer.
Blackburn, P., de Rijke, M., & Venema, Y. (2001). Modal Logic (Cambridge Tracts in Theoretical Computer Science, Vol. 53). Cambridge, UK: Cambridge University Press.
Boutilier, C., & Becher, V. (1995). Abduction as belief revision. Artificial Intelligence, 77(1), 43–94.
Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2021 edition). Metaphysics Research Lab, Stanford University.
Fagin, R., & Halpern, J. Y. (1988). Belief, awareness, and limited reasoning. Artificial Intelligence, 34(1), 39–76.
Fagin, R., Halpern, J. Y., Moses, Y., & Vardi, M. Y. (1995). Reasoning About Knowledge. The MIT Press.
Flach, P. A., & Kakas, A. C. (Eds.). (2000). Abduction and Induction: Essays on Their Relation and Integration (Applied Logic Series, Vol. 18). Kluwer Academic Publishers.
Gerbrandy, J., & Groeneveld, W. (1997). Reasoning about information change. Journal of Logic, Language, and Information, 6(2), 147–196.
Grossi, D., & Velázquez-Quesada, F. R. (2015). Syntactic awareness in logical dynamics. Synthese, 192(12), 4071–4105.
Hansson, S. O. (2017). Logic of belief revision. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Winter 2017 edition). Metaphysics Research Lab, Stanford University.
Harman, G. (1965). The inference to the best explanation. The Philosophical Review, 74(1), 88–95.
Hartshorne, C., & Weiss, P. (Eds.). (1935). Charles Sanders Peirce. The Collected Papers. Volume 5: Pragmatism and Pragmaticism. Harvard University Press.
Hill, B. (2010). Awareness dynamics. Journal of Philosophical Logic, 39(2), 113–137.
Hintikka, J. (1962). Knowledge and Belief: An Introduction to the Logic of the Two Notions. Cornell University Press.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–533.
Jago, M. (2006). Rule-based and resource-bounded: A new look at epistemic logic. In T. Ågotnes & N. Alechina (Eds.), Proceedings of the Workshop on Logics for Resource-Bounded Agents, Organized as Part of the 18th European Summer School on Logic, Language and Information (ESSLLI) (pp. 63–77).
Kakas, A. C., Kowalski, R. A., & Toni, F. (1992). Abductive logic programming. Journal of Logic and Computation, 2(6), 719–770.
Klarman, S. (2008). Abox abduction in description logic. ILLC Master of Logic Thesis Series MoL-2008-03.
Levesque, H. J. (1984). A logic of implicit and explicit belief. In R. J. Brachman (Ed.), Proceedings of AAAI-84 (pp. 198–202). AAAI Press.
Levesque, H. J. (1989). A knowledge-level account of abduction. In N. S. Sridharan (Ed.), IJCAI (pp. 1061–1067). Morgan Kaufmann.
Lipton, P. (2004). Inference to the Best Explanation. Routledge. First edition: 1991.
Lobo, J., & Uzcátegui, C. (1997). Abductive consequence relations. Artificial Intelligence, 89(1–2), 149–171.
Lorini, E., & Castelfranchi, C. (2007). The cognitive structure of surprise: Looking for basic principles. Topoi, 26(1), 133–149.
Magnani, L. (2001). Abduction, Reason, and Science: Processes of Discovery and Explanation. Springer.
Magnani, L. (2009). Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Cognitive Systems Monographs, Vol. 3). Springer.
Mayer, M. C., & Pirri, F. (1993). First order abduction via tableau and sequent calculi. Logic Journal of the IGPL, 1(1), 99–117.
Mayer, M. C., & Pirri, F. (1995). Propositional abduction in modal logic. Logic Journal of the IGPL, 3(6), 907–919.
Nepomuceno-Fernández, Á., Soler-Toscano, F., & Velázquez-Quesada, F. R. (2017). Abductive reasoning in dynamic epistemic logic. In L. Magnani & T. Bertolotti (Eds.), Handbook of Model-Based Science (pp. 269–293). Springer.
Park, W. (2015). On classifying abduction. Journal of Applied Logic, 13(3), 215–238.
Plaza, J. A. (1989). Logics of public communications. In M. L. Emrich, M. S. Pfeifer, M. Hadzikadic, & Z. W. Ras (Eds.), Proceedings of the 4th International Symposium on Methodologies for Intelligent Systems (pp. 201–216). Oak Ridge National Laboratory, ORNL/DSRD-24.
Quilici-Gonzalez, M. E., & Haselager, W. P. F. G. (2005). Creativity: Surprise and abductive reasoning. Semiotica, 153(1–4), 325–342.
Reyes-Cabello, A. L., Aliseda, A., & Nepomuceno-Fernández, Á. (2006). Towards abductive reasoning in first-order logic. Logic Journal of the IGPL, 14(2), 287–304.
Rott, H. (2009). Shifting priorities: Simple representations for twenty-seven iterated theory change operators. In D. Makinson, J. Malinowski, & H. Wansing (Eds.), Towards Mathematical Philosophy (Trends in Logic, Vol. 28, pp. 269–296). Springer.
Soler-Toscano, F., Fernández-Duque, D., & Nepomuceno-Fernández, Á. (2012). A modal framework for modeling abductive reasoning. Logic Journal of the IGPL, 20(2), 438–444.
Soler-Toscano, F., & Velázquez-Quesada, F. R. (2014). Generation and selection of abductive explanations for non-omniscient agents. Journal of Logic, Language and Information, 23(2), 141–168.
van Benthem, J. (2007). Dynamic logic for belief revision. Journal of Applied Non-Classical Logics, 17(2), 129–155.
van Benthem, J. (2009). Abduction at the interface of logic and philosophy of science. Theoria, 22(3), 271–273.
van Benthem, J. (2011). Logical Dynamics of Information and Interaction. Cambridge University Press.
van Benthem, J., & Velázquez-Quesada, F. R. (2010). The dynamics of awareness. Synthese (Knowledge, Rationality and Action), 177(0), 5–27.
van Ditmarsch, H. (2005). Prolegomena to dynamic logic for belief revision. Synthese, 147(2), 229–275.
van Ditmarsch, H., van der Hoek, W., & Kooi, B. (2007). Dynamic Epistemic Logic (Synthese Library Series, Vol. 337). Springer.
Vardi, M. Y. (1986). On epistemic logic and logical omniscience. In J. Y. Halpern (Ed.), TARK (pp. 293–305). Morgan Kaufmann Publishers Inc.
Velázquez-Quesada, F. R. (2014). Dynamic epistemic logic for implicit and explicit beliefs. Journal of Logic, Language and Information, 23(2), 107–140.
Walliser, B., Zwirn, D., & Zwirn, H. (2004). Abductive logics in a belief revision framework. Journal of Logic, Language and Information, 14(1), 87–117.
Abduction and Dialogues
15
Cristina Barés Gómez and Matthieu Fontaine
Contents
Introduction
Dialogical Logic
Abduction
Defeasibility and Dialogues
Sentential Abduction in Dialogues
Abducting Rules in Structure Seeking Dialogues
Abducing Rules in Adaptive Dialogues
Conclusion
References
Abstract
In dialogical logic, meaning and logical notions arise from argumentative interactions between the proponent of a thesis and the opponent, who challenges that thesis. Dialogical logic was initially designed for intuitionistic logic. Then, it took a pluralistic turn. However, with rare exceptions, most of its developments are confined to deductive reasoning. Considering that dialogues may constitute a unified framework for argumentation, this chapter discusses what they could contribute to the understanding of abductive reasoning. In particular, abduction is not always sentential and sometimes involves hypotheses regarding inferential considerations, for example, the logic underlying the validity of certain principles. In dialogical logic, this issue can be tackled at different levels, namely, the play level, by means of which meaning is defined, and the strategy level, by means of which validity is defined. Based on an interpretation of already existing dialogical systems, the aim is to put forward different kinds of abductions in which hypotheses are about the rules of interaction, from which arise the inferential level and logical notions such as validity.
C. Barés Gómez · M. Fontaine, Departamento de Filosofía, Lógica y Filosofía de la Ciencia, Universidad de Sevilla, Seville, Spain
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_26
Keywords
Abduction · Dialogical logic · Interaction · Game
Introduction
In dialogical logic, the notion of proof is approached through a game of argumentation between the proponent of a thesis and the opponent who criticizes that thesis. Semantics is defined at the level of interaction. Validity is determined by the existence of winning strategies: the thesis is valid if there is a winning strategy for the proponent, that is, if he can defend the thesis no matter how the opponent plays. Dialogical logic was initially designed for intuitionistic logic. Then, it took a pluralist turn. However, most of its developments are concerned with deductive reasoning. This chapter deals with the exciting challenge of thinking about the dialogical basis for abduction. From the start, various difficulties must be faced. How can ampliative inferences be represented dialogically? How can the introduction of information and hypotheses within a dialogue be handled? How can defeasibility be dealt with in dialogues?
Beyond technicalities, an even more interesting question is related to conceptual considerations about abduction. Indeed, a great deal of research on abduction is focused on what Magnani (2017, p. 216) calls “sentential abduction,” i.e., abduction related to logic and verbal or symbolic inferences. A hypothesis is formed by relying on the sentential aspects of natural or artificial languages, as in the case of logic. In this context, the aim of abduction is to supplement the initial set of premises with suitable hypotheses in order to restore the relevant consequence relation with the conclusion, which represents a surprising fact to be explained. But abduction should not be restricted to these sentential aspects. For example, as stressed by Nersessian (1999), abduction can also be model-based, i.e., a process involving the construction and manipulation of various kinds of representation, not only sentential and/or formal, but also mental (visual, image-related, analogical, etc.)
and/or related to external mediators. This view is also shared by Magnani (2017, p. 213), who thinks that a considerable part of the abductive process is model-based. That is, a considerable part of hypothesis creation and selection occurs in the middle of a relationship between the brain and model-based aspects of external objects and tools that have received cognitive and/or epistemological delegations. In this chapter, a pattern of abduction that occupies a middle ground in this regard is put forward: it is still concerned with formal and logical aspects, but the object of the abductive hypothesis is the logic itself. What is sought is thus not a restoration of the consequence relation by means of additional premises, but an arrangement of the consequence relation itself, without adding new sentential hypotheses. This can be understood in dialogical logic by paying particular attention to the distinction between the play and the strategy levels, but also to their links, highlighted by Rahman et al. (2018).
As will become clear, the rules of interaction are definitory rules: they tell what is allowed in a dialogical game. But they do not tell how to play well or how to win. They define the so-called “play level,” at which the semantics is defined in terms of interaction. By contrast, the “strategy level” is concerned with how to play well and how to win. This level is concerned with the existence of winning strategies, by means of which validity is approached. Whereas a full play is sufficient to exhibit the meaning of a thesis, it is not sufficient to determine its validity. The strategy level is, in turn, necessary to determine validity, but it is not necessary to exhibit meaning. Meaning is displayed when rules are applied correctly, whether or not the participants in a dialogue play well. Despite this distinction, the play level grounds the strategy level, since strategies can be extrapolated from the meaning of the connectives; hence the link between the play and the strategy levels.
Now the point is the following. What is of interest here is the definition of new patterns of abduction for hypotheses concerned with the inferential level. In dialogical logic, however, the starting point must be hypotheses concerning the rules of interaction, since it is the play level that grounds the strategy level. Therefore, abducing logics consists first in abducing rules of interaction, which themselves become objects of the dispute. Once such hypotheses have been introduced in the dialogue, hypothetical plays may follow, and hypothetical strategies may be taken into consideration. This is how the concept of abduction is extended beyond purely sentential hypotheses in dialogical logic.
This chapter begins with a basic presentation of dialogical logic, which is referred to as the “standard (deductive) dialogical logic” (Section “Dialogical Logic”). One salient feature of dialogues is that they are agent-based, that is, they are built by agents (real or ideal), and they are not based on an external notion of truth. Since an extension of abduction beyond sentential hypotheses is also sought, the proposal fits particularly well with recent developments that revive the Peircean pragmatic spirit of abduction, for example in the works of Gabbay and Woods (2005), Woods (2013), or Magnani (2017). Therefore, the chapter continues with the Gabbay and Woods model and Magnani’s eco-cognitive model, which constitute interesting frameworks for the development of the proposal (Section “Abduction”). This leads to the question of how defeasibility can be approached in dialogues, which is the opportunity to explain the distinction between the play and the strategy levels (Section “Defeasibility and Dialogues”). Then, a first proposal by Barés and Fontaine (2017) is discussed. Although they interestingly highlight the pragmatic distinction between deductive and abductive dialogues, which would involve different kinds of speech acts, they are still confined to sentential abduction (Section “Sentential Abduction in Dialogues”). The proposal is eventually illustrated by means of two different dialogical logics. In the Structure Seeking Dialogues of Rahman and Keiff (2005) and Keiff (2007), hypotheses are concerned with the underlying modal frame and the applications of the rules for modal operators (Section “Abducting Rules in Structure Seeking Dialogues”). In the Inconsistency-Adaptive Dialogical Logic of Beirlaen and Fontaine (2016), hypotheses are concerned with the application of the rule for negation (Section
“Abducing Rules in Adaptive Dialogues”). These two dialogical systems perfectly exhibit the pattern of abduction advocated in this chapter.
Dialogical Logic
Dialogical logic was initially designed for intuitionistic logic. Then, with the work of Rahman and his associates, it took a pluralistic turn by taking advantage of its flexibility and of a clear distinction between different levels of rules. For foundational work on dialogical logic, the reader is referred to Lorenzen and Lorenz (1978); for more recent developments, Rahman and Rückert (2001a), Rahman and Keiff (2005), Rahman et al. (2018); for an introductory presentation, Redmond and Fontaine (2011). This section is a general presentation of dialogical logic, mainly the rules and basic explanations.
Dialogues are games of argumentation between the Proponent (P) of a thesis and the Opponent (O). The game begins with P uttering the thesis that he must defend against every possible criticism of O. Moves in the games are performed by means of two kinds of speech acts: requests and assertions. They are either challenges against previously uttered statements or defenses in response to challenges. The game is governed by two kinds of rules, the particle rules and the structural rules, which define the play level, i.e., they are definitory rules that indicate how to play, but not how to win.
The particle rules provide the local meaning of logical constants in terms of interaction. They are abstract descriptions consisting of sequences of moves such that the first member is an assertion, the second is an attack, and the third is a defense (when possible). Given that both players are assumed to speak the same language (otherwise dialogues would not make sense), the particle rules are the same for both players. The structural rules determine the general organization: how to begin, who plays, when, who wins, and so on. It is worth noting that “structural” does not have the same meaning as in other approaches such as sequent calculus, i.e., it is not used to refer to rules such as weakening or contraction.
The structural rules provide the global meaning of the statements uttered in a dialogical game, that is, their meaning in a specific context of argumentation. For example, classical and intuitionistic games can be distinguished by means of the rules [SR1c] and [SR1i]. An initial thesis ϕ uttered by P is claimed to be valid if and only if there is a P-winning strategy for ϕ, i.e., P can win the game no matter how O plays. Whereas the particle and structural rules define the semantics in terms of interactive language-games, the strategy level is concerned with validity through the notion of P-winning strategy. This section thus begins with the more fundamental level of definitory rules and then explains how validity is approached in dialogues.
Language and notations. Let L be a propositional language, defined as follows:
ϕ ::= p | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ¬ϕ
Lowercase letters p, q, r, . . . refer to atomic formulas in L. Lowercase Greek letters ϕ, ψ, χ, . . . are used to refer to L-formulas, and uppercase Greek letters
Γ, Δ, . . . to refer to finite sets of L-formulas. To define the structural rules, two labels are used, P and O, standing for the players of the games, the Proponent and the Opponent, respectively. The identities of P and O are not relevant at the local level and are defined at the global level by means of the structural rule [SR0] given below in this section. That is why the particle rules are defined with player variables X and Y (with X ≠ Y). The force symbols ! for assertions and ? for requests are used. A move is an expression of the form X − e, where X is a player variable and e is either an assertion or a request. The notations n := r_i and m := r_j, with r_i, r_j ∈ ℕ∗, are used for the utterance of the ranks the players choose according to the rule [SR0]. Ranks are positive integers bounding the number of attacks and defenses the players can perform in a play. A play is a sequence of moves performed in accordance with the game rules.
Since the way a thesis is drawn from a set of premises is also studied, the initial thesis will be either a formula ϕ or an argument of the form ϕ[ψ1, . . . , ψn], which amounts to the claim that there is a winning strategy for the conclusion ϕ given the concession of ψ1, . . . , ψn. In other words, when P claims that he can draw ϕ on the basis of ψ1, . . . , ψn by stating the thesis ϕ[ψ1, . . . , ψn], he says something like “I can defend ϕ under the concession of ψ1, . . . , ψn”. The premises ψ1, . . . , ψn are referred to as the initial concessions. In case the premise set is empty, the initial thesis is simply ϕ. The dialogical game for the claim ϕ[ψ1, . . . , ψn] (respectively ϕ) is the set D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) of all the plays with ϕ[ψ1, . . . , ψn] (respectively ϕ) as the initial thesis. Where Γ = {ψ1, . . . , ψn}, ϕ[Γ] is sometimes written instead of ϕ[ψ1, . . . , ψn] for the sake of presentation.
For every move M in a given sequence S of moves, p_S(M) denotes the position of M in S. Positions are counted starting with 0. A function F is also used, such that the intended interpretation of F_S(M) = [m′, Z] is that, in the sequence S, the move M is an attack (if Z = A) or a defense (if Z = D) against the move of previous position m′. The following table provides the particle rules for the propositional language:
Particle Rules
Assertion                    Attack                         Defense
X − ! ϕ ∧ ψ                  Y − ? ∧L, or Y − ? ∧R          X − ! ϕ, or X − ! ψ (respectively)
X − ! ϕ ∨ ψ                  Y − ? ∨                        X − ! ϕ, or X − ! ψ
X − ! ¬ϕ                     Y − ! ϕ                        −−−
X − ! ϕ → ψ                  Y − ! ϕ                        X − ! ψ
X − ! ϕ[ψ1, . . . , ψn]      Y − ! ψ1, . . . , Y − ! ψn     X − ! ϕ
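The particle rules tabulated above can be sketched as a small lookup function. This is only an illustrative sketch, not part of the dialogical framework itself: the class names `Atom`, `Not`, `And`, `Or`, `Imp` and the helpers `show` and `particle_rule` are hypothetical, moves are rendered as plain strings with an ASCII hyphen, and the rule for arguments of the form ϕ[ψ1, . . . , ψn] is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Imp:
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, Not, And, Or, Imp]
SYMBOL = {And: "∧", Or: "∨", Imp: "→"}


def show(f: Formula) -> str:
    """Render a formula; binary formulas are always parenthesized."""
    if isinstance(f, Atom):
        return f.name
    if isinstance(f, Not):
        return "¬" + show(f.sub)
    return f"({show(f.left)} {SYMBOL[type(f)]} {show(f.right)})"


def particle_rule(f: Formula) -> List[Tuple[str, List[str]]]:
    """For an assertion X - ! f, list each attack by Y with X's possible defenses."""
    if isinstance(f, And):
        # The attacker chooses the conjunct.
        return [("Y - ? ∧L", [f"X - ! {show(f.left)}"]),
                ("Y - ? ∧R", [f"X - ! {show(f.right)}"])]
    if isinstance(f, Or):
        # The defender chooses the disjunct.
        return [("Y - ? ∨", [f"X - ! {show(f.left)}", f"X - ! {show(f.right)}"])]
    if isinstance(f, Not):
        # The attack is an assertion; there is no defense.
        return [(f"Y - ! {show(f.sub)}", [])]
    if isinstance(f, Imp):
        # The attacker concedes the antecedent; the defender asserts the consequent.
        return [(f"Y - ! {show(f.left)}", [f"X - ! {show(f.right)}"])]
    return []  # Atomic assertions cannot be attacked.


p, q = Atom("p"), Atom("q")
for attack, defenses in particle_rule(Imp(Or(p, q), p)):
    print(attack, "=>", defenses)
```

The function is deliberately independent of the players' identities and of any structural rule, which mirrors the claim that particle rules fix only the local meaning of the connectives.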
The particle rules are abstract descriptions consisting of sequences of moves. The first member is an assertion, the second is an attack, and the third is a defense (except in the case of negation, for which there is no possible defense). They are abstract because they are defined independently of any specific context of argumentation and independently of the players’ identities. When a player X asserts a conjunction, he is committed to giving a justification for both conjuncts. That is why the attacker (Y) requests the conjunct of his choice (left or right). In the case of a disjunction, it is the defender (X) who chooses, since he is only committed to defending at least one of the disjuncts. An attack may be a request or an assertion (in the case of the negation) or even a composite speech act, as in the case of the conditional or an argument of the form ϕ[ψ1, . . . , ψn]. The latter is challenged by conceding each of the premises, and the defender must assert the conclusion. The particle rules define the local meaning of connectives. They are the same, no matter the logical context of argumentation. The difference between classical and intuitionistic uses of connectives is made at the structural level, which provides the global meaning of connectives, i.e., their meaning in a specific context of interaction.
The structural rules provide the global level of semantics by regulating the whole dialogue and how particle rules can be applied. We begin with the starting rule, which says how a dialogical play must start:
[SR0][Starting Rule]. If the initial thesis is of the form ϕ[ψ1, . . . , ψn], then for any play 𝒫 ∈ D(ϕ[ψ1, . . . , ψn]), we have:
• p_𝒫(P − ! ϕ[ψ1, . . . , ψn]) = 0,
• p_𝒫(O − n := r_i) = 1 and p_𝒫(P − m := r_j) = 2.
If the initial thesis is of the form ϕ, then for any play 𝒫 ∈ D(ϕ), we have:
• p_𝒫(P − ! ϕ) = 0,
• p_𝒫(O − n := r_i) = 1 and p_𝒫(P − m := r_j) = 2.
The first clause of each case guarantees that every play in D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) starts with P asserting the thesis ϕ[ψ1, . . . , ψn] (respectively ϕ). The second clause says that the players then choose their ranks, O first. Remember that a rank is a positive integer bounding the number of attacks and defenses which the players can perform in a play. The ranks guarantee the finiteness of plays by limiting the repetitions allowed in a dialogue. A move M′ performed by X in a dialogue is a repetition of a previous move M if (i) M and M′ are two attacks performed by X against the same move N performed by Y, or (ii) M and M′ are two defenses performed by X in response to the same attack N performed by Y. The authors follow the rule formulated by Clerbout (2014, p. 788), in which the rank chosen by the players applies uniformly to the whole dialogue, and for defenses as well as for attacks. As shown by Clerbout, for strategic reasons, it is usually enough for O to choose 1 and then P to choose 2.
[SR1c][Classical Development Rule] For any move M in 𝒫 such that p_𝒫(M) > 2, we have F_𝒫(M) = [m′, Z], where Z ∈ {A, D} and m′ < p_𝒫(M). Let r be the repetition rank of player X and 𝒫 ∈ D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) such that:
• The last member of 𝒫 is a Y-move.
• M0 ∈ 𝒫 is a Y-move of position m0.
• There are moves M1, . . . , Mn of player X in 𝒫 such that F_𝒫(M1) = F_𝒫(M2) = · · · = F_𝒫(Mn) = [m0, Z] with Z ∈ {A, D}.
Let N be an X-move such that F_𝒫⌢N(N) = [m0, Z], where “𝒫⌢N” denotes the extension of 𝒫 with N. Then 𝒫⌢N ∈ D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) if and only if n < r.
The rule [SR1c] ensures that, after the repetition ranks have been chosen, every move is either an attack or a defense against a previous move made by the other player; players move alternately, and the number of attacks and defenses they can perform in reaction to the same move is bounded by their repetition ranks. Intuitionistic dialogical games are defined with a rule [SR1i], obtained by modifying [SR1c] so that the repetition ranks only bound the number of challenges, and players can defend only once, against the last non-answered challenge. This illustrates how different logics can be distinguished at the structural level, without having to change the local level.
[SR2][Formal rule] The sequence S is a play only if the following condition is fulfilled: if N = P − ! ψ is a member of S, for any atomic sentence ψ, then there is a move M = O − ! ψ in S such that p_S(M) < p_S(N).
This rule means that P can assert an atomic sentence ψ only if O previously asserted the same atomic sentence ψ. Then, the notion of X-terminal is defined:
[D1][X-terminal] Let 𝒫 be a play in D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) the last member of which is an X-move. If there is no Y-move N such that 𝒫⌢N ∈ D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)), then 𝒫 is said to be X-terminal.
The winning rule for plays can now be defined:
[SR3][Winning Rule for Plays] Player X wins a play 𝒫 ∈ D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) if and only if 𝒫 is X-terminal.
According to [SR3], X wins a play if it is Y’s turn to play and no move is available to Y. The rules of the game do not say anything about validity or about how to play well. Dialogical validity is grasped at the strategy level. The thesis of P is valid if and only if P has a winning strategy according to the following definition:
[D2][Winning Strategy] A strategy of a player X in D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) is a function s_X which assigns a legal X-move to every nonterminal play 𝒫 ∈ D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)) the last member of which is a Y-move. An X-strategy is winning if it leads to X’s win no matter how Y plays.
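The relation between [SR3] and [D2] can be sketched on an abstract finite game tree. The sketch below is only an illustration under simplifying assumptions: the `Node` class and the functions `winner_of_terminal` and `has_winning_strategy` are hypothetical names, the tree is given by hand rather than generated from the particle and structural rules, and the two toy trees do not encode any particular thesis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A position in a finite game tree; `to_move` is 'P' or 'O'."""
    to_move: str
    children: List["Node"] = field(default_factory=list)


def winner_of_terminal(node: Node) -> str:
    # In the spirit of [SR3]: if it is Y's turn and Y has no move, X wins.
    return "O" if node.to_move == "P" else "P"


def has_winning_strategy(node: Node, player: str = "P") -> bool:
    """In the spirit of [D2]: `player` can force a win from `node`."""
    if not node.children:
        return winner_of_terminal(node) == player
    results = [has_winning_strategy(child, player) for child in node.children]
    if node.to_move == player:
        return any(results)  # one good choice for `player` is enough
    return all(results)      # every choice of the adversary must be answered


# Toy tree 1: O chooses between two continuations.
#   first choice:  P can move, after which O is stuck (P wins that play);
#   second choice: P is stuck immediately (O wins that play).
losing_tree = Node("O", [
    Node("P", [Node("O")]),
    Node("P"),
])

# Toy tree 2: O has only the first continuation available.
winning_tree = Node("O", [
    Node("P", [Node("O")]),
])

print(has_winning_strategy(losing_tree))   # P wins one play but has no winning strategy
print(has_winning_strategy(winning_tree))  # P wins however O plays
```

In the first tree P wins a play only if O chooses badly, so winning a single play does not amount to having a winning strategy; this is the difference between the play level and the strategy level.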
On the basis of the definition of winning strategy, the notion of consequence for dialogical CL (classical logic) can be defined, that is, for a dialogical logic played with [SR0]-[SR3], the so-called CL-rules:
[D3][CL-Consequence] ϕ is a CL-consequence of ψ1, . . . , ψn (respectively, ϕ is CL-valid) iff, according to the CL-rules, there is a P-winning strategy in D(ϕ[ψ1, . . . , ψn]) (respectively D(ϕ)).
A similar definition of consequence for dialogical IL (intuitionistic logic) is obtained by substituting the IL-rules for the CL-rules, i.e., by substituting [SR1i] for [SR1c]. In dialogical logic, the existence of a proof is determined by the existence of a winning strategy. The rules are now illustrated by means of the following examples:
Dialogue 1
      O                          P
                                 (p ∨ q) → p     0
1     n := 1                     m := 2          2
3     p ∨ q       0              p               6
5     p                   3      ? ∨             4
Explanation: In the outer columns, the numbers of the moves are indicated. The move being attacked is indicated in the inner columns. For example, at move 3, O − ! p ∨ q is an attack against 0; at move 4, P − ? ∨ is an attack against 3. A defense always appears facing the challenge (on the same line). In this play, P cannot answer directly to the attack performed by O at move 3, since he cannot assert an atomic formula that has not previously been conceded. The only possibility for P is to counterattack O’s disjunction, at move 4. Then O chooses p, and P makes use of this concession to answer move 3 by asserting p, at move 6. There is then no further possible move for O, and P wins. It is worth noting that this is only one play, which exhibits the meaning of “(p ∨ q) → p”. Such a play is not sufficient to determine the validity of the thesis. Validity is defined in terms of the existence of a winning strategy. In order to determine the existence of a P-winning strategy, all the possible choices for O must be considered. Here, another possibility was available at move 5, where O might have answered q instead of p. This would also have been a smarter move, since it would have prevented P from winning the play. By considering the alternative plays, it can be observed that the initial thesis is not valid and that P wins only if O plays badly, rather than no matter how O plays. This distinction between the play level (semantics) and the strategy level (validity) is fundamental in dialogical logic. This point will be discussed later when speaking of defeasibility and non-monotonicity in dialogues. Another example illustrates the distinction between classical and intuitionistic dialogues.
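The verdict that “(p ∨ q) → p” is not valid can be cross-checked outside the dialogical framework. The sketch below uses an ordinary classical truth table, which is a different technique from the search for a P-winning strategy; it relies on the assumption that dialogical validity under the CL-rules agrees with classical validity. The names `thesis` and `countermodels` are hypothetical.

```python
from itertools import product


def thesis(p: bool, q: bool) -> bool:
    """Classical truth value of (p ∨ q) → p."""
    return (not (p or q)) or p


# Collect every valuation (p, q) that falsifies the thesis.
countermodels = [
    (p, q)
    for p, q in product([True, False], repeat=2)
    if not thesis(p, q)
]

print(countermodels)
```

The only falsifying valuation makes q true and p false, which matches the alternative play in which O answers q at move 5 and leaves P without a concession of p to rely on.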
Dialogue 2
      O                          P
                                 p ∨ ¬p          0
1     n := 1                     m := 2          2
3     ? ∨         0              ¬p              4
5     p           4              −−−
                                 p               6
Explanation: This is a play for the initial thesis p ∨ ¬p, the classical law of excluded middle, played with the classical rule [SR1c]. O begins by challenging the disjunction. Since p is atomic and has not been conceded, the only possibility for P is to assert ¬p. Then, O challenges the negation by conceding p. Therefore, at move 6, P repeats his defense against the challenge made at move 3 and asserts p by making use of the concession made by O. This play would not be possible with the intuitionistic rule [SR1i], since P could only defend himself against the last non-defended attack. In other words, after move 5, P cannot do anything according to the intuitionistic rules. It is worth noting that, contrary to Dialogue 1, there is only one possible play here. Given that O could not have played otherwise (he had no opportunity to choose), P’s win is sufficient to determine the classical validity of the initial thesis. This kind of example shows that there is no difference in meaning at the local level. Indeed, the rule for the disjunction is the same for classical and intuitionistic dialogues. The change is to be found at the global level, i.e., the level which provides the conditions for using the particle rules in a whole dialogue. Again, the play level only provides the semantics, although it also grounds the strategy level. Indeed, given the rules for intuitionistic dialogues, there cannot be a winning strategy for P for this initial thesis. And this is fully determined from the beginning.
Abduction
When thinking about abduction in a dialogical setting, what is interesting is what the players actually do and the processes by means of which hypotheses are introduced. In this chapter, abduction is conceived in terms of interactive constructions, the clarification of which can take advantage of conceptual insights found in the recent revival of the Peircean pragmatic view. That is why the models put forward by Gabbay and Woods (2005), Woods (2013), or Magnani (2017), in which abduction is understood in the light of cognitive, economic, and ecological considerations, constitute a starting point. Indeed, defending a thesis by changing the underlying logic might be perceived as a dialogical fallacy, even as cheating: if I cannot win the play, then I change the rules. But in light of these models, this process may also be understood otherwise, as a strategic adjustment process that commits to further justification. The theoretical background of these models of abduction is the subject of this section, before returning to dialogues.
The economic and ecological perspectives make it possible to discuss standards of correct reasoning, in particular, the difference between aptness and accuracy. Although correct reasoning is usually considered with respect to its accuracy (e.g., deductive validity or inductive strength), accuracy is not aptness. That is, depending on the resources available and the objectives of the agent, a less accurate reasoning may nevertheless be more apt to a given situation. Aptness may be judged from the perspective of eco-cognitive systems, defined by Magnani (2017, p. 9) as triples of the form ⟨A, T, R⟩, where A is an agent, T is a cognitive target (i.e., something the agent wishes to know or do), and R relates to the available resources (information, computational capacity, memory, time, financial resources, and so forth). The adequacy and the conditions of attaining a target are contextual, relating to the type of agent and his or her resources. Indeed, an individual agent with few resources appropriately sets less ambitious targets than an institutional agent (States, NASA, CNRS, CSIC, . . . ). A suitable strategy also consists in maximizing the agent’s resources in order to meet the target, that is, to do the best with less. From this perspective, scant-resource adjustment strategies, which are less costly and have more realistic targets, are sometimes better than stubborn quests for accuracy. Standards that make targets unattainable are intrinsically inappropriate. According to Woods (2013, pp. 364 ff.), abduction is a scant-resource adjustment strategy, by means of which a hypothesis is introduced in view of acting despite a persisting state of ignorance. From this perspective, setting defeasible hypotheses tentatively is often a better strategy than trying to refrain from making errors. Indeed, if individual agents had to wait for certainty in order to make a decision and act, they could do nothing at all.
Standards of reasoning defined in terms of deduction and induction are accurate, with high standards of precision, but they are inappropriate in situations in which crucial resources are limited. With respect to agents, the targets that they set, and the (limited) resources available to them for meeting them, error avoidance is not always a general condition of cognitive success. As noted by Woods (2013, p. 366), what is sought is a third-way reasoning, in which cognitive economy (a practical balance between cognitive aspirations and the cognitive resources available for achieving them) plays a fundamental role. Third-way reasoning is of a practical nature, that is, agent-based, goal-oriented, and, therefore, resource-bound. This is why the standards established to determine its correctness must be stated in terms of the appropriateness of the goals set by agents and the resources available to them to attain them. It should be noted that the term “third-way reasoning” encompasses various kinds of reasoning, including certain forms of abduction. According to Woods (2013, p. 223), “there seems to be no want of candidate logics for the analysis of third-way reasoning – non-monotonic logics, truth maintenance systems, defeasible inheritance logics, default logics, autoepistemic logics, circumscription logics, logic programming systems, preferential reasoning logic, abductive logics, theory-revision logics, belief change logics and whatever else.” Saying that third-way reasoning is practical refers to agents with low resources and their relative goals. Be that as it may, this does not preclude its use for scientific purposes, insofar as a “theoretical abduction”
may be performed by an individual researcher with limited resources. This can be explained by resorting once again to Woods (2013, p. 15), who distinguishes between two kinds of goals: small tasks requiring few cognitive resources and big tasks needing many. Usually, the former are performed by individual agents and the latter by institutional agents. The targets set by these two kinds of agents depend on their resources. To send astronauts to Mars over the next decade is a coherent target for SpaceX, but not for an individual agent. Agents are thus limited by the targets that they can afford to set and the resources available to meet them. As a result, cognitive tasks have inbuilt standards of success: standards of proof vary with the nature of the cognitive target and the level of resource adequacy. By and large, institutional agents with high targets and plenty of resources establish high standards of accuracy. Conversely, individual agents with lower targets and fewer resources implement lower ones. From this perspective, the traditional fallacies that Woods calls the “Gang of Eighteen” may be cognitively virtuous whenever they serve to maximize the resources available to an agent in order to attain a coherent target. The aptness of reasoning must be decided in terms of strategies for maximizing scant resources by striking the right balance between targets and resources. For example, according to Gigerenzer (2005, p. 196), 3-year-old children learning to speak who say “I gived” instead of “I gave” commit a good error. They first learn a general rule for the preterite, and then correct themselves when they are told that it is an irregular verb. Although they act based on a hasty generalization, it is a good strategy from an eco-cognitive perspective. Albeit inaccurate, since it does not lead to grammatically correct sentences, this generalization is nonetheless adequate, since it allows children to learn language, notwithstanding their limited resources.
Indeed, given children’s cognitive resources, an error-avoidance strategy would make the target unattainable. As a matter of fact, most of our knowledge is acquired through similar processes of error, feedback, and correction, which are usually more appropriate than stubborn quests for accurate conclusions. Of course, this assumes error detection and management strategies. Third-way reasoning may be defeasible: its premises support the conclusion, even though it is possible for the former to remain true and the latter to be revised in light of new information. In general, conclusions need not be completely dropped, but they must be formulated with a certain amount of flexibility. This allows agents to correct them. Many other so-called fallacies are sometimes cognitively virtuous. In particular, affirming the consequent may become cognitively virtuous when it does not occur in the course of a deduction but is understood in terms of hypothetical reasoning. Judging the correctness or incorrectness of reasoning depends on the cognitive system. The same level of accuracy should not be expected from a child who is learning language as from a grammarian who is studying language with practically unlimited information, computation, and time resources. Although children who hastily generalize the application of grammatical rules to irregular cases make a mistake, their strategy is cognitively appropriate for learning language in view of their own limited resources. The relative adequacy of standards of reasoning explains not only why the so-called fallacies are committed and why they are attractive, but also why they are cognitively virtuous. According to Woods (2004,
p. 354), a fallacy is an argument that is good and bad relative to different levels of access to the necessary cognitive resources. There are good hasty generalizations and post hoc ergo propter hoc fallacies, although they are also bad in a relevant sense, because they fail to meet the standards of deduction and/or induction. The appropriateness of those standards does not depend on their absolute accuracy, but on their adequacy with respect to given cognitive systems. Whereas institutional agents usually attempt to avoid errors, individual agents leverage scant-resource adjustment and error management strategies. It is in this context that the importance of the Gabbay and Woods Model of abduction (GWm), following Gabbay and Woods (2005), and recently developed by Woods (2013), must be understood. Woods (2013, p. 376) defines abduction as an “ignorance-preserving” inference, in response to an ignorance problem. A question to which an agent has no answer acts as a cognitive irritant that forces him to formulate a hypothesis that may serve as a basis for new actions, despite his persisting state of ignorance. The GWm may be defined as follows. Let T be an agent’s epistemic target at a specific time, K the agent’s knowledge base at that time, $K^*$ an immediate successor of K, and R an attainment relation for T (i.e., R(K, T) means that knowledge base K is sufficient to reach target T). The symbol $\rightsquigarrow$ denotes the subjunctive conditional connective (for which no particular formal interpretation is assumed), and K(H) is the revision of K upon the addition of H. C(H) denotes the conjecture of H and $H^C$ its activation. Let $T!Q(\alpha)$ denote the setting of T as an epistemic target with respect to an unanswered question Q to which, if known, α would be the answer. The GWm has the following general structure:

1. $T!Q(\alpha)$
2. $\neg R(K, T)$ (fact)
3. $\neg R(K^*, T)$ (fact)
4. $H \notin K$ (fact)
5. $H \notin K^*$ (fact)
6. $\neg R(H, T)$ (fact)
7. $\neg R(K(H), T)$ (fact)
8. $H \rightsquigarrow R(K(H), T)$ (fact)
9. H meets further conditions $S_1, \ldots, S_n$ (fact)
10. Therefore, $C(H)$ (sub-conclusion, 1–7)
11. Therefore, $H^C$ (conclusion, 1–8)
Abduction is triggered by a question Q and sets as a target the answer to that question (step 1). The agent’s knowledge is not sufficient to provide an answer (step 2), and he has no means or sufficient resources to find the answer in a timely way (step 3). The hypothesis is not a piece of knowledge (step 4), and it is not a piece of knowledge that could be acquired in a timely way either (step 5). The hypothesis alone does not attain the target (step 6), and neither does it when combined with the knowledge base (step 7). Otherwise, ignorance would be removed, which Woods (2013, p. 368) calls “subduance,” and no abduction would be performed. In order to avoid
misunderstandings, K is a knowledge base that can be regarded as standing for a set of propositions known by the agent (without excluding other forms of knowledge). It should not be confused with Hintikka’s (1962) epistemic operator in sentences like $K_\alpha\varphi$, whose intended meaning is that “α knows that ϕ” and which is true in a corresponding modal framework if and only if ϕ is true in every state of affairs compatible with α’s knowledge. This would lead to an inconsistent reading of Step 7, given that if the agent knew H, then T would be attained. But this is clearly not how K has been defined, by contrast with H ∈ K. In the GW scheme, K(H) stands for the revision of K, a set, upon the addition of H, a hypothesis, without assuming that the resulting (revised) set is another knowledge base. That would be the case with a successor $K^*$ of K in subduance, but then no abduction would be triggered. Yet, the subjunctive relation $H \rightsquigarrow R(K(H), T)$ (Step 8) holds, since if H were true, then it would play a role in the attainment of T. Thus, given certain conditions, yet to be specified, met by H (Step 9), H is worth being conjectured and C(H) can be concluded (Step 10). If the process ends there, only a partial abduction is performed; that is, a hypothesis is introduced but it is not activated. Full abduction occurs when the hypothesis is released as a basis for further reasoning without having been verified (Step 11). It is worth noting that the subjunctive conditional $H \rightsquigarrow R(K(H), T)$ should not be understood as expressing that H is a sufficient condition for R(K(H), T) either. Indeed, the truth of H is obviously insufficient for the attainment of T, given that if the agent does not know H, the target will not be reached. In order for the antecedent to express a sufficient condition for R(K(H), T), a stronger formulation is needed, as in 8′:
8′. $H \in K \rightsquigarrow R(K(H), T)$

Although 8′ might express an acceptable fact if the conditional were deductively or classically understood (i.e., inferring R(K(H), T) from H ∈ K), it is inappropriate to express the conditions of acceptability of a hypothesis. Indeed, abduction is not deduction, and just as Step 8 would be committed to the conjecture of H from $H \rightsquigarrow R(K(H), T)$, so too would 8′ be committed to the conjecture of H ∈ K. It should be recalled that in Peirce’s schema what is suspected to be true is the antecedent A of the subjunctive conditional “if A were true, C would be a matter of course.” That is, instead of inferring that there are reasons to suspect the truth of H, it would be inferred that there are reasons to suspect that the truth of H is known. But this is clearly the conclusion one refrains from drawing in the ignorance-preserving GWm of abduction. The GWm is sufficiently general and flexible to incorporate various kinds of hypotheses. The introduction of H may be the result of a sentential abduction, but not necessarily. As explained by Woods (2017, pp. 148–149), the relation of attainment is not a relation of logical consequence either, but a more general relation of conclusionality. The general form of the GWm is well-suited for a combination with the eco-cognitive model advocated by Magnani (2017), although he expresses reservations regarding the ignorance-preserving feature of abduction. Both
are also relevant if the aim is to characterize non-sentential abductive hypotheses in dialogues. Indeed, hypotheses relative to the rules of interaction may consist of strategic adjustment processes. The partners do not necessarily stop the discussion merely because the rules of interaction allow no further moves in a dialogue. Agreement about the rules themselves may be reached by the players and serve as a basis for further hypothetical moves.
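As a purely illustrative sketch, the steps of the GWm can be mimicked in a few lines of Python. Everything in it is an assumption made for the sake of the example rather than part of Woods's account: the attainment relation R is approximated by forward chaining over toy rules, the subjunctive conditional of step 8 is approximated by checking what would follow if H were added to K, and the names `closure`, `gw_abduce`, and `AbductionResult` are hypothetical.

```python
from dataclasses import dataclass


def closure(facts, rules):
    """Forward-chain rules of the form (premises, conclusion) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


@dataclass
class AbductionResult:
    conjectured: bool  # step 10: C(H), partial abduction
    activated: bool    # step 11: H^C, full abduction


def gw_abduce(K, K_star, rules, T, H, conditions=(), activate=True):
    """Toy rendering of the GWm; R(base, T) is assumed to mean 'T is derivable from base'."""

    def attains(base):
        return T in closure(base, rules)

    checks = [
        not attains(K),            # step 2: K does not attain T
        not attains(K_star),       # step 3: neither does its successor K*
        H not in K,                # step 4: H is not known
        H not in K_star,           # step 5: nor knowable in a timely way
        not attains({H}),          # step 6: H alone does not attain T
        # Step 7 is not modelled separately: in this toy, K(H) is not a
        # knowledge base because H is unverified, which steps 4 and 5 record.
        attains(set(K) | {H}),     # step 8: if H were true, T would be attained
        all(condition(H) for condition in conditions),  # step 9: S1, ..., Sn
    ]
    conjectured = all(checks)
    return AbductionResult(conjectured, conjectured and activate)


# Hypothetical example: the agent wonders why the grass is wet.
rules = [(("rain", "outdoors"), "wet_grass")]
K = {"outdoors"}
full = gw_abduce(K, K, rules, T="wet_grass", H="rain")
partial = gw_abduce(K, K, rules, T="wet_grass", H="rain", activate=False)
rejected = gw_abduce(K, K, rules, T="wet_grass", H="sprinkler")
```

In this toy run, "rain" is conjectured and, when `activate` is left on, released for further reasoning without being added to K, which is one way of picturing the ignorance-preserving character of the scheme.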
Defeasibility and Dialogues

Defeasibility is an essential feature of the models of abduction that have just been mentioned. Although there is no consensus with respect to its definition, it can be said that an argument is defeasible if its premises provide support for the conclusion, even though it is possible for the premises to remain true and the conclusion to be revised in light of new information. Formally, defeasibility may be approached through logics that do not observe the monotony property, i.e., if $\Gamma \vdash \varphi$, then $\Gamma \cup \Delta \vdash \varphi$ for any further set of premises $\Delta$. To understand defeasibility in a dialogical setting, particular attention must be paid to the distinction between the play level and the strategy level. Otherwise, confusion may arise and jeopardize the possibility of extending dialogues beyond standard deductive logic. These issues are clarified in this section. It has already been mentioned that P winning a play is not sufficient to determine validity, which requires that P be able to win no matter how O plays (see explanation to Dialogue 1). To determine whether there is a P-winning strategy, all the possible plays must be taken into account, in particular all the possible choices offered to O when, for example, defending a disjunction or challenging a conjunction. Therefore, particular attention must be paid to the distinction and also to the links between the play and the strategy levels. Indeed, the play level is concerned with meaning, explained in terms of interaction within particular language-games. The strategy level is concerned with validity, explained in terms of a P-winning strategy, which is not necessary at the play level. Meaning can be manifested even if the players do not play in an optimal way. One may think here of a chess player who can exhibit his knowledge of the rules by applying them, even if he is not able to win any play.
In the same vein, a dialogical player manifests his knowledge of the meaning by applying the rules of the game in actual plays, even if he does not play well, e.g., he loses even when he could have won. In order to better understand how the dialogical theory of meaning is built, reference is now made to the Dialogues for Immanent Reasoning (DIR) of Rahman et al. (2018), in which local and strategic reasons backing a statement are made explicit in the object language by incorporating features of Constructive Type Theory (CTT). The technical details of their work cannot all be provided here, but the main lines and their conclusions can be explained. Dialogical rules are normative for interaction and do not involve any strategic component. Particle and structural rules determine the meaning of statements in terms of rights and duties, i.e., the right to challenge a statement or to ask for reasons and the duty to answer such challenge
or to give reasons. This provides a dialogical turn to Brandom’s inferential approach to meaning (1994). In addition, the particle rules also have a normative aspect with respect to the choices (see the particle rules, section “Dialogical Logic”): the choice is for the challenger of a conjunction, but for the defender of a disjunction. The structural rules determine the general organization of dialogical games. The correct application of both kinds of rules has no link with the notion of winning strategy. Indeed, knowing the meaning of an expression is knowing how to build a play for it, no matter who wins. It is only when plays are linked to winning strategies that one can speak of validity and rules of inference. Stated in a Wittgensteinian manner, the play level reflects (i) the internal feature of meaning, and (ii) meaning as mediated by language-games (Rahman et al., 2018, p. 278). The first point leads to the need for a fully interpreted language, such as the language of DIR, which incorporates features of the CTT in order to make explicit the reasons backing the statements uttered by the players. The second point leads to the notion of dialogue-definiteness and the notion of propositions as plays. In relation to meaning as mediated by language-games, the notion of proposition in dialogical logic cannot be identified with strategies but with actual plays. If meaning is mediated by language-games, they must be actually playable by human beings. Following Lorenz (2001, p. 258), a proposition becomes a dialogue-definite expression, that is, an expression A such that there is an individual play about A that can be said to be lost or won after a finite number of moves performed in accordance with the rules of interaction. To know the meaning of an expression A is to know how to build a play for A. And this is independent of the validity of A. Therefore, in dialogical games, propositions should not be identified with winning strategies, but with plays.
That is why in DIR, statements are backed by two kinds of reasons: local reasons and strategic reasons. Local reasons are precisely those reasons by means of which a language can be fully interpreted at the play level, regardless of the strategy level. In standard dialogues, reasons are left implicit. For example, a player uttering ϕ ∧ ψ is committed to utter ϕ and to utter ψ if his argumentative partner makes this request explicit by challenging the conjunction. All of this is made explicit by means of CTT in DIR, until the local reasons are given for atomic formulas. In standard dialogues, the simple fact that O gave a reason for an elementary proposition is sufficient for P to copy this reason for the same proposition. By means of CTT, DIR makes this kind of reason explicit. All of this is left implicit in standard dialogues such as Dialogue 1 and Dialogue 2 (section “Dialogical Logic”): the local reason P has for asserting p (move 6 in both) is that O stated the same atomic formula previously, in conformity with [SR2]. It is only when linked to winning strategies that moves in a dialogue can be considered as inferential. Strategic reasons are a kind of recapitulation of what can happen for a given thesis and show the entire history of the play by means of the instructions. They show an overview of the possibilities enclosed in the thesis. But the fundamental level of plays is also needed. This link clearly becomes salient when a heuristic method to extract the strategy level from the multiplicity of possible plays enclosed by the initial thesis is spelled out. The strategy level is a generalization of the procedure which is implemented at the play level; it is a systematic exposition
of all the variants. The strategy level allows a comparison of the different plays on the same thesis. They need not be actually carried out by the players. They are only a perspective on the possibilities offered by the play level. Given that there is a P-winning strategy if and only if P has a way to win regardless of O’s choices, P’s strategies are built on O’s choices. That is, each possible choice of O must be taken into account and dealt with in order to determine whether P is able to win in all the different cases stemming from O’s possible choices. This is the basis for defining an algorithm to extract strategies from the play level (Rahman et al., 2018, p. 68, pp. 89ff). A heuristic method to build the P-winning strategies consists in taking into consideration O’s choices that entail a branching. For example, as in Dialogue 1, there is a branching when O defends a disjunction (move 5). A branching would also occur in other moves in which O performs choices. For example, if P states a conjunction, O can request the conjunct of his choice. Therefore, strategies stem from the play level. All the rules are applied and then every possible play opened by branching is considered. The distinction between the play and the strategy level is important for different reasons. First, if they are not distinguished carefully, confusion arises between semantic and inferential issues. Second, their connection shows the path to follow for the abduction of interaction rules, which is the main target of this chapter. Let us begin with the first point. The confusion between the play and the strategy level may lead to erroneously considering that dialogues intrinsically involve a strategic component. This would have the unwelcome result that dialogues are non-defeasible in essence, and would thus jeopardize our business. The point is explained by referring to Dutilh Novaes (2015, p.
601) who claims that “[w]hat starts as a strategic but not mandatory component of the dialogical game – putting forward indefeasible arguments – then becomes a constitutive structural element of the deductive method as such: only indefeasible arguments now count as correct moves in a deductive argument” (author’s emphasis) when defending her “built-in opponent” (BIO) conception of logic and deduction. Accordingly, the standard notion of logical truth, understood in terms of necessary truth preservation, has internalized in monological practices the role of O as an ideal interlocutor who seeks to defeat the argument by showing a case in which the premises are true, but the conclusion is false. If O cannot succeed in defeating P’s argument, then the thesis is valid. According to the BIO conception of logic and deduction, it is this role of O that has been encapsulated, internalized, in the standard notion of deductive validity. As Dutilh Novaes conceives dialogues, O always performs optimal moves. This leads to the conclusion that dialogues intrinsically involve a strategic component. This adversarial component accounts for the necessary truth preservation; that is, O tries to defeat the argument, which is valid if and only if it cannot be defeated, i.e., if it is indefeasible. O must play optimally and try to block the inferential steps P performs in order to derive the conclusion from the premises O has already granted. If dialogues intrinsically involve a strategic component, then defeasibility cannot be properly approached in dialogical games.
However, as has already been explained, the particle and the structural rules are nothing but definitory rules and do not involve any inferential component (see section “Dialogical Logic”). At the play level, it makes no sense to say that O’s role is reduced to checking the indefeasibility (or monotonicity) of one of P’s moves. Therefore, dialogues are not to be thought of as being designed to check the indefeasibility of P’s thesis, and there is no need to accept the consequence that they intrinsically involve a strategic component. At the play level, the players need not be thought of as optimal. There is room for error and defeasibility, although it is only from the perspective of the strategic level that the best ways to play can be spotted. Nothing prevents a player from playing badly, even when there exists a winning strategy. At the play level, limitations of computational skills, memory, information, and so on can also be accounted for.
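The contrast between the two levels can be pictured with a small game-tree sketch in Python. It is only an illustration under simplifying assumptions: a dialogue is reduced to an abstract tree of choices, the hypothetical `Node` class records who chooses next and who wins at a leaf, and no particle or structural rule is actually implemented. A play is one path through the tree, whereas a P-winning strategy requires that P can win whatever O chooses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    mover: Optional[str] = None    # "P" or "O": who chooses next; None at a leaf
    winner: Optional[str] = None   # "P" or "O", set only at leaves
    children: dict = field(default_factory=dict)  # choice label -> Node


def play_winner(root, choices):
    """Play level: follow one concrete sequence of choices and report who wins."""
    node = root
    for label in choices:
        node = node.children[label]
    return node.winner


def has_p_winning_strategy(node):
    """Strategy level: P must win for every choice of O and for some choice of P."""
    if not node.children:
        return node.winner == "P"
    results = [has_p_winning_strategy(child) for child in node.children.values()]
    return all(results) if node.mover == "O" else any(results)


# Tree A: P wins one play, but O has a choice that defeats P (no winning strategy).
tree_a = Node(mover="O", children={
    "left": Node(winner="P"),
    "right": Node(winner="O"),
})

# Tree B: P has a winning strategy, yet may still lose a play by choosing badly.
tree_b = Node(mover="O", children={
    "left": Node(winner="P"),
    "right": Node(mover="P", children={
        "good": Node(winner="P"),
        "bad": Node(winner="O"),
    }),
})
```

Tree A mirrors the remark that P winning a single play does not establish validity; tree B mirrors the remark that nothing prevents a player from playing badly even when a winning strategy exists.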
Sentential Abduction in Dialogues

Defeasibility can be grasped at the play level. Yet, the use of hypotheses in dialogues remains to be clarified. Indeed, they must be distinguished from standard assertions of deductive dialogues. This can be done by specifying their conditions of use. Although the same propositional content may be shared by different kinds of speech acts, the conditions under which they can be uttered and the commitment to further justification they may bring with them are different. According to the GWm, abduction is a process whereby a hypothesis is set as a basis for new action despite a persisting state of ignorance. That is, the hypotheses may be activated in further reasoning without being verified. Based on this model, Barés and Fontaine (2017, p. 306) conceive abduction in dialogues in terms of hypothetical plays triggered by concession-problems: P faces a concession-problem if he lacks O’s concessions to win the play. In this case, P is allowed to introduce a hypothesis as a basis for further moves despite a lack of confirmation or in the absence of its concession. Even if deductive validity is not reached, such dialogues give rise to hypothetical plays and winning strategies. The emphasis on pragmatic features of dialogical moves leads to the introduction of new rules of interaction, from which another interpretation of P winning a play arises. More concretely, Barés and Fontaine (2017) advocate dialogical logic as a unified framework to reconcile logic and argumentation, and thus build the foundations of a dialogical approach to abduction. Notwithstanding the pragmatic flavor of the questions their analysis stems from, their proposal lies at the interface between the GWm and Aliseda’s (2006) formal account of abduction:

• How can a surprising fact, an abductive problem, and (or) an ignorance-problem be characterized?
• How can the guessing step, in which a hypothetical explanation is conjectured, be characterized?
• How can the ignorance-preserving feature of abduction be characterized?
The first question relates to the notion of triggering, namely the conditions under which an abductive problem may be stated in a play. The second question relates to the notion of guessing and the possibility of introducing a hypothesis in the course of a dialogue. The third question relates to the notion of committing, since it must account for the conjectural status of the hypothesis. That is, a new kind of speech act must be identified and distinguished from the usual assertions of standard deductive dialogues in terms of the commitment it carries. Accordingly, abductive plays are triggered when P faces a concession problem, that is, when P lacks the concessions needed to win the play. An additional rule that allows P to state that he faces an abductive problem (the so-called concession problem) is needed. Following Aliseda’s (2006, p. 47) distinction between abductive novelty and abductive anomaly, two kinds of concession problems can be considered. This leads to the introduction of two new structural rules, [SRAN] and [SRAA], which allow P to claim he is facing an abductive novelty or an abductive anomaly, respectively. Without giving the technical details, these rules intuitively say that P can claim, in the former case, that he has a winning strategy neither for ϕ[] nor for ¬ϕ[], or, in the latter case, that he has no winning strategy for ϕ[] and moreover ¬ϕ. These triggering moves can be challenged by O, who takes the burden of proof for either ϕ[] or ¬ϕ[] with a view to blocking P’s attempt to introduce new hypotheses. Then, if P succeeds in defending the existence of a concession problem, he is allowed to introduce a hypothetical abductive solution. Such a guessing move is allowed by means of some kind of weakening of the formal rule, given that P will now be allowed to introduce atomic formulae hypothetically. Now, even if there is a P-winning strategy for the initial thesis, it is only hypothetical and it does not suffice to determine (deductive) validity.
From this moment, we leave the realm of deductive logic and take the path of hypothetical plays. In addition, what is characteristic of the guessing move is that it commits the player who performs it to defend it against further challenges, which reflects conditions of acceptability of abductive hypotheses. The rules defined by Barés and Fontaine (2017, p. 309) parallel some of Aliseda’s conditions (2006, p. 74). Indeed, two rules allow asking for a justification that the abductive hypothesis is plain or explanatory. If O challenges a P-guessing move, P will have to show that the hypothesis allows him to win the subsequent play (plain) or that the hypothesis alone (without the initial concessions) is not sufficient to win the play (explanatory). Other conditions could be reflected in the introduction of similar rules. If P is not able to defend his commitments, then the guessing moves are defeated. Recently, Barés and Fontaine (2019, 2021a) have tackled defeasibility in abductive dialogues by defining rules for adaptive dialogues of abduction (ADAr). In these dialogues, hypothetical moves are adaptive conditional moves based on affirming the consequent. Roughly, when P faces a concession problem, he is allowed to challenge a conditional uttered by O by asserting the consequent. O must answer by stating the antecedent. Such moves are still hypothetical and P can perform them only if he commits himself to the defense of further conditions, defined in relation to
the adaptive notion of reliability. Further details on ADAr will not be given here since they do not really add new insights from a conceptual perspective. Another adaptive dialogical approach, which fits better with the aims of this chapter, will be defined more precisely later, including the notion of an adaptive conditional move. Despite interesting insights, the dialogical approaches that have just been mentioned in this section are still confined to sentential abductions. This does not do justice to the possibilities offered by the dialogical framework, in which other kinds of abductive hypotheses can be accounted for. Indeed, the main objective is first to ask what the contribution of dialogues to our understanding of abduction is and, second, to highlight how they can be used to model new patterns of abduction, beyond purely sentential hypotheses. In particular, the aim is to make use of dialogues to handle formal hypotheses, i.e., hypotheses relative to the rules of interaction, whereby inferential hypotheses may arise. For the sake of explanation, Estrada (2012, p. 181) claims that abduction could be understood from a wider perspective, in the context of an input-output logic. He aims to study “phenomena that seem to have an abductive flavor but that are not covered by the usual characterizations of abduction,” i.e., the logico-formal approaches which focus on the sentential aspects of abduction. For example, one may look for a logic that validates classically invalid principles such as subalternation, not by introducing additional principles or sentential premises, but by changing the meaning of a connective or the consequence relation. Estrada (2012, p. 188) argues that the notion of inference should be defined through processes of obtaining outputs from some inputs, which need not be deductive, as follows:

[Inference Scheme] $\Gamma_1, \ldots, \Gamma_i \vdash_L \Upsilon_1, \ldots, \Upsilon_j$,

where $\Gamma_1, \ldots, \Gamma_i$ are the inputs and $\Upsilon_1, \ldots, \Upsilon_j$ the outputs.
Then, the notion of inferential problem would be generally defined as follows:

[Inferential Problem] $\Gamma_1, \ldots, \Gamma_i, ?I \;\mathrm{X}_L\; ?O, \Upsilon_1, \ldots, \Upsilon_j$,

where $\mathrm{X}_L$ denotes that inputs and outputs do not stand to each other in a desired relation, and that a modification of the inputs ?I or the outputs ?O, or even both, would help to obtain it, which is expressed by means of $\surd_L$. An abductive problem would be defined as follows:

[Abductive Problems] $\Gamma_1, \ldots, \Gamma_i, ?I \;\mathrm{X}_L\; \Upsilon_1, \ldots, \Upsilon_j$,

And finally, “[g]iven some inputs and outputs, abduction is the process of modifying enough inputs in order to get a desired relation between them and the outputs” (Estrada, 2012, p. 189). The authors fully agree with Estrada’s remarks, but they still think that his definition does not attain the generality he is looking for. Indeed, inputs are not
necessarily situated at the left-hand side of the relation, since they could concern the relation itself. A generalization of this scheme is made possible by considering a dynamic relation $\mathrm{X}_L$. And this is precisely what is intended within a dialogical framework. The leading idea of the dialogues presented in the next sections is that a thesis may be defended without strictly applying the initial rules. This yields dialogues in which the rules of interaction change and may even become the object of the dispute. But changing rules is not transgressing rules. These are not errors of reasoning or dialogical fallacies, but scant-resource adjustment strategies. The argumentative interaction need not break off merely because no further move is allowed by the initial rules. Sometimes, the argumentative partners agree on a certain flexibility and decide to apply different rules as a basis for further (hypothetical) plays.
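The sentential picture summarized in this section (a concession problem classified as novelty or anomaly, followed by a guess that must be plain and explanatory, that is, abduction as a modification of the inputs) can be sketched in Python. The sketch rests on strong assumptions that are not part of the dialogical framework itself: "P has a winning strategy for ϕ given O's concessions" is replaced by derivability through forward chaining over toy rules, negation is a mere string prefix, and the function names are hypothetical.

```python
def closure(facts, rules):
    """Forward-chain rules of the form (premises, conclusion) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def neg(literal):
    """Toy negation: '~p' is the negation of 'p' and vice versa."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def classify(concessions, rules, thesis):
    """Triggering: is there a concession problem, and of which kind?"""
    derived = closure(concessions, rules)
    if thesis in derived:
        return "no problem"
    return "anomaly" if neg(thesis) in derived else "novelty"


def abductive_solutions(concessions, rules, thesis, candidates):
    """Guessing: keep candidates that are plain and explanatory."""
    solutions = []
    for h in candidates:
        plain = thesis in closure(set(concessions) | {h}, rules)   # wins with the concessions
        explanatory = thesis not in closure({h}, rules)            # does not win alone
        if plain and explanatory:
            solutions.append(h)
    return solutions


# Hypothetical example.
rules = [
    (("rain", "outdoors"), "wet"),
    (("sprinkler",), "wet"),
    (("covered",), "~wet"),
]
novelty = classify({"outdoors"}, rules, "wet")
anomaly = classify({"outdoors", "covered"}, rules, "wet")
guesses = abductive_solutions({"outdoors"}, rules, "wet", ["rain", "sprinkler", "wind"])
```

Here "sprinkler" is rejected because it would yield the thesis without the concessions, and "wind" because it does not yield it at all. The sketch only modifies the inputs; the generalization defended in the text, in which the relation itself changes, is not captured by it.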
Abducting Rules in Structure Seeking Dialogues

The Structure Seeking Dialogues (SSD) of Rahman and Keiff (2005) and Keiff (2007) emphasize the abduction of rules in the course of a dialogue in what might be called “frame-based abduction,” given that they involve hypotheses about the underlying modal framework. In modal dialogical logic, moves are sequences X − e − c, where X and e are as before and where c is an assignment of a context (possible world) to a formula. The object language is enriched by means of the two modal operators, $\Box$ and $\Diamond$, for necessity and possibility, respectively. Their local meaning is given by the following particle rules:
Particle Rules for Modal Dialogues

| Assertion | Attack | Defense |
| X − !$\Box$ϕ | Y − ?$\Box$/j − i | X − !ϕ − j |
| X − !$\Diamond$ϕ | Y − ?$\Diamond$ − i | X − !ϕ − j |
As for the rules for conjunction and disjunction, matters of choice are of particular importance. When challenging the necessity operator, it is Y who chooses a context j; but it is X who chooses when defending the possibility operator. The structural rules for the different modal dialogical logics will not be defined. The SSD will not be fully presented, and this section will be restricted to a simplified version with a view to explaining the basic insights. Some of the rules must be adapted. For example, assertions are made within a context, and O’s concessions must be relativized to their context of utterance as well. Thus, the formal rule [SR2] now states that P can assert an atomic formula ϕ in a context c only if O has previously asserted ϕ in c. This is illustrated by the following example, without adding any structural restriction to the applications of the particle rules:
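Dialogue 3

| R | move | c | O | challenged move | P | c | R | move |
|  |  |  |  |  | $\Box p \to \Box\Box p$ | 0 |  | 0 |
|  | 1 |  | n := 1 |  | m := 2 |  |  | 2 |
|  | 3 | 0 | $\Box p$ | 0 | $\Box\Box p$ | 0 |  | 4 |
| $\langle 0, 1 \rangle$ | 5 | 0 | ?$\Box$/1 | 4 | $\Box p$ | 1 |  | 6 |
| $\langle 1, 2 \rangle$ | 7 | 1 | ?$\Box$/2 | 6 | p | 2 |  | 10 |
|  | 9 | 2 | p | 3 | ?$\Box$/2 | 0 | $\langle 0, 2 \rangle$ | 8 |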
Explanation: In the c-columns, the dialogical context (possible world) is indicated by means of a natural number (by convention, 0 for the initial context). In the R-columns, the accessibility relation assumed when challenging a $\Box$ or defending a $\Diamond$ is explicitly indicated by means of ordered pairs of natural numbers ($\langle j, i \rangle$ means that the context i is accessible from the context j). The dialogue begins with P asserting the initial thesis $\Box p \to \Box\Box p$ (move 0), followed by the choices of rank (moves 1 and 2) and by O challenging the thesis by conceding the antecedent of the conditional (move 3). P answers by asserting the consequent (move 4), which is challenged by O (move 5). The attack consists in asking for $\Box p$ in a context 1 accessible from the context 0. Such accessibility is not assumed by the game; it is O who concedes this relation by challenging the necessity operator. P answers by asserting $\Box p$ in 1 (move 6), then O challenges the necessity operator, and simultaneously concedes that 2 is accessible from 1 (move 7). Since p is an atomic formula, P cannot answer and must counter-attack O’s concession of $\Box p$ (move 8). Here is the crucial move of this play. Indeed, in order to obtain p in 2, P must ask for p in 2 by challenging O’s utterance of $\Box p$ in the context 0 (move 8). Therefore, P must rely on the assumption that 2 is accessible from 0. Nonetheless, if no assumption is made regarding the underlying modal frame, i.e., if the play occurs in the system K, then P should not be allowed to perform this move. Only O should be allowed to introduce such assumptions, when challenging $\Box$-operators or defending $\Diamond$-operators. And this is exactly how it works with the structural rule relevant for K-modal dialogues. Stated otherwise, in K-modal dialogues, ordered pairs of the R-column behave like atomic formulas. P can state $\langle i, j \rangle$ only if O has already stated $\langle i, j \rangle$ before. As explained by Rahman and Keiff (2005, p. 389), different structural rules will be used in relation to different modal frames.
For example, if the underlying modal frame is assumed to be reflexive, then the play runs with another structural rule for contexts, i.e., P is allowed to state $\langle i, i \rangle$ for any context i introduced in the dialogue. Other rules can be defined for transitivity, symmetry, and so forth. Now, in the SSD, the main idea is that when P is not able to win the play he can introduce hypotheses regarding the underlying modal frame. This is reflected by means of a dynamic in the application of the structural rules for contexts, at the play level. In Dialogue 3, P conjectures a transitive modal frame when introducing $\langle 0, 2 \rangle$ (move 8) on the basis of O’s previous concessions of $\langle 0, 1 \rangle$ (move 5) and $\langle 1, 2 \rangle$ (move 7). Then, in the spirit of the GWm, the hypothesis is used in subsequent plays to show that P would win if the underlying modal frame were transitive. This leads to the following definition of a structural abductive problem by Keiff (2007, p. 200):

[D4][Structural abductive problem] Let L be a system of inference rules such that, for a given formula A, $\nvdash_L A$ and $\nvdash_L \neg A$. A structural abductive problem consists in looking for an optimal system of rules $L'$ such that $L \subset L'$ and $\vdash_{L'} A$.

Let us recall that dialogical logic is defined by two kinds of rules (see section “Dialogical Logic”). The particle rules provide the meaning of particles in terms of a triplet assertion-attack-defense. The structural rules govern the applications of these rules in dialogical games. The latter may change depending on the context of argumentation, e.g., if the play runs with intuitionistic rules, then only the last non-defended attack can be answered. This provides the play level which grounds the strategic level, at which the notion of validity and consequently the characterization of a logical system can be defined. The SSD thus allow some kind of dynamic in the use of structural rules, by introducing the possibility for P to conjecture accessibility relations: this is a frame-based abduction.
The full technical details of the SSD cannot be provided in this chapter, and this section must be restricted to a pragmatic interpretation of dialogues. In the original SSD, Rahman and Keiff make use of Blackburn’s hybrid languages (2001) in order to express these considerations in the object language. As a consequence, the rule itself becomes the object of the argument. For more details on the SSD, the reader is referred to Rahman and Keiff (2005, pp. 396–403), Keiff (2004, 2007).
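The structural rules for contexts described above can be illustrated with a minimal Python sketch. It is not the formulation of Rahman and Keiff: the function names, the encoding of accessibility as pairs (j, i) read as "i is accessible from j", and the string labels for frames are all assumptions made for this illustration. The sketch only checks whether P may state a given pair, given the pairs O has already conceded.

```python
from typing import Set, Tuple

# Convention for this sketch only: (j, i) reads "context i is accessible from context j".
Pair = Tuple[int, int]


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def p_may_state(pair: Pair, conceded_by_o: Set[Pair], frame: str = "K") -> bool:
    """Decide whether P may state `pair` under the assumed frame condition."""
    # In every frame, P may repeat a pair that O has already conceded.
    if pair in conceded_by_o:
        return True
    if frame == "K":
        # K: pairs behave like atomic formulas, so nothing else is allowed.
        return False
    if frame == "reflexive":
        # Contexts introduced so far: the initial context 0 plus those in O's pairs.
        introduced = {0} | {c for p in conceded_by_o for c in p}
        return pair[0] == pair[1] and pair[0] in introduced
    if frame == "transitive":
        return pair in transitive_closure(conceded_by_o)
    raise ValueError(f"unknown frame label: {frame}")


# O's concessions at moves 5 and 7 of the play discussed above.
concessions = {(0, 1), (1, 2)}
print(p_may_state((0, 2), concessions, "K"))           # False
print(p_may_state((0, 2), concessions, "transitive"))  # True
```

Read this way, the frame-based abduction consists in P moving from the "K" check, under which move 8 is forbidden, to the conjectured "transitive" check, under which it is allowed.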
Abducing Rules in Adaptive Dialogues

The dynamic invoked at the end of section “Sentential Abduction in Dialogues” is handled at the play level, e.g., by constraining the application of the particle rules by means of additional structural rules. In SSD, structural rules say how the particle rules for modal operators can be applied, in particular with respect to the accessibility relations. The issue will now be approached by means of another kind of hypothesis, namely hypotheses relative to the use of negation. In both approaches, modifying the rules of interaction involves changes at the strategy level, which stems from the play level. Therefore, when the rules of interaction change, the notion of consequence relation can also change. A better articulation of this dynamic can be found in adaptive dialogues, in particular the Inconsistency-Adaptive Dialogical Logic (IAD) of Beirlaen and Fontaine (2016). Although it was not initially concerned with abduction, IAD involves conditional moves which can also be interpreted as an abductive process. What is at stake in IAD is the use of negation. The main idea of IAD can be summarized as follows: Classical and intuitionistic dialogues are explosive. From an inconsistent set of premises, anything can be derived. This reflects the validity of the well-known Ex Falso Sequitur Quodlibet (EFSQ). But it is not uncommon that inconsistencies occur in the course of an argumentative interaction. Usually, this is not a reason to infer random statements or to stop the interactive process. That is why it may be agreed to begin with a paraconsistent logic in view of blocking explosion. A paraconsistent dialogical logic may be obtained by adding the structural rule [SR4.1] (inspired by the rule formulated by Rahman and Carnielli (2000)), which constrains the use of the particle rule for negation:

[SR4.1][Negation Rule] The sequence S is a play only if the following condition is fulfilled: If there is a move N1 = P − !ψ − C − d in S such that:
1. pS(N1) = n1,
2. FS(N1) = [m1, A], and
3. m1 = pS(M1) such that M1 = O − !¬ψ − C − d,
then there is a move M2 = O − !¬ψ − C − d in S such that:
1. pS(M2) = m2 and m2 < n1,
2. FS(M2) = [n2, A], and
3. n2 = pS(N2) such that N2 = P − !¬ψ − C − d.

Intuitively, this rule means that P is allowed to challenge a negated formula ¬ψ only if O has already challenged an occurrence of the same negated formula before. Stated otherwise, P is not allowed to assume that the negation in ¬ψ behaves normally unless O made that concession before. An immediate consequence is the invalidation of the EFSQ, as illustrated in the following dialogue:
Dialogue 4
Move | O | Challenged move | Challenged move | P | Move
 | | | | q[p, ¬p] | 0
1 | n := 1 | | | m := 2 | 2
3.1 | p | 0 | | |
3.2 | ¬p | 0 | | |
Explanation: With the standard rules, P would have been allowed to challenge O’s utterance of ¬p (move 3.2) by asserting p, relying on O’s concession of p (move 3.1). This would have been the only possible play, won by P, so that the thesis would have been valid. With the negation rule [SR4.1] for paraconsistent dialogues, P is not allowed to perform such an attack and the dialogue ends immediately after O concedes the premises of the initial thesis. But this rule has the undesirable effect that perfectly acceptable inferences like the disjunctive syllogism are now invalid. In IAD, P may conjecture the normal behavior of the negation by means of conditional moves. If O can show that P relies on an abnormality, i.e., an inconsistency, then the conditional move is not reliable. What this means is defined more precisely in what follows. For more details on adaptive logics, the reader is referred to Batens (2000, 2007), among others, but also to Gauderis’ and Beirlaen’s papers in the present volume.

An adaptive dialogical logic can be seen as the dynamic articulation of two logics: a lower limit dialogical logic (LLD) and an upper limit dialogical logic (ULD), which in some sense amplifies the range of possible moves allowed for a player. Moves applied in accordance with the ULD-rules are subject to additional conditions; they are the so-called conditional moves. The condition can be challenged in accordance with the relevant notions of abnormality and adaptive strategy, depending on the adaptive logic considered. IAD is defined according to the following triple:
1. Lower Limit Dialogical Logic: LLD = Paraconsistent Dialogical Logic ([SR0]–[SR3] + [SR4.1]),
2. Set of abnormalities: Ω =df {ϕ ∧ ¬ϕ | ϕ ∈ L},
3. Strategy: Reliability.

The LLD-rules are the rules of standard deductive classical logic plus the negation rule [SR4.1]. In addition, a rule is added for conditional moves, namely [SR4.2], an ampliative rule that allows P to conjecture the normal behavior of the negation. By applying this rule, a player commits himself to its reliability; that is, assuming the normal behavior of the negation with respect to a formula ϕ should not yield an abnormality pertaining to the set Ω. This condition will be explicitly indicated in the dialogue. Moves of IAD are now sequences of the form X − e − C − d, where X and e are as before, and C is the corresponding condition. Moreover, when the condition is challenged, the burden of proof may change, and O may become subject to the formal restriction. This dynamic in the application of the formal rule can be handled by distinguishing sub-dialogues: d is either the main dialogue, in which case it is written d1, or a sub-dialogue, in which case it is written d1.i for the i-th sub-dialogue.
The structural rules must be adapted; that is, a dialogue begins with an empty condition in the sub-dialogue d1. Then, the introduction of conditions and the passage to sub-dialogues are governed by the other structural rules. The classical development rule [SR1c] is likewise generalized to conditional moves by replacing moves of the form X − e with moves of the form X − e − C − d. The formal rule [SR2] will be replaced by [SR2.1] and [SR2.2]. First, the LLD of IAD is defined by adding the negation rule [SR4.1] to the rules of standard deductive dialogical logic. Based on this set of rules, the notion of LLD-consequence is defined.

[D5][LLD-consequence] Γ ⊢LLD ϕ (respectively ⊢LLD ϕ) iff according to the LLD-rules there is a P-winning strategy for the thesis ϕ[Γ] (respectively ϕ).

LLD may appear to be too constraining. That is why the ULD will allow P to challenge the negation in the standard way, provided he commits himself to justifying, if O asks him, that no abnormality is involved. Stated otherwise, P is allowed to perform an abductive move regarding the rules underlying the dispute, and this hypothesis can be challenged by O. This leads us to the adaptive strategy of reliability, defined with respect to our notion of abnormality.
Let Ω be the set of abnormalities defined as: Ω =df {ϕ ∧ ¬ϕ | ϕ ∈ L}. Let Δ be a finite subset of Ω; Dab(Δ) is the classical disjunction of the members of Δ, called the “disjunction of abnormalities.” Then, reliability is defined as follows:

[D6][Reliability] Let ϕ[Γ] be the thesis of P. A formula ψ behaves reliably with respect to Γ iff there is no formula Dab(Δ) such that:
(i) ψ ∧ ¬ψ ∈ Δ,
(ii) Γ ⊢LLD Dab(Δ), and
(iii) Γ ⊬LLD Dab(Δ \ {ψ ∧ ¬ψ}).
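The three conditions of [D6] can be read as a search for a disjunction of abnormalities in which the abnormality of ψ is indispensable. The following Python sketch is only an illustration under strong assumptions: LLD-derivability from the premise set is not implemented but passed in as a hypothetical oracle `derivable`, abnormalities ϕ ∧ ¬ϕ are represented simply by the name of ϕ, and the search is restricted to a finite list of candidate abnormalities supplied by the caller.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Sequence


def behaves_reliably(
    psi: str,
    candidates: Sequence[str],
    derivable: Callable[[FrozenSet[str]], bool],
) -> bool:
    """Check [D6] for `psi` over the finite candidate set.

    `derivable(delta)` is a hypothetical oracle answering whether the
    disjunction of abnormalities over `delta` follows from the premises
    in the lower limit logic. It must return False for the empty set.
    """
    others = [name for name in candidates if name != psi]
    for size in range(len(others) + 1):
        for rest in combinations(others, size):
            smaller = frozenset(rest)            # delta without psi's abnormality
            delta = smaller | {psi}              # condition (i): psi's abnormality is in delta
            if derivable(delta) and not derivable(smaller):  # conditions (ii) and (iii)
                return False
    return True


# Toy oracle for the premise set {p, ¬p}: a disjunction of abnormalities is
# derivable exactly when it contains p ∧ ¬p (an assumption of this example).
def toy_oracle(delta: FrozenSet[str]) -> bool:
    return "p" in delta


print(behaves_reliably("p", ["p", "q"], toy_oracle))  # False
print(behaves_reliably("q", ["p", "q"], toy_oracle))  # True
```

In the toy case, q stays reliable even though a disjunction containing q ∧ ¬q is derivable, because the smaller disjunction without it is derivable as well, so condition (iii) fails.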
In order to define IAD, [SR0] must be adjusted by replacing the moves of the definition with moves of the form X − e − ∅ − d1, where ∅ is the empty set of conditions (i.e., plays begin without conditions) and d1 indicates that the play begins in the sub-dialogue d1. The reasons will become clear in what follows. Then [SR2] will be replaced by [SR2.1] and [SR2.2], defined hereafter, in view of handling the formal rules in sub-dialogues. Finally, the following negation rule for IAD must be added, so that the structural rules of IAD are [SR0], [SR1c], [SR2.1], [SR2.2], [SR3], [SR4.1], [SR4.2], and [SR4.3].

[SR4.2][IAD Negation Rule] The sequence S is a play only if the following condition is fulfilled: If there is a move N = P − !ψ − C − d in the sequence S such that:
1. pS(N) = n,
2. FS(N) = [m, A], and
3. m = pS(M) such that M = O − !¬ψ − C − d,
then one of the following two conditions holds:
1. N is performed by P in accordance with the LLD-negation rule [SR4.1],
2. N = P − !ψ − Rψ − d, where Rψ abbreviates that ψ behaves reliably in view of the premise set Γ.

The point is that if O has already challenged an occurrence of the same negation before, then [SR4.1] applies and P may challenge the negation without condition. Otherwise, P is committed to defending the reliability of his move, i.e., that the negation behaves normally. That is why we speak of a conditional move, whose condition Rψ can be challenged by O. O does this by claiming that the abnormality corresponding to the formula in question is part of a Dab-formula that is a consequence of the premise set. According to Beirlaen and Fontaine (2016, p. 112), the Dab-formula provided by O must be an LLD-consequence of the premise set (condition (ii) of [D6]) and the abnormality should be an indispensable part of that disjunction (condition (iii) of [D6]). This means that P could in turn counterattack the Dab-formula stated by O if it does not satisfy these conditions.
The dynamic of this process is captured by the following particle rule for the reliability operator R:

Particle rule for the reliability operator R
Assertion: X − !ϕ − Rϕ − d
Attack: Y − ?Rϕ Dab(Δ) − ∅ − d1 (where ϕ ∧ ¬ϕ ∈ Δ)
Defense: X − !F(Dab(Δ)) − ∅ − d1
Or X counterattacks: X − !I(Dab(Δ \ {ϕ ∧ ¬ϕ})) − ∅ − d1 (where Dab(Δ \ {ϕ ∧ ¬ϕ}) ≠ ∅)
This particle rule captures the commitment of P towards the viability of his hypothesis concerning the rules underlying the dialogical interaction. Indeed, it compels P to justify the reliability of the conditional move he has previously performed. Sub-plays follow, in the course of which it is P’s hypothesis that becomes the object of the dispute. Indeed, O challenges the reliability condition by bringing a Dab-formula into opposition. Then, P can defend the condition by claiming that O is wrong and that his Dab-formula cannot be LLD-derived from the premise set, which he does by claiming F(Dab(Δ)), where F is the failure operator, whose meaning is given below by another particle rule. Alternatively, he counterattacks by claiming that ϕ ∧ ¬ϕ was not indispensable to Dab(Δ) and that a smaller disjunction that does not contain it as a sub-formula was already derivable from the premise set. This would mean that ϕ ∧ ¬ϕ is not guilty in the derivation of the Dab-formula. This he does by claiming I(Dab(Δ \ {ϕ ∧ ¬ϕ})), where I stands for the indispensability operator, whose meaning is also given by a particle rule defined below.

Particle rule for the failure operator F
Assertion: X − !Fϕ − ∅ − d1
Attack: Y − !ϕ[Γ] − ∅ − d1.i (Y opens a sub-dialogue d1.i)
Defense: −−− (no defense)

A similar rule for the failure operator F was initially introduced in dialogical connexive logic by Rahman and Rückert (2001b). In attacking the failure operator, Y opens a new sub-dialogue in which he takes on the burden of proof for ϕ[Γ]. In line with the starting rule, a sub-dialogue starts with the empty condition set. Given the setup of IAD plays, it will always be O who attacks expressions of the form Fϕ and who defends ϕ[Γ] (although nothing prevents considering other kinds of dialogues with the same rules). Consequently, it is now O who plays under the formal restriction. Let us now define the particle rule for the indispensability operator I:

Particle rule for the indispensability operator I
Assertion: X − !Iϕ − ∅ − d1
Attack: Y − ?Iϕ − ∅ − d1
Defense: X − !ϕ[Γ] − ∅ − d1.i (X opens a sub-dialogue d1.i)

Here, X’s defense consists in showing that the smaller Dab-formula is LLD-derivable from the set of premises. He thus opens the sub-dialogue, and there is no switch in the burden of proof.
To implement the switch at the structural level, [SR2] must be replaced by [SR2.1] and [SR2.2].

[SR2.1][Formal restriction for IAD] If X plays under formal restriction, then the sequence S is a play only if the following condition is fulfilled: if N = X − !ψ − Cj − d is a member of S, for any atomic sentence ψ, then there is a move M = Y − !ψ − Ci − d in S such that pS(M) < pS(N).

[SR2.2][Application of the formal restriction rule in IAD] The application of the formal restriction is regulated by the following conditions:
1. In the main dialogue d1, if X = P, then X plays under the formal restriction.
2. If X opens a sub-dialogue d1.i, then X plays under the formal restriction.

Intuitively, the rule [SR2.1] generalizes the formal rule to player variables and restricts the introduction of atomic formulae to sub-dialogues. That is, an atomic formula conceded by Y in di is not thereby conceded in every dj with i ≠ j. Then, the rule [SR2.2] says that the formal restriction applies to the player who opens the (sub-)dialogue, namely P, who begins the play with his initial thesis, and then the player determined by the particle rules for the failure and the indispensability operators. It is worth noting that, given the switch in the application of the formal restriction, it will generally be a good strategy to choose rank 2 for O at the beginning of the dialogue. IAD is now completed with the rule [SR4.3]:

[SR4.3][Application of negation rules] In the main dialogue d1, P attacks negations in accordance with the inconsistency-adaptive negation rule [SR4.2]. In a sub-dialogue d1.i, the player who plays under the formal restriction attacks negations in accordance with the LLD negation rule [SR4.1].

IAD is now illustrated by means of a simple example in which the F-operator is used to defend a condition.
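Dialogue 5
Dialogue | Move | O | Condition | Challenged move | Challenged move | Condition | P | Move
d1 | | | | | | ∅ | q[p ∨ q, ¬p] | 0
d1 | 1 | n := 2 | ∅ | | | ∅ | m := 2 | 2
d1 | 3.1 | p ∨ q | ∅ | 0 | | | |
d1 | 3.2 | ¬p | ∅ | 0 | | | |
d1 | 5 | p | ∅ | | 3.1 | ∅ | ?∨ | 4
d1 | | −−− | | | 3.2 | Rp | p | 6
d1 | 7 | ?Rp (p ∧ ¬p) | ∅ | 6 | | ∅ | F(p ∧ ¬p) | 8
d1.1 | 9 | p ∧ ¬p[p ∨ q, ¬p] | ∅ | 8 | | | −−− |
d1.1 | | | | | 9 | ∅ | p ∨ q | 10.1
d1.1 | | | | | 9 | ∅ | ¬p | 10.2
d1.1 | 11 | p ∧ ¬p | ∅ | | 11 | ∅ | ?∧L | 12
d1.1 | 13 | ?∨ | ∅ | 10.1 | | ∅ | q | 14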
Explanation: The dialogue runs in a standard manner until move 6. According to the LLD rules, this move is forbidden. But the IAD rules allow P to challenge a negation by a conditional move, i.e., by committing himself to defend a reliability condition, namely Rp. O introduces a (simple) Dab-formula (p ∧ ¬p) to attack the reliability of P’s move (move 7), and P defends the condition by claiming that the Dab-formula introduced by O is not an LLD-consequence of Γ (move 8). In order to attack this P-move, O must open a sub-dialogue in which he takes on the burden of proof for p ∧ ¬p[p ∨ q, ¬p] (move 9). In this sub-dialogue, O must play under formal restriction. Then, the dialogue runs in a standard manner until move 14. This is the last move: no other move is available, and it has been performed by P. P has succeeded in defending the reliability condition and wins the play. Notice that another move was possible for O in answer to P’s challenge (move 4). He could have conceded q instead of p (in fact, given that he has chosen rank 2, he can), but this would not have changed anything, since P would then have answered q in response to O’s challenge on the initial thesis (moves 3.1–3.2). In both cases, P would have won.

Although it was not its primary purpose, considering IAD as an abductive process highlights the originality and the scope of application of a dialogical understanding of abduction. From a wider perspective, adaptive rules can be seen as various devices to articulate different logics in a pluralist framework. Abduction in which the input is the application of a conditional rule, which amounts to a hypothesis regarding the rules of the dialogue, can be interpreted in accordance with the GWm.
For example, the conditional move (move 6) corresponds to the introduction of the hypothesis (step 10 of the GWm), the subsequent hypothetical sub-play corresponds to the activation of the hypothesis (step 11 of the GWm), and the challenge of the condition reflects the additional conditions (step 9 of the GWm). Of course, such hypotheses are not explanatory and cannot even be verified or falsified, since they are relative to the rules of interaction. But they serve as a basis for further moves and can be discarded if they produce abnormalities in the dialogue.
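The way the formal restriction of [SR2.1] and [SR2.2] is tied to sub-dialogues in Dialogue 5 can be sketched in Python. This is only an illustration: the `Move` record, the dictionary naming the restricted player of each (sub-)dialogue, and the reduction of a play to its assertions (questions such as ?∨ are left out) are assumptions of this sketch, not part of IAD as defined by Beirlaen and Fontaine.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Move:
    player: str    # "O" or "P"
    formula: str   # asserted formula, e.g. "p" or "p ∨ q"
    atomic: bool   # True if the formula is an atomic sentence
    dialogue: str  # "d1" for the main dialogue, "d1.1" for its first sub-dialogue


def formal_restriction_violations(play: List[Move], restricted: Dict[str, str]) -> List[int]:
    """Return the positions of atomic assertions that break the formal restriction.

    `restricted` maps each (sub-)dialogue to the player under formal restriction
    there. An atomic assertion by that player is allowed only if the other player
    asserted the same atom earlier in the same (sub-)dialogue.
    """
    violations = []
    for index, move in enumerate(play):
        if not move.atomic or restricted.get(move.dialogue) != move.player:
            continue
        conceded_before = any(
            earlier.player != move.player
            and earlier.atomic
            and earlier.formula == move.formula
            and earlier.dialogue == move.dialogue
            for earlier in play[:index]
        )
        if not conceded_before:
            violations.append(index)
    return violations


# Atomic assertions of Dialogue 5: O's p (move 5), P's p (move 6), P's q (move 14).
restricted = {"d1": "P", "d1.1": "O"}
play = [
    Move("O", "p", True, "d1"),
    Move("P", "p", True, "d1"),
    Move("P", "q", True, "d1.1"),
]
print(formal_restriction_violations(play, restricted))  # []

# Hypothetical continuation: O tries to assert p in d1.1, where P never asserted it.
print(formal_restriction_violations(play + [Move("O", "p", True, "d1.1")], restricted))  # [3]
```

The second call shows why O has no move left after move 14: defending the left conjunct of p ∧ ¬p would require an atom that P has not conceded in the sub-dialogue.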
Conclusion

In this chapter, the aim was not to provide new tools for the resolution of abductive problems, but to adopt a dialogical perspective for conceptual purposes. The scope of abduction has been extended beyond sentential hypotheses, by accounting for the use of hypotheses relative to the rules of interaction, and subsequently the related notions of logical consequence. Indeed, although the dynamics of rules occurs at the play level, it gives rise to an inferential dynamic at the strategy level. Such abductions are not explanatory in the sense in which the term is usually understood. Indeed, they do not consist in the introduction of sentential hypotheses in order to make a surprising fact derivable from the theory (as in, e.g., Aliseda (2006)). It is true that Peirce himself speaks of abduction in terms of “the process of forming an explanatory hypothesis” (CP 5.171) (Peirce, 1931–1958), and that the concept of abduction proposed in this chapter is perhaps more general. The proposal might be better understood in relation to instrumental abduction, which has been extensively discussed by Magnani (2009) and Gabbay and Woods (2005), among others. The dialogical abductions discussed in this chapter are scant resource adjustment strategies (in the sense of section “Abduction”), in which the resources are the possibilities offered by the rules of interaction. When an argumentative interaction is threatened because of the limitations of the rules, the players may agree to adopt other (hypothetical) rules which serve as a basis for further (hypothetical) moves. From a play level perspective, this allows a continuation of the interaction despite a lack of possibilities. From a strategy level perspective, this kind of abductive process can be seen as a pluralistic heuristic device to determine which logic might validate classically invalid inferences.

By studying abduction from a dialogical perspective, different ways of understanding concepts of defeasibility and non-monotonicity arise. Indeed, whereas defeasibility is in general concerned with the play level, the dynamic of adaptive dialogical logics allows defining a concept of dialogical non-monotonicity at the strategy level, as in Barés and Fontaine (2021b). Dialogical monotonicity states that if there is a P-winning strategy for ϕ[Γ], then there is a P-winning strategy for ϕ[Γ′] (where Γ ⊆ Γ′). Non-monotonic dialogues violate this principle. Although non-monotonicity is defined at the strategy level, it stems from defeasibility and the possibility to defeat conditional moves at the play level. Finally, it is worth noting that if no information external to the dialogue is introduced, then the existence of a P-winning strategy is determined from the beginning of the game. This is also the case notwithstanding the non-monotonicity of IAD.
Dialogical strategies are “locked,” to put it in Magnani’s words (2019), since optimal choices are determined by the initial thesis and the definitory rules. By contrast, “unlocked” strategies are the ground for eco-cognitive openness and creativity. This gives rise to another challenge for dialogicians, who are now encouraged to unlock the strategies. This can be done for sentential abduction by adding external information, e.g., in material dialogues. This could be extended to rules-hypotheses in adaptive dialogues, e.g., the dynamics between the LLD and ULD, the lower and upper limit dialogical logics, respectively, could be unlocked by allowing the players to perform choices that are not restricted to already defined logics. This might be the path toward dialogical openness and creativity.

Acknowledgments

The authors warmly thank the editors, Lorenzo Magnani and Atocha Aliseda, for their help and their perspicacious comments. Both authors acknowledge the financial support of the projects “Abducción y Diagnóstico Médico. Interrogación e Hipótesis en la Causalidad Científica” held by Cristina Barés (US-1381050, Proyectos de I + D + i en el marco del Programa Operativo FEDER Andalucía 2014-2020. Junta de Andalucía); and “El proceso inferencial como proceso informacional: dinámica lógica de la información y la representación del discurso y el diálogo” held by Francisco Salguero (PAIDI 2020: Proyectos I + D + i financiados por la Junta de Andalucía (Referencia P20_01140)).
References

Aliseda, A. (2006). Abductive reasoning. Logical investigations into discovery and explanation. Springer. https://doi.org/10.1007/1-4020-3907-7
Barés, C., & Fontaine, M. (2021a). Between sentential and model-based abductions: A dialogical approach. Logic Journal of the IGPL, 29(4), 425–446. https://doi.org/10.1093/jigpal/jzz033
Barés, C., & Fontaine, M. (2021b). Defeasibility and non-monotonicity in dialogues. Journal of Applied Logics – IfCoLog Journal of Logics and their Applications, 8(2), 329–353.
Barés Gómez, C., & Fontaine, M. (2017). Argumentation and abduction in dialogical logic. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 295–314). Springer. https://doi.org/10.1007/978-3-319-30526-4_14
Batens, D. (2000). A survey of inconsistency-adaptive logics. In D. Batens, C. Mortensen, G. Priest, & J.-P. Van Bendegem (Eds.), Frontiers of paraconsistent logic (pp. 49–73). KC Publications.
Batens, D. (2007). A universal logic approach to adaptive logics. Logica Universalis, 1, 221–242. https://doi.org/10.1007/s11787-006-0012-5
Beirlaen, M., & Fontaine, M. (2016). Inconsistency-adaptive dialogical logic. Logica Universalis, 10, 99–134. https://doi.org/10.1007/s11787-016-0139-y
Blackburn, P. (2001). Modal logic as dialogical logic. Synthese, 127, 57–93. https://doi.org/10.1023/A:1010358017657
Brandom, R. (1994). Making it explicit. Harvard University Press.
Clerbout, N. (2014). First-order dialogical games and tableaux. Journal of Philosophical Logic, 43, 785–801. https://doi.org/10.1007/s10992-013-9289-z
Dutilh-Novaes, C. (2015). Dialogical, multi-agent account of the normativity of logic. Dialectica, 69(4), 587–609. https://doi.org/10.1111/1746-8361.12118
Estrada-González, L. (2012). Remarks on some general features of abduction. Journal of Logic and Computation, 23(1), 181–197. https://doi.org/10.1093/logcom/exs005
Fontaine, M., & Barés Gómez, C. (2019). Conjecturing hypotheses in a dialogical logic for abduction. In D. Gabbay, L. Magnani, W. Park, & A.-V. Pietarinen (Eds.), Natural arguments. A tribute to John Woods (pp. 379–414). College Publications.
Gabbay, D., & Woods, J. (2005). The reach of abduction. Insight and trials. Elsevier.
Gigerenzer, G. (2005). I think, therefore I err. Social Research, 72(1), 195–218.
Hintikka, J. (1962). Knowledge and belief. Cornell University Press.
Keiff, L. (2004). Heuristique formelle et logiques modales non-normales. Philosophia Scientiae, 8(2), 39–57. https://doi.org/10.4000/philosophiascientiae.562
Keiff, L. (2007). Le Pluralisme dialogique – Approches dynamiques de l’argumentation formelle. PhD Thesis. Université Lille 3–Charles-de-Gaulle.
Lorenz, K. (2001). Basic objectives of dialogue logic in historical perspective. Synthese, 127, 255–263. https://doi.org/10.1023/A:1010367416884
Lorenzen, P., & Lorenz, K. (1978). Dialogische Logik. Wissenschaftliche Buchgesellschaft.
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer. https://doi.org/10.1007/978-3-642-03631-6
Magnani, L. (2017). The abductive structure of scientific creativity. Springer. https://doi.org/10.1007/978-3-319-59256-5
Magnani, L. (2019). AlphaGo, locked strategies, and eco-cognitive openness. Philosophies, 4(1), 8. https://doi.org/10.3390/philosophies401000
Nersessian, N. (1999). Model-based reasoning in conceptual change. In L. Magnani, N. Nersessian, & P. Thagard (Eds.), Model-based reasoning in scientific discovery. Springer. https://doi.org/10.1007/978-1-4615-4813-3_1
Peirce, C. S. (1931–1958). Collected papers of Charles Sanders Peirce. Harvard University Press.
Rahman, S., & Carnielli, W. (2000). The dialogical approach to paraconsistency. Synthese, 125, 201–232. https://doi.org/10.1023/A:1005294523930
Rahman, S., & Keiff, L. (2005). On how to be a dialogician. In D. Vanderveken (Ed.), Logic, thought and action (pp. 259–408). Springer. https://doi.org/10.1007/1-4020-3167-X_17
Rahman, S., & Rückert, H. (Eds.) (2001a). New perspectives in dialogical logic. Synthese 127.
Rahman, S., & Rückert, H. (2001b). Dialogical connexive logic. Synthese, 127, 105–139. https://doi.org/10.1023/A:1010351931769
Rahman, S., McConaughey, Z., Klev, A., & Clerbout, N. (2018). Immanent reasoning or equality in action – A Plaidoyer for the play level. Springer. https://doi.org/10.1007/978-3-319-91149-6
Redmond, J., & Fontaine, M. (2011). How to play dialogues. College Publications.
Woods, J. (2004). The death of argument: Fallacies in agent-based reasoning. Kluwer. https://doi.org/10.1007/978-1-4020-2712-3
Woods, J. (2013). Errors of reasoning. Naturalizing the logic of inference. College Publications.
Woods, J. (2017). Reorienting the logic of abduction. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 137–150). Springer. https://doi.org/10.1007/978-3-319-30526-4_6
16 Paraconsistency, Evidence, and Abduction
A. Rodrigues, M. E. Coniglio, H. Antunes, J. Bueno-Soler, and W. Carnielli
Contents
Introduction
On Logics of Evidence and Truth
On the Notion of Evidence
LETs as Information-Based Logics
The Logic LETF
Valuation Semantics for LETF
LETF-Tableaux
Adding Implication: The Logic LETK
First-Order LETK
A Natural Deduction System for QLETK
Paraconsistency and Abduction
Applying LETK-Tableaux to Abduction
Examples
A. Rodrigues, Department of Philosophy, Federal University of Minas Gerais, Belo Horizonte, Brazil
M. E. Coniglio, Department of Philosophy and Centre for Logic, Epistemology and the History of Science, University of Campinas, Campinas, Brazil
H. Antunes, Department of Philosophy, Federal University of Bahia, Salvador, Brazil
J. Bueno-Soler, Centre for Logic, Epistemology and the History of Science, and School of Technology, Rua Paschoal Marmo, University of Campinas, Limeira, Brazil
W. Carnielli, Centre for Logic, Epistemology and the History of Science, Rua Sérgio Buarque de Holanda, University of Campinas, Campinas, Brazil
© Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_27
Conclusion
Appendix
On Valuation Semantics
Kripke-Style Semantics for LETF and LETK
Probabilistic Semantics for LETF
Conditional Probability
References
Abstract
This chapter gathers together and sums up the recent work on technical and philosophical aspects of logics of evidence and truth (LETs) through a study of the logics LETF and LETK, including an application of the latter to the problem of abduction. LETF is a paracomplete and paraconsistent sentential logic equipped with a unary operator ◦ that divides the sentences of the language into two groups: one subjected to classical logic and the other subjected to the logic of first-degree entailment (FDE), also known as Belnap-Dunn four-valued logic. The chapter discusses the intuitive intended interpretation of LETF in terms of positive and negative evidence, and shows how LETF can be interpreted in terms of reliable and unreliable information. It then presents natural deduction systems, valuation semantics, and analytic tableau systems for both LETF and LETK, which extends LETF with a classical implication connective. Finally, it shows how LETK and its first-order extension QLETK can be applied to the problem of abduction by means of tableaux that indicate possible solutions for abductive problems. The chapter also includes an appendix that contains some general remarks on valuation semantics, Kripke-style semantics for both LETF and LETK, and a probabilistic semantics for LETF.

Keywords
Logics of evidence and truth · Paraconsistency · Paracompleteness · Evidence · Abduction
Introduction In a strict sense, paraconsistency is a property of formal systems that consists in the invalidity of the principle of explosion, also known as ex falso quodlibet, (1) A, ¬A B, for every A and B. Thus, a logic is paraconsistent if and only if explosion does not hold in it. More precisely, consider a language L equipped with a negation ¬ and a consequence relation . If explosion holds for , then is said to be an explosive consequence relation. A set Γ of sentences of L is said to be trivial if and only if for every sentence A of L , Γ A. Γ is contradictory if and only if for some sentence A
16 Paraconsistency, Evidence, and Abduction
315
of L , Γ A and Γ ¬A. If is explosive, a set Γ is contradictory if and only if it is trivial. Contradictoriness and triviality are equivalent in classical logic, as in any explosive logic. The distinctive feature of paraconsistent logics is that contradictoriness does not imply triviality. In a wide sense, paraconsistency also encompasses the study of the philosophical aspects of paraconsistent logics, and a central issue concerns the nature of the nonexplosive contradictions, a question that is closely related to the intended intuitive interpretation of paraconsistent logics. Specifically, the relevant question is: what property does the intuitive interpretation ascribe to a pair of contradictory sentences A and ¬A? Logics of evidence and truth (LETs) are paraconsistent logics conceived to formalize the deductive behavior of positive and negative evidence, which can be either conclusive or non-conclusive. The answer given by LETs to the aforementioned question is that a pair of contradictory sentences A and ¬A expresses the presence of conflicting non-conclusive evidence for the truth and the falsity of A. LETs are extensions of the logic of first-degree entailment (FDE) equipped with a unary connective ◦, and ◦A is intended to mean that there is conclusive evidence for either the truth or the falsity of A. It is assumed that conclusive evidence behaves classically, and so is subjected to classical logic, while non-conclusive evidence may be incomplete or contradictory. This chapter gathers together and sums up the recent work on technical and philosophical aspects of logics of evidence and truth (Antunes et al., 2020, 2022; Carnielli & Rodrigues, 2017; Rodrigues et al., 2020; Carnielli et al., 2020; Rodrigues and Carnielli, 2022) through a presentation and a study of the logics LET F and LET K . LET F is a sentential logic introduced in Rodrigues et al. 
(2020) that extends the logic of first-degree entailment (FDE), also known as Belnap-Dunn four-valued logic, with a classicality operator ◦ and a non-classicality operator •, dual to ◦. The intended intuitive interpretation of LET_F in terms of evidence and information is presented and discussed. The logic LET_K, which extends LET_F with a classical material implication, and its first-order version, dubbed QLET_K, are introduced. Finally, the chapter describes how LET_K and QLET_K can be applied to the problem of abduction (cf. Carnielli 2006).
On Logics of Evidence and Truth

Logics of evidence and truth (LETs) developed from the logics of formal inconsistency (LFIs), and both LETs and LFIs are part of an evolutionary line that starts in the 1960s with the seminal work of da Costa (1963) on paraconsistency. da Costa’s approach was based on two main ideas: (i) to express the metatheoretical notion of consistency at the object language level and (ii) to divide the sentences of the language into two groups, one subjected to classical logic and the other subjected to a nonexplosive logic. He introduced the hierarchy C_n of paraconsistent logics such that each system in the hierarchy is equipped with what he called a “well-behavedness” operator ◦, defined in C_1 (the first logic of the C_n hierarchy) in terms
A. Rodrigues et al.
of noncontradiction: A◦ =def ¬(A ∧ ¬A). The formal systems were conceived in such a way that although the principle of explosion does not hold for non-well-behaved sentences, it does hold for the well-behaved ones. In the current notation, with ◦A replacing A◦, these ideas may be expressed by:
(2) A, ¬A ⊬ B, for some A and B, but ◦A, A, ¬A ⊢ B, for every A and B.

As a result, some sentences (viz., the non-well-behaved ones) can be contradictory without implying triviality, but when A is well-behaved, the principle of explosion is recovered, and so A is subjected to classical logic (da Costa, 1963, pp. 2ff.). This approach to paraconsistency was extensively investigated in the 1970s and 1980s by da Costa and several of his collaborators (see, e.g., D’Ottaviano & da Costa, 1970; Loparic, 1986; Loparic & Alves, 1980; Loparic & da Costa, 1984). These ideas were taken up and further developed from the 2000s onward by Carnielli, Coniglio, and Marcos, who carried out a thorough investigation of logics of formal inconsistency (LFIs), which result from a generalization of da Costa’s proposal (see, e.g., Carnielli & Coniglio, 2016; Carnielli & Marcos, 2002; Carnielli et al., 2007; Marcos, 2005a). The notation has changed from A◦ to ◦A, ◦ has been called a consistency operator, and (2) has been dubbed the principle of gentle explosion. Carnielli & Marcos (2002) introduced the idea of a primitive consistency operator, not necessarily defined in terms of noncontradiction, which allowed distinguishing contradictoriness from inconsistency.

Analogously to LFIs, logics of formal undeterminedness (LFUs) (Marcos, 2005b; Carnielli et al., 2019) are paracomplete logics that recover the validity of the principle of excluded middle:

(3) ⊬ A ∨ ¬A, for some A, but ◦A ⊢ A ∨ ¬A, for every A.

A logic that satisfies both (2) and (3) is called a logic of formal inconsistency and undeterminedness (LFIU). (For more precise characterizations of LFIs, LFUs, and LFIUs, see Carnielli et al. (2019), definitions 4.3, 4.5, and 4.7.) Logics of evidence and truth (LETs) count as a further step in the evolutionary line leading back to da Costa’s seminal work. They are LFIUs whose operator ◦, now called a classicality operator, is primitive.
The intended intuitive interpretation of LETs is based on the notions of positive and negative evidence and on the assumption that people reason classically in the face of conclusive evidence. Positive and negative evidence are, respectively, evidence for the truth and for the falsity of a sentence A, and the presence of conclusive evidence for A is expressed by ◦A. Non-conclusive evidence requires a paracomplete and paraconsistent logic, since there may be circumstances in which there is simultaneous positive and negative evidence for a sentence A, and circumstances in which there is no evidence for A at all.
On the Notion of Evidence

As one might expect, the word “evidence” does not have a unified meaning, either in philosophy or in natural language. Nevertheless, it will be argued that the notion of evidence that underlies the intended interpretation of LETs fits a well-established use of this word, both in philosophical discussions and in the ordinary usage of language. The notion of evidence for a sentence A can be explained as “reasons for believing in or accepting A.” The idea of acceptance here is not that of rationally accepting A but of merely admitting A in a certain context of reasoning. In particular, whenever it is said that a contradiction has been accepted, this does not mean that it has been rationally accepted. This is for the following reasons: (i) the reasons for believing may be non-conclusive, partial, wrong, or even conclusive; (ii) evidence for A does not imply the truth of A; (iii) there may be simultaneous conflicting evidence for a pair of sentences A and ¬A, as well as no evidence at all for either A or ¬A; and (iv) evidence for A is objective in the sense that it is independent from the belief of an agent in A (cf. Carnielli & Rodrigues, 2017, Sect. 2).

Evidence as “reasons for believing” with the four features listed above is in line with how the notion of evidence is used in philosophy. In Kelly (2014, Sect. 1), one finds the idea that evidence is what “confers justification” for a sentence, and so “reason to believe” and “evidence” are more or less synonymous. Indeed, if evidence for A consists of reasons for believing in A, and if such reasons are conclusive, an agent would be justified in believing in A. But these reasons may be non-conclusive, and, even if they are conclusive, the mere presence of conclusive evidence is not a sufficient condition for the belief of an agent in A. The idea that evidence may be conclusive or non-conclusive fits with the distinction between potential and veridical evidence established by Achinstein (2010, pp.
4-5). Accordingly, there may be potential evidence for a sentence A even if A is false, while veridical evidence for A requires the truth of A and corresponds to a conclusive justification of A. Achinstein adds, moreover, that evidence is objective in the sense that it does not depend on the belief of an agent. (A similar distinction, between defeasible and indefeasible evidence, is also found in Kelly 2014.) The connection between evidence and justification appears in Kim (1988, p. 390), where he writes that evidence, e.g., for a sentence A, “tends to enhance the reasonableness or justification” of A. In the context of the acceptance of evidence as reasons, the connection between evidence and justification appears also in Pollock (1974, pp. 33-34), where one reads that there may be simultaneous “reasons both for believing something and for disbelieving it” and that such reasons “do not all justify what they are reasons for.” In other words, there may be simultaneous evidence for A and ¬A, and in each case such evidence, of course, cannot be taken as conclusive. Finally, it is worth mentioning how The Cambridge Dictionary of Philosophy (Audi, 1999, p. 293) explains evidence:
[i.] In philosophical discussions, a person’s evidence is generally taken to be all the information a person has, positive or negative, relevant to a proposition. . . [ii.] According to a traditional and widely held view, one has knowledge only when one has a true belief based on very strong evidence. Rational belief is belief based on adequate evidence, even if that evidence falls short of what is needed for knowledge. . . [iii.] The evidence one has for a belief may be conclusive or inconclusive. Conclusive evidence is so strong as to rule out all possibility of error.
Thus, evidence is conveyed by positive and negative information (i) which may be conclusive or non-conclusive (iii), and evidence may not imply knowledge, nor imply rational belief (ii). Also clear in the quotation above is the intimate connection between evidence and justification. A central point of our characterization of evidence is that the expressions “x is evidence for A” and “x justifies A” may be taken as synonymous in several contexts, and this can be extended to non-conclusive evidence. Now, by taking a closer look at how the word “evidence” is used in natural language, one finds that in a number of circumstances, evidence is considered to be non-conclusive, objective, and independent of belief and truth. The Concise Oxford English Dictionary (Soanes & Stevenson, 2004) describes evidence as “information indicating whether a belief or proposition is true or valid” (our emphasis). The Cambridge Dictionary Online (Evidence, 2022) explains evidence as “one or more reasons for believing that something is or is not true” (our emphasis). And more importantly, if one does a Google search for “conflicting evidence,” “lack of evidence,” “conclusive evidence,” “non-conclusive evidence,” “inconclusive evidence,” “partial evidence,” “false evidence,” or “misleading evidence,” restricted to reliable sources of English language usage like “nytimes.com,” “bbc.com,” and “.edu,” one will find hundreds, even thousands, of collocations with “evidence” in line with the notion of evidence as characterized by us.
LETs as Information-Based Logics

In a wide sense, an information-based logic is any logic suitable for processing information in the sense of taking a database as a set of premises and drawing conclusions from these premises in a sensible way. Since databases often contain contradictions, explosion cannot be valid in an information-based logic. It is well-known that FDE can be interpreted as an information-based logic (e.g., Belnap, 1977a,b; Dunn, 2018, 2019). In addition to the interpretation in terms of evidence, LETs can also be interpreted along similar lines (cf. Antunes et al., 2022). In a few words: instead of positive (negative) evidence, think of positive (negative) information, and read ◦A as saying that the information A, positive or negative, is reliable. The subsections below describe the information-based interpretation of FDE and then show how it can be extended to LET_F.
FDE or the Belnap-Dunn Logic

The logic of first-degree entailment (FDE) appears in Anderson & Belnap (1963) as the first-order system LEQ_1. The latter is the →-free fragment of the system EQ, which, in turn, is a quantified version of the system E of entailment (Anderson, 1960; Anderson & Belnap, 1962). In the 1970s, Belnap and Dunn published a series of papers with a four-valued semantics for the sentential fragment of LEQ_1, along with a corresponding interpretation in terms of information (Belnap, 1977a,b; Dunn, 1976). The resulting logic is what is usually referred to in the literature as the Belnap-Dunn logic, or simply FDE (cf. Omori & Wansing, 2017; Omori and Wansing, 2019). Belnap (1977a) proposes FDE as a logic to be used by a computer that receives information from different sources, which may be either inconsistent or incomplete. The semantic values T, F, Both, and None allow the following four scenarios to be expressed with respect to a given sentence A:

1. A holds, ¬A does not hold: v(A) = T;
2. ¬A holds, A does not hold: v(A) = F;
3. Neither A nor ¬A holds: v(A) = None;
4. Both A and ¬A hold: v(A) = Both.
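As a rough illustration of the four scenarios above, the following Python sketch reads off Belnap's value for a sentence from what a database has been "told". The function name `belnap_value` and the encoding of a database as a set of strings, with a leading `~` standing for ¬, are assumptions made for this example only; they are not part of the original presentation.

```python
def belnap_value(database, sentence):
    """Return 'T', 'F', 'Both' or 'None' for `sentence` in `database`.

    Hypothetical encoding: `database` is a set of strings, and the
    negation of a sentence s is written '~' + s.
    """
    told_true = sentence in database            # positive information A is present
    told_false = ("~" + sentence) in database   # negative information ¬A is present
    if told_true and told_false:
        return "Both"
    if told_true:
        return "T"
    if told_false:
        return "F"
    return "None"


# A small database that is inconsistent about p and silent about s.
db = {"p", "~p", "q", "~r"}
for s in ["p", "q", "r", "s"]:
    print(s, belnap_value(db, s))
```

The point of the design is that the presence of A and the presence of ¬A are tested independently, so neither rules out the other.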
The values T and F are not to be understood as the standard truth values of classical logic. They are explained by Belnap as “told true” and “told false” signs in the sense that a computer “has been told” that A is true and that it “has been told” that A is false, respectively (Belnap, 1977a, p. 38). Likewise, Both means that the computer has been told that A is both true and false, while None means that nothing about A has been told to the computer. Positive information is represented by A, whereas negative information is represented by ¬A (more about this point briefly), and the presence of negative (positive) information A does not rule out the presence of positive (negative) information A. The value Both thus represents scenarios where there is both positive and negative information with respect to A, and the value None represents scenarios where there is no information at all about A. These inconsistent and incomplete scenarios can be thought of as databases that contain, respectively, contradictory information A and ¬A, and no information about A. A notion of information that fits with the above interpretation of FDE is the so-called general definition of information (GDI) as well-formed meaningful data (Floridi, 2011, Sect. 4.3), or simply meaningful data (Fetzer, 2004). Note that according to GDI, a piece of information is not required to be true, and so the veridicality thesis, advocated by Floridi (2011, pp. 93ff.), is not taken to hold. The notion of data upon which GDI is based is explained by Floridi (2011, pp. 85-86) as difference, or lack of uniformity. Once such differences acquire meaning in some context, they become information (see also Floridi, 2019, Sect. 1.3). Thus, nonlinguistic items like a drop of blood on the floor and a scratch on the skin may qualify as meaningful data. A linguistic version of this notion of information is presented by Dunn in the following passage:
[Information is] what is left from knowledge [defined as justified true belief] when you subtract justification, truth, belief, and any other ingredients such as reliability that relate to justification. . . [Information] is something like a Fregean “thought,” i.e., the “content” of a belief that is equally shared by a doubt, a concern, a wish, etc. It might be helpful to say that it is what philosophers call a “proposition,” but that term itself would need explanation (Dunn, 2008, p. 581).
Propositions, conceived of as the kind of thing that can be either true or false, seem to fit well the concept of information as meaningful data. It should be acknowledged, however, that the concept of proposition has problems that could be just transferred to the concept of information. Nonetheless, these problems can be easily avoided by just moving from propositions to sentences, that is, instead of saying that positive (negative) information is the proposition that A (¬A), one may talk about the information conveyed by the sentence A (¬A). Thus, one may say that positive information A is conveyed by the sentence A and that negative information A is conveyed by the sentence ¬A.
The Classicality Operator: Reliable and Unreliable Information

Recall that classical negation is recovered in LETs for sentences that occur in the scope of the classicality operator ◦ (propositions (2) and (3) above). Adding the connective ◦ to FDE allows us to represent, in addition to the four scenarios expressed by FDE, two more scenarios where the information available is considered reliable. More precisely, when ◦A does not hold, and so when the information conveyed by A is not known to be reliable, then the four FDE scenarios mentioned above should be taken into consideration. But when ◦A does hold, those four scenarios are narrowed down to just two: either the information A or the information ¬A is reliable, but not both. To the extent that the connective ◦ indicates the presence of reliable information, it allows distinguishing circumstances in which only positive (negative), though non-reliable, information A is available from circumstances in which there is positive (negative) reliable information. (Note that the resulting six scenarios correspond to the six scenarios of evidence presented below: it suffices to replace conclusive and non-conclusive evidence with reliable and unreliable information, respectively.)

On the Connections Between Evidence and Information

Although the evidence- and the information-based interpretations of LETs have been presented as alternative ways of interpreting the logics of evidence and truth, they are more closely related than our foregoing discussion might have suggested. Indeed, the notion of evidence can be made more precise with the help of the definition of information as meaningful data. As mentioned above, information may appear in both linguistic and nonlinguistic forms. It may be conveyed by sentences as well as by things like blood spots, details in a photograph, fossil records, fingerprints on a gun, documents, etc.
These “pieces of information” can be considered justifications for certain sentences, and such justifications may be non-factive, i.e., fallible, partial, or wrong – or
even in some cases conclusive. Now, an appropriate definition of evidence for a sentence A would be a pair ⟨Θ, A⟩, where Θ contains pieces of information, linguistic or otherwise, which are taken as justifications for A. (The terminology “non-factive justification/evidence” was borrowed from Fitting (2016), where he proposes justification logics conceived to formalize the notion of evidence presented in Carnielli & Rodrigues 2017.)

Some additional remarks are in order here. First, an important point about the relation between Θ and A is that it is not to be thought of as a relation of logical consequence, that is, A does not follow logically from Θ through some notion of consequence. In fact, it is not clear that a general account of the relation between Θ and A could even be spelled out. It includes, besides different accounts of logical consequence, probability, generalizations (induction), analogies, protocols, abduction, causality, and more. Second, the above (rather tentative) definition of evidence not only makes clear the connections between the notions of evidence and information but also fits with both interpretations. If Θ is empty, and so nothing is being presented as a supposed justification of A, what remains is just A, which is nothing but a piece of linguistic information in the sense discussed above. Third, science denialism illustrates what is meant by conflicting evidence based on non-factive justifications. Consider, for example, the creationist claim that God created the Earth less than ten thousand years ago. The scientific consensus estimates the age of the Earth at about 4.5 billion years. On the Web one finds both claims, together with a number of justifications. The same applies to other examples of denialism: HIV is just a passenger virus and is not the cause of AIDS, vaccination is not safe, there is no anthropogenic climate change, tobacco does not cause cancer, etc.
Justifications for these sentences, as nonsensical as they might be, are out there, objectively available, sometimes without even identifying who made them available. (On scientific denialism and the (non-factive) justifications that come together with it, see, e.g., Oreskes 2019.)
The Logic LET_F

The logic LET_F (the letter F stands for FDE) is a sentential LET introduced in Rodrigues et al. (2020). Besides ◦, LET_F is also equipped with a non-classicality connective •, dual to ◦. The language L_F of LET_F is composed of denumerably many sentential letters p_1, p_2, …; the unary connectives ◦, •, and ¬; the binary connectives ∧ and ∨; and parentheses. The set of formulas of L_F, which is also denoted by L_F, is inductively defined in the usual way. Roman capitals A, B, C, … will be used as metavariables for the formulas of L_F, and Greek capitals Γ, Δ, Σ, … as metavariables for sets of formulas of L_F.

Definition 1 (A natural deduction system for LET_F). The logic LET_F is defined by the following natural deduction rules:
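$$
\frac{A \quad B}{A \land B}\,{\land}I
\qquad
\frac{A \land B}{A}\,{\land}E
\qquad
\frac{A \land B}{B}\,{\land}E
$$

$$
\frac{A}{A \lor B}\,{\lor}I
\qquad
\frac{B}{A \lor B}\,{\lor}I
\qquad
\frac{A \lor B \quad \begin{array}{c}[A]\\ \vdots\\ C\end{array} \quad \begin{array}{c}[B]\\ \vdots\\ C\end{array}}{C}\,{\lor}E
$$

$$
\frac{\neg A}{\neg(A \land B)}\,\neg{\land}I
\qquad
\frac{\neg B}{\neg(A \land B)}\,\neg{\land}I
\qquad
\frac{\neg(A \land B) \quad \begin{array}{c}[\neg A]\\ \vdots\\ C\end{array} \quad \begin{array}{c}[\neg B]\\ \vdots\\ C\end{array}}{C}\,\neg{\land}E
$$

$$
\frac{\neg A \quad \neg B}{\neg(A \lor B)}\,\neg{\lor}I
\qquad
\frac{\neg(A \lor B)}{\neg A}\,\neg{\lor}E
\qquad
\frac{\neg(A \lor B)}{\neg B}\,\neg{\lor}E
$$

$$
\frac{A}{\neg\neg A}\,DN
\qquad
\frac{\neg\neg A}{A}\,DN
$$

$$
\frac{\circ A \quad A \quad \neg A}{B}\,EXP^{\circ}
\qquad
\frac{\circ A \quad \begin{array}{c}[A]\\ \vdots\\ B\end{array} \quad \begin{array}{c}[\neg A]\\ \vdots\\ B\end{array}}{B}\,PEM^{\circ}
$$

$$
\frac{\circ A \quad \bullet A}{B}\,Cons
\qquad
\frac{\begin{array}{c}[\circ A]\\ \vdots\\ B\end{array} \quad \begin{array}{c}[\bullet A]\\ \vdots\\ B\end{array}}{B}\,Comp
$$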
Enclosing a formula A in square brackets indicates that A is a discharged hypothesis. The notion of a derivation in LET_F can be inductively defined along the lines of the definition presented in van Dalen (2008, pp. 35-36). It suffices to say here that a derivation is a tree of labeled formulas whose nodes are either a hypothesis or the conclusion of applying one of the rules above to formulas that occur previously in the tree. Given Γ ∪ {A} ⊆ L_F, the notation Γ ⊢_F A will be used to express that there is a derivation D in LET_F such that A is the last formula that occurs in D (its conclusion) and all of D’s undischarged hypotheses belong to Γ.

The rationale for the natural deduction system above is that the ◦-free rules preserve evidence, while the connective ◦ works like a context switch that divides the sentences into those that have a classical and those that have a nonclassical behavior (see Proposition 2 below). The connective • is the dual of ◦, and so • works as if it were a classical negation of ◦. The idea of rules that preserve evidence can be explained in analogy with the BHK interpretation of intuitionistic logic. The latter says that an inference rule is valid when it transforms constructive proofs of one or more premises into a constructive proof of the conclusion. Analogously, the guiding idea here is the following: supposing the availability of evidence for the premises, it may be asked whether an inference rule yields a conclusion for which evidence is also available.
The ◦-free rules of LET_F define a deductive system for FDE. Other deductive systems for FDE have already been presented in the literature (see Omori & Wansing, 2017, Section 2.2), but the natural deduction system proposed here makes the symmetry between positive and negative rules explicit: ∧I and ¬∨I are symmetrical, ∨E and ¬∧E are symmetrical, and so on. This mirrors the fact that positive and negative evidence are noncomplementary notions, but have symmetric deductive behavior. Note that there are no introduction rules for ◦ in LET_F. The idea is that the information that a formula is conclusive (or reliable) comes always from outside the formal system. In a database, for instance, such information has to be included in the database rather than inferred by an underlying logic. However, Proposition 2 below shows that classical behavior propagates over the standard sentential connectives: for example, even though {◦p, ◦q} does not entail ◦(p ∧ q), p ∧ q behaves classically in LET_F if ◦p and ◦q hold.

The following six scenarios can be expressed by LET_F.

When •A holds:

1. A holds, ¬A does not hold: only positive evidence for A;
2. A does not hold, ¬A holds: only negative evidence for A;
3. Both A and ¬A hold: conflicting evidence for A;
4. Neither A nor ¬A holds: no evidence at all for A.

When ◦A holds:

5. A holds: conclusive evidence for the truth of A;
6. ¬A holds: conclusive evidence for the falsity of A.

Proposition 1. The following inferences hold in LET_F:

1. ◦A ⊢_F A ∨ ¬A;
2. ◦A, A, ¬A ⊢_F B;
3. ⊢_F ◦A ∨ •A;
4. ◦A, •A ⊢_F B;
5. A, ¬A ⊢_F •A;
6. B ⊢_F A ∨ ¬A ∨ •A.
While ◦A entails that A behaves classically (items 1 and 2), •A follows from A’s violating some classically valid inferences (items 5 and 6). Items 3 and 4 express, respectively, that either there is or there is not conclusive evidence for A and that it cannot be that there is and there is not conclusive evidence for A. Note that 5, 6, and 4 are dual, respectively, to 1, 2, and 3 (on the concept of duality, see Carnielli et al. 2019).
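The six scenarios listed above can be summarized in a small classifier. This is only an illustrative sketch: the function name `scenario` and the use of the integers 0 and 1 for "does not hold" and "holds" are assumptions of the example, and the rejection of the case where ◦A holds together with both or neither of A and ¬A simply reflects items 1 and 2 of Proposition 1.

```python
def scenario(a, not_a, circ_a):
    """Describe the evidential status of A.

    a, not_a, circ_a are 0 or 1 and say whether A, ¬A and ◦A hold.
    When ◦A does not hold, •A holds (dual connective).
    """
    if circ_a:
        if a == not_a:
            # With ◦A, exactly one of A and ¬A must hold.
            raise ValueError("◦A is incompatible with A and ¬A both holding or both failing")
        return ("conclusive evidence for the truth of A" if a
                else "conclusive evidence for the falsity of A")
    if a and not_a:
        return "conflicting evidence for A"
    if a:
        return "only positive evidence for A"
    if not_a:
        return "only negative evidence for A"
    return "no evidence at all for A"


print(scenario(1, 1, 0))
print(scenario(0, 1, 1))
```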
Proposition 2 (Recovering classical logic in LET_F). Let n_i ≥ 0 and suppose ◦¬^{n_1}A_1, ◦¬^{n_2}A_2, …, ◦¬^{n_m}A_m hold (where ¬^{n_i}, n_i ≥ 0, represents n_i iterations of negation applied to the formula A_i). Then, for any formula B formed with A_1, A_2, …, A_m over {∧, ∨, ¬}, B behaves classically.

Proof. See Carnielli & Rodrigues (2017), Fact 31.

Proposition 3.

(1) Although in LET_F neither modus ponens nor the deduction theorem holds for an implication A →_w B defined as ¬A ∨ B, the following hold:
i. ◦A, A, A →_w B ⊢_F B;
ii. ◦A, A ⊢_F B implies ◦A ⊢_F A →_w B.

(2) ◦A ∧ A ∧ ¬A and ◦A ∧ •A define bottom particles in LET_F.

Proof. Left to the reader. That modus ponens and the deduction theorem do not hold for A →_w B can be easily proved with the help of the semantics to be presented below.

In the section on LET_K below, an extension of LET_F equipped with a classical implication → for which both modus ponens and the deduction theorem hold will be proposed.
Valuation Semantics for LET_F

This section presents a sound and complete two-valued valuation semantics for LET_F (cf. Rodrigues et al., 2020, Sections 2.1 and 3.2). (For some general remarks on valuation semantics, see Appendix.)

Definition 2 (Valuation semantics for LET_F). A valuation semantics for LET_F is a collection of LET_F valuations defined as follows. A function v : L_F → {0, 1} is a LET_F valuation if it satisfies the following clauses:

1. v(A ∧ B) = 1 iff v(A) = 1 and v(B) = 1;
2. v(A ∨ B) = 1 iff v(A) = 1 or v(B) = 1;
3. v(¬(A ∧ B)) = 1 iff v(¬A) = 1 or v(¬B) = 1;
4. v(¬(A ∨ B)) = 1 iff v(¬A) = 1 and v(¬B) = 1;
5. v(¬¬A) = 1 iff v(A) = 1;
6. If v(◦A) = 1, then v(A) = 1 if and only if v(¬A) = 0;
7. v(•A) = 1 iff v(◦A) = 0.
Definition 3. A formula A is said to be a semantical consequence of Γ, Γ ⊨_F A, if and only if for every valuation v, if v(B) = 1 for every B ∈ Γ, then v(A) = 1.
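To make Definitions 2 and 3 concrete, here is a minimal brute-force sketch of semantic consequence for LET_F. The tuple encoding of formulas and the helper names (`Var`, `Not`, `And`, `Or`, `Circ`, `Bullet`, `closure`, `respects`, `entails`) are assumptions of this example. The sketch also assumes that it is enough to enumerate 0/1 assignments over the finitely many formulas mentioned by the clauses for the premises and conclusion, since any such assignment that respects the clauses can be extended to a full valuation; it is exponential and only meant for small examples.

```python
from itertools import product

# Formulas as nested tuples (an encoding assumed for this sketch).
def Var(name): return ("var", name)
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)
def Circ(a): return ("circ", a)        # ◦A
def Bullet(a): return ("bullet", a)    # •A


def closure(f, acc):
    """Collect every formula whose value some clause of Definition 2 refers to."""
    if f in acc:
        return
    acc.add(f)
    op = f[0]
    if op in ("and", "or"):
        closure(f[1], acc); closure(f[2], acc)
    elif op == "circ":
        closure(f[1], acc); closure(Not(f[1]), acc)
    elif op == "bullet":
        closure(Circ(f[1]), acc)
    elif op == "not":
        g = f[1]
        if g[0] in ("and", "or"):
            closure(Not(g[1]), acc); closure(Not(g[2]), acc)
        elif g[0] == "not":
            closure(g[1], acc)
        # ¬p, ¬◦A and ¬•A are not constrained by any clause.


def respects(v):
    """Check clauses 1-7 of Definition 2 on the partial valuation v."""
    for f, val in v.items():
        op = f[0]
        if op == "and" and val != (v[f[1]] and v[f[2]]):
            return False
        if op == "or" and val != (v[f[1]] or v[f[2]]):
            return False
        if op == "bullet" and val != (not v[Circ(f[1])]):
            return False
        if op == "circ" and val and v[f[1]] == v[Not(f[1])]:
            return False                      # clause 6 is only a necessary condition
        if op == "not":
            g = f[1]
            if g[0] == "and" and val != (v[Not(g[1])] or v[Not(g[2])]):
                return False
            if g[0] == "or" and val != (v[Not(g[1])] and v[Not(g[2])]):
                return False
            if g[0] == "not" and val != v[g[1]]:
                return False
    return True


def entails(premises, conclusion):
    """True iff no admissible assignment verifies the premises and falsifies the conclusion."""
    forms = set()
    for f in list(premises) + [conclusion]:
        closure(f, forms)
    forms = list(forms)
    for bits in product([False, True], repeat=len(forms)):
        v = dict(zip(forms, bits))
        if respects(v) and all(v[p] for p in premises) and not v[conclusion]:
            return False
    return True


p, q = Var("p"), Var("q")
print(entails([p, Not(p)], q))                      # explosion
print(entails([Circ(p), p, Not(p)], q))             # gentle explosion
print(entails([p, Or(Not(p), q)], q))               # modus ponens for ¬A ∨ B
print(entails([Circ(p), p, Or(Not(p), q)], q))      # the same, with ◦p as a premise
```

Under these assumptions the four calls should report that explosion fails, gentle explosion holds, and modus ponens for ¬A ∨ B fails unless ◦p is added, in line with Propositions 1 and 3.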
The semantics above is nondeterministic in the sense that the semantic value of a complex formula is not always functionally determined by the values of its parts. The example below illustrates the behavior of the connectives ◦ and • in LET_F.

Example 1.

i. ◦p ⊨_F p ∨ ¬p;
ii. •p ⊭_F p ∧ ¬p;
iii. ⊨_F ◦p ∨ •p;
iv. ◦p, •p ⊨_F .

Line  Formula   v1  v2  v3  v4  v5  v6
1     p         0   0   0   1   1   1
2     ¬p        0   1   1   0   0   1
3     p ∨ ¬p    0   1   1   1   1   1
4     p ∧ ¬p    0   0   0   0   0   1
5     ◦p        0   0   1   0   1   0
6     •p        1   1   0   1   0   1
The first two lines display the possible values of p and ¬p. Line 2 bifurcates because p and ¬p are independent of each other. The connectives ◦ and • are unary, but the semantic values of ◦p and •p depend nondeterministically on the semantic values of p and ¬p. When v(p) = 1 and v(¬p) = 0, or v(p) = 0 and v(¬p) = 1, the value of ◦p and •p bifurcates into 0 and 1. This expresses the fact that ◦p is undetermined in LET_F when v(p) ≠ v(¬p). Line 5 bifurcates because there are no sufficient conditions for v(◦A) = 1. There is no valuation v such that v(◦p) = 1 and v(p ∨ ¬p) = 0, so (i) holds, and valuation v1 provides a counterexample to (ii).

Theorem 1 (Soundness and completeness). The natural deduction system for LET_F presented in Definition 1 is sound and complete with respect to its valuation semantics: Γ ⊢_F A if and only if Γ ⊨_F A.

Proof. See Rodrigues et al. (2020), Theorem 28.
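The quasi-matrix of Example 1 can be regenerated mechanically. The sketch below enumerates the values of p, ¬p and ◦p, discards the combinations ruled out by clause 6 of Definition 2, and computes the remaining lines from clauses 1, 2 and 7. The variable names are assumptions of this example, and the column order is simply the enumeration order of `itertools.product`.

```python
from itertools import product

rows = ["p", "¬p", "p ∨ ¬p", "p ∧ ¬p", "◦p", "•p"]

valuations = []
for p, not_p, circ_p in product([0, 1], repeat=3):
    if circ_p == 1 and p == not_p:
        continue  # clause 6: v(◦p) = 1 forces v(p) and v(¬p) to differ
    valuations.append((
        p,
        not_p,
        max(p, not_p),   # clause 2: disjunction
        min(p, not_p),   # clause 1: conjunction
        circ_p,
        1 - circ_p,      # clause 7: •p is the complement of ◦p
    ))

for i, name in enumerate(rows):
    print(f"{name:8}", *(v[i] for v in valuations))

# (i): no valuation has ◦p = 1 and p ∨ ¬p = 0.
print(all(v[2] == 1 for v in valuations if v[4] == 1))
# (ii): some valuation has •p = 1 and p ∧ ¬p = 0.
print(any(v[5] == 1 and v[3] == 0 for v in valuations))
```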
LET_F-Tableaux

This section presents a sound, complete, and decidable analytic tableau system for LET_F (cf. Carnielli et al., 2020). The usual notions related to tableau systems (trees, branches, nodes, etc.) are implicitly assumed here. The labels 0 and 1 refer to metamathematical markers, intuitively related to the semantic values 0 and 1.
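Definition 4 (Tableau rules for LET_F).

Rule 1. From 1(A ∧ B), add 1(A) and 1(B) to the branch.
Rule 2. From 0(A ∧ B), split the branch into 0(A) and 0(B).
Rule 3. From 1(¬(A ∧ B)), split the branch into 1(¬A) and 1(¬B).
Rule 4. From 0(¬(A ∧ B)), add 0(¬A) and 0(¬B) to the branch.
Rule 5. From 1(A ∨ B), split the branch into 1(A) and 1(B).
Rule 6. From 0(A ∨ B), add 0(A) and 0(B) to the branch.
Rule 7. From 1(¬(A ∨ B)), add 1(¬A) and 1(¬B) to the branch.
Rule 8. From 0(¬(A ∨ B)), split the branch into 0(¬A) and 0(¬B).
Rule 9. From 1(¬¬A), add 1(A) to the branch.
Rule 10. From 0(¬¬A), add 0(A) to the branch.
Rule 11. From 1(◦A), split the branch into one branch containing 1(A) and 0(¬A) and another containing 0(A) and 1(¬A).
Rule 12. From 1(•A), add 0(◦A) to the branch.
Rule 13. From 0(•A), add 1(◦A) to the branch.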
Note that there is no need for a rule for 0(◦A). Such a rule, call it R, would conclude 1(•A) from 0(◦A). Besides yielding a loop with Rule 12, it can be shown that R is not necessary at all. Suppose the application of R to 0(◦A) yielded a closed branch b such that both 1(•A) and 0(•A) occur in b. But in this case, it would be enough to apply Rule 13 to 0(•A), obtaining a branch b′ containing 1(◦A) and 0(◦A), and b′ would be a closed branch. Concerning Rule 11, recall that the symbol ◦ in LET_F expresses classicality, i.e., a formula ◦A implies that A behaves classically. The classical behavior of A is recovered by recovering classical negation for A: either A or ¬A holds, but not both. This is precisely what Rule 11 does. The semantic clause for ◦A has only a necessary condition for v(◦A) = 1, and the absence of a rule for 0(◦A) mimics this fact.

Definition 5 (LET_F tableaux).

1. A tableau for a set Δ of signed formulas is a tree whose first node contains all the signed formulas in Δ, and whose subsequent nodes are obtained by applications of the tableau rules given in Definition 4 above.
2. A tableau branch is closed if it contains a pair of signed formulas 1(A) and 0(A). If a branch is not closed, then it is open. A tableau is closed if all its branches are closed.
3. When no rule can be applied to any open branch, the tableau is terminated. Every closed tableau is also a terminated tableau.
4. A formula A has a proof from premises Γ, denoted Γ ⊢_TF A, if there exists a closed tableau for the set {1(B) : B ∈ Γ} ∪ {0(A)}. Γ can be empty, and in this case a proof of ⊢_TF A reduces to a tableau for the singleton {0(A)}.

Theorem 2. LET_F-tableaux are sound and complete with respect to the valuation semantics: Γ ⊢_TF A if and only if Γ ⊨_F A. LET_F-tableaux provide a decision procedure for LET_F.

Proof. See Carnielli et al. (2020), Theorems 11, 14, and 16.
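The tableau rules of Definition 4 and the notion of proof in Definition 5 lend themselves to a short recursive implementation. The following is a minimal sketch, not the procedure of Carnielli et al. (2020): the tuple encoding of formulas, the representation of a signed formula as a pair `(sign, formula)`, and the function names `expand`, `closes` and `proves` are all assumptions of this example.

```python
# Formulas as nested tuples (an encoding assumed for this sketch).
def Var(name): return ("var", name)
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)
def Circ(a): return ("circ", a)        # ◦A
def Bullet(a): return ("bullet", a)    # •A


def expand(sign, f):
    """Return the branches (lists of signed formulas) produced by the rule
    for the signed formula sign(f), or None when no rule applies."""
    op = f[0]
    if op == "and":
        a, b = f[1], f[2]
        return [[(1, a), (1, b)]] if sign else [[(0, a)], [(0, b)]]      # Rules 1, 2
    if op == "or":
        a, b = f[1], f[2]
        return [[(1, a)], [(1, b)]] if sign else [[(0, a), (0, b)]]      # Rules 5, 6
    if op == "bullet":
        return [[(1 - sign, Circ(f[1]))]]                                # Rules 12, 13
    if op == "circ" and sign:
        a = f[1]
        return [[(1, a), (0, Not(a))], [(0, a), (1, Not(a))]]            # Rule 11
    if op == "not":
        g = f[1]
        if g[0] == "and":
            na, nb = Not(g[1]), Not(g[2])
            return [[(1, na)], [(1, nb)]] if sign else [[(0, na), (0, nb)]]  # Rules 3, 4
        if g[0] == "or":
            na, nb = Not(g[1]), Not(g[2])
            return [[(1, na), (1, nb)]] if sign else [[(0, na)], [(0, nb)]]  # Rules 7, 8
        if g[0] == "not":
            return [[(sign, g[1])]]                                      # Rules 9, 10
    return None   # atoms, negated atoms, 0(◦A), ¬◦A, ¬•A: nothing to do


def closes(branch, todo):
    """True iff every extension of `branch` by the rules becomes closed."""
    if any((1 - s, f) in branch for s, f in branch):
        return True                       # the branch contains 1(A) and 0(A)
    if not todo:
        return False                      # terminated and still open
    (s, f), rest = todo[0], todo[1:]
    alternatives = expand(s, f)
    if alternatives is None:
        return closes(branch, rest)
    return all(closes(branch | frozenset(alt), rest + alt) for alt in alternatives)


def proves(premises, conclusion):
    """Tableau proof of `conclusion` from `premises` in the sense of Definition 5."""
    start = [(1, b) for b in premises] + [(0, conclusion)]
    return closes(frozenset(start), start)


p, q = Var("p"), Var("q")
print(proves([Circ(p)], Or(p, Not(p))))
print(proves([p, Not(p)], q))
```

Each rule application replaces a signed formula by strictly simpler ones, which is why the recursion terminates; note that, as in the text, there is no case for 0(◦A).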
Adding Implication: The Logic LET_K

It is well-known that FDE lacks an implication that validates modus ponens. Different implications have already been added to FDE (see Omori & Wansing, 2017, Sect. 5.1). One of them is the implication →_e of Omori & Wansing (2017), which is also the “classical material implication” adopted by Hazen & Pelletier (2019). When added to FDE, →_e yields the logic FDE→ (Hazen & Pelletier, 2019), for which an adequate valuation semantics and a decidable analytic tableaux system can be straightforwardly obtained. The logic LET_K extends FDE→ with the connective ◦, together with the corresponding rules (or can be alternatively defined by adding →_e to LET_F while dropping •). From now on, we write → in place of →_e.
Definition 6 (A deductive system for LET_K). The language L_K of LET_K is obtained by adding the implication connective → to the •-free fragment of the language L_F of LET_F. A deductive system for LET_K is obtained from the deductive system for LET_F presented in Definition 1 by replacing Cons and Comp with the following rules:
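$$
\frac{\begin{array}{c}[A]\\ \vdots\\ B\end{array}}{A \to B}\,{\to}I
\qquad
\frac{A \to B \quad A}{B}\,{\to}E
\qquad
\frac{\ }{A \lor (A \to B)}\,{\to}CL
$$

$$
\frac{A \quad \neg B}{\neg(A \to B)}\,\neg{\to}I
\qquad
\frac{\neg(A \to B)}{A}\,\neg{\to}E
\qquad
\frac{\neg(A \to B)}{\neg B}\,\neg{\to}E
$$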
The notion of a derivation in LET_K is defined as usual. The notation Γ ⊢_K A will be used to express that there is a derivation of A from Γ in LET_K.
Proposition 4. A classical negation can be defined in LET_K. Define ⊥ as ◦A ∧ A ∧ ¬A. The negation ∼, defined by ∼A := A → ⊥, satisfies the following properties, and thus is a classical negation in LET_K:

1. A, ∼A ⊢_K B;
2. ⊢_K A ∨ ∼A.

Proof. Left to the reader.

An alternative but equivalent presentation of LET_K, where •A can be defined as ∼◦A, is found in Carnielli & Rodrigues (2015). As expected, LET_K admits a two-valued valuation semantics.

Definition 7 (Valuation semantics for LET_K). A valuation semantics for LET_K is obtained by adding the following clauses to Definition 2:

8. v(A → B) = 1 iff v(A) = 0 or v(B) = 1;
9. v(¬(A → B)) = 1 iff v(A) = 1 and v(¬B) = 1.

A formula A is a semantical consequence of Γ in LET_K, Γ ⊨_K A, if and only if for every valuation v, if v(B) = 1 for every B ∈ Γ, then v(A) = 1.
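Since the proof of Proposition 4 is left to the reader, the following short semantic check may help. It is a sketch that uses only clauses 1, 2, 6, and 8 of the valuation semantics, and it relies on the adequacy of that semantics for LET_K, which the text asserts without spelling out the proof.

```latex
\begin{aligned}
v(\bot) = 1 \;&\Rightarrow\; v(\circ A) = v(A) = v(\neg A) = 1
  && \text{(clause 1, with } \bot := \circ A \land A \land \neg A\text{)}\\
v(\circ A) = 1 \text{ and } v(A) = 1 \;&\Rightarrow\; v(\neg A) = 0
  && \text{(clause 6)}\\
\text{hence}\quad v(\bot) \;&=\; 0 \quad \text{for every valuation } v\\
v({\sim}A) = 1 \;&\iff\; v(A) = 0 \text{ or } v(\bot) = 1 \;\iff\; v(A) = 0
  && \text{(clause 8, with } {\sim}A := A \to \bot\text{)}
\end{aligned}
```

So no valuation assigns 1 to both A and ∼A, which gives property 1 vacuously, and every valuation assigns 1 to A or to ∼A, which gives property 2 by clause 2.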
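Definition 8. An analytic tableaux system for LET_K is obtained by replacing Rules 12 and 13 in Definition 4 with the following rules:

Rule 14. From 1(A → B), split the branch into 0(A) and 1(B).
Rule 15. From 0(A → B), add 1(A) and 0(B) to the branch.
Rule 16. From 1(¬(A → B)), add 1(A) and 1(¬B) to the branch.
Rule 17. From 0(¬(A → B)), split the branch into 0(A) and 0(¬B).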
A formula A has a tableau proof from premises Γ in LET K , denoted by Γ TK A, if there exists a closed tableau for the set {1(B) : B ∈ Γ } ∪ {0(A)}. By adapting the completeness and soundness proofs of LET F , it can be proved that the relations K , K , and TK are all equivalent, that is, Γ K A if and only if Γ K A if and only if Γ TK A.
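Rules 14–17 can be sketched in a few lines of Python. This is a minimal sketch restricted to the implication rules and to closure by a pair 1(X), 0(X); the remaining rules and closure conditions of Definition 4 are deliberately omitted, and the encoding of formulas as nested tuples, as well as all function names, are assumptions made for illustration.

```python
def is_imp(f):
    return isinstance(f, tuple) and f[0] == "imp"


def is_neg_imp(f):
    return isinstance(f, tuple) and f[0] == "not" and is_imp(f[1])


def alternatives(signed):
    """Extensions prescribed by Rules 14-17, or None if no rule applies."""
    sign, f = signed
    if is_imp(f):
        _, a, b = f
        if sign == 1:
            return [{(0, a)}, {(1, b)}]            # Rule 14 (branching)
        return [{(1, a), (0, b)}]                  # Rule 15
    if is_neg_imp(f):
        _, a, b = f[1]
        if sign == 1:
            return [{(1, a), (1, ("not", b))}]     # Rule 16
        return [{(0, a)}, {(0, ("not", b))}]       # Rule 17 (branching)
    return None


def is_closed(branch):
    """A branch closes when it contains both 1(X) and 0(X)."""
    return any(sign == 1 and (0, f) in branch for sign, f in branch)


def expand(branch):
    """Return the list of fully expanded (or closed) branches."""
    branch = frozenset(branch)
    if is_closed(branch):
        return [branch]
    for signed in branch:
        alts = alternatives(signed)
        # Skip formulas with no rule, or whose rule is already satisfied.
        if alts is None or any(alt <= branch for alt in alts):
            continue
        result = []
        for alt in alts:
            result.extend(expand(branch | alt))
        return result
    return [branch]


def has_tableau_proof(premises, conclusion):
    start = {(1, p) for p in premises} | {(0, conclusion)}
    return all(is_closed(b) for b in expand(start))


if __name__ == "__main__":
    print(has_tableau_proof([("imp", "A", "B"), "A"], "B"))   # modus ponens
    print(has_tableau_proof([("imp", "A", "B")], "A"))        # not provable
```

In this sketch, a call such as has_tableau_proof([("imp", "A", "B"), "A"], "B") starts from {1(A → B), 1(A), 0(B)}; Rule 14 splits the branch into one containing 0(A) and one containing 1(B), and both close.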
First-Order LET K

Let us now turn to the first-order version, QLET K, of LET K, which is an extension of the •-free fragment of the first-order extension QLET F of LET F, introduced and investigated in Antunes et al. (2022). A quantified version of LET K is required because any sensible application of a logic will need quantification, including the application of LET K to the problem of abduction discussed below.

The logical vocabulary of QLET K results from supplementing the vocabulary of LET K with the quantifiers ∀ and ∃, and replacing the sentential letters p1, p2, . . . with the individual variables from V = {v_i : i ∈ N}. The nonlogical vocabulary of a first-order language is specified by its first-order signature, which is a pair S = ⟨C, P⟩ such that C is an infinite set of individual constants and P is a non-empty set of predicate letters. Each predicate letter in P is assumed to have a corresponding finite arity. Henceforth, the usual definitions of such syntactic notions as term, formula, bound/free occurrence of a variable, sentence, etc. are implicitly assumed – but with the proviso that formulas with void quantifiers (i.e., formulas containing subformulas of the form ∀xA or ∃xA such that x does not occur free in A) are not allowed.

Given a first-order signature S, Term(S) denotes the set of terms generated by S. Likewise, the set of formulas and the set of sentences generated by S will be denoted by Form(S) and Sent(S). Hereafter x, x1, x2, . . . will be used as metavariables ranging over V; c, c1, c2, . . . as metavariables ranging over C; and t, t1, t2, . . . as metavariables ranging over Term(S). Given t, t1, t2 ∈ Term(S), the notation t(t2/t1) denotes the result of replacing every occurrence of t1 in t (if any) by t2. Similarly, A(t/x) denotes the formula that results from replacing every free occurrence of x in A by t.
A Natural Deduction System for QLET K

For the sake of simplicity, the deductive systems and the formal semantics of the logics discussed below will be formulated exclusively in terms of sentences. This is the reason why it has been assumed right from the outset that languages must always have an infinite stock of individual constants – for otherwise one could be prevented from applying some quantifier rules due to the lack of enough constants.

Definition 9 (A natural deduction system for QLET K). Let S be a first-order signature, c ∈ C, and A, B, C ∈ Sent(S). The logic QLET K is defined over S by adding the following rules to LET K (see Definition 6):

\[
\dfrac{B \lor A(c/x)}{B \lor \forall xA}\;\forall I
\qquad
\dfrac{\forall xA}{A(c/x)}\;\forall E
\qquad
\dfrac{\neg A(c/x)}{\neg\forall xA}\;\neg\forall I
\qquad
\dfrac{\neg\forall xA \quad \begin{matrix}[\neg A(c/x)]\\ \vdots\\ C\end{matrix}}{C}\;\neg\forall E
\]
\[
\dfrac{A(c/x)}{\exists xA}\;\exists I
\qquad
\dfrac{\exists xA \quad \begin{matrix}[A(c/x)]\\ \vdots\\ C\end{matrix}}{C}\;\exists E
\qquad
\dfrac{\neg A(c/x)}{\neg\exists xA}\;\neg\exists I
\qquad
\dfrac{\neg\exists xA}{\neg A(c/x)}\;\neg\exists E
\]
\[
\dfrac{A}{A'}\;AV
\]
In ∀I, c must not occur in A or B, nor in any hypothesis on which B ∨ A(c/x) depends; and in ¬∃I, c must not occur in A nor in any hypothesis on which ¬A(c/x) depends. In ∃E and ¬∀E, c must occur neither in A nor in C, nor in any hypothesis on which C depends, except A(c/x) (respectively, ¬A(c/x)). Finally, in AV, A′ denotes any alphabetic variant of A, that is, any sentence that differs from A only in some of its bound variables.

Proposition 5. The usual universal generalization rule

\[
\dfrac{A(c/x)}{\forall xA}
\]

(where c occurs neither in A nor in any hypothesis on which A(c/x) depends) can be derived in QLET K.
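One way to see this is sketched below, under the assumption that the disjunction rules ∨I and ∨E inherited from Definition 1 are the standard ones. The idea is to instantiate the side formula B of ∀I with ∀xA itself; since c does not occur in A, it does not occur in ∀xA either, so the proviso of ∀I is met.

```latex
\[
\begin{array}{lll}
1. & A(c/x) & \text{premise} \\
2. & \forall xA \lor A(c/x) & \lor I,\ 1 \\
3. & \forall xA \lor \forall xA & \forall I,\ 2 \quad (B := \forall xA) \\
4. & \forall xA & \lor E,\ 3
\end{array}
\]
```

Line 2 depends on exactly the hypotheses on which A(c/x) depends, so the requirement that c occur in none of them carries over unchanged from the statement of the proposition.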
Valuation Semantics for QLET K

Definition 10 (First-order structures). Let S be a first-order signature. An S-structure 𝔄 is a pair ⟨D, I⟩ such that D is a non-empty set (the domain of 𝔄) and I is an interpretation function such that:
1. For every constant c ∈ C, I(c) ∈ D;
2. For every n-ary predicate P ∈ P, I(P) is a pair ⟨P_+^𝔄, P_-^𝔄⟩ such that P_+^𝔄 ∪ P_-^𝔄 ⊆ D^n.
Given an S-structure 𝔄 = ⟨D, I⟩, we shall write c^𝔄 and P^𝔄 instead of I(c) and I(P), respectively.

According to the definition above, individual constants are interpreted as elements of the domain D of 𝔄, while predicate letters are interpreted as pairs of relations over D: each predicate letter P is assigned both an extension, P_+^𝔄, and an anti-extension, P_-^𝔄. Notice that, given an n-ary predicate letter P, although P_+^𝔄 ∪ P_-^𝔄 must be a subset of D^n, there are no constraints to the effect that P_+^𝔄 ∪ P_-^𝔄 = D^n, nor to the effect that P_+^𝔄 ∩ P_-^𝔄 = ∅. This means that it is not required that exactly one of P(c1, . . . , cn) and ¬P(c1, . . . , cn) receive a designated value, for all constants c1, . . . , cn: it may be that neither P(c1, . . . , cn) nor ¬P(c1, . . . , cn) holds in 𝔄, or that both P(c1, . . . , cn) and ¬P(c1, . . . , cn) hold.

Definition 11 (Diagram signatures and diagram languages). Let S = ⟨C, P⟩ be a first-order signature and let 𝔄 be an S-structure. The diagram signature S_𝔄 of 𝔄 is the pair ⟨C_𝔄, P⟩ such that C_𝔄 = C ∪ {ā : a ∈ D}; that is, S_𝔄 is the first-order signature that results from S by introducing a new individual constant ā for each element a of the domain. The language generated by S_𝔄 will be called the diagram language of 𝔄, and the notation 𝔄̂ will be used to denote the S_𝔄-structure that is just like 𝔄 except that it interprets each new constant ā as a, for every a ∈ D.

The definition of a QLET K structure closely resembles the corresponding definition in classical first-order logic, except for the interpretation given to the predicate letters in terms of extensions and anti-extensions.
Unlike classical logic, however, specifying a QLET K structure is not sufficient to determine the semantic values of every sentence of the relevant language due to the nondeterministic behavior of ¬, ◦, and •. In order to comply with this fact, it is thus necessary to supplement a structure with a valuation function.
Definition 12 (QLET K Interpretations). Let S = ⟨C, P⟩ be a first-order signature and 𝔄 be an S-structure. A mapping v : Sent(S_𝔄) −→ {0, 1} is an 𝔄-valuation if it satisfies clauses (1)–(9) of Definitions 2 and 7, along with the following additional clauses, in which c^𝔄̂ denotes the interpretation of the constant c in the diagram structure 𝔄̂:
10. v(P(c1, . . . , cn)) = 1 iff ⟨c1^𝔄̂, . . . , cn^𝔄̂⟩ ∈ P_+^𝔄, for every c1, . . . , cn ∈ C_𝔄;
11. v(¬P(c1, . . . , cn)) = 1 iff ⟨c1^𝔄̂, . . . , cn^𝔄̂⟩ ∈ P_-^𝔄, for every c1, . . . , cn ∈ C_𝔄;
12. v(∀xA) = 1 iff v(A(ā/x)) = 1, for every a ∈ D;
13. v(∃xA) = 1 iff v(A(ā/x)) = 1, for some a ∈ D;
14. v(¬∀xA) = 1 iff v(¬A(ā/x)) = 1, for some a ∈ D;
15. v(¬∃xA) = 1 iff v(¬A(ā/x)) = 1, for every a ∈ D;
16. If A′ is an alphabetic variant of A, then v(A′) = v(A);
17. Let A ∈ Form(S_𝔄) be such that no variables other than x are free in A, and let c1, c2 ∈ C_𝔄. If c1^𝔄̂ = c2^𝔄̂ and v(A(c1/x)) = v(A(c2/x)), then v(#A(c1/x)) = v(#A(c2/x)) (where # ∈ {¬, ◦, •}).
Let S be a first-order signature. An S-interpretation is a pair ⟨𝔄, v⟩ such that 𝔄 is an S-structure and v is an 𝔄-valuation. A sentence A is said to hold in the interpretation ⟨𝔄, v⟩ (in symbols, ⟨𝔄, v⟩ ⊨ A) if and only if v(A) = 1; and a set of sentences Γ is said to hold in ⟨𝔄, v⟩ (in symbols, ⟨𝔄, v⟩ ⊨ Γ) if and only if every element of Γ holds in ⟨𝔄, v⟩. Γ is said to have a model if it holds in some interpretation. Finally, A is a semantic consequence of Γ (Γ ⊨ A) if and only if ⟨𝔄, v⟩ ⊨ A whenever ⟨𝔄, v⟩ ⊨ Γ, for every interpretation ⟨𝔄, v⟩.

Notice that a valuation assigns either 1 or 0 to each sentence of the diagram language of 𝔄, which, of course, includes all sentences in Sent(S). Resorting to diagram languages is required in order to make sure that the quantifiers range over all the objects of the domain of 𝔄. Since ∀ and ∃ are given a substitutional interpretation – i.e., the semantic value of a formula ∀xA depends on the semantic values of all the substitution instances of A – it is necessary to extend the original language with a new individual constant for each element of the domain D, and to extend the interpretation function of the original structure accordingly.

Notice further that clause (16) explicitly requires that any two formulas that differ only in some of their bound variables must be assigned the same value by a valuation. This clause is the counterpart of rule AV, without which formulas such as ◦∀xA and ◦∀yA(y/x) (where y does not occur in A) cannot be proven to be equivalent. (If void quantifiers were allowed, there would be sentences that are intuitively equivalent but which could receive different semantic values even in the presence of clause (16) – e.g., ◦∀xPc and ◦Pc. That is the reason why “formulas” in which void quantifiers occur have been excluded from the set of formulas. An alternative approach would be to allow for void quantifiers, but extend the definition of alphabetic variants in such a way that formulas that differ by the presence of one or more void quantifiers would also count as alphabetic variants of one another.) Finally, clause (17) is required for similar reasons: had it been missing, nothing would prevent ◦A(c1/x) and ◦A(c2/x) from being assigned different values by a valuation, even though A(c1/x) and A(c2/x) had the same value and c1 and c2 were interpreted as the same individual of the domain.

Theorem 3 (Soundness and Completeness). Given a first-order signature S, the natural deduction system for QLET K is sound and complete with respect to the class of all S-interpretations: Γ ⊢ A if and only if Γ ⊨ A.

Proof. This result can be easily proven by adapting the proofs of Theorems 12 and 19 of Antunes et al. (2022).

Corollary 1 (Compactness theorem). Γ ⊨ A if and only if there is a finite subset Γ0 of Γ such that Γ0 ⊨ A.

Proof. See Antunes et al. (2022), Corollary 20.
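The effect of interpreting predicates by an extension and an anti-extension, together with the quantifier clauses (10)–(15) of Definition 12, can be illustrated for a single unary predicate P over a finite domain. The following Python sketch is only an illustration: the class and function names are hypothetical, and it covers only the sentences ∀xPx, ∃xPx, ¬∀xPx, and ¬∃xPx, whose values are fully determined by the structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnaryStructure:
    domain: frozenset
    ext: frozenset        # P_+ : elements of which P holds
    anti_ext: frozenset   # P_- : elements of which not-P holds

    def __post_init__(self):
        if not self.domain:
            raise ValueError("the domain must be non-empty")
        if not (self.ext | self.anti_ext) <= self.domain:
            raise ValueError("P_+ and P_- must be subsets of the domain")


def forall_P(s):       # clause 12, with clause 10 for the instances
    return all(a in s.ext for a in s.domain)


def exists_P(s):       # clause 13
    return any(a in s.ext for a in s.domain)


def not_forall_P(s):   # clause 14, with clause 11 for the instances
    return any(a in s.anti_ext for a in s.domain)


def not_exists_P(s):   # clause 15
    return all(a in s.anti_ext for a in s.domain)


# P holds of everything, but not-P also holds of 2 (contradictory information).
glut = UnaryStructure(frozenset({1, 2}), frozenset({1, 2}), frozenset({2}))
# Nothing is known about 2 (incomplete information).
gap = UnaryStructure(frozenset({1, 2}), frozenset({1}), frozenset())

if __name__ == "__main__":
    print(forall_P(glut), not_forall_P(glut))
    print(forall_P(gap), not_forall_P(gap))
```

In the first structure both ∀xPx and ¬∀xPx hold, while in the second neither does, which mirrors at the quantified level the absence of the constraints P_+ ∩ P_- = ∅ and P_+ ∪ P_- = D^n.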
Paraconsistency and Abduction

The formulation of the problem of abduction is due to Peirce, who defined it as a process of forming hypotheses for explanatory purposes (Peirce, 1931, CP 5.189, 7.202). It is thus a kind of reversed explanation, whose basic idea may be expressed as follows: when some fact is discovered that is not explained by the available theory (i.e., is not a consequence of the available theory), a set of new premises is added as a hypothetical solution to the problem. In other words, given the observation of an unexpected fact F, an explanation E for the fact F, such that the truth of E implies F, is then pursued. The act of adding something before using it as an explanation poses, however, a second problem: how is it possible to create an abductive explanans?

First, it has to be acknowledged that characterizing the concept of explanation is one of the greatest challenges in the philosophy of science. This problem is even harder in logic and mathematics, where explanations are sometimes confused with proofs (see, e.g., Mancosu, 2001). Although it is not being suggested here that “explaining” can be reduced to “deducing,” it is certainly acceptable that the idea of explanation in deductive sciences includes the search for missing hypotheses; it is in this context that the general abductive process can be formulated as the process of (i) generating new hypotheses within arbitrary deductive systems and afterward (ii) using them in deductive terms. The task (i) is referred to here as creative abduction, while the task (ii) – using such new hypotheses – as explicative abduction. The term “explicative” is here understood under the following proviso: a missing link in a deduction certainly does not exhaust the need for an explanation, but does constitute the first necessary step toward explaining an unexpected (i.e., not yet deducible) fact.
Two natural assumptions that can be raised with regard to explanations are the following: first, there can be various explanations for the same surprising fact, and second, there can be explanations of various degrees for the same surprising fact.
For example, searching for the ultimate scientific explanation as to why the grass in your garden is wet in the morning and discovering that the sprinkler was left on all night may be two different things. Both explain the fact, but respond to different needs. The question is analogous to one in automatic theorem proving: finding any proof is one thing, while finding a “philosophically interesting” proof is another. Just as automatic theorem proving is satisfied with first-level proofs, automatic abduction will be satisfied with a first-level explanation. What concerns us here are abductions with no “obvious” explanations, particularly those in which contradictions may be involved.

Let ⊢ be a deductive relation; if Γ ⊬ A, the creative abductive step consists in finding an appropriate Δ such that Γ ∪ Δ ⊢ A. In this case, the discovered Δ performs the explicative abductive step. Obviously, there must be some constraints; otherwise, Δ = {⊥} would be a trivial solution for the abductive problem in most deduction relations. Usually, if the underlying logic is monotonic and explosive, another constraint is that Γ ⊬ ¬A, for otherwise any explicative Δ would be a trivial explanation. This restriction, however, will not be necessary in our case, as the following discussions and examples will make clear.

From the point of view of general argumentation (and not only deduction), abduction concerns the search for hypotheses or the search for explanatory instances that support reasoning. In this sense, it can be seen as a complement to argumentation, in the same manner that, in the philosophy of science, the context of discovery is a complement to the context of justification. Moreover, pursuing the analogy further, the question of the logical possibility of creative abduction lies on the same side as the famous question of the logical possibility of scientific discovery.
A renewed interest in abduction gained impetus in the information age due to the factual treatment of data and the question of virtual causality. The enormous amount of data stored on the World Wide Web and in complex systems, as well as the virtual relationships among such data, continuously demands new tools for automatic reasoning. These tools should incorporate general logical methods that are at the same time machine understandable and sufficiently close to human semantics to perform sensible automated reasoning.

An example in which abductive inference is highly relevant is model-based diagnosis in engineering and AI. Suppose that a complex system, such as an aircraft, is being tested before a transatlantic flight. The electronic circuitry permits the testers to predict certain outputs based on specific input tests. If the instruments show something different from what is expected, it is a task of model-based diagnosis to discover an explanation for the anomaly and use it to isolate the components responsible for the problem, instead of disassembling the whole aircraft.

Another example occurs in the process of updating so-called Datalog databases. Consider a logic program composed of the following clauses, where desc(x, y) means “x is a descendant of y,” parent(y, x) means “y is a parent of x,” and β ← α means that the database contents plus α produce (or answer) β:

    desc(x, y) ← parent(y, x)
    desc(x, y) ← parent(z, x), desc(z, y).
There is a subtle difference between inserting information in the database in an explicit versus an implicit manner: information of the form “y is a parent of x” is a basic fact and can be inserted explicitly, while information of the form “x is a descendant of y” is either factual knowledge or a consequence of the machine's reasoning (as simple as it may be). If one wishes to insert a piece of implicit information, it is necessary to modify the set of facts stored in the database in such a way that this information can be deduced: this is an example of creative abduction and of explicative abduction at the same time. For instance, if desc(Zeus, Uranus) and parent(Uranus, Cronus) are stored in a database, then there are two different ways of implicitly inserting desc(Aphrodite, Uranus): either insert parent(Zeus, Aphrodite) or alternatively insert desc(Aphrodite, Cronus). These two alternative additions are examples of abductive explanations for desc(Aphrodite, Uranus). In fact, logic programming uses this abductive mechanism for answering queries of the form “Is the fact desc(Aphrodite, Uranus) compatible with the program clauses and data?” or “Is there any x such that desc(x, Uranus)?” The whole procedure is creative to the extent that it can be automated. It is therefore evident that a useful abductive mechanism for databases should be based on first-order logic, and not merely on sentential logic.

Abductive approaches are also used to integrate different ontologies and database schemes, or to integrate distinct data sources under the same ontology; see, for example, Arieli et al. (2004), where an abduction-based application for database integration is developed.
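The implicit-insertion problem just described can be sketched as a brute-force search: compute the consequences of the stored facts under the two desc clauses, then look for single ground facts whose addition makes the goal derivable. The following Python sketch assumes facts are encoded as triples such as ("parent", y, x) and ("desc", x, y); the function names and the restriction to single-fact candidates over a fixed list of constants are assumptions made for illustration, not a description of how any Datalog engine works.

```python
from itertools import permutations


def closure(facts):
    """Least fixpoint of the two desc clauses over a set of ground facts."""
    facts = set(facts)
    while True:
        derived = set()
        for pred, z, x in facts:
            if pred != "parent":
                continue
            derived.add(("desc", x, z))           # desc(x, y) <- parent(y, x)
            for pred2, z2, y in facts:
                if pred2 == "desc" and z2 == z:
                    derived.add(("desc", x, y))   # desc(x, y) <- parent(z, x), desc(z, y)
        if derived <= facts:
            return facts
        facts |= derived


def single_fact_explanations(facts, goal, constants):
    """Single ground facts whose insertion makes `goal` derivable."""
    base = closure(facts)
    if goal in base:
        return []                                 # nothing to explain
    found = []
    for pred in ("parent", "desc"):
        for a, b in permutations(constants, 2):
            candidate = (pred, a, b)
            if candidate == goal or candidate in base:
                continue
            if goal in closure(set(facts) | {candidate}):
                found.append(candidate)
    return found


STORED = {("desc", "Zeus", "Uranus"), ("parent", "Uranus", "Cronus")}
GOAL = ("desc", "Aphrodite", "Uranus")
CONSTANTS = ["Aphrodite", "Cronus", "Uranus", "Zeus"]

if __name__ == "__main__":
    for fact in single_fact_explanations(STORED, GOAL, CONSTANTS):
        print(fact)
```

Under the two clauses exactly as written, the sketch returns parent(Zeus, Aphrodite) together with parent(Uranus, Aphrodite) and parent(Cronus, Aphrodite). Obtaining the goal from desc(Aphrodite, Cronus) would additionally require a clause that chains desc with parent in the other direction, for instance desc(x, y) ← desc(x, z), parent(y, z); such a clause is an assumption and is not part of the program displayed above.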
Suppose that, while a query is being processed by a user, another data source has inserted desc(Uranus, Aphrodite) in our database, plus a constraint of the form: “For no x and y can desc(x, y) and desc(y, x) be simultaneously maintained in the database.” If parent(Zeus, Aphrodite) had been inserted by one data source, the insertion of desc(Uranus, Aphrodite) by the second data source would cause a collapse in view of the constraint. What can be done? The option of deleting all data does not seem reasonable, and the option of having all queries answered positively (since a database established on classical logic grounds would deduce anything from a contradiction) is of course intolerable. Thus, a logic suitable for dealing with this problem would have to be a paraconsistent first-order logic. We argue here that simple yet powerful techniques for automatic abduction can be implemented by means of tableaux for the logic LET K, which can be straightforwardly extended to QLET K.
Applying LET K-Tableaux to Abduction

The problem of abduction involves two independent but complementary problems: (i) finding a method to automatically perform abduction (and, if possible, to automatically generate abductive data) and (ii) doing this within a reasoning environment capable of providing sensible outputs even in the presence of contradictions. If the underlying logic is explosive, a single inconsistency undermines the whole process. However, very simple logical models can be designed for dealing with abduction by defining them in terms of refutation procedures based on LETs. Indeed, abduction does not go in the forward direction of deduction. It is not difficult to accept, either, that abduction cannot coincide with any backward form of classical deduction, but it does not follow that another form of backward deduction would not work. Let us take LET K as an example: it does not prove anything that classical logic would not prove; it tolerates contradictions, but, nonetheless, it can encode the whole of classical reasoning. Backward proof procedures based on LET K constitute a suitable approach to abduction, and it is our intention to show how this approach can be programmed and treated in a natural way, starting from a very simple formalism.

In a situation where there are serious theories competing around a contradiction, there is not much point in rejecting one of them just to save the principle of explosion. It seems beyond question that it is more convenient to tame the logic than to sacrifice a precious (and possibly correct) theory. This is not only the case for scientific theories. A single digit in a database can of course be too valuable to be just thrown away, and it is already widely recognized that no automated reasoning is possible without means of controlling logical explosion. What is not yet clear is whether the act of guessing involved in the discovery context of abduction, and furthermore under such conditions, can be the subject of logic. In many interesting cases, the process of guessing can be solved semiautomatically by means of careful manipulation of the concept of classicality. In this way, it is possible to obtain a reasonably efficient and conceptually simple method for discovering new logical hypotheses that will serve as explanans for a given explanandum.
The basic idea of applying analytic tableaux to the problem of abduction is that the open branches may be seen as a heuristic device that indicates the formulas that would close the tableau, and these formulas are then taken as the explicative hypotheses. Below, we propose a definition of the notion of an abductive explanation in which non-triviality, rather than consistency, is a necessary condition for possible solutions of abductive problems.

Definition 13 (Abductive explanation). Let Γ and Δ be finite sets of sentences and let A be a sentence in the language of a given logic L. Γ and A form an abductive problem, and Δ is an abductive explanation for the abductive problem if:
1. (Abductive problem): The context Γ is not sufficient to entail A, that is, Γ ⊬_L A;
2. (Abductive solution): The enriched context Γ plus Δ is sufficient to entail A, that is, Γ, Δ ⊢_L A;
3. (Non-triviality of solution): The enriched context Γ plus Δ is nontrivial, that is, there exists B such that Γ, Δ ⊬_L B;
4. (Vocabulary restriction of solution): Var(Δ) ⊆ Var(Γ) ∪ Var(A);
5. (Minimality of solution): In the absence of any other criteria, a mathematically minimal Δ is a good explanation (in the sense, e.g., that it is a set of minimal cardinality composed of formulas of minimal length).
While conditions (1) and (2) just define what an abductive problem is and what a solution is, conditions (3), (4), and (5) impose restrictions for a solution to be considered relevant: condition (3) prevents, for instance, Δ from being taken as the collection of all formulas, or as a single bottom particle (which would entail any other formula). Since the compactness theorem holds for LET K, Γ and Δ can always be taken as finite sets. Definition 13 has been adapted from several authors (e.g., Aliseda (2006) and Soler-Toscano and Velázquez-Quesada (2014), among others). As already mentioned, the distinctive point of our definition is clause 3, which requires non-triviality instead of consistency for possible solutions of an abductive problem. As will be seen below, it may happen that a solution Δ to an abductive problem Γ ⊬_L A is such that Δ ∪ Γ is inconsistent with respect to a sentence B, that is, Δ ∪ Γ implies both B and ¬B. However, in this case, it cannot be that Δ ∪ Γ implies ◦B, which means that the proposed solution Δ in fact depends on further investigation that would try to resolve the contradiction yielded by Δ ∪ Γ.
Examples

Now, by making use of the tableau system for LET K presented in Definition 8, it is fairly easy to illustrate how an abduction mechanism based on the logic LET K works. Let us consider, first, an example from logical folklore: a theory Γ = {A → C, B → C}, where A means “It rained last night,” B means “the sprinkler was left on,” and C means “the grass is wet.” If one observes that the grass is wet, and one wants to explain why this is so, “It rained last night” is an explanation, but “the sprinkler was left on” is another competing (though not incompatible) explanation. Tableaux allow for automatically computing nontrivial explanations. After such explanations are available, one may employ some criteria to discard some explanations or choose the best one among competitors. For instance, the hypothetical explanation that the sprinkler was left on may be plausible, but be canceled by the fact that the main water valve was known to be off. In any case, some choices may be necessary in order to implement a preference policy for ranking multiple explanations – facts may have precedence over hypothetical explanations, and likelihood may be used to classify explanations. Although this is an important part of the whole question that will affect the usefulness of the automatic explanations produced, it is not part of the abduction problem as originally posed.

Example 2. A case where LET K tableaux and classical tableaux give the same result: Let Γ = {A → C, B → C}; of course Γ ⊬^T_K C. Running an LET K tableau for 1(Γ) ∪ {0(C)} produces an open branch containing 0(A) and 0(B). Clearly, this branch would be closed by 1(A) or 1(B), which indicates that there are three possible abductive solutions: Δ1 = {A}, Δ2 = {B}, and Δ3 = {A, B}.
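The three solutions of Example 2 can be cross-checked semantically against Definition 13. The following Python sketch enumerates all valuations of the atoms A, B, C and of their negations (treated as independent, since no sentence is marked with ◦), evaluates implications by clause 8, and filters candidate sets Δ by conditions (1)–(3), listing smaller sets first as suggested by condition (5). Restricting candidates to atoms other than the explanandum, and using literals as the test sentences for non-triviality, are simplifying assumptions of this sketch; all names are illustrative.

```python
from itertools import combinations, product

ATOMS = ("A", "B", "C")
LITERALS = list(ATOMS) + [("not", p) for p in ATOMS]


def value(f, v):
    """v maps each atom p and each negated atom ('not', p) to 0 or 1."""
    if isinstance(f, str) or f[0] == "not":
        return v[f]
    _, a, b = f                                   # ("imp", a, b): clause 8
    return int(value(a, v) == 0 or value(b, v) == 1)


def valuations():
    for bits in product((0, 1), repeat=len(LITERALS)):
        yield dict(zip(LITERALS, bits))


def entails(premises, goal):
    return all(value(goal, v) == 1
               for v in valuations()
               if all(value(p, v) == 1 for p in premises))


def explanations(gamma, goal):
    if entails(gamma, goal):                      # condition (1) fails
        return []
    candidates = [p for p in ATOMS if p != goal]  # within condition (4)
    found = []
    for size in range(1, len(candidates) + 1):    # smaller sets first (5)
        for delta in combinations(candidates, size):
            enriched = list(gamma) + list(delta)
            sufficient = entails(enriched, goal)                          # (2)
            nontrivial = any(not entails(enriched, l) for l in LITERALS)  # (3)
            if sufficient and nontrivial:
                found.append(set(delta))
    return found


GAMMA = [("imp", "A", "C"), ("imp", "B", "C")]

if __name__ == "__main__":
    print(explanations(GAMMA, "C"))
```

For Γ = {A → C, B → C} and explanandum C, the function returns the sets {A}, {B}, and {A, B}, in agreement with the reading of the open branch.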
In the example above, in principle, it is considered that a minimal Δ provides a better explanation. However, in real-life contexts of reasoning, the choice between these solutions is a problem that may depend on data and criteria to be established by the user of the system.

Example 3 (“Impossible explanations” explained). Suppose that one knows that if it rained last night, then the grass is wet; that one knows that the grass is wet, but also knows that it did not rain. How can one explain that the grass is wet? Let the situation be represented as Γ = {A → B, ¬A}; here, Γ ⊬^T_K B, but no classical tableau is able to find an explanation, since the only possible candidate, A, has to be ruled out by clause (3) of Definition 13, as it entails triviality. However, LET K tableaux are able to provide a solution. In situations like this one, common sense suggests that the rain hypothesis may be accepted as an explanation if the information suggesting that it did not rain is uncertain or dubious. Running an LET K tableau yields an open branch that would be closed by 1(A). Clearly, Δ = {A} would close the branch, but classically it would not be a solution, since the set of premises contains the formula ¬A. But A is indeed a solution, one that also indicates that the sentence ¬A is not well established as true – that is, it is non-conclusive. In fact, in LET K, A, ¬A ⊢^T_K •A. For this reason, this explanation, under LET K tableaux, does not violate Definition 13. Notice that this scenario cannot be represented by a classical tableau.

Example 4 (Explanations that avoid hasty conclusions). It is well known that taking certain drugs has beneficial consequences for health, but also that the same drugs, under certain conditions, will produce undesirable effects on one’s health. Represent this situation as A → B and A → ¬B. Under classical reasoning (using classical tableaux, or any other classical inference mechanism), an immediate conclusion would be ¬A, that is, no one should take the drugs.
However, the negative effects could be explained by inappropriate doses, or by different health conditions in different people, and so on. With LET K tableaux, this case turns out to be an interesting abduction problem, since in LET K A → B, A → ¬B ⊬^T_K ¬A – a counterexample is given by the valuation v(A) = v(B) = v(¬B) = 1, v(¬A) = 0. An abductive explanation produced by the LET K tableau is that the drug is to be banned only if the contradictory effects are undeniable, that is, if ◦B. Indeed, assuming ◦A, which is reasonable since one either takes or does not take the drug, in LET K ◦A, ◦B, A → B, A → ¬B ⊢^T_K ¬A (proof left to the reader). Hence, given ◦A, Δ = {◦B} is an explanation: the resulting LET K tableau is closed.

Example 5 (Whodunit?). A diamond was stolen in a hotel room, and only two people, Bob and Alice, had entered the room, on two different days. Since there is only non-conclusive evidence against each of them, and the standard of proof in a criminal trial is so strict that guilt must be established beyond a shadow of doubt, the police initially consider that they are not guilty, but that certainly one of them is; that is, the evidence basis contains Γ = {¬A, ¬B, A ∨ B}, where A and B stand, respectively, for “Alice is guilty” and “Bob is guilty.” At this point, Γ ⊬^T_K A and Γ ⊬^T_K B, so there are two abductive problems. Now, by running the respective LET K tableau, it can easily be seen that either ◦A (meaning that the initial supposition about Alice’s innocence was indeed conclusive) or ◦B (meaning, alternatively, that the initial supposition about Bob’s innocence was indeed conclusive) would decide the question. Indeed, the following hold in LET K:

    A ∨ B, ¬A, ¬B, ◦A ⊢^T_K B,
    A ∨ B, ¬A, ¬B, ◦B ⊢^T_K A.

The presumed innocence of exactly one of them must be revised. Defending the innocence of one of them amounts to the culpability of the other. Notice that they cannot be both innocent – if this were the case, triviality would ensue.

These examples illustrate the fact that employing logics of evidence and truth in the general problem of abduction has interesting consequences, automatically producing meaningful explanations that would be imperceptible within the classical environment. LET K is not the only choice, and other LETs would play a similar role. It is worth noting that LET K is decidable (Coniglio & Rodrigues, 2022), and the complexity of its satisfiability problem is no worse than that of the classical satisfiability problem.

The ideas presented above may be extended to first order. A sound analytic tableau system for QLET K is obtained by adding the following rules to LET K’s tableau system (see Definition 8):

\[
\text{Rule 18:}\ \dfrac{1(\forall xA)}{1(A(c/x))}
\qquad
\text{Rule 19:}\ \dfrac{0(\forall xA)}{0(A(c/x))}
\qquad
\text{Rule 20:}\ \dfrac{1(\exists xA)}{1(A(c/x))}
\qquad
\text{Rule 21:}\ \dfrac{0(\exists xA)}{0(A(c/x))}
\]
\[
\text{Rule 22:}\ \dfrac{1(\neg\forall xA)}{1(\neg A(c/x))}
\qquad
\text{Rule 23:}\ \dfrac{0(\neg\forall xA)}{0(\neg A(c/x))}
\qquad
\text{Rule 24:}\ \dfrac{1(\neg\exists xA)}{1(\neg A(c/x))}
\qquad
\text{Rule 25:}\ \dfrac{0(\neg\exists xA)}{0(\neg A(c/x))}
\]
These rules are subject to the following restrictions: c is an arbitrary constant in Rules 18, 21, 23, and 24, and c is a new constant in Rules 19, 20, 22, and 25. Clause 2 of Definition 4 has to be changed as follows: a tableau branch is closed if it contains a pair of signed formulas 1(A) and 0(A′), where A and A′ are variants of one another. Rules 18–25 are sound with respect to the valuation semantics presented in Definition 12, since they clearly comply with the semantic clauses for the quantifiers.
Thus, the method introduced here for obtaining automatic explanations can also be extended to first order. Let us see an example.

Example 6. Consider the following set of sentences: Γ = {∀x(Cx → Bx), ∀x(Gx → Bx), ¬Ca}. Here, Γ ⊬^T Ba. Running a QLET K tableau for 1(Γ) ∪ {0(Ba)} produces an open branch containing 0(Ca), 1(¬Ca), and 0(Ga). Classically, the only candidate for an abductive explanation is Ga. But from the point of view of QLET K, there are two possible explanations, because 1(Ca) also closes that branch. In this case, however, a further conclusion is that Ca is non-conclusive. This explanation does not violate Definition 13.
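The entailment claims of Examples 3–5, and of Example 6 restricted to the ground instances at the constant a, can be checked by brute force over valuations. The sketch below rests on an explicit assumption about Definition 2, which lies outside this section: that it contains the usual FDE clauses for ∨ and for negated compound formulas, v(¬¬A) = v(A), and the constraint that v(◦A) = 1 forces exactly one of v(A) and v(¬A) to be 1. It handles ◦ applied to atoms only and no formulas of the form ¬◦A; all names are illustrative.

```python
from itertools import product


def Not(a): return ("not", a)
def Imp(a, b): return ("imp", a, b)
def Or(a, b): return ("or", a, b)
def Circ(p): return ("circ", p)


def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))


def value(f, v):
    if isinstance(f, str):
        return v[("pos", f)]
    op = f[0]
    if op == "circ" and isinstance(f[1], str):
        return v[("circ", f[1])]
    if op == "imp":                                   # clause 8
        return int(value(f[1], v) == 0 or value(f[2], v) == 1)
    if op == "or":
        return int(value(f[1], v) == 1 or value(f[2], v) == 1)
    if op == "not":
        g = f[1]
        if isinstance(g, str):
            return v[("neg", g)]                      # independent of v(g)
        if g[0] == "not":
            return value(g[1], v)
        if g[0] == "imp":                             # clause 9
            return int(value(g[1], v) == 1 and value(Not(g[2]), v) == 1)
        if g[0] == "or":
            return int(value(Not(g[1]), v) == 1 and value(Not(g[2]), v) == 1)
    raise NotImplementedError(f)


def valuations(names):
    # Assumed constraint: v(circ p) = 1 forces exactly one of v(p), v(not p).
    triples = [(p, n, c) for p, n, c in product((0, 1), repeat=3)
               if not (c == 1 and p == n)]
    names = sorted(names)
    for choice in product(triples, repeat=len(names)):
        v = {}
        for name, (p, n, c) in zip(names, choice):
            v[("pos", name)], v[("neg", name)], v[("circ", name)] = p, n, c
        yield v


def entails(premises, goal):
    names = set().union(*(atoms(f) for f in list(premises) + [goal]))
    return all(value(goal, v) == 1 for v in valuations(names)
               if all(value(p, v) == 1 for p in premises))


def has_model(premises):
    names = set().union(*(atoms(f) for f in premises))
    return any(all(value(p, v) == 1 for p in premises)
               for v in valuations(names))


EX3 = [Imp("A", "B"), Not("A")]
EX4 = [Imp("A", "B"), Imp("A", Not("B"))]
EX5 = [Or("A", "B"), Not("A"), Not("B")]
EX6 = [Imp("Ca", "Ba"), Imp("Ga", "Ba"), Not("Ca")]   # instances at a

if __name__ == "__main__":
    print(entails(EX3 + ["A"], "B"), entails(EX3 + ["A"], Not("B")))
    print(entails(EX4, Not("A")), entails(EX4 + [Circ("A"), Circ("B")], Not("A")))
    print(entails(EX5 + [Circ("A")], "B"), has_model(EX5 + [Circ("A"), Circ("B")]))
    print(entails(EX6 + ["Ca"], "Ba"), has_model(EX6 + ["Ca", Circ("Ca")]))
```

Under these assumptions, adding A in Example 3 (or Ca in Example 6) yields the explanandum without trivializing the premises, but the enriched set has no model once ◦A (respectively, ◦Ca) is added, which is the semantic counterpart of the remark that the conflicting premise is non-conclusive.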
Conclusion

In this chapter the approach to paraconsistency developed by the logics of evidence and truth has been reviewed. From the technical point of view, the basic idea is to express two different notions of logical consequence in the same formal system, namely, classical logic on the one hand and a paraconsistent and paracomplete logic on the other. This is done in such a way that classical negation, and so classical logic, can be recovered for sentences for which a classical behavior is required. From an intuitive point of view, the basic idea is to express the deductive behavior of positive and negative evidence, which can be non-conclusive or conclusive, the latter being subject to classical logic. Non-conclusive evidence can be incomplete and contradictory; therefore, excluded middle and explosion are not valid, and the underlying logic of sentences for which there is no conclusive evidence is FDE or an extension thereof (such as FDE→). LETs can also be interpreted in terms of information, and in this case instead of conclusive and non-conclusive evidence one can refer to reliable and unreliable information. This interpretation can be seen as a further development of the idea of a logic suitable for dealing with possibly inconsistent databases proposed and worked out by Belnap and Dunn from the 1970s onward.

In addition to the intuitive interpretation in terms of evidence and information, several technical aspects of LET F and LET K, namely, natural deduction systems, analytic tableaux, valuation semantics, as well as the first-order version QLET K of LET K, have been treated. Decidability of the sentential logics LET F and LET K has been obtained by means of analytic tableaux (see also Coniglio & Rodrigues 2022). The chapter has briefly discussed the problem of abduction and proposed a definition of abductive explanation according to which non-triviality, rather than consistency, is a criterion for possible solutions of abductive problems.
This move allows for the application of analytic tableaux of the logic LET K in order to find possible solutions of abductive problems. Since nonexplosive contradictions are tolerated in LET K , the backward mechanism of tableaux, in certain cases, provides contradictory solutions, which are of course marked as unreliable or non-conclusive. Thus, it may happen that premises that in fact are uncertain are explicitly indicated, allowing a broader perspective on the problem at stake.
16 Paraconsistency, Evidence, and Abduction
Appendix
This appendix contains some general remarks on valuation semantics and some technical developments concerning LET_F and LET_K, namely, Kripke-style semantics for both logics and a probabilistic semantics for LET_F.
On Valuation Semantics
Valuation semantics, like the one presented above for LET_F, are two-valued nondeterministic semantics proposed and investigated by Loparic et al. from the 1970s onward (see, e.g., da Costa & Alves, 1977; Loparic, 1986, 2010; Loparic & Alves, 1980; Loparic & da Costa, 1984). This section draws some historical and technical remarks on valuation semantics (cf. Antunes et al., 2022, Sect. 5). The so-called Suszko’s thesis is the claim that every Tarskian and structural logic admits of a two-valued semantics (Suszko, 1977). A proof of this result for sentential logics can be found in Malinowski (1993, pp. 72–73). Given a (possibly infinite) multivalued semantics for a Tarskian and structural logic L, a two-valued semantics for L is defined as follows: if a formula A receives a designated value in a multivalued interpretation I, the value 1 is assigned to A in a two-valued interpretation I′; otherwise, A is assigned the value 0 in I′. Semantic consequence is then defined as preservation of the value 1, instead of preservation of a designated value.
An analogous result has been obtained by Loparic & da Costa (1984, pp. 121–122), where a general notion of valuation semantics is presented. Given a consequence relation ⊢_L and a language ℒ, a function e : ℒ → {0, 1} is an evaluation if e satisfies the following clauses:
(i) If A is an axiom, then e(A) = 1;
(ii) If e assigns the value 1 to all the premises of an application of an inference rule, then it also assigns 1 to its conclusion;
(iii) For some formula A, e(A) = 0.
It is also necessary that a Lindenbaum construction can be carried out for ⊢_L, which requires that ⊢_L has to be Tarskian and compact. Let a set Δ be A-saturated when Δ ⊬_L A and for every B ∉ Δ, Δ ∪ {B} ⊢_L A. Now, assuming that Γ ⊬_L A:
(iv) There is an A-saturated set Δ, such that Γ ⊆ Δ;
(v) Δ ⊢_L B iff B ∈ Δ;
(vi) The characteristic function c of Δ is an evaluation.
Since (iv) and (v) are immediate consequences of the Lindenbaum construction, it suffices to prove (vi). Clearly, c satisfies (i) and (iii) above. As for (ii), suppose c
assigns the value 1 to the premises of a derivation Δ_0 ⊢_L B, Δ_0 ⊆ Δ. Since, by (v), B ∈ Δ, it then follows that c(B) = 1. Now, define a valuation as an evaluation that is the characteristic function of some A-saturated set. The collection of all valuations so defined is an adequate valuation semantics for ⊢_L. Soundness follows from the definition of evaluations (the set of valuations is a proper subset of the set of evaluations), and completeness from the fact that c assigns 1 to all the sentences of Γ while assigning 0 to A. Note that the set of all evaluations for a given consequence relation ⊢_L does not suffice for providing a semantics. Consider, e.g., the semantics of classical logic, which is a special case of a valuation semantics, and let Th be the set of all classical theorems. The characteristic function c of Th is an evaluation, but for all atoms p, neither p nor ¬p is in Th, so c(p) = 0 and c(¬p) = 0, even though c(p ∨ ¬p) = 1. It is also worth noting that the notion of an A-saturated set provides a method for proving completeness for any logic for which a Lindenbaum construction can be carried out. Valuation semantics were proposed by Loparic, Alves, and da Costa for the paraconsistent logics of da Costa’s C_n hierarchy (da Costa & Alves, 1977; Loparic, 1986; Loparic & Alves, 1980), which are “ancestors” of the logics of formal inconsistency and logics of evidence and truth. The problem they had at hand was to provide semantics for paraconsistent logics that are not finitely valued. They then came up with the idea of generalizing classical two-valued semantics in such a way that the axioms and rules were “mirrored” by the semantic clauses in terms of 0s and 1s. The value 0 assigned to a formula A can be read as “A does not hold” and 1 as “A holds” – note that this is the basic idea of the general notion of valuation as defined above.
Later, valuation semantics were proposed for several nonclassical sentential logics, including minimal and intuitionistic logics (Loparic, 2010), FDE (Rodrigues et al., 2020), Nelson’s N4 (Carnielli & Rodrigues, 2017), and logics of formal inconsistency and undeterminedness (Carnielli et al., 2007). First-order valuation semantics were also proposed for da Costa’s quantified C_n hierarchy (da Costa et al., 2007), for some logics of formal inconsistency (Carnielli & Coniglio, 2016; Carnielli et al., 2014), and for a quantified version of the logic LET_F (Antunes et al., 2022).
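The two-valued reduction described in this section can be illustrated on the four-valued semantics of FDE. The following is a minimal sketch, not the authors' construction: the encoding of the four values as pairs and all function names are assumptions made for illustration. It collapses designated values to 1 and all others to 0, and then confirms that excluded middle and explosion fail under the resulting 0/1 reading.

```python
from itertools import product

# Four Belnap-Dunn values encoded as pairs (told_true, told_false).
# The encoding and every name below are illustrative assumptions.
VALUES = [(1, 0), (0, 1), (1, 1), (0, 0)]  # True, False, Both, Neither


def neg(a):
    """Negation swaps the two components."""
    return (a[1], a[0])


def conj(a, b):
    """Conjunction: told true if both are, told false if either is."""
    return (a[0] & b[0], a[1] | b[1])


def disj(a, b):
    """Disjunction: told true if either is, told false if both are."""
    return (a[0] | b[0], a[1] & b[1])


def two_valued(a):
    """Reduction to {0, 1}: 1 exactly when the value is designated."""
    return a[0]  # designated values are those with told_true = 1


def excluded_middle_fails():
    """Some value gives p v ~p the two-valued reading 0."""
    return any(two_valued(disj(p, neg(p))) == 0 for p in VALUES)


def explosion_fails():
    """Some p, q give p and ~p the reading 1 while q gets 0."""
    return any(
        two_valued(p) == 1 and two_valued(neg(p)) == 1 and two_valued(q) == 0
        for p, q in product(VALUES, repeat=2)
    )


print(excluded_middle_fails(), explosion_fails())
```

The 0/1 reading is not truth-functional: when a sentence gets 0, its negation may get either 0 or 1, which is why the reduced semantics is nondeterministic.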
Kripke-Style Semantics for LET_F and LET_K
This section presents Kripke-style models for both LET_F and LET_K (cf. Antunes et al., 2020).
Definition 14. A Kripke model M for LET_F (or an LET_F-Kripke model) is a structure ⟨W, ≤, v⟩ such that W is a non-empty set of stages, the accessibility relation ≤ is a partial order on W, and v : ℒ_F × W → {0, 1} is a valuation function satisfying the following conditions, for every w ∈ W:
1. v(A ∧ B, w) = 1 iff v(A, w) = 1 and v(B, w) = 1;
2. v(A ∨ B, w) = 1 iff v(A, w) = 1 or v(B, w) = 1;
3. v(¬(A ∧ B), w) = 1 iff v(¬A, w) = 1 or v(¬B, w) = 1;
4. v(¬(A ∨ B), w) = 1 iff v(¬A, w) = 1 and v(¬B, w) = 1;
5. v(¬¬A, w) = 1 iff v(A, w) = 1;
6. v(◦A, w) = 1 only if exactly one of the following conditions obtains:
   (a) For every w′ ≥ w, v(A, w′) = 1 and v(¬A, w′) = 0;
   (b) For every w′ ≥ w, v(A, w′) = 0 and v(¬A, w′) = 1;
7. v(•A, w) = 1 iff v(◦A, w) = 0;
8. If v(A, w) = 1, then for every w′ ≥ w, v(A, w′) = 1, for every A ∈ ℒ_F.
Given an LET_F-Kripke model M = ⟨W, ≤, v⟩ and a stage w ∈ W, a formula A is said to hold in w (M, w ⊩ A) if and only if v(A, w) = 1. A is said to be a LET_F-Kripke consequence of Γ (Γ ⊨^R_F A) if and only if for every model M = ⟨W, ≤, v⟩ and every w ∈ W, if M, w ⊩ B, for every B ∈ Γ, then M, w ⊩ A. A is said to be logically valid if for every model M and stage w ∈ W, M, w ⊩ A (the letter ‘R’ in ⊨^R_F stands for ‘relational’). These models are intended to represent a database that receives information over time from different sources, and such information may be either reliable or unreliable. Each stage w represents one of the six scenarios mentioned on page 315 with respect to a sentence A. For example, contradictory information A is expressed by w ⊩ A, w ⊩ ¬A, and w ⊩ •A, reliable information A by w ⊩ A, w ⊩ ◦A, and so on. This is illustrated by the diagram below:
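[Diagram: stage w1 (no information about p) is followed by w2 (p) and w3 (¬p); w2 is followed by w4 (p; ◦p) and w5 (p; ¬p); w3 is followed by w5 (p; ¬p) and w6 (¬p; ◦p).]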
In stage w1 , the database is empty and so has no information about p. In w2 it receives only the information p, which in w2 is not taken as reliable. From w2 , there are two possibilities: in w4 the database receives the information that the information about p is reliable, which is expressed by ◦p; alternatively, in w5 the
information ¬p is obtained, and so the information about p remains unreliable. Analogous reasoning applies to w3, which may bifurcate into w5 or w6.
Theorem 4. The natural deduction system of LET_F is sound and complete with respect to the class of all LET_F-Kripke models: Γ ⊢_F A if and only if Γ ⊨^R_F A.
Proof. See Antunes et al. (2020), Theorems 3 and 4.
It is worth mentioning that although Kripke models are required to satisfy the persistence condition expressed by clause (8) of Definition 14, this requirement could have been omitted entirely. That is, Theorem 4 would still hold even if clause (8) were missing – or even if it were replaced by weaker versions of the persistence condition. For a detailed discussion of this point, as well as for proofs of the relevant results, see Antunes et al. (2020), Sect. 4. Now, it is straightforward to adapt the Kripke-style semantics for LET_F presented above to the case of LET_K. It suffices to supplement clauses (1)–(8) of Definition 14 with two clauses governing the semantic behavior of →.
Definition 15. A Kripke model M for LET_K (or an LET_K-Kripke model) is a LET_F-Kripke model that satisfies the following additional clauses:
9. v(A → B, w) = 1 iff v(A, w) = 0 or v(B, w) = 1;
10. v(¬(A → B), w) = 1 iff v(A, w) = 1 and v(¬B, w) = 1.
As before, given an LET_K-Kripke model M and a stage w ∈ W, a formula A is said to hold in w (M, w ⊩ A) if and only if v(A, w) = 1. A is a LET_K-Kripke consequence of Γ (Γ ⊨^R_K A) if and only if for every model M and every w ∈ W, if M, w ⊩ B, for every B ∈ Γ, then M, w ⊩ A. A is said to be logically valid if for every model M and stage w ∈ W, M, w ⊩ A. In view of the definition above, soundness and completeness results can easily be proven by adapting the proofs of the corresponding results for LET_F.
Theorem 5. The natural deduction system of LET_K is sound and complete with respect to the class of all LET_K-Kripke models: Γ ⊢_K A if and only if Γ ⊨^R_K A.
Proof. Left to the reader.
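The six-stage database model described in this section can be checked mechanically. The sketch below is illustrative only: the stage names follow the text, while the data layout, the abbreviations "np" (for ¬p) and "op" (for ◦p), and the function names are assumptions. It encodes the stages and their ordering, then verifies the persistence condition (clause 8), the necessary condition on ◦ (clause 6), and the definition of • (clause 7) for the sentence p.

```python
# Stages and immediate successors, as described for the database model.
STAGES = ["w1", "w2", "w3", "w4", "w5", "w6"]
EDGES = {("w1", "w2"), ("w1", "w3"), ("w2", "w4"),
         ("w2", "w5"), ("w3", "w5"), ("w3", "w6")}

# What holds at each stage; "np" abbreviates ~p and "op" abbreviates the
# classicality statement about p (both abbreviations are assumptions).
HOLDS = {
    "w1": set(),
    "w2": {"p"},
    "w3": {"np"},
    "w4": {"p", "op"},
    "w5": {"p", "np"},
    "w6": {"np", "op"},
}


def leq(u, w):
    """u <= w: reflexive, transitive closure of EDGES (EDGES is acyclic)."""
    if u == w:
        return True
    return any(a == u and leq(b, w) for (a, b) in EDGES)


def v(formula, w):
    """Two-valued valuation of an abbreviated formula at stage w."""
    return 1 if formula in HOLDS[w] else 0


def persistent():
    """Clause 8: whatever holds at a stage holds at every later stage."""
    return all(v(f, u) == 1
               for w in STAGES for f in HOLDS[w]
               for u in STAGES if leq(w, u))


def circ_clause_ok():
    """Clause 6: where op holds, p is settled one way at all later stages."""
    for w in STAGES:
        if v("op", w) == 1:
            later = [u for u in STAGES if leq(w, u)]
            all_true = all(v("p", u) == 1 and v("np", u) == 0 for u in later)
            all_false = all(v("p", u) == 0 and v("np", u) == 1 for u in later)
            if all_true == all_false:  # exactly one of the two must obtain
                return False
    return True


def bullet(w):
    """Clause 7: the bullet statement holds iff op does not."""
    return 1 - v("op", w)


print(persistent(), circ_clause_ok(), bullet("w5"), bullet("w4"))
```

At w5 both p and ¬p hold and the bullet statement is 1, matching the contradictory-information scenario; at w4 it is 0, matching reliable information that p.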
Probabilistic Semantics for LET_F
This section presents a probabilistic semantics for LET_F intended to quantify the degree of evidence enjoyed by a given sentence (cf. Rodrigues et al. 2020). The relation between the two-valued valuation semantics (page 324) and the probabilistic semantics can be illustrated by means of an analogy with qualitative and quantitative analysis in analytical chemistry. Qualitative analysis is concerned
with whether or not some sample contains a given substance, while quantitative analysis asks how much of a substance is contained in a sample. Analogously, the valuation semantics represents only that there is or there is not positive and negative evidence available for A, while the probabilistic semantics intends to express the amount of such evidence. The intended intuitive interpretation is as follows. Let P(A) = ε mean that ε is the measure of evidence available for A. A probabilistic scenario is said to be paracomplete, or more generally gapped, when P(A) + P(¬A) < 1, and paraconsistent, or more generally glutted, when P(A) + P(¬A) > 1. When the evidence is non-conclusive or there is no evidence at all for A, P(A) + P(¬A) < 1, and when there is conflicting evidence for A, P(A) + P(¬A) > 1. The former is a paracomplete and the latter a paraconsistent scenario, and in both cases, P(◦A) < 1, which means that the probability measures of A and ¬A do not behave classically. If P(◦A) = 1, standard probability is recovered for A.
Definition 16. Given a logic L, with a derivability relation ⊢_L and a language ℒ, a probability distribution for L is a real-valued function P : ℒ → ℝ satisfying the following conditions:
1. Nonnegativity: 0 ≤ P(A) ≤ 1 for all A ∈ ℒ;
2. Tautologicity: If ⊢_L A, then P(A) = 1;
3. Anti-Tautologicity: If A = ⊥, then P(A) = 0;
4. Comparison: If A ⊢_L B, then P(A) ≤ P(B);
5. Finite additivity: P(A ∨ B) = P(A) + P(B) − P(A ∧ B).
The clauses above can be regarded as meta-axioms that define probability functions for an appropriate logic L just by taking ⊢_L as the derivability relation of L, and so the notion of probability can be regarded as logic-dependent.
Definition 17 (LET_F-probability distribution). Let Σ = {A1, · · · , An, · · · } be a (finite or infinite) collection of sentences in the language ℒ_F of LET_F. A LET_F-probability distribution over Σ is an assignment of probability values P to the elements of Σ that can be extended to a full probability function P : ℒ_F → ℝ according to Definition 16.
Conditional Probability
The notion of the conditional probability of A given B is defined as usual, for P(B) ≠ 0:

$$P(A/B) = \frac{P(A \wedge B)}{P(B)}$$
In terms of evidence, a statement P(A/B) is to be read as a measure of how much the evidence available for B affects the evidence for A.
Proposition 6. The following hold when the probabilities in the denominators are different from 0:
1. P(A/B) + P(¬A/B) − P(•A/B) ≤ P(A ∨ ¬A/B);
2. If P(◦A) = 1, then P(A ∨ ¬A) = 1 and P(A ∧ ¬A) = 0;
3. P(A/B) + P(¬A/B) = 1, if P(◦A) = 1;
4. P(B/◦B) + P(¬B/◦B) = 1.
Proof. See Rodrigues et al. (2020), Theorem 39.
Independence and Incompatibility
Intuitively, two sentences are independent if the fact that one holds does not have any effect on whether or not the other holds, and vice versa. Two sentences A and B are said to be independent with respect to a probability distribution P if P(A ∧ B) = P(A) · P(B). Note that two sentences can be independent in one probability distribution and dependent in another. Alternatively, independence can be defined as follows: A is independent of B if P(A/B) = P(A) (or equivalently, P(B/A) = P(B)). Classically, A and ¬A are never independent (unless one of them has probability zero). By item 1 of Proposition 7 below, P(A ∧ ¬A) ≤ P(•A); hence, when P(A) · P(¬A) > P(•A), A and ¬A are not independent. Thus, P(•A) can be regarded as a bound on the “degree of independence” between A and ¬A. Intuitively, two sentences A and B are logically incompatible if A cannot hold when B holds, and vice versa. Two sentences A and B are said to be logically incompatible if A, B ⊢ C, for any C, or equivalently, if A ∧ B is a bottom particle.
Proposition 7.
1. P(A ∧ ¬A) ≤ P(•A);
2. P(◦A) ≤ P(A ∨ ¬A);
3. P(◦A) = 1 − P(•A);
4. P(◦A ∨ (A ∧ ¬A)) ≤ P(A ∨ ¬A);
5. If P(◦A) = 1, then P(¬A) = 1 − P(A), P(A ∨ ¬A) = 1, and P(A ∧ ¬A) = 0.
Proof. See Rodrigues et al. (2020), Theorem 42. The proposition above states some useful properties of LET F distributions. Items 1–4 establish constraints on the values of P (◦A), P (•A), P (A), and P (¬A). Item 5 shows the classical behavior of probabilities when P (◦A) = 1.
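The constraints above lend themselves to a simple numerical sanity check. The following sketch is only an illustration under stated assumptions: the function names and the sample numbers are invented, and passing the check is a necessary condition, not a proof that the numbers extend to a full LET_F-probability distribution. It classifies a scenario as gapped or glutted and tests items 1, 2, and 5 of Proposition 7, using item 3 and finite additivity to compute the derived values.

```python
from fractions import Fraction as F


def classify(p_a, p_not_a):
    """Gapped if P(A) + P(~A) < 1, glutted if > 1 (names are illustrative)."""
    total = p_a + p_not_a
    if total < 1:
        return "gapped"
    if total > 1:
        return "glutted"
    return "classical sum"


def check_proposition_7(p_a, p_not_a, p_a_and_not_a, p_circ):
    """Check items 1, 2 and 5 for hypothetical evidence measures."""
    p_bullet = 1 - p_circ                          # item 3
    p_a_or_not_a = p_a + p_not_a - p_a_and_not_a   # finite additivity
    return {
        "item1": p_a_and_not_a <= p_bullet,
        "item2": p_circ <= p_a_or_not_a,
        # Item 5 only constrains the case where the evidence is conclusive.
        "item5": p_circ != 1 or (
            p_not_a == 1 - p_a
            and p_a_or_not_a == 1
            and p_a_and_not_a == 0
        ),
    }


# Made-up conflicting evidence for A: P(A) = 0.7, P(~A) = 0.6.
result = check_proposition_7(F(7, 10), F(6, 10), F(4, 10), F(1, 2))
print(classify(F(7, 10), F(6, 10)), result)
```

Exact fractions are used instead of floats so that the equality tests in item 5 are not disturbed by rounding.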
Total Probability Theorems and Bayes’ Rules In the classical approach to probability, total probability theorems compute the probability of an event B in a sample space partitioned into exclusive and exhaustive
events. Typically, for a partition in two pieces, a total probability theorem that reflects excluded middle assumes the following form: P(B) = P(B ∧ A) + P(B ∧ ¬A). Here, however, we are not really talking about sample spaces, i.e., about events themselves, but rather about the information related to such events, referred to as an information space. In the standard approach to probability theory, one starts from a group of events and attributes probabilities to these events, whose sum is always equal to 1. Now, for example, consider the language of LET_F used to convey information about some event referred to by the sentence A. Such information can be incomplete, contradictory, more reliable or less reliable, and maybe even conclusive. In this case, the relevant sentences that concern us are A, ¬A, ◦A, •A, as well as other sentences of the language of LET_F formed from them, for example, •A ∨ A, A ∧ ¬A, ◦A ∧ A, etc. A LET_F-probability distribution attributes values to these sentences. The information space is thus constituted by such sentences and by the measures of probabilities attributed to them by a LET_F-probability distribution P. Note that, contrary to the classical case, P(A) + P(¬A) can be greater or less than 1 precisely because A and ¬A do not establish a partition of the information space. Now, the question is: since one cannot rely on the classical, mutually exclusive partitions of the sample space, how can total probability theorems be stated? In order to provide such theorems for LET_F, the best strategy is to rely on the connectives ◦, • and also on a bit of terminology.
Definition 18 (Cleavage). Let us call a cleavage a (finite) family of sentences {A1, A2, . . . , An}. A cleavage is said to be exhaustive if A1 ∨ A2 ∨ . . . ∨ An is a tautology, and so it covers all the information space, possibly with intersections. A cleavage is said to be exclusive when A1, A2, . . . , An are pairwise logically incompatible. In this case, it does not yield intersection of information, since Ai ∧ Aj for i ≠ j is a bottom particle, and the cleavage possibly does not cover the whole space. An exhaustive and exclusive cleavage is a partition.
Proposition 8. The following are theorems of LET_F:
1. ◦A ∨ •A;
2. A ∨ ¬A ∨ •A;
3. (•A ∧ A) ∨ (•A ∧ ¬A) ∨ (•A) ∨ (•A ∧ (A ∧ ¬A)) ∨ (◦A ∧ A) ∨ (◦A ∧ ¬A).
Item 1 above cleaves the information space in parts that are exhaustive and exclusive, and so are partitions. Items 2 and 3, on the other hand, cleave the information space exhaustively but not exclusively. Note that 3 corresponds to the six scenarios expressed by LET_F: the first four disjuncts represent scenarios of unreliable information, while the last two disjuncts represent the two scenarios of reliable information. Items 1–3 above can be understood as different ways one can
look at the information space. Several total probability theorems depending upon the notion of cleavage given by the above formulas (Proposition 8) can be proved (see Rodrigues et al. (2020), Theorem 44). Bayes’ rule, or Bayes’ theorem, as is well-known, computes the probability of an event based on previous information related to that event. The standard Bayes’ rule states that, for P(B) ≠ 0:

$$P(A/B) = \frac{P(B/A) \cdot P(A)}{P(B)}$$
In the equation above, interpreted in terms of measures of evidence rather than standard probabilities, P (A) denotes the evidence available for A without taking into consideration any evidence for B. The latter is supposed to affect in some way the evidence for A, and so P (A/B) is the measure of the evidence for A after B is taken into account. P (B/A), usually called the “likelihood” in probability theory, is the evidence for B when A is considered as given, and P (B), usually called the “marginal likelihood,” is the total evidence available for B, which takes into account all the possible cases where B may occur. Some non-equivalent versions of Bayes’ rule can be defined, showing how the notion of classicality can modify Bayesian probability updating (see Rodrigues et al. (2020), Theorem 45). The probabilistic semantics of LET F is axiomatically stated in Definitions 16 and 17. An intuitive interpretation in terms of conclusive and non-conclusive evidence has been suggested, and the notions of information space and cleavage were adopted in order to explain the similarities and differences between standard probability theory and the semantics proposed. Although the semantics here presented has not been conceived to express degrees of belief by means of probability measures, it can be regarded as a tool to measure degrees of belief, certainty/uncertainty, rational acceptance/rejection, or some other relation between agents and sentences. Similar remarks apply to the connective ◦. In P (◦A) = ε, the value of ε expresses the degree to which it is expected that P (A) behaves classically. Indeed, ε can also be interpreted as the degree of reliability of evidence for A, coherence with previous data or with a historical series of measures of evidence for A, or even with a subjective ingredient, for example, the degree of trustfulness of the belief in A, or certainty/uncertainty of A.
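The standard form of Bayes’ rule stated in this section can be worked through with a small numerical example. The sketch below is illustrative only: the function names and the evidence measures are invented, and it covers just the standard rule, not the non-equivalent LET_F versions mentioned above. It computes P(A/B) both directly from the definition of conditional probability and through Bayes’ rule, and the two routes agree.

```python
from fractions import Fraction as F


def conditional(p_a_and_b, p_b):
    """P(A/B) = P(A and B) / P(B), defined only for P(B) != 0."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_a_and_b / p_b


def bayes(p_b_given_a, p_a, p_b):
    """Standard Bayes' rule: P(A/B) = P(B/A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_b_given_a * p_a / p_b


# Hypothetical evidence measures, chosen only for the arithmetic.
p_a, p_b, p_a_and_b = F(3, 10), F(1, 2), F(1, 5)

direct = conditional(p_a_and_b, p_b)        # P(A/B) from the definition
likelihood = conditional(p_a_and_b, p_a)    # P(B/A)
via_bayes = bayes(likelihood, p_a, p_b)     # P(A/B) from Bayes' rule

print(direct, likelihood, via_bayes)
```

Here the evidence for A rises from 3/10 to 2/5 once the evidence for B is taken into account.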
References Achinstein, P. (2010). Concepts of evidence. In Evidence, Explanation, and Realism. Oxford University Press. Aliseda, A. (2006). Abductive Reasoning. Logical Investigations into Discovery and Explanation. Springer. Anderson, A. (1960). Completeness theorems for the systems E of entailment and EQ of entailment with quantification. Mathematical Logic Quarterly, 6, 201–216. Anderson, A., & Belnap, N. (1962). Tautological entailments. Philosophical Studies, 13, 9–24.
Anderson, A., & Belnap, N. (1963). First degree entailments. Mathematische Annalen, 149, 302– 319. Antunes, H., Rodrigues, A., Carnielli, W., & Coniglio, M.E. (2020). Valuation Semantics for FirstOrder Logics of Evidence and Truth. Journal of Philosophical Logic, 51:1141–1173. Antunes, H., Rodrigues, A., Carnielli, W., & Coniglio, M. (2022). Valuation semantics for firstorder logics of evidence and truth. Journal of Philosophical Logic. Arieli, O., Denecker, M., Nuffelen, B. V., & Bruynooghe, M. (2004). Coherent integration of databases by abductive logic programming. Journal of Artificial Intelligence Research, 21, 245–286. Audi, R. (1999). The Cambridge Dictionary of Philosophy (2nd ed.). Cambridge University Press. Belnap, N. (1977a). How a computer should think. In G. Ryle (Ed.), Contemporary Aspects of Philosophy. Oriel Press. Reprinted in New Essays on Belnap-Dunn Logic, 2019 (pp. 35–55). Springer. Belnap, N. (1977b). A useful four-valued logic. In G. Epstein & J. Dunn (Eds.), Modern Uses of Multiple Valued Logics. D. Reidel. Reprinted in New Essays on Belnap-Dunn Logic, 2019. Dordrecht: Springer. Carnielli, W. (2006). Surviving abduction. Logic Journal of the IGPL, 14(2), 237–256. Carnielli, W., & Coniglio, M. (2016). Paraconsistent Logic: Consistency, Contradiction and Negation (Logic, Epistemology, and the Unity of Science series, Vol. 40). Springer. Carnielli, W., Coniglio, M., & Marcos, J. (2007). Logics of formal inconsistency. In D. Gabbay & F. Guenthner (Eds.), Handbook of Philosophical Logic (Vol. 14, pp. 1–93). Amsterdam: Springer. Carnielli, W., Coniglio, M., Podiacki, R., & Rodrigues, T. (2014). On the way to a wider model theory: Completeness theorems for first-order logics of formal inconsistency. The Review Of Symbolic Logic, 7(3), 548–578. Carnielli, W., Coniglio, M., & Rodrigues, A. (2019). Recovery operators, paraconsistency and duality. Logic Journal of the IGPL, 28, 624–656. Carnielli, W., Frade, L., & Rodrigues, A. (2020). 
Analytic proofs for logics of evidence and truth. South American Journal of Logic, 6(2), 325–345. Carnielli, W., & Marcos, J. (2002). A taxonomy of C-systems. In W. Carnielli, M. E. Coniglio, & I. M. D’Ottaviano (Eds.), Paraconsistency: The Logical Way to the Inconsistent. New York: Marcel Dekker. Carnielli, W., & Rodrigues, A. (2015). On the philosophy and mathematics of the logics of formal inconsistency. In New Directions in Paraconsistent Logic – Springer Proceedings in Mathematics & Statistics, 152 (pp. 57–88). India: Springer. Carnielli, W., & Rodrigues, A. (2017). An epistemic approach to paraconsistency: A logic of evidence and truth. Synthese, 196, 3789–3813. Coniglio, M.E. and Rodrigues, A. (2022). On six-valued logics of evidence and truth expanding Belnap-Dunn four-valued logic. arXiv:2209.12337 [math.LO] da Costa, N. (1963). Sistemas Formais Inconsistentes. Curitiba: Editora da UFPR (1993). da Costa, N., & Alves, E. H. (1977). A semantical analysis of the calculi Cn. Notre Dame Journal of Formal Logic, 18, 621–630. da Costa, N., Krause, D., & Bueno, O. (2007). Paraconsistent logics and paraconsistency. In D. J. et al. (Eds.), Philosophy of Logic – Handbook of the Philosophy of Science (Vol. 5, pp. 791– 911). Elsevier. D’Ottaviano, I., & da Costa, N. (1970). Sur un problème de Jáskowski. Comptes Rendus de l’Académie de Sciences de Paris, 270, 1349–1353. Dunn, J. (1976). Intuitive semantics for first-degree entailments and ‘coupled trees’. Philosophical Studies, 29, 149–168. Reprinted in Omori and Wansing (2019). Dunn, J. (2008). Information in computer science. In P. Adriaans & J. van Benthem (Eds.), Philosophy of Information. (Handbook of the Philosophy of Science, Vol. 8, pp. 581–608). Elsevier. Dunn, J. (2018). Three questions to J. Michael Dunn. Paraconsistent Newsletter Fall 2018.
Dunn, J. (2019). Two, three, four, infinity: The path to the four-valued logic and beyond. In H. Omori & H. Wansing (Eds.), New Essays on Belnap-Dunn Logic, pp. 77–97. Springer. Evidence. (2022). dictionary.cambridge.org. Cambridge Dictionary, 2022. Web. January 2022. Fetzer, J. (2004). Information: Does it have to be true? Minds and Machines, 14, 223–229. Fitting, M. (2016). Paraconsistent logic, evidence, and justification. Studia Logica, 105(6), 1149– 1166. Floridi, L. (2011). The Philosophy of Information. Oxford University Press. Floridi, L. (2019). Semantic conceptions of information. In The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University Hazen, A., & Pelletier, F. (2019). K3, Ł3, LP, RM3, A3, FDE, M: How to make many-valued logics work for you. In H. Omori & H. Wansing (Eds.), New Essays on Belnap-Dunn Logic. Springer. Kelly, T. (2014). Evidence. In E. N. Zalta (Ed.) The Stanford Encyclopedia of Philosophy (Fall, 2014 edn.). Metaphysics Research Lab, Stanford University Kim, J. (1988). What is naturalized epistemology? Philosophical Perspectives, 2, 381–405. Loparic, A. (1986). A semantical study of some propositional calculi. The Journal of NonClassical Logic, 3(1), 73–95. Loparic, A. (2010). Valuation semantics for intuitionistic propositional calculus and some of its subcalculi. Principia, 14(1), 125–133. Loparic, A., & Alves, E. (1980). The semantics of the systems Cn of da Costa. In A. Arruda, N. da Costa, & A. Sette (Eds.), Proceedings of the Third Brazilian Conference on Mathematical Logic (pp. 161–172). São Paulo: Sociedade Brasileira de Lógica. Loparic, A., & da Costa, N. (1984). Paraconsistency, paracompleteness and valuations. Logique et Analyse, 106, 119–131. Malinowski, G. (1993). Many-Valued Logics. Clarendon Press. Mancosu, P. (2001). Mathematical explanation: Problems and prospects. Topoi, 20, 97–117. Marcos, J. (2005a). Logics of Formal Inconsistency. PhD thesis, University of Campinas. Marcos, J. (2005b). 
Nearly every normal modal logic is paranormal. Logique et Analyse, 48, 279–300. Omori, H., & Wansing, H. (2017). 40 years of FDE: An introductory overview. Studia Logica, 105, 1021–1049. Omori, H., & Wansing, H. (Eds.), (2019). New Essays on Belnap-Dunn Logic. Springer. Oreskes, N. (2019). Why Trust Science? Princeton University Press. Peirce, C. (1931). Collected Papers. Harvard University Press. Pollock, J. L. (1974). Knowledge and Justification. Princeton University Press. Rodrigues, A., Bueno-Soler, J., & Carnielli, W. (2020). Measuring evidence: A probabilistic approach to an extension of Belnap-Dunn Logic. Synthese, 198, 5451–5480. Rodrigues, A., & Carnielli, W. (2022). On Barrio, Lo Guercio, and Szmuc on logics of evidence and truth. Logic and Logical Philosophy. Soanes, C., & Stevenson, A. (2004). Evidence. In Concise Oxford English Dictionary. Oxford University Press. Soler-Toscano, F., & Velázquez-Quesada, F. (2014). Generation and selection of abductive explanations for non-omniscient agents. Journal of Logic, Language and Information, 23, 141– 168. Suszko, R. (1977). The Fregean axiom and Polish mathematical logic in the 1920s. Studia Logica, 36, 377–380. van Dalen, D. (2008). Logic and Structure (4th ed.). Springer.
17 Qualitative Inductive Generalization and Confirmation
Mathieu Beirlaen

Contents
Adaptive Logics for Inductive Generalization
The Standard Format and LI
General Characterization of the Standard Format
Proof Theory
Minimal Abnormality
More Adaptive Logics
Qualitative Confirmation
I-Confirmation and Hempel’s Adequacy Conditions
I-Confirmation and the Hypothetico-Deductive Model
Interdependent Abnormalities and Heuristic Guidance
Conclusions
References
Abstract
Inductive generalization is a defeasible type of inference which we use to reason from the particular to the universal. First, a number of systems are presented that provide different ways of implementing this inference pattern within first-order logic. These systems are defined within the adaptive logics framework for modeling defeasible reasoning. Next, the logics are reinterpreted as criteria of confirmation. It is argued that they withstand the comparison with two qualitative theories of confirmation, Hempel’s satisfaction criterion and hypothetico-deductive confirmation.
M. Beirlaen, Ghent University, Gent, Belgium
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_28
Keywords
Inductive generalization · Confirmation · Hempel · Raven paradox · Adaptive logic
Adaptive Logics for Inductive Generalization
Logics of induction are tools for evaluating the strength of arguments which are not deductively valid. There are many kinds of argument, the conclusion of which is not guaranteed to follow from its premises, and there are many ways to evaluate the strength of such arguments. This paper focuses on one particular kind of nondeductive argument and on one particular method of implementation. The type of argument under consideration here is that of inductive generalization, as when we reason from the particular to the universal. A number of logics are discussed which permit us, given a set of objects sharing or not sharing a number of properties, to infer generalizations of the form “All x are P,” or “All x with property P share property Q.” Inductive generalization is a common practice which has proven its use in scientific endeavor. For instance, given the fact that the relatively few electrons measured so far carry a charge of −1.6 × 10⁻¹⁹ coulombs, we believe that all electrons have this charge (Norton, 2005). The methods used here for formalizing practices of inductive generalization stem from the adaptive logics framework. Adaptive logics are tools developed for modeling defeasible reasoning, equipped with a proof theory that nicely captures the dynamics of non-monotonic – in this case, inductive – inference. In proofs for adaptive logics for inductive generalization, the conditional introduction of generalizations is allowed. The proof theory is also equipped with a mechanism taking care that conditionally introduced generalizations get retracted in case their condition is violated, for instance, when the generalization in question is falsified by the premises. In sections “The Standard Format and LI” and “More Adaptive Logics”, the general framework of adaptive logics is introduced, and a number of existing adaptive logics for inductive generalization are defined.
The differences between these logics arise from different choices made along one of two dimensions. A first dimension concerns the specific condition required for introducing generalizations in an adaptive proof. A very permissive approach allows for their free introduction, without taking into account the specifics of the premises. This is the idea behind the logic LI. A more economical approach is to permit the introduction of a generalization on the condition that at least one instance of it is present. This is the rationale behind a second logic, IL. In an IL-proof, a generalization “All P are Q” can be introduced only if the premise set contains at least one object which is either not-P or Q. More economical still is the rationale behind a third logic, G, which aims to capture the requirement of knowing at least one positive instance of a generalization before introducing it in a proof. That is, in a G-proof, a generalization
“All P are Q” can be introduced if the premise set contains at least one object which is both P and Q. The second dimension along which different consequence relations are generated concerns the specific mechanism used for retracting generalizations introduced in adaptive proofs. It is often not sufficient to demand retraction just in case a generalization is falsified by the premises. For instance, if the consequence sets of our logics are to be closed under classical logic, jointly incompatible generalizations should not be derivable, even though none of them is falsified by our premise set. Within the adaptive logics framework, various strategies are available for retracting conditional moves in an adaptive proof. Two such strategies are presented in this paper: the reliability strategy and the minimal abnormality strategy. Combining both dimensions, a family of six adaptive logics for inductive generalization is obtained (it contains the systems LI, IL, and G, each of which can be defined using either the reliability or the minimal abnormality strategy). These logics have all been presented elsewhere (for LI, see Batens (2005, 2006); Batens and Haesaert (2003). For IL and G, see Batens (2011)). The original contribution of this paper consists in a study comparing these systems to some existing qualitative criteria of confirmation. There is an overlap between the fields of inductive logic and confirmation theory. In 1943 already, Hempel noted that the development of a logical theory of confirmation might be regarded as a contribution to the field of inductive logic (Hempel, 1943, p. 123). 
In section “Qualitative Confirmation” the logics from sections “The Standard Format and LI” and “More Adaptive Logics” are reinterpreted as qualitative criteria of confirmation and are related to other qualitative models of confirmation: Hempel’s satisfaction criterion (section “I-Confirmation and Hempel’s Adequacy Conditions”) and the hypothetico-deductive model (section “I-Confirmation and the Hypothetico-Deductive Model”). Section “Qualitative Confirmation” ends with some remarks on the heuristic guidance that adaptive logics for inductive generalization can provide in the derivation and subsequent confirmation of additional generalizations (section “Interdependent Abnormalities and Heuristic Guidance”). The following notational conventions are used throughout the paper. The formal language used is that of first-order logic without identity. A primitive functional formula of rank 1 is an open formula that does not contain any logical symbols (∃, ∀, ¬, ∨, ∧, ⊃, ≡), sentential letters, or individual constants and that contains only predicate letters of rank 1. The set of functional atoms of rank 1, denoted A^{f1}, comprises the primitive functional formulas of rank 1 and their negations. A generalization is the universal closure of a disjunction of members of A^{f1}. That is, the set of generalizations in this technical sense is the set {∀(A1 ∨ … ∨ An) | A1, …, An ∈ A^{f1}; n ≥ 1}, where ∀ denotes the universal closure of the subsequent formula. Occasionally the term generalization is also used for formulas equivalent to a member of this set, e.g., ∀x(Px ⊃ Qx). It is easily checked that generalizations ∀(A1 ∨ … ∨ An) can be rewritten as formulas of the general form ∀((B1 ∧ … ∧ Bj) ⊃ (C1 ∨ … ∨ Ck)), and vice versa, where all Bi and Cj belong to A^{f1}.
A First Logic for Inductive Generalization
In this section, the standard format (SF) for adaptive logics is introduced and explained. Its features are illustrated by means of the logic LI from Batens (2006) and Batens and Haesaert (2003), chronologically the first adaptive logic for inductive generalization. A general characterization of the SF is provided, and its proof theory is explained. For a more comprehensive introduction, including the semantics and generic metatheory of the SF, see, e.g., Batens (2007).
General Characterization of the Standard Format
An adaptive logic (AL) within the SF is defined as a triple, consisting of:
(i) A lower limit logic (LLL), a logic that has static proofs and contains classical disjunction
(ii) A set of abnormalities, a set of formulas that share a (possibly) restricted logical form, or a union of such sets
(iii) An adaptive strategy
The LLL is the stable part of the AL: anything derivable by means of the LLL is derivable by means of the AL. Explaining the notion of static proofs is beyond the scope of this paper. For a full account, see Batens (2009). (Alternatively, the static proof requirement can be replaced by the requirement that the lower limit logic has a reflexive, monotonic, transitive, and compact consequence relation.) In any case, it suffices to know that the first-order fragment of classical logic (CL) meets this requirement, as we work almost exclusively with CL as a LLL. The lower limit logic of LI is CL. Typically, an AL enables one to derive, for most premise sets, some extra consequences on top of those that are LLL-derivable. These supplementary consequences are obtained by interpreting a premise set “as normally as possible” or, equivalently, by supposing abnormalities to be false “unless and until proven otherwise.” What it means to interpret a premise set “as normally as possible” is disambiguated by the strategy, element (iii). The normality assumption made by the logics to be defined in this paper amounts to supposing that the world is in some sense uniform. “Normal” situations are those in which it is safe to derive generalizations. “Abnormal” situations are those in which generalizations are falsified. In fact, the set of LI-abnormalities, denoted ΩLI, is just the set of falsified generalizations (the definitions are those from Batens (2011); in Van De Putte and Straßer (2014, Sec. 4.2.2), it is shown that the same logic is obtained if ΩLI is defined as the set of formulas of the form ¬∀xA(x), where A contains no quantifiers, free variables, or constants):
ΩLI =df {¬∀(A1 ∨ … ∨ An) | A1, …, An ∈ A^{f1}; n ≥ 1}   (1)
In adaptive proofs, it is possible to make conditional inferences assuming that one or more abnormalities are false. Whether or not such assumptions can be upheld in the continuation of the proof is determined by the adaptive strategy. The SF incorporates two adaptive strategies, the reliability strategy and the minimal abnormality strategy. In the generic proof theory of the SF, adaptive strategies come with a marking definition, which takes care of the withdrawal of certain conditional inferences in dynamic proofs. It will be easier to explain the intuitions behind these strategies after defining the generic proof theory for ALs. For now, just note that in the remainder “LI” is ambiguous between LIr and LIm , where the subscripts r and m denote the reliability strategy and the minimal abnormality strategy, respectively. Analogously for the other logics defined below.
Proof Theory
Adaptive proofs are dynamic in the sense that lines derived at a certain stage of a proof may be withdrawn at a later stage. Moreover, lines withdrawn at a certain stage can become derivable again at an even later stage, and so on. (A stage of a proof is a sequence of lines, and a proof is a sequence of stages. Every proof starts off with stage 1. Adding a line to a proof by applying one of the rules of inference brings the proof to its next stage, which is the sequence of all lines written so far.) A line in an adaptive proof consists of four elements: a line number, a formula, a justification, and a condition. For instance, a line
j   A   i1, …, in; R   Δ
reads: at line j, the formula A is derived from lines i1, …, in by rule R on the condition Δ. The fourth element, the condition, is what permits the dynamics. Intuitively, the condition of a line in a proof corresponds to an assumption made at that line. In the example above, A was derived on the assumption that the formulas in Δ are false. If, later on in the proof, it turns out that this assumption was too bold, the line in question is withdrawn from the proof by a marking mechanism corresponding to an adaptive strategy. Importantly, only members of the set of abnormalities are allowed as elements of the condition of a line in an adaptive proof. Thus, assumptions always correspond to the falsity of one or more abnormalities, or, equivalently, to the truth of one or more generalizations. Before explaining how the marking mechanism works, the generic inference rules of the SF must be introduced. There are three of them: a premise introduction rule (Prem), an unconditional rule (RU), and a conditional rule (RC). For adaptive logics with CL as their LLL, they are defined as follows:
Prem   If A ∈ Γ:
          …     …
          A     ∅

RU     If A1, …, An ⊢CL B:
          A1    Δ1
          ⋮     ⋮
          An    Δn
          B     Δ1 ∪ … ∪ Δn

RC     If A1, …, An ⊢CL B ∨ Dab(Θ):
          A1    Δ1
          ⋮     ⋮
          An    Δn
          B     Δ1 ∪ … ∪ Δn ∪ Θ

Where Γ is the premise set, Prem permits the introduction of premises on the empty condition at any time in the proof. Remember that conditions, at the intuitive level, correspond to assumptions, so Prem stipulates that premises can be introduced at any time without making any further assumptions. Since ALs strengthen their LLL, one or more rules are needed to incorporate LLL-inferences in AL-proofs. In the proof theory of the SF, this is taken care of by the generic rule RU. This rule stipulates that whenever B is a CL-consequence of A1, …, An, and all of A1, …, An have been derived in a proof, then B is derivable, provided that the conditions attached to the lines at which A1, …, An were derived are carried over. Intuitively, if A1, …, An are derivable, assuming that the members of Δ1, …, Δn are false, and if B is a CL-consequence of A1, …, An, then B is derivable, still assuming that all members of Δ1, …, Δn are false. Before turning to RC, here is an example illustrating the use of the rules Prem and RU. Let Γ1 = {Pa ∧ Qa, Pb, ¬Qc}. Suppose we start an LI-proof for Γ1 as follows:
1    Pa ∧ Qa    Prem     ∅
2    Pb         Prem     ∅
3    ¬Qc        Prem     ∅
4    Pa         1; RU    ∅
5    Qa         1; RU    ∅
Let Θ be a finite set of LI-abnormalities, that is, Θ ⊂ ΩLI . Then Dab(Θ) refers to the classical disjunction of the members of Θ (“Dab” abbreviates “disjunction of abnormalities”; in the remainder, such disjunctions are sometimes referred to as “Dab-formulas”). RC stipulates that whenever B is CL-derivable from A1 , . . . , An in disjunction with one or more abnormalities, then B can be inferred assuming that these abnormalities are false, i.e., we can derive B and add the abnormalities in
question to the condition set, together with assumptions made at the lines at which A1, …, An were derived. For instance, (2) is CL-valid:
∀x(Px ∨ Qx) ∨ ¬∀x(Px ∨ Qx)   (2)
Note that the second disjunct of (2) is a member of ΩLI. In the context of inductive generalization, the assumption that the world is as “normal” as possible corresponds to an assumption about the uniformity of the world. In adaptive proofs, such assumptions are made explicit by applications of the conditional rule. Concretely, if a formula like (2) is derived in an LI-proof, RC can be used to derive the first disjunct on the condition that the second disjunct is false. In fact, since (2) is a CL-theorem, the generalization ∀x(Px ∨ Qx) can be introduced right away, taking its negation to be false (lines 1–5 are not repeated):
6    ∀x(Px ∨ Qx)      RC    {¬∀x(Px ∨ Qx)}
In a similar fashion, RC can be used to derive other generalizations:
7    ∀xPx             RC    {¬∀xPx}
8    ∀xQx             RC    {¬∀xQx}
9    ∀x(¬Px ∨ Qx)     RC    {¬∀x(¬Px ∨ Qx)}
10   ∀x(Px ∨ ¬Qx)     RC    {¬∀x(Px ∨ ¬Qx)}
11   ∀x(¬Px ∨ ¬Qx)    RC    {¬∀x(¬Px ∨ ¬Qx)}
Each generalization is derivable assuming that its corresponding condition is false. However, some of these assumptions clearly cannot be upheld. We know, for instance, that the generalizations derived at lines 8 and 11 are falsified by the premises at lines 3 and 1, respectively. So we need a way of distinguishing between “good” and “bad” inferred generalizations. This is where the adaptive strategy comes in. Since distinguishing “good” from “bad” generalizations can be done in different ways, there are different strategies available to us for making the distinction precise. First, the reliability strategy and its corresponding marking definition are introduced. The latter definition takes care of the retraction of “bad” generalizations. Marking definitions proceed in terms of the minimal inferred Dab-formulas derived at a stage of a proof. A Dab-formula that is derived at a proof stage by RU at a line with condition ∅ is called an inferred Dab-formula of the proof stage. Definition 1 (Minimal inferred Dab-formula). Dab(Δ) is a minimal inferred Dab-formula at stage s of a proof iff Dab(Δ) is an inferred Dab-formula at stage s and there is no Δ′ ⊂ Δ such that Dab(Δ′) is an inferred Dab-formula at stage s.
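Whether a given disjunction of LI-abnormalities is CL-derivable from a small premise set can be checked by brute force. The following Python sketch is an illustration only and is not part of the adaptive logics framework: it assumes two unary predicates P and Q, one-variable generalizations written as clauses, and it relies on the fact that, in first-order logic without identity and with only unary predicates, the truth of such sentences depends solely on which combinations of properties ("types") are realized and on the types of the named objects. All function and variable names are hypothetical.

```python
from itertools import combinations, product

P, Q = 0, 1  # indices of the two unary predicates (an assumption of this sketch)
# A "type" fixes a truth value for each predicate, e.g. (True, False) = P and not Q.
TYPES = list(product([False, True], repeat=2))

def models(constants):
    """Yield (assignment, realized): a type for each constant and the set of realized types."""
    for combo in product(TYPES, repeat=len(constants)):
        assignment = dict(zip(constants, combo))
        used = set(combo)
        spare = [t for t in TYPES if t not in used]
        # Objects without a name may realize any further types.
        for r in range(len(spare) + 1):
            for extra in combinations(spare, r):
                yield assignment, used | set(extra)

def clause_holds(clause, t):
    """A clause is a list of (predicate index, sign) literals, read disjunctively."""
    return any(t[i] == sign for i, sign in clause)

def derives_dab(premises, constants, clauses):
    """True iff the premises CL-entail the disjunction of the negated generalizations."""
    for assignment, realized in models(constants):
        if all(prem(assignment) for prem in premises):
            if all(clause_holds(c, t) for c in clauses for t in realized):
                return False  # countermodel: all the generalizations are true together
    return True

# Gamma_1 = {Pa & Qa, Pb, not Qc}
GAMMA1 = [
    lambda v: v["a"][P] and v["a"][Q],
    lambda v: v["b"][P],
    lambda v: not v["c"][Q],
]
ALL_P = [(P, True)]                    # forall x Px
ALL_Q = [(Q, True)]                    # forall x Qx
NOT_P_OR_Q = [(P, False), (Q, True)]   # forall x (not Px or Qx)
P_OR_NOT_Q = [(P, True), (Q, False)]   # forall x (Px or not Qx)

print(derives_dab(GAMMA1, ["a", "b", "c"], [ALL_Q]))
print(derives_dab(GAMMA1, ["a", "b", "c"], [ALL_P, NOT_P_OR_Q]))
print(derives_dab(GAMMA1, ["a", "b", "c"], [P_OR_NOT_Q]))
```

Under these assumptions the check reports that ¬∀xQx and ¬∀xPx ∨ ¬∀x(¬Px ∨ Qx) are derivable from Γ1, whereas ¬∀x(Px ∨ ¬Qx) is not. Calling `derives_dab` on each disjunct separately shows whether a derivable disjunction is also minimal.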
Where Dab(Δ1), …, Dab(Δn) are the minimal inferred Dab-formulas derived at stage s, Us(Γ) = Δ1 ∪ … ∪ Δn is the set of formulas that are unreliable at stage s. Definition 2 (Marking for reliability). Where Δ is the condition of line i, line i is marked at stage s iff Δ ∩ Us(Γ) ≠ ∅. To illustrate the marking mechanism, consider the following extension of the LIr-proof for Γ1 (marked lines are indicated by a “✓”-sign; lines 1–5 are not repeated in the proof):
6    ∀x(Px ∨ Qx)                      RC       {¬∀x(Px ∨ Qx)}      ✓
7    ∀xPx                             RC       {¬∀xPx}             ✓
8    ∀xQx                             RC       {¬∀xQx}             ✓
9    ∀x(¬Px ∨ Qx)                     RC       {¬∀x(¬Px ∨ Qx)}     ✓
10   ∀x(Px ∨ ¬Qx)                     RC       {¬∀x(Px ∨ ¬Qx)}
11   ∀x(¬Px ∨ ¬Qx)                    RC       {¬∀x(¬Px ∨ ¬Qx)}    ✓
12   ¬∀xQx                            3; RU    ∅
13   ¬∀x(¬Px ∨ ¬Qx)                   1; RU    ∅
14   ¬∀xPx ∨ ¬∀x(¬Px ∨ Qx)            3; RU    ∅
15   ¬∀x(Px ∨ Qx) ∨ ¬∀x(¬Px ∨ Qx)     3; RU    ∅
As remarked above, the generalizations derived at lines 8 and 11 are falsified by the premises, so it makes good sense to mark them and therefore consider them not derived anymore. As soon as we derive the negations of these generalizations (lines 12 and 13), Definition 2 takes care that lines 8 and 11 are marked. The generalizations derived at lines 6, 7, and 9 are not falsified by the data, yet they are marked according to Definition 2, due to the derivability of the minimal inferred Dab-disjunctions at lines 14 and 15. We know, for instance, that the generalizations derived at lines 7 and 9 cannot be upheld together: at line 14 we inferred that they are jointly incompatible in view of the premises. Definition 2 takes care that both lines 7 and 9 are marked at stage 15, since
U15(Γ1) = {¬∀xPx, ¬∀xQx, ¬∀x(Px ∨ Qx), ¬∀x(¬Px ∨ Qx), ¬∀x(¬Px ∨ ¬Qx)}   (3)
The only inferred generalization left unmarked at stage 15 is ∀x(Px ∨ ¬Qx), derived at line 10. Due to the dynamics of adaptive proofs, we cannot just take a formula to be an AL-consequence of some premise set Γ once we derived it at some stage on an unmarked line in a proof for Γ, for it may be that there are extensions of the proof in which the line in question gets marked. Likewise, we need to take into account the fact that lines marked at a stage of a proof may become unmarked at a later stage. This is taken care of by using the concept of final derivability:
Definition 3 (Final derivability). A is finally derived from Γ at line i of a finite proof stage s iff (i) A is the second element of line i, (ii) line i is not marked at stage s, and (iii) every extension of the proof in which line i is marked may be further extended in such a way that line i is unmarked. Definition 4 (Logical consequence for LIr). Γ ⊢LIr A (A is finally LIr-derivable from Γ) iff A is finally derived at a line of an LIr-proof from Γ. Given the premise set Γ1, there are no extensions of the proof above in which any of the marked lines become unmarked, nor are there extensions in which line 10 is marked and cannot be unmarked again in a further extension of the proof. Hence, by Definitions 3 and 4:
Γ1 ⊬LIr ∀xPx   (4)
Γ1 ⊬LIr ∀xQx   (5)
Γ1 ⊬LIr ∀x(Px ∨ Qx)   (6)
Γ1 ⊢LIr ∀x(Px ∨ ¬Qx)   (7)
Γ1 ⊬LIr ∀x(¬Px ∨ Qx)   (8)
Γ1 ⊬LIr ∀x(¬Px ∨ ¬Qx)   (9)
The logic LIr is non-monotonic: adding new premises may block the derivation of generalizations that were finally derivable from the original premise set. For instance, suppose that we add the premise ¬Pd ∧ Qd to Γ1. Since the extra premise provides a counterinstance to the generalization ∀x(Px ∨ ¬Qx), the latter should no longer be LIr-derivable from the new premise set. The following proof illustrates that this is indeed the case:
1    Pa ∧ Qa                          Prem     ∅
2    Pb                               Prem     ∅
3    ¬Qc                              Prem     ∅
4    ¬Pd ∧ Qd                         Prem     ∅
5    ∀x(Px ∨ Qx)                      RC       {¬∀x(Px ∨ Qx)}      ✓
6    ∀xPx                             RC       {¬∀xPx}             ✓
7    ∀xQx                             RC       {¬∀xQx}             ✓
8    ∀x(¬Px ∨ Qx)                     RC       {¬∀x(¬Px ∨ Qx)}     ✓
9    ∀x(Px ∨ ¬Qx)                     RC       {¬∀x(Px ∨ ¬Qx)}     ✓
10   ∀x(¬Px ∨ ¬Qx)                    RC       {¬∀x(¬Px ∨ ¬Qx)}    ✓
11   ¬∀xPx                            4; RU    ∅
12   ¬∀xQx                            3; RU    ∅
13   ¬∀x(¬Px ∨ ¬Qx)                   1; RU    ∅
14   ¬∀x(Px ∨ Qx) ∨ ¬∀x(¬Px ∨ Qx)     3; RU    ∅
15   ¬∀x(Px ∨ ¬Qx)                    4; RU    ∅
Line 9 is marked in view of the Dab-formula derived at line 15. There is no way to extend this proof in such a way that the line in question gets unmarked. Hence, Γ1 ∪ {¬Pd ∧ Qd} ⊬LIr ∀x(Px ∨ ¬Qx). In fact, no generalizations whatsoever are LIr-derivable from the extended premise set Γ1 ∪ {¬Pd ∧ Qd}.
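The marking step for reliability (Definition 2) is mechanical once the inferred Dab-formulas of a stage are known. The Python sketch below is a hedged illustration, not an implementation taken from the literature: it represents each abnormality by a short label naming the generalization it negates (for example "~PvQ" stands for ¬∀x(¬Px ∨ Qx)), and the names `unreliable` and `marked_reliability` are hypothetical. The data encode stage 15 of the LIr-proof from Γ1.

```python
def unreliable(dab_sets):
    """U_s: the union of the minimal sets among the inferred Dab-formulas of a stage."""
    dabs = {frozenset(d) for d in dab_sets}
    minimal = [d for d in dabs if not any(other < d for other in dabs)]
    return set().union(*minimal)

def marked_reliability(conditions, dab_sets):
    """Lines whose condition shares at least one abnormality with U_s (Definition 2)."""
    u = unreliable(dab_sets)
    return {line for line, cond in conditions.items() if set(cond) & u}

# Conditional lines 6-11 of the proof from Gamma_1, each with its condition.
CONDITIONS_15 = {
    6: {"PvQ"}, 7: {"P"}, 8: {"Q"},
    9: {"~PvQ"}, 10: {"Pv~Q"}, 11: {"~Pv~Q"},
}
# Dab-formulas inferred on the empty condition at lines 12-15.
DABS_15 = [{"Q"}, {"~Pv~Q"}, {"P", "~PvQ"}, {"PvQ", "~PvQ"}]

print(sorted(marked_reliability(CONDITIONS_15, DABS_15)))
```

On this encoding only line 10, carrying ∀x(Px ∨ ¬Qx), stays unmarked. Adding the Dab-formula {"Pv~Q"}, which corresponds to the extra premise ¬Pd ∧ Qd, marks that line as well.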
Minimal Abnormality
Different interpretations of the same set of data may lead to different views concerning which generalizations should or should not be derivable. Each such view may be driven by its own rationale, and choosing one such rationale over the other is not a matter of pure logic. For that reason, different strategies are available to adaptive logicians, each interpreting a set of data in their own sensible way, depending on the context. The reliability strategy was defined already. The minimal abnormality strategy is slightly less skeptical. Consequently, for some premise sets, generalizations may be LIm-derivable but not LIr-derivable. Like reliability, the minimal abnormality strategy comes with its marking definition. Let a choice set of Σ = {Δ1, Δ2, …} be a set that contains one element out of each member of Σ. A minimal choice set of Σ is a choice set of Σ, of which no proper subset is a choice set of Σ. Where Dab(Δ1), Dab(Δ2), … are the minimal inferred Dab-formulas derived from a premise set Γ at stage s of a proof, Φs(Γ) is the set of minimal choice sets of {Δ1, Δ2, …}. Definition 5 (Marking for minimal abnormality). Where A is the formula and Δ the condition of line i, line i is marked at stage s iff (i) there is no ϕ ∈ Φs(Γ) such that ϕ ∩ Δ = ∅, or (ii) for some ϕ ∈ Φs(Γ), there is no line at which A is derived on a condition Θ for which ϕ ∩ Θ = ∅. An example will clarify matters. Let Γ2 = {Pa ∧ Qa ∧ Ra, ¬Rb ∧ (¬Pb ∨ ¬Qb), ¬Pc ∧ ¬Qc ∧ Rc}.
1    Pa ∧ Qa ∧ Ra                     Prem     ∅
2    ¬Rb ∧ (¬Pb ∨ ¬Qb)                Prem     ∅
3    ¬Pc ∧ ¬Qc ∧ Rc                   Prem     ∅
4    ∀x(Px ∨ Qx)                      RC       {¬∀x(Px ∨ Qx)}      ✓
5    ∀x(Px ∨ Rx)                      RC       {¬∀x(Px ∨ Rx)}      ✓
6    ∀x(¬Px ∨ Rx)                     RC       {¬∀x(¬Px ∨ Rx)}     ✓
7    ¬∀x(Px ∨ Qx)                     3; RU    ∅
8    ¬∀x(Px ∨ Rx) ∨ ¬∀x(¬Px ∨ Rx)     2; RU    ∅
9    ∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx)       5; RU    {¬∀x(Px ∨ Rx)}
10   ∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx)       6; RU    {¬∀x(¬Px ∨ Rx)}
To see what is happening in this proof, we need to understand the markings. Note that there are two minimal choice sets at stage 10:
Φ10(Γ2) = {{¬∀x(Px ∨ Qx), ¬∀x(Px ∨ Rx)}, {¬∀x(Px ∨ Qx), ¬∀x(¬Px ∨ Rx)}}   (10)
Line 4 is marked in view of clause (i) in Definition 5, since its condition intersects with each minimal choice set in Φ10(Γ2). Lines 5 and 6 are marked in view of clause (ii) in Definition 5. For the minimal choice set {¬∀x(Px ∨ Qx), ¬∀x(Px ∨ Rx)}, there is no line at which ∀x(Px ∨ Rx) was derived on a condition that does not intersect with this set. Hence, line 5 is marked. Analogously, line 6 is marked because, for the minimal choice set {¬∀x(Px ∨ Qx), ¬∀x(¬Px ∨ Rx)}, there is no line at which ∀x(¬Px ∨ Rx) was derived on a condition that does not intersect with this set. Things change, however, when we turn to lines 9 and 10. In these cases, neither clause (i) nor clause (ii) of Definition 5 applies: for each of these lines, there is a minimal choice set in Φ10(Γ2) which does not intersect with the line’s condition, and for each of the sets in Φ10(Γ2), we have derived the formula ∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx) on a condition that does not intersect with it. Hence, these lines remain unmarked at stage 10 of the proof. Things would have been different if we made use of the reliability strategy, since:
U10(Γ2) = {¬∀x(Px ∨ Qx), ¬∀x(Px ∨ Rx), ¬∀x(¬Px ∨ Rx)}   (11)
In view of U10(Γ2) and Definition 2, all of lines 4–6 and 9–10 would be marked if the above proof were an LIr-proof. As with the reliability strategy, logical consequence for the minimal abnormality strategy is defined in terms of final derivability (Definition 3). A consequence relation for LIm is defined simply by replacing all occurrences of LIr in Definition 4 with LIm. Although the proof above can be extended in many interesting ways, showing the (non-)derivability of many more generalizations than those currently occurring in the proof, nothing will change in terms of final derivability with respect to the formulas derived at stage 10:
Γ2 ⊬LIm ∀x(Px ∨ Qx)   (12)
Γ2 ⊬LIm ∀x(Px ∨ Rx)   (13)
Γ2 ⊬LIm ∀x(Px ∨ ¬Rx)   (14)
Γ2 ⊢LIm ∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx)   (15)
Γ2 ⊬LIr ∀x(Px ∨ Qx)   (16)
Γ2 ⊬LIr ∀x(Px ∨ Rx)   (17)
Γ2 ⊬LIr ∀x(Px ∨ ¬Rx)   (18)
Γ2 ⊬LIr ∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx)   (19)
At the beginning of section “Minimal Abnormality”, it was mentioned that the rationale underlying the reliability strategy is slightly more skeptical than that underlying the minimal abnormality strategy. The point is illustrated by the proof for Γ2 . As we saw, the formula ∀x(P x ∨ Rx) ∨ ∀x(¬P x ∨ Rx) is LIm -derivable from Γ2 but not LIr -derivable from Γ2 .
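The marking definition for minimal abnormality (Definition 5) can be sketched in the same style. The following Python fragment is an illustration under stated assumptions rather than a reference implementation: abnormalities are again abbreviated by labels naming the negated generalization, formulas are plain strings compared for identity, and the function names are hypothetical. The data encode the conditional lines and the inferred Dab-formulas of stage 10 of the proof from Γ2.

```python
from itertools import product

def minimal_choice_sets(dab_sets):
    """Phi_s: minimal sets containing an element of every minimal inferred Dab-formula."""
    dabs = {frozenset(d) for d in dab_sets}
    minimal_dabs = [d for d in dabs if not any(o < d for o in dabs)]
    choices = {frozenset(pick) for pick in product(*minimal_dabs)}
    return {c for c in choices if not any(o < c for o in choices)}

def marked_min_abnormality(lines, dab_sets):
    """lines maps a line number to (formula, condition); returns the marked line numbers."""
    phi = minimal_choice_sets(dab_sets)
    marked = set()
    for number, (formula, cond) in lines.items():
        # Clause (i): every minimal choice set intersects the condition of the line.
        clause_i = all(c & set(cond) for c in phi)
        # Clause (ii): some minimal choice set intersects every condition
        # on which the same formula has been derived.
        conds = [set(k) for f, k in lines.values() if f == formula]
        clause_ii = any(all(c & k for k in conds) for c in phi)
        if clause_i or clause_ii:
            marked.add(number)
    return marked

DISJ = "(PvR) v (~PvR)"  # forall x (Px or Rx)  or  forall x (not Px or Rx)
LINES_10 = {
    4: ("PvQ", {"PvQ"}),
    5: ("PvR", {"PvR"}),
    6: ("~PvR", {"~PvR"}),
    9: (DISJ, {"PvR"}),
    10: (DISJ, {"~PvR"}),
}
DABS_10 = [{"PvQ"}, {"PvR", "~PvR"}]  # lines 7 and 8

print(sorted(marked_min_abnormality(LINES_10, DABS_10)))
```

On this encoding lines 4 to 6 are marked while lines 9 and 10 are not, because the disjunction has been derived on two conditions, each of which avoids one of the two minimal choice sets. Under reliability all five lines would be marked, since the union of the two Dab-formulas intersects every condition.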
More Adaptive Logics for Inductive Generalization
LI interprets the world as “uniform” by taking as normal those situations in which a generalization is true and as abnormal those situations in which a generalization is false. But of course, if uniformity is identified with the truth of every generalization in this way, the world can never be completely uniform (for the simple fact that many generalizations are incompatible and cannot be jointly true). Perhaps a more natural way to interpret the uniformity of the world is to take all objects to have the same properties: as soon as one object has property P, we try to infer that all objects have property P. This is the rationale behind the logic IL from Batens (2011). Roughly, the idea behind IL is to generalize from instances. Given an instance, the derivation of a generalization is permitted on the condition that no counterinstances are derivable. So abnormal situations are those in which both an instance and a counterinstance of a generalization are present. This is the formal definition of the set of IL-abnormalities:
ΩIL =df {∃(A1 ∨ … ∨ An) ∧ ∃¬(A1 ∨ … ∨ An) | A1, …, An ∈ A^{f1}; n ≥ 1}   (20)
The logic IL is defined by the lower limit logic CL, the set of abnormalities ΩIL, and the adaptive strategy reliability (ILr) or minimal abnormality (ILm). In an IL-proof, generalizations cannot be conditionally introduced from scratch, since an instance is required. In this respect, IL is more demanding than LI. However, it does not follow that for this reason IL is a weaker logic, since it is also more difficult to derive (disjunctions of) abnormalities in IL. A simple example will illustrate that, for many premise sets, IL is in fact stronger than LI. Consider the following IL-proof from Γ3 = {Pa, ¬Pb ∨ Qb}:
1    Pa           Prem       ∅
2    ¬Pb ∨ Qb     Prem       ∅
3    ∀xPx         1; RC      {∃xPx ∧ ∃x¬Px}
4    Qb           2,3; RU    {∃xPx ∧ ∃x¬Px}
5    ∀xQx         4; RC      {∃xPx ∧ ∃x¬Px, ∃xQx ∧ ∃x¬Qx}
In view of Pa ⊢CL ∀xPx ∨ (∃xPx ∧ ∃x¬Px), we applied RC to line 1 and conditionally inferred ∀xPx at line 3. Next, we used RU to infer Qb from this newly obtained generalization together with the premise at line 2. We now have an instance of ∀xQx, so we can conditionally infer the latter generalization, taking over the condition of line 4. Importantly, not a single disjunction of members of ΩIL is CL-derivable from Γ3. This means that there is no way to mark any of lines 3–5 in any extension of this proof, independently of which strategy we use. Consequence relations for ILr and ILm are again definable in terms of final derivability (Definition 3). All we need to do is replace all occurrences of “LIr” in Definition 4 with “ILr” and “ILm,” respectively. Hence:
Γ3 ⊢IL ∀xPx   (21)
Γ3 ⊢IL ∀xQx   (22)
Compare the IL-proof above with the following LI-proof from Γ3:
1    Pa                            Prem       ∅
2    ¬Pb ∨ Qb                      Prem       ∅
3    ∀xPx                          RC         {¬∀xPx}     ✓
4    Qb                            2,3; RU    {¬∀xPx}     ✓
5    ∀xQx                          RC         {¬∀xQx}     ✓
6    ¬∀xPx ∨ ¬∀x¬Qx                1,2; RU    ∅
7    ¬∀xQx ∨ ¬∀x(¬Px ∨ ¬Qx)        1,2; RU    ∅
Independently of the adaptive strategy used (reliability or minimal abnormality), there are no extensions of this LI-proof in which any of lines 3–5 become unmarked. Therefore:
Γ3 ⊬LI ∀xPx   (23)
Γ3 ⊬LI ∀xQx   (24)
The premise set Γ3 not only serves to show that IL is not strictly weaker than LI in terms of derivable generalizations. It also illustrates that, although in an IL-proof we generalize on the basis of instances, such an instance need not always be CL-derivable from the premise set. In the proof from Γ3, we derived the generalization ∀xQx even though no instance of this generalization is CL-derivable from Γ3. Instead, we first derived ∀xPx (of which Γ3 does provide us with an instance) and then used this generalization to infer an instance of ∀xQx. This is perfectly in line with the intuition behind IL: if deriving a generalization on the basis of an instance leads us to more instances of other generalizations, then, assuming the world to be as uniform as possible, we take the world to be uniform with respect to these other generalizations as well.
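The observation that no disjunction of IL-abnormalities is CL-derivable from Γ3 can be made plausible with a small semantic check. The Python sketch below is an illustration only and rests on one assumption spelled out here: in a model where every object has exactly the same properties, no formula has both an instance and a counterinstance, so every member of ΩIL is false, and by the soundness of CL no Dab-formula can then follow from premises that the model satisfies. The function name `uniform_types` and the encoding of premises as Python functions are hypothetical.

```python
from itertools import product

P, Q = 0, 1  # indices of the two unary predicates (an assumption of this sketch)

def uniform_types(premises, constants, n_preds=2):
    """Types t such that the premises hold when every object, named or not, has type t."""
    found = []
    for t in product([False, True], repeat=n_preds):
        valuation = {c: t for c in constants}
        if all(prem(valuation) for prem in premises):
            found.append(t)
    return found

# Gamma_3 = {Pa, not Pb or Qb}
GAMMA3 = [
    lambda v: v["a"][P],
    lambda v: (not v["b"][P]) or v["b"][Q],
]
# Gamma_1 = {Pa & Qa, Pb, not Qc}, used as a contrast case
GAMMA1 = [
    lambda v: v["a"][P] and v["a"][Q],
    lambda v: v["b"][P],
    lambda v: not v["c"][Q],
]

print(uniform_types(GAMMA3, ["a", "b"]))
print(uniform_types(GAMMA1, ["a", "b", "c"]))
```

For Γ3 the check finds a uniform model in which every object is both P and Q, which is exactly the situation described by the finally derived generalizations ∀xPx and ∀xQx. For Γ1 no uniform model exists, in line with the fact that Γ1 CL-entails the IL-abnormality ∃xQx ∧ ∃x¬Qx.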
When discussing inductive generalization, confirmation theorists often use the more fine-grained distinction between mere instances of a generalization, positive instances, and negative instances. For example, given a generalization ∀x(Px ⊃ Qx), any a such that Pa ⊃ Qa is an instance of ∀x(Px ⊃ Qx); any a such that Pa ∧ Qa is a positive instance of ∀x(Px ⊃ Qx); and any a such that Pa ∧ ¬Qa is a negative instance of ∀x(Px ⊃ Qx). Instead of requiring a mere instance before introducing a generalization, some confirmation theorists have suggested the stronger requirement for a positive instance, that is, a negative instance of the contrary generalization (see section “Interdependent Abnormalities and Heuristic Guidance”). According to this idea, interpreting the world as uniform as possible amounts to generalizing whenever a positive instance is available to us. Abnormal situations, then, are those in which both a positive and a negative instance of a generalization are available to us. There is a corresponding variant of IL that hardcodes this idea in its set of abnormalities: the logic G from Batens (2011). The latter is defined by the lower limit logic CL, the set of abnormalities ΩG, and either the reliability strategy (Gr) or the minimal abnormality strategy (Gm):
ΩG =df {∃(A1 ∧ … ∧ An ∧ A0) ∧ ∃(A1 ∧ … ∧ An ∧ ¬A0) | A0, A1, …, An ∈ A^{f1}; n ≥ 0}   (25)
In proofs to follow, ∃(A1 ∧ … ∧ An ∧ A0) ∧ ∃(A1 ∧ … ∧ An ∧ ¬A0) is abbreviated as A1 ∧ … ∧ An ∧ ±A0 (where again A0, A1, …, An ∈ A^{f1}). As an illustration of the workings of G, consider the following G-proof from Γ4 = {Pa ∧ Qa, ¬Qb, ¬Pc}:
1    Pa ∧ Qa           Prem        ∅
2    ¬Qb               Prem        ∅
3    ¬Pc               Prem        ∅
4    ∀x(Px ⊃ Qx)       1; RC       {Px ∧ ±Qx}
5    ∀x(Qx ⊃ Px)       1; RC       {Qx ∧ ±Px}
6    ∀x(Px ≡ Qx)       4,5; RU     {Px ∧ ±Qx, Qx ∧ ±Px}
7    ∃xPx ∧ ∃x¬Px      1, 3; RU    ∅
8    ∃xQx ∧ ∃x¬Qx      1, 2; RU    ∅
The formulas derived at lines 4–6 are finally G-derivable in the proof. Since G-consequence too is defined in terms of final derivability, it follows, independently of the strategy used, that
Γ4 ⊢G ∀x(Px ⊃ Qx)   (26)
Γ4 ⊢G ∀x(Qx ⊃ Px)   (27)
Γ4 ⊢G ∀x(Px ≡ Qx)   (28)
Now consider the following IL-proof from Γ4 (where A1, …, An ∈ A^{f1}, !(A1 ∨ … ∨ An) abbreviates ∃(A1 ∨ … ∨ An) ∧ ∃¬(A1 ∨ … ∨ An)):
1    Pa ∧ Qa                        Prem        ∅
2    ¬Qb                            Prem        ∅
3    ¬Pc                            Prem        ∅
4    ∀x(Px ⊃ Qx)                    1; RC       {!(¬Px ∨ Qx)}                  ✓
5    ∀x(Qx ⊃ Px)                    1; RC       {!(¬Qx ∨ Px)}                  ✓
6    ∀x(Px ≡ Qx)                    4,5; RU     {!(¬Px ∨ Qx), !(¬Qx ∨ Px)}     ✓
7    !Px                            1, 3; RU    ∅
8    !Qx                            1, 2; RU    ∅
9    !(Px ∨ Qx) ∨ !(¬Px ∨ Qx)       1, 2; RU    ∅
10   !(¬Qx ∨ Px) ∨ !(Px ∨ Qx)       1, 3; RU    ∅
11   !(¬Px ∨ ¬Qx)                   1, 2; RU    ∅
The minimal inferred Dab-formulas inferred at lines 7–11 will remain minimal in any extension of this proof (none of the disjuncts of any of the formulas derived at lines 9 or 10 is separately derivable). Accordingly, the marks in this proof will not change. Hence, independently of the strategy used:
Γ4 ⊬IL ∀x(Px ⊃ Qx)   (29)
Γ4 ⊬IL ∀x(Qx ⊃ Px)   (30)
Γ4 ⊬IL ∀x(Px ≡ Qx)   (31)
Two more remarks are in order. First, the example above suggests that G is in general stronger than IL. This is correct for the minimal abnormality strategy but false for the reliability strategy. An illustration is provided by the premise set Γ5 = {Pa, Qb, Rb, Qc, ¬Rc}. The generalization ∀x(¬Px ⊃ Qx) cannot be inferred on the condition ¬Px ∧ ±Qx, since we lack a positive instance. It can be inferred on the conditions ±Qx or ±Px in view of ∀xQx ⊢CL ∀x(¬Px ⊃ Qx) and ∀xPx ⊢CL ∀x(¬Px ⊃ Qx), but none of these conditions are reliable in view of the derivability of minimal Dab-formulas like ±Px ∨ (Px ∧ ±Rx) and ±Qx ∨ (Qx ∧ ±Px) ∨ (Px ∧ ±Rx). The situation is different in an ILr-proof, where deriving ∀x(¬Px ⊃ Qx) on the condition !(Px ∨ Qx) in a proof from Γ5 is both possible and final. That is, for every derivable Dab-formula in which !(Px ∨ Qx) occurs, we can derive a shorter (minimal) disjunction of abnormalities in which it no longer occurs. Summing up:
Γ5 ⊬Gr ∀x(¬Px ⊃ Qx)   (32)
Γ5 ⊢ILr ∀x(¬Px ⊃ Qx)   (33)
The second remark is that the requirement for a positive instance before generalizing in a G-proof is still insufficient to guarantee that for every G-derivable generalization, a positive instance is CL-derivable from the premises. The following proof from P a illustrates the point:
1    Pa                Prem     ∅
2    ∀xPx              1; RC    {±Px}
3    ∀x(Qx ⊃ Px)       2; RU    {±Px}
Independently of the strategy used, no means are available to mark line 3, hence Pa ⊢G ∀x(Qx ⊃ Px), even though no positive instance of ∀x(Qx ⊃ Px) is available. More on this point below (see the discussion on Hempel’s raven paradox in section “I-Confirmation and Hempel’s Adequacy Conditions” and in the Appendix). A total of six logics have been presented so far: the logics LIr, LIm, ILr, ILm, Gr, and Gm. Each of these systems interprets the claim that the world is uniform in a slightly different way, leading to slightly different logics. Importantly, there is no Carnapian embarrassment of riches here: each of the systems has a clear intuition behind it. The systems presented here can be combined so as to implement Popper’s suggestion that more general hypotheses should be given precedence over less general ones (Popper, 1959). For instance, if two generalizations ∀x(Px ⊃ Qx) and ∀x((Rx ∧ Sx) ⊃ Tx) are jointly incompatible with the premises, a combined system gives precedence to the more general hypothesis and delivers only ∀x(Px ⊃ Qx) as a consequence. There are various ways to hard-code this idea, resulting in various new combined adaptive logics for inductive generalization, each slightly different from the others. These combinations are not fully spelled out here. For a brief synopsis, see Batens (2011, Sec. 5).
Qualitative Inductive Generalization and Confirmation Inductive logic and confirmation theory overlap to some extent. As early as 1943, Hempel noted that the development of a logical theory of confirmation might be regarded as a contribution to the field of inductive logic (Hempel, 1943, p. 123). Following Carnap and Popper’s influential work on inductive logic and corroboration, respectively, many of the existing criteria of confirmation are quantitative in nature, measuring the degree of confirmation of a hypothesis by the evidence, possibly taking into account auxiliary hypotheses and background knowledge. Here, the logics defined in the previous two sections are presented as qualitative criteria of confirmation and are related to other qualitative models of confirmation. Quantitative criteria of confirmation are not considered. For Carnap’s views on inductive logic, see Carnap (1950). For Popper’s, see Popper (1959). For introductions to inductive logic and probabilistic measures of confirmation, see, e.g., Fitelson (2005), Hájek and Hall (2002), Jeffrey (1990), and Skyrms (1986). Let I be any adaptive logic for inductive generalization defined in one of the previous sections. (All remarks on I-confirmation readily generalize to the combined systems from Batens 2011, Sec. 5.) Where H is the hypothesis and Γ contains the evidence, I-confirmation is defined in terms of I-consequence:
Definition 6 (I-confirmation). Γ I-confirms H iff Γ ⊢_I H. Γ I-disconfirms H iff Γ ⊢_I ¬H. Γ is I-neutral with respect to H iff Γ ⊬_I H and Γ ⊬_I ¬H.

This definition of I-confirmation has the virtue of simplicity and formal precision. The two main qualitative alternatives to I-confirmation are Hempel’s satisfaction criterion and the hypothetico-deductive model of confirmation. In section “I-Confirmation and Hempel’s Adequacy Conditions”, I-confirmation is compared to Hempel’s adequacy conditions, which serve as a basis for his satisfaction criterion. In section “I-Confirmation and the Hypothetico-Deductive Model”, I-confirmation is compared to hypothetico-deductive confirmation. Section “Interdependent Abnormalities and Heuristic Guidance” concerns the use of the criteria from Definition 6 as heuristic tools for hypothesis generation and confirmation.
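The three verdicts of Definition 6 depend only on whether H and ¬H are I-consequences of Γ, so they can be sketched as a small wrapper around a consequence oracle. The following Python sketch is illustrative only: the names `i_verdicts`, `negate`, and `toy_derives` are hypothetical, formulas are treated as opaque strings, and the oracle is a lookup table that merely records the consequence claimed in (34). It does not implement an adaptive proof theory.

```python
from typing import Callable, Dict, FrozenSet, Iterable

Formula = str  # formulas are opaque strings in this sketch
Oracle = Callable[[FrozenSet[Formula], Formula], bool]


def negate(h: Formula) -> Formula:
    """Return the syntactic negation of h."""
    return f"¬({h})"


def i_verdicts(gamma: Iterable[Formula], h: Formula, derives: Oracle) -> Dict[str, bool]:
    """Apply Definition 6, given an oracle for I-consequence."""
    premises = frozenset(gamma)
    confirms = derives(premises, h)
    disconfirms = derives(premises, negate(h))
    return {
        "confirms": confirms,
        "disconfirms": disconfirms,
        "neutral": not confirms and not disconfirms,
    }


# Stand-in oracle (assumption): a table holding only the consequence stated in (34).
GAMMA_34 = frozenset({"Pa", "Qa", "¬Pb", "¬Qb", "Pc"})
KNOWN_CONSEQUENCES = {(GAMMA_34, "∀x(Px ⊃ Qx)")}


def toy_derives(premises: FrozenSet[Formula], h: Formula) -> bool:
    """Look up a recorded consequence; a real oracle would decide final derivability."""
    return (premises, h) in KNOWN_CONSEQUENCES


print(i_verdicts(GAMMA_34, "∀x(Px ⊃ Qx)", toy_derives))
```

Passing the oracle in as a parameter keeps the definition independent of which logic I is chosen, which mirrors the way Definition 6 is stated for an arbitrary I.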
I-Confirmation and Hempel’s Adequacy Conditions

Let an observation report consist of a set of molecular sentences (sentences containing no free variables or quantifiers). According to Hempel, the following conditions should be satisfied by any adequate criterion for confirmation (Hempel, 1945b):

(1) Entailment condition: Any sentence which is entailed by an observation report is confirmed by it.
(2) Consequence condition: If an observation report confirms every one of a class K of sentences, then it also confirms any sentence which is a logical consequence of K.
    (a) Special consequence condition: If an observation report confirms a hypothesis H, then it also confirms every consequence of H.
    (b) Equivalence condition: If an observation report confirms a hypothesis H, then it also confirms every hypothesis which is logically equivalent to H.
(3) Consistency condition: Every logically consistent observation report is logically compatible with the class of all the hypotheses which it confirms.

If “logical consequence” is taken to be CL-consequence, as Hempel did, then I-confirmation satisfies conditions (1)–(3) no matter which adaptive logic for inductive generalization is used, due to I’s closure under CL. So all of the resulting criteria of confirmation meet Hempel’s adequacy conditions. (For (3) the further property of “smoothness” or “reassurance” is required, from which it follows that the I-consequence set of consistent premise sets is consistent as well. See Batens (2007, Sec. 6).)

The definition of Hempel’s own criterion requires some preparation (the formal presentation of Hempel’s criterion is taken from Sprenger 2013). An atomic formula
A is relevant to a formula B iff there is some model M of B such that: if M′ differs from M only in the value assigned to A, M′ is not a model of B. The domain of a formula A is the set of individual constants that occur in the atomic formulas that are relevant for A. The development of a universally quantified formula A for another formula B is the restriction of A to the domain of B, that is, the truth value of A is evaluated with respect to the domain of B. For instance, the domain of Pa ∧ (Pb ∨ Qc) is {a, b, c}, whereas the domain of Pa ∧ Qa is {a}, and the development of ∀x(Px ⊃ Qx) for Pa ∧ ¬Qb is (Pa ⊃ Qa) ∧ (Pb ⊃ Qb).

Definition 7 (Hempel’s satisfaction criterion). An observation report E directly confirms a hypothesis H if E entails the development of H for E. An observation report E confirms a hypothesis H if H is entailed by a class of sentences, each of which is directly confirmed by E. An observation report E disconfirms a hypothesis H if it confirms the denial of H. An observation report E is neutral with respect to a hypothesis H if E neither confirms nor disconfirms H.

There are two reasons for arguing that Hempel’s satisfaction criterion is too restrictive and two reasons for arguing that it is too liberal. Each of these is discussed in turn.

First, in order for the evidence to confirm a hypothesis H according to Hempel’s criterion, all objects in the development of H must be known to be instances of H. This is a very strong requirement. I-confirmation is different in this respect. For instance,

Pa, Qa, ¬Pb, ¬Qb, Pc ⊢_I ∀x(Px ⊃ Qx)    (34)
In (34) it is unknown whether c instantiates the hypothesis ∀x(Px ⊃ Qx), since the premises do not tell us whether Pc ⊃ Qc. The development of ∀x(Px ⊃ Qx) entails Pc ⊃ Qc, whereas the premise set of (34) does not. So the hypothesis ∀x(Px ⊃ Qx) is not directly confirmed by these premises according to the satisfaction criterion, nor is it entailed by one or more sentences which are directly confirmed by them. Therefore, the satisfaction criterion judges the premises to be neutral with respect to the hypothesis ∀x(Px ⊃ Qx), whereas (34) illustrates that ∀x(Px ⊃ Qx) is I-confirmed by these premises.

Second, given the law ∀x(Px ⊃ Rx), the report {Pa, Qa, Pb, Qb} does not confirm the hypothesis ∀x(Rx ⊃ Qx) according to Hempel’s original formulation of the satisfaction criterion. The reason is that “auxiliary hypotheses” like ∀x(Px ⊃ Rx) contain quantifiers and therefore cannot be elements of observation reports. (The original formulation of Hempel’s criterion can, however, be adjusted so as to take into account background knowledge, see, e.g., Fitelson and Hawthorne 2010; Sprenger 2011b.) For problems related to auxiliary hypotheses, see also section “I-Confirmation and the Hypothetico-Deductive Model”. For now, it suffices to note that the criteria from Definition 6 do not face this problem, as quantified
formulas are perfectly allowed to occur in premise sets. For instance, the set {Pa, Qa, Pb, Qb, ∀x(Px ⊃ Rx)} I-confirms the hypothesis ∀x(Rx ⊃ Qx):

Pa, Qa, Pb, Qb, ∀x(Px ⊃ Rx) ⊢_I ∀x(Rx ⊃ Qx)    (35)
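The verdict of the satisfaction criterion on the premises of (34) can be checked mechanically for the simple case at hand. The Python sketch below is a minimal illustration under stated assumptions: the evidence is a set of ground literals, the hypothesis has the form ∀x(Px ⊃ Qx), atoms are encoded as a one-letter predicate followed by a constant (for example "Pa"), and the function name `directly_confirms` is hypothetical. It covers only direct confirmation in the sense of Definition 7, not the closure under entailment.

```python
from itertools import product
from typing import Dict


def directly_confirms(evidence: Dict[str, bool], ante: str, cons: str) -> bool:
    """Check whether literal evidence entails the development of ∀x(ante x ⊃ cons x)."""
    # Domain: every constant mentioned in the evidence.
    constants = sorted({atom[1:] for atom in evidence})
    atoms = sorted({pred + c for pred in (ante, cons) for c in constants})
    undecided = [a for a in atoms if a not in evidence]
    # The evidence entails the development iff no completion of the
    # undecided atoms falsifies an instance ante(c) ⊃ cons(c).
    for values in product([False, True], repeat=len(undecided)):
        valuation = {**evidence, **dict(zip(undecided, values))}
        if any(valuation[ante + c] and not valuation[cons + c] for c in constants):
            return False
    return True


# Premises of (34): Pa, Qa, ¬Pb, ¬Qb, Pc.
E34 = {"Pa": True, "Qa": True, "Pb": False, "Qb": False, "Pc": True}
print(directly_confirms(E34, "P", "Q"))       # Pc ⊃ Qc is not entailed

# Dropping Pc removes the undecided instance.
E_small = {"Pa": True, "Qa": True, "Pb": False, "Qb": False}
print(directly_confirms(E_small, "P", "Q"))
```

The only completion that matters for (34) is the one in which Qc is false; since the premises leave Qc open, the development is not entailed.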
It seems, then, that I-confirmation is not too restrictive a criterion for confirmation. However, there are two senses in which I-confirmation, like Hempelian confirmation, can be said to be too liberal. The first has to do with Goodman’s well-known new riddle of induction (Goodman, 1955). The family of adaptive logics for inductive generalization makes no distinction between regularities that are “projectible” and regularities that are not. Using Goodman’s famous example, let an emerald be grue if it is green before January 1, 2020, and blue thereafter. Then the fact that all hitherto observed emeralds are grue confirms the hypothesis that all emeralds are grue. The latter regularity is not projectible into the future, as we do not seriously believe that in 2020 we will start observing blue emeralds. Nonetheless, it is perfectly fine to define a predicate denoting the property of being grue, just as it is perfectly fine to define a predicate denoting the property of being green. Yet the hypothesis “all emeralds are green” is projectible, whereas “all emeralds are grue” is not. The problem of formulating precise rules for determining which regularities are projectible and which are not is difficult and important, but it is an epistemological problem that cannot be solved by purely logical means. Consequently, it falls outside the scope of this article. See Goodman (1955) for Goodman’s formulation and proposed solution of the problem and Stalker (1994) for a collection of essays on the projectibility of regularities.

Finally, one may argue that I-confirmation is too liberal on the basis of Hempel’s own raven paradox. Where Ra abbreviates that a is a raven and Ba abbreviates that a is black, a nonblack non-raven I-confirms the hypothesis that all ravens are black:

¬Ba, ¬Ra ⊢_I ∀x(Rx ⊃ Bx)    (36)
Even the logic G does not block this inference. The reason is that we are given a positive instance of the generalization ∀x(¬Bx ⊃ ¬Rx), so we can derive this generalization on the condition ∃x(¬Bx ∧ ¬Rx) ∧ ∃x(¬Bx ∧ Rx). As the generalization ∀x(¬Bx ⊃ ¬Rx) is G-derivable from the premises, so is the logically equivalent hypothesis that all ravens are black, ∀x(Rx ⊃ Bx) (remember that G, like all logics defined in the previous section, is closed under CL). Hempel’s own reaction to the raven paradox was to bite the bullet and accept its conclusion (Hempel, 1945a). According to Hempel, a nonblack non-raven indeed confirms the raven hypothesis in case we did not know beforehand that the bird in question is not a raven. For example, if we observe a gray bird resembling a raven, then finding out that it was a crow confirms the raven hypothesis (Sprenger, 2013). But as pointed out in Fitelson and Hawthorne (2010), this defense is insufficient. Even in cases in which it is known that a nonblack bird is not a raven, the bird in question, although irrelevant to the raven hypothesis, still confirms it.
370
M. Beirlaen
If – like Hempel – one accepts its conclusion, the raven paradox poses no further problems for I-confirmation. Those who disagree are referred to the Appendix, where a relatively simple adaptive alternative to G-confirmation is defined which blocks the paradox by means of a nonmaterial conditional invalidating the inference from “all nonblack objects are non-ravens” to “all ravens are black.”
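The step that drives the paradox is the CL-equivalence of ∀x(Rx ⊃ Bx) and ∀x(¬Bx ⊃ ¬Rx). As a sanity check, the following Python sketch verifies the equivalence by brute force over every interpretation of R and B on a three-element domain. The domain size and all function names are assumptions made for illustration; a finite check of this kind is not a proof for arbitrary domains.

```python
from itertools import product

DOMAIN = ("a", "b", "c")  # illustrative finite domain


def implies(p: bool, q: bool) -> bool:
    """Material implication p ⊃ q."""
    return (not p) or q


def all_ravens_black(R, B) -> bool:
    """Evaluate ∀x(Rx ⊃ Bx) on DOMAIN."""
    return all(implies(R[x], B[x]) for x in DOMAIN)


def all_nonblack_nonraven(R, B) -> bool:
    """Evaluate ∀x(¬Bx ⊃ ¬Rx) on DOMAIN."""
    return all(implies(not B[x], not R[x]) for x in DOMAIN)


def interpretations():
    """Yield every assignment of extensions to R and B over DOMAIN."""
    for bits in product([False, True], repeat=2 * len(DOMAIN)):
        R = dict(zip(DOMAIN, bits[: len(DOMAIN)]))
        B = dict(zip(DOMAIN, bits[len(DOMAIN):]))
        yield R, B


equivalent = all(
    all_ravens_black(R, B) == all_nonblack_nonraven(R, B)
    for R, B in interpretations()
)
print(equivalent)
```

A nonblack non-raven is a positive instance of the contraposed form (¬Ba ∧ ¬Ra) but not of the original form (Ra ∧ Ba), which is why closure under CL carries confirmation from one to the other.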
I-Confirmation and the Hypothetico-Deductive Model

If a hypothesis predicts an event which is observed at a later time, or if it subsumes a given observation report as a consequence of one of its postulates, then this counts as evidence in favor of the hypothesis. The hypothetico-deductive model of confirmation (HD confirmation) is an attempt to formalize this basic intuition according to which a piece of evidence confirms a hypothesis if the latter entails the evidence. In its standard formulation, HD confirmation also takes into account auxiliary hypotheses. Let Δ be a set of background information distinct from the evidence E.

Definition 8 (HD-confirmation). E HD-confirms H relative to Δ iff
(i) {H} ∪ Δ is consistent,
(ii) {H} ∪ Δ entails E ({H} ∪ Δ ⊢ E), and
(iii) Δ alone does not entail E (Δ ⊬ E).

The intuitive difference between HD confirmation and Hempelian confirmation becomes concrete if HD confirmation is compared with Hempel’s adequacy conditions from section “I-Confirmation and Hempel’s Adequacy Conditions”. Let H abbreviate “Black swans exist,” let E consist of a black swan, and let Δ be the empty set. Then, according to Hempel’s entailment condition, H is confirmed by E, since E ⊢ H. Not so according to HD confirmation, for condition (ii) of Definition 8 is violated (H ⊬ E) (Sprenger, 2011a). The same example illustrates how HD confirmation violates the following condition, which holds for the satisfaction criterion in view of Definition 7 (Crupi, 2014):

(4) Complementarity condition: E confirms H iff E disconfirms ¬H.

The consequence condition too is clearly invalid for HD confirmation. For instance, Ra ⊃ Ba HD-confirms ∀x(Rx ⊃ Bx), but it does not HD-confirm the weaker hypothesis ∀x(Rx ⊃ (Bx ∨ Cx)), since ∀x(Rx ⊃ (Bx ∨ Cx)) ⊬ Ra ⊃ Ba.

An advantage of HD confirmation is that it fares better with the raven paradox.
The observation of a black raven (Ra, Ba) is not deducible from the raven hypothesis ∀x(Rx ⊃ Bx), so black ravens do not in general confirm the raven hypothesis. But birds that are known to be ravens do confirm the raven hypothesis once it is established that they are black. Once it is known that an object is a raven, the observation that it is black is entailed by this knowledge together with the
hypothesis (∀x(Rx ⊃ Bx), Ra ⊢ Ba). Likewise, a nonblack non-raven does not generally confirm the raven hypothesis. Only objects that are known to be nonblack can confirm the hypothesis by establishing that they are not ravens. In formulas: ∀x(Rx ⊃ Bx), ¬Ba ⊢ ¬Ra.

HD confirmation faces a number of standard objections, of which three are discussed here. The first is the problem of irrelevant conjunctions and disjunctions. In view of Definition 8, it is easily checked that whenever E HD-confirms a hypothesis H relative to Δ, it also HD-confirms H′ = H ∧ K for any arbitrary K consistent with Δ. Thus, adding arbitrary conjuncts to confirmed hypotheses preserves confirmation. Dually, adding arbitrary disjuncts to the data likewise preserves confirmation. That is, whenever E HD-confirms H relative to Δ, so does E′ = E ∨ F for any arbitrary F. Various solutions have been proposed for dealing with such problems of irrelevancy, but as so often the devil is in the details (see Sprenger (2011b) for a nice overview and further references). For present purposes, it suffices to say that I-confirmation is not threatened by problems of irrelevance. Clearly, if the evidence E I-confirms a hypothesis H, it does not follow that it I-confirms H ∧ K for some arbitrary K consistent with Δ, since from {E} ∪ Δ ⊢_I H it need not follow that {E} ∪ Δ ⊢_I H ∧ K. Nor does it follow that E ∨ F I-confirms H relative to Δ, since from {E} ∪ Δ ⊢_I H it need not follow that {E ∨ F} ∪ Δ ⊢_I H.

A second objection against HD confirmation concerns the inclusion of background information in Definition 8. In general, this inclusion is an advantage, since evidence often does not (dis)confirm a hypothesis simpliciter. Rather, evidence (dis)confirms hypotheses with respect to a set of auxiliary (background) assumptions or theories. The vocabulary of a theory often extends beyond what is directly observable.
Notwithstanding Hempel’s conviction to the contrary, nowadays philosophers largely agree that the use of purely theoretical terms is both intelligible and necessary in science (Putnam, 1965). Making the confirmation relation relative to a set of auxiliaries allows for the inclusion of bridging principles connecting observation terms with theoretical terms, permitting purely theoretical hypotheses to be confirmed by pure observation statements (Glymour, 1980). However, making confirmation relative to background assumptions makes HD vulnerable to a type of objection often traced back to Duhem (1906) and Quine (1951). Suppose that a hypothesis H entails an observation E relative to Δ and that E is found to be false. Then either (a) H is false or (b) a member of Δ is false. But the evidence does not tell us which of (a) or (b) is the case, so we always have the option to retain H and blame some auxiliary hypothesis in the background information. More generally, one may object that what gets (dis)confirmed by observations is not a hypothesis taken by itself but the conjunction of a hypothesis and a set of background assumptions or theories. With Elliott Sober, we can counter such holistic objections by pointing to the different epistemic status of hypotheses under test and auxiliary hypotheses (or hypotheses used in a test). Auxiliaries are independently testable, and when used in an experiment, we already have good reasons to think of these hypotheses as true. Moreover, they are epistemically independent of the test outcome. So if a
hypothesis is disconfirmed by the HD criterion, we can, in the vast majority of cases, maintain that it is the hypothesis we need to retract and not one of the background assumptions (Sober, 1999). A parallel point can be made concerning I-confirmation. Here too, we can add to the premises a set Δ of auxiliary or background assumptions. And here too, we can use Sober’s defense against objections from evidential holism. A nice feature of I-confirmation is that in adaptive proofs the weaker epistemic status of hypotheses inferred from an observation report in conjunction with a set of auxiliaries is reflected by their non-empty condition. Whereas auxiliaries are introduced as premises on the empty condition, inductively generated hypotheses are derived conditionally and may be retracted at a later stage of the proof. For a more fine-grained treatment of background information in adaptive logics for inductive generalization, see Batens (2011, Sec. 6). The third objection against HD confirmation dates back to Hempel’s (1945b), in which he argued that a variant of HD confirmation (which he calls the “prediction criterion” of confirmation) is circular. The problem is that in HD confirmation the hypothesis to be confirmed functions as a premise from which we derive the evidence and that it is unclear where this premise comes from. The hypothesis is not generated but given in advance, so HD confirmation presupposes the prior attainment – by inductive reasoning – of a hypothesis. This inductive move, Hempel argues, already presupposes the idea of confirmation, making the HD account circular. The weak step in Hempel’s argument consists in his assumption that the inductive jump to the original attainment of a hypothesis already presupposes the confirmation of this hypothesis. In testing or generating a hypothesis, we need not yet believe or accept it. Typically, belief and acceptance come only after confirming the hypothesis. 
Indeed, in probabilistic notions of confirmation, the idea is often exactly this: confirming a hypothesis amounts to increasing our degree of belief in it. Hempel’s circularity objection, it seems, confuses hypothesis generation and hypothesis confirmation. Hempel’s circularity objection does not undermine HD confirmation, but it points to the wider scope of the adaptive account as compared to HD confirmation. In an I-proof, the conditional rule allows us to generate hypotheses. Hypotheses are not given in advance but are computable by the logic itself. Moreover, a clear distinction can be made between hypothesis generation and hypothesis confirmation. Hypotheses generated in an I-proof may be derivable at some stage of the proof, but the central question is whether they can be retained – whether they are finally derivable. I-confirmation, then, amounts to final derivability in an I-proof, whereas the inductive step of hypothesis generation is represented by retractable applications of RC.
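The three clauses of Definition 8, and the problem of irrelevant conjunctions and disjunctions discussed above, can be made concrete at the propositional level. The Python sketch below is a minimal illustration under stated assumptions: the raven hypothesis is replaced by its single ground instance Ra ⊃ Ba, formulas are encoded as Python functions on truth-value assignments, entailment is decided by truth tables, and all names (`hd_confirms`, `K`, `F`, and so on) are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List

Valuation = Dict[str, bool]
Formula = Callable[[Valuation], bool]

ATOMS = ["Ra", "Ba", "K", "F"]  # K and F are arbitrary irrelevant atoms


def valuations():
    """Yield every truth-value assignment to ATOMS."""
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def consistent(formulas: List[Formula]) -> bool:
    return any(all(f(v) for f in formulas) for v in valuations())


def entails(premises: List[Formula], conclusion: Formula) -> bool:
    return all(conclusion(v) for v in valuations() if all(p(v) for p in premises))


def hd_confirms(e: Formula, h: Formula, delta: List[Formula]) -> bool:
    """Clauses (i)-(iii) of Definition 8, propositional case."""
    return (consistent([h] + delta)
            and entails([h] + delta, e)
            and not entails(delta, e))


h = lambda v: (not v["Ra"]) or v["Ba"]   # ground instance Ra ⊃ Ba
e = lambda v: v["Ba"]                    # evidence: a is black
delta = [lambda v: v["Ra"]]              # background: a is a raven

h_tacked = lambda v: h(v) and v["K"]     # H ∧ K
e_weakened = lambda v: e(v) or v["F"]    # E ∨ F

print(hd_confirms(e, h, delta))          # a known raven found to be black
print(hd_confirms(e, h, []))             # without the background Ra
print(hd_confirms(e, h_tacked, delta))   # irrelevant conjunct
print(hd_confirms(e_weakened, h, delta)) # irrelevant disjunct
```

In this toy setting the tacked-on conjunct K and the added disjunct F both leave the verdict unchanged, which is the irrelevance problem in miniature.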
Interdependent Abnormalities and Heuristic Guidance

For any of the adaptive logics for inductive generalization defined in this paper, at most one positive instance is needed to try and derive and, subsequently, confirm a generalization for a given set of premises. This is a feature that I-confirmation
shares with the other qualitative criteria of confirmation. As a simple illustration, note that an observation report consisting of a single observation Pa confirms the hypothesis ∀xPx according to all qualitative criteria discussed in this paper. Proponents of quantitative approaches to confirmation may object that this is insufficient, that a stronger criterion is needed which requires more than one instance for a hypothesis to be confirmed. Against this view, one can uphold that confirmation is mainly falsification-driven. Rather than confirming hypotheses by heaping up positive instances, we try and test them by searching for negative instances. In the remainder of this section, it is argued by means of a number of examples that I-confirmation is sufficiently selective as a criterion for confirming generated hypotheses. The examples moreover allow for the illustration of an additional feature of I-confirmation: its use as a heuristic guide for provoking further tests in generating and confirming additional hypotheses.

Simple examples like the one given in the previous paragraph may suggest that, in the absence of falsifying instances, a single instance usually suffices to I-confirm a hypothesis. This is far from the truth. Consider the simple premise set Γ6 = {¬Pa ∨ Qa, ¬Qb, Pc}. This premise set contains instances of all of the generalizations ∀xPx, ∀x¬Qx, and ∀x(Px ⊃ Qx). Not a single one of these is IL-confirmed, however, due to the derivability of the following disjunctions of abnormalities:

!Px ∨ !Qx    (37)
!Px ∨ !(¬Px ∨ Qx)    (38)
!(Px ∨ Qx) ∨ !(¬Px ∨ Qx)    (39)
!Qx ∨ !(¬Px ∨ Qx)    (40)
!(¬Px ∨ Qx) ∨ !(¬Px ∨ ¬Qx)    (41)
Note that Γ6 contains positive instances of both ∀xPx and ∀x¬Qx, so not even a positive instance suffices for a generalization to be finally IL-derivable in the absence of falsifying instances. The same is true if we switch from IL to G. None of ∀xPx, ∀x¬Qx, or ∀x(Px ⊃ Qx) is G-confirmed, due to the derivability of the following disjunctions of abnormalities:

±Px ∨ ±Qx    (42)
±Px ∨ (Px ∧ ±Qx)    (43)
±Qx ∨ (Qx ∧ ±Px)    (44)
The reason for the non-confirmation of generalizations like ∀xPx, ∀x¬Qx, or ∀x(Px ⊃ Qx) in this example has to do with the dependencies that exist between abnormalities. Even if a generalization is not falsified by the data, it is often the case that this generalization is not compatible with a different generalization left unfalsified by the data. As a further illustration, consider the premise set Γ7 = {¬Ra, ¬Ba, Rb}. Again, although no falsifying instance is present, the
generalization ∀x(Rx ⊃ Bx) is not IL-derivable. The reason is the derivability of the following minimal disjunction of abnormalities:

!(¬Rx ∨ Bx) ∨ !(¬Rx ∨ ¬Bx)    (45)
Examples like these illustrate that I-confirmation is not too liberal a criterion of confirmation. They also serve to illustrate a different point. Minimal Dab-formulas like (45) evoke questions. Which of the two abnormalities is the case? For this particular premise set, establishing which of Bb or ¬Bb is the case would settle the matter. If Bb were the case, then the second disjunct of (45) would be derivable, and (45) would no longer be minimal. Consequently, the abnormality ∃x(¬Rx ∨ Bx) ∧ ∃x¬(¬Rx ∨ Bx) would no longer be part of a minimal disjunction of abnormalities, and the generalization ∀x(Rx ⊃ Bx) would become finally derivable. Analogously, if ¬Bb were the case, then the first disjunct of (45) would become derivable, and, by the same reasoning, the generalization ∀x(Rx ⊃ ¬Bx) would become finally derivable. Thus:

Γ7 ∪ {Bb} ⊢_IL ∀x(Rx ⊃ Bx)    (46)
Γ7 ∪ {¬Bb} ⊢_IL ∀x(Rx ⊃ ¬Bx)    (47)
Two more comments are in order here. First, this example illustrates that confirming a hypothesis often involves the disconfirmation of the contrary hypothesis. We saw that if we use Hempel’s criterion a nonblack non-raven confirms the raven hypothesis. But as Goodman pointed out, “the prospects for indoor ornithology vanish when we notice that under these same conditions, the contrary hypothesis that no ravens are black is equally well confirmed” (Goodman, 1955, p. 71). Thus, according to Goodman, confirming the raven hypothesis ∀x(Rx ⊃ Bx) requires disconfirming its contrary ∀x(Rx ⊃ ¬Bx). This is exactly what happens in the example: in order to IL-derive ∀x(Rx ⊃ Bx), a falsifying instance for its contrary is needed, as (46) illustrates. Goodman’s suggestion that the confirmation of a hypothesis requires the falsification/disconfirmation of its contrary was picked up by Israel Scheffler, who developed it further in his (Scheffler, 1963). Note that falsifying the contrary of the raven hypothesis amounts to finding a positive instance of the raven hypothesis. Thus, in demanding a positive instance before permitting generalization in a G-proof, the latter system goes further than IL in implementing Goodman’s idea. As we saw, however, not even G goes all the way: a generalization may be G-derivable even in the absence of a positive instance. Second, if empirical (observational or experimental) means are available to answer questions like ?{Bb, ¬Bb} in the foregoing example, these questions may be called tests (Batens, 2005). Adaptive logics for inductive generalization provide heuristic guidance in the sense that interdependencies between abnormalities evoke such tests. Importantly, further tests may lead to the derivability of new generalizations. In the example, deciding the question ?{Bb, ¬Bb} in favor of
Bb leads to the confirmation of ∀x(Rx ⊃ Bx) and to the disconfirmation of ∀x(Rx ⊃ ¬Bx), while deciding it in favor of ¬Bb leads to the confirmation of ∀x(Rx ⊃ ¬Bx) and to the disconfirmation of ∀x(Rx ⊃ Bx). This is an important practical advantage of I-confirmation over other qualitative criteria: adaptive logics for inductive generalization evoke tests for increasing the number of confirmed generalizations.

The illustrations so far may suggest that this heuristic guidance provided by I-confirmation only applies to hypotheses that are logically related or closely connected, like the raven hypothesis and its contrary. But the point is more general, as the following example illustrates. Consider the premise set Γ8 = {Pa, Qa, ¬Ra, ¬Pb, ¬Qb, Rb, Pc, Rc, Qd, ¬Pe}. Despite the fact that Γ8 contains positive instances of the generalizations ∀x(Px ⊃ Qx) and ∀x(Rx ⊃ ¬Qx), and despite the fact that these generalizations are not falsified by Γ8, neither of them is IL-derivable, due to the derivability of the disjunction

!(¬Px ∨ Qx) ∨ !(¬Rx ∨ ¬Qx)    (48)
By the same reasoning as in the previous illustration, Γ8 evokes the question ?{Qc, ¬Qc}. If this question is a test (if it can be answered by empirical means), the answer will confirm one of the generalizations ∀x(Px ⊃ Qx) and ∀x(Rx ⊃ ¬Qx) and will disconfirm the other generalization (Batens, 2005). The example generalizes. In LI and G too, the derivability of ∀x(Px ⊃ Qx) and ∀x(Rx ⊃ ¬Qx) is blocked due to the CL-derivability of the LI-minimal Dab-formula (49) and the G-minimal Dab-formula (50), respectively:

¬∀x(Px ⊃ Qx) ∨ ¬∀x(Rx ⊃ ¬Qx)    (49)
(Px ∧ ±Qx) ∨ (Rx ∧ ±Qx)    (50)

Here too, deciding the question ?{Qc, ¬Qc} resolves the matter. Thus, where I ∈ {LI, IL, G}:

Γ8 ⊬_I ∀x(Px ⊃ Qx)    (51)
Γ8 ⊬_I ∀x(Rx ⊃ ¬Qx)    (52)
Γ8 ∪ {Qc} ⊢_I ∀x(Px ⊃ Qx)    (53)
Γ8 ∪ {Qc} ⊬_I ∀x(Rx ⊃ ¬Qx)    (54)
Γ8 ∪ {¬Qc} ⊬_I ∀x(Px ⊃ Qx)    (55)
Γ8 ∪ {¬Qc} ⊢_I ∀x(Rx ⊃ ¬Qx)    (56)
For some concrete heuristic rules applicable to the logic LI, see Batens (2006).
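The way rival generalizations evoke a test such as ?{Qc, ¬Qc} can be imitated by a simple search over the data. The Python sketch below is only a rough heuristic under stated assumptions, not the adaptive proof theory and not the heuristic rules of Batens (2006): it looks for an object that satisfies the antecedents of two generalizations with contrary consequents while the consequent atom is still undecided. The encoding of generalizations as triples and the names `evoked_tests` and `falsified` are hypothetical.

```python
from typing import Dict, List, Tuple

Facts = Dict[str, bool]       # "Pa": True stands for Pa, "Pb": False for ¬Pb
Gen = Tuple[str, str, bool]   # ("P", "Q", True) encodes ∀x(Px ⊃ Qx),
                              # ("R", "Q", False) encodes ∀x(Rx ⊃ ¬Qx)


def constants(facts: Facts) -> List[str]:
    return sorted({atom[1:] for atom in facts})


def falsified(facts: Facts, gen: Gen) -> bool:
    """True iff some object has the antecedent but the opposite of the consequent."""
    ante, cons, polarity = gen
    return any(
        facts.get(ante + c, False) and facts.get(cons + c) == (not polarity)
        for c in constants(facts)
    )


def evoked_tests(facts: Facts, g1: Gen, g2: Gen) -> List[str]:
    """Atoms whose truth value would decide between two rival generalizations."""
    (a1, c1, p1), (a2, c2, p2) = g1, g2
    if c1 != c2 or p1 == p2:
        return []  # the consequents are not contraries
    return [
        c1 + c
        for c in constants(facts)
        if facts.get(a1 + c, False) and facts.get(a2 + c, False) and (c1 + c) not in facts
    ]


GAMMA8: Facts = {"Pa": True, "Qa": True, "Ra": False, "Pb": False, "Qb": False,
                 "Rb": True, "Pc": True, "Rc": True, "Qd": True, "Pe": False}
G1: Gen = ("P", "Q", True)    # ∀x(Px ⊃ Qx)
G2: Gen = ("R", "Q", False)   # ∀x(Rx ⊃ ¬Qx)

print(evoked_tests(GAMMA8, G1, G2))
with_qc = {**GAMMA8, "Qc": True}
print(falsified(with_qc, G1), falsified(with_qc, G2))
```

Applied to Γ7 with the raven hypothesis and its contrary, the same search singles out Bb. Whether the unfalsified rival is then finally derivable is a matter for the logic itself, as in (53) and (56); the sketch only locates the question.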
Conclusions

A number of adaptive logics for inductive generalization were presented, each of which, it was argued, can be reinterpreted as a criterion of confirmation. The logics in question can be classified along two dimensions. The first dimension concerns when it is permitted to introduce a generalization in an adaptive proof. The logic LI permits the free introduction of generalizations. IL and G require instances of a generalization before introducing it in a proof. Interestingly, these stronger requirements do not result in stronger logics. The second dimension along which the logics defined in this paper can be classified concerns their adaptive strategy. Here, no surprises arise. A logic defined using the reliability strategy is in general weaker than its counterpart logic defined using the minimal abnormality strategy (this was shown to be the case for all adaptive logics defined within the SF; see Batens 2007, Theorem 11).

When reinterpreted as criteria of confirmation, the logics defined here withstand the comparison with their main rivals, i.e., Hempel’s satisfaction criterion and the hypothetico-deductive model of confirmation. In conclusion, the adaptive confirmation criteria defined in section “Qualitative Confirmation” offer an interesting alternative perspective on (qualitative) confirmation theory.
Appendix: Blocking the Raven Paradox?

If a formalism defined in terms of CL is overly permissive, a good strategy to remedy this problem is to add further criteria of validity or relevance. For instance, in order to avoid problems of irrelevant conjunctions and disjunctions, hypothetico-deductivists may impose further demands on HD confirmation (see, e.g., Gemes 1993, 1998; Schurz 1991, 1994). A similar strategy could be adopted with respect to I-confirmation and the raven paradox. In this appendix, an alternative adaptive logic of induction, IC, is defined, as is a corresponding criterion of confirmation which is slightly less permissive than the criteria from section “Qualitative Confirmation”. IC makes use of a nonclassical conditional resembling a number of conditionals originally defined in order to avoid the so-called paradoxes of material implication. First, an extension of CL is introduced, including this new conditional connective. Next, the adaptive logic IC is defined.

The new conditional, “→,” is fully characterized by the following rules and axiom schemas (rules are written as premises / conclusion):

A, (A → B) / B    (MP)
A ≡ B / (A → C) ≡ (B → C)    (RCEA)
A ≡ B / (C → A) ≡ (C → B)    (RCEC)
(A → (B ∧ C)) ≡ ((A → B) ∧ (A → C))    (D∧)
((A ∨ B) → C) ≡ ((A → C) ∧ (B → C))    (D∨)
((RCEA), (RCEC), and (D∧) fully characterize the conditional of Chellas’s logic CR from Chellas (1975). The latter was also used for capturing explanatory conditionals in Beirlaen and Aliseda (2014). See also Priest (2008, Ch. 5) for some closely related conditional logics, including an extension of Chellas’s systems that validates (MP).)

Let CL→ be the logic resulting from adding “→” to the language of CL and from adding (MP)–(D∨) to the list of rules and axioms of CL. Note that the conditional “→” is strictly stronger than “⊃”:

(A → B) ⊃ (A ⊃ B)    (57)
(By (MP), A, (A → B) ⊢_CL→ B. By the deduction theorem for “⊃,” A → B ⊢_CL→ A ⊃ B. By the deduction theorem again, ⊢_CL→ (A → B) ⊃ (A ⊃ B).) In view of this bridging principle between both conditionals, it is easily seen that counterinstances to a formula of the form ∀x(A(x) ⊃ B(x)) are also counterinstances to ∀x(A(x) → B(x)) and falsify the latter formula as well. For instance, if Pa ∧ ¬Qa, then, by CL, ¬∀x(Px ⊃ Qx), and, by (57), ¬∀x(Px → Qx).

The adaptive logic IC is fully characterized by the lower limit logic CL→, the set of abnormalities

Ω_IC =df {∃(A1 ∧ ... ∧ An ∧ A0) ∧ ¬∀((A1 ∧ ... ∧ An) → A0) | A0, A1, ..., An ∈ 𝒜^f1; n ≥ 0},    (58)

and the adaptive strategy reliability (ICr) or minimal abnormality (ICm). IC is defined within the SF. All rules and definitions for its proof theory are as for the other logics defined in this paper, except that in the definition of RU and RC, CL is replaced with CL→. The following proof illustrates how formulas are derived conditionally in IC:

1   ¬Ra               Prem       ∅
2   ¬Ba               Prem       ∅
3   ∀x(¬Bx → ¬Rx)     1,2; RC    {∃x(¬Bx ∧ ¬Rx) ∧ ¬∀x(¬Bx → ¬Rx)}

Given only the premises ¬Ra and ¬Ba, there is no possible extension of this proof in which line 3 gets marked. Hence:

¬Ra, ¬Ba ⊢_IC ∀x(¬Bx → ¬Rx)    (59)
However, contraposition is invalid for the new conditional →; hence, we cannot derive the raven hypothesis from the formula derived at line 3. Note also that, in view of (60), we cannot use the conditional rule RC to derive ∀x(Rx → Bx) on the condition {∃x(Rx ∧ Bx) ∧ ¬∀x(Rx → Bx)} in an IC-proof, since

¬Ra, ¬Ba ⊬_CL→ ∀x(Rx → Bx) ∨ (∃x(Rx ∧ Bx) ∧ ¬∀x(Rx → Bx))    (60)

Therefore:

¬Ra, ¬Ba ⊬_IC ∀x(Rx → Bx)    (61)
Thus, if conditional statements of the form “for all x, if A(x) then B(x)” are taken to be IC-confirmed only if the conditional in question is an arrow (→) instead of a material implication, then the raven paradox, in its original formulation, is blocked.

An additional property of IC is that strengthening the antecedent fails for “→.” In section “More Adaptive Logics”, for instance, we saw that

Pa ⊢_G ∀x(Qx ⊃ Px)    (62)

In IC, (62) still holds for the material implication but not for the new conditional. In an IC-proof from Pa, we can still derive ∀xPx on the condition {∃xPx ∧ ∃x¬Px}, and since IC extends CL, it still follows that ∀x(Qx ⊃ Px):

Pa ⊢_IC ∀xPx    (63)
Pa ⊢_IC ∀x(Qx ⊃ Px)    (64)

However, since ∀xPx ⊬_CL→ ∀x(Qx → Px), and since we do not have any further means to conditionally derive the formula ∀x(Qx → Px) in an IC-proof:

Pa ⊬_IC ∀x(Qx → Px)    (65)
Originally, the logics in the G-family were constructed as logics requiring a positive instance before we are allowed to apply RC. This is reflected in the definition of the set of G-abnormalities. In order to derive a formula like ∀x(P x ⊃ Qx) on its corresponding condition, a positive instance, e.g., P a ∧ Qa, is needed. Examples like (36) and (62) show, however, that such a positive instance is not always required in order to G-derive a generalization. The logic IC, it seems, does much better in this respect. However, it still does not fully live up to the requirement for a positive instance before generalizing, as the following IC-proof from Γ9 = {¬Ra ∧ ¬Ba, Rb, Bc} illustrates (where A0 , A1 , . . . , An ∈ A f 1 , †((A1 ∧. . .∧An ) → A0 ) abbreviates ∃(A1 ∧. . .∧An ∧A0 )∧¬∀((A1 ∧. . .∧An ) → A0 )).
1   ¬Ra ∧ ¬Ba          Prem        ∅
2   Rb                 Prem        ∅
3   Bc                 Prem        ∅
4   ∀x(¬Bx → ¬Rx)      1; RC       {†(¬Bx → ¬Rx)}
5   Bb                 2,4; RU     {†(¬Bx → ¬Rx)}
6   ∀x(Rx → Bx)        2,5; RC     {†(Rx → Bx), †(¬Bx → ¬Rx)}
The key step in this proof is the derivation of Bb at line 5, which together with Rb provides us with a positive instance of the raven hypothesis. Bb is derivable from lines 2 and 4 in view of CL and (57). Except for the formulas ∃xRx ∧ ∃x¬Rx and ∃xBx ∧ ∃x¬Bx, no minimal Dab-formulas are CL→-derivable from Γ9. Therefore,
Γ9 ⊢_{IC} ∀x(Rx → Bx)    (66)
As (61) illustrates, the logic IC avoids the raven paradox in its original formulation. A possible drawback of IC is that it does not fully meet the demand for a positive instance when confirming a hypothesis (see also section “Interdependent Abnormalities and Heuristic Guidance”). It is left open whether it is possible and desirable to further extend IC so as to fully meet this demand.
Acknowledgements
The author is greatly indebted to Atocha Aliseda, Cristina Barés-Gómez, Diderik Batens, Matthieu Fontaine, Jan Sprenger, and Frederik Van De Putte for insightful and valuable comments on previous drafts of this paper. Research for this article was supported by the Programa de Becas Posdoctorales de la Coordinación de Humanidades of the National Autonomous University of Mexico (UNAM), by the project “Logics of discovery, heuristics and creativity in the sciences” (PAPIIT, IN400514-3) granted by the UNAM, and by the Sofja Kovalevskaja award program of the Alexander von Humboldt-Stiftung.
References
Batens, D. (2004 (appeared 2005)). The basic inductive schema, inductive truisms, and the research-guiding capacities of the logic of inductive generalization. Logique et Analyse, 185–188, 53–84.
Batens, D. (2006). On a logic of induction. Logic and Philosophy of Science, 4(1), 3–32.
Batens, D. (2007). A universal logic approach to adaptive logics. Logica Universalis, 1, 221–242.
Batens, D. (2009). Towards a dialogic interpretation of dynamic proofs (pp. 27–51). College Publications.
Batens, D. (2011). Logics for qualitative inductive generalization. Studia Logica, 97, 61–80.
Batens, D., & Haesaert, L. (2001 (appeared 2003)). On classical adaptive logics of induction. Logique et Analyse, 173–175, 255–290.
Beirlaen, M., & Aliseda, A. (2014). A conditional logic for abduction. Synthese, 191(15), 3733–3758.
Carnap, R. (1950). Logical foundations of probability. University of Chicago Press.
Chellas, B. (1975). Basic conditional logic. Journal of Philosophical Logic, 4, 133–153.
Crupi, V. (2014). Confirmation. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/spr2014/entries/confirmation/
Duhem, P. (1991 (first published 1906)). The aim and structure of physical theory. Princeton University Press.
Fitelson, B. (2005). Inductive logic (pp. 384–394). Routledge.
Fitelson, B., & Hawthorne, J. (2010). How Bayesian confirmation theory handles the paradox of the ravens (pp. 247–275). Springer.
Gemes, K. (1993). Hypothetico-deductivism, content, and the natural axiomatization of theories. Philosophy of Science, 60(3), 477–487.
Gemes, K. (1998). Hypothetico-deductivism: The current state of play. Erkenntnis, 49, 1–20.
Glymour, C. (1980). Theory and evidence. Princeton University Press.
Goodman, N. (1955). Fact, fiction, and forecast. Harvard University Press.
Hájek, A., & Hall, N. (2002). Induction and probability (pp. 149–172). Blackwell.
Hempel, C. (1943). A purely syntactical definition of confirmation. Journal of Symbolic Logic, 8(4), 122–143.
Hempel, C. (1945a). Studies in the logic of confirmation I. Mind, 54(213), 1–26.
Hempel, C. (1945b). Studies in the logic of confirmation II. Mind, 54(214), 97–121.
Jeffrey, R. (1990). The logic of decision (2nd ed.). University of Chicago Press.
Norton, J. (2005). A little survey of induction (pp. 9–34). Johns Hopkins University Press.
Popper, K. (1959). The logic of scientific discovery. Hutchinson. English translation; originally written in German in 1935.
Priest, G. (2008). An introduction to non-classical logic (2nd ed.). Cambridge University Press.
Putnam, H. (1965). Craig’s theorem. The Journal of Philosophy, 62(10), 251–260.
Quine, W. (1951). Two dogmas of empiricism. Philosophical Review, 60, 20–43.
Scheffler, I. (1963). The anatomy of inquiry. Alfred A. Knopf.
Schurz, G. (1991). Relevant deduction. Erkenntnis, 35, 391–437.
Schurz, G. (1994). Relevant deduction and hypothetico-deductivism: A reply to Gemes. Erkenntnis, 41, 183–188.
Skyrms, B. (1986). Choice and chance. An introduction to inductive logic (3rd ed.). Wadsworth Publishing Company.
Sober, E. (1999). Testability. Proceedings and Addresses of the American Philosophical Association, 73(2), 47–76.
Sprenger, J. (2011a). Hempel and the paradoxes of confirmation (pp. 231–260). Elsevier.
Sprenger, J. (2011b). Hypothetico-deductive confirmation. Philosophy Compass, 6/7, 497–508.
Sprenger, J. (2013). A synthesis of Hempelian and hypothetico-deductive confirmation. Erkenntnis, 78(4), 727–738.
Stalker, D. (Ed.). (1994). Grue! The new riddle of induction. Open Court Publishing Company.
Van De Putte, F., & Straßer, C. (2014). Adaptive logics: A parametric approach. Logic Journal of the IGPL, 22(6), 905–932.
Modeling Hypothetical Reasoning by Formal Logics
18
Tjerk Gauderis
Contents
The Feasibility of the Project
Advantages and Drawbacks
Four Patterns of Hypothetical Reasoning
Abductive Reasoning and Adaptive Logics
The Problem of Multiple Explanatory Hypotheses
The Standard Format of Adaptive Logics
LA^r_s: A Logic for Practical Singular Fact Abduction
MLA^s_s: A Logic for Theoretical Singular Fact Abduction
Conclusions
References
Abstract
This paper discusses the extent to which hypothetical reasoning can be modeled by formal logics. It starts by exploring this idea in general, which leads to the conclusion that a more fine-grained classification of reasoning patterns is needed first. Next, the formal framework of adaptive logics, which has proven successful in capturing some of these patterns, is described, and some of the specific problems for this approach are discussed. The paper concludes by presenting two logics for hypothetical reasoning in an informal way, so that readers without technical training can get a flavor of how formal methods can be used to describe hypothetical reasoning.
T. Gauderis, R&D Into the Trees, Gent, Belgium. © Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_29
Keywords
Abduction · Hypothesis formation · Adaptive logics · Defeasible reasoning
The Feasibility of the Project
To an outsider, the claim that hypothetical or abductive reasoning, i.e., the act of forming and suggesting hypotheses for certain observations or puzzling facts, can be modeled by means of formal logics might sound as outlandish as claiming that computers have the same cognitive and creative abilities as humans. After all, abductive or hypothetical reasoning – by which in this paper is always meant the reasoning towards (explanatory) hypotheses, not starting from certain (possibly counterfactual) hypotheses (Rescher, 1964) – is often considered to be the hallmark of creative ingenuity, leading to our rich and wide diversity of ideas, innovations, and scientific theories. It seems impossible that this richness can be reconstructed or created using only formal tools, which are by nature abstracted from the specific semantic content. This argument, in short the creativity excludes logic argument, is the main reason why even the field itself is sharply divided between believers and non-believers.
This argument, however, is a straw man. Nobody would argue for the claim that hypothetical reasoning can be modeled by means of formal logics along these lines. What is argued for in this paper and the various sources it cites is the more modest claim that certain aspects and forms of hypothetical reasoning can be modeled with the aid of formal systems that are specifically suited for this task. There are three important ways in which this modest claim differs from the straw man that is attacked by the creativity excludes logic argument. First, it is not implied that the logics that are used are classical or deductive logics. Second, abductive reasoning is not a monolithic concept: it does not consist of a single method or procedure but consists of many different patterns; formal logics are only used to capture one specific and precisely defined pattern at a time.
Third, the relation between formal logics and abductive reasoning is not one of agent and activity (i.e., formal logics do not themselves display abductive reasoning as humans do) but one of model and target: formal logics are used by (human) agents to model and – to a certain extent – to simulate certain aspects of human abductive reasoning. The semantic content that the formal logics lack is provided by these agents.
In the remainder of this introduction, the type of logics that are suitable for modeling abductive or hypothetical reasoning is discussed in further detail, and the framework that is used for the logics in this paper is introduced in general terms. For those who are still a bit suspicious about how abductive or hypothetical reasoning patterns can be modeled using formal logics, it needs to be stressed that none of these patterns is claimed to be a valid inference in classical logic or any other (non-trivial) deductive logic. To model defeasible reasoning steps such as hypothesis formation, one has to use non-monotonic logics: logics for which an extension of a premise set does not always yield a consequence set that is a superset of the original consequence set. Or, put more simply, logics according to which new information may lead one to revoke old conclusions.
It is important to note that the purpose in using logics for this task is not the classical purpose of the discipline of logic. Classically, the discipline of logic studies the correct way to infer further knowledge from already known facts. The correct way should guarantee the truth of the new facts, under the supposition that the old facts are true. Accordingly, this has motivated the search for the right (deductive) logic (whether it be classical logic or another one such as intuitionistic logic). The purpose here, however, is to model or explicate human reasoning patterns. As these patterns are fallible, leading to conclusions that are not necessarily true even if the premises are assumed to be true, it should be possible to revoke previously derived results, hence the use of non-monotonic logics. Also, because there are many patterns of human reasoning, it is natural to conceive of a plenitude of logics in order to describe them.
Let me explain this a bit more formally. A logic L can be considered as a function from the power set of the sentences of a language to itself. So, given a language ℒ and the set W of its well-formed formulas:
L : ℘(W) → ℘(W)    (1)
Hence, a logic determines for every set of sentences (or premise set) Γ which sentences can be inferred from it (or belong to its consequence set):
Cn_L(Γ) =df L(Γ)    (2)
Put more simply, a reasoning pattern is nothing more than the inference of some statements given some initial statements. Thus, in principle, a logic can be devised to model any reasoning pattern in science. If this pattern can be formally described, description by a formal logic is in principle possible. It has to be added, though, that in reality, scientific and human reasoning include not only sentences or propositions but also direct observations, sketches, and various other symbolic representations. Yet for the purpose of modeling particular reasoning patterns, those sources can be generally represented by suitable propositions.
Deductive logics, such as classical logic (CL), have the property of monotonicity, i.e., for all premise sets Γ and Γ′:
Cn_L(Γ) ⊆ Cn_L(Γ ∪ Γ′)    (3)
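As an illustration only, the view of a logic as a function from premise sets to consequence sets, together with property (3), can be sketched in a few lines of Python. Everything named here (the atoms, the rules, and the functions `cn_deductive`, `cn_defeasible`, and `is_monotonic`) is a hypothetical toy introduced for this sketch; in particular, the default used in `cn_defeasible` is not an adaptive logic, only a minimal example of an operator for which (3) fails.

```python
from itertools import combinations

# Hypothetical toy vocabulary and rules; none of this comes from the chapter.
ATOMS = ("bird", "penguin")
RULES = (
    (frozenset({"penguin"}), "bird"),    # penguins are birds
    (frozenset({"bird"}), "has_wings"),  # birds have wings
)


def cn_deductive(premises):
    """Consequence set obtained by forward chaining over RULES (monotonic)."""
    closure = set(premises)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= closure and head not in closure:
                closure.add(head)
                changed = True
    return frozenset(closure)


def cn_defeasible(premises):
    """Adds a toy default: a bird is taken to fly unless it is a known penguin."""
    closure = set(cn_deductive(premises))
    if "bird" in closure and "penguin" not in closure:
        closure.add("can_fly")
    return frozenset(closure)


def is_monotonic(cn, atoms=ATOMS):
    """Check Cn(G) ⊆ Cn(G ∪ G') for every pair of premise sets over `atoms`."""
    premise_sets = [
        frozenset(combo)
        for size in range(len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]
    return all(
        cn(g) <= cn(g | extra) for g in premise_sets for extra in premise_sets
    )


print(is_monotonic(cn_deductive))   # True
print(is_monotonic(cn_defeasible))  # False
```

The second check fails because the premise set {bird} yields the conclusion `can_fly`, while the extended premise set {bird, penguin} does not: new information revokes an old conclusion.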
Most patterns of human reasoning, however, do not meet this criterion. For instance, if an agent infers a hypothesis, she is well aware that it might need to be revoked on closer consideration of the available background knowledge or in light of new information. Although non-monotonic reasoning has typically received less attention in the field of logic than monotonic reasoning, various frameworks for defeasible reasoning and non-monotonic logics are available such as default logic, adaptive
logics, and belief revision (see Koons (2014) for a general overview of the variation in approaches). In this article, the progress that has been made on modeling abduction within the adaptive logics framework is surveyed. This is a framework created by Batens over the past three decades (see Straßer (2013) or Batens (2007) for an extensive overview and thorough formal introduction). This framework for devising non-monotonic logics has some advantages that make it well suited to the project of modeling abductive reasoning patterns.
First, the focus in the adaptive logics program is, in contrast with other approaches to non-monotonic reasoning, on proof theory. For these logics, a dynamic proof style has been defined in order to mimic to a certain extent actual human reasoning patterns. More particularly, these dynamic proofs display the two forms of revoking previously derived results that can also be found in human reasoning: revoking old conclusions on closer consideration of the available evidence (internal dynamics) and revoking them in light of new information (external dynamics). One should not be misled, however, by this idea of dynamic proofs into thinking that the consequence set of adaptive logics for a certain premise set depends on the proof. Adaptive logics are proper proof-invariant logics that assign to each premise set Γ exactly one consequence set Cn_L(Γ).
Second, over the years, a solid meta-theory has been built for this framework, which guarantees that if an adaptive logic is created according to certain standards (the so-called standard format), many important metatheoretical properties are generically proven. This creates an opportunity for projects such as this to focus almost exclusively on the application of these formal methods without having to worry too much about proving their metatheoretical characteristics.
Finally, as the framework is presented as a unified framework for non-monotonic logics, it has been applied in many different contexts. Over the years, adaptive logics have been devised for, apart from abduction, paraconsistent reasoning, induction, argumentation, deontic reasoning, etc. (Most of these applications have been studied at the Centre for Logic and Philosophy of Science (Ghent University). At this Centre’s website, http://www.clps.ugent.be, many references can be found to papers in various contexts. The reference works mentioned earlier, Straßer (2013) and Batens (n.d., Adaptive Logics and Dynamic Proofs. Mastering the Dynamics of Reasoning with Special Attention to Handling Inconsistency, unpublished manuscript), also give a good overview of the various applications.)
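The idea that a dynamic proof can revoke previously derived results may be easier to picture with a deliberately simplified sketch. The Python below assumes a toy marking rule: a line derived on a condition counts as revoked (marked) as soon as some member of that condition has itself been derived on the empty condition. The class names, the string formulas, and the label `abnormal(q)` are all hypothetical, and the sketch does not reproduce the marking definitions of any particular adaptive logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProofLine:
    formula: str
    # Formulas that must NOT turn out to be derivable for this line to stand.
    condition: frozenset = frozenset()


class DynamicProof:
    """Toy dynamic proof: lines can be marked (revoked) at a later stage."""

    def __init__(self):
        self.lines = []

    def add(self, formula, condition=()):
        self.lines.append(ProofLine(formula, frozenset(condition)))

    def established(self):
        """Formulas derived on the empty condition."""
        return {line.formula for line in self.lines if not line.condition}

    def is_marked(self, line):
        """A line is marked once a member of its condition is established."""
        return bool(line.condition & self.established())

    def conclusions(self):
        """Formulas of all lines that are currently unmarked."""
        return [line.formula for line in self.lines if not self.is_marked(line)]


proof = DynamicProof()
proof.add("p")                             # premise
proof.add("q", condition={"abnormal(q)"})  # hypothesis, derived conditionally
print(proof.conclusions())                 # ['p', 'q']
proof.add("abnormal(q)")                   # abnormality derived unconditionally
print(proof.conclusions())                 # ['p', 'abnormal(q)']
```

In this toy, internal and external dynamics differ only in where the last line comes from: it is internal if `abnormal(q)` follows from premises that were already present, and external if it follows only after a new premise has been added.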
Advantages and Drawbacks
Explicating patterns of hypothesis formation by means of formal logics has a clear advantage: by reducing patterns to their formal and structural essence, an insight into the pattern’s precise conditions and applications is gained that is hard to achieve purely by studying different cases. Another great advantage of the formal explication of human reasoning patterns is that it allows for the possibility to provide artificially intelligent agents (which in general lack the human capacity for context awareness unless it is explicitly
provided) with formal patterns to simulate human reasoning. In the case of hypothesis formation, this possibility has presently already found applications in the artificial intelligence subfields of abduction (diagnosis), planning, and machine learning (see Paul (2000) for an overview). The method of explicating patterns of hypothesis formation by means of formal adaptive logics also has certain drawbacks, however. First, formal logics are expressed in terms of a formal language, in which not all elements of human reasoning processes can be represented. This leads inevitably to certain losses. A very obvious example is that in general only propositions can be represented in logics. That means that all observations, figures, or other symbolic representations must be reduced to descriptions of them. A more important example in the case of abduction is the implication relation. The adaptive logics framework that is used in this chapter is, certainly for ampliative logics such as those for abduction or induction, largely built around the use of a classical material implication (mostly to keep things sufficiently simple). As a result of this, all relations between a hypothesis and the observations that led to their formation (their triggers) are modeled by material implications. It is clear that this is a strong reduction of the actual richness of such relations. Hypotheses do not have to imply their triggers: they can also just be correlated with them or be probabilistically likely; or the relation can be much more specific, as in the case of an explanatory or causal relation. This issue is relevant beyond the field of adaptive logics. Paul (2000, p. 36) has claimed that most approaches to abduction use a material implication that is implicitly interpreted as some kind of explanatory or causal relation. See also Beirlaen & Aliseda (2014) for an attempt to better capture the explanatory conditional. 
Second, if one sets out to model actual historical human reasoning processes by means of dynamic logical proofs (as the adaptive framework allows us to do), one quickly finds that it is no easy task to boil down those actual processes to the microstructure of their individual reasoning steps. As human agents often combine individual steps and seldom take note of each individual step, this type of models always contains an aspect of simulation. Human reasoning also does not proceed linearly step by step as proofs do: it contains circular motions, off-topic deviations, and irrational connections that cannot be captured by formal logics. Therefore, models of such reasoning processes are always to a great extent idealized. Natural languages are also immensely more complex than any formal language can aspire to be. Therefore, models of human reasoning are unavoidably simplifications. Furthermore, as formal logics state everything explicitly, any modeler of human reasoning has to simplify deliberately the actual cases, only to achieve a certain degree of comprehensibility. Altogether, it is clear that formal models of human reasoning processes are, in fact, only models: they contain abstractions, simulations, simplifications, and idealizations. And although these techniques are the key characteristics of models, such as those used in science, it is not always easy to evade the criticism that formal logics can only handle toy examples.
Third, certain patterns of creative hypothesis formation, i.e., those that introduce the hypothetical existence of new concepts, cannot be modeled by first-order logics. They seem to require at least the use of second-order logics, something of which the adaptive logics framework is, at present, not capable. Fourth, as one is here purely concerned with hypothesis formation and not with hypothesis selection, formal methods will generate sets of possible hypotheses that may grow exponentially in relation to the growth of the agent’s background knowledge. It is clear that this also poses a limit to the application of these methods to real-world problems. Finally, one might question the normativity of this project (and more generally of the adaptive logics program). By aiming to describe actual human reasoning processes, this branch of logics appears to put a descriptive ideal first, which contrasts sharply with the strongly normative ideals in the field of logic in general. The standard answer to this question is that adaptive logics attempt to provide both: on the one hand, they aim to describe actual reasoning patterns; on the other, once these patterns are identified, they aim to prescribe how these patterns should be rationally applied. Yet this does not answer how the trade-off between these two goals of description and normativity should be conceived: is it better to have a large set of logics that is able to describe virtually any pattern actually found in human reasoning, or should one keep this set trimmed and qualify most actual human reasoning as failing to accord with the highest normative standards? Therefore, it remains a legitimate criticism that the goals of description and prescription cannot be so easily joined: how their trade-off should be dealt with needs further theoretical underpinning.
Four Patterns of Hypothetical Reasoning
The quest to characterize abduction under a single schema was abandoned around 1980. The main reasons were that such attempts (e.g., Hanson’s (1958; 1961) proposal to call abduction “the logic of discovery”) often did not provide much detailed guidance for actual discovery processes and that even these general attempts always captured only a part of the discovery process (e.g., Inference to the Best Explanation, which was first emphasized by Harman (1965), describes only the selection of hypotheses, not their formation). Around the same time, research from different fields such as philosophy of science based on historical cases, artificial intelligence, and cognitive science resulted in a new consensus that there is a plenitude of patterns, heuristics, and methods of discovery, which are open to normative guidance, yet this guidance might be content-, subject-, or context-dependent (Nickles, 1980; Simon, 1973). Various authors in the literature on abduction have tried to provide classifications of various patterns of abduction (Thagard, 1988; Schurz, 2008; Hoffmann, 2010).
For an overview of several logic-based characterizations of abductions, see also the introductory chapter of this section. Although these attempts differ slightly, some general patterns clearly stand out. Below, the author’s personal interpretation of these major general patterns is given. The main reason this deviates from the previous classifications is that an attempt is made to simplify the rather prolific classifications, yet provide a sufficient basis for formal modeling. This is possible because it is not attempted to give a fully exhaustive list or a list the elements of which are mutually exclusive. The only purpose was to give a simple list as a basis that covers most instances of abductive reasoning and can serve as the basis of formal modeling. Before the classification of these major patterns found in abductive reasoning is given, it is important to note that abductive inferences form explanatory hypotheses for observed facts using the agent’s background beliefs (or knowledge). Therefore, these patterns have the structure of the inference of a hypothesis (HYP) from some observed facts (OBS) and a part of the agent’s background beliefs (or knowledge) (BBK). These latter are, apart from toy examples, typically more than a few factual statements and often encompass a whole explanatory framework of (shared) assumptions and knowledge that provides the explanatory link between hypothesis and observations (see Gauderis and Van De Putte (2012) for an elaborate discussion of the role of an explanatory framework in logical approaches to explanation). In line with the Fregean tradition, factual statements are considered as statements of a concept with regard to one or more objects (or a logical combination of such statements). For instance, the statement “There was a civil war in France in 1789” can be analyzed as the concept “a country in civil war” applied to or with regard to the object “France in 1789.” A fact is a true factual statement. 
As such, concepts can also be considered as the class of all objects (or tuples of objects) for which the concept with regard to that object (or tuple of objects) is a fact. An observed fact is a factual statement describing an agent’s observation that she considers to be true. This can be broadly conceived to include also, for instance, a graph or a table of measurements in an article. Together, the observed facts form the trigger for the agent. In this semiformal description of these patterns, that p should be considered as a hypothesis is expressed by using a formulation of the form “It might be that p”; beliefs and observed facts can be expressed simply by stating their content. Concepts such as “a country in civil war” or “a bipedal hominid” are denoted by uppercase letters (typically F for observed, factual concepts and E for explanatory concepts) and objects such as “France in 1789” or “Lucy” by lowercase letters such as x or y. A finite set or list of (related) objects or concepts can then be expressed by, e.g., x1 … xn or F1 … Fn, where generally n ≥ 1 (hence, including the possibility of a single object or concept; the other case is indicated by n ≥ 2). Finally, that a concept applies to certain objects will be indicated by the phrase “with regard to.”
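The (OBS), (BBK), (HYP) structure and the “with regard to” convention described above can be mirrored in a small data structure. The following Python sketch is only one possible rendering under simplifying assumptions: the class and function names (`Fact`, `Explains`, `abduce`, `render`) are hypothetical, the explanatory framework is reduced to a label, and a hypothesis is formed whenever some background belief links an explanatory concept to an observed one. The observed concept “angled knee joint” and the framework label are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A concept applied to one or more objects: F with regard to x1 ... xn."""
    concept: str
    objects: tuple


@dataclass(frozen=True)
class Explains:
    """Background belief: `explanans` with regard to some objects explains
    `explanandum` with regard to those objects, within `framework`."""
    explanans: str
    explanandum: str
    framework: str


def abduce(observations, background):
    """Form a hypothesis for each observation that a background belief explains."""
    hypotheses = []
    for observed in observations:
        for belief in background:
            if belief.explanandum == observed.concept:
                hypotheses.append(Fact(belief.explanans, observed.objects))
    return hypotheses


def render(hypothesis):
    """Express a hypothesis in the 'It might be that ...' formulation."""
    objects = ", ".join(hypothesis.objects)
    return f"It might be that '{hypothesis.concept}' with regard to {objects}"


# Hypothetical example data.
observations = [Fact("angled knee joint", ("Lucy",))]
background = [Explains("bipedal hominid", "angled knee joint", "comparative anatomy")]
for hypothesis in abduce(observations, background):
    print(render(hypothesis))
# It might be that 'bipedal hominid' with regard to Lucy
```

Note that the sketch only forms hypotheses; it does not select among them, and it never revokes a hypothesis once formed.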
1. Abduction of a Singular Fact
(OBS) F with regard to x1 … xn (n ≥ 1)
(BBK) E with regard to x1 … xn explains F with regard to those objects in a certain explanatory framework EF.
(HYP) It might be that E with regard to x1 … xn
Some examples of this pattern, which has also been called “simple abduction” by Thagard (1988), “factual abduction” by Schurz (2008), and “selective fact abduction” by Hoffmann (2010), are:
• The inference that (HYP) the hominid who has been dubbed Lucy (x1) might have been bipedal (E), from (OBS) observing the particular structure of her pelvis and knee bones (F) and (BBK) knowledge about how the structure of pelvis and knee bones relates to the locomotion of animals (EF)
• The inference that (HYP) two particles (x1 and x2) might have opposite electric charges (E), from (OBS) observing their attraction (F) and (BBK) knowledge of the Coulomb force (EF)
2. Abduction of a Generalization
(OBS) F with regard to all observed objects of class D
(BBK) E with regard to some objects explains F with regard to those objects in a certain explanatory framework EF.
(HYP) It might be that E with regard to all existing objects of class D
Some examples of this pattern, which has also been called “rule abduction” by Thagard (1988), “law abduction” by Schurz (2008), and “selective law abduction” by Hoffmann (2010), are:
• The inference that (HYP) all hominids of the last three million years (D) might have been bipedal (E), from (OBS) observing the similar structure of the pelvis and knee bones (F) of all observed hominid skeletons dated to be younger than three million years (D) and (BBK) knowledge about how the structure of pelvis and knee bones relates to the locomotion of animals (EF)
• The inference that (HYP) all emitted radiation from a particular chemical element (D) might be electrically neutral (E), from (OBS) observing in all experiments conducted so far that radiation emitted by this element (D) continues in a straight path in an external magnetic field perpendicular to the stream of radiation (F) and (BBK) knowledge of the Lorentz force and Newton’s second law (EF)
3. Existential Abduction, or the abduction of the existence of unknown objects from a particular class
(OBS) F with regard to x1 … xn (n ≥ 1)
(BBK) The existence of objects y1 … ym (m ≥ 1) of class E would explain F with regard to x1 … xn in a certain explanatory framework EF.
(HYP) It might be that there exist objects y1 … ym of class E
Some examples of this pattern, which was already called “existential abduction” by Thagard (1988) and has also been called “first-order existential abduction” by Schurz (2008) and “selective type abduction” by Hoffmann (2010), are:
• The inference that (HYP) a hominid (y1) of the genus Australopithecus (E) might have lived in this area, from (OBS) observing a set of vulcanized foot imprints (x1 … xn of class F) and (BBK) the belief that these foot imprints are of an Australopithecus (EF)
• The inference that (HYP) there might be other charged particles (y1 … ym of class E) in the chamber, from (OBS) observing deflections in the path (F) of a charged particle (x1) in a chamber without external electric or magnetic fields and (BBK) knowledge of the Coulomb and Lorentz forces and Newton’s second law (EF)
4. Conceptual Abduction, or the abduction of a new concept
(OBS) F1 … Fm (m ≥ 2) with regard to each of x1 … xn (n ≥ 2)
(BBK) No known concept explains why F1 … Fm with regard to each of x1 … xn
(HYP) It might be that there is a similarity between the x1 … xn, which can be labeled with a new concept E that explains why F1 … Fm with regard to each of x1 … xn in a certain explanatory framework EF.
It was Schurz (2008) who pointed out that this pattern is rational and useful for science only if the observation concerns several objects each individually having the same or similar properties, so that some form of conceptual unification is obtained. Otherwise, for each fact it could be suggested that there exists an ad hoc power that explains (only) this single fact. Some examples of this pattern, which largely coincides with the various types of “second-order abduction” Schurz (2008) suggests and several of the types of “creative abduction” conceived by Hoffmann (2010), are:
• The inference that (HYP) there might be a new species of hominids (E), from (OBS) observing various hominid fossils (x1 … xn) that are similar in many ways (F1 … Fm) and (BBK) believing that these fossils cannot be classified in the current taxonomy of hominids (EF)
• The inference that (HYP) there might exist a new type of interaction (E), from (OBS) observing similar interactive behavior (F1 … Fm) between certain types of particles (x1 … xn) in similar experiments and (BBK) believing that this behavior cannot be explained by the already known interactions, properties of the involved particles, and properties of the experimental setup (EF)
Using the terminology of Magnani (2001) and following the distinction of Schurz (2008), the first two patterns, abduction of a singular fact and abduction of a generalization, can be considered as instances of selective abduction, as the agent selects an appropriate hypothesis in her background knowledge, while the latter two, existential abduction and conceptual abduction, can be called creative
abduction, as the agent creates a new hypothetical concept or object. It has to be added that Hoffmann (2010) would dispute this distinction, as he sees the third pattern (existential abduction) in the first place as the selection of an already known type (e.g., the genus Australopithecus), and not so much as the creation of a new token (someone of this genus of which his/her existence is now hypothesized). As stated before, this list is not exhaustive. Further patterns have been identified, such as the abduction of a new perspective (Hoffmann, 2010), e.g., to suggest that a problem might have a geometrical solution instead of an algebraic one; “analogical abduction” (Thagard, 1988), e.g., explaining similar properties of water and light, by hypothesizing that light could also be wavelike; or “theoretical model abduction” (Schurz, 2008), i.e., explaining some observation by suggesting suitable initial conditions given some governing principles or laws. Some have even considered “visual abduction,” the inference from the observation itself to a statement describing this observation, as a separate pattern (Thagard & Shelley, 1997). For some of these patterns (or instances of them), it is possible to argue that they are a special case of one of the patterns above. For instance, the suggestion of the wave nature of light can also be seen as an instance of conceptual abduction, in which the (mathematical) concept “wave behavior” is constructed to explain the similar properties of water and light, yet it is true that the analogical nature of this inference makes it a special subpattern with interesting properties in itself. This is also how Schurz (2008) presents it: in his classification, analogical abduction is one of the types of second-order existential abduction he conceives of. Perhaps more important to note is that these patterns are not mutually exclusive given a particular instance of abductive reasoning. 
For instance, the inference that leads to the explanation of why a particular piece of iron is rusted can be described both as singular fact abduction (this piece of iron underwent a reaction with oxygen) and as existential abduction (there were oxygen atoms present with which this piece of iron reacted). But in essence it describes the same explanation for the same explanandum. Also, combinations occur. For instance, if a new particle is hypothesized as an explanation for an experimental anomaly (such as, for instance, Wolfgang Pauli’s suggestion of the neutrino in the case of the anomalous β spectrum (Gauderis, 2014)), then this is both an instance of existential abduction – there is a not yet observed particle that causes the observed phenomenon – and an instance of conceptual abduction, these hypothesized particles are of a new kind, a combination, which coincides with Hoffmann’s (2010) pattern of “creative fact abduction.” Yet in the mind of the scientist, this process of hypothesis formation might have occurred in a single reasoning step. One should not, however, be too worried about these issues, if it is remembered that these patterns are categories for linguistic descriptions of actual reasoning processes. Any actual instance of hypothesis formation can be described in several ways by means of natural language, and some of these expressions can be formally analyzed in more than one way. Therefore, one should not focus too much on the exact classification of particular instances of hypothesis formation. Yet this does not render meaningless the project of explicating various patterns of hypothesis
formation. The goal of this project is to provide normative guidance for future hypothesis formation. If particular problems or observations can be looked at from different perspectives and, therefore, expressed in various ways, it is only beneficial for an agent to have multiple patterns of hypothesis formation at her disposal.
Abductive Reasoning and Adaptive Logics
Now, the various attempts to model abductive reasoning by means of adaptive logics are presented. It first needs to be explained, though, why the framework of adaptive logics is fit for this job. First, adaptive logics allow for a direct implementation of defeasible reasoning steps (in casu applications of Affirming the Consequent). This makes it possible to construct logical proofs that nicely integrate defeasible (in this case ampliative) and deductive inferences. This corresponds to natural reasoning processes. Second, the formal apparatus of an adaptive logic specifies exactly which formulas would falsify a (defeasible) reasoning step. As these formulas are assumed to be false (so long as one cannot derive them), they are called abnormalities in the adaptive logic literature. So, if one or a combination of these abnormalities is derived in a proof, it is formally determined which defeasible steps cannot be maintained. This possibility of defeating previous reasoning steps nicely mirrors the dynamics found in actual human reasoning. Third, for all adaptive logics in standard format, such as the presented logics LArs and MLAss, there are generic proofs for most of the important metatheoretical properties such as soundness and completeness (Batens, 2007).
So far, most research effort has been focused on modeling singular fact abduction, which already proves to be, even though it appears to be the easiest case, a rich and fruitful point of departure. This is not exclusive to the adaptive logics framework: in general, very few logics have been proposed for forms other than singular fact abduction. Some of the few exceptions are Thagard (1988) and Gauderis and Van De Putte (2012). (This last logic, which is an adaptive logic, suffers, however, from some complications; see Beirlaen & Aliseda (2014, appendix B) and Gauderis (2013b, p. 140).)
For these reasons, this overview is limited to the various attempts to model singular fact abduction within the framework of adaptive logics. The history of research into singular fact abduction within the adaptive logics community dates back to the early 2000s and can be traced through the articles Meheus et al. (2002), Batens et al. (2003), Meheus and Batens (2006), Meheus and Provijn (2007), Meheus (2007), and Lycke (2009, 2012). Besides presenting early logics for singular fact abduction, this research has also shown that there actually exist two types of singular fact abduction (see section “The Problem of Multiple Explanatory Hypotheses”). In recent years, for each of these two types of abduction, an adaptive logic in standard format (see section “The Standard Format of Adaptive Logics”) has been developed: LArs for practical abduction (Meheus, 2011) and MLAss for theoretical abduction (Gauderis, 2013a). These will be the
two logics that will be presented and explained in this article (see sections “LArs : A Logic for Practical Singular Fact Abduction” and “MLAss : A Logic for Theoretical Singular Fact Abduction”). It further needs to be noted that recent research has even pushed further by considering abduction from inconsistent theories (Provijn, 2012), adaptations for use in AI (Gauderis (2011), improved version in Gauderis 2013b, Ch. 5), and a first logic for propositional singular fact abduction (Beirlaen & Aliseda, 2014).
The Problem of Multiple Explanatory Hypotheses The early research into logics for abduction has shown that two types of abduction logics can actually be constructed, depending on how the logic deals with multiple explanatory hypotheses for a single observation. To explain this problem, consider the following example. Suppose one has to form hypotheses for the puzzling fact P a while one’s background knowledge contains both (∀x)(Qx ⊃ P x) and (∀x)(Rx ⊃ P x). There are two ways in which one can proceed. First, one could construct a logic in which one could derive only the disjunction (Qa ∨ Ra) and not the individual hypotheses Qa and Ra. This first way, called practical abduction (according to the definition suggested in Meheus and Batens (2006, pp. 224–225) and first used in Lycke 2009) and modeled by the logic LArs (Meheus (2011), see section “LArs : A Logic for Practical Singular Fact Abduction”), is suitable for modeling situations in which one has to act on the basis of the conclusions before having the chance to find out which hypothesis actually is the case. A good example is how people react to unexpected behavior. If someone suddenly starts to shout, people will typically react in a hesitant way, taking into account that either they themselves are somehow at fault or that the shouting person is just frustrated or crazy and acting inappropriately. Second, someone with a theoretical perspective (for instance, a scientist or a detective) is interested in finding out which of the various hypotheses is the actual explanation. Therefore it is important that she can abduce the individual hypotheses Qa and Ra in order to examine them further one by one. Early work on these kinds of logics has been done in Lycke (2009, 2012). Yet these logics have a quite complex proof theory. 
This is because, on the one hand, one has to be able to derive Qa and Ra separately, but on the other, one has to prevent the derivation of their conjunction (Qa ∧Ra), because it seems counterintuitive to take the conjunction of two possible hypotheses as an explanation: for instance, if the street is wet, it would be weird to suggest that it has rained and that the fire department also just held an exercise. Moreover, if the two possible hypotheses are actually incompatible, it would lead to logical explosion in a classical logical context. Logical explosion is the situation that just any statement can be derived from a certain premise set. In CL this occurs when a premise set contains a contradiction, i.e., both a particular statement and its negation can be derived from the premise set, which makes an ex falso quodlibet argument possible. Briefly, such an argument goes as follows: suppose one’s premise set contains both the statements p and ¬p.
Then, by means of addition, one can first derive p ∨ q for any random q (informally, as one already knows that p is true, any statement of the form “p or . . . ” will also be true). But, as ¬p also holds, one can derive q from this disjunction by means of a disjunctive syllogism (the logical rule that if one knows that one side of a disjunction is false, the other side has to be true to make the disjunction true).
The logic MLAss (Gauderis, 2013a) presented in this overview article (section “MLAss : A Logic for Theoretical Singular Fact Abduction”) solves this problem by adding modalities to the language and deriving the hypotheses ♦Qa and ♦Ra instead of Qa and Ra. By conceiving of hypotheses as logical possibilities, the conjunction problem is automatically solved because ♦Qa ∧ ♦Ra does not imply ♦(Qa ∧ Ra) in any standard modal logic. This approach also nicely coincides with the common idea that hypotheses are possibilities. These features make the logic MLAss very suitable for the modeling of actual theoretical abductive reasoning processes.
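The two claims above, that a contradictory premise set entails any statement in CL and that ♦Qa ∧ ♦Ra does not entail ♦(Qa ∧ Ra), can be illustrated with a small Python sketch. It is only a sanity check under stated assumptions: the helper names (entails, possibly) and the three-world serial Kripke model are hypothetical and chosen for illustration; they are not taken from Gauderis (2013a).

```python
from itertools import product

# Part 1: ex falso quodlibet, checked by truth tables.
# Premises entail a conclusion iff no valuation makes all premises true
# and the conclusion false.
def entails(premises, conclusion, atoms):
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(prem(v) for prem in premises) and not conclusion(v):
            return False
    return True

p = lambda v: v["p"]
not_p = lambda v: not v["p"]
q = lambda v: v["q"]

# No valuation satisfies both p and not-p, so q follows vacuously.
explosion = entails([p, not_p], q, ["p", "q"])

# Part 2: a hypothetical serial Kripke model in which Qa and Ra are each
# possible at w0, but never true together in one accessible world.
access = {"w0": {"w1", "w2"}, "w1": {"w1"}, "w2": {"w2"}}
val = {"w0": set(), "w1": {"Qa"}, "w2": {"Ra"}}

def possibly(world, holds):
    # Diamond: the formula holds in at least one accessible world.
    return any(holds(u) for u in access[world])

Qa = lambda u: "Qa" in val[u]
Ra = lambda u: "Ra" in val[u]

both_possible = possibly("w0", Qa) and possibly("w0", Ra)
conj_possible = possibly("w0", lambda u: Qa(u) and Ra(u))

print(explosion, both_possible, conj_possible)
```

The single countermodel in the second part suffices to show non-entailment; it says nothing about which formulas are entailed.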
The Standard Format of Adaptive Logics
Before the logics for abduction LArs and MLAss are presented, the reader should first be provided with the necessary background about the adaptive logics framework and, more in particular, with the nuts and bolts of its standard format. This will of course be a limited introduction, and the reader is referred to, e.g., Straßer (2013) or Batens (2007) for a thorough introduction.
Definition
An adaptive logic in standard format is defined by a triple:
(i) A lower limit logic (henceforth LLL): a reflexive, transitive, monotonic, and compact logic that has a characteristic semantics
(ii) A set of abnormalities Ω: a set of LLL-contingent formulas (formulas that are not theorems of LLL) characterized by a logical form, or a union of such sets
(iii) An adaptive strategy
The lower limit logic LLL specifies the stable part of the adaptive logic. Its rules are unconditionally valid in the adaptive logic, and anything that follows from the premises by LLL will never be revoked. Apart from that, it is also possible in an adaptive logic to derive defeasible consequences. These are obtained by assuming that the elements of the set of abnormalities are “as much as possible” false. The adaptive strategy is needed to specify “as much as possible.” This will become clearer further on.
Strictly speaking, the standard format for adaptive logics requires that a lower limit logic contains, in addition to the LLL-operators, also the operators of CL (classical logic). However, these operators have merely a technical role (in the generic meta-theory for adaptive logics) and are not used in the applications presented here. Therefore, given the introductory nature of this section, this will not
be explained in further detail. In the logics presented in this paper, the condition is implicitly assumed to be satisfied.
Dynamic Proof Theory
As stated before, a key advantage of adaptive logics is their dynamic proof theory, which models human reasoning. These dynamics are possible because a line in an adaptive proof has, along with a line number, a formula, and a justification, a fourth element: the condition. A condition is a finite subset of the set of abnormalities and specifies which abnormalities need to be assumed to be false for the formula on that line to be derivable. The inference rules in an adaptive logic reduce to three generic rules. Where Γ is the set of premises, Θ a finite subset of the set of abnormalities Ω, and Dab(Θ) the (classical) disjunction of the abnormalities in Θ, and where

A    Δ    (4)

abbreviates that A occurs in the proof on the condition Δ, the inference rules are given by the generic rules:

PREM   If A ∈ Γ:
          ...    ...
          A      ∅

RU     If A1, ..., An ⊢LLL B:
          A1     Δ1
          ...    ...
          An     Δn
          B      Δ1 ∪ ... ∪ Δn

RC     If A1, ..., An ⊢LLL B ∨ Dab(Θ):
          A1     Δ1
          ...    ...
          An     Δn
          B      Δ1 ∪ ... ∪ Δn ∪ Θ
The premise rule PREM states that a premise may be introduced at any line of a proof on the empty condition. The unconditional inference rule RU states that, if A1, . . . , An ⊢LLL B and A1, . . . , An occur in the proof on the conditions Δ1, . . . , Δn, one may add B on the condition Δ1 ∪ . . . ∪ Δn. The strength of an adaptive logic comes from the third rule, the conditional inference rule RC, which works analogously to RU, but introduces new conditions. So, it allows one to take defeasible steps based on the assumption that the abnormalities are false (this rule also makes clear that any adaptive proof can be transformed into a Fitch-style proof in the LLL by writing down for each line the disjunction of the formula and all of the abnormalities in the condition). Several examples of how these rules are employed will follow.
The only thing that is still needed is a criterion that defines when a line of the proof is considered to be defeated. At first sight, it seems straightforward to mark lines of which one of the elements of the condition is unconditionally derived from the premises; this means that it is derived on the empty condition (defeated lines in a proof are marked instead of deleted, because, in general, it is possible that they may later become unmarked in an extension of the proof). But this strategy, called the simple strategy, usually has a serious flaw. If it is possible to derive unconditionally a disjunction of abnormalities Dab(Δ) that is minimal, i.e., if there is no Δ′ ⊂ Δ such that Dab(Δ′) can be unconditionally derived, the simple strategy would ignore this information. This is problematic, however, because at least one of the disjuncts of the ignored disjunction has to be true. Therefore, one can use the simple strategy only in cases where

Γ ⊢LLL Dab(Δ) only if there is an A ∈ Δ such that Γ ⊢LLL A    (5)

with Dab(Δ) any disjunction of abnormalities out of Ω. This condition will be met for the logic MLAss (section “MLAss : A Logic for Theoretical Singular Fact Abduction”); this logic will, hence, employ the simple strategy. The majority of logics, however, do not meet this criterion, and for those logics, more advanced strategies have been developed. The best known of these are reliability and minimal abnormality. The logic LArs uses the reliability strategy. This strategy, which will be explained and illustrated below, prescribes marking any line for which one of the elements of its condition occurs as a disjunct in an unconditionally derived minimal disjunction of abnormalities.
At this point, all elements are introduced to explain the naming of the two logics that will be presented in this paper: as might be expected, LA and MLA stand for “Logic for Abduction” and “Modal Logic for Abduction,” and the superscripts r and s stand for the adaptive strategies reliability and simple strategy. The subscript s originally denoted that the logic was formulated in the standard format for adaptive
logics, but in Gauderis (2013b), it is argued that it is more useful to interpret this s as indicating that they are logics for singular fact abduction. After all, most adaptive logics are nowadays formulated in the standard format anyhow, and this allows one to contrast these logics with the logic LAr∀, which is a logic for abduction of generalizations (Gauderis and Van De Putte, 2012; Gauderis, 2013b).
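The bookkeeping behind the generic rules and the two marking strategies described in this section can be sketched in a few lines of Python. This is a minimal illustration, not an implementation of any published prover: the names Line, ru, rc, minimal, marked_simple, and marked_reliability are hypothetical, abnormalities are represented by opaque labels ("ab1", "ab2"), and the logical derivability checks themselves (the ⊢LLL side conditions) are assumed to have been done elsewhere.

```python
from dataclasses import dataclass

@dataclass
class Line:
    number: int
    formula: str
    justification: str
    condition: frozenset  # abnormalities assumed to be false

def ru(number, formula, used, label):
    # Unconditional rule: the condition is the union of the used conditions.
    cond = frozenset().union(*(l.condition for l in used))
    return Line(number, formula, label, cond)

def rc(number, formula, used, new_abnormalities, label):
    # Conditional rule: like RU, but adds the new abnormalities (Theta).
    cond = frozenset(new_abnormalities).union(*(l.condition for l in used))
    return Line(number, formula, label, cond)

def minimal(dabs):
    # Keep only Dab-formulas with no proper subset among the derived ones.
    return [d for d in dabs if not any(e < d for e in dabs)]

def marked_simple(line, dabs):
    # Marked iff an element of the condition is itself unconditionally derived.
    return any(frozenset([ab]) in dabs for ab in line.condition)

def marked_reliability(line, dabs):
    # Marked iff the condition overlaps a minimal derived Dab-formula.
    unreliable = frozenset().union(*minimal(dabs))
    return bool(line.condition & unreliable)

# Hypothetical proof fragment: four premises and two defeasible steps.
prem = [Line(i, f"premise {i}", "-;PREM", frozenset()) for i in range(1, 5)]
line5 = rc(5, "Pa", [prem[0], prem[2]], {"ab1"}, "1,3;RC")
line6 = rc(6, "not Pa", [prem[1], prem[3]], {"ab2"}, "2,4;RC")
line7 = ru(7, "consequence of line 5", [line5, prem[0]], "1,5;RU")

# Dab-formulas derived on the empty condition, as sets of disjuncts.
dabs = [frozenset({"ab1", "ab2"})]
print(marked_simple(line5, dabs), marked_reliability(line5, dabs))
```

With only the disjunction of ab1 and ab2 derived, the simple strategy marks nothing, whereas reliability marks lines 5, 6, and 7. If ab1 were additionally derived by itself, the disjunction would no longer be minimal and only the lines depending on ab1 would remain marked.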
LArs : A Logic for Practical Singular Fact Abduction In this section, the reader is introduced to the logic LArs (Meheus, 2011) in an informal manner. This will allow the reader to gain a better understanding of the framework of adaptive logics and the functioning of its dynamic proof theory. In the next section, the same approach will be used for the logic MLAss . The formal definitions of both logics will be presented in the Appendix for those who are interested. In order to model abductive reasoning processes of singular facts, the logic LArs (as will the logic MLAss ) contains, in addition to deductive inference steps, defeasible reasoning steps based on an argumentation schema known as Affirming the Consequent (combined with Universal Instantiation): (∀α)(A(α) ⊃ B(α)), B(β)/A(β)
(6)
The choice for a predicate logic is motivated by the fact that a material implication is used to model the relation between explanans and explanandum. As it is well known that B ⊢CL A ⊃ B, a propositional logic would allow one to derive anything as a hypothesis. In the predicative case, the use of the universal quantifier can avoid this. This can be seen by comparing ⊢CL B(β) ⊃ (A(β) ⊃ B(β)) with ⊬CL B(β) ⊃ (∀α)(A(α) ⊃ B(α)) (see Beirlaen & Aliseda (2014) for a propositional logic for abduction that solves this problem in another way).
The desiderata for this logic are listed first. This is important because in specifying the set of abnormalities and the strategy, one has to check whether they allow one to model practical abductive reasoning according to one’s expectations. Apart from the fact that by means of this logic one should be able to derive hypotheses according to the schema of Affirming the Consequent, one has to make sure that one cannot derive, as a side effect, random hypotheses which are not related to the explanandum. Finally, as pointed out in the introduction, it is a nice feature of adaptive logics that they enable one to integrate defeasible and deductive steps.
Lower Limit Logic
The lower limit logic of LArs is classical first-order logic CL. This means that the deductive inferences of this logic are the reasoning steps modeled by classical logic. Also, as this logic is an extension of classical logic, any classical consequence of a premise set will also be a consequence of the premise set according to this logic.
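The contrast drawn above between the propositional and the predicative case can be checked with a short Python sketch. The propositional part verifies by truth table that B entails A ⊃ B; the predicative part searches small finite models for a case where B(β) holds while ∀α(A(α) ⊃ B(α)) fails. The function name find_countermodel and the choice to interpret β as element 0 are assumptions made for illustration.

```python
from itertools import product

# Propositional case: whenever B is true, A ⊃ B is true as well.
propositional_ok = all(
    (not b) or ((not a) or b)
    for a, b in product([True, False], repeat=2)
)

def find_countermodel(domain_size):
    # Search for predicates A, B over {0, ..., domain_size - 1}, with beta
    # interpreted as 0, such that B(beta) holds but the universally
    # quantified implication does not.
    domain = range(domain_size)
    beta = 0
    for a_bits in product([False, True], repeat=domain_size):
        for b_bits in product([False, True], repeat=domain_size):
            universal = all((not a_bits[d]) or b_bits[d] for d in domain)
            if b_bits[beta] and not universal:
                return {"A": a_bits, "B": b_bits}
    return None

one_element = find_countermodel(1)   # no countermodel with beta alone
two_elements = find_countermodel(2)  # a second object breaks the implication

print(propositional_ok, one_element, two_elements)
```

The countermodel needs an object other than β that has property A but lacks property B, which is exactly the information that the quantified implication rules out and a single instance does not.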
Set of Abnormalities Ω
If one takes (here and in further definitions) the metavariables A and B to represent (well-formed) formulas, α a variable, and β a constant of the language L in which the logic is defined, the set of abnormalities of the logic LArs can be defined as:

Ω = {(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) |
     No predicate occurring in B occurs in A}

The first line is the logical form of the abnormality; the second line in the definition is to prevent self-explanatory hypotheses. To understand the functioning of this logical form, consider the following example proof starting from the premise set {Qa, ∀x(Px ⊃ Qx), ∀x(Px ⊃ Rx)}.

1   ∀x(Px ⊃ Qx)                          -;PREM     ∅
2   Qa                                   -;PREM     ∅
3   Pa ∨ ¬Pa                             -;RU       ∅
4   Pa ∨ (∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))      1,2,3;RU   ∅
5   Pa                                   4;RC       {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
From this premise set, one would like to be able to form the hypothesis Pa. One obtains this hypothesis as follows. One starts by writing two premises on the first two lines and a tautology on the third line (all these lines are not dependent on earlier lines, indicated by the dash). These three lines allow one then to derive the disjunction on line 4 by means of the unconditional inference rule RU. This disjunction has the exact form that allows one now to derive conditionally the hypothesis Pa from it by applying the rule RC. From this hypothesis one can reason further in a deductive way by applying, e.g., modus ponens (note that the result of this inference has also a non-empty condition):

...
5   Pa             4;RC     {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
6   ∀x(Px ⊃ Rx)    -;PREM   ∅
7   Ra             5,6;RU   {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
Suppose now that one comes to know that ¬Pa is the case, adds this premise to the premise set, and continues the proof. Strictly speaking, this is not what is actually done. What is actually done is starting a new proof with another premise set (the extended set). But it is easily seen that one can start this new proof with exactly the same lines as the old proof. This way, it looks as if one has extended the old proof. Therefore, it makes sense to use the phrase “adding premises and continuing a proof,” as it also nicely mirrors how human beings deal with incoming information: they do not start over their reasoning but incorporate the new information at the point where they have arrived.

...
5   Pa                            4;RC       {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}   ✓9
6   ∀x(Px ⊃ Rx)                   -;PREM     ∅
7   Ra                            5,6;RU     {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}   ✓9
8   ¬Pa                           -;PREM     ∅
9   ∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)      1,2,8;RU   ∅
This new premise makes it possible to derive unconditionally on line 9 the condition of the hypothesis Pa. At this point, it is clear that one should no longer trust the hypothesis formed on line 5, which one indicates by marking this line with a checkmark (✓9), indicating that one lost one’s confidence in this formula once one wrote down line 9. As the formula Ra is arrived at by reasoning further upon the hypothesis Pa, it has (at least) the same condition and is, hence, at this point also marked. In summary, each time one defeasibly derives a hypothesis, one has to state explicitly the condition the (suspected) truth of which would defeat the hypothesis. Therefore, one can assume the hypothesis to be true as long as one can assume the condition to be false, but as soon as one has evidence that the condition might be true, one should withdraw the hypothesis.
Reliability Strategy
In the previous example, one withdrew the hypothesis because its condition was explicitly derived. However, consider the following example proof from the premise set {Qa, Ra, ∀x(Px ⊃ Qx), ∀x(¬Px ⊃ Rx)}:

1   ∀x(Px ⊃ Qx)     -;PREM   ∅
2   ∀x(¬Px ⊃ Rx)    -;PREM   ∅
3   Qa              -;PREM   ∅
4   Ra              -;PREM   ∅
5   Pa              1,3;RC   {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
6   ¬Pa             2,4;RC   {∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa)}
There is clearly something fishy about this situation. As the conditions on lines 5 and 6 are not derivable from this premise set, logical explosion would allow one to derive anything from this premise set, if one were to use the simple strategy. Still, it is quite obvious that at least one of those two conditions has to be true, as the disjunction of these two conditions is derivable from the premises in the lower limit logic. Yet, as one does not know from these premises which disjunct is true and which one is false, the most reliable thing to do is to mark both lines:

...
5   Pa                                1,3;RC   {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}    ✓7
6   ¬Pa                               2,4;RC   {∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa)}    ✓7
7   (∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)) ∨
    (∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa))        1-4;RU   ∅
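The situation in this example, where the disjunction of the two conditions follows from the premises while neither condition follows on its own, can be checked by brute force over small finite models. The Python sketch below does so for domains of one and two elements, with the constant a interpreted as element 0. The function names and the bound on the domain size are assumptions for illustration; such a bounded search is a sanity check rather than a proof of entailment, although one countermodel does suffice to establish non-entailment.

```python
from itertools import product

def models(max_size=2):
    # Yield (domain, P, Q, R); each predicate is a tuple of truth values
    # indexed by domain element, and the constant a is element 0.
    for n in range(1, max_size + 1):
        for bits in product([False, True], repeat=3 * n):
            yield range(n), bits[:n], bits[n:2 * n], bits[2 * n:]

def premises(dom, P, Q, R):
    # Qa, Ra, forall x (Px -> Qx), forall x (not Px -> Rx)
    return (Q[0] and R[0]
            and all((not P[d]) or Q[d] for d in dom)
            and all(P[d] or R[d] for d in dom))

def ab1(dom, P, Q, R):
    # forall x (Px -> Qx) and (Qa and not Pa): the condition of line 5
    return all((not P[d]) or Q[d] for d in dom) and Q[0] and not P[0]

def ab2(dom, P, Q, R):
    # forall x (not Px -> Rx) and (Ra and Pa): the condition of line 6
    return all(P[d] or R[d] for d in dom) and R[0] and P[0]

def holds_in_all(formula, max_size=2):
    # True iff the formula holds in every checked model of the premises.
    return all(formula(*m) for m in models(max_size) if premises(*m))

disjunction_entailed = holds_in_all(lambda *m: ab1(*m) or ab2(*m))
ab1_entailed = holds_in_all(ab1)
ab2_entailed = holds_in_all(ab2)

print(disjunction_entailed, ab1_entailed, ab2_entailed)
```

Under these assumptions the disjunction holds in every checked model of the premises, while each disjunct fails in some model, which is the pattern the reliability strategy is designed to handle.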
This marking strategy is called the reliability strategy, and it prescribes marking lines for which an element of the condition has been unconditionally derived as a disjunct of a minimal disjunction of abnormalities (or in short, a minimal Dab-formula). It is important to note that (1) the disjunction should only hold disjuncts that have the form of an abnormality (otherwise, a defeating disjunction could be constructed for every hypothesis) and (2) that this disjunction should be minimal (as disjunctions can always be extended by applications of the addition rule). To clarify this last point: suppose one was able to derive the condition of line 5 by itself, then the disjunction on line 7 would not be minimal anymore and there would be no reason anymore to mark line 6.
Practical Abduction
The logic LArs is a logic for practical abduction (see section “The Problem of Multiple Explanatory Hypotheses”). This means that it solves the problem of multiple explanatory hypotheses by only allowing the disjunction of the various hypotheses to be derived. Consider the following example from the premise set {Ra, ∀x(Px ⊃ Rx), ∀x(Qx ⊃ Rx)}:

1   ∀x(Px ⊃ Rx)                       -;PREM   ∅
2   ∀x(Qx ⊃ Rx)                       -;PREM   ∅
3   Ra                                -;PREM   ∅
4   Pa                                1,3;RC   {∀x(Px ⊃ Rx) ∧ (Ra ∧ ¬Pa)}    ✓6
5   Qa                                2,3;RC   {∀x(Qx ⊃ Rx) ∧ (Ra ∧ ¬Qa)}    ✓7
6   (∀x(Px ⊃ Rx) ∧ (Ra ∧ ¬Pa)) ∨
    (∀x((Qx ∧ ¬Px) ⊃ Rx) ∧
    (Ra ∧ ¬(Qa ∧ ¬Pa)))               1-3;RU   ∅
7   (∀x(Qx ⊃ Rx) ∧ (Ra ∧ ¬Qa)) ∨
    (∀x((Px ∧ ¬Qx) ⊃ Rx) ∧
    (Ra ∧ ¬(Pa ∧ ¬Qa)))               1-3;RU   ∅
8   ∀x((Px ∨ Qx) ⊃ Rx)                1,2;RU   ∅
9   Pa ∨ Qa                           3,8;RC   {∀x((Px ∨ Qx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∨ Qa))}
Because of the fact that the minimal Dab-formulas on line 6 and 7 could be derived from the premises, the individual hypotheses P a and Qa have to be
withdrawn, yet the condition of their disjunction on line 9 is not part of a minimal Dab-formula from these premises. This shows that, in the case of multiple explanatory hypotheses, this logic allows one to derive only their disjunction and none of the individual disjuncts.
Avoiding Random Hypotheses
Another important feature of a logic for abduction is that it prevents the derivation of random hypotheses. The three most common ways to introduce random hypotheses are (1) by deriving an explanation for a tautology, e.g., deriving Xa from the theorems Pa ∨ ¬Pa and ∀x(Xx ⊃ (Px ∨ ¬Px)); (2) by deriving contradictions as explanations, which leads to logical explosion, e.g., deriving Xa ∧ ¬Xa from Pa and the theorem ∀x((Xx ∧ ¬Xx) ⊃ Px); or (3) by deriving hypotheses that are not the most parsimonious ones, e.g., deriving Pa ∧ Xa from Qa and ∀x(Px ⊃ Qx) (and its consequence ∀x((Px ∧ Xx) ⊃ Qx)). The logic LArs prevents these three ways by mechanisms similar to the one that blocks individual hypotheses, as illustrated above. Elaborate examples for each of these three ways can be found in Meheus (2011).
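The three routes to random hypotheses listed above all rely on generalizations that classical logic provides for free. The following Python sketch checks, over all interpretations on domains of up to three elements, that the two theorems and the consequence mentioned in (1) to (3) indeed hold regardless of how the arbitrary predicate X is interpreted. The helper names and the domain bound are assumptions for illustration, and a bounded search of this kind is a sanity check rather than a proof of validity.

```python
from itertools import product

def interpretations(n, num_preds):
    # All ways to interpret num_preds unary predicates on n elements.
    for bits in product([False, True], repeat=n * num_preds):
        yield [bits[i * n:(i + 1) * n] for i in range(num_preds)]

def implies(x, y):
    return (not x) or y

def valid_up_to(formula, num_preds, max_size=3):
    # True iff the formula holds under every checked interpretation.
    return all(
        formula(range(n), *preds)
        for n in range(1, max_size + 1)
        for preds in interpretations(n, num_preds)
    )

# (1) forall x (Xx -> (Px or not Px)): any X "explains" a tautology.
taut = valid_up_to(
    lambda dom, X, P: all(implies(X[d], P[d] or not P[d]) for d in dom), 2)

# (2) forall x ((Xx and not Xx) -> Px): a contradiction "explains" anything.
contra = valid_up_to(
    lambda dom, X, P: all(implies(X[d] and not X[d], P[d]) for d in dom), 2)

# (3) forall x (Px -> Qx) implies forall x ((Px and Xx) -> Qx):
#     an explanation can be padded with an arbitrary extra conjunct.
weaken = valid_up_to(
    lambda dom, X, P, Q: implies(
        all(implies(P[d], Q[d]) for d in dom),
        all(implies(P[d] and X[d], Q[d]) for d in dom)), 3)

print(taut, contra, weaken)
```

Since each of these generalizations is available in CL, an unrestricted application of Affirming the Consequent would license Xa, Xa ∧ ¬Xa, and Pa ∧ Xa as hypotheses, which is why the logic needs a mechanism to block them.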
MLAss : A Logic for Theoretical Singular Fact Abduction
In this section, the reader will be introduced to the logic MLAss (Gauderis, 2013a) in a similar informal manner. Formal definitions can again be found in the Appendix. Analogously, this logic also models deductive steps combined with applications of Affirming the Consequent (combined with Universal Instantiation), yet it treats the problem of multiple explanatory hypotheses in a different way: it allows one to derive these hypotheses individually, yet to avoid logical explosion caused by mutually exclusive hypotheses, it treats them as modal possibilities (see section “The Problem of Multiple Explanatory Hypotheses”). The list of desiderata for this logic closely parallels the one for the logic LArs, except for the different treatment of the problem of multiple explanatory hypotheses. Specific for this logic (as this logic is aimed at modeling the reasoning of, e.g., scientists or detectives (Gauderis, 2012)) is the desideratum that it handles contradictory hypotheses, predictions, and counterevidence in a natural way.
Formal Language Schema
As this logic is a modal logic, the language of this logic is an extension of the language of classical logic CL. Let the standard predicative language of classical logic be denoted with L. C, V, F, and W will further be used to refer respectively to the sets of individual constants, individual variables, all (well-formed) formulas of L, and the closed (well-formed) formulas of L. LM, the language of the logic MLAss, is L extended with the modal operator □. WM, the set of closed formulas of LM, is the smallest set that satisfies the following conditions:
1. If A ∈ W, then □A, ♦A ∈ WM.
2. If A ∈ WM, then ¬A ∈ WM.
3. If A, B ∈ WM, then A ∧ B, A ∨ B, A ⊃ B, A ≡ B ∈ WM.
It is important to notice that there are no occurrences of modal operators within the scope of another modal operator or a quantifier. The set WΓ, the subset of WM the elements of which can act as premises in the logic, is further defined as:

WΓ = {□A | A ∈ W}    (7)

It is easily seen that

WΓ ⊂ WM.    (8)
Lower Limit Logic
The LLL of MLAss is the predicative version of D, restricted to the language schema WM. D is characterized by a full axiomatization of predicate CL together with two axioms, an inference rule, and a definition:

K     □(A ⊃ B) ⊃ (□A ⊃ □B)     (9)
D     □A ⊃ ¬□¬A                (10)
NEC   if ⊢ A, then ⊢ □A        (11)
      ♦A =df ¬□¬A              (12)
This logic is one of the weakest normal modal logics that exist and is obtained by adding the D-axiom to the axiomatization of the better-known minimal normal modal logic K. The semantics for this logic can be expressed by a standard possible world Kripke semantics where the accessibility relation R between possible worlds is serial, i.e., for every world w in the model, there is at least one world w′ in the model such that Rww′.
Intended Interpretation of the Modal Operators
As indicated above, explanatory hypotheses, the results of abductive inferences, will be represented by formulas of the form ♦A (A ∈ W). Formulas of the form □B are used to represent explananda, other observational data, and relevant background knowledge. Otherwise, this information would not be able to revoke derived hypotheses (for instance, ¬A and ♦A are not contradictory, whereas □¬A and ♦A are). The reason D is chosen instead of K is that it is assumed that the explananda and background information are together consistent. This assumption is modeled by the D-axiom (for instance, the premise set {□¬Pa, □(∀x)Px} is a set modeling an inconsistent set of background knowledge and observations, but in the
logic K, this set would not be considered inconsistent, because it is not the case that anything can be derived from this set by Ex Falso Quodlibet. To be able to do this, the D-axiom is needed).
Set of Abnormalities
Since the final form of the abnormalities is quite complex, although the idea behind it is straightforward, two more basic proposals that are constitutive for the final form will first be considered, and it will be shown why they are insufficient. Obviously, only closed well-formed formulas can be an element of any set of abnormalities. This will not be explicitly stated each time.
First Proposal Ω1
This first proposal is a modal version of the set of abnormalities of the logic LArs:

Ω1 = {□(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) |
      No predicate occurring in B occurs in A}

Analogous to the logic LArs, this means that a derived hypothesis will be defeated if one shows explicitly that the hypothesis cannot be the case.
Simple Strategy
For this logic the simple strategy can be used, which means, as stated before, that one has to mark lines for which one of the elements of the condition is unconditionally derived. It can easily be seen that the condition for use of the simple strategy, i.e.,

Γ ⊢LLL Dab(Δ) only if there is an A ∈ Δ such that Γ ⊢LLL A,    (13)

is fulfilled here. Since all premises have the form □A, the only option to derive a disjunction of abnormalities would be to apply addition, i.e., to derive □(A ∨ B) from □A (or □B), because it is well known that □(A ∨ B) ⊬ □A ∨ □B in any standard modal logic (it is also possible to derive a disjunction from the premises by means of the K-axiom. For instance, □(A ⊃ B) ⊢ ¬□A ∨ □B, but the first disjunct will always be equivalent to a possibility (♦¬A) and can, hence, not be an abnormality).
Contradictory Hypotheses
As a first example of the functioning of this logic, consider the following example starting from the premise set {□Qa, □Ra, □∀x(Px ⊃ Qx), □∀x(¬Px ⊃ Rx)}. As the reader is by now probably accustomed to the functioning of the abnormalities, it is also already shown how this logic is able to handle contradictory hypotheses without causing explosion.

1   □∀x(Px ⊃ Qx)     -;PREM   ∅
2   □∀x(¬Px ⊃ Rx)    -;PREM   ∅
3   □Qa              -;PREM   ∅
4   □Ra              -;PREM   ∅
5   ♦Pa              1,3;RC   {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}
6   ♦¬Pa             2,4;RC   {□(∀x(¬Px ⊃ Rx) ∧ (Ra ∧ ¬¬Pa))}
7   ♦Pa ∧ ♦¬Pa       5,6;RU   {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)),
                               □(∀x(¬Px ⊃ Rx) ∧ (Ra ∧ ¬¬Pa))}
♦Pa and ♦¬Pa are both derivable hypotheses because the conditions on lines 5–7 are not unconditionally derivable from the premise set. It is also interesting to note that, because of the properties of the lower limit logic D, it is not possible to derive ♦(Pa ∧ ¬Pa) from these premises. The conjunction of two hypotheses is never considered as a hypothesis itself, unless there is further background information that links these two hypotheses in some way.

Predictions and Evidence
To show that this logic handles predictions and (counter)evidence for these predictions in a natural way, let the premise set be extended with the additional implication □∀x(Px ⊃ Sx):

8   □∀x(Px ⊃ Sx)     -;PREM    ∅
9   ♦Sa              5,8;RU    {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}
With this extra implication, the prediction ♦Sa can be derived. As long as one has no further information about this prediction (for instance, by observation), it remains a hypothesis derived on the same condition as ♦Pa. If one were to test this prediction, there would be two possibilities. On the one hand, if the prediction turns out to be false, the premise □¬Sa could be added to the premise set (a mark ✓12 indicates that the line is marked in view of line 12):

...
5   ♦Pa                               1,3;RC      {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}   ✓12
...
9   ♦Sa                               5,8;RU      {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}   ✓12
10  □¬Sa                              -;PREM      ∅
11  □¬Pa                              8,10;RU     ∅
12  □(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))       1,3,11;RU   ∅
In this case, one could subsequently derive □¬Pa, which would falsify the hypothesis ♦Pa. On the other hand, if the prediction Sa turned out to be true, the premise □Sa could have been added, but this extension of the premise set would not allow us to derive □Pa. Since true predictions only corroborate the hypothesis but do not prove it, while false predictions directly falsify the hypothesis, one can say that this logic handles predictions in a Popperian way. In using this vocabulary, however, the reader has to be reminded that MLAss is a logic for modeling abduction and handling explanatory hypotheses, not a formal methodology of science. This logic has nothing to say about the confirmation of theories, for which Popper actually employed the concepts of corroboration and falsification (Popper, 1959).

Contradictions
One of the three ways a logic of abduction could generate random hypotheses as a side effect is by allowing for the abduction of contradictions. How this is possible and how the logic prevents it is illustrated in the following proof from the premise set {□Qa}:

1   □Qa                                               -;PREM    ∅
2   □∀x((Xx ∧ ¬Xx) ⊃ Qx)                              -;RU      ∅
3   ♦(Xa ∧ ¬Xa)                                       1,2;RC    {□(∀x((Xx ∧ ¬Xx) ⊃ Qx) ∧ (Qa ∧ ¬(Xa ∧ ¬Xa)))}   ✓4
4   □(∀x((Xx ∧ ¬Xx) ⊃ Qx) ∧ (Qa ∧ (¬Xa ∨ Xa)))        1;RU      ∅

Tautologies
Still, there are other ways to derive random hypotheses that are not prevented by the first proposal for the set of abnormalities Ω1. For instance, Ω1 does not prevent random hypotheses from being derived from a tautology, as illustrated by the following example. As it is impossible in the following proof from the premise set ∅ to unconditionally derive the abnormality in the condition of line 3, the formula of line 3, the random hypothesis ♦Xa, remains derived in every possible extension of the proof.

1   □(Qa ∨ ¬Qa)               -;RU      ∅
2   □∀x(Xx ⊃ (Qx ∨ ¬Qx))      -;RU      ∅
3   ♦Xa                       1,2;RC    {□(∀x(Xx ⊃ (Qx ∨ ¬Qx)) ∧ ((Qa ∨ ¬Qa) ∧ ¬Xa))}
Therefore, let the set of abnormalities be adjusted to obtain the second proposal Ω2.

Second Proposal Ω2
No hypothesis can be abduced from a tautology if the abnormalities have the following form:

Ω2 = {□(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) ∨ □∀αB(α) | No predicate occurring in B occurs in A}

It is clear that one can keep using the simple strategy with this new set of abnormalities. It is also easily seen that all of the advantages and examples described above still hold. Each time one can derive an abnormality of Ω1, one can derive the corresponding abnormality of Ω2 by a simple application of the addition rule. Finally, the problem raised by tautologies, as illustrated in the previous example, is solved in an elegant way, because the form of the abnormalities ensures that the abnormality will always be a theorem in case the explanandum is a theorem. So, nothing can be abduced from tautologies.

Most Parsimonious Explanantia
Still, there is a third way to derive random hypotheses that cannot be prevented by Ω2. Consider, for instance, the following proof from the premise set {□Ra, □∀x(Px ⊃ Rx)}:

1   □Ra                       -;PREM    ∅
2   □∀x(Px ⊃ Rx)              -;PREM    ∅
3   □∀x((Px ∧ Xx) ⊃ Rx)       2;RU      ∅
4   ♦(Pa ∧ Xa)                1,3;RC    {□(∀x((Px ∧ Xx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∧ Xa))) ∨ □∀xRx}
5   ♦Xa                       4;RU      {□(∀x((Px ∧ Xx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∧ Xa))) ∨ □∀xRx}
The reason why the random hypothesis ♦Xa can be derived is the absence of a mechanism to ensure that the abduced hypothesis is the most parsimonious one and not the result of strengthening the antecedent of an implication. Before defining the final and actual set of abnormalities that also prevents this way of generating random hypotheses, a new notation has to be introduced to keep things as perspicuous as possible.

Notation
Suppose A^{PCN}(α) is the prenex conjunctive normal form of A(α). This is an equivalent form of the formula A(α) where all quantifiers are first moved to the front of the expression and where, consequently, the remaining (quantifier-free) expression is written in conjunctive normal form, i.e., as a conjunction of disjunctions of literals. Hence, apart from the quantifiers, which are all at the front, it is entirely made up of a big conjunction of subformulae.

A^{PCN}(α) = (Q_1γ_1) … (Q_mγ_m)(A_1(α) ∧ … ∧ A_n(α)) and A^{PCN}(α) ≡ A(α)   (14)
with m ≥ 0, n ≥ 1, Q_i ∈ {∀, ∃} for i ≤ m, γ_i ∈ V for i ≤ m, α ∈ V, and A_i(α) disjunctions of literals in F for i ≤ n. Then, the new notation A_i^{-1}(α) (1 ≤ i ≤ n) can be introduced so that there is a way to take out one of the conjuncts of a formula in PCN form. In cases where the conjunction consists of only one conjunct (and, obviously, no more parsimonious explanation is possible), the substitution with a random tautology will make sure that the condition for parsimony, added in the next set of abnormalities, is satisfied trivially.

if n > 1: A_i^{-1}(α) =df (Q_1γ_1) … (Q_mγ_m)(A_1(α) ∧ … ∧ A_{i−1}(α) ∧ A_{i+1}(α) ∧ … ∧ A_n(α)), with A_j (1 ≤ j ≤ n) the j-th conjunct of A^{PCN}(α)   (15)

if n = 1: A_1^{-1}(α) =df ⊤, with ⊤ any tautology of CL   (16)

Final Proposal Ω
With this notation the logical form of the set of abnormalities Ω of the logic MLAss can be written down.

Ω = {□(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) ∨ □∀αB(α) ∨ ⋁_{i=1}^{n} □∀α(A_i^{-1}(α) ⊃ B(α)) | No predicate occurring in B occurs in A}

This form might look complex, but its functioning is quite straightforward. What is constructed is the disjunction of the three reasons why one should refrain from considering A(β) as a good explanatory hypothesis for the phenomenon B(β), even if one has □(∀α)(A(α) ⊃ B(α)). The disjunction ensures that the hypothesis A(β) is rejected as soon as one of the following is the case: (i) when □¬A(β) is derived, (ii) when B(β) is a tautology (and, obviously, does not need an explanatory hypothesis), or (iii) when A(β) has a redundant part and is therefore not an adequate explanatory hypothesis. For the same reasons as stated in the description of Ω2, one can keep using the simple strategy, and all of the advantages and examples described above will still hold. Let us look at how this final set of abnormalities solves the previous problem. As the condition is fully written out, one can easily see that the third disjunct □∀x(Px ⊃ Rx) is actually a premise and that, hence, the abnormality in the condition of line 4 is unconditionally derivable.

1   □Ra                       -;PREM    ∅
2   □∀x(Px ⊃ Rx)              -;PREM    ∅
3   □∀x((Px ∧ Qx) ⊃ Rx)       2;RU      ∅
4   ♦(Pa ∧ Qa)                1,3;RC    {□(∀x((Px ∧ Qx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∧ Qa))) ∨ □∀xRx ∨ □∀x(Px ⊃ Rx) ∨ □∀x(Qx ⊃ Rx)}   ✓5
5   □(∀x((Px ∧ Qx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∧ Qa))) ∨ □∀xRx ∨ □∀x(Px ⊃ Rx) ∨ □∀x(Qx ⊃ Rx)     2;RU      ∅
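The conjunct-removal notation A_i^{-1} and the parsimony disjunct of Ω can be illustrated with a small Python sketch. It is only a simplified illustration under stated assumptions: quantifier prefixes are ignored, a conjunct is represented as a set of literal names, and "derivability" of a weakened implication is approximated by looking it up in a list of antecedents already known to imply the explanandum. The names `drop_conjunct`, `weakenings`, `has_redundant_part`, and `TOP` are hypothetical and do not come from the chapter.

```python
from typing import FrozenSet, List, Union

Clause = FrozenSet[str]   # one conjunct: a disjunction of literals, e.g. {"P"}
CNF = List[Clause]        # quantifier-free matrix in conjunctive normal form
TOP = "TOP"               # placeholder for "any tautology of CL"


def drop_conjunct(cnf: CNF, i: int) -> Union[CNF, str]:
    """Return the matrix of A_i^{-1}: cnf without its i-th conjunct (1-based)."""
    if not 1 <= i <= len(cnf):
        raise IndexError("i must satisfy 1 <= i <= n")
    if len(cnf) == 1:
        return TOP                      # case n = 1, definition (16)
    return cnf[: i - 1] + cnf[i:]       # case n > 1, definition (15)


def weakenings(cnf: CNF) -> List[Union[CNF, str]]:
    """All antecedents obtained by taking out exactly one conjunct."""
    return [drop_conjunct(cnf, i) for i in range(1, len(cnf) + 1)]


def _key(matrix: Union[CNF, str]):
    # Compare matrices regardless of the order of their conjuncts.
    return matrix if isinstance(matrix, str) else frozenset(matrix)


def has_redundant_part(cnf: CNF, known_antecedents: List[Union[CNF, str]]) -> bool:
    """Assumption: a weakened antecedent counts as sufficient only if it is
    literally listed in known_antecedents (a stand-in for derivability)."""
    known = {_key(k) for k in known_antecedents}
    return any(_key(w) in known for w in weakenings(cnf))


p, q = frozenset({"P"}), frozenset({"Q"})
known = [[p]]                                   # P alone is known to imply R

print(has_redundant_part([p, q], known))        # True: Q is a redundant part
print(has_redundant_part([p], known))           # False: P is already minimal
```

In the proof above, this corresponds to the hypothesis ♦(Pa ∧ Qa) being rejected because the weakened implication with antecedent Px is a premise, whereas a hypothesis ♦Pa abduced from the same premises would not be rejected on grounds of parsimony.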
This concludes the informal presentation of this logic, which, in its final form, meets all desiderata put up front.
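As a complement to the informal presentation, the role of seriality in the lower limit logic D can be checked on small finite Kripke models. The following Python sketch is a minimal propositional evaluator written for illustration only; first-order formulas such as Pa are treated as atoms, and the class name `KripkeModel` and the tuple encoding of formulas are assumptions of this sketch, not part of the chapter. It shows that a non-serial model can satisfy both □p and □¬p (so K does not treat such a premise set as inconsistent), and that a serial model can verify ♦p and ♦¬p without verifying ♦(p ∧ ¬p).

```python
class KripkeModel:
    """Finite Kripke model for a propositional modal language.

    Formulas are atom names (str) or tuples:
    ("not", f), ("and", f, g), ("box", f), ("dia", f).
    """

    def __init__(self, worlds, access, valuation):
        self.worlds = set(worlds)
        self.access = set(access)          # pairs (w, w2) meaning R w w2
        self.valuation = dict(valuation)   # world -> set of true atoms

    def successors(self, w):
        return {v for (u, v) in self.access if u == w}

    def is_serial(self):
        # Every world must see at least one world.
        return all(self.successors(w) for w in self.worlds)

    def holds(self, w, f):
        if isinstance(f, str):
            return f in self.valuation.get(w, set())
        op = f[0]
        if op == "not":
            return not self.holds(w, f[1])
        if op == "and":
            return self.holds(w, f[1]) and self.holds(w, f[2])
        if op == "box":
            return all(self.holds(v, f[1]) for v in self.successors(w))
        if op == "dia":
            return any(self.holds(v, f[1]) for v in self.successors(w))
        raise ValueError(f"unknown operator: {op!r}")


# A single dead-end world: a K-model that is not a D-model.
dead_end = KripkeModel({"w0"}, set(), {"w0": set()})

# A serial model: w0 sees a p-world and a non-p-world, each seeing itself.
serial = KripkeModel(
    {"w0", "w1", "w2"},
    {("w0", "w1"), ("w0", "w2"), ("w1", "w1"), ("w2", "w2")},
    {"w0": set(), "w1": {"p"}, "w2": set()},
)

box_p, box_not_p = ("box", "p"), ("box", ("not", "p"))
dia_p, dia_not_p = ("dia", "p"), ("dia", ("not", "p"))
dia_contradiction = ("dia", ("and", "p", ("not", "p")))

print(dead_end.is_serial())                                   # False
print(dead_end.holds("w0", ("and", box_p, box_not_p)))        # True
print(serial.is_serial())                                     # True
print(serial.holds("w0", ("and", dia_p, dia_not_p)))          # True
print(serial.holds("w0", dia_contradiction))                  # False
```

The dead-end model is the reason the D-axiom is needed: only when every world has a successor do □p and □¬p exclude each other.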
Conclusions
Quite some ground has been covered in this chapter, the main purpose of which was to show, in a direct yet nuanced fashion, the feasibility and the limits of modeling hypothetical reasoning by means of formal logics. The chapter started with a general argument for this claim, showing which assumptions one has to accept or reject to take this view. While the feasibility of this project was argued for, attention was also drawn to certain limits, pitfalls, and disadvantages of it. This discussion was then expanded by identifying four main abduction patterns, which showed that not every pattern of hypothetical reasoning can be modeled equally easily. In the second part of the chapter, gears were shifted, and a glimpse was given of what is already possible today with current logical techniques, by explaining in detail two logics originating in the adaptive logics framework: LArs for practical fact abduction and MLAss for theoretical singular fact abduction. The purpose of including the full details of these logics is threefold. First, it shows the reader how certain steps, which are admittedly modest, can be taken in the project of formally modeling hypothetical reasoning. At the same time, the reader is introduced to the unifying framework of adaptive logics, which shows promise for taking some further steps along the road. Finally, it also shows that the use of formal models draws attention to various issues about these reasoning patterns that were previously left unattended, for example, the difference between practical and theoretical abduction or the importance of avoiding random hypotheses by restricting the use of tautologies and contradictions. However, if one looks at the prospect of modeling abductive reasoning by means of formal (adaptive) logics, one has to conclude that so far only the surface has been scratched. At present, apart from a single exception, logics have been devised only for singular fact abduction, which is in fact the easiest of the various patterns of abduction. Yet the complications that already arise on this level warn dreamers that the road ahead will be steep and arduous.
Appendix: Formal Presentations of the Logics LArs and MLAss
In this appendix, the logics LArs and MLAss will, for the sake of completeness, be defined in a formal and precise way. This section is limited to what is needed to present these specific logics. For a more general formal presentation of adaptive logics in standard format, see Batens (2007). Like any adaptive logic in standard format, the logics LArs and MLAss are characterized by the triple of a lower limit logic, a set of abnormalities, and an adaptive strategy. For LArs, the lower limit logic is CL, the strategy is the reliability strategy, and the set of abnormalities Ω_LArs is defined by:

Ω_LArs = {(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) | No predicate occurring in B occurs in A}

For MLAss, the lower limit logic is D, the strategy is the simple strategy, and the set of abnormalities Ω_MLAss is, relying on the previously introduced abbreviation, defined by:

Ω_MLAss = {□(∀α(A(α) ⊃ B(α)) ∧ (B(β) ∧ ¬A(β))) ∨ □∀αB(α) ∨ ⋁_{i=1}^{n} □∀α(A_i^{-1}(α) ⊃ B(α)) | No predicate occurring in B occurs in A}

Proof Theory
The proof theory of these logics is characterized by the three generic inference rules introduced in Section 2 and the following definitions. Within adaptive logics, proofs are considered to be chains of subsequent stages. A stage of a proof is a sequence of lines obtained by application of the three generic rules. As such, every proof starts off with the first stage, which is an empty sequence. Each time a line is added to the proof by applying one of the inference rules, the proof comes to its next stage, which is the sequence of lines written so far extended with the new line.

Definition 1 (Minimal Dab-formula at stage s). A Dab-formula Dab(Δ) (where Dab(Θ) is the (classical) disjunction of the abnormalities in a finite subset Θ of the set of abnormalities Ω) is a minimal Dab-formula at stage s if and only if Dab(Δ) is derived on the empty condition at stage s, and there is no Δ′ ⊂ Δ for which Dab(Δ′) is derived on the empty condition at stage s.

Definition 2 (Set of unreliable formulas U_s(Γ) at stage s). The set of unreliable formulas U_s(Γ) at stage s is the union of all Δ for which Dab(Δ) is a minimal Dab-formula at stage s.

Definition 3 (Marking for the reliability strategy). Line i with condition Θ is marked for the reliability strategy at stage s of a proof if and only if Θ ∩ U_s(Γ) ≠ ∅.

Definition 4 (Marking for the simple strategy). Line i with condition Θ is marked for the simple strategy at stage s of a proof if stage s contains a line of which an A ∈ Θ is the formula and ∅ the condition.

Definition 5 (Derivation of a formula at stage s). A formula A is derived from Γ at stage s of a proof if and only if A is the formula of a line that is unmarked at stage s.

Definition 6 (Final derivation of a formula at stage s). A formula A is finally derived from Γ at stage s of a proof if and only if A is derived at line i, line i is not marked at stage s, and every extension of the proof in which i is marked may be further extended in such a way that line i is unmarked.

Using the simple strategy, it is not possible that a marked line becomes unmarked at a later stage of a proof. Therefore, the final criterion reduces for this strategy to the requirement that the line remains unmarked in every extension of the proof.

Definition 7 (Final derivability for LArs). Γ ⊢_LArs A (A ∈ Cn_LArs(Γ)) if and only if A is finally derived in an LArs-proof from Γ.

Definition 8 (Final derivability for MLAss). For Γ ⊂ WΓ: Γ ⊢_MLAss A (A ∈ Cn_MLAss(Γ)) if and only if A is finally derived in an MLAss-proof from Γ.

Semantics
The semantics of an adaptive logic is obtained by a selection on the models of the lower limit logic.
For a more elaborate discussion of the following definitions, the reader is referred to the original articles and the aforementioned theoretical overviews of adaptive logics.

Definition 9. A CL-model M of the premise set Γ is reliable if and only if {A ∈ Ω | M ⊨ A} ⊆ Δ_1 ∪ Δ_2 ∪ … with {Dab(Δ_1), Dab(Δ_2), …} the set of minimal Dab-consequences of Γ.

Definition 10. A D-model M of the premise set Γ is simply all right if and only if {A ∈ Ω | M ⊨ A} = {A ∈ Ω | Γ ⊢_D A}.
Definition 11 (Semantic consequence of LArs). Γ ⊨_LArs A if and only if A is verified by all reliable models of Γ.

Definition 12 (Semantic consequence of MLAss). For Γ ⊂ WΓ: Γ ⊨_MLAss A if and only if A is verified by all simply all right models of Γ.

The fact that these two logics are in standard format warrants that the following theorems hold:

Theorem 1 (Soundness and completeness of LArs). Γ ⊢_LArs A if and only if Γ ⊨_LArs A.

Theorem 2 (Soundness and completeness of MLAss). Γ ⊢_MLAss A if and only if Γ ⊨_MLAss A.
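The marking definitions of the proof theory (Definitions 1 to 4) can be made concrete with a short Python sketch. This is an illustration under simplifying assumptions, not an implementation of the logics: formulas and abnormalities are opaque strings, no inference is performed, and a line that carries a Dab-formula is tagged by hand with the set Δ of its disjuncts. The names `Line`, `minimal_dab_sets`, `unreliable`, `marked_reliability`, and `marked_simple` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Line:
    number: int
    formula: str
    condition: FrozenSet[str]               # set of abnormalities, possibly empty
    dab: Optional[FrozenSet[str]] = None    # Delta, if the formula is Dab(Delta)


def minimal_dab_sets(stage: List[Line]) -> Set[FrozenSet[str]]:
    """Definition 1: Dab-formulas on the empty condition with no proper
    subset that is also derived on the empty condition."""
    derived = {l.dab for l in stage if l.dab is not None and not l.condition}
    return {d for d in derived if not any(other < d for other in derived)}


def unreliable(stage: List[Line]) -> Set[str]:
    """Definition 2: union of the disjuncts of all minimal Dab-formulas."""
    result: Set[str] = set()
    for delta in minimal_dab_sets(stage):
        result |= delta
    return result


def marked_reliability(stage: List[Line]) -> Set[int]:
    """Definition 3: marked iff the condition overlaps the unreliable set."""
    u = unreliable(stage)
    return {l.number for l in stage if l.condition & u}


def marked_simple(stage: List[Line]) -> Set[int]:
    """Definition 4: marked iff a member of the condition is itself the
    formula of a line with empty condition."""
    unconditional = {l.formula for l in stage if not l.condition}
    return {l.number for l in stage if l.condition & unconditional}


# The falsification proof from the main text, with the abnormality abbreviated.
ABN = "ABN_Pa"
falsified = [
    Line(5, "dia Pa", frozenset({ABN})),
    Line(9, "dia Sa", frozenset({ABN})),
    Line(12, ABN, frozenset(), dab=frozenset({ABN})),
]
print(sorted(marked_simple(falsified)))          # [5, 9]

# A stage where only a disjunction of two abnormalities is derived.
stage_a = [
    Line(1, "h", frozenset({"A1"})),
    Line(2, "A1 v A2", frozenset(), dab=frozenset({"A1", "A2"})),
]
print(sorted(marked_simple(stage_a)))            # []
print(sorted(marked_reliability(stage_a)))       # [1]

# Extending the stage with A2 alone makes Dab({A1, A2}) non-minimal.
stage_b = stage_a + [Line(3, "A2", frozenset(), dab=frozenset({"A2"}))]
print(sorted(marked_reliability(stage_b)))       # []
```

The last two stages show why marks can disappear under the reliability strategy, whereas under the simple strategy a line marked once stays marked in every extension.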
References
Batens, D. (2007). A universal logic approach to adaptive logics. Logica Universalis, 1, 221–242.
Batens, D., Meheus, J., Provijn, D., & Verhoeven, L. (2003). Some adaptive logics for diagnosis. Logic and Logical Philosophy, 11/12, 39–65.
Beirlaen, M., & Aliseda, A. (2014). A conditional logic for abduction. Synthese, 191(15), 3733–3758.
Gauderis, T. (2011). An adaptive logic based approach to abduction in AI.
Gauderis, T. (2012). The problem of multiple explanatory hypotheses.
Gauderis, T. (2013a). Modelling abduction in science by means of a modal adaptive logic. Foundations of Science, 18(4), 611–624.
Gauderis, T. (2013b). Patterns of hypothesis formation: At the crossroads of philosophy of science, logic, epistemology, artificial intelligence and physics.
Gauderis, T. (2014). To envision a new particle or change an existing law? Hypothesis formation and anomaly resolution for the curious spectrum of the β decay spectrum. Studies in History and Philosophy of Modern Physics, 45(1), 27–45.
Gauderis, T., & Van De Putte, F. (2012). Abduction of generalizations. Theoria, 27(3), 345–363.
Hanson, N. R. (1958). Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge University Press.
Hanson, N. R. (1961). Is There a Logic of Scientific Discovery? (pp. 20–35). Holt, Rinehart and Winston.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74(1), 88–95.
Hoffmann, M. (2010). Theoric transformations and a new classification of abductive inferences. Transactions of the Charles S. Peirce Society, 46(4), 570–590.
Koons, R. (2014). Defeasible Reasoning (spring 2014 ed.). Stanford University.
Lycke, H. (2009). The adaptive logics approach to abduction.
Lycke, H. (2012). A formal explication of the search for explanations: The adaptive logics approach to abductive reasoning. Logic Journal of IGPL, 20(2), 497–516.
Magnani, L. (2001). Abduction, Reason and Science: Processes of Discovery and Explanation. Kluwer/Plenum.
Meheus, J. (2007). Adaptive Logics for Abduction and the Explication of Explanation-Seeking Processes (pp. 97–119). Centro de Filosofia das Ciencias.
Meheus, J. (2011). A Formal Logic for the Abduction of Singular Hypotheses (pp. 93–108). Springer.
Meheus, J., & Batens, D. (2006). A formal logic for abductive reasoning. Logic Journal of IGPL, 14, 221–236.
Meheus, J., & Provijn, D. (2007). Abduction through semantic tableaux versus abduction through goal-directed proofs. Theoria, 22(3), 295–304.
Meheus, J., Verhoeven, L., Van Dyck, M., & Provijn, D. (2002). Ampliative Adaptive Logics and the Foundation of Logic-Based Approaches to Abduction (pp. 39–71). Kluwer Academic.
Nickles, T. (1980). Introductory Essay: Scientific Discovery and the Future of Philosophy of Science (pp. 1–59). Reidel.
Paul, P. (2000). AI approaches to abduction. In Handbook of Defeasible Reasoning and Uncertainty Management Systems (Vol. 4, pp. 35–98). Kluwer Academic.
Popper, K. (1959). The Logic of Scientific Discovery. Routledge.
Provijn, D. (2012). The generation of abductive explanations from inconsistent theories. Logic Journal of the IGPL, 20(2), 400–416.
Rescher, N. (1964). Hypothetical Reasoning. North-Holland.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Simon, H. (1973). Does scientific discovery have a logic? Philosophy of Science, 40, 471–480.
Straßer, C. (2013). Adaptive Logics for Defeasible Reasoning: Applications in Argumentation, Normative Reasoning and Default Reasoning. Springer.
Thagard, P. (1988). Computational Philosophy of Science. MIT Press.
Thagard, P., & Shelley, C. (1997). Abductive Reasoning: Logic, Visual Thinking, and Coherence (Synthese Library, Vol. 259, pp. 413–427). Kluwer Academic.
Part IV Abduction and Medicine: Diagnosis, Treatment, and Prevention
19 Introduction to Abduction and Medicine: Diagnosis, Treatment, and Prevention
Daniele Chiffi and Mattia Andreoletti

Contents
Abduction and Medicine
References
Abstract
The relation between a patient’s signs and symptoms and codified clinical and epidemiological knowledge can be extremely complex and usually provides the ground to formulate a clinical judgment. The chapters of this part focus in particular on the role of abductive reasoning in clinical practice and critically investigate the contribution of abductive inferences in formulating medical judgments in the fields of diagnostics and prognostics and in the selection of treatment options.

Keywords
Abduction · Medicine · Diagnosis · Treatment · Prevention · Clinical reasoning · Gabbay-Woods schema

D. Chiffi · M. Andreoletti
DAStU, Politecnico di Milano, Milan, Italy
M. Andreoletti
Faculty of Philosophy, Università Vita-Salute San Raffaele, Milan, Italy
Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_83
Abduction and Medicine
The acts of formulating, testing, confirming, or rejecting a hypothesis are essential ingredients not just of scientific reasoning but also of clinical reasoning. This requires, first of all, mastery of different methodologies together with critical reflection on their value and limitations. An extensive literature in medicine has appealed to particularly sophisticated biostatistical tools and methods in order to support epidemiological research and clinical practice, which find a systematic synthesis in the evidence-based medicine (EBM) approach. Far less research in medicine has focused on abductive forms of reasoning, which is quite surprising, since in many cases we need to reason from facts and data in order to formulate or select promising hypotheses. Even though research on abductive reasoning in medicine has been ongoing for many decades, clear and explicit analyses of abduction in clinical research and practice are not very frequent. This is unfortunate, since abduction, understood as a potentially fallible form of reasoning that suggests a specific hypothesis for further exploration and possible testing, is a crucial aspect of virtually all medical judgments concerning diagnosis, prognosis, treatment, and prevention (Chiffi, 2021). Before a clinical hypothesis is tested statistically, many pragmatic considerations regarding the amount of money, energy, and time required to design and begin a clinical study need to be envisaged and critically evaluated in conjunction with an ethical evaluation of the scientific hypothesis. The choice of working hypotheses in medicine depends not only on epistemic considerations but also on ethical and economic evaluations.
This modern view of scientific and clinical research is indeed coherent with Peirce’s methodology of the “Economy of Research,” which may provide guidance in selecting those hypotheses that are worth testing, in particular when there is scarcity of resources (Rescher, 1976; Chiffi et al., 2020). Finally, as noted by Woods (2012), there may also be scarcity of cognitive resources for evaluating a hypothesis. Even more crucial is perhaps the role of abductive reasoning in clinical practice. The relation between a patient’s signs and symptoms and codified clinical and epidemiological knowledge can be extremely complex and usually provides the ground to formulate a clinical judgment. The idiographic methodology in clinical reasoning tries to make sense of individual cases and events that might be even quite dissimilar from the evidence collected at population level, given the extreme variability of individual medical conditions. Abduction may help the clinicians to select those aspects that are relevant for understanding the interplay between population-based evidence and the peculiarities of a single clinical case. This is particularly relevant when there is the necessity of formulating judgments regarding individual risk. At a first stage, for instance, a risk measure (or any other clinically relevant indicator) in an epidemiologic study is usually generalized or extrapolated to a target population (Fuller & Flores, 2015). In other words, general categories of hypotheses are selected to constrain the range of possible alternatives in order to assess the proper initial space of all hypotheses. This is a form of abductive reasoning, which has been called unfocused abduction (Ramoni et al., 1992). Then,
at a second stage, the risk measure (or any other clinically relevant indicator) is particularized to a patient from the target population; thus, general hypotheses need to be shaped and refined based on the relevant clinical characteristics of the patient. This process is not so straightforward: it requires the initial selection of a good general hypothesis and its comparison with more specific hypotheses that are selected in a context defined by the general hypothesis. This kind of abduction has been labeled focused abduction (Ramoni et al., 1992). In comparing the general hypothesis with more specific ones, it is possible to obtain a refinement of the initial general hypothesis as well as either an alternative or a complementary one. The dynamics of unfocused and focused abduction thus represents a clinical strategy to shape and select hypotheses to be evaluated in different kinds of clinical judgments.

Chapters in this Part of the Handbook discuss abduction mostly in the context of clinical reasoning. They address classic themes in clinical reasoning, enriched by innovative views on abductive methodologies. Carlo Martini offers an overview of abduction in medical diagnosis. Besides presenting the classic literature on the topic, he draws on the latest empirical research to understand how abductive reasoning takes place in clinical practice. This approach has the advantage of making the role of abductive inferences very concrete. Raffaella Campaner and Fabio Sterpetti focus instead on treatment assignments. They provide a philosophical analysis of the use of abduction in the selection and evaluation of evidence when addressing clinical decisions on therapeutic strategies. They also present a few case studies to better understand the role of abduction in actual scenarios. Chiffi and Andreoletti complete the overview of clinical decision-making areas, focusing on prognosis. They critically discuss the logical and epistemological foundations of prognostic reasoning and show that the latter can be easily understood in terms of abduction. In fact, prognosis is a form of decision-making under uncertainty, and abductive reasoning seems to deal with it very well. Finally, Cristina Barés Gómez and Matthieu Fontaine suggest that diagnostic, therapeutic, and monitoring hypotheses can be connected within the Gabbay-Woods model of abduction. In such a model, abduction is considered as an ignorance-preserving inference. The basic idea is that diagnostic hypotheses can be activated in different abductions, leading to treatment hypotheses and monitoring procedures without previous confirmation. Their chapter also includes a thoughtful discussion of the debate between the “mechanistic” and the “probabilistic” approach to causality in medicine.

All in all, the following chapters cover all the relevant areas of medical decision-making in which abductive reasoning plays a key role. The chapters clearly present the state of the art of the philosophical research on the topic and add some original considerations that may lay the basis for novel research. The authors have also made an effort to introduce their arguments in concrete terms rather than discussing them in the abstract. Therefore, this Part of the Handbook can be appreciated by scholars in different disciplines, including physicians and medical researchers.
References
Chiffi, D. (2021). Clinical reasoning: Knowledge, uncertainty, and values in health care. Springer.
Chiffi, D., Pietarinen, A. V., & Proover, M. (2020). Anticipation, abduction and the economy of research: The normative stance. Futures, 115, 102471.
Fuller, J., & Flores, L. J. (2015). The risk GP model: The standard model of prediction in medicine. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 49–61.
Ramoni, M., Stefanelli, M., Magnani, L., & Barosi, G. (1992). An epistemological framework for medical knowledge-based systems. IEEE Transactions on Systems, Man, and Cybernetics, 22(6), 1361–1375.
Rescher, N. (1976). Peirce and the economy of research. Philosophy of Science, 43(1), 71–98.
Woods, J. (2012). Cognitive economics and the logic of abduction. The Review of Symbolic Logic, 5(1), 148–161.
20 Abduction in Prognostic Reasoning
Daniele Chiffi and Mattia Andreoletti

Contents
Introduction
Foundations of Prognostication
Abduction in Prognostic Reasoning
Prognosis, Hypothesis, and Evidence
General and Particular Medical Assertions
Prognostic Judgment: Structure and Uncertainty
Conclusion
References
Abstract
One of the most important activities of medicine is the act of prognostication. Indeed, the success of medical practice also depends on the ability of physicians to make reliable and accurate predictions about the course of diseases and patients’ health. Traditionally, however, the concept of prognosis has been partially overlooked both by clinicians and by philosophers of science. In this chapter, the authors aim to fill this gap, focusing on the logical and epistemological aspects of prognostic reasoning, which has usually been understood in terms of abduction. In fact, any prognosis has to deal with some kind of uncertainty, and abductive forms of reasoning seem to cope with it well. The main aim of this chapter is to summarize and critically discuss all the major findings directly related to prognostic and abductive reasoning in both the philosophical and the scientific literature.

Keywords
Prognosis · Medical reasoning · Medical assertions · Uncertainty · Epidemiology · Evidence-based medicine

D. Chiffi
DAStU, Politecnico di Milano, Milan, Italy
M. Andreoletti
Faculty of Philosophy, Università Vita-Salute San Raffaele, Milan, Italy
Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_11
Introduction
This chapter focuses on prognostication in clinical practice. As discussed later, prognostic judgments are well understood as the result of abductive inferences. Analyzing abduction in prognostic reasoning reveals some specific features of this form of reasoning in medicine while also shedding light on its epistemic merits and weaknesses. Prognostication is likely one of the most challenging activities physicians have to face. Indeed, precisely assessing the complex factors surrounding a particular medical condition and a specific patient, and inferring from them a reliable prediction of the course of the disease, is a very difficult task. A better grasp of prognostic reasoning, and of its logical and epistemological foundations, can therefore help physicians formulate better judgments. So far, however, medicine has focused mostly on increasing its diagnostic abilities. Following this tendency, the philosophical discourse on medicine has also concerned itself predominantly with the critical analysis of the concept of diagnosis. In contrast, there has been much less interest on either side in systematic investigations of prognosis. This chapter aims to summarize and critically discuss all the recent findings directly related to prognostic reasoning and abductive reasoning in both the philosophical and the scientific literature.
Prognostic reasoning is introduced in section “Foundations of Prognostication,” which presents its foundations and its specific issues. As just mentioned, prognosis has traditionally been overlooked by both clinicians and philosophers of science for a plethora of reasons. Nonetheless, patients expect physicians to offer them a reliable and accurate prognosis, since this is essential for making meaningful decisions and planning ahead for the course of the disease. Prognostic reasoning is understood in terms of reasoning under uncertainty and is based on different types of risks and uncertainties.
Abduction is an essential tool in clinical reasoning and prognostication (section “Abduction in Prognostic Reasoning”). Some examples of abductive inferences from medical practice are also offered. Furthermore, abduction must be distinguished from inference to the best explanation, and the two should not be confused. Section “Prognosis, Hypothesis, and Evidence” focuses on the role and the justification conditions of clinical judgments in dealing with fundamental forms of uncertainty in future scenarios. It analyzes the role of prognostic factors in supporting prognostic and diagnostic judgments and highlights the difficulties of shaping prognostic judgments on the basis of epidemiological data and clinical signs.
20 Abduction in Prognostic Reasoning
Sections “General and Particular Medical Assertions” and “Prognostic Judgment: Structure and Uncertainty” move toward a more analytic treatment of prognostic judgments. Focusing on prognostic judgment, the authors point out that a prognosis is usually communicated as a form of verdictive speech act, which often occurs under fundamental uncertainty and which involves making a claim about a future clinical event. Finally, section “Conclusion” concludes the chapter.
Foundations of Prognostication
Prognostication has been somewhat left on the sidelines of current medical practice (Rich, 2001). A prognosis is the end result of a fundamental clinical assessment that focuses mainly on the future, which is why it is considered more difficult than establishing diagnoses or making treatment decisions (Christakis & Sachs, 1996). Even when a diagnosis has been soundly formulated, the prognosis may be uncertain because our understanding of pathological conditions is always incomplete, and diseases may develop in unpredictable ways (Austoni & Federspil, 1975). This means that prognostic judgment faces fundamental uncertainty (also known as “Keynesian” or “Knightian” uncertainty) (Keynes, 2008; Knight, 1921), since in some cases probability measures can hardly be assigned to the specific future events that shape this type of judgment. In fact, relevant prognostic information may not be available when a prognosis is formulated, which reduces the potential to predict the course of the patient’s disease. Despite this remarkable fact, prognostics is still fully based on the concept of risk, a notion that may be used in the context of certainty or uncertainty, but not in the context of fundamental uncertainty. The aim of prognostics is traditionally to judge: (i) the duration of a future disease stage in a given patient (prognosi quoad tempus); (ii) the probability of a patient recovering from a disease (prognosi quoad valetudinem); or (iii) the chances of a patient’s survival (prognosi quoad vitam) (Rizzi, 1993). All these types of prognoses are made with a margin of uncertainty, since a prognosis is usually formulated as the probability of an outcome and/or of a patient’s health taking a certain course (in terms of a particular disease) within a specified time interval (Djulbegovic et al., 2011; Miettinen, 2011).
Yet this classical view of prognostication makes substantial use of the notion of probabilistic risk, even though this notion may be unsuitable when dealing with the fundamental uncertainty related to the intricacies of disease development and patients’ decisions. Moreover, this type of uncertainty also has cognitive and emotional factors associated with a patient’s expectations. A patient may, for instance, fear knowing his or her future. When a prognosis is very poor, physicians may prefer not to inform their patients. Physicians tend to consider such a prognosis as a self-fulfilling prophecy, because it is usually believed to occur in association with clinical treatments or placebo (or nocebo) effects (Christakis, 1999; Rich, 2002). A self-fulfilling prophecy is a prediction that induces a person who believes in it to behave in a way that ultimately makes it come true. In the case of prognostics, the core idea is that
being informed of a poor prognosis can have a negative effect on a patient’s future health. However, the impression that this might play a part in the relations between physician and patient is questionable in a multi-agent context such as the clinical one. Seen from Lewis’s perspective (Lewis, 1975), in a multi-agent context, self-fulfilling beliefs require higher-order beliefs to justify them, combined with higher beliefs (or ideally common knowledge) shared by different epistemic agents. Such conditions are typically violated in the sphere of prognostics because of the asymmetry of information between physician and patient (Chiffi & Zanotti, 2017a). Prognostic beliefs can therefore hardly be considered self-fulfilling, since they occur in a multi-agent context in which there may be no converging higher beliefs for the patient and the physician. This being the case, physicians’ concerns about communicating a prognosis to patients often seem unwarranted. From the patients’ point of view, having to ignore or cope with severe and fundamental uncertainty about their future health is psychologically harmful (Smith et al., 2013). Patients’ hopes and expectations are based on higher-order beliefs concerning the prognostic information they receive from their doctors. To give an example, the famous biologist Stephen Jay Gould survived 20 years after being told that the median survival for patients with an abdominal mesothelioma like his was 8 months. His case shows that population-based statistics cannot predict what will happen in a specific case, a fact that should always be borne in mind when giving patients prognostic information. If, however, doctors come up with a wrong prognosis, they may negatively affect not only their patient’s health but also their own professional credibility. This shows why the pursuit of prognostic accuracy can be so frustrating, especially since predicting a patient’s future health conditions is by no means easy.
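The gap between a population median and an individual outcome can be made concrete with a small calculation. The sketch below is purely illustrative: it assumes, hypothetically, that survival times follow a right-skewed lognormal distribution with a median of 8 months and an arbitrarily chosen spread (`sigma`). Neither the distributional form nor the parameter values come from the chapter or from clinical data. The only point is that a median says half of the patients live longer, and a skewed distribution can leave a small but non-zero share of very long survivors.

```python
import math


def survival_probability(months, median_months=8.0, sigma=1.2):
    """Return P(T > months) for a lognormal survival time T.

    median_months and sigma are illustrative assumptions, not clinical data.
    """
    if months <= 0:
        return 1.0
    # Standardize log-time; the median of a lognormal is exp(mu).
    z = (math.log(months) - math.log(median_months)) / sigma
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf


if __name__ == "__main__":
    for horizon in (8, 24, 60, 240):
        prob = survival_probability(horizon)
        print(f"P(survival > {horizon:>3} months) = {prob:.4f}")
```

Under these assumed parameters, exactly half of the hypothetical population survives beyond the median, and the probability of long survival shrinks with the horizon without ever reaching zero. Which tail a particular patient falls into is not something the median alone can settle.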
Not all facets of prognostics are sensitive to doctors’ and patients’ beliefs. A prognosis is also influenced by prognostic markers (which are unmodifiable and uninfluenced by the course of the disease) and prognostic factors (which are modifiable and dependent on how the disease develops). Prognostic markers and factors include (i) the stage and severity of a pathological condition associated with a disease; (ii) an individual’s resistance to disease (individual variability, in terms of patients’ different predispositions to health and disease, was the focus of the research program of the Padua School of Constitutional Medicine, organized at the beginning of the twentieth century by Achille De Giovanni; see De Giovanni (1909)); (iii) a patient’s gender and age; and (iv) a patient’s preferences on treatments, living environment, behavior, and so forth (Del Mar et al., 2008). Elucidating the relationships between general and specific cases can be quite complicated, especially from a prognostic standpoint. There is no standardized or mechanistic way to relate a new patient to the results obtained from samples of other patients with the same disease (Hilden & Habbema, 1987; Thorne & Sawatzky, 2014). Moreover, given the fundamental uncertainty of prognostication, the set of prognostic factors may be incomplete or only loosely clinically significant. Like diagnoses, prognoses include a temporal factor, as they are likely to change over time. They can be classified within two main categories: nosographic
and pathophysiological (Federspil, 2004). A nosographic prognosis is based on a nosographic diagnosis and aims to predict when patients will reach a certain future stage of their disease. (Whereas a nosographic diagnosis is only classificatory within a taxonomy, a pathophysiological prognosis involves a causal explanation.) This prediction is based on statistical probabilities elicited from population-based data. A pathophysiological prognosis, on the other hand, is based on a pathophysiological diagnosis, meaning that the prediction of a future stage of disease in a given patient is in this case grounded on knowledge of the pathological explanation for his or her disease. Intuitively, if the initial condition occurs before a clinical judgment is established, then we have an explanation; if not, we have a prediction (as in the case of prognosis). Given the level of uncertainty inherent in medical knowledge, prognostic explanations do not usually involve universal laws, which is another reason why the concept of prognosis is not deterministic (Sadegh-Zadeh, 2012). Especially in the case of pathophysiological prognoses, knowing the probability of a disease stage occurring at population level may often be regarded as scarcely relevant to a specific patient, since patient-related factors can profoundly affect prognostication. But another source of uncertainty is also present in clinical practice: not all aspects of a given explanation are always reported in detail. Some pathophysiological laws may not be explicitly stated, for instance, leading to partial explanations that are usually termed “incomplete” or “elliptic.” A full explanation may instead be provided in a rational reconstruction once all the constituent parts of the explanation have been produced (Hempel, 1962). When doctors can count only on very patchy explanations and limited information, their prognostic judgment can be suspended while waiting for further signs to emerge in the natural history of the disease.
It is occasionally possible to infer a simple prognosis from the identification of a patient’s signs and symptoms alone. The level of uncertainty of any prognosis, partly due to the inherent difficulty of foreseeing the course of a disease in a particular patient, makes it reasonable to assume that some hypotheses (formulated in the light of a diagnosis) may not make sense of what is happening to a patient’s health and may thus need to be reconsidered. In this case, new hypotheses (not taken into account during the initial diagnostic workup) may be formulated, leading to an adjustment of the original diagnosis. Finally, a prognosis is shaped by the therapeutic options available for a given condition. Physicians may predict the future course of a disease based on the alleged efficacy of the treatment chosen for a given patient. After reviewing all the subsequent stages of the patient’s clinical management, doctors should ideally arrive at a final retrospective assessment (epicrisis) of all possible diagnostic, prognostic, or treatment errors.
Abduction in Prognostic Reasoning
Abductive reasoning has been recognized as an essential tool in the clinical setting. According to Upshur, for instance, “inferences in clinical medicine [...] should be regarded as tentative statements that the inference holds, is provisionally the case, or
pragmatically justifies action” (Upshur, 1997, p. 204). Both the justification and the growth of our medical knowledge can be analyzed using the concept of abduction, which may very well serve the purpose of grounding our clinical reasoning. Abductive inference is the most common type of inference and, in contrast to the deductive and inductive types, the most applicable in diagnostics, since judgments can be made on the basis of the available information, even if it is incomplete. It is assumed that abductive inference is, in a narrow sense, the only ampliative form of reasoning. The following justifications are usually acknowledged by those making abductive inferences: (i) an unexpected and interesting observation stems from the premises of an abductive inference and (ii) there is an increase in the initial plausibility of the hypothesis expressed in the conclusion of the abductive inference. In deductive logic, from the (certainty of the) truth of the premises, we can infer the (certainty of the) truth of the conclusion. In abductive inferences, on the other hand, conclusions (expressing hypotheses) can only be plausible. The use of abductive inference has been widely recognized in medicine. An example drawn from the study by Auguste Loubatières on the mechanism associated with hypoglycemic sulfonamides illustrates the matter:
(a) In patients suffering from typhoid and treated with 2254RP, the administration of this drug causes hypoglycemia.
(b) If the drug stimulates the endocrine pancreas to produce insulin, then the onset of hypoglycemia is a matter of course.
(c) There is reason to suspect that the hypothesis that the drug stimulates the endocrine pancreas to produce insulin is true (Federspil, 2004).
More specifically, hypothetical reasoning in the form of abduction finds many applications in medicine within diagnostic, prognostic, and therapeutic settings (Bissessur et al., 2009).
Decision-making is an important part of the diagnostic process, specifically when it comes to choosing among the possible hypotheses to identify a patient’s disease (from a finite number of known hypotheses consistent with pathological knowledge) and to prescribing the appropriate treatments in light of the predicted future course of the patient’s condition. According to the philosopher Charles Sanders Peirce, abduction differs from both induction and deduction and can be formulated as follows:
The surprising fact, C, is observed.
But if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true. (Peirce, 1931, CP 5.189)
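Peirce’s schema is not a deductively valid rule, which is why its conclusion is only that there is “reason to suspect” A. As a rough illustration, the Python sketch below reads the conditional “if A were true, C would be a matter of course” as a material conditional. This is a simplifying assumption made only for the check, not Peirce’s own (subjunctive) reading. On that reading, inferring A from “A implies C” and C is the pattern known as affirming the consequent, and a brute-force truth table finds a counterexample to it, whereas modus ponens passes the same check. The function names are hypothetical helpers introduced for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def check_validity(premises, conclusion, n_vars):
    """Search all truth assignments for one that makes every premise true
    and the conclusion false. Return (True, None) if none exists,
    otherwise (False, counterexample)."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False, values
    return True, None


# Affirming the consequent: from (A -> C) and C, infer A.
ac_valid, ac_counterexample = check_validity(
    premises=[lambda a, c: implies(a, c), lambda a, c: c],
    conclusion=lambda a, c: a,
    n_vars=2,
)

# Modus ponens, for comparison: from (A -> C) and A, infer C.
mp_valid, _ = check_validity(
    premises=[lambda a, c: implies(a, c), lambda a, c: a],
    conclusion=lambda a, c: c,
    n_vars=2,
)

if __name__ == "__main__":
    print("affirming the consequent valid:", ac_valid)
    print("counterexample (A, C):", ac_counterexample)
    print("modus ponens valid:", mp_valid)
```

The counterexample is the assignment in which C holds although A is false: the surprising fact is observed, but some other cause is responsible. This is the situation abduction has to live with, and it is why the abduced hypothesis is put forward for testing instead of being asserted.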
As masterfully acknowledged by John Woods (2013), Peirce’s “important ideas” on abduction are the following: (P0) Abduction is triggered by surprise. (P1) Abduction is a form of guessing. Since we are rather remarkably good at guessing, it can only be supposed that we are likewise rather good at abducing. (P2) A successful
abduction provides no grounds for believing the abduced proposition to be true. (P3) Rather than believing them, the proper thing to do with abduced hypotheses is to send them off to experimental trial (CP 5.599, 6.469–6.473, 7.202–219). (P4) The connection between the truth of the abduced hypothesis and the observed fact is subjunctive (CP 5.189). (P5) The inference that the abduction licenses is not to the proposition H but rather that H’s truth is something that might plausibly be conjectured (CP 5.189). (P6) The “hence” of the Peircean conclusion is ventured defeasibly (CP 5.189). Peirce’s seminal intuitions are also the guiding ideas of the Gabbay-Woods (GW) schema of abduction (Gabbay & Woods, 2006). A pragmatic interpretation and a semi-formalization of this schema have been provided in Chiffi and Pietarinen (2019, 2020b). The GW schema was introduced as a refinement of the AKM model of abduction. In the acronym AKM, A refers to Aliseda (1998, 2005); K to Kowalski (1979), Kuipers (1999), and Kakas et al. (1992); and M to Magnani (2009) and Meheus et al. (2002). Magnani (2009) also presents his eco-cognitive model of abduction. The abductive arguments cited above are among the most famous in Peirce’s writings, although more analytic ones can be found. Abduction is not a form of deduction in which truth is transmitted with certainty from the premises to the conclusion. Peirce saw abduction as a reasoning procedure that also holds in what is known as the “context of discovery,” that is, in the genesis of a (scientific) theory. Moreover, logical reasoning is nowadays also part of the “context of justification” of a theory. Unfortunately, after Peirce, abduction in the context of justification (AJ) has often been portrayed as a possible form of “inference to the best explanation of a theory,” i.e., an inference that is justified by the hypothesis judged most adequate to explain a given phenomenon.
Still, abduction and inference to the best explanation should not be confused, since there is no guarantee that an abduced hypothesis is the best explanation. AJ plays an essential role in the medical diagnostic process, viz., when a selection is made from among a finite set of hypotheses that are justified by pathological and scientific knowledge (Barosi et al., 1993). (Since testing hypotheses is time-, energy-, and money-consuming, Peirce proposed a methodology called “economy of research” in order to evaluate different research proposals based on their costs and potential benefits. See Chiffi et al. (2020).) Abduction in the context of discovery (AD), on the other hand, is also acknowledged as a sort of heuristic principle that guides the adoption of “hypothesis-candidates,” namely, statements that are not positively confirmed by the evidence but whose negation is unjustified (not proved). Doctors have to select a working hypothesis from among a given finite set of possible hypotheses codified by clinical knowledge in order to make sense of a patient’s anamnestic cues. According to Frankfurt (1958), an explanatory hypothesis A always has to be acknowledged before the conclusion of the abductive argument, implying that abduction cannot be very creative. He made this point in contrast to Peirce: rather than creating new hypotheses, abduction works as a filter in the adoption of good hypothesis-candidates for testing. However, hypothesis A has different moods in the major premise and in the conclusion of the abductive schema, since A in the premise is the antecedent of the subjunctive conditional, while in the conclusion
it is put forward as an invitation to investigate A further (a “co-hortative” mood) (Anderson, 1986; Ma & Pietarinen, 2018). The conclusion is “a good working hypothesis to go ahead with, at least as a basis for conducting tests or, if tests are not necessary, as a basis for provisional action or inaction” (Walton, 2004, p. 144). The major role of abduction is thus to help us select hypotheses in the context of justification and hypothesis-candidates in the context of discovery (Douven, 2011). In nursing, for instance, abduction should serve this function of selecting hypothesis-candidates with a view to improving healthcare paths. In fact, there is no precise and codified scientific framework for nursing knowledge, contrary to what happens in medical diagnostics, which is based on the methods and results of many different sciences. If one reframes Peirce’s general structure of abductive inference according to classical deductive logic, one runs into the fallacy of affirming the consequent, and it is also difficult to reformulate the above argument properly in an inductive framework. (For a critical analysis of the forms of induction, deduction, and abduction in clinical reasoning, see Festa et al. (2009).) The fact C is observed and is surprising in the light of an accepted set of explicative hypotheses concerning C, a set called Hs. A new candidate explanatory hypothesis A would elucidate the onset of C better than would be the case if the initial set of hypotheses were left unchanged. It thus follows that hypothesis A is strengthened. Of course, this is not a conclusive argument, but merely a plausible one. It is worth noting that A is not contained in the set Hs, meaning that (i) A had not been conceived previously, i.e., no explicative function for phenomena like C was attributed to A, and A was now merely being selected from among other hypotheses to make sense of C, or (ii) A was created because of the difficulty of explaining C by means of Hs.
In both cases, there may be reason to suspect that A is true. The status of selected or created (clinical) hypotheses in abduction is critically investigated in Thagard (1992), Magnani (1997, 2001, 2009), and Ramoni et al. (1992). (For an interesting historical application of abduction to Akkadian medical diagnosis, see Barés Gómez (2018).) The distinction between creative and selective abduction was first described in Magnani (2001). In the clinical activity of diagnostics, hypotheses are merely selected from an existing set of possible options. Clinicians have to identify and isolate a hypothesis for testing (possibly by means of abductive inferences) in order to arrive at a diagnosis for a given patient. Strictly speaking, it is very unlikely that objective clinical knowledge is created in the diagnostic process. This does not mean that diagnostic hypotheses associated with specific signs and symptoms remain fixed: they may evolve in the light of therapeutic and prognostic considerations, though they ideally remain finite and may be encoded again in a new diagnostic framework. Stanley and Campos take a different stance on the creative/selective nature of hypotheses in diagnostic reasoning (Stanley & Campos, 2013, 2015), arguing for a balance between the generation and the selection of hypotheses in medical diagnostics. Criticism of their view may stem from the interpretation given to the nature of the accepted set of explicative hypotheses (Hs). On the one hand, if Hs
is relativized to the set of hypotheses that the physician associates with C, then one might say that hypotheses could be created in the abductive process grounded on diagnostic reasoning. However, a physician may also be unaware of a relevant explicative hypothesis for C, in which case there is no creation of hypotheses, but only a gap in the physician’s knowledge. On the other hand, when the set Hs is interpreted objectively, a “creative” production of hypotheses in clinical diagnostics is unlikely (in a narrow sense) because physicians seldom assess and recognize a pathological condition that has not already been described or codified in some way. Instead, it is possible to assume that “when facing atypical or complex cases, physicians may have to combine their knowledge of possible diseases in novel ways to explain the condition of that specific patient” (Stanley & Campos, 2016). It follows that knowledge about known diseases can be merged and codified in order to develop new diagnostic patterns for more complex situations. That no new hypotheses are usually created by the diagnostic judgment process suggests that objective medical knowledge is not commonly created in an ordinary case of clinical reasoning. This view may be challenged, however, if one focuses on the role of hypotheses in prognostic judgments. It is plausible to assume that creative forms of abduction cannot be readily dismissed when formulating a prognosis because the initial set of hypotheses to consider cannot be suitably restricted. There are numerous hypothetical factors (of an environmental, genetic, or behavioral nature) capable of shaping the prognostic judgment and contributing to its fundamental uncertainty. As stated before, prognostic hypotheses regard future health conditions. Hypotheses on infinite or unsurveyable domains, such as the one given by future times, express sentences that are not decidable in principle (Dummett, 1976).
Because of this, prognostic hypotheses are extremely difficult to test and codify. By contrast, since diagnostic hypotheses involve a present health condition, they might be decidable, at least in principle. In formulating a prognosis, the most difficult task is to make sense of the uncertainty regarding the individual disease pathway. Such uncertainty may push the physician to generate new explicative hypotheses or to consider little-explored scenarios. On the other hand, the conceivability of a prognostic hypothesis is usually improved by clinical expertise, in particular when a prognosis is intended to be grounded on specific pathophysiological explanations. Hence, clinicians encounter emerging aspects of clinical knowledge and new hypotheses. In this light, medicine is not merely an applied science in the way that “applied mathematics” is, because abstract and general knowledge cannot be applied directly to a particular clinical context. It is during the prognostication process that a synthesis can be achieved from the interplay between encoded medical knowledge and the hypotheses emerging from a given clinical context. (Of course, non-epistemic aspects of clinical reasoning, such as ethical and social values, also play a significant role in shaping the clinical context (Risjord, 2011).) Emergent hypotheses with potential explicative power regarding a pathological condition in a patient may prompt a review of the main steps involved in a clinical judgment. It is on the prognostic level, moreover, that a patient’s decisions and values may influence the physician’s clinical reasoning. The generation of new prognostic hypotheses is often a good strategy for dealing with the
uncertainty of the future course of the disease. It is therefore important to investigate the nature of diagnostic and prognostic hypotheses based on their different roles in abductive inferences, i.e., on the different forms of uncertainty involved. Understanding the uncertainty surrounding clinical reasoning, especially regarding diagnosis and prognosis, also requires a better grasp of the relation between hypothesis and evidence. Clinical judgments are in fact not based only on clinicians’ expertise; they are also grounded on specific evidence. A discussion of the role of hypothesis and evidence in prognostication is therefore essential to better grasp the problem of clinical reasoning under uncertainty.
Prognosis, Hypothesis, and Evidence
Reasoning in medicine, and in particular with prognostic hypotheses, requires the critical use of a clinical methodology whose validity and limits must be evaluated. Methodological questions are at the heart of research on clinical reasoning: they concern, on the one hand, a more objective approach to medicine based on biomedical evidence and explanations in which signs convey clinical information and, on the other hand, a person-centered medicine in which clinical signs also have a meaning within the individual’s human experience. In general, evidence in medicine does not have the same intuitive features as in other research fields. Evidence-based medicine (EBM), for instance, assumes that the nature of scientific evidence in medicine has changed profoundly in recent decades, thanks to the methodological development of clinical research, in particular as regards the collection of clinical information and the tools for its analysis. EBM in its most orthodox formulation claims to offer the clinician “certainties” or “truths,” simply by differentiating between the most and the least reliable data, or between “scientific” and “less scientific” facts. To do this, EBM considers it sufficient to count as evidence only the results obtained through the most rigorous methodology, i.e., randomized controlled trials (RCTs). Such an epistemological attitude is too optimistic (resembling certain perspectives of early logical positivism) because it considers the execution of rigorous experiments as a necessary and sufficient condition for knowing reality. It is also a very simplified view of medical epistemology, which does not take into account the complexity of medical research (see, e.g., Clarke et al., 2014; Djulbegovic et al., 2009; Cartwright, 2007). In recent years, the methodological limitations of this approach have emerged even more clearly following the so-called reproducibility crisis.
The exponential increase in scientific publications and their sensationalist nature (especially in the biomedical field) have led many researchers to doubt and question much of the scientific output. The numerous attempts to replicate scientific studies published in the most prestigious international journals have given alarming results: only a small percentage have proved to be reproducible (Baker, 2015; Morrison, 2014). Epidemiological research is no exception. Several studies have highlighted the numerous methodological biases that afflict clinical trials, one of the most
frequent of which is an inadequate sample size. Among others, the epidemiologist John Ioannidis has dealt with this problem. One of his best-known works concerns the empirical evaluation of the so-called very large effects (VLE) of therapeutic interventions (Ioannidis, 2008). The epidemiologists involved in this study identified all the studies that demonstrated the existence of VLE in the Cochrane Database of Systematic Reviews; they then searched the literature for further trials that could confirm or refute them. The outcome was quite disconcerting: all the effects initially identified as VLE became much smaller or even disappeared in later studies. After an in-depth analysis, it was found that the studies showing VLE were all underpowered, i.e., the sample size used was very small. In a clinical trial without an adequate sample, it is very difficult to distinguish a random fluctuation from a real effect (Andreoletti & Teira, 2016). In these cases, the same effect cannot be observed again simply because there is none. This means that the evidence on which EBM wants to rely in order to offer certainties to the doctor appears to be anything but reliable. As discussed, the EBM epistemology fails, prima facie, to reduce the complexity of medical science. And yet, recognizing the limits of epidemiological evidence does not automatically mean denying its epistemic value, which should instead be reaffirmed in a context in which the complexity of medicine is managed through tools such as critical reasoning and logic. In other words, doubt and uncertainty should be the foundation of contemporary medical epistemology (Djulbegovic, 2007). As pointed out by Eddy (1984, p. 75): Uncertainty creeps into medical practice through every pore.
Whether a physician is defining a disease, making a diagnosis, selecting a procedure, observing outcomes, assessing probabilities, assigning preferences, or putting it all together, he is walking on very slippery terrain. It is difficult for nonphysicians, and for many physicians, to appreciate how complex these tasks are, how poorly they are understood, and how easy it is for honest people to come to different conclusions.
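The point made above about underpowered trials can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not a reconstruction of the analysis in the cited study: the function name `significant_effects`, the true effect of 0.2 standard deviations, the two sample sizes, and the simplification that the observed difference in means is drawn directly from a normal distribution with known unit variance are all hypothetical choices made for illustration.

```python
import random
import statistics


def significant_effects(true_effect, n_per_arm, n_trials=20000, seed=0):
    """Observed effects of simulated two-arm trials that reach |z| > 1.96.

    Assumption: outcomes have unit variance in both arms, so the observed
    difference in means is drawn directly from Normal(true_effect, se).
    """
    rng = random.Random(seed)
    se = (2.0 / n_per_arm) ** 0.5  # standard error of the difference in means
    kept = []
    for _ in range(n_trials):
        observed = rng.gauss(true_effect, se)
        if abs(observed) / se > 1.96:  # nominally "significant" at the 5% level
            kept.append(observed)
    return kept


if __name__ == "__main__":
    true_effect = 0.2  # hypothetical modest real effect, in standard deviations
    for n_per_arm in (20, 500):
        kept = significant_effects(true_effect, n_per_arm)
        # Print: sample size, number of significant trials, their mean effect.
        print(n_per_arm, len(kept), round(statistics.fmean(kept), 2))
```

Under these assumptions, the few small trials that cross the significance threshold report an average effect well above the true one, whereas the large trials stay close to it. This is one way to see why very large effects found in small studies tend to shrink on replication.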
Remarkably, philosophers of science have in recent decades investigated and criticized the concept of evidence in medicine and have called the limits of EBM epistemology into question. There is a wide scholarly consensus that evidence from RCTs is insufficient to establish a causal link between an intervention and its effect. As far as interventions are concerned, medical scientists need to possess mechanistic knowledge of how the intervention brings about its effect. This is precisely why medical decisions based on RCT results alone are prone to error: physicians should consider obtaining evidence from other sources as well. (However, mechanistic knowledge may also be fallacious, or at least not optimal, when clinical phenomena assume a systemic structure.) Recent evidence from meta-research is also seriously questioning the validity of RCTs, since several biases distort the results of such studies. Some philosophers and (meta-)scientists have suggested that we should lower our degree of belief in medical interventions. For example, Jacob Stegenga (2018) has defended the idea of “medical nihilism,” opting for a more realistic understanding of what medical practice can and cannot achieve. Jonathan Fuller (2018) has also recently argued for updating (lowering) our confidence in medical claims based on meta-research evidence, in order to
430
D. Chiffi and M. Andreoletti
avoid being irrational. On his part, the leading meta-research scholar John Ioannidis (2008) advocates, more pragmatically, for a “rational down-adjustment of effect sizes” usually found in the medical literature. Anyway, no matter the degree of skepticism, the limits of RCTs in supporting decision-making should be always taken into account to avoid misguided decisions (Ongaro & Andreoletti, 2021). So far, philosophical analysis of medical practice has focused mostly on interventions, while other important areas of medicine, such as diagnosis and to a greater extent prognosis, have not gained much attention. Diagnostic and prognostic judgments require a different reasoning process – and a different type of evidence – than interventions. When patients receive a diagnosis of a disease, they are not just interested in finding out about their recovery but also want to know what will happen to them in terms of the natural course of the disease and quality of life. In short, they care about their own prognosis. For example, patients who have been diagnosed with cancer want to know if they will die, if something painful will happen to them, and what kind of life they can lead following the diagnosis. This sort of information is not just interesting per se, but has a profound impact on the patient’s decisions about treatment. Patients can decide to undergo “harmful” treatments if their conditions have a very bad course or forget about having treatment if the disease does not have much impact on their quality of life. Prognosis, as already discussed, is usually assumed to be related to the anticipated results of a disease or situation and the likelihood of its occurrence. Further expanding the definition, prognosis involves the effect of a disease or situation over time and the projected likelihood of regeneration or continuing related morbidity, with a given set of prognostic factors (see, e.g., Moons et al., 2009). 
Prognostic factors can help clinicians predict which patients are more or less likely to experience a given outcome. For example, one can predict death or disability at 5 years in stroke survivors: patients who have moderate hemiparesis at baseline are 3 times more likely to die or be disabled at 5 years, whereas those with severe hemiparesis are over 4 times more likely to die or be disabled (Fineout-Overholt & Mazurek, 2004). However, quantifying such likelihood for a specific patient may be particularly complicated, which is why abductive forms of reasoning aimed at evaluating plausible clinical hypotheses may help the clinician formulate a prognosis. A rough search of the PubMed database for papers that use the term “prognostic factor” in the title and abstract is enough to see how significantly prognostic research has grown in the last decade. Prognostic research has proved to be a fundamental sort of inquiry within the emerging paradigm of “personalized medicine” or “precision medicine.” Nowadays, these two terms are ubiquitous in medical journals, popular science, social networks, and so forth. Even multinational tech companies, such as Google, Apple, and Amazon, are investing in personalized healthcare research. There is indeed unanimous consensus among the relevant stakeholders that precision medicine will be the next big revolution in healthcare. In principle, personalized or precision medicine is not confined to the effectiveness of treatments or preventive strategies, but rather “addresses how to use an individual’s prognostic information to make personally tailored choices about the best-suited treatment or
preventive management” (Moons et al., 2018). Therefore, studies on prognostic factors, markers, and models have become more frequent in the medical literature. Today, for instance, studies that investigate the staging of cancers to predict disease progression and survival rates are very popular, as are studies that analyze genetic markers to predict individual response to treatments. But, as economists have taught us, high demand drives up supply. Although prognostic research is increasingly rewarded publication-wise, its growth is regrettably a matter of quantity more than of quality. Recently, an increasing body of empirical evidence has highlighted the severe limitations of prognostic research. For instance, Riley et al. (2019) have shown how prognostic studies are often poorly designed, Altman et al. (1994) highlighted severe flaws in the analysis of the data, and McShane et al. (2006) noticed that results are poorly reported, to say the least. All in all, after an initial period of excitement, the community is giving less and less credit to prognostic studies. Fishing for significant correlations is now a notorious practice, especially in some disease areas. Poorly designed prognostic research has been wryly labelled a “playground” for researchers, or even a “what’s in the fridge approach” (see Riley et al., 2019). Apparently, the famous maxim “correlation is not causation,” which has been repeated by everyone over the years as a mantra, has not had any effect at all. Medical researchers and institutions, like the Cochrane Collaboration, are actively working to standardize and improve prognostic research to facilitate medical decision-making. Still, as of today, it is evident that the vast majority of prognostic studies are not fully reliable. The problem of reasoning under uncertainty in the context of personalized medicine has been recently discussed by Walker et al. (2019). 
Although novel methods of gaining evidence in basic science – such as iPSc (induced pluripotent stem cells) – are being successfully generated (Boniolo, 2016), such shortcuts are still far from being achieved in a clinical setting (Andreoletti, 2018). Therefore, a more detailed and philosophical analysis of prognostic judgments might be helpful.
General and Particular Medical Assertions
It is well known that any prognosis requires a diagnosis. This section explores the role of different levels of medical judgments and assertions. More specifically, it focuses on the speech acts involved in diagnostic and prognostic judgments and on their connection with epidemiological evidence. An internal judgment, in fact, can be externally expressed with different speech acts in clinical communication. The issue of epidemiological evidence obviously plays an important role not only in supporting the critical judgment on the patient’s health conditions but also in supporting the diagnostic and prognostic judgment and the choice of therapy. On the one hand, a specific diagnosis is inevitably linked to the medical knowledge and epidemiological evidence available in each historical period: if the epidemiological evidence is conflicting or unreliable, then even the diagnostic judgment based on such evidence could be incorrect. On the other hand, usually, clinicians do not
base their judgment simply on epidemiological evidence but attempt to understand how and to what extent the evidence from epidemiological studies can inform their judgment once signs, symptoms, and any other information on the patient’s state of health are taken into account (Thorne & Sawatzky, 2014). Thus, there seems to be, at least prima facie, a continuous and mutual interplay between what Federspil and Vettor (2001) call “general medical assertions” and what one might call, using similar terminology, “particular medical assertions.” Generally, an assertion is an illocutionary act that must satisfy certain standards of acceptability in order to be considered justified. In this sense, different standards of acceptability can be invoked to justify asserting a proposition. For example, an assertion can be considered justified if it is true, if it is known, if there is evidence for it, or if it is reasonable to believe it is true. However, medical assertions seem to have a specific nature. Clinicians, in fact, may deem an assertion justified not only by its truth or knowledge value but also by the possibility of proving or attempting to corroborate a proposition (given the knowledge and medical evidence available). This last point of view, in particular, is usually accepted in EBM. In EBM, the notion of evidence relates to the notion of proof, meaning that a general medical assertion requires proof of its content to be justified. But the notion of justification refers (at least implicitly) to some norm or threshold of evidence acceptance. General medical assertions seem to have the following structure: given a proposition that expresses a medical thesis, we try to understand whether the degree of evidence achieved through epidemiological studies, meta-analyses, and systematic reviews of the literature allows us to be confident enough in the thesis to assert it in a justified manner. (On the problem of heterogeneity of evidence in meta-analyses, see Berchialla et al. (2021).) The following fundamental issues emerge, therefore, when analyzing general medical assertions according to EBM:
(i) As noted by Federspil and Vettor (2001), the idea of evidence held by EBM does not take into due consideration that the available medical evidence allows us to confer different degrees of credibility on our medical theses depending on the theoretical framework of reference. Forms of evidence that seem to justify a medical assertion could be unjustified in a different theoretical framework or for different clinical purposes. (However, some exceptions to this are also contemplated within EBM, as we will see in the case of the best evidence required for prognostication.)
(ii) Like any empirical knowledge, medical assertions can never be conclusively proven and are always subject to a risk of error.
(iii) The evidentiary thresholds for considering an assertion justified (in particular the thresholds for false positives and false negatives) are conventionally established but may vary based on epistemic and non-epistemic considerations (Hempel, 1965).
The justification of particular medical assertions is perhaps an even more complex issue. Suffice it to recall that the data of a reference population or
the evidence of clinical trials can in some cases show no traces of a clinical judgment referring to a single patient. Often, trials do not include patients with comorbidities, a very common situation with elderly patients. Even if the evidence at the general population level is, at least in principle, clinically relevant for the formulation of a particular clinical assertion, it may be difficult to isolate the factors of clinical relevance that allow us to select a reference partition for a single patient’s particular pathological condition (Giaretta & Chiffi, 2018). Finally, particular medical assertions are difficult to justify either because a disease can show high individual variability or because the experimental conditions do not reflect those commonly encountered in clinical practice. Nonetheless, even if fallible, a diagnosis can be viewed as a particular medical assertion since it is always about a specific patient at a precise time. The uncertainty permeating a diagnosis is usually not severe and can be statistically assessed; by contrast, the uncertainty permeating prognostic judgments is more significant.
Prognostic Judgment: Structure and Uncertainty
The logical structure of prognosis is not always fully explicated. A clear explication of a notion, instead, may prevent forms of ambiguity that can negatively affect clinical reasoning and medical decisions. An explication is intended to capture the core meaning of a notion, ruling out the less relevant part of it. In other terms, an explication is a procedure of conceptual clarification of a vague concept, the explicandum, with a precise concept, the explicatum, so that the explicatum must be (i) similar to the explicandum; (ii) more exact and informative than the explicandum; (iii) simple in order to be easily formalized; and finally (iv) connected by means of the explication with a rigorous system of scientific concepts (Carnap, 1950). A different sense of explication is the Kantian one, for which explication does not require condition (iii), i.e., the condition of formalization (see Boniolo, 2003). An interesting explication of prognosis is provided by Sadegh-Zadeh (2012), who defines a prognosis (π) as:
π = (p, D, KB ∪ M)    (1)
where p indicates a specific patient, D stands for patient’s data, KB is the knowledge base used to substantiate the clinician’s formulation of the prognosis, and M is the method of reasoning and argumentation adopted for the prognosis. This explication is valuable, but there is still room to clarify some of its components and possibly extend it. KB is a crucial ingredient of this explication and includes many different aspects such as (i) the awareness of epidemiological data (ED) with reference to specific populations of patients; (ii) the level of expertise (E) of the clinician in dealing with similar patients; and (iii) the connection (C) (if meaningful) between epidemiological (population) data and the development of the patient’s disease.
Based on this, one can postulate that:
KB = (ED, E, C)    (2)
As previously mentioned, the connection C may be particularly problematic since it consists in (i) finding a meaningful reference class in population studies and (ii) attempting to ‘particularize’ the relevant clinical factors of that class to a specific patient (see Chiffi & Pietarinen, 2019; Fuller & Flores, 2015). What is often lacking is guidance on how clinicians should use their judgment to fit the variety of sources of information to a specific case. On the one hand, finding a relevant population class from a trial can sometimes be hopeless, since experimental conditions can hardly be replicated in clinical practice; on the other hand, given the intricacy of individual variability, the criteria for the resemblance of the patient to the population may not be fully accurate. A common epidemiological and clinical way of calibrating the connection C consists in “accepting that results of randomized trials apply to wide population unless there is a compelling reason to believe the results would differ substantially as a function of particular characteristics of those patients” (Post et al., 2013, pp. 641–642; see also Dans et al., 1998). As observed by Fuller (2019), this can be viewed as a sort of “presumption of generalizability” that resembles the argumentum ad ignorantiam in argumentation theory, which has the following form: it is not known (proved) that a proposition A is true (false); therefore, A is false (true) (Woods & Walton, 1978). Even if this form of reasoning is not always fallacious, it remains particularly problematic when the specific knowledge base is far from being complete and reliable. This is almost always the case for medical evidence and knowledge, since “negative arguments depend on the completeness of the negated paradigm. In medicine or history, such a paradigm can be hardly considered as complete, and therefore reasoning from ignorance can only provide a certain degree of probability” (Macagno & Walton, 2011, p. 99; see also Walton, 1996). This means that establishing sound methodological and clinical rules for selecting the proper connection between epidemiological and clinical evidence is difficult, and that doing so may represent the hallmark of sound clinical reasoning. Let us consider again the proposed explication of prognosis. In a strict sense, the explication π of prognosis is the propositional content (intended as the result) of an act of prognostication. The act of prognostication is a verdictive illocutionary act, in which a specific judgment or verdict is claimed and justified even in the presence of uncertainty. Many perlocutionary effects may follow from the act of prognosis, as is, in fact, the case for a patient receiving a prognosis. Austin classically pointed out that: Verdictives are typified by the giving of a verdict, as the name implies, by a jury, arbitrator, or umpire. But they need not be final; they may be, for example, an estimate, reckoning, or appraisal. It is essentially giving a finding as to something – fact, or value – which is for different reasons hard to be certain about. (Austin, 1962, p. 151)
To which the author adds: Verdictives consist in the delivering of a finding, official or unofficial, upon evidence or reasons as to value or fact, so far as these are distinguishable. (Austin, 1962, p. 152)
Verdictive acts, thus, occur under a veil of uncertainty, while, for instance, the act of assertion requires something similar to conclusive evidence for its justification. In the light of these considerations and the standard pragmatic distinction between the speech act and the (propositional) content, the general form of an act of prognostication is the following:
P(π)    (3)
which is equivalent, in virtue of (1), to:
P(p, D, KB ∪ M)    (4)
And by (2), it follows that (4) is equivalent to:
P(p, D, ((ED, E, C) ∪ M))    (5)
The formula in (5) indicates the general form of a prognostic judgment expressed by means of an illocutionary and verdictive act, whose propositional content delivers information regarding the patient, the health data of the patient, and the knowledge base of the clinician (consisting of epidemiological evidence, professional expertise, and the relation between epidemiological and clinical evidence), once a clinical reasoning method is used. Since prognostication is usually permeated by uncertainty, the methods of clinical reasoning are standardly based on a probabilistic assessment; however, when severe forms of uncertainty are difficult to handle with probabilities, abductive or “retroductive” forms of reasoning (Pietarinen & Bellucci, 2014) may also represent an indispensable element of prognostication. The explication (5) can be used to understand disagreement among clinicians about prognostication, especially with regard to the development of a disease in the very same patient. Forms of disagreement may arise from various factors such as clinical and epidemiological considerations, the similarity between the health conditions of a patient and a relevant reference class, the different forms of reasoning used in formulating the prognostic judgments, and so forth. As noted before, for any illocutionary act, there should be some justification conditions supporting the felicity of the execution of that act. In the case of prognostication, one wants the prognosis to be about a future event. The idea here is that pathophysiological considerations and epidemiological evidence should contribute to assessing a future stage of the disease in a specific patient. Yet, pathophysiological considerations, or more generally causal considerations, are not always required for a prognosis, since statistical associations may provide the proper ground to formulate it (Stovitz & Shrier, 2019).
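The structure in (5), and its use for locating disagreement among clinicians, can be sketched as a small data structure. This is only an illustrative encoding, not part of the explication itself: the class names `KnowledgeBase` and `Prognosis`, the function `sources_of_disagreement`, the representation of every component as plain text, and the example values are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class KnowledgeBase:           # KB = (ED, E, C)
    epidemiological_data: str  # ED
    expertise: str             # E
    connection: str            # C: reference class chosen for this patient


@dataclass(frozen=True)
class Prognosis:               # propositional content: (p, D, KB ∪ M)
    patient: str               # p
    data: Tuple[str, ...]      # D
    knowledge_base: KnowledgeBase
    method: str                # M


def sources_of_disagreement(a: Prognosis, b: Prognosis) -> List[str]:
    """List the components of (5) on which two prognoses for one patient differ."""
    if a.patient != b.patient:
        raise ValueError("the two prognoses concern different patients")
    differing = []
    if a.data != b.data:
        differing.append("D")
    for symbol, field in (
        ("ED", "epidemiological_data"),
        ("E", "expertise"),
        ("C", "connection"),
    ):
        if getattr(a.knowledge_base, field) != getattr(b.knowledge_base, field):
            differing.append(symbol)
    if a.method != b.method:
        differing.append("M")
    return differing


# Hypothetical example: two clinicians, same patient and same data.
first = Prognosis(
    patient="patient-1",
    data=("moderate hemiparesis at baseline",),
    knowledge_base=KnowledgeBase(
        "stroke cohort studies", "ten years in stroke care", "all stroke survivors"
    ),
    method="probabilistic",
)
second = replace(
    first,
    knowledge_base=replace(first.knowledge_base, connection="survivors with comorbidity"),
    method="abductive",
)
print(sources_of_disagreement(first, second))  # ['C', 'M']
```

In this toy case the two clinicians share the patient, the data, the epidemiological evidence, and the expertise, and differ only in the reference class they select (C) and in the reasoning method (M).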
Choosing which form of evidence is best suited to prognostication is rather difficult. EBM practitioners hold that RCTs are inappropriate when a study examines the prognosis of a disease. For instance: A special type of cohort study may also be used to determine the prognosis of a disease (i.e. what is likely to happen to someone who has it). A group of people who have all been diagnosed as having an early stage of the disease or a positive screening test [ . . . ] is assembled (the inception cohort) and followed up on repeated occasions to see the incidence (new cases per year) and time course of different outcomes. (Greenhalgh, 2010, p. 38)
The failure to select a proper inception cohort may result in unpredictable effects and fatal flaws in prognostic studies (Ales & Charlson, 1987). As pointed out by Mebius et al. (2016), there seems to exist “an apparent contradiction” between best evidence in general, which is usually assumed in EBM to come from RCTs (and meta-analyses of their findings), and the best evidence for prognostic purposes, which comes from inception cohorts, defined as “a group of individuals identified and assembled for subsequent study at an early and uniform point in the course of the specified health condition” (Porta, 2014). This is a nice example of how the goodness of evidence can change based on its use and on the different theoretical frameworks it is grounded on. However, it is a known fact that observational studies are never free of bias, which is even more relevant for prognostic studies, often marred by poor methodological standards (Lim & Feldman, 2013). Not by chance, some epidemiologists have pointed out that whenever clinicians select prognostic studies to inform their practice and happen to encounter a study for which no inception cohort was assembled, they should just move on to the next article. For instance, if only current patients were studied for prognosis, then the observed outcomes would be overly optimistic, since the patients with the most severe disease would have already died. A prognosis is almost always more uncertain than a diagnosis (Chiffi & Zanotti, 2017b), and the reliability of available prognostic tools remains limited (Riley et al., 2013). In fact, a prognosis can be perfectly rational, but still wrong because of the irreducible uncertainty permeating the act of prognostication. This is because many aspects of the future are intrinsically unknown and difficult to evaluate probabilistically. On this point, Ludwig von Mises (1966) considered it necessary to distinguish between, on the one side, class probability, in which the frequency of a certain homogeneous class is known, but the behavior of the individual outcomes of the class cannot be identified, and on the other side, case probability, typical of the disciplines that are shaped by teleological aspects, which deals with unique events that cannot be grouped into larger classes. Likewise, the evolution of a disease in a specific patient can be inherently unpredictable. Also, (particular) clinical prognostic judgments have to deal with the lifestyles, compliance with therapy, ideas, values, and contingencies of patients. By way of example, some years ago, a friend of one of the authors of this chapter, who was a talented professor of philosophical logic, consciously decided not to receive chemotherapeutic treatments to cure his cancer, preferring to undergo awake brain surgery. His motivation was that he did not want to lose his cognitive capacities (something fundamental for a logician), even if life expectancy would have decreased dramatically. Unfortunately, 1 year after receiving his cancer diagnosis, he passed away, leaving a permanent mark on
many people both in and outside academia. This is a clear example both of the dramatic role of the patient’s views about the future in changing the prognosis and of the limits in explicating the very concept of prognosis. However, the proposed explication of prognostic judgment may help us understand some sources of clinical uncertainty. For instance, it may be the case that (i) diseases develop in unpredictable ways; (ii) for some diseases, the patients’ behavior, lifestyles, values, and therapeutic preferences affect the reliability of the prognosis; (iii) the clinical framework assigned to a patient is scarcely known or even incorrect; (iv) the epidemiological and medical knowledge base used by the clinician to substantiate the prognostic judgment is incomplete or not updated; and (v) the reasoning and argumentative methods are fallacious. At any rate, (5) does not claim to be an exhaustive definition of the act of prognosis, but rather an explication collecting the main relevant ingredients of a prognosis, which may leave out some specific aspects. A patient’s values and choices may induce clinicians to formulate an alternative prognosis and to imagine an alternative future course of events, which is better aligned with the patient’s decisions, as long as such decisions are well substantiated, sound, and informed. And even though the merits of an objective explication of the act of prognosis are undeniable, it falls short of providing a patient-centered approach to prognostication able to include the patient’s values and decisions and the way he/she conceives the uncertainty of the future. This fact highlights, once again, how the practice of medicine is an art in which both the patient and the doctor have much to say. The proposed pragmatic analysis has attempted to reveal the merit of a linguistic and argumentative approach to prognostication and thus to pave the way for a critical reflection on such a fundamental step in clinical reasoning.
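The earlier remark that studying only current patients yields overly optimistic outcomes, and hence the case for inception cohorts, can be made concrete with a toy simulation. This is a sketch under stated assumptions and is not drawn from any cited study: the function name `simulate_cohorts`, the two severity groups, the constant yearly death hazards (0.5 and 0.05), the 2-year delay, and the 5-year horizon are all hypothetical.

```python
import random


def simulate_cohorts(n=50000, seed=1):
    """Return 5-year survival in an inception cohort and in a prevalent cohort.

    Inception cohort: every patient, followed from diagnosis.
    Prevalent cohort: only patients still alive 2 years after diagnosis
    ("current patients"), followed for 5 further years.
    """
    rng = random.Random(seed)
    horizon = 5.0  # years of follow-up
    delay = 2.0    # years after diagnosis at which current patients are sampled
    inception_alive = 0
    prevalent_total = 0
    prevalent_alive = 0
    for _ in range(n):
        severe = rng.random() < 0.5     # half the patients have severe disease
        rate = 0.5 if severe else 0.05  # hypothetical constant yearly death hazard
        death_time = rng.expovariate(rate)
        if death_time > horizon:
            inception_alive += 1
        if death_time > delay:          # still alive when current patients are sampled
            prevalent_total += 1
            if death_time > delay + horizon:
                prevalent_alive += 1
    return inception_alive / n, prevalent_alive / prevalent_total


if __name__ == "__main__":
    inception, prevalent = simulate_cohorts()
    print(round(inception, 2), round(prevalent, 2))
```

With these assumed hazards, the expected 5-year survival is about 0.43 in the inception cohort and about 0.58 among current patients. The gap arises only because many severe cases have already died before the prevalent sample is drawn, not because anyone's outlook has improved.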
Conclusion
This chapter has critically discussed the basic elements of prognostication, pointing out the pragmatic and abductive facets of this clinical process. Whereas diagnosis has usually been considered the main process in clinical reasoning, prognostication, in particular from a patient’s perspective, has not received comparable scientific and clinical interest. Still, things are rapidly changing in virtue of a recent emphasis on a person-centered perspective on clinical reasoning and practice. Notably, given the uncertainty permeating prognostic judgment, formulating sound prognoses is one of the most difficult tasks in medical reasoning. Many elements, indeed, may influence a prognosis: some of them are unmodifiable and uninfluenced by the course of disease and are called “prognostic markers”; others are modifiable and dependent on how the disease develops and are called “prognostic factors.” Abductive forms of reasoning seem to cope well with the uncertainty of any prognosis. The authors have discussed the creative or selective nature of hypotheses in prognostic reasoning, showing that, unlike in diagnosis, creative forms of abduction cannot be easily dismissed in formulating a prognosis. More generally, they pointed out that abduction is an essential form of reasoning in prognostication and one of its main ingredients when one tries to explicate this notion. Then, a new
explication of the concept of prognosis has been proposed, showing its pragmatic nature and the role of the different underlying forms of evidence in connection with a clinical prognosis. Such a connection is usually warranted in EBM by means of the “presumption of generalizability” of the findings of RCTs, which is a form of argumentum ad ignorantiam. Yet, this argument may be fallacious in those fields that are permeated by fundamental uncertainty, such as prognostic reasoning. In conclusion, the pragmatic and abductive analysis of prognostication is intended to propose a conceptual clarification of the prognostic process, which is virtually impossible without the involvement of some form of abductive reasoning. Some parts of this chapter are based on Chiffi and Andreoletti (2021) and on a revision of Chapters III and IV in Chiffi (2021).
References
Ales, K. L., & Charlson, M. E. (1987). In search of the true inception cohort. Journal of Chronic Diseases, 40(9), 881–885.
Aliseda, A. (1998). Seeking explanations: Abduction in logic, philosophy of science and artificial intelligence. Stanford University Press.
Aliseda, A. (2005). The logic of abduction in the light of Peirce’s pragmatism. Semiotica, 153(1/4), 363–374.
Altman, D. G., Lausen, B., Sauerbrei, W., & Schumacher, M. (1994). Dangers of using “optimal” cutpoints in the evaluation of prognostic factors. JNCI: Journal of the National Cancer Institute, 86(11), 829–835.
Anderson, D. R. (1986). The evolution of Peirce’s concept of abduction. Transactions of the Charles S. Peirce Society, 22(2), 145–164.
Andreoletti, M. (2018). More than one way to measure? A casuistic approach to cancer clinical trials. Perspectives in Biology and Medicine, 61(2), 174–190.
Andreoletti, M., & Teira, D. (2016). Statistical evidence and the reliability of medical research. In M. Solomon, J. R. Simon, & H. Kincaid (Eds.), The Routledge companion to philosophy of medicine (pp. 232–241). Routledge.
Austin, J. L. (1962). How to do things with words. Oxford University Press.
Austoni, M., & Federspil, G. (1975). Principi di metodologia clinica. Cedam.
Baker, M. (2015). Over half of psychology studies fail reproducibility test. Nature News. https://doi.org/10.1038/nature.2015.18248
Barés Gómez, C. (2018). Abduction in Akkadian medical diagnosis. Journal of Applied Logics-IfCoLog Journal of Logics and their Applications, 5(8), 1697–1722.
Barosi, G., Magnani, L., & Stefanelli, M. (1993). Medical diagnostic reasoning: Epistemological modeling as a strategy for design of computer-based consultation programs. Theoretical Medicine, 14, 43–65.
Berchialla, P., Chiffi, D., Valente, G., & Voutilainen, A. (2021). The power of meta-analysis: A challenge for evidence-based medicine. European Journal for Philosophy of Science, 11(7), 1–18.
Bissessur, S. W., Geijteman, E. C. T., Al-Dulaimy, M., Teunissen, P. W., Richir, M. C., Arnold, A. E. R., & de Vries, T. P. G. M. (2009). Therapeutic reasoning: From hiatus to hypothetical model. Journal of Evaluation in Clinical Practice, 15, 985–989.
Boniolo, G. (2003). Kant’s explication and Carnap’s explication: The Redde Rationem. International Philosophical Quarterly, 43(3), 289–298.
Boniolo, G. (2016). Molecular medicine: The clinical method enters the lab. In G. Boniolo & M. J. Nathan (Eds.), Philosophy of molecular medicine (pp. 23–42). Routledge.
Carnap, R. (1950). Logical foundations of probability. University of Chicago Press.
Cartwright, N. (2007). Are RCTs the gold standard? BioSocieties, 2(1), 11–20.
Chiffi, D. (2021). Clinical reasoning. Knowledge, uncertainty, and values in health care. Springer.
Chiffi, D., & Andreoletti, M. (2021). What’s going to happen to me? Prognosis in the face of uncertainty. Topoi, 40(2), 319–326.
Chiffi, D., & Pietarinen, A.-V. (2019). Clinical equipoise and moral leeway: An epistemological stance. Topoi, 38(2), 447–456.
Chiffi, D., & Pietarinen, A.-V. (2020a). Abduction within a pragmatic framework. Synthese, 197(6), 2507–2523.
Chiffi, D., & Pietarinen, A.-V. (2020b). The extended Gabbay-Woods schema and scientific practices. In D. Gabbay, L. Magnani, W. Park, & A.-V. Pietarinen (Eds.), Natural arguments. A tribute to John Woods. College Publications.
Chiffi, D., & Zanotti, R. (2017a). Fear of knowledge: Clinical hypotheses in diagnostic and prognostic reasoning. Journal of Evaluation in Clinical Practice, 23(5), 928–934.
Chiffi, D., & Zanotti, R. (2017b). Knowledge and belief in placebo effect. Journal of Medicine and Philosophy, 42(1), 70–85.
Chiffi, D., Pietarinen, A.-V., & Proover, M. (2020). Anticipation, abduction and the economy of research: The normative stance. Futures, 115, 102471.
Christakis, N. A. (1999). Prognostication and bioethics. Daedalus, 128(4), 197–214.
Christakis, N. A., & Sachs, G. A. (1996). The role of prognosis in clinical decision-making. Journal of General Internal Medicine, 11(7), 422–425.
Clarke, B., Gillies, D., Illari, P., Russo, F., & Williamson, J. (2014). Mechanisms and the evidence hierarchy. Topoi, 33(2), 339–360.
Dans, A. L., Dans, L. F., Guyatt, G. H., Richardson, S., & Evidence-Based Medicine Working Group. (1998). Users’ guides to the medical literature: XIV. How to decide on the applicability of clinical trial results to your patient. JAMA, 279(7), 545–549.
De Giovanni, A. (1909). The morphology of the human body. Rebman Limited.
Del Mar, C., Doust, J., & Glasziou, P. P. (2008). Clinical thinking: Evidence, communication and decision making. Wiley.
Djulbegovic, B. (2007). Articulating and responding to uncertainties in clinical research. The Journal of Medicine and Philosophy, 32(2), 79–98.
Djulbegovic, B., Guyatt, G. H., & Ashcroft, R. E. (2009). Epistemologic inquiries in evidence-based medicine. Cancer Control, 16(2), 158–168.
Djulbegovic, B., Hozo, I., & Greenland, S. (2011). Uncertainty in clinical medicine. In F. Gifford (Ed.), Philosophy of medicine. Handbook of the philosophy of science (Vol. 16, pp. 299–356). Elsevier.
Douven, I. (2011). Abduction. In Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/abduction/index.html. Last accessed 20 July 2020.
Dummett, M. (1976). What is a theory of meaning? (II). In G. Evans & J. McDowell (Eds.), Truth and meaning: Essays in semantics (pp. 67–137). Clarendon Press.
Eddy, D. M. (1984). Variations in physician practice: The role of uncertainty. Health Affairs, 3(2), 74–89.
Federspil, G. (2004). Logica clinica. McGraw-Hill.
Federspil, G., & Vettor, R. (1999). Clinical and laboratory logic. Clinica Chimica Acta, 280(1), 25–34.
Federspil, G., & Vettor, R. (2001). La “evidence-based medicine”: una riflessione critica sul concetto di evidenza in medicina. Italian Heart Journal, 2(6 Suppl), 614–623.
Festa, R., Crupi, V., & Giaretta, P. (2009). Deduzione, induzione e abduzione nelle scienze mediche. Logic and Philosophy of Science, 7(1), 41–68.
Fineout-Overholt, E., & Mazurek, M. B. (2004). Evaluation of studies of prognosis. Evidence-Based Nursing, 7(1), 4–8.
Frankfurt, H. (1958). Peirce’s notion of abduction. Journal of Philosophy, 55, 593–596.
Fuller, J. (2018). Meta-research evidence for evaluating therapies. Philosophy of Science, 85(5), 767–780.
Fuller, J. (2019). The myth and fallacy of simple extrapolation in medicine. Synthese. https://doi.org/10.1007/s11229-019-02255-0
D. Chiffi and M. Andreoletti
Fuller, J., & Flores, L. J. (2015). The risk GP model: The standard model of prediction in medicine. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 49–61. Gabbay, D. M., & Woods, J. (2006). Advice on abductive logic. Logic Journal of the IGPL, 14(2), 189–219. Gelman, A., & Carlin, J. (2014). Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641–651. Giaretta, P., & Chiffi, D. (2018). Varieties of probability in clinical diagnosis. Acta Baltica Historiae et Philosophiae Scientiarum, 6(1), 5–27. Greenhalgh, T. (2010). How to read a paper: The basics of evidence-based medicine. Wiley. Hempel, C. G. (1962). Explanation in science and history. In R. C. Colodny (Ed.), Frontiers of science and philosophy (pp. 9–19). The University of Pittsburgh Press. Hempel, C. G. (1965). Science and human values. In Aspects of scientific explanation and other essays in the philosophy of science (pp. 81–96). The Free Press. Hilden, J., & Habbema, J. D. F. (1987). Prognosis in medicine: An analysis of its meaning and roles. Theoretical Medicine, 8(3), 349–365. Ioannidis, J. P. (2008). Why most discovered true associations are inflated. Epidemiology, 19(5), 640–648. Kakas, A., Kowalski, R. A., & Toni, F. (1992). Abductive logic programming. Journal of Logic and Computation, 2(6), 719–770. Keynes, J. M. (2008). The general theory of employment, interest, and money. Atlantic Publishers, originally published in 1936. Knight, F. H. (1921). Risk, uncertainty and profit. Houghton-Mifflin. Kowalski, R. A. (1979). Logic for problem solving. Elsevier. Kuipers, T. A. F. (1999). Abduction aiming at empirical progress of even truth approximation leading to a challenge for computational modelling. Foundations of Science, 4(3), 307–323. Lewis, D. (1975). Language and languages. In K. 
Gunderson (Ed.), Minnesota studies in the philosophy of science (Vol. VII, pp. 3–35). University of Minnesota Press. Lim, L. S. H., & Feldman, B. M. (2013). The risky business of studying prognosis. The Journal of Rheumatology, 40(1), 9–15. Ma, M., & Pietarinen, A.-V. (2018). Let us investigate! Dynamic conjecture-making as the formal logic of abduction. Journal of Philosophical Logic, 47(6), 913–945. Macagno, F., & Walton, D. (2011). Reasoning from paradigms and negative evidence. Pragmatics and Cognition, 19(1), 92–116. Magnani, L. (1997). Basic science reasoning and clinical reasoning intertwined: Epistemological analysis and consequences for medical education. Advances in Health Sciences Education, 2(2), 115–130. Magnani, L. (2001). Abduction, reason and science. Springer. Magnani, L. (2009). Abductive cognition. The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer. McShane, L. M., Altman, D. G., Sauerbrei, W., Taube, S. E., Gion, M., & Clark, G. M. (2006). REporting recommendations for tumor MARKer prognostic studies (REMARK). Breast Cancer Research and Treatment, 100(2), 229–235. Mebius, A., Kennedy, A. G., & Howick, J. (2016). Research gaps in the philosophy of evidencebased medicine. Philosophy Compass, 11(11), 757–771. Meheus, J., Verhoeven, L., Van Dyck, M., & Provijn, D. (2002). Ampliative adaptive logics and the foundation of logic-based approaches to abduction. In L. Magnani, N. J. Nersessian, & C. Pizzi (Eds.), Logical and computational aspects of model-based reasoning (pp. 39–71). Kluwer Academic. Miettinen, O. S. (2011). Epidemiological research: Terms and concepts. Springer. Mises, L. (1966). Human action, a treatise on economics. Henry Regnery. Moons, K. G. M., Royston, P., Vergouwe, Y., Grobbee, D. E., & Altman, D. G. (2009). Prognosis and prognostic research: What, why, and how? BMJ, 338, b375.
Moons, K. G. M., Hooft, L., Williams, K., Hayden, J. A., Damen, J. A. A. G., & Riley, R. D. (2018). Implementing systematic reviews of prognosis studies in Cochrane. Cochrane Database of Systematic Reviews, 10, Art. No.: ED000129. https://doi.org/10.1002/14651858.ED000129 Morrison, S. J. (2014). Reproducibility project: Cancer biology: Time to do something about reproducibility. eLife, 3, e03981. Paavola, S. (2005). Peircean abduction: Instinct or inference? Semiotica, 153(1/4), 131–154. Park, W. (2017). Abduction in context: The conjectural dynamics of scientific reasoning. Springer. Peirce, C. S. (1931). Collected papers (Vol. V). Harvard University Press. Cited as CP followed by volume and paragraph number. Pietarinen, A.-V., & Bellucci, F. (2014). New light on Peirce’s conceptions of retroduction, deduction, and scientific reasoning. International Studies in the Philosophy of Science, 28(4), 353–373. Porta, M. (2014). A dictionary of epidemiology (6th ed.). Oxford University Press. Post, P. N., de Beer, H., & Guyatt, G. H. (2013). How to generalize efficacy results of randomized trials: Recommendations based on a systematic review of possible approaches. Journal of Evaluation in Clinical Practice, 19(4), 638–643. Ramoni, M., Stefanelli, M., Magnani, L., & Barosi, G. (1992). An epistemological framework for medical knowledge-based systems. IEEE Transactions on Systems, Man and Cybernetics, 22, 1361–1375. Rich, B. A. (2001). Defining and delineating a duty to prognosticate. Theoretical Medicine and Bioethics, 22(3), 177–192. Rich, B. A. (2002). Prognostication in clinical medicine: Prophecy or professional responsibility? Journal of Legal Medicine, 23(3), 297–358. Riley, R. D., Hayden, J. A., Steyerberg, E. W., Moons, K. G. M., Abrams, K., Kyzas, P. A., . . . Hemingway, H. (2013). Prognosis Research Strategy (PROGRESS) 2: prognostic factor research. PLoS Medicine, 10(2), e1001380. Riley, R. D., van der Windt, D., Croft, P., & Moons, K. G. (Eds.). (2019). 
Prognosis research in healthcare: Concepts, methods, and impact. Oxford University Press. Risjord, M. (2011). Nursing knowledge: Science, practice, and philosophy. Wiley-Blackwell. Rizzi, D. A. (1993). Medical prognosis – Some fundamentals. Theoretical Medicine, 14(4), 365– 375. Sadegh-Zadeh, K. (2012). Handbook of analytic philosophy of medicine. Springer. Smith, A. K., White, D. B., & Arnold, R. M. (2013). Uncertainty: The other side of prognosis. The New England Journal of Medicine, 368(26), 2448–2450. Stanley, D. E., & Campos, D. G. (2013). The logic of medical diagnosis. Perspectives in Biology and Medicine, 56(2), 300–315. Stanley, D. E., & Campos, D. G. (2015). Selecting clinical diagnoses: Logical strategies informed by experience. Journal of Evaluation in Clinical Practice, 22(4), 588–597. Stegenga, J. (2018). Medical nihilism. Oxford University Press. Stovitz, S. D., & Shrier, I. (2019). Causal inference for clinicians. BMJ Evidence-Based Medicine, 24, 109. Thagard, P. (1992). Conceptual revolutions. Princeton University Press. Thorne, S., & Sawatzky, R. (2014). Particularizing the general: Sustaining theoretical integrity in the context of an evidence-based practice agenda. Advances in Nursing Science, 37(1), 5–18. Upshur, R. (1997). Certainty, probability and abduction: Why we should look to CS Peirce rather than Gödel for a theory of clinical reasoning. Journal of Evaluation in Clinical Practice, 3(3), 201–206. Walker, M. J., Bourke, J., & Hutchison, K. (2019). Evidence for personalised medicine: Mechanisms, correlation, and new kinds of black box. Theoretical Medicine and Bioethics, 40(2), 103–121. Walton, D. (1996). Arguments from ignorance. Pennsylvania State University Press. Walton, D. (2004). Abductive reasoning. University of Alabama Press. Woods, J. (2013). Errors of reasoning. Naturalizing the logic of inference. College Publications. Woods, J., & Walton, D. (1978). The fallacy of ‘ad ignorantiam’. Dialectica, 32(2), 87–99.
Abduction, Clinical Reasoning, and Therapeutic Strategies
21
Raffaella Campaner and Fabio Sterpetti
Contents
Abduction and Clinical Reasoning: Introductory Remarks
Abductive Reasoning and the Medical Expertise in Clinical Contexts
Abductive Reasoning, Treatment Options, and Therapeutic Strategies
Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options
Epilepsy: An Ancient Disease and Current Therapeutic Issues
Concluding Remarks: Abductive Reasoning for Effective Treatments
References
Abstract
The debate on the role and possible uses of abduction in the health sciences has mainly concerned diagnosis. Indeed, whereas a range of works have addressed abductive reasoning in the elaboration of diagnoses, very limited attention has been devoted to whether and how abduction plays a relevant role also in the adoption and implementation of therapeutic strategies. This chapter provides an attempt to start filling such a gap, considering, in particular, two aspects, that is, the selection and evaluation of evidence when addressing clinical decisions on single cases and the choice of some therapeutic strategy rather than others. Some reflections will be put forward which try to set a dialogue between philosophical discourse on abductive reasoning and actual therapeutic situations in clinical practice where clinicians’ expertise is particularly relevant in conceiving hypotheses about which treatment should be adopted. A couple of actual cases will be presented to exemplify conditions in which abductive reasoning actually plays an important part in clinical contexts.
Keywords
Abductive reasoning · Clinical reasoning · Creative abduction · Selective abduction · Therapeutic strategies
Raffaella Campaner and Fabio Sterpetti contributed equally to the chapter. More specifically, the first two sections are due to Fabio Sterpetti, while the last two sections are due to Raffaella Campaner.
R. Campaner, Department of Philosophy and Communication Studies, University of Bologna, Bologna, Italy
F. Sterpetti, Department of Philosophy, Sapienza University of Rome, Rome, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_12
Abduction and Clinical Reasoning: Introductory Remarks
The debate on the role and possible uses of abduction in the health sciences has mainly concerned diagnosis (see, e.g., Chiffi & Zanotti, 2015, 2017; Stanley & Campos, 2013, 2016; Thompson, 2012; Upshur, 1997). The reason becomes transparent if one considers how abduction and diagnosis are usually understood. For example, Aliseda states that “abduction is a reasoning process invoked to explain a puzzling observation” (Aliseda, 2006, p. 28), and to provide an example of abductive reasoning, she refers precisely to medical diagnosis. Indeed, when “a doctor observes a symptom in a patient, she hypothesizes about its possible causes, based on her knowledge of the causal relations between diseases and symptoms” (ibidem). This example points out one of the main features of abductive reasoning, i.e., the capacity to provide hypotheses which can explain the phenomenon one wishes to investigate. Peirce writes that abduction is “the operation of adopting an explanatory hypothesis” and famously describes abduction as follows:
The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (Peirce, CP 5.189)
Peirce’s view on abduction evolved over time, and some authors pointed out that in later years, he provided several characterizations of abduction which are more sophisticated than the schematic presentation of abduction quoted above (see, e.g., Pietarinen & Bellucci, 2014), which comes from his 1903 Harvard lectures and is the formulation of abduction provided by Peirce that is usually quoted in the literature (ibidem). However, since this chapter deals with how abduction, diagnosis, and their relation are usually understood, and not with the issue of whether the way in which abduction is usually understood is supported by a careful reconstruction of Peirce’s mature view on abduction, that schematic presentation of abduction is suitable for its purposes. Indeed, what is important to underline in this context is that this reasoning process is usually meant to take one from something unexpected to something that
makes perfect sense in the light of a chosen hypothesis. And this is precisely what is conveyed by the schematic presentation of abduction quoted above. The explanatory hypothesis makes the explanandum unsurprising. Now, in very general terms, a diagnosis can roughly be defined as “a summary term for a particular set of symptoms” (Vertue & Haig, 2008, p. 1049), which is then associated with some labeling of a disease. A diagnosis names the symptoms by connecting them to a disorder. Before the diagnosis, what one has are sets of symptoms and signs; the diagnosis traces them back to a definite disease – a disease which the scientific medical community knows to some extent and which is already established within a given nosology. Were the diagnosis to be correct, it would explain the symptoms and signs and make them expectable in the light of the occurrence of that very disease. So, if C is the set of symptoms observed by the doctor in a patient, A can be interpreted as the hypothesis concerning the occurrence of an illness that is responsible for the symptoms themselves. In this sense, A would explain C. If A is tested and confirmed, the doctor will be able to confirm the diagnosis of her patient. Thus, making a diagnosis can be regarded as providing an explanation in the sense of associating the symptoms and signs with some portion of established medical knowledge, which makes the symptoms and signs unsurprising. Since abduction is the reasoning process aimed at providing explanatory hypotheses, this makes clear why abduction has been regarded as crucial for diagnosis. Diagnosis, in its turn, is central to clinical reasoning and usually constitutes the first, preliminary step to enact therapeutic strategies. Clinical reasoning is the “set of decision-making or problem-solving processes employed in the description of health problems” (ibid., p. 1047), and usually it leads to diagnosis, first, and then treatment.
Even when the diagnosis is not definitive, or is actually incorrect, it orients therapeutic choices. Indeed, the goal of clinical reasoning “is diagnosis, which, in turn, directs treatment” (ibidem). So, abduction is usually regarded as central to clinical reasoning because diagnosis is regarded as central to clinical reasoning, insofar as it paves the way to the ultimate goals of clinical medicine, that is, curing and, whenever possible, healing. There are further reasons why abduction has been thought to be strictly related to diagnosis. Consider again a doctor who observes a range of symptoms in a single patient. Usually, there are several hypotheses she can make on the basis of her background knowledge in order to explain the symptoms she is observing – which, moreover, can change across time, at different paces according to the kind of disease and/or the patient’s conditions. Very often, indeed, in clinical reasoning, there are many options to consider, and the clinician needs a way to prune some possibilities in order “to make the diagnostic space more manageable” (Stanley & Nyrup, 2020, p. 165). Since the diagnosis drives the cure, the former – as remarked, were it even just provisional – is needed to establish and manage the latter. However, more often than not symptoms and signs do not univocally orient diagnosis but, rather, are compatible with more than one diagnostic option. This points out another relevant feature of abductive reasoning that has led many authors to regard abduction as crucial for clinical reasoning, i.e., the capacity to select the hypothesis that would better explain the phenomenon under investigation among a set of alternative
hypotheses. Indeed, abductive reasoning can be regarded as both a generative and selective reasoning process (see Magnani, 2001; Schurz, 2008; Stanley & Nyrup, 2020): it has to do with the elaboration of the hypotheses and with the need to choose among them. This idea can be traced back to Peirce himself (cf., e.g., CP 6.525; on this issue, see also Hintikka, 1998). In this line of reasoning, when there are several hypotheses, they need to be ranked in order to give priority to the one which deserves to be pursued first. Abductive reasoning is regarded as a reasoning process that is able to rank hypotheses, and this makes it suitable to account for even this crucial aspect of clinical reasoning, related to the incertitude that can affect diagnosis. In other terms, according to several authors, abduction can be regarded as an instance of the Inference to the Best Explanation (IBE) (see, e.g., Schurz, 2008), i.e., the inference rule according to which, given “our data and our background beliefs, we infer what would, if true, provide the best of the competing explanations we can generate of those data” (Lipton, 2004, p. 56). Therefore, in this line of reasoning, abduction can be regarded as a special case of IBE, namely, the case in which one generates just one explanation, and IBE can in its turn be regarded as a generalization of abduction (see, e.g., Cellucci, 2013, Chap. 18). Now, despite the fact that the strict relation between abduction and IBE has already been pointed out in Harman (1965), i.e., the paper which gave IBE its name, the issue of whether abduction should be regarded as related to IBE is still debated. Different positions on how abduction and IBE are related can indeed be found in the literature.
Mackonis (2013), for instance, sketches the situation and presents his personal view on the issue as follows:
Some researchers do not conceptually discriminate between IBE and abduction or use the term ‘abduction’ as standing for IBE (Barnes 1995; Carruthers 2006; Douven 2011; Fodor 2000; Josephson and Josephson 2003; Niiniluoto 1999; Psillos 2002), but this stance is wrong: there is more to IBE than mere abduction. Some others argue that IBE and abduction are conceptually distinct (Campos 2009; Minnameier 2004; Hintikka 1998; McKaughan 2008), however, this stance is also an exaggeration: two concepts are indeed related. The most accurate description of the relation between IBE and abduction is to state that they overlap to some degree. (Mackonis, 2013, p. 976)
This issue will not be discussed here since it is not the focus of this chapter. Indeed, this chapter does not aim at determining whether IBE, as an inference rule, can be completely reduced to abduction, as an inference rule, or IBE and abduction, both intended as inference rules, just overlap to some degree. What is relevant to underline here is the idea that when one reasons in accordance with IBE as an inference rule, the kind of reasoning involved in that inferential process is abductive in character. Both abduction and IBE, indeed, aim to provide hypotheses in order to explain a surprising fact that has been observed, starting from one’s background knowledge. So, here it will just be assumed that there is a relation of some sort between abduction and IBE, a position which is widely held and defended in the literature (see Mackonis (2013), and the references there provided) and which is sufficient for the purpose of this chapter, i.e., to point out the relevance of abductive reasoning in those parts of clinical reasoning in which several hypotheses are available and one needs to rank them. Indeed, if IBE and abduction are so related,
i.e., they are not completely distinct and can both be related to abductive reasoning, abductive reasoning is able to account for a crucial aspect of clinical reasoning, i.e., the fact that one has to assess and eliminate alternative options. This can be done progressively, as more and/or more relevant evidence is collected, or by ranking hypotheses considering some additional criteria, such as their epistemic virtues, when several hypotheses can equally well account for the very same available evidence and either there is no time or there are no available means to acquire new evidence. Hence, selecting the best explanation among a given set of alternatives amounts to considering and ruling out, at least temporarily, one by one all other available explanations. This connects abductive reasoning to eliminative reasoning, i.e., the kind of reasoning by which one routinely eliminates all but one hypothesis and infers the truth of that hypothesis or at least infers that it is highly probable that that hypothesis be true (Stanford, 2006, Chap. 2). Even this idea can be traced back to Harman (1965): in “general, there will be several hypotheses which might explain the evidence, so one must be able to reject all such alternative hypotheses before one is warranted in making the inference” (Harman, 1965, p. 89). Recently, several authors have complained that too much attention has been paid so far to diagnosis in accounting for the role that abductive reasoning plays in clinical reasoning (see, e.g., Chiffi & Zanotti, 2017). They stressed that abductive reasoning is relevant to clinical reasoning for reasons that go beyond its capacity to account for the clinical activity of diagnostics.
Indeed, on the one hand, (1) there are features of abductive reasoning that are not fully deployed in the activity of diagnostics, and on the other hand, (2) it is not the case that the activity of diagnosis is the only process of clinical reasoning involved in searching for explanatory hypotheses by means of abductive reasoning. As regards (1), i.e., the thesis that there are features of abductive reasoning that are not fully deployed in the activity of diagnostics, Chiffi and Zanotti pointed out that usually in “the clinical activity of diagnostics, hypotheses are merely selected from an existing set of possible options” (Chiffi & Zanotti, 2017, p. 932). Clinicians “have to select an already known hypothesis capable of explaining their patient’s signs and symptoms and thus identify a disease or syndrome” (ibid, p. 929). So, in their view, abductive reasoning is able to account for the activity of diagnostics for it is able to account for the eliminative process by which clinicians identify the cause of the symptoms of their patients on the basis of their previous knowledge. In this perspective, diagnosis is a matter of choosing the most suitable hypothesis from a set of given alternative hypotheses, i.e., the hypothesis that provides the best explanation of a given patient’s signs and symptoms. As Chiffi and Zanotti argue, things are different, though, in prognostic reasoning, which is also crucial for clinical reasoning.
According to them, diagnosis occurs under risk: one is able, at least in principle, to compute the risk of making the wrong diagnosis given the available evidence and background knowledge, since there is sufficient clinical knowledge gained at population level to assign probabilities to events of this kind, while, on the contrary, “fundamental uncertainty is mainly associated to prognostic judgment regarding a decision in which it is not possible to compute probability because some scenarios are, for instance, not predictable” (ibidem; on the distinction between
“risk” and “uncertainty,” see Hansson, 2014). Chiffi and Zanotti rely on a distinction made by Magnani (2001) between selective abduction and creative abduction (on this distinction, see also Tuzet, 2006). This distinction is not to be confused with the idea, mentioned above, that abductive reasoning is both a generative and selective process. Indeed, the distinction between selective abduction and creative abduction hinges on whether the hypotheses that are generated in order to explain the surprising facts observed are taken from previous knowledge available to the epistemic subject or are created by the subject for want of an already available and adequate hypothesis. This point is made clear by Magnani (2001) in a passage where diagnosis is characterized in terms of selective abduction: “Diagnosis is the first task to be executed in medical reasoning. It starts from patient data that is abstracted into clinical features to be explained. Then, selective abduction generates plausible diagnostic hypotheses” (Magnani, 2001, p. 75). So, in both selective abduction and creative abduction, hypotheses are generated to find an explanation. But while in selective abduction, one relies on a given set of alternative hypotheses, in creative abduction, one has to infer hypotheses which are not already available (ibidem). This feature of abductive reasoning, i.e., the capacity of producing truly new hypotheses, is what makes abductive reasoning be regarded as an ampliative form of reasoning, i.e., a reasoning process which is able to make available in the conclusion something which was not already given in the premises (Hintikka, 1998). According to Thagard (2011), for instance, the process of medical discovery, i.e., the process by which medical knowledge is extended, is related to abductive reasoning. Thus, this is a relevant aspect of abductive reasoning, which cannot be fully appreciated by focusing exclusively on diagnosis.
Indeed, according to Chiffi and Zanotti, “creative abduction is rarely involved in the diagnostic process, but mainly in the prognostic one” (Chiffi & Zanotti, 2017, p. 929). In the same vein, Niiniluoto states that: “medical diagnosis is based on fixed well-established lists of diseases and causes of death” (Niiniluoto, 2018, p. 13). Given that in the activity of prognostics, usually one cannot rely on a set of given alternative hypotheses in the same sense as one does in the case of diagnosis, one should pay much more attention to prognosis in accounting for the role played by abductive reasoning in clinical reasoning. As regards (2), i.e., the thesis that it is not the case that the activity of diagnostics is the only process of clinical reasoning involved in searching for explanatory hypotheses by means of abductive reasoning, the present chapter aims to show how abductive reasoning is relevant not only to diagnosis, as it is usually taken to be, or prognosis, as has recently been pointed out, but also to the adoption and implementation of therapeutic strategies, which too is a crucial aspect of clinical reasoning, albeit one that has received less philosophical attention. More precisely, in the next sections, it will be argued that both creative abduction and selective abduction are involved in the process of evaluating evidence when addressing clinical decisions on single cases and choosing some therapeutic strategy rather than others. In order to do that, two cases, which allow one to highlight how creative abduction and selective abduction are both relevant when one deals with therapeutic decisions, will be considered, namely, Covid-19 (section “Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options”) and
epilepsy (section “Epilepsy: An Ancient Disease and Current Therapeutic Issues”). These two cases – although very different from one another – will be taken to show some problematic aspects of therapeutic reasoning, appearing when the diagnosis does not suffice per se to identify a single treatment. Situations can occur in which therapeutic options are far from straightforward, even if the pathology at stake is clearly identified and labeled. It will be illustrated in what respects issues arising in such clinical contexts can be addressed by abductive reasoning, in what senses this can be done, and how this sheds some light on the features of clinical reasoning as a whole – which is by no means to be confined to diagnostic concerns.
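The contrast drawn in this section between selective and creative abduction, and the role of ranking and elimination when several hypotheses are available, can be illustrated by a minimal sketch. It is a toy model under stated assumptions and not a method proposed in the chapter: the names Hypothesis, coverage, selective_abduction, and abduce, the coverage threshold, and the "simplicity" score standing in for an epistemic virtue are all hypothetical. Only the labels C (observed facts) and A (hypotheses) follow Peirce's schema quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A candidate explanation A drawn from background knowledge (toy model)."""
    name: str
    explains: frozenset  # facts that would be "a matter of course" if A were true
    simplicity: float    # hypothetical tie-breaking score for an epistemic virtue


def coverage(h, findings):
    """Fraction of the observed facts C that hypothesis h would explain."""
    if not findings:
        raise ValueError("at least one observed fact is required")
    return len(h.explains & findings) / len(findings)


def selective_abduction(findings, candidates, threshold=0.75):
    """Eliminate known hypotheses below the (assumed) coverage threshold,
    then rank the survivors by coverage and, on ties, by simplicity."""
    viable = [h for h in candidates if coverage(h, findings) >= threshold]
    return sorted(
        viable,
        key=lambda h: (coverage(h, findings), h.simplicity),
        reverse=True,
    )


def abduce(findings, candidates, threshold=0.75):
    """Return ("selective", ranking) when background knowledge suffices,
    or ("creative", []) to signal that a new hypothesis must be conceived."""
    ranked = selective_abduction(findings, candidates, threshold)
    if ranked:
        return "selective", ranked
    return "creative", []


# Purely illustrative background knowledge: abstract labels, no clinical content.
KNOWN = [
    Hypothesis("A1", frozenset({"c1", "c2", "c3", "c4"}), simplicity=0.4),
    Hypothesis("A2", frozenset({"c1", "c2", "c3", "c4"}), simplicity=0.8),
    Hypothesis("A3", frozenset({"c1"}), simplicity=0.9),
]

mode, ranking = abduce(frozenset({"c1", "c2", "c3", "c4"}), KNOWN)
novel_mode, novel_ranking = abduce(frozenset({"c9", "c10"}), KNOWN)
```

In the first call A3 is eliminated for explaining too little, while A1 and A2 account equally well for the same evidence and are ordered by the additional criterion. In the second call no available hypothesis survives, which is the situation the chapter associates with creative abduction. The sketch says nothing about how a new hypothesis is actually conceived, and its numerical scores belong to the "risk" side of the distinction discussed above, so it should not be read as a model of reasoning under fundamental uncertainty.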
Abductive Reasoning and the Medical Expertise in Clinical Contexts
The decision on what therapy should be adopted to cure a patient’s disease is typically pursued in conditions of uncertainty, and not of risk, for reasons that are analogous, at least to some extent, to the ones highlighted by Chiffi and Zanotti (2017) with regard to prognosis. Once a diagnosis has been formulated, the clinician is requested to make some steps forward and start suggesting how the disease is to be treated. Diagnosis typically does not present just a theoretical import, per se, but constitutes the first step to actively proceed further, to intervene and change what would otherwise be the natural course of the disorder. Whereas the diagnosis labels what is there, therapy is meant to change it, to prevent it from developing as it would without any intervention. Getting from the diagnosis to the actual treatment requires some clinical decision in between. If the activity of diagnostics can mainly be accounted for in terms of selective abduction, this is not the case for clinical decision on what therapy should be adopted and clinical assessment of how a given treatment is working for a given patient at a given moment. Indeed, even if it is true that usually some statistical information is available to the clinician on how treatments work, what their major or more frequent side effects are, and so on, there is a huge number of contextual factors that have to be taken into account when deciding what therapy is the most adequate for a given patient at a given time. Such factors concern many different aspects: they can have to do with the patient, the clinician, and the available therapeutic options and more in general, as it will be remarked in the next section, the current state of medical knowledge on the disease to be treated. In many cases, it is impossible to rely on eliminative reasoning in order to tackle decision problems of this kind.
Very often all those factors interact with each other, making it very difficult to disentangle their influence and understand what each kind of factor is exactly responsible for. This situation makes it impossible to account for the clinician’s decision on what therapy should be adopted in terms of selective abduction. The problematic situation just sketched is not to be generalized too quickly. There are many cases in which clinicians decide what therapy has to be adopted rather straightforwardly, as a direct and unequivocal consequence of their activity of diagnostics. For instance, low fever due to what will be identified
R. Campaner and F. Sterpetti
as common seasonal flu will most likely be treated with just some paracetamol. Although exceptions can obviously hold, paracetamol is likely to be considered effective enough to be prescribed in most cases diagnosed as common flu. Standard treatments which are not seen as controversial, or puzzling, are evaluated by standard criteria of adequacy. In those cases in which the disease is regarded as well-known and standard treatments are widely accepted by the medical community, acquired theoretical and statistical knowledge about the functioning of treatments and their possible side effects is robust; the disease is regarded to be such that contextual factors do not affect its unfolding in a way that makes heuristic strategies aimed at diminishing complexity and statistical reasoning unserviceable. In such situations, the clinician needs very little information about the patient’s condition in order to select the adequate therapy among a – usually limited – set of given alternative options that are well known. This is due to the fact that the patient’s condition is easily traced back to some general model of the disorder. In those cases, even the in itinere assessment of a treatment’s efficacy for a given single patient is made in a rather routine and noncreative way. Those cases of decision about what therapy should be adopted can be accounted for in terms of selective abduction. In many cases, though, things are much more complicated, and the patterns of clinical reasoning are not uniform or straightforward. More often than not, the affected patient will exhibit just some, i.e., not necessarily all, of the features regarded by the medical community as characteristic of that very disease. Actually, in clinical medicine, physicians are typically searching for rules in a world of extremely frequent exceptions, due to the very high variability of single cases (see, e.g., Boniolo & Campaner, 2019).
Each patient will be different from any other, even if sharing the same diagnosis, and the course of the disease will change across time in that very individual. Interindividual and intraindividual variations cannot be ignored: the peculiar traits somebody exhibits are the hallmarks of the disorder in that particular person and are very likely to affect its course and, hence, the prognosis. Such features are hence those guiding the clinician in both formulating the diagnosis – relating the patient’s situation to a given general model of the disease – and hypothesizing which therapeutic treatment is to be adopted. Even if it can be easily held that each patient presents peculiar features, distinguishing her from other patients who received the same diagnosis, differences do not always count the same. In some cases, variations and exceptions will have to be taken into most careful account, given that they might make a difference with respect to the therapy to be prescribed; to whether, when, and how it is to be adopted; to its efficacy in the short and in the long run; and to possible side effects. They will dictate whether a given therapy which is regarded as standard and belonging to a given protocol for a given disease is likely to work in the single case under investigation and whether it is to be adopted as it stands or with possible modifications. Part of the mastery required of the clinician consists in her capacity to tell highly relevant individual differences from differences that, while still holding, can be neglected in clinical procedures, insofar as they would not determine a different diagnosis and/or treatment.
21 Abduction, Clinical Reasoning, and Therapeutic Strategies
When all those contextual factors interact in such a way as to inflate the complexity of the clinical situation, the process by which the clinician decides what the adequate therapy is cannot be accounted for in terms of selective abduction. Rather, such a process should be accounted for in terms of creative abduction. The clinician cannot just rely on a given set of pre-established alternative hypotheses. She often has to generate brand new hypotheses on what the most adequate therapy for a specific patient is, by reflecting on the specific features of her patient and by relying on her background knowledge. This is clearly not to say that the elaboration of new therapeutic options is made from scratch: the clinician cannot but build her hypotheses on her background knowledge, which is going to include both established medical knowledge and clinical expertise gathered from previous cases. Obviously, clinical expertise, resulting from the experience one has had throughout one’s whole career, differs in significant ways from one physician to another and might lead to various sorts of proposals as far as curing is concerned. Furthermore, one has to keep in mind that the peculiar features of a given single patient will not usually be clear all at once but will progressively become clear and be discovered along the clinical process, thus requiring further clinical reflections and adjustments of the very assessment of what counts as relevant from a therapeutic point of view. In some circumstances, the generative process might lead to the identification of a new therapy for a given disease, for instance, by hypothesizing that a given drug D, which is usually used to cure disease A, could be beneficial for a given patient P, who is affected by disease B (something along these lines will be illustrated in section “Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options”).
In other cases, what is defined as a given pathology A can actually include a range of sub-cases or peculiar manifestations, varying so significantly from one patient to another that the therapy needs to be modified ad hoc and tentatively, often many times, following the course of the disease and being basically guided by it (situations of this kind will be presented in section “Epilepsy: An Ancient Disease and Current Therapeutic Issues”). These sorts of cases, and similar ones, where there is not a set of already available and clearly defined alternative options to choose from, make it evident that the role of the clinician’s expertise in deciding what therapy should be adopted is essential. Indeed, all the considerations above will be based on the evidence collected on the patient’s condition by means of different kinds of tests, colloquia on symptoms, and first-person reports, but their evaluation and the evaluation of how all the relevant contextual factors will interact and affect each other will also be strongly dependent on the clinician’s own expertise. As briefly mentioned above, the clinician’s past experience of similar and different cases, of previously treated patients, and of past outcomes of related therapies – whether they were successful or not – will significantly affect therapeutic choices. This becomes clearer if one considers that in those cases, one is dealing with creative abduction. The process of hypothesis production is crucial in this kind of abductive reasoning. One way in which hypotheses are generated is by means of analogy. Schurz, for instance, classifies “analogical abduction” among the creative patterns of abductive reasoning. In analogical abduction, abduction “is driven by analogy” (Schurz, 2008,
p. 217), i.e., the new hypotheses are generated by means of analogy, and then they are evaluated and selected abductively. Now, analogy is a kind of ampliative reasoning which infers, “from the fact that two things are similar in certain respects and one of them has a certain property, that the other has that property” (Cellucci, 2013, p. 336; on analogy and analogical reasoning, see also Bartha, 2019). Since the issue is to make hypotheses on how a given patient should be treated, the extent of the clinician’s background knowledge and expertise proves crucial for producing hypotheses by means of analogy. The expert clinician can indeed rely on a vast repertoire of cases from which she can start making analogies with the current case she needs to address. The more analogies one is able to make, the more likely it is that one of those analogies will be the adequate one. The number and degree of adequacy of the analogies the clinician draws will depend on the number of cases she has previously addressed, on their features, and on her capacity to pick out the relevant similarities and the relevant differences between past situations and the target case. Indeed, this is not just a matter of “brute force,” of producing as many analogies as possible. The point is that the more expertise one has, the more one can consider different respects that could be relevant in connecting the current case to previous felicitous cases, and the more likely one is to be able to tell what features analogies should rely on and, on the contrary, what features can just be ignored for the purpose of drawing analogies. Last, but by no means least, decisions on treatments will inevitably depend on currently available medical knowledge on the disease to be cured. Depending on the pathology to be addressed, medical knowledge might be more or less advanced, more or less detailed, or, rather, full of gaps.
Varying degrees of medical knowledge, and ignorance, will be associated with different diseases and with different treatments. When talking of medical knowledge, and ignorance, one can refer both to the personal medical education, training, and subsequent updating of the single clinician, and to the general state of knowledge of the international medical community on a certain disease. Medical understanding can differ a lot from one pathology to another: research on a given disease might be flourishing, whereas, on the other hand, certain diseases might not be equally at the center of research interests. For instance, as will be stressed in section “Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options,” the Covid-19 pandemic, given its sudden spread all over the world, has very quickly become a core concern of medical research worldwide, with very large communities of researchers focusing on it and large resources being invested; by contrast, rare diseases often get more limited resources and less involvement. Epistemic situations can vary significantly from one disease to another, and variations in the adoption of clinical strategies will follow, which will be related to different matters. First of all, it is important to recall that the relation between theoretical knowledge – especially of an explanatory kind – and the devising of therapeutic interventions is not straightforward either. In an obvious sense, it just holds: if one has an adequate, well-developed, and reliable account of the disease, one is more likely to find an effective therapy for it. The explanation of the disease will provide one with information about the relevant links on which to
intervene to affect the behavior of the pathology. This is clearly the case if one thinks of causal explanations: knowing the causal links allows one to intervene on the causes in order to modify the effects, thus changing the unfolding of the disorder (on causal explanations of diseases, see, e.g., Campaner, 2019; Dammann, 2020). However, explanatory knowledge is neither necessary nor sufficient to design therapies. Although explaining how and why a disease occurs and develops provides a very good ground for the design of treatments, circumstances can hold in which genuine explanatory knowledge is not available yet, and still therapies, or prevention strategies, need to be pursued for the sake of the diseased or to avoid further diseases. In such cases, correlational or predictive knowledge will constitute the starting ground for the elaboration of therapies. On the other hand, explanatory knowledge might be available, and yet one might not yet be able to translate it into effective treatments – e.g., for practical, technical, ethical, or financial reasons. For instance, progress in biomedical research can take different lengths of time to be translated into effective therapeutic options, and the efficacy of the latter on the actual population can require different lengths of time to be adequately evaluated, especially in order to be aware of the outcomes in the long run, including possible side and adverse effects. Technical and financial constraints can also hinder or delay the translation of medical progress into therapeutic benefits. All these elements will affect the confidence of the clinician in prescribing a given treatment and in preferring – when more than one option is available – one over another. Those epistemic and temporal factors contribute to clarifying why in many cases decisions on what therapy should be adopted are made under uncertainty and not under risk. When one therapy is chosen, it means that a hypothesis has been selected.
Some scenarios are thereby ruled out, while others become relevant. This is not a “static” issue but an ongoing situation. After some time, an evaluation of how a given therapy is working on a given patient has to be made. Several new contextual factors can intervene, complicate the clinical situation, and lead to a modification of the clinician’s evaluation of what the best therapy to adopt is. New hypotheses are needed, and new scenarios might become relevant. In many cases, there are no available sets of pre-formulated hypotheses to choose from or statistics from which one can compute the probability of each possible option. One needs to reduce the complexity of the space of possibilities again, i.e., one needs to select another hypothesis among the many that are possible and that are possibly still unknown (on the role that ignorance plays in driving scientific inquiry when one operates in conditions of uncertainty and not of risk, see, e.g., Carrara et al., 2021; Firestein, 2012). So, there is a sort of – so to speak – “double” epistemic movement here: on the one hand, abductive reasoning leads one to select just one hypothesis, i.e., which therapy has to be adopted, hence reducing complexity. On the other hand, the relevant contextual factors can interact in such a way that complexity might rapidly increase. Many options become available at different times in the process of therapy selection and evaluation and by means of it. It is a process that continuously produces bifurcations, branching alternative possibilities that have to be pruned by means of abductive reasoning and, in the difficult cases, by means of creative abduction.
Abductive Reasoning, Treatment Options, and Therapeutic Strategies
In order to bring what has been presented above closer to scientific practice and, hence, to show the relevance that philosophical reflections can have in clarifying clinical reasoning, the following sections will tackle two particular cases, i.e., the treatment of Covid-19 (section “Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options”) and of epilepsy (section “Epilepsy: An Ancient Disease and Current Therapeutic Issues”). These two diseases are very different from one another in a number of respects. As is well-known, Covid-19 infection has been identified very recently, is highly aggressive and transmissible, and has affected the population worldwide in an unprecedented way. No treatment was available at the beginning of the Covid-19 pandemic, while now it appears reasonable to expect that treatments able to be effective on a large scale might become available in the near future. Epilepsy, for its part, has been somehow known since antiquity (Magiorkinis et al., 2010), has hence been investigated at length, and presents a wider set of available diagnostic and therapeutic tools. Nevertheless, the treatment to be adopted in the single, specific case of epilepsy is still far from straightforward and uncontroversial, with a whole range of variations to be accounted for and a range of adjustments to be pursued. Even if very distant in terms of the sort of disease, percentage of population affected, kinds of prescribed treatments, and disciplines investigating them, both the outbreak of Covid-19 and the presence of epilepsy offer interesting therapeutic challenges to clinicians, questioning the idea that a single pattern of therapeutic reasoning can be identified and, even more so, expressed univocally and formally. This point cannot be fully developed here, since it is not the focus of this chapter.
Roughly, the idea is that when a reasoning process is really creative, in the sense that the conclusion it leads to does not belong to a set of pre-given alternatives, or is not implied by those pre-given alternatives, as is the case when one deals with creative abduction in clinical reasoning, that reasoning process cannot but be a nondeductive reasoning process. Now, although nondeductive inference rules can be formalized, nondeductive reasoning cannot be fully formalized and therefore cannot be fully automated (see, e.g., Cellucci, 2013; Sterpetti, 2020). In other words, the crucial role played by creative abduction in clinical reasoning provides reason to support the idea that therapeutic reasoning cannot be fully automated, and so that the role played in clinical practice by the clinician and her background knowledge and expertise is crucial and cannot be completely eliminated. However, the task of this chapter is limited to providing reason to support the claim that creative abduction plays a crucial role in clinical reasoning, specifically in the search for effective treatments. The beginning of the Covid-19 pandemic and the treatment of epilepsy are indeed both cases that highlight the role played by abduction in clinical reasoning and, more precisely, the role played by creative abduction in the adoption and implementation of therapeutic strategies.
Challenges from a Pandemic: Covid-19 and the Urgency of Effective Treating Options
The last couple of years have undoubtedly been dominated by the rapid spread of SARS-CoV-2, a novel coronavirus that emerged in late 2019 and is still affecting, with its variants, the population worldwide. The resulting Covid-19 disease, a severe acute respiratory syndrome, has been labeled a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (see WHO Director-General’s statement on IHR Emergency Committee on Novel Coronavirus (2019-nCoV), https://www.who.int/director-general/speeches/detail/who-director-general-s-statement-on-ihr-emergency-committee-on-novel-coronavirus-(2019-ncov)), due to its huge impact in terms of morbidity and mortality. Almost immediately, in early 2020, it attracted a very large amount of resources devoted to addressing and combating it scientifically. Two years after the outbreak of the pandemic, scientific understanding of the virus and its behavior has increased, and, what has been most relevant for its containment, vaccination is now available. The situation, although still very critical, has significantly improved. This section will focus on what happened in the first months of the pandemic, when clinicians were suddenly required to deal with an unexpected global event, facing an extremely large number of severe clinical cases and related deaths. Transmission and treatment of Covid-19 infection are still the objects of ongoing debates and updates from infectious disease experts and epidemiologists across the globe. This has been the case from the very beginning of the pandemic: given the global dimension of the health crisis, many research groups quickly redirected their research to what appeared as the most urgent topic, and large collaborations were established to maximize results in such an emergency situation.
But what happened from a strictly clinical standpoint, in the ward, while research was starting to tackle the disease? What were clinicians doing (and are still partly doing) at the bedside, while research was taking its first steps, and what were their reasoning patterns? In the case of Covid-19 infection, diagnosis and treatment have been misaligned: the clear identification of the disease, labeled and defined not only on the basis of symptoms and signs but also of the presence of the pathogenic factor, has not coincided with the identification of an adequate treatment. The clinician’s capacity to tell whether a patient was Covid-19 positive preceded her capacity to prescribe a generally effective treatment. In the first place, no specific cure for the Covid-19 virus was known that could be used as the standard, default option. Secondly, it immediately appeared that the virus behaved differently in different patients: the course of the disease, its severity, and even the organs it hit the most differed from one patient to another. Thirdly, the virus turned out to be particularly aggressive in patients who were already affected by other diseases – e.g., previous pneumological disorders or oncological pathologies. This made it even more urgent to search for cures even in the absence of a thorough understanding of the functioning of the disease. Epidemiology is actually familiar with cases in which treatments or, even
more so, prevention campaigns have to be pursued in the absence of a thorough understanding of the disease behavior or even, sometimes, of its precise causes (on this, see, e.g., Campaner & Galavotti, 2012). With respect to Covid-19 infection, even if the diagnosis was uncontroversial, it did not bring a clear treatment with it, and at the same time, therapeutic strategies were to be enacted very quickly, with no time to wait for a much more detailed understanding of the disease and its modes of unfolding in time. In a clear emergency situation such as the outbreak of a pandemic, with thousands of human lives in danger, clinical reasoning had to act quickly and creatively. In April 2020, one could still read:
“the ongoing coronavirus disease 2019 (COVID-2019) pandemic has swept through 213 countries and infected more than 1,870,000 individuals, posing an unprecedented threat to international health and the economy. There is currently no specific treatment available for patients with COVID-19 infection. The lessons learned from past management of respiratory viral infections have provided insights into treating COVID-19. Numerous potential therapies, including supportive intervention, immunomodulatory agents, antiviral therapy, and convalescent plasma transfusion, have been tentatively applied in clinical settings. A number of these therapies have provided substantially curative benefits in treating patients with COVID-19 infection.” (Zhang et al., 2020, p. 59)
Lopinavir/ritonavir, remdesivir, favipiravir, chloroquine, hydroxychloroquine, interferon, ribavirin, tocilizumab, and sarilumab were identified as potentially beneficial drugs, and synergistic drug combinations were explored as well, such as cocktails of antiviral and antimicrobial agents. However, “despite the worsening trends of COVID-19, no drugs [were] validated to have significant efficacy in clinical treatment of COVID-19 patients in large-scale studies” (Shio-Shin et al., 2020, p. 436. See also Ali et al., 2020; Wu, 2020; Jin et al., 2021). Nothing specific and specifically effective was available. Furthermore, among the abovementioned and several other drugs, some were then withdrawn after showing adverse reactions, even after demonstrating promising clinical outcomes (on this, see, e.g., Ullah et al., 2020), thus modifying – actually, restricting – over time the set of possible alternative therapeutic hypotheses. Therapeutic reasoning initially had to act on the basis of diagnostic labeling and partial, evolving, prognostic results, but without actually benefitting much from diagnostic and prognostic reasoning. Evidence was scarce, and severity of the disease was very high. Physicians in the wards were required to act very quickly – given the pace at which the disease was progressing both in the single patients and in the population at large – and could not but generate a range of hypotheses concerning the functioning of drugs used to treat other pathologies whose course shared at least some features with Covid-19 infection. The common strategy was thus “to re-purpose the available drugs or antiviral that [could] minimize or reduce the burden of the health care emergencies” (Naik & Shakya, 2021, p. 1), Covid-19 being a big issue not only for the single affected patient but also for the whole healthcare system. In a sense, the search for effective treatments for the respiratory syndromes, and related organ failures, due to Covid-19 indirectly also allows
medical doctors to restart a range of other therapeutic activities that have been slowed down or postponed by the pandemic. Clinical therapeutic reasoning forced by Covid-19 was clearly abductive in character: if a hypothesis concerning the similarity of the functioning of a given drug in a given already known disease and in Covid-19 were right, the efficacy of that drug in treating Covid-19 would not have been mysterious; or if a hypothesis concerning the similarity of the behavior of a given known disease A and of Covid-19 were right, the adoption of the drug that is usually adopted to treat disease A to cure Covid-19 would have been explained. But such abductive, clinical reasoning was also clearly not eliminative, since it was not based on a fixed set of pre-given and well-known alternative hypotheses. Rather, it was strongly creative, trying to envisage what the behavior of some previously known treatments could be if applied to the unknown disorder. Working in conditions of high uncertainty, and in the absence of formal tools, statistics, or standard reasoning patterns to appeal to, physicians adapted some treatments that were already employed for other diseases – such as antiretroviral drugs or devices to treat pulmonary disorders – on a case-by-case basis. Their reasoning was guided by their previously acquired expertise, but it is to be stressed both that such expertise had inevitably been gained in treating other, different diseases and that many medical doctors suddenly found themselves in hospital wards devoted to Covid-19 patients without having specific training and work experience in infectious diseases. The creative component of abductive reasoning hence played, at least in the first phases of the pandemic, a highly relevant role. Adjustments were made on a daily basis, observing the effects of the prescribed treatments on the single patients, the unfolding of the disease and its effects on the single body, and the body’s capacity to react to the virus.
Effects were far from uniform, with great discrepancies from one case to another and little understanding of the reasons why that was the case. The generation of hypotheses on possible alternative treatments was followed by the tentative choice of a given treatment, its prescription, the detection of its effects, then, according to them, the tentative choice of a further option, and so on, trying to identify regular behaviors out of the extremely high variability of the single treated cases, which were continuously monitored. Summing up, in the case of Covid-19 infection, for quite some time, therapeutic interventions could only be performed on the basis of a very partial and patchy understanding of the workings of the disease, extremely limited evidence concerning its course in time, and very poor information on the reasons underlying individual variations. Severity of the disease, organs affected, and reactions to the treatments appeared to be highly different from one individual to another. Clinical reasoning hence made extensive use of creative abduction, generating new, alternative hypotheses on treatment on the basis of background medical knowledge and clinical expertise, selecting the one that appeared to be the best, and progressively moving from one to another on the basis of gathered first-hand experience. Even if all patients were clearly diagnosed as infected with the Covid-19 virus, and hence – genetic mutations of the virus notwithstanding – all undoubtedly shared the same disease, each of them reacted differently to the disease and its different cures. Since it was a new disease, no actual and specific clinical expertise was available; previous expertise,
built on analogous cases, was creatively adjusted, revised, and redefined. Were creative abduction an unavailable reasoning process, no attempt to treat Covid-19 and save so many lives could have been made by clinicians in the first months of the pandemic. Thus, the case of Covid-19 infection shows (1) that abductive reasoning is not limited to the clinical activity of diagnostics, since it is also crucial to the process of adoption and implementation of therapeutic strategies, and (2) that in clinical reasoning, when one works in conditions of uncertainty, it is creative abduction that plays a fundamental role.
Epilepsy: An Ancient Disease and Current Therapeutic Issues
Epilepsy is no doubt one of the diseases that have been known to humankind for at least four millennia: the first description of the disease can indeed be found in a text from 2000 B.C.E. written in Akkadian and found in Mesopotamia (Magiorkinis et al., 2010). Although it has been defined and classified in many different ways across time, cases of what medical doctors would currently diagnose as epilepsy have been reported in many ancient medical and literary texts, such as, for instance, Orpheus’ poem Lithica and Hippocrates of Cos’ treatise On the Sacred Disease (ibidem). Extended experience of epilepsy notwithstanding, a wide range of clinical puzzles still persist, and effective treatment is very problematic. Epilepsy is one of the most common neurological disorders, affecting 5–7 of every 1,000 individuals worldwide (Epi25 Collaborative, 2019, p. 267). Actually, epilepsy is a group of disorders characterized by repeated seizures caused by excessive electrical activity in the brain, and it has a multifactorial origin. A variety of epilepsy types exist, with different sorts of seizures, levels of severity, and comorbidity. The origins of the different features are difficult to disentangle. Current investigations involve different medical disciplines and search for treatment solutions in different directions. Genetic studies (ibidem) have identified an increasing number of disease-causing genes and are hence suggesting a fundamental molecular foundation for understanding some forms of epilepsy – a direction of research which is unlikely to be translated, at least at present, into direct and effective treatment options. Further studies are connecting the processes that contribute to some types of epilepsy to several mechanisms, including particular immune system phenomena (see, e.g., De Sarro, 2016; Gambardella et al., 2016).
Research also addresses the relations between epilepsy and other kinds of disorders: not only can epilepsy be due to different factors, but it can also occur together with a heterogeneous set of other diseases. Comorbidities include organic as well as psychiatric disorders. Clinically important psychiatric conditions that have been shown to be associated with epilepsy are anxiety and mood disorders, attention deficit hyperactivity disorder, and psychoses (see, e.g., Mula, 2016). Comorbidities inevitably add extra difficulties when the time comes to prescribe a given treatment, to avoid undesired side effects, and to associate a therapeutic choice with the best possible prognosis. Moreover, epilepsy turns out to be very serious in the medium
and long run, since it is associated with cognitive and behavioral impairments, such as, for instance, general cognitive decline and worsening of both verbal and nonverbal long-term memory (see, e.g., Mameniškienė et al., 2016). Despite development and recent achievements in drug therapy, “treatment of some form of epilepsy is still empirical and merits particular attention in some individual patients” (De Sarro, 2016, p. 332). In particular, a point of concern for the clinical pharmacology of epilepsy is that “about one third of people with epilepsy fail to achieve complete freedom from seizures with existing medications and furthermore that the currently available antiepileptic drugs (AEDs) have significant limitations in terms of safety, tolerability and predisposition to cause or be a target for clinically important adverse drug interactions suggesting the need of novel drugs” (ibidem). Although over twenty-four antiepileptic drugs are currently available, over one third of epileptic patients do not achieve complete freedom from seizures (see, e.g., Franco et al., 2016). Moreover, even when working effectively, currently available AEDs still present serious shortcomings in terms of tolerability, safety, and propensity to induce or be a target for clinically serious adverse drug interactions. Pharmacoresistance is not uncommon (see, e.g., Luoni et al., 2011). It is also worth highlighting that “AEDs are currently prescribed based primarily on consideration of seizure type(s), comorbidities and co-medications, and there are no reliable tools to predict clinical responses in the individual patient” (Franco et al., 2016, p. 95).
Finally, a range of further considerations shows that, in the case of epilepsy, therapeutic reasoning has to deal with various difficulties, due to the following facts: (1) the market for epilepsy drugs is small in value and crowded; (2) developing AEDs that are superior to existing agents is an elusive target; (3) epilepsy is a highly heterogeneous disease, which implies that a single drug is unlikely to benefit all patients broadly; (4) with the current trend to link reimbursed prices of medicines to comparative effectiveness, AED development no longer brings a financial return on investment, unless a highly superior molecule is discovered; (5) conducting randomized controlled AED trials has become more difficult than in the past (see Franco et al., 2016, pp. 96–98). All these issues, together with the acknowledgment that the overall number of patients is significantly high, point, on the one hand, to the urgent need to develop newer and more efficacious general strategies and, on the other hand, and at the same time, to the need to devise effective therapeutic strategies for the single patient who happens to be under treatment at present. Abductive clinical reasoning plays a part in both respects. As regards the need to develop newer and more efficacious general strategies, developing truly innovative medicaments requires the integration of theories for new drug discovery (which concern, e.g., the epileptogenesis mechanisms involved in seizure generation, seizure spread, development of comorbidities, disease modification, etc.) with not only controlled epilepsy trials but also, and especially, clinical data that are provided by physicians themselves on the basis of the individual patients they encounter and treat, often for a long time. This means that developing more efficacious general
strategies to treat epilepsy amounts to making a significant scientific discovery. And, as already noted, creative abduction plays a crucial role in scientific inquiry and discovery (Magnani, 2001; Aliseda, 2006; Niiniluoto, 2018). As regards the need to devise effective therapeutic strategies for the single patient who is currently under treatment, whereas in the case of Covid-19 the disease is “new,” and all the patients are basically affected by the same virus, epilepsy as a whole has been known for a very long time. However, the extensive medical understanding of the disease does not suffice to prescribe a single effective treatment to address all cases. Many cases present unique features, and they all tend to be etiologically and genetically very heterogeneous. How can the physician opt for the best treatment? The answer is: abductively. And in this case too, it is not just eliminative abduction that plays a crucial role. Indeed, even if the pharmacological strategy is picked to treat epilepsy, and the number of available drugs is currently limited to twenty-five, the clinician is required to hypothesize which drug will better fit the patient she is treating and/or which drug is to be changed, when, and how (see, e.g., Santulli et al., 2016). The hypotheses the clinician will be considering, and hence the criteria that will drive her choice, are going to depend on many different matters, such as the specific characteristics of each available drug, the various kinds of epileptic syndrome affecting the individual patient to be treated, detailed information on the seizure types, and – what is most difficult, if not impossible, to categorize and formally represent – the patient’s unique features, her experience of the disorder and of its cognitive and behavioral effects. Thus, the clinician cannot rely on a given set of alternative hypotheses to choose from.
She has to generate hypotheses based on her background medical knowledge, clinical expertise, and personal knowledge of the patient she is treating. For instance, monotherapy is usually considered the standard option, at least in principle and when dealing with newly diagnosed epilepsy, but the association of various AEDs is a common practice. It is then up to the clinician to hypothesize what the joint effects and mutual interactions of the drugs in the single patient will be. Unfortunately, how a given drug will act when associated with another one and jointly prescribed to a given patient is often far from clear when the therapeutic decision is taken. The latter relies, let us stress it again, on evaluations that are not formally representable (i.e., the process that leads to such evaluations cannot be automated), and it implies a choice among a set of hypotheses generated by the clinician herself: “the choice of AEDs used in association is frequently founded on clinical experience or anecdotal observations. Polytherapy should be as ‘rational’ as possible and contemplate the mechanism of action, the pharmacokinetic characteristics and the safety of each compound” (De Sarro, 2016, p. 332). Yet, even if rational, polytherapy is adopted on the basis of the expertise that each single clinician has built up across her career and that will provide her with various kinds of evidence, including collections of first-person reports, readings and interpretations of imaging, evaluation of cognitive tests, and observations of patients’ behavior. Such evidence will be highly heterogeneous, including both quantitative and qualitative aspects. More generally, “study conditions of registered clinical trials are quite different
from daily clinical practice, which is more various and complex” (ibidem); daily practice needs to take into account, and provide tools to manage, both medical symptoms in the short run and psychosocial implications in the medium and long run. In the case of epilepsy, treatment is first and foremost aimed at symptomatic control of seizures, which can be dangerous and lead to permanent impairment. It is worth stressing here that the disease can have such peculiar manifestations that the clinician herself often needs to be present at an epileptic crisis to better understand the features the disease takes on in that very patient: in a sense, reported symptoms and signs do not suffice for the clinician to feel properly in charge and suggest a given treatment, but the disease needs to be “experienced” by the clinician herself to get a proper grasp. Witnessing an epileptic crisis helps the clinician make better sense of the particular form the disease exhibits in the patient under observation. In other words, one could go so far as to say that the clinician needs to “feel” the disease to envisage what the most effective treatment could be. One of the most peculiar experiences that come to be part and parcel of clinical work in the case of epilepsy is the patient-clinician encounter during the epileptic crisis: expert clinicians report how the very encounter – or, rather, loss of contact – between the eyes of the clinician and those of the patient during an epileptic crisis provides the former with very valuable information on the kind of crisis it is; on the degree of consciousness, or loss of consciousness, of the patient; and, hence, on the severity of the disease and how it should be treated. Obviously, not only will each patient present peculiar features, but episodes of epileptic crisis can also differ from one another, thus affecting the therapeutic choice.
It goes without saying that the experience of the epileptic crisis itself, while useful for both diagnostic and prognostic purposes, and very important in the devising of therapeutic options, is not representable in formal terms. Its role in therapeutic reasoning is that of orienting abductive reasoning, since it allows the clinician both to generate hypotheses and to select the best one among them. All these considerations, including the “feeling” of the disease the clinician gets if she is, so to speak, lucky enough to witness a crisis, point to highly hypothetical, alternative, and nonstandard therapeutic options:
The various steps characterizing the treatment of epilepsy (onset, drug choice and daily dose, and duration) are the result of a complex process in which a decision must be taken in light of the outcome of the disease, the factors predicting that outcome, and the efficacy and safety of the available drugs. All these elements help us to understand why the treatment of epilepsy across its different stages cannot be standardized; it must be largely individualized and subjected to a comprehensive evaluation of the individual case. (Beghi, 2010, p. 90)
Thus, analogously to the case of Covid-19 infection analyzed in the previous section, the case of epilepsy shows (1) that abductive reasoning is not limited to the clinical activity of diagnostics, since it is also crucial to the process of adoption and implementation of therapeutic strategies, and (2) that in clinical reasoning, when one deals with uncertainty, i.e., when relevant information or evidence is still unknown or is not knowable, it is creative abduction that plays a fundamental role.
Concluding Remarks: Abductive Reasoning for Effective Treatments
This chapter has pointed out the role that abductive reasoning plays in devising therapeutic strategies, since attention in the relevant literature has so far mainly been devoted to analyzing the role that abductive reasoning plays in diagnosis. Only recently has some attention been devoted to prognosis. The aim of this chapter was, on the one hand, to clarify that abduction plays a crucial role in clinical reasoning beyond the role it plays in the clinical activities of diagnostics and prognostics and, on the other hand, to clarify the crucial role that creative abduction plays in clinical reasoning, a role that cannot be fully appreciated by focusing exclusively on diagnosis, since in the clinical activity of diagnostics one usually deals with eliminative abduction. In order to highlight the distinctive features of the clinical search for therapeutic strategies that make creative abduction particularly relevant for this domain of clinical reasoning, two case studies have been considered, namely, Covid-19 infection and epilepsy. Despite the diversity of those two diseases, both cases highlight the role that abductive reasoning plays in the search for effective treatments. More precisely, both the case of Covid-19 infection and that of epilepsy make explicit the role that creative abduction plays in the search for effective treatments when one works in conditions of uncertainty. Indeed, in those cases in which one cannot rely on a set of pre-given and accepted alternatives in order to abductively select the one that best fits one’s observations and data, and so identify the most adequate treatment, the search for effective treatments crucially relies on the ability of the clinician to produce and rank different hypotheses on the basis of her expertise, background knowledge, and personal knowledge of her patient.
All these aspects can obviously vary a lot from one subject to another and also include highly qualitative elements, elements that simply cannot be formalized and/or measured. If it is true that abductive reasoning is central to the search for effective treatments in general, i.e., both in conditions of risk and in conditions of uncertainty, it is also true that it is the role played by creative abduction in conditions of uncertainty that brings to light how crucial the clinician’s ability to produce hypotheses is for finding effective treatments. If performing selective abduction in conditions of risk can be regarded as a routine task, one that might be automated or performed by clinicians without any particular expertise, this is not the case for creative abduction. Thus, showing that creative abduction is central to the search for effective treatments, at least when one deals with difficult cases, i.e., when one operates in conditions of uncertainty, makes clearer the indispensable role played by abductive reasoning in clinical reasoning and reaffirms the indispensable role played by the clinician’s expertise and reasoning skills in the search for effective treatments.
References
Ai, J., et al. (2020). COVID-19: Treating and managing severe cases. Cell Research, 30, 370–371. https://doi.org/10.1038/s41422-020-0329-2
Ali, M. J., et al. (2020). Treatment options for Covid-19: A review. Frontiers in Medicine, 7, 480. https://doi.org/10.3389/fmed.2020.00480
Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanations. Dordrecht, Springer.
Barnes, E. (1995). Inference to the loveliest explanation. Synthese, 103(2), 251–277.
Bartha, P. (2019). Analogy and analogical reasoning. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. https://plato.stanford.edu/archives/spr2019/entries/reasoning-analogy/
Beghi, E. (2010). Treating epilepsy across its different stages. Therapeutic Advances in Neurological Disorders, 3(2), 85–92.
Boniolo, G., & Campaner, R. (2019). Causal reasoning and clinical practice: Challenges from molecular biology. Topoi, 38(2), 423–435.
Campaner, R. (2019). Varieties of causal explanation in medical contexts. Milan, Mimesis International.
Campaner, R., & Galavotti, M. C. (2012). Evidence and the assessment of causal relations in the health sciences. International Studies in the Philosophy of Science, 26(1), 27–45.
Campos, D. G. (2009). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180(3), 419–442.
Carrara, M., Chiffi, D., De Florio, C., & Pietarinen, A.-V. (2021). We don’t know we don’t know: Asserting ignorance. Synthese, 198(4), 3565–3580.
Carruthers, P. (2006). The architecture of the mind: Massive modularity and the flexibility of thought. Oxford, Oxford University Press.
Cellucci, C. (2013). Rethinking logic: Logic in relation to mathematics, evolution, and method. Dordrecht, Springer.
Chiffi, D., & Zanotti, R. (2015). Medical and nursing diagnosis: A critical comparison. Journal of Evaluation in Clinical Practice, 21(1), 1–6.
Chiffi, D., & Zanotti, R. (2017). Fear of knowledge: Clinical hypotheses in diagnostic and prognostic reasoning. Journal of Evaluation in Clinical Practice, 23(5), 928–934.
Dammann, O. (2020). Etiological explanations: Illness causation theory. Abingdon, CRC Press.
De Sarro, G. (2016). Managing epilepsy in the third millenium: Recent achievements and future perspectives. Pharmacological Research, 113, 332–334.
Douven, I. (2011). Abduction. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. http://plato.stanford.edu/archives/spr2011/entries/abduction/
Epi25 Collaborative. (2019). Ultra-rare genetic variation in the epilepsies: A whole-exome sequencing study of 17,606 individuals. The American Journal of Human Genetics, 105, 267–282.
Firestein, S. (2012). Ignorance: How it drives science. Oxford, Oxford University Press.
Fodor, J. A. (2000). The mind doesn’t work that way: The scope and limits of computational psychology. Cambridge (MA), MIT Press.
Franco, V., French, J. A., & Perucca, E. (2016). Challenges in the clinical development of new antiepileptic drugs. Pharmacological Research, 103, 95–104.
Gambardella, A., Labate, A., & Aronica, E. (2016). Pharmacological modulation in mesial temporal lobe epilepsy: Current status and future perspectives. Pharmacological Research, 113, 421–425.
Hansson, S. O. (2014). Risk. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. https://plato.stanford.edu/archives/spr2014/entries/risk/
Harman, G. H. (1965). The inference to the best explanation. The Philosophical Review, 74(1), 88–95.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34(3), 503–533.
Jin, W., et al. (2021). Deep learning identifies synergistic drug combinations for treating COVID-19. PNAS, 118(39), e2105070118.
Josephson, J. R., & Josephson, S. G. (Eds.). (2003). Abductive inference: Computation, philosophy, technology. Cambridge, Cambridge University Press.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). London, Routledge.
Luoni, C., et al. (2011). Determinants of health-related quality of life in pharmacoresistant epilepsy: Results from a large multicenter study of consecutively enrolled patients using validated quantitative assessments. Epilepsia, 52(12), 2181–2191.
Mackonis, A. (2013). Inference to the best explanation, coherence and other explanatory virtues. Synthese, 190(6), 975–995.
Magiorkinis, E., Sidiropoulou, K., & Diamantis, A. (2010). Hallmarks in the history of epilepsy: Epilepsy in antiquity. Epilepsy & Behavior, 17(1), 103–108.
Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Dordrecht, Springer.
Mameniškienė, R., Rimšienė, J., & Puronaitė, R. (2016). Cognitive changes in people with temporal lobe epilepsy over a 13-year period. Epilepsy & Behavior, 63, 89–97.
McKaughan, D. J. (2008). From ugly duckling to swan: C.S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles S. Peirce Society, 44(3), 446–468.
Minnameier, G. (2004). Peirce-suit of truth: Why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60(1), 75–105.
Mula, M. (2016). The pharmacological management of psychiatric comorbidities in patients with epilepsy. Pharmacological Research, 107, 147–153.
Naik, R. R., & Shakya, A. K. (2021). Therapeutic strategies in the management of COVID-19. Frontiers in Molecular Biosciences, 7, 636738. https://doi.org/10.3389/fmolb.2020.636738
Niiniluoto, I. (1999). Defending abduction. Philosophy of Science, 66(Suppl), S436–S451.
Niiniluoto, I. (2018). Truth-seeking by abduction. Cham, Springer.
Peirce, C. S. (CP). (1931–1958). Collected papers of Charles Sanders Peirce, Vols. 1–6, Hartshorne, C., & Weiss, P. (eds.); Vols. 7–8, Burks, A. W. (ed.). Cambridge (MA), Harvard University Press.
Pietarinen, A.-V., & Bellucci, F. (2014). New light on Peirce’s conceptions of retroduction, deduction, and scientific reasoning. International Studies in the Philosophy of Science, 28(4), 353–373.
Psillos, S. (2002). Simply the best: A case for abduction. In A. C. Kakas & F. Sadri (Eds.), Computational logic: Logic programming and beyond (pp. 605–625). Berlin, Springer.
Santulli, L., et al. (2016). The challenges of treating epilepsy with 25 antiepileptic drugs. Pharmacological Research, 107, 211–219.
Schurz, G. (2008). Patterns of abduction. Synthese, 164(2), 201–234.
Shio-Shin, J., Lee, P.-I., & Hsueh, P.-R. (2020). Treatment options for COVID-19: The reality and challenges. Journal of Microbiology, Immunology and Infection, 53, 436–443.
Stanford, K. P. (2006). Exceeding our grasp: Science, history, and the problem of unconceived alternatives. New York, Oxford University Press.
Stanley, D. E., & Campos, D. G. (2013). The logic of medical diagnosis. Perspectives in Biology and Medicine, 56(2), 300–315.
Stanley, D. E., & Campos, D. G. (2016). Selecting clinical diagnoses: Logical strategies informed by experience. Journal of Evaluation in Clinical Practice, 22(4), 588–597.
Stanley, D. E., & Nyrup, R. (2020). Strategies in abduction: Generating and selecting diagnostic hypotheses. The Journal of Medicine and Philosophy, 45(2), 159–178.
Sterpetti, F. (2020). Mathematical proofs and scientific discovery. In M. Bertolaso & F. Sterpetti (Eds.), A critical reflection on automated science (pp. 101–136). Cham, Springer.
Thagard, P. (2011). Patterns of medical discovery. In F. Gifford (Ed.), Handbook of philosophy of medicine (pp. 187–202). Amsterdam, Elsevier.
Thompson, B. (2012). Abductive reasoning and case formulation in complex cases. In L. Robertson (Ed.), Clinical reasoning in occupational therapy: Controversies in practice (pp. 15–30). Chichester (UK), Wiley-Blackwell.
Tuzet, G. (2006). Projectual abduction. Logic Journal of the IGPL, 14(2), 151–160.
Ullah, M., et al. (2020). Therapeutic options for treating COVID-19. Engineered Science, 10, 8–10.
Upshur, R. (1997). Certainty, probability and abduction: Why we should look to C.S. Peirce rather than Gödel for a theory of clinical reasoning. Journal of Evaluation in Clinical Practice, 3(3), 201–206.
Vertue, F. M., & Haig, B. D. (2008). An abductive perspective on clinical reasoning and case formulation. Journal of Clinical Psychology, 64(9), 1046–1068.
Wu, R. (2020). An update on current therapeutic drugs treating COVID-19. Current Pharmacology Reports, 6, 56–70.
Zhang, J., Xie, B., & Hashimoto, K. (2020). Current status of potential therapeutic candidates for the COVID-19 crisis. Brain, Behavior, and Immunity, 87, 59–73.
Abductive Reasoning in Clinical Diagnostics
22
Carlo Martini
Contents
Introduction
Abduction as Expert-Based Inference
Doctors Using Abductive Reasoning
A Short Review
Examples from Recorded Clinical Dialogues
Patients Using Abductive Reasoning
Conclusion
Appendix: Note on Data
References
Abstract
In this chapter, we look at the reasoning process called abduction as it happens in clinical reasoning, specifically, in diagnostics. Abduction is recognized as one of the most important forms of reasoning in studying relations between causes and effects, and it is often credited with a role in the reasoning processes behind scientific discoveries. Interestingly, abduction does not only play a role in scientific reasoning: it is believed that people, for example detectives and judges, apply it in common everyday scenarios. Yet there have been few attempts to document this form of reasoning in clinical interactions. Doctors apply abduction daily in their clinical practice, that is, in the attempt to diagnose their patients, and, as will be seen, patients themselves use abductive reasoning to understand and explain their own symptoms. This chapter provides empirical evidence of the use of abductive inferential reasoning in clinical medicine, focusing on diagnosis as it happens in the moment of interaction between a doctor and their patient. The goal is not to document in a structured way the extent to which abductive inferential patterns appear in a typical doctor-patient interaction, but rather to illustrate that kind of interaction and mode of reasoning using real dialogues.
Keywords
Abductive reasoning · Medical reasoning · Doctor-patient interaction · Diagnostic reasoning
C. Martini, Faculty of Philosophy, Vita-Salute San Raffaele University, Milan, Italy, and Center for Philosophy of Social Science, University of Helsinki, Helsinki, Finland
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_13
Introduction
This chapter focuses on the reasoning process called abduction as it happens in clinical reasoning, specifically, in diagnostic settings. Empirical evidence of the use of abductive inferential reasoning in clinical medicine will be provided, focusing on diagnosis as it happens in the moment of interaction between a doctor and their patient. It is often argued that abduction plays a central role in scientific reasoning. In the field of biomedical research, we find Ignaz Semmelweis’ inference that “cadaverous materia” must be the reason for the onset of puerperal fever in women after childbirth or miscarriage, or Alexander Fleming’s inference that the contaminating molds which had formed in some of his staphylococcus culture-plates had bactericidal and bacteriolytic properties. Abduction is recognized as one of the most important forms of reasoning in studying relations between causes and effects. Dragulinescu (2016) provides two examples from the history of arteriosclerosis research as cases of “reasoning from effects to causes,” at times described as the essence of abductive inference (Niiniluoto 1999). Similarly, Festa et al. (2010) describe two examples of medical discoveries as applications of abductive reasoning: the contamination hypothesis by Semmelweis in the etiology of puerperal fever, mentioned above, and the “sink hypothesis” by David Ho and his collaborators, who were studying pathogen models of AIDS. Interestingly, abduction does not only play a role in scientific reasoning: it is also recognized that most people, for example detectives and judges, apply it in common everyday scenarios (see Harman 1965). Doctors apply abduction in their clinical practice, that is, in the attempt to diagnose their patients, and, as will be shown, patients themselves use abductive reasoning to understand and explain their symptoms.
This chapter provides empirical evidence of the use of abductive inferential reasoning in clinical medicine. In this chapter we steer away from theoretical questions about what exactly abduction is, and whether it amounts to inference to the best explanation (IBE, henceforth), or the more general and primitive form of inference identified by C. S. Peirce. Irrespective of those important theoretical questions, it seems undeniable
that it is possible to provide a “good enough” characterization of abduction that is fit for the purpose of identifying the inferential patterns used by doctors to come to conclusions regarding medical diagnoses. Providing it is the goal of the next section (section “Abduction as Expert-Based Inference”). Section “Doctors Using Abductive Reasoning” looks at the use of abduction by doctors to diagnose patients. It focuses specifically on the use of questions to form and correct inferential patterns that guide the clinician toward a choice of therapy, but also on how clinicians use abduction to provide explanations that help patients understand their own conditions. That is to say, abduction plays a role not only in a doctor’s reasoning process but also in their communicative process. Section “Patients Using Abductive Reasoning” looks at an interesting phenomenon: the use of abductive inferences by the patients themselves to self-diagnose and find an explanation for the symptoms they are observing. Section “Conclusion” concludes. In this chapter we do not attempt to collect empirical evidence of abductive reasoning in a systematic way. All empirical evidence will be illustrative of a phenomenon. All empirical data was collected through the COMMUNI.CARE protocol by recording and transcribing doctor-patient interviews at San Raffaele University Hospital (Consolandi et al. 2020). The goal of this chapter is not to document in a structured way the extent to which abductive inferential patterns appear in a typical doctor-patient interaction, but rather to illustrate that kind of interaction, and mode of reasoning, using real dialogues and interactions. While some authors have stated that abductive reasoning is common to most reasoners, there have been few attempts to document this in the clinical setting, and to understand more precisely what kind of role abduction plays in clinical interactions.
Abduction as Expert-Based Inference
It is quite common to get a fairly solid grasp of the concepts of deductive and inductive (or probabilistic) reasoning, but whenever one looks for an account of the process of reasoning called “abduction,” there seem to be at least two different ways of understanding the concept.
(A) According to a view shared by many modern philosophers of science, abduction amounts to Inference to the Best Explanation (IBE, henceforth). Taken in this sense, abduction allows us to justify hypotheses. In other words, it has to do with the context of justification. This view on abduction can be found, for example, in Douven (2021). While it is recognized that some form of abductive reasoning comes naturally in ordinary settings, as most people seem to make use of abduction in everyday reasoning, scholars have tried to characterize IBE in rather precise and even formalizable terms. Douven (2021) characterizes both induction and abduction as ampliative inferences, with the important difference that abduction, unlike induction, makes explicit reference to explanatory considerations. The close link between IBE and the concept of
C. Martini
explanation allows formalization, as there is a large literature on explanatory reasoning that tries to formalize the concept of explanation and make it coherent (Pitt 1988). It is then safe to claim that IBE is a more precise notion than the historical notion of abduction (cf. also Douven 1999). (B) On a more historical view, abduction is the process of discovering or formulating hypotheses to explain empirical facts and observations. The idea is attributed to Peirce (see Fann 1970), and many have tried to explicate Peirce’s concept of abduction to understand what exactly it amounts to (Frankfurt 1958; Anderson 1986; Pietarinen and Bellucci 2014; Walton 2004). While Peirce’s notion is more fluid, and possibly also more complex, than the modern notion of IBE, the general agreement seems to be that Peirce’s notion of abduction is central to scientific reasoning and discovery, since it is by way of abduction that scientists formulate scientific hypotheses. Abduction helps reasoners infer new theories and new explanatory hypotheses based on the observed facts. Unlike IBE, Peirce’s abduction is not confined to the context of justification. According to him, scientists verify hypotheses by way of induction and deduction, but it is by way of abduction that scientists come up with hypotheses in the first place. Abduction is a sort of creative process, whereby the human mind arrives at hypotheses through observation of facts. Peirce stresses the generative character of abductive inferences. What kind of account would best serve the purpose of illustrating abduction in clinical reasoning? As will be shown subsequently, it is very likely that human abductive reasoning fits both accounts of abduction illustrated above, depending on the context. Both IBE and Peircean abduction are useful to explain human reasoning over diseases and cures. 
This fact will be shown with examples regarding the case of diagnostic and self-diagnostic processes in doctor-patient interactions. Most importantly, both characterizations of abduction, as IBE and as Peircean abduction, have implications for the idea of formalizing abductive reasoning, where formalization means that it is possible to explicitly represent all information and the steps that are necessary for processing the information. By contrast, both deductive and inductive inferences can be formalized to a reasonable extent, through logic and mathematics (for deduction) and through probability theory (for induction). There have been attempts to provide a formal framework for abduction: For instance, Denecker et al. (1996) provide a very high-level blueprint for formalization, though nothing close to concrete applications. More recently, Pfister (2022) provides a more complete account of how to possibly formalize abduction but concludes with a sensible and narrower view that some abductive inferences are formalizable, while others may for now elude formalization for structural reasons (Pfister 2022, p. 25). In general, abduction relies heavily on reasoning and background knowledge, and the ability to explore connections between facts and generate ideas. This is a property of human-based reasoning, for the time being, and makes it possible to safely define abduction, informally, as expert-based reasoning over connections of cause and effect among phenomena.
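As a rough illustration of what explicitly representing the steps of an inference could look like for the two readings of abduction discussed in this section, the following schemas may help. The first follows the form in which Peirce’s abduction is commonly quoted; the second is a simplified rendering of IBE. The symbols C, A, E, and H_1, ..., H_n are notational assumptions introduced here for illustration only, not a formalism proposed in this chapter.

```latex
% Peircean (generative) abduction, in its commonly quoted form
\[
\frac{\text{The surprising fact } C \text{ is observed}
      \qquad
      \text{If } A \text{ were true, } C \text{ would be a matter of course}}
     {\text{Hence, there is reason to suspect that } A \text{ is true}}
\]

% Inference to the Best Explanation (justificatory reading), simplified
\[
\frac{E \text{ is observed}
      \qquad
      H_1, \dots, H_n \text{ are candidate explanations of } E
      \qquad
      H_i \text{ explains } E \text{ better than its rivals}}
     {\text{Tentatively accept } H_i}
\]
```

Neither schema says how A or the list of candidates is found, nor how “better” is to be measured. That is where background knowledge and expertise enter, and where formalization becomes difficult.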
22 Abductive Reasoning in Clinical Diagnostics
The goal of the next two sections is to provide examples of how abduction is used in natural reasoning. By natural reasoning, we mean the employment of reasoning resources unaided by theory. Natural reasoning employs various inferential patterns, for example, induction: If an agent observes a thousand white swans, the agent probably comes to expect the next observation of a swan also to be a white swan, even without awareness of the (logically fallacious) inferential pattern: Swan 1 is white, Swan 2 is white, . . . Swan n is white; therefore, the next Swan will be white. Similarly, scientists, clinical practitioners, and also patients make use of abductive reasoning. In presenting the examples below, we keep in mind the distinction between abduction as generative reasoning, and abduction (IBE) as justificatory reasoning.
Doctors Using Abductive Reasoning
A Short Review
The most obvious experts for use of abductive reasoning in clinical settings are doctors. Like detectives, they are often using clues (e.g., symptoms and medical tests) and background information (e.g., biomedical knowledge and patient history) to formulate a diagnosis. Some of the chapters in this volume have looked at the use of abduction for prognostic reasoning (Chiffi and Andreoletti 2022) and the choice of therapy (Campaner and Sterpetti 2022). This chapter focuses on diagnosis as it happens in the moment of interaction between a doctor and their patient. Abduction is a much-debated topic in clinical reasoning, yet it is an understatement to say that there is very little empirical research on the topic. This chapter is not a literature review, but a few systematic searches can illustrate the literature landscape. A search of articles with the string in the past decade in PubMed yielded about 280 results, of which none contain an analysis of the use of abductive reasoning using real textual interactions between doctors and patients. Most articles obtained with those search words were completely irrelevant, a few articles discussed abduction in medicine and nursing from a variety of theoretical perspectives, and a few articles were written for practitioners, illustrating abductive reasoning and advocating its use in clinical education and practice. A new search using the string narrowed the search to 83 entries, leaving all of the relevant entries from the previous search. Of the articles that were loosely related to abductive reasoning, Råholm (2010) provides a general introduction directed at practitioners, especially nurses. Mirza et al. (2014) provide a review aimed at training nurses in the use of abductive reasoning. Veen (2021) suggests how to use abduction in medical education to unify theory and practice. There are several other papers not directed at practitioners. 
Bolton (2015) provides a taxonomy of where different forms of reasoning are used at different
stages of the doctor-patient interaction. Chiffi and Zanotti (2016) argue that “creative abduction regarding clinical hypotheses in diagnostic process is very unlikely to occur, whereas this seems to be often the case for prognostic judgments”; their argument is theoretical, rather than based on data. Stanley and Sehon (2019) use idealized scenarios to illustrate abductive reasoning but provide no real-case data. Stanley and Nyrup (2020) provide a case study that is “developed on the basis of the clinical experience of one of the authors”; the paper does not contain empirical data. Stanley and Campos (2013) purport to demonstrate “the use and pervasiveness of abduction by any other name in clinical diagnosis.” The paper does not provide empirical data but rather shows how different postulated differential diagnostic methods (like pattern recognition, or multiple branching) coincide with, or can be reduced to, various forms of abductive reasoning. Vanstone et al. (2019) discuss intuition in clinical reasoning, and their sample is based on 30 interviews with clinicians and their self-reporting of cases; no recorded dialogues are reported. A few other outliers are worth mentioning. Klichowicz et al. (2021) and Lipscomb (2012) are theoretical papers about abductive reasoning itself. Prasad (2021) presents different forms of clinical reasoning (inductive, deductive, and abductive) but does not use empirical data. Haig (2008) introduces a special issue of the Journal of Clinical Psychology containing six theoretical papers on abduction in clinical reasoning. Magnani (1997) concerns medical education and does not contain empirical data. Finally, Upshur (1997) opposes the use of Gödel’s theorem to argue against evidence-based medicine (EBM) and suggests Peirce’s abductive methods as a conceptual basis that can serve as common ground for both EBM and its critics.
Examples from Recorded Clinical Dialogues
This chapter focuses on diagnosis as it happens in the moment of interaction between a doctor and their patient. To be clear, the kind of interactions that will be presented here may happen at different stages of the diagnostic process. In some of the interactions reported below, the doctor was seeing the patient for the first time, and the diagnosis was taking place at that very moment. In other cases, the doctor had already examined the patient’s lab tests and had already come to a conclusion, which they (the singular “they” will be used throughout the chapter, in place of the gender non-neutral “she” and “he”) then communicated to the patient. What will be evident in the examples is that when a doctor interacts with their patient they are, typically, though not necessarily, performing a double function, which involves both modes of abduction illustrated in the section “Abduction As Expert-Based Inference.” On the one hand, the doctor may be trying to reach a diagnostic conclusion on the basis of their first interaction with the patient, listening to them, examining any lab test the patient may have already taken, asking questions, and ordering new lab tests. On the other hand, a doctor is also trying to make sense to the patient of their symptoms and disease. The latter is a role too often overlooked
in doctor-patient interactions but one of the pillars in Emanuel and Emanuel’s preferred model of the physician-patient relationship: the deliberative model (1999, 2222). In the deliberative model, the physician is using not only their reasoning to diagnose a condition but also language to communicate to the patient a diagnosis, a prognosis, and possibly a cure, to establish a therapeutic alliance that can potentially affect the medical outcome itself (see Pinto et al. 2012). While this is not an attempt to systematically review medical reasoning in the field, as it were, it is certainly possible to conclude, from a look at the partial data collected here, that both inductive (specifically, probabilistic) and abductive reasoning are used in doctor-patient interaction, but the latter takes the lion’s share. Let us start with a few examples of inductive reasoning. A doctor tells patient L that 90% of patients will not see their cancer grow for the duration of the chemotherapy they are recommending. Usually, either remission or stasis is expected. The same doctor also tells patient L that it is very unlikely that the disease (pancreatic cancer) will completely disappear only with chemotherapy, or with chemotherapy followed by radiotherapy: “it is not the most likely occurrence from a statistical point of view.” A doctor tells patient F that it is “rather likely” that there will be an infection or blockage in the biliary stent that was inserted into their body to facilitate the drainage of bile into the digestive tract. A doctor tells patient N that it is known that, after the type of surgery the patient underwent, it is very likely that some carcinogenic cells may have detached from the mass of the tumor, and now lie hidden somewhere in the body. 
Unfortunately, the probability of this is rather high: “certainly higher than 80% with surgery for pancreatic cancer that involves removing part of the pancreas.” In the cases just analyzed, most inductive and probabilistic reasoning is offered to the patient not only to explain the likely course of events, but also to choose a set of therapies. What seems to be rather rare is the use of inductive reasoning to diagnose, where instead abduction is certainly more common. What is interesting is the contrast between clinical experts, who primarily use abduction, as will be illustrated, and expert systems that run on statistical prediction rules. Unlike clinicians, expert systems, sometimes also referred to as medical AI, make heavy use of induction and probability to infer a diagnosis from given symptomatology (Abu-Nasser 2017). Next, we will provide some examples of abductive reasoning in doctor-patient interactions. A doctor tells patient F that most likely the vomiting they experienced was caused by high bilirubin levels, causing jaundice, a yellowish pigmentation of the skin. There is an alternative explanation, namely that vomiting may be caused by some anticancer drugs, but according to the doctor in the interaction, the picture that fits the symptoms better is that it was high bilirubin, as expected for patients who suffer from pancreatic cancer, that caused vomiting. Patient G is retelling their medical history, which started with back pain, and has now progressed to clay-colored stools and abnormally dark urine. The doctor states that the progression “was expected” since it fits with the cancer’s natural progression, as it tends to grow and thus block the bile duct. The doctor continues, stating that at the present stage the issue is that the bile
ducts are blocked, and that can be inferred from the values for: (a) gamma-glutamyl transferase (GGT), (b) phosphatase, and (c) bilirubin (for which the patient has already been tested). The doctor explicitly endorses abductive reasoning in inferring that the presence of a pancreatic cancer blocking the bile ducts (possibly uniquely) explains the co-occurrence of the three values and the symptoms described by the patient. This latter instance is one in which the doctor is using explanation both in the attempt to come up with a diagnosis of the situation, and to explain to the patient what they are experiencing. The diagnosis clearly serves the purpose of inferring a prognosis and suggesting a course of therapy, while explaining to the patient what their lived experience means has the social function of appeasing their likely sense of confusion, and establishing a communication channel where the patient can trust the doctor. Further into the visit with the same patient (G), the doctor receives a further confirmation of G’s condition, when the result from the biopsy arrives from the lab, and confirms the initial hypothesis of pancreatic cancer. In this case, the biopsy seems to be the strongest diagnostic indicator and it only makes the initial inference stronger. Patient E is a borderline case, since their cancer is at either a very early stage or not yet a case of carcinogenic mutation (the doctor explains that the transition from healthy to carcinogenic cells is gradual). The patient’s (possible) cancer is at a stage where “differential” diagnosis is possible. The doctor expresses a high level of confidence that the lesion may be treated successfully with surgery, yet the patient asks whether it is possible that some carcinogenic cells may have detached from the lesion and traveled to other places in the body. The doctor uses abduction to claim that the evidence so far collected would support the hypothesis that there is no metastasis. 
Here the doctor is using abductive inference to support their diagnosis, while, at the same time, trying to reassure the patient. Concomitantly, the doctor claims that it is their intention to order further tests (like blood tests for tumor markers and a CT scan), with the goal of using the further evidence collected to (hopefully) support the hypothesis of no metastasis. This latter element is a recurrent theme: An abductive inference, being ampliative, is a kind of living organism, which changes its status depending on the accumulation of both background knowledge and evidence. The way a doctor collects both background knowledge and evidence is through questioning the patient about their medical history and habits, and through medical tests. All actions are aimed at supporting a developing chain of abductive reasoning. Patient L has a complex family history, and the doctor is collecting information about the diverse types of cancer that L’s family members have had. The familial history of cancers may point to hereditary genetic conditions, and the doctor is suggesting some genetic testing to look for possible Gilbert syndrome, which would prevent the body from producing one of the enzymes responsible for eliminating one of the anticancer drugs that the doctors plan to use. Lack of that enzyme would imply one of the possible therapeutic protocols cannot be followed, because the drug’s toxicity would accumulate in the body, so alternative anticancer protocols should be used. The reasoning here is an inductive step, nested into a counterfactual
step, which is instrumental to the last abductive step toward the decision of which therapy course the patient should undertake. Let us explicate: The patient’s family history supports the possibility that the patient, too, might show a certain genetic mutation (probability), and the possibility is consequential to an important factor: choice of therapy. If the patient had such a genetic mutation, then a certain therapy would imply dangerous side effects. All together, the elements that further testing provides will help the doctor formulate a choice of therapy: If the genetic mutation was present, then therapy A would be the best choice; but if the genetic mutation was not present, then therapy B would be the best one. The cases presented here are just a few examples of the almost constant use of abductive reasoning by doctors when speaking to a patient. The anecdotal evidence provided here supports the hypothesis that abduction (IBE) serves a role in the context of justification: The doctor has formulated a hypothesis and uses abduction to support their hypothesis (diagnosis) and at the same time explain the hypothesis to the patient (explanation in the context of medical communication). At the same time, Peircean abduction works through the doctor’s questioning (verbal and through tests) of the patient’s condition to formulate new hypotheses about the causes of some of the symptoms. Is the vomiting related to the cure, or the cancer itself? Is patient M’s diabetes linked to independent ailments or to the cancer profile? Questions and answers help make sense of reality and formulate diagnostic as well as prognostic and treatment hypotheses.
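The nested reasoning in patient L’s case, described above, can be laid out schematically. This is only a sketch: F (the family history), M (the genetic mutation), and the therapy labels A and B follow the description in the text, while the probabilistic notation is an assumption introduced here for illustration, and no numerical values are implied.

```latex
\begin{align*}
&\text{(inductive step)}      && P(M \mid F) > P(M) \\
&\text{(counterfactual step)} && M \;\Rightarrow\; \text{a certain therapy would imply dangerous side effects} \\
&\text{(choice of therapy)}   && \text{best therapy} =
  \begin{cases}
    A & \text{if testing shows } M \\
    B & \text{if testing shows } \neg M
  \end{cases}
\end{align*}
```

The schema makes visible why the doctor orders the genetic test before committing to a protocol: the test result is what selects the branch in the last step.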
Patients Using Abductive Reasoning
This section will start from a possibly little explored example of inferences in clinical reasoning, those employed by patients themselves. While the topic has received little consideration, it should not be surprising that patients themselves use abduction to understand their own “symptoms.” A note of clarification is necessary here. Before becoming a patient, a person is typically living a normal life and, as with any lived experience, a person may display signs of discomfort, like a painful back, or some difficulties in digestion. Often these signs might be dismissed as random occurrences due to habits, diet, or the environment, but there may be a moment in which these signs become “symptoms” to the patient themselves. The person starts becoming a patient and interpreting the signs as symptoms of some underlying problem. It is in this process that a person can start displaying abductive reasoning patterns. Not all diseases manifest themselves with symptoms that require immediate care, for example, severe shortness of breath, or a sudden drop in blood pressure that requires urgent care and possibly hospitalization. Many conditions, like the cases of cancers considered in this chapter, typically manifest themselves relatively slowly, with a progressive deterioration of a person’s normal daily living. It is in this latter type of case that it is possible to observe patients applying abductive reasoning to self-diagnose the experience they are going through.
Some examples: Patient D is a multimorbidity patient; they have been affected by several conditions before, including cancer, and get regular checks due to their medical history. In December, they experience significant intestinal problems, including frequent diarrhea. In January, the signs they experience start being perceived as symptoms. The first abductive passage happens when the patient declares “I thought it was something related to me eating too much during the Christmas holidays.” The initial inference is temporary; the patient’s discomfort is enough to warrant their decision to see a physician, who decides to run several tests, which reveal a more serious issue: pancreatic cancer. Several minutes into the narration of their medical history, the oncologist asks the patient if they have had any issue other than stomach problems (diarrhea), specifically if they experienced stomach or back pains. The patient claims they had back pain, but they attribute it to their profession (driver) and (bad) posture. The doctor then asks: “they were not new [the back pains]?” at which point the patient’s companion (present at the visit) interjects “it’s possible you [the patient] thought that the back pain was to be attributed to posture and your driving, but maybe it is not.” One can observe here a structure of abductive inferential patterns. The oncologist is asking questions in order to collect new observations from the patient’s narration of their medical history, habits, and symptoms, which are all elements on the basis of which the doctor will formulate their diagnosis. In doing so, the doctor asks if back pain was one of the symptoms, since to them it fits with the diagnosis of pancreatic cancer. But to the patient, the back pain fits with the bad posture possibly due to their profession or habits, so the same sign is interpreted differently, leading to different abductive inferences, depending on the background information (the patient’s vs. the doctor’s). 
The patient’s companion intervenes to correct what they realize could be their partner’s faulty reasoning: “possibly you thought it was driving, but it may not be.” One must note at this point that the patient’s companion likely does not know that back pain is common in cases of pancreatic cancer, but they are likely starting to connect the dots (and guess their partner’s faulty reasoning) on the basis of the oncologist’s questions. These examples should suffice to show that a doctor-patient interaction can start already with a series of abductive inferences that the patient has made before they decided to speak to a doctor. There are several similar examples from the data collected. Patient D’s companion (present at the visit) tells the oncologist not only that D had lost significant weight, but also that they had significant health-related issues that affected D’s mental health, and therefore the weight loss was attributed to anxiety. The anxiety, a much more common condition than pancreatic cancer, serves as an explanation for the partner’s abductive inferences regarding D’s weight loss. Patient G had been forced to sleep on the couch for about 20 days due to a COVID-19 situation at home, and started experiencing pain in their lower back. Couch-sleeping is explicitly mentioned as what they thought explained their back pain, since, again, posture issues are more common than pancreatic cancer and fitted the occurrence of back pain. Patient M’s inference is slightly more complex. They are a diabetic patient, but also physically very active (2+ hours of daily gym
training), and the physical activity usually keeps the diabetes in check. When they start observing a deterioration of their glycemic levels, they attribute it to the fact that, due to COVID-19, their physical activity had been curbed, and the problems related to the diabetes resurged. In this case too, the explanation that was most easily available was used to support their abductive inference. Most inferences that patients make serve the purpose of explaining and justifying the symptoms they experience given their background conditions and the environment. By environment we mean here anything related to the external factors affecting a person and their physical and mental health. It is interesting that according to Adam Smith our inquiry has, among others, the function of making reality coherent and of “sooth[ing] the imagination” (Smith 1980 [1795], p. 46). In the patient’s world, where normal living has been disrupted by ailments, “such disturbances induce philosophical [= scientific] inquiry” and inquiry is meant “to sooth the imagination, and to render the theatre of nature a more coherent, and therefore a more magnificent spectacle, than otherwise it would have appeared to be” (Samuels 2007, p. 53). For the patient, inquiry does not seem to have a diagnostic purpose, at least not directly, but rather the goal of making sense of a disrupted course of life. This is not, or at least should not be, the primary goal of abductive reasoning by doctors, who are trying to diagnose a condition in their patients and find the best cure for them. Yet doctors too, in a therapeutic relation, often perform a “soothing” function: That is, they are tasked with making sense of what is happening to their patient. This is an important communicative element that affects the relation of trust between a physician and a patient (Clark 2002).
Conclusion
Section “Introduction” introduced the problem of abduction in clinical settings. Section “Abduction As Expert-Based Inference” provided a very broad survey of the theoretical issues related to abduction and arrived at an informal characterization of abduction as expert-based inference. The next two sections provided empirical grounding to the idea that abduction is part of natural reasoning about illnesses and cures: Section “Doctors Using Abductive Reasoning” focused on doctors using abductive reasoning, while section “Patients Using Abductive Reasoning” focused on patients using abductive reasoning. Some authors like Harman (1965) and Douven (2021) have stated that abductive reasoning is common to most reasoners, including in daily cognitive processes. It seems that there have been few attempts to document this in the clinical setting, and to understand more precisely what kind of role abduction plays in clinical interactions. The work presented here is exploratory and provides concrete examples, taken from real interactions between doctors and their patients, to document diverse uses of abduction and, by contrast, other forms of reasoning in clinical diagnostic settings. It will be the task of future work to explore abductive reasoning in doctor-patient interaction in a systematic manner.
Appendix: Note on Data
The data was collected between July 2021 and January 2022. An assistant recorded the interaction between patients and doctors. All dialogues have been completely anonymized due to the sensitivity of the data collected. All doctor-patient recordings were assigned a letter initial, in alphabetical order, and all patients are named in this chapter with the corresponding letter. Doctors are not named or identified in any way. All patients have given consent to the recordings and their use for research purposes, and the protocol was approved by the San Raffaele Hospital Ethical Committee. A copy of the authorization can be obtained from the author.
References
Abu-Nasser, B. (2017). Medical expert systems survey. International Journal of Engineering and Information Systems (IJEAIS), 1(7), 218–224.
Anderson, D. R. (1986). The evolution of Peirce’s concept of abduction. Transactions of the Charles S. Peirce Society, 22(2), 45–164.
Bolton, J. W. (2015). Varieties of clinical reasoning. Journal of Evaluation in Clinical Practice, 21(3), 486–489.
Campaner, R., & Sterpetti, F. (2022). Abduction, clinical reasoning, and therapeutic strategies. In L. Magnani (Ed.), Handbook of abductive cognition. Springer, Cham. https://doi.org/10.1007/978-3-031-10135-9_12-1
Chiffi, D., & Andreoletti, M. (2022). Abduction in prognostic reasoning. In L. Magnani (Ed.), Handbook of abductive cognition. Springer, Cham. https://doi.org/10.1007/978-3-031-10135-9_11-1
Chiffi, D., & Zanotti, R. (2016). Fear of knowledge: Clinical hypotheses in diagnostic and prognostic reasoning. Journal of Evaluation in Clinical Practice, 23(5), 928–934.
Clark, C. C. (2002). Trust in medicine. The Journal of Medicine and Philosophy, 27(1), 11–29.
Consolandi, M., Martini, C., Reni, M., Arcidiacono, P. G., Falconi, M., Graffigna, G., & Capurso, G. (2020). COMMUNI.CARE (COMMUNIcation and patient engagement at diagnosis of PAncreatic CAncer): Study protocol. Frontiers in Medicine, 7, 134.
Denecker, M., Martens, B., & De Raedt, L. (1996). On the difference between abduction and induction: A model theoretic perspective. In ECAI96 workshop on abductive and inductive reasoning (pp. 1–7).
Douven, I. (1999). Inference to the best explanation made coherent. Philosophy of Science, 66, S424–S435.
Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Summer 2021 Edition). https://plato.stanford.edu/archives/sum2021/entries/abduction/
Dragulinescu, S. (2016). Inference to the best explanation and mechanisms in medicine. Theoretical Medicine and Bioethics, 37, 211–232. https://doi.org/10.1007/s11017-016-9365-9
Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Festa, R., Crupi, V., & Giaretta, P. (2010). Forme di ragionamento e valutazione delle ipotesi nelle scienze mediche. In A. Pagnini (Ed.), Filosofia della medicina. Epistemologia, ontologia, etica (pp. 119–142). Carocci Editore.
Frankfurt, H. (1958). Peirce’s notion of abduction. Journal of Philosophy, 55, 593–596.
Haig, B. D. (2008). Scientific method, abduction, and clinical reasoning. Journal of Clinical Psychology, 64(9), 1013–1018.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.
Klichowicz, A., Lippoldt, D. E., Rosner, A., & Krems, J. F. (2021). Information stored in memory affects abductive reasoning. Psychological Research, 85(8), 3119–3133.
Lipscomb, M. (2012). Abductive reasoning and qualitative research. Nursing Philosophy, 13(4), 244–256.
Magnani, L. (1997). Basic science reasoning and clinical reasoning intertwined: Epistemological analysis and consequences for medical education. Advances in Health Sciences Education, 2(2), 115–130.
Mirza, N. A., Akhtar-Danesh, N., Noesgaard, C., Martin, L., & Staples, E. (2014). A concept analysis of abductive reasoning. Journal of Advanced Nursing, 70(9), 1980–1994.
Niiniluoto, I. (1999). Defending abduction. Philosophy of Science, 66, S436–S451.
Pfister, R. (2022). Towards a theory of abduction based on conditionals. Synthese, 200(3), 1–30.
Pietarinen, A. V., & Bellucci, F. (2014). New light on Peirce’s conceptions of retroduction, deduction, and scientific reasoning. International Studies in the Philosophy of Science, 28(4), 353–373.
Pinto, R. Z., Ferreira, M. L., Oliveira, V. C., Franco, M. R., Adams, R., Maher, C. G., & Ferreira, P. H. (2012). Patient-centred communication is associated with positive therapeutic alliance: A systematic review. Journal of Physiotherapy, 58(2), 77–87.
Pitt, J. C. (Ed.). (1988). Theories of explanation. Oxford University Press.
Prasad, G. R. (2021). Enhancing clinical judgement in virtual care for complex chronic disease. Journal of Evaluation in Clinical Practice, 27(3), 677–683.
Råholm, M. B. (2010). Abductive reasoning and the formation of scientific knowledge within nursing research. Nursing Philosophy, 11(4), 260–270.
Samuels, W. J. (2007). Adam Smith’s History of Astronomy argument: How broadly does it apply? And where do propositions which “sooth the imagination” come from? History of Economic Ideas, XV(2), 1–26.
Smith, A. (1980 [1795]). History of astronomy. In W. P. D. Wightman, J. C. Bryce, & I. S. Ross (Eds.), The Glasgow edition of the works and correspondence of Adam Smith, vol. 3: Essays on philosophical subjects with Dugald Stewart’s account of Adam Smith.
Stanley, D. E., & Campos, D. G. (2013). The logic of medical diagnosis. Perspectives in Biology and Medicine, 56(2), 300–315.
Stanley, D. E., & Nyrup, R. (2020, March). Strategies in abduction: Generating and selecting diagnostic hypotheses. The Journal of Medicine and Philosophy: A Forum for Bioethics and Philosophy of Medicine, 45(2), 159–178. Oxford University Press.
Stanley, D. E., & Sehon, S. R. (2019). Medical reasoning and doctor-patient communication. Journal of Evaluation in Clinical Practice, 25(6), 962–969.
Upshur, R. (1997). Certainty, probability and abduction: Why we should look to CS Peirce rather than Gödel for a theory of clinical reasoning. Journal of Evaluation in Clinical Practice, 3(3), 201–206.
Vanstone, M., Monteiro, S., Colvin, E., Norman, G., Sherbino, J., Sibbald, M., . . . & Peters, A. (2019). Experienced physician descriptions of intuition in clinical reasoning: A typology. Diagnosis, 6(3), 259–268.
Veen, M. (2021). Creative leaps in theory: The might of abduction. Advances in Health Sciences Education, 26(3), 1173–1183.
Walton, D. (2004). Abductive reasoning. University of Alabama Press.
23 Medical Reasoning and the GW Model of Abduction
Cristina Barés Gómez and Matthieu Fontaine
Contents
Introduction
Galenic Medicine: Between Logic and Medical Practice
Medical Diagnosis Within the GWm
Deduction, Induction, and Ignorance-Preserving Abduction
Cutdown and Fill-Up in Medical Reasoning
Evidence-Based Medicine and Mechanistic Evidence
Mechanistic Hypotheses
Conclusion
References
C. Barés Gómez · M. Fontaine
Departamento de Filosofía, Lógica y Filosofía de la Ciencia, Universidad de Sevilla, Seville, Spain
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_14

Abstract
Medical reasoning involves diagnosis, but also prognosis, therapy planning, and monitoring. All of these are based on hypotheses introduced by means of abductive inference, followed by predictions made by deduction and by confirmation in the course of an inductive phase. Abductive hypotheses are not always immediately followed by the deductive-inductive phase and may give rise to further hypotheses. Diagnostical, therapeutic, and monitoring hypotheses can be connected within the Gabbay and Woods model, in which abduction is considered an ignorance-preserving inference. The point is that diagnostical hypotheses can be activated in other abductions, giving rise to therapeutic hypotheses and monitoring strategies, without having received confirmation. This proposal is applied to Magnani’s Select and Test Model, and consequences are drawn with respect to the use of hypotheses in medical reasoning. The chapter ends with a discussion of the debate between “mechanistic” and “probabilistic” perspectives in medicine. In agreement with Russo and Williamson’s theses, it must be recognized that both perspectives are necessary to establish causality. Nonetheless, the inference-based analysis proposed here suggests that mechanisms can be viewed as abductive hypotheses, whose evidence can only be established in an inductive phase of empirical confirmation. Yet, given that induction can never make the first step in medical inquiry, we must always start with an abductive phase in the course of which a mechanism is conjectured.

Keywords
Abduction · GW model · Hypothesis · Mechanism · Medicine
Introduction
In this chapter, the authors provide an explanation of medical reasoning within the Gabbay and Woods model of abduction (hereinafter GWm), following Gabbay and Woods (2005). According to the GWm, abduction is an ignorance-preserving inference triggered in response to an ignorance problem. That is, agents confronted with surprising facts introduce hypotheses as a basis for new actions, independently of their confirmation. This does not mean that the agent will not later look for their confirmation, e.g., by means of empirical trials, but that doing so leads beyond abductive reasoning. In some sense, the chapter revives the Peircean triad involving abduction, deduction, and induction in scientific research, which must always begin with the introduction of hypotheses that recommend a course of action. In medical reasoning, the physician introduces a hypothesis, e.g., a diagnosis, and then activates this hypothesis in further reasonings and actions, despite the absence of confirmation. The activation of the hypothesis can be followed by deductive inferences by means of which predictions are made, e.g., a prognosis, and confronted with empirical facts in an inductive phase of (probabilistic) confirmation.
Such a Peircean understanding of abduction precludes, from the very start of this work, a Bayesian approach, in which hypotheses would be introduced on the basis of degrees of credence, probability, or any other statistical support. Indeed, probability is concerned with induction. And, as will be explained in this chapter, there cannot be induction without a previous abduction by means of which the hypothesis to be tested is introduced. Therefore, it must be clear from the beginning that abduction is neither deduction nor induction, nor even a mix of both. It is a third kind of reasoning whose cognitive virtue is not related to deductive validity or inductive strength but is pragmatic, as it provides a basis for action despite persisting ignorance.
It is only once a hypothesis is conjectured that scientific or medical inquiry can begin. In this sense, it will be interesting to benefit from the insight of the Select and Test Model (hereinafter STm) of medical reasoning advocated by Magnani (1992).
Indeed, such a model accounts for medical diagnosis in terms of a triad starting with abduction and followed by a deductive-inductive phase of confirmation. Magnani applies the STm to diagnosis, therapy planning, and monitoring. Nonetheless, he is not explicit about the connection between these three forms of medical reasoning. An extension of the STm is suggested here: it is reconsidered within the GWm of abduction, in which diagnosis, therapy planning, and monitoring can be connected in an ignorance-preserving way. Indeed, diagnostical hypotheses are not always released in a deductive-inductive phase of confirmation, but they are often activated in further reasonings when planning, for example, a therapy. That is, the diagnosis is conjectured as a basis for the introduction of another hypothesis, namely, a hypothetically efficient therapy, which can in turn be followed by a monitoring strategy. A redefinition of the STm within the GWm thus provides a more general and unified framework for medical reasoning.
Beyond the selection of hypotheses, the case of their creation is also considered. That is, although physicians usually base their practice on the selection of hypotheses stored in an encyclopedic background, medical research also involves the identification of new illnesses, pathological factors, and therapies, or even ways of monitoring patients, which are usually expressed in terms of causal laws. Different kinds of abductions and, consequently, different kinds of hypotheses will thus be distinguished. This discussion leads into a current debate in the philosophy of medicine between “mechanistic” and “probabilistic” perspectives. The mechanistic perspective is concerned with the explanation of causal relations. The probabilistic perspective looks for evidence of causality in difference-making studies that display the correlation between variables – it is usually engaged with empirical trials and statistical-probabilistic results.
Following theses advocated by Russo and Williamson (2007), it is acknowledged that both perspectives are necessary to establish causality in medicine. Nonetheless, in this chapter, they are not considered as two different kinds of evidence. The difference is rather to be understood in inferential terms: whereas mechanisms result from abduction, probabilities and statistics result from empirical trials and induction. And there is no evidence of a mechanism that would not be empirical or inductive. But induction can never make the first step. We must always begin with the introduction of a hypothesis, e.g., a mechanistic hypothesis. Such a hypothesis will guide the investigation and serve as a basis for planning empirical trials. Without a hypothesis to be tested, there is no empirical trial to be conducted. In the end, however, the corroboration of the hypothesis is always inductive, and hence probabilistic.
It is an interesting historical fact that Galen, in the second century AD, was already looking for some kind of reasoning beyond deduction and induction. Indeed, although he thought that medical practice should be based upon deductive (syllogistic) logic, he was also aware that confirmation could not be obtained independently of the application of the conclusions. For example, just as an architect would have confirmation of the correctness of his reasonings only after building a balanced and stable edifice, a physician would have confirmation of his diagnosis only in a therapeutical application of it. Deduction alone is not sufficient. Nonetheless, induction would be of no help either. Yet, Galen’s proposal remains unclear. Without claiming that it was what Galen had in mind, it is suggested that the concept of abductive inference might help to understand what is lacking.
Thus, the chapter begins with brief historical remarks on Galenic medicine, between logic and practice (section “Galenic Medicine: Between Logic and Medical Practice”). Then, medical diagnosis is explained within the Gabbay and Woods model of abduction (section “Medical Diagnosis Within the GWm”). Afterwards, Magnani’s Select and Test Model of medical reasoning is outlined and an extension of it in the GWm is provided (section “Deduction, Induction, and Ignorance-Preserving Abduction”). There follows a discussion of its application to creative abduction (section “Cutdown and Fill-Up in Medical Reasoning”). Finally, the Russo-Williamson thesis about mechanistic and probabilistic perspectives in medical research is introduced (section “Evidence-Based Medicine and Mechanistic Evidence”), and after having expressed some doubts about the notion of “mechanistic evidence,” the pragmatic virtue of mechanisms is set out in terms of abductive hypotheses (section “Mechanistic Hypotheses”).
Galenic Medicine: Between Logic and Medical Practice
Documents about medical reasoning and medical diagnosis are already found in Mesopotamian (Sumerian and Akkadian), Egyptian, Greek, and Roman cultures (see, e.g., Barés Gómez (2018) for a study of medical diagnosis in Akkadian and Barés Gómez (2021) for a study of hippiatric texts in Ugarit). Nevertheless, we do not find an explicit and systematized reflection on how to reason about medical diagnosis until Galen (Hankinson (1991)). This does not mean that there is no reasoning in earlier medical texts, but that medical reasoning becomes an object of study as such with Galen. Indeed, according to Galen (2002), “the best doctor is also a philosopher” and three characteristics are required. First, the doctor needs logic and should be trained in demonstrative reasoning. Second, he needs physics and natural philosophy. Third, he needs ethics. His patients should know that he is devoted to the art and practices it out of benevolence for humankind. We may say that we find in Galen the beginnings of a “logic of medical reasoning,” explicitly based upon Aristotelian logic.
Although Galen’s work is based upon Aristotelian syllogistic, he gives more importance to a wider process of inference underlying medical practice. Indeed, even if the physician’s reasoning involves deductive demonstrations that must eventually be confronted with experience, the process also includes the search for causal explanations which are neither demonstrable nor inductive.
For the sake of explanation, we begin by referring to Woods’s (2018, 14) trichotomy between consequence-having, consequence-spotting, and consequence-drawing. Consequence-having occurs in logical space and deals with entailment between, e.g., a statement A or a set of statements Σ and a statement B. Consequence-spotting is an epistemic achievement that occurs in psychological space: it consists in knowing that such an entailment relation holds. Consequence-drawing occurs in the inferential subspace of psychological space. The three are related. We cannot spot a consequence that is not there. We cannot draw a consequence without having spotted it. However, it would be a mistake to reduce the latter two relations to the former. In particular, correct inferences should not be reduced to deductive entailment. Aristotle was the first logician to think of consequence-having and consequence-drawing separately. Syllogistic provides us with a means to isolate consequence-having and to encapsulate it in deductive schemes. By contrast, practical logic is a logic of how agents actually draw conclusions, and the processes whereby conclusions are drawn do not always follow the standard of deductive logic. This does not mean that the agents are wrong in drawing their conclusions or that they commit errors of reasoning.
This can be understood in the context of a cognitive economy, with respect to a cognitive system. According to Magnani (2017, 9), an eco-cognitive system is a triple <A, T, R>, where A is an agent, T is a target (e.g., something the agent wishes to know or to do), and R relates to the available resources (information, computational capacity, memory, time, and so on). In this context, correctness is to be thought of in terms of a balance between the resources available and their use in view of attaining the target. Successive attempts at drawing conclusions, in the course of processes of error and correction, may often constitute a more efficient way of reasoning than applying the high standards of deductive logic. As a consequence, we must consider that conclusions are drawn defeasibly; that is, they could be revised in the light of new information. Although medical diagnosis involves demonstrations, possibly based on deductive entailment relations, drawing conclusions brings the physician beyond consequence-having and deduction. We see in Galen’s work an attempt to extend medical diagnosis beyond deduction, by considering how a physician actually draws conclusions, which assumes an articulation of different kinds of inferences. That is, even if Galen relies on syllogistic deduction, he also recognizes the necessity of approaching medical reasoning in terms of some kind of practical logic. Perhaps he was looking for what we will refer to later as a form of “abductive reasoning” underlying medical practice. In any case, it is clear that beyond his confidence in deductive demonstrations, Galen was explicitly aware that other forms of reasoning were needed to deal with medical practice.
When analyzing medical diagnosis, Galen recognizes that crude observation alone is not enough and that science must be organized and structured properly in an explanatory system, or “causal understanding,” to put it in Hankinson’s terms (2008, 166). This work must be done by someone “gifted and practiced in logic” (Morrison (2008)). The first task consists in differentiation (differentiae), that is, the task of defining precisely the disease, affection, and symptoms. Notice that we will not discuss here the philosophical problems behind the definitions of the concepts of “disease,” “illness,” and “health” (see, e.g., Hofmann (2002) for more details on this issue). The second task consists in looking for a cause or explanation for an abnormal natural activity or function, as is clear in the following quotation of Galen’s On the Therapeutic Method by Hankinson:
How, you may ask, can we go about doing this? From [X. 50 k] an undemonstrable axiom, agreed by all because it is plain to the intellect. And what is that? That nothing occurs without a cause. If this is not agreed, we will be unable to investigate the cause of damage
to or total loss of vision. But since this is one of the things that is plain to the intellect, when we have posited that there is some cause of the damage, we can proceed to search for it. Whether you choose to call this cause a disposition of the body, or the body ordered in a certain way, make no difference to the matter before us. Thus at all events you will either say that the disease itself is the cause, or if the damage to the activity is actually the disease, the disposition which damages it will be the cause of the disease. (Hankinson (1991, 26, 7.1)).
Without entering into the problem of causation, we note that this quotation highlights the difficulty of providing a valid reasoning by means of which a cause or an explanation of the illness and/or the associated symptoms and signs would be established. It is worth noting that different kinds of diagnoses may involve different ways of reasoning. Galen’s methodology is based upon the relation between reason (logos) and experience (empeiria) or empirical testing (peira). A systematic or logical approach is needed to guide medical practice (Hankinson, 2008, 166), but it has to be complemented by experience. The requirement of a logical basis nevertheless justifies following the deductive approach of syllogistic (Morrison, 2008). Yet how can the reasoning itself be complemented by experience? What is the role played by empirical tests? How should they be implemented in such a logic? According to Hankinson’s analysis of Galen’s work (2008, 166 ff.), this consists in working “backwards from the empirical observation or generalisation to the discovery of the fundamental facts.” For Galen, this cannot be induction – i.e., some kind of generalization from particulars – since we must infer “indicatively” “to the hiding heart of things” in view of determining the proper structure of things. In fact, Galen’s approach gives a fundamental role to a demonstration ending in undemonstrable a priori axioms – e.g., “nothing occurs without a cause” (Galen, On Therapeutic Method, X, 50 in Hankinson (1991, 26)) – and to the injunction to seek further empirical confirmation. But his attempts to articulate the two seem to remain incomplete. At the same time, Galen claims in the Thrasybulus: “we have shown in On demonstration that inductions (epagôgai) are useless for scientific demonstrations” (Hankinson (2008, 168)). This is difficult to understand.
Our interpretation, here, is that he lacks a clear identification of a third kind of inference that we would nowadays refer to as “abduction” – i.e., an inference in the course of which an (explanatory) hypothesis is introduced. Indeed, the empirical test (peira) is fundamental in Galen’s approach to medical reasoning, since it eventually provides confirmation of the explanation put forward by rational means (logos). Consequently, medical diagnosis must begin with reasoning, and it is only at the end of the process that empirical tests are involved. But, even if the initial reasoning phase is not inductive, it cannot be restricted to deduction either. Interestingly, Galen also suggests that in some cases the physician may act despite a lack of confirmation, on the basis of forms of reasoning recognized as efficient:
When we find a demonstrative method which leads us to what we were looking for and is clearly confirmed by the facts of the matter themselves, this gives us no small test of its truth, so that we may risk applying it in cases where there is no clear confirmation. (Galen, On the Diagnosis and Cure of the Errors of the Soul, V 68, = SM I, 52,23–53,6. Translation in Hankinson (2008, 169).)
The confirmation comes from experience, but as Galen said, this confirmation might come from the practical application of the inferred conclusions. Recently, based upon Peirce’s concept of abduction – i.e., a reasoning by means of which a hypothesis is introduced – Gabbay and Woods (2005) have defined this form of reasoning in terms of ignorance-preserving inference. That is, it is an inference triggered in response to an ignorance problem with respect to the observation of surprising facts. But this inference does not aim at removing ignorance, only at introducing hypotheses as a basis for new actions despite a (possibly) persisting state of ignorance. Without claiming that this was what Galen had in mind, this may shed light on medical practice in the spirit of that necessity of combining (nondeductive) reasoning and experience, as well as of conjecturing hypotheses in order to act without empirical confirmation.
Medical Diagnosis Within the GWm
Medical diagnosis is usually based on nosology, in relation to an encyclopedic background and a general classification of already known diseases. Various nosological systems involving different taxonomic criteria and methodologies are recognized by the World Health Organization. According to Stempsey (2017, 647), the Systematized Nomenclature of Pathology (SNOP) developed by the College of American Pathologists in 1965 provides an exhaustive classification of disease, based on an anatomico-clinical model. “Diseases are described with respect to four fields,” Stempsey explains, “topography (the part of the body affected), morphology (the structural changes produced in the disease), etiology (the etiologic agent responsible for the disease), and function (the manifestation of the disease).” Nosological diagnoses do not pretend to explain the causal mechanism behind the pathological state of the patients, only to identify the diseases which are being manifested by signs and symptoms. In this context, diagnosis includes the history of the illness, physical examination, and laboratory and other sorts of clinical tests. All of this provides us with the information necessary to perform a diagnosis. Yet, what is the underlying inference by means of which a diagnosis is conjectured? How is it used in medical practice? Diagnosis can be explained in terms of abductive inference, in particular within the GWm of abduction, which also accounts for decisions made by physicians despite a lack of confirmation. Let β be a set of signs/symptoms and α a disease. The underlying scheme of argument may be represented (and instantiated) as follows:
The patient displays β. (The patient displays a flu-like illness, including shaking chills, headache, muscle aches, tiredness, nausea, vomiting, and diarrhea.)
The patient has α → The patient displays β. (If the patient has malaria, then he displays a flu-like illness, including shaking chills, headache, muscle aches, tiredness, nausea, vomiting, and diarrhea.)
∴ The patient has α. (The patient has malaria.)
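The scheme can be illustrated with a minimal sketch. The table below is a hypothetical toy, not clinical data: the entries for influenza and for a sprained ankle, as well as the names MANIFESTATIONS and candidate_diseases, are assumptions introduced only for illustration. The sketch merely collects every disease α for which "the patient has α → the patient displays β" would hold for the observed signs β.

```python
# Hypothetical toy table: disease -> signs and symptoms it would account for.
# The entries are illustrative assumptions, not clinical data.
MANIFESTATIONS = {
    "malaria": {"chills", "headache", "muscle aches", "tiredness",
                "nausea", "vomiting", "diarrhea"},
    "influenza": {"chills", "headache", "muscle aches", "tiredness", "cough"},
    "sprained ankle": {"swelling", "joint pain"},
}


def candidate_diseases(observed, manifestations):
    """Return every disease whose known manifestations cover all observed signs."""
    return sorted(disease for disease, signs in manifestations.items()
                  if observed <= signs)  # set inclusion: observed is a subset


observed = {"chills", "headache", "muscle aches", "tiredness"}
print(candidate_diseases(observed, MANIFESTATIONS))
# prints ['influenza', 'malaria']
```

Under these assumptions, more than one disease satisfies the conditional premise for the same observed signs, so the scheme delivers candidates to be conjectured rather than a conclusion that follows from the premises.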
If diagnosis is based on this form of reasoning, then it is not deductive. Indeed, from the perspective of deduction, this scheme is nothing but the well-known fallacy of “affirming the consequent,” whose conclusion does not necessarily follow from the premises, given that different diseases can be manifested by the same symptoms. As such, medical diagnosis rather takes the shape of abduction, as in Peirce’s schema (CP 5.189; “CP” refers to the Collected Papers, Peirce (1931–1958)):
The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.
In this scheme of argument, the conclusion is a hypothesis. As such, it can be defeated, corrected, or revised in the light of new information. Not only is abduction not deductive, but it is not inductive either. Induction is concerned with the probable confirmation of a hypothesis by confrontation with facts, possibly in the course of empirical trials. Abduction is only concerned with the introduction of a hypothesis. According to Peirce (CP 5.146), “Abductive and Inductive reasoning are utterly irreducible, either to the other or to Deduction, or Deduction to either of them.” Moreover, “probability proper has nothing to do with the validity of abduction” (CP 2.102): a hypothesis need not be inductively strong to be accepted. The cognitive virtue of abduction is pragmatic: it is a response to an ignorance problem that consists in introducing a hypothesis as a basis for new action. Further developments of the GWm focusing on the pragmatic feature of abduction and the nature of a specific speech act of conjecturing can be found in Barés Gómez and Fontaine (2017) and Chiffi and Pietarinen (2020). As nicely put by Woods (2013, 376), “[d]eductive inference is truth-preserving. Inductive inference is likelihood enhancing. Abductive inference is ignorance-preserving.” According to the GWm, abduction is a response to an ignorance problem.
A question to which an agent has no answer acts as a cognitive irritant that forces him to formulate a hypothesis that may serve as a basis for new actions, despite his persisting state of ignorance. With respect to a cognitive system <A, T, R>, this can be understood as a scant-resources adjustment strategy. The agent does not have sufficient resources to meet the target, but conjectures that, if the hypothesis were true, it would allow him to do so. Then, and this is perhaps one of the most salient features of the GWm when applied to medical diagnosis, that conjecture can serve as the basis for new actions, even in the absence of an answer to the ignorance problem. Let Q be a question we cannot answer with our present knowledge and which acts as a cognitive irritant. Three situations are possible:
• Subduance. New knowledge removes ignorance (e.g., empirical discovery).
• Surrender. We give up without looking for an answer.
• Abduction. We establish a hypothesis as the basis for new actions.
To put it in Woods’s terms: “[w]ith subduance, the agent overcomes his ignorance. With surrender, his ignorance overcomes him. With abduction, his ignorance remains, but he is not overcome by it.” Abduction leads to a hypothesis that could be revised in light of new information. It “is a response that offers the agent a reasoned basis for new action in the presence of that ignorance” (Woods (2013, 368)).
More formally, let T be an agent’s epistemic target at a specific time, K the agent’s knowledge base at that time, K* an immediate successor of K, and R an attainment relation for T (i.e., R(K, T) means that knowledge base K is sufficient to reach target T), while ⇝ denotes the subjunctive conditional connective (for which no particular formal interpretation is assumed), and K(H) is the revision of K upon the addition of H. C(H) denotes the conjecture of H and H^C its activation. Let T!Q(α) denote the setting of T as an epistemic target with respect to an unanswered question Q to which, if known, α would be the answer. The GWm has the following general structure:
1. T!Q(α) (fact)
2. ∼R(K, T) (fact)
3. ∼R(K*, T) (fact)
4. H ∉ K (fact)
5. H ∉ K* (fact)
6. ∼R(H, T) (fact)
7. ∼R(K(H), T) (fact)
8. H ⇝ R(K(H), T) (fact)
9. H meets further conditions S1, . . . , Sn (fact)
10. Therefore, C(H) (sub-conclusion (1,7))
11. Therefore, H^C (conclusion (1,8))
Let us now explain this scheme through an application to medical diagnosis. A patient has a flu-like illness, including shaking chills, headache, muscle aches, tiredness, nausea, vomiting, diarrhea, and so forth. The physician’s agenda is to treat the patient or at least to alleviate his pain. The physician does not know what ailment is causing the patient’s discomfort. In other words, he has an ignorance problem, for which reason his target is to discover its nature so that, if he knew it, it would solve the problem. The starting point is T ! Q(α) (Step 1), in which target T is the discovery of an illness α that would allow the physician – provided that he knew it – to answer question Q and to treat the patient accordingly. The resources required for reaching the target are unavailable to the physician. He only knows the symptoms and other related information (the patient’s account and so forth) (Step 2). In his encyclopedic knowledge, he may be aware that a determined illness causes these symptoms, but without being totally sure that the patient is actually suffering from it. Nor is he capable of finding any immediate successor to answer the question (Step 3). Indeed, he may not have the necessary resources for discovering such an answer in a timely fashion before the end of the consultation. If the physician were able to find a solution, this would lead to subduance, thus halting the abductive process. Nonetheless, neither are tests ever infallible nor is subduance so absolute. Despite lacking an answer, the physician suspects that the patient is suffering from malaria. As a hypothesis, his suspicion does not pertain to his knowledge base (Step 4) or to any immediate successor
490
C. Barés Gómez and M. Fontaine
(Step 5). Therefore, his ignorance problem remains unresolved and the target unattained (Step 6), even when combined with his knowledge base (Step 7). Yet, the subjunctive relation H R(K(H), T) (Step 8) holds, since if H were true, then it would play a role in the attainment of T. As such, H is worth being conjectured and C(H) can be concluded (Step 10). These steps are of particular importance insofar as they are the keystone of ignorance-preserving abduction. They express precisely how Gabbay and Woods understand the subjunctive in the second premise of Peirce’s schema and, subsequently, the “hence” of the conclusion. Thus, given certain conditions – yet to be specified – met by H (Step 9), hypothesis H can be conjectured (Step 10). Let us assume that our physician suspects a case of malaria. Abduction does not end there. Indeed, the physician now has three possibilities: 1. The hypothesis is confirmed – e.g., by means of an antigenic diagnostic test – and a new piece of (defeasible) knowledge is obtained. This is subduance. 2. The hypothesis is not confirmed or is invalidated – e.g., by means of a test – and the physician gives up. He can thus look for another hypothesis (e.g., influenza). 3. The hypothesis is not confirmed, but he maintains it anyway. The third case could happen if the agents are in the middle of Amazonian Forest and have no possibility to test. This leads to full abduction: the physician activates the conjecture (Step 11), by employing it as the basis for new actions, despite his persisting state of ignorance. By contrast, partial abduction would end at Step 10. Depending on the context, different strategies may be adopted. Indeed, if there is no time (owing to the risk of contagion or death) or if there are no material resources (money, test, scanner, etc.) available, the physician may be prompted to act swiftly in the absence of confirmation. Thus, he would perform a full abduction by activating the conjecture, without prior confirmation. 
For example, if he suspected malaria, he would lose no time in prescribing quinine or chloroquine or something similar. If a test were performed, after Step 10, this would lead to a post-partial abductive confirmation. When acting, albeit in an ignorance-preserving manner, a post-full abductive confirmation may also be obtained. This occurs, for example, when a surgeon opens a patient to treat appendicitis and, when visually confirming the infection, discovers that his diagnosis was correct. In relation to the ignorance-preserving aspect of abduction, it should be noted that abduction is evidentially inert. In other words, it does not provide any grounds for the truth or falsity of its conclusion. It does not even oblige the reasoner to believe this conclusion, as stressed by Peirce (1992, 172), who considers that the introduction of an abductive hypothesis is a form of guessing (CP 6.530). Abduction is an inferential process during which the reasoner is justified in introducing a hypothesis as a basis for new actions, possibly in accordance with the GWm. Of course, many filters, such as plausibility and reliability, among others, may come into play when introducing the hypothesis. But none of them are either sufficient or necessary, as shown by Gabbay and Woods (2005, chap. 3 to 7).
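The three possibilities open to the physician after Step 10 can be illustrated with a minimal Python sketch. It is not part of the GW model itself: the names `Outcome`, `GIVE_UP`, and `resolve_conjecture`, as well as the encoding of a test result as `True`, `False`, or `None`, are hypothetical choices made only for illustration.

```python
from enum import Enum, auto
from typing import Optional


class Outcome(Enum):
    SUBDUANCE = auto()       # H is confirmed: new (defeasible) knowledge
    GIVE_UP = auto()         # H is not confirmed and is abandoned
    FULL_ABDUCTION = auto()  # H is activated without confirmation (Step 11)


def resolve_conjecture(test_result: Optional[bool], maintain: bool) -> Outcome:
    """Classify what follows once H has been conjectured (Step 10).

    test_result: True if a test confirmed H, False if it did not,
                 None if no test could be performed (illustrative encoding).
    maintain:    whether the agent keeps H as a basis for action
                 despite the absence of confirmation.
    """
    if test_result is True:
        return Outcome.SUBDUANCE
    if maintain:
        return Outcome.FULL_ABDUCTION
    return Outcome.GIVE_UP


# The physician in the forest cannot test but acts on the conjecture anyway.
print(resolve_conjecture(test_result=None, maintain=True))
```

The sketch only classifies outcomes. It deliberately says nothing about the conditions of Step 9, which the text leaves unspecified.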
23 Medical Reasoning and the GW Model of Abduction
Deduction, Induction, and Ignorance-Preserving Abduction

Thinking of abduction in terms of ignorance-preserving inference, distinct from deduction and induction, does not mean that it should be conceived in isolation from them. Full abduction, when hypotheses are released in further inferential work, may involve the three kinds of inference. Indeed, a hypothesis may serve as a premise for further deductive inference – e.g., to make predictions – the conclusions of which remain hypothetical. When planning empirical trials for the confirmation of the hypothesis by induction, other abductive hypotheses may also be involved. This relates to the Peircean triad, in which research begins with surprise, to which we respond by abductive hypotheses, which in turn “recommend a course of action” – to put it in Peirce’s terms (as quoted by West (2016, 133)). In letters written by Peirce to Lady Welby and quoted by Bellucci and Pietarinen (2016), abductive conclusions are also referred to as “investigand,” i.e., an invitation to investigate. Magnani’s STm of medical reasoning also describes diagnosis, therapy planning, and monitoring, by referring to the three kinds of inference:

[S]elective abduction is the making of a preliminary guess that introduces a set of plausible diagnostic hypotheses, followed by deduction to explore their consequences, and by induction to test them with available patient data. (Magnani (1992, 25))
According to the STm, medical reasoning begins with the abductive phase, in which hypotheses are selected on the basis of the patient’s data. Following this, the deduction-induction phase deals with the process of evaluation. Deduction is used for predicting expected consequences and evolution (e.g., prognosis). As stressed by Magnani (1992, 24), induction should be understood here as an ampliative process of generalizing knowledge with which hypotheses can be confirmed or rejected. In other words, hypotheses whose expected consequences turn out to be consistent with facts and the patient’s data are corroborated by induction, and the others are rejected. In both cases, new or refined hypotheses may be introduced. Although different ontologies are involved in diagnosis, therapy, and monitoring, the models are similar. Once a hypothesis is established (e.g., a diagnosis, a therapy, or a monitoring strategy), certain predictions derived at a time t1 (the presence of a certain symptom, the development of consequences, estimates of a particular evolution) can be revised at a time t2: the conclusions are defeasible. Diagnosis, therapy planning, and monitoring can all be explained by the STm, namely, by first selecting a hypothetical diagnosis, therapy, or monitoring strategy, which is usually ranked (by parsimony, danger, cost, curability, etc.), and then by testing it in the deductive-inductive phase. In the STm, the emphasis is placed on a selection among “a set of pre-enumerated hypotheses provided from established medical knowledge” (1992, 23) – e.g., the nosological nomenclatures previously mentioned. Filters doubtless play a cognitive role when selecting hypotheses, but specifying them is a very complex task, and they do not constitute necessary or sufficient conditions, whether individually or collectively (see Gabbay and Woods (2005), Barés Gómez and Fontaine (2021a)).
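The two phases just described can be sketched in Python. This is a toy rendering under stated assumptions, not Magnani’s implementation: the data structure `Hypothesis`, the functions `select` and `evaluate`, the integer ranking, and the two-entry knowledge base with its made-up test findings are all hypothetical and serve only to show the shape of a select-and-test cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset  # findings the hypothesis can account for
    predicts: frozenset  # further findings expected if it is true
    rank: int            # lower is considered first (parsimony, danger, cost, ...)


# Hypothetical, pre-enumerated knowledge base (illustrative content only).
KNOWLEDGE = [
    Hypothesis("influenza", frozenset({"fever"}),
               frozenset({"positive influenza test"}), rank=2),
    Hypothesis("malaria", frozenset({"fever"}),
               frozenset({"positive antigen test"}), rank=1),
]


def select(findings, knowledge):
    """Abductive phase: select and rank hypotheses that account for a finding."""
    candidates = [h for h in knowledge if h.explains & findings]
    return sorted(candidates, key=lambda h: h.rank)


def evaluate(hypothesis, observed, ruled_out):
    """Deductive-inductive phase: compare predicted findings with patient data."""
    if hypothesis.predicts & ruled_out:
        return "rejected"
    if hypothesis.predicts <= observed:
        return "corroborated"
    return "undecided"


ranked = select({"fever"}, KNOWLEDGE)
for h in ranked:
    print(h.name, evaluate(h, observed={"positive antigen test"},
                           ruled_out={"positive influenza test"}))
```

Any verdict returned here stays defeasible in the sense of the text: calling `evaluate` again at a later time with different patient data may change it.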
Although the STm is compatible with different models of abduction, the interconnection between diagnosis, therapy planning, and monitoring can be explained within the GWm. Indeed, diagnostic hypotheses are not always immediately sent to empirical inquiry. They are often released for premissory work in planning therapies and monitoring strategies, which also involve an abductive phase. That is, the activation of a first abductive hypothesis, e.g., diagnosis, occurs in another abductive phase – e.g., therapy planning and/or monitoring – and it is only later, after having observed the evolution of the patient, that all of these hypotheses are eventually corroborated or rejected. Such hypotheses may come one after the other in an ignorance-preserving way. Because of the intertwinement of the different inferences involved in medical reasoning, it is likely that ignorance need not be totally preserved and abduction may be “ignorance-mitigating” or even “knowledge-enhancing,” following Magnani (2017). Therefore, although the GWm does not involve an inductive phase of confirmation, it constitutes a general framework in which the different levels of medical reasoning of the STm can be articulated. More concretely, a physician conjectures the conclusions of his diagnosis in view of elaborating a therapeutic strategy. The therapeutic strategy, as explained in the STm, first relies on a hypothesis which will be confirmed by its effectiveness – i.e., the patient’s recovery. Before the end of the therapy, the physician may even plan a monitoring strategy, which in turn will provide grounds to decide the effectiveness of the therapy. For example, when a physician diagnoses appendicitis, he does not look for a full confirmation before acting. His hypothesis recommends that he urgently perform surgery on the patient.
Most of the time, the localization of the pain, urinalysis (e.g., to discard cystitis), and blood tests (to detect signs of infection) are sufficient to engage in surgery. At other times, echography or a scan may be required. But it is only once the physician has opened the patient’s belly that he will really get confirmation. The physician never acts completely in the dark (hopefully!), but the different kinds of reasoning and the different phases of the STm must be thought of as being subtly intertwined in a more or less ignorance-preserving way. That is, hypotheses give rise to the formulation of other hypotheses and recommend various kinds of courses of action, among which are inductive phases of confirmation. Indeed, in the various phases of the medical process, the results obtained by monitoring may confirm the effectiveness of the therapy, which would in turn confirm the initial diagnosis. At any point in the process, information in conflict with one of the hypotheses may involve the revision, the correction, or even the rejection of one or several hypotheses. How this is performed is a matter of defeasible reasoning that may be modeled in various ways. The distinction between post-partial and post-full abductions, introduced in the previous section, now takes on another dimension. In the former case, a hypothesis is conjectured (Step 10 of the GWm), and then it is immediately sent for empirical testing, that is, to engage in an inductive phase of confirmation. According to Woods (2013, 371), this would lead to an action involving the hypothesis but not the activation of it. Such a test would yield a post-partial abduction confirmation. By contrast, full abduction involves the activation of the hypothesis, when the hypothesis is released for inferential work. For example, when a physician infers
a therapeutic or monitoring strategy without having confirmation of his initial diagnosis, he acts on the hypothesis in the context of a full abduction. Sometimes the therapy is successful, which provides evidence in favor of the physician’s initial diagnosis. For example, if the physician suspects a bacterial infection, he prescribes antibiotics – very often without prescribing biological analysis. If the patient’s condition improves, this will provide further reasons to believe that the initial diagnosis was correct, albeit defeasibly. In this case, we may say that we have a post-full abduction confirmation. Interestingly, the STm highlights that therapy planning and monitoring follow the same model of reasoning as diagnosis. That is, they begin with a hypothesis and then engage in a deductive-inductive phase as well. In this context, diagnosis must always give rise to other hypotheses. Most of the time, if not always, it must involve other kinds of hypotheses in such a way that it engages in full abduction. That is, in practice, it does not even seem that post-partial abduction confirmation really occurs. Even if one prescribes a PCR test on suspicion of Covid, there is still medical reasoning involving hypotheses concerning illness and test and/or monitoring strategies. Most of the time, without the activation of these hypotheses in full abduction, action is not possible. This model of medical reasoning, extending the STm within the GWm, highlights the interconnection between diagnosis, therapy, and monitoring. But it also provides further insights about the status of hypotheses, in particular with respect to the agent’s targets. If the agent is a physician, his target may be (i) to know the disease causing the state of the patient, (ii) to know the therapy which would heal the patient, and (iii) to know how to observe the evolution of the patient. Again, these targets are intertwined.
Nonetheless, having a therapeutic answer is sometimes sufficient to close the agenda. That is, if someone has a fever in the middle of the Amazon rainforest, the physician may suspect a case of malaria and prescribe quinine. If the patient gets better, he closes the agenda, no matter whether the patient really had malaria. But if the case happens in Europe, say in Lille, and the physician suspects malaria, he will keep his agenda open in order to verify whether the patient really has malaria (perhaps in order to inform sanitary authorities). Moreover, the fact that a patient got malaria in the North of France will in turn become a surprising fact calling for a theoretical and practical response. Why is he infected? How did it happen? Does it call for a political response? The target is thus contextual. It also depends on the resources of the agent: in the middle of Amazonia, time and possibility of tests are lacking, and the physician must make a decision immediately. Whereas the main target of the physician in the middle of Amazonia is to heal the patient, in Europe it may be to know the disease or where the infection comes from. To put it in Hempelian terms, we can say that in the former case, the therapy gets the status of a test implication (see Hempel (1966, 7)), and its confirmation is enough to close the agenda. In the latter case, it would not, and further tests would be required. If a physician suspects a case of malaria in the middle of Amazonia, the only hypothesis he has is “if someone has malaria, he has high fever (plus other symptoms).” But the hypothesis “if someone (having malaria) takes quinine, the symptoms will cease” may be used as a test implication. If the symptoms cease, then the physician will corroborate the hypothesis that the patient had malaria, albeit
defeasibly. Test implication hypotheses are usually stated in a conditional form; they tell us that under specific conditions an outcome of a certain kind will occur. According to Hempel’s deductive-nomological model (1945), we begin with a hypothesis, then we draw predictions by deduction, and finally we confront these predictions with facts. If the facts are consistent with the prediction, the degree of probability of the hypothesis is increased. More concretely, medical hypotheses may serve as a basis for predictions concerning the patient’s state. If this prediction is consistent with the facts, then the diagnosis is corroborated. Despite similarities with the STm and the Peircean triad, the deductive-nomological model does not aim at representing the introduction of hypotheses, which does not matter in the process of confirmation. We thus have the following reasoning:

1. If the patient has malaria, then he has symptoms such as fever, etc.
2. If the patient takes quinine, the symptoms will disappear.
3. The patient takes quinine.
4. The symptoms disappear.
5. Therefore, it is likely that the patient has malaria.
The statements (2)–(4) involve a deductive-nomological phase of confirmation of the test implication (2). If the test implication is confirmed, then the diagnosis is corroborated (5), albeit defeasibly. Interestingly, this reasoning displays two kinds of hypotheses: first the diagnosis, and second the therapy. The test implication corresponds to the therapeutic phase of the STm, in which the hypothetical diagnosis is activated in an ignorance-preserving way. Now, if the context is a European city, the physician would probably seek further evidence in view of corroborating his diagnosis, and a monitoring phase will follow. We must be careful when considering the falsification of a diagnosis. It has been well known since Popper (1959) that falsification, by contrast with corroboration, may be based on modus tollens and thus on a deductively valid argument. Indeed, observing a black swan is enough to falsify the hypothesis that all swans are white. But falsifications of test implications are not always sufficient to falsify the hypothesis. Indeed, in our example, the test implication involves auxiliary hypotheses – e.g., that quinine treats malaria and that the quinine prescribed by the physician is in a good state of conservation and sufficiently active to have the expected effects. If the patient does not recover after having taken quinine, it may be that the quinine the physician gave him was too old, not in a good state of conservation, not of good quality, or inactive, and not that the initial diagnosis was wrong.
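The logical point about auxiliary hypotheses can be put schematically. The symbols are introduced here only for illustration: H stands for the diagnosis (the patient has malaria), A for the conjunction of auxiliary hypotheses (quinine treats malaria, the quinine given is active), and O for the predicted outcome (the symptoms disappear after the patient takes quinine).

```latex
\[
\begin{aligned}
&(H \land A) \rightarrow O, \quad \neg O \;\vdash\; \neg (H \land A) \\
&\neg (H \land A) \;\equiv\; \neg H \lor \neg A
\end{aligned}
\]
```

Modus tollens thus refutes only the conjunction of the diagnosis with its auxiliaries. Whether the failure is to be blamed on H or on A is not settled by the deduction itself.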
Cutdown and Fill-Up in Medical Reasoning

These different uses of hypotheses also relate to the distinction between nosographic and pathophysiological diagnoses, described by Chiffi (2021, 20 ff.). Whereas the former is only classificatory within a taxonomy, the latter involves a causal explanation. For example, in the Amazonian case described above, the hypothesis
is set and released as the product of a nosographic diagnosis, and its scope of application remains limited to an immediate treatment of the patient. Clinical reasoning usually follows normed practices in which the physician’s creativity is not solicited. This reduces the practitioner’s activity to the application of processes of recognition of symptoms and their relation to an etiological background in view of planning a therapy and a monitoring strategy. These processes are regulated by institutions that give directions concerning decision-making and allowed therapies. The abductive process of selection has been related by Magnani (2017, 6 ff.) to the cutdown problem, i.e., the problem of cutting and selecting among a class of already given hypotheses. Nonetheless, abduction may also be creative by introducing new hypotheses. In particular, while nosographic diagnoses are mainly concerned with selective abduction, it seems that pathophysiological diagnoses involve more complex processes of inference and hypotheses concerning unknown causal explanations. Creativity may be involved when a physician faces unknown diseases, or better said a set of symptoms that is not related to a known disease, or when medical researchers seek the causes of a given characteristic pathological state, or when they look for new therapies and means to monitor or observe the patient’s state and evolution. The creative process of abduction relates to the fill-up problem. According to Magnani (2017, ch. 8), abduction is related to eco-cognitive openness, a notion understood in the context of the eco-cognitive model (hereinafter ECm), in which agents performing abductions are embodied in distributed cognitive systems. This should explain not only how hypotheses are selected but also how they are created. Indeed, cognition is embodied and the interactions between brains, bodies, and external environments are its central aspects. 
Guessing new hypotheses is a process that occurs in a complex distributed system in which a constant exchange of information occurs. There are interactions between the brain – not only conscious intellectual activity but also the unconscious kind – and the manipulations of the environment or artifacts (e.g., diagrams). At any rate, abduction is highly contextual, and conclusions should always be evaluated with respect to particular targets and scant resources. Abduction is multimodal and involves a constant exchange of information between cognitive agents, their internal aspects, and the external environment. Information crosses the boundaries of different cognitive devices. Magnani (2017, 137) considers that an inputs-outputs model is better suited to account for abduction than a premises-conclusion inference. In this context, eco-cognitive openness assumes that inputs are enriched by possible solutions and sensitive to external information. At the same time, other inputs are modified or rejected. This dynamic is intertwined with modifications of the outputs. Inputs and outputs may be of different kinds, e.g., sentences, diagrams, arguments, and rules. Such a dynamic may be rendered by different processes, whether the inputs are sentential (e.g., by means of belief revision theory) or rules (e.g., in the adaptive dialogical approach to abduction of Barés Gómez and Fontaine (2021b), which accounts for the abduction of the rules governing an argumentative interaction). As stressed by Magnani (2017, 139), eco-cognitive openness also assumes multimodal processes, in which are considered cognitive
aspects (sentential, diagrammatical, etc.), model-based, and computational. Creativity is stimulated by transdisciplinarity, cross-fertilization of ideas, and knowledge in motion, as illustrated by the case of HIV/AIDS studies, put forward by Magnani (2017, 164). By contrast, certain kinds of philanthropic funding, profitability, patents, and disciplinary isolation may lead to a “shutdown of creativity.” According to Barés Gómez and Fontaine (2021a), relaxing conditions sometimes imposed on the introduction of hypotheses – e.g., plausibility or reliability – may also occasionally contribute to promoting creativity in the proposal of hypotheses by medical professionals. Once more, depending on the context, the status of the hypotheses of different kinds of medical reasoning (diagnosis, therapy, monitoring) may vary. Whereas the therapy may be relative to the main goal, without worrying about the real source of the symptoms, it may also constitute a test implication for the diagnostical hypothesis. Depending on the agent’s goals, different patterns of abduction can be involved. In the STm and the everyday practice of the physician, the target usually consists in finding a disease explaining the pathological state of a patient and then a therapy and possibly planning a monitoring strategy. The physician’s abductions are, in this context, relative to a particular case. He does not aim at a general understanding of a disease or a therapy, but rather to find the causes and the means to heal the patient, in view of operating selections among his encyclopedic background. Usually, physicians do not conjecture new hypothetical diseases – by means of relations of causality between diseases and symptoms – or new therapies, also by means of relations of causality between, e.g., drugs and possible recovery. Therefore, medical reasoning involved in the everyday practice of physicians is usually based on factual abduction. 
According to Schurz (2008, 206), factual abduction proceeds from known laws of the form If Cx, then Ex, and known evidence of the form Ea has occurred, to the abduction of a conjecture of the form Ca could be the reason/cause. The known laws are the content of the encyclopedic medical background. The known evidence consists of observed symptoms and signs. The factual conjecture is the conclusion of a diagnosis, or therapy, or monitoring strategy. By contrast, the discovery of new illnesses or the introduction of new therapies might rely on the creation of hypotheses, in which case it could be based upon law abduction. According to Schurz (2008, 212), law abduction proceeds from background laws of the form (∀x)(Cx → Ex), and an empirical law to be explained of the form (∀x)(Fx → Ex), to the abduction of a conjecture of the form (∀x)(Fx → Cx). Here we limit our proposal to discovery or creation within a paradigm, revolution being explainable by means of more complex theoretical abduction. Among various well-known examples from the history of science, Semmelweis’ work as accounted for by Hempel (1966, 17 ff.) is an interesting case study to discuss creative law abduction. Between 1844 and 1848, at the Vienna General Hospital, a striking difference in fatality was observed between the two divisions of the maternity ward. The First Division was affected by an excessive mortality rate caused by puerperal (childbed) fever, which did not occur in the Second Division. Nobody could provide an explanation and, consequently, decide how to stop this fatality. Various
hypotheses were considered, some of them immediately rejected (e.g., diet, since it made no difference between the two divisions). In 1847, a colleague of Semmelweis received a puncture wound in the finger from the scalpel of a student with whom he was performing an autopsy. He died after having displayed the same symptoms as those of the victims of childbed fever. Semmelweis conjectured that it was the “cadaveric matter” (scientists were not aware of the existence of microorganisms at that time) that was responsible for the infection and the cause of the childbed fever. The striking fact was that in the First Division, care was provided by students who practiced autopsies, whereas in the Second Division, care was provided by midwives who did not practice autopsies. Given that the midwives’ utensils were not contaminated, this would explain the difference in mortality rates between the two divisions. Semmelweis’ hypothesis was therefore that it was the cadaveric matter on the utensils of the students who practiced autopsies that was the cause of the childbed fever. He also conjectured that if they washed their hands and their material in a solution of chlorinated lime, the infections would decrease and subsequently the mortality rate. This is exactly what happened. It is worth noting that Semmelweis introduces different hypotheses. First, the main hypothesis is a possible solution to the abductive problem, i.e., a possible solution to the question of the origin of death following childbed fever. Second is a test implication which is also some kind of practical solution, i.e., how to avoid death by the supposed contamination by cadaveric matter, which if it works will provide evidence in favor of the main hypothesis. Third is an auxiliary hypothesis concerning the efficiency of chlorinated lime for disinfection. Although the ontologies and the problems are different, the model is similar to the extended STm of medical reasoning.
An explanation is conjectured. It is released in further inferential work by conjecturing the possibility of solving the problem by disinfecting hands and tools, a hypothesis which is in turn released in ignorance-preserving action. Nonetheless, what differs from mere selective medical reasoning is that Semmelweis introduces new causal or conditional hypotheses formulated as general laws. That is, “if patients are infected by cadaveric matter, they will be affected by puerperal fever”; “if practitioners wash their hands and their tools, fatality will decrease”; and “if practitioners wash their hands and their tools with a solution of chlorinated lime, they eliminate cadaveric matter.” All of these lawlike hypotheses are released without being fully confirmed. By contrast, in the STm, physicians usually select hypotheses on the basis of laws already admitted. When prescribing quinine in case of suspicion of malaria, a physician suspects that the patient has malaria and then prescribes quinine. He does not introduce laws of the form “if he has malaria, then he has fever ...” He introduces a factual hypothesis “he has malaria” on the basis of this known conditional. So creative medical reasoning would be based on law abductions, rather than factual abductions. Interestingly, despite its success, Semmelweis’ hypothesis was not accepted by the scientific community. Various possible reasons are discussed by Gillies (2005, 176), one of them being that the mechanism underlying the infection was not sufficiently explained until the germ theory developed by Pasteur, Lister, and Koch at the end of the nineteenth century. It is worth noting that germ theories also involved factual
hypotheses about the existence of germs and bacteria and not only law hypotheses. That is, the theory was based on entities, properties of these entities, and descriptions of mechanisms.
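The contrast drawn above between factual and law abduction, following the schemas attributed to Schurz, can be made concrete in a small Python sketch. The encoding is an assumption made here for illustration: a law (∀x)(Cx → Ex) is represented as a pair `("C", "E")` of predicate names, and the functions `factual_abduction` and `law_abduction`, like the toy list `LAWS`, are hypothetical.

```python
# A law "if Cx, then Ex" is encoded as the pair (C, E) of predicate names.
LAWS = [("malaria", "fever"), ("influenza", "fever")]


def factual_abduction(laws, effect):
    """From known laws (C, E) and evidence Ea, conjecture the candidates Ca."""
    return [cause for cause, eff in laws if eff == effect]


def law_abduction(background_laws, empirical_law):
    """From background laws (C, E) and an empirical law (F, E) to be explained,
    conjecture the laws (F, C), i.e. (forall x)(Fx -> Cx)."""
    f, e = empirical_law
    return [(f, cause) for cause, eff in background_laws if eff == e]


# Factual abduction: the laws are given, only a fact about the case is conjectured.
print(factual_abduction(LAWS, "fever"))

# Law abduction: a new general conditional is conjectured.
print(law_abduction([("C", "E")], ("F", "E")))
```

In both functions the output is only a list of conjectures. Nothing in the sketch confirms them, which matches the ignorance-preserving reading defended in the text.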
Evidence-Based Medicine and Mechanistic Evidence

When explaining causality in philosophy of medicine, a burning debate brings into opposition “mechanistic” and “probabilistic” perspectives. The latter finds evidence of causality in difference-making studies that display the correlation between variables. It is based on empirical trials, statistics, and probability. The former considers that probabilistic evidence is not enough to prove causality and that an explanation of the underlying mechanism – relating the cause to its effect – is also needed. Recently, Russo and Williamson (2007) have argued that both kinds of evidence are necessary to establish causality; hence the “Russo-Williamson thesis” (RWT): one kind of causality is explained by means of two kinds of evidence. In this section, the authors of this chapter advocate another explanation of the mechanistic and probabilistic perspectives, by referring to the inferences which support them. The point is that, whereas probabilistic evidence is obtained by induction, mechanisms are abductive hypotheses, whose cognitive virtue is pragmatic and has nothing to do with evidence. Nonetheless, although abduction is ignorance-preserving, its conclusions recommend a course of action – e.g., by orienting an empirical trial – and as such they constitute a first step, necessary in the economy of research. Probabilistic evidence is based upon empirical research programs, including statistical surveys, whose aim is to support the existence of a correlation between variables. Due to their empirical character, results are never more than probable; what is shown is the inductive strength of a correlation. A widely accepted methodology of medical research is known as evidence-based medicine (see Sackett et al. (1996)). It usually involves a hierarchy of different kinds of evidence, with randomized controlled trials (RCTs) and meta-analyses at the top (see Guyatt et al. (2002, 7), but also Stegenga (2014) for a critical study).
RCTs consist of tests based on randomized and control groups. They also involve standards such as double-blinding (patient, nurse) to test for placebo effects. It is not our aim to describe exhaustively how these studies work or to discuss these hierarchies; it is sufficient to understand that they are quantitative studies of correlations between variables. For example, they establish conclusions of the form “a drug X causes the effect Y in such proportion” based on external evidence. It is worth noting that probabilistic evidence does not provide any explanation of the link between the variables; it only establishes the probability of the link between them. For example, it does not explain how a drug cures symptoms or how certain health conditions cause cancer. An explanation of the link between cause and effect can be provided by mechanisms. Russo and Williamson (2007, 158) criticize the view held by Lipton and Ødegaard (2005), who consider that (1) does not convey more information than (2):
1. Smoking causes lung cancer.
2. If you smoked two packs a day for X amount of years, your chance of getting lung cancer would be 10 times greater than a nonsmoker.

Indeed, (2) may be true without (1) being so. Moreover, (2) alone would not be sufficient to justify, e.g., a political intervention against smoking. Although causal relations are established by means of mixed evidence, mechanistic and/or probabilistic, we should not confuse them. According to the RWT, both are necessary to establish a causal relationship. While the probabilistic aspect shows that the cause makes a difference to its effects, the mechanistic aspect explains the dependencies (Russo and Williamson (2007, 159)). Mechanisms are sometimes characterized in terms of causal pathways or explanations of the causal relation. Despite the huge variety of definitions of the notion of mechanism, we may rely on the one put forward by Illari and Williamson (2012, 120): “A mechanism for a phenomenon consists of entities and activities organised in such a way that they are responsible for this phenomenon.” This definition is based upon the following explanation: “All mechanistic explanations begin with (a) the identification of a phenomenon or some phenomena to be explained, (b) proceed by decomposition into the entities and activities relevant to the phenomenon, and (c) give the organisation of entities and activities by which they produce the phenomenon” (Illari and Williamson (2012, 123)). A mechanism thus consists of characteristic activities by means of which the phenomenon is explained. For example, as stressed by Gillies (2005, 177), Semmelweis’ hypothesis about the cadaveric matter causing puerperal fever was not wrong, but the description of a mechanism explaining how it could cause puerperal fever was still lacking.
The description of characteristic activities came later with the germ theory, the identification of entities such as bacteria (e.g., streptococcus, cholera), their activities, and how they produce puerperal fever. At other times, it is the discovery of a bacterium which calls for a mechanistic explanation, as for the discovery of Helicobacter pylori described by Thagard (1999, ch. 3). In 1979, while peptic ulcers were usually explained by excessive stomach acidity, Warren discovered a bacterium in the stomach of patients suffering from acidity. With the help of Marshall, he conjectured the hypothesis that Helicobacter pylori could be the cause of peptic ulcers (Marshall and Warren (1984)). Following this hypothesis as a recommendation for a therapy, they prescribed antibiotics to treat peptic ulcers, with a certain success. Nonetheless, the existence of such a bacterium in the stomach was, as stressed by Thagard, not coherent with the background knowledge about the disease, understood in terms of excess of acidity. The hypothesis was finally accepted when a mechanism of production of ammonia allowing the bacteria to survive despite a high acidity rate was recognized. Then, a mechanism describing the activity of such an entity explained how it could cause peptic ulcers. We may understand what is meant by “entities” and their “activities organized” in a certain way, as well as what might be evidence of them. It is more difficult to understand what is meant by “that they are responsible for the phenomenon” and, overall, how we might obtain evidence of such a responsibility. Illari (2011, 145)
claims that “[e]vidence of a mechanism is just evidence of the existence of a mechanism or mechanisms in the domain of inquiry in question,” which is spelled out later when she explains that “[e]vidence of a mechanism is evidence of the entities or activities that make up mechanisms, or the organisation of those entities and activities by which they produce the phenomenon the mechanism is known for.” But how is the evidence of such a productive activity obtained if not inductively, and thereby probabilistically? What, then, would be the difference between mechanistic and probabilistic evidence? In fact, evidence of entities and their properties is also inductive, even if the probability might be higher and the processes of corroboration easier (to the extent that our senses, observations, and measuring tools are reliable). Although there might be proposals different from Illari’s, we ultimately find it difficult to understand how there could be mechanistic evidence that is not probabilistic, as seems to be assumed by the RWT.
Mechanistic Hypotheses

Thinking of mechanisms in terms of evidence may be problematic, which is why we suggest that their pragmatic virtues should be emphasized. The crucial distinction – between mechanisms and probabilities – is not a matter of evidence. It is a matter of inference, by means of which different complementary steps are taken in scientific inquiry. Whereas mechanisms are conjectured by abduction, probabilities are obtained by induction. And, as explained by Woods, since abduction is ignorance-preserving, it is also “evidentially inert” (Woods (2017, 138)). Evidence may be obtained when confronting hypotheses with facts. In the previous example, (1) is a mechanistic hypothesis about lung cancer. It can be released to express a test implication hypothesis, i.e., a formulation that is empirically testable in accordance with Peirce’s requisite (CP 7.220). This leads to the formulation of the hypothesis (2), which can be sent to trial. None of this process produces evidence of any kind. Evidence is obtained later, in the inductive phase. This does not make abduction cognitively irrelevant, given that without this preliminary work, empirical trials could not even begin. Indeed, as is now explained, hypotheses are themselves released for premissory work in further inferential work, in which implication test hypotheses are stated and strategies of empirical corroboration are planned.

Inductive strength is defined in terms of probabilities. Yet, what is mechanistic evidence? If a mechanism consists in some kind of causal explanation – involving entities, activities, properties, and so on – it is difficult to see how we could get evidence of it. Of course, we may have evidence of the entities involved and even, to some extent, observe certain activities. But in the end, evidence is never intrinsically mechanical. Such evidence is mainly empirical, and, therefore, it almost always falls within the realm of uncertainty, possibly handled by probabilities.
Why can there not be mechanistic evidence which is not probabilistic? Mechanisms involve causal patterns. To put it in Kantian terms, although we might acknowledge that causality is an a priori concept, empirical causal relations can only be established a posteriori. Thus, we might find it necessary that there must be a mechanism
behind illness. But the underlying causal laws are determined a posteriori. However, as Hume has shown, there is no experience corresponding to causal relations. Therefore, the only means to establish them is an ampliative inferential process, namely, induction and probabilities. If causal pathways are inherent to mechanisms, there cannot be mechanistic evidence which is not probabilistic.

Here is where the inference-based analysis pursued in this chapter leads to a proposal different from the RWT. The authors of this chapter fully agree that both probabilistic evidence and mechanisms are necessary to establish causal relations, but not for the same reasons. The argument is formulated in a Peircean spirit. Induction cannot make the first step. An abductive hypothesis is needed first. Consequently, an RCT cannot be run without an abductive hypothesis which indicates which potential correlations are relevant to be studied. Such a hypothesis is a mechanism. That is, as in the STm, abduction, deduction, and induction are conceived within a Peircean triad. Mechanisms have little to do with evidence and must be approached from a pragmatic perspective. Of course, depending on the notions of evidence and knowledge (which may be partial, fallible, and defeasible), this might be discussed. Nonetheless, this does not really matter since the main cognitive virtue of mechanisms is not related to evidence, but to pragmatic reasons: they are conjectured as a basis for new action despite a persisting state of ignorance. Without abductive hypotheses, RCT research is not possible. And good abduced hypotheses showing mechanistic plausibility could be the ones to be tested in an RCT. Why is probabilistic evidence not reachable without a preliminary abductive phase? This question deserves further comment.
The first reason is a matter of relevance, which may be understood in accordance with the general definition provided by Gabbay and Woods (2003, 158) with respect to a triple with pieces of information I, agents X, and agendas A: “I is relevant for X with respect to A if and only if in processing I, X is affected in ways that advance or close A.” Relevance of information is always relative to a hypothesis or a target, set within an agenda of research. How can we gather data without a hypothesis under which they are collected? If we think of particular facts altogether, if we study correlations to synthesize them in the affirmation of a general law, it is because we have already gathered them under a general hypothesis. Of course, the hypothesis may be grounded on experience, the experience of a particular regularity. But the observation of a particular regularity is neither inductive nor probabilistic. It is the experience of something that appears as a regularity, or better said whose regularity is suspected, and which is thus surprising. The surprising regularity may trigger abduction and give rise to the introduction of a hypothesis. It is only once the hypothesis has been conjectured that induction may occur. If trials are made in order to find relevant information with respect to a hypothesis, then the hypothesis must have been given first. Information collected in EBM, and in RCTs, can be relevant only because a mechanistic hypothesis has been previously given.

The second reason is a matter of economy of research, to put it in Peircean terms. Interestingly, after having distinguished probable reasoning from a population to a sample and probable reasoning from a sample to a population (induction),
Peirce (1992, 139) argues that “induction can never make a first suggestion. All that induction can do is to infer the value of a ratio.” The first suggestion, the introduction of a hypothesis, is always made by another kind of reasoning, namely, abduction, which has nothing to do with probability. This is partly explained by the problematic nature of “randomness” highlighted by Peirce. Indeed, induction assumes randomness of the sample, which in turn, as rightly stressed by Putnam in his comment on Peirce’s lecture, “requires knowledge of the equality of certain future frequencies, and is thus a species of lawlike knowledge” so that “inductive reasoning always requires the presence of assumption about the general course of the world” (in Peirce (1992, 67)). In other words, scientific inquiry cannot begin with induction, which is not possible without abductive hypotheses.

Third, confrontation with empirical facts requires further hypotheses relative to the variables to be tested and the methods to establish suspected correlations. The process is similar to the extended STm we have proposed, in which an initial diagnostic hypothesis is linked to other hypotheses concerning therapy and monitoring in an ignorance-preserving way. Indeed, in the course of a particular piece of research, we face various questions which call for different hypothetical answers. What do we have to observe? How do we observe it? How do we interpret the data? What will be the treatment of new information? If induction can never make the first step, an RCT can never be established in the first instance either. A hypothesis is always needed to begin with, even when abduction is triggered by an empirically observed surprising fact. Moreover, when a mechanism is conjectured, other hypotheses must be conjectured in order to make it testable, but also in planning an empirical trial. Afterward, data obtained by empirical trials may in turn become surprising and call for further explanations.
The initial hypotheses can therefore be revised and new ones can be introduced to fit with the observations. The new hypotheses recommend another course of action and another deductive-inductive phase of tests, as in the original STm. Here too, it is difficult to assess the possibility of a post-partial abduction: a first hypothesis is released in further abduction in view of planning empirical tests. There can be loops of abductive and inductive phases. For example, based on a mechanism of lung cancer caused by smoking, we design an RCT. Then, after positive results, we express further hypotheses, such as introducing a hereditary parameter. And the process still goes on. In the previous example of Helicobacter pylori, as described by Thagard, the success of a therapy based on antibiotics supported the initial hypothesis – and this situation could occur even in cases in which the bacterium had not yet been discovered.

Consequently, we must be careful when speaking of mechanisms in terms of explanations of data. Which are the data to be explained? On the one hand, they cannot be the data provided by empirical trials such as RCTs; otherwise mechanisms would explain data we do not have yet. Similar concerns about the problem of defining abduction in terms of explanation have been put forward by Hintikka (1998) – although his purpose was to argue in favor of a distinction, not of inferential processes, but of different kinds of rules, namely, definitory and strategic. It is also worth noting that it is sometimes difficult to identify the introduction of a hypothesis in the history of science. Hypotheses rarely arise suddenly, in a eureka moment.
From a Kuhnian perspective, we might expect a subtle process of discoveries, proposals, and corrections of hypotheses. Although we claim that abduction must come first, this Kuhnian view is not incompatible with our inferential model. Indeed, scientists and physicians are involved in a permanent flow of forward and backward movements, confronted with problems, anomalies, and other puzzles, between the formulation of hypotheses, deductive predictions, and inductive corroborations.

Let us also draw a final consequence of this proposal. La Caze (2011, 84) rightly complains about the lack of consideration for mechanisms, which are usually relegated to the bottom of, if not excluded from, the hierarchies of evidence of EBM. The authors of this chapter claim that both mechanisms and probabilistic surveys are necessary for medical research, but that this is not a matter of evidence. If mechanisms are not intrinsically evidence, they have no place in such hierarchies. Rejecting them for their low level of evidence is also pointless. Indeed, their cognitive virtue is not to be understood in terms of evidence, but in pragmatic terms. Mechanistic hypotheses are crucial because without them RCTs are not possible. This is precisely where an inferential analysis, based on the distinction between abduction and induction, leads. It is worth noting that conceptions that conflate abduction and induction would probably lead to similar confusion, looking for evidence where there is none.
Conclusion

Medical reasoning involves abduction, deduction, and induction. Whether in view of establishing a medical diagnosis, planning a therapy, or planning a monitoring strategy, the physician always begins with the introduction of a hypothesis. Then, the hypothesis can be used in predictions based upon deduction, and in empirical confirmations based on induction. Nonetheless, the deductive-inductive phase does not always follow immediately. Most of the time, a hypothetical diagnosis is activated in other abductions, aiming at specifying the initial diagnosis or planning a therapy or a monitoring strategy. Given that the physician engages in chains of hypothetical reasoning despite a lack of confirmation, the abductive phase of medical reasoning is better explained in the context of the GWm of abduction, that is, by considering abduction as an ignorance-preserving inference. This was the starting point of this chapter, in which we have first accounted for medical diagnosis in terms of abduction and then applied this model to clarify the use of hypotheses in medical reasoning.

The main interest of the GWm is its emphasis on the pragmatic role of abduction and the distinction between partial and full abduction. We thus avoid the confusion between the introduction of a hypothesis (partial abduction) and the activation of the hypothesis (full abduction). By clearly distinguishing between abduction, deduction, and induction, we noticed that the activation of the hypothesis was not to be identified with its confirmation. This does not mean that hypotheses are not sent to empirical trials, but that this leads beyond the abductive process. The activation of the hypothesis means that it must be released for premissory work in
further reasoning. For example, a diagnosis can be conjectured and activated when planning a therapy, even in the absence of a confirmation of the initial hypothesis. Various hypotheses can also be activated when planning empirical trials in medical research. In these processes, abduction, deduction, and induction are somewhat intertwined, but this does not mean that the latter two are essential to the definition of abduction.

This conception of medical reasoning within an ignorance-preserving frame was the basis for an original understanding of the use of hypotheses in medical reasoning. Indeed, we have first highlighted the connection between diagnosis, therapy planning, and monitoring in Magnani’s STm. They involve hypotheses that can be connected, even before the deductive-inductive phase of confirmation. Indeed, a physician who conjectures a diagnosis does not immediately engage in a deductive phase of predictions and an inductive phase of confirmation. He often activates the hypothesis in another abductive phase, when planning a therapy (and/or a monitoring strategy). In fact, even in cases in which the physician looks for empirical confirmation of his diagnosis, this would not be possible without further hypotheses concerning the methodology to be employed. Indeed, although we had first distinguished between post-partial and post-full abduction confirmations, we have finally raised doubts about the possibility of the former. The point is that strategies of confirmation or empirical trials always involve further hypotheses about how we can find evidence in favor of the hypotheses. More generally, action is not possible without the activation of the hypotheses.

This model applies to selective abduction, which characterizes the daily practice of most physicians, whose profession is usually institutionally regulated – that is, a physician is not allowed to introduce new illnesses or new therapies.
But it should also apply to creative abduction, in particular in medical research. Then, we are not concerned with the mere selection of hypotheses, but with their creation. Whereas selective abductions can be explained in terms of factual abductions, creative abductions often involve law abductions. Law abductions finally led us to take part in the debate between mechanistic and probabilistic approaches in medicine and consequently to a discussion of the now well-known Russo-Williamson thesis, which is now summarized.

In accordance with the RWT, it is agreed that both mechanisms and probabilistic approaches are necessary to establish causality in medical sciences, but not that there are two different kinds of evidence. Empirical evidence is always based on induction and is mainly probabilistic. The difference is not to be understood in terms of evidence, but in terms of inference. Whereas probabilistic approaches are based on induction, mechanisms are introduced by means of abductive inferences. Mechanisms are abductive hypotheses. As such, they are not subject to a specific kind of evidence. They only recommend a course of action and can be activated when planning empirical trials and eventually used in an inductive phase of confirmation. Nonetheless, this should not lead us to adopt a skeptical stance with respect to mechanisms, e.g., because of their so-called speculative nature. Indeed, they do not aim at proving anything, but they are what is to be proved in the course of further investigations.
In any case, they constitute a necessary first step in medical inquiry. Indeed, induction can never make the first suggestion. A hypothesis must always have been introduced first. Consequently, conceiving medical science only from a probabilistic perspective does not make sense. How could we plan empirical trials – RCTs in particular – without first having a hypothesis at hand? Moreover, planning empirical trials also involves hypotheses with respect to the process and the methodology to adopt in view of confirming a hypothesis. Such hypotheses involve causal laws and mechanisms as well. Again, without the activation of hypotheses, action is not possible. More generally, situating mechanisms at the bottom of hierarchical pyramids of evidence is irrelevant and almost senseless.

Acknowledgments

We warmly thank the editors, Lorenzo Magnani, Mattia Andreoletti, and Daniele Chiffi, for their help and their perspicacious comments. We are also thankful to John Woods for his availability and his invaluable advice on how to understand the GWm of abduction as well as other relevant philosophical issues. The authors acknowledge the financial support of the projects “Abducción y Diagnóstico Médico. Interrogación e Hipótesis en la Causalidad Científica” held by Cristina Barés and Francisco Salguero (US-1381050, Proyectos de I + D + i en el marco del Programa Operativo FEDER Andalucía 2014–2020. Junta de Andalucía) and “Métodos lógicos y abductivos aplicados a la semántica y la pragmática de la interacción comunicativa” held by Francisco Salguero (PID2020-117871GB-I00, Proyecto del Plan Estatal 2017–2020 Generación del Conocimiento – Proyectos I + D + i, Ministerio de Ciencia e Innovación).
References

Barés Gómez, C. (2018). Abduction in Akkadian medical diagnosis. Journal of Applied Logic – IFColog Journal of Applied Logics and its Applications, 5(8), 1697–1722.
Barés Gómez, C. (2021). Un análisis de la inferencia en la práctica médico-veterinaria antigua. Los textos hipiátricos de Ugarit. In C. Barés Gómez et al. (Eds.), Lógica, Conocimiento y Abducción (pp. 265–284). College Publications.
Barés Gómez, C., & Fontaine, M. (2017). Argumentation and abduction in dialogical logic. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 295–314). Springer.
Barés Gómez, C., & Fontaine, M. (2021a). Medical reasoning in public health emergencies: Below high standards of accuracy. Teorema, 40(1), 151–173.
Barés Gómez, C., & Fontaine, M. (2021b). Between sentential and model-based abductions: A dialogical approach. Logic Journal of the IGPL, 29(4), 425–446. https://doi.org/10.1093/jigpal/jzz033
Bellucci, F., & Pietarinen, A. V. (2016). Charles Sanders Peirce: Logic. Internet encyclopedia of philosophy. Accessed 7 Dec 2016. https://iep.utm.edu/peir-log/
Chiffi, D. (2021). Clinical reasoning: Knowledge, uncertainty, and values in health care. Springer.
Chiffi, D., & Pietarinen, A. V. (2020). Abductive inference within a pragmatic framework. Synthese, 197, 2507–2523.
Gabbay, D., & Woods, J. (2003). Agenda relevance. A study in formal pragmatics. Elsevier.
Gabbay, D., & Woods, J. (2005). The reach of abduction. Insight and trials. Elsevier.
Galen. (2002). Que el mejor médico es también filósofo. In T. Martínez Manzano (translator) Galeno, Tratados lógicos y autobiográficos (pp. 65–92). Gredos.
Gillies, D. (2005). Hempelian and Kuhnian approaches in the philosophy of medicine: The Semmelweis case. Studies in History and Philosophy of Biological and Biomedical Sciences, 36, 159–181. https://doi.org/10.1016/j.shpsc.2004.12.003
Guyatt, G., Drummond, R., Meade, M., & Cook, D. (2002). Users’ guides to the medical literature: A manual for evidence-based clinical practice. American Medical Association.
Hankinson, R. J. (1991). Galen on the therapeutic methods. Books I and II. Clarendon Press.
Hankinson, R. J. (2008). Epistemology. In R. J. Hankinson (Ed.), The Cambridge companion to Galen (pp. 49–65). Cambridge University Press.
Hempel, C. G. (1945). Studies in the logic of confirmation (I.). Mind, 54(213), 1–26.
Hempel, C. G. (1966). Philosophy of natural science. Prentice-Hall.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of the Charles S. Peirce Society, 34, 503–533.
Hofmann, B. (2002). On the triad disease, illness and sickness. Journal of Medicine and Philosophy, 27(6), 651–673. https://doi.org/10.1076/jmep.27.6.651.13793
Illari, P. M. (2011). Mechanistic evidence: Disambiguating the Russo-Williamson thesis. International Studies in the Philosophy of Science, 25(2), 139–157. https://doi.org/10.1080/02698595.2011.574856
Illari, P., & Williamson, J. (2012). What is a mechanism? Thinking about mechanisms across the sciences. European Journal for Philosophy of Science, 2, 119–135. https://doi.org/10.1007/s13194-011-0038-2
La Caze, A. (2011). The role of basic science in evidence-based medicine. Biology and Philosophy, 26, 81–98. https://doi.org/10.1007/s10539-010-9231-5
Lipton, R., & Ødegaard, T. (2005). Causal thinking and causal language in epidemiology. Epidemiological Perspectives and Innovations, 2(8). https://doi.org/10.1186/1742-5573-2-8
Magnani, L. (1992). Abductive reasoning: Philosophical and educational perspectives in medicine. In D. Evans et al. (Eds.), Advanced models of cognition for medical training and practice (pp. 21–41). Springer.
Magnani, L. (2017). The abductive structure of scientific creativity. Springer.
Marshall, B. J., & Warren, J. R. (1984). Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration. Lancet, 1, 1311–1315. https://doi.org/10.1016/s0140-6736(84)91816-6
Morrison, B. (2008). Logic. In R. J. Hankinson (Ed.), The Cambridge companion to Galen (pp. 66–115). Cambridge University Press.
Peirce, C. S. (1931–1958). Collected papers of Charles Sanders Peirce. Harvard University Press.
Peirce, C. S. (1992). Reasoning and the logic of things. Harvard University Press.
Popper, K. (1959). The logic of scientific discovery. Hutchinson.
Russo, F., & Williamson, J. (2007). Interpreting causality in the health sciences. International Studies in the Philosophy of Science, 21(2), 157–170. https://doi.org/10.1080/02698590701498084
Sackett, D. L., et al. (1996). Evidence based medicine: What it is and what it isn’t. British Medical Journal, 312(7023), 71–72. https://doi.org/10.1136/bmj.312.7023.71
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234. https://doi.org/10.1007/s11229-007-9223-4
Stegenga, J. (2014). Down with the hierarchies. Topoi, 33, 313–322. https://doi.org/10.1007/s11245-013-9189-4
Stempsey, W. E. (2017). Applying medical knowledge: Diagnosing disease. In T. Schramme & S. Edwards (Eds.), Handbook of the philosophy of medicine (pp. 643–660). Springer.
Thagard, P. (1999). How scientists explain disease. Princeton University Press.
West, D. (2016). Course of action recommendations and their place in developmental abduction. IFColog Journal of Applied Logics and its Applications, 3(1), 123–152.
Woods, J. (2013). Errors of reasoning. Naturalizing the logic of inference. College Publications.
Woods, J. (2017). Reorienting the logic of abduction. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 137–150). Springer.
Woods, J. (2018). Truth in fiction. Rethinking its logic. Springer.
Part V Abduction in Mathematics
Introduction to Abduction in Mathematics
24
F. D. Rivera
Abstract
The five chapters that comprise this section of the book illustrate in both theory and empirical research how the nature, activity, and practice of mathematics could be grounded in inquiry, (surprising) observation, imagination, insight, creativity, invention, and everything else that emerges from abductive thinking and reasoning. These five chapters help make the case that the trivium of abduction, induction, and deduction is the core of inference making and validation in mathematics. de Freitas explores the process of creative abductive reasoning in the context of an ecological, anthropocentric view of mathematical practices, where mathematical activity involves using eco-cognitive processing and imagination in mathematical reasoning. Campos describes mathematical activity as a type of scientific abduction in inquiry and heuristic investigative contexts. Ernest describes ways in which abduction may occur in creative work among students in school mathematical contexts and among research mathematicians and underscores cognitive activities, metacognitive activities, and abductively driven intuitive activities in mathematical problem-solving. Using student data, Meyer illustrates how the meaning and context of discoveries are mediated by different types of abductions. Pedemonte illustrates ways in which an instructor’s abductive interventions may help decrease the distance between students’ argumentation and solutions to a problem.
F. D. Rivera
Department of Mathematics and Statistics and Ed.D. in Educational Leadership Program, San Jose State University, San Jose, CA, USA

© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_80
Keywords
Abduction in mathematics · Creativity in mathematics · Discovery in mathematics · Instructor-mediated abductions · Creative, overcoded, and undercoded abductions in mathematical activity
The five chapters that comprise this section of the book illustrate in both theory and empirical research how the nature, activity, and practice of mathematics (i.e., mathematical enterprise) could be grounded in inquiry, (surprising) observation, imagination, insight, creativity, invention, and everything else that emerges from abduction, “the only logical operation that introduces any new idea” and generates, tests, evaluates, and selects the most reasonable “explanatory hypotheses” (Peirce, 1931–1958, 5.172; Mohammadian, 2019). Paul Ernest in his chapter notes that “the main interest in abduction for mathematics, other than historical, is its intended role as a logic of discovery and not as a form of logical reasoning.” Daniel Campos in his chapter describes how the mathematical enterprise could occur in situations where theorems and methods develop and emerge following a “heuristic investigative” approach. Both aspects of discovery and heuristic investigations are consistent with how Peirce talked about the nature and practice of science, philosophy, and mathematics that represents the “state of hypothetical things.”

Everyday mathematics instruction and learning tend to support an idealized and secure view of mathematics that Elizabeth de Freitas in her chapter describes as formal, closed, systematic, and “unmessy,” to which we add inductive and deductive processes that seem to operate rather unproblematically in predictable ways. Hopefully, a serious reading of these five superb chapters will convince readers that it is about time for the mathematical enterprise to be rethought, and, possibly, rethought otherwise; that is, rethought in the same manner in which Peirce conceptualized scientific inquiry: “curiosity-driven” (Grinnell, 2019, p. 221), “progressive, [with] one discovery leading to others” (Short, 2020, p. 213), and “inventively” but “restrictively wild” in a way that allows the generation of explanatory hypotheses through abduction, empirical testing through induction, and consequential testing through deduction (Mohammadian, 2019, p. 141).

Abductions and their structures are, indeed, central and necessary to inferencing, much like inductions and deductions and their structures. Unfortunately, there is no shared institutional enthusiasm but only a slow reception toward abduction. Producing an abduction, like discoveries in science, requires patience in dealing with ambiguity, convoluted and nonlinear processing, and frequent failures (Grinnell, 2019). Michael Meyer in his chapter illustrates an interesting situation with a pair of tenth-grade students who made a “discovery at the edge of knowledge, [that is,] looking for something without being exactly sure what it looks like and guessing what might be the answer” (Grinnell, 2019, p. 221). Bettina Pedemonte in her chapter offers a nice complementary closure around predicaments in abduction by illustrating ways in which instructors can effectively intervene in such abductive
moments when the phase of mathematical knowing becomes sloppy and muddy (ibid.). Instead of giving up, Pedemonte makes a clear case for engaging students in instructor-driven “opportunities for abduction” (Grinnell, 2019, p. 221).

We hope that the five chapters that comprise this section help make the case that the trivium of abduction, induction, and deduction is the core of inference making and validation in mathematics. This trivium explicitly acknowledges the important role of abduction in the continuous growth and development of the mathematical enterprise. It also enables us to see how mathematics, like science, is not only a body of “systematized knowledge” based on “dead memory” but “a living and growing body of truth” (Peirce, 1931–1958, 6.428, cited in Short, 2020, p. 212). Below is an overview of each chapter.

In Chap. 25, “The Role of Abduction in Mathematics: Creativity, Contingency, and Constraint”, de Freitas illustrates the central role of abduction in “the strange mix of creative and formal activity that constitutes mathematics” contra typical views that see mathematics as “transcendent, universal, and eternal.” de Freitas points out that abduction “depends on our ability to engage with not-knowing, on making thought alien to itself.
It is a necessary supplement to combinatorial logic, a risky gesture that both undergirds and prehends the iterative process that is at the heart of the concept of algorithm.” With abduction de Freitas associates an ecological, anthropocentric view of mathematical practices that are “embodied, situated, cultural, planetary, and non-human.” Consequently, mathematical activity involves using eco-cognitive processing and “the speculative work of the imagination in mathematical reasoning.” Throughout the chapter, de Freitas draws on different examples that model the core components of abduction in mathematical invention, modeling for us this “kind of estrangement, this destabilizing of a doxa, this upheaval of thought.” The examples convey “how mathematics operates through contingency and imagining otherwise, and through various techniques of generalization and their disruptive consequences.” Some of these core components include: the necessity of imagination for innovative mathematical activity; the play between forming hypothetical knowledge and establishing necessary knowledge; seeing mathematics as “experimental science, operating on hypothetical objects;” engaging in conditional reasoning; surprise and “reckoning with the unusual, the nonsense, and dissensus;” making inventive mutations through processes of remixing and recomposing en route to mathematical coherence; open-ended modeling; “imagining what is not the case;” and moving between discovery and explanation. de Freitas also describes the process of creative abductive reasoning, which fuels creativity, discovery, imagination, and invention and allows a “gateway onto imaginary entities.” We thereby learn to appreciate how messy it can get while it “always loops into a circuit of correction, interrupting habit with invention, in a cognitive milieu that is more-than-human, [and] stretching outside the control of the organism” as it tries to generate a best explanation despite its plausible nature.

In Chap. 26, “Peirce’s Conception of Mathematics as Creative Experimental Inquiry”, Campos discusses how Peirce conceptualizes the mathematical enterprise
as a type of “scientific abduction” that has the characteristics of being creative, experimental, open, abductively insightful, and heuristic-driven and draws on “semiotic imagination in mathematical reasoning.” Imagining involves “creating, picturing, and manipulating intricate configurations of signs,” modeling a type of practice that makes reasoning necessary. For Campos, “imagination is the key to originality, in mathematics as in all scientific and philosophical reasoning, and so it is the primary source of breakthrough discovery.” But it is also important to understand the “nature of the inquiry in which [mathematics] is embedded” and the role that diagrams and other icons play in experiments in mathematics because “they represent purely hypothetical state[s] of things” that allow mathematicians to employ both “imagination and judicious observation” as they investigate “pure hypotheses.” “In mathematical reasoning,” Campos writes, “imagination creates experimental diagrams that function as signs that are then perceived, interpreted, judged, often transformed, reimagined, re-interpreted, and so on, in a continuous process.” Consequently, experimental hypotheses emerge as “imaginative suggestions that become subject to logical scrutiny,” which then become “possible keys to the solution of a theorematic deduction, that is, a deduction that requires the creation and examination of specially constructed diagrams and not simply specifying the corollarial consequences of previously discovered theorems.” In a much larger context, Campos situates Peirce’s philosophy of mathematics not in terms of the nature of mathematical objects but in terms of how it models an “open-ended inquiry.” While a closed-ended inquiry views theorems and methods in mathematics as fixed, axiomatic, and thoroughly deductive, an open-ended inquiry takes the opposite view.
Consequently, theorems and concepts emerge, and analytic problem-solving becomes the method that drives the development of mathematics. Hence, experimentation and imagination yield both framing and experimental hypotheses as a result of constructing abductions, which would make mathematics less foundationalist and more heuristic in conception. Campos writes, “the heuristic investigation of mathematics is relevant to progress in mathematics itself, since the main philosophical problem of such an investigation is mathematical discovery and it aims at improving or inventing methods of discovery.” To illustrate, Campos cites Peirce’s letter to his brother in which Peirce explicitly noted the following aim of instruction in geometry (and mathematics education, more generally): “That instruction in geometry ought to begin with awakening the geometrical imagination, both psychology and experience show. The first example of proof offered should be a good specimen of real mathematical reasoning and not the kind of thing which astounds the pupil by demonstrating at length something obvious at glance.” An example of real mathematical reasoning would be one in which imaginative hypothesis-making is necessary to solve a problem. Finally, the following point captures how Campos sees Peirce’s practice of mathematics: “Peirce’s philosophical emphasis is on rational discovery which encompasses justification, not on rational justification severed from irrational discovery; on experimentation, not axiomatization; on hypothesis-making, not mechanical or ‘corollarial’ deduction; on rational ampliative inference, not unexplainable intuition; on problem-solving
which encompasses theorem-proving, not on deducing theorems from self-evident axioms; on hypothetical or ‘would-be’ truth, not on absolutely certain truth.”
In Chap. 28, “Abduction and Creativity in Mathematics”, Ernest explores the role of abduction in philosophy, in practice, and in the process of meaning-making in mathematics. He notes that while “materials for abduction” in mathematics “remain in the mind, in conversation and in the lecture room,” there is “plenty of room for speculations about the patterns that link and indeed create mathematical objects.” Ernest also explores the important role of abduction in characterizing and analyzing the nature of creativity in mathematics. In the case of school mathematics, students employ abduction when they need to make decisions in creatively driven contexts that involve “choosing for themselves the concepts and skills [and tools] or missing hypotheses” that “are not automatically given within the task.” In the case of the work of mathematicians, creative work “use[s] a deeper and more extensive range of strategies for solving problems,” where in some cases solution methods “lead to breakthroughs.” Abductions also occur in moments of illumination, where “the hypothesis springs to mind from some unknown unconscious source, and suggests the missing piece that helps to solve the problem.” In an earlier work, Ernest underscores cognitive activities (e.g., carrying out a plan, applying strategies, using skills) and metacognitive activities (e.g., planning, monitoring progress, decision-making, checking, choosing strategies) that occur in problem-solving in mathematics. Addressing the “great virtue” of abduction more explicitly in such a process, Ernest suggests adding a third cluster of intuitive activities such as following hunches, noticing associations, attending to memories triggered, following intuitive links, and responding to multisensory experiences.
Ernest closes his chapter with a note on the important role of schooling in fostering “new creative mathematicians.” For Ernest, “the goal in bringing creativity into school mathematics” is “to foster in miniature the creativity manifested by all mathematicians, resting as it does on feelings of understanding, excitement and joy,” empowering students and building their character in the process. Hopefully, the creative experience would “encourage more students to follow the path of becoming mathematicians.” “Research mathematics,” Ernest writes, “is all about creativity and here abduction plays an essential part” because “creativity depends on a larger and richer array of connections between concepts, terms, conjectures, theorems and theories than deduction alone can offer.”
In Chap. 27, “Using Abduction for Characterizing the Process of Discovery”, Meyer explores the meaning of “discovery” in mathematics and strategies for initiating and supporting those discoveries in terms of abduction. Drawing on Eco’s typology of abduction (i.e., creative, undercoded, and overcoded abductions) and empirical data from fourth- and tenth-grade students’ think-aloud responses, Meyer demonstrates how much of discovery learning is not only about “the content or structure of the recognized relationships” but also “depends on the person who recognizes the specific relationships.” Furthermore, the students’ processing illustrates how “different aspects of discoveries” are “mediated by [different types of] abductions.” For instance, generating new mathematical knowledge or relationships
from scratch can take place in situations where abductions are used as a “necessary inference,” consistent with prevailing conceptualizations of creative abductions, including the role of induction in the verification phase of knowledge construction. Overcoded and undercoded abductions can also support deeper discoveries. In overcoded abductions, students use inferred rules and relationships to help them generate and recognize “surprising” instances or results that emerge from the rules and relationships, which Meyer classifies as “a discovery performance only to a limited extent.” In undercoded abductions, students discover and generate rules and relationships that are all equiprobable and then select the rule that makes the most sense. For Meyer, undercoded and (especially) creative abductions are “decisive inferences in” and “seem to be predestined” for “discovery-based learning.” Meyer closes his chapter with the following three criteria for characterizing discovery that is framed through an abductive lens: (i) Is the rule of abduction known or unknown? (ii) Is the well-known rule more or less obvious or not? (iii) Does the knowledge gained remain on the surface of what is perceptible or does it penetrate deep into a mathematical structure? Meyer also reminds us that the abductive types depend on the “content of the observed phenomenon” that is available to learners, that “it makes a difference if the abduction starts from just one example of numbers or a set of examples.”
In Chap. 29, “Abductive Arguments Supporting Students’ Construction of Proofs”, Pedemonte illustrates ways in which instructors can effectively use abductive argumentation to support students who are in different problem-solving situations in algebra such as constructing proofs, recognizing mistakes, or incorrectly applying rules.
Pedemonte’s case studies involving interactions between an instructor and her students demonstrate how an instructor’s abductive interventions can help “decrease the distance between students’ argumentation and solution to the problem” by either constructing claims and rebuttals or providing undercoded or creative abductive arguments that support rather than interrupt the “cognitive unity between the students’ argumentation and proof.” Cognitive unity pertains to the relationship between conjecturing and proving in mathematics. Arguments, on the one hand, emerge from students’ conceptions and systems of knowledge that enable them to produce conjectures. Proof, on the other hand, depends on mathematical theories. Pedemonte illustrates how cognitive unity is unlikely to occur if students’ (incorrect) conceptions and proofs do not align with the underlying mathematical theories that are needed to solve a problem. Pedemonte also recasts Toulmin’s model for analyzing arguments in terms of how the components of the model exemplify overcoded, undercoded, and creative abductive structures. Pedemonte’s chapter helps instructors understand how to assess and implement productive abductive interventions, especially those that support cognitive unity. This chapter is among the first of its kind to explore ways in which instructors’ abductive interventions can be used as a “teaching strategy and form of reasoning.” Current abduction-based research in mathematics learning tends to focus on students’ thinking and processing, but Pedemonte’s chapter offers a useful framework for investigating the potential impact of instructor-mediated abductions in such situations of learning and problem-solving. For Pedemonte,
instructor-mediated abductive interventions should not be interpreted in terms of “lead[ing] students to find answers to problems.” Instead, they should support justifications of those answers, that is, “the focus is not the solution to the problem but how the solution is obtained.”
References
Grinnell, F. (2019). Abduction in the everyday practice of science: The logic of unintended experiments. Transactions of the Charles S. Peirce Society, 55(3), 215–227.
Mohammadian, M. (2019). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S. Peirce Society, 55(2), 138–160.
Peirce, C. (1931–1958). Collected papers of Charles Sanders Peirce (Vols. 1–8) (P. Weiss, C. Hartshorne, & A. Burks, Eds.). Harvard University Press.
Short, T. (2020). Peirce’s idea of science. Transactions of the Charles S. Peirce Society, 56(2), 212–221.
The Role of Abduction in Mathematics: Creativity, Contingency, and Constraint
25
Elizabeth de Freitas
Contents
Introduction
Mathematics as the Science of the Hypothetical
Surprise and Epistemic Destabilizing
Concept Creation
Computation and Distributing Error
Proof and Explanation
Problem-Solving and Metacognition
Conclusion
References
Abstract
This chapter examines case studies of mathematical abduction, bringing theories of eco-cognition, computation, imagination, and transcendental empiricism to bear on creative mathematical activity, be it expert, novice, or maverick. Abduction is discussed in terms of hypothesis generation and imagining otherwise, surprise and epistemic destabilization, concept creation and the modeling of wholes/totalities, plausibility and probable inference, as well as efficiency and minimizing error. Many of the case studies reveal how abduction plays a crucial role in the ongoing transformation of mathematics, in the remixing of continuous and discrete conceptual and procedural apparatus, allowing for new and innovative formal structures to emerge. The speculative act of abduction patches together the part and the whole, chance and necessity, the finite and the infinite, as a mode of onto-epistemic mathematical behavior.
E. de Freitas, Adelphi University, New York, NY, USA
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_34
Keywords
Abduction · Mathematics · Computation · Proof · Explanation · Efficiency · Hypothesis · Realism · Material agency · Imagination
Introduction
In Stanislaw Lem’s novel Solaris (1961), scientists struggle to comprehend the strange intelligence of an ocean that envelops the planet Solaris and drives its unusual orbit around twin suns. The ocean seems to act like a living organic being, a monstrous colloidal consciousness, exceeding all terrestrial creative capacity. To the scientists in the novel, it was a “protoplasmic ocean-brain enveloping the entire planet and idling its time away in extravagant theoretical cogitation about the nature of the universe” (Lem, 1961/1970, p. 22). Earthly mathematicians were willing to consider the inherent creativity of such a creature, while others saw such idle “extravagant” cogitation as threatening. Some referred to the planet Solaris as a “metamorph” and “Yogi ocean” because of its transformational qualities and the capacity of its undulating surface to generate extremely diverse abstract forms never before seen on earth, while others, who were frustrated by such enigmatic creativity, referred to it as “autistic ocean”. This fantastic image of a cogitating planet, befuddling scientists with its generative capacity and yet somehow recognizable as a mathematical being in its rendering of unusual structures, offers a fruitful starting image of the strange mix of creative and formal activity that constitutes mathematics. Like the vapors rising up from Solaris, the foggy illusions of mathematics are multiple: illusions of transcendence, universality, and eternity. Taking an ecological approach instead allows us to study mathematical habits as embodied, situated, cultural, planetary, and more-than-human. The point here is to think broadly about mathematical activity and appreciate the alien or nonhuman qualities of mathematical reasoning and invention.
The speculative power of mathematics, whereby abstractions are engendered, is not only a ‘cultural’ act of constructing symbolic form but also accentuates and affirms thought’s capacity to become radically alien to itself. I introduce this chapter with this theme not to mystify mathematics, but to better understand human mathematics as part of larger eco-cognitive processes and to direct our attention to the speculative work of the imagination in mathematical reasoning. As Badiou (2006) suggests, mathematical thought is “the exercise of an alteration, of an estrangement of the intelligence” (Badiou, 2006, p. 21). In order to do mathematics, one must leverage mathematics as an alteration on thought, as a means for thinking otherwise. Badiou (2006) turns to Isidore Ducasse, known as the Count of Lautréamont, who suggests that the “alien qualities” of mathematics achieve a “denaturing of man, a transmigration of his essence, a positive becoming-monster” (Badiou, 2006, p. 19). In this chapter I consider various case studies of mathematical creativity and invention, for how these involve precisely this kind of estrangement, this destabilizing of a doxa, this upheaval of thought.
Magnani (2009, 2015) has written extensively about abduction as a mode of hypothetical reasoning which characterizes processes of scientific discovery and mathematical invention: “Creative abductive reasoning is a risky sort of inference that constitutes a central process in conceptual change in science, mathematics, and logic” (Magnani, 2015, p. xi). Magnani characterizes abduction as an “eco-cognitive” logic of creation or rational model of discovery, a form of reasoning within creative activity. He wants to show how creativity is not an irrational mystery, but an inferential and instrumental process. Abduction is considered a key method for generating creative abstractions in many human endeavors, including mathematics, where diagrams function abductively like gateways onto imaginary entities. Through his study of human, nonhuman, and artificial intelligence, Magnani (2015) tracks the nonstop “hypothesis generation” that occurs at the level of both “instinctual behavior” and conscious symbolic formulation and collective metamorphosis. Magnani is interested in the creative process tout court, as a ground state for any onto-epistemology; he extends the creative process back into the conditions that afforded the new to emerge and studies expert use of background knowledge and techniques of incubation. Notably, abduction works through contradictions, inconsistencies, and manipulations of preexisting forms. More broadly, abduction always loops into a circuit of correction, interrupting habit with in(ter)vention, in a cognitive milieu that is more than human, stretching outside the control of the organism (Levesques, 2018).
The term “abduction” was used by Charles Peirce to designate a mode of inferential reasoning that brought forth new hypotheses. He developed his theory of abduction across 50 years (1860s–1910s), initially in terms of syllogisms and later in terms of scientific and creative inquiry more generally.
Peirce considered abduction “the process of forming explanatory hypotheses” and claimed that “It is the only logical operation which introduces any new idea” (CP 5.172). Scholars debate the extent to which Peirce conceived abduction as involving both instinct and inference, different from but linked to the hypothetico-deductive method. For many commentators, it is an explanatory inference functioning to evaluate truth claims, an ignorance-preserving epistemic gesture, a retreat to the best partial explanation, but also a generative and creative act. Some treat abduction as a kind of affective synthesis (Oatley & Johnson-Laird, 2002) and others as a mode of deliberation when faced with unexpected surprises. Indeed, abduction may well do all this, playing a part in everyday unconscious judgments, as well as nonstandard modal logics and models of uncertain or plausible reasoning. As a mode of reasoning, abductive inferences generate hypotheses and educated guesses and are often characterized as “plausible reasoning” (Cellucci, 2017). Accordingly, abductive inferences respond to surprises and the unexpected and complement other modes of inference, such as induction and deduction. In deduction, one thinks within nested sets, and the implication is contained within the nests. In other words, we don’t stretch outside of the purview of previously constructed sets; we start with a given concept (or habit) and seek the overlaying or underlaying of subsets. But in abduction there is a vague or incidental correlation that must be affirmed by a speculative gesture that might create concepts that do not belong to
the known categories (Fischer, 2001). It is terribly easy to go “wrong” in abduction and meander far from the “ground truth.” According to Peirce, however, vagueness is precisely what makes abduction so powerful – guessing introduces possibilities of “errors” but it also performs a kind of reasoning that cannot be achieved otherwise. One uses abduction regularly, whenever one forms a hypothesis based on limited knowledge and generates a plausible explanation. This chapter reviews previous scholarship on mathematical invention, focusing on the role of abduction. To what extent does abduction help us make sense of mathematical practice as speculative, generative, and inferential? Is abduction at work in mathematical practices that mobilize a mix of continuous and discrete conceptual and procedural apparatus? The chapter focuses on examples from the literature that reveal how mathematics operates through contingency and imagining otherwise and through various techniques of generalization and their disruptive consequences. A brief discussion of Peirce’s philosophy of mathematics helps lay the groundwork, before examining case studies where forms of abduction emerge. This chapter began with Lem’s “protoplasmic ocean-brain” in order to situate such creativity in larger ecological and processual questions. Might abduction be our best way to rethink the force of contingency, creativity, and constraint in mathematical behavior more broadly? Could the speculative act of abduction patch together the discrete and the continuous, chance and necessity, the finite and the infinite, as a mode of planetary onto-epistemic production? Abduction would then be a distinctly powerful force in the ongoing transformation of earthly mathematics, opening onto what Peirce called the “great rolling billows of abstraction in the ocean of mathematical thought” (Peirce, CP 4.235).
Mathematics as the Science of the Hypothetical
Peirce claims that mathematics deals exclusively with hypothetical states of things and asserts nothing about real facts; “the mathematician studies not what is, but what would be under a given hypothesis” (Peirce, CP 3.428). The imagination plays a significant role in Peirce’s mathematics, being a necessary epistemic condition for innovative mathematical activity:

Without [the imagination’s] creative work the inquirer would have no world to explore, no determinate hypothetical state of affairs to investigate with the rigor of necessary reasoning. The faculties of concentration and generalization would have no subject to investigate, no mathematical matter to experiment upon and observe, if there were no imagined mathematical hypotheses. Imagination is the key to originality, in mathematics as in all scientific and philosophical reasoning, and so it is the primary source of breakthrough discovery. (Campos, 2010, p. 128)

For Peirce, the imagination is a semiotic ability to create and modify signs, a form of action which plugs into perception and sensation, with the capacity to reconfigure the real (Barrena, 2013). Indeed, mathematics demonstrates “the most athletic imagination” (Peirce, CP 4.661). And yet for Peirce, there is a process of abstraction whereby the mathematician represents the real, by first substituting “for
the intricate, and often confused, mass of facts set before him, an imaginary state of things involving a comparatively orderly system of relations, which, while adhering as closely as possible or desirable to the given premises, shall be within his powers as a mathematician to deal with” (Moore, 2010, p. 3). This “imaginary state of things” is an artifact and also a hypothesis, and all subsequent investigation concerns this hypothesis and not ‘real truth.’ Mathematical hypotheses, however, are different from other speculative endeavors, such as poetry, although Peirce does suggest that detective stories entail a mathematical element. What marks mathematical hypothesizing as distinct is the primary concern with the particular methods of achieving inferential consequences. Thus Peircean mathematics has a dual tendency, combining hypothesis with necessity: he states that mathematics is both “the study of what is true of hypothetical states of things” (Peirce, CP 4.233) and also “the science which draws necessary conclusions” (Peirce, CP 4.228). Campos (2010) summarizes: “These complementary definitions reveal his understanding of the nature of mathematical inquiry as being concerned with studying the true, or necessary, consequences of pure hypotheses” (p. 125). As such, mathematical truth is not positivist because it concerns the would-be consequences within a hypothetical milieu or framework. For Peirce, mathematics is not simply the science of quanta or quantity, a definition that we have inherited from the Greeks and that is both too narrowly focused and inadequately translated from the original context. He similarly takes umbrage at De Morgan and Sir William Rowan Hamilton’s attempt to develop a Kantian definition of algebra as the science of time. Mathematics is not the science of time (or space) because it is not a “positive” or positivist activity, “inquiring into matters of fact” (Moore, 2010, p. 5).
Mathematics is fully reliant on observational practices, and “realist” as such, but does not concern itself with matters of fact. However, mathematical reasoning is not altogether captured nor rescued by logic, for the latter is less “perspicuous”; indeed, logic is considered a branch of mathematics and “logic ought to draw upon mathematics for control of disputed principles” (Peirce, CP 3.427). Logic, for Peirce, also depends on ethics or “the philosophy of aims” and is a more ambitious field of inquiry, while mathematics “is purely hypothetical; it produces nothing but conditional propositions” (Peirce, CP 4.240). Peirce is committed to mathematical realism and disagrees with the logicism of Frege and the nominalism of Kant. Mathematical reasoning operates on hypothetical diagrams in order to derive and test theorems, and these diagrams are conditioned by inferential logics, but they are also free in some sense. The crucial notion of modal degrees of possibility allows Peirce to unify his epistemological and ontological commitments, applying his “extreme realism” and fallibilism to both tractable mathematical representations and virtual universals. Historically, we can situate his approach alongside, and to some extent against, the axiomatic movement of 1870–1920, which saw the “arithmetization of analysis” by Cantor and Dedekind, as well as the emergence of first-order logic, setting the stage for new relationships between mathematics and logic (Moore, 1988). Wilder (1952) and Hacking (2014) suggest that the coupling of mathematics and logic is a twentieth-century development and that historians of mathematics gave scant attention to logic
until 1940, when Bell (1940) dedicated 25 pages to mathematical logic (Wilder, 2012, p. 289). Notably, Peirce is one of the few philosophers at that time who is focused on mathematical inquiry as an inventive process, a process that cannot be reduced to logic and conventions of deduction. Peirce characterizes mathematical activity as drawing from the imagination, unconstrained by correspondence theories of truth; and yet he also emphasizes its remarkable tractable representations and how these convey a system of relations, a functional diagrammatic structure. The term diagram is used here broadly, to include both classical visual drawings in geometry and more rarefied symbolic systems, but in each case, the mathematician performs experiments on these “objects” and observes the results, to determine next actions. Mathematics is thus an experimental science, operating on hypothetical objects. Truth warranting comes to the mathematicians through experiment, intervention, conjecturing, figural rendering, and modifications with “what if not” speculative gestures. All this material labor moves the epistemic subject from a state of hypothesis to an elaborate state of “maybe” and finally to a feeling of conviction, where belief has become justified (Nemirovsky & Ferrara, 2009). The key for Peirce in mathematical invention is imaginative diagramming, but there are pragmatist constraints on the shape of this semiotic transfiguring. The form of the iconic diagram offers an imagined plausible space in which to operate, but the relational structure constrains the inventiveness (de Freitas, 2016b). Peirce speaks of the “indefensible compulsiveness” that compels belief after such inventive work: “indefensible compulsiveness of the perceptual judgment is precisely what constitutes the cogency of mathematical demonstration . . . 
[I]t is the truth that the nodus of any mathematical proof consists precisely in a judgment in every respect similar to the perceptual judgment except only that instead of referring to a percept forced upon our perception, it refers to an imagination of our creation” (Peirce, CP 7.659). In summary, this Peircean philosophy of mathematics accentuates the imagination and the hypothetical, but also elaborates an affective process of truth warranting, and an embodied or mediated mathematical logic.
Surprise and Epistemic Destabilizing
Abduction often begins with surprise, when expectations are shaken, and situations problematized. Mathematical surprise is a mark of dissensus (de Freitas & Sinclair, 2014), an interruption of coherence and pattern, or the uncovering of an unexpected pattern. Mathematical thinking entails some sort of reckoning with this dissensus, so that learning processes can continue. But first there is the opening of nonsense, a world that is sensed (or encountered) without making sense. This raises the important link between surprise and nonsense and the troubling of consensus and socioepistemic conditions that shape hypothesis generation. Abduction is responsive to the unusual, the break in habit or expectation. It might be linked to the very important capacity to notice aberrant motion and suspiciously wonder about missing causes. More broadly, the power of human abduction is found in plugging directly into the imagination to furnish new ideas that might make sense of the unexpected.
This is an everyday sort of practice as well as a force for creative production. In the everyday context, abduction is characterized as the act of composing a hypothesis based on an observation of an irregularity that occurs in relation to expectations. Dreyfus and Eisenberg (1986) claim surprise as one of the important aesthetic qualities of a mathematical problem. Their list of such qualities includes: “its level of prerequisite knowledge, its clarity, its simplicity, its length, its conciseness, its structure, its power, its cleverness, and whether it contains elements of surprise” (p. 3). They treat surprise as a psychological state triggered by an observation. All the characteristics of the mathematical aesthetic – clarity, brevity, elegance, and conciseness – lack significant impact if a feeling of surprise is not also engendered. Gouvéa (2011) discusses the possibility that the mathematician Georg Cantor (1845–1918) was surprised when seeing but not believing that there was a bijection between the interval [0,1] and higher-dimensional space. Byers (2007), for instance, suggests that Cantor was surprised because the existence of the bijection was counterintuitive, despite his having a proof justifying its truth; the episode reveals how mathematical proof does not always furnish feelings of certainty and how mathematicians often discover the unexpected, contravening what they anticipated to be true. In this case, Cantor had observed that the cardinality of a higher-dimensional set is not greater than that of the interval. Gouvéa (2011) examines the 1870s correspondence between Cantor and Dedekind to show how Cantor was not so much surprised by his “inference procedure” but hoping to convince Dedekind that it was “arithmetically rigorous.” The existence of the bijection was counterintuitive and contravened basic assumptions about geometry.
Cantor recounts how he was following the debates regarding the foundations of geometry and realized that there was an unproven presupposition that did not seem self-evident. Cantor states that his initial insight was the problematizing of a widespread assumption, by treating it as a theorem that needed proving. Perhaps then he began to construct such a bijection, believing that it would reveal contradictions. He then indicates that his opposite conviction became strengthened. He was able to construct a one-to-one mapping of all the points from a square to those of a continuous line, contravening fundamental precepts, such as dimension, but otherwise coherent. Thus he found a way to show that “surfaces, bodies, indeed even continuous structures of p dimension would have the same power as curves” and that the cardinality of these sets is the same. The question of what or where surprise lies is complicated in this case or any other where proofs are constructive. Cantor appeals to Dedekind to attend to the details of his proof and ensure that it is correct. He seeks support for his claims. Dedekind, however, congratulates Cantor on his interesting proof, but notes that the bijection is constructed by atomizing the continuity of the interval and that such a correspondence doesn’t actually map the continuous: Dedekind critiques the innovative diagonalizing “inference procedure” because it compels one “to admit a frightful, dizzying discontinuity in the correspondence, which dissolves everything to atoms, so that every continuously connected part of one domain appears in its image as thoroughly decomposed
E. de Freitas
and discontinuous” (Gouvéa, 2011, p. 228). Dedekind remains convinced that “the dimension-number of a continuous manifold remains its most important invariant” and that Cantor’s work fails to destabilize the concept of dimension constructed through “continuous functions.” In response, Cantor suggests that his approach affirms the wonderful power of the “ordinary real and irrational numbers” to determine all dimensional space. The two mathematicians are focused on philosophical concerns about the relationship between the continuous and the discrete. Stanley (2002) argues that mathematical surprise has to be seen as “an event of emergence” and that those who are surprised must be “prepared to be surprised,” in that surprise occurs only when there is a discrepancy between expectations and experiences (p. 15). The word surprise has French and Latin roots in surprendre (to overtake) and prehendere (to grasp or take, as in prehensile). A surprise disrupts the certainty of the past and opens alternative futures. Surprises are moments when thought becomes alien to itself, grasping the future (prehensile) while being overtaken by it, causing concepts to tremble with indeterminacy, as Cantor did with the concept of dimension. And yet surprise is a deeply relational event, emergent through the interaction of different bodies: “[S]urprises are event-full moments or happenings” (Stanley, 2002, p. 15). As events, surprises are not owned by the subject that undergoes them; events redistribute the sensible across the realm of the plausible. A surprise is an event through which two or more bodies interpenetrate in new ways, and a new assemblage emerges. Ideas mix and intermingle during surprises in ways that bring forth immanent tensions and new surface effects, new ways of sharing new kinds of sense, nonsense, ultimately destabilizing past consensus. 
The contours of the sensible are literally reconfigured through surprises, when mathematical invention troubles and problematizes the structural foundations of a concept (Mancosu, 2008). In abduction, we come to perceive a proposition as a conditional hypothesis, much like Cantor did. It is this destabilizing of belief that triggers the mind to seek an explanation or the construction of a counterexample. In the case of Cantor, abduction takes us into the heart of conditional reasoning, into the ontological relationality that subtends any kind of surprising inference, allowing for the construction of, in this case, the monstrous bijection.
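The kind of correspondence at issue between Cantor and Dedekind can be illustrated with a small sketch. The code below uses digit interleaving, a common textbook illustration of a square-to-line mapping; it is an assumption made for illustration and not a reconstruction of the exact procedure in the correspondence. It works on finite digit strings only and ignores the complication that some numbers have two decimal expansions. The function names are hypothetical.

```python
def interleave(x_digits: str, y_digits: str) -> str:
    """Merge the decimal digits of two coordinates into one digit string.

    "1415" and "7182" stand for 0.1415 and 0.7182; the result stands for
    a single number on the line whose digits alternate between the two.
    """
    if len(x_digits) != len(y_digits):
        raise ValueError("digit strings must have equal length")
    return "".join(a + b for a, b in zip(x_digits, y_digits))


def deinterleave(z_digits: str) -> tuple:
    """Recover the two coordinates from an interleaved digit string."""
    return z_digits[0::2], z_digits[1::2]


# A point of the square is sent to a point of the line and back again.
image = interleave("1415", "7182")
print(image)                # 17411852
print(deinterleave(image))  # ('1415', '7182')

# Two neighbouring points on the line, 0.5000 and 0.4999, land far apart
# in the square: the second coordinate jumps from 0.00 to 0.99.
print(deinterleave("5000"))  # ('50', '00')
print(deinterleave("4999"))  # ('49', '99')
```

The last two lines give a concrete sense of the “frightful, dizzying discontinuity” that Dedekind objects to: the mapping is reversible, yet points that are close together on the line can be sent to distant points in the square.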
Concept Creation
According to Zalamea (2009), mathematical creativity emerges through the transfusion of one conceptual field into another, through associations, analogies, and couplings. He characterizes the unity of mathematics less in terms of a logic of explanation, and more in terms of inventive mutations, and offers a series of case studies:
The unity of mathematics expresses itself, not only in virtue of a common base upon which the All is reconstituted (set theory), but – before all else – in the convergence of its methods and in the transfusing of ideas from one to another of its various webs. The penetration of algebraic methods into analysis, itself subordinated to topology, the
ubiquitous geometrization of logic and the structural harmony of complex analysis with arithmetic, are all examples in which mathematics’ global unity can be perceived in its local details. (Zalamea, 2009, pp. 36–37)
Zalamea describes the field of mathematics in terms of fluid mixture – in discussing a whole raft of examples of mathematical developments, he uses words like “decantering,” “pouring,” “transfusing,” “filtering,” “saturating,” and “distilling.” Hence his attention to the ways in which the ideas flow from domain to domain and metamorphose as they do. He argues that the radical transformation of mathematics in the twentieth century, due in large part to the mathematician Grothendieck’s hugely influential algebraic methods, opens up a “practice of a relative mathematics” (p. 140) which breaks with an “absolute” mathematics (“in the style of Russell”). He supports these claims with reference to the particular technical practices employed: “In a technical manner, both Einstein and Grothendieck manipulate the frame of the observer and the partial dynamics of the agent in knowledge” (Zalamea, 2009, p. 141). Grothendieck’s method of “sounding out” seems to plug into a “reticent structure” or “proto-geometry” that is in the batter itself (Zalamea, 2009, p. 152). Zalamea draws extensively on what he calls the “dynamic Platonism” of the philosopher Albert Lautman (2011), who showed how mathematics often develops through breaking up its own rigidity by remixing key pairings like continuous/discrete or symmetry/dissymmetry. New mathematical structures emerge through transits and collaborations that partake in that remixing. Indeed, the debates between Cantor and Dedekind lend support to this view: Contrary notions (local/global, form/matter, container/contained, etc.) dwell within groups, number fields, Riemann surfaces and many other constructions . . . the contraries are not opposed to one another, but, rather, are capable of composing with one another so as to constitute those mixtures we call mathematics. (Zalamea, 2009, p. 58)
This focus on remixing and recomposing key contrary notions helps clarify what is at stake in attempts to characterize the role of abduction in creative mathematical practices. Contrary notions dwell within conceptual figures, such as real or complex numbers, where the concept of number is stretched and modulated to accommodate diverse numeric tendencies (de Freitas et al., 2017). Heeffer (2007) argues that abductive reasoning can be seen at work in the text Ars Magna (1545) by Cardano, who is credited with using negative and imaginary numbers in mathematical problems, radically altering methods for solving quadratics. Cardano called his method the rule of postulating a negative (de regula falsum ponendi). Negative solutions for linear equations and negative roots for quadratic equations were completely unacceptable prior to Cardano. The Arabic tradition allowed for two solutions to quadratics, but this approach diminished in the abacus tradition (Heeffer, 2007). Nonetheless, Cardano is willing to use them because, as he says, they make sense symbolically and assist in solving problems. Heeffer (2007) shows how Cardano performs operations (multiplication) on the square root of a negative number and obtains coherent results, which ultimately help furnish solutions to various problems. These new symbolic objects, however, were considered suspect or dubious, and Cardano suggests they might designate a “missing surface,” according
to a geometric interpretation, or perhaps exist merely as symbolic gestures and are otherwise “useless” in their reference to the real (Heeffer, 2007, p. 9). They have no meaning outside of the symbolic structure, in that they have no material instantiation in the specific problem situation, but they accord with the symbolic milieu within which he is working. Of course negative numbers correlate to material debt; but the point is that the mathematics of money and abacus problems in the European tradition did not use the concept of “negative unknowns”; this is new with Cardano, who grants these unknown quantities adequate individuation so that they might be manipulated within the symbolic framework. For Heeffer (2007), the anomalies of negatives were the “surprising facts” that had haunted algebra for centuries, and Cardano abduced an explanatory hypothesis which made these facts acceptable. The hypothesis that an unknown surface exists, for instance, can be put to work, in explaining the set of solutions and the relationship between symbols. In this sense, abduction invents a new idea, or materializes a mathematical object, by manipulating a previously suspect idea, and uses the newly refurbished idea as part of the procedural explanation, which produces the desired solutions. Roots of negative numbers are explanatory in that they are functioning like causal mechanisms, within the symbolic framework, showing how the solution emerges because of their use; they were in themselves dubious at the time, but helped justify the truth of the solution. Abduction functions here to establish mathematical coherence. Meyer (2010) is focused on the role of abduction in the “discovery of mathematical coherence” and the “determination of invariant truth,” but our example here pertains to concept construction, rather than emphasis on pattern and rule generalization (p. 186). 
Pickering (2006) also turns to the construction of mathematical concepts not easily referenced in the physical world, as evidence of mathematical creativity and abduction. He is interested in mathematical invention, discovery, and creative elaboration from within its “disciplining” and algorithmic habits. He suggests that mathematical practice be considered “open-ended modeling,” where modeling is defined broadly as a way of producing associations between diverse cultural elements. Three modeling practices are emphasized: bridging, filling, and transcription. Bridging and filling are said to be “free moves,” while transcription is a “forced move” within any disciplinary practice. He discusses the case of William Rowan Hamilton’s (1805–1865) process for inventing quaternions, which Hamilton characterized as an attempt to generalize the vector arithmetic of complex numbers into higher-degree ordered n-tuples that might also be directed entities in higher dimensions. Hamilton had been stymied for 10 years, struggling to create a coherent arithmetic for such numbers. In 1843, he recounts how he suddenly realized that if he could abandon the constraint that multiplication be commutative and use the quadruple (a, b, c, d) instead of the triple, he could then fashion a consistent and tractable new concept of number that encompassed the complex plane. For Pickering (2006), the case of Hamilton is used to argue against sociologists of science in the Strong program, for whom knowledge was entirely sociocultural and detachable from the real. Hamilton’s geometric experimentation involves an abductive process whereby the break with arithmetic conventions (commutativity)
is accommodated because the new quaternion concept allows him to make better spatial sense of the structural relationships between three-dimensional vectors. Like the Cantor and Cardano examples, this one also pertains to the concept of dimension and the materiality of mathematical labor. Moreover, the creative impulse emerges through various diagramming gestures, and draws attention to the “material agency” of mathematical practice, and the embodied fumblings, remakings, additions, and filtrations that shape what Pickering calls “the mangle of practice” (p. 251). The existence of quaternions was debated in the mathematical community in the 1840s. Concerns that these numbers were too speculative, and that their arithmetic broke with conventions, seemed to set them aside from the rest of mathematics. Hamilton was an idealist, following Kantian claims that mathematics was synthetic a priori and that humans were equipped with faculties that allowed them to conceive and connect their conceptions with the empirical world; for Hamilton, algebra was the science of pure time, and geometry the science of pure space. For these reasons, Pickering claims that Hamilton is involved in a Kantian “technical-metaphysical” practice of creating concepts. To move beyond this Kantian framing of concept construction, one must think with different philosophers (Smith, 2007). For instance, Châtelet (2000) selects certain episodes in the history of mathematics and physics to show how particular diagrams – what he terms “cutting-out gestures” – have erupted during inventive thought experiments to reveal the onto-generative nature of mathematical agency. Châtelet invites us to see diagrams as invoking a dynamic process of excavation that conjures the virtual in sensible matter, a kind of problematics (to complement axiomatics) for engaging with the quivering indeterminacy of concepts. 
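Hamilton's move of giving up commutativity, discussed above, can be made concrete with a short sketch. The code below implements the standard Hamilton product on quadruples (a, b, c, d), read as a + bi + cj + dk; the function name `qmul` and the tuple representation are choices made here for illustration, not notation taken from Pickering or Hamilton.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples,
    where (a, b, c, d) stands for a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qmul(I, J))  # (0, 0, 0, 1), that is, k
print(qmul(J, I))  # (0, 0, 0, -1), that is, -k: the order of factors matters
print(qmul(I, I))  # (-1, 0, 0, 0), so i squared is -1, as for complex numbers

# Quadruples with c = d = 0 multiply exactly like complex numbers,
# so the complex plane sits inside the new number system.
print(qmul((1, 2, 0, 0), (3, 4, 0, 0)))  # (-5, 10, 0, 0)
```

The products ij and ji differ only in sign, which is the break with arithmetic convention that the new concept had to accommodate, while the last line shows the sense in which the quadruples still encompass ordinary complex arithmetic.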
The square, for instance, is the material process of quadrature; the circle is the differential force that sustains or produces or determines it. Deleuze (1994) will also compare the general equation of a circle, $x^2 + y^2 - R = 0$, with the differential equation $y\,dy + x\,dx = 0$ to show how the latter captures the movements that sustain the fluidity of the concept of circle. This differential equation quivers with its universality and its indeterminacy (de Freitas & Sinclair, 2020). Deleuze characterizes as a “problematics” a philosophy of mathematics that is more in line with Peircean realism and that reframes the nature of “concept construction”: an approach to mathematics that reanimates mathematical figures and equations, embracing the event nature of concepts and the hypothetical nature of the real, rather than a Kantian idealism.
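The step from the algebraic equation of the circle to its differential form can be written out in one line. The following is a minimal worked step, assuming that $R$ is a constant that fixes the particular circle; it is offered as an illustration of the comparison and not as Deleuze's own derivation.

```latex
\begin{align*}
x^2 + y^2 - R &= 0 \\
d\left(x^2 + y^2 - R\right) &= 2x\,dx + 2y\,dy = 0 \\
x\,dx + y\,dy &= 0
\end{align*}
```

Because the constant $R$ disappears under differentiation, the resulting equation no longer picks out one circle but holds for every circle centered at the origin, which is one way to read the claim that the differential form is both universal and indeterminate.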
Computation and Distributing Error
The iterative loops of habit, anticipation, and correction are pivotal in abductive processes, as is “what if not?” thinking. To imagine what is not the case is to fumble after fabulation and the power of the false, into a hypothetical realm that lures one away from matters of fact, which are then modified and reanimated by being the occasion for such speculation. Creative processes often involve this kind of altering or transforming, where the gestural contemplation of “what if not?” indulges the infinite power of negation, when “to imagine otherwise” transforms the present
moment into a pivot, and seeds an array of alternative paths for action and reason. To speculate is to populate a hypothetical space, where divergence and distortion operate on what is the case, a space that is enlarged by the power of the imagination to craft other plausible scenarios which might explain the apparent circumstances. Zalamea (2012) defines abduction as an inferential process that locally glues the breaks in the continuum of habit and expectation, by means of an arsenal of methods which select effectively the “closer” explanatory hypotheses for a given break, thereby stitching and mending the discontinuities in any new regularizing perspective. For Zalamea (2012) abduction must draw from an “infinitude of useless hypotheses” whose utility is bound to their being false (p. 102). Such a claim is important, because it helps examine the extent to which abduction is at work in computation and machine abstraction. To abduct is both a creative gesture and a mode of regulation, smoothing over the interruptions to any generalization or abstraction, while creating unexpected contortions that might also disrupt previous pattern-making habits. Magnani (2015) trusts that “computational philosophy” will make rational the “speculative problem” of creative invention in the sciences, and he sees abduction as often a mix of instrumental plausible reasoning and manipulative creative intervention. He distinguishes between sentential and model-based abduction, a distinction that partially corresponds to that between logical relations and diagrammatic relations. But model-based abduction captures a wide spectrum of activity: analogical visual-iconic reasoning, simulative reasoning, perception activity, thought experiments, and deductive reasoning (Magnani, 2015, p. 37). He turns to current developments in AI computation and machine learning which seem to operate with a distributed abductive reason. 
Magnani (2015) is interested in how we are developing genetic algorithms that are computationally adequate to perform viable “discovery” processes modeled on mathematical and scientific methods. Parisi (2017, 2019) also turns to the concept of abduction to explore algorithmic creative capacity and to help us think much more broadly about the nature of instrumental reason. She seeks to understand the “collective abductive inferences within and throughout computational logic,” which involve “hypothetical elaboration” and plausible sense-making found in “algorithms, software, subroutines, codes, as well as databases, platforms, interfaces, and so on” (p. 97). She aims to show how machine learning – in particular, deep neural net algorithms – can be considered a speculative mode of cognition, a form of inference that brings forth the unscripted new, in response to unanticipated and contingent conditions. Current machine learning habits, for instance, deep neural nets, are bound by linear regression models and basic combinatoric permutations of plausible hypotheses, but they are in some cases generative of new hypotheses (Buckner, 2019; Josephson & Josephson, 1994). What matters in a deep learning neural net is how value is correlated with weight and how these weights are distributed and revised across the network. The hypothesis space is indeed the multidimensional array of possible weighted significance assigned to each element of input data. The algorithm modifies weights so as to reduce overall error, thereby inducing an error function which emerges from the process.
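The description above of an algorithm that revises weights so as to reduce overall error can be sketched in a few lines. The example below is a deliberately reduced stand-in: a single weight and bias fitted by gradient descent on mean squared error, rather than a deep network. The data, learning rate, number of steps, and function names are all assumptions chosen for illustration.

```python
def mean_squared_error(w, b, xs, ys):
    """Average squared gap between the predictions w*x + b and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly nudge the weight and bias against the error gradient."""
    w, b = 0.0, 0.0          # the initial "hypothesis" is arbitrary
    n = len(xs)
    history = []             # error after each revision of the weights
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mean_squared_error(w, b, xs, ys))
    return w, b, history


# Toy data generated by the rule y = 2x + 1 (an assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b, history = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))   # 2.0 1.0
print(history[0] > history[-1])   # True: the error shrinks as weights are revised
```

In this reduced setting the hypothesis space is just the plane of possible (w, b) pairs, and each step moves the current hypothesis a little way downhill on the error surface; a deep network does the same over a vastly larger array of weights.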
We can better understand the nature of computational habits when we understand how the power to imagine is situated in these kinds of techniques of the false – the material labor of imagining otherwise. Abduction operates in computation as a cognitive corrective effort, to make sense of anomalous indexed facts, by assessing future possible outcomes that follow from certain hypotheses. Deep neural nets (DNN) are demanding and meandering learners. The method of gradient descent seeks to minimize this divergent thinking, to shepherd the algorithm over the undulating terrain of uncertainty, away from those propositions that are farfetched and those that are deceptively truth-like, until it can settle on a reasonable hypothesis. As Parisi (2017) warns, big data methods underscore the links between affect and reason, whereby a “chain of contingencies becomes the driving force for decision-making actions.” I suggest here that this process is best understood as deeply rooted in the vast crenelated power of the false, a false that is differentially spread across the Baroque folds of plausibility. The example of generative adversarial networks (GANs) is particularly relevant as these gain skill through a kind of generative “unsupervised learning” (Goodfellow et al., 2014). GANs have a dueling structure of generator-discriminator built into them, where the generator acts like the imagination extending and creating fictions, and the discriminator guesses whether the work is fiction or not and the degree of trust in such decisions. The first “guesses” of what might be considered a plausible choice are absurd and far from the training data, generated as they are from a random flux of pixels (GANs are most often used on arrays of pixels), selected without conceptual guidance. This noisy generator – proffering “useless hypotheses” plucked from the vast hypothesis space – is precisely what is needed for the machine learning to be robust. 
Gradually, the generator learns from the process of judgment and generates hypothetical propositions that are more accurate and capable of fooling the discriminator, who has also learned during this process. These generated propositions and images are creative abstractions insofar as they capture something “about the concept” in its totality, without being averages in the conventional sense. These examples of abductive computational cognition raise important issues about “efficient” reasoning, or lack thereof, in mathematical modeling. McKaughan (2008) suggests that abduction pertains to the “pursuitworthiness” of a thesis and not the process of generating novel hypotheses. He argues that abduction is an inference based on weighted consideration of if-then conditions, fashioning a form of reasoning “invoked as a means of providing fallible evidential support to explanatory hypotheses that go beyond the observational data” (p. 450). Abduction can easily go wrong in this way and is associated with fallacious implications, reasoning “from effects to causes” or retroduction, but is also pragmatic and helpful, a kind of inference to the best explanation. This kind of abduction is also seen in “fact optimization” in young people’s problem-solving (Hidayah et al., 2020). According to this interpretation, abduction is not an ampliative inference. This is the interpretation that treats hypothesis generation as “merely preparatory” in that it “commits us to nothing. It merely causes a hypothesis to be set down upon our docket of cases to be tried” (Peirce, CP 5.602). Hypotheses worthy of pursuit draw
attention to the “economy of labor” in reasoning (McKaughan, 2008, p. 452), so that “the better abduction is the one which is likely to lead to the truth with the lesser expenditure of time, utility, etc.” (Peirce, NEM 4, 37–38). McKaughan aligns with Thagard (1981) and Kapitan (1990, 2000) in suggesting that “economies of research” are relevant to understanding Peirce’s later work on abduction. The leading consideration in all cases of abduction “is the question of economy – economy of money, time, thought, and energy” (Peirce, CP 5.600). Current machine learning models are an impressively inefficient way of generating plausible hypotheses. Perhaps there simply isn’t enough abduction occurring in such mathematical formulations. Critics of DNNs, like the AI researcher François Chollet (2019), suggest that machine learning algorithms should be trained on a different curriculum (different kinds of data), if they are to learn how to generalize as well as humans do. He shows how training on pixels and atomized data arrays is not the best diet for learning and that a relational data set, expressing some sort of semantic structure, might get us closer to what we think humans are doing when they create abstractions. Rather than a sloppy mess of pixel arrays – a Jackson Pollock painting – might we train our DNNs on data that is more structured? Tensions between those defending the purely roaming data-hungry DNN and others like Chollet, who are attempting to rethink the reliance on DNNs in machine learning, have led to a renewed emphasis on “neuro-symbolic” training sets, so as to simulate what they surmise to be innate neurocognitive structures at the foundation of human learning (Chollet, 2019). This debate is directly related to questions about the explainability of AI algorithms and the broader question of what constitutes a good explanation in mathematics.
Proof and Explanation
Proof structures in mathematics are somewhat diverse; Colyvan (2020) lists conditional proof, reductio ad absurdum, finite induction, transfinite induction, disjunctive syllogism, universal generalization, and proof by cases (p. 192). Mathematical proofs come in different forms and accordingly achieve different effects. Some proofs are said to be more explanatory than others, in that they both affirm the truth of a proposition and show also why it is true. This distinction is helpful in revealing the place of abduction in mathematics and in understanding how mathematical explanations are not “causal histories” in the usual sense. To that end, Colyvan (2020) discusses Euler’s well-known proof that there is no way to traverse the bridges of Königsberg once and only once, on a journey that begins and ends in the same place. More generally, the theorem states that a connected graph has an Eulerian cycle if and only if there is no vertex with odd degree. In the original network-theoretic proof, there is a “proof idea” that captures the inherent constraints of the network, that being the requirement that every node or vertex in a network must pair departure and arrival paths. If there is an odd valence for any node, the connected network cannot be traversed by an Euler cycle. Another proof, using a brute force combinatoric study of possible paths, would affirm the truth
of the proposition without referencing the network structure. For Colyvan (2020) this latter approach makes the proof nonexplanatory. In other words, brute force proofs or proofs that warrant truth through enumerating and exhausting the space of possibility are not considered “explanatory.” Abduction plucks from the space of possibility, precisely because such exhaustive searches are usually beyond our capacity. It is these kinds of constraints or limits to capacity, along with questions of the “economy of money, time, thought and energy,” that create conditions in which explanatory activity is demanded. For Colyvan (2020) “explanatory” proofs have to use the structural relationships of the problem, insofar as they are formalized or contracted into meta-level models and not insofar as they are encapsulated in complete lists of all possible situations. Brute force exhaustive calculations are not explanatory. As he suggests, “armed only with such a proof, we would be none the wiser as to why there is no Eulerian cycle for the multigraph in question” (p. 191). Such a proof may be confirmatory, but it does not offer explanation as to why something is the case. This means that a description of a situation which is “total” – rather than a method of reasoning that draws from a sample, representative set, a guess, or some other speculative gesture in order to create a mathematical structure – is not explanatory. In addition, the graph-theoretic proof is more general and thus more nimble, because it can determine whether an Euler cycle exists in a modified graph, whereas the other proof style must begin again, from first counting principles, when having to consider a modified graph. Arzarello et al. (1998) use abduction to describe the process of moving between a logic of discovery and one of explanation, deploying a heuristic that transitions from conjecturing modes to methods of proving mathematical claims to be true. 
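The contrast drawn above between the structural proof and the brute force proof can be sketched in code. The two functions below are illustrative only: the first applies the degree-parity criterion and assumes the graph is connected; the second tries every ordering of the edges. The edge list encodes the seven bridges with hypothetical labels A to D for the four land masses, and the function names are likewise assumptions.

```python
from collections import Counter
from itertools import permutations

# Seven bridges between four land masses (labels are assumptions).
KONIGSBERG = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]


def has_euler_cycle_by_parity(edges):
    """Structural check: every vertex must have even degree.
    Assumes the graph is connected."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return all(d % 2 == 0 for d in degree.values())


def has_euler_cycle_by_brute_force(edges):
    """Exhaustive check: try every ordering of the edges and see whether
    any of them can be walked as a closed tour using each edge once."""
    if not edges:
        return True
    for order in permutations(edges):
        first_u, first_v = order[0]
        # Walk the first edge in either direction.
        for start, current in ((first_u, first_v), (first_v, first_u)):
            walkable = True
            for u, v in order[1:]:
                if current == u:
                    current = v
                elif current == v:
                    current = u
                else:
                    walkable = False
                    break
            if walkable and current == start:
                return True
    return False


print(has_euler_cycle_by_parity(KONIGSBERG))       # False
print(has_euler_cycle_by_brute_force(KONIGSBERG))  # False
```

Both functions return the same verdict, but only the first one points to a reason (the odd degrees). It also transfers to a modified graph at once, whereas the exhaustive search must start over and its cost grows factorially with the number of edges.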
They characterize this process as dialectical, where abduction entails a “resolutive move” between groping conjectures, allowing for a convergence toward a validated form of knowledge. In the context of classical deductive proof, citing Pappus, they treat the two methods of analysis and synthesis as capturing this dialectic process. Analysis seeks preliminary assumptions or givens, as conditions for a statement to be true (or false), and synthesis starts from first principles and consequences to affirm the truth value of a statement. Abductions emerge at the outset as working hypotheses, not yet formalized in if-then propositions, but still “a sort of reverse deduction” whereby the subject “sees what rule it is the case of.” Then the subject switches to the deductive mode and rethinks the relationship “in the opposite way” and finally they “detach” and become a “true rational agent . . . who controls the products of the whole exploring and conjecturing process from a higher level.” This is a model of logical thought as ascending and descending a ladder of conjecture/inference relations. Knipping (2003) also shows how problem-solving and proving involve a process where abduction generates the rule, and then reversing allows for a deductive proof to affirm the truth of the claims. The links to Imre Lakatos’ work on methods of mathematical invention and proof are explicit. Arzarello et al. (1998) examine Lakatos’ (1976) machinery of proof and refutation, in which he analyzes the Euler conjecture about the relationship between edges, vertices, and faces of polyhedra. They suggest that in Lakatos’ work, abduction functions as a “logic of not” whereby counterexamples are engendered and cause
some surprise or dissonance, and then additional abductions are engendered to explain the counterexample. New concepts are proposed, invariants are modified, domains of relevance are expanded or contracted, and so on. Hookway (2012) also discusses Lakatos’ elaboration of mathematical inquiry, where the distinction between simple corollary implication and more ampliative synthetic deductions of “theorematic” quality is illustrated (pp. 194–203). “Theorematic reasoning,” so named by Peirce, marks the points in mathematical inquiry where a creative construction is introduced, resulting in a surprising observation, which follows by necessity from the relational structure (Hoffmann, 2010). Komatsu and Jones (2021) look for how the abductive response to counterexamples is used by young people learning mathematics. Lakatos elaborates how the creative and speculative “what if not” activity should not be treated as simply the reverse of a deductive warranting activity but is an informal method unto itself. For Lakatos, these are not simply unfinished formal proofs, in which the pertinent axioms and logical rules of inference are suppressed, but rather a significantly different mode of inquiry, a non-axiomatic argument that has its own trajectory and its own becoming. As Peirce says, the mathematician’s hypotheses “are creatures of his own imagination; but he discovers in them relations which surprise him sometimes” (Peirce, CP 5.567). This element of surprise responds to the indeterminacy of the imaginative structure and the fact that what follows from it through necessity is not pre-given epistemically. Constructive proofs that furnish that of which they affirm the existence, such as the ancient proof of the infinitude of the prime numbers, seem more like causal histories. These kinds of proof offer empirical provenance and reveal the actual method of creation. 
Brouwer suggests that intuitionist mathematics involves creative acts of two kinds: through an infinite proceeding of mathematical sequences, engendering the new, and through the defining of properties of a new “species” of mathematical object (Sriraman, 2004). The constructivist emphasis on constructive proofs, and their restrictions on infinities, seems to curtail abductive processes that plunge into an alien thought and what Bell (2008, 2015) calls an “infinite pragmatism”. Mathematical inductive proofs are notoriously lacking in explanatory power (Brown, 1997). The method of mathematical induction confirms that a relation holds for all cases if it (1) holds for a base case and (2) holds for case K + 1 if it holds for case K. The proof method must show that case K + 1 follows necessarily from case K. The proof establishes the truth of the relation, but rarely makes tangible the structural significance of the relation. The role of abduction in such proofs is minimal, perhaps only in the initial conjecture regarding the rule, pattern, or relation. The inference mechanism within mathematical induction stays clear of inventive hypothesis. Rivera (2008) discusses this kind of abduction in patterning tasks given to middle school children, noting that 5 years of performance tests reveal that students struggle with such generalization tasks (Rivera & Becker, 2007). He recounts how three figural examples might underdetermine a pattern and, in such curricula, create illusions that narrowly “see” linearity in sample sets. Indeed, much of the work in mathematics education research that focuses on abduction is primarily focused on “noticing” pattern rules induced by examples (Meyer, 2010).
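The two-part schema of mathematical induction described above can be seen in a short worked case. The example chosen here, the sum of the first $n$ positive integers, is a standard textbook instance supplied for illustration; it is not drawn from the sources cited in this section.

```latex
\begin{align*}
&\textbf{Claim } P(n):\quad 1 + 2 + \cdots + n = \frac{n(n+1)}{2}. \\[4pt]
&\textbf{Base case } (n = 1):\quad 1 = \frac{1 \cdot 2}{2}. \\[4pt]
&\textbf{Inductive step: assume } P(K). \text{ Then} \\
&\qquad 1 + 2 + \cdots + K + (K+1)
   = \frac{K(K+1)}{2} + (K+1)
   = \frac{(K+1)(K+2)}{2}, \\
&\text{which is } P(K+1).
\end{align*}
```

The argument certifies the formula for every $n$, yet the formula itself has to be guessed before the induction can start, and the algebra of the inductive step gives little sense of why the sum should take this form. That guess is the only place in the proof where an abductive move would sit.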
Pedemonte (2008) folds abduction into the inductive methods for generalizing rules. Rivera (2008) stresses that induction includes abductive moves, and he focuses on “algebraic abductions” whereby a student generates a formula for a sequence of terms. Implicit in these kinds of examples is the correlation between the structure of the figure and the index in the ordered sequence, which confines the creative abstraction to inductive reasoning. Hence, these examples explore the way that abduction is folded into the induction process. Rivera (2008) distinguishes between partial and complete abduction, where conjecturing and testing together form the “complete” inductive-abductive process. He cites Hoffmann (1999) who states: “induction is not what can be generalized from a sample of data, but only a quantitative determination of what is already given by abduction” (p. 272). Reid (2018), who surveys scholarship on abduction in mathematics education, points out that the later Peirce rescinded what he saw as an overemphasis on the probabilistic reasoning associated with abduction, attempting to separate the process from mere “abductive induction.” And yet generalization through induction is a common mathematical form of reasoning, a mode of abstraction whereby one selects some qualities shared across the specific sample, and then detaches these qualities from the particulars so that they become mobile generals, tested against examples further afield (Josephson, 2000; Ylikoski & Kuorikoski, 2010). Induction generalizes in this way, but “the general” that newly emerges is always tethered to the finite sample set (or countable set), undermining the universality of the rule. Hume’s problem of induction – articulated during the eighteenth century, while the mathematics of probability was emerging – points to the inherent insufficiency of induction to justify the rule (Hume, 1999).
Induction always harbors this inadequacy, while reason nonetheless furnishes creative abstractions and brings forth the general rule.
Problem-Solving and Metacognition
Cifarelli (1997) suggests that mathematical problem-solving involves ad hoc abductive hypothetical reasoning, which allows one to proceed cautiously and adapt easily; following Burton (1984), he focuses on the process by which novel hypotheses are engendered in mathematical problem-solving activity. In other words, he argues that abduction is a way of moderating problem-solving activity and reflecting on process, as Norton does as well (Norton, 2008). Cifarelli (1997) is particularly interested in mathematical problem-posing habits that operate through hypothetical counterfactuals and suppositions. He uses case studies of children showing flexibility and control as they explore the mathematical situation – his attention is on the metacognitive nature of abductive behavior. Reid (2018) surveys many who have focused on abduction in this way within contexts of mathematics learning and pedagogy (e.g., Cifarelli, 1999; Krummheuer, 2007; Pedemonte, 2007; Pedemonte & Reid, 2011; Rivera & Becker, 2007). This embodied approach resonates with the mathematician Longo (2015), who also argues that human habits of mathematizing are conditioned by eco-biological processes and that there is
a correspondence between mathematics (as a study of quantities organized in structures) and the cosmos, but cautions that “this shouldn’t be considered a new Pythagoreanism” (p. 12). Mathematical activity, for Longo (2015), entails a kind of dialectical coordinated interaction with another being which resists us – the world says “no” and “channels our epistemic praxis, which is of an eminently organizational character . . . ” (p. 13). This “real friction with the world” is not the conventional culture-matter dialectic, nor a process of uncovering an immanent mathematical structure, but rather points to a geologic entailment that goes back to prehuman dynamic evolving forms (p. 16). Our brain and body are organized such that particular physiological structures and neural networks are both conditions for particular kinds of geometry and simultaneously plastic, responsive, and generative, making abduction necessary for learning and invention. Longo also turns to the “interval” and the concept of dimension, as an important source for mathematical metamorphosis (de Freitas, 2021). Abrahamson (2012) expands this perspective and suggests that “guided mediated abduction” is a pedagogical process, in that students’ logical inference and creative reasoning involves appropriation of cultural forms, a claim for which he draws on the dialectical learning models and cultural-historical activity theory of Vygotsky. He studies psychological trajectories for mathematical reinvention, tracking young people as they explore the ratios of colored marbles in a probability experiment, suggesting that their learning is best described as abduction. He focuses on students’ ability to abduce “intensive quantities” which differ from standard extensive quantities in being scale-invariant properties of a system. Temperature, hardness, and density are examples mentioned.
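The contrast between extensive and intensive quantities can be sketched with density, one of the examples just mentioned. The class below is a hypothetical illustration with made-up numbers: mass and volume scale with the system and add across its parts, whereas density is unchanged by scaling and does not add when two samples are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical material sample (illustrative values only)."""
    mass: float    # extensive quantity, e.g. grams
    volume: float  # extensive quantity, e.g. cubic centimetres

    @property
    def density(self) -> float:
        """Intensive quantity: a ratio of two extensive quantities."""
        return self.mass / self.volume

    def scaled(self, factor: float) -> "Sample":
        """The same material, with `factor` times as much of it."""
        return Sample(self.mass * factor, self.volume * factor)

    def combined(self, other: "Sample") -> "Sample":
        """Two samples taken together as one system."""
        return Sample(self.mass + other.mass, self.volume + other.volume)


a = Sample(mass=10.0, volume=10.0)  # density 1.0
b = Sample(mass=80.0, volume=10.0)  # density 8.0
whole = a.combined(b)

print(a.scaled(3).density == a.density)      # True: density is scale-invariant
print(whole.mass == a.mass + b.mass)         # True: mass is additive
print(whole.density, a.density + b.density)  # 4.5 9.0: density is not additive
```

Relating part to whole therefore works differently in the two cases: the density of the whole is a weighted blend of the parts, not their sum.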
Intensive quantities are said to be nonadditive across a system, and they entail different ways of relating part to whole. Moreover, the notion of intensive quantity revisits the earlier examples of Cantor, Cardano, and Hamilton, in that our discussion revealed how number itself became intensive through acts of abduction (de Freitas & Ferrara, 2015). Insofar as quantity is treated as intensive, it allows for distortion and the kind of “sounding out” associated with Grothendieck’s method. Abrahamson (2012) suggests that the probability concept of “likelihood” is also an intensive quantity, which is mobilized in embodied notions of chance and anticipation. Since abduction is associated with plausible reasoning, this particular study of reasoning about likelihood takes us deeper into the concept of plausibility itself. Indeed, plausible reasoning about plausibility seems to shift the frame of reference, either back to the psychology of the individual or perhaps forward to a radical new image of probability in the wild (de Freitas, 2016). Abrahamson is focused on how the inference mediates a change in conviction regarding the truth of a proposition that pertains to a causal mechanism, allowing the student to move from the sensory phenomenon into the abstract model for determining likelihood. That is, they are moving from the empirical case to the symbolic rule, metric equation, or iconic diagram which is mathematically associated with the event under study. Rivera (2017) also examines the role of disposition in abductive cognition in education contexts, where “what follows necessarily” becomes highly motivating or worthy of pursuit. In this case, mathematical abduction is a kind of destabilizing of certainty at the psychological level, so that previous trust in an
intuitive or perceptual understanding is diminished, as the new mathematical model is accepted. Abrahamson (2012) considers the destabilizing of epistemic certainty associated with an “intuitive” response to be essential for constructivist learning, so that a conviction becomes “a volatile impression warranting inferential support.” This is described as moving from intuitive knowledge to proving. Abduction is then proffered as the way to rethink learning issues associated with student voice, authority, trust, and critical pedagogy, linked to the “inherent forms of disciplinary media” that educators deploy in classrooms (p. 639). The opportunity for abduction then occurs through encounters with new media; by modeling the probability of outcomes using material cards, children moved from an initial “sense” of equiprobability for all outcomes to a reflective and analytic sense of heteroprobable aggregate events. For Abrahamson, the guided mediated abductive process problematizes attributions of truth precisely because it is a mode of construction that is deeply “socioepistemic” rather than a rarefied insulated cognitive process. Notably, the children improve their understanding of likelihood by moving from a notion of “five objects” to a notion of “five-sets-of-objects” – in other words, they hypothesize a whole (the “event space”) to which the particulars belong. This tendency to reason through the existence of “wholes” – often with different claims to universal or infinite extension – is at the heart of probability and abduction, but also perhaps at the heart of empiricism itself. Indeed, as Deleuze (1991) says, “One can only invent a whole, since the only invention possible is that of the whole” (p. 40). 
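The shift from equiprobable outcomes to heteroprobable aggregate events can be illustrated by enumeration. The sketch assumes, purely for illustration, a draw of four marbles, each green or blue with equal chance; the slot count, the colors, and the fair mix are assumptions of the example rather than details reported above. Under these assumptions there are sixteen equally likely ordered outcomes, which fall into five events of unequal weight.

```python
from collections import Counter
from itertools import product

COLORS = ("green", "blue")  # assumed: two equally likely colors
SLOTS = 4                   # assumed: four marbles per draw

# Every ordered outcome is equally likely (the initial "sense" of equiprobability).
outcomes = list(product(COLORS, repeat=SLOTS))

# Grouping outcomes by their number of green marbles yields the aggregate events.
events = Counter(outcome.count("green") for outcome in outcomes)

for greens in sorted(events):
    ways = events[greens]
    print(greens, ways, ways / len(outcomes))
# 0 1 0.0625
# 1 4 0.25
# 2 6 0.375
# 3 4 0.25
# 4 1 0.0625
```

Seeing five events as five sets of outcomes, rather than as five objects, is what makes their unequal likelihoods intelligible: the whole (the event space) has to be hypothesized before the parts can be counted.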
Mathematics is an empirical practice, Abrahamson suggests, precisely because it operates through material media and embodied gestures, where our evolutionary capacity for holistic perceptual judgment of intensive quantities is solicited to negotiate with analytic mathematical models of the same phenomena. However, empiricism, as Hume showed, is a philosophy of the imagination, because “mankind is an inventive species,” bound to habit, belief, and passion (Hume, 1975, 3.2.1.19, SBN 484). For Hume, our inclinations and habits of forming a “general view” on a situation – establishing a totality, a general idea, universal concept, an institution, a rule for action, etc. – are grounded in the imagination, so that all knowledge is “fictioned from within a multiplicity” (Bell, 2008, p. 2). Through acts of abduction, the imagination goes to wholes and totalities, furnishing or gathering purpose: “To believe is to infer one part of nature from another, which is not given. To invent is to distinguish powers and to constitute functional totalities or totalities that are not given in nature” (Deleuze, 1991, p. 86). We see this in the case of the children working with colored marbles, cards, and probability concepts. Coordination demands abstract ideas, and these are obtained by contorting given relations to ends and purposes. It is the passions that play a key role in gathering the associations together into a collective, thereby forging a reason to act: “feeling posits ends and reacts to wholes” (Deleuze, 1991, p. 49). The power of the imagination is to imagine power, that is, to break power off from the exercise of power, to play with the limits of sense and sensation, liberating abstract forms by both “extending them infinitely” and “presenting the accidental as essential” (Deleuze, 1991, p. 59). This is exactly how abduction operates in this case, fueling an association that establishes the principle of reason.
Conclusion
In 1993 Arthur Jaffe and Frank Quinn published a controversial paper about the need for “serious caution” when considering the consequences of speculative mathematics. They saw a trend in the 1990s that strayed away from the “disciplined” activity of rigorous proof toward an “intuitive reasoning without proof” alongside other shifts in the organization of the field (Jaffe & Quinn, 1993, p. 1). In what seems now a quaint reference to the technology of “copying machines and electronic bulletin boards,” they raised concerns that the activity of proving was being separated from the activity of experimentation. They saw “casual reasoning” and vague intuitive approaches to the fundamental ideas of the field, mentioning the work of William Thurston who offered “a grand insight, delivered with beautiful but insufficient hints” and never published a full proof of the geometrization theorem. This shift in the field of mathematics was disastrous, they concluded, creating “dead areas” of research, where newcomers hesitated to enter, despite there being need for rigorous work after the “vigorous theorist” had claimed the space: “when a theorist claims credit, it is difficult for rigorous workers to justify the investment of labor required to make it reliable” (p. 9). Their paper spawned much debate about the conventions of reliability in mathematics and the collective burden of rigorous proof, issues still relevant today, as we rely increasingly on computer proofs and polymath crowdsourcing (Pease et al., 2020; Polymath, 2014). This debate helps frame the chapter’s commentary on creative mathematics. The case studies discussed here range across accounts of concept construction, error management, unconventional proof methods, and pattern recognition, but all of them point to one overarching theme – abductive reasoning is precisely how discrete mathematics stretches to hypothetical wholes/totalities/infinities/continuities and other vaguely structured models.
These speculative “wholes” are no less mathematically real; indeed, for many they are more real. Longo (2015, 2019) raises concerns that a mathematics subsumed by computational methods presumes a “flat” and “unidimensional” discrete-computational approach, where mathematics becomes tethered to Turing’s project to build a logical-formal computing machine. Imposing such an impoverished logic onto mathematical activity writ large supports an “arithmetical discrete/finite, decidable (and thus programmable)” world view (p. 8). In other words, Longo bemoans the separation of proof from experiment, but for very different reasons than those of Jaffe and Quinn. Zalamea (2009) similarly states that “nothing could therefore be further from an understanding of mathematical invention than a philosophical posture that tries to mimic the set-theoretical analytic and presumes to indulge in ‘antiseptic’ procedures as the elimination of the inevitable contradictions of doing mathematics or the reduction of the continuous/discrete dialectic” (pp. 183–184). Studies of abduction reveal the fundamental fallibilism of mathematical concepts and the vague inexact methods by which humans engage with a vast surreal space of hypotheses. Abduction depends on our ability to engage with not-knowing, on making thought alien to itself. It is a necessary supplement to combinatorial logic, a risky gesture that both undergirds and prehends the iterative process that is at the heart of the mathematical concept of algorithm. These case studies of abduction
reveal some of the material labor involved in mathematics, in inventing intervals and “wholes,” as we grapple with contingency and finite constraint (de Freitas & Sinclair, 2019). The speculative act of abduction patches together the discrete and the continuous, chance and necessity, the finite and the infinite, as a mode of onto-epistemic behavior (de Freitas, 2018). We should be wary of deriding “theoretical” or speculative mathematics, for it is precisely through abduction that mathematics undergoes new metamorphoses, as Zalamea (2009) suggests, “decantering,” “pouring,” “transfusing,” “filtering,” “saturating,” and “distilling” conceptual mixtures. Destabilizing the rigidity of mathematics and remixing key pairings such as continuous/discrete or symmetry/dissymmetry is where abductive cognition is most powerfully at work. In closing, I return to the strange planet of Solaris and Lem’s image of a “protoplasmic ocean-brain” where the power of the imagination is situated in larger ecological and processual metamorphosis. Abduction is not only a human faculty, but an expression of worldly activity, an ongoing onto-epistemic process. Through abduction, an eco-cognitive creative impulse mends the breaks in relational structures, remixing matter and meaning, allowing for circulating flows of anticipation and surprise, so that a genuinely new mathematics might emerge.
References
Abrahamson, D. (2012). Rethinking intensive quantities via guided mediated abduction. Journal of the Learning Sciences, 21(4), 626–649. https://doi.org/10.1080/10508406.2011.633838
Arzarello, F., Andriano, V., Olivero, F., & Robutti, O. (1998). Abduction and conjecturing in mathematics. Philosophica, 61(1), 77–94.
Badiou, A. (2006). Mathematics and philosophy. In S. Duffy (Ed.), Virtual mathematics: The logic of difference (pp. 187–208). Clinamen.
Badiou, A. (2016). In praise of mathematics (S. Spitzer, Trans.). Polity Press.
Barrena, S. (2013). Reason and imagination in Charles S. Peirce. European Journal of Pragmatism and American History, V-1. https://doi.org/10.4000/ejpap.575. https://journals.openedition.org/ejpap/575
Bell, E. T. (1940). The development of mathematics. Dover Publications.
Bell, J. (2008). Deleuze’s Hume: Philosophy, culture and the Scottish enlightenment. Edinburgh University Press.
Bell, J. (2015). Infinite pragmatism: Deleuze, Peirce, and the habit of things. In S. Bowden, S. Bignall, & P. Patton (Eds.), Deleuze and pragmatism. Routledge.
Brown, J. R. (1997). Proofs and pictures. British Journal of Philosophy of Science, 48, 161–180.
Buckner, C. (2019). Deep learning: A philosophical introduction. Philosophy Compass, 14(50). https://doi.org/10.1111/phc3.12625
Burton, L. (1984). Mathematical thinking: The struggle for meaning. Journal for Research in Mathematics Education, 15(1), 35–49.
Byers, W. (2007). How mathematicians think: Using ambiguity, contradictions and paradox to create mathematics. Princeton University Press.
Campos, D. G. (2010). The imagination and hypothesis making in mathematics: A Peircean account. In M. E. Moore (Ed.), New essays on Peirce’s mathematical philosophy. Open Court.
Cardano, G. (1545). Ars Magna, Johann Petreius, Nürnberg (English translation by Witmer, R. T. (1968). Ars Magna or the rules of algebra. M.I.T. Press. Reprinted by Dover Publications, 1993).
Cellucci, C. (2017). Varieties of maverick philosophy of mathematics. In B. Sriraman (Ed.), Humanizing mathematics and its philosophy: Essays in celebration of Reuben Hersh’s 90th birthday (pp. 223–252). Birkhäuser.
Châtelet, G. (1993/2000). Figuring space: Philosophy, mathematics and physics. Kluwer Academic.
Chollet, F. (2019). On the measure of intelligence. https://arxiv.org/abs/1911.01547
Cifarelli, V. (1997). Emergence of abductive reasoning in mathematics problem solving. Paper presented at the Annual Meeting of the American Educational Research Association (Chicago, IL, March, 1997). https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.844.6782&rep=rep1&type=pdf. Accessed 15 Mar 2022.
Colyvan, M. (2020). The ins and outs of mathematical explanation. In M. Pitici (Ed.), The best writing on mathematics 2020. Princeton University Press.
de Freitas, E. (2016a). Number sense and calculating children: Multiplicity, measure and mathematical monsters. Discourse: Studies in the Cultural Politics of Education, 37(5), 650–661.
de Freitas, E. (2016b). Material encounters and media events: What (kind of mathematics) can a body do? Educational Studies in Mathematics, 91(2), 185–202.
de Freitas, E. (2018). The mathematical continuum: A haunting problematic. The Mathematics Enthusiast, 15(1–2), 148–158.
de Freitas, E. (2021). Mathematics in the middle: The relationship between measurement and metamorphic matter. Matter: Journal of New Materialism Research, 2(2), 1–24. https://revistes.ub.edu/index.php/matter/article/view/35888
de Freitas, E., & Ferrara, F. (2015). Movement, memory, and mathematics: Henri Bergson and the ontology of learning. Studies in Philosophy of Education, 34(6), 565–585.
de Freitas, E., & Sinclair, N. (2014). Mathematics and the body: Material entanglements in the classroom. Cambridge University Press.
de Freitas, E., & Sinclair, N. (Eds.). (2019). Body studies in mathematics education: Diverse scales of mattering. Special issue of ZDM. International Journal of Mathematics Education Research, 51, 227–237. https://doi.org/10.1007/s11858-019-01052-w
de Freitas, E., & Sinclair, N. (2020). Measurement as relational, intensive, inclusive: Towards a minor mathematics. Journal of Mathematical Behavior, 59. Open access: https://www.sciencedirect.com/science/article/pii/S0732312320300602
de Freitas, E., Sinclair, N., & Coles, A. (2017). What is a mathematical concept? Cambridge University Press.
Deleuze, G. (1991/1953). Empiricism and subjectivity: An essay on Hume’s theory of human nature (C. V. Boundas, Trans.). Columbia University Press.
Deleuze, G. (1994/1968). Difference and repetition (P. Patton, Trans.). Columbia University Press.
Dreyfus, T., & Eisenberg, T. (1986). On the aesthetics of mathematical thought. For the Learning of Mathematics, 6(1), 2–10.
Fischer, H. R. (2001). Abductive reasoning as a way of world making. Foundations of Science, 6(4), 361–383.
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., et al. (2014). Generative adversarial networks. https://arxiv.org/abs/1406.2661
Gouvéa, F. (2011). Was Cantor surprised? The Mathematical Association of America Monthly, March, 198–209.
Hacking, I. (2014). Why is there a philosophy of mathematics, at all? Cambridge University Press.
Heeffer, A. (2007). Abduction as a strategy for concept formation in mathematics: Cardano postulating a negative. In O. Pombo & A. Gerner (Eds.), Abduction and the process of scientific discovery. Centro de Philosophia das ciencias da Universidad de Lisboa.
Hidayah, I. N., Sa’dijah, C., Subanji, & Sudirman. (2020). Characteristics of students’ abductive reasoning in solving algebra problems. Journal on Mathematics Education, 11(3), 347–362. https://doi.org/10.22342/jme.11.3.11869.347-362
Hoffmann, M. H. G. (1999). Problems with Peirce’s concept of abduction. Foundations of Science, 4(3). https://doi.org/10.1023/A:100967824079. https://www.researchgate.net/publication/42335883_Problems_with_Peirce’s_Concept_of_Abduction
Hoffmann, M. (2010). “Theoric transformations” and a new classification of abductive inferences. Transactions of the Charles S. Peirce Society, 46(4), 570.
Hookway, C. J. (2012). The pragmatic maxim: Essays on Peirce and pragmatism. Oxford University Press.
Hume, D. (1975/1740). A treatise of human nature (L. A. Selby-Bigge, Ed., 2nd ed. revised by P. H. Nidditch). Clarendon.
Hume, D. (1999/1748). An enquiry concerning human understanding (T. L. Beauchamp, Ed.). Oxford University Press.
Jaffe, A., & Quinn, F. (1993). Theoretical mathematics: Towards a cultural synthesis of mathematics and theoretical physics. Bulletin of the American Mathematical Society, 29(1), 1–13.
Josephson, J. R. (2000). Smart inductive generalizations are abductions. In P. Flach & A. Kakas (Eds.), Abduction and induction (pp. 31–44). Kluwer Academic.
Josephson, J., & Josephson, S. (Eds.). (1994). Abductive inference: Computation, philosophy, technology. Cambridge University Press.
Kapitan, T. (1990). In what ways is abductive inference creative? Transactions of the Charles S. Peirce Society, 26(4), 499–512.
Kapitan, T. (2000). Abduction as practical inference. In The Digital Encyclopedia of Peirce Studies. http://www.commens.org/encyclopedia/article/kapitan-tomis-abductionpractical-inference
Knipping, C. (2003). Argumentation structures in classroom proving situations. In M. A. Mariotti (Ed.), Proceedings of the Third Conference of the European Research in Mathematics Education. No pagination. Bellaria, Italy.
Komatsu, K., & Jones, K. (2021). Generating mathematical knowledge in the classroom through proof, refutation, and abductive reasoning. Educational Studies in Mathematics, 109, 567–591.
Krummheuer, G. (2007). Argumentation and participation in the primary mathematics classroom: Two episodes and related theoretical abductions. Journal of Mathematical Behavior, 26(1), 60–82.
Lakatos, I. (1976). Proofs and refutations: The logic of mathematical discovery (J. Worrall & E. Zahar, Eds.). Cambridge University Press.
Lautman, A. (2011). Mathematics, ideas and the physical real (S. B. Duffy, Trans.). Continuum.
Lem, S. (1961/2017). Solaris (B. Johnston, Trans.). Originally published in 1961.
Levesques, S. (2018). Abduction as regulator: An input from epigenetics. Transactions of the Charles S. Peirce Society, 55(2) (Spring 2019), 119–137.
Longo, G. (2015). Synthetic philosophy of mathematics and natural sciences: Conceptual analyses from a Grothendiekian perspective. Speculations. See also papers by Longo at https://www.di.ens.fr/users/longo/
Longo, G. (2019). Quantifying the world and its webs: Mathematical discrete versus continua in knowledge construction. Theory, Culture, Society, 36(6), 63–72.
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2015). The eco-cognitive model of abduction. Journal of Applied Logic, 13, 285–315.
Mancosu, P. (2008). Philosophy of mathematical practice. Oxford University Press.
McKaughan, D. J. (2008). From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles S. Peirce Society, 44(3), 446–468.
Meyer, M. (2010). Abduction – A logical view for investigating and initiating processes of discovering mathematical coherences. Educational Studies in Mathematics, 74(2), 185–205.
Moore, G. S. (1988). The emergence of first-order logic. In W. Aspray & P. Kitcher (Eds.), History and philosophy of modern mathematics. Minnesota Studies in the Philosophy of Science, 11, 95–135. University of Minnesota Press.
Moore, M. E. (Ed.). (2010). Philosophy of mathematics: Selected writings of Charles S. Peirce. Indiana University Press.
Nemirovsky, R., & Ferrara, F. (2009). Mathematical imagination and embodied cognition. Educational Studies in Mathematics, 70(2), 159–174.
Norton, A. (2008). Josh’s operational conjectures: Abductions of a splitting operation and the construction of new fractional schemes. Journal for Research in Mathematics Education, 39(4), 401–430.
Oatley, K., & Johnson-Laird, P. N. (2002). Emotion and reasoning to consistency. In S. C. Moore & M. Oaksford (Eds.), Emotional cognition (pp. 157–181). John Benjamins.
Parisi, L. (2017). Computational logic and ecological rationality. In E. Hörl (Ed.), General ecology: The new ecological paradigm (pp. 75–100). Bloomsbury.
Parisi, L. (2019). Media ontology and transcendental instrumentality. Theory, Culture, Society, 36(6), 95–124.
Pease, A., Martin, U., Tanswell, F. S., & Aberdein, A. (2020). Using crowdsourced mathematics to understand mathematical practice. ZDM: The International Journal on Mathematics Education, 52(7).
Pedemonte, B. (2007). How can the relationship between argumentation and proof be analysed? Educational Studies in Mathematics, 66, 23–41.
Pedemonte, B. (2008). Argumentation and algebraic proof. ZDM: The International Journal on Mathematics Education, 40(3), 385–400.
Pedemonte, B., & Reid, D. (2011). The role of abduction in proving processes. Educational Studies in Mathematics, 76(3), 281–303. https://doi.org/10.1007/s10649-010-9275-0
Peirce, C. S. (1958). Collected papers of Charles Sanders Peirce, vols. 1–6 (1931–1935), vols. 7–8 (1958).
Peirce, C. S. (1976). NEM. The new elements of mathematics by Charles S. Peirce (4 volumes in 5, C. Eisele, Ed.). Mouton Publishers.
Pickering, A. (2006). Concepts and the mangle of practice: Constructing quaternions. In R. Hersh (Ed.), 18 unconventional essays on the nature of mathematics (pp. 250–288).
Polymath, D. H. J. (2014). The “bounded gaps between primes” Polymath project: A retrospective analysis. Newsletter of the European Mathematical Society, 94, 13–23. arXiv:1409.8361.
Reid, D. A. (2018). Abductive reasoning in mathematics education: Approaches to and theorisations of a complex idea. EURASIA Journal of Mathematics, Science and Technology Education, 14(9). https://doi.org/10.29333/ejmste/92552
Rivera, F. D. (2008). On the pitfalls of abduction: Complicities and complexities in patterning activity. For the Learning of Mathematics, 28(1), 17–25.
Rivera, F. (2017). Abduction and the emergence of necessary mathematical knowledge. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (Springer handbooks). Springer. https://doi.org/10.1007/978-3-319-30526-4_25
Rivera, F. D., & Becker, J. R. (2007). Abduction–induction (generalization) processes of elementary majors on figural patterns in algebra. The Journal of Mathematical Behavior, 26(2), 140–155.
Shank, G. (1987). Abductive strategies in educational research. American Journal of Semiotics, 5, 275–290.
Shank, G. (1998). The extraordinary ordinary powers of abductive reasoning. Theory & Psychology, 8(6), 841–860.
Smith, D. W. (2007). The conditions of the new. Deleuze Studies, 1(1), 1–21.
Sriraman, B. (2004). Characteristics of mathematical creativity. The Mathematical Educator, 14(1), 19–34.
Stanley, D. (2002). A response to Nunokawa’s article: Surprises in mathematics lessons. For the Learning of Mathematics, 22(1), 15–16.
Thagard, P. (1981). Peirce on hypothesis and abduction. In K. L. Ketner (Ed.), Proceedings of the C. S. Peirce bicentennial international congress (pp. 271–274). Texas Tech University Press.
Wilder, R. (1952/65/2012). Introduction to the foundations of mathematics. Dover Publications.
Wilder, R. L. (2014). Mathematics as a cultural system. Elsevier Science Publishing.
Ylikoski, P., & Kuorikoski, J. (2010). Dissecting explanatory power. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 148(2), 201–219.
Zalamea, F. (2012). Peirce’s logic of continuity. Docent Press.
Zalamea, F. (2009). A synthetic philosophy of mathematics (Z. L. Fraser, Trans.). Sequence Press.
Peirce’s Conception of Mathematics as Creative Experimental Inquiry
26
Daniel G. Campos
Contents
Mathematics as Creative, Experimental Inquiry
Peirce’s Heuristic Conception of Mathematics
Conclusion
References
Abstract
C.S. Peirce defines mathematics in two ways: first as “the science which draws necessary conclusions” and second as “the study of what is true of hypothetical states of things.” Both definitions are descriptions of mathematical activity. Rather than addressing primarily the types of objects that mathematics studies, Peirce’s conception of mathematics emphasizes the nature and character of mathematics as a reasoning activity. This chapter first discusses and illustrates the importance of creative inferences akin to abduction in mathematical practice. Peirce developed an open, heuristic conception of mathematics as creative experimental inquiry. Second, this chapter classifies Peirce’s philosophy in terms of Carlo Cellucci’s distinction between open and closed views, or heuristic and foundationalist conceptions, of mathematics. This helps to locate Peirce’s view in the broader philosophical map and complements discussions framed in terms of the nature of mathematical objects.
D. G. Campos, Department of Philosophy, Brooklyn College of The City University of New York, Brooklyn, NY, USA. © Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_35
Keywords
Peirce · Mathematics · Heuristics · Experimentation · Hypothesis-making · Creativity · Imagination · Abduction
C.S. Peirce presents two definitions of mathematics in his manuscript entitled “The Essence of Mathematics” (CP 4.227–244; 1902). (In keeping with standard practice in Peirce scholarship, references for all citations from Peirce, 1932–1958, The Collected Papers of Charles Sanders Peirce, will be abbreviated henceforth as CP, followed by volume and paragraph number. When possible, the date of Peirce’s writing is included to acknowledge the historical development of his thought.) First, following his father Benjamin Peirce, he defines mathematics as “the science which draws necessary conclusions” (CP 4.228). (See Benjamin Peirce, 1870, Sect. 1.) Second, Peirce defines it as “the study of what is true of hypothetical states of things” (CP 4.233). Peirce himself acknowledges that “[i]t is difficult to decide between the two definitions of mathematics; the one by its method, that of drawing necessary conclusions; the other by its aim and subject matter, as the study of hypothetical states of things” (CP 4.238). Note, however, that both Peircean definitions of mathematics are descriptions of mathematical activity. According to its “method,” mathematics draws necessary conclusions. According to its “aim and subject matter,” mathematics studies what is true of hypothetical states of affairs. Both definitions reflect the Peircean notion that mathematics is first and foremost a practice, an activity. This chapter will show how this practice is creative and experimental so that mathematical inferences akin to scientific abduction are central to it. Since Peirce is generally regarded as the philosopher who established abduction as the inference that leads to creativity and discovery in scientific inquiry, it is worthwhile to discuss in what sense mathematics is a creative, experimental practice, what role creative inferences akin to abduction play in it, and the way in which Peirce’s philosophy of mathematics can be classified accordingly. 
Peirce’s conception of mathematics is in fact difficult to locate in terms of various classifications of philosophies of mathematics. The difficulty arises from the fact that traditional classifications into realist, formalist, conceptualist, structuralist, and intuitionist notions mainly concern the types of entities or objects that mathematics studies. (For an introductory account of various philosophies of mathematics, classified according to the objects of mathematical study, see Horsten (2021).) Peirce’s conception of mathematics emphasizes instead the nature and character of mathematics as a reasoning activity. In his article “The Logic of Mathematics in Relation to Education” (CP 3.553–562; 1898), Peirce begins by exemplifying the main philosophical question about mathematics: “In order to understand what number is, it is necessary first to acquaint ourselves with the nature of the business of mathematics in which number is employed” (CP 3.553). In other words, before asking about the nature of a mathematical object, we must first investigate the nature of the inquiry in which it is embedded. Peirce’s philosophical conception
26 Peirce’s Conception of Mathematics as Creative Experimental Inquiry
of mathematics should be based on the nature of the mathematicians’ reasoning activity, and not on the nature of mathematical objects as it was alternatively conceived by most philosophers in the late nineteenth and early twentieth centuries. Peirce’s philosophical focus is on practice; we must examine carefully the nature of mathematical reasoning itself. This is in line with the growing interest, among philosophers of mathematics, in characterizing mathematics as a human practice (Ferreirós, 2015). Carter (2014) follows Peirce in characterizing this practice as investigating “hypothetical states of things,” effectively linking him to this philosophical discussion. In what follows, the chapter first provides an overview of Peirce’s view on mathematical practice. Then it classifies his view as an open, heuristic conception that emphasizes creativity and experimentation, including abductive insight.
Mathematics as Creative, Experimental Inquiry

Peirce’s concern with mathematical reasoning is logical in his broad sense of the term, since he conceives of logic as a threefold science of good reasoning consisting of the following branches:

(i) “Speculative grammar,” a label due to Duns Scotus, which consists of the “analysis of what kinds of signs are absolutely essential to the embodiment of thought” (EP 2: 257; 1903). (References for all citations from Peirce, 1992–1998, The Essential Peirce, will be abbreviated henceforth as EP, followed by volume and page number. The date of Peirce’s writing is also included.) It is the study of what general types of signs are necessary for reasoning to be possible.

(ii) “Logical critic,” which is the study of “all the different elementary modes of getting at truth and especially all the different classes of arguments . . . [and of] their properties so far as these properties concern [the] power of the arguments as leading to the truth” (EP 2: 256; 1903). Roughly, for Peirce the elementary types of arguments that can lead to the truth in various degrees are deduction, induction, and abduction. Here we see that deductive logic is but one sub-branch of the whole science.

(iii) “Methodeutic,” which “is the last goal of logical study” and consists in “the theory of the advancement of knowledge of all kinds” (EP 2: 256; 1903). This is the branch of logic that, on the basis of the other two, ought to reveal the methods for breakthrough discovery and for the growth of knowledge – mathematical, scientific, and otherwise.

It is mathematical methodeutic that Peirce declared to be a mystery deserving a lifetime of research, but this study requires delving into the other two “logical” concerns in Peirce’s scheme. Consequently, Peirce’s views on the logic of mathematical inquiry are manifold.
They concern the nature and types of mathematical representations or signs, the patterns and forms of mathematical reasoning, and the methods for mathematical discovery and innovation. (This section summarizes a more comprehensive discussion of Peirce’s philosophy of mathematical reasoning presented in Campos (2010).) Peirce thus stands out not only as a philosopher who devoted significant effort to studying the logic of mathematical inquiry but also as one who thought that investigating this issue was the main way to approach the philosophical study of
D. G. Campos
mathematics. Before asking other philosophical questions about mathematics, first and foremost we must ask about the nature of mathematical activity. In this activity, creative inferences akin to abduction play a central role. Mathematics is a science of observation and experimentation. Instead of experimenting upon actual physical reality like the physicist, chemist, or biologist, however, the mathematician experiments upon icons representing purely hypothetical states of things. In fact, mathematical experimentation can be characterized as the continuous interplay between imagination and judicious observation for the investigation of pure hypotheses. In mathematical reasoning, the imagination creates experimental diagrams that function as signs that are then perceived, interpreted, judged, often transformed, reimagined, reinterpreted, and so on, in a continuous process. Experimental hypotheses are imaginative suggestions that become subject to logical scrutiny as possible keys to the solution of a theorematic deduction, that is, a deduction that requires the creation and examination of specially constructed diagrams and not simply specifying the corollarial consequences of previously discovered theorems. (In his discussion on the “Essence of Mathematics,” Peirce distinguishes between two kinds of necessary reasoning, namely, “corollarial” and “theorematic” reasoning, and he argues that theorematic reasoning characterizes mathematics as an activity (CP 4.233; 1902). In fact, he considers his distinction between these two kinds of proof to be his first real discovery about the methods of mathematical inquiry (NEM 4: 49; 1902). For more detailed discussion of this distinction, see Hintikka (1980), Levy (1997), and Campos (2010).) Once conceived, the experimental diagrams act like objects for observation. 
They resist the mind, as it were, since we cannot think just anything we want about them, but rather we must interpret the mathematical relations the diagrams present to the mind. Moreover, these diagrams must be evaluated as solutions to the mathematical problem. In this respect, the process of mathematical reasoning is akin to abduction in the natural sciences. In a 1908 description of the formulation of an explanatory conjecture, Peirce writes that the inquirer “provisionally holds [the hypothesis] to be ‘Plausible’; this acceptance ranges, in different cases, – and reasonably so, – from a mere expression of it in the interrogative mood, as a question meriting attention and reply, up through all appraisals of Plausibility, to uncontrollable inclination to believe” (CP 6.469, 1908). In mathematics, the imagined experimental diagram is provisionally suggested, and it then immediately becomes an object for judicious observation. This judicious observation beholds the hypothesis with a degree of plausibility that ranges from “no” to “maybe?” to “yes!,” and reasoning thus proceeds to further experimentation or to generalization, when the experiment is observed to provide a general demonstration. This does not mean that experimental hypothesis-making in mathematics is exactly the same as scientific abduction. There are similarities, but there are also relevant differences resulting from the different constraints placed upon reasoning by the mathematical and scientific objects of study and, especially, from the different aims of mathematical and scientific activity: to explore what would be true of hypothetical states of things versus to discover what is actually true of the existing
natural world. Yet, there are various interrelated species of creative conjecturing in mathematics and the sciences. In this chapter, the focus is on the kinship between creative mathematical experimentation and scientific abduction. In particular, it emphasizes the role of the semiotic imagination in mathematical reasoning. The imagination consists in “the power of distinctly picturing to ourselves intricate configurations” (MS 252; n.d.). (Citations abbreviated as MS are references to the unpublished Harvard manuscripts of Charles Sanders Peirce. The manuscript number is according to the Robin catalogue (see Robin, 1967). The page number is according to the numbering assigned by the Institute for Studies in Pragmaticism, Texas Tech University.) That is, the imagination is a semiotic ability to create, recreate, and transform signs. For Peirce, imagination and perception are indeed continuous with and shade into each other. Peirce writes that in relation to knowledge and belief, that is, as a matter for logic, it is not valuable “to draw a hard and fast line of demarcation between perception and imagination,” even if as a matter of physiological psychology the distinction may be justified (CP 7.646; 1903). For our purposes, this is because the semiotic outcomes of both imagining and perceiving are signs – signs that flow in the course of continuous, ongoing reasoning processes. The mathematical imagination, in particular, consists first in the ability to create original mathematical diagrams in order to represent an innovative hypothetical world and to do this “distinctly” in the epistemological sense of being able to determine its properties with exactitude. The imagination is necessary for the possibility of innovative mathematical reasoning because without its creative work, the inquirer would have no world to explore, no determinate hypothetical state of affairs to investigate with the rigor of necessary reasoning. 
The complementary faculties of concentration and generalization would have no subject to investigate, no mathematical matter to experiment upon and observe, if there were no imagined mathematical hypotheses. Imagination is the key to originality, in mathematics as in all scientific and philosophical reasoning, and so it is the primary source of breakthrough discovery. Peirce in fact warns against those who would underestimate the importance of original imagination in mathematics, especially those philosophers and mathematicians who vainly glorify the skill of necessary demonstrative reasoning as if it were the highest capacity for mathematical reasoning, without realizing that demonstrative skill depends on the imagination (CP 4.461; 1908). The mathematical imagination informs an original world, providing it with a distinct and determinate structure. Its creative achievement consists in forming a whole that is not simply an aggregation of parts. The highest creations of the mathematical imagination are hypothetical worlds so rich in possibility that their properties may actually surprise the inquirer: “The pure mathematician deals exclusively with hypotheses. Whether or not there is any corresponding real thing, he does not care. His hypotheses are creatures of his own imagination; but he discovers in them relations which surprise him sometimes” (CP 5.567; 1901). Thus, it is due to the rich originality of the imagination that mathematics is also a science of discovery – even though hypothetical mathematical worlds are created, they are not closed systems where
everything is determined by rigid self-evident axioms, but they are rather open-ended creations that may be subject to surprising discoveries. In its originative function, then, the mathematical imagination frames hypothetical states of things. Within the context of hypothetical frameworks, moreover, the imagination also plays a transformative, experimental role. For Peirce mathematics is a science of observation and experimentation upon diagrams akin to the physical sciences. In “Prolegomena to an Apology for Pragmaticism,” he writes that “one can make experiments upon uniform diagrams, and when one does so, one must keep a bright lookout for unintended and unexpected changes thereby brought about in the relations of different significant parts of the diagram to one another. Such operations upon diagrams, whether external or imaginary, take the place of the experiments upon real things that one performs in chemical and physical research. Chemists have ere now, I need not say, described experimentation as the putting of questions to Nature. Just so, experiments upon diagrams are questions put to the Nature of the relations concerned” (CP 4.530; 1906). Even if the objects of mathematical study are entia rationis or beings of reason, mathematics is nonetheless an experimental science in which hypothesis-making is crucial. An example will help to clarify the foregoing discussion. There are several geometrical ones, but Peirce also provides his own examples of theorematic deductions in non-geometrical proofs in order to clarify the scope and nature of this kind of creative, experimental reasoning in mathematics. (For original geometrical examples, see Campos (2009b, 2010). For a non-geometrical one by Peirce, see (NEM 4: 5–8, MS 691; 1901) and commentary by Stephen Levy (1997). In all of them, the upshot is that conceptual invention for problem-solving is the work of the mathematician’s imagination.)
Let us consider one through which Peirce explicitly seeks to illustrate the claim that mathematical demonstration involves a judgment, akin to perceptual judgment, that refers to a hypothesis of our own creation (CP 7.659; 1903). Peirce begins with a premise which, he writes, “is true of the whole numbers,” thereby specifying the mathematical framework – the theory of whole numbers – within which he is working (CP 7.660; 1903). The premise is, “If any predicate, P, be true of the number 0, zero, but not of all numbers, then there must be two numbers M and N such that N is next greater than M, and P, while true of M, is not true of N” (CP 7.660; 1903). Let G stand for “next greater than” and P for “some predicate.” Letting the universe of discourse be the whole numbers and using contemporary algebraic notation, we can diagram this premise as: (P0 & ∼ (∀x) Px) ⊃ (∃m) (∃n) (Gmn & (Pm & ∼ Pn)) . From this premise, Peirce proceeds to demonstrate two results, the first corollarial (CP 7.661; 1903) and the second theorematic (CP 7.662–664; 1903). The corollarial proposition is, “There is no number except zero that is not next greater than some number other than itself” (CP 7.661; 1903). Letting “a” be any whole number other than 0, we must prove that ∼ (∃a) (∼ (∃m) Gma). Suppose, to the contrary, (∃a) (∼ (∃m) Gma). Let the predicate P be “is not a.”
Since ∼Pa, then it follows that [P0 & ∼ (∀x) Px], which is the antecedent of our original premise. But ∼ (∃n) ((∃m) (Gmn & ∼ Pn)), since by hypothesis (∃a) (∼ (∃m) Gma). This contradicts our original premise. Therefore, by reductio ad absurdum, ∼ (∃a) (∼ (∃m) Gma) for a ≠ 0. This is a corollarial deduction since it does not involve any ideas – embodied in signs – other than those already stated in the premise. In other words, there is no imaginative introduction of any new formal relations embodied in an appropriate diagram. This does not mean that the demonstration is trivial or obvious. An effort of observation and interpretation of the algebraic diagram that expresses the corollarial deduction is still necessary. There is no need, however, to imagine any formal relations not stated in the original premise. The theorematic proposition, in turn, is: “There is no number except zero that cannot be reached from zero by a finite multitude of successive steps, each passing from a number to [a] number next greater than it” (CP 7.662; 1903). Here we find a theorematic step – the introduction of the idea of a “finite multitude” of steps, from a whole number to a “next greater” whole number, or from predecessor to successor, by which any whole number a ≠ 0 can be reached from zero. A finite multitude is the multitude of any collection such that if a finite number of members were added to the collection, the collection would remain finite. We will return to clarify this theorematic idea and its embodying icon. Let us first see its place in the demonstration of the proposition, again by reductio ad absurdum (CP 7.664; 1903). Suppose there is such a number “a” where a ≠ 0. Let P be the predicate “is either zero or can be reached from zero by a finite collection of steps from a number to a next greater number.” Then, (P0 & ∼ Pa). It follows that (P0 & ∼ (∀x) Px), which again is the antecedent of our original premise.
However, ∼ (∃m) (Pm & (∃n) (Gmn & ∼ Pn)) since this would contradict the definition of finite multitude; in other words, to suppose the contrary “would be to suppose a finite collection of steps would cease to be finite on the adjunction of one more.” This again contradicts our original premise. Therefore, the theorematic proposition follows. The theorematic innovation consists in introducing the idea of reaching, starting from zero, another whole number by a finite multitude of steps – each step consisting in passing from a number to a next greater number. The relation “is the next number greater than” is already present in our original premise. But the idea of a finite multitude of steps is not present in that premise; it is imaginatively introduced into our reasoning. In this theorematic reasoning, there is implicitly an imagined diagram of a finite succession of steps from a number to a next greater number. The mathematician may imagine, for example, a continuous number line, in which each whole number is indexed by a distinct mark. Then she may imagine any whole number a ≠ 0 as being reachable from 0 by a finite multitude of steps along these indexical marks. In this case, the experimental hypothesis consists in imagining a diagram of such a finite succession of whole numbers, each successor being next greater to its predecessor.
Once this diagram is imagined, the hypothesized proposition is conceivable. From a Peircean standpoint, there is no way of conceiving this theorematic proposition without imagining it as being embodied in some hypothetical diagram, be it graphical or algebraic. Innovative mathematical conception is inseparable from experimental diagrammatic imagination. The mathematician must check whether the experimental hypothesis leads to a deductively rigorous demonstration. The mathematician thus observes whether the relations signified by the algebraic array above do hold with generality; that is, she observes the diagrammatic demonstration and sees that the array displays the contradictions and reveals the reductio ad absurdum. As she judiciously observes it, she forms other “interpretants” of the algebraic expression in her mind, that is, other diagrams which are themselves the imaginative embodiment of the formal relations stated in the theorem. Moreover, this experimental diagrammatic reasoning relies on an already existing framework of mathematical definitions, axioms, and theorems. In particular, it relies on the conception of “finite multitude.” In this context, Peirce defines the conception of “finite multitude” by way of a syllogistic diagram. He writes:

By a “finite” multitude is meant the multitude of any collection [for which, that collection] being substituted for ‘Hottentot’ in the following syllogism, this syllogism would be valid:
Every Hottentot kills a Hottentot;
No Hottentot is killed by more than one Hottentot;
Therefore every Hottentot must be killed by a Hottentot. (CP 7.662; 1903)

On the basis of this syllogism, Peirce proceeds to demonstrate that “if a single individual is joined to a finite collection, the collection will remain finite” (CP 7.663; 1903). As we have seen, this property of a finite collection is introduced, as a finite multitude of steps, in the theorematic demonstration above. The syllogism above is itself a diagram. It may be expressed algebraically with perfect generality, but the very same formal relations are already represented in the syllogism above. In the end, as this example suggests, Peirce provides us with an original notion of mathematical rationality itself, one in which conceptual innovation cannot be severed from imaginative diagramming. Hintikka does recognize this when he anticipates and responds to a possible objection to Peirce. He notes that Hilbert’s formalism and Frege’s views may suggest that strict logical reasoning takes us beyond the imagination to strict symbolicity (Hintikka, 1980, pp. 312–313). Hintikka points out, however, that Peirce’s insight into the necessary iconic elements involved in logical symbolism anticipates the insights of later logicians who emphasize “the pictorial (model-theoretical) elements of logical inferences and other logical operations” (Hintikka, 1980, p. 313). In Peirce’s terms, even in strictly symbolic proofs, the mathematician creates, experiments upon, and observes diagrams of formal relations. In this reconception of mathematical rationality as necessarily involving the creative, experimental work of the imagination, we find a fresh perspective for an alternative approach to the study of the logic of ampliative inference in mathematics, a study akin to the investigation of abductive inference in the natural sciences.
In the context of theorem-proving, any such study ought to take into consideration the constraints under which the imagination functions as well as the ways in which it creates and modifies mathematical icons. But the Peircean conception of mathematical reasoning is not limited to theorem-proving. Thus, the next section reclassifies Peirce’s philosophy of mathematics on the basis of its creative, experimental, problem-solving nature.
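
The whole-number example discussed in this section can also be explored computationally. The following Python sketch is only an illustration under stated assumptions, not Peirce's own notation or method. It restricts the search to a finite initial segment 0..bound (an assumption made so that the search terminates), looks for the step from m to n that the premise guarantees, recovers the corollarial result by taking P to be "is not a," and checks by brute force that the syllogism defining a finite multitude holds for small collections. The names find_step, predecessor_via_premise, and syllogism_holds are hypothetical and chosen only for this illustration.

```python
from itertools import product


def find_step(P, bound):
    """Return (m, n) with n = m + 1, P(m) true and P(n) false, else None.

    Only the whole numbers 0..bound are examined, so this illustrates the
    premise on a finite initial segment; it does not prove it in general.
    """
    for m in range(bound):
        if P(m) and not P(m + 1):
            return m, m + 1
    return None


def predecessor_via_premise(a, bound):
    """Corollarial result: for a != 0, take P to be 'is not a'.

    P is true of 0 and false only of a, so the step found by the premise
    must end at a, which exhibits a number that a is next greater than.
    """
    if not 0 < a <= bound:
        raise ValueError("a must satisfy 0 < a <= bound")
    m, n = find_step(lambda x: x != a, bound)
    assert n == a
    return m


def syllogism_holds(size):
    """Brute-force check of the syllogism for a collection of `size` members.

    Each tuple f encodes 'member i kills member f[i]'. If nobody is killed
    by more than one member (f is one-to-one), then everybody must be
    killed by somebody (f is onto).
    """
    members = range(size)
    for f in product(members, repeat=size):
        one_to_one = len(set(f)) == size
        onto = set(f) == set(members)
        if one_to_one and not onto:
            return False
    return True
```

For example, find_step(lambda x: x < 5, 10) returns the pair (4, 5), and predecessor_via_premise(7, 10) returns 6. Such checks only illustrate the formal relations on small cases; the generality of the result still rests on the demonstration itself.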
Peirce’s Heuristic Conception of Mathematics

Scholarly attempts at classifying Peirce’s philosophy of mathematics in terms of other traditions focused on the nature of mathematical objects are helpful but insufficient to characterize and classify his philosophy. Examples of work in need of complement include Christopher Hookway’s efforts to classify Peirce as a structuralist (Hookway, 2010), Matthew Moore’s careful linking of such structuralism to Peirce’s mature realism (Moore, 2010), and Ahti-Veikko Pietarinen’s characterization of Peirce’s pragmaticism as a philosophy of mathematics that carves a middle way between Hilbert’s axiomatic foundationalism and Brouwer’s intuitionism (Pietarinen, 2010). They illumine important aspects of Peirce’s thought but must be complemented in order to show more fully the originality and character of Peirce’s own philosophy of mathematics. An alternative classification, then, may prove helpful in order to locate Peirce’s view along the spectrum of the various philosophies of mathematics. Keeping in mind that Peircean mathematics is a reasoning activity, it can be evaluated in terms of Italian philosopher Carlo Cellucci’s classification of philosophies of mathematics as closed versus open views or as foundationalist versus heuristic conceptions (Cellucci, 2000, 2002). This approach is helpful to classify Peirce’s philosophy of mathematics in a way that emphasizes the centrality of creative inferences, akin to abduction, in mathematical practice. According to Cellucci (2000), the closed view sees mathematics as close-ended inquiry. Mathematical theories are founded upon fixed principles given once and for all and which can only be accepted or rejected as a whole, so that the standard way of developing mathematics is by deriving consequences from the axioms of a given system.
Mathematics proceeds by way of the axiomatic method; that is, mathematical reasoning consists in deductively unfolding what is already implicitly contained in the axioms of the given system or in replacing the latter with a new system by conceiving of entirely new axioms. Cellucci classifies the views of Kant, Frege, and Hilbert as closed views. From a Peircean standpoint, there is no place for creative inferences akin to abduction in these views. Only deduction is relevant. In contrast, the open view regards mathematics as open-ended inquiry. Mathematical theories do not have absolute foundations on permanent axioms but hinge upon provisional hypotheses. Mathematical theories are dynamic, that is, capable of being transformed to represent changeable states of affairs. Within the context of these open systems, solving a problem in a particular field of mathematics may require concepts and methods from other fields, so that the standard way
of developing mathematics is by problem-solving via the “analytical” method. Mathematics proceeds by way of analysis in which proof-search begins with a hypothesis and not with a fixed, self-evident axiom; that is, mathematical reasoning consists in solving a problem by provisionally assuming a hypothesis and showing that it leads to an adequate solution (Cellucci, 2000). Under Cellucci’s initial classification, Peirce’s conception of mathematics is an open view. For Peirce mathematical principles, axioms, or postulates are not self-evident truths that determine the true content of a closed system; they are rather provisional hypotheses that frame an open-ended system. Within that system, we can pose problems and search for hypothetical solutions by way of mathematical experimentation. Thus, we can speak of two senses of “hypothesis” in Peirce’s open-ended view – first, the “framing” hypotheses that create a mathematical state of things, and second, “analytical” or “experimental” hypotheses that we provisionally assume in order to solve problems within the system. Moreover, mathematical systems need not be axiomatic. A mathematical system may be a non-axiomatic hypothetical model of, say, a theoretical or practical problem. Mathematicians, then, may create a diagrammatic model and experiment upon it for the purposes of hypothetical problem-solving, without a need to axiomatize an entire system. Peircean mathematical systems are also dynamic because framing hypotheses may always be modified or recreated and analytical hypotheses may always be introduced – via analogy, for instance – from one field of mathematics into another. Peirce’s position admits the open view that problem-solving may be a potentially infinite process of posing increasingly more general hypotheses and considering their consequences (see Cellucci, 2000, p. 162). Let us take the system of Euclidean geometry as an example.
A closed view would interpret a change of the fifth postulate as a toppling of the Euclidean system. The postulate states that “if a straight line falling on two straight lines make the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles” (Euclid, 1956, Vol. 1, p. 155). If the postulate is “false,” the system is “false,” and a new, “true” system must be constructed on the basis of “true” axioms. However, for Peirce this position would be inadequate to describe the inquiring practice of mathematicians. According to his open view, in modifying the fifth postulate, a mathematician is simply changing a framing hypothesis and reconceiving a mathematical state of things. Now the mathematician studies what would be true of the reconceived hypothetical state of affairs, exploring new possibilities and performing new experiments that may call for new experimental hypotheses. Thus, Peirce would not claim that, qua mathematics, the Euclidean system is false while the non-Euclidean system is true. They are two interrelated open systems that we can investigate through experimentation. The creation of a new system need not imply the destruction or “falsification” of the old one; it is rather the conception of an alternative “would-be world.” Mathematicians do not set out to create closed systems on the basis of axioms; they set out to explore what follows from provisional hypotheses.
Having discussed this initial distinction between closed and open views of mathematics, in his book Filosofia e Matematica (2002), Cellucci further develops and expounds his position, this time in terms of a thorough critique and rejection of the “foundationalist” conception of mathematics in favor of what he calls the “heuristic” conception. The remainder of this section states some of his main theses and discusses how they apply to Peirce’s philosophy of mathematics as a heuristic conception overall, even if there are points of disagreement between Cellucci and Peirce. First, according to Cellucci’s heuristic conception, the main problem in the philosophy of mathematics is not the justification of mathematical knowledge but discovery, where mathematical discovery is an activity that encompasses but is not reducible to justification (Cellucci, 2005, pp. 18–19). He writes:

Indeed, discovery requires making hypotheses capable of solving given problems, and in order to choose the hypotheses one must carefully evaluate the reasons for and against them. The evaluation process is intertwined with the process of hypothesis-formation, since one must compare alternative hypotheses in order to select one of them. This blurs the distinction between discovery and justification. (Cellucci, 2005, p. 18)

Second, the heuristic investigation of mathematics is relevant to progress in mathematics itself, since the main philosophical problem of such an investigation is mathematical discovery and it aims at improving or inventing methods of discovery (Cellucci, 2005, p. 20). Peirce, as we have seen, writes that the question of how mathematicians can guess on the ways to solve a mathematical problem or to carry out the demonstration of a proposed mathematical result “is a mystery that deserves a life-time of study” (NEM 4: 215; n.d.). (References for all citations from Peirce (1976), The New Elements of Mathematics, will be abbreviated henceforth as NEM, followed by volume and page number. When possible, the date of Peirce’s writing is included.) In the article “On the Algebra of Logic,” he expresses the hope that his work “may prove a first step toward the resolution of one of the main problems of logic, that of producing a method for the discovery of methods in mathematics” (CP 3.364; 1885). This is not only an important problem but the main problem in Peirce’s philosophy of mathematics as a logic of inquiry. Peirce dedicated so much intellectual effort toward research into the methods of discovery and conceptual innovation in mathematics and science that it is reasonable to say that it was his central philosophical concern. Mathematics is a creative, innovative intellectual practice, and the main philosophical problem concerning mathematics is to account for the logic of creation, innovation, and discovery in mathematics. From a Peircean standpoint, in order to emphasize that mathematics involves both creativity and discovery, though, we might say that the main problem for philosophy is the logic of mathematical inquiry. 
Third, instead of the foundationalist focus on the question of the existence of mathematical objects, the heuristic conception focuses on the problem of how to introduce hypotheses for problem-solving: “Mathematical objects are simply hypotheses introduced to solve specific problems. To speak of mathematical objects is misleading because the term ‘object’ seems to designate the things the
D. G. Campos
investigation is about, whereas hypotheses are the tools for the investigation. The latter is [sic] intended not to study properties of mathematical objects but to solve problems” (Cellucci, 2005, pp. 19–20). Here Peirce’s agreement is only partial. For him, the characteristic activity of the mathematician is to study hypothetical states of things, and this practice centrally involves the introduction of experimental hypotheses to solve problems. The hypothetical states of things are often, though not always, created to investigate the general features of actual states of things. In other words, empirical problems often, though not necessarily, serve as enabling conditions for the possibility of mathematical inquiry. He would agree, then, that mathematicians mainly aim at solving problems and not at investigating the properties of mathematical objects. In this sense, his philosophy is heuristic. However, the claim that mathematical objects are merely hypotheses is too strong for Peirce. In this sense, he would reject Cellucci’s nominalism for a view according to which the object of mathematical study is the “form of a relation” (CP 4.530; 1906). The question of the nature of these objects, however, is ancillary to the question of what mathematicians do and how they reason. Fourth, the method of mathematics is analytic, not axiomatic (Cellucci, 2005, pp. 24–26). This point has been discussed above. For Peirce the axioms of mathematical theories are not self-evident truths that serve as foundations for knowledge but rather hypotheses that frame a general state of things for experimental investigation. Let us now delve into Cellucci’s extreme rejection of mathematical axioms and discuss Peirce’s more nuanced position. 
Cellucci states his rejection of the foundationalist view on mathematical method by appealing to mathematical experience thus: The idea that, to prove a proposition, you start from some first principles, derive some results from those axioms, then, using those axioms and results, push on to prove other results, contrasts with mathematical experience which shows that in mathematics one first formulates problems, then looks for hypotheses to solve them. Thus one does not proceed, as in the axiomatic method, from axioms to theorems but proceeds, as in the analytic method, from problems to hypotheses. (Cellucci, 2005, p. 25)
Peirce expresses a similar view when he writes that the “business of the mathematician lies with exact ideas, or hypotheses, which he first frames, upon the suggestion of some practical problem, then traces out their consequences, and ultimately generalizes” (MS 188: 2; quoted in De Waal, 2005, pp. 287–288). Here the view is stated with regard to practical problems, but the problems may also be theoretical for Peirce (Campos, 2009a, pp. 412–415). The crucial point is that mathematicians make hypotheses in order to solve problems (Campos, 2010; Carter, 2014). As we will see, the tracing out of consequences need not be deductive in procedure, even if the logic of a completed mathematical proof is deductive. However, what about the possibility that the axiomatic method may provide a strategy for finding proofs and thus solving problems? Cellucci rejects it outright, again by appealing to mathematical experience: The idea that the axiomatic method provides a strategy both for finding and remembering proofs also contrasts with mathematical experience, which shows that proofs based on the axiomatic method often appear to be found only by a stroke of luck, and seem artificial and
26 Peirce’s Conception of Mathematics as Creative Experimental Inquiry
difficult to understand. Showing only the final outcome of the investigation, established in a way that is completely different from how it was first obtained, such proofs hide the actual mathematical process, thus contributing to make mathematics a difficult subject. (Cellucci, 2005, p. 25)
Peirce would agree only in part. He would agree that presenting mathematics as the result of deductive demonstration from axioms or first principles obscures the nature of mathematical reasoning. In fact, it is such a mode of presentation that kills students’ imaginations and interest in mathematics. In a letter dated March 11, 1895, he writes to his brother on the aims of mathematical education to be accomplished through an introductory geometry textbook: “That instruction in geometry ought to begin with awakening the geometrical imagination, both psychology and experience show. The first example of proof offered should be a good specimen of real mathematical reasoning and not the kind of thing which astounds the pupil by demonstrating at length something obvious at glance” (L 339). (References to Peirce’s letters are abbreviated as L followed by their Robin Catalogue (1967) number.) An example of real mathematical reasoning would be one in which imaginative hypothesis-making is necessary to solve a problem. However, from a Peircean standpoint we might say that Cellucci dismisses too strongly the possibility that axioms themselves may be treated as framing hypotheses and that exploring the consequences of such framing hypotheses may lead to mathematical discovery also. This does not mean that all framing hypotheses are axioms. An example is the idea of the “fundamental probability set” which served as the framing hypothesis of early mathematical probability without being an axiom (Campos, 2009a). It does mean that axioms may be regarded as framing hypotheses and that by modifying or transforming these axioms mathematical discovery may follow. An example is the aforementioned case of non-Euclidean geometries: by rejecting the fifth postulate of Euclidean geometry, mathematicians were able to discover new geometrical systems. Thus, the Peircean conception of mathematics is even richer, heuristically, than Cellucci’s own conception. 
Cellucci’s next three theses regard the rationality of mathematical method. The fifth thesis is that the characteristic practice of mathematics is not theorem-proving – where the choice of axioms is non-mathematical but merely intuitive – but rather problem-solving, where the choice of hypotheses is rational (Cellucci, 2005, pp. 23–24). Sixth, mathematical discovery is not irrational, merely psychological, or intuitive, but rather rational and logical (Cellucci, 2005, pp. 28–29). Seventh, the logic of mathematical reasoning as problem-solving is non-deductive as well as deductive (Cellucci, 2005, pp. 26–28). The central idea here is that the logic of mathematical problem-solving is rational and inferential, not intuitive, and its logic is largely non-deductive. The rejection of intuition as unmediated cognition is a cornerstone of Peircean epistemology. In his 1868–1869 “Cognition Series” in the Journal of Speculative Philosophy, Peirce provides a thorough rejection of the possibility of intuition as unmediated cognition and argues that all cognitions are mediated by previous cognitions (W 2: 193–272). (References for all citations from Peirce (1982), Writings of Charles S. Peirce, will be abbreviated henceforth as W, followed by
volume and page number.) In the case of mathematics, this means that we can make no appeal to intuitive cognition of axioms as absolute first principles. Such cognitions must be mediated by previous cognitions embodied in signs and are of the nature of inferences. In the case of mathematics, these signs are “diagrams,” that is, icons that embody and present directly the relations that conform to a general hypothesis. In Peirce’s words, “the icon is very perfect in signification, bringing its interpreter face to face with the character signified. For this reason, it is the mathematical sign par excellence” (EP 2: 307; 1904?). Peirce accordingly characterizes mathematical reasoning as experimentation upon “diagrams.” This experimentation consists in making hypotheses to solve problems or find proofs, and it is creative, inventive, and imaginative but still rational, as there is no strong distinction between reason and imagination (Campos, 2010). Moreover, the logic of mathematical reasoning is largely non-deductive. In the case of “theorematic” deduction, creative inferences akin to abduction are necessary in order to introduce new concepts that solve a problem (NEM 4: 28, 1902; NEM 4: 288, 1903?; Campos, 2010; Hookway, 1985, pp. 193–194). And the generalization of mathematical results is itself inductive, as it is the result of a diversity of experiments that lead to the establishment of a general conclusion (Hookway, 1985, p. 203; Eisele, 1979, p. 5). Peirce, however, would provide a more comprehensive view of mathematical activity than Cellucci in not admitting such a strong contrast between problem-solving and theorem-proving. Cellucci, in advancing his theses, has dichotomizing tendencies that Peirce avoids. The latter would not set theorem-proving as a practice opposed to problem-solving, but rather as a form of problem-solving. The proving of propositions in Euclid’s Elements, for example, can be a form of problem-solving. 
On the other hand, beyond defending his heuristic theses, Cellucci expounds how the analytical search for hypotheses may proceed by a variety of non-deductive techniques, including mathematical analogy (reducing one problem to another by showing that the elements of an existing problem are analogous to those of another which has already been solved or is easier to solve), diagramming (in the ordinary sense of investigating an actual figure), generalization (provisionally assuming a particular case to be a general rule), particularization (instantiating a general rule in a particular case), induction, abduction, and so on (Cellucci, 2002). The repertoire of “analytical” techniques for hypothesis-making discussed by Cellucci is extensive. In this sense, Cellucci’s heuristic view of mathematics can serve to strengthen the Peircean account of mathematical reasoning by identifying and describing a variety of heuristic tools for experimental hypothesis-making.
Conclusion
From the foregoing analogical discussion, therefore, we may conclude that Peirce developed a heuristic conception of mathematics as creative experimental inquiry. Peirce’s philosophical emphasis is on rational discovery which encompasses justification, not on rational justification severed from irrational discovery; on experimentation, not axiomatization; on hypothesis-making, not mechanical or “corollarial” deduction; on rational ampliative inference, not unexplainable intuition; on problem-solving which encompasses theorem-proving, not on deducing theorems from self-evident axioms; on hypothetical or “would-be” truth, not on absolutely certain truth. This places him in a philosophical tradition that is radically different from Kant, Frege, and Hilbert’s foundationalism and from Brouwer’s intuitionism, for example. By attempting to describe and classify Peirce’s philosophy of mathematics from standpoints and perspectives more akin to his own, Peirce scholars will make stronger contributions to the overall understanding of his own philosophy than by trying to insert him in traditions and discussions foreign to Peirce’s own central concerns. In proposing that Peirce’s conception is open-ended and heuristic in the Cellucian sense, this chapter points out one avenue to make progress in that direction – an avenue that emphasizes the importance of creative conjecturing in mathematical reasoning.
References
Campos, D. G. (2009a). The framing of the fundamental probability set: A historical case study on the context of mathematical discovery. Perspectives on Science, 17(4), 385–416.
Campos, D. G. (2009b). Imagination, concentration, generalization: Peirce on the reasoning abilities of the mathematician. Transactions of the Charles S. Peirce Society, 45(2), 135–156.
Campos, D. G. (2010). The imagination and hypothesis-making in mathematics: A Peircean account. In M. Moore (Ed.), New essays on Peirce’s mathematical philosophy (pp. 123–125). Open Court.
Carter, J. (2014). Mathematics dealing with ‘hypothetical states of things’. Philosophia Mathematica, 22(2) Series III, 209–230.
Cellucci, C. (2000). The growth of mathematical knowledge: An open world view. In E. Grosholz & H. Breger (Eds.), The growth of mathematical knowledge (pp. 153–176). Kluwer.
Cellucci, C. (2002). Filosofia e matematica. Laterza.
Cellucci, C. (2005). ‘Introduction’ to Filosofia e matematica. In R. Hersh (Ed.), 18 unconventional essays on the nature of mathematics (pp. 17–36). Springer.
De Waal, C. (2005). Why metaphysics needs logic and mathematics doesn’t: Mathematics, logic, and metaphysics in Peirce’s classification of the sciences. Transactions of the Charles S. Peirce Society, 41(2), 283–297.
Eisele, C. (1979). Studies in the scientific and mathematical philosophy of Charles S. Peirce. Mouton.
Euclid. (1956). In T. L. Heath (Ed.), The thirteen books of Euclid’s elements (Vol. 1–3). Dover.
Ferreirós, J. (2015). Mathematical knowledge and the interplay of practices. Princeton.
Hintikka, J. (1980). C.S. Peirce’s ‘first real discovery’ and its contemporary relevance. The Monist, 63, 304–315.
Hookway, C. (1985). Peirce. Routledge.
Hookway, C. (2010). ‘The form of a relation’: Peirce and mathematical structuralism. In M. Moore (Ed.), New essays on Peirce’s mathematical philosophy (pp. 19–40). Open Court.
Horsten, L. (2021). Philosophy of mathematics. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Winter 2021 Edition). https://plato.stanford.edu/archives/win2021/entries/philosophy-mathematics/.
Levy, S. H. (1997). Peirce’s theorematic/corollarial distinction and the interconnections between mathematics and logic. In N. Houser, D. D. Roberts, & J. Van Evra (Eds.), Studies in the logic of Charles S. Peirce (pp. 85–110). Indiana University Press.
Moore, M. (2010). Scotistic structures. Cognition, 11(1), 79–100.
Peirce, B. (1870). Linear associative algebra. Washington, DC. Reprinted as Peirce, B. (1881). American Journal of Mathematics, 4, 97–229.
Peirce, C. S. (1932–1958). Collected papers of Charles Sanders Peirce (Vols. 1–8; P. Weiss, C. Hartshorne, & A. W. Burks, Eds.). Harvard University Press. [Abbreviated CP].
Peirce, C. S. (1976). The new elements of mathematics (Vols. 1–4; C. Eisele, Ed.). Mouton Publishers. [Abbreviated NEM].
Peirce, C. S. (1982). Writings of Charles S. Peirce: A chronological edition (Vol. 1–6, 8). Indiana University Press. [Abbreviated W].
Peirce, C. S. (1992–1998). The essential Peirce: Selected philosophical writings (Vols. 1–2; N. Houser, et al., Eds.). Indiana University Press. [Abbreviated EP].
Pietarinen, A. (2010). Which philosophy of mathematics is pragmaticism? In M. Moore (Ed.), New essays on Peirce’s mathematical philosophy (pp. 59–79). Open Court.
Robin, R. S. (1967). The annotated catalogue of the papers of Charles S. Peirce. University of Massachusetts Press.
Using Abduction for Characterizing the Process of Discovery
27
Michael Meyer
Contents
Introduction
Characteristics of Abduction
Generation of Abductions
Discoveries qua Abduction
Methodology
Empirical Examples
Example I: Creative Discoveries and their Initiation
Example II: Overcoded Abduction and Discoveries
First Interpretation
Second Interpretation
Third Interpretation
Excursion: Remarks on the Ambiguity of Utterances
Conclusion
Appendix
References
Abstract
Allowing students to discover (and to justify) relationships is an essential core of many didactic approaches, not only for mathematics. However, the questions of what a “discovery” actually is or can be and what options there are to initiate discoveries often remain unanswered. To answer these questions, it is important to understand how new knowledge can be gained. In this chapter, based on Ch. S. Peirce’s concept of abduction, a proposal is presented which helps to understand discoveries in depth. This is first worked out theoretically and second tested and thereby sharpened by the reconstruction of empirical scenes.
Keywords
Abduction · Discoveries · Discovery learning · Types of abduction
M. Meyer, Institute for Didactics of Mathematics, University of Cologne, Cologne, Germany
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_36
Introduction
The process of discovery is one crucial aspect of learning mathematics. Different approaches like “discovery learning” (Bruner, 1961), “problem-solving” (Polya, 1973), or the “(neo)socratic method” (Nelson, 1949) focus on the active learning of students who are required to recognize mathematical relationships on their own. In spite of (or perhaps because of) the large number of such claims, there are various understandings in the literature about what a “discovery” actually is or can be and how new discoveries are gained (Kollosche, 2017; Meyer, 2007). The thought process involved in discovery learning is often described as “productive” in the didactic literature. In this context, expressions such as “incubation,” “enlightenment” (e.g., Ritter & Dijksterhuis, 2014), or “AHA!-experience” (Liljedahl, 2005) appear, which are theoretically hardly understandable and gain their suggestive power from metaphors and the addressing of subjective experience. What is behind these terms? How can Bruner’s concepts of “rearranging” or “transforming evidence” (Bruner, 1961, p. 22) be understood? What does “active development and acquisition of knowledge” (Wittmann, 1994, p. 165, translated) or “the new linking of experience” (Hoffmann, 2003, p. 12, translated) mean? The questions indicate that a theoretical conceptualization of discoveries, especially discovery learning, might be difficult. Accordingly, one could ask what discovery learning is in a specific situation. Let us suppose people look at the calculations in Fig. 1 in order to recognize relationships. Is it a discovery when a student realizes that the addends have been interchanged? Or is it a discovery when the equality of the sums is recognized? When an expert in mathematics recognizes the applicability of the commutative law for the given tasks, is this a discovery? As the example indicates, discovery learning seems to have something to do with recognizing causal relationships. Also, these relationships are not to be arbitrarily described as “profound discoveries,” but rather located relative to the recognizing subjects. In other words, it is not only the content or structure of the recognized relationships that matters, since it also depends on the person who recognizes the specific relationship. In this chapter, discoveries are characterized by using abduction, as it has already been considered by many authors in this book (e.g., Aliseda, 2006; Magnani, 2001).

2+5=    5+4=    5+10=
5+2=    4+5=    10+5=
Fig. 1 Some tasks for discovering the commutative law
Characteristics of Abduction
Abduction was elaborated as the third elementary kind of inference by the American philosopher Charles Sanders Peirce. In the sense of mathematical formal logic, only deduction is a “permissible” inference. Therefore, an expanded understanding of the forms of inference is required, one that is based in philosophy. In the course of his philosophical thinking, Peirce offered several different forms and descriptions of abduction and induction (cf. Fann, 1970; Magnani, 2001; Reid, 2004; Schurz, 2008). First, he named the inference from observed facts to an explanatory case “hypothesis” and afterward “retroduction” and “abduction.” The shift in his writings concerned not only his theory of abduction but also his theory of induction. In his earlier writings, he described induction as the inference that generates a general rule from the observation of a sample of facts. In the following sections, only the later writings of Peirce are considered, since he recognized that “in almost everything I printed before the beginning of this century I more or less mixed up Hypothesis and Induction” (Peirce CP 2.227). From this point on, he regarded only abduction as the inference that generates new knowledge, as it is “the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea [ . . . ]” (Peirce CP 5.171). Peirce himself offered the pattern of abduction in Fig. 2. The pattern contains a middle sentence, in which the observed facts (C) are related to an explanation (A). This sentence is considered more extensively in the following sections. If this pattern is regarded from an epistemological perspective, an additional element is required that represents characteristics of scientific “why” explanations: Hempel and Oppenheim (1948) focus on causal scientific explanations based on the theory of Popper (1989/1934). 
By means of this form of explanation, reasons are given for why an observed phenomenon occurs – in contrast to (deductive) reasons, which show why one should believe that the phenomenon occurs (cf. Hempel, 1977, p. 3). Hempel and Oppenheim (1948, p. 136) suggest the reformulation of explanatory “why” questions to: “According to which general rule and due to which situation-specific conditions does the phenomenon occur?”

“The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.”
Fig. 2 Peirce’s schematic description of abduction (CP 5.189, 1903)

Therefore, “why explanations” according to Hempel and Oppenheim (cf. Hempel & Oppenheim, 1948, p. 136ff.) should, on the one hand, contain a description of the phenomenon to be explained (i.e., explanandum, that which is to be explained) and, on the other hand, sentences that explain the phenomenon (i.e., explanans, that which contains the explanation). The explanans consists of both a general rule and situation-specific conditions that occur before or with the phenomenon (antecedent conditions). Accordingly, Hempel and Oppenheim (1948, p. 137; cf. also Stegmüller 1976, p. 124) make the following demands on a (scientific) explanation:
B1: The explanandum must be able to be deduced logically from the explanans.
B2: The explanans must contain at least one general rule.
B3: The explanans must have an empirical content; it must be empirically verifiable.
B4: The sentences contained in the explanans must be true.
B1 and B2 represent requirements that play a special role in mathematics education. For example, when discovering the commutative law using productive tasks (e.g., Fig. 1), general relationships (rules) are in focus. Peirce also mentioned the existence of such a general rule (B2) in his earlier writings concerning abduction (CP 2.623, 1878). Of course, learners can also recognize other relationships in tasks like this, such as that (a) the tasks were only skillfully selected by the teacher so that they all have the same result or (b) the commutative property depends on one of the two summands being 5. The recognized relationships therefore do not have to be mathematically correct, but can also contain relationships that are incorrect or correct only to a limited extent. The requirement B4 cannot be maintained in the learning process. This was also recognized by Hempel (1977, p.
8), who also allows “potential explanations” as explanations. If the diagram in Fig. 2 is now extended to include these points, the pattern of abduction in Fig. 3 emerges under the conditions mentioned. Starting from the observation of a phenomenon, an abduction is the inference that builds a potential explanation: A rule and a case are inferred. By recognizing the rule, the observed phenomenon appears as a result of the rule (a specific element of the consequence of the rule). Let us consider a short example based on the tasks in Fig. 1. In order to associate or generate the commutative law, the abduction in Fig. 4 could take place.

phenomenon (result): $R(x_0)$
rule: $\forall i: C(x_i) \Rightarrow R(x_i)$
case: $C(x_0)$
Fig. 3 The pattern of abduction as an inference from observed phenomena to explanatory causes

phenomenon (result): 2 + 5 = 5 + 2, 5 + 4 = 4 + 5, …
rule: If only the order of the addends is reversed (in an addition), then the sum does not change.
case: In 2+5 and 5+2 or 5+4 and 4+5 only the order of the addends is reversed.
Fig. 4 Abduction for discovering the commutative law
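
The abduction for the commutative law can be mimicked in a few lines of Python. This is only a minimal sketch under stated assumptions, not a model proposed in the chapter: the names `Rule`, `abduce`, `commutative`, and `restricted` are hypothetical, and the second rule is loosely modeled on alternative (b) mentioned above (commutativity seen as depending on the addend 5). The sketch also simplifies matters: it checks the case against the written tasks, whereas an abduction only hypothesizes rule and case.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Task = Tuple[int, int]    # (a, b) stands for the task "a + b ="
Pair = Tuple[Task, Task]  # two tasks that a learner compares


@dataclass
class Rule:
    """A general rule 'if antecedent, then consequent' (hypothetical helper type)."""
    name: str
    antecedent: Callable[[Pair], bool]  # the side of the rule that yields the case
    consequent: Callable[[Pair], bool]  # the side of the rule that yields the result


def same_sum(pair: Pair) -> bool:
    (a, b), (c, d) = pair
    return a + b == c + d


def addends_reversed(pair: Pair) -> bool:
    (a, b), (c, d) = pair
    return (a, b) == (d, c)


commutative = Rule("reversing the addends keeps the sum", addends_reversed, same_sum)
# Assumption for illustration: a rule that is correct only to a limited extent.
restricted = Rule(
    "reversing the addends keeps the sum if one addend is 5",
    lambda pair: addends_reversed(pair) and 5 in pair[0],
    same_sum,
)


def abduce(observed: List[Pair], rules: List[Rule]) -> List[str]:
    """Return the names of all rules that offer a potential explanation.

    A rule qualifies if the observed phenomenon is an instance of its
    consequent and its antecedent (the case) is compatible with the tasks.
    """
    explanations = []
    for rule in rules:
        phenomenon_fits = all(rule.consequent(pair) for pair in observed)
        case_fits = all(rule.antecedent(pair) for pair in observed)
        if phenomenon_fits and case_fits:
            explanations.append(rule.name)
    return explanations


# The task pairs of Fig. 1, read column by column.
fig1 = [((2, 5), (5, 2)), ((5, 4), (4, 5)), ((5, 10), (10, 5))]
candidates = abduce(fig1, [commutative, restricted])
print(candidates)
```

For the tasks of Fig. 1 both rules are returned, which illustrates why an abduction yields only a potential explanation. A further pair such as 3+7 and 7+3 would still fit the commutative rule but no longer the restricted one.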
If induction, according to a widespread understanding (which Peirce also shared in his earlier writings), is the inference from individual cases to a (new) general rule, the question remains as to why the concrete cases and results are put together as premises of the inference. For example, one might see three black swans and conclude that all swans are black. But what makes the person connect the property “being a swan” with the property “being black”? Abduction answers such questions: First, concrete consequences of a rule have to be recognized in order to be able to associate or generate a corresponding concrete realization of the antecedent. Peirce compares induction and abduction as follows:
Abduction makes its start from the facts, without, at the outset, having any particular theory in view, though it is motivated by the feeling that a theory is needed to explain the surprising facts. Induction makes its start from a hypothesis which seems to recommend itself, without at the outset having any particular facts in view, though it feels the need of facts to support the theory. Abduction seeks a theory. Induction seeks for facts. (Peirce CP 7.218)
Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value [ . . . ]. (Peirce CP 5.171)
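
The three inferences discussed so far can be set side by side in a rule/case/result notation. The following is only a sketch under assumptions: the symbols $C$ (condition named in the antecedent of the rule), $R$ (consequence of the rule), and $x_0$ (the concrete situation at hand) are chosen here for illustration, premises stand above the line and inferred elements below it, and the induction column shows the widespread understanding of induction described above, not Peirce’s later account.

```latex
\[
\begin{array}{lll}
\textbf{Deduction} & \textbf{Induction} & \textbf{Abduction} \\[4pt]
\text{case: } C(x_0) & \text{case: } C(x_0) & \text{result: } R(x_0) \\
\text{rule: } \forall i: C(x_i) \Rightarrow R(x_i) & \text{result: } R(x_0) & \\
\hline
\text{result: } R(x_0) & \text{rule: } \forall i: C(x_i) \Rightarrow R(x_i) & \text{rule: } \forall i: C(x_i) \Rightarrow R(x_i) \\
 & & \text{case: } C(x_0)
\end{array}
\]
```

Read this way, only abduction has a single premise and two inferred elements, which is why rule and case have to be found together.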
Two points are crucial for the project to characterize discoveries by means of abductions: 1. The pattern in Fig. 3 shows an inference in which two elements are not given before: the rule and the case. Surely, abductions are not inferences in the sense of formal logic. Also, Peirce uses the metaphor of a flash in order to illustrate the process of abduction: The abductive suggestion comes to us like a flash. It is an act of insight, although of extremely fallible insight. It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never dreamed of putting together which flashes the new suggestion before our contemplation. (Peirce CP 5.181)
Abductions are in need of insight and this insight is independent of the order of inferring case and rule. The rule is needed in order to infer the case. The case is a part of the antecedent of the rule and, thus, if the rule is present, the case is present, too. Accordingly, Eco writes that the main problem is “how to figure out both the rule and the case at the same time, since they are inversely related, tied together by a sort of chiasmus [ . . . ]” (Eco, 1983, p. 203). The simultaneous
recognition of rule and case makes it clear that the abduction is a hypothetical form of inference: it is by no means certain that this rule or this case has been causal for the observed facts (phenomenon). The rule or the case may even be wrong. The only requirement of an abduction is that the rule provides a possible cause for the observed phenomenon as a special antecedent of it. 2. The rule of an abduction is going to be associated together with the case. Thus, the rule is not given before an abduction takes place. The abduction presented in Fig. 4 is of different quality depending on who makes the inference, for example, a mathematical expert who knows such a rule or a novice who did not know the rule before. Accordingly, Eco (1983) differentiates three types of abductions: If the rule of an abduction is “invented ex novo,” Eco speaks of “creative abductions” (p. 207). Creative abductions contain new rules and, thus, new discoveries. A student who has not been aware of the commutative law before would create a new rule with the creative abduction in Fig. 4. If the rule of an abduction is known beforehand, Eco differentiates between “overcoded” and “undercoded” abductions. Eco calls an abduction overcoded if the association of the known rule is “given automatically or semiautomatically” (Eco, 1983, p. 206). If the association of a known rule is nearly obvious, the abduction would be an overcoded one. For example, if one looks out of the window and sees a wet street, the association of “If it has rained recently, the streets are wet.” would be really obvious. Or, if a mathematical expert assumes the commutative law like in Fig. 4, then the abduction does not require much effort. These abductions are overcoded. In contrast to overcoded abductions, the association of the known rule is not obvious in undercoded abductions. 
If a student is aware of the commutative law but has not used it very often, the student first has to recognize that this rule can really be used to explain the observed phenomenon. In other words, before an abduction takes place, an observed phenomenon has not been conceptualized as a consequence of the rule. Eco’s types of abduction make clear that the case can only be inferred abductively together with and through association (overcoded and undercoded abduction) or creation (creative abduction) of a rule. In other words, abductions comprise:
• The discovery of a new case as an explanation for the observed facts (all types)
• The discovery of a relationship between the rule and the observed facts (all types)
• The discovery of a new rule (creative abduction)
According to Peirce, abduction, deduction, and induction are the only elementary inferences:
Reasoning is of three elementary kinds; but mixed reasonings are more common. These three kinds are induction, deduction, and presumption (for which the present writer proposes the name abduction). (Peirce CP: 2.774)
In this triadic relation, abduction is followed by deduction and induction. Furthermore, based on abductively created hypotheses, possible and necessary
27 Using Abduction for Characterizing the Process of Discovery
consequences are determined, which can then be tested and afterward checked inductively. Peirce writes: [T]here are but three elementary kinds of reasoning. The first, which I call abduction [ . . . ] consists in examining a mass of facts and in allowing these facts to suggest a theory. In this way we gain new ideas; but there is no force in the reasoning. The second kind of reasoning is deduction, or necessary reasoning. It is applicable only to an ideal state of things, or to a state of things in so far as it may conform to an ideal. [ . . . ] The third way of reasoning is induction, or experimental research. Its procedure is this. Abduction having suggested a theory, we employ deduction to deduce from that ideal theory a promiscuous variety of consequences to the effect that if we perform certain acts, we shall find ourselves confronted with certain experiences. We then proceed to try these experiments, and if the predictions of the theory are verified, we have a proportionate confidence that the experiments that remain to be tried will confirm the theory. I say that these three are the only elementary modes of reasoning there are. (Peirce CP 8.209)
When it comes to the (empirical) verification of mathematical discoveries in general, these three steps play a special role (e.g., Hoffmann, 2007). Starting from the hypothetical abduction, possible and necessary consequences are drawn (deductively), which then have to be tested. If these consequences occur, the hypothesis (e.g., the hypothetical rule of abduction) can be confirmed (enumerative induction) or refuted (eliminative induction) (Meyer, 2010). So far, this three-step process of inferring has been considered from different perspectives, for example, related to the process of generalization (e.g., Körner & Meyer, 2022; Rivera & Becker, 2007a, b), experimentation (Rey, 2021), and diagrammatic reasoning (Hoffmann, 2007) in mathematics education research.
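The three-step process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the chapter: the function names `abduce`, `deduce`, and `induce` are hypothetical, and the commutative law serves as the example rule because the chapter uses it elsewhere. The test range of 0 to 20 is an arbitrary assumption.

```python
from itertools import product


def abduce():
    """Abduction: propose a rule that would explain the observed phenomenon.

    Observed phenomenon: 5 + 10 and 10 + 5 give the same result.
    Hypothetical rule (a conjecture without probative force): a + b == b + a.
    """
    return lambda a, b: a + b == b + a


def deduce(candidates):
    """Deduction: derive concrete, testable consequences (pairs to check)."""
    return list(product(candidates, repeat=2))


def induce(rule, consequences):
    """Induction: test the consequences; return (confirmed, counterexample)."""
    for a, b in consequences:
        if not rule(a, b):
            return False, (a, b)  # eliminative induction: the rule is refuted
    return True, None  # enumerative induction: the rule is corroborated


rule = abduce()
cases = deduce(range(0, 21))  # assumed test range, chosen arbitrarily
confirmed, counterexample = induce(rule, cases)
print(confirmed, counterexample)

# A wrong conjecture (subtraction is commutative) is refuted by the same test.
refuted, witness = induce(lambda a, b: a - b == b - a, cases)
print(refuted, witness)
```

Passing all tested pairs only corroborates the rule; as the text notes for mathematical rules, a proof is a separate matter.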
Generation of Abductions

Regarding the uncertainty associated with performing an abduction, Rescher writes:

[ . . . ] an evolutionary model of random trial and error with respect to possible hypotheses just cannot operate adequately within the actual (or perhaps even any possible) timespan. (Rescher, 1995, p. 321)
Peirce himself oscillates back and forth between “[ . . . ] ‘realism’ and ‘objective idealism’” (Nagl, 1992, p. 117) when answering the question of why we so often make the right hypothesis. It seems crucial that abduction is a kind of “background logic” (cf. Hoffmann, 1999). The cognitive background against which an abduction is performed always plays a crucial role. Peirce notes that: [ . . . ] every reasoning involves another reasoning, which in its turn involves another, and so on ad infinitum. Every reasoning connects something that has just been learned with knowledge already acquired so that we thereby learn what has been unknown. It is thus that the present is so welded to what is just past to render what is just coming about inevitable. The consciousness of the present, as the boundary between past and future, involves them both. Reasoning is a new experience which involves something old and something hitherto unknown. (CP 7.536)
Carrying out a (reasonable) abduction therefore presupposes that a researcher is familiar with the specific field. Accordingly, Hoffmann notes that the:
M. Meyer
[ . . . ] discovery of something new is initially an almost slavish submission to the rules and conventions of knowledge that were already known for sure. There is no knowledge without previous knowledge, one could say succinctly. (Hoffmann, 2003, p. 305; translated)
Hoffmann describes this type of knowledge as “implicit knowledge” (Hoffmann, 2002, p. 13 and p. 273 f.). However, the necessity of this condition also makes it clear that “[ . . . ] quite new conceptions cannot be obtained from abduction” (CP 5.190). Therefore, neither an “anything goes” in Feyerabend’s sense nor a “creatio ex nihilo” (Hoffmann, 2002, p. 271 f.) can be assumed. The reliance on abduction is based on the hope that there is a relation (partly innate, partly developed) between the mind of the thinker and the nature of the object of thought, which does not make the guesswork entirely dependent on chance (CP 1.630, 2.754). Abduction enables us to bring together “[ . . . ] what we had never dreamed of putting together [ . . . ]” (CP 5.181). Putting together old ideas in this way creates new ideas which, mediated by the phenomenon, find a reference to reality. For example, Kepler knew the positions of planets as well as geometric shapes. Abduction allowed him to combine these two elements of his knowledge to create the notion of a planet’s “elliptical orbit.” So, the need for background knowledge gives some structure to the abductive inference. Nevertheless, the abduction “[ . . . ] is mere conjecture, without probative force” (CP 8.209). In summary, the emergence of new insights can be described as the tentative linking of given knowledge, which expresses a material structure of things in the social medium of used signs.
Discoveries qua Abduction

The previous section described what an abduction can provide in terms of discoveries. Now the reverse path is to be taken, which will also allow us to determine whether the didactically and pedagogically required discoveries can also be understood as abductions. With regard to “rearranging” or “transforming evidence,” Bruner (1961) considers the hypothetical method:

Intuitive thinking, the training of hunches, is a much-neglected and essential feature of productive thinking not only in formal academic disciplines but also in everyday life. The shrewd guess, the fertile hypothesis, the courageous leap to a tentative conclusion - these are the most valuable coin of the thinker at work. (Bruner, 1960, p. 13 f)
Peirce goes a step further and writes: “We have no power of intuition, but every cognition is determined logically by previous cognitions” (CP 5.265). Bruner’s description shows that relative to the three elementary forms of inference, only the process of performing an abduction can be used to describe discoveries. Bruner (1960, p. 60) also speaks of the combination of ideas as it comes into play in the case of abduction, especially concerning the rule. Using the Kepler example, the abduction was presented as an inference with which surprising results could be explained by a newly formed rule. However, every
perception already represents an abduction because “things” are not perceived as such but only their appearances. Based on this, the conceptual category to which the original phenomenon probably belongs is inferred. Using the example of perceiving an azalea, Peirce writes: Looking out of my window this lovely spring morning I see an azalea in full bloom. No, no! I do not see that; though that is the only way I can describe what I see. That is a proposition, a sentence, a fact; but what I perceive is not proposition, sentence, fact, but only an image, which I make intelligible in part by means of a statement of fact. This statement is abstract; but what I see is concrete. I perform an abduction when I so much as express in a sentence anything I see. The truth is that the whole fabric of our knowledge is one matted felt of pure hypothesis confirmed and refined by induction. Not the smallest advance can be made in knowledge beyond the stage of vacant staring, without making an abduction at every step. (Peirce, LOS, 1986, p. 899 f.)
For Peirce, one does not see a specific object but some characteristics of it. The characteristics lead to the assumption of putting the object in a conceptual category. Surely, abductions in perception, which can be regarded as overcoded ones, cannot be described as (profound) discoveries. Rather, creative and undercoded abductions seem to be crucial. Furthermore, the surprising results represent something that is unknown or a gap in one’s understanding that needs to be filled (Nührenbörger & Schwarzkopf, 2016). The filling of this gap happens again qua abduction. According to Andreewsky: (h)ence, if a given phenomenon looks strange, this only means that the theoretical framework used to interpret this phenomenon must be revisited! The revisiting cognitive process is labelled abduction, and its aim is to ‘normalize’ anomalies. (Andreewsky, 2000, p. 839)
To paraphrase Piaget, the surprise content of the results or a deviation from normality creates a cognitive conflict. This is due to the content of the “new” within the results, which triggers the surprise and the motivation for the development of new knowledge. Peirce notes: Abduction makes its start from the facts [phenomena, M. M.], without, at the outset, having any particular theory in view, though it is motivated by the feeling that a theory is needed to explain the surprising facts. (CP 7.218)
In a mathematics classroom, facts do not necessarily surprise students. Therefore, it seems better to regard the phenomena as facts that are in need of an explanation. Comparable descriptions have also been made by Polya, who showed that abduction also plays an essential role in problem-solving. Regarding his second phase “devising a plan,” he writes: This idea may emerge gradually. Or, after apparently unsuccessful trials and a period of hesitation, it may occur suddenly, in a flash, as a ‘bright idea’. (Polya, 1973, p. 8f)
Polya (1954a, b) speaks of “heuristic” or “plausible reasoning” in describing the progress in knowledge when solving problems, including those inferences that resemble Peirce’s abduction. Polya does not distinguish between three forms of
Fig. 5 General patterns of abduction (left, the cognitive “flash of genius”; right, abduction as the process of making an explanatory hypothesis plausible)
inferences but speaks of an “inductive basic scheme.” Reconstructions of problem-solving processes using the pattern of abduction (Fig. 5) can be found in Söhling (2017). Polya (1973) also describes reasoning by analogy as one useful way of problem-solving. Gentner (1983) writes: The central idea is that an analogy is an assertion that a relational structure that normally applies in one domain can be applied in another domain. (Gentner, 1983, p. 303)
Kunsteller (2018) notes that these processes can also be regarded as abductions. She also reconstructs empirical utterances and elaborates a general pattern of analogies based on the pattern of abduction. So far, many theoretical considerations and fictitious examples indicate that abduction seems to be a necessary inference for generating knowledge through discoveries. In the following section, this is going to be verified empirically. Some methodological remarks are presented first, and then two scenes of real classroom communication are reconstructed.
Methodology

The interpretation method developed by Voigt (1984) was used to analyze the scenes in section “Empirical Examples.” The theoretical basis of this interpretative research paradigm is symbolic interactionism (Blumer, 1986) and ethnomethodology (Garfinkel, 1967) from sociology. In symbolic interactionism, meanings are negotiated and modified in interaction. Ethnomethodology is based on the observation of everyday life, taking into account various “basic and normative rules” (Cicourel, 1972). These rules include, among other things, the understanding that statements are always embedded in context. The context of a statement has to be considered in an interpretation. Furthermore, this context means that not all content is explicitly shared. The speaker more or less expects that the listener completes the implicit part of the utterance. Both symbolic interactionist and ethnomethodological perspectives require a focus on the actual (classroom) interaction taking place. This in turn means that the pattern shown in Fig. 3 cannot serve as the basis for the reconstruction since it represents the process of cognitive generation of abduction: starting from a surprising phenomenon, both rule and case are formed almost simultaneously. Such processes can be hinted at in classroom communication, but the rule is then used
to highlight the case as meaningful. Using the example of the commutative law, an utterance could be, for example, “The tasks produce the same results because only the addends have been swapped. And that’s why 5+10 would have to result in the same as 10+5.” During the course of the external plausibility check of an abductively developed assumption, a rule rather serves as a premise than as a result of the inference (even if the rules are usually not stated (see Krummheuer, 1995; Meyer, 2007)). Thus, the role of the observed phenomena changes. Here, they are regarded as results of the already known and already associated rule right from the start. Relative to the pattern of the abduction, it causes a correspondingly changed shape of the pattern, as shown in Fig. 5.

The three central steps of Voigt’s interpretation method are as follows:

Step 1: The researcher distances himself or herself from the concrete scene. The scene should be mentally placed in different contexts, whereby different hypotheses of the scene can be generated by the interpreters (here, the author of this paper). These hypotheses should also be independent of direct associations and thus of the use of known rules. In the following, these hypotheses are called “interpretations.” The change in the contextual background takes place after the scene has first been interpreted from the everyday point of view of a teacher or a didactician in order to distance oneself from it. The reason for this procedure is that the interpreter should try not to hastily identify his own experiences in the transcript but rather collect new insights. Since the interpretations are explanations for observed phenomena (the utterances of the students), abductions have to be made (cf. Meyer, 2009; Timmermans & Tavory, 2012). The turning away from everyday habits of thought corresponds to the demand for creating creative abductions by means of which a maximum of new insights should be made in order to supplement (and not replace) everyday associations.

Step 2: Possible and necessary consequences of the interpretations are developed. Which consequences may (or have to) occur in order to confirm the interpretation? Which consequences would refute the interpretation? The (non)occurrence of these consequences is checked on the basis of the subsequent statements of the scene. This involves the realization of the steps of deduction and induction in the three-step development of empirical verification: The drawing of necessary and possible consequences takes place deductively. The examination of the consequences takes place. The corresponding confirmation or refutation of the initially abductively obtained interpretations is an inductive step.

Step 3: Observation and review. Which consequences have (not) occurred? And, thus, which interpretations can be confirmed? Which can be refuted? As the consideration of the abduction in step 1 already shows, interpretations of utterances are uncertain and need to be checked using the subsequent statements. In principle, this process also occurs in discovery learning: An assumed mathematical rule can empirically be corroborated or refuted. In contrast to interpretations, it also can be proven. At the end of an interpretation process,
ideally one interpretation is left that enables an appropriate understanding, both for the scene and, in particular, the research question. It should be explicitly mentioned that it is not the aim of the following examples to show scenes in which nice tasks for discovering mathematical relationships are presented. Rather, it is about showing different aspects of discoveries mediated by abductions through the reconstruction of abductions in a scene. This is intended to exemplify and elaborate the conceptualization of discoveries, which from a theoretical perspective previously happened by means of creative and undercoded abductions.
Empirical Examples

Example I: Creative Discoveries and their Initiation

The tenth grade of a secondary school dealt with the calculation of powers. After this was initially completed with a class test, the teaching experiment began with a double lesson on the subject of power functions. In the first lesson, worksheet A (Fig. 6) was given to the learners. The solutions were discussed in the second lesson. Worksheet A was intended to introduce the power functions $x \mapsto x^n$ with natural exponents of the form $n = 2k$ ($k \in \mathbb{N}$) starting from the normal parabola. The graphs of the power functions $x \mapsto x^6$ and $x \mapsto x^2$ are presented to the learners on the worksheet. The term $x^6$ is not present on the worksheet. The axes of the coordinate system are not subdivided so that an exact determination of the function rule for the dashed graph is not possible, and rather qualitative properties of the graph must be taken into account. In task 1, the learners’ attention is drawn to the course of the graphs near and far from the origin. While discussing the first task, the function expressions $x \mapsto 10 \cdot x^2$ and $x \mapsto 0.1 \cdot x^2$ were argumentatively ruled out for the dashed graph. The students in the class had problems completing the second task on the worksheet. They did not succeed in connecting elements of their background knowledge (graph and function expressions of a normal parabola and arithmetic with powers). Therefore, they also received the following function rules for discussion: $x \mapsto x^2 + 10$, $x \mapsto x^5$, and $x \mapsto 2x$. The following transcript begins at the point when the students have ruled out $x \mapsto x^2 + 10$ as the desired function rule (Fig. 7). In the beginning, Eva says that the graph at hand could be that of the power function $x \mapsto x^4$. She traces her hypothesis back to the course of the left branch of the graph. The translation of the student from the graphic representation to the function term can be reconstructed by an abduction like in Fig. 8.
The reconstructed rule contains the (causal) relationship between the course of the graph and the calculation expression. The students were not aware of this rule beforehand, so Eva’s abduction can be described as creative. From the previous lesson, the students only knew quadratic functions.
Fig. 6 Excerpt from working sheet A (“Arbeitsblatt A”) and its translation
Teacher
German: also das, können wir vergessen. (streicht die Funktionsvorschrift “$x \mapsto x^2 + 10$” durch) . . . hast du ne andere Idee was es sein könnte Eva’
English: ok, we can forget about this (crosses the function expression $x \mapsto x^2 + 10$) . . . do you have another idea for what it could be Eva’

Eva
German: ja vielleicht könnte das ja irgendwas mit ähm, wegen meiner $x^4$ oder so sein weil, äh der, der äh .. (leise) ne ist Quatsch (..) $x^4$ (Lehrer schreibt “$x \mapsto x^4$” an die Tafel, Eva spricht wieder lauter) also $x^4$ weil äh, wenn man dann äh, bei x ne Minuszahl einsetzt–, äh kommt ja trotzdem noch was äh positives bei äh, bei dem– bei der Funktion raus also ähm, auf der, x-Achse äh also wenn man jetzt da äh vom Ursprung aus rechts ist, (L zeigt auf den Ursprung) dann muss es ja trotzdem noch ne ähm, noch ähm, also noch positiven äh Wert ergeben damit man überhaupt hoch kann sonst würde man ja runter gehen wenns äh wenn da– $x^5$ wegen meiner steht da kann– also Minus mal Minus ist ja Plus, dann noch Mal Minus mal Minus ist wieder Plus und dann– noch Mal mal Minus ist ja äh Minus, also würde man ja im Minusbereich landen und dann würd der Graph nach unten hin wegfallen.
English: maybe it could be something with ehm, for my sake $x^4$ or something like that because, eh the, the eh .. (soft spoken) no it is nonsense (..) $x^4$ (teacher writes $x \mapsto x^4$ on the blackboard, Eva speaks up) as $x^4$ because eh, if you insert–, than eh, a negative number for x, eh there will come something positive out of eh, of the–, of the function as ehm, on the, x-axis eh so if you are now right eh from the point of origin, (teacher points at the point of origin) then there still has to be a ehm, still ehm, thus still a positive eh result so that you actually can get high elsewise you have to go deeper if eh if there– is $x^5$ for my sake you can– so minus times minus is plus, then another time minus times minus is again plus and then– another time times minus is yeah eh minus, so you would land in the range of the negative numbers and then the graph would drop down.

Teacher
German: mhm . . . du sagtest wenn man hier rechts geht auf der– ersten Achse’ (fährt den Graph mit dem Finger im positiven Bereich der x-Achse nach)
English: mhm . . . you said if you are here on the right on the– first axis’ (retraces the graph with his finger in the positive area of the x-axis)

Eva
German: nee wenn man links geht
English: no if you are going left

Teacher
German: links
English: left

Eva
German: ja, links
English: yes, left
Fig. 7 Figures on the blackboard (the translation of “Achse” is “axis”)
result: The graph is going up for negative values of $x$.
rule: If the term has the form $x^n$ ($n$ even), then the corresponding graph is going up (on both sides of the origin).
case: The term of the graph could be “something with” $x^4$.
Fig. 8 An abduction concerning Eva’s statement
The uncertainty with which an abductive inference is made is clearly recognizable in Eva. She rates her own suggestion as “no it is nonsense,” and other indications within her statement also suggest that Eva proposes a hypothesis that she considers daring. This includes her soft speaking and the words “maybe” and “something with.” The reason for Eva’s abduction was probably the example $x \mapsto x^5$, which follows the previously ruled out function rule ($x \mapsto x^2 + 10$) on the blackboard. The student could have recognized from this that the graph that fits this rule is going down, in contrast to the given graph for negative x-values. This decisive difference can be regarded as the observed phenomenon, which can be explained by abduction (How do you have to change the function term so that the associated graph gets the obvious course?). Based on this result, the student may then have considered which changes within the functional specification can be used to approximate the given graph. Accordingly, the student infers an even exponent within the rule $x \mapsto x^n$. The concrete power she mentioned seems to be just an example (“for my sake $x^4$ or something like that”). However, the number “2,” which is given as the exponent in the function term of the normal parabola, seems to be excluded. The reconstruction of the abduction relates only to a small part of the utterance. Based on the further course of the statement, it can be assumed that the student has become aware of the vagueness of her knowledge, because different approaches can be reconstructed as to how she justifies her discovery. The first part of her verification can be interpreted as a deductive proof of the implicit rule of the abduction: Starting with $x \mapsto x^n$ ($n$ even), Eva deduces by multiplying x successively with itself that the product has to be positive, if x is any negative number. In another step she infers that the corresponding graph has to go up. In detail three deductions can be reconstructed in this section (cf. Meyer, 2007, p. 197ff.). Nearly everything in this proof remains implicit. However, interpreting the second part of her argument suggests that Eva could be aware of it. Here she expresses a nearly analogous argument for odd exponents in $x \mapsto x^n$. Her argument in the second part of her verification can be interpreted in different ways as a proof of the case of the reconstructed abduction. Every possibility is in need of the assumption that the term in question is of the form $x^n$:

1. The argument can be regarded as a proof by contradiction when it is assumed that Eva implicitly compares the deduced course of the graph with the given graph.
2. Eva’s statement could be a proof by contradiction for the reverse direction of the implication. Thus, the rule of abduction turns into a statement of equivalence, and the case of the abduction can be inferred from the result of the abduction deductively.
3. As she infers the consequences for n being even or odd, it can be assumed that Eva does a complete proof by exhaustion.

The (really short) reconstruction shows that the process of discovering knowledge starts with only one given premise (in this situation, the graphs). In this scene, the students initially could not use this premise to execute their abduction. With the mention of $x \mapsto x^5$, the teacher gave a hint that Eva grasped. The hint of the teacher can be interpreted as a reduction of the semantic fields between the result and the case of the abduction. Thus, a creative abduction can be forced from the outside by presenting elements of the background knowledge which can be combined. Even though the hint of the teacher has been given before, the concept of abduction
Fig. 9 Excerpt from working sheet E (“Arbeitsblatt E”) and its translation
shows that the students have to notice the relationship between the result and an explanatory rule and case. The different possible reconstructions of the verifications indicate that the student seems to be aware of the uncertainty of the expressed assumption. The different possibilities of verification Eva offers can be understood as presenting a maximum of ways to escape this (possibly unsatisfactory) situation of hypothetical knowledge.
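Eva’s sign argument for even and odd exponents can be checked numerically. The following is a minimal sketch under stated assumptions: the helper `sign_of_power` is a hypothetical name introduced here, the sample values are arbitrary, and the repeated multiplication only mimics her verbal reasoning (“minus times minus is plus”). It is not a reconstruction from the source.

```python
def sign_of_power(x, n):
    """Return the sign (1, 0, or -1) of x**n, built by repeated multiplication."""
    result = 1
    for _ in range(n):
        result *= x  # multiply x with itself step by step, as Eva describes
    return (result > 0) - (result < 0)


negatives = [-3, -2, -1]  # arbitrary negative sample values (an assumption)

# Even exponents: every product is positive, so the graph goes up on the left.
even_signs = {n: [sign_of_power(x, n) for x in negatives] for n in (2, 4, 6)}

# Odd exponents: every product is negative, so the graph drops on the left.
odd_signs = {n: [sign_of_power(x, n) for x in negatives] for n in (3, 5)}

print(even_signs)
print(odd_signs)
```

Checking sample values corroborates the rule in the inductive sense only; Eva’s verbal argument about pairing the minus signs is what carries the deductive weight.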
Example II: Overcoded Abduction and Discoveries

This next scene comes from the fifth (and last) lesson of an experiment in a fourth-grade classroom. In previous lessons, some tasks that dealt with functional relationships were already used. The learners considered fractions as operators and determined formation rules for sequences of numbers. Worksheet E (the crucial part of the sheet can be seen in Fig. 9) was distributed at the end of the fourth examination lesson and was to be solved by the learners as homework without further preparation. The following scene shows the discussion of the solutions at the beginning of the fifth lesson.
Fig. 10 The blackboard at the beginning of the scene
The following scene begins with the learners’ work on the first task of the worksheet. So far, the learners have suggested the numbers 40, 30, 20, and 10 as “winning numbers” for Tanja. The first three proposals have already been checked against the calculations, with proposal “40” being corrected. So far, it has not been apparent that the learners recognized functional relationships for a range of numbers (Fig. 10).
Teacher
German: ach du meinst hier eine 10.
English: ah you mean here a 10.

Lisa
German: ja
English: yes

Teacher
German: (Lehrer trägt “10” unter “20” ein) ah so, 90 und hier gibt 40– (zeigt jeweils auf die Rechnungen) prima, und äh Lars’
English: (teacher writes a “10” below “20”) ah so, 90 and here comes 40– (points on the according calculations) fine, and eh Lars’

Lars
German: man kann auch alle Zahlen unter 30 nehmen
English: one can take every number under 30

Teacher
German: alle Zahlen unter 30’
English: all numbers below 30’

Marlene
German: ja kann man
English: yes you can

Teacher
German: warum das denn’ Marlene
English: why that’ Marlene

Marlene
German: weil nämlich ehm alle Zahlen unter 30 sind ja niedriger– und wenn man minus 20 und rechnet sind ja 80, und äh 30 plus 20 sind 50, da könnte man genauso gut auch ehm minus 1 rechnen, und plus 1 das– also–, (Lehrer trägt “1” unter “10” ein) das wäre dann 99 und oben wäre es dann 31, (L zeigt jeweils auf die Rechnungen) dann hätte Tanja gewonnen.
English: because even ehm all numbers under 30 are lower– and if you calculate minus 20 there are 80, and eh 30 plus 20 are 50, you can calculate just as well eh minus 1, and plus 1 that– also–, (teacher writes “1” under “10”) that would be 99 and above it would be 31, (teacher points on the respective calculations) then Tanja would win.

Teacher
German: aha, prima, was ist denn die höchste Zahl die Tanja einsetzen kann’, ist da jemand drauf gekommen’, Frank
English: aha, fine, what is the highest number which can be put in for Tanja’, has anyone got that’, Frank
After the number “10” was determined as the winning number for the player Tanja, Lars states that Tanja can take “all numbers under 30” to win. Lars, thus, extends the suggestions of concrete numbers to a range of numbers.
The meaning of “lower” is crucial for understanding the utterance. The conjunction “because” in Marlene’s utterance indicates the following reason: She states “all numbers under 30 are lower.” However, the exact meaning of “lower” remains questionable, so several interpretations are possible. Based on the naming of specific numbers, a first interpretation is that Marlene interpreted the facts more “superficially.” An abduction is reconstructed in which the knowledge remains “attached to the surface of the perceptible.” Two other interpretations of Marlene’s utterance assume that her knowledge penetrates deeper into a mathematical structure and that she only expresses this verbally using concrete numerical examples. Thus, Marlene could also have made a “deeper” discovery.
First Interpretation

If one interprets “lower” as a rephrasing of “below 30,” then the interpretation would assume a superficial discovery. This interpretation causes problems insofar as it is a circular definition. This in turn would probably imply an underestimation of the rationality of a fourth-grade student because Marlene uses the conjunction “because.” However, it cannot be excluded that the student, without having any mathematical insight, only wants to provide the semblance of a reason (especially since the criterion for “Tanja wins” is a relation between the number terms and not a specific number limit). The abduction in Fig. 11 can be reconstructed following this interpretation. According to this interpretation, Marlene recognizes, based on the specific arithmetic examples given by her classmates, the relevant numbers being “lower” than 30 as the reason for the result (Tanja wins). The rule of the abduction, which consists of the tentative generalization from the concrete examples to a range of numbers, has not been previously known to the learners. This is mainly because it is a very specific rule. Nevertheless, this abduction suggests little mathematical insight. For this reason, one can speak of a creative, superficial abduction. The numbers that Lars and Marlene considered range from 1 to 30. Possibly to confirm their rule for the lower limit, Marlene gives the numerical example “1” (which has not yet been discussed in the classroom interaction). This is an isolated case in the rule. According to the three steps, the inferences of deduction and induction can now be reconstructed: If the established rule is valid, then Tanja must have the higher result than Oliver. This implication of the rule is then tested in order to increase certainty about the validity of the statement of the rule.

result: Tanja wins with the numbers 30, 20 and 10.
rule: If you choose any number (“all”) “lower” than or equal to 30, Tanja wins.
case: 10, 20, and 30 are “lower” than or equal to 30.
Fig. 11 Abduction according to the first interpretation of Marlene’s statement
Marlene’s discovery of the abductively obtained rule does not necessarily require the consideration of all three numerical examples mentioned above. It is quite conceivable that the rule has only been generated with the numbers 30 and 10. In the same way that the number 1 in the above illustration might have only been found by combining all three steps (abduction, deduction, and induction), this procedure would have been used with a different number (20 in this example). For the number 1, the separation made between discovery and (empirical) verification seems necessary insofar as this number was not previously mentioned in the classroom interaction. So, it was not part of the result of the uttered abduction. In addition, Marlene explicitly carries out the test to confirm her assumption. As already mentioned, Marlene’s statement leaves a lot of room for interpretation. In the following, this room for interpretation will be used more optimistically. Two further interpretations are given, which assume that Marlene made more profound discoveries. Thus, two “deeper abductions” are reconstructed.
Second Interpretation
If one interprets “lower” to mean that the result of Oliver’s calculation (30 + _ = _) is lower than that of Tanja’s calculation (100 − _ = _), the abduction shown in Fig. 12 can be reconstructed. The basis for this abduction is again formed by the previously checked numerical examples. The antecedent of the rule, which remains implicit, is not a calculation but corresponds to the task (Tanja’s winning condition). The decisive insight in this abduction is the knowledge of the case: one must recognize that for x ≤ 30 the condition of the rule is fulfilled. It remains questionable whether Marlene entirely understands the case, since there is little evidence for this in the transcript. However, their calculation examples could be seen as an attempt to justify the case. As already mentioned, the rule represents the winning condition for Tanja. Since this rule was specified by the rules of the game, one could speak of an overcoded abduction. Nevertheless, this abduction brings forth a discovery, that is, the recognition of the case as being a case of the rule (recognizing the case as a concrete element of the antecedent of the rule).
result: Using some numbers under 30 (or 30), Tanja will win.
rule: If 100 - x > 30 + x, then Tanja will win.
case: For x ≤ 30 it applies that 100 - x ≥ 70 and 30 + x ≤ 60.
Fig. 12 Abduction according to the second interpretation of Marlene’s statement
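The case in Fig. 12 can be spelled out as a short chain of inequalities. This is an illustrative sketch, assuming that x denotes the number inserted into both calculations; it is not taken from the transcript.

```latex
\[
x \le 30 \;\Longrightarrow\; 100 - x \;\ge\; 70 \;>\; 60 \;\ge\; 30 + x ,
\]
\[
100 - x > 30 + x \;\Longleftrightarrow\; 2x < 70 \;\Longleftrightarrow\; x < 35 .
\]
```

The first line shows that every number up to 30 satisfies the antecedent of the rule. The second shows that the exact boundary of Tanja's winning numbers lies at 35, beyond the examples discussed in class.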
result: Using some numbers under 30 (or 30), Tanja will win.
rule: If one wins by a margin over the other and that margin is increased, then one wins all the more.
case: The further the selected number is below 30, the greater the gap between Tanja and Oliver.
Fig. 13 Abduction according to the third interpretation of Marlene’s statement
Third Interpretation
“Lower” can also be interpreted as relating to distance, that is, the smaller the numbers used below 30, the lower Oliver’s result is compared to Tanja’s. If Marlene’s utterance is understood this way, the abduction in Fig. 13 can be reconstructed. The starting point and, thus, the result of this abduction are again the statements made by Marlene’s classmates. However, the exact understanding of the calculations they carried out is decisive for the formation of this abduction in two respects: First, the learners must have understood that the given numerical examples are winning numbers for Tanja. This is how the result is formed. Second, to carry out this abduction, it is necessary to consider the distances between the individual calculations for Oliver and Tanja and the development of these distances when choosing other numbers. The learners know rules comparable with the reconstructed one from various game situations that they take part in themselves or see on television (sports reports, etc.). Although the task represents a game situation that can be understood, the rule is not already suggested by the task, as it was in the previously reconstructed abduction (Fig. 12). Correspondingly, the abduction can be described as undercoded. As with the second interpretation, the key finding is in the case. A verification of the case cannot be found in the present lesson. A verification would be difficult for the learners since it requires deep mathematical insight: They would need to develop a strategy such as “if x is Tanja’s winning number, then every number y with y < x is Tanja’s winning number.” This requires a functional comparison of the gradients, the monotonicity behavior, etc., for example, in the form of graphs or tables. However, the learners lack the necessary language to be able to explicate such thoughts.
As with the previous interpretation, the given calculation examples could be seen as an attempt to verify the case.
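The monotonicity behind the third interpretation can be made explicit. The sketch below assumes, as in Fig. 12, that Tanja's result is 100 - x and Oliver's is 30 + x. The symbol d for the gap between the two results is introduced here only for illustration.

```latex
\[
d(x) = (100 - x) - (30 + x) = 70 - 2x ,
\]
\[
y < x \;\Longrightarrow\; d(y) - d(x) = 2(x - y) > 0 \;\Longrightarrow\; d(y) > d(x) .
\]
```

Hence, if x is a winning number for Tanja, that is, d(x) > 0, then every y < x is a winning number too, which is the strategy quoted above.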
Excursion: Remarks on the Ambiguity of Utterances
From an ethnomethodological point of view, one reason for the fundamental ambiguity of utterances is the “et cetera rule” (Cicourel, 1972), which explains why
many elements of an utterance remain implicit in interpersonal interaction. Speakers do not express everything explicitly because they expect the listeners to think along and “make sense of what was said.” It is the task of the listener to fill in the missing parts. A public utterance (in this situation, one made among Marlene and her classmates) represents a result for those involved in the lesson, which has to be interpreted. They have to infer the contents that were meant but not explicated from their background knowledge. The abductions necessary for the interpretation depend on background knowledge, which in turn differs from individual to individual. As a result, misunderstandings and ambiguities are inevitable. So, the speakers enable an understanding of their utterances that goes beyond their own. While one learner may prefer a more in-depth discovery, another learner may be satisfied with a superficial discovery. Both discoveries can be discussed in classroom communication. Therefore, the ambiguity of utterances can also have a positive effect on the course of the lesson. However, an ambiguity caused by implicit utterances does not have to be consciously created. Especially when learning through discovery, the learner is often at the “front of his or her knowledge.” In such situations, it might be difficult for learners to express their own thoughts or to formulate them in accordance with the presumed expectations of the teacher or classmates. In addition, the words that would make it possible to express discoveries are often unknown or unclear. Furthermore, it is also possible that the speaker has performed several abductions and cannot decide which one should be presented publicly. In this case the public statement could contain a mixture of these insights.
If, however, it is difficult to formulate hypotheses or to choose between them during discovery-based learning, it is also difficult for the interpreter (be it a fellow learner or a researcher) to understand the learner and the potential mathematical content in detail. If one wants to avoid ambiguities, both the abductions necessary to solve the task and (building on this) the abductions necessary to understand an utterance would have to be overcoded as much as possible. However, since it would be more or less obvious how to solve the task, it would hardly be a question of “discoveries” anymore. Instead of viewing ambiguities simply as shortcomings, one should recognize them as positive outcomes of discovery-based learning.
Conclusion
In the first part, abduction was highlighted as the decisive inference for the generation of knowledge. Contrary to popular belief, abduction, and not induction, is necessary for gaining new knowledge. A logical description of discovery learning can therefore succeed only with this inference. If the goal of discovery learning is to find something new that was not previously known by looking at given data, it cannot be done with deduction or induction alone. Voigt writes accordingly:
If one disregards the fashionable and romantic notion that discovery-based learning is all what is didactically well-intended, and attempts a logical Peircean analysis, it quickly becomes apparent that discovery-based learning essentially cannot be described as an induction, but has to be described as an abduction. (Voigt, 2000, p. 696; translated)
Conversely, not every type of abduction is associated with this type of learning. Eco’s creative, undercoded, and overcoded abductions make it possible to characterize discovery learning using certain types of abductions. Even in guided instructional lessons, learners are able to carry out abductions (e.g., those already given in perception). But in these abductions, the use of certain rules is more or less obvious because the rules are given, and, hence, the corresponding abductions would be overcoded. Deeper insights are in need of surprising results. Peirce writes:
Every inquiry whatsoever takes its rise in the observation [ . . . ] of some surprising phenomenon [ . . . ]. (CP 6.469)
This level of surprise is limited in the case of overcoded abductions. Thus, undercoded and, above all, creative abductions seem to be predestined for discovery-based learning. This impression is substantiated when considering the other types of abduction. Since the assignment of the known rule to the phenomenon is more or less automatic in the case of overcoded abductions, one can speak of a discovery performance only to a limited extent. Undercoded abductions start with an observed phenomenon that is not previously known as a consequence of a specific rule. By associating the rule with the phenomenon, new relationships are discovered (the phenomenon as a result of the rule). In creative abductions, a new rule is created or discovered. In other words, not only is a case discovered, but the discovery of the case coincides with, and is enabled through, the discovery of the rule, which is not previously known in creative abductions. Thus, the designation of a finding as a type of discovery depends on the type of abduction. Since the classification of an abduction as undercoded or creative is only possible relative to the cognitive system of reference, the dependence of discoveries and subjects described by Bruner becomes clear. Discoveries are in need of abductions
(a) which take place on the basis of knowledge of a rule, whereby certain results are not yet conceptually assigned to the rule, or
(b) where a rule is previously unknown.
Correspondingly, both a (concrete) case and a (general) rule can be formed anew in discovery learning and thus have a hypothetical character. The latter also applies to the relationship between rule and result, which is discovered in both types of abduction. With the identification of creative and undercoded abductions as decisive inferences in discovery learning, a relatively plausible differentiation is offered
in the theoretical part of this work. However, the reconstruction of abductions points to a more difficult connection between abductions and discoveries. The empirical reconstruction of an abduction in which the rule is more or less obvious (overcoded abduction) shows the importance of recognizing the case: It was by no means easy for the student Marlene to recognize a case as a legitimate case of a rule. In order for the theoretical distinction between creative, under- and overcoded abductions to remain an applicable tool to describe discovery-based learning, it is proposed to broaden the term “undercoded abduction”: An abduction is considered undercoded if the association of the known rule is not obvious or if the case is not obviously a concrete element of the antecedent of the rule. Accordingly, it can be reformulated: Discoveries are in need of abductions
(a) which take place on the basis of knowledge of a rule, whereby certain results or certain cases are not yet conceptually assigned to the rule, or
(b) where a rule is previously unknown.
The types of abduction capture fundamental differences in the nature of discoveries. However, they cannot always describe the quality of an abduction. The second empirical reconstruction shows under- and overcoded abductions that suggest the generation of deeper mathematical insights compared to those of a creative abduction. For a finer qualitative differentiation between discoveries, distinguishing what is “deeper” from what is “superficial” can supplement the classification by type of abduction. The following criteria can be used to characterize a discovery:
1. Is the rule of abduction known or unknown? This criterion is used to differentiate between undercoded and overcoded abductions (where the rule is known) and creative abductions (where the rule is unknown).
Whether a rule is known or not depends on the cognitive horizon of the individual (and more precisely on the horizon of the cognitive system involved).
2. Is the well-known rule more or less obvious or not? With this criterion, a distinction can be made between overcoded abductions (where the rule follows more or less automatically) and undercoded ones (where the rule is not obvious). This criterion of the “obviousness” of the rule must also be seen relative to the horizon of the individual concerned. If the rule seems almost obvious but the case cannot easily be regarded as a case of the rule, it is an undercoded abduction (see above).
3. Does the knowledge gained remain on the surface of what is perceptible, or does it penetrate deep into a mathematical structure?
This criterion is used for the mathematical-qualitative description of an abduction and is independent of the above two criteria. In the case of “superficial” abductions, the knowledge based on the given results remains attached to the “surface of the perceptible.” “Deeper” abductions penetrate deeper into a mathematical structure. In the empirical reconstructions, this criterion is primarily used to separate deeper mathematical insights from the tentative generalization of singular occurrences to a new general rule. While the above criteria can be used to characterize discoveries, different forms of abductions can be differentiated in terms of the content of the observed phenomenon. It makes a difference if an abduction starts from just one example of numbers or from a set of examples. Also, the elaboration of the observed phenomenon can contain hints for the later proof of the (generated) rule. These are just some aspects that indicate a possible variety of options for initiating discoveries in (school) mathematics (for a more detailed view, please refer to Meyer (2018)).
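The three criteria can be read as a small decision procedure. The following sketch is one possible reading, not the author's formalization: the class name, the field names, and the boolean encoding are assumptions made for illustration, and in practice each answer is relative to the cognitive horizon of the individual learner.

```python
from dataclasses import dataclass

@dataclass
class Abduction:
    rule_known: bool    # criterion 1: is the rule known to the learner?
    rule_obvious: bool  # criterion 2: does the known rule suggest itself?
    case_obvious: bool  # is the case obviously an instance of the rule's antecedent?
    deep: bool          # criterion 3: does the insight reach a mathematical structure?

def classify(a):
    """Return (type, depth) following the criteria as read in this sketch."""
    if not a.rule_known:
        kind = "creative"
    elif a.rule_obvious and a.case_obvious:
        kind = "overcoded"
    else:
        kind = "undercoded"  # includes the broadened sense (case not obvious)
    depth = "deeper" if a.deep else "superficial"
    return kind, depth

# First interpretation: a new, very specific rule with little insight.
first = classify(Abduction(rule_known=False, rule_obvious=False,
                           case_obvious=True, deep=False))
# Second interpretation: rule given by the game, but the case is hard to see.
second = classify(Abduction(rule_known=True, rule_obvious=True,
                            case_obvious=False, deep=True))
print(first)   # ('creative', 'superficial')
print(second)  # ('undercoded', 'deeper')
```

Keeping the depth flag separate from the type reflects the statement that the third criterion is independent of the first two.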
Appendix
Transcription
1. Paralinguistic signs
,  a short stop while speaking, max. 1 second
..  a short break, max. 2 seconds
...  a short break, max. 3 seconds
sure-  the voice lingers on at the end of a word or a comment
sure’  a rise of voice, marked at the end of the word
sure  emphasis has been placed on this word
sure  word spoken with a drawl
2. Other characterizations
(..)  vague, but assumed words
(shows)  characterization of body language and facial expressions
A row starts at the end of the last word of the previous statement: noticeably quick follow-up, e.g.:
M: why that
           F: therefore
References
Aliseda, A. (2006). Abductive reasoning. Logical investigations into discovery and explanation. Springer.
Andreewsky, E. (2000). Abduction in language interpretation and law making. Kybernetes, 29(7–8), 836–845.
Blumer, H. (1986). Symbolic interactionism. Perspective and method. University of California Press.
Bruner, J. S. (1960). The process of education. Harvard UP.
Bruner, J. S. (1961). The act of discovery. Harvard Educational Review, 31, 21–32.
Cicourel, A. V. (1972). Basic and normative rules in the negotiation of status and role. In D. Sudnow (Ed.), Studies in social interaction (pp. 229–258). Free Press.
Eco, U. (1983). Horns, hooves, insteps. In U. Eco & T. A. Sebeok (Eds.), The sign of the three. Dupin, Holmes, Peirce (pp. 198–220). Indiana UP.
Fann, K. T. (1970). Peirce’s theory of abduction. Nijhoff.
Garfinkel, H. (1967). Studies in ethnomethodology. Prentice-Hall.
Gentner, D. (1983). Structure mapping: A theoretical framework for analogy. Cognitive Science, 7, 303–310.
Hempel, C. G. (1977). Aspekte wissenschaftlicher Erklärung (translation: Aspects of scientific explanation). de Gruyter.
Hempel, C. G., & Oppenheim, P. (1948). Studies in the logic of explanation. Philosophy of Science, 15(2), 135–175.
Hoffmann, M. (1999). Problems with Peirce’s concept of abduction. Foundations of Science, 4(3), 271–305.
Hoffmann, M. (2002). Erkenntnisentwicklung. Ein semiotisch-pragmatischer Ansatz (translation: Development of cognition. A semiotic-pragmatic approach). Philosophische Fakultät der Technischen Universität.
Hoffmann, A. (2003). Elementare Bausteine der kombinatorischen Problemlösefähigkeit (translation: Elementary bricks of combinatory problem solving ability). Franzbecker.
Hoffmann, M. (2003). “Entdeckendes Lernen” – semiotisch gefasst (translation: Discovery learning – A semiotical perspective). In Beiträge zum Mathematikunterricht (translation: Contributions to mathematics teaching) (pp. 305–308). Franzbecker.
Hoffmann, M. (2007). Seeing problems, seeing solutions. Abduction and diagrammatic reasoning in a theory of scientific discovery. In O. Pombo & A. Gerner (Eds.), Abduction and the process of scientific discovery (pp. 213–236). CFCUL/Publidisa.
Kollosche, D. (2017). Entdeckendes Lernen: Eine Problematisierung (translation: Discovery learning: A problematization). Journal für Mathematikdidaktik, 38, 209–237.
Körner, C., & Meyer, M. (2022). An interpretative analysis of generalization processes using abduction theory: The case of addition with zero (CERME12). Bozen.
Krummheuer, G. (1995). The ethnography of argumentation. In P. Cobb & H. Bauersfeld (Eds.), The emergence of mathematical meaning: Interaction in classroom cultures (pp. 229–270). Erlbaum.
Kunsteller, J. (2018). Ähnlichkeiten und ihre Bedeutung beim Entdecken und Begründen. Sprachspielphilosophische und mikrosoziologische Analysen von Mathematikunterricht (translation: Similarities and their importance in discovering and justifying. Language game-philosophical and micro-sociological analyzes of mathematics lessons). Springer.
Liljedahl, P. (2005). Mathematical discovery and affect: The effect of AHA! Experiences on undergraduate mathematics students. International Journal of Mathematical Education, 36(2–3), 219–234.
Magnani, L. (2001). Abduction, reason and science. Kluwer.
Meyer, M. (2007). Entdecken und Begründen. Von der Abduktion zum Argument (translation: Discovering and verifying. From abduction to argument) (1st ed.). Franzbecker.
Meyer, M. (2009). Abduktion, Induktion – Konfusion. Bemerkungen zur Logik der interpretativen Sozialforschung (translation: Abduction, induction – Confusion. Remarks on the logic of interpretative research). Zeitschrift für Erziehungswissenschaft, 2, 302–320.
Meyer, M. (2010). Abduction – A logical view of processes of discovering and verifying knowledge in mathematics. Educational Studies in Mathematics, 74, 185–205.
Meyer, M. (2018). Options of discovering and verifying mathematical theorems – task-design from a philosophical-logical point of view. Eurasia Journal on Mathematics, Science and Technology Education, 14(9). https://doi.org/10.29333/ejmste/92561
Nagl, L. (1992). Charles Sanders Peirce. Campus.
Nelson, L. (1949). Socratic method and critical philosophy. In Selected essays (T. K. Brown, Trans.). UP.
Nührenbörger, M., & Schwarzkopf, R. (2016). Processes of mathematical reasoning of equations in primary mathematics lessons. In N. Vondrová (Ed.), Proceedings of the 9th congress of the European Society for Research in mathematics education (CERME 9) (pp. 316–323). ERME.
Peirce, C. S. CP, Collected papers of Charles Sanders Peirce. Harvard UP. Volumes I–VI edited by C. Hartshorne & P. Weiss, 1931–1935; Volumes VII–VIII, edited by A. W. Burks, 1958, quotations according to volume and paragraph.
Peirce, Ch. S. (1986). LOS, Historical perspectives on Peirce’s logic of science: A history of science (2nd ed.) (C. Eisele, Ed.). Mouton.
Polya, G. (1954a). Mathematics and plausible reasoning. Volume I: Induction and analogy in mathematics. University Press.
Polya, G. (1954b). Mathematics and plausible reasoning. Volume II: Patterns of plausible inference. University Press.
Polya, G. (1973). How to solve it. A new aspect of mathematical method (2nd ed.). University Press.
Popper, K. R. (1989). Logik der Forschung (translation: Logic of science) (9th ed., Original: 1934). Mohr.
Reid, D. A. (2004). Forms and uses of abduction. In M. A. Mariotti (Ed.), Proceedings of CERME 3, Bellaria.
Rescher, N. (1995). Peirce on abduction, plausibility, and the efficiency of scientific inquiry. In N. Rescher (Ed.), Essays in the history of philosophy (pp. 309–326). Avebury.
Rey, J. (2021). Experimentieren und Begründen. Naturwissenschaftliche Denk- und Arbeitsweisen beim Mathematiklernen (translation: Experimenting and justifying. Scientific ways of thinking and working when learning mathematics). Springer.
Ritter, S., & Dijksterhuis, A. (2014). Creativity – The unconscious foundations of the incubation period. Frontiers in Human Neuroscience, 8, Art. 215.
Rivera, F., & Becker, J. (2007a). Abduction – Induction (generalization) processes of elementary majors on figural patterns in algebra. The Journal of Mathematical Behavior, 26, 140–155.
Rivera, F., & Becker, J. (2007b). Abduction in pattern generalization. In W. Jeong-Ho, L. Hee-Chan, P. Kyo-Sik, & J. Dong-Yeop (Eds.), Proceedings of the 31st conference of the international group for the psychology of mathematics education (Vol. 4, pp. 97–104). Korea Society of Educational Studies in Mathematics.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Söhling, A. (2017). Problemlösen und Mathematiklernen. Vom Nutzen des Probierens und des Irrtums (translation: Problem solving and mathematics learning. On the use of trial and error). Springer.
Stegmüller, W. (1976). Hauptströmungen der Gegenwartsphilosophie (translation: Main currents in contemporary German, British, and American philosophy) (Vol. I). Kröner.
Timmermans, S., & Tavory, I. (2012). Theory construction in qualitative research: From grounded theory to abductive analysis. Sociological Theory, 30(3), 167–186.
Voigt, J. (1984). Interaktionsmuster und Routinen im Mathematikunterricht. Theoretische Grundlagen und mikroethnographische Falluntersuchungen (translation: Patterns of interaction and routines in mathematics lessons. Theoretical foundations and micro-ethnographic case studies). Beltz.
Voigt, J. (2000). Abduktion (translation: Abduction). In Beiträge zum Mathematikunterricht (translation: Contributions to mathematics teaching) (pp. 694–697). Franzbecker.
Wittmann, E. Ch. (1994). Wider die Flut der “bunten Hunde” und der “grauen Päckchen”: Die Konzeption des aktiv-entdeckenden Lernens und des produktiven Übens (translation: Against the flood of “colored dogs” and “gray packages”: The concept of active-discovery learning and productive practice). In E. Ch. Wittmann, & G. N. Müller (Eds.), Handbuch produktiver Rechenübungen. Band 1. Vom Einspluseins zum Einmaleins (translation: Handbook of productive arithmetic exercises. Volume 1. From the one-oh-one of addition to the one-oh-one of multiplication) (2nd ed., pp. 157–171). Klett.
Abduction and Creativity in Mathematics
28
Paul Ernest
Contents
Introduction
Abduction
School Mathematics
Meaning and Abduction
Abduction in Practice
Abduction in Mathematics Education Research
Abduction and the Philosophy of Mathematics
Conclusion
References
Abstract
This chapter reviews current thinking on abduction and creativity in mathematics. Mathematics uses abduction to generate explanatory hypotheses and novel concepts and theories. This is recognized in the emergent philosophy of mathematical practice, following on from the groundbreaking work of Imre Lakatos. His linkage of creativity with proof is leading to the breakdown of the traditional rigid separation of the contexts of discovery and justification. The role of meaning in mathematics also crosses this divide, and links are made here between abduction and the meaning theories of Wittgenstein and Brandom. Creativity is a topic that connects the domains of school mathematics and research mathematics too. One such connection is that schooling represents the conduit for the development of new creative mathematicians. School mathematics is traditionally made up of routine exercises but there are moves to try and bring more creativity into it. The aims are to foster deeper feelings of understanding, and the excitement and the joy felt by mathematicians, as well as to enhance education for the benefit of all, including developing more mathematicians. Overall, it can be said that Peirce’s ideas of abduction offer a deeper understanding of creativity, both in school and research mathematics.
Keywords
Mathematical creativity · Meaning theory · Discovery in mathematics · Creativity in school mathematics · Abduction in mathematics
P. Ernest, University of Exeter, Exeter, UK
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_38
Introduction
There is a growing literature on the importance of abductive reasoning in the philosophy of science, cognitive science, and mathematics and science education. This stems from the work of the great philosopher and logician Charles Sanders Peirce, who fathered the idea. Peirce was concerned to offer an overall scheme for categorizing logic. In his search for completeness, Peirce differentiated three forms of logical reasoning: deduction, induction, and abduction. A brief sketch of these, as they pertain to mathematics, is as follows. Deduction is well understood and concerns the rules for drawing necessary conclusions from premises. There are many systems of deductive logic, from Aristotle’s syllogistic forms of reasoning to modern systems of mathematical logic including natural deduction. Induction has two meanings. Mathematical induction is a form of deductive reasoning limited in its application to mathematics. Mathematical induction has the following form. Let P stand for any property of the natural numbers. If it can be proved, for every natural number k, that from the assumption that P(n) holds for all values of n below k, P(k) also holds, then for all natural numbers n, P(n) is true. Phrased in this way, the principle of mathematical induction does not require the separate premise P(0), as it does in some versions, because it is included in the premise stated above: for k = 0 the assumption is vacuous, so P(0) has to be proved outright. In contrast, in its other meaning, induction is a scheme for coming up with a hypothesis or conjectured general rule from the observation of a finite number of cases. For example, on observing that 1 + 3 = 4, 1 + 3 + 5 = 9, and 1 + 3 + 5 + 7 = 16, the process of conjecturing the hypothesis that 1 + 3 + 5 + . . . + (2n + 1) = (n + 1)² is an instance of induction. The fact that this general formula correctly describes the pattern of a finite number of cases in a generalization is no guarantee that it holds for all further cases.
In fact, in mathematics, if we have a finite sequence of numbers, as in this example, then it can be continued in an unlimited number of different ways. However, the induced generalization in this case is perhaps the simplest and most elegant way of generalizing the initial sequence. This process, the induction of a generality from a sequence of examples, is a strategy or heuristic, a pattern of plausible reasoning. As such, there is no guarantee
that the induced formula will be correct. Of course the schema of induction is also widely employed in science, and informally in language acquisition and concept formation, but always as a suggestive or speculative technique and not as a means of generating validated knowledge. There have been attempts to formalize induction as a logical principle (see, e.g., Lakatos, 1968). However, it is widely accepted that induction is not a reliable foundation for science (Popper, 1959). A common example quoted in the philosophy of science concerns black swans. The observation of any number of endemic swans in the United Kingdom may suggest the inductive generalization that all swans are white. However, black swans are found in Australia and have now also been imported to the United Kingdom. So, the generalization “All swans are white,” derived by induction based on a finite number of observations in the United Kingdom, is false.
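The inductive conjecture about sums of odd numbers discussed above can be explored numerically. The sketch below is illustrative only: the function names are hypothetical, and the rival formula is constructed here simply to show that a finite list of cases can be continued in more than one way.

```python
def odd_sum(n):
    """Sum of the odd numbers 1 + 3 + ... + (2n + 1)."""
    return sum(2 * k + 1 for k in range(n + 1))

def conjecture(n):
    """Inductively conjectured closed form (n + 1)^2."""
    return (n + 1) ** 2

def rival(n):
    """A hypothetical rival rule that also fits the cases n = 1, 2, 3."""
    return (n + 1) ** 2 + (n - 1) * (n - 2) * (n - 3)

cases = (1, 2, 3)
print([odd_sum(n) for n in cases])          # [4, 9, 16]
print([conjecture(n) for n in cases])       # [4, 9, 16]
print([rival(n) for n in cases])            # [4, 9, 16]
print(odd_sum(4), conjecture(4), rival(4))  # 25 25 31

# Agreement on many further cases supports the conjecture but is not a proof.
print(all(odd_sum(n) == conjecture(n) for n in range(1000)))  # True
```

Both formulas fit the three observed cases, and only the fourth case separates them. Even the check over a thousand cases remains plausible reasoning; a proof would require deduction, for example by mathematical induction.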
Abduction
Peirce’s third form of reasoning is abduction. Like induction this is a form of plausible reasoning, one that does not guarantee correctness of outcomes. Peirce viewed abduction as the only logical operation which introduces any new idea into reasoning. Thus, evidently, he did not view induction as a logical operation on a par with deduction and abduction. For it involves a mysterious leap, coming up with something not preformed or even anticipated in the “reasoning.” Peirce gave differing accounts of abduction at different times in his long career. According to Fann (1970, p. 9), “most writers on Peirce’s theory of abduction divide Peirce’s thought roughly into two periods.” These consist of his early and middle work versus his later work, from about 1900 onward. The earlier period approaches abduction in terms of theorizations based on the logical form of the reasoning, and there is an emphasis on abduction as a probabilistic form of syllogism. At that time, he referred to abduction as “hypothesis” and characterized it by this syllogism (Peirce, 1867, p. 285):
Hypothesis
Any M is, for instance, P′ P″ P‴, etc.
S is P′ P″ P‴, etc.;
∴ S is probably M.
Here S is the subject, a specific case of interest, and P′, P″, P‴ are a number of characteristics of S. The word “probably” in the conclusion indicates a key characteristic of abduction. The argument gives the conclusion plausibility, but not certainty. (Reid, 2018, p. 2).
Note that in this example, if M is fully characterized by a finite list of properties, P′, P″, P‴, etc., and S satisfies this list as well, then one may correctly conclude that S is M. This earlier version of abduction was published when Aristotle’s syllogistic was still the dominant form of recognized logical reasoning. At the time the syllogism was the theoretically most developed theory of logical reasoning, although it
was not that much used in mathematics. Syllogistic reasoning is not subtle or complex enough to deal with the full range of logical inferences in either sentential or predicate forms. However, in the intervening period between Peirce’s early publications on the topic (1860s) and his later publications (roughly 1900 onward), there were major developments in logic. Starting with Frege’s (1879) Begriffsschrift and including parallel work by Peirce himself, Schröder, Peano, and others, mathematical logic emerged as a theory in its own right. In this quick overview I have ignored developments in logic such as Boole’s The Laws of Thought of 1854. This, although important, did not provide a full logic adequate for expressing mathematical reasoning and was still dominated by Aristotle’s syllogistic, according to Corcoran (2003). In 1902 Peirce wrote that he now regarded the syllogistic forms and the doctrine of extension and comprehension (i.e., objects and characters as referenced by terms) as being less fundamental than he had earlier thought:
According to my own principles, the reasoning with which I was there dealing could not be the reasoning by which we are led to adopt a hypothesis, although I all but stated as much. But I was too much taken up in considering syllogistic forms and the doctrine of logical extension and comprehension, both of which I made more fundamental than they really are. As long as I held that opinion, my conceptions of Abduction necessarily confused two different kinds of reasoning. (Peirce, 1902, p. 102)
Peirce’s earlier work on abduction was very much based on Aristotelian syllogistic, which he rejected in his mature understanding. Therefore, this chapter will concentrate on his later work on abduction which has more direct bearing on mathematics. The main interest of the concept of abduction for mathematics, other than historical, is its intended role as a logic of discovery and not as a form of logical reasoning. In Peirce (1903, pp. 188–189), he offered the following form for abduction:
The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.
Thus, abduction is a pattern of plausible reasoning, a strategy, if not for generating, then for selecting possible hypotheses to explain a surprising fact. In mathematics this could be a shared pattern in a sequence of examples or a surprising relationship between concepts. Hintikka (2007, p. 38), drawing on the work of others, offers a philosophical analysis. He characterizes the Peircean concept of abduction rather thoroughly, in terms of the following four theses. Inferential Thesis: Abduction is, or includes, inferential process, or processes. Thesis of Purpose: The purpose of “scientific” abduction is both to generate new hypotheses, and to select hypotheses for further examination. Hence, a central aim of such abduction is “to recommend a course of action”. Comprehension Thesis: Scientific abduction includes all the operations whereby theories are engendered. Autonomy Thesis: Abduction is, or embodies, reasoning that is distinct from, and irreducible to, either deduction or induction.
28 Abduction and Creativity in Mathematics
These seem like clear and valuable theses. From the perspective of mathematics, the Thesis of Purpose can be extended beyond the generation of new hypotheses with the additional idea of problem solving, theoretical development, and other mathematics-specific courses of action. Shank (1988) regards abduction as an essential part of human thinking and reasoning in many of its forms. He argues that the abductive imperative has shifted away from its role as a key principle in human inquiry and turned into a methodological procedure. He coined the term “the law of juxtaposition” to describe what he regards as one of the most interesting models of abductive inquiry. This law simply states, in its general form, that human beings will inevitably be drawn to attempt to reconcile, by abduction, any two juxtaposed items into a meaningful and conclusive third. Thus, according to Shank, the law of juxtaposition states that for any conscious sign using being, there exists an imperative to resolve any two signs just juxtaposed in experience in terms of a consequential third. By signs he means such possible objects as observations, entities, experiences, ideas, texts, units, understandings, beliefs, claims, clues, rules, things, happenings, and so on. This provides him with a powerful unity, thus linking different methods of inquiry. For example, he argues that the following all share this form. 1. A topic juxtaposed with a sign or concept produces a metaphor as a consequence. 2. A rule juxtaposed with a case yields a result as a consequence. 3. A signal juxtaposed with any innate release yields a user behavior as a consequence. 4. A subject juxtaposed with a predicate yields a proposition. 5. A term juxtaposed with the context yields a connotation as a consequence. 
Every juxtaposition described here involves an abductive imperative where some implicit rule links the two new components of the juxtaposition so as to describe a third, the consequence that renders the juxtaposition itself meaningful. The juxtapositions described are in turn (1) metaphoric, (2) deductive syllogistic reasoning, (3) reflexive behavior, (4) propositional formation, and (5) comprehension. According to Shank, these constitute different forms of abduction all using the same general methodological principle. Shank is a semiotician who looks out from the use of signs to the whole panoply of human understandings and knowledge. This is a fully divergent view of abduction. Another semiotician interested in the nature and role of abduction is Umberto Eco. In contrast with Shank, Eco looks inward into the different types of logical forms that abduction can take. Or rather he looks at the different circumstances during which an abductive hypothesis emerges. Eco (1983) provides an analysis of different variants of the logical form of abduction, according to the nature and role of the rule abducted. Eco terms the three kinds of abduction that he identifies overcoded, undercoded, and creative. Overcoded abduction occurs when the proposer only has access to one rule from which the case under consideration follows.
What Eco terms “undercoded abduction” occurs when the abductive selection can be made from multiple general rules. Here a single rule is to be selected, presumably the one considered optimal in the specific context. Eco illustrates this with the case of Kepler who in order to explain astronomical observations was selecting from among several types of planetary trajectories. These needed to be closed curves, like the circle, ellipse, or ovoid, because of the cyclical recurrence of planets in their orbits. This example is drawn from Peirce’s own illustration. He argued that the number of possible path types for a moving planet is finite, so one curve must be chosen from this range of possibilities. These two types of abductions, the overcoded and the undercoded, are termed selective abductions by Magnani (2001). In selective abduction the right or best explanatory hypothesis is drawn from a given set of possible explanations. The third type of abduction occurs when there is no known rule that would produce the observed result. Eco (1983) calls this type “creative abduction” because the reasoner must invent or discover a new rule. In attempting to find a foundation for (what he terms) the extraordinary ordinary types of reasoning used in psychological processes, Shank (1998) analyzed different forms of abductive reasoning. He identified six modes of abductive reasoning that are used in cognition across the full range of human practices and activities: They are (a) reasoning to the omen or hunch; (b) reasoning to the clue; (c) reasoning to the metaphor analogy; (d) reasoning to the symptom; (e) reasoning to the pattern; and (f) reasoning to the explanation. Three of these modes of abductive reasoning, have played such an important historical role in the history of empirical inquiry that they can be identified as the basis of three venerable modes of abductively guided empirical enquiry. These so called skills are detection, diagnosis and divination. (Shank, 1998, pp. 
848–849)
Most commonly these modes of reasoning are identified as the special techniques of detectives, hunters, doctors, and other professionals seeking to diagnose or identify some general characteristic from a sign or clue. Shank’s contention is that these modes of reasoning are far more widely applied than deduction or induction which are given pride of place in accounts of logic and its philosophy. Briefly, abduction rules! As befits a mode of reasoning on a par with deduction and induction, the literature on abduction ranges far and wide over the whole territory of human reasoning and understanding. The great virtue of abduction is that it is a means or vehicle for creativity. While this is true for all domains of human creativity, my particular concern here is with mathematics. Within mathematics it would seem that the most valuable role of abduction is to offer a possible analysis of creativity.
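Peirce’s later schema and Eco’s three kinds of abduction, as described above, can be loosely illustrated in a few lines of Python. This is only a sketch under stated assumptions: the toy rule base, the function names, and the idea of classifying an abduction by counting the known rules that would account for the surprising fact are illustrative inventions, not part of Peirce’s or Eco’s accounts. In particular, creative abduction is merely flagged here, since inventing a new rule is exactly what no such lookup can do.

```python
from typing import Dict, List, Tuple

# Hypothetical toy rule base: each hypothesis A is mapped to the facts C that
# would be "a matter of course" if A were true. Invented for illustration only.
RULES: Dict[str, List[str]] = {
    "orbit is a circle": ["planet returns to the same position"],
    "orbit is an ellipse": ["planet returns to the same position"],
    "orbit is an ovoid": ["planet returns to the same position"],
    "n-th term is n*n - 1": ["terms begin 0, 3, 8, 15, 24"],
}


def candidates(rules: Dict[str, List[str]], fact: str) -> List[str]:
    """Return every known hypothesis A under which the fact C would follow."""
    return [a for a, facts in rules.items() if fact in facts]


def classify(rules: Dict[str, List[str]], fact: str) -> Tuple[str, List[str]]:
    """Label the abduction by how many known rules account for the fact."""
    found = candidates(rules, fact)
    if len(found) == 1:
        return "overcoded", found   # exactly one rule is available
    if found:
        return "undercoded", found  # a selection among several rules is needed
    return "creative", found        # no known rule: a new one must be invented


if __name__ == "__main__":
    print(classify(RULES, "planet returns to the same position"))
    print(classify(RULES, "terms begin 0, 3, 8, 15, 24"))
    print(classify(RULES, "an unexplained regularity"))
```

The first two outcomes correspond to what Magnani (2001) terms selective abduction; the third only signals that the reasoner is on their own.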
Creativity in Mathematics
Abduction, incorporating all three of Eco’s (1983) types, constitutes one strand of creativity in mathematics, admittedly an important strand. But the more general question is: what constitutes creativity in mathematics? Devlin (2019) analyzed responses to an informal survey of opinions on mathematical creativity conducted among mathematicians and mathematics educators. The definition that he came up with, which summarized these responses, is that
mathematical creativity is non-algorithmic decision-making. In order to make sense of this answer, it must be contextualized and expanded beyond this very minimal definition. Obviously, it pertains to mathematical activity and is about the deployment of mathematical knowledge and skills in tackling a task or problem. Unpacking it, it would seem that by non-algorithmic Devlin means that it is not an automatic mathematical activity. In other words, it refers to thoughtful application of mathematical skills in problem solving. Devlin’s second term is decision-making. Here he means that it is a metacognitive activity involving a deliberate and conscious selection of which of one’s own skills and knowledge to apply in any solution process. Here, in a nutshell, Devlin successfully identifies key elements of creativity in mathematics. However, it is not yet clear whether he is describing the successful student of mathematics or the practice of the research mathematician. Perhaps it applies to both. In discussing mathematical creativity, it is useful to distinguish activities in the teaching and learning of mathematics, that is, in mathematics education, from the activities of the working mathematician. Within mathematics education research, the main focus is on defining and fostering creativity in school mathematics. Thus, the focus is often on moving away from routinized mathematical activities toward responses to more difficult mathematical problems set by the teacher (or even self-set). As Devlin’s definition suggests, the creativity resides in non-algorithmic decision-making. In other words, the student must choose for themselves which concepts, skills, algorithms, or strategies to apply in trying to solve any given problem. As with abduction, the concepts and skills or missing hypotheses must be provided by the student and are not automatically given within the task. 
However, the novelty in the student’s solutions must be judged with respect to the set of concepts, skills, methods, and algorithms that the student has been previously taught to use with some automaticity. For the creativity resides in the decision-making about what skills and tools to apply in the solution of any given problem, as well as in the overall implementation and self-evaluation of the solution produced. Creativity is not involved in the application of any automatic algorithmic process cued by the problem type. Any automaticity will reflect the student’s mastery of some of the skills and algorithms learned up to that point in their course of study, which will then be drawn upon as resources to be used in problem solving. For example, working 17 + 21 for younger children and solving x² + 3x + 2 = 0 for high school students can be automatic given the appropriate teaching. In contrast, finding and continuing the pattern in the sequence 0, 3, 8, 15, 24, . . . and then expressing it algebraically is not normally an automatic task for junior high school students. In fact, solving the latter can be done by abductive reasoning, namely, hypothesizing formulas that generate this pattern. There are heuristic strategies that can direct one toward possible solutions, such as looking at the first-order differences between the terms in the sequence (3, 5, 7, 9, . . . ). Several different ways of proceeding can be envisaged. One can guess what the formula is. Term 1 is 0, so the formula could be n − 1. This gives Term 1 correctly but gives Term 2 the incorrect value 2 − 1 = 1 instead of 3. An observant student might conjecture that the
sequence is one less than the square numbers and conjecture the formula n² − 1. This is indeed the simplest formula that describes these first five terms of the sequence and continues with the same pattern. Typically, this would be the answer intended in a school setting. However, it should be remarked that mathematically there are in fact an unlimited number of formulas that describe this sequence, even if one extends it to k terms, but that differ in value from the given formula for later terms (i.e., for n > k). Thus, what might be accepted in school as “the right answer” is in fact not the only possible answer, just the simplest and, indeed, the “standard” answer. Perhaps it is unnecessary to note that for any function F(n), given the function G(n) = 1 for n < 6 and G(n) = 0 for n > 5, the expression G(n)(n² − 1) + (1 − G(n))F(n) matches the first five terms of the sequence but continues as F(n) thereafter. A more explicit version of this formula can be provided but it has a more complicated form. Thus, potentially any numerical sequence beginning with a finite number of fixed terms can be consistently described and continued by some function in an unlimited number of ways. For a mathematician, finding the formula for the sequence, in the standard answer form, should be a quick intuitive job. But mathematicians also have further algorithms and heuristics at their disposal that could be employed. For example, looking at the first-order differences (3, 5, 7, 9, . . . ) and the second-order differences (2, 2, 2, 2, . . . ) indicates to the knowledgeable that the pattern is given by an expression of degree 2, that is, a quadratic, because the second-order differences are constant.
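The difference heuristic and the point about non-unique continuations can both be checked mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and the gated formula is written as G(n)(n² − 1) + (1 − G(n))F(n), which is one way of realizing the construction described above, with F chosen arbitrarily.

```python
from typing import Callable, List


def differences(terms: List[int]) -> List[int]:
    """Differences between consecutive terms of a sequence."""
    return [b - a for a, b in zip(terms, terms[1:])]


def standard(n: int) -> int:
    """The 'standard' school answer: one less than the n-th square."""
    return n * n - 1


def gate(n: int, k: int = 5) -> int:
    """G(n): 1 for the first k terms, 0 afterwards."""
    return 1 if n <= k else 0


def alternative(n: int, f: Callable[[int], int], k: int = 5) -> int:
    """Agrees with n*n - 1 for n <= k and continues as f(n) thereafter."""
    g = gate(n, k)
    return g * standard(n) + (1 - g) * f(n)


if __name__ == "__main__":
    terms = [0, 3, 8, 15, 24]
    first = differences(terms)    # first-order differences
    second = differences(first)   # second-order differences
    print(first, second)

    # An arbitrary, hypothetical continuation F(n) = 0.
    print([alternative(n, lambda m: 0) for n in range(1, 8)])
```

Constant second-order differences are what point toward a quadratic; the second printout shows a rival formula that reproduces the same five terms and then departs from the standard answer.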
School Mathematics
Creativity in school mathematics is expressed in at least four ways. These are mental agility, problem solving, problem posing and mathematical investigations, and mathematical projects. Each of these involves creativity in different ways, although there are overlaps. Mental agility involves creativity because in performing mental arithmetic, there are a variety of ways of completing a task. This is different from written arithmetic, where often there is a standard algorithm for an operation. However, working out mental arithmetic tasks involves a choice of how to decompose and recompose the numbers to make the calculation easy and convenient to complete. For example, in multiplying 36 by 48, the proximity of 48 to 50 can be exploited (e.g., in the calculative sequence 36 × 48 = 36 × 50 – 36 × 2 = 1800 – 72 = 1728). However, another product might be decomposed differently, for example, 36 × 42 can be calculated as 36 × (40 + 2). This is a different strategy based on the characteristics of numbers involved and the most convenient ways of decomposing them. The creativity in mental arithmetic is based on identifying and exploiting decompositions that make the calculation more accessible and less effortful. This is explained (and conducted and monitored) by the faculty of metacognition, being aware of and regulating one’s own knowledge and mental processes in arithmetic.
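The two decompositions just described can be written out as a short Python sketch. The function names and the choice of returning the intermediate products are illustrative assumptions; the creative step, deciding which nearby round number to exploit, is left to the caller, which is precisely the non-algorithmic part.

```python
from typing import Tuple


def compensate(a: int, b: int, near: int) -> Tuple[int, int, int]:
    """Compute a*b as a*near - a*(near - b), for a round number near b."""
    big = a * near
    correction = a * (near - b)
    return big, correction, big - correction


def split(a: int, b: int, part: int) -> Tuple[int, int, int]:
    """Compute a*b as a*part + a*(b - part), splitting b into two parts."""
    first = a * part
    second = a * (b - part)
    return first, second, first + second


if __name__ == "__main__":
    # 36 x 48 = 36 x 50 - 36 x 2
    print(compensate(36, 48, near=50))
    # 36 x 42 = 36 x 40 + 36 x 2
    print(split(36, 42, part=40))
```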
Perhaps the earliest documented use of mental agility strategies, in terms of children’s development, occurs when children move on from the “count on” strategy to the “count on from larger” strategy (Carpenter et al., 1982). Thus, in computing 3 + 5 young children move from computing 3 (+1 + 1 + 1 + 1 + 1) to reach the answer 8 to computing 5 (+1 + 1 + 1) to reach it. As well as the efficiency gained in counting on from the larger number, this is also part of the process of gaining understanding of commutativity (e.g., 3 + 5 = 5 + 3), an essential element of mental arithmetic.
Problem solving is the process of solving mathematical problems. In school mathematics these are mostly tasks that have been set by a teacher. Problem solving can be conducted in several ways. The simplest is proceeding impulsively by trial and error, using simple strategies that are to hand, one after another, without learning much from the failures. More sophisticated problem solving involves planning and executing strategies more systematically. Polya (1945) offered an influential analysis of problem solving based on four stages. These are understanding the problem, planning a solution strategy, carrying out the plan, and looking back to reflect on the outcome. If the attempted solution does not work or if there is a desire to extend the solution, the cycle can be started again. Polya was reflecting on how he and his professional mathematician acquaintances solved problems in university-level or research mathematics, but his “method” has been widely adopted for school mathematics teaching. Schoenfeld (1992) contrasts the typical pattern of work of novice and expert problem solvers. To do this he analyzed problem-solving activity into six levels or stages of work, building on Polya’s categories. These are read, analyze, explore, plan, implement, and verify.
These correspond to Polya’s stages but include an additional impulsive (unplanned) stage of exploring the problem, corresponding to the impulsive trial-and-error approach mentioned above: Understanding the Problem: Read, Analyze Seeking to Solve Without Plan: Explore Devising a Plan: Plan Carrying Out the Plan: Implement Looking Back: Verify The new category is exploration, meaning seeking to solve without plan. This is something that is neither planned nor recommended (at least not to excess) but is something that is frequently observed. Exploration can be valuable for enriching understanding and providing information on which to base planning. But when this stage of activity is persisted in, typically by novice solvers lacking self-regulation or metacognitive skills, it usually leads to failure. Problem posing and mathematical investigations are enlarged versions of problem solving, but they begin with an extra stage, namely, the posing of the problems to be solved by the proposed solver. In a mathematical investigation, students may be given an area of mathematics to explore and then invited to pose problems for themselves to explore, within it. For example, students can be invited to cut out a range of different sized squares from centimeter square printed card and
explore what plane shapes they can make with them and examine their properties. If one looks at the triangles one can assemble from card squares, one may discover that when the area of the largest square equals the sum of the areas of two smaller squares, the angle opposite the largest square is a right angle (a version of Pythagoras’ theorem). This type of activity includes two aspects of mathematical creativity. First there is the choice of solution strategy, as in problem solving. Second, there is the choice and formulation of the problems to be solved, the problem-posing element. This is a novel dimension of creativity, posing or creating a problem to be solved, or a hypothesis to be tested, which exceeds the marshalling of knowledge and skills to solve a given or existing problem. Mathematical projects are typically activities conducted by a group of students. They engage in a joint (or individual) mathematical project and then build a display of their activities and results to be viewed by others. The project could be, for example, a joint mathematical investigation with a display of the inquiry and its outcomes or a student-conducted survey with the results analyzed and illustrated graphically on a poster. There are two creative dimensions involved. First there is the choice and conduct of the mathematical inquiry. This includes the creativity of problem posing and mathematical investigations. Second, there is the choice of how best to display the results of the project. Communicating mathematical outcomes is also a creative process. It is an activity shared with other areas beyond mathematics. However, picking out the mathematical content to show and the means of showing it are more specifically mathematical. Niss and Højgaard (2019) offer a valuable analysis of competences in mathematics. This includes two main headings. First, there is the ability to ask and answer questions in and with mathematics. 
Second, there is the ability to deal with mathematical language and tools. The first heading includes the main creative dimensions of mathematics discussed above, including thinking mathematically and posing and solving mathematical problems. However, the second heading includes representing mathematics and communicating with and about mathematics, including making use of aids. Thus Niss and Højgaard (2019) attach a considerable weight to creativity in communicating with mathematics, the fourth dimension of creativity listed above. Looking back over these four types of creative activities in school mathematics, only the middle two types of activity, problem solving and mathematical investigations, might involve abduction. If and when abduction is involved in project work, it is of the same types as in problem solving and problem posing, so it need not be counted separately. The strategies or heuristics used in problem solving are perhaps the most important place where abduction is employed in school mathematics, namely, the coming up with a hypothesis to explain a mathematical result or solve a problem. Liljedahl and Sriraman (2006, p. 19) offer a view of mathematical creativity at the school levels parallel to what has been discussed here. They suggest that school creativity in mathematics has two parts:
1. Creativity includes the process that results in unusual (novel) and/or insightful solutions to a given problem or to analogous problems. 2. Creativity also includes the formulation of new questions and/or possibilities that allow an old problem to be regarded from a new angle. Thus, although creativity in school mathematics is broader than the uses of abduction, it employs abduction both in solving problems and in posing problems. These two dimensions of mathematical creativity are evidently also very necessary within research mathematics. There is less research on professional mathematicians’ creativity, and the strategies that they employ, than there is about students of mathematics. However, creative work among professional mathematicians can be expected to use a deeper and more extensive range of strategies for solving problems. After all most school mathematics tasks or problems can be solved in a few minutes or, in the case of the most difficult ones, in a couple of days. The problems addressed by professional mathematicians sometimes remain unsolved for centuries with many mathematicians offering partial solution methods that finally lead to a breakthrough, if any solutions are ever reached. Furthermore, the range and depth of knowledge and techniques used by research mathematicians will clearly be deeper and more complex, by several orders of magnitude. Even the day-to-day problems addressed by mathematicians, as opposed to the great problems of mathematics, such as the Fermat problem (Fermat’s last theorem), are likely to take days, weeks, or months to solve. Sriraman (2004) reviews research on the creativity of mathematicians. He begins with Hadamard’s (1945) studies which constitute one of the earliest systematic investigations of the creativity of mathematicians. Hadamard (1945) suggests a four-stage cycle of creativity, building on the work of Poincare. 
He considers the process of invention to progress through four stages, beginning with “preparation,” passing through “incubation” to “illumination,” and finally reaching “verification.” In following these stages, the mathematician first prepares by immersing themselves in the specific literature related to the problem they are addressing and studying the techniques involved (preparation). Having thought hard about the problem they are trying to solve in the context of relevant knowledge, the mathematician enters what is described as the stage of incubation. Here the mathematician lets the ideas mull over at the back of their mind and lets their unconscious work on the ideas and the problem. The following stage is almost miraculous and is called illumination. It seems like luck when, following an irrational and unconscious process, they gain a sudden awareness, coming to a rule or a model suggesting a solution to the problem. Such a process does not always happen, but when it does, it can arrive like a bolt from the blue, often called the “Aha!” moment. Hence the appropriateness of the term illumination for this stage. There is no guarantee that this sudden insight will solve the problem, but where it does occur, it is commonly reported as resulting in a successful solution. The final stage of verification occurs when the mathematician seeks to verify or prove the result that they have derived by this process. According to
Sriraman (2004), Hadamard’s emphasis on the preparatory stage is at odds with some current theories, which emphasize the illuminatory stage. But it seems that Hadamard’s model still works very well for mathematics. Sriraman interviewed a number of practicing mathematicians and came to the conclusion that this same process and description fits well with their explanations of their own mathematical discoveries or inventions. The illumination stage is where abduction takes place. It is unbidden, but with the prepared stage or arena, the hypothesis springs to mind from some unknown unconscious source and suggests the missing piece that helps to solve the problem. As in many creative processes, the author, artist, or mathematician is mystified as to where the inspiration comes from. Traditionally, the inspiration was attributed to the muses. But these days we are more likely to attribute it to intuition and the workings of the mind at the unconscious level. There is a parallel with Zen Buddhism. The seeker after Satori undergoes a long period of preparation learning meditation techniques, Zen koans and the philosophy of Zen. Engaging in a sustained period of meditational practice is in some ways parallel to the stage of incubation. This may be followed by sudden illumination or Satori in the case of Zen. Achieving this illumination is unpredictable and variable, and some students of Zen may never achieve it. Likewise, some mathematicians will never succeed in solving a problem that they are addressing. Luckily, they can turn to another problem to solve unlike the student of Zen for whom there is only one Satori. But when it does happen, when the mathematician receives the inspiration that solves the problem, it may be like a bolt from the blue. The final stage of the four-phase process is that of verification. Here mathematical creation and the practice of Zen differ. 
For the mathematician enters a convergent phase of working in seeking to verify the result, looking for and constructing a proof. For the enlightened Zen student Satori needs no validation. It is a self-validating experience. Any subsequent, further, or final stage will be divergent, addressing the world with the newfound wisdom. It is during the stage of illumination that the mathematician’s conjectured hypothesis, their abducted result, arrives. We do not know, and we will probably never know, what actually goes on in the unconscious mind to complete the process of abduction. We know how to prepare for it, but there is no guarantee that creativity underpinning abduction will take place. From the literature we know that often mathematical illuminations are accompanied by visual imagery, analogous to and representative of the solution process or the abducted hypothesis sought. In these cases, the mathematician not only has a vision, that is, experiences visual imagery, dynamic or static, but also has the capacity or inspiration to interpret this vision as an analogy for the result or hypothesis. There are a number of well-known cases of this phenomenon including Kekule’s discovery of the carbon ring and Poincare’s solution to the Fuchsian function problem. In exploring mathematical creativity, and creativity in any field, we have to acknowledge that the actual processes of creation are irrational. That is, it is not explicit chains of reasoning that lead to the creative product. Tacit knowledge, the experience acquired and learned through engagement in productive practices, seems
to be what contributes to the creative process, including the abductive emergence of hypotheses and tentative results. Indeed, there seem to be unconscious processes at work in creativity that as yet cannot be pinned down or made explicit, let alone analyzed rationally. One of the problems with an account of mathematical creativity that goes through the four stages of preparation, incubation, illumination, and verification is that it mystifies the process of illumination. If one draws the analogy with the process of building an arch, one can see that much work goes into preparing and building the foundations, erecting the verticals and the curved sides of the arch, and it is only in the last phase that the capstone is inserted in the supporting armature to complete the arch. In the same way, mathematical discovery involves a great deal of preparation and establishment work before the final link can be inserted to hold all the elements together. Inserting the capstone is like the illumination phase where all the parts of the arch are put together into a whole. The subsequent mortaring of the joints is like the verification stage. Although the capstone is the most dramatic and essential part of the building work, it represents at most 5% of the whole process. Likewise, the generation of the hypothesis in a flash of illumination is the most dramatic and visible bit of mathematical creation, but it can only occur after serious and extensive preparatory work. Like an athlete’s medal-winning Olympic performance, which is the most dramatic and memorable part of their career, it does not reflect all the years of preparation and hard work. The illumination phase in mathematical creation, where it occurs, is indeed a marvelous thing. But it is not an indication that the creative mathematician is some kind of specially empowered being, superior to ordinary mortals, on whom the lightning bolt of inspiration strikes.
Instead, it is a reflection of the extensive training and preparatory work undertaken to get to the position of being able to invent new mathematics. Furthermore, as Burton (2004) shows, much of mathematical creation emerges from group activity. Despite all the preparatory work involved, the source of new hypotheses and conjectures remains mysterious. According to Peirce (1960, vol. 7, p. 202), “Abduction is the process of initially setting up or entertaining a hypothesis likely in itself.” Peirce’s focus is on the logical or semiotic processes involved in abduction, not the psychological or cognitive aspects. Thus, he offers no conjectures as to the source of the hypothesis. This is a whole different direction of inquiry. Nevertheless, in exploring mathematical creativity and the role of abduction, it is illuminating to speculate on the origins of hypotheses. After all it was Peirce himself that indicated that abduction is the sole logical entry point for creativity in reasoning, especially in mathematics. Mathematical heuristics is that branch of study that considers the strategies employed in solving mathematical problems or in coming up with new mathematical ideas. One of the first to study such strategies systematically was the philosopher Descartes who published a book entitled Rules for the Direction of the Mind. This contains several strategies that might enable successful problem solving. For example, rule 13 suggests that simplifying a problem can help to solve it, and rule 15 suggests that drawing a diagram to represent the problem can also help in the solution.
Polya (1945) offers an extended list of strategies for problem solving, including the suggestion, again, that making a diagram can help solve a mathematical problem. Among other things Polya suggests that the mathematician may fruitfully begin by asking the questions: What is the unknown? What are the data? What is the condition? Can you restate the problem more suggestively? Could you imagine a more accessible related problem? (If so try to solve that and then build a bridge to the original problem.) These are just a sample of the strategies and heuristics that he offers. Ever since his pioneering work, many authors have extended Polya’s suggested strategies. For example, Burton (1984) offers the following strategies among many others: control variables systematically, use one solution to find others, work backward, focus on one aspect of the problem, partition the problem into cases, and reformulate the problem. What kinds or levels of thinking are involved in problem solving? Following the literature, Ernest (2013) analyzes the cognitive activities involved in problem solving into two types, the cognitive and the metacognitive. These are as follows: Cognitive Activities: carrying out plan, applying strategies, using skills, knowledge, etc. Metacognitive Activities: planning, monitoring progress, decision-making, checking, choosing strategies, etc. To these, in the light of the above discussion of abduction, we could add a third cluster comprised of intuitive activities. These might include the following: Intuitive Activities: following hunches, noticing associations, attending to memories triggered, following intuitive links, responding to multisensory experiences, etc. However, although we know intuition plays an important part, there is no known technique for invoking abductive inferences or suppositions. It is an irrational, intuitively triggered process that wells up from the prepared unconscious. 
Like Satori in Zen Buddhism, you can make all the preparations you like but the lightning strike of inspiration may or may not come. While in school mathematics the problems set should be within reach of the students, in research mathematics many of the most famous problems are either very difficult or ultimately and demonstrably insoluble. Hilbert’s first problem, from his famous list of 23 offered at the beginning of the twentieth century, was to prove the continuum hypothesis. Simply put the hypothesis states that the size of the continuum (how many real numbers exist) is Aleph one, the next and immediately subsequent measure of infinity greater in size than the set of natural numbers (Aleph null). It was not until the 1960s that the mathematician P. J. Cohen (1966) demonstrated with his new technique of “forcing” that this problem is insoluble and that the continuum hypothesis is independent of the standard axioms of set theory.
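For readers who want the statement in symbols, the continuum hypothesis (CH) and its independence can be written compactly as follows. The notation is an assumption of this sketch rather than the chapter's own: $2^{\aleph_0}$ denotes the size of the continuum, ZFC stands for the "standard axioms of set theory," and $\nvdash$ means "does not prove." The second non-provability claim is Gödel's earlier consistency result, which together with Cohen's forcing argument yields independence (both on the assumption that ZFC is consistent).

```latex
\[
  \text{CH:}\quad 2^{\aleph_0} = \aleph_1
\]
\[
  \mathrm{ZFC} \nvdash \mathrm{CH}
  \qquad\text{and}\qquad
  \mathrm{ZFC} \nvdash \neg\,\mathrm{CH}
\]
```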
Meaning and Abduction
There is a link between abduction and meaning theory which is of significance for mathematics and mathematics education. Traditionally, the referential or picture theory of meaning has been dominant. However, this is now widely regarded as inadequate (see, e.g., Ernest, 2018; Rorty, 1979). Instead, we can draw on Wittgenstein’s (1953) widely adopted “use” theory of meaning. This says that much of meaning is given by use: “for a large class of cases – though not for all – in which we employ the word ‘meaning’ it can be defined thus: the meaning of a word is its use in the language” (Wittgenstein, 1953, I, Sec. 43). His account also treats sentences and other signs in the same way: a large part of their meaning resides in their use. Wittgenstein allows for three other sources of meaning: custom, rule following, and physiognomic meaning (Finch, 1995). We shall leave these to one side as they do not illuminate abduction much, if at all. Focusing on meaning as use, it is important to hedge this in the way that Wittgenstein does. Namely, the use of words or signs is always located within what he terms language games situated within forms of life. Thus, according to this theory, the meanings of words and signs are the roles they play within conversations located in social forms of life. But these are not free-floating conversations; they are conversations centered on, and intrinsically a part of, shared activities with a goal or object in mind. Thus, conversations are not just trivial decorations but an integral part of social activities. The function of conversations is to facilitate important joint and productive activities through directions, confirmations, and other communicative means. Thus, the meanings of the terms and signs employed are their functions within these activities. Joint action within a form of life is usually directed and punctuated by discourse.
In other words, language in conversation is a tool employed to further a joint activity and take it toward its goal. Wittgenstein makes it clear that meanings depend on the language games in which they are used, and “When language-games change, then there is a change in concepts, and with the concepts the meanings of words change” (Wittgenstein, 1969, Sect. 65). Wittgenstein even goes on to say that with every new proof of a mathematical theorem, the meaning of that theorem is changed. At this point Robert Brandom’s (2000) inferentialist account of meaning is a useful supplement. For Brandom, the meaning of words and sentences is largely given by their use in language, and specifically by one central aspect of use, namely, the nexus of inferential connections with other words and sentences. According to the inferentialism of Robert Brandom, the meaning of any linguistic expression (sentence, claim, principle, theory, rule, etc.) consists of sets of central conceptual uses, namely, the consequences and logically justified applications of the expression, and the reasonings that lead up to or justify the expression. The inferential significance or meaning of a claim also depends on background collateral commitments, that is, the beliefs and knowledge of the author or speaker of the claim.
For Brandom the inferentialist meaning of a word or sentence S is its connections through reasoning with antecedents (reasonings leading to S) and its consequences (reasonings that follow from S). These uses are shown through enacted utterances, but meaning also reflects past uttered links and is always open toward the future. So, the current meaning of a word or sentence, at any time, is partial and never final, for further patterns of use will supplement the meaning. Figure 1 is a diagram representing the inferential theory of meaning of an expression illustrating Brandom (2000). As noted above, it must be acknowledged that the antecedents and consequents of the expression S must be located within the background context of the utterance, with its collateral commitments. Thus, according to the inferentialism of Brandom (2000), the meaning of any linguistic expression (sentence, claim, principle, theory, rule, etc.) consists of a set of central conceptual uses:
1. The consequences and logically justified applications of the expression, including reasonings, conceptual uses, and representations that flow from it
2. Reasonings that lead up to or justify the expression, including assumptions, conditions, and precursors
These reasonings are all located within the background collateral commitments (beliefs, knowledge), and these also add to the inferential significance of a claim or expression. It is clear that abduction fits primarily in Part 2 here, among the antecedents of the expression. Given an expression S, representing a surprising fact, observation, or conclusion, abduction is the inference or other process that gives rise to a hypothesis X that explains, justifies, or otherwise gives rise to the expression S. This is part of what Brandom includes within the inferential meaning of a term or sentence, and it includes its abductive precursors. But abduction is not just about the precursor to a claim or sentence.
Fig. 1 The inferential meaning of an expression. (After Brandom (2000)) [Diagram: assumptions, conditions, and precursors lead to the expression (a sentence or concept), which leads in turn to its consequences (applications and conceptual uses).]
Abduction also includes the surprising fact that follows on from the claim or sentence. Thus, it includes some of the consequences or subsequent associations where the link is weaker than straightforward or full-strength deduction. Abduction thus spreads over all three domains of Brandom’s meaning
triple. This includes the antecedents, the central expression, term or sentence, and the consequents. Abduction does not encompass all of these but lies within these domains of potential. The inferential meaning of a sentence is never final, for the list of antecedents and consequents is never complete. Likewise, the list of abductive hypotheses can never be complete. It is always a potential list, realized to a greater or lesser extent in practice. It might be said that the third of Brandom’s domains corresponds to deduction, for it includes the deductive consequences of the central sentence. The central term or sentence might be compared with inductive reasoning, for it can be a generalization, induced from its antecedents, which predicts new facts as consequence. The first of Brandom’s domains thus corresponds to abduction, for it is the missing antecedent of the central sentence or claim which in turn gives rise to subsequent conclusions, inferences, and facts. Although this is just an analogy, the comparison with Brandom’s meaning theory is illuminating as it parallels the tripartite division of reasoning principles into abductive, inductive, and deductive. There is a sense in which Brandom’s inferentialist theory of meaning corresponds, at least in part, with the ancient mathematician Pappus’ understanding of the methods of analysis and synthesis. According to the standard interpretation, analysis is a downstream process (drawing logical conclusions from the desired theorem). In contrast, synthesis is an upstream process (looking for the premises from which the conclusions can be drawn). Substituting temporal elements for the logical sequencing in the inferential meaning schema, the before, after, and future elements are all accommodated. 
Hintikka and Remes (1974) offer another interpretation of Pappus’ method, in which instead of logical consequence, they substitute the ideas of “corresponds to” or “goes together with.” This leads to a looser interpretation of the relationship of the antecedents and the succedents with the central expression whose proof is sought. I shall draw upon this relaxation of the strict logical relation in considering the application of this meaning schema to mathematics education. Polya acknowledges that abduction is a heuristic inference, namely, an identifiable kind of plausible reasoning. He claims that the logical form of abductive reasoning is a kind of reverse modus ponens inference (Walton, 2004). Modus ponens is the reasoning scheme that permits the deductive inference of B from the antecedents A→B and A. This can be presented as the following natural deduction scheme:
\[ \frac{A \qquad A \rightarrow B}{B} \]
Abduction can be presented as a plausible reasoning scheme as follows:
\[ \frac{B \qquad X \rightarrow B}{X?} \]
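The contrast between the two schemes can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the source: conditionals are stored as (antecedent, consequent) pairs, the function names `deduce` and `abduce` are hypothetical, and the divisibility rules are chosen only because they are easy to check.

```python
from typing import Iterable, Set, Tuple

# A rule (a, b) is read as the conditional a -> b. (Hypothetical encoding.)
Rule = Tuple[str, str]

# Illustrative rules about a whole number n; not taken from the source.
RULES = [
    ("n is divisible by 4", "n is even"),
    ("n is divisible by 6", "n is even"),
    ("n is divisible by 6", "n is divisible by 3"),
]


def deduce(facts: Set[str], rules: Iterable[Rule]) -> Set[str]:
    """Modus ponens: from A and A -> B, conclude B."""
    return {b for a, b in rules if a in facts}


def abduce(observation: str, rules: Iterable[Rule]) -> Set[str]:
    """Reverse modus ponens: from B and X -> B, propose X as a candidate.

    The results are only plausible hypotheses (the X? of the scheme),
    not conclusions; nothing here establishes that any of them is true.
    """
    return {a for a, b in rules if b == observation}


# Deduction gives consequences that must hold if the fact holds.
print(sorted(deduce({"n is divisible by 6"}, RULES)))

# Abduction gives several rival candidates, none of them guaranteed:
# n = 2 is even yet satisfies neither hypothesis.
print(sorted(abduce("n is even", RULES)))
```

The sketch returns every antecedent whose conditional has the observation as consequent, which mirrors the use of the variable X in the scheme: there are typically many candidates, and selecting among them is a further step that the code does not attempt.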
In these figures the horizontal line represents a division of the reasoning into the space of premises and the conclusion space. It is also analogous to the vertical addition or subtraction calculation, where the horizontal line marks off the answer, drawn from the operation on the preceding numbers (e.g., the addends). In the second scheme, X? represents X as having the modality of possibility. The fact is that X is a plausible antecedent of B but it does not have the logical status of a deductive conclusion. Rather than launching into full-blooded modal logic that formalizes possibility and necessity operators, the use of the question mark here is just an informal indication of the modal status of hypothesis X. If X was not presented with this modified status (X?), the inference schema could be a source of contradiction. For if F stands for a false statement, then whatever the status of B, the implication statement F→B is correct (falsity implies anything). Nevertheless, the following scheme is safeguarded:
\[ \frac{B \qquad F \rightarrow B}{F?} \]
For even if the conclusion F is false, F? remains possible. That is, F? is true or false, and in the particular case where F is false, this holds true. From the perspective of Brandom’s inferentialist meaning theory, X? is a plausible antecedent for the expression B. This scheme, with its use of the variable X, is also intended to suggest that there can be, indeed there must be, many antecedents for the expression B, although not all of these will stand in the abductive relation to the conclusion B. In the contexts of mathematics and mathematics education, there will be many different types of antecedents and consequences for an expression S making up its meaning. The types of reasoning include a much broader range than those that we normally label as deductive reasoning. Many antecedents will suggest the expression S as a tentative conclusion or as a plausible generalization. Likewise, many of the reasonings that follow on from S will not be strictly logical inferences. Brandom himself describes the relationship between the antecedents and consequents of S, which make up its meaning, as moves in the space or game of giving and asking for reasons. Understood liberally this can include antecedents and consequents that are not strictly linked by deduction to the sentence or expression S. The full range of reasons that we may give or ask for in arguments or conversations includes instances that are more loosely linked to the inquiry than by logical inference. Thus far, my discussion of types of inference has been very abstract. None of the mathematically specific modes of reasoning have been discussed in the context of Brandom’s meaning theory. However, many specific types of inference and even weaker forms of derivation, consequence, and association are to be found within mathematics. 
Within mathematics and school mathematics, some of the possible types of reasoning involved include the following, listed approximately from the simpler to the more rigorous types of reasoning:
• Copying the pattern shown in an example in a further example
• Following a pattern or rule to give further examples or instances
• Solving a problem using heuristics
• Reasoning from a representation (table, graph, equation, diagram, etc.)
• Reasoning from observations, results, conjectures
• Reasoning by analogy, or metaphorically, to illustrate, represent, or justify
• Reasoning by using a model (concrete or abstract) to illustrate, represent, or justify
• Reasoning from a model to derive specific results and predictions
• Arguing from general to the particular (specializing, instantiating)
• Reasoning from the particular to the general (induction, abstracting, abduction)
• Informal deduction using principles of logical inference
• Formal deductive inferences from axioms using rules of deduction or other systems of deduction
This is a far from complete list, but it illustrates some of the plausible forms of reasoning that can be employed to loosely derive an expression from its antecedents or to draw conclusions from an expression in mathematics. As remarked above, already in Hintikka and Remes (1974), there is the suggestion for the relaxation of the strict logical relations in the application of the methods of analysis and synthesis which is exemplified here in applying Brandom’s meaning theory to mathematics education. In each case the formal reasoning and the antecedents employed need not be on the same level as the consequents. They may be more abstract, at an equivalent level of abstraction, or at a lower level. In this respect they may differ from strict deductive arguments where conclusions and antecedents are often on the same level. As with inductive arguments, there may be a difference in the level of particularity or specificity and abstraction. The inclusion of some of these antecedents and consequents very likely exceeds what Brandom would like to see, according to his inferentialist schema. But in the area of mathematical invention, one needs to be open to more informal reasonings and associations than the normal deductive logic admits. Even mathematicians use informal forms of reasoning in their proofs that fall short of formal deductions, so it is not surprising that such forms of reasoning are also employed in school mathematics (Rav, 1999).
Abduction in Practice
A number of studies of abduction in practice have been reported. Kim et al. (2021) conducted an ethnomethodological study of abductive reasoning used by student teachers while “tinkering.” They observed three pairs of student teachers to document how they engaged in reasoning while tinkering with robots to reprogram their behaviors:
Abduction was often observed in debugging episodes in that participants (a) determined plausible cases (causes) that explained misfunctioning code (the result) (b) by applying the best rule at the moment that connected the case to the result, and (c) eliminated irrelevant rules when realizing a disconnect between the case and the result through seeing reprogrammed robot behaviors. (Kim et al., 2021, p. 5)
Their findings were summarized by the following themes:
Theme 1: Tacit rules were applied in search for best explanations.
Theme 2: Rules were eliminated based on perceptual observations.
Theme 3: Reflective abstraction was seldom observed.
Theme 4: Generating multiple hypotheses in advance was an unnatural requirement.
(Kim et al., 2021, p. 6)
Thus, although abductive reasoning was at work, it was often implicit rather than explicit. The processes involved what Eco (1983) describes as “creative abduction” rather than overcoded or undercoded abduction. Indeed, undercoded abduction was avoided according to theme 4 of the findings, for it was deemed sufficient to generate one hypothesis at a time and then test or eliminate that through trial-and-error methods. Hidayah et al. (2020) conducted a study of 58 secondary school students solving algebraic problems. Their study classifies the types of reasoning employed by students in attempting to solve these problems. Their four categories of results are types of student response. These comprised what they termed the (1) creative conjectures type, (2) fact optimization type, (3) factual error type, and (4) mistaken fact type. Students of the creative conjecture-type group solved the problem. They made conjectures based on the given facts by writing, describing, or sketching so as to design their problem-solving strategy. Some of them had some doubts about the solution. Consequently, they developed new conjectures outside the specific problem question but still related to the question. In solving the problem, this group of students developed new ideas related to the questions using abductive reasoning. Among the fact optimization-type group (2), students made conjectures about the answers to problems, and then they confirmed the conjectures with deductive reasoning. Thus, they can be said to have used abduction first and deduction subsequently. Among the factual error-type group (3), students added facts drawn from outside of the problem to help their solutions, but the facts used were incorrect, leading to incorrect conclusions. In the mistaken fact-type group (4), students assumed that the question itself stated a truth, so it was employed as a fact. As a result, their conclusions were also incorrect. 
The students classified among the first two types successfully used abductive strategies to solve the problems.
Abduction in Mathematics Education Research
Reid (2018) offers an overview or meta-analysis of abductive research within mathematics education:
Three ‘lineages’ can be identified in the mathematics education literature. One (predominantly German) lineage begins with Voigt’s work on methodology and leads to Hoffmann’s use of abductive reasoning in a theory of learning. A second (based in the United States) focuses on abductive reasoning in problem solving. A third (mainly Italian) lineage focuses on the role of abductive reasoning in the ‘cognitive unity’ between conjecturing and proving.
The first of these strands looks at abductive reasoning as part of a theory of learning. Reid highlights the work of Hoffmann (2001) in developing a theory of learning in which signs mediate between different forms of knowledge and which makes learning possible. This is primarily a semiotic theory of learning and as such it foregrounds the role of signs.
The second strand concerns abductive reasoning in conjecturing when solving problems. For example, Cifarelli (1999) is interested in activities in which a learner needs to discover mathematical coherences to solve a problem. In particular, he looks at ways in which abductive reasoning fosters an intermingling of problem-posing and problem-solving activities. As he says, he is mainly focused on the function of abduction, which “furnishes the reasoner with a novel hypothesis to account for surprising facts” (Cifarelli, 1999, p. 217). Ferrando (2006) examined students’ problem solving in calculus. Inspired by Cifarelli’s (1999) idea “that an abductive inference may serve to organize, reorganize, and transform a problem solver’s actions” (Ferrando, 2006, p. 58), she distinguishes between facts, conjectures, statements, and actions. She argues that both statements and actions can be abductive:
An abductive statement is a proposition describing a hypothesis built in order to corroborate or to explain a conjecture. . . . An abductive action represents the creation, or the ‘taking into account’ a justifying hypothesis or a cause. (Ferrando, 2006, p. 59)
The third strand identified by Reid (2018) is about abductive reasoning in the dialectic between conjecturing and proving. Boero et al. (1996) proposed that the argumentation processes involved in first forming a conjecture involve elements that are then employed in the proof of the conjecture, so that the two processes of conjecturing and proving have a unity. They termed this idea cognitive unity. It is relevant here because abductive reasoning plays an important part in cognitive unity. This is manifested in a reversal of the abduction used in conjecturing to produce the deduction used in proving the theorem. Arzarello et al. (1998, p. 78) explicitly discuss “the dialectic between an explorative, groping phase and an organizing strategy which converges towards some piece of validated knowledge.” This is discussed further in the next section, in the context of the philosophy of mathematics.
Abduction and the Philosophy of Mathematics
The main concerns of the philosophy of mathematics in the twentieth century have been epistemological and ontological. The lesser of these two interests is
ontological, concerned with investigating the nature and being of mathematical objects. Although there is some attention to the social construction of mathematical objects, Platonism has been the main ontological position among mathematicians and philosophers. Platonism views mathematical objects as eternal and superhuman, so the issue of their discovery or formation does not really arise. The dominant of these two interests has been epistemological, with a primary concern for the justification of mathematical results. The main instrument for this is mathematical proof, which is deductive proof encompassing various modes of inference including mathematical induction. Although proof has always been a concern of the philosophy of mathematics, one of the triggers for this renewed interest in epistemology was the emergence of paradoxes, anomalies, and contradictions in new mathematical theories at the end of the nineteenth century. This gave a great impetus to research in the foundations of mathematics. Indeed, it was regarded as a crisis in the philosophy of mathematics, and various schools of thought emerged, primarily concerned to give mathematics firm foundations. Throughout all of this work, the attention was on justification, proof, and other means of warranting and vouchsafing mathematical results and truth. From Frege onward, concern with invention in mathematics, including the creation of new hypotheses, claims, and theories, has been dismissed as merely a psychological interest and not the proper concern of the philosophy of mathematics. The philosophy of science has also focused on the testing and justification of scientific theories as opposed to their discovery or invention. Popper (1959) carefully distinguishes between the context of discovery and the context of justification. He regards only the latter as the legitimate concern of the philosophy of science.
He dismissed concerns with the context of discovery as beyond the scope of the philosophy of science. This is ironic given that the title of his book in English is The Logic of Scientific Discovery. Even his protégé and successor, Lakatos (1976), one of the first in the new philosophy of mathematical practice tradition, named his key work “Proofs and Refutations,” a homage to Popper’s (1963) work’s title “Conjectures and Refutations.” Lakatos (1976) claimed to be more concerned with justification than discovery in mathematics. However, this book, which among other things traces the development of the Euler-Cauchy theorem, represents a key work in the emergent philosophy of mathematical practice tradition. This, at last, introduces discovery and invention as proper concerns of the philosophy of mathematics. Lakatos (1976) begins his philosophical and historical account with a preliminary conjecture relating the numbers of edges, sides, and vertices of a polyhedron. He then devotes much of his work to the processes of criticism, redefinition, revised proofs, refutations, and revised conjectures, concerning this relationship. This ultimately leads to the accepted version of the theorem and its proof. Even in this account, he omits the initial abductive stage in which the preliminary conjecture is formulated. Nevertheless, his dialectical interplay between the processes of formation and discovery versus those of refutation and validation opens the door to the context of discovery in mathematics. Indeed, there is a role for abduction in some of his developmental phases in which hypotheses and conjectures are improved and replaced.
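The preliminary conjecture in Lakatos' account is usually written V − E + F = 2, for the numbers of vertices, edges, and faces of a polyhedron. As a small illustration of how a counterexample confronts such a conjecture, the following Python sketch checks the formula against a few solids. The function names and the dictionary are hypothetical, and the counts for the "picture frame" (a square frame with a tunnel through it, a counterexample of the kind Lakatos discusses) are the commonly quoted ones, stated here as an assumption rather than taken from the chapter.

```python
from typing import Dict, List, Tuple


def euler_characteristic(vertices: int, edges: int, faces: int) -> int:
    """Return V - E + F for a polyhedron."""
    return vertices - edges + faces


# (V, E, F) for a few solids; names and selection are illustrative only.
POLYHEDRA: Dict[str, Tuple[int, int, int]] = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "picture frame": (16, 32, 16),  # solid with a tunnel through it
}


def counterexamples(
    polyhedra: Dict[str, Tuple[int, int, int]], expected: int = 2
) -> List[str]:
    """List the solids for which the naive conjecture V - E + F = 2 fails."""
    return [
        name
        for name, (v, e, f) in polyhedra.items()
        if euler_characteristic(v, e, f) != expected
    ]


print(counterexamples(POLYHEDRA))
```

The check only detects that the conjecture fails for a given solid. Explaining why it fails, and then revising either the conjecture or the definition of "polyhedron," is the part of the process that the code leaves to the mathematician.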
In his later work, Lakatos (1978) extends his logic of mathematical discovery to the twin processes of analysis and synthesis which gives a place, albeit implicitly, to the processes of abduction in the formation of mathematical knowledge. Lakatos describes the machinery of proofs and refutations within the framework of Pappusian analysis-synthesis: “The analysis provides the hidden assumptions needed for the synthesis. The analysis contains the creative innovation, the synthesis is a routine task” (Lakatos, 1978, p. 93): However, the hidden lemmas are false . . . But nevertheless we can extricate from the analysis (or from the synthesis) a ‘proof-generated theorem’ by incorporating the conditions articulated in the lemmas. (Lakatos, 1978, p. 95)
Lakatos’ (1976) theory of proofs and refutations consists in refining an existing conjecture and proving it. At the beginning, there is a conjecture, some sentence S which possibly holds within the target domain D. Through a thought experiment, a surprising or interesting situation emerges, namely, a counterexample c to the sentence S, which has a global character. This raises the problematic situation of explaining the counterexample, that is, of producing the rule of which this counterexample is a case. Through a new abduction, we get such a reason or rule R. The stage is now set for a new resolutive move, investigating how the cause of the counterexample and the conjecture S might be connected. In fact, R is possibly a reason why the conjecture does not hold; hence it is reasonable to look within the context D for some new hypothesis, say S′, which eliminates R and, consequently, the counterexample. Lakatos (1978) describes how the subject can find such an S′ within the context D, through a dynamic exploration, namely, how S′ is produced with a new abduction.
Arzarello et al. (1998) propose a model based on Lakatos’ work to explain the two main modalities of mathematical action. These consist of creating (exploring/selecting conjectures) and warranting (constructing proofs) for mathematical knowledge. They claim that any process of exploration-conjecturing-proving features a complex switching from the one modality to the other and back. This requires a high level of flexibility for the mathematician in tuning themselves to the right mode. Their model aims to explain how the transition from the one modality to the other happens. Their model goes through three phases.
Phase 1 Within the explorative modality, the process starts with the use of some heuristic to guess what happens in particular examples, thus selecting a conjecture. This is a working hypothesis to be checked, which, typically, is not in conditional form.
To confirm it new explorations are made by using some heuristics. At this stage the subject expresses their hypothesis as an abduction, namely, a sort of reverse deduction. In fact, generally the subject sees what rule it is the case of. Having selected the piece of their knowledge that they believe to be right, the conditional form is virtually present. Its ingredients are all present, but their relationships are reversed with respect to the conditional form. The direction in which the subject
sees the things is still in the stream of the exploration, and the control of the meaning is ascending. Phase 2 A switching to the deductive form happens, because of the abduction. Now the control is descending, and we have an exploration of the situation, where things are looked at in the opposite way: not in order to have hints for getting a conjecture, but in order to see in them why the regularity that the abduction provides works. The reversed way of looking at the reasoning leads the subject to formulate the conjecture in the conditional form. Now the modality is typically heading toward that of a proof. Phase 3 Now suitable, possibly fresh, heuristics are used, in order to prove the conjecture. Here the descending control is crucial: it allows the subject to interpret the relationships in the way that produces proof steps. First, they have a local character, and then they are organized in a more global and articulated way. In this last phase, conjectures are possibly reformulated in order to combine better proof steps, and new explorations are possibly made to test them. The philosophy of mathematical practice, which Lakatos’ work helped to launch, no longer rigidly separates the contexts of discovery and justification. As Arzarello et al. (1998) show in their scheme, the contexts of discovery and justification are inextricably interwoven in the creative processes of mathematicians. In this emergent tradition, ethnomethodological studies examine mathematicians’ day-to-day practices in generating and validating new knowledge. Further, social studies of mathematics look at mathematicians’ beliefs and the role of their interactions and institutions in the development and furthering of mathematics. But although increasing attention is devoted to the discovery and creation of mathematics, no formulaic methodology of creativity exists or, in my view, is possible. 
However, acknowledging that the process of abduction is involved in mathematical creativity is a big step forward. It enables a deeper analysis of the processes involved. Part of the creativity of mathematicians is to create new techniques, methods, new ways of defining mathematical ideas, modes of argument, and so on. If mathematical creativity could be defined and mechanized, then it could be conducted by computers without human input. But mechanized theorem proving has only produced trivial results because the description of past practices can never encompass future innovation. That is why we recognize the essential role of abduction in mathematical creativity, and it gives us a tool for analyzing these processes. But they can never be brought to the final canonical encoded state that we find in deductive methods. Indeed, even the mathematical proof techniques employed in mathematical practice are themselves not satisfactorily captured by logic. Theoretically, formal proofs are fully syntactic. But as Rav (1999, p. 11) points out, the informal proofs “of customary mathematical discourse, hav[e] an
irreducible semantic content.” To demand that all published proofs were fully syntactic, apart from the explosion in size of journals and papers this would lead to, would force creative mathematics to grind to a halt. More than slowing down mathematical creativity as in a “work to rule,” demanding fully rigorous proofs would make even the simplest inferential step unacceptably long. The creative jumps in reasoning in accepted papers need an expert mathematician for judgment and remain several steps away from purely mechanical verification. It is therefore no surprise that the processes of abduction employed in mathematical discovery are even less amenable to explicit specification than those of deductive reasoning. Furthermore, human judgment as to what is of interest and value is something that cannot, as yet, be written as a program. In my view, like the rest of creativity, it is something that will never be captured mechanically.
Conclusion
This chapter offers a review of current thought on abduction and creativity in mathematics. Mathematics is a mysterious subject which draws its inspiration from the domain of pure ideas, unlike empirical science, which tries to explain the world. Both of these disciplines use abduction to generate explanatory hypotheses about their respective worlds. But the world of mathematics is not simply given to our senses. In this respect it is unlike science, where our senses can be supplemented and enlarged with the most penetrating extensions such as telescopes, microscopes, and cloud chambers. The domain of mathematics is fully abstract, so the materials for abduction remain in the mind, in conversation, and in the lecture room. Nevertheless, these domains offer plenty of room for speculation about the patterns that link and indeed create mathematical objects. No mathematician has ever claimed, unlike some overconfident Victorian physicists, that the field is exhausted. The potential for creativity is boundless. As in all sciences, the conduit to new creative mathematicians is schooling. School mathematics is traditionally made up of routine exercises, but there are moves to bring more creativity into it through problem solving and problem posing (as well as some of the other dimensions indicated above). In both problem solving and posing, abduction has an important part to play. The goal in bringing creativity into school mathematics is twofold. First, to foster in miniature the creativity manifested by all mathematicians, resting as it does on feelings of understanding, excitement, and joy. To the extent that this can be achieved for every student, it is surely character building and empowering. Second, as a means of developing more mathematicians, enhancing the processes of education so that more students follow this path must be a benefit.
The growing need for mathematicians across a whole range of professions, from finance and economics to management and mathematics teaching, calls out for improved teaching practices. Research mathematics, at least in the domain of pure mathematics, is all about creativity, and here abduction plays an essential part. We have seen how meaning-as-use theories, and inferentialism in particular, are bound up with the use of abduction.
These ideas lead to broader understandings of the relationships between terms, concepts, sentences, and theories, with significant meaning links that are not as strong as formal deductions. Creativity depends on a larger and richer array of connections between concepts, terms, conjectures, theorems, and theories than deduction alone can offer. Overall, it can be said that Peirce’s ideas of abduction offer a deeper understanding of creativity, both in school and research mathematics. Such an enriched understanding offers prospects for enhancing creativity in school mathematics, as well as increasing the numbers of creative mathematicians and further extending the field.
References Arzarello, F., Andriano, V., Olivero, F., & Robutti, O. (1998). Abduction and conjecturing in mathematics. Philosophica, 61(1), 77–94. Boero, P., Garuti, R., Lemut, E., & Mariotti, M. A. (1996). Challenging the traditional school approach to theorems: A hypothesis about the cognitive unity of theorems. In L. Puig & A. Gutierrez (Eds.), Proceedings of the twentieth conference of the international group for the psychology of mathematics education (Vol. 2, pp. 113–120). PME. Brandom, R. B. (2000). Articulating reasons: An introduction to inferentialism. Harvard University Press. Burton, L. (1984). Thinking things through. Blackwell. Burton, L. (2004). Mathematicians as enquirers: Learning about learning mathematics. Kluwer Academic. Carpenter, T. P., Moser, J. M., & Romberg, T. (1982). Addition and subtraction: A cognitive perspective. Lawrence Erlbaum. Cifarelli, V. (1999). Abductive inference: Connections between problem posing and solving. In O. Zaslavsky (Ed.), Proceedings of the 23rd conference of the international group for the psychology of mathematics education (Vol. 2, pp. 217–224). PME. Cohen, P. J. (1966). Set theory and the continuum hypothesis. Benjamin. Corcoran, J. (2003). Aristotle’s prior analytics and Boole’s laws of thought. History and Philosophy of Logic, 24(4), 261–288. Devlin, K. (2019). What is mathematical creativity, how do we develop it, and should we try to measure it? Part 1, Mathematical Association of America website. Retrieved 13 March 2022 via https://www.mathvalues.org/masterblog/2019/1/25/what-is-mathematical-creativityhow-do-we-develop-it-and-should-we-try-to-measure-it-part-1?rq=creativity Eco, U. (1983). Horns, hooves, insteps: Some hypotheses on three types of abduction. In U. Eco & T. Sebeok (Eds.), The sign of three: Dupin, Holmes, Peirce (pp. 198–220). Indiana University Press. 1983. Ernest, P. (2013). The psychology of mathematics. Amazon Digital Services, Kindle edition. Ernest, P. (2018). 
A semiotic theory of mathematical text. Philosophy of Mathematics Education Journal, 33. Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff. Ferrando, E. (2006). The abductive system. In J. Novotná, H. Moraová, M. Krátká, & N. Stehlíková (Eds.), Proceedings of the thirtieth conference of the international group for the psychology of mathematics education (Vol. 3, pp. 57–64). Finch, H. L. (1995). Wittgenstein. Element Books. Frege, G. (1879). Begriffsschrift. In J. van Heijenoort (Ed.), (1967) From Frege to Gödel: A source book in mathematical logic (pp. 1–82). Harvard University Press. Hadamard, J. (1945). Essay on the psychology of invention in the mathematical field. Princeton University Press.
Hidayah, I. N., Sa’dijah, C., Subanji, A., & Sudirman, B. (2020). Characteristics of students’ abductive reasoning in solving algebra problems. Journal on Mathematics Education, 11(3), 347–362. Hintikka, J. (2007). Socratic epistemology: Explorations of knowledge-seeking by questioning. Cambridge University Press. Hintikka, J., & Remes, U. (1974). The method of analysis. Reidel Publishing Company. Hoffmann, M. H. G. (2001). Skizze einer semiotischen Theorie des Lernens. Journal für Mathematik-Didaktik, 22(3–4), 231–251. Kim, C., Baabdullah, A., Lee, E., Dinç, E., & Zhang, A. Y. (2021). An ethnomethodological study of abductive reasoning while tinkering. AERA Open, 7(1), 1–25. Lakatos, I. (Ed.). (1968). The problem of inductive logic. North Holland Publishing. Lakatos, I. (1976). Proofs and refutations, the logic of mathematical discovery. Cambridge University Press. Lakatos, I. (1978). Mathematics, science and epistemology: Philosophical papers (Vol. 2). Cambridge University Press. Liljedahl, P., & Sriraman, B. (2006). Musings on mathematical creativity. For the Learning of Mathematics, 26(1), 17–19. Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Kluwer Academic. Niss, M., & Højgaard, T. (2019). Mathematical competencies revisited. Educational Studies in Mathematics, 102(1), 9–28. Peirce, C. S. (1867). On the natural classification of arguments. In Peirce (1960), vol. 2, pp. 461–516. Peirce, C. S. (1902). Minute logic. In Peirce (1960), vol. 2, pp. 1–118. Peirce, C. S. (1903). Lectures on pragmatism. In Peirce (1960), vol. 5, pp. 14–212. Peirce, C. S. (1960). Collected papers. Harvard University Press. Polya, G. (1945). How to solve it. Princeton University Press. Popper, K. (1959). The logic of scientific discovery. Hutchinson. Popper, K. (1963). Conjectures and refutations (Revised fourth edition 1972).
Routledge and Kegan Paul. Rav, Y. (1999). Why do we prove theorems? Philosophia Mathematica (Series 3), 7, 5–41. Reid, D. A. (2018). Abductive reasoning in mathematics education: Approaches to and theorisations of a complex idea. EURASIA Journal of Mathematics, Science and Technology Education, 14(9), 1–13. Rorty, R. (1979). Philosophy and the mirror of nature. Princeton University Press. Schoenfeld, A. (1992). Learning to think mathematically. In D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 334–370). Macmillan. Shank, G. (1988). Semiotics. Plenum Press. Shank, G. (1998). The extraordinary ordinary powers of abductive reasoning. Theory & Psychology, 8(6), 841–860. Sriraman, B. (2004). The characteristics of mathematical creativity. The Mathematics Educator, 14(1), 19–34. Walton, D. (2004). Abductive reasoning. University of Alabama Press. Wittgenstein, L. (1953). Philosophical investigations (G. E. M. Anscombe, Trans.). Basil Blackwell. Wittgenstein, L. (1969). On certainty. Blackwell.
Abductive Arguments Supporting Students’ Construction of Proofs
29
Bettina Pedemonte
Contents
Introduction
Cognitive Unity
The Use of Toulmin’s Model to Represent Abductive Arguments
Instructors’ Interventions That Support Cognitive Unity
Teaching Experiment
Analysis of the Student’s Answers to Two Algebraic Problems
Polynomial Factorization Problem
Implications of the Study
Conclusion
References
Abstract
Research studies show that it is not easy for the instructor to modify students’ argumentations based on conceptions that hardly evolve into theorems. Extending these studies, this chapter shows that abductive arguments can be effectively used by the instructor to support students in completing their argumentations, when they are struggling in solving a problem or they are using incorrect rules to solve it. Test results drawn from 60 undergraduate students who solved two algebraic problems, a factorization problem and a system of linear equations, are presented. Students who provided incomplete or incorrect solutions to one of these problems were selected for a one-on-one meeting with the instructor. The analysis of three cases shows how an instructor’s abductive argument can help students recognize their mistakes and modify their argumentations in solving these problems. Toulmin’s model is used to analyze the interaction between a student’s argumentation and an instructor’s intervention to show that instructors’ abductive arguments should be appropriately constructed in ways that support cognitive unity between students’ argumentation and proof.
Keywords
Abductive argumentation · Proof · Cognitive unity · Algebraic problems · Teaching intervention
B. Pedemonte, Department of Neurology, UCSF Dyslexia Center, UCSF Memory and Aging Center, San Francisco, CA, USA
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_72
Introduction
Abduction is a form of reasoning often used by students in their argumentation when they solve mathematical problems (Mason, 1996; Cifarelli & Saenz-Ludlow, 1996; Cifarelli, 1999; Ferrando, 2006; Krummheuer, 2007) and when they are engaged in the mathematical practice of proving (Knipping, 2003; Saenz-Ludlow, 2016; Pedemonte & Reid, 2010; Pedemonte, 2007, 2008). Abduction is crucial in introducing new ideas (Peirce, 1960) and in the development of creative reasoning (Magnani, 2004). It is usually described as the process of forming an explanatory hypothesis for an observed surprising result (C.P. 5.171). By its nature, abduction is a basic form of reasoning. It is an inference present in generalization processes (Mason et al., 2009; Rivera & Becker, 2016), strictly related to inductive reasoning (Rivera & Becker, 2007; Rivera, 2017), and often anticipating deductive reasoning (Pedemonte, 2008). In the proving process, abduction plays an essential role in the dialectic between conjecturing a hypothesis and proving a result (Arzarello et al., 1998). Sometimes, abduction can be an obstacle to the construction of a deductive proof (Pedemonte, 2007) because the abductive structure seems to be more spontaneous to students than a deductive one. In geometry, for instance, proof is expected to be deductive, but the reasoning that precedes it is generally either abductive or inductive. It is hard for students to transform abductive argumentations into deductive proofs. In algebra, by contrast, abduction can support algebraic proof (Pedemonte, 2008), which is usually characterized by a strong deductive structure. In algebra, a proof can be purely mechanical, what Tall (1995) calls a manipulative proof, which consists in transforming expressions through the manipulation of algebraic symbols. Abductive argumentations can be used to link the letters used in the proof with instances of numbers used in the argumentation (Pedemonte, 2008).
In these studies, abduction is analyzed as part of argumentation developed by students to justify a conjecture and construct a proof. Abductive arguments support the construction of proof when the continuity between students’ argumentation and proof is maintained (Boero et al., 1996; Pedemonte, 2008). Extending these studies, this chapter shows that abductive arguments can be effectively used by the instructor to support students in constructing proofs in algebra. Proof is a particular argumentation (Pedemonte, 2007) considered here
as part of a problem-solving activity (Weber, 2005) in which students are asked to select and apply rules of inferences until the conclusion is deduced. When the students’ argumentation is incomplete or based on incorrect conceptions (Balacheff, 2009), it cannot evolve into a theorem, and, thus, a didactical intervention might be necessary to support students in constructing a proof. It is difficult for instructors to help students understand their mistakes and overcome them since their explanations oftentimes do not match with the students’ thinking (Pedemonte, 2018). When that happens, the didactical intervention can create a gap between the students’ reasoning and the correct answer to a problem (Son, 2013). The aim of this chapter is to show that through appropriate abductive arguments containing the answer to the problem or an intermediate answer, instructors can decrease the distance between the students’ argumentation and solution to the problem. Instructors’ interventions can lead their students to look for a justification for the problem’s answer. The students can then use the justification to complete the instructors’ abductive arguments if cognitive unity between the student’s argumentation and proof is maintained.
Cognitive Unity
Cognitive unity is a theoretical construct introduced to describe the relationship between conjecturing and proving in mathematics (Garuti et al., 1996). During a problem-solving process, an argumentation is usually developed to produce a conjecture. The main idea of cognitive unity is that in some cases, this argumentation can be used by the student in the construction of proof by organizing some of the previously produced arguments according to a logical chain (Garuti et al., 1998). However, while proof is based on a mathematical theory, the argumentation is related to the arguer’s conceptions (Balacheff, 2009). These conceptions belong to the arguer’s system of knowledge, which is not necessarily a mathematical theory. Therefore, incorrect conceptions, which cannot evolve into theorems because they are not supported by a mathematical theory, can prevent cognitive unity (Pedemonte & Balacheff, 2016) between argumentation and proof. When the argumentation supporting the conjecture is based on incorrect conceptions, two possibilities can be envisioned: (1) the proof is not constructed because the student cannot replace the incorrect conception by a theorem, and (2) an “incorrect proof” is produced based on the conception used in the argumentation. In both cases, students need to change the resolution strategy to construct a proof. A didactical intervention is sometimes necessary to help students invalidate incorrect conceptions and to construct a different argumentation supporting a new conjecture. However, an instructor’s intervention is effective if it does not “interrupt” the cognitive unity between the students’ argumentation and proof but instead encourages the continuity between them (Pedemonte, 2018). In this chapter, instructor’s interventions based on abductive arguments that support cognitive unity between students’ argumentation and proof are analyzed. The aim of this analysis is to show how these arguments can be effectively used
when students struggle in completing their argumentations or when they make mistakes that potentially prevent the solution to the problem. The interaction between instructor and student is analyzed using Toulmin’s model.
The Use of Toulmin’s Model to Represent Abductive Arguments
Toulmin’s model (1993) was embraced by a large number of researchers in mathematics education to show how argumentation could be used to support learning in a classroom (Krummheuer, 2007; Yackel, 2001; Yackel & Rasmussen, 2002; Wood, 1999) and to describe its relationship with mathematical proof (Inglis et al., 2007; Nardi et al., 2012; Lavy, 2006; Knipping, 2008; Pedemonte, 2007, 2008; Weber & Alcock, 2005). In Toulmin’s model, an argument provides a standpoint (an assertion, an opinion), which is called a claim in Toulmin’s terminology. Data is produced to support the claim. A warrant provides the justification for the data-claim relationship; it can be expressed as a principle or a rule and acts as a bridge between the data and the claim. This is the ternary base structure of an argument, but auxiliary elements may be necessary to describe it. Toulmin describes three of them: the qualifier, the rebuttal, and the backing. The warrant imparts different degrees of force to the conclusion it justifies, which may be indicated by a qualifier such as “necessarily,” “probably,” or “presumably” attached to the transition from the data to the claim. In the latter case, conditions of rebuttal “indicating circumstances in which the authority of the warrant would have to be set aside” (Toulmin, 1958, p. 101) might be mentioned. In turn, a warrant can be defended by appeal to a backing that can be expressed in the form of categorical statements of fact (Toulmin, 1958, p. 105). A backing can be provided by a system of taxonomic classification, by a statute, by statistical results, or by a mathematical theory. Toulmin’s model of argument thus contains six related elements organized as shown in Fig. 1. A step in Toulmin’s model appears as a deductive step: the data and warrant lead to the claim.
Fig. 1 Toulmin’s model of argumentation (Data, Qualifier, Claim; since Warrant; on account of Backing; unless Rebuttal)
However, Toulmin’s model can be used to represent other argument structures (Pedemonte, 2007), for example, the abductive structure. Abduction is an inference that allows the construction of a claim starting from an observed fact (Magnani, 2001; Peirce, 1960; Polya, 1962). Based on Peirce’s (1878) formulation of abduction, there are at least three different kinds of abduction (Eco, 1983) that can be represented using Toulmin’s model (Fig. 2): overcoded, undercoded, and creative abduction (Bonfantini & Proni, 1983; Magnani, 2001). Overcoded abduction occurs when the arguer is aware of only one rule from which that case would follow (Eco, 1983, p. 206). The general rule is represented as the warrant in Toulmin’s model. The claim and warrant are fixed, but data, represented by the question mark, should be found. If there are multiple general rules to be selected from, Eco identifies it as “undercoded abduction.” In this case, not only does the data have to be sought to justify the claim, but the rule that can be used to justify it should also be found. In Toulmin’s model, the warrant is represented by a set of different rules, and the arguer should select one of them. The warrant and the data, represented by the question marks in Toulmin’s model, are not fixed. Magnani (2001) links overcoded and undercoded abductions together as selective abductions. Selective abduction is defined as the process of finding the right explanatory hypothesis from a given set of possible explanations. In this case, the arguer should find the most appropriate rule to construct the conclusion from the set of rules he or she has access to. However, there may be no general rule known to the arguer that would imply the given case. Thus, the arguer must invent a new rule. Eco (1983) calls an abduction that involves the invention of a new rule a creative abduction. The question mark in the warrant means that the rule should be sought to justify the claim. Unlike the undercoded cases, this rule should be created, not selected from among a set of existing rules.
Fig. 2 Representation of abduction in Toulmin’s model
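The distinction between the three kinds of abduction can be made concrete with a small data structure. The following is only an illustrative sketch, not part of the chapter's method: the class `ToulminArgument`, the function `abduction_kind`, and the convention that a missing datum is written as `None` (standing for the question mark in the schema) are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToulminArgument:
    """One step of an argument laid out with Toulmin's six elements (hypothetical encoding)."""
    claim: str
    data: Optional[str] = None  # None stands for the "?" in the abductive schema
    warrants: List[str] = field(default_factory=list)  # rules the arguer is aware of
    qualifier: Optional[str] = None
    rebuttal: Optional[str] = None
    backing: Optional[str] = None


def abduction_kind(arg: ToulminArgument) -> str:
    """Classify a step by what is still missing, following Eco's three cases."""
    if arg.data is not None:
        return "data given"  # the step reads as data + warrant -> claim
    if len(arg.warrants) == 1:
        return "overcoded"   # one known rule, only the data must be found
    if len(arg.warrants) > 1:
        return "undercoded"  # a rule must be selected, then the data found
    return "creative"        # no known rule: one has to be invented


# Example inputs are hypothetical and only reuse the claim HG = 10 from the text.
one_rule = ToulminArgument(claim="HG = 10", warrants=["midpoint theorem"])
many_rules = ToulminArgument(claim="HG = 10", warrants=["midpoint theorem", "Pythagoras"])
no_rule = ToulminArgument(claim="HG = 10")

print(abduction_kind(one_rule), abduction_kind(many_rules), abduction_kind(no_rule))
```

In this encoding, a student who supplies a warrant for a bare claim turns a creative or undercoded step into an overcoded one, which is then completed by finding the data.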
Research (Pedemonte & Reid, 2010) shows that undercoded and, above all, creative abductions are the most difficult to manage when constructing a proof because a great deal of irrelevant information may be involved in the argumentation process, confusing and creating disorder in the student’s thought process. However, these abductions could be part of educational strategies to support students in solving problems when they struggle or make mistakes. Instructors’ interventions formulated as undercoded or creative abductions could be transformed into overcoded abductions by the students, decreasing the gap between students’ argumentation and the problem’s answer. As shown in the next section, abductive arguments constructed by the instructors should provide a claim, not a warrant, so as not to interrupt cognitive unity in students’ argumentations.
Instructors’ Interventions That Support Cognitive Unity
Experimental research (Pedemonte, 2018) shows that to support students in constructing a proof, instructors’ interventions should support cognitive unity between argumentation and proof. When a student uses an incorrect conception to construct a conjecture, the claim of the argument is, in general, not correct. Therefore, an instructor’s intervention is necessary to invalidate the argument and help the student modify the strategy to solve the problem. The intervention runs as a rebuttal in the student’s argumentation if it does not interrupt cognitive unity between argumentation and proof. Cognitive unity is maintained if the backing of the instructor’s argument is the same as the student’s and if the warrant is “consistent” with the one used by the student. In this chapter, the didactical intervention provided by an instructor is analyzed from a structural point of view (Pedemonte, 2007). The structure of an argument is the logical cognitive connection between statements (e.g., deduction, abduction, and induction structures). Presumably, undercoded or creative abductive arguments are most effective in supporting cognitive unity in a student’s argumentation: the student can complete the undercoded or creative abduction with a warrant, transforming it into an overcoded abduction. Two interactions between student and instructor are analyzed to explain this point. The students are solving the following geometrical problem:
ABCD is a rectangle
AC = 20 cm
E, F, G, H are the midpoints of the sides AB, BC, CD, DA
Find the perimeter of EFGH
The student is expected to solve the problem using the midpoint theorem: the segment (HG) joining two sides (AD and DC) of a triangle (ADC) at the midpoints (H and G) of those sides is parallel to the third side (AC) and is half the length of the third side. Thus, HG is 10 cm long, and the perimeter of EFGH measures 40 cm.
Example 1
A student, Victor (V), approaches the problem by inserting variables on the drawing, labeling DG = x and DH = y. He then attempts to construct a system of equations to find x and y. He struggles to find the second equation, which leads to the instructor (I) intervening. The instructor’s argument is deductive because the instructor knows how to solve the problem. However, it is perceived by Victor as an undercoded abduction: Victor only uses the claim of this argument (HG = 10). He justifies this claim by solving a system of equations and he looks for data to complete this argument. The undercoded abduction provided by the instructor is only partially used by Victor because otherwise it would interrupt cognitive unity in Victor’s argumentation. It is used by Victor to decrease the gap between his argumentation and the problem’s answer.
At this point, Victor does not know how to continue his reasoning.
1. V: I called the side DG $x$, and this other side DH $y$. We know that DC is $2x$ and AD is $2y$. So $2x$ squared + $2y$ squared is equal to 20 squared.
Victor shows the equation he has written. He also simplifies it by dividing both sides of the equation by 2. He gets $x^2 + y^2 = 100$.
Victor simplifies the first equation and looks for another equation to complete the system.
2. I: What are you trying to do?
3. V: I used Pythagoras. I like Pythagoras… I only need to find another equation to solve the system to find $x$ and $y$. I need two equations because there are two variables, right? And if I know $x$ and $y$ it’s easy to get HG. But I can’t find another equation.
Victor looks at the figure to construct a second equation. The instructor provides the solution and the rule to solve the problem.
4. I: But if you use the midpoint theorem you immediately see that HG is 10.
Through an overcoded abduction, the instructor suggests the solution to the problem: HG is equal to 10 by the midpoint theorem. Victor could use it to find data to apply the theorem.
5. V: Is it 10? Wait… oh yeah. I don’t need another equation because I already know that $x^2 + y^2$ is $HG^2$, so HG is 10.
6. I: Ok, but do you think you need to take $x$ and $y$?
7. V: Maybe not, but this works.
Victor writes $HG^2 = x^2 + y^2 = 100$, so $HG = 10$. Afterward, Victor calculates the perimeter of the quadrilateral EFGH.
However, Victor only uses the claim HG = 10 and solves the problem by replacing the instructor’s warrant with his own rule: solving a system of equations.
The previous argument is constructed by Victor reworking the instructor’s argument. This argument helps Victor find the solution because he realizes that the second equation is not necessary ($x^2 + y^2$ is equal to $HG^2$).
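The two routes to HG = 10 in Example 1 can be checked numerically. The sketch below is an illustration added here, not part of the study: the coordinates, the function names, and the sample side lengths are assumptions, chosen only so that the diagonal AC measures 20. It computes HG directly from the midpoints (the instructor's warrant) and from Pythagoras on the right triangle with legs DG = x and DH = y (Victor's warrant).

```python
import math


def midpoint(p, q):
    """Midpoint of the segment joining two points given as (x, y) tuples."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def check_rectangle(width, height):
    """Return (AC, HG, HG via Victor's x and y, perimeter of EFGH)."""
    # Hypothetical placement: D at the origin, C on the x-axis, A on the y-axis.
    D, C, B, A = (0.0, 0.0), (width, 0.0), (width, height), (0.0, height)
    E, F, G, H = midpoint(A, B), midpoint(B, C), midpoint(C, D), midpoint(D, A)
    ac = math.dist(A, C)
    hg = math.dist(H, G)                      # read off the figure directly
    x, y = math.dist(D, G), math.dist(D, H)   # Victor's variables
    hg_victor = math.hypot(x, y)              # HG^2 = x^2 + y^2
    perimeter = (math.dist(E, F) + math.dist(F, G)
                 + math.dist(G, H) + math.dist(H, E))
    return ac, hg, hg_victor, perimeter


# Three rectangles whose diagonal is 20 (side lengths are arbitrary choices).
for width, height in [(12.0, 16.0), (16.0, 12.0), (10.0, math.sqrt(300.0))]:
    ac, hg, hg_victor, perimeter = check_rectangle(width, height)
    print(round(ac, 6), round(hg, 6), round(hg_victor, 6), round(perimeter, 6))
```

Whatever the proportions of the rectangle, both warrants give the same claim, which is why Victor can keep his own rule once the value HG = 10 is on the table.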
Example 2
In this case, Ron (R) incorrectly solves the problem. He writes that HG is 5 cm long because he thinks HG is one fourth of AC, given that the rectangle HDGO is 1/4 of the rectangle ADCB. The instructor (I) intervenes to help Ron understand the mistake.
At this point, Ron explains his solution.
1. R: I think that HG is 5 because this rectangle (he points to HDGO) is one fourth of the rectangle ADCB, so HG is one fourth of AC. Then it is 5.
Ron is mobilizing an incorrect conception that cannot evolve into a theorem. Ron believes that his reasoning is correct.
The instructor intervenes.
2. I: It is not correct; you should use the midpoint theorem. You can see that the length is not 5, but 10.
3. R: What do you mean? I’m confused.
Through an abductive step (that is deductive for the instructor), the instructor suggests the solution to the problem (HG is equal to 10) and a different warrant: the midpoint theorem.
The instructor’s intervention interrupts cognitive unity in the student’s argumentation. Ron does not know how to use the information provided by the instructor to solve the problem. The argument is not useful to Ron because it interrupts cognitive unity in his argumentation.
1. R: This rectangle is ¼ of ABCD, so this diagonal is 20 divided by 4. Right?
2. I: It’s 10, not 5.
3. R: The half? [Silence]
It is only when the instructor transforms the overcoded abductive step into an undercoded or creative one (the warrant is not given) that Ron can use it.
The student is looking at the picture.
4. R: Oh, now I see why. It’s 10 because this rectangle (HDGO) is equal to this one (AHOE), so the diagonals are equal, and AO is 5 because it’s half of AC.
Ron uses the claim HG = 10 to modify his argument. The claim provided by the instructor runs as a rebuttal in the student’s argument. Ron realizes that there is something wrong in his reasoning.
Ron looks at the figure, and he finally sees that HG is equal to AO because the two rectangles HDGO and AHOE are equal. Cognitive unity is reconstructed.
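Ron's initial conception transfers a ratio of areas to a ratio of lengths. A short numerical sketch, added here as an illustration under assumed side lengths (12 and 16, so that AC = 20; the function name is also an assumption), shows why the two ratios differ: halving both sides of a rectangle divides its area by four but its diagonal only by two.

```python
import math


def area_and_diagonal(width, height):
    """Area and diagonal length of a rectangle with the given sides."""
    return width * height, math.hypot(width, height)


# Hypothetical side lengths for ADCB; any pair with diagonal 20 would do.
width, height = 12.0, 16.0
area_big, diagonal_big = area_and_diagonal(width, height)              # ADCB, diagonal AC
area_small, diagonal_small = area_and_diagonal(width / 2, height / 2)  # HDGO, diagonal HG

area_ratio = area_small / area_big            # the ratio Ron starts from
length_ratio = diagonal_small / diagonal_big  # the ratio that matters for HG

print(area_ratio, length_ratio, diagonal_small)
```

The comparison Ron eventually makes (HDGO is congruent to AHOE, so HG equals AO, half of AC) is a way of reaching the length ratio without leaving his rectangle-based reasoning.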
As in the previous case, the instructor’s overcoded abduction is not used by Ron because it would interrupt cognitive unity in his argumentation. Instead, Ron uses the second argument provided by the instructor, based on undercoded (or creative) abduction. Ron needs to find or create a rule to justify the correct claim provided by the instructor. Ron’s warrant changes at the end, but it is consistent with the incorrect rule applied in his first argument (HG is 1/4 of AC because the area of HDGO is 1/4 of the area of ADCB). Ron justifies the instructor’s claim by comparing the small rectangles (HDGO with AHOE). Ron modifies his argument because the claim provided by the instructor is different from his claim and the qualifier of the instructor’s argument is surely stronger than the qualifier of Ron’s argument. These two examples show that a didactical intervention is sometimes necessary to support students in constructing proofs when, engaged in problem-solving activities, they are unable to complete their argumentations or they are using incorrect rules. Arguments provided by instructors should have the following characteristics:
– Not interrupting cognitive unity in the student’s argumentation.
– Decreasing the distance between the student’s argumentation and the correct answer.
– Being a rebuttal in the student’s argumentation when the student’s argumentation is based on incorrect conceptions and mistakes are observed.
– Being an undercoded or creative abductive argument for the student (from the instructor’s point of view it is a deductive argument because the instructor knows how to solve the problem). The student can transform it into an overcoded abductive argument that runs as a bridge between the student’s argumentation and the proof.
In the next section, through the analysis of three cases, these characteristics are analyzed while students are solving two algebraic problems.
Teaching Experiment
Data is taken from a set of resolution processes gathered from two College Algebra courses taught at a university in Northern California, USA. In these courses, algebra is not usually considered as a way of seeing and expressing relationships but as a body of rules and procedures for manipulating symbols. Students are usually able to develop their calculus skills, but they are not aware of the axioms and theorems they are using in performing it. Thus, algebra is taught and learned as a language, and emphasis is given to its syntactical aspects. The instructor of the two courses, however, used a different approach to teaching algebra. She has been teaching algebra at the university for 5 years. She is a researcher in mathematics education, and her main interests include understanding cognitive processes involved in solving mathematical problems and finding new teaching methods to support mathematical learning. Her teaching methods focus on challenging students to be explicit about the rules they use in algebraic manipulations and to solve algebraic problems by justifying each step in their solutions with the appropriate algebraic rules. Thus, in these courses, typical problem-solving tasks could be considered as proving problems (Weber, 2005). In this study, the students were required to solve different algebraic problems during an assessment. The solutions of two selected problems, a polynomial factorization (Table 1) and a system of equations (Table 3), were collected and analyzed. For each problem, 60 solutions were gathered. Test results with incorrect solutions were grouped according to the types of mistakes made by the students. During office hours, the instructor organized one-on-one meetings with the students who did not correctly solve the problems or did not complete them. Each meeting lasted about 10 min.
Through appropriate abductive arguments, the instructor tried to assist each student to solve the problem, understanding and overcoming their possible mistakes. A few meetings were recorded and transcribed to analyze in detail the interaction between the instructor and the students. This study aimed to verify the claim that an instructor’s abductive arguments are effective interventions for supporting students in constructing proofs.
Table 1 Polynomial factorization problem and relative students’ responses

Factor the following: $\frac{1}{2}a^2 + a + ab + 2b$

| Students’ answers | Number of students |
|---|---|
| Correct | 20 (33.3%) |
| Incomplete | 12 (20%) |
| No answer | 7 (11.7%) |
| Incorrect | 21 (35%) |
| Total | 60 |
29 Abductive Arguments Supporting Students’ Construction of Proofs
Analysis of the Students’ Answers to Two Algebraic Problems

Overall results are presented for the two problems, including a classification of mistakes made by students. An analysis of three special cases is developed to show the main characteristics of the instructor’s abductive arguments that helped her students understand their mistakes and eventually solve the problem.
Polynomial Factorization Problem

As the students’ answers confirm (Table 1), factoring is usually one of the most difficult concepts in algebra. Before the assessment, the instructor spent at least 3 hours on factorization rules, including a special in-class activity on this topic in which students worked in small groups (three or four students per group) to solve 15 factorization problems. Among the 60 students, only 20 (33.3%) correctly solved the problem. Twelve students were not able to complete the factorization. Most of them (eight out of 12) started to factor the polynomial by grouping, writing $a\left(\frac{1}{2}a + 1\right) + b(a + 2)$. The process was interrupted because the two terms did not have anything in common ($\frac{1}{2}a + 1$ is different from $a + 2$). The other four students who did not complete the factorization tried to factor out the letter $a$ from the first three terms, writing $a\left(\frac{1}{2}a + 1 + b\right) + 2b$ as their answer to the problem. Many students (21 out of 60) made different mistakes, which are described in Table 2.

During office hours, the instructor asked the students who provided incorrect answers to expand them. Students who made mistakes 1, 4, or 5 could easily observe that their answers were incorrect. However, they were unable to find an alternative strategy. Asking the students to expand their answers was not effective for most students who provided answers 2 or 3. These students obtained the initial polynomial after expanding their answers. Most of the students who made mistake 2 inserted the missing addition sign during the expansion process. In the students’ minds, the sign was implicitly present. Students who made mistake 3 realized that an $a$ was missing in the expression $a\left(\frac{1}{2}a\right) + a + b(a + 2)$; they added it, but they were unable to continue the process to solve the problem. In all cases, expanding the expression was not useful in providing an alternative answer.
At this point, the instructor provided the correct answer through an abductive argument (Fig. 3) and asked the students to figure out a way to complete the factorization.
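For reference, the complete factorization by grouping can be written out as a short worked derivation. This is a sketch of the standard computation, not a reproduction of the instructor's figure; the decisive step is taking $\frac{1}{2}a$, rather than $a$ alone, out of the first two terms so that both groups share the factor $(a + 2)$.

```latex
\begin{align*}
\tfrac{1}{2}a^2 + a + ab + 2b
  &= \left(\tfrac{1}{2}a^2 + a\right) + (ab + 2b) \\
  &= \tfrac{1}{2}a\,(a + 2) + b\,(a + 2) \\
  &= \left(\tfrac{1}{2}a + b\right)(a + 2)
\end{align*}
```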
Table 2 Kinds of mistakes students made in factoring the polynomial

| Kind of mistake | Answer | Description of the mistake | Number of students who provided this answer |
|---|---|---|---|
| 1 | $(a + b)\left(\frac{1}{2}a + 1\right)(a + 2)$ | After factoring by grouping, a third factor $(a + b)$ is added because the other two terms are not equal | 6 |
| 2 | $a\left(\frac{1}{2}a + 1\right)\, b\,(a + 2)$ | An addition sign is forgotten between the two terms of the factorization | 7 |
| 3 | $a\left(\frac{1}{2}a\right) + b(a + 2)$ | During factorization by grouping, the 1 (coefficient of $a$) is forgotten. It is considered as 0 | 2 |
| 4 | $a\left(\frac{1}{2}a + 1\right) + b(a + 2) = a(a + 2) + b(a + 2) = (a + b)(a + 2)$ | The first part of the expression is multiplied by 2 to take away $\frac{1}{2}$ | 3 |
| 5 | | The polynomial is multiplied by 2 to cancel out $\frac{1}{2}$. The number is only multiplied with the first term | 3 |
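The expansion check that the instructor asked for can also be carried out numerically. The following is a minimal sketch, not part of the study: it evaluates the original polynomial and a few candidate answers on a small grid of exact rational values and reports whether they agree everywhere. The labels and the helper name `agrees_with_original` are illustrative assumptions. Because every expression here has degree at most 3 in each variable, agreement on a 7 × 7 grid is enough to establish equality as polynomials.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def original(a, b):
    """The polynomial to be factored: (1/2)a^2 + a + ab + 2b."""
    return HALF * a**2 + a + a * b + 2 * b


# Candidate answers as functions of a and b (labels are illustrative).
candidates = {
    "correct": lambda a, b: (HALF * a + b) * (a + 2),
    # Incomplete grouping: equivalent to the polynomial, but not yet factored.
    "grouping": lambda a, b: a * (HALF * a + 1) + b * (a + 2),
    "mistake 1": lambda a, b: (a + b) * (HALF * a + 1) * (a + 2),
    "mistake 4": lambda a, b: (a + b) * (a + 2),
}


def agrees_with_original(candidate, samples=range(-3, 4)):
    """Return True if the candidate equals the polynomial at every grid point."""
    return all(
        candidate(a, b) == original(a, b) for a in samples for b in samples
    )


results = {name: agrees_with_original(f) for name, f in candidates.items()}
for name, ok in results.items():
    print(f"{name}: {'equivalent' if ok else 'not equivalent'}")
```

A check of this kind only says whether an answer is equivalent to the polynomial. As the students' difficulties show, it does not suggest which rule to apply next, which is the role played by the instructor's abductive argument.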
Fig. 3 Instructor’s abductive argument to the factorization problem
This argument was not useful for students who made mistake 1 and for students who provided the incomplete answer $a\left(\frac{1}{2}a + 1 + b\right) + 2b$. However, in most of the other cases (19 out of 23), the students were able to solve the problem correctly. Consider the two examples below that show how the interaction between a student and the instructor was effective in helping the student construct a correct answer.

Case 1

Tom (T) did not complete the problem. He wrote the expression $a\left(\frac{1}{2}a + 1\right) + b(a + 2)$ and was unable to continue. The instructor (I) provided the answer and asked him to try to solve the problem.
1. I: This is the answer. How can you get this answer?
2. T: I don’t know how to transform this expression to get the answer. These are not equal. (Tom is looking at $\frac{1}{2}a + 1$ and $a + 2$.)

Analysis: In Toulmin’s model, the student’s initial argument can be represented as follows: [diagram not reproduced]. Tom knows he can factor by grouping, but he is unable to do that because the two expressions $\frac{1}{2}a + 1$ and $a + 2$ are not equal.

3. I: Why do you want them to be equal?
4. T: If they are equal, I can factor by grouping. [Silence] I should find this expression. (Tom writes $\frac{1}{2}a(a + 2) + b(a + 2)$.)

Analysis: The undercoded abduction provided by the instructor is transformed by the student into an overcoded abduction because Tom knows the warrant. He needs to factor by grouping to find the answer.

5. T: Only the first terms are different. I should have $a + 2$. Instead, I have $\frac{1}{2}a + 1$. I need to make them equal. Maybe if I take out one half… Can I do that? Can I take out one half?

Analysis: Tom partially distributes the answer provided by the instructor to see what he needs to find. In this way, he further decreases the distance between his answer and the correct one. Now the instructor’s argument is transformed into the following one: [diagram not reproduced]. The gap between the student’s answer $a\left(\frac{1}{2}a + 1\right) + b(a + 2)$ and the correct one is reduced. Tom tries to transform his answer into the instructor’s answer: $\frac{1}{2}a(a + 2) + b(a + 2)$. He notes that the two expressions differ only in the first part. Tom looks for a way to make these parts equal.

6. I: What do you mean?
7. T: Instead of taking out only $a$, I also take out one half, and 1 becomes 2, right? So, I can factor by grouping.

Analysis: Tom looks for a rule to make the two expressions equal. He factors out $\frac{1}{2}$ from the first part of the expression to get the answer. The previous argument is transformed into a deductive step. This argument is the connection between Tom’s argumentation and the instructor’s argument elaborated by the student. Tom can completely factor the expression.
Tom transformed the abductive step provided by the instructor into an overcoded abduction. He used the instructor’s argument, which supports cognitive unity in students’ argumentation, to reduce the gap between his initial answer and the correct one.

Case 2

In this second example, Alice provided answer 4 in Table 2 as a solution for this problem. The instructor provided the correct answer and asked Alice to figure out a way to get it. The instructor’s argument ran as a rebuttal to the argument of the student, who compared her answer with the correct one. Through the rebuttal, Alice understood her mistake: her answer differed from the instructor’s answer by the fraction $\frac{1}{2}$. Alice realized that she had multiplied the expression by 2 without dividing it. As in the previous case, the abduction provided by the instructor was introduced into the student’s argumentation without interrupting its cognitive unity.
System of Equations Problem

Results for this problem suggest that providing the answer to the problem as the claim of the instructor’s abductive argument is not always the best educational strategy. For example, when students make a mistake, the instructor’s intervention should focus on it. The instructor spent 4 hours teaching how to solve linear systems of equations. She focused on three methods: substitution, addition, and graphical methods. During the assessment, most students used the substitution method, and no student used the graphical method.
1. I: This is the correct answer. How can you get this answer?
2. A: I don’t know. My answer is different. I found $(a + b)(a + 2)$, which we have seen is not correct. (She previously expanded her answer.)
3. I: What did you do to find this answer?
4. A: I tried to factor by grouping, so I could take out $a$ from the first two terms and $b$ from the other terms. (Alice shows the statement $a\left(\frac{1}{2}a + 1\right) + b(a + 2)$.)

Analysis: The student explains her reasoning. She factors $a$ from the first two terms and $b$ from the other terms.

5. A: I multiplied by 2 here (she shows the first part of the expression) to get $a + 2$ inside the parenthesis, and I factored by grouping… but there is something wrong.

Analysis: She incorrectly multiplies the first term by 2 to have an expression in it equal to $(a + 2)$ so that it’s part of the second term. She needs to continue factoring by grouping the expression. Alice’s copy was the following: [image not reproduced]. Alice transforms the undercoded abduction from the instructor into an overcoded abduction, selecting the rule to complete the warrant: factoring by grouping.

6. I: Yes, your answer is different from the correct one.
7. A: How can I get this answer factoring by grouping? I think the first part is correct. Then I multiply the first term by 2, and this part is correct too, right? Maybe not, in the correct answer, ½ is still there but not in my answer [Silence]… I know why I got it wrong… If I multiply by 2, I also have to divide by 2; otherwise, I change the expression… this was the mistake. I had to divide by 2. Right?

Analysis: Alice observes that the first part of the expressions is different. The instructor’s intervention runs as a rebuttal in the argument of the student, who was already aware that her answer was incorrect. Alice realizes that she also needs to divide the first part of the expression by 2. She inserts ½ in the first part of the expression. She writes: $\frac{1}{2}a(a + 2) + b(a + 2)$.

8. A: Now I think it is correct. (Alice writes her final answer.)

Analysis: Alice’s final copy is the following: [image not reproduced]
Table 3 presents the problem and the students’ answers. Twenty-five students out of 60 (41.7%) correctly solved the problem, 11 out of 60 (18.3%) did not provide an answer (i.e., they did not try to solve the problem or only copied the two expressions), and 24 out of 60 (40%) provided incorrect answers. The largest group among these 24 students (eight out of 24) made mistakes in operating with fractions, and a smaller group (five out of 24) incorrectly applied algebraic rules (e.g., the sign was incorrectly written in moving a variable from one side of the equation to the other). One fourth of the 24 students (six out of 24) made calculation mistakes using a correct process. Five out of 24 students did not try to solve the system but simply substituted numbers for $x$ and $y$ to look for a correct answer. These mistakes are described in Table 4.

During office hours, the instructor met each of the 24 students who incorrectly solved the problem. She initially provided the solution to the problem through a graphical representation: the solution was presented on the Cartesian plane as the intersection point between the two lines representing the two equations. The instructor then asked the students to figure out a way to solve the system. The instructor’s argument can be represented as an undercoded (or creative) abductive step (Fig. 4). Data is partially represented from the system of equations, but a few steps are necessary to find the claim.

Table 3 System of equations problem and relative students’ responses

Solve this system of equations: $3x + y = 1$, $y = \frac{1}{2}x + \frac{9}{2}$

| Students’ answers | Number of students |
|---|---|
| Correct | 25 (41.7%) |
| No answer | 11 (18.3%) |
| Incorrect | 24 (40%) |
| Total | 60 |
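As an independent check of the intersection point used in the instructor's graphical argument, the system can be solved with exact rational arithmetic. The sketch below uses Cramer's rule, which is not one of the three methods taught in the course (substitution, addition, graphical); it is chosen here only because it is short. The function name `solve_2x2` is a hypothetical helper.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# 3x + y = 1, and y = (1/2)x + 9/2 rewritten as -(1/2)x + y = 9/2.
x, y = solve_2x2(
    Fraction(3), Fraction(1), Fraction(1),
    Fraction(-1, 2), Fraction(1), Fraction(9, 2),
)
print(x, y)
```

Working with `Fraction` rather than floating-point numbers keeps the halves exact, which matters here because most of the students' mistakes involve the fractional coefficients.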
Table 4 Kinds of mistakes students made solving a system of equations

| Kind of mistake | Description of the mistake | Example | Number of students who provided this answer |
|---|---|---|---|
| 1 | Mistakes operating with fractions (e.g., students forget to multiply one of the terms by the denominator) | $3x + y = 1$, $y = \frac{1}{2}x + \frac{9}{2}$ (2). So $3x + y = 1$, $y = x + 9$. So $y = -3x + 1$, $x - (-3x + 1) = -9$ | 8 |
| 2 | Mistake related to an incorrect use of the sign (in the substitution method, the sign of one of the terms is not changed) | $y = -3x + 1$, $\frac{1}{2}x + 3x + 1 = -\frac{9}{2}$ (2) | 5 |
| 3 | Calculation mistakes (most of them are related to fractions) | $3x + y = 1$, $\frac{1}{2}x - y = -\frac{9}{2}$, $\frac{7}{2}x = -\frac{9}{2}$ | 6 |
| 4 | Inserting values for $x$ and $y$ trying to guess the solution | $3(1) + (-2) = 1$, $y = \frac{1}{2}(1) + \frac{9}{2}$ | 5 |
No student was able to use this abductive step to solve the system. Most of the students just substituted the coordinates of the intersection point ($x = -1$, $y = 4$) into the equations and verified that the answer was correct. This result was not unexpected because this instructor’s abductive argument did not support cognitive unity in students’ argumentation. Cognitive unity is maintained if the representation system (the backing in Toulmin’s model) in which the students’ argumentation is expressed is not modified. In this example, the answer to the problem, provided by the instructor, is presented using a graphical representation (e.g., the two lines in the Cartesian plane), while the students used algebraic manipulation. No student used the graphical method to solve the system of equations. As already observed in previous research (Pedemonte, 2018), instructors’
Fig. 4 Instructor’s abductive argument to the system of equations problem
interventions are not effective if they do not support the cognitive unity in the students’ argumentation. Furthermore, more arguments are necessary to find the final answer to the problem starting from the two equations. The instructor’s abductive argument did not decrease the distance between the student’s argument and the correct answer; it did not suggest a rule to solve the problem. The abductive argument from the instructor was interpreted by the student more as a creative abduction than an undercoded one because the student was unable to select a rule from a set of theorems. Finally, the instructor’s argument did not support the students in understanding possible mistakes made in their previous argumentations.

The instructor modified the didactical strategy by asking the students to explain their reasoning in solving the problem. An abductive argument was designed by the instructor when a mistake was identified. An intermediate correct answer was selected as the claim of the abductive argument. The students were then required to figure out a way to solve the system using that argument. The instructor was able to help the students provide the correct answer in almost every case except when they made the fourth type of mistake, merely plugging in values for $x$ and $y$. Below is an example in which the instructor’s intervention was effective in supporting a student in understanding his mistake and providing the answer to the system.

Case 3

Initially, this student, Carlo, made mistake 3.
1. C: I have found −2 and −5.
2. I: Ok, but the correct answer is −1 and 4.
3. C: Yes, I know, my answer is not correct, but I don’t know how to find the correct answer.
4. I: Can you explain your reasoning? Which method did you use?
5. C: The substitution method. I put the first equation in the second one… before I found $y$ in the first equation.
6. I: Ok, what did you get?
7. C: I got $x + 3x - 1 = -9$ and this should be correct because I multiplied by 2; I mean I multiplied the two sides of the equation by 2.

Analysis: Carlo incorrectly simplifies the equation. He forgets to multiply the second term $(-3x + 1)$ of the equation by 2.

8. I: You should find $x + 6x - 2 = -9$.

Analysis: The instructor introduces the correct answer in an undercoded abductive step. The instructor’s argument can be represented as follows: [diagram not reproduced]. The instructor introduces the answer. Her argument is abductive.

9. C: I multiplied both sides by 2. Did you do the same? Because I do not understand why you get $6x - 2$. I think you substituted the first equation into the second one…

Analysis: Carlo tries to understand his mistake. The instructor’s intervention runs as a rebuttal in Carlo’s argument. Carlo focuses on the expression $\frac{1}{2}x - (-3x + 1) = -\frac{9}{2}$. Carlo expands the expression on the left side of the equation. He writes $\frac{1}{2}x + 3x - 1 = -\frac{9}{2}$.

10. C: Wait a minute. I know what I did wrong, I forgot to multiply these numbers (he shows $3x - 1$) by 2. If you multiply by 2, you get $6x - 2$. I forgot to multiply by 2 this part of the equation…

Analysis: Carlo transforms the instructor’s argument into an overcoded abduction. Carlo needs to figure out a way to find the correct answer starting from his equation. He expands the expression to get close to the instructor’s answer. He is getting close to the correct answer. Carlo can use the previous warrant to complete this argument: multiplying both sides of the equation by 2. This time he performs the multiplication correctly. Carlo realizes that he forgot to multiply one part of the equation by 2.

Carlo was able to correctly solve the system.
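The step at issue in Case 3 can be written out as a short derivation. This is a sketch under an assumption: that Carlo substituted $y = -3x + 1$ from the first equation into the second equation rearranged as $\frac{1}{2}x - y = -\frac{9}{2}$, the form that appears in Table 4. Multiplying every term by 2 gives the instructor's intermediate expression and the solution $-1$ and $4$:

```latex
\begin{align*}
\tfrac{1}{2}x - (-3x + 1) &= -\tfrac{9}{2} \\
2\cdot\tfrac{1}{2}x - 2\,(-3x + 1) &= 2\cdot\left(-\tfrac{9}{2}\right) \\
x + 6x - 2 &= -9 \\
7x &= -7 \\
x &= -1, \qquad y = -3(-1) + 1 = 4
\end{align*}
```

If the factor 2 is not applied to the parenthesis, the same starting equation leads instead to

```latex
\begin{align*}
x - (-3x + 1) &= -9 \\
x + 3x - 1 &= -9 \\
x &= -2
\end{align*}
```

which agrees with the value $-2$ that Carlo reports at the start of the exchange.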
In this example, the abductive argument provided by the instructor is effective when it is introduced in the student’s argument to support cognitive unity in argumentation. When a chain of steps is necessary to complete an argumentation, the instructor’s intervention might not be effective even if the answer to the problem is made explicit. To support the students’ comprehension of their mistakes that would then lead them toward the correct solution process, an instructor’s abductive arguments should be part of a student’s argumentation and close enough to the students’ mistakes to support them in recognizing and overcoming those mistakes.
Implications of the Study

In this chapter, abductive arguments are analyzed to show that they can be used by an instructor to support students in solving algebraic problems when they are unable to construct a proof or when they make mistakes. Research shows that when students are unable to construct a geometrical proof because their argumentations are based on incorrect conceptions (Balacheff, 2009) that can hardly evolve into theorems, an instructor’s intervention can be effective when it does not “interrupt” the cognitive unity between the students’ argumentation and proof. The instructor’s intervention should become a rebuttal to the students’ arguments to invalidate them. This could happen if the instructor’s argument has the same backing as the students’ arguments. In this chapter, the didactical interventions provided by an instructor are analyzed from a structural point of view intended here as the logical cognitive connection between statements (e.g., deduction, abduction, and induction structures) (Pedemonte,
2007). Instructors’ interventions based on deductive arguments incorporate correct claims, correct rules, and a deductive structure. Deductive arguments do not seem to be the best option to support students autonomously in constructing a proof. Rather, when instructor’s interventions are based on undercoded (or creative) abductions (Eco, 1983; Bonfantini & Proni, 1983; Magnani, 2001), they can decrease the distance between the students’ arguments and the correct answer to the problem, supporting cognitive unity in the students’ argumentations. This idea, developed through two cases in geometry, was verified in an experiment settled in algebra. Appropriate abductive arguments were used by the instructor to help students understand their mistakes and solve the problem. In the case of the factorization problem, the complete factorization of the expression was suggested to the students as an answer to the problem. In Toulmin’s model, this answer is represented as the conclusion of the instructor’s abductive argument. When the students had previously identified the factoring by grouping strategy as a rule to solve the problem, cognitive unity in students’ argumentation could be reconstructed. They used this strategy to complete the undercoded abduction transforming it into an overcoded one, where only data is missing. This overcoded abduction runs as a bridge between the initial students’ argumentation and the problem’s answer. The undercoded abduction was used by the students to identify the missing step to solve the problem (Case 1) or to recognize a possible mistake (Case 2). In this last case, the instructor’s abductive argument runs as a rebuttal to the students’ argumentation; the students acknowledged the mistake in comparing their incorrect answer with the correct answer provided by the instructor. 
The instructor’s argument was a rebuttal in students’ argumentation because it was expressed in the same representation system as the students’ argumentation (they both had the same backing) (Pedemonte, 2018). In the second problem (solving a system of linear equations), the first abductive argument provided by the instructor was expressed using a graphical representation (two lines in the Cartesian plane), while the students only used algebraic manipulation to solve the problem. Thus, the first argument provided by the instructor did not support cognitive unity in students’ argumentation. Furthermore, when the mistake was made in the middle of the solving process, not close to the final answer (Case 3), the instructor’s intervention provided the correct expression immediately after the student’s mistake. In Toulmin’s model, this intervention is represented as an undercoded abductive argument having as its conclusion the correct expression (not necessarily the answer to the problem). By comparing their incorrect expression with the correct expression provided by the instructor, the students realized that they had made a mistake. The second undercoded abductive argument provided by the instructor was effectively used by the student (Case 3) to understand his mistake, modify his argumentation, and solve the problem. This second intervention is different from the first one because it supports cognitive unity in the student’s argumentation: the claim is an intermediate correct answer that the student used to identify his mistake. When cognitive unity could not be constructed (the students were unable to use the warrant previously used in solving the problem), the instructor’s undercoded abductions could not be transformed into overcoded abductions by the students.
The interaction between the instructor and students failed in problem 1 when the students initially factored the polynomial using an incorrect factorization rule (they factored out $a$ from the first three terms of the polynomial) or when they made mistake 1 (providing a third-degree polynomial as an answer: $(a + b)\left(\frac{1}{2}a + 1\right)(a + 2)$). In these cases, the students did not reconstruct cognitive unity because the warrant they used in their argumentation could not be used as a warrant to complete the undercoded abduction provided by the instructor. The undercoded abductive arguments provided by the instructor could not be used to decrease the distance between the students’ argumentations and the correct answer.
Conclusion

In mathematics education, abductive reasoning is usually analyzed in relation to cognitive factors and learning effects. Research studies focusing on abduction as a teaching strategy and as a form of reasoning for a didactical intervention are missing. The main contribution of this chapter is to address this issue. We have observed that abductive arguments, suitably designed by the instructor, may support students in solving problems when they are unable to construct a proof or when they make mistakes. Abductive arguments, especially those based on undercoded (or creative) abductions, are effective educational strategies when students can transform them into overcoded abductions. In Toulmin’s model, the instructor’s intervention based on an undercoded abduction is represented as a ternary structure where the problem’s answer is the conclusion, and data and warrant are missing. Students should select (or create) a rule coherent with the rules previously used in their argumentations to complete the warrant in an instructor’s abductive arguments. By selecting (or creating) a warrant, the students can transform the instructor’s undercoded abductive argument into an overcoded abduction, where the claim and the warrant are fixed; only data should be found. The overcoded abduction runs as a bridge between the initial students’ argumentation and the problem’s answer, supporting cognitive unity in students’ argumentation.

An instructor’s undercoded abduction can also be used when students’ argumentation is based on incorrect conceptions or when students make mistakes. If the conclusion of the undercoded abductive argument provided by the instructor is close enough to the incorrect answer provided by the students, it can run as a rebuttal in students’ argumentation. In this case, the students can compare their incorrect answer with the undercoded abduction provided by the instructor and realize they have made a mistake.
They can change the answer and provide a correct justification to the problem. It is important that the correct answer provided by the instructor is close enough to the incorrect answer given by the students. In some cases, it should be appropriate to provide an intermediate answer to the problem, closer to the mistake made by the students than the final answer. This educational strategy is effective if cognitive unity can be reconstructed in students’ argumentation. An instructor’s interventions based on undercoded
abductions should decrease the distance between students’ argumentation and the answer to the problem. When this does not happen, and cognitive unity cannot be maintained, this educational strategy does not work. Further studies are necessary in determining educational strategies that can support students when they solve problems where cognitive unity is not constructed. It is also important to observe that the main aim of this educational strategy is not to lead the students to find the answer to the problem. The answer is given by the instructor in the undercoded abductive argument. Instead, the students are asked to complete the undercoded abduction selecting a rule (the warrant in Toulmin’s model) and correctly applying it to the data. This educational strategy requires the students to justify the answer to the problem. The focus is not the solution to the problem but how this solution is obtained. Thus, a typical problem-solving task (like a factorization problem or a system of linear equations) becomes a proving problem (Weber, 2005).
References Arzarello, F., Micheletti, C., Olivero, F., & Robutti, O. (1998). A model for analyzing the transition to formal proofs in geometry. In A. Olivier & K. Newstead (Eds.), Proceedings of the twentiethsecond annual conference of the International Group for the Psychology of Mathematics Education (vol. 2, pp. 24–31). Stellenbosch, South Africa. Balacheff, N. (2009). Bridging knowing and proving in mathematics: A didactical perspective. In G. Hanna, H. N. Jahnke, & H. Pulte (Eds.), Explanation and proof in mathematics. Philosophical and educational perspectives (pp. 115–135). Springer. Boero, P., Garuti, R., & Mariotti M. A. (1996). Some dynamic mental processes underlying producing and proving conjectures. In L. Puig & A. Gutierrez (Eds.), Proceedings of the twentieth conference of the International Group for the Psychology of Mathematics Education (vol. 2, pp. 121–128). Valencia. Bonfantini, M., & Proni, G. (1983). To guess or not to guess. In U. Eco & T. Sebeok (Eds.), The sign of three: Dupin, Holmes, Peirce (pp. 119–134). Indiana University Press. Cifarelli, V. (1999). Abductive inference: connections between problem posing and solving. In O. Zaslavsky (Ed.), Proceedings of the 23rd annual conference of the International Group for the Psychology of Mathematics Education (vol. 2, pp. 217–224). Haifa, Israel. Cifarelli, V., & Sáenz-Ludlow, A. (1996). Abductive processes and mathematics learning. In E. Jakubowski, D. Watkins, & H. Biske (Eds.), Proceedings of the eighteenth annual meeting of the North American Chapter of the International Group for the Psychology of Mathematics Education (Vol. I, pp. 161–166). ERIC Clearinghouse for Science, Mathematics, and Environmental Education. Eco, U. (1983). Horns, Hooves, Insteps: Some hypotheses on three types of abduction. In U. Eco & T. Sebeok (Eds.), The sign of three: Dupin, Holmes, Peirce (pp. 198–220). Indiana University Press. Ferrando, E. (2006). The Abductive System. In J. Novotná, H. Moraová, M. 
Krátká, & N. Stehlíková (Eds.), Proceedings of the thirtieth conference of the International Group for the Psychology of Mathematics Education (Vol. 3, pp. 57–64). PME. Garuti, R., Boero, P., Lemut, E., & Mariotti, M. A. (1996). Challenging the traditional school approach to theorems. In L. Puig & A. Gutierrez (Eds.), Proceedings of the twentieth conference of the International Group for the Psychology of Mathematics Education (vol. 2, pp. 113–120). Valencia.
636
B. Pedemonte
Garuti, R., Boero, P., & Lemut, E. (1998). Cognitive unity of theorems and difficulty of proof. In A. Olivier & K. Newstead (Eds.), Proceedings of the twentieth-second annual conference of the International Group for the Psychology of Mathematics Education (vol. 2, pp. 345–352). Stellenbosch, South Africa. Inglis, M., Mejia-Ramos, J. P., & Simpson, A. (2007). Modelling mathematical argumentation: The importance of qualification. Educational Studies in Mathematics, 66, 3–21. Knipping, C. (2003). Argumentation structures in classroom proving situations. In M. A. Mariotti (Ed.), Proceedings of the third conference of the European Society in Mathematics Education (unpaginated). Retrieved from http://ermeweb.free.fr/CERME3/Groups/TG4/ TG4_Knipping_cerme3.pdf Knipping, C. (2008). A method for revealing structures of argumentation in classroom proving processes. ZDM – The International Journal on Mathematics Education, 40(3), 427–441. Krummheuer, G. (2007). Argumentation and participation in the primary mathematics classroom: Two episodes and related theoretical abductions. The Journal of Mathematical Behavior, 26(1), 60–82. Lavy, I. (2006). A case study of different types of arguments emerging from explorations in an interactive computerized environment. The Journal of Mathematical Behavior, 25, 153–169. Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Kluwer Academic Publishers. Magnani, L. (2004). Conjectures and manipulations: Computational modeling and the extratheoretical dimension of scientific discovery. Minds and Machines, 14, 507–537. Mason, J. (1996). Abduction at the heart of mathematical being. In E. Gray (Ed.), Thinking about mathematics & Music of the spheres: Papers presented for the inaugural lecture of Professor David Tall (pp. 34–40). Mathematics Education Research Centre. Mason, J., Stephens, M., & Watson, A. (2009). Appreciating mathematical structures for all. 
Mathematics Education Research Journal, 21(2), 10–32. Nardi, E., Biza, I., & Zachariades, T. (2012). Warrant revisited: Integrating mathematics teachers’ pedagogical and epistemological considerations into Toulmin’s model for argumentation. Educational Studies in Mathematics, 79, 157–173. Pedemonte, B. (2007). How can the relationship between argumentation and proof be analyzed? Educational Studies in Mathematics, 66, 23–41. Pedemonte, B. (2008). Argumentation and algebraic proof. ZDM – The International Journal on Mathematics Education, 40(3), 385–400. Pedemonte, B. (2018). How can a teacher support students in constructing a proof? In A. J. Stylianides & G. Harel (Eds.), Advances in mathematics education research on proof and proving. An international perspective (pp. 115–130). Springer. ISSN:2520-8322. Pedemonte, B., & Balacheff, N. (2016). Establishing links between conceptions, argumentation and proof through the ck¢-enriched Toulmin model. The Journal of Mathematical Behavior, 41, 104–122. Pedemonte, B., & Reid, D. (2010). The role of abduction in proving processes. Educational Studies in Mathematics, 76(3), 281–303. Peirce, C. S. (1878, August 13). Deduction, induction, and hypothesis. Popular Science Monthly, 470–482. (Compiled in Peirce, C. S., 1960, 2.619-644). Peirce, C. S. (1960). Collected papers. Harvard University Press. Polya, G. (1962). How to solve it? Princeton University Press (French translation Mesnage C. Comment poser et résoudre un problème. Dunod (Ed.), Paris). Rivera, F. (2017). Abduction and the emergence of necessary mathematical knowledge. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 441–457). Springer. Rivera, F., & Becker, J.-R. (2007). Abduction–induction (generalization) processes of elementary majors on figural patterns in algebra. The Journal of Mathematical Behavior, 26, 140–155. Rivera, F., & Becker, J. R. (2016). 
Middle School Student’s patterning performance on semi-free generalization tasks. The Journal of Mathematical Behavior, 43, 53–69.
Sáenz-Ludlow, A. (2016). Abduction in proving. In A. Sáenz-Ludlow & G. Kadunz (Eds.), Semiotics as a tool for learning mathematics. Semiotic perspectives in the teaching and learning of mathematics series (pp. 155–179). Sense Publishers. https://doi.org/10.1007/978-94-6300337-7_8 Son, J. W. (2013). How preservice teachers interpret and respond to student errors: Ratio and proportion in similar rectangles. Educational Studies in Mathematics, 84, 49–70. Tall, D. (1995). Cognitive development, representations & proof, justifying and proving in school mathematics (pp. 27–38). Institute of Education. Toulmin, S. E. (1958). The uses of argument. Cambridge University Press. Toulmin, S. E. (1993). Les usages de l’argumentation (P. De Brabanter, Trans.). Presse Universitaire de France. Weber, K. (2005). Problem-solving, proving, and learning: The relationship between problemsolving processes and learning opportunities in the activity of proof construction. The Journal of Mathematical Behavior, 24(3–4), 351–360. Weber, K., & Alcock, L. (2005). Using warranted implications to understand and validate proof. For the Learning of Mathematics, 25(1), 34–38. Wood, T. (1999). Creating a context for argument in Mathematics Class Young Children’s concepts of shape. Journal for Research in Mathematics Education, 30(2), 171–191. Yackel, E. (2001). Explanation, Justification and argumentation in mathematics classrooms. In M. Van den Heuvel-Panhuizen (Eds.), Proceedings of the 25th conference of the international group for the psychology of mathematics education (vol. 1, pp. 1–9). Utrecht, Olanda. Yackel, E., & Rasmussen, C. (2002). Beliefs and norms in the mathematics classroom. In G. Toerner, E. Pehkonen, & G. Leder (Eds.), Mathematical beliefs and implications for teaching and learning. Kluwer.
Part VI Diagrams, Visual Models, and Abduction
Introduction to Diagrams, Visual Models, and Abduction
30
Gianluca Caterina and Rocco Gangle
Contents
Introduction
References
Abstract
C.S. Peirce’s thesis that all reasoning is fundamentally diagrammatic lies at the core of his approach to logic, as represented through the three levels of his Existential Graphs (Alpha, Beta, and Gamma). A more general notion of diagram, however, is ubiquitous in Peirce’s work and is closely related to the concept of abduction, although the connection of abduction with classical logical systems is not obvious. The six chapters comprising this section present a holistic approach to the notion of diagrams and diagrammatic reasoning: some chapters have a distinctly formal mathematical flavor, while others approach these issues from an epistemological perspective. The richness and variety of the approaches presented in the section highlight the deep connections between diagrams, visual models, and abduction.
G. Caterina · R. Gangle
Center for Diagrammatic and Computational Philosophy, Endicott College, Beverly, MA, USA
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_85

Introduction
The dynamics of abductive inference are exhibited in a particularly explicit way in the context of reasoning with diagrams and visual models. This is because such diagrams and models are already in themselves the products of a kind of
hypothesis or conjecture. This hypothesis or conjecture, which both grounds and defines diagrams and models as such, is the way they are taken by some cognitive agent to represent some target domain of cognition in a visual or otherwise sensible medium. The plausibility and the accuracy of this representation-relation in any particular case remain relative to the interpretative decision to take the diagram or model as representative in this manner. Such a decision is always, in a general sense, abductive in character. This is true both of the technical and mathematically rigorous models and diagrams employed in the natural sciences and of those sketched informally in notebooks or on napkins, or traced gesturally with fingers in the air. In this respect, investigating and classifying the various modes of model-based and diagrammatic abduction is concerned in its own way with the more general problem of “classifying abduction in science and classifying abduction in broader contexts” (Park, 2015, p. 236).

The following chapters comprising the Handbook section “Diagrams, Visual Models, and Abduction” approach theoretical and practical concerns with respect to this important area of research in abductive cognition from a variety of points of view. One common theme is the work of American philosopher Charles S. Peirce. Peircean semiotics provides a theoretical framework for the chapters in this section authored by Stjernfelt, Pietarinen, Semetsky, and Caterina, Gangle, and Tohmé. Additionally, the specific diagrammatic logical notation developed by Peirce that he called Existential Graphs is the subject of detailed analysis in the chapter by Oostra.

Another theoretical framework found in multiple chapters in this section is the mathematics of Category Theory. On the one hand, Category Theory itself uses diagrammatic notation extensively to express mathematical structures and concepts.
On the other hand, the particular emphasis of Category Theory on structure-preserving maps is especially useful as a tool for representing mathematically tractable features of diagrams and visual models of various types. In light of this twofold diagrammatic character of Category Theory, the contribution by Ochs examines the abductive character of its standard diagrammatic notation of arrows and objects, while the chapter by Caterina, Gangle, and Tohmé applies Category Theory to diagrammatic reasoning in general and, in particular, to mathematically rigorous modeling practices in science.

By way of a general introduction, many of the core themes to be found in the chapters of the present section are already notable in one of the earliest instances of diagrammatic thinking in the Western tradition, namely, the geometrical investigation undertaken in Plato’s dialogue Meno. In an important theoretical detour of that dialogue, Socrates and a household slave of the Thessalian aristocrat Meno together use a series of geometrical diagrams to articulate and test out several hypotheses before hitting upon the correct solution to their chosen problem: how to construct a new square twice the area of some given square. In the series of diagrammatic conjectures and evaluations reproduced here as Figs. 1 through 3, the pattern of abductive reasoning with models and diagrams displays some of its most essential facets. In particular, these diagrams show the abductive character of the diagrammatic reasoning that underlies both “corollarial” and “theorematic” forms of deductive reasoning as defined by Peirce (1992, 1998, vol. 2, p. 298):

A Corollarial Deduction is one which represents the conditions of the conclusion in a diagram and finds from the observation of this diagram, as it is, the truth of its conclusion. A Theorematic Deduction is one which, having represented the conditions of the conclusion in a diagram, performs an ingenious experiment upon the diagram, and by the observation of the diagram so modified, ascertains the truth of the conclusion.

Fig. 1 Sides in the ratio 1:2
Fig. 2 Sides in the ratio 2:3
Fig. 3 Sides in the ratio 1:√2
In other words, when reasoning with diagrams (and more generally any type of theoretical model), abductive reasoning allows for conclusions to be drawn both immediately via direct inspection and mediately via processes of constructive experimentation. Figure 1 represents the slave’s initial hypothesis for constructing a square twice the area of a given square. Meno’s slave suggests abductively that a square built on twice the length of one of the original square’s sides will have twice its area. The hypothesis is performed by constructing the second square. The constructed diagram serves as both an expression of the abductive hypothesis (a square with sides twice the length of a given square will have twice the area) and a direct exhibition of the falseness of that hypothesis. It is in this sense that reasoning with visual models and diagrams at once facilitates and constrains abductive inferences. Hypotheses and conjectures may be constructed and experimented with directly on the model or diagram by modifying it – for example, by adding to the diagram, deforming it, or deleting some of its components. Additionally, the results of conjectures may be perceived directly or rationally evaluated by way of further acts of abductive cognition. Figure 2 represents the slave’s second, revised hypothesis. Perhaps, the slave conjectures, the correct ratio of the squares’ sides is 2:3. Again, such an abduction is diagrammatically constructed and then (through further abductive reasoning with the subdivided squares) falsified. Finally, by following the prodding of Socrates, Meno’s slave tries abductively to construct a square along the diagonal of the initial square, as shown in Fig. 3. With this diagram, the correct solution to the initial geometrical problem has been made visually evident. A square constructed on the diagonal of a given square will indeed have twice the original square’s area. 
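The three conjectures can also be checked by a short computation. The following is a minimal worked sketch, assuming the given square has side s and diagonal d (both symbols are introduced here for illustration and do not appear in the original discussion); the last line uses the Pythagorean theorem.

```latex
\begin{align*}
\text{Fig. 1 (sides } 1:2\text{):}\quad & (2s)^2 = 4s^2 \neq 2s^2,\\
\text{Fig. 2 (sides } 2:3\text{):}\quad & \left(\tfrac{3}{2}s\right)^2 = \tfrac{9}{4}s^2 \neq 2s^2,\\
\text{Fig. 3 (sides } 1:\sqrt{2}\text{):}\quad & d^2 = s^2 + s^2 = 2s^2 .
\end{align*}
```

The first two hypotheses overshoot the target area of 2s², while the square built on the diagonal attains it exactly, in agreement with what the diagrams exhibit.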
Importantly, the abductive conjecture itself is facilitated by experimenting with specific features of the concrete diagram. After evaluating the conjecture by examining its diagrammatic effects, Meno’s slave has thereby discovered a general theoretical result. In the dynamic interplay between (1) inspection of a diagram or model, (2) experimental modification or supplementation of the diagram or model, and (3) reflection upon the new features, structures and relations that the result of the experimentation exhibits, the conditions, stages, and consequences of abductive reasoning are thus present in a distinctive manner. While the variety of different types of diagrams and visual models in various theoretical domains and practical contexts is virtually unlimited, it is possible for certain features common to all such diagrams and models to be isolated and examined. The study of the relation between models and diagrams and abductive reasoning is neither completely arbitrary nor based on any single rigid framework. The contributions in this section illustrate the flexible nature of such investigation by recasting it in both formal and informal frameworks. The approaches represented here indeed highlight the epistemological richness of even the most simple diagrams which, when approached with an abductive intention, become a fertile ground for the development of new ideas and theory-building processes.
The section opens with a contribution from Frederik Stjernfelt, whose Chap. 34, “Abduction in Diagrammatical Reasoning”, examines the structure of abductive reasoning with respect to diagrams and visual models from a Peircean perspective. As in several of the other chapters in the present section, Peirce’s general semiotic theory and his specific analyses of diagrammatic reasoning provide useful tools for understanding abductive inference with respect to visual models and diagrams in a contemporary context.

The subsequent chapter from Ahti-Veikko Pietarinen, Chap. 35, “Peirce’s Diagrammatic Reasoning and Abduction”, delves into Peirce’s claim that diagrams and diagrammatic reasoning underlie virtually all the creative processes behind the formation of new mathematical and scientific ideas. Along with a rigorous, yet lively and engaging, analysis of Peirce’s texts supporting such claims, Pietarinen elaborates on the effects of a Peircean perspective on two fields in particular, Cognitive Science and History of Mathematics, in which abductive reasoning is essential.

The third chapter, Chap. 31, “Existential Graphs as a Visual Tool of Abductive Cognition in Intuitionistic Logic and Various Sublogics”, by Arnold Oostra is a self-contained analysis of the extension of Peirce’s alpha level of Existential Graphs to intuitionistic logic. In Oostra’s work, Peirce’s diagrammatic logic becomes a rich framework in which abductive reasoning unfolds along precise formal lines. Intuitionistic logic is recast as a fully fledged diagrammatic system, where the main characteristics of certain fundamental logical substructures are naturally represented and conveyed to the reader in a visually perspicuous manner.

In Chap. 32, “Visual Semiotics, Abduction, and the Learning Paradox: The Role of Graphic Signs”, Inna Semetsky focuses on the semiotics of visual learning from a Peircean perspective. Her analysis emphasizes the intuitive, imaginative, and creative dimensions of visual signs, and she draws upon Michael Polanyi’s concept of tacit knowledge to trace the dynamics of knowledge acquisition by way of experimentation with images, symbols, and other culturally sedimented signs. The editors must sadly note that Professor Semetsky passed away soon after finishing her chapter contribution. In virtue of her many scholarly contributions to the fields of semiotics and education theory and her warm personality and indefatigable presence, the editors would like to dedicate the present section of the Handbook fondly to her memory.

Eduardo Ochs’s Chap. 33, “On the Missing Diagrams in Category Theory”, is a prime example of a working abduction in the field of diagrammatic notation. The issue at stake in his chapter is what is “missing” from the standard formal diagrams employed in the mathematical notation of Category Theory. More precisely, Ochs points out that many of the conventions adopted to interpret and explain categorical diagrams in typical mathematical texts are neither natural nor intuitive for many researchers who may not be specialists in the area. Reasoning abductively, experimenting with and manipulating new formal conventions for understanding such diagrams, Ochs proposes a series of new diagrammatic protocols that promise to help resolve some of the most stubborn ambiguities in the interpretation of this mathematical notation.
The final chapter in the section, Chap. 36, “Abduction in Diagrammatic Reasoning: A Categorical Approach”, by Gianluca Caterina, Rocco Gangle, and Fernando Tohmé aims to formulate the core features of abductive reasoning in the context of diagrams and visual models from two complementary points of view, Peircean semiotics and Category Theory. By coordinating these two theoretical perspectives, the authors show how abductive inferences are both constitutive of and facilitated by various levels of diagrammatic and model-based reasoning and in particular how these features of diagrammatic abduction are characteristic of the use of mathematical models in the natural sciences.
References
Park, W. (2015). On classifying abduction. Journal of Applied Logic, 13, 215–238.
Peirce, C. S. (The Peirce Edition Project, ed.) (1992, 1998). The Essential Peirce (2 Vols.). Indiana University Press.
Existential Graphs as a Visual Tool of Abductive Cognition in Intuitionistic Logic and Various Sublogics
31
Arnold Oostra
Contents
Standard Existential Graphs
Intuitionistic Logic
Existential Graphs for Intuitionistic Logic
Sublogics of Intuitionistic Logic
Existential Graphs for Intuitionistic Sublogics
Conclusions
References
A. Oostra
Universidad del Tolima, Ibagué, Colombia
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_39

Abstract
Visual representations of mathematical proofs can help in their comprehension, communication, and construction; in particular, they can highlight the abductive reasoning that so frequently emerges in this context. Rather than a graphical representation of the structure of a given argument, the quest is for truly graphical formal systems. Among the first proposals that match this description are the Existential Graphs introduced by Charles S. Peirce. Created as an analytic tool, this system has as one of its salient features a very detailed outline of mathematical proofs, which makes logical abduction clearly visible. Existential Graphs were designed originally for classical first-order logic, the only precisely determined logic in Peirce’s time, but the twentieth century saw the emergence of many nonclassical logics. Thanks to some distinctly abductive steps, Existential Graphs have been successfully applied to various modal logics and to intuitionistic logic. This paper highlights abduction in the system of Existential Graphs for intuitionistic
logic and in a profusion of its sublogics. The first section is a presentation of Peirce’s original Existential Graphs, with various emphases on abduction. This is followed by a swift introduction to intuitionistic logic and the system of Existential Graphs adapted to it, obtained from very precise abductions. The mutual independence of the intuitionistic connectives yields a great variety of sublogics given by combinations of some of these connectives, and suitable abductions give rise to systems of Existential Graphs for many such sublogics. Thus, Existential Graphs become a device for the visual study of abductive cognition in various realms of mathematical logic.

Keywords
Abduction · Existential graphs · Intuitionistic logic · Visual representation of logic
Standard Existential Graphs
Existential Graphs were invented by Charles S. Peirce at the end of the nineteenth century as a visual representation of logic (Peirce, 1906, 1958, 2021). In this system, the logical formulas are spread out as two-dimensional diagrams. A small number of graphical inference rules complete a full-fledged deductive system that allows logical proofs. In this scenario, a finite set of premises can be translated into a diagram, on which the allowed rules are performed to obtain a graphical conclusion, which in turn can be reread in any formal or linguistic context. Besides all the advantages that a visual presentation of logic can provide, abduction as a search for hypotheses arises in several quite clear ways in the realm of Existential Graphs.

The ideas underlying Existential Graphs are very simple. Written sentences are asserted, and to deny a sentence or a combination of sentences, it suffices to enclose it. If it is necessary to analyze the sentence internally, the subjects are drawn as lines, and the predicates are written at their ends. Thus, it is also possible to negate single predicates or their combinations by enclosing them. Other modalities of the sentences or the predicates are attained with different enclosures. In this graphical context, the inference rules are a code of permissions on the erasure or drawing of components on a given graph. For example, enclosures may appear around others. A double enclosure without elements between the two curves would mean a double negation, so in a classical interpretation, such a feature might be freely drawn or erased. Moreover, every component is either oddly or evenly enclosed, and this parity can determine whether it may be removed or added. A detailed description of these graphs and their rules of transformation follows below.

Peirce classified his Existential Graphs in three main systems: Alpha, which corresponds to classical propositional logic; Beta, which is a completely graphical version of first-order logic; and Gamma, which includes modal and second-order logics (Roberts, 1973; Zalamea, 2012). Many years after Peirce’s death, Jay Zeman established a precise correspondence between certain systems of Gamma graphs and three known modal logics (Zeman, 1964).
In more detail, the components of Alpha graphs are the sheet of assertion, an unlimited flat surface without border upon which all graphs are drawn; propositional letters; and cuts, which are simple, closed curves on the sheet, usually ovals. Different letters and graphs can be drawn beside each other, and a cut can be drawn around a letter or graph. Hence, an Alpha graph is a diagram composed of a finite combination of letters and cuts, drawn upon the sheet of assertion. There may be repeated letters, but they all occupy different places. The cuts do not touch the letters nor do they touch each other. Two graphs that can be continuously deformed into each other are equal.

The basic logical interpretation of the Alpha graphs is that the sheet of assertion is the universe of possibilities of truth, and drawing a graph on the sheet means asserting its interpretation. Therefore, writing a letter means asserting the proposition it represents, and drawing two graphs means asserting both. On the other hand, enclosing a graph with a cut means denying it. From these basic conventions, the graphs of the propositional connectives shown in Fig. 1 follow immediately. From the graphs of the connectives, an Alpha graph can in turn be recursively constructed for any propositional logical formula.

An area is defined to be a region of the sheet of assertion limited by cuts, and an area is odd or even if there is an odd or even number of cuts around it. In addition, a double cut is made up of two cuts, one inside the other and without letters or cuts in the area between them. These are the Alpha rules of transformation:
1. Erasure: In an even area, any graph may be erased.
2. Insertion: In an odd area, any graph may be scribed.
3. Iteration: Any graph may be iterated in its own area, or in any area contained in it, that is not part of the graph to be repeated.
4. Deiteration: Any graph may be erased if a copy of it persists in the same area or in any area around it.
5. Double cut: A double cut may be drawn around or removed from any graph on any area.

Figure 2 shows a proof of two traditional rules of inference in a completely graphical way. In each step, the number above the arrow ⇒ indicates the Alpha rule of transformation used.
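The basic reading of Alpha graphs described above (juxtaposition asserts every item, a cut denies its content) can be illustrated with a small sketch. The encoding below is an assumption made for illustration only: the nested-tuple representation and the names `cut`, `implies`, `either`, `holds`, `letters`, and `entails` are hypothetical and come neither from Peirce nor from this chapter. The sketch checks classical truth-table validity; it does not implement the five rules of transformation.

```python
from itertools import product

# Hypothetical encoding (not from the chapter): an area is a tuple of items;
# an item is a propositional letter (a str) or a cut, stored as ("cut", area).

def cut(*items):
    """Enclose the given items in a cut."""
    return ("cut", tuple(items))

def implies(p, q):
    """Graph of p -> q: p together with a cut around q, all inside one cut."""
    return cut(p, cut(q))

def either(p, q):
    """Graph of p v q: a cut around the cuts of p and of q."""
    return cut(cut(p), cut(q))

def holds(area, valuation):
    """Classical reading: an area asserts every item; a cut denies its content."""
    for item in area:
        if isinstance(item, str):
            if not valuation[item]:
                return False
        elif holds(item[1], valuation):
            return False
    return True

def letters(area):
    """Collect the propositional letters occurring anywhere in the area."""
    found = set()
    for item in area:
        found |= {item} if isinstance(item, str) else letters(item[1])
    return found

def entails(premises, conclusion):
    """True if every valuation satisfying the premise sheet satisfies the conclusion sheet."""
    names = sorted(letters(premises) | letters(conclusion))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if holds(premises, valuation) and not holds(conclusion, valuation):
            return False
    return True

EMPTY_SHEET = ()
modus_ponens = entails(("A", implies("A", "B")), ("B",))
modus_tollens = entails((implies("A", "B"), cut("B")), (cut("A"),))
excluded_middle = entails(EMPTY_SHEET, (either("A", cut("A")),))
peirce_law = entails(EMPTY_SHEET, (implies(implies(implies("A", "B"), "A"), "A"),))
print(modus_ponens, modus_tollens, excluded_middle, peirce_law)
```

Because the check is semantic, it only confirms that a graphical proof should exist under the classical interpretation; finding the actual sequence of erasures, insertions, iterations, deiterations, and double cuts remains the abductive task discussed in the text.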
Fig. 1 Alpha graphs for propositional connectives: negation (¬A), conjunction (A ∧ B), disjunction (A ∨ B), and implication (A → B)
Fig. 2 Proof of modus ponens (top) and modus tollendo tollens (bottom) by means of Existential Graphs Alpha
Fig. 3 Proof of the law of excluded middle by means of Existential Graphs Alpha
Fig. 4 Reverse proof of Peirce’s law by means of Existential Graphs Alpha
In the same way, tautologies can be proved graphically. In this case, having no premises, the proof starts from the empty sheet of assertion. Figure 3 contains a proof of the law of excluded middle: A ∨ ¬A. Any graphical proof starting from the empty sheet begins with the drawing of an empty double cut.

Although abductive reasoning is always present in mathematical thinking and is therefore reflected by graphical proofs, it becomes completely obvious in this last setting of tautology proving. The way to construct these proofs almost always starts by drawing the conclusion. Then, the graphs or hypotheses that might lead to this graph are looked for, in an iterated breakdown to a simple graph that can be easily obtained. For example, Fig. 4 starts with the graph of Peirce’s law: ((A → B) → A) → A. The graph of ¬B is in an odd area, so drawing it is allowed, and a useful working hypothesis is the graph without it. Next, the two inner copies of ¬A can be iterated, and the graph without these subgraphs is a hypothesis from which the conclusion graph is directly derived. Now the outer copy of ¬A is in an odd area, and there is permission to draw it. The working hypothesis is reduced to an empty double cut, which can be obtained at once. Thus, a proof naturally arises just by reversing all these abductive steps.

Technically, the system of Existential Graphs Alpha is equivalent to classical propositional calculus (Roberts, 1973; Fuentes, 2014). The system of Existential Graphs Beta, in turn, is equivalent to first-order logic with equality. To obtain Beta graphs, thick, possibly branched lines, called lines of identity, are added to the Alpha system. These lines represent subjects, elements, or individuals. Furthermore, predicate letters with a positive number of ordered lines attached are considered, representing relatives or predicates. So, a Beta graph is a diagram composed of a finite combination of lines, letters, and cuts drawn on the sheet of
assertion. The corresponding number of lines is attached to each letter. If two lines touch, they are identified, but two lines may also cross without touching each other. The cuts do not touch the letters nor do they touch each other, but a line may cross a cut. Two graphs that can be continuously deformed into each other are equal.

The interpretation of the Beta graphs follows the same clauses given for the Alpha graphs. Additionally, drawing a line on the sheet means asserting the existence of an individual, writing a letter with lines attached to it means that the predicate it represents holds for the individuals involved, and joining two lines of identity means identifying the individuals they represent. From these conventions, the graphs of the quantifiers shown in Fig. 5 follow.

The parity of the areas does not change with the lines of identity. The rules of transformation that complete the system are the same Alpha rules, only extended with the following adaptations to the line of identity:
1. Erasure: In an even area, any line of identity may be cut.
2. Insertion: In an odd area, two lines of identity may be joined.
3. Iteration: Any loose end of a line may be extended inward through cuts; when there are lines of identity involved in the graph to be iterated, they must correspond exactly to those of the original graph.
4. Deiteration: Any loose end of a line may be retracted from the outside through cuts; when there are lines of identity involved in the graph to be deiterated, they must correspond exactly to those of the outside copy of the graph.
5. Double cut: The application of this rule is not prevented by the presence of lines that cross both cuts.

Figure 6 shows a graphical proof of syllogism Darii: All M are P, and some S are M, thus some S are P. In fact, the Existential Graphs Beta constitute a diagrammatic version of first-order logic (Roberts, 1973; Pietarinen, 2015).
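For comparison with the graphical proof, the syllogism Darii can also be written out symbolically. The derivation below is a sketch in ordinary first-order notation; the witness name c and the step numbering are introduced here for illustration and are not part of the chapter’s graphical system.

```latex
\begin{align*}
1.\;& \forall x\,(M(x) \to P(x)) && \text{premise: all } M \text{ are } P\\
2.\;& \exists x\,(S(x) \wedge M(x)) && \text{premise: some } S \text{ are } M\\
3.\;& S(c) \wedge M(c) && \text{from 2, for a fresh witness } c\\
4.\;& M(c) \to P(c) && \text{from 1, instantiated at } c\\
5.\;& P(c) && \text{modus ponens, 3 and 4}\\
6.\;& S(c) \wedge P(c) && \text{from 3 and 5}\\
7.\;& \exists x\,(S(x) \wedge P(x)) && \text{from 6: some } S \text{ are } P
\end{align*}
```

In the Beta graphs the witness c needs no name: it is carried by a line of identity, so the instantiation steps 3 and 4 correspond to manipulations of lines rather than to the introduction of a new symbol.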
Unlike Alpha and Beta, Existential Graphs Gamma are more than a single precise system: they comprise a whole environment for the exploration of new signs. One of Peirce’s ideas is that of the broken cut, a cut drawn with a dashed or discontinuous curve. One feasible interpretation of such a cut is the possible negation of its interior, leading to a representation of alethic modal logic. The adequate adaptation of the five rules of transformation to the broken cuts leads to systems of Existential Graphs Gamma for the well-known modal logics S4, S4.2, and S5 (Zeman, 1964; Chagrov & Zakharyaschev, 1997).

Fig. 5 Beta graphs for quantifiers: there exists P, ∃xP(x); there exists not P, ∃x¬P(x); all is P, ∀xP(x); all is not P, ∀x¬P(x)
Fig. 6 Proof of syllogism Darii by means of Existential Graphs Beta
Intuitionistic Logic
Intuitionistic logic is a formal system of symbolic logic developed inside mainstream mathematics that in some way reflects the constructive ideas of intuitionism. Intuitionism, in turn, is a doctrine on the foundations of mathematics according to which this science is the result of the free and constructive mental activity of the person who investigates. The founder of intuitionism was the Dutch mathematician and philosopher L. E. J. Brouwer (1881–1966).

In an intuitionist approach, a mathematical proposition becomes true when the subject experiences or intuits its truth, after having made an adequate mental construction. This idea of truth in turn leads to an intuitionistic interpretation of the logical connectives. For example, the negation of a sentence means that something absurd may be constructed from it, and the disjunction of two sentences is true if an effective construction of one of them is possible. The general ideas of intuitionism lead to logical and mathematical results that differ from the usual ones. For example, the intuitionistic concept of negation forces one to reject the principle of double negation and the principle of excluded middle. Beginning in 1912, Brouwer dedicated his life to developing a comprehensive revision of mathematics under intuitionistic principles. The result of his effort and that of his successors is a mathematical edifice completely different from the usual one. While the two theories share many results, each also contains conclusions that are not valid in the other (Heyting, 1971; Troelstra & van Dalen, 1988).

These general principles of intuitionism also led, within the realm of usual mathematics, to the precise formal system known as intuitionistic logic. One of the pioneers of this development was Arend Heyting, a disciple of Brouwer, who in 1930 published The Formal Rules of Intuitionistic Logic. This paper contains a Hilbert-style axiomatization of this logic, which at the propositional level differs from classical propositional calculus only in the axioms of negation. Subsequently, intuitionistic logic was further developed by Gerhard Gentzen and Kurt Gödel. Also, like classical logic, intuitionistic logic has a wide range of semantic models.
A certain class of partially ordered sets, known as Heyting algebras, constitutes the algebraic counterpart of intuitionistic propositional calculus. The most important instance of a Heyting algebra is the family of open sets of a topological space, ordered by inclusion. On the other hand, Saul Kripke discovered that a certain variant of his models, proposed first as semantics for modal logics, also provided semantics for Heyting’s intuitionistic logic. Later, in an unexpected turn of events, intuitionistic logic appeared naturally in the context of sheaves and category theory (Goldblatt, 1984).

The fundamental connectives for intuitionistic propositional calculus are implication →, conjunction ∧, disjunction ∨, and absurdity ⊥. Negation is defined as ¬α = α → ⊥. The following formulas are taken as axioms.
1. α → (β → α)
2. (α → (β → γ)) → ((α → β) → (α → γ))
3. ⊥ → α
4. (α ∧ β) → α
5. (α ∧ β) → β
6. (γ → α) → ((γ → β) → (γ → (α ∧ β)))
7. α → (α ∨ β)
8. β → (α ∨ β)
9. (α → γ) → ((β → γ) → ((α ∨ β) → γ))
Axiom 3 stands for the principle of explosion: anything follows from falsehood. The other axioms are exactly the same as for classical propositional calculus. The only inference rule is modus ponens, and the entailment relation Σ ⊢ ϕ is defined as usual. A formula τ entailed by the empty set, that is, ⊢ τ, is called a theorem. The following results may be proved exactly as in classical logic:
• α → β, β → γ ⊢ α → γ;
• ⊢ α → α;
• α, β ⊢ α ∧ β.
Some remarkable deductions follow at once from the intuitionistic negation.
• ¬α ⊢ α → β;
• α → β, ¬β ⊢ ¬α (modus tollendo tollens);
• α ∨ β, ¬β ⊢ α (modus tollendo ponens).
For instance, ¬β is β → ⊥, and from α → β and β → ⊥ follows α → ⊥, that is, ¬α. In intuitionistic propositional calculus, the deduction theorem is also valid:
Σ ⊢ α → β if and only if Σ, α ⊢ β.
Thus, a weak form of reductio ad absurdum is obtained:
A. Oostra
Σ ⊢ ¬α if and only if Σ, α ⊢ ⊥.
From here follow other features of intuitionistic negation:
• α ⊢ ¬¬α;
• ¬¬¬α ⊢ ¬α, and ¬α ⊢ ¬¬¬α;
• ¬α ∨ β ⊢ α → β;
• α → β ⊢ ¬(α ∧ ¬β).
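The deductions listed above can also be checked mechanically. The following Lean 4 sketch is our own illustration, not part of the cited sources: it reads ¬α as α → False and proves each item by a plain lambda term, so no classical axiom is involved. The third proof uses `absurd`, which is where the principle of explosion (Axiom 3) enters.

```lean
-- α ⊢ ¬¬α
example (α : Prop) : α → ¬¬α :=
  fun ha hna => hna ha

-- ¬¬¬α ⊢ ¬α
example (α : Prop) : ¬¬¬α → ¬α :=
  fun h ha => h (fun hna => hna ha)

-- ¬α ⊢ ¬¬¬α
example (α : Prop) : ¬α → ¬¬¬α :=
  fun hna hnna => hnna hna

-- ¬α ∨ β ⊢ α → β (the left case needs explosion)
example (α β : Prop) : (¬α ∨ β) → (α → β) :=
  fun h ha => h.elim (fun hna => absurd ha hna) (fun hb => hb)

-- α → β ⊢ ¬(α ∧ ¬β)
example (α β : Prop) : (α → β) → ¬(α ∧ ¬β) :=
  fun h hand => hand.2 (h hand.1)
```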
Since it is a model of intuitionistic reasoning, intuitionistic propositional calculus has no double negation, no excluded middle, and no full reductio ad absurdum. Moreover, although only the axiom of negation was altered, this affects the other connectives, since the classical relationships between them no longer hold. The following deductions are not possible in this system:
• ⊢ α ∨ ¬α;
• ¬¬α ⊢ α;
• ¬α → β ⊢ α ∨ β;
• α → β ⊢ ¬α ∨ β;
• ¬(α ∧ ¬β) ⊢ α → β.
To make sure that a certain deduction cannot be proved in intuitionistic propositional calculus, or in any formal system, a semantics is required for this logic. A Heyting algebra can be regarded as a special type of lattice. A lattice is a partially ordered set in which every pair of elements a, b has a greatest lower bound, denoted a ∧ b, and also a least upper bound, denoted a ∨ b. Precisely, a Heyting algebra is a lattice H with minimum element 0 ∈ H and a binary operation → which for any a, b, x ∈ H satisfies:
x ≤ a → b if and only if x ∧ a ≤ b.
Since x ∧ a ≤ a for all elements, the definition implies x ≤ a → a; hence every Heyting algebra has a maximum element 1. Moreover, in every Heyting algebra, the relation a ≤ b holds if and only if a → b = 1. The pseudo-complement ¬a of any element a is defined by: ¬a = a → 0. From this, it follows that a ≤ ¬¬a for every element. Equality holds only for some special elements, for example, ¬¬¬a = ¬a. Many other interesting inequalities are true in Heyting algebras, like ¬a ∨ b ≤ a → b, a ∨ b ≤ ¬a → b, and a → b ≤ ¬(a ∧ ¬b). Some examples of Heyting algebras follow. Any totally ordered set with maximum 1 and minimum 0 is a Heyting algebra, defining a → b = 1 if a ≤ b, and a → b = b otherwise (for a > b, the condition x ∧ a ≤ b holds exactly when x ≤ b). Therefore, in this case,
¬a = a → 0 = 1 if a = 0, and ¬a = 0 otherwise. Hence, for any element a with 0 < a < 1, it follows that a ∨ ¬a = a ≠ 1, and a < ¬¬a. Moreover, in this case, if 0 < a, b < 1, then a ∨ b < ¬a → b. If 0 < a < b < 1, then ¬a ∨ b < a → b, and if 0 < b < a < 1, then a → b < ¬(a ∧ ¬b).
A Boolean algebra is a lattice with maximum 1 and minimum 0 in which each of the two binary operations distributes over the other and in which every element a has a unique complement, denoted a′. Every Boolean algebra is a Heyting algebra, defining a → b = a′ ∨ b. The basic example of a Boolean algebra is the set of subsets of a fixed universe, ordered by inclusion. In any Boolean algebra, it is true that ¬a = a → 0 = a′ ∨ 0 = a′, and all the following identities hold: a ∨ a′ = 1, a′′ = a, a′ → b = a ∨ b, a → b = a′ ∨ b = (a ∧ b′)′, which correspond to the laws of classical propositional logic.
The basic example of a Heyting algebra is the set of open sets of a fixed topological space, ordered by inclusion. If V and W are open sets, the open set V → W is defined as V → W = ext(V \ W), where ext S denotes the exterior of the set S. Hence, the pseudo-complement of any open set V is ¬V = V → ∅ = ext(V \ ∅) = ext V. Here it is clear that, in general, V ∨ ¬V = V ∪ ext V is not the whole space. In specific topological spaces, it is possible to find suitable open sets such that ¬¬V ≠ V, V ∨ W ≠ ¬V → W, ¬V ∨ W ≠ V → W, and V → W ≠ ¬(V ∧ ¬W).
In order to establish the Heyting algebras as a semantics for intuitionistic propositional calculus, functions or valuations v : L → H from the set L of propositional letters to any Heyting algebra H are considered. A valuation v extends uniquely to a function, again denoted v, from the set of all formulas to H. Now an intuitionistic formula ϕ is a consequence of a set of formulas Σ, symbolized Σ ⊨ ϕ, if for any valuation v such that v(σ) = 1 for each σ ∈ Σ, also v(ϕ) = 1 holds. For example, α → β, α ⊨ β. A formula τ is valid if v(τ) = 1 for any valuation v.
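As a concrete check of the totally ordered case, the following Python sketch models the three-element chain 0 < 1/2 < 1. It is only an illustration under our own naming (`meet`, `join`, `imp`, `neg` are not taken from the text), and it uses the implication forced on a chain by the defining condition x ≤ a → b if and only if x ∧ a ≤ b, namely 1 when a ≤ b and b otherwise.

```python
from fractions import Fraction

# Three-element chain 0 < 1/2 < 1, viewed as a Heyting algebra.
ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)
CHAIN = [ZERO, HALF, ONE]


def meet(a, b):
    return min(a, b)


def join(a, b):
    return max(a, b)


def imp(a, b):
    # Largest x with meet(x, a) <= b: ONE if a <= b, otherwise b.
    return ONE if a <= b else b


def neg(a):
    # Pseudo-complement: a -> 0.
    return imp(a, ZERO)


def satisfies_adjunction():
    # Checks x <= (a -> b) iff (x meet a) <= b for all elements of the chain.
    return all((x <= imp(a, b)) == (meet(x, a) <= b)
               for x in CHAIN for a in CHAIN for b in CHAIN)


# Values of p ∨ ¬p and ¬¬p → p when p is given the middle value 1/2.
excluded_middle = join(HALF, neg(HALF))
double_negation_elim = imp(neg(neg(HALF)), HALF)
```

With p valued at 1/2, both `excluded_middle` and `double_negation_elim` take the value 1/2 rather than 1, so neither formula is valid in this algebra.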
The following results summarize the resemblance between the entailment and consequence relations, hence between the syntax and semantics of intuitionistic propositional logic. The soundness theorem establishes that for any set Σ of formulas:
if Σ ⊢ ϕ then Σ ⊨ ϕ.
This is so because all the axioms are valid, and modus ponens preserves validity. This result yields a way to prove that certain deductions are not possible in intuitionistic propositional calculus. Thus, the instances mentioned in totally ordered Heyting algebras, and suggested in topological spaces, show that ⊬ p ∨ ¬p, ¬¬p ⊬ p, ¬p → q ⊬ p ∨ q, p → q ⊬ ¬p ∨ q, and ¬(p ∧ ¬q) ⊬ p → q. On the other hand, the completeness theorem ensures that for any formula ϕ:
if ⊨ ϕ then ⊢ ϕ.
Thus, the theorems of intuitionistic propositional calculus are exactly the formulas that are valid in Heyting algebras.
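The soundness argument can be illustrated by brute force on a small topological example. The Python sketch below assumes a particular five-element topology on {0, 1, 2} and a tuple encoding of formulas, both chosen by us for illustration; it evaluates formulas in the Heyting algebra of open sets, with V → W computed as the interior of the complement of V \ W, and tests validity over all valuations.

```python
from itertools import product

X = frozenset({0, 1, 2})
# Open sets of an assumed small topology on X, ordered by inclusion.
OPENS = [frozenset(s) for s in ([], [0], [2], [0, 2], [0, 1, 2])]


def interior(s):
    # Union of all open sets contained in s.
    result = frozenset()
    for u in OPENS:
        if u <= s:
            result |= u
    return result


def imp(v, w):
    # V -> W = ext(V \ W), the interior of the complement of V \ W.
    return interior(X - (v - w))


def Imp(a, b): return ("imp", a, b)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)


def evaluate(formula, valuation):
    op = formula[0]
    if op == "var":
        return valuation[formula[1]]
    if op == "bot":
        return frozenset()
    left = evaluate(formula[1], valuation)
    right = evaluate(formula[2], valuation)
    if op == "and":
        return left & right
    if op == "or":
        return left | right
    if op == "imp":
        return imp(left, right)
    raise ValueError(op)


def variables(formula):
    if formula[0] == "var":
        return {formula[1]}
    if formula[0] == "bot":
        return set()
    return variables(formula[1]) | variables(formula[2])


def is_valid(formula):
    # Valid means: value X (the top element) under every valuation.
    names = sorted(variables(formula))
    return all(evaluate(formula, dict(zip(names, values))) == X
               for values in product(OPENS, repeat=len(names)))


A, B, C, BOT = ("var", "a"), ("var", "b"), ("var", "c"), ("bot",)
AXIOMS = [
    Imp(A, Imp(B, A)),
    Imp(Imp(A, Imp(B, C)), Imp(Imp(A, B), Imp(A, C))),
    Imp(BOT, A),
    Imp(And(A, B), A),
    Imp(And(A, B), B),
    Imp(Imp(C, A), Imp(Imp(C, B), Imp(C, And(A, B)))),
    Imp(A, Or(A, B)),
    Imp(B, Or(A, B)),
    Imp(Imp(A, C), Imp(Imp(B, C), Imp(Or(A, B), C))),
]
EXCLUDED_MIDDLE = Or(A, Imp(A, BOT))
```

All nine axioms come out valid in this algebra, while excluded middle does not: for V = {0} the exterior is {2}, and V ∪ ext V = {0, 2} is not the whole space.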
Existential Graphs for Intuitionistic Logic
The system of Existential Graphs for intuitionistic logic is the result of different abductions at various levels. In the first place, the possibility of such a graphical system, originally suggested by Fernando Zalamea (1997), obeys a quite natural abduction. It is very evident that the Existential Graphs constitute a topological system, based on the notions of interior and exterior in the plane. In turn, intuitionistic logic has deep connections with topology. On the one hand, this holds through the algebraic semantics of intuitionistic logic, since the basic example of a Heyting algebra is the family of open sets of a topological space. On the other hand, this association follows from the natural topological interpretation of intuitionistic logic, already pointed out by Tarski (1938). Since Existential Graphs and intuitionistic logic have so much common ground, a hypothesis that naturally arises is the existence of a system of Existential Graphs whose logic is intuitionistic and which has the usual system of Existential Graphs for classical logic as its particular limiting case.
Secondly, the signs that make up the system of Existential Graphs for intuitionistic logic result from an abductive process (Oostra, 2010). In intuitionistic logic, propositional connectives cannot be defined in terms of each other, as in classical logic. Thus, regardless of the rules of transformation, any diagrammatic system for intuitionistic logic requires different signs for all its basic connectives. Although it is feasible to maintain the representation of conjunction and negation, it is imperative to introduce new signs for implication and disjunction. As a first abduction, which turned out successful, it was proposed to join the inner cut to the outer cut at one point for the sign of implication, as shown in Fig. 7.
Fig. 7 A new Existential Graph for implication
Fig. 8 Two conventions for intuitionistic Existential Graphs Alpha
Fig. 9 Intuitionistic Alpha graphs for propositional connectives (conjunction A ∧ B, implication A → B, disjunction A ∨ B, absurd ⊥, negation ¬A)
Following Peirce, who occasionally used this diagram as a shorthand for implication, this graph is called a scroll, its outer curve cut, and its inner curve loop. The following abduction consisted of using a double scroll for disjunction; this is a sign composed of a cut to which two loops are joined inside. More precisely, the components of intuitionistic Existential Graphs Alpha are the sheet of assertion, propositional letters, cuts, scrolls, and double scrolls. An intuitionistic Alpha graph is a diagram composed of a finite combination of letters, cuts, and scrolls, drawn upon the sheet of assertion. There may be repeated letters, but they all occupy different places, the cuts and scrolls do not touch the letters nor do they touch each other, and two graphs that can be continuously deformed into each other are equal. At this point, two important conventions are adopted, illustrated in Fig. 8. Firstly: A simple cut enclosing a graph is an abbreviation of a
scroll whose loop contains only an empty cut and whose outer area contains only the graph enclosed by the cut. Secondly: A double scroll with graphs in its areas is an abbreviation of a single scroll whose outer area contains only the same graph of the outer area of the double scroll and whose loop contains only a double scroll with an empty outer area and the same loops as the original graph. Again, the basic logical interpretation of the intuitionistic Alpha graphs is that the sheet of assertion is the universe of possibilities of truth, and drawing a graph on the sheet means asserting its interpretation. Writing a letter means asserting the proposition it represents, and drawing two graphs means asserting both. On the other hand, drawing a scroll means asserting the implication whose antecedent is the graph in the outer area and whose consequent is the graph inside the loop. Hence, by the first of the former agreements, drawing a graph enclosed in a cut without loops means negating it. Finally, drawing a double scroll with an empty outer area means asserting the disjunction of the graphs enclosed in the two loops. Hence, drawing an arbitrary double scroll means asserting the implication whose antecedent is the graph in the outer area and whose consequent is the disjunction of the graphs inside the loops. Figure 9 shows the graphs for the basic intuitionistic connectives. As in the classical case, from the graphs of the connectives, an intuitionistic Alpha graph can be recursively constructed for any propositional logical formula. An area is defined to be a region of the sheet of assertion limited by curves, both cuts and loops. An area is odd or even if there is an odd or even number of curves around it, counting both cuts and loops. The rules of transformation for intuitionistic Alpha graphs are mostly the same as in the classical case, just conveniently adapted to loops. Since the principle of double negation does not hold in intuitionistic logic,
the only rule that needs to be changed substantially is the double cut. Thus, the rules of transformation are:
1. Erasure: In an even area, any graph may be erased. Any loop within an even area may be eliminated with its contents.
2. Insertion: In an odd area, any graph may be scribed. In an odd area limited externally by a cut, a loop containing any graph may be added to this cut.
3. Iteration: Any graph may be iterated in its own area, or in any area contained in it, which is not part of the graph to be repeated. Any loop may be iterated, with its contents, on its own cut.
4. Deiteration: Any graph may be erased if a copy of it persists in the same area or in any area around it. A loop with its contents may be erased if another loop with the same contents persists on its cut.
5. Scrolling: A scroll with empty outer area may be drawn around or removed from any graph on any area.
Figure 10 shows a diagrammatic proof of two basic results of intuitionistic propositional calculus. In intuitionistic Existential Graphs Alpha, any proof starting from the empty sheet begins with the drawing of an empty scroll. As mentioned in the previous section, in intuitionistic propositional calculus, deduction A → B ⊢ ¬(A ∧ ¬B) is valid, but not the other way around. In the context of intuitionistic Existential Graphs Alpha, this means that it is admissible to separate a loop of a scroll to form a simple cut, although it is generally not possible to join it back. Figure 11 shows the detailed proof of this detachment. From here follows immediately modus tollendo tollens in the intuitionistic graphical context, exactly as in Fig. 2. On the other hand, completely removing the letter A in Fig. 11, and adding a scrolling step, results in a proof of B ⊢ ¬¬B with intuitionistic Existential Graphs Alpha. Again, the deduction in the other direction is not valid in intuitionistic logic.
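The parity convention for areas, on which the rules of erasure and insertion depend, can be sketched with a small tree structure. The Python classes below are a hypothetical encoding of our own (a `Scroll` with no loops stands for a simple cut, with one loop for a scroll, with two loops for a double scroll); the depth of an area counts the curves around it, so the outer area of a cut adds one curve and the area inside a loop adds two.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Letter:
    name: str


@dataclass
class Scroll:
    # Graphs in the outer area of the cut, and one list of graphs per loop.
    outer: List["Graph"] = field(default_factory=list)
    loops: List[List["Graph"]] = field(default_factory=list)


Graph = Union[Letter, Scroll]


def letter_areas(graphs, depth=0):
    """Return (letter, depth) pairs; depth counts enclosing cuts and loops."""
    found = []
    for g in graphs:
        if isinstance(g, Letter):
            found.append((g.name, depth))
        else:
            # The outer area lies inside the cut only.
            found += letter_areas(g.outer, depth + 1)
            # A loop area lies inside both the cut and the loop.
            for loop in g.loops:
                found += letter_areas(loop, depth + 2)
    return found


def parity(depth):
    return "even" if depth % 2 == 0 else "odd"


# The two premises of transitivity on the sheet: scrolls for A → B and B → C.
sheet = [Scroll([Letter("A")], [[Letter("B")]]),
         Scroll([Letter("B")], [[Letter("C")]])]
areas = [(name, parity(d)) for name, d in letter_areas(sheet)]
```

For these premises the letter B occurs once in an even area (the loop of the first scroll) and once in an odd area (the outer area of the second), which is the situation that the iteration and deiteration steps exploit.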
Fig. 10 Proof of modus ponens (top) and theorem A → A (bottom) by means of intuitionistic Existential Graphs Alpha
Fig. 11 Diagrammatic proof of A → B ⊢ ¬(A ∧ ¬B) with intuitionistic Existential Graphs Alpha
Fig. 12 Proof of transitivity of implication with intuitionistic Existential Graphs Alpha
Fig. 13 A different proof of transitivity of implication with intuitionistic Existential Graphs Alpha
As in the classical case, the use of Existential Graphs for intuitionistic logic can highlight some abductions used in mathematical proofs. The graphical proof of the principle of transitivity of implication shown in Fig. 12 obeys the following abduction. Since B is to be removed, and there is a copy of this letter in an even area and another in an odd one, the entire graph that contains B in an odd area is
iterated beside the even copy, which is used to deiterate the inner one. In fact, the graph corresponding to the hypothesis of modus ponens arises in an even area, and it is possible to repeat the steps of the previous example to obtain conclusion C within that area. Figure 13 shows another proof of the same result as Fig. 12, but following a different abduction. Now the odd copy of the letter B is embedded into a graph, building inside exactly what is needed to deiterate it from the outside.
Fig. 14 Reverse proof of Axiom 1 in the system of intuitionistic Existential Graphs Alpha
Again, abductive reasoning is clearly perceptible in the graphical proof of theorems, that is, deductions without premises. First, the diagram that represents the conclusion is drawn, and then the graphs or hypotheses that could lead to this graph are looked for, in a breakdown to a simple graph. Figure 14 begins with the graph of Axiom 1, in which the letter B is in an odd area, so it is admissible to draw it. Hence, a useful working hypothesis is the graph without it. Next, the scroll with empty outer area can be removed, and the resulting graph corresponds, in fact, to a theorem already obtained graphically in Fig. 10. The inner copy of letter A could be the result of iteration, so a good hypothesis for this is the graph with only the outer copy, which is in an odd area and can be easily added. In this way, the running
hypothesis is reduced to an empty scroll, which can be drawn at once. The graphical proof of the theorem is achieved by reversing all these abductive steps. Since the formal proof of the equivalence between intuitionistic propositional calculus and the system of intuitionistic Existential Graphs Alpha has many technical details (Ortiz & Segura, 2018; Oostra, 2021), only a brief description is presented below. From the standard drawing of the connectives, a mapping g from the set of the intuitionistic formulas to the set of all intuitionistic Alpha graphs on the sheet of assertion is defined inductively. The proof that α ⊢ β implies that g(α) graphically entails g(β) follows from two facts. On the one hand, the translation of each axiom of intuitionistic propositional calculus can be deduced from the empty sheet of assertion, as was done with Axiom 1 in Fig. 14. On the other hand, modus ponens corresponds to a graphical entailment as shown in Fig. 10. In this way, each step of a deduction for α ⊢ β can be copied graphically, although shorter proofs may be achieved in many cases. For the other direction of the sought equivalence, a new formal system is introduced for intuitionistic propositional calculus. Its language has fewer symbols than the usual presentation: a scroll is represented as [A(B)], a double scroll as [(A)(B)], an empty cut as [] with a new constant, and an arbitrary cut as [A]. This defines a function s from the set of graphs to the set of strings. The algebraic rules of this new system are inspired by the rules of transformation for intuitionistic Existential Graphs Alpha and allow all these transformations to be performed on the resulting strings. Finally, a mapping f is defined that translates each string back into a formula of intuitionistic propositional calculus, where the proof of the algebraic rules is routine.
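The bracket notation just described can be made concrete with a short Python sketch. It is an illustration under our own assumptions: formulas are encoded as nested tuples, conjunction is written as juxtaposition (as in the standard drawing of the connectives), and the function `to_string` goes directly from a formula to a string, whereas the cited papers define s on graphs. The exact treatment of spacing and of the new constant in those papers may differ.

```python
def to_string(formula):
    """Translate a tuple-encoded formula into the bracket notation.

    Encoding (hypothetical): ("var", name), ("bot",), ("not", A),
    ("and", A, B), ("imp", A, B), ("or", A, B).
    """
    op = formula[0]
    if op == "var":
        return formula[1]
    if op == "bot":
        return "[]"                                  # empty cut
    if op == "not":
        return "[" + to_string(formula[1]) + "]"     # arbitrary cut [A]
    left = to_string(formula[1])
    right = to_string(formula[2])
    if op == "and":
        return left + " " + right                    # juxtaposition
    if op == "imp":
        return "[" + left + "(" + right + ")]"       # scroll [A(B)]
    if op == "or":
        return "[(" + left + ")(" + right + ")]"     # double scroll [(A)(B)]
    raise ValueError(op)


A, B = ("var", "A"), ("var", "B")
axiom_1 = to_string(("imp", A, ("imp", B, A)))
weak_form = to_string(("not", ("and", A, ("not", B))))
```

Under this encoding, Axiom 1 becomes the string `[A([B(A)])]`, a scroll whose loop contains another scroll, and ¬(A ∧ ¬B) becomes `[A [B]]`.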
The composite f ∘ s is not exactly the inverse mapping of g, but the composites f ∘ s ∘ g, g ∘ f ∘ s, and even s ∘ g ∘ f are all equivalent to the corresponding identity. This establishes the mathematical equivalence of the three systems.
Fig. 15 Intuitionistic Beta graphs for intuitionistic quantifiers (there exists P: ∃xP(x); there exists not P: ∃x¬P(x); all is P: ∀xP(x); all is not P: ∀x¬P(x))
Fig. 16 Proof of syllogism Barbara by means of intuitionistic Existential Graphs Beta
Exactly as in the classical case, the system of intuitionistic Existential Graphs Beta results by adding lines of identity and predicate letters to the Alpha graphs. The interpretation of these Beta graphs follows the same clauses given for the Alpha graphs. Besides, drawing a line on the sheet means asserting the existence of an individual, and writing a letter with lines attached to it means that the predicate it represents holds for the individuals involved. Thus, the graphs of the basic intuitionistic quantifiers are shown in Fig. 15. Again, the parity of the areas does not change with the lines of identity. The rules of transformation that complete the system are the same rules for the Existential Graphs Alpha, extended with the same adaptations to the line of identity registered in the classical case. The resulting system of intuitionistic Existential Graphs Beta corresponds to intuitionistic first-order logic. Figure 16 shows a graphical proof of syllogism Barbara, which is also valid in intuitionistic logic: All M are P, and all
S are M, thus all S are P. This line of graphical argument follows the abduction of embedding the odd copy of letter M into a graph equal to the outer one, in order to deiterate it in the penultimate step. A different proof obeys the abduction of first completely copying the graph containing this odd letter M beside the even copy of M and then removing the two copies of M step by step. For more details and examples, see Oostra (2011). Finally, the introduction of broken cuts and loops and their alethic interpretation, with the adequate adjustment of the rules of transformation, leads to systems of intuitionistic Existential Graphs Gamma that correspond to various intuitionistic modal logics of type S4, S4.2, and S5 (Oostra, 2012).
Sublogics of Intuitionistic Logic
Even more than in classical logic, intuitionistic propositional calculus can be thought of as the result of the harmonious convergence of its connectives, each of which is essential to the overall behavior of the system. In a perhaps abductive analysis of the role of each of these connectives in the complete system, the axioms that govern each of them could be considered separately. This idea was already suggested by David Hilbert and Paul Bernays in the introduction of the axioms for
classical propositional calculus, who compared such a classification with the well-known groups of the axioms of Geometry (Hilbert & Bernays, 1934). In fact, the restriction to some connectives, or just one of them, governed by the corresponding axioms or rules, determines a certain sublogic of intuitionistic logic. Since the basic connectives of intuitionistic propositional calculus are implication, disjunction, conjunction, and absurd, there are immediately 16 sublogics, given by all possible subsets of {→, ∨, ∧, ⊥}. If the subset includes implication, it is only necessary to consider the corresponding axioms and modus ponens. This is the case for some well-known systems like positive intuitionistic logic, given by implication, conjunction, and disjunction {→, ∧, ∨}, and intuitionistic implicational logic, also known as positive implicative logic (Rasiowa, 1974), given by implication only {→}, with Axioms 1 and 2 plus modus ponens. If implication does not belong to the chosen subset, then suitable inference rules are adopted for the connectives, which replace the axioms. In any sublogic that contains implication and absurd, negation can intuitionistically be defined as ¬α = α → ⊥. By the axioms for implication, different relevant results follow from this equivalence even without Axiom 3, like modus tollendo tollens as shown in a previous section. Therefore, in particular, for connective ⊥, it is always worth considering both possibilities, with or without Axiom 3 or an equivalent rule of inference. Hence, the 8 sublogics that contain absurd each give rise to two sublogics, and the 16 sublogics given by subsets of connectives expand to 24. Figure 17 shows the lattice of sublogics of intuitionistic propositional calculus thus given by subsets of connectives. In this diagram, ⊥− means that the absurd has been included without Axiom 3 or an equivalent inference rule.
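The count of 24 can be reproduced by direct enumeration. The following Python sketch is our own illustration; it represents each sublogic as a tuple of connective symbols and uses the label ⊥− for absurd without Axiom 3.

```python
from itertools import combinations

CONNECTIVES = ["→", "∧", "∨", "⊥"]


def sublogics():
    """List all connective subsets; '⊥−' marks absurd without Axiom 3."""
    result = []
    for size in range(len(CONNECTIVES) + 1):
        for subset in combinations(CONNECTIVES, size):
            if "⊥" in subset:
                # Variant in which absurd is kept but explosion is dropped.
                weak = tuple("⊥−" if c == "⊥" else c for c in subset)
                result.append(weak)
            # The subset as given (with Axiom 3 whenever ⊥ is present).
            result.append(subset)
    return result


ALL = sublogics()
WITH_EXPLOSION = [s for s in ALL if "⊥" in s]
WITHOUT_EXPLOSION = [s for s in ALL if "⊥" not in s]
```

The enumeration yields 24 sublogics in total: 8 that include ⊥ with Axiom 3, and 16 in which the principle of explosion is not available (8 without absurd and 8 with ⊥−).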
Fig. 17 The sublogics of intuitionistic propositional calculus given by subsets of connectives
As noted before, Axiom 3 stands for the principle of explosion. Thus, in the 16 sublogics on the left-hand side of Fig. 17, it does not hold that anything follows from falsehood or contradiction. This is one of the basic features of paraconsistent logic (da Costa, 1963), so perhaps these sublogics could be classified as paraconsistent. At least, with what follows in the next section, this remark points to a certain connection between intuitionistic logic, paraconsistent logic, and Existential Graphs. Each of the sublogics in Fig. 17 is associated with an algebraic structure that operates as its semantics, just as Heyting algebras are associated with intuitionistic propositional calculus and Boolean algebras with classical propositional calculus. In fact, all the structures corresponding to these sublogics are extensions of the Heyting algebras. For example, the structures corresponding to intuitionistic implicational logic are the positive implication algebras (Rasiowa, 1974), also known as implicative models (Henkin, 1950) or Hilbert algebras (Diego, 1961). On the other hand, relatively pseudo-complemented lattices characterize positive intuitionistic logic (Rasiowa, 1974), while Hilbert lower semilattices correspond to intuitionistic implicational logic with conjunction (Castillo & Oostra, 2010; Oostra, 2019).
Existential Graphs for Intuitionistic Sublogics
Just as the restriction to some connectives gives rise to sublogics of intuitionistic propositional calculus, a natural abduction suggests that the restriction to certain signs will lead to systems of Existential Graphs for some of these sublogics. This study has an intrinsic mathematical interest, consisting of the search for graphical versions for more nonclassical logical systems. In addition, in this particular framework, this elaboration allows deepening the understanding of the different diagrams used in the intuitionistic Existential Graphs and, on the other hand, of the behavior of the basic intuitionistic connectives. As already shown, Existential Graphs are a suitable environment to study abduction in mathematical reasoning in a visual way.
A first attempt to study Existential Graphs for sublogics of intuitionistic propositional calculus consists in considering only the scroll. Just as the system of classical Existential Graphs Alpha has only letters and cuts on the sheet of assertion, here there are only letters and scrolls. Sticking to the original interpretation, writing two letters or drawing two graphs together on the sheet means asserting both, so this graphical system allows conjunction to be represented. Drawing a scroll with graphs in its areas means asserting an implication; thus this system represents implication as well. Assuming no more signs, this provides a representation of the formulas of implicational logic with conjunction, {→, ∧}. For graphical deduction, the same rules of transformation as in the full intuitionistic system are assumed, restricted to letters and scrolls. Since there are no double scrolls or cuts, it is impossible to draw or remove additional loops. Surprisingly, the rules of transformation are the verbatim transcription of the classical rules Alpha, changing only a cut to a scroll
and a double cut to a scroll with an empty outer area. The proof of modus ponens shown in Fig. 10 is valid in this system, as is the proof of theorem A → A in the same Figure. The proof of Axiom 1 shown in Fig. 14 is valid, and Axiom 2 is also provable with these diagrams and rules. In fact, this system of Existential Graphs formally corresponds to intuitionistic implicational logic with conjunction (Gómez, 2013; Oostra, 2019). Both proofs of transitivity of implication, shown in Figs. 12 and 13, are also valid in this system. Although at first it seems that the sheet of assertion always imposes the conjunction, in fact it is feasible to define systems of Existential Graphs for logics without this connective. The trick is to forbid the juxtaposition of graphs in all internal areas. For example, it is possible to consider only the scroll, plus the letters, as in the previous system. However, in any of the two areas of a scroll, it is now only permitted to write a letter, draw a scroll, or leave it empty. Drawing such a scroll on the sheet with graphs in its areas means asserting the matching implication. Since this is the only connective allowed in this interpretation, it results in a representation of the formulas of intuitionistic implicational logic, {→}. Only on the sheet of assertion may various graphical premises appear together, although this is not interpreted as their conjunction. Again, the same rules of transformation are assumed but restricted to letters and scrolls without juxtaposition in the areas. For example, a graph is only allowed to be iterated in an empty area or to be written in an empty odd area. Once again, the proof of modus ponens, shown in Fig. 10, and of Axiom 1, shown in Fig. 14, are valid in this system. With great care, it is possible to prove Axiom 2 with these restrictions; thus this system of Existential Graphs corresponds to intuitionistic implicational logic. As an explanatory example, none of the proofs shown in Figs. 
12 and 13 are valid in this system because in both there appear juxtaposed graphs in internal areas. Figure 18 shows a proof of transitivity of implication in this system; its basic abduction is the careful embedding of the odd copy of the letter B in a graph that can then be deiterated from the outside.
Fig. 18 A proof of the transitivity of implication with implicative Existential Graphs Alpha
Now, it is not difficult to propose a system of Existential Graphs for intuitionistic positive logic, the sublogic of intuitionistic propositional calculus given by
implication, conjunction, and disjunction {→, ∧, ∨}. Letters, scrolls, and double scrolls are taken for signs on the sheet of assertion. The interpretation is the same as before; hence a graphical representation of the three mentioned connectives results and any intuitionistic positive formula can be represented graphically. The same rules of transformation are assumed as in the full system, restricted to the retained diagrams, that is, the rules are only discarded when applied to cuts. Again, the proofs shown in Figs. 10, 12, 13, and 14 and even Fig. 18 are all valid in this system. Furthermore, the remaining axioms for conjunction and disjunction can be proved, and this system of Existential Graphs formally corresponds to intuitionistic positive logic. To these three systems of Existential Graphs, negation can be added in two steps, as explained before for the sublogics of intuitionistic propositional calculus. For absurd, the empty cut is included as a new sign on the sheet of assertion, and Fig. 19 explains this choice. Given an arbitrary sign ∗ for absurd, and the basic abduction that negation is represented by the Peircean cut, then the intuitionistic definition of negation as the implication of absurd leads inescapably, under the rule of scrolling, to the equivalence of ∗ with the empty cut. If no additional rules of transformation are allowed for the empty cut, only some basic properties of the cut result, derived from its definition as an implication. Thus systems of Existential Graphs arise for the sublogics given by {→, ⊥− }, {→, ∧, ⊥− }, and {→, ∧, ∨, ⊥− }. Figure 20 shows that the principle of explosion corresponds precisely to applying the rule of insertion of an arbitrary loop to the empty cut. Thus, assuming all the rules of transformation, there result systems of Existential Graphs for the sublogics given by {→, ⊥} and {→, ∧, ⊥}, in addition to recovering the whole diagrammatic system for intuitionistic propositional calculus. 
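The difference between ⊥− and ⊥ can be illustrated in Lean 4 by letting an arbitrary proposition f play the role of absurd without Axiom 3, so that the negation of α is simply α → f. This encoding is our own assumption and does not model the graphical rules themselves. Under it, the properties that derive from the definition of negation as an implication still hold, while explosion is unavailable because nothing is assumed about f.

```lean
-- `f` stands for absurd without Axiom 3 (⊥−); "not α" is written α → f.

-- α ⊢ ¬¬α
example (f α : Prop) : α → ((α → f) → f) :=
  fun ha hna => hna ha

-- α → β, ¬β ⊢ ¬α (modus tollendo tollens)
example (f α β : Prop) : (α → β) → (β → f) → (α → f) :=
  fun hab hnb ha => hnb (hab ha)

-- ¬¬¬α ⊢ ¬α
example (f α : Prop) : (((α → f) → f) → f) → (α → f) :=
  fun h ha => h (fun hna => hna ha)
```

Adding the hypothesis `∀ β : Prop, f → β` would correspond to restoring Axiom 3, which in the graphical system is the insertion of an arbitrary loop into the empty cut.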
Fig. 19 Definition of negation with absurd ∗ (top) and equivalence of ∗ with the empty cut (bottom)
Fig. 20 Principle of explosion in intuitionistic Existential Graphs
Fig. 21 Sublogics of intuitionistic propositional calculus with associated systems of Existential Graphs
Figure 21 highlights in green the sublogics of intuitionistic propositional calculus given by subsets of connectives that have full-fledged systems of Existential Graphs associated, as explained before. In a new and surprising manifestation of the asymmetry in intuitionistic logic, it seems that the restriction of the usual rules of transformation to a system with only scrolls and double scrolls, but not admitting conjunction, is not sufficient to prove the required axioms of disjunction. Therefore, to obtain systems of Existential Graphs for the sublogics given by {→, ∨}, {∧, ∨}, and only {∨}, new rules are probably required. This could lead to a more complete understanding of
intuitionistic disjunction, which is in fact one of the salient features of this logic. Finally, the sublogics given by {∧} alone, or without any connectives, are quite primitive, and it is easy to think of systems of Existential Graphs for them, although their study, both formal and graphical, seems less interesting.
Conclusions
Existential Graphs are a powerful tool for bringing to light the basic abductions used in logical proofs. At many stages of mathematical practice, these abductions are only occasionally made visible by auxiliary figures, or by concealed comments, and then carefully hidden. But in this context, diagrams constitute the very formal language in which the deductions are carried out. Thus, the unique strength of this system is that it combines syntax, semantics, and pragmatics on the one hand and abduction, induction, and deduction, on the other, all in one. In both algebraic and graphical systems, the understanding of propositional logic grows by considering connectives individually and subsystems given by their combinations. This paper shows this procedure in intuitionistic logic, using Existential Graphs. Thus, the different systems of Existential Graphs initiated by Peirce and now extended to many nonclassical logics substantially improve the study of logic itself and of abductive cognition in general.
In the other direction, the sheer possibility of studying more sublogics of intuitionistic propositional calculus by means of Existential Graphs was in fact opened up thanks to the abductive reasoning made visible by the diagrams themselves. The clearly abductive idea of building a graph inside in order to deiterate it from the outside, which at first was just an alternative idea, turned out to be the only way so far to extend the proofs to systems with fewer signs. Thus, abductive reasoning enhanced the study of Existential Graphs. May the studies of visual systems continue to grow since, as Peirce expressed it, this “ought to be the logic of the future” (Roberts, 1973).
References
Castillo, M., & Oostra, A. (2010). Álgebras para la lógica implicativa con conjunción. Matemáticas: Enseñanza Universitaria, 18(2), 31–50.
Chagrov, A., & Zakharyaschev, M. (1997). Modal Logic. Oxford Logic Guides (Vol. 35). Oxford: Clarendon Press.
da Costa, N. (1963). Sistemas formais inconsistentes. Universidade Federal do Paraná, Curitiba, Brasil.
Diego, A. (1961). Sobre álgebras de Hilbert. PhD thesis, Universidad de Buenos Aires, Buenos Aires, Argentina.
Fuentes, C. (2014). Cálculo de secuentes y gráficos existenciales Alfa: dos estructuras equivalentes para la lógica proposicional. Undergraduate thesis, Universidad del Tolima, Ibagué, Colombia.
Goldblatt, R. (1984). Topoi. The Categorial Analysis of Logic. Amsterdam: Elsevier.
Gómez, A. (2013). Gráficos Alfa para la lógica implicativa con conjunción. Undergraduate thesis, Universidad del Tolima, Ibagué, Colombia.
Henkin, L. (1950). An algebraic characterization of quantifiers. Fundamenta Mathematicae, 37, 63–74.
Heyting, A. (1971). Intuitionism. An Introduction. Amsterdam: North-Holland.
Hilbert, D., & Bernays, P. (1934). Grundlagen der Mathematik (Vol. I). Berlin: Springer.
Oostra, A. (2010). Los gráficos Alfa de Peirce aplicados a la lógica intuicionista. Cuadernos de Sistemática Peirceana, 2, 25–60.
Oostra, A. (2011). Gráficos existenciales Beta intuicionistas. Cuadernos de Sistemática Peirceana, 3, 53–78.
Oostra, A. (2012). Los gráficos existenciales Gama aplicados a algunas lógicas modales intuicionistas. Cuadernos de Sistemática Peirceana, 4, 27–50.
Oostra, A. (2019). Representación compleja de los gráficos Alfa para la lógica implicativa con conjunción. Boletín de Matemáticas, 26(1), 31–50.
Oostra, A. (2021). Equivalence proof for intuitionistic existential alpha graphs. In A. Basu, G. Stapleton, S. Linker, C. Legg, E. Manalo, & P. Viana (Eds.), Diagrams 2021: Diagrammatic Representation and Inference. Lecture Notes in Computer Science (Vol. 12909, pp. 188–195). Cham: Springer International Publishing.
Ortiz, J., & Segura, J. (2018). Gráficos Alfa intuicionistas. Undergraduate thesis, Universidad del Tolima, Ibagué, Colombia.
Peirce, C. S. (1906). Prolegomena to an apology for pragmatism. The Monist, 16(4), 492–546.
Peirce, C. S. (1931–1958). Collected Papers of Charles Sanders Peirce (8 volumes). Cambridge, MA: Harvard University Press.
Peirce, C. S. (2019–2021). Logic of the Future (3 volumes). Berlin: De Gruyter.
Pietarinen, A. (2015). Exploring the Beta Quadrant. Synthese, 192(4), 941–970.
Rasiowa, H. (1974). An Algebraic Approach to Non-classical Logics. Amsterdam: North-Holland.
Roberts, D. D. (1973). The Existential Graphs of Charles S. Peirce. The Hague: Mouton.
Tarski, A. (1938). Der Aussagenkalkül und die Topologie. Fundamenta Mathematicae, 31, 103–134.
Troelstra, A. S., & van Dalen, D. (1988). Constructivism in Mathematics. An Introduction (2 volumes). Amsterdam: North-Holland.
Zalamea, F. (1997). Pragmaticismo, gráficos y continuidad: hacia el lugar de C. S. Peirce en la historia de la lógica. Mathesis, 13, 147–156.
Zalamea, F. (2012). Peirce’s Logic of Continuity. A Conceptual and Mathematical Approach. Boston: Docent Press.
Zeman, J. J. (1964). The Graphical Logic of C. S. Peirce. PhD thesis, University of Chicago, Chicago.
Visual Semiotics, Abduction, and the Learning Paradox: The Role of Graphic Signs
32
Inna Semetsky
Contents
Introduction: The Abductive Semiotics of Images
The Fool as a Figure of Abduction
Learning Abductively from Images
Conclusions
References
Abstract
This chapter examines the semiotics of abductive inference as based in archetypal images and unconscious processes. The apparent paradox of learning by way of creative thought is unfolded and partially resolved by demonstrating the essentially experimental nature of all knowledge acquisition. Emphasizing the triadic character of Peirce’s conception of the sign and the relevance of Polanyi’s concept of tacit knowledge for theorizing abductive processes of reasoning, the specific role of the imagination in abduction as characterized by a logic of the “included third” rather than that of the excluded middle is analyzed in terms of the figure of the Fool, the zero arcanum, from among the traditional iconography of the Tarot. Formal aspects of abductive reasoning are then considered from the standpoint of the semiotics of visual images and models, in particular their iconic and indexical dimensions and their liminal character as standing between the purely imaginary on the one hand and the rationally cognitive on the other hand.
Inna Semetsky: deceased.
I. Semetsky, Institute for Edusemiotics Studies, Melbourne, VIC, Australia
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_40
Keywords
Abduction · Semiotics · Edusemiotics · Tarot · Visual reasoning · Peirce · Polanyi
Introduction: The Abductive Semiotics of Images
The relationship between word and image remains historically, philosophically, and ideologically troubled. The verbal mode is characterized by “linear, sequential, reductionist, and abstract thinking,” while the medium of images demands “a holistic, simultaneous, synthetic, and concrete” (Shlain, 1998, p. 1) mode of perception, thus problematizing the role of linguistic signs as the basic means of communication. Signs are relational entities that surpass Saussure’s linguistic dyads: they are triads formed by interpretants (Peirce’s term) as included thirds between the dualistic categories of modern discourse. It is self-reference – suspect in the analytic logic of language, where it is denounced as circular and begging the question – that ensures the tri-relative process∼structure of genuine signs as dynamic patterns (Kelso & Engstrøm, 2006) composing the process of semiosis. The notation ∼ (tilde) here indicates a coordinated, complementary relation between perceived opposites, including word vs. image, and it is surely by breaking the dualistic oppositions that semiotics as the science of signs affirms the value of meanings rather than just empirical facts. Not only do “pictures have a continuous structure . . . They induce the reader to . . . read the picture as if it were a written text” (Posner, 2004, p. 84). The relational dynamics of signs enclose a semiotic space that takes the paradoxical shape of the so-called semiotic triangle, not unlike in M. C. Escher’s print Relativity. This figure inspired Nobel laureate, mathematician, and physicist Sir Roger Penrose to conceptualize the specific structure of the “impossible” triangle, which in turn was used by Escher in his depiction of the Waterfall (Schattschneider, 2010). Significantly, Escher defined his landscapes as mindscapes.
In the area of education, the first steps from “page to screen” were taken several decades ago (Snyder, 1997), laying the background for the latest advances in educational theory, especially regarding its cutting-edge edusemiotic turn (Semetsky, 2014; Semetsky & Stables, 2014) which affirms that “the most striking feature of Peirce’s theory of signs is that it suggests a corresponding theory of minds, according to which minds are sign-using (or ‘semiotic’) systems” (Fetzer, 1991, p. 65). For Peirce, cognition proceeds “in the relation of my states of mind at different instants . . . In short, the Immediate (and therefore in itself unsusceptible of mediation – the Unanalyzable, the Inexplicable, the Unintellectual) runs in a continuous stream through our lives; it is the sum total of consciousness, whose mediation, which is the continuity of it, is brought about by a real effective force behind consciousness” (Peirce CP 5.289). Recent research on mind in motion (Tversky, 2019) inadvertently confirms the Peircean assertion of “moving pictures of thought” (Peirce CP 4.8) as inferences, and full-fledged semiotic reason incorporates the
ingenious Peircean category of abduction. In the current discourse on philosophy of science, abduction is usually taken in one sense only and presented as an inference to the best explanation; as such, abduction remains epistemically problematic (Hintikka, 1998; Magnani, 2001). For Peirce, however, abduction is equivalent to those “operations of the mind which are logically exactly analogous to inferences excepting only that they are unconscious and therefore uncontrollable and therefore not subject to criticism” (Peirce CP 5.108) while partaking of the initial gut feeling as the bodily response to a “surprising fact” (CP 5.189). Abduction punctuates conscious rule-based propositional thinking; still, it is a mode of reasoning – yet one that “comes to us like a flash. It is an act of insight” (Peirce CP 5.181). Magnani references Herbert Simon, who wrote that problem-solving cannot be reduced to deduction but “is a process of selective trial and error, using heuristic rules” (in Magnani, 2001, p. 16) that makes semiosis a non-monotonic process. The significance of the unconscious defies any reduction of learning to a solely “conscious mental process . . . It is chiefly bodymind learning” (Merrell, 2002, p. 15) where the material and mental and the unconscious and consciousness form a genuine bipolar, Janus-faced, sign. Describing the structure of abduction, Peirce noted that “the first premise is not actually thought, though it is in the mind habitually. This, of itself, would not make the inference unconscious. But it is so because it is not recognized as an inference; the conclusion is accepted without our knowing how” (Peirce CP 8.65), as though intuitively and in the manner of “instinctive reason” (CP 6.475). 
It is specifically what Magnani calls model-based creative abduction that strongly relates to visual signs: while being vague, ambiguous, and “processed” subconsciously, such “image-based hypothesis formation can be considered as a form of visual (or iconic) abduction” (Magnani, 2001, p. 43). Nonetheless it can elicit conceptual change conducive to learning because visual signs belong to those “thought-signs [which] are of two classes: first, pictures or diagrams or other images (I call them Icons) . . . and secondly, signs more or less analogous to symptoms (I call them Indices)” (Peirce CP 6.338). Cognition is not limited to consciousness: “in subconscious mental activities visual representations play an immediate role. [And] abduction plays a role even in relatively simple visual phenomena” (Magnani, 2001, p. 42). This chapter uses a specific image from the deck of Tarot pictures that (while traditionally located at the low end of popular culture) are a perfect subject for research in semiotics (Semetsky, 2011, 2013). According to the Encyclopedic Dictionary of Semiotics, Tarot readings belong to “a branch of divination based on the symbolic meanings attached to the individual cards . . . interpreted according to the subject or purpose of a reading and modified by their position and relation to each other, i.e., from their specific location in a ‘layout’ or ‘spread’” (Sebeok, 1994, Vol. 1, p. 99). The dictionary refers to the set of cards as symbolic, thus equating the image with a particular keyword traditionally attributed to it. The reading process however is much more complex as it requires creative abduction in order to perceive the “encoded” information pertaining to visual signs. Surely they need to be read or decoded, though this code is not binary.
The term code entered semiotics via the terminology of information theory (Nöth, 1995). A semiotic code serves as a relative “correlation or correspondence between sign repertoires or signs and their meanings” (Nöth, 1995, p. 205). Such correspondence explains how the images appearing in specific positions in a layout can be interpreted in context. While “analogic coding generates messages in a continuous space, such as images, models, and nonverbal signs” (1995, p. 208), it is during readings, when the signs of the yet unknown and unconscious are duly translated into linguistic signs, that the semiotic messages of Tarot icons are verbalized or digitized: “ultimately every act of semiosis involves a digital transformation of messages” (1995, p. 208) and “the ‘symbolic’ representation did, in fact, already possess an iconic content” (Stjernfelt, 2007, p. 91). By bringing the unconscious and consciousness, mental and material, psyche and soma, together, the process of interpretation interrogates the very notion of codification:
While it is true that medical “semiology” is purely a study of the natural indices of pathology, psychosomatics, by contrast, sees in such symptoms reactions which are destined to communicate information, desires which the subject is not able to express any other way. . . . Parapsychology, too, postulates the notion of subliminal messages which are not conscious. . . . Signification is more or less codified . . . Here, too, we have the frontier between logics and poetics. (Guiraud, 1975, pp. 23–24)
Peirce, addressing “certain ‘telepathic’ phenomena,” stressed that “Such faint sensations ought to be fully studied by the psychologist and assiduously cultivated by every man” (Peirce & Jastrow, 1884, in Hacking, 1990, p. 206). Earlier studies by Soviet semioticians equated the phenomenon of Tarot mainly with fortune-telling (Lekomceva & Uspensky, 1977, p. 70). Playing cards were regarded as a simple semiotic system with a limited vocabulary, in which “divination of past and present is a game” (p. 71). Indeed, historically Tarot was taken simply as a cultural game (Dummett, 1980; Dummett & Decker, 2002). Still, functioning in the capacity of “the typology of plots” (Egorov, 1977, p. 77), a reading of the spatial distribution of pictures pointed to the existence of certain narrative units and “motif-functions” (p. 81) together with “formulization” (sic, p. 83) in terms of the ordering of information. A comprehensive study by Heeren and Mason (1984) presented an ethnography of communication used by contemporary readers as well as the discourse guiding their analysis. The authors distinguished between separate fields of discourse such as style of everyday life, interview style, and, “the most unusual and distinctive” (1984, p. 197), visionary style. An original study by Aphek and Tobin (1989) advanced readings to the level of a complex, dynamic, meta-semiotic system, in which the cards were seen as representing “the possible semantic, cultural and social attributes of an umbrella term or theme attributed to that particular card” (1989, p. 13). A notion of the “dynamic relativism in human communication” (p. 2) placed readings in a framework of autopoiesis (cf. Varela, 1979), thus making “the very act of perception itself as an individualized autopoietic process” (Aphek & Tobin, 1989, p. 3). Still, Tarot was presented as just one of many branches of fortune-telling in general, that is, “a specific instance of persuasive dyadic human communication” (p. 
175), thereby ignoring the triadic nature of Peircean signs. In the framework of Peirce’s semiotics, a distribution of Tarot pictures in a layout comprises diagrams
that, in a mode analogous to his existential graphs, render “literally visible before one’s very eyes the operation of thinking in actu” (Peirce CP 4.6).
The Fool as a Figure of Abduction
All logical relations can be studied by being displayed in the form of existential graphs or iconic representations, such diagrammatic thinking likely yielding solutions to otherwise unsolvable logical problems. While the meaning created by the diagrammatic method is “altogether something virtual” and “lies not in what is actually thought, but in what this thought may be connected with in representation” (Peirce CP 5.289), a series of representations ultimately culminates in actual effects as “practical bearings” (CP 5.402) as per Peirce’s pragmatic maxim. In accord with his postulate of synechism, Peirce asserted the possibility of divination as the prerogative of a genius who “may receive and act upon indications of which we are quite unconscious, and which, owing to the low sensibility of the conscious part of the mind, seem impossible” (CP 6.586). Still, there exists “an immediate attraction for the idea itself, whose nature is divined before the mind possesses it, by the power of sympathy, that is, by virtue of the continuity of mind” (CP 6.307) that moves beyond the confines of the solely analytic mind and creates “genuine synthetic consciousness, or the sense of the process of learning” (CP 1.390). Peirce considered consciousness a vague term and asserted that “if it is to mean Thought it is more without us than within. It is we that are in it, rather than it in any of us” (CP 8.256). This “outside” of cognition is the dimension of the unconscious; and interpreting and bringing to consciousness the silent voice of images is what gives them meaning and significance. Peirce’s semiotics as the science of signs defies the classical principle tertium non datur, the law of excluded middle, which is the basis of propositional thinking. 
The Thirdness of interpretation necessarily includes the Firstness of abduction, in accord with the unorthodox cardinality of Peircean categories located at the psychological, logical, and ontological levels alike: “In psychology Feeling is First, Sense of reaction Second, General conception Third, or mediation. . . . Chance is First, Law is Second, the tendency to take habits is Third. Mind is First, Matter is Second, Evolution is Third” (Peirce CP 6.32). Abduction functions as a sort of perceptual judgment creating a hypothesis that asserts its conclusion only conjecturally; yet, according to Peirce (CP 5.189), there is a reason to believe that the resulting judgment is true. (Contemporary cognitive science (Von Eckardt, 1996) recognizes Peirce’s contribution in the context of the so-called theory of content determination. If images like “certain sorts of ink spots . . . have certain effects on the conduct, mental and bodily, of the interpreter” (Peirce CP 4.431), then habit change and belief revision may follow. Magnani (2001) provides an extensive analysis of belief revision including “from the point of view of conceptual change” (p. 39).) At the psychological level, abduction functions as intuition that appears to be immediate, while abduction per se is embedded in the continuity of an inferential, mediated process. Thus it represents a paradoxical mediated immediacy, seemingly
a contradiction in terms, but only from the perspective of classical logic which does not allow for any “in-between-ness” as a prerogative of genuine, triadic, signs. Still, as leading to insight, the terms intuition and abduction may be interchangeable. While psychologically intuition proceeds below awareness and tends to produce a seemingly instantaneous insight, logically in the form of abduction it is a necessary part of semiotic reason. It is the capacity to read the barely perceptible signs, the meanings of which are as yet unknown, that accounts “for experts’ abilities to respond to many situations ‘intuitively,’ [and not] to hypothesize additional mechanisms to explain intuition or insight” (Simon, 1995, p. 35). Still such a nonmechanical “mechanism” does exist in the form of abductive and paradoxical “unconscious inference” (Peirce CP 8.63). Abduction jump-starts the process of learning as a subtle feeling comparable with “a peculiar musical emotion . . . This emotion is essentially the same thing as an hypothetic inference” (Peirce CP 2.643) that leads to insightful judgment even in the absence of propositional thought. Sure enough, we are “capable of making judgments by the use of functions whose processes are not ordinarily verbalized” (Berne, 1977, p. 72) but tend to hide deep in the unconscious. Importantly, in order for there to be three Peircean categories of Firstness, Secondness, and Thirdness, there must exist some extra principle holding them together, some undifferentiated “field within which semiosis plays out its drama” (Merrell, 1995, p. 217), one acknowledged by Peirce as pre-Firstness or nothingness. The nothingness (no-thing-ness) is expressed by the numeral Zero, an ambiguous sign of ultimate wisdom or total folly, a number that historically and perhaps not entirely arbitrarily has been assigned to the very first picture, called the Fool, in the Tarot set of Major Arcana (Fig. 1). Whenever “The artist introduces a fiction . . . 
it is not an arbitrary one; it exhibits affinities to which the mind accords a certain approval” (Peirce CP 1.383). While the following interpretation of this sign’s meaning is a prime example of abductive inference, the image per se is a visual representation of abduction as a category. The picture portrays a youth standing at the edge of a cliff and looking upward while seemingly not noticing the uneven edge or the possibility of falling down. The world ahead is full of new encounters, experiencing phenomena as “the parish of percepts . . . out in the open” (Peirce CP 8.144); yet the Fool remains ignorant of them. Only venturing into a novel, as yet unknown, territory so as to learn might bring some order into the chaotic flux of perceptions. The wandering Fool, who is always on the road, who carries his sack on a stick as the universal symbol of vagabonds and minstrels, and is pictured as having stopped at a pivotal point at the edge, “is barely in touch with any facet or fashion of Firstness; hence . . . remains vague in the extreme” (Merrell, 1996, p. 141). It is when “the surprising fact . . . is observed” (Peirce CP 5.189) that the Fool’s inquiring mind begins apprehending experience by means of abduction, the peculiar logic of discovery rather than the logic of justification. If a priori existing concepts as clear and distinct Cartesian ideas were readily available to know the meaning of any novel experience, it would not come as such a surprise with almost uncanny impact. Ideas must be made clear by belief revision. Novelty kicks in when the brute facts of the physical
Fig. 1 The Fool. (This illustration is from the Rider-Waite Tarot Deck, known also as the Rider Tarot and the Waite Tarot. Reproduced by permission of US Games Systems Inc., Stamford, CT 06902, USA. © 1971 by US Games Systems, Inc. Further reproduction prohibited.)
world intervene: “Firstness is as a dream out of which ens reale, the category of Secondness, inevitably at times awakens a sleeper” (Deely, 2001, p. 661). As an ex-sleeper, the Fool’s first step in reasoning is to form a hypothesis by a simple “conjecture. These ideas are the first logical interpretants of the phenomena that suggest them, and which, as suggesting them, are signs” (Peirce CP 5.480). Marked off by Zero, at the border of the paradoxical pre-Firstness, the Fool seems to signify nothing (cf. Rotman, 1987), but not quite so: although barely touching upon the abductive inference present in Firstness, the Fool’s state of disequilibrium partakes “of readiness to receive a certain piece of information” (Bateson & Bateson, 1987, in Hoffmeyer & Emmeche, 1991, p. 159) and accordingly starts functioning as a production of meaningful ordered structures. The Fool’s pragmatic value, then, is not at all Zero; the quasi-purpose of this sign is to produce meaning, make sense of experience, and initiate the process of creating order out of chaos. The abyss in the picture is reminiscent of Peirce saying that “primeval chaos in which there was no regularity was mere nothing, from a physical aspect. Yet it was not a blank zero; for there was an intensity of consciousness there” (Peirce CP 6.265) even if these intensive and chaotic potentialities were as yet not actual, hence unavailable to cognition. Zero can then be described as “the germinal nothing . . . boundless possibility . . . boundless freedom” (Peirce CP 6.217); this sense of freedom available for the Fool in his nonmetric semiotic space, in which void coincides with plenum, opens the possibility of a semiotic relation. In the world described by the conditions of deterministic chaos (cf. Prigogine & Stengers, 1984),
the Fool’s ultimate freedom as Firstness is itself a necessity. Just about to establish a relation with an environment by leaping ahead into this very environment, this sign “belongs” to an open-ended, interactive, semiotic system as a complex whole representing an interconnected network of relations. The “environment” – the natural world behind the cliff in the picture, symbolically – has a mind of its own which would be qualified by Peirce as the quasi-mind and wherein “all the regularities of nature and of mind are regarded as products of growth” (CP 6.102) and evolution, that is, Thirdness. The world of signs-in-action (semiosis) demonstrates the emergence of “another kind of causation” (CP 6.60) which would not be possible without the aspect of free play, a throw of the dice symbolized by the Fool’s teetering at the cliff: the Firstness in Thirdness. This is the world “perfused with signs, if it is not composed exclusively of signs” (Peirce CP 5.448). Signs “located” outside of human mind are embodied laws constituting matter which, according to Peirce, is also a state of mind – a frozen habit – that as such subscribes to “regularity, or routine” (CP 6.277): quasi-mind. Nonetheless it is “the idea of continuity, or the passage from one form to another by insensible degrees” (Peirce CP 2.646) that ensures the overall sign process’ ties to consciousness, thus ultimately fulfilling the condition of genuine intentionality or “aboutness”: mental content is always already about something even if this something is imperceptible, hiding in the unconscious, and cannot as yet be properly recognized and articulated. The paradox of the unconscious abductive inference is in full force! However the “future of psychology may lie in the paradoxes rather than in the body of logic” (Berne, 1977, p. 29). Indeed, semiotic logic is circular, self-referential, and as such paradoxical to its core. 
For Peirce, however, a paradoxical “self-contradictory proposition is not meaningless; it means too much” (CP 2.352). From a semiotic perspective, any thinker always engages in a dialogue with that thinker’s own unknown “future self” because “his thoughts are what he is ‘saying to himself,’ that is, is saying to that other self that is just coming into life in the flow of time” (Peirce CP 5.421). This future-oriented dynamics is a prerogative for the growth of meanings as a feature of semiotic reason. As Peirce passionately says, “Your self of one instant appeals to your deeper self for his assent” (CP 6.338). The term retroduction as a process of discovery is interchangeable with abduction (cf. Magnani, 2001) as it stresses the backward movement necessary for intuitive learning from within while also performing a leap forward to the unknown. Kihlstrom (1993) described the experiment on subliminal perception performed by Peirce and his student Jastrow and referred to the unconscious as “a domain of mental structures and processes which influence experience, thought, and action outside of phenomenal awareness and voluntary control” (1993, p. 125). When such knowledge organization, albeit at first imperceptible to consciousness, becomes “fully accepted,” it “tends to obliterate all recognition of the . . . premisses from which it was derived” (Peirce CP 7.36): the tiny inferential but unarticulated steps remain unavailable to direct cognitive awareness. Contemporary neuroscience and neurophilosophy (Varela, 1999) recognize the existence of an imperceptible temporal gap in brain activity, during which a kind of unconscious processing of information is supposed to take place. This dynamics
determines “the entire readiness-for-action in the next moment” (Varela, 1999, p. 51) – just like that exhibited by the Fool icon! The image of the Fool expresses an instinctual and “quasi-immediate” (Merrell, 1995, p. 204) sense of spontaneous decision-making and taking chances offered by the as-yet-unknown experiences, “though it is not purely accidental or aleatory” (p. 204). The world of choices comes about as if by chance, seemingly from nothingness, out of Zero; the Fool was wandering without any specific purpose or destination – he wouldn’t be the Fool otherwise – yet his abductive leap represents a selective, albeit unconscious, choice, a subliminal decision-making, that is, an interference of difference that would have made a difference in practice. The domain of nothingness contains the seeds of future possibilities; in fact they are here, in the picture, subsisting in the void, or metaphorical abyss of freedom, behind the cliff. The Fool discovers the meaning of experience when he “lets go of the constraints of habitual responses [by performing] a saltus step off the edge” (Kevelson, 1999, p. 15). His abductive “saltus” is an “unconscious inference [that] differs essentially from inference in the narrow sense” (Peirce CP 8.63) because “neither Deduction nor Induction contributes a single new concept to the structure” (CP 6.475) of knowledge. The role of these types of inferences is such that “Induction . . . has to test the deductive consequences of the hypothesis proposed” (Stjernfelt, 2007, p. 335), the initial hypothetical inference indeed proceeding by abduction. In the process of diagrammatic reasoning, the heuristics “that there might lie further information hidden in the multi-perspective structure of the picture” (Stjernfelt, 2007, p. 284) is always present! In a series of translations into other “more fully developed” (Peirce CP 5.594) signs, signified by the subsequent Arcana, the naive Fool will learn and grow. His inevitable jump (or fall?) 
is the very condition for initiating learning and becoming conscious of the unconscious. The Fool as Zero, then, due to its quality of paradoxical disjunction, does in fact perform the synthesizing, conjunctive role of the production of meaning or sense from its own opposite, nonsense or nothingness as the very epitome of the Fool. Each subsequent whole number that “indexes” the subsequent Arcana in a deck describes the property that contains zero in itself as an empty set, not unlike in the process of iteration, during which the basic marks or braces are repeated. If the empty set {} corresponds with zero, the process can be represented visually by the infinite series (Fig. 2). The braces are being repeated within a logical process that starts from nothing, from the empty set or zero as indeed indicated by the Fool. Such is the Fool’s paradoxical significance in his signifying nothing! In the Laws of Form, G. Spencer-Brown (1979) demonstrated that logic can be arithmeticized; he in fact added the unmarked state to Peirce’s existential graphs and allowed “the use of empty space in place of a complex of Signs. This makes a profound difference and reveals a beautiful and simple calculus of indications underlying the existential graphs. Indeed Spencer-Brown’s true contribution is that he added Nothing to the Peirce theory!” (Kauffman, 2001, p. 80), the nothingness being the sign of the Fool. Thus it becomes possible to construct logic by using the fundamental first step of drawing a distinction (one might say, abductively or intuitively) and then proceeding with two arithmetical acts of making a mark to signify a distinction and subsequently
Fig. 2 The infinite series. (Reproduced with permission from Barrow (2000, p. 160). See also Rucker (1982, p. 40).)
repeating the mark. The logic of the excluded middle simply represents the same; the logic of the included middle enables learning, due to the initial difference that certainly makes a difference. This is the Fool’s prerogative in the play of semiosis: to make a difference! His action precedes any conscious deliberation, however. He does not know the range of experiences that will have been encountered even as “what enters the mind as information always depend on a selection, and this selection is mostly unconscious. In this sense one should not speak about ‘getting’ information, rather information is something we ‘create’” (Hoffmeyer & Emmeche, 1991, p. 122)!
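The iteration from the empty set described above can be made concrete. The following is a minimal sketch, assuming that the series in Fig. 2 follows the standard von Neumann construction, in which zero is the empty set and each successor of n is the set n ∪ {n}. The function names `successor`, `von_neumann`, and `render` are illustrative only and do not come from the sources cited.

```python
def successor(n: frozenset) -> frozenset:
    """Return n ∪ {n}, the von Neumann successor of n."""
    return n | frozenset([n])


def von_neumann(k: int) -> frozenset:
    """Build the set standing for the whole number k, starting from {}."""
    n = frozenset()          # zero: the empty set
    for _ in range(k):
        n = successor(n)
    return n


def render(n: frozenset) -> str:
    """Write a set using nothing but braces and commas."""
    # Sort members by size so that smaller numbers are printed first.
    members = sorted(n, key=len)
    return "{" + ", ".join(render(m) for m in members) + "}"


for k in range(4):
    print(k, render(von_neumann(k)))
# 0 {}
# 1 {{}}
# 2 {{}, {{}}}
# 3 {{}, {{}}, {{}, {{}}}}
```

Under this assumed construction, every set after the first contains the empty set as a member, which is one way to read the claim that each whole number contains zero in itself.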
Learning Abductively from Images
The Fool, in his pre-First, pre-conscious, and pre-verbal state of mind, cannot yet reason in propositions; still, this sign initiates the very process of semiotic reasoning: the sequence of Tarot signs embodies “an endless series of representations . . . the interpretant is nothing but another representation to which the torch of truth is handed along; and as representation, it has its interpretant again. Lo, another infinite series” (Peirce CP 1.339). Yet, while the iconic character of the visual images is obvious, an interpretation would not be possible without some indexicality even when the sign’s initial “interpretant is a mere quality of feeling” (CP 8.332). As Nöth (2012, p. 286) asserts, “Acquaintance with an unknown object can only be conveyed through an icon in conjunction with an index.” What do the Tarot images “gesture” toward? According to Henry Corbin, there exists an intermediate realm between micro- and macrocosm called mesocosm and dubbed the imaginal world – mundus imaginalis or mundus archetypus. Archetype is an opaque sign that “points beyond itself to a meaning that is darkly divined yet still beyond our grasp, and cannot be adequately expressed in the familiar words of our language” (Jung in Nöth, 1995, p. 119) but needs a relevant medium of communication so as to be brought to consciousness. Archetypes that are, intrinsically, forms without content acquire this very content relationally within the dynamics of a self-organizing semiotic process. The informational content therefore always already is, albeit potentially, virtually and outside of consciousness. Still, is it the archetypal ideas inhabiting Corbin’s imaginal world that inspired creative artists to design Tarot cards as material representations of such supposedly inexpressible ideas? 
The embodiment of archetypal ideas in the pictorial medium affords an opportunity to articulate the expressive “voice” of images when the creative imagination and intuition of the reader lead to making abductive inferences for the construction of logical, meaningful narratives. This is the Tarot hermeneutic (Semetsky, 2011). In the interpretive, hermeneutic process constituting the reading of pictures, “an alternative model of understanding, not unrelated to the alternative which Peirce introduced under the designation of abduction” (Nöth, 1995, p. 336) is imperative. For Peirce, “every general word excites a pictorial idea. Even to the modern student, the pictorial ideograph becomes a considerable part of the idea it excites; and the influence of the hieroglyphics, the modes of expression, etc., is to make ‘a composite of pictures’ particularly expressive in describing the idea conveyed” (Peirce CP 2.354). Corbin makes clear that the “figures of the mundus imaginalis do not subsist in the same manner as the empirical realities of the physical world, [nor] in the purely intelligible world” (Corbin, 1972, p. 6), and references the expression spissitudo spiritualis coined by Henry More, the Cambridge Platonist. Not unlike Plato’s chora which gives birth to ideas, this “in-between” world functions as a semiotic bridge, thus becoming “a metaphysical necessity. Imagination is the cognitive function of this world [that] provides the foundation for a rigorous analogical knowledge allowing for an evasion of the dilemma of current rationalism, which offers only
a brute choice between the two banal dualistic terms of either ‘matter’ or ‘mind’” (Corbin, 1972, pp. 6–7). In such an ambivalent and “radically conjunctive” (Merrell, 1997, p. 63) world of signs, the classical principle of the excluded middle is by definition invalid (cf. Rotman, 1993). Tertium non datur becomes tertium quid – the included third. The third, imaginal, world in-between the Platonic worlds of the intelligible and sensible is reminiscent of the theory of recollection exemplified in Meno, when it is by contemplating geometrical figures as visual signs and engaging in a dialogue – that is, actively participating in the semiotic image-word relationship – that the slave boy begins to learn. Magnani indeed references “Plato’s doctrine of reminiscence” (2001, p. 1) as his theory of knowledge in relation to problem-solving and learning: “‘Constructing’ the figures, Socrates the dialectic leads the young slave to discover by himself the geometrical truths he already possesses in his spirit” (Magnani, 2001, p. 7). The imaginal world of archetypes, in contrast to being purely imaginary or simply unreal, constitutes a distinct order of reality corresponding to a distinct mode of perception. Archetypal ideas permeating the mundus imaginalis and inhabiting the unconscious manifest in life as habits of which, in the absence of learning and critical self-reflection, individuals remain largely unaware, behaving repetitively in the grip of old assumptions and beliefs, thus reinforcing habitual patterns that sink even deeper into the unconscious. It is the human cognitive function enriched with imagination, insight, and intuition – the psychological equivalents of abduction – that provides an epistemic access to the imaginal, archetypal world. Magnani writes: “The activity of Kantian schematism is implicit too, resulting from imagination and completely unknowable as regards its ways of working, empty, and devoid of any possibility of being rationally analyzed. 
It is an activity of tacit knowledge” (2001, p. 1). Michael Polanyi, affirming a mode of tacit knowledge, equates intuition with an apparently unaccountable element in science because “the powers of scientific discerning . . . operate by selecting, shaping and assimilating clues without focally attending to them” (Polanyi, 1964, p. 11) but perceiving them only as subtle and opaque signs that eventually become integrated in consciousness. Mental contents depend on the dialectical interplay between the two poles, distal and proximal – that is, explicit or focal object of awareness and tacit or subsidiary awareness: the tacit dimension (Polanyi, 1966) of knowledge, that of which the thinker typically remains unconscious. These two poles together form a single sign, albeit double-sided and hence bipolar. The semiotic categories of integration, embodiment, and indwelling are the three aspects of Polanyi’s model of the mind. Indwelling refers to the tacit or proximal dimension of experience at the very interface of human bodies with the natural world, that is, at the subconscious level. New discovery dwells in bodily knowing and can break through the boundaries of old knowledge, thus creating new mental content which had previously been only implicit. The “rules” of reasoning include tacit inference within the flow of awareness between the conscious and the unconscious which exceeds it (In his Diagrammatology, Frederik Stjernfelt contends that “What is gained by realizing the diagrammatical character of picture viewing is not least the close relation between picture and thought. It is a corollary
to Peirce’s generalized conception of logic that thought, even if general, can never leave an intuitive, iconic basis, and the diagram as a category is . . . Peirce’s heir to Kant’s famous schemata as a meeting place between intuition and thought” (2007, p. 288).). Like Peirce, Polanyi affirms the logical character of reasoning from clues (signs), which is not deductive but proceeds in the mode of tacit or implicit inference: direct deduction gives way to indirect, mediated integration. Integrated learning is a function of the tacit mode of knowing: it does not represent the known but creates new concepts – as yet unknown, or better said “known” only tacitly or unconsciously. Jerome Bruner referred to an intuitive sense of rightness and contended that “Intuition implies the act of grasping the meaning or significance or structure of a problem without explicit reliance on the analytic apparatus of one’s craft” (Bruner, 1979, p. 102). Intuition functions in accord with its literal meaning, that is, learning from within, thereby affirming its place in the semiotic process founded on “A communication mechanism which is at work across the three levels of perception” (Jantsch, 1975, p. 149). Access to knowledge, then, “and this is the crucial point, is available within ourselves” (1975, p. 146) as much as without, making a semiotic “transcendental relative” (Deely, 2001, p. 619) immanent in Polanyi’s tacit dimension. Education takes the edusemiotic turn as an integrative model of learning with the unconscious (Semetsky, 2020) and as such foregrounds what Polanyi called a tacit triad, partaking of the tri-relative structure of Peircean genuine signs, because tacit knowing joins together three coefficients. For Polanyi, human awareness has a vectorial, or directional, character, in the sense that the dynamics between the two poles, subsidiary and focal, proceeds along a single continuum. 
Importantly, knowing and acting coalesce, and the dimension of awareness is coupled with the activity dimension. The action of signs proceeds between the focal and subsidiary poles as well as between the conceptual and the bodily, that is, between the unconscious and consciousness when the tacit or implicit mode of knowing becomes explicit: “An integrative act takes place when the particulars of subsidiary awareness coalesce into a meaningful whole” (Gill, 2000, p. 47). Polanyi illustrates this point with an example of the learning process, where a medical student tries to read visual radiograms, first being puzzled but then entering the world of interpretable signs and learning their “language,” which begins to make sense when he eventually perceives some regularities in the radiograms as being significant: “this is sense-reading. It is followed by the student being able to verbalize the features he learned to see as significant – that is, sense-giving” (Jha, 2002, p. 62). The vector of awareness, intrinsic to the transformation of signs, thus proceeds from the tacit, peripheral, unconscious, and nonverbal mode to the focal, conscious, and verbal. But simultaneously the student is undergoing his own transformation and growth because surely “man is a sign” (Peirce CP 5.314) and signs evolve! Going back to the Fool image, even with his incapacity for rational decision-making, he is nevertheless inclined to make the right choice. His apparently irrational jump seems to confirm the Peircean insight that “an abductive leap comes by way of a fundamental human instinctive penchant for generally
being more right than wrong in the face of an indefinite number of possibilities for erring” (Merrell, 1992, p. 14). The Fool – solely by “admitting pure spontaneity” (Peirce CP 6.59) – will jump into the abyss of real albeit as yet sub-representative experiences that will eventually make sense for him. Since icons in general “play a key role in modeling, whether speaking of the ‘semiotically real’ object to be modeled or the source from which the model is derived” (Merrell, 1992, p. 189), meaning is always already implicated on the surface within the layout of pictures as projections (or semiotic translations) of Corbin’s mundus imaginalis or mundus archetypus. And the function of archetypes is akin to Peirce’s “general idea, [which] is already determinative of acts in the future to an extent to which it is not now conscious” (Peirce CP 6.156). Peirce’s pragmatic maxim presupposes the discovery of meanings, notwithstanding that the “meaning lurks perpetually in the future” (Merrell, 1992, p. 189). However, in the Fool’s paradoxical but semiotically real world, where that which exists as potentiality can turn to actuality only in some indeterminate future, the future per se is not totally indeterminate but subsists as future anterior. Future that will have been means that it retroductively culminates in “has been,” which makes it always already projectable. The so-called triangle argument in physics supports this assertion. The coexistence of past, present, and future aspects of time in the layout (the infamous “fortune-telling”) is possible because they appear momentarily “frozen in their locations in space and time” (Kennedy, 2003, p. 53). Accordingly, “me-now” becomes simultaneous with “me-tomorrow” (Fig. 3). The dotted lines indicate simultaneity, simultaneity implies coexistence, and the coexistence relation is indicated by a two-headed arrow, thus forming a triangle that is both seemingly “impossible” yet properly semiotic. 
The static layout of images is a “frozen” slice of the dynamic process of semiosis. Embodying a self-reflective triadic structure, the layout demonstrates the paradoxical memory of the future. Indeed, “A man denotes whatever is the object of his attention at the moment; he connotes whatever he knows or feels of this object, and is the incarnation of this form . . . his interpretant is the future memory of this cognition, his future self”
Fig. 3 The triangle argument. (Reproduced with permission from Kennedy (2003, p. 63))
(Peirce CP 7.591). It is by the Firstness present in Thirdness, by the abductions necessary for the translation of signs, that the apparent logical gap between a problem and its solution can be crossed – still, Polanyi’s tacit mode retains its logical priority. A leap to the unknown (yet in no way unknowable) always has an aesthetic aspect, as intellectual beauty or an elegant, albeit fallible, solution. Knowledge in semiotics is always already fallible, and concepts are never completely determined: they are born from potentially insightful intuitions, abductive leaps, guesses, and hunches. In the absence of semiotic interpretation as mediation (Thirdness), such intuitive perception in its semiotic (nonlocal) relation with material reality – like the Fool portrayed as just about to leap – is not as yet integrated in consciousness. Intuition, for Peirce, is not determined by previous cognition (consciously); he acknowledged the somewhat “occult nature” of the unconscious, “of which and of its contents we can only judge by the conduct that it determines, and by phenomena of that conduct” (Peirce CP 5.440). Emphasizing the role of diagrammatic reasoning, Peirce stressed that “passing from one diagram to the other, the [inquirer] . . . will be supposed to see something . . . that it is of a general nature” (CP 5.148), thus contributing to making the ideas clear. The purpose of a diagrammatic mode was to “depict thought’s very movement, its processual character, in terms of interconnecting lines, schemes, figures, abstract mappings. In fact, [Peirce] believed that all thought is sign process and hence it is capable of being presented diagrammatically” (Merrell, 1995, p. 51). Referring to diagrams, Peirce explains that they lead to the synthesis of consciousness (hence, accomplish learning) by awakening intuitions based on the demonstrable relations between elements “which before seemed to have no necessary connection. . . . 
Intuition is the regarding of the abstract in a concrete form, by the realistic hypostatization of relations; that is the one sole method of valuable thought” (Peirce CP 1.383). Describing abduction as “first, present, immediate, fresh, new, initiative, original, spontaneous, free,” Peirce nonetheless cautioned: “Only, remember that every description of it must be false to it” (CP 1.357). Still, how can abduction or intuition be represented if this subtle knowledge in its tacit mode escapes its precise representation in propositional thought or verbal language but lurks in the signs of the unconscious? Even as pictures, images, and “graphic symbols (which include iconic and indexical signs) are a semiotically still largely unexplored field of research” (Nöth, 1995, p. 219), we can construct a diagram of abduction as a mode of specifically visual semiotics (cf. Nöth & Jungk, 2015). For Peirce:
The geometer draws a diagram, which if not exactly a fiction, is at least a creation, and by means of observation of that diagram he is able to synthesize and show relations between elements which before seemed to have no necessary connection. . . . [I]t is the genius of the mind, that takes up all these hints of sense, adds immensely to them, makes them precise, and shows them in intelligible form in the intuitions of space and time. (CP 1.383)
Polanyi asserts that human experience is projected along a vectorial continuum (cf. Semetsky, 2013) between its focal and subsidiary – tacit – dimensions. Invoking vectors and keeping in mind Polanyi’s model of the hierarchy of knowledge, one can construct a vectorial diagram while simultaneously demystifying the learning
paradox that continues to haunt educational discourse. In Plato’s dialogue Meno, Socrates claims that new knowledge cannot be acquired by learning. He is implicitly addressing the problem of being-as-first-known, later formulated by Aquinas, ens primum cognitum. Meno is puzzled by what Socrates means when he provocatively says that there cannot be any new knowledge and that what is called learning is a process of recollection. The theory of recollection demands that all knowledge is already possessed unconsciously and given truths are simply recognized. However, if any new knowledge is incompatible with prior learning – the latter in fact being a precondition for the understanding of what is new – then there is no foundation on which to build such new knowledge. The paradox exhibits absurdity because either one knows a priori what it is that one is looking for or one does not know what one is looking for and therefore cannot have prior expectations of finding anything. A graph representing intuition/abduction may be constructed on the complex plane, which is a grid of two axes marked with imaginary numbers on the vertical axis and real numbers on the horizontal axis. The imaginary number i denotes the square root of minus one. In reference to the geometry of complex numbers, “Caspar Wessel in 1797, Jean Robert Argand in 1806, John Warren in 1828, and Carl Friedrich Gauss well before 1831, all independently, came up with the idea of the complex plane” (Penrose, 2004, p. 81). Imaginary and real numbers together form a plane, on which a point represents a complex number a + bi. Imaginary numbers, while historically considered impossible or even magical, are part and parcel of complex numbers which do “play a fundamental role in the workings of the universe” (Penrose, 2004, p. 67). This is the universe of signs, irreducibly complex and including thinkers, human beings, as among its own constituent parts. Do imaginaries perfuse the world just like Peircean signs? 
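The description of the complex plane can be checked with a few lines of Python, whose built-in `complex` type stores a real and an imaginary part. This is only an illustrative sketch: the values chosen for a and b are arbitrary assumptions and are not taken from the text.

```python
# A point a + bi on the complex plane (a and b are arbitrary example values).
a, b = 3.0, 4.0
z = complex(a, b)

# The horizontal coordinate is the real part, the vertical one the imaginary part.
print(z.real, z.imag)   # 3.0 4.0

# The imaginary unit i (written 1j in Python) squares to minus one.
i = 1j
print(i * i == -1)      # True
```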
Descartes considered imaginary numbers to be absurd or fictitious; it was he who first coined the term. There was no place for them in Newton’s mechanistic philosophy either. Leibniz recognized their intermediary character and qualified them as the ontological amphibium positioned at the level of the divine intellect between being and nonbeing. As will be seen in Fig. 4, they indeed symbolize divination in the form of insight into the as yet unknown. To Gauss, however, the meaning of imaginary numbers was derived from their geometrical, hence diagrammatical or graphical, representation. Polanyi’s focal, rational awareness is expressed by means of real numbers a along the horizontal axis. But tacit knowledge or subsidiary awareness is along the vertical axis b: it is an intuitive jump to the unknown, a leap of abduction, an effort of imagination, for which the imaginary number i is the proper symbol. So in this model, tacit or implicit, instinctive and unconscious, knowing coexists with explicit reasoning, this semiotics being represented by complex numbers a + bi, each corresponding to the ordered pair (a, b) on the Gaussian plane. An analytical representation of direction is indeed possible by means of vectors as directed magnitudes. The inferential processes represented by vectors “add up”: they converge onto a resultant vector on the complex plane (Fig. 4). As Peirce pointed out, “what is growth? Not mere increase” (Peirce CP 1.174), as it would have been in plain arithmetical addition. An operation of simple addition would maintain the linearity of the process which would thus have subscribed to
Fig. 4 The resultant vector r = a + bi. (This diagram was first suggested, constructed, and described from the standpoint of Peirce’s semiotics in, e.g., Semetsky, 2004 (pp. 433–454) or Semetsky, 2015 (pp. 154–166).)
the logic of excluded middle. But a triadic relation, a prerogative of genuine signs, is embedded in the nonlinear dynamics of semiosis, in accord with the logic of included middle. The diagonal line casts its shadow on the horizontal axis, appearing via projection as if from nowhere – or from nothingness as the mark of the Fool. Peirce, in a stroke of genius, paraphrases Shakespeare when positing that the ontological status of abduction partakes of “airy nothings to which the mind of a poet, pure mathematician or another might give local habitation and a name within that mind” (Peirce CP 6.455). These airy nothings – the Fool’s domain – exist at a level of complexity exceeding the realm of real numbers; they do not belong to available sense data. Peirce’s categories of Firstness, Secondness, and Thirdness are truly the “conceptions of complexity” (Peirce CP 1.526). Indeed, “Of what is the conception of complexity built up?” (Peirce CP 5.88). The geometrical addition of directional forces amounts to the resultant vector r. Such heuristics is specifically semiotic and relational. It is inherent in the transformative action of signs when the unconscious becomes integrated in consciousness. That is how a semiotic triangle on the complex plane is formed, even if such a triangle appears to be impossible. A vectorial diagram represents the dynamics inherent in the structure of signs: it is constructed on the basis of projective geometry (recall Polanyi’s emphasis on projection), employing a perspectival composition, which uses the technique of parallel projectors emanating from an imaginary object and intersecting a plane
of projection at right angles (coplanar) to create images. Peirce inquired “whether geometry rests upon any observations concerning clairvoyance” (CP 2.614) and argued for “the application of geometry to the logic of relatives” (CP 3.133). He asserted the relevance of projective geometry to semiotics. Indeed “projective geometry is no more about definite objects or figures and their respective properties, but deals with relational structures and their possible transformations” (Otte, 2011, p. 328). According to the innovative area of so-called virtual logic (Kauffman, 1996), “It is remarkable that domains imaginary with respect to arithmetic are vitally real with respect to geometry” (1996, p. 293) (It was Bernhard Riemann who merged projective geometry with the idea of complex numbers. On the Riemann sphere, zero and infinity are but two opposite poles. In semiotic discourse, they would therefore represent a perfect sign or a complementary pair. In quantum mechanics, zero (marked by the Fool) as vacuum is a source of infinite energy (the abyss in the Fool’s picture).). In Fig. 4, the length a is just a shadow or projection of the diagonal resultant line onto the horizontal axis, not unlike a Platonic copy as the image or shadow on the walls of the proverbial Cave. And abduction itself, being just an intuitive guess or a hypothesis, is also a projection of a much more complex albeit implicit knowledge structure: “the mind is in the attitude of search, of hunting, of projection, of trying this and that” (Dewey, 1991, p. 112). In the context of semiotics, René Thom (1985) presents a case of projected shadow as an example of structural isomorphism produced by coupling: in other words, a semiotic encounter of two poles. It is light illuminating the original and casting the shadow as its image that performs the function of interaction. The formation of images is a manifestation of the universal dynamics that “allows the appearance of forms . . . 
charged with more meaning” (Thom, 1985, p. 280). The resultant vector as a vectorial sum r = a + bi represents expanded semiotic consciousness as a new level of focal knowledge in which the subsidiary dimension is integrated. The complex point symbolizes the closure of the semiotic triangle on itself, like a genuine triadic self-referential sign. It is at this point where “the physical universe ceases to be merely physical. The realm of brute force and physical interaction as such at this moment becomes caught up in the semiotic web, and the universe becomes perfused with signs” (Deely, 2001, p. 621). Strictly speaking, this point is akin to the vanishing point in a perspectival composition as the vertex “closing” the triangle. In the context of semiotics and edusemiotics, such an “impossible” triangle is not only possible but also necessary: it represents a paradoxical “open-enclosure” structure without which the infinite process of semiosis could not exist. The closure is only operational: “a closure which itself opens possibilities” (Colapietro, 2000, p. 145). Abduction, as the magnitude along the vertical axis, creates depth in the understanding that amounts to a sign’s ultimate intelligibility because of nonlocal “contact with some sort of Platonic world” (Penrose, 1997, p. 125). In contemporary physics the relationship between the three worlds (the physical world, the Platonic world, and the mental world) has been considered a mystery, heavily debated, and dubbed gaps in Roger Penrose’s tiling. The core of Penrose’s argument is that the physical world may be considered a projection of the Platonic world and the world
of mind arises from part of the physical world, thus enabling one in this process to insightfully grasp and thus understand some part of the Platonic world. Importantly, what “inhabits” the Platonic world is not only the True but also the Good and the Beautiful as “non-computable elements – for example, judgement, common sense, insight, aesthetic sensibility, compassion, morality” (Penrose, 1997, p. 125). The Platonic world mediates – not unlike Corbin’s mundus imaginalis or mundus archetypus – between the conscious mind (mental world) and the actual, physical world. Unconscious ideas express themselves nonverbally, at the level of feelings, abductions, and instinctive actions bypassing representation in consciousness, and hence cannot be articulated or deliberated upon in propositional language. Still, abduction is a mode of inference. Proceeding below awareness, it enables an intuitive grasp of (moral) meanings. Naturalistic ethics recapitulates ontology, but the natural world exceeds its reductive mechanistic description. Nature includes a semiotic dimension related to the human experiences from which learning arises. The abstract universals of the Platonic world, just like Peirce’s generals, exist in a semiotic relation to the particulars of actual existence: they are not strictly universal but reflect the particularities of events that, upon interpretation, acquire new meanings. The “contours” of the presupposed universals may therefore be dynamic and fuzzy. New meanings are the outcomes of learning inscribed in the unlimited process of semiosis: signs do evolve (The Platonic world of forms, besides algebra of complex numbers, also “contains geometries other than the Euclidean one, . . . infinite numbers and non-computable numbers . . . There are Turing-machine actions that never come to a halt as well as oracle machines” (Penrose, 1994, p. 412). 
This world, however, is not limited to solely mathematical truths: “Plato himself would have insisted that the ideal concept of ‘the good’ or ‘the beautiful’ must also be attributed a reality” (p. 416).). While Grush and Churchland (1995) argue against Penrose’s positing a possible direct insight into Platonic truths – and therefore understanding the meanings of the (mathematical) concepts – over following the logic of computational rules, it seems that the argument becomes moot in view of the presence of abductive inference: it is only through “a derivative way” (Deely, 1990, p. 35), which takes the form of projection (a kind of semiotic translation), that signs can reach out toward the level of the unconscious. Abduction appears to function as some sort of Geiger counter, equated by Penrose to a bridge connecting the “small” and “large” worlds, even if, according to Penrose, the nature of such an included, mediating, element in the framework of currently available scientific theories remains poorly understood. But semiotics as the science of signs leads to a better understanding of such an unorthodox expanded reality: signs are intrinsically relations and interconnections – or bridges – comprising the tri-relative, nonlinear, and self-referential process of semiosis. Thirdness as mediation is expressed by the resultant vector or a diagonal line that duly creates a closed area as the (mathematical) operation of integration and confirms Polanyi’s model of the hierarchy of knowledge. Without abduction or intuition as Firstness and mediation as Thirdness, it appears that any knowledge of the real must be limited to the realm of Platonic shadows, to illusions appearing on the walls of the infamous Cave, and be doomed to forever remain unaware of
the unconscious, thus missing the possibility of full-fledged meaningful learning. Yes, a semiotic triangle seems to close on itself – but it does so at a higher level of knowledge when the unconscious is duly integrated into consciousness. The complex plane would not be complex without the axis of imaginary numbers but would remain a Cartesian grid, preventing any genuine understanding of how new knowledge can come into existence. Without abduction or intuition, all knowledge would remain sequential, because signs would progress horizontally without being able to change direction geometrically. It is merely some prior focal knowledge that would grow in quantity, but tacit and unconscious, implicit, knowledge would lack any possibility of explication to enable new objects of knowledge – represented by the complex number a + bi pointed to by the arrow r – to enter cognition. A novel hypothesis literally, in front of one’s eyes, does bring a new direction into thinking, thus conferring meaning on experience. If one imagines positioning oneself alongside the resultant line, in the midst of genuine signs, two perspectives become clear: “Viewing a thing from the outside, considering its relations of action and reaction with other things, it appears as matter. Viewing it from the inside, looking at its immediate character as feeling, it appears as consciousness” (Peirce CP 6.268): bodily and conceptual poles coalesce, just as posited by Polanyi. The complex plane as a whole contains what Peirce would call an admixture or a weighted sum of real and imaginary components. Semiosis demands an admixture of mind-dependent and mind-independent relations comprising “dream and reality, possibility and actuality” (Deely, 2001, p. 645) that can solve the problem of intelligibility only when functioning together as a complex whole. 
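The vectorial reading of Fig. 4 can be paraphrased numerically. The sketch below assumes, for illustration only, a focal component a = 3 on the real axis and a tacit component b = 4 on the imaginary axis; it adds the two as vectors and then compares the resultant with its projection onto the horizontal axis. The variable names are hypothetical and the numbers carry no interpretive weight.

```python
import math

a, b = 3.0, 4.0                # assumed example magnitudes

focal = complex(a, 0.0)        # vector along the real (horizontal) axis
tacit = complex(0.0, b)        # vector along the imaginary (vertical) axis

r = focal + tacit              # geometrical (vector) addition: r = a + bi

length = abs(r)                # modulus of r, sqrt(a**2 + b**2)
shadow = r.real                # projection of r onto the horizontal axis
angle = math.degrees(math.atan2(r.imag, r.real))  # direction of r in degrees

print(r, length, shadow, round(angle, 2))
```

With these values the resultant has length 5, which is neither the shadow a = 3 nor the plain arithmetical sum a + b = 7, and its direction leaves the horizontal axis altogether. The projection onto that axis recovers only a.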
Even if reasoning from abduction “does not have to make separate acts of inference” and remains subconscious, “If we were to subject this subconscious process to logical analysis, we should find that it terminated in what that analysis would represent as an abductive inference” (Peirce CP 5.181). Plato’s famous division first presented as the Line in Republic VI was an indication of his envisioning the multiplicity of levels of knowledge, and the diagram in Fig. 4 enables the realization that sensible things are particular reflections (projections) of universals, or Peirce’s generals, “residing” in the intelligible realm and available to cognition through the existence of the imaginal world, the included third. To recapitulate, Plato’s theory of recollection states that thinkers always already possess all the knowledge unconsciously and can simply recognize given truths. Well, not exactly, even if the slave boy in the Meno dialogue does have some kind of “tacit precognition” (Magnani, 2001, p. 13). The diagram of abduction at once dismantles and affirms the paradox of inquiry. The self-organizing dynamics of sign relations overcomes the paradox of new knowledge as well as what appears to be the inconsistency between articulating representations and meanings in terms of Peirce’s logic of signs. A dyadic relation alone would not lead to the creation of meanings: “a sign . . . to actualize its potency, must be compelled by its object” (Peirce CP 5.554) via an abductive leap from the unconscious. The feature of double codification (analog and digital) in reading and interpreting visual signs that are spread (seemingly randomly) in a given Tarot layout relates to a specific problem in contemporary physics, namely, “the emergence of the discrete from the
continuous” (Stapp, 2007, p. 88). Physicist Henry Stapp posits the hypothetical mechanism of a spontaneous quantum reduction event “associated with a certain mathematical ‘projection’ operator” (p. 94), the action of which seems to be direct (via projection) but that also causes “indirect changes,” producing “‘faster-than-light’ effects” (p. 94). These effects indeed manifest themselves in the edusemiotics of Tarot images, not unlike Einstein’s proverbial “spooky actions at a distance” (p. 94). The Fool’s abduction initiates the evolution of thought signs even if “in the beginning – infinitely remote – there was a chaos of unpersonalized feeling” (Peirce CP 6.33), rendering the poor Fool overwhelmed and immobile indeed. The transformation of signs, including human evolution irreducible to biological process, demands perceptual (nonconceptual) abduction which “Over against any cognition” reaches toward “an unknown but knowable reality” (Peirce CP 5.257). Peirce asked, “what must be the characters of all signs used by a ‘scientific’ intelligence, that is to say, by an intelligence capable of learning by experience” (CP 2.227). This question demands an answer. Such characters may well be represented by Tarot’s visual signs, each one telling a story of a particular aspect of human experience, from which human beings can and should learn. Still, it remains habitual to subscribe to the dualistic worldview because old habits are resilient and tend to become fixed and rigid while “issuing a command to one’s future self” (Peirce CP 5.487) which, as such, tends to behave in a repetitive manner according to the gamut of unconscious habits. Worse, there is a tendency to believe in the righteousness of one’s own actions, without ever questioning them, because “Belief is . . . a habit of mind essentially enduring for some time, and mostly (at least) unconscious” (Peirce CP 5.417). 
Human beings do have an inherent “capacity for learning” (Peirce CP 5.402), a tendency toward revising beliefs and creating new habits; however, the historical emphasis on verbal language and propositional thought took away the ability to reason intuitively or abductively with the help of the right hemisphere. Thankfully, contemporary culture has begun to reinforce “the perceptual mode of the right brain. The personal computer has greatly increased the impact of the iconic revolution and continues to do so” (Shlain, 1998, p. 416). Visual abduction “catches” new knowledge because “information emerges as a result of the inherently nonlinear, functionally self-organizing dynamics” (Kelso & Engstrøm, 2006, p. 102) constituting the process of semiosis as the action of signs, which in turn are complementary pairs or “opposing tendencies [that] must coexist to make possible the creation of functional information” (p. 104). Kelso and Engstrøm contend that by imparting observable effects on the patterns of coordination, nonverbal language can be used for communication; hence “it can be thought of as functional information” (p. 98). They emphasize that functional information is never arbitrary with respect to the dynamics this information directs. The logic of the included middle – the basis for semiosis – ensures the coordination dynamics which “champions the concept of functional information, and shows that it arises as a consequence of a coupled, self-organized dynamical system living in the metastable regime where only tendencies and susceptibilities coexist” (p. 104). It is the Fool who exists in a metastable regime of potential tendencies which become actual when abduction intervenes and disturbs his fragile balance at the edge of the cliff.
Scott Kelso’s original research (1995) proved the existence of coordinated dynamic patterns at the level of both the brain and behavior. This was a point of departure for Kelso and Engstrøm in their “fascination with what seemed at first a somewhat esoteric connection between philosophy and the science of coordination” (Kelso & Engstrøm, 2006, p. xiii). Philosophy as semiotics or the science of signs, however, takes away this mystery. John Deely (2001) stressed that Peirce’s theory of signs is rooted in science and not in mysticism. Reading Tarot signs (historically considered to be occult, mystical, and esoteric) partakes of reading “Chinese writing and Egyptian hieroglyphics [as] systems of real characters” (Nöth, 1995, p. 272) that belong to the so-called universal language, the project of which so far remains incomplete. Leibniz conceived of lingua characteristica as a universal pictographic or ideographic alphabet of human thought, complemented by calculus ratiocinator and reflecting the ratio embedded in nature. He envisaged the universal ars inveniendi for the invention of new truths, together with the formal scientia generalis of all possible relations between all concepts in all branches of knowledge taken together. This unified science of all sciences called mathesis universalis would have employed a formal universal language in which Leibniz included pictures and “various graphic-geometrical figures” (Nöth, 1995, p. 274) as a possible medium of communication. The real characters of mathesis are archetypal images, ideograms, and what Leibniz noncoincidentally called “arcana.” Yet the contemporary transference of Leibniz’ dream into research in artificial intelligence does not seem to bring current science any closer to realizing his project. As Penrose insisted, it seems that “whatever brain activity is responsible for consciousness . . . it must depend upon a physics that lies beyond computational simulation” (1994, p. 411). 
Leibniz’ project refers to the injunction of knowledge representation. Analytic philosophy presents language as a system of representations a priori distinguished from signs. The representational system presupposes a class of things represented which are not representations themselves, hence outside language and outside thought. A linguistic sign (and other regimes of signs are typically ignored) is supposed to represent transparently or literally. On account of this, a poetic, artistic, or nonverbal regime of signs as a type of language that “represents” indirectly via mediation cannot be “objective” in describing reality. The object represented is habitually taken as empirically observable physical reality without realizing that semiotic reality may have its own language of expression, even if its discourse appears silent: seen but not heard. Like the analysis of any art, reading the pictorial language of Tarot “involves a crucial abductive component – abduction being the Peircean prerequisite for any gaining of new knowledge” (Stjernfelt, 2007, p. 279).
Conclusions
Examining the triadic structure of signs leads to understanding that the level of potential meanings for a given sign must exceed the steady references already present in any conscious mind because semiotic intelligence encompasses thinking (mental world) as coupled or integrated with doing (physical world): the science of
signs demonstrates harmony or analogy between “ethical reason [and] experimental logic” (Peirce CP 5.430). An expanded consciousness, in which the unconscious has been integrated, can transcend the limitations of the present and let in various opportunities afforded by an open future. The apparent closure of a semiotic triangle each and every time opens new possibilities, and the impossible triangle stops being impossible. When complemented by visual signs in addition to linguistic ones, edusemiotics – by realizing virtual meanings in practice – can paradoxically compute the essentially non-computable. Such “computation” (to a degree, of course) is unlikely to become a strict rule-based algorithm and needs clarification. At the cutting edge of philosophy of mind and cognitive science, computers are understood as dynamical systems that indeed manipulate “bits,” but these units of information are not strictly reducible to what physics calls particles. They are moments in the overall process which is represented, significantly, by analog and not solely digital information. Seth Lloyd (2006) posits a computational universe full of invisible information (of which agents would typically be unconscious) that as such computes its own evolution in a self-organizing manner, not unlike the process of semiosis. Hence follows the motto “It from bit,” or better said “It from qubit,” which means that the observable universe arises out of invisible information. Energy is needed to process this invisible information and make it relatively visible at the level of material reality observable by regular senses. Thus matter, energy, and information are connected in an interrelation that parallels the structure of Peirce’s genuine signs. It cannot be otherwise in a world perfused with signs, where matter and mind presuppose each other, thus forming a complementary pair, a bodymind indeed. For Peirce, matter is effete mind: it is endowed with protomentality. 
It is potentiality-made-actual which is exercised by the Fool in the form of abductive inference as Peirce’s Firstness. Philosopher of science (and Peirce acolyte) Abner Shimony contends that potentiality can be considered the very:
instrument whereby the embarrassing bifurcation between dim protomentality and high-level consciousness can be bridged. Even a complex organism with a highly developed brain may become unconscious. The transition between consciousness and unconsciousness need not be interpreted as a change of ontological status, but as a change of state, and properties can pass from definiteness to indefiniteness and conversely. (Shimony in Penrose, 1997, p. 151)
The postulate of propensity assigns “an ontological status to the tendencies or propensities of the various possible outcomes of a singular chance event” (Shimony, 1993, p. 237, Vol. II). The actualization – via a “magnitude of thirdness” (Deely, 1990, p. 102) – of the many potentialities hiding in unconscious nature takes place due to the subjective, bottom-up, “intervention of the mind” (Shimony, 1993, p. 319, Vol. II) into a signifying chain of semiosis by means of interpretation. Yet this very intervention is also objective by being implemented via a global, top-down, choice. A choice of this kind can be accounted for by means of what Shimony, addressing “the status of mentality in nature” (in Penrose, 1997, p. 144), dubbed the hypothetical “superselection rule” (p. 158) which appears to play a decisive role in becoming aware of the unconscious.
Lloyd (2006), stressing that universal quantum computation proceeds in a dual analog-digital mode, specifies the structure of the computational space in terms of a circuit diagram representing both logic gates (the places where “qubits” interact, thus exchanging and transforming information) and including causal connections represented by the connecting “wires” or paths along which the information flows. These moments in the flow of semiosis can be defined as discrete “bits” only within a certain context, that is, taken as already parts-of-the-whole (cf. Rockwell, 2007), quite in accord with Peirce’s conceptualizations. Such “computation” pertains to the evaluation of experience mediated by visual signs and includes Peircean abduction as hypothetical conjecture enabled by insight into the third, imaginal, world. Lloyd contends that entropy as the invisible information (signs, we should say) permeating the universe is also the measure of human ignorance. He stresses that “quantum mechanics, unlike classical mechanics, can create information out of nothing” (Lloyd, 2006, p. 118) – just like the Fool’s abductive guess at the edge of the cliff – thus removing the apparent mysticism of the unconscious-becoming-conscious: they are connected via a semiotic relation. Needless to say, the manner in which the perceived opposites “are connected – or entangled – is a very subtle thing. . . . Quantum entanglement is a very strange type of thing. It is somewhere between objects being separate and being in communication with each other – it is a purely quantum mechanical phenomenon and there is no analogue of this in classical physics” (Penrose, 1997, p. 66). Still, there is an analogue to this in semiotics and edusemiotics. If a thinker literally steps out of the framework of the Cartesian mind, forever separated from the world, and connects in practice with the embodied world of graphic signs, as during Tarot readings, then that thinker assumes a position of radical objectivity. 
This is analogous to the implications of the triangle argument presented earlier in Fig. 3, in which the imaginary “supernova” becomes conceptually equivalent to the vanishing point in a perspectival composition, referenced in the structural analysis of the diagram on the complex plane in Fig. 4. The layout of images reflects on the possibility of anticipating the future by evaluating “the options in further evolution” (Jantsch, 1980, p. 232) of signs as patterns of coordinated, interpretive activity constituting “embodied cognition” (Kelso & Engstrøm, 2006, p. 89) in the overall dynamics of the Tarot semiotic system. Postmodern interdisciplinary research should give the go-ahead to a philosophical worldview combining “the enterprise of experimental metaphysics” (Shimony, 1989, p. 64) with “criticism at its best, criticism displaying the rich art of evaluating and analyzing with knowledge and propriety the works of civilization” (Deely, 1990, p. 82). The Tarot visual system is historically among the oldest of these works. Resisting its traditionally perceived “low” status as a card game or fortune-telling device, it becomes apparent that the Tarot represents an exemplary edusemiotic system that can teach human beings about themselves and the world at large, provided that thinkers learn how to read and understand its pictorial language. Trusting the experience of abduction should help.
References
Aphek, E., & Tobin, Y. (1989). The semiotics of fortune-telling (Vol. 22, foundations of semiotics). John Benjamins Publishing Company.
Barrow, J. D. (2000). The book of nothing. Vintage Books.
Bateson, G., & Bateson, M. C. (1987). Steps to an ecology of mind. Chandler Publishing.
Berne, E. (1977). Intuition and ego states: The origins of transactional analysis. Harper & Row.
Bruner, J. S. (1979). On knowing: Essays for the left hand. Harvard University Press.
Colapietro, V. M. (2000). Further consequences of a singular capacity. In J. Muller & J. Brent (Eds.), Peirce, semiotics, and psychoanalysis (pp. 136–158). Johns Hopkins University Press.
Corbin, H. (1972). Mundus imaginalis, or the imaginary and the imaginal (R. Horine, Trans.). Analytical Psychology Club of New York, Inc.
Deely, J. (1990). Basics of semiotics. Indiana University Press.
Deely, J. (2001). Four ages of understanding: The first postmodern survey of philosophy from ancient times to the turn of the twenty-first century. University of Toronto Press.
Dewey, J. (1991). How we think. Prometheus Books.
Dummett, M. (1980). The game of tarot: From Ferrara to Salt Lake City. Gerald Duckworth & Co.
Dummett, M., & Decker, R. (2002). A history of the occult tarot, 1870–1970. Gerald Duckworth & Co.
Egorov, B. F. (1977). The simplest semiotic system and the typology of plots. In D. P. Lucid (Ed.), Soviet semiotics: An anthology (pp. 77–86). Johns Hopkins University.
Fetzer, J. H. (1991). Philosophy and cognitive science. Paragon House.
Gill, J. H. (2000). The tacit mode: Michael Polanyi’s postmodern philosophy. SUNY Press.
Grush, R., & Churchland, P. S. (1995). Gaps in Penrose’s toiling. Journal of Consciousness Studies, 2(1), 10–29.
Guiraud, P. (1975). Semiology (G. Gross, Trans.). Routledge & Kegan Paul.
Hacking, I. (1990). The taming of chance. Cambridge University Press.
Heeren, J. W., & Mason, M. (1984). Seeing and believing: A study of contemporary spiritual readers. Semiotica, 50(3/4), 191–211.
Hintikka, J. (1998). What is abduction? The fundamental problem of contemporary epistemology. Transactions of Charles S. Peirce Society, 34, 503–533.
Hoffmeyer, J., & Emmeche, C. (1991). Code-duality and the semiotics of nature. In M. Anderson & F. Merrell (Eds.), On semiotic modeling (pp. 117–166). Mouton De Gruyter.
Jantsch, E. (1975). Design for evolution: Self-organization and planning in the life of human systems. George Braziller.
Jantsch, E. (1980). The self-organizing universe: Scientific and human implications of the emerging paradigm of evolution. Pergamon Press.
Jha, S. R. (2002). Reconsidering Michael Polanyi’s philosophy. University of Pittsburgh Press.
Kauffman, L. H. (1996). Virtual logic. Systems Research, 13(3), 293–310.
Kauffman, L. H. (2001). The mathematics of Charles Sanders Peirce. Cybernetics & Human Knowing, 8(1–2), 79–110.
Kelso, J. A. S. (1995). Dynamic patterns: The self-organization of brain and behavior. The MIT Press.
Kelso, J. A. S., & Engstrøm, D. A. (2006). The complementary nature. The MIT Press.
Kennedy, J. B. (2003). Space, time and Einstein: An introduction. Acumen.
Kevelson, R. (1999). Peirce and the mark of the gryphon. St. Martin’s Press.
Kihlstrom, J. F. (1993). The rediscovery of the unconscious. In H. J. Morowitz & J. L. Singer (Eds.), The mind, the brain, and complex adaptive systems (Vol. 22, pp. 123–143). Addison-Wesley.
Lekomceva, M. I., & Uspensky, B. A. (1977). Describing a semiotic system with a simple syntax. In D. P. Lucid (Ed.), Soviet semiotics: An anthology (pp. 65–76). Johns Hopkins University.
Lloyd, S. (2006). Programming the universe: A quantum computer scientist takes on the cosmos. Alfred A. Knopf.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer Academic/Plenum Publishers.
Merrell, F. (1992). Sign, textuality, world. Indiana University Press.
Merrell, F. (1995). Peirce’s semiotics now: A primer. Canadian Scholars’ Press.
Merrell, F. (1996). Signs grow: Semiosis and life processes. University of Toronto Press.
Merrell, F. (1997). Peirce, signs, and meaning. University of Toronto Press.
Merrell, F. (2002). Learning living, living learning: Signs, between east and west. Legas.
Nöth, W. (1995). Handbook of semiotics. Indiana University Press.
Nöth, W. (2012). Translation and semiotic mediation. Sign Systems Studies, 40(3–4), 279–298.
Nöth, W., & Jungk, I. (2015). Peircean visual semiotics: Potentials to be explored. Semiotica, 2015(207), 657–673.
Otte, M. F. (2011). Evolution, learning, and semiotics from a Peircean point of view. Educational Studies in Mathematics, 77(2–3), 313–329.
Peirce, C. S. (1931–1935; 1958). Collected papers of Charles Sanders Peirce (Vol. 1–6, Ed. C. Hartshorne and P. Weiss; Vol. 7 and 8, Ed. A. Burks). Harvard University Press. [cited as CP].
Peirce, C. S., & Jastrow, J. (1884). On small differences of sensation. Memoirs of the National Academy of Sciences, 3, 75–83.
Penrose, R. (1994). Shadows of the mind: A search for the missing science of consciousness. Oxford University Press.
Penrose, R. (1997). The large, the small, and the human mind. Cambridge University Press.
Penrose, R. (2004). The road to reality: A complete guide to the laws of the universe. Jonathan Cape.
Polanyi, M. (1964). Science, faith, and society. University of Chicago Press.
Polanyi, M. (1966). The tacit dimension. Doubleday.
Posner, R. (2004). Basic tasks of cultural semiotics. In G. Withalm & J. Wallmannsberger (Eds.), Signs of power – Power of signs. Essays in honor of Jeff Bernard (pp. 56–89). INST.
Prigogine, I., & Stengers, I. (1984). Order out of chaos. Bantam Books.
Rockwell, W. T. (2007). Neither brain nor ghost: A nondualist alternative to the mind-brain identity theory. The MIT Press.
Rotman, B. (1987). Signifying nothing: The semiotics of zero. Stanford University Press.
Rotman, B. (1993). Ad infinitum – The ghost in Turing’s machine. Stanford University Press.
Rucker, R. (1982). Infinity and the mind: The science and philosophy of the infinite. Birkhauser.
Schattschneider, D. (2010). The mathematical side of M.C. Escher. Notices of the AMS, 57(6), 706–718.
Sebeok, T. A. (Ed.). (1994). Encyclopedic dictionary of semiotics (Approaches to semiotics; 73) (Vol. 1). Mouton de Gruyter.
Semetsky, I. (2004). The role of intuition in thinking and learning: Deleuze and the pragmatic legacy. Educational Philosophy and Theory, 36(4), 433–454.
Semetsky, I. (2011). Re-symbolization of the self: Human development and tarot hermeneutic. Sense Publishers.
Semetsky, I. (2013). The edusemiotics of images: Essays on the art∼science of tarot. Sense Publishers.
Semetsky, I. (2014). Taking the edusemiotic turn: A body∼mind approach to education. Journal of Philosophy of Education, 48(3), 490–506.
Semetsky, I. (2015). Interpreting Peirce’s abduction through the lens of mathematics. In M. Bockarova, M. Danesi, D. Martinovic, & R. Núñez (Eds.), Mind in mathematics: Essays on mathematical cognition and mathematical method (pp. 154–166). Lincom.
Semetsky, I. (2020). Semiotic subjectivity in education and counseling: Learning with the unconscious. Routledge.
Semetsky, I., & Stables, A. (Eds.). (2014). Pedagogy and edusemiotics: Theoretical challenges/practical opportunities. Sense Publishers.
Shimony, A. (1989). Search for a worldview which can accommodate our knowledge of microphysics. In J. T. Cushing & E. McMullin (Eds.), Philosophical consequences of quantum theory: Reflections on Bell’s theorem. University of Notre Dame Press.
Shimony, A. (1993). Search for a naturalistic world view: Scientific method and epistemology (Vol. I-II). Cambridge University Press.
Shlain, L. (1998). The alphabet versus the goddess: The conflict between word and image. Viking.
Simon, H. A. (1995). Near decomposability and complexity: How a mind resides in a brain. In H. J. Morowitz & J. L. Singer (Eds.), The mind, the brain, and complex adaptive systems. Proceedings vol. XXII. Santa Fe Institute in the sciences of complexity (pp. 25–43). Addison-Wesley.
Snyder, I. (Ed.). (1997). Page to screen: Taking literacy into the electronic era. Allen & Unwin.
Spencer-Brown, G. (1979). Laws of form. E. P. Dutton.
Stapp, H. P. (2007). Mindful universe: Quantum mechanics and the participating observer. Springer-Verlag.
Stjernfelt, F. (2007). Diagrammatology: An investigation on the borderlines of phenomenology, ontology, and semiotics. Springer.
Thom, R. (1985). From the icon to the symbol. In R. E. Innis (Ed.), Semiotics: An introductory anthology (pp. 275–291). Indiana University Press.
Tversky, B. (2019). Mind in motion: How action shapes thought. Basic Books.
Varela, F. J. (1979). Principles of biological autonomy. North Holland.
Varela, F. J. (1999). Ethical know-how: Action, wisdom and cognition. Stanford University Press.
Von Eckardt, B. (1996). What is cognitive science? The MIT Press.
On the Missing Diagrams in Category Theory
33
Eduardo Ochs
Contents
Introduction
The Conventions
Finding “the” Object with a Given Name
Freyd’s Diagrammatic Language
  Adding Quantifiers
  Adding Functors
Internal Views
  Reductions
  Functors
  Natural Transformations
  Adjunctions
  A Way to Teach Adjunctions
Types for Children
  Dependent Types
  Witnesses
  Judgments
  Set Comprehensions
  Omitting Types
  Indefinite Articles
  “Physicists’ Notation”
The Basic Example as a Skeleton
  Reconstructing Its Functors
  Natural Transformations (2)
  The Full Reconstruction
Conclusion
References
E. Ochs
Universidade Federal Fluminense, Rio das Ostras, Brazil
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_41
Abstract
Many texts on Category Theory are written in a very terse style, in which it is assumed (a) that all relevant concepts are visualizable in diagrams and (b) that the texts’ readers can abductively reconstruct the diagrams that the authors had in mind based on no more than the most essential cues. In fact, there are many tacit conventions for drawing diagrams scattered through the literature, but a single unified diagrammatic language for all expository and interpretative contexts does not exist. This chapter offers an attempt to reconstruct abductively an (imaginary) language for missing diagrams: it proposes an extensible diagrammatic language, called DL, that follows the conventions of the diagrams in the literature whenever possible and that seems to be adequate for drawing “missing diagrams” for Category Theory. Examples include the “missing diagrams” for adjunctions and for the Yoneda Lemma. It is also shown how to formalize such abductive inferences of the missing diagrams in Agda.
Keywords
Category theory · Diagrams · Diagrammatic reasoning · Formal mathematics
Introduction
This chapter introduces a diagrammatic language – called DL – that can be used to draw “missing diagrams” for Category Theory. DL is a reconstructed language, in the sense that it has been developed abductively in light of the difficulties and impasses that can arise when attempting to understand more standard notation in the current Category Theory literature. The present chapter may be understood as a comprehensive method for abductive reasoning about standard diagrammatic notation in Category Theory. The conventions of DL are explained in Section “The Conventions”. Several quotations may help to motivate the need for and development of DL. The first is from Eilenberg and Steenrod (1952, p. ix), cited in Krömer (2007, pp. 82–83):
The diagrams incorporate a large amount of information. Their use provides extensive savings in space and in mental effort. In the case of many theorems, the setting up of the correct diagram is the major part of the proof. We therefore urge that the reader stop at the end of each theorem and attempt to construct for himself the relevant diagram before examining the one which is given in the text. Once this is done, the subsequent demonstration can be followed more readily; in fact, the reader can usually supply it himself.
One often spends a lot of time in studying Category Theory trying to “supply the diagrams oneself.” In Eilenberg and Steenrod (1952), supplying the diagrams is
not especially difficult, but in books like Lane (1997), in which the most important concepts involve several categories, it may be necessary to rearrange tentative diagrams hundreds of times until one reaches “good” diagrams for understanding the material. The problem is that one may expect too much from “good” diagrams. The following quotations are from Sections 1 and 12 of a relevant article (Ochs, 2013):
Different people have different measures for “mental space”; someone with a good algebraic memory may feel that an expression like $\mathrm{Frob} : \Sigma_f (P \wedge f^* Q) \cong \Sigma_f P \wedge Q$ is easy to remember, while I always think diagramatically, and so what I do is that I remember this diagram,
and I reconstruct the formula from it.
Let’s call the “projected” version of a mathematical object its “skeleton.” The underlying idea in this chapter is that for the right kinds of projections, and for some kinds of mathematical objects, it should be possible to reconstruct enough of the original object from its skeleton and few extra clues – just like paleontologists can reconstruct from a fossil skeleton the look of an animal when it was alive.
What is needed is a diagrammatic language that would allow for the expression of the “skeletons” of categorical definitions and proofs. Such skeletons should be easy to remember, both because they should have shapes that are easy to remember and because they should be relevantly similar to “archetypal cases” (Ochs, 2013, section 16). The following sections elaborate just such a diagrammatic language, DL, which may be understood as a systematic expressive means for abductive reasoning about the standard diagrammatic notation employed in mathematical texts using Category Theory. For the sake of brevity, several examples of how to extend this diagrammatic language by adding new conventions will be omitted. The reader is referred to Ochs (2022b, Section 8) for detailed versions of such examples.
The Conventions

The conventions presented now are the ones needed to interpret the diagram below, which is essentially Proposition 1 in the proof of the Yoneda Lemma in Lane (1997, Section III.2); that diagram will be called the “Basic Example” and also “diagram Y0.”
(CD) Diagrams are made of components that are nodes and arrows. The nodes can contain arbitrary expressions. The arrows work as connectives, and each arrow can be interpreted as the top-level connective in the smallest subexpression that contains it. For example, the curved arrow in the diagram above can be interpreted as $(A \xrightarrow{\eta} RC) \leftrightarrow (B(C,-) \xrightarrow{T} A(A,R-))$.

(C→) Arrows that look like ‘→’ (\to) represent hom-sets, or, in Set, spaces of functions. When a ‘→’ arrow is named, the name stands for an element of that hom-set. For example, in $A \xrightarrow{\eta} RC$ we have η : A → RC.

(C↦) Arrows that look like ‘↦’ (\mapsto) represent internal views of functions or functors. This has some subtleties; see Section “Internal Views”.

(C↔) Arrows that look like ‘↔’ (\leftrightarrow) represent bijections or isomorphisms.

(CAI) “Above” usually means “inside”, or “internal view”. In the diagram above the morphism η : A → RC is in the category A and C is an object of the category B. Also, the arrow C ↦ RC is above $B \xrightarrow{R} A$, and this means that it is an internal view of the functor R. Note that usually is not always: $B \xrightarrow{R} A$ is not an internal view of $B(C,-) \xrightarrow{T} A(A,R-)$.

(CO) When the definition of a component of a diagram is “obvious” in the sense of “there is a unique natural construction for an object with that name,” its definition will typically be omitted, and it will be pretended that the definition is obvious; the same goes for its uniqueness. See Section “Finding “the” Object with a Given Name”.

(CC) Everything commutes by default, and non-commutative cells have to be indicated explicitly. See Section “Freyd’s Diagrammatic Language”.

(CTL) The default “meaning” for a diagram without quantifiers is the definition of its top-level component. There is a natural partial order on the
components of a diagram, in which α ≺ β iff α is “more basic” than β or, in other words, if α needs to be defined before β. In the diagram above, the top-level component is the curved bijection.

(CMQ) The default “meaning” for a diagram with quantifiers is a proposition. See Sections “Freyd’s Diagrammatic Language”, “Adding Quantifiers” and “Adding Functors” for how to obtain that proposition.

(CAdj) The best way to explain Category Theory to someone who knows just a little bit of Maths is by starting with the adjunction (×B) ⊣ (B→) of Section “Adjunctions”; the canonical way to draw it is with the left adjoint going left, the right adjoint going right, and the morphisms going down. In Proposition 1 of Lane (1997, Section III.2) the map η is a universal arrow, and someone who learns adjunctions first sees the unit maps η : A → (B→(A×B)) as the first examples of universal arrows, so that’s why the upper part of the diagram above is drawn in this position.

(CYo) In some constructions related to the Yoneda Lemma, the part of the construction that looks like a part of an adjunction is drawn as that part of an adjunction would be drawn. For example, “The functor U : Ring → Set is representable” (see Ochs, 2022b, Section 8) is drawn as

Its upper part looks like a part of an adjunction, but the rest does not.

(CDT) A diagram acts as a dictionary of default types for symbols. See Section “Omitting Types”.

(CIA) Default types allow us to use indefinite articles in a precise way. For example, we have η : Hom_A(A, RC), so “an η” means “an element of Hom_A(A, RC).” See Section “Indefinite Articles”.

(COT) A notation as close to the original text as possible will be consistently used, especially when trying to draw the missing diagrams for some existing text. If the missing diagrams for Proposition 1 of Lane (1997, Section III.2) were at issue, the expression would be this:
Note that the letters used here differ from Mac Lane’s.

(CSk) Suppose that a piece of text is given, say a paragraph P, and the task is to reconstruct the “missing diagram” D for P. Ideally this D should be a “skeleton” for P, in the sense that it should be possible to reconstruct the ideas in P from the diagram D using very few extra hints; see Ochs (2013, Sect. 12).

(CTT) Diagrams should be close to Type Theory: it should be possible to use them as a scaffolding for formalizing the text in (pseudocode for) a proof assistant.

(CFSh) The image by a functor of a diagram D is drawn with the same shape as D.

(CISh) The internal view of a diagram D is drawn with the same shape as D, modulo duplications; see Section “Internal Views”.

(CPSh) A particular case of a diagram D is drawn with the same shape as D.

(CNSh) A translation of a diagram D to another notation is drawn with the same shape as D.
The conventions (CD)–(CMQ) and (CFSh)–(CNSh) all appear in diagrams in Lane (1969), Freyd (1976), Freyd & Scedrov (1990), Taylor (1999), Riehl (2016), and Leinster (2014), but very few of them are spelled out explicitly, and the idea of “same shape” is never stressed. See Nederpelt & Geuvers (2014, p.179) for a neat example of “substitution produces something with the same shape” and Ye et al. (2020) for a language for drawing diagrams from high-level specifications in which it may be possible to implement the rules about “same shape.” Most texts on CT use diagrams to prove theorems. Here diagrams will be used to understand theorems and to translate between languages. This approach can be seen as an extension of Ganesalingam (2013) to Category Theory; see also Jansson et al. (2022), which is a recent book that follows many of the ideas in Ganesalingam (2013).
Finding “the” Object with a Given Name

Lane (1997) “defines” functors by describing their actions on objects, and leaves to the reader the task of discovering their actions on morphisms. It is instructive to see how to find these actions on morphisms.
A functor F : A → B has four components: F = (F₀, F₁, respids_F, respcomp_F). They are its action on objects, its action on morphisms, the assurance that it takes identity maps to identity maps, and the assurance that it respects compositions. When Mac Lane says this,

Fix a set B. Let (×B) denote the functor that takes each set A to A × B.

he is saying that (×B)₀ A = A × B, or, more precisely, this: (×B)₀ := λA. A × B. The “the” in the expression “Let (×B) denote the functor...” implies that the precise meaning of (×B)₁ is easy to find, and that it is easy to prove respids_(×B) and respcomp_(×B). If f : A → A′ then (×B)₁ f : (×B)₀ A → (×B)₀ A′. One knows the name of the image morphism, (×B)₁ f, and its type, (×B)₁ f : A × B → A′ × B, and it is implicit that there is an “obvious” natural construction for this (×B)₁ f from f. A natural construction is a λ-term, so we are looking for a term of type A × B → A′ × B that can be constructed from f : A → A′. In a big diagram:

A double bar in a derivation means “there are several omitted steps here,” and sometimes a double bar suggests that these omitted steps are obvious. The derivation on the left says that there is an “obvious” way to build a (×B)₁ f : A × B → A′ × B from a “hypothesis” f : A → A′. If its double bar is expanded, one gets the tree at the right. This shows that the “precise meaning” for (×B)₁ f is λp:A×B.(f(πp), π′p). More formally (and erasing a typing), (×B)₁ := λf.(λp.(f(πp), π′p)). The expansion of the double bar above becomes something more familiar by translating the trees to Logic using Curry-Howard:
The tree at the right is obtained by proof search. The operation above that obtained a term of type A × B → A′ × B will be called term search or, as it is somewhat related to type inference, term inference. Term search may yield several different constructions and trees, and so several non-equivalent terms of the desired type. When Mac Lane says “the functor (×B)” he is indicating that:

• a term for (×B)₁ is easy to find (note that the expression “a precise meaning for (×B)₁” is used here);
• all other natural constructions for something that “deserves the name” (×B)₁ yield terms equivalent to that first, most obvious one;
• proving respids_(×B) and respcomp_(×B) is trivial.

In many situations one starts with just the name of a functor, as the “(×B)” in the example above, and from that name it will be easy to find the “precise meaning” for (×B)₀, and from that the “precise meaning” for (×B)₁, and after that proofs of respids_(×B) and respcomp_(×B). The expression “...deserving the name...” will be used in this process: terms for (×B)₀, (×B)₁, respids_(×B), and respcomp_(×B) “deserve their names” if they obey the expected constraints. For a more thorough discussion, see Ochs (2013). These ideas of “finding a precise meaning” and “finding (something) deserving that name” can also be applied to morphisms, natural transformations, isomorphisms, and so on. Section “Natural Transformations (2)” will show how to find natural constructions for the two directions of the bijection in the Basic Example, or how to expand the double bars in the two derivations here:
It should be possible – in the sense of Cheng (2004) – to formalize the method that (re)constructs a functor from its action on objects or from its name. While this is not standard for Category Theory, Agda has a tool that can be used for just this: see the section on “Automatic Proof Search” in Team.
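The reconstruction of (×B)₁ from (×B)₀ described in this section can be sketched in Python, taking finite sets as objects and Python functions as morphisms. This is only an illustration under those assumptions: the names `times_B_0`, `times_B_1`, `compose`, and `identity` are hypothetical, and the two laws are checked pointwise on sample data rather than proved.

```python
# Hypothetical names: times_B_0 and times_B_1 stand for (×B)_0 and (×B)_1.

def times_B_0(A, B):
    """Action on objects: the set A × B (sets are modelled as Python sets)."""
    return {(a, b) for a in A for b in B}

def times_B_1(f):
    """Action on morphisms: the term λf.λp.(f(π p), π′ p)."""
    return lambda p: (f(p[0]), p[1])

def compose(g, f):
    """Ordinary composition g ∘ f."""
    return lambda x: g(f(x))

identity = lambda x: x

A, B = {1, 2, 3}, {"x", "y"}
f = lambda a: a * 10      # plays the role of f : A → A′
g = lambda a: a + 1       # plays the role of g : A′ → A″

pairs = times_B_0(A, B)

# respids: (×B)_1 id = id, checked pointwise on A × B
respids_ok = all(times_B_1(identity)(p) == p for p in pairs)

# respcomp: (×B)_1 (g ∘ f) = (×B)_1 g ∘ (×B)_1 f, checked pointwise on A × B
respcomp_ok = all(
    times_B_1(compose(g, f))(p) == compose(times_B_1(g), times_B_1(f))(p)
    for p in pairs
)
print(respids_ok, respcomp_ok)
```

The only design decision is the body of `times_B_1`; once its type A × B → A′ × B is fixed, applying f to the first projection and copying the second is the construction that the text calls “obvious.”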
Freyd’s Diagrammatic Language

In Freyd (1976), Peter Freyd presents a very nice diagrammatic language that can be used to express some definitions from Category Theory. For example, this is the statement that a category has all equalizers:
All cells in these diagrams commute by default, and non-commuting cells have to be indicated with a “?”. Each vertical bar with a “∀” above it means “for all extensions of the previous diagram to this one such that everything commutes”; a vertical bar with a “∃! ” above it means “there exists a unique extension of the previous diagram to this one such that everything commutes” and so on. See the paper Freyd (1976) for the basic details of how to formalize these diagrams and the book (Freyd & Scedrov, 1990, p.28 onwards), for tons of extra details, examples, and applications. The subdiagrams of a diagram like the one above will be referred to as its “stages.” Its stage 0 is empty, its stage 1 has two objects and two arrows, its last stage has four objects and five arrows, and the quantifiers separating the stages are Q1 = ∀, Q2 = ∃, Q3 = ∀, and Q4 = ∃!. They are structured like this:
Adding Quantifiers

Here is a simple way to draw all stages at once. The starting point is a diagram for the “last stage with quantifiers,” which will be called LSQ:
All the stages and quantifiers can be recovered from it. The numbered quantifiers in it are ∀1, ∃2, ∀3, and ∃!4. The highest number in them is 4, so n = 4 (n is the index of the last stage), and “stage 4 with quantifiers,” SQ4, is set to LSQ. To obtain SQ3 from SQ4, all nodes and arrows in SQ4 that are annotated with a “∃!4” are deleted; to obtain SQ2 from SQ3, all nodes and arrows in SQ3 that are annotated with a “∀3” are deleted; and so on until a diagram SQ0 is obtained (in this example, SQ0 is empty). To obtain each Sk (a stage in the original diagrammatic language from Freyd, which doesn’t have quantifiers) from the corresponding SQk, all the quantifiers in SQk are treated as mere annotations and erased; for example, “∃2 e” becomes “e,” and “∀1 A” becomes “A.” The quantifiers Q1, Q2, Q3, and Q4 that are put in the vertical bars separating the stages are ∀1, ∃2, ∀3, and ∃!4 without the numbers in the subscripts. Bonus convention: when the quantifiers in a diagram are just “∀”s and “∃!”s without subscripts, the “∀”s are to be interpreted as “∀1” and the “∃!”s as “∃!2”s.
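The recovery of the stages from LSQ is mechanical enough to be sketched in a few lines of Python. The encoding below is an assumption made for illustration: each component of LSQ is a tuple (name, kind, quantifier, index), and the component names are hypothetical, chosen only so that the counts agree with the equalizer example (two objects and two arrows in stage 1, four objects and five arrows in the last stage).

```python
# Hypothetical encoding of LSQ for the equalizer example:
# (name, kind, quantifier, stage index).
LSQ = [
    ("A", "node", "∀", 1), ("B", "node", "∀", 1),
    ("f", "arrow", "∀", 1), ("g", "arrow", "∀", 1),
    ("E", "node", "∃", 2), ("e", "arrow", "∃", 2),
    ("X", "node", "∀", 3), ("x", "arrow", "∀", 3),
    ("u", "arrow", "∃!", 4),
]

def stage_with_quantifiers(lsq, k):
    """SQ_k: delete every component annotated with an index above k."""
    return [c for c in lsq if c[3] <= k]

def stage(lsq, k):
    """S_k: SQ_k with the quantifier annotations erased."""
    return [(name, kind) for name, kind, _, _ in stage_with_quantifiers(lsq, k)]

def bar_quantifiers(lsq):
    """Q_1, ..., Q_n: the quantifier of each index, without the index."""
    by_index = {idx: q for _, _, q, idx in lsq}
    return [by_index[i] for i in sorted(by_index)]

print(stage(LSQ, 1))
print(bar_quantifiers(LSQ))
```

Deleting by index rather than one quantifier at a time gives the same stages, because SQ_k is obtained from SQ_{k+1} by removing exactly the components annotated with index k+1.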
Adding Functors

Freyd’s language can’t represent functors – in the sense of diagrams like the ones in Section “Functors” – but it would be desirable to use it to draw the missing diagrams for definitions involving functors, so the strategy is to extend it again. The following example will clarify this. This is the definition of universal arrow in Lane (1997, p.55), including the original diagram, modulo change of letters:

Definition 1. If R : B → A is a functor and A an object of A, a universal arrow from A to R is a pair (B, η) consisting of an object B of B and an arrow η : A → RB of A such that to every pair (B′, g) with B′ an object of B and g : A → RB′ an arrow of A, there is a unique arrow f : B → B′ of B with Rf ◦ η = g. In other words, every arrow h to R factors uniquely through the universal arrow η, as in the commutative diagram:

The definition itself goes only up to the “with Rf ◦ η = g,” so the part starting from “In other words” can be ignored, and a better “missing diagram” can be drawn for the definition:

This diagram is quite close to being a skeleton for the definition of universal arrow. It can be interpreted as a proposition, and the only extra hint that is needed is that “universalness” for the arrow η corresponds to the truth of that proposition. Here’s how to extract the proposition from it: in a context where A is a category, B is a category, R : B → A, A ∈ A, B ∈ B, and η : A → RB, for all B′ ∈ B and g : A → RB′ there exists a unique f : B → B′ such that Rf ◦ η = g. To convert that to a definition of universalness, the “for all” needs to be replaced by “(B, η) is a universal arrow from A to R iff for all.”
The convention for quantifiers from Section “Adding Quantifiers” allows the rewriting of the diagram in three stages above as
It can also be noticed that most typings that can be inferred from the diagram can be omitted. The diagram above can be formalized as: “in a context where (A, B, R, A, B, η) are typed as in the diagram above, (B, η) is said to be a universal arrow from A to R when ∀(B′, g).∃!f.(Rf ◦ η = g)”. And as everything commutes by default (see the convention (CC)), the (Rf ◦ η = g) can also be omitted. In Section “Omitting Types” a way to formalize this method for omitting and reconstructing types will be presented, and in Section “Indefinite Articles” a second way to define universalness will be discussed. Finally, note that erasing a node or arrow also erases everything that depends on it. In the example above, SQ2 has an arrow labeled ∃!2 f; to obtain SQ1 from SQ2, that arrow needs to be erased together with the arrow Rf and the arrow f ↦ Rf; and to obtain SQ0 from SQ1, the arrow g, the node B′, the node RB′, and the arrow B′ ↦ RB′ all need to be erased.
Internal Views

A clear way of introducing internal views is by making use of the diagram below:
The part with the two blobs and the “↦”s between them is based on how sets and functions are typically taught at an elementary level; it is an internal view of the $\mathbb{N} \xrightarrow{\sqrt{\ }} \mathbb{R}$ below it. Not all elements of ℕ are shown in the blob-view of ℕ, but the ones that are shown are named; compare this with Lawvere and Rosebrugh (2003, p.2 onwards), in which the elements are usually dots. The arrow n ↦ √n between the blobs shows a generic element of ℕ and its image, and the other “↦”s are substitution instances of it, like this:

(n ↦ √n)[n := 2] = (2 ↦ √2)

In some cases, like 4 ↦ 2, 2 can be written instead of √4, because √4 “reduces to” 2, as explained in the next section.
Reductions

The convention (C↦) for ‘↦’ (\mapsto) arrows says that an arrow α ↦ β above an arrow $A \xrightarrow{f} B$ should be interpreted as meaning f(α) ⇝ β, where “⇝” means “reduces to”; the standard example is √4 ⇝ 2. In a diagram:

The idea of reduction comes from λ-calculus. One writes $\alpha \overset{1}{\rightsquigarrow} \beta$ to say that the term α reduces to β in one step, and $\alpha \overset{*}{\rightsquigarrow} \gamma$ to say that there is a finite sequence of one-step reductions that reduce α to γ. Here the interest is in reduction in a system with constants, in which, for example, $(\sqrt{\ })(4) \overset{1}{\rightsquigarrow} 2$. Here is a directed graph that shows all the one-step reductions starting from g(2 + 3), considering g(a) = a · a + 4:
Note that all reduction sequences starting from g(2 + 3) terminate at the same term, 29 – “the term g(2 + 3) is strongly normalizing” – and reduction sequences from g(2 + 3) may “diverge” but they “converge” later: this is the “Church-Rosser property,” aka “confluence.”
A good place to learn about reduction in systems with constants is Abelson and Sussman (1996).
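The graph of one-step reductions starting from g(2 + 3) can be computed with a short Python sketch. The term encoding is an assumption made for illustration: integers are constants, `("+", a, b)` and `("*", a, b)` are sums and products, `("g", a)` is an application of g, and the one-step rules are the unfolding g(a) ⇝ a · a + 4 plus arithmetic on two numerals, applied at any position in the term.

```python
def step(t):
    """Return all terms reachable from t by one reduction step."""
    if isinstance(t, int):
        return []
    op, *args = t
    out = []
    # Reduce at the root of the term.
    if op == "g":
        (a,) = args
        out.append(("+", ("*", a, a), 4))          # unfold g(a) = a·a + 4
    elif all(isinstance(x, int) for x in args):
        a, b = args
        out.append(a + b if op == "+" else a * b)  # arithmetic on numerals
    # Reduce inside exactly one argument.
    for i, x in enumerate(args):
        for y in step(x):
            out.append((op, *args[:i], y, *args[i + 1:]))
    return out

def reduction_graph(start):
    """Map every term reachable from start to its list of one-step reducts."""
    graph, todo = {}, [start]
    while todo:
        t = todo.pop()
        if t not in graph:
            graph[t] = step(t)
            todo.extend(graph[t])
    return graph

graph = reduction_graph(("g", ("+", 2, 3)))
normal_forms = [t for t, reducts in graph.items() if not reducts]
print(len(graph), normal_forms)
```

Under this encoding the graph has eight terms and a single term without outgoing arrows, 29, which is the confluence phenomenon described above: the paths through g(5) and through (2 + 3) · (2 + 3) + 4 diverge and then meet again at 5 · 5 + 4.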
Functors

By the convention (CFSh), the image of the diagram above A in the diagram below – here “above” usually means “inside” –
is a diagram with the same shape over B. It is drawn like this:
In this case, the arrows like A1 → F A1 are not drawn because there would be too many of them – they are left implicit. The diagram above can be said to be an internal view of the functor F . To draw the internal view of the functor F : A → B the starting point is a diagram in A that is made of two generic objects and a generic morphism between them, leading to the following:
It is interesting to compare this with the diagram with blob-sets in Section “Internal Views”, in which the “n ↦ √n” says where a generic element is taken.

Any arrow of the form α ↦ β above a functor arrow $A \xrightarrow{F} B$ is interpreted as saying that F takes α to β or, in the terminology of Section “Reductions”, that Fα reduces to β. So this diagram

defines (A×) as (A×)₀ := λB. A × B, (A×)₁ := λf.λp.(πp, f(π′p)). In this case the internal views of (A×) can be used to define (A×)₁:
Natural Transformations

Suppose that functors F, G : A → B and a natural transformation T : F → G are given. An immediate way to draw an internal view of T is this:
Starting with a morphism h : C → D in A, like this,
the convention (CFSh) would yield an image of h by F and another by G, and the arrows T C and T D can be drawn to obtain a commuting square in B:
This way of drawing internal views of natural transformations yields diagrams that are too heavy, so usually they will be drawn just as the following:

Note that the input morphism is at the left, and above $F \xrightarrow{T} G$ its images by F, G, and T are drawn. When the codomain of F and G is Set, an internal view of the commuting square can sometimes also be drawn at the right, like this:

Then the commutativity of the middle square is equivalent to ∀x ∈ FC.(Gh ◦ TC)(x) = (TD ◦ Fh)(x). Note that in this case the square at the right is an internal view of an internal view. In Section “Finding “the” Object with a Given Name”, it was explained that a functor has four components. A natural transformation has two: T = (T₀, sqcond_T), where T₀ is the operation C ↦ TC and sqcond_T is the guarantee that all the induced squares commute.
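The square condition for Set-valued functors can be tested pointwise. The Python sketch below is an illustration with hypothetical choices that do not come from the text: F and G are both taken to be the “tuples of elements” functor, whose action on morphisms is `fmap`, and the component of T at every object is tuple reversal.

```python
def fmap(h):
    """Action on morphisms of the tuple functor, used here for both F and G."""
    return lambda xs: tuple(h(x) for x in xs)

def T(xs):
    """Component of the natural transformation at any object: reverse a tuple."""
    return tuple(reversed(xs))

h = lambda c: str(c) + "!"        # a morphism h : C → D in Set
samples = [(), (1,), (1, 2, 3)]   # a few elements x of FC

# sqcond_T on the samples: (Gh ∘ T_C)(x) == (T_D ∘ Fh)(x)
square_commutes = all(fmap(h)(T(x)) == T(fmap(h)(x)) for x in samples)
print(square_commutes)
```

A check on samples is of course weaker than sqcond_T itself, which quantifies over all morphisms h and all elements x.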
Adjunctions

Adjunctions will be drawn like this

with the left adjoint going left and the right adjoint going right. Preferred names for the left and right adjoints are L and R. The standard notation for that adjunction is L ⊣ R. The top-level component of the diagram above is the bijection arrow in the middle of the square; it says that Hom(LA, B) ↔ Hom(A, RB). It is implicit that bijections like that are given for all A and B; it is also implicit that the bijection is natural in some sense. Sometimes adjunction diagrams can be expanded by adding unit and counit maps, the unit and the counit as natural transformations, the actions of L and R on morphisms, and other things. For example:
The naturality conditions can be obtained by regarding the two directions of the bijection as natural transformations and drawing the internal views of their internal views:
A Way to Teach Adjunctions

Parts of the language introduced so far have been tested in a seminar course (Ochs, 2019a) where Categories were taught starting with adjunctions. The course began with λ-calculus and some sections of Ochs (2019b), and then students were asked to define each one of the operations in the right half of the diagram below as λ-terms:
Then the definitions of functors, natural transformations, and adjunctions were covered, and it was checked that the right half is a particular case (“for children”!) of the diagram for a generic adjunction in the left half. After that, and after also checking that an adjunction (∧Q) ⊣ (Q→) can be observed in the Planar Heyting Algebras of Ochs (2019b), the students were able to decipher some excerpts of Awodey (2006). From the components of the generic adjunction in the diagram above, it is possible to build this big diagram:
These names can be used for its subdiagrams:
A BCDEF G I
.
A fully specified adjunction between categories B and A has lots of components (L, R, ε, η, , , univ(ε), univ(η)), and maybe even others, but usually only some of these components are defined; there is a Big Theorem About Adjunctions (see below) that says how to reconstruct the fully specified adjunction from some of its components. Some parts of the diagram above can be interpreted as definitions, like these: Lf := (ηA ◦ f ) g := εB ◦ Lh
εB := (idRB )
ηA := (idLA )
Rk := (k ◦ ηB
)
h := Rg ◦ ηA
The subdiagrams B and F can also be interpreted in the opposite direction, as g := (∀A.∀g.∃!h)Ag = (univεB )Ag
h := (∀B.∀h.∃!g)Bh = (univηA )Bh
The notations (∀A.∀g.∃!h)Ag and (univεB )Ag are clearly abuses of language – but it’s not hard to translate them to something formal, and they can help us to understand and formalize constructions like this one
that are needed in cases like the part (ii) of the Big Theorem. The Big Theorem About Adjunctions is this – it’s the Theorem 2 in Lane (1997, page 83), but with letters changed to match the ones that are used in the diagrams: Big Theorem About Adjunctions. Each adjunction L, R, : A B is completely determined by the items in any one of the following lists: (i) Functors L, R, and a natural transformation η : idA → RL such that each ηA : A → RLA is universal to R from A. Then is defined by (6). (ii) The functor R : B → A and for each A ∈ A an object L0 A ∈ B and a universal arrow ηA : A → RL0 A from A to R. Then the functor L has object function L0 and is defined on arrows f : A → A by RLf ◦ ηA = ηA ◦ f . (iii) Functors L, R, and a natural transformation ε : LR → idB such that each εB : LRB → B is universal from L to B. Here is defined by (7). (iv) The functor L : A → B and for each B ∈ B an object R0 B ∈ A and an arrow εB : LR0 B → B universal from L to B. (v) Functors L, R and natural transformations η : idA → RL and ε : LR → idB such that both composites (8) are the identity transformations. Here is defined by (6) and by (7).
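For the adjunction (×B) ⊣ (B→) in Set, the operations that the students were asked to define can be written as Python lambdas. The sketch below is an illustration only: all function names are hypothetical, the target set is called C to avoid a clash with the fixed set B, and the two transposition maps are built from the unit and counit via the formulas h := Rg ◦ ηA and g := εB ◦ Lh that appear above.

```python
def L1(f):
    """(×B)_1 f = λp.(f(π p), π′ p)."""
    return lambda p: (f(p[0]), p[1])

def R1(g):
    """(B→)_1 g = λk.(g ∘ k)."""
    return lambda k: (lambda b: g(k(b)))

def eta(a):
    """Unit η : A → (B→(A×B)), sending a to λb.(a, b)."""
    return lambda b: (a, b)

def eps(p):
    """Counit ε : ((B→C)×B) → C, evaluation of the pair p = (k, b)."""
    k, b = p
    return k(b)

def transpose_down(g):
    """From g : A×B → C build h := Rg ∘ η : A → (B→C)."""
    return lambda a: R1(g)(eta(a))

def transpose_up(h):
    """From h : A → (B→C) build g := ε ∘ Lh : A×B → C."""
    return lambda p: eps(L1(h)(p))

g = lambda p: p[0] + 10 * p[1]    # a sample g : A×B → C
k = lambda a: (lambda b: a * b)   # a sample h : A → (B→C)

print(transpose_down(g)(2)(3), transpose_up(k)((4, 5)))
```

In this particular case the two transpositions are currying and uncurrying, so applying one after the other returns a function that agrees with the original on every argument.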
Types for Children

Some basic notions of Type Theory will be needed in the next sections, together with some nonstandard notational conventions that appear more or less naturally when Category Theory “for children” is presented in the right way.
Section 6 of Selinger (2013) has a very good presentation of types “for adults”: it uses expressions like A × B and A → B as types and treats them as purely syntactical objects, but each one comes with an “intended meaning.” A version “for children” can be defined in which these intended meanings become more concrete, and the version “for children” and the version “for adults” can be worked out in parallel.
Dependent Types

In the version “for children”:

• All types are sets.
• Some sets are types.
• Every finite subset of ℕ is a type.
• If A and B are types then A × B and A → B are types. A × B is the space of pairs of the form (a, b) in which a ∈ A and b ∈ B, and A → B is the space of functions from A to B.
• a : A means a ∈ A; the distinction between “:” and “∈” will only appear in other settings.
• “Space of” means “set of.” The space of functions from A to B is the set of all functions from A to B, and each function is considered as a set of input-output pairs. So, for example, if A = {2, 3} and B = {4, 5}, then
  A × B = {(2, 4), (2, 5), (3, 4), (3, 5)},
  A → B = { {(2,4), (3,4)}, {(2,4), (3,5)}, {(2,5), (3,4)}, {(2,5), (3,5)} }.
• If A is a type and (C_a)_{a∈A} is a family of types indexed by A then Πa:A.C_a and Σa:A.C_a are dependent types defined in the usual way, and (a:A) → C_a and (a:A) × C_a are alternate notations for Πa:A.C_a and Σa:A.C_a (see Norell & Chapman, 2008, section 2). Formally,
  Σa:A.C_a = (a:A) × C_a = { (a, c) ∈ A × (⋃_{a∈A} C_a) | a ∈ A, c ∈ C_a },
  Πa:A.C_a = (a:A) → C_a = { f : A → (⋃_{a∈A} C_a) | ∀a ∈ A. f(a) ∈ C_a }.

If A = {2, 3}, C_2 = {6, 7}, and C_3 = {7, 8}, then
  (a:A) × C_a = {(2, 6), (2, 7), (3, 7), (3, 8)},
  (a:A) → C_a = { {(2,6), (3,7)}, {(2,6), (3,8)}, {(2,7), (3,7)}, {(2,7), (3,8)} }.
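Because all types “for children” are finite sets, the dependent sum and product above can be enumerated directly. The following Python sketch assumes that a family (C_a) is given as a dictionary from elements of A to sets, and that a function is represented as a frozenset of input-output pairs; the names `sigma` and `pi` are hypothetical.

```python
from itertools import product

def sigma(A, C):
    """(a:A) × C_a: all pairs (a, c) with a in A and c in C[a]."""
    return {(a, c) for a in A for c in C[a]}

def pi(A, C):
    """(a:A) → C_a: all functions f on A with f(a) in C[a], as sets of pairs."""
    keys = sorted(A)
    choices = product(*(sorted(C[a]) for a in keys))
    return {frozenset(zip(keys, choice)) for choice in choices}

A = {2, 3}
C = {2: {6, 7}, 3: {7, 8}}
B = {4, 5}
constant = {a: B for a in A}   # the constant family gives A × B and A → B

print(sorted(sigma(A, C)))
print(sorted(sorted(f) for f in pi(A, C)))
print(sorted(sigma(A, constant)))
```

Taking the constant family C_a = B recovers the non-dependent A × B and A → B as special cases.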
Witnesses

If P is a proposition, ⟦P⟧ will signify its space of witnesses or its space of proofs. The exact definition of ⟦P⟧ will usually depend on the context; it is always the case that ⟦P⟧ = ∅ when P is false and ⟦P⟧ ≠ ∅ when P is true. In some situations, all the witnesses of a proposition P will be identified – this is called proof irrelevance; see Nederpelt & Geuvers (2014, p.340) – and all the spaces of witnesses will be either singletons or empty sets; in other situations, some ‘⟦P⟧’s will have more than one element. The notation ⟨⟨P⟩⟩ will denote a witness that P is true. Formally, ⟨⟨P⟩⟩ is a variable whose type is ⟦P⟧. A good way to remember this notation is that ⟦P⟧ looks like a box and ⟨⟨P⟩⟩ looks like something that comes in that box. In Agda, the operation “≡” returns a space of proofs of equality. If a and b are expressions with the same type, then Agda’s “a ≡ b” corresponds to ⟦a = b⟧, and people sometimes use the name “a≡b” to denote an element of a ≡ b; ⟨⟨a = b⟩⟩ will be used for that. See the section “Equality” in Wadler et al. (2020) for simple examples and Agda’s standard library for more examples.
Judgments

The main objects of Type Theory are derivable judgments. A derivable judgment is one that can appear as the root node of a derivation in which each bar is an application of one of the rules in Nederpelt & Geuvers (2014, p.127). These derivations are usually huge – for example, here is a derivation for A:Θ, B:Θ ⊢ (Πa:A.B):Θ:

so people rarely draw them explicitly, and other tools are used to show that certain judgments are derivable. Every derivable judgment obeys this (taken verbatim from Selinger 2013, p.52):

A typing judgment is an expression of the form x1 : A1, x2 : A2, . . . , xn : An ⊢ M : A. Its meaning is “under the assumption that xi is of type Ai, for i = 1 . . . n, the term M is a well-typed term of type A.” The free variables of M must be contained in x1, . . . , xn.

Understanding what this means in the version “for children” will take us quite close to understanding that in Type Theory “for adults.” That will be done in the next section. As a clarification, the main objects of the Type Theory used in Agda and in most other proof assistants are derivable judgments with definitions, as explained in Chapters 8–10 of Nederpelt & Geuvers (2014). A judgment with definitions is written as Δ; Γ ⊢ M : N, where Δ is a list of definitions (Nederpelt & Geuvers, 2014, def.9.2.1); the ‘Δ’ can be mostly ignored here.
Set Comprehensions

The part at the left of the “⊢” in a typing judgment is called a typing context. Typing contexts also appear in set comprehensions. Let’s see an example:

{ 10a + b | a ∈ {1, 2}, b ∈ {2, 3}, a < b }
{ a ∈ {1, 2}, b ∈ {2, 3}, a < b ; 10a + b }

In the second form, “a ∈ {1, 2}” and “b ∈ {2, 3}” are generators and “a < b” is a filter; together they form the context, and “10a + b” is the expression.
The comprehension { expr | context } was rewritten as { context ; expr } for clarity, and it has been marked which parts of the context act as “generators” and which ones act as “filters.” The context above can be rewritten in type-theoretical notation as a : {1, 2}, b : {2, 3}, a [...]

[...] 0 and so on. Now we also need to represent the logic program, which is a set of rules, in vector spaces. We could use the vector representation mentioned above for each rule; however, we need a way to identify each rule in the program. Thus a matrix, which is a stack of row vectors, is an ideal choice for this purpose. We can address each row as a rule in the program and set a mapping between each row index and each rule in the program. To simplify this task, we can introduce some conditions, such as that no two rules have the same head atom, so that there is a direct mapping between the indices of propositional variables and rules via the propositional variable in the rule head. Accordingly, we can have a sample matrix of the mentioned program as follows:

$$
\begin{array}{c|ccc}
  & p & q & r \\ \hline
p & 0 & 1/2 & 1/2 \\
q & 0 & 0 & 0 \\
r & 0 & 0 & 0
\end{array}
$$
T. Q. Nguyen et al.
This program has only a single And-rule with the head atom being p, so we only need to use the first row of the matrix. The reason for choosing the value 1/2 is that each variable in the rule body, q and r, contributes the same amount of information to deduce p. Next, let us see the behavior of the above matrix if we multiply it with the vector (0, 1, 1), with rows and columns indexed by p, q, r:

$$
\begin{pmatrix} 0 & 1/2 & 1/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
$$

Interestingly, the behavior is similar to applying deductive reasoning to the two variables q and r. In fact, we can do more with logic in vector spaces, as we present more formally later in this section and in the next section. In order to deal with deductive reasoning, Sakama et al. have formally described the method in Sakama et al. (2017). To extend this idea to work with abduction, we slightly modify the definition by Sakama et al. to define a matrix program of a logic program P in a vector space.

Definition 1. Matrix representation of standardized programs (Sakama et al., 2017): Let P be a standardized program with L = {p_1, . . ., p_n}. Then P is represented by a program matrix $M_P \in \mathbb{R}^{n \times n}$ (n = |L|) such that for each element $a_{ij}$ (1 ≤ i, j ≤ n) in $M_P$:

1. $a_{ij_k} = 1/m$ (1 ≤ k ≤ m; 1 ≤ i, j_k ≤ n) if $p_i \leftarrow p_{j_1} \wedge \cdots \wedge p_{j_m}$ is in P_And and m > 0;
2. $a_{ij_k} = 1$ (1 ≤ k ≤ l; 1 ≤ i, j_k ≤ n) if $p_i \leftarrow p_{j_1} \vee \cdots \vee p_{j_l}$ is in P_Or;
3. $a_{ii} = 1$ if $p_i \leftarrow$ is in P_And or $p_i \in H$;
4. $a_{ij} = 0$, otherwise.

In Definition 1, condition 3 has been updated so that $a_{ii}$ is set to 1 for all abducible atoms $p_i \in H$.

Example 1. Consider a program P = {p ← q ∧ r, p ← s ∧ t, r ← s, q ← t, s ←, t ←} with B_P = {p, q, r, s, t}. P is not a Singly-Defined (SD) program because there are two rules p ← q ∧ r and p ← s ∧ t having the same head; then P is transformed to the standardized program P′ by introducing new atoms u and v as follows: P′ = {u ← q ∧ r, v ← s ∧ t, p ← u ∨ v, r ← s, q ← t, s ←, t ←}. Then by applying Definition 1, we obtain the matrix representation M_P′ of P′. Note that we omit all zero elements of matrices in this chapter for better readability.
42 Abductive Logic Programming and Linear Algebraic Computation
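\[
\begin{array}{c|ccccccc}
  & p & q & r & s & t & u & v \\ \hline
p &  &  &  &  &  & 1 & 1 \\
q &  &  &  &  & 1 &  &  \\
r &  &  &  & 1 &  &  &  \\
s &  &  &  & 1 &  &  &  \\
t &  &  &  &  & 1 &  &  \\
u &  & 1/2 & 1/2 &  &  &  &  \\
v &  &  &  & 1/2 & 1/2 &  &
\end{array}
\]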
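The construction in Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule encoding `(head, kind, body)`, the function name `program_matrix`, and the use of exact `Fraction` entries are choices made here, not part of the original method.

```python
from fractions import Fraction

def program_matrix(atoms, rules, abducibles=()):
    """Build the program matrix of Definition 1 as a list of rows.

    rules: iterable of (head, kind, body) with kind "and" or "or";
    a fact is encoded as an And-rule with an empty body (hypothetical encoding).
    """
    index = {a: i for i, a in enumerate(atoms)}
    n = len(atoms)
    m = [[Fraction(0)] * n for _ in range(n)]
    for head, kind, body in rules:
        i = index[head]
        if not body:                       # fact: a_ii = 1
            m[i][i] = Fraction(1)
        elif kind == "and":                # And-rule: 1/m for each body atom
            for b in body:
                m[i][index[b]] = Fraction(1, len(body))
        else:                              # Or-rule: 1 for each body atom
            for b in body:
                m[i][index[b]] = Fraction(1)
    for h in abducibles:                   # condition 3: abducibles on the diagonal
        m[index[h]][index[h]] = Fraction(1)
    return m

# Standardized program P' of Example 1.
atoms = ["p", "q", "r", "s", "t", "u", "v"]
rules = [
    ("u", "and", ["q", "r"]),
    ("v", "and", ["s", "t"]),
    ("p", "or", ["u", "v"]),
    ("r", "and", ["s"]),
    ("q", "and", ["t"]),
    ("s", "and", []),
    ("t", "and", []),
]
M = program_matrix(atoms, rules)
for atom, row in zip(atoms, M):
    print(atom, [str(x) for x in row])
```

Each printed row lists the entries of one head atom, in the column order p, q, r, s, t, u, v, so it can be compared directly with the matrix of Example 1.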
Originally, Sakama et al. used an interpretation vector, which is exactly the same as the correspondent vector defined in Definition 6. A correspondent vector records the occurrence of each propositional variable. By manipulating this vector with matrix operators, we can perform logic inference in vector spaces. Just as logic inference starts from a set of facts, we use the correspondent vector of the facts as a starting point.

Definition 2. Initial vector: Let P be a program with Herbrand base L = {p_1, ..., p_n}. Then the initial vector of P is the correspondent vector of facts such that the i-th value v[i] = 1 (1 ≤ i ≤ |L|) iff the i-th atom p_i of L is a fact; otherwise v[i] = 0.

To make use of vector representations, a thresholding method is defined to perform the needed set operations in vector spaces.

Definition 3. θ-thresholding:
1. Given a value x ∈ R, define θ(x) = x′ such that x′ = 1 if x > 0; otherwise, x′ = 0.
2. Given a vector v ∈ R^n, define θ(v) = v′ such that v′[i] = 1 if v[i] > 0; otherwise v′[i] = 0.
3. Given a matrix M ∈ R^{n×m}, define θ(M) = M′ such that M′[i][j] = 1 if M[i][j] > 0; otherwise M′[i][j] = 0, where 1 ≤ i ≤ n, 1 ≤ j ≤ m.

Without ambiguity, we identify the set representation s with the vector representation v and denote both by v from now on. Henceforth, v_i is the i-th atom of L that constitutes s, while v[i] is the value of the vector at index i. In the space of correspondent vectors, we can perform set operations using numerical operations as follows:
T. Q. Nguyen et al.
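Proposition 1. The following equivalence relations hold:

\[ u \cap v = \emptyset \Leftrightarrow u \cdot v = 0 \tag{3} \]

\[ u \cap v \neq \emptyset \Leftrightarrow u \cdot v > 0 \tag{4} \]

\[ u \subseteq v \Leftrightarrow \theta(u + v) \leq \theta(v) \tag{5} \]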
where · is the inner product.

Proof.
• (3) u ∩ v = ∅ ⇔ u · v = 0
– If u ∩ v = ∅, then by Definition 6, {i | u[i] = 1} ∩ {j | v[j] = 1} = ∅. So u ∩ v = ∅ ⇒ u · v = 0.
– If u · v = 0, then there is no index i such that u[i] = 1 and v[i] = 1. So {i | u[i] = 1} ∩ {j | v[j] = 1} = ∅, that is, u ∩ v = ∅ by Definition 6.
• (4) u ∩ v ≠ ∅ ⇔ u · v > 0
– If u ∩ v ≠ ∅, then there is at least one index i such that u[i] = 1 and v[i] = 1. So u · v > 0.
– If u · v > 0, then there must be at least one index i such that u[i] = 1 and v[i] = 1. So {i | u[i] = 1} ∩ {j | v[j] = 1} ≠ ∅, that is, u ∩ v ≠ ∅ by Definition 6.
• (5) u ⊆ v ⇔ θ(u + v) ≤ θ(v)
– If u ⊆ v, then {i | u[i] = 1} ⊆ {j | v[j] = 1} by Definition 6. Letting z = u + v, we have {k | z[k] > 0} ⊆ {j | v[j] = 1}. Applying θ caps all values in those vectors at 1, so θ(z) ≤ θ(v), that is, θ(u + v) ≤ θ(v).
– If θ(u + v) ≤ θ(v), then {k | z[k] > 0} ⊆ {j | v[j] = 1}, where z = u + v. Accordingly, z ⊆ v by Definition 6. It is obvious that u ⊆ z, so u ⊆ v.

To compute the least model in vector space, Sakama et al. proposed Algorithm 1, whose result is equivalent to that of computing least models by the T_P-operator.

Algorithm 1 Fixed-point computation
Input: A definite program P with Herbrand base B_P = {p_1, p_2, ..., p_n}
Output: A fixed point of v
    Find the matrix representation M_P of P according to Definition 1.
    Find the initial vector v_0.
    v = v_0
    u = θ(M_P · v_0)
    while u ≠ v do
        v = u
        u = θ(M_P · v)
    return u

Example 2. Continue with the logic program P in Example 1: P = {p ← q ∧ r, p ← s ∧ t, r ← s, q ← t, s ←, t ←}. Standardized program: P′ = {u ← q ∧ r, v ← s ∧ t, p ← u ∨ v, r ← s, q ← t, s ←, t ←}. There are two facts s and t, so the initial vector is (0, 0, 0, 1, 1, 0, 0)^T. Then we can apply Algorithm 1 to find the least model of P′ as follows:
With atoms ordered p, q, r, s, t, u, v and the program matrix

\[
M_{P'} =
\begin{array}{c|ccccccc}
  & p & q & r & s & t & u & v \\ \hline
p &  &  &  &  &  & 1 & 1 \\
q &  &  &  &  & 1 &  &  \\
r &  &  &  & 1 &  &  &  \\
s &  &  &  & 1 &  &  &  \\
t &  &  &  &  & 1 &  &  \\
u &  & 1/2 & 1/2 &  &  &  &  \\
v &  &  &  & 1/2 & 1/2 &  &
\end{array}
\]

the three matrix-vector products of Algorithm 1 are

\[
\begin{aligned}
M_{P'} \cdot (0,0,0,1,1,0,0)^T &= (0,1,1,1,1,0,1)^T, \\
M_{P'} \cdot (0,1,1,1,1,0,1)^T &= (1,1,1,1,1,1,1)^T, \\
M_{P'} \cdot (1,1,1,1,1,1,1)^T &= (2,1,1,1,1,1,1)^T \;\xrightarrow{\;\theta\;}\; (1,1,1,1,1,1,1)^T .
\end{aligned}
\]

The third product, after θ-thresholding, returns its own input, so the fixed point is reached and the least model of P′ is {p, q, r, s, t, u, v}.
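The iteration of Algorithm 1 on the program of Example 2 can be replayed with a minimal pure-Python sketch. Two points are assumptions of this sketch rather than statements of the original method: entries are exact `Fraction` values instead of floating-point numbers, and the thresholding fires at x ≥ 1 rather than x > 0, so that an And-rule contributes only when all of its body atoms hold (on this example both thresholds lead to the same vectors). The helper names `matvec`, `theta`, and `fixed_point` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ATOMS = ["p", "q", "r", "s", "t", "u", "v"]

# Program matrix of the standardized program P' (rows = head atoms).
M = [
    [0, 0, 0, 0, 0, 1, 1],         # p <- u or v
    [0, 0, 0, 0, 1, 0, 0],         # q <- t
    [0, 0, 0, 1, 0, 0, 0],         # r <- s
    [0, 0, 0, 1, 0, 0, 0],         # s <-   (fact, diagonal entry)
    [0, 0, 0, 0, 1, 0, 0],         # t <-   (fact, diagonal entry)
    [0, HALF, HALF, 0, 0, 0, 0],   # u <- q and r
    [0, 0, 0, HALF, HALF, 0, 0],   # v <- s and t
]

def matvec(m, v):
    """Plain matrix-vector product on lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def theta(v):
    """Thresholding; assumption of this sketch: an entry counts only if it reaches 1."""
    return [1 if x >= 1 else 0 for x in v]

def fixed_point(m, v0):
    """Iterate u = theta(M v) until u no longer changes; also return the trace."""
    v = list(v0)
    u = theta(matvec(m, v))
    trace = [v, u]
    while u != v:
        v = u
        u = theta(matvec(m, v))
        trace.append(u)
    return u, trace

v0 = [0, 0, 0, 1, 1, 0, 0]          # facts s and t
least_model, trace = fixed_point(M, v0)
for step in trace:
    print(step)
print(sorted(a for a, x in zip(ATOMS, least_model) if x))
```

The trace lists the initial vector followed by the result of each thresholded product, which makes it easy to compare against a hand computation.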
The fixed-point computation is the key to many inference methods, so this algorithm has numerous applications, for example, consistency checking. In consistency checking, we have to verify that a truth assignment together with the theory does not entail ⊥. The set {⊥} can be represented by a vector v(⊥), in the same way as the initial vector in Definition 2. Consistency can then be checked through the fixed point of the matrix computation FP(M_P v) (Sakama et al., 2021), where M_P is the program matrix of P and FP(M_P v) is the vector representation of the least fixed point of the T_P-operator (van Emden & Kowalski, 1976) starting from v. FP(M_P v) is computed by Algorithm 1, and an efficient implementation using sparse representation has been developed in Nguyen et al. (2022). Based on this method, we adopt a similar idea to verify the consistency of an interpretation vector: we check whether the vector representation of FP(M_P v) contains ⊥. According to Proposition 1, this check can be done in vector space by taking the inner product of the two vectors FP(M_P v) and v(⊥). The check can also be extended to take multiple vectors (or a matrix) at once. For normal logic programs, stable models (Gelfond & Lifschitz, 1988) can also be computed linear algebraically (Sakama et al., 2017, 2021), and sparse matrix methods are effective for them too (Nguyen et al., 2022).
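The set operations of Proposition 1 and the inner-product consistency check can be illustrated as follows. This is a hedged sketch: the atom list, the name `"falsum"` standing for ⊥, and the two fixed-point vectors are hypothetical inputs chosen for illustration, not results computed from a program in this chapter.

```python
ATOMS = ["p", "q", "falsum"]        # hypothetical atoms; "falsum" stands for the atom ⊥

def vec(atom_set):
    """Correspondent vector of a subset of ATOMS."""
    return [1 if a in atom_set else 0 for a in ATOMS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def theta(v):
    """Thresholding as in Definition 3: positive entries become 1."""
    return [1 if x > 0 else 0 for x in v]

def disjoint(u, v):
    """Relation (3): the sets are disjoint iff the inner product is 0."""
    return dot(u, v) == 0

def is_subset(u, v):
    """Relation (5): u is a subset of v iff theta(u + v) <= theta(v) element-wise."""
    summed = [a + b for a, b in zip(u, v)]
    return all(a <= b for a, b in zip(theta(summed), theta(v)))

def is_consistent(fixed_point):
    """A fixed point is consistent iff it does not contain the falsum atom."""
    return disjoint(fixed_point, vec({"falsum"}))

fp_good = vec({"p", "q"})           # hypothetical fixed point without falsum
fp_bad = vec({"p", "falsum"})       # hypothetical fixed point containing falsum
print(is_consistent(fp_good), is_consistent(fp_bad))
print(is_subset(vec({"p"}), fp_good), is_subset(vec({"falsum"}), fp_good))
```

Checking several candidate vectors at once amounts to applying the same inner product to every column of a matrix.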
Abduction

Before diving into the detailed method, let us consider again the very simple example from the beginning of the previous section, a program with only a single And-rule p ← q ∧ r. Its program matrix, with rows and columns ordered p, q, r, is

\[
\begin{pmatrix} 0 & 1/2 & 1/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]

By simply transposing it, a new matrix is obtained:

\[
\begin{pmatrix} 0 & 0 & 0 \\ 1/2 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix}
\]

Now let us see the behavior if we multiply the new matrix with a vector (1, 0, 0):

\[
\begin{pmatrix} 0 & 0 & 0 \\ 1/2 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix}
\cdot
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 1/2 \\ 1/2 \end{pmatrix}
\]
Surprisingly, the behavior is similar to abducing the explanation of p: in order to explain p, we have to explain both q and r. Of course, this example is not general enough, but it gives an initial idea of how to compute abduction in vector spaces. Now let us move on to the more formal theory of the linear algebraic method for abduction.

Definition 4. Horn clause abduction: A propositional Horn clause abduction problem (PHCAP) is an abduction problem consisting of a tuple ⟨L, H, O, P⟩, where H ⊆ L (called hypotheses or abducibles), O ⊆ L (called observations), and P is a consistent Horn logic program.

In this chapter, we assume that a program P is acyclic (Apt & Bezem, 1991) and in its standardized form. A program P is acyclic if the dependency graph of P is acyclic, where the dependency graph of a logic program P is a graph (V, E) whose nodes V are the atoms of P and in which, for each rule of P, there are edges in E from the atoms appearing in the body to the atom in the head. Without loss of generality, we assume that no abducible atom h ∈ H appears in the head of any rule in P. If there exist h ∈ H and a rule r : h ← body(r) ∈ P, then r can be replaced by r′ : h ← body(r) ∨ h′ in P, and h is replaced by h′ in H. If r is in the form (2), then r′ is an Or-rule and there is no need to update r′ further. On the other hand,
if r is in the form (1), then we can turn r′ into an Or-rule by introducing an And-rule b_r ← body(r) in P and then replacing body(r) by b_r in r′.

Definition 5. Explanation of PHCAP: A set E ⊆ H is a solution of a PHCAP ⟨L, H, O, P⟩ if P ∪ E ⊨ O and P ∪ E is consistent. E is also called an explanation of O. An explanation E of O is minimal if there is no explanation E′ of O such that E′ ⊂ E.

Deciding whether a PHCAP has a solution is NP-complete (Selman & Levesque, 1990; Eiter & Gottlob, 1995). In this chapter, we want to find the set ℰ of minimal explanations E for a PHCAP ⟨L, H, O, P⟩. In a PHCAP, P is partitioned into P_And ∪ P_Or, where P_And is a set of And-rules of the form (1) and P_Or is a set of Or-rules of the form (2). Given P, define head(P) = {head(r) | r ∈ P}, head(P_And) = {head(r) | r ∈ P_And}, and head(P_Or) = {head(r) | r ∈ P_Or}.

The key idea of logic inference in vector spaces is to encode set operations as manipulations of real values. We give an overview of how to create the vector space and how to perform set operations on it.

Definition 6. Correspondent vector of PHCAP: Any subset s ⊆ L can be represented by a correspondent vector v of length |L| such that the i-th value v[i] = 1 (1 ≤ i ≤ |L|) iff the i-th atom p_i of L is in s; otherwise v[i] = 0.

This definition is similar to that of an interpretation vector in Sakama et al. (2017). Here the vector needs to be defined for both deductive and abductive reasoning, so the term correspondent vector has been introduced.

Definition 7. Abductive matrix of PHCAP: Suppose that a PHCAP has P with its program matrix M_P. The abductive matrix of P is the transpose of M_P, denoted M_P^T.

Let us consider the logic circuit in Fig. 1, which can be formulated as the PHCAP in Example 3.

Example 3. Consider a PHCAP such that L = {p, q, r, s, h1, h2, h3}, O = {p}, H = {h1, h2, h3}, P = {p ← q ∧ r, q ← h1 ∨ s, r ← s ∨ h2, s ← h3}.
Fig. 1 An example of logic circuit (node labels in the figure: h1, h2, h3, s, q, r, p)
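The program matrix and the abductive matrix of P are

\[
M_P =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
p   &  & 1/2 & 1/2 &  &  &  &  \\
q   &  &  &  & 1 & 1 &  &  \\
r   &  &  &  & 1 &  & 1 &  \\
s   &  &  &  &  &  &  & 1 \\
h_1 &  &  &  &  & 1 &  &  \\
h_2 &  &  &  &  &  & 1 &  \\
h_3 &  &  &  &  &  &  & 1
\end{array}
\]

\[
M_P^T =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
p   &  &  &  &  &  &  &  \\
q   & 1/2 &  &  &  &  &  &  \\
r   & 1/2 &  &  &  &  &  &  \\
s   &  & 1 & 1 &  &  &  &  \\
h_1 &  & 1 &  &  & 1 &  &  \\
h_2 &  &  & 1 &  &  & 1 &  \\
h_3 &  &  &  & 1 &  &  & 1
\end{array}
\]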
In abduction, we also use the correspondent vector of Definition 6 to represent an explanation, that is, a set of propositional variables. As before, we identify the set representation s with the vector representation v and denote both by v; v_i is the i-th atom of L that constitutes s, while v[i] is the value of the vector at index i. In some specific cases, v is employed as a function that outputs the correspondent vector of a subset: v(O) is the observation vector, v(H) the hypotheses vector, v(⊥) the integrity vector (shorthand for v({⊥}), where {⊥} ⊂ L), v(head(P_And)) the vector of all head atoms of And-rules in P_And, and v(head(P_Or)) the vector of all head atoms of Or-rules in P_Or. This notation makes it easier to index each element of the set and each value of the vector. If there is no need to indicate each individual item, the function notation v() can be omitted. The goal of PHCAP is to find the set ℰ of minimal explanations according to Definition 5. Using Definition 6, we can represent any E ∈ ℰ by a column vector E ∈ R^{|L|×1}. To compute E, we define an interpretation vector v ∈ R^{|L|×1}. We use
the interpretation vector v to demonstrate the linear algebraic computation of abduction, which reaches an explanation E starting from the initial vector v = v(O), the observation vector (as stated before, we may write O for this vector without the function notation v()). At each computation step, the interpretation vector v can be read as follows: in order to explain O, we have to explain all atoms v_i such that v[i] > 0.

Definition 8 (Explanation vector). The interpretation vector v reaches an explanation E if v ⊆ H. This condition can be written in linear algebra as follows:

\[ \theta(v + H) \leq \theta(H) \tag{6} \]
where H is the shorthand of v(H), which is the hypotheses set/vector mentioned above. We now define one-step abduction in PHCAP step by step. To denote the interpretation vector v at a step t, the superscript (t) is introduced.

Definition 9. One-step abduction for P_And of a vector: We can obtain a reduct abductive matrix M_P(P_And)^T from M_P^T by removing all columns w.r.t. Or-rules in P_Or. Then we define the one-step abduction for P_And as

\[ v^{(t+1)} = M_P(P_{And})^T \cdot v^{(t)} \tag{7} \]
The one-step abduction (7) is a reverse version of the T_P-operator on a single vector. By transposing the program matrix into an abductive matrix, the abductive step is represented in a vector space and computes the explanation v^(t+1) for v^(t). This step corresponds to a deductive step through Clark completion in an SD program (Console et al., 1991). Suppose that there is an index i such that v_i ∈ v^(t) ∩ head(P_And), that is, v^(t)[i] > 0; according to Definitions 1 and 7, there is a column w.r.t. v_i in M_P(P_And)^T, denoted by col(v_i). By applying (7), v^(t+1)[j] = v^(t)[i] / |col(v_i)| for any j such that v_j ∈ col(v_i). The vector v^(t+1) then represents the set of atoms required to explain v^(t).

Example 4 (cont. Example 3). P_And = {p ← q ∧ r, s ← h3}. We obtain the reduct abductive matrix M_P(P_And)^T by removing the columns w.r.t. the rules {q ← h1 ∨ s, r ← s ∨ h2} from the original abductive matrix. Consider applying the one-step abduction for P_And with v^(t) = O:
\[
v^{(t)} = (1, 0, 0, 0, 0, 0, 0)^T \;(= O)
\]

\[
v^{(t+1)} = M_P(P_{And})^T \cdot v^{(t)} =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
p   &  &  &  &  &  &  &  \\
q   & 1/2 &  &  &  &  &  &  \\
r   & 1/2 &  &  &  &  &  &  \\
s   &  &  &  &  &  &  &  \\
h_1 &  &  &  &  & 1 &  &  \\
h_2 &  &  &  &  &  & 1 &  \\
h_3 &  &  &  & 1 &  &  & 1
\end{array}
\cdot
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 1/2 \\ 1/2 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\]
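The one-step abduction for P_And in Example 4, together with the explanation test of Definition 8, can be sketched as follows. The assumptions are stated here and in the comments: the reduct is obtained by zeroing (rather than physically deleting) the columns of Or-rule heads so that the matrix keeps its shape, entries are exact `Fraction` values, and all helper names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ATOMS = ["p", "q", "r", "s", "h1", "h2", "h3"]

# Program matrix M_P of Example 3 (rows = head atoms, abducibles on the diagonal).
M_P = [
    [0, HALF, HALF, 0, 0, 0, 0],   # p <- q and r
    [0, 0, 0, 1, 1, 0, 0],         # q <- h1 or s
    [0, 0, 0, 1, 0, 1, 0],         # r <- s or h2
    [0, 0, 0, 0, 0, 0, 1],         # s <- h3
    [0, 0, 0, 0, 1, 0, 0],         # h1 (abducible)
    [0, 0, 0, 0, 0, 1, 0],         # h2 (abducible)
    [0, 0, 0, 0, 0, 0, 1],         # h3 (abducible)
]
OR_HEADS = {"q", "r"}

def transpose(m):
    return [list(col) for col in zip(*m)]

def and_reduct(m_t, or_heads):
    """Assumption: 'removing' the columns of Or-rule heads is done by zeroing them."""
    drop = {ATOMS.index(a) for a in or_heads}
    return [[0 if j in drop else x for j, x in enumerate(row)] for row in m_t]

def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def theta(v):
    return [1 if x > 0 else 0 for x in v]

def is_explanation(v, h):
    """Condition (6): theta(v + H) <= theta(H), element-wise."""
    summed = [a + b for a, b in zip(v, h)]
    return all(a <= b for a, b in zip(theta(summed), theta(h)))

M_and_T = and_reduct(transpose(M_P), OR_HEADS)
O = [1, 0, 0, 0, 0, 0, 0]           # observation vector for O = {p}
H = [0, 0, 0, 0, 1, 1, 1]           # hypotheses vector for H = {h1, h2, h3}

v1 = matvec(M_and_T, O)             # one-step abduction (7)
print([str(x) for x in v1])
print(is_explanation(v1, H))
```

The resulting vector still contains q and r, which are not abducibles, so the test of Definition 8 fails at this step and further abduction steps are needed.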
The vector v^(t+1) can be interpreted as follows: in order to explain p, both q and r are to be explained. Definition 9 shows that we can apply the one-step abduction (7) repeatedly, starting from v^(0) = O, until the vector reaches an explanation according to the condition in Definition 8 and satisfies consistency. However, the condition of Definition 8 may never hold when the interpretation vector contains an atom for which no rule in P_And can be applied to find its explanation.

Proposition 2. The summation of v^(t) is bounded:

\[ \mathrm{sum}(v^{(t+1)}) \leq \mathrm{sum}(v^{(t)}) \leq \cdots \leq \mathrm{sum}(v^{(0)}) \tag{8} \]
where sum(v) = Σ_{v_i ∈ v} v[i]. This proposition follows directly from Definitions 1 and 7. For simplicity, we initialize the starting point v^(0) so that sum(v^(0)) = 1. If there are multiple observations o_1, o_2, ..., o_k ∈ O, a new atom o is introduced to replace the current observation set, and a new conjunctive rule o ← o_1 ∧ o_2 ∧ ··· ∧ o_k is added to the theory P. We can then initialize the starting point as O = {o}, so that the summation of the corresponding vector is 1. From now on, we assume sum(v^(0)) = 1 without loss of generality.

Proposition 3. If sum(v^(t)) < 1, then v^(t) ∪ P_And ⊭ O.

Combining Definition 9 with Definition 8, Proposition 3, and the consistency condition, we can deal with And-rules in P_And. This is just an initial step toward solving the PHCAP ⟨L, H, O, P⟩. In abductive reasoning, Or-rules are more complicated to
handle because they increase the number of possible explanations. Hence, we need an efficient method for dealing with the growth of possibilities in vector spaces. According to Definition 6, an interpretation v can be represented by a column vector v ∈ R^{|L|×1}. Multiple vectors v can be stacked up to form an interpretation matrix M ∈ R^{|L|×|M|}, while all definitions and propositions with the one-step abduction for P_And of a vector still work. Therefore, Definition 9 can be rewritten as follows:

Definition 10. One-step abduction for P_And:

\[ M^{(t+1)} = M_P(P_{And})^T \cdot M^{(t)} \tag{9} \]
From now on, M denotes a matrix that is equivalent to a vector of vectors or a set of sets, and |M| denotes the number of vectors or sets in M. As before, M_i is the i-th set of M, while M[i] is the vector at index i. Let v be an interpretation vector in ⟨L, H, O, P⟩ such that v ∩ head(P_Or) = {head(r_1), head(r_2), ..., head(r_k)} with r_1, r_2, ..., r_k ∈ P_Or. In order to compute explanations of v, we have to explore all combinations c extracted from {body(r_1), body(r_2), ..., body(r_k)} such that ∀j ∈ {1, 2, ..., k}, c ∩ body(r_j) ≠ ∅. This is equivalent to enumerating the Minimal Hitting Sets (MHS) of the input family {body(r_1), body(r_2), ..., body(r_k)} (Gainer-Dewar & Vera-Licona, 2017). MHS(S) denotes the set of all MHSs of a family S of sets to be hit. Now let us define the one-step abduction for P_Or.

Definition 11. One-step abduction for P_Or:

\[
M^{(t+1)} = \bigcup_{\forall v \in M^{(t)}} \; \bigcup_{\forall s \in MHS(S(v, P_{Or}))} \big( v \setminus head(P_{Or}) \big) \cup s \tag{10}
\]
where S(v, P_Or) = {body(r_1), body(r_2), ..., body(r_k)} is the family of sets to be hit, with v ∩ head(P_Or) = {head(r_1), head(r_2), ..., head(r_k)}. Note that the values of every new vector v ∈ M^(t+1) are reallocated so that sum(v) = 1, in order to maintain the condition in Proposition 3 for the one-step abduction (9) for P_And.

Example 5 (cont. Example 4). P_Or = {q ← h1 ∨ s, r ← s ∨ h2}. We use the output of Example 4 as the input of the one-step abduction for P_Or, but now we treat it as a matrix instead:
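\[
(M^{(t)})^T =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0
\end{array}
\]

\[
\begin{aligned}
M^{(t)} &= \{\{q, r\}\} \\
S(M^{(t)}_0, P_{Or}) &= \{\{h_1, s\}, \{s, h_2\}\} \\
MHS(S(M^{(t)}_0, P_{Or})) &= \{\{s\}, \{h_1, h_2\}\} \\
M^{(t+1)} &= \{\{s\}, \{h_1, h_2\}\}
\end{aligned}
\]

\[
(M^{(t+1)})^T =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0
\end{array}
\]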
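Example 5 can be reproduced with a small set-based sketch of the one-step abduction for P_Or. It is an illustration only: the hitting sets are enumerated naively (one element picked from each set, then non-minimal candidates discarded), interpretations are handled as sets and converted to vectors afterwards, and the names `minimal_hitting_sets`, `or_step`, and `to_vector` are hypothetical.

```python
from fractions import Fraction
from itertools import product

ATOMS = ["p", "q", "r", "s", "h1", "h2", "h3"]
OR_RULES = {"q": {"h1", "s"}, "r": {"s", "h2"}}   # head -> disjunctive body

def minimal_hitting_sets(family):
    """Naive enumeration: pick one element per set, keep only the minimal picks."""
    candidates = {frozenset(choice) for choice in product(*family)}
    return {c for c in candidates if not any(other < c for other in candidates)}

def or_step(interpretations, or_rules):
    """Set-based reading of (10): replace Or-rule heads by minimal hitting sets."""
    or_heads = set(or_rules)
    result = set()
    for v in interpretations:
        family = [or_rules[h] for h in v & or_heads]
        for s in minimal_hitting_sets(family):
            result.add(frozenset(v - or_heads) | s)
    return result

def to_vector(atom_set):
    """Reallocate values so that the entries of the vector sum to 1."""
    weight = Fraction(1, len(atom_set))
    return [weight if a in atom_set else Fraction(0) for a in ATOMS]

M_t = {frozenset({"q", "r"})}          # output of Example 4, read as a set of sets
M_next = or_step(M_t, OR_RULES)
for interpretation in sorted(M_next, key=sorted):
    print(sorted(interpretation), [str(x) for x in to_vector(interpretation)])
```

The naive enumerator grows with the product of the body sizes, which is why a dedicated solver becomes attractive for larger families of sets.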
Up to now, the one-step abductions for P_And and P_Or have been defined. Although neither method alone is sufficient to solve the PHCAP ⟨L, H, O, P⟩, their characteristics are important for defining a general approach.

Definition 12. Or-computable and And-computable:
1. A vector v is Or-computable iff v ∩ head(P_Or) ≠ ∅.
2. A matrix M is Or-computable iff ∃v ∈ M, v is Or-computable.
3. A vector v is And-computable iff v is not Or-computable.
4. A matrix M is And-computable iff ∀v ∈ M, v is not Or-computable.
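Definition 12 translates almost literally into code when interpretations are handled as sets. The following is a minimal sketch under that assumption; the function names are hypothetical and the data are taken from Examples 3 and 5.

```python
OR_HEADS = {"q", "r"}                 # head(P_Or) in Example 3

def vector_or_computable(v):
    """Definition 12(1): v shares at least one atom with head(P_Or)."""
    return bool(v & OR_HEADS)

def matrix_or_computable(m):
    """Definition 12(2): some vector of M is Or-computable."""
    return any(vector_or_computable(v) for v in m)

def matrix_and_computable(m):
    """Definition 12(4): no vector of M is Or-computable."""
    return all(not vector_or_computable(v) for v in m)

M_before = [frozenset({"q", "r"})]                        # input of Example 5
M_after = [frozenset({"s"}), frozenset({"h1", "h2"})]     # output of Example 5
print(matrix_or_computable(M_before), matrix_and_computable(M_before))
print(matrix_or_computable(M_after), matrix_and_computable(M_after))
```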
Based on the two one-step abductions (9) and (10), an exhaustive search strategy is proposed to solve the PHCAP ⟨L, H, O, P⟩ in a vector space, as illustrated in Algorithm 2. Some explanations are in order:
• Step 7: sum_col(M′) sums each vector v ∈ M′ and returns a vector of these sums. We then compare each element of this vector with 1 − ε, following Proposition 3, to obtain a corresponding Boolean vector. Because of numerical issues with floating-point numbers, e.g., 1/3 + 1/3 + 1/3 = 0.999..., a small ε is introduced to relax the condition in Proposition 3. The best choice of ε depends on the actual ⟨L, H, O, P⟩. If ε is too small, we may filter out good interpretations, and the algorithm might not give the expected output. If ε is too large, we may waste computation on unexplainable paths.
• Step 8: We use the Boolean vector from Step 7 to eliminate unexplainable interpretations, keeping only the vectors whose Boolean value is False. [] is the projection operator that extracts from M′ only the vectors satisfying the condition inside []. We also use this projection in Steps 15–18.
Algorithm 2 Explanations finding in a vector space
Input: PHCAP consisting of a tuple ⟨L, H, O, P⟩
Output: Set of explanations ℰ
 1: Create an abductive matrix M_P^T from P
 2: Initialize the observation matrix M from O
 3: ℰ = ∅
 4: while True do
 5:     M′ = M_P^T · M
 6:     M′ = consistent(M′)
 7:     v_sum = sum_col(M′) < 1 − ε                  ▷ Proposition 3
 8:     M′ = M′[v_sum = False]
 9:     if M′ = M or M′ = ∅ then
10:         v_ans = θ(M + H) ≤ θ(H)                  ▷ Definition 8
11:         ℰ = ℰ ∪ M[v_ans = True]
12:         return minimal(ℰ)                        ▷ Minimality check
13:     do
14:         v_ans = θ(M′ + H) ≤ θ(H)                 ▷ Definition 8
15:         ℰ = ℰ ∪ M′[v_ans = True]
16:         M′ = M′[v_ans = False]
17:         M = M ∪ M′[not Or-computable]
18:         M′ = M′[Or-computable]
19:         M′ = ⋃_{∀v∈M′} ⋃_{∀s∈MHS(S(v, P_Or))} (v \ head(P_Or)) ∪ s
20:         M′ = consistent(M′)
21:     while M′ ≠ ∅
• Step 12: The minimality check is applied to the set ℰ to eliminate redundant explanations according to Definition 5. It is implemented by sorting all E ∈ ℰ by their cardinality and then applying a simple set iteration loop.
• Steps 16, 18–19: Construct a matrix M′ that is Or-computable, and then perform the one-step abduction (10). Multiple MHS problems arise during the search. A naive approach first enumerates all combinations and then applies a minimality check similar to Step 12. However, this implementation can only deal with up to 500,000 combinations; therefore, we use PySAT (https://github.com/pysathq/pysat) to solve large-size MHS problems (Ignatiev et al., 2018).

Theorem 1 (Nguyen et al., 2021). The output of Algorithm 2 is the set of all minimal explanations of the PHCAP ⟨L, H, O, P⟩.

Example 6. Let us demonstrate how to solve the PHCAP in Example 3 using Algorithm 2. The first iteration of Algorithm 2 has already been carried out in Examples 4 and 5. We continue with the next iteration, using the interpretation matrix M = M^(t+1) obtained in Example 5:
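\[
M^T =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0
\end{array}
\]

\[
M'^T = (M_P^T \cdot M)^T =
\begin{array}{c|ccccccc}
  & p & q & r & s & h_1 & h_2 & h_3 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0
\end{array}
\]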
Here Algorithm 2 stops because all interpretations reach explanations in the sense of Definition 8 and satisfy the consistency condition, so that M′ = ∅ afterwards. Finally, the algorithm applies the minimality check and outputs the set of minimal explanations ℰ = {{h3}, {h1, h2}}. The main idea of Algorithm 2 is to apply the one-step abductions (9) and (10) repeatedly in a vector space. Except for the MHS enumerator, almost everything can be implemented using matrix operations. Therefore, it is possible to gain a considerable performance boost by implementing a parallel version of Algorithm 2 on top of a more powerful BLAS library, e.g., Intel MKL or NVIDIA cuBLAS.
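The whole search can be replayed on Example 3 with a compact sketch. It is a simplified variant written for illustration, not a transcription of Algorithm 2: it alternates the And-step (7) and the Or-step (10) on a set of interpretation vectors, uses exact `Fraction` values so that no ε is needed, enumerates hitting sets naively, and omits the consistency check because Example 3 has no integrity constraint. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ATOMS = ["p", "q", "r", "s", "h1", "h2", "h3"]
AND_RULES = {"p": ["q", "r"], "s": ["h3"]}          # head -> conjunctive body
OR_RULES = {"q": {"h1", "s"}, "r": {"s", "h2"}}     # head -> disjunctive body
HYPOTHESES = frozenset({"h1", "h2", "h3"})
IDX = {a: i for i, a in enumerate(ATOMS)}

def and_abductive_matrix():
    """Transposed program matrix restricted to And-rules and abducibles."""
    n = len(ATOMS)
    m_t = [[Fraction(0)] * n for _ in range(n)]
    for head, body in AND_RULES.items():
        for b in body:
            m_t[IDX[b]][IDX[head]] = Fraction(1, len(body))
    for h in HYPOTHESES:
        m_t[IDX[h]][IDX[h]] = Fraction(1)
    return m_t

def matvec(m, v):
    return tuple(sum(a * x for a, x in zip(row, v)) for row in m)

def support(v):
    return frozenset(a for a, x in zip(ATOMS, v) if x > 0)

def to_vector(atom_set):
    """Spread a total weight of 1 evenly over the atoms of the set."""
    w = Fraction(1, len(atom_set))
    return tuple(w if a in atom_set else Fraction(0) for a in ATOMS)

def minimal_sets(sets):
    return {s for s in sets if not any(t < s for t in sets)}

def hitting_sets(family):
    """Naive minimal hitting sets: one pick per set, then a minimality filter."""
    return minimal_sets({frozenset(c) for c in product(*family)})

def explain(observation):
    m_t = and_abductive_matrix()
    frontier = {to_vector(frozenset(observation))}
    explanations = set()
    while frontier:
        next_frontier = set()
        for v in frontier:
            atoms = support(v)
            heads = atoms & set(OR_RULES)
            if atoms <= HYPOTHESES:                  # Definition 8
                explanations.add(atoms)
            elif heads:                              # Or-step, equation (10)
                for hs in hitting_sets([OR_RULES[h] for h in heads]):
                    next_frontier.add(to_vector((atoms - heads) | hs))
            else:                                    # And-step, equation (7)
                w = matvec(m_t, v)
                if sum(w) == 1:                      # Proposition 3: drop lost mass
                    next_frontier.add(w)
        frontier = next_frontier
    return minimal_sets(explanations)

result = explain({"p"})
print(sorted(sorted(e) for e in result))
```

Keeping the frontier as a set of tuples removes duplicate interpretations for free; a matrix-based implementation would instead stack the vectors as columns and apply the And-step to all of them with a single product.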
Related Works

An empirical evaluation of the proposed method, with both dense and sparse representations, has been reported in Nguyen et al. (2021). Experimental results demonstrate that the proposed algorithm is competitive with other existing methods. Moreover, the proposed method is parallelizable because the data structure is simple and the computation on each interpretation vector is independent of the others. Hence, there is much room for further improvement using a more powerful BLAS library designed for parallel computing.

Propositional abduction has been solved using propositional satisfiability (SAT) techniques in Ignatiev et al. (2016), in which a quantified MaxSAT is employed and implicit hitting sets are computed. Another approach to abduction is based on the search for stable models of a logic program (Gelfond & Lifschitz, 1988). Saikko et al. (2016) developed a technique to encode propositional abduction problems as disjunctive logic programs under answer set semantics. Answer set programming has also been employed for first-order Horn abduction in Schüller (2016), in which all atoms are abduced and weighted abduction is employed.

In terms of linear algebraic computation, Sato et al. developed an approximate computation to abduce relations in Datalog (Sato et al., 2018), which is a new form of predicate invention in inductive logic programming (Muggleton, 1991). They conducted experiments on linear and recursive cases and showed that the approach can successfully abduce base relations, but their method cannot compute explanations consisting of possible abducibles in diagnosis.
The most recent work on the linear algebraic approach to abduction is Aspis et al. (2018), who employ a third-order tensor to define an explanatory operator for computing abduction in Horn propositional programs. Aspis et al. propose encoding each rule of the program as a slice of a third-order tensor, so that the growth of the interpretation matrix is obtained naturally. They then only consider removing columns that are duplicated or inconsistent with the program. In terms of growing the interpretation matrix in size, this is similar to our one-step abduction for P_Or. According to our analysis, the method of Aspis et al. can be improved in two respects to avoid redundant computation. First, all slices for And-rules can be merged into a single slice to limit the growth of the output matrix. Second, an MHS-based elimination strategy should be incorporated; otherwise, the method wastes a lot of computation on explanations that are not minimal. To the best of our knowledge, it is not trivial to implement an efficient method in a vector space that enumerates exactly all MHSs as required by Definition 11. Hence, to implement (10) at this time, all interpretations are treated as sets instead of vectors. Fortunately, the vector-set conversion can be performed at minimal cost using the sparse representation reported in Nguyen et al. (2021).
Conclusion

According to the experimental results reported in Nguyen et al. (2021), the implementation using sparse matrices is the most stable among the algorithms that solve the highest number of samples. There is still much room for improving Algorithm 2, either by using a more powerful BLAS library designed for parallel computing or by combining it with other advanced techniques that avoid redundant computation. The merit of solving PHCAP in vector space is not only scalability but also the capability of integration with other AI techniques, e.g., Artificial Neural Networks (ANN). A more compiled method can be considered for handling consistency by computing the minimal explanations of ⊥ in a matrix M_⊥, which correspond to the nogoods of the ATMS. Using linear algebraic methods, checking the consistency of an interpretation vector v then amounts to verifying that M_⊥ · v = 0. In addition, handling the MHS problem in vector space is a promising research topic. If the MHS problem can be handled efficiently in the vector space, we can unlock the capability of GPU computing for solving large-size PHCAPs. Future work includes developing an efficient method for abduction with normal logic programs in vector spaces.

Acknowledgements

This work has been supported by JSPS KAKENHI Grant Numbers JP18H03288 and JP21H04905, and by JST CREST Grant Number JPMJCR22D3, Japan. Tuan Nguyen Quoc has also been supported by Monbukagakusho (MEXT) Scholarship and Japan International Cooperative Agency “Innovative Asia”.
References

Apt, K. R., & Bezem, M. (1991). Acyclic programs. New Generation Computing, 9, 335–364.
Aspis, Y., Broda, K., & Russo, A. (2018). Tensor-based abduction in horn propositional programs. In ILP 2018 (CEUR Workshop Proceedings, Vol. 2206, pp. 68–75).
Boutilier, C., & Beche, V. (1995). Abduction as belief revision. Artificial Intelligence, 77(1), 43–94.
Console, L., Dupré, D. T., & Torasso, P. (1991). On the relationship between abduction and deduction. Journal of Logic and Computation, 1(5), 661–690.
Dai, W.-Z., Xu, Q., Yu, Y., & Zhou, Z.-H. (2019). Bridging machine learning and logical reasoning by abductive learning. In Neural Information Processing Systems 2019 (Vol. 32). Curran Associates, Inc.
D’Asaro, F. A., Spezialetti, M., Raggioli, L., & Rossi, S. (2020). Towards an inductive logic programming approach for explaining black-box preference learning systems. In Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning (pp. 855–859).
de Kleer, J. (1986a). An assumption-based TMS. Artificial Intelligence, 28(2), 127–162.
de Kleer, J. (1986b). Problem solving with the ATMS. Artificial Intelligence, 28(2), 197–224.
Eiter, T., & Gottlob, G. (1995). The complexity of logic-based abduction. Journal of the ACM (JACM), 42(1), 3–42.
Eshghi, K. (1988). Abductive planning with event calculus. In ICLP/SLP (pp. 562–579).
Gainer-Dewar, A., & Vera-Licona, P. (2017). The minimal hitting set generation problem: Algorithms and computation. SIAM Journal on Discrete Mathematics, 31(1), 63–100.
Gelfond, M., & Lifschitz, V. (1988). The stable model semantics for logic programming. In ICLP/SLP, 88, 1070–1080.
Greiner, R., Smith, B. A., & Wilkerson, R. W. (1989). A correction to the algorithm in Reiter’s theory of diagnosis. Artificial Intelligence, 41(1), 79–88.
Ignatiev, A., Morgado, A., & Marques-Silva, J. (2016). Propositional abduction with implicit hitting sets. In ECAI 2016 (Frontiers in Artificial Intelligence and Applications, Vol. 285, pp. 1327–1335). IOS Press.
Ignatiev, A., Morgado, A., & Marques-Silva, J. (2018). PySAT: A Python toolkit for prototyping with SAT oracles. In International Conference on Theory and Applications of Satisfiability Testing (pp. 428–437).
Ignatiev, A., Narodytska, N., & Marques-Silva, J. (2019). Abduction-based explanations for machine learning models. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp. 1511–1519).
Inoue, K. (1992). Linear resolution for consequence finding. Artificial Intelligence, 56(2–3), 301–353.
Inoue, K. (2002). Automated abduction. In A. C. Kakas & F. Sadri (Eds.), Computational Logic: Logic Programming and Beyond: Essays in Honour of Robert A. Kowalski Part II (LNAI 2408, pp. 311–341). Springer.
Inoue, K. (2016). Meta-level abduction. IfCoLog Journal of Logics and Their Applications, 3(1), 7–36.
Josephson, J. R., & Josephson, S. G. (1996). Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press.
Kakas, A. C., Kowalski, R. A., & Toni, F. (1998). The role of abduction in logic programming. In D. Gabbay, C. Hogger, & J. Robinson (Eds.), Handbook of Logic in Artificial Intelligence and Logic Programming (Vol. 5, pp. 235–324). Oxford University Press.
Muggleton, S. (1991). Inductive logic programming. New Generation Computing, 8(4), 295–318.
Nabeshima, H., Iwanuma, K., Inoue, K., & Ray, O. (2010). Solar: An automated deduction system for consequence finding. AI Communications, 23(2–3), 183–203.
Nguyen, T. Q., Inoue, K., & Sakama, C. (2021). Linear algebraic computation of propositional horn abduction. In 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 240–247). IEEE.
Nguyen, T. Q., Inoue, K., & Sakama, C. (2022). Enhancing linear algebraic computation of logic programs using sparse representation. New Generation Computing, 40(1), 225–254. A shorter version is in: EPTCS online proceedings of ICLP (Vol. 325, pp. 192–205) (2020).
Paul, G. (2000). AI approaches to abduction. In D. M. Gabbay, & R. Kruse (Eds.), Handbook of Defeasible Reasoning and Uncertainty Management Systems (Vol. 4, pp. 35–98). Springer.
Reiter, R. (1987). A theory of diagnosis from first principles. Artificial Intelligence, 32(1), 57–95.
Rocktäschel, T., Bošnjak, M., Singh, S., & Riedel, S. (2014). Low-dimensional embeddings of logic. In Proceedings of the ACL 2014 Workshop on Semantic Parsing (pp. 45–49).
Rocktäschel, T., & Riedel, S. (2017). End-to-end differentiable proving. In Neural Information Processing Systems 2017 (pp. 3788–3800).
Saikko, P., Wallner, J. P., & Järvisalo, M. (2016). Implicit hitting set algorithms for reasoning beyond NP. In KR (pp. 104–113).
Sakama, C., Inoue, K., & Sato, T. (2017). Linear algebraic characterization of logic programs. In International Conference on Knowledge Science, Engineering and Management (pp. 520–533). Springer.
Sakama, C., Inoue, K., & Sato, T. (2021). Logic programming in tensor spaces. Annals of Mathematics and Artificial Intelligence, 89(12), 1133–1153.
Sato, T. (2017). Embedding tarskian semantics in vector spaces. In Workshops at the Thirty-First AAAI Conference on Artificial Intelligence.
Sato, T., Inoue, K., & Sakama, C. (2018). Abducing relations in continuous spaces. In IJCAI: Proceedings of the Conference (pp. 1956–1962).
Schüller, P. (2016). Modeling variations of first-order horn abduction in answer set programming. Fundamenta Informaticae, 149(1–2), 159–207.
Selman, B., & Levesque, H. J. (1990). Abductive and default reasoning: A computational core. In AAAI (pp. 343–348).
Shakerin, F., & Gupta, G. (2020). White-box induction from SVM models: Explainable AI with logic programming. Theory and Practice of Logic Programming, 20(5), 656–670.
van Emden, M. H., & Kowalski, R. A. (1976). The semantics of predicate logic as a programming language. Journal of the ACM, 23(4), 733–742.
Yang, B., Yih, W., He, X., Gao, J., & Deng, L. (2015). Embedding entities and relations for learning and inference in knowledge bases. In 3rd International Conference on Learning Representations, ICLR 2015, Conference Track Proceedings.
Acquisition of Feature Concepts Via Open Abductive Communication with Data Jackets
43
Yukio Ohsawa, Teruaki Hayashi, Sae Kondo, and Akinori Abe
Contents
Introduction
Innovators’ Marketplace on Data Jackets
  Data Jacket as a Subjective Metadata
  Communication for Sharing Contexts of Living by IMDJ
  The Abductive Thoughts in IMDJ and Feature Concepts
Living Lab for Enhancing the Sensitivity to the Open Society
  Living Labs and Its Effects
  Lessons for Livings from Organizational Citizenship Behavior
The Structure of Data-Federative Innovation with LLDJ
Conclusions
References
Y. Ohsawa · T. Hayashi
Department of Systems Innovation, School of Engineering, The University of Tokyo, Tokyo, Japan
S. Kondo
Department of Architecture, School of Engineering, Mie University, Tsu, Mie, Japan
A. Abe
Faculty of Letters, Chiba University, Inageku, Chiba, Japan
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_73

Abstract
Here, the authors show that open communication with abductive reasoning contributes to data-interactive innovation by externalizing living contexts and feature concepts, which form key concepts that bridge social requirements and the features of datasets. The authors first introduce communication using “subjective” metadata called data jackets (DJs). DJs are human-made metadata for existing or expected datasets, reflecting the subjective or potential interests of stakeholders in data use/reuse. Even if the owners of data hesitate to open their data to the public, they can present DJs in the Innovators’ Marketplace on DJs (IMDJ), a platform of data-mediated value exchange where participants communicate to distill and choose ideas to combine, use, or reuse data to create businesses or collaborations. In this chapter, the authors present a revised framework of abductive reasoning for designing, collecting, and using DJs in the IMDJ to support the process of multistakeholder requirement elicitation and satisfaction. In the examples, the authors find that creativity in the IMDJ has been ignited by and for the externalization and sharing of feature concepts corresponding to new variables, new predicates, new functions, or even new logical clauses, which can also be used as the performance dimension of data analysis criteria, that is, the features desired to be extracted from data. The effect of living labs, where participants’ sensitivity to the contexts in the open society is reinforced, is therefore integrated here to extend both the target of and the context for the use/reuse of datasets via shared feature concepts.

Keywords
Data jacket · Feature concept · Creativity · Communication · Living lab · Abduction · Process of innovation · Context
Introduction
Innovation is the process of applying new technology and combining materials, methods, and resources to open up a new market. Because the concept originated as changes in the combinations of factors of production (Schumpeter, 1912), it does not mean merely inventing a product. Rogers, considering stakeholders in the process of innovation and the expansion of the open market, pointed out that leading consumers play the role of innovators (Rogers, 2003). A sophisticated idea here is that not only the creators or developers of new products but also users play an important role in discovering new value of a product by using it and diffusing that value to the majority in the market. This assertion can be reinterpreted by introducing the dimension of “context,” which the authors put at the core of the process of innovation. A context is the set of situations in which an individual is positioned at a time in real life. A user of a product is more involved in the context of his/her own real life than the product’s developer, which is why the user understands the fitness of a product to his/her own use context. Thus, the user proposes a new product design to fit the right context or moves to other living contexts to make better use of the product. According to von Hippel (2006), leading consumers invent technologies, not only use or diffuse them. Furthermore, users can invent contexts of living as well as products or technologies. Such an innovative user is not necessarily a “leading” user who stands at the initial step of the diffusion
process of innovation but may be a “late adopter” or “laggard” who does not accept a new product, because the feeling that a product does not fit one’s living context may urge the user to create a new context. Such a person may not be called a user because he/she does not use the product until the design or the way of living is revised, but can be called a “potential” user. Thus, the essence of innovation is to create contexts, not only products, possibly by involving potential users who may not be among the loyal customers of the developer. This point distinguishes innovation from a child’s talent for value sensing acquired in the growth of mind (Donaldson, 1992) or a part of sense-making that can be supported by information systems using data (Dervin, 1992). These may have something to do with innovation, but innovation differs from the sheer natural intelligence of a person or an infant because experiences in living are necessary for learning use contexts. For example, let us consider a smartphone. A child may be able to sense its value by understanding (if possible) pieces of information that appear on the display, but cannot relate the information to scientific discoveries or novel businesses as long as he/she has not learned about the latest context of science or business. In addition, data on the locations of smartphone users cannot be used without understanding the data format or being interested in the context in which to use the data. The acquisition of knowledge about potential elements of innovation and interest in the use context should precede the discovery of an invention. Thus, the use context is essential in defining how the fruit of innovation is evaluated, that is, the dimension of performance that Drucker introduced in his definition of innovation (Drucker, 1985).
The interest in contexts here can hardly be given by nature or by the thought of one individual, because the most suitable use context of a product may lie outside the scope of one individual but can be learned through communication with others. However, if a context is known equally to everyone, knowing it does not amount to the acquisition of a new context for anyone in society. That is, for innovation, communication among diverse stakeholders in potential markets who exchange living contexts plays an important role. The Innovators’ Marketplace on Data Jackets (IMDJ; Ohsawa et al., 2013) is a method for creating data-use scenarios for solving real-life problems following the above redefinition of innovation, where diverse participants interact by combining the DJs described in Section “Innovators’ Marketplace on Data Jackets”. Participants in the IMDJ communicate to create solutions that satisfy data users’ requirements by sharing, combining, and/or using data while considering each other’s contexts. The IMDJ has been used in various scientific and business domains, as described in that section, and is now at a stage to spread to solving issues in the daily contexts of working people and ordinary inhabitants of local regions. In Section “Innovators’ Marketplace on Data Jackets”, a logical definition of DJs and the human process of communication for reasoning toward satisfying requirements in the IMDJ are reviewed and revised. The advantage of the IMDJ is its contribution to the elicitation and exchange of contexts for using things potentially valuable in real life. However, its limit is shown from the viewpoint of the gap between the requirements and the theory obtained in abductive reasoning. The authors review the logical model of the IMDJ process, previously presented on the framework of abduction representing thoughts and communication, and explain
the parts introducing contexts. In Section “The Abductive Thoughts in IMDJ and Feature Concepts”, feature concepts are shown to be key elements externalized in the IMDJ and a core technique for data utilization, according to case studies. Simply put, contexts and feature concepts play the role of joints in data-mediated innovation. Then, a living lab is introduced in Section “Living Lab for Enhancing the Sensitivity to the Open Society” as an approach to coping with the limit of the IMDJ by deepening and widening participants’ sensitivity to contexts and to the utility of feature concepts. In Section “The Structure of Data-Federative Innovation with LLDJ”, a living lab on data jackets (LLDJ) is proposed as a modification of the IMDJ that opens and extends communication and thought to a deeper and wider range of latent requirements than the IMDJ alone. Here, the authors reinterpret data-mediated innovation (the term “data-driven” is avoided because the process is not always driven by data) as a process of interactions that externalize the contexts and feature concepts connecting requirements, solutions, theories, and data.
Innovators’ Marketplace on Data Jackets
Data Jacket as a Subjective Metadata
A data jacket (DJ hereafter; first introduced in Ohsawa et al., 2013) is a digest of a dataset that does not include the content of the data, but only the title, the variables, and an abstract that may represent the subjective expectations of the owner or potential users about the data’s utility. Even if the owners of data hesitate to open their data to the public, they can present DJs to others. As a collection in the data market, DJs play the role of human-made metadata for existing or expected datasets, reflecting the subjective or potential interests of stakeholders in the data market. The idea of the DJ comes from the jacket of a movie DVD in a store, where only superficial information about the movie is exhibited on a shelf to potential customers, who may just pick up and read the jacket or may buy the DVD. The content, that is, the movie itself, should be hidden to reduce the risk of it being stolen or leaked to anyone who may harm the benefits of the owner and the retailer. In contrast to real data, DJs are easy to create and disclose in order to advertise the latent utility of the corresponding data, because they offer a secure way of showing expectations for data utility. This effect is reinforced by the potential links between the datasets, as shown in Fig. 1. For example, as shown in this figure, the DJs about personal health and food consumption can be disclosed and aid communication and thought about the utility of the data, which may be confidential. These data can be combined by understanding the relevance of weather and health through their visualized link via “time,” a variable shared between the two datasets. In addition, the context in which the utility of the data can be promoted can be elicited by showing the concept “health,” a word common to the two DJs in the map.
Fig. 1 A scene of visualization and communication of connectivity of datasets using DJs. The edges show the connectivity visualized based on the shared words among DJs, on which stakeholders, who are participants of the data market, exchange contexts, that is, the purposes and the conditions of using the data. The analysis planner Dr. Y is proposing a method for combining datasets to satisfy the requirement of Dr. X, and asking about an additional requirement.
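The connectivity described in the caption above can be approximated by listing the variables and words that pairs of DJs share. The following is a minimal sketch under stated assumptions: the three data jackets, their field names (`variables`, `words`), and the function names are hypothetical illustrations, and plain set intersection stands in for the KeyGraph visualization actually used in the IMDJ.

```python
from itertools import combinations

# Hypothetical data jackets: only variables and abstract words, never the data itself.
data_jackets = {
    "DJ_health": {"variables": {"person_id", "time", "blood_pressure"},
                  "words": {"health", "checkup"}},
    "DJ_food": {"variables": {"person_id", "time", "food_item"},
                "words": {"health", "meal"}},
    "DJ_weather": {"variables": {"time", "place", "temperature"},
                   "words": {"weather", "forecast"}},
}


def shared_terms(dj_a, dj_b):
    """Return the variables and words that two data jackets have in common."""
    return {
        "variables": dj_a["variables"] & dj_b["variables"],
        "words": dj_a["words"] & dj_b["words"],
    }


def connectivity_edges(jackets):
    """Map each pair of DJs sharing at least one variable or word to what they share."""
    edges = {}
    for (name_a, dj_a), (name_b, dj_b) in combinations(sorted(jackets.items()), 2):
        common = shared_terms(dj_a, dj_b)
        if common["variables"] or common["words"]:
            edges[(name_a, name_b)] = common
    return edges


edges = connectivity_edges(data_jackets)
for pair, common in edges.items():
    print(pair, common)
```

In this toy setting, “time” links all three jackets, while the word “health” links only the health and food jackets, which is the kind of bridge on which participants would exchange contexts.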
Communication for Sharing Contexts of Living by IMDJ
Our method for realizing such reasoning, with communication that shares the visual map, is the IMDJ, as introduced above and shown in Fig. 2. The IMDJ is a platform of data-mediated value exchange where participants communicate to distill and choose ideas to combine, use, or reuse data to create businesses or collaborations. Through a preparatory visualization of connectivity among DJs, participants in the IMDJ discuss (1) why and how each dataset has been or should be collected, (2) what can be achieved by using the datasets and how, when, and where to achieve it, and (3) what the value of the achievements is, who requires that value, and why, via communication in the market of data. Thus, the use of DJs here means initiating loops of living human interaction to create dimensions of performance of real-life activities corresponding to the contexts in real life, some of which may be represented by variables in the data. To aid participants’ thoughts about the connectivity among DJs, the maps here are made using KeyGraph (Ohsawa et al., 1998; Ohsawa 2018a), where words or variables shared by multiple DJs are highlighted and positioned on bridges between the DJs. The IMDJ starts by visualizing a map of DJs selected to match the interests of the participants (Step 1), followed by the communication where requirements are
Fig. 2 (a) A snapshot of face-to-face and (b) on-line IMDJ (Iwasa et al., 2019). Solutions (squares e.g., “We can have . . . ”) are proposed by combining DJs (large cards, e.g., DJ1039 in (b)) responding to requirements (e.g., “what are the scholarship opportunities . . . ” in the yellow sticker in the left of (b)). The source of (b) is Ohsawa et al. (2019b)
shown, questions are asked, and solutions are proposed and evaluated to meet the requirements with a set of datasets corresponding to DJs (Step 2). The visualization in Step 1 and the communication in Step 2 are carried out face-to-face, as in Fig. 2a, or online, as in Fig. 2b, where DJs are registered in advance via our Web system (more than 6000 DJs have been registered so far, including ones in other systems) and selected according to the query assigned by the participants of a new IMDJ session, for example, “good life” in Fig. 1 or “eating in cities” in Fig. 2b. Normally, 20 to 50 DJs are selected for one session of the IMDJ. The essential contents of the communication in the IMDJ are formed by the triplet below:
DJs: One for each dataset, selected in advance by the organizer of an IMDJ session from the set of DJs provided by the owners or by individuals with knowledge of the datasets.
Requirements: The goals presented by participants as requirements in real life.
Solutions: Statements showing the DJs corresponding to the datasets and the tools (e.g., statistical analysis, machine learning, change explanation (Ohsawa 2018a; Ohsawa et al., 2019a), etc.) that enable data use aimed at satisfying the requirements above.
It is noteworthy that essential questions and answers in the IMDJ can be categorized into the following two types:
Question 1: Ask “why” and answer it, meaning a description of causality to explain the process of reaching an observed state (e.g., the reason why the moon is bright).
Question 2: Ask “how” and answer it, meaning to realize a certain situation, that is, to achieve a goal state (e.g., how to travel to the moon).
Here, the authors treat other questions such as “when,” “where,” and “who” as subquestions of the two questions “why” and “how” above, because time, place, and subjects are parts of the answers to these two questions.
On the other hand, “what” can be positioned not only as a subquestion of “how” (if used to represent an element, i.e., a tool or a resource) but also as a higher goal if it is the meta-goal of “why,” that is, the goal behind a given goal: for example, happiness and health, if the goal is to eat something good. Such a meta-goal should ideally be given initially as a query for searching useful DJs in advance, but is often externalized via communication in the IMDJ. In such a case, participants’ emotions play an important role because a meta-goal that had been latent tends to be linked to an essential and stronger requirement in a daily-life context, which tends to be emphasized by emotional expressions in human communication. In this sense, the participants in the IMDJ are encouraged to ask good questions and to feel free to be frank and open-minded. The effect of living labs mentioned later, where participants’ sensitivity to the contexts in the open society is reinforced by communication, can be well integrated here, which extends both the target of, and the contexts in which to think about the conditions for, the use/reuse of datasets.
The merit of the face-to-face IMDJ in Fig. 2a is the possibility of sharing the real-life context by seeing each other’s facial expressions and hearing voices that carry the emotions linked to the real contexts of living. This is obviously useful in the externalization of meta-goals, which tend to lie outside the visual graph. On the other hand, the online IMDJ in Fig. 2b has the advantage of enabling the recording of all the dialogues, including the requirements, solutions, and the DJs presented, evaluated, or used in solutions, in the course of each IMDJ session. See Fig. 3 for the entire structure of these three levels, where the requirements are connected indirectly to the DJs via the proposed solutions. Recording the contents of this hierarchical structure is useful when searching for DJs for participants in future IMDJ sessions, because participants within one session do not know all the requirements that may potentially be satisfied by DJs. That is, the requirements reachable from a given set of DJs are unknown to the participants but may have been presented already in the log of participants’ acts in the sessions so far. Therefore, when the organizer of a later IMDJ session collects DJs matching participants’ interests, he/she is encouraged to use the DJ store to search for DJs that may not include the search query directly but include it indirectly, that is, DJs linked (via solutions) to requirements including words relevant to the query (Hayashi & Ohsawa 2016a). In addition, datasets of indirect or implicit relevance to a user’s interest can be explored using the DJ store,
Fig. 3 Innovators’ Marketplace on Data Jackets (IMDJ) of three layers that has been used so far. The lowest layer corresponds to data which are available or to be collected. The top layer is here supposed to be the open society without a close relation to datasets at the bottom
which enables analogical inventions of data analysis methods by connecting the target problem in the current requirement to a base problem in a past requirement via the words common to the two requirements, as shown in a later example. The words in the requirements can be regarded as at least a part of a data user’s living context, so the IMDJ is a place for externalizing, sharing, and connecting contexts. However, in the following sections, creativity in the IMDJ is found to be ignited not only by DJs but also by and for the externalization and sharing of feature concepts corresponding to new variables, new predicates, new functions, or new logical clauses, which can also be used as the performance dimension of data analysis criteria, that is, the features desired to be extracted from data.
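The indirect search over the three layers (requirements, solutions, DJs) can be illustrated with a small log. This is a hedged sketch only: the session log, the identifiers, and the function `search_djs` are hypothetical, and simple word overlap stands in for whatever relevance measure the actual DJ store applies.

```python
# Hypothetical session log following the three layers of Fig. 3:
# requirements are linked to DJs only through the proposed solutions.
requirements = {
    "R1": "reduce food waste in cities",
    "R2": "keep elderly people healthy in summer",
}
solutions = {
    "S1": {"requirement": "R1", "djs": ["DJ_pos_sales", "DJ_weather"]},
    "S2": {"requirement": "R2", "djs": ["DJ_weather", "DJ_blood_test"]},
}
dj_texts = {
    "DJ_pos_sales": "supermarket point of sales records",
    "DJ_weather": "daily temperature and humidity by place",
    "DJ_blood_test": "annual blood test results",
}


def tokenize(text):
    """Split a text into a set of lowercase words."""
    return set(text.lower().split())


def search_djs(query):
    """Return (direct, indirect) hits for a query.

    direct: DJs whose own text contains a query word.
    indirect: DJs reached via solutions whose requirement contains a query word.
    """
    query_words = tokenize(query)
    direct = {dj for dj, text in dj_texts.items() if query_words & tokenize(text)}
    indirect = set()
    for solution in solutions.values():
        if query_words & tokenize(requirements[solution["requirement"]]):
            indirect.update(solution["djs"])
    return direct, indirect - direct


direct_hits, indirect_hits = search_djs("healthy")
print(direct_hits, indirect_hits)
```

Here no DJ mentions “healthy” itself, yet the weather and blood test jackets are found because a past solution used them for a requirement about staying healthy.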
The Abductive Thoughts in IMDJ and Feature Concepts
In this section, the authors present a revised framework of abductive reasoning for designing, collecting, and using DJs in the IMDJ to support the process of multistakeholder requirement elicitation and satisfaction. Figure 4 shows a logical representation of the DJs. A set of DJs is defined as follows, revised from previous publications (Ohsawa et al., 2017, 2019b) by adding Ci and Wi.
DJi (i ∈ [1, N]): The i-th data jacket (N: the number of datasets in the data market)
DJi := {Vi, Fi, Pi, Ci, Wi}, where the elements are defined as follows:
  Vi: the set of variables in DJi
    Ex: temperature, place, day, human, . . . used in a weather dataset.
  Fi: elements of Vi that can be expressed as functions over other elements of Vi
    Ex: temperature (place, day)
  Pi: the set of predicates that relate elements of Vi
    Ex: stay (human, day, place), that is, the human stayed in the place on the day.
  Ci: hypothetical clauses (including facts) representing expected knowledge
    Ex: stay (human, day, home): unhealthy (human, day).
  Wi: words about concepts, including the subjective expectation for the utility of the data
    Ex: useful for decisions about daily life activities, representing a context in which the advantage of weather data can be taken.
G: The goal, that is, the requirement defined as a relation over terms corresponding to events or entities in the target world. This definition may be incomplete.
Fig. 4 The revision of the connection structure of DJs in the communication process. For example, to refine the utility of the data represented by DJ2 weather, the variable address is added. The conceptual statements about the utilities of datasets work to connect the real requirement and the knowledge to be acquired from data, in human communications. (Source: Ohsawa et al., 2019b)
Example of a complete expression of a goal: healthy (human, day), meaning the human becomes healthy by the day.
Example of an incomplete expression: healthy (human), healthy, or doing_well, missing or rephrasing some part of the complete expression of the goal.
T: The theory, that is, a model described by a set of Horn clauses, each given using predicates in PG below. T is represented over elements of PG, FG, CG, and VG, which are parts of the DJs in DJcom(G) in Eq. (1), which satisfies Eq. (2) ([v] for variable v means the range of the value of variable v), if the conclusion G’ derived by theory T subsumes goal G. This means that a formal expression G’ is related to the informal expression G and T is completely defined, which intuitively means that all the clauses in T are supported by data or facts corresponding to some DJ in DJcom(G).
\[ \mathrm{DJcom}(G) := \{DJ_a, DJ_b, \ldots, DJ_L\} \subseteq \{DJ_1, DJ_2, \ldots, DJ_N\}, \quad \text{where } X_G := \bigcup_{i \in \{a, b, \ldots, L\}} X_i \ \text{ for } X \in \{V, F, P, C, W\} \tag{1} \]
\[ \exists\, v \in V_G \;\; \forall\, V_x \in \{V_a, V_b, \ldots, V_L\} \;\; \exists\, v_x \in V_x \;\mid\; [v] \cap [v_x] \neq \emptyset \tag{2} \]
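The definitions of DJi, DJcom(G), and the linking condition can be made concrete with a small data structure. The sketch below rests on several assumptions: the class and function names are hypothetical, each variable's range [v] is represented as a finite Python set of example values, and Eq. (2) is read as requiring a non-empty intersection of ranges (a shared variable across all DJs in DJcom(G)).

```python
from dataclasses import dataclass, field


@dataclass
class DataJacket:
    """DJ_i := {V_i, F_i, P_i, C_i, W_i}, with V mapping each variable to its range [v]."""
    name: str
    V: dict                                # variable name -> set of possible values
    F: set = field(default_factory=set)    # variables expressible as functions
    P: set = field(default_factory=set)    # predicates over the variables
    C: set = field(default_factory=set)    # hypothetical clauses and facts
    W: set = field(default_factory=set)    # words on expected utility


def union_component(djs, component):
    """X_G in Eq. (1): union of one set-valued component (F, P, C, or W) over DJcom(G)."""
    result = set()
    for dj in djs:
        result |= getattr(dj, component)
    return result


def has_linking_variable(djs):
    """Eq. (2), read as non-empty intersection: some variable's range overlaps
    the range of at least one variable in every DJ of DJcom(G)."""
    for dj in djs:
        for v_range in dj.V.values():
            if all(any(v_range & vx_range for vx_range in other.V.values())
                   for other in djs):
                return True
    return False


# Hypothetical data jackets for illustration only.
health = DataJacket("DJ_health",
                    V={"person_id": {"p1", "p2"}, "date": {"2020-07-01", "2020-07-02"}},
                    P={"gamma_gtp_high"}, W={"health"})
weather = DataJacket("DJ_weather",
                     V={"date": {"2020-07-02", "2020-07-03"}, "place": {"Tokyo"}},
                     P={"hot"}, W={"weather"})
shop = DataJacket("DJ_shop", V={"item": {"beer"}}, P={"beer_consume"}, W={"sales"})

print(has_linking_variable([health, weather]))        # date ranges overlap
print(has_linking_variable([health, weather, shop]))  # shop shares no variable range
```

Under this reading, the health and weather jackets can be combined through “date,” whereas adding a jacket with no overlapping variable breaks the condition until a bridging variable is supplied.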
For example, suppose G is the requirement to know the influence of weather on health, represented as “health ← weather.” By relating the parts of G to complete expressions, for example, health to γ-GTP_high(person ID, date) and weather to hot(date), G corresponds to G’ in clause (3). Hereafter, let us hide the universal quantifier ∀.
\[ G': \ \exists\, \text{person ID} \;\; \gamma\text{-GTP\_high}(\text{person ID}, \text{date}) \leftarrow \text{hot}(\text{date}) \tag{3} \]
G’ can be derived by the combination of clauses (4) and (5), by which T is formed.
\[ \gamma\text{-GTP\_high}(\text{person ID}, \text{date}) \leftarrow \text{beer\_consume}(\text{person ID}, \text{date}) \tag{4} \]
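How a theory T yields a conclusion such as G’ can be sketched with a tiny backward-chaining check over Horn clauses. This is an illustration under explicit assumptions: atoms are treated as plain strings (no unification), the function names are hypothetical, and the second clause, linking beer consumption to hot days, is an assumed example of a linking clause rather than one quoted from the text.

```python
# Horn clauses as (head, body) pairs; atoms are plain strings, so matching is syntactic.
theory = [
    # Clause (4) in the text.
    ("gamma_gtp_high(person, date)", ["beer_consume(person, date)"]),
    # ASSUMED linking clause for illustration; it is not given in the text.
    ("beer_consume(person, date)", ["hot(date)"]),
]


def derivable(goal, facts, clauses, seen=frozenset()):
    """Backward chaining: is `goal` derivable from `facts` using `clauses`?"""
    if goal in facts:
        return True
    if goal in seen:  # avoid looping on cyclic clauses
        return False
    for head, body in clauses:
        if head == goal and all(
            derivable(atom, facts, clauses, seen | {goal}) for atom in body
        ):
            return True
    return False


def entails_rule(head, body, clauses):
    """Check whether `head <- body` follows from `clauses` by assuming the body as facts."""
    return derivable(head, set(body), clauses)


# G' as in clause (3): gamma_gtp_high(person, date) <- hot(date)
print(entails_rule("gamma_gtp_high(person, date)", ["hot(date)"], theory))
```

With both clauses present the rule G’ is entailed; with clause (4) alone it is not, which shows why a second DJ supplying the linking knowledge is needed to complete T.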
In Fig. 4a, dotted lines connect the appearances of the same variable in multiple DJs to combine predicates, corresponding to sharing a variable among all Vx used for deriving G’, as in Eq. (2). If the obtained T is not satisfactory (the low confidence in Fig. 4a, b), other variables, such as address in Fig. 4b, are added. Furthermore, as shown in Fig. 4c, new DJs may be added to DJcom(G) to obtain a satisfactory T and evaluate it using the data corresponding to the DJs. However, solutions tended not to be satisfactory enough to attract participants in the IMDJ even if they were highly evaluated by the participants. The problem here was the lack of correspondence between G and a predicate in G’, which can be
derived by T. So far, the postprocess of the IMDJ, called Action Planning (AP), has introduced additional details of the plan to use data. In AP, a latent requirement that may be the reason for voicing the requirements presented as G in the IMDJ was externalized by the metacognition of each participant, and then details that may have been missed in the gamified workshop of the IMDJ were added to complete the plan of actions to fulfill the revised goal. Thus, the obtained solution corresponding to T, or to the incomplete expression of T, can be revised. However, there were two problems. First, the new goal set by revising G’ in AP may be inconsistent with G’, so the solution T obtained in the IMDJ may lose its value owing to the goal revision. For example, reducing alcohol to reduce γ-GTP may harm the mental state, which is another part of health. In such cases, it is difficult to reach a shared awareness of the value of data-based solutions. Second, the AP phase executed for one solution may take several hours of communication and thought. Owing to this long time per solution, it is difficult to cover all essential solutions and requirements by AP sessions. Therefore, it is necessary to create a new strategy, rather than just reusing AP, to cope with the following drawbacks in the logical formulation shown above and in Ohsawa et al. (2019b).
(a) The role of Wi is not well explained in the logic-based representation of the IMDJ above, although the conceptual statements about the utilities of datasets work to discuss the gap between the real requirement and the knowledge to be acquired from data, in human thoughts and communications.
(b) The data may not support the theory from the bottom, that is, validate the hypotheses, but may instead play the role of goals, that is, the observations to be explained by abductive reasoning.
Drawback (a) exists because the IMDJ so far has no user interface that urges providers of DJs to add their subjective expectations corresponding to Wi onto the predicates in set Pi above in the course of communication. In the example above, unhealthy in G and γ-GTP_high(person ID, date) are not equal because health, having various explanatory factors, cannot be covered by the quantity of γ-GTP for any person or on any date. If the data-based explanation of health is not satisfactory, the user should be able to add factors to Pi with respect to concepts in Wi so that the missed concepts are added to compensate for the gap between G and G’ (see Fig. 5). On the other hand, drawback (b) occurs if the data in hand correspond to goal G rather than supporting the hypotheses in theory T. In such a case, G’ may not be fixed because the most believable among multiple theories explaining G cannot be chosen by data. As an example of (b), a change in health condition may be observed with a change detection tool, by just looking at the curve of γ-GTP in blood test data, or by applying an up-to-date algorithm for time-series analysis. However, this does not explain the change because the cause of the change is not included in the target data. In such a case, theories T with the data at the top (goal G’ or G) should be created to connect the top and the bottom (hypotheses given as CG).
Fig. 5 A scene of communication with adding words, and their corresponding variables and predicates, that is, contextual information and feature concepts. This process compensates for the gaps between G and G’ as well as between the expected and obtained results.
Thus, the following three items should be linked to each other:
1. Representations of the requirement G* (denoting either the original G, the logical G’, or any revised expression of the goal)
2. Information available (learnable, minable, or visible) from available data
3. Theory T as an explanation of G’
The gap between (1) and (3) can be solved if the user can fit his/her expressions of the goal to the logical format, that is, predicate logic, and can approximate all the concepts with variables, predicates, and functions. On the other hand, the gap between (1) and (2) can be noticed if one understands what can be obtained from data and how it can be expressed explicitly (using equations, words, logic, or technical terms in data science). Here, to “understand” means to explain the correspondence of acquired knowledge to real-life entities and events, and to “express” means to speak verbally or to write for a machine to read. Let us now see some examples to discuss the literacies for linking (1) through (3) above and the multiple expressions within G*. In the few examples below (digests of Ohsawa et al., 2022), obtained so far in IMDJ sessions, it is found that communication externalizes the concepts shown in quotation marks as the dimensions of performance in using data in businesses, which the authors call feature concepts.
Example 1 Skill development in sports (Req: requirement, Sol: solution)
Req1: Evaluate and improve the defense performance of a soccer team
Sol1: Visualize “lines” of teammates on which to quickly pass a ball, which explains the skill of a defensive team to manage the changes in the offensive team.
DJ1: Wide-view video
DJ2: Body direction (included in the data of DJ1)
For Example 1, the solution proposed in the IMDJ was originally “evaluate the defense performance of a soccer team based on the positions and body angles of players,” which was then revised to the one above.
This revised version can be expressed by four Horn clauses (5) through (8), where hi for i = 1 through 11 represents the players in a soccer team. The defense performance of the team is high if the 11 players form three lines, where each line means three or four players positioned side by side in parallel to the goal line. The video data represented by DJ1 were used to detect the lines formed by the players, and the predicate “defendable” corresponds to goal G’. This is not an exact match with “defense skill” in Req1 but is a straightforward shortening; thus, the gaps among items (1) through (3) above are solved. However, the predicate “line” is embodied by detecting each player in the data by distinguishing the uniforms of the two teams and by computing the body angles and the angles between the goal line and the line connecting each pair of players. Thus, “line” became a computable concept externalized via the conversation about available data and the latent requirement after the presentation of the original solution “based on the positions and body angles of players,” hence a feature “line” that was not covered by the elementary features of the data (V, F,
P, or W) but connects the requirement and the data. This is why the authors call such a word as “line” a feature concept, a key factor for an analysis to satisfy a requirement. Figure 6 shows the created product reflecting the solution above, a visualizer of lines in the teams extracted from real video data of a soccer game. Using this tool, the soccer coach who presented the requirement above came to lead all his teams, none of which had been ranked within the top 32 in his prefecture before the IMDJ session, to within the top 8.
good_defense(h1, h2, h3) :- in_line(h1, h2, h3)   (5)
good_defense(h4, h5, h6, h7) :- in_line(h4, h5, h6, h7)   (6)
good_defense(h8, h9, h10, h11) :- in_line(h8, h9, h10, h11)   (7)
defendable(team) :- ∀H good_defense(H), in(H, team)   (8)
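Clauses (5) through (8) can also be read operationally. The following is a minimal sketch in Python, not the implementation behind the product in Fig. 6. It assumes that each player is given as a pair (distance from the own goal line, lateral position), that a group counts as "in line" when these distances agree within a hypothetical tolerance `tol`, and it ignores the body angles that the actual solution also uses. The positions in `team` are invented for illustration.

```python
from typing import Dict, List, Tuple

# Assumed layout: (distance from own goal line, lateral position), in meters.
Position = Tuple[float, float]


def in_line(players: List[Position], tol: float = 2.0) -> bool:
    """Assumed test for in_line: the players stand side by side, parallel to
    the goal line, i.e., their distances from it differ by at most tol."""
    distances = [x for x, _ in players]
    return max(distances) - min(distances) <= tol


def good_defense(players: List[Position], tol: float = 2.0) -> bool:
    """Clauses (5)-(7): a group shows good defense if it is in line."""
    return in_line(players, tol)


def defendable(team: Dict[str, List[Position]], tol: float = 2.0) -> bool:
    """Clause (8): the team is defendable if every group H in the team
    shows good defense."""
    return all(good_defense(group, tol) for group in team.values())


# Hypothetical positions of 11 players arranged in three lines.
team = {
    "h1-h3": [(10.0, 5.0), (10.5, 20.0), (9.8, 35.0)],
    "h4-h7": [(25.0, 3.0), (24.2, 15.0), (25.9, 27.0), (25.1, 38.0)],
    "h8-h11": [(40.0, 4.0), (41.0, 16.0), (40.3, 28.0), (39.6, 37.0)],
}
print(defendable(team))
```

The point of the sketch is only that "line" becomes computable once positions are available from the video data; the tolerance and the grouping of players are design choices that a coach and an analyst would have to settle together.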
Fig. 6 An IMDJ session inviting coaches of sports in (a) and the obtained product: a dynamic visualizer of “lines” in soccer games (b). The lines between players are automatically computed and visualized for each team, distinguished by red and blue lines (Takemura et al., 2018)
Similarly, Example 2 was realized by introducing the feature concepts change explanation, trend shift, and diversity shift. The solution was embodied later than the presentation of the original solution, which explained changes in the market by showing causal events such as items or behaviors of customers.
Example 2 Change explanation in businesses
Req2: Detect and explain causes of customers/investors’ behavioral shifts
DJ3: Market data, for example, POS in a supermarket or stock prices.
TJ1: Tangled string (mentioned later) or other tools to explain the change.
Sol2: Explain changes in the market with visualized “explanatory changes” implying the latent dynamics such as the “trend shift” or “diversity shift” in the market.
Change explanation can be represented by the predicate “change” defined indirectly in clause (10), that is, indirect in the sense that the defined predicate is in the RHS, not the LHS. This means positioning a change as a transition from a trend in a certain period (t − Δt: t) to the trend in the next period (t: t + Δt) if the appearance of the market changes substantially, as in Eqs. (9) and (10), and explaining the state of the market by the trend, as in Eq. (11). In this sense, it is essential to distinguish the explanation of changes from the detection or prediction that have been studied using machine learning technologies (e.g., Fearnhead & Liu, 2007; Hayashi & Yamanishi, 2015; Miyaguchi & Yamanishi, 2017).
|market(t: t + Δt) − market(t − Δt: t)| > θ   (9)
trend_1(t: t + Δt) :- trend_2(t − Δt: t), change(t)   (10)
market(t − Δt: t) = vector_i :- trend_i(t − Δt: t)   (11)
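The test in Eq. (9) can be sketched as a comparison of two market vectors, one for the period before and one for the period after a candidate change point. The Python sketch below is an illustration under stated assumptions: the market vector holds the frequency of each sold item, the difference is measured by the Euclidean distance (the source does not fix a norm), and the threshold is passed as `theta`, a name chosen for this sketch. The baskets are invented.

```python
import math
from typing import List, Sequence


def market_vector(baskets: Sequence[Sequence[str]], items: Sequence[str]) -> List[float]:
    """Represent the market in one period as a vector: the frequency of each
    item over all baskets of the period (one dimension per item)."""
    return [float(sum(basket.count(item) for basket in baskets)) for item in items]


def is_change_point(before: Sequence[float], after: Sequence[float], theta: float) -> bool:
    """Eq. (9), read with an assumed Euclidean norm: the market changes at t
    if the two vectors differ by more than the given positive value theta."""
    return math.dist(before, after) > theta


items = ["stew", "curry", "soda"]
period_before = [["soda", "curry"], ["soda"], ["soda", "curry"]]  # baskets in (t - dt : t)
period_after = [["stew", "curry"], ["stew"], ["stew", "soda"]]    # baskets in (t : t + dt)

v_before = market_vector(period_before, items)
v_after = market_vector(period_after, items)
print(v_before, v_after, is_change_point(v_before, v_after, theta=2.0))
```

Such a test only detects that the vector moved. Linking the movement to a trend, as in clauses (10) and (11), still requires knowledge that is not in the data.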
In Eq. (9), the market in a period of time is supposed to be represented virtually as a vector, and the change at a change point t is larger than a given positive value. The cause of the state of the market is given by the trend in the same period, as in Eq. (11). The requirement is interpreted as explaining the causality of the change in the trend, as expressed by the clauses in Eqs. (10) and (11), and the difficulty is that only the value of the predicate “market” can be obtained, as vector_i, from the data in hand. That is, vector_i as a virtual quantification of “market” is obtained from the data, for example, by putting the frequency of each sold item as the value of each dimension, and then it is linked to trend_i, the i-th trend (supposing trends are countable by 1, 2, . . ., i, . . ., N, where N represents the number of known trends, e.g., trend_1 = hot meal, trend_2 = beverage for cooling the body, etc., in the market of food), which may not be included in the given data. Then the change in “trend” is linked to other events that are not included in the given data either. It is desired that the only information obtained from data-based computation, that is, “vector,” can be linked
Fig. 7 Tangled string applied to the Japanese stock market. The red rectangular nodes represent the start of a pill representing a trend, green the end in (a). They correspond to the red and green vertical lines in (b) where the arrow at the bottom shows an explanation by an expert. Source from Ohsawa, Hayashi, Yoshino (2018)
to the “trend” outside of the data, which may be possible by introducing humans’ common sense. In other words, the vector should be interpreted by a human. In Example 2, TJ stands for a tool jacket (Hayashi and Ohsawa 2016), where a tool for using data (a method of AI, data visualization, or simulation) is summarized in a form similar to a DJ, that is, the title (e.g., KeyGraph), the abstract (e.g., visualizing the co-occurrence relations between both frequent and infrequent items in the data), and the input/output variables (word, item, event, human, time, etc.). A tangled string is a method for explaining a change by positioning events in a string that represents a sequence and tangles on the way if the same event occurs multiple times, as shown in Fig. 7 (Ohsawa et al., 2019a). Using a tangled string is one way to realize the solution in Example 2 because the stocks visualized in the tangled string can be projected to the price history of the major stocks in Japan, and market(t − Δt: t) before and market(t: t + Δt) after each change point t are visualized as two substrings that are pills, that is, the tangled parts corresponding to a trend where the same stocks are repeatedly purchased. Here, trend_2(t − Δt: t) and trend_1(t: t + Δt), which are the interpretations of market(t − Δt: t) before and market(t: t + Δt) after t by experts of stock analysis based on the visualization, could be explained by relating the experts’ external knowledge, that is, knowledge acquired from daily business but outside the data of price history. Thus, in the example, “trend shift” worked as a concept matching the requirement and computable from data, that is, a feature concept. Note that a feature concept may be represented by a clause, as in Eq. (10).
In this sense, the Clause Management System (CMS) proposed by Reiter and de Kleer (1987) and extended by Kean and Tsiknis (1992) hints at the process for creating feature concepts as missing essential links in composing the theory T. Here, a deductive reasoner and CMS collaborate in deriving new clauses implied by the combination of available knowledge and hypotheses
Fig. 8 Feature concepts in the form of abstract illustrations
without inconsistency and querying for an explanation of a clause. In other words, CMS can be used for abductive reasoning with a goal represented as a clause. This framework has been extended by using analogical abduction (Abe, 2000, 2003), where knowledge can be imported to the target domain from a basic domain by evaluating the similarities. A feature concept, if represented by a clause, can be regarded as a goal that is desired to be derived by combining the elements of DJs. If a new clause is obtained, it can be added in the future as a part of a DJ, which may correspond to the dataset obtained by the combination of multiple DJs. However, a feature concept can be created by anyone who utilizes the data, whether or not he/she knows the grammar to represent his/her knowledge in the form of a clause. That is, the essence to be obtained from the data is expressed in any form that is easy for one’s thoughts and communication with colleagues. For example, as shown in Fig. 8, the feature concepts corresponding to the knowledge to be obtained by each method of data mining/visualization can be represented by an abstract illustration, where the elements are not named if there are no specific entities corresponding to them. In Eq. (11), the interpretability of “vector” came to be realized not only by the tool proposed in TJ1 but also by a new method computing Graph-based Entropy (GBE) (Ohsawa, 2018a), as shown in Fig. 9. Each of the three graphs shows the co-occurrence of items purchased by the same customer at the same time. In the sequence of graphs in Fig. 9 for the market of the category “cooking spices,” the structure of the graph on the left involves “cream stew.” The items were then separated into two clusters during the second week, and the cluster in the lower half of the graph was reinforced in the third week. These structural changes coincide with the point of change in the value of GBE, which quantifies the diversity of clusters of items.
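To make the idea of quantifying the diversity of clusters concrete, the sketch below computes a Shannon entropy over the clusters of a co-occurrence graph. It is a simplified stand-in, not the definition of GBE in Ohsawa (2018a): it assumes that clusters are the connected components of the graph and that each cluster is weighted by its share of items. The baskets for the two weeks are invented.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set


def cooccurrence_graph(baskets: List[List[str]]) -> Dict[str, Set[str]]:
    """Link two items if they appear in the same basket."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for basket in baskets:
        for item in basket:
            graph[item]  # register the item as a node even if it has no link
        for a, b in combinations(set(basket), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def clusters(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Assumed notion of cluster: a connected component of the graph."""
    seen: Set[str] = set()
    result: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


def cluster_entropy(baskets: List[List[str]]) -> float:
    """Shannon entropy (in bits) of the shares of items per cluster."""
    parts = clusters(cooccurrence_graph(baskets))
    total = sum(len(part) for part in parts)
    return sum((len(p) / total) * math.log2(total / len(p)) for p in parts)


week1 = [["pepper", "cream stew"], ["cream stew", "bay leaf"], ["pepper", "bay leaf"]]
week2 = [["pepper", "cream stew"], ["chili", "cumin"]]
print(cluster_entropy(week1), cluster_entropy(week2))
```

In this toy case the value rises when the single cluster of the first week splits into two, which is the kind of structural change that an entropy-based index is meant to signal.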
“Cream stew” is found to stay in the finally reinforced (densely connected) cluster, and other spices in this cluster are also used in cooking stew, according to food experts in the supermarket. Using Google Trends for the Japanese query “shichu,” which means “stew,” it was found that people’s interest in eating stew rises from the latter half of August every year, which nearly coincides with the trend shift to spices for cooking stew as obtained here. In this sense, the “vector” for a market state can be regarded as
Fig. 9 The variation of the graph corresponding to the co-occurrence structure over 3 weeks in August in the category “cooking spices” in a supermarket. A part of a figure in Ohsawa, 2018a
the set of weights of active links in the co-occurrence graph for each week, or the one-dimensional value of GBE for each week. Because GBE is a quantification of diversity in the market (borrowing from the use of entropy in Kahn 1995), diversity shift can be regarded as a feature concept to be extracted from the time-series POS data. In summary, the feature concepts as the dimensions of thoughts for data analysis for Example 2 are the explanatory change, the trend shift, and, in the case of a tangled string, pills and tangled strings. In addition, the diversity of trends in the target market is explained using another visualized (deeper-level) feature concept, that is, the co-occurrence structure of items. Example 3 was realized by analogy from Example 2, by diverting the idea of using entropy, as in GBE representing the diversity of trends in the market, to the diversity of epicenter clusters of earthquakes computed as the regional entropy on seismic information, RESI (Ohsawa, 2018b). This diversion was realized by searching DJs with the query “explain the sign of changes in earthquakes” with the DJ store (Hayashi & Ohsawa 2016a) as a search engine of DJs on the stored log of IMDJ, including the relevance between DJs and requirements. Although neither “explain,” “sign,” nor “change” had been described in DJs, these words in feature concepts had been included in the requirements of some participants, for whose satisfaction some DJs were used. DJ3 above for POS data was hit as one of the DJs that had satisfied such requirements in the past. Such a hit may seem ill-suited to earthquake analysis, but the analyst learned to use entropy from the solution, including the use of diversity computed from entropy. As a result, he created an effective method for detecting explanatory signs of earthquakes using an analogy from market dynamics to earthquakes via the feature concept diversity shift, of trends in the market and of epicenters.
This process is equivalent to the analogical abductive reasoning introduced above (Abe, 2003) combined with Reiter’s CMS, except that the W of each DJ, represented by natural language instead of logical clauses, works in relating the query to DJs. That is, once used, feature concepts are added to the DJ store, which includes the clause-base corresponding to CMS, and the clauses are queried
to achieve a new goal G represented in natural language. The query may directly hit an element of Wi in a certain DJ (DJi) or indirectly hit the DJ via a previously supported goal or its expression in natural language.
Example 3 Change explanation in nature.
Req3: Detect precursors of and explain changes in earthquakes.
DJ4: Sequence of earthquakes in Japan
DJ5: Location of seismographs in Japan
DJ6: (The way of using) Point of sale data (as in Example 2)
TJ2: Regional entropy on seismic information based on the idea of TJ1
Sol3 (revised from the original presentation): Entropy-based detection of precursors, that is, explanatory signs, on the diversity shift of epicenters in the sequence of earthquakes
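The two ways in which a query can hit a DJ, directly via its words W or indirectly via a requirement it satisfied in the past, can be sketched as follows. This is a hypothetical miniature, not the DJ store of Hayashi and Ohsawa (2016a): the class `DataJacket`, the function `search`, the plain keyword matching, and the stored words are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class DataJacket:
    title: str
    words: Set[str]  # W: words describing the data
    # Words of the requirements that this DJ helped to satisfy in past sessions.
    past_requirements: List[Set[str]] = field(default_factory=list)


def search(store: List[DataJacket], query: str) -> List[Tuple[str, str]]:
    """Return (title, kind of hit) for each DJ hit by the query.
    A hit is direct if a query word is in W, and indirect if a query word
    appears only in a requirement the DJ satisfied in the past."""
    terms = set(query.lower().split())
    hits = []
    for dj in store:
        if terms & dj.words:
            hits.append((dj.title, "direct"))
        elif any(terms & requirement for requirement in dj.past_requirements):
            hits.append((dj.title, "indirect"))
    return hits


store = [
    DataJacket("POS data in a supermarket", {"pos", "item", "customer"},
               [{"explain", "sign", "changes", "market"}]),
    DataJacket("Sequence of earthquakes in Japan", {"earthquakes", "epicenter", "magnitude"}),
    DataJacket("Location of seismographs in Japan", {"seismographs", "location"}),
]
hits = search(store, "explain the sign of changes in earthquakes")
print(hits)
```

In this toy store, the POS data are reached only through a past requirement, which mirrors how DJ3 was hit by the earthquake query in Example 3.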
Living Lab for Enhancing the Sensitivity to the Open Society
Generally speaking, a feature concept may be an element of P (e.g., line(h1, h2)), F (e.g., change(t), trend(t), and diversity(t)), or a clause in C, as shown above. In addition, the contexts can be represented by intentions or prior/post constraints of actors, which can be assigned to an element of V, P, F, or C. Moreover, these can be queried by a user via everyday natural language expressions, through the correspondence of the query to elements of W or to goals showing requirements in the past. Therefore, the range of prepared concepts or contexts to be exchanged should be open in various dimensions. It can be expected that the effect of IMDJ, that is, externalizing and exchanging feature concepts via communication that opens each participant’s awareness to others’ contexts, can be reinforced by the effects of living labs (LL) introduced below. In this section, the expected effects of combining LL and DJs are discussed. This part adapts the first author’s (Ohsawa’s) keynote presentation at Intelligent Systems Design and Applications (ISDA 2019; Ohsawa et al., 2019b) to include externalization, exchange, and the use of feature concepts to empower data utility.
Living Labs and Its Effects
In recent years, the living lab (LL) has been attracting the attention of industry, government, and academia as a way to create new solutions to problems. LL was born, mainly in northern Europe, as a social participatory method that works from the viewpoint of ordinary living people, and is regarded as a framework for the participation of various stakeholders in innovation and the sustainable development of society. Two strengths of LL are that it (1) widens the scope of communication to others’ requirements in real life and (2) deepens the communication to consider the reasons for requirements and activities, which correspond to the “contexts” mentioned above. By this effect of LL, the proposed LLDJ aims to reinforce the effect of
IMDJ to accelerate the externalization and exchange of feature concepts because the combination of requirements and DJs is fostered by opening the interests of citizens and working people in the target region, as well as data scientists, to others’ living contexts. To date, studies on LL have been conducted in Europe. In recent years, there has been a social interest in LL, especially for innovations in the ways of living with ICT. Følstad (2008) elucidated the processes and methods of these aims. In response, Leminen et al. (2012) and Almirall and Wareham (2011) conducted analyses from the perspectives of management and participation methods and the roles of involved parties. However, neither method has yet been elucidated because the position of LL on the way to innovation has not been fixed or explicitly clarified. In other words, since the effect of LL was not clear, the evaluation criteria for its performance have not been established. On the other hand, by highlighting the openness of the interests of participants in LL to the society, the expectation of introducing LL for the reinforcement of IMDJ’s effect to externalize feature concepts is clarified.
Lessons for Livings from Organizational Citizenship Behavior
Living lab activities have been considered as voluntary and organized social contribution activities, and as an evaluation viewpoint of their effects, the third author (Kondo) focused on the effects of LL that contribute to Organizational Citizenship Behavior (OCB). OCB has been defined as an individual behavior that is discretionary, not directly or explicitly recognized by the formal reward system, and that in the aggregate promotes the effective functioning of the organization (Organ, 1988). In the sense that OCBs are not part of the job description but are performed by an employee’s personal choice for positive contribution to the overall organizational effectiveness, contextual performance (nontask-related work behaviors and activities contributing to the social and psychological aspects of the organization (Borman & Motowidlo, 1993)), and extra-role behavior (behavior attempting to benefit the organization beyond existing role expectations (Dyne et al., 1995)), the authors aim to import from LL to data-mediated innovations. Organ (1988) developed, specified, or extended explanatory applications for industrial and governmental organizations. For example, the effect of LL to mediate political skills includes the ability to sense the influence of individuals on others and the intentions of others (Ohshima et al., 2018). It is also known that LL activities can result in the networking of participants and the expression of potential requirements. Since these results are thought to be related to the above-mentioned regulatory and explanatory factors of OCB, it is hypothesized that LL contributes to the enhancement of OCB using these factors. By introducing LL, OCB is enhanced, and participants shall be more sensitive to deep and wide potential requirements in society and the influence of individuals from/to each other. As shown in Fig.
10, the abstracts of OCB and LL collected from Wikipedia are visualized in one graph by KeyGraph to see the contact points between them, among the 117 words visualized. The words “work,” “social,” “personal,” “life,”
Fig. 10 The KeyGraph visualization of LL (right half) and OCB (left). (Source from Ohsawa et al., 2019b)
“experience,” “evaluation,” and “context” are shared between OCB and LL, to which concepts related to collaborative problem solving such as “problem,” “conflicts,” and “multidisciplinary” are linked. This implies that the featured effects of LL are to enhance and evaluate the performance of individuals in the context of social problem detection (i.e., requirement sensing) and the performance of an organization in externalizing problems due to the influence from/to each other and solving them from multidisciplinary viewpoints. In summary, these are regarded as the effects or causes of the openness of interactions in LL, which releases the participants’ social constraints and reinforces their awareness of their own and others’ living contexts. These can be described as follows.
1. Intention: LL opens participants’ sensitivity to others’ reasons for desiring the value, that is, the requirements of people, including those who may attend the workshop.
2. Preconstraint: LL externalizes the situation for using the value, that is, where and when to use an entity.
3. Postconstraint: LL participants are expected to communicate the forthcoming influence from and to individuals.
Items (1), (2), and (3) above form the three parts of a tsugo, proposed in Ohsawa and Akimoto (2013) as intentions (goals) and constraints (a part of contexts) of an individual, which can be regarded as the context in daily life here. So far, it has
been pointed out that tsugoes form information whose “stickiness” (von Hippel, 1994) may disturb the connection of knowledge of participants, which is essential for innovations. In other words, the contexts to be elicited in LL are essential information that should be shared by stakeholders, especially if they are interested in innovation. As a result, LLDJ can be regarded as a method for externalizing, exchanging, and sharing the contexts and feature concepts, which are the connection joints of stakeholders, for data-mediated innovations.
The Structure of Data-Federative Innovation with LLDJ
The proposed process of living lab on DJs (LLDJ) involves five simple steps.
Step 1) Set the topic Z. Collect the initial participants Π_LL interested in Z.
Step 2) Open an LL relevant to the topic Z from the viewpoint of daily life, that is, independent of data. Collect the set of requirements R_LL.
Step 3) Collect DJs hit by the query of keywords in R_LL using the DJ store.
Step 4) Open an IMDJ to Π_IMDJ, the participants collected by calling data providers, data scientists, and Π_LL for participation. Revisions: Z ← ρ(Z, R_LL, R_IMDJ); DJ ← DJ + contexts + feature concepts; add DJ_new: ∪_i DJ_i, a frequently used set of DJs.
Step 5) Π_LL ← Π_IMDJ; return the solution(s) and requirements in the IMDJ to Step 1.
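The five steps form a loop, which the following Python sketch makes explicit. It is only an outline under assumptions: the function `lldj`, the callables `open_ll`, `open_imdj`, and `revise_topic` (standing for the LL session, the IMDJ session, and the revision ρ), and the dictionary layout of the DJ store are hypothetical, and the addition of DJ_new in Step 4 is omitted. The stub sessions at the bottom return fixed, invented values so that the loop can be run.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical DJ store: title -> {"words": ..., "feature_concepts": ...}
DJStore = Dict[str, Dict[str, Set[str]]]


def lldj(topic: str, participants_ll: Set[str], dj_store: DJStore,
         open_ll: Callable, open_imdj: Callable, revise_topic: Callable,
         cycles: int = 1) -> Tuple[str, Set[str], List[str]]:
    solutions: List[str] = []
    for _ in range(cycles):
        # Step 1: the topic Z and the LL participants are given.
        # Step 2: open an LL on the topic and collect the requirements R_LL.
        r_ll = open_ll(topic, participants_ll, solutions)
        # Step 3: collect the DJs hit by the keywords in R_LL.
        hit = sorted(t for t, dj in dj_store.items() if r_ll & dj["words"])
        # Step 4: open an IMDJ, then revise the topic and the DJs that were used.
        participants_imdj, r_imdj, solutions, concepts = open_imdj(topic, participants_ll, hit)
        topic = revise_topic(topic, r_ll, r_imdj)
        for title in hit:
            dj_store[title]["feature_concepts"] |= concepts
        # Step 5: the IMDJ participants become the LL participants of the next cycle.
        participants_ll = participants_imdj
    return topic, participants_ll, solutions


# Stub sessions with invented outcomes, only to make the loop runnable.
def open_ll(topic, participants, previous_solutions):
    return {"stew", "season"}


def open_imdj(topic, participants, hit_djs):
    return (participants | {"data scientist"}, {"trend"},
            ["explain seasonal demand using " + ", ".join(hit_djs)], {"trend shift"})


def revise_topic(topic, r_ll, r_imdj):
    return topic + ": " + ", ".join(sorted(r_ll | r_imdj))


store = {
    "POS data": {"words": {"stew", "item"}, "feature_concepts": set()},
    "Earthquake sequence": {"words": {"epicenter"}, "feature_concepts": set()},
}
topic, participants, solutions = lldj("food market", {"citizen"}, store,
                                      open_ll, open_imdj, revise_topic)
print(topic, participants, solutions)
```

Passing the sessions in as callables keeps the human activities (the LL and the IMDJ) outside the loop itself, which is where the process places them.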
In Step 1, participants are regulated to communicate requirements and to propose solutions to satisfy the requirements, and to ask “why,” that is, the reasons for the requirements and for the solutions, that is, the methods “how” to satisfy the requirements, so that they deepen the requirements until they externalize the contexts of living and improve the methods fitting the contexts. This regulation borrows the idea from the deep reasoning question corresponding to “why?” used in design communications (Eris, 2004). In these studies of design methods, it has been suggested that deep reasoning questions (DRQ) and generative design questions (GDQ) corresponding to “how” questions are desired in a good balance in order to realize a highly evaluated design of products. In Steps 3 and 4, DJs relevant to utterances in LL relevant to the given topic Z are used or reused, with revising Z and the DJ set reflecting the presented contexts and feature concepts. In the abductive reasoning with DJs, both DRQ and GDQ are placed in a similar position that is downstream from the top (G’), which may originate from a detected event in data, to hypotheses to be validated by DJs. Thus, LLDJ and IMDJ can be compared, as shown in Fig. 11 and Table 1. As shown in Fig. 11, the LLDJ contributes to solutions for general social issues as well as issues considered in the previous IMDJ.
Fig. 11 The structure of Living Lab with Data Jackets (LLDJ): The feature concepts (FCs) and requirements call each other via communicated questions (why and how) in LLDJ

Table 1 The comparison of IMDJ versus LLDJ. Revised from Ohsawa et al. (2019b).
| Elements of a workshop | IMDJ | LLDJ |
| --- | --- | --- |
| Participants | Data providers/experts, data users, data scientists | Members revised by cycles (step 1 to 4), including ordinary citizens in LL and others similar to IMDJ |
| Visualization as common reference for participants | KeyGraph showing DJs and links between them | KeyGraph of words in LL and co-occurrence links between them, in addition to the graph for IMDJ, both revised by cycles |
| The communication | Requirements and solutions by use of multiple data, corresponding DJs connected in KeyGraph, combined into one | Questions asking “why” and “how” and proposals of FCs are explicitly requested, in addition to the communication of IMDJ |
| Structure of requirements | A few layers | Multiple layers called for via “why” questions |
In summary, the role of LL here is to open participants’ sensitivity to (1) others’ reasons for regarding something as valuable, that is, the requirements of people in the society, including those not attending the workshop, and (2) where and when to use the value, that is, the situation for using the thing. Because of the enhancement
of this sensitivity, it can be said that the elicitation and exchange of contexts and feature concepts are fostered in LLDJ even beyond the IMDJ so far.
Conclusions
Here, the meaning of innovation has been reviewed based on its original definition, and DJs have been redefined; on this basis, the effects of logical communication with abductive reasoning in IMDJ for innovation have been shown and the problem of IMDJ has been clarified. Innovation refers to combining elements for externalizing and/or inventing performance dimensions, which are latent factors for actions, including the goals of actions and the conditions for performing the actions. Then, these latent factors came to be related to essential elements, that is, joints, to connect diverse stakeholders’ knowledge and datasets, called feature concepts, in addition to contexts. The feature concepts bridge the gap between the required knowledge and data and between knowledge of different domains via analogy, which is necessary for realizing a high performance of data use/reuse. By asking the questions “why?” as well as “how?” in communication in IMDJ while manipulating the content of DJs extensively, the externalization of feature concepts and contexts is fostered. This structure is shown in Fig. 12. To reinforce the effect of IMDJ in externalizing feature concepts, LLDJ is shown here as a method to deepen and widen the requirements obtained from communication by inviting local aspects of people’s daily lives, to externalize essential social issues and useful feature concepts. This effect is not only due to covering a wider range of requirements, but also due to the tendency that the participants’ open awareness of others’ contexts is enhanced, and the network of their requirements and solutions is developed, where the elicitation of feature
Fig. 12 The structure of data-federative innovation using feature concepts and analogy-mediated communications in LLDJ
concepts is also involved in the supporting networking process via the reedition of DJs. In the ongoing project, the authors are designing a new system for online communicable DJs that can be revised and extended to reflect rising topics and expectations about data usage, to further reinforce the above effects by adding useful feature concepts.
Acknowledgments This study has been supported by the JST COI-NEXT ClimCore, the Cabinet Secretariat of Japan, and JSPS Kakenhi 20K20482 and 20K14981.
References
Abe, A. (2000). Abductive analogical reasoning. Systems and Computers in Japan, 31(1), 11–19.
Abe, A. (2003). The role of abduction in chance discovery. New Generation Computing, 21, 61–71.
Almirall, E., & Wareham, J. (2011). Living labs: Arbiters of mid- and ground-level innovation. Technology Analysis & Strategic Management, 23(1), 87–102.
Borman, W. C., & Motowidlo, S. J. (1993). Expanding the criterion domain to include elements of contextual performance. In N. Schmitt, W. C. Borman, & Associates (Eds.), Personnel selection in organizations (pp. 71–98). Jossey-Bass.
Dervin, B. (1992). From the mind’s eye of the user: The sense-making qualitative-quantitative methodology. In J. D. Glazier & R. R. Powell (Eds.), Qualitative research in information management (pp. 61–84).
Donaldson, M. (1992). Human minds: An exploration, Allen. The Penguin Press.
Drucker, P. F. (1985). The discipline of innovation. Harvard Business Review, 63(3), 67–73.
Dyne, V. L., Cummings, L. L., & McLean Parks, J. (1995). Extra-role behaviors: In pursuit of construct and definitional clarity. Research in Organizational Behavior, 17, 215–285.
Eris, O. (2004). Effective inquiry for innovative engineering design. Kluwer Academic.
Fearnhead, P., & Liu, Z. (2007). Online inference for multiple changepoint problems. Journal of Royal Statistical Society: Series B, 69(4). ISSN 1369-7412.
Følstad, A. (2008). Living labs for innovation and development of information and communication technology: A literature review. EJ. Virtual Organizations and Networks, 10, 99–131.
Hayashi, T., & Ohsawa, Y. (2016a). Data jacket store: Structuring knowledge of data utilization and retrieval system. Trans. Japanese Society for Artificial Intelligence, 31(5).
Hayashi, T., & Ohsawa, Y. (2016b). Meta-data generation of analysis tools and connection with structured meta-data of datasets. In Proceedings 3rd international conference on signal processing and integrated networks (pp. 226–231).
Hayashi, Y., & Yamanishi, K. (2015). Sequential network change detection with its applications to ad impact relation analysis. Data Mining and Knowledge Discovery, 29, 137–167.
Hippel, E. V. (1994). Sticky information and the locus of problem solving: Implications for innovation. Management Science, 40, 429–439.
Hippel, E. V. (2006). Democratizing innovation (New ed.). The MIT Press.
Iwasa, D., Hayashi, T., & Ohsawa, Y. (2019). Development and evaluation of a new platform for accelerating cross-domain data exchange and cooperation. New Generation Computing.
Kahn, B. K. (1995). Consumer variety seeking among goods and service. Journal of Retailing and Consumer Services, 2, 139–148.
Kean, A. C. Y., & Tsiknis, G. K. (1992). Assumption based reasoning and clause management system. Computational Intelligence, 8(1), 1–24.
Leminen, S., Westerlund, M., & Nyström, A. G. (2012). Living labs as open-innovation networks. Technology Innovation Management Review, 6–12.
Miyaguchi, K., & Yamanishi, K. (2017). Online detection of continuous changes in stochastic processes. International Journal of Data Science and Analytics, 3(3), 213–229.
Ohsawa, Y. (2003). KeyGraph: Visualized structure among event clusters. In Y. Ohsawa & P. McBurney (Eds.), Chance discovery (pp. 262–275). Springer.
Ohsawa, Y. (2018a). Graph-based entropy for detecting explanatory signs of changes in market. The Review of Socionetwork Strategies, 12, 183.
Ohsawa, Y. (2018b). Regional seismic information entropy for detecting earthquake activation precursors. Entropy, 20(11), 861.
Ohsawa, Y., & Akimoto, M. (2013). Unstick Tsugoes for innovative interaction of market stakeholders. International Journal of Knowledge and Systems Science (IJKSS), 4(1), 32–49.
Ohsawa, Y., Benson, N. E., & Yachida, M. (1998). KeyGraph: Automatic indexing by cooccurrence graph based on building construction metaphor. In Proceedings of the advanced digital library conference (IEEE ADL’98) (pp. 12–18).
Ohsawa, Y., Kido, H., Hayashi, T., & Liu, C. (2013). Data jackets for synthesizing values in the market of data. Procedia Computer Science, 22, 709–716.
Ohsawa, Y., Hayashi, T., & Kido, H. (2017). Restructuring incomplete models in innovators marketplace on data jackets. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 1015–1031). Springer.
Ohsawa, Y., Hayashi, T., & Yoshino, T. (2019a). Tangled string for multi-timescale explanation of changes in stock market. Information, 10(3), 118.
Ohsawa, Y., Kondo, S., & Hayashi, T. (2019b). Data jackets as communicable metadata for potential innovators – Toward opening to social. In Proc. Int’l Conf. on Intelligent Systems Design and Applications (ISDA).
Ohsawa, Y., Kondo, S., & Hayashi, T. (2022). Living beyond data with feature concepts. In Y. Ohsawa (Ed.), Living beyond data (pp. 3–27).
Ohshima, R., Miyazaki, G., & Haga, S. (2018). The mediating effect of political skill in influencing the effect of the big five personality domains on organizational citizenship behavior. Japanese Association of Industrial/Organizational Psychology Journal, 32(1), 31–41.
Organ, D. W. (1988). A restatement of the satisfaction-performance hypothesis. Journal of Management, 14(4), 547–557.
Reiter, R., & de Kleer, J. (1987). Foundations of assumption-based truth maintenance systems: Preliminary report. In Proceedings of AAAI-87 (pp. 183–188).
Rogers, E. M. (2003). Diffusion of innovations (5th ed.). Free Press.
Schumpeter, J. A. (1912). Theorie der wirtschaftlichen Entwicklung. Duncker & Humblot.
Takemura, K., Hayashi, T., Ohsawa, Y., Aihara, D., & Sugawa, A. (2018). Computational coach support using soccer videos and visualization. IEICE-TR, 117(440), 93–98. (in Japanese).
Part VIII Abduction and Economics
Introduction to Abduction and Economics
44
Fernando Tohmé
Contents
Abduction and Economic Reasoning
References
Abstract
We present a brief review of the potential uses of abduction in economic reasoning and review the chapters in the section on Abduction and Economics. There are two main areas in which abduction may play a role in Economics. One is in the development of hypotheses that may contribute to the development of a full theoretical framework of economic behavior. The other is in generating tools and procedures to detect and assess economic data. The four chapters in this section cover different aspects of the applications of abduction in Economics, ranging from discussions of historical and methodological nature to more specific applications in Econometrics and Macroeconomics.
Keywords
Abduction · Stylized facts · Economic theory · Econometrics
F. Tohmé
Dpto. de Economía, Universidad Nacional del Sur, INMABB-UNS-CONICET, Bahía Blanca, Argentina
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_87
Abduction and Economic Reasoning
Economics has been defined as “the science which studies human behaviour as a relationship between ends and scarce means which have alternative uses” (Robbins, 1935). The key words here are “scarce” and “alternative,” where the latter refers to entities guided by independent intentional agents. The analysis of optimal (or at least satisfactory) outcomes in the interaction of those agents is a fertile field for the application of abduction. The other forms of inference, deduction and induction, also play highly relevant roles. For instance, once the main hypotheses are in place, varied mathematical techniques are applied to derive consequences that can be contrasted with real-world data. On the other hand, rich statistical information permits economists to postulate “stylized facts” about the behavior of economic entities. Abduction plays a middleman role there, generating the following chain:
DATA → STYLIZED FACTS → HYPOTHESES → THEORIES → TESTABLE CLAIMS
Notice that this fits squarely into Charles Peirce’s description of scientific abduction (Peirce, 1992–1998). The analogy with one of Peirce’s favorite examples (Essential Peirce 2.216), namely Kepler’s laws of planetary motion, is straightforward. The transition from the “stylized facts” compiled by Tycho Brahe to Newtonian celestial mechanics was mediated by the hypothesis of elliptical planetary orbits.
Unlike physicists, economists have not yet come up with an overarching framework like Newton’s mechanics, and thus abduction is applied to fragments of the economy. This is a reflection of the difficulties found in trying to unify the varied forms in which intentional agents interact. Worse yet, data itself is affected by this heterogeneity, and thus economists were forced to create a whole subdiscipline (Econometrics) to handle both finding stylized facts and translating testable claims into measurable terms. John von Neumann was arguably the person most responsible for triggering the evolution of Economic Theory after WWII, by developing a mathematical toolbox specifically aimed at analyzing economic problems.
In his book with Oskar Morgenstern (von Neumann & Morgenstern, 1944), the authors claim that “it would have been absurd in physics to expect Kepler and Newton without Tycho – and there is no reason to hope for an easier development in Economics.” This realization, if correct, calls for a huge abductive jump in the understanding of economic phenomena, based on the compilation of a huge amount of data. The eight decades that have passed show that we are not yet even close to that. If anything, the last decades have taught us that perhaps the best way of proceeding is to compile much more data in the hope of extracting from it the desired abduction. In this sense, the fast development of Machine Learning techniques and the hopes of applying AI to build hypotheses on phenomenological observations may hold the key to the crucial abduction that may make sense of the entire discipline (Heckman & Singer, 2017). In the meantime, we can only discuss different facets of abduction in the field.
The chapters in this section cover those main aspects. James Wible addresses a quite interesting question, namely what we can say about Peirce’s own contribution
to the use of abduction to understand economic phenomena. Peirce’s conception of the economy of the “natural order of things” is clearly aligned with the general abstract description of Economics, including the existence of competing goals. Ramzi Mabsout, in his chapter, discusses the way in which the use of abduction in Economics provides a unifying theme underlying the diverse schools of thought, highlighting the role of hypotheses in their development. The contribution of Marcelo Auday, Ricardo Crespo, and Fernando Tohmé focuses on the specific role of abduction as a response to surprising events. They show that economic crises provide rich sources of information for the generation of new hypotheses that contribute to the design of economic policies. Finally, Fernando Delbianco and Fernando Tohmé discuss how abduction contributes to the generation of methodological assumptions in the analysis of economic data and how Machine Learning may help to generate sound hypotheses about economic phenomena.
In summary, the chapters concentrate on the historical, methodological, and even political use of abduction in Economics. While the existence of a Kepler-like abduction that could make sense of the entire realm of economic interactions is still an open question, these contributions show that abduction has an outstanding role in economic reasoning.
References
Heckman, J. J., & Singer, B. (2017). Abducting economics. American Economic Review, 107(5), 298–302.
Peirce, C. S. (1992–1998). The essential Peirce: Selected philosophical writings. Vol. 2, 1893–1913, ed. the Peirce Edition Project. Indiana University Press.
Robbins, L. (1935). An essay on the nature and significance of economic science. MacMillan and Co.
von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton University Press.
Abduction in Economics: A Philosophical View
45
Marcelo Auday, Ricardo Crespo, and Fernando Tohmé
Contents
Introduction
Abduction in Theory and Experience
Abduction in Economic Reasoning
Abduction at Work in Economics
Concluding Remarks
References
Abstract
In this chapter, we explore the role of abduction in the way economists explain economic phenomena and develop ways to test their theories. The main impact of abduction arises in events that disrupt the usual interpretations of economic phenomena, such as macroeconomic crises. The hypotheses inferred arise from the examination of ideas and real-world data, leading to a mix of qualitative and quantitative formulations of theoretical and empirical models.
M. Auday
Universidad Nacional del Sur (UNS) & IIESS-CONICET, Bahía Blanca, Argentina
e-mail: [email protected]
R. Crespo
IAE Business School, Universidad Austral & Conicet, Buenos Aires, Argentina
e-mail: [email protected]
F. Tohmé
Dpto. de Economía, Universidad Nacional del Sur, INMABB-UNS-CONICET, Bahía Blanca, Argentina
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_52
Keywords
Abduction · Economic reasoning · Economic models · Data models · Economic crises
Introduction
To contemporary ears, philosophy and economics sound like very distant disciplines. Philosophy seems to deal with immaterial and ethereal entities, while economics deals with material and concrete ones. However, philosophy has been present in economics since its very beginning. Plato and Aristotle already dealt with the philosophy of economics. The “father” of economics as an independent discipline, Adam Smith, was first and foremost a moral philosopher. A long list of economists with a strong philosophical bent followed in Smith’s footsteps: his close friend David Hume, John Stuart Mill, Karl Marx, John Neville and John Maynard Keynes, Herbert Simon, Albert Hirschman, and Amartya Sen, to mention only a few. Perhaps the most forceful recognition of the close relationship between the two disciplines was made by the renowned economist Robert Heilbroner in his book The Worldly Philosophers, published in 1953.
However, from the mid-twentieth century onward, few economists have bridged the divide between the two disciplines. A certain feeling of disorientation in the field may be attributed to this lack of philosophical reflection. The recurrence of economic and financial crises in recent decades, together with the fact that few economists predicted them, has led to a reconsideration of the philosophy of economics, although little has permeated back into economics. These philosophical discussions about economics have emphasized the analysis of economic methods, dealing with the scope and nature of statistics and measurement in economics, the validity of econometric methods, and the nature of models. This has led to the assessment of the meaning of causality and probability in economic matters. Despite these developments, the large literature on the methodology of economics generated after 1980 scarcely mentions abduction as a component of economic reasoning, and it reflects little on the origin and role of economic assumptions.
Yet, given that abduction plays a central role in all scientific areas, it is also a component of economic reasoning. In effect, as stated in Magnani (2009), “abduction is a basic kind of human cognition.” Thus, it cannot be absent in economics. The generally unrecognized pervasiveness of abduction in economic reasoning stems from the difficulties derived from the complexities of real-world economic phenomena. In turn, the complexity of economic events arises from their singularity, from the reflexive character of human affairs, and from the variable consequences on future events of human reactions to predictions of these events.
These features make it extremely difficult to derive generalizations through inductive inference. Hence, in addition to empirical data, background knowledge, metaphors, analogies, and intuitions provide the building materials for the construction of theoretical and empirical economic models. These models are obtained by combining those disparate pieces of information in a process of abduction. The purpose of this chapter is, precisely, to dissect the process of abduction in economics.
Abduction in Theory and Experience
Abduction can be roughly understood as a portmanteau concept, covering many notions that cannot be classified as corresponding to deductive or inductive reasoning processes. Moreover, it involves a series of procedures that cannot strictly be seen as “reasoning” processes (where “reasoning” is conceived as the construction of arguments). Salient instances of these procedures are model building, the generation of hypotheses, and the creation of concepts.
The central methodological problem of science in general (and economics, in particular) is to establish the relation between theory and experience. Many problems that arise in the quest to establish the right nature of this relation involve abductive processes. The relation between a theoretical language and a corresponding observational language, as seen in the traditional philosophy of science, is such that concepts expressed in the former cannot be obtained by inductive generalizations of expressions in the latter, nor can they be deduced from them. That is, it is not possible to generate theoretical hypotheses on the basis of empirical or experimental data by applying only inductive or deductive mechanisms. The relation between theory and experience is mediated, on both sides, by models:
• Theoretical models, aimed at capturing relevant features of the phenomena under consideration, are built on the basis of certain theoretical assumptions.
• Those models are tested by using data models. That is, the data used to test theoretical claims are not “rough” but structured. This structure also depends on assumptions, in this case about the design of experiments or the search for empirical data (by means of surveys, measurements, field experiments, etc.).
In principle, it seems natural to think that the theoretical assumptions guiding the construction of theoretical models should be the same as the ones that lead to the data models by providing guidelines for the collection of observational or experimental data. But this is not necessarily so. There exist technical or even moral limitations on data acquisition that impose extra conditions, on top of the theoretical ones. Worse yet, some theoretical assumptions may be disregarded in the process of gathering data.
An instance of the latter case can be found in the current gap between theoretical economics and experimental economics. Their corresponding scientific communities overlap (by focusing on the same topics) but differ in the methodological and conceptual assumptions they adopt. Ways of overcoming this gap are crucial to establishing a tighter relation between theory and experience in economics. An important contribution seeks to do that (Samuelson, 2005):
[This work] explores one aspect of this integration of experimental economics into economics. How can we usefully combine work in economic theory and experimental economics? What do economic theory and experimental economics have to contribute to one another, and how can we shape their interaction to enhance these contributions?
We can see that the process of contrasting a theoretical model also involves abduction in the construction of data models (and particularly in the design of lab experiments). Some authors state that the goal of empirical economics is to learn from data. This implies specifying what pieces of evidence should be deemed acceptable and how to learn from those data. In this, abduction plays a central role (Heckman and Singer, 2017):
It is the interplay of theory and facts and the evolution of both in the process of generating new hypotheses that is the core of abduction. A central feature of abduction is the quest for and construction of hypotheses and explanations, which are the most plausible candidates to account for an empirical phenomenon.
This poses a difference with the usual practice in statistics (Heckman and Singer, 2017): Missing from current practice is recognition of the great benefit for knowledge of going back and forth with data—i.e., learning from it, revising hypotheses in light of it, augmenting it with fresh data and fresh theoretical insights and suggesting new interpretations, and formulating new hypotheses suggested by the collection of new data.
The central issue is thus: The rigid separation of the processes of model generation and model testing—a central feature of the formulation of the identification problem—while analytically convenient—is artificial.
This can be easily related to the analysis in Samuelson (2005). This examination of the theoretical and experimental literature on the notion of subgame perfect equilibria and the ultimatum game shows how theorists and experimentalists have interacted in a back and forth process. Theoretical and experimental proposals, arising from reviewing older theoretical assumptions and the evaluation of the reach of experimental results, have been exchanged leading to new ideas and interpretations. The fact that abduction plays a role in all the steps involved in generating and testing hypotheses suggests that a dynamical process underlies all of them. It involves a feedback between the creation of hypotheses, the generation of data, the test of hypotheses, and then the revision of the latter as well as of the experiments and observations carried out.
On the theoretical side, notice that there exist assumptions at a meta-level of the relation between theories and theoretical models. Consider, for instance, the assumption that economic agents are optimizers. It guides the construction of specific theoretical models to explain the phenomena under analysis. If a model does not work, it can be partially modified, keeping the assumption of optimizing behavior. This amounts, for instance, to incorporating informational limitations (such as imperfect and incomplete information) or to modifying the properties of certain elements of the model (say, by dropping the requirement of completeness of preferences or the conditions imposed on the set of alternatives). An illustrative example of this can be found in the critiques in Sen (1993) of the theory of revealed preferences. There are cases in which choices cannot be reconstructed in terms of the standard assumptions of revealed preferences. Nevertheless, Bossert and Suzumura (2010) show that if the set of choices is adapted, those choices can again be rationalized.
These considerations seem to point toward displacing the focus of attention from the processes involved in building isolated theoretical and data models to families of models for the same class of phenomena. Assessing the appropriateness of a single model requires considering the alternative models. A robust model is one in which no further amendments seem needed.
Abduction in Economic Reasoning
John Maynard Keynes wrote to Roy Harrod (Keynes, 1973, p. 296, letter to Harrod, 4 July 1938):
It seems to me that economics is a branch of logic; a way of thinking [. . . ] one cannot get very far except by devising new and improved models. This requires, as you say, “a vigilant observation of the actual working of our system”. Progress in economics consists almost entirely in a progressive improvement in the choice of models [. . . ] Economics is a science of thinking in terms of models joined to the art of choosing models which are relevant to the contemporary world. [. . . ] The object of a model is to segregate the semi-permanent or relatively constant factors from those which are transitory or fluctuating so as to develop a logical way of thinking about the latter, and of understanding the time sequences to which they give rise in particular cases.
Rodrik (2015) does not refer to Keynes’ ideas on models, but in Rodrik (2018), he declares: Had I been familiar with this quote from Keynes before I wrote the book, I might have chosen not to spend the effort!
For Rodrik, the essential work of the economist is to select appropriate models or develop new ones adapted to specific circumstances, capturing “the most relevant aspect of reality in a given context” (Rodrik, 2015, p. 11). The reason for this is the complexity and contingency of social life (Rodrik, 2015, pp. 67 and 116): “there are few immutable truths in economics” (Rodrik, 2015, p. 148). Rodrik applies Isaiah Berlin’s famous metaphor (Berlin quotes the Greek poet Archilochus: “the fox knows many things, but the hedgehog knows one
big thing” (Berlin, 1953, p. 1).) to economists, classifying them as being either hedgehogs or foxes. Hedgehog economists think that there is always one way of resolving an economic problem, regardless of context, while fox economists will answer, “it depends” (Rodrik, 2015, p. 175). Rodrik favors fox economists. In the same train of thought, he supports Albert Hirschman’s criticism of some scholars’ compulsive tendency to look for all-encompassing theories that discard real-world contingencies (cited in Rodrik, 2015, p. 145).
Neither Keynes nor Rodrik nor Hirschman refers to abduction. However, abduction is present in the aforementioned arguments. In fact, as discussed in the previous section, abduction is an essential component of economic analysis, both theoretical and empirical. Economic theory generally proceeds by constructing models (Morgan & Morrison, 1999), that is, mental schemes based on mental experiments (Nersessian, 1992). Models are often written in mathematical language, but, apart from their formal expression, they use metaphors, analogies, and pieces of intuition to motivate their assumptions and to give support to their conclusions (Frigg & Hartmann, 2006).
In dealing with ongoing economic processes, agents and analysts must generally evaluate whether the situation resembles in a relevant way some instances observed or studied in the past and whether this warrants applying somehow the “lessons” drawn from those experiences. The problem in judging “whether some pasts are good references for the future” becomes particularly severe when the economy is seen to undergo important changes (Crespo et al., 2010).
Criteria such as simplicity, explanatory power, coherence, and testability are applied rather unconsciously in the abduction of possible explanatory models. Each context indicates how much weight each virtue carries. For example, as Keynes contends, vagueness may be more virtuous than precision when dealing with the complex social realm.
For him, elegance and simplicity may be misleading, and economy may be a vice instead of a virtue. This is compatible with Peirce’s thought. For him, “simplicity” does not imply a “simplified” hypothesis, but “the more facile and natural, the one that instinct suggests, that must be preferred” (Peirce, 1958, 6.477). The retroductive phase in model building creates the opportunity to make abductive-like decisions. Although it sounds rather obvious, we have to acknowledge the existence of a gap between the formulation of a question to be answered through measurement and the actual measurement providing the right answer. The difference arises from the fact that problems are qualitative, while data is quantitative. In consequence, rough data (which certainly are the quantitative counterparts of qualitative concepts) must be organized according to the qualitative structure to be tested. The inferences that allow economists and econometricians to detect patterns in reams of data cannot be understood as being statistical inductions. They are more a result of a detective-like approach to scarce and unorganized information, where the goal is to get clues out of datasets of rough observations and disclose hidden statements that could make them meaningful. In other words, it is a matter of
making guesses, which later can be put in a deductive framework and tested by statistical procedures. So far, it seems that it is just an “artistic” feat, which can only be performed by experts. David Autor (2012), relying on his experience in editing the Journal of Economic Perspectives, has asserted about the process of economic reasoning:
Economic research often begins with a big interesting question, which also tends to be sprawling and unmanageable. So the researcher breaks down the question into chunks, carefully examining assumptions and interpretations along the way, diving deeply into analysis. Papers in the refereed literature result from such deep dives. But as these papers are discussed and digested, their lessons are brought back from the deep where they can be more broadly appreciated. This process is as indispensable for scholars as it is for end users. Academics master and ultimately digest frontier scholarship by distilling its insights down to a few big facts, simple models, and reliable predictive relationships.
Economists have a background of general rules. When a surprising or abnormal fact appears, the first step is to try to come up with an explanation according to those rules. The appearance of a surprising or abnormal fact reveals that some relevant information is ignored. Gabbay and Woods (2005, p. 85) rephrase this as “abduction is triggered by the irritation of ignorance.” This irritation may be weak or strong. It is weak when it can be guessed that the event can be explained by previous knowledge (rules, theories). In turn, it is strong when there are no clues in the background of prior knowledge of any possible explanation. Weak ignorance or abnormal/surprising events often lead to selective abduction, while strong ignorance leads to creative abduction.
The best explanation is obtained by delimiting the possible hypotheses until only one of them remains. In this process, the economist captures simple and coherent hypotheses and models by taking into account not only the features of the specific case but also information about similar situations. Let us give a sketchy systematization of this reasoning process in economics. We can clearly differentiate the following steps in this process:
1. An abnormal, surprising, or ignorance-triggering event is detected, requiring an explanation.
2. The event is carefully described.
3. Some stylized facts are extracted from the description.
4. Situations sharing the same stylized facts are given particular attention.
5. Possible explanations based on a theory, on a modified theory, on a combination of theories (sometimes supposing a decision about possibly competing theories), or on a completely new theory are imagined.
6. Formal expressions, capturing the relations deemed essential in the explanation of the relevant stylized facts, according to the previous step, are formulated.
7. Only those combinations of deductive chains and inductive plausibility that are both externally and internally coherent are chosen, discarding other possibilities.
8. This provides one or more original, coherent explanations of the event.
9. The conclusions are tested.
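The steps above can be read, very loosely, as a filter-and-rank procedure over candidate explanations. The following Python sketch is only an illustration under strong simplifying assumptions: stylized facts are plain labels, each candidate declares which facts it explains and which it contradicts, and "simplicity" is a single integer. The class `Hypothesis`, the function `abduce`, and the toy facts and candidates are hypothetical names introduced here for illustration; they do not come from the chapter or from any of the works it cites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset     # stylized facts the hypothesis accounts for
    contradicts: frozenset  # stylized facts it is incompatible with
    complexity: int         # crude, assumed proxy for lack of simplicity


def abduce(stylized_facts, candidates):
    """Return the surviving candidates, most promising first.

    A candidate survives if it contradicts none of the stylized facts
    (a stand-in for the coherence check of step 7). Survivors are ranked
    by the number of facts they explain and then by simplicity.
    """
    facts = frozenset(stylized_facts)
    survivors = [h for h in candidates if not (h.contradicts & facts)]
    return sorted(
        survivors,
        key=lambda h: (-len(h.explains & facts), h.complexity),
    )


# Toy input: labels only, with no empirical content.
facts = {"devaluation", "recession", "inflation"}
candidates = [
    Hypothesis("H1", frozenset({"devaluation", "recession"}), frozenset(), 1),
    Hypothesis("H2", frozenset({"devaluation", "recession", "inflation"}),
               frozenset(), 2),
    Hypothesis("H3", frozenset({"devaluation"}), frozenset({"recession"}), 1),
]

ranking = abduce(facts, candidates)
print([h.name for h in ranking])
```

In this toy run H3 is discarded because it contradicts an observed fact, and H2 is ranked ahead of H1 because it accounts for more of the facts despite being less simple. The sketch covers only the selective side of abduction; the creative generation of candidates in step 5 is taken as given.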
Abduction is hidden especially in steps 3, 4, 5, and 7. Steps 6 and 7 are mostly deductive. Step 9 is also inductive and retroductive. The whole process is a Peircean qualitative inductive process (in the sense defined in Rescher, 1978, p. 3), but it almost always also uses instrumental assumptions. The so-called “as if” arguments are commonly applied in economics. Good economists have a guessing instinct (Peirce, 1958, 6.476–477) enacted in these scientific processes. This is not a mysterious miracle but an intellectual intuition, stemming from a theoretical framework or background knowledge combined with the experience of having worked hard with theories, models, and data. This leads them to foresee a set of probably successful models.
Combining this gift with hard empirical work, economists often overcome the problems of underdetermination of theories by formulating local or context-dependent theories. Context dependence is a characteristic feature of any Inference to the Best Explanation (Day & Kincaid, 1994; Cresto, 2006). Even so, economists are often not satisfied with this and keep trying to improve their models. Given the fluctuating ontological condition of the economic material, improvements are obtained only by remaining closely related to the details of real situations. Previous knowledge must be used with caution, since analogies may fail and old or conventional theories may be misleading. Thus, economists need that special “gift for using vigilant observation to choose good models” (Keynes, 1973, p. 297).
In any case, the search for improvements must stop at some point. This is due, on the one hand, to the urgency of making decisions that cannot wait for further investigation. The economy of research (Rescher (1978, 65ff.), extensively quoting Peirce) requires accepting conclusions supported by reasonable inferences to the best explanation even if they may be fallible.
On the other hand, the quantitative quality requirements of data acquisition, processing, and even presentation lead to accepting conclusions that may only be good enough.
There exist two ways in which abduction and economical reasoning are related. One is of course defined by the application of abduction to model building in economics. The other relation arises in the aforementioned economy of research. Peirce himself, in 1879, connected these two different notions:
Peirce [. . . ] goes beyond economic metaphors and qualitative narratives and actually reformulates Jevons’ (1871) famous marginal utility model for balancing the consumption of two different goods to balancing additional funds for additional scientific research projects. (Wible, 1994)
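The idea of balancing additional funds across research projects can be illustrated with a small numerical sketch. The Python code below is not Peirce's or Jevons' own model: it assumes, purely for illustration, that each project has a concave value function of the funds it receives, and it hands out indivisible funding units one at a time to whichever project currently offers the largest marginal gain. The function `allocate`, the project names, and the square-root value functions are hypothetical choices made here.

```python
import math


def allocate(budget_units, utilities):
    """Greedy allocation of indivisible funding units.

    `utilities` maps a project name to an assumed concave function u(x)
    giving the value of spending x units on that project. Each unit goes
    to the project with the largest marginal gain; for concave u this
    greedy rule yields a value-maximizing allocation.
    """
    spent = {name: 0 for name in utilities}
    for _ in range(budget_units):
        best = max(
            utilities,
            key=lambda n: utilities[n](spent[n] + 1) - utilities[n](spent[n]),
        )
        spent[best] += 1
    return spent


# Two hypothetical research projects with diminishing returns.
projects = {
    "A": lambda x: 4.0 * math.sqrt(x),
    "B": lambda x: 2.0 * math.sqrt(x),
}

allocation = allocate(10, projects)
print(allocation)
```

With these toy value functions the ten units split into eight for project A and two for project B, the point at which the marginal values of the two projects coincide (4/(2√8) = 2/(2√2)). This is the marginalist balancing condition applied to research funding rather than to the consumption of goods.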
Abduction at Work in Economics
The history of economic thought presents different uses of abduction, even before Peirce introduced the concept. In classical political economy, a salient instance can be found in the work of perhaps the most renowned of the founding fathers of the discipline, David Ricardo. Indeed, as shown by Peirce (Hoover, 2018; Hoover & Wible, 2020), some form of ampliative inference, mixing abduction and induction,
can be found in Ricardo’s theory of rent. (Other contributions of Peirce to economics are examined in Mabsout, 2015.) In modern economic theory, a salient example arises in the joint work of Milton Friedman and Leonard Savage on risk-taking (Friedman & Savage, 1948). Faced with the surprising fact of the contradiction between the usually assumed concave utility function and the existence of risk-taking agents, they produced an alternative utility function with both concave and convex sections, capturing in a single framework the risk aversion of most economic agents together with the risky behavior of others (Hirsch & de Marchi, 1990, p. 18).
Even with these and other instances at hand, economists hardly mention abduction explicitly. Only recently has the term entered the economic literature. Heckman and Singer (2017) point out that the seminal work of Friedman (1957) on the consumption function is a clear instance of abductive inference. They also refer to Gary Becker as a “master of abduction” and to Acemoglu and Robinson (2012) as “an extensive abductive analysis investigating sources of success and failure of nations.” Steven Durlauf (2020) analyzes the details of Acemoglu and Robinson (2012). More generally, he states that “the state of knowledge in the institutions and growth literature is a successful example of abductive reasoning, also known as inference to the best explanation in the philosophy of science literature.”
Samuelson (2005) formalizes the relations between economic theory and experimental economics. While never explicitly mentioning abduction, it presents a transparent specification of the different abductive processes involved in those interactions. Gilboa et al. (2015) include in what they call “inductive inference” different types of reasoning, like analogical and case-based reasoning, which other authors consider to be varieties of abduction.
So, for instance, Hoover (2018) points out that “analogical reasoning represents a constructive interplay of abduction and induction.” The fact is that the distinction between induction and abduction becomes blurred at some points.
The need to explain surprising or unforeseeable phenomena sometimes clashes with a limitation of current approaches in empirical economics. According to Heckman and Singer (2017), these approaches lack “formal guidelines for taking the next step and learning from surprising findings. There is no established practice for dealing with surprise, even though surprise is an everyday occurrence.”
Economic crises are a source of surprising events that require abducing possible explanations for them. Crespo et al. (2010) formalize processes of abduction to develop a stylized model of economic crises. Tohmé and Crespo (2013) delve into the nuances of that modeling process. Hoover (2016), in a survey of the contributions of Swedberg (2014) and Orléan (2014), remarks that the financial crisis of 2007/2008 exposed a concomitant crisis in mainstream macroeconomic theory. Both the survey and Swedberg (2014) use the notion of abduction to explain how theories are built in economics. Some interesting examples of abduction arise in the search for explanations of notorious cases of economic failure:
• The recessionary effects of multiple devaluations in Argentina, a topic that has been extensively studied by many scholars. These effects have been attributed to the idiosyncratic economic and social characteristics of Argentina. Díaz Alejandro (1965, Chap. 2) proposed a possible hypothesis to explain this phenomenon, namely, that devaluations produce a redistribution of income toward the agricultural export sectors, which have a lower marginal propensity to consume domestic goods. An alternative (or complementary) hypothesis for why multiple devaluations cause recessions is that increases in the foreign exchange rate increase consumer price inflation, leading the authorities to enact stabilization policies with recessionary consequences.
• During the 1970s, economists noticed that in Argentina, Brazil, Chile, and Uruguay, the use of restrictive monetary policies to fight inflation seemed to induce stagflation. In those countries, the implementation of anti-inflation monetary policies was followed by a reduction in real output accompanied by an acceleration of inflation. Cavallo (1977) advanced the hypothesis that a restrictive monetary policy diminishes aggregate demand and raises the interest rate, leading to increases in both costs and cost inflation.
• Lagos and Llach (2011) enumerate 38 different hypotheses for the causes of the Argentine decline, including sociological, cultural, institutional, historical, and political hypotheses, as well as many economic ones, some of them contradicting each other. In Llach and Lagos (2014), these previous theses are reviewed and contrasted with a new one, based on the comparison of Argentina with Brazil, Chile, Uruguay, and New Zealand. This hypothesis proposes the interaction of historical (strong path dependency), economic, structural, institutional, sociological, ethical, and cultural causes which, albeit also present in the other countries under study, seem to prevail extensively in Argentina.
The country always needs a positive current account balance to avoid defaulting on its debts. To this end, it requires keeping a devalued domestic currency. Consequently, the overvalued foreign currency increases the weight of foreign debts relative to GDP. Additionally, the loss of value of the domestic currency produces inflation, which sterilizes the positive effects of the devaluation by generating a recession. A stabilization plan (a restrictive monetary policy with tax increases) aimed at eliminating the government deficit deepens the recession and reinforces stagflation. A recession accompanied by debt increases leads to debt default. Real wages fall, spurring strong social discontent. Eventually, fiscally irresponsible populist policies are implemented, which relax the stabilization policy and lead to higher inflation, which can end up becoming hyperinflation.
Concluding Remarks
The role of abduction in the construction of economic models presents some peculiarities that make it somewhat different from the way it is applied in other disciplines. This is because of the mixed qualitative/quantitative nature of economics.
The hypotheses to be generated abductively must be imprecise enough to be translatable into different possible mathematical formats. The dual role of abduction, as a generator of content in economic reasoning and as a tool for carrying out efficient investigations, contributes to the construction of models in two different ways. It helps to yield theoretical models and also to design empirical and experimental models to test them. Notice that the results of the abductions discussed in the previous section are hard to test empirically without adding extra assumptions about how the data for their verification is generated. These assumptions are usually a matter of debate and thus hardly lead to the consensual implementation of policies to prevent or correct the undesired results that triggered the abductive processes. The so-called “schools” of economic thought are the response to discrepancies on either theoretical or policy-making issues. A still unexplored area is the search for inductive inferences in the engineering of macroeconomic policies, unlike in the theory and applications of the theory of regulation of markets (see Section 5.4 in Tohmé and Crespo 2013).
References
Acemoglu, D., & Robinson, J. (2012). Why Nations Fail: The Origins of Power, Prosperity and Poverty. New York: Crown Publishing Group.
Autor, D. (2012). The Journal of Economic Perspectives at 100 (Issues). Journal of Economic Perspectives, 26, 3–18.
Berlin, I. (1953). The Hedgehog and the Fox: An Essay on Tolstoy’s View of History. London: Weidenfeld and Nicolson.
Bossert, W., & Suzumura, K. (2010). Consistency, Choice, and Rationality. Cambridge, MA: Harvard University Press.
Cavallo, D. (1977). Stagflationary Effects of Monetarist Stabilization Policies. Ph.D. Thesis, Harvard University.
Crespo, R., Tohmé, F., & Heymann, D. (2010). Abducing the crisis. In L. Magnani, W. Carnielli, & C. Pizzi (Eds.), Model-Based Reasoning in Science and Technology: Abduction, Logic, and Computational Discovery (pp. 179–198). Berlin: Springer.
Cresto, E. (2006). Inferring to the Best Explanation: A Decision Theoretic Approach. Ph.D. Thesis, Columbia University.
Day, T., & Kincaid, H. (1994). Putting inference to the best explanation in its place. Synthese, 98, 271–295.
Díaz Alejandro, C. (1965). Exchange-Rate Devaluation in a Semi-industrialized Country: The Experience of Argentina, 1955–1961. Cambridge, MA: MIT Press.
Durlauf, S. (2020). Institutions, development and growth: Where does evidence stand? In J.M. Baland, F. Bourguignon, J.-P. Platteau, & T. Verdier (Eds.), The Handbook of Economic Development and Institutions (pp. 189–217). New York: Princeton University Press.
Friedman, M. (1957). A Theory of the Consumption Function. New York: Princeton University Press.
Friedman, M., & Savage, L. (1948). The utility analysis of choices involving risk. Journal of Political Economy, 56, 279–304.
Frigg, R., & Hartmann, S. (2006). Models in science. In Stanford Encyclopedia of Philosophy. (plato.stanford.edu/entries/models-science/).
Gabbay, D., & Woods, J. (2005). The Reach of Abduction: Insight and Trial (A Practical Logic of Cognitive Systems, Vol. 2). Amsterdam: Elsevier.
Gilboa, I., Samuelson, L., & Schmeidler, D. (2015). Analogies and Theories: Formal Models of Reasoning. New York: Oxford University Press.
Heckman, J., & Singer, B. (2017). Abducting economics. American Economic Review, 107, 298–302.
Hirsch, A., & de Marchi, N. (1990). Milton Friedman. Economics in Theory and Practice. Ann Arbor: The University of Michigan Press.
Hoover, K. (2016). The crisis in economic theory: A review essay. Journal of Economic Literature, 54, 1350–1361.
Hoover, K. (2018). Models, Truth and Analytic Inference in Economics. Center for the History of Political Economy at Duke University Working Paper Series (2019-01).
Hoover, K., & Wible, J. (2020). Ricardian inference: Charles S. Peirce, economics, and scientific method. Transactions of the Charles S. Peirce Society, 56, 521–557.
Keynes, J. M. (1973). The General Theory and After: Part II. Defence and Development, The Collected Writings of John Maynard Keynes (Vol. 14). London: Macmillan.
Lagos, M., & Llach, J. (2011). Claves del Retraso y del Progreso de la Argentina. Buenos Aires: Temas.
Llach, J., & Lagos, M. (2014). El País de las Desmesuras. Buenos Aires: Editorial El Ateneo.
Mabsout, R. (2015). Abduction and economics: The contributions of Charles Peirce and Herbert Simon. Journal of Economic Methodology, 22, 491–516.
Magnani, L. (2009). Abductive Cognition. The Epistemological and Eco-cognitive Dimensions of Hypothetical Reasoning. Berlin: Springer.
Morgan, M., & Morrison, M. (1999). Introduction. In M. Morgan & M. Morrison (Eds.), Models as Mediators (pp. 1–9). Cambridge: Cambridge University Press.
Nersessian, N. (1992). In the theoretician’s laboratory: Thought experimenting as mental modeling. Proceedings of the Biennial Meeting of the Philosophy of Science Association, 2, 291–301.
Orléan, A. (2014). The Empire of Value: A New Foundation for Economics. Cambridge, MA: MIT Press.
Peirce, C. S. (1958). Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press.
Rescher, N. (1978). Peirce’s philosophy of science: Critical studies in his theory of induction and scientific method. Transactions of the Charles S. Peirce Society, 15, 176–179.
Rodrik, D. (2015). Economic Rules. Oxford: Oxford University Press.
Rodrik, D. (2018). Second thoughts on economic rule. Journal of Economic Methodology, 25, 276–281.
Samuelson, L. (2005). Economic theory and experimental economics. Journal of Economic Literature, 43, 65–107.
Sen, A. (1993). Internal consistency of choice. Econometrica, 61, 495–521.
Swedberg, R. (2014). The Art of Social Theory. New York: Princeton University Press.
Tohmé, F., & Crespo, R. (2013). Abduction in economics: A conceptual framework and its model. Synthese, 190, 4215–4237.
Wible, J. (1994). Charles Sanders Peirce’s economy of research. Journal of Economic Methodology, 1, 135–160.
Abduction in Econometrics
46
Fernando Delbianco and Fernando Tohmé
Contents
Introduction
Some Econometric and Modeling Background
Possible Classic Econometric Tools to Perform Abduction
Abduction and Machine Learning
Meta-analysis and Zooming In
Abduction and Causality
Conclusions
References
Abstract
We analyze in this chapter the nature and definition of abduction in Econometrics. Unlike abduction in Economics in general, here we address the question of how surprising results in empirical analyses arise and may be treated. We cover different ways in which traditional econometric methods as well as machine learning tools handle abductive processes. While this is usually not made explicit, the methods we cover here proceed by generating new assumptions of a conceptual or methodological nature to manage surprising outcomes. In the case of traditional econometrics, we discuss the unexpected results that may arise in the estimation of regression coefficients. Failures to capture the actual functional forms or to include relevant variables, or violations of other assumptions, may lead to wrong estimations. Different methods, like using information criteria or dummy saturation, may help to correct the causes of those failures. Machine learning methods like LASSO or unsupervised learning, as well as heterodox econometric methods like autometrics, may help to address the question of obtaining the right elements of a model by just analyzing data, without a very precise initial theoretical model. Other methods covered in this chapter consist in the use of meta-analytic and transitional inference to either use or obtain different answers to the same empirical question. We also discuss different methods to assess assumed causal relations or to detect them in data. In any case, all the rich information provided by the methods discussed here is obtained through abduction processes.
Keywords
Econometrics · Machine Learning · Abduction · Error Term
F. Delbianco: Universidad Nacional del Sur (UNS) & INMABB-CONICET, Bahía Blanca, Argentina
F. Tohmé: Dpto. de Economía, Universidad Nacional del Sur, INMABB-UNS-CONICET, Bahía Blanca, Argentina
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_53
Introduction
Charles Sanders Peirce considered three modes of inference, i.e., processes in which a body of data A yields another piece of information B. One of them, usual in theoretical fields, is deduction, which allows reaching a logical conclusion from a set of premises. In turn, most empirical studies are instances of induction, which generates a general principle starting from a body of observations. The third mode of inference is abduction. Like induction, it starts with a body of observations, but instead of generating a prototype pattern, it finds an explanation for those observations. As argued in Magnani (2001), abduction is a valid kind of scientific reasoning, concerned with finding explanatory hypotheses. In this sense, it can be applied in theoretical as well as empirical fields. In the former case, it involves a creative search for alternative explanations. In the case of empirical studies, it involves the search for theoretical constructs useful to predict potential observations and make sense of already recorded ones. Interestingly, abduction can also be applied to find adequate methods to make better statistical (i.e., inductive) inferences. In this sense, the point made in Bellucci and Pietarinen (2020) becomes relevant, namely, that the justification of abductive reasoning resides in the power of truly knowing things by means of abductive reasoning. This seeming circularity (not a vicious one, according to these authors) is particularly critical for the justification of the adoption of empirical methods. But then, notice that the actual “usefulness” of those methods results from abductive arguments since, unlike theoretical constructs, which can be assessed in terms of their mathematical soundness, empirical techniques are validated by abductions carried out by practitioners.
Econometrics is the field of Economics that applies statistical methods to real-world data to detect patterns that are useful to assess economic relationships. In this sense, it is a field in which inductive inference is widely applied, in the same way as economic theory results from the widespread application of deductive reasoning. But econometricians also apply abductive reasoning, frequently without acknowledging it. In this chapter we intend to discern clearly in which cases econometrics uses abduction instead of induction. Only very recently has this distinction attracted interest. A very influential contribution in this sense is the article by the Nobel Prize winner James Heckman and Burton Singer, published in the 2017 Papers and Proceedings issue of the American Economic Review, in which they state that while the goal of running empirical analyses is to learn from data, the question of how to assess whether the evidence is admissible, or what is the best way to learn from it, is far from settled (Heckman & Singer, 2017). Econometricians have a rich toolbox of estimation and testing tools to contrast theoretical claims. Nevertheless, empirical results can still be surprising, in the sense of not being expected according to a theory or to previous results. According to Heckman and Singer, making sense of such surprises amounts to performing an abduction. But there is no off-the-shelf method to abduct an explanation from surprising evidence. It may require the joint efforts of several researchers and perhaps a sequence of approximations. Heckman and Singer summarize the essence of abduction in this field in the following claim:
The abductive model for learning from data follows more closely the methods of Sherlock Holmes than those of textbook Econometrics.
Taking their lead, we aim to examine in finer detail the process of carrying out abductive inferences in econometrics, schematized in Fig. 1. In the course of executing a regular empirical analysis, the appearance of a surprise (an empirically validated non-expected result) yields an original piece of evidence, unlike the previous ones and not inferred by means of classical logical or statistical procedures. This new information can be made sense of only by resorting to abduction.

Fig. 1 Schematics of econometric abduction: the workflow that starts with an economic theory and ends by carrying out post-estimation exercises may yield unexpected results. They, in turn, constitute a new piece of evidence. This feeds back into the theoretical foundation as well as into the economic model and, finally, into the empirical model to be estimated. (Diagram boxes: Economic Theory, Economic Model, Empirical Model, Empirical Results, Post Estimation, Surprise, New Piece of Evidence.)

Abductive reasoning combines the use of evidence and theoretical considerations. Abduction is similar to induction in deriving conclusions from outcomes. This is unlike deduction, which makes inferences derived from assumed premises. In turn, unlike induction, abduction reasons backwards from outcomes to yield the framework from which they could be logically derived. The primary outputs of abductions are new theoretical constructions. But they can also yield new empirical models and new testing procedures appropriate for them. In this chapter we focus on the abduction of the latter constructs in econometrics. They answer the question of how to evaluate the robustness of econometric models against new and unexpected evidence. This constitutes an important aspect of economic modeling, which amounts to the evaluation of empirical models when running econometric exercises.

The literature on abduction in econometrics is rather scarce. In addition to the contribution of Heckman and Singer (2017), we can also mention Goyal (2017), who calls for running abductions based on a sequence of macroeconomic results. She states that “macroeconomic theories are being constantly surprised by events they are unable to predict, prevent or even understand them.” Also in the field of macroeconomics, Durlauf (2020) addresses the extensive literature that reports a wide dispersion of results on the relation between institutions and growth. Durlauf indicates that such abundance of evidence calls for an abduction process. The aforementioned contributions criticize the current approach, which starts by considering a bag of theoretical models, seemingly arising from a priori considerations, which are then subjected to a battery of falsification tests. According to Heckman and Singer, this way of proceeding does not allow for a sound abduction process, because it does not highlight the fundamental role of data in the determination of the final model.

An earlier contribution by Marostica et al. (2000) claims that the difficulties associated with the detection of patterns in the real world show that it is hardly a matter of automatic curve fitting. Those authors refer to Simon (1968), who indicates that an approximate generalization is, according to any statistical test, indistinguishable from the form of a wrong generalization. Then, an inductive inference must start by checking the data before making any hasty generalization. Abduction becomes relevant because the qualitative evidence obtained in such prior analysis is not easily translatable into quantitative forms that can be statistically supported.
In turn, Kuorikoski et al. (2010) claim that in Economics the refinement of theoretical models involves applying robustness analyses. That is, the adoption of a certain model and its underlying assumptions depends centrally on the robustness of its results. Reiss (2012) points out that three inconsistent points of view are widely held about models and explanatory hypotheses in Economics: (1) economic models are false; (2) despite this, economic models are explanatory; and (3) only true accounts explain. To recover logical consistency, authors usually keep two and reject one of these points of view. Reiss claims that none of the resulting resolutions works and concludes that economists cannot get rid of this so-called Explanation Paradox.

To put these ideas in more precise terms, we consider four expressions, $Y$, $X$, $f(\cdot)$, and $\mu$, combined as $Y = f(X) + \mu$. The simplest specification is the linear one, $Y = \beta X + \mu$. $Y$ and $X$ represent variables, where the former is dependent and the latter is exogenous: $f(\cdot)$ explains the values of $Y$. The portion of $Y$ not explained by $f(\cdot)$ is captured by $\mu$. Then, in broad terms, the main decisions to make are:
1. Determine which variables are explained and which ones explain them. Some of the explained variables can in turn explain other endogenous ones.
2. Choose the functional form of $f(\cdot)$. As said, the simplest one is linear, but the choice depends on the problem at hand.
3. Make assumptions about the non-observed effects. In particular, define what properties of the errors are to be assumed. Notice that there might exist non-observed effects that are not captured by the error term.
4. Given the choices 1 to 3, choose the estimation method for $f(X)$. The usual alternatives are ordinary least squares or maximum likelihood.
5. Once the estimation method has been chosen, how can we know that 1 and 3 are correct choices? This requires running a series of tests on the properties of the estimations based on the properties of $\mu$.
Economic theory takes care of item 1, while statistics handles items 2 and 4. In this chapter, however, we emphasize items 3 and 5. Errors and non-explained effects are the potential sources of surprises that may require an abduction process to find explanations for them.
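As an illustration of the decisions listed above, the following minimal sketch simulates the linear case $Y = \beta X + \mu$, estimates $\beta$ by ordinary least squares, and recovers the residuals as the estimated counterpart of $\mu$. The sample size, the value of the slope, the error variance, and all variable names are assumptions made only for this example; they are not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Items 1-3: Y is explained by X through a linear f, with well-behaved errors.
n = 500
true_beta = 2.0                      # hypothetical "true" slope
X = rng.normal(size=n)               # exogenous variable
mu = rng.normal(scale=0.5, size=n)   # unobserved error term
Y = true_beta * X + mu

# Item 4: ordinary least squares for Y = beta * X + mu (no intercept).
beta_hat = float(X @ Y / (X @ X))

# Item 5: the residuals are the estimated counterpart of mu.
residuals = Y - beta_hat * X

# OLS residuals are orthogonal to X by construction, so this quantity is
# numerically zero and cannot, by itself, reveal a misspecification.
resid_dot_x = float(residuals @ X)

print(f"beta_hat = {beta_hat:.3f}")
print(f"residuals . X = {resid_dot_x:.2e}")
```

Because orthogonality between residuals and regressors holds mechanically, the diagnostic tests in item 5 have to look at other properties of the residuals, such as remaining nonlinear structure, heteroskedasticity, or serial correlation.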
Some Econometric and Modeling Background
While economists have used statistical arguments since the birth of the discipline, econometrics as a field emerged in the first half of the twentieth century. A landmark was the creation of the Cowles Commission in the 1930s. As mentioned by Epstein (2014), the rationale for its creation was the need to provide sound economic foundations to empirical analyses: “Even where the best statistical practices have been followed, however, it is argued that the present state of the science would still support only very modest claims for the stock of empirical results they have so far produced.” Qin (2013) points out that the members of the Cowles Commission believed that their task was to measure structural models, because this is the only class of models apt to simulate policy alternatives. Reduced-form models were only useful for forecasting, an activity regarded by the Cowles members as somewhat inferior to running policy analyses. Textbooks helped to popularize simplified versions of the Cowles Commission approach. For a couple of decades, econometrics was taught as a universal toolbox of statistical tools, useful for the estimation of the parameters of a priori formulated theories. The precision and rigor achieved by using structural models induced the adoption of the Cowles style and methods of fitting economic theories. In turn, as Leamer and Leamer (1978) pointed out, data exploratory activities as shown in Fig. 2 were seen as being “sinful.” The original Cowles approach was grounded in the foundational contribution of Haavelmo (1943), as shown by Spanos (1986, 1989) and more recently by Heckman and Pinto (2015). But while the influence of the structural equation modeling (SEM) adopted by the Cowles Commission has somewhat waned in econometrics, it has been adopted outside the field and lies at the core of the implied causality approach of Pearl (2015). The growth of machine learning, thanks largely to the increased availability of data facilitated by the Internet, posed a challenge to the traditional approach to statistical analysis. This has of course also affected econometrics. This was acknowledged by Breiman (2001), who pointed out: “There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown.
The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex datasets and as a more accurate and informative alternative to data modeling on smaller datasets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.”

Fig. 2 The role of errors in econometric abductions: given the results of the estimation process, the unexplained portion conveys intrinsically valuable information that can be used to reformulate the econometric model. (Diagram boxes: Econometric Model, Results, Error Term.)

Two decades later, Imbens and Athey (2021) observed that this is no longer the case. As they say: “Breiman’s ‘Two Cultures’ paper painted a picture of two disciplines, data modeling, and algorithmic machine learning, both engaged in the analyses of data but talking past each other. Although that may have been true at the time, there is now active interactions between these two. For example, in Economics, machine learning algorithms have become valuable and widely appreciated tools contributing to the analyses of economic data, informed by causal/structural economic models.” The last observation reveals the existence of two main approaches to the detection of patterns in economic data. One seeks to find causal relations, while the other just tries to predict values, finding explicitly or implicitly the generative model of the data. The latter is more properly described as the application of inductive inference. The first approach, instead, seeks to find causal explanations for the observations recorded in datasets. Since “causality” is a theoretical notion, finding such explanations amounts to abductive reasoning. But there are also other ways in which abductions are carried out in econometrics. One can be found in the way in which ceteris paribus assumptions are used to simplify statistical analyses, that is, in the choice of variables and relations that are assumed to remain unaffected by the phenomena under consideration. As pointed out by Andrew Gelman: “As statisticians, we spend much of our effort fitting models to data and using those models to make predictions. These steps can be performed under various methodological and philosophical frameworks.
Common to all these approaches are three concerns: (1) what information is being used in the estimation process, (2) what assumptions are being made, and (3) how estimates and predictions are interpreted, in a classical or Bayesian framework” (Gelman et al., 2020). The decisions involved in Gelman’s concerns (1) and (2) are in good measure about what to leave out of the analysis and constitute ceteris paribus assumptions. The choice of these assumptions is the result of abducting the elements that do not explain the observations at hand. An instance of abduction arises in the use of two-dimensional plots to reject or postulate possible relations $y \sim f(X)$. Dodge and Rousson (2001) point out that asymmetries in the correlation coefficients arise when data does not comply with the theoretical assumptions of the analysis. A salient case is that of the Phillips curve, a theoretical construction that usually does not appear in real-world data and does not even satisfy the ceteris paribus controls in the analysis of the relation between inflation and unemployment. Another related question amounts to finding out when statistical properties trigger abduction processes. Imbens (2021) is particularly critical of the use (and abuse) of p-values in statistical analyses as evidence of surprising relations. Imbens also asks whether it is equally surprising to find that a relation is nonsignificant when it was considered to be significant or the other way around, or when the signs of the parameters are the opposite of the expected ones.
Heckman and Singer point out that econometricians should take cues from expert systems, learning to generate and test hypotheses. Aided by automated tools, econometricians should have a stock of potential models at hand as well as statistical software able to generate estimations and run tests. This could lead to new results and point out differences with previous estimations. Given all these different issues, it becomes highly relevant to discuss what methods should be included in a robust econometric toolbox. A related question is that of finding the rules of thumb associated with those tools, which should be considered genuine abductive reasoning methodologies.
Possible Classic Econometric Tools to Perform Abduction
Since abductive reasoning is triggered by a surprising realization, classical post-estimation tools are critical to assess whether a “surprise” is genuine or just a wrong diagnosis. McElreath (2020) uses the metaphor of the geocentric model to describe the classical linear regression model in econometrics. While essentially incorrect, it is a quite useful tool. McElreath also uses another metaphor, that of the Golem, which we think is an appropriate representation of the process of making an abduction. The legendary Golem of Prague was a creature of clay, animated by the goal of protecting the just and faithful but blind to the real intentions of its creator and easy to misuse. In McElreath’s metaphor, the “Golem” is a formal model, never true, but aimed at discovering truth, blind to the intentions of the modeler and easy to misuse. The usual procedure in econometrics starts by building the simplest “Golem,” i.e., a linear regression model. More complex processes (evolved “Golems”) can arise in response to unexpected results arising in the analysis of the same class of datasets, but using different forms of statistical estimation. This seemingly unending process proceeds by evolving new and more complex ways of examining the same data. But nothing ensures achieving a “final truth,” and the models can easily mislead their users. There exist many techniques and practices in the toolbox of econometrics that are mainly used to run diagnostic tests. But in terms of abductive reasoning, those tests usually constitute the first step in the process. The information yielded by a test can be surprising because it may correspond to an unexpected parameter or provide new information about priors, indicating how to incorporate this evidence into the econometric model.
Many of the practices that a student of econometrics learns in a first course involve testing whether the Gauss-Markov assumptions are fulfilled when running ordinary least squares estimations. The failure to satisfy these assumptions can be seen as a sign of a bad specification of the estimated model. Therefore, it can be concluded that many of the properties necessary to make a right inference are not fully captured in the model. Wooldridge points out in his widely used textbook (Wooldridge, 2015): “Formal economic modeling is sometimes the starting point for empirical analysis, but it is more common to use economic theory less formally, or even to rely entirely on intuition.” That is, after specifying an economic model, a second step in an abductive process requires turning it into what is properly an econometric model. This involves moving from the world of theory and ideas to the world of feasible estimations. This step requires using some new assumptions in the process of building the econometric model. Finding out that an econometric model is badly specified constitutes a first step in a process of abduction. Without that realization, the modeler would be unable to tell whether the chosen model is the right one. In practice, the usual strategy applied to validate a model involves testing the residuals of the regression. This is because econometricians work on the basis of the idea that everything that a model does not account for is left in the residuals and, furthermore, that the residuals are not correlated with the rest of the functional form of the model. Many other important properties are assumed about these residuals. The violation of these properties implies, in general, the existence of relevant features that have not been modeled, as shown in Fig. 3. Misspecifications or measurement errors usually provide hints about their causes. As pointed out by Hu and Ridder (2012): “In most applications, researchers estimate a model based on what they observe in the data, which may contain dependent or endogenous variables and exogenous variables, and treat those unobserved in the data as shocks or error terms. If these unobservables include agents’ choices or covariates in agents’ information set, their misinterpretation as exogenous errors in the model is a major source of endogeneity.” Hu (2017) elaborates further on this point: “From a practitioner’s view, ignorance of identification directly lead to inconsistency of an estimator. In layman’s terms, we wouldn’t know what an estimator is estimating without a solid identification argument.
For example, it is well known that many estimators ignoring measurement errors in explanatory variables lead to inconsistent estimates. In addition, nonidentification implies a flat likelihood function, with which iterative algorithms may not converge.” Given how critical the role of assumptions is, several ways of testing them have been developed. Some of them are:
Fig. 3 Testing and correcting: testing the assumptions on the error term can help to find out whether some of them are violated. Consequent corrections to the model may lead to new insights. (Diagram labels: Error Term; Testing Assumptions; Violations of Assumptions; Corrections of Model; Insights)
F. Delbianco and F. Tohmé
• Functional Form and Omitted Variables
According to the widely accepted notion that data generating processes might be better approximated by polynomial or other nonlinear functional forms, Ramsey (1969) introduced what is known as the RESET test. Once a linear regression on a dependent variable $y$ has yielded a predicted value $\hat{y}$, a second regression is performed, again with $y$ as the dependent variable but now on a polynomial in the predicted value $\hat{y}$, according to the following specification (standard implementations also keep the original regressors in this second regression):

$$y = f(\hat{y}^2, \hat{y}^3, \hat{y}^4, \ldots)$$

Once the coefficients in this specification have been obtained, a joint significance test is applied to them. If the coefficients are jointly significant, some nonlinearities have not been taken into account in the model yielding $\hat{y}$. Detecting which variables and relations should be incorporated into the model calls for an abduction to be performed by the modeler. The usual approach involves looking for the explanatory variables with the largest marginal effects, positive or negative, and incorporating their squared values as new variables (Stock et al., 2012). It is important to note that this test only indicates whether the model is linear in the original variables. It does not help to capture the influence of other variables. But, as pointed out in Wooldridge (2015), a wrongly specified functional form can be corrected by adding either logarithms or squared values of the variables already taken into account.

A fundamental aspect to address in the analysis of possible omissions is the type of data. Important questions to answer are: Are variables measured over time? Do they represent a cross section of individuals? Does the sample include a group of observations measured at different points in time? The economic hypotheses that can be put forward are conditioned by the actual characteristics of the data. In other words, whether the model is centered on trends, unobserved fixed effects, or multilevel heterogeneity, the results ensue from the observations. This establishes a back-and-forth relation between theory and practice. The latter provides evidence that may help to discriminate among the different potential theories. But in turn, the chosen theoretical constructs imply different levels of empirical analysis that may lead to new observations. Among those different levels, it is worth mentioning panel data (Baltagi et al., 2008) and multilevel data specifications (Snijders & Bosker, 2011). The exploration of these different specifications may provide a roadmap to possible ways of measuring and detecting relations among the same variables. Running an F test makes it possible to select the model that better fits the dataset.
• Violated Assumptions
Heteroskedasticity, i.e., the case in which the variance of the error term differs across the dataset, implies the possible existence of omitted variables that may remain in the residual and generate the unequal variance. That is, some unexpected component may explain the heterogeneous variability of the data.
In turn, the presence of autocorrelation (correlation between variables and their lagged values) may suggest that some structures of lagged variables were left out when defining the regression. It might just be a matter of inertia in the data, but it is also possible that some unexpected lagged structures have been missed in the analysis. This is particularly relevant for the analysis of causality, as we will discuss later in this chapter. Violations of other Gauss-Markov assumptions, which characterize ordinary least squares estimators, can indicate that other adjustments may have to be applied.
• Information Criteria
Information criteria are estimators of the prediction errors associated with econometric models. There exist different ways of defining such errors and thus of selecting the models that better predict the values of the explained variables. Among the best known is the Akaike information criterion (AIC), presented in Akaike (1974), which estimates the relative loss of information of a model that intends to capture the generating process of the data. The Bayesian information criterion (BIC), introduced in Schwarz (1978), is increasing in the error variance and the number of explanatory variables. Another information-based measure is the Kullback-Leibler divergence or relative entropy, which measures the discrepancy between the distribution generated by a model and the actual distribution underlying the data (see Kullback, 1987). In this case, models that yield small differences are preferred. An unexpected result here consists in getting a better value of an information measure with a model that was not expected to be very accurate. This can trigger an abductive process aimed at finding out what makes that model better than the preferred one.
• Goodness of Fit
A measure of goodness of fit of a model captures the discrepancy between observed values and the values expected according to the model. The most usual measure is the coefficient of determination, denoted $R^2$, which is the proportion of the variation of the dependent variable that can be predicted from the explanatory variables. Better measures of goodness of fit are the mean squared error (MSE), i.e., the average of the squared errors of the estimated values, or the root-mean-square error (RMSE), the square root of MSE. As in the case of information criteria, models that yield better goodness of fit values than the preferred one present an opportunity to abduce the hypothesis of a stronger association between an unexpected variable and the explained variables.
• Extreme Bounds Analysis
Leamer and Leonard (1983) presented a technique, extreme bounds analysis (EBA), which yields upper and lower bounds for the parameters of interest, given any possible set of explanatory variables. This approach makes it possible to assess the robustness of the results to the controls added to a model. A large interval defined by the bounds indicates a high sensitivity to the explanatory variables, in contrast with parameters confined to narrow intervals. High sensitivity may mean that the corresponding controls generate endogeneity or multicollinearity,
again violating the assumptions of the econometric analysis. An abduction involves focusing on the explanatory variables with robust parameters, obtaining a more precise and accurate model.
• Autometrics
Hendry et al. (1995, 2000) develop an alternative approach to variable selection in linear regression, autometrics (automatic general-to-specific selection). The ensuing method is closer to machine learning than to traditional econometrics since, instead of postulating a model, it seeks to obtain it by discarding variables. Like EBA, in which different combinations of variables are tested (although the foundations of both methods are radically different), autometrics automatically examines different specifications. The resulting model is not unexpected, since the method does not start from any assumed specification, but it allows the analyst to postulate a hypothesis about the origin of the data.
• Dummy Saturation
One aspect of the general-to-specific procedure involves “dummy saturation,” i.e., adding dummy (0/1) variables to the specification. There exists a vast number of dummy variables that a researcher can add to a model, as many as the features or events that may or may not be the case. These are totally fictitious variables, in the sense that they are not observable, being approximations to hypothetical aspects of the phenomenon under examination. If some of these dummy variables remain in the final specification, the analyst can abduce that the corresponding aspects are relevant in the determination of the value of the explained variables.

All these procedures involve, in certain ways, an exploration of the space of possible specifications. Autometrics and dummy saturation do that automatically, making them closer to machine learning methods. This raises the question of whether crucial aspects of abduction can be implemented computationally.
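As an illustration of two of the checks listed above, the following sketch computes a RESET-type F statistic and Gaussian information criteria with NumPy on simulated data. It is a minimal sketch under stated assumptions, not the procedure of any particular package: the function names, the simulated data, and the choice of powers 2 and 3 are hypothetical; the auxiliary regression keeps the original regressors, as standard formulations of the test do; and AIC and BIC are written for a Gaussian linear model up to an additive constant.

```python
import numpy as np

def ols_rss(X, y):
    """Fit OLS by least squares; return fitted values and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, float(np.sum((y - fitted) ** 2))

def reset_f_statistic(X, y, max_power=3):
    """F statistic for the joint significance of yhat**2, ..., yhat**max_power.

    X must already contain a constant column. The statistic is compared with
    critical values of an F(q, n - k - q) distribution.
    """
    n, k = X.shape
    fitted, rss_restricted = ols_rss(X, y)
    powers = np.column_stack([fitted ** p for p in range(2, max_power + 1)])
    _, rss_unrestricted = ols_rss(np.hstack([X, powers]), y)
    q = powers.shape[1]
    df_resid = n - k - q
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_resid)

def gaussian_aic_bic(rss, n, k):
    """AIC and BIC of a Gaussian linear model, up to an additive constant."""
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return float(aic), float(bic)

# Simulated data (hypothetical): one regressor, with and without a squared term.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-2.0, 2.0, size=n)
noise = rng.normal(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
y_linear = 1.0 + 2.0 * x + noise
y_quadratic = 1.0 + 2.0 * x + 1.5 * x ** 2 + noise

f_linear = reset_f_statistic(X, y_linear)
f_quadratic = reset_f_statistic(X, y_quadratic)

# Compare a linear and a quadratic specification on the nonlinear data.
X_quad = np.column_stack([X, x ** 2])
_, rss_lin = ols_rss(X, y_quadratic)
_, rss_quad = ols_rss(X_quad, y_quadratic)
aic_lin, bic_lin = gaussian_aic_bic(rss_lin, n, X.shape[1])
aic_quad, bic_quad = gaussian_aic_bic(rss_quad, n, X_quad.shape[1])

print(f"RESET F, linear data:    {f_linear:.2f}")
print(f"RESET F, quadratic data: {f_quadratic:.2f}")
print(f"BIC linear spec: {bic_lin:.1f}, BIC quadratic spec: {bic_quad:.1f}")
```

A large F statistic on the quadratic data signals an omitted nonlinearity, and the lower information criteria of the quadratic specification point in the same direction. Neither result says which variable or relation is missing; that remains the abductive step left to the modeler.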
Abduction and Machine Learning

Pure automatic filtering of variables and similar machine learning procedures are methods that provide insights about the specifics of abductive reasoning in econometrics. These procedures belong to what Breiman (2001) deems the other “statistical culture.” Unlike the mainstream conception of statistics, this alternative culture may help to generate theoretical constructions, instead of just testing them (as illustrated in Fig. 4). The access to larger databases has prompted the exploration of new approaches to data analysis. The impact of Big Data and the availability of ever more powerful computers facilitate experimentation with new methodologies, even if they may lack statistical foundations. Of course, not all of them can contribute to performing abductions. Here we will briefly discuss those that can in fact help to automate abductive processes:
Fig. 4 Formal models vs. empirical explorations: formal models can yield other results than a more exploratory approach. These possible differences should give the researcher insights about what is missing in the theoretical model. (Diagram labels: Data; Theoretical Modelling; Data Mining; Results; “Black-Box” Results; Contradiction?)
• LASSO/Ridge Regression
While linear regression is still the cornerstone of econometric modeling, the additional objective of ensuring the simplicity and interpretability of the model can be achieved by penalizing excesses in the number and size of the parameters of the regression. Tibshirani (1996) gave a precise formulation of this idea, developing the Least Absolute Shrinkage and Selection Operator (LASSO). The penalization of parameters forces a selection of the relevant variables and thus yields a hypothesis about the determination of the explained variables. This regularization procedure is intended to mitigate the consequences of ill-posed problems, like overfitting. Both LASSO and ridge regression minimize the sum of squared residuals plus a penalty on the coefficients: LASSO penalizes the sum of their absolute values, while ridge regression penalizes the sum of their squared values, which yields smoother solutions. Accordingly, ridge regression shrinks the coefficients, whereas LASSO can set some of them exactly to zero and thus discard the corresponding variables. The variables that have nonzero coefficients are assumed to be those that are most relevant and thus become part of the abduced hypothesis.
• Feature Selection
Li et al. (2017) present a survey of methodologies developed in computer science for selecting features in data. Some of them apply to numerical data and thus can be used to select subsets of economic variables. These subsets embody properties that are relevant in economic contexts. Chandrashekar and Sahin (2014), in another survey on feature selection, point out that removing irrelevant variables should not be compared with other dimension reduction methods such as principal component analysis (PCA), since good features can be independent of the rest of the data. In this sense, selecting a feature may be understood as summarizing several variables into a new one on an equal footing with
the other variables. This explicit definition of new variables constitutes a crucial part of an abduction that leads to discovering a simple but accurate approximation to the generating distribution of the observable data.
• Filter Methods
Filter methods are used to select variables in conjunction with different machine learning methods. They work by ranking the variables and selecting those that are high in the ranking. This can be done in one pass or iteratively: selecting a group of variables with high scores, defining a new ranking based on that selection, picking out those at the top of this ranking, defining a new score, and so on. Bommert et al. (2020) present a comparison of filter methods, finding that a simple method, permutation, is very effective. This method ranks variables by how much they improve the classification relative to the one obtained after reshuffling their values. A suitable ranking criterion can be used in combination with a threshold. The variables that are ranked below that threshold are removed. As in the previous case, the variables that survive the filtering process become substantial components of the abduced model.
• Wrapper Methods
Wrapper methods are useful tools for data mining. They use a predictor as a black box into which groups of variables are fed. The performance of the predictor is the objective function to be maximized. This optimization problem is NP-hard. Thus, in actual applications only suboptimal subsets of variables can be found, by employing heuristic search algorithms. El Aboudi and Benhlima (2016) survey different wrapper methods. Unlike the case of filter methods, there is no clear winner in the general comparison among them. But in the case of numerical variables, the learning procedures of wrapper methods may find reduced sets of variables that again will constitute the backbone of an abduced model.
• Embedded Methods
As shown by Lal et al. (2006), in contrast to filter and wrapper approaches, in embedded methods the learning part and the feature selection part cannot be separated; the structure of the class of functions under consideration plays a crucial role. The main approach is to incorporate the feature selection as part of the training process. Different approaches can be used in embedded methods, for instance, optimizing support vector machines that assign weights to features. The same can be done using neural networks. For example, multilayer perceptron networks are trained to generate feature weights, which are calculated using a measure of the salience of the outputs of the trained network. A downside of these approaches is that they behave as black boxes and thus do not help in understanding the relations among the variables. Nevertheless, the resulting sets of variables can be used by the modeler to state hypotheses about the process generating the data.
• Unsupervised Learning
The methods of unsupervised learning seek to find hidden structures in unlabeled data. Clustering techniques are a primary example of unsupervised learning, looking for natural groupings in a set of objects without knowing whether they belong to one or another group. Bommert et al. (2020) present a benchmark for the selection of groupings of high-dimensional data. This procedure always risks
the possibility of overfitting, i.e., yielding more parameters than needed to explain the endogenous variables. The clusters found in this analysis can be summarized by new variables that become, again, part of an abduced model.

All these methods are closely related, and all of them can contribute to abductive reasoning. But nothing prevents the resulting model from being overfitted or from including controls that might not actually be “good,” as discussed by Cinelli et al. (2020). One way to address these potential problems is to use methods that answer specific queries instead of seeking a “one-size-fits-all” model.
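The contrast between the two penalties used in LASSO and ridge regression can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the simulated data, the penalty value, the function names, and the scaling of the objective functions (given in the docstrings) are hypothetical choices, and LASSO is solved here by plain cyclic coordinate descent, one of several possible algorithms.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly zero inside the band."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + penalty * sum|b_j| by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual that ignores the current contribution of variable j.
            partial_resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_resid / n
            beta[j] = soft_threshold(rho, penalty) / col_scale[j]
    return beta

def ridge_closed_form(X, y, penalty):
    """Minimise (1/2n)||y - Xb||^2 + (penalty/2) * sum b_j^2 in closed form."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + penalty * np.eye(p), X.T @ y / n)

# Simulated data (hypothetical): only the first three of eight variables matter.
rng = np.random.default_rng(42)
n, p = 300, 8
X = rng.normal(size=(n, p))
true_beta = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_beta + rng.normal(size=n)

# Standardise the regressors and centre the response before penalising.
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = y - y.mean()

lasso_beta = lasso_coordinate_descent(X, y, penalty=0.3)
ridge_beta = ridge_closed_form(X, y, penalty=0.3)
selected = np.flatnonzero(lasso_beta)

print("LASSO coefficients:", np.round(lasso_beta, 2))
print("Ridge coefficients:", np.round(ridge_beta, 2))
print("Variables kept by LASSO:", selected.tolist())
```

In this setting LASSO sets the coefficients of the irrelevant variables exactly to zero, while ridge regression only shrinks them. The surviving variables are the candidates that enter the abduced hypothesis; whether they are good controls is a separate question, as the text notes.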
Meta-analysis and Zooming In

As we have discussed above, abductions arise in econometrics through the examination of the results of statistical estimations. Two methodologies refine some aspects of the assessment of statistical results and facilitate finding results that contradict the expected ones.
• Meta-analysis
Meta-analysis is a statistical methodology for combining the outcomes of different studies. In the case of econometrics, its main use, as discussed by Boyle et al. (2015), is the detection of robust insights that arise in separate analyses. It also makes it possible to compare different results and seek the causes of their differences, using a toolbox of tests. If the models that are to be combined are critically different, namely, differing in the dataset and control variables, it might not be possible to find a unifying model summarizing the results of the individual ones. A possibility is to weigh the different results by ranking the underlying models by criteria like their biases or their scores according to information criteria. If the critical differences are taken into account but substantial heterogeneity still persists, the tools of meta-regression (Van Houwelingen et al., 2002) can be applied to distinguish the features that lead to unexpected results, as suggested in Fig. 5. Alternatively, if unexpected robust insights are detected, the method makes it possible to find out the characteristics that lead to them. In either case these are informative triggers for an abduction.
• Transitional Inference
Delbianco et al. (2021) introduce the notion of transitional inference, elaborating on the novel approach presented by Liu and Meng (2016). This is a methodology according to which data is examined at different levels of resolution. In this sense, once some interesting features have been detected, the method makes it possible to “zoom in” on the relevant data.
Robustness is ensured by the constancy of the results across different levels of resolution. On the other hand, results that are relevant at some level of resolution may not remain so at other levels. The level of resolution is selected in order to carry out “individualized” inference, targeted on a particular entry in the database. In order to make the statistical inference more relevant to this particular observation (at the expense of robustness), the method selects a subsample of entries of the full population according to their similarity to the targeted entry. Any new individualized inference implies a different choice of similar entries, yielding a new level of resolution. Unexpected results may arise at any level (see Fig. 6). The researcher must assess whether they are relevant to the question at hand. If so, their robustness must be examined. Even if the results are not robust, they might trigger an abduction, although one focused on the context associated with the resolution level at which they arise.

Fig. 5 Meta-analysis: the aggregate result of different studies of the same topic may provide additional information not contained in the individual studies. This may happen, in particular, if some studies contradict the results obtained in others. (Diagram labels: Study A, Study B, ..., Study N; Result A, Result B, ..., Result N; Σ Evidence; Model; Contradiction?)

Fig. 6 Transitional inference: “zooming” into different components of the data or into particular sets of observations in the same model may yield different estimation results. The study of these differences may lead to new insights about the phenomenon under analysis. (Diagram labels: Zoom A, Zoom B, ..., Zoom N; Result A, Result B, ..., Result N; Σ Evidence; Model; Contradiction?)
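The idea of “zooming in” on the entries most similar to a targeted one can be sketched with simulated data. This is a toy illustration, not the procedure of Delbianco et al. (2021) or Liu and Meng (2016): similarity is measured by the absolute distance in a single covariate z, the resolution level is the fraction of entries retained, and all names and parameter values are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])

def zoomed_slopes(x, y, z, z_target, fractions):
    """Re-estimate the slope on the entries most similar to the target.

    Similarity is |z - z_target| (an illustrative choice); each fraction of
    retained entries plays the role of a level of resolution.
    """
    order = np.argsort(np.abs(z - z_target))
    slopes = {}
    for fraction in fractions:
        keep = order[: int(fraction * len(z))]
        slopes[fraction] = ols_slope(x[keep], y[keep])
    return slopes

# Simulated population (hypothetical): the effect of x on y depends on z.
rng = np.random.default_rng(7)
n = 2000
z = rng.uniform(-2.0, 2.0, size=n)
x = rng.normal(size=n)
true_slope = np.where(z > 0, 3.0, 1.0)
y = true_slope * x + rng.normal(scale=0.5, size=n)

slopes = zoomed_slopes(x, y, z, z_target=1.5, fractions=(1.0, 0.5, 0.25, 0.1))
for fraction, slope in slopes.items():
    print(f"fraction of entries kept = {fraction:4.2f}  estimated slope = {slope:.2f}")
```

The full-sample estimate averages over two regimes, while the estimates at finer resolutions settle near the slope that applies to entries like the targeted one. A gap of this kind between levels of resolution is the sort of unexpected result that can trigger an abduction.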
Abduction and Causality

In their discussion of Breiman’s distinction between the two cultures of statistics, Imbens and Athey (2021) point out that, since its inception, econometrics has chosen a more causal-oriented approach. More specifically, usual work in the field seeks to
model causal relations resulting either from the implementation of specific policies or triggered by some event. In particular, it is relevant to determine how these relations may affect a subgroup of the population. While postulating causal relations is the point of departure for those econometric analyses, a relevant step consists in selecting the right data to assess them. Different techniques for finding data that make it possible to estimate causal models are discussed by Cunningham (2021) and Huntington-Klein (2021). Among those methodologies, the design of experiments or, alternatively, the detection of situations that constitute natural experiments play highly relevant roles. In both cases the goal is to find a treatment and a control group, such that the outcomes under the corresponding data can be compared. The former corresponds to the individuals or entities subject to causal influences, while the latter consists of those that remain unaffected. If the results obtained with one group differ from those obtained with the other, it can be stated that the relation among variables has a causal nature.

Athey and Imbens (2017) present a review of the state of the art on the analysis of causality in econometrics. They point out that it is widely assumed in econometrics that if a simple model, like linear regression, provides sound estimations, the estimated relation can be considered to be causal. That is, the causal relation that is already assumed in the model gets validated by the data.

A different possibility arises in purely empirical studies, without any previous assumptions. Starting with a group of time series, each one corresponding to a different variable, there are tests that can be applied to determine whether causal relations exist among them. One of those techniques, introduced in Granger (1969), is known as the Granger causality test, which measures whether prior values of the series for a variable x are able to predict future values of y.
If that is the case, x is said to “Granger-cause” y. Another statistical technique to evaluate the possible existence of a causal relation between variables is known as transfer entropy. Introduced in Schreiber (2000), the method consists in measuring the time-asymmetric information transfer from the stochastic process corresponding to a variable x to that of a variable y. That is, it measures how the uncertainty of y is reduced by knowing the past values of x.

Machine learning methods for the detection of causality in data include the linear non-Gaussian acyclic model (LiNGAM) presented by Shimizu et al. (2006) and PCMCI, introduced by Runge et al. (2019). The former assumes that all the relevant variables are included in the dataset, that the relations among them are linear, and that the error terms are not normally distributed. PCMCI involves two stages. In the first (PC$_1$), irrelevant relations are removed for each variable by testing iteratively for statistical independence conditions. The second stage (MCI, for momentary conditional independence) tests for the statistical independence of variables under relations that have not been discarded in the first step, but conditioned on the previous values of those variables (the “causes” are treated as lagged variables).

PCMCI can be presented more clearly by depicting the relations among variables as directed acyclic graphs (DAGs). The nodes correspond to variables and the directed edges to the causal relations. As discussed by Pearl (2018), this methodology, unlike statistical methods, does not seek parameters but the causal ordering of the variables. According to Pearl (1998), a DAG provides a simpler representation of a structural equation model. In a DAG, three variables can be linked in three different ways: forming a chain (x → y → z), a fork (x ← y → z), or a collider (x → y ← z). These structures make it possible to identify conditional independence. So, for instance, in the case of a fork, x and z are independent, conditioned on y. Hence, if the structural equation model includes a parameter α such that, say, z = αx, this relation can be dismissed as merely representing a correlation, not a causation. This approach is thus particularly useful for ruling out, before running any estimation, the addition of bad controls to a model, which could lead to biased estimates.

Finally, we mention the approach developed in Vinod (2019). This novel method uses the residuals of kernel regressions, avoiding the dependency on linearity assumptions. With these residuals, the causality test is performed by studying the stochastic dominance of one variable over the other.

The two main approaches to the analysis of causality, namely, assuming it and testing for it, or just seeking to find it in the data, lead to different kinds of abduction (see Fig. 7). In the case that causality is already assumed in the model, an abduction is triggered by the failure of the model. In the case that causality is found in the data, the abduction is immediate. The main difference is that in the former case the analyst already had a theoretical conception in mind, while in the latter it is acquired after examining the data.

Fig. 7 Causality: empirical analyses make it possible to evaluate the impact of an intervention intended to test a causal model. The insights gained by this evaluation allow improving the model. (Diagram labels: Model; Causal Modelling; Impact Evaluation; Insights)
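The Granger causality test mentioned in this section can be sketched as a comparison between two nested regressions. The following NumPy code is a minimal sketch under stated assumptions: the function names, the single lag, and the simulated series (in which x drives y but not the reverse) are hypothetical, and only the F statistic is computed, to be compared with critical values of the corresponding F distribution.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f_statistic(cause, effect, lags=1):
    """F statistic for H0: lagged values of cause do not help to predict effect.

    Restricted model: effect on a constant and its own lags.
    Unrestricted model: the same regressors plus the lags of cause.
    """
    n = len(effect)
    target = effect[lags:]
    own_lags = np.column_stack([effect[lags - k : n - k] for k in range(1, lags + 1)])
    cause_lags = np.column_stack([cause[lags - k : n - k] for k in range(1, lags + 1)])
    const = np.ones((n - lags, 1))
    restricted = np.hstack([const, own_lags])
    unrestricted = np.hstack([const, own_lags, cause_lags])
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    df_resid = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / df_resid)

# Simulated series (hypothetical): past x affects y, past y does not affect x.
rng = np.random.default_rng(1)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

f_x_to_y = granger_f_statistic(x, y)
f_y_to_x = granger_f_statistic(y, x)
print(f"F statistic, x -> y: {f_x_to_y:.2f}")
print(f"F statistic, y -> x: {f_y_to_x:.2f}")
```

A large statistic in one direction and a small one in the other is the asymmetric pattern that the test looks for. As the discussion of forks above suggests, such predictive precedence may still reflect a common cause that is missing from the dataset.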
Conclusions

This chapter is not intended as an exhaustive compilation of methods. We try, instead, to illustrate the different ways and cases in which an abduction process in econometrics may occur. A researcher may encounter them while modeling any economic problem. We focused in particular on the abduction that arises in the analysis of data and in the estimation and inference processes.
Fig. 8 Contradiction as a trigger: different techniques may give different results. The possible contradiction ensuing from the comparison between these empirical estimations may yield new insights. (Diagram labels: Data; Empirical Model; Technique A, Technique B; Result A, Result B; Evidence A, Evidence B; Contradiction?)
In the empirical exercise of estimating an economic model, situations like the one depicted in Fig. 8 frequently arise: two different techniques, A and B, give corresponding results A and B. They may differ slightly, but what happens if these differences are much larger than expected or than those considered to be “normal”? In this chapter we have shown different particular ways in which the inspection of the differences between evidence A and evidence B indicates the presence of a contradiction. The researcher should then perform an abduction process, investigating what this surprising result may indicate about the existing economic theory.

This abduction process can be very explicit in a context of empirical modeling. It is also very precise, because the differences between the techniques can be objectively listed. They may arise in the assumptions, in the optimization process, or in the computational aspects of estimation. Even the use of different econometric or statistical software can yield slightly different results due to the underlying default processes. This, of course, has subsequent implications for the estimated economic relations.

Another way in which the abduction process can be illustrated in econometrics is in the presence of “residual” components, which result from the difference between the observed values and the estimated ones. In this sense, new insights can arise both in the explained component of the empirical model and in the unexplained part. Patterns that emerge in the residuals can be incorporated not only into the empirical model but also into the original economic model.
Assuming that the empirical analysis is an evolving process, and adopting the notion of the Golem presented in McElreath (2020), the inspection of the sum of evidence obtained by using different approaches can only give us more information, not less. Whichever of the two “cultures” of Breiman (2001) the researcher adheres to, the questions raised by the particularities of the techniques have an impact on the scientific conclusions of the study. In light of this, it is necessary to make explicit the implications of the chosen techniques for the scientific results of econometric studies. Abduction is the result of these considerations.
References Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. Athey, S., & Imbens, G. W. (2017). The state of applied econometrics: Causality and policy evaluation. Journal of Economic Perspectives, 31(2), 3–32. Baltagi, B. H., et al. (2008). Econometric Analysis of Panel Data (Vol. 4, 6th ed.). Springer. Bellucci, F., & Pietarinen, A.-V. (2020). Peirce on the justification of abduction. Studies in History and Philosophy of Science Part A, 84, 12–19. Bommert, A., Sun, X., Bischl, B., Rahnenführer, J., & Lang, M. (2020). Benchmark for filter methods for feature selection in high-dimensional classification data. Computational Statistics and Data Analysis, 143, 106839. Boyle, K. J., Kaul, S., & Parmeter, C. F. (2015). Meta-analysis: Econometric advances and new perspectives toward data synthesis and robustness. In Benefit Transfer of Environmental and Resource Values (pp. 383–418). Springer. Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231. Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers and Electrical Engineering, 40(1), 16–28. Cinelli, C., Forney, A., & Pearl, J. (2020). A crash course in good and bad controls. Available at SSRN, 3689437. Cunningham, S. (2021). Causal inference. The Mixtape, 1, New Haven: Yale University Press. Delbianco, F., Fioriti, A., & Tohmé, F. (2021). A methodology to answer to individual queries: Finding relevant and robust controls. Behaviormetrika, 48(2), 1–24. Dodge, Y., & Rousson, V. (2001). On asymmetric properties of the correlation coeffcient in the regression setting. The American Statistician, 55(1), 51–54. Durlauf, S. N. (2020). Institutions, development, and growth: Where does evidence stand? In The Handbook of Economic Development and Institutions (pp. 189–217). Princeton University Press. El Aboudi, N., & Benhlima, L. 
(2016). Review on wrapper feature selection approaches. In 2016 International Conference on Engineering & MIS (ICEMIS) (pp. 1–5). IEEE. Epstein, R. J. (2014). A History of Econometrics. Elsevier. Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press. Goyal, A. (2017). Abductive reasoning in macroeconomics. Economic and Political Weekly, 5233, 77–84. Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, 37, 424–438. Haavelmo, T. (1943). The statistical implications of a system of simultaneous equations. Econometrica, Journal of the Econometric Society, 11, 1–12. Heckman, J., & Pinto, R. (2015). Causal analysis after haavelmo. Econometric Theory, 31(1), 115–151.
46 Abduction in Econometrics
1011
C. S. Peirce’s Conception of Abduction and Economics
47
James R. Wible
Contents
Introduction
Peirce’s Early Writings on Hypothesis and Retroduction
Peirce’s Later Writings on Abduction and Scientific Discovery
Abduction and Economic Methodology
Abduction and Economics
Conclusions
References
Abstract
One of the most important contributions stemming from the thought and works of C. S. Peirce is his conception of abduction. Peirce is the American philosopher, scientist, mathematician, and occasional economist who most often viewed himself as a logician, and he is a co-founder of the American philosophy known as pragmatism. Peirce claims authorship of the conception of abduction, which was inspired by passages in Aristotle’s writings. An abduction is a provisional explanation which might account for a surprising or unexpected event. Abduction is one of three major reasoning processes according to Peirce. The others are deduction and induction. Additionally, one can identify an abduction-deduction-induction (ADI) sequence in scientific discovery. There is also a connection between abduction and economics. The link between economics and reasoning processes like abduction did not stem from Peirce’s contemporaneous interests in economics as such. Instead, Peirce imagined an “economy of research,” the idea that processes of inquiry are also subject to economic considerations. In his writings, abduction is often placed next to or intertwined with an economic understanding of inquiry and scientific research. Certainly, one gets the impression that an ADI discovery process subject to available economic resources is what makes new knowledge possible.

J. R. Wible, Department of Economics, Paul College of Business and Economics, University of New Hampshire, Durham, NH, USA
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_54
Keywords
Abduction · Peirce · Economy of research · Inference · Logic · Scientific discovery
Reasoning is of three elementary kinds; but mixed reasonings are more common. These three kinds are induction, deduction, and presumption (for which the present writer proposes the name abduction). C. S. Peirce (1901a, CP 2, p. 495).
Introduction

One of the most important contributions stemming from the thought and works of C. S. Peirce is his conception of abduction. Peirce, who lived from 1839 to 1914, is the American philosopher, scientist, mathematician, and occasional economist who most often viewed himself as a logician, and he is a co-founder of the American philosophy known as pragmatism. Peirce claims authorship of the conception of abduction, which was inspired by passages in Aristotle’s writings. Abduction has many aspects, but the core idea is that of a contingent guess which might explain some newly recognized but unexpected phenomenon. Often this unanticipated phenomenon is experienced initially as a total surprise which is inconsistent with what is already known. An abduction is a provisional explanation which might account for that surprising or unexpected event.

Abduction is one of three major reasoning processes according to Peirce. The others are deduction and induction. Deduction is the use of logical or mathematical reasoning to draw inferences and elaborations that are already implicit in a logically conceived statement. Induction is the drawing of a general inference from finite data and observations, where the general inference must be taken as provisional since no finite collection of observations could prove that an inductive generalization is true or valid. Abduction and induction often work in tandem in what Peirce called ampliative inference. Ampliative inferences provide informative new ideas about phenomena stemming from the world in which humans live, move, and have their being. But such inferences have to be tested.

The three processes of reasoning can also be fashioned to outline a process of scientific discovery. As mentioned, a new phenomenon is often encountered as an unexpected, “abductive” surprise. Then follow logical deductions from those abductions or provisional statements about those new phenomena. These in turn engender an inductive process of inquiry, observation, and/or experimentation. One can identify an ADI, or abduction-deduction-induction, sequence in scientific discovery that gets repeated in many different, perhaps even minute, ways as learning and inquiry unfold in novel and specific circumstances. Abduction is thus a core aspect of Peirce’s theory of scientific discovery.

There is also a connection between abduction and economics. Peirce was very interested in the development of the discipline of economics and especially mathematical economics in the late nineteenth century. However, the connecting link between economics and reasoning processes like abduction did not stem from Peirce’s contemporaneous interests in economics as such. Instead, Peirce (1879) imagined an “economy of research,” the idea that processes of inquiry are also subject to economic considerations. Peirce’s initial formulation of the economy of research did incorporate the mathematics of marginal utility theory first developed by Jevons (1871). In important passages in Peirce’s writings, abduction is often placed next to or intertwined with an economic understanding of inquiry and/or scientific research. Certainly, as one encounters many of the major passages on abduction and the logic of inquiry and scientific research, one gets the impression that an ADI discovery process subject to available economic resources is what makes new knowledge possible.
Peirce’s Early Writings on Hypothesis and Retroduction

Abduction was one of several names that Peirce used for provisional reasoning about unexpected phenomena. Other terms fulfilling the same role in his writings were hypothesis and retroduction. In his time hypothesis was recognized as a form of inference, and Peirce repeatedly compared it with deduction and induction, especially in the first two decades of his career. Retroduction as a term connotes an inference made after something has been observed. Abduction is the term that was used in a constellation of major writings which came in the last two decades of his life. Earlier, in the 1860s and 1870s, Peirce made great efforts to compare hypothesis with deduction and induction. (Good starting points for reading what philosophers have to say about Peirce and abduction are Burks 1946; Frankfurt 1958; Douven 2017a, b; and Walton 2004.)

Peirce’s great interest in the logic of hypothetical inference can be found early in his career. In the 1860s, from his mid-20s to age 30, Peirce delivered three sets of lectures which show great interest in the logic of inference and reasoning: two at Harvard, before the graduate school was founded, and the Lowell Lectures of 1866. Similarly, there are a few published articles which indicate his early and precocious interest in logic and hypothetical reasoning. In his first lecture series, “On the Logic of Science,” delivered when he was 25 years old, Peirce recognized a third form of inference which is distinct from induction and deduction – hypothesis:
There is a large class of reasonings which are neither deductive nor inductive, I mean the inference of a cause from its effect or reasoning to a physical hypothesis. I call this reasoning à posteriori. If I reason that certain conduct is wise because it has a character which belongs only to wise things, I reason a priori. . . . But if I think it is wise because a wise man does it, I then make the pure hypothesis that he does it because he is wise, and I reason à posteriori. (Peirce, 1865, WP 1, p. 180)
A year later Peirce (1866, WP 1, pp. 357–514) delivered the first of his three Lowell Lectures series. Again, the themes concerned the logic of science, with a focus on induction and hypothesis. The first lecture begins with several pages of narrative themes and then launches into an extension of the logic of science with simple syllogisms, which are quickly followed by variations and patterns of syllogisms represented in tables and diagrammatic illustrations. By the fifth lecture, induction and hypothesis are being contrasted. “Induction is the process by which we find the general character of classes and establish natural classifications,” and “Hypothesis alone affords us any knowledge of causes and forces and enables us to see the why of things” (Peirce, 1866, WP 1, p. 428). A second set of Harvard Lectures, on British logicians (Peirce, 1869), would follow just after Peirce turned 30 that September, with many of the same themes about reasoning and inference as the two previous sets of lectures.

In these three lecture series, besides hypothesis, there are also extended discussions of matters which would recur in Peirce’s writings through the last decades of his life. He repeatedly contrasts his conception of hypothetic inference with that of J. S. Mill. Additionally, he often relates logic to his theory of signs and symbols. A third line of discussion is the logic of probability and probable inferences. A fourth is reasoning with large numbers and infinity, which provides the opportunity to relate logic to sampling for experimental purposes and to the question of continuity for mathematics and probability theory. The scientist will always draw or encounter a discrete or finite sample of observations or data. However, the process generating the observations may be finite or infinite in principle. Also, probability theory, conceived mathematically with calculus, involves both a continuous number line and an infinite number of possible probabilities between zero and one.
Peirce’s interest in hypothesis and other reasoning processes reached a more enduring form in the last article of his most famous set of essays, the “Illustrations of the Logic of Science,” published in 1877 and 1878 in the Popular Science Monthly. The series is available now in the Writings of C. S. Peirce (1984–2010) and The Essential Peirce (1992, 1998). Over the past century, most introductions to Peirce and his conception of pragmatism have begun with these essays and especially the first two. It is worthwhile to begin with the titles of the individual articles:

“The Fixation of Belief” (1877b)
“How to Make Our Ideas Clear” (1878a)
“The Doctrine of Chances” (1878b)
“The Probability of Induction” (1878c)
“The Order of Nature” (1878d)
“Deduction, Induction, and Hypothesis” (1878e)
“The Fixation of Belief” is best known for its characterization of the methods by which prominent beliefs are established in society. One is the method of tenacity, a second that of authority, a third that of a priori ideas, and the fourth the scientific method. The scientific method is the one which will achieve belief as a credible inference sooner and more efficiently than the other methods. “How to Make Our Ideas Clear” is where the often-quoted pragmatic maxim is found:

Consider what conceivable effects, which might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object. (Peirce, 1878a, WP 3, p. 266)
While the first two articles deserve much more interpretation and commentary, they have garnered by far the most attention from philosophers over the decades since they were written and have received much more consideration than the other four articles. So our focus turns toward the later articles. The third and fourth articles, “The Doctrine of Chances” and “The Probability of Induction,” are about logic and the logical principles of probability and sampling. They continue important strands of Peirce’s writings, some of which are now recognized as among the best on those subjects in nineteenth-century America (Stigler, 1980). The fifth article, “The Order of Nature,” essentially provides logical and diagrammatic illustrations of patterns of order which were then thought to be apparent in the subject matter of sciences like astronomy. It is very much the perspective of someone with considerable abilities for abstract mathematical representation and experience with large samples of observations. The last article, “Deduction, Induction, and Hypothesis,” provides what may be Peirce’s first accessible publication during his lifetime regarding the three processes of reasoning.

In that article, Peirce (1878e) returned to a notion of classification, which he holds is a more primitive way to do scientific investigation than sampling and quantitative methods. However, here he is classifying the types of inferences that might be made in science rather than the content of science. He begins with a simple deductive syllogism which he calls “Barbara,” following the nomenclature of medieval logicians, rather than “modus ponens.” He maintains that every type of inference can be reduced to a deductive syllogism or some variation of a deductive syllogism.
After discussing several examples including draws from urns and instances of mortality, he illustrates the differences among the three major types of inferences essentially rotating the places of the premises and the conclusion of the syllogism:

But this is not the only way of inverting a deductive syllogism so as to produce a synthetic inference. Suppose I enter a room and there find a number of bags, containing different kinds of beans. On the table there is a handful of white beans; and, after some searching, I find one of the bags contains white beans only. I at once infer as a probability, or as a fair guess, that this handful was taken out of that bag. This sort of inference is called making an hypothesis. It is the inference of a case from a rule and result. We have then –

Deduction
Rule. – All the beans from this bag are white.
Case. – These beans are from this bag.
⇒ Result. – These beans are white.

Induction
Case. – These beans are from this bag.
Result. – These beans are white.
⇒ Rule. – All the beans from this bag are white.

Hypothesis
Rule. – All the beans from this bag are white.
Result. – These beans are white.
⇒ Case. – These beans are from this bag.

(Peirce, 1878e, WP 3, pp. 325–26)
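The rotation of premises and conclusion in the bean example can be made concrete in a short sketch. The Python below is only an illustration and is not found in Peirce: the names `Syllogism`, `MODES`, and `arrange` are hypothetical, and the code merely rearranges the three propositions. It does not judge whether any of the inferences is warranted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllogism:
    """The three propositions of Peirce's bean example."""
    rule: str
    case: str
    result: str


BEANS = Syllogism(
    rule="All the beans from this bag are white.",
    case="These beans are from this bag.",
    result="These beans are white.",
)

# Each mode of inference names which two propositions serve as premises
# and which one is drawn as the conclusion (illustrative encoding only).
MODES = {
    "deduction": (("rule", "case"), "result"),
    "induction": (("case", "result"), "rule"),
    "hypothesis": (("rule", "result"), "case"),
}


def arrange(syllogism, mode):
    """Return (premises, conclusion) for the given mode of inference."""
    premise_roles, conclusion_role = MODES[mode]
    premises = [getattr(syllogism, role) for role in premise_roles]
    return premises, getattr(syllogism, conclusion_role)


for mode in MODES:
    premises, conclusion = arrange(BEANS, mode)
    print(mode.capitalize())
    for premise in premises:
        print("  " + premise)
    print("  => " + conclusion)
```

Calling `arrange(BEANS, "hypothesis")` returns the rule and the result as premises and the case as conclusion, which is the pattern Peirce labels hypothesis.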
The first example, deduction, is an illustration of analytic inference, and the last two are illustrations of synthetic or ampliative inference. For Peirce induction is much more direct than hypothesis. Induction is a generalization from a number of direct observations. Hypothesis takes place when a circumstance justifies a supposition that would explain it. Commenting in more detail on the comparison of induction and hypothesis, he writes:

Induction is where we generalize from a number of cases of which something is true, and infer that the same thing is true of a whole class. Or, where we find a certain thing to be true of a certain proportion of cases and infer that it is true of the same proportion of the whole class. Hypothesis is where we find some very curious circumstance, which would be explained by the supposition that it was a case of a certain general rule, and thereupon adopt that supposition. Or, where we find that in certain respects two objects have a strong resemblance, and infer that they resemble one another strongly in other respects. . . . As a general rule, hypothesis is a weak kind of argument. It often inclines our judgment so slightly toward its conclusion that we cannot say that we believe the latter to be true, we only surmise that it may be so. (Peirce, 1878e, WP 3, pp. 326–327)
A subsequent passage contains several examples of the logic of falsification. Peirce asserts that if the conclusion of an argument is false, and if the premise implies the truth of the conclusion, then one can also argue for the falsity of the premise. He gives two different cases of inverting the syllogism: one is a denial of the case, and the other is a denial of the rule. Falsification was famously offered as a crucial conception of empirical progress in science and economics by Popper (1959) and Friedman (1953).

During the next 20 years, Peirce wrote relatively less often about the logic of inference. During these decades, scientific work and mathematical research dominated his writing. In 1891 his three decades of employment at the Coast Survey ended (Brent, 1998, p. 202). Then he began to write a number of manuscripts and gave several more lecture series. One of those sets of public presentations, the Cambridge Conference Lectures of 1898, titled Reasoning and the Logic of Things, provided an opportunity to revise his conception of reasoning processes. The second lecture in this series is titled “Types of Reasoning.” There, comments are made about the contributions of prominent thinkers in the history of logic and how their ideas can be improved. Peirce (1898, p. 135) moves toward another presentation of the three forms of reasoning with various syllogisms. Here he uses the logical quantifiers “any” and “some” to produce three very similar variations of each syllogism, together with affirmative and negative versions of propositions, to form syllogisms relating to the use of probability. Peirce’s first figure illustrates deduction, the second retroduction, and the third induction. The second and third figures are each derived from the first by interchanging two of its three propositions. With respect to deduction and induction as reconceived for probable inference, Peirce concludes:

We see three types of reasoning. The first figure embraces all Deduction whether it be necessary or probable. By means of it we predict the special results of the general course of things, and calculate how often they will occur in the long run. A definite probability always attaches to the Deductive conclusion because the mode of inference is necessary. The third figure is Induction by means of which we ascertain how often in the ordinary course of experience one phenomenon will be accompanied by another. No definite probability attaches to the Inductive conclusion such as belongs to the Deductive conclusion. (Peirce, 1898, p. 141)
Previously, Peirce (1898, p. 137) had elaborated that deduction is a limiting case of demonstrative inference, where the certainty of a proposition being true is one and of its being untrue zero. Commenting on induction, he highlights Aristotle’s view that induction is “the assault upon the generals by the singulars” (Peirce, 1898, p. 139). Then he takes up what had been his conception of hypothesis or, as he prefers to call it in these lectures, retroduction. Retroduction, if it is to be done well, requires choices to be made, which raises economic aspects of inquiry:

The second figure is Retroduction. Here, not only is there no definite probability to the conclusion, but no definite probability attaches even to the mode of inference. We can only say that the Economy of Research prescribes that we should at a given stage of our inquiry try a given hypothesis, and we are to hold it provisionally as long as the facts permit. There is no probability about it. It is a mere suggestion which we tentatively adopt. (Peirce, 1898, p. 142)
We cannot leave Peirce’s early writings on hypothesis and retroduction without noting the one prominent instance where the term abduction is found. If most of his manuscripts had not survived, we would not know that he was curious enough to write a definition of abduction while he was quite young. Before he had turned 30, Peirce had imagined writing a dictionary of logic and drafted several pages of definitions of “A” terms. In that manuscript is found a definition which is likely his first mention of abduction as a logical term:

ABDUCTION This is the English form of abductio, a word employed by Julius Pacius, as the translation of ἀπαγωγή (Prior Analytics, lib. 2, cap. 25), which had been rendered deductio by Boethius and reductio and even inductio by the schoolmen. It is a form of argument described by Aristotle . . . . for it is being altogether nearer knowledge. (Peirce, 1867, WP 2, p. 108)
The quotation of Aristotle continues with a sense of detailed knowledge of the intricacies of logic. But Peirce did not choose to incorporate the term abduction into his writings on inference until three and a half decades later, in 1901. Those are the writings that need to be considered next. (The quote continues at the ellipsis as follows: “Abduction is when it is evident that the first term [that which occurs only in the syllogism only as a predicate. . . .] is predicable of the middle, but that the middle is predicable of the last [that which is only subject] is inevident, but is credible or more so than the conclusion . . . .” (Peirce, 1867, WP 2, p. 108, square brackets in original).)
Peirce’s Later Writings on Abduction and Scientific Discovery

In the last creative phase of his life, dating from 1901, Peirce would adopt the term abduction rather than hypothesis or retroduction for reasoning to explain an unexpected observation or phenomenon. In attempting to clarify a third process of reasoning such as abduction, Peirce would go back repeatedly to an ambiguous passage in Aristotle’s writings that could have resulted from the deterioration of the manuscripts when they were stored in compromising venues over the centuries. The deterioration of the manuscripts would lead to the question of whether some of the Greek terms, like those for deduction, induction, and hypothesis or abduction, were corrupted and thus translated inaccurately: “There are strong reasons for believing that in the chapter on the Prior Analytics, there occurred one of those many obliterations in Aristotle’s MS due to its century long exposure to damp in a cellar, which the blundering Apellicon, the first editor filled up with the wrong word” (Peirce, 1898, p. 140). At that point in 1898, Peirce would assert that the better term would be retroduction, but within a few short years he abandoned it for abduction.

More than 30 years after he first wrote that extensive definition of the term “abduction,” Peirce began to use it as an essential part of his theory of scientific discovery, as the quote at the beginning of this article reveals. From 1867 until 1901, Peirce used the terms hypothesis and retroduction for contingent statements which might explain something previously unknown to science and humanity. Then within the span of 3 years, Peirce would write two extended presentations of the meaning of abduction as a logical term and its role in the process of scientific discovery.
The first long passage is from 1901, when an extensive monograph was written whose lengthy title is truncated here to the “Logic of History.” A second substantive and elaborate treatment of abduction can be found in the last three lectures of the Harvard Lectures of 1903. Because Peirce’s writings are complex and scattered over multiple collections, and most readers interested in abduction have not had the opportunity to read his best passages on the subject, longer excerpts are provided here. As mentioned above, one of the major purposes of the “Logic of History” monograph was to defend his hypothesis that certain passages of Aristotle’s Prior Analytics should be interpreted as “abduction” rather than induction or deduction. It is amazing to see the lengths that Peirce (1901b) went to in his quest to ground his conception of abduction accurately in the ideas of Aristotle (Wible, 1998). Peirce is quite well known as a master of the logic of scientific method, which includes extensions to probability and economic aspects of research and experimentation. The first half of the “Logic of History” does not disappoint in this regard. In applying his conception of scientific method to historical research, Peirce (1901b) recapitulates, reconceives, and extends that conception, which is then refocused for the special task of inquiry regarding ancient documents. As mentioned, a key concern was the systematic and regular damage that was done to the scrolls when they were exposed to dampness for nearly a century. His idea, expressed in terms of the lines of a printed book, was that about 1 in every 70 lines was corrupted when the scrolls were not stored properly. If so, this provides a recurring sample of corrupted passages which then become the focus of probable inference and detailed reinterpretation.

It is in the first half of the “Logic of History,” with its creative restatement of scientific methods, that exceptionally interesting passages on abduction are found. That half begins with a critique of flaws in Hume’s theory of testimony, which had been used in interpreting ancient documents. There are comments on deduction, mathematics, and logic. Then various aspects of scientific method are characterized with extensive and detailed passages on various types of induction. There is even a discussion of large numbers and infinity, since one must be aware of the nature of the phenomenon generating observations and whether random sampling will be effective. Then for the first time in his writings, Peirce begins to provide an extensive account of abduction. First come comments on deduction. Since this is Peirce’s first extensive presentation of abduction, a few longer quotes may be in order.
He raises the matter of anomalous observations which invite if not require explanation:

Accepting the conclusion that an explanation is needed when facts contrary to what we should expect emerge, it follows that the explanation must be such a proposition as would lead to the prediction of observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis, then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts, is what I call abduction. . . . a hypothesis adopted by abduction could only be adopted on probation, and must be tested. When this is duly recognized, the first thing that will be done, as soon as the hypothesis has been adopted, will be to trace out its necessary consequences and probable experiential consequences. This step is deduction. (Peirce, 1901b, EP 2, pp. 94–95)
After detailed comments on deduction, Peirce moves on to induction. He first summarizes deduction briefly and then introduces an extensive analysis of induction:

Deduction, of course, relates exclusively to an ideal state of things. A hypothesis presents such an ideal state of things, and asserts that it is the icon, or analogue of experience. Having then, by means of deduction, drawn from a hypothesis predictions as to what the results of experiment will be, we proceed to test the hypothesis by making the experiments and comparing those predictions with the actual results of experiment. Experiment is very expensive business, in money, in time, and in thought; so that it will be a saving of expense to begin with that positive prediction from the hypothesis which seems least likely to be verified. For a single experiment may absolutely refute the most valuable of hypotheses while a hypothesis must be a trifling one indeed if a single experiment could establish it. . . . This sort of inference it is, from experiments testing predictions based on a hypothesis, that is alone properly entitled to be called induction. (Peirce, 1901b, EP 2, pp. 96–97)
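Peirce’s advice about the order of experiments can be sketched as a simple procedure. The Python below is a minimal illustration under stated assumptions: the function names, the numerical plausibility estimates, and the stand-in experiment are all hypothetical, and Peirce offers no such numerical rule in this passage. The sketch tests predictions in ascending order of their estimated chance of being verified and stops at the first failure, since one failed prediction is enough to refute the hypothesis.

```python
from typing import Callable, List, Tuple

# A prediction is a label plus a rough estimate (hypothetical numbers)
# of how likely it is to be verified by experiment.
Prediction = Tuple[str, float]


def order_tests(predictions: List[Prediction]) -> List[Prediction]:
    """Put the prediction least likely to be verified first."""
    return sorted(predictions, key=lambda p: p[1])


def probe_hypothesis(
    predictions: List[Prediction],
    experiment: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Run experiments in order; stop as soon as one prediction fails.

    Returns (survived, tried): whether the hypothesis survived every test
    that was run, and the labels of the predictions actually tested.
    """
    tried = []
    for name, _ in order_tests(predictions):
        tried.append(name)
        if not experiment(name):
            return False, tried  # a single failure refutes the hypothesis
    return True, tried  # survives, but only provisionally


# Illustrative values only: P2 is judged least likely to be verified.
predictions = [("P1", 0.9), ("P2", 0.2), ("P3", 0.6)]

# Stand-in for a real experiment: here prediction P2 fails.
survived, tried = probe_hypothesis(predictions, lambda name: name != "P2")
print(survived, tried)
```

In this toy run only one experiment is needed before the hypothesis is set aside, which is the saving of expense that Peirce has in mind. A hypothesis that passes every test is still held only on probation.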
What follows next are passages where the three reasoning processes are set forth in the sequence in which they might appear during a process of scientific discovery. If the processes of abduction, deduction, and induction are conceived as steps in a process of discovery, then these may be important aspects of the origins of the growth of knowledge. Of course, real science is often quite complicated:

The reasonings of science are for the most part complex. Their parts are so put together as to increase their strength. . . . Abduction . . . . is merely preparatory. It is the first step of scientific reasoning, as induction is the concluding step. Nothing has so much contributed to present chaotic or erroneous ideas of the logic of science as failure to distinguish the essentially different characters of different elements of scientific reasoning. . . . Abduction makes its start from the facts, without at the outset, having any particular theory in view, though it is motived by the feeling that a theory is needed to explain the surprising facts. Induction makes its start from a hypothesis which seems to recommend itself, without at the outset having any particular facts in view, though it feels the need of facts to support the theory. Abduction seeks a theory. Induction seeks for facts. In abduction the consideration of the facts suggests the hypothesis. In induction the study of the hypothesis suggests the experiments which bring to light the very facts to which the hypothesis had pointed. (Peirce, 1901b, EP 2, p. 106)
What follows immediately in the “Logic of History” is Peirce’s most extensive discussion of the economic side of scientific research which he termed the “economy of research.” There are also comments on the strategic nature of chains of scientific reasoning which he called “twenty questions.” Informed guessing or abductively guided sequences of abduction-deduction-induction essentially aid the scientist in moving forward with scientific research in the face of severe economic constraints. Less than 3 years after Peirce wrote the “Logic of History,” he took the opportunity to present his ideas about abduction publicly. The “Logic of History” would not be published in part until the 1930s more than two decades after Peirce had died, and only about two-thirds of the manuscript would appear in the Collected Papers (Peirce, 1931–1958). The immediate context is that his childhood friend and longtime acquaintance, William James, would organize a set of lectures for Peirce’s first formal return to the Harvard campus in several decades in the spring of 1903 (Brent, 1998, pp. 290–293). Peirce’s Carnegie Institute grant application to fund the writing of his memoirs had just been rejected. Harvard President Eliot, in response to James, granted permission for the lectures as long as James would raise the money to support the series. Within 2 or 3 months of receiving final confirmation for the lectures, Peirce found himself at Harvard making seven scheduled presentations and an eighth on mathematics. The first introductory lecture recapitulates main themes from earlier writings: his conception of pragmatism and several mathematical examples of the meaning of pragmatism including an optimizing model of the insurance firm (Wible, 2014; Hoover & Wible, 2021). Lectures two and three were about his philosophical theory of fundamental relational categories for conception and reasoning and a phenomenological interpretation of those categories. 
Lecture four was about seven systems of metaphysics that might arise from various permutations of those relationally conceived categories. The last three Harvard lectures take up reasoning processes again. By this time in his life, Peirce conceived of three “normative sciences” as a sub-part of philosophy.
47 C. S. Peirce’s Conception of Abduction and Economics
This is where reasoning would find its place in his mature system of thought. The three “normative sciences” are esthetics, ethics, and logic. Together, these normative sciences are one of the major branches of philosophy with the other two being phenomenology and metaphysics. In other writings, Peirce had claimed that mathematics was more abstract than philosophy with its three branches and that mathematics and philosophy are more abstract than what are now called the natural and social sciences. It is in the context of elaborating on logic as part of the normative sciences within philosophy that ideas of abduction are here presented in the Harvard Lectures. He asserts that an intellectual ideal needs to be one that is admired on esthetic grounds which would be recognized by those in the community of inquirers. It must be an admirable ideal. It also must be something which is good in the context of ethical concerns related to the subject matter and processes of inquiry, and it must adhere to the canons of logic so that one can also claim that it is true. Thus, an intellectual ideal must have “logical goodness,” and it must be “admirable.” Putting these criteria together, a proposition or statement should be one which is “sound” and thus possesses the qualities of admirable logical goodness. To answer the question: “In what does the soundness of argument consist?,” Peirce (1903a, EP 2, p. 205) next launches into comments on the three processes of reasoning again, abduction, deduction, and induction, and then again mentions the narrative about the flaws in Aristotle’s manuscripts: In order to answer the question it is necessary to recognize three radically different kinds of arguments which I signalized in 1867 and which had been recognized by the logicians of the eighteenth century, although those logicians quite pardonably failed to recognize the inferential character of one of them. 
Indeed, I suppose that the three were given by Aristotle in the Prior Analytics, although the unfortunate illegibility of a single word in his manuscript and its replacement by a wrong word by his first editor . . . has completely altered the sense of the chapter on Abduction. . . . (Peirce, 1903a, EP 2, p. 205)
After these comments comes another substantial paragraph on the three types of reasoning. Peirce (1903a, EP 2, p. 205) begins with the claim that “these three kinds of reasoning are Abduction, Deduction, and Induction.” Deduction “is the only necessary reasoning. It is the reasoning of mathematics,” and it is “reasoning concerning probabilities.” The passage continues claiming that “Induction is the experimental testing of a theory. . . .” and that induction “never can originate any idea whatever. No more can deduction.” Then comes the assertion that scientific discovery and creativity come from abduction: All the ideas of science come to it by way of abduction. Abduction consists in studying facts and devising a theory to explain them. Its only justification is that if we are ever to understand things at all, it must be in that way. (Peirce, 1903a, EP 2, p. 205)
Continuing with the seventh and last Harvard Lecture comes another characterization of the three processes of reasoning. There are added nuances as was Peirce’s custom when an important subject was exposited in a new writing. This is the passage from which many quotations have been taken since the Harvard Lectures first appeared in the 1930s:
Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value and deduction merely evolves the necessary consequences of a pure hypothesis. Deduction proves that something must be, Induction shows that something actually is operative, Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about. . . A man must be downright crazy to deny that science has made many true discoveries. But every single item of scientific theory which stands established today has been due to abduction. (Peirce, 1903a, EP 2, pp. 216–217)
Finally, near the end of the seventh lecture, Peirce goes back to what he had done many times earlier in couching reasoning processes in terms of a syllogism. Here he gives us abduction in syllogistic and inferential form: Long before I first classed abduction as an inference it was recognized by logicians that the operation of adopting an explanatory hypothesis, – which is just what abduction is, – was subject to certain conditions. Namely, the hypothesis cannot be admitted, even as a hypothesis, unless it be supposed that it would account for the facts or some of them. The form of inference therefore is this: The surprising fact, C, is observed; But if A were true, C would be a matter of course. Hence, there is reason to suspect that A is true. (Peirce, 1903a, EP 2, p. 231)
Even closer to the end of that last Harvard Lecture, Peirce claims that abduction is central to his conception of pragmatism: “If you carefully consider the question of pragmatism you will see that it is nothing else than the question of the logic of abduction” (Peirce, 1903a, EP 2, p. 234) (For an individual case such as an abduction regarding a crime like theft, Peirce (1907) wrote a piece titled “Guessing”. His grandest abduction however was that fundamental philosophical categories could be understood relationally. He named these categories Firstness, Secondness, and Thirdness and called this his “Guess at the Riddle” (Peirce, 1887–1888). Readers may also find it interesting that a group of artificial intelligence researchers have written computer programs to embody the logic of abduction and one of them was named PEIRCE after the founder of pragmatism (Josephson & Josephson, 1994, pp. 3 and 94ff.)).
Abduction and Economic Methodology
Although the literature is sparse, there have been a few scholars in economics who have written about abduction with an economic theme in mind. Mostly these are economists or historians of economic thought who have an interest in scientific methodology from a mainstream perspective. This is itself interesting since Peirce
has been claimed as an important influence on a rival and formerly dominant school, institutional economics. For institutionalist perspectives on Peirce, see Liebhafsky (1993) and Webb (2007, 2012). A mainstream economist, methodologist, and historian who has written about abduction with economic connotations is Kevin Hoover. Hoover is also a macroeconomist and has studied Peirce for several decades. In an article published with a co-author, Harris and Hoover (1980) compared Peirce’s conception of abduction with Nelson Goodman’s conception of induction. They argued that ambiguities inherent in Goodman’s conception of induction could be better understood with Peirce’s conception of ampliative inference including his conception of the economy of research. More than a decade after their co-authored article, Hoover (1996) wrote a more general piece relating various aspects of Peirce’s conception of scientific method and inference to various strands of ideas in economic methodology current during and before the early 1990s. Hoover (1996) noted that several interpretations relating economic methodology to pragmatism at that time were not consistent with Peirce’s conception of scientific method. Then he carefully outlined Peirce’s theory of scientific inquiry and noted the prominent role that abduction played in his conception of ampliative inference. Just 3 years after the Harris and Hoover article, another contribution on abduction and its economic aspects appeared, “The Economy of Peirce’s Abduction” (Brown, 1983). Brown wrote about the importance that abduction plays in scientific discovery and contrasted that view with Popper’s stance that there really is no logic of scientific discovery. Brown noted that abduction may lead to the generation of too many hypotheses and that economic considerations would be necessary to establish an order for testing those hypotheses. 
With the passing of a few more years, Alan Dyer (1986) wrote a piece comparing Peirce’s and Thorstein Veblen’s theories of scientific creativity. In his article, Dyer clearly identified himself with the American institutionalist school of economics. Dyer takes the reader through an unusual reading of Thorstein Veblen’s writings to argue that Veblen’s conceptions of idle curiosity and the play of musement in creating new ideas have close parallels with Peirce’s theory of scientific discovery and abduction. Dyer noted that Veblen did take Peirce’s class on the logic of science at Johns Hopkins and suggested that as one reason why there may be a similarity in their views. Closer to the present and just a few years ago, another article was written by Ramzi Mabsout (2015) comparing Peirce’s conception of abduction to Herbert Simon’s work on the cognitive psychology of discovery. Simon was known for comparing human intelligence with artificial intelligence. Simon created models of discovery which Mabsout compared to Peirce’s conception of the role of abduction in scientific investigation. Apparently, Simon had differences with Karl Popper’s (1959) conception of scientific discovery which led Simon closer to Peirce’s views. Something needs to be said about Peirce and institutional economics. Institutional economics appeared in the late nineteenth century in America and then evolved mostly in opposition to neoclassical economics and its empiricist conception of scientific method. Veblen (1898) was the most prominent critic of mainstream economics from the institutionalist perspective. However, most of the articles just mentioned above concern Peirce’s conception of scientific
method with its prominent role for abduction and its relation to the conception of scientific methods which permeate mainstream economics. Peirce is an evolutionary thinker whose conception of scientific method was imagined as functioning within evolutionary natural and social processes of human knowing and inquiry. The fact that Veblen did take a graduate class on the logic of science from Peirce has led many prominent institutionalists to claim Peirce as a founding influence. However, the fact that Veblen never cited Peirce, that most institutionalists have not engaged with the mathematically themed writings of the last 25 years of Peirce’s life, and that most have ignored his conception of abduction suggests that their assertions of the significant influence of Peirce’s ideas and writings on institutional economics need to be reconsidered. Dyer (1986) is a major exception in that he does take up important aspects of Peirce’s conception of abduction. However, Dyer did not take up Peirce’s mathematically facilitated conception of scientific discovery with a prominent role for abduction in intricate inferential processes. Peirce’s writings on philosophy, mathematics, and science would need to be considered to do justice to the full scope of Peirce’s theory of abduction. The more general relationship of Peirce to Institutional Economics has been explored by Wible (2015, 2018b). The large collections of Peirce’s writings which seem to have been left untouched by institutionalists are his four-volume, New Elements of Mathematics (Peirce, 1976), and volumes three and four on logic and mathematics of the Collected Papers (Volumes 3 and 4). Comparisons of Peirce’s and Veblen’s ideas and their conceptions of economics as a mathematical science have been considered by Wible (2021).
Abduction and Economics
From the previous accounts of the development of Peirce’s conception of abduction, one gets the sense that economic considerations may be secondary and in the background of his explanation of the various aspects of the logic of scientific inference. A recognition of economic aspects of the logic of scientific inquiry and economic methodology is quite unusual and of interest in its own terms. But one can ask if there is more to the matter and whether economics plays a more fundamental role in Peirce’s thought than what the articles above have suggested. Peirce did pursue some aspects of economics at a very high level in his time in the late nineteenth century. An exploration of his economic interests does reveal additional layers of economic considerations that are just coming to light. One gets the distinct impression that Peirce was interested in more than applying state-of-the-art mathematical economics to matters of scientific research and inference. Perhaps there is something more fundamentally economic about the processes of scientific inquiry that goes beyond the economics of his time. Peirce’s explicit interests in the subject matter of economics form a small sliver or strand of works alongside his voluminous contributions to philosophy, science, and mathematics. They are easy to miss if one searches through the various collections of his writings. His most obvious contribution to economics is his short essay
“Note on the Theory of the Economy of Research” (1879). A draft of that paper (Peirce, 1877a) precedes the publication of the six Popular Science Monthly articles “Illustrations of the Logic of Science.” One noted Peirce specialist, Max Fisch, has indicated that the “Note” might be considered as a seventh essay in that series. One can imagine that if another version had been written to make a seventh essay in the Popular Science Monthly series, it might have immediately followed the last article, “Deduction, Induction, and Hypothesis.” While a more general version of the economy of research did not appear at that time, Peirce often did place his theory of the economy of research right next to his explanations of the logic of scientific inference. In the “Note,” Peirce (1879) goes beyond economic metaphors and qualitative narratives and actually reformulates Jevons’ (1871) famous marginal utility model for balancing the consumption of two different goods to balancing additional funds for additional scientific research projects (Wible, 1994). Instead of two goods, Peirce (1879) imagines spending money on two quite similar research entities. He asserts that the scientist should balance the additional benefits of extra funds spent on each research project until a point of equality or equilibrium is reached. Peirce even provides a version of Jevons’ bidirectional utility graph. Jevons had created a visual representation drawing the declining marginal utility of one food choice in a positive direction and the other one in the opposite direction with reference to a reversed axis. Consumer equilibrium would be where the two marginal utility curves intersect. Peirce did the same thing for two different research projects with the marginal utility of additional funds to one research project declining in a positive direction and the other with reference to a second axis declining in the opposite direction. 
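As a rough illustration of the balancing rule just described, the following minimal Python sketch finds the split of a fixed research budget at which the marginal utility of funds is equal across two projects. The function name `equalize_marginal_utility`, the two marginal utility curves, and the budget of 14 units are hypothetical choices made for illustration only; they are not taken from Peirce's "Note" or from Jevons. The sketch assumes that both marginal utility curves are positive and strictly decreasing.

```python
from typing import Callable


def equalize_marginal_utility(
    mu_a: Callable[[float], float],
    mu_b: Callable[[float], float],
    budget: float,
    tol: float = 1e-9,
) -> float:
    """Return the funds x given to project A (budget - x goes to project B)
    at which mu_a(x) equals mu_b(budget - x).

    Assumes both marginal utility curves are positive and strictly decreasing.
    """
    lo, hi = 0.0, budget
    # Corner cases: one project is the better use of funds at every split.
    if mu_a(lo) <= mu_b(budget - lo):
        return lo
    if mu_a(hi) >= mu_b(budget - hi):
        return hi
    # Bisection on the gap between the two marginal utilities.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mu_a(mid) > mu_b(budget - mid):
            lo = mid  # an extra unit is worth more in A: shift funds to A
        else:
            hi = mid  # an extra unit is worth more in B: shift funds to B
    return (lo + hi) / 2


# Hypothetical diminishing marginal utility curves for two research projects.
def mu_project_a(x: float) -> float:
    return 10.0 / (1.0 + x)


def mu_project_b(y: float) -> float:
    return 6.0 / (1.0 + y)


funds_a = equalize_marginal_utility(mu_project_a, mu_project_b, budget=14.0)
print(round(funds_a, 6), round(14.0 - funds_a, 6))
```

With these assumed curves the equal-marginal-utility split puts 9 units in the first project and 5 in the second, the point at which an additional unit of funds yields the same marginal return in either project.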
The research equilibrium for the allocation of additional funds would be where the two marginal utility curves intersect. In that time and place in the late 1870s, this represents an extraordinary application and reinterpretation of consumer theory at the dawn of its creation. This conception of the economy of research would be a persistent theme over several decades and provide an economic rationale for Peirce’s conception of ampliative inference and abduction. While not published in his time, there are other writings on economics of exceptional interest. In a letter to his father Benjamin in December of 1871, Peirce (1871) discussed Cournot’s now famous duopoly equations. These equations are found in Cournot’s (1838) Researches and were the subject of a meeting of the Cambridge Scientific Club on the Harvard campus. Peirce did not attend the meeting. The equations in his letter were not recognized as the duopoly equations by Peirce scholars until a few years ago when Wible and Hoover (2015) wrote about them. There are a few other manuscripts with Cournot-like equations regarding the profit maximizing firm and equilibrium across several markets. Two of these writings involve letters written to the editor of The Nation regarding the relative price effects of a sugar tariff. Peirce’s conceptualization of the economic consequences of the sugar tariff clearly reflects his reading of Cournot (Wible & Hoover, 2021). Another economic concept which Peirce developed was Ricardian inference. Ricardian inference regards large numbers like infinity as found with extremely sizeable collections of data or observations
(Hoover & Wible, 2021). Peirce researched Cantor’s theory of an algebra of infinite numbers and corresponded with him. Calculus and probability put scientists in the realm of infinity since both invoke a conception of continuity which involves dealing with conceptions of infinity. Additionally, one must always be aware of the fact that a finite collection of observations or data could have been generated by a process which in principle is infinite and that the calculation of statistical measures from finite samples presumes that probability is continuous between zero and one. One way to handle aspects of large numbers and collections of large numbers of entities is relationally. A few years after the letter with Cournot’s duopoly equations came a brief manuscript expressing what would become a key aspect of advanced mathematical microeconomics, the axiom of transitivity. In “On Political Economy,” the first section of the writing has an elaborate exploration of the theory of the firm with calculus, and the second part is more theoretical with a brief discussion of goods related in consumption and with an algebraic statement of an important property of consumer demand: The dependence of demand on price arises from this fundamental proposition. The desire of a person for anything has a quantity of one dimension, and a person having a choice will take that alternative which gives him the greatest satisfaction. In other words if a person prefers A to B and B to C he also prefers A to C. This is the first axiom of Political Economy. (Peirce, 1874, WP 3, p. 176)
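Peirce's remark that desire "has a quantity of one dimension" suggests why transitivity follows: if every alternative is ranked by a single number, the resulting preferences cannot cycle. The short Python sketch below illustrates this under stated assumptions. The helper names and the satisfaction scores are hypothetical, and the numerical representation is a modern gloss, not Peirce's own notation.

```python
from itertools import permutations


def preferences_from_scores(scores: dict[str, float]) -> set[tuple[str, str]]:
    """Return all pairs (a, b) such that a has a strictly higher score than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


def is_transitive(prefers: set[tuple[str, str]]) -> bool:
    """Check that whenever a is preferred to b and b to c, a is preferred to c."""
    items = {x for pair in prefers for x in pair}
    return all(
        (a, c) in prefers
        for a, b, c in permutations(items, 3)
        if (a, b) in prefers and (b, c) in prefers
    )


# Hypothetical satisfaction scores measured on a single dimension.
ranked = preferences_from_scores({"A": 3.0, "B": 2.0, "C": 1.0})

# A cyclic relation that no one-dimensional score could produce.
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}

print(is_transitive(ranked))  # True
print(is_transitive(cyclic))  # False
```

The cyclic relation fails the check because A is preferred to B and B to C, yet A is not preferred to C, which is exactly the case the axiom rules out.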
This is likely the first statement of the axiom of transitivity with regard to the economic theory of demand, and another would not come for nearly 80 years, in the 1950s. Peirce’s interest in Cournot’s duopoly equations and his mathematical theory of the firm and markets raises the more general question of game theory. Game theory became a more prominent part of economics after World War II. Throughout his discussions of logic and scientific method, there are references to the strategic aspects of those activities. Clearly, Peirce did not directly influence game theory as it was first brought into economics. Peirce is more of a predecessor than a precursor to game theory as it has developed in economics (Wible, 2018a). But Peirce used a specific game theory conception to highlight an important aspect of the economy of research and abduction. Almost always having more general or philosophical readers in mind, Peirce would use simpler examples to illustrate more complicated points especially if there was widespread familiarity with the example. One such example was the game of 20 questions. What may be Peirce’s first use of the game of 20 questions in relation to the conception of a hypothesis and the economy of research came in a review of textbooks relatively early in his career. There he held that J. S. Mill’s conception of inference was flawed, confused, and economically inefficient: A hypothesis, therefore, does not differ from any other inferential proposition. . . . Here two questions must be distinguished: the first, in reference to what a man may logically do; the second, as to how he may best economize his scientific energies. . . . But he will be very unwise to spend a large portion of his life putting anything to the test which can hardly be true or which can hardly be false (Peirce, 1872, WP 3, p. 5).
Then Peirce extends the economic critique to the game of 20 questions: When the questions put to nature will only be answered by yes or no, he will advance with the greatest rapidity (as in the game of twenty questions) by asking questions an affirmative answer to which is equally probable with a negative one. He must, however, consider what degree of certainty the answer will have, and the rule will be, among questions of equal importance, to make that investigation which will have the greatest effect in altering existing probabilities. (Peirce, 1872, WP 3, pp. 5–6).
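Peirce's rule that one advances fastest by asking questions whose affirmative and negative answers are equally probable can be illustrated with a later formalism. The sketch below uses Shannon's entropy measure, which postdates Peirce and is brought in here only as an assumption for illustration; the function name `expected_information_bits` is hypothetical.

```python
import math


def expected_information_bits(p_yes: float) -> float:
    """Shannon entropy, in bits, of a yes/no answer with P(yes) = p_yes."""
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must lie between 0 and 1")
    if p_yes in (0.0, 1.0):
        return 0.0  # a foregone answer teaches nothing
    p_no = 1.0 - p_yes
    return -(p_yes * math.log2(p_yes) + p_no * math.log2(p_no))


# Compare an evenly balanced question with increasingly lopsided ones.
for p in (0.5, 0.9, 0.99):
    print(p, expected_information_bits(p))
```

On this measure an evenly balanced question yields one full bit per answer, while a question that "can hardly be true or can hardly be false" yields close to none, which is one way of reading Peirce's warning about wasted scientific effort.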
The strategic game of 20 questions again arises in the “Logic of History” coming after Peirce’s first elaborate presentation of abduction as presented previously. In the “Logic of History,” right after the presentation of scientific discovery and abduction also comes Peirce’s most elaborate discussion of the economy of research. There are three major factors involved in the economy of research: “Caution, Breadth, and Incomplexity” (Peirce, 1901b, EP 2, p. 109). The game of 20 questions relates to the skilled researcher being able to find an efficient way forward with scientific inquiry: In respect to caution, the game of twenty questions is instructive. In this game one party thinks of some individual object, real or fictitious, which is well known to all educated people. The other party is entitled to answers to any twenty interrogatories they propound which can be answered by Yes or No, and are then to guess what was thought of, if they can. (Peirce, 1901b, EP 2, p. 109)
The game of 20 questions highlights the strategic but economic aspect of abduction and guessing. Conceived as a sequential decision tree with many possible branches, scientific inquiry would face too many alternatives to explore with limited resources. It is the resourcefulness of the experienced researcher which might lead to scientific progress: Thus twenty skillful hypotheses will ascertain what two hundred thousand stupid ones might fail to do. The secret of the business lies in the caution which breaks a hypothesis up into its smallest logical components, and only risks one of them at a time. What a world of futile controversy and of confused experimentation might have been saved if this principle had guided investigations into the theory of light! (Peirce, 1901b, EP 2, p. 109)
In the Harvard Lectures just 2 years later, Peirce would extend the idea of a game of 20 questions from being part of a strategic game of the economy of research and abduction to a game against nature. This clearly represents a generalization of the conception of scientific inquiry as a strategic game of abductively guided guessing: An experiment . . . . is a question put to nature. Like any interrogatory it is based on a supposition. If that supposition be correct, a certain sensible result is to be expected under certain circumstances which can be created or at any rate are to be met with. The question is, Will this be the result? If Nature replies “No!” the experimenter has gained an important piece of knowledge. If Nature says “Yes,” the experimenter’s ideas remain just as they were, only somewhat more deeply engrained. If Nature says “Yes” to the first twenty questions although they were so devised as to render that answer as surprising as possible, the experimenter will be confident that he is on the right track, since 2 to the 20th power exceeds a million (Peirce, 1903a, EP 2, p. 215).
A similar remark about the experimenter playing a game against nature can also be found in the writings of Imre Lakatos (1970, p. 130).
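The arithmetic behind Peirce's remark that 2 to the 20th power exceeds a million can be checked directly. The Python sketch below, offered only as an illustration, treats the game of twenty questions as repeated halving of a numbered set of candidates; the function names and the choice of secret are hypothetical, and the halving strategy is a modern idealization in which every question splits the remaining candidates evenly.

```python
def questions_needed(n_candidates: int) -> int:
    """Smallest k with 2**k >= n_candidates: worst-case yes/no questions by halving."""
    if n_candidates < 1:
        raise ValueError("need at least one candidate")
    return (n_candidates - 1).bit_length()


def guess_by_halving(secret: int, n_candidates: int) -> tuple[int, int]:
    """Identify secret in range(n_candidates); return (guess, questions asked)."""
    lo, hi = 0, n_candidates - 1
    asked = 0
    while lo < hi:
        mid = (lo + hi) // 2
        asked += 1
        if secret <= mid:  # "Is it in the lower half?" answered Yes
            hi = mid
        else:              # answered No
            lo = mid + 1
    return lo, asked


print(2 ** 20)                          # 1048576
print(questions_needed(1_000_000))      # 20
print(guess_by_halving(777_777, 2 ** 20))  # (777777, 20)
```

Twenty well-chosen questions therefore suffice to single out one object among more than a million. The same figure gives the odds in Peirce's game against nature: if each "Yes" were no more likely than a coin toss, twenty in a row would occur by chance only once in 1,048,576 trials.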
One can see that Peirce is placing a great deal of significance on his strategically embellished conception of abduction. In the next Harvard Lecture, Peirce would consider objections to the role of abduction in his conception of inquiry. He imagines three possible objections (1903a, EP 2, pp. 231–233). One is that an abduction may appear initially in a non-logical way and thus defy the syllogistic form of abduction that he has set forth a few pages earlier. He replies that it is the substance of the abductive statement which really matters. A second objection is perhaps that the abductive statement proves too much. His response is that this could occur for an inexperienced mind with the implication that a more seasoned researcher would avoid overstating the abduction. A third objection might be that the newly observed phenomenon might have a different origin than what is supposed in the abductive statement. This third objection to abduction is the most serious: If the antecedent is not given in a perceptive judgment, then it must first emerge in the conclusion of an inference. At this point we are obliged to draw the distinction between the matter and the logical form. With the aid of the logic of relations it would be easy to show that the entire logical matter of a conclusion must in any mode of inference be contained, piecemeal, in the premises. Ultimately therefore it must come from the uncontrolled part of the mind, because a series of controlled acts must have a first. (Peirce, 1903a, EP 2, p. 233)
Peirce continues raising a sequence of concerns which come from the uncontrolled part of the mind. Controlled mental processes are the hallmark of reason, but they often involve a conception of “abductive expectability”: Where do the conceptions of deductive necessity, of inductive probability, of abductive expectability come from? Where does the conception of inference itself come from? That is the only difficulty. But self-control is the character which distinguishes reasoning from any process by which perceptual judgments are formed, and self-control of any kind is purely inhibitory. It originates nothing. (Peirce, 1903a, EP 2, p. 233)
This last passage from the Harvard lectures brings the additional dimension of expectations into the discussion regarding abduction. A conception of expectations was also an important topic in a third major monograph, “Minute Logic,” written in 1902 and thus coming between the 1901 “Logic of History” and the Harvard Lectures of 1903. Though published in many fragments, the “Minute Logic” runs to about 400 pages in the Collected Papers and is thus longer than the other two monographs taken together. In the “Minute Logic,” Peirce lays out a provisional response to the question of where “abductive expectability” comes from: Being in futuro appears in mental forms, intentions and expectations. Memory supplies us a knowledge of the past by a sort of brute force, a quite binary action, without any reasoning. But all our knowledge of the future is obtained through the medium of something else. (1902a, CP 2, p. 46)
Knowledge of the future comes from the possibility that there are law-like regularities in nature and thus in human experience: “All our knowledge of the laws of nature is analogous to knowledge of the future, inasmuch as there is no direct way in which the laws can become known to us” (Peirce, 1902a, CP 2, p. 47).
What scientists need to do is to “guess out the laws bit by bit.” Those guesses could approach what would come to be regarded as being right by the relevant community of researchers or inquirers at some point even if in the distant future (Peirce, 1902a, CP 2, p. 47). Similarly, ordinary individuals in their economic activities of various complexity often could do something analogous to the reasoning processes of scientists. They could “guess out” the most important fundamentals of their circumstances “bit by bit.” This discussion of anticipating the future turns toward a definition of an expectation which should shed some light on what a conception of abductive expectability might mean: Now, on the other hand, consider what an expectation is. Begin with something in the distant future. . . . There is a sort of picture in your imagination whose outlines are vague and fluid. You do not attach it to any definite occasion, but you think vaguely that some definite occasion there is, to which that picture does attach itself, and in which it is to become individualized. (Peirce, 1902b, CP 2, p. 77).
Peirce would go on to claim that an expectation is a habit of imagining what might happen in the future. An inference is essentially an analogy to an expectation and is the key to reasoning about the future. The patterns of what is expected to happen in the future can be captured in a relationally interpreted diagram. Peirce’s conception of expectations which could be represented diagrammatically allowed a connection to his conception of mathematics (Wible, 2020). Mathematics may help us work out the implications of expectations. Diagrams not only include visual illustrations and geometric graphs, but they also extend to mathematical and logical symbols and equations. Such representations are skeletal diagrams which represent key aspects of the logic of events suggested in an abductive statement. Logical diagrams may suggest mathematical models of the phenomena being observed and expectations about the future course of those phenomena: Logic will, indeed, like every other science, have its mathematical parts. There will be a mathematical logic just as there is a mathematical physics and a mathematical economics. If there is any part of logic of which mathematics stands in need . . . . it can only be that very part of logic which consists merely in an application of mathematics, so that the appeal will be, not of mathematics to a prior science of logic, but of mathematics to mathematics . . . . Mathematics is engaged solely in tracing out the consequences of hypotheses [or abductive expectations]. (Peirce, 1902c, CP 1, p. 112)
Yet another version of human reasoning would come in what would amount to a fourth monograph, the 1903 Lowell Lectures (Peirce, 1903b). There the focus is on semiotics and the use of symbols to reason about the future and the inferences relative to the future (Wible, 2022). The passage on abduction comes in a long syllabus that was written to accompany the Lowell Lectures but apparently was not made available to the audience. Here inferences are arguments, and arguments take the form of symbols. The three main forms of argument are abduction, deduction, and induction. Here Peirce (1903b, EP 2, p. 287) claims that “the whole operation of reasoning begins with Abduction . . . .”
J. R. Wible
Conclusions
There is an increasing realization of the importance of the conception of abduction and the preeminent role which Charles Sanders Peirce had in creating and elaborating that conception. Peirce is clearly the modern author of abduction, and he would locate an original suggestion for abduction as a separate form of reasoning in the writings of Aristotle. Peirce’s writings on abduction also intertwine with his theory of the “economy of research” and his interest in the mathematical economics of his time. Peirce thought that there was an economic aspect to the “order of nature” which had implications for human reasoning processes such as inference, science, and mathematics. Peirce also touched on aspects of inquiry which have both economic and epistemological dimensions such as abductive expectations and strategic situations which can be understood with game theory. There are additional aspects not considered here such as his interest in computation and mathematical learning which could extend the relevance of abduction to those concerns as well. For Peirce, abduction is one of the most important processes in human reasoning and where creative ideas about our world originate.
References
Brent, J. (1998). Charles Sanders Peirce: A Life (2nd ed.). Bloomington: Indiana University Press.
Brown, W. M. (1983). The economy of Peirce’s abduction. Transactions of the Charles S. Peirce Society, 397–411.
Burks, A. W. (1946). Peirce’s theory of abduction. Philosophy of Science, 4, 301–306.
Cournot, A. (1929 [1838]). Researches into the Mathematical Principles of the Theory of Wealth (N.T. Bacon, Trans.). New York: Macmillan.
Douven, I. (2017a). Abduction. In Stanford Encyclopedia of Philosophy (Summer 2017 Edition), E. N. Zalta (Ed.), https://plato.stanford.edu/archives/sum2017/entries/abduction/.
Douven, I. (2017b). Peirce on abduction. In Stanford Encyclopedia of Philosophy (Summer 2017 Edition), E. N. Zalta (Ed.), https://plato.stanford.edu/archives/sum2017/entries/abduction/peirce.html.
Dyer, A. W. (1986). Veblen on scientific creativity: The influence of Charles S. Peirce. Journal of Economic Issues, 20(1), 21–41.
Frankfurt, H. G. (1958). Peirce’s notion of abduction. The Journal of Philosophy, 55(14), 593–597.
Friedman, M. (1953). The methodology of positive economics. In Essays in Positive Economics. Chicago: University of Chicago Press.
Harris, J. F., & Hoover, K. D. (1980). The relevance of Charles Peirce. The Monist, 63(3), 329–341.
Hoover, K. D. (1996). Pragmatism, pragmaticism and economic method. In R. E. Backhouse (Ed.), New Directions in Economic Methodology. London: Routledge.
Hoover, K. D., & Wible, J. R. (2021). Ricardian inference: Charles S. Peirce, economics, and scientific method. Transactions of the Charles S. Peirce Society, 56(4), 521–557.
Jevons, W. S. (1871). The Theory of Political Economy (5th ed.). London: Macmillan.
Josephson, J. R., & Josephson, S. G. (1994). Abductive Inference: Computation, Philosophy, Technology. Cambridge: Cambridge University Press.
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the Growth of Knowledge (pp. 91–196). Cambridge: Cambridge University Press.
47 C. S. Peirce’s Conception of Abduction and Economics
Liebhafsky, E. E. (1993). The influence of Charles Sanders Peirce on institutional economics. Journal of Economic Issues, 27(3), 741–751.
Mabsout, R. (2015). Abduction and economics: The contributions of Charles Peirce and Herbert Simon. Journal of Economic Methodology, 22, 1–26.
Peirce, C. S. (1865). On the Logic of Science. Harvard Lectures of 1865, WP 1 (pp. 161–302).
Peirce, C. S. (1866). The Logic of Science; or, Induction and Hypothesis. Lowell Lectures of 1866, WP 1 (pp. 357–504).
Peirce, C. S. (1867). Specimen of a Dictionary of the Terms of Logic and allied Sciences: A to ABS, WP 2 (pp. 105–121).
Peirce, C. S. (1869). [Lectures on British Logicians]. Harvard Lectures of 1869, WP 2 (pp. 309–345, 533–538).
Peirce, C. S. (1871). [Letter to Benjamin Peirce]. In NEM (Vol. III/I, pp. 553–554).
Peirce, C. S. (1872). Educational Text-books, II, WP 3 (pp. 1–7).
Peirce, C. S. (1874). [On Political Economy], WP 3 (pp. 173–76).
Peirce, C. S. (1877a). Note on the Theory of the Economy of Research, ms 1093. Before June 4, 1977
Peirce, C. S. (1877–1878): Six articles on the “Illustrations of the Logic of Science”
Peirce, C. S. (1877b). The Fixation of Belief, WP 3 (pp. 242–257).
Peirce, C. S. (1878a). How to Make Our Ideas Clear, WP 3 (pp. 257–276).
Peirce, C. S. (1878b). The Doctrine of Chances, WP 3 (pp. 276–289).
Peirce, C. S. (1878c). The Probability of Induction, WP 3 (pp. 290–305).
Peirce, C. S. (1878d). The Order of Nature, WP 3 (pp. 306–322).
Peirce, C. S. (1878e). Deduction, Induction, and Hypothesis, WP 3 (pp. 323–338).
Peirce, C. S. (1879). Note on the Theory of the Economy of Research. United States Coast Survey for the fiscal year ending June 1876, U.S. Government Printing Office 1879, reprinted in Operations Research (Vol. XV, 1967 [1879], pp. 642–648). Also reprinted in CP 7 (pp. 76–83); and in WP 4 (pp. 72–78).
Peirce, C. S. (1887–1888). A Guess at the Riddle, in WP 6 (pp. 168–210), EP 1 (pp. 245–279), CP 1 (pp. 181–226).
Peirce, C. S. (1898). In K. L. Ketner (Ed.), Reasoning and the Logic of Things: The Cambridge Conference Lectures of 1898. Cambridge: Harvard University Press, 1992. Excerpts also found in EP 2 (pp. 11–56), CP 1 (pp. 339–363), CP 5 (pp. 399–422), CP 6 (pp. 1–5, 46–66, 132–146), CP 7 (pp. 284–312).
Peirce, C. S. (1901a). Notes on Ampliative Reasoning, CP 2 (p. 495).
Peirce, C. S. (1901b). On the Logic of Drawing History from Ancient Documents Especially from Testimonies, EP 2 (pp. 75–114).
Peirce, C. S. (1902). Three excerpts from the Minute Logic
Peirce, C. S. (1902a). [Partial Synopsis of a Proposed Work in Logic], CP 2 (pp. 42–56).
Peirce, C. S. (1902b). Why Study Logic, CP 2 (pp. 67–119).
Peirce, C. S. (1902c). A Detailed Classification of the Sciences, CP 1 (pp. 83–137), CP 7 (pp. 223–248).
Peirce, C. S. (1903a). Harvard Lectures on Pragmatism, EP 2 (pp. 133–241).
Peirce, C. S. (1903b). Sundry Logical Conceptions. From the Lowell Lectures of 1903, EP 2 (pp. 266–288).
Peirce, C. S. (1907). Guessing. A manuscript re-titled [Later Reflections] by the editors of the Collected Papers, CP 7 (pp. 27–34).
Peirce, C. S. (1931–1958). Collected Papers of Charles Sanders Peirce (Vols. 1–6, ed. C. Hartshorne & P. Weiss, Vols. 7–8, ed. A. Burks). Cambridge: Harvard University Press (Referred to as CP).
Peirce, C. S. (1976). In C. S. Peirce & C. Eisele (Eds.), New Elements of Mathematics (4 Vols, 2481pp.). The Hague: Mouton Publishers (Referred to as NEM).
Peirce, C. S. (1984–2010). Writings of Charles S. Peirce: A Chronological Edition (Vols. 1–6 and 8, Many editors). Bloomington: Indiana University Press (Referred to as WP).
Peirce, C. S. (1992, 1998). In N. Houser, C. Kloesel, & The Peirce Edition Project (Eds.), The Essential Peirce (2 Vols). Bloomington: Indiana University Press (Referred to as EP).
Popper, K. (1959). The Logic of Scientific Discovery. New York: Harper and Row.
Stigler, S. M. (1980). American Contributions to Mathematical Statistics in the Nineteenth Century (Vol. II). New York: Arno Press.
Veblen, T. (1898). Why is economics not an evolutionary science? Quarterly Journal of Economics, xii, 57–81.
Walton, D. (2004). Abductive Reasoning. Tuscaloosa: University of Alabama Press.
Webb, J. L. (2007). Pragmatisms (plural) part 1: Classical pragmatism and some implications for empirical inquiry. Journal of Economic Issues, 41(4), 1063–1086.
Webb, J. L. (2012). Pragmatisms (plural) part II: From classical to neo-pragmatism. Journal of Economic Issues, 46(1), 45–74.
Wible, J. R. (1994). Charles Sanders Peirce’s economy of research. Journal of Economic Methodology, 1, 135–160.
Wible, J. R. (1998). Peirce’s economic reasoning in his methodological essay, ‘On the logic of drawing history from ancient documents especially from testimonies’. In M. Rutherford (Ed.), The Economic Mind in America: Essays in the History of American Economics, Perspectives in the History of Economic Thought (pp. 233–257). London: Routledge.
Wible, J. R. (2014). Peirce’s economic model in the first Harvard lecture on pragmatism. Transactions of the Charles S. Peirce Society, 50(4), 548–580.
Wible, J. R. (2015). The Puzzle of C. S. Peirce’s pragmatism and economics: Is it a scientific method for institutionalist or neoclassical economics or something else? History of Economics Society Meetings, Michigan State University, June 2015.
Wible, J. R. (2018a). Game theory, abduction, and the economy of research: C. S. Peirce’s conception of humanity’s most economic resource. Transactions of the Charles S. Peirce Society, 54(2), 134–161.
Wible, J. R. (2018b). A Peircean perspective on integrating economics and evolutionary theory. A review of Carsten Herrmann-Pillath’s foundations of economic evolution: A treatise on the natural philosophy of economics. Journal of Economic Methodology, 25(1), 105–111.
Wible, J. R. (2020). C. S. Peirce’s theory of abductive expectations. European Journal of the History of Economic Thought, 27(1), 2–44.
Wible, J. R. (2021). Why economics is an evolutionary, mathematical science: How could Veblen’s view of economics have been so different than C. S. Peirce’s? Journal of the History of Economic Thought, 43(3), 350–377.
Wible, J. R. (2022). C. S. Peirce’s Semiotic and Mathematical Conception of Economics. Recherches Sémiotiques/Semiotic Inquiry (RSSI, forthcoming 2022, 33pp.).
Wible, J. R., & Hoover, K. (2015). Mathematical Economics Comes to America: Charles S. Peirce’s Engagement with Cournot’s Recherches sur les Principes Mathematiques de la Théorie des Richesses. Journal of the History of Economic Thought, 37(4), 551–536.
Wible, J. R., & Hoover, K. (2021). The economics of trade liberalization: Charles S. Peirce and the Spanish treaty of 1884. European Journal of the History of Economic Thought, 28(2), 229–248.
Abduction and Economics
48
Ramzi Mabsout
Contents
Introduction
Political Economy and Abduction
Theoretical and Applied Economics
Abduction in Heterodox Economics and Philosophy of Economics
The Economics of Abduction
Concluding Remarks
References
Abstract
This entry maps the use of ampliative inference or abduction in economics. Abduction does not just integrate numerous economic concepts but has been in continuous use since classical economics. The integration of abductive elements in economics is better understood after Charles Sanders Peirce’s (1839–1914) clear delineation of a logic underlying ampliative inference. His methodology of science, which includes abductive inference as the initial step that suggests novel explanatory hypotheses to be tested, can integrate the prevalent (Millian) deductivist and (Popperian) falsificationist methodologies among economists. Although the current interest in abduction in economics has taken root within a very diverse group of scholars, their accounts of abduction reveal significant overlaps on the importance of experience, intuition, creativity, and detective-like research instead of mechanical textbook-like procedures. Such consensus among heterogeneous schools of thought in economics on the most important elements of research in their discipline highlights their shared elements and can contribute to the unification of the discipline.
Keywords
Abduction · Economics · Inference · Classical economics · Mainstream economics · Theoretical economics · Applied economics · Heterodox economics · Economics of science
R. Mabsout, Department of Economics, American University of Beirut, Beirut, Lebanon
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_55
Introduction
The traditional, or orthodox, view in economic methodology on such issues as the method of economic inquiry, the validity of inferences, and what counts as an ampliative inference remains to this day influenced by the reflections of John Stuart Mill (1806–1873). Mill, however, faced something of a conundrum in his methodological reflections. In his early thought he considered political economy a separate deductive social science whose laws are known via (inductive) empirical introspection. In his more mature works he focused on history and the general science of society, of which political economy is a branch, and on how they rely on a different inferential procedure, the inverse deductive method. The inverse deductive method, unlike deduction, starts with empirical observations and moves upward to the causal explanation. History and the general social sciences, Mill considered, employ the inverse deductive method because their subject matter is a dynamic, constantly changing social world, whose causal laws change with geography and over time. The conundrum arises because Mill thought the causal laws of political economy also change with geography and over time. Perhaps Mill saw political economy as a separate science which could adopt deduction because it had identified its most important law (the desire for wealth), or perhaps he wished to shield Ricardian deductivism from empirical refutation. Whatever the reason behind the separateness of political economy, there is a reading of Mill in which political economy is but a narrow branch of history and the general social sciences and, ultimately, amenable to the inverse deductive method (Hollander, 1985; Redman, 1997). Charles Sanders Peirce’s (1839–1914) methodology of science, as we shall see, can bridge some gaps left by Mill, specifically by integrating the deductive method of political economy with the method of history and the general social sciences.
Peirce’s methodology of science starts with abduction, which discovers an explanation for some surprising phenomenon; it is followed by deduction, which generates testable hypotheses, and ends with induction, which tests the outputs of deduction. Peirce’s methodology can be, and has been, applied to economics, bridging the gap between Mill’s deductivist political economy and his inverse deductive method for the general social sciences and history. Interestingly, Peirce (1931, CP 5.167) was acquainted with Mill’s work but had few kind things to say: “A generation and a half of evolutionary fashions in philosophy has not sufficed to extinguish the fire of admiration for John Stuart Mill – that very strong but philistine philosopher whose inconsistencies fitted him so well.” Similar critiques of Mill (and Bacon) are interspersed in Peirce’s work and mostly focus on their alleged scientific ignorance. The broader issue, beyond Mill, is that Peirce’s clear delineation of a logic underlying ampliative inference, called abduction, the method of hypothesis, or retroduction, has had an impact, if belatedly identified, on economics, both mainstream and heterodox, theoretical and applied. This entry describes the implications of this impact from classical political economy to contemporary economics. It begins by identifying, in section “Political Economy and Abduction,” early elements of abduction in the works of political economists. Abduction in the theoretical and applied works of contemporary economists is the focus of the next section. Section “Abduction in Heterodox Economics and Philosophy of Economics” is on the use of abduction outside mainstream economics, namely among heterodox economists and philosophers of economics. The final section reflects on the economics of abduction, that is, how abduction itself integrates economic elements.
Political Economy and Abduction
The idea that inference can be ampliative and contain, perhaps even generate, novel explanations predates Peirce, whose pioneering work nevertheless was the first to clearly distinguish between abduction and induction. Before Peirce, deduction, induction, and abduction were often amalgamated (Harris & Hoover, 1980). The entanglement between deduction, induction, and abduction, specifically, is found in the influential works of early inductivists such as Francis Bacon (1561–1626) and among political economists such as David Ricardo (1772–1823) and John Stuart Mill (Redman, 1997, pp. 188–189). Bacon, an early and influential proponent of induction in science, described a method for discovering axioms, which he labeled induction to axioms: “In establishing axioms, another form of induction must be devised than has hitherto been employed; and it must be used for proving and discovering not first principles (as they are called) only, but also the lesser axioms, and the middle, and indeed all” (Bacon, [1858] 2011, p. 97). For Bacon, the formalism of the syllogism is not sufficient to establish first principles in science, and induction was elaborated as a method or instrument of discovery (Klein & Giglioni, 2020; Urbach, 1987, p. 49). Bacon conceived two types of induction, the method of interpretation and the method of anticipation or premature generalizations. According to Urbach (1982), although Bacon was critical of the method of anticipation insofar as it avoided empirical refutation, both interpretation and anticipation are ampliative. While Bacon dismissed anticipation and considered it rash and unlikely to lead to scientific progress (Redman, 1997, p. 170), Urbach (1982, pp. 116–118) points out that in his work “there is no attempt to disparage speculation in science, and anticipation is not criticized for being a speculative method. What Bacon objected to in anticipation was its refusal to allow theories to be refuted by empirical evidence . . . Bacon’s real target was not speculation but the dogmatic defence of speculation and the tendency to regard speculations as infallible.” Bacon was not alone in promoting the importance of ampliative inference in science, and traces (subsumed in induction) can be found in Isaac Newton (1642–1727), David Hume (1711–1776), Adam Smith (1723–1790), David Ricardo, and Sir John Herschel (1792–1871), as we now discuss.
Among classical political economists, abductive elements are present in Ricardo’s theory of rent, which Peirce identifies as Ricardian inference (Hoover, 2019; Hoover & Wible, 2020). Ricardian inference resembles Peirce’s own “analytical method,” a hybrid that combines, in analogy, the interplay of abduction and induction. The most prominent defender of Ricardo’s work in the nineteenth century, namely John Stuart Mill, deemed political economy a narrow specialized moral science, a branch of speculative politics, whose method of inquiry is deductive a priori because of the impossibility of conducting experiments and because it is a single-cause science (Hands, 2001). If the deduction is not empirically confirmed, Mill nevertheless allowed its revision and the addition of hitherto excluded laws in political economy over and above that of the desire for wealth maximization (Hollander, 1985, pp. 104–134). Mill, however, did not consider deduction a valid mode of inference because, he argued, it could not be used to generate knowledge not already contained in the premises (Mill, 1843, Vol. 1, Book 2, Chap. 3, p. 244). Furthermore, Mill recognized that new knowledge can be acquired with induction, stating that “all discovery of truths not self-evident, consists of inductions” (Mill, 1843, Vol. 1, Book 3, Chap. 1, p. 345).
In A System of Logic, Mill in fact appears to have stumbled on more specific elements of abduction when discussing the problems faced by the general sciences of society, such as history and sociology, which, according to him, must explain a continuously changing interdependent social world. For example, he contends that:
The fundamental problem, therefore, of sociology is to find the laws according to which any state of society produces the state which succeeds it and takes its place . . . It is one of the characters, not absolutely peculiar to the sciences of human nature and society, but belonging to them in a peculiar degree, to be conversant with a subject matter whose properties are changeable. (Mill, 1843, Vol. 2, Book 6, Chap. 10, p. 587)
Mill also argued that without history, moral sciences such as political economy ignore cumulative change. History can play a key role in moral sciences through the inverse deductive method. According to Redman (1997, p. 342), the inverse deductive method has the following form:
1. An empirical generalization drawn from the facts of history suggests a law.
2. The generalization is checked to see if it conforms to the laws of human nature or ethology.
3. The generalization is evaluated on the basis of the results.
As Redman (1997, p. 337) further explains, “when employing the inverse deductive method, empirical generalizations drawn from history suggest a law, which is often verified by known deduction from psychological or ethological laws of human nature.” History and local customs broaden the scope of political economy, whose deductions were restricted to Great Britain and the USA (Mill, 1843, Vol. 2, Book 6, Chap. 9, p. 577). Mill, in fact, did not even consider the desire for wealth, the approximate generalization which delineated the domain of political economy, applicable to continental Europe: “Yet those who know the habits of the continent of Europe are aware how apparently small a motive often outweighs the desire of money-getting, even in operations which have money-getting as their direct object” (Mill, 1843, Vol. 2, Book 6, Chap. 9, p. 578). The axioms of (British) political economy, for Mill, were approximate generalizations restricted to a place and an epoch. The importance given to local context is also not a Millian idiosyncrasy among classical political economists, for Ricardo too considered that the behavioral axioms “varied geographically and temporally . . . and accordingly [entailed] the conditional character of the conclusions of political economy” (Hollander, 1985, p. 21).
Nevertheless, while the inverse deductive method has a logical structure with elements that mirror abduction, Hirsch and de Marchi (1990, p. 117) argue that, for Mill, the laws in the inverse deductive method should be true causal laws: “Instead of first deducing implications from causal laws and then verifying these implications with specific experience, the empirical generalizations are derived first from specific experience and then are verified by being related to known causal laws... the purpose of the inversion is not to derive premises or causal laws.” Hirsch and de Marchi (1990, p. 117) thus distinguish Mill’s inverse deductive method from abduction, in which the laws are novel and therefore defeasible. According to Hirsch and de Marchi (1990, p. 120), for Mill, political economy should not derive the “behavioral premise (or premises) for theorizing from detailed observation of specific experience,” which rules out the inverse deductive method for political economy, a science that relies on introspection for the premises of its deductions.
Theoretical and Applied Economics
The Millian orthodoxy in economic methodology classifies economics as a deductive science. However, Wible (2014) goes against this trend when he demonstrates that a simple insurance model, derived by Charles Peirce for his Harvard Lecture of 1903, is an instance of an abductive inference. Peirce’s model is a profit maximization problem with only four variables (price charged per policy p, quantity of policies sold n, loss per policy sold l, and fraction of policies incurring loss q) and a probability for the quantity of policies that incur losses. Wible (2014, p. 569) argues that in the Harvard Lecture, Peirce, to the displeasure of his hosts, states that the foundations of pragmatism are to be found not in psychology but in logic and that the question of pragmatism and the question of abductive logic are one and the same. Wible then claims that Peirce’s insurance model has the following abductive form:
The surprising fact of an insurance firm behaving as a profit maximizer (C) is observed.
But if the insurance firm being a monopolist were true (A), then profit maximization (C) would be a matter of course.
Hence, there is a reason to suspect that it is true that the insurance firm is a monopolist (A).
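To make the four-variable insurance problem described above more concrete, the following minimal sketch computes a profit-maximizing price numerically. It is an illustration under stated assumptions, not Peirce’s or Wible’s own formulation: the linear demand function `linear_demand`, its parameters `a` and `b`, and the numerical values chosen for q and l are all hypothetical. Expected profit is taken to be n(p − ql), that is, the number of policies sold times the margin of the price over the expected loss per policy.

```python
def linear_demand(price, a=1000.0, b=10.0):
    """Hypothetical demand curve (an assumption): policies sold n fall linearly with price p."""
    return max(a - b * price, 0.0)


def expected_profit(price, loss_fraction, loss_per_policy, demand):
    """Expected profit n * (p - q * l) for a given price p."""
    n = demand(price)
    return n * (price - loss_fraction * loss_per_policy)


def best_price(loss_fraction, loss_per_policy, demand, prices):
    """Grid search for the price with the highest expected profit."""
    return max(
        prices,
        key=lambda p: expected_profit(p, loss_fraction, loss_per_policy, demand),
    )


# Illustrative numbers only: 5% of policies incur a loss of 400 each.
q, l = 0.05, 400.0
price_grid = [cents / 100 for cents in range(0, 10001)]  # 0.00 to 100.00
p_star = best_price(q, l, linear_demand, price_grid)

# For linear demand n = a - b * p the optimum is p* = (a / b + q * l) / 2.
closed_form = (1000.0 / 10.0 + q * l) / 2
print(p_star, closed_form)
```

Under these assumed numbers the grid search and the closed-form expression agree, which is the sense in which profit maximization is “a matter of course” once a firm with pricing power is hypothesized.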
Redman (1997) too questions the extent to which economics is deductive but does so by revisiting the original source in Mill. Thus, in contrast to Hirsch and de Marchi (1990), Redman argues that Mill’s mature work was open to the possibility of applying the inverse deductive method to political economy. Hirsch and de Marchi (1990, p. 14), nevertheless, claim that if Mill could not apply the inverse deductive method to economics, Milton Friedman did. The clearest case of abductive inference arises in Friedman’s joint work with Leonard Savage on risk. According to Hirsch and de Marchi (1990, p. 18), Friedman and Savage (1948) use Charles Peirce’s abductive procedure when they start with a surprising observed fact to explain, or rationalize, namely risk-taking. The standard concave utility function with diminishing marginal utility entailed risk aversion and could not be used to explain, or rationalize, risk-taking by economic agents. Friedman and Savage (1948) posited an (initially) implausible wiggly utility function with convex and concave sections. Hirsch and de Marchi argue that Friedman and Savage used this wiggly utility function to explain both observed risk aversion and risk-seeking in economic behavior.
Following Mill, Robert Sugden (2002, 2009, 2013) describes economics as a deductive science but adds that it should integrate ampliative inferences such as abduction to link the model world to the real world. Sugden (2013, p. 239) assumes economic models ought to explain real-world phenomena, which is why, when reflecting on how fictional models lacking realism can explain observable tendencies in the world, Sugden introduces abduction as a subcategory of induction. For Sugden, the ability of economics to explain is contingent on the premises (which contain isolations, abstractions, empirical generalizations, ceteris paribus clauses, etc.) and the parameter values assumed in its models. The question for Sugden then is how modelers justify the transition from the particular parameter values assumed in the model world to the real world. After presenting various economic models such as Akerlof’s market for lemons, Schelling’s checkerboard of racial sorting, and Banerjee’s herding model, he posits a class of inductive inferences that are necessary for the transition but that require a leap in reasoning, among them explanation, prediction, and abduction. The inferential form he gives to abduction is the following (with R a regularity and F a set of causal factors):
A1. In the model world, R is caused by F
A2. R occurs in the real world
Therefore, there is reason to believe:
A3. F operates in the real world
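The schema A1 to A3 can be illustrated with the Friedman and Savage case discussed earlier in this section. The sketch below is only an illustration under stated assumptions: the functional form `wiggly_utility` is a hypothetical stand-in with one concave and one convex segment (it is not the function Friedman and Savage drew), and the wealth level, insurance terms, and lottery terms are invented numbers. The regularity R is “the same agent both buys fair insurance and buys a fair lottery ticket”; the candidate factors F are shapes of the utility function.

```python
import math


def wiggly_utility(w):
    """Hypothetical utility (an assumption): increasing, concave on (0, pi), convex on (pi, 2*pi)."""
    return w + 0.5 * math.sin(w)


def concave_utility(w):
    """Standard concave utility with diminishing marginal utility."""
    return math.sqrt(w)


def expected_utility(u, lottery):
    """Expected utility of a list of (probability, final wealth) pairs."""
    return sum(prob * u(wealth) for prob, wealth in lottery)


def insures_and_gambles(u, wealth=2.5):
    """A1 check: in the model world built on u, does regularity R arise?"""
    # Actuarially fair insurance: 10% chance of losing 2.0, premium 0.2.
    uninsured = [(0.9, wealth), (0.1, wealth - 2.0)]
    insured = [(1.0, wealth - 0.2)]
    buys_insurance = expected_utility(u, insured) > expected_utility(u, uninsured)
    # Actuarially fair lottery: ticket costs 0.175, pays 3.5 with probability 0.05.
    ticket = [(0.05, wealth - 0.175 + 3.5), (0.95, wealth - 0.175)]
    no_ticket = [(1.0, wealth)]
    buys_ticket = expected_utility(u, ticket) > expected_utility(u, no_ticket)
    return buys_insurance and buys_ticket


def abduce(candidate_factors, regularity_observed):
    """A2 and A3: if R is observed, return the factors F whose model world produces R."""
    if not regularity_observed:
        return []
    return [name for name, u in candidate_factors.items() if insures_and_gambles(u)]


candidates = {"wiggly utility": wiggly_utility, "concave utility": concave_utility}
suspects = abduce(candidates, regularity_observed=True)
print(suspects)
```

The output is a list of hypotheses worth suspecting, not a verdict: as in the schema, the conclusion is only that there is reason to believe F operates in the real world, and it remains defeasible.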
An abductive inference obtains “when we observe R in a particular case, we have some reason to expect to find F too” (Sugden, 2002, p. 126). Without abductive leaps, “the implications that the modeler wants his reader to draw about the real world cannot be deduced merely from the accepted principles of economics: what can be so deduced are implications only about the model world” (Sugden, 2013, p. 238). There are several interesting features contained in Sugden’s account of abduction. First, confidence in the connecting inference between the model world and the real world, i.e., its credibility, is contingent on the degree of similarity between the two worlds. Nevertheless, Sugden admits that judgments of similarity contain subjective elements which are bequeathed to his account of credibility and, it follows, explanatoriness is deemed an individual subjective notion (Sugden, 2013, p. 242). Second, the leap introduced by abductive inference – linking the model world to the real world – Sugden (2009, p. 4) argues, cannot be formalized using the rigorous language of economic theorists and an insoluble vagueness is appended to abduction. Kevin Hoover (2019) considers modeling in science in general, and economics in particular, is useful in so long as it is an instrument for stating truths. He adopts Peirce’s conception of analogy which combines abduction and induction to economics focusing on Lawrence Klein’s macroeconometric models of the US economy. Hoover claims Klein starts with an abduction to set the frame, namely a Keynesian model of aggregate supply and demand. Klein then specifies additional models of increasing complexity and uses induction to test their accuracy. When some aspects of the models appear to lack accuracy, Klein refines the model instead of rejecting it. In this manner, Klein builds increasingly larger and more complex models of the US economy settling on a macroeconometric model deemed useful enough for policy analysis. 
It should be noted that under Peirce’s analogy, if induction stumbles on a substantive unexpected deviation from the initial abducted frame, the scientist must return to the drawing board and abduct a new frame to supplant the first. Recently, the importance of abduction in applied research was recognized in the flagship journal of the discipline, the American Economic Review. There, in an article published by Heckman and Singer (2017), it is claimed that “a central feature of abduction is the quest for and construction of hypotheses and explanations, which are the most plausible candidates to account for an empirical phenomenon” (Heckman & Singer, 2017, p. 299). Heckman and Singer argue that abduction involves the use of multiple techniques and data sources. Although not acknowledged, they argue that abductive inferences are found in some of the most important contributions to the discipline including in the works of Friedman and Gary Becker (1930–2014). The textbook approach of hypothesis testing (falsification) and the commendation that hypotheses ought to spring from models is deemed unhelpful. They encourage researchers to think beyond the stringent implications of models, to reason back and forth from model to the data and from data to the model, arguing that groundbreaking empirical work is messy, creative, and not regimented by strict frequentists and Bayesian strictures. Abduction, for Heckman and Singer, is more like detective work and is receptive to all kinds of information and methods. They
emphasize the importance of practice and experience in abductive work as well as the researcher’s ability to embed himself or herself in the data. Such publications, in a mainstream journal by mainstream economists, signal that work on abduction is no longer a fringe topic and that economists are, slowly but surely, shifting away from their initial metronomic focus on Popperian falsification.
Abduction in Heterodox Economics and Philosophy of Economics
Heterodox economists tend to be more forthcoming than mainstream economists about their methodological presumptions. A case in point is Tony Lawson’s (1997, 2003, 2009) work and critique of mainstream economics. For Lawson (2003), social and natural structures are distinct since the former contain human intentional agency and practices that shape, and are in turn shaped by, social structure. Returning to an old Millian theme, because social structure is dependent on human agency, it will have spatial-temporal restrictions; that is, social science, unlike natural science, is necessarily historical-geographical (Lawson, 2003, p. 149). Mainstream economics, insofar as it is fixated on constant event conjunctions mathematically, or formalistically, modeled as deductions and inductions, provides little insight into a social reality that is open, dynamic, changing, and structured, characterized by (i) highly restricted and unstable regularities (demi-regularities or demiregs) and (ii) “deeper structures, powers, mechanisms and tendencies” (Lawson, 2003, p. 79). The objective of economics, then, is not to simplify a complex dynamic social reality by fictionalizing it, as mainstream economists do, but to uncover the deeper causal structures, powers, mechanisms, and tendencies that produce or facilitate the demiregs. The formal inferential modes used by mainstream economists, induction and deduction, cannot uncover the deeper causal explanations Lawson posits. This task is left to “the essential mode of inference drawn upon in science” (Lawson, 2003, p. 145), retroduction or abduction. Retroduction, then, is the mode of inference that allows the scientist to obtain causal hypotheses. It does not seek to cover a phenomenon with a generalization “but to identify a factor responsible for it, that helped produce or at least facilitated, it. 
To posit a mechanism (at a different level than the phenomena being explained), which, if it existed and acted in the postulated manner, could account for the phenomena singled out for explanation” (Lawson, 1997, p. 212). Lawson observes the following characteristics of retroduction:
1. It is context dependent, operates under analogy and metaphor, and is dependent on the researcher’s perspective, beliefs, and experience (1997, p. 212).
2. Good explanation provides contrastive, or relative, explanatory power (1997, pp. 206–208, 213).
3. Contrastive explanations involve an element of puzzle, contradictions, inconsistencies, experiences of surprise, and doubt (1997, pp. 210–211).
4. Theory assessment is not gauged by its ability to predict but by its explanatory power “to illuminate a wide range of empirical phenomena” (1997, p. 213).
5. After hypotheses are retroduced, consequences are deduced for subsequent empirical testing using induction (Lawson, 1997, p. 213).
6. Social scientific explanation is backward-looking (1997, p. 219).
Lawson (2003, 2009) further distinguishes between theoretical explanations, which use retroduction and posit a novel, previously unknown causal explanation, and applied explanations, which use retrodiction and in which the causal explanation is already identified. Lawson (2009) applies a retroductive explanation to Akerlof’s famous market for lemons paper. The surprising contrast to be explained is the large price discrepancy between new cars and cars that have just left the showroom. As soon as a new car is sold and leaves the showroom, there is an important drop in price which cannot be explained by the quality of the car because quality is, more or less, the same. But, Akerlof argues, lemons are more likely to be found in the secondhand market as their owners try to dispose of them. Buyers, knowing this, will want to pay less than the price of a new car. The consequence is that prices of used cars will tend to be substantially lower as trust between buyers and sellers breaks down.
More accommodating to the mainstream, and using first-order logic to ground and connect abduction to economics, Crespo, Tohmé, and Heymann (CTH, 2010) and Tohmé and Crespo (TC, 2013) lament that economic methodology has so far neglected reasoning on the origin of hypotheses. Among practicing economists, furthermore, they note that discussions on abduction are still implicit, though its importance should not be less than that of statistical methods. They note that economists use analogies, metaphors, and intuitions to motivate and justify their modeling assumptions and conclusions; unlike Lawson, however, CTH do not criticize the mathematical formalism and fictions that distinguish economics from the other social sciences. According to CTH (2010), four features of abduction stand out:
1. Following Peirce, they adopt “qualitative induction” which combines a first abductive stage (generating novel yet defeasible explanatory hypotheses) and a second retroductive stage (eliminating via empirical testing the weaker of the novel hypotheses).
2. Abductive inference is more like detective work than statistical induction. How structured it will be is context dependent and contingent on the alignment between the qualitative problem of interest (income growth or welfare growth, say) and the matching quantitative data (GDP or quality of life).
3. More experienced economists deploy their intuitions more effectively in navigating abductions.
4. The explanatory power of an abduction should be evaluated considering its simplicity, internal and external coherence, and testability.
After grounding abduction in first-order logic, CTH identify arguments describing macroeconomic large-scale crises that “could be used to rationalize the choice of the different hypotheses as basic elements of the approach to the subject” (CTH, 2010, p. 183). In such times, they note that there is an overall boost in abductive
efforts by economists and agents as the efficacy of previously held beliefs is questioned. CTH’s abductive exercise involves evaluating, comparing, and selecting among the arguments the ones with the best abductive fit according to the following criteria: satisfying the description of a macroeconomic crisis; being internally consistent; and having observable border conditions or testable implications. TC (2013) develop CTH’s conception of abduction in economics, integrating extensions by Gabbay and Woods (2005) and Magnani (2009), specifically nonexplanatory/instrumental abduction and manipulative abduction. TC pursue their initial focus on abduction in macroeconomic crises with the 2008 financial contagion and the Argentinian experience of multiple hyperinflation episodes used as folds. TC then add subsections on abduction and the probability of coin tossing and on the abductive elements contained in Akerlof’s above-cited market for lemons model.
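As a rough illustration only, the three selection criteria attributed to CTH above can be read as a filter over candidate arguments. The sketch below is a deliberate simplification and is not CTH's first-order formalization: the `Argument` class, its boolean fields, and the example argument names are all hypothetical, and real abductive fit involves comparison and judgment rather than three yes/no flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    """A candidate explanation of a macroeconomic crisis (hypothetical structure)."""
    name: str
    fits_crisis_description: bool    # satisfies the description of a crisis
    internally_consistent: bool      # contains no internal contradiction
    has_testable_implications: bool  # observable border conditions or tests


def passes_criteria(arg: Argument) -> bool:
    """Return True only if all three criteria listed in the text hold."""
    return (
        arg.fits_crisis_description
        and arg.internally_consistent
        and arg.has_testable_implications
    )


def select_candidates(args: list[Argument]) -> list[str]:
    """Keep the names of the arguments that survive the filter, in input order."""
    return [a.name for a in args if passes_criteria(a)]


# Invented examples, used purely to exercise the filter.
candidates = [
    Argument("argument A", True, True, True),
    Argument("argument B", True, True, False),   # no testable implication
    Argument("argument C", False, True, True),   # does not describe a crisis
]
selected = select_candidates(candidates)
print(selected)
```

A fuller treatment would rank the surviving arguments against one another, since the text speaks of the best abductive fit rather than mere admissibility.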
The Economics of Abduction
The economics of science is the branch of economics that studies how knowledge is produced, tested, distributed, and applied (Dasgupta & David, 1994; Rescher, 1989; Hands, 2001, Chap. 8). A key question in the economics of science is the problem of resource allocation. Peirce ([1879] 1967) pioneered the economics of science in his paper on the economy of research as well as in his other works. In the paper on the economy of research, Peirce used calculus to argue that the allocation of resources in science should follow the principle of maximizing the ratio of benefits to costs (Wible, 1998). Although Rescher (1978, pp. 38–39) claims that he came close to questioning this belief, Peirce adopted a cumulative-convergence theory of scientific progress, believing that the most important discoveries in science have been made and that what remained to be discovered involved increasing accuracy and only marginal adjustments. Peirce also considered that as science progresses, it experiences both rising costs and diminishing returns (Rescher, 1978). More specifically, Peirce considered empirically testing, by induction, the hypotheses suggested by abduction to be the costliest scientific activity (Peirce, 1931, CP 7.206, 7.220). Developing and testing all possible hypotheses inferred by abduction is also uneconomical (Peirce, 1931, CP 5.51). However, the proliferation of hypotheses may still be avoided because Peirce posited a “natural instinct for truth” which, supported by experience and a set of hypotheses, guides the human mind toward the correct one (Peirce, 1931, CP 1.81, 7.220, 6.530, 5.50, 217–218, 5.173–174, 2.753–754, 8.223, 5.591, 5.604, 7.508). As Rescher (1978, p. 8) explains, for Peirce, the outputs of abduction are guided by trained intuition and therefore a proliferation of hypotheses is constrained. 
Which hypotheses are further pursued depends on whether they can be subjected to empirical testing, on their explanatory power of the surprising fact, and on economy (Peirce, 1931, CP 7.220). Peirce (1931, CP 7.220–221) noted that economy in science has three elements: costs, the value of the hypothesis in-itself, and the effects of the hypothesis on other projects. On costs, he argues that before
examining a hypothesis, we should account for “the amount of wealth, in time, thought, money, etc., that we ought to have at our disposal before it would be worthwhile to take that hypothesis for examination” (Peirce, 1931, CP 2.780) and if “a hypothesis can be put to the test of experiment with very little expense of any kind, that should be regarded as giving it precedence in the inductive procedure” (Peirce, 1931, CP 7.220; also CP 5.598). In other words, for Peirce, the selective testing of abductive output should be guided by money, time, thought, and energy (Harris & Hoover, 1980, p. 338). On value, he argues that the hypothesis most likely to be true should be tested, while on the relation to other projects he points out that this is important for abduction because its hypothesis might be refuted and the implications of this “breakdown” must be factored in (Peirce, 1931, CP 7.220).
Concluding Remarks
Recent scholarship has uncovered many aspects of the relationship between abduction and economics. From classical political economists to modern-day modelers, evidence is building up suggesting that abduction is moving from a fringe topic to the center of the discipline. Economists are appropriating the concept of abduction and, whether mainstream or heterodox, applied or theoretical, they offer overlapping accounts of how abduction works, emphasizing the importance of experience, intuition, creativity, and detective-like research instead of mechanical textbook-like procedures. Future work is still needed to further integrate abductive steps in the economics curriculum on a par with the other modes of inference. This goal may be achieved if a more diversified roster of abductive case studies, which remain too few, is made available. There are many challenges; foremost among them is that the abductive elements that propelled a research project are too often excluded from the published research findings. There are no formal requirements to report abductive elements, which continue to be almost exclusively informally transmitted from peer to peer and from more experienced to less-experienced economists.
References
Bacon, F. (2011). The works of Francis Bacon (Vol. 4). Cambridge University Press.
Crespo, R., Tohmé, F., & Heymann, D. (2010). Abducing the crisis. In L. Magnani et al. (Eds.), Model based reasoning in science and technology. Springer.
Dasgupta, P., & David, P. (1994). Towards a new economics of science. Research Policy, 23, 487–521.
Friedman, M., & Savage, J. (1948). The utility analysis of choice involving risk. Journal of Political Economy, 56, 279–304.
Gabbay, D., & Woods, J. (2005). The reach of abduction: Insight and trial. Elsevier.
Hands, W. (2001). Reflections without rules. Economic methodology and contemporary science theory. Cambridge University Press.
Harris, F., & Hoover, K. (1980). Abduction and the new riddle of induction. The Monist, 63, 320–341.
Heckman, J., & Singer, B. (2017). Abducting economics. American Economic Review: Papers and Proceedings, 107, 298–302.
Hirsch, A., & de Marchi, N. (1990). Milton Friedman. Economics in theory and practice. The University of Michigan Press.
Hollander, S. (1985). The economics of John Stuart Mill. Volume I: Theory and method. University of Toronto Press.
Hoover, K. (2019). Models, truth, and analytic inference in economics. CHOPE working paper, no. 2019-01. Center for the History of Political Economy, Duke University.
Hoover, K., & Wible, J. (2020). Ricardian inference: Charles S. Peirce, economics, and scientific method. Transactions of the Charles S. Peirce Society, 56, 521–557.
Klein, J., & Giglioni, G. (2020). Francis Bacon. In Edward N. Zalta (Ed.), The Stanford encyclopedia of philosophy (fall 2020 ed.). Stanford University Press. https://plato.stanford.edu/archives/fall2020/entries/francis-bacon/
Lawson, T. (1997). Economics and reality. Routledge.
Lawson, T. (2003). Reorienting economics. Routledge.
Lawson, T. (2009). Applied economics, contrast explanation, and asymmetric information. Cambridge Journal of Economics, 33, 405–419.
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimension of hypothetical reasoning. Springer.
Mill, J. S. (1843). A system of logic, ratiocinative and inductive (Vol. 1–2). Cambridge University Press.
Peirce, C. S. ([1879] 1967). Note on the theory of the economy of research. Operations Research, 15, 643–648.
Peirce, C. S. (1931). In C. Hartshorne & P. Weiss (Eds.), Collected papers of Charles Sanders Peirce (Vol. 1–8). Harvard University Press.
Redman, D. (1997). The rise of political economy as a science. Methodology and the classical economists. MIT Press.
Rescher, N. (1978). Peirce’s philosophy of science. University of Notre Dame Press.
Rescher, N. (1989). Cognitive economy. The economic dimension of the theory of knowledge. University of Pittsburgh Press.
Sugden, R. (2002). Credible worlds: The status of theoretical models in economics. In U. Mäki (Ed.), Fact & fiction in economics. Models, realism, & social construction. Cambridge University Press.
Sugden, R. (2009). Credible worlds, capacities and mechanisms. Erkenntnis, 70, 3–27.
Sugden, R. (2013). How fictional accounts can explain. Journal of Economic Methodology, 20, 237–243.
Tohmé, F., & Crespo, R. (2013). Abduction in economics: A conceptual framework and its model. Synthese, 190, 4215–4237.
Urbach, P. (1982). Francis Bacon as a precursor to Popper. British Journal for the Philosophy of Science, 33, 113–132.
Urbach, P. (1987). Francis Bacon’s philosophy of science: An account and a reappraisal. Open Court.
Wible, J. (1998). The economics of science. Methodology and epistemology as if economics really mattered. Routledge.
Wible, J. (2014). Peirce’s economic model in the First Harvard Lecture on pragmatism. Transactions of the Charles S. Peirce Society, 50, 548–580.
Part IX Abduction in Education and Human Sciences
49 Introduction to Abduction in Education and Human Sciences
Alger Sans Pinillos
Contents
Introduction to Abduction in Education and Human Sciences
References
Abstract
Studies on abduction in the field of Education and Human Sciences are recent. However, this does not mean that this concept has not been present in these disciplines. In the middle of the last century, abduction was a poorly defined tool to address various theoretical and practical problems. Usually, it has been introduced through Pragmatism. One of the main objectives of using this philosophical theory has been to define a meta-perspective that contemplates the social and cultural dimensions of inquiry. A relevant factor of abduction is that it articulates Pragmatism so that it is possible to consider how facts and values interrelate in human praxis and define reality. Focusing on society, a concern for converting individual action into a transformative tool that extends humanistic values appears. Science Education may have been the area most concerned with the relationship between Human Sciences and the Formal and Experimental Sciences. The underlying issue in these debates is that the project of educating and teaching individuals in society compels reflection on the various ways to achieve the goal such that both Human Sciences and Formal and Experimental Sciences are imbricated toward a usually prescriptive purpose: to improve the world. Abduction represents an ideal mechanism to explain the adaptation of knowledge to the circumstances of the learning subjects. The process of teaching requires synthesizing general issues to particular facts and, likewise, needs to be able to generalize particular facts to other resemblances to make them more comprehensible. For this reason, teaching models and methods must be in constant development and change. The present section aims to show the current status of the application of abduction in Education and Human Sciences in the light of Cognitive Science. With interest in defining different aspects of the mind and its interaction with the environment and other entities in the world, more and more discussions are emerging about the prescriptive forms of these relationships. These advances are providing tools to support theories about the social and cultural dimensions of being human.
Keywords
Abduction · Pragmatism · Human sciences · Science education · Teaching models
A. Sans Pinillos, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_88
Introduction to Abduction in Education and Human Sciences
Studies on abduction in the field of Education and Human Sciences are recent. However, this does not mean that this concept has not been present in these disciplines. In the middle of the last century, abduction was a poorly defined tool used to address various theoretical and practical problems. For example, in areas such as Philosophy and the History of Science, these issues crystallized in the famous debate between the context of justification and the context of discovery. In this discussion, abduction emerged as a piece to explain the relation of non-formalizable elements in the justification process. In many ways, one could situate this trend in the areas of research concerned with the nexus between the practical dimension (i.e., inquiry) and theory (knowledge) as one of the fundamental problems of contemporaneity. For this reason, abduction has become the fundamental problem for contemporary epistemology (Hintikka, 2007): this concept concentrates the need to conceptualize epistemic change.
The incorporation of abduction into Human Sciences has been preceded by an interest in Pragmatism (West, 1989). This is not surprising. In order to characterize the basis of experimentation, Charles Sanders Peirce placed abduction as the cornerstone of Pragmatism. Applying this philosophical theory therefore involves including features of abduction. In the late 1990s, Human Sciences disciplines such as Comparative Literature and Historical Criticism incorporated Pragmatism as a meta-perspective that allowed experimentation to be defined in a world of communicative reaction (Cooren, 2014). In this context, a Pragmatist assumption is undertaken: the experience that shapes reality is structured through a continuous flux of information. The main objective is to define bases that contemplate the social and cultural dimensions of the inquiry. 
For this reason, in Human Sciences research, it is common for abduction to be used to define ways in which the historical and social context is relevant in the characterization of events (Polkinghorne, 1988, p. 19).
In this sense, abduction and Pragmatism are used in these theories to give relevance to the social and cultural elements of the context. However, it is common in these investigations to assume a peripheral, even relativistic, view of Pragmatism (Sundin & Johannisson, 2005). No priority is given to defining the relationship between the individual agent and the community in acquiring knowledge (see Bergman, 2012). Leaving aside the difficulties that this may have caused in some cases in the understanding of both Pragmatism and abduction, it should not be forgotten that there is an underlying question of content and, above all, of purpose. In Human Sciences, there is an interest in the practice of human beings, and, in this sense, a different logic is needed from that offered by the studies of the end of the last century on abduction in fields such as computation, cognitive science, epistemology, etc. (Paavola, 2021).
A relevant factor of abduction is that it articulates Pragmatism so that it is possible to consider how facts and values interrelate in human praxis and define reality (Sans Pinillos, 2021). In Peirce’s original Pragmatism, it is possible to characterize inquiry as a project of construction of an unfinished reality (Apel, 2016). On the one hand, in Peirce, a symbiotic relationship is observed between the habits generated by the environment and the changes that actions provoke in the context that agents inhabit. In other words, action is determinant for reality to occur (James, 1991). On the other hand, it should be noted that, despite the differences between Peirce and James, in presenting Pragmatism, James assumes Peirce’s pragmatic maxim:
All realities influence our practice, and that influence is their meaning for us. I am accustomed to put questions to my classes in this way: In what respects would the world be different if this alternative or that were true? If I can find nothing that would become different, then the alternative has no sense. (James, 1978, p. 29; cf. Peirce, 1958, CP 5, pp. 14–40)
The nuances introduced by James in the cognitive aspects and in the dimension of psychological agents implied a theoretical turn in Peirce’s Pragmatism. For example, the relevance of hindsight to experimentation involved contemplating forms of relationship between people’s inner lives and perceptions (see Myers, 1997). In the same way, the proposal of the pragmatist Ferdinand Canning Scott Schiller is a metaphysics in the service of science. As is well known, this goal of controlling logical positivism and abstract metaphysics for the sake of a theory embracing the whole dimension of the human being was called “humanism” (see Schiller, 1907).
Focusing on society, a concern for turning individual action into a transformative tool that extends humanistic values appears within classical Pragmatism. An example is the influence of Pragmatism in the formation of the social sciences, such as the work of Jane Addams, George Herbert Mead, Charlotte Perkins Gilman, and Charles Cooley, among others. This reconstruction allows us to see what role Human Sciences have played in defining the Social Sciences. Except for Sociology, Science Education may have been the area most concerned with the relationship between Human Sciences and the Formal and Experimental Sciences. This objective can be found in the pedagogical theories of the pragmatist John Dewey. For this philosopher, education is the
privileged mechanism for achieving a democratic society because it makes people grow intellectually, morally, and emotionally (Rodgers, 2002, p. 845). This objective is extensible to all pedagogical proposals. One of the most relevant revolutions for habit formation, ethical instruction, and qualitative inquiry is a language of relationships that fosters a kind of “anastomosis” between disciplines (Peterson, 2016). In this sense, abduction has proven to be a powerful tool for articulating this project of human transformation.
At the end of the last century, the debate about the relationship between science and culture intensified. Factors such as the influence of constructivism in Science Education, the rise of the historical reading of science, and multiculturalism modified educational research and practice. New questions then arose in Science Education from theories such as Postmodernism and disciplines such as Anthropology. One example is the debate on the global predominance of Western Scientific Models (El-Hani & Mortimer, 2007). The issue underlying these debates is that the project of educating and teaching individuals in society compels reflection on the various ways to achieve the goal such that both the humanities and the sciences are imbricated toward a usually prescriptive purpose: to improve the world (i.e., to provide tools to foster better relationships among people, to propose more ecological avenues of inquiry, to transform old categories into new, more inclusive ones, etc.). Abduction is the perfect mechanism to address this goal and, at the same time, deal with theoretical problems, such as the paradox of learning: that “new and better knowledge is fashioned out of prior, less complex knowledge” (Prawat, 1999). Abduction represents an ideal mechanism to explain the adaptation of knowledge to the circumstances of the learning subjects. 
The process of teaching requires synthesizing general issues to particular facts and, likewise, needs to be able to generalize particular facts to other resemblances to make them more comprehensible. For example, it is common to explain the Natural Sciences through the great discoveries of humanity. In the same way, historical facts are usually related to other similar (but not identical) facts to show the resemblances and differences in the development of humanity. Likewise, a very useful resource is proposing “centers of interest”: relating the different subjects using a group of common interests for the student. Of course, these strategies are not exclusive to Science Education. However, in this discipline, the inquiry capabilities of human beings have been systematized with a defined objective: to teach and educate. In other words, to form citizens. Therefore, teaching models and methods must be in constant development and change. For this reason, there has long been an interest in abduction in Science Education. The different pedagogical theories have been updated as new logics and theories of argumentation have emerged and as research on learning and development has progressed in psychology.
The present section aims to show the current status of the application of abduction in Education and Human Sciences in the light of Cognitive Science. It is a novel challenge, the possibility of which already marks a particular state of the art in many respects. We are currently experiencing the reunification of this objective in both Human Sciences and Formal and Experimental scientific fields, which implies a change of drift in Human Sciences research. Likewise, there seems to be a convergence between the objectives of the Human Sciences and Science Education and the disciplines that make up Cognitive Science. In addition to the interest in defining different aspects of the mind and its interaction with the environment and other entities in the world, more and more discussions are emerging about the prescriptive forms of these relationships. These advances are providing tools to support theories about the social and cultural dimensions of being human. As shown in the chapters that make up this section, Teaching and Education are the central axes of this convergence between descriptive and prescriptive knowledge.
The chapter by Prof. Agustín Adúriz-Bravo and Dr. Leonardo González Galli (Chap. 52, “Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching of School Scientific Explanation and Argumentation”) shows the role of teachers in fostering students’ scientific explanations and arguments through explanatory and argumentative models of paradigmatic cases of science (epitomes) based on abduction. 
In particular, an analysis of Darwin’s Origin of Species is presented to show how several theoretical propositions contained in that book can be reconstructed as the conclusions of pieces of abductive reasoning that, in many cases, use intermediate analogies.
Prof. Phil Seok Oh proposes in his chapter (Chap. 50, “Abduction in Earth Science Education”) that abduction is a genuine inquiry model for earth science education. The main reasons to highlight abduction are its characteristics of applicability, creativity, evaluation, and selectivity, which enable this reasoning to cope with the contingency of experimentation. In this sense, Oh proposes the method of multiple working hypotheses (MMWH), which can contribute to a balanced scientific literacy through teaching and learning based on an abductive characterization of inquiry.
In another line, but in the same argumentative direction, the chapter by Prof. John Shook (Chap. 51, “Abductive Inquiry and Education: Pragmatism Coordinating the Humanities, Human Sciences, and Sciences”) addresses the need to relate Human and Scientific Sciences to build a pedagogy that simultaneously contributes to scientific inquiry and cultural advancement. The inquiry is characterized abductively to articulate a bridge that unites and coordinates Human Sciences and the Formal and Experimental Sciences through education.
Finally, the work of Dr. Alger Sans Pinillos (Abductive Irradiation of Cultural Values in Shared Spaces: the Case of Social Education Through Public Libraries) analyzes how some spaces or places become distributors of cultural values. The Public Library is taken as the focus of reflection. The relevance of these places lies in the fact that, together with a direct
relationship with their activities, they become structures of irradiation of values considered positive.
References
Apel, K.-O. (2016). Der Denkweg von Charles S. Peirce. Eine Einführung in den amerikanischen Pragmatismus. Suhrkamp Verlag.
Bergman, M. (2012). Pragmatism as a communication-theoretical tradition: An assessment of Craig’s proposal. European Journal of Pragmatism and American Philosophy, 4(1), 208–221.
Cooren, F. (2014). Pragmatism as ventriloquism: Creating a dialogue among seven traditions in the study of communication. Language Under Discussion, 2(1), 1–26. https://doi.org/10.31885/lud.2.1.239
El-Hani, C. N., & Mortimer, E. F. (2007). Multicultural education, pragmatism, and the goals of science teaching. Cultural Studies of Science Education, 2, 657–702. https://doi.org/10.1007/s11422-007-9064-y
Hintikka, J. (2007). Socratic epistemology. Cambridge University Press.
James, W. (1978). Pragmatism. In Pragmatism and the meaning of truth (pp. 1–166). Harvard University Press.
James, W. (1991). The will to believe and other essays in popular philosophy. In Psychology: Briefer course, The will to believe, Talks to teachers and to students, Essays (pp. 445–704). Library of America.
Myers, G. E. (1997). Pragmatism and introspective psychology. In R. A. Putnam (Ed.), The Cambridge companion to William James (pp. 11–24). Cambridge University Press.
Paavola, S. (2021). Practical abduction for research on human practices: Enriching rather than testing a hypothesis. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action. Studies in applied philosophy, epistemology and rational ethics. Springer. https://doi.org/10.1007/978-3-030-61773-8_2
Peirce, C. S. (1958). In C. Hartshorne, P. Weiss, & A. W. Burks (Eds.), Collected papers of Charles Sanders Peirce (pp. 1931–1935). Harvard University Press.
Peterson, T. (2016). Contemporary approaches to a pedagogy of process. Semiotica, 2016(212), 7–26. https://doi.org/10.1515/sem-2016-0129
Polkinghorne, D. (1988). Narrative knowing and the human sciences. State University of New York Press.
Prawat, R. S. (1999). Dewey, Peirce, and the learning paradox. American Educational Research Journal, 36(1), 47–76. https://doi.org/10.3102/00028312036001047
Rodgers, C. (2002). Defining reflection: Another look at John Dewey and reflective thinking. Teachers College Record, 104(4), 842–866. https://doi.org/10.1111/1467-9620.00181
Sans Pinillos, A. (2022). Neglected pragmatism: Discussing abduction to dissolute classical dichotomies. Foundations of Science, 27, 1107–1125. https://doi.org/10.1007/s10699-021-09817-x
Schiller, F. C. S. (1907). Studies in humanism. Macmillan and Co., Limited.
Sundin, O., & Johannisson, J. (2005). Pragmatism, neo-pragmatism and sociocultural theory: Communicative participation as a perspective in LIS. Journal of Documentation, 61(1), 23–43. https://doi.org/10.1108/00220410510577998
West, C. (1989). The American evasion of philosophy: A genealogy of pragmatism. The University of Wisconsin Press.
Abduction in Earth Science Education
50
Phil Seok Oh
Contents
Introduction
What Is Abduction?
  Abduction as Ampliative Reasoning
  Abduction as Creative Reasoning
  Abduction as Evaluative and Selective Reasoning
The Nature of Earth Science and Abduction as an Earth Scientific Practice
  Earth Science as a Historical and Interpretive Science
  Earth Science as a Modeling-Based Science
  Earth Science as a Systems Science
Abduction in the Earth Science Classroom
Learning Strategies for Abductive Inquiry
  The Method of Multiple Working Hypotheses
  Modeling
  Narrative Explanations
  Systemic Approaches
Teaching Strategies Supporting Students’ Abductive Inquiry Learning
Conclusions
References
Abstract
Epistemic goals and practices of earth science are distinct from those of other domains of science, especially experimental sciences such as physics and chemistry. Therefore, abduction is employed as a type of reasoning specific to earth scientific problem-solving and inquiry. This chapter revisits the meaning of abduction as ampliative, creative, and evaluative and selective reasoning. It also discusses how well abduction fits with the nature of earth science as a historical and interpretive science, a modeling-based science, and a systems science. The abductive inquiry model (AIM), an instructional model for teaching students earth science through discipline-specific practices of inquiry, is then introduced on the basis of previous studies on abduction and earth science education. Learning and teaching strategies for facilitating abductive inquiry in the earth science classroom, such as the method of multiple working hypotheses (MMWH), modeling, narrative explanations, and systemic approaches, are suggested as well. Finally, a case is made for how earth science education can contribute to the development of more balanced scientific literacy for students by using abduction as a reasoning practice for teaching and learning earth science through inquiry.
Keywords
Abduction · Reasoning · Inquiry · Earth science · Earth science education
P. S. Oh, Department of Science Education, Gyeongin National University of Education, Anyang, Gyeonggi-do, Republic of Korea
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_48
Introduction
Epistemic practices of a scientific discipline are defined by how scientists work to solve problems in that discipline, which is in turn characterized by the goal of inquiry and the nature of the phenomena of interest (Dodick & Orion, 2003; Turner, 2013). One of the important goals of studies in earth science is historical: Earth science aims to “describe the development of the earth from its earliest beginnings to its present form” (Laudan, 1987: 2). Also, earth science deals with natural phenomena on a wide variety of temporal and spatial scales. These phenomena are difficult to observe directly or to manipulate in a laboratory and often occur as a result of complex interactions among various systems within the earth and universe. Therefore, specific types of reasoning, including abduction, are employed in earth scientific inquiry. In this chapter, the meaning of abduction is revisited, with particular consideration given to how abduction fits with the nature of earth science. The ways abductive reasoning can be implemented in the science classroom to teach students earth science through a discipline-specific practice of inquiry are also described, on the basis of theoretical and empirical studies of abduction and earth science education in schools.
What Is Abduction?
It is well known that abduction was first introduced by Charles S. Peirce as a type of inference. The often-quoted formulation of Peirce’s abduction (cf. Hanson, 1958: 86), which is presented below, indicates that abduction starts from a surprising phenomenon and moves toward a hypothesis which, if true, explains the phenomenon.
Some surprising phenomenon Q is observed.
If P were true, Q would be explicable as a matter of course.
Hence, there is reason to think that P is true.
Peirce viewed abduction as the only type of inference that can introduce a new idea into scientific inquiry, because it generates an explanatory hypothesis which neither deduction nor induction can provide (Goudge, 1950). Accordingly, abduction has been advanced by Hanson (1958, 1961/1989) and other philosophers of science as a logic of scientific discovery. Indeed, abduction plays important roles in inventing novel concepts, constructing new models, and creating scientific theories (Clement, 2008; Magnani, 2001; Nersessian, 2008; Thagard, 1992). For example, Thagard (1992) pointed out that revolutions in science, such as the development of the ideas of continental drift and plate tectonics in geology, involved theoretical and conceptual changes, noting that such new theories and concepts cannot be formed by generalization from observations but instead arise by abduction. Nersessian (2008) likewise explained how J. C. Maxwell’s model of electromagnetism was developed via abduction, which included the processes of constructing, manipulating, and adapting models. Thus, abduction is conceptualized as the process of developing explanations by constructing new concepts, models, and theories in the context of scientific inquiry. A more detailed understanding of the nature of abduction can be achieved by examining three essential characteristics of abduction: abduction as ampliative, creative, and evaluative and selective reasoning.
Abduction as Ampliative Reasoning
The ampliative nature of abduction can be explained with its syllogistic representation. Below, for the purpose of comparison, the syllogistic forms of abduction, deduction, and induction are presented, each with an example of scientific reasoning (cf. Oh, 2011, 2019):

Abduction
(Q, result) A typhoon moved along an anomalous path.
(P → Q, rule) If a typhoon interacts with another typhoon, it can result in the typhoon’s anomalous path.
(P, case) Therefore, there is a reason to think that the typhoon in question interacted with another typhoon.

Deduction
(P → Q, rule) If a typhoon interacts with another typhoon, it can result in the typhoon’s anomalous path.
(P, case) A typhoon interacted with another typhoon.
(Q, result) Therefore, the typhoon in question moved along an anomalous path.

Induction
(P, case) A typhoon interacted with another typhoon.
(Q, result) The typhoon in question moved along an anomalous path.
(P → Q, rule) Therefore, if a typhoon interacts with another typhoon, it can result in the typhoon’s anomalous path.
In the syllogistic form of abduction, the first premise (Q) is called a “result,” the second premise (P → Q) a “rule,” and the conclusion (P) a “case.” Thus, abduction is a type of inference which derives a case from a result and a rule. This contrasts with deduction, which draws a result (Q) from a rule (P → Q) and a case (P), and with induction, which generates a rule (P → Q) by combining a case (P) and a result (Q) (Goudge, 1950; Magnani, 2001; Oh, 2008, 2011, 2019; Walton, 2004). Deduction is known as a self-evident and truth-conservative inference in that a deductive conclusion is necessarily true if all the premises are true and the inferential process is valid. That is, deduction draws the necessary condition (Q) as a conclusion when the sufficient condition (P) and the rule (P → Q) are given in the premises. By contrast, abduction draws out the sufficient condition (P) from the necessary condition (Q) and the rule (P → Q). Therefore, the truth of the conclusion cannot be guaranteed by the truth of the premises. In other words, the content of the abductive conclusion goes beyond what the premises logically ensure, and the conclusion is supported only partially by the premises. Accordingly, abduction is considered an ampliative form of reasoning (Goudge, 1950; Haig, 2005; Magnani, 2001; Oh, 2008, 2011; Schurz, 2008; Walton, 2004).
From a purely logical standpoint, the ampliative nature of abduction is a flaw. Simply put, in abductive reasoning, another case may produce the same result (Rhoads & Thorn, 1993). However, the ampliativeness of abduction helps scientists generate a hypothesis explaining a phenomenon whose conditions of occurrence or causative processes are unknown. Of course, induction is another type of ampliative inference. But it can only provide a generalized statement based on the regularity or pattern among individual events. For instance, von Engelhardt and Zimmermann (1988: 152) pointed out that observations on the occurrences of minerals and rocks merely gave rise to inductive generalizations, such as “Pentlandite deposits are always found in association with SiO2-poor rocks” and “Muscovite in igneous rocks occurs solely in granites.” The inductive generalizations do not, however, include any theoretical explanation concerning the relationships of the mineral occurrences with particular types of rocks. Abduction, by contrast, is able to explain observations by suggesting theoretical entities. For example, an observation of plagioclase glasses in rocks of a crater-shaped structure can be explained by abductively inferring the theory of crystallography or meteorite impacts and shock waves (von Engelhardt & Zimmermann, 1988). As such, abduction generates a theoretical explanation which transcends a given observation, which characterizes abduction as ampliative reasoning.
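The contrast among the three syllogistic forms can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions, not a formalization taken from the literature cited in this chapter: the names Rule, deduce, abduce, and induce are hypothetical, and the second rule ("unusual steering winds") is an invented placeholder added solely to show that more than one case may produce the same result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A rule P -> Q: if the case (cause) holds, the result (effect) can follow."""
    cause: str   # P, the case
    effect: str  # Q, the result


# The first rule comes from the typhoon example in the text.
# The second rule is a hypothetical placeholder added for illustration only.
RULES = [
    Rule("interaction with another typhoon", "anomalous path"),
    Rule("unusual steering winds", "anomalous path"),
]


def deduce(case, rules):
    """Deduction: rule + case -> result."""
    return sorted({r.effect for r in rules if r.cause == case})


def abduce(result, rules):
    """Abduction: result + rule -> candidate cases (hypotheses, not certainties)."""
    return sorted({r.cause for r in rules if r.effect == result})


def induce(observations):
    """Induction: observed (case, result) pairs -> candidate rules."""
    rules = {Rule(case, result) for case, result in observations}
    return sorted(rules, key=lambda r: (r.cause, r.effect))


if __name__ == "__main__":
    print(deduce("interaction with another typhoon", RULES))
    print(abduce("anomalous path", RULES))
    print(induce([("interaction with another typhoon", "anomalous path")]))
```

In this sketch, deduce returns a single result for the given case, whereas abduce returns two candidate cases for the same result. That difference is the ampliative character described above: the premises narrow the field of hypotheses but do not single out one of them.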
Abduction as Creative Reasoning
In the syllogistic schema of abduction, a rule (P → Q) already appears in the premise, and a case (P) can be drawn automatically following the structure of the inference. Therefore, a key operation in abductive reasoning is to find a rule which includes a possible explanation of a result (Q) (Bonfantini & Proni, 1983; Haig, 2005, 2008; Kapitan, 1992; Paavola, 2004, 2006). Of course, one can sometimes think of a rule automatically and instantaneously. Eco (1983) called this automatic and instantaneous process “overcoded abduction.” For example, when we find a trail of sandal prints on an empty beach, we immediately conjecture that somebody was recently walking there (Schurz, 2008). But such an automatic process is not always available. Therefore, an abductive reasoner has to find a rule from available resources or employ heuristic strategies to figure out a rule to explain a result (Haig, 2005, 2008; Oh, 2008, 2011, 2019; Thagard, 1988, 1992). Here the resources refer to various types of knowledge and information, including scientific facts, laws, models, and theories as well as knowledge from everyday experiences. These resources expand the scope of rules available in abductive reasoning, and, when combined with appropriate cognitive strategies, they yield an explanation.
To take an example from the history of science, Alfred Wegener took advantage of his observation of huge masses of ice in Greenland as an analogical resource to suggest the idea of continental drift. Even before Wegener’s time, it was widely recognized that the east coast of South America and the west coast of Africa fit together neatly. To explain this puzzling phenomenon, scientists at that time relied mostly on the so-called contraction theory or contractionism (Frankel, 2012; Oreskes, 2003). According to this theory, South America and Africa were once joined together, but because part of a larger continent collapsed as the earth cooled and shrank, the two continents were broken apart and came to lie far away from each other. That is, at that time, the vertical movement of the earth’s crust was well accepted by scientists. By contrast, Wegener’s idea was considered innovative because it proposed the possibility of the horizontal movement of the crust – the drift of the continents. In his inference to continental drift, Wegener utilized his observation of icebergs to make an analogy: “This first split in Cretaceous time into two parts, which then, like floating icebergs, drifted farther and farther apart” (Giere, 1988: 230; emphasis in original). Wegener’s reasoning can be formalized using the syllogistic schema of abduction as follows:
(Q, result) The coastal lines of South America and Africa fit together neatly.
(P → Q, rule) If the two continents were once joined together and drifted away from each other like floating icebergs, the coastal lines fit together neatly.
(P, case) Therefore, the two continents were once joined together and drifted away from each other.
Fig. 1 A diagrammatic representation of abduction
As such, abduction is characterized as creative reasoning because the course of abductive reasoning is accompanied by the development of novel ideas using a variety of resources and heuristic strategies. The creative nature of abduction is manifested in a diagrammatic form of abduction. Fig. 1, modified from Oh’s (2016, 2017, 2019) studies on modeling-based abductive reasoning, indicates that abduction generates an explanation of given evidence via activating and adapting resources. In Fig. 1, evidence refers to natural phenomena which are found to be surprising and problematic and need explaining. There are two reasons the term “evidence” is used instead of “data.” First, in scientific explanations, not all data are used as evidence. Rather, the phenomena that are evidenced by the data are more realistic targets for constructing explanations (Borsboom et al., 2021; Villanueva & Hand, 2011). Second, evidence is not a set of simple facts. It is related closely to the process and conclusion of abductive reasoning in that evidence, if explained, supports and proves the explanation (Oh, 2019). When evidence is perceived, it triggers abductive reasoning by providing clues and constraints to generate explanatory hypotheses (Paavola, 2006). In practice, evidence helps an abductive reasoner activate resources which possibly contain information about as-yet-unknown causes of the surprising phenomena. The activation of resources can occur in two ways (Oh, 2019). First, one can elicit resources from a pool of knowledge in one’s own mind. Second, if there is no suitable resource in one’s own mind, one has to search external sources of information for adequate resources. That is, the formulation of an abductive explanation occurs within the framework of an abductive reasoner’s prior store of knowledge or available resources out of which one can draw information relevant to the problem under investigation (Oh, 2019; Rhoads & Thorn, 1993).
It should be highlighted, however, that the activated resources do not explain the evidence directly; rather, they are adapted to the specific context of the problem. In other words, the construction of explanations in abductive reasoning is realized through an adaptation process which includes refining, combining, or otherwise transforming resources to better fit the particular problem in question. The adaptation can take place in various ways. Wegener’s analogy with floating icebergs to develop the idea of continental drift is an example of adapting a resource from a scientist’s personal experience to the demand of constructing an explanation. Another example is the combination of different pieces of resources, which is often evident in developing scientific explanations of how a certain geological feature has been formed. For example, earth science experts activated several pieces of knowledge, including those about spheroidal weathering, metamorphism, and crust movements, and combined them to make up an explanation of the formation process of a rock with onionskin-like structure (Oh, 2016, 2019). In addition, a variety of operations, such as manipulations of external representations and simulations with models, can contribute to the adaptation process for constructing an abductive explanation (Magnani, 2004; Oh, 2019; Thagard, 1988, 1992, 2010). According to the diagram of abductive reasoning (Fig. 1), evidence is finally explained and becomes comprehensible by virtue of the hypothesis generated via abduction. However, an abductive reasoner can keep pursuing scientific explanations when new evidence is found or a problem is newly defined.
The diagrammatic representation of abductive reasoning (Fig. 1) implies that a wide variety of resources is available to an abductive reasoner. Rhoads and Thorn (1996: 35) emphasized the importance of resources in abductive inference using a more common term, background knowledge:

This pattern of inference from effect to potential cause based on background knowledge of cause-effect relations is known as abductive inference; such reasoning is common in geohistorical investigations . . .
They also illustrated this feature with the historical case of G. K. Gilbert, who drew on his background knowledge as a resource to abductively propose two hypotheses – volcanic explosion and meteorite impact – about the origin of Coon Butte in Arizona. The availability and usability of various resources thus suggest that many novel ideas can be developed by abduction, which characterizes abduction as creative reasoning. Again, the creative nature of abduction is corroborated by the history of science in which brand-new scientific concepts, models, and theories were constructed through abductive reasoning (Nersessian, 2008; Thagard, 1992).
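The two-step process of activating and then adapting resources can be sketched in code. The Python fragment below is a deliberately simplified illustration and not an implementation of the model in Fig. 1: the functions activate and adapt, the tag-matching scheme, and the contents of the two resource pools are all assumptions introduced here. Only the three ideas combined in the onionskin-rock example and the typhoon idea are taken from the text.

```python
def activate(clues, internal, external):
    """Activate resources whose tags overlap with the clues given by the evidence.

    The reasoner's own (internal) pool is searched first; external sources are
    consulted only when nothing suitable is found internally.
    """
    hits = [r for r in internal if r["tags"] & clues]
    if not hits:
        hits = [r for r in external if r["tags"] & clues]
    return hits


def adapt(resources, problem):
    """Adapt resources by combining them into one explanation for the problem.

    Combination stands in for the richer operations named in the text
    (refining, analogy, model manipulation), which are not modeled here.
    """
    ideas = [r["idea"] for r in resources]
    return f"{problem}: " + " + ".join(ideas)


# Hypothetical resource pools; the tags are illustrative labels only.
INTERNAL = [
    {"idea": "spheroidal weathering", "tags": {"onionskin-like structure"}},
    {"idea": "metamorphism", "tags": {"onionskin-like structure"}},
    {"idea": "crust movements", "tags": {"onionskin-like structure"}},
]
EXTERNAL = [
    {"idea": "interaction with another typhoon", "tags": {"anomalous typhoon path"}},
]

if __name__ == "__main__":
    found = activate({"onionskin-like structure"}, INTERNAL, EXTERNAL)
    print(adapt(found, "rock with onionskin-like structure"))
```

The fallback from the internal to the external pool mirrors the two ways of activating resources described above, and the separate adapt step reflects the point that activated resources do not explain the evidence until they are fitted to the problem at hand.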
Abduction as Evaluative and Selective Reasoning
The ampliativeness of abduction indicates that a conclusion drawn via abduction is not necessarily true but a conjectured truth. In other words, an abductive conclusion is worthy of further pursuit (Goudge, 1950; Haig, 2005). Also, the creative nature of abduction suggests that there is a wide variety of resources with which an abductive reasoner can construct hundreds of hypothetical explanations (Oh, 2008, 2019). Therefore, in the actual practice of scientific inquiry, an abductive reasoner is apt to work with multiple resources and explanations. Given this, abduction involves a preliminary assessment of explanatory hypotheses, which characterizes abduction as evaluative and selective reasoning (Goudge, 1950; Haig, 2005, 2008; Magnani, 2001; Oh, 2008, 2011, 2019; Schurz, 2008; Walton, 2004).
There are at least three reasons abduction includes the preliminary process of evaluating and selecting hypotheses. First, because many resources are available, an abductive reasoner has to determine which resources seem promising for use in the construction of scientifically sound explanations. Second, even when pursuing multiple explanations, an abductive reasoner compares their strengths and weaknesses and uses them complementarily to refine the explanations into more sophisticated ones. Third, in scientific inquiry, at some point the time comes when the most plausible explanation is selected and set out for further investigation. Here, the plausibility of an explanation does not mean that the explanation is absolutely correct. Rather, it implies that the explanation “is fruitful and productive on the path of scientific inquiry” (Baker, 1996a: 73). Hence, it becomes an important task of an abductive investigator to choose more plausible hypotheses for ongoing scientific inquiry.
A number of criteria can be used to evaluate and select explanations in the process of abductive reasoning (Haig, 2005, 2008; Schurz, 2008; Thagard, 1988, 1992). For example, Thagard proposed that the explanatory coherence of a hypothesis or theory is determined on the basis of three criteria: consilience, simplicity, and analogy. Among these, consilience serves as a measure of how much a hypothesis or theory explains. However, a consilient theory explains not only the most facts but also the most important facts. For instance, the theory of general relativity is considered more consilient than Newtonian mechanics because it explains the perihelion of Mercury, the bending of light in a gravitational field, and the red shifts of spectral lines in an intense gravitational field (Thagard, 1988). Other criteria for evaluating and selecting abductive hypotheses include explanatory power or breadth, predictive success, empirical adequacy, consistency, precision, and economy, to name a few (Haig, 2005, 2008; Schurz, 2008; Thagard, 1988, 1992). As a historical example in earth science, J. Tuzo Wilson simultaneously considered at least three rival hypotheses about the features of the earth’s crust: the contracting earth hypothesis, the expanding earth hypothesis, and the convecting earth hypothesis. Wilson kept comparing the explanatory power and predictive capacity of these competing hypotheses before he finally converted to mobilism concerning the movement of the earth’s crust (Laudan, 1980).
In sum, abduction is a form of evaluative and selective reasoning in that it involves the process of comparing different hypotheses and choosing plausible ones. This feature is related closely to the other characteristics of abduction, namely, ampliativeness and creativeness. That is, the ampliative and creative nature of abduction allows for generating a number of new explanations about phenomena under study, which necessitates that the abductive reasoner evaluate and select explanations.
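How criteria such as consilience and simplicity might be combined when comparing rival hypotheses can be illustrated with a toy scoring scheme. The sketch below is an assumption-laden illustration and not Thagard’s actual model of explanatory coherence: the Hypothesis class, the importance weights, the simplicity penalty, and the example hypotheses H1 and H2 are all hypothetical, and analogy is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    explains: set         # facts the hypothesis accounts for
    assumptions: int = 0  # auxiliary assumptions it needs (fewer = simpler)


def consilience(h, importance):
    """Sum the importance of the facts explained, so that explaining
    important facts counts for more than explaining many minor ones."""
    return sum(importance.get(fact, 0.0) for fact in h.explains)


def score(h, importance, simplicity_penalty=0.5):
    """Toy plausibility score: consilience minus a penalty per assumption."""
    return consilience(h, importance) - simplicity_penalty * h.assumptions


def rank(hypotheses, importance, simplicity_penalty=0.5):
    """Order hypotheses from most to least plausible under the toy score."""
    return sorted(
        hypotheses,
        key=lambda h: score(h, importance, simplicity_penalty),
        reverse=True,
    )


if __name__ == "__main__":
    # Purely illustrative facts, weights, and hypotheses.
    importance = {"fact_a": 1.0, "fact_b": 1.0, "fact_c": 2.0}
    h1 = Hypothesis("H1", {"fact_a", "fact_b"}, assumptions=1)
    h2 = Hypothesis("H2", {"fact_b", "fact_c"}, assumptions=2)
    for h in rank([h1, h2], importance):
        print(h.name, score(h, importance))
```

With these illustrative numbers, H2 ranks above H1 even though it needs more assumptions, because it explains the more heavily weighted fact. A ranking of this kind only marks which hypothesis is worth pursuing first; as noted above, plausibility in abduction does not mean that the selected explanation is correct.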
The Nature of Earth Science and Abduction as an Earth Scientific Practice
Earth science is a scientific discipline which investigates the earth’s materials and processes as well as its place in the solar system and universe (National Research Council [NRC], 2012). Earth science is also considered a systems science which addresses the earth as a set of interacting subsystems and as a part of larger systems (Ben-Zvi-Assaraf & Orion, 2005, 2010a, 2010b; Earth Science Literacy Initiative [ESLI], 2010; Finley et al., 2011). Therefore, earth science involves geology, meteorology, climatology, hydrology, oceanography, and astronomy as an integrated whole and is interconnected with other areas of science such as physics, chemistry, and biology (Bokulich & Oreskes, 2017; Dolphin & Dodick, 2014; Orion & Ault, 2007). Due to this integrated and interdisciplinary nature, earth science is called by several interchangeable names including earth sciences, geoscience or geosciences, earth system or earth systems science, and earth and space science (Bokulich & Oreskes, 2017; Next Generation Science Standards [NGSS] Lead States, 2013; NRC, 1996). Earth science has been a part of the school curriculum since the 1890s (Finley et al., 2011) and is now recognized as an equally important subject among its sister sciences (NGSS Lead States, 2013; NRC, 1996). Although earth science has much in common with the other domains of science, it is also distinguished from them by the unique nature of its scientific inquiry. More specifically, in what follows, the nature of earth science is described in terms of three disciplinary characteristics – earth science as a historical and interpretive science, a modeling-based science, and a systems science. In addition, how these characteristics are closely linked with abduction is also discussed.
Earth Science as a Historical and Interpretive Science
One of the distinctive characteristics of earth science is that it is a historical and interpretive science (Ault Jr., 1998; Cleland, 2002; Dodick & Orion, 2003; Dolphin & Dodick, 2014; Frodeman, 1995; Gray, 2014; Kitts, 1977; Kleinhans et al., 2005; Oh, 2008, 2010, 2011, 2019; von Engelhardt & Zimmermann, 1988). Earth science explores historical traces registered in nature and interprets them to reconstruct past events and processes. For this reason, Baker (1996a, 1999) argued that earth scientific indices, such as rocks, fossils, sediments, and landforms, are all signs for which causative processes are inferred abductively. This feature contrasts with that of experimental sciences in that “the most frequent operations in historical science are not based on the observation of causal sequences . . . but on the observation of results [from which] an attempt is made to infer previous causes” (Simpson, 1963: 45).
According to Peirce, there are three different kinds of explanatory hypotheses in science (Goudge, 1950). First, there are hypotheses which refer to facts unobserved but capable of having been observed by an investigator. By contrast, the second kind of hypotheses refers to facts not only unobserved but physically incapable of being observed, and the third refers to entities which are both factually and theoretically incapable of being observed. Although earth science is concerned with all three kinds of hypotheses, the second group of hypotheses is most closely related to earth science as a historical science because “this is the case with all hypotheses about the past” (Goudge, 1950: 196).
Earth science often deals with retrodictive or postdictive problems which require an inference from effects to causes. Oh (2008, 2010, 2011) suggested two types of retrodictive or postdictive problems in earth science. One is the type of problem in which historical evidence is observed and its causes far in the past are inferred. As an example, earth scientists interpret materials and structural properties of rock layers to figure out geologic processes that shaped the strata over a long period of time. This sort of problem is at times classified as an “abductive historical retrodiction” problem (Rhoads & Thorn, 1993; von Engelhardt & Zimmermann, 1988). In the other type of retrodictive or postdictive problem, presently occurring phenomena are explained with concurrent but unobserved causes. For instance, earth scientists try to solve the problem of what happened in the earth’s inaccessible interior a short time before an earthquake shakes the ground. This kind of problem is also known as an “abductive contemporaneous codiction” problem (Rhoads & Thorn, 1993; von Engelhardt & Zimmermann, 1988). Taken together, the retrodictive or postdictive types of problems reveal the inextricable relationship between earth science and abduction, because the problems call for the development of hypotheses about unobserved facts, which relies heavily on reasoning that moves backward, namely, abduction (Baker, 1996a, 1999; Gray, 2014; Kitts, 1977; Kleinhans et al., 2005; Oh, 2008, 2010, 2011, 2019; Rhoads & Thorn, 1993, 1996; von Engelhardt & Zimmermann, 1988).
It should be noted that solving retrodictive or postdictive problems in earth science is influenced by underdetermination by available evidence (Kleinhans et al., 2005; Oh, 2011; Turner, 2005). Underdetermination means that explanations cannot be firmly established by evidence. For example, fossil records cannot provide enough information concerning terrestrial animals migrating seasonally between dry upland areas and wetter lowland areas, for conditions in dry upland areas are not well-suited to fossilization (Turner, 2005). As implied in this example, underdetermination arises from the nature of evidence studied in earth science (Kleinhans et al., 2005; Oh, 2008, 2011; Turner, 2005). First of all, not all events in the past and present are recorded in the earth’s environment, and some phenomena do not permit direct detection or observation. Even when events are recorded, the records may disappear or be deformed by earth processes such as weathering, erosion, metamorphism, flooding, glaciation, mountain-building, and climate change, and what can be recovered is further limited by current techniques for detecting and measuring. Further, because many events and phenomena occurring in the earth are systematically interconnected, there are a number of conditions which can possibly result in the same effect. As such, only partial and fragmented evidence is available to earth scientists, and therefore underdetermination becomes a typical problem in earth scientific inquiry. Kleinhans et al. (2005: 307) explained this as follows:

Earth scientists are usually confronted with the predicament that they lack sufficient nonambivalent data in order to construct a tenable explanation of the course of events. They have to infer from present situations to past ones, or from a limited set of observations to a hypothesis or theory. The empirical data gathered by earth scientists often leave room for a wide range of different, incompatible hypotheses.
However, the notion of underdetermination plays a central role in earth scientific inquiry, since it makes apparent that complete physical causal-law explanations are intrinsically out of reach in earth science (Baker, 1996a; Kleinhans et al., 2005; Raia, 2005, 2008). Given the underdetermination by available evidence, earth scientists employ a unique type of reasoning, abduction, which can draw out an ampliative explanation of limited evidence. In other words, abduction helps earth scientists overcome the underdetermination problem by allowing them to generate scientific explanations beyond the limit of evidence. This is why abduction is considered a discipline-specific practice of reasoning which fits best with earth science. As Kleinhans et al. (2005: 311) argued, “most earth-scientific explanations are the result of abductive inference.”
Earth Science as a Modeling-Based Science
There is wide agreement that developing and using models (modeling) is a central practice in science (Magnani & Bertolotti, 2017; Morgan & Morrison, 1999; NGSS Lead States, 2013). A model is viewed as a linguistic or nonlinguistic representation of a natural phenomenon or an idea about a phenomenon (Gouvea & Passmore, 2017; Oh, 2019; Oh & Oh, 2011). However, the linguistic form of a model is not necessarily a proposition, which is the typical mode of expression of a scientific law. Rather, a scientific model often comes in alternative modes, such as a narrative explanation, and can be better understood when expressed in combination with nonlinguistic representations such as drawings, pictures, diagrams, and mathematical descriptions. Functionally, a model serves to describe and explain natural phenomena and to communicate scientific ideas to others (Gouvea & Passmore, 2017; Oh & Oh, 2011). Modeling plays a key role in earth scientific inquiry, so much so that earth science can be considered a modeling-based science discipline (Bokulich & Oreskes, 2017; Gilbert & Ireton, 2003; Lally et al., 2019; von Engelhardt & Zimmermann, 1988). The importance of a model in earth science is associated with the characteristics of the natural phenomena dealt with in earth science (Bokulich & Oreskes, 2017; Dolphin & Dodick, 2014; Kleinhans et al., 2005; Oh, 2008, 2019). First, earth scientific phenomena come in different temporal and spatial scales. In particular, the time scale of the phenomena is much larger than the human lifespan, and it is impossible to observe directly or even indirectly earth-shaping processes and their long-term effects. Second, many earth scientific phenomena are not manipulable in nature or even in a laboratory setting. This is far different from experimental sciences such as physics and chemistry, in which a hypothesis about the natural world is tested under controlled environments.
Third, most earth scientific phenomena are results of complex interactions of a number of variables within a system and are thus barely repeatable. In order to study such hardly observable, non-manipulable, and complex phenomena, earth scientists rely heavily on modeling (Bokulich & Oreskes, 2017; Giere, 1988, 1999; von Engelhardt & Zimmermann, 1988). Developing and using models offers several benefits in conducting earth scientific inquiry. Most of all, models can be manipulated in ways that are impossible in natural or laboratory settings. Therefore, they can help earth scientists understand complex earth processes and effects. Also, modeling makes it possible to compare different
hypotheses for the same phenomena and assess their strengths and weaknesses (Kleinhans et al., 2005). A close link between abduction and modeling is suggested by many authors (Clement, 2008; Magnani, 2001, 2002, 2004; Nersessian, 2008; Oh, 2019; Thagard, 2010; Thagard & Shelley, 1997). First, scientific hypotheses and explanations generated by abduction often appear in the form of a model. In this regard, Clement (Clement, 2008, 2013; Clement & Steinberg, 2002) pointed out that explanatory models in science are constructed by abduction, rather than being deduced from axioms or induced from observational data. This feature is manifested in earth science because research in earth science is accompanied by various kinds of models for diverse purposes, including scientific explanation of natural phenomena (Bokulich & Oreskes, 2017). Oh (2016, 2017, 2018, 2019) showed that both experts’ and novices’ explanations about the features of rocks and geologic structures were produced through abductive reasoning and that these explanations were best expressed in various forms of models. In short, a scientific model is generated via abduction (Clement, 2008, 2013; Clement & Steinberg, 2002; Nersessian, 2008). Second, abduction is more often than not performed via constructing and using a model. In this regard, Thagard and Shelley (1997: 418; emphasis added) argued that there were instances of abductive reasoning that were plausibly interpreted as what they called “visual abduction”:
Suppose you return to your car at the shopping center and find a big scratch on one door. . . . You can form a mental image of a car driving up beside yours and then its driver opening a door that scratches yours. Here the explanation is a kind of mental movie . . . The abductive inference that the accident happened this way involves a mental picture of the other car’s door hitting yours. Such pictures provide an iconic representation of the event that you conjecture to have happened . . .
In the citation above, mental images, movies, and pictures all correspond to “mental models” that people make up in their minds. Later, Thagard (2010: 449) further developed the idea of visual abduction by including other nonverbal kinds of representations: Sententially, abduction might be taken to be just “If p then q; why q? Maybe p”. But, much can be gained by allowing up the p and q in the abductive schema to exceed the limitations of verbal information and include visual, olfactory, tactile, auditory, gustatory, and even kinesthetic representations.
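The sentential schema quoted above (“If p then q; why q? Maybe p”) can be illustrated with a minimal sketch. The rule base below is invented for illustration: the door-scratch rule echoes Thagard and Shelley’s example, while the shopping-cart and rain rules, the names RULES and abduce, and the string encoding of p and q are all assumptions of this sketch, not part of any cited account. The sketch covers only the generation of candidate hypotheses, not their evaluation.

```python
# Each pair reads "if p then q". The rules are hypothetical examples.
RULES = [
    ("another car's door hit yours", "scratch on the car door"),
    ("a shopping cart rolled into the car", "scratch on the car door"),
    ("it rained overnight", "wet pavement"),
]


def abduce(observation, rules):
    """Return every antecedent p whose rule 'if p then q' has q equal to the observation."""
    return [p for p, q in rules if q == observation]


# "Why q?" -> list each p that would account for q, as a tentative "Maybe p".
candidates = abduce("scratch on the car door", RULES)
for p in candidates:
    print(f"Maybe: {p}")
```

Because more than one rule can share the same consequent, the function typically returns several candidates, which mirrors the point that abduction yields plausible hypotheses rather than a single guaranteed conclusion.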
He continued to emphasize, “p and q need not be linguistic representations, but can operate in any modality” (Thagard, 2010: 454). It is not difficult to realize that the various types of representations Thagard mentioned include a scientific model. As such, our understanding of abduction can be expanded when it is recognized that abductive reasoning often occurs within the process of developing and using models. According to Thagard and Shelley (1997), the visual abduction is natural for many people and has cognitive advantages as compared with abduction based only on a verbal or sentential representation. Magnani (2002, 2004) also argued that the action of manipulating models could function as an enormous source of information to provide otherwise unavailable knowledge to abductive reasoners. The abductive
reasoning accompanied by modeling is called “model-based abduction,” “manipulative abduction,” or “modeling-based abductive reasoning” (Magnani, 2002, 2004; Oh, 2019). In particular, Oh (2016, 2017, 2019) conceptualized modeling-based abductive reasoning in the context of earth scientific problem-solving and provided empirical evidence for the close relationship between abduction and modeling described thus far.
Earth Science as a Systems Science
Current scientific literacy requires the notion of the earth as a system which is composed of several interacting subsystems (Ben-Zvi-Assaraf & Orion, 2005, 2010a, 2010b; ESLI, 2010; Finley et al., 2011). As an example, ESLI (2010) has proposed nine big ideas and subordinate concepts of earth science that everyone should know in order to be an earth-science-literate citizen. One of the big ideas for earth science literacy is that “Earth is a complex system of interacting rock, water, air, and life” (ESLI, 2010: 6). This idea is supported by eight concepts about the features of the earth as a system (ESLI, 2010: 6):
• The four major systems of Earth are the geosphere, hydrosphere, atmosphere, and biosphere.
• All Earth processes are the result of energy flowing and mass cycling within and between Earth’s systems.
• Earth exchanges mass and energy with the rest of the Solar System.
• Earth’s systems interact over a wide range of temporal and spatial scales.
• Regions where organisms actively interact with each other and their environment are called ecosystems.
• Earth’s systems are dynamic; they continually react to changing influences.
• Changes in part of one system can cause new changes to that system or to other systems, often in surprising and complex ways.
• Earth’s climate is an example of how complex interactions among systems can result in relatively sudden and significant changes.
Obviously, the idea of the earth system and its supporting concepts reflect the recognition of earth science as a systems science. A systems science deals with a system which is “defined as a functionally related assemblage of interacting, interrelated, or interdependent elements forming a complex whole” (Shaked & Schechter, 2017: 9).
In the same way, earth science addresses complex system properties embedded in earth scientific phenomena, such as nonlinearity, emergence, self-organization, and evolution (Ghil, 2019; Kleinhans et al., 2005; Raia, 2005, 2008). In other words, earth scientific phenomena cannot be addressed properly by the deterministic causality common in experimental sciences that depend mainly on linear-mono-causal explanations (Raia, 2005, 2008). Instead, earth scientific phenomena often emerge from the nonlinear relationships among components of
the earth system, and scientific explanations of them involve multiple processes with complex interactions (Schumm, 1991; Stillings, 2012). The systemic nature of earth processes opens up a variety of explanations for complex system behaviors. Therefore, it is necessary for earth scientists to suggest “outrageous” hypotheses by virtue of abduction (Kleinhans et al., 2005). Earlier, Davis (1926) advocated the value of outrageous hypotheses based on the fact that many progressive concepts in earth science, such as Wegener’s concept of wandering continents, started with conjectures which were once considered daring and outrageous. He further elaborated on the reasoning in earth science: The very foundation of our science is only an inference; for the whole of it rests on the unprovable assumption that, all through the inferred lapse of time which the inferred performance of inferred geological processes involves, they have been going on in a manner consistent with the laws of nature as we know them now. (Davis, 1926: 465–466)
It is thus obvious that the complexities embedded in events and phenomena occurring in the earth’s systems are properly addressed with a variety of outrageous hypotheses inferred by abduction. This was also noted by Baker (1996b: 212), who indicated that “the concept of the outrageous geological hypothesis, with its emphasis on anomalies, is closely allied to Peirce’s concept of abduction.” In addition, earth scientists need to work with multiple hypotheses so that they can compare diverse scenarios resulting from different processes and interactions among several factors in the earth system (Kleinhans et al., 2005). As described previously, abduction allows earth scientists to generate a number of bold explanations, and therefore it is aptly described as a cognitive practice adequate to reasoning about the earth as a complex system. In sum, the characterization of earth science as a historical and interpretive science, a modeling-based science, and a systems science reveals epistemic goals and practices of earth science that are different from those of other domains of science. Abduction enables earth scientists to achieve the goal of reconstructing past earth processes by interpreting evidence recorded in the earth’s environment. This reasoning practice is best suited to earth scientific inquiry and distinguishes earth science from its sister sciences.
Abduction in the Earth Science Classroom
The history of science education is that of a continuous effort to reflect what scientists actually do in the school curriculum (Rudolph, 2019). In the science education reform from the 1950s through the 1970s or 1980s, the processes of science were emphasized over the simple mastery of scientific facts. The processes of science are the intellectual skills needed for scientific inquiry, including basic skills such as observing, measuring, predicting, and inferring, and integrated skills such as formulating hypotheses, controlling variables, and experimenting (American Association for the Advancement of Science [AAAS] Commission on Science Education, 1971). These skills were believed to be transferable across
different science domains and even to everyday life. The idea of teaching students the processes of science underpinned the development of a new science curriculum named Science – A Process Approach (SAPA). Gagné (1966: 49; emphasis in original), one of the scholars playing the leading role in developing SAPA, explained this feature of the process-based approach to teaching science: The most striking characteristic of these materials is that they are intended to teach children the processes of science rather than what may be called science content. That is, they are directed toward developing fundamental skills required in scientific activities. . . . The goal, however, is not an accumulation of knowledge about any particular domain, such as physics, biology, or chemistry, but competence in the use of processes that are basic to all science.
In the 1990s, the US National Science Education Standards (NRC, 1996) presented a broader view of scientific inquiry. The Standards defined inquiry as “the diverse ways in which scientists study the natural world and propose explanations based on the evidence derived from their work” (NRC, 1996: 23). Inquiry also referred to the activities of students learning science in school. In other words, the Standards conceptualized inquiry as one of “the intellectual and cultural traditions that characterize the practice of contemporary science” (NRC, 1996: 2) and argued that school science had to reflect scientific inquiry in its own curriculum. More recently, the Next Generation Science Standards (NGSS Lead States, 2013), a new standards document for school science education in the USA, suggested that students should learn science through the practices of scientists. The eight science practices in the NGSS are asking questions, developing and using models, planning and carrying out investigations, analyzing and interpreting data, using mathematics and computational thinking, constructing explanations, engaging in argument from evidence, and obtaining, evaluating, and communicating information. These practices represent what scientists do as they engage in inquiry into the natural world, and they are also recommended as student activities for learning science in school. Nevertheless, the curricular approaches to reflecting scientific inquiry in school science education are limited in the sense that they do not consider practices unique to different domains of science (Ault Jr. & Dodick, 2010; Ault Jr., 2015; Rudolph, 2019). Ault criticized the generic processes-based or standards-based approach to science education for failing to respect the fundamental diversity of disciplinary concepts and methods.
He maintained instead that it was necessary to specify practices discipline by discipline and match appropriate practices to distinctive problems in order to meet the diverse challenges of scientific inquiry in different contexts. As discussed so far, abduction is a reasoning process characterizing earth science and should be considered an authentic form of earth scientific practice that science learners are to experience in school. In this regard, Gray (2014) identified abduction as one of the expanded languages needed to address historical sciences on an equal footing with experimental sciences in the science classroom and argued for inquiry-based activities derived from the characteristics of historical sciences. Further, the abductive inquiry model (AIM) has been proposed and used to support earth science teachers and students in implementing abduction as they teach and learn earth science (Oh, 2008, 2011, 2017, 2018, 2019). The AIM is an instructional model specific
Fig. 2 The abductive inquiry model (AIM)
to earth science, which is pedagogically transformed and recontextualized from authentic practices of earth scientists. It consists of four major phases: exploration, examination, selection, and explanation (Fig. 2). In the phase of exploration of the AIM, students explore earth scientific phenomena and find problems to be answered through inquiry. The exploration can be carried out in diverse ways including hands-on experiments with earth materials, geological field trips, astronomical observations, and dealing with real-time data about natural events on the earth. As a result, students are expected to discover surprising phenomena and identify them as evidence to be explained by abductive reasoning (Oh, 2008, 2011). For example, in Oh’s (2011) study, students were given tracking data of four typhoons and guided to transform the data to graphs so that they could find that all the typhoons traveled in parabolic trajectories. They were then provided data of a new typhoon moving in an anomalous trajectory which was different from the typical parabolic path of typhoons. Soon after representing the data in graphs, the students readily identified the abnormal movement of the new typhoon as a problem to be scientifically explained. The examination phase allows students to search a variety of resources and find information which seems to be promising to solve the problem. They may recall some knowledge from their experiences and past learning. At the same time, they can search external sources of information to discover disciplinary resources such as scientific laws, models, and theories. Consequently, this information-searching process becomes another learning opportunity for students to acquire disciplinary knowledge they can use as a resource for solving scientific problems. Also, during the examination phase, students adapt the resources to the context of the problem by refining and/or combining different pieces of knowledge. 
In short, the major goal of the examination phase is to examine all possible resources and, using the resources abductively, suggest hypotheses to explain the phenomena in question (Oh, 2008, 2011). For instance, Oh (2008, 2011, 2017, 2018, 2019) applied the AIM in undergraduate earth science courses with the purpose of providing preservice teachers with opportunities to practice earth scientific inquiry by solving scientific problems about rocks, geological structures, or typhoons. While the students engaged in solving the problems through abductive reasoning, they were allowed to examine diverse sorts of resources to come up with as many hypothetical explanations as possible. In practice, the students recalled relevant facts, models, and theories from their background knowledge and also searched various sources including reference books, the Internet, and professors’ lecture notes to find scientific information. Furthermore, the students employed several cognitive strategies, such as analogy, combination, and modeling, to construct new hypotheses. In the phase of selection, students evaluate the resources and explanations examined in the previous phase. They are supposed to compare multiple candidates, assess each of them from scientific viewpoints, and choose the most plausible ones. If any new problems are detected in this step, students may go back to the previous phases and go over the processes again. That is, the phases in the AIM represent a cyclic or recursive procedure of earth scientific inquiry in which explanatory hypotheses are generated and elaborated continuously via ongoing development and evaluation (Oh, 2008, 2011). Different kinds of evaluative criteria were used by students in this phase of the AIM. For instance, Oh’s (2011) study identified how a group of students, in their attempt to explain a typhoon’s anomalous path, suggested the hypothesis that upper level winds had changed the typhoon’s movement. More notably, even though they failed to locate evidence for the appearance of upper level winds during the target dates, the students decided to keep the hypothesis because they found it scientifically sound that upper level winds could affect other atmospheric phenomena such as a typhoon.
That is, theoretical coherence was used here as an important criterion to select the hypothesis as a probable explanation of the typhoon’s abnormal movement. Additionally, in Oh’s (2017) study, more than a few students mistook a weathered sedimentary rock for basalt simply because the rock had a lot of holes. However, on finding that the rock also had a number of small and large, variously shaped grains, some students discarded their hypothesis that the rock was basalt. In this case, empirical consistency was used as a criterion, in that contradictory evidence played an important role in evaluating a hypothesis. Explanation is the final phase of the AIM, in which students propose complete explanations of the evidence using the resources and hypotheses selected in the preceding step (Oh, 2008, 2011). Again, students can take advantage of various resources in this step. Also, it is possible and even encouraged for students to develop multiple explanations, given the nature of earth scientific phenomena, for which a single right answer can hardly be determined by evidence. Moreover, explanations suggested in this phase often appear in the forms of models and narratives in which many earth scientific events and processes are interconnected, resulting in the observed phenomena (Oh, 2008, 2011, 2017, 2018, 2019). As discussed later in this chapter, developing and using a model and composing narratives are useful strategies for representing scientific explanations
of earth scientific phenomena. This feature was manifested when students engaged in abductive problem-solving following the phases suggested by the AIM. As implied by the descriptions above, it is hardly possible, and also undesirable, to complete abductive inquiry on earth scientific subjects within a single lesson. Instead, the AIM is best implemented as part of a longer project-based science learning experience for students. Project-based instruction is believed to be an effective way of teaching and learning science in which students can exercise scientific practices of inquiry (Holthuis et al., 2018; Krajcik et al., 2007). However, an older approach to project-based learning is limited in that it emphasizes scientific processes over the content of science (O’Neill & Polman, 2004). If project-based learning is combined with the AIM, students can learn both the content and the processes of science because the AIM encourages learners to search for and use disciplinary knowledge as a resource for problem-solving. Therefore, when science teachers organize school science programs, the AIM can be considered a promising model of instruction to enact earth scientific practices for student learning. In addition, the AIM should be used as a stepping stone for students to learn disciplinary practices of earth science and develop their own inquiry practices by renewing and expanding the model (Emden, 2021).
Learning Strategies for Abductive Inquiry
Learning strategies refer to ways of using a set of skills to optimize the process and outcome of learning. Competent learners often employ various strategies to achieve higher learning goals. Earth science learners can also make effective use of abduction by adopting learning strategies that are congruent with the nature of earth science when they learn earth science through abductive inquiry. This chapter proposes four learning strategies for abductive inquiry that can be used in the context of earth science learning: the method of multiple working hypotheses, modeling, narrative explanations, and systemic approaches.
The Method of Multiple Working Hypotheses
More than a century ago, the method of multiple working hypotheses (MMWH) was advocated by T. C. Chamberlin (1890), an American geologist and educator, as an intellectual method practiced in science. The MMWH involves the development of several hypotheses that can possibly explain the phenomena being studied. This method is different from the “method of the ruling theory,” which is directed to the finding of facts supporting the theory, and the “method of working hypothesis,” in which facts are collected for demonstrating the hypothesis. By contrast, the MMWH allows for bringing up every rational hypothesis about the phenomena and forbids scientists from fastening their affection on any one explanation (Chamberlin, 1890). Laudan (1980) pointed out advantages of employing the MMWH
in scientific inquiry. First, by adopting the MMWH, scientists can avoid being attached unconsciously to one hypothesis. Second, the MMWH suggests lines of investigation that might otherwise be overlooked. Third, hypotheses can be refined and further developed if they are compared with one another. Fourth, elements from one hypothesis can be integrated with those from other hypotheses to develop a more sophisticated hypothesis. According to Chamberlin (1890), true explanations of natural phenomena are necessarily complex and should be encouraged by the MMWH. This is because “an adequate explanation often involves the co-ordination of several agencies, which enter into the combined result in varying proportions” (Chamberlin, 1890: 756). His notion has been repeated recently by Elliott and Brook (2007: 612) who pointed out that “the greatest value of Chamberlin’s MMWH . . . lies in the construction of hypotheses and the testing of complex systems in settings where explanations are not necessarily mutually exclusive.” Thus, considering the fact that earth science deals with a number of system components and their complex interactions, it is obvious that the MMWH is essential to earth science (Dolphin & Dodick, 2014; Kleinhans et al., 2005). The purpose of using the MMWH is also compatible with the characteristics of abduction, for abduction enables generating many probable hypotheses about earth processes. Hence, the MMWH should be considered a useful strategy to be employed by learners engaging in abductive inquiry to solve earth scientific problems. Oh’s (2018) study is an example of research in which undergraduate students made use of the MMWH to solve a scientific problem about a spheroidally weathered rock. In this study, students initially suggested a number of hypotheses about the rock each of which included a distinctive geological process such as exfoliation, magma intrusion, xenolith formation, metamorphism, and different types of weathering. 
However, when asked to compare and contrast the multiple hypotheses, the students more readily identified the scientifically adequate ones. Likewise, earth science learners can benefit from using the MMWH in abduction-based inquiry on earth scientific phenomena.
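The comparison step of the MMWH can be sketched in code. This is only an illustrative simplification, not a model used in the cited studies: it reuses the rock example described earlier in this chapter (a holed rock mistaken for basalt until its grains were noticed), and the Hypothesis class, the compare function, and the feature sets assigned to each hypothesis are assumptions made for this sketch. Hypotheses contradicted by the evidence are dropped (empirical consistency), and the rest are ranked by how much of the evidence they account for.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    name: str
    explains: set                                  # observations the hypothesis accounts for
    rules_out: set = field(default_factory=set)    # observations it cannot accommodate


def compare(hypotheses, evidence):
    """Drop hypotheses contradicted by the evidence; rank the rest by explained evidence."""
    surviving = [h for h in hypotheses if not (h.rules_out & evidence)]
    return sorted(surviving, key=lambda h: len(h.explains & evidence), reverse=True)


# Hypothetical, highly simplified feature sets for the rock example.
evidence = {"many holes", "variously shaped grains"}
hypotheses = [
    Hypothesis("basalt", explains={"many holes"},
               rules_out={"variously shaped grains"}),
    Hypothesis("weathered sedimentary rock",
               explains={"many holes", "variously shaped grains"}),
]

for h in compare(hypotheses, evidence):
    print(h.name)
```

Keeping all hypotheses in one list and filtering them together, rather than testing a single favored hypothesis, is the design choice that corresponds to Chamberlin’s warning against becoming attached to one explanation.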
Modeling
As described previously, there is a close link between abduction and modeling. Therefore, it is certain that developing and using models (i.e., modeling) is a learning strategy best suited for abductive inquiry. Modeling is recognized as one of the important practices of science, and scholars in the field of science education have proposed various ways of enacting modeling in the science classroom. For example, Clement (1989) suggested a cycle of model generation, evaluation, and modification (GEM), which has been used in a number of studies on modeling in the science classroom (e.g., Khan, 2007; Núñez-Oviedo & Clement, 2019). Additionally, Campbell et al. (2013: 110) proposed five modeling pedagogies that could be applied for meeting different purposes of science instruction: (1) exploratory modeling in which students investigate properties of a preexisting model; (2) expressive modeling in which students express their ideas by creating
new models or using existing models; (3) experimental modeling in which students form a hypothesis and prediction from a model and test them with an experiment; (4) evaluative modeling in which students compare alternative models, assess their merits and limitations, and select the most adequate models; and (5) cyclic modeling in which students engage in ongoing processes of constructing, evaluating, and improving models. The GEM cycle and other modeling pedagogies all reflect diverse modeling practices in science and simplify them to suit student learning in the science classroom. What the modeling strategies for science learners have in common is the process of students’ constructing and manipulating their own models. The construction and manipulation of a model can take place mentally, physically, or in both ways so that a mental and a physical model can interact with each other. For example, a mental model can be used as a preliminary step for creating a physical model (Bokulich & Oreskes, 2017). Also, a physical model can serve to record features of a mental model and to develop an external representation too complex to appear at once in working memory (Clement, 2013). Nersessian (2013) referred to the interaction between a mental and a physical model as the coupling of internal and external representations. According to her, manipulating physical models provides details and constraints for developing mental models. Conversely:
Mental models embody and comply with the constraints of the phenomena being reasoned about, and enable inferences about these through simulation processes. . . . Simulative mental modeling can lead to potential empirical insights . . . by creating new states or situations that parallel those of the real world. (Nersessian, 2013: 407)
The simulation with a model is thus believed to have many strengths in solving scientific problems. Particularly, in the context of earth science, Frodeman (1996) conceptualized a mental simulation employed by field geologists as envisioning. Envisioning is a type of visual intelligence whereby geologists take a set of marks in natural objects such as rocks as signs of past events and processes. For example, geologists perform “grasping the nature of an object or pattern by mentally rotating, unfolding, or completing it in space” (Frodeman, 1996: 425). In such a way, envisioning involves a mental construction and simulation of a model with the aim of virtually enacting causative processes resulting in a phenomenon in question. Hence, envisioning is considered a type of qualitative reasoning with a model (Forbus, 2008) which has a similarity to the “imagistic simulation” suggested by Clement (2008) or the “conceptual simulation” by Trickett and Trafton (2007). This mental model-based qualitative reasoning can also be a promising strategy for students to solve earth scientific problems and make sense of events and phenomena on the earth. In fact, Oh’s (2019) study demonstrated the usefulness of a simulation with a model. In this study, an undergraduate student proposed, as a result of her abductive inquiry, a scientifically sound model of geological units found in an ancient stream system. Most importantly, she simulated her model with a diagram and hand gestures in a way that supported the visualization of unobserved geologic processes and persuaded her peers to agree with her explanations.
Envisioning as a mental construction and simulation of earth scientific models has many characteristics in common with thought experiments. However, envisioning differs from thought experiments in that, while thought experiments are not necessarily expressed in the form of a narrative, narrative is a dominant feature of envisioning (Klassen, 2006). Narrative is in fact another strategy for learning earth science, discussed next.
Narrative Explanations
Explanations in historical sciences often take on the character of genetic explanations. The genetic explanation describes how a subject has developed over time by “[setting] out the sequence of major events through which some earlier system has been transformed into a later one” (Nagel, 1961: 25). Another central characteristic of the genetic explanation is that it is often concerned with a particular occurrence rather than regularities among recurring events. That is, a genetic explanation organizes causally related events up to the genesis of a particular phenomenon using a rather long narrative account (Kleinhans et al., 2005; Nagel, 1961; Norris et al., 2005). There is no doubt that this feature is essential to historical hypotheses generated by abduction. Also, because a genetic explanation usually takes the form of narratives, narrative explanations are considered a genuine form of explanations in earth science (Kleinhans et al., 2005; Norris et al., 2005). The minimum structure of a narrative explanation consists of an initial state, action, and final state (Klassen, 2006). For example, the following simple narrative explanation of the Snowball Earth hypothesis includes descriptions of the initial and final states of the global climate and factors affecting the climate (e.g., albedo, thickness of atmosphere, amount of icepack) as well as the action that caused changes in the factors (cf. Currie, 2014: 1166–1167):
The earth’s climate was stable in the Neoproterozoic era. At the late Neoproterozoic, the supercontinent Rodinia broke up and the megacontinent Gondwana began to form. Most continents clustered at the middle and lower latitudes. The landmass clustering around the tropics lowered temperature by increasing albedo and thinning the atmosphere. The lower temperatures increased icepack cover, forming a positive feedback loop between lowering temperature, larger icecaps, and higher albedo. Eventually, the entire earth froze over.
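The minimum structure of a narrative explanation (initial state, action, final state) can be written down as a small data structure. The sketch below is only one possible encoding, offered as an illustration: the NarrativeExplanation class, its field names, and the tell method are assumptions of this sketch, and the sentences are condensed paraphrases of the Snowball Earth narrative above rather than additional claims.

```python
from dataclasses import dataclass


@dataclass
class NarrativeExplanation:
    initial_state: str
    actions: list       # causally ordered events linking the two states
    final_state: str

    def tell(self):
        """Join the parts in temporal order into a single narrative string."""
        return " ".join([self.initial_state, *self.actions, self.final_state])


# Condensed paraphrase of the Snowball Earth narrative, for illustration only.
snowball = NarrativeExplanation(
    initial_state="The climate was stable.",
    actions=[
        "Continents clustered at middle and lower latitudes.",
        "Albedo rose and temperature fell.",
        "Ice cover grew, raising albedo further.",
    ],
    final_state="The entire earth froze over.",
)

print(snowball.tell())
```

Storing the actions as an ordered list keeps the sequence of events explicit, which is the feature that distinguishes a genetic, narrative explanation from a single law-like statement.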
Narrative explanations can be made more sophisticated by considering a wider range of narrative elements, such as an event token, narrator, narrative appetite, past time, structure, agency, purpose, and readers (Norris et al., 2005). However, some specific characteristics of narrative explanations in earth science should be noted. First of all, narrative explanations need not include every past event (Kleinhans et al., 2005; Nagel, 1961). As discussed earlier, antecedent conditions of earth scientific phenomena can never be observed and revealed completely. Therefore, to construct a narrative explanation of earth scientific phenomena, those events that are causally related to the development of the phenomena should be selected. In addition, a narrative explanation combines the sequence of past events with causal forces and conditions (Kleinhans et al., 2005). In other words,
constructing a narrative explanation in earth science is similar to building a “causal nexus” (Salmon, 1984), which explains an event by describing its place among the past events that are causally related to its occurrence (Glennan, 2010).
Just as narratives allow earth scientists to explain natural phenomena without reducing them to physical causal-law explanations, earth science learners can benefit by constructing their abductive hypotheses in narrative forms. In Oh’s (2011, 2017, 2018, 2019) studies, in which the AIM was applied to earth scientific inquiry, students generated narrative forms of explanations even though no explicit instruction about narrative explanations was given. This result helps demonstrate how essential narrative explanations can be to earth science. However, it is even better if earth science learners are guided to develop their own narrative explanations by integrating different events and processes into a holistic story. In such explanations, one event or process is not necessarily a cause of another event or process. Instead, it is important that multiple events together become components of a larger causal story and explain the evidence without contradicting one another (Oh, 2019). For example, Oh (2011) showed that student-generated abductive explanations involved causal chains of several intervening events, all of which contributed together to the development of a typhoon’s erratic path.
Systemic Approaches
While it is widely recognized that earth science is a systems science, there are considerable variations in characterizing systemic approaches appropriate to earth scientific inquiry (Scherer et al., 2017). That is, the disciplinary thinking of earth science as a systems science goes by different names, including earth systems concepts (Finley et al., 2011), systems thinking (Lally et al., 2019; Scherer et al., 2017), and system thinking skills (Ben-Zvi-Assaraf & Orion, 2005, 2010a, 2010b). Regardless of what they are called, these habits of mind contrast sharply with reductionist thinking (Shaked & Schechter, 2017). In systems thinking, seeing the whole beyond the parts and seeing the parts in the context of the whole are central. According to reductionist thinking, on the other hand, the whole can be broken down into its parts and put back together from them. Additionally, systems thinking holds that the whole emerges from the interactions among its parts and that the parts are related to one another through complex multiple influences. By contrast, in reductionist thinking, the parts are believed to be related through a simple cause-effect relationship. As described earlier, earth scientific phenomena cannot be explained by a unique cause or a linear sequence of causes and effects, because they often emerge from complex interactions within the earth’s systems (Kleinhans et al., 2005; Raia, 2005, 2008). Therefore, earth scientific inquiry can be best implemented by adopting systems thinking. While the general features of systems thinking can apply to earth scientific inquiry, Ben-Zvi-Assaraf and Orion (2005, 2010a, 2010b) suggested eight “system thinking skills” in the context of earth system education:
• The ability to identify the components of a system and processes within the system
• The ability to identify relationships among the systems’ components
• The ability to identify dynamic relationships within the system
• The ability to organize the systems’ components and processes within a framework of relationships
• The ability to understand the cyclic nature of systems
• The ability to make generalizations
• Understanding the hidden dimensions of the system
• Thinking temporally: retrospection and prediction
The skills above clearly define the ways earth science learners should employ systems thinking to understand the dynamic properties of the earth as a system. However, the list of the skills is limited in that it places emphasis on the identification of the components of the earth system and the transformation of matter in the earth’s cycles such as the water cycle (Oh, 2019; Scherer et al., 2017). A broader framework of disciplinary thinking in earth science was provided earlier by Schumm (1991: 35–94), who enumerated ten problems or difficulties that investigators are likely to encounter during their studies of the earth:
• Time, involving the problem with how the results obtained during short-time-span studies are to be applied to long-time-span or varying-time-span problems
• Space, involving the problem with how the complexity of the subject will increase as size increases (small to large) and as scale becomes larger (low to high resolution)
• Location, involving the problem of extrapolating from one location to another
• Convergence, a situation when different causes and processes produce similar or the same effects
• Divergence, a situation when similar causes and processes or the same cause and process produce different effects
• Efficiency, which suggests that there is no desired response in a natural system when an event or series of events affect the system
• Multiplicity, which suggests that multiple causes can act simultaneously and in combination to produce a phenomenon
• Singularity, the natural variability among like things; the condition, trait, or characteristic that makes one thing different from others
• Sensitivity, the susceptibility of a system to even minor external change
• Complexity, the complex behavior of a system that has been subject to altered conditions
According to Schumm (1991), these problems or difficulties originate from the systemic nature of earth scientific phenomena and must be considered with care in any study of natural phenomena. Therefore, his framework can be used as a guide for systemic approaches to earth scientific problem-solving when students learn earth science through abductive inquiry. For example, in Oh’s (2019) study, the
only student who took a systemic approach by considering the divergent relationship between causes and effects was able to successfully create a scientific model about a channel fill structure and epsilon cross bedding in an ancient stream system. In her model, two geologic events – building and filling a stream channel and inward growing of a point bar – were produced by the same cause, namely, a meandering stream. By contrast, the other students tried in vain to explain several geologic structures separately and to search for a one-to-one correspondence between a particular geologic structure and an earth scientific model without considering systemic relations among different pieces of evidence.
Teaching Strategies Supporting Students’ Abductive Inquiry Learning
Regardless of whether science instruction is centered on student inquiry or teacher lectures, the teacher’s role in supporting student learning is important (Furtak et al., 2012; Oh, 2010; Sung & Oh, 2018; Wise & O’Neill, 2009). This principle applies equally to earth science classrooms in which abduction is implemented as a discipline-specific practice of inquiry. More specifically, four teaching strategies for abductive inquiry in the earth science classroom can be suggested on the basis of a review of the relevant literature.
First, learning a science discipline should be accompanied by engagement in the problem-solving essential to the discipline. Earth science is uniquely characterized by retrodictive or postdictive problems and by the abductive inquiry used to solve them. Therefore, teachers of earth science should build their expertise in developing retrodictive or postdictive problems about earth scientific phenomena and support students in carrying out abductive reasoning successfully, so that the students have enough opportunities to learn earth science by engaging in discipline-specific practices of scientific inquiry. Importantly, the AIM, as an inquiry model specific to earth science, can be utilized to organize practice-based earth science classrooms that are aligned with recent reforms in science education (e.g., NGSS Lead States, 2013; NRC, 2012). It is thus necessary that earth science teachers use the AIM widely across different grades and develop refined models best suited to their own classrooms.
Second, studies on abduction and earth science learning have revealed that resources available to learners play a crucial role in effective practices of scientific inquiry and problem-solving. For example, when students had or were provided with knowledge relevant to a problem, they often developed scientifically sound solutions (Oh, 2010, 2017, 2019).
It is also important that students are allowed to search for scientific knowledge and information as they engage in abductive inquiry, because the knowledge and information can be used as resources for abductive reasoning and yield a creative path for solving a problem (Oh, 2011, 2018, 2019). Therefore, earth science teachers should make appropriate resources available to students. However, Oh (2011) points out that this teacher role is not intended to buttress students’ rote memorization of science content. He argues instead:
Learning scientific knowledge should be emphasized in the context of scientific inquiry and problem solving. That is, student must be guided to take advantage of relevant knowledge throughout their inquiry processes. . . . In addition, students should be given opportunities to apply what they have learned from the science classroom to solve challenging problems through inquiry. (Oh, 2011: 428)
That is, it is the teacher’s task to empower students with enough resources to effectively perform abductive inquiry and develop scientific explanations of earth scientific phenomena. This pedagogical role can be realized in several ways, including stimulating students to activate their background knowledge, guiding them to various sources of information, and providing scientific knowledge at the very moment when it is needed (Oh, 2010, 2011, 2019). Moreover, teachers should be able to use students’ own resources in productive ways so that the student resources, over time, become more compatible with canonically understood knowledge of science about the natural world (Sung & Oh, 2018).
Third, teachers must lead students to use a variety of learning strategies. This chapter has explored cognitive strategies that are well suited to earth scientific problem-solving, such as the MMWH, modeling, narrative explanations, and systemic approaches. Earth science students need to be introduced to these strategies and encouraged to employ those that are most useful at appropriate moments of learning. For instance, students should be guided to avoid a reductionist approach and instead take a systemic approach to earth scientific problems, for they are likely to have difficulty in constructing a scientific explanation if, for example, they search for a one-to-one correspondence between cause and effect (Oh, 2019). Earth science learners can also take advantage of the MMWH to come up with many plausible explanations of a complex earth scientific phenomenon resulting from several interacting causes (Oh, 2018). Additionally, their explanations can be best expressed and communicated in alternative forms of representation such as models and narratives (Oh, 2010, 2011, 2017, 2018, 2019).
In order to facilitate students’ active use of such learning strategies, the school curriculum ought to include these strategies as important components throughout so that teachers can teach them explicitly in their earth science classrooms.
Fourth and lastly, Gray (2014: 337) argued, “Science teachers who teach historical science topics in the classroom must familiarize themselves with the unique methodologies of the historical sciences as well as the additional concepts and terminology historical science inquiries require.” For this purpose, all the above teaching strategies should be reflected in teacher education programs so that earth science teachers are taught in the same way they want to teach earth science to their students. Science teacher education programs are often criticized for addressing science content and pedagogies for science learning separately (Lewis, 2008). The NGSS claims, however, that scientific practices require coordination of both knowledge and skills simultaneously (NRC, 2012). Therefore, engaging in earth scientific inquiry and using a variety of cognitive strategies to solve earth scientific problems can be the most promising way for earth science teachers to develop their expertise in teaching earth science on the basis of students’ engagement in discipline-specific practices of inquiry.
Conclusions
Earth science has recently been gaining increased attention because of its critical role in addressing many acute problems related to the present and future environment of the earth. Accordingly, cultivating earth scientific literacy in students has become an urgent goal of earth science education (ELSI, 2010). In addition, the modern philosophy of science and the current reform of science education emphasize the diversity of inquiry methods across science disciplines (Ault Jr. & Dodick, 2010; Ault Jr., 2015; Gray, 2014; Rudolph, 2019). Nevertheless, the characterization of science in the school curriculum is mainly based on experimental sciences such as physics and chemistry. Rudolph (2019: 230) indicated, “Too often . . . we think of the scientific process, and teach it, as a general process (or set of practices) . . . [which] seems to confuse people raised with the view that science works via a step-by-step method or that an easily arranged experiment can prove what is true.” He continued to argue:
What’s needed in science education is to survey all the methodologies of science – experimental, historical, comparative, statistical, and so on – so that students can begin to appreciate the wide variety of intellectual work, all of it scientific, that leads to knowledge. (Rudolph, 2019: 230)
Earth science can contribute to such work in science education by integrating epistemic goals and practices that differ from those of experimental sciences into the school curriculum, thereby developing balanced scientific literacy in students (Ault Jr. & Dodick, 2010; Ault Jr., 2015; Dodick & Orion, 2003; Gray, 2014; Oh, 2019). This view of the importance of earth science and earth science education calls for a genuine understanding of inquiry methods in earth science. As discussed in this chapter, abduction is an ampliative, creative, evaluative, and selective form of reasoning well suited to solving retrodictive or postdictive problems on the basis of various resources and cognitive strategies. Abduction is thus a reasoning practice that distinguishes earth science from other domains of science. It can also be implemented in science classrooms in which students learn science by engaging in disciplinary practices of inquiry. Further studies on abduction and other discipline-specific forms of scientific practices should be carried out to better understand the nature of earth science and to apply earth scientific practices in ways that can facilitate student learning of earth science in school.
References
American Association for the Advancement of Science (AAAS) Commission on Science Education. (1971). The AAAS project: Science-a process approach. In E. Victor & M. S. Lerner (Eds.), Readings in science education for the elementary school (2nd ed., pp. 451–462). The Macmillan Company.
Ault, C. R., Jr. (1998). Criteria of excellence for geological inquiry: The necessity of ambiguity. Journal of Research in Science Teaching, 35, 189–212.
Ault, C. R., Jr. (2015). Challenging science standards. Rowman & Littlefield.
Ault, C. R., Jr., & Dodick, J. (2010). Tracking the footprints puzzle: The problematic persistence of science-as-process in teaching the nature and culture of science. Science Education, 94, 1092–1122.
Baker, V. R. (1996a). Hypotheses and geomorphological reasoning. In B. L. Rhoads & C. E. Thorn (Eds.), The scientific nature of geomorphology (pp. 57–85). Wiley.
Baker, V. R. (1996b). The pragmatic roots of American quaternary geology and geomorphology. Geomorphology, 16, 197–215.
Baker, V. R. (1999). Geosemiosis. GSA Bulletin, 111(5), 633–645.
Ben-Zvi-Assaraf, O., & Orion, N. (2005). Development of system thinking skills in the context of earth system education. Journal of Research in Science Teaching, 42(5), 518–560.
Ben-Zvi-Assaraf, O., & Orion, N. (2010a). System thinking skills at the elementary school level. Journal of Research in Science Teaching, 47(5), 540–563.
Ben-Zvi-Assaraf, O., & Orion, N. (2010b). Four case studies, six years later: Developing system thinking skills in junior high school and sustaining them over time. Journal of Research in Science Teaching, 47(10), 1253–1280.
Bokulich, A., & Oreskes, N. (2017). Models in geosciences. In L. Magnani & T. Bertolotti (Eds.), Springer handbook of model-based science (pp. 891–911). Springer.
Bonfantini, M. A., & Proni, G. (1983). To guess or not to guess? In U. Eco & T. A. Sebeok (Eds.), The sign of three: Dupin, Holmes, Peirce (pp. 119–134). Prentice Hall.
Borsboom, D., van der Maas, H., Dalege, J., Kievit, R., & Haig, B. (2021). Theory construction methodology: A practical framework for theory formation in psychology. Perspectives on Psychological Science, 16(4), 756–766.
Campbell, T., Oh, P. S., & Neilson, D. (2013). Reification of five types of modeling pedagogies with model-based inquiry (MBI) modules for high school science classrooms. In M. S. Khine & I. M. Saleh (Eds.), Approaches and strategies in next generation science learning (pp. 106–126). IGI Global.
Chamberlin, T. C. (1890). The method of multiple working hypotheses. Science, 148(3671), 754–759.
Cleland, C. E. (2002). Methodological and epistemic differences between historical science and experimental science. Philosophy of Science, 69(3), 474–496.
Clement, J. (1989). Learning via model construction and criticism: Protocol evidence on sources of creativity in science. In G. Glover, R. Ronning, & C. Reynolds (Eds.), Handbook of creativity: Assessment, theory and research (pp. 341–381). Plenum.
Clement, J. J. (2008). Creative model construction in scientists and students: The role of imagery, analogy, and mental simulation. Springer.
Clement, J. J. (2013). Roles for explanatory models and analogies in conceptual change. In S. Vosniadou (Ed.), International handbook of research on conceptual change (2nd ed., pp. 412–446). Routledge.
Clement, J. J., & Steinberg, M. S. (2002). Step-wise evolution of mental models of electric circuits: A “learning-aloud” case study. The Journal of the Learning Sciences, 11(4), 389–452.
Currie, A. M. (2014). Narratives, mechanisms and progress in historical science. Synthese, 191, 1163–1183.
Davis, W. M. (1926). The values of outrageous geological hypotheses. Science, 63(1636), 463–468.
Dodick, J., & Orion, N. (2003). Geology as an historical science: Its perception within science and the education system. Science & Education, 12, 197–211.
Dolphin, G., & Dodick, J. (2014). Teaching controversies in earth science: The role of history and philosophy of science. In M. R. Matthews (Ed.), International handbook of research in history, philosophy and science teaching (pp. 553–599). Springer.
Earth Science Literacy Initiative. (2010). Earth science literacy principles. http://www.earthscienceliteracy.org/es_literacy_6may10_.pdf. Accessed 1 June 2021.
Eco, U. (1983). Horns, hooves, insteps: Some hypotheses on three types of abduction. In U. Eco & T. A. Sebeok (Eds.), The sign of three: Dupin, Holmes, Peirce (pp. 198–220). Prentice Hall.
Elliott, L., & Brook, B. W. (2007). Revisiting Chamberlin: Multiple working hypotheses for the 21st century. Bioscience, 57(7), 608–614.
Emden, M. (2021). Reintroducing “the” scientific method to introduce scientific inquiry in schools? Science & Education, 30, 1037–1073.
Finley, F. N., Nam, Y., & Oughton, J. (2011). Earth systems science: An analytic framework. Science Education, 95, 1066–1085.
Forbus, K. D. (2008). Qualitative modeling. In F. van Harmelen, V. Lifschitz, & B. Porter (Eds.), Handbook of knowledge representation (pp. 361–393). Elsevier.
Frankel, H. R. (2012). The continental drift controversy: Wegener and the early debate. Cambridge University Press.
Frodeman, R. (1995). Geological reasoning: Geology as an interpretive and historical science. GSA Bulletin, 107(8), 960–968.
Frodeman, R. L. (1996). Envisioning the outcrop. Journal of Geoscience Education, 44, 417–427.
Furtak, E. M., Seidel, T., Iverson, H., & Briggs, D. C. (2012). Experimental and quasi-experimental studies of inquiry-based science teaching: A meta-analysis. Review of Educational Research, 82(3), 300–329.
Gagné, R. M. (1966). Elementary science: A new scheme of instruction. Science, 151(3706), 49–53.
Ghil, M. (2019). A century of nonlinearity in the geosciences. Earth and Space Science, 6, 1007–1042.
Giere, R. N. (1988). Explaining science: A cognitive approach. University of Chicago Press.
Giere, R. N. (1999). Science without laws. University of Chicago Press.
Gilbert, S. W., & Ireton, S. W. (2003). Understanding models in earth and space science. NSTA Press.
Glennan, S. (2010). Ephemeral mechanisms and historical explanation. Erkenntnis, 72, 251–266.
Goudge, T. A. (1950). The thought of C. S. Peirce. Dover Publications.
Gouvea, J., & Passmore, C. (2017). ‘Models of’ versus ‘models for’: Toward an agent-based conception of modeling in the science classroom. Science & Education, 26, 49–63.
Gray, R. (2014). The distinction between experimental and historical sciences as a framework for improving classroom inquiry. Science Education, 98, 327–341.
Haig, B. D. (2005). An abductive theory of scientific method. Psychological Methods, 10(4), 371–388.
Haig, B. D. (2008). An abductive perspective on theory construction. The Journal of Theory Construction and Testing, 12(1), 7–10.
Hanson, N. R. (1958). Patterns of discovery. Cambridge University Press.
Hanson, N. R. (1961). Is there a logic of scientific discovery? In B. A. Brody & R. E. Grandy (Eds.), Readings in the philosophy of science (2nd ed., pp. 398–409). Prentice Hall. (1989).
Holthuis, N., Deutscher, R., Schultz, S. E., & Jamshidi, A. (2018). The new NGSS classroom: A curriculum framework for project-based science learning. American Educator, 42(2), 23–27.
Kapitan, T. (1992). Peirce and the autonomy of abductive reasoning. Erkenntnis, 37, 1–26.
Khan, S. (2007). Model-based inquiries in chemistry. Science Education, 91(6), 877–905.
Kitts, D. B. (1977). The structure of geology. Southern Methodist University Press.
Klassen, S. (2006). The science thought experiment: How might it be used profitably in the classroom? Interchange, 37, 77–96.
Kleinhans, M. G., Buskes, C. J. J., & de Regt, H. W. (2005). Terra incognita: Explanation and reduction in earth science. International Studies in the Philosophy of Science, 19(3), 289–317.
Krajcik, J., McNeill, K. L., & Reiser, B. J. (2007). Learning-goals-driven design model: Developing curriculum materials that align with national standards and incorporate project-based pedagogy. Science Education, 92(1), 1–32.
Lally, D., Forbes, C. T., McNeal, K. S., & Soltis, N. A. (2019). National geoscience faculty survey 2016: Prevalence of systems thinking and scientific modeling learning opportunities. Journal of Geoscience Education, 67(2), 174–191.
Laudan, R. (1980). The method of multiple working hypotheses and the development of plate tectonic theory. In T. Nickles (Ed.), Scientific discovery: Case studies (pp. 331–343). D. Reidel.
Laudan, R. (1987). From mineralogy to geology: The foundations of a science, 1650–1830. University of Chicago Press.
Lewis, E. B. (2008). Content is not enough: A history of secondary earth science teacher preparation with recommendation for today. Journal of Geoscience Education, 56(5), 445–464.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer Academic/Plenum.
Magnani, L. (2002). Epistemic mediators and model-based discovery in science. In L. Magnani & N. J. Nersessian (Eds.), Model-based reasoning: Science, technology, values (pp. 305–329). Kluwer Academic/Plenum.
Magnani, L. (2004). Model-based and manipulative abduction in science. Foundations of Science, 9, 219–247.
Magnani, L., & Bertolotti, T. (2017). Springer handbook of model-based science. Springer.
Morgan, M. S., & Morrison, M. (1999). Models as mediators: Perspectives on natural and social science. Cambridge University Press.
Nagel, E. (1961). The structure of science: Problems in the logic of scientific explanation. Harcourt, Brace & World.
National Research Council. (1996). National Science Education Standards. The National Academy Press.
National Research Council. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. The National Academies Press.
Nersessian, N. J. (2008). Creating scientific concepts. The MIT Press.
Nersessian, N. J. (2013). Mental modeling in conceptual change. In S. Vosniadou (Ed.), International handbook of research on conceptual change (2nd ed., pp. 395–411). Routledge.
Next Generation Science Standards (NGSS) Lead States. (2013). Next generation science standards for states, by states. The National Academies Press.
Norris, S. P., Guilbert, S. M., Smith, M. L., Hakimelahi, S., & Phillips, L. M. (2005). A theoretical framework for narrative explanation in science. Science Education, 89, 535–563.
Núñez-Oviedo, M. C., & Clement, J. J. (2019). Large scale scientific modeling practices that can organize science instruction at the unit and lesson levels. Frontiers in Education, 4, 68.
O’Neill, D. K., & Polman, J. L. (2004). Why educate “little scientists?” Examining the potential of practice-based scientific literacy. Journal of Research in Science Teaching, 41(3), 234–266.
Oh, P. S. (2008). Adopting the abductive inquiry model (AIM) into undergraduate earth science laboratories. In I. V. Eriksson (Ed.), Science education in the 21st century (pp. 263–277). Nova Science.
Oh, P. S. (2010). How can teachers help students formulate scientific hypotheses? Some strategies found in abductive inquiry activities of earth science. International Journal of Science Education, 32(4), 541–560.
Oh, P. S. (2011). Characteristics of abductive inquiry in earth science: An undergraduate case study. Science Education, 95, 409–430.
Oh, P. S. (2016). Roles of models in abductive reasoning: A schematization through theoretical and empirical studies. Journal of the Korean Association for Science Education, 36(4), 551–561. (in Korean with an English abstract).
Oh, P. S. (2017). The roles and importance of critical evidence (CE) and critical resource models (CRMs) in abductive reasoning for earth scientific problem solving. Journal of Science Education, 41(3), 426–446. (in Korean with an English abstract).
Oh, P. S. (2018). An exploratory study of the ‘method of multiple working hypotheses’ as a method of earth scientific inquiry. The Journal of the Korean Earth Science Society, 39(5), 501–515. (in Korean with an English abstract).
Oh, P. S. (2019). Features of modeling-based abductive reasoning as a disciplinary practice of inquiry in earth science: Cases of novice students solving a geological problem. Science & Education, 28, 731–757.
Oh, P. S., & Oh, S. J. (2011). What teachers of science need to know about models: An overview. International Journal of Science Education, 33(8), 1109–1130.
Oreskes, N. (2003). From continental drift to plate tectonics. In N. Oreskes (Ed.), Plate tectonics: An insider’s history of the modern theory of the earth (pp. 3–27). Westview Press.
Orion, N., & Ault, C. (2007). Learning earth sciences. In S. Abell & N. Lederman (Eds.), Handbook of research on science education (pp. 653–688). Routledge.
Paavola, S. (2004). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9, 267–283.
Paavola, S. (2006). Hansonian and Harmanian abduction as models of discovery. International Studies in the Philosophy of Science, 20(1), 93–108.
Raia, F. (2005). Students’ understanding of complex dynamic systems. Journal of Geoscience Education, 53(3), 297–308.
Raia, F. (2008). Causality in complex dynamic systems: A challenge in earth systems science education. Journal of Geoscience Education, 56(1), 81–94.
Rhoads, B. L., & Thorn, C. E. (1993). Geomorphology as a science: The role of theory. Geomorphology, 6, 287–307.
Rhoads, B. L., & Thorn, C. E. (1996). Observation in geomorphology. In B. L. Rhoads & C. E. Thorn (Eds.), The scientific nature of geomorphology (pp. 21–56). Wiley.
Rudolph, J. L. (2019). How we teach science: What’s changed, and why it matters. Harvard University Press.
Salmon, W. C. (1984). Scientific explanation and the causal structure of the world. Princeton University Press.
Scherer, H. H., Holder, L., & Herbert, B. (2017). Student learning of complex earth systems: Conceptual frameworks of earth systems and instructional design. Journal of Geoscience Education, 65(4), 473–489.
Schumm, S. A. (1991). To interpret the earth: Ten ways to be wrong. Cambridge University Press.
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234.
Shaked, H., & Schechter, C. (2017). Systems thinking for school leaders: Holistic leadership for excellence in education. Springer.
Simpson, G. G. (1963). Historical science. In C. C. Albritton Jr. (Ed.), The fabric of geology (pp. 24–48). Addison-Wesley.
Stillings, N. (2012). Complex systems in the geosciences and in geoscience learning. In K. A. Kastens & C. A. Manduca (Eds.), Earth and mind II: A synthesis of research on thinking and learning in the geosciences (pp. 97–111). The Geological Society of America.
Sung, J. Y., & Oh, P. S. (2018). Sixth grade students’ content-specific competencies and challenges in learning the seasons through modeling. Research in Science Education, 48(4), 839–864.
Thagard, P. (1988). Computational philosophy of science. The MIT Press.
Thagard, P. (1992). Conceptual revolutions. Princeton University Press.
Thagard, P. (2010). How brains make mental models. In L. Magnani, W. Carnielli, & C. Pizzi (Eds.), Model-based reasoning in science and technology: Abduction, logic, and computational discovery (pp. 447–461). Springer.
Thagard, P., & Shelley, C. (1997). Abductive reasoning: Logic, visual thinking, and coherence. In M. L. Dalla Chiara, K. Doets, D. Mundici, & J. van Benthem (Eds.), Logic and scientific methods (pp. 413–427). Kluwer Academic Publishers.
Trickett, S. B., & Trafton, J. G. (2007). “What if . . . ”: The use of conceptual simulations in scientific reasoning. Cognitive Science, 31, 843–875.
Turner, D. (2005). Local underdetermination in historical science. Philosophy of Science, 72, 209–230.
Turner, D. (2013). Historical geology: Methodology and metaphysics. In V. R. Baker (Ed.), Rethinking the fabric of geology (pp. 11–18). The Geological Society of America.
Villanueva, M. G., & Hand, B. (2011). Data versus evidence: Investigating the difference. Science Scope, 35(1), 42–45.
von Engelhardt, W., & Zimmermann, J. (1988). Theory of earth science (translated by L. Fisher). Cambridge University Press.
Walton, D. (2004). Abductive reasoning. The University of Alabama Press.
Wise, A. F., & O’Neill, K. (2009). Beyond more versus less: A reframing of the debate on instructional guidance. In S. Tobias & T. M. Duffy (Eds.), Constructivist instruction: Success or failure? (pp. 82–105). Routledge.
Abductive Inquiry and Education: Pragmatism Coordinating the Humanities, Human Sciences, and Sciences
51
John R. Shook
Contents
Introduction
Education Minding Minds
Experiential Education and Scientific Inquiry
Culture in Nature
Abductive Inquiry and Knowledge Discovery
Nine Modes of Exploratory Discovery
  Humanistic Disciplines
  The Two Historical Modes
  The Two Exposition Modes
  The Five Scientific Modes
Educational Research
Scientific Objects, Education Objectives
Conclusion
References
Abstract
Through its own traditions, research programs, and collaborations with other human sciences, education is a discipline displaying an unbounded potency for advancing human understanding and achievement. Education is a humanistic discipline about culture rather than a scientific field about nature, so it can be classified as a nonscientific discipline because of its inherently historical and social orientation. Narratives about what children should be becoming and how they should be developing are normatively prescriptive, not just naturalistically descriptive. Why then would science serve education? Disciplines such as psychology, social theory, and anthropology are relevant to pedagogy because they bridge humanistic disciplines and naturalistic sciences: they are able to be human sciences. They themselves are hybrid blends, so their humanistic knowledge is salient to pedagogy’s understanding of the child’s learning and the practices of teaching. Although education is primarily a humanistic discipline oriented more to culture than to nature, pedagogical practice can incorporate knowledge about childhood development, interpersonal communication, intellectual growth, and social integration. Education, while far more than scientific research, thereby joins the human sciences, contributing to scientific inquiry and cultural advancement simultaneously. Due to their similar priorities, education and science make an excellent fit. What is scientific and what is educational are unified at their root by exploratory discovery. With methods exemplifying that spirit of discovery, learners and teaching practices can be studied by the human sciences. Education’s humanistic ends are aided by knowledge about the nature of those to be educated.
Keywords
Education · Educational science · Human sciences · Humanities · Science · Abduction · Philosophy of science
J. R. Shook, Bowie State University, Bowie, MD, USA
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_49
Introduction
The world is a patient teacher to a curious species seeking out knowledge. Endless answers shall be given to those asking boundless questions, and the inexhaustibility of nature leaves us both awed and humbled. What the world cannot teach is what we need most to understand: what would be the right questions to ask, to learn even more? We bear responsibility for the inquiries we undertake, so we must attend to what we are doing with our investigations, as closely as we watch nature’s doings. Self-exploration accompanies world exploration. Modern science shines its bright light upon nature, while modernity directs a spotlight of illumination onto us. “What is humanity, that we are mindful of our own education?” Education is a truly human specialty, and all of science without exception is part of humanity’s education. This complementary opinion of science is hardly an unfamiliar perspective. The corollary, that humanistic disciplines can borrow scientific methods and absorb scientific knowledge, sounds less familiar. What about that supposed divide between the sciences and the humanities, keeping both “cultures” deaf and blind to the other? Education straddles that divide and integrates them at their common root. Not only should science be educational and education teach science, but their shared mission of abductive discovery connects the humanities, human sciences, and sciences in collaboration. This opening principle about discovery is sufficient, for pragmatism, to guide any elaboration of a “philosophy” of education. No theoretical philosophy of education obstructs our inquiry from the outset, and none will be defined in this chapter, although there is an educational philosophy assembled by its conclusion. To demand a “philosophy” of education up front, prior to empirical inquiry into learning, is to fixate upon an abstract obstacle obscuring the road of practical inquiry.
Education Minding Minds
Education revolves around the experiences of learning and learning well to facilitate ever more learning. Learning that narrows, misdirects, or halts the mind does not deserve the name. Focusing on teaching and learning, we observe a special and solemn space for an important kind of human relationship. Philosophies of education differ over a great deal, but see no reason to disagree here (Carr, 2005; Jackson, 2011; Moore, 2010; Woods & Barrow, 2006). This is a normative situation replete with roles and responsibilities attending to key matters: what should be learned, how proper learning proceeds, and how learning’s guidance should be conducted. Education is a human practice, and hence it can be right or wrong and done rightly or wrongly. There is nothing neutral or value-free about education, as it performs its service advancing humanity. Science as a human enterprise could never be neutral either. Science consists of more than theories and knowledge; it has its methods, values, and norms, too. Depicting science as value-free, alongside education as value-laden, obscures much about both disciplines. The growth of knowledge, along with the development of inquiring minds able to gain knowledge, is as normative and noble as any human endeavor. What science values, education values as well. On that common field of shared interests, offers of constructive criticism would be expected along with congratulatory confirmations. Sometimes criticism sounds too sharp, as we hear in that head-turning question, “Why is education teaching children all wrong?” Disruptive dictums from science pundits skirt around the heart of the matter: where does science constructively connect with education? Superficial criticism only clouds the important issues. Collaborations between education and science should proceed from a shared nature and purpose, or, if no such basis exists, each one can carry on without interference from the other.
Education has long experience with social forces offering their advice or admonishment. “Why isn’t education teaching children right?” is a provocative but ambiguous question, motivated by different intentions behind it. “Why isn’t education teaching what is right?” expresses a moral or civic concern. “Why isn’t education teaching the right subjects?” instead criticizes what is being taught. “Why isn’t education teaching the right way?” challenges the teaching practices. Science, too, is familiar with getting dragged into social controversies. Science’s worldview may be judged as harmful, or helpful, for civic values. Science’s knowledge could be assessed as central, or peripheral, for general competencies. Science’s expertise might be incorporated as useful, or rejected as irrelevant, for pedagogical efficiencies. Modernity has brought a measure of materialism to our times, but the idea of living in a “scientific culture” remains aspirational. Voices enthusiastic about science try to be heard over those sounding apprehensive. Upon complaints to the effect that “education is unscientific,” firm priorities have to be set. Shall scientific paradigms be confrontational against, or deferential to, conservative values? Shall scientific areas be secondary to, or replacements of, other subjects? Shall scientific theories be essential to, or optional for, teaching methods? Among all of these competing demands, education itself should remain the top priority for any society. It must not be regarded as inherently political, exclusory, or mechanical. Treating education as principally about something other than the teaching-learning relationship represents a deeper betrayal of education’s mission than any sign of intrusive scientism. Nevertheless, education in general plays many social and cultural roles simultaneously. Education as a public affair does require attention from the area of educational policy. Education as an academic institution calls for oversight with educational administration. The core mission of education universally is the responsibility of the discipline of education specifically, where educational research is harbored. Where educational research is distorted locally or dictated nationally by policy or administrative agendas, education becomes discordant and somewhat undisciplined. Discipline starts from putting first things first. Education must embody what is human, to be as broadly humanistic as anything else about culture, and duty-bound to advance learning for its own sake. Teaching only sacrosanct ideals deprives minds of comparing and testing values. Teaching only certain subjects to the exclusion of others prevents minds from appreciating a wide variety of endeavors. Teaching in fairly restricted ways limits minds to similarly constrained ways of thinking. Genuine education not only teaches; it teaches in ways that foster ever-more learning and the sure growth of knowledge.
Learning for its own sake has but one other devoted ally among all the cultural forms and social institutions ever invented: science. Minds should become explorative, experimental, expansive – with that mental growth in focus, the operational mission of education is coming into view. An undisciplined or debilitated education, less than fully capable of facilitating further learning, cannot do justice to that special learning-teaching bond. Mentality and its growth must possess its own inherent value and intrinsic justification. No doubt that is why all areas of human achievement, particularly science, unfailingly prize the mind – while any number of social forces try to manipulate minds.
Experiential Education and Scientific Inquiry
Science, like all exploratory discovery, built upon that foundation of education embedded at the core of human culture. It would be foolish for scientists to think that they have no need of learning and sound insights into good learning. If knowledge does not come from learning, by what process could knowledge ever enter a mind? Mystical and mythical illumination were left to religion when science gained its independence from theology. In science, learning from exploratory experience surely counts as learning.
The purpose of thinking is inquiry, for the one who is undergoing the learning processes. This is the essence of all education, in whatever form and format. In the words of John Dewey:

To say that thinking occurs with reference to situations which are still going on, and incomplete, is to say that thinking occurs when things are uncertain or doubtful or problematic. Only what is finished, completed, is wholly assured. Where there is reflection there is suspense. The object of thinking is to help reach a conclusion, to project a possible termination on the basis of what is already given. Certain other facts about thinking accompany this feature. Since the situation in which thinking occurs is a doubtful one, thinking is a process of inquiry, of looking into things, of investigating. Acquiring is always secondary, and instrumental to the act of inquiring. It is seeking, a quest, for something that is not at hand. We sometimes talk as if “original research” were a peculiar prerogative of scientists or at least of advanced students. But all thinking is research, and all research is native, original, with him who carries it on, even if everybody else in the world already is sure of what he is still looking for. (Dewey, 1916, pp. 173–174)
It is unnecessary to advocate Dewey’s entire philosophy of education to acknowledge the sensible point about learning made here. Dewey had an enormous influence on progressive educational theory (Darling & Nordenbo, 2003), and his teacher-with-learner approach serves as a counterbalance to curriculum-centered paradigms on the one side and child-centered pedagogies on the other (Noddings, 2015). His broad views on education never stray far from his tight focus on thinking, problem-solving, and learning from the trials of experience. Why would science be distanced from education? Empirical sciences could not plead ignorance of the methods behind their discoveries while taking credit for those successes. Scientific methods are fundamentally methods of experiential learning, or else they have nothing to do with knowledge. We need not rehearse outdated debates between empiricism and rationalism to understand that the sciences surmount that standoff by fruitfully combining empirical observations with reasoned inferences in their complex methodologies. Science deserves all due credit for its numerous sophisticated methodologies, carefully crafted for the many domains of diverse fields and subfields of inquiry. Reliable methodology conducive to knowledge is hardly alien to education. Let us then seriously ponder how science is education and education is scientific. This thesis for deliberation is not merely that “science is educational” or that “education includes science.” Announcing their rooted unity presents a truly radical thesis, exposing to our view their deep common root. Their superficial apprehensions will not delay our excavation. Education, as a humanistic discipline, often exhibits anxieties and antipathies toward what is regarded as scientism. Science, for its part, would not surrender its hard-won independence just to be submissive to humanism. Both sides need to relax their defensiveness.
Neither values-free scientism nor values-laden science is our objective. Worries over reductionism or relativism are premature until the territory around education and science has been adequately scouted and surveyed. Deeper commonalities have to be investigated, to get past hasty misconceptions. A widespread notion about education holds that it only transmits established knowledge, while science acquires new knowledge. Science education, for instance, makes complete sense from a scientific standpoint, supplying knowledge that education conveys. This preconception treats education as just teaching lessons for dutiful learners, while science is testing ideas for daring explorers. Didactic instruction fits that caricature for education, at most. Such an antiquated view of education hardly comports with science’s respect for modernized empiricism. Let us all be sound empiricists at last. Education in its refined meaning has everything to do with active learners trying out ideas new to them. Experiential learning is sounder than rote learning. Educational research accordingly explores the possibilities of dynamic learning in all its forms and formats. Experiential learning is not synonymous with solitary learning. Suitably directed, education can guide learners through processes of comparing and testing their own ideas as well as those of others and encourage inquisitive activities calling for group participation. Science applies its methodologies within group efforts of comparatively testing ideas of explorers through guided research programs. That description makes a fine fit with the conception of empirical and experiential education. Education and science are, from their root, the same flourishing and flowering of human mentality. From any learner’s perspective, acquiring knowledge is a voyage of novel exploration and discovery, just as the scientist’s experience of performing scientific inquiry is exploratory discovery too. Still, voices keep insisting on science’s independence from education. We will be reminded that instructional settings provide teaching guidance. Group guidance is indeed crucial, we can reply. Does the scientist confront nature alone, out beyond society and all social institutions? Just the opposite: organized scientific communities together establish confirmable knowledge.
We will also be reminded that educational instruction presumes that most everything to be learned is already well-confirmed and reliably known. Again, we can reply, any scientific field relies on an ample storehouse of established knowledge from past investigations and theoretical advances. We will next be told that educational practices, unlike scientific programs, depend on multiple disciplines. Indeed, education blends traditions of pedagogy with knowledge from allied disciplines such as developmental and abnormal psychology, sociology, and organizational studies. Yet when we turn to look at any scientific field, its supportive subfields and neighboring fields contribute knowledge and methods. The field of cellular biology would be making little progress absent the participation of biochemistry, molecular biology, and genetics or without collaborations with physiology and evolutionary biology. Parallels are only mounting between scientific inquiry and experiential education. Scientific independence from education is, so far, looking less and less plausible. There must be sound methodology for experiential learning and discovery, or else nothing about science or society makes sense. The sciences regard their methodologies as highly refined and specialized, and too complex for application in educational settings. That is a valid distinction, only confirming the general standpoint urged here about a shared science-education heritage. There is much that science does that cannot appear educational, and much occurs in education that doesn’t seem particularly scientific, but their common core of experiential discovery plays its essential role nonetheless. Elementary and secondary instruction introduces elements of empirical and experimental inquiry, and then scientific fields instruct their college majors and graduate students in advanced methodologies.
Culture in Nature
Admitting the core heritage shared by science and education, the path ahead may yet diverge. They are distinct fields, after all. A fact-value dichotomy might send them in divergent directions. In order for science to make theoretical discoveries, it strives to leave the normatively human behind as it neutrally postulates natural entities, energies, and laws that exist anywhere at any time. Education places the normatively human world out in front, proceeding for the sake of that particular world and that world’s future. Education as a discipline must contain a historical and historicist component, even as it conveys its ongoing work into the future. Its pedagogical practices developed within humanity’s cultures to perpetuate those heritages, and nothing about pedagogy makes sense outside of that genealogy. Two divergent objectives now open up before us. What education, along with other humanistic disciplines, wants to comprehend is what being human can become. What science seeks to understand is what being natural must be. This is an important difference; whether there is also an ontological dichotomy is a meta-methodological matter for philosophical reflection. Both the humanities and the sciences deserve a careful hearing. What can humanity become, in light of the way that we have gotten to where we are so far? Agency, opportunity, and liberty are presumed values with the asking of that humanistic question and can’t be omitted in any sensible answer. What must nature be doing, in light of the way that we have observed events around us so far? Energy, regularity, and conditionality are presumed categories with the asking of that scientific question and won’t be omitted in any reasonable answer. Philosophies do not fail to note this divergence of objectives and presuppositions, while disagreeing over their ontological and metaphysical implications.
Some philosophies formulate the compatibility or even convergence of these objectives, reconciling the human with the natural. Other philosophies magnify their incompatibilities, defining the human and the natural in categorically contrary ways. Education can be easily classified as a nonscientific discipline because of its inherently historical and social orientation. Narratives about what children should be becoming and how they should be developing are normatively prescriptive, not just naturalistically descriptive. Why then would science serve education? Disciplines such as psychology, social theory, and anthropology are truly relevant to pedagogy because they bridge humanistic disciplines and naturalistic sciences: they are able to be human sciences. They themselves are hybrid blends, so their humanistic knowledge is salient to pedagogy’s understanding of the child’s learning and the practices of teaching. Although education is primarily a humanistic discipline oriented more to culture than to nature, pedagogical practice can incorporate knowledge about childhood development, interpersonal communication, intellectual growth, and social integration. Education, while far more than just scientific research, thereby joins the human sciences, contributing to scientific inquiry and cultural advancement simultaneously. The internal struggle within education over the question of whether education shall be a culture-oriented discipline about comprehending what is human or a science-oriented field about understanding what is natural can be quelled and dispelled. It is no contradiction in terms for education to focus on what should be naturally human: how each child’s intellectual capacities should get developed toward bountiful results. It would be unnaturally wrong to neglect a child’s thinking capacities and normally right to administer sound teaching practices. Nurturing and naturing are united here, so long as education is well-informed about the import, efficacy, and impact of those practices. Education can be scientific to the extent that it views its body of time-tested pedagogical practices as opportunities for further investigation, trial, and adjustment, even as education’s pedagogical goals always transcend science’s purview. This historical co-development, between the improvement of practices which in turn enhance cultural ends, leaves nothing unaffected or unchanged while generation after generation receives its education. How children are taught now is a lesson telling culture how its adults will be able to think. Philosophy of education need not follow the academic tendency to keep fields apart and erect rigid dualisms between disciplinary categories, goals, and methods. Philosophy itself can examine the historical-cultural nature of education alongside the experimental-natural purpose of science, discerning their shared commitments and methodologies. Education, history, social theory, psychology, and anthropology do not number among the sciences, intrinsically or in their entirety.
Their missions revolve around agency, not causality, and their methods and ethics forbid fully controlled experiments on humans. (How would experimenting with control groups who are denied such things as autonomy, nurturing, opportunity, or security be allowed to proceed?) However, the portability and adaptability of many scientific methods allow nonscience disciplines to include selected scientific phases. History can take advantage of selected criteria for factual validity and explanatory adequacy. The amenability of social theory to observational and statistical methods allows sociology to flourish. Experimental psychology can use correlative statistics and control groups. Anthropology heeds the counsels of scientific objectivity while conducting its investigative and comparative inquiries. Scientific investigations are evidently educational for humanity’s ways no less than they educate humanity about nature’s ways. What ontological chasm still divides the human from the natural, to keep humanistic learning far away from naturalistic knowledge? Four proposals have been advanced so far. (1) What is scientific and what is educational are unified at their root by exploratory discovery. (2) Education is a humanistic discipline about culture rather than a scientific field about nature. (3) Learners and teaching practices can be studied by the human sciences. (4) Education’s humanistic ends are aided by knowledge about the nature of those to be educated.
Proposition (1) is met with the challenge that the methods of scientific experimentation are very different in kind from the procedures of learning discovery. This challenge can be satisfactorily answered. Proposition (2) is accurate enough, although answering the first challenge, revealing the core logic of abduction shared by scientific and educational discovery, explains why humanistic disciplines cannot match the experimental powers of scientific fields. Proposition (3) is met by the criticism that the notion of a “human science” remains obscure, since science yields necessary universal laws useless for the humanistic disciplines’ respect for contingency and freedom. That challenge is answered by pointing to historical sciences (such as geology, paleontology, biology, and archaeology) which appeal to neither universal nor chancy explanations. Biology is the key to defending (4) from objections. The cultured nature of the human learner has resulted from evolution, so the normativity of culture is a natural object of study by the humanistic sciences. The nature of humanity is to be intelligently cultural. Culture expresses that freedom of the ongoing discovery of human potential. While more creative than controlled, humanistic discovery benefits from scientific counsels about humanity. While scientific in methods, human sciences explore how humanity has been self-created and continues to self-create. Science is not alien to the historicity of cultural endeavor, as a component of that human project of exploration. The spirit of humanity lies in that distinctive life of the bio-culturally coevolved human animal.
Abductive Inquiry and Knowledge Discovery
Are the methods of scientific experimentation so different in kind from the procedures of learning discovery? We have already left behind didactic repetition for rote memorization. We can also set aside inferior ways to learn from getting exposed to a heterogeneous mass of supposed facts or from getting invited to accept conclusions deduced from assumed premises. The former method is appropriate for making acquaintances with natural curiosities, and the latter is essential in geometry and mathematics. Beyond those delimited stages, acquiring useful knowledge has to be interactive rather than passive. Learning that engages the learner’s own queries and develops the learner’s critical faculties must be exploratory: acquired knowledge opens up further questions for exploration by empirical eyes and rational minds. Are we only contemplating education with this conception of learning? Dewey has a basic definition for science in mind to offer us too: “science signifies, I take it, the existence of systematic methods of inquiry, which, when they are brought to bear on a range of facts, enable us to understand them better and to control them more intelligently, less haphazardly and with less routine” (Dewey, 1929/1984, pp. 3–4). Learning is learning, at any level or stage of progress. Exploratory learning is surely enhanced by suggestive teaching. We need not exalt “self-guided” learning, the sort of inadequate psychology that pragmatism warned against, in order to keep up with the oft-heard contention that science needs no teacher other than nature itself. Depiction of the lone scientist eliciting nature’s secrets is a romanticized image at best and a crude caricature at worst. Competent participants offering their knowledge, suggestions, and criticisms surround any research scientist. Research teams do have to eventually answer to nature, but so does any exploratory learner desirous to learn something directly rather than at secondhand. Motivationally, who is the cutting-edge experimenter but a proficient learner once again having a long look at nature? Methodologically, no empirical inquiry should be solitary, since the vagaries of cognitive bias and prejudice require communal compensations. Co-informants (any information sources whether natural or human) may be in the past, in the present, or mostly in the future. If co-informants are no longer here but only in the past, the inquirer must adopt their perspectives and viewpoints, on matters thus taken as historical. If co-informants are present and accessible, the inquirer can solicit their information, on matters thus taken as expositional. If instead co-informants will mostly exist in the future, the inquirer can conduct trials that are replicable, on matters thus taken as experimental. All three methodological orientations – whether historical, expositional, or experimental – are implemented and accomplished through abductive procedures of inference (Magnani, 2010; Aliseda, 2017; see Shook, 2021a for more references). The founder of pragmatism, Charles Peirce, places the burden upon abduction for pursuing and finding explanations:

Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction, and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about. (Peirce, 1934, pp. 171–172)
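Peirce’s division of labor among abduction, deduction, and induction can be made concrete with a toy illustration. The sketch below is only a schematic aid, not a model proposed in this chapter: the stock of candidate rules, the function names, and the numerical facts are all hypothetical, chosen so that one surprising fact admits several suggested explanations which later trials then sort out.

```python
from typing import Callable, Dict, List, Tuple

Rule = Callable[[int], int]

# Hypothetical stock of explanatory rules an inquirer might entertain.
CANDIDATE_RULES: Dict[str, Rule] = {
    "doubles": lambda x: 2 * x,
    "adds two": lambda x: x + 2,
    "squares": lambda x: x * x,
    "adds one": lambda x: x + 1,
}


def abduce(fact: Tuple[int, int]) -> List[str]:
    """Abduction: suggest every rule that, if true, would explain the fact."""
    x, y = fact
    return [name for name, rule in CANDIDATE_RULES.items() if rule(x) == y]


def deduce(name: str, new_x: int) -> int:
    """Deduction: draw a testable prediction from a suggested rule."""
    return CANDIDATE_RULES[name](new_x)


def induce(names: List[str], trials: List[Tuple[int, int]]) -> List[str]:
    """Induction: keep only the rules whose predictions survive every trial."""
    return [n for n in names if all(deduce(n, x) == y for x, y in trials)]


# A surprising fact: the input 2 yielded 4. Three rules "may be" the case.
suggested = abduce((2, 4))
# A further trial: the input 5 yielded 10. Only one suggestion survives.
surviving = induce(suggested, [(5, 10)])
print(suggested, surviving)
```

The point of the sketch is the ordering of the steps: abduction only suggests, and its suggestions earn their keep through the predictions that later trials confirm or defeat.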
Humanistic disciplines attempting to be exploratory and explanatory cannot avoid the rigors of abductive inference. The discipline of history, as the most historical of inquiries by definition, is inquisitive, selective, and organized, but those are low standards to meet. Exemplars of far-from-scientific history are historians composing a compelling tale (the literary historian), a morality lesson (the hagiographical historian), or a vindication epic (the ideological historian). For the investigative historian, sources are indispensable but not infallible or unchallengeable. Records from sources offer viewpoints upon their topics (they are not entirely subjective), so higher objectivity lies in the collection and colligation of many source records and material traces. This historian develops a hypothesis that can be put to trial, revolving around a proposal, such as “Rome’s civic instabilities were behind Caesar’s dictatorship and swift assassination,” that may be tested against further information and interpretations (Collingwood, 1946). The scientific historian refines investigations further, heeding the naturalistic worldview and consulting allied human sciences about the past, such as antiquarian forensics, paleography, archaeology, and geography (Diamond & Robinson, 2010; Roth, 2012). That abductive process also lies at the heart of the two remaining modes of investigation: expository and experimental.
51 Abductive Inquiry and Education: Pragmatism Coordinating the. . .
While co-informants are presently accessible, exploration takes a predominantly expository form. The inquirer can solicit interviews or consultations. Broadly social influences and forces are amenable to investigative methods of the social sciences (Backhouse & Fontaine, 2010). Interviewing enough people about a town’s rising prices, as a sociologist, journalist, or pollster may undertake, offers some qualitative insight into local concerns. Quantitative investigations follow in their wake, scalable up to any desired scope. Calculating an entire nation’s rate of monetary inflation requires the quantitative tabulation of vast amounts of data collected by fairly representative sampling across the country. The resulting expositions from extensive investigation, whether journalistic or economic, remain improvable with enough resources. Journalism’s “first draft of history” is combined with related sociological information about mass behaviors, customs, institutions, and the like, to sketch a general portrait which is then available for further testing against the still-growing collection of insight and information. The efforts of economics to fine-tune its measures of fiscal and market activity can be especially relentless and ceaseless. A paradigm form of expository investigation is the crime investigation. The accused and witnesses are questioned, crime scene clues are forensically analyzed, and contextual conditions surrounding the crime are registered. Like the historian, the detective formulates a reasoned account to explain the unfortunate event, while minding the sociological adage that human affairs are so complicated that alternative accounts have to be considered and compared. The maximally coherent account is probably closer to the truth, especially if it survives impartial scrutiny (e.g., by judges and juries). 
Sherlock Holmes’ fictionalized acuity relied on abduction more than deduction or induction, although his swift capacity for contrasting multiple guesses by their deduced consequences, and then spotting the singular clues eliminating all but one explanation, disguised his abductive powers (Carson, 2009). A detective’s criminal investigation is akin to a trial and serves as an opening phase to a potential criminal trial. “If the accused really committed the crime, then further consequences of both the criminal’s behavior and the crime scene should become observable under the right conditions.” Rarely do the initial facts determine responsibility. They merely set up the opportunity to explain that event as the outcome of a chronological sequencing of conditioning events that allowed that particular event to happen. More evidence must be experimentally gathered (forensics, interviews, etc.) to test various hypotheses about the responsible causes for that crime. It is true that the “crime event” is treated as a particular event with its own contingent conditionings and causes, rather than as an “individual” event that must occur whenever necessary conditions are lawfully satisfied. That is because a “crime” is a sufficiently complex event, so inquiries into its “lawfulness” are impractical, and a “crime” is an event involving humans so it bears particular (indeed, unique) interest in its own right. For sociology, by contrast, individual deeds are only noticed and measured so that mass statistics about generic kinds of crime can be metrically accumulated. In sociology, a crime is still an event, but it is now an “individual” event to be treated and explained as an individual case within a general pattern of mass social action (Hester & Eglin, 2017).
J. R. Shook
However exploratory, a detective’s investigative methods remain constrained by an inability to conduct a highly controlled experiment. It is impossible to recreate situational conditions prior to a crime, human behavior is hardly lawful even under ideal conditions, and neither the suspects nor the victim can be “reset” to initial psychological states as they were prior to the crime. It is impossible to learn if the accused “would have surely done that crime at that time in that same way.” All the same, investigations can conduct partially controlled experiments. That crime was in the past, but most of its components did continue on, and many effects from those components continue to exist to the present time. That is why the investigation must happen quickly, while conditions are fresh and clues haven’t dissipated (Fisher & Fisher, 2012). If too much time passes, looking for additional evidence will seem more like an expedition than an experiment. “Cold cases” can still be investigated (Adcock & Stein, 2014), but that sort of expeditionary quest is more akin to an animal hunt down a trail gone cold. The third mode of discovery, the experimental, is undertaken when co-informants for the inquiry are mostly in the future. An analogy from Peirce illustrates this collective intelligence: “The scientific world is like a colony of insects in that the individual strives to produce that which he himself cannot hope to enjoy. One generation collects premises in order that a distant generation may discover what they mean” (Peirce, 1958, p. 87). For Peirce, a community of scientific inquirers can be indefinitely extended into the future, no matter how many visible geniuses stand among us now (Shook, 2021b). Historical and investigatory disciplines can be “scientific” in this minimal sense, seeking empirical truths through methods amenable and answerable to similarly scrupulous and honest researchers. 
However, with experimental inquiry, Peirce’s vision from the heights of scientific inquiry looks to the far future. Indirect communication with people that one hasn’t met and won’t ever meet has to take the form of exacting experimental design and precise data collection. Controlled experiments to test an abductive hypothesis have to be closely replicable so that future confirming results are repeatable and comparable. The general way that a crafted experiment is reproducible by generic experimenters, anywhere and anywhen, is essential to the postulate’s credibility in the long run. The logic of abductive discovery requires this prolonged reach of experimental inquiry, so that weaknesses to poor hypotheses are eventually exposed. Peirce’s understanding of science expects future inquiries, if rigorously scientific in character, to converge in the very long run (over thousands or millions of years, if necessary) toward an answer: “Inquiry properly carried on will reach some definite and fixed result or approximate indefinitely toward that limit” (Peirce, 1932, p. 485). His definition of truth for science (not for “truth” in other contexts) is this: “The opinion which is fated to be ultimately agreed to by all who investigate, is what we mean by the truth” (Peirce, 1934, p. 407). Whatever cannot satisfy these two conditions of character and convergence cannot count as genuine scientific inquiry, since it will lack credible objectivity and realism. The humanities are not expected to satisfy this scientific idea of truth, for their missions are focused on comprehending and
explicating human practices, institutions, and areas of cultural achievement. The human sciences elevate their aims toward explanation, prediction, and direction. Differences between investigative and experimental inquiries can be exaggerated. Physics is often held up as “genuine science” because it conducts replicable controlled experiments and validates many hypotheses to near-certain degrees. Knowledge from partially controlled experiments can be obtained, without question, and many fields offer practical advice about statistical correlations. For example, clinical trials purporting to find the efficacy of a new drug include “control groups” yet cannot fully control all confounding variables. Determining true mechanisms of biophysiological action lies beyond the capabilities of the most rigorous trials, which are at best suggestive about sure causes to guaranteed effects (Machin et al., 2021). Clinical trials are abductive experiments on human subjects, but they are more investigational than strictly experimental in the manner of chemistry or physics.
Nine Modes of Exploratory Discovery
We have distinguished three procedural types: the empirical, investigative, and scientific. We then distinguished the historical, expository, and experimental orientations to inquiry. Crossing the two yields a total of nine modes of exploratory inquiry, represented in Table 1. Each of its nine boxes represents a phase of abductive inquiry as an element of discovery.
Humanistic Disciplines
Humanistic disciplines are not sciences; they focus instead on understanding and explaining the capacities and results of human thought, agency, and activity. Factual evidence is not dispositive here; what ought to be enjoys preeminence over what happens to be. Exploring the possibilities of human potential is the supremely important sort of exploration, devoted to creative discovery, not empirical discovery. Each discipline is highly selective about the character and salience of “evidence” relevant to its normative paradigms. Exemplars of humanistic disciplines are philosophy (inclusive of logic and ethics), history, social theory, theology, political theory, economics, and mathematics. The nine discovery modes display the predominance of abductive hypothesizing and testing over deductive and inductive methods. Deduction by itself only confirms whatever is already believed, and that is why humanistic disciplines often fail to rise above traditional customs and parochial values. More imagination is necessary. Induction yields an enlarging evidence base to improve our acquaintance with ourselves and nature, but its meanderings only hint at deeper patterns and causes. Still more imagination is needed. Fully methodological inference (abduction) is far more explanatory, by postulating underlying explanations only revealable through experimental trial.
Table 1 Nine modes of inquiry (columns: type of informants; rows: basis of evidence)
| Basis of evidence | Past informants for historical narration | Present informants for dispositive exposition | Future informants for scientific experimentation |
|---|---|---|---|
| Empirical evidence | Mode One. Respects all available sources that are able to pass checks for credible authenticity and mutual consistency | Mode Four. Interrogates witnesses and collects evidence in order to discern which hypothesis can acquire the most plausibility | Mode Seven. Accumulates and categorizes evidence so that future inquiries can rely on its patterned and predictive organization |
| Investigative evidence | Mode Two. One, plus: Imposes interpolations and interpretations to reach for maximal coherence and singular chronology | Mode Five. Four, plus: Consultations with recognized experts, but their judgments are not necessarily taken as definitive | Mode Eight. Two and Seven, plus: Applies controlled methodologies ready for scrutiny and replication by further investigation |
| Scientific evidence | Mode Three. One and Two, plus: Incorporates expertise from allied scientific fields and omits nonnaturalistic events | Mode Six. Five, plus: Relies on knowledge from allied scientific fields and ignores nonnaturalistic ideas | Mode Nine. Six and Eight, plus: Experiments fully control conditions for repeatable consistency with future science |
Humanistic disciplines expanding their interests into explanation, prediction, and control, whether dealing with the natural or human realms, venture beyond the historical and exposition modes into scientific modes. Relationships among humanities and sciences have varied widely, from cooperation to conflict (Slingerland & Collard, 2011; Bouterse & Karstens, 2015). Conflict over methodological principles occurs periodically, but there is no need to regard that contest as permanent or irrevocable. So long as the nine discovery modes are discriminated and applied, jointly humanistic and scientific inquiries can be charted without confusion. Selected examples of fields for each of the nine modes are listed to illustrate distinctions found among them.
The Two Historical Modes
Mode One. Herodotus-Style History, Ecclesiastical Theology, Oral Narrative Recording, Journalistic Reporting.
Mode Two. Polybius-Style History, Rankean-Style History, Intellectual Biography, Political History, Systematic Theology.
Let us pause to explore modes of historical investigation. “History” comes from the Greek word for “inquiry” into actual matters leaving evidence for an
inquirer to look into. An honest inquirer must be guided by sources having then-contemporary or near-contemporary perspectives, rather than blind credulity about hearsay and legend. Selectivity is necessary for the organization of empirical history, as the work of Greek historian Herodotus displays, but that is still a low standard to meet. For the investigative historian, such as the Roman historian Polybius, sources are indispensable but not infallible or unchallengeable. There is a methodological expectation of chronology and consistency among candidate facts, with a minimization of partiality and prejudice. The scientific historian (Mode Three below) refines investigations further, heeding the naturalistic worldview and consulting allied fields about past matters, such as literature, antiquarian forensics, paleography, and archaeology.
The Two Exposition Modes
Mode Four. Ecclesiastical Inquisition, Crime Investigation, Investigative Journalism, Public Polling, Ethnography.
Mode Five. Canon Jurisprudence, Civil Jurisprudence, Government Inquiry, Foreign Intelligence.
The Five Scientific Modes
Mode Three. Chronological, but hard evidence is left to more scientific fields. Human Sciences: Scientific History, Antiquities Authentication, Art Authentication.
Mode Six. Investigatory and diagnostic, but experimentation is in the hands of other scientific fields. Human Sciences: Digital Forensics, Forensic Criminology, Forensic Anthropology, Forensic Authentication, Abnormal Psychology, Psychiatry.
Mode Seven. Exploratory and modestly predictive, but experimentation goes little farther than events naturally or socially provided. Natural History: Geography, Ecology, Linnean Biology, Botany, Zoology, Anatomy, Animal Behavior. Human Sciences: Human Physiology, Clinical Psychology, Sociology, Demographics, Cliodynamics, Linguistics, Educational Research, Anthropology, Epidemiology, Political Science, Economics.
Mode Eight. Moderate control over experimental conditions, more for identifying conditions beyond human control. Historical Sciences: Cosmology, Astronomy, Geology, Earth Sciences, Evolutionary Biology, Paleontology, Paleoarchaeology, Materials Analysis. Human Sciences: Archaeology, Clinical Medicine, Experimental Psychology, Neuroscience.
Mode Nine. High control over experimental conditions, more for identifying causes amenable to human control.
Physical Sciences: Mechanics and Dynamics, Classical Physics, Quantum Physics, Chemistry, Mineralogy, Metallurgy, Materials Science.
This organization is not about the subject matter of a discipline, but its methodological resources. Scientific fields among the earth and life sciences in Mode Seven have to be more exploratory and expeditionary than strictly experimental. Fields descended from traditions of natural history such as botany, zoology, geography, geology, and paleontology are obvious illustrations. To discover why a particular thing or event came to be, scientists will plan and execute investigations and explorations, but past and vast natural powers controlled what happened. Mode Eight allows for greater control over experimental design and execution. Archaeology and neuroscience, for example, share the capacity for modest control over experimental inquiry, by meticulously conducting earthen excavations or by painstaking analyses of neural tissue. By contrast, to precisely determine how a kind of thing or event always comes to be, the physical sciences in Mode Nine are able to test hypotheses with rigorous and replicable controlled experiments.
Educational Research
Education as a discipline could be, and has often been, scaled back to history of education or to educational policy. Historical studies can summarize careers of pedagogical movements, recount teaching and learning experiences, lament inequitable access, praise respectable pioneers, point out correlations among social and civic factors, and identify enduring instructional methods (cite standard books on education, etc.). Humanistic disciplines do have normative standards and ideals to uphold. These studies in education can discern, debate, and decry the variable ways that different societies have valued learning and invested in teaching. Turning the spotlight onto national affairs, education can undertake inquisitive investigations in order to broadcast exposés, make policy indictments, and back political reforms. As broadly as education must sojourn, it avoids becoming undisciplined by aiming higher. For a human-centered discipline beholden to best practices both today and tomorrow, it is that teaching-learning space that must remain primary, so practical problem-solving cannot be secondary, and educational research can fulfill that commitment (Baez & Boyles, 2009; Biesta & Burbules, 2003). Striving to be more than a collection of customary traditions, current exemplars, revered paradigms, or ideological agendas, disciplined education additionally respects inquiries assignable to scientific Modes Six and Seven. Educational research cannot attain the status of Mode Eight science, unlike experimental psychology and neuroscience, although education should consult their well-established theories without imitating their techniques. Mode Nine science, by requiring fully controlled and exactly repeatable experiments, is impractical for human subjects and beyond education’s reach.
Overall, it is now clear that educational research cannot be reduced without remainder to any number of physical, biological, physiological, or brain sciences. Cooperation, not reduction, opens up the path in front of education. Quality research can be accomplished, including both student-centered and teacher-led research, so long as any “laboratory” for education revolves around living educational spaces. Displays of science-aversion out of loyalty to an idolization of disciplinary independence cannot serve learners or teachers. By joining the human sciences, disciplined education and educational research have plenty of good company for collaborations and alliances (Furlong & Lawn, 2010; Bridges & Thompson, 2011; Peters et al., 2014; Bridges, 2017). Sociology, ethnography, psychology, organizational studies, communication studies, technology studies, and related disciplines will never be sciences either, at most attaining to human sciences. Human sciences easily lend themselves to interdisciplinary investigations and experimentation so long as they are generous with their expertise. Scientific research in the area of educational research, like research in the human sciences generally, has been a controversial topic for decades. Key issues were present at the birth of education as an academic discipline, and they continue to carry great weight (Lagemann, 2000; National Research Council, 2002, 2012). Education has plenty of company with neighboring human sciences also digesting roles for scientific inquiry. For example, James Grand and colleagues who advised the Society for Industrial and Organizational Psychology on research practices have formulated criteria for “robust science” in the human sciences: research that is relevant, rigorous, replicated, accumulative and cumulative, transparent, open, and theory oriented. In their words: . . . 
“a robust science is one in which activities throughout the entire scientific enterprise are conducted with the intention of producing positively impactful and relevant knowledge”; “the rigor of a science is reflected in the extent to which its core concepts and their relations are operationalized with precision, and the methodologies used to collect informative observations are accurate and appropriately aligned with the analytical techniques used to infer meaning from those observations”; “the replicability of [science] findings . . . pursues efforts to gather repeated (i.e., replicated) observations of the mechanisms and relationships among core concepts and processes of human behavior, and that these efforts are made accessible in the corpus of scientific evidence”; “the strength of scientific understanding and inference is enhanced through careful vetting, deliberate calibration, and compounding multiple observations into an integrative whole. . . . the pursuit of cumulative knowledge is reinforced by adopting an appropriate degree of intellectual skepticism toward novel propositions and appropriately adjusting those beliefs on the basis of accumulated evidence”; “a robust science [is] one in which transparency and openness are embraced throughout the research process and scientific system. Activities that embrace these principles include more complete disclosure of data, materials, analyses, and hypotheses to the scientific community; promoting publication practices in which important questions answered well have a place in the literature regardless of results; and creating accessibility to the research process at all stages of production”; “a robust science is simply one in which its scientific pursuits contribute to explanation and ‘refinement of everyday thinking’ by replicating, bounding, revising, falsifying, and, when appropriate, advancing new claims.” (Grand et al., 2018, pp. 11–14)
Going further, Grand et al. emphasize that theories searching for confirming data are not explanatory science; robust science runs experiments searching for data disproving hypotheses: . . . robust science is ‘theory oriented’ (not theory driven or theory dependent) and promotes this tenet by describing, evaluating, and refining explanations. Genuinely accomplishing this goal requires research that reflects quantitative and qualitative methodologies across the full range of inductive, deductive, and abductive approaches. . . . The rightness of a theory is not determined by the clarity of its arguments or through formal logic but by subjecting its claims to the gauntlet of empirical investigation. A science that strives for precise theories purposefully subjects its explanations to an increased “risk” of falsification to determine the level of confidence that should be placed in proposed relationships. (Grand et al., 2018, pp. 14–15)
This burden on theory comes from a scientific respect for abductive problem-solving and discovery, not the outdated “top-down” approach of earlier eras. Expecting education to become a theoretical science of its own at one leap only revisits outdated epistemological and falsificationist philosophies (Rowbottom & Aiston, 2006; Rowbottom, 2014). As philosophy well knows, in the arena of empirical explanation, clever fits between a neat theory and its preferred data prove little. Education’s humanistic mission is not so permissive that it deserves exemption (Carr, 2006). Although positivism and foundationalism are now history, the fashionable dismissal of “pure” data is no license to deny that good data can drive out bad theory. A theory shouldn’t dictate its own observational support. That is why a mixture of methods and experiments conducted by trusted lower-level theories, including those from fields neighboring education such as psychology and sociology, supplies independently collected information (Gorard & Taylor, 2004; Niaz, 2008; Biesta, 2010; Hall, 2013). Hypotheses meriting credibility are those able to survive both serious rivals and plenty of reliably collected data from multiple sources. 
For educational research, a similar statement of robust scientific criteria was provided in a major piece of American legislation about educational policy, the “No Child Left Behind Act of 2001,” where scientifically based research is outlined: (i) employs systematic, empirical methods that draw on observation or experiment; (ii) involves rigorous data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn; (iii) relies on measurements or observational methods that provide reliable and valid data across evaluators and observers, across multiple measurements and observations, and across studies by the same or different investigators; (iv) is evaluated using experimental or quasi-experimental designs in which individuals, entities, programs, or activities are assigned to different conditions and with appropriate controls to evaluate the effects of the condition of interest, with a preference for random-assignment experiments, or other designs to the extent that those designs contain within-condition or across condition controls; (v) ensures that experimental studies are presented in sufficient detail and clarity to allow for replication or, at a minimum, offer the opportunity to build systematically on their findings; and (vi) has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review. (quoted from Baez & Boyles, 2009, p. 7)
These sorts of criteria for robust scientific research, typical across the human sciences, are entirely appropriate for Mode Seven science and set minimum standards
for Mode Eight science. Educational research benefits from its own implementation of those experimental modes (Yates, 2004; Kincheloe, 2004; Lodico et al., 2010; Smeyers & Smith, 2014) and from its incorporation of knowledge acquired by related human sciences operating with Mode Six, Mode Seven, and Mode Eight. The human sciences, like the life sciences, are far more interdisciplinary than isolated (Lund et al., 2020). Education paradigms floating free from empirical testing or scrutiny from related disciplines only amount to undisciplined speculation.
Scientific Objects, Education Objectives
Motivational and methodological commonalities are pervasive between the human sciences and most scientific fields. Their respective missions may yet be responsible for irredeemable tensions between them. Even if human sciences respect science’s methods and knowledge to understand humanity, scientific theories and scientific applications may be neither humanistic nor historical. If science reduces its subjects down to objects, what happens to educational objectives? Protests against scientific intrusions into humanistic areas try to awaken us to looming threats. Four typical contentions suffice to highlight deep tensions:
Science has to be basically quantitative, awkward at best with the qualitative world unless registered information is rigidly categorizable.
Scientific views have to conceptualize people and their activities in collective terms, reducing particular individuals and deeds to essentialized kinds.
Scientific theories produce regularized formulations ready for routine application to generic objects to achieve standardized outcomes.
Scientific theories postulate or presume lawful relations among natural kinds in order to explain events in deterministic and predictable ways.
On this accounting, it appears that science cannot appreciate uniqueness, difference, diversity, or self-determination. Humanistic disciplines by their nature must respect and protect those values. All the same, different disciplines have different practical means for achieving their valued ends. This is just as true for the natural sciences as the human sciences. Most scientific fields never approach the mechanistic models of Mode Nine science, while offering practical approaches to any number of human problems. As for education, it has its own practicalities as well as its principles to jointly consider. Surely there must be common practical ground, and educational research operates well there. The practical mission of a human science, such as education in its research mode, cannot get fixated on uniqueness or diversity for its own sake, since it seeks sharable lessons and teaching practices applicable to groups of learners displaying much in common already. Finding commonalities among groupings is just as much a human affair as it is a scientific matter. The notion that science turns everything it touches into cold dead objects is entirely unfair and demonstrably untrue for most scientific fields. Let us review Dewey’s insistence upon the broad meaning to “science” with his reminder in full:
There are those who would restrict the term to mathematics or to disciplines in which exact results can be determined by rigorous methods of demonstration. Such a conception limits even the claim of physics and chemistry to be sciences, for according to it the only scientific portion of these subjects is the strictly mathematical. The position of what are ordinarily termed the biological sciences is even more dubious, while social subjects and psychology would hardly rank as sciences at all, when measured by this definition. Clearly we must take the idea of science with some latitude. We must take it with sufficient looseness to include all the subjects that are usually regarded as sciences. The important thing is to discover those traits in virtue of which various fields are called scientific. When we raise the question in this way, we are led to put emphasis upon methods of dealing with subject-matter rather than to look for uniform objective traits in subject-matter. From this point of view, science signifies, I take it, the existence of systematic methods of inquiry, which, when they are brought to bear on a range of facts, enable us to understand them better and to control them more intelligently, less haphazardly and with less routine. (Dewey, 1929/1984, pp. 3–4)
As Dewey knew well, there is no such singular thing as “science” but only a multitude of scientific approaches to particular problems and inquiries. Educational research would indeed be irresponsible in imposing some singular notion of scientific method that does not actually exist in the sciences or in seeking greater uniformity or precision than its subject matter can bear. To find a good fit between teaching and learning, the ways that a group of learners are alike supply the clues for discerning how they can learn best. Experimenting with groups to discover their shared attributes, abilities, and attainments shows no disrespect to any among them. If some individuals are too distinctive to belong to this grouping or that, then new research groupings would be the intelligent response, instead of raising old gripes against science. Disputes between education and science can be amicably settled. Perhaps the intractable arguments over science in education erupt at the internal borders among educational policy, educational administration, and educational research. Undisciplined education permits these debates to take over the entire field’s agendas. One such dispute is fueled by the view that encouraging educational research to be more scientific has the effect of endorsing the restriction of education to standardized teaching and testing. Another oft-heard view instead holds that encouraging educational research to empower teachers as co-investigators has the effect of abandoning the administration of schooling to chaotic methods and outcomes. Education must indeed deliberate about abstract educational goals, but placing the blame on empirical research is a fallacious distraction and a disservice to those most in need of education.
Conclusion
For both education and science, mind is the mission. Through its own traditions, research programs, and collaborations with other human sciences, education is a discipline displaying an unbounded potency for advancing human understanding and achievement.
(1) What is scientific and what is educational are unified at their root by exploratory discovery. (2) Education is a humanistic discipline about culture rather than a scientific field about nature. (3) Learners and teaching practices can be studied by the human sciences. (4) Education’s humanistic ends are aided by knowledge about the capacities of those to be educated. Inhuman reductionism cannot be the agenda of human sciences adapting scientific methods to their missions (Kagan, 2009; Slingerland, 2008). Complaints over scientism disrupting education or distorting the discipline of education can easily be overblown, and they are often as ideologically motivated as the alleged scientism. The label of scientism is thrown at so many different abstractions (Shook, 2015) for polemical purposes that it is becoming meaningless in academic discourse. Scientific methods and modes, by contrast, are easily discriminated. Nonideological education, as observed in previous sections, has to foster explorative, experimental, and expansive opportunities for every mind. In conclusion, education and science are far from opposed, sharing the goal of discovery to empower humanity. Science strives for knowledge alongside education, and knowing the world is never contrary to knowing thyself. Learning prepared for ever-more learning, and applied to every art worth doing, remains free.
References
Adcock, J. M., & Stein, S. L. (2014). Cold cases: Evaluation models with follow-up strategies for investigators. CRC Press.
Aliseda, A. (2017). The logic of abduction: An introduction. In L. Magnani & T. Bertolotti (Eds.), Handbook of model-based science (pp. 219–230). Springer.
Backhouse, R. E., & Fontaine, P. (Eds.). (2010). The history of the social sciences since 1945. Cambridge University Press.
Baez, B., & Boyles, D. (2009). The politics of inquiry: Education research and the “culture of science”. SUNY Press.
Biesta, G. (2010). Pragmatism and the philosophical foundations of mixed methods research. In A. Tashakkori & C. Teddlie (Eds.), Sage handbook of mixed methods in social and behavioral research (Vol. 1, pp. 95–118). SAGE.
Biesta, G. J., & Burbules, N. C. (2003). Pragmatism and educational research. Rowman & Littlefield.
Bouterse, J., & Karstens, B. (2015). A diversity of divisions: Tracing the history of the demarcation between the sciences and the humanities. Isis, 106(2), 341–352.
Bridges, D. (2017). ‘Two cultures’ revisited: Science (‘scientism’) and the humanities in the construction of educational understanding. In D. Bridges (Ed.), Philosophy in educational research (pp. 35–55). Springer.
Bridges, D., & Thompson, C. (2011). From the scientistic to the humanistic in the construction of contemporary educational knowledge. European Educational Research Journal, 10(3), 304–321.
Carr, D. (2005). Making sense of education: An introduction to the philosophy and theory of education and teaching. Routledge.
Carr, W. (2006). Education without theory. British Journal of Educational Studies, 54(2), 136–159.
Carson, D. (2009). The abduction of Sherlock Holmes. International Journal of Police Science & Management, 11(2), 193–202.
Collingwood, R. G. (1946). The idea of history. Oxford University Press.
Darling, J., & Nordenbo, S. E. (2003). Progressivism. In N. Blake, P. Smeyers, R. Smith, & P. Standish (Eds.), The Blackwell guide to the philosophy of education (pp. 288–308). Blackwell.
Dewey, J. (1916). Democracy and education. The Free Press.
Dewey, J. (1929/1984). The sources of a science of education. In J. A. Boydston (Ed.), The later works of John Dewey (Vol. 5, pp. 3–40). Southern Illinois University Press.
Diamond, J., & Robinson, A. (Eds.). (2010). Natural experiments of history. Harvard University Press.
Fisher, D., & Fisher, B. (2012). Techniques of crime scene investigation (8th ed.). CRC Press.
Furlong, J., & Lawn, M. (2010). Disciplines of education: Their role in the future of education research. Routledge.
Gorard, S., & Taylor, C. (2004). Combining methods in educational and social research. McGraw-Hill International.
Grand, J. A., Rogelberg, S. G., Allen, T. D., Landis, R. S., Reynolds, D. H., Scott, J. C., Tonidandel, S., & Truxillo, D. M. (2018). A systems-based approach to fostering robust science in industrial-organizational psychology. Industrial and Organizational Psychology, 11(1), 4–42.
Hall, J. N. (2013). Pragmatism, evidence, and mixed methods evaluation. New Directions for Evaluation, 138, 15–26.
Hester, S., & Eglin, P. (2017). A sociology of crime. Routledge.
Jackson, P. (2011). What is education? University of Chicago Press.
Kagan, J. (2009). The three cultures: Natural sciences, social sciences, and the humanities in the 21st century. Cambridge University Press.
Kincheloe, J. (2004). Rigour and complexity in educational research. McGraw-Hill International.
Lagemann, E. C. (2000). An elusive science: The troubling history of education research. University of Chicago Press.
Lodico, M. G., Spaulding, D. T., & Voegtle, K. H. (2010). Methods in educational research: From theory to practice. Wiley.
Lund, K., Jeong, H., Grauwin, S., & Jensen, P. (2020). Research in education draws widely from the social sciences and humanities. Frontiers in Education. https://doi.org/10.3389/feduc.2020.544194
Machin, D., Fayers, P., & Tai, B. C. (2021). Randomised clinical trials: Design, practice and reporting (2nd ed.). Wiley-Blackwell.
Magnani, L. (2010). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Moore, T. (2010). Philosophy of education. Routledge.
National Research Council. (2002). Scientific research in education, Ed. R. J. Shavelson & L. Towne. National Academy Press.
National Research Council. (2012). Discipline-based education research: Understanding and improving learning in undergraduate science and engineering. The National Academies Press.
Niaz, M. (2008). A rationale for mixed methods research programmes in education. Philosophy of Education, 42(2), 287–305.
Noddings, N. (2015). Philosophy of education (4th ed.). Abingdon, UK and New York.
Peirce, C. S. (1932). The collected papers of Charles Sanders Peirce, Vol. 1, Ed. C. Hartshorne & P. Weiss. Harvard University Press.
Peirce, C. S. (1934). The collected papers of Charles Sanders Peirce, Vol. 5, Ed. C. Hartshorne & P. Weiss. Harvard University Press.
Peirce, C. S. (1958). The collected papers of Charles Sanders Peirce, Vol. 7, Ed. A. W. Burks. Harvard University Press.
Peters, M., Reid, A., & Hart, E. (Eds.). (2014). A companion to research in education. SpringerScience.
Roth, R. (2012). Scientific history and experimental history. Journal of Interdisciplinary History, 43(3), 443–458.
Rowbottom, D. P. (2014). Educational research as science? In A. D. Reid, P. Hart, & M. Peters (Eds.), A companion to research in education (pp. 145–153). Springer.
Rowbottom, D. P., & Aiston, S. J. (2006). The myth of ‘scientific method’ in contemporary educational research. Journal of Philosophy of Education, 40(2), 137–156.
Shook, J. R. (2015). Spelling out scientism, A–Z. In M. Pigliucci (Ed.), Scientistic chronicles: Exploring the limits, if any, of the scientific enterprise (pp. 17–24). ScientiaSalon.Org.
Shook, J. R. (2021a). Abduction, complex inferences, and emergent heuristics of scientific inquiry. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action (pp. 177–206). Springer.
Shook, J. R. (2021b). Abduction, the logic of scientific creativity, and scientific realism. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action (pp. 207–227). Springer.
Slingerland, E. (2008). What science offers the humanities: Integrating body and culture. Cambridge University Press.
Slingerland, E., & Collard, M. (2011). Creating consilience: Integrating the sciences and the humanities. Oxford University Press.
Smeyers, P., & Smith, R. (2014). Understanding education and educational research. Cambridge University Press.
Woods, R., & Barrow, R. (2006). An introduction to philosophy of education (4th ed.). Routledge.
Yates, L. (2004). What does good educational research look like? Situating a field and its practices. McGraw-Hill International.
Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching of School Scientific Explanation and Argumentation
52
Agustín Adúriz-Bravo and Leonardo González Galli
Contents
Introduction
Nature and Function of Abduction in Science and in Science Education
  Abduction and the Scientific Methodology
  Abductive Reasoning in School Scientific Explanation and Argumentation
Examining the Role of Abduction in Charles Darwin’s Works
  Studies on Abductive Reasoning in Darwin’s Formulation of His Evolutionary Theory
  Abductive Reconstruction of Passages from Darwin’s Origin
Conclusions: The Power of Abduction in Didactics of Science
References
Abstract
In this chapter, it is proposed that science teachers can foster students’ construction of model-based scientific explanations and argumentations through the intentional use of abductive reasoning. This kind of reasoning could be introduced in the science curriculum as a central “mode of thinking” when discussing the scientific methodology. School scientific argumentation is here understood as an “explanation of a scientific explanation,” where the connection between a natural phenomenon under scrutiny and a theoretical model (semantically conceived) to account for it is made explicit. It is contended that, in many cases of acknowledged relevance for school science, the “ascent” from evidence to model can be characterized as abduction (either in a broad or in a narrow sense). Accordingly, a suggestion is here advanced to teach the specific mechanics of the participation of abductive inferences in explanation and argumentation in (mainly secondary or tertiary) science classes through the use of paradigmatic cases (“epitomes”) taken from the history of science. Along this line, selected excerpts from Charles Darwin’s book On the Origin of Species (Darwin (1859) On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life, 1st edn. John Murray, London) are employed as such epitomes for biology classes. Following the proposals of several authors, it is shown how a number of theoretical propositions contained in that book can be reconstructed as the conclusions from pieces of abductive reasoning that in many cases use intermediary analogies. Together with the analysis of the examples, the possible formative value of such “abductive reconstructions” for science education is explicated.

A. Adúriz-Bravo · L. González Galli
CONICET/Instituto de Investigaciones CeFIEC, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_50
Keywords
Abduction · Analogy · Scientific explanation and argumentation · Theoretical models · Evolutionary theory · Charles Darwin · On the Origin of Species
Introduction
The main aim of this chapter is to discuss a possible role for abduction – conceptualized as a “mode of thinking” (Adúriz-Bravo, 2015) – in the construction of model-based scientific explanations and argumentations in science education. One of the theses assumed here is that abductive reasoning plays a major role in the production of scientists’ science (see Thagard, 1978, 2005; Giere, 1991; Psillos, 2002; Samaja, 2005; Oh, 2019); it is therefore strongly suggested that it should be accorded a prominent role in science teaching (Adúriz-Bravo, 2001, 2003, 2015, 2016; Adúriz-Bravo & Izquierdo-Aymerich, 2009; Sans Pinillos & Adúriz-Bravo, 2021; Adúriz-Bravo & Sans Pinillos, in press). In the view sustained in this chapter, classroom discussion around the nature and function of abductive inferences can provide substantive support for meaningful teaching of the sanctioned scientific models (current or from the past), of scientific modeling understood as a methodological competence, and of the so-called nature of science (known by its acronym NOS). In relation to this last point, abduction could be considered an indispensable ingredient in an epistemologically sound characterization of the scientific methodology for educational purposes (Adúriz-Bravo & Izquierdo-Aymerich, 2009; Lawson, 2010; Oh, 2019).
Additionally, in this chapter, the abductive mode of thinking is used as a “connecting thread” to bring together a series of teaching strategies that are currently favored within didactics of science (i.e., science education as an academic discipline): models and modeling (Sensevy et al., 2008); use of science stories (Laçin-Şimşek, 2019); metaphors and analogies (Niebert et al., 2012); scientific language, genres, and text types (Nygård Larsson & Jakobsson, 2020); and explicit and reflective teaching of the nature of science (Habiby et al., 2020). The proposal here is that science teachers can foster the construction of solid model-based explanations and argumentations in students of the different educational levels (with special focus on secondary and tertiary education) through the explicit incorporation of abductive inferences, which in many cases also include the use of analogies and analogical reasoning.
It is suggested that, in order to scaffold school scientific explanation and argumentation – understood as cognitive-linguistic competences indispensable in the development of citizenship (Sengul, 2019) – science teachers can explicitly discuss some aspects of the nature of scientific methodology. In this appraisal of the “school scientific method,” the epistemological mechanics of abductive reasoning could be examined through critically analyzing paradigmatic examples (“epitomes”) of historical explanations in science. In this chapter, a number of such epitomes are constructed by modeling some of Charles Darwin’s ideas formulated in his celebrated 1859 book On the Origin of Species (from now on, Origin) as abductive pieces of reasoning, showing the role that scientific models, evidences, and analogies play therein. This modeling, or “abductive reconstruction,” of evolutionary ideas is accompanied by an explication of its possible formative value for science students. Darwin’s magnum opus is chosen because it contains the detailed development and justification of a theory of undeniable social prestige, undoubted centrality in science curricula, and recognized complexities for teaching (González Galli et al., 2020; Pérez et al., 2021), which is widely seen by philosophers of science as being the result of “what Peirce called an ‘abduction’” (Putnam, 1981: 198).
The analysis performed in this chapter on that theory can of course be extended to those contained in other foundational scientific texts from biology or from physics, chemistry, geology, etc. (see Adúriz-Bravo & Izquierdo-Aymerich, 2009 and Adúriz-Bravo, 2013b, 2016, for further abductive reconstructions). For the last two decades, several scholars within didactics of science have been advancing different innovative approaches to science teaching that resort to a carefully designed combination of the strategies mentioned above (see, as an interesting example, Leite et al., 2020). Clement and Núñez Oviedo (2003) have even added to those strategies that of abduction, which is here taken instead as an epistemological “lens” to give cohesion to them, in the following way.
In didactical research, the process of school scientific modeling has been characterized by some authors as the “projection” (or “mapping”) of a theoretical model onto a phenomenon of interest that is being studied (Justi & Gilbert, 1999; Izquierdo-Aymerich & Adúriz-Bravo, 2003; Adúriz-Bravo, 2020); the model would then work as a map (a possible “state of affairs”) that satisfactorily accounts for the phenomenon and leads to further research. This process of model projection is placed, in the present chapter, at the core of scientific explanation and argumentation, the latter being understood as an “explanation of the explanation,” where the relations between the phenomenon and the (purportedly) explanatory model in question are made explicit. A key contention here is that the competence of school scientific modeling, which implies the “ascent” from evidence to model, can be conceptualized as the establishment of a case-rule relationship (Thagard, 2005), an idea very much developed by the American pragmatist philosopher Charles Sanders Peirce (1839–1914) in his writings on abductive inference. Hence arises the possibility of reconstructing that ascent as a process of abduction (in a broad or in a narrow sense, as will be specified below), and instantiating it with passages from a scientific masterpiece.
For the didactical transposition of the notion of abduction to science education that is undertaken in this chapter, it seems necessary to make, from the very beginning, two theoretical choices, with the effect of inevitably reducing the diversity of conceptualizations offered in the vast literature on the subject. In this sense, abduction in school science will be here understood as an inference: (1) that involves “some form of explanatory reasoning” (Douven, 2021: n/p, an influential text that overtly adopts the “explanationist” perspective), and (2) that is partially identifiable, in one of its broadest meanings, with the methodological tool known as “inference to the best explanation” (as in Campanaro, 2021, an article from archaeology, a discipline that can provide powerful teaching analogies with natural sciences).
Finally, given the inclusion of this chapter in a handbook intentionally delving into the intricate cognitive aspects of abductive reasoning, another proviso is due: the approach to abduction embraced here has a recognizable logico-methodological character (see the parallel with Costa, 2009), in coherence with the stated purposes for the incorporation of this mode of thinking in science education. A few comments on the relationship of such an approach to perspectives strictly centering on cognition will be made at the beginning of Section “Examining the Role of Abduction in Charles Darwin’s Works”.
Nature and Function of Abduction in Science and in Science Education
Abduction and the Scientific Methodology
It is certainly not easy to offer a “closed” definition of abduction as a form of inference, as can be seen in the numerous and varied chapters of this book. The debate around the nature and function of abduction was rekindled in the last century and produced suggestive taxonomies, which in turn became the input for the contemporary debate on this kind of reasoning (see Aliseda, 2006). It is now generally assumed that the best option to characterize abduction is to take into consideration the particularities of the contexts in which it takes place, so that a strong family resemblance between the different meanings and uses of the term can be identified (Adúriz-Bravo & Sans Pinillos, in press).
In general terms, abduction refers to a genuine “mode” of inference, distinct from deduction and induction (Kapitan, 1992), that can be defined, according to the explanationist perspective assumed here, as “the process of formulating a hypothesis which, if it were true, would provide an explanation for [a] phenomenon” (Clement & Núñez Oviedo, 2003: 2) that puzzles the observer and needs to be accounted for. The inference that “it must have rained” from the observation of the (very likely unexpected and annoying) facts of wet sidewalks and people wearing raincoats and carrying umbrellas is probably the most cited example in the teaching of abductive explanation.
The term “abduction” was most probably coined by Peirce (cf. Samaja, 2005; Psillos, 2002; Konrad, 2004) in his early papers on inference, in the last third of the nineteenth century (see, for instance, Peirce, 1931–1958: 2, 623). When creating it, he used the template provided by the classical nomenclature of “induction” and “deduction.” In Peirce’s conception, abduction (the act of “taking away” a hypothesis from a phenomenon) is a kind of reasoning that “merely suggests that something may be” (Peirce, 1931–1958: 5, 172). It results in a hypothetical conclusion that is neither necessary in deductive terms nor even probable in inductive terms. The conclusion of a piece of abductive reasoning is neither a direct derivation nor a generalization or expansion; it is only a plausible proposition, one with enough explanatory virtue to be worth investigating further.
The key characteristic of the nature of abduction is that it is a genuinely ampliative inference. By ampliative, it is meant that “something new is generated in its conclusion” (Blachowicz, 1998: 60). In this sense, it could be safely stated that abductive reasoning produces decidedly new knowledge that was not at all implicit in the premises. In a scientific explanation, this means that a hypothesis abduced with the specific purpose of explaining a puzzling situation (a problem with a set of data collected through observation and experimentation) is merely “suggested” by the data, which, through the lens of theory, become evidence for it.
Transformation of data into evidence, which – according to the theoretical framework presented here – is done with the aid of a model, is the conceptual element situated at the core of the reconstructions of Darwin’s ideas that will be provided. Indeed, it is contended that Peirce’s theoretical perspective on abduction allows giving it a central function in scientific modeling and that this conceptualization can also be didactically transposed to school science (Adúriz-Bravo, 2003, 2013b). Thus, this chapter looks for what can be considered, in Brian Haig’s (2005) terms, an “abductive theory of scientific method” with educational value.
A theoretical model (as it is understood in the so-called semantic conception of scientific theories of the last quarter of the twentieth century: Giere, 1988) would work as an abstract map of the scientific problem that needs to be solved. Data in the problem, transformed into evidences, would point at the model as the most satisfactory solution. This “ascent” from evidence to model could be understood as an abductive inference, in senses that can be broader or narrower, as will be explained below. School scientific argumentation is here conceptualized as the production of a text in which that abductive ascent is made explicit by showing the relations between available, theoretically reconstructed evidences around a problematic phenomenon and the school theoretical model that is being taught.
This said, several authors (cf. Magnani, 2001; Clement & Núñez Oviedo, 2003; Delrieux, 2004; Aliseda, 2006) converge in distinguishing two main meanings for the concept of abduction (a broad and a narrow sense), though the pragmatic extent of such a distinction is not the same in all of them. As a synthesis of different positions that are currently available, two big categories of abductive inference and reasoning will be discussed here: (1) abduction sensu lato, a category that includes the broader meanings, which often imply conflating it with inference to the best explanation, and that identifies it with any general process of hypothesis production, and (2) abduction sensu stricto, focused on its narrower definitions, which capture it as a form of syllogism first suggested by Aristotle and then developed by Peirce (cf. Auletta, 2017).
Abduction Sensu Lato
In its most general meaning, abduction can be seen as the process of hypothesizing: generating, evaluating, and revising hypotheses (Magnani, 2001). More technically, a broad portrayal of abduction as “the inference process that goes from observations to explanations within a more general context or theoretical framework” (Delrieux, 2004: 412) will be adopted here, although the explanationism introduced in this conceptualization – as was already noted – is directly contested by some influential authors (for instance, the Canadian logician John H. Woods, 2013). As advanced, the centrality accorded to the theoretical background for the inference, highlighted by Atocha Aliseda (2006), is what permits a reconstruction of the school scientific method as an abductive process of theoretical modeling (Adúriz-Bravo, 2020; Upmeier zu Belzen et al., 2021). In this sense, any coordinated set of mixed cognitive-linguistic procedures that leads to a hypothetical, inferential, theory-laden, and fundamentally model-based conclusion as a product can in principle qualify as abduction. Revisiting the stereotypical example above, the inference of rain from a disarticulated set of episodic observations in the street fits this characterization of abduction as model-based reasoning, since the inferring subject needs a “guide” (theory, in its etymological sense of “view”) to collect or discard those observations and to connect the ones that have been collected.
This portrayal of abduction follows Clement and Núñez Oviedo (2003: 9) in considering it a process of “open-ended design under constraints,” i.e., the creation – with mainly explanatory aims – of an invented proposition that should
1. fit into boundaries provided by preexistent theoretical frameworks;
2. “cover” most of the available evidence (though there are always things that cannot be accounted for); and
3. constitute the expression of a model, understood as a non-linguistic, mainly imagistic, entity (as in Giere, 1988).
It is this first, broader sense of abduction that is often equated with the so-called inference to the best explanation (cf. Iranzo, 2009). According to Peter Lipton’s (2000) conception of this form of inference, scientists work out what to infer from the evidence by thinking about what would actually explain that evidence; the “ability” of a possible hypothesis to provide a satisfactory explanation is taken as a sign that the hypothesis is correct. Abduction, once mapped onto this kind of explanatory process, entails selecting between hypotheses: the abduced conclusion is compared to “alternatives” that can be more or less explicit and diffuse (cf. Rodrigues da Silva & Castilho, 2015).
Just as in the case of explanationism, it should be noted that the identification presented here between abduction and inference to the best explanation is rather contentious in contemporary philosophy of science (see, for instance, Campos, 2011). In any case, and for the educational purposes of this chapter, the potentially explanatory function of abductive reasoning and its general structure as a mechanism of model evaluation permit a reconstruction of science stories that is highly formative in terms of giving an epistemologically solid portrayal of the nature of science-in-the-making.
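The idea of abduction as selection among rival hypotheses under constraints can be made concrete with a toy rendering of the rain example. The sketch below is only an illustration under strong assumptions: the class `Hypothesis`, the functions `coverage` and `abduce`, the rival hypotheses, and the crude score (fraction of observations covered) are all hypothetical choices made here, not a procedure proposed by Clement and Núñez Oviedo or by Lipton. It captures only the first two constraints (fit with background theory and coverage of evidence); the third, the hypothesis being the expression of a model, has no counterpart in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    predicts: frozenset           # observations that would be "a matter of course"
    fits_background: bool = True  # assumed flag: compatible with prior theory


def coverage(h, evidence):
    """Fraction of the available evidence the hypothesis accounts for."""
    return len(h.predicts & evidence) / len(evidence)


def abduce(candidates, evidence):
    """Rank admissible candidates by coverage, best first (illustrative only)."""
    admissible = [h for h in candidates if h.fits_background]
    return sorted(admissible, key=lambda h: coverage(h, evidence), reverse=True)


# Episodic observations in the street (the stereotypical example).
evidence = frozenset({"wet sidewalks", "raincoats", "umbrellas"})

# Hypothetical rivals, invented for this sketch.
candidates = [
    Hypothesis("street cleaning", frozenset({"wet sidewalks"})),
    Hypothesis("it rained", frozenset({"wet sidewalks", "raincoats", "umbrellas"})),
    Hypothesis("sidewalks sweat", frozenset({"wet sidewalks"}), fits_background=False),
]

ranking = abduce(candidates, evidence)
best = ranking[0]
print(best.name, coverage(best, evidence))
```

The top-ranked hypothesis is not thereby established; in keeping with the text, it is only the candidate that most deserves further investigation, and a richer score would also weigh considerations such as simplicity or plausibility.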
Abduction Sensu Stricto
One of the “narrower” meanings of abduction maps it onto a kind of syllogism (or, more properly, of syllogism-like fallacy) structurally similar to the classical fallacy of affirming the consequent (Plutynski, 2011). Such syllogistic presentation can be described as a “reverse deduction.” Peirce’s canonical reconstruction of abduction as the “third” syllogism (completing the circle of deduction and induction), presented in his Harvard lectures at the beginning of the twentieth century, is as follows:

The surprising fact C is observed.
But, if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true.
(Peirce, 1931–1958: 5, 189; modernized punctuation)
Norwood R. Hanson (1958), when characterizing the abductive inference closely following Peirce’s third syllogism, shows that the abduced hypothesis (but none of the possible alternatives) is present as a premise. This can be seen in the following example adapted from the Argentinian philosopher of science Juan Samaja (2005), which more clearly shows the “reverse deduction” or “retroduction” pattern:

Jane wears a blue shirt.
All bus drivers wear blue shirts.
Then, Jane is a bus driver.
As James Blachowicz (1996) points out, in such an approach to abduction sensu stricto, this particular formalization – from the many provided by Peirce – can hardly be taken as ampliative, since “the hypothesis does not emerge ampliatively (for the first time) in the conclusion” (Blachowicz, 1996: 149); it is present as a premise. This contradiction is caused by the fact that the abductive syllogism focuses only on the “part of the process whereby already generated hypotheses are judged in terms of their plausibility, simplicity, etc.” (Blachowicz, 1996: 141). However, in Hanson’s and Peirce’s conceptions of abduction, there is genuine “amplification” of content from the moment when the “major” premise (“All bus drivers wear blue shirts”) is chosen among others and on the basis of some selected and reconstructed facts (“Jane wears a blue shirt”) that capture attention. The ampliativeness of an abductive syllogism is shown in the fact that the “case” in the conclusion (“Jane is a bus driver”) is a simplified expression of a more complex, inferential construction: “The hypothesis that Jane is a bus driver is fruitful (as opposed to others) and deserves to be further investigated (in relation with the facts).”
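The three syllogistic patterns can be set side by side. The following is a schematic sketch only: it uses Samaja’s bus-driver example, and the predicate letters $B$ (“is a bus driver”) and $S$ (“wears a blue shirt”), the constant $j$ (Jane), and the sampled individuals $a_1, \dots, a_n$ in the inductive pattern are notation assumed here for illustration, not Peirce’s or Samaja’s own symbolism.

```latex
% Requires amsmath (align*, \text) and amssymb (\leadsto).
\begin{align*}
\text{Deduction (rule, case} \Rightarrow \text{result):}\quad
  & \forall x\,\bigl(B(x) \to S(x)\bigr),\; B(j) \;\vdash\; S(j) \\
\text{Induction (cases, results} \Rightarrow \text{rule):}\quad
  & B(a_1) \wedge S(a_1),\,\dots,\,B(a_n) \wedge S(a_n) \;\leadsto\; \forall x\,\bigl(B(x) \to S(x)\bigr) \\
\text{Abduction (rule, result} \Rightarrow \text{case):}\quad
  & \forall x\,\bigl(B(x) \to S(x)\bigr),\; S(j) \;\leadsto\; B(j)
\end{align*}
```

Only the first pattern is deductively valid ($\vdash$); the wavy arrow marks a defeasible step. Read deductively, the third pattern is the fallacy of affirming the consequent, which is why its conclusion is better paraphrased, in line with Peirce’s formulation, as “there is reason to suspect that $B(j)$.”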
Abductive Reasoning in School Scientific Explanation and Argumentation
As was hinted in the previous section, a number of authors from the philosophy of science, and also a handful from didactics of science, follow Peirce in considering abduction as a powerful model of how scientists work when they do inquiry (refer to Harré, 1986 and Upmeier zu Belzen et al., 2021 as noteworthy examples in the two scholarly fields). This abduction-based conception of the scientific methodology can be condensed as follows:

The most important extension Peirce made of his earliest views on what deduction, induction, and abduction involved was to integrate the three argument forms into his view of the scientific method. As so integrated, deduction, induction, and abduction are not simply argument forms any more: they are phases of scientific methodology [...]. Scientific method begins with abduction or hypothesis: because of some perhaps surprising or puzzling phenomenon, a conjecture or hypothesis is made about what is actually going on. This hypothesis should be such as to explain the surprising phenomenon, such as to render the phenomenon more or less a matter of course if the hypothesis should be true. (Burch, 2021: n/p; emphasis added)
Considerations like these support the proposal in this chapter to model relevant aspects of the nature of scientific inquiry, as they appear in scientific texts, using some of Peirce’s patterns of abductive inference; as noted, such an approach has been only timidly proposed in the literature on science education. The contention here is that a constitutive step of the scientific methodology that should be taught in science classes at the different educational levels is that of the “provisional adoption of an explanatory hypothesis” (Peirce, 1931-1958: 4, 541) through abduction. Justification of this “hypothetical guess” that emerges as the abduced conclusion is in turn elaborated into a scientific argumentation (Wirth, 1999), and this model of scientific thinking is transposable to school science.
In previous publications that are direct antecedents for the proposal here (Adúriz-Bravo, 2001, 2003, 2015; Adúriz-Bravo & Sans Pinillos, in press), a comprehensive theoretical “elucidation” of the concept of abduction was developed in order to characterize its participation in scientific reasoning in school science. Based on an epistemological examination of scientists’ science, abductive inference was identified as a key function in the production of new knowledge during the process of modeling. In the theoretical view adopted in this chapter, scientists, starting from a concrete, relevant problem in their discipline, obtain and propose to the community a “plausible” solution – in the form of a hypothesis – that arises from a chain of inferences, in a process that can be seen as “evidence-based thinking.” This mode of thinking cannot be totally reduced to deduction or induction, nor can it be fully mapped to Jerome Bruner’s logical or narrative rationalities (Adúriz-Bravo, 2015, 2016). After obtaining their hypothesis, scientists proceed to show its plausibility as a solution to the problem against a background of already accepted knowledge.
This is done through the deployment of a scientific argument, in which evidences and models are carefully connected. The piece of reasoning contained in such an
argument turns out to be non-demonstrative (non-deductive), cannot be identified with an inductive generalization, and is characterized by its high degree of ampliativeness. From this point of departure, and with different theoretical tools, a didactical transposition of this conception of the abductive mode of thinking was produced for science education, especially aimed at the professionalization of science teachers. The process of performing this transposition involved collecting possible epitomes of abductive reasoning from the history of science, chosen to foster discussion among teachers on how to design didactical materials and conduct lessons about the nature of science. A central ingredient in the proposal for pre- and in-service science teacher education is the examination of cases of teaching of scientific concepts that, in their historical construction, were presumably the results of abductive inferences. On the basis of all these considerations, this chapter uses a variety of Peirce’s presentations of abductive inference as a template to reconstruct science stories of “discoveries” or “inventions” (here, Darwin’s achievements in the field of evolution) for the purpose of science education. A first reconstruction of Darwin’s key theoretical ideas as conclusions of model-based reasoning starting from evidence will set the standard for how his arguments can be understood as epitomes of abduction in science classes. Accordingly, the example that follows condenses suggestions, directed to science teacher education, on how to work on abductive reasoning as a mode of thinking when explicitly addressing the nature of science in science teaching. Thus, it can be shown to teachers that, in the introduction to Origin, the canonical “indices” of a piece of abductive reasoning can be profusely identified.
Such indices are linguistic constructions (italicized in the quotations that follow) that are very similar to those found in other famous texts, for instance, by Maria Skłodowska-Curie or Ernest Rutherford (see Adúriz-Bravo, 2013b, 2015; Adúriz-Bravo & Izquierdo-Aymerich, 2009). In the first place, Darwin identifies a set of puzzling facts in need of explanation: “When on board H.M.S. ‘Beagle’, as naturalist, I was much struck with certain facts in the distribution of the inhabitants of South America, and in the geological relations of the present to the past inhabitants of that continent” (Darwin, 1859: 1). These facts are theoretically reconstructed so that a scientific problem relevant to the community emerges, the “question [ . . . ] on the origin of species – that mystery of mysteries, as it has been called by one of our greatest philosophers[, namely Herschel]” (ibid.). With this point of departure, Darwin initiates what can be safely seen as a series of typical abductive inferences: he “allowed [himself] to speculate on the subject, and drew up [ . . . ] a sketch of the conclusions, which then seemed to [him] probable” (ibid.). At the same time, he asserts that he will accompany those conclusions with all the facts and references on which they have been grounded (ibid., p. 2), elements that can be taken here as evidence (although this word first appears on p. 15) in favor of the abduced propositions. The two (arguably) most fundamental of such propositions are clearly stated in the initial passages of the text. In the first place, Darwin says that:
[i]n considering the origin of species, it is quite conceivable that a naturalist, reflecting on the mutual affinities of organic beings, on their embryological relations, their geographical distribution, geological succession, and other such facts, might come to the conclusion that each species had not been independently created, but had descended, like varieties, from other species. (Darwin, 1859: 3)
The second proposition, on “the means of modification and coadaptation” (ibid., p. 4), required of Darwin “a careful study of domesticated animals and of cultivated plants” (ibid.) and the use of “the doctrine of Malthus, applied to the whole animal and vegetable kingdoms” (ibid., p. 5). This theoretical transfer constitutes a paramount element, here taken to be a probable indicator of abduction. Consistent application of such a system of analogies and instantiations would, according to Darwin, “offer the best chance of making out this obscure problem” (ibid., p. 4), affording “the best and safest clue” (ibid.). From the collected materials, an abduced conclusion of composite structure “follows”: that
[(1)] any being, if it vary however slightly in any manner profitable to itself, under the complex and sometimes varying conditions of life, will have a better chance of surviving, and thus be naturally selected[, and that (2), from] the strong principle of inheritance, any selected variety will tend to propagate its new and modified form. (Darwin, 1859: 5; original emphasis)
Darwin finishes the brief introduction to his book declaring full adherence to the fruitfulness of his hypotheses: he entertains no doubt, “after the most deliberate study and dispassionate judgment” (ibid., p. 6) of which he is capable, that the standard view of each species having been independently created – a view that he openly confesses to have formerly sustained – is erroneous and that “species are not immutable” (ibid.), but have been modified by natural selection. Following the previous example of abductive analysis of a scientist’s inferences expressed in written discourse, which could be referred to as “paradigm,” 12 more reconstructions addressed to teachers will be undertaken in Section “Abductive Reconstruction of Passages from Darwin’s Origin”.
Examining the Role of Abduction in Charles Darwin’s Works
Studies on Abductive Reasoning in Darwin’s Formulation of His Evolutionary Theory
The nature of the inferences that connect empirical, or other kinds of, evidence with theoretical statements in Darwin’s works has been the object of a plethora of analyses and debates, most of them in the field of the philosophy of science, but also from the history of science, linguistics, cognitive science, etc. More specifically, one of the issues under scrutiny, researched at length, has been the reasoning mechanisms that can be identified in his (written) formulation of the model of evolution by natural selection (see Recker, 1987). From the perspective of the studies on scientific cognition, Gruber and Wallace (2001) point out that the very nature of Darwin’s cognitive processes leading to the
construction of his ideas has been taken as an object of analysis. For instance, Gruber’s (1978, 1981) pioneering writings on creativity in scientists make Charles Darwin a paradigmatic case, scrutinizing his notebooks. Gruber and Wallace’s (2001) paper on Darwin, in turn, critically discusses the notion of “insight,” (over)used to account for discovery and invention, and examines the role of the metaphors and analogies interwoven in the model of natural selection. In relation to this last point, these authors recognize that the fact that “[n]o metaphor ever makes a perfect fit between its terms” (Gruber & Wallace, 2001: 349), rather than being a hindrance, may have opened new creative paths in evolution. On the other hand, Robert Keegan (1989) inspects the psychological aspects in Darwin’s works with the aid of the theoretical construct of “thought-forms,” i.e., the idiosyncratic ways of thinking that make it possible to characterize an individual’s cognitive functioning. In his analysis, Keegan also focuses on analogies, which according to him appear when a scientist “applies the expert knowledge in the attempt to understand new areas” (Keegan, 1989: 109). Paul Thagard (2005), an author who will be cited here for providing key epistemological foundations for the inclusion of abductive reasoning in science education, also tackles abduction and analogy from a strong cognitive perspective, seeing them as mechanisms in an integral model of scientific creativity. He locates abductive thinking in a variety of scientific activities, where the standard higher-order cognitive processes identified by psychology and cognitive science are operating:
1. In the analogical construction of theory, where a known case is adapted to solve a new problem
2. In the generation of hypotheses through the use of visualizations or on the basis of rules (such as in the examples of the rainy day and Jane the bus driver discussed above, where the rule is “run backward” to explain a state of affairs)
3. In causal modeling, involving activation of concepts and schemas
This cognitive perspective on abduction, as was anticipated, will not be followed in the chapter, but is compatible with the logical and methodological approach assumed here. Darwin’s pieces of reasoning are analyzed in the form of written texts, and a proposal of semiformal reconstruction is advanced; a considerable number of scholarly publications serve as direct antecedents for this task. Some authors (Thagard, 1978, 1992; Putnam, 1981; Lipton, 2000; Turner, 2000; Magnani, 2001; Okasha, 2002; Kleiner, 2003; Paavola, 2004; Lewens, 2007a, 2007b; Rivadulla, 2007, 2015; Stamos, 2007; Haig, 2008; Andrade, 2009; Duarte Calvo, 2016; Niiniluoto, 2018; Norton, 2021) have defended the idea that Darwin’s evolutionary theses can be modeled as inferences to the best explanation (i.e., what has been here defined as a noteworthy form of abduction sensu lato). Other authors (Ruse, 1979, 2008; Evans, 1984; Richards, 1997; Venville & Treagust, 1997; Shelley, 1999; Sterrett, 2002; Gildenhuys, 2004; Wilner, 2006; Pramling, 2009; Burnett, 2009; Theunissen, 2012) have highlighted the role of analogical reasoning (which, just like abduction, is decidedly ampliative) in Darwin’s
theorization. It has been repeatedly remarked that the analogy between natural and artificial selection occupies a, or perhaps the, central place in Darwin’s arguments in Origin. However, the precise character and extent of the participation of analogies and metaphors in the construction of Darwin’s abductive inferences remain contested (see, for instance, Richards, 1997). In turn, other authors (Recker, 1987; Morrison, 2000; McGrew, 2003; Ruse, 2008) have characterized Darwin’s theory as a fine example of explanation by unification or “consilience” (a term featured in William Whewell’s philosophy of inductive sciences, but not used by Darwin himself in the book to name the criterion of construction for his models). The creation of unifying, “consilient” explanations can be conceptualized as a process of abduction in the broadest possible meaning: that of a markedly ampliative inference (Adúriz-Bravo & Sans Pinillos, in press) seeking an overarching conjecture that is bold (in terms of how far it goes beyond the evidence that supports it), but has the virtue of being found satisfactorily explanatory. Peirce himself could be added, but only to a certain extent, to this list of antecedents. He was 20 years old when Darwin’s book was first published, and so “it is not very easy to think that this did not influence his intellectual life” (Sans Pinillos, 2021: n/p); indeed, an important part of his career as a philosopher was immersed in the early academic and social debates around Darwinism. Nevertheless, his (abundant) mentions of Darwin mostly revolve around metaphysical and ontological matters (cf. Brioschi, 2019), and not much is said in terms of reasoning patterns. A remarkable exception to this generality is a passage where Peirce proposes an analogy between the function of retroduction (one of his forms of abduction) in the gradual development of modern science and the role of “those fortuitous variations in reproduction [ . . . 
] in Darwin’s original theory” (Peirce, 1931-1958: 2, 755) of natural selection. In the following paragraphs, arguments from some of the aforementioned authors are summarized to support the view defended in this chapter: that Darwin’s explicit pieces of reasoning in Origin can be, in many cases, semiformally reconstructed as abductive inferences. As Elizabeth Lloyd (1983) and Michael Ruse (2008) point out, Darwin was well aware of, and participated in, the philosophical discussions of his time around scientific methodology, and he made efforts to show in his writings that he proceeded according to the then accepted methodological criteria in order to derive “good science.” He was reportedly influenced by John F. W. Herschel’s (1792–1871) and Whewell’s (1794–1866) philosophical proposals: he was aware of their writings and exchanged correspondence with them (Honenberger, 2018). Based on their ideas, Darwin sought to produce a causal schema for a wide range of matters that were, in his view, connected to evolution. In general terms, Darwin followed Whewell in assuming that the world was ruled by “natural laws” to be discovered by scientists. Within a system of “consistently empiricist” (Lloyd, 1983: 112) ideas, Darwin attempted to “show” that natural selection was the “vera causa” (initially, in a strong sense specified by Herschel) of the emergence of new species; early in the book, he states that he is “convinced
that Natural Selection has been the main but not exclusive means of modification” (Darwin, 1859: 6; emphasis added). According to the more radical empirical-inductivist positions, a vera causa should be directly perceived through the senses or at least derived without mediation from factors of which we can obtain solid empirical evidence. Consistent with this is the fact that Darwin resorted to a simple, graphic, and probably very persuasive analogy with artificial selection, with the intention of showing his contemporaries that selection in his theory was indeed a kind of cause (in the Herschelian sense) that they could experience or perceive in a more or less straightforward fashion (namely, in the practices of breeders). For Whewell, in contrast, the vera causa did not need to be directly apprehended in reality; in order to identify it, it sufficed to notice what it could actually effect in the world, that is, to recognize its palpable consequences. Whewellian causes are then inferred from a “consilience of inductions,” a sophisticated convergence of empirical generalizations. According to Ruse (2008), Darwin also took into consideration this second, arguably more “rationalist,” notion of causation. Thus, he made efforts to show how natural selection went beyond perceptible forces and itself constituted a consilient conclusion from well-established observations (made by himself or by other scientists and naturalists) in fields as varied as animal breeding (livestock and pets), floriculture and gardening, horticulture and agriculture, cynegetics, veterinary science and medicine, speleology, ornithology, geography, ethnography, government statistics, etc. The hypothesized operation of natural selection in this impressive diversity of empirical realms can be construed as Darwin’s abduced underlying Whewellian cause.
In a letter to the English botanist George Bentham in May 1863, Darwin offers an argument based on consilience as the main grounds for his theoretical model:
In fact, the belief in natural selection must at present be grounded entirely on general considerations: (1) on its being a vera causa, from the struggle for existence, and the certain geological fact that species do somehow change; (2) from the analogy of change under domestication by man’s selection; and (3) chiefly from this view connecting under an intelligible view a host of facts. (Darwin Correspondence Project, 2020: Letter no. 4176, n/p; emphasis added, punctuation modernized)
The interpretation provided in this chapter of Darwin’s repeated use of this kind of justification to accept natural selection as the sought vera causa is consistent with a reconstruction of his “long argument” (i.e., the whole system of hypotheses and their applications to cases; see Kitcher, 1985) as a set of inferences to the best explanation. Indeed, as indicated, there is widespread consensus among philosophers of science on the pertinence of such a reconstruction of Darwin’s ideas. David Stamos (2007) even deems that Origin is a paradigmatic – or epitomic, as it is described here – example of this kind of reasoning, which he considers typical of biology, where “explanation is not, if ever, in terms of laws of nature” (Stamos, 2007: 193). Thagard’s (1992) position is aligned with the previous considerations: for him, Darwin infers (in particular, he abduces) natural selection only as a plausible (as opposed to necessary or probable: Adúriz-Bravo & Sans Pinillos, in press)
explanation, since, in order to make the inference, he needs to assume the occurrence of some facts in the past that cannot be directly accessed and are therefore to be established from the observations, interventions, and extrapolations that can actually be performed in the present. Thagard (1992) also highlights that Darwin calls for his hypothesis of selection to be accepted on the basis of its virtue of consilience of “facts”: its value would then reside in the capacity that it shows to explain an enormous variety of phenomena (geographical distribution of organisms, similarities between embryos of different species, presence of vestigial organs, imperfect nature of adaptations, similarities in the structure of functionally different organic features, existence of a fossil record, etc.) that would otherwise appear not to be connected at all and were therefore often disregarded by creationists. Darwin insists in several passages of Origin that his hypotheses explain more phenomena than a creationist framework, but Thagard (1992) and Ruse (2008) provide grounds for considering that the sheer number of facts illuminated by Darwinian theory is not as important as the astonishing diversity of domains of the life sciences (among others: comparative anatomy and physiology, biogeography, embryology, ecology, paleontology, ethology) where those facts were being studied at the time. It can be said, then, that Darwin provides a theoretical mechanism, natural selection, that is a strong candidate to explain the “evolution” of species, understood itself as a theoretical concept. Such a mechanism is purportedly inferred from a hard core of facts that were considered, as noted, “established” in Darwin’s times (among these, Ernst Mayr (1998) mentions scarcity of resources, uniqueness of individuals, and heritability of many of the individual variations).
In this way, evolutionary hypotheses (the general on transmutation and the specific on natural selection) are, in Thagard’s terms, abductively consilient: they emerge to form a credible and robust explanation. (It is worth noticing here that Philip Kitcher (1985) quite controversially points out that the individual propositions converging in such a consilient explanation are more or less “trivial”; it is their “conjunction” that can be considered Darwin’s original contribution.) Haig (2008), elaborating on Thagard’s ideas, also reckons that inferences to the best explanation are the central mechanism in a “theory of explanatory coherence” to account for the nature of the scientific method, a theory that, according to him, would be totally appropriate to reconstruct Darwin’s modeling. In this philosophical view, an inferred hypothesis is accepted if and when it coheres better overall than its competitors (Thagard, 1989); the aim of the inference is to establish strong explanatory relations between pieces of evidence or facts to be explained. The evaluation of an inference to the best explanation is, in consequence, mainly done in terms of its coherence and consistency. In another article, Haig (2005) claims that Darwin’s theory is based on an analogical abduction (which would be located midway between the broad and the narrow sense as defined in the previous section). He provides a “simplified” reconstruction of the analogical argument, inspired by the template of a Peircean syllogism, as follows:
The hypothesis of evolution by artificial selection was correct in cases of selective domestic breeding. Cases of selective domestic breeding are like cases of the natural evolution of species with respect to the selection process. Therefore, by analogy with the hypothesis of artificial selection, the hypothesis of natural selection might be appropriate in situations where variants are not deliberately selected for. (Haig, 2005: 380)
With all the previous views combined, the validity of Darwin’s theory should then be assessed as a function of its “degree” of coherence, established on the basis of three criteria: consilience (measured as explanatory breadth), simplicity, and analogy (Thagard, 1978). These criteria fall into the category of “virtues” of an explanation, a category that has been thoroughly discussed in the literature on abduction (see, for instance, Magnani, 2001; Aliseda, 2006; Plutynski, 2011; Callaway, 2014). Haig (2005, 2008) identifies the fulfillment of those three criteria in Darwin’s Origin: the consilience of the theory lies in the variety of facts that it could in principle explain (some of them left unexplained or poorly explained by rival theories); its simplicity appears in the economy of the auxiliary assumptions (or ad hoc hypotheses) chosen, for instance, those on the gaps in the fossil record; and the use of a range of analogical inferences broadens its power of unification. Scott Kleiner (2003) also adopts Thagard’s (1992) proposal that an innovative idea in science (a genuine “conceptual change”) entails an increase in explanatory coherence. This author agrees with Elizabeth Lloyd (1983) in considering that Darwin solidly bases his arguments on analogies with observable phenomena, but he states that Darwinian explanations cannot be totally reduced to analogical inferences; there should then exist another kind of inference, of an abductive type. These non-analogical explanations, for Kleiner, are those that involve narratives about facts that are assumed to have occurred in the past; following Kitcher (1993), he calls those narratives “Darwinian histories” and attributes to them an explanatory structure that he labels “causal etiology.” For Kleiner, explanations based on Darwinian histories are not reducible to a mere temporal succession of events; they rather involve a concatenation of those events linked to one another with a syntax that resembles that of a computer program.
(It is worth stating here that the influential American philosopher Daniel Dennett (1995) also makes an analogy between natural selection and an algorithm.) In these explanatory narratives, the “terminating” conditions of a given state of affairs constitute the “initiating” conditions for the next one, so that the whole concatenated series of “productive components” (Kleiner, 2003: 516) constitutes an efficient causal mechanism (satisfying the requisites of the philosophical notion of vera causa; see Costa, 2009) for the generation of a given result in an evolutionary process that needs to be accounted for. This capacity of the Darwinian history to provide an efficient cause of the outputs in final stages is what prompts Kleiner’s analogical use of the biological concept of etiology. Kleiner (2003) also maintains that these etiologies are “causally coherent,” i.e., theoretically consilient; in Doren Recker’s (1987) terms, they prove to possess
“causal efficacy.” For Kleiner, Darwinian theory would entail an increase in explanatory coherence with respect to the creationist natural histories that preceded it (or to Lamarck’s theoretical conception). In this sense, the general architecture of Darwin’s argumentation in favor of the hypothesis of evolution through natural selection would be a case of abduction of a best “available” explanation (Schurz, 2008) in relation to the background knowledge of mid-nineteenth-century Britain. Lipton (2000) starts by recognizing that, in the vast majority of historical cases of scientific research, the relation between evidence and hypotheses cannot be accurately reconstructed with strictly inductive or deductive logic (a realization that was Peirce’s driving force in his postulation of abduction; see Kapitan, 1992). In empirical science, the truth of the evidence does not deductively imply the truth of a hypothesis “covering” it, nor can this truth be induced, or “extended,” from the available information (in which a pattern is, in many cases, hardly recognizable); this is the famous underdetermination of theories by data (see Newton-Smith, 1978; Okasha, 2002). In effect, many important scientific hypotheses, of a high degree of generality and abstraction, are not implied by data; they can prove false despite those data being corroborated. This is the status of an inference to the best explanation that satisfactorily accounts for the phenomena under study. And this is the case, of course, of an abduced conclusion, which retains its hypothetical nature during inquiry. In this approach, the central idea is that explanatory considerations direct the system of inferences, in such a way that a scientist abduces from available data a hypothesis that, if it were true, would best explain those data.
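The contrastive selection just described can be illustrated with a deliberately crude sketch. It is not Thagard's computational model of explanatory coherence, nor a method attributed to Lipton; it simply ranks rival hypotheses by explanatory breadth and, as a tie-breaker, by economy of auxiliary assumptions. All names (`Hypothesis`, `best_explanation`), the assignment of facts to each hypothesis, and the assumption counts are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset          # facts that would be accounted for if the hypothesis were true
    auxiliary_assumptions: int   # illustrative count of ad hoc assumptions (lower = simpler)


def explanatory_breadth(hypothesis, evidence):
    """Number of evidential facts the hypothesis would explain (a crude proxy for consilience)."""
    return len(hypothesis.explains & evidence)


def best_explanation(hypotheses, evidence):
    """Pick the rival with the greatest breadth; prefer fewer auxiliary assumptions on ties."""
    return max(
        hypotheses,
        key=lambda h: (explanatory_breadth(h, evidence), -h.auxiliary_assumptions),
    )


# Placeholder evidence, loosely named after phenomena listed in the chapter.
EVIDENCE = frozenset({
    "geographical distribution",
    "embryological similarities",
    "vestigial organs",
    "fossil record",
})

# Which facts each rival covers is an assumption made only for this example.
CANDIDATES = [
    Hypothesis("independent creation", frozenset({"fossil record"}), 3),
    Hypothesis("descent with natural selection", EVIDENCE, 1),
]

winner = best_explanation(CANDIDATES, EVIDENCE)
print(winner.name)
```

The selected hypothesis remains a conjecture: the ranking only says which rival would explain the data best if it were true, not that it is entailed by them.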
Significantly, Lipton illustrates this idea with the “case” of Darwin; he states that Darwin “inferred the hypothesis of natural selection because, although it was not entailed by his biological evidence, natural selection would provide the best explanation of that evidence” (Lipton, 2000: 184; emphasis added). Stamos (2007) concurs with this consideration of Darwin’s reasoning as non-demonstrative and therefore “faulty” from the point of view of classical logic, but he argues that abduction “is a model of scientific explanation that in many ways is superior to the traditional deductive-nomological model” (Stamos, 2007: 193) that dominated nineteenth-century physics. In the same spirit, Tim Lewens (2007b) states that Darwin, in the final chapter (XV) of the sixth British edition of Origin, while making explicit the merits of his abduced hypotheses, praises their explanatory virtue and at the same time acknowledges their defective logical derivation:
It can hardly be supposed that a false theory would explain, in so satisfactory a manner as does the theory of natural selection, the several large classes of facts above specified. It has recently been objected that this is an unsafe method of arguing; but it is a method used in judging of the common events of life, and has often been used by the greatest natural philosophers. (Darwin, 1872: 421; emphasis added)
Darwin’s repeated and explicit insistence on the power of his theses to offer a better explanation of diverse sets of facts than the competing theories did is a strong reason to reconstruct his arguments in Origin as cases of inference to the best explanation, since, according to Lewens (2007a: 15; emphasis added), “within
science and without, we often infer that a proposition is true on the grounds that if it were true, it would be the best explanation of our data.” This conditional clause on the epistemic status of a scientific hypothesis furnishes for this chapter direct links with Peircean abduction and with the semanticist account of modeling (Adúriz-Bravo, 2013a; Sans Pinillos & Adúriz-Bravo, 2021). But being equipped with a hypothetical proposition that gives evolutionary meaning to the existence of a particular trait in a species is only a hint of a possible explanation, an indication of having a “good candidate” for explanatory hypothesis, in Peircean terms (Niiniluoto, 2018). In order to conclude that such a hypothesis is valid (let alone true), Lewens (2007a) deems it necessary to (1) put to the test the assumptions underlying the hypothesis; (2) inspect whether the explanatory inference fits a general, recognizable pattern, which would give it support; and (3) extract “observational consequences” (the more precise, the better) of the hypothesis to contrast them with results of empirical interventions. In a similar way, Kitcher (1985) claims that a number of passages in the final sections of Origin are centered on showing that many “details” in the biological constitution of an organism demand some kind of explanation (as opposed to the creationist approach, which did not problematize such aspects). Then, Darwin’s proposal would initially consist in showing how the application of a particular schema would explain those details in each separate case, which would otherwise remain opaque to understanding. But Darwin goes further and makes substantive efforts to construct Darwinian histories à la Kitcher (1993) in order to explain general traits of the organic world, while at the same time, he strives to make apparent the explanatory merits of this leap in comparison with the available alternatives.
For Kitcher, the theory (or perhaps theories) contained in Origin is a collection of “problem-solving patterns” (Kitcher, 1985: 76); following this idea, it could be claimed that Darwin’s creation of hypotheses through abduction contributed to both the formulation of new biological questions and the institution of new ways of answering such questions. Darwin’s abduced theoretical models thus perform a function that can be assimilated to that attributed by Thomas Kuhn to his exemplars, any of the novel puzzle solutions that “crystallizes consensus [and] is regarded and used as a model of exemplary science” (Bird, 2018: n/p). Kitcher (1985, 1993) argues that acceptance of Darwin’s theory in his time was more due to the force of the “long argument” in Origin than to its strict predictive power. According to his reconstruction, and in tune with what was previously expressed, Darwin is transparent about his epistemological view: he considers that an understanding of the natural world is the main aim of science, and he deems that such an aim is reached through the production of unifying answers to the questions and problems posed by a wide range of phenomena (Kitcher 1993). Thus, he extensively argues – in Origin and in many other writings – in favor of the unifying capacity of this theory, showing that it is (at least in principle) instantiable and applicable and also that some of the presumed objections can be turned back (Kitcher, 1985). Kitcher asks the question on the type of argument involved in this historical case: he confronts previous hypothetico-deductive interpretations and
A. Adúriz-Bravo and L. González Galli
proposes that the heart of Darwin’s arguments consists in the firm statement that his proposal on how to do biology is more satisfactory than the rival proposals, in accordance with the standards accepted by the scientists of his time. Stamos (2007) adheres to this view that abductive inferences presuppose a contrastive process of explanation; that is, it is not enough to show that a theoretical model explains something; it is also necessary to show that it does so better than imaginable alternative models (Rodrigues da Silva & Castilho, 2015). Along the same lines, in another text (Stamos, 2008: n/p), he asserts that, although much of the evidence discussed by Darwin in Origin was his own achievement, the book could only be written in dialog with pieces of knowledge that were already circulating among contemporary naturalists, who were, in most cases, creationists. Darwin presents himself as offering them a better explanation for what was already known, and the (abductive) production of such an explanation cannot be modeled with the standard conceptions of the scientific method: In arguing for evolution by natural selection, Darwin was also arguing against creationism (as well as Lamarckian evolutionism), showing why the latter was not a good explanation of the evidence. Inductivist, confirmationist, falsificationist, and other models of science fail to capture this. (Stamos, 2008: n/p; emphasis added)
The last two analyses, besides decidedly admitting that Darwin’s main arguments are abductive, also rebut previous suggestions to reconstruct them in terms of the classical systems of ideas that were proposed in the philosophy of science during the first half of the twentieth century (e.g., those in italics in Stamos’s quotation above). Putnam (1981) uses the same strategy, but he develops it in more detail. He inspects Darwin’s methods under the view of a variety of “criteria of rationality.” According to this kind of approach, due to the abductive nature of the construction of the model of evolution through natural selection, this cannot be captured in terms of Bayesian probabilities nor using Karl Popper’s criterion of falsifiability. (In relation to the latter, it is worth remembering that Popper, using in a far too strict fashion his criterion, considered for some time that “Darwinism is not a testable scientific theory, but a metaphysical research programme” [Popper, 1974: 168; original emphasis].) Putnam rejects positivistic attempts at founding scientific rationality on formalized, algorithmic processes, while he states that a valid option supposes recognizing that scientific inquiry is based on a set of methodological maxims, which “are not rigorous formal rules; they do require informal rationality, i.e. intelligence and common sense, to apply” (Putnam, 1981: 195). For him, there is indeed a scientific method, but such a method presupposes pre-existing notions of rationality shared by the community; it does not “define” them. Putnam also questions the extended idea of deductively deriving propositions from a hypothesis to test them experimentally, for such an idea identifies proceeding rationally with “believing theories solely because they are supported by carefully performed experiments” (Putnam, 1981: 196; original emphasis). 
There are at least two reasons to cast doubt on this view: (1) it is not always possible to perform the adequate experiments, and (2), much more importantly, evaluation of the degree in which an experimental result supports a theory continues to be an issue quite difficult to formalize, which the Bayesian approach of the inductivists left unsolved.
52 Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching. . .
On the other hand, Putnam considers that Popper’s way out of this last problem, which “consists in putting forward ‘highly falsifiable’ theories; theories that imply risky predictions[, and] then proceed[ing] to test all of the theories until only one survives” (Putnam, 1981: 198), is also inappropriate. Since the Popperian proposal of “elimination” of all the theories but one is made on deductive grounds, it represents a definite step forward, but it is still insufficient for Putnam. He concludes that “such a conception of rationality is too narrow even for science” (ibid.; emphasis added), as can be seen in the fact that its unyielding application would “rule out the acceptance of one of the most successful and widely admired of all scientific theories, Darwin’s theory of evolution by natural selection” (ibid.). It may be true that the model of natural selection, due to its strong abductive nature, results in a formulation that is not highly falsifiable: it does not imply definite predictions “such that if they come out wrong then the theory is refuted” (Putnam, 1981: 198). But, as Putnam claims, the model is accepted not because it can survive Popperian tests, but because it provides a set of plausible explanations for an enormous number of phenomena, and especially because: it has been fruitful in suggesting new theories and in linking up with developments in genetics, molecular biology, etc., and because the alternative theories actually suggested have either been falsified or seem wholly implausible in terms of background knowledge. (Putnam, 1981: 198; emphasis added)
Putnam is thus asserting the fact that Darwin’s modeling of evolution by natural selection can be totally subsumed under Peircean abduction or under inference to the best explanation. Popper’s normative epistemology would see this as the kind of inference that should be driven out of science, but, as Putnam says: scientists are not going to be persuaded by Popper that they should give up theories which are not strongly falsifiable in cases where those theories provide good explanations of vast quantities of data, and in cases where no plausible alternative explanation is in the field. (Putnam, 1981: 198)
Colombian biologist Eugenio Andrade (2009) is another author who maintains that the inferences formulated by Darwin in his book fit abductive schemas. This author understands abduction in a way – very much connected to Peirce’s original framework – that is, as previously indicated, highly valuable for the purposes of this chapter; he portrays it as the establishment of a case-rule relationship: [Abduction is] the process through which the existence of a new general rule is postulated or inferred. When inquiring on a puzzling phenomenon that we do not understand, we can infer that it should constitute a particular case of a general rule that we, even if we do not know it yet, dare to postulate by comparison with other cases that are the expression of other laws applicable to other domains of knowledge. (Andrade, 2009: 99; translated from Spanish, emphasis added)
In the second part of this quotation, Andrade postulates a sort of “principle of abduction” similar to the well-known principle of induction discussed at length in the philosophical literature (Okasha, 2002). He then characterizes what would constitute the “abductive step,” where a hypothesis is adopted because, as Peirce conceived it, it is suggested by the facts. Such a hypothesis “constitutes itself in the
necessary and preliminary condition of any research process, since it illuminates what and how to observe” (Andrade, 2009: 100; translated from Spanish). Andrade follows one of Peirce’s elaborations on the logic of abduction, in which this mode of inference is prior to any induction and deduction and constitutes a genuine “phase” of the scientific method. In his abductive analysis of Darwin’s theory, the first step in validating the model of natural selection involves acknowledging that selection without a selective agent is, strictly speaking, a metaphor. Darwin himself conceded the strong “metaphorical character” (Darwin, 1859: 62) of all the key notions in his theory; in a famous passage of the sixth British edition of Origin, he even says that “[i]n the literal sense of the word, no doubt, natural selection is a false term” (Darwin, 1872: 63). For Andrade, this audacious, and much criticized, metaphor constitutes Darwin’s “great abduction,” where he “sees” that competition for resources is the regulating factor and that the overall architecture of the process is similar to that of artificial selection. When examining the role of the explicit relations that Darwin establishes with authors from the field of economics (Adam Smith, Thomas Malthus), Andrade (2009: 167) states that the British naturalist contrived solutions to evolutionary problems that, even if they derive from elaborations external to the discipline, prove enormously fruitful because of the explanatory and interpretive power that they soon begin to show. He rounds off his case by stating that “Darwin provided a beautiful example of Peircean abduction” (Andrade, 2009: 167; translated from Spanish).
Ilkka Niiniluoto (2018) in a way summarizes all the positions reviewed above, since he contends that Darwin’s evolutionary theory involves abduction in three distinct ways, which would correspond to (1) the transformation of data into evidence, (2) the construction of an explanation, and (3) its use in modeling phenomena as cases. In the first place, the evidence on which the theory is based is construed, according to Niiniluoto, mainly through abductive processes, driven by an aim to explain in a parsimonious and fruitful way. His reconstruction of this point is similar to some of those already presented: (1) Darwin resorts to the analogy between natural selection and domestication when arguing that selection can constitute in itself a vera causa in Herschel’s sense; and (2) he proceeds in accordance with a principle of unification or consilience à la Whewell to show that his proposal explains a great diversity of phenomena and that it does so much better (concretely, more parsimoniously) than creationism. Secondly, the explanation of many current traits of organisms as adaptations is “again abductive” (Niiniluoto, 2018: 63): it hypothesizes the existence of selection in earlier times, pointing at a very clear theoretical model. And thirdly, Darwin’s modeling of the history of life on earth uses a tree, its trunk being the common descent and its branches corresponding to speciation. Use of this image means that while evolution is a process going forward in time, its reconstruction, which has to be done backward from the present evidence (contemporary life forms and fossil records), is a retroductive task, in Peirce’s sense (where retroduction, as said, is one of his conceptualizations of abduction).
Abductive Reconstruction of Passages from Darwin’s Origin
The point of departure for this section is that the kind of reasoning described above, which starts from evidence, aims at explaining, and uses theoretical models to generate hypotheses, can be identified in the publications of scientists by means of some linguistic “indicators.” The diverse presentations of an abductive inference in natural language that were introduced in the theoretical framework give clues as to what kind of marks (terms and constructions) should be looked for in the text. In addition to the patterns or templates that have already been analyzed, and attending to the theoretical disquisitions in this chapter, further reconstructions of abduction can be used. For instance, model-based abductive inference (both in the “strictus” and in the “latus” meanings) could also be semiformalized under the following syntax: If I am right in the fact that this (a phenomenon that is construed as a problem) is a case of that (a model under which the phenomenon can be subsumed), then the following things (observations, predictions, unifications, etc.) should happen. It is important to look for those things in the world, since they reinforce the pertinence of the established relationship between phenomenon and model.
The purpose of the following paragraphs is then to find indices of abductive inference in a collection of excerpts from the first edition of On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life by Charles Darwin. Those excerpts will be reconstructed as pieces of abductive reasoning in accordance with the different representations of abduction already provided. The pages specified for each analyzed fragment are taken from a photographed version of one surviving 1859 volume, which is available on the webpage of the colossal scholarly project titled “Darwin Online” (van Wyhe, 2002). Readers should be warned that, in some cases, punctuation has been modernized for the sake of clarity.
Reconstruction 1: Identification of puzzling situations to be accounted for by means of a hypothesis
As was documented in the extensive literature review of this chapter, Darwin problematizes facts that the biological establishment of his time took for granted (among them, perhaps the most salient is adaptation). He identifies puzzling phenomena, explanatory weaknesses or contradictions, and gaps in the theoretical framework, “reading” them through arguments based on his idea of selection. In the process of doing so, he overtly and repeatedly states that the hypothesis of the origin of a species as an individual act of creation very often does not account for such facts; it provides no “apparent reasons.” A powerful notion contained in the book can be analyzed through this lens: at the end of chapter II, devoted to natural variation, Darwin contends that a high number of species in a genus provides conditions (“circumstances,” p. 55) that are favorable for the existence of variation. This leads him to predict (“anticipate,” ibid.) that species within larger genera will present more intra-specific varieties than species
within smaller genera. The underlying analogical inference working here is that the same conditions generating a higher number of species in the genus will generate a higher number of varieties within the species; he is conceiving varieties as “incipient species” (p. 52) to warrant this transference by analogy. In the construction of this idea, Darwin states that the available facts fit his prediction, pointing once again to the argument of an inference to a “better” explanation than the customarily accepted one for a fact that has now become enigmatic: “if we look at each species as a special act of creation, there is no apparent reason why more varieties should occur in a group having many species, than in one having few” (p. 55). A few pages later, he makes explicit the analogical justification at the basis of his explanation, introducing a conditional clause with an abduced hypothesis: [T]he species of large genera present a strong analogy with varieties. And we can clearly understand these analogies, if species have once existed as varieties, and thus originated; whereas, these analogies are utterly inexplicable if species are independent creations. (Darwin, 1859: 59; emphasis added)
The abduction of a possible explanation for this highlighted fact entails a theoretical reconstruction or modeling: that of a species as a special kind of variety (“species are only strongly marked and permanent varieties,” p. 56). This conceptual construction attempts to settle a question that, according to Darwin, was quite problematic in his time, since naturalists had “no golden rule by which to distinguish species and varieties” (p. 297). A generalized version of this abduced model, revolving around the idea that the difference between taxa is only a matter of gradation, provides grounds for further consilience, for instance, when Darwin introduces the variable of time to look into variety in genera and in species. He would then be putting his abductive conclusion into application when he formulates a prediction-like statement such as the following: [W]hy should that part of the structure, which differs from the same part in other [ . . . ] species of the same genus, be more variable than those parts which are closely alike in the several species? I do not see that any explanation can be given[, but that of] species being only strongly marked and fixed varieties[. If this is the case,] we might surely expect to find them still often continuing to vary in those parts of their structure which have varied within a moderately recent period, and which have thus come to differ. (Darwin, 1859: 155; emphasis added)
Darwin then uses, in chapter VIII, his abduced, model-based hypothesis on the similarity between species and varieties to “harmonize” yet another fact that remains opaque if analyzed with the traditional creationist view, the fact of “hybridism”: Laying aside the question of fertility and sterility, in all other respects there seems to be a general and close similarity in the offspring of crossed species, and of crossed varieties. If we look at species as having been specially created, and at varieties as having been produced by secondary laws, this similarity would be an astonishing fact. But it harmonises perfectly with the view that there is no essential distinction between species and varieties. (Darwin, 1859: 275–276; emphasis added)
Reconstruction 2: Abduction with a conditional clause
Darwin defends the idea that no organism self-fertilizes for an indefinite number of generations, formulating the strong postulate that crossing with other individuals is “indispensable.” According to him, it is “on the belief that this is a law of nature” (p. 97; emphasis added) that several classes of facts can be understood; such facts would otherwise (“on any other view,” ibid.) be utterly inexplicable. Darwin recognizes an extremely puzzling fact: “How strange that the pollen and stigmatic surface on the same flower, though placed so close together, as if for the very purpose of self-fertilisation, should in so many cases be mutually useless to each other!” (p. 99). Such a fact is quite “simply” explained with an abduced hypothesis “of an occasional cross with a distinct individual being advantageous or indispensable!” (ibid.). This hypothesis, of course, can easily be fitted into a Peircean pattern with a major premise of the style: “But if crossing were advantageous for plants, the fact that the male and female elements in the same individual are useless to each other would be a matter of course.” Darwin then extends his law-like hypothesis to “cover” cases of sexual reproduction in animals, when he deals with hermaphrodite species (of which he mentions “land mollusca” and earthworms), which always pair: “As yet I have not found a single case of a terrestrial animal which fertilises itself” (p. 100). Explaining this fact, which for Darwin is “remarkable” and not entirely identical to what was being argued in the case of terrestrial plants, again requires “the view of an occasional cross being indispensable” (ibid.). Darwin says that a plausible mechanism that would function as an analogue in animals to the action of insects and of wind in plants is so far not known, and is even difficult to imagine if we take into consideration “the medium in which terrestrial animals live and the nature of the fertilising element” (ibid.).
The “concurrence of two individuals” (ibid.) is therefore “necessary” for him (i.e., it would be a selected trait).
Reconstruction 3: Extraction of predictions
In chapter V, in the section on “acclimatization,” Darwin draws attention to the fact that species of cave animals in a region are, in some notable respects, similar to their equivalents outside the caves and, at the same time, quite different from those in the caves of faraway regions; he points to the need to give a “rational” (p. 139) explanation for these matters. In order to introduce how his abduced hypothesis can be used to explain, he first derives a prediction using the creationist framework. According to him, “[i]t is difficult to imagine conditions of life more similar than deep limestone caverns under a nearly similar climate” (p. 138); then, “on the common view of the blind animals having been separately created for the American and European caverns, close similarity in their organisation and affinities might have been expected” (ibid.; emphasis added). But, as contemporary naturalists (Darwin cites among them the Dane Jørgen M. C. Schiødte) report, “this is not the case, and the cave-insects of the two continents are not more closely allied than might have been anticipated from the general resemblance of the other inhabitants of North America and Europe” (ibid.). In order to deal with this anomaly, he applies his hypothesis of selection to this particular case by assuming an evolutionary history:
On my view we must suppose that American animals, having ordinary powers of vision, slowly migrated by successive generations from the outer world into the deeper and deeper recesses of the Kentucky caves, as did European animals into the caves of Europe. (Darwin, 1859: 138)
He then claims that there is “some evidence of this gradation of habit” (ibid.) spanning “numberless generations.” After these, natural selection would have obliterated the eyes in those organisms and performed numerous other changes, some of them for “compensation” of blindness. But such changes would not be enough to hide common ancestry with related species that remained outside the caves; it should then be expected “to see in the cave-animals of America, affinities to the other inhabitants of that continent, and in those of Europe, to the inhabitants of the European continent” (ibid.). Darwin gives strength to this inference on the expectable similarities and differences between species by affirming that “this is the case” (ibid.) in the empirical plane and then devotes the following page to showing, once more, that the creationist proposal unsatisfactorily accounts for all these data.
Reconstruction 4: Parsimony of abduced hypotheses
Another remarkable point that can be seen in many Darwinian inferences is that of the parsimony of the provided explanations, a “virtue” that was mentioned in this chapter in relation to abductive reasoning. Neat instances of this occur when he examines the cases of diverse species with analogous variations. On page 159, for instance, and after briefly addressing the particular case of pigeons, he moves to the vegetable kingdom and cites three well-known European plants with “enlarged stems,” taken as roots: the common turnip, the Swedish turnip, and the “Ruta baga” (sic). He first acknowledges that, on the basis of botanists’ claims, it can be safely taken as an established fact that those plants are “varieties produced by cultivation from a common parent” (p. 159). Then he stresses the extreme simplicity of taking the hypothesis of “community of descent” (ibid.) as vera causa for their similarities; this hypothesis would smoothly lead to an explanatory mechanism – “a consequent tendency to vary in a like manner” (ibid.). For him, it is clear that attempting to explain “analogous variation” in distinct species that are so close to one another with other mechanisms (such as “three separate yet closely related acts of creation,” ibid.) would be clumsier, less robust, and less convincing. On the other hand, it is important to notice that this particular situation of enlarged stems is treated in a paragraph of chapter V where Darwin introduces one of his “laws of variation”: “Distinct species present analogous variations; and a variety of one species often assumes some of the characters of an allied species, or reverts to some of the characters of an early progenitor” (p. 159; original emphasis).
A law-like statement such as this can be considered, in the theoretical framework for abductive reasoning that was developed in this chapter, a “definition” (Giere, 1988) of a theoretical model with strong imagistic character: that on the behavior of organisms of common descent. Thus, the “law” helps to describe the dynamics of the hypothesized state of affairs, where evolution occurs in one or other direction according to the influencing conditions.
Reconstruction 5: Establishment of a case-rule relationship
Much related to the previous reconstruction is Darwin’s explanation for homology, conceived as a sophisticated instance of “transition of organs.” In an instance of discussion of this phenomenon, Darwin once again focuses on a case and starts by stating an accepted premise: there is consensus among the anatomists of the time on the fact that swimbladders in fish and lungs in land vertebrates are homologous, similar in position and structure. From this, he abductively infers common descent: “[A]ll vertebrate animals having true lungs have descended by ordinary generation from an ancient prototype, of which we know nothing, furnished with a floating apparatus or swimbladder” (p. 191). This formulation of a piece of reasoning can be presented in terms of one of Peirce’s patterns: The surprising fact that fish and land vertebrates have similar air-filled structures is observed. But if the proposition that both fish and land vertebrates descend from an ancient prototype equipped with a floating apparatus were true, the observed homology would be a matter of course. Hence, there is reason to suspect that such an ancestor existed (though we know nothing of it).
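The Peircean pattern just applied can be laid out schematically. The block below is only an illustrative sketch: the letters C (the surprising fact) and A (the abduced hypothesis) follow the customary statement of Peirce's schema, and their assignment to the swimbladder case is an interpretive assumption made here, not a formalization given in the source.

```latex
% Schematic layout of the Peircean pattern (requires amsmath for \text).
% C: swimbladders in fish and lungs in land vertebrates are homologous.
% A: both groups descend from an ancient prototype with a floating apparatus.
\[
\begin{array}{ll}
(1) & \text{The surprising fact } C \text{ is observed.} \\
(2) & \text{But if } A \text{ were true, } C \text{ would be a matter of course.} \\
\hline
(3) & \text{Hence, there is reason to suspect that } A \text{ is true.}
\end{array}
\]
```

Read deductively, the step from (1) and (2) to A would be the fallacy of affirming the consequent; this is why the conclusion in (3) is hedged as a reason to suspect A, not as an assertion of A.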
The abduced hypothesis establishes a case-rule relationship: what – according to Darwin – most probably “happened” to this supposed ancient apparatus is explained by accepting that “an organ originally constructed for one purpose, namely flotation, may be converted into one for a wholly different purpose, namely respiration” (p. 190). Subsuming the case under the rule points to a process of mapping the phenomenon through an elaborate model. The model is subsequently capable of explaining, in principle, an apparently disconnected fact, “the strange fact that every particle of food and drink which we swallow has to pass over the orifice of the trachea, with some risk of falling into the lungs” (p. 191).
Reconstruction 6: Instantiation and application of an abduced hypothesis
In chapter VI, devoted to possible difficulties with his theoretical framework, Darwin faces the delicate issue of the origin of organs of “extreme perfection and complication” (p. 186; an issue that, as is well known, remains today a staple of the most adamant creationist discourse). His intention is to achieve a successful application of his abduced hypothesis in this kind of contentious case. Darwin contends that even an example as subtly intricate as the origin of complex eyes, whose formation by natural selection seems “absurd in the highest possible degree” (ibid.), is a candidate to be explained with his ideas. He starts by pointing out that two initial conditions need to be accepted: a great diversity in kinds of functional eyes and a small number of living animals “in proportion to those which have become extinct” (p. 185); with these two conditions, “numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist” (ibid.). Equipped with this idea, Darwin then affirms that he: can see no very great difficulty (not more than in the case of many other structures) in believing that natural selection has converted the simple apparatus of an optic nerve merely coated with pigment and invested by transparent membrane into an optical instrument as
perfect as is possessed by any member of the great Articulate class. (Darwin, 1859: 188; emphasis added)
Darwin is convinced that, for anyone who accepts that “large bodies of facts” not explained before can now be accounted for by his “theory of descent,” there should be no hesitation to “admit that a structure even as perfect as the eye of an eagle might be formed by natural selection” (p. 188), notwithstanding the fact that transitional grades toward such “inimitable contrivances” (p. 186) are not yet known. But Darwin recognizes that, even in his personal case, this is a difficult extension of the principle of natural selection, where “reason ought to conquer [ . . . ] imagination” (p. 188).
Reconstruction 7: Support for the hypothesis from new facts that conform to the abduction
In several passages of Origin, Darwin lists uncontroversial findings of nineteenth-century life sciences to indicate that he deems they support his hypothesis on the action of natural selection. An eloquent example can be found in the final summary of chapter VII on “instincts”: On the other hand, the fact that instincts are not always absolutely perfect and are liable to mistakes; that no instinct has been produced for the exclusive good of other animals, but that each animal takes advantage of the instincts of others; that the canon in natural history, of “natura non facit saltum” [or nature makes no leap], is applicable to instincts as well as to corporeal structure, and is plainly explicable on the foregoing views, but is otherwise inexplicable; all tend to corroborate the theory of natural selection. (Darwin, 1859: 243; emphasis added)
He “horizontally” transports the model underlying his abduction on instinct to a variety of cases, acknowledging that this inference “may not be a logical deduction” (p. 243), but convinced that the explanatory result is satisfactory: [S]uch instincts as the young cuckoo ejecting its foster-brothers, ants making slaves, the larvae of ichneumonidae feeding within the live bodies of caterpillars [can be seen] not as specially endowed or created instincts, but as small consequences of one general law leading to the advancement of all organic beings, namely, multiply, vary, let the strongest live and the weakest die. (Darwin, 1859: 243–244; emphasis added)
Thus, Darwin deems that all the natural phenomena included in his lists ratify his model, defined by the abduced law.
Reconstruction 8: Evidence in favor of and against the abductive inference
Many passages in the book are devoted to introducing a diachronic, long-running view of evolutionary processes as they may have happened. “Iterating” the model to see its consequences (following the Peircean pattern included at the beginning of this section) entails the assumption, as was said in the theory review, of a chain of operating conditions at the beginning and in the different stages, conditions that would ideally need to be recognized – directly or through their consequences – in the natural world. The nature and role of such conditions can be seen in the following example, where Darwin sketches what in Section “Studies on Abductive Reasoning in
Darwin’s Formulation of His Evolutionary Theory” was called a Darwinian history. He is trying to account for the analogies that can be observed in the faunas of distant regions all over the globe: Thus, as it seems to me, the parallel, and, taken in a large sense, simultaneous, succession of the same forms of life throughout the world accords well with the principle of new species having been formed by dominant species spreading widely and varying; the new species thus produced being themselves dominant owing to inheritance, and to having already had some advantage over their parents or over other species; these again spreading, varying, and producing new species. (Darwin, 1859: 327; emphasis added)
The similarities between extinct and living forms also fall “into one grand natural system” (p. 329) if the abduced hypothesis is conveniently used as a problem-solving pattern à la Kitcher. Although Darwin says that such similarities can be “at once explained on the principle of descent” (ibid.), it is clear, in the light of the theoretical framework on abduction, that such an explanation needs the production of a plausible “state of affairs” that entails the use of auxiliary hypotheses on possible conditions. Those hypotheses, as required by the regulating principles of abductive inference, should be as few and as simple as possible. In this case – as in some of those previously analyzed – the imperfection of the fossil record conspires against giving full support to the details of the described mechanism; nevertheless, Darwin judges that record eloquent enough to:
have a right to expect [ . . . ] that those groups which have within known geological periods undergone much modification, should in the older formations make some slight approach to each other; so that the older members should differ less from each other in some of their characters than do the existing members of the same groups. (Darwin, 1859: 333; emphasis added)
And this can indeed be proven to be the case using evidence “of our best palaeontologists” (ibid.). Notwithstanding the identified problem surrounding the quantity and quality of the evidence in favor of this particular instantiation of the general model of descent with modification, Darwin deems that “the main facts with respect to the mutual affinities of the extinct forms of life to each other and to living forms seem to [him] explained in a satisfactory manner [while] they are wholly inexplicable on any other view” (ibid.; emphasis added). A few pages later, in a long and convoluted paragraph, Darwin characterizes the minimal paleontological support that he would require for his model to be satisfactorily explanatory:
[W]e ought not to expect to find, as I attempted to show in the last chapter, in any one or two formations all the intermediate varieties between the species which appeared at the commencement and close of these periods; but we ought to find after intervals, very long as measured by years, but only moderately long as measured geologically, closely allied forms, or, as they have been called by some authors, representative species. (Darwin, 1859: 336)
For Darwin, the latter can be “assuredly” discovered, and this would be “evidence of the slow and scarcely sensible mutation of specific forms, as we have a just right to expect to find” (ibid.).
A. Adúriz-Bravo and L. González Galli
Reconstruction 9: Including further facts in the abductive consilience
In the text of Origin, Darwin’s abductive explanations are often explicitly credited with the virtue of being easily extended to other phenomena; the theory is thus seen as highly consilient. In chapter XII, which discusses the geographical distribution of species, Darwin pinpoints a fact in need of explanation: on oceanic islands, such as Madeira or the Galapagos, the proportion of “endemic” species (i.e., those that cannot be found anywhere else) is high. According to Darwin, expansion of his model to this case is possible; he provides some details on how to subsume it under the rule:
This fact might have been expected on my theory, for, as already explained, species occasionally arriving after long intervals in a new and isolated district, and having to compete with new associates, will be eminently liable to modification, and will often produce groups of modified descendants. (Darwin, 1859: 390)
Darwin also points out another “grand fact” to be explained: the similarity between faunas on islands and continents. A conceptual obstacle needs to be taken into account: great environmental differences between those two kinds of places rule out a sensible use of the hypothesis of adaptation to the conditions of life. The similarity between groups, then, is for him due to their common descent and expressed in their inheritance. This idea is conveyed with a beautiful figure of speech:
[I]t is obvious that the Galapagos Islands would be likely to receive colonists, whether by occasional means of transport or by formerly continuous land, from America, and the Cape de Verde Islands from Africa, and that such colonists would be liable to modification; the principle of inheritance still betraying their original birthplace. (Darwin, 1859: 398–399; emphasis added)
Reconstruction 10: Construction of evidence from facts
In an abduction, one fact, reconstructed as evidence, is usually decisive for the inference: it becomes the “result” that suggests the conclusion (compare with the example of Jane the bus driver). Darwin finds in the patterns of geographical distribution of higher taxa very strong support for his idea of inheritance with modification. For example, he shows that, using his view, “we can understand how it is that sections of genera, whole genera, and even families are confined to the same areas, as is so commonly and notoriously the case” (pp. 350–351). Another intriguing fact about the geographical distribution of organisms known in Darwin’s time concerns the similarity between “productions” (biota) in places as far apart as the Pyrenees and Scandinavia. In order to transform this known fact into evidence to support his theoretical model, Darwin reconstructs it by proposing that these now geographically distant populations are actually descendants of a more widely distributed original population that temporarily fragmented as a result of changes in the environment:
We can thus also understand the fact that the Alpine plants of each mountain-range are more especially related to the Arctic forms living due north or nearly due north of them: for the migration as the cold came on, and the re-migration on the returning warmth, will generally have been due south and north. (Darwin, 1859: 367; emphasis added)
52 Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching. . .
Darwin interprets many patterns of geographic distribution as consequences of the existence of a hypothetical geographical center of origin and of distribution of a taxon, from which population expansions and retractions due to climatic and environmental changes occurred; this allows him to transform various phenomena accepted in his time into evidence in his favor. The transformation entails theoretically “loading” facts to show how they operate within the model. For instance, two facts are independently “seen” through the model and then connected to each other:
On this view we can understand the relationship, with very little identity, between the productions of North America and Europe – a relationship which is most remarkable, considering the distance of the two areas, and their separation by the Atlantic Ocean. We can further understand the singular fact remarked on by several observers, that the productions of Europe and America during the later tertiary stages were more closely related to each other than they are at the present time; for during these warmer periods the northern parts of the Old and New Worlds will have been almost continuously united by land, serving as a bridge, since rendered impassable by cold, for the inter-migration of their inhabitants. (Darwin, 1859: 371; emphasis added)
Darwin then sets out to explain the case of marine organisms using the same ways of reasoning that he has used with plants. In both cases, this entails the introduction of assumptions on the geographical and climatological conditions in the past: in the previous quotation, he says that two distant regions “will have been almost continuously united by land”; in this new case, he asserts that, during the Pliocene, the “slow southern migration of a marine fauna [ . . . ] was nearly uniform along the continuous shores of the Polar Circle” (p. 372). Explanatory mechanisms like this could account, within the frame of his “theory of modification,” for “closely allied forms now living in areas completely sundered” (ibid.), such as the “striking case of many closely allied crustaceans [ . . . ], of some fish and other marine animals, in the Mediterranean and in the seas of Japan, areas now separated by a continent and by nearly a hemisphere of equatorial ocean” (ibid.). Darwin’s reinterpretation of the numerous known facts concerning geographical distribution of organisms is, as has been shown, aided by a variety of theoretical tools. Among these, a renewed recourse to the analogy with artificial selection (this time, around the case of breeding of English race horses, p. 356) helps him “illustrate” what he means in these passages of Origin.
Reconstruction 11: Managing facts that remain unexplained
The problems associated with the fossil record, as far as it had been investigated in Darwin’s time, pose a series of obstacles to the abduced model, many of them reviewed in Origin. Darwin confesses that he is “aware that there are some apparent exceptions” (p. 316) to the “rule” emerging from his model that groups of extinct organisms should not “reappear” later on in the record. He tackles this fact by pointing out that even firm opponents of his theory (the Swiss paleontologist F.J. Pictet de la Rive among them) admit that such exceptions are extremely rare. The “general rule” derived then holds as a satisfactory explanation.
Darwin thus maintains that his hypothesis of the gradual appearance of species, in a tree-like process, should be accepted even where it contradicts what the fossil record, which is clearly imperfect, seems to show:
[T]he process of modification and the production of a number of allied forms must be slow and gradual, one species giving rise first to two or three varieties, these being slowly converted into species, which in their turn produce by equally slow steps other species, and so on, like the branching of a great tree from a single stem, till the group becomes large. (Darwin, 1859: 317; emphasis added)
This model of gradual change is, for Darwin, robust enough to be extended to extinction: “Thus, as it seems to me, the manner in which single species and whole groups of species become extinct accords well with the theory of natural selection” (p. 322; emphasis added).
Reconstruction 12: The imagistic nature of model-based abduction
One core component of the process of scientific argumentation involves “iterating” the theoretical model to test how well it fits the phenomenon under study (Upmeier zu Belzen et al., 2021). This entails opening a “possible world” with strong imagistic character (Giere, 1988): the recipient of the argument can picture the situation that is being modeled and see the postulated mechanisms in action. An instance of this can be seen when Darwin addresses the question of why there are bats on remote islands, but no land mammals. His answer is constructed around a clear image: “On my view this question can easily be answered; for no terrestrial mammal can be transported across a wide space of sea, but bats can fly across” (p. 394). The same occurs when Darwin produces a candidate explanation for the nature of the animal populations in different kinds of islands. Here, he introduces the image of land bridges connecting some islands to the mainland, in the form of a hypothetical proposition that for him is “obvious”:
As the amount of modification in all cases depends to a certain degree on the lapse of time, and as during changes of level it is obvious that islands separated by shallow channels are more likely to have been continuously united within a recent period to the mainland than islands separated by deeper channels, we can understand the frequent relation between the depth of the sea and the degree of affinity of the mammalian inhabitants of islands with those of a neighbouring continent. (Darwin, 1859: 395–396; emphasis added)
The images of migration of species with large areas of distribution over long periods of time permit Darwin to explain the existence of similar organisms in very distant places on the globe. If it is taken to be true that all species in a genus have “descended from a single parent” (p. 405), then these would now be distributed in “the most remote points of the world” (ibid.): it would be conceivable, accepting the proposed picture, to find “that some at least of the species range very widely” (ibid.).
Conclusions: The Power of Abduction in Didactics of Science
Reconstruction and interpretation of some of Charles Darwin’s arguments as epitomes of abductive reasoning hold, as the considerations in this chapter suggest, great potential for (1) teaching science to students at different educational levels, (2) educating science teachers, and (3) researching and innovating in didactics of science.
In relation to students, analyzing cases of possible abduction in Darwin (or in any other major scientific author) would allow, in the first place, modeling core aspects of scientific thinking, and consequently discussing the topic of scientific methodology in a new epistemological light, in a serious attempt at abandoning the heavy positivistic legacy of the inductive-deductive schema. Classroom discussion of sophisticated methodological aspects would serve the universally proclaimed objective of teaching the nature of science, which nowadays counts among the most prominent in science education for citizenship. The general approach to classroom work on the nature of scientific thinking that is outlined in this chapter begins by focusing on the possible relations between observations and experiments, on one side, and ideas and concepts, on the other; such relations are too often clumsily reduced to a caricature of a school scientific method that students are required to follow step by step. In order to problematize those relations, other fields of human activity are pertinent for critical inspection; thus, students can be invited to reflect on how “raw” information and its interpretation relate in everyday life, detective inquiry, medical diagnoses, etc. (see Adúriz-Bravo, 2015, 2016). Once some of these situations have been “modeled” as abductive ascents, the teacher can return to a selection of scientific cases and undertake similar reconstructions.
A formulation of possible phases in this schema of “reasoning on reasoning,” which can be proposed to students in the particular case of Darwin, is beautifully – though too compactly – conveyed in this paragraph by the Australian philosopher of science John D. Norton:
Upland geese, Darwin reported [in the 1872 edition of Origin], rarely go near water, but they have the same webbed feet that are of great utility to aquatic birds. This curious fact, Darwin noted, is readily explained by natural selection as a residual from ancestral aquatic geese. It is poorly explained by the hypothesis of independent creation. Why create geese with this unnecessary feature? (Norton, 2021: 252; emphasis added)
Students’ route along this sequence of reasoning based on the use of a model can be operationalized, for instance, by asking them to produce possible explanations for Darwin’s “report” on geese before moving to the one formulated by him. Later, the relative validity of each of those explanations can be compared in an argument elicited by means of a destabilizing question, such as the one on the “unnecessary feature” in upland geese.
A second value of working around abduction with students is strictly connected to the biological topic that they are learning, namely, the model of evolution by natural selection. Scaffolding content learning with the aid of explicit epistemological reflection seems to be crucial in the case of evolutionary biology, where numerous, diverse, and serious obstacles to comprehension have been reported (Pérez et al., 2021).
In relation to science teachers, it could be formative to introduce two particular philosophical reflections around the procedure of abductive reconstruction. The first one relates to the epistemological study of Darwin’s writings, which is now highly developed but already existed during his lifetime. Such study covers (and covered) many angles: the hypothetical nature and explanatory power of evolutionary ideas, the elucidation of “Darwin’s methodology” (Lloyd, 1983: 112) to produce them, and the critique of his own account of the ways in which he proceeded. Thus, an issue that could be profitably examined with teachers is to what extent the proposal presented here captures Darwin’s actual modes of reasoning when he was creating one of the most amazing theories in the history of humankind. What can be claimed, in fact, is that this chapter proposes a plausible reconstruction of Darwin’s written pieces of argumentation as disseminated in his book (see Costa, 2009). The second reflection with teachers deals with the object of abductive reconstruction. This technique can be understood as a rather sophisticated, but profoundly educational, way to model, at the same time, Darwin’s systems of inference to enunciate his explanatory hypotheses on the occurrence of some selected natural phenomena and “the mechanisms through which such hypotheses can be justified and accepted” (Rodrigues da Silva, 2017: 126–127; original emphasis, translated from Portuguese).
This remark opens fruitful paths for teachers to design instructional sequences directed at students’ learning of Darwin’s model and of its rational foundations, an issue of utmost importance given the emergence of so-called intelligent design.
In relation to didactical research, an advantage – declared in the introduction of this chapter – of paying attention to the notion of abduction is its power to “amalgamate” different theoretical approaches for science education. Regarding the instructional use of school scientific models, the intention to institute natural selection as the privileged theoretical perspective to model key evolutionary facts in the classroom requires students to work on problematic situations that generate the need to explain with “school versions” of that model (González Galli et al., 2020). Students’ initial explanations will necessarily stay close to intuitive formulations; after increasingly complex reconstructions of the natural facts under analysis are requested, and the laboriously accomplished explanations are appraised, students’ models can be expected to move toward their counterparts in normative biological knowledge. In this long-haul process, a challenge for science teachers is to get their students to appreciate the advantages of the model sanctioned in the curriculum over the others to which they resorted before instruction. This necessitates explicit considerations on the nature and use of models in science (Adúriz-Bravo, 2013a) in order to show the “explanatory superiority” of Darwin’s proposal over its two great rivals: the creationist and Lamarckian models. The kinds of analysis performed with the abductive reconstructions in this chapter are of help for this task of evaluating models, an activity that is here conceived as the core of school scientific argumentation.
References
Adúriz-Bravo, A. (2001). A proposal to teach the abductive argumentation pattern through detective novels. In D. Psillos et al. (Eds.), Proceedings of the third international conference on science education research in the knowledge based society (Vol. II, pp. 715–717). Aristotle University of Thessaloniki.
Adúriz-Bravo, A. (2003). “La muerte en el Nilo”: Una propuesta para aprender sobre la naturaleza de la ciencia en el aula de ciencias naturales de secundaria [“Death on the Nile”: A proposal to learn about the nature of science in secondary science classrooms]. In A. Adúriz-Bravo, G. A. Perafán, & E. Badillo (Eds.), Actualización en didáctica de las ciencias naturales y las matemáticas (pp. 129–138). Editorial Magisterio.
Adúriz-Bravo, A. (2013a). A semantic view of scientific models for science education. Science & Education, 22(7), 1593–1612. https://doi.org/10.1007/s11191-011-9431-7
Adúriz-Bravo, A. (2013b). La historia de la ciencia en la enseñanza de la naturaleza de la ciencia: Maria Skłodowska-Curie y la radiactividad [History of science to teach the nature of science: Maria Skłodowska-Curie and radioactivity]. Educació Química, 16, 10–16. https://raco.cat/index.php/EduQ/article/view/313098
Adúriz-Bravo, A. (2015). Pensamiento “basado en modelos” en la enseñanza de las ciencias naturales [“Model-based” thinking in science teaching]. Revista del Instituto de Investigaciones en Educación, 6, 20–31. https://doi.org/10.30972/riie.063680
Adúriz-Bravo, A. (2016). “Modos de racionalidad” en la historia de la ciencia para la enseñanza de las ciencias [“Modes of rationality” in the history of science for science teaching]. In P. Grapí Vilumara & M. R. Massa Esteve (Eds.), Actes de la XIII Jornada sobre la Història de la Ciència i l’Ensenyament “Antoni Quintana Marí” (pp. 9–15). Institut d’Estudis Catalans. http://publicacions.iec.cat/repository/ActesXIIIJornades.pdf
Adúriz-Bravo, A. (2020). Contributions to the nature of science: Scientific investigation as inquiry, modeling, and argumentation. In C. N. El-Hani, M. Pietrocola, E. F. Mortimer, & M. R. Otero (Eds.), Science education research in Latin America (pp. 394–425). Brill/Sense. https://doi.org/10.1163/9789004409088_017
Adúriz-Bravo, A., & Izquierdo-Aymerich, M. (2009). A research-informed instructional unit to teach the nature of science to pre-service science teachers. Science & Education, 18(9), 1177–1192. https://doi.org/10.1007/s11191-009-9189-3
Adúriz-Bravo, A., & Sans Pinillos, A. (in press). Science & Education.
Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanation. Springer.
Andrade, E. (2009). La ontogenia del pensamiento evolutivo [The ontogeny of evolutionary thinking]. Editorial Universidad Nacional de Colombia.
Auletta, G. (2017). A critical examination of Peirce’s theory of natural inferences. Revista Portuguesa de Filosofia, 73(3/4), 1053–1094. https://doi.org/10.17990/RPF/2017_73_3_1053
Bird, A. (2018). Thomas Kuhn. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Winter 2018 edition), n/p. https://plato.stanford.edu/archives/win2018/entries/thomas-kuhn/
Blachowicz, J. (1996). Ampliative abduction. International Studies in the Philosophy of Science, 10(2), 141–157. https://doi.org/10.1080/02698599608573535
Blachowicz, J. (1998). Of two minds: The nature of inquiry. State University of New York Press.
Brioschi, M. R. (2019). Does continuity allow for emergence?: An emergentist reading of Peirce’s evolutionary thought. European Journal of Pragmatism and American Philosophy [Online], XI(2), n/p. https://doi.org/10.4000/ejpap.1647
Burch, R. (2021). Charles Sanders Peirce. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Winter 2021 edition), n/p. https://plato.stanford.edu/archives/win2021/entries/peirce/
Burnett, D. G. (2009). Savage selection: Analogy and elision in On the origin of species. Endeavour, 33(4), 121–126. https://doi.org/10.1016/j.endeavour.2009.09.005
Callaway, H. G. (2014). Abduction, competing models and the virtues of hypotheses. In L. Magnani (Ed.), Model-based reasoning in science and technology (pp. 263–280). Springer. https://doi.org/10.1007/978-3-642-37428-9_15
Campanaro, D. (2021). Inference to the best explanation (IBE) and archaeology: Old tool, new model. European Journal of Archaeology, 24(3), 412–432. https://doi.org/10.1017/eaa.2021.6
Campos, D. G. (2011). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180(3), 419–442. https://doi.org/10.1007/s11229-009-9709-3
Clement, J., & Núñez Oviedo, M. C. (2003, March). Abduction and analogy in scientific model construction. Paper presented at the National Association for Research in Science Teaching Conference. http://people.umass.edu/~clement/pdf/clement_nunez_paper.pdf
Costa, J. T. (2009). The Darwinian revelation: Tracing the origin and evolution of an idea. BioScience, 59(10), 886–894. https://doi.org/10.1525/bio.2009.59.10.10
Darwin, C. R. (1859). On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life (1st ed.). John Murray.
Darwin, C. R. (1872). On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life (6th British ed.). John Murray.
Darwin Correspondence Project. (2020). University of Cambridge. Letter no. 4176. Accessed on 4 Feb 2022. https://www.darwinproject.ac.uk/letter/?docId=letters/DCP-LETT-4176.xml
Delrieux, C. (2004). Abductive inference in defeasible reasoning: A model for research programmes. Journal of Applied Logic, 2(4), 409–437. https://doi.org/10.1016/j.jal.2004.07.003
Dennett, D. C. (1995). Darwin’s dangerous idea: Evolution and the meanings of life. Simon & Schuster.
Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Summer 2021 edition), n/p. https://plato.stanford.edu/archives/sum2021/entries/abduction/
Duarte Calvo, A. (2016). La abducción: Una aproximación dialógica [Abduction: A dialogic approach]. Doctoral dissertation. Universidad Complutense de Madrid. https://eprints.ucm.es/id/eprint/35905/1/T36885.pdf
Evans, L. T. (1984). Darwin’s use of the analogy between artificial and natural selection. Journal of the History of Biology, 17(1), 113–140. https://doi.org/10.1007/BF00397504
Giere, R. N. (1988). Explaining science: A cognitive approach. University of Chicago Press.
Giere, R. N. (1991). Understanding scientific reasoning (3rd ed.). Holt, Rinehart & Winston.
Gildenhuys, P. (2004). Darwin, Herschel, and the role of analogy in Darwin’s Origin. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 35(4), 593–611. https://doi.org/10.1016/j.shpsc.2004.09.002
González Galli, L., Pérez, G., & Gómez Galindo, A. A. (2020). The self-regulation of teleological thinking in natural selection learning. Evolution: Education & Outreach, 13, 6. https://doi.org/10.1186/s12052-020-00120-0
Gruber, H. E. (1978). Darwin’s tree of nature and other images of wide scope. In J. Wechsler (Ed.), On aesthetics in science (pp. 121–140). The MIT Press.
Gruber, H. E. (1981). Darwin on man: A psychological study of scientific creativity (2nd ed.). University of Chicago Press.
Gruber, H. E., & Wallace, D. B. (2001). Creative work: The case of Charles Darwin. American Psychologist, 56(4), 346–349. https://doi.org/10.1037/0003-066X.56.4.346
Habiby, I., Hernani, & Riandi. (2020). Improving students’ NOS understanding through explicit-reflective learning with socio-scientific issues context. Journal of Physics: Conference Series, 1806, 012122. https://doi.org/10.1088/1742-6596/1806/1/012122
Haig, B. D. (2005). An abductive theory of scientific method. Psychological Methods, 10(4), 371–388. https://doi.org/10.1037/1082-989X.10.4.371
Haig, B. D. (2008). An abductive perspective on theory construction. Journal of Theory Construction and Testing, 12(1), 7–10.
Hanson, N. R. (1958). Patterns of discovery. Cambridge University Press.
Harré, R. (1986). Varieties of realism: A rationale for the natural sciences. Blackwell.
Honenberger, P. (2018). Darwin among the philosophers: Hull and Ruse on Darwin, Herschel, and Whewell. HOPOS: Journal of the International Society for History of Philosophy of Science, 8(2), 278–309. https://doi.org/10.1086/698894
Iranzo, V. (2009). Abduction and inference to the best explanation. Theoria: An International Journal for Theory, History and Foundations of Science, 22(3), 339–346. https://doi.org/10.1387/theoria.455
Izquierdo-Aymerich, M., & Adúriz-Bravo, A. (2003). Epistemological foundations of school science. Science & Education, 12(1), 27–43. https://doi.org/10.1023/A:1022698205904
Justi, R., & Gilbert, J. K. (1999). History and philosophy of science through models: The case of chemical kinetics. Science & Education, 8(3), 287–307. https://doi.org/10.1023/A:1008645714002
Kapitan, T. (1992). Peirce and the autonomy of abductive reasoning. Erkenntnis, 37(1), 1–26. https://doi.org/10.1007/BF00220630
Keegan, R. (1989). How Charles Darwin became a psychologist. In D. B. Wallace & H. E. Gruber (Eds.), Creative people at work: Twelve cognitive case studies (pp. 107–126). Oxford University Press.
Kitcher, P. (1985). Darwin’s achievements. In P. Kitcher (2003), In Mendel’s mirror: Philosophical reflections on biology (pp. 45–93). Oxford University Press.
Kitcher, P. (1993). The advancement of science: Science without legend, objectivity without illusions. Oxford University Press.
Kleiner, S. A. (2003). Explanatory coherence and empirical adequacy: The problem of abduction, and the justification of evolutionary models. Biology and Philosophy, 18(4), 513–527. https://doi.org/10.1023/A:1025523022460
Konrad, K. (2004). Model generation for natural language interpretation and analysis. Springer.
Laçin-Şimşek, C. (2019). What can stories on history of science give to students?: Thoughts of science teacher candidates. International Journal of Instruction, 12(1), 99–112. https://doi.org/10.29333/iji.2019.1217a
Lawson, A. E. (2010). Basic inferences of scientific reasoning, argumentation, and discovery. Science Education, 94(2), 336–364. https://doi.org/10.1002/sce.20357
Leite, L., Oldham, E., Afonso, A. S., Viseu, F., Dourado, L., & Martinho, M. H. (Eds.). (2020). Science and mathematics education for 21st century citizens: Challenges and ways forward. Nova Science Publishers.
Lewens, T. (2007a). Adaptation. In D. Hull & M. Ruse (Eds.), The Cambridge companion to philosophy of biology (pp. 1–21). Cambridge University Press.
Lewens, T. (2007b). Darwin. Routledge.
Lipton, P. (2000). Inference to the best explanation. In W. H. Newton-Smith (Ed.), A companion to the philosophy of science (pp. 184–193). Blackwell.
Lloyd, E. (1983). The nature of Darwin’s support for the theory of natural selection. Philosophy of Science, 50(1), 112–129. https://doi.org/10.1086/289093
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Springer.
Mayr, E. W. (1998). This is biology: The science of the living world. Harvard University Press.
McGrew, T. (2003). Confirmation, heuristics, and explanatory reasoning. The British Journal for the Philosophy of Science, 54(4), 553–567. https://doi.org/10.1093/bjps/54.4.553
Morrison, M. (2000). Unifying scientific theories: Physical concepts and mathematical structures. Cambridge University Press.
Newton-Smith, W. (1978). The underdetermination of theory by data. In R. Hilpinen (Ed.), Rationality in science (pp. 91–110). Springer. https://doi.org/10.1007/978-94-009-9032-6_8
Niebert, K., Marsch, S., & Treagust, D. F. (2012). Understanding needs embodiment: A theory-guided reanalysis of the role of metaphors and analogies in understanding science. Science Education, 96(5), 849–877. https://doi.org/10.1002/sce.21026
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Norton, J. D. (2021). The material theory of induction. BSPS Open/University of Calgary Press.
Nygård Larsson, P., & Jakobsson, A. (2020). Meaning-making in science from the perspective of students’ hybrid language use. International Journal of Science and Mathematics Education, 18(5), 811–830. https://doi.org/10.1007/s10763-019-09994-z
Oh, P. S. (2019). Features of modeling-based abductive reasoning as a disciplinary practice of inquiry in earth science: Cases of novice students solving a geological problem. Science & Education, 28(6-7), 731–757. https://link.springer.com/article/10.1007/s11191-019-00058-w
Okasha, S. (2002). Philosophy of science: A very short introduction. Oxford University Press.
Paavola, S. (2004). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9(3), 267–283. https://doi.org/10.1023/b:foda.0000042843.48932.25
Peirce, C. S. (1931–1958). Collected papers. 8 volumes. Harvard University Press.
Pérez, G., Gómez Galindo, A. A., & González Galli, L. (2021). La regulación de los obstáculos epistemológicos en la enseñanza y el aprendizaje de la evolución. Enseñanza de las Ciencias, 39(1), 27–44. https://doi.org/10.5565/rev/ensciencias.2968
Plutynski, A. (2011). Four problems of abduction: A brief history. HOPOS: The Journal of the International Society for the History of Philosophy of Science, 1(2), 227–248. https://doi.org/10.1086/660746
Popper, K. R. (1974). Intellectual autobiography. In P. A. Schilpp (Ed.), The philosophy of Karl Popper (pp. 3–184). Open Court.
Pramling, N. (2009). The role of metaphor in Darwin and the implications for teaching evolution. Science Education, 93(3), 535–547. https://doi.org/10.1002/sce.20319
Psillos, S. (2002). Simply the best: A case for abduction. In A. C. Kakas & F. Sadri (Eds.), Computational logic: Logic programming and beyond. Lecture Notes in Computer Science (Vol. 2408, pp. 605–625). Springer. https://doi.org/10.1007/3-540-45632-5_24
Putnam, H. (1981). Reason, truth and history. Cambridge University Press.
Recker, D. A. (1987). Causal efficacy: The structure of Darwin’s argument strategy in The origin of species. Philosophy of Science, 54(2), 157–175. https://doi.org/10.1086/289368
Richards, R. A. (1997). Darwin and the inefficacy of artificial selection. Studies in History and Philosophy of Science Part A, 28(1), 75–97. https://doi.org/10.1016/S0039-3681(96)00008-8
Rivadulla, A. (2007). Abductive reasoning, theoretical preduction, and the physical way of dealing fallibly with nature. In O. Pombo & A. Gerner (Eds.), Abduction and the process of scientific discovery (pp. 199–210). Centro de Filosofia das Ciências da Universidade de Lisboa.
Rivadulla, A. (2015). Meta, método y mito en ciencia [Goal, method and myth in science]. Trotta.
Rodrigues da Silva, M. (2017). Paul Thagard e a inferência da melhor explicação [Paul Thagard and the inference to the best explanation]. Cognitio: Revista de Filosofia, 18(1), 125–134. https://doi.org/10.23925/2316-5278.2017v18i1p125-134
Rodrigues da Silva, M., & Castilho, D. C. (2015). Inferências eliminativas e o problema das alternativas não concebidas [Eliminative inferences and the problem of unconceived alternatives]. Filosofia Unisinos, 16(3), 241–255. https://doi.org/10.4013/fsu.2015.163.04
Ruse, M. (1979). The Darwinian revolution. University of Chicago Press.
Ruse, M. (2008). Charles Darwin. Blackwell.
Samaja, J. (2005). Epistemología y metodología: Elementos para una teoría de la investigación científica [Philosophy of science and methodology: Elements for a theory of scientific research]. 3rd edition, 6th reprint. EUDEBA.
Sans Pinillos, A. (2021). Neglected pragmatism: Discussing abduction to dissolute classical dichotomies. Foundations of Science, open access. https://doi.org/10.1007/s10699-021-09817-x
Sans Pinillos, A., & Adúriz-Bravo, A. (2021). Un lugar para el razonamiento abductivo en la formación de profesores de ciencias [A place for abductive reasoning in science teacher education]. Tecné, Episteme y Didaxis, special issue Memorias del IX Congreso Internacional sobre Formación de Profesores de Ciencias, 1825-1830. https://revistas.pedagogica.edu.co/index.php/TED/article/view/15471/10250
Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234. https://doi.org/10.1007/s11229-007-9223-4
52 Darwin’s Ideas as Epitomes of Abductive Reasoning in the Teaching. . .
Sengul, O. (2019). Linking scientific literacy, scientific argumentation, and democratic citizenship. Universal Journal of Educational Research, 7(4), 1090–1098. https://doi.org/10.13189/ujer.2019.070421
Sensevy, G., Tiberghien, A., Santini, J., Laubé, S., & Griggs, P. (2008). An epistemological approach to modeling: Cases studies and implications for science teaching. Science Education, 92(3), 424–446. https://doi.org/10.1002/sce.20268
Shelley, C. (1999). Multiple analogies in evolutionary biology. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 30(2), 143–180. https://doi.org/10.1016/S1369-8486(98)00030-2
Stamos, D. N. (2007). Darwin and the nature of species. State University of New York Press.
Stamos, D. N. (2008). Evolution and the big questions: Sex, race, religion, and other matters. Blackwell.
Sterrett, S. (2002). Darwin’s analogy between artificial and natural selection: How does it go? Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 33(1), 151–168. https://doi.org/10.1016/S1369-8486(01)00039-5
Thagard, P. (1978). The best explanation: Criteria for theory choice. The Journal of Philosophy, 75(2), 76–92. https://doi.org/10.2307/2025686
Thagard, P. (1989). Explanatory coherence. Behavioral and Brain Sciences, 12(3), 435–502. https://doi.org/10.1017/S0140525X00057046
Thagard, P. (1992). Conceptual revolutions. Princeton University Press.
Thagard, P. (2005). Mind: Introduction to cognitive science (2nd ed.). The MIT Press.
Theunissen, B. (2012). Darwin and his pigeons: The analogy between artificial and natural selection revisited. Journal of the History of Biology, 45(2), 179–212. https://doi.org/10.1007/s10739-011-9310-8
Turner, D. (2000). The functions of fossils: Inference and explanation in functional morphology. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 31(1), 193–212. https://doi.org/10.1016/S1369-8486(99)00043-6
Upmeier zu Belzen, A., Engelschalt, P., & Krüger, D. (2021). Modeling as scientific reasoning: The role of abductive reasoning for modeling competence. Education Sciences, 11(9), 495. https://doi.org/10.3390/educsci11090495
van Wyhe, J. (Ed.) (2002). The complete work of Charles Darwin online. http://darwin-online.org.uk/
Venville, G., & Treagust, D. (1997). Analogies in biology education: A contentious issue. The American Biology Teacher, 59(5), 282–287. https://doi.org/10.2307/4450309
Wilner, E. (2006). Darwin’s artificial selection as an experiment. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 37(1), 26–40. https://doi.org/10.1016/j.shpsc.2005.12.002
Wirth, U. (1999). Abductive reasoning in Peirce’s and Davidson’s account of interpretation. Transactions of the Charles S. Peirce Society, XXXV(1), 115–127.
Woods, J. H. (2013). Errors of reasoning: Naturalizing the logic of inference. College Publications.
Abductive Irradiation of Cultural Values in Shared Spaces: The Case of Social Education Through Public Libraries
53
Alger Sans Pinillos
Contents
Introduction
The Distinction Between Cultural Spaces and Spaces of Culturalization
Values Distribution: Behavioral Influence as a Mechanism for Transforming the Environment
The Symptom of Invisibilization: The Non-representation
Experiences Anticipation: The Role of Autobiography in the Bad Expectations Generation
Books, Ways of Reading, and Libraries
Sic Eum Legentem Vidimus Tacite: Personal Traits of Silent Reading
The Experience of Reading and Studying in Silence
Abduction as a Real Social Mediator
The Abductive Shared Cosmology Construction
Inert Moral Mediators: Irradiation from Agents to Artifacts and Vice Versa
The Public Library Experience, Nowadays
Idealized Activities Materialized in a Building
Fer Barri: The Public Library Brought to the Neighborhood
Library Design and Its Structure
The Public Library as a Situated Affordance
The Public Library as Mediating Artifact and Distributor of Cultural and Moral Values
Conclusions: The Distribution of Cultural Values Is a Process Based on Abduction
References
A. Sans Pinillos, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy. e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_51
Abstract
This chapter analyzes how some spaces or places become distributors of cultural values. The relevance of these places lies in the fact that, together with a direct relationship with their activities, they become structures of irradiation of values considered positive. This emphasizes how these activities are carried out when they occur in these places, where the most important thing is the synergies that make it possible to transform the environment by modifying the habits of the people who go there. The public library is taken as an example. The modern library is the result of the evolution of reading and writing. These activities have a very high cultural value that influences the moral perception we have of the people who perform them. For this reason, the library occupies a very prominent place in society, not only because of the importance given to the activities that can be carried out there but also because wanting to go there implies adopting an attitude. In the same way, certain behaviors are assumed by the simple fact of being near a library. This phenomenon is approached from an eco-cognitive perspective: libraries are distributive ethical mediators that allow the application of cognitive strategies to interact morally with the environment. Likewise, the design of the building itself also acts as a moral distributor because it incites us to behave correctly within socially and culturally accepted patterns. Abductive reasoning is presented as the ideal mechanism to represent this complex process of imbrication between values, facts, expectations, and emotions.
Keywords
Abduction · Irradiation of cultural values · Social education · Public libraries · Situated affordance · Inert moral mediators
The rulers of planets and stars
The power of the kings of traders and the wars
Planetary cycles and the phases of the moon
Is in the document a kingdom they will learn
(Iron Maiden 2015, The Book of Souls, 03m39s)
Introduction
This chapter analyzes how some spaces or places become distributors of cultural values. The relevance of these places lies in the fact that, together with a direct relationship with their activities, they become structures of irradiation of values considered positive. This emphasizes how these activities are carried out when they occur in these places, where the most important thing is the synergies that make it possible to transform the environment by modifying the habits of the people who come to them. Likewise, this way of acting seems to discourage behaviors considered the opposite. In particular, the public library is taken as a reference, since other organizations, centers, and institutions have based their structure and regularization on its standards. An example is the network of activities in a library, which often makes it possible to join a community center or hall.
The modern library is the result of the evolution of reading and writing. We therefore attribute to these activities a value that goes beyond the uses we give them, to the point of influencing the moral perception we have of the people who carry them out. For this reason, the library occupies a very prominent place in society, not only because of the importance given to the activities that can be carried out within it but also because wanting to go there implies adopting an attitude. In the same way, certain behaviors are assumed by the simple fact of being near a library. In this chapter, this phenomenon is approached from an eco-cognitive perspective: the library is understood as an artifact that enables the construction of ethical mediators, with which we generate strategies to interact morally with the environment. In the same way, the design of the building itself also acts as a moral distributor because it encourages us to behave correctly within socially and culturally accepted patterns. Abductive reasoning is the ideal mechanism to represent this complex process of imbrication between values, facts, expectations, emotions, etc., because it allows us to complete the cognitive dimension of the perception we have of places like the library.
The Distinction Between Cultural Spaces and Spaces of Culturalization
In this chapter, public libraries can be understood as an exemplary case of distribution of values that can lead to social improvement. According to the definition of the library given in the Public Library Manifesto 1994:

The public library is the local center of information, making all kinds of knowledge and information readily available to its users. The services of the public library are provided on the basis of equality of access for all, regardless of age, race, sex, religion, nationality, language or social status. Specific services and materials must be provided for those users who cannot, for whatever reason, use the regular services and materials, for example linguistic minorities, people with disabilities or people in hospital or prison. All age groups must find material relevant to their needs. Collections and services have to include all types of appropriate media and modern technologies as well as traditional materials. High quality and relevance to local needs and conditions are fundamental. Material must reflect current trends and the evolution of society, as well as the memory of human endeavor and imagination. Collections and services should not be subject to any form of ideological, political, or religious censorship, nor commercial pressures. (UNESCO & IFLA, 1994, sec. The Public Library)

I refer to these buildings as cultural spaces because they distribute cultural values to the people around them without standardized regularization. From this perspective, I understand culture as the whole cognitive dimension of the agent, his thoughts and actions, and his way of life. Likewise, “value” is understood here as a property that makes one object or fact better than another from a non-descriptive criterion (Sans Pinillos & Casacuberta, 2019, p. 321). However, regulations are often based on generalizations because they must be applicable to everyone. These definitions raise the following difficulty: how can a building be a distributor of cultural values without falling into regularization? One of the theses addressed in
this chapter is that these places are built to generate community through cultural activities. To be distributors of cultural values, then, these places must constitute a space whose rules are oriented only toward allowing a free relationship with the environment and with others. This first response is intended to place the reader before the difficulties, on the one hand, of trying to capture normatively the agent’s cognitive dimension, his thoughts and actions, and his way of life (cultural values) and, on the other hand, of articulating a system defined and oriented (in a certain sense, regulated) toward the stimulation of this dimension of the agent. Most commonly, institutional buildings are regularized. I refer to these places as spaces of “culturalization” because they are oriented to present finished cultural products. In this sense, whether these buildings generate community is a purely accidental matter. Furthermore, regularizations often entail a type of coherence that transforms the environment toward an institutional criterion, usually dominated by the most influential strata of society. As will be shown, this tends to undermine, marginalize, and exclude specific social groups. Some processes of exclusion are “elitization,” racism, and discrimination based on the cultural level of individuals. An example that will be addressed in this chapter is the museum. Museums are spaces of culturalization precisely because people cannot choose what is shown in them. Therefore, there is a passive imposition of social standards and cultural canons. As will be seen below, precisely because they do not contemplate difference and exclusivity, regularization and generalization can exert violence and be factors that generate social injustice. Conversely, values contain a high degree of personal interpretation. This exclusivity belongs to the subjectivity of each person.
This dimension contains the psychological (emotions, feelings, etc.), ethical, and aesthetic perspectives that constitute the agent’s inner life. The cognitive factors of these elements are considered here because they enrich epistemic processes; examples are emotions and the assumptions and prejudices we make from the comfort of the social and cultural framework to which we belong. We must also bear in mind the environment, articulated with the activities of the people living in it (Lai et al., 2013, p. 605), and, in short, everything that makes up the cognitive niche from which the cosmovision of a community is derived, that is, the unified picture of biological (perception) and sociocultural (conceptualization) interpretations of the world by human agents (Magnani et al., 2021). A space can be considered cultural when the proposed activities are aimed at the personal growth of the participating agents. This entails revisiting the problem of transforming norms into rules in light of the relationship between quantitative and qualitative elements. Qualitative elements are understood here as the psychological, ethical, and aesthetic perspectives that constitute the inner life of the agent. Quantification, by contrast, is the process by which things are signified by some method of description. When any part of the agent’s inner life is signified in this way, there is an inevitable loss of the personal dimension through generalization. The problem arises here: no defined scale can entirely represent a value judgment or the psychological dimension of an agent. The dichotomy between quantitative and qualitative elements
stems from that between fact and value. One of the most famous versions of this dichotomy is the naturalistic fallacy: that the good cannot be described by any natural object (cf. Moore, 2002). Nevertheless, there are also scales whose definitional criteria are partly determined by values (Longino, 1983). An example would be the importance, for the epistemology of mathematics, of mathematical beauty defined using the criteria of simplicity and brevity (cf. Wells, 1990). In these cases, it is usual for an evaluative criterion to be overridden in favor of a quantitative one (i.e., generalization) (Sans Pinillos & Casacuberta, 2019, p. 321). When this happens, the system that articulates a defined scale cannot explain why the value criterion is overridden for the sake of the quantitative one. In other words, there seems to be no clear criterion to establish the degree of participation that a value has in a defined system. One way to approach this question could be to analyze the ontological statuses that values and facts occupy in people’s practical and theoretical dimensions. This perspective allows us to analyze how the criterion for a value’s degree of participation is established, starting from the debate on the transformation of norms (the tacit and shared social and cultural dimension of standards of conduct) into rules (explicit and generalized law expressed in a defined legal corpus). This approach allows us to interpret the problem derived from the transformation of norms into rules from a pragmatic point of view. To say that the public library is an artifact that distributes cultural values without implying any regularization implies the following. On the one hand, it means that the activities that can be carried out in public libraries contain self-regulated values in their practice.
On the other hand, it implies that these activities are sufficiently rooted in every social reality for their practice not to imply injustices such as discrimination and invisibilization. As will be seen, these consequences apply even to those who have not learned the skills necessary to carry out the activities offered in a public library. The reason is that these buildings are essentially places of learning, both individual and collective, and do not discriminate on any grounds. Therefore, a first orienting answer to the question of how a place can be a distributor of cultural values without falling into regularization could be the following: to be a distributor of cultural values, a place must constitute a space whose rules are oriented only toward allowing a free relationship with the environment and with others. The next question raised by this approach is how values are distributed. As will be seen, the key to this process is the abduction manifested in all human activity.
Values Distribution: Behavioral Influence as a Mechanism for Transforming the Environment
As stated in the Declaration of Québec Libraries issued by the Bibliothèque et Archives Nationales du Québec:
Libraries are unique in providing free and unrestricted access to library resources, materials and cultural content, thereby enabling patrons to improve their knowledge and skills and pursue their own interests and goals. This access is made possible by the organization, processing and structuring of information done by technically and professionally qualified staff. Libraries give their patrons the necessary tools, access and knowledge so they can acquire the critical thinking skills that will enable them to become well-informed citizens, exercise their democratic rights and play an active role in their community. (BANQ, 2016, sec. An Information and Cultural Hub)
It is important to highlight what it means for the library to be a space of free and unrestricted access: libraries are buildings for all people, regardless of origin, beliefs, sexual orientation and identity, economic situation, and age. For this reason, leaving aside all the positive aspects that can be attributed to other possible cases, we will not consider those spaces whose sole objective is to make difference tolerable through ideological homogenization. The reference here is to places considered sacred by some, those destined for catechization and, in short, all those that motivate any kind of moralization. I am referring to places of religious worship, but also to spaces named after some historical figure or containing monuments dedicated to memorable characters and deeds, which, as is well known, can generate controversy and discomfort while inspiring pride, reinforcing the cohesion of groups of like-minded people, and so on. More typical examples also come to mind, however, such as museums, concert halls, and theaters. The reader may feel some discomfort. Indeed, there are reasons not to equate a church with a museum, or with a monument dedicated to people with a questionable past. However, it cannot be denied that there are also similarities between them, such as the following: (a) they are not places that are really intended for everyone; (b) they are not distributors of those values that can transform the environment for the better by influencing the behavior of the people who come to them. In other words, although some of them may be considered cultural spaces, a disruption seems to occur in them in the relationship between society and culture. An example of this is the alienation produced by elitism, which strips culture of the elements of value that underlie all social activity. This fact always affects people somehow, a circumstance that must be addressed to delimit the topic.
The Symptom of Invisibilization: The Non-representation
The idea of the museum as a space for collecting and preserving persists. Although this situation does not occur in all cases (Rectanus, 2006, pp. 388–389), in most of them, there is no explicit orientation to offer a social value (Sandell, 2002, p. xvii). On the contrary, the museum is often conceived as a space for idle contemplation (Prior, 2006, pp. 509–510). Of course, this does not mean that these places are devoid of some degree of transformative value; the question is rather where such planned change is headed.
For example, museums can teach and incentivize, roughly speaking, aesthetic experience (Bell, 2017). They can also be “restorative places” where people can let off steam through the fascination of the exhibits (Kaplan et al., 1993). Likewise, museums are focal points for the regeneration of the areas in which they are built because they encourage tourism. Most of the time, this circumstance is promoted and encouraged by governments through investors to turn that territory (city, town, village, etc.) into an exclusive place to visit and invest in (Álvarez, 2010, p. 172), a fact that motivates and pushes neighbors to innovate and invest to meet their needs (ibid., p. 171). One way of understanding this circumstance is to use Feyerabend’s cognitive distinction between the participant’s and the observer’s questions (Feyerabend, 1978, p. 18). In short, while the observer is the one who analyzes the final versions of systems, theories, etc., the participant is the one who develops the investigation. This distinction is used in this chapter to point out the difference between experts and visitors. From this perspective, the participants would be the visitors to these places, who interact with what is exhibited. The observers, on the other hand, would be the experts behind the exhibitions, who design the structures. We must also consider who decides where museums will be built. It may seem shocking to relate the experts to the observers’ point of view. However, we must bear in mind that Feyerabend’s concept concerns investigations built from the ideas of both progress and finished science. From this perspective, only successful scientific cases are analyzed. This influence is reflected in the disciplines they study and practice through their actions.
Nevertheless, this influence is nuanced, tempered, and justified by the training received in their areas of knowledge, which teaches them to formulate certain types of questions, to reconstruct from specific objects and patterns, to distance themselves from their assumptions, and to innovate in the same way through a specific criterion of rigor. The distinction between participant and observer can also serve to analyze the process of “culturalization” as gentrification by analyzing those interested in the artifacts exhibited in museums. “Elitization” can be a good starting point to understand what is cause and what is effect in the process of invisibilization of certain groups through cognitive biases. These patterns manifest themselves through the different ways of forming the social prejudices that bind us together culturally. What is relevant to this research is not whether museums are spaces that “elitize,” but whether they are perceived as such. This is important because it is the basis for public libraries to become effective distributors of value to all who visit them. In other words, for a library to influence people’s behavior, people must first give the library the ability to influence them. As shown below, whether this happens depends on people’s impression of what the library represents. This, in turn, comes from the socially and culturally shared perception of the concrete conceptual framework in which one lives, which will be referred to here as cosmovision. The perceived elitism in these places is based on the exclusivity of the activities that can be developed in them, a fact that inevitably leads to people feeling excluded. The reason is not a prohibition but an effect of disaffection that may cause some people to feel disinterested in going to these places. This disinterest should
not be confused with minor issues such as the ability needed to appreciate art or the knowledge needed to understand what is exhibited or represented in these places. On the contrary, this situation is due to the biases and prejudices that prevail in different societies. The reference here is to the different degrees of the feeling of uprootedness and, ultimately, of detachment generated toward the place where one lives. This sometimes means that, although there may be a sincere and positive response to museums, there is also a lack of interest that does not correspond to how valuable they are considered to be for society (Silvia, 2006, p. 96). One example is invisibilization caused by discrimination, which, leaving aside the injustices it may generate, directly influences the people who suffer it. Considering that interest is an emotion, its influence on the agent’s perceptions of museums (i.e., what is shown in them and how it is presented) can be understood as qualitative.
Experiences Anticipation: The Role of Autobiography in the Bad Expectations Generation
What has been said so far allows us to put forward the idea that perception is a situated information process, so that place has a determining role in the way we perceive (Määttänen, 2017, p. 95). This perspective is very close to pragmatism and can help us understand the relevance of agents’ actions in context to the cognitive process of experiencing. As is known, pragmatism affirms that investigation is vitally linked to the human condition: our actions are ontologically constitutive of the world (Sans Pinillos, 2021). Incorporating the action factor in the knowledge equation gives us valuative elements that help to better understand the relationship between theory and praxis in investigation. As is well known, this is a classic topic in philosophy and refers to the problem of reconciling knowledge considered as particular with the contingencies produced by the variations that occur in experience. This topic is configured by hypothetical knowledge and by how to manage uncertainty and ignorance of the future (Apel, 2016, p. 11). In the first configuration of Peirce’s pragmatism, the problem of the grounding of sensible knowledge is managed through the continuous mechanism of open and hypothetical approximation represented by abduction. From this perspective, inquiry implies a constant and controlled approximation to reality (ibid., pp. 29–30). Further down, abductive reasoning is proposed as the centerpiece of values distribution. First, however, it is necessary to explain better the relevance of the effect of disaffection that may cause some people to feel disinterested in going to some places. The invisibilization of collectives is caused, among other things, by the discrimination that comes with elitism and institutional racism.
On the one hand, these mechanisms (of power) generate cohesion based on different general criteria that make up the social and cultural cosmovision. Nevertheless, on the other hand, these same mechanisms are responsible for the fact that certain groups of people
do not go to spaces of culturalization such as museums. Conversely, people who feel excluded from some places may feel included in others, such as cultural spaces. Therefore, before moving on to the distribution of cultural values, it is necessary to explain what happens when a person decides not to go to a museum. As has been said, museums and public libraries differ in too many respects to be legitimately comparable. The reason for addressing this question is not to establish that a library is better than a museum but to establish one of the fundamentals by which a library becomes a cultural space: the agent does not prima facie feel excluded from the public library. This means that it is necessary to have prior knowledge that motivates the decision to go or not to go to a place (a decision with emotional and rational bases). However, what happens in cases where someone does not want to go to a place they have never been to? There is a reason for not wanting to go to a place that lies beyond the information one has. Multiple factors can explain this situation. For example, the collective imaginary directly influences the agents’ narratives. From a sociological perspective, it is possible to understand this relationship through the link between biography, history, society, and its structures, and through the relationship between what is public and what is private. The combination of these factors is the social reality: the product of the relationship between personal worlds (microscopic) and social structures (macroscopic) (Brewer, 2004). These social factors shape each person’s unique memory through multiple complex cognitive processes. For example, agents present themselves from different perspectives through their autobiographical consciousness and their autobiographical memory (Nelson & Fivush, 2020). Of these perspectives, the one of interest here is imagining oneself in a possible future.
In other words, a social reality built on discrimination and invisibilization can generate for an agent an autobiography that offers no expectation of imagining himself or herself going to a museum. Put differently, such a biography may lead the agent to expect bad experiences. As Albertazzi says: “anticipation is based on future expectation, which means adopting a forward-looking stance and using it to change the present conditions” (Albertazzi, 2017). However, for an expectation to be substantiated as a sufficiently plausible hypothesis, it needs some corroboration in experience before it is adopted as a guide for action. For example, an agent experiences cases of exclusion and hatred (racism) or invisibilization (i.e., that the school curricula do not contain writers, artists, and scientists of a particular collective, ethnicity, or gender). On this point, abductive reasoning will be proposed later as the centerpiece of values distribution. From this perspective, the expectation of having bad experiences is caused by the multiple discriminating factors already mentioned, which in the present cause the action of not going to that place. In the same way, we can assume that good expectations about going to a library are closely connected to the type of activities that can be carried out there without discrimination. Of course, I am referring to studying, reading, and writing.
A. Sans Pinillos
Books, Ways of Reading, and Libraries
Reading and writing are practices that have not always been associated with the library. They have also evolved into their current form over the centuries. During these transformations, reading and writing have been associated with values still present in our culture today. These same values constitute public libraries. In many ways, these practices have evolved as the basis for these buildings to be distributors of cultural values. In this chapter, the library is understood as a social and cultural project whose existence shows the state of society. The first library was built in Alexandria; its purpose was to preserve human knowledge by studying, translating into Greek, and cataloging every document that entered the city (Dupont, 2009, p. 145). The initial purpose of such copies was not their distribution among citizens; rather, the project was aimed at treasuring the content of the documents. Access to them and their dissemination to the general public took place by oral transmission. The fact that the Greek and Roman people (ibid., p. 144) did not have a widespread habit of individual reading does not mean that they were illiterate (Thomas, 1999). On the contrary, the bureaucracy that every citizen had to handle to be politically involved in his community was transmitted in writing (ibid., p. 3). From what has been said so far, it can be seen that what was not widespread was the current form of reading and writing practices. Archaeological findings, however, show that writing was widespread: people wrote on countless surfaces, such as ceramics, stelae, furniture, and coins. Likewise, reading was conceived as a collaborative practice in which people read aloud. The key to this orality is that writing is a support by which, through imitation, its content is represented and, thus, the words take on meaning (Dupont, 2009, p. 148).
On the contrary, we know that writing and reading were not considered the most optimal resource for acquiring knowledge (i.e., Plato, 1903, Let. VII). The most accurate conclusion is to consider that writing and reading had not yet been defined as independent activities (Dupont, 2009, p. 144). The surface on which it was written and its content remained secondary to the public performativity of reading the text (ibid., p. 147). Another relevant fact is the high cost of the materials and the time needed to produce the texts (originals or copies). This circumstance did not change, in turn, the very conception of reading. In other words, the technology change is intimately linked to variations in its use, and, as we shall see, this is a determining factor in the way we conceive and experience libraries today.
Sic Eum Legentem Vidimus Tacite: Personal Traits of Silent Reading
The popularization of silent reading is linked to the uses of texts, which, in turn, are intimately related to the different modifications that texts have undergone. It should be kept in mind that there is no evidence of silent reading until the fifth century and that
it was unheard of for centuries. An example of the impression made by these early cases can be found in St. Augustine’s Confessions, where he comments with astonishment:
But when he [Saint Ambrose of Milan] was reading, he drew his eyes along over the leaves, and his heart searched into the sense, but his voice and tongue were silent. Ofttimes when we were present (for no man was debarred of coming to him, nor was it his fashion to be told of anybody that came to speak with him) we still saw him reading to himself, and never otherwise: so that having long sat in silence (for who durst be so bold as to interrupt him, so intensive to his study?) we were fain to depart. We conjectured, that the small time which he gat for the repairing of his mind, he retired himself from the clamour of other men’s businesses, being unwilling to be taken off for any other employment: and he was wary perchance too, lest some hearer being struck into suspense, and eager upon it, if the harder questions; so that spending away his time about this work, he could not turn over so many volumes as he desired: although peradventure the preserving of his voice (which a little speaking used to weaken) might be a just reason for his reading to himself. But with what intent soever he did it, that man certainly had a good meaning in it. (Augustine, 1931, Conf., VI, §3, 273–275)
There is a relationship between the abandonment of the scriptio continua (writing, mainly in capitals and without metrics, in which there was no use of spaces between words or punctuation marks) and the evolution of the Greek alphabet (Powell, 2009, pp. 227–244). We must also bear in mind the transition from the book in scroll format to the codex, that is, the one introduced and generalized by Christianity, in which the sheets (papyrus or leather) were superimposed on each other, folded in the middle, and fixed by the seams that defined the pages (Turner, 1977). In the same way as the books of our days, the compact form of these pages allowed their protection with covers and, therefore, to make them even more durable and transportable, which made them much easier to keep and conserve (ibidem). Finally, another relevant factor for the absorption of these changes is the evolution to use paper for writing, which lowered production costs and, therefore, facilitated the spread of the use of blank spaces in the text. As is well known, this process ended with the modern printing press.
The Experience of Reading and Studying in Silence
There are notable differences between analyzing historically the evolution of writing and reading from the social and cultural patterns of communities and asking how agents individually experience these events. Again, Feyerabend’s distinction between observer and participant could be advantageous. On the one hand, the historical reading is done by the observers. On the other hand, the different ways in which the agents experience the environment while interacting with it concern the cognitive dimension of the participants. For example, the challenge implied by the recovery of classical texts through Arabic translations caused advances in the systematization of the language that, in the end, led to the gradual abandonment of the scriptio continua. It is important to keep in mind that access to previously unknown works has consistently offered new tools for investigating nature (Crombie, 1953, pp. 19–21) insofar as it has posed new challenges and questions. These cases show how our experience is determined by the manipulation of objects and artifacts. In the same way, the changes produced in the practice of reading implied modifications in the agents, which were made visible by a concrete social and cultural transformation. For example, the abandonment of the scriptio continua for writing with spaces and punctuation was one of the factors that led to silent reading. This change developed from the evolution of cognitive processes and strategies such as decoding texts and learning to extract information from a page without the need to externalize its content. This gives rise to a particular type of relationship with the book, which becomes an artifact with which one can interact independently to extract knowledge from the world (Saenger, 1997, pp. 2–6). This relationship gives the conception of books and reading a series of cultural values that permeate society through education. These values are a mixture of shared conceptions about the benefits of reading in the sense of becoming cultured, together with the behavior associated with those who engage in this practice. To this must be added the influence of the collective imaginary regarding introspection, meditation, and study. Although schooling was a determining factor for literacy, the free and disinterested practice of reading and silent studying began to crystallize in libraries during the eighteenth and nineteenth centuries, leading to the conception we have today.
Abduction as a Real Social Mediator
Abduction can be defined as a mechanism that acts by an epistemic virtue distinct from the classical one, which serves to account for situations (surprising, puzzling, etc.) that cannot be approached from a classical epistemological perspective (Magnani, 2001; Gabbay & Wood, 2005; Aliseda, 2006; among others). From this definition, it can be said that the function of abduction is to manage the experience of novel facts through a process of generating hypotheses or conjectures that indicate new lines of action. The current concept of abduction is a reinterpretation of Peirce’s (1958, CP 5.14–40) recovery of the Aristotelian apagōgē (ἀπαγωγή) (Aristotle, 1957, An. Pr. II 25, 69a20–35). Peirce intended to ground the acquisition of experience (Peirce, 1958, CP 5.348). In this chapter, the eco-cognitive perspective of abduction (aka EC-Model) is taken because it situates embodied cognition from a contextual perspective (Magnani, 2017). Furthermore, the pragmatic maxim of considering truth on the basis of its practical dimension (Peirce, 1958, CP 5.14–40) is naturalized. Thus, the logic of abduction offers a characterization of the cognitive processes that articulate hypothetical reasoning (Park, 2017). In particular, an anthropomorphization of reasoning (Magnani, 2017, p. 138) is assumed to explain the processes of adaptation
to experience that occur during interaction with the environment (context). For this reason, abduction is characterized as an open system of information acquisition. Peirce’s pragmatism assumes that investigation is largely based on processes of continuous adaptation in an environment of the constant flux of experience. In this sense, abduction operates as third reasoning (Finocchiaro & Woods, 2014), controlling the proliferation of experiences (Peirce, 1958, CP 616–648). The pragmatic postulate operating at the epistemological level can be interpreted from a cognitive perspective to explain the strategies that operate to accommodate variations in perception (Shanahan, 2005).
The Abductive Shared Cosmology Construction
Peirce places the hypothesis as the basis of perception (1958, CP, 2.619–644), which is socially and culturally characterized through semiotic relations with habits (Cannizzaro & Anderson, 2016). Peirce’s pragmatism places the agent at the center of the investigation in the philosophy of science and analyzes all these problems from the characterization of scientific practice. In this way, abduction must fulfill some general requirements. First, abduction must be characterized as a constantly open multimodal system of inference. This means that abduction operates at different levels by leveraging the cognitive resources available depending on the context (cf., Park, 2017, p. 42). This characterization equips abduction to optimize the situation in which the reasoning takes place through a constant permutation of the input and output roles of information through action (Magnani, 2017, p. 139). This fact is crucial for a correct understanding of the social and cultural dependence of the meanings of concepts. Likewise, the objective of investigating could be the construction of a shared cosmology. Human relationships are highly tentative and assume many plausible hypotheses to understand gestures and dynamics, among other things. In this sense, the adaptability of hypotheses is mixed with our predisposition to act according to our interpretations. As mentioned above, multiple factors can explain this situation, notably the narratives of the collective imaginary that directly influence the agents’ narratives. These narratives shape social reality: the product resulting from the relationship between personal worlds and social structures (the social and cultural framework). Therefore, abduction can be conceived as integrating the agents’ interactive predisposition with the environment. In this way, the context is closely related to the agents’ perception.
From this perspective, it is possible to explain some of the multiple and complex cognitive processes that affect agents’ memory. As has been explained, one of the most interesting phenomena is the agents’ multi-perspective self-presentation (autobiographical consciousness and memory). This capacity allows agents to anticipate possible futures based on present expectations. However, it has also been argued that anticipation needs experience-based corroboration to endure and to determine agents’ decisions and actions. This circumstance can be explained from the classical characterization of syllogistic abduction (cf., Peirce, 1958, CP 5.180–212):
1. (rule): the information from A (the collective imaginary) that one will be discriminated against because of x.
2. (case): this is corroborated by the agent’s experience of being discriminated against.
3. (abduction): this experience triggers the agent’s hypothesis that the reason for being discriminated against is x.
The possibility of presenting oneself from different perspectives and the ability to anticipate possible futures in the present can be articulated through abduction (Adams et al., 2009). Both cognitive resources involve some action. First, reasoning is always in time (past-future-present). Second, the probabilistic calculation is based on (a) an estimation using the facts experienced in the present and (b) determining courses of action based on these forecasts. These processes also involve integrating assessments with objective data to project hypotheses that can be fulfilled in the present. In other words, they allow the agent to imagine what he or she wants to happen and what he or she does not want to happen, interrogate possible futures, and even imagine alternative ones (dystopian, utopian, or simply fantastic) to change the world. When these cognitive resources are situated in a society, the need arises to explain the relationship between individual and collective actions. By collective decision, we mean here the general sense of deliberation, and by interaction the usual relationships that do not necessarily have a premeditated objective to resolve. In both cases, these are complex processes of abductive inference because they try to make sense of information within a social and cultural context. Therefore, several types of negotiation with the rest of the community are involved in information acquisition and distribution processes. The pragmatic postulate acts here to allow the agent to act on assumptions that are taken prima facie to be provisional (Haack, 1995). As mentioned above, this serves to manage situations of uncertainty.
However, it is not easy to differentiate between the epistemic and the moral dimension when applied to social reality. A hypothesis on a knowledge issue may affect a hypothesis on a value issue. For example, when teachers and psychologists in a high school “invite” a student to drop out, their decision is strongly based on social and cultural assumptions about what a good student should be like. Similarly, assumptions about value issues embedded in scientific practices significantly affect people’s lives. For example, a physician’s understanding of the meaning of life directly influences his or her professional practice when deciding whether to perform an abortion. In the same direction, one could also consider the debate on euthanasia. Also, every day there are ethical arguments that try to influence scientific practice and scientific knowledge to modify prejudices and social assumptions (e.g., the current debates on pseudoscience and science denialism). Anticipating a future is a form of “leading away” (hypothesizing). Therefore, it is a process of uncertainty management. However, social uncertainty is in many cases about problems whose only solution is to keep them open. I call this type of problem a continuous trigger because it predisposes the agent to investigate topics with no definitive answers (Sans Pinillos & Magnani, 2022). Value questions are of this type. The will to keep these topics open is based on changing society through the
incentive to be imitated. Actions that set an example are one of the most important social tools that an agent has to change his or her reality. Imitation can be characterized through abduction. By imitation, we understand a cognitive system of representation (Magnani, 2017, pp. 135–136). One of the most relevant factors of mimesis is that it always maintains a relation of resemblance to the original object without committing the copy to any criterion of accuracy. For example, it is possible to imitate by emulation: to improve what is copied. This perspective takes advantage of a socially distributed conception of cognition (Magnani, 2018). Actions entail moral behavior that is intentionally transmitted to objects and persons. The cognitive system of mimesis is abductively articulated as an adaptive mechanism. This means that a mimetic system is externalized through action. Thus, learning can be understood in terms of representation or, in other words, similarity-based approximation. Moreover, if human interactions are considered, mimesis presents itself as a satisfactory way of explaining how new practices are adopted by others and thus become part of the socially shared cosmovision.
Inert Moral Mediators: Irradiation from Agents to Artifacts and Vice Versa
The type of moral interaction that is of interest for this work is the one that involves an ethical view. Every relationship with another person requires certain ethical frameworks. Interaction between people implies natural and ethical reciprocity based on understanding the more or less tacit intentionality of our actions. In this sense, intentionality manifests itself abductively through the actions of agents. The same is true of animals: once we consider them “ends in themselves,” we cease to consider them mere “means.” In the same way, artifacts can become part of our ethical view through our actions. An example would be the ecological transformation from perceiving a piece of land as something to be exploited to understanding it as a bio-space shared with other animals and vegetation. This step involves understanding these entities within our ethical framework. The paradigm in which we find ourselves today not only permits but demands a broadening of our ethical view. One of the challenges is the human relationship with the technology we use (Casacuberta & Guersenzvaig, 2019). Likewise, it is essential to consider the ethical status of certain artifacts, such as the public library. The reason is that interaction with contexts determines our behavior and, therefore, implies that our actions have a moral impact on society. However, a different relationship occurs with artifacts than with the people and living beings that we include in our ethical view. This difference lies mainly in the inability of artifacts to react to our intentionality. In this sense, agents can maintain an ethical interaction with an artifact because it becomes a passive moral mediator (Magnani & Bardone, 2007, p. 70) with the capacity to distribute human morality (Magnani, 2018, p. 68).
This chapter presents an extension to the concept of a passive moral mediator to encompass cases in which (a) an artifact becomes moral for indirect reasons and (b)
the artifact itself is “incapable” of representing a moral value on its own. I call these artifacts inert moral mediators. As will be seen in the conclusions, the library is an exemplary case of these mediators. The main characteristic of these artifacts is that their capacity to distribute morality lies in the activities that can be performed with or in them. Therefore, it will be the agents’ actions that will “irradiate” these artifacts with value. I call this process irradiation. In this way, these artifacts become moral distributors because the people who are close to them perceive this morality through other people’s actions. Therefore, these artifacts propose (irradiate) conditions that allow agents to interact using values intrinsically related to the activities usually done with or in them. Moral interaction thus arises through three ways of irradiation:
1. By the understanding we have of the activities
2. By the imitation of the rest of the agents
3. By the very predisposition that the artifact offers
As we have seen, reading and writing are the indirect activities that give libraries a social value: in these buildings, these practices can be carried out using cultural values of free-thinking and integration that have evolved to the present day. We see how abduction emerges as an optimal mechanism to explain our relationship with artifacts and how it influences our ethical perception of the world. It also allows us to account for how individual agents participate in social reality. The mechanisms by which agents represent themselves from different perspectives also make it possible to anticipate their actions based on expectations. As we have seen, these expectations result from a process of hypothetical validation based on the abduction of experiences using the individual autobiographical account of each agent. Some of these expectations are moral and are determinant in the configuration of the ethical conception of society.
For example, the morality attributed to certain activities comes from developing them. This can cause specific spaces to become moral because they allow these activities to be developed in a certain way. Therefore, a proliferation of these places could contribute to a positive moralization of society through activities considered positive.
The Public Library Experience, Nowadays
As stated by the Declaration of Quebec Libraries:
Libraries are essential tools for the democratization of culture and knowledge. They undeniably favor information, education, development, social integration and success of individuals, while being formidable levers of cultural, social and economic development for the communities they aim to serve. All of them are contributing to the learning and education of citizens and to the promotion of free and universal access to knowledge. [The] role of libraries [is] ensuring greater accessibility to information and knowledge sharing, thus contributing to building inclusive knowledge societies and sustainable communities. (UNESCO, 2017)
Likewise, in the Public Library Manifesto of 1994, the library was ratified as “the local gateway to knowledge, provides a basic condition for lifelong learning, independent decision-making and cultural development of the individual and social groups” (UNESCO & IFLA, 1994: Introduction). It is for this reason that the same manifesto sets out 12 clear objectives that every library should pursue:
1. Creating and strengthening reading habits in children from an early age
2. Supporting both individual and self-conducted education as well as formal education at all levels
3. Providing opportunities for personal creative development
4. Stimulating the imagination and creativity of children and young people
5. Promoting awareness of cultural heritage, appreciation of the arts, scientific achievements, and innovations
6. Providing access to cultural expressions of all performing arts
7. Fostering inter-cultural dialogue and favoring cultural diversity
8. Supporting the oral tradition
9. Ensuring access for citizens to all sorts of community information
10. Providing adequate information services to local enterprises, associations, and interest groups
11. Facilitating the development of information and computer literacy skills
12. Supporting and participating in literacy activities and programs for all age groups and initiating such activities if necessary
(UNESCO & IFLA, 1994: sec. Missions of the Public Library)
Nevertheless, although there may be a general definition of the public library and its objectives, one must never forget the different roles it plays depending on where it is built. Once in place, the library must engage with the social reality and adapt its ideals to the transformation project. An example is the construction of identity and citizen recognition to be conscious of participation in equal conditions and opportunities (Jaramillo, 2010).
This situation could even be considered within the planning of the construction of a library, which can be understood in some places as a simple negotiation and in others as a vindication. However, if the intention is not to erase these different circumstances, the concrete definition and its application in a given territory matter less than the will to preserve the values manifested in every public library, although the ways of doing so may differ. In this specific case, the library is defined through the creation of a public library system, that is, one accessible to all citizens of the territory (Article 2 of Law 4/1993, of March 18, of the library system of Catalonia). All these regulations emphasize guaranteeing and preserving conditions of possibility, but there is no effort to regulate uses. The reason for this lies in the fact that we are very clear about these uses and customs. Such regulations describe what a library is, but not what it is to be, much less what it can be. While the prescriptive form sometimes precedes the descriptive, the form of possibility is determined by the social changes manifested in the various uses to which the library is put.
Idealized Activities Materialized in a Building
There is a certain romanticism surrounding libraries. Whether through personal experiences, third parties, or narratives, we talk about them with nostalgia and affection. As Bukowski laments:
the old L.A. Public Library burned down that library downtown and with it went a large part of my youth (Bukowski, 1986)
Although they may have served as a repository of audiovisual material at certain times, the evocations to which I refer are rather directed to their community-building role (Scott, 2011, p. 193). For example, it is not surprising that, along with schools, proximity to a library is a relevant factor in choosing a domicile in which to live (ibid.), especially if those deliberating have, expect, or want children. As seen above, from its generative and transformative role in the community, the library is unique for crucial reasons, which would be schematized with the following two points: (a) for the free and open access to data and information and (b) for being a space that welcomes and does not stigmatize anyone (Scott, 2011, p. 194). What is most important here is why this is so, for, as is the case in all spaces, nowhere is it announced that the library is free of prejudice, but rather it is assumed and, precisely because of this, it is given. In other words, it is an institution that modifies our actions simply because of the conception we have of the practices we associate with it (irradiation process). For this reason, the actions carried out in and around it are motivated to promote knowledge and, in this spirit, also to behave better according to that community’s socially shared cultural values. For example, the value associated with the silence in which reading, writing, and studying take place does not interfere with the conception of the library as a space that fosters a type of socialization. As is well known, it is widespread to go to them in groups and to liven up the hours of concentration with looks of complicity and gestures of encouragement. The breaks to talk, drink coffee, smoke, etc., which take place outside the building, are also part of socializing. 
Likewise, groups of people also congregate to prepare works or exhibitions in the corresponding rooms and those who need the services offered by the library (photocopy machines, scanners, and, of course, Internet). In the same vein, but not in the same way, we must take into consideration the library staff, who operate the local machinery of the building, either by monitoring the behavior of the people inside or by helping to ensure that all the practices that can be performed are carried out smoothly (Dewey, 2008). In this sense, the staff must know how to use the technology that facilitates and know the preserved material, but, above all, they have to know how to convey what a library is (Scott, 2011, p. 192). As will be seen below, this concern with technological aspects is only a sample of the transforming power of the library. Although it is institutionally enabled and
guaranteed, it is fostered by the community’s will and materialized by its staff, who (it should not be forgotten) are also a part of the neighborhood during their working day.
Fer Barri: The Public Library Brought to the Neighborhood
In the Catalan language, “fer barri” (literally, “making neighborhood”) refers to the generation of community through fostering relationships between people who live and work in the same area to create links aimed at improving coexistence. These social improvements come from understanding public space as a place of belonging, thus considering all spaces and people as constitutive of our life. This feeling is significant for a community, as it defines and unites its general character and the different public manifestations. For this reason, although a neighborhood public library may retain the same services as others, it also has its own character in relation to the community to which it belongs. This becomes even more evident through the different activities organized within its spaces and the different services it acquires. An interesting case is the size of the areas set aside for reading the press, usually used by retirees, and of those that serve as children’s spaces. There are times when the library becomes such a focus of activity that it becomes a cultural space through its annexation to a civic center. Likewise, this option allows us to contemplate the social reality in which the ideals and objectives of public libraries mentioned in this section are materialized. An example is the coexistence of the Collserola-Josep Miracle Library with the Vallvidrera-Vázquez Montalbán Civic Center (Barcelona, Spain). This symbiotic relationship has turned the whole building into a place of cultural ferment and neighborhood coexistence, hosting theatrical performances and opening its doors during the neighborhood’s festivities. Reading clubs, exhibitions, etc. are also organized by popular request. Of course, all these activities are open to the whole public. Cases such as the one described above show that the conception of the library influences its walls.
As we have seen, the building acts as a focus for a type of behavior motivated by the same cultural values that inspire the practices that can be developed within it. This influence occurs because the library distributes (irradiates) these values, which, being conceived as good, we want to guide and define our actions.
Library Design and Its Structure
As has been said, the conception we have of the library is intimately related to the conception we have of the practices in it. Likewise, the building mediates the connection between the values and the actions outside the library. This means that the library structure is itself a radiator of these values. The building in question could have been conceived as a library in its origins, occupying a disused historical space,
or being built from scratch. Here, design is a crucial element for our analysis, as it synthesizes tradition with adaptation to new challenges. The critical point is that a library must look like a library. In this sense, the building is a pedagogical tool (Adds et al., 2011, p. 541), which, to the extent that it transforms the attitudes of those who come to it, becomes a space of culturalization. In other words, making a library look like a library predisposes agents to behave in a certain way and thus irradiate the building with the values manifested in their actions. On the one hand, the shared nostalgic and romantic idea of the institution coexists, which, while complemented by the cultural conception of the moment, transmits the value of its conservation. On the other hand, the meaning of culture used in this chapter also refers to people’s actions. Ultimately, these actions define the value of the things with which the agents interact (Lai et al., 2013, p. 604). Likewise, interaction with the library becomes an indispensable piece of making it a cultural space, which is predisposed by our assumptions. In this sense, it could be said that the library is a closed but unfinished project. Although its design depends on institutional requirements, the architect must make a hermeneutic effort to make these congenial within the final product (Dalsgaard, 2014, p. 145). This is a conflitto irresolubile (an irresolvable conflict) between social and practical factors (Burckhardt, 2017, p. 46), in which material and bureaucratic conditions are mixed with citizens’ claims, as well as with the need to adapt to current problems. The latter is important since what needs to be solved is, de facto, a difficulty. Being able to account for it implies that designing a building co-evolves with the problems that are being faced and, finally, with the revolutionary aspect offered by the possibility of transforming an obstacle into an opportunity.
Here, the complexity lies in the fact that possible conflicts are the fruit of interaction and, therefore, imagining a solution means transforming the conflict into a new course of action. It is important to note that citizens’ actions can become conflicts and that it is their coping strategies that draw possible paths to be explored in search of answers.
The Public Library as a Situated Affordance
Problem-solving is one more interaction process among all those that make up the cosmovision we share and live with the rest of human beings. Applying these cognitive strategies in this shared environment generates the conceptual framework that we understand as a cognitive niche: the environment resulting from the changes actively sought by human beings in their attempts to find opportunities (Magnani & Bardone, 2010). This conception of ecologically situated cognitive systems offers a perfect perspective for understanding how devices and artifacts are conceptualized through their manipulation and how these interactions shape our context, which is continually modified as we live in it. From this perspective, the library can be understood
as an affordance, as the values and meanings of the things we perceive, which offer us opportunities for action in terms of ecological facts (ibid., p. 241). However, the concept of value allows us to go a step beyond the classical theory of affordances (Gibson, 1966, p. 285) and to posit that the ways of interacting with an artifact can modify social behavior.
The Public Library as Mediating Artifact and Distributor of Cultural and Moral Values
Like any other artifact, the public library allows the construction of mediators with which it interacts and modifies the environment. On the one hand, these mediators are the cognitive models and systems from which the different strategies that make up the cognitive niches introduced in the previous section are generated. On the other hand, these mediators offer strategies of perception and conceptualization of the environment. The fact that the niche is shared with the other agents implies that one interacts with the others during the application of these strategies. In this way, the tacit patterns of human knowledge that make up our cosmovision are generated. These same patterns are what, when applied in culturalization programs, can generate discrimination and invisibilization. Similarly, there are tacit patterns in moral action that are also implicit in people’s behavior (Magnani & Bardone, 2007). These patterns also arise from the interaction with some artifacts, thus generating a different strategy that complements the cognitive niche. Likewise, the strategies offered by moral mediators are directed at behavior and socially shared values that also complement the cosmovision in which one lives. These values are the cultural ones that define people’s behaviors and regulate their actions. The crucial difference referred to here is shown through actions. In other words, they occur through an interaction that is different from the one that occurs in systemic relationships. When this interaction occurs, the artifact in question acts as a distributor of moral values by offering the people who interact with it opportunities for action. Likewise, these opportunities are closely related to the conception of these artifacts. What unites these two factors are the activities that these artifacts make possible.
Applying this theory to public libraries, we see that our conception of these buildings is connected to reading, writing, and, ultimately, studying. This conception is based on the social and cultural perspective of these activities offered by our cosmovision. Therefore, the relevance of the library comes from our way of life. Because of this, the library is a distributor of cultural values in a non-normative sense. It is so because it incites a type of behavior and allows one to develop oneself without any imposition. In this sense, the public library is an artifact that distributes cultural values because it positively influences the modes of interaction of the people who come to it.
Conclusions: The Distribution of Cultural Values Is a Process Based on Abduction
It is possible to understand the public library as an inert moral artifact: the ability to distribute moral value is related to the community’s value of the activities that can be performed in these buildings. I call this process radiation. Being in the vicinity of a public library predisposes the agent to adopt a behavior different from other places. Therefore, the environment provided by the library proposes conditions that allow the agents to interact by foregrounding the values related to the activities that usually take place in libraries. As with any interaction, these elements trigger reactions mediated by different behavioral constrictors (i.e., imitation) that can generate feelings such as fraternity. This characterization is relevant because it allows us to account for the overlap between fact and value (Putnam, 2001). As has been said, value can be understood as a property that makes an object or fact better than another from a non-quantitative criterion. In this sense, there is a value inherent to reading and writing practices. This value is often related to the epistemic question of the study. This conception of these activities is transferred to the perception we have of the library. This fact irradiates the building, endowing it with a radiation power of cultural values. This means that being close to the library triggers agents’ predisposition to act in a certain way in the form of opportunities for action. Precisely because of this, the library is an inert artifact: because it strictly depends on our perception and actions to be moralizing. In other words, the library is not genuinely a moral mediator but is so to the extent that we regard the practices in it as more ethical than others. Another way of explaining this is that the practices that can be developed in libraries require a type of attitude assumed by those who come to them.
From this point of view, abduction emerges as the ideal reasoning to explain the process of moral irradiation both from the agents to the library and from the library to the agents. Using the multimodal perspective of human cognition offered in the EC-Model of abduction, this relationship can be explained based on two facts:
1. Our cognitive processes are mediated by the way we interact with the environment.
2. All the resources available to understand what surrounds us act to give meaning to what we experience in the abductive process.
Thus, the conception held of the library in a community or society entails the agents’ predisposition to be affected by it. However, this same conception forces the irradiation of cultural values to depend on being close to the library. This means that to recognize a library as such, design is essential. Likewise, its transforming power lies in its capacity to modify the agents’ habits so that they come and appropriately use the facilities. Such a change in habits can affect all dimensions of a person’s life simply by being stimulated in certain circumstances.
As we have seen, reading and writing are indirect activities that give libraries cultural value. These buildings offer the opportunity to carry out these activities under the Enlightenment’s free-thinking moral principles of liberty, equality, and fraternity. From this point of view, abduction has been presented as a genuine mechanism to articulate the relationship we have with the library (inert artifact) and the main activities through which we irradiate the building to become a distributor of cultural values. Likewise, abduction allows us to explain how our ethical view of the world is configured through the different cognitive strategies that agents have to position themselves in the world. In this sense, anticipation has been presented as an optimal mechanism to explain the predisposition to go to spaces such as the library. As we have seen, the result of the validation of certain assumptions (rule) through experience (case) can trigger a hypothesis (abduction) that guides our actions and determines our ethical view of the society in which we live. In this sense, the moral value attributed to certain activities from a social and cultural perspective can predispose the agent to assume an attitude when he practices them and when he goes to the places where it is usual to develop them. Therefore, as already mentioned in this chapter, a proliferation of these places could contribute to a positive moralization of society through activities considered morally positive. This ethical perspective is not only a moral commitment but is also based on everything that surrounds us through the moral predisposition that each human being intentionally transmits in the interaction with the environment. We can recognize this type of ethical relationship with the world living.
Acknowledgments
Research for this chapter was supported by the PRIN 2017 Research 20173YP4N3-MIUR, Ministry of University and Research, Rome, Italy.
References
Adams, V., Murphy, M., & Clarke, A. (2009). Anticipation: Technoscience, life, affect, temporality. Subjectivity, 28, 246–265. https://doi.org/10.1057/sub.2009.18
Adds, P., Hall, M., Higgins, R., & Higgins, T. R. (2011). Ask the posts of our house: Using cultural spaces to encourage quality learning in higher education. Teaching in Higher Education, 16(5), 541–551. https://doi.org/10.1080/13562517.2011.570440
Albertazzi, L. (2017). Microgenesis of anticipation: Windowing the present. In R. Poli (Ed.), Handbook of Anticipation. Springer. https://doi.org/10.1007/978-3-319-31737-3_13-1
Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanation. Springer.
Álvarez, M. D. (2010). Creative cities and cultural spaces: New perspectives for city tourism. International Journal of Culture, Tourism and Hospitality Research, 4(3), 171–175. https://doi.org/10.1108/17506181011067565
Apel, K.-O. (2016). Der Denkweg von Charles S. Peirce. Eine Einführung in den amerikanischen Pragmatismus. Suhrkamp Verlag.
Aristotle. (1957). In W. D. Ross (Ed.), Analytica priora et posteriora. Oxford University Press.
Augustine, St., of Hippo. (1931). Confessions (Vol. 1) (William Watts, Trans). William Heinemann.
BANQ. (2016). Declaration of Quebec libraries. http://mabibliothequejyvais.com/media/declaration_biblio_qc_EN.pdf
Bell, D. R. (2017). Aesthetic encounters and learning in the museum. Educational Philosophy and Theory, 49(8), 776–787. https://doi.org/10.1080/00131857.2016.1214899
Brewer, J. D. (2004). Imagining the sociological imagination: The biographical context of a sociological classic. The British Journal of Sociology, 55, 317–333. https://doi.org/10.1111/j.1468-4446.2004.00022.x
Bukowski, C. (1986). The burning of the dream (manuscript). Bukowski.net: https://bukowski.net/manuscripts/displaymanuscript.php?show=poem1986-00-00-the_burning_of_the_dream.jpg&w=2998
Burckhardt, L. (2017). L’architettura: arte o scienza? In P. Feyerabend & C. Thomas (Eds.), Arte e Scienza (trad. Francesco Mugheddu) (pp. 45–60). Armando Editore.
Cannizzaro, S., & Anderson, M. (2016). Culture as habit, habit as culture: Instinct, habituescence, addiction. In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit. Studies in applied philosophy, epistemology and rational ethics (Vol. 31). Springer. https://doi.org/10.1007/978-3-319-45920-2_18
Casacuberta, D., & Guersenzvaig, A. (2019). Using Dreyfus’ legacy to understand justice in algorithm-based processes. AI & SOCIETY, 34, 313–319. https://doi.org/10.1007/s00146-018-0803-2
Crombie, A. C. (1953). Robert Grosseteste and the origins of experimental science 1100-1700. Oxford University Press.
Dalsgaard, P. (2014). Pragmatism and design thinking. International Journal of Design, 8, 143–155.
Dewey, B. I. (2008). Social, intellectual, and cultural spaces: Creating compelling library environments for the digital age. Journal of Library Administration, 48(1), 85–94. https://doi.org/10.1080/01930820802035059
Dupont, F. (2009). The corrupted boy and the crowned Poet (trad. Holt N. Parker). In W. A. Johnson & H. N. Parker (Eds.), Ancient literacies. The culture of reading in Greece and Rome (pp. 143–163). Oxford University Press.
Feyerabend, P. (1978). Science in a free society. Lowe & Brydone Ltd.
Finocchiaro, M. A., & Woods, J. (2014). Errors of reasoning: Naturalizing the logic of inference (Studies in Logic, Vol. 45). Argumentation, 28, 231–239. https://doi.org/10.1007/s10503-014-9311-9
Gabbay, D. M., & Woods, J. (2005). A practical logic of cognitive systems: The reach of abduction, insight and trial (Vol. 2). Elsevier.
Gibson, J. J. (1966). The senses considered as perceptual systems. Allen and Unwin.
Haack, S. (1995). Evidence and inquiry. Towards reconstruction in epistemology. Blackwell.
Jaramillo, O. (2010). La biblioteca pública, un lugar para la formación ciudadana: Referentes metodológicos del proceso de investigación. Revista Interamericana De Bibliotecología, 33(2), 287–313. Retrieved from https://revistas.udea.edu.co/index.php/RIB/article/view/7644
Kaplan, S., Bardwell, L. V., & Slakter, D. B. (1993). The museum as a restorative environment. Environment and Behavior, 25(6), 725–742. https://doi.org/10.1177/0013916593256004
Lai, L. Y., Said, I., & Kubota, A. (2013). The roles of cultural spaces in Malaysia’s historic towns: The case of Kuala Dungun and Taiping. Procedia - Social and Behavioral Sciences, 85, 602–625.
Law 4/1993 of March 18th on the Catalan library system. Official State Gazette, 95, of 21/04/1993. BOE.es – BOE-A-1993-10384 Ley 4/1993, de 18 de marzo, del sistema bibliotecario de Cataluña.
Longino, H. (1983). Beyond “Bad Science”. Science, Technology, and Human Values, 8(1), 7–17.
Määttänen, P. (2017). Emotions, values, and aesthetic perception. New Ideas in Psychology, 47, 91–96. https://doi.org/10.1016/j.newideapsych.2017.03.009
Magnani, L. (2001). Abduction, reason and science: Processes of discovery and explanation. Kluwer.
Magnani, L. (2017). The abductive structure of scientific creativity. An essay on the ecology of cognition. Springer.
Magnani, L. (2018). The urgent need of a naturalized logic. Philosophies, 3(44). https://doi.org/10.3390/philosophies3040044
Magnani, L., & Bardone, E. (2010). Chances, affordances, and cognitive niche construction: The plasticity of environmental situatedness. International Journal of Advanced Intelligence Paradigms (IJAIP), 2(2/3), 235–253.
Magnani, L., Sans Pinillos, A., & Arfini, S. (2021). Language: The “ultimate artifact” to build, develop, and update worldviews. Topoi. https://doi.org/10.1007/s11245-021-09742-5
Magnani, L., & Bardone, E. (2007). Distributed morality. Externalizing ethical knowledge in technological artifacts. Foundations of Science, 13(1), 99–108. https://doi.org/10.1007/s10699-007-9116-5
Moore, G. E. (2002). Principia ethica. University of Cambridge Press.
Nelson, K., & Fivush, R. (2020). The development of autobiographical memory, autobiographical narratives, and autobiographical consciousness. Psychological Reports, 123(1), 71–96. https://doi.org/10.1177/0033294119852574
Park, W. (2017). Abduction in context. Springer.
Peirce, C. S. (1958). In C. Hartshorne & P. Weiss (Eds.), Collected papers of Charles Sanders Peirce (Vol. 1–6). Harvard University Press, 1931–1935; (Vol. 7–8) (A. W. Burks, Ed.). Harvard University Press.
Plato. (1903). Seventh letter. In J. Burnet (Ed.), Platonis opera (pp. 323d–342a). Oxford University Press. Plato, Epistles, Letter 7 (tufts.edu).
Powell, B. B. (2009). Writing: Theory and history of the technology of civilization. Wiley-Blackwell.
Prior, N. (2006). Postmodern restructurings. In S. Macdonald (Ed.), A companion to museum studies (pp. 509–524). Wiley.
Putnam, H. (2001). The collapse of the fact/value dichotomy. Harvard University Press.
Rectanus, M. W. (2006). Globalization: Incorporating the museum. In S. Macdonald (Ed.), A companion to museum studies (pp. 381–397). Wiley.
Saenger, P. (1997). Space between words. The origin of silent reading. Stanford University Press.
Sandell, R. (2002). Museums, society, inequality. Routledge.
Sans Pinillos, A. (2021). Neglected pragmatism: Discussing abduction to dissolute classical dichotomies. Foundations of Science. https://doi.org/10.1007/s10699-021-09817-x
Sans Pinillos, A., & Casacuberta, D. (2019). Remarks on the possibility of ethical reasoning in an artificial intelligence system by means of abductive models. In Á. Nepomuceno-Fernández, L. Magnani, F. Salguero-Lamillar, C. Barés-Gómez, & M. Fontaine (Eds.), Model-based reasoning in science and technology. MBR 2018. Springer. https://doi.org/10.1007/978-3-030-32722-4_19
Sans Pinillos, A., & Magnani, L. (2022). How do we think about the unknown? The self-awareness of ignorance as a tool for managing the anguish of not knowing. In S. Arfini & L. Magnani (Eds.), Embodied, extended, ignorant minds. Synthese library (Vol. 463). Springer, Cham. https://doi.org/10.1007/978-3-031-01922-7_9
Scott, R. (2011). The role of public libraries in community building. Public Library Quarterly, 30(3), 191–227. https://doi.org/10.1080/01616846.2011.599283
Shanahan, M. (2005). Perception as abduction: Turning sensor data into meaningful representation. Cognitive Science, 29, 103–134.
Silvia, P. J. (2006). Exploring the psychology of interest. Oxford University Press.
Thomas, R. (1999). Literacy and orality in Ancient Greece. Cambridge University Press.
Turner, E. G. (1977). The typology of the early codex. University of Pennsylvania Press.
UNESCO. (16/06/2017). Declaration of Quebec libraries. https://en.unesco.org/news/declarationquebec-libraries
UNESCO & IFLA. (1994). IFLA/UNESCO Public Library Manifesto 1994. https://repository.ifla.org/handle/123456789/168
Wells, D. (1990). Are these the most beautiful? The Mathematical Intelligencer, 12, 37–41. https://doi.org/10.1007/BF03024015
Part X Abduction, Creative Cognition, and Discovery
Introduction to Abduction, Creative Cognition, and Discovery
54
Selene Arfini
Contents
Abduction: A Bridge Between Creativity and Reason
Reason and Imagination: Peirce’s View of Creativity
Representations and Alternatives
Costliness and Worthiness of Ideas Pursuit
Abstract and Embodied: Abductive Cognition from Minimal to Complex Cognition
Surprise and Logical Reasoning
References
Abstract
Creativity and discovery are two notions that have consistently and famously aroused hot debates in cognitive science, epistemology, and philosophy of science almost since the dawn of these disciplines. This part of the Handbook of Abductive Cognition proposes to consider abduction as a concept that brings out complex reflections regarding how human agents consider new possibilities, discover new findings, and adopt creative processes to solve problems within and outside the scientific context. In particular, the chapters’ authors offer a way to see abduction as a bridge between concepts linked to creative cognition and scientific processes that are still often discussed and presented as opposed, dichotomous, or simply too distant to be meaningfully connected: creativity and reason, logic and discovery, and a representational view of cognition and theories of embodiment. These authors do so, in part, by referring to the epistemological
S. Arfini, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_89
works done by Peirce, especially on the idea of scientific progress, and, in part, by discussing current and hot debates in cognitive science, philosophy of science, and epistemology. In this introduction, the content of the chapters that compose this part of the Handbook will be briefly introduced and commented on (following the alphabetical order of the authors’ surnames). At the same time, a discussion will be presented that links these contributions together and reflects on the role they play in the broader exploration of abductive cognition.
Abduction: A Bridge Between Creativity and Reason
It is not a strange argument to say that when we exaggerate distinctions and treat them as dichotomies, simplifications naturally occur, and we use the newly dichotomous concepts to present a distorted, albeit sometimes convincing, depiction of complex phenomena. This is one of the reasons why there are still debates in the philosophy of science around the context of justification and the context of discovery (Nickles, 2006; De Freitas & Pietrobon, 2007; Novak, 2013), why the discussion around the so-called 4E cognition (the view that sees cognition as extended, embedded, enacted, and embodied) is often taken as a way to reject as a whole the traditional cognitivist and computational view (Menary, 2010; Newen et al., 2018), and why the ideas of reason and intuition, and creativity and logic, are hardly put in the same sentence without a prolonged explanation regarding why we think they deserve to be there together. Abduction is widely recognized (see especially Sans Pinillos, 2021) as a notion that tests and challenges the simplifications we use to speak about both cognitive processes and scientific progress, and, in turn, it also tackles the alleged dichotomies that still affect our depiction of these phenomena. Charles Sanders Peirce, the American philosopher whose works have inaugurated the discussion regarding abduction as a powerful cognitive and explicatory tool, explicitly discusses the epistemic, emotional, and psychological traits of abduction in connection to topics such as logical reasoning, scientific creativity, and the fallibility of human cognition (Peirce, 1931), disregarding the difference of theoretical framework in which these topics were (and sometimes still are) approached. Of course, not every use of the concept of abduction defies a dichotomy in the epistemological realm, nor can or should it be wielded to demonstrate the merely apparent robustness of every dichotomous distinction we make.
Nonetheless, it can be argued that discussing the cognitive, emotional, and even physical processes that occur when an agent abduces something (more or less consciously) inevitably complicates and enriches our discussions regarding our cognition, the idea of logic, and what is or is not under the conscious control of the agent in epistemic processes. Thus, in this part of the Handbook of Abductive Cognition, the chapters’ authors offer a way to see abduction as a bridge between different concepts that are
still often discussed as opposed, dichotomous, or presented as simply too distant to be meaningfully connected: creativity and reason, logic and discovery, and a representational view of cognition and theories of embodiment. These authors approach these notions and theories, in part, by referring to the epistemological works done by Peirce, especially on the idea of scientific progress, and, in part, by discussing current and hot debates in cognitive science, philosophy of science, and epistemology. In the rest of this introduction, the content of the chapters that compose this part of the Handbook will be briefly introduced and commented on (following the alphabetical order of the authors’ surnames), also reflecting on the role they play in the broader exploration of abductive cognition.
Reason and Imagination: Peirce’s View of Creativity
How and in which context do imagination and logical rigor converge? Sara Barrena and Jaime Nubiola provide a compelling answer to this question by discussing the notion of creativity in Charles Sanders Peirce’s works. In this theoretical framework, creativity is inherently connected to how agents perform abductions since the authors argue that “the logic of abduction is combined with a leap of imagination, which implies embracing a new conception of logic different from the rationalist one.” This reflection allows the authors to defy easy and unrealistic reductions of creativity to occasional and uninvestigable intuitions or mindless material manipulations. Acknowledging the works done by Peirce on this concept, the dichotomy between reason and imagination is sensibly dissolved, and they can present creativity as a cognitive act that fosters our understanding of the world while leaving room for doubt, rethinking, and tinkering with the creative product.
Representations and Alternatives
Representation is one of the key elements of the traditional view of cognition that depends on the computational theory of the mind, so it is not surprising that, when discussing abduction, most accounts still rely on the concept of representations. Peter Bruza and Andrew Gibson offer a different take on philosophical tools: they focus on the idea of surprise as an experience that cannot be accurately presented as a representation. If indeed we form a representation of a particular phenomenon, that event stops being surprising, since it now belongs to our finite conceptualizations. However, the idea of abduction, especially when it is connected to a process of discovery, properly begins with the experience of surprise. Thus, in their chapter, the authors offer a way to bridge representational views on inferential reasoning and embodied nonrepresentational views of cognition to account for the bewildering nature of surprising events.
Costliness and Worthiness of Ideas Pursuit
How agents can and should process hypothesis creation, selection, and evaluation remains one aspect of abductive cognition that raises different epistemological questions. In their chapter, Robert Folger, Christopher Stein, and Nicholas Andriese discuss this topic by proposing an unusual perspective from which creative reasoning can be nurtured in a scientific context. They highlight how the Peircean notion of “esperable uberty” (hoped-for abundance) may offer an alternative to maintaining a strict “economy of research,” which instead focuses on keeping high selection criteria for hypothesis evaluation. Thus, the authors discuss how research creativity may be encouraged by, and grow out of, the adoption of a logic of the pursuit-worthiness of ideas.
Abstract and Embodied: Abductive Cognition from Minimal to Complex Cognition
Jordi Vallverdú and Alger Sans Pinillos’ chapter discusses an issue that has its roots in the revolution still occurring in cognitive science research: 4E cognition research offered a way to discuss cognitive processes and states, avoiding the complex drawbacks of assuming that the mind is a representations elaborator which is disembodied and mechanistically determined. In this framework, the problem of how to approach and discuss what minimal cognition (which emerges from the minimum requirements for the generation of cognitive phenomena – bacterial cognition, for example) and cognitive processes of highly complex systems (human cognition, but also AI systems) have in common still remains. In their chapter, Vallverdú and Sans Pinillos adopt a morphological approach to cognition in order to discuss this issue, using the idea of abduction as a cognitive switch and creativity as a bridge between different types of cognitive agents and the world.
Surprise and Logical Reasoning
If we accept that surprise is a key engine of abduction and creativity, it would easily follow that children, being more easily surprised than adults, are probably the agents most easily able to shape abductive reasoning and creative inferences. In her chapter, Donna West discusses which kinds of abductive (or better, retroductive) reasoning children learn to use when they encounter surprising events in narrations. Using Berman and Slobin’s findings, she further discusses the importance of continuing the research on children’s inferential abilities since these abilities likely shape other competencies and skills, such as episodic memory and autonoetic consciousness.
References
De Freitas, R. S., & Pietrobon, R. (2007). Whoever could get rid of the context of discovery/context of justification dichotomy? A proposal based on recent developments in clinical research. The Journal of Medicine and Philosophy, 32(1), 25–42.
Menary, R. (2010). Introduction to the special issue on 4E cognition. Phenomenology and the Cognitive Sciences, 9(4), 459–463.
Newen, A., De Bruin, L., & Gallagher, S. (2018). 4E cognition: Historical roots, key concepts, and central issues. In The Oxford handbook of 4E cognition (Vol. 1, pp. 3–15). Oxford University Press.
Nickles, T. (2006). Heuristic appraisal: Context of discovery or justification? In Revisiting discovery and justification (pp. 159–182). Springer.
Novak, M. (2013). The argument from psychological typology for a mild separation between the context of discovery and the context of justification. In Legal argumentation theory: Crossdisciplinary perspectives (pp. 145–162). Springer.
Peirce, C. S. (1931). Collected papers of Charles Sanders Peirce. Harvard University Press.
Sans Pinillos, A. (2021). Neglected pragmatism: Discussing abduction to dissolute classical dichotomies. Foundations of Science, 1–19.
Abduction and Creative Theorizing
55
Robert Folger, Christopher Stein, and Nicholas Andriese
Contents
Abduction and Creative Theorizing
Characterizing Abduction as a Creative Form of Reasoning
Peirce and the Esperable Uberty of Abduction in Scientific Inquiry
The Abductive Peirce-suit of Scientific Inquiry
Pursuit-Worthiness as Tractability and the Abduction(s) of Leon Festinger
Abductive Reasoning about Dissonance: Speculation about Problematized Phenomenon
Abductive Reasoning About Dissonance: Theoretical Antecedents of Style and Strategy
Abductive Reasoning about Dissonance: Pursuit-Worthiness as Tractability
Summary: From Whence Uberty in Dissonance?
Creative Scientific Inquiry as a Logic of Pursuit-Worthiness
Conclusions
References
Abstract
This chapter looks at the ideas of C. S. Peirce about abductive reasoning and uses them to explore how those ideas apply to the creative aspects of scientific reasoning. A key notion comes from Peirce’s reference to the esperable uberty – a hoped-for abundance – of novel ideas that can be nurtured to achieve the ultimate fruitfulness of their nascent promise. In contrast to some of Peirce’s emphasis on the economy of research as entailing selective criteria that would weed out some lines of research presumed to be unproductive, however, this chapter focuses on how the pursuit-worthiness of some creative ideas may inspire some lines of scientific inquiry and downplay the costliness of that pursuit – much like the creativity of brainstorming encourages giving room for ideas to “grow” before criticizing them. In particular, the discussion herein uses a scientific case history (the development of the theory of cognitive dissonance) to explore how a “logic of pursuit-worthiness” can be treated as creatively engaging in ideas that give them tractability at various stages of their development.
Keywords
Abduction · Creative theorizing · Pursuitworthy hypotheses
R. Folger, Management Department, University of Central Florida, Orlando, FL, USA, e-mail: [email protected]
C. Stein, Department of Management, School of Business at Siena College, Loudonville, NY, USA, e-mail: [email protected]
N. Andriese, University of Central Florida, Orlando, FL, USA, e-mail: [email protected]
© Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_17
Abduction and Creative Theorizing
Suppose people encountered the phenomenon (observation, evidence, fact, datum) of a sentence ended with ἀπαγωγή. They might wonder what was responsible for its occurrence. What kind of reasoning might be used in attempts to understand why ἀπαγωγή appeared, when, where, and how it did? People would certainly try to use some kind of reasoning process, hoping it could facilitate meaningful inferences about possible explanations. The notion of inferences applies when people want to use some sort of “logic” for drawing conclusions rather than thinking in irrational, “illogical” ways. That broad sense of logic is the topic of this chapter, with special attention to creative processes of reasoning guided logically in that sense of the word (e.g., see Bertolotti et al., 2016, p. 153, on “successful creative abductions”).
When making inferences about the first sentence of this chapter, some people might recognize Greek letters and presume that ἀπαγωγή is a Greek word – rather than, say, errant keystrokes that got past a copy editor. Not knowing anything else, they might try to reason in the deductive fashion of conclusions entailed by premises (e.g., All men are mortal. Socrates was a man. Hence, Socrates of necessity must have been mortal). Another option is induction, which typically involves extrapolating from known cases to those not-yet observed but from the same (presumed) category (e.g., No human being has been immortal, making human immortality seem an unlikely future prospect – although conceivably it is not impossible).
Neither deduction nor induction aids in understanding the ἀπαγωγή phenomenon. Nothing offers deductive certainty. Also, there are no other uses of ἀπαγωγή from which to generalize inductively.
But knowing this chapter addresses creativity in logic broadly conceived – and that deduction and induction lack the requisite explanatory resources – a reader might suspect the word involves something other than deduction or induction. From that, someone might also begin to suspect that this third logic category applies when reasoning from (a) the encounter with a phenomenon to (b) conjectures about possible explanations for its occurrence. With only the additional evidence of the word abduction in the title of this chapter, readers would not consider it illogical to suspect that the Greek ἀπαγωγή corresponds to abduction in English. In fact, that would be reasoning abductively to one or more speculative explanations for a phenomenon.
Characterizing Abduction as a Creative Form of Reasoning
The American philosopher/scientist C. S. Peirce was educated as a chemist. From 1859 to 1891, he spent time working with the United States Coast Survey. Also steeped in the classics, Peirce encountered the word ἀπαγωγή when reading a passage of ancient Greek text attributed to Aristotle. He used abduction as an English translation, although in his voluminous writings he sometimes used other terminology (e.g., hypothesis, retroduction). His life-long passion for logic and science convinced him that abduction was vital to scientific inquiry as a creative source of new ideas warranting additional consideration (for a review of Peircean scholarship, see Paavola, 2012).
In thinking and writing about abduction so much, Peirce generated numerous and nuanced treatments of it. Most conform to a canonical schema in which the starting point of abductive reasoning is an initial datum (e.g., the reader who sees ἀπαγωγή in this chapter’s first sentence), often referred to as being puzzling or problematic in some sense (surprising, anomalous, etc.). The next logical move is trying to come up with a conjecture (explanation, hypothesis, proto-theory, etc.) that might account for that datum (e.g., because the chapter’s title refers to abduction, perhaps that might be the English translation of ἀπαγωγή). The key feature of such tentative conjectures is that they would account for the datum if they were true – but at this speculative stage (e.g., before reading more of the chapter), their veridicality cannot be evaluated definitively. In that sense, the effort is to reason “backwards”: from the datum as effect to the tentatively held account of it as a phenomenon. At this point, an initial logical “conclusion” is only that there are grounds to suspect that the conjecture might turn out to be true.
As analyzed in this chapter, abductive reasoning characteristically leaves the door open to further inquiry, with initial conjectures taking the form of ideas that seem as if they might be worth investigating – and even “theorizing” about – to a greater degree, perhaps in an ongoing fashion. When taken up in a potentially iterative manner, those ongoing lines of inquiry can include additional abductive reasoning along the way. Much has been written about abduction, and an analysis of abductive creativity represents a somewhat restricted focus. At the same time, however, emphasizing creativity also calls for a more in-depth coverage than is typical. The central aim of this chapter, therefore, is to explore ways in which the creativeness of an abductive orientation toward science extends beyond the initial consideration of candidate hypotheses. In short, this is a process perspective on the generative nature of abductive reasoning in its creative forms.
It is useful to treat abduction as a process distinguished from its products, especially when thinking about creative abductive reasoning as an ongoing process. Product refers to the result of an abduction, namely, a conjectured explanation of a phenomenon, whereas process refers to the inferential activities involved, such as mental algorithms, that lead to hypothesizing an explanation (Aliseda, 2006). In terms of judging candidate explanations as products, Peirce (1903/1955) gave some advice about three kinds of selection criteria – because it would be foolish to chase after every hypothesis that could possibly be imagined. First, candidate hypotheses should be testable. Second, they should have explanatory relevance. A third criterion Peirce referred to is the “economy” of research. Obviously, factors such as time, money, and effort are limited resources that could factor into how scientific inquiries are conducted.
Specifying selection criteria by which to judge hypotheses as abductive products does not speak to the kinds of creative processes that might yield potentially explanatory hypotheses in the first place. Some discussions in the recent literature (e.g., Aliseda, 2006; Folger & Stein, 2017), however, have addressed the theme of process more directly. For instance, Aliseda (2006) described abductive processes as relating to two kinds of “triggers” that initiate attempts to come up with explanations for phenomena. On the one hand, there are abductive novelties, which involve phenomena that are not predicted by existing theories (but may be consistent with them). On the other hand, abductive anomalies represent phenomena that contradict existing theory. Festinger’s theory of cognitive dissonance (1957) (discussed below) is an example of an abductive anomaly. Festinger (1957) observed a phenomenon that violated the dominant psychological theory of reinforcement, which triggered his search for an alternative explanation.
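The two triggers can also be stated compactly in logical notation. The following is only a sketch under stated assumptions: the symbols Θ (a background theory), φ (the observed phenomenon), and ⊨ (whatever consequence relation “predicts” is taken to mean) are illustrative notation in the spirit of Aliseda’s (2006) logical treatment and do not appear in this chapter.

```latex
\begin{align*}
\text{Abductive novelty:} &\quad \Theta \not\models \varphi
  \quad\text{and}\quad \Theta \not\models \neg\varphi \\
\text{Abductive anomaly:} &\quad \Theta \not\models \varphi
  \quad\text{and}\quad \Theta \models \neg\varphi
\end{align*}
```

On this reading, a novelty leaves the theory silent about φ, whereas an anomaly puts the theory in conflict with φ, which is the situation attributed above to Festinger’s observation relative to reinforcement theory.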
Folger and Stein (2017) proposed a third abductive trigger – abductive imparity. This occurs when different theories offer partial explanations of phenomena, but where none of them alone is sufficient to fully explain the observations. Resolving this puzzle can involve combining theoretical approaches that may have previously been conceived as separate and distinct. The result is a reconceptualization of triggers as a continuum rather than as the two extremes that Aliseda (2006) proposes. Abductive reasoning often begins when a surprising new phenomenon is observed, whereas abductive imparity refers to explanation-seeking activities triggered by a recognition that existing theories offer only incomplete versions of what might account for certain classes of phenomena. That is, creative abductive processes need not involve the complete absence of explanatory hypotheses, but instead can begin from puzzlement caused by previously inadequate explanations of phenomena.
For example, observers long ago noticed that the facing coastlines of South America and Africa have shapes that look as if they fit together like jigsaw pieces. Scholars at times took that conceivable conjunction as if it were a historical fact (i.e., the continents had once been conjoined as part of “Pangaea”), offering various conjectures as to how they became dislocated. Modern approaches toward a “continental drift” explanation began with a “sea floor spreading” hypothesis – but while it seemed plausible to a few geophysicists, most disregarded it as without foundation because of its incompleteness. Specifically, it lacked sufficient details about the kinds of physical forces that could break up land masses and move them away from one another. Modern tectonic plate theory grew out of the Vine-Morley-Matthews hypothesis (Frankel, 1982), which originated from an abductive imparity trigger. The eponymous scientists linked two existing concepts – about convection currents under ocean ridges and about geomagnetic reversals in the earth’s polarity. Their unified hypothesis filled in the gaps and, with related evidence, quickly settled decades-old debates between “drifters” and “fixists.” (This case history contains mixtures of all three abductive triggers; see Frankel, 1982, for details.)
The open-ended generation of new hypotheses from creative abductive reasoning processes bears a similarity to practices recommended for brainstorming. These encourage an initial exploration of numerous ideas without placing restrictions on them. In other words, this type of encouragement is the opposite of treating such brainstormed ideas as products suitable for evaluating in the sense of Peirce’s three criteria for selecting among hypotheses.
Rather than evaluation by selection criteria, some creative uses of abductive reasoning stem from the use of contrasts. The inherently contrastive nature of much abductive reasoning becomes clear in terms of to-be-explained phenomena as triggers. Regardless of whether those triggers are novelties, anomalies, or imparities, the process is initiated by a dissatisfaction with an existing state of inadequate understanding – a failure of existing conceptual resources to account for the phenomena in question. At least implicitly, therefore, those insufficient perspectives represent the backdrop against which the search for explanations is launched. Folger and Stein (2017) referred to these triggering situations as fact/foil contrasts. A given phenomenon in essence is the fact to be explained.
However, to treat it as needing explanation is to contextualize it in terms of an implicitly contrastive foil whereby it seems non-explained (or insufficiently accounted for). An observed fact is not surprising or puzzling, for example, unless it is pitted against a foil – some variation on preexisting notions of what is expected/predicted (such as based on an extant scientific literature). A phenomenon is contrasted with a foil (or foils), such that the relevant aspects of the phenomenon – in the context of the foil – make it seem poorly understood. In that sense, a proto-selection or proto-evaluation has already taken place as the initiation of the abductive process (i.e., reasons for the rejection of existing conceptualizations, ready-to-hand ways of trying to understand the phenomenon). Nonetheless, to say that the foil becomes the not “selected” candidate explanation threatens to place an undue emphasis on abduction products at the expense of abductive reasoning processes. Especially in creative terms, the dynamic nature of that on-going, open-ended process (to be illustrated in this chapter by a scientific case history) would fail to emerge if the emphasis instead focused on a static snapshot of the products of any stage of that process.
There is some debate about whether Peirce conceived of abduction exclusively as the process of generating hypotheses, thereby tied closely to the creative process, or as also including the application of selection criteria by which to choose among hypotheses. The latter would relate abduction more closely to inference to the best explanation (IBE; Harman, 1965; Lipton, 2004). Mackonis (2013) and Campos (2011), for example, have argued that Peirce conceived of abduction as both. Indeed, there are several possible interpretations of abduction because of various differences in Peirce’s treatments of the topic. In his early writings, abduction appears to be at least partially more aligned with IBE, but that seems less the case in his later characterizations of it as the reasoning process from surprise to inquiry. It seems that Peirce conceived of abduction in his later works as being related to imagination in the sense that a generated hypothesis is not judged as valid at that point, but rather as a useful way to begin further investigation. Abductive reasoning thus applies not only to the creative origins of scientific ideas but also to their ongoing conceptual development.
Peirce and the Esperable Uberty of Abduction in Scientific Inquiry
Some brief remarks by Peirce himself already point in the direction of an emphasis on dynamic developments in the nurturing of ideas toward a realization of their further potential. Those remarks pertain to esperable uberty (cf. the gloss by Sebeok, 2001, as hoped-for abundance), one of his uniquely coined phrases, and mentioned in only two, somewhat obscure (i.e., unpublished in his lifetime) sources. One of Peirce’s (1913/1931–1958) references to uberty appears in a letter he wrote in 1913 to F. A. Woods (the source of all quotations in this paragraph and the next). Here he made a distinction between “two principal aims” for logicians: “first to bring out the amount and kind of security (approach to certainty) of each kind of reasoning, and second, to bring out the possible and esperable uberty, or value in productiveness, of each kind.” He continued by classifying reasoning into the categories of deduction, induction, and abduction. As a unique inferential orientation, abduction “depends on our hope, sooner or later, to guess at the conditions under which a given kind of phenomenon will present itself.”
The way Peirce described abductive reasoning is notable for its wide-ranging generalizability – not being restricted, for instance, to anomalous or surprising phenomena (likewise a path followed in this chapter). Peirce pointedly brought out the distinctiveness of this “third kind” of reasoning by contrasting it with the other two along a security-to-uberty continuum: “From the 1st type to the 3rd the security decreases greatly, while the uberty as greatly increases”; thus, in reference to abduction as “the adaptation of a hypothesis on probation,” he says that “though its security is low, its uberty is high.” For that reason, he linked it with “a hypothesis on probation.” Another reference comes from the draft of an essay left unfinished at the time of his death.
Having mentioned uberty, he added the following adumbration in distinguishing it from fruitfulness as a related property of observations that become abductive starting points:
[Regarding] the fruitfulness of observations and that [of] reasonings . . . I can hardly be supposed to have selected the unusual word “uberty” instead of “fruitfulness” merely because it is spelled with half as many letters. Observations may be as fruitful as you will, but they cannot be said to be gravid with young truth in the sense in which reasoning may be, not because of the nature of the subject it considers, but because of the manner in which it is supported by the ratiocinative instinct. (1913/1992–1998, p. 472)
Here, Peirce links “observations” with “reasonings” in remarking that the latter provide a foundation for particular kinds of nascent understandings so young as to be “gravid” (a term generally referring to a creature’s as-yet unborn offspring). The reasonings/observations pairing offers such promise because it goes beyond observation as the mere receptivity of sense organs to stimuli (cf. Hanson, 1958). It involves an interpretive/inferential process that in the first place identifies the observed phenomenon as worthy of attention because of what it might portend. This identification classifies the phenomenon “as-if” it represented more than meets the eye, especially as if it might be representative of certain kinds of things rather than others – or in Peircean terms, identified as a token instance representative of a broader type or category of phenomena. Thus understood, “Identification is the seed from which knowledge grows, its embryonic form; to conceive something [in an as-if relation of token to type] is indeed to be impregnated with knowledge—if all goes well” (Kaplan, 1998, p. 85, emphasis added). The language in that passage comes from a methodology text written by an author who cited Peirce throughout, making the birthing metaphors apropos to uberty, along with “if all goes well” in relation to the esperable spirit of hopefulness. Putting it another way, observation-driven reasonings prompt the investigator to entertain conjectural inferences as having at least some promise of truth-relevance, even if short of a “fully developed” (or well-formed, completely fleshed-out) validity. The kickoff to such inferences is the hope that even a primitive kernel of an idea about “What the devil is going on around here?” (Kaplan, 1998, p. 85) – and how to suspect where the clues to understanding will be more abundant (pregnant with potential meaningfulness) – will suggest how to begin searching for answers to the questions raised by as-if identifications. 
This chapter takes up where Peirce’s brief characterization of esperable uberty left off. He wrote several unfinished works referring to security and uberty in the title. For example, he contrasted uberty with mere fruitfulness by using metaphors related to nutrition; in other words, capitalizing on uberty refers not only to a sense of where productive lines of inquiry might lead but also to a sense of ways to nourish initial ideas so that they might grow to their full potential. A productive pursuit of answers to identification-based questions depends on the investigator’s ability to cultivate sources of uberty. The esperable part of abductive inquiry is the hope of having that ability, the hope of being able to take advantage of that ability, and the hope that a possible type-token relationship, perhaps conjectured in a relatively dormant form at the outset, “can be cultivated” (Kaplan, 1998, p. 16) so as to yield a crop of further insights and deeper understanding of such phenomena in general. Peirce thus stressed that the worth of abduction-as-uberty far exceeded the limited capacity of deductive logic for increasing the body of knowledge (i.e., the non-ampliative nature of deduction). Indeed, he referred to the indispensability of abductive uberty as so removed from the ordinary practices of security-seeking
scientists as to be anathema to them! Essentially, he argued that when scientists become consumed with rigorous deductive argumentation, so as to rely exclusively on it for the sake of feeling secure about their inferences, they run the risk of failing to do insightful work. An old joke drives the point home: A drunk man wandering under a light pole, complaining about his lost car keys, is asked where he thinks he lost them. He points some distance away. When asked why he doesn’t look where he pointed rather than staying under the light pole, the drunk replies, “The light’s better here.”
The Abductive Peirce-suit of Scientific Inquiry
Peirce’s own insights are unmatched, so trying to fathom the nuances of esperable uberty amounts to a nearly impossible task. A more modest aim relates that expression to a contemporary term from the philosophy of science – pursuit-worthiness – as a convenient proxy for an in-depth analysis of Peirce’s own ideas. This chapter adopts a loose notion of pursuit-worthiness consistent with the following definition (although without commitment to any of the author’s elaborations on the term): “As I will use the term . . . to pursue a hypothesis is to spend time and resources testing it, calibrating its empirical parameters, developing it theoretically (e.g. by resolving conceptual problems or drawing out its implications) or applying it to new domains. More succinctly, to pursue a hypothesis is to work on it” (Nyrup, 2017). The source of that definition also provides an excellent account of the topic as addressed by a variety of other authors (e.g., Elliot & McKaughan, 2009; Laudan, 1980; McKaughan, 2008; Šešelja & Straßer, 2013; Šešelja et al., 2012; Whitt, 1990, 1992). More recent explorations include Cabrera (2021), Nyrup (2020), and Shaw (2022). Related work has discussed notions like pursuit-worthiness in terms even more closely echoing Peirce by referring to fertility (McMullin, 1976; Schindler, 2017) and fruitfulness (Ivani, 2019).
Even in the existing (though thin) contemporary literature on pursuit-worthiness and similar themes, however, only limited advances have been made toward explicating what the term might entail – especially as practiced, instead of as idealized in abstract discussions and those that try to impose specific kinds of normative, epistemic criteria. Part of the limitations of these attempts stems from a tendency to stick too closely to Peirce’s original conceptions.
The closest Peirce came to fleshing out notions of pursuit-worthiness (à la esperable uberty) consists of criteria for good abductive hypotheses. He argued that to be worthy of investigation, such hypotheses should not only offer explanatory potential but also be testable. Moreover, he insisted on the essential importance of economy, by which he meant constraint according to the available resources of money, time, energy, and the like.
The analysis in this chapter, however, will explore a characterization of economy-in-inquiry differentiable from what is found in some readings of Peirce (e.g., Rescher, 1976). Peirce focused on the relevance of economizing resources when it comes to testing hypotheses. Such testing, however, involves attending to the efficient use of the resources required in the post-abductive, empirical stages of
inquiry (i.e., after hypotheses-to-be-tested have been identified). Available resources at that point place constraints on research that differentiate the relative feasibility of pursuing various ways of testing empirically oriented hypotheses, such as the expensiveness and availability of the requisite scientific equipment. That interpretation of the notion of economy, however, can become a misplaced preoccupation in two respects. First, it can focus attention too narrowly on the costs of certain lines of research. Second, that focus might lead to a premature winnowing-out of the various types of hypotheses potentially relevant for testing. Above all, considerations of cost-based feasibility estimates threaten to forestall exploiting conjectures ripe with uberty. Because uberty refers to an abundance of inquiry-relevant resources, a resource-oriented search for hypotheses (e.g., seeking explanatory understandings rich in resources for cultivating broader implications and generalizations beyond the phenomenon as initially identified) can help to avoid overlooking lines of inquiry preemptively abandoned on the economic grounds of feasibility. Thinking of abductively oriented reasoning processes as they relate to creativity should thus encourage capitalizing on the psychological resources of brainstorming techniques. These include initially refraining from the critique of even very speculative and perhaps seemingly foolish ideas at the outset. Narrowing the focus and trying to pick those most worthy of further pursuit (based on whatever criteria) take place only after a thorough exploration of various solution-space domains. Peirce’s emphasis on economy followed from the now commonly recognized problem that in principle an infinite number of explanations might account for any given datum. It’s certainly true that bewilderment and potentially fruitless false starts can result when far too many candidate explanations come readily to mind. 
On the other hand, the danger of not capitalizing on uberty exists from having too few – that is, failing to consider some abductively identified paths of inquiry with esperable prospects for enriched understanding. This chapter emphasizes the resource-abundant, rather than cost-constraint, paths of conceptual pursuit-worthiness. Using a case study of conceptual pursuit-worthiness helps clarify what it means to capitalize on a rich abundance of resources within a conceptualization itself. This case history illustrates how some features of creatively inspired pursuit-worthiness played out in explorations of the theoretic concept of cognitive dissonance (Festinger, 1957). First, the account of Festinger’s line of inquiry explores the Peircean criterion of explanatory potential in reference to the intellectual pursuit of a particular kind of attempt to understand a phenomenon. An explanation-as-understanding approach emphasizes that an illuminating conception of a given phenomenon need not involve precise cause-and-effect details. Second, as regards the criterion of testability, the dissonance case history reveals how candidate explanations become pursuit-worthy as a function of identifying features relevant to operationalizing a concept in particular kinds of empirically promising situations (e.g., experimental setups). Third, when preliminary investigations followed the initial, triggering abduction of dissonance as a concept, the pursuit of programmatic research did not involve a simplistic notion of the economic
cost-oriented criterion as a consideration about the limited extent of resources available for testing hypotheses. Being economical in that sense evokes notions such as the efficiency of research efforts and pursuing the most easy-to-conduct types of research first, before turning to more costly endeavors. Rather than economizing on the costliness of research, developments in the construct of dissonance utilized resources for positive benefit. Just as economy can mean the evaluation of benefit-to-cost ratios, the pursuit-worthiness of hypothetical conceptions depends as much on the numerator in those ratios – the fertile, rich conceptual resources at hand – as on constraints encountered when seeking how to put conceptions to an empirical test and see how well they hold up in terms of explanatory content (increased understandings of the relevant phenomena). Tractability is a term applied in this context because it relates to all the various types of conceptual resources upon which inquiry might build (resources that enhance pursuit capability as well as the anticipation of worthwhile increases in understanding). Thus, tractability is a catch-all notion not tied to any particular criteria for uberty, pursuit-worthiness, and the like.
Pursuit-Worthiness as Tractability and the Abduction(s) of Leon Festinger
Leon Festinger’s (1957) theory of cognitive dissonance provides a case study of abductive inquiry as esperable uberty. The case presents two kinds of backdrops to such lines of reasoning: the development of the explanatory relevance of the initial insight and the intellectual history of its roots. Those twin aspects of reasoning offer clues about pursuit-worthiness precursors and how they contributed to a path from the preliminary development of a working construct to its eventual elaboration into a full-fledged theory.
Abductive Reasoning about Dissonance: Speculation about Problematized Phenomenon
This case indeed started with a puzzling phenomenon, but the important point is how and why it came to be regarded by investigators as a puzzle in the first place. Put otherwise: The phenomenon was not puzzling until first problematized (considered to be puzzling in light of a particular way of contextualizing it). Festinger and a team of collaborators had been asked to compile an inventory of propositions related to forms of communication and social influence. To do this, they planned to focus on “some narrowly defined problem” and “attempt to formulate a specific set of hypotheses . . . [to] account for the data” (Festinger, 1957, p. vi). Clearly this was the abductive approach of starting with phenomena, in terms of data, and then reasoning from effect back to presumable cause as a means of generating candidate explanations.
55 Abduction and Creative Theorizing
What also stands out is the spirit of esperable uberty at the outset. The team realized that some data might not lend themselves readily to theoretical tractability, so they “hoped that one would quickly realize the dead end” (p. vi) – in other words, to sort out the pursuit-worthy leads from those not worth the effort. For reasons not explained in the text, but which will be considered below in an intellectual-history analysis, the group chose “the spreading of rumors as our first narrowly defined problem to work on” (p. vi). The initial work on the rumor literature proceeded easily enough. The same could not be said about what needed to follow: “More difficult were the problems of . . . getting some theoretical hunches that would begin to handle the data in a satisfactory way [reasoning abductively]. It was easy enough to restate empirical findings in a slightly more general form [cf. induction], but this kind of intellectual exercise does not lead to much progress [i.e., induction deemed ill-suited for uberty]” (p. vi, emphasis added). The exercise of efforts considered more intellectually fertile began with “[t]he first hunch that generated any amount of enthusiasm among us,” which “came from trying to understand some data [from an article by Prasad, 1950] . . . concerning rumors subsequent to the Indian earthquake of 1934” (p. vi). What generated enthusiasm was not the mere fact of an earthquake-to-rumor connection, which in itself might not ordinarily be considered particularly surprising. What instead struck the team as curious was the content of the most frequently occurring kinds of rumors. Of particular interest were predictions of even more disasters expected imminently. Again, however, the mere occurrence of fantastical rumors after an unexpected, life-threatening event need not have been considered surprising to the team. 
Rather, the problematizing of that phenomenon turned it into a to-be-pursued puzzle as a function of the question it raised when characterized in a particular way by Festinger and his team. Especially significant was the question of “why rumors that were so ‘anxiety provoking’ . . . were so widely accepted” (p. vii). Here, too, the problematized characterization of the phenomenon made it worth pursuing when contrasted with an alternative characterization:
Finally a possible answer to this question occurred to us – an answer that had promise of having rather general application [i.e., esperable uberty]: perhaps these rumors predicting even worse disasters to come were not ‘anxiety provoking’ at all but were rather [as if] ‘anxiety justifying.’ That is, as a result of the earthquake these people were already frightened, and the rumors served the function of giving them something to be frightened about. Perhaps these rumors provided people with information that fit with the way they already felt [an as-if identification]. (Festinger, 1957, p. vii)
Thereafter “with the help of many discussions in which we attempted to pin down the idea and to formalize it somewhat, we arrived at the concept of dissonance and the hypotheses concerning dissonance reduction” (vii). Postponing more details about dissonance as a concept, it can be expressed as involving an unsettling juxtaposition of contradictory impressions (“I’m afraid” yet “There’s nothing to be afraid of”), followed by a construal of how to reconcile them (“I’m afraid because of all the horrible things that surely will follow this present
disaster”). The motivational tendency of dissonance reduction refers to attempting such rationalizations as a source of justification. In the terminology of the theory, any such justification would represent a “consonant cognition,” and the mental creation and importation of consonant cognitions attenuate the aversive experience prompted by cognitions otherwise causing dissonance. But how did Festinger hit upon this idea about the rumors’ functional significance? The following passage describes how the puzzling rumors became pursuit-worthy:
The quake itself, a strong and prolonged one, was felt over a wide geographical area. Actual damage, however, was quite localized, and for a period of days, communication with the damaged area was very poor. The rumors were collected [by Prasad, 1950] among people living in the area which received the shock of the earthquake but which did not suffer any damage. We are, then, dealing with communication of rumors among people who felt the shock of the earthquake but also did not see any damage or destruction . . . . [I]t is plausible to assume [abductively] that these people who knew little about earthquakes had a strong reaction to the violent and prolonged quake which they felt. One may also assume that such a strong fear reaction did not vanish immediately but probably persisted for some time after the actual shock of the quake was over. (Festinger, 1957, p. 237, emphases added)
Note the abductive leaps from objective characterizations of geographical areas to subjective, assumption-driven characterizations of the experiences of people in one area. Indeed, Festinger was quick to admit that Prasad’s article contained “little concerning the emotional reactions of these people to the quake” (p. 237). To presume how people felt, and to hypothesize a reversed causal direction (from rumors as antecedent and fear as effect to fear as antecedent and rumors as effect), represents a chain of abductively inferential reasoning. That speculation about people’s emotions points to interpretive characterizations of phenomena as abductive triggers. Surprising phenomena per se do not exist; rather, some are construed as surprising (anomalous, etc.) by scientists such as Festinger. He explicitly referred to the contrasting construals of fear-provoking versus fear-justifying rumors: “One might . . . call them ‘fear provoking’ rumors, although, if our interpretation is correct, they would more properly be called ‘fear-justifying’ rumors” (p. 238). The abduction is that if hypothesized as fear-justifying, then the content of the rumors and their spread seem more understandable. The distinction between provoking and justifying is significant on several grounds. First consider why Festinger and his team might have been dissatisfied with a fear-provoking explanation. One possibility is a figure/ground contrast between a phenomenon and the context of received-view perspectives such as common-sense beliefs. That context for Festinger was the prevalence of psychological theories of reinforcement and common-sense assumptions about human motives to seek pleasure and avoid pain. That phenomenon/background (fact/foil) contrast piqued Festinger’s interest. If people seek pleasure, why torture themselves with thoughts about unpleasantly imaginable future events (e.g., reported rumors about a great flood coming soon or even the end of the world)?
Festinger’s focus on this phenomenon grew from an insight about a candidate explanation that simply made no sense when scrutinized
closely: “Certainly the belief that horrible disasters were about to occur is not a very pleasant belief, and we may ask why [emphasis added] rumors that were ‘anxiety provoking’ arose and were so widely accepted” (p. vii). Festinger’s abductive leaps, therefore, reflect approaching a “Why?” question from the direction of a “Why not?” foil. Peircean abduction involves an observation, C, and a tentative hypothesis, H, as accounting for C, if true. Festinger’s pursuit-worthy inquiry began with “but if H*, C would not have occurred,” where C refers to the rumors’ content and H* refers to the (alternative or foil) hypothesis regarding the psychology of seeking pleasure and avoiding pain. Folger and Stein (2017) also commented on the role of foils or alternative conceptions in mentioning Festinger and dissonance. The foil-like element of contrast warrants more extensive discussion, however, and Festinger’s abductive reasoning drew upon two such contrasts. The first is the fear-provoking versus fear-justifying contrast. Festinger described the second as a contrast between two types of disaster situations followed by differing types of rumors. Recall that the fear-justification hypothesis replaced its foil (rumors as fear-provoking) because the people spreading the rumors felt tremors and aftershocks but saw no death and destruction (e.g., they were cut off from communications from the quake’s epicenter). Festinger referred to a conjectured hypothesis about a contrastive, counterfactual set of circumstances:
[I]f rumors had been collected among persons living in the area of destruction, few if any of such ‘fear-justifying’ rumors would have been found . . . . Those persons in the area of destruction caused by the earthquake [at its epicenter] were, undoubtedly, also frightened . . . . [yet] no cognitive dissonance would have been created. The things they saw around them – the destruction, the wounded and killed – produced cognition which was certainly consonant with being afraid.
There would have been no impulse to acquire additional cognition which would fit with the fear. (Festinger, 1957, p. 239)
This counterfactual thought experiment illustrates especially creative abductive reasoning. Festinger’s (1957) book noted an actual case relevant as a conceptually similar event – a 1950 catastrophic landslide also in India. A report on it explicitly stated that those who experienced this disaster had “a feeling of instability and uncertainty similar to that which followed the Great Indian Earthquake of 1934” (Sinha, 1952, p. 200). Two points of contrast stand out. First, “Widespread anxiety was inevitable, but it remained within reasonable bounds” (p. 200). In other words, apparently the need to justify that fear would not have been as intense as in the earthquake situation. Consistent with that interpretation, “the rumors which Sinha reported were collected from persons . . . who were actually in the area and witnessed the destruction . . . .Since for these people there would have been no dissonance (what they saw and knew was quite consonant with being afraid), one would not expect ‘fear-justifying’ rumors to have arisen and spread among them” (Festinger, 1957, p. 240). The second difference between the earthquake and landslide rumors involves the content of the rumors reported: “in Sinha’s report there was a complete absence of rumors predicting further disasters or of any other type of rumor that might be regarded as supplying cognition consonant with being afraid” (p. 240).
This theme of contrasts – possible only on a presumptive basis grounded in hypotheses about which situational details count as key determinants – gets further treatment below, in a subsequent section analyzing the notion of hypothesis tractability as a vital component of pursuit-worthiness based on hopefully rich conceptual resources. Setting up that discussion, however, first needs contextualizing in terms of some intellectual traditions that preceded the concept of dissonance – including some of those that had inspired Festinger’s own past work.
Abductive Reasoning About Dissonance: Theoretical Antecedents of Style and Strategy
One precursor to the concept of dissonance was a study of rumors that Festinger had co-authored (Festinger et al., 1948). The context was the social organization of a neighborhood community and its communication patterns after a newly introduced community organizer began certain kinds of activities. As these activities began, and more people became involved, leadership of the community shifted away from the original tenants’ committee to newly active community members, which “threatened the status position of the old leaders” (Festinger et al., 1948, p. 470). This threat to leadership led members of the tenants’ committee to resist the planned activities and raised suspicion about the researchers as outsiders with unknown motives. Conditions were susceptible to the growth and transmission of rumors: “In the absence of satisfactory information supplied by the outsiders, an explanation was found which appeared plausible to some and which justified the resistance which had arisen” (Festinger et al., 1948, p. 471). In brief, the core content of the rumors was that the outside experts, the community organizer, and the new activity leaders were communists. The investigators’ approach to understanding the nature of this rumor foreshadowed the theme found in dissonance theory, namely, the “tendency to reorganize and to distort items so as to be consistent with the central theme” (Festinger et al., 1948, p. 485, emphasis in original). A motivation to keep cognitions consistent with one another, with distortions – if necessary – as part of that reorganization, became central in Festinger’s later formulation of dissonance theory.
The analysis of factors influencing the rumor’s transmission repeatedly mentioned the notion of motivational “forces,” such as a reference to the rumors’ transmission as something that “may be analyzed in terms of the forces acting on people within a social structure . . . [or] the social field” (Festinger et al., 1948, p. 485). This idea of motivational influences from the social environment came from the “field theory” of Kurt Lewin, a legendary social psychologist with whom Festinger had studied. Lewin’s field theory dealt with situational influences on behavior. Lewin’s theoretical framework analyzed situations by referring to forces in terms of their magnitude and direction. He also noted that the latter would be determined by the relative strength of forces driving change in a particular direction, versus restraining forces that opposed them. Note the use of those concepts in the following analysis of rumor transmission:
Once having heard the rumor, a force field is set up, with the [driving] forces to relay the rumor having various magnitudes in various directions. The forces in different directions will encounter different magnitudes of restraining force. Where the rumor spreads in the social field will depend upon the relative magnitude of the driving and restraining forces in each direction. (p. 485)
Similarly, determining which “people in a given region [of the force field] hear the rumor . . . will depend upon the strength of the restraining forces against telling the rumor” (p. 485). Moreover, the influence of field-theory concepts on Festinger’s development of dissonance theory stands out clearly in passages such as those referring to “forces arising from disparities of cognitive structure” and the need to have “cognitive structure in line” with an adopted perspective on the situation. Nevertheless, Festinger’s thinking would be modified in one key respect. The concluding paragraph of the report shows where its emphasis lay and thereby helps identify the nature of where the later changes would emerge in dissonance theory:
The extent of spread of rumor through a social structure will be governed . . . by the magnitude and direction of the driving and restraining forces . . . . In short, . . . the spread of a rumor will be a function of the properties of a force field in a social space. (p. 485)
Briefly, dissonance theory shifted emphasis from “social space” to attitudes based on individual decisions (e.g., even outside the context of what other people believe and do). The next section focuses on this change and others that fostered the theory’s pursuit-worthiness. Note, however, the following acknowledgment about individual decisions even in field theory as a precursor to dissonance: “The fact that a decision, once having been made, gives rise to processes that tend to stabilize the decision has also been recognized, particularly by Kurt Lewin” (Festinger, 1957, p. 33). Processes “stabilizing” decisions became a predominant feature in dissonance theory.
Abductive Reasoning about Dissonance: Pursuit-Worthiness as Tractability
Lewin’s field theory included a complex mathematics of topology not used by his disciples, including Festinger. The difference between motivational forces as discussed by Lewin versus Festinger points to reasons for dissonance theory’s enormous, direct influence on social psychology. The summary term used here – tractability – encompasses not only such differences but also those that distinguished dissonance from its contemporaries, most of which were quite like dissonance in many respects. The following subsections on tractability indicate features of dissonance theory that fed into the creativity and success it fostered in subsequent research.
Tractability as simplicity. In focusing on the relations among a person’s cognitive representations, Festinger provided a parsimonious conception of possible relations between any two of what he called cognitive elements. First, some pairs
have no mutual relevance, such as the color of one’s house and what to have for dinner. Consonant pairs, on the other hand, share a sensible connection – such as the desirability of a favorite main dish and choosing to have it for dinner. Finally, two dissonant elements reflect cognitive conflict: one not only fails to follow from the other (psychologically speaking) but also seems contradictory – for instance, a person who cannot stand the thought of eating a certain substance yet agrees to do so in a particular context (fried grasshoppers in a dissonance experiment; Beauvois & Joule, 1996). Parsimony facilitates pursuit-worthiness by virtue of tractability, reducing theoretical constructs to a manageable core. That allows further theoretical complexity in straightforward paths by building on a tight foundation. In other words, dissonance theory had pursuit-worthy tractability as a function of its basic organizational structure as a theory. Only the structural relation between cognitive elements was needed to kick-start theoretical analyses.
Tractability as generality. Uberty as abundance, nascent but gravid, requires tractability as the potential for generalization. The simplicity of dissonance theory’s core structure provides its generalizability because situations can be analyzed into their relevant kinds of cognitive elements. Dissonance researchers capitalized on this tractability by identifying characteristic classes of situations – namely, those whose key features highlighted which sorts of cognitions would be consonant or dissonant. This source of fruitful research creativity proved important when investigations no longer relied on the research originally driven by dissonance theorizing and especially when research extended beyond field investigations into the experimental laboratory.
Others have noted that such research built on the unfulfilled promise of Kurt Lewin’s outlook on the cognitive/spatial topography of human relations: “In particular, the idea of a highly general theory being tested in a variety of socially interesting domains exemplifies Lewin’s proposed paradigm: the conceptual replication of an abstract functional relation in different concrete life spaces” (Wicklund & Brehm, 1976, p. ix).
Tractability as rigor/flexibility trade-off. Dissonance theory’s pursuit-worthiness also relates to another source of its structural tractability. In simplifying possible relations between cognitive elements, “this core is simultaneously rigorous and flexible,” to wit:
It is rigorous in that it provides a solid theoretical basis for the key notion of the state of dissonance . . . . It is flexible on at least two counts. First, because it contains no propositions constraining the conditions that have to be fulfilled if the theory is to function . . . . However, it is also flexible because it contains few instructions relating to the operationalization of the main concepts. (Beauvois & Joule, 1996, p. xix)
The lack of “instructions” about operationalizations (e.g., how to measure predicted changes in cognitive elements or how to induce dissonance through experimental manipulations) provided a resource-rich conceptual framework in terms of the next aspect of tractability to be discussed.
Tractability as theoretical specificity. Cognitive elements in paired relations comprise the organizing structure of the theory of cognitive dissonance. The theory’s specificity, however, comes from stipulating how its motivational dynamic
plays out. Here the Lewinian legacy of force-field dynamics is revealed in an adaptation of the restraining force concept. In a given situation, cognitive elements acquire their theoretical status when the one most resistant to change acts as an anchor vis-à-vis all others. As a “generative cognition,” that anchor is “the cognition with reference to which the others are to be judged either consistent (consonant) or inconsistent (dissonant)” (Beauvois & Joule, 1996, p. 5). Identifying the most resistant element (e.g., an act that cannot be undone) allows conceptualizing consonant and dissonant cognitions in proportion to one another. This proportional representation of a cognitive “field” proved crucial for the theory’s uberty and differentiated it from other theories of cognitive consistency extant at the time – whose fruitfulness metaphorically died on the vine:
While the example of a fearful person who has nothing to fear contains the central idea of dissonance theory, it omits those aspects of Festinger’s (1957) theoretical statement that distinguished it from other theories of cognitive balance [i.e., consistency] . . . . Most notably the original statement of dissonance theory included propositions about the resistance-to-change of cognitions and about the proportion of cognitions that are dissonant, both of which allowed powerful and innovative analyses of psychological situations. It was the inclusion of these latter propositions that not only distinguished dissonance theory from other theories of cognitive balance, but also made dissonance theory a fertile source of research. (Wicklund & Brehm, 1976, p. 1)
As the next facet of tractability emphasizes, this fertility reflects uberty made esperable even while merely gravid with potential prior to the extensive research it eventually generated.
Tractability as identifying sources of variability. The strength of motivationally driving and restraining forces was a gravid Lewinian concept that came to fruition as adapted by Festinger. Cognitive dissonance was conceptualized as having noxious, aversive properties when aroused, thereby creating “a motivational state that impels the individual to attempt to reduce and eliminate it” (Wicklund & Brehm, 1976, p. 1). How hard a person strives to reduce or eliminate dissonance varies, therefore, with the degree of dissonance experienced. This variability in the magnitude of dissonance relates directly to the notion of proportionality mentioned above. An analytic, equation-like expression for that magnitude emerges in rudimentary form simply by representing proportionality as the numerator and denominator of a fraction. A corresponding “formula” is M = D/C, where M stands for the magnitude of dissonance (experienced as the strength of the motivation to reduce or eliminate that aversive state), D stands for dissonant elements, and C stands for consonant elements. Importantly, “the calculation of the magnitude of dissonance is carried out using the most resistant cognition as a focal point, or point of orientation”; hence, M “with regard to the most resistant element is a direct function of the proportion of relevant elements that is dissonant with it” (Wicklund & Brehm, 1976, p. 4).
Tractability as the basis for classifying gravid research venues. Two simple examples illustrate the translation of the numerator and denominator of the dissonance equation into a schematically tractable tool for classifying decision-making
situations, which in turn gave researchers a ready-made outlook on broad domains of relevance for testing the theory. The first example involves choices among alternatives. After a choice, the resistant cognition is the fact of having made the choice. For example, “I chose to buy this car” cannot be denied afterward, with increased resistance to change if the choice cannot be undone (e.g., a non-returnable product). After the choice, the dissonant elements are favorable aspects of non-chosen items and unfavorable aspects of chosen items, per the D/C schematic. Dissonance shrinks with reductions to the number (or importance) of dissonant cognitions and with increases to the number (or weighted importance) of consonant cognitions. For example, regret about a purchase is reduced by disparaging the attractive features of un-chosen alternatives and by adopting an exaggerated set of favorable attitudes toward a chosen alternative. Variations on this paradigm were quickly seized upon by researchers because of the simple operational scaffold it provided. The second paradigmatic example relates to public behavior versus private attitudes. Once again, the key is the focus on behavior as the cognitive element resistant to change. Public behavior not aligned with private attitudes would create dissonance. To the extent of a behavior’s undeniability, irrevocability, implications for future commitments, and so forth, the magnitude of dissonance created would go up, and the opportunities for reducing that dissonance would come from changing one’s attitude to bring it in line with the behavior. This characterization of commonly encountered situations created research on what became known as counter-attitudinal arguing. Researchers found the predicted support for conditions in which private attitudes changed when having argued against one’s own prior views became the resistant element. 
These two paradigms illustrate how the theory prompted research possibilities that either were not implied by related theories or were contrary to their predictions. That distinctiveness stood out immediately to Festinger’s contemporaries because of the novelty of the kinds of predictions that dissonance theory implied when elaborated in relatively straightforward ways. Up until that period in the history of social psychology, for example, research had sought ways to change behavior by changing attitudes, whereas dissonance paradigms examined post-decisional dissonance reduction based on adjustments to a previously existing attitude structure – in other words, research capitalizing on a behaviors-change-attitudes causal sequence rather than the traditional attitudes-drive-behaviors approach to research and theorizing.
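The D/C schematic described above for the post-choice paradigm lends itself to a small worked illustration. The following Python sketch is only that, a sketch under stated assumptions: the class name `Cognition`, the function `dissonance_magnitude`, the car-purchase cognitions, and every importance weight are hypothetical and do not come from Festinger (1957) or Wicklund and Brehm (1976). It treats M as the summed importance of dissonant elements divided by the summed importance of consonant elements, each judged relative to the most resistant cognition (here, the fact of having chosen the car).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cognition:
    """One cognitive element, judged relative to the most resistant cognition."""
    label: str
    relation: str            # "dissonant", "consonant", or "irrelevant"
    importance: float = 1.0  # hypothetical weight, not an empirical measure


def dissonance_magnitude(cognitions):
    """Return M = D / C using importance-weighted sums (illustrative only)."""
    d = sum(c.importance for c in cognitions if c.relation == "dissonant")
    c_sum = sum(c.importance for c in cognitions if c.relation == "consonant")
    if c_sum == 0:
        # The ratio is undefined when no consonant elements are present.
        raise ValueError("M = D/C is undefined without consonant cognitions")
    return d / c_sum


# Anchor (most resistant cognition): "I chose to buy this car."
before = [
    Cognition("rejected car had better mileage", "dissonant", 2.0),
    Cognition("chosen car was expensive", "dissonant", 1.0),
    Cognition("chosen car is reliable", "consonant", 2.0),
    Cognition("chosen car looks good", "consonant", 1.0),
    Cognition("color of my house", "irrelevant", 1.0),  # ignored by the ratio
]

# Two routes to reduction: disparage an attraction of the rejected option
# (lower its importance) and add a favorable belief about the chosen one.
after = [
    Cognition("rejected car had better mileage", "dissonant", 1.0),
    Cognition("chosen car was expensive", "dissonant", 1.0),
    Cognition("chosen car is reliable", "consonant", 2.0),
    Cognition("chosen car looks good", "consonant", 1.0),
    Cognition("chosen car holds its resale value", "consonant", 1.0),
    Cognition("color of my house", "irrelevant", 1.0),
]

m_before = dissonance_magnitude(before)  # D = 3.0, C = 3.0
m_after = dissonance_magnitude(after)    # D = 2.0, C = 4.0
print(m_before, m_after)
```

With these made-up weights, M falls from 1.0 to 0.5 once an attractive feature of the rejected car is disparaged and a favorable belief about the chosen car is added, which mirrors the two routes to dissonance reduction described for the choice paradigm. Irrelevant elements play no part in the ratio, in keeping with the three-way classification of cognitive pairs.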
Summary: From Whence Uberty in Dissonance?
The above discussion has pointed to ways that Festinger built on the uberty that lay dormant (“gravid with young truth”) in Lewin’s force-field analyses of human motivational dynamics. In turn, social psychologists conducted numerous studies of dissonance because of its uberty – so much so that “dissonance” has become a household word. Obvious caveats about this discussion must be acknowledged,
however, because such accounts can reflect the revisionist history of 20/20 hindsight (but for similar accounts, see, e.g., Aronson, 1997; Beauvois & Joule, 1996; Wicklund & Brehm, 1976). Undeniably, however, Festinger changed the lens through which his team viewed the rumor phenomenon, eschewing conventional approaches to human motivation. Whether based on extant theory or common-sense psychology, those approaches lacked the resources to account for the rumors’ content if classified as fear-provoking. In this case, thinking about the nature of hypotheses that might account for a given phenomenon required rethinking the to-be-explained phenomenon itself. Festinger needed to understand what would require explanation in terms of why it needed explaining, because how to explain it was not readily obtainable from existing conceptual resources. He had to rely on abductive reasoning if he were to have any hope of generating possible interpretations of the situation. Pursuing inquiry along the lines of such tentative conjectures would depend in the first place on whether they seemed to contain (be gravid with) conceptually rich resources in abundance (indications of uberty). The seeds of dissonance theory were planted when Festinger abductively reasoned from effect (rumor) to conjectured cause (fear-justifying function). The theory’s gestation also reveals one way of thinking about how to abductively generate such conjectures, namely, by considering what categories of effects a phenomenon might have exhibited. Because fear-provoking effects do not follow from pleasure/pain psychology, one category of candidate hypotheses comprises those that do not put the rumors into the fear-provoking category. The relevant search space for conjectures then becomes possible non-provoking functions of various kinds. Festinger’s “searches” in that space did not have to wander far from the broad swath of familiar territory staked out by his mentor (Lewin).
Moreover, his prior use of Lewin-inspired categories of explanation had prompted a related candidate explanation for some rumors he had previously investigated, as mentioned earlier: “In the absence of satisfactory information . . . , an explanation was found . . . which justified . . . .” (Festinger et al., 1948, p. 471, emphasis added). Thus, when suspecting that he might need to think of the earthquake-rumor phenomenon along the lines of something other than the explanatory prospects in the fear-provoking category, Festinger had no need to eschew logical reasoning processes in ultimately hitting upon a fear-justifying candidate cause. The abductive nature of those reasoning processes is not unlike how detectives reason when picking from some categories of people as “persons of interest.” What Festinger also suspected at the outset (esperably) was that this new theoretical lens might offer rich explanatory resources for various domains other than rumor transmission, which is why the team then had “many discussions in which we attempted to pin down the idea and to formalize it somewhat” (Festinger, 1957, vii). This speaks to the fruitfulness of the pursuit of these kinds of hypotheses, elaborated in ways that helped the central insights seem applicable to certain characteristic types of situations. Once one understands the organizational structure and motivational dynamics of dissonance, it becomes possible to “spot” its potential
emergence and the likely directions of its resolution in a much wider variety of contexts. This indicates the extent to which the uberty of the hypotheses was more than merely esperable. The question of why that was the case is the topic of the next section, which translates the analysis of the dissonance case study into the language of Peircean scholarship focused on creative inquiry.
Creative Scientific Inquiry as a Logic of Pursuit-Worthiness
Often philosophers apply a rather abstract approach in their analyses of scientific inquiry. The idea is to conduct analyses grounded in something other than mysterious ideas about the psychology of intuition – to instead examine the pursuit of inquiry “treated by conceptual or philosophical means” (Paavola, 2004, p. 267), such as traditional forms of logic. The preceding discussion, however, suggests that creative inquiries can follow certain distinctive patterns of reasoning even if those patterns do not necessarily fit into formal structures of logic (e.g., of deduction or induction). That is, they nonetheless exhibit meaningful process patterns. One example of such patterns was illustrated in the development of dissonance from an initial observation interpreted conceptually and then used as the basis for further theoretical elaboration. Festinger and his team pursued their hypothetical interpretation of the rumors as fear-justifying rather than fear-arousing. They pursued it in ways they considered promising. Those pursuits pertain to esperable uberty in two respects, which might in fact be considered the basis for an ongoing series of abductive conjectures. First, they decided that the fear-justifying interpretation was itself worth examining in terms of implications (its internal logic). They speculated that the need for rumors as inventive rationalizations might have depended, in part, on the absence of direct evidence for true calamity. The internal logic of that idea gave special significance to Sinha’s report as a foil against which Prasad’s could be contrasted, namely, because rumors in the former case did not involve predictions of further disaster, and those rumors occurred in a region where direct evidence about the initial catastrophe was available.
Second, the theory’s hoped-for fruitfulness was spawned by conceptualizing properties of that concept in particular ways as to their structure (types of cognitive elements and aspects of their relation to one another) and their dynamics (aversive tension in the presence of dissonant elements, restraining forces especially likely to be associated with past behavior, and the driving forces motivating attitude change). Of course, the above account of Festinger’s theorizing is a speculative reconstruction of events. Nonetheless, the particulars of that account also resemble philosophical approaches to abductive reasoning as a creative activity. One such approach is N. R. Hanson’s (e.g., 1971) analysis of the “logic of discovery.” Earlier philosophers had assumed scientific reasoning only addressed hypotheses’ acceptance rather than their discovery. A different perspective, however, emerges from making a distinction between “reasons for accepting a particular, minutely-specified hypothesis” and “reasons for suggesting that, whatever specific claim the successful H [hypothesis] will make, it will nonetheless be an hypothesis
of one kind rather than another”; hence, the subject for a logical analysis of processes of reasoning pertains to “whether (before having hit on an hypothesis which succeeds in its predictions) one can have good reasons for anticipating that the hypothesis will be one of some particular kind” (Hanson, 1971, p. 291, emphases in original). This is an esperable outlook about how to generate candidate explanations with accessible uberty by focusing on a general category of hypotheses considered germane to the phenomenon of interest. In claiming that creative insights in science exhibit patterns of reasoning processes, Hanson enriched the Peircean account of abduction. He shifted the focus away from a single, particularized hypothesis to the idea of multiple hypotheses or clusters, each with properties reflecting a family-like kind of relationship to one another. This multiplicity of hypotheses (hence uberty as abundance) also pertains to classifying a phenomenon as an anomaly (or, in the broader sense, as something pursuit-worthy by virtue of some intriguing, problematic features). Regarding a phenomenon as anomalous, and the like, “cannot follow from any obvious premise cluster, else it would not be anomalous” (Hanson, 1971, p. 292). Similarly, the dissonance team treated rumor content as pursuit-worthy because it did not fit pleasure/pain hypotheses. Given that the “identification of an event as ‘anomalous’ depends on . . . [an] elaboration of familiar premise clusters” (Hanson, 1971, p. 292), pursuit-worthy lines of inquiry exhibit creative uberty by entertaining conjectures and hypothetical constructs of a less-familiar kind – a kind considered potentially up to the explanatory task in light of why the phenomenon seems anomalous. The spark to creativity, therefore, stems from “descriptions of . . . observations” that are “incompatible with the expected . . . unpacking of currently accepted hypotheses” (Hanson, 1971, p. 297).
Capitalizing on that kind of recognition (re-construal from familiar to problematic), however, begins to have its esperable features to the extent that such uberty resides in conjectures about how and why the phenomenon to be explained diverges from standard, familiar theoretical grounds upon which to base explanations. Hanson’s reference to pursuing conjectures within the search space of a certain kind (category) of hypotheses brings to mind the Peircean notion of token representative of a larger class of types, and more needs to be written in that vein. Based on the dissonance case study, it appears that the creative pursuit of uberty can involve attempts to discern matches between the type➔token patterns of the phenomenon (e.g., inability to deny one’s fear; lack of observable grounds for fear) and speculatively conceived type➔token relations among corresponding theoretical constructs (e.g., cognitive elements; restraining forces versus driving forces). This property-matching aspect of inquiry is consistent with one way of describing research strategies (see also Rodrigues & Emmeche, 2021, on abduction and styles of scientific thinking):
So if I am a researcher looking for a good explanatory hypothesis . . . , I can (and must) try to constrain and guide my search by taking into account that my explanation must explain or at least be consistent with, most other clues and information that I have available concerning the subject matter . . . . I should have a good further explanation for why this
explanation . . . deserves attention. So I should have an explanation for my explanation. (Paavola, 2004, pp. 270–271; see also Paavola, 2012)
Festinger sought his explanation-for-an-explanation in Lewinian concepts about psychological force fields. He did not simply presume that the rumors were fear-justifying rather than arousing; rather, the idea of restraining forces as constraints on dissonance reduction facilitated reasoning why that might be the case. He used Lewinian concepts in a novel way to account for a novel phenomenon rather than simply hypothesizing from scratch. Thus, while “the different elements of the hypothesis were in our minds before,” it takes “the idea of putting together what we had never before dreamed of putting together” (Peirce, CP 5.181, cited in Paavola, 2004, p. 272) in order for pursuit-worthy lines of inquiry to get enough traction to capitalize on uberty. Paavola’s (2004) discussion of strategic explanatory pursuit bolsters the argument for Festinger as apposite to Peirce’s ideas and refers to other case histories for additional support. A “strategic viewpoint” is required to reasonably assess “the way in which this hypothesis is seen to fit with this particular problem in question and with other relevant information” (Paavola, 2004, pp. 272–273). For example, Paavola remarks that Darwin did not come up with the idea of evolution; others before him had suggested similar notions. Moreover, one commentary noted that “his notebooks show that he had or almost had the same idea a number of times before . . . So the historic moment [the Malthusian insight as an explanation-for-an-explanation] was in a sense a re-cognition of what he already knew or almost knew” (Gruber, 1981, p. 42, quoted in Paavola, 2004, p. 273). This epitomizes the idea of construing a phenomenon as a token representation of a type for which particular “kinds” of hypotheses might be appliable in illuminating certain lines of inquiry as pursuit-worthy.
Hence, “the difficult part in discovery [creativity in identifying worthy scientific pursuits] is the recognition that the hypothesis really is a viable way of solving this particular problem and that the hypothesis works more generally, and not only in relationship to one, particularly anomalous phenomenon” (Paavola, 2004, p. 273). Festinger’s structuration of his theory facilitated the achievement of such generality to the extent that today dissonance is a household word, commonly referred to when analysts seek to explain all sorts of events.
Conclusions
Is there a “logic of pursuit-worthiness,” such that lines of scientific inquiry might more often track sources of uberty that are not merely esperable? In the context of writing about uberty as the inverse of security, Peirce described the former as possessing a “gravid,” nascent potential for eventually leading to the exposure of as-yet-unborn “truth.” The analysis of the origins of dissonance theory identifies some of the sources of its uberty, but the abductive tractability of that theory cannot be verified in a retrospective fashion. The fecundity of the conceptual framework is now proven, but Festinger’s team had to have some grounds for being “esperable”
about that in advance. Pursuing the idea of pursuit-worthiness, therefore, should not end with any given analysis of established successes in science, least of all this chapter’s account. Instead, philosophers and scientists alike should bemoan that many accounts of scientific reasoning “have said almost nothing about the conceptual context within which . . . an hypothesis is initially proposed,” ignoring how “Aristotle . . . insisted that the proposal of an hypothesis can at least be a reasonable affair . . . . [with] good reasons, or bad ones, for suggesting one kind of hypothesis initially, rather than some other kind” (Hanson, 1971, p. 289). Case histories such as the dissonance example provide reasons to believe that pursuit-worthiness of “good reasons” for a given kind of inquiry has “logical credentials of its own” and that further case-history analyses might reveal more about “the feeling for explanatory fertility which so often guides a scientist’s experimentation and observation” (Hanson, 1971, pp. 299–300, emphasis added).
References
Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanation. Berlin: Springer.
Aronson, E. (1997). Back to the future: Retrospective review of Leon Festinger’s A Theory of Cognitive Dissonance. The American Journal of Psychology, 110, 127–137.
Beauvois, J.-L., & Joule, R.-V. (1996). A radical dissonance theory. Taylor & Francis.
Bertolotti, T., Arfini, S., & Magnani, L. (2016). Abduction: From the ignorance problem to the ignorance virtue. IFCoLog Journal of Logic and its Applications, 3(1), 153–173.
Cabrera, F. (2021). String theory, non-empirical theory assessment, and the context of pursuit. Synthese, 198(Suppl 16), S3671–S3699.
Campos, D. G. (2011). On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation. Synthese, 180, 419–442.
Elliot, K., & McKaughan, D. (2009). How values in scientific discovery and pursuit alter theory appraisal. Philosophy of Science, 76, 598–611.
Festinger, L. (1957). A theory of cognitive dissonance. Row Peterson.
Festinger, L., Cartwright, D., Barber, K., Fleischl, J., Gottsdanker, J., Keysen, A., & Leavitt, G. (1948). A study of a rumor: Its origin and spread. Human Relations, 1, 464–486.
Folger, R., & Stein, C. (2017). Abduction 101: Reasoning processes to aid discovery. Human Resource Management Review, 27, 306–315.
Frankel, H. (1982). The development, reception, and acceptance of the Vine-Matthews-Morley Hypothesis. Historical Studies in the Physical Sciences, 13, 1–39.
Gruber, H. E. (1981). On the relation between ‘aha experiences’ and the construction of ideas. History of Science, 19, 41–59.
Hanson, N. R. (1958). Patterns of discovery. Cambridge University Press.
Hanson, N. R. (1971). In S. Toulmin & H. Woolf (Eds.), What I do not believe and other essays. D. Reidel Publishing Company.
Harman, G. H. (1965). The inference to the best explanation. The Philosophical Review, 75, 88–95.
Ivani, S. (2019). What we (should) talk about when we talk about fruitfulness. European Journal for the Philosophy of Science, 9(4), 1–19.
Kaplan, A. (1998). The conduct of inquiry. Transaction Publishers.
Laudan, L. (1980). Why was the logic of discovery abandoned? In T. Nickles (Ed.), Scientific discovery, logic, and rationality (Boston studies in the philosophy of science, Vol. 56, pp. 173–183). Reidel.
Lipton, P. (2004). Inference to the best explanation. Routledge.
Mackonis, A. (2013). Inference to the best explanation, coherence and other explanatory virtues. Synthese, 190, 975–995.
McKaughan, D. (2008). From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories. Transactions of the Charles S. Peirce Society, 44, 446–468.
McMullin, E. (1976). The fertility of theory and the unit for appraisal in science. In R. Cohen, P. K. Feyerabend, & M. W. Wartofsky (Eds.), Essays in memory of Imre Lakatos (pp. 395–432). Springer Netherlands.
Nyrup, R. (2017). Hypothesis generation and pursuit in scientific reasoning, Durham theses, Durham University. Available at Durham E-Theses Online: http://etheses.dur.ac.uk/12200/
Nyrup, R. (2020). Of water drops and atomic nuclei: Analogies and pursuit worthiness in science. British Journal of the Philosophy of Science, 71, 881–903.
Paavola, S. (2004). Abduction as a logic and methodology of discovery: The importance of strategies. Foundations of Science, 9, 267–283.
Paavola, S. (2012). On the origin of ideas. Lap Lambert Academic Publishing.
Peirce, C. S. (1913/1931–1958). Letter to F. A. Woods, “On ‘Would Be.’” In A. W. Burks (Ed.), Collected papers of Charles Sanders Peirce (Vol. 8, pp. 384–388). Harvard University Press.
Peirce, C. S. (1913/1992–1998). An essay toward improving our reasoning in security and in uberty. In N. Houser & C. Kloesel (Eds.), The essential Peirce: Selected philosophical writings (Vol. 2, p. 472). Indiana University Press.
Prasad, J. (1950). A comparative study of rumours and reports in earthquakes. British Journal of Psychology, 41, 129–144.
Rescher, N. (1976). Peirce and the economy of research. Philosophy of Science, 43, 71–98.
Rodrigues, M. V., & Emmeche, C. (2021). Abduction and styles of scientific thinking. Synthese, 198, 1397–1425.
Schindler, S. (2017). Theoretical fertility McMullin-style. European Journal of the Philosophy of Science, 7, 151–173.
Sebeok, T. A. (2001). Signs: An introduction to semiotics. University of Toronto Press.
Šešelja, D., Kosolosky, L., & Straßer, C. (2012). The rationality of scientific reasoning in the context of pursuit: Drawing appropriate distinctions. Philosophica, 86, 51–82.
Šešelja, D., & Straßer, C. (2013). Kuhn and the question of pursuit worthiness. Topoi, 32, 9–19.
Shaw, J. (2022). On the very idea of pursuitworthiness. Studies in the History and Philosophy of Science, 91, 103–112.
Sinha, D. (1952). Behaviour in a catastrophic situation: A psychological study of reports and rumours. British Journal of Psychology, xx, 200–209.
Whitt, L. (1990). Theory pursuit: Between discovery and acceptance. In A. Fine, M. Forbes, & L. Wessels (Eds.), PSA: Proceedings of the biennial meetings of the philosophy of science association (Volume One: Contributed Papers) (pp. 467–483). Philosophy of Science Association.
Whitt, L. (1992). Indices of theory promise. Philosophy of Science, 59, 612–634.
Wicklund, R. A., & Brehm, J. W. (1976). Perspectives on cognitive dissonance. Lawrence Erlbaum Associates.
Creativity and Abduction According to Charles S. Peirce
56
Sara Barrena and Jaime Nubiola
Contents
What Is Creativity According to Charles S. Peirce?
The Scientific Method: Logic and Imagination
Abduction as Engine of Creativity
Abduction and Artistic Creativity
Abduction and Creative Life: Self-Control
Some Peircean Keys to Foster Creative Thinking
Conclusion
References
Abstract
The work of Charles S. Peirce provides powerful keys for the study of human creativity. It is possible to argue, based on the pragmatism that Peirce defended, that there is a form of reasoning that joins logical soundness and imagination. Peirce’s scientific method, the power of which lies upon a logical operation called abduction, provides important clues to reach a better understanding of how to make new discoveries, how to embody new ideas in creations, and, ultimately, how to think more creatively and effectively. In our contribution, the authors have collected a good amount of Peirce’s texts because they provide a wonderful image of the creative spirit of his philosophy and of the relevance of his thought for the philosophy and culture of the twenty-first century.
S. Barrena · J. Nubiola
Departamento de Filosofía, University of Navarra, Pamplona, Spain
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_18
The chapter is arranged in six sections:
1. What Is Creativity According to Charles S. Peirce?
2. The Scientific Method: Logic and Imagination.
3. Abduction as the Engine of Creativity.
4. Abduction and Artistic Creativity.
5. Abduction and Creative Life: Self-Control.
6. Some Peircean Keys to Foster Creative Thought.
Keywords
Creativity · Pragmatism · Abduction · Peirce · Scientific method
Creativity is more than the superficial idea that some manuals or self-help books usually offer us. Creative thinking is not at odds with exactitude or logical rigor. Rather, it can be said that the creative capacity constitutes the distinguishing trait of human reason: it is the essential characteristic to advance in knowledge, to promote sensitivity, and to achieve that which makes us fully human; creativity is the key to human flourishing. Human beings tend to grow, to develop themselves in ways that are not given to them, and to manifest themselves freely in science and art or simply through the actions they develop in their daily lives. The American scientist and philosopher Charles S. Peirce (1839–1914), who has been characterized as one of the most original and versatile intellects America has ever produced (Fisch, 1981, p. 17; Russell, 1959, p. 276), offers a good starting point from which to reflect on the creative capacity of the human being. The scientific practice that Peirce developed over many years at the US Coast and Geodetic Survey, his extensive knowledge of the history and theory of science, and his reflections on aesthetics and beauty – which have gained greater relevance in recent years – offer an excellent avenue for the study of creativity and the logic of discovery. In this sense, it is surprising how little attention Peirce scholars have paid to this issue of creativity after the seminal studies of Douglas Anderson (1987) and Carl R. Hausman (1987). In the last two decades, the valuable contributions of Sara Barrena (2007, 2014, 2019) and Regina M. Brioschi (2020) are of great interest for a clear understanding of creativity according to Charles S. Peirce. Peirce’s pragmatic thinking, far from erroneous interpretations that emphasize the practical or useful quality of actions, seeks to understand human beings in relation to their actions and what those actions can lead to, that is, the actions’ possible consequences. 
For Peirce, creativity resides precisely in the possibility of developing new courses of action with the help of imagination, of creating new signs that allow us to get to know the world around us and make it grow. Reason, as the Peircean notion of abduction will show us, is essentially creative. In the context of his pragmatism, Peirce writes:
On the contrary what he adores, if he is a good pragmaticist, is power; not the sham power of brute force, which, even in its own specialty of spoiling things, secures such slight results; but the creative power of reasonableness, which subdues all other powers, and rules over them with its sceptre, knowledge, and its globe, love. (CP 5.520, c.1905)
Pragmatism has to do precisely with learning from experience, transforming it until doubt turns into belief through a process that can be evaluated from a practical point of view; pragmatism has to do with reasoning in a more effective and creative way. In this contribution, the authors will first examine some general notions about creativity according to Peirce and the role it plays in his thought. Second, the attention will be focused on the Peircean scientific method, which involves the development of a type of thought, both creative and rigorous, in which logic and imagination are combined. This will be followed by a more detailed explanation of the Peircean notion of abduction as an engine of creativity. An analysis of creativity in art and in ordinary life will then be carried out, concluding with some Peircean clues on how creative thinking can be fostered and improved.
What Is Creativity According to Charles S. Peirce?
Creativity, which is nothing other than the possibility of growing that permeates everything, the capacity to generate new intelligibility, is a central issue in Peirce’s thought. There is in Peirce a permanent interest in explaining creativity and the place that creative acts occupy in an intelligible universe (Hausman, 1993, p. vii). Besides being a topic that Peirce directly addresses in his study of methodology, creativity is a nerve that runs through and gives life to the entirety of Peirce’s philosophical system. Everything, including the universe and human beings, is for him subject to constant evolution and growth. In this chapter, the authors will set aside the metaphysical aspects of creativity in order to focus on certain issues arising from the Peircean scientific methodology that can provide important clues to develop creative thinking (Barrena, 2019). This can help us understand how humans are able to carry out their proper functions: to embody general ideas in art-creations, in utilities, and in theoretical cognition (CP 6.476, 1908). Creativity involves exercising certain capacities and promoting better reasoning, which will help us in any learning process, in expressing ourselves better, and in other numerous situations of everyday life (Chiasson, 2001, p. ix). Creativity is an answer, a way of expression, that is, only the person that has something to say or something to solve is able to develop creativity. Therefore, creativity requires content and certainly effort. For Peirce, something creative means, first, an actualization of possibilities within the continuity of thought. According to his synechism, Peirce argues that everything is continuous, both in the universe and in our thought. Synechism, Peirce writes, “means the tendency to regard everything as continuous,” and he adds that
he carries the doctrine “so far as to maintain that continuity governs the whole domain of experience in every element of it” (CP 7.565–6, 1893). All propositions are continuous with experience and with other propositions; that is, every element of our thought is always preceded by others with which it relates, and will in turn be followed by a new one. “Ideas tend to spread continuously and to affect certain others which stand to them in a peculiar relation of affectability,” Peirce writes (W 8: 136; CP 6.104, 1892). In this framework, it is possible to speak of a “train of thought” which can be sketched as follows:
• Life is like a chain. Our thought is continuous. There are no shortcuts in it. Our reasoning does not come out of the blue.
• All reasoning is preceded and followed by other instances of reasoning. One thing leads to another, and there are no final ideas: there is no reason to believe that any given idea will be established forever (CP 7.569, 1893).
• The continuity of thought allows us to advance step by step toward the truth.
This association of ideas that Peirce defended in the nineteenth century is an important element in contemporary brainstorming or group problem-solving (Prince, 1970). It is necessary to pull on an idea to see the “string” of ideas behind it, although these elements, at first glance, may seem to have no relationship to each other. The successive thoughts lead to solutions. Peircean abduction consists precisely, as will be seen later, of rearranging various elements that seem unconnected, thus linking what at first glance may seem disparate and irrelevant. Creative thinking therefore implies a continuation of ideas, proceeding further in the train of thought. But that step from one thing to another must hold certain features to be creative: it must be new, intelligible, original, and valuable.
What is creative is new not only insofar as it is different from the past elements, but also with respect to what is intelligible (Hausman, 1987, pp. 381–382). If differences were the only things that characterized the new, everything in the world would be new, since each thing has its specific place and time which are different from the previous ones, and, in that sense, it is different. Therefore, something more radical is needed to make something new: a difference in its intelligibility. It would be the case, for example, of a painting that can be perceived as from a different class with respect to previous ones, or of a scientific idea that is different from everything known before it. The new, on the one hand, must exist with reference to the old, for without tradition, there can be no novelty: something that was completely new could not even be expressed. In this sense, the “enthusiasm for experience” of the creative person is usually mentioned, although, on the other hand, it must also be recognized that the creator cannot be limited to experience: creative thinking must go beyond the limits of past knowledge. Creative work must be intelligible within that dialogue between old and new. It must have a value, a quality that makes it possible for it to be recognized and identified. “The new thing must have a character, an identifiable principle or quality, and this
character is identifiable because it seems to be something we may in the future be able to connect with other things” (Hausman, 1987, p. 381). In this way, a painting has something that makes it identifiable and can be related to a certain style or other styles from the past, or it can potentially be framed within future currents, or a literary work can be ascribed to a genre, etc. In addition to novelty and intelligibility, a third feature of the creative, often quoted, is originality (Brioschi, 2020, pp. 108–113). Some scholars have defined it as the ability to make connections to something that was not previously connected in that same way (Boden, 1999, p. 369), and it is sometimes unclear where the difference between novelty and originality lies. In our understanding, when speaking of originality, the intention is to place the emphasis not on the differences, on being distinct, but rather on being oneself, on contributing one’s own personal factor, which is added to what is created. Originality would appear in this sense as the ability to express oneself, to express one’s own personal being. In this sense, the expressive capacity will be a fundamental element within the Peircean conception of artistic creativity. The work of art occurs for Peirce when artists manage to express something, that is, when they manage to generalize a quality or sensation, when they achieve a “reasonable incarnation,” “a distinctive quale to every combination of sensations so far as it is really synthetized” (CP 6.223, 1898). In his trips through Europe, where he had the opportunity to contemplate numerous works of art, Peirce often highlighted the expressive dimension of the artistic phenomenon, and it is striking that the expressiveness or not of a work became a distinctive criterion of artistic quality for him (Barrena, 2014, 2015, pp. 194–201). The creative work, on the other hand, must have an external value.
It is not a subjective phenomenon, and that makes something creative differ from a purely extravagant invention. The mere originality of the artist is not enough, but reasonableness and some communicable action are required. You need to leave an external mark: being creative is not the same as being brilliant or witty. The artist creates something new that has real value. It is difficult to define what value is, but as Hausman has pointed out, it is possible to speak of certain conditions that demand our attention and make us judge that the new thing should exist (Hausman, 1987, p. 382). In the case of science, the value of hypotheses would lie in them being accurate explanations for the world. In this way, an original and novel hypothesis, but one that is not correct, could not be considered creative: scientific hypotheses must explain the facts. In the case of art, the value of creative ideas will not consist in their ability to explain reality but in the ability to express feelings that take shape and resolve an initial concern: the artist does not seek to understand what is true nor to make a discovery, but rather to express something in an admirable way. Thus, the first steps to understand creativity according to Peirce are the following: the experience must be taken as the starting point, and then to the analysis of the different elements that the experience provides us, we add the elaboration of the mind, which results in something creative, intelligible, and new. Showing how experience is possible and how its different components can be broken down and
recombined, that is, to show ultimately how growth and creativity are possible, is precisely – for Peirce – the whole task toward which philosophy is oriented (van Herdeen, 1998, p. 62). It will be seen below how this recombination of the elements of experience makes it possible to give rise to something new.
The Scientific Method: Logic and Imagination
According to Peirce, in order to become better thinkers and more creative persons, it is necessary to choose a way of reasoning and to respect whichever method is chosen, without ever despising other modes of thought: The genius of a man’s logical method should be loved and reverenced as his bride, whom he has chosen from all the world. He need not contemn the others; on the contrary, he may honor them deeply, and in doing so he only honors her the more. But she is the one that he has chosen, and he knows that he was right in making that choice. And having made it, he will work and fight for her, and will not complain that there are blows to take, hoping that there may be as many and as hard to give, and will strive to be the worthy knight and champion of her from the blaze of whose splendors he draws his inspiration and his courage. (W 3: 257; CP 5.387, 1877)
The choice of a method is, therefore, one of the most important decisions that can be made, for it will direct human life. And though Peirce does not reject any other particular method – because he thinks the research-road must never be closed – for him, the superiority of the scientific method is clear. Peirce states in his well-known article, The Fixation of Belief (W 3: 242–257; CP 5.358–5.387, 1877), that there are four different methods that can lead to overcoming doubt and to the establishment of a belief: tenacity, authority, a priori method, and science. Tenacity would be the method of those people who cling to their own beliefs and hold them without wavering; the method of authority would accept what others – an institution or group of people – impose on us, making us intellectual slaves; the a priori method consists in believing what one tends to believe, that which seems true according to our own reason. For Peirce the a priori method is valuable, but contains an accidental element that is not based on experience or universal nature, but on personal preferences, and therefore, research becomes something like the development of taste. The method of science, meanwhile, is the only one that is based on experience and presupposes the existence of reality, that is, of real things that affect our senses according to regular laws, independently of our opinions. The scientific method assumes that it is possible to know how things are and that anyone with enough experience and reason will come to the same conclusions. It is the only method that, because it is based on experience, enables agreement among all people. Although Peirce developed his methodology in the field of science, it should not be limited exclusively to that domain. 
The methodology of Peirce – what he called the “scientific method” – is much more than a way to make scientific discoveries: it actually aspires to be the right way to proceed with any research program concerning reality, that is, with any creative approach to our world. It is a tool that all can
56 Creativity and Abduction According to Charles S. Peirce
use: “Everybody uses the scientific method about a great many things, and only ceases to use it when he does not know how to apply it,” Peirce writes (W 3: 254; CP 5.384, 1877). Scientific methodology must be used, according to Peirce, in any investigation that seeks to be serious and rigorous, whatever its object. This methodology can be used even in the most creative areas of human culture, for example, in art. What is, then, this Peircean method? In the first place, the scientific method is, according to Peirce, something logical, since a merely psychological explanation of how discoveries take place would not solve the question (CP 5.172, 1903). Peirce’s scientific method is a structured process, and it is susceptible to a logical explanation: “There is a purely logical doctrine of how discovery must take place, which, however great or little is its importance, it is my plain task and duty here to explore” (CP 2.107, c. 1902). The scientific method is a continuous and complex process that involves three steps aimed at discovering the truth: abduction, deduction, and induction, all three being equally necessary. It should be noted, first of all, that the distinction of the steps of the scientific method is not strictly temporal: “They are three kinds of reasoning that do not run independently or parallel, but integrated and cooperating in the successive phases of the scientific method” (Génova, 1997, p. 57; own translation). The method, as it is characterized by Peirce, begins with the known and observed facts, after which it proceeds into the unknown. All research originates with the observation of a surprising phenomenon, in an experience that makes us abandon some expectation or that produces a break in a habit. In The Fixation of Belief, Peirce writes that research always starts from doubt, not a methodological doubt, but a real question – “a real and living doubt” – that arises in us through experience. 
The phenomenon of surprise has no relation to Cartesian doubt, which for Peirce is a mere “paper-doubt” (CP 5.445, 1905; 5.416, 1905). Genuine doubt always has an external origin and cannot be produced by an act of will (CP 5.443, 1905). Surprise produces some irritation and demands a hypothesis; it forces us to seek an explanation that turns the surprising phenomenon into a reasonable one (Nubiola, 2005). Therefore, the implementation of the scientific method requires determining what it is that surprises us or what is noticed as strange, that is, identifying that which – in our experience – is not perceived as it should be and causes us restlessness, a state of doubt that we desire to overcome. The problem to be solved must be clear to us, and sometimes this will require us to redefine the problem in order to understand it better: a correct question is part of the solution. A “surprising” fact requires a change in our rational habit of belief; it demands an explanation. From this starting point, the three stages of the scientific method, strictly speaking, begin to develop.
(a) Abduction (A Hypothesis Is Sought)
The process begins with abduction. This phase involves the search for and development of an initial hypothesis: from the observation of a surprising fact, a conjecture that provides a possible explanation arises. This means, as Peirce
states, that a syllogism arises which shows the surprising fact as necessarily resulting from the circumstances of its occurrence, along with the truth of the credible conjecture as premises. In this stage, a particular theory is not yet in mind, though surprising facts make us feel that a theory is needed to explain them; at least one hypothesis is needed. It is not enough for the hypothesis to explain the facts, since many other hypotheses could be found to explain them. Instead, the hypothesis has to be verified in the successive phases.
(b) Deduction (Consequences Are Sought)
The second phase of the investigation consists of exploring the possible consequences of the hypothesis through logical analysis. The hypothesis that has arisen is provisionally adopted and must be explained and specified through deduction: “The first thing that will be done, as soon as a hypothesis has been adopted, will be to trace out its necessary and probable experiential consequences. This step is deduction” (EP 2: 95; CP 7.203, 1901). The hypothesis is subject to growth and tends to become more and more defined: by deduction, the idea becomes more precise. “This testing,” Peirce writes, “to be logically valid must honestly start, not as Retroduction [another name for ‘abduction’] starts, with scrutiny of the phenomena, but with examination of the hypothesis, and a muster of all sorts of conditional experiential consequences which would follow from its truth” (CP 6.470, 1908). That is, deduction consists in analyzing the first hypothesis to see what possible consequences could be derived from it.
(c) Induction (Facts Are Sought)
This final stage starts from a hypothesis that seems to recommend itself because of its possible consequences. We feel that facts are needed to support the hypothesis, and the study of the hypothesis will suggest the experiments that will be needed to gather these facts. 
Peirce writes: The purpose of Deduction, that of collecting consequents of the hypothesis, having been sufficiently carried out, the inquiry enters upon its Third Stage, that of ascertaining how far those consequents accord with Experience, and of judging accordingly whether the hypothesis is sensibly correct, or requires some inessential modification, or must be entirely rejected. (CP 6.472, 1908)
The consequences that had been deduced have to be experimentally verified, through induction, in the experimental verification phase. It is time to do experiments and compare them with the predictions that were made. Induction is the operation that leads to assent, even if it is always provisional, since the achievements of science can always be improved or refuted in the future. Only after induction can significant value be attached to the creative hypothesis, a value that resides, as Anderson (1987, p. 53) explains, either in the non-refutation of deductively drawn conclusions or in the actual occurrence of the predicted conclusions. In short, abduction suggests that something may be, deduction proves that something must be, and induction shows that something actually is operative (CP 5.171, 1903).
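The interplay of the three stages can be made concrete with a deliberately small sketch. It is only an illustration under stated assumptions, not a formalization proposed by Peirce or by the authors: the `Hypothesis` class, the function names, the observation labels, and the rival conjectures are all hypothetical, and the scenario merely borrows Peirce's own illustration of fish fossils found far inland (CP 2.625). In a real inquiry deduction derives consequences by reasoning, whereas here they are simply read off a table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    name: str
    # What we should observe if the hypothesis were true (hypothetical labels).
    predictions: Dict[str, bool]


def abduce(surprising_fact: str, candidates: List[Hypothesis]) -> List[Hypothesis]:
    """Abduction: keep the conjectures under which the fact is 'a matter of course'."""
    return [h for h in candidates if h.predictions.get(surprising_fact) is True]


def deduce(hypothesis: Hypothesis, surprising_fact: str) -> Dict[str, bool]:
    """Deduction: collect the further consequences the hypothesis commits us to."""
    return {obs: val for obs, val in hypothesis.predictions.items() if obs != surprising_fact}


def induce(consequences: Dict[str, bool], experience: Dict[str, bool]) -> bool:
    """Induction: check how far the deduced consequences accord with experience."""
    return all(experience.get(obs) == val for obs, val in consequences.items())


def inquire(surprising_fact: str, candidates: List[Hypothesis],
            experience: Dict[str, bool]) -> List[str]:
    """Run the three stages; whatever survives is accepted only provisionally."""
    return [
        h.name
        for h in abduce(surprising_fact, candidates)
        if induce(deduce(h, surprising_fact), experience)
    ]


# Toy data (assumed for illustration only).
FACT = "fish_fossils_inland"
CANDIDATES = [
    Hypothesis("the sea once covered this land", {FACT: True, "marine_sediments_nearby": True}),
    Hypothesis("travellers dropped the fossils", {FACT: True, "fossils_only_along_roads": True}),
    Hypothesis("the land was always dry", {FACT: False}),
]
EXPERIENCE = {FACT: True, "marine_sediments_nearby": True, "fossils_only_along_roads": False}

print([h.name for h in abduce(FACT, CANDIDATES)])  # conjectures that explain the fact
print(inquire(FACT, CANDIDATES, EXPERIENCE))       # conjectures that also survive testing
```

Two conjectures explain the surprising fact, which is why explaining it is not enough; only the one whose further consequences agree with experience survives, and even that result stays open to later refutation.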
Each of these stages – abduction, deduction, and induction – involves in turn a variety of operations. Abduction includes all those operations by which concepts and theories are engendered; deduction, which must review the possible experiential consequences that would follow from the truth of the hypothesis, includes two parts: the explanation of the hypothesis through logical analysis and the deductive demonstration or argumentation (CP 6.470–471, 1908); induction, in turn, includes classification, by which general ideas are attached to objects of experience; evidences, or evidentiary arguments; and the evaluation of this evidence until a final judgment is expressed on the entire result (CP 6.472, 1908). The question that can be posed at this point is: Does creativity intervene in the deductive and inductive phases, that is, not only in the invention of the new hypothesis but also in the testing process? There are no clear texts by Peirce that directly answer this question, but there are a few that can shed some light. For example, the entry “Scientific Method” that Peirce wrote for the Dictionary of Philosophy and Psychology asserts: “The testing of the hypothesis proceeds by deducing from it experimental consequences almost incredible, and finding that they really happen, or that some modification of the theory is required, or else that it must be entirely abandoned” (CP 7.83, 1901; our italics). That is to say, the scientific method is not merely the application of mechanical rules or the seeing of what is evident: the testing phase, too, essentially requires us to go beyond what is before our eyes to the “almost incredible,” and therefore needs a certain dose of creativity. When considering the clues that can be found in Peirce’s thought for understanding the phenomenon of creativity, much attention has been paid to abduction. 
Peirce attributes the emergence of new ideas to abduction, but much less attention has been paid to this role of deductive reasoning. It may seem that necessary – deductive – reasoning has little to do with creativity. However, this is not the case. In fact, around 1896, Peirce explained deduction as diagrammatic reasoning, that is, as a sort of reasoning that required mental experiments: Deduction is that mode of reasoning which examines the state of things asserted in the premisses, forms a diagram of that state of things, perceives in the parts of that diagram relations not explicitly mentioned in the premisses, satisfies itself by mental experiments upon the diagram that these relations would always subsist, or at least would do so in a certain proportion of cases, and concludes their necessary, or probable, truth. (CP 1.66, c.1896)
In this type of reasoning, an imaginative study of an individual scheme is carried out: “We form in the imagination some sort of diagrammatic, [ . . . ] representation of the facts, as skeletonized as possible. [ . . . ] This diagram, which has been constructed to represent intuitively or semi-intuitively the same relations which are abstractly expressed in the premisses, is then observed, and a hypothesis suggests itself that there is a certain relation between some of its parts” (CP 2.778, 1901). Deduction, or diagrammatic reasoning, consists therefore in drawing a diagram of a hypothetical state of things and proceeding to observe it. This observation suggests to us that something may be true, it is formulated with greater or lesser precision, and then we proceed to investigate whether it is true or not.
Peirce points out that all necessary reasoning, like mathematical reasoning, requires imagination and originality: It is hardly credible however that there is anybody who does not know that mathematics calls for the profoundest invention, the most athletic imagination, and for a power of generalization in comparison to whose everyday performances the most vaunted performances of metaphysical, biological, and cosmological philosophers in this line seem simply puny. (CP 4.611, 1908)
It is not possible to know without imagining, and neither to reason without imagining. In this sense, some scholars have shown that deductive thinking plays a fundamental role in the creative process. Kathleen Hull pointed out in 1994 that mathematical reasoning through the construction of diagrams is the paradigm for all thinking, including creative thinking (Hull, 1994, p. 273). Beverly Kent also stated that – according to Peirce – creative thinking is due to mental manipulation of diagrams (Kent, 1997, p. 445). More recently, Michael Hoffmann has stated that diagrammatic reasoning, in combination with abduction, can be extended to a general theory of scientific discovery and creativity in general (Hoffmann, 2006). Sun-Joo Shin has also argued that abduction and deduction must be combined and have to do with creativity: “Both processes —constructing a hypothesis and constructing a theorematic step— require a creative mind to suggest an ingenious and insightful guess” (Shin, 2016, p. 68). When something is mentally represented, it is possible to play with the problem and approach it from new perspectives: here lies the creativity of diagrammatic reasoning. It is possible to detect relationships between the parts of the diagram other than those used in its construction (W 5: 163–165; CP 3.363, 1885). It is here where limitations are seen and new possibilities arise. This type of thinking alternates with the abductive phases in the creative or research process. In this sense, diagrammatic reasoning is precisely the one that facilitates new abductions (Hoffmann, 2006, p. 10), since the diagram can lead to something already expected but also to something that contradicts our expectations. 
It can be concluded that, in any research process, the abductive, deductive, and inductive phases have to be joined together, and it is necessary to assume that the deductive and the inductive phases – with their emphasis on verification through experience – are as important as the abductive phase and also need creativity. Only in this way is it possible to escape a conception of creativity as a brilliant intuition. Policastro and Gardner pointed out this idea about the process in their study on creativity: “Early intuitions require the support of other cognitive processes and long periods of persistent work before they can be successfully articulated into valuable final products” (Policastro & Gardner, 1999, p. 220). More than a stroke of luck, the success of science or art depends on continued work. Experience is key to creativity, but so is experimental analysis, which for Peirce plays an important role both in logic and in the history of science (CP 7.277, n.d.). If abduction and induction provide us with a contact with reality, the data from which the first ideas will emerge, and their effective verification, deduction allows us to handle those ideas, order them, analyze them, and experiment on them without the limits of reality, in such a way that all the fecundity they contain is achieved.
Knowledge cannot advance without abduction; nor is it possible to think in a rigorous and fertile way without theoretical reasoning (NEM 4: 49, 1902). The highest degree of originality, however, will be that of the new idea obtained through abduction, as pointed out by Peirce in a manuscript dated around 1903 and entitled in the Robin catalogue “On Five Grades of Originality in Logic, with Illustrations from the History of Logic” (MS 816, c.1903). The next section studies the peculiar combination of reason, instinct, and imagination that makes abduction possible.
Abduction as Engine of Creativity
Peirce holds the premise that truth cannot be discovered by chance; it would be impossible, he says, to guess the right hypothesis among the infinity of possible hypotheses only by chance. There must be some capacity that makes us right quite often. “How is it that man ever came by any correct theories about nature?” Peirce asks in 1903. He answers: We know by Induction that man has correct theories; for they produce predictions that are fulfilled. But by what process of thought were they ever brought to his mind? A chemist notices a surprising phenomenon. Now if he has a high admiration of Mill’s Logic, as many chemists have, he will remember that Mill tells him that he must work on the principle that, under precisely the same circumstances, like phenomena are produced. Why does he then not note that this phenomenon was produced on such a day of the week, the planets presenting a certain configuration, his daughter having on a blue dress, he having dreamed of a white horse the night before, the milkman having been late that morning, and so on? The answer will be that in early days chemists did use to attend to some such circumstances, but that they have learned better. How have they learned this? By an induction. Very well, that induction must have been based upon a theory which the induction verified. How was it that man was ever led to entertain that true theory? You cannot say that it happened by chance, because the possible theories, if not strictly innumerable, at any rate exceed a trillion – or the third power of a million; and therefore the chances are too overwhelmingly against the single true theory in the twenty or thirty thousand years during which man has been a thinking animal, ever having come into any man’s head. (CP 5.591, 1903)
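Peirce's arithmetic can be checked with a few lines of code. The sketch below takes from the quotation only the two figures it states, the third power of a million and thirty thousand years; the rate of one blind guess per minute, the equal likelihood of all theories, and the variable names are assumptions introduced here purely to give an order of magnitude.

```python
# Figures stated in the quotation (CP 5.591).
POSSIBLE_THEORIES = (10**6) ** 3   # "the third power of a million" = 10**18
YEARS = 30_000                     # upper figure for man as "a thinking animal"

# Assumptions made for illustration only.
MINUTES_PER_YEAR = 365 * 24 * 60   # leap years ignored
GUESSES = YEARS * MINUTES_PER_YEAR # one blind guess per minute, never repeated


def chance_of_hit(guesses: int, theories: int) -> float:
    """Chance that distinct, uniformly random guesses include the one true theory."""
    return min(1.0, guesses / theories)


print(f"{GUESSES:,} guesses")                                # 15,768,000,000 guesses
print(f"{chance_of_hit(GUESSES, POSSIBLE_THEORIES):.1e}")    # 1.6e-08
```

Even under this generous guessing rate the chance stays below two in a hundred million, which is the point of the passage: chance alone cannot account for the frequency with which correct hypotheses are hit upon.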
The human mind has a natural adaptation for imagining correct theories through abduction (CP 5.591, 1903). We owe every discovery and every creation to abduction: “Not the smallest advance can be made in knowledge beyond the stage of vacant staring, without making an abduction at every step” (HP 2: 900, 1901), Peirce writes. Abduction “consists in examining a mass of facts and in allowing these facts to suggest a theory” (CP 8.209, 1905). On other occasions, he refers to abduction as “the process of forming an explanatory hypothesis” or “the only logical operation which introduces any new idea” (CP 5.171, 1903). An example of abduction would be a doctor who considers a patient’s symptoms “surprising.” The doctor takes note of these symptoms and looks for a diagnosis that widens his view, one under which the symptoms that the patient reports appear as the result of a suspected disease (Niño, 2001).
A further example of abduction is the detective who, like Sherlock Holmes or Auguste Dupin, solves an enigma from a few clues (Eco & Sebeok, 1983). Although detective powers have traditionally been attributed to deduction, these authors argue, following Peirce, that they are actually clear instances of abduction, that is, of reasoning by conjecture. The same can be said of the main part of the historian’s work (Viola, 2020). Peirce himself provided many examples of abduction: I once landed at a seaport in a Turkish province; and, as I was walking up to the house which I was to visit, I met a man upon horseback, surrounded by four horsemen holding a canopy over his head. As the governor of the province was the only personage I could think of who would be so greatly honored, I inferred that this was he. This was an hypothesis. Fossils are found; say, remains like those of fishes, but far in the interior of the country. To explain the phenomenon, we suppose the sea once washed over this land. This is another hypothesis. (CP 2.625, 1893)
Abduction is a peculiar leap of the mind, a reasoning through hypotheses, that is, through the explanation that arises spontaneously when pondering what has surprised us in a specific circumstance. Although it would not be possible without prior knowledge, Peirce assigns abduction an original character (CP 5.181, 1903) and characterizes it as an extremely fallible reasoning: The abductive suggestion comes to us like a flash. It is an act of insight, although of extremely fallible insight. It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation. (CP 5.181, 1903)
Despite being the weakest and most insecure type of reasoning, abduction is the most fruitful (CP 8.385–388, 1913). It is, according to Peirce, the highest kind of synthesis, that which the mind is obliged to do, not by the internal attractions of the feelings, nor by a transcendental force of necessity, but in the interest of intelligibility, and which consists in introducing an idea not contained in the data, which causes connections that otherwise we would not have had (W 6: 187; CP 1.383, c.1890). Some features of this mode of reasoning are summarized as follows (Barrena & Nubiola, 2020):
(a) Abduction Springs from Experience
As said above, Peirce was deeply impressed by the phenomenon of the introduction of new ideas in scientific research, which is totally unexplained by a mere calculation of probabilities. But how does the right new idea spring up? Peirce writes: “[We] can know nothing except what we directly experience.” And a few lines below he adds: “Where would such an idea [ . . . ] come from, if not from direct experience?” (CP 6.492–3, 1908). But what does Peirce mean by direct experience? Peirce calls musement the peculiar experience that makes possible the surprise from which abduction arises. Musement is a peculiar state of mind that goes free, loose, from one thing to another, without following any rules. Peirce characterizes it as the pure disinterested play of the mind that “involves
no purpose save that of casting aside all serious purpose.” It “has no rules, except this very law of liberty” (CP 6.458, 1908). Musement is not limited to scientific study or logical analysis, and it is precisely in this non-reduction to science or logic that Peirce sees the much broader possibilities it offers. It is a mental state of free speculation, without limits of any kind, in which the mind plays with ideas and can dialogue with what it perceives: a dialogue consisting not only of words but also of images, in which imagination plays an essential role. So, continuing the counsels that had been asked of me, I should say, “Enter your skiff of Musement, push off into the lake of thought, and leave the breath of heaven to swell your sail. With your eyes open, awake to what is about or within you, and open conversation with yourself; for such is all meditation.” It is, however, not a conversation in words alone, but is illustrated, like a lecture, with diagrams and with experiments. (CP 6.461, 1908)
Musement can take different forms: it “may take either the form of aesthetic contemplation, or that of distant castle-building [ . . . ], or that of considering some wonder in one of the Universes” (CP 6.458, 1908). Musement constitutes a peculiar experience, which for Peirce can only be understood from the rejection of the nominalist idea of experience as the mere first impressions of sense: “These ‘first impressions of sense’ are hypothetical creations of nominalistic metaphysics: I for one deny their existence. But anyway even if they exist, it is not in them that experience consists. By experience must be understood the entire mental product” (CP 6.492, 1908).
(b) Abduction Has a Logical Form
For Peirce, abduction has a logical structure (CP 5.171, 1903). The form of this type of inference is the following: “The surprising fact, C, is observed; But if A were true, C would be a matter of course. Hence, there is reason to suspect that A is true.” (CP 5.189, 1903)
Abduction is, therefore, a type of reasoning, and is not as mysterious as it might seem at first glance, but rather includes a series of operations of the mind that can be accounted for: The whole series of mental performances between the notice of the wonderful phenomenon and the acceptance of the hypothesis, during which the usually docile understanding seems to hold the bit between its teeth and to have us at its mercy, the search for pertinent circumstances and the laying hold of them, sometimes without our cognizance, the scrutiny of them, the dark laboring, the bursting out of the startling conjecture, the remarking of its smooth fitting to the anomaly, as it is turned back and forth like a key in a lock, and the final estimation of its Plausibility, I reckon as composing the First Stage of Inquiry. Its characteristic formula of reasoning I term Retroduction [or Abduction], i.e. reasoning from consequent to antecedent. (CP 6.469, 1908)
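The schema of CP 5.189 can be set beside the familiar deductive pattern. The rendering below is a sketch in modern notation, which is not Peirce's own; in particular, writing the subjunctive “if A were true, C would be a matter of course” as a plain conditional $A \rightarrow C$ is a simplifying assumption.

```latex
\[
\text{Abduction:}\quad
\frac{C \qquad A \rightarrow C}{\therefore\ \text{there is reason to suspect that } A}
\qquad\qquad
\text{Deduction (modus ponens):}\quad
\frac{A \qquad A \rightarrow C}{\therefore\ C}
\]
```

Read as a deduction, the left-hand pattern would be the fallacy of affirming the consequent, which is why its conclusion is only a suspicion to be tested and never a proven truth. This is the sense in which the reasoning runs “from consequent to antecedent.”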
As a form of reasoning, rational control over abduction is possible, because all inference is essentially deliberate and self-controlled, even if that control is weak:
“Abduction furnishes all our ideas concerning real things, beyond what are given in perception, but is mere conjecture, without probative force” (CP 8.209, c.1905). The methodological and logical aspects of the creative process imply that, in a way, it can be learned and developed, though it is not an exact process.
(c) The Imaginative Leap of Abduction
The logic of abduction is combined with a leap of imagination, which implies embracing a new conception of logic different from the rationalist one. Hull has affirmed that achieving a harmony between creativity and logic was one of Peirce’s underlying philosophical tasks, and in order to do so, he had to reformulate logic itself in a radically new way (Hull, 1994, p. 271). The special nature of abduction turns the logical process leading to discovery – without ceasing to be logical – into a mixture of several factors, not just rationality, that explain the surprising and unexpected nature of the new finding. Among these factors are imagination and instinct, without which we could not come up with possible solutions nor hit on the hypothesis that has the best fit. So, creativity and logic are not mutually exclusive: creative thinking combines imagination and logical rigor, thanks to abduction. For Peirce, abduction has imagination at its core. The play of imagination is an essential part of any scientific, investigative, or artistic activity. Without imagination, for example, the construction of any scientific hypothesis would not be possible. Abduction involves the development of ideas in the inner world of imagination, ideas that allow the advancement of knowledge: Human instinct is no whit less miraculous than that of the bird, the beaver, or the ant. 
Only, instead of being directed to bodily motions, such as singing and flying; or to the construction of dwellings, or to the organization of communities, its theatre is the plastic inner world, and its products are the marvellous conceptions of which the greatest are the ideas of number, time, and space. (MS 318, c.1907)
Peirce grants imagination a capacity that has not always been taken into account: When a man desires ardently to know the truth, his first effort will be to imagine what that truth can be. He cannot prosecute his pursuit long without finding that imagination unbridled is sure to carry him off the track. Yet nevertheless, it remains true that there is, after all, nothing but imagination that can ever supply him an inkling of the truth. He can stare stupidly at phenomena; but in the absence of imagination they will not connect themselves together in any rational way. (CP 1.46, c.1896)
(d) Abduction Is the Form of Reasoning Closest to Instinct
Peirce speaks of abduction as inference, but also as insight (CP 5.181, 1903). To explain this peculiar combination of logic and instinct, MaryAnn Ayim coined the expression “rational instinct,” which presents reason and instinct as complementary in the acquisition of knowledge, as can be seen in the scientific method (Ayim, 1982, p. 18). Peirce attributes great strength to the instinctive character of the mind. Peirce states that “the obscure part of the mind is the principal part” and that instinctive
inferences answer questions with curious accuracy (CP 6.569, c.1905). Peirce even talks about “the instinct which we call reason” (MS 668, n.d., 14). Around 1906, referring to the importance of the instinct for knowledge, Peirce writes: “But whatever there may be of argument in all this is as nothing, the merest nothing, in comparison to its force as an appeal to one’s own instinct, which is to argument what substance is to shadow, what bed-rock is to the built foundations of a cathedral” (CP 6.503, c.1906). This instinctive capacity of abduction (insight) should not be confused with intuition, since instinct remains in a certain sense mediated, subject to interpretation and fallible. In fact, on the one hand, Peirce affirms that this faculty is of the same nature of instinct, resembling the instincts of the animals: “It resembles instinct too in its small liability to error; for though it goes wrong oftener than right, yet the relative frequency with which it is right is on the whole the most wonderful thing in our constitution” (CP 5.173, 1903). But, on the other hand, instincts, unlike the supposed immediacy of intuition, are susceptible to scientific investigation (Ayim, 1982, p. 90). Peirce asserts that all human knowledge, even the highest achievements of science, is nothing but the development of our natural instincts (CP 6.604, 1893; W 5: 450, CP 2.754, 1883). That natural light on which abduction is based – il lume naturale in Galileo’s expression (Nubiola, 2004) – is a peculiar instinct because it requires testing. It would not serve for the effective advancement of knowledge without a subsequent test, since abduction is fallible: “True, we are driven oftentimes in science to try the suggestions of instinct; but we only try them, we compare them with experience, we hold ourselves ready to throw them overboard at a moment’s notice from experience” (CP 1.634, 1898). 
Il lume naturale – the natural light of human reason – explains the emergence of the conjecture and also explains why some hypotheses should be preferred over others: because they are more natural and simpler (CP 5.60, 1903; 4.35, 1893), more conforming to our instinctive approach to nature, or more familiar to our minds (W 4: 439; CP 2.740, 1883).
(e) Abduction Means Being Able to Reason Backward
Sometimes a hypothesis seems to arise in an almost magical way, but an a posteriori explanation can be given of the paths that were followed, even if they were followed in a not entirely conscious way. From a result, we are able to develop the steps that have led us to that result. As has been said, abduction provides fallible and mediate knowledge, not at all similar to what the defenders of intuition claim, and yet it is also creative. The result of abduction is obtained through inference, and, although at the precise moment in which the new idea occurs to us we may not be aware of the reasons that have led us to it, this does not mean that it has come intuitively out of nowhere: for Peirce, an explanation – at least later – of how an idea is obtained is always possible. Abduction is the logical extreme of thought, the one in which the strategies of thought are faster and less conscious, but not a miracle of the gods.
S. Barrena and J. Nubiola
Correct hypotheses are, therefore, the result of a process, although one not conscious enough to be controlled or, to put it more aptly, a non-controllable and therefore not entirely conscious process. That seemingly magical ability is rational, logical, and creative at the same time; it combines logical rigor with imagination to invent possible explanations. It is a self-controlled process, a peculiar logical reasoning that exercises a form of limited and indirect control (CP 7.45, c.1907). This passive control, which occurs in a mental state in which attention is unfocused and which is often not fully conscious, is nevertheless crucial (Burton, 2000, p. 149) because it allows an association between ideas or images that were not previously connected. Besides the main element that occupies thought in an instant, Peirce writes, there are hundreds of things in our mind to which only a small fraction of attention is given (W 2: 224; CP 5.284, 1868), and the muser has the capacity to make present more things of those than are usually paid attention to. However, the hypothesis that arises can be proven or embodied. A rational control over it is possible, since otherwise it would not make sense to give it value, and that is what differentiates it from mental confusion, delusions, sterile daydreams, or frivolous games (Santaella, 1991, p. 127).
Abduction and Artistic Creativity
Although it may seem that Peirce’s interests were far removed from aesthetics and art, his philosophy, as Hans Joas pointed out, was determined to find a place for artistic creativity in an era characterized by the domination of science (Joas, 1993, p. 4). What Peirce wrote extensively about scientific creativity applies to art as well. There is an artistic abduction by which relationships that had never been established before are obtained. In that sense, as Peirce affirms, the poet’s or the novelist’s work is not so different from that of the scientist: The work of the poet or novelist is not so utterly different from that of the scientific man. The artist introduces a fiction; but it is not an arbitrary one; it exhibits affinities to which the mind accords a certain approval in pronouncing them beautiful, which if it is not exactly the same as saying that the synthesis is true, is something of the same general kind. (W 6: 187; CP 1.383, c.1890)
Artistic abduction must start – like any other – from experience. Artists, Peirce holds, “are much finer and more accurate observers than scientists are, except of the special minutiae that the scientific man is looking for” (CP 1.315, 1903). The ability to observe the world in an aesthetic way, to discern the things around us without judging them, is not something that is done just like that, but requires discipline and training. The artist is the one who has that preparation, who is able to recognize sensations with accuracy, rigor, and depth. Peirce describes that extraordinary capacity for perception in the following way: When the ground is covered by snow on which the sun shines brightly except where shadows fall, if you ask any ordinary man what its color appears to be, he will tell you
56 Creativity and Abduction According to Charles S. Peirce
white, pure white, whiter in the sunlight, a little greyish in the shadow. But that is not what is before his eyes that he is describing; it is his theory of what ought to be seen. The artist will tell him that the shadows are not grey but a dull blue and that the snow in the sunshine is of a rich yellow. That artist’s observational power is what is most wanted in the study of phenomenology. (CP 5.42, 1903)
Therefore, artistic abduction begins, like the scientific one, with a surprising fact, which in the case of art is perhaps a state of restlessness, a feeling that in some sense the world is not as it should be. The artist then tries to fill that void with something original and intelligible that can be interpreted by others. This Peircean idea of art as expression means that the variety of experience and human sensations, albeit diverse and ungraspable, is also rationalizable, because the artist manages to express feelings by giving them shape and embodying them in a third. Artists deal with feelings that are possibilities; they perceive the world in its present being, in its firstness – the Peircean category for what is present and immediate, fresh and new, spontaneous, and evanescent (CP 1.357, 1887) – and play with the imagination giving rise to a thirdness, the category of relationship, of mediation (CP 1.359, 1887), that allows artists to express that firstness in something hard and tangible, something which is there, secondness (CP 1.358, 1887). Art is a play among Peirce’s three categories. Art is creation, discovery of a way to embody reasonableness; it is finding a way to express that which cannot be expressed, to communicate a feeling that is internal by giving it a reasonable form and making it external. The hypothesis that arises through artistic abduction is nevertheless only one of the possible ways in which that quality could be embodied, and it is only a first idea that must be worked on until it acquires its definitive form. Douglas Anderson developed an analogy between scientific and artistic creativity. Anderson considers that abduction in art, as in science, is followed by a phase of deduction and another of induction. It is necessary to explain and “prove” the artistic hypothesis, which otherwise would be reduced to mere emotion. 
Abduction is only the starting point of a process in which the artist loves her idea and lets it develop, allowing it to suggest its own perfection (Anderson, 1987). Based on that first hypothesis, the artist, in the first place, projects what the work of art will be like. Through deduction, the creative idea becomes an existing work of art; it becomes a likeness, a model that can be tested by contemplating it, for example, by making a first design, as Peirce explains: Another example of the use of a likeness is the design an artist draws of a statue, pictorial composition, architectural elevation, or piece of decoration, by the contemplation of which he can ascertain whether what he proposes will be beautiful and satisfactory. (CP 2.281, 1893)
Artists project what the work of art will be like through deduction, taking into account the limitations that experience, time, talent to imagine, etc. can impose on them. Through deduction, the creative idea becomes an existing work of art; it becomes a model that can be tested by contemplating it. The original idea becomes more precise when artists ask themselves how the project is going to turn out and
when they work to find it out. But, as in science, artistic abduction is not infallible. Many times, artists will have to reject their first idea when they see it on paper and find that it does not meet their expectations. In the deductive phase of art, what is sought is not to predict consequences, as in science, but to eliminate possibilities that do not satisfy the artist’s end, sometimes trying again and again: The artist must have the spirit of science. He must try again and again by experimenting with all the means available to him; make every progress and be aware of the progress made by colleagues around him. Naturally, experimentation should be followed by a check of the results. (Lorda, 1991, p. 165; own translation)
In the inductive phase, the truth of the hypothesis that the deduction has specified is verified. Through induction, the artist has to judge his work. Unlike science, art can only be true with respect to itself, insofar as it fulfills its purpose of creating the admirable in itself, as indicated by aesthetics. It is not a question of seeing if there is correspondence with the facts but of seeing if the work is admirable, if it satisfies its purpose, that is, if it has managed to express a feeling beautifully by making it reasonable. The artist has to judge for himself and also submit his work, as in science, to the final judgment of a community in an indefinite time. Generations either approve of works of art or not, and art movements change. Peirce, therefore, does not support an aesthetic subjectivism in which anything goes: the work of art has to fulfill its purpose. These three stages, as in science, are intermingled until giving rise to a work of art that will always be incomplete in some sense. The work of art, as a sign, can always grow to adapt to new interpretations. In this sense, Anderson points out that the work is finished, but not complete. The work of art can grow by the interpretation of a community. “The fulfillment of the creative act is not complete without sharing it” (Bergmann & Colton, 1999, p. 115). The process of artistic creation is therefore mechanical neither in its beginning nor in its development: abduction is not the result of an automatic process but involves finding one of the many possible ways in which the quality can be embodied. There is a margin of spontaneity, as in the scientific process, and the phases after the first abduction involve self-critical corrections, the elimination of errors that are part of the same creative process, and even new abductions. Many times, in the execution of the work itself, in the materialization of the idea, the artist’s plan is modified. 
There is feedback, a give and take between the ideal plan and the emerging concrete work. The ideal image held at the beginning is constantly changing due to the work that is actually done (Popper, 1992, pp. 229–230). There is more freedom in artistic abduction than in the scientific one since it seeks to capture sensations, restlessness, and feelings, while the scientist, in turn, seeks rational explanations. Scientific hypotheses, although they are also creations, can only afford to be original if they explain the facts in question (Anderson, 1987, p. 44). Imagination in science is not entirely free, since hypotheses cannot be released from the hand of reality. Speaking of Kepler’s work, Peirce thus distinguished the scientific imagination from the poetic one:
What kind of an imagination is required to form a mental diagram of a complicated state of facts? Not that poet-imagination that ‘bodies forth the forms of things unknowne,’ but a docile imagination, quick to take the Dame Nature’s hints. The poet-imagination riots in ornaments and accessories; a Keppler’s makes the clothing and the flesh drop off, and the apparition of the naked skeleton of truth to stand revealed before him. (W 8: 290, 1892)
Science is interested in discovering the truth, in conforming reason to the facts of experience. The artist, on the other hand, seeks to create what is admirable in itself. While scientific reasoning ends with reasonable ideas, art ends with reasonable feelings (Anderson, 1987, p. 60). But both science and art are driven by abduction.
Abduction and Creative Life: Self-Control
For Peirce, all thought occurs in the imagination: “A belief-habit in its development begins by being vague, special, and meagre; it becomes more precise, general, and full, without limit. The process of this development, so far as it takes place in the imagination, is called thought” (W 4: 164; CP 3.160, 1880). Imagination is responsible for the development of ideas and the growth of our reasonableness. The resolutions and exercises of the inner world can affect real determinations and habits, and therefore the inner movements of imagination are not mere fantasies but rather real agents, since they truly have an external effect. Our internal meditations thus become a guide for action. As Andacht has written: The traveler on the platform who imagines herself taking journeys that she does not intend to do, while contemplating advertisements that announce them, a child who dreams of being a possible savior of someone who has not yet suffered an accident like the one that then, effectively, happens, and he who dedicates his daily endeavors to science, and considers possible some states of the world, if certain conditions occur, have in common a real adherence to ‘an imaginary line of behavior’. (Andacht, 1996, p. 1271; own translation)
Imagination is of radical importance in the behavior of human beings; it allows us to face and order our experience, exercise control over ourselves, and open ourselves to others, and it gives us the possibility of establishing relationships. How could we, without imagination, empathize with another, put ourselves in their place, suffer with the one who suffers, and rejoice with the one who rejoices? Imagination allows us, says Peirce, to go out of ourselves to truly be in others: “When I communicate my thought and my sentiments to a friend with whom I am in full sympathy, so that my feelings pass into him and I am conscious of what he feels, do I not live in his brain as well as in my own — most literally?” (W 1: 498; CP 7.591, 1866). Only with the help of imagination can we inhabit a common world and mark the end to which we want to lead our efforts. Abduction, as in science and art, is key to ordinary life. Far from understanding Peirce’s pragmatism as an exaltation of practical action, it is possible to see in him a program to live creatively and reach the summum bonum that Peirce considers the ultimate goal. Living creatively consists of being guided by reason, which can develop and invent ways to grow, which has the capacity to go beyond what is given, acting in interconnection with imagination and with the rest of the human capacities.
Living creatively consists of embodying the ideal in our life, which for Peirce is none other than the growth of reasonableness. From the Peircean perspective, aesthetics is the normative science in charge of pointing out what the end is (Barrena, 2015, pp. 142–161), and the only ultimate good to which all practical facts should be directed, Peirce holds, is the development of “concrete reasonableness” (CP 5.3, 1901; 2.34, note 2, c.1902), since reason always looks for something further and hopes to improve its results. Aesthetics points not only to works of art, but it states also that our actions should be directed to the incarnation of the rational in the sensible until reaching a peculiar and beautiful balance: “The esthetic Quality appears to me to be the total unanalyzable impression of a reasonableness that has expressed itself in a creation. It is a pure Feeling but a feeling that is the impress of a Reasonableness that Creates. It is the Firstness that truly belongs to a Thirdness in achievement of a Secondness” (MS 310: 9, 1903). That’s something Peirce considers from experience: Every motive involving dependence on some other leads us to ask for some ulterior reason. The only desirable object which is quite satisfactory in itself without any ulterior reason for desiring it, is the reasonable itself. I do not mean to put this forward as a demonstration; because, like all demonstrations about such matters, it would be a mere quibble, a sheaf of fallacies. I maintain simply that it is an experiential truth. (CP 8.140, 1901)
The reasonable is the general ideal that through our actions and self-control is embodied in concrete aspects. Individual action is a means for the development of embodied ideas, of the reasonable, which for Peirce constitutes the end for which heaven and earth have been created (CP 2.122, c.1902). “We are all putting our shoulders to the wheel for an end that none of us can catch more than a glimpse at —that which the generations are working out. But we can see that the development of embodied ideas is what it will consist in” (CP 5.402, note 2, 1893). In this way, Peirce understands by “reason” something that somehow is not complete, that is evolving, and that differs from the human faculty that has been called reason from a rationalist perspective. Peirce’s “reason” might perhaps better be called reasonableness (Nubiola, 2009): Consider, for a moment, what Reason, as well as we can today conceive it, really is. I do not mean man’s faculty which is so called from its embodying in some measure Reason, or {Nous}, as a something manifesting itself in the mind, in the history of mind’s development, and in nature. What is this Reason? In the first place, it is something that never can have been completely embodied. The most insignificant of general ideas always involves conditional predictions or requires for its fulfillment that events should come to pass, and all that ever can have come to pass must fall short of completely fulfilling its requirements. [ . . . ] So, then, the essence of Reason is such that its being never can have been completely perfected. It always must be in a state of incipiency, of growth. [ . . . ] This development of Reason consists, you will observe, in embodiment, that is, in manifestation. [ . . . ] I do not see how one can have a more satisfying ideal of the admirable than the development of Reason so understood. 
The one thing whose admirableness is not due to an ulterior reason is Reason itself comprehended in all its fullness, so far as we can comprehend it. (CP 1.615, 1903)
To live creatively is to live according to that reason: there should be a consistency between what is done and what is thought (W 2: 241; CP 5.315, 1868), that is, to
invent, through abduction, new possibilities of action that bring us closer to that reasonableness. Pragmatism, also in this order, supposes generating and protecting possible consequences; it does not advocate concrete actions, but conceivable ones, and allows us to delineate possible forms of controlled behavior. One of the functions of the pragmatist reason is, therefore, to invent new possibilities for the future. With self-control, we are able to give action a different course than the normal one: “In my opinion,” Peirce asserts, “it is self-control which makes any other than the normal course of thought possible, [ . . . ] Where there is no self-control, nothing but the normal is possible” (CP 4.540, 1905). Thanks to self-control, human beings are not determined by circumstances, but they are capable of extraordinary things. Man will perceive his impulses and see the need for self-government, and then he will be bound by a government that will be the fruit of his own reasonable act, by a free government, which will be very different from being bound by nature, Peirce says (MS 675; EP 2: 459, 1911). Self-control consists of cultivating habits that will become sources of freedom (MS 930, n.d.). According to the pragmatist conception, we have the possibility of self-controlling our behavior so that it approaches the end, that is, of living according to reason and thus being the owners and authors of our own behavior. Peirce attached great importance to the concept of self-control not only as a personal quality but also as a philosophical concept (Raposa, 2003, p. 128). As Raposa has written, Peirce “clearly regarded human life as a struggle, a contest requiring of its participants both a heroic indifference and a supreme self-control” (Raposa, 2003, p. 126). Living creatively is controlling our actions and making them reasonable, introducing new intelligibility. 
The key to understanding life as a creative activity – as in science – lies in the notion of abduction, although Peirce does not explicitly discuss abduction on ethical grounds. In our life, we sometimes invent alternatives or we decide to take something in a different way than the one that seems to be determined. We invent a solution to a problem that at other times seemed insoluble to us, and – remember what Peirce said – we are generally capable of behaving in a different way than normal, in a way that we ourselves invent. There is not only one possible option or only one way to do good, but human beings can invent many possibilities. In human action there is, as in science, a construction and selection of hypotheses that is “a conscious, deliberate, voluntary and controlled conduct, and thus open to criticism at every step” (Fann, 1970, p. 40). These hypotheses, which embody the novelty in our own life, require deduction and induction; they need to be tested and are proven, like scientific ones, in practice. As in science, practical consequences constitute the definitive test of hypotheses, and ideals receive their ultimate confirmation in behavior: “If [ . . . ] the future development of man’s moral nature will only lead to a firmer satisfaction with the described ideal, the doctrine is true” (CP 5.566, 1901). The actions that originate the hypotheses when put into practice and the satisfaction that they produce allow us to accept them, so that they give rise to habits that are consolidated, or to reject them and replace them with others. Abductions are thus verified according to the behavioral habits that they originate.
Around 1905, Peirce wrote: “A man can be his own training-master and thus control his self-control. When this point is reached much or all the training may be conducted in imagination” (CP 5.533, c.1905). Peirce did not develop this theme extensively, although he clearly pointed out that habit formation through imaginary action is one of the essential elements for moral self-control (CP 5.440, 1905). Imagination is used for deliberation, for generating and exploring the possibilities that open up in a situation. Self-control “can often entail deliberations requiring a person imaginatively to conceive of the distant future, of the remote effects and long term consequences of behavior” (Raposa, 2003, p. 129). Maintaining self-control will be more difficult for those who never prepare or think about the future, and imagination is where much of that preparation occurs (Raposa, 2003, p. 133). To behave creatively requires the ability to imagine oneself in different situations and conditions; it requires putting oneself in the place of others, broadening one’s perspective, and enriching oneself with the experience of others. All of this requires imagination, feeling, and abductive capacity. The conception of a creative life in the pragmatist sense does not enforce strict rules, but rather leaves the freedom to invent the ways in which the ultimate aim will be approached; it only prompts us to behave according to reason. In this regard, Johnson has pointed out in a graphic way that in our lives, we continually portray situations, delineate characters, formulate problems, and mold events; we compose situations and build relationships. “When we act we engage in various forms of creative making” (Johnson, 1993, p. 212). Our behavior can be soaked in creativity, and it must be so in order to grow. Good is imaginative, more than evil, which destroys us insofar as it takes us away from the ultimate end. 
What Peirce writes can be interpreted in this sense: “The only moral evil is not to have an ultimate aim” (CP 5.133, 1903), an aim that can be consistently adopted and pursued, and that is the same under all circumstances (CP 5.136, 1903). Action can be colonized by reason, thus opening up a space for creative freedom in the field of practical life that is marked by self-control and by the ability to move freely toward an end. That is, for Peirce, the only freedom of which man has reason to be proud (CP 5.339, note 1, 1893).
Some Peircean Keys to Foster Creative Thinking
(a) Be Aware that You Can Improve Your Process of Reasoning
To learn how to think more creatively and effectively, the first thing to do is to want to do it, that is, to realize that we do not yet know how to think. It is a fact that human beings think, but they can learn to think better. Peirce argued that there is a logica docens and a logica utens (CP 2.186–190, c.1902): logica utens is a rudimentary sense of logic in use, a general method by which everyone obtains truths, even without being aware of doing so and without being able to specify what their method was. Logica docens is the scientific method of logic practiced by logicians and scientists, doctors, detectives, and experts that can be taught and learned consciously. It is a method developed to uncover the truth
and think better. Our task, then, is to move from logica utens to logica docens, to develop our natural logic and become aware of our thought processes in order to improve them.
(b) Learn to Find Possible Explanations Broadening Your Perspective
To make good guesses, some distance is needed, that is, to look beyond what is before the eyes. Sometimes, it is also necessary to change the mental constructs and find more than a single type of response. It may be useful not to dedicate all one’s attention to the matter at hand, but to let it get out of focus. As seen before, there is a very useful tool to defocus our attention, introduce new perspectives, and develop the imagination: musement; it could also be called daydreaming or mental game. Letting the mind wander is usually a good technique for introducing new perspectives on an issue (Martindale, 1999). Encouraging this “play” with ideas and developing concepts in their logical implications of interdependence and relationships – without any reference to their application or their real existence – that is, the development of a conceptual map, has great benefits. Singer has written that much of the imaginative activity takes the form of daydreaming, which can be considered very similar to the Peircean musement: Much of imaginative thought takes the form of daydreaming, which usually involves shifts of attention away from an immediate task or concrete mental problem to seemingly task-unrelated images or thought sequences. Such daydreams may range from memories to wishful future events, or to playful story-like reshapings of current concerns of the individual or of long-standing desires. (Singer, 1999, p. 14)
Musement, this peculiar daydreaming, a wandering of the mind, is a unique imaginative experience that will enable abduction to arise. In order to be more creative, we should give our minds the time and the possibility of “wandering.” Musement means knowing how to wait, listen, and be attentive to what appears. “In musement, one playfully explores these dispositions, testing their limitations, softening their effects. This disciplined playfulness maintains attention as a living phenomenon. It prevents a person from continuing to experience someone or something in a manner completely defined by habitus” (Raposa, 2003, p. 145).
(c) Develop Your Powers of Observation
In order to improve our logical and creative thinking, one must learn to be observant. Peirce took time to train his faculties of perception and attached great importance to the ability of being impressed, since the feelings that things cause in us will be combined later in an imaginative and rational development. In Peirce’s scientific methodology, that “impressionist” aspect of observation is highly meaningful, since abduction is grounded on a variety of impressions derived from experience that somehow are shaped and become a rational hypothesis. Observation, often unconscious, is always the most important element in practical reasoning (RLT, 182). The power of observation is critical for reasoning, and can be improved. Just as an untrained person can get in shape with regular exercise, says Peirce,
a person whose powers of observation have been neglected can also obtain amazing results by analogous exercises (RLT, 183). Peirce writes: I have gone through a systematic course of training in recognizing my feelings. I have worked with intensity for so many hours a day every day for long years to train myself to this; and it is a training which I would recommend to all of you. The artist has such a training. (CP 5.112, 1903)
The observation of data and the development of our perceptual abilities become, therefore, critically important for following a correct methodology. For a good observation, it is essential to see that which is before us, to see it such as it is, in the present, without replacing it with any interpretation and without allowing any circumstance to change it. The world must be looked at with a gaze free of prejudices, and we must let experience talk to us, realizing that we must not only see the data but also the absence of data. Experience should be perceived such as it is, and then we should be aware of the gaps in it so as to fill them in.
(d) Imagine What May Be the Truth
“It is not too much to say,” Peirce wrote, “that next after the passion to learn there is no quality so indispensable to the successful prosecution of science as imagination” (CP 1.47, c.1896). Imagination has been defined as the “ability of the individual to reproduce images or concepts originally derived from the basic senses but now reflected in one’s consciousness as memories, fantasies or future plans” (Singer, 1999, p. 13). Imagination is able to form images not subject to the here and now of perception, and is able also to freely combine representative contents to build new forms. Imagination is neither a fantasy nor a perception, although the uncontrollable percipuum can be converted, Peirce writes, “into a controllable imagination by a brief process of education” (CP 7.647, 1903). Imagination does not function in an uncontrolled fashion; rather, it can be educated and helps us to grow. One must not confuse the great potential of the imagination with mere fantasy, since for Peirce the castles that are built in the air must be copied, with effort, on the ground: “Every man who does accomplish great things is given to building elaborate castles in the air and then painfully copying them on solid ground. [ . . . ] Mere imagination would indeed be mere trifling; only no imagination is mere” (CP 6.286, 1893). Imagination should stimulate and orient our actions.
(e) Select the Simplest and Most Natural Explanation
In 1908, Peirce wrote: “Modern science has been built after the model of Galileo, who founded it, on il lume naturale. That truly inspired prophet had said that, of two hypotheses, the simpler is to be preferred,” but not simple in the sense of “logically simpler” but “the simpler hypothesis in the sense of the more facile and natural, the one that instinct suggests,” the one that adds the least to what has been observed (CP 6.477, 1908). The solution appears to us as the most reasonable and simple one, in the concrete circumstances. Peirce writes:
Science will cease to progress if ever we shall reach the point where there is no longer an infinite saving of expense in experimentation to be effected by care that our hypotheses are such as naturally recommend themselves to the mind, and make upon us the impression of simplicity, —which here means facility of comprehension by the human mind—, of aptness, of reasonableness, of good sense. (CP 7.220, 1901)
How can one know which hypothesis is simpler and better? Peirce gives a series of criteria to choose the correct hypothesis: it must be capable of being subjected to experimental testing and be such that it explains surprising facts, and it must have the lowest possible cost in terms of economy of research, that is, of time, money, energy, and thought at the time of testing it (CP 7.220, 1901; 5.600, 1903). These three general rules mean that the simplest hypotheses also turn out to be more easily compared with observation, more easily eliminated, and with less expense if they are false; those that can be discarded more quickly must be tested first (CP 6.532–36, 1901), those that appear as more probable because they are based on some experience, and those that risk less and can explain more (CP 7.221, 1901). The proper analysis of a hypothesis involves not being swayed only by what seems “plausible” to us, but always considering also the three factors quoted. These criteria are not a definitive test of the hypothesis: they are only criteria that should guide us when testing its validity. However, if the steps above are followed, Peirce writes, the correct hypothesis can easily be arrived at: The history of science proves that when the phenomena were properly analyzed, upon fundamental points, at least, it has seldom been necessary to try more than two or three hypotheses made by clear genius before the right one was found. [ . . . ] We cannot go so far as to say that high human intelligence is more often right than wrong in its guesses; but we can say that, after due analysis, and unswerved by prepossessions, it has been, and no doubt will be, not very many times more likely to be wrong than right. (CP 7.220, 1901)
(f) Think About the Possible Consequences
Although new ideas sometimes dazzle us, it must not be forgotten that the first hypotheses, however correct they may feel, are just that: hypotheses. The deductive and inductive stages of the methodology should never be neglected, whether in science, in art, or in everyday life. Only after those stages can our ideas possess value as truth. The proper method of research involves learning to think about the consequences of actions and facts, analyzing and separating their components, and then trying to check them one by one, to gather enough convincing support. Peirce’s pragmatism maintained that the idea of something is the idea of its effects, a conviction that ought to be very present in the method followed. The meaning of an intellectual concept is determined by the practical consequences of that concept. Recognizing a concept under its various disguises, or a mere logical analysis, is not sufficient to understand that concept (CP 6.481, 1908): it is necessary to reach a third degree of clarity that can only be obtained through the consideration of the practical effects of the concept. This pragmatist conviction is precisely what reveals to us both the possibility of and the need for being creative, for imagination and creativity are indispensable to ascertain the possible
consequences of something, the facts to which it may lead, and to devise possible paths for further action. Although pragmatism started as a logical method for clarifying concepts, it became a whole way of thinking about investigation, knowledge, and human progress toward truth. In this Peircean conception, it is found that neither the universe nor human life is already done or finished; rather, they are something open that has to be developed in the future, thereby supporting a creative vision.
(g) Creativity Needs Experience and Work
Creativity is often associated with spontaneity, but as Peirce stated, it is not realized without prior knowledge and experience: “The scientific man hangs upon the lips of nature, in order to learn wherein he is ignorant and mistaken: the whole character of the scientific procedure springs from that disposition” (CP 8.118, n.d.). Each thought is a sign for a posterior one; each reasoning involves another reasoning, and creative abduction needs experience to begin: “The order of the march of suggestion in retroduction is from experience to hypothesis” (CP 2.755, c.1905). Contrary to what is sometimes believed, good ideas do not come out of nowhere.
Conclusion
Peirce teaches us that creativity is not a momentary phenomenon. The creative cannot be reduced to a brilliant and occasional intuition, nor does it correspond to just a few moments when new ideas or artistic achievements are obtained. Although it begins and rests in abduction, it requires a whole subsequent work process that is also creative, and without which the end result would not be achieved. On the other hand, creativity is a characteristic that belongs to every human being. Being creative is not exclusive to a privileged few, the “chosen,” but corresponds to all people and can be present in every mental activity. Human beings must grow through habits and realize the ideal of reasonableness. This task demands from each one imagination and creativity in all areas of human life. As Hans Joas has pointed out, pragmatism places creativity in daily human activity and conceives science only as a more pronounced development of that potential (Joas, 1993, p. 239). Creativity becomes the most proper human trait. Computers cannot abduct, surely because they cannot be surprised, because they do not have a body or hands and therefore do not have the real possibility of experience, and because they do not have the imagination to resort – perhaps as if by chance – to a possibility that neither a limited memory nor the algorithms of a machine could ever make present. Machines do not abduct; they can only simulate abduction, because many aspects necessary for human life and for its creative development are left out, such as the senses, imagination, and intuition, aspects that only acquire unity when the human being seeks an end here and now, with his way of being, with his embodied mind, and with his temporal and historical circumstances. As Peirce affirms, without
imagination, some signs could perhaps give rise to others, but their most proper function would never be completed (MS 283, 1905). Abduction, as Peirce conceives it, involves admitting that there is a logical operation whose result is only probable and that it may even be wrong. Despite this, Peirce affirms that it allows us to explain the advancement of knowledge and that it can account for our creative achievements, whether they are large or small. The notion of abduction that Peirce developed, anti-rationalist and framed within continuity, and his pragmatism as the ultimate proof of the abducted hypotheses imply a new understanding of reason and imagination, where not everything is linked to the propositional nor to the data of the senses. An idea of integrative rationality is necessary, as opposed to one that only considers the logical-deductive. In this new notion of reasonableness, imagination and instincts are not at odds with discipline or self-control, but everything is aimed at growth. Thus, a notion of the human being as essentially creative is found, a human being with a radical openness to what challenges each individual, who always knows from her own perspective, from her experience and her situation, but without forgetting that there is a reality that guarantees the truth and guides us toward it. Abstract reason with its fixed ideas, its categorizations, its preconceived molds, and its determined forms separates us from experience, isolates us, divides us. The ego becomes isolated without openness to experience and life. The ego has lost, despite its many achievements, something essential. Faced with this, it is necessary to leave rigid answers behind and begin a search for what is reasonable, thus making possible continuous growth that allows human beings to advance in science, create art, and originate new courses of action.
References
Andacht, F. (1996). El lugar de la imaginación en la semiótica de C. S. Peirce. Anuario Filosófico, 29(3), 1265–1289.
Anderson, D. (1987). Creativity and the philosophy of C. S. Peirce. Nijhoff.
Ayim, M. (1982). Peirce’s view of the roles of reason and instinct in scientific inquiry. Anu Prakashan.
Barrena, S. (2007). La razón creativa. Crecimiento y finalidad del ser humano según C. S. Peirce. Rialp.
Barrena, S. (2014). Charles S. Peirce in Europe: The aesthetic letters. Transactions of the Charles S. Peirce Society, 50, 435–442.
Barrena, S. (2015). La belleza en Charles S. Peirce: origen y alcance de sus ideas estéticas. Eunsa.
Barrena, S. (2019). Contributions of Charles S. Peirce to creative thinking. Porto Arte: Revista de Artes Visuais, 24(41). https://doi.org/10.22456/2179-8001.97215
Barrena, S., & Nubiola, J. (2020). Abduction: The logic of creativity. In T. Jappy (Ed.), The Bloomsbury companion to contemporary Peircean semiotics (pp. 185–203). Bloomsbury.
Bergmann, E. W., & Colton, E. (1999). Connecting to creativity. Ten keys to unlocking your creative potential. Capital Books.
Boden, M. (1999). Computer models of creativity. In R. Sternberg (Ed.), Handbook of creativity (pp. 351–371). Cambridge University Press.
Brioschi, M. R. (2020). Creativity between experience and cosmos. C. S. Peirce and A. N. Whitehead on novelty. Karl Alber.
Burton, R. (2000). The problem of control in abduction. Transactions of the Charles S. Peirce Society, 36, 149–156.
Chiasson, P. (2001). Peirce’s pragmatism: The design for thinking. Rodopi.
Eco, U., & Sebeok, T. A. (Eds.). (1983). The sign of three: Dupin, Holmes, Peirce. Indiana University Press.
Fann, K. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Fisch, M. H. (1981). Introductory note. In T. A. Sebeok (Ed.), The play of musement (pp. 17–21). Indiana University Press.
Génova, G. (1997). Charles S. Peirce: La lógica del descubrimiento. Cuadernos de Anuario Filosófico.
Hausman, C. R. (1987). Philosophical perspectives on the study of creativity. In S. C. Isaksen (Ed.), Frontiers of creativity research: Beyond the basics (pp. 380–388). Bearly.
Hausman, C. R. (1993). Charles S. Peirce’s evolutionary philosophy. Cambridge University Press.
Hoffmann, M. (2006). Seeing problems, seeing solutions. Abduction and diagrammatic reasoning in a theory of scientific discovery (Working Paper Series 2006). School of Public Policy, Georgia Institute of Technology. Available at https://smartech.gatech.edu/bitstream/handle/1853/24031/wp15.pdf?sequence=1&isAllowed=y
Hull, K. (1994). Why hanker after logic? Mathematical imagination, creativity and perception in Peirce’s systematic philosophy. Transactions of the Charles S. Peirce Society, 30(2), 271–295.
Joas, H. (1993). Pragmatism and social theory. The University of Chicago Press.
Johnson, M. (1993). Moral imagination. Implications of cognitive science for ethics. The University of Chicago Press.
Kent, B. (1997). The interconnectedness of Peirce’s diagrammatic thought. In D. Roberts & J. Van Evra (Eds.), Studies in the logic of Charles S. Peirce (pp. 445–459). Indiana University Press.
Lorda, J. (1991). Gombrich: Una teoría del arte. Eiunsa.
Martindale, C. (1999). Biological bases of creativity. In R. Sternberg (Ed.), Handbook of creativity (pp. 137–152). Cambridge University Press.
Niño, D. (2001). Peirce, abducción y práctica. Anuario Filosófico, 34(1), 57–74.
Nubiola, J. (2004). Il lume naturale: Abduction and god. Semiotiche, I(2), 91–102.
Nubiola, J. (2005). Abduction or the logic of surprise. In F. Merrell & J. Queiroz (Eds.), Abduction; between subjectivity and objectivity, semiotica (vol. 153(1/4), pp. 117–130). De Gruyter.
Nubiola, J. (2009). What reasonableness really is. Transactions of the Charles S. Peirce Society, 45(2), 125–134.
Peirce, C. S. (1931–1958). Collected papers of Charles Sanders Peirce (vols. 1–8) (C. Hartshorne, P. Weiss, & A. W. Burks, Eds.). Harvard University Press. [CP].
Peirce, C. S. (1966). Charles S. Peirce papers. Harvard University Library, Photographic Service 1966. The manuscripts are quoted by their number in Richard Robin, Annotated catalogue of the papers of Charles S. Peirce. University of Massachusetts Press, 1967. [MS].
Peirce, C. S. (1976). The new elements of mathematics (vols. 1–4) (C. Eisele, Ed.). Mouton, 1976. [NEM].
Peirce, C. S. (1982–). Writings of Charles S. Peirce: A chronological edition (vols. 1–8) (The Peirce Edition Project, Ed.). Indiana University Press. [W].
Peirce, C. S. (1985). Historical perspectives on Peirce’s logic of science: A history of science (vols. 1–2) (C. Eisele, Ed.). Mouton. [HP].
Peirce, C. S. (1992). Reasoning and the logic of things. The Cambridge conferences lectures of 1898 (K. L. Ketner, Ed.). Harvard University Press. [RLT].
Peirce, C. S. (1992–1998). The essential Peirce: Selected philosophical writings (vols. 1–2) (N. Houser, Ch. Kloesel, & Peirce Edition Project, Eds.). Indiana University Press [EP].
Policastro, E., & Gardner, H. (1999). From case studies to robust generalizations: An approach to the study of creativity. In R. Sternberg (Ed.), Handbook of creativity (pp. 213–225). Cambridge University Press.
Popper, K. R. (1992). In search of a better world. Lectures and essays from thirty years. Routledge.
Prince, G. M. (1970). The practice of creativity: A manual for dynamic group problem solving. Harper & Row.
Raposa, M. (2003). Meditation & the martial arts. University of Virginia Press.
Russell, B. (1959). Wisdom of the west. Doubleday.
Santaella, L. (1991). Instinct, logic, or the logic of instinct. Semiotica, 83, 123–141.
Shin, S. (2016). The role of diagrams in abductive reasoning. In S. Krämer & C. Ljungberg (Eds.), Thinking with diagrams. The semiotic basis of human cognition (pp. 57–76). De Gruyter.
Singer, J. L. (1999). Imagination. In M. A. Runko & S. R. Pritzker (Eds.), Encyclopedia of creativity (pp. 13–26). Academic.
van Heerden, M. (1998). Reason and instinct. In J. van Brakel & M. van Heerden (Eds.), C. S. Peirce. Categories to Constantinople (pp. 61–82). Leuven University Press.
Viola, T. (2020). Peirce on the uses of history. De Gruyter.
Surprise as the Dawning of Abductive Rationality: Evidence from Children’s Narratives
57
Donna E. West
Contents
Introduction
Launching Abductive Inferencing
Surprise as a Device to Invite Conjecture
The First Stage of Inquiry
The Indispensability of Co-localization
Mental Time Travel as Scaffold
Evidence from Children’s Narratives: Frog, Where Are You?
  Rationale for the Study
  The Children
  Design
  Findings and Interpretations
  General Discussion
Conclusion and Prospects for Future Research
References
Abstract
At the core of this chapter is the idea that children are more easily surprised than adults – in part due to their general inexperience with the world and in part due to their innate willingness to question previously held beliefs when compared with their adult counterparts. This latter ability most clearly manifests in the medium of creative inference or abduction. Utilizing Berman and Slobin’s findings, this chapter highlights how children between the ages of three and nine employ a Peircean retroductive model of creative inference when asked to reproduce, in narrative form, the events of a “pictures-only” story. Peirce’s model is “retroductive” insofar as interpreters reason from surprising consequent to antecedent. Results from the reported study indicate that children younger than five were more likely to fabricate events (outside of the core story line) to resolve surprising outcomes taken from the textual images. Beyond the age of five, the child narrators were more likely to use creative abduction to make logical connections between surprising events and provide more coherent narratives. This competence is likely connected with the emergence of episodic memory and autonoetic consciousness.
Keywords
Children’s narrative · Episodic memory · Peirce · Surprise · Co-localization
D. E. West, State University of New York at Cortland, Cortland, NY, USA
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_19
Introduction
This account demonstrates how narrative genres utilize the element of surprise to foster abductive inferences. Empirical findings of children’s narratives (at 3;0, 4;0, 5;0, and 9;0) are examined to determine to what degree they conform to sequences in Frog, Where Are You? [NB: The x;y notation represents the child’s age in years and months, respectively, but 3;0, 4;0, and the like can also stand for the entire year. This notation is the convention in child development research.] The present account draws upon Berman and Slobin’s (1994) findings but recodes and reanalyzes such findings in light of Peirce’s abductive paradigm, namely, that apprehension of surprising consequences is foundational to reasoning abductively. Children’s frequencies of surprising events versus those without the surprise element are analyzed to determine whether mention of surprising consequences increases with increases in frequencies of total events depicted in the storybook. Evidence of a developmental pattern (increases in surprising consequences with age) can indicate the need for younger children to lead their narratives with unexpected outcomes, which supply explanatory inferences regarding which antecedents bear upon primary consequences in the account. This account likewise discusses the role of cognitive and semiotic factors in bringing about instinctual inferences. The cognitive skill most responsible for formulating such logical connections across events is the means to think episodically. Apprehending surprising outcomes can hasten the search for viable contributing factors, which itself is episodic, but from a retrospective or retroductive perspective.
In other words, children’s episodic competency to sequence events logically by looking back from surprising consequences (relying upon Peirce’s retroductive reasoning paradigm) underlies their means to creatively integrate plausible propositions and arguments into the story line. In fact, episodic meanings facilitate children’s co-localization of sign objects, which amplifies propositional meanings to argumentative status. The rationale is as follows: the diagrammatic nature of pictures enhances abductive inferencing, since meanings associated with their objects are highlighted both by indexical and iconic sign features, that is,
features that provide directionality/location and resemblance/analogy, respectively. The depiction of the identity element and movement toward a goal in storybook pictures serves as a scaffold toward uniting events with reasonable explanatory sequences. Since thinking episodically does not emerge until 4;0, given the time-travel competencies (autonoetic skills) upon which it relies (Tulving, 2005), younger narrators are less competent in utilizing surprising outcomes to access episodic meanings inherent in the story line. Time travel (thinking episodically) incites older narrators to utilize surprising consequences as a scaffold to search for viable antecedents. But younger narrators must rely chiefly upon depictive signs to maintain sequentiality. Access to depictions which show iconic and indexical relations (spatially and temporally) starts younger narrators on a course to question previously held principles, whereby they can take full advantage of their guessing instincts by putting forward retroductive perspectives in their narrations. This inquiry offers findings from children at four distinct ages to trace when retroductive explanatory rationales emerge ontogenetically. The natural disposition of surprise as expressed in narratives, to instantaneously deliver insight into which antecedents are relevant to explain vital, anomalous consequences, makes obvious its indispensability in constructing reported speech. Using surprising outcomes as the nucleus for narratives can scaffold creative inferential reasoning (particularly for younger narrators), because the semiotic character of sequential images (as iconic and indexical signs together) bootstraps both cognitive and logical competencies – episodic thinking for the former and abductive inferencing for the latter.
Launching Abductive Inferencing
For C. S. Peirce, the element of surprise is paramount in its spontaneity to launch the process of bringing to consciousness the possibility that a different principle (than previously believed) explains a state of affairs. Semiotically, this process relies upon Peirce’s categories of Firstness, Secondness, and Thirdness: the possible/feeling, the physical/experiential, and the law-like/meaning/effect-based, respectively. Firstness operates through the spontaneous awakening of the consciousness to some anomalous effect, while Secondness supplies the shock of the surprising outcome itself. Peirce’s category of Thirdness is likewise implicated, given the need to establish a replacement principle required in abductive inferencing. The presence of Secondness insinuates that a former habit of belief or action is in need of reform; and a replacement belief is called for. It is Secondness which supplies the platform for the struggle between former and present conflictual paradigms. In its resistance against effort (an unexpected quality/event), Secondness becomes the agent to abduce – to create a change in habit as non-mechanical belief/conduct (incorporating the operation of Thirdness), thus resolving the opposition (see West, 2015, 2016b, c, d). It does so at the moment when the old habit and new habit clash, and the “vividness of the representation is exalted” (1903: 5.53). The exaltation drives the interpreter to examine the non-ego-based fact (not yet experienced) over and against already held mental/physical practices. The examination draws upon
the operation of Thirdness, in that it requires a close comparison – plunging the mind into increasingly more profound depths of consciousness as to whether a habit-change is recommended. To encapsulate, Firstness in surprise (unexpected mind-set) works together with Secondness (managing the effect of the unexpected fact on established habits) to elevate interpreters’ minds to increased degrees of consciousness in Thirdness, which is necessary to arbitrate the decision of which habit to choose. It is evident that each of the categories is influential in bringing to fruition Peirce’s clash between feelings. Peirce introduces surprise as the core element in the quest for knowledge – constituting the catalyst to the entire abductive process:
The surprising fact, C, is observed; But if A were true, C would be a matter of course, Hence, there is reason to suspect that A is true. (1903: 5.189; EP 2: 231).
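The form of inference Peirce states in the passage just quoted can be set out schematically. What follows is only a sketch in modern notation, not Peirce's own symbolism: reading "if A were true, C would be a matter of course" as a conditional is an assumption made for illustration, and the conclusion is deliberately left as a suspicion rather than an assertion of A. Set beside modus ponens, the schema shows that abduction runs from consequent to antecedent and is therefore not deductively valid.

```latex
% Requires \usepackage{amsmath} for \text.
% Assumption: the arrow is an illustrative reading of
% "if A were true, C would be a matter of course".
\[
\underbrace{\frac{C \qquad A \rightarrow C}
                 {\text{there is reason to suspect } A}}_{\text{abduction (retroduction)}}
\qquad\qquad
\underbrace{\frac{A \qquad A \rightarrow C}{C}}_{\text{deduction (modus ponens)}}
\]
```

On this reading, the surprising fact C occupies the place that A occupies in deduction, which is why the abductive conclusion remains open to later deductive and inductive testing.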
In surprise comes the realization (however implicit) that new facts are in disaccord with already established principles, calling for an inference to resolve the disagreement. It is the operation of surprise that keeps us humble – forever lifting the mind toward amplification and contraction of increasingly more rational hypotheses. Peirce indicates additional functions for surprise. He claims that surprise guards against docility; it launches the consciousness into a state of vigilance, demonstrating the need to follow a more accurate line of reasoning. As such, surprise prevents the exercise of sterile practices of thought and conduct. Its power to awaken the intellect and emotions simultaneously makes it a formidable force in habit-change, opening the way for abduction to orchestrate such change (see Haack, 2014). Narrative genres constitute an efficacious device to bring before the consciousness of another a series of surprising events (see West forthcoming for a fuller discussion of narrative), because it supplies an existential framework whereby interpreters develop and remonstrate over causes for surprising consequences. This is so because interpretations of narratives are perpetuated by the need to circumscribe logical principles onto individual existential events – providing goals/purposes for participants and reasons for the goals. Events which give interpreters pause (as in surprising outcomes) constitute primary catalysts to this end. In this way, narrators punctuate their accounts with surprising consequences, with outcomes which violate expectations; and listeners discard previous assumptions about why the expected consequence failed to surface; instead, a different outcome materialized. It becomes incumbent upon narrators to utilize a series of surprises as an initial device to force their listeners to refrain from reliance upon prescribed beliefs. 
Peirce considers that semiotic factors are responsible for the full effects which surprise (especially a series of surprises) can produce, in that the series of surprises launches interpreters into conscious examination of interplays between the existential object which is indexical and the conceptual object which is symbolic (1903: 4.448; cf. Stjernfelt, 2020: 428 for elaboration). Recognition of both the indexical and symbolic (conventional, agreed-upon meaning) factions together allows interpreters to emerge from apprehension of individual event-facts, to episodic frameworks intrinsic to story-based frameworks. The episodic nature of narratives accounts for
the expectation of resolution, for narrators and listeners. This expectation that the story will supply a goal and a resolution rivets interpreters to draw upon two kinds of skills: semiotic and logical to assemble the events and to examine the more objective meanings associated with physical situatedness. Learning the reason for inclusion of each participant constitutes an important framework to develop the logical acumen which underlies episodic thinking and discernment of the story’s plot, in that it integrates the participants’ physical with functional properties. The upshot of this episodic approach requires that interpreters of narratives actively assemble the story’s happenings into a planned lesson which builds an event-crescendo, such that both narrators and listeners extract what the character(s) in place of the listener should do to ascertain a remedy. As such, the unexpected consequences (as instances of surprise) provide scaffolds to ascertain the why for such consequences (West, 2021a, b). Bruner (1990: 81–83) describes an experiment in which children were presented with a story of a birthday party, but some iterations of the story “violated canonicality”: “the birthday girl was unhappy, or she poured water on the candles rather than blowing them out, and so on” (1990: 81). The surprising consequence (pouring water on the birthday candles) violates the children’s expectation that they would be extinguished by more conventional means, e.g., blowing out their luminating benefits. Here, the children are drawn to an explanation for extinction by water and to generate inferences to account for the discrepant conduct. All three of Peirce’s categories (Firstness, Secondness, Thirdness) work as an aggregate (informing inferences) to put together the story’s facts episodically, in a way that best explains the reason for facts’ emergence in that context. 
To do so, interpreters’ attention is redirected to an unconsidered fact or set of facts (antecedents) which might have effectuated the surprising consequence, however latent the facts might be. Bruner claims that children have a natural tendency to produce and interpret narratives; in fact, they serve a scaffolding function for determining more objective social and logical principles. As such, he observes: “Four-year-olds may not know much about the culture, but they know what’s canonical and are eager to provide a tale to account for what is not” (1990: 82–83). With respect to Firstness, surprise is consonant with a sense of feeling. This sense of feeling often comprises a betrayal of previous assumptions/assertions. The pre-established principles promote how to think or how to act in prescribed circumstances; but, doubt as to their plausibility from surprising consequences in Secondness leaves the door open to remain vigilant – to search out more fitting inferences in Thirdness. Peirce espouses that feelings arise from some prebit which is then checked against the validity of already held assertions (1909: MS 645: 9), whereupon subconscious considerations of truth acceptability for the inference materialize. Hence, the surprise element operates as a catalyst, whereupon message receivers re-examine the efficacy of the unexpected fact in light of stored classifications and propositions. Surprise defines the juncture when the facts inform old information, such that a surprising fact has merit to question the validity of current assertions. The incongruity activated by surprise becomes more solidified as a feeling (conviction) if the narrator or the receiver of the account questions whether
stored knowledge no longer explains a consequence(s); instead, newly presented condition/s in Secondness more adequately do so. When the unexpected consequence is conceived of via an internal force, it exercises both a deontic and epistemic effect over the receiver. This conflict materializes when a qualitatively different paradigm (pictures, diagrams) suddenly emerges to compete with preformed conceptions. Surprise begins the process of destabilizing preconceived operating standards – both epistemic and emotional ones; it surfaces when an outcome suggests to interpreters alternate, if not more viable, contributors to a consequence told in the narrative; otherwise, the consequence would never have materialized. This resultant conflict between feelings drives interpreters to apprehend a possible lack of preparedness on the part of the listener (deontically) for this eventuality given less adequate assertions. In surprise, Peirce holds that a collision in Secondness strikes the consciousness (1903: 8.266) and that a conflict (rational and affective) is about to be revealed. Cooke (2012: 185–186) recognizes the effects of surprise (both rational and emotional) which Peirce advanced to demonstrate the inception of inference-making: “ . . . Peirce’s focus on surprise as a rational or cognitive emotion at the beginning of inquiry and the transformation of unpleasant surprises and irritations of doubt into the pleasures and calm of new habits of belief confer new levels of self-control in the world.” Surprise opens the road to inquiry (Haack, 2014); hence, it operates as an agent of Firstness and Secondness to usher in new logical interpretants in Thirdness. Its purpose is to mark the dawning of belief conflicts in the wake of a decision of how much weight to ascribe to the unexpected, non-ego element.
Accordingly, surprise promotes semiosis – in harkening to the vividness of objects in Secondness and converting them into potential facts with meanings which vie for authentication. Surprise in Firstness and Secondness ignites at least some basic awareness of whether to further examine and embrace novel interpretants. The mechanism responsible for steering the levels of self-control to which Cooke (2012; see infra) refers is nothing short of abductive inferencing, because it is when principles replace old ones (especially sequential principles), that surprise is resolved, ushering in habit-changes (see Bergman, 2016: 190). Homeostasis is reached when the irritant of surprise gives rise to a rethinking (see Aliseda, 2016: 147–148); and some recognition of ignorance demonstrates ignorance preservation (see Woods, 2013: 365–370). These mental states of affairs create a condition which calls for reform – proposals for why the surprising fact scenario surfaced in light of antecedents which might have been altered. The effort to extract more absolute truths is the objective of double consciousness regimes; and the clash between feelings brings about surprise and activation of inferences. Cooke characterizes surprise “as a form of error recognition” (2012: 179). In this capacity surprise serves as an irritant for change of habits within logical interpretants: “surprise disrupts a habit and shows a belief to be wrong, and thereby initiates inquiry . . . ” (Cooke, 2012: 186). Surprise, however, may not always dissolve beliefs altogether; instead, it often calls for some less severe alteration, e.g., assimilation (additive operation) to previous schemas.
Beyond the strict element of doubt which surprise induces, Peirce affords great weight to the instrument of narrative. He proposes that narrative situations (time, place of utterance) constitute a critical force to open one’s mind to improved hunches. Superimposing episodic rationale (searching out the why for an unexpected event) upon individual events infuses its object with a higher level of meaning/effect. As such, highlighting surprising consequences awakens listeners’ consciousness to the intent of the narrator in telling the narrative. In this way, the design of the narrator in inserting elements of surprise confronts interpreters with the need to recognize alternative models of belief. Surprise provides interpreters with the impetus to consider more than a single event/perspective (even beyond that of the narrator), such that listeners are driven to perceive the goals of narrative participants and the reasons for their attitudes and actions. This skill constitutes dialogic thought, in that others’ efforts to dislodge listeners’ unworkable inferences become central to comprehension of event progression. Figuring out how and why characters behave and think subverts the narrowness of idiosyncratic viewpoints and opens up opportunities for listeners to lay aside their parochial assumptions. It is an interesting phenomenon that playing out actions/expressions/states of being as experienced by characters in narratives affords seekers who have a stake in interpretation a bird’s-eye view of alternative event interworking. This outside but soon-to-become accepted as inside experience permits listeners the luxury of aligning their self-centric perspective without physically entering into the actual experience of the other. 
For Peirce, this imagined or virtual experience (see 1909: MS 620: 26; 1904: 8.330; Bergman, 2016: 190; and West, 2017) possesses greater potency for the listener than does the actual experience, given the vicarious effect for the listener of inferring that the foreign episode is particularly relevant to them. As mentioned previously, imagined touch/words are often more vivid and riveting than are actual interactions (see West, 2019a, in press), perhaps because imaginers are drawn to “feel” (in the Peircean Firstness sense) the fullness of possibilities (1902: 6.364–367) soon to come. This virtual experience is indisputably a viable means to facilitate dialogical reasoning.
Surprise as a Device to Invite Conjecture
Peirce advocates that a two-sided conflict of thought (which operates in narrative exchanges) constitutes a necessary and primary component in facilitating novel insights. The dialogic interactions obviating belief/action conflicts begin with attention to the viability attributed to a different proposition (derived from a vivid, surprising circumstance). A flash of insight surfaces as a consequence of the affirmative impact of an unexpected event – suggesting that preexisting propositions/arguments are in need of change. According to Peirce, this insight emanates from outside forces such as vividness (cf. Atkins, 2018 and MS 318: 1907; MS 643: 1909). The primary advantage of vividness in surprise within double consciousness is its attentional benefit for determining whether the old or the new knowledge better cultivates truth principles. As such, focus is secured to vivid consequences in Secondness, calling for explications of their logical relationship with antecedents. To determine the value of the insight flowing from the vivid, surprising circumstance, subjects rely upon a two-sided consciousness (CP 5.53), whereby either the insight is discarded as a hypothesis or new propositions/arguments are presented (CP 8.373: 1908). The purpose of the two-sided argumentative venue is to reconstitute propositions in such a way that certain subjects are chunked with certain predicates to best be processed within the confines of the working memory system. Peirce is clear that the inquiry which surprise invites is both startling in Secondness and conjectural in Thirdness. In other words, surprise is pivotal to the logical progress inherent in abductive reasoning: “ . . . the search for pertinent circumstances and the laying hold of them . . . the bursting out of the startling conjecture, the remarking of its smooth fitting to the anomaly, as it is turned back and forth like a key in a lock, and the final estimation of its Plausibility, I reckon as composing the First Stage of Inquiry. Its characteristic formula of reasoning I term Retroduction, i.e., reasoning from consequent to antecedent. In one respect the designation seems inappropriate; for in most instances where conjecture mounts the high peaks of Plausibility and is really most worthy of confidence – the inquirer is unable definitely to formulate just what the explained wonder is; or can only do so in the light of the hypothesis. In short, it is a form of Argument rather than of Argumentation” (1908: 6.469; see supra for exegesis). Earlier in the same passage, Peirce explicitly sets up surprise as the indispensable catalyst for engagement in inquiry, especially abductive inquiry (see likewise Davies & Coltheart, 2020: 411).
Peirce’s notion of abduction encompasses a wide berth of inquiry from instinctual to reflective forms (see West, 2016a); but it is the instinctual kind, that which surfaces from intrinsic recognition of an object’s property and whose source emerges from rapid retrieval rather than from consideration/reflection of comparative characteristics of objects which defines the hypothesis as abductive and which accounts for whether it is ultimately accepted (see Davies & Coltheart, 2020: 418). Peirce considers abductions to range “from a mere expression . . . in the interrogative mood, as a question meriting attention and reply, through all appraisals of plausibility, to uncontrollable inclination to believe” (1908: 6.469). The latter is consonant with his use of instinct, in which sudden insights (“flashes”) naturally emerge, which more often than not offer true explanations (1903: 5.181). Peirce reminds us in 6.469 that surprise figures prominently in abduction, initiating any of the aforementioned forms of abductive inquiry, even abductions deriving from non-instinctual conjecture: “Every inquiry whatsoever takes its rise in the observation, in one or another of the three Universes, of some surprising phenomenon, some experience which either disappoints an expectation, or breaks in upon some habit of expectation of the inquisiturus; and each apparent exception to this rule only confirms it. . . . The inquiry begins with pondering these phenomena in all their aspects, in the search of some point of view whence the wonder shall be resolved. At length a conjecture arises that furnishes a possible explanation, by which I mean a syllogism exhibiting the surprising fact . . . ” (1908: 6.469). The surprise in the first stage of inquiry, as Peirce advocates, is particularly highlighted in narrative contexts, to invite listeners to conjecture regarding the plot.
The First Stage of Inquiry
The “first stage of inquiry” (1908: 6.469) compels interpreters to search out whether preconceived rationale underlying habits of belief/action call for interpreters to generate new hunches (see Magnani et al., 2016). The instrument at interpreters’ disposal to generate abductions of the instinctive kind (see Paavola, 2004; Feodorov, 2021) is surprise. In fact, it is the primary catalyst disrupting the habit: “An event that can be answered in an habitual form does not cause any surprise. On the contrary, a ‘surprising’ fact requires a change in our rational habit of belief; it demands an explanation” (Nubiola, 2005: 124, italics mine). Accordingly, the first stage of inquiry, apprehending surprising consequences, provides an inkling that a belief or action change is warranted. For Peirce, the emergence of surprise constitutes more than an affective tool to initiate inquiry into alternative causes and explanations; it likewise reveals the onset of cognitive dissonance between stored propositions/arguments and a result seemingly incongruent in the context. Incongruence (logical or practical) in the context is made obvious upon an affective reaction – that of surprise. As such, surprise is especially formidable as a point of departure from which narrators and interpreters inquire into logical congruities among events. Peirce alludes to the sudden transition from mental dullness/docility to vigilance:
At one time a ship is sailing along in the trades over a smooth sea, the navigator having no more positive expectation than that of the usual monotony of such a voyage, – when suddenly the ship strikes upon a rock. . . . And naturally nothing can possibly be learned from an experiment that turns out just as was anticipated. It is by surprises that experience teaches all she deigns to teach us. In all the works on pedagogy that ever I read . . . I don’t remember that any one has advocated a system of teaching by practical jokes, mostly cruel. That, however, describes the method of our great teacher, Experience. She says, Open your mouth and shut your eyes And I’ll give you something to make you wise; and thereupon she keeps her promise, and seems to take her pay in the fun of tormenting us. (1903: 5.51, EP 2: 154).
Peirce alights upon this illustration of striking the rock not to chastise the navigator’s apparent negligence, but to highlight how expectations/anticipations, when the regularity is so entrenched and automatic, can dull interpreters’ minds to potential (even obvious) consequences – were they to but awaken (by surprise) from the schema that engulfs them. The illustration which Peirce supplies is particularly poignant; it highlights how Secondness features to jolt and awaken interpreters’ realizations that previous meanings/effects should no longer apply. The affect triggered by observations of surprise in Secondness personifies the rock’s abilities, extending or overextending its power to destroy. Surprise disengages the interpreter’s docility; at the same time, it informs him of the rock’s power to disturb the status quo. It amplifies qualities newly attributed to the object (rocks), changing their interpretants and introducing a potential argument for the interpreter. This form of argument constitutes a pre- or proto-argument. This suggestion of an argument is not “argumentative” (6.469) since it has not yet reached muster to be articulated to another – to replace the inadequacy of previous explanatory hypotheses. In other
words, to qualify as argumentative, the proposed argument must reach a level of believability that another might not think the hypothesis to be preposterous. As such, the state of dulled consciousness is a necessary condition for making surprise effective – to produce a more heightened state of consciousness such that arguments might reach argumentative status – able to heighten an interlocutor’s sensibilities. In Peirce’s illustration of the awakening, the navigator is anesthetized by regularity and does not expect that his course (directionality of sailing in shallow waters) does not elicit a prediction (argument) of an untoward consequence. The navigator either was unaware of where he was sailing or was unfamiliar with changes in the sea’s bottom. In either case, the fact that indexical factors are not inscribed with more symbolic meanings accounts for the navigator’s lack of apprehension of the consequences. Stjernfelt (2020: 428) ingeniously points out the pivotal role of widening propositional meanings in the construction of new arguments. He attributes the inferences drawn from widening object meanings to the process of co-localization. Stjernfelt’s rationale is that location of objects is but one part of the objects’ meaning; generating plausible judgments (truth claims) likewise must operate, particularly those which examine general functional and/or descriptive object characteristics. Based on EP2: 282, Stjernfelt (2020: 420) explains that two object meanings are necessary to assert truth claims: “ . . . it is possible to make a truth claim about something by spatially connecting a sign referring to the Object with a sign describing that Object.” In fact, Stjernfelt (2020: 428) advocates that without introducing both index- and iconic-based object meanings (which propose existent and general object properties), truth claims are not fully assertable: “ . . . 
the proposition not only makes a claim about some fact or state of things, but it also makes a claim about the sign itself, a claim that it is a sign which stands in a certain, indexical connection to its Object, in turn granting its truth.” In other words, it is the existence of the object, together with familiarity with the object’s functions/properties, that completes the truth claim. Stjernfelt demonstrates the semiotic contrast which must be present to assert propositions, namely, index as subject of the proposition, and the proposition’s predicate, its iconic and symbolic components: “ . . . some sort of juxtaposition [between subject and predicate] must hold for all truth-claiming signs, for it is that juxtaposition which serves to describe the sign’s own claimed indexical relation to its Object – and so makes possible the expression of that truth claim” (Stjernfelt, 2020: 422; also see EP2:310). This passage emphasizes the critical role of sign juxtaposition for interpreters in expressing propositions which possess truth claims. It is interpreters’ realization of object juxtaposition (indexical versus iconic meanings) which comprises specific propositional syntax: “ . . . it is the juxtaposition which connects words” (1904: Kaina Stoicheia, EP 2: 310). This syntax of subject and predicate objects operates extralinguistically in diagrams, e.g., gestures and depictions, which appear to be even more primary for interpretation, especially early on in human ontogeny and across species. In fact, the source for completing the syntax which accounts for asserting plausible inferences (which hold truth claims) is interpreters’ increased acquaintance with the object of the index – to formulate more general meanings. The inferences which
underlie potential truth claims can qualify as instinctual, since even though they appear to surface as knee-jerk inferences (if they derive from previous cognitions (1868: 5.213)), they more often than not contain truth claims. Based upon a mere scrap of acquaintance (unlike intuitions), interpreters can assert plausible inferences (1908: 6.476): “It is more to the purpose, however, to urge that the strength of the impulse is a symptom of its being instinctive.” Even instinctual abductions emerge from some acquaintance, however slim. Any degree of acquaintance (consequent to co-localization) affords the presence of syntax – such that interpreters attribute to designated objects novel qualities and effects. When Peirce extends the expression of co-localization between interpreters, propositional meanings advance arguments: “A familiar logical triplet is Term, Proposition, Argument. In order to make this a division of all signs, the first two members have to be much widened” (1906: 4.538; cf. West, 2019a for elaboration). The interpretants of terms and propositions are widened through the process of co-localization, when new properties/functions are affiliated with the object of the index – forming a new subject-predicate union. This union consists in a new hypothesis – a change in belief about the nature of the object’s role in producing an outcome. Once the surprise of the outcome is felt in Secondness (from the object of index), interpreters are jettisoned into considering other hypotheses which might guide them (in the case of navigators) against sailing overly near to the coast. But surprising consequences alone do not alert the consciousness to current dangers if the interpreter’s mind is asleep or dulled by regularity. Remaining awake to new phenomena is critical; it rescues the consciousness from fallible beliefs and from actions/inactions which cause harmful consequences. 
The incidence of surprise awakens interpreters to the possibility that because of a newfound quality/functionality of the object (e.g., the sharpness of rocks on the sea’s bottom near the coastline), a different reaction should replace the faulty assumption of safety.
The Indispensability of Co-localization
The possibility of falling into a state of non-vigilance militates in favor of engaging in sign coupling or the object of one sign as enhancing that of another. Stjernfelt (2014, 2020) inaugurates this sign enhancement as “co-localization” and demonstrates its primary function in building inferencing – opening interpreters to entertain new propositions and arguments. Co-localization initiates this expansion of meaning – to incorporate argument structure. According to Stjernfelt (2014: 141), “ . . . the ‘highest’ Peircean sign types: propositions and their linking into arguments, are what represents aspects of reality (propositions) and give rise to inference to action (arguments) . . . .” Co-localization facilitates the application of additional characteristics to terms and propositions, such that they imply novel predicates (see 8.373; Bellucci, 2014 and West, 2019a). Co-localization does so by applying to the same object newly discovered qualities (MS 789). The new affiliation consequent to predication augmentations constitutes sign co-localization when both (the subject index and the symbol) refer to the same object; and while the indexical sign is
localized in the here and now, localization of the sign’s symbolic component is a general quality chiefly residing in the mind of the beholder. The latter surfaces consequent to conventional meaning and acquaintance. Stjernfelt (2020: 429) uses “acquaintance” to characterize the sign conjunction holding between the object in the spatiotemporal here and now (existence) and newly found general characteristics ascribed to the same object (symbolic element). He contends that existence alone is not predicative: “it must be aided by some direct acquaintance . . . or . . . how to achieve it [acquaintance]. The Predicate is co-localized not with a Subject index but with the very object of the sign itself” (Stjernfelt, 2020: 429). In this way, the physical object becomes part of the unity of two signs into one, when it is adorned with more conventionalized meanings which have logical merit. Sign unity of index and icon/symbol (co-location) is particularly pertinent in linguistic contexts, in view of the interplay between lexical selection and syntactic factors. These, together, attempt to explain truth claims by concatenating subject-predicate aggregates with existent objects – describing the why of what narrators select to present to listeners. When narrators convey propositions to listeners, they imply a novel object class affiliation – such that the same object specified in the Subject of the sign is afforded an amplified predicate which specifies some previously unaffiliated feature/function to the same object. When a co-localized sign is presented to others in narrative contexts, it takes on argumentative force, given the attempts to convince listeners of the efficacy between two or more propositions; they contemplate whether the narrator’s propositions regarding the object of index and icon constitute objective truth claims.
In this way, listeners become active players in fielding and establishing logical truth within story-based genres, where episodic meanings are prominent. Stjernfelt (2020: 440) articulates just how co-localized sign meanings extend to narrative exchanges (between partners): “Co-localization is topical . . . the presence of such a field (connectivity) must be communicated and recognized by the sign interpreter.” In other words, absent the speech partner, the speaker can only rely upon his own, idiosyncratic system to evaluate truth value – a distinctively subjective enterprise – one which is less reliable logically and is characterized by far less semiotic progress (arguably none). Extending Stjernfelt’s rationale still further, co-localization between speaker and listener is particularly poignant because the object which forms the topic of discourse may not be shared by both parties, whereby meaning remains uncommunicated and semiosis is precluded. The present narrative design (derived from Berman & Slobin, 1994) offers a simple means to ensure common object focus and communication of narrative plot, namely, to present the events in pictorial fashion – both partners having access. The depictions which Stjernfelt (2020: 435–437) examines possess less merit in the effort of ensuring common focus and facilitating semiosis. Even though they are pictorial in nature (hieroglyphs, comics), they fall short of supplying a natural discourse whereby co-localization can flourish. Supplying a series of sequentially organized pictures goes further to
intimate affiliations between propositions and arguments. The episodic (sequential) framework of pictures on different pages in the present design allows interpreters to more quickly infer logical connections and evaluate their validity. Utilizing pictorial formats gives the index a primary role in inferring more symbolic object meanings. It places the subject of discourse (topic) squarely before the mind of the listener of the narrative. In this way, episodically arranged pictures – natural to bolster the message of the narrative – make obvious different qualities inherent to the object under consideration. In turn, interpreters are invited to co-localize with narrators by assigning different predicates to the object of the subject index. Pictures or diagrams which tell stories provide the speaker’s object focus in the here and now while measuring changes in object attributes throughout the account. The emergence of unexpected object properties over the course of the story disambiguates the reason underlying the objects’ attribute; their presence likewise implies the nature of the predicate pertaining to that object (which the narrator is advancing). These more iconic and sequentially arranged signs show the topic (the object), such that both parties have direct access not merely to the object, but to its situatedness, which implicitly supplies additional object functions to support the tenets of the story line. Both the depiction and implications of the story’s primary purpose are naturally present for interpreters. This narrative procedure facilitates the co-localization of sign meanings (indexical with symbolic) while enhancing reciprocal meaning transmission. Via pictorial approaches, interpreters of narratives benefit from at least initial acquaintance with the discourse subject and gain facts to infer the speaker’s reason for telling the story.
Absent the co-presence of episodic event pictures – sequenced iconic signs – increasing logical challenges are likely to result. This is the case, since new propositions/arguments are more latent with presentment of linguistic representations (lexemes in a syntactic structure) alone; and object meanings are more likely to be missed/misconstrued. Although surprise from an unexpected pictorial scene has its advantages, it may not promote listeners’ abductive skills. Surprising depictions often awaken the consciousness to the need for more plausible beliefs; but they may not ensure that explanatory inferences follow. Despite its means to initiate notice of the subject of the discourse, surprise has little predicative benefit absent a realization of the narrator’s logical purpose for introducing newly determined object functions/qualities. In other words, sudden notice of disparity demonstrated by surprise alone fizzles out, without hints to determine different causes from the unexpected outcome. Beyond the attentional device (indexical enhancement) which surprise affords in the story line, narrators must integrate what Peirce advocates, namely, a “series of surprises” (1903: EP2: 154, see also West, 2021b). They do so by articulating characteristics not of the subject index (the picture) alone, but of the narrators’ perceived antecedent(s) (unforeseen object functions) – to convince listeners of the benefit of ascertaining logical linkages across primary elements of the narrative.
Mental Time Travel as Scaffold
Prior to narrating a series of events to others (one or more of which is surprising), the narrator must be conscious that interpreters ultimately need to weave together the events, not as a collage, but as a sequence. This sequence rests upon the teller’s own mental travel in the affairs of the characters. In telling these events to other parties, narrators subtly present novel argumentation whose truth value listeners are expected to contemplate (1908: 8.373; see also Bellucci, 2014; West, 2019a, for further elaboration). Narrators invite listeners to follow their argument in the story line and either adopt, augment, or reject the narrators’ prospects. In any case, time-travel skills (known as autonoetic consciousness) are foundational to factoring into the story not just descriptions of single events, but their logical integrity, leading to listeners’ discovery of the ultimate reason for the story’s conveyance. Although many investigators note the vital role of autonoetic consciousness in episode-building, Tulving’s (1985, 2002: 7 and 2005: 11, 32; Wheeler et al., 1997: 332) insights have been the most influential. Whereas noetic consciousness entails consolidating events which are remembered accurately in their actual and logical sequence (requiring index to supply some rudimentary argument structure), autonoetic consciousness entails the additional skill of remembering how the self traveled, or is likely to travel, through the event sequence (which requires index to suggest future-oriented spatial and temporal conditions for the self). In short, in its diagrammatic function, index hastens the consolidation of separate icons (depictions from a storybook). With indexical devices, event sequences (noetic) are facilitated; and inserting the self as player in constructed event sequences (autonoetic) can be achieved.
But what truly sets episodic memory apart from autonoesis is the means to further project the self into events experienced by others (not by the self alone) and situating others in subsequent diverse events. In view of this other-based viewpoint, taking allocentric perspectives is vital to thinking episodically and to constructing coherent narratives – a fact recognized by Szpunar and Tulving (2011: 6) as well as Herman (2002: 303–309) and Klein (2015). When children represent the self in past scenarios, and recall the sequence of those scenarios ordinarily during narratives, they are only remembering the happening itself and their own feelings. To truly think episodically (going beyond Tulving’s assertions), children must make inferences based upon others’ anticipated reactions – a less direct source. As such, child narrators must not only cultivate autonoetic consciousness (insinuating the self as an event participant); they need to imagine themselves as taking others’ perspectives in subsequent event structures. Until child narrators consciously incorporate appreciation for diverse perspectives, by projecting the self into possible events which others have experienced or may experience, narration is unlikely to convince interpreters to perceive reported events as an episodic unit. Narrators must consciously enter into objective points-of-view – in order to recommend courses of action to their listeners, particularly in immanent circumstances. The latter constitutes one of Peirce’s primary pragmatistic scaffolds
to foster abductive reasoning – to “recommend a course of action” from every novel belief determination (1909: MS 637:15). To make workable recommendations, procedural memory (knowing the steps to reach a goal) must integrate with semantic memory (knowing what to suggest to ascertain a goal). To convince interpreters to propose recommendations for successful courses of action, narrators incorporate a diversity of characters in the same episode, e.g., asking: “what if X character were to experience the given consequence.” This strategy can adjust the consciousness of listeners to anticipate likely reactions for the continuum at large. As a consequence, interpreters’ proposals are more likely to consist in workable paths of action. This is so, because event frames which qualify as episodic depend upon implicit logical affiliations to hold together. In fact, the rather late ontogenesis of episodic memory (Tulving, 2005: 11) is likely to be a consequence of the need to integrate procedural with declarative knowledge (the latter is a component of semantic knowledge). Because procedural knowledge cannot ordinarily be “brought to conscious awareness” (Mandler, 2004: 46), accessing it and integrating it into perceptual-motor memories require executive control, not present early on in ontogeny (Baddeley, 2007: 148–149). The procedural knowledge necessary for episodic memory, however, is not disconnected from semantic knowledge, since it resides not merely in the spatial but temporal (sequencing) situatedness of the contributing events. In contrast, the autonoetic property of episodic memory relies chiefly upon declarative, semantic knowledge. Its procedural dependence is not insignificant – given its means to coordinate spatial and temporal components (sequencing the where and when of events). In fact, Newcombe, Lloyd, and Balcomb’s (2011) analysis is not inconsonant with the inclusion of procedural knowledge in the mix. 
In short, episodic memory requires integration of procedural with semantic knowledge to organize representations of past and future events pertaining to self and others. To coordinate both kinds of knowledge effectively, children need to have an awareness of the source for their event memories, i.e., how they know the events – from self-observation or others’ narratives, and need to exert executive control, utilizing the episodic buffer, to block irrelevant event memories from influencing related abductions.
Evidence from Children’s Narratives: Frog, Where Are You?
Rationale for the Study
The purpose of this study was to determine whether the element of surprise is a primary factor in including unexpected outcomes in narratives at distinct ages. This claim derives from C. S. Peirce’s conviction that logical connections between propositions to form arguments (plausible explanatory hypotheses uniting happenings) rely upon abductive reasoning. The assumption presumes that the element of surprise serves a primary function in interpreting narrative structure,
including which events establish story onset, which events manifest the unfolding of the plot, and which events determine narrative resolution (Labov & Waletzky, 1967, cf. Berman & Slobin, 1994: 46). Measuring children’s narratives (utilizing the same picture book, Frog, Where are You?, to provide a universal vantage point for legitimate comparisons) can uncover a developmental trend as to the questions, arguments, and hypotheses which children articulate at distinct ages. Although Berman and Slobin (1994) report a host of psycholinguistic trends (see supra), the present reanalysis of Berman and Slobin’s actual transcripts will examine whether trends in children’s reasoning are operating – when they begin looking backward to earlier events (from surprising consequence to potential antecedents) to determine viable explanations. Rather than taking a Peircean retrospective approach, Berman and Slobin appear to use a prospective approach in that they separate the plot of acquiring the frog into seven ascending episodes constructing a plot (1994: 558). Berman and Slobin’s (1994: 558) prospective approach is made obvious in the following design: “a setting (frog leaving) and an episode (search for the frog).” Berman and Slobin (1994: 558) further differentiate episodes into “a beginning constituent (frog jumping out), a development (complex reaction resulting in a goal path, constituting a number of attempts with subsequent outcomes), and an ending constituent (reaction of the protagonists to finding a frog).” Hence, the character of Berman and Slobin’s episodes appears to conform to an ascending (prospective) logical trajectory rather than a retrospective one (i.e., apprehension of a surprising consequence (often a subsequent event) incites listeners to look backward to identify antecedent contributors). 
In this way, the present study investigates whether the inclusion of surprising consequences serves a logical advantage for both members of the narrative dyad, prompting each to engage in retrospective analysis. Frequent mention of surprising outcomes by subjects may well produce greater opportunities for both interlocutors to look backward for plausible contributing events. Accordingly, the present analysis examines whether increases in the mention of surprising events were associated with increases in the mention of antecedents or rationale for the surprising consequence. The study thereby offers a strategy for narrators to draw listeners (especially at young ages) into picturing, in an inferential way, episodic frameworks (rather than viewing depictions as disparate). It can demonstrate whether featuring surprising consequences invites both partners to take a further logical step, generating inferences as to causes for the surprise. Berman and Slobin (1994: 23) report that at 3;0, the capacity of their subjects to recognize and formulate logical event connections – detailing the what and why of the boy’s and dog’s actions/feelings – is far less evident than at older ages. They likewise report that older children (at 5;0 and thereafter) are not subject to the limitation of perceiving events as separate; at these ages their subjects transcended fixation on the individual pictures. The objective of the present study is to determine whether at 5;0 and thereafter children are more likely to attend to and mention surprising consequences and, if so, whether they are more likely to propose explanatory rationale for the emergence of the unexpected outcome.
57 Surprise as the Dawning of Abductive Rationality: Evidence from. . .
The Children

The children whose transcripts were analyzed in this study participated in Berman and Slobin’s original 1993 study, reported in their 1994 book (the transcripts were also obtained from Berman by personal communication). The transcripts which this author received from Berman were compared against the data from the same subjects which Berman and Slobin uploaded to the CHILDES database. Transcripts were examined and analyzed from 4 age groups, 12 children in each group: 3;0–3;9, 4;0–4;8, 5;0–5;10, and 9;0–9;7. Informed consent to have their child participate was obtained beforehand from parents and guardians, who were informed that they could withdraw their child at any stage during the study. It should be noted that these subjects may show greater competency in storytelling, given their more privileged socioeconomic status and often higher scores on standardized verbal and cognitive tests, even though the investigators actively worked to neutralize the effects of socioeconomic disparity (Berman & Slobin, 1994: 28).
Design

Each of the subjects was provided with a picture book before being instructed to generate narratives. After familiarizing themselves with the book, subjects were asked to look at the pictures and tell what happened to a familiar examiner (who likewise had access to the pictures). The storybook, by Mercer Mayer, consists of pictures only, without text. Frog, Where are You? constitutes a ripe forum to examine how child narrators understand and communicate surprise. A prime vehicle to measure the surprise element in reported events is the narrators’ representation of the event as perfective – the telicity and punctuality ascribed to the event’s borders, compared with neighboring features. Frog, Where are You? is a useful template to elicit the perfective character of events within narratives at diverse ages, since the depictions in the book do not consist of an overly lengthy sequence of events. This allows children, even at younger ages, to represent events along a continuum, recognizing their sequential character. The book contains 24 individual scenes appearing on different pages. Access to the same depictions in the same sequence for all subjects facilitates topical uniformity, allowing more valid comparisons across subjects. Moreover, subjects’ access to identical images prior to narrating the story (although each event is depicted separately) has the benefit of providing the same event orientation; and permitting continued access during the narration itself gives subjects the opportunity to recognize each picture, thereby equalizing the likelihood that subjects will mention each of the happenings. Access to the same depictions likewise constrains subjects’ focus, guarding against insertion of happenings extrinsic/exogenous to the story line.
Although continued access to the pictures may seem to have a rather prescriptive effect, it insures against subjects’ inclusion of spurious events – those which continue the story line or are fabricated (incongruent with the story line altogether).
Moreover, using the same picture prompts for all children militates in favor of common narrative content for all of the children, within and across age groups. The structure of the picture book supplies a readily formed sequence of events which subjects can easily follow – highlighting actions/expressions which insinuate protagonists’ goals. But, to ensure against being overly suggestive of the story’s theme, the title of the book was obscured altogether. Prior to narrating, the adult preempted what subjects focused on; the adult named and pointed to a boy, a dog, and a frog. This approach provided a topical focus for younger subjects – to determine who and what had prominence. Subjects were first asked to look through the entire book and then tell the story while both parties looked at the pictures. During the narration, both child narrators and adult listeners sat side by side, such that both were able to see the pictures. Looking (as adult listener) at the pictures simultaneously with child narrators may introduce an artificial factor, since the adult experimenters were discouraged from prompting or providing substantive comments – the use of verbal feedback was minimized. Silence, nods of the head, and articulating “uh-huh”/“okay”/“yes”/“anything else?”/“go on” are examples of remarks to be excluded. The fact that both the child and adult had access to the same depictions (story line) may have been a disincentive for the children to tell listeners what they already know; hence, shared knowledge may have decreased the number of topics which the child narrators introduced. Shared knowledge of the pictures may have permitted more focus on filling in the spaces or gaps between pictures and suggesting rationale for facts. This practice could prevent children from merely describing the singleness of individual pictures – especially at 3;0 and 4;0 (Berman & Slobin 1994: 23). 
Children were seen individually and were instructed to look at the individual pictures in the Frog, Where are You? book. This allowed subjects to familiarize themselves with the progression of the story. Each child was then instructed to tell the experimenter the story according to the pictures. Subjects were not provided with hints as to what to include or what to omit; hence, content bias (of what to tell) was not an issue. For the present study, the data were recoded to determine whether surprise influenced what narrators included in their narrations. Children’s verb production across Vendler’s four lexical categories (see infra, this chapter) was analyzed to show whether punctual and telic events, which enshrine the element of surprise, supersede stative and activity events. Verbless sentences with noun phrases were likewise included in the analysis. Two raters coded 60 percent of the data; the interrater reliability rate was 0.9. The criteria used by raters to distinguish surprising events from regular events include:

1. The depicted event portrays something startling or unusual based on the narrator’s prior knowledge schemes (subjective).
2. The depicted event breaks from the logical sequence established in previous frames (objective).
3. The characters depicted in the event display some degree of surprise, e.g., expressions of fear, suspense, pleasure, confusion, and misfortune (empathic).
4. Incongruencies of quality reside within individual objects, e.g., color, shape, and intensity.
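The chapter reports an interrater reliability rate of 0.9 without naming the statistic used. As a minimal sketch only, the following shows two common candidates, simple percent agreement and Cohen's kappa, applied to hypothetical rater codes ("S" for surprising, "R" for regular). The function names and the sample codes are illustrative assumptions, not the study's data or procedure.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    if expected == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten depicted events (not taken from the study).
rater_a = ["S", "S", "R", "R", "S", "R", "S", "R", "R", "S"]
rater_b = ["S", "S", "R", "R", "S", "R", "S", "R", "S", "S"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
```

With these invented codes the raters disagree on one item out of ten. Percent agreement and kappa generally differ, which is why it matters which statistic a reported rate refers to.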
Findings and Interpretations

Frequency of Surprising Consequences

One of the most frequently mentioned surprising events in the frog scenario is the image of the frog escaping from the jar, with one leg outside and the other inside the jar. The high frequency of this event (0.75) manifested itself at all ages, indicating its salience across development. From 5;0 onward, subjects described this scene, conflating manner and motion, as “sneaking out [of the bowl]”; “escape” likewise incorporates motion with manner, although most of the subjects (even at 9;0) did not select this lexeme. In contrast, younger children merely described this scene with a verb indicating motion alone, e.g., “got out.” In English, causation (determining antecedents from unexpected consequences) possesses certain syntactic and lexical characteristics. Syntactically, transitive constructions individuate an agent and a verb phrase which requires an accusative NP or occasionally a sentential complement. Lexically, the verb stem ordinarily specifies path/trajectory toward a goal – lexicalized by intransitive verb constructions; the degree of voluntariness of the experiencer is likewise lexicalized in transitive constructions. Alternatively, narrators may choose an intransitive verb to express episodic action trajectories. This lexical choice reconstructs the syntax of the sentence, such that the verb does not take an NP complement. This syntactic reorganization deliberately omits the agent of the resultant state, leaving that determination to listeners. In this way, the narrator’s logical commitments beyond single propositions of action path are communicated to listeners. This selection likewise transmits the narrator’s assertion of whether the experiencing party/entity is responsible for the unexpected end-state – implicating who or what might constitute the cause.
As such, the narrator’s decision to employ intransitive constructions emphasizes the involuntariness of the experiencing party/entity, given the focus on a resultant state of affairs rather than on the perpetrating agent. For example, the narrator’s selection of the intransitive verb “fall” rather than the transitive verb “shifted/descended” demonstrates some intent to disavow the experiencer’s agency/fault in his displacement (from one resting place to another). Accordingly, the narrator emphasizes the surprising resultant state rather than the full action sequence. This choice creates a different topic (the result of being dislodged to an unintended location), as opposed to conjectures regarding specific contributing factors. Examples of subjects’ lexical selections include: “[the dog] knocked down [the beehive],” and “[the tree] got bammed on” when the dog was shaking it, causing the beehive to fall (see Berman & Slobin, 1994: 155 for elaboration). The former transitive verb construction explicitly expresses the narrator’s assertion of
the agent’s responsibility and the causative effect on the beehive, rather than merely that the beehive got “knocked down.” The child narrator does not leave inferencing processes to the listener, but states all of the tenets of his assertion. The present analysis reveals that, although frequencies of events (surprising and otherwise) increase with age, those of surprising events are notably higher at all ages: 0.7 at 3;0, 0.8 at 4;0, 0.9 at 5;0, reaching 0.99 at 9;0 (see Fig. 1). These frequencies show that subjects overwhelmingly mentioned events which depict unexpected effects (e.g., participants’ incongruent actions) rather than communicating concurrent background (expected) facts. These findings indicate that children at all ages employed surprising consequences as scaffolds for their listeners – to form the nucleus for their proposition-building. Peirce’s claims promoting abductive inferencing strongly support narrators’ preference for including Phemes (signs that compel interpreters to form new meanings) to moderate and challenge listeners’ preconceived argument structure (see 1906: MS295: 29 and 1906: 4.552). In point of fact, drawing listeners’ attention to surprising outcomes becomes the bedrock to generate different relations between propositions. In this way, narrators encourage their interlocutors to propose new signs, akin to arguments (1906: 4.552). In short, the present findings show that Phemes (as surprising, indexical and iconic signs) operate as a potent force when telling narratives, given the attentional impact spurred by the double nature of indexical and iconic meanings. The contrast between features expected to surface and those which actually appear excites narrators’ notice as unlikely bedfellows, e.g., pictures featuring new object configurations or new together-objects; hence, they make these subject-predicate relations obvious to their listeners.
These surprising depictions suggest different logical subject-predicate connections, which accounts for the fact that the subjects herein more often select them (over events absent a surprising element) to present to listeners for topicalization. The indexical component of Phemes further advances the fit of new propositional material proposed by the iconic features into the episodic scheme. As such, at all ages, subjects employ the indexical component to serve a critical deictic function (see West, 2011, 2018) – pointing backward to see whether plausible antecedent events can complete the inquiry by offering explanations for the surprising outcomes. In this way, the potency of icons and indices together in the Pheme [“The next time one meets with it [the proper name], one regards it as an Icon of that Index,” (1905: MS 280: 27)] incites the adult listeners to use unexpected outcomes as catalysts to determine causality through retroductive inferencing. The mean relative frequencies of surprising events have been calculated as proportions of the total surprising events (12) which subjects mentioned in their narratives. Similarly, the mean relative frequencies of regular events represent proportions (of the 12 events) which the subjects actually mentioned. All age groups omitted regular events more than surprising events. Regular events were omitted at a proportion of 0.61 for 3;0, 0.59 for 4;0, 0.63 for 5;0, and 0.78 for 9;0 (see Fig. 2). The two events most likely to be omitted are regular events. The most omitted event is the final image of the boy carrying away a baby frog, while the second most omitted event is the image of the boy shushing the dog. In the case of the boy
Fig. 1 Comparison of surprising versus regular events with age
Fig. 2 Omissions of surprising versus regular events with age
shushing the dog, the preceding event indicates that the boy heard something, but it is not visually obvious why the boy is shushing the dog. The images most often included in the children’s narratives are the boy and dog in midair falling from the cliff and the dog in midair falling out of the window. This shows a preference for including the beginning of a resultative event in the narrative and not the resolution. This pattern of omitting the resultative scenes is less obvious at 9;0 and 5;0 than at 4;0 and 3;0. Berman and Slobin support this finding: 9-year-old narratives often contain a “more global structure organized around the onset
and elaboration” (1994: 70). In fact, Berman and Slobin (1994: 69) summarize the developmental progression as follows: (a) spatially-motivated utterance linking of picture-by-picture description (3-year-olds); (b) temporal organization at a local level of interclausal sequential chaining of events (most 5-year-olds); (c) sequential and/or causal chaining of partially elaborated events . . . like most 9-year-olds; and (d) global organization of entire texts around a unified action-structure . . . like some 9-year-olds, and all the adults.
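The relative frequencies reported above are proportions of the surprising events (12) and of the regular events (12, on the assumption that the remaining scenes of the 24 are regular) that a subject mentioned, and omission is the complement of mention. The following is a minimal sketch of that arithmetic. The assignment of scene numbers to the two sets is hypothetical, since the chapter does not list which scenes were coded as surprising, and the function names are illustrative.

```python
# Hypothetical split of the 24 scenes; the chapter does not publish this list.
SURPRISING_SCENES = frozenset({2, 4, 6, 8, 10, 12, 14, 16, 17, 19, 21, 23})
REGULAR_SCENES = frozenset(range(1, 25)) - SURPRISING_SCENES


def mention_proportions(mentioned_scenes):
    """Proportion of surprising and of regular scenes a narrator mentioned."""
    mentioned = set(mentioned_scenes)
    return {
        "surprising": len(mentioned & SURPRISING_SCENES) / len(SURPRISING_SCENES),
        "regular": len(mentioned & REGULAR_SCENES) / len(REGULAR_SCENES),
    }


def omission_proportions(mentioned_scenes):
    """Omission is the complement of mention within each event type."""
    props = mention_proportions(mentioned_scenes)
    return {kind: 1.0 - value for kind, value in props.items()}


def group_mean(per_child_values):
    """Mean relative frequency across the children in one age group."""
    return sum(per_child_values) / len(per_child_values)


# One invented narrative that mentions six scenes.
example = mention_proportions([1, 2, 6, 12, 17, 24])
example_omitted = omission_proportions([1, 2, 6, 12, 17, 24])
```

Averaging the per-child proportions with `group_mean` over the 12 children of an age group would give one value of the kind plotted in Figs. 1 and 2.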
Some events were syntactically truncated, but still included, in this analysis. At 3;0 the three most included events were (1) the initial depiction of the boy, dog, and frog together, (12) the boy falling and the dog being chased by bees, and (17) the boy and dog falling from the cliff. At 4;0, (12) was likewise rather likely to be included, as was (2) the frog escaping from the jar and (6) the dog falling out the window. Since the older children included more depictions, their inclusions are less noteworthy. Children’s mean relative frequencies of telic verbs show a similar pattern to frequencies of surprising events. Telic verbs in this analysis incorporate use of both achievement and accomplishment verbs. Findings demonstrate predominance of telic verb use (over non-telic verbs) when mentioning surprising versus regular events across all age groups (see Fig. 3) – 0.6 at 3;0 to 0.75 at 9;0. In contrast, although telic verb use in non-surprising contexts likewise increases with age, increases are less notable – slightly higher at 4;0 than at 3;0, declining at 5;0, ultimately increasing markedly at 9;0. These trends show a preference for telic verb use in surprising event contexts. With respect to comparisons (Vendler, 1967) of all kinds of Vendlerian verbs (achievement, accomplishment, stative, activity), child narrators at all age groups employ achievement verbs most often (see Tables 1 and 2 and Figs. 4 and 5). The
Fig. 3 Mean relative frequencies of telic verb use in surprising events versus regular events
Table 1 Mean relative frequencies of surprising events in Vendler’s categories

Verb category            3;0    4;0    5;0    9;0
Achievement verbs        0.46   0.47   0.51   0.64
Accomplishment verbs     0.18   0.16   0.21   0.07
Activity verbs           0.16   0.24   0.15   0.12
State verbs              0.20   0.14   0.13   0.16
Table 2 Mean relative frequencies of regular events for Vendlerian verbs

Verb category            3;0    4;0    5;0    9;0
Achievement verbs        0.14   0.16   0.18   0.25
Accomplishment verbs     0.23   0.34   0.30   0.23
Activity verbs           0.27   0.33   0.27   0.37
State verbs              0.36   0.18   0.26   0.14
Fig. 4 Mean relative frequencies of surprising events
mean relative frequency of achievement verbs is more than double those of verbs of other categories: Accomplishment, Activity, or State. Although this trend operates at 3;0, at older ages frequencies increase to more than triple the frequency, especially at 9;0 (0.46 at 3;0, 0.47 at 4;0, 0.51 at 5;0, and 0.64 at 9;0). With respect to events without the element of surprise, an upward trend with age for achievement verbs is likewise noted, although the trend is less steep than that of surprising events (see Table 2 and Fig. 5). In contrast, children’s production of other Vendlerian verb categories (accomplishment, activity, state) shows no pattern of use between event types (surprising or non-surprising), constituting a reverse
Fig. 5 Mean relative frequencies of regular events
trend with respect to that of surprising events. Conversely, mean relative frequencies of achievement verbs in non-surprising contexts demonstrate an altogether different pattern; they are lower than those of the other verb types for all age groups except at 9;0 (0.14 at 3;0, 0.16 at 4;0, 0.18 at 5;0, and 0.25 at 9;0). This performance difference (preference for achievement verbs) in surprising events is further highlighted by the rather flat distribution of frequencies for the four verb types across all age groups, independent of the surprise element. Nonetheless, unlike surprising events, frequencies of Vendlerian verb types in regular contexts show a quite different selection of verb types, namely, increased production of activity, accomplishment, and state verbs over achievement verbs. The low incidence of accomplishment verbs at 3;0 is curious, since these verbs include in their lexemes the semantic element of culmination, which is associated with telicity – the incidence of which is high (see Table 2 and Fig. 5). In fact, all age groups employ accomplishment verbs least often with respect to the other verb categories: 0.18. Essentially, younger narrators within the sample less frequently chose verbs which emphasize internal movement toward a goal (accomplishment verbs); but their frequency of use of telic verbs (which feature instantaneity) nonetheless demonstrates a preference for an event happening suddenly (all at once). The three-year-olds’ frequent production of achievement verbs featuring sudden happenings is in line with their more frequent mention of surprising outcomes. While the semantic content of achievement verbs expresses something happening all at once, accomplishment verbs encode dynamicity as well as telicity.
But, (as mentioned previously) three-year-olds’ use of achievement verbs demonstrates a proclivity to highlight each instantaneous event for their listeners, without focusing on outcomes to elicit inquiry into causes/antecedents for outcomes. In other words, at the early ages, the high incidence of achievement verbs appears not to
encode logical goal-based meaning paradigms. With age, however, meanings of achievement verbs (telicity) widen after 5;0, once argument structure incorporates binding of sequential happenings. A caveat is notable here with respect to the performance of the three-year-olds. Their high frequency of telic verbs to emphasize events as separate may (despite possible absence of a logical terminus) unwittingly advantage their listeners to make inferences from the narrators’ attention to consequences as topics in the narrative. This focus may (without their apprehension) lead listeners to infer the contribution of antecedents. In short, the three-year-olds’ proclivity to direct listeners’ attention to surprising outcomes can serve as a model for retroductive approaches. Child narrators’ frequent use of telic verbs (achievement) may not actually imply the existence of antecedent events from sudden outcomes, but may at least establish and put into motion Peirce’s consequent-to-antecedent paradigm. Even if causal grounding (potential antecedents which might have promoted the consequence) is not operating for three-year-olds, their piecemeal narrative may still give rise to a cognitive advantage for listeners. Changes in states (which can possess a telic meaning component) appear to be similarly unlinked to causal eventualities for subjects at all ages. Since production of stative verbs is reasonably low with respect to achievement verbs, at this age, subjects may not have been aware of the telic meanings which can sometimes reside in stative events. Telic meanings in stative verb use (e.g., recovery from physical illness) are often less frequent and less obvious than are more imperfective uses, because the primary semantic feature of conditions ordinarily comprises continuance rather than the termination of the condition. 
As a consequence, the preponderance of durative meanings in stative verbs may have blinded children to telic meanings, accounting for the low incidence of stative verbs in general. Conversely, suddenness as instantiated in both onset and offset, and in instantaneous events (codified in telic verbs) even if not associated with their full meaning, may nonetheless serve as a catalyst for abductive logic in narrative contexts.
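Because telic verbs are defined in this analysis as achievement plus accomplishment verbs, a telic share per age group can be recomputed from the values printed in Tables 1 and 2. The sketch below does only that. The variable and function names are illustrative, and the sums need not coincide exactly with the values plotted in Fig. 3, whose underlying data are not given in the text.

```python
AGES = ["3;0", "4;0", "5;0", "9;0"]

# Values transcribed from Table 1 (surprising events) and Table 2 (regular events).
SURPRISING = {
    "achievement": [0.46, 0.47, 0.51, 0.64],
    "accomplishment": [0.18, 0.16, 0.21, 0.07],
    "activity": [0.16, 0.24, 0.15, 0.12],
    "state": [0.20, 0.14, 0.13, 0.16],
}
REGULAR = {
    "achievement": [0.14, 0.16, 0.18, 0.25],
    "accomplishment": [0.23, 0.34, 0.30, 0.23],
    "activity": [0.27, 0.33, 0.27, 0.37],
    "state": [0.36, 0.18, 0.26, 0.14],
}


def telic_share(table):
    """Achievement + accomplishment frequency for each age group."""
    pairs = zip(table["achievement"], table["accomplishment"])
    return [round(ach + acc, 2) for ach, acc in pairs]


def dominant_category(table):
    """Most frequent Vendlerian category for each age group."""
    return [
        max(table, key=lambda category: table[category][i])
        for i in range(len(AGES))
    ]


telic_surprising = dict(zip(AGES, telic_share(SURPRISING)))
telic_regular = dict(zip(AGES, telic_share(REGULAR)))
```

Calling `dominant_category` on each table makes the contrast in the text directly inspectable: achievement verbs lead at every age for surprising events, whereas the leading category for regular events varies by age.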
Tense-Related Findings

Before 5;0, child narrators do not reliably possess the requisite theory of mind skills to realize which signs contribute to sequenced events for their listeners. In fact, they themselves may not have imposed logical coherency on the series of events; hence, their means to express temporal overlaps between events is inadequate to reveal episodic coherence. The emergence of these temporal skills (sequencing and binding into episodic frames) has far-reaching facilitative consequences – highlighting aspectual distinctions for their interlocutors. When narrators direct listeners to view the internal constituencies of events, they make obvious the internal character of the story – the stops, starts, and turns of the defining episodes. Utilizing signs which increase listeners’ knowledge of and rationale for participants’ motivations and next moves is paramount to the interpretive process, especially one which relies heavily upon episodic frames. For this reason, narrators’ aspectual competencies and the semiotic devices which represent them become the linchpin to predict what participants will do next and to infer which preceding
events are responsible for the unexpected outcomes. Highlighting continuities and stops and starts within the account is indispensable to enhance surprising outcomes (particularly the latter) and hasten aspectual determinations by virtue of the presence of telicity. Failure to express either event dynamicity or culmination (telicity) can depress interpreters’ generation of retroductive inferences. Expressing punctuality by means of surprising outcomes, while omitting foregrounded happenings, however distinctive they are with respect to the order of the story’s happenings, may, in fact, hasten the self-controlled inquiry which Peirce advocates. Subjects’ production of past tense marking (“-ed” inflections and irregular forms) shows that present tense verb forms occur at more than twice the frequency of past tense forms early on, at 3;0, while past tense markers increase to equal those of present tense at 4;0 and 5;0 (see Fig. 6). The largest increase of past tense markers is observed at 9;0, when past tense productions exceed present forms by more than twice the frequency (see Fig. 6). In fact, 8 of the 12 nine-year-old subjects are past tense dominant, while over half of them never produced the present tense at all (see Fig. 6). These findings are consonant with Berman and Slobin’s (1994: 132) findings regarding increased production of past tense. These corroborating findings support Tulving’s (2002: 7 and 2005: 11) observation that mental time-travel skills regulate use of past tense. Tulving’s claim accounts for increases in past tense production, once autonoetic competency is in place.
Narrative Event Sequencing

The findings here compare the sequence of children’s reported events with the sequence of the events depicted in the storybook itself. Relative frequencies are derived from each subject’s absolute frequencies of included events. Events were coded as: (1) conforming to the event sequence of the storybook (such that the reported event matched that of the story), (2) not conforming to the sequence of depictions within the storybook (out-of-order events), and (3) never appearing in the storybook and failing to embellish upon the story (fabricated events). Out-of-order events are highest at the youngest age, at 3;0 (0.085), but decrease by more than half at 4;0 (0.028), and remain at similar levels thereafter (see Table 3 and Fig. 7). Production of fabricated events follows a similar trend, with three-year-old subjects scoring the highest frequencies (0.069), decreasing by half at 4;0 (0.033; see Table 3). It is interesting to note that at 5;0 and thereafter, no instances of fabricated events were found; at 5;0 and 9;0 subjects did not introduce events that were not already depicted in the book’s story line (see Table 3 and Fig. 7). Both non-conformity and fabrication were thus most frequent at the earliest age and declined with age, demonstrating that when sequential episodic story patterns were least likely to be followed (at 3;0), subjects were also most likely to insert extraneous, and often irrelevant, events. Conversely (consequent to their episodic memory skills and means to time travel), older subjects’ narratives followed the
Fig. 6 Increased incidence of past tense with age. Relative frequencies of past versus present tense selection in 9-year-olds

Table 3 Relative frequency of narrative sequence divergence

Event type             3;0     4;0     5;0     9;0
Out-of-order events    0.085   0.028   0.032   0.032
Fabricated events      0.069   0.033   0.000   0.000
sequence of the storybook; and given the confidence of fluent telling, the need to embellish the narrative with irrelevant, external events became unnecessary. Explanations were coded illogical when child narrators produced irrelevant explanations to the progress of the narrative account and/or explanations which could not bring about the events alluded to. As might be expected, illogical effects decreased with age – 0.54 at 3;0, 0.33 at 4;0, and 0.29 at 5;0 – and were absent altogether at 9;0 (see Table 4 and Fig. 8). Given both the age gap and the
Fig. 7 Relative frequency of narrative sequence divergence (series: out-of-order events and fabricated events; age groups 3;0, 4;0, 5;0, and 9;0)

Table 4 Relative frequencies of logical and illogical effects

Age    Illogical   Logical
3;0    0.54        0.46
4;0    0.33        0.67
5;0    0.29        0.71
9;0    0.00        1.00
Fig. 8 Relative frequencies of illogical effects (age groups 3;0, 4;0, 5;0, and 9;0)
more advanced episodic memory competencies at older ages, the largest change is observed between 5;0 and 9;0. Like production of illogical explanations for resultative events, irrelevant descriptions trend downward with age: 0.23 at 3;0, 0.11 at 4;0, 0.12 at 5;0, and 0.05 at 9;0 (see Table 5 and Fig. 9). These findings (irrelevant/illogical) measure the congruity or dissonance between the descriptions in the narrations and the book’s depictions.
Table 5 Relative frequencies of relevant and irrelevant descriptions

Age    Irrelevant   Relevant
3;0    0.23         0.77
4;0    0.11         0.89
5;0    0.12         0.88
9;0    0.05         0.95
Fig. 9 Relative frequencies of irrelevant descriptions (age groups 3;0, 4;0, 5;0, and 9;0)
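The three-way coding of reported events in the Narrative Event Sequencing subsection (conforming to the storybook sequence, out of order, or fabricated) can be sketched as a single pass over a narrative. The chapter does not spell out how "out of order" was operationalized, so the rule used here, that an event is out of order when it is reported after an event that follows it in the book, is an assumption, as are the function name and the sample narrative. Following the text, relative frequencies are taken over the events the subject actually included.

```python
def code_sequence(narrated, canonical):
    """Classify each narrated event and return relative frequencies.

    narrated:  events in the order the child reported them
    canonical: events in the order the storybook depicts them
    """
    position = {event: index for index, event in enumerate(canonical)}
    counts = {"in_order": 0, "out_of_order": 0, "fabricated": 0}
    furthest = -1  # latest storybook position reached so far

    for event in narrated:
        if event not in position:
            counts["fabricated"] += 1      # never depicted in the book
        elif position[event] < furthest:
            counts["out_of_order"] += 1    # reported after a later scene
        else:
            counts["in_order"] += 1
            furthest = position[event]

    total = len(narrated)
    return {kind: (n / total if total else 0.0) for kind, n in counts.items()}


# Scenes are numbered 1-24 as in the book; the narrative below is invented.
storybook = list(range(1, 25))
narrative = [1, 2, 6, 5, 12, "picnic", 17, 24]
result = code_sequence(narrative, storybook)
```

In this invented narrative, scene 5 is reported after scene 6 and so counts as out of order, while "picnic" matches no scene and counts as fabricated. Averaging such per-child proportions within an age group would yield figures of the kind shown in Table 3.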
General Discussion

The preponderance of surprising events reported by subjects at all ages, especially prevalent at 9;0 (with concurrent exposure to the depictions in Frog, Where are You?), reflects not just the physical salience of surprising consequences, but a need to bootstrap the logical components that contributed to the consequence. In this way, explanations for unexpected outcomes can be manufactured. It is notable that subjects overwhelmingly included surprising outcomes in their narratives in the semiotic condition in which an icon depicted some incongruence with an indexical feature (the sequence of events). Hence, these kinds of signs (whose icons conflict or are unconventional with respect to the series of events that comprise the episode) account for child narrators’ impetus to announce them in their narratives, imposing their import upon listeners. These conflictual iconic and indexical strands became enshrined in the narrative, producing a discontinuous profile. The semiotic character of this profile often omitted descriptive, background information (based on more durative verb meanings) – requiring listeners to fill in the progression of the account. Gaps in the narrators’ account may have been influenced by the arrangement in the book itself – placement of depictions on distinct pages. This narrative paradigm is not without some advantage, however; it may well advance interlocutors’ inquiry into logical explanations for the surprising consequences – perpetuating the generation of inferences. The alternative possibility of introducing into children’s consciousness slide-like depictive accounts as opposed to separate depictions (as in the story at hand) may serve the distinct benefit of instilling a more continuous
D. E. West
account from which child narrators can draw. Unlike the suggested introduction of slide-based depictions, the current Frog, Where are You? paradigm virtually predisposes children toward discontinuous scene interpretation, since depictions on different pages imply not merely discontinuous events, but likewise the existence of more than one narrative running concurrently. In other words, the fact that the pictures from which children constructed narratives are themselves on different pages may militate in favor of perceiving the integral nature of a single narrative with goal-based junctures which invite interpreters to generate retroductive inferences – an issue particularly critical to children at younger ages. The separation of scenes may have different effects for narrators at distinct ages – having a more affirmative effect for older children – inducing them to inquire into extra-textual event relations. It behooves narrators (especially younger ones) to apprehend that the relations of and explanations for goals expressed in the narrative consist likewise in interpreting the spaces between images/icons – superseding their function as signposts. The literal mind of younger child narrators assumes that empty spaces/pauses mean just that – an absence of event relations; and often their analysis stops there. This phenomenon of failing to fill in the logical spaces between depictions is supported by additional evidence reported by Hayne and Imuta (2011). Hayne and Imuta’s findings reveal that child narrators inform their listeners of the places where events transpired while not imposing temporal organization (which would indicate beginning and endpoints with respect to other events). Other supporting evidence drawn from Berman and Slobin (1994: 104) demonstrates that descriptions of places characterize narratives at 3;0, and that such descriptions distinguish events, rather than connecting them.
In line with this reasoning, Tulving (2002: 7 and 2005: 11) claims that children’s inability at 3;0 to supersede place considerations and organize component events according to a logic-based time line reflects a failure to embed the elements into episodic frames. In light of Tulving’s observations, interventions (on the part of narrators) to make prominent telic aspects of events, while integrating background descriptions of ongoing states, appear not to be operational until sometime within the four-year mark; hence, narratives at earlier ages often consist in disconnected descriptions, without the aspiration of inferencing to logically integrate event frames. The lack of logical integration across event frames at 3;0 is concurrent with these subjects’ less frequent use of past tense and their relative inability to engage in mental time travel (see Tulving, 2005; West, 2014). The present findings clearly indicate age to be a primary factor in sequencing events in narrations, given the high incidence of out-of-order and fabricated events which subjects at 3;0 inserted into their narratives. In fact, after 4;0, production of out-of-order events declined, while insertion of event fabrications (inventing events not depicted in the storybook) ceased altogether. Subjects’ increased competence in sequencing events at 5;0 is concurrent with other time-travel competencies, e.g., exclusion of irrelevant/illogical events; and increased incidence of logical/relevant explanations for surprising consequences is likewise observed at the same age. In short, because older subjects constructed narratives more consonant with depictions in Frog,
Where are You?, their inferences offered to explain how the surprising outcomes materialized were likewise more adequate. Syntactic and lexical analyses from Berman and Slobin (1994: 98) support these increased logical competencies. None of their three- and four-year-olds explicitly articulated the source and the goal within a single syntactic unit, i.e., within the same clause, sentence, or even in adjacent sentences. This is consonant with the findings of Trabasso and Nickels (1992), demonstrating that goal-attempt-outcome sequences increase sharply between 4;0 and 5;0. Their subjects at 5;0 exhibited the clearest use of higher-level planning skills to produce a narrative in which goals are highlighted for the listener (see also Trabasso et al., 1992). To ascertain the plot, subjects would need to recognize each episode as a failure given that the frog was not apprehended (see Berman & Slobin, 1994: 86). Even though approximately half of their subjects at 4;0 recognized the beginning of a sequence as an attempt (e.g., to recapture the frog), only 25% sustained the attempts beyond the third scene (Trabasso et al., 1992: 159). Trabasso et al.’s findings regarding inability at 4;0 to maintain apprehension of goals may be a consequence either of a lack of planning (what to include in the narrative and in which order) or an inability to know and anticipate how to fashion the elements of the narrative for their listeners. Trabasso et al. (1992) report that planning is for the most part in place at 5;0. The present findings corroborate those of Trabasso et al., since frequencies of fabricated events (event insertions which are not depicted in the storybook) are far lower at 5;0 (and absent thereafter) than at younger ages (see Fig. 7). 
In fact, only three subjects within the 3;0 and 4;0 groups together supplied some loose connection between source and goal, by including the place where the surprising consequence occurred, or the action of the agent responsible for the antecedent event, but without mentioning temporal situatedness and without logically/syntactically connecting potential antecedents to the surprising consequence – mentioning the fall into the water (surprising consequence) absent any reference to the antecedent events, e.g., the deer shoving the two unfortunate travelers (see also Berman & Slobin, 1994: 163, for a discussion of the absence of logical relations). Additional examples from three-year-olds include: “He [presumably the boy] fell – in the pool.” Still another three-year-old likewise articulated only the surprising consequence, without the source: “and he’s [presumably the boy] in the water.” Nonetheless, some four-year-old subjects produced two successive clauses indicating event serialization, although not all of the serialization was explanatory. One four-year-old mentioned the transition of location and not the antecedent: “And then there was a cliff, and they both [boy and dog] fell straight into the water.” This subject merely describes an accurate sequence but omits the antecedent which consists in the reason for the boy’s and dog’s fall. Another more precocious four-year-old communicated both the antecedent and the surprising consequence: “He [the deer] pushed them both [the boy and the dog] off of the cliff, and they landed in some water.” It is crucial to note that the aforementioned syntactic construction consists in two coordinate clauses, which fails to tightly connect the precedent source event (the antecedent) with the result (surprising consequence). This syntactic pattern gradually disappeared
in favor of production of both events in embedded clause constructions, such that half of the five-year-olds supplied the source, as well as the consequence, although still in separate, coordinate clauses. In coordinate clauses, the conjunction “and” intervenes between each clause – that which expresses source and that which mentions the surprising outcome. This continued syntactic formation of subjects at 4;0 demonstrates recognition of both origin and consequent but is silent as to the logical relationship between them. It is unfortunate that data are not available from seven-year-old children, since these data would mark when in development children utilize syntactic devices (e.g., embedded clauses) to unite consequence with source. This union within the same subject-verb construct clearly indicates an underlying logical connection between two propositions – one determining the source/agent/antecedent and the other the consequence. This mature inferential competence is demonstrated by virtually all of the subjects at 9;0, along with increases in production of relevant/logical explanatory inferences (see also Berman & Slobin, 1994: 162, 164). A clear illustration of single clause inclusion of both event aspects can be found in one nine-year-old’s utterance: “He [the deer] pushed him [the boy] off the side of the cliff into water.” This utterance communicates the rationale of the narrator – the reason for the boy’s fate in taking an unexpected plunge, namely, because of the deer’s thrust (as perpetrator). This transformation, in Peircean terms, of propositions into arguments/Delomes (1906: 4.552) surfaces as an elaboration of two distinct events. 
Berman and Slobin (1994: 151) support the claim that even at 5;0, syntactic constructions still fail to demonstrate logical event connectedness; they observe that sentences are still not structured “beyond the bounds of a bare verb,” such that events are expressed with two conjugated verbs, housed in two separate independent clauses. Nonetheless, some of Berman and Slobin’s subjects at 5;0 represented logical relations within a single clause – by compressing “different facets of the situation within a single clause rather than arranging them linearly across successive clauses” (Berman & Slobin, 1994: 151). This more mature syntactic form (clausal compression) illustrates a logical advance, in that it integrates narrators’ inferences, encapsulating who did what to whom. Logical factors compressed syntactically in a single clause often include manner, cause, direction, and source or goal of movement (Berman & Slobin, 1994: 151), logical components indispensable to inferential logic. Lexical factors can likewise scaffold logical advancement by fusing propositions into an argument. Since at older ages children have greater competency to choose among several verb lexemes to express the manner, particularly in English (Berman and Slobin, 1994: 152), their single clauses can subtly incorporate additional manner-based propositions pertaining to the core proposition of consequence or source, e.g., the lexical entry of “run” implies haste to leave a scene. This lexical selection can imply either that haste to perform is a consequence of a pursuing agent or the narrator’s assumption that the runner anticipates a significant consequence from the influence of another event.
Conclusion and Prospects for Future Research

The youngest narrators’ tendency to fabricate and insert irrelevant events may well have been a consequence of attempts to cover over their inability to organize and store chains of information in working memory (see Baddeley, 2007; West, in press), especially pertinent to events which were not particularly surprising. Because the older subjects exploited some argumentative logic, by virtue of inclusion of logical explanations for consequences, especially for surprising ones, event fabrications disappeared. The greater application of explanatory inferencing to consequences for the older subjects is likely to have been a direct consequence of their higher incidence of attention to and inclusion of surprising outcomes in their narrative accounts – although subjects across the board were generally less likely to omit surprising events compared to less surprising ones (Fig. 2). Nonetheless, had intervention strategies been used to highlight surprising events, the younger children’s inclusion of events might have approximated frequencies of the older subjects, conforming more directly to the storyboard. This interventive approach offers younger narrators an easier means to tell more holistic narratives, in line with Labov and Waletzky’s (1967) determination of the contents of full-fledged narratives. The factors which make up narratives (for Labov and Waletzky) consist in beginning, unfolding, and resolution of the account. The narrative advance of the older subjects may result from their realization that sharing knowledge (simultaneous focus on the same depictions) with interlocutors can amplify their own inferential skills by considering listeners’ perspectives regarding the influence of certain antecedents. In this way, reliance upon common iconic and indexical cues provides a scaffold for narrators to have their listeners attribute the same explanatory inferences to the same observable signs.
The presence of sequential iconic signs allows narrators the means to direct listeners’ interpretation of event structures while likewise examining the viability of those of their listeners. Introducing a device to highlight surprising outcomes can elicit younger children’s implicit knowledge – and facilitate generation of instinctual hunches as to which previous events might be responsible. In fact, a focus on resultative facts may even force child narrators themselves to create more coherent accounts – filling in the gaps across what might at first glance appear to be separate (isolated) happenings. Several techniques can be implemented, e.g., telling episodes to younger children in an order inconsonant with that of canonical order can advantage young narrators, such that surprising consequences are mentioned out of order, prior to earlier events. This dissonance between reported events and canonical events may compel children to reflect upon the role of prior/concurrent happenings. This approach (reasoning from the surprising outcome) may be particularly successful with younger narrators for whom time lines are often confused or simply immaterial (see Hayne & Imuta, 2011). Rather than altering the prescribed sequence in the picture book, pictures of surprising consequences might alternatively be presented in color (or in a different color), while other surrounding events are in black and white. Although the
present design does not alter Frog, Where are You? in this way, this strategy might enhance attention to the import of the outcome and could give rise to narrators’/listeners’ interrogations for explanations. As such, child narrators might be more likely to search out potential antecedents, consonant with Peirce’s paradigm of abductive (retroductive) inferencing. Deliberate implementation of this color-enhancing approach for child narrators may permit them to tell narratives which are not fabricated and which are more logically integrated. What the findings of the present analysis primarily demonstrate is that drawing younger child narrators’ attention to the surprise element of outcomes is vital to hasten the application of retroductive rationality to the story line. As such, narrative genres are particularly efficacious in compelling child narrators to create viable abductions from surprising circumstances. Independent of interlocutors’ strategies to intervene, certain signs may be salient enough (particularly Phemes, see West, in preparation) to produce an interventive effect on their own, e.g., a surprising event leading to inquiry and explanatory hunches as to the causes. The iconic and indexical character of pictures/depictions engenders distinctive effects, particularly for younger narrators. Signs featuring icons along with indices compel young narrators to colocalize an image in a temporal and spatial location. Using the semiotic features of these surprising outcomes frees younger narrators to create rationales internally, rather than simply depending upon the emergence of developmental milestones, namely, time-travel competencies.
References

Aliseda, A. (2016). Belief as habit. In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit (pp. 143–152). Springer.
Atkins, R. K. (2018). Charles S. Peirce’s phenomenology: Analysis and consciousness. Oxford University Press.
Baddeley, A. (2007). Working memory, thought, and action. Oxford University Press.
Bellucci, F. (2014). “Logic, considered as semeiotic”: On Peirce’s philosophy of logic. Transactions of the Charles S. Peirce Society, 50(4), 523–547.
Bergman, M. (2016). Beyond explication: Meaning and habit-change in Peirce’s pragmatism. In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit (pp. 171–197). Springer.
Berman, R., & Slobin, D. (1994). Relating events in narrative: A Crosslinguistic developmental study. Lawrence Erlbaum Associates.
Bruner, J. (1990). Acts of meaning. Harvard University Press.
Cooke, E. F. (2012). Peirce on wonder, inquiry, and the ubiquity of surprise. Chinese Semiotic Studies, 8, 178–200.
Davies, M., & Coltheart, M. (2020). A Peircean pathway from surprising facts to new beliefs. Transactions of the Charles S. Peirce Society, 56(3), 400–426.
Feodorov, A. (2021). Arrested development: On instinct and reasoning in C.S. Peirce’s philosophy. Human Arenas. https://doi.org/10.1007/s42087-021-00187-1
Haack, S. (2014). Do not block the way of inquiry. Transactions of the Charles S. Peirce Society, 50(3), 319–339.
Hayne, H., & Imuta, K. (2011). Episodic memory in 3 and 4-year-old children. Developmental Psychobiology, 53(3), 317–322.
Herman, D. (2002). Story logic: Problems and possibilities of narrative. University of Nebraska.
Klein, S. (2015). What memory is. Crosswires, 1, 1–38.
Labov, W., & Waletzky, J. (1967). Narrative analysis. In J. Helm (Ed.), Essays on the verbal and visual arts (pp. 12–44). University of Washington Press.
Magnani, L., Arfini, S., & Bertolotti, T. (2016). Of habit and abduction: Preserving ignorance or attaining knowledge? In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit (pp. 361–377). Springer-Verlag.
Mandler, J. (2004). The foundations of mind: Origins of conceptual thought. Oxford University Press.
Newcombe, N., Lloyd, M., & Balcomb, F. (2011). Contextualizing the development of recollection. In S. Ghetti & P. J. Bauer (Eds.), Origins and development of recollection: Perspectives from psychology and neuroscience (pp. 73–100). Oxford University Press.
Nubiola, J. (2005). Abduction or the logic of surprise. Semiotica, 153(1/4), 117–130.
Paavola, S. (2004). Abduction through grammar, critic, and methodeutic. Transactions of the Charles S. Peirce Society, 40(2), 245–270.
Peirce, C. S. (i.1866a–1913). The collected papers of Charles Sanders Peirce. Vols. I–VI, ed. C. Hartshorne and P. Weiss (Harvard University Press, 1931–1935); Vols. VII–VIII, ed. A. Burks (1958). Cited with the CP convention of volume and paragraph number CP X.yyy.
Peirce, C. S. (i.1866b–1913). The Essential Peirce: Selected philosophical writings. Vol. 1, ed. N. Houser & C. Kloesel; Vol. 2, ed. Peirce Edition Project. Indiana University Press, 1992–1998. Cited as EP 1 and EP 2.
Peirce, C. S. (i.1866c–1913). Unpublished manuscripts are dated according to the Annotated Catalogue of the Papers of Charles S. Peirce, ed. R. Robin (University of Massachusetts Press, 1967). Cited according to the convention of the Peirce Edition Project, using the numeral “0” as a place holder. Cited as MS or R.
Peirce, C. S. (1903). Pragmatism as a principle and method of right thinking: The 1903 Harvard lectures on pragmatism, ed. P. Turrisi. SUNY Press, 1997. Cited as PPM.
Peirce, C. S., & Welby, V. (i.1898–1912). Semiotic and Significs: The correspondence between Charles S. Peirce and Victoria, Lady Welby, ed. C. Hardwick & J. Cook. Bloomington: Indiana University Press, 1977. Cited as SS.
Pietarinen, A.-V. (2006). Signs of logic: Peircean themes of the philosophy of language, games, and communication. Springer-Verlag.
Stjernfelt, F. (2014). Natural propositions: The actuality of Peirce’s doctrine of Dicisigns. Docent Press.
Stjernfelt, F. (2020). Co-localization as the syntax of multimodal propositions: An amazing Peircean idea and some implications for the semiotics of truth. In T. Jappy (Ed.), The Bloomsbury companion to contemporary Peircean semiotics (pp. 419–458). Bloomsbury.
Szpunar, K., & Tulving, E. (2011). Varieties of future experience. In M. Bar (Ed.), Predictions in the brain: Using our past to generate a future (pp. 3–12). Oxford University Press.
Trabasso, T., & Nickels, M. (1992). The development of goal plans of action in the narration of a picture story. Discourse Processes, 15(3), 249–275.
Trabasso, T., Stein, N. L., Rodkin, P. C., Munger, M. P., & Baughn, C. R. (1992). Knowledge of goals and plans in the on-line narration of events. Cognitive Development, 7(2), 133–170.
Tulving, E. (1985). Memory and consciousness. Canadian Psychology, 26(1), 1–12.
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25.
Tulving, E. (2005). Episodic memory and autonoesis: Uniquely human? In H. S. Terrace & J. Metcalfe (Eds.), The missing link in cognition: Origins of self-reflective consciousness (pp. 3–56). Oxford University Press.
Vendler, Z. (1967). Linguistics in philosophy. Cornell University Press.
West, D. (2011). Deixis as a symbolic phenomenon. Linguistik Online, 50(6), 89–100.
West, D. (2016a). Peirce’s creative hallucination in the ontogeny of abductive reasoning. Public Journal of Semiotics, 7(2), 51–72.
West, D. (2016b). Reflections on complexions of habit. In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit: Before and beyond consciousness (pp. 421–432). Springer.
West, D. (2016c). Indexical scaffolds to habit-formation. In D. West & M. Anderson (Eds.), Consensus on Peirce’s concept of habit: Before and beyond consciousness (pp. 215–240). Springer.
West, D. (2016d). Course of action recommendations and their place in developmental abduction. IfCoLog Journal of Logics and their Applications, 3(1), 123–152.
West, D. (2017). Virtual habit as episode-builder in the inferencing process. Cognitive Semiotics, 10(1), 55–75.
West, D. (2018). The work of Peirce’s dicisign in representationalizing early deictic events. Semiotica, 225, 19–38.
West, D. (2019a). Index as scaffold to logical and final interpretants: Compulsive urges and modal submissions. Semiotica Special Invitation Issue 228, 333–353.
West, D. (2021a). Logical and practical advantages of double consciousness. Cognitive Semiotics, 14(1), 47–69.
West, D. (2021b). The element of surprise in Peirce’s double consciousness paradigm. Semiotica, 243, 11–47.
West, D. (forthcoming). Narrative: The dialectic of abduction. Springer.
West, D. (in preparation). The operation of Peirce’s Pheme in narrative contexts. Semiotics 2021, F. Seif (ed.).
West, D. (in press). Habit and semiosis. In J. Pelkey (Ed.), Bloomsbury Semiotics Volume I: History and semiosis. London: Bloomsbury.
Wheeler, M., Stuss, D., & Tulving, E. (1997). Toward a theory of episodic memory: The frontal lobes and autonoetic consciousness. Psychological Bulletin, 121(3), 331–354.
Woods, J. (2013). Errors of reasoning: Naturalizing the logic of inference. College Publications.
Abduction Beyond Representation
58
P. D. Bruza and Andrew Gibson
Contents
Introduction
Surprise
Phenomenal Beliefs
A Non-representational Account of Phenomenal Beliefs
A Synechistic Continuum of Belief Representation
Continuity and Possibility
Synechistic Representation of Beliefs in Abductive Reasoning
Concluding Remarks: The Benefits of Synechistic Abduction
  Accommodating Representationalism
  Abduction and Prediction
References
Abstract

This chapter proposes a non-representational basis for abduction which facilitates a pragmatic connection between experienced possibilities and actualized concepts. It foregrounds the significance of experience in the surprise of abduction and, drawing on Peirce, proposes a broadening of the scope of abduction in the form of a continuum. The significant features of the continuum are detailed, along with the benefits this larger synechistic view of abduction may provide beyond the received representational view.

Keywords

Abduction · Phenomenal beliefs · Representationalism · Continuity · Synechism

P. D. Bruza · A. Gibson
Queensland University of Technology, Brisbane, QLD, Australia

© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_20
Introduction

Abduction was put forward by C.S. Peirce and became, and still is, a subject of interest in fields such as logic, cognitive science, AI, and the philosophy of science (Gabbay and Woods, 2005). These fields invest heavily in propositional representations. Consequently, investigations into abduction tend to frame it as being a form of reasoning that involves propositional representations. It is the authors’ view that this framing shifts an important element of abduction to the periphery, namely, that abduction is initiated by surprise. Surprise is an experience. As such, those with a phenomenological bent would be inclined to view surprise as being incompatible with representation, because once the experience is represented, it is no longer the surprise but rather a description of the surprise. Put colloquially, the experience of surprise has slipped through the fingers of representation. A foundational assumption in this chapter is that surprise is an essential element of abduction and should be neither ignored nor relegated to the periphery. It must be appropriately foregrounded in order to furnish a complete account of abduction. This claim raises a dilemma: On the one hand, surprise is fundamentally non-representational, and on the other hand, the received view of abduction is that it is based on representations, usually propositional representations. In order to resolve this dilemma, an account of abduction is put forward that goes beyond representations.
Surprise

At the heart of abduction is the “surprise,” and Peirce notes that a key feature of surprise is that it is just that, a surprise, and that the only way of accounting for it is that there must previously have been some expectation, which now appears erroneous:

  Thus it is that all knowledge begins by the discovery that there has been an erroneous expectation of which we had before hardly been conscious. Each branch of science begins with a new phenomenon which violates a sort of negative subconscious expectation, like the frog’s legs of Signora Galvani. (EP 2:88)
Having erroneous expectations is a very common human occurrence, but for Peirce erring was vital to human inquiry as it presented the opportunity for learning, and furthermore erring justifies the belief that we are in touch with the world, ironically because to err means that we are in some way at odds with it. The moment that our expectations are shown to be erroneous is accompanied by surprise. Imagine that the door doesn’t open. There is surprise. Is the door knob broken? Is the door somehow jammed or blocked? According to Peirce, the experience of surprise provides the awareness of the mind and world at odds with one another, i.e., there is recognition of an error (Cooke, 2011). Therefore, there is a distinct first-person experience that there is an “error” – something is wrong, or at least not quite right.
In Peirce’s thought, this surprise is fundamentally based on a self/world distinction. In his account of the development of self-consciousness, he argues that the experience of error leads to a self/world distinction, which is necessary to experience that something is wrong and that the wrongness has to do with the beliefs held by the self. That error recognition is itself a kind of experience involving the malfunction of the self’s beliefs (Cooke, 2011). More specifically, for Peirce, the experience can and does justify the meta-belief that there is something wrong with the self’s beliefs. This is because surprise involves “a double consciousness at once of an ego and a non-ego, directly acting upon each other” (CP 5.52, 1903). In other words, the experience of surprise actualizes together with, and in the context of, a perceived separation between an internal self (ego) and the external non-ego which stands opposite to it. The phenomenological basis of belief malfunction is apparent in Peirce’s writings, e.g.:

  The man has some belief at the outset. This belief is, as to its principal constituent, a habit of expectation. Some experience which this habit leads him to expect turns out differently; and the emotion of surprise suddenly appears (CP 8.270, 1902)
Cooke (2011) observes that “Both the subjective feltness of error and the objective judgment that I have erred are essential to inquiry because in this way we are rationally answerable to the nonhuman world that impinges conceptually upon our minds through experience.” Therefore, unless there is the experience of surprise, together with the meta-belief that there is something wrong with the self’s beliefs, there would be no need for abduction at the level of individual inquiry. Due to the phenomenological basis of Peirce’s error recognition, a consideration of a non-representational account of beliefs and the crucial meta-belief is warranted. Moreover, when Peirce anchors abduction on surprise, the authors believe that he does not intend surprise to be reduced to cognitive representations, e.g., the difference between an incoming representation and one that has been stored in memory. Rather, abduction grows from the seed of a new immediate experience that does not easily accord with any existing beliefs, a “doubt” that occurs in relation to fixed beliefs, to use Peirce’s terminology from ‘The Fixation of Belief’ (EP 1:109). Peirce notes that “The irritation of doubt causes a struggle to attain a state of belief” (EP 1:114). Prior to the surprise, there is no doubt, and habit has resulted in minimal conceptual change. A new immediate experience, which is surprising because it does not align well with habitual conceptualizations, results in the irritation of doubt and motivates a struggle for belief, a process of inquiry. Finally, the authors agree with Peirce that epistemology must be understood from the first-person point of view. And when our epistemic situation doesn’t align with the reality confronting us, then there is not only the experience of error-recognition but that feeling also involves an interpretation, a cognition of error, and is therefore conceptual.
At the heart of conceptualization lies a continuity that spreads from immediate transient feelings that are intrinsic to experience through interconnected ideas which become less and less intense and less and less variable. This opens the door for both
non-representational and representational beliefs to coexist. Bridging this divide within the context of abduction is the central aim of this chapter. The challenge to be addressed next is how to provide a non-representational account of beliefs.
Phenomenal Beliefs
Beliefs have overwhelmingly been studied from an analytical rather than experiential perspective. However, Chalmers (2003) introduces the problem of explaining why mental states have phenomenal or experiential qualities. These phenomenal qualities are often approached by posing the question “What is it like to . . . ,” e.g., What is it like to taste coffee, to touch an ice cube, to surf a wave? But what about beliefs? Do they have experiential qualities? Zahavi (2003) thinks so: “... an analysis of thoughts, beliefs, categorization etc. that ignored the experiential side would merely be an analysis of what could be called pseudo-thoughts or pseudo-beliefs.” If so, it would seem possible to explore Peirce’s malfunction of beliefs from a phenomenological perspective. To begin this exploration, Hume’s notion of elementary beliefs is now introduced (Rocknack, 2021). Hume felt that “belief, or assent, which always attends the memory and senses is nothing but the vivacity of those perceptions they present” (T 1.3.5.7) (Hume, 2003). That is, sense impressions and memories of sense impressions are always beliefs. For example, feeling the coolness of the door handle in your hand and seeing its shiny color are Humean elementary beliefs. So too is the attendant in-the-moment arising of a memory of what is behind the door. Their vivacity is a clear indication that Hume grounds them in perceptual experience: “every kind of opinion or judgement, which amounts not to knowledge, is deriv’d entirely from the force and vivacity of perception, and these qualities constitute in the mind, what we call the BELIEF of the existence of any object” (Rocknack, 2021, p. 583) [emphasis in the original]. It is important to note that the vivaciousness of sense impressions or memories of them is not an extra quality that makes them into beliefs. Rather, for Hume, vivaciousness is inseparable from any given experience of the sense perception.
In other words, it is a phenomenal belief – it is not a belief about something expressed as a propositional representation; the perceptual experience is the belief. Hume proposes that we become conditioned to experiencing through the senses by two “species” of impressions when we experience them as being “constantly conjoined.” That is, they occur in a successive, contiguous fashion. Hume then states, “without farther ceremony, we call the one cause and the other effect” (T 1.3.6.2). Rocknack (2021) interprets this as reflexively associating perceptions in a causal manner. Hume states further that we “union[ize]” instances of pairs of correlated sense impressions, using our imagination to form a natural relation of causality, a “necessary connexion,” which is nothing more than “transition from the accustom’d union [in the imagination]” (T 1.3.14.21). To illustrate the preceding, Rocknack’s (2021) formalism of Humean beliefs is adapted and applied to the running example of the door. Your hand is on the
58 Abduction Beyond Representation
cool shiny door handle and you feel it turn as your wrist turns. Let us denote this $d_n$, where $n$ denotes the specific perceptual experience of these sense impressions. Contiguous with this experience is a sense impression of the door opening, denoted $o_n$. At different times, you have opened doors which are “union[ized]” via the collection: $\{(d_1, o_1), \ldots, (d_n, o_n)\}$. Consider that you now have the cool doorknob in your right hand and are now beginning to turn it ($d_{n+1}$). There is a “necessary connexion” to imagine the door opening slowly revealing the imagined space behind the door ($o_{n+1}$), before you actually experience the opening. (The superscript $o_{n+1}$ is used to denote it as a reflexively determined “idea”). In other words, Hume is pointing out a conditioned reflex to associate certain sense impressions with each other, with the assistance of the imagination. It is important to note that, according to Hume, the natural relation of causality between $d_{n+1}$ and $o_{n+1}$ is not a belief, whereas the imagined associate of the causal reflex $o_{n+1}$ is a belief, but not an elementary one (see footnote 17 in Rocknack 2021). Of course, the imagined $o_{n+1}$ might not actually be experienced, because the door is jammed ($j_{n+1}$). And when this happens, there is the experience of surprise: The conditioned reflex in the form of the idea $o_{n+1}$ is at odds with the present state of the world. In summary, the Humean beliefs just described are phenomenological. As phenomenal beliefs are inseparable from the experience of those beliefs, they offer a foothold for an experiential account of error recognition and the attendant surprise that initiates abduction.
A Non-representational Account of Phenomenal Beliefs
Let us fast forward a few centuries and reconsider Humean elementary beliefs within the cognitive theory of predictive processing (PP) (Kirchhoff and Robertson, 2018). A specific experiential encounter with a door handle, denoted by the elementary belief $d_i$, $1 \le i \le n+1$, corresponds to a probability distribution $D_i$ in PP. Similarly, there are the distributions $O_i$, $1 \le i \le n+1$, corresponding to the elementary beliefs of specific door opening experiences. Furthermore, distributions $D_i$ and $O_i$ are correlated, which corresponds to the elementary beliefs $d_i$ and $o_i$ being “conjoined.” It is conceivable that Hume’s conception of a “species” of sense impressions can be associated with a probability distribution $\hat{D}$, which is a fusion of the distributions $D_i$ corresponding to distinct door handle experiences, whereby all the contributing distributions can be envisaged as having similar shapes. The correlation between $\hat{D}$ and $\hat{O}$ would therefore correspond to “two ‘species’ of impressions occurring in a successive, contiguous fashion” (Rocknack, 2021, p. 589), namely, the experience of the door handle followed by the experience of the door opening. Now consider the elementary belief $d_{n+1}$. Its membership of the door handle species is reflected by the divergence between $\hat{D}$ and $D_{n+1}$ being small. In PP, the divergence is formalized by the Kullback-Leibler divergence, which is an information-theoretic measure of how one probability distribution is different from a second. As for the
imagined door opening $o_{n+1}$, its distribution $O_{n+1}$ would have a form close to the door opening “species” $\hat{O}$. This is because the idea $o_{n+1}$ is a conditioned reflex in a natural relation of causality. Consequently, the divergence between the distributions $D_{n+1}$ and $O_{n+1}$ is small, which translates into the idea $o_{n+1}$ being attributed with a high degree of expectation. In addition, if the door opening experience occurs, then the divergence between $\hat{O}$ and $O_{n+1}$ would be small, which reflects Hume’s “natural relation of causality” (Rocknack, 2021). Now imagine that the door is jammed, with the corresponding phenomenal belief denoted by $j_{n+1}$. In terms of PP, the divergence between the probability distributions $O_{n+1}$ and $J_{n+1}$ would be large, where $J_{n+1}$ is the distribution corresponding to the elementary belief $j_{n+1}$. A way to interpret this divergence is that it corresponds to the surprise element intrinsic to Peircian belief malfunction, i.e., the expected experience of the door opening is very different from the actual experience of the jammed door. Thus far, nothing has been said regarding the representational status of these distributions within the cognition of the agent. Within PP, this question is still being debated (Kirchhoff and Robertson, 2018). A powerful argument that the distributions correspond to cognitive representations rests on the claim that the belief malfunction is due to misrepresentation: When the door is jammed, the distribution $O_{n+1}$ corresponding to the idea $o_{n+1}$ is a misrepresentation because the internal cognitive representation does not align with external reality in the form of experiencing the jammed door $j_{n+1}$, modeled by the distribution $J_{n+1}$. Notwithstanding this argument, Kirchhoff and Robertson (2018) put forward a convincing counterargument that the probability distributions should not be construed as cognitive representations.
The crux of this argument is that the Kullback-Leibler divergence used to quantify the aforementioned divergences between distributions is a measure of Shannon information, and therefore does not establish a sufficient measure of representational content. The impact of this determination is the following: In order to be faithful to the experiential nature of abductive surprise, a non-representational account is required. PP, a contemporary theory in cognitive science, shows that it is possible to give a non-representational account of the belief malfunction that is intrinsic to surprise.
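The role of the Kullback-Leibler divergence in the preceding account can be made concrete with a small numerical sketch. It is illustrative only: the three outcome categories, the probability values, and the names `kl_divergence`, `O_hat`, `O_next`, and `J_next` are assumptions introduced here and are not drawn from the PP literature. The divergence is not symmetric, and the direction chosen below is likewise an assumption.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), in bits, for discrete distributions.

    p and q map outcomes to probabilities. Assumes q[k] > 0 wherever p[k] > 0.
    """
    return sum(pk * math.log2(pk / q[k]) for k, pk in p.items() if pk > 0)


# Hypothetical distributions over what is experienced after turning the handle.
O_hat = {"opens": 0.90, "sticks": 0.08, "jammed": 0.02}   # door opening "species"
O_next = {"opens": 0.88, "sticks": 0.09, "jammed": 0.03}  # imagined opening (the idea)
J_next = {"opens": 0.02, "sticks": 0.08, "jammed": 0.90}  # jammed door as experienced

expected = kl_divergence(O_next, O_hat)   # idea conforms to the species
surprise = kl_divergence(J_next, O_next)  # experience departs from the idea

print(f"imagined opening vs. species:     {expected:.4f} bits")
print(f"jammed door vs. imagined opening: {surprise:.4f} bits")
```

With these hypothetical numbers the first divergence is close to zero and the second amounts to several bits, which mirrors the contrast between a fulfilled expectation and surprise. The calculation compares the shapes of the distributions only.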
A Synechistic Continuum of Belief Representation
Thus far, it has been argued that abduction is initiated by surprise born of belief malfunction. The malfunction has a phenomenological basis in the form of phenomenal beliefs, which are non-representational entities. The question arises as to how to square this stance with the view that abduction involves reasoning with propositional representations. The approach taken is based on viewing the underlying beliefs as lying on a continuum. Bergman (2007) describes how after 1868, Peirce turned away from “representationism” and adopted a view he labeled
“presentationism,” the essential basis of which is that perception is immediate and not mediated by representations. This aligns with the nature of phenomenal beliefs. The development of the representational continuum to follow parallels the interpretive perceptual continuum put forward by Hausman (1997). Hausman introduces “sense-one” percepts that are similar to phenomenal beliefs in the sense that they are “undifferentiated sheer qualitative aspects of experience.”
Perceptual judgment, then, begins with sense-one percepts, which, it seems, are completely dumb. However, it is important to notice that “they are experienced objects, and, although as yet uninterpreted, they are not things-in-themselves.... they are not objects separated from our minds so that they remain forever mysterious”. In fact, there is reason to suggest that Peirce saw in them a primitive, or at least a proto-interpretive, element. (Hausman, 1997, p. 187)
An important aspect of phenomenal beliefs is that they constrain the perceiver by the very qualities of the experience, e.g., the chromy shininess to the eye, the cool, round feel in the hand. Phenomenal beliefs are thus consistent with Peirce’s view of percepts as consisting of distinct sense-perceptions which are synthesized and experienced and encountered as a whole, but with an inherent indeterminacy. That is, in the experiential encounter the phenomenal belief does not yet announce “a determinate, classifiable object” (Hausman, 1997, p. 186), such as a door handle. A modern take on this concept is that perception simply presents the external world (as opposed to representing it), thereby giving access to external objects and properties for thought, belief, judgment, etc. As such, phenomenal beliefs can be viewed as non-representational entities consistent with Hausman’s sense-one percepts. According to Hausman, Peirce proposed a “percipuum,” which corresponds to the sense-one percept as being immediately interpreted, e.g., the door handle announces itself as a “determinate, classifiable object.” A “percipuum” is where the indeterminate phenomenal beliefs become determinate in the form of a proto-belief. A proto-belief is the actualization of the phenomenal beliefs as a door handle. Hausman puts forward the view that the interpretation associated with a percipuum is “embryonic,” which subsequently evolves into “fully articulated perceptual judgments.” Correspondingly, the representational status of a proto-belief can be similarly assumed to be embryonic in the form of a proto-representation. Perceptual judgment is when the proto-belief becomes a belief about the door handle, e.g., “the door handle is jammed.” Such an assertion is taken by Peirce to be a mental description of a percept, in language or other symbols (MS 939:25, 1905).
Consequently, perceptual judgments can be considered to be fully fledged beliefs which can be appropriately represented in a propositional form. Peirce likens perceptual judgments to “stenographic reports” of the evidence of the senses, which may be erroneous (CP 2.141, 1902). In other words, the perceptual judgment professes to represent the percept, i.e., the phenomenal belief has evolved into a belief with a propositional representation via mediation in the form of the proto-belief. In summary, the representational continuum of beliefs looks like the following: phenomenal belief (non-representation), proto-belief (proto-representation), and
perceptual judgments (propositional representation). However, this is not yet a continuum in a Peircian sense. Peirce’s idea of a continuum is tightly coupled to what he termed synechism. Synechism is conceptually more extensive than its continuum core, providing a foundation for the whole of Peirce’s philosophy. Peirce’s synechism is both a metaphysical stance against dualism and a pragmatism which requires propositions to be inextricably related to experience: “. . . I have proposed to make synechism mean the tendency to regard everything as continuous. . . . I carry the doctrine so far as to maintain that continuity governs the whole domain of experience in every element of it. Accordingly, every proposition, except so far as it relates to an unattainable limit of experience (which I call the Absolute), is to be taken with an indefinite qualification; for a proposition which has no relation whatever to experience is devoid of all meaning” (EP 2:1). Susan Haack notes that “This is Peirce the metaphysician at his most philosophically fertile, his most mathematically imaginative, his most scientifically sweeping, and his most cosmologically prescient; but also his most darkly Cimmerian” (Haack, 2005, p. 241). Clearly, a comprehensive examination of synechism is beyond the scope of this chapter. However, a brief excursion is helpful to understand cognitive representation on a continuum. For Peirce there is no fundamental distinction between the “psychical” and the “physical” as both lie on the one continuum. While rejecting dualism, he also eschews materialist, idealist, and neutral forms of monism, arguing that “The one intelligible theory of the universe is that of objective idealism, that matter is effete mind, inveterate habits becoming physical laws” (EP 1:293).
For Peirce there is a dynamic evolutionary relationship between the purely psychical, which is rooted in experience, and conceptions which by “habit” have proved stable in relation to ongoing experience to the extent that they can be understood as “physical.” Peirce’s synechism provides us with something of a common canvas that seamlessly relates the phenomenological to the epistemic. It embraces the indeterminacy of pure experience while accommodating the determinacy of a physical reality. For Peirce all concepts are triadic in nature and can be understood in terms of his categories of “firstness,” “secondness,” and “thirdness.” Crudely, the categories in terms of concepts can be understood as an evolution from pure experience to complete concept: an original immediate sense or feeling is firstness, which is followed by secondness, where there is a sense of similarity or difference, and which evolves into thirdness, which is a relationship between firstness and secondness. Thus, thirdness accommodates a representation of experience as ideas, which is not possible in firstness or secondness. Peirce describes it as a consciousness of three modes (EP 1:260): (1) immediate feeling, the highly experiential aspect of consciousness; (2) polar sense, an awareness that an idea that is immediate is in some way different or similar to something that came before or which might come in the future; and (3) synthetic consciousness, an awareness that ideas relate to immediate feelings. Within this triadic view of concepts, there is a continuum from experience to conception. Notably, this view of consciousness has a striking resemblance to contemporary embodied theories of cognition like the aforementioned predictive
processing. It also lays the foundation for a kind of “knowledge” that emerges from both immediate experience and an ability to make predictions based on that experience. For Peirce, the material world appears through habitual interactions in a field of experience where ideas spread from immediate feelings of high intensity to repeated predictability which generalizes. Thus, knowledge is something akin to stability of ideas. At the heart of all conceptualization lies a continuity of thought that spreads from immediate transient feelings through interconnected ideas which become less and less intense and less and less variable. The less variable, more general direction of the continuum is what we come to view as knowledge, generally reliable over time despite continual interaction with new experiences, ultimately “physical laws.” However, these laws are not “representations” as such, as each idea is completely new, but rather they are the result of us being conscious of similarity between an immediately present idea and others that have come before – a habit of mind (the second mode of consciousness). In “A Guess at the Riddle,” Peirce illustrates the idea thus:
When red is not before my eyes, I do not see it at all. . . . I remember colours with unusual accuracy, because I have had much training in observing them; but my memory does not consist in any vision but in a habit by virtue of which I can recognize a newly presented colour as like or unlike one I had seen before. (EP 1:259)
In “The Law of Mind,” he reinforces this point:
We are accustomed to speak of ideas as reproduced, as passed from mind to mind, as similar or dissimilar to one another, and, in short, as if they were substantial things; nor can any reasonable objection be raised to such expressions. But taking the word “idea” in the sense of an event in an individual consciousness, it is clear that an idea once past is gone forever, and any supposed recurrence of it is another idea. (EP 1:313)
He goes on to say that the relation between two ideas “can only exist in some consciousness” (EP 1:314), supporting the third mode of synthetic consciousness which provides a relation between the first mode, immediate feelings, and second mode, polar sense. Thus, while ideas are discrete and non-repeating, they are connected in a continuum of consciousness that ranges from immediate, e.g., phenomenal beliefs, to the habitual, e.g., perceptual judgments.
Continuity and Possibility
Providing a clear conception of continuity was an ongoing struggle for Peirce. He was dissatisfied with Cantor’s views, but he also revised his own position to view a continuum as being composed of individual instances infinitesimally close together (Brian Noble, 1989), which he then rejected as it asserts that instants, or points, are ultimately distinguishable. While Peirce’s views on continuity were often developed mathematically, they also involved philosophical and logical considerations (Havenel, 2008). Of particular relevance is his statement: “the reality of continuity appears most clearly in reference to mental phenomena.” The focus
will now turn to continuity in relation to cognitive phenomena and the implications this might have for the representation of beliefs. Brian Noble (1989) states that the concept of possibility was pivotal in leading to a new definition of continuity that was not based on individual instances. Peirce deemed that in a true continuum, there are no actual points or instances, but there are a number of possible points that exceed any non-denumerable multitude. In the following it will be argued that Peirce’s concept of possibility shows some interesting similarities with the notion of indeterminacy that is used in relation to quantum phenomena (Malin, 2002; Pitowsky, 1994), with a particular emphasis on quantum-like cognitive phenomena (Busemeyer and Bruza, 2012; Bruza et al., 2015). The following quote illustrates Peirce’s view of possibility:
When we say that of all possible throws of a pair of dice one thirty-sixth part will show sixes, the collection of possible throws which have not been made is a collection of which the individual units have no distinct identity. It is impossible so to designate a single one of those possible throws that have not been thrown that the designation shall be applicable to only one definite possible throw; and this impossibility does not spring from any incapacity of ours, but from the fact that in their own nature those throws are not individually distinct (4.172; 1896) (emphasis ours)
Peirce’s stance is at odds with George Boole’s logical approach to probability, which was developed some 30 years earlier (Boole, 1862). Boole formulated his probability theory in terms of “conditions of possible experience.” Given that logic is a highly abstract enterprise, Boole’s use of the term “experience” is a curious one and requires some grounding. He states that the calculation of probability “depends upon information contained in the data, information supposed to be derived from actual experience, or at least to be of such a nature that experience might have furnished it” (Boole 1862, p. 226, emphasis ours). To provide an example, he describes an “urn containing balls distinguished by certain properties, e.g., by colour, as white or not white, by form, as round or not round, by material, as ivory or not ivory” (Boole 1862, p. 227). It is by actual experience that these properties are established. Let us now look at Boole’s “conditions of possible experience” in more detail. The symbols $p_w$, $p_r$, and $p_i$, respectively, denote the probability that a ball retrieved from the urn is white, round, or made of ivory. Instead of using $\Pr(w \land r)$ to denote the probability of retrieving a white, round ball, the abbreviation $p_{wr}$ will be used. Boole’s “conditions of possible experience” were formalized in terms of constraints on the probabilities. Some of the constraints are expected common sense constraints on probabilities such as
\[ p_{wr} \ge 0, \quad p_r \ge p_{wr}, \quad p_w \ge p_{wr}, \quad p_w + p_r - p_{wr} \le 1 \tag{1} \]
and
\[ \Pr(w \lor r \lor i) \ge p_w + p_r + p_i - p_{wr} - p_{wi} - p_{ri} \tag{2} \]
as well as,
\[ p_w + p_r + p_i - p_{wr} - p_{wi} - p_{ri} \le 1 \tag{3} \]
In addition, there are inequalities of the following form:
\[ p_w - p_{wr} - p_{wi} + p_{ri} \ge 0 \tag{4} \]
\[ p_r - p_{wr} - p_{ri} + p_{wi} \ge 0 \tag{5} \]
\[ p_i - p_{wi} - p_{ri} + p_{wr} \ge 0 \tag{6} \]
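The constraints can be checked mechanically for any candidate assignment of probabilities. The following sketch is a minimal illustration, not Boole’s or Pitowsky’s own procedure: the function names, the example urn, and the second set of numbers are hypothetical; condition (1) is applied to every pair of properties as an assumption; and condition (2) is omitted because it requires the probability of the disjunction as an additional input.

```python
from fractions import Fraction
from itertools import combinations


def urn_probabilities(balls):
    """Relative frequencies of the properties w, r, i and of each pair of them.

    Each ball is a set of property labels, e.g. {"w", "r"}.
    """
    n = len(balls)
    p = {k: Fraction(sum(k in ball for ball in balls), n) for k in "wri"}
    for a, b in combinations("wri", 2):
        p[a + b] = Fraction(sum(a in ball and b in ball for ball in balls), n)
    return p


def boole_violations(p):
    """Return labels of the conditions (1), (3)-(6) that the probabilities violate."""
    violated = []
    # Pairwise conditions in the style of (1), applied to each pair of properties.
    for a, b in combinations("wri", 2):
        pair = p[a + b]
        if not (pair >= 0 and p[a] >= pair and p[b] >= pair
                and p[a] + p[b] - pair <= 1):
            violated.append(f"(1) for pair {a}{b}")
    if p["w"] + p["r"] + p["i"] - p["wr"] - p["wi"] - p["ri"] > 1:
        violated.append("(3)")
    if p["w"] - p["wr"] - p["wi"] + p["ri"] < 0:
        violated.append("(4)")
    if p["r"] - p["wr"] - p["ri"] + p["wi"] < 0:
        violated.append("(5)")
    if p["i"] - p["wi"] - p["ri"] + p["wr"] < 0:
        violated.append("(6)")
    return violated


# A hypothetical urn of six balls with definite properties.
urn = [{"w", "r", "i"}, {"w", "r"}, {"w"}, {"r", "i"}, {"i"}, set()]

# A hypothetical assignment not derived from any urn.
half, tenth = Fraction(1, 2), Fraction(1, 10)
hypothetical = {"w": half, "r": half, "i": half,
                "wr": tenth, "wi": tenth, "ri": tenth}

print(boole_violations(urn_probabilities(urn)))
print(boole_violations(hypothetical))
```

Relative frequencies counted from the urn pass every check, whereas the hypothetical assignment is consistent for each pair of properties taken separately yet fails (3).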
Pitowsky (1994) shows that the inequalities (1), (2), (3), (4), (5), and (6) define a polytope. Each point in the polytope can be thought of as a “distinct identity” of the kind that Peirce disagreed with (see quote above). However, Pitowsky notes that quantum phenomena can violate one or more of Boole’s “conditions of possible experience.” For example, (4), (5), and (6) have a special significance in quantum physics as they are related to the Bell-CHSH inequalities, whose violation is taken to signify that the system under investigation is quantum; the associated probabilities lie outside of the polytope. Violations of Boole’s conditions open the door to framing a notion of possibility that is not based on individually distinct identities. Pitowsky offers various reasons why Boole’s “conditions of possible experience” may be violated. The most relevant to Peirce’s position is the following:
We may attribute this failure to one of our habitual (often implicit) assumptions, namely that there exists a well defined distribution of properties over some population, and the results of our measurements merely reflect this fact. Maybe there is ‘no population’, or, even if there is, there are no well-defined properties, existing independent of observation and distributed in a specific manner. All that exist are the phenomena themselves, which simply occur without cause. [Emphasis added] (Pitowsky, 1994, p. 107)
In other words, the phenomenon is indeterminate prior to observation – it does not have well-established properties whose values are simply “read off” by observation, as Boole’s notion of “actual experience” assumes. It is important to note that the uncertainty expressed by Boole’s notion of probability is epistemic, i.e., there is no uncertainty regarding the underlying fact of the matter, namely, the phenomenon is in the state of being “white” or “not white”; we just don’t know which prior to actual experience, hence the need for probabilities (Colyvan, 2004). It is the authors’ view that epistemic uncertainty encompasses what Peirce rejected, i.e., possibilities based on individual instances, for example, “possibly the ball will be white or possibly it will not be white.” In contrast to epistemic uncertainty, there is non-epistemic uncertainty (Colyvan, 2004). Non-epistemic uncertainty means that there is no underlying fact of the matter: the phenomenon is neither “white” nor “not white.” Even if all relevant data were available, uncertainty would remain about the truth of the following proposition: “The phenomenon is white.” In other words, the uncertainty is ontological – it relates to the indeterminate nature of the very being of the phenomenon, in and of itself, not uncertainty relating to knowing what state it is in. According to
Pitowsky, the ontological indeterminacy of the phenomenon is what allows Boole’s “conditions of possible experience” to be violated. Non-epistemic uncertainty resonates with Peirce’s quote above because prior to actual experience, the possibilities associated with the phenomenon cannot be considered to be individually distinct. In other words, there is the contradiction of a phenomenon being neither in state σ nor not in state σ, i.e., the phenomenon is not in a determinate state with respect to σ. Phenomenal beliefs have this character, which aligns with both Peirce’s notion of possibility and firstness. Peirce’s secondness is just what intrudes on the actual world by “sheer force and determination from the world of infinite possibilities” (Brian Noble, 1989, p. 167). Peirce expressed it via the conception of “thisness,” where “the true characteristic of thisness is duality, and it is only when one member of the pair is considered exclusively that it appears as individuality” (MS 942, 00015: [IV, 13536;n.d.). Brian Noble (1989, p. 167) further clarifies “thisness” in terms of Peirce’s continua of possibility as follows:
A This is accidental; but it only is so in comparison with the continua of possibility from which it is arbitrarily selected. A This is some thing positive and insistent, but it only is so by pushing other things aside and so making a place for itself in the universe.
So, a phenomenal belief is selected from the continua of possibilities and made an actual experience of “whiteness.” In other words, phenomenal beliefs correspond to possibilities that are actualized, thereby becoming actual facts and belonging to Peirce’s secondness. Following Noble, actualized possibilities are like points that are marked on a line in that they interrupt the continuity and hence can be considered to be discontinuities. The intuition associated with this notion of discontinuity is similar to quantum collapse. Quantum collapse is when, from a background of potentialities, one potentiality is selected and made actual (Malin, 2002). A quantum collapse is viewed as a discontinuity which interrupts the continuity formalized by the unitary dynamics of a closed quantum system. The field of quantum cognition provides an analogous picture (Busemeyer and Bruza, 2012; Bruza et al., 2015). In this picture, a cognitive phenomenon is quantum-like; it does not possess the objective property of whiteness, but a disposition to produce an actual experience of whiteness when one looks at the ball. Correspondingly, in quantum theory, the electron does not possess an objective position, but a probabilistic disposition to appear in various places when observed. In line with Peirce’s secondness, the discontinuity involves a duality – a selected possibility, which becomes actual, versus the unselected ones. Thus far, the realm of possibilities, Peirce’s firstness, corresponds to phenomenal beliefs. These can be considered to be non-representational entities that exhibit quantum-like indeterminacy. Collapse occurs – a phenomenal belief becomes an actual experience. It has become a proto-belief. How can a perceptual judgment like “The ball is white” be attributed a propositional representation? When quantum theory was being developed, Werner Heisenberg could not account for the paths of electrons through a gas chamber.
These appeared very much like those of an electron acting like a billiard ball that, during its trajectory, was hitting other billiard balls (gas molecules), with energy being released at each impact and thus recording what looked like a continuous path of the electron. In a feat of brilliance, Heisenberg freed himself from this classical explanation and posited a more fundamental reality. When the electron gun is fired, an elementary quantum event occurs. The electron is very briefly actual but returns to the background of potentialities, which evolves and interacts with a gas molecule. Collapse occurs and an electron is again actualized. Energy is released as part of the interaction. What looks like a path of a persistent electron bumping into gas molecules is in fact a continuity of discontinuities, with a totally new electron actualized at each discontinuity. Stated more formally, quantum theory does not describe the electron’s trajectory. Instead, it provides a way to calculate probabilities for the electron to be observed at different positions. Returning to Boole’s urn, each experience of the shape and color of a phenomenon pulled out of the urn is a totally new actual experience – a proto-belief – the analogue of the electron becoming actual as part of its interaction with a gas molecule. The authors speculate that over time the associated proto-representations start to solidify into conceptual representations which end up in fixed propositional representations. Just like the path through the gas chamber, such representations appear to have a continuity somewhat akin to Peirce’s thirdness, but in fact, the more fundamental reality is that they are vestiges of discontinuities arising from a continuity of possibilities.
Synechistic Representation of Beliefs in Abductive Reasoning
Let us now return to the representational dilemma posed at the beginning of this chapter and address it by connecting the preceding synechistic view of representation with the act of abductive reasoning. Recall that surprise is the initiator of abduction – the handle is turned, the expected opening of the door is not experienced. There is belief malfunction with the attendant meta-belief that there is something wrong with our beliefs. Is the door jammed? Is it locked? Gabbay and Woods (2005) hold that these hypotheses are produced by a logic of discovery and are subsequently subjected to a logic of justification, which often involves analytically assessing each alternative and weighing them up. These logics serve as a filtration structure whereby one hypothesis is selected for action, e.g., the door is locked and so you go and retrieve the key hidden under the flower pot. The preceding example attempts to illustrate a few aspects. Firstly, abduction is a form of pragmatic reasoning, which is often used by human cognitive agents to address the unexpected surprises encountered in their lifeworld. Secondly, the hypotheses arise quickly and intuitively. Thirdly, their justification often amounts to analytically assessing each alternative and weighing them up. This takes time and effort. The latter two aspects are features of the dual process theory (DPT) of reasoning as well as the fuzzy-trace theory (FTT) of reasoning. Both of these are well-established theories in cognitive psychology (Thompson et al., 2021).
P. D. Bruza and A. Gibson
FTT focuses on mental representations, of which there are two: gist and verbatim. Gist representations encode the underlying meaning, or essence, of the stimulus, whereas verbatim representations encode its details. Gist representations seem to have the character of proto-beliefs, i.e., they are not fully fledged representations. In contrast, verbatim traces would seem to approach the character of fully fledged representational entities due to the detail they must encompass.
Dual process theories of reasoning (DPT) involve two types of processes. Type 1 processes are intuitive, spontaneous, and autonomous, meaning that they are elicited whenever the appropriate cues are primed in the environment. This includes assessments, e.g., classifications of physical properties, meaning-based processing, spontaneous belief judgments, associations drawn from stereotypes, etc. Type 2 processes, on the other hand, involve deliberate, slower analytical thinking.
Thompson et al. (2021) highlight some important points of intersection between FTT and DPT. Firstly, processing gist representations may be one of the primary functions of Type 1 processing. Proto-representations (of proto-beliefs) are like gists, which are manipulated by a Type 1 process in order to produce hypotheses. This explains how hypotheses often appear so quickly and effortlessly. The ease can be explained by Type 1 processes defaulting to simpler gists, at least initially. This behavior relates to the cognitive economy that has been strongly associated with human abductive reasoning (Gabbay and Woods, 2005). Correspondingly, verbatim representations are speculated to be manipulated by Type 2 processes. In this case, fully fledged representations corresponding to perceptual judgments are subjected to analytical assessment by the slower, more deliberative Type 2 process. The combination of Type 1 and Type 2 processes distils a particular hypothesis judged suitable for action.
While there are important issues to address to reconcile the gist/Type 1 and verbatim/Type 2 alignment, there is an inherent plausibility for its consideration in providing an account of abductive reasoning that goes beyond standard propositional representation and allows abduction to be placed in a synechistic continuum of belief representation.
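Purely as an illustration, the filtration structure described above (fast generation of hypotheses from gists, followed by slower assessment against details) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a model proposed by the authors or by Thompson et al. (2021): the lookup tables, the function names, and the scoring rule (counting matched details) are all hypothetical. Moreover, any such sketch is necessarily representational, so it says nothing about the non-representational moment of surprise itself.

```python
# Hypothetical gist-level associations: a coarse cue maps to candidate explanations.
GIST_ASSOCIATIONS = {
    "door does not open": ["door is locked", "door is jammed"],
}

# Hypothetical verbatim details that each hypothesis would lead one to expect.
EXPECTED_DETAILS = {
    "door is locked": {"handle turns freely", "no give at the latch"},
    "door is jammed": {"handle turns freely", "door shifts slightly in frame"},
}


def generate_hypotheses(gist):
    """Stand-in for a Type 1 process: a fast associative lookup on the gist."""
    return list(GIST_ASSOCIATIONS.get(gist, []))


def evaluate(hypothesis, observed_details):
    """Stand-in for a Type 2 process: count the expected details actually observed."""
    return len(EXPECTED_DETAILS[hypothesis] & observed_details)


def select_for_action(gist, observed_details):
    """Filter the generated hypotheses down to one; None if nothing was generated."""
    candidates = generate_hypotheses(gist)
    if not candidates:
        return None
    return max(candidates, key=lambda h: evaluate(h, observed_details))


if __name__ == "__main__":
    observed = {"handle turns freely", "no give at the latch"}
    print(select_for_action("door does not open", observed))
```

With the assumed tables, the details observed at the door match the locked-door hypothesis better than the jammed-door one, so the former is selected for action (retrieving the key).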
Concluding Remarks: The Benefits of Synechistic Abduction
By way of conclusion, a number of benefits are proposed which stem from taking a synechistic view of abduction.
Accommodating Representationalism
Firstly, a synechistic view of abduction does not necessarily nullify previous logical work that has assumed propositional representations. Rather, to the extent that it does not require exclusive representationalism, this work can be accommodated within the synechistic continuum. However, the authors argue that where there is a clear need for representationalism, the reasoning is more likely to be
58 Abduction Beyond Representation
inductive or deductive and that a purely abductive process necessarily begins (nonrepresentationally) in the experience of surprise. Within a representational perspective, both inductive and deductive reasoning are unproblematic as they rest on propositions and what has been conceptually “actualized.” What is actual can be accorded a truth condition. However, truth is indeterminate for that which remains in the realm of the “possible.” Further, for a surprise to really surprise the cognitive agent, it needs to be beyond current conceptualization and therefore non-representational. The scope of “possible” is extended by surprise, after the moment of surprise, and so abduction beginning in surprise provides a bridge from experience to non-propositional possibility which paves the way for truth-seeking inquiry resulting in propositions affording a basis for representationalism. Thus, representationalist perspectives can be viewed as being possible within a limited epistemic frame, but not tenable as a foundation for surprise and consequently abduction.
Abduction and Prediction
A second benefit of the authors’ recommended approach to abduction is that it is very accommodating of predictive theories of cognition. Recent theories of cognition propose that prediction is central: the foundations of thought are continuous predictions which are compared to continuous sense data. Cognition is the continual updating of prediction to accommodate errors between what is predicted and what is actually encountered. 4E cognition (extended, embodied, embedded, enacted) maintains that this predictive mechanism is not confined to the brain but includes aspects of the cognizer’s environment (Barrett, 2018). Thus, opening the door involves predictions of a door handle that will respond to action and involves taking the action itself, not merely thinking about turning the handle. This suggests a pragmatic interaction with the environment as opposed to a computational representation as data in, data processed, data stored, and data out. A door-opening habit results in “knowing” the physicality of opening the door through experiential interactions with the door, and if at some point the door does not respond in accord with its expected physicality, then there is surprise. It is a way in which beliefs are being enacted. For an infant, the world is full of surprise and every new experience is a discovery. A knowledge of the world is not required, merely expectation. When expectation of nothing is found to be at odds with experience of something, there is surprise. As more experiences of similar somethings occur, surprises occur less often and generality increases, resulting in knowledge:
The first new feature of this first surprise is, for example, that it is a surprise; and the only way of accounting for that is that there had been before an expectation. Thus it is that all knowledge begins by the discovery that there has been an erroneous expectation of which we had before hardly been conscious. (EP 2:88)
Thus, for the infant, their very first surprise experiences are suggestive of something more: they are abductive in nature, resulting in inquiry or further exploration, typically by physical interaction with their environment. As these experiences become “known” through habit (repeated similar interactions), the infant can induce the likelihood of an experience and make more successful predictions, with less surprise. As the child grows in their ability to reason, they are then able to make deductions based on their knowledge. Notably, this natural process of cognitive development begins with abduction on experience, not deduction from knowledge:
Deduction proves that something must be, Induction shows that something actually is operative, Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about. (EP 2:216)
If the newborn experiences phenomenal beliefs through interactions with their totally new environment, the only necessary condition for learning is that they have expectation. As they encounter surprise, abduction begets prediction, which over time yields less surprise. Habitual prediction and confirmation fertilize the forming of connections and the spreading of ideas, which become less surprising and more general. This is a process of learning in which the least surprising concepts become knowledge. For Peirce, abduction was not merely an additional form of reasoning but rather a way of interacting with the environment that foreshadows perception of the world:
. . . the abductive faculty, whereby we divine the secrets of nature, is, as we may say, a shading off, a gradation of that which in its highest perfection we call perception. (EP 2:224)
This perception is not a skull-bound representation but rather an ongoing relation between thinking and acting, a cognition that has practical effects, which lies at the very heart of pragmatism:
. . . the most striking feature of the new theory was its recognition of an inseparable connection between rational cognition and rational purpose; (EP 2:333)
Thus a view of abduction that embraces phenomenal beliefs is foundational to learning; it is a pragmatic abduction inextricably tied to interaction with the world, without which the world cannot be known.
References
Barrett, L. (2018). The evolution of cognition. A 4E perspective. In A. Newen, L. de Bruin, & S. Gallagher (Eds.), Oxford Handbook of 4E Cognition. Oxford University Press.
Bergman, M. (2007). Representationism and presentationism. Transactions of the Charles S. Peirce Society, 43(1), 53–89.
Boole, G. (1862). On the theory of probabilities. Philosophical Transactions of the Royal Society of London, 152, 225–252.
Brian Noble, N. A. (1989). Peirce’s definitions of continuity and the concept of possibility. Transactions of the Charles S. Peirce Society, 25(2), 149–174.
Bruza, P. D., Wang, Z., & Busemeyer, J. R. (2015). Quantum cognition: A new theoretical approach to psychology. Trends in Cognitive Sciences, 19(7), 383–393.
Busemeyer, J., & Bruza, P. (2012). Quantum Cognition and Decision. Cambridge University Press.
Chalmers, D. J. (2003). The content and epistemology of phenomenal belief. In Q. Smith & A. Jokic (Eds.), Consciousness: New Philosophical Perspectives. Oxford.
Colyvan, M. (2004). The philosophical significance of Cox’s theorem. International Journal of Approximate Reasoning, 37, 71–85.
Cooke, E. F. (2011). Phenomenology of error and surprise: Peirce, Davidson, and McDowell. Transactions of the Charles S. Peirce Society, 47(1), 62–86.
Gabbay, D., & Woods, J. (2005). The Reach of Abduction: Insight and Trial (A Practical Logic of Cognitive Systems, Vol. 2). Elsevier.
Haack, S. (2005). Not cynicism, but synechism: Lessons from classical pragmatism. Transactions of the Charles S. Peirce Society: A Quarterly Journal in American Philosophy, 41(2), 239–253.
Hausman, C. R. (1997). Charles Peirce and the origin of interpretation. In J. Brunning & P. Forster (Eds.), The Rule of Reason: The Philosophy of C.S. Peirce (pp. 185–200). University of Toronto Press.
Havenel, J. (2008). Peirce’s clarifications of continuity. Transactions of the Charles S. Peirce Society, 44(1), 86–133.
Hume, D. (2003). A Treatise of Human Nature. Oxford University Press. (Abbreviated as T).
Kirchhoff, M. D., & Robertson, I. (2018). Enactivism and predictive processing: A nonrepresentational view. Philosophical Explorations, 21(2), 264–281.
Malin, S. (2002). Nature Loves to Hide: Quantum Physics and the Nature of Reality, a Western Perspective. Oxford University Press.
Pitowsky, I. (1994). George Boole’s “conditions of possible experience” and the quantum puzzle. British Journal for the Philosophy of Science, 45, 95–125.
Rocknack, S. (2021). Regularity and certainty in Hume’s Treatise: A Humean response to Husserl. Synthese, 199, 579–600.
Thompson, V. A., Newman, I. R., Campbell, J. I., Kish-Greer, C., Quartararo, G., & Spock, T. (2021). Reasoning = representation + process: Common ground for fuzzy trace and dual process theories. Journal of Applied Research in Memory and Cognition, 10, 532–536.
Zahavi, D. (2003). Intentionality and phenomenality: A phenomenological take on the hard problem. In E. Thompson (Ed.), The Problem of Consciousness: New Essays in Phenomenological Philosophy of Mind (Vol. 29, pp. 63–92). Canadian Journal of Philosophy.
The Foundations of Creativity: Human Inquiry Explained Through the Neuro-Multimodality of Abduction
59
Jordi Vallverdú and Alger Sans Pinillos
Contents
Introduction: Morphological Determination and Cognitive Inevitability
The Key Role of Abduction as Cognitive Switch
From Complex to Minimal Abduction: A Genealogy
The Eco-cognitive Perspective on Minimal Cognition
The Cognition of Creativity
Cognitive Triggers: Surprise, Ignorance, and Interest
Cognitive Constraints
The Neuro-cognitive Basis of Abduction and Related Bioinspired Computation
Mirroring (Artificial) Abduction: Machine Learning
Conclusions
References
Abstract
This chapter offers arguments in favor of a morphological characterization of situated abductive processes in perception, considering them as mechanisms of adaptation to the varieties of experience. The mechanism analyzed in this essay is creativity. The thesis defended in this chapter is that the human being maintains a constant hypothetical openness in order to adapt to the uncertainty of the future. Characterizing creative processes using abduction means analyzing this phenomenon from morphological bases. Therefore, the inevitability of creativity defended in this chapter is situated in the neurochemical and morphological bases of perception to characterize the adaptive dimension of creativity through abduction. The starting point of this proposal is the EC-Model of abduction: a contextualized interpretation of classical pragmatism from the naturalization of cognitive processes. Abduction is proposed as the simple mechanism of hypothesis generation and selection. This mechanism can be considered present in all degrees of human experience: the generation of epistemic content is grounded in natural, biologically based adaptive processes. Therefore, there is a coupling between embodied mechanisms of cognition and sociocultural values. Because of its fundamental value, the implementation of abductive mechanisms in Machine Learning systems is explored, especially in the most recent Deep Learning models.
Keywords
Determinate indeterminism · Abductive cognition · Neuro-multimodality · Inevitable creativity · Human inquiry
J. Vallverdú, ICREA Academia – Department of Philosophy, Autonomous University of Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
A. Sans Pinillos, Department of Humanities – Philosophy Section, University of Pavia, Pavia, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_71
Shapeless beings the sea vomits up
Pushed in heaps onto putrid beaches
The earth hosts the murky herds
Crawling, they climb over their own kind
And time will change the flabby bodies
Into forms useful for survival
(Banco del Mutuo Soccorso, 1972, L’evoluzione, 04m26s)
Introduction: Morphological Determination and Cognitive Inevitability
During the second half of the twentieth century, the revolution in the cognitive sciences brought a new way of understanding cognition. It implied the naturalization of the field, looking for the natural mechanisms that could explain human thinking. The resulting field is also summarized as 4E Cognition (for Embodied, Embedded, Enacted, and Extended), which reintroduced body and environment into the cognitive equation (Newen et al., 2018). Classic top-down symbolic approaches to cognition were heavily affected and criticized by these new perspectives. This situation changed with technological revolutions in neuroscience such as scanning technologies. With the possibility of checking brain functioning in vivo, new data about brain performance became accessible, paving the way for new revolutions. The most salient of these data made it possible to understand the fundamental role of emotions in cognition (Damasio, 1994). Such work opened the understanding of cognitive processes as bodily processes, as interactions of a soup of hormones and neurotransmitters created to manage social interactions. That revolution also had an impact on the robotics and Artificial Intelligence fields
(Kurzweil, 2000), opening the path to the areas of affective computing (Picard, 2000) and social robotics (Breazeal et al., 2016). In any case, the successes of neuroscience bred a bias: an anthropomorphization of the field, as well as a neurocentrism. In addition, the area experienced a lot of hype, because amusing (but serious) experimental biases such as the one shown by the Dead Salmon fMRI study (Lyon, 2017) could not be reduced. Nevertheless, despite the influences of both anthropocentrism and braincentrism, the field has evolved thanks to the research on minimal and plant cognition (Calvo Garzón, 2007), opening new ways to understand cognition as both natural and social processes (Vallverdú et al., 2018). The analysis of the evolutionary and functional basis of cognition, from minimal to highly complex systems, helped to understand the role of morphological constraints in the process of knowing. The primary elicitors of this revolution were the demands of robotics researchers, who were trying to design a scenario that allowed a new step toward the creation of intelligent machines (Pfeifer & Bongard, 2006). Therefore, the role of morphologies was crucial for understanding the properties of a cognitive system. This statement had another reading: the morphology afforded specific ways of processing information and, therefore, of creating cognitive mechanisms. From this naturalistic and evolutionary perspective, the authors affirm that cognition is inevitable (although dynamically evolvable, in some cases). Furthermore, according to local environmental variations, such morphologies can act differently as an ecological constraint that shapes cultures (Nisbett, 2004). Therefore, how living or artificial cognitive systems can select, process, and react to information is morphologically constrained.
That morphological design makes social interaction and intentional behavior possible, which is not a “learned mechanism.” Throughout human history, the heated debate about freedom vs. determinism has been conducted from ethical and religious perspectives. The only existing determinism is the cognitive horizon mediated by morphological constraints by which human beings feel and understand the world. It is essential not to separate the processes of knowing and feeling because both processes are intrinsically connected in experiencing the world and ourselves (Koch, 2019). This chapter approaches biological determinism as the determination to have an indeterminate cognitive predisposition based on abduction. Such a predisposition is abductively articulated because a weak but necessary relation between knowing and feeling is embedded in perception as a system of constant openness to experience. Abduction is traditionally understood as a form of reasoning: a complex inferential cognitive process. One of the traditional forms of this reasoning is scientific discovery. As is well known, abductive reasoning was recovered by Peirce from the Aristotelian apagōgē (ἀπαγωγή) (Aristotle 1957: An. Pr., II, 25, 69a20–35) as the third syllogism (Peirce, 1958, CP: 5.14–40): the hypothetical or possible inference. As the authors will explain in the second section, abduction is the cornerstone of pragmatism: a philosophical theory that assumes an ontological perspective of the unfinished world. Therefore, it is a philosophical system to face the uncertainty of becoming. One of the main characteristics of Peirce’s abduction is that it is oriented to complete and to characterize perceptual judgment
(Peirce, 1958, CP: 5.348). This characteristic has notably influenced current inquiries on abduction. For example, this form of inference was recovered to disentangle the debate in the philosophy of science on the context of discovery. Nevertheless, an analysis of pragmatism may offer clues to reinterpret abduction as a mechanism of hypothesizing inherent in perception. This pragmatic perspective is approached from an epistemological holism. The main idea is to show how the most complex forms of hypothesizing depend on the most basic ones and vice versa. In this way, it can be argued that there are microprocesses of hypothesizing in perception. This view is defended from different perspectives of classical pragmatism. However, Peirce and Mead are the authors who treated it in more depth. The reason for using Peirce’s theory for the question of hypothesizing processes in perception is his characterization of the abductive process as situated in firstness: a non-cognitive, inexplicable, and immediate qualitative state of feeling or sensation. Likewise, Mead emerges as an indispensable author to situate the category of firstness in perception. The primary motivation for relaunching this genealogy of classical pragmatism toward a foundation of the abductive processes of perception lies in the need to broaden the philosophical tradition from which the current debate on abduction is articulated. The recent perspective on abduction from which our proposal is articulated is the EC-Model of abduction: an enactive cognitive proposal based on contextualization and anthropomorphization of the agents who know as they interact with the environment. The critical point is that the anthropomorphization of cognition is based on the naturalization of logic: abductive management of heuristic information.
The proposal to apply this model of abduction to perception implies extending the influence of factors such as emotions, feelings, and narratives to characterize hypothetical manipulation as an unreflective process prior to cognitive ones. Section three attempts to characterize many abduction-based cognitive hypothesizing strategies as creative processes. Two forms of creativity are distinguished: genuine generation and divergent adaptation to surprising situations. The second type of creativity can be characterized abductively in phases of complex reasoning as divergent hypothesizing in the face of methodologically surprising cases. It is also possible to understand divergent hypothesizing processes in perceptual phases, for example, unreflective resources that arise during manipulation. In the same section, triggers and constraints are characterized: both articulate abductive processes. They are logical and cognitive mechanisms that represent the operational bivalence of morphological and cultural aspects in defining the margins of our reality (constraints) and the process of change (triggers). What has been said so far allows us to argue in section four that a naturalistic and evolutionary perspective needs to incorporate abduction as a morphologically embedded mechanism. Thus, abduction appears as an embodied mechanism for explaining creativity. Consequently, because of the fundamental role of creativity in the epistemic process of knowledge generation, it must be assumed that abduction plays a fundamental role in creative processes. A proof of this lies in the attempt to capture this epistemic dimension from evolutionary and cultural perspectives for implementation in AI systems. As shown at the end of
section four, bioinspired mechanisms of abductive reasoning are a fundamental key to the phylogenetic understanding of cognition, which must include abduction.
The Key Role of Abduction as Cognitive Switch
From Complex to Minimal Abduction: A Genealogy
As the previous section has shown, there is a very particular form of determinism: living beings are determined to maintain an indeterminate state of adaptation. The two biases presented, neurocentrism and anthropocentrism, point to the problem of embodying cognition from a single definition. In other words, there are many bodies and many ways of being in relation to the environment. This statement goes a step beyond adaptability in speciesist terms and focuses attention on the individual. In this sense, biological, social, and accidental circumstances imply substantial environmental differences. Biological (genetic) circumstances are the conditions inscribed in the DNA (genotype) and their phenotypic expression as modulated by the environment. There are also losses of faculties that the genotype may determine. For example, it is necessary to have at least one of the celiac disease alleles (HLA-DQ2 and/or HLA-DQ8) for the disease to manifest. However, the disease is not active when it is not phenotypically expressed. On the other hand, accidental circumstances are the loss of psychological and physical faculties. The loss of faculties can be natural and progressive, caused by oxidation (old age), or abrupt, as with accidents (injuries, contusions, etc.), intoxications, and diseases. Finally, social circumstances are the cosmovision in which the human being lives: the natural environment measured and transformed by the actions of people who are part of the same context. This situation highlights the need to address the anthropomorphic and neural issue of cognition in a way that considers this degree of divergence between living beings of a given species. Thus, the challenge is to characterize a human’s embodiment of cognition (anthropomorphic and neural) that does not fall into the biases mentioned above.
This chapter takes abduction as the genuine mechanism for managing cognition’s neural and anthropomorphic connection with living beings’ open and constant relationship with the environment. In particular, the authors are interested here in the role of abduction in what has been defined as “minimal cognition”: the characteristics of all living systems cover a vast cognitive spectrum that fills the gap between the mindful and the mindless with respect to interacting actively with the world, which requires an embodiment consisting of a sensorimotor coupling mechanism that subsumes an autopoietic organization. Opening the door to the dimension of minimal cognition compels a kind of genealogy of abductive cognition. This is because the contemporary debate on abduction has been directed toward complex cognitive processes, such as discovery, hypothesizing, and guessing (Gabbay & Woods, 2005; Aliseda, 2006). Typically, situations that trigger abduction are understood as surprising, puzzling, etc. Classic examples would be Le Verrier’s discovery of the planet Neptune (Sans Pinillos, 2017) and
Kepler’s extrapolation of the elliptical orbit of Mars to the Solar System (Hanson, 1972: 72–85). Likewise, complex abductions are also generated in other, more common situations. For example, the way a person dresses can lead us to think about his or her profession (Thagard, 1988: 54–56). Similarly, factors such as skin color, hairstyle, way of dressing, moving, and acting (smoking, speaking loudly, etc.) can trigger hypotheses that determine a course of action. Examples are “not crossing a street,” “not passing through an alley,” “not renting an apartment,” or “accusing someone of a crime.” An autobiographical episode in which Peirce had his watch stolen shows how some of these factors can be (abductively) crucial in determining who the thief was (Sebeok & Umiker-Sebeok, 1983). Of course, these factors and many others can come from prejudices (racism, aporophobia, homophobia, etc.), assumptions, emotions, and feelings. However, it is essential to see that the final result is a complex process of hypothesizing in which abduction is the cornerstone. For this reason, this inference is often referred to as abductive reasoning: the mechanism by which surprising situations (those that cannot be reacted to in the usual way) can be tentatively managed from a non-classical epistemic virtue (in short, one which does not participate in the classical processes of verification, justification, falsification, contrasting, etc.). From this perspective, abduction is concerned with managing the experience of novel facts through a hypothesis generation process. The role of abduction can be better understood with a summary of its development in the contemporary debate. Abduction was one of the concepts that were inherited along with the aporetic debate on the context of discovery. This problem stated that it was impossible to represent discovery because it was not logical but psychological.
Therefore, only those results that could be justified were considered scientifically relevant (verifiable): scientific language was the best to represent true knowledge because of its descriptive capacity (Niiniluoto, 2014: 378). This situation changed with the possibility of formalizing synthetic (ampliative) reasoning brought about by the computational paradigm and the emergence of AI. One example is formalizing sensible knowledge by describing and automating heuristic processes (Simon, 1985). The programs BACON, GLAUBER, STAHL, DALTON (Simon et al., 1997), and the EURISKO system (Lenat & Brown, 1984) were models of human discovery based on heuristic relations built on ampliative inductive inference. The motivation for bringing heuristics into creativity is its functional connection to abduction. These two concepts have been intimately related to the discovery debate. The reason for this lies in the need to rationalize abduction: to find a way to represent hypothetical inference. For this reason, heuristics is characterized as “the method -neither totally rational nor blind- of discovery to characterize the selective search with reliable results” (Aliseda, 2006: 16). Research in computational science and AI has led to the conceptualization of the role of heuristics in the discovery process. The relationship between discovery and heuristics stems from the classical formulation of the relationship between (universal) axioms (analysis) and sensible knowledge (synthesis) (Hintikka & Remes, 1974). In other words, it aims at establishing a kind of dialectical relationship that allows explaining how the general
rule (heuristic) is extracted and experimentally confirmed. Likewise, once confirmed by experience, the regularities are understood as universal rules (Sans Pinillos, 2021). From this perspective, abduction is understood as the inference proper to the hypothesis (Thagard, 1988: 51–52): the mechanism that can relate the regularities extracted from experience (induction) to universal axioms (deduction) and vice versa. This first reinterpretation of the discovery debate, made from inquiries on computation and AI, has allowed us to consider heuristics as more than a way of relating information. Thanks to the unification of AI and cognitive science, the possibility has also arisen to understand some cognitive processes as heuristics. This unification has made it possible to address a fundamental challenge: a formalization of ampliative inferences (induction, inference to the best explanation, abduction, etc.) that considers the cognitive richness of complex processes such as discovery. This challenge has forced the incorporation of additional elements hitherto considered spurious because of their lack of descriptive content (Putnam, 2001: ch. 1). The generation of novelty and creativity are two cases that cannot be explained through the accumulation and combination of information (heuristic or diagrammatic) (Boden, 2004: ch. 8). At this point, a new approach to “being there” (Dasein) emerges, in which the classical characterization of abduction plays a predominant role. This final consequence is directly linked to abduction within the actual Pragmatist project.
Classical Pragmatist Holism
Pragmatism is a philosophical theory of managing a relationship with a world conceived as unfinished. Pragmatism assumes the ontological thesis that the world is known from a continuous flow of information. For this reason, Pragmatism is essentially fallibilist: continuous experience incites us to maintain an attitude of provisionality toward the knowledge that humans possess so that it is possible to revise and adapt our beliefs to the new circumstances experienced (Sans Pinillos, 2021). Thus, abduction (as the mechanism of Pragmatism, cf. Peirce, 1958, CP: 5.180–212) is, prima facie, an open-ended inferential process of hypothesis generation and selection (Kapitan, 1997) focused on managing the variations of contingency. As mentioned above, abduction is the mechanism of hypothetical inference by which surprising situations are managed from a non-classical epistemic virtue. As is well known, Peirce places abduction at the heart of his pragmatic theory precisely in order to ground sensible knowledge (Peirce, 1958, CP: 5.348). He characterizes abduction as the continuous mechanism of generating hypotheses to infer a reasonable conjecture (Peirce, 1958, CP: 2.619–644) that allows an open yet controlled approach to what is still unknown. To properly understand the pragmatist project, it is crucial to understand that hypotheses have a relevant epistemic role precisely because they arise in situations where no other, more specific type of knowledge can be expected. As has been said, these are situations of surprise, bewilderment, etc., in which the classical epistemic process cannot reach a satisfactory resolution. Abduction generates knowledge to complement percipient moments (Peirce, 1958, CP: 5.41–56). Thus,
J. Vallverdú and A. Sans Pinillos
knowledge is not a passive act-reflection but an activity that all human beings develop in our interaction with the world. This does not annul the possibility that there are other, simpler processes at work within the complex epistemic processes, which the authors call active act-reflection: unconscious intentional reactions. For example, emotions or feelings predispose the agent to interact from different affective modes. The reason for highlighting Peirce here from among the pragmatists is to emphasize that adaptivity is found in the early formulations of abduction. Peirce’s adaptivity is manifested through language, which maintains an iconic semiotic relation of resemblance between signs and meaning (Dingemanse et al., 2020: sec. 2.2). One of the attractions of resemblance is its potential to relate things that humans do not yet know about their icon. Therefore, it can be affirmed from Peirce’s Pragmatism that all languages are iconic in the sense that they function by assuming their potential capacity for resemblance. However, such resemblance also introduces some biases, such as the belief in supranaturalistic agents, due to the cognitive mechanisms of anthropomorphization of reality. Consider, for example, pareidolia or the transfer of analysis from social to natural events (Willard et al., 2022): both are examples of innate social cognitive mechanisms used by human agents to make sense of the external world. Likewise, inquiry in robotics has opened debates on anthropomorphism, situational relevance, and interaction with external entities and the environment (Müller & Hoffmann, 2017). One way to push the limits of these biases is to take the more general perspective of pragmatist holism.
The intention is to show that the generation of each agent’s cosmovision is closely linked to both the worldview and the cosmovision of the general community to which one belongs (cosmovision is understood as the unified image of the world, the product of the sociocultural and biological interpretation of the environment, and worldview as the image generated solely from perception) (Magnani et al., 2021). Therefore, mediation with the environment is highly influenced by biological and social factors that the agent cannot control and that he or she often does not take into consideration or is simply unaware of. Putnam picks up pragmatic holism:
1. Knowledge of particulars (facts) presupposes knowledge of theories.
2. Knowledge of theories presupposes knowledge of (particular) facts.
3. Knowledge of facts presupposes knowledge of values.
4. Knowledge of values presupposes knowledge of facts.
5. Knowledge of facts presupposes knowledge of interpretations.
6. Knowledge of interpretations presupposes knowledge of facts.
(Points 1–4: Putnam, 2001: 136–137; points 5–6: Putnam, 2006: 33).
As can be seen, knowledge of facts depends on theories, values, and interpretations. In other words, a “fact” is interpreted, so that: 1) every fact is experienced from a theory, and 2) every theory presupposes and assumes some facts. 3) Every theory presupposes values that allow it to justify, falsify, corroborate, etc., the facts. 4) Therefore, every value about something presupposes a fact. 5) All knowledge of facts is based on interpretation, and 6) all interpretation presupposes a fact. In
other words, perception is full of theories, values, and interpretations that determine what humans know as facts. The critical point is that the “presupposition” in factual knowledge is given in experience and can be characterized abductively. This statement allows us to anticipate complex discovery processes and assume that there are forms that the authors call simple hypothesizing: adaptive conjectures that are not necessarily linked to a systematized intellectual activity to increase knowledge of a given topic.
Simple Hypothesizing and Adaptive Conjectures
It is possible to ground this perspective from classical approaches to Pragmatism. As noted above, Peirce characterizes abduction as the process of experimentation. For this reason, abduction is a mechanism of continuous and open-ended generation of reasonable hypotheses. In this way, facts suggest hypotheses (Peirce, 1958, CP: 7.202), which propose lines of action to direct the theoretical, evaluative, and interpretative background. Mead made profound reflections on action from a pragmatic point of view. He assumed an ontology of a world of events, in which the present was perceived as the becoming and disappearing of an event (Mead, 1932). For Mead, the experience of events is action-based perception and manipulation. Briefly stated: “there can be no society without selves, no selves without minds, and no minds without embodied social interaction” (McVeigh, 2020). To situate the body as the central axis of biological interaction with the social environment, Mead distinguished two sensory experiences related during experimentation: distance sensations (smell, sight, hearing) and contact sensations (Lewis, 1981). Perceived objects arise from experience-oriented purposes of use that guide action. It is interesting to highlight Mead’s theory because it encompasses the whole dimension of interaction with the social and natural environment, down to the very basis of experience: perception. The basic process of extracting information is defined as a highly hypothetical system even in temporal perception: the “present facts” are the conjugation of the past (the instant that disappears) with the present experience, which configures a projection toward a new line of action (becoming). Although there is no room to develop this point in this chapter, it is relevant to what follows.
Dewey’s pragmatic approach to human nature and behavior helps to understand better the role of values for factual knowledge: causal perceptions oriented to future consequences are moral because they make claims about how things should be (appealing to justice, order, the good, etc.) (Dewey, 1930: 18). Thus, the theories and models with which human beings know the world depend on their workableness to achieve the hypothetically stated end (James, 1987: 826). Nevertheless, it is essential to consider all these factors that determine both the ways of knowing facts and the facts themselves as social and cultural constructs that participate in how humans experience the environment on a biological scale. The reconstruction made here from different perspectives of classical Pragmatism allows us to apply the debate on abduction to the discussion of minimal cognition. Similarly, it is possible to link the perspective defended in this chapter to contemporary abduction theories. This is possible because both perspectives have the same pragmatic roots. Therefore, it is plausible that the same theoretical
foundation found genealogically to ground minimal cognition from classical Pragmatism is also in contemporary abduction theories. Moreover, considering that the contemporary debate on abduction is grounded on the ideas of classical Pragmatism, it is also plausible to think that the same elements will be found in it, allowing us to raise related questions about minimal cognition in classical Pragmatism. An excellent starting point for this chapter’s project is the eco-cognitive model of abduction (aka EC-Model). As presented below, it is a model that allows us to approach the question of cognition from a situated and embodied perspective that emphasizes contextualization. Thus, it allows addressing the question of hypothesizing and discovery from the interaction with the environment. Therefore, it is also an excellent theoretical vantage point from which to examine determinism more closely while maintaining an indeterminate perspective at the scale of simple hypotheses and, therefore, to implement minimal cognition.
The Eco-cognitive Perspective on Minimal Cognition
The proposal of this chapter is articulated with the EC-Model of abduction: cognition is [contextually] embodied and the interactions between brains, bodies, and external environment are its central aspects (Magnani, 2017: 207). The pragmatic viewpoint assumed in the EC-Model is a naturalistic perspective of cognitive processes such as reasoning, which allows conceptualizing the generation and selection of hypotheses as mechanisms of adaptation to varieties of experience. This model emphasizes the contextualization and anthropomorphization of reasoning. In a general sense, the main idea that defines the EC-Model is that both the agent and the environment are modified during experimentation. In this perspective, it is assumed that part of human cognition is structured to adapt to the various ways of experiencing contingency. Therefore, the varieties in perception do not allow us to conceive reasoning as a definite and structured form of inference. On the contrary, these varieties allow us to conceive reasoning as deterministically open (aka indeterminate): a cognitive system that tends to make the most of the resources at its disposal. As has been said, the EC-Model allows us to synthesize the essential features of Classical Pragmatism from a naturalized philosophical perspective. This is crucial for this work. The main reason is that the EC-Model allows us to update Pragmatist reflections on the most basic layers of perception. In this chapter, perception is characterized as a cognitive process of information extraction that is deterministically open-ended: the active extraction of information and elaboration of representations not only tends to make the best use of the resources at its disposal but also adapts sensitively to the changes that may be experienced. Thus, new forms of perception may occur. For example, introducing the notion of number helps children to discriminate, select, temporize, etc.
(cf. the Pirahã language as a counterexample) and to create ethical relations (consider when a child understands that they have more candy than their friend). Then, the pragmatic maxim of considering that knowledge is defined from the practical effects of objects (Peirce, 1958, CP: 5.402) can be
approached from a cognitive perspective of perception. Furthermore, the EC-Model of abduction translates the iconic relation-based resemblance property of Peirce’s theory into the form of hypothetical reasoning. From this perspective, abduction can be conceived as a process of adaptation to an environment composed of a constant flow of information in which experiences proliferate. As is known, Peirce incorporates abduction into perceptual judgments through the reasoning (syllogism) of experimentation. Although Peirce’s primary concern is structurally cognitive (the relation between abductive-perceptual process and judgment) (Tibbetts, 1975: 229), his analysis of perception allows the introduction of non-cognitive and immediate elements that participate in the subsequent abductive process. Phenomenologically, Peirce recognizes this gnoseological stage in firstness (the information given immediately in experience). The category of firstness is characterized as non-cognitive: it is an inexplicable and immediate qualitative state of feeling or sensation (ibid.: 223). Likewise, Mead goes further into how the contextualization of biological needs determines perception. This reflection is situated at a non-cognitive stage of perception: perceiving is an act that is not thought (ibid.: 227). On the contrary, immediate perception predisposes (anticipates) the way objects will be experienced and, therefore, the form of perceptual judgment. In other words, both the agents’ perception and the factual context determine the hypothetical generation and selection of lines of action. The abductive relationship between the practical dimension and the hypothesized product’s truth (Magnani, 2017: 15) can be explained through pragmatic holism: any reasoning that predominates in each circumstance is supported by the rest of the inferential cognitive resources. 
On the one hand, the EC-Model emphasizes the pragmatist idea that the diversity of inquiry forms requires a reasoning effort on the agent’s part. On the other hand, it is possible to incorporate the non-cognitive elements integrated into perception (action) that trigger subsequent cognitive mechanisms. In both cases, the specificity of the context and the moment in which agents inquire are essential. However, the second case deals with how the biological dimension of the agent determines the hypothetical approach to, for example, perceive novel cases.
Naturalization of Logic as Anthropomorphization of Cognition
The EC-Model of abduction is situated within the project of naturalization of logic (Magnani, 2017: 140), in the sense of fixing attention on the cognitive dimension of the agent to define logical systems. It is possible to identify the naturalistic project of logic with the translation of the epistemic issues of scientific investigation into their cognitive dimension (ibid.: 1-1n). For this reason, elements such as heuristics are introduced. It is common to refer to Polya’s notion of heuristic reasoning to discuss this concept. One example is discovery (Polya, 1971: 113). From a cognitive point of view, heuristics can be considered a sophisticated type of inference that occurs as an auxiliary mechanism when someone cannot reach specific knowledge. It is common to characterize the process of abductive hypothesizing as applying heuristic strategies (cf. Magnani, 2017: 57–60; Amra et al., 1992). This perspective understands hypothesizing as relating information in the face of a novel scenario. Therefore, it is an inferential resource for managing problem-solving situations.
Likewise, if attention is paid to the primary dimension of action, another aspect of heuristics emerges. It is possible to understand heuristics as a minimal cognitive strategy that manifests itself through interaction with the environment. In other words, it is an immediate and permanent resource of perception that determines the indeterminate way of experiencing the constant flow of information. The perspective offered here makes much more sense within the program of naturalized abduction offered by the EC-Model. As mentioned, one of the strengths of this theory is the contextualization of cognition: contemplating the conditions under which experimentation (perception and perceptual judgments) takes place. The EC-Model can be equated with Peirce and Mead’s process of perception because it allows a) analyzing the complex processes of judgment (abduction as reasoning), and b) analyzing the processes inherent in the act of perceiving. Demonstrating this last point implies an extension of the EC-Model. It can be considered that the agent becomes part of the context because it is also part of the environment. After all, agents are modified by their actions in a particular context. In other words, every agent is always part of the context of others. For this reason, it can be stated that the naturalization implied by the EC-Model is based on an anthropomorphization. This is a systematic integration of human capacities within an embodied conception of cognition that contemplates α) the general and concrete characteristics of bodies and β) the possibilities and limits that arise from interacting with the environment with this (type of) body. For this reason, the naturalization of logic also means assuming the pragmatist postulate that relates action to reasoning. This relation follows from the fact that phenomena such as attention do not exclude other (heuristic) information, which is transformed while the agent adapts during experimentation.
Indirect information is incorporated into direct information through actions. In other words, there are indirect elements involved (unconscious, automatic, reflexes, etc.), entailing variations of different types on different scales that allow agents to adapt to the contingency of experience. In this sense, the naturalization of logic also means giving more weight to the (still) non-formalizable aspects of human reasoning. Presumably, the different types of information involved are not inferred and related using a single type of reasoning or a single pattern of inference. Therefore, the pragmatic postulate that operates at the epistemic scale to accommodate variations in perception (Shanahan, 2005) could be considered to be articulated from minimal cognitive mechanisms that integrate the basic cognitive processes that relate the action of perceiving to the action of thinking about perception. Let us take, for instance, an archaeological investigation in which it is necessary to hypothesize a plausible scenario to make sense of some remains found. Archaeologists need to use resources such as imagination, creativity, etc., to generate hypotheses to develop their investigations. In particular, the hypotheses that archaeologists come up with must offer an explanation that makes sense of the objects they study and their distribution in the place where they have been discovered. In this hypothesizing process, a series of mental templates, in the form of recognition patterns, lets archaeologists perceive and observe phenomena during the archaeological investigation (Shelley, 1996). It is important to note that Shelley’s
characterization of abduction using recognition patterns is based on the process of conscious inference. The example of archaeology allows the authors to discuss both processes regarding abduction in perception and abductive reasoning. Usually, abductive reasoning is characterized as an under-coded process (Meyer, 2010): a strategy to find a hypothesis for a fact that has surprised us. There are, on the other hand, over-coded abductions (Eco, 1983): the unconscious (Schurz, 2008: 207) and instantaneous (Magnani, 2001) process that occurs in perception. Just as under-coded abduction is triggered by a situation that classical epistemic processes cannot answer, over-coded abduction is triggered because cognitive resources manifest themselves in the form of interaction, which humans adopt in the face of suggestions offered by an object or fact. Factors such as memory, sensations, emotions, narratives, and feelings predispose how something unknown will be perceived because they will influence the final form of the interaction. Returning to the case of archaeology, the artifact that archaeologists want information about triggers a series of under-coded abductive processes: the hypotheses are aimed at solving a defined problem of ignorance. Likewise, during the manipulation of that artifact, over-coded abductions occur. In this sense, the characteristics of the object suggest affordances that propose ways of interacting with it (Withagen & Costall, 2021; Sans Pinillos & Magnani, 2022). For example, a fissure predisposes the archaeologist to trigger the hypothesis of the object’s fragility, which predisposes the agent to act carefully. The authors call this type of action hypothetical-irreflexive. This means that human perceptions are biologically determined to maintain an open interaction with the environment. As seen, it is possible to manage this indeterminacy in perception through the EC-Model of abduction (Magnani, 2017: 15).
However, it is essential to differentiate between indeterminacy and unlimitedness. While indeterminacy is an inherent biological condition in perception, unlimitedness is a circumstance subject to the context of the perceived fact. In other words, it can be said that there is no definite number of patterns for interacting with reality, although a limited number of them usually apply. Through abduction, it can be shown that the situation is quite the opposite. Using the idea that perception is based on agents’ actions, infinite combinations arise between the different elements that make up the environment. The reason is that this process involves the different cognitive dispositions of the agents interacting with their context. In other words, the ways agents may consider responding, and the form of these responses, are part of the world because they define the ways of perceiving and knowing. Section four shows that this active predisposition to manage contingencies of varying degrees and magnitude, as perceived and conceptualized, may be related to creative processes.
Hybridization of Cognitive Processes: Multimodality
Therefore, abduction can be conceived as something that integrates the agents’ interactive predispositions with their environment. In this way, the context is closely related to experiencing the facts. Therefore, its possible change also comes from
how agents deal with different situations. There are four requirements that abductive reasoning must meet in order to be conceptualized from the EC-Model proposal:
1. It must optimize the different situations we live in (optimization of situatedness).
2. There is a mutability (permutation) between the roles (input/output) of the elements (maximization of changeability).
3. Abduction is sensitive to absorbing information from what is presented to us continuously (information-sensitive).
4. It is necessary to enrich the inferential system we have in order to meet all these requirements (Magnani, 2017: 138–139).
Therefore, a multimodal and non-monotonic system is necessary, one which considers how the information is presented to us, the means that the agents have to apprehend it, and, in addition, the possibly new information generated through inferences (ibid.: 139). Hybridity is a correct way of referring to this system. In a strict sense, the proposal defended in the EC-Model of abduction is a sophisticated version of Hintikka’s (2007) selective and creative abduction. Also, this model contains many of the elements with which Thagard defined abduction. For example, abduction is a component in discovering a hypothesis and an essential element for justification (Thagard, 1988: 52). In addition to selective and creative abduction, the EC-Model includes manipulative and theoretical abduction, and sentential and model-based abduction (Fig. 1). Theoretical abduction is dominated by an internal relationship between our knowledge and the cognitive schemas and strategies to acquire new information. Therefore, theoretical abduction is the reasoning in which creativity has a more significant presence. On the contrary, manipulative abduction is dominated by the tacit application of knowledge. However, this perspective can be broadened to include situations where creative cognitive processes determine hypothetical manipulation.
For example, the relationship between emotions, feelings, and memory may suggest hypotheses about the environment that manifest themselves in the form of thoughtless behavior. Similarly, the predisposition to act in one way may determine the perception and, finally, the more complex modes of hypothesizing, such as abductive reasoning.
Fig. 1 The pattern of the EC-Model of abduction, inspired by Park (2017: ch. 2)
The Cognition of Creativity
Creativity is undoubtedly the most elusive concept affecting all human disciplines. To paraphrase St. Augustine: “What then is creativity? If no one asks me, I know what it is. If I want to explain it to anyone who asks me, I don’t know.” In many ways, explaining this phenomenon is the holy grail of current AI inquiries, especially those related to Machine Learning and Deep Learning (Nguyen et al., 2015). Surprisingly, however, the study of this ability lacks systematicity, and no simple formula for reproducing or obtaining creative skills has been found (Sans Pinillos & Vallverdú, 2021). The best empirical, evolutionary, and comprehensive approaches to creativity (Csikszentmihalyi, 1997) have shown that no single heuristic explains the rules followed by creative people in a wide range of human activities. At the same time, there seems to be bad news: most people are not creative in their everyday activities. Consequently, only a few individuals will create new avenues of knowledge or human practices in their lifetime. Moreover, analyses of creative agents in very different specialized fields, such as the sciences or the arts, do not show significantly different or similar patterns of action that make them creative. In short: creativity is a capacity that has no direct way of being achieved. This reality forces a hypothesis: there is more than one meaning of creativity. In this chapter, two meanings are explored: 1) creativity understood as the production of theoretical or physical artifacts or ideas that imply a radical novelty for humanity (a solution to a crucial problem, an artistic work, etc.) and 2) creativity understood as divergent ways of managing the environment. The authors’ interest lies in the second type. Both types of creativity are primarily social and cultural phenomena. This is because human beings are essentially gregarious and live primarily in communities.
In this sense, it is crucial to keep in mind that creativity arises from the interaction of agents in their sociocultural cosmovision. Therefore, the statement that few people are creative refers to the first type of creativity and means that few results are perceived as creative. However, both the processes by which a person achieves a creative outcome and the appreciation of that outcome as creative by the rest of society could be placed under the second type of creativity. In other words, creativity is not usually an individual phenomenon (Feyerabend, 1987). The conceptual elements with which humans build our systems always have enough erosions to break them, combine them, etc., and to use them as convenient. Abduction has a fundamental role in eliciting creative behaviors of the second type of creativity. On the one hand, creativity is closely related to the contextual conditions that agents manage with abductive mechanisms. In this sense, those mechanisms will differ from one person to another, but at their core they would share the same biological basis: abduction as a switch activator to decide positively toward innovative possibilities. The authors assume that there is also a
combination of attitudinal characteristics: stubbornness, dedication to work, focus, self-confidence, and confidence in the success of existing ideas. Again, a distinction must be drawn between what is considered creativity as a complete cognitive ability at the animal level (Kaufman & Kaufman, 2004) and the specific human ability to be creative. This distinction is fundamental to confront cognitive approaches to creativity with algorithmic and statistical ones. To understand how this process might be possible, it is necessary to clarify some neural mechanisms that explain abduction, which the authors will discuss in the next section. Sometimes, the creative process is reduced to combining information in different ways to get out of the impasse. In those cases, the creative response is a new line of inquiry. However, it is essential to note that not every new course of action is perceived as creative. For example, the creative response is often the “icing on the cake” when faced with a defined problem. Although it does not matter whether the “icing” is placed at the beginning, during, or at the end of the process, it must usually be an element that will be decisive for an outcome that was not foreseeable. However, the possibility of introducing this crucial element will be determined by many external factors. This chapter hypothesizes that the abductive mechanisms of triggering and constraining critical information for hypothesis generation and selection are sometimes related to creative processes. For example, Łukasiewicz claimed that all types of reasoning involve a degree of creativity because they include interpretations of facts using laws, generalizations, etc.
Łukasiewicz’s holistic approach to complex hypothesizing processes is structured abductively (through a reduction) because it is fundamentally constructive: information is added to make sense of something unknown (Łukasiewicz, 1970: 7). Likewise, it is possible to identify this relationship between abduction and creativity in perception. Both types of abduction will be creative if the puzzling fact triggers a divergent hypothesis. The authors use divergent to capture the novelty-never-seen: the novelty that at first may even be challenging to comprehend. Socially, something will be considered genuinely creative if it is original. However, any divergent hypothesis never experienced before will be experienced as genuine by an agent (Boden, 2013). This chapter is interested in the second type of creative product. A personal experience may best exemplify these cases. For example, one of the ways to interrupt an immobilization in Judo is for the one receiving the technique (uke, 受け) to lock the legs of the one applying the technique (nage, 投げ). It is expected that when learning immobilizations such as Kuzure-Kesa-Gatame (崩袈裟固) (nage wraps an arm around uke’s waist), nage develops a creative strategy so that uke does not manage to lock his legs: turning like clockwork with the help of the legs. This “technique” is not taught but is a resource that emerges unreflexively in almost all novices in the first fights they participate in and can be understood as a creative resource. As a second example, the authors introduce AlphaGo, the Deep Learning system created by Google’s DeepMind company to play Go, which is considered the most complex game. In March 2016, this AI beat the best-known player Lee Sedol
59 The Foundations of Creativity: Human Inquiry Explained Through. . .
(Wang et al., 2016). In Game Two, with Move 37, AlphaGo made a move that was completely unexpected for any expert human in the last 3000 years of the game’s existence. Of course, such a move changes how humans approach the game, but it did not take years of analysis: just two games later, in Game Four, Sedol made an unexpected move of his own, “Move 78,” also called “God’s touch.” Despite his losing the match, the lessons were formidable: again, machines dominated humans even in abstract games, but at the same time, the human player was able to react instinctively to the pressure and abductively chose a new strategy that gave him some advantage. Furthermore, Deep Learning started to demonstrate its overwhelming power.
Cognitive Triggers: Surprise, Ignorance, and Interest
Triggers and constraints articulate the abductive processes. Both are logical and cognitive mechanisms that represent the operational bivalence of morphological and cultural aspects in defining the margins of our reality (constraints) and in the process of change (triggers). Introducing these mechanisms allows us to address different degrees of the materiality of the interactions that humans perceive and with which they conceptualize reality. Reference is made to agents’ interaction with themselves and to the interaction between other agents, artifacts, and technological devices. As already stated, the relationship between triggers and creativity can be traced back to Csikszentmihalyi’s (1997) inquiries in psychology. In a general way, triggers can be conceptualized as the set of mechanisms that operate bivalently with the constraints, whose function is to break or expand the margins with which humans configure reality. Cognitive triggers are those emotions and sensations that generate different strategies, including abduction. They are complicated to conceptualize, mainly because there are no developed representation tools, but also because they are concepts that have more than one meaning. A classic example is surprise (Peirce, 1958, CP: 5.188–189), which can occur in many ways, intensities, contexts, etc. One way to interpret surprise is in terms of an event that violates a preexisting belief (Gabbay & Woods, 2005: 82). In this sense, surprise initiates a highly original and creative doxastic process, the purpose of which is to begin to devise something new. Examples could be when a judo player realizes how simple it can be for his legs to be blocked, or when an experiment yields data that are incomprehensible within the theoretical framework on which the inquiry is based. The attractiveness of this proposal is that this process is understood in a gradual sense as follows:
1. Showing for the first time that some element, however vaguely characterized, is an element and must be recognized as distinct from others.
2. To show that this or that element is not needed.
3. Giving distinctness – workable, pragmatic, distinctness – to concepts already recognized.
4. Illuminative and original criticism of the works of others (Gabbay & Woods, 2005: 82).
J. Vallverdú and A. Sans Pinillos
Another feature of this scheme is that it assumes Anderson’s distinction (drawn from Peirce’s) between reordering and concept creation (Anderson, 1987: 47). While reordering deals with the structure of experience, concept creation is concerned with generating new ideas and with the concept of novelty itself. To assert that only a surprise can interrupt the ordinary course of reasoning implies accepting that humans are not usually surprised. There is every reason to reject this idea. For example, the act-reflection processes (unconscious intentional reactions) introduced in this paper can be characterized as abductive processes embedded in perception to provide an adaptive mechanism capable of overcoming the surprise of experiencing new information. In other words, just as people are not always ignorant in the same way, there are different ways of being surprised, for instance, at the perceptual level. For example, the surprise of noticing something cold when it “should” be hot (a pot that was heating in the oven) is not of the same intensity as the surprise of not falling when one “should” fall (e.g., the step that has broken has a surface immediately below it). Another very creative case is the surprise of being sure that someone has survived something that “should” have been fatal. From this point of view, it is possible to consider ignorance as a second-order trigger: experimentation and the possibility of obtaining empirical knowledge are mediated by a degree of ignorance (Sans Pinillos & Magnani, 2022). Likewise, ignorance of something is one of the possible triggers of surprise. Ignorance thus generates a genuine situation that gives rise to inquiry (action) (Arfini, 2019). From the perspective of the EC-Model of abduction, ignorance is produced by cognitive interaction with the environment. Another critical but underexplored trigger is interest. There is a crucial relationship between interest and ignorance:
1. Having a particular interest
2. Understanding the meaning of an impending phenomenon as a chance
3. Putting a scenario based on a selected chance into a concrete shape
4. Running a simulation or taking action based on the scenario
5. Acquiring a new interest (Maeno & Ohsawa, 2007)
It is possible to deal with the trigger of interest in the discursive context. The question of how the particularity of the method is configured through the shared observations that generate the appreciative discursive basis can be addressed by abduction to apprehend sensibility. From the Habermasian perspective, a distinction can be made between description (Beschreibung) and narration (Erzählung). While the former refers to observed objects, the latter understands the observed objects under a concrete discourse (aka method) (Habermas, 1992: 395). This distinction allows Habermas to formulate the possible experience. Likewise, the discourse of possible facts is constructed based on the distributive property that this form of experience acquires when situated in a discursive framework (ibid.: 396). This schematization of sensibility is grounded in action: all descriptions are based on the action of knowing the object of inquiry. Therefore, description and the action of describing define the subsequent narration (understanding of the observed objects under a concrete discourse) (ibidem).
As it has been seen, everything said so far is part of the discursive realm, where facts are transformed into parts of it in order to separate themselves from the world and, in turn, maintain a specific connection through their use. In other words, with use (action) within a narrative, descriptions are thematized to give meaning to something that may vary, depending on the context through which it is viewed. The connection is maintained by the logic of shared discourse and also by pre-scientific logic. Finally, this makes sense in the theory of interest, which motivates the desire to know something. The point of common discursive understanding ensures that interests can be shared. This point is critical for how extra-theoretical information is abductively used to advance our work method. The first point of understanding is crucial for recognizing what the other is inquiring into and then learning it as a regulative guide to our actions.
Cognitive Constraints
Constraints are to be understood as the bivalent counterpart of triggers. These elements are postulated to show that our capacity to generate hypotheses is controlled by different morphological and cultural mechanisms that determine our reality. The term has been used in the abduction debate, especially in logical and computational theories, where these mechanisms are understood as protocols to avoid the massive proliferation of generated information. Cognitive constraints operate structurally in a similar way to logical ones but are enriched by many factors. This is so because, on the one hand, the logical/algorithmic reduction is based on an attempt to capture our psychological life and, on the other hand, because, as already seen, computation has very dense material limits. Indeed, some of these constraints, as has also been seen with triggers, are inherently human and, in this sense, are as rigid as the computational ones, with the difference that humans are hardwired to adapt. Ergo, most of these strategies are also hardwired to change. Nevertheless, there are also cultural and social strategies where change should also be contemplated, such as the use of humor, trivialization, power hierarchies, advertising, or the economic factor, among others. It is important to note that, although everything can be generally understood as a trigger from an affordance point of view (Estany & Martínez, 2013), this does not reflect the kind of disposition that an agent must have to consider it as such. On the contrary, from the constraint perspective, it is evident that our disposition often has little or nothing to do with it. Examples are circumstantial conditionals that allow us to be aware of some anomaly or unexpected event (Roberts, 1989) (such as the weather). The tools that agents use must also be considered.
Languages are tools that open a field of possibility (Magnani et al., 2021), but they are also the limits of what can be said and thought. Moreover, devices are designed precisely to change the world through their operation. Language is composed of an infinite number of resources that allow us to modify its tone and offer poetic, sarcastic, metaphorical visions, etc., which have served to understand the world in a certain way. An example that authors love is Pedersen’s analysis of the Norse sagas, which shaped a culture’s cosmovision
(Pedersen, 2013). The way things are related opens the doors to new interpretations and, more interestingly, when mixed in different domains, superb results can be given. Successful theories are known that have started with a visual metaphor or using an idea taken from a dream, an extrasensory experience, or using religious beliefs (Boden, 2004: ch. 1). However, at the same time, these possibilities are also an indeterminate limit. In other words, it conditions the way reality is understood and our role in it (Huang & Jaszczolt, 2018). To the extent that the possibility of signification is determined, the eco-cognitive environment is also configured. In turn, the environment that the authors refer to as the socially shared cosmovision is traversed and shaped by culture.
The Neuro-cognitive Basis of Abduction and Related Bioinspired Computation
From a naturalistic and evolutionary perspective, abductive processes should be elicited by some morphological mechanism. All humans innately perform abductive processes, and, therefore, some embodied system must be the cause of such a process. Neurochemical processes are the best explanation of such a mechanism (Thagard, 2007; Seddon, 2021), at least for humans (the authors assume the need to explore abductive processes in non-neural cognitive organisms, but justifying or giving solid support to it is beyond the scope of this chapter). An excellent way to check the neuroanatomical basis of abductive processes is to examine their malfunction due to some psychological disorders (Coltheart et al., 2010) and to study single isolated pieces of such mechanisms, like the dopaminergic mechanisms (Dasgupta et al., 2018). Cognition can then be understood as a heuristic process that uses several sources of data and integration strategies as a blending process (Vallverdú, 2019). For those reasons, it is possible to find connections between abduction and other cognitive processes (Calzavarini & Cevolani, 2022). On the other hand, although there are ways to consider the naturalistic connections between abductive reasoning and Bayesian statistics (Vallverdú, 2016), this aspect will also be left aside in this paper because it lies outside the limits of our current research. Nevertheless, it is an interesting debate that can help understand the different perspectives about information analysis (Psillos, 2004). Despite the sound arguments in favor of the existence of abductive practices in animal cognition (Vitti-Rodrigues & Emmeche, 2017), more studies elucidating in detail these mechanisms and their variabilities (as a taxonomical approach to abductive cognition) are needed.
In any case, there is enough empirical evidence to support the hypothesis of this paper: embodied cognition is morphologically mediated, and this functional framework means that agents are inevitably forced to use abductive reasoning in their day-to-day activities. Consequently, abduction must have a fundamental role in creative processes (Nubiola, 2005).
Mirroring (Artificial) Abduction: Machine Learning
Because of the fundamental role of creativity in knowledge acquisition, researchers are trying to capture this process and implement it in AI systems from an evolutionary and cultural point of view. The Machine Learning and Deep Learning fields (henceforth, ML and DL, respectively) are working intensively on achieving artificial creativity using algorithmic and statistical processes. The extraordinary recent successes of the Deep Learning methods applied by Google’s DeepMind company have provided new insights (in fact, a revolution) into classic games like Chess or Go, as well as into very complex scientific problems, like protein folding structures, with its AI systems AlphaGo, AlphaZero, and AlphaFold (Callaway, 2020). The funny side of this revolution is that it started with the omnivorous curiosity of Demis Hassabis (founder of DeepMind) and his interests in neuroscience and videogames. So again, the paths of creativity lie beyond the expected roads of sound and official thinking. There is a fundamental aspect for analyzing abductive models in Machine Learning: considering the internal trends and debates inside AI communities. Classic GOFAI created an AI based on logical rules, while the Second Wave AI experts tried to create a system from a bottom-up view, introducing morphologies into the intelligence equation. In any case, the successes came from the creation of bioinspired techniques, the Neural Networks, which matured and increased in complexity in the machine learning subfield; these same statistical techniques were necessary for processing the huge new sets of Big Data, the so-called Data Tsunami, at the beginning of the twenty-first century.
Deep Learning was so successfully applied to solve a broad range of socio-technical and scientific problems that one of its creators, Geoff Hinton, even affirmed in 2020 (Hao, 2020): “Deep learning is going to be able to do everything.” Nevertheless, DL is entering an epistemic bottleneck despite some wonderful achievements, for several reasons: opacity due to black boxes, a general lack of explainability, and, among others, the lack of causal understanding (Schölkopf et al., 2021; Vallverdú, 2020). For these reasons, current studies are trying to introduce meaning into statistical approaches to AI systems, that is, combining symbolic with statistical knowledge in Machine Learning (Liu et al., 2019). Even non-conventional logics are fundamental for non-conventional computing (Schumann & Zenil, 2020). Anyhow, it will soon be noticed that the grounded basis of symbolic emergence, a combination of embodied plus social aspects of cognition, is still missing. So again, the bioinspired mechanisms of abductive reasoning are a fundamental key for the phylogenetic understanding of cognition, which must include abduction. On the other hand, an essential question concerning our research is why AI researchers have tried to implement abduction in the ML or DL fields. The reason is that they aim at improving AI’s creative and innovative properties (if any). After some early attempts (Shanahan, 1989; Marquis, 1991), the conceptualization of abductive reasoning by computer scientists working on Machine Learning has proved to be non-systematic. Let us see some examples of it:
I. Bergadano et al. (2000) defined abduction as a form of defeasible reasoning, usually implemented in Machine Learning either (a) as a side technology, used for reasoning in explanation-based learning systems (as a heuristic to guide search in top-down specialization), or (b) as a way to generate missing examples in relational learning. The authors considered that abduction was implemented not as a general way of reasoning but, in contrast, as a way to solve each tiny and specific problem.
II. Ignatiev et al. (2019) offer a recent example and analysis of abduction in ML. They used abductive reasoning to obtain a constraint-agnostic solution for computing explanations for any ML model. With this tool, they tried to exploit the best properties of logic-based and heuristic-based approaches because they represented the ML model as a set of constraints in some theory (e.g., a decidable theory of first-order logic).
III. Mooney and Shavlik (2021) applied abduction to “theory refinement” (theory revision, knowledge-based refinement), the Machine Learning task of modifying an existing imperfect domain theory so that it can be made consistent with a set of data. They start from a precise definition of abduction as “the process of inferring cause from effect or constructing explanations for observed events and is central to tasks such as diagnosis and plan recognition.”
IV. (Dasgupta et al., 2018): “Abduction refers to inferring the premises (causes) from the observations (effects) of any rule of the form: if-then. Although the logic of propositions/predicates does not support abduction, it has importance in many real-world situations. The logic of fuzzy sets offers a solution to abductive reasoning problems. Several techniques of abductive reasoning are available in the literature. For example, for a given rule: if x is A, then y is B, where x and y are linguistic variables and A and B are fuzzy sets, given y is $B'$, where $B'$ is an observed MF, it can be inferred $A'$ by computing the implication relation $R(x, y)$, and then by computing the MF for x is $A'$ by using max-min composition ($\circ$) of $B' \circ R^{-1}(y, x)$, where $R^{-1}(y, x)$ is the inverse fuzzy implication relation, such that $R^{-1} \circ R = I$, the identity matrix.” Perhaps it is too reductionist and misleading to reduce abduction to any if-then computational rule description.
Other authors have explored the connections of introducing abduction in neural networks (Ray & d’Avila Garcez, 2006) or modeling ML with abductive reasoning, from an early wave at the end of the twentieth century (O’Rorke, 1988; Hirata, 1993) to a second one at the beginning of the twenty-first (Chakraborty et al., 2009; Kakas & Michael, 2020; Huang et al., 2021). In some cases, adjustments to the classic view of abduction have also been used in these technical fields, like “weighted abduction” (Appelt & Pollack, 1992). Among plenty of explorations using abduction, the authors want to highlight the research of Vladimir Vapnik and his colleagues. They attempt to introduce statistical tools to describe abductive practices in order to improve ML methods. For example, in Vapnik and Izmailov (2019), the authors introduce a new type of inference based on statistical invariants (see formula 13). Since it is valid for any predicate (any function $\psi(x)\in L_2$), one can construct as many statistical invariants as one wants (by defining properties of class). In philosophy, sometimes it is called “The Duck Test” and refers to abductive inference. Vapnik suggests that learning using statistical invariants is the most effective way of learning (Vapnik & Izmailov, 2015). From these examples, the authors can defend the interest and necessity of introducing abductive mechanisms in artificial reasoning, something especially requested in the case of causal connections between events. At the same time, the lack of universal formalization and the amateurism of computer scientists in these topics have created a fuzzy implementation and understanding of such fundamental cognitive mechanisms.
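As a minimal illustration of the fuzzy rule quoted in item IV above, the following Python sketch builds the implication relation $R(x, y)$ for a rule “if x is A, then y is B” and recovers a candidate $A'$ from an observed $B'$ by max-min composition. Several choices here are assumptions not found in the source: the Mamdani (minimum) implication, the membership values, the function names, and, above all, the use of the transpose of $R$ as a stand-in for the inverse relation $R^{-1}$. The quoted method requires an $R^{-1}$ satisfying $R^{-1} \circ R = I$, which in general calls for a dedicated algorithm (Chakraborty et al., 2009), so this is only a sketch of the composition step.

```python
import numpy as np


def mamdani_relation(a, b):
    """Implication relation R[i, j] = min(A(x_i), B(y_j)).

    Assumption: Mamdani (minimum) implication; the source does not
    say which implication operator is used.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.minimum.outer(a, b)


def max_min_compose(vec, rel):
    """Max-min composition: (vec o rel)[k] = max_j min(vec[j], rel[j, k])."""
    vec = np.asarray(vec, dtype=float)
    rel = np.asarray(rel, dtype=float)
    return np.max(np.minimum(vec[:, None], rel), axis=0)


def abduce(b_observed, rel):
    """Candidate A' = B' o R^T.

    Assumption: the transpose R^T stands in for the true inverse
    relation R^{-1}; it is an approximation, not the inverse that
    the quoted definition requires.
    """
    return max_min_compose(b_observed, np.asarray(rel, dtype=float).T)


# Hypothetical membership values over three x-points and two y-points.
A = np.array([0.2, 1.0, 0.5])   # fuzzy set A on x
B = np.array([0.3, 0.9])        # fuzzy set B on y

R = mamdani_relation(A, B)      # shape (3, 2)
B_forward = max_min_compose(A, R)  # deductive direction: A o R
A_prime = abduce(B, R)             # abductive direction: B' o R^T, with B' = B

print("R =\n", R)
print("A o R   =", B_forward)
print("B' o R^T =", A_prime)
```

With these toy values, composing $A$ with $R$ returns $B$ (the deductive direction), while the abductive direction returns $A$ clipped at the height of $B'$. This shows why the transpose is only an approximation of a true inverse relation.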
Conclusions
This chapter has given arguments in favor of morphologically situated abductive processes in perception. This perspective extends the naturalistic proposal defended in the EC-Model of abduction. As has been shown, naturalized abduction is based on a contextualized view of the principles of Classical Pragmatism. On the other hand, the pragmatic viewpoint assumed in the EC-Model is a naturalistic perspective on cognitive processes such as reasoning, which allows the generation and selection of hypotheses to be conceptualized as mechanisms of adaptation to varieties of experience. The mechanism that has been analyzed in this essay is creativity. Characterizing creative processes using abduction means analyzing this phenomenon from its morphological bases. In doing so, the authors have tried to defend the thesis that human beings are determined to maintain a constant openness to hypothesizing in the face of the contingencies they experience. Although this has been raised in this essay only as a conjecture, this thesis could gradually be applied to the rest of living beings. For this reason, the inevitability of creativity defended in this chapter goes beyond a methodological attitude or a theory. Because of the morphological bases of perception (neurochemical, in human beings), abduction appears as a key mechanism for the adaptive dimension of creativity. As has been argued, this dimension is present in all degrees of human experience and is always a fundamental part of the generation of epistemic content in the socially shared cosmovision using natural processes with a biological basis. Bodies naturally evolved accompanied by a social pressure to be creative: humans must be creative to solve plenty of challenges throughout their lives. Abduction is the ground mechanism by which (naturalized and socialized) cognition adapts to evolving scenarios and is therefore at the bottom of all cognitive procedures: from intuition to high symbolic data processing.
Causal knowledge is on the horizon of such artificial abductive systems, which seek the same meaningful explanations that humans have searched for throughout history. The excellent news for cognition designers is that the fundamental mechanism that supports the ladder of other cognitive mechanisms is identified and explained here: abduction.
Acknowledgments Research for this article was supported by the project “Innovación epistemológica: el caso de las ciencias biomédicas” (FFI2017-85711-P), an ICREA Acadèmia Grant, and the PRIN 2017 Research 20173YP4N3-MIUR, Ministry of University and Research, Rome, Italy.
References Aliseda, A. (2006). Abductive reasoning: Logical investigations into discovery and explanation. Springer. Amra, N. K., Smith, J. W., Jr., Johnson, K. A., & Johnson, T. R. (1992). An approach to evaluating heuristics in abduction: A case study using RedSoar – An abductive system for red blood cell antibody identification. In Proceedings of the annual symposium on computer application in medical care (p. 690). American Medical Informatics Association. Anderson, D. R. (1987). Creativity and the philosophy of C. S. Peirce. Springer-Science+Business Media, B. V. Appelt, D. E., & Pollack, M. E. (1992). Weighted abduction for plan ascription. User Modeling and User-Adapted Interaction, 2(1), 1–25. Arfini, S. (2019). Ignorant cognition. A philosophical investigation of the cognitive features of non-knowing. Springer. Aristotle. (1957). Analytica Priora et Posteriora. In W. D. Ross (ed.). Oxford University Press. Bergadano, F., Cutello, V., & Gunetti, D. (2000). Abduction in machine learning. In Abductive reasoning and learning (pp. 197–229). Springer. Boden, M. (2004). The creative mind: Myths and mechanisms. Routledge. Boden, M. A. (2013). Creativity as a neuroscientific mystery. In O. Vartanian, A. Bristol, & J. C. Kaufman (Eds.), The neuroscience of creativity (pp. 3–18). MIT Press. Breazeal, C., Dautenhahn, K., & Kanda, T. (2016). Social robotics. In: Siciliano, B., Khatib, O. (eds) Springer Handbook of Robotics. Springer Handbooks. Springer, Cham, 1935–1972. Callaway, E. (2020). ‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures. Nature, 588(7837), 203–205. Calvo Garzón, F. (2007). The quest for cognition in plant neurobiology. Plant Signaling & Behavior, 2(4), 208–211. Calzavarini, F., & Cevolani, G. (2022). Abductive reasoning in cognitive neuroscience: weak and strong reverse inference. Synthese 200, 70. Chakraborty, S., Konar, A., & Jain, L. C. (2009). 
An efficient algorithm to computing max–min inverse fuzzy relation for abductive reasoning. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 40(1), 158–169. Coltheart, M., Menzies, P., & Sutton, J. (2010). Abductive inference and delusional belief. Cognitive Neuropsychiatry, 15(1-3), 261–287. Csikszentmihalyi, M. (1997). Flow and the psychology of discovery and invention (p. 39). HarperPerennial. Damasio, A. (1994). Descartes’ error: Emotion, reason, and the human brain. Putnam. Dasgupta, M., Konar, A., & Nagar, A. K. (2018, November). Online prediction of dopamine concentration using EEG-induced type-2 fuzzy abduction. In 2018 IEEE symposium series on computational intelligence (SSCI) (pp. 212–218). IEEE. Dewey, J. (1930). Human nature and conduct. An introduction to social psychology. The Modern Library. Dingemanse, M., Perlman, M., & Perniss, P. (2020). Construals of iconicity: Experimental approaches to form–meaning resemblances in language. Language and Cognition, 12(1), 1–14. https://doi.org/10.1017/langcog.2019.48 Eco, U. (1983). Horns, hooves, insteps. In U. Eco & T. A. Sebeok (Eds.), The sign of the three. Dupin, Holmes, Peirce (pp. 198–220). Indiana UP. Estany, A., & Martínez, S. (2013). “Scaffolding” and “affordance” as integrative concepts in the cognitive sciences. Philosophical Psychology, 27(1), 98–111. Feyerabend, P. (1987). Creativity: A dangerous myth. Critical Inquiry, 13(4), 700–711. http://www.jstor.org/stable/1343525 Gabbay, D. M., & Woods, J. (2005). A practical logic of cognitive systems: The reach of abduction, insight and trial (Vol. 2). Elsevier. Habermas, J. (1992). Erkenntnis und Interesse. Suhrkamp.
Hanson, N. R. (1972). Patterns of discovery. Cambridge University Press. Hao, K. (2020, November 3). AI pioneer Geoff Hinton: “Deep learning is going to be able to do everything”. MIT Technology Review. Hintikka, J. (2007). Socratic epistemology. Explorations of knowledge-seeking by questioning. Cambridge University Press. Hintikka, J., & Remes, U. (1974). The method of analysis: Its geometrical origin and its general. Springer-Science+Business Media, B. V. Hirata, K. (1993). A classification of abduction: Abduction for logic programming. Machine Intelligence, 14, 405. Huang, M., & Jaszczolt, K. M. (2018). Expressing the self: Cultural diversity and cognitive universals. Oxford University Press. Huang, Y. X., Dai, W. Z., Cai, L. W., Muggleton, S., & Jiang, Y. (2021). Fast abductive learning by similarity-based consistency optimization. Advances in Neural Information Processing Systems, 34, 26574–26584. Ignatiev, A., Narodytska, N., & Marques-Silva, J. (2019, July). Abduction-based explanations for machine learning models. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, No. 01, pp. 1511–1519). James, W. (1987). The meaning of truth. William James. Writings 1902–1919. The varieties of religious experience; pragmatism; a pluralistic universe; the meaning of truth; some problems of philosophy; essays (pp. 821–978). Literary Classics of the United States, Inc. Kakas, A., & Michael, L. (2020). Abduction and argumentation for explainable machine learning: A position survey. arXiv preprint arXiv:2010.12896. Kapitan, T. (1997). Peirce and the structure of abductive inference. In N. Houser, D. D. Roberts, & J. van Evra (Eds.), Studies in the logic of Charles Sanders Peirce (pp. 477–496). Indiana University Press. Kaufman, J. C., & Kaufman, A. B. (2004). Applying a creativity framework to animal cognition. New Ideas in Psychology, 22(2), 143–155. Koch, C. (2019). The feeling of life itself: Why consciousness is widespread but can’t be computed.
MIT Press. Kurzweil, R. (2000). The age of spiritual machines: When computers exceed human intelligence. Penguin. Lenat, D. D., & Brown, J. S. (1984). Why AM and EURISKO appear to work. Artificial Intelligence, 23(3), 269–294. https://doi.org/10.1016/0004-3702(84)90016-X Lewis, J. D. (1981). G.H. Mead’s contact theory of reality: The manipulatory phase of the act in the constitution of mundane, scientific, aesthetic, and evaluative objects. Symbolic Interaction, 4, 129–141. https://doi.org/10.1525/si.1981.4.2.129 Liu, J., Patwary, M. J., Sun, X., & Tao, K. (2019). An experimental study on symbolic extreme learning machine. International Journal of Machine Learning and Cybernetics, 10(4), 787–797. Łukasiewicz, J. (1970). Creative elements in science. Selected works (pp. 1–15). North-Holland Publishing Company. Lyon, L. (2017). Dead salmon and voodoo correlations: Should we be skeptical about functional MRI? Brain, 140(8), e53. Maeno, Y., & Ohsawa, Y. (2007). Human-computer interactive annealing for discovering invisible dark events. IEEE Transactions on Industrial Electronics, 54(2), 1184–1192. Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer. Magnani, L. (2017). The abductive structure of scientific creativity. Springer. Magnani, L., Sans Pinillos, A., & Arfini, S. (2021). Language: The “ultimate artifact” to build, develop, and update worldviews. Topoi. https://doi.org/10.1007/s11245-021-09742-5 Marquis, P. (1991, September). Extending abduction from propositional to first-order logic. In International workshop on fundamentals of artificial intelligence research (pp. 141–155). Springer. McVeigh, R. (2020). The body in mind: Mead’s embodied cognition. Symbolic Interaction, 43, 493–513. https://doi.org/10.1002/symb.476
Mead, G. H. (1932). The philosophy of the present. The Open Court Company. Meyer, M. (2010). Abduction – A logical view for investigating and initiating processes of discovering mathematical coherences. Educational Studies in Mathematics, 74, 185–205. https://doi.org/10.1007/s10649-010-9233-x Mooney, R. J., & Shavlik, J. W. (2021). A recap of early work on theory and knowledge refinement. In A. Martin, K. Hinkelmann, H.-G. Fill, A. Gerber, D. Lenat, R. Stolle, & F. van Harmelen (Eds.), Proceedings of the AAAI 2021 spring symposium on combining Machine Learning and Knowledge Engineering (AAAI-MAKE 2021). Stanford University, March 22–24, 2021. Müller, V. C., & Hoffmann, M. (2017). What is morphological computation? On how the body contributes to cognition and control. Artificial Life, 23(1), 1–24. https://doi.org/10.1162/ARTL_ a_00219 Newen, A., De Bruin, L., & Gallagher, S. (Eds.). (2018). The Oxford handbook of 4E cognition. Oxford University Press. Nguyen, A. M., Yosinski, J., & Clune, J. (2015, July). Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 annual conference on genetic and evolutionary computation, Association for Computing Machinery, (pp. 959–966). Niiniluoto, I. (2014). Representation and truthlikeness. Foundations of Science, 19(4), 375–379. Nisbett, R. (2004). The geography of thought: How Asians and westerners think differently... And why. Simon and Schuster. Nubiola, J. (2005). Abduction or the logic of surprise. Semiotica, 2005(153), 117–130. O’Rorke, P. (1988, March). Automated abduction and machine learning. In Proceedings of AAAI symposium on explanation-based learning. USA, Stanford, AAAI PRESS, (pp. 170–174). Park, W. (2017). On classifying abduction. In Abduction in context. Studies in applied philosophy, epistemology and rational ethics (Vol. 32). Springer. https://doi.org/10.1007/978-3-319-489568_2 Pedersen, A. (2013). The last conference. A pragmatist Saga. 
Akademika Publishing. Peirce, C. S. (1958). In C. Hartshorne & P. Weiss (Eds.), Collected papers of Charles Sanders Peirce (Vol. 1–6). Cambridge: Harvard University Press, 1931–1935; (Vol. 7–8) (A. W. Burks, Ed.). Harvard University Press. Pfeifer, R., & Bongard, J. (2006). How the body shapes the way we think: A new view of intelligence. MIT Press. Picard, R. W. (2000). Affective computing. MIT Press. Polya, G. (1971). How to solve it; a new aspect of mathematical method. Princeton University Press. Psillos, S. (2004). Inference to the best explanation and Bayesianism. In Induction and deduction in the sciences (pp. 83–91). Springer. Putnam, H. (2001). The collapse of the fact/value dichotomy. Harvard University Press. Putnam, H. (2006). El pragmatismo. Un debate abierto, Editorial Gedisa Sevilla. Ray, O., & d’Avila Garcez, A. S. (2006, August). Towards the integration of abduction and induction in artificial neural networks. In Proceedings of 2nd international workshop on neural-symbolic learning and reasoning, Riva del Garda. Roberts, R. M. (1989). Serendipity: Accidental discoveries in science. Wiley. Sans Pinillos, A. (2017). El lado epistemológico de las abducciones: La creatividad en las verdades proyectadas. Revista Iberoamericana De Argumentación, 15, 77–91. Retrieved from https://revistas.uam.es/ria/article/view/8573 Sans Pinillos, A. (2021). Neglected pragmatism: Discussing abduction to dissolute classical dichotomies. Foundations of Science. https://doi.org/10.1007/s10699-021-09817-x Sans Pinillos, A., & Magnani, L. (2022). How do we think about the unknown? The self-awareness of ignorance as a tool for managing the anguish of not knowing. In S. Arfini & L. Magnani (Eds.), Embodied, extended, ignorant minds: New studies on the nature of not-knowing. Elsevier. Forthcoming. Sans Pinillos, A., & Vallverdú, J. (2021). What the #® §=$@ is creativity? Debats. Revista De Cultura, Poder I Societat, 135–147.
https://doi.org/10.28939/iam.debats-en.2021-9
Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., & Bengio, Y. (2021). Toward causal representation learning. Proceedings of the IEEE, 109(5), 612–634. Schumann, A., & Zenil, H. (2020). Non-classical logics in unconventional computing. International Journal of Unconventional Computing, 15(4), 237–244. Schurz, G. (2008). Patterns of abduction. Synthese, 164, 201–234. https://doi.org/10.1007/s11229-007-9223-4 Sebeok, T. A., & Umiker-Sebeok, J. (1983). You know my method. In U. Eco & T. A. Sebeok (Eds.), Dupin, Holmes, Peirce. The sign of three (pp. 11–54). Indiana University Press. Seddon, P. B. (2021). Nature chose abduction: Support from brain research for Lipton’s Theory of inference to the best explanation. Foundations of Science, 1–17. Shanahan, M. (1989, August). Prediction is deduction but explanation is abduction. In IJCAI (Vol. 89, pp. 1055–1060). Shanahan, M. (2005). Perception as abduction: Turning sensor data into meaningful representation. Cognitive Science, 29, 103–134. Shelley, C. (1996). Abductive reasoning in archaeology. Philosophy of Science, 63(2), 278–301. Simon, H. A. (1985). Psychology of scientific discovery. Paper presented at the 93rd Annual APA Meeting, Los Angeles, CA. Simon, H. A., Valdés-Pérez, R. E., & Sleeman, D. H. (1997). Scientific discovery and simplicity of method. Artificial Intelligence, 91, 177–181. Thagard, P. (1988). Computational philosophy of science. MIT Press. Thagard, P. (2007). Abductive inference: From philosophical analysis to neural mechanisms. Inductive reasoning: Experimental, developmental, and computational approaches, 226–247. Tibbetts, P. (1975). Peirce and Mead on perceptual immediacy and human action. Philosophy and Phenomenological Research, 36(2), 222–232. https://doi.org/10.2307/2107055 Vallverdú, J. (2016). Bayesians versus frequentists: A philosophical debate on statistical reasoning. Springer. Vallverdú, J. (2019). Blended cognition: The robotic challenge.
In Blended cognition (pp. 3–21). Springer. Vallverdú, J. (2020). Approximate and situated causality in deep learning. Philosophies, 5(1), 2. Vallverdú, J., Castro, O., Mayne, R., Talanov, M., Levin, M., Baluška, F., & Adamatzky, A. (2018). Slime mould: The fundamental mechanisms of biological cognition. Biosystems, 165, 57–70. Vapnik, V., & Izmailov, R. (2015). Learning using privileged information: Similarity control and knowledge transfer. Journal of Machine Learning Research, 16(1), 2023–2049. Vapnik, V., & Izmailov, R. (2019). Rethinking statistical learning theory: Learning using statistical invariants. Machine Learning, 108(3), 381–423. Vitti-Rodrigues, M., & Emmeche, C. (2017). Abduction: Can non-human animals make discoveries? Biosemiotics, 10(2), 295–313. Wang, F. Y., Zhang, J. J., Zheng, X., Wang, X., Yuan, Y., Dai, X., ... Yang, L. (2016). Where does AlphaGo go: From church-turing thesis to AlphaGo thesis and beyond. IEEE/CAA Journal of Automatica Sinica, 3(2), 113–120. Willard, A. K., Turpin, H., & Baimel, A. (2022). Maximally Intuitive, Minimally Evidenced: Universal cognitive biases as the basis for supernatural beliefs. Retrieved from psyarxiv.com/aubem. Withagen, R., & Costall, A. (2021). What does the concept of affordances afford? Adaptive Behavior. https://doi.org/10.1177/1059712320982683
Part XI Abduction and Technological Design
Introduction to Abduction and Technological Design
60
Ehud Kroll
Contents
Introduction to Abduction and Technological Design
References
Abstract
After a brief introduction to some relevant works in the area of abduction and technological design, the four chapters included in this part of the handbook are described. The first chapter looks at modern software-based tools that designers use and examines the extent to which computerized design assistants may exhibit abductive reasoning. The second chapter looks at evaluation activities during design, which have traditionally been carried out by deduction. The author argues that only abductive reasoning can break the constraints imposed by a fixed, predetermined set of evaluation criteria and lead to innovative performance. The next two chapters are more theoretic in nature: The third chapter claims that regular abduction cannot account for innovation and uses concept-knowledge mappings to represent the process of generating new knowledge as a precondition for innovation and creativity. The fourth chapter uses the modern C-K Theory of design, which is capable of accounting for generativity, to study formulations of design abduction, both of the explanatory and innovative types. New facets of design abduction, and also of science, are uncovered using C-K Theory as the framework for analysis.
E. Kroll
Department of Mechanical Engineering, ORT Braude College, Karmiel, Israel
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_90
Introduction to Abduction and Technological Design
Technological or engineering design is the process of creating artifacts to satisfy some needs. It uses a combination of theories, methods, tools, and intuitive, experience-based decision-making and seems quite far from the rigor of scientific inquiry. So, what does the notion of abduction have to do with design? To answer this question, we note that design processes encompass a plurality of mental activities that are repeatedly applied; primary among them are the generation of plausible concepts and the ranking of such concepts to select one of them for further development. These concepts are solution candidates to the design task at hand and are considered hypothetical as long as they have not been tested and proven by implementing them as operational artifacts.
One of the first treatments of abduction in relation to design is by March (1976), who claims that it is the key mode of reasoning in design, but also that it differs considerably from science. The goal of science, according to March, is to establish general laws, while design is concerned with realizing a particular outcome. March proposes the following pattern of abduction: From certain characteristics that are sought, and on the basis of previous knowledge and models of possibilities, a design proposal is put forward. March further presents a three-step cyclic design process that is similar to Peirce’s abduction-deduction-induction modes of reasoning, and says that rational designing has three tasks:
1. Creating a novel composition – the artifact – as the outcome (the “case”) of productive (i.e., abductive) reasoning
2. Predicting the performance characteristics of the artifact by deduction
3. Accumulating habitual notions and established values by induction
Therefore, induction may have two related roles: a background activity that represents ongoing acquisition of experience and expertise and an evaluative step in the design process.
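March's three tasks can be read as a loop, and a minimal sketch may make the division of labor concrete. The beam catalogue, the capacity formula, and all names below are hypothetical illustrations, not part of March's account; in particular, the abductive step is reduced here to picking an untried option, which leaves out the creative content that March attributes to it.

```python
# Hypothetical catalogue of known beam options (all numbers are illustrative).
OPTIONS = [
    {"name": "timber", "area_cm2": 100.0, "strength_kn_per_cm2": 1.0},
    {"name": "steel", "area_cm2": 40.0, "strength_kn_per_cm2": 23.5},
]


def propose(options, rejected):
    """Abduction: put forward a not-yet-rejected proposal as a hypothesis."""
    for option in options:
        if option["name"] not in rejected:
            return option
    return None


def predict_capacity(option):
    """Deduction: derive the expected performance from the proposal."""
    return option["area_cm2"] * option["strength_kn_per_cm2"]


def design_cycle(required_kn, options):
    """Run propose/predict/record until a proposal meets the requirement."""
    experience = []  # induction: outcomes accumulated across iterations
    rejected = set()
    while True:
        proposal = propose(options, rejected)
        if proposal is None:
            return None, experience  # no known option is left to try
        capacity = predict_capacity(proposal)
        satisfied = capacity >= required_kn
        experience.append((proposal["name"], capacity, satisfied))
        if satisfied:
            return proposal, experience
        rejected.add(proposal["name"])


chosen, experience = design_cycle(500.0, OPTIONS)
print(chosen["name"], experience)
```

The `experience` list stands in for both inductive roles mentioned above: it is filled at every evaluative step and could also be carried over as background knowledge into the next design task.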
March’s work has been followed by several design researchers, some of them inspired by treatments of abduction in philosophy of science, who attempted to analyze the design process and associated reasoning in terms of abduction. A major contribution can be found in Roozenburg’s (1993) discussion of whether the reasoning towards a tentative description of a design follows the conventional view on abduction. He argues that the common view of abduction as “explanatory” is not applicable to design and that the core of design reasoning follows another type of abduction, which he calls “innovative” abduction or “innoduction” (Roozenburg & Eekels, 1995). In fact, says Roozenburg (1993), Habermas (1978) distinguished between explanatory abduction and innovative abduction, and it was March who did not make that distinction. Roozenburg says that in the case of innovative abduction, one starts from a surprising, not yet explainable fact (the result), and tries to conceive of a new rule (a principle, law, or theory) that allows inferring the cause (the case). Therefore, the rule itself is not yet assumed to be true. He further explains that the
conclusion of this inference is a hypothesis that still needs to be tested by deduction and induction before it becomes a new rule. The abovementioned pattern of innovative abduction is therefore:
q (q is a given fact, a desired result)
----------------------------------------
p → q (a rule to be inferred first, IF p THEN q)
p (p is the conclusion, the cause, that immediately follows)
Roozenburg even claims that this pattern is Peirce’s original intention, using the well-known argument that p cannot be part of the premise and needs to be part of the conclusion of the inference. This means that both p → q and p “present” themselves together, at the same moment. Roozenburg’s innovative abduction is claimed to represent the kernel of the design process. The desired result is the function to be satisfied, his rule looks like “if form + way of use then function,” and the conclusion is form + way of use.
Dorst (2011) offers another view on design abduction that uses the following formula:
what (the artifact) + how (the working principle) → value (aspired)
in which the aspired value is always given and is the starting point for design. If the how is also given, then the what is generated by a so-called abduction-1, which is precisely “explanatory” abduction. Dorst calls this case “conventional (‘closed’) problem-solving that designers often do.” However, if the how is not given, then we have a more “open” problem in which we need to decide on both the working principle and the artifact. This is accomplished by abduction-2, which is the same as Roozenburg’s innovative abduction.
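The difference between the two cases lies in which terms of Dorst's formula are given and which must be inferred. The following sketch shows only that bookkeeping; the small knowledge table and the function names are hypothetical, and a lookup over known triples cannot capture the creation of a genuinely new working principle, which is the point of innovative abduction.

```python
# Hypothetical knowledge: (what, how, value) triples read as "what + how -> value".
KNOWLEDGE = [
    ("ceiling fan", "moving air", "thermal comfort"),
    ("radiator", "heating water", "thermal comfort"),
    ("desk lamp", "emitting light", "visibility"),
]


def abduction_1(value, how):
    """Closed problem: value and how are given, candidate whats are inferred."""
    return [w for (w, h, v) in KNOWLEDGE if v == value and h == how]


def abduction_2(value):
    """Open problem: only value is given, (how, what) pairs are inferred."""
    return [(h, w) for (w, h, v) in KNOWLEDGE if v == value]


print(abduction_1("thermal comfort", "moving air"))
print(abduction_2("thermal comfort"))
```

In the closed case one unknown remains; in the open case two unknowns have to be settled together, which is why the second is the harder design problem.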
Abduction-2 is carried out by first developing or adopting a “frame” (after Schön, 1983), which is a “general implication that by applying a certain working principle we will create a specific value” and takes place according to the following pattern:
q (q is a given desired value)
----------------------------------------
p → q (IF how THEN value, the first conclusion)
p (how, the second conclusion)
In addition, Dorst says that when a possible or promising frame has been proposed and the how is known, abduction-1 can take place to design the what, the artifact.
Kroll and Koskela (2016) build upon the works of Roozenburg and Dorst and present their own version of the kernel of design: The mechanism of design reasoning from function to form is suggested to consist of a two-step inference of
the innovative abduction type. First is an inference from a desired functional aspect to an idea, concept, or solution principle to satisfy the function, and this is followed by a second innovative abduction, from the latest concept to form, structure, or mechanism. The few studies into design abduction described so far hint at the variety of interpretations that exist and the lack of a unified picture of the subject. This, of course, is also related to the controversy among philosophers of science regarding Peirce’s original (but changing over time) intentions and the debate between interpreting abduction as the generation of novel explanatory hypotheses and the selection of the most promising hypothetical explanation among several for further testing, usually considered to be inference to the best explanation, or IBE (McAuliffe, 2015). Each interpretation raises some objections and counter-arguments, as summarized by Mohammadian (2019), who proposes a unified interpretation of abduction and IBE whereby abduction has two phases: (1) hypotheses-generation, a creative, ampliative phase governed by insight, to generate a few plausible explanatory hypotheses, and (2) hypotheses-ranking, wherein relative “pursuitworthiness” is determined based on economic considerations in a rule-based, deliberate, and self-controlled manner. Selecting the highest-ranking hypothesis follows. The many discussions on abduction in science contribute insights that can be used in design research. If design abduction is broadly defined in such a way that it encompasses generating plausible concepts and choosing one among them for further development, then perhaps advances in understanding design can take place. Accordingly, the four chapters included in the present part of the handbook represent efforts in the direction of characterizing abduction in the area of technological design. 
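The two-phase reading of abduction summarized above (generate a few plausible hypotheses, then rank them by pursuitworthiness) can be sketched as follows. The scoring rule, plausibility divided by testing cost, is an assumption chosen only to make "economic considerations" concrete; the hypotheses and their numbers are likewise hypothetical.

```python
from typing import NamedTuple


class Hypothesis(NamedTuple):
    name: str
    plausibility: float  # assumed estimate in (0, 1]
    test_cost: float     # assumed cost of testing, in arbitrary units


# Hypothetical repertoire: each hypothesis with the observations it would explain.
REPERTOIRE = [
    (Hypothesis("fatigue", 0.6, 3.0), {"bracket cracked"}),
    (Hypothesis("overload", 0.3, 1.0), {"bracket cracked", "bracket bent"}),
    (Hypothesis("material flaw", 0.5, 5.0), {"bracket cracked"}),
    (Hypothesis("corrosion", 0.4, 2.0), {"surface pitting"}),
]


def generate(observation):
    """Phase 1: keep the hypotheses that would explain the observation."""
    return [h for h, explains in REPERTOIRE if observation in explains]


def rank(hypotheses):
    """Phase 2: order by an assumed pursuitworthiness score."""
    return sorted(hypotheses, key=lambda h: h.plausibility / h.test_cost, reverse=True)


ranked = rank(generate("bracket cracked"))
print([h.name for h in ranked])
```

Under this scoring, a cheap test can outrank a more plausible but expensive one, which is the sense in which ranking is governed by economy rather than by plausibility alone.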
The chapter by Pieter Pauwels and Vishal Singh offers a broad look at design processes, with an emphasis on architectural design and engineering. While abduction traditionally took place in the designer’s mind to support the creative parts of design, new opportunities have emerged in recent years as increased use of computers brought about a combination of human abductive reasoning and machines. The authors elaborate on model-based and data-driven software processes and examine future roles that technology may play as an assistant to human designers. They conclude that the computer indeed allows creating new models (still external, as opposed to the internal models in the designer’s mind) but is still far from being able to serve as an assistant that can carry out abductive reasoning, interpretation, and hypothesis. Andy Dong’s chapter addresses an important part of any design process, that of evaluation. Evaluating a candidate solution against a set of criteria or comparing solutions to each other is a common step in design that takes place either at specific points along the process (e.g., at the end of the conceptual design phase) or continuously, with any change made to the evolving artifact. The activity of evaluation oversees the convergence of design processes and mostly employs deductive reasoning. However, Dong points to the following apparent contradiction: A novel, radically innovative solution cannot emerge from a selection process that uses a predetermined set of value criteria (objectives, requirements, and constraints),
because established norms tend to reject new ideas for being too risky or too implausible. This problem, suggests Dong, can be alleviated by abductive, instead of deductive, evaluations. Empirical research is presented to support the claim that abductive reasoning can increase the tendency of designers to adopt novel alternatives through the generation of new insights and the reframing and refinement of ideas. Sharifu Ura, in the third chapter, explains why regular abduction (called here first-order abduction) cannot account for innovative design. It is second-order abduction that can produce new solutions to design problems. The abduction-based logical network that promotes innovation is represented by concept-knowledge mappings, where concepts are the design solutions. Such mappings follow the 20+ year old C-K Theory of design (elaborated and used in the next chapter) and the need for generating new knowledge to arrive at a creative design is emphasized. The necessity to be creative and innovative is described as a case of high epistemic uncertainty (undecidability à la C-K Theory) and high compelling reason. Finally, the chapter by Ehud Kroll, Pascal Le Masson, and Benoit Weil uses the modern C-K Theory of design and its ability to account for the logic of generativity (the production of both rules and artifacts) to study design abduction of the explanatory and innovative types. Because C-K Theory has been developed without relying on the notion of abduction, it can be used as a tool, a framework, to analyze abduction both in design and science. The authors demonstrate how this method can help uncover interesting properties of abduction in design and also an aspect of design they call “preservative generativity”. The three main conclusions of the chapter are (1) that abduction – regarded as a logic of hypothesis generation – actually addresses many unknowns with a strong generativity potential. 
In particular, it shows that the relevant unknowns are not embedded in the hypotheses themselves, but rather by the concepts of hypotheses, which require substantial design work to become testable explanations of the evidence; (2) that even if abduction might explore these multiple unknowns, formulations of abduction tend to only very partially explore the full range of unknowns, so that abduction turns out to be a form of “bounded generativity”; (3) that an abduction that is based more explicitly on design theory might overcome the bounded generativity, and this would thus lead to consider how this “unbounded” abduction could be a preservative generativity that rigorously combines the creation logic of scientific discovery and the cumulative preservative logic of robust, reliable scientific knowledge.
References
Dorst, K. (2011). The core of ‘design thinking’ and its application. Design Studies, 32(6), 521–532.
Habermas, J. (1978). Knowledge and human interests (2nd ed.). Heinemann.
Kroll, E., & Koskela, L. (2016). Explicating concepts in reasoning from function to form by two-step innovative abductions. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 30(2), 125–137.
March, L. (1976). The logic of design and the question of value. In L. March (Ed.), The architecture of form (pp. 1–40). Cambridge University Press.
McAuliffe, W. H. B. (2015). How did abduction get confused with inference to the best explanation? Transactions of the Charles S. Peirce Society, 51(3).
Mohammadian, M. (2019). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S. Peirce Society, 55(2), 138–160.
Roozenburg, N. F. M. (1993). On the pattern of reasoning in innovative design. Design Studies, 14(1), 4–18.
Roozenburg, N. F. M., & Eekels, J. (1995). Product design: Fundamentals and methods. Wiley.
Schön, D. A. (1983). The reflective practitioner: How professionals think in action. Basic Books.
Abductive Reasoning in Creative Design and Engineering: Crossroads of Data-Driven and Model-Based Engineering
61
Pieter Pauwels and Vishal Singh
Contents
Introduction
Creativity in Engineering and Design
Model-Based and Data-Driven Engineering
Abductive Reasoning as Essential Part of Architectural Design and Engineering Processes
What Is Design
Models in the Design Process
The Role of Abductive Reasoning in Design
From Beliefs and Habits to Creativity and Surprise
Impact of Technology on Design Processes
From CAD Drafting Tools to BIM Information Modeling Tools
The Connected World: IoT and the Web of Data
Simulation Models: Structural Analysis, Energy Simulation Software, Etc.
Parametric and Generative Design Tools
Technology: An Assistant or “Just” an Extra Medium in the Creative Process?
Data-Driven and Model-Based Engineering
The Machine as an Assistant?
Conclusion
References
P. Pauwels
Department of the Built Environment, Eindhoven University of Technology, Eindhoven, Netherlands
V. Singh
Centre for Product Design and Manufacturing, Indian Institute of Science, Bengaluru, India
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_16
Abstract
Creative processes require abductive reasoning from a human being, a capacity that humans exhibit naturally and that machines do not have. Abductive reasoning underpins creative reasoning, interpretation, and analogy, all of which are crucial to creative design processes in diverse fields, including, for example, architectural design and engineering. In the absence of computers, abductive reasoning has typically been a physical process that takes place in the human mind and is oriented mainly toward the design project itself (e.g., pen, paper, and words in the case of architectural design and engineering). In today's digital world, abductive reasoning also takes place in combination with machines, although it remains exclusively a human trait. As the medium changes, however, entirely different affordances and opportunities emerge. This chapter reviews design processes and the place of abductive reasoning within them; the impact of technology on these design processes, with a particular focus on model-based versus data-driven software processes; and, in a third section, where technology may fit as an assistant or as just another medium to work with.
Keywords
Abductive reasoning · Creativity · Data-driven · Design · Engineering · Model-based · Parametric design · Semantics
Introduction
The world has seen many advances in the use of digital technologies over the last few years. Ever-accelerating digitization trends turn our world into an increasingly virtual one. The last few years have furthermore witnessed a new wave of artificial intelligence (AI), after the earlier waves that the world experienced in the 1970s–1980s and after the 2000s. Computing power is abundantly available, and the world has never been as connected and interlinked as it is now. This clearly has an impact on many processes and ways of working throughout the world. As the machine is here to stay, humans need to learn how to work with these machines. This chapter looks into the impact of these digitization trends, and of our way of interfacing with technology, in the domains of design and engineering. While the chapter looks at this from an architectural design and engineering perspective (buildings, architecture, construction), its conclusions can readily be transposed to other fields, most notably software development practices and industrial product design and engineering. Those fields are very particular domains of human intellect, as they have long depended on human creativity, human interpretation, human intellect, human surprise, and human out-of-the-box thinking. Yet these domains, too, are increasingly affected by digital technologies. For example, the Internet of Things (IoT) and Web of Data
technologies (Hogan, 2020) are available, forming an alternatively connected world that we would otherwise only experience inside our minds. Furthermore, sketching and physical modeling are gradually being replaced or complemented with computer-aided design (CAD) and building information modeling (BIM) techniques (Eastman et al., 2008; Borrmann et al., 2018), that is, software that allows a building design to be modeled in full 3D, including all semantics. In addition, high-performance computing (HPC) clusters are available, which can be used to solve complex mathematical problems (such as stress-strain analysis in windmill blades) instead of relying on physical tests, scale models, and formulas. Clearly, many more techniques (e.g., machine learning) are available to the designer, architect, and engineer, opening up a wide set of new opportunities.
Creativity in Engineering and Design
A lot of research has looked into creativity and computational creativity in design. This chapter will not go in complete detail into this research and its diverse areas and results, and it will instead focus mostly on the way in which technology is impacting these design and engineering processes. In terms of creativity and design, this chapter relies on a Peircean interpretation (Peirce, 1958) that has been outlined and followed earlier in Pauwels et al. (2013, 2014a,b) and Pauwels (2017). These works define a design and engineering process as a creative sense-making process, in which the creativity lies in the person interacting with the media around her, interpreting, sense-making, and continuously adapting views and models about the design or product in front of her. The designing agent hereby serves as a practitioner interacting with her world (Schön, 1983) (Fig. 1).
Fig. 1 An architectural designer in interaction with her world
This perspective and approach are very different from the interpretation of “creativity” as a “feature of a product” (e.g., this product looks very creative because of all its rounded features). The current chapter therefore focuses entirely on creativity as a feature of a process. In working with her environment, a designing agent goes through a process of sense-making, and every action that she takes to “create” something is interpreted as a form of creativity. A measure could be used to identify how creative such an act of creation (e.g., drawing one line in a sketch) really is, a topic covered at length in the distinction between h-creativity and p-creativity (Boden, 2004). Historical creativity (h-creativity) refers to an action that is perceived as radically new in history and never made before, while psychological creativity (p-creativity) refers to an action that is perceived as radically new by the person making this action, but not necessarily in the entirety of history. This chapter instead assumes that every act of creation holds a level of creativity, in the sense that the world view (mental model) of the person making this act always changes, no matter how minimal the change is (Fig. 2). Taking this approach towards creativity in design and engineering, in which interaction between human and world is central, gives a very important role to the surrounding world and its media. As indicated in Pauwels (2017), “the world around the practitioner has a tremendous impact on the practitioner, in the sense that, in the long run, it forms the models that are being used by that practitioner to act upon the world. As a result, the models that are being used at a specific moment in time, for instance by a musician, clearly depend first and foremost on the world in which this practitioner has grown up and learnt to appreciate certain musical styles.
The same holds for scientific models, mathematical models, language models, and so forth.” In the case of a design process, the surrounding world consists of several influences, tools, and media. A sketch-based design process relies on the paper and pencil as a medium for the designer to perform her acts of creation. Yet, a designer can rely on plenty more media, such as physical scale models, conversations, site visits, and so forth to form models and external representations of the mental design model in the mind of the designer.
Fig. 2 Mental models change with every act of creation over time (diagram: a sequence of mental models, each followed by an act of creation, along a time axis)
While the world digitizes, more and more digital media become available, which are also simply forms of representation for a human designer and the mental design model that this designer upholds. Yet, these media have different affordances compared to paper and pencil and thus enable different (not necessarily better) possibilities for design space exploration.
Model-Based and Data-Driven Engineering
Model-based reasoning can be considered central in very diverse domains of practice. Recently considered domains include political discourse, social intercourse, language learning, archaeology, and collaboration and conversation. Model-based reasoning has a long history, as the term emerged in the early days of artificial intelligence (1970s–1980s), where it was used in the context of case-based reasoning, expert systems, neural networks, and similar techniques. In this sense, model-based reasoning assumes the existence of a model of a device, system, or situation that can be used to find explanations for the behavior of that device, system, or situation (Kurfess, 2003). This model would typically combine rules (behavior) with object-oriented modeling (states). As those rules typically take the shape of interconnected IF-THEN statements, this easily leads to an expert system that relies primarily on deductive inference. In a broader and more contemporary interpretation, model-based reasoning refers to the process in which a model is used during the inference or reasoning process. The word “model” hereby refers to “an internal or external representation that is intended as an interpretation of a target physical system, process, phenomenon, or situation and is retrieved or constructed on the basis of potentially satisfying salient constraints of the target domain” (Magnani and Casadio, 2016). The broader interpretation of model-based reasoning takes several important forms in the case of architectural design and engineering, and the term “model” has many versions and interpretations in this domain. The following models can typically be distinguished in design and engineering:
1. Physical scale model: a physical model in a specific scale that still maintains and shows the most important traits of the considered system, device, or situation
2. Mental model: an internal model inside the human brain that captures one’s current understanding of how a certain device, system, or situation “works”
3. Simulation model: a specialized external model of a system, device, or situation that includes a number of key characteristics and allows its behavior to be simulated (e.g., finite element analysis (FEA), computational fluid dynamics (CFD) simulation, etc.)
4. Information model: an external model of a system, device, or situation that includes key information (semantics) of this system, device, or situation for further reuse in an engineering and design process
P. Pauwels and V. Singh
Fig. 3 A variety of model-based and data-driven techniques in design and engineering: (a) simulation model, (b) regression model, (c) parametric model/information model, and (d) neural network model
On the other hand, the recent reemergence of AI technologies, with a much stronger focus on data, analytics, statistics, and machine learning, has led to a shift towards data-driven practices, also in design and engineering. Data-driven practices refer to processes in which data is collected from a monitoring process, analyzed in detail and across large batches, and then turned into action in the form of data-driven predictions (machine learning, neural network models, regression models, etc.). The monitored data can be analyzed using machine learning techniques and data analytics tools, leading to insights into data that come from the real world. Hence the data is expected to give a more truthful and accurate insight into the real world, especially compared to models that are created from scratch and do not rely as explicitly on existing data (e.g., simulation models, information models, etc.; see also Fig. 3). While information models and simulation models (model-based) are of considerable value, it is argued that they need to be compared to actual data points coming from the real world (data-driven) in order to enhance and improve them.
61 Abductive Reasoning in Creative Design and Engineering:. . .
A key research challenge in design and engineering then becomes to “merge model-based and data-driven techniques.” The assumption is that a model can be created of the physical processes and phenomena in the environment, ideally as close as possible to the real world, and that measured data can then be compared to the values predicted by the model. Data-driven practices play a weaker role during a design and engineering process itself, simply because the real-world result is not yet available and no data can be retrieved from it. At best, it is possible to retrieve data from previously realized designs and to match or merge those with the models created for those designs, in order to learn about the value and correctness of design decisions.
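The comparison between model predictions and monitored data described above can be sketched in a few lines. This is only an illustration: the function `simulated_temperature`, its coefficients, and the observation pairs are invented for this example and do not come from any real building or simulation tool.

```python
from statistics import mean


def simulated_temperature(outdoor_temp_c: float) -> float:
    """Hypothetical model-based prediction of indoor temperature.

    The linear form and both coefficients are illustrative assumptions.
    """
    return 18.0 + 0.25 * outdoor_temp_c


def compare(model, observations):
    """Return residuals (measured minus predicted) and their RMSE."""
    residuals = [measured - model(x) for x, measured in observations]
    rmse = mean(r * r for r in residuals) ** 0.5
    return residuals, rmse


# Invented (outdoor, measured indoor) pairs standing in for monitored data.
observations = [(0.0, 18.5), (10.0, 20.0), (20.0, 24.0)]

residuals, rmse = compare(simulated_temperature, observations)
print(residuals, round(rmse, 3))
```

Large or systematically biased residuals would suggest that the model needs to be revised, whereas small residuals lend some support to the model. Deciding what counts as "large" remains a judgment for the engineer.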
Abductive Reasoning as Essential Part of Architectural Design and Engineering Processes
In this section, an overview is given of architectural design and engineering processes. This includes a review of what design is, explained against the background of problem-solving and using a theory of sequential design situations that change through impactful acts of creation (creativity). Furthermore, this section reviews the position of models, abductive reasoning, and analogical reasoning in design and engineering processes. Finally, this section points out the importance of beliefs and habits in enabling surprise, wonder, and subjective human interpretations.
What Is Design
How designers think and work has been the subject of many research initiatives over several years. An earlier review is available in Pauwels and Bod (2013), and other studies can be found in Cross (2007), Bayazit (2004), and Eastman (2001). Design and engineering is known to be a process of re-solution of wicked problems. This statement contains two key terms, namely, “wicked problems” and “re-solution.” A wicked problem (Rittel and Webber, 1973, 1984) or an ill-structured problem (Simon, 1973) is a problem that does not have a single best solution, mainly because it cannot be represented in a solvable structure. Unlike well-defined problems with clear solutions and solving methods, wicked and ill-structured problems can never be solved definitively; they need to be solved and re-solved again and again without ever attaining the best solution, hence the term re-solution. As wicked problems are the default in design and engineering, this area has by now evolved into a practice with a diversity of methods, none of them as methodical or rigid as those in more well-structured domains (see also Dorst 2006, p.11). The above theory is grounded in the works by Cross (2006), Lawson (2005), Schön (1983), and Simon (1996). These researchers regard design and engineering as a complex process that relies on a particular kind of reasoning and thinking. The act of designing hereby does not so much consist of a design problem and solution but rather of a number of states which the design process goes
Fig. 4 Proceeding from design situation to design situation while also learning and improving mental models during acts of creation (diagram labels: design situation, act of creation, mental model, problem space, solution space; axis: time)
through during the procedure of continuously resolving it. These states were referred to as “design situations” in Pauwels and Bod (2013). This sequence of design situations matches quite well with the overall perspective taken regarding creativity and model-based reasoning (see Fig. 4), in which a designer continuously proceeds from one mental state into the next while performing acts of creation (creativity) at every state transition (every second, minute, hour). The design situation hereby serves as “a snap-shot, in terms of time, in the overall design process, in which a limited number of constraints and parameters are taken into account and adjusted by a designer, in order to satisfice the design situation, as interpreted at that moment, into an alternative and new design situation” (Pauwels and Bod, 2013). The term “satisficing” refers to the attitude of architectural designers to satisfy constraints sufficiently rather than entirely (see also Simon 1973 and Cross 1982, p.224). In this process, designers take a solution-focused or goal-oriented strategy instead of a problem-focused strategy (Cross, 1982; Simon, 1996). Indeed, from the very first moment, a designer works towards an aspired goal; they proceed through the design process, continuously facing new design situations and addressing them as they see fit in order to reach the goal they have in mind at that specific moment in time (mental model). While moving from one design situation to the next, the designer evaluates and interprets each design situation anew, like any other thing surrounding us. This interpretation step can be related to C.S. Peirce’s process of (scientific) enquiry, which consists of a hypothesis, an experiment, and a confirmation or refutation of the original hypothesis (learning).
Every act of creation is, in other words, an experiment that is needed to confirm or refute a particular designerly hypothesis. As such, the design situation “talks back” to the designer or engineer, a phenomenon known as “situational feedback” (Schön, 1983). So, not only does the design situation proceed over time, but the initial understanding of the design challenge (mental model) evolves as well. This is interpreted by Maher and Poon (1996) and Poon and Maher (1997) as a process of “co-evolution” of problem and solution spaces. As part of their practice, designers and engineers thus learn while doing.
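The difference between satisficing and optimizing, as used in this section, can be made concrete with a small sketch. The design options, their attribute values, and the thresholds below are invented for illustration; they are assumptions of this example, not data from the studies cited.

```python
def satisfice(candidates, is_good_enough):
    """Return the first candidate that sufficiently meets the constraints."""
    for candidate in candidates:
        if is_good_enough(candidate):
            return candidate
    return None  # no acceptable candidate in this design situation


def optimize(candidates, score):
    """Return the candidate with the highest score over the whole set."""
    return max(candidates, key=score)


# Hypothetical design options considered in one design situation.
options = [
    {"name": "A", "daylight": 0.50, "cost": 90},
    {"name": "B", "daylight": 0.70, "cost": 100},
    {"name": "C", "daylight": 0.90, "cost": 140},
    {"name": "D", "daylight": 0.95, "cost": 100},
]


def good_enough(option):
    # Illustrative thresholds: "sufficient" daylight within budget.
    return option["daylight"] >= 0.6 and option["cost"] <= 120


first_acceptable = satisfice(options, good_enough)
best_daylight = optimize(options, lambda option: option["daylight"])
print(first_acceptable["name"], best_daylight["name"])
```

The satisficing designer moves on with the first acceptable option and revisits the choice in a later design situation if the situation talks back, whereas the optimizer must evaluate the whole, usually unknowable, solution space.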
Models in the Design Process
In this process, models are abundantly present, both internal to the human mind and external as part of the design situation. Arguably, the mental model in the human mind is the most important of these models, as this is where decisions are taken. In each act of creation, or each interaction with the surrounding world, a designer or engineer relates the interpreted design situation (situational backtalk) to her own background memory and knowledge in order to make design decisions. As this situational backtalk is interpreted by the designer, it also reshapes the designer’s background knowledge. Making sense of the situation thus happens by switching back and forth between problem and solution while continuously reframing both (Fig. 5). Schön (1983, p.39–40) refers to problem setting as “the process by which we define the decision to be made, the ends to be achieved, the means which may be chosen. In real-world practice, problems do not present themselves to the practitioner as givens. They must be constructed from the materials of problematic situations which are puzzling, troubling, and uncertain” (Schön, 1983, p.39–40). The changing background knowledge of the designer has been discussed at length by Lawson (2005, p.159). He uses the term guiding principles to refer to the personal background knowledge, or the knowledge by experience, of a designer. These principles consist of familiar design patterns that a designer relies upon, built up from many experienced design cases. A designer thus never starts a design from an empty page, from scratch, or with a blank mind. Instead, a designer always relies on a lifetime of knowledge built up through experience. Lawson (2005, p.179) documents how these guiding principles, in combination with a mental model of the situation at hand, essentially guide practitioners, designers among them, through their thinking process.
They play an important role not only in framing the design situation but also in generating solutions for a problem, devising experiments, and learning from experiences. According to Lawson (2005, p.159), these guiding principles include not just objective, factual information but also, for instance, motivations, beliefs, values, and attitudes. These guiding principles can thus be very strong and almost become a religion for the person that follows them. These guiding principles, beliefs, and habits form the design grammar of the designer, through which she makes almost all of her decisions. When imagining and creating, a person still mostly relies on those known
Fig. 5 Timeline showing the interaction between designer or engineer and the external models that represent the design situation (diagram labels: sketch, physical model, simulation model, mental model, hypothesis, refutation; axis: time)
principles. As is indicated by Ward (1994), imagination relies almost entirely on known concepts, and, although modifications are made, they typically consist only of different combinations of known elements. It is hard to step entirely outside one’s own categories and beliefs, also in imagining (Ward, 1994). This background knowledge serves as a repertoire of reference models with which the designer continuously and actively reorganizes and restructures new design situations in memory into new abstract mental models or understandings of those design situations. In this context, references can be made to the work on case-based reasoning (CBR) (Aamodt and Plaza, 1994; Kolodner, 1992, 1993) and on diagrammatic reasoning. Case-based reasoning captures the idea of matching new cases with previously encountered cases in order to act upon them appropriately. This is clearly very important in the use of models for design and engineering: the designer or engineer relies on a set of previously experienced “cases” that are triggered depending on the real-world design situation (e.g., CBR, diagrammatic reasoning). This real-world design situation is most often represented using an externalized model (e.g., physical scale model, simulation model, information model, sketch). Hence, the quality and nature of this externalized model dictate which references are triggered in the designer’s mind. The interplay between externalized models and internal mental models is thus crucial in guiding acts of creation and the entire design and engineering process (Fig. 5).
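The retrieval step of case-based reasoning, in which a new situation is matched against previously encountered cases, can be sketched as a nearest-neighbor lookup. The case names, the three-number feature vectors, and the choice of cosine similarity are all assumptions made for this illustration; a CBR system in practice would use a richer case representation and would also adapt and retain cases.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical repertoire of past cases, each described by three invented
# features (e.g., openness, height, density) scaled to the range 0..1.
case_library = {
    "courtyard house": (0.9, 0.2, 0.4),
    "office tower": (0.1, 0.9, 0.8),
    "pavilion": (0.7, 0.1, 0.1),
}


def retrieve(situation, library):
    """Return the name of the stored case most similar to the situation."""
    return max(library, key=lambda name: cosine_similarity(situation, library[name]))


new_situation = (0.8, 0.3, 0.5)
closest = retrieve(new_situation, case_library)
print(closest)
```

Which case is retrieved depends entirely on how the situation is encoded. This mirrors the point made above that the externalized model dictates which references are triggered.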
The Role of Abductive Reasoning in Design
Every step made during a design and engineering process is an act of creation and therefore holds “creativity.” Key in this creative design process is the notion of “hypothesis,” which is the point where a designer interprets the media in front of her and connects that to her background knowledge (Fig. 5). These points in the process are key moments of creativity, which have been well described by Cross (1997) as moments in which two oscillating points, “problem” and “solution,” are still and close enough to be bridged by “an apposite concept.” “The crucial factor [...] is the bridging of these two partial models by the articulation of an apposite concept [...] which enables the models to be mapped onto each other. The ‘creative leap’ is not so much a leap across the chasm between analysis and synthesis, as the throwing of a bridge across the chasm between problem and solution. Such an apposite ‘bridge’ concept recognizably embodies satisfactory relationships between problem and solution. It is the recognition of a satisfactory bridging concept that provides the ‘illumination’ of the creative ‘flash of insight’” (Cross, 1997, p.439–440). The term apposite makes an intended reference to the notion of appositional reasoning, which was originally coined by Bogen (1969) and which Cross (1990) considers similar to or the same as abductive reasoning. This apposite bridge between the current design situation and the designer’s background knowledge can be explained as a kind of analogical reasoning or case-based reasoning, as was explained before (Heylighen, 2007; Maher and Pu, 1997; Goldschmidt, 1991, 1994).
Analogical reasoning is hereby explained as the cognitive ability to think about relational patterns (Grace et al., 2011; Holyoak et al., 2001; Kokinov, 1998; Ward, 1998), which allows one to find a structural alignment or mapping between a base and a target pattern residing in (partially) different domains (Grace et al., 2011; Ward, 1998; Gentner et al., 2001; Hofstadter, 2001; Lakoff and Johnson, 1980). Practicing designers and engineers thus continuously make alignments between the current design situation (the base pattern) and previously experienced design situations (the target pattern), and they act accordingly. Cross (1990) refers to several other research initiatives that identify a very similar kind of reasoning as fundamental for design thinking, mentioning the terms abductive reasoning, productive reasoning, and appositional reasoning as named by their respective originators Peirce, March, and Bogen (Bogen, 1969; Peirce, 1958; March, 1976). This form of analogical reasoning often occurs between a new design-related experience (building, sketch, 3D model, conversation, and so forth) and a previous design experience as it is stored in the human mind (Heylighen, 2007). Sketching is just one example of such an experience, as analogical reasoning occurs constantly while sketching in the form of “seeing as” (Goldschmidt, 1991, 1994). The designer reinterprets the sketch and, as such, adds new and original meaning to it, thereby generating new ideas. Diagrammatic reasoning is a specific form of such analogical reasoning that focuses mostly on
visual input (interpreted diagrams). From the interpreted diagram, a designer or engineer finds similar diagrams (interpretation and reframing) and applies the same principles that she knows from the previous case onto the new situation.
From Beliefs and Habits to Creativity and Surprise
Over time, the experiences of a designer form this designer’s repertoire (guiding principles), on the basis of which this person makes new designs. As such, the designer forms a design grammar that is recognizable to people. In a way, such a design grammar forms the belief of the designer or architect, without which he or she does not act. This aligns with Peirce’s theories on habit formation and belief: “The essence of belief is the establishment of a habit; and different beliefs are distinguished by the different modes of action to which they give rise” (Peirce, 1958, CP 5.398; Peirce, 1878). As human beings, we intuitively build beliefs and habits because they bring us predictability, comfort, and trust. This is a state that we wish to maintain, in contrast to the state of anomaly, surprise, and unpredictability, which is highly insecure and discomforting. Hence, designers and engineers also build beliefs and habits that give them security over time and from which they seldom want to stray. Creativity is related to this notion of belief and habit, but not in the sense that a person would want to “be creative” as a form of habit. For a designer or engineer, the purpose is seldom to “be creative” (with exceptions). Instead, designers and engineers typically tend to follow their habits and beliefs in order to convey a message of meaning to a selected audience, big or small, while being only remotely concerned with whether something may be considered “creative” or not. Nevertheless, people who are not familiar with the design repertoire or design grammar of the designer or engineer, and who come across their designs or products and are confronted with the conveyed meaning, are often surprised and in some cases shocked, as the belief built into the design does not match the beliefs and values that the observer holds.
This surprise and shock is a feeling of discomfort that arises from seeing something that appears entirely novel and does not match the available belief of the observer, and precisely this is the point where “creativity” is observed as a feature of a design product (even if it is not). On a similar basis, many other examples outside of engineering and design can be named, the most famous ones being jokes and stories. Both are built on “expecting something to happen,” according to one’s belief, while something completely unexpected happens instead (a “creative story plot”). A good designer is therefore a person who is aware of his or her grammar, built up through his or her own history, who masters it, and who can use this grammar to bring a story of value to the audience that has the opportunity to observe and experience it. The observer in that case learns and experiences something new and therefore alters and ideally enriches his or her own world views and beliefs.
Impact of Technology on Design Processes
The above sections concern design in general. Experiences from the outside world can in principle take any form, although it is typically assumed that sketching is the language of the designer and engineer. However, any form of input is valuable during a design process, including conversations with colleagues and relatives, writing, building physical models, etc. Each of those media brings its own affordances that may unlock different results during the diagrammatic or analogical reasoning process. Depending on one’s beliefs, one may choose to make one’s set of media as diverse or as limited as one wishes. Of course, with the ever more rapidly emerging set of technologies, entirely different affordances come about for supporting the design and engineering process. In this section, a short review is given of prevailing technologies in the design and engineering process. This review indicates what the role of these technologies may be in the context of model-based reasoning versus data-driven design. This section covers in particular building information modeling (BIM) software, Semantic Web technologies, Internet of Things and Web of Data, parametric and generative design tools, simulation models, and artificial intelligence techniques (both statistical and symbolic AI).
From CAD Drafting Tools to BIM Information Modeling Tools
The most common medium for a designer and engineer in architectural design and construction is computer-aided design (CAD) software, which is often also known as computer-aided drafting software. Indeed, this software was initially often used as a medium or tool to replace and mimic the functionality and affordances of paper and pencil. CAD software was used to draw the lines that one would otherwise draw on paper. Over time, and as originally intended (Eastman, 1975), it became clear that a computing device brings entirely new affordances to the externalization of a mental model. Drafting tools were transformed into, or started to be used as, 3D modeling software and later information modeling software. These 3D models and information models, later on BIM models, allow a person to create a more complete representation of the design model. In other words, the 3D BIM model forms an externalized version of the internal design model of the designer or engineer (Fig. 6). It is thereby important to distinguish between the externalized and the internal model, which differ from each other and, like any other model, from the real world. An important and almost disruptive change in the evolution from CAD to BIM is the addition of semantically meaningful information and formal knowledge representation inside the 3D models. From that moment onwards, 3D models were not just visual representations of the intended future world, which are perfect for diagrammatic reasoning if done well (using diagrammatic and less detailed shapes); instead they became embedded with formally represented
Fig. 6 BIM model in Solibri Model Checker, incl. display of properties and semantics
Fig. 7 Graph-based information models can readily be combined with deductive inference engines
information or knowledge. As such, they turned into semantically meaningful representations of the world (information models) that can be used in inference engines, query engines, and graph-oriented computations (Pauwels et al., 2017). For example, Fig. 7 shows a graph that can readily be combined with query engines and inference engines for diverse forms of computation (mostly deductive inference and information retrieval). As much as this may seem to bring value, all of these technologies operate on the information model, which is external to the designer’s and engineer’s minds. The designer and engineer are instead confined to diagrammatic and analogical
reasoning, which does not match very well with information that is displayed on screen in text form and that results from a computation that is often unknown to the human user. For example, an architect or engineer is not easily triggered into any kind of diagrammatic or analogical reasoning process when she is confronted with what is displayed in Fig. 7, unless she understands in detail what is represented by such a graph. In other words, the model-based inferences that occur inside the information models risk being too detached from the mental models in the designers’ minds, where very different decision processes happen (often still based on diagrammatic reasoning using the information visualized on screen).
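To give an impression of what deductive inference over a graph-based information model amounts to, the following minimal sketch stores a model as subject-predicate-object triples and applies one rule (transitivity) until no new facts appear. The identifiers (`wall1`, `isPartOf`, and so on) are hypothetical and are not taken from any standard building ontology; real systems would use an RDF store with a query and reasoning engine rather than plain Python sets.

```python
def infer_transitive(triples, predicate):
    """Forward-chain one rule: if (a p b) and (b p c) then (a p c)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for s, p, o in inferred if p == predicate]
        for s1, o1 in links:
            for s2, o2 in links:
                if o1 == s2 and (s1, predicate, o2) not in inferred:
                    inferred.add((s1, predicate, o2))
                    changed = True
    return inferred


def subjects_of(triples, predicate, obj):
    """Simple query: all subjects related to obj through predicate."""
    return {s for s, p, o in triples if p == predicate and o == obj}


# Hypothetical information model of a small building.
model = {
    ("wall1", "isPartOf", "room1"),
    ("room1", "isPartOf", "storey1"),
    ("storey1", "isPartOf", "building1"),
    ("wall1", "hasMaterial", "concrete"),
}

closure = infer_transitive(model, "isPartOf")
parts_of_building = subjects_of(closure, "isPartOf", "building1")
print(sorted(parts_of_building))
```

The inferred facts are correct consequences of the stated ones, which is exactly what makes the inference deductive. Nothing in this computation proposes a new hypothesis about the design.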
The Connected World: IoT and the Web of Data
The abovementioned graph-based technologies are related to another important set of technologies brought about by this new world of information technology, namely, the “connected world.” In this chapter, this term is used to refer to both the Internet of Things (IoT) and the Web of Data, which are two very different kinds of technologies with one common feature, namely, that they aim to connect the things and concepts surrounding humans, either with data streams or with semantically rich information. IoT here refers to the multitude of devices and physical objects in general that are connected to each other over a communication network (e.g., the Internet) for data exchange across this network. The Web of Data (Hogan, 2020) is much less linked to actual devices and physical objects and refers instead to the large amounts of data that are published worldwide in a machine-readable format (e.g., RDF, OWL) and that are connected to each other to enable various sorts of data services (W3C, 2021). IoT technologies focus mostly on communications and connections across a network. Their data is therefore much more ephemeral, as it often perishes after being communicated. Even if the data is stored for the longer term, it is most often limited to streaming data that carries little semantically rich formal information (knowledge representation). Such data is of less direct value for a designer or engineer unless it is analyzed for patterns that can bring insights that were not known before (surprise, shock, and breaking of beliefs). This data analysis can be achieved using data analysis tool chains that became abundantly available through the third AI wave. In the case of engineering data, this typically leads to visual displays and analyses as shown in Fig. 8.
These data analyses are very powerful, yet they are not usable by any machine, mainly because a machine does not have the capability to interpret these visual results and turn them into something meaningful like an insight. Precisely this feature is the power of a human mind that is capable of analogical and diagrammatic reasoning, as well as abductive reasoning, none of which are available to a machine. As a result, a human is always needed to make these interpretations, based on background knowledge and experience, to inform the designer’s or engineer’s mind and design process, and also through diagrammatic
Fig. 8 Patterns retrieved in a sensor data stream
and analogical reasoning. This last part is important, as it indicates that the mental model of the designer is not the same as the computational model behind the data analysis, and very different conclusions may occur depending on the way in which the results of the data analysis are visualized and presented. The Web of Data is considerably different from IoT data in the sense that it typically includes much more semantics and tends to be more model-based. As such, it resembles the information models shown earlier in Figs. 6 and 7, and it then has similar affordances for a designer or engineer. The insights that can be retrieved based on the Web of Data, using SPARQL queries, for example, are semantically rich data which reside in the external model representation captured in the Web of Data. These insights are, however, very different from the insights made by the designer in his internal mental model, which again relies heavily on analogical reasoning and continuous reframing and hypothetical reasoning based on the visual display of this data and its inferred results. These two sets of insights are almost always different, with the differentiating factor being the form of abductive reasoning that underlies analogy and hypothesis.
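The pattern analysis on sensor data streams discussed in this section can be illustrated with a deliberately simple sketch that flags readings deviating strongly from the recent past. The CO2 readings, the window length, and the threshold are invented for this example. The technique shown (a rolling z-score) is one common choice among many and is not claimed to be the method behind Fig. 8.

```python
from statistics import mean, pstdev


def flag_outliers(readings, window=5, threshold=3.0):
    """Return indices of readings far from the mean of the preceding window."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        # Skip windows without variation to avoid dividing by zero.
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Invented CO2 readings (ppm) from a hypothetical room sensor.
co2_ppm = [410, 412, 409, 411, 410, 413, 650, 412, 411]

anomalies = flag_outliers(co2_ppm)
print(anomalies)
```

The script only reports that the reading at a given index is unusual. Whether this reflects a crowded meeting, a faulty sensor, or a ventilation problem is a hypothesis that, in line with the argument above, is left to the person interpreting the result.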
Simulation Models: Structural Analysis, Energy Simulation Software, Etc.
A third major group of models is constituted by simulation tools that embody a simulation model. These models are also present in engineering and design processes and are of great value for the engineer as well as the designer. These dedicated tools make it possible to simulate the energy performance, structural safety, fire safety, and plenty of other specialized aspects of a design or engineering solution. These simulation models and tools are typically built up through procedural code of varying complexity. Structural analysis tools, for example, often rely on finite element analysis (FEA) methods that are computationally expensive
Fig. 9 Finite element analysis performed using Karamba and Grasshopper on a concrete 3D printed bridge
(Fig. 9) yet give a clear simulation of the structural performance of key load-bearing elements in buildings and infrastructure (roads, tunnels, bridges, etc.). Most simulation models, however, rely on a decision tree model that is encoded inside the application logic of the simulation software. Examples are energy simulation software and fire safety simulation software; these applications parse available information and compute, in a complex decision tree or network structure, the performance of a building for typically one dimension (e.g., fire safety, energy performance). In most cases, the algorithms embedded in simulation software are opaque to the designer or engineer using them, because they typically form the intellectual property (IP) and commercial value of these tools and/or because they are simply too complicated and out of scope for either designer or engineer. Hence they act as black boxes, except for a few parameters, settings, and input values that can be provided to these algorithms (in the case of Fig. 9, those would be the parameters entered via Karamba and Grasshopper). Arguably, this transforms these algorithms and simulation tools into gray boxes, in which case the model behind the algorithms becomes somewhat visible to the designer or engineer. Still, the model and the input, settings, and parameters are external to the mental design model of the designer. And still, the designer and engineer predominantly rely on interpretation and analogical reasoning of the results presented to them through a visual interface, which makes this interface incredibly important. For example, while the bridge in Fig. 9 is displayed mostly in green and blue, which seems to suggest that the bridge qualifies, it is actually the displacement values in the scale on the right that matter and that need to be evaluated by the engineer.
A visual display of colors ranging from red to green may hence, for example, be very misleading if the actual data values are not considered or taken into account.
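Why a color display can mislead is easy to show with a small sketch. Here a color band is assigned relative to the range of the results themselves, while acceptability depends on an absolute limit. The displacement values, the three color bands, and the allowable limit are invented for this illustration and do not describe how Karamba or any other tool assigns colors.

```python
def color_for(value, lo, hi):
    """Map a value to a coarse color band on a scale normalized to [lo, hi]."""
    t = (value - lo) / (hi - lo) if hi > lo else 0.0
    if t < 0.33:
        return "blue"
    if t < 0.66:
        return "green"
    return "red"


# Invented nodal displacements (mm) and a hypothetical allowable limit.
displacements_mm = [2.0, 5.0, 9.0, 60.0]
limit_mm = 8.0

lo, hi = min(displacements_mm), max(displacements_mm)
colors = [color_for(d, lo, hi) for d in displacements_mm]
exceeds_limit = [d > limit_mm for d in displacements_mm]

for d, c, bad in zip(displacements_mm, colors, exceeds_limit):
    print(f"{d:5.1f} mm  {c:5s}  exceeds limit: {bad}")
```

In this example, the 9.0 mm node is drawn in blue because one extreme value stretches the scale, even though it exceeds the assumed limit. The color communicates relative magnitude only, so the numeric scale still has to be read.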
Fig. 10 Parametric design script in Grasshopper that is used to create a parametric 3D geometric shape
Parametric and Generative Design Tools
A fourth and last relevant set of design tools consists of parametric and generative design tools. These are very unlike any of the earlier mentioned models, the closest relative being the CAD tools, and they provide a very different kind of affordance. As opposed to the semantic models that are hard to use as an external user, and as opposed to the simulation models that are typically opaque to an external user, parametric design tools allow a designer and engineer to specify the rules and logic behind a design solution space, and they therefore put the designer or engineer much more in control. A designer or engineer is able to represent a parametric model (externalize an internal model) using wires and nodes that together form a 3D geometric shape with parametric functionality. In case the designer creates the visual script herself, the result is a white box model that is transparent to the designer (Fig. 10). Still, also in this case, the white box model remains different from the mental model used by the designer or engineer, and there is only a “best possible” connection between the two, in the sense that the externalization of the mental model into the parametric model is a best effort. While making this best effort, choices need to be made that may be different from what the designer or engineer has in mind (e.g., encoding a for-loop in a visual script is often quite different than anticipated). Optimization algorithms are an extension of such parametric design and engineering software. A parametric model is, namely, a set of interconnected rules and algorithms that rely on a number of input parameters and lead to an output (see left to right in Fig. 10). It is possible to exploit this in two manners. One manner is to generate as many design solutions as possible by computing the output for all available combinations of input parameter values.
As such, a large number of design solutions are generated, and the entire design solution space is generated in a saturated form. A second option is to build an optimization formula (e.g., minimize value, maximize value) that refers to a number of input data. In that case, the design solution space can be explored for an optimal design solution according to the fitness function (optimization formula), and the best solutions can be obtained and returned to the designer and engineer (see Fig. 11). Depending on the method, this often leads to a Pareto front or a multidimensional front with several local optima.
Fig. 11 Galapagos in Grasshopper allows optimization against a user-defined fitness function with parametric input variables
This particular case of parametric and generative design as well as optimization techniques is interesting for model-based engineering and design. In contrast to the other models, the external model representation is much more closely connected to the mental model of the designer or engineer. As a result, diagrammatic reasoning, analogical reasoning, and/or hypothetical reasoning is much more straightforward, entirely because the parametric model is a white box model that is understandable to the designer or engineer. Hence, the model is potentially of much higher value to the designer or engineer (provided that the model remains understandable). Similarly, the optimization model can be highly valuable, provided that the designer or engineer understands how it works and is able to interpret results accordingly and map them to his or her own mental model and decisions.
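The two uses of a parametric model described above (saturating the design solution space, and searching it against a fitness function) can be illustrated with a minimal Python sketch. It does not reproduce Grasshopper or Galapagos; the box-shaped `box_model`, its parameters, and the fitness function are hypothetical and chosen only to make the idea concrete.

```python
from itertools import product

def box_model(width, depth, height):
    """Hypothetical parametric model of a box-shaped volume.
    Returns (floor_area, envelope_area); envelope = four walls plus roof."""
    floor_area = width * depth
    envelope_area = 2 * (width + depth) * height + width * depth
    return floor_area, envelope_area

def enumerate_designs(widths, depths, heights):
    """First use: evaluate every combination of input parameter values."""
    return [
        {"params": (w, d, h), "outputs": box_model(w, d, h)}
        for w, d, h in product(widths, depths, heights)
    ]

def best_by_fitness(designs, fitness):
    """Second use: return the design that maximizes a user-defined fitness."""
    return max(designs, key=lambda design: fitness(*design["outputs"]))

def pareto_front(designs):
    """Keep designs that no other design dominates
    (objectives: maximize floor area, minimize envelope area)."""
    def dominates(a, b):
        floor_a, env_a = a["outputs"]
        floor_b, env_b = b["outputs"]
        no_worse = floor_a >= floor_b and env_a <= env_b
        strictly_better = floor_a > floor_b or env_a < env_b
        return no_worse and strictly_better
    return [d for d in designs if not any(dominates(o, d) for o in designs)]

designs = enumerate_designs(widths=[4, 6, 8], depths=[4, 6], heights=[3, 4])
best = best_by_fitness(designs, lambda floor, envelope: floor / envelope)
front = pareto_front(designs)
print(len(designs), best["params"], len(front))
```

Because every rule is written out, the designer can trace why a given design ends up on the front, which is the white box property discussed above. Note that the single "best" design depends entirely on the fitness function chosen, whereas the Pareto front leaves the trade-off to the designer.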
Technology: An Assistant or “Just” an Extra Medium in the Creative Process?
This last section first reviews the value of, and the difference between, model-based and data-driven engineering and how the two may be combined in a meaningful manner to support the design and engineering process. Then, a qualitative evaluation is made of whether the aforementioned tools (BIM models, simulation models, IoT data streams, optimization models, parametric models, etc.) function as
extra media that enrich the creative design and engineering process or whether an even tighter connection with the machine may be possible so that it may transform into an assistant with a voice that also dares to have a different opinion.
Data-Driven and Model-Based Engineering
As indicated before, the latest AI wave has mainly brought a surge in data handling techniques. A large majority of this AI wave is focused on the statistical side of AI, which includes neural networks, machine learning, deep learning, prediction algorithms, and all other sorts of learning from data through dedicated expert statistics. Among the main enablers of this upsurge are the abundant availability of data storage and the exponentially increased computational power, both available at very low cost. As a result, it is becoming much more common to analyze incoming data streams using any of the available statistical AI algorithms (e.g., creation of a neural network, pattern recognition and prediction, etc.), gain new insights, and take new actions (top row in Fig. 12). Such data analytics workflows lead from data to a visual output (e.g., a dashboard similar to what is shown in Fig. 8 and top-right in Fig. 3). A key point when working with such machine learning output is that it always still needs to be interpreted by a domain specialist (right in Fig. 12). While machine learning (ML) algorithms can find patterns, clusters, classifications, and more, a domain expert is still needed to take that output and make decisions and take actions accordingly. A second key feature of this form of data-driven engineering is the presence of bias in the input data (top left in Fig. 12). Data is typically provided by the same person who eventually interprets the results of the data analytics workflow, and that almost always creates bias (which is understandable and to an extent unavoidable). A third feature of data-driven engineering is the role that the data analyst plays in tuning the algorithms in the middle (black box, gray box). These algorithms come with a number of settings and choices to be made (e.g., number of hidden layers in a neural network, choosing between available activation functions in a neural network, choosing between data imputation techniques, etc.), leading to a diversity of possible outputs that is determined by the data analyst. As a result, the combination of (1) expected data input and (2) the choices made for the data analysis determines the signature of the data analysis and its outcomes. In a way, there is therefore no “clean” or “objective” data analysis in such a data-driven engineering process. In fact, the choices made for data input as well as data analysis settings typically resemble the mental model of the data analyst and what she would prefer to see as an output. In other words, these two ingredients indirectly form an externalized model of reality and how that reality is expected to function. This to some extent also explains why data-driven processes often still lead to models, such as prediction models, neural network models, regression models, etc.
Fig. 12 Flows of data (input, output) and insights in data-driven and model-based engineering
In contrast to data-driven engineering, model-based reasoning and model-based engineering typically do not rely on input data.
Instead, the models are directly constructed by the designer and engineer, who makes a best effort to create a model (information model, simulation model, parametric model, etc.) that resembles her own mental model as closely as possible (green arrows in lower row of Fig. 12). Often, a very large investment is needed from the domain expert to create a model of high enough quality. Creating a simulation model, BIM model, information model, or parametric model is typically known as a creative and expensive activity. This process in itself obviously also introduces bias, or in any case personal identity, into the model, which is ideally minimized as far as possible, similar to how bias in a model would be minimized in a data-driven approach (e.g., checks/input by multiple sources and colleagues). As an example, there are diverse approaches towards building a parametric model, with proponents of each of these approaches; similarly, it is good to know the modeling guidelines followed by a modeler in her creation of a BIM model as displayed in Fig. 6. Recent research initiatives, in academia as well as industry, are looking into a combination of data-driven and model-based engineering approaches, as engineers would like to enjoy the benefits and added value of both. This combination is schematically displayed in Fig. 12 using an orange arrow. It is very difficult to make, as it is nearly impossible to relate an information model or simulation model to, for example, a neural network built from input data. It is not possible to merge such models. The best effort nowadays seems to consist of recognizing that there is a multitude of such models and they all deal with the
same physical object. A physical thing is then characterized using a number of very different models, each of which brings insight to the human spectator.
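The earlier point that the analyst's settings co-determine the outcome of a data-driven workflow can be made concrete with a small Python sketch. It takes one of the choices named in the text, the data imputation technique, and applies two common options to the same stream. The sensor readings and function names are hypothetical, and the sketch is an illustration under these assumptions rather than a recommended workflow.

```python
from statistics import mean

def impute_mean(values):
    """Replace each missing reading (None) by the mean of the observed readings."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def impute_forward_fill(values):
    """Replace each missing reading by the last observed reading.
    Assumes the first reading is observed; otherwise it stays None."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last
        filled.append(v)
        last = v
    return filled

# Hypothetical temperature stream with two dropped readings.
readings = [20.0, None, None, 26.0, 20.0]

by_mean = impute_mean(readings)
by_forward_fill = impute_forward_fill(readings)

# Same input data, different analyst choice, different summary on the dashboard.
print(mean(by_mean), mean(by_forward_fill))
```

Neither result is wrong. Each encodes an assumption about how the measured reality behaves while the sensor is silent, which is exactly the sense in which the analysis settings form part of an externalized model.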
The Machine as an Assistant?
In the above explanation (Fig. 12), it is clear that the machine is exclusively used to house a number of externalized models for the human agent. All of the interpretation and decision-making happens towards the right of the figure, where the human agent interprets the visual output to come to insights and new actions. In other words, in these approaches, the machine works as simply another medium in the hands of the human agent, like the architect’s pencil. Starting from that perspective, it is important to consider the way in which the human agent applies abductive reasoning and hypothesis to this output (see Fig. 5). The agent looks at this output as the outcome of an experiment (see Peirce’s process of enquiry) and thus directly seeks confirmation (ideal case) or refutation of an original hypothesis. For example, if the results from a simulation model confirm what the engineer had expected, this strengthens her belief both in her mental model of the design situation and in the reliability of the simulation model. The opposite effect is achieved if the results of the simulation model refute the engineer’s expectations. Similarly, for the data-driven workflow (top row in Fig. 12), if the ML agent gives unsurprising and expected output, then it is typically said to be correct and well-functioning (as it confirms the beliefs of the receiving human), while an ML agent that gives surprising and unexpected output is very often dismissed as faulty and erroneous (as it refutes beliefs and triggers doubt). If a surprising result is not dismissed, it is on the contrary often regarded as an amazing act of insightful and creative thinking, but only after the engineer has been able to locate the result in a revised belief (see Lakatos 1976).
While the novel insight is in some cases attributed to the machine, in reality the insight happens on the human side (see top right in Fig. 12). So, also in this case, the machine serves as simply another medium for the designer and engineer to work with. The only way in which a machine may transform into an independently thinking assistant that creates its own mental model and is able to make its own experiments, hypotheses, and actions is by enabling the feature of “interpretation” in this machine. While the main challenge here lies in the simulation of abductive reasoning inside a machine, the work is more complex than that: it also requires interactions with the external world that are not dictated by a human being, the possibility to make experiments that eventually still make sense to a human, and the time needed to learn sufficiently meaningful patterns to reason with. This is a rarely investigated avenue of research and will likely remain so for a while to come.
Conclusion
This chapter evaluated the position of abductive reasoning in creative design and engineering in three main sections, with particular attention to the role that diverse new technologies play in a designer or engineer’s creative design processes. The chapter first reviewed the design and engineering process. This process deals with wicked and ill-structured problems that cannot be solved but can only be re-solved over and over again. Every re-solution satisfices different sets of constraints and requirements that vary over time and place and that definitely also differ from person to person. Hence, the design and engineering process is an everlasting interactive communication between the designer/engineer and the ever-changing design situation. This two-way communication clearly hinges on the quality of “interpretation” and “hypothesis,” the two key human traits that allow one to make sense of a design situation and proceed. Interpretations and hypotheses are often faulty (hypotheses can by definition always be refuted) and also often out of the box and novel (an analogy makes an apposite, out-of-the-box link between a design situation and a matching experience). Based on these interpretation and hypothesis steps, creative actions are made that alter design situations according to the views of the person who made the interpretation and hypothesis to begin with. The notion of hypothesis is a clear trademark of abductive reasoning, which therefore has a very important role to play in creative design and engineering. A more common form of abductive reasoning that can more easily be recognized is analogical reasoning, or even diagrammatic reasoning. In making an analogy, a previously experienced concept is triggered by interpreted concepts in an experienced design situation.
So the repertoire of design and engineering experiences is triggered and used to create an “apposite bridge” (act of creation) that alters and reshapes the design situation. With the emergence of many new information technologies, this process is affected by the use of such technologies. This chapter reviewed some of these technologies, namely, computer-aided design (CAD) tools, building information modeling (BIM) tools, parametric models, simulation models, knowledge graphs and the Web of Data, data streams from the Internet of Things (IoT), and optimization models. These tools and technologies offer a rich set of model representation techniques. So, in addition to the sketch, written text, physical scale models, and oral conversation, designers and engineers nowadays also have all of these other technologies and techniques at their disposal. These tools and techniques mainly allow a design situation to be represented in different kinds of computer-based models, each with its own affordances of use (e.g., white box, gray box, or black box; model-based or data-driven). Yet, they house externalized models, which still stand apart significantly from the mental models used by the designer and engineer for decision-making. In the last section, a short review was given of data-driven versus model-based engineering and also of the role of the machine in this entire creative design and
engineering process. The most obvious role for the machine is that of a medium that allows a design situation to be represented (externally) and a number of computations to be performed, while all forms of interpretation remain with the human being. A role for the machine as an assistant that actively performs abductive reasoning, interpretation, and hypothesis, and is hence able to state an opinion or make a decision that contradicts the designer or engineer’s opinion or decision, does not yet exist and unfortunately remains far from reality.
References
Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7, 39–59.
Bayazit, N. (2004). Investigating design: A review of forty years of design research. Design Issues, 20(1), 16–29.
Boden, M. A. (2004). The Creative Mind: Myths and Mechanisms (2nd ed.). London: Routledge/Taylor & Francis Group.
Bogen, J. E. (1969). The other side of the brain II: An appositional mind. Bull Los Angeles Neurological Societies, 34(3), 135–162.
Borrmann, A., König, M., Koch, C., & Beetz, J. (2018). Building Information Modeling: Technology Foundations and Industry Practice. Springer.
Cross, N. (1982). Designerly ways of knowing. Design Studies, 3(4), 221–227.
Cross, N. (1990). The nature and nurture of design ability. Design Studies, 11(3), 127–140.
Cross, N. (1997). Descriptive models of creative design: Application to an example. Design Studies, 18(4), 427–455.
Cross, N. (2006). Designerly Ways of Knowing. London: Springer.
Cross, N. (2007). Forty years of design research. Design Studies, 28, 1–4.
Dorst, K. (2006). Design problems and design paradoxes. Design Issues, 22(3), 4–17.
Eastman, C. M. (1975). The use of computers instead of drawings in building design. AIA Journal, 63, 46–50.
Eastman, C. M. (2001). New directions in design cognition: Studies of representation and recall. In Design Knowing and Learning: Cognition in Design Education (pp. 147–198). Elsevier.
Eastman, C. M., Teicholz, P., Sacks, R., & Liston, K. (2008). BIM Handbook: A Guide to Building Information Modeling for Owners, Managers, Architects, Engineers, Contractors, and Fabricators. Hoboken: Wiley.
Gentner, D., Bowdle, B. F., Wolff, P., & Boronat, C. (2001). Metaphor is like analogy. In The Analogical Mind: Perspectives from Cognitive Science. Cambridge, MA: The MIT Press.
Goldschmidt, G. (1991). The dialectics of sketching. Design Studies, 4, 123–143.
Goldschmidt, G. (1994). On visual design thinking: The vis kids of architecture. Design Studies, 15(2), 158–174.
Grace, K., Saunders, R., & Gero, J. S. (2011). Interpretation-driven visual association. In D. Ventura, P. Gervás, F. D. Harrell, M. L. Maher, A. Pease, & G. Wiggins (Eds.), Proceedings of the Second International Conference on Computational Creativity, Mexico City (pp. 132–134). Universidad Autonoma Metropolitana – Unidad Cuajimalpa.
Heylighen, A. (2007). Building memories. Building Research & Information, 35, 90–100.
Hofstadter, D. R. (2001). Analogy as the core of cognition. In The Analogical Mind: Perspectives from Cognitive Science. Cambridge, MA: The MIT Press.
Hogan, A. (2020). The Web of Data. Cham: Springer.
Holyoak, K. J., Gentner, D., & Kokinov, B. N. (2001). Introduction: The place of analogy in cognition. The Analogical Mind: Perspectives from Cognitive Science. The MIT Press.
Kokinov, B. N. (1998). Analogy is like cognition: Dynamic, emergent and context sensitive. In Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences (pp. 96–105). NBU Press.
Kolodner, J. (1992). An introduction to case-based reasoning. Artificial Intelligence Review, 6, 3–34.
Kolodner, J. (1993). Case-Based Reasoning. San Francisco: Morgan Kaufmann.
Kurfess, F. J. (2003). Artificial Intelligence (pp. 609–629). Elsevier.
Lakatos, I. (1976). Proofs and Refutations: The Logic of Mathematical Discovery. Cambridge: Cambridge University Press.
Lakoff, G., & Johnson, M. (1980). The metaphorical structure of the human conceptual system. Cognitive Science, 4(2), 195–208.
Lawson, B. (2005). How Designers Think – The Design Process Demystified (4th ed.). Oxford: Architectural Press (Elsevier).
Magnani, L., & Casadio, C. (2016). Model-Based Reasoning in Science and Technology: Logical, Epistemological, and Cognitive Issues. Cham: Springer.
Maher, M. L., & Poon, J. (1996). Modelling design exploration as co-evolution. Microcomputers in Civil Engineering, 11, 195–209.
Maher, M. L., & Pu, P. (1997). Issues and Applications of Case-Based Reasoning in Design. San Francisco: Lawrence Erlbaum Associates.
March, L. J. (1976). The logic of design and the question of value. In The Architecture of Form (pp. 1–40). Cambridge, MA: Cambridge University Press.
Pauwels, P. (2017). Models in Architectural Design (pp. 975–988). Springer.
Pauwels, P., & Bod, R. (2013). Architectural design thinking as a form of model-based reasoning. In Model-Based Reasoning in Science and Technology (Studies in Applied Philosophy, Epistemology and Rational Ethics, Vol. 8, pp. 583–608). Berlin/Heidelberg: Springer.
Pauwels, P., De Meyer, R., & Van Campenhout, J. (2013). Design thinking support: Information systems versus reasoning. Design Issues, 29(2), 42–59.
Pauwels, P., Morkel, J., & Bod, R. (2014a). Reasoning processes involved in ICT-mediated design communication. In M. Laakso & K. Ekman (Eds.), Proceedings of NordDesign 2014 (pp. 213–222). Aalto Design Factory.
Pauwels, P., Strobbe, T., Derboven, J., & De Meyer, R. (2014b). Analysing the impact of constraints on decision-making by architectural designers. In K. Zreik (Ed.), Architecture, City & Information Design (pp. 97–111). EuropIA Productions.
Pauwels, P., Zhang, S., & Lee, Y.-C. (2017). Semantic web technologies in AEC industry: A literature review. Automation in Construction, 73, 145–165.
Peirce, C. S. (1878). How to make our ideas clear. Popular Science Monthly, 12, 286–302.
Peirce, C. S. (1958). Collected Papers of Charles Sanders Peirce. Vols. 1–6 (Eds. Charles Hartshorne & Paul Weiss) (1931–1935), vols. 7–8 (Ed. Arthur W. Burks) (1958). Cambridge: Harvard University Press.
Poon, J., & Maher, M. L. (1997). Co-evolution in design: A case study of the Sydney Opera House. In Y.-T. Liu, J.-Y. Tsou, & J.-H. Hou (Eds.), Proceedings of the Second Conference on Computer Aided Architectural Design Research in Asia (pp. 439–448). Hsinchu: Hu’s Publisher.
Rittel, H. W., & Webber, M. M. (1973). Dilemmas in a general theory of planning. Policy Sciences, 4, 155–169.
Rittel, H. W., & Webber, M. M. (1984). Planning problems are wicked problems. In Developments in Design Methodology (pp. 135–144). Wiley.
Schön, D. (1983). The Reflective Practitioner: How Professionals Think in Action. London: Temple Smith.
Simon, H. A. (1973). The structure of ill-structured problems. Artificial Intelligence, 4, 181–201.
Simon, H. A. (1996). The Sciences of the Artificial (2nd ed.). Cambridge: The MIT Press.
W3C (2021). W3C DATA ACTIVITY – Building the Web of Data. https://www.w3.org/2013/data/. [Online; accessed 11 Dec 2021].
Ward, T. B. (1994). Structured imagination: The role of category structure in exemplar generation. Cognitive Psychology, 27, 1–40.
Ward, T. B. (1998). Analogical distance and purpose in creative thought: Mental leaps versus mental hops. In Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences. NBU Press.
Abduction in the Evaluation of Designs
62
Andy Dong
Contents
Introduction
Logical Reasoning in Design Evaluation
Identifying Abductions in Natural Language Text
Influence of Abductive Reasoning on Design Evaluation
Conclusions
References
Abstract
By definition, evaluations, be they of ideas or artifacts, consist of judgments relative to established criteria of value. In the field of design, the evaluation of designs is intentionally rational. The evaluation should exhibit empirical, metrics-based reasoning to prove that the present design satisfies or exceeds the required criteria, or at least is more desirable than other possibilities, or it does not. If design evaluation is rational and guarantees verification of goodness according to prescribed criteria, what makes radical innovations possible? If designs are evaluated against established criteria of value, would evaluations suppress the reception of novel ideas that run counter to established criteria or when future notions of goodness will deviate from present evaluation standards? In situations in which ideas have highly uncertain potential and ambiguity exists over evaluation standards, a different approach is needed for evaluations. This chapter brings attention to a necessary shift in the form of logical reasoning from deductive evaluations, which affirm norms, to abductive evaluations, which can undermine the assertion of common norms. Hence, abductive evaluations are expected to favor innovation, especially when ideas deviate from established criteria.
Keywords
Concept selection · Radical innovation · Decision theory
A. Dong, School of Mechanical, Industrial, and Manufacturing Engineering, Oregon State University, Corvallis, OR, USA
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_46
Introduction During the process of design, it is not uncommon for multiple designs that address the same problem to be produced. Following a set-based approach to design (Sobek II et al., 1999), multiple, discrete designs would be generated using creative idea generation activities such as brainstorming, TRIZ (Altshuller, 1999), or design-by-analogy (Goel, 1997). Each alternative design will embody a different set of attributes and therefore cause trade-offs between objectives and requirements (Thurston, 1991). At some point, designers must stop generating alternatives and start assessing them for fitness relative to prescribed objectives, requirements, and constraints. Given a finite set of resources, designers can advance only a few designs for further development. Only one of the design alternatives will be developed into the final design. Many methods for the evaluation of design alternatives exist. In early stages of the design process, details about alternative design concepts are limited, and there may be ambiguity over the meaning of a design – i.e., what it “is” (Krippendorff, 1989). Designers might have nothing more than intuition and experience to judge the fitness of designs. Since the details about designs will not be known with much certainty, designers might apply qualitative design evaluation methods to rank alternatives using tools such as a pairwise comparison chart, concept screening matrix, or concept scoring matrix (Frey et al., 2009). In a group setting, they might apply social choice voting methods such as a Borda count and Copeland’s method. In later stages, computational models could be produced to simulate and analyze a design for its performance, or physical prototypes could be fabricated to test designs with potential customers. 
Given the existence of better information about designs, tools such as the house of quality (Hauser & Clausing, 1988) assist designers to evaluate the potential value of designs with respect to customer requirements and existing competitive products. While the formality of the methods for design evaluation increases as the design process advances, the purpose of design evaluation remains the same: to reduce the search space of potential design alternatives to a given problem down to a utility-maximizing option. Stuart Pugh (1991) describes this approach to design as “controlled convergence”: generate many alternative designs; filter out weak solutions; improve concepts based on relative strengths; and repeat iteratively until a utility-maximizing design emerges.
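The social choice voting methods mentioned above for group settings, the Borda count and Copeland's method, can be sketched in a few lines of Python. The three designers and their rankings of concepts A, B, and C are hypothetical, and the Copeland score used here (pairwise wins minus pairwise losses) is one common variant among several.

```python
from itertools import combinations

def borda_count(rankings):
    """Borda count: in a best-first ranking of n alternatives,
    the alternative at position i (0-based) earns n - 1 - i points."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] = scores.get(alternative, 0) + (n - 1 - position)
    return scores

def copeland(rankings):
    """Copeland's method (wins-minus-losses variant): compare every pair of
    alternatives by majority; a pairwise win adds 1, a pairwise loss subtracts 1."""
    alternatives = sorted(rankings[0])
    scores = {a: 0 for a in alternatives}
    for a, b in combinations(alternatives, 2):
        a_preferred = sum(r.index(a) < r.index(b) for r in rankings)
        b_preferred = len(rankings) - a_preferred
        if a_preferred > b_preferred:
            scores[a] += 1
            scores[b] -= 1
        elif b_preferred > a_preferred:
            scores[b] += 1
            scores[a] -= 1
    return scores

# Hypothetical best-first rankings of three design concepts by three designers.
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["A", "C", "B"],
]

print(borda_count(rankings))
print(copeland(rankings))
```

Both methods only aggregate the rankings they are given. The reasoning by which each designer arrived at a ranking stays outside the method, which is the aspect this chapter turns to.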
The main purpose of the evaluation of designs is therefore to decide which alternative may have the highest worth. Design evaluations verify a design according to a predetermined set of specifications “to establish how well suited they are to their purposes” (Goldschmidt, 1992, p. 76). The end goal is to choose the alternative worth pursuing because it best satisfies all of the objectives, requirements, and constraints. As design alternatives are built, tested, and iteratively refined, a single alternative eventually emerges and proceeds toward production. This chapter delves into a cognitive aspect of the evaluation of designs, namely, the designer’s logical reasoning that is involved during design evaluation independent of the method or tool of evaluation. Understanding the characteristics of cognitive operations which led to specific evaluation outcomes brings insight into the thinking underpinning methods and corrects the erroneous assumption that the evaluation method or tool alone mediates the outcome (Goldschmidt, 1992). Specifically, this chapter hypothesizes that the form of logical reasoning influences the outcome of the evaluation. Recent studies in design evaluation are starting to show, for example, that the way designers think as they evaluate alternatives affects the evaluation regardless of the specific method of evaluation. For example, if designers think about their ownership over alternatives as they evaluate them, they are more likely to prefer the alternatives they generated (Nikander et al., 2014). If they continue to generate alternatives as they evaluate the set of alternatives presented to them, they are more likely to accept novel alternatives than technically feasible ones (Toh & Miller, 2015). If they continue to think divergently even when evaluating, they are more likely to accept novel options (Berg, 2016). 
The aforementioned study has particular salience to this chapter because it raises an important question: how it is that a novel alternative can progress through the design process when the evaluation of alternatives relies on a prescribed set of objectives, requirements, and constraints? How can a novel alternative become a radical innovation unless it was selected for development? A radical innovation is by definition an unusual event. By definition, the innovation upends established criteria of value. How is it possible, then, for radical innovations to make it through the various gatekeepers of production and evaluation standards to legitimate the novelty of the innovation if criteria for evaluation follow established rules and norms? Social structures in organizations tend to create normative rules and procedures to reject new ideas that are considered too novel or too implausible (Mainemelis, 2010). This chapter describes forms of logical reasoning and the ways that they can manifest in design evaluation. It explains a method to identify the form of logical reasoning applied during design evaluation and the effect of the form of reasoning on the outcome of the design evaluation. In particular, I will present research suggesting that when abductive reasoning occurs during design evaluation, the likelihood of accepting a novel design increases. The occurrence of abductive reasoning is therefore hypothesized as a cognitive explanation for the emergence of a radical design innovation within an otherwise typical, set-based approach to design.
Logical Reasoning in Design Evaluation
The introduction of the iMac G3 in 1998 typifies exactly the challenges of the evaluation of a novel design that could become a radical innovation. The iMac G3 introduced a novel, translucent, curved form to the shape grammar of personal computers. Personal computers then and (still) today are typified by opaque plastic or metallic rectilinear shapes. Even the iMac G3’s first color is a blue that evokes an association with clear water rather than rigidity and durability. The technology press were not enthused. “It’s no slimmer than any computer and takes up a ton of room on my desk,” wrote one critic (Dreyfuss, 1998). The critic’s evaluation criterion: take up little desk space. Converted into deductive logic in which p is a premise and q is a conclusion, the critic’s deductive evaluation is:
p ⇒ q: IF no slimmer than any computer (p) THEN takes up a ton of room (q)
p: It’s no slimmer than any computer
q: [It] takes up a ton of room
The deduction is a physically provable true conclusion of the more general rule:
p ⇒ q: All large objects (p) ⇒ consume more space (q).
p: This (iMac) is a large object.
q: This object consumes more space.
Don Norman offered an alternative evaluation: “It’s cute, and it’s yours” (Vinzant, 1998). Norman’s evaluation, more specifically, his conclusion of “It’s cute,” has no scientifically or empirically true premise (yet). Instead, we could describe his explanation as abductive by piecing together the observation that the iMac “merges cool industrial design and usability” (Vinzant, 1998) and the conclusion “It’s cute”:
q: It’s cute.
p ⇒ q: Objects that merge cool industrial design and usability (p) ⇒ evoke cuteness (q).
p: This (iMac) merges cool industrial design and usability.
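The difference in direction between the two inference schemas above can be shown with a minimal Python sketch. Rules are stored as (p, q) pairs: deduction runs from a known premise p to the conclusion q, whereas abduction runs from an observed q back to candidate premises p. The rule set is an assumption made for illustration. In particular, the rule "resembles a toy ⇒ is cute" does not appear in the text and is added only to show that abduction may return several rival explanations.

```python
def deduce(rules, facts):
    """Deduction (modus ponens): from p => q and p, conclude q."""
    return {q for p, q in rules if p in facts}

def abduce(rules, observation):
    """Abduction: from p => q and an observed q, propose each such p
    as a candidate explanation. The result is fallible, not proven."""
    return {p for p, q in rules if q == observation}

RULES = [
    ("is a large object", "consumes more space"),
    ("merges cool industrial design and usability", "is cute"),
    ("resembles a toy", "is cute"),  # hypothetical rival explanation
]

# The critic's evaluation: the premise is known, the conclusion follows.
print(deduce(RULES, {"is a large object"}))

# Norman's evaluation: the effect is observed, a cause has to be hypothesized.
print(sorted(abduce(RULES, "is cute")))
```

The sketch covers only the selection among explanations that already exist as rules. It does not capture how a new rule comes to be proposed in the first place.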
Norman’s explanation is an example of selective or explanatory abduction (Magnani, 2001). In an explanatory abduction, individuals choose the best candidate explanation for an observation from a multitude of explanations, typically from some established knowledge, that may have been used to explain other observations but not the current one. An effect is observed, but the cause of the effect is not known with logical or scientific surety (Galle, 1996); hence, an explanatory abduction is offered to explain the effect. Before the introduction of the iMac G3, no personal computer evoked cuteness. It was not yet empirically true whether merging industrial design and usability could evoke cuteness in personal computers. Could merging industrial design and usability be the cause of cuteness in personal computers (Galle, 1996)? The explanatory abduction of cuteness appears to have come to Norman “like a flash . . . an act of insight, although of extremely fallible insight” (Peirce, 1934). It is possible that the idea of merging industrial design
and usability was already in Norman’s mind, “but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation” (Peirce, 1934). Abductive hypotheses are based on observations and, more broadly, on the knowledge available at a given time, thus connecting the realms of “what is” and “what might be” (Kroll & Koskela, 2014). In sum, explanatory abductions require novel ideation in response to explanations of observations. In the following sections, this chapter describes techniques to identify abductive reasoning in natural language, that is, from everyday talk and writing. Then, the chapter summarizes empirical research showing the effect of abductive reasoning on the acceptance of novel designs and the triggers that cause individuals to evaluate designs with explanatory abductions.
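The contrast between the critic’s deductive evaluation and Norman’s explanatory abduction can be made concrete with a small sketch. The Python code below is illustrative only and is not part of the studies cited: the `Rule` class, the two functions, and the rule base are assumptions introduced here, and the rule “resembles a toy ⇒ evokes cuteness” is a hypothetical competitor added solely to show that abduction returns several candidate premises rather than one certain conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premise: str      # p
    conclusion: str   # q


def deduce(rules, fact):
    """Deduction: given p and the rule p => q, conclude q."""
    return [r.conclusion for r in rules if r.premise == fact]


def abduce(rules, observation):
    """Selective abduction: given q and rules p => q, propose each p as a candidate cause."""
    return [r.premise for r in rules if r.conclusion == observation]


# Hypothetical rule base paraphrasing the two iMac evaluations.
RULES = [
    Rule("is a large object", "consumes more space"),
    Rule("merges cool industrial design and usability", "evokes cuteness"),
    Rule("resembles a toy", "evokes cuteness"),  # invented competitor, for illustration
]

print(deduce(RULES, "is a large object"))   # one certain conclusion
print(abduce(RULES, "evokes cuteness"))     # several fallible candidate explanations
```

Choosing among the candidates returned by `abduce` is the step that the sketch leaves open; in the chapter’s terms, that choice is the fallible “act of insight.”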
Identifying Abductions in Natural Language Text

Identifying abductions in natural language text is challenging because humans do not speak in formal logic. Humans do not readily flag premises and conclusions, or causes and effects, in natural language communication. As such, scholars have taken different approaches to the identification of abductive reasoning in natural language. Galle (1996) developed a method based upon the identification and interpretation of effects (E) and possible causes (C), which are equivalent to conclusions and premises. In this method, the analyst searches natural language text for the instantiation of detailed characteristics of a design (the effects or conclusions). Then, the analyst searches the surrounding text for causes (or premises) that explain the observed effect. In the example provided by Galle (1996, p. 262), “Option C” and “Option D” are realized design options to be evaluated; they are the effects. The causes of these effects are “strips as ‘figure’” and “relatively independent units . . . separated by the trees.” It is then up to the analyst to determine whether the effect follows from the causes – that is, are “strips as ‘figure’” and “relatively independent units . . . separated by the trees” the causes of “Option C” and “Option D”? If it is not possible to make this logical conclusion, then the explanation is abductive rather than deductive. Other scholars have developed several interpretive approaches that apply qualitative content analysis to code natural language. For example, Cramer-Petersen et al. (2019) provide four definitions for abductions based upon theories on abductive reasoning in design. They apply these definitions to perform verbal protocol analyses of design ideation sessions. Likewise, Dunne & Dougherty (2016) applied a “bottom-up” grounded theory-building approach to identify events and activities that involved abductive reasoning. Dong et al.
(2015) proposed a method rooted in the theory of abductive reasoning in design. The underlying concept in their approach is that abductive reasoning in design is about projecting possibilities. Their interpretive approach searches for explanations that project possibilities about a design such as alternative use contexts or alternative ways of framing the design, that is, seeing the design as something else than what
it is presently intended to be. They propose five criteria for the identification of abductive reasoning in design. The criteria relate to causal explanations for the design’s current or future state based upon possible but as yet unrealized:
• Structural or behavioral characteristics
• User needs
• Contextual factors
• Contexts of use
• Product category
For example, the “cuteness” (premise) of the iMac (conclusion) could be a new user need that was not previously known to be a need in relation to computer products. A computer as a piece of candy or furniture (Vinzant, 1998) (premise) rather than as a work of technology defies current product categorization. It is also possible to combine interpretive with linguistically driven approaches to increase the rigor and objectivity in the identification of abduction in natural language text. First, the analyst should start by identifying statements that describe the design and surrounding text that explains the description. Statements that describe the design or its characteristics are the conclusions (effects), and the surrounding texts are premises (causes). This follows the method proposed by Galle (1996). To further reduce the amount of text that needs to be analyzed for abductive reasoning, the analyst can apply any of the interpretive approaches summarized above. For example, the analyst might search for statements that propose ideas (Cramer-Petersen et al., 2019) or passages in which individuals are trying to make sense of new information (Dunne & Dougherty, 2016). Then, rather than relying solely upon deep content knowledge to determine whether the conclusions logically follow from the premises, and therefore whether the statement is deductive rather than abductive, analysts can apply two linguistic analyses: semantic specificity and linguistic modality. Semantic specificity refers to the degree to which a statement denotes an object, environment, or situation that is certain, particular, and specific rather than generalized (Mürvet, 1991). For example, the expression Prairie school houses has higher semantic specificity than house. Linguistic modality refers to the degree of certainty expressed about an observation (Palmer, 2001; Lyons, 1977). Linguistic modality is commonly expressed in English with subjunctive verbs such as “were” (e.g., “If the handle were . .
.”) and auxiliary verbs such as “might” (e.g., “If the design might be regarded as . . .”). Abductive statements are likely to have a high level of semantic specificity, because the conclusions are scoped to a design or to the characteristics of a design and the causes that explain the design or its characteristics are strongly associated with the context of the design rather than general principles or rules. The causes are likely to have weak modality, expressed through words such as “could” rather than “must” since it is not yet logically or scientifically known whether the cause is true. Reanalyzing an example of abduction (Dong et al., 2015) using this approach, semantic specificity and linguistic modality, indicated by boldface and italics typeface, respectively, are
enacted in the following statement: “I think this would be good for sick people who are like alone. They don’t have any friends or families and this helped them to remind them to take their medicine.” The hypothesis is that if there are people who are sick and alone, without friends or families, then their situation explains why a product that reminds them to take their medicine is good. It is uncertain whether this hypothesis is true because the cause is hedged with the word “would.”

q: remind them to take their medicine
p ⇒ q: sick people who are like alone . . . don’t have any friends or families (p) ⇒ remind them to take their medicine (q)
p: sick people who are like alone . . . don’t have any friends or families
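The linguistic-modality check described above lends itself to a simple first-pass screen. The Python sketch below is a hedged illustration, not the coding procedure used in the cited studies: the two cue-word lists are small assumptions made here, the function names are hypothetical, and the third example sentence is constructed for contrast. Semantic specificity is not modeled; the sketch only flags statements whose hedged cues outnumber certain ones so that an analyst can review them.

```python
import re

# Assumed, deliberately small cue lists; a real study would need a validated lexicon.
WEAK_MODALS = {"could", "might", "may", "would", "maybe", "perhaps"}
STRONG_MODALS = {"must", "will", "shall", "always"}


def modality_cues(statement: str) -> dict:
    """Return the weak and strong modality cue words found in a statement."""
    tokens = re.findall(r"[a-z]+", statement.lower())
    return {
        "weak": [t for t in tokens if t in WEAK_MODALS],
        "strong": [t for t in tokens if t in STRONG_MODALS],
    }


def is_abduction_candidate(statement: str) -> bool:
    """Flag a statement for manual review when hedged cues outnumber certain ones."""
    cues = modality_cues(statement)
    return len(cues["weak"]) > len(cues["strong"])


examples = [
    "I think this would be good for sick people who are like alone.",
    "But maybe you could clean the water sufficiently from one to the other in a bathroom",
    "All large objects must consume more space.",  # constructed deductive-style contrast
]
for text in examples:
    print(is_abduction_candidate(text), modality_cues(text))
```

A keyword screen of this kind can only narrow the text to be read; deciding whether a flagged statement is genuinely abductive still requires the interpretive criteria summarized above.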
Influence of Abductive Reasoning on Design Evaluation

Having explained methods to identify abductions in natural language text, this section describes some empirical studies on the effect of abductive reasoning in the evaluations of designs. Dong et al. (2015) investigated the effect of the form of logical reasoning on the outcome of the evaluations of designs. Committees were presented with discrete design alternatives and tasked with their evaluation and the selection of a single alternative to advance for further development. The design alternatives are considered novel because none of the alternatives exists on the market. To motivate the committee to make the best evaluation and selection within an experimental condition, the committee was informed that if they selected the same concept as an expert had previously accepted, they would receive a total payoff twice the base payoff. For ethical reasons, they always received this bonus. The committee members spoke aloud to verbalize their evaluations, which were then transcribed and analyzed to determine the form of logical reasoning taken by the committee. Each committee was primed for either deductive or abductive reasoning. The deductive prime offered them explicit criteria, such as market potential, against which to evaluate the alternatives. The abductive reasoning prime asked them to think of a possible future in 2–3 years’ time wherein further development of the selected alternative could lead to a viable innovation. Following the notation for deductive and abductive reasoning shown above, the primes were intended to trigger the following forms of logical reasoning:

Deductive reasoning prime:
p ⇒ q: IF p (market potential) THEN q (accept)
p: premise (market potential)
q: conclusion (accept)

Abductive reasoning prime:
q: a given fact, the proposed design alternative: q
p ⇒ q: a rule to be inferred first: IF p THEN q
p: the conclusion: p
Under the abductive reasoning prime, the committee is hypothesized to invent the rule IF p THEN q. For example, the committees saw a device that introduced a new way (function) to turn on and off other devices. One example of abductive reasoning is:

q: turn appliances on or off: q
p ⇒ q: IF It’s for pour me a drink. Make me a sandwich THEN turn appliances on or off
p: It’s for pour me a drink. Make me a sandwich: p
Under the abductive reasoning prime, the probability of the acceptance of a concept increased by nearly 90%. The reason for the increased probability of acceptance is that a committee member would attempt to explain the existence of the design by hypothesizing the conditions of possibility for its existence since this is what the prime asked the committee to do. Under the deductive prime, the committee simply evaluates the design according to the status quo. Novel designs are therefore rejected if the current environment (market) is unsupportive. In a follow-up study (Guenther et al., 2020), individuals were presented with a similar design evaluation and selection task except that the evaluations and selections were performed individually rather than in a committee. In addition, the level of creativity of each individual was assessed using the Alternative Uses Test (AUT) (Guilford & Hoepfner, 1971). Before selecting a concept, participants were asked to provide up to ten possible extensions “that could create new, viable follow-on business opportunities in the next 2–3 years.” This activity served as the abductive reasoning prime. The authors reported that higher levels of creativity in individuals positively correlated with the amount of abductive reasoning manifested in their generated extensions of the presented designs and with the likelihood of accepting novel designs. Usefully, the increased propensity to accept novel designs did not increase their propensity to accept inferior designs. The authors conjecture that the extensions served as a trigger for mentally exploring further, evaluating, and eventually selecting the most plausible ones. In order for an individual to make the conclusion that the presented design alternative could exist, the extension became part of the hypothesis: IF extension THEN design. Since this hypothesis is neither scientifically nor empirically true, it generates epistemic uncertainty. 
The individual would have been motivated to reduce this epistemic uncertainty through mental time travel (Suddendorf et al., 2009) to generate a series of intermediate changes that would precede the extension or mental simulation (Christensen & Schunn, 2009) to imagine how the presented design might be modified to accommodate the extension. In a detailed analysis of reasoning patterns throughout a design task that primarily involved the generation of design alternatives, Cramer-Petersen et al. (2019) discovered that design practitioners apply abductive reasoning to evaluate candidate alternatives – that is, ideas that were being proposed but not yet fully developed into a design alternative. The abductive reasoning enabled the participants in the study to propose new frames or perspectives on the functions needed to address the design objectives, corroborating the framework presented by Dong et al. (2015), and these new frames ultimately led them to a novel design alternative. In one episode, the designers propose a way to solve the design problem by reusing water. Importantly, this idea
came from the evaluation of other methods that entail returning wastewater to a central treatment facility. The statement that proposes the principle of reusing water, and that exhibits abductive reasoning, is: “But maybe you could clean the water sufficiently from one to the other in a bathroom” (Cramer-Petersen et al., 2019, p. 58, Table 7). Later, another designer proposes a possible solution: “Yes, it could be that you could make a closed circuit.” Converting these statements into abductive reasoning:

q: clean the water sufficiently from one to the other in a bathroom: q
p ⇒ q: IF make a closed circuit THEN clean the water sufficiently from one to the other in a bathroom
p: make a closed circuit: p
The important observation in this study is that design evaluations do not occur only at the end of a cycle of idea generation. They occur during the process of idea generation, too – that is, while designers are generating design alternatives. During the process of idea generation, the evaluation of initial concepts can entail abductive reasoning. The abductive reasoning serves as a way for designers to produce new frames for ideas that they had just introduced, thereby creating a launching pad to other design alternatives.
Conclusions

This chapter presents a review of the influence of abductive reasoning in the evaluation of design options. The chapter hypothesizes that abductive reasoning plays a role in modifying the outcome of putatively objective evaluations of design options such that novel options that do not readily satisfy extant criteria could be selected. The chapter presents empirical research illustrating the effect of abductive reasoning on the propensity for individuals and groups to select novel options and, generally, to make better selections when presented with novel options and no opportunity to generate further information at the point of decision-making. While the chapter focused on the evaluation of design options, the implications of the research presented extend beyond the professional practice of design. The opportunity to choose novel options occurs in everyday life. Students might be given opportunities to take courses or internships in areas that are not familiar to them but could be attractive. Individuals might be given opportunities to try a new restaurant or new experience that is “outside of their comfort zone.” Companies might generate ideas for new lines of business and must decide whether to make initial investments to test their commercial viability. Companies in the creative industries might be presented with opportunities to produce new types of movies, television shows, or plays. Explanations for the selection of novel options range from a person’s self-perception of being a person who makes decisions independent of the tastes and judgements of others (Cowart et al., 2008) to social structures and cultures preferring novelty over conformity (Mainemelis, 2010). This chapter proposes that the form of logical reasoning at the point of decision-making (selection of a novel option) can influence the propensity for novel ideas to be selected even when the selection criteria favor the familiar. Abductive reasoning
can help the decision-maker to generate fresh insights, reframe the idea, or refine the new idea. This book chapter contributes a new explanation for the selection of novel options based upon observations of abductive reasoning in design evaluation.
References

Altshuller, G. (1999). The Innovation Algorithm: TRIZ, Systematic Innovation and Technical Creativity. Worcester, MA: Technical Innovation Center, Inc.
Berg, J. M. (2016). Balancing on the creative highwire: Forecasting the success of novel ideas in organizations. Administrative Science Quarterly, 61(3), 433–468.
Christensen, B. T., & Schunn, C. D. (2009). The role and impact of mental simulation in design. Applied Cognitive Psychology, 23(3), 327–344.
Cowart, K. O., Fox, G. L., & Wilson, A. E. (2008). A structural look at consumer innovativeness and self-congruence in new product purchases. Psychology & Marketing, 25(12), 1111–1130.
Cramer-Petersen, C. L., Christensen, B. T., & Ahmed-Kristensen, S. (2019). Empirically analysing design reasoning patterns: Abductive-deductive reasoning patterns dominate design idea generation. Design Studies, 60, 39–70.
Dong, A., Lovallo, D., & Mounarath, R. (2015). The effect of abductive reasoning on concept selection decisions. Design Studies, 37, 37–58.
Dreyfuss, J. (1998). The iMac: Not quite cool enough. Forbes, 138(9), 239–240.
Dunne, D. D., & Dougherty, D. (2016). Abductive reasoning: How innovators navigate in the labyrinth of complex product innovation. Organization Studies, 37(2), 131–159.
Frey, D., Herder, P., Wijnia, Y., Subrahmanian, E., Katsikopoulos, K., & Clausing, D. (2009). The Pugh controlled convergence method: Model-based evaluation and implications for design theory. Research in Engineering Design, 20(1), 41–58.
Galle, P. (1996). Design rationalization and the logic of design: A case study. Design Studies, 17(3), 253–275.
Goel, A. K. (1997). Design, analogy, and creativity. IEEE Expert, 12(3), 62–70.
Goldschmidt, G. (1992). Criteria for design evaluation: A process-oriented paradigm. In Y. Kalay (Ed.), Principles of Computer-Aided Design: Evaluating and Predicting Design Performance (pp. 67–79). New York: Wiley.
Guenther, A., Eisenbart, B., & Dong, A. (2020). Creativity and successful product concept selection for innovation. International Journal of Design Creativity and Innovation, 9, 1–17.
Guilford, J. P., & Hoepfner, R. (1971). The Analysis of Intelligence. New York: McGraw-Hill.
Hauser, J. R., & Clausing, D. (1988). The house of quality. Harvard Business Review, 66(3), 63–73.
Krippendorff, K. (1989). On the essential contexts of artifacts or on the proposition that “design is making sense (of things)”. Design Issues, 5(2), 9–39.
Kroll, E., & Koskela, L. (2014). On abduction in design. In J. S. Gero & S. Hanna (Eds.), Design Computing and Cognition DCC’14 (pp. 357–376). Dordrecht: Springer.
Lyons, J. (1977). Semantics (Vol. 1). Cambridge: Cambridge University Press.
Magnani, L. (2001). Abduction, Reason, and Science: Processes of Discovery and Explanation. New York: Kluwer Academic/Plenum Publisher.
Mainemelis, C. (2010). Stealing fire: Creative deviance in the evolution of new ideas. Academy of Management Review, 35(4), 558–578.
Mürvet, E. (1991). The semantics of specificity. Linguistic Inquiry, 22(1), 1–25.
Nikander, J. B., Liikkanen, L. A., & Laakso, M. (2014). The preference effect in design concept evaluation. Design Studies, 35(5), 473–499.
Palmer, F. R. (2001). Mood and Modality (2nd ed.). Cambridge: Cambridge University Press.
Peirce, C. S. (1934). Book 1: Lectures on pragmatism. In C. Hartshorne & P. Weiss (Eds.), Collected Papers of Charles Sanders Peirce (volume 5: Pragmatism and Pragmaticism, chapter Lecture 7: Pragmatism and Abduction). Cambridge, MA: Belknap Press.
Pugh, S. (1991). Total Design: Integrated Methods for Successful Product Engineering. Reading: Addison-Wesley.
Sobek II, D. K., Ward, A. C., & Liker, J. K. (1999). Toyota’s principles of set-based concurrent engineering. Sloan Management Review, 40(2), 67–83.
Suddendorf, T., Addis, D. R., & Corballis, M. C. (2009). Mental time travel and the shaping of the human mind. Philosophical Transactions of the Royal Society B: Biological Sciences, 364(1521), 1317–1324.
Thurston, D. L. (1991). A formal method for subjective design evaluation with multiple attributes. Research in Engineering Design, 3(2), 105–122.
Toh, C. A., & Miller, S. R. (2015). How engineering teams select design concepts: A view through the lens of creativity. Design Studies, 38, 111–138.
Vinzant, C. (1998). The iMac: Fast like cheetah, cute like kitten. Fortune, 138(9), 46.
Logical Processes Underlying Creative and Innovative Design
63
Sharifu Ura
Contents
Introduction
Logical Inferences
First-Order Abduction-Centric Inference Chain
Second-Order Abduction-Centric Inference Chain
C-K Mapping and Creative Design
Compelling Reason of Creative Design
Quantitative Assessment of Creativity
Conclusions
Appendix A: Calculating Information Content Under Epistemic Uncertainty
References
Abstract
This chapter presents logical processes underlying creative and innovative design. The introductory section defines the key notions (design, design process, creative design process, and epistemic uncertainty) and refers to the relevant papers that the readers may consider while going through this chapter. The following section presents the logical inferences (deduction, induction, abduction, and other scientific explanations). The next section presents the abduction-centric logical network where innovation does not occur. The section after that presents a new abduction-centric logical network where innovation occurs. The network can be represented by a concept-knowledge mapping consisting of several mappings between concepts (designs) and knowledge. The mappings result in a set of propositions. Some of the propositions are true or false; some are partially true or false. The concept-knowledge mapping shows that new knowledge must be injected to justify a creative design, and before justifying it, it must be conceived. The conceiving process needs existing knowledge as well as creative knowledge. The noncreative design entails low compelling reason (low suitability in terms of the design requirements) and low epistemic challenge (because the performance is known). On the other hand, the creative design entails high compelling reason (high suitability in terms of the design requirements) and high epistemic challenge (because the performance is not known yet). A set of multivalued logic-based functions is used to measure the information content of the creative and noncreative designs. It is found that the information content must be maximized for conceiving a creative design.

S. Ura, Division of Mechanical and Electrical Engineering, Kitami Institute of Technology, Kitami, Japan
e-mail: [email protected]
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_15

Keywords
Creativity · C-K theory · Abduction · Knowledge · Epistemic uncertainty · Concept map
Introduction

Designing an artifact entails a decision-making process, and when a creative design is considered, step-by-step parameter analyses must be carried out (Kroll et al., 2001; Kroll, 2013) where both limited domain knowledge and human cognition play their respective roles (Ullah, 2008). This chapter, in contrast, presents the logical processes underlying creative and innovative design. A comprehensive treatment of logical processes is presented in the following sections. For better understanding, this section describes the key notions – design, design process, and creative and innovative design process. A design means a tangible or intangible solution to a problem. Thus, a design process is represented by a problem-solving process. Consequently, a design process cannot be completed without the required knowledge. At the outset of a design process, epistemic uncertainty occurs. When we continue a design process under epistemic uncertainty, the outcomes of the design process are the creative or innovative designs. The schematic diagram shown in Fig. 1 explains the abovementioned notions – design, design process, epistemic uncertainty, and creative design. As seen in Fig. 1, at the outset of a design process, we have limited knowledge regarding a design and an abundance of choices. This lack of knowledge and abundance of choices create a certain kind of uncertainty called epistemic uncertainty. While carrying out a design process under epistemic uncertainty, the process faces ill-defined scenarios and partially true propositions. As a result, the suggested solutions remain controversial and entail a huge amount of information. These solutions, in turn, are defined as creative or innovative designs. Thus, if we want to formalize the abovementioned creative or innovative design process, we need to apply logical operations that work under epistemic uncertainty. The logical operations must operate on the existing knowledge.
At the same time, the logical operations must support the creation of new knowledge. Elucidating such logical operations is a challenging task. This chapter sheds some light on how to elucidate the logical operations for performing design computation under a lack of knowledge while restricting ourselves to fewer design choices.

Fig. 1 Premise of innovation or creativity-driven design (vertical axis: percentage [%], from 0 to 100; horizontal axis: time, not necessarily linear; curves: knowledge of the problem and freedom of making choices; the conceptual phase is the innovation- or creativity-driven phase)

The rest of this chapter is divided into the following sections: Logical Inferences, First-Order Abduction-Centric Inference Chain, Second-Order Abduction-Centric Inference Chain, C-K Mapping and Creative Design, Compelling Reason of Creative Design, Quantitative Assessment of Creativity, and Conclusions. The section Logical Inferences presents three well-known inferences (deduction, induction, and abduction) as well as two scientific explanations called I-S explanation and S-R explanation. While going through each section, the readers may refer to the following works:
• Logical Inferences: Peirce (1931–1958), Koskela et al. (2018), Salmon (1965), Hitchcock (1995), and Ullah (2020)
• First-Order Abduction-Centric Inference Chain: Sharif Ullah et al. (2012)
• Second-Order Abduction-Centric Inference Chain: Sharif Ullah et al. (2012)
• C-K Mapping and Creative Design: Hatchuel and Weil (2009), Hatchuel et al. (2011), Hatchuel et al. (2013, 2018), Kroll et al. (2014), Sharif Ullah et al. (2012), Kroll and Koskela (2016), Ullah (2020), Ausubel (2000), and Kant et al. (1998)
• Compelling Reason of Creative Design: Ausubel (2000), Kant et al. (1998), Kroll et al. (2014), and Sharif Ullah et al. (2012)
• Quantitative Assessment of Creativity: Zadeh (1975), Ullah (2005), Sharif Ullah (2005), Sharif Ullah and Tamaki (2011), and Sharif Ullah et al. (2012)
Logical Inferences

This section presents the logical inferences often found in the extant literature. The inferences can be divided into two categories. The first category consists of three types of inferences found in the literature of logic studies. The other category consists of two inferences found in the literature of philosophy of science. The first inference found in the literature of logic studies is called deduction. The expression is as follows.

$$\text{Deduction}: \; (A \to B) \land (A) \vdash B, \qquad ((A \to B) \land (B \to C)) \vdash (A \to C) \tag{1}$$
In Eq. (1), $A$, $B$, and $C$ are true propositions and their logical implications $(A \to B)$ and $(B \to C)$ are uncontroversial. For example, if $(A \to B)$ = (bird → fly) and a bird $(A)$ is given, then the conclusion is that it must fly $(B)$. The second inference found in the literature of logic studies is called induction. The expression is as follows.

$$\text{Induction}: \; (O_1, \ldots, O_n) \vdash (A \to B) \tag{2}$$
In Eq. (2), $O_1, \ldots, O_n$ refer to a finite set of observations, experimental results, experiences, or datasets. The entities denoted as $A$ and $B$ are consistent with $O_1, \ldots, O_n$. Thus, induction is used to find logical implications $(A \to B)$ from a finite set of observations, experimental results, experiences, or datasets. For example, consider the scenario shown in Fig. 2. As seen in Fig. 2, a set of experimental results $\{(x_i, y_i) \mid i = 1, \ldots, 10\}$ regarding two variables denoted as $x$ and $y$ are plotted on a two-dimensional scatter plot. Thus, $O_1, \ldots, O_{10} = \{(x_i, y_i) \mid i = 1, \ldots, 10\}$, i.e., $n = 10$. The results can be used to induce logical implications. For this particular case, the following logical implications can be induced: (1) $(A \to B)$ = ($x < p$ → $y$ linearly increases) and (2) $(A \to B)$ = ($x > p$ → $y$ linearly decreases). In the same way, most of the algorithms used in artificial intelligence use induction to infer rules from given datasets. The third inference found in the literature of logic studies is called abduction. The expression is as follows:

$$\text{Abduction}: \; (A \to B) \land (B) \vdash \text{Plausible } A_1, A_2, \ldots \tag{3}$$
As such, abduction produces plausible (hypothetical) outcomes denoted as $A_1, A_2, \ldots$ from a logical implication $(A \to B)$ and its conclusion $(B)$. For example, if $(A \to B)$ = (bird → fly) and the object must fly $(B)$, then the conclusion is that it could be a bird ($= A_1$), a helicopter ($= A_2$), or another object that can fly ($= A_3, A_4, \ldots$). The plausible outcomes other than the obvious one (i.e., $A_2, A_3, \ldots$, not $A_1$) are therefore neither true nor false until a new piece of deductive or inductive knowledge is available. Consequently, abduction is more conducive to creative or innovative design than deduction and induction; its role in producing creative or innovative design is presented in the following sections.

Fig. 2 A set of arbitrary observations (scatter plot of the experimental results; horizontal axis: x, with the point p marked; vertical axis: y)

Other than the inferences underlying logic studies (i.e., deduction, induction, and abduction), the literature of philosophy of science advocates two more inferences (often referred to as scientific explanations). The first one is called Inductive-Statistical (I-S) explanation, introduced by Hempel (1965). The expression is as follows:

$$\text{I-S Explanation}: \; (\Pr(x \mid y) \cong 1) \land (y) \vdash \text{Most-likely}(x) \tag{4}$$
As defined in Eq. (4), if the conditional probability Pr(x|y) is very close to unity and the condition y occurs, then the outcome x most likely occurs. Thus, I-S explanation permits probability (or likelihood) to play its role in inference. Its essence is close to induction as defined in Eq. (2). The other scientific explanation is the Statistical Relevance (S-R) explanation introduced by Salmon (1965). The expression is as follows:

\[ \text{S-R Explanation}: \; (\Pr(x \mid y) \neq \Pr(x)) \wedge (y) \;\xrightarrow{\text{Most-likely}}\; (x) \tag{5} \]

As such, S-R explanation replaces the near-unity conditional probability (Pr(x|y) ≅ 1) with statistical relevance (Pr(x|y) ≠ Pr(x), i.e., the condition y changes the probability of x) to infer a most likely event (x). This makes the scientific explanation more pragmatic. It is worth mentioning that apart from I-S and S-R explanations, there is another explanation known as Deductive-Nomological (D-N) explanation, which takes the form of deduction as defined in Eq. (1). The outcomes of scientific explanations (D-N, I-S, and S-R explanations), deduction, and induction are represented by so-called scientific theories, postulates, and rules. These entities constitute the body of so-called scientific knowledge. As such, not only pure cognitive inference (D-N explanation, deduction) but also empiricism (i.e., I-S and S-R explanations, and induction) plays a vital role in constructing the body of “scientific knowledge.” Consequently, a design process explicitly or implicitly involves the abovementioned inferences and explanations.

The remarkable thing is that scientific knowledge is not static; it evolves. According to Popper, an entity becomes a “scientific theory” if it is specific enough to be tested in the real-world domain (i.e., the domain of experience). If the scientific theory is tested and found to be false, the old theory remains valid for specific cases, and, at the same time, a new theory replaces it for some other cases. This is what happened with Newtonian and Einsteinian physics. Even though Einsteinian physics subsumes Newtonian physics, the theories of Newtonian physics remain valid in some cases only. Thus, when the growth of the body of knowledge and the conceiving of “new” things are the concern, abduction (Eq. 3) starts to play its role along with the other inferences mentioned above. This is further described in the next section.
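The two statistical explanations of Eqs. (4) and (5) can be illustrated with relative frequencies. The following is a minimal sketch, not the chapter's method: the records are hypothetical (y, x) pairs, the 0.95 threshold standing in for "very close to unity" is an assumption, and statistical relevance is read in Salmon's sense, i.e., Pr(x|y) differing from Pr(x).

```python
from fractions import Fraction

def pr_x_given_y(records):
    """Relative frequency of x among the records where the condition y holds."""
    outcomes = [x for (y, x) in records if y]
    return Fraction(sum(outcomes), len(outcomes))

def pr_x(records):
    """Relative frequency of x over all records."""
    return Fraction(sum(x for (_, x) in records), len(records))

def is_explanation(records, threshold=Fraction(95, 100)):
    """I-S reading: y makes x most likely when Pr(x|y) is close to 1.

    The threshold is an illustrative assumption, not a value from the text.
    """
    return pr_x_given_y(records) >= threshold

def sr_explanation(records):
    """S-R reading: y is statistically relevant to x when Pr(x|y) != Pr(x)."""
    return pr_x_given_y(records) != pr_x(records)

# Hypothetical records of (condition y observed, outcome x observed).
relevant = [(True, True)] * 10 + [(False, True)] * 2 + [(False, False)] * 8
irrelevant = ([(True, True)] * 3 + [(True, False)] * 2
              + [(False, True)] * 3 + [(False, False)] * 2)

print(is_explanation(relevant), sr_explanation(relevant))      # True True
print(is_explanation(irrelevant), sr_explanation(irrelevant))  # False False
```

Exact fractions are used so that the relevance test is not disturbed by floating-point rounding.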
First-Order Abduction-Centric Inference Chain

The above section presents a mechanized view of the five major inference mechanisms. It also points out that the simultaneous actions of these mechanisms lead to an ever-growing knowledge base, where new knowledge is injected due to abduction. The byproduct of newly injected knowledge is perhaps the main ingredient of creative or innovative design. Based on this consideration, this section presents an abduction-centric inference chain for creative and innovative design. This time, a more semantic representation is considered for better understanding.

Consider the scenario shown in Fig. 3. In this scenario, four inferences, namely, induction, I-S explanation, deduction, and abduction, are involved. The description is as follows. First, a set of observations on objects able to fly is arranged. These observations are processed by induction or I-S explanation. The outcome is a logical implication: if bird then fly. This implication can be further processed by deduction and abduction. For performing deduction, two pieces of information are needed. One is the logical implication (if bird then fly), and the other is the premise (there is a bird). Thus, deduction results in a conclusion: it can fly. Abduction also needs two pieces of information. One is the logical implication (if bird then fly), and the other is the premise (an object that must fly). Thus, abduction results in a conclusion: bird or other objects. One of the plausible states of “other objects” could be a helicopter. This solution (helicopter) does not produce any contradiction given the datasets of objects able to fly, as schematically shown in Fig. 3. The remarkable thing is that abduction is more instrumental for a design process than deduction. As such, a design process can be represented by an abduction-centric inference chain, as shown in Fig. 4.
This chain is defined as the first-order abduction-centric design process. The phrase “first-order” refers to the fact that there are other complex forms of the abduction-centric design process; this is perhaps the simplest form.

Fig. 3 An inference-driven chain (observations on objects able to fly; induction or I-S explanation: if bird then fly; deduction: if bird then fly, a bird, hence it can fly; abduction: if bird then fly, object must fly, hence bird or other objects, e.g., a helicopter; no contradiction with the observations)

Fig. 4 First-order abduction-centric design process

As seen in Fig. 4, a first-order abduction-centric design process can be represented by the four propositions, as follows:
1) Existing knowledge of objects able to fly.
2) A bird can fly.
3) Helicopter is consistent with the existing knowledge of objects able to fly.
4) Helicopter is a design solution that can fly.
The above propositions are all true, as proven by the existing knowledge. Note that “existing knowledge” results from the logical inference called induction or I-S explanation described above (Fig. 3). The function “fly” is the central notion or focus of these propositions. The scenario also shows that a design process is a set of propositions where both design solutions (bird and helicopter) and design knowledge coexist. The remarkable thing is that the logical inferences remain implicit and amalgamate as knowledge.
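The deduction and abduction steps of the bird and helicopter scenario can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the rule is taken as already induced, and the set of observed flyers (including the extra entry "kite") is an assumed stand-in for the datasets of objects able to fly.

```python
RULE = ("bird", "fly")                            # induced implication: if bird then fly
OBSERVED_FLYERS = {"bird", "helicopter", "kite"}  # hypothetical dataset of objects able to fly

def deduce(rule, fact):
    """Deduction: from (A -> B) and A, conclude B; otherwise nothing follows."""
    antecedent, consequent = rule
    return consequent if fact == antecedent else None

def abduce(rule, required, observed):
    """Abduction: from (A -> B) and B, list plausible causes A1, A2, ...

    A1 is the rule's own antecedent; the remaining candidates are the
    observed objects, which do not contradict the required function.
    """
    antecedent, consequent = rule
    if required != consequent:
        return []
    others = sorted(obj for obj in observed if obj != antecedent)
    return [antecedent] + others

print(deduce(RULE, "bird"))                  # fly
print(abduce(RULE, "fly", OBSERVED_FLYERS))  # ['bird', 'helicopter', 'kite']
```

Deduction returns a single certain conclusion, whereas abduction returns an open list of candidates, which is why it is the step that can yield a design solution such as the helicopter.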
Second-Order Abduction-Centric Inference Chain

The first-order abduction-centric logical chain is not necessarily the right way to capture the activities underlying creative and innovative design because this chain results in an existing design. Consequently, the logical inference chain (Fig. 3) and the resulting design process (Fig. 4) need to be revisited. This section revisits the design process shown in Fig. 4 and presents a modified one that captures the essence of creative and innovative design (i.e., epistemic uncertainty, see Fig. 1). In order to overcome epistemic uncertainty, “new knowledge” must be added to existing knowledge. Hence, a logical inference more sophisticated than those presented before must be introduced. This logical process is denoted as second-order abduction. It entails new knowledge. Thus, the design process can be represented as shown in Fig. 5. As seen in Fig. 5, the second-order abduction-centric design process adds a few more propositions due to the injection of knowledge into the process. Thus, the design process shown in Fig. 5 boils down to the following seven propositions.

1) Existing knowledge of objects able to fly.
2) A bird can fly.
3) Helicopter is consistent with the existing knowledge of objects able to fly.
4) Helicopter is a design solution that can fly.
5) New knowledge suggests that X can fly.
6) X is a design solution.
7) X is not consistent with the existing knowledge.
Consequently, the solution denoted as X is the creative or innovative design, and the logical inferences involved in formulating the new knowledge are second-order abduction.
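The seven propositions can be read as a simple membership test against two knowledge sets. The sketch below is an assumption-laden illustration, not the chapter's formalism: knowledge is modeled as a set of (object, function) pairs, and the labels "ordinary", "creative", and "unsupported" are hypothetical names.

```python
# Hypothetical knowledge sets: each entry pairs an object with a function it performs.
EXISTING_KNOWLEDGE = {("bird", "fly"), ("helicopter", "fly")}
NEW_KNOWLEDGE = {("X", "fly")}

def holds(concept, function, knowledge):
    """True when `knowledge` supports the proposition 'concept can perform function'."""
    return (concept, function) in knowledge

def classify(concept, function, existing=EXISTING_KNOWLEDGE, new=NEW_KNOWLEDGE):
    """Label a concept by the knowledge set that supports it."""
    if holds(concept, function, existing):
        return "ordinary"
    if holds(concept, function, new):
        return "creative"
    return "unsupported"

for name in ("bird", "helicopter", "X"):
    print(name, classify(name, "fly"))
# bird ordinary
# helicopter ordinary
# X creative
```

The same proposition about X is unsupported by the existing knowledge and supported by the new knowledge, so its truth depends on the knowledge set consulted.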
Fig. 5 Second-order abduction-centric design process
Since two sets of knowledge are involved in the creative or innovative design process, the truth value of a proposition depends on whether it is judged against the existing or the new knowledge. For example, the proposition “X is a design solution” is not true in the context of existing knowledge, but it is true in the context of new knowledge. As a result, a creative or innovative design process blends true and false propositions and therefore encounters fuzziness, which calls for fuzzy or multivalued logic for quantitative analysis. This issue is treated with more rigor in the subsequent sections.
C-K Mapping and Creative Design

The descriptions of creative or noncreative design processes boil down to some propositions and manifest two distinct domains. One of the domains consists of design solutions (helicopter, bird, and X), and the other consists of knowledge (existing knowledge and new knowledge). Therefore, a natural description of the design process becomes a C-K mapping, where “C” means concept and “K” means knowledge. The C-K mapping of the creative or innovative design process is shown in Fig. 6.
Fig. 6 C-K mapping of creative or innovative design process
As seen in Fig. 6, there are mappings between knowledge and concepts (K or C → C or K). These mappings gradually enrich the knowledge domain. On the other hand, ordinary concept (design solution) and creative concept (design solution, too) populate the concept domain. The C-K mapping of creative or innovative design process entails the following propositions.

1) Existing knowledge provides ordinary concept (K → C).
2) New knowledge is added to existing knowledge (K → K).
3) New knowledge results in creative concept (K → C).
4) Existing knowledge cannot explain creative concept (K → C).
5) Creative concept can replace ordinary concept (C → C).
6) Creative concept becomes design solution (C → K).
7) Design solution injects more knowledge (K → K).
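The seven propositions above can be stored as labeled edges, which makes it easy to query the mapping. This is a minimal sketch under stated assumptions: the tuple layout and the helper names are hypothetical, and the mapping kinds are copied verbatim from the list above.

```python
from collections import Counter

# (source, relation, target, kind) for each proposition, as listed in the text.
MAPPINGS = [
    ("existing knowledge", "provides", "ordinary concept", "K->C"),
    ("new knowledge", "is added to", "existing knowledge", "K->K"),
    ("new knowledge", "results in", "creative concept", "K->C"),
    ("existing knowledge", "cannot explain", "creative concept", "K->C"),
    ("creative concept", "can replace", "ordinary concept", "C->C"),
    ("creative concept", "becomes", "design solution", "C->K"),
    ("design solution", "injects more", "knowledge", "K->K"),
]

def count_by_kind(mappings):
    """Count how many propositions fall under each mapping kind."""
    return Counter(kind for (_, _, _, kind) in mappings)

def targets_of(source, mappings):
    """List the (relation, target) pairs that start at `source`."""
    return [(relation, target) for (s, relation, target, _) in mappings if s == source]

print(count_by_kind(MAPPINGS)["K->C"])           # 3
print(targets_of("creative concept", MAPPINGS))
# [('can replace', 'ordinary concept'), ('becomes', 'design solution')]
```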
The semantic phrases (e.g., “is added to”) make the propositions underlying Figs. 5 and 6 different, but their essential meaning is the same because the main concepts (existing knowledge, new knowledge, ordinary concept, and creative concept) are the same. The C-K mapping shown in Fig. 6 rightly narrates the creative or innovative design process. However, it is not clear how the new knowledge evolves. Does it evolve after a creative concept is conceived, or before? The question implies that the mappings shown in Fig. 6 follow a sequence, and this sequence determines whether or not a creative concept precedes new knowledge. This is an open question, but one pragmatic answer is that a segment of existing knowledge and the ordinary concept dictate a decision-making process, which manifests a creative concept. When the creative concept is studied further, new knowledge justifying the creative concept evolves. Consequently, a sequence exists among existing knowledge, new knowledge, ordinary concept, and creative concept, as schematically illustrated in Fig. 7. A decision-driven process plays a central role.

The immediate question is, what is this decision-driven process? The answer to this question must incorporate the notion of epistemic uncertainty described in the introductory section (Fig. 1). What happens is that the ordinary concept may not fulfill the design needs under consideration. These needs motivate the consideration of additional existing knowledge in the knowledge domain, provoking a new but undecided solution. This undecided solution becomes the creative design. When the creative design is pursued further, new knowledge evolves. For better understanding, consider designing a propulsion engine for Mars exploration, as schematically illustrated in Fig. 8.
Fig. 7 Interplay of existing knowledge, new knowledge, ordinary concept, and creative concept

Fig. 8 An example of conceiving a creative design

Though the conventional fossil-fuel-based engine (C1) works well in the earth's atmosphere (knowledge of the earth's atmosphere is denoted as K1), as confirmed by the knowledge of fossil-fuel-based engine performance (K2), it may not work in the Mars atmosphere. Therefore, a creative design, i.e., a new propulsion engine, must replace the fossil-fuel-based engine for Mars exploration. As such, the knowledge regarding the Mars atmosphere (K3) can be considered to find a plausible engine. Based on K3, the Mg-CO2 propulsion engine (C2) can be considered because the atmosphere of Mars is full of magnesium (Mg) and CO2. For an Mg-CO2 propulsion engine, magnesium and CO2 can serve as the fuel and oxidizer, respectively. Since the knowledge of the Mg-CO2-based propulsion engine (K4) is not yet available, i.e., K4 is empty (K4 = ∅), the performance of the Mg-CO2-based propulsion engine is unknown or undecided. In summary, both compelling reason and undecidedness coexist for the creative design. These two factors are the characteristics of epistemic uncertainty associated with creative design. They are described further in the subsequent sections.
Compelling Reason of Creative Design

As mentioned in the previous section, compelling reason and undecidedness coexist when the C-K mapping conceives a creative design. This section describes the mechanism behind the compelling reason. Compelling reason means that instead of an ordinary design, a creative design should be conceived to meet the design needs in a befitting manner even though its performance is unknown (i.e., it may or may not work). In order to elucidate the compelling reason for creative design, a scenario is considered, as shown in Fig. 9. In this scenario, the chunks of knowledge denoted as K1, K2, and K3 (Fig. 8) are given in detail. As seen in Fig. 9, in addition to K1, K2, and K3, there is another piece of knowledge denoted as KCR (knowledge of compelling reason) that acts while the ordinary and creative designs (C1 and C2) are conceived. In this particular case, K1 means the following proposition: Earth's atmosphere supplies ample O2 and hydrocarbon and hardly supplies CO2. This piece of knowledge creates two pieces of analytic a priori knowledge denoted as KCR1 = all engines need fuel and oxidizer and KCR2 = ample supply of fuel and oxidizer is essential for engines. These three pieces of knowledge inject the ordinary design C1 = an Internal Combustion (IC) engine that uses hydrocarbon as fuel and O2 as oxidizer. On the other hand, K3 (= Mars atmosphere hardly supplies O2, supplies ample Mg and CO2, and does not supply hydrocarbon) couples with a piece of creative knowledge KCR3 = an engine for Mars exploration also needs fuel and oxidizer. This piece of knowledge injects the creative design (C2 = an engine for Mars exploration can use Mg as fuel and CO2 as oxidizer).
Fig. 9 Mechanism of compelling reason
Thus, compelling reason can be expressed as follows:

\[ \text{Compelling Reason}: \; (K_1, K_{CR}, K_3, C_1) \rightarrow C_2 \tag{6} \]
The case shown in Fig. 9 gives KCR = {KCR1, KCR2, KCR3}. The remarkable thing is that at least one of the elements of KCR must be creative knowledge. In the case shown in Fig. 9, KCR3 is a piece of creative knowledge. Creative knowledge exists in three categories: analytic a priori-based creative knowledge, synthetic a priori-based creative knowledge, and synthetic a posteriori-based creative knowledge (Ullah, 2020). For this particular case, KCR3 is analytic a priori-based creative knowledge. Thus, along with other ingredients, an implicit piece of creative knowledge must accompany a creative design process as far as the compelling reason is concerned.
Quantitative Assessment of Creativity

This section presents how to quantitatively assess the creative or innovative design process described so far in this chapter. For better understanding, two notions are considered. The first is the compelling reason, and the other is the epistemic challenge. The states of the ordinary design (C1 = fossil-fuel-based propulsion engine) and the creative design (C2 = Mg-CO2-based propulsion engine) are schematically illustrated in Fig. 10. As seen in Fig. 10, the ordinary design (C1) entails low compelling reason, whereas the creative design (C2) entails high compelling reason; the reason is described in the previous section (Fig. 9). On the other hand, the ordinary design (C1) entails low epistemic challenge because its performance is already known (i.e., K1, K2, and K3 are known), whereas the creative design entails high epistemic challenge because K4 is still unknown. No matter the state of epistemic challenge and compelling reason of the design solutions, the design requirements are the same. This time, the requirements are the following: (1) the engine should be suitable for the Mars atmosphere; (2) the engine performance should be satisfactory.

Based on the above consideration, the information content of ordinary and creative designs can be calculated. The information content has two components. The first component quantifies the certainty entropy (information content of certainty or uncertainty), and the other quantifies the requirement entropy (information content regarding requirement fulfillment). Certainty entropy equal to unity means that all propositions are equally true or false (i.e., all truth values are equal to 0.5). It happens when the knowledge is absolutely incomplete (a state of complete ignorance). Certainty entropy equal to zero means that all propositions under consideration are absolutely true or false (i.e., all truth values are equal to 0 or 1). On the other hand, requirement entropy equal to zero means that the requirement is fulfilled completely. Requirement entropy equal to unity means that the requirement is not fulfilled at all. The settings for determining the certainty and requirement entropies of C1 and C2 are shown in Tables 1 and 2, respectively.
Fig. 10 States of ordinary and creative designs in the creative design process

Table 1 Setting for determining certainty and requirement entropies of C1

| Proposition | Truth value (linguistic) | Truth value (numerical) | Requirement |
|---|---|---|---|
| P11 = C1 is suitable for Mars atmosphere | Mostly false | 0.1 | Engine should be suitable for Mars atmosphere |
| P12 = C1 is not suitable for Mars atmosphere | Perhaps true | 0.73 | Engine should be suitable for Mars atmosphere |
| P13 = Performance of C1 is satisfactory | Mostly true | 0.9 | Engine performance should be satisfactory |
| P14 = Performance of C1 is not satisfactory | Mostly false | 0.1 | Engine performance should be satisfactory |
For each design, four propositions are considered. The first two propositions concern suitability, and the other two concern performance. The requirements supply two more propositions (engine should be suitable for Mars atmosphere, and engine performance should be satisfactory) that remain the same for both designs. The mathematical settings needed to calculate certainty entropy and requirement entropy are summarized in Appendix A. These settings are used to calculate the certainty entropy, requirement entropy, and their aggregated values (the values of coherency measure). The results are described as follows.

Refer to Table 1. P11 and P12 deal with the compelling reason, i.e., whether or not it is true that C1 is suitable for Mars exploration. The underlying design requirement is “engine should be suitable for Mars exploration.” On the other hand, P13 and P14 deal with the epistemic challenge, i.e., whether or not C1's performance is indeed known. The underlying requirement is “engine performance should be satisfactory.” As listed in Table 1, P11 is “mostly false” and P12 is “perhaps true,” given the knowledge of Mars atmosphere (K3). On the other hand, P13 is “mostly true” and P14 is “mostly false,” given the knowledge of engine performance (K2). The values of certainty entropy and requirement entropy of P11 and P12 are 0.37 and 1, respectively, whereas the values of certainty entropy and requirement entropy of P13 and P14 are 0.2 and 0, respectively. As a result, the overall information content of C1 is 1.74 (i.e., the value of coherency measure, see Appendix A). The information contents are plotted in Fig. 11. As seen in Fig. 11, the epistemic challenge lies low (requirement entropy 0) and the compelling reason lies high (requirement entropy 1).

Fig. 11 Information content of an ordinary design (C1) (requirement entropy versus certainty entropy, both on [0, 1]; plotted points: compelling reason (suitableness) and epistemic challenge (performance))

Table 2 Setting for determining certainty and requirement entropies of C2

| Proposition | Truth value (linguistic) | Truth value (numerical) | Requirement |
|---|---|---|---|
| P21 = C2 is suitable for Mars atmosphere | Perhaps true | 0.73 | Engine should be suitable for Mars atmosphere |
| P22 = C2 is not suitable for Mars atmosphere | Perhaps false | 0.27 | Engine should be suitable for Mars atmosphere |
| P23 = Performance of C2 is satisfactory | Not sure | 0.5 | Engine performance should be satisfactory |
| P24 = Performance of C2 is not satisfactory | Not sure | 0.5 | Engine performance should be satisfactory |

Refer to Table 2. P21 and P22 deal with the compelling reason, i.e., whether or not it is true that C2 is suitable for Mars exploration. The underlying design requirement is “engine should be suitable for Mars exploration.” On the other hand, P23 and P24 deal with the epistemic challenge, i.e., whether or not C2's performance is indeed known. The underlying requirement is “engine performance should be satisfactory.” As listed in Table 2, P21 is “perhaps true” and P22 is “perhaps false,” given the knowledge of Mars atmosphere (K3). On the other hand, the truth values of both P23 and P24 are “not sure” (i.e., neither true nor false) because K4 is empty, which is a complete lack of knowledge. The values of certainty entropy and requirement entropy of P21 and P22 are 0.54 and 0, respectively, whereas the values of certainty entropy and requirement entropy of P23 and P24 are both equal to unity. The overall information content of C2 is equal to 3 (in terms of coherency measure), which is a very high value compared to that of C1 (1.74). The information contents are plotted in Fig. 12. Note the relative positions of the epistemic challenge (up) and compelling reason (down).

Fig. 12 Information content of a creative design (C2) (requirement entropy versus certainty entropy, both on [0, 1]; plotted points: epistemic challenge (performance) and compelling reason (suitableness))

This means that the compelling reason and epistemic challenge of a creative design are positioned opposite to those of an ordinary design. Moreover, the information content of a creative design is very high compared to that of a noncreative design. The above quantification shows that a creative design process increases information content. This means that a creative design process violates the second axiom of axiomatic design (minimize information content) introduced by Suh (1990).
Conclusions

A creative or innovative design process cannot be elucidated using simple abduction or any other single logical process (deduction, induction, I-S explanation, or S-R explanation). A chain of logical inferences dominated by second-order abduction carries out creative design. This chain can ultimately be represented by C-K mapping. In this mapping, the instance of conceiving a creative design is governed by creative knowledge assisted by existing knowledge. The whole mapping exhibits a very high amount of information content in the sense of epistemic uncertainty.
Fig. 13 Logical process of creative design compared to ordinary design (concept domain and knowledge domain; existing knowledge leads to the ordinary concept with low information content, and creative knowledge leads to the creative concept with high information content)
Consequently, the graphical summary of the logical process of creative or innovative design is the diagram shown in Fig. 13.
Appendix A: Calculating Information Content Under Epistemic Uncertainty

Let TV ∈ [0,1] be the numerical truth value. Let LTV = {LTV1 = mostly false, LTV2 = perhaps false, LTV3 = not sure, LTV4 = perhaps true, LTV5 = mostly true} be a set of five linguistic truth values. The membership functions of the linguistic truth values are as follows:

\[ \mu_{\text{mostly false}}(TV) = \max\left(0, \min\left(1, \frac{0.3 - TV}{0.3 - 0}\right)\right) \tag{A1} \]

\[ \mu_{\text{perhaps false}}(TV) = \max\left(0, \min\left(\frac{TV - 0}{0.3 - 0}, \frac{0.5 - TV}{0.5 - 0.3}\right)\right) \tag{A2} \]

\[ \mu_{\text{not sure}}(TV) = \max\left(0, \min\left(\frac{TV - 0.3}{0.5 - 0.3}, \frac{0.7 - TV}{0.7 - 0.5}\right)\right) \tag{A3} \]

\[ \mu_{\text{perhaps true}}(TV) = \max\left(0, \min\left(\frac{TV - 0.5}{0.7 - 0.5}, \frac{1 - TV}{1 - 0.7}\right)\right) \tag{A4} \]

\[ \mu_{\text{mostly true}}(TV) = \max\left(0, \min\left(1, \frac{TV - 0.7}{1 - 0.7}\right)\right) \tag{A5} \]

Fig. A1 Membership functions of linguistic truth values (membership value versus truth value, both on [0, 1]; curves: mostly false, perhaps false, not sure, perhaps true, mostly true)
Figure A1 illustrates these membership functions. The average truth value of a linguistic truth value, denoted as ATV(LTVi), i = 1, ..., 5, is calculated using the centroid method, as follows:

\[ ATV(LTV_i) = \frac{\int_0^1 \mu_{LTV_i}(TV) \cdot TV \, dTV}{\int_0^1 \mu_{LTV_i}(TV) \, dTV} \tag{A6} \]
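The centroid calculation can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, the five membership functions are written as triangles (with shoulders at 0 and 1) using the breakpoints 0, 0.3, 0.5, 0.7, and 1 of Eqs. (A1) to (A5), and the integrals of Eq. (A6) are approximated with a midpoint rule whose step count is an arbitrary choice.

```python
def triangle(left, peak, right):
    """Triangular membership function; left == peak or peak == right gives a shoulder."""
    def mu(tv):
        rising = 1.0 if peak == left else (tv - left) / (peak - left)
        falling = 1.0 if right == peak else (right - tv) / (right - peak)
        return max(0.0, min(1.0, rising, falling))
    return mu

MEMBERSHIP = {
    "mostly false": triangle(0.0, 0.0, 0.3),
    "perhaps false": triangle(0.0, 0.3, 0.5),
    "not sure": triangle(0.3, 0.5, 0.7),
    "perhaps true": triangle(0.5, 0.7, 1.0),
    "mostly true": triangle(0.7, 1.0, 1.0),
}

def average_truth_value(mu, steps=10_000):
    """Centroid of mu over [0, 1], approximated with the midpoint rule."""
    h = 1.0 / steps
    points = [(k + 0.5) * h for k in range(steps)]
    area = sum(mu(tv) for tv in points)
    moment = sum(mu(tv) * tv for tv in points)
    return moment / area

for name, mu in MEMBERSHIP.items():
    print(name, round(average_truth_value(mu), 2))
# mostly false 0.1
# perhaps false 0.27
# not sure 0.5
# perhaps true 0.73
# mostly true 0.9
```

For a triangle the centroid is also the mean of its three breakpoints, e.g., (0 + 0.3 + 0.5)/3 ≈ 0.27 for "perhaps false", which agrees with the numerical result.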
The resulting values are ATV(mostly false) = 0.1, ATV(perhaps false) = 0.27 (rounded to two decimal places), ATV(not sure) = 0.5, ATV(perhaps true) = 0.73 (rounded to two decimal places), and ATV(mostly true) = 0.9. These values are used in Tables 1 and 2. The information content of TV or ATV is a tent function. The expression is as follows:

\[ I(x) = \max\left(0, \min\left(\frac{x - 0}{0.5 - 0}, \frac{1 - x}{1 - 0.5}\right)\right) \tag{A7} \]
Fig. A2 Information content of numerical truth value (information content I versus TV or ATV, both on [0, 1])
In Eq. (A7), x ∈ {TV, ATV}. This function is illustrated in Fig. A2. When a proposition is absolutely true or false (TV = 0 or TV = 1), the proposition does not carry any information, i.e., I = 0. It happens when the knowledge is complete. When a proposition is neither true nor false (TV = 0.5), the proposition carries the highest amount of information, i.e., I = 1. It happens when there is a complete lack of knowledge. For TV ∈ (0, 0.5), the information content linearly increases with the increase in TV, and for TV ∈ (0.5, 1), the information content linearly decreases with the increase in TV. Thus, the function of information content I(.) is instrumental in quantifying the lack or abundance of knowledge, i.e., epistemic uncertainty. For a set of propositions, the average information content is called Certainty Entropy (CE). The expression is as follows:

\[ CE = \frac{\sum_{i=1}^{n} I(TV_i \text{ or } ATV_i)}{n} \tag{A8} \]
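The certainty entropies quoted for Tables 1 and 2 can be reproduced with a few lines. This is a minimal sketch with hypothetical function names; the tent function follows the verbal description above (zero at truth values 0 and 1, one at 0.5, linear in between), and each issue is carried by a pair of propositions from the tables.

```python
def information_content(x):
    """Tent function: 0 at x = 0 or 1, rising linearly to 1 at x = 0.5."""
    return max(0.0, min((x - 0.0) / (0.5 - 0.0), (1.0 - x) / (1.0 - 0.5)))

def certainty_entropy(truth_values):
    """Average information content of a set of truth values."""
    return sum(information_content(tv) for tv in truth_values) / len(truth_values)

# Numerical truth values of the proposition pairs in Tables 1 and 2.
print(round(certainty_entropy([0.1, 0.73]), 2))   # P11, P12 -> 0.37
print(round(certainty_entropy([0.9, 0.1]), 2))    # P13, P14 -> 0.2
print(round(certainty_entropy([0.73, 0.27]), 2))  # P21, P22 -> 0.54
print(round(certainty_entropy([0.5, 0.5]), 2))    # P23, P24 -> 1.0
```

The sketch covers the certainty entropy only; the aggregation into the coherency measure is not defined in this appendix and is therefore not reproduced.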
Note that for the cases shown in Tables 1 and 2, n = 2 because a set of two propositions (e.g., P11 and P12) collectively carries the information regarding an issue (performance or suitability). The Requirement Entropy (RE) quantifies the degree to which a requirement remains unfulfilled. Thus, RE = 0 when the requirement is completely fulfilled, and RE = 1 when the requirement is not fulfilled at all. Otherwise, RE ∈ (0,1). The expression of RE is as follows:

\[ RE = \begin{cases} 1 & TVR \le b \\ \dfrac{a - TVR}{a - b} & TVR \in (b, a] \\ 0 & TVR > a \end{cases} \tag{A9} \]
In Eq. (A9), TVR is the truth value of the requirement proposition, and a and b are the maximum and minimum truth values, respectively, of a given set of propositions that carry the information of an issue. For the cases shown in Tables 1 and 2, TVR is the truth value (this time, the average truth value) of proposition P11, P13, P21, or P23. As such, for the cases shown in Tables 1 and 2, RE = 0 or 1.
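Equation (A9) can be sketched as a small piecewise function. The function name is hypothetical, and the requirement proposition of each pair is assumed to be the positively worded one (P11, P13, P21, P23), consistent with the requirement entropies reported for Tables 1 and 2.

```python
def requirement_entropy(tvr, truth_values):
    """Piecewise requirement entropy following Eq. (A9).

    `tvr` is the truth value of the requirement proposition; a and b are the
    maximum and minimum truth values of the propositions carrying the issue.
    """
    a, b = max(truth_values), min(truth_values)
    if tvr <= b:
        return 1.0          # requirement not fulfilled at all
    if tvr > a:
        return 0.0          # requirement completely fulfilled
    return (a - tvr) / (a - b)

print(requirement_entropy(0.1, [0.1, 0.73]))    # P11 with P12 -> 1.0
print(requirement_entropy(0.9, [0.9, 0.1]))     # P13 with P14 -> 0.0
print(requirement_entropy(0.73, [0.73, 0.27]))  # P21 with P22 -> 0.0
print(requirement_entropy(0.5, [0.5, 0.5]))     # P23 with P24 -> 1.0
```

Testing the first branch before dividing also covers the case a = b (as for P23 and P24), so the middle branch never divides by zero.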
References

Ausubel, D. P. (2000). The acquisition and retention of knowledge. Kluwer Academic Publishers.
Hatchuel, A., & Weil, B. (2009). C–K design theory: An advanced formulation. Research in Engineering Design, 19(4), 181–192.
Hatchuel, A., Masson, P. L., & Weil, B. (2011). Teaching innovative design reasoning: How concept–knowledge theory can help overcome fixation effects. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 25(1), 77–92.
Hatchuel, A., Weil, B., & Masson, P. L. (2013). Towards an ontology of design: Lessons from C–K design theory and forcing. Research in Engineering Design, 24(2), 147–163.
Hatchuel, A., Masson, P. L., Reich, Y., & Subrahmanian, E. (2018). Design theory: A foundation of a new paradigm for design science and engineering. Research in Engineering Design, 29(1), 5–21.
Hempel, C. G. (1965). Aspects of scientific explanation. In C. G. Hempel (Ed.), Aspects of scientific explanation and other essays in the philosophy of science (p. 331). Free Press.
Hitchcock, C. R. (1995). Salmon on explanatory relevance. Philosophy of Science, 62(2), 304–320. http://www.jstor.org/stable/188436
Kant, I., Guyer, P., & Wood, A. W. (1998). Critique of pure reason (The Cambridge edition of the works of Immanuel Kant). Cambridge University Press.
Koskela, L., Paavola, S., & Kroll, E. (2018). The role of abduction in production of new ideas in design. In P. E. Vermaas & S. Vial (Eds.), Advancements in the philosophy of design (Design Research Foundations) (pp. 153–183). Springer.
Kroll, E. (2013). Design theory and conceptual design: Contrasting functional decomposition and morphology with parameter analysis. Research in Engineering Design, 24(2), 165–183.
Kroll, E., & Koskela, L. (2016). Explicating concepts in reasoning from function to form by two-step innovative abductions. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 30(02), 125–137.
Kroll, E., Condoor, S. S., & Jansson, D. G. (2001). Innovative conceptual design: Theory and application of parameter analysis. Cambridge University Press.
Kroll, E., Masson, P. L., & Weil, B. (2014). Steepest-first exploration with learning-based path evaluation: Uncovering the design strategy of parameter analysis with C–K theory. Research in Engineering Design, 25(4), 351–373.
Peirce, C. S. (1931–1958). Collected papers of Charles Sanders Peirce (Vols. 1–6, C. Hartshorne, & P. Weiss (Eds.); Vols. 7–8, A. W. Burks (Ed.)). Harvard University Press.
Salmon, W. C. (1965). The status of prior probabilities in statistical explanation. Philosophy of Science, 32(2), 137–146.
Sharif Ullah, A. (2005). A fuzzy decision model for conceptual design. Systems Engineering, 8(4), 296–308.
Sharif Ullah, A. M. M., & Tamaki, J. (2011). Analysis of Kano-model-based customer needs for product development. Systems Engineering, 14(2), 154–172.
Sharif Ullah, A. M. M., Rashid, M. M., & Tamaki, J. (2012). On some unique features of C–K theory of design. CIRP Journal of Manufacturing Science and Technology, 5(1), 55–66.
Suh, N. P. (1990). The principles of design. Oxford University Press.
Ullah, A. M. M. S. (2005). Handling design perceptions: An axiomatic design perspective. Research in Engineering Design, 16(3), 109–117.
Ullah, A. M. M. S. (2008). Logical interaction between domain knowledge and human cognition in design. International Journal of Manufacturing Technology and Management, 14(1–2), 215–227.
Ullah, A. S. (2020). What is knowledge in Industry 4.0? Engineering Reports, 2(8), e12217.
Zadeh, L. A. (1975). The concept of a linguistic variable and its application to approximate reasoning – I. Information Sciences, 8(3), 199–249.
Abduction and Design Theory: Disentangling the Two Notions to Unbound Generativity in Science
64
Ehud Kroll, Pascal Le Masson, and Benoit Weil
Contents
Introduction: Design Theory to Shed New Light on Abduction
Abduction and Design: Disentangling the Two Notions
  Abduction as the “Kernel of Design”? The Critical Issue of Generativity
  The Issue of Generativity in Abduction
  Advances in Design Theory: Accounting for Generativity Without Relying on Abduction
  Disentangling Design Theory and Abduction Opens Avenue for Research
Abduction Through the Lens of C-K Design Theory: Bounded and Preservative Generativities in Science
  Method: Analyze Abduction with Design Theory
  Analysis
Results and Discussion
  Result 1: The Unknowns in Abduction – Why Scientific Concepts Are More than “Hypotheses”
  Result 2: Abduction as “Bounded Generativity”
  Result 3: Toward Unbounded Abduction – Facing the Issue of Preservative Generation
Conclusion: Design Theory to Unbound Generativity of Abduction?
References
E. Kroll
Department of Mechanical Engineering, ORT Braude College, Karmiel, Israel

P. Le Masson · B. Weil
Center of Management Science (CGS) – i3 UMR CNRS 9217, Mines Paris – PSL, Paris, France

© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_47
Abstract
Research on design theory and research on abduction have long developed in two parallel streams without connections. However, some researchers have noticed that design and abduction might be fruitfully connected: they identified some forms of abduction in design processes and characterized the variety, and even uniqueness, of forms of abduction in design. Following this stream of work, this chapter analyzes how design theory might help uncover some critical properties of abduction and, conversely, how this analysis might also help uncover particular facets of design, namely, the logic of preservative generativity. More specifically, in recent years, research on design theory has contributed to reconstructing a basic science, design theory, that accounts for the logic of generativity. Moreover, design theory developed without relying on the notion of abduction. Hence, design theory appears as an interesting scientific analytical framework to analyze the generativity logic of design abduction and, more generally, abduction in science. This analysis leads to two main propositions: (1) descriptions of abduction tend to underestimate the generative potential of abduction, making it a form of “bounded generativity,” and (2) unbounding generativity in abduction would require discussing the relationship between generativity and preservation in the construction of scientific hypotheses.

Keywords
Design abduction · Design theory · C-K theory · Generativity
Introduction: Design Theory to Shed New Light on Abduction

Research on design theory and research on abduction have long developed in two parallel streams without connections. Nineteenth- and twentieth-century works on “Konstruktionslehre” in Germany (König, 1999; Heymann, 2005) did not rely on abduction, although they dealt with invention and knowledge creation. Conversely, in the same period, abduction research – be it Peirce’s abduction or more recent works in philosophy of science – did not refer to design theory, although it related to “discovery” and the introduction of new ideas in knowing systems (Fann, 1970). However, some researchers have noticed that design and abduction might be fruitfully connected: they identified some forms of abduction in design processes (March, 1976; Coyne, 1988; Coyne et al., 1990; Roozenburg, 1993) and characterized the variety, and even uniqueness, of forms of abduction in design (Dorst, 2011; Kroll & Koskela, 2016). Following this stream of work, this chapter analyzes how design theory might help uncover some critical properties of abduction and, conversely, how this analysis might also help uncover particular facets of design, namely, the logic of preservative generativity.
More specifically, in recent years, research on design theory has contributed to reconstructing a basic science, design theory, that accounts for the logic of generativity and is comparable in its structure, foundations, and impact to decision theory, optimization, and game theory in their time (Hatchuel et al., 2018). Moreover, design theory developed without relying on the notion of abduction (Ullah et al., 2011). Hence, design theory appears as an interesting scientific analytical framework to analyze the generativity logic of design abduction and, more generally, abduction in science. This analysis leads to two main propositions: (1) descriptions of abduction tend to underestimate the generative potential of abduction, making it a form of “bounded generativity,” and (2) unbounding generativity in abduction would require discussing the relationship between generativity and preservation in the construction of scientific hypotheses.

The chapter unfolds in three parts: (a) it is first shown that the two notions (design theory and abduction) are in fact disentangled, and this disentanglement itself enables using design theory as an “instrument” to better understand generativity in abduction; (b) the authors’ method is described next, consisting of analyzing abduction – formulations and illustrations – through a design theory lens; and (c) the results of this analysis are presented, followed by a discussion of the two propositions (abduction as bounded generativity and unbounded abduction as a balance between generativity and preservation).
Abduction and Design: Disentangling the Two Notions

Abduction as the “Kernel of Design”? The Critical Issue of Generativity

A stream of works in design research has analyzed how abduction can be considered as the “kernel of design” (e.g., Roozenburg, 1993). These authors recognize that abduction, since Peirce, discusses the process of generating and selecting hypotheses to test in science – and is opposed to deduction – whereas design begins with a “desire” that is not satisfied by known artifacts and leads to generating a new artifact. But even if deduction and abduction seem to be two dissimilar types of reasoning, a correspondence can be established as follows (as explained by Roozenburg):
• An existing design can be described, and properties can be inferred by deduction (given are a design description p and a known rule p➔q that connects this description to some property; therefore, the design exhibits property q).
• The design process itself follows a pattern of reasoning that is considered analogous to abductive reasoning (Coyne et al., 1990): it begins with the desired performance (we wish to have q), and the designers rely on some rules (of the form p➔q) that relate shape, material, dimensions, etc. to the performance, to be able to get an artifact with design description p that will exhibit the performance q. This pattern of reasoning (the premises are q and p➔q; the conclusion is p) can be considered abductive and could be found in AI and knowledge-based systems from the 1980s (to perform diagnostic tasks in expert systems) and more specifically in knowledge-based design systems (Coyne, 1988; Coyne et al., 1990) to design artifacts based on existing, known design rules, which is assimilated by Coyne et al. to “cause finding.”
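The two patterns above can be contrasted in a minimal sketch, assuming a fixed base of known rules p➔q encoded as pairs. The rule base, the artifact descriptions, and the function names `deduce` and `abduce` are hypothetical illustrations, not taken from Roozenburg or Coyne et al.

```python
# Illustrative sketch only: each rule is a pair (p, q) read as "p -> q".
# The rule base and artifact descriptions are hypothetical examples.
RULES = [
    ("metal hemisphere on burner", "boils water"),
    ("electric immersion heater", "boils water"),
    ("insulated double wall", "keeps water hot"),
]


def deduce(description, rules):
    """Deduction: from p and p -> q, infer every property q."""
    return [q for p, q in rules if p == description]


def abduce(desired, rules):
    """Abduction with known rules: from q and p -> q, propose candidates p."""
    return [p for p, q in rules if q == desired]


properties = deduce("metal hemisphere on burner", RULES)
candidates = abduce("boils water", RULES)
unknown = abduce("brews coffee", RULES)  # no known rule leads to this q
```

In this sketch `abduce` can only return descriptions that already appear in the rule base, so a desired performance with no matching rule yields an empty list. This is the limitation that the distinction between explanatory and innovative abduction addresses.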
Note that the distinction between deduction and abduction also corresponds to a classical trope in design theory: the distinction between knowledge about existing designs and design of new artifacts. Design theorists have shown that the design of new artifacts is not just deduction or “applied science” but requires specific reasoning to make use of knowledge to design a desirable object – Redtenbacher elaborated his Konstruktionslehre on this distinction (Redtenbacher, 1852; Le Masson & Weil, 2013); in the 1970s, Rodenacker underlined that design could be represented as an “inversion” of the experimental process (Rodenacker, 1970), since, according to Rodenacker, experiment is going from a physical phenomenon to measurement to physical concept, whereas construction is going from function to command signal to the final artifact. Hence in this rough description, abduction appears as a name given to the phenomenon that should be described by design theory: a reasoning that goes from desired performance to a known artifact. It designates the issue but provides limited insight on the reasoning itself. At a more detailed level, some authors have tried to explicate more clearly what could be considered as “the kernel of design”: Roozenburg insists on the fact that design cannot be limited to the use of given rules, and the designer might have to conceive a new rule. Roozenburg refers to the distinction proposed by Habermas between explanatory abduction and innovative abduction (Habermas, 1968); see Fig. 1: in explanatory abduction, the rule (p➔q) is a premise, whereas in innovative abduction, the rule is in the conclusion. Therefore, according to Roozenburg, design is not a combinatoric choice among disposable rules, but rather the kernel of design is the generation of both rules and artifacts. This process of generating new rules and artifacts is called generativity (Eris, 2003; Rogers et al., 2005; Zittrain, 2006). 
Generativity appears as the critical feature of design (Hatchuel et al., 2011a), and abduction, if considered as innovative abduction, would refer to the fact that design theory should try to account for generativity as a rigorous reasoning process. Referring to abduction, Roozenburg implicitly underlines three main requirements that a design theory should satisfy:

• Requirement 1: it should be a reasoning – rational, logical, and, more precisely, based on controllable logic.
• Requirement 2: in a way, design theory should be “more than” deduction.
• Requirement 3: design theory should account for generativity, not only the generation of (previously unknown) artifacts but also the generation of new rules to design these artifacts.

This perspective is reinforced by further works on design and abduction (Dorst, 2011; Kroll & Koskela, 2016; Koskela et al., 2018), where the authors show that a design process can be described as a variety of abductive steps, connected into an exploratory divergent–convergent process.

Fig. 1 Explanatory abduction vs innovative abduction after Habermas (1968) and Roozenburg (1993)
The Issue of Generativity in Abduction

How do works in abduction deal with generativity? It is well known that the distinction between explanatory abduction and innovative abduction corresponds to important debates in research on abduction. Many researchers have underlined that, in his works on abduction, Peirce had a clear ambition to meet the requirements above, but only partially succeeded. Peirce’s definition is well known (Peirce C.P. 5.189, 1903):

“[T]he operation of adopting an explanatory hypothesis, -which is just what abduction is,- [is] subject to certain conditions. Namely the hypothesis cannot be admitted, even as a hypothesis, unless it be supposed that it would account for the facts or some of them. The form of inference therefore is:
The surprising fact, C, is observed
But if A were true, C would be a matter of course,
Hence there is reason to suspect A is true.”
There is a rich exegesis of Peirce’s definition (Frankfurt, 1958; Fann, 1970; Hookway, 1995; Lipton, 2000, 2004; Schurz, 2008; Douven, 2021; McAuliffe, 2015; Roudaut, 2017; Mohammadian, 2019a). Douven (2021) follows Frankfurt (1958) to remark that “this is not an inference leading to any new idea. After all, the new idea, the explanatory hypothesis A, must have occurred to one before one infers that there is reason to suspect that A is true, for A already figures in the second premise.” McAuliffe (2015) brings some nuances and considers that “there is no reason to interpret this passage as evidence that Peirce viewed abduction as a method for adopting a hypothesis as true.” The debates finally show that (a) it is unclear whether Peirce himself in fact limited “abduction” to hypothesis adoption, but (b) over time, large streams of works on abduction have assimilated abduction to hypothesis adoption and, more specifically, to Inference to the Best Explanation (IBE), so that in the Stanford Encyclopedia of Philosophy “Peirce on Abduction” is now a supplement to the “Abduction” article that explicitly identifies abduction with IBE (Douven, 2021).

It is interesting to underline that even in the IBE perspective, the question of hypothesis generation cannot be completely neglected: as explained by Douven (2021), “best” in IBE can hardly be understood in absolute terms since the inference is a choice among conceived hypotheses, and “it is rather implausible to hold that we are this privileged [that we consider all potential explanations],” and we may well be led to believe “the best of a bad lot” (van Fraassen, 1989). Roudaut (2017) gives a nice example of the “bad lot” issue and the question of the capacity to generate hypotheses: one typical example of IBE is the well-known demonstration of Neptune’s existence by Le Verrier, who aimed at explaining an anomaly in Uranus’s trajectory. In an abductive framework, the reasoning can be described as follows:

Fact: an anomaly in Uranus’s trajectory.
Rule: Newton’s theory being considered as true, the existence of a new planet would explain Uranus’s trajectory.
Result: Newton’s equations and considerable computational effort enabled predicting the size and position of the new planet, and this planet was finally observed by Johann Gottfried Galle at the Berlin observatory, working from Le Verrier’s calculations.
As recalled by Roudaut, it is less known that Le Verrier also noticed an anomaly in Mercury’s perihelion; following the same pattern of reasoning, he proposed the existence of another planet, Vulcano. But “Vulcano was not here to be discovered” (Roudaut, 2017, p. 53); the anomaly in Mercury’s perihelion is today explained by general relativity, which was of course unknown to Le Verrier and his contemporaries. Hence, Le Verrier was actually doomed to search in a bad lot. Therefore, be it central or marginal, “hypothesis generation” remains an open question in research on abduction.
Advances in Design Theory: Accounting for Generativity Without Relying on Abduction

In design research over the last decades, design theory was developed to make theoretical propositions that meet the abovementioned requirements. Several research works have contributed to design theory, step by step increasing its capacity to account logically for generativity (Hatchuel et al., 2011a; Le Masson & Weil, 2013). General Design Theory (Reich, 1995; Yoshikawa, 1981; Takeda et al., 1990), later developed in the Coupled Design Process design theory (Braha & Reich, 2003), has extended knowledge-based design system methods by relying on topological approaches. C-K design theory can be considered as an extension of the Simonian approach to account for “expandable rationality” (Hatchuel, 2002). As shown in Hatchuel et al. (2018, p. 5), research works on design theory “have reconstructed historical roots and the evolution of design theory, conceptualized the field at a high level of generality and uncovered theoretical foundations, in particular the logic of generativity.” In particular, C-K design theory helps to draw some lessons on the ontology of design and hence on generativity (Hatchuel et al., 2013, 2018).

In C-K theory (Hatchuel & Weil, 2003, 2009; Le Masson et al., 2017, 2020), design is modeled as an interaction between two spaces: the space of knowledge (K), composed of propositions that all have a logical status (true or false), and the space of concepts (C), where propositions are interpretable but undecidable with respect to the existing propositions in space K. Concepts are of the form “Ci = there exists a (non-empty) class of objects X for which a group of properties p1, p2, …, pn is true in K.” A design starts with a concept C0, a proposition that is undecidable with respect to the initial K space. The theory formalizes how this undecidable proposition becomes a decidable proposition. This is realized by two processes: expansions in K (new propositions are added to K by deduction, learning, experimentation, remodelling, etc.), which can continue until a decidable definition for the initial concept is obtained in K* (the expanded K space), and partitions in C (attributes known in K space can be added to the concept to promote its decidability).
Partitions are called restrictive when they rely on properties usually associated with the object X in K; partitions are called expansive when they rely on properties that are not normally associated with the object X in K. Figure 2 is a diagram summarizing the C-K design theory. Figure 3 is a very simple example to illustrate the different C-K notions. For real case examples, see, for instance, Le Masson et al. (2017).

Fig. 2 Diagram summarizing the C-K design theory (Le Masson et al., 2017, p. 140). There are four main operators: K➔K = classical deduction, inference, modeling, optimizing actions; K➔C = disjunction, from the known to the unknown; C➔C = refinement, control of partitions; C➔K = conjunction

Fig. 3 A very simple case to illustrate the main notions of the C-K design theory. (After Le Masson et al., 2017, p. 137)

It has been shown that C-K theory cannot be assimilated to one simple abduction (Ullah et al., 2011). Still, C-K design theory meets the three main requirements listed earlier:

• Requirement 1: It is a rational, logical process. In particular, it has been shown that C-K theory can be seen as the interaction between two logics, an intuitionistic logic (in C) and a classical logic (in K) (Kazakçi, 2013). It is known today that these two logics can interact, for instance, in a topos structure, where sheafification corresponds to the mathematical transformation of a structure with an intuitionistic logic (the presheaf) into a structure with a classical logic (the associated sheaf) (Prouté, 2016).
• Requirement 2: It is more than deduction. In C-K theory, deduction is only one of the K➔K operators (statistical inference is another).
• Requirement 3: It accounts for strong generativity. Specifically, it has been shown that there is deep correspondence between C-K design theory and mathematical models of generativity, such as field extension (mathematical construction of new fields from existing ones (Kokshagina et al., 2013)), forcing (mathematical construction of new models of sets with “interesting properties” (Cohen, 1963, 2002)), or topos sheafification (mathematical construction of sheaves from presheaves in a topos (Mac Lane & Moerdijk, 1992; Hatchuel et al., 2019)). More generally, it is possible to account for various generativity regimes, designated C-K/K*, depending on the structure K* imposed on K: if K* is a set-theoretic structure, then C-K/set has the generativity of forcing; if K* is a toposic structure, then C-K/topos has the generativity of sheafification.

Meeting these requirements facilitates addressing critical issues related to generativity in design: What is the quality of the generativity? Are there biases in generativity? Is it possible to tune generativity? Is it possible to improve generativity? Can one teach how to overcome impediments to generativity? Advances in design theory have paved the way to a large research program on these questions, traversing many research fields, and are also very relevant for practitioners (see syntheses of some results in Agogué and Kazakçi (2014); Hatchuel et al. (2011b, 2015, 2018); Agogué et al. (2014); Le Masson et al. (2017)).
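The C-K notions recalled above (a knowledge space K, a concept of the form “there exists an X with properties p1, …, pn,” and restrictive versus expansive partitions) can be made concrete with a small data-structure sketch. The encoding is an assumption made for illustration: the class names, the dictionary of “usual” properties, and the chair example are hypothetical and are not part of C-K theory’s formal apparatus.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeSpace:
    """Space K, reduced here to the properties usually associated with
    each kind of object (a hypothetical, highly simplified encoding)."""
    usual_properties: dict = field(default_factory=dict)

    def expand(self, obj, prop):
        """K -> K expansion: learn that `prop` is associated with `obj`."""
        self.usual_properties.setdefault(obj, set()).add(prop)


@dataclass(frozen=True)
class Concept:
    """Space C: 'there exists an X with properties p1..pn'."""
    obj: str
    properties: tuple = ()

    def partition(self, prop):
        """C -> C: refine the concept by adding one attribute."""
        return Concept(self.obj, self.properties + (prop,))


def partition_kind(k, concept, prop):
    """Restrictive if K usually associates `prop` with the object,
    expansive otherwise."""
    usual = k.usual_properties.get(concept.obj, set())
    return "restrictive" if prop in usual else "expansive"


k = KnowledgeSpace({"chair": {"has legs"}})      # hypothetical initial K
c0 = Concept("chair")                            # hypothetical initial concept
kind_usual = partition_kind(k, c0, "has legs")
kind_new = partition_kind(k, c0, "has no legs")
c1 = c0.partition("has no legs")
```

The sketch also shows that the status of a partition depends on the current state of K: once `k.expand("chair", "has no legs")` has been applied, the same attribute would be classified as restrictive.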
Disentangling Design Theory and Abduction Opens Avenue for Research

So far, some critical results on design theory and abduction have been recalled:

(a) Design theory and abduction developed in two different, parallel streams so that the two notions can be disentangled.
(b) Still, authors have noticed that there is innovative abduction (in the sense of hypothesis generation and adoption) in design.
(c) Design theory today proposes models of generativity, generally applied to the so-called “desirable unknown”.
(d) Hypothesis generation remains an open issue in abduction.

Hence, design theory appears as an interesting scientific analytical framework to analyze the generativity logic of design abduction and (innovative) abduction in science. This paves the way to a research program: can design theory be applied to abduction to help uncover some facets of hypothesis generation? One can easily figure out possible outcomes. One might better qualify what exactly “innovative abduction” is and better address questions such as the following: Is there good/bad “innovative abduction?” What is a rigorous, reliable “innovative abduction,” and can one control the “quality” of innovative abduction, the control of the quality and rigor of reasoning being a critical issue in scientific methods? Can one help improve innovative abduction, and can one train scientists to carry out better innovative abductions? And there might also be interesting results for design itself, since applying design theory to the field of hypothesis generation might help uncover specific forms of design.
In the remainder of this chapter, an illustration of this research program will be presented: abduction is analyzed through the lens of design theory, with two strong restrictions: (a) the authors choose to rely on C-K design theory (other investigations could be made with additional formulations of design theory), and (b) the authors choose a couple of very specific formulations of abduction (the variety of definitions and models of abduction are not addressed). These specific formulations of abduction are hence “cast” into C-K design theory, or, to use another metaphor, these abduction formulations are analyzed in light of design theory. It is expected from this “casting” (or this “lighting”) to learn about (innovative) abduction, to wit:

• What is the unknown in (innovative) abduction?
• Can one evaluate the quality of the generation process in (innovative) abduction?
• Are there specific features in the generativity logic of (innovative) abduction?
Abduction Through the Lens of C-K Design Theory: Bounded and Preservative Generativities in Science

Method: Analyze Abduction with Design Theory

An analysis of abduction formulations with C-K design theory is conducted. Note that many papers have already used C-K design theory as an analytical framework for generativity cases; see, for instance, Reich et al. (2012) and Kroll et al. (2014). The authors therefore follow here an established procedure. The method unfolds as follows:

(a) Choice of the abduction formulations to be used as “object of analysis”: Two formulations are used, one related to explanatory abduction and the other related to (design) abduction and associated with innovative abduction. These formulations can be considered as a reference in their respective fields: the first one appears in the Abduction entry of the 2021 revision of the Stanford Encyclopedia of Philosophy (Douven, 2021); the second one is given by Kroll and Koskela (2016, p. 130) and is a synthetic reformulation of design abduction as proposed by Roozenburg (1993) and largely diffused and reused since then. It may sound strange to rely on a formulation deeply related to explanatory abduction to analyze the generativity logic; however, this choice is justified because (i) works on innovative abduction actually consider that the generation will be followed by an explanatory abduction (e.g., Schurz, 2008); hence, the formulation gives us the “final situations” targeted by an innovative abduction; (ii) it has already been noted that even in IBE, the issue of generativity cannot be neglected; and (iii) the formulation is general and formal enough to be a good starting point for C-K analysis.

The explanatory abduction definition is given by Douven (2021). It is actually the third and last formulation in a series and is considered to take into account the limits of the first two:

Given evidence E and candidate explanations H1, …, Hn, if Hi explains E better than any of the other hypotheses, infer that Hi is closer to the truth than any of the other hypotheses. (ABD3 in Douven, 2021)
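The selection step in ABD3 can be sketched as follows, under the assumption that “explains E better” is available as a numerical score. The function `abd3`, the scoring function `explains`, and the scores attached to the two candidate explanations are hypothetical and serve only to illustrate the structure of the inference.

```python
def abd3(evidence, hypotheses, explains):
    """Return the hypothesis that explains the evidence strictly better
    than every other candidate, or None if there is no such hypothesis.

    `explains(h, e)` is an assumed scoring function (higher is better).
    """
    scores = {h: explains(h, evidence) for h in hypotheses}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    rivals = [s for h, s in scores.items() if h != best]
    if all(scores[best] > s for s in rivals):
        return best
    return None


# Hypothetical scores for two candidate explanations of one anomaly.
SCORES = {"unseen inner planet": 0.6, "observational error": 0.2}


def explains(hypothesis, evidence):
    return SCORES[hypothesis]


choice = abd3("anomaly in Mercury's perihelion", list(SCORES), explains)
```

The function can only return one of the candidates it is given. If the list of hypotheses is a “bad lot,” as in the Mercury case recalled earlier, the result is the best of that lot, and nothing in the procedure generates a new hypothesis.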
The second formulation (design abduction) is given with a hypothetical example of designing the first ever kettle; the general formulation and the example can be synthesized as follows: “given the function q (e.g., boil water), ‘discover’ the rule ‘IF form + way of use THEN function,’ p➔q (e.g., IF hemisphere and metal + fill water and place on burner THEN boil water), and immediately get the second conclusion p, which is a solution to the design problem (e.g., hemisphere and metal + fill water and place on burner)” (from Kroll and Koskela (2016) and Roozenburg (1993)). Note that Roozenburg’s formulation is used here although Kroll and Koskela eventually proposed a modified, two-step formulation of this design process.

(b) Analytical framework: In practice, an analysis based on C-K design theory consists of answering well-defined questions: What is the concept at the outset of the generation process (i.e., what is C0, why is it a concept, what is unknown, etc.)? What is the knowledge available at the start of the process? What are the knowledge expansions in the generation process? What are the partitions applied to the initial concept?
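The logical shape of the design abduction formulation quoted in (a) (only q is given; the rule p➔q and the solution p are both conclusions) can be sketched as below. The sketch rests on strong assumptions: candidate forms and ways of use are drawn from fixed lists, and `trial` is a stand-in for an experiment or simulation. All names and values are hypothetical.

```python
from itertools import product

# Hypothetical candidate design parameters.
FORMS = ["paper cone", "metal hemisphere"]
USES = ["fill with water and leave on table",
        "fill with water and place on burner"]


def trial(form, use):
    """Stand-in for testing an artifact; purely illustrative."""
    heated = use.endswith("place on burner")
    if form == "metal hemisphere" and heated:
        return "boil water"
    return None


def design_abduction(q, forms, uses, trial):
    """Given only the function q, return (rule, p): a newly established
    rule p -> q and the description p (form + way of use), or None."""
    for p in product(forms, uses):
        if trial(*p) == q:
            rule = (p, q)
            return rule, p
    return None


result = design_abduction("boil water", FORMS, USES, trial)
```

Because the sketch enumerates predefined lists, it captures only the pattern of premises and conclusions. It does not model the generation of genuinely new design parameters or rules.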
Analysis

One Formulation of Explanatory Abduction Analyzed with C-K Framework

Douven’s formulation of explanatory abduction is analyzed (see Fig. 4):

1. Evidence E as well as hypotheses Hi are in K (known, given).
2. The abduction process consists of finding the “best explanation” (of E by a hypothesis H) among the given hypotheses that are all explanations. The desired unknown is hence Cini = an “explained evidence” that is the best, a pair (E, H) where H “explains” E better than all other given hypotheses.
3. To formulate this concept, one therefore needs to have in K:
   • A model of “explanation,” i.e., the (acceptable) implication(s) to go from one Hi to E. This means that the hypothesis Hi is composed of possibly complex implications that actually make E an (acceptable) consequence of Hi.
   • A value function V associated with “better explanation”: each pair of “explained evidence” can be evaluated and the values compared to identify an optimum. This value function is known to be a complicated issue (see Peirce’s considerations on economic evaluation, as discussed by Mohammadian (2019a) and McAuliffe (2015)). Note that a single-dimension value function is in itself a restrictive evaluation function, since it impedes multi-criteria evaluation.
4. Based on known implication rules and a known value function, the design consists of computing the value of each “explained evidence” and choosing the one that maximizes the value (see the design path on the left-hand side of Fig. 4, linking white-colored boxes).
Fig. 4 C-K analysis of Douven’s abduction formulation. The dark-shaded boxes (with text in white) refer to the knowledge expansions and design partitions that are “blocked” by the definition of explanatory abduction but could be opened in an innovative abduction perspective. On the left-hand side is the design path imposed by the definition
5. The C-K framework helps understand the following points:
   • What is supposed to be known in this formulation: not only E and Hi, but also the implications that make it possible to relate each Hi to E and the value function V with some restrictive property.
   • The design paths (in C): there is a very clear, simple design path imposed by the formulation. The C-K framework makes visible the paths that are “blocked” (impeded) by the formulation; these design paths are therefore the ones that could be re-opened by an (innovative) abduction. From a design perspective, the concept “an explanation for E” could actually lead to several partitions:
      • Obviously, a new hypothesis Hn+1.
      • But also new implication rules (associated even with known hypotheses!).
      • And new value functions associated with what is a “good explanation” (e.g., multi-criteria evaluation functions).
      • Moreover, one could consider that the evidence itself is partially unknown and could require new investigations, new analyses, to get a more “interesting” fit with the hypotheses. “Interesting” could mean “better positive fit,” but in this partition path, one could also find the situation where one deepens the investigations on E because E is an anomaly that shows that all given Hi are false! The case of Röntgen’s publications on X-rays (proving that the X-ray is not one of the radiation types known at Röntgen’s time) would fall in this path (Röntgen, 1895).
   • The knowledge expansions (in K): these are also blocked by the formulation but could be considered in the perspective of innovative abduction:
      • Of course, all learning on hypotheses, value function, implication rules, and evidence.
      • Interestingly enough, this learning can be considered in relation to the explanatory, closed abduction process itself: in explanatory abduction, the reasoner is supposed to calculate the value of each “explained evidence,” so he or she produces knowledge at this stage. This knowledge can then be used for selecting the best explanation (as mentioned in the explanatory abduction formulation), but it can also be used to push exploration in other directions: What if the value of all the “explanations” appears low? Perhaps this would push to generate new hypotheses? Or maybe this will push to change the value function? Or to revise/deepen the implication rules? Or to reanalyze the evidence itself, possibly to consider it as an anomaly for hypotheses Hi (i = 1, …, n)!
      • Clearly, this new knowledge would then push toward a new design of (Ex, Hx). This process, where tests, prototypes, and proofs of concept lead to new expansions, is illustrated and described in detail by Kroll et al. (2014) and Jobin et al. (2021).
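The idea that the values computed during explanatory abduction can either close the process (select the best explanation) or reopen it (when all values appear low) can be sketched as below. The acceptance threshold is an assumption introduced for the illustration and does not appear in the formulations analyzed above; the function name and the values are likewise hypothetical.

```python
def evaluate_and_steer(evidence, hypotheses, value, threshold):
    """Score every (E, H) pair for a non-empty list of hypotheses.

    Returns ("select", best, scores) when the best value reaches the
    assumed threshold, and ("reopen", None, scores) otherwise. "Reopen"
    stands for any of the blocked paths: generating new hypotheses,
    revising the value function or the implication rules, or
    re-analyzing the evidence.
    """
    scores = {h: value(evidence, h) for h in hypotheses}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return ("select", best, scores)
    return ("reopen", None, scores)


# Hypothetical values of two "explained evidence" pairs.
VALUES = {"H1": 0.3, "H2": 0.4}


def value(evidence, hypothesis):
    return VALUES[hypothesis]


closed = evaluate_and_steer("E", ["H1", "H2"], value, threshold=0.35)
reopened = evaluate_and_steer("E", ["H1", "H2"], value, threshold=0.7)
```

In both calls the same scores are produced. Only the use made of that knowledge differs, which is the point made in the last bullets above.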
One Formulation of Design Abduction Analyzed with C-K Framework
A similar analysis is conducted of the formulation of innovative design abduction by Roozenburg (1993), also studied by Kroll and Koskela (2016) (see Fig. 5):
1. In K is the expected “function” as a list of functional requirements {FRs}.
2. The abduction process consists of designing an artifact that satisfies the function(s). The artifact is made with “design parameters” such as a form (but also matter, components, etc., i.e., means that are available to the designers) and a “way of use” (which is also a design parameter from the point of view of the user designing the artifact’s usage). The desired unknown is hence an artifact that meets the requirements.
3. To formulate this concept, one therefore needs to have in K a value function (or a test) associated with “being an artifact”: each pair ({DPs};{FRs}) will be tested to check that it is a valid artifact; valid here means, for instance, feasible, acceptable, usable, legal, or even profitable, marketable, etc.
4. Following the formulation: based on fixed desired functionalities {FRs} and with a known value function V, the design consists of:
• Finding design parameters (either present in K or learnt/invented during the design process)
• Finding the rules that relate these DPs to the expected FRs (either based on available knowledge or based on newly created knowledge)
• In order to then design a pair ({DPs};{FRs}) that can be evaluated (tested) with the V function.
5. The C-K framework helps understand the following points:
E. Kroll et al.
Fig. 5 C-K analysis of Roozenburg’s and Kroll and Koskela’s design abduction. FR stands for functional requirement; DP stands for design parameter. The light-shaded boxes (text in black) refer to the knowledge expansions and design partitions associated with the formulation. The dark-shaded boxes (text in white) refer to the knowledge expansions and design partitions that are rather blocked by the example
• What is supposed to be known in this formulation: not only the artifact’s function(s) but also the value function that makes it possible to validate the artifact.
• The design paths (in C): the definition of design abduction clearly identifies the paths related to the exploration of design parameters and of the rules that relate these parameters to the function(s). Note that C-K analysis helps understand at least two additional paths:
– New value functions associated with what is “an artifact.” Some tests might be discovered in the design process: during the design process, the criteria for feasibility, testability, marketability, etc. might be revised. For instance, one could discover new testing and simulation techniques or new suppliers, etc.
– One could also discover new functions. New expectations might be revealed, and new stakeholders might be discovered during the design process. Hence, the value function itself could be designed.
• The knowledge expansions (in K): the formulation explicitly implies learning on DPs and on the implication rules that lead from DPs to FRs. Note that this “learning” can mean a large variety of efforts, from identifying an existing, already known DP to discovering a new means of action. Moreover, the definition might also lead to learning on other dimensions, such as learning on function(s) and learning on value (test).
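The design abduction formulation analyzed above, with fixed {FRs} and a known value function V, can be sketched as a small search. This is a minimal sketch under stated assumptions, not Roozenburg’s or Kroll and Koskela’s procedure: the function name `design_abduction`, the `rules` dictionary mapping each DP to the FRs it achieves, the `is_valid` test standing in for V, the `max_size` bound, and the example data are all hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Optional, Set, Tuple


def design_abduction(
    frs: Set[str],
    dps: List[str],
    rules: Dict[str, Set[str]],
    is_valid: Callable[[Tuple[str, ...], Set[str]], bool],
    max_size: int = 3,
) -> Optional[Tuple[str, ...]]:
    """Return the first set of DPs that covers all FRs and passes the test V.

    `rules[dp]` lists the FRs that a DP is known to achieve; `is_valid`
    is a hypothetical stand-in for the value function V.
    """
    for size in range(1, max_size + 1):
        for combo in combinations(dps, size):
            # Apply the known DP -> FR rules to see what this combination achieves.
            achieved: Set[str] = set()
            for dp in combo:
                achieved |= rules.get(dp, set())
            # The pair ({DPs};{FRs}) must cover the FRs and be a valid artifact.
            if frs <= achieved and is_valid(combo, frs):
                return combo
    return None


# Toy example, invented for illustration only.
frs = {"cut paper", "carry in pocket"}
dps = ["blade", "folding handle", "motor"]
rules = {
    "blade": {"cut paper"},
    "folding handle": {"carry in pocket"},
    "motor": {"cut paper"},
}
artifact = design_abduction(frs, dps, rules, is_valid=lambda combo, _: "motor" not in combo)
print(artifact)
```

In this sketch, `frs` and `is_valid` are fixed inputs, which is exactly the boundedness the C-K analysis points out: the two additional paths (new value functions, new functions) would correspond to revising those inputs during the design process.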
The analysis of abduction formulations with the C-K framework thus uncovers the knowledge conditions needed to activate such a design abduction and also reveals the directions opened for generation. Note that these design directions are in fact broader (or at least more detailed) than the intuitive expectations of design abduction: innovative abduction is not just about generating design parameters and their associated rules; design abduction can also be generative in functions and in tests.
Results and Discussion
The analysis of the two formulations related to abduction leads to three main results:
Result 1: The Unknowns in Abduction – Why Scientific Concepts Are More than “Hypotheses”
It was expected that (innovative) abduction would generate new hypotheses. The detailed analyses of the generativity processes associated with abduction formulations enrich this perspective in several respects:
Result 1.a: the unknown found in the analyses is not limited to the hypothesis; it also relates to the “value function” that helps evaluate whether a hypothesis is good and better than another one. This refers to the methods, instruments, observation techniques, and proof techniques that are accepted by current practices and epistemologies in a scientific community! And it is known that these epistemological instruments evolve over time (see how certain research communities have slowly accepted statistics and simulation as proof techniques).
Result 1.b: the unknown is related not only to the hypothesis and its evaluation but also to the evidence itself! Enriching the observations, the measures, the manipulations, and the experimentations on/with the evidence is actually part of the generativity process, with many smart reasoning steps to truly design an interesting “explained evidence.” In particular, the demonstration of an anomaly is an interesting case of generativity in science, where one must design new observations to prove that a piece of evidence does not fit with existing hypotheses.
Result 1.c: the hypothesis itself is a source of unknown that might have been underestimated.
The analyses above show that the concept associated with “innovative abduction” is not exactly “a new hypothesis” but more precisely “a hypothesis that better explains the evidence.” In C-K terms, one should distinguish between two cases: a hypothesis in K, which comes with a set of rules that relate H to E and can be evaluated by V, and a hypothesis in C, which may be partially unknown in the sense that, as a concept, it may lack some of the rules needed to relate H to E or to evaluate H. This formal nuance has clear consequences: (i) contrary to intuition, it is not so self-evident to design such a hypothesis “that better explains the
evidence”; this analysis is fully consistent with Douven’s argument that “even if there is an infinity of hypotheses that account for a given fact, there may still be only a handful that could be said to give a satisfactory explanation of it” (Douven, 2021); (ii) but one cannot go as far as saying that available (known) hypotheses are necessarily the best ones, which would mean that there is no issue with “bad lot” and hypothesis generation: it rather means that a hypothesis requires careful design! The design of a hypothesis is actually very hard work since it also requires designing how the concept of hypothesis relates to the evidence (the rules, i.e., one or several theoretical constructions, previously explained evidence, etc.) and how the “explained evidence” E will be evaluated.
This analysis makes it possible to discuss Douven’s argument against the “bad lot” issue (Douven, 2021). Douven builds on Schupbach (2014): “given the hypotheses H_i, i = 1, …, n we have managed to come up with, we can always generate a set of hypotheses which jointly exhaust logical space. Suppose H_1, …, H_n are the candidate explanations we have so far been able to conceive. Then simply define H_{n+1} = ¬H_1 ∧ ¬H_2 ∧ … ∧ ¬H_n and add this new hypothesis as a further candidate explanation to the ones we already have. Obviously, the set H_1, …, H_{n+1} is exhaustive.” Douven then goes further: he notices that H_{n+1} is hardly informative, and it will not even be clear what its empirical consequences are. He gives the following example: suppose H_1 = Special Relativity Theory; H_2 = Lorentz’ version of aether theory; then H_3 is “neither of these two theories is true. But surely this further hypothesis will be ranked quite low qua explanation [ . . . ] and it is fully unclear what its empirical consequences are.” Hence, H_{n+1} is not an interesting hypothesis, and there is no real issue with “bad lot.”
Based on the analysis of hypothesis generation, the reasoning above can be slightly modified: the proposition “¬H_1 ∧ ¬H_2 ∧ … ∧ ¬H_n” is actually not a hypothesis, precisely because it is unclear how it will relate to the evidence (what are the associated rules) and how it will be evaluated. (Technically, this means that the collection of hypotheses is generally not closed under negation; such closure would require much more severe mathematical conditions.) It could rather be formulated, in C-K theory, as a concept of hypothesis (C = “there exists a hypothesis H_{n+1} that explains E better than all other H_1, …, H_n”) that is still largely unknown and hence requires further partition and knowledge creation to actually design one (or probably several!) “well-constructed” hypotheses, i.e., with well-identified rules to relate to E and a well-identified value V(E, H_{n+1}).
Finally, results 1.a, 1.b, and 1.c provide a rich representation of the “unknowns” associated with generativity in science. They show that scientific concepts are much more than hypotheses! They also help account for the variety of scientific “results”: a contribution to scientific progress is of course not limited to the proof of the fit between a hypothesis and a piece of evidence, and design theory applied to “abduction” helps enrich our understanding of this variety.
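The exhaustiveness claim in the quoted passage, and the reason it does not by itself yield a usable hypothesis, can be written out explicitly. The following short derivation uses only De Morgan’s law and the law of excluded middle; it is offered as an illustration and is not taken from the sources quoted above.

```latex
\begin{align*}
H_{n+1} &:= \neg H_1 \wedge \neg H_2 \wedge \dots \wedge \neg H_n
         \;\equiv\; \neg\,(H_1 \vee H_2 \vee \dots \vee H_n), \\
H_1 \vee \dots \vee H_n \vee H_{n+1}
        &\;\equiv\; (H_1 \vee \dots \vee H_n) \,\vee\, \neg\,(H_1 \vee \dots \vee H_n)
         \;\equiv\; \top .
\end{align*}
```

The set is therefore exhaustive as a matter of propositional logic alone. What the construction does not supply is any rule relating H_{n+1} to E, or a value V(E, H_{n+1}), which is why the chapter treats it as a concept in C and not as a hypothesis in K.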
Result 2: Abduction as “Bounded Generativity”
Based on decision theory, Simon showed that decision-making processes in organizations or by individuals are actually bounded: people and organizations do not make the “optimal” decision, as defined by decision theory, but only a “satisficing” one (Simon, 1955, 1957). Similarly, based on design theory, which establishes a reference for generativity processes, abduction might be considered a form of bounded generativity in the sense that it claims forms of generativity but in fact tends to underestimate the large set of generativity paths that design theory can formally associate with innovative abduction, e.g., hypothesis generation. More specifically, casting abduction in design theory led to identifying the following limits and biases:
Result 2.a: abduction appears as bounded when it comes to the value function (test, evaluation, etc.) and to the evidence (see the example of explanatory abduction in the previous section) or the functional requirements (see the example of design abduction, also in the previous section).
Result 2.b: abduction is also bounded when it comes to the learning that takes place at the evaluation stage: one seems to favor a form of selection (“adopt” or “select” the best hypothesis, validate an artifact), whereas this evaluation itself will produce knowledge that could be reused for further design.
Result 2.c: abduction is also bounded by the reasoning process itself: accounts of abduction tend to treat hypothesis generation as an “emergence,” whereas design theory clarifies that hypothesis generation is actually a complex design process that might involve multiple steps, such as characterizing the unknown to be addressed (i.e., formulating a concept of hypothesis), learning from tests and evaluations, elaborating on the evidence, learning on the rules that could help relate a hypothesis to the evidence, etc. Abduction simplifies and reduces this complex process of knowledge creation and concept partition.
This result is consistent with in-depth studies of abduction and the design process, which have already shown that several connected abductions are needed to account for a design process (Kroll & Koskela, 2016; Dorst, 2011). Result 2.c in particular underlines that hypothesis generation in science cannot be reduced to abduction: among the complex steps that design theory identifies in hypothesis generation, one finds, in particular, regular deduction! Deduction clearly appears as an operator that is required to formulate and check the connections of rules that relate a hypothesis to the evidence. Neglecting deduction as an instrument for hypothesis generation is an example of how abduction can be a bounded generation. Design theory shows that “abduction,” in the broad sense of accounting for “hypothesis generation,” should in fact contain deduction, which explains why it is not possible to construct abduction as the reversal of deduction.
Result 3: Toward Unbounded Abduction – Facing the Issue of Preservative Generation
Analyzing abduction in light of design theory shows that improved learning processes (in K) and more rigorous concept partitions (in C) would lead to a more systematic generation of hypotheses. Hence, a design theory-based abduction (in the sense of design theory-based hypothesis generation) could be an unbounded generativity (or at least a less bounded one). Building on results 1 and 2, this unbounded generativity would include a large variety of “unknowns” (see result 1) and, more specifically, would more rigorously address concepts of hypotheses (see result 1.c); it would also overcome the limits of bounded abduction (see result 2) and, more specifically, include some forms of deduction in the hypothesis generation reasoning (see result 2.c). Consequently, hypothesis generation would become more systematic. This leads to new issues and criteria to be added to the hypothesis generation process:
Issue 1 of hypothesis generation: from the perspective of repeated scientific activity, one would wonder how a newly generated hypothesis will help not only to explain given evidence but also to support scientific generativity in the future! Hence, the criteria for evaluating a hypothesis would not be limited to its capacity to explain given evidence well (or best) but would also include its capacity to help generate other hypotheses and evidence! This is consistent with Poincaré’s claim: “we choose this geometry (i.e., this theoretical framework) not because it is more true but because it is more convenient” (Poincaré, 1898, p. 63); see also Mohammadian (2019b).
Issue 2 of hypothesis generation: if hypothesis generation intensifies, then so does the need to control this generativity.
The generation of new rules will necessarily raise the issue of whether this generativity will disturb well-established rules and “explained evidence.” Design theory has long noted that expansions and the emergence of new pieces of knowledge might require a so-called knowledge re-ordering (Hatchuel et al., 2013; Brun et al., 2016). In the case of scientific constructions that aim at global coherence and unification, this re-ordering effort might become critical. Consequently, the newly generated hypothesis might be required to limit costly re-ordering and to preserve as much as possible the previously established results, or, in Peirce’s words, to preserve the “consistency with well-confirmed beliefs.” This is well known in scientific production, where the greatest breakthroughs also relied on a preservation logic: Einstein’s relativity theory, for example, preserved critical equations in physics, including Newton’s at low speed (Damour, 2005; Einstein, 2011). As an outcome, unbounded abduction would actually require models of preservative generativity. Recent advances in design theory make it possible to model the logics of creation heritage by injecting topos theory into C-K design theory (Hatchuel et al., 2019). These works might be useful to deepen the understanding of preservative
generativity in science. The study of creation heritage with C-K/topos has prompted the exploration of new facets of design theory: injecting a topos structure into the K-space of C-K theory reveals that tradition preservation and innovation are not doomed to produce poor trade-offs but actually correspond to deep generative processes (corresponding mathematically to sheafification), where innovation occurs within tradition and tradition is inventively preserved in the generativity process. It describes forms of preservative creation, proving that innovation is not necessarily a creative destruction.
Conclusion: Design Theory to Unbound Generativity of Abduction?
In this chapter, the authors showed how design theory can contribute to Peirce’s historical program to better model the logic of scientific knowledge creation. One could consider that for Peirce, abduction was more an unfinished program than a result: abduction was the name for the project to rationally (logically) account for generativity in science and in other fields where similar generativity would occur. More than one century later, advances in research on the logics of generativity (in mathematics, engineering design, cognition, etc.) have enabled quantum leaps in design theory, where recent advanced formulations such as C-K theory finally meet Peirce’s requirements: design theory is a model of creative reasoning that accounts for generativity, goes beyond deduction, and is logically grounded. Design theory developed without referring to abduction, but despite that (or, maybe, because of that), researchers have considered that studying design abduction could be fruitful not only to better understand design but also to better understand abduction itself. In this chapter, this logic was extended by applying design theory, formulated as C-K design theory, to analyze the generativity logics in two abduction formulations. This exercise led to three main results:
Result no. 1: it showed that abduction, seen as a logic of hypothesis generation, in fact addresses many unknowns with a strong generativity potential. In particular, it showed that the relevant unknowns lie not in the hypotheses themselves but rather in the concepts of hypotheses, which require substantial design work to become testable explanations of the evidence.
Result no. 2: it also uncovered that even if abduction might explore these multiple unknowns, definitions of abduction tend to explore the full range of unknowns only very partially, so that abduction is a form of bounded generativity.
Result no. 3: finally, the exercise showed that an abduction based more explicitly on design theory would overcome this bounded generativity, which leads one to consider how this “unbounded” abduction could be a preservative generativity that rigorously combines the creation logic of scientific discovery and the cumulative preservative logic of robust, reliable scientific knowledge.
References
Agogué, M., & Kazakçi, A. (2014). 10 years of C-K theory: a survey on the academic and industrial impacts of a design theory. In A. Chakrabarti & L. Blessing (Eds.), An anthology of theories and models of design. Philosophy, approaches and empirical explorations (pp. 219–235). https://doi.org/10.1007/978-1-4471-6338-1
Agogué, M., Hooge, S., Arnoux, F., & Brown, I. (2014). An introduction to innovative design – Elements and applications of C-K theory. Sciences de la Conception. Presses de l’Ecole des Mines.
Braha, D., & Reich, Y. (2003). Topological structures for modelling engineering design processes. Research in Engineering Design, 14(4), 185–199.
Brun, J., Le Masson, P., & Weil, B. (2016). Designing with sketches: The generative effects of knowledge preordering. Design Science, 2, E13. https://doi.org/10.1017/dsj.2016.13
Cohen, P. (1963). The independence of the continuum hypothesis. Proceedings of the National Academy of Sciences, 50, 1143–1148.
Cohen, P. (2002). The discovery of forcing. Rocky Mountain Journal of Mathematics, 32(4), 1071–1100.
Coyne, R. (1988). Logic models of design. Pitman.
Coyne, R. D., Rosenman, M. A., Radford, A. D., Balachandran, M., & Gero, J. S. (1990). Knowledge-based design systems. Addison Wesley.
Damour, T. (2005). Einstein (1905–1955): son approche de la physique. Séminaire Poincaré, VII, 1–25.
Dorst, K. (2011). The core of ‘design thinking’ and its application. Design Studies, 32(6), 521–532.
Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (revised Summer 2021, first edition 2011).
Einstein, A. (2011). Letters to Solovine (1906–1955). Open Road / Philosophical Library.
Eris, O. (2003). Asking generative questions: a fundamental cognitive mechanism in design thinking. In International conference on engineering design, ICED’03, Stockholm.
Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Frankfurt, H. G. (1958). Peirce’s notion of abduction. The Journal of Philosophy, 55(14), 593–597.
Habermas, J. (1968). Erkenntnis und Interesse (English translation: Knowledge and human interests, Heinemann, London, 1978, 2nd edition). Suhrkamp.
Hatchuel, A. (2002). Towards design theory and expandable rationality: the unfinished program of Herbert Simon. Journal of Management and Governance, 5(3–4), 260–273.
Hatchuel, A., & Weil, B. (2003). A new approach to innovative design: an introduction to C-K theory. In ICED’03, August 2003, Stockholm, Sweden, p. 14.
Hatchuel, A., & Weil, B. (2009). C-K design theory: An advanced formulation. Research in Engineering Design, 19(4), 181–192.
Hatchuel, A., Le Masson, P., Reich, Y., & Weil, B. (2011a). A systematic approach of design theories using generativeness and robustness. In International conference on engineering design, ICED’11, Copenhagen, Technical University of Denmark, p. 12.
Hatchuel, A., Le Masson, P., & Weil, B. (2011b). Teaching innovative design reasoning: How C-K theory can help to overcome fixation effect. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 25(1), 77–92.
Hatchuel, A., Weil, B., & Le Masson, P. (2013). Towards an ontology of design: Lessons from C-K design theory and forcing. Research in Engineering Design, 24(2), 147–163.
Hatchuel, A., Le Masson, P., Weil, B., Agogué, M., Kazakçi, A. O., & Hooge, S. (2015). Multiple forms of applications and impacts of a design theory – Ten years of industrial applications of C-K theory. In A. Chakrabarti & U. Lindemann (Eds.), Impact of design research on industrial practice – Tools, technology, and training (pp. 189–209). Springer.
Hatchuel, A., Le Masson, P., Reich, Y., & Subrahmanian, E. (2018). Design theory: A foundation of a new paradigm for design science and engineering. Research in Engineering Design, 29, 5–21.
Hatchuel, A., Le Masson, P., Weil, B., & Carvajal-Perez, D. (2019). Innovative design within tradition – Injecting Topos structures in C-K theory to model culinary creation heritage (reviewers’ favourite award). Proceedings of the Design Society: International Conference on Engineering Design, 1(1), 1543–1552.
Heymann, M. (2005). “Kunst” und Wissenschaft in der Technik des 20. Jahrhunderts. Zur Geschichte der Konstruktionswissenschaft. Chronos Verlag.
Hookway, C. (1995). Abduction. In T. Honderich (Ed.), The Oxford companion to philosophy. Oxford University Press.
Jobin, C., Hooge, S., & Le Masson, P. (2021). The logics of double proof in proof of concept: a Design-theory based model of experimentation in the unknown (reviewer’s favourite award). Proceedings of the Design Society, 1, 3051–3060.
Kazakçi, A. O. (2013). On the imaginative constructivist nature of design: A theoretical approach. Research in Engineering Design, 24(2), 127–145.
Kokshagina, O., Le Masson, P., & Weil, B. (2013). How design theories enable the design of generic technologies: Notion of generic concepts and genericity building operators. Paper presented at the International Conference on Engineering Design, ICED’13, Seoul, Korea.
König, W. (1999). Künstler und Strichezieher. Konstruktions- und Technikkulturen im deutschen, britischen, amerikanischen und französischen Maschinenbau zwischen 1850 und 1930, vol. 1287. Suhrkamp Taschenbuch Wissenschaft. Suhrkamp Verlag, Frankfurt am Main.
Koskela, L., Paavola, S., & Kroll, E. (2018). The role of abduction in production of new ideas in design. In P. E. Vermaas & S. Vial (Eds.), Advancements in the philosophy of design. Springer.
Kroll, E., & Koskela, L. (2016). Explicating concepts in reasoning from function to form by two-step innovative abductions. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 30(2), 125–137.
Kroll, E., Le Masson, P., & Weil, B. (2014). Steepest-first exploration with learning-based path evaluation: Uncovering the design strategy of parameter analysis with C–K theory. Research in Engineering Design, 25, 351–373. https://doi.org/10.1007/s00163-014-0182-8
Le Masson, P., & Weil, B. (2013). Design theories as languages for the unknown: Insights from the German roots of systematic design (1840–1960). Research in Engineering Design, 24(2), 105–126.
Le Masson, P., Weil, B., & Hatchuel, A. (2017). Design theory – Methods and organization for innovation. Springer Nature. https://doi.org/10.1007/978-3-319-50277-9
Le Masson, P., Weil, B., & Hatchuel, A. (2020). C-K design theory. In S. Vajna (Ed.), Integrated design engineering – Interdisciplinary and holistic product development. Springer-Verlag GmbH Germany, part of Springer Nature.
Lipton, P. (2000). Inference to the best explanation. In W. H. Newton-Smith (Ed.), Companion to the philosophy of science (pp. 184–193). Blackwell Publishers.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). Routledge.
Mac Lane, S., & Moerdijk, I. (1992). Reals and forcing with an elementary topos. In Y. N. Moschovakis (Ed.), Logic from computer science: Proceedings of a workshop held November 13–17, 1989 (pp. 373–385). Springer New York. https://doi.org/10.1007/978-1-4612-2822-6_15
March, L. (1976). The logic of design and the question of value. In L. March (Ed.), The architecture of form (pp. 1–15). Cambridge University Press.
McAuliffe, W. H. B. (2015). How did abduction get confused with inference to the best explanation? Transactions of the Charles S Peirce Society, 51(3).
Mohammadian, M. (2019a). Abduction − The context of discovery + underdetermination = inference to the best explanation. Synthese, 198(5), 4205–4228.
Mohammadian, M. (2019b). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S Peirce Society, 55(2).
Peirce, C. S. ([C.P.]). Collected papers of Charles Sanders Peirce, edited by C. Hartshorne, P. Weiss and A. Burks, 1931–1958. References are to CP in decimal notation by volume and paragraph number. Harvard University Press, Cambridge, MA.
Poincaré, H. (1898). On the foundations of geometry. The Monist, 9(1), 1–43.
Prouté, A. (2016). Introduction à la logique catégorique. CNRS / Université Paris Diderot.
Redtenbacher, F. (1852). Resultate für den Maschinenbau (2nd ed.). Friedrich Bassermann.
Reich, Y. (1995). A critical review of general design theory. Research in Engineering Design, 7, 1–18.
Reich, Y., Hatchuel, A., Shai, O., & Subrahmanian, E. (2012). A theoretical analysis of creativity methods in engineering design: Casting ASIT within C-K theory. Journal of Engineering Design, 23(2), 137–158.
Rodenacker, W. G. (1970). Methodisches Konstruieren (Konstruktionsbücher). Springer.
Rogers, P. C., Hsueh, S.-L., & Gibbons, A. S. (2005). The generative aspect of design theory. In 5th IEEE international conference on advanced learning technologies, 3.
Röntgen, W. (1895). Ueber eine neue Art von Strahlen. Sitzungsberichte der Würzburger physikmed Gesellschaft Würzburg, 137, 132–141.
Roozenburg, N. F. M. (1993). On the pattern of reasoning in innovative design. Design Studies, 14(1), 4–18.
Roudaut, F. (2017). Comment on invente les hypothèses: Peirce et la théorie de l’abduction. Cahiers philosophiques, 150(3), 45–65.
Schupbach, J. (2014). Is the bad lot objection just misguided? Erkenntnis, 79, 55–64.
Schurz, G. (2008). Patterns of abduction. Synthese, 164(2), 201–234.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69, 99–118.
Simon, H. A. (1957). Models of man: Social and rational. John Wiley & Sons.
Takeda, H., Veerkamp, P., Tomiyama, T., & Yoshikawa, H. (1990). Modeling design processes. AI Magazine, Winter 1990, 37–48.
Ullah, A. M. M. S., Mamunur Rashid, M., & Ji, T. (2011). On some unique features of C-K theory of design. CIRP Journal of Manufacturing Science and Technology, 5(1), 55–66.
van Fraassen, B. C. (1989). Laws and symmetry. Oxford University Press. https://doi.org/10.1093/0198248601.001.0001
Yoshikawa, H. (1981). General design theory and a CAD system. In T. Sata & E. Warman (Eds.), Man-Machine communication in CAD/CAM, proceedings of the IFIP WG5.2-5.3 working conference 1980 (Tokyo) (pp. 35–57). North-Holland.
Zittrain, J. L. (2006). The generative internet. Harvard Law Review, 119, 1974–2037.
Part XII Adversarial Abduction
65 Introduction to Adversarial Abduction
Samuel Forsythe
Contents
Introduction – Abduction and Adversariality
References
Abstract
Chapters in this section explore problems of abduction when inquiries are hindered or repurposed according to the pragmatic criteria of conflict. In adversarial scenarios that involve danger, uncertainty, and contingency, inquirers may find themselves compelled to exploit the operations of abduction to evade or facilitate violence, to uncover secrets and discover advantages, or to deceive the perceptions and inferences of adversaries. Through detailed investigation of some less familiar problems of abduction, the authors of this section reveal the significance and consequences of adversarial reasoning for issues of cognition, ethics, epistemology, semiotics, and social research methodology.
Keywords
Abduction · Adversariality · Deception · Detection · Inquiry · Violence
Introduction – Abduction and Adversariality
For researchers working across philosophy, cognitive science, semiotics, and social epistemology, the topic of abduction has proven to be a productive source of
S. Forsythe, Peace Research Institute Frankfurt, Frankfurt am Main, Germany
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_91
problems and puzzles through which to explore pragmatic and applied scenarios where hypothetical and diagnostic reasoning play a significant role. Situations of interest often concern those fields where surprises, puzzles, and practical problems compel us to proceed abductively, using hypotheses and conjectures to uncover the hidden meanings of signs and anticipate contingent possibilities. In most research, the focus is on social practices that share norms of cooperative problem-solving that resemble, to a greater or lesser degree, Peirce’s ideal community of inquiry: discovering, formulating, refining, testing, and sharing results in good faith with the aim of pursuing a collective, self-corrective practice of truth-seeking. However, in the many years since Peirce made his own inquiries into the logic of discovery, there have been those who have noted the relevance of abduction for uncooperative and adversarial scenarios, where inquiry is not organized according to the values of a shared community but is instead shaped by the dynamics of opposition, struggle, and conflict. (See: Eco & Sebeok, 1983; Thagard, 1992; Stjernfelt, 2003, 2007; Magnani, 2009, 2011, 2013; Bertolotti et al., 2014; Bertolotti, 2015; Fanti Rovetta, 2020.)
It is an unfortunate fact that even our greatest efforts of reasoning, problem-solving, and invention can be put towards ends that are both ethically and epistemically opposed to the spirit of inquiry imagined by Peirce. One need only remember the historical relationship between modern science and modern warfare to get a sense of how the tools of the mind can be turned into instruments of harm. And indeed it is not only the products of inquiry that can be adapted for situations of conflict but also the process of inquiring – the search for answers to questions – itself.
Indeed, it sometimes happens that individuals and communities find themselves faced with problems that differ in key ways from those dealt with by cooperative inquiries: they are faced not only with the puzzlement and doubt that arise from surprising circumstances but also with the uncertainty and anxiety that result from an unpredictable and dangerous environment. And if the stakes are raised even higher, such that the source of uncertainty is not simply contingent or inhospitable circumstances, but the actions of hostile and harmful adversaries, then the tasks and standards of inquiry can acquire very different pragmatic criteria. A prevailing state of adversariality means that an existential dynamic complicates problems that may also involve time-pressure and anticipation, ignorance and limited information, or misperception and sign manipulation. In such cases, abduction, along with the other instruments of thought and reason, must reckon not only with the intractability of nature but also with the active opposition of intelligent beings, other inquirers who have set their will and efforts against them. Under these conditions, the cognitive, inferential, and semiotic capacities of inquiry are employed to solve several interrelated problems of a uniquely adversarial nature: danger, secrecy, and deception. And because, as Peirce often noted, abduction is the form of inference best suited to conditions giving rise to doubt and uncertainty – and because adversarial situations are perhaps the most uncertain and doubt-filled of all human encounters – the dynamics of adversariality can transform the logic of discovery into a logic of strategic rationality.
Chapters in this section explore some of the problems of adversarial abduction, where inquiries are hindered or repurposed according to the pragmatic criteria of resistance and opposition. The chapter by Magnani examines the interweaving of abduction and hypothetical cognition in moral and violent behavior. Elucidating the eco-cognitive functions of abduction and introducing concepts from catastrophe theory and ecological psychology, Magnani reveals how our abductive capacity to discover, construct, and exploit hidden environmental opportunities can be instrumentalized – in thought, action, speech, and technics – for moral and violent ends. The chapter by Forsythe examines how adversarial dynamics generate new pragmatic roles for abduction, conceptualized as the epistemic and semiotic operations of detection and deception. Through the iterative and reflexive procedures of abduction, the adversarial inquirer aims not at truth but at advantage, discovering, manipulating, and exploiting epistemic opportunities and semiotic processes in order to uncover secrets, to deceive opponents, and to formulate advantageous strategies. Exploring the inferential operations of adversarial abduction in more depth, the chapter by Fanti Rovetta examines the role of conjectures in the epistemology of deception and the dynamics of abductive ruses. As the cognitive operation that underlies the ingenious operations of deception, abduction not only plays a role in the representational manipulation of an adversary’s inferences but also exploits sensorimotor cognitions to deceive the inquiries of perception and embodied problem solving. The final chapter by Stjernfelt scrutinizes the role of abduction in the study of secrecy and in situations where there is a systematic resistance to inquiry.
Examining the case of contemporary social research into the mysteries of Cold War intelligence, submarine warfare, and psychological operations, Stjernfelt illustrates how inquirers can employ abductive methodologies to overcome the epistemic and normative obstacles that arise when investigation is opposed by denial, deception, and the persistence of uncertainty. The chapters in this section offer glimpses into exciting and promising pathways for research into the adversarial operations of abduction. And despite their focus on the uses and abuses of hypothetical cognition, each chapter clearly manifests the spirit of inquiry envisaged by Peirce, aiding our collective effort to pass from the known to the unknown and into an uncertain future.
References
Bertolotti, T. (2015). Patterns of rationality. In Studies in applied philosophy, epistemology and rational ethics (Vol. 19). Springer International Publishing. https://doi.org/10.1007/978-3-319-17786-1
Bertolotti, T., Magnani, L., & Bardone, E. (2014). Camouflaging truth: A biological, argumentative and epistemological outlook from biological to linguistic camouflage. Journal of Cognition and Culture, 14(1–2), 65–91. https://doi.org/10.1163/15685373-12342111
Eco, U., & Sebeok, T. A. (Eds.). (1983). The sign of three: Dupin, Holmes, Peirce. Advances in semiotics. Indiana University Press.
Fanti Rovetta, F. (2020). Framing deceptive dynamics in terms of abductive cognition. Pro-Fil, 21(1), 1. https://doi.org/10.5817/pf20-1-2043
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning (Cognitive systems monographs) (Vol. 3). Springer. https://doi.org/10.1007/978-3-642-03631-6
Magnani, L. (2011). Understanding violence: The intertwining of morality, religion and violence: A philosophical stance (Studies in applied philosophy, epistemology and rational ethics) (Vol. 1). Springer. https://doi.org/10.1007/978-3-642-21972-6
Magnani, L. (2013). Scientific innovation as eco-epistemic warfare: The creative role of on-line manipulative abduction. Mind & Society, 12(1), 49–59. https://doi.org/10.1007/s11299-013-0118-4
Stjernfelt, F. (2003). The ontology of espionage in reality and fiction: A case study on iconicity. Sign Systems Studies, 30, 133–162.
Stjernfelt, F. (2007). Diagrammatology: An investigation on the borderlines of phenomenology, ontology, and semiotics (Synthese Library 336). Springer.
Thagard, P. (1992). Adversarial problem solving: Modeling an opponent using explanatory coherence. Cognitive Science, 16(1), 123–149. https://doi.org/10.1207/s15516709cog1601_4
The Epistemology of Secrecy: The Roles of Abduction in the Investigation of Deep State Issues
66
Frederik Stjernfelt
Contents
Secrecy As a Scientific Issue
A PsyOp of the Late Cold War
Secrecy and Abduction
Research Techniques Used
Multiplicity of Source Types
Investigating Parallel Scenarios
Interviews
Abduction Examples
Archives Research
Addressing the Delicacy of Information
Hedging Wild Information
Avoid Lapsing into Unfounded Conspiracy Theories
Special Conditions of Secrecy Investigators
Theoretical Motivations for Action
Probability of a Narrative Versus The Probability of Its Invention
Conclusion
References
Abstract
The investigation of secret matters in intersubjective human affairs such as economics, politics, privacy, etc. poses a special challenge for investigators. This chapter looks at the issue of “Deep State” matters, taking the Swedish researcher Ola Tunander’s investigation of the 1980s submarine scandals in the Baltics as a case study in order to track the use of abduction in such research. As much of such research has to dispense with normal standards of open, written sources, it
F. Stjernfelt, Aalborg University, Copenhagen, Denmark
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_30
has to rely on abduction, to a large degree, both in its overall conclusions and in numerous single steps of investigation.
Keywords
Abduction · International politics · Secrecy · Deep state · Submarines
Secrecy As a Scientific Issue

Special conditions apply to research in the humanities and social sciences that addresses issues which are deliberately kept secret. In most societies, there are actors striving to keep certain information hidden, secret, even to the extent of deliberately avoiding leaving written traces, deleting information, oftentimes providing alternative information in the shape of lies, cover-ups, fake news, planned disinformation campaigns, etc. Some such activities are legitimate, as in companies protecting secrets of production, political parties protecting strategy development, state or military organizations protecting strategic political information of different sorts, and individual citizens protecting private life choices and personal information; others are less legitimate, such as those of more or less organized crime. There is a considerable gray zone here, however, and states may strongly disagree about which parts of their policies vis-à-vis each other are legitimately kept in the dark, just as citizens and political organizations may question or challenge different aspects of state security secrecy. In all cases, such issues pose special problems for investigators, be they historians, social scientists, political analysts, investigative journalists, etc., who cannot, in many cases, rely on normal scholarly standards requiring open sources, written or printed archive material, and public access to information. This chapter investigates the special roles of abduction in such research. It should immediately be added that the chapter, in itself, has a tentative, if not abductive, character. To throw light upon the issue, the special type of secrecy connected to “Deep State” matters has been chosen as an example: secrecy as maintained in a given state by networks of intelligence services, top state officials, leading circles of government, general staffs, etc. in different combinations.
Such issues are often perceived as anomalies as compared to other human and social issues accessible via open sources. Secrecy, however, has been central to large swathes of human history, and the establishment of public access archives is a rather recent historical achievement and, in the larger perspective, an exception rather than the norm. Until the emergence of modern democracies, state matters were generally, as a matter of course, kept secret, and the publication of information pertaining to state issues was subjected to tight and detailed control. The continuing existence of secret services and related bodies in modern democracies retains some of these standards of premodern states, and even if recurrent criticism may demand more or even full openness of all such state activity, there is no reason to suspect this may fully happen, even in the most open and liberal of democracies. The first step in the
direction of principled public access to state matters was probably the Swedish Press Freedom Ordinance of 1766. Unlike the Danish Press Freedom of 1770, the Swedish version granted a rather restricted freedom, keeping religious writings and criticism of individuals under censorship. Also unlike the Danish law, however, the Swedish initiative introduced Freedom of Information in the sense that a large swath of state documents could now be freely published. The law, to be sure, maintained secrecy of certain qualified types of state documents; cf. Langen and Stjernfelt (2022) and Nordin, Langen, and Stjernfelt (in press).

In a certain sense, state secrecy appears prototypical for the whole, broader problematic of secrecy, not least because state power, in terms of economic muscle, technical proficiency, stability of institutions, and monopoly of violence, typically commands the strongest means and is able to do the most and go the farthest to conceal information considered worthy of secrecy, in comparison to other societal actors.

A recurrent danger in the philosophy of science is to rely too much on armchair speculation. Such speculation is indeed necessary, but it needs concrete material for analysis in order to bridge the normative approach of philosophers with description of, and insight into, how actual scholars in the field go about their task; cf. the tradition in philosophy of science coming out of Peirce and Cassirer, with its strong emphasis upon how actual scientists operate in their disciplines. For this reason, a case study has been chosen: the Swedish historian and political scientist Ola Tunander and his research into the Swedish submarine affair and the related international crisis of the 1980s, during the last decade of the Cold War. It is necessary here to add an immediate disclaimer: the author has known Tunander since the 1980s, and Tunander has contributed valuable comments to the present chapter.
To put it briefly, after many years of research, Tunander was able to reveal an unknown layer of activity beneath official events, in a series of books spearheaded by his (2001) and its English version (2004), continuing to this day with his (2019) and (2022). (See also Stjernfelt, 2001). The majority of the many submarines appearing in Swedish coastal waters in the early 1980s were not, as initially supposed, from the Soviet Navy but were rather vessels from Western countries masquerading as Soviet, thereby conducting a major “PsyOp” (Psychological Operation) in the closing phase of the Cold War, with the result that Swedish opinion turned strongly against the Soviet Union, from 27% to 83% in the course of 3 years. In order to track the epistemological means by which Tunander has been able to conduct his research, we shall give a brief summary of the results.
A PsyOp of the Late Cold War

The affair began when a Soviet submarine of the “Whiskey” class was grounded in the shallow waters of the coastal archipelago southeast of the major Swedish naval station of Karlskrona in October 1981. This surprising event directed the attention of world media toward Sweden and its military situation. In the ensuing years, sightings of foreign submarines began to spread along the Baltic East coastline of
Sweden, oftentimes inside coastal archipelagos, including cases close to the capital of Stockholm. Many sightings were by ordinary citizens, others by Swedish military authorities. Events came to a peak in a number of episodes in October 1982 when the Swedish coastal defense was close to catching or destroying enemy vessels, but every time the foreign u-boats seemed to manage to disappear. The whole series of events was standardly interpreted as Soviet aggression intended to intimidate neutral Sweden. Rumor had it that despite its formal neutrality, Sweden was in practice, to some degree, coordinating policies with NATO and was thus, effectively, a NATO ally. The Swedish government submitted a protest to Moscow against the submarine intrusions. Subsequently, several state commission reports (1983, 1995) counted up to 5,000 submarine sightings in the period, ranging from “possible” over “probable” to “certain” submarines. The 1983 commission classified them as Soviet violations of Swedish sovereignty. The 1995 commission more cautiously claimed there was no evidence pointing to any particular state as responsible, while the 2001 commission, in which Tunander participated, considerably revised this assessment and claimed that both Western powers and the Soviet Union might have taken an interest in operating in Swedish waters.

Tunander’s research began with a Peircean case of “surprise”: the standard interpretation was challenged by new information. In the 1980s, Tunander had subscribed to the standard explanation of events until he spoke, in 1985, with a Swedish military diver who related how he had been sent to the bottom of Swedish coastal waters only to find there the wreck of a submarine, including a corpse, on the seabed. Tunander made an appointment with the diver for a more detailed interview two days later, only to learn that the diver had been transferred, on one day’s notice, to new duty in the north of Sweden and could no longer be contacted.
This, of course, prompted the curiosity of the investigator, and he embarked on what would prove to be several decades of research. The immediate surprise of this new information prompted the abduction of an alternative hypothesis to the standard version: an unknown number of the many submarine sightings may not have involved Soviet vessels but vessels of other origin – supported by the deduction that if the Swedish defense had wrecked a Soviet u-boat, there would be no reason to keep it secret. This initial hypothesis led to ensuing attempts to test it by means of deduction and induction. Deduction: The normal reason for a foreign submarine to appear in the Baltics would be that it was deliberately ordered there, reflecting some foreign policy decision. If so, certain officials would know about the affair and it would possibly also be reflected in archive material. Induction, then, was used to test these initial deductions, through a demanding and protracted search of archives and interviews with a long series of naval officers, intelligence officers, politicians, submarine sighters, journalists, and other researchers, much of it taking place at academic conferences on security policy, naval strategy, and intelligence, in more or less informal conversations, resulting in a jigsaw puzzle of tiny bits of information to be synthesized. This emerging pattern consists, in a certain sense, of the initial abductive hypothesis refined, transformed, and made more sophisticated through numerous deductions and, particularly, inductive evidence.
One decisive breakthrough proved to be the access to personal diaries of senior officers, to reports and drawings of observed submarines, and particularly to a series of “war diaries” of the Swedish naval base of Muskö in Hårsfjärden southeast of Stockholm and of the coastal defense base further out at sea. The world press situated on the coast was able to watch, in October 1982, how depth charges were detonated in attacks against an intruding enemy who, however, managed to escape time and time again. In a detailed interpretation of the war diary documents, Tunander was able to show that there was much closer recognition of the invasive u-boats and far more detailed evidence of their presence than was apparent in the early commission reports. Tunander concludes from the diaries and from interviews with participating naval officers that at least one of the inimical submarines took a hit and had to be repaired while submerged on the seafloor, giving rise to noise of hammering and sawing in recordings made by Swedish naval equipment. Testimony from another diver claims that a US diver had made an emergency ascent to the surface and was, after an hour in medical care, transported away by a US embassy vehicle. Furthermore, Tunander documented that again and again, naval station officers stood ready to attack the intruders but were prevented, at the last moment, by ceasefire counterorders from above, as angry naval officers confirmed to him in interviews. Why was this? Tunander’s abduction from this strange information yields the following hypothesis: parts of the Swedish top naval leadership ordering ceasefire deliberately allowed the submarines to escape. And why was this? Another abduction: they were possibly friendly vessels which should be protected. When they had shown their periscope or tower on the sea surface, scaring the Swedish public, they should peacefully disappear again. Shooting orders would be given again, regularly, only when the submarines were safely back on the high seas.
This further abduction, then, pointed to the involvement not only of foreign powers, but also of certain figures at the top of the Swedish Navy, and one strand in Tunander’s investigation aims at identifying these characters: the admirals Per Rudberg, Christer Kierkegaard, and Bror Stefenson. Additional inductive indices were their close personal connections to several US admirals as well as to Secretary of Defense Caspar Weinberger. Tunander’s qualified guess – an abduction – here is that possibly as few as 3–4 top naval officers in Sweden would have been directly involved in an operation with US connections, while one source claimed more people would know about it, maybe up to 20. In 2000, long TV interviews with Weinberger and with British Navy Minister Keith Speed, both now retired, confirmed the presence of Western submarines in Swedish waters in the 1980s – a statement which only prompted the Swedish government to publicly question Weinberger’s credibility, referring to his progressing senility – and, a bit later, made the government appoint the third consecutive commission in 2001 to investigate events again, now with the participation of Tunander. Another trend in Tunander’s investigation aimed at identifying the types of submarines sighted in Swedish waters. Going through submarine types of leading European and NATO powers of the period, he concludes that the majority of u-boats sighted in Sweden did not involve Soviet-type vessels, but rather specific
British, West German, and particularly small Italian u-boat types. This finding was corroborated by the declassification, in 2001, of a 1987 report by Swedish Military Intelligence to then PM Ingvar Carlsson relating that two types of small submarines had been identified, neither of which fit any known Soviet submarine type. Analysis of tape-recordings of their motor sounds also pointed in the direction of Western submarines. This surprising finding, again, called for a new abduction: Rather than a purely American operation, it would have been an activity involving at least a few top NATO countries. A third trend in the investigation comes out of interviews with involved Swedish naval officers of lower rank, those who time and time again wondered why they were receiving ceasefire orders. War diaries and other documents show that the Chief of Defense Staff Admiral Stefenson in coordination with Admiral Rudberg gave these orders, while Chief of Defense General Ljung and the Swedish Prime Minister were kept in the dark. Local officers, however, would interpret those orders quite differently: as an indication that newly appointed Swedish PM Olof Palme, whose government incidentally took their seats during the 1982 Hårsfjärden u-boat hunt, was responsible for ceasefires. In the standard picture of Soviet vessels, ceasefires were, by many naval officers, interpreted as a sign of Palme’s weakness or even his appeasement vis-a-vis the aggressive Soviet Union. This fed into a much-reported “rebellion” of officers against Palme, in which much of the Swedish military turned against the PM, and may even have fed into his still unsolved assassination a few years later in 1986. 
The resulting PsyOp influencing the Swedish public may not have been planned from the outset, but rather improvised on the basis of the earlier installation of an American hydrophone system in Swedish waters to track Soviet submarine activity, a network monitored by US/Italian mini-submarines already from the late 1960s. A later development in the 1970s may have been British submarines coordinating with Swedish so-called “Stay Behind” forces ready to take action in case of an invasion. A still later development may have been repeated testing of Swedish coastal defense by means of simulated attacks by submarines from the powers mentioned. Maybe the idea of deliberately allowing for submarine sightings as part of a PsyOp only occurred when the actual effect of the 1981–1982 incidents on the Swedish public became apparent. Take note of the many “maybe”s here: Many of Tunander’s conclusions are expressed abductively, on a scale from possible to probable. A highly indicative, higher-level abduction from the investigation points in the direction of a rather small, close-knit personal network of certain top US government, military, and intelligence officials, their British and Italian counterparts, and top Swedish naval officers involved in the operation. Neither Swedish politicians nor Swedish commander-in-chief Lennart Ljung seem to have been involved or informed until later. This, of course, amounts to nothing but a conspiracy theory, and Tunander proceeds cautiously, expressing himself in abductive hypotheses, rather than definitive conclusions: “There is no final proof, but there are a large amount of sources which indicate that a small submarine vessel was sunk or seriously damaged . . . ” (Tunander, 2022). Translations of this and following quotes from this
paper are by the present author; also, Tunander’s reference notes, important to the corroboration of his claims, are left out for perspicuity reasons. All in all, Tunander’s research collects and connects thousands of indices, but no single smoking gun.
Secrecy and Abduction

Our task here is neither to confirm nor to scrutinize the exact plausibility of the details of Tunander’s conclusions, but rather to investigate the means used in conducting the research and reaching those conclusions. It is obvious that his research process repeats, again and again and on different levels, the standard Peircean investigation syntax Abduction-Deduction-Induction. (On Peirce’s notion of abduction, see the CP (Peirce, 1931–58) and the EP II (Peirce, 1998).) A surprising discovery calls for the articulation of a hypothesis (among many possible) which provides an explanation of that fact in the sense that the fact follows as an implication or exemplification of the hypothesis – a hypothesis which is then subjected to subsequent testing, first by drawing a number of deductive consequences which would hold were the hypothesis true, and then by searching for and sampling inductive facts able to corroborate or falsify those consequences. (Peirce’s generalized notion of diagram is central in the process: the abduction constructs a general hypothesis of a diagrammatic nature; deduction performs a manipulation of this diagram; induction seeks to test these diagrammatical results against facts. See Stjernfelt (2007), chs. 4, 8, 16; (2014), chs. 8, 10; (2022), chs. 8–11.) Oftentimes, a convincing hypothesis – in the sense that it presents itself as a good candidate for being selected for testing – is one which, like a gestalt pattern, may connect a whole set of puzzling data and make them understandable.
In Tunander’s development of a hypothesis on the basis of new, surprising facts, it is easy to see the important distinction introduced by Hoffmann (2007) about the logical character of an abductive hypothesis (the fact that it does provide an explanation of the surprising fact) on the one hand, and its perceptive character, on the other (in the sense that it provides a compelling new view of the whole situation), the latter leading to the selection of that hypothesis among many other such possible hypotheses equally logically explanatory (the diver lied, he mistook something else for a submarine, the submarine on the bottom was Swedish, the submarine on the bottom had been placed there the day before by extraterrestrials, the event was driven by astrological alignment, etc.). Peirce does not give any procedure for how to find the fertile, supposedly small set of explanatory hypotheses worth testing, most often merely referring to the natural penchant for guessing right granted to human beings by evolution and residing in the unconscious parts of the mind. Hoffmann distinguishes between abductions leading to a new general hypothesis (a “hypostatic abstraction”), on the one hand, and those leading to an overall perspective shift of the problem, on the other. This does not seem an absolute distinction, however, as a perspective shift may be summed up in an abstract concept (cf. the concepts of “rabbit” and “duck” summing up the perspective shifts of the famous Jastrow duck-rabbit figure) – but Hoffmann importantly points to the issue
that, oftentimes, successful abductions reorient the whole problem by arranging the given facts in a new Gestalt pattern. That is indeed the case with Tunander’s initial hypothesis: Instead of immediately seeing intruding vessels as stemming from the nearest inimical superpower, his hypothesis makes a lot of existing facts – the Soviet Whiskey 1981 incident, the subsequent u-boat sightings, the Muskö 1982 incidents, etc. – fit into a completely new and more inclusive conceptual pattern, not only with several new actors involved (Western powers), but also with a different and more complicated overall schema: instead of the simple dual scheme of two opposed powers, that of larger-scale pretending games involving several powers and their publics. Most often, an initial abduction must be modified or refined after subsequent testing – thus the initial hypothesis that non-Soviet submarines indicated a foreign operation had, in the face of the u-boat types identified, to be modified in the direction of an operation involving certain NATO countries. Again, the small number of involved persons identified points in the direction of an operation conducted not simply through standard NATO channels, but rather through a personal network within NATO with strong US dominance, also involving Great Britain and, to a lesser degree, West Germany and Italy, targeting neutral Sweden. All of this, however, is hardly surprising. Research of all kinds operates in this recursive Ab-De-Induction mode, with very different means across disciplines, to be sure, particularly as to the methods of collecting inductive evidence for hypotheses, but also as to the nature of existing, established background knowledge providing material for deductions from hypotheses. And a large investigation like Tunander’s would consist of a huge network of thousands of such small inference steps over years. Peirce developed the notion of abduction (also: hypothesis or retroduction) throughout his career.
Detailed presentations of his doctrine of abduction, criteria for selecting a hypothesis, and how to subsequently test it by means of deductions and inductions of various types may be found in other chapters of the present Handbook. His 1901 text “On the Logic of Drawing History from Ancient Documents Especially from Testimonies (Logic of History)” (EP II, 95, CP 7.202; EP II, 106–107, CP 7.218–219) is probably where he goes furthest in the direction of detailing the role of abduction with respect to historical investigations in particular. Here, Peirce’s examples are chosen from ancient intellectual history, but they seem, with some modifications, to apply to the actual case of political-historical research as well. Here, he develops a number of ideal rules applying to abduction:
1. The hypothesis chosen should explain all of the facts known, including how and why existing untrue counteraccounts would have originated.
2. The principal testimonies motivating the hypothesis should be taken to be true, and this “should not be abandoned until it is conclusively refuted.” Disbelieving a witness, e.g., should only occur in cases with “definite, objective, and strong reason for the suspicion.”
3. Great, objective probabilities for one hypothesis over another should influence our choices of hypotheses for testing, rather than small probabilities or subjective similarities which will only reflect the preconceived prejudices of the investigator.
4. The hypothesis chosen should, subsequently, be split up in as many parts as possible, so that each can be tested independently.
5. When choosing between two hypotheses, an enlargement of the field of facts they should cover may give rise to a preference for one of them.
6. A hypothesis which you will have to investigate anyway should be preferred over another, for economy reasons.
A final, crucial principle is the following: “A hypothesis having been adopted on probation, the process of testing it will consist, not in examining the facts, in order to see how well they accord with the hypothesis, but on the contrary in examining such of the probable consequences of the hypothesis as would be capable of direct verification, especially those consequences which would be very unlikely or surprising in case the hypothesis were not true.” The mere adding of further facts to a hypothesis chosen does not, importantly, prove anything (but may help the selection of the better hypothesis among several, cf. above). What should be tested is rather some deductive consequences predicted by the hypothesis, and among them those should be selected which would be “most unlikely or surprising” if the hypothesis were not true.
Among such deductive consequences, Peirce lists the following – in this quote, political actions, statements, and sources (in UPPERCASE) have been substituted for Peirce’s references to ancient monuments, documents, and authors, respectively: It is not easy to enumerate the different kinds of consequences; but among them may be that the hypothesis would render the present existence of a POLITICAL ACTION probable, or would result in giving a known POLITICAL ACTION a certain character; that if it were true, certain STATEMENTS ought to contain some allusion to it; that if it is misstated by some authority not considered in the selection of the hypothesis, that misstatement would be likely to be of a certain kind; that if the hypothesis is true, and an assertion or allusion found in A SOURCE is to be explained by THE SOURCE’s knowing it to be true, he must have had certain other knowledge, etc. (EP II, 113–114, CP 7.225–231)
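Peirce’s final principle, that testing should target consequences which would be “very unlikely or surprising” if the hypothesis were not true, can be given a rough quantitative gloss. The following Python sketch is one possible modern reading, not Peirce’s own formalism: it assumes that the investigator can attach subjective probabilities to each candidate consequence, and it ranks the consequences by the ratio of the two. The class `Consequence`, the functions `rank_for_testing` and `posterior`, and all numbers are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consequence:
    """A testable consequence of a hypothesis (illustrative only)."""
    label: str
    p_if_true: float   # assumed probability of finding it if the hypothesis holds
    p_if_false: float  # assumed probability of finding it if the hypothesis fails

    def likelihood_ratio(self) -> float:
        # High ratio = "surprising in case the hypothesis were not true".
        return self.p_if_true / self.p_if_false


def rank_for_testing(consequences):
    """Order consequences so the most diagnostic ones are tested first."""
    return sorted(consequences, key=lambda c: c.likelihood_ratio(), reverse=True)


def posterior(prior: float, confirmed: Consequence) -> float:
    """Update a prior after a consequence is confirmed (Bayes' rule, odds form)."""
    odds = prior / (1.0 - prior) * confirmed.likelihood_ratio()
    return odds / (1.0 + odds)


# Made-up probabilities, chosen only to show the contrast.
candidates = [
    Consequence("fact that fits either way", p_if_true=0.9, p_if_false=0.8),
    Consequence("fact surprising unless hypothesis is true", p_if_true=0.6, p_if_false=0.05),
]
ranked = rank_for_testing(candidates)
```

On these assumed numbers, confirming the first candidate barely moves an even prior, while confirming the second moves it substantially, which is the point of the remark that merely adding accordant facts proves little.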
The interesting question here is then: What is particular to the investigation of such secrecies? The world of intelligence holds a series of peculiarities as against other state organizations: An elementary one, of course, is the secrecy of certain information and activities, oftentimes classified on a secrecy scale (e.g., confidential, secret, and top secret); secrecy also about the activity of gathering that information; secrecy of other defensive, preemptive, even aggressive activities such as counterespionage actions and operations, decided on the basis of such information. Particularly delicate issues may be handled as “Nothing on paper” matters, deliberately leaving little or no traces in archives. But to conclude from such lack of traces is a weak argument, cf. the classic historians’ warning against Argumentum ex silentio (the idea that if something is not mentioned, it did not happen, an argument sometimes voiced in the discussion of the crucifixion of Jesus or other New Testament events which are but rarely mentioned in contemporaneous sources, if at all). Sometimes, such an argument is ridiculed as a bogus argument
or even as an outright logical error, cf. the saying that “the lack of a proof does not imply the proof of a lack.” The argument from silence, however, is but an abduction and, like all abductions, must yield when faced with strong counterindications. But “Nothing on paper” is not all. Secret services possess other emergency-like powers; oftentimes they have the ability to pay informants or agents in cash, outside of formal registration and taxation so as to avoid traceability; more generally, intelligence services may maintain some degree of discretion, probably differing in degree from one country to the next, to bracket or transgress parts of normal legislation and norms in their activities. They often retain the possibility of registering citizens for perfectly legal if suspicious behaviors. They may indulge in feigned or as-if activities to cover up real activities. They may utilize fake news, organized disinformation, going all the way up to PsyOps, and large-scale operations attempting to make the enemy or the public accept certain invented fictions. They may be able to bar archive access to “sensitive” information; there may be longtime classification of access to certain archive files (oftentimes many decades of such classification), including the release only of redacted documents with deletions of sensitive information. In extreme cases, they may be able to go so far as to employ so-called “wet jobs” (covert action involving violence and possibly killing). This is the special field with which investigators must deal. A note must be made here about the recent tendency to gather such phenomena under the conceptual headline of “Deep State,” indicating that in many modern societies, parts of intelligence services, administration, military, private companies, etc. may come together in the organization of activities which may, to some degree, go against the official policies of democratic governments, with or without the acceptance of those governments. 
Related concepts include “the sovereign” (Carl Schmitt), “dual state” (Hans Morgenthau), “deep politics” (Peter Dale Scott), “double government” (Michael J. Glennon), “parapolitics” (Eric Wilson), and “securitization” (Ole Wæver), which only go to indicate that the very issue behind the “Deep State” notion is much older than the term, cf. also President Eisenhower’s reference to the “military-industrial complex.” (Cf. Wilson, 2012). Actually, Tunander himself played a role in the actual spread of the term “Deep State” which originated in Turkey and the particular discussions of the existence of Kemalist network activity in Turkish state bodies, maybe with roots dating all the way back to Ottoman times, such as top military, judiciary, officials and others to some degree collaborating to defend secularist norms against elected governments of the Turkish state. According to Gingeras (2019), Tunander’s conference presentation “Democratic State vs Deep State: approaching the Dual State of the West” at a 2006 conference in Australia was the first scholarly attempt at generalizing and defining the Deep State concept from the Turkish case so as to apply also to modern democratic societies, from where the concept enjoyed some spread in political science circles, not least due to Peter Dale Scott. The presentation was later published as Tunander (2009); cf. also Scott (2017). In Tunander (2009), he uses “Deep State” and “security state” interchangeably as an analytical rather than a normative notion: the Deep State “ . . . is able to calibrate or manipulate the policies of the ‘democratic state’ through the use of a totally different logic of politics – a
kind of politics that in this book is referred to as ‘parapolitics’ and which operates outside the law to define the limits of the legal discourse. The argument presented here is not meant as a normative statement, but rather as an attempt to describe and analyse the Western state as it actually operates, both inside and outside the law” (57). Later, however, the Deep State term was famously picked up by influential forces on the American right wing where figures like Steve Bannon and Alex Jones popularized it, around 2016–2017, as an effective moniker for the “swamp” which newly elected President Trump was supposed to “drain” in Washington. This spread of a scholarly notion into actual politics, however, simultaneously seems to have considerably simplified and exaggerated it, even distorting it in the direction of referring to a unitary, stable, long-term, strongly organized secret body. This, however, is no necessary implication of the term. Dale Scott, e.g., takes the more cautious attitude that “ . . . there are competing elements in the deep state. I’m not saying that there’s some kind of secret team that is in charge of everything, As I say in my book, ‘the deep state is not a structure but a system, as difficult to define, but also as real and powerful, as a weather system.’ A vigorous deep state, like America’s, encompasses dynamic processes continuously generating new forces within it like the Internet – just as a weather system is not fixed but changes from day to day” (2017). Tunander’s view evidently has the latter character: the deep state is not necessarily unitary nor stable but may consist of different competing and conflicting elites. Cf. his argument in his 2000 feud with historians claiming he should stick to written archive material: “For every scholar dealing with security policy, a main problem is the secrecy of the decision-making process. 
An analysis of some secret documents might give a true picture, but a top secret or even more highly classified document could contradict this picture. An even more secret oral commitment and understanding may be contradictory even to the ‘top secret story.’ In the final analysis, there is no guarantee. ( . . . ) The secret of security leads us into a hierarchy of discourses with no solid foundation” (Tunander, 2000, pp. 436–37). Tunander’s guess is that such networks are kept together not by explicit agreements but rather by trust between particular persons – trust, of course, being individual, fragile, fleeting, and subject to erosion. In conclusion, Deep State activities may spring from much more improvised, temporary, fluid networks, even harboring their own inner tensions.
Research Techniques Used
Let us revisit a number of research steps taken by Tunander.
Multiplicity of Source Types
Central to Tunander’s investigation are interviews with numerous naval and intelligence officers. An important issue, however, is not to focus too closely on certain selected or initially obvious types of information sources. Some of the information leading to Tunander’s research breakthroughs came from inferences based on classic archive material, such as naval war diaries; other parts were guided by open source material, such as the detailed shapes, looks, and sounds of different submarine types which proved crucial in the refutation of the Soviet-only hypothesis. To the same effect, information from ordinary passers-by reporting submarine sightings proved important in a sort of “citizen science”: Their observations and ensuing descriptions most often did not fit with Soviet submarine shapes. Some information may be pure indices, such as a photo of Swedish Top Admirals Rudberg and Stephenson with US Secretary of Defense Weinberger together on the naval station of Muskö proving in itself nothing more than that they, at least on one occasion, had professionally met. This exploratory openness to information types has, in itself, a strong abductive bent – it takes the shape of trial-and-error guessing at where relevant information may reside, only to be subject to ensuing deduction-induction testing weeding out many such initial guesses as irrelevant.
Investigating Parallel Scenarios
A recurring technique is the listing of parallel possible scenarios given the actual amount of partially conflicting information. An example is Tunander’s “Three possible scenarios” (Tunander, 2022, in press, p. 12). Here, he considers three possible candidates for the presence of mini u-boats:
1. British submarines – their presence in the area was confirmed, and the Brits did have mini-submarines at their disposal, but their vessels were largely there in order to train with Swedish stay-behind forces, not to show themselves.
2. West German submarines – they were close by in the Baltic Sea and were confirmed to play important roles in some US collaborations.
3. Italian submarines – they had many mini-submarines, used by the USA already by the 1960s, and the shape and characteristics of two types of submarines described by Swedish observers fit with specific Italian vessels.
The three possibilities need not exclude each other, and the abductive conclusion is not unequivocal but points in the direction of the two latter possibilities, with special emphasis on the Italian one, supported by other evidence, cf. below. In a certain sense, the parallel-scenario technique mirrors the well-known “war games” of military strategists, here put to use about past rather than future events.
Interviews
A central research tool in Tunander’s investigation is the interview with surviving participants in events, of many different sorts. This is a procedure practiced by journalists but often scorned by historians, rightfully pointing to human forgetfulness, wishful thinking, and selective memory, if not outright tactical lies, potentially strongly shaping and reshaping memories of even conscientious and intelligent participants. Part of the very ontology of secrecy, however, is the strong tendency to deliberately avoid leaving written traces for the archives cherished by historians. So, there is a difficult choice or trade-off to be made here. If one sticks to the standard of substantiating hypotheses with reference to open, written, contemporary sources only, typically public archive material, the danger is that important events may pass completely below the radar because they are not documented there, or appear only in strongly curtailed or distorted versions. If, on the other hand, one allows for the introduction of interviews, the danger is to be misled by the mentioned uncertainties of memories, by rumors, even by deliberate misinformation. This dilemma is not easily solved. Tunander has, on some occasions, been publicly attacked by historians on exactly this issue (cf. Kronvall et al. (2000) and Tunander’s (2000) answer where he briefly says that “My point is that, in general, more sensitive information will not be found on paper” (434)). How to avoid being fooled by a cesspool of rumors or, worse, by systematic disinformation? An obvious road followed by Tunander is to maximize the number of interviewees. His latest summary of results (2022, in press) addresses this in his final conclusion:
Sometimes it may be difficult to know how reliable a source is, but there are many sources saying almost the same, and we can hardly neglect Swedish divers claim to have experienced things, that their superiors confirm such things have happened and that the most central figures form CIA and US Navy say the same. When top figures in CIA and US Navy claim there was an “Underwater U-2” and that CIA Director Casey and Admiral Lyons were responsible, and when Lyons himself says the same, this must be taken seriously. 
(Tunander, 2022, in press, p. 23)
This argument is twofold. First, it refers to the number of informants expressing similar things, an informal inductive argument. Second, it refers to the diversity of sources: Swedish personnel of different ranks as well as US top officials and officers. Basically, this argument is a qualitative induction making probable the abductive pattern which most available information fits into. Still, this strategy also involves dangers. Very often, the way to get in contact with a new informant is through an earlier informant to some degree trusting the next link in the chain; Tunander’s research holds many examples of this. But such trust may also imply that the information gained from different informants is not independent or not wholly independent; the optimal situation being, of course, the collecting of information from as many independent informants as possible. But full independence of sources may be difficult, even impossible to establish. It may be hard to judge whether or to what degree certain informants may have coordinated what they wish to make public, maybe unconsciously so. Nor should one underestimate how fast a rumor about the appearance of a curious and skilled investigator may run in certain close-knit environments, possibly making some informants prepare diligently what they wish to reveal and what not. Networks involved with secrecy may adopt a certain degree of rational paranoia, so to speak.
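The worry about non-independent informants can be made concrete with a small calculation. The Python sketch below is only an illustration under strong assumptions: each report is given a made-up likelihood ratio (how much more probable the report is if the hypothesis is true than if it is false), and informants reached through the same chain of introductions are conservatively counted as a single report. The function names `naive_ratio` and `chain_aware_ratio`, the chain labels, and all numbers are hypothetical and are not taken from Tunander’s material.

```python
from math import prod


def naive_ratio(reports):
    """Combined likelihood ratio if every report is treated as independent."""
    return prod(ratio for _, _, ratio in reports)


def chain_aware_ratio(reports):
    """Combined ratio when reports from the same introduction chain count once.

    Conservative assumption: informants who reached the investigator through
    the same chain may have coordinated, so only the strongest report per
    chain contributes.
    """
    strongest = {}
    for _, chain, ratio in reports:
        strongest[chain] = max(ratio, strongest.get(chain, 0.0))
    return prod(strongest.values())


# (informant, introduction chain, assumed likelihood ratio); all values invented.
reports = [
    ("informant A", "chain 1", 3.0),
    ("informant B", "chain 1", 3.0),  # introduced by A, so possibly not independent
    ("informant C", "chain 2", 3.0),
]

print(naive_ratio(reports))        # treats all three as independent
print(chain_aware_ratio(reports))  # treats A and B as one voice
```

The gap between the two results shows why maximizing the number of interviewees strengthens the induction only to the extent that the informants are independent of one another.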
Abduction Examples
Robert [Bathurst, US officer later Tunander’s colleague at PRIO in Oslo, FS] but also Bob Woodward said that while Bobby Inman [US attache in Sweden, FS] was in Stockholm, he had “a terrific source, who provided significant military information on other countries.” This may have been Sven Andersson who, when Inman left Stockholm, had been secretary of defense for 10 years, but it may also have been later Chief of Navy Per Rudberg who went on holiday with Bobby Inman, and who would have become Swedish Chief of Defense if Sweden had been occupied, and who, at his visit in USA in 1978, was received by most of the US Navy leadership. They had a dinner for Rudberg with seven admirals. He later met Chairman of Joint Chiefs of Staff William Crowe three times, according to notes from Swedish attache Lennart Forsman. Rudberg was also the officer whom secretary of defense Caspar Weinberger chose as his escort officer during his 5-day visit to Sweden in October 1981. (Tunander, 2022, in press, p. 14)
– This is abduction from the existence of personal networks, professional and nonprofessional, to the hypothesis that a person so connected may be identical to an identified US informant in Sweden. Two such persons are named, so the hypothesis remains open: Either of the two – or both – may fit the description.
In July 1982, Bobby Inman had left as deputy head of CIA in protest against Casey. Inman had been leader of Naval Intelligence (1974–76), deputy head of DIA (1976–77), and leader of NSA (1977–81), and in January 1981 he took over as deputy leader of CIA after Frank Carlucci, while Carlucci (who did not tell me much) became deputy secretary of defense, national security advisor, and in 1987, secretary of state after Weinberger. (Tunander, 2022, in press, p. 20)
– A widespread abduction is that from the careers of informants, often meticulously presented in compact detail, to corroborate their possible influence and level of knowledge.
James Schlesinger confirmed (in the car back from a conference in Rjukan in 1993) to OT that a Western submarine under American command had been damaged at Muskö in 1982. (Tunander, 2022, in press, p. 4)
– This is abduction from small talk which may be more important than immediately obvious. Oftentimes, it seems to be the case that the most important or sensitive information is given in the informal “margins” of conferences, in coffeebreaks, at late-night sessions involving alcohol, or like here, during transport. It is important for the investigator to keep alert in order to exploit such informal situations when standard cautiousness of informants may momentarily relax. An elementary reason may be that the informant agrees to pass on some information but does not wish to be observed by others while doing so.
At a lunch I had in 1989 with Admiral Crowe’s predecessor as Chairman of Joint Chiefs of Staff, General John Vessey (1981–1985), he said: “When it comes to Sweden, there was only one rule: Nothing on paper”. (Tunander, 2022, in press, p. 21)
– It gives the strong abduction that such a policy may in fact have been agreed upon in the American military top. Remarkably, a bit like the preceding, this is the only piece of information quoted from the whole conversation with Vessey.
This indicates an important practical condition for many such conversations: The investigator cannot meet with a clear list of questions quickly to go through one by one as if in a semistructured interview. Rather, the conversation may have to be less structured and informal, slowly meandering in the hope of an opening which may, in many cases, never materialize. It is a sort of an abductive stance on the part of the interviewer, a bit like Freud’s “gleichschwebende Aufmerksamkeit” during analysis, also connected to the slow building of trust during a conversation, gradually realizing (or not) the reliability and well-informedness of the investigator.
Names of members of the secret liaison office between the US Navy and CIA named National Underwater Reconnaissance Office (NURO) are listed by Tunander with caution (“are mentioned to be . . . ”) and are subsequently abductively confirmed from an absence of denial: “ . . . Norwegian intelligence with connection to NURO is claimed not to have denied this” (Tunander, 2022, in press, footnote 10).
– This is abduction from the nondenial of information by an authority supposed to be in the know, effectively a case of argumentum ex silentio. No further reference is given here; this is obviously a weak abduction.
MacEachin [top US intelligence official, from 1981 deputy and chief of CIA’s «Operation Center», preparing daily reports to President Reagan, head of the CIA Soviet Office, and Deputy CIA Director for Intelligence, FS] was informed of the Muskö 1982 incident. It was obviously no Soviet operation. When I asked him about it, he said: “This was like an underwater U-2,” as if a vessel under CIA or Navy command had been sunk and an American pilot had survived just like the U-2 1960 episode, just like the diver and the doctor/officer above related. I then said: “But the Swedes never went public” (different from what the Soviets did). 
He then turned to Ben Fischer, also present, and asked: “Do you know what we are talking about?” Ben Fischer said shortly: “Yes, Ola wrote a paper on it. Ola’s presentation was the first I heard in Oslo.” Then the conversation was over. (Tunander, 2022, in press, pp. 8–9)
– This is abduction from extremely rudimentary information. MacEachin’s mere six-word comment presents a whole analogy which Tunander (2022, in press) spends an entire section of his paper unpacking: the comparison of the October 1982 event to the Soviet shooting down of an American U-2 spy plane over Russia in 1960 where the pilot survived. In a certain sense, an analogy argument is an overall abduction of a whole structural comparison involving, on a lower level, ab-, de-, and inductive structure, allowing for Tunander the abduction that this top intelligence officer knew about the capture and release of an American from a wrecked submarine in 1982 (cf. Hoffmann’s 2018 distinction between abduction and 1) facts, 2) concepts, 3) laws, 4) theoretical models, and 5) new representation systems; here, the abduction is of type 4).
Three days after this event [the Muskö incident in October 1982, FS], there was a meeting in Geneva between two Swedes, four Americans, and an Italian intelligence officer. From the Swedish side, it is said to have been an industrialist and an officer (or retired officer), but no diplomat. From the American side, it was two civilians [one may have been from CIA] and two military officers, and the only reason to include an Italian officer was that this “Underwater U-2” was Italian. (Tunander, 2022, in press, p. 15)
– This is abduction from the mere national composition of a meeting, without knowing the identities of participants, except for the Italian officer (the source of this meeting is shared with an Italian TV-network but for some reason kept secret).
In the week up to this meeting, the Swedes had released 45 depth charges. The following week, they released only 2 . . . (ibid.)
– This is an abduction from a change in military activity to make probable the importance and effect of the simultaneous meeting mentioned above.
A review of Tunander’s 2019 book by four experienced politicians, journalists, and diplomats (Hirdman et al., 2019), claiming that Swedish secrecy about submarines should now finally cease, could not find a publishing outlet among important Swedish newspapers, despite attempts in Svenska Dagbladet, Dagens Nyheter, Expressen and Aftonbladet. Tunander briefly concludes: “These questions are obviously not of interest for Sweden and for Swedish media” (Tunander, 2022, in press, p. 23).
– This is a tacit, ironic abduction from publication problems. Tunander rhetorically comments on the strange rejection of a piece written by influential public figures, intimating without saying it directly that the Swedish press still remains under some political spell of silencing relevant information. This has as much to do with rhetoric as with argument structure: Oftentimes, a piece of information is given, leaving the obvious abduction to the reader.
Archives Research
Most of Tunander’s archives research follows standard procedures in much historical scholarship and investigative journalism which are not peculiar to Deep State issues. A few peculiarities connected to deletions and classification of archive material, however, may be mentioned:
In March 1984, the House Armed Services Committee asked head of Navy Intelligence Admiral Butts, together with Weinberger’s marine secretary John Lehmann, about events in Sweden. Butts said the Soviet submarine which was grounded near Karlskrona in 1981 was “genuine” while the section of the resume about the submarines at Muskö 1982 is still classified. This implies these submarines were hardly Soviet (Tunander, 2022, in press, p. 5).
This is an abduction from which exact parts of a text have been classified: There would be no reason for continued classification if those vessels had been Soviet. Similarly, Tunander argues that the main report by the Swedish Navy is now largely declassified and speaks about two submarine wrecks in the 1982 events rather than one, with lengths of 16 meters and less than 10 meters, respectively. Their more detailed identities, however, remain classified in the text. Tunander attempted to challenge this classification in court but did not succeed. He adds that the information that the Navy had dived to these wreckages in 1982 was declassified in 2001 but reclassified in 2008 (Tunander, 2022, in press, p. 7). This is an abduction from the classification history of specific parts of a document: The reclassification of it gives rise to the deduction that a new assessment of the sensitivity of information has been made and the more general abduction that there are continued, active attempts going on to silence submarine matters.
Addressing the Delicacy of Information
Many informants, e.g., in the military or in intelligence, may be professionally obliged to maintain secrecy. But this is not a formal requirement of duty only; informants spilling the beans may risk not only legal prosecution if revealed – but also the loss of social standing and substantial parts of their social networks and friends. It may be extremely delicate to make such potential whistleblowers participate. This may be circumvented, partially, by the slow building up of trust with the investigator, trust which may be subsequently compromised on publication where informants may have to resort to withdrawing true information not to become subject to threats or sanctions, like the saying “You know I shall have to deny this if it goes public.” But even if granting such delimitation, trust is no simple issue but may be partial, on probation, vicarious, etc. An informant may be perfectly willing to report the truth on some important matters but remain silent on others and even deliberately deceitful on still others. Some informants may be explicit about what they wish to say and what not (but can you trust this?), others less so. A related issue, also faced by intelligence services themselves, is that you cannot be too picky in your selection of informants. Very few persons may be in a position or have the personal profile to gather, harbor, and share specific information. Such activity requires special knowledge and access to information shared maybe only by few, but it may also in some cases tend to involve special personalities, adventurous, idealist, courageous, cynical, self-indulgent, neurotic, or fantasizing characters willing to risk personal losses for providing and passing on information, maybe in some cases even involving borderline psychologies. (Cf. Stjernfelt (2007) ch. 18). 
Another possible source of unreliability of secret source material lies in the elementary self-interest of informants to appear important. A classic issue, e.g., is the obvious interest of handlers in exaggerating the importance and activities of the agents they handle. Sometimes, the pride of informants may make them reveal things they should not, cf.: John Lehman’s (former US Secretary of Navy) protégé, Admiral James “Ace” Lyons, had claimed to be one of those responsible for secret US operations in Northern Europe. When Tunander related this to the TV network Arte in 2015, this prompted Arte to go to the USA to interview Lyons who then said on camera:
Yes, right, right, right. [ . . . ] This was my staff [ . . . ] Me! It was I who sit here. I am the one who put this together: the deception units, all the false systems and all of it. I and my staff did it. [ . . . ] I am sure they [the Swedes or somebody on the Swedish side] recognized what we were doing. (Tunander, 2022, in press, p. 22. Square bracket deletion and insertions are Tunander’s)
Tunander relates how Lyons left the TV studio seemingly disappointed with himself. He should not have revealed this, a judgment later corroborated by another source. A standard means to attempt to circumvent the hesitation of informants, of course, involves the promising of anonymity of such whistleblowers which inevitably, however, will go against the scientific principle of openness of sources and open access ideals of the sharing of data. Some sources are still kept in the dark even in Tunander (2022, in press), forty years after events. This, of course, may make harder, even impossible, the replication and confirmation of results by later researchers who may not have access to the same or even to similarly relevant sources, not only because they may not have survived but also because it is simply not known who they are – cf. the actual “replication crisis” of science. A special technique as to the informability of informants is to address retired personnel in particular, who may, in some cases, more easily relax on their obligation to silence because they are now beyond responsibility and prosecution (cf. an interview with the retiree Keith Speed in 2008; or the 2000 interviews with Weinberger and Speed prompting the third 2001 commission). In all cases, a sort of qualitative induction lies in the very wish to extend interviews to as many informants as possible, in order to hopefully include at least a minimum number of such independent informants, potentially to be able to triangulate between them and balance and combine their accounts of facts.
Hedging Wild Information
So, in a certain sense, when investigating secrecies, you have to multiply the number of abductive guesses, hoping some pattern will emerge by triangulating them. A very open, explorative phase of “promiscuous” information gathering where no detail may a priori be irrelevant is crucial – the danger of the unfocused character of such research, of course, is to find nothing of interest, or, maybe worse, find a multiplicity of leads forming not one or a few patterns, but leading in many diffuse directions. In that case, research would, at some point, grind to a halt and just have to stop. Another means to hedge the dangers of wild information, of course, is to base the research on a framework of the most updated standard historical accounts, in this case of the Swedish military and its US relations, investigated by standard means, as the baseline of the investigation. This may seem elementary or even trivial, but that is an important point of distinction as against the vast flora of wild conspiracy theories of the QAnon type out there, which typically do not care to thus situate their claims. Of course, aspects of such standard accounts may possibly, during the investigation, have to be revised or even contradicted, but chances are that their overall framework is valid and may be taken for granted such as to provide structure, e.g., to the deductive phase of hypothesis testing – an example in Tunander (2022, in press) being the parallel mentioned to the more well-known 1960 U-2 incident. In general, contrary to what might be assumed in research so dependent on a multitude of tiny controversial empirical findings as the study of secrets, deductions are far from absent. Elementary deductive diagrams include the following: the
simple schema of friend and foe, of different interests clashing, and contrasting strategies developing on several sides in a mutual interaction to guide behavior during the development of tensions – in a certain sense belonging to the regional ontology of state security. Such political a priori structure is tacitly assumed. Likewise, more empirical diagrams, such as those depicting existing state structures with interrelated bodies with specific dependences and different requirement of secrecies, may appear as given to the investigation, and so figure in deductions.
Avoid Lapsing into Unfounded Conspiracy Theories
Or, to be more precise, avoid lapsing into false conspiracy theories rather than true ones. Dealing with the worlds of state security is to address a realm where conspiracies are, in fact, sometimes present and active, and the very field itself brims with more or less spontaneous conspiracy theories claimed by participants themselves. The investigator, of course, cannot accept the standard, skeptical view that conspiracy theories are always nothing but false inventions. But the investigator has to be extremely cautious not to succumb to circulating such theories without strong indices. A particular danger lies in exaggerating the extent, activities, and range of actual, existing conspiracies found. To really achieve the “Nothing on Paper” standard, a conspiracy typically has to be restricted to a small number of participants in the know and relatively short-lived; otherwise, chances increase that some subordinate member may break secrecy.
Special Conditions of Secrecy Investigators
An elementary complication in investigating secrecies in general and Deep State issues in particular is that the investigation itself may immediately constitute a threat to certain actors – those bent on protecting secrecy. Such research, then, in a very acute sense different from much other research, immediately becomes part of the very social structures it sets out to investigate. Different actors in the field may suspect the investigator of being allied with or part of other such actors or interests, making investigation steps difficult. In extreme cases, researchers themselves may be subject to threats, more or less subtle – an important abductive ability here on the part of the investigator is to judge the degree of reality behind such threats. Speaking with former US Navy Secretary Lehman, Tunander recalls:
Who were involved in Sweden, Lehman would not address. But he added “If I tell you that, I would have to kill you afterwards.” (Tunander, 2022, in press, p. 10)
This is obviously, in the situation, a joke. But a joke with a potentially real depth. Tunander recalls a 1990s break-in at his PRIO office in Oslo where the intruder had opened his computer and searched a particular document as if to indicate somebody was monitoring what he was working on. A researcher may be interpreted by some
observers to be an agent deployed by some counterespionage operation and thus appear to be an active part of the issue itself rather than of its neutral investigation. Furthermore, forces intent on maintaining secrecy may attempt to discredit the investigator’s credibility and academic standing – among other things, by pointing to exactly the lack of adherence to the written-sources standard. In the face of such difficulties, the investigator relies, to a large degree, on abductions in judging the reliability of threats, as a subset of the general check of each informant, applying classical source criticism: Given the accessible knowledge about the informant’s own position, intention, and interests, how should the information given be judged?
Theoretical Motivations for Action
A particular issue is that even quite old complexes of secrecies – the Swedish submarine crisis is now 40 years of age – may continue to have implications for the future. The degree of clandestine Sweden-NATO involvement already in the 1980s, e.g., may play into the current efforts of Sweden and Finland to join NATO after the Russia-Ukraine war broke out in February 2022. Political science, not unlike economics, has proven, time and time again, not to be a very reliable predictor even of the immediate future, despite the existence of large bodies of skilled experts in both fields. Just as very few economists were able to predict the 2008 economic crisis, very few political scientists proved able to predict the outbreak of the 2022 Russia-Ukraine war; many of them even stuck to their interpretation against mounting evidence in the days just before hostilities began. This gives the sobering insight that even very well-informed scholarly and journalistic experts, with no lack of theoretical overview and empirical knowledge, have little choice but to resort to mere abductions in the face of immediate events. This may also illustrate the dependence of secrecy research on theoretical – deductive-diagrammatical – presumptions, maybe implicitly so. Many scholars based their disbelief in an imminent war on rationality assumptions, claiming that by massing large numbers of troops at the border of Ukraine, Putin had already secured as many gains as he could rationally hope for, avoiding the enormous costs and risks of actual warfare. But such an assumption relies on a narrow understanding of rationality as confined to a calculus of immediate gains and losses, primarily of an economic nature. There is little reason, however, to assume that Putin acted irrationally, as some commentators concluded.
Rather, he acted perfectly rationally, just on the basis of a quite different theoretical worldview than the primarily economics-derived one of “rational man” focused on immediate, secure payoffs. If you harbor a vast, geopolitical, mythical conception of the world in which long-term territorial gains and cultural values and influences in those territories are overarching goals for political powers, then Putin’s decisions may appear perfectly rational. The bottom line is that there is a multiplicity of possible diagrammatizable worldviews on which to apply rationality criteria, and in the field of secrecy the investigator has no right to assume a priori that all actors – or informants – are acting by one and the same playbook of rational principles but should keep an open, abductive mind
about very different possible motivations for action in different actors. In Tunander’s case, this means keeping open the idea that different Swedish naval officers of the same generation, even with similar standard education and positions in the very same command-structure organization, may differ considerably in ideology. Thus, naval officers may act rationally on the basis of radically different worldviews, including views on the role of that organization’s connections to the US and other NATO countries.
Probability of a Narrative Versus The Probability of Its Invention
Peirce, in the aforementioned 1901 “On the Logic of Drawing History from Ancient Documents,” made some important observations on how to approach ancient narratives (like the one about the authenticity and destiny of Aristotle’s manuscripts and how they ended up in Rome). Peirce is skeptical of the skepticism among contemporary German scholars vis-a-vis such narratives. More or less directly, they rely upon Hume’s argument of the “balancing of likelihoods,” effectively a quantitative calculus comparing the probabilities of aspects of one hypothesis against the probabilities of counterhypotheses. But historical narratives cannot be so judged, says Peirce, partly because there are no numerical values to be attached to such probabilities and partly because the sources relating them are rarely independent. A further reason is that what ought to be compared is not the truth versus the falsity of a hypothesis, but the more precise issue: What is the qualitative probability that the story is true versus the probability that the source telling it would invent a lie? These observations may, to some degree, be carried over to the investigation of secrecies. Here too, there is a finite number of informants, their independence cannot in all cases be ascertained, and the comparison task remains the same. Is it more probable that secret Western involvements in the Baltic took place in the 1980s than that a number of sources would, for some reason or another, invent such a story and reveal it in small bits and pieces only? Why would Tunander’s initial diver informant invent a story about a submarine wreck?
The unconditional truth of his claim cannot be in any way ascertained, of course, but, by Peirce’s comparison, we lack a possible motive why the diver would be inventing or circulating a false narrative. In a certain sense, this is a methodological application of Peirce’s more general refutation of Cartesian universal doubt: to Peirce, you should apply doubt only when some reason to distrust existing knowledge appears. In the absence of further meetings with the person, of course, a more stable assessment of his trustworthiness and the detail of his claim is impossible. Peirce, however, argues that such stories are probably true more often than not, as long as we have no particular reason to doubt the veracity of the informant. Given the subsequent complete disappearance of the diver, he does not, in any case, seem to have marketed his story in any search for promotion, recognition, fame, attention, vanity, economic gain, or the like. The possibility cannot be excluded, of course, that he was an example of the rare case of a sociopathic liar – or even a Soviet disinformation agent.
Still, in the absence of any signs in that direction, it seemed reasonable to trust the information at least to the abductive degree that a hypothesis explaining it should be subjected to further investigation. Then, of course, subsequent findings (interviews with naval officers, politicians, and others; submarine types; war diaries, etc.) served to add support to the diver’s report, but that could not be known at the point when Tunander accepted his report – not as true but as the basis for his initial hypothetical abduction. Peirce’s idea, however, is not that any old story should be immediately trusted, of course. Background knowledge may counter its probability (as in the case of the repeated ancient report that Pythagoras had a golden thigh), obvious motives of the source to lie (to protect himself, to profit, to gain interest, or many other such purposes) may make it suspect; the source’s history of trustworthiness similarly, and much else. Importantly, the abduction does not simply claim the truth of the resulting hypothetical proposition, merely its worthiness of further investigation.
Conclusion
To sum up, the investigation of secrecies, like any investigation process, proceeds in the Abduction-Deduction-Induction syntax, repeated on many different levels of detail. This comes as little surprise. But in addition to that, abduction seems to play a number of special roles in Deep State investigation. The most obvious place is in the determination of the trustworthiness of sources. Based as it must be, to a large degree, on oral information and personal encounter, the largely abductive, hard-to-formalize trust in witness reports takes center stage, and a constant double movement is necessary: striving to gain the trust of a source and simultaneously being on the abductive lookout for any signs that might betray personal or other motives in the source for not being truthful. The inbuilt uncertainty in this process may be countered by a number of means. First, the initial, exploratory, open-ended, in itself abductive gathering of information deemed vaguely relevant. Then, of course, the deductive investigation of necessary implications of abductively reached hypotheses. Here, as we saw, Peirce emphasizes that the ensuing inductive testing should, if possible, focus upon the most fantastic, least probable of such deduction results because they will, if false, be easiest to disprove. In this case, the deduction that an American-run Italian mini-submarine in the Baltic must have been deliberately ordered there on some mission is indeed a prima facie improbable result of deduction from the abductive hypothesis. Finally, the systematic gathering of as much bitwise, inductive evidence as possible, here particularly the interviewing, judging, and comparing of as many informants as possible, striving simultaneously to make probable the independence of at least some of them, e.g., by emphasizing those in different organizations, countries, and social backgrounds with little probability of personal connections.
Abduction playing a higher role in such research than in many other branches implies, of course, not only its more adventurous nature, but also the lower degree of certainty of its results. As Tunander himself never ceases to insist, there
may be thousands of indices pointing in the overall same direction, forming an abductive pattern – many more than the counterindices pointing in other directions. But in the absence of classic, definitive smoking-gun evidence, that still makes the final conclusion tentative, if probable. This is why credible research of such a nature must be expressed cautiously, painstakingly listing uncertainties and alternative explanations. And this is maybe also why such research easily becomes interminable, in a certain sense a priori unfinishable, mesmerizing even long-term investigators over decades. There always remains a potential witness uninterviewed, a classified document potentially one day declassified, and a yet unknown piece of information lurking in some unexpected location. That is why the investigation of secrecy is probably, by its very ontological nature, the most abductive of all.
References
Gingeras, R. (2019). How the deep state came to America. https://warontherocks.com/2019/02/how-the-deep-state-came-to-america-a-history/
Hirdman, S., Mossberg, M., Olofson, S., & Schori, P. (2019). Korten på bordet i ubådsfrågan! https://www.alliansfriheten.se/korten-pa-bordet-i-ubatsfragan/
Hoffmann, M. (2007). Seeing problems, seeing solutions. Abduction and diagrammatic reasoning in a theory of scientific discovery. In O. Pombo & A. Gerner (Eds.), Abduction and the process of scientific discovery (pp. 213–235). Centro de Filosofia das Ciências da Universidade de Lisboa.
Kronvall, O., Petersson, M., Silva, C., & Skogrand, K. (2000). Comments on Ola Tunander’s article ‘The uneasy imbrication of nation-state and NATO: The case of Sweden’. Cooperation and Conflict, 35(4), 417–429.
Langen, U., & Stjernfelt, F. (2022). The world’s first full press freedom: The radical experiment of Denmark-Norway 1770–1773. De Gruyter.
Nordin, J., Langen, U., & Stjernfelt, F. (2022, in press). Implementing freedom of the press in eighteenth-century Scandinavia: Some perspectives on a surprising lack of transnationalism. In R. Hemstad, J. S. Kaasa, E. Krefting, & A. Nøding (Eds.), Literary citizenship: Tracing the transnational crossroads of books in Norway and beyond, 1519–1850. Woodbridge.
Peirce, C. S. (1931–1958; 1998). Collected papers (Vol. I–VIII). In C. Hartshorne, P. Weiss, & A. W. Burks (Eds.). Referred to as CP; references given by volume and paragraph numbers. Thoemmes Press.
Peirce, C. S. (1998). The essential Peirce (Volume II (1893–1913)). In N. Houser, & C. J. W. Kloesel (Eds.). Referred to as EP II. Indiana University Press.
Scott, P. D. (2017). Trump, the deep state, and the risks of war: Tikkun interviews Peter Dale Scott, Tikkun 2017. https://www.tikkun.org/trump-the-deep-state/
Stjernfelt, F. (2001). Bourbon on the rocks (review of Tunander (2001) and interview with the author). In Weekendavisen, 23 November 2001, Copenhagen.
Stjernfelt, F. (2007). Diagrammatology. Investigations on the borderlines of phenomenology, ontology, and semiotics. Springer.
Stjernfelt, F. (2014). Natural propositions: The actuality of Peirce’s doctrine of dicisigns. Docent Press.
Stjernfelt, F. (2022). Sheets, diagrams, and realism in Peirce. De Gruyter.
Tunander, O. (2000). A criticism of court chroniclers: A response from Tunander. Cooperation and Conflict, 35(4), 431–440.
Tunander, O. (2001). Hårsfjärden. Det hemliga ubådskriget mot Sverige. Norstedts Förlag.
Tunander, O. (2004). The secret war against Sweden: US and British submarine deception in the 1980s (Naval policy and history, Band 21). Routledge.
Tunander, O. (2009). Democratic state vs deep state: Approaching the dual state of the west. In E. Wilson (Ed.), Government of the shadows: Parapolitics and criminal sovereignty (pp. 56–72). Pluto Press.
Tunander, O. (2019). Det svenska ubåtskriget. Medströms Bokförlag.
Tunander, O. (2022, in press). 40 år siden “U-2-episoden under vann”.
Wilson, E. (2012). The dual state: Parapolitics, Carl Schmitt, and the National Security Complex. Routledge.
Abductive Ruses: The Role of Conjectures in the Epistemology of Deception from High-Level, Reflective Cases to Low-Level, Perceptual Ones
67
Francesco Fanti Rovetta
Contents
Introduction
Peircean Elaboration of Abduction
Abductive Reasoning in a Naturalized Epistemology
Hypotheses and Decision-Making
Explanatory Coherence and Deception
The Eco-cognitive Model of Abduction
Abductive Inference in Exploratory Cognition
Regaining Environmental Attunement Through Skilled Sensorimotor Conjectures
Abductive Ruse and Sensorimotor Hypotheses in Adversarial Interactions
Conclusions
References
Abstract
Deception is a complex phenomenon which has been investigated from various perspectives in different disciplines. From an epistemological standpoint, it is undisputed that the abductive inferential processes of the deceived play a role in some cases of deception. So far, the literature on the epistemology of deception has only considered cases involving representational and propositional hypotheses. In recent times, various scholars have proposed a pluralistic ontology of hypotheses: they need not be propositional and representational mental entities, but can also take the form of low-level, sensorimotor conjectures. If these scholars are right, the epistemological accounts of deception elaborated so far, by not considering instances of deception involving sensorimotor conjectures, have had too narrow a focus. In this chapter, previous works on the relation of
F. Fanti Rovetta () Research Training Group ‘Situated Cognition’, Osnabrück University, Osnabrück, Germany e-mail: [email protected] © Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_31
deception and abductive reasoning are presented, and contemporary proposals for pluralistic approaches to the ontology of hypotheses are examined. Building on these approaches, cases of sensorimotor conjectures in deception are considered, and the paradigmatic example of “feints” in sport performances is discussed. Finally, the concept of abductive ruse is presented to denote the various kinds of epistemological dynamics involved in deception, independently of whether the kinds of hypotheses generated are representational or sensorimotor.
Keywords
Abductive ruse · Sensorimotor conjectures · Epistemology of deception
Introduction
With regard to abductive or hypothetical reasoning, the question usually asked is how it is possible that it ever leads to plausible or even correct conjectures, considering the innumerable, logically possible alternatives. This manuscript focuses instead on an overlooked dimension of abductive reasoning, the dark side of abduction: when it leads to errors and how other agents’ abductive inference can be intentionally manipulated to deceive. In short, here abduction will be considered not – following Aristotle – as leading-away (the literal translation of the Greek term apagoghé, ἀπαγωγή; Magnani, 2021) but as leading-astray. In doing so, I will expand Thagard’s work on the role of abductive reasoning in deceptive adversarial problem-solving by adopting a pluralistic approach to the ontology of hypotheses, focusing particularly on non-representational hypotheses involved in deceptive dynamics. The aim of this paper is to explore the idea that abductive inferences and their manipulation play a central role in deception (Thagard, 1992). For example, hypotheses can be suggested by aptly modifying the environment so as to lead an adversary to the formulation of a misleading hypothesis. This happens at a large scale in military deception and, as I set out to argue, at a smaller scale in some cases of sensorimotor interaction between various agents. While the role of hypotheses in deception is well recognized, pluralistic approaches in the contemporary literature on abduction (Magnani, 2008, 2017; Menary, 2016) have elaborated the concept of non-representational hypothesis, which has so far been completely ignored in the epistemology of deception. Introducing this notion will extend the reach of the epistemology of deception far beyond its usual boundaries to include instances involving sensorimotor conjectures.
More specifically, I present the argument that by directing others’ hypothesis generation processes through the modification of external cues, it is possible to lead others astray and to gain a competitive advantage over an adversary in adversarial dynamics, such as in games, business, or war. In order to support this claim, I contextualize my proposal within the broader frameworks of a naturalistic epistemology and situated cognition. The former will serve to put forward
a realistic picture of the reasoner’s mental capacity and limited resources, crucially exploited in deception, whereas the latter will be needed to anchor abduction in the interaction between the agent and its environment. Both these frameworks will be needed to argue that hypothesis generation and selection do not necessarily result in hypotheses instantiated in propositional mental entities but can take the form of sensorimotor conjectures. From this it follows that there are at least two epistemological accounts of deception, depending on whether representational or sensorimotor hypotheses are involved. Whereas the former account has been discussed in some depth in previous works, chiefly by Thagard (1992), cases of deception involving sensorimotor conjectures have been completely overlooked. The present contribution fills this gap. The paper will unfold as follows: in the first part of the manuscript, the original elaboration of abductive inference by Peirce will be briefly described, focusing particularly on some of its characteristics, such as defeasibility, non-monotonicity, and situatedness. Alongside this, the focus will be on naturalistic approaches to epistemology and logic, on situated views in cognitive science, and on how these traits of abductive reasoning have been considered in contemporary discussion, with particular attention to the program for the naturalization of logic and epistemology championed by John Woods. Following this, the chapter will discuss how, in situations of epistemic ignorance, hypothesis generation and selection are part of the broader architecture of the decision-making process. Linking the making of conjectures with decision-making will be crucial to understanding the role they play in instances of deception. Indeed, the ultimate goal of deception is to lead the adversary to commit to an unfavorable course of action. It will be shown that abductive inference precedes the option-generation process, which in turn is the basis of decision-making.
Since understanding the dynamics of deception is a particularly pressing and high-stakes matter in military settings, it is not surprising that an insightful debate on deception, and its relation with hypothesis generation and selection, is to be found in the literature from the community of intelligence analysts and defense intellectuals. Therefore, a section will be devoted to discussing the insights from military strategists and intelligence analysts. The first part of this paper, where representational hypotheses and their relationship with deception are discussed, will culminate with the discussion of Thagard’s (1992) paper. There, Thagard expounds on the role of explanatory coherence and hypothesis generation and selection in adversarial problem-solving dynamics, which include instances of deception. The second part of the chapter opens with the introduction of contemporary discussions of pluralistic approaches to the ontology of hypotheses and arguments for non-representational hypotheses. More specifically, the focus will be on two contemporary frameworks that argue for a pluralistic approach. These two frameworks are Magnani’s eco-cognitive approach to abductive reasoning, and more specifically the notion of manipulative abduction, and Menary’s concept of implicit hypothesis or sensorimotor conjecture in exploratory cognition. By building on these two models, I argue that there are instances of deception which do not involve representational hypotheses, but rather involve non-representational, sensorimotor
conjectures. I introduce and discuss some examples of sensorimotor conjectures in deception, such as feints in various sports. Finally, I propose the concept of abductive ruse to characterize deceptive dynamics based on the manipulation of abductive inferences, irrespective of whether they result in representational or non-representational hypotheses.
Peircean Elaboration of Abduction
Consider the following case. After a pleasant night with some friends, you head back home. When you arrive at your destination, you immediately notice that something is out of place; more specifically, you observe muddy footprints that lead from the garden through the porch and disappear at the base of your dining room window. Seeing that the window is not closed as you left it, you are now sure: a burglar must have entered your house. You arrived at the conclusion by means of an abductive inference. You have generated the hypothesis of a burglary in order to explain the unexpected situation you have stumbled upon. Abductive reasoning is prompted by the uncertainty of the situation, and as a result it produces one or more hypotheses. Although abductive reasoning cuts across many activities, some clear instances in which it is involved include legal proceedings, medical diagnosis, criminal investigation, and scientific research. Moreover, assuming that to explain is, often, to put forward a causal account, hypothesis generation is also framed as the first step toward identifying unknown causal relations (Thagard, 2007). When discussing the cognitive ability to generate hypotheses (also known as abductive reasoning), one cannot escape referring to Charles Sanders Peirce, who devoted much of his intellectual energy to clarifying this enigmatic phenomenon. Besides the analytic, truth-preserving kind of inference, i.e., deduction, and the already theorized synthetic (ampliative and non-truth-preserving) form of reasoning, i.e., induction, Peirce identified a third form of reasoning, also synthetic, which he called abduction (or sometimes retroduction). Peirce’s theory of abduction is often divided into two periods (Anderson, 1986; Fann, 1970). The first formulation of Peirce’s theory of abduction is in syllogistic terms, which he would later dismiss as unsuitable for defining how abduction works.
Peirce’s concerns with the syllogistic formulation (which is equivalent to the fallacy of affirming the consequent) were rooted in his recognition of the deeply context-sensitive and creative nature of abduction, which was not captured by the propositional formulation of the early years. Peirce was aware that abductive reasoning is based on the recognition of salient environmental cues and that an externalist account of cognitive processing is more adequate to explain it. Contrary to deductive and inductive reasoning, it cannot be abstracted from the pragmatic context in which it is performed. A way to formalize Peirce’s early understanding of abduction as well as the fallacy of affirming the consequent is as follows:
1. A → B
2. B
---------
3. A
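That the schema of affirming the consequent (from A → B and B, infer A) is deductively invalid can be checked mechanically. The following is a minimal sketch, not part of Peirce's text: it enumerates every truth assignment for A and B and collects those in which both premises hold while the conclusion fails. The function names `implies` and `counterexamples` are illustrative assumptions.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def counterexamples() -> list:
    """Truth assignments where both premises hold but the conclusion fails."""
    found = []
    for a, b in product([True, False], repeat=2):
        premises_hold = implies(a, b) and b  # premise 1: A -> B; premise 2: B
        conclusion_holds = a                 # conclusion 3: A
        if premises_hold and not conclusion_holds:
            found.append({"A": a, "B": b})
    return found


print(counterexamples())  # -> [{'A': False, 'B': True}]
```

The single counterexample (A false, B true) is what makes the schema invalid as a deduction. It says nothing about whether A is a plausible conjecture in a given context, which is the question abduction addresses.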
The later formulation, presented by Peirce in the 1903 Harvard Lectures on Pragmatism, was in natural language and included psychological elements, such as doubt, suspicion, and matter-of-courseness. The formulation is as follows:
The surprising fact, C, is observed;
But if A were true, C would be a matter of course;
Hence, there is reason to suspect that A is true. (C.P. 5.189)
In the new formulation, surprise, matter-of-courseness, and suspicion are central to understanding abduction. A feeling of surprise is involved insofar as the observed fact, which prompts the abductive inference, cannot be accommodated within the known facts: something has to be added or revised. There are many ways in which what we know about the world could be changed to accommodate the newly observed fact. Among these many ways, a plausible one is selected. The revision of the background knowledge is such that the surprising fact is not surprising anymore, and for this reason the revision is provisionally accepted. Compared to the previous formulation, the new one considers the perspective and psychological states of a subject performing the abduction as well as the contextual cues the agent is able to pick up in producing the hypothesis. In this sense, unlike deduction, there is no context-free abduction. Moreover, in the Peircean elaboration of abductive reasoning, the environmental cues which lead to the generation of hypotheses are not necessarily consciously perceived; thus we may be unaware why we have reached a certain abductive conclusion. In fact, according to Peirce, abductive reasoning is akin to a “guessing-instinct” (C.P. 7.220). The term “instinct” is part of a strategy put in place by Peirce to explain why, by generating hypotheses, we come closer to faithfully understanding the situation than by performing random guesses (Paavola, 2005: 5).
Once formulated and operationalized, the purpose of the hypothesis is to be as stable as possible and “to lead to the avoidance of all surprise and to the establishment of a habit of positive expectation that shall not be disappointed” (C.P. 5.197). An important point in Peirce’s formulation of abduction is that it is not restricted to offline and primarily theoretical cases of hypothesis generation, such as in scientific practice. Grounding his reasoning on situations presenting ambiguous perceptual stimuli, he argues that a subconscious hypothesis generation process is taking place even in perception. “Abductive inference” he writes “shades into perceptual judgments without any sharp line of demarcation between them” (C.P. 5.181).
F. Fanti Rovetta
It has been noted that both formulations provided by Peirce share one problematic aspect: they are not able to represent the creative element that characterizes some instances of abductive reasoning. The hypothesis A is already present in the major premise and as such is formalized as known, not as creatively generated in the conclusion (Frankfurt, 1958). This may be appropriate for those cases in which a hypothesis is selected from background knowledge – also known as selective abduction (Magnani, 2009) – but not when it is creatively generated or invented. Ultimately, Peirce was not satisfied with the new formulation of hypothetical reasoning either, and he never reached a definitive stance on the matter. The idea that abductive inference may have a creative element, typically absent in other forms of inference, is already present in Peirce’s writings (C.P. 5.172). In line with this view, contemporary discussions separate abduction into two phases: a preliminary phase of hypothesis generation, in which the creative element is involved, followed by a phase of hypothesis selection (Mohammadian, 2019). Hypothesis selection can receive a Bayesian interpretation, and it overlaps with inference to the best explanation. Hypothesis generation, by contrast, is much more difficult to interpret in formal terms. In an attempt to distinguish hypothesis generation from hypothesis evaluation/selection in formal terms, Bonawitz and Griffiths (2010) have defined the former from a Bayesian perspective as a first restriction of the hypothesis space, so that the subsequent evaluation or selection is interpreted as a search operated on this preliminarily restricted hypothesis space.
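The two-phase picture just described can be illustrated with a small numerical sketch. It is an illustration only, not Bonawitz and Griffiths’ model: the hypothesis names, priors, and likelihoods are invented, and the generation phase is approximated by keeping the k hypotheses with the highest prior, which is merely one assumption about how the hypothesis space might be restricted.

```python
def generate(priors, k):
    """Generation phase (assumed rule): keep the k hypotheses with the highest prior."""
    ranked = sorted(priors, key=priors.get, reverse=True)
    return ranked[:k]


def select(candidates, priors, likelihoods):
    """Selection phase: Bayesian posterior computed over the restricted space only."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in candidates}
    total = sum(unnormalized.values())
    posterior = {h: value / total for h, value in unnormalized.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Invented example: the surprising fact is a wet lawn under a clear sky.
priors = {"rain": 0.5, "sprinkler": 0.3, "burst_pipe": 0.15, "prank": 0.05}
# Invented values for P(wet lawn under clear sky | hypothesis).
likelihoods = {"rain": 0.2, "sprinkler": 0.9, "burst_pipe": 0.8, "prank": 0.99}

candidates = generate(priors, k=2)          # restricted hypothesis space
best, posterior = select(candidates, priors, likelihoods)
print(candidates, best, posterior)
```

In this toy case the hypothesis with the highest likelihood ("prank") is never evaluated, because it does not survive the restriction step. Selection can only be as good as the space that generation hands over to it.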
Abductive Reasoning in a Naturalized Epistemology
Abduction has proved particularly challenging to those intending to provide a set of characteristics that make it – at times – successful. In fact, from a pure deductivist standpoint, i.e., by upholding deductive validity as the gold standard for correct reasoning, abduction is a deductively invalid form of inference. The truth of the premises does not guarantee the truth of the conclusion. The same is true if we evaluate abductive inference through the standard of inductive strength (i.e., the probability of the conclusion being true given that the premises are true). A shared traditional view in epistemology is that validity and inductive strength are the criteria to measure whether an inference is acceptable. In recent times, both the normative claim and the descriptive claim regarding deductive and inductive reasoning have come under fire. Such criticisms can be found in the works of proponents of the Bias and Heuristic framework, the Bounded Rationality model, and the Ecological Rationality model, as well as in the program of a naturalistic turn in epistemology and logic, championed by John Woods. According to Woods (2013), we do a disservice to our reasoning skills by holding them to such demanding standards. Woods’ point is that rather than condemning abduction and other alleged fallacies as epistemic errors because they fail to satisfy the standards imposed by logicians or Bayesian statisticians, we should look for standards
67 Abductive Ruses: The Role of Conjectures in the Epistemology of Deception. . .
that are sensitive to the reasoning agent’s limited resources (time, information, and computational capacity) as well as to the reasoning outcome in relation to the epistemic target. In real scenarios pure epistemic standards are often compromised in favor of contextual considerations and practical interests (Kompa, 2021). In this perspective uncertainty and (practical) doubt are a luxury an agent cannot afford for too long, and abduction is a quick way to produce provisional and conjectural understanding out of incomplete information. Woods and Gabbay have proposed the GW Schema (Gabbay & Woods, 2005: 336–342) to supplement the previously proposed schema of abduction called AKM. A crucial distinction introduced by the GW Schema is that between complete and partial abduction (steps 10 and 11 of the GW Schema, respectively). The difference is that complete abductions are acted on: once formulated, they come to constitute a positive basis for action, without the need for further confirmation. In the case of partial abduction, on the contrary, the formulation of the hypothesis does not provide the level of confidence needed to base our actions on it. A special instance of partial abduction is typically to be found in natural science: upon formulating a hypothesis, scientists cannot act on it as if its truth were taken for granted; rather, the hypothesis needs to be tested. It is important to note that the activation of abduction, i.e., taking the hypothesis as the basis for action, does not cover those cases in which the actions are aimed at corroborating the hypothesis H. The step from partial to complete abduction implies that the hypothesis abandons its conjectural status, though not its defeasibility, and is taken as true. 
Whereas “epistemologists of risk-averse bent” may consider partial abduction rather than complete abduction as the only serious version of the two, Woods adds, “there are real life contexts of reasoning in which such conservatism is given short shrift, in fact is ignored altogether” (Woods, 2013: 371). Such is the case in many instances of practical reasoning in which the possibility of testing a hypothesis is not given. In these cases, abduction is a form of practical reasoning which: Provides the benefits of action without the cost of knowledge. It provides this guidance not on the basis of supporting evidence but in the absence of it. It provides the agent with an answer not to an epistemic question but rather a prudential one. It tells the agent that it is worth the risk of putting the conjectured proposition to provisional inferential use, and of acting in ways consonant with its provisionality. It is an action-guiding practice of immense and indispensable economy. It is a scant-resource adjustment strategy. (Woods, 2013: 371)
In sum, the value of abductive inference lies not in overcoming once and for all the irritating doubt which prompted the generation of the hypothesis – Woods deems abductive reasoning to be ignorance-preserving – but rather in providing a provisional basis for action. After all, as already argued by Peirce, abduction is “nothing but guessing” (C.P. 7.219). Thus, hypothesis generation is crucial in the decision-making process.
Hypotheses and Decision-Making
The theoretical background so far has been useful to identify abduction as a defeasible, open-ended (i.e., non-monotonic), and situated inferential process. These are all elements that will contribute to describing the functioning of abductive ruses, discussed in the last section. Moreover, the description of abduction put forward so far, by stressing the practical nature of abductive inferences, already suggests that hypothesis generation and selection are a first, crucial component of the decision-making process. As discussed above, although the epistemic status of the hypothesis, before further testing, is frail, many situations do not afford the luxury of being amenable to scientific corroboration processes. Thus, oftentimes, in situations of epistemic ignorance it is necessary to elaborate a hypothesis that, at least temporarily, suppresses the ignorance and provides possible courses of action, even though there are no resources for further testing. In these cases, only once the situation has been conjecturally reconstructed via an abductive inference does it become possible to engage in option generation, which is fundamental to decision-making (Kaiser et al., 2013). The role of option generation, in turn, is to elaborate various concrete possibilities for action, which are then fed into the decisional process. This process can be made clear by referring to medical settings (Stanley & Nyrup, 2020): only once the physician has settled on a certain diagnosis among the various possible explanations for the detected symptomatology (hypothesis generation and selection) are several treatments considered (option generation) and one chosen as the cure (end of the decision-making process). The theoretical connection between deception and hypothesis generation is crucially dependent on the role that generating and selecting a hypothesis plays in guiding the subsequent step of option generation. 
Deception is not realized in changing the beliefs of the deceived; rather, it is realized in leading the adversary to act as the deceiver wishes. Since adopting different hypotheses leads to conceiving different sets of options to pursue, the manipulation of the adversary’s inferential processes of hypothesis generation and selection ultimately steers the result of the decision-making process in a certain direction. For the same reason, the fewer options afforded by a given hypothesis, the better it is for the deceiver in terms of predictability of the opponent’s behavior. The conceptual distinction between hypothesis generation and selection, on the one hand, and decision-making, on the other, is perfectly clear in the intelligence community, in which the functional difference is mirrored by different roles: analysts are involved in hypothesis generation and selection based on collected data, and decision-makers, typically politicians or high-ranking military officers, in decision-making (Fanti Rovetta, 2020; Heuer, 1999). In this context, Richard Heuer (1999) elaborates a quasi-formal eight-step method called Analysis of Competing Hypotheses (ACH), involving the construction of a matrix to help intelligence analysts select hypotheses. The intention behind Heuer’s work in the psychology of intelligence analysis is that of making the work of intelligence analysts less dependent on intuitions, less subject to biases and open
to external scrutiny. At the end of the process, the winning hypotheses are then communicated to politicians and military strategists, who use them as the basis for the decision-making process. For the present purpose, various factors make intelligence analysis particularly fruitful to take into consideration when elaborating an epistemological framework for deception. One is that intelligence analysts work with many of the constraints that are also present in real-life situations (limited time, cognitive resources, and available information) and, given that they are often dealing with high-stakes issues, there is a high incentive for the intelligence community to do the best that is humanly possible. This makes a perfect mix to assess what in reasoning, and more specifically in hypothesis generation and selection, is feasible and achievable – with sophisticated methods – in concrete situations, beyond philosophers’ idealizations. Concretely, intelligence analysts need to strike a balance between “methods that are rigorous but technically daunting (e.g., Bayesian networks) and unbridled intuition” (Chang et al., 2017: 2). Another factor is that the work of intelligence analysts typically happens within adversarial dynamics, meaning that the possibility of collecting intentionally misleading data and of being deceived is always there. Heuer takes this problem into consideration. He distinguishes unproven from disproved hypotheses: the former have no evidential support, whereas the latter have been negated by evidence. Based on this distinction, he observes the curious epistemic status of the hypothesis that one is being deceived (D-hypothesis from now on):
One example of a hypothesis that often falls into this unproven but not disproved category is the hypothesis that an opponent is trying to deceive us. You may reject the possibility of denial and deception because you see no evidence of it, but rejection is not justified under these circumstances. 
If deception is planned well and properly implemented, one should not expect to find evidence of it readily at hand. The possibility should not be rejected until it is disproved, or, at least, until after a systematic search for evidence has been made and none has been found. (Heuer, 1999: 98)
Unlike the case of any other hypothesis, the absence of evidence does not mean that the D-hypothesis is less plausible, since a good deception does not leave traces and goes undetected. Moreover, if the D-hypothesis is true, other hypotheses are false; thus we can contrast it to any other. Conversely, the success of a deception consists exactly in leading the adversary to formulate and choose other hypotheses, possibly a pre-established one. Deception succeeds when the evidence is crafted and exposed in such a way that it suggests other hypotheses, while hiding itself. Thus, in order to detect well-executed deceptions, one ought to proceed counterinductively: whereas the lack of evidential support normally makes a hypothesis less plausible, in the case of the D-hypothesis the opposite may be true. Moreover, intelligence and defense analysts have recognized the central role of deceptive techniques in modern warfare since the 1950s and 1960s and have attempted to further the understanding of such techniques, such that considerations of the psychology and epistemology of deception can also be found in the literature on military strategy. According to Barton Whaley’s Stratagem: Deception and Surprise in War (2007), the present state of contemporary warfare techniques is in
part due to the development of the intuition of military theorists stemming from the human catastrophe of WWI. In particular, Whaley refers to Liddell Hart’s theory of the “indirect approach” (1967) as the milestone that influenced military strategists and commanders. According to Liddell Hart, the principle of the indirect approach in warfare is that the best course of action is not to seek direct engagement, but rather to avoid direct confrontation while at the same time actively pursuing the weakening of the enemy through other means. Only after having exhausted these possibilities, if needed, would a direct confrontation ensue. For obvious reasons, the adoption of the indirect approach has revived the interest in deceptive techniques among military strategists. Barton Whaley, who has been described by A. D. Clift, president emeritus of the National Intelligence University, as “the undisputed dean of U.S. denial and deception experts” (Whaley, 2016), analyzes and presents some elements necessary for the success of deception. In order for the deception to be successful, according to Whaley, the enemy will:
• Take notice, if the effect is designed to attract his ATTENTION;
• find it relevant, if the effect can hold his INTEREST:
• form the intended hypothesis about its meaning, if the project pattern of characteristics is CONGRUENT with patterns already part of his experience and memory; and,
• fail to detect the deception, if none of the ever-present characteristics that are INCONGRUENT are accessible to his sensors. (Whaley, 2016: 15)
The characterization provided by Whaley, while intended for military settings, can be generalized. In order to suggest how to organize a successful deception, it takes into consideration the adversary’s psychological factors (attention, relevance, hypothesis generation, and congruency with background knowledge), which are to be found at work also in other contexts. 
Intelligence analysts and military strategists provide interesting insights regarding the relation between hypothesis generation, decision-making, and deception, since they need to create a theoretical framework that can withstand the test of reality. Their considerations can be extended beyond the limited scope of warfare. In order to generalize these observations, in the last section I introduce the concept of “abductive ruse” and present some examples thereof. But before doing that, in the next section I introduce another crucial element for any epistemology of deception: the creation of a model of the opponent’s beliefs, plans, and goals.
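The peculiar status of the D-hypothesis within a matrix of the kind Heuer recommends can be pictured with a toy example. This is a simplified sketch and not the full eight-step ACH procedure: the evidence items, the hypotheses, and the ratings are all invented, and counting inconsistent items per hypothesis is a simplification of how such a matrix is read. The deception hypothesis is rated consistent with every item, on the assumption that a well-executed deception leaves no contrary evidence.

```python
# Rows are evidence items, columns are hypotheses (all invented for illustration).
# Ratings: "C" = consistent with the hypothesis, "I" = inconsistent with it.
matrix = {
    "troop buildup near border": {"H1_attack": "C", "H2_exercise": "C", "D_deception": "C"},
    "no logistics convoys seen": {"H1_attack": "I", "H2_exercise": "C", "D_deception": "C"},
    "leaders deny hostile intent": {"H1_attack": "I", "H2_exercise": "C", "D_deception": "C"},
}


def inconsistency_counts(matrix):
    """Count, for each hypothesis, how many evidence items are inconsistent with it."""
    counts = {}
    for ratings in matrix.values():
        for hypothesis, rating in ratings.items():
            counts[hypothesis] = counts.get(hypothesis, 0) + int(rating == "I")
    return counts


counts = inconsistency_counts(matrix)
# Hypotheses that no evidence item speaks against: unproven, perhaps, but not disproved.
not_disproved = [h for h, n in counts.items() if n == 0]
print(counts, not_disproved)
```

Under these invented ratings the D-hypothesis can never be eliminated by inconsistency alone. This is one way of seeing why Heuer asks for a systematic search for evidence of deception before the possibility is set aside.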
Explanatory Coherence and Deception
Whether hypothesis generation and evaluation are in the business of offering explanation has been a debated issue in the related literature (Park, 2015). No position will be supported in this regard here. What is less debatable is that the abductive reasoning involved in the kind of deceptions discussed in this section is a
matter of explanatory hypotheses. Framing deception in terms of the manipulation of the adversary’s hypothesis generation and selection processes entails the assumption that the hypothesis adopted explains a certain situation. As discussed in the previous section, the role of abductive reasoning is to suppress doubt – if only momentarily – thus providing the basis from which to start the option generation process, which, in turn, culminates in the decision to act in a certain way. Besides playing an explanatory role, the hypothesis generated and selected needs to be harmonized with the rest of an agent’s beliefs. Indeed, it is plausible to assume that in most cases the amount of revision needed to adopt a certain hypothesis is inversely correlated with the ease of accepting it as true. Given this background, assuming that any rational agent strives to maintain a set of coherent beliefs, modeling the opponent’s beliefs, plans, and goals becomes central not only to predicting their actions but also to preparing a successful deception. This idea has been developed by Thagard (1992), who discusses the role of modeling the opponents in adversarial problem-solving – such as that taking place in warfare, business negotiation, and some types of games – by exploiting explanatory coherence. More specifically, according to Thagard, modeling an opponent is necessary in order to predict and understand their actions. How the modeling takes place is described by means of seven principles:
1. Construct a model of the opponent, O, involving O’s situation, past behavior, general goals, value scale, degree of competitiveness, and attitude toward risk.
2. Make sure that your model of O includes O’s model of you, because O’s responses to your actions will depend in part on how your actions are interpreted.
3. Use this model to infer O’s plans and add the inferred plans to the model.
4. Use this enhanced model to infer O’s likely actions and likely response to your actions.
5. Combine your model of yourself, O, and the environment to make a decision about the best course of action.
6. In particular, use your model of O to predict possible effective actions that O might not expect, and that, therefore, would be more effective because of the element of surprise.
7. Take steps to conceal your plans from O and to deceive the opponent about your plans. (1992: 130)
Having specified these principles, Thagard goes on to describe the cognitive mechanisms underlying adversarial problem-solving as inferences based on representations. These representations involve three elements: propositions, rules, and analogs. Propositions store information regarding the situation and the opponent; rules serve to predict the unfolding of the situation and the behavior of the opponent; analogs are informative regarding relevant past cases, both involving O and involving other agents in similar situations. Modeling the opponent may present some unforeseen challenges: “Often, O will perform some unexpected action and it will be crucial for P to explain that action. Typically, explanation will require the formation of hypotheses about the goals and
plans of O, and these hypotheses will feed crucially into rule-based predictions of what O will do” (1992: 133). This means that in instances of unpredicted behaviors from the opponent, the model will have to be updated with new hypotheses to make sense of the anomaly and to integrate it with previously known facts. In doing so, explanatory coherence needs to be maximized, i.e., the hypothesis that coheres the most with available evidence ought to be chosen. The degree of coherence is measured by parsimony: “The more hypotheses it takes to explain something, the less the degree of coherence” (1992: 136). Thagard discusses at length a couple of cases of adversarial problem-solving which can be aptly described within the framework elaborated so far. One of these is centered around the 1988 USS Vincennes’ shooting of an Iranian commercial aircraft, mistakenly assumed to be a hostile F-14 of the Iranian air force, in the Persian Gulf. Captain Will Rogers III, in charge of the USS Vincennes, had to make a decision on how to act in response to the incoming flight. The main competing hypotheses on the basis of which to proceed with the decision-making process were that the aircraft was either (a) a commercial flight or (b) an F-14 of the Iranian air force. In the latter case, it could have been a serious threat to the American vessel. The information available, according to the subsequent investigation (Fogarty, 1988), was this: the aircraft was not in the airspace designated for commercial flights; it was turning towards the USS Vincennes; it did not respond to verbal warnings; it was – erroneously – perceived as descending. Captain Rogers chose the hypothesis of a hostile, approaching F-14 and shot it down, killing 290 persons and creating an international incident. Assuming that the report of the incident is reliable, according to Thagard the captain chose the hypothesis which was more coherent with the evidence available. 
In fact, the hypothesis of a hostile act explained the available evidence, whereas the hypothesis of a commercial flight would have to be supplemented with ad hoc hypotheses – e.g., that the commercial airplane had technical problems with the radar and had lost its way – in order to explain the evidence. Another case narrated by Thagard regards the invasion of Normandy. The case is quite well-known. Allied forces were to decide on how to best approach the invasion of France. Being an instance of adversarial problem-solving, the decision-making process involved modeling Hitler’s model of the Allies. They knew that the most plausible landing site was Pas de Calais, for various reasons, including its better geographical position from a strategic perspective. Moreover, they increased the plausibility of the landing at Calais in Hitler’s eyes by every possible means: they increased the bombing of Calais to weaken the defenses, sent false reports to German High Command through double agents, simulated radio communications, and displayed fake army camps, including inflatable tanks and airplanes, on the English coast close to Calais. In the framework proposed by Thagard, all these measures were adopted to increase the degree of explanatory coherence of the hypothesis of the landing in Calais while diminishing that of the hypothesis of the landing in Normandy. According to this view then, deception is achieved by creating and displaying false evidence. Besides being false, the various pieces of evidence must be coherent with each other and with the deceiving hypothesis. The result is
that the evidence collected by opponents leads them to naturally assume the deceiving hypothesis as intuitively true. This example gives us the opportunity to discuss, in relation to a concrete scenario, why it is reasonable to proceed counterinductively in instances of deception. To proceed counterinductively means to generate and use “hypotheses that contradict well-confirmed theories and/or well-established experimental results” (Feyerabend, 1993: 5). In the context of Thagard’s proposal, to proceed counterinductively means choosing the hypothesis that has a lower explanatory coherence compared to the alternatives. The reason to adopt this strategy when one suspects that one is a victim of deception is clear: if the evidence collected is purposely crafted to hide real intentions and divert the decision-making process, then generating a hypothesis which is in contradiction with the evidence is the correct way to proceed. In the case of the Normandy landings, if Hitler had proceeded counterinductively, all the evidence – purposely created by the Allies to deceive him and his generals – would have suggested to him that the landing operations were going to take place in Normandy. Of course, this is easier said than done: on the one hand, counterinduction is highly counterintuitive, being a kind of reasoning which yields positive results in a very limited set of circumstances; on the other, adopting a counterinductive procedure when it is not a case of deception can lead to catastrophic outcomes. Moreover, in those cases in which there are more than two competing hypotheses, the counterinductive procedure leaves underspecified which hypothesis to choose among the many in contrast with the collected data. So far, we have described abductive reasoning in cases of deception as resulting mainly in linguistic artifacts, i.e., as hypothetical propositions. Is this necessarily the case? Are hypotheses linguistic artifacts, are they necessarily in a propositional format? 
In the next two sections I present two perspectives which suggest a negative answer: Magnani’s eco-cognitive model of abductive reasoning and Menary’s account of exploratory cognition.
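The parsimony criterion and the counterinductive reversal discussed in this section can be made concrete with a crude counting sketch based on the Vincennes case. It is not Thagard’s own computational model: coherence is approximated here simply by the number of hypotheses needed to cover the evidence, and the auxiliary hypotheses attached to the commercial-flight reading are partly invented for illustration.

```python
EVIDENCE = {
    "outside commercial corridor",
    "turning toward ship",
    "no reply to warnings",
    "perceived as descending",
}

# Each hypothesis lists what it explains directly and which ad hoc auxiliary
# hypotheses it needs for the rest (the auxiliaries are illustrative assumptions).
HYPOTHESES = {
    "hostile F-14": {"explains": set(EVIDENCE), "auxiliary": {}},
    "commercial flight": {
        "explains": set(),
        "auxiliary": {
            "lost its way": {"outside commercial corridor", "turning toward ship"},
            "technical fault": {"no reply to warnings"},
            "sensor misreading": {"perceived as descending"},
        },
    },
}


def hypotheses_needed(name):
    """Number of hypotheses (main plus auxiliaries) used to cover all the evidence."""
    entry = HYPOTHESES[name]
    covered = set(entry["explains"])
    used = 1  # the main hypothesis itself
    for explained in entry["auxiliary"].values():
        if not explained <= covered:  # the auxiliary adds something new
            covered |= explained
            used += 1
    assert covered >= EVIDENCE, "hypothesis leaves some evidence unexplained"
    return used


# Parsimony: fewer hypotheses needed means higher coherence.
most_coherent = min(HYPOTHESES, key=hypotheses_needed)
# Counterinduction: deliberately pick the least coherent reading instead.
counterinductive = max(HYPOTHESES, key=hypotheses_needed)
print(most_coherent, counterinductive)
```

With only two candidates the counterinductive choice is well defined. With three or more, taking the maximum is just one arbitrary way of resolving the underspecification noted above.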
The Eco-cognitive Model of Abduction
A pluralistic view of hypotheses is shared by several contemporary authors working on the conceptualization of abductive reasoning (Magnani, 2009, 2017; Menary, 2016). In a recent review (Flórez Restrepo, 2021) more than 30 different types of abductive inferences have been identified. At the forefront of this effort, Lorenzo Magnani (2009) has proposed the distinctions between (1) model-based/sentential, (2) manipulative/theoretical, and (3) creative/selective abduction. The first distinction refers to the kind of objects used in inferential processes. Sentential abduction is based on and results in propositions. Such is the case of the Peircean formulation above. By contrast, model-based abduction is achieved through the construction and manipulation of various kinds of (non-sentential) representations, following Nersessian’s notion of model-based reasoning (Nersessian, 1995; Magnani, 2008). The second distinction, between manipulative and theoretical abductions, refers to the way in which hypotheses are generated and
selected: manipulative abductions result from the active and dynamic interaction of the cognitive agent with the environment. In the case of manipulative abductions, the generation of a hypothesis is facilitated by exploring the environment and using external resources, for example, by tweaking the variables in a certain model. According to Magnani’s own definition: Manipulative abduction is a specific case of cognitive manipulating in which an agent, when faced with an external situation from which it is hard or impossible to extract new meaningful features of an object, selects or creates an action that structures the environment in such a way that it gives information which would be otherwise unavailable, and which is used specifically to infer explanatory hypotheses. (2008: 54)
Consider, for example, the case of diagrammatic reasoning, in which the inferential process is scaffolded by the iconicity of the manipulated figures to enhance understanding and, possibly, hypothesis generation. This kind of abduction results from what Magnani calls “thinking through doing” (2009), in which the environment and the tools in it are actively explored and manipulated in order to facilitate the activity of hypothesis generation. Conversely, the concept of theoretical abduction picks out those cases in which the hypothesis generation phase is entirely a mental phenomenon and only internal cognitive resources are employed to arrive at its formulation. Finally, the distinction between selective and creative abduction separates those cases in which a hypothesis is selected among a set of pre-known plausible hypotheses from those in which the conjecture is creatively generated. Building upon these distinctions and focusing particularly on the manipulative character of certain forms of abductive inference, Magnani proposes the eco-cognitive model of abduction. This model is developed in the last two of the three books devoted by Magnani to the topic of abduction (2001, 2009, 2017). The eco-cognitive model stresses the inherent situatedness and openness of abductive inference, as well as the multimodal and strategic or “military” character (Magnani, 2011) of cognition more broadly. The eco-cognitive model is chiefly an epistemological framework. As suggested by the name itself, the eco-cognitive model is based on two parallel dimensions: the ecological and the cognitive. The ecological aspect frames the thinking agent as deeply embedded in an environment and her cognitive performances as the result of the interplay between the two. This means that semiotic and contextual elements cannot be ignored or abstracted from when describing the reasoning process, as they can be in deductive and inductive reasoning. 
The ecological point of view here leads to rejecting the abstract notion of an agent as the homo economicus proposed by classical economics, as well as a normative account of reasoning standards, i.e., standards to which the agent should conform if she wants to think correctly. The cognitive dimension refers to the idea that a realistic view of the agent’s cognitive capacity ought to reflect empirical results coming from the cognitive sciences, in contrast with classical epistemology, which typically delves into conceptual and definitional issues. As noted by Bertolotti, this amounts to “an ideal of epistemology that must not be cognition-blind,” meaning that it needs to
take into consideration the best scientific theory regarding how humans actually reason. Magnani proposes an agent-based framework centered on three elements: (1) cognitive agents and their competences, (2) the constraints and limits on performance imposed by the limited cognitive and informational resources, and (3) the cognitive targets at which the agents aim – their agenda. In this sense, “the problem of describing how agents perform reasoning”, writes Magnani, “is constrained in three crucial ways: in what they are disposed towards doing or have it in mind to do (i.e., their agendas); in what they are capable of doing (i.e., their competence); and in the means they have for converting competence into performance (i.e., their resources)” (2008: 385). From the agent-based perspective, cognition does not happen in an information-complete environment; it is a matter of partial information and scant cognitive resources. The agent’s reasoning and actions are realistically modeled not as quests to optimize utility but as quests to reach satisfactory results. Given this background, a pluralistic view of hypotheses also entails that the process of suppression of doubt through the formulation of hypotheses happens at different cognitive levels, from low-level sensorimotor activities to high-level abstract and conceptual reasoning. Another important aspect, highlighted in the eco-cognitive model, is that in order to collect and make sense of partial information real agents use whatever they can, including internal means such as their senses as well as external means like the collective manipulation of the environment that increases the overall fitness of the organism, resulting in the construction of cognitive niches (Bertolotti & Magnani, 2017). Thus, according to Bertolotti, who follows Magnani on this matter, abduction is more aptly framed as fitness-reliable, not truth-reliable. 
In a practical context, abductive reasoning is aimed not at reaching true conclusions but at reaching conclusions that most enhance fitness from an evolutionary perspective and practical success from a pragmatic perspective. Similar to Menary’s position discussed in the following section, Magnani extends the concept of inference in a semiotic direction, arguing that abductive inference, and inference in general, should be understood within a semiotic rather than a logical framework. This amounts to the recognition that inferential premises or conclusions need not be in propositional or symbolic form; rather, they are signs, and signs can be “sensations, images, indices, external representations, and senso-motoric activities” (Paavola, 2011: 255), and in general everything that can stand for something else to someone or something. The concept of inference thus interpreted has a much broader scope than typically assumed. As an example, one could consider trees’ physiological reactions to the shortening days during the fall. Trees could be said to interpret the change in daylight as a sign of an incoming new season, and in doing so they implement a series of actions, such as shedding their leaves (Everett, 2019), which constitute what in Peircean terms is called an “interpretant.” By extending inferential processes in a semiotic direction and thus beyond the identification of inferential processes with the manipulation of abstract symbols as is typically assumed by classic cognitive scientists and formally
F. Fanti Rovetta
inclined epistemologists, Magnani, following Peirce’s lead on this point, presents a framework of cognitive capacities able to accommodate both representational and non-representational kinds of inferences.
Abductive Inference in Exploratory Cognition
Cognitivism, a group of theories that interpret cognition as computation over representations and that long dominated cognitive science, is now considered merely one among many explanatory strategies. We are in the era of post-cognitivism. Alongside classic cognitivism, a plethora of different approaches has swept the debates in cognitive science. These approaches – grouped through family resemblances, rather than shared principles – contest some of the main tenets of cognitivism as too abstract, intellectualistic, and hostage to modeling methodologies. This group of theories emerged in the last 30 years and is now well established, particularly in certain research programs. As noted by Hutto and Myin:
Enactive and Embodied ways of thinking about the mind and cognition are certainly already comfortably ensconced in cognitive science, having established deep roots in a number of disciplines. Far from merely being at the gates, the barbarians are, it seems, now occupying cafés and wine bars in the heart of the city. (2013: 3)
At the same time, like most revolutionary movements in the aftermath of their success, 4E cognition is struggling to find a single standpoint, a consensus, as no theory has imposed itself as the received view. Therefore, while 4E cognition is here to stay, it has taken the shape of a shared Weltanschauung and common research focus more than that of a single mainstream theory. The 4E movement has a large intellectual debt to classical pragmatist philosophy and in particular to James and Dewey. This debt consists of a number of core tenets that the two groups share, such as the idea that the main function of the mind is that of attuning the organism to the environment, the life-mind continuity thesis, an antirepresentationalist approach to cognition, a critical stance towards methodological individualism, and the emphasis on the role of habits. With few notable exceptions, C. S. Peirce’s philosophy has received little attention in post-cognitivist cognitive science. This is due to the fact that Peirce endorsed a hyper-inferentialist view of cognition (Legg, 2008) – the idea that every cognitive process is inferential, from perception to higher level cognition – and a sui generis semiotic theory of representation, whereas, following James and Dewey, many 4E theorists attempt to get rid of all inferential processes by understanding them as immediate, to explain cognition in terms of affordance detection, and to do without representations. In short, many – not all – philosophers and cognitive scientists understand the 4E turn as involving a non-representationalist and non-inferential theory of cognition, and Peirce’s philosophy is neither. Peirce not only endorsed a representationalist and inferential theory of the mind but extended the role of inference to perception and lower cognition. Therefore, it is unsurprising that when discussing the debts of 4E cognition to pragmatism, unlike those of James and Dewey, Peirce’s name is rarely
mentioned or does not appear at all (Heras-Escribano, 2019; van Dijk & Myin, 2019). Nonetheless, there are notable exceptions to the damnatio memoriae of Peirce’s legacy in the post-cognitivist panorama. Recently, his theory of life-mind continuity, called Synechism, has been interpreted as an antecedent of enactivism (Fanaya, 2021). Other exceptions are represented by researchers who work on the connection between 4E cognition and biosemiotics (De Jesus, 2016), by Menary’s Cognitive Integration Framework (2007), and by Magnani’s Eco-cognitive framework (2009). The proponents of these accounts are particularly fascinated by one of Peirce’s original concepts: abduction. Following Menary’s proposal, broadly inspired by Peirce’s philosophy, abduction is reframed as exploratory inference, which is open-ended, flexible, and self-correcting. This understanding of cognition resonates with the earlier attempts of cybernetic theory in proposing an extended theory of mind as well as with the contemporary 4E approaches. Menary’s proposal is centered around what he calls the Peircean principle. The principle states that:
• Thinking is structured by the interaction of an organism with its environment.
• Cognition develops via exploratory inference, which remains a core cognitive ability throughout the life cycle.
• Inquiry/problem-solving begins with genuinely irritating doubts that arise in a situation and is carried out by exploratory inference. (2016: 220)
According to Menary, exploratory cognition is one among several interaction styles described by the Cognitive Integration approach. Following Peirce, he considers exploratory inference (i.e., abduction) to be the operation through which cognitive systems engage in fallible and open-ended interaction with the environment. Through exploratory cognition, the cognitive system can self-correct in real time, resulting in environment-organism fine-tuning and skilled engagement with the environment. 
Crucially, under the umbrella of exploratory inferences, Menary (2016) endorses a pluralistic view of hypotheses or conjectures, which are the results of exploration of the environment performed by cognitive agents: hypotheses do not need to be conceived as propositions that need to be interpreted and tested; they might be entirely spatial or action-based. The infant moving in its environment and trying to get a solid grip on it can be framed in terms of an exploratory cognitive system. We may imagine the child as being in the process of forming its own action-based hypotheses on how to stand, how to walk, how to ride a bike. The theoretical move proposed by Menary with respect to the concept of hypothesis is not different from similar moves made by 4E proponents in the attempt to deflate intellectualistic approaches to cognition. In the case of social cognition, for example, various researchers have supported the view that alongside a theory of mind (understood as sentential mental knowledge) used to mindread (i.e., infer the reasons and intentions behind others’ behaviors), humans are also endowed with the intuitive capacity of understanding others by interacting with them. A more
parsimonious explanation of the reciprocal understanding resulting from some cases of human interaction can be proposed by resorting to the concept of embodied practices (Gallagher, 2020: 98–120). Similarly, Menary argues that the exploratory inferences or conjectures involved in low-level cognitive tasks, such as in coordinating sensorimotor activities, are nonrepresentational. More specifically, he distinguishes “explicit hypothesis generation and test involving beliefs or theoretical posits,” which are typically representational, as in the military examples discussed above, from those conjectures “in early developmental, and at least some sensorimotor cases” that “may be based on motor activity rather than on beliefs or representation” (Menary, 2016: 230). In these latter cases, which we may call implicit or tacit hypothesis generation, the sensorimotor conjectures formed in the course of development are put to the test in a trial-and-error procedure in the interaction with the environment and are adjusted accordingly. In this context, the concept of tacit knowledge proposed by Polanyi (1967) may help to clarify the matter at hand. Polanyi notes that not all human knowledge can be specified in a propositional, linguistic form. He summarizes this insight in the slogan: “we can know more than we can tell” (Polanyi, 1967). Knowing how to ride a bicycle is a typical example of tacit knowledge: it is something that would be difficult to explain with a set of instructions and it is even counterintuitive in some respects. But how is this tacit and implicit knowledge acquired in the first place? In learning how to maintain one’s equilibrium, move one’s center of gravity, and perform all the other actions needed to ride the bike, various sensorimotor conjectures are generated and put to the test. When something works, it is then repeated, learned, and retained. 
In this process of exploring the environment, the objects in it and their functioning, according to Menary, the cognitive agent formulates sensorimotor conjectures on how to correctly interact with them. In order to better understand the concept of sensorimotor conjectures, it is necessary to introduce the concepts of sensorimotor contingencies and of sensorimotor knowledge. The concept of sensorimotor contingencies refers to the “regular sensorimotor co-variations that depend on the agent and the environment” (Di Paolo et al., 2014) or, in other words, to the fact that certain actions reliably produce certain incoming stimuli, creating stable patterns in the perception-action cycle. Mastery over this cycle and the ability to enact certain sensorimotor contingencies in the skilled engagement with the environment give rise to sensorimotor knowledge. Mastery is achieved by drawing sensorimotor conjectures, i.e., implicit hypotheses that a certain action will create certain incoming stimuli. The sheer fact that human beings acquire sensorimotor knowledge, which “consists in a perceiver’s familiarity or attunement with the lawlike ways in which sensory stimulation varies with movement” (Kiverstein, 2010), seems to require the concept of sensorimotor conjectures, in so far as acquiring sensorimotor knowledge requires guessing and maintaining those guesses that prove efficient. Assuming that it cannot be spelled out in detail, sensorimotor knowledge cannot be taught explicitly. It may be possible to learn by observing and repeating others’ actions, and this is actually the case in some instances, but not always, and even this sort of apprenticeship, based on the observation of others’ sensorimotor knowledge
at work, requires the formulation of sensorimotor guesses. In other instances, sensorimotor knowledge is acquired by sheer trial and error. This kind of trial and error process is not merely blind variation and selective retention, but is guided by plausible and typically implicit hypotheses. These hypotheses are then what we have termed, following Menary’s conception of implicit hypothesis generation, sensorimotor conjectures. A central feature of hypotheses, both implicit and explicit, is their fallibility: the child will fail to stand, he will be irritated and feel frustration, he will have to form better sensorimotor hypotheses, test them, find their limit, and try again. In this sense, exploratory cognition is a self-corrective practice, presenting similarities with practices involving explicit hypothesis generation and testing, such as in the scientific method, which is also based on the formulation and testing of hypotheses (although, this is not all there is to the scientific practice). Frequent iteration of this process of hypothesis forming and testing in different environments will result in habitual, skilled, and reliable engagement with the environment. A continuity between trial and error in skilled engagement with the environment, practical reasoning, scientific conjecture-making, and conjecture-testing results from a pluralistic conception of hypotheses. Although the details may vary, all these cases share the same epistemological dynamic: a preliminary conjectural thesis is proposed, and when possible undergoes a testing phase, otherwise it becomes the basis for further action, to be maintained until its eventual failure.
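The shared dynamic described above (propose a plausible conjecture, test it where possible, retain it until it fails) can be illustrated with a minimal sketch. The Python below is only a toy model and not something proposed by Menary or Magnani: the `Conjecture` class, the numeric `plausibility` score, and the `works` test are hypothetical names introduced here for illustration. It shows how trial and error guided by plausibility differs from blind variation, since candidates are tried in order of prior plausibility rather than at random.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Conjecture:
    """A hypothetical stand-in for an implicit sensorimotor conjecture."""
    action: str
    plausibility: float  # assumed score derived from past interactions


def explore(
    conjectures: List[Conjecture],
    works: Callable[[str], bool],
) -> Tuple[Optional[Conjecture], List[str]]:
    """Try conjectures from most to least plausible; retain the first that works."""
    attempts: List[str] = []
    ranked = sorted(conjectures, key=lambda c: c.plausibility, reverse=True)
    for conjecture in ranked:
        attempts.append(conjecture.action)
        if works(conjecture.action):
            return conjecture, attempts  # retained until it eventually fails
    return None, attempts  # every conjecture failed; new ones must be formed


# Illustrative data only: a child learning to stand.
candidates = [
    Conjecture("lean back", 0.2),
    Conjecture("hold the table edge", 0.7),
    Conjecture("shift weight forward", 0.5),
]
retained, attempts = explore(candidates, lambda a: a == "shift weight forward")
```

In this toy run the most plausible conjecture fails first and the second most plausible one is retained, which mirrors the fallible and self-corrective character of exploratory cognition; nothing in the sketch is meant to suggest that real sensorimotor conjectures carry explicit numeric scores.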
Regaining Environmental Attunement Through Skilled Sensorimotor Conjectures
As argued in the previous section, sensorimotor hypothesis generation is not a blind trial and error process. Rather, the sensorimotor hypotheses put to the test find their legitimacy in, and are chosen because of, their initial plausibility. Their plausibility depends partly on the cognitive agent’s past sensorimotor interactions with the environment as well as on its capacity to scan the environment for relevant cues and regularities. G. Klein, psychologist of expert decision-making, provides a fitting example of skilled implicit sensorimotor hypothesis generation in his book Sources of Power: How People Make Decisions (2017). In this book, Klein highlights how expertise is manifested in a power to have the correct intuition, which he defines as “pattern recognition, having the big picture, achieving situation awareness” (2017: 209), in general as synthetic – as opposed to analytical – forms of inference and thus as both inductive and abductive reasoning, depending on the case discussed. Klein’s view has sparked a lively debate among experimental psychologists and decision-making theorists. The two sides of the debate are represented by supporters of the Heuristics and Biases (HB) and the Naturalistic Decision-Making (NDM) frameworks. The debate is centered around the validity of experts’ decision-making and professional intuition, especially when compared to algorithmically implemented decision-making procedures. By means of laboratory experiments, proponents of HB such as Kahneman highlight the fact that experts’ judgments
are often unreliable, whereas algorithmic procedures are more reliable, because they are not influenced by contextual factors, biases, and inconsistencies that affect human reasoning. Conversely proponents of NDM, such as Klein, insist on the weakness of algorithmic procedures, e.g., the potential to lead to automation bias. A compromise is reached by individuating some criteria for expert intuition to be effective and by distinguishing environments in which the reasoning is performed: environments with a stable relation between cues and outcome (defined as high validity environments) afford expert intuitions to be reliable, while in low validity environments, algorithmic procedures, being trained to handle a greater number of variables compared to human reasoning, regularly outperform experts. An overview of the debate, with a tentatively reached middle ground between the two positions can be found in Kahneman and Klein (2009). Although, as said, Klein prefers using the term intuition or intuitive judgment as a general term for synthetic forms of inference, the following passage based on an interview he had with a firefighter commander perfectly exemplifies an instance of non-representational hypothesis generation and the resulting decision: It is a simple house fire in a one-story house in a residential neighborhood. The fire is in the back, in the kitchen area. The lieutenant leads his hose crew into the building, to the back, to spray water on the fire, but the fire just roars back at them. “Odd,” he thinks. The water should have more of an impact. They try dousing it again, and get the same results. They retreat a few steps to regroup. Then the lieutenant starts to feel as if something is not right. He doesn’t have any clues; he just doesn’t feel right about being in that house, so he orders his men out of the building – a perfectly standard building with nothing out of the ordinary. 
As soon as his men leave the building, the floor where they had been standing collapses. Had they still been inside, they would have plunged into the fire below. (Klein, 2017: 33)
Although Klein does not explain the example cited here in terms of hypothesis generation, it clearly represents a case of guessing, involving all the classic features which define hypothesis generation. There is an expectation that is betrayed, surprise and disorientation follow, a hypothesis is implicitly generated, and a decision is taken based on it. More specifically, the firefighter commander experiences the failure of sensorimotor knowledge: law-like regularities he would expect to experience are not there and so he has to form an implicit conjecture to restructure his interaction with the environment. From the commander’s first-person perspective, the hypothesis generation process he engaged in is perceived, in this case, as an instinctual insight, what some phenomenologists may define as a form of non-representational intentionality, or directedness, involved in skilled performances (Gallagher & Miyahara, 2012):
“A sixth sense,” he assured us, and part of the makeup of every skilled commander. Some close questioning revealed the following facts: He had no suspicion that there was a basement in the house. He did not suspect that the seat of the fire was in the basement, directly underneath the living room where he and his men were standing when he gave his order to evacuate. But he was already wondering why the fire did not react as expected. The living room was hotter than he would have expected for a small fire in the kitchen of a single-family home. It was very quiet. Fires are noisy, and for a fire with this much heat, he would have expected a great deal of noise. (Klein, 2017: 33–34)
The interpretation of hypothesis generation as a skilled performance which leads to attunement with the environment is based on the fact that the expert has a tacit understanding of how to operate in situations that surprise her, depending on each situation. The expert has a tacit capacity to detect certain crucial cues and to attune to certain regularities in the environment as well as an understanding of how to deal with those situations in which the codified and rule-based procedures fail. Generating successful sensorimotor hypotheses is required when the usual patterns are broken and the situation appears off, even in an inexplicable fashion. It amounts to an ability to dynamically coordinate with an environment that is changing faster than careful deliberative reasoning can follow. In such situations, the expert is able to quickly re-establish the coupling to the environment and overcome the initial surprise and disorientation. The activity of generating sensorimotor hypotheses is context-sensitive: it requires the skill to detect the salient features by being mindfully aware of one’s own circumstances in order to avoid overloading the working memory capacity, and by enacting appropriate sensorimotor interactions. The generation of a fully sketched representation of the situation would, in such instances, be excessively time consuming and, by engaging the working memory system, would possibly lead to missing some key environmental cues. For this reason, “expert athletes speak of ‘keeping their minds blank’” (Hutto, 2019: 2). Once the cues are detected, hypothesis generation requires moving from these features to what they stand for. Combining these two steps, the experts arrive at a plausible hypothesis. Novices, whose operations are characterized by strict rule-following procedures (Gallagher, 2020: 44), handle this kind of situation poorly, precisely because known rules do not apply. 
In the hypothesis generation process, skills are not only involved in generating the hypotheses, but also in noting that the situation diverges from normality beforehand and in making the correct decision afterwards. In the example above, it is the commander who “starts to feel as if something is not right”; novices are not sufficiently perceptually attuned to the environment to recognize variations from normal circumstances. Because they are not familiar with salient features of the environment’s normal responses, they do not notice when these are missing. Moreover, sensorimotor conjectures are typically instances of manipulative abduction, in Magnani’s sense discussed above. Indeed, their formulation is based on the active exploration of the environment on the part of the cognizer, who by doing so in a skillful manner is able to detect crucial cues. Sensorimotor hypothesis generation is then neither mindless and automatic nor conscious and reflective. It is not a mindless, automatic activity because it does not rely on previously acquired sensorimotor knowledge as in the case of behavioral routines, which – as noted above – fails in cases which require generating hypotheses. It is not reflective because hypothesis generation, differently from hypothesis evaluation, escapes conscious and critical control. Reflection and strategic consideration on which hypotheses to test or adopt intervene only once hypotheses are generated and make up the hypothesis selection part of abductive reasoning,
in cases which allow for it. The whole process results in the re-establishment of tentative coordination or attunement of the organism with the environment. Differently from the decision-making process based on representational hypotheses, in this case the generation of the sensorimotor hypothesis – which manifests itself in a form of action, such as the evacuation of the burning edifice – is implicit in the fast decision of the firefighter commander to leave the building. The commander is not conscious of the hypothesis generated, since he does not know how he arrived at the decision to leave the building; we can therefore understand his decision only a posteriori, as based on the generation of the correct hypothesis, i.e., the guess that, something being off, the situation could develop in an unpleasant direction. Compared to the cases involving representational hypotheses, here both the generation of the hypothesis and the decision-making processes are temporally compressed and entangled. Moreover, on a phenomenological basis, it appears that hypothesis evaluation and option generation play only a marginal role or are entirely left out. In the cases of sensorimotor conjectures, guessing and the actions which usually follow from it collapse into one, so that the action itself instantiates a hypothesis: that the building was unsafe, in the case of the firefighter commander.
Abductive Ruse and Sensorimotor Hypotheses in Adversarial Interactions
In this section, I propose and elaborate on the concept of abductive ruse (AR) and discuss cases of deception involving the formulation of conjectures. The concept of AR is functional in providing a term to capture the variety of cases of deception involving hypothesis generation and selection. It refers to the process taking place in the interactions (both proximal and distal) between at least two agents, in which one of the agents involved (agent1) purposely crafts and displays false and misleading evidence in order to misdirect the other’s (agent2) abductive inferences, whether resulting in representational or in sensorimotor conjectures. If the ruse is successful, the result is the generation and selection of a false – and possibly preestablished – hypothesis by agent2. Agent1’s goal is ultimately to influence the decision-making process of agent2 in order to benefit from it. Compared to previous proposals of deception involving only propositional and representational hypotheses such as those discussed in Thagard’s framework, the model presented here is more ecumenical insofar as it allows for cases of non-representational hypotheses. The AR is implemented via a series of cues scattered in the environment (broadly understood) which point to a specific deceitful hypothesis or, in other cases, to any hypothesis which is not the correct one. In some cases, the AR may involve cues which lead to the formulation of non-representational misleading conjectures; other times the hypothesis is propositional and representational, as in the examples from Thagard’s paper discussed above.
In general, cases of AR show that the environment is not solely something manipulable in order to discover new information and to scaffold cognitive processes, but also something that can be exploited to manipulate others and suggest false hypotheses. In many cases of AR, the effort of agent1 does not stop after this phase. Sometimes, agent2 may be determined to find out by means of empirical verification whether the hypothesis is true or not. In such cases, agent1 will need to supplement the initial AR by matching the implications drawn from the hypothesis by agent2 and by making sure that the information agent2 collects matches the expectations based on such a hypothesis. Especially when confronting a savvy agent, the abductive ruse needs to present non-contradictory cues. There are two ways in which cues may be in contradiction. First, they may lack internal coherence, meaning that the various pieces of information conveyed to agent2 in setting the abductive ruse contradict each other. If detected, any such contradiction will lead the adversary to doubt the veracity of the received data and will jeopardize the formulation of the deceiving hypothesis. Second, they may lack external coherence, meaning that information conveyed during the AR is incoherent to some degree with other information already stored in agent2’s background knowledge. In these cases, the misleading hypothesis suggested by the AR will not seem believable in the first place, resulting in suspicion, disbelief, and the failure of the deception. Since examples of deception involving the formulation of representational hypotheses have been discussed above, here the focus will be on instances of deception centered around the suggestion to the adversary of non-representational, sensorimotor false conjectures. Examples of this latter kind can be easily recruited from sport performances (Mitchell & Thompson, 1986). 
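The two coherence requirements just distinguished can be made concrete with a small sketch. The Python below is a deliberately simplified illustration under stated assumptions, not a model drawn from the deception literature: cues are represented as hypothetical (feature, value) pairs, agent2’s background knowledge as a dictionary, and the function names are introduced here only for the example. Internal coherence is then the absence of two cues assigning different values to the same feature, and external coherence is the absence of a cue that conflicts with what agent2 already takes to be the case.

```python
from typing import Dict, List, Tuple

Cue = Tuple[str, str]  # hypothetical encoding: (feature, value)


def internally_coherent(cues: List[Cue]) -> bool:
    """True if no two displayed cues give conflicting values for one feature."""
    seen: Dict[str, str] = {}
    for feature, value in cues:
        if feature in seen and seen[feature] != value:
            return False
        seen[feature] = value
    return True


def externally_coherent(cues: List[Cue], background: Dict[str, str]) -> bool:
    """True if no cue conflicts with agent2's background knowledge."""
    return all(background.get(feature, value) == value for feature, value in cues)


def ruse_is_believable(cues: List[Cue], background: Dict[str, str]) -> bool:
    """A ruse is a candidate for success only if both checks pass."""
    return internally_coherent(cues) and externally_coherent(cues, background)


# Illustrative data only.
displayed = [("attack_direction", "north"), ("supply_route", "north")]
contradictory = displayed + [("attack_direction", "south")]
agent2_background = {"supply_route": "north"}
```

With these assumed inputs, `displayed` passes both checks against `agent2_background`, whereas `contradictory` fails the internal check. The sketch treats coherence as all-or-nothing, while the text speaks of incoherence "to some degree"; a graded measure would be needed to capture that.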
Consider, for instance, the case of a football player executing a feint to trick the opponent. The objective of the player with the ball (agent1) is that of passing through without losing the ball; conversely, the objective of the other player (agent2) is that of stopping her and getting the ball. In such cases, if sufficiently skilled, agent1 will engage in a number of actions creating false expectations in the adversary, such as shifting the center of gravity, faking movements in the wrong direction, and so on. All these actions result in agent2 attuning to the environment, including in it the opponent, having certain implicit hypotheses – sensorimotor expectations and conjectures – on how the interaction will unfold. Such tentative attunement with the environment presents the added difficulty that the most salient part of it, i.e., agent1, and agent2 are in a reciprocal adversarial relation. In such cases of deception, it is impossible to elaborate sophisticated methods in order to avoid eventual biases. A matrix such as that utilized by intelligence analysts is made useless not only by time constraints but also by the format in which the hypothesis is formulated. The hypotheses formulated by agent2 are not necessarily propositional mental entities as those in Thagard’s examples but are embodied guesses (embodied by the posture assumed, muscle relaxation and tension, shift of center of gravity) which try to correctly anticipate how agent1
will behave to successfully counter her actions. At the same time, the agent is rarely if ever aware of the hypothetical nature of the sensorimotor guesses inferred. The other player is perceived as going in a certain direction or as trying to feint. This vindicates Peirce’s claim that “abductive inference shades into perceptual judgments without any sharp line of demarcation between them” (C.P. 5.181). The D-hypothesis, i.e., the hypothesis of being deceived, in such cases, is not something one needs to remember to add to the list of plausible hypotheses, as recommended by Heuer in military contexts; rather the feint is either detected or not. On the other hand, the psychological factors suggested by Whaley remain unaltered: the AR at a sensorimotor level requires capturing and directing the adversary’s attention, maintaining their interest, presenting congruent cues, and hiding incongruent ones. In this sense there is a parallel between the two kinds of deception discussed; in both cases a degree of coherence in the cues displayed is required for the deception to function, with the proviso that the cues that need to be coherent in cases of sensorimotor guesses are compressed into a much shorter time scale, compared to the cases relevant for intelligence analysts. Additionally, the coherence exploited by this kind of AR is not explanatory except in a minimal sense: what is needed is that in the unfolding of the interaction no detectable cues point to the real action that will follow. In such cases it is a form of skilled action which guides the formulation of the correct sensorimotor conjectures, including that of being deceived. Going back to the example of the football players, obviously a skilled player can easily feint against a novice player, but it is much more difficult to do so with a player at a similar skill level. 
The skilled attunement to the environment in the case of the player who does not fall for the feint includes those minimal cues that signal the incoming feint. At a skilled level it quickly becomes an arms race to display and detect cues which lead to the formulation of the wrong or correct sensorimotor conjectures. The reason for the arms race is that, in comparison to the example of the skilled sensorimotor guess of the firefighter commander, part of the environment of AR, namely, the other agent, is actively modifying the detectable cues. In other words, in adversarial dynamics of this kind, part of the environment is actively engaged in trying to make the agent’s sensorimotor knowledge fail. It follows that sensorimotor knowledge, i.e., the familiarity with how sensory stimulation co-varies with movements and actions, is never achieved once and for all in this type of environmental dynamic. In contrast with Thagard’s characterization of adversarial interaction as requiring the development of a model of the adversary’s plan, goals, and intentions, in instances of deception involving sensorimotor conjectures such as in the sports example, no model is required. Again, the reason for this is to be found not only in time constraints, which impede the formulation of an articulated model of the opponents, but also in the lack of usefulness of modeling the adversary’s mental states in such interactions. Reasoning whether the adversary will move left or right would only risk overloading working memory and distracting the player from attuning with the other player’s movement and from perceiving the salient cues that may lead to the correct guess and physical response to the opponent’s move.
Conclusions
The concept of sensorimotor conjectures, developed in this chapter in relation to deception, is based on previous formulations of non-representational hypotheses, such as Menary’s notion of low-level, implicit hypotheses in exploratory cognition as well as Magnani’s concept of manipulative abduction introduced above. Throughout the chapter it has been shown that by adopting a pluralistic approach to the ontology of hypotheses, it is possible to extend the analysis of deception to many more cases than previously thought, specifically to low-level perceptual instances of deception. Having considered the existing literature on the role of abductive reasoning in deception and shown that much of the research on this topic has been conducted, surprisingly, by psychologists of intelligence analysis and theorists of military strategy, I have explored the possibility that low-level, perceptual instances of deception, mostly overlooked in the existing literature, can also be considered as involving abductive inferences, once we acknowledge the possibility of non-representational, sensorimotor conjectures. Abductive ruses are then not limited to the classic examples discussed in the literature on deception, which are typically based on representational hypotheses, but also include these latter cases. While the idea that abduction is involved in adversarial problem-solving and in deception has already been discussed at length by Thagard, his lack of a non-representational notion of hypothesis generation and selection unwarrantedly restricted the potential application of this conception to instances of representational, and more precisely propositional, hypotheses. This chapter is but one example of the many theoretical avenues opened by a non-representational notion of hypotheses.
References
Anderson, D. R. (1986). The evolution of Peirce’s concept of abduction. Transactions of the Charles S. Peirce Society, 22(2), 145–164.
Bertolotti, T., & Magnani, L. (2017). Theoretical considerations on cognitive niche construction. Synthese, 194(12), 4757–4779. https://doi.org/10.1007/s11229-016-1165-2
Bonawitz, E., & Griffiths, T. (2010). Deconfounding hypothesis generation and evaluation in Bayesian models. In Proceedings of the 32nd annual conference of the Cognitive Science Society.
Chang, W., Berdini, E., Mandel, D., & Tetlock, P. (2017). Restructuring structured analytic techniques in intelligence. Intelligence & National Security, 33. https://doi.org/10.1080/02684527.2017.1400230
De Jesus, P. (2016). From enactive phenomenology to biosemiotic enactivism. Adaptive Behavior, 24(2), 130–146. https://doi.org/10.1177/1059712316636437
Di Paolo, E. A., Barandiaran, X. E., Beaton, M., & Buhrmann, T. (2014). Learning to perceive in the sensorimotor approach: Piaget’s theory of equilibration interpreted dynamically. Frontiers in Human Neuroscience. https://doi.org/10.3389/fnhum.2014.00551
Everett, D. (2019). The American Aristotle. https://aeon.co/essays/charles-sanders-peirce-was-americas-greatest-thinker. Accessed 22/08/2021.
Fanaya, P. F. (2021). Autopoietic enactivism: Action and representation re-examined under Peirce’s light. Synthese, 198, 461–483. https://doi.org/10.1007/s11229-019-02457-6
Fann, K. T. (1970). Peirce’s theory of abduction. Martinus Nijhoff.
Fanti Rovetta, F. (2020). Framing deceptive dynamics in terms of abductive cognition. Pro-Fil, 21, 1. https://doi.org/10.5817/pf20-1-2043
Feyerabend, P. (1993). Against method. Verso.
Flórez Restrepo, J. A. (2021). Are there types of abduction? An inquiry into a comprehensive classification of types of abduction. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action. Studies in applied philosophy, epistemology and rational ethics (Vol. 59). Springer. https://doi.org/10.1007/978-3-030-61773-8_1
Fogarty, W. (1988). Formal investigation into the circumstances surrounding the downing of Iran Air Flight 655 on 3 July 1988. Department of Defense.
Frankfurt, H. G. (1958). Peirce’s notion of abduction. The Journal of Philosophy, 55(14), 593–597.
Gabbay, D. M., & Woods, J. (2005). The reach of abduction: Insight and trial. Vol. 2 A practical logic of cognitive systems. Elsevier.
Gallagher, S. (2020). Mindful performance. In A. Pennisi & A. Falzone (Eds.), The extended theory of cognitive creativity. Perspectives in pragmatics, philosophy & psychology. Springer. https://doi.org/10.1007/978-3-030-22090-7_3
Gallagher, S., & Miyahara, K. (2012). Neo-pragmatism and enactive intentionality. In J. Schulkin (Ed.), Action, perception and the brain. New directions in philosophy and cognitive science. Palgrave Macmillan. https://doi.org/10.1057/9780230360792_6
Heras-Escribano, M. (2019). Pragmatism, enactivism, and ecological psychology: Towards a unified approach to post-cognitivism. Synthese, 198(1), 337–363. https://doi.org/10.1007/s11229-019-02111-1
Heuer, R. J., Jr. (1999). Psychology of intelligence analysis. Center for the Study of Intelligence.
Hutto, D. D. (2019). Minds in skilled performance: Two challenges. In S. Gallagher, D. D. Hutto, J. Ilundáin-Agurruza, M. Kirchhoff, K. Miyahara, & I. Robertson (Eds.), Minds in skilled performance: From phenomenology to cognitive explanations (Vol. 35, pp. 1–20). Annual Review of the Phenomenological Association of Japan.
Hutto, D. D., & Myin, E. (2013). Radicalizing enactivism: Basic minds without content. MIT Press.
Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: A failure to disagree. The American Psychologist, 64, 515–526. https://doi.org/10.1037/a0016755
Kaiser, S., Simon, J. J., Kalis, A., Schweizer, S., Tobler, P. N., & Mojzisch, A. (2013). The cognitive and neural basis of option generation and subsequent choice. Cognitive, Affective, & Behavioral Neuroscience, 13(4), 814–829.
Kiverstein, J. (2010). Sensorimotor knowledge and the contents of experience. Perception, Action, and Consciousness: Sensorimotor Dynamics and Two Visual Systems. https://doi.org/10.1093/acprof:oso/9780199551118.003.0014
Klein, G. (2017). Sources of power: 20th anniversary edition. MIT Press.
Kompa, N. A. (2021). Epistemic evaluation and the need for ‘impure’ epistemic standards. Synthese, 199, 4673–4693. https://doi.org/10.1007/s11229-020-02996-3
Legg, C. (2008). Making it explicit and clear: From “Strong” to “Hyper” – Inferentialism in Brandom and Peirce. Metaphilosophy, 39(1), 105–123.
Liddell Hart, B. H. (1967). Strategy: The indirect approach. Faber & Faber.
Magnani, L. (2001). Abduction, reason, and science. Processes of discovery and explanation. Kluwer Academic/Plenum Publishers.
Magnani, L. (2008). Discovering and communicating through multimodal abduction. In I. Shuichi, Y. Ohsawa, S. Tsumoto, N. Zhong, Y. Shi, & L. Magnani (Eds.), Communications and discoveries from multidisciplinary data. Springer.
Magnani, L. (2009). Abductive cognition. The epistemological and eco-cognitive dimensions of hypothetical reasoning. Springer.
Magnani, L. (2011). Understanding violence. The intertwining of morality, religion and violence: A philosophical stance. Springer.
Magnani, L. (2017). The abductive structure of scientific creativity. An essay on the ecology of cognition. Springer.
Magnani, L. (2021). Abduction as “leading away”. In J. R. Shook & S. Paavola (Eds.), Abduction in cognition and action. Studies in applied philosophy, epistemology and rational ethics (Vol. 59). Springer. https://doi.org/10.1007/978-3-030-61773-8_4
Menary, R. (2007). Cognitive integration: Mind and cognition unbounded. Palgrave Macmillan.
Menary, R. (2016). Pragmatism and the pragmatic turn in cognitive science. In K. Friston, A. Andreas, D. Kragic, & A. Engel (Eds.), The pragmatic turn: Toward action-oriented views in cognitive science (pp. 219–237). MIT Press.
Mitchell, R. W., & Thompson, N. S. (1986). Deception: Perspectives on human and nonhuman deceit. SUNY Press.
Mohammadian, M. (2019). Beyond the instinct-inference dichotomy: A unified interpretation of Peirce’s theory of abduction. Transactions of the Charles S. Peirce Society, 55(2), 138–160.
Nersessian, N. J. (1995). Should physicists preach what they practice? Constructive modeling in doing and learning physics. Science and Education, 4, 203–226.
Paavola, S. (2005). Peircean abduction: Instinct or inference? Semiotica. https://doi.org/10.1515/semi.2005.2005.153-1-4.131
Paavola, S. (2011). Review of abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning, by Lorenzo Magnani. Transactions of the Charles S. Peirce Society, 47(2), 252–256.
Park, W. (2015). On classifying abduction. Journal of Applied Logic, 13, 3. https://doi.org/10.1016/j.jal.2015.04.001
Peirce, C. S. (1931–1966). Collected papers (8 Vols.). Hartshorne, C., Weiss, P. (Vols. I–IV), and Burks, A. W. (Vols. VII–VIII) (Eds.). Harvard University Press.
Polanyi, M. (1967). The tacit dimension. Anchor Books.
Stanley, D., & Nyrup, R. (2020). Strategies in abduction: Generating and selecting diagnostic hypotheses. Journal of Medicine and Philosophy, 45(2), 159–178. https://doi.org/10.1093/jmp/jhz041
Thagard, P. (1992). Adversarial problem solving: Modeling an opponent using explanatory coherence. Cognitive Science, 16(1), 123–149. https://doi.org/10.1016/0364-0213(92)90019-q
Thagard, P. (2007). Abductive inference: From philosophical analysis to neural mechanisms. In A. Feeney & E. Heit (Eds.), Inductive reasoning: Experimental, developmental, and computational approaches. Cambridge University Press.
van Dijk, L., & Myin, E. (2019). Reasons for pragmatism: Affording epistemic contact in a shared environment. Phenomenology and the Cognitive Sciences, 18, 973–997. https://doi.org/10.1007/s11097-018-9595-6
Whaley, B. (2016). Practise to deceive: Learning curves of military deception planners. Naval Institute Press.
Woods, J. (2013). Errors of reasoning: Naturalizing the logic of inference. College Publications.
Adversarial Abduction: The Logic of Detection and Deception
68
Samuel Forsythe
Contents
Introduction
Inquiry and Adversariality
  Cooperative Inquiry
  Adversarial Rationality
  Adversarial Inquiry
Detecting the Adversary
  Intelligence Inquiry
  Affordance Detection
The Logic of Deception
  Inquiry and Deception
  The Semiotics of Deception
  Deceiving Abductions
Conclusion: Abduction and Adversarial Rationality
References
Abstract
Today, it is widely agreed that abductive reasoning is not only central to scientific inquiry but also plays an important role in other kinds of inquiry and practical problem-solving situations. But while there is now a significant body of research on the role of abduction in the inquiries of cooperative communities, there is still much to be said on the role of abduction in adversarial situations, where the scientific norms of cooperation are set aside in favor of pragmatic criteria dictated by the ethics and epistemology of conflict. This chapter uses Peirce’s pragmatistic philosophy of inquiry to conceptualize the role of abduction in two key modes of adversarial rationality: detection – abductive efforts to inquire into the adversary; and deception – semiotic efforts to mislead the abductions of the adversary.
Keywords
Abduction · Adversariality · Deception · Detection · Inquiry · Semiotics · Intelligence · Peirce
S. Forsythe, Peace Research Institute Frankfurt, Frankfurt am Main, Germany
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_32
Introduction
For researchers working across philosophy, cognitive science, semiotics, and social epistemology, the topic of abduction has proven to be a productive source of conceptual and theoretical problems through which to explore inquiries where hypothetical and diagnostic reasoning play a significant role. Situations of interest often concern those fields where surprises, puzzles, and practical problems force us to proceed abductively, using hypotheses to reconstruct past events and conjecture future possibilities. Most of this research addresses social practices that share norms of cooperative problem-solving that resemble, to a greater or lesser degree, Peirce’s ideal scientific community: discovering, investigating, testing, and sharing results in good faith with the aim of pursuing a collective practice of truth-seeking. However, in the many years since Peirce made his first inquiries into the logic of discovery, there have been those who have noted the relevance of abduction for uncooperative and adversarial scenarios, where inquiry is not organized according to the values of a cooperative community but by conflict, struggle, and opposition. The main aim of this chapter is to conceptualize adversarial modes of abduction. Abduction, as understood within the Peircean pragmatistic framework, is the process of reasoning through which hypotheses are formed and evaluated, hypotheses which may serve explanatory, exploratory, diagnostic, or predictive purposes. To conceptualize adversarial modes of abduction, I adapt Peirce’s concept of inquiry to adversarial situations and theorize the role of abduction in detection – abductive efforts to inquire into the adversary – and deception – semiotic efforts to mislead the abductive inquiries of the adversary. The first section will examine how the adoption of an adversarial rationality affects the conduct of inquiry. Using Arrigo’s (2000) work on the ethics and epistemology of weapons research, we can see how efforts to exploit inquiry as an instrument of conflict give rise to two adversarial modes of abduction that distort cooperative modes of discovery and communication. The following section explores the logic of detection in the practice of secret intelligence inquiry, examining its methodological premises in light of Magnani’s (2009) eco-cognitive study of the relationship between abduction and affordance. The third section examines the significance of abduction for deception, where semiotic efforts are made to mislead the adversary’s inquiries through the manipulation of their abductive inferences.
Inquiry and Adversariality
This section explores how the adoption of an adversarial rationality shapes the conduct of inquiry, giving rise to distorted forms of both inquiry and “counterinquiry,” conceptualized as abductive processes of detection and deception.
Cooperative Inquiry
To begin, it is helpful to clarify the concept of inquiry by returning to Peirce’s idea, at least in order to construct an ideal-typical model against which to compare its adversarial rival. Throughout his life and work, one of Peirce’s key concerns was to understand inquiry, or the logic of discovery. The concept of inquiry signified the rational and methodical search for answers to puzzling questions, exemplified – but not exhausted – by the operations of the sciences. For an idealized sketch of Peirce’s concept of inquiry, it is enough to say that it is concerned with two complementary processes, discovery and communication. Discovery begins with the experience of surprise. Living and striving in the world, we are often surprised to discover that things are not as we believe them to be. Upon discovering that our beliefs and our observations no longer align, our surprise might give rise to the discomfiting experience of doubt, which Peirce in his early work thought it was the role of inquiry to resolve by searching for answers that allow us to reattain a state of belief. (CP 5.370; Note: All references to the writings of Charles Sanders Peirce are to volume number and paragraphs, and not pages, in Collected Papers of Charles Sanders Peirce, Vols. 1–7, ed. Charles Hartshorne and Paul Weiss, Cambridge, MA: Belknap Press of Harvard University, 1934–63.) Peirce was also aware that in both scientific and everyday inquiries, we must often begin our efforts to explain surprise using limited or unreliable information. These initial stages of inquiry were of additional interest to Peirce as they revealed many of the most intriguing questions of abduction, such as how it is possible to reason from surprise to conjecture and on towards truth, or how we detect signs that point towards hidden salience. 
Regardless of the background knowledge with which we begin our inquiries, Peirce believed that in all cases, we are prompted by the desire to explain, anticipate, avoid, or uncover further surprises, leading to habits of conjecture and experiment through which we shape our experience and environment (CP 5.374ff). And should our initial investigations afford hypotheses that appear sufficiently plausible to explain a puzzle before us, then we might take the further step of considering the potential consequences of our abductions, making deductive formulations of our hypothetical premises and then, if possible, testing our theories inductively through experimental practices. In some cases, we may find we are soon able to confirm and validate our initial hypotheses; however, if the object or context of inquiry is complex and contingent, we may find ourselves once again surprised and in need
of further abductions and inquiries. But by shifting iteratively between hypotheses, formal theorizations, models, and experimental manipulations, scientific inquiries are able to move slowly, piecewise, towards objects which may nevertheless remain immensely complex or hidden from direct apprehension. For Peirce, and for those who follow the trails opened up by his research, all inquiries begin with abductions, and abduction arises at every stage of inquiry (Auxier, 2018, p. 6). It is only our abductive powers of investigation and hypothesis that allow us to step from the known into the unknown, and which guide us, often poorly but ultimately effectively, through the dark. However, Peirce believed that inquiry does not only take place through particular efforts of discovery but requires the collective efforts of a community. Such a community of inquirers, as it has come to be called, must share a cooperative ethics, an epistemic methodology of self-correction, and adhere to a fallibilist conception of truth arrived at through consensus. Those who adopt these commitments will necessarily find themselves opposed to practices that aim to inhibit discovery and stifle scientific communication:
Upon this first, and in one sense this sole, rule of reason, that in order to learn you must desire to learn, and in so desiring not be satisfied with what you already incline to believe, there follows one corollary which itself deserves to be inscribed upon every wall of the city of philosophy: Do not block the way of inquiry. (CP 1.135)
The commitments of inquiry in turn shape the forms and norms of communication that allow our discoveries to be shared, discussed, debated, questioned, refuted, and revised (Misak, 2000, p. 35). Only through this collaborative effort can we move (asymptotically) towards truth over potentially vast spans of time. And while the competitive reality of scientific and scholarly professions may not always live up to such ideals, the global scientific community is still normatively committed to practices of cooperation, open and publicly available sources of information, and the collective search for knowledge.
Adversarial Rationality
Despite the lofty ideals inspiring Peirce and those who share his vision, it is an unfortunate fact that even our greatest efforts of reason can be put towards ends that are both ethically and epistemically opposed to the cooperative spirit of inquiry. One need only think of the historical relationship between modern science and modern warfare to get a sense of how the tools of the mind can be transformed into instruments of harm (Galison, 2000). And as this section will argue, it is not only the products of inquiry that can be adapted for situations of conflict but also the process of inquiring itself. Faced not only with the puzzlement and doubt that arise from surprising circumstances but also the uncertainty and suspicion that result from unpredictable and dangerous environments, inquirers may find themselves drawn to
modes of discovery and communication that differ in key ways from those founded on the ethics of cooperation. And if the stakes are raised even higher, such that inquirers must not only reckon with the intractability of nature but the active hostility of intelligent beings, then we may find that the procedures of inquiry are deformed even further. A particularly useful conception of adversarial inquiry can be found in Arrigo’s (2000) work on the ethics and epistemology of scientific weapons research, which theorizes the epistemological premises that shape and constrain the methodology of adversarial inquirers. Arrigo conceives of adversariality as a state of active hostility between two or more agents, motivated by moral premises founded on the perception of danger but structured by the epistemic premises of adversarial reasoning (Arrigo, 2000, p. 305). Believing themselves to be threatened by some form of significant danger, persuaded of the moral superiority of their cause (survival, victory, domination, etc.), and in the absence of any adjudicating third party, adversarial actors adopt a permissive moral stance that allows them to justify taking adversarial actions that may be offensive, defensive, preventive, or even preemptive (ibid., p. 315). Arrigo characterizes the overall result as a kind of “strategic ethics” entailing “consequentialism with respect to the Adversary (the ends justify the means), military virtue towards self and colleagues (courage, commitment, discipline, integrity, etc.), and sacrifice of subordinates and lesser Clients when tactically necessary to defend against the Adversary (expendability/self-sacrifice)” (ibid., p. 315). This adversarial ethics is then operationalized through a rational framework that Arrigo calls the adversarial epistemology, which will be discussed in detail below. 
In this chapter, the combination of an adversarial ethics and epistemology is conceptualized as an “adversarial rationality.” The moral premises of adversarial rationality create conditions in which ethical discourse between “insiders” – i.e., those responsible for carrying out adversarial activity – and “outsiders” – i.e., non-adversarial actors who seek to question insider methods and justifications – becomes difficult, since insiders do not share the moral premises of outsiders (ibid., p. 303). The epistemological premises of adversarial rationality also generate moral problems, since the secrecy entailed by adversarial activity further inhibits inquiry and moral assessment by outsiders. To address these problems, Arrigo’s aim is to enable outsiders to make sharper moral assessments of weapons research projects by helping them acquire an appreciation of the methodological premises of adversarial rationality, relieving them of the complicity that arises from “culpable ignorance” (ibid., p. 322). However, as Arrigo observes, while making the ethical premises of adversariality explicit might help outsiders grasp what actions an adversarial actor considers morally justified, it does not yet give any indication of how adversariality changes inquiry or how it might otherwise alter our epistemic commitments. To this end, Arrigo argues that the moral outcomes of adversarial inquiries are more closely tied to their epistemic premises than they are to their ethical premises, and therefore it is of central importance to make such premises explicit (ibid., p. 303).
Adversarial Inquiry
To explain both internal and external constraints upon adversarial inquiries, Arrigo formulates the concept of an adversarial epistemology, which describes a rational framework for conflict that arises in high-stakes circumstances, “where competition for knowledge is crucial to the attainment of a limited good, such as military and political power.” As we will see, such noncooperative forms of inquiry give rise to an extended repertoire of abductive inference and reasoning. Arrigo conceptualizes the principles of adversarial epistemology in four premises, each of which reveals some of the ways that the logic of inquiry and the role of abduction are influenced by problems of adversariality:
1. The ultimate goal of inquiry is advantage over an Adversary.
2. The Adversary is dangerous and implacable.
3. All observations are vulnerable to deliberate deception by the Adversary.
4. Clients govern the broad topics, opportunities, and constraints of inquiry.
(ibid., p. 307)
The first principle, “the ultimate goal of inquiry is advantage over an adversary,” introduces one of the most significant ways that adversarial inquiries deviate from cooperative norms. In cooperative inquiry, the ultimate goal is to take inquiry as far as it can go, and to reach a revisable and fallible consensus on truth within a community of inquirers. However, adversarial inquiries have an entirely different purpose characterized by a particular epistemic aim: the search for exclusive and efficacious knowledge that leads to advantage, which we can conceive as the ability to surprise the adversary and to anticipate their efforts to surprise us in turn. Such knowledge is not universally valued but particular and “intrinsically partisan, because the value of knowledge depends on its utility to us” (ibid., p. 307). This notion of advantage describes a functional utility relative not only to features of the environment but also to the balance of power between adversaries, in which the agent with advantage has an increased capacity for knowledge, decision, or action that often entails a disadvantage for the opponent – exemplified in the “zero-sum games” studied by game theory (von Neumann & Morgenstern, 1944; Goffman, 1970). One curious feature of adversarial inquiry, and adversarial rationality in general, is that they have a generally “pragmatic” character, insofar as the role of such inquiries is to produce beliefs that can support action. However, the peculiarities of the adversarial context shift both the purpose and consequences of belief formation. 
Arrigo clarifies this problem by noting that when advantage motivates inquiry, it introduces “a gap between the validity of knowledge and the value of knowledge.” What is important is not, as in our Peircean model, that the product of an inquiry conform to mind-independent reality thanks to the exhaustive and uninhibited conduct of discovery and communication, but rather that it afford effective action even in conditions of limited information and often without the validation that comes from rigorous testing and open debate. This of course does not exclude the possibility that adversarial inquiries can reach reliably true conclusions, but
rather highlights the fact that they simply do not elevate truth to the highest aim. Indeed, there are a variety of adversarial circumstances in which incorrect beliefs, or even ignorance, might be more advantageous than true beliefs (Arrigo, 2000, p. 307). Additionally, while deductive and inductive procedures might be more reliable, they suffer from the constraints of time-pressure and limited information. By contrast, abductive inferences are well suited to such circumstances, since they can be produced heuristically even under significant information constraints and can enable action without confirmation of their validity (Bardone, 2011). Thus in this first premise, we can see how adversarial inquiries shift epistemic aims from consensus on truth to the production of situational advantage, enlarging the role of abduction in the production of hypotheses and belief formation but at the same time inhibiting aspects of inquiry, without which we cannot expect to eventually converge upon truth. The moral and practical premises of the adversarial rationality mean that the detection and exploitation even of merely hypothetical advantages takes epistemic priority over all other commitments of inquiry. If Arrigo’s first principle provides the strategic motivation for the adversarial epistemology, the second principle, “the adversary is dangerous and implacable” gives it its justification and reveals many of its constraints. Irreconcilable interests, high political stakes, the dangers of imminent violence, and the wide variety of possible threats all serve to further deform the nature of adversarial inquiry away from the cooperative model. The problem of danger “directs attention to prevention of surprises,” which is analogous (in a dark and parodic fashion) to the pragmatistic motivation of inquiry, insofar as adversarial inquiries aim to eliminate the doubt and uncertainty that are thought to arise from dangerous surprises (Arrigo, 2000, p. 309). 
However, the risks and advantages of dangerous surprises produce the specific moral grounding of adversarial ethics and provide its epistemic motivations, since when “the stakes are high, the Adversary may attempt to destroy us, not only to thwart our inquiry” (ibid., p. 308). Additionally, because the “circumstances of competition are believed to prevent reconciliation of opposing interests,” adversarial actors expect any lapse in their wariness to be an advantage exploitable by the adversary (ibid., p. 308). However, danger also constrains inquiry insofar as the threat of danger limits the time available for procedures of discovery and limits cooperative communication. Since the end of such inquiries is to support advantageous action, and the aims of such actions are shaped by the perception of imminent danger, adversarial inquiries also entail a further trade-off between the speed and the accuracy of hypotheses and beliefs (ibid., p. 308). And because situations of danger encourage adversaries to make efforts to conceal their activity to maximize their ability to produce surprising effects, adversarial inquirers “must forgo the ideal of perfection of knowledge in limited fields” and instead spread their “epistemic resources widely for brief inquiry into unlikely domains so as not to leave them unattended for exploitation by the Adversary,” creating “a further trade-off between accuracy and comprehensiveness” (ibid., p. 309). All of these dynamics give rise to an increased utility for abductive inference since, without abductions, there are simply no other means for cognitively overcoming the pragmatic demands and practical limitations
of the situation. However, because each adversarial actor fears the surprising actions of the other, the dynamics of mutual threat and danger give rise to the possibility of a cascade of preemptive or escalatory actions founded on incorrect abductions. The third premise, “all observations are vulnerable to deliberate deception by the adversary,” brings us to the key difference between adversarial and scientific inquiry. As Goffman (1970) notes in his work on the interpersonal logic of concealment, disclosure, and deception, to maximize one’s own potential for advantageous surprise, it might not only be necessary to conceal intention and action from the adversary but to cause them to acquire false beliefs that will be to one’s own advantage (Goffman, 1970, pp. 5–13). Therefore, adversaries must not only attend to the complications of inquiry arising from danger, contingency, and complexity, but also the possibility that their inquiries are being intentionally manipulated. Arrigo notes that while scientific inquiry works to eliminate errors resulting from “unrepresentative samples, faulty instrumentation, omitted data, misused statistical analyses” etc., adversarial inquiry must remain constantly vigilant to avoid being taken in by deception, since “regardless of the adversary’s knowledge of a phenomenon, they may deceive us about the nature of the phenomenon itself, about their knowledge of it, about their knowledge of our knowledge of it, and so on” (Arrigo, 2000, p. 309). And unlike scientific practice, secret and systematic adversarial inquiry may actually increase vulnerability to deception by creating new channels through which multilayered ruses can play out, such as a cryptographer cracking a code only to be taken in by the disinformation of the message itself (ibid., p. 310). 
Following this logic, the adversarial inquirer must not only inquire into the adversary but must also scrutinize their own behavior in order to discover how it may afford the adversary the opportunity to make abductions about their habits and thus an opportunity to anticipate, surprise, and deceive. Thus, the possibility of deception opens up a semiotic domain of adversarial contest, not only over the acquisition of advantageous knowledge but competition over who can best manipulate the abductions of the other. As an inherent element of adversarial inquiry, deception is the practice that reveals how far such inquiries deviate from cooperative norms. The centrality of deception also reveals just how significant abduction is to adversarial inquiry and how dependent adversarial rationality is on abductive methods. In their efforts to gain advantage, overcome secrecy, and avoid dangerous surprises, adversarial inquirers are not only forced to rely on abductions for the formation of beliefs to support action, they must also take care to inhibit and manipulate the abductions of the adversary. While adversarial discovery involves abductions about the adversary, deception takes these abductions and exploits them to generate semiotic activity that will manipulate the adversary’s own abductions. The final principle, “clients govern the broad topics, opportunities, and constraints of inquiry,” reinforces the notion that adversarial inquiry cannot serve a cooperative community or result in open forms of communication (Arrigo, 2000, p. 311). Clients, in Arrigo’s model, are the politicians and military decision makers on whose behalf adversarial inquiries and deceptions are undertaken. Rather than pursuing truth for the good of all, such inquiries pursue advantage for the power of
a few. Regardless of whether these advantages are thought to serve a greater good, the reality is that the means and methods adopted are inimical and opposed to the cooperative and scientific spirit of inquiry—an ethical dilemma that is of particular significance for democratic nations engaged in adversarial practices of international relations (Misak, 2000). Further, the operations of deception instrumentalize the inferential products of discovery, using them as the basis for the manipulation of the adversary’s inquiry, denying them the opportunity to move towards truth, and inhibiting their participation in a community dedicated to this end. To summarize Arrigo’s model, the adoption of an adversarial rationality not only distorts the practice of inquiry but extends and changes the role played by abduction in such a way that it is used to oppose the very aims of cooperative inquiry itself. To better distinguish the adversarial exploitations of abduction from cooperative discovery and communication, it might be helpful to develop alternate terms that highlight their instrumental and adversarial character. For discovery, let us instead use the term detection, and for communication, we can substitute the notion of deception.
Detecting the Adversary
In its general sense, detection can be conceived as the perceptual and cognitive process of abductively discovering something hidden, latent, and otherwise semiotically resistant to immediate observation. Historical images of skilled detection include the arts of the wily hunter or the cunning military general, while today, we might include the expert inquiries of police detectives or ingenious investigative journalists (Detienne & Vernant, 1991; Jullien, 2004; Weizman, 2017). The objects of detection need not be physical, but might also be conceptual, linguistic, semantic, hidden in the conceptions of thought, in patterns of discourse, or in vocal utterances of speech (Magnani, 2009, pp. 294, 334, 357). Since ancient times, the discourse of medicine has employed a semiotic concept when it speaks of detecting the hidden presence of illness through indirect signs (semeion) and symptoms. Today, military sensor systems abductively detect hidden adversary presences (Shakarian & Subrahmanian, 2011; Shakarian et al., 2011) while the profession of police forensic investigation is entirely dedicated to the practice of detecting the concealments and dissimulations of criminal actors (Danesi, 2014). In their collection of papers comparing Peirce’s logic of abduction to the investigative methodology of the fictional detective Sherlock Holmes, Eco and Sebeok (1983) reveal that not only is the inquiring detective concerned with similar procedures of discovery as the pragmatistic inquirer, they are, like Peirce, primarily concerned with an abductive approach to the semiosis of secret and hidden phenomena, with certain kinds of environmental clues – traces, tracks, incidental marks, and other indices of past activity – that afford further hypotheses about causes, intentions, and future possibilities for action otherwise hidden by the passing of time or the activities of intentional concealment (Eco & Sebeok, 1983). 
However, the authors of that volume are also aware that the mode of detection employed
by the detective is not identical with that of the scientific inquirer, insofar as the detective employs their detection procedures against an adversary, one who oftentimes carries with them the threat of danger and other unpleasant surprises. In this sense, detection is an adversarial mode of inquiry that not only discovers things but also opportunities, whether for action or for thought. Detection detects further complexes of signs and phenomena through which we can continue the work of inquiry to explain surprise and neutralize doubt, not in the service of truth, but for situational advantage. In the following section, I conceptualize adversarial detection as the process of abductively discovering adversarial affordances. To develop this line of thought, I first follow Arrigo’s example and take a look at the adversarial procedures of discovery and investigation typical of political-military intelligence.
Intelligence Inquiry
“Intelligence” is a term with a remarkable semantic density, a homonym with two equally multivalent meanings. While the principal sense of the word refers to a cognitive agent’s capacity for learning, knowledge, and reasoning, the other sense of intelligence displays as much polysemy as its twin, simultaneously denoting at least three different modes of epistemic, practical, and organizational activity, all characterized by being relevant to contexts of adversariality. Intelligence, in this latter sense, refers to a particular kind of political-military inquiry, in which adversarial actors seek to inquire into their (actual or merely potential) adversaries. Through networks of spies, clandestine officers, remote sensor systems, technicians, and analysts, intelligence organizations gather information, rumors, clues, and myriad indirect indices in order to piece together pictures of the operational environment, to model the adversary’s capabilities, and, if possible, to predict their intentions. From this general notion can be discerned three uses of intelligence that describe different conceptual and practical levels: the activity of intelligence that constitutes both the inquiry and the advantageous exploitation of its products; the particular knowledge “products” that are the results of inquiry; and the type of organizations that engage in this epistemic production (Thomas, 1988, p. 219; Warner, 2002, pp. 15–20; Vrist Rønn & Høffding, 2013, pp. 695–696; Wheaton & Beerbower, 2006, pp. 319–330; Herman, 2001, pp. 3–29; Scott & Jackson, 2004, pp. 139–169). While many organizations gather information for competitive advantage, intelligence is otherwise defined by its adversarial character and its utility in producing beliefs that afford strategic advantage (Horn, 2003; Sims, 2022). 
As an introduction to the adversarial inquiry of intelligence, I would like to examine one of the most comprehensive theories of intelligence, the “intelligence cycle” (Phythian, 2013). The intelligence cycle is a conceptual scheme that is understood to describe the different “core” stages of practical and epistemic activity involved in the functioning of an intelligence organization (Davies & Gustafson, 2013, p. 63). In its most basic form, the intelligence cycle describes several stages of discovery and communication, each defining a different mode of inquiry essential to the process of
producing epistemic “intelligence” products, such as reports, warnings, evaluations, predictions, etc. Conventionally, the cycle begins with the “planning and direction” stage, wherein the doubts, surprises, suspicions, and objectives of decision makers (“Clients,” in Arrigo’s model) provide the requirements that initiate and orient the inquiry. Such directives structure the aims of the “collection stage,” wherein various methods of inquiry, both technical and interpersonal, are employed to discover and acquire advantageous information. Collected intelligence is then passed on to the “processing” stage, where it is evaluated and interpreted in order to draw out the latent significance and inferences that might afford the creation of useful models, evaluation of likelihoods, and the development of explanatory hypotheses. Having been analyzed, intelligence passes to the dissemination stage, where it becomes the intelligence “products” that are communicated to decision makers, who provide further collection guidelines, or are fed back into some other stage of the cycle to adjust ongoing collection operations or to update the premises of analysis. The planning/direction stage of the intelligence cycle describes the formulation and communication of the tasks, targets, and areas of interest relevant to decision makers requiring intelligence. These requirements are derived from the national-security needs of governments, and these in turn derive from the general nature of the international system and the strategic threats relevant to or stemming from each international actor. As Arrigo notes, despite there being an element of communication at this stage of the inquiry, it is necessarily constrained by the strictures of secrecy, which immediately distinguish it from cooperative inquiries (Arrigo, 2000, p. 311). 
Intelligence collection takes place through a number of different and distinct modes of activity, the oldest and most well known being “human intelligence” (Humint), i.e., collection through spies, foreign agents, informers, defectors, prisoners, travelers, or other “well-placed” persons. The techniques of Humint vary considerably, and those such as agent “recruitment” and “handling” appear as something more akin to a craft or art than any kind of formal methodology, requiring an experienced understanding of human behavior, knowledge of cultures and customs, as well as knowledge about the personal tendencies of individual human sources (Herman, 1996, pp. 61–66). The uncertainty and contingency of this activity suggest a great deal of potential for the exploitation of abductive cognition, not only at the level of inquiry but in the contingent and improvisatory techniques of spies and secret agents. The cognitive character of intelligence inquiry through interpersonal channels appears to involve a pragmatic ability to handle unpredictable situations and deal with contingencies, as well as the ability to manipulate other persons in order to extract relevant information. In the modern era, the interpersonal capability of the intelligence officer has in many instances been displaced through the development of the technical observation of signals (Sigint) and imagery (Imint) intelligence. Through technologically enabled communications interception and satellite observation, the perceptual apparatus of the human agent is extended and distributed into the world, allowing interpersonal communications to be surveilled from a distance or intercepted en route through telecommunications networks. Through Sigint, vulnerabilities (i.e.,
hidden opportunities for adversarial action) latent in technological infrastructure are exploited to facilitate interception. While Humint can be thought of as specializing in the detection and exploitation of human vulnerabilities, Sigint specializes in the exploitation of the “environmental” features of ICTs and other technological systems. Imint similarly exploits environmental features, providing indirect visual indication of an adversary’s material capabilities, movements, and operations (ibid., p. 18). However, Imint images alone are not sufficient as intelligence products, and before being passed on to decision makers, they undergo a process of interpretation and analysis (identifying objects through size, shape, shadow, etc.), both for the purposes of explaining the scenes depicted as well as to enable further inferences that such scenes might afford (ibid., p. 100): the presence of a missile battery may indicate an intention to attack an adversary, or the size of a logistics delivery may point to the acceleration of a weapons research project. However, it must be noted that the partial and limited perspectives of mediating technologies ensure that the interpretation of signals and imagery depends in large part on the success of conjectures and hypothesizing (Horn, 2003, p. 14). The processing stage of the intelligence cycle, involving interpretation, evaluation, and expert analysis, has long been considered the functional core of the intelligence process, without which intelligence as a product cannot acquire its special utility or value (Kent, 1966, pp. 4–5). Indeed, because of the nature of adversarial inquiries – constrained by danger and time-pressure – the products of inquiry are often partial and, as Arrigo points out, subject to the possibility of deception. Therefore, intelligence inquiry proceeds by combining the search for novel phenomena with the analysis of anomalous phenomena (Arrigo, 2000, p. 309; Aliseda, 2004, p. 353). 
The character and activities of this stage are many and complex, but it is uncontroversial to suggest that the activity is effectively inferential and abductive in nature, in the sense that it involves making hypothetical inferences from limited, contingent, and potentially unreliable data (Herman, 1996, p. 100). The general scheme of intelligence processing and analysis involves a series of observations and inferences about the content, meaning, and implications of a given piece of intelligence material (ibid., pp. 100–104). The inferences derived from Humint, Sigint, Imint, and the other “Ints” not covered here (e.g., geospatial, electronic, intelligence from “open” sources), all have the capability of indicating directly or indirectly the nature, state, or possibility of a past, current, or future event. In this sense, intelligence inquiry employs abduction for reconstructing or explaining past events, predicting future surprises, or for diagnostic reasoning about current events (Magnani, 2001). The abductive character of intelligence collection is mirrored in the analysis stage, where hypotheses are derived from a range of textual, visual, and artifactual evidence (which together cover Peirce’s three types of sign: iconic, indexical, and symbolic). The cognitive character of intelligence “processing” appears to involve a mix of empirical analysis, the formulation – and, if possible, testing – of hypotheses, and expert conjecture (Herman, 1996, pp. 82–86). Intelligence theorist Michael Herman likens the process of intelligence analysis to the work of an archaeologist, while others have compared
it to the diagnostic activity of a doctor, both practices that have been discussed in the literature on abduction (ibid., p. 85; Warner, 2013, p. 27). The dissemination stage of intelligence inquiry covers that aspect of the cycle where collated intelligence products are communicated to policy and decision makers, or are fed back into some other stage of the cycle to optimize operational control. We can understand the product stage as one in which intelligence is formed into various epistemic instruments. Depending on the intended use or desired instrumentality, products may serve descriptive, explanatory, estimative, or predictive roles, and may have various recipients, e.g., the executive branch of government, military forces, the foreign office, or national security services. In our rough sketch of intelligence inquiry, we can see a general resemblance to the ideal-typical model of pragmatistic inquiry, and yet we can easily discern a clear divergence that follows the pattern set by our concept of adversarial inquiry. The highest aim of intelligence is not to move towards truth through exhaustive research and cooperative consensus but to produce advantageous hypotheses useful for coordinating further action and inquiry, informing strategic decision-making, shaping the formulation of policy, and even secretly manipulating events and persons in the world (Arrigo, 2000; Horn, 2003). In this sense, intelligence is the example of adversarial inquiry par excellence. The epistemic and pragmatic character of intelligence inquiry, its efficacy in detecting, predicting, evading, or helping to produce dangerous surprises, results from its hypothetical and conjectural character. However, this efficacy comes at the price of its unreliability, the inherent instability of abductively formed beliefs observed by Arrigo as being typical of adversarial inquiry. 
As Horn (2003) notes in their critique of the “epistemology of enmity” that characterizes intelligence, its epistemic products represent a “high-risk, ephemeral sort of knowledge,” which is “not only won under dangerous conditions but also has its own inherent dangers and paradoxes” – not only the possibility of adversarial deception but the self-deception of fallacies and biases (Horn, 2003, p. 2):
The probability of being deceived and the necessity to deceive are the heart of the epistemology of secrecy that distorts the knowledge produced by intelligence into an abyss of endless hypothesizing. (ibid., p. 14)
Affordance Detection
Intelligence inquiry typically employs multiple modes of observation in order to gain access to what is usually concealed. These mediated observations rely on the discovery and exploitation of environmental features that render them vulnerable to the effects of collection methodologies. However, the nature of such observations is such that any information is likely to be of uncertain reliability and any analysis is likely to lack a complete understanding of the context in which the information was originally produced. Further, the adversarial intentions of the adversary entail the possibility that observations might be vulnerable to deliberate deception. Such epistemic constraints immediately imply the relevance of abduction for the construction
of hypotheses and explanations to compensate for incomplete data. But while the role of abduction in adversarial detection is now becoming somewhat more clear, it remains somewhat uncertain what exactly is being discovered in this process. This can be clarified with reference to the concept of affordance, and the way that our abductive efforts at discovery and detection exploit the semiotic and ecological character of our environment. As Magnani (2009) has pointed out, in order to increase our chances of successfully undertaking inquiries and other ventures, humans constantly delegate and distribute cognitive functions to the environment (i.e., through models, representations, and other various mediating structures), for which the systems of intelligence collection provide an excellent example (Magnani, 2009, p. 317). Whether it is through the process of designing human environments or the navigation of a natural landscape, humans constantly seek to discover and exploit affordances, a concept which the perceptual psychologist J. J. Gibson conceived to describe the ecological nature of perception: affordances are what the environment furnishes or provides to an agent; particular features that afford an opportunity to interact (Gibson, 1986, p. 127; Magnani, 2009, pp. 332–335; Bardone, 2011, p. 88). This interaction may be physical or cognitive, and it may be an opportunity to perform movements, access resources, or to make further inferences about otherwise hidden phenomena. For example, a chair affords sitting for agents with particular biomechanical configurations, underbrush may afford concealment for beings of a certain size or coloration, while an ecological or technological niche may afford more specific and complex behaviors for those organisms that are biologically and cognitively adapted so as to be able to perceive and make use of them. 
The term affordance not only describes what is afforded by the environmental resources themselves but – and perhaps primarily – the environmental clues that must be detected and inferred from in order to make use of the opportunities that are provided by those particular environmental features. This is a distinction between the affordance property of an environmental feature (i.e., stairs+perceiver = climbability) versus the subsequent action that an environmental feature affords (i.e., climbing). As Magnani notes, “the fact that a chair affords sitting means that we can perceive certain clues (robustness, rigidity, flatness) from which a person can easily say ‘I can sit down’” (Magnani, 2009, p. 336). While the original Gibsonian notion of affordance deals with those situations in which detectable signs and clues prompt or suggest a certain action or exploitation rather than others, in Magnani’s more nuanced sense, finding or constructing affordances also deals with a semiotic-inferential activity, implying that we are afforded by an environment if we can detect those signs and cues from which we may infer the (hidden) presence of an actionable or cognitively mediating resource (Bardone, 2011, p. 88). This points to the idea that affordances are themselves mediated by semiotic properties of an environment, at once affording the distribution and externalization of human cognition in features of an environment (either designed, as in a system, or natural, as in a landscape), as well as features of the environment that afford the abductive inferences characteristic of discoveries and detections.
While the concept of abduction was originally developed by Peirce to describe the inferential and creative process of generating a new hypothesis, implicated in creative reasoning and scientific discovery, Magnani has shown that abduction takes a number of forms, involving not only sentential and propositional activity but also practical manipulations of affordances (Magnani, 2001). While “theoretical abductions” are largely sentential, “related to logic and to verbal/symbolic inferences, and ‘model-based’ abductions are related to the exploitation of internalised models of diagrams, pictures, etc.,” manipulative abduction refers to processes that are largely defined by a “thinking-through-doing” characteristic, such as when an agent enters a new environment and physically searches for hidden or overlooked features affording further action or inference (Magnani & Bardone, 2008, p. 10). In this activity of affordance detection, abduction is not only a sentential activity proceeding from propositions but also proceeds through internal and tacit modeling of phenomena, as well as external manipulations to acquire new information and perspectives on a given problem, or discover hidden features of an object, idea, or environment (Magnani, 2009, pp. 18, 41–51). Abduction of environmental affordance is necessarily diagnostic in nature (e.g., “the flatness of a surface indicates a suitable place to sit”) as well as hypothetical (e.g., “from the distinct shadow of that object may be inferred the presence of a hidden missile battery”). Abduction is thus the process by which affordances are detected in scenarios of limited or constrained information. And if affordances are what allow humans to overcome their situational limits (both biological and environmental), then adversarial affordances would be those features of any environment that afford the overcoming of limits or realization of objectives related to an adversary, i.e., the pursuit of a relative or absolute advantage. 
Like intelligence, adversarial inquiry is instrumental and affordance-seeking, employing both theoretical and manipulative abduction, a process of recursive searching, testing, conjecturing, and investigating. The adversarial constraints and conditions that might inhibit scientific inquiry give rise to more effective but less reliable modes of inquiry, extending the role of abduction from only the inference that initiates inquiry to the inference upon which significant and consequential actions and judgments are founded. The function of an intelligence system, governed and managed by the intelligence organization, is fully attuned to, and indeed is methodologically defined by, this abductive activity of affordance detection. Consequently, the risks and dangers of intelligence (failed operations, misdirected missile attacks, missed warnings, etc.) are exacerbated by incorrect and erroneous hypothetical inferences (Wohlstetter, 1962; Betts, 1982; Jervis, 2010). Given the above discussions, we can see that intelligence is a useful example of an adversarial inquiry, the primary objective of which is to distribute and operationalize the cognitive process of abductively detecting and exploiting adversarial affordances. Intelligence inquiry employs multimodal perceptual and cognitive detection methods in order to gain epistemic access to affordances that have been concealed. These mediated perceptions rely on the detection and exploitation of environmental features that render them vulnerable to the effects of collection methodologies. The processing of information acquired through collection further
exploits any latent semiotic features that afford cognitive purchase. Viewed through this lens, intelligence activity can subsequently be recast as an abductive practice of affordance detection, employing theoretical, diagnostic, and manipulative abductions. This cognitive character covers the activity of collection as well as that of processing and analysis. The adversarial logic of intelligence, therefore, is the abductive detection of advantageous affordances. However, as we will see in the following section, the adversarial role of abduction does not end with detection, but extends to efforts to defeat the adversary’s inquiries.
The Logic of Deception
In a paper on the cognitive mechanisms of adversarial problem-solving (APS), Thagard (1992) notes that abductive inquiries into the adversary are not only useful for avoiding dangerous surprises but are also of great utility in manipulating and misleading the adversary’s inquiries, an activity we have referred to by the notion of deception. The following section builds on the previous discussion by showing how deception exploits the abductive operations of detection, giving rise to a form of adversarial practice in which the highest aim is the defeat of inquiry itself.
Inquiry and Deception
As observed by Arrigo in their theorization of adversarial epistemology, and by Thagard in their work on APS, the highest aim of an adversarial inquirer is to “anticipate, understand and counteract the actions of an opponent” (Thagard, 1992, p. 123). Unlike in cooperative and scientific inquiry, the epistemic contribution of these cognitive outcomes is not to help inquirers move towards truth but to produce situational and strategic advantages, which, as we have seen above, are realized by detecting affordances for adversarial action and cognition. The abductive detection of advantageous affordances contributes to the process of modeling the adversary, allowing further abductions about their capabilities and intentions, a process embodied in the practice and institutions of intelligence. However, one of the major problems for adversarial rationality is that in developing models of the adversary, an agent must always take care to include the model that the adversary has made of them. This problem of reasoning about the reasoning of opposed agents is the dynamic that is thought to define a given situation as “strategic” (Goffman, 1970, p. 145) and which Thagard takes as being the defining aspect of adversarial problem-solving (Thagard, 1992, p. 123). However, not only is it important for adversarial actors to discover and model their opponent’s plans but also to detect their adversary’s inquiries, since such knowledge might afford them the opportunity to mislead the adversary’s efforts to detect and thwart their own plans. Awareness of the adversary’s inquiry not only leads to efforts to conceal any signs that might afford advantage to the
adversary but also efforts to create fraudulent forms of signification to mislead their inquiries (Thagard, 1992, pp. 125–126; Goffman, 1970, pp. 3–13). This is the logic of deception between adversarial inquirers, a complex agent-interaction whose epistemic dynamics resemble a mirror-maze of suspicion, surprise, doubt, and uncertainty. To model this complex process, Thagard develops principles of APS:
1. Construct a model of the opponent, O, involving O’s situation, past behavior, general goals, value scale, degree of competitiveness, and attitude toward risk.
2. Make sure that your model of O includes O’s model of you, because O’s responses to your actions will depend in part on how your actions are interpreted.
3. Use this model to infer O’s plans and add the inferred plans to the model.
4. Use this enhanced model to infer O’s likely actions and likely response to your actions.
5. Combine your model of yourself, O, and the environment to make a decision about the best course of action.
6. In particular, use your model of O to predict possible effective actions that O might not expect, and that, therefore, would be more effective because of the element of surprise.
7. Take steps to conceal your plans from O and to deceive the opponent about your plans. (Thagard, 1992, p. 130)
As Thagard observes, concurring with Arrigo, only the principles of concealment and deception are typical of adversarial problem-solving and adversarial rationality (Thagard, 1992, p. 130). Additionally, Thagard also recognizes that the forms of inquiry necessary to undertake successful surprise and adversarial modeling are bounded by the dangerous, time-limited, and information-constrained dynamics of adversarial interactions. Thus, Thagard proposes that the procedures underlying efforts to deceive an adversary’s detections are the inferential operations of abduction (Thagard, 1992, pp. 133–135). 
But while Thagard points to the centrality of abduction in creating the hypothetical and representational models with which adversarial action is planned, the analysis does not reveal how exactly this deception takes place, and whether abductive inferences play a further role beyond the operations of detection.
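The principles of APS listed above can be read as a rough procedure. The following Python sketch is only an illustration under strong simplifying assumptions and is not Thagard’s explanatory-coherence implementation: the class `OpponentModel`, the functions `infer_plans` and `choose_action`, the sign-overlap scoring, the `surprise_bonus` parameter, and all example data are hypothetical names and values introduced here. It covers principles 1 to 3, 5, and 6 in miniature; concealment and deception (principle 7) are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class OpponentModel:
    """Hypothetical model of opponent O (principle 1)."""
    goals: set
    expects_from_us: set  # O's model of us (principle 2): actions O anticipates
    inferred_plans: list = field(default_factory=list)


def infer_plans(model, observed_signs, plan_signs):
    """Principle 3: abduce the plans whose expected signs best cover the observations."""
    scores = {plan: len(signs & observed_signs) for plan, signs in plan_signs.items()}
    best = max(scores.values(), default=0)
    # Keep every plan that ties for the best non-zero overlap with the observed signs.
    model.inferred_plans = sorted(p for p, s in scores.items() if s == best and s > 0)
    return model.inferred_plans


def choose_action(model, payoffs, surprise_bonus=1.0):
    """Principles 5-6: prefer actions that pay off against the inferred plans
    and that O, according to our model of O's model of us, does not expect."""
    def value(action):
        base = sum(payoffs[action].get(plan, 0.0) for plan in model.inferred_plans)
        unexpected = action not in model.expects_from_us
        return base + (surprise_bonus if unexpected else 0.0)
    return max(sorted(payoffs), key=value)


# Invented example data, for illustration only.
o = OpponentModel(goals={"hold_bridge"}, expects_from_us={"frontal_assault"})
plan_signs = {
    "defend_bridge": {"troops_at_bridge", "fortifications"},
    "withdraw": {"convoys_leaving"},
}
plans = infer_plans(o, {"troops_at_bridge", "fortifications"}, plan_signs)
payoffs = {
    "frontal_assault": {"defend_bridge": 1.0},
    "flanking_move": {"defend_bridge": 1.0},
}
action = choose_action(o, payoffs)
print(plans, action)
```

In this toy setting both actions are equally effective against the inferred plan, so the choice is decided by the surprise term alone, which is one simple way of expressing the sixth principle.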
The Semiotics of Deception
While abduction certainly appears necessary for the planning of deception, there is another piece of the puzzle that must be found before we can produce a useful theoretical model of the role of abduction in adversarial inquiry. For this, we turn to more recent work in the pragmatistic tradition, in which we can see that deception takes place through the semiotic manipulation of the adversary’s abductive efforts at detection.
S. Forsythe
In their study of the semiotics of biological and linguistic camouflage, Bertolotti et al. (2014) examine the inferential and abductive dynamics implicated in both detection and deception strategies. Camouflage in this context refers to a type of deception, the ability “to make something appear as different from what it is, or not to make it appear at all” (ibid., p. 66). The biological concept of camouflage describes “a range of strategies used by organisms to dissimulate their presence in the environment” and is often borrowed by other semantic fields, insofar as “it is possible to camouflage one’s position, intentions, opinion, etc.” (ibid., p. 66). Within the context of an adversarial ecology, “[e]very organism normally attempts to detect the presence of other agents and hide its own presence from other agents in the surroundings [ . . . ] Both predators and prey simultaneously behave accordingly, as organisms tend to avoid recognition by both their predators and their prey” (ibid., p. 68). Animals must rely on their senses to detect the presence of other organisms – or, in the context of human adversariality such as warfare, on the perceptual apparatus of military technology. In either case, what “the senses pick up is not an immediate picture of external agency but a more or less rich complex of signs” (ibid., p. 68). Mirroring our own argument in this chapter, Bertolotti et al. consider abduction to be the inferential and cognitive process by which both deceiver and detector identify and infer from sign activity. In the context of the discussion of camouflage, what is abductively inferred are environmental affordances (ibid., p. 68).
In an adversarial environment, “survival, for any animate organism, is a matter of coping with the environment, and the relationship with the environment is mediated by a series of cues the organism must make sense of in order to generate, even if tacitly, some knowledge it did not possess before” (ibid., p. 70). Thus, detectors and deceivers engage in an effort to produce signs that fall outside of the adversary’s detection mechanisms, or which cause the adversary to make false inferences, resulting in them acquiring and acting upon beliefs that, while briefly appearing as suitable opportunities for action, will result in a failure of their intended act. This, as we can see, is a clear epistemological description of the inferential dynamics of camouflage, mimicry, feints, and other forms of deception. In the context of ecological predation, Bertolotti et al. describe these inferential dynamics of camouflage as “abductive warfare,” and argue that every agent has a twofold inferential relevance, active and passive: on the one hand, it disperses signs out in its environment, and on the other hand, it receives and processes signs from other organisms. The former dynamic must be minimized while the latter must be maximized, both to counteract predation and to avoid being spotted by a potential adversary (ibid., p. 72). Crypsis is the ecological term that describes the morphological means of stealth in organisms, in which they “minimise the extent to which the signs of [their] agency contrasts against the background environment” (ibid., p. 73). In military affairs, this corresponds to camouflage patterning and materials, stealth technology, and the skills and methods of fieldcraft and operations. In intelligence, it is represented by the skills of the operator in counter-surveillance, false identity, and the evasion of adversary security measures. Masquerade is a semiotically different kind of camouflage, where rather than merging with the background, organisms
display or communicate explicit signs which are intended to be misidentified as other innocuous or inert objects. In military deception, this might entail actions such as disguising missile batteries as geological features. Also pertinent are forms of kinesthetic camouflage, relying on alterations of the organism’s semiotic “shadow,” not preventing detection or recognition, but preventing “an effective prediction of their spatial bearings” (ibid., p. 74). Within the context of an adversarial interaction, whether of a predatory or political nature, Bertolotti et al. remind us that the “various cognitive biological agents involved must reason on the basis of incomplete or uncertain information: an appraisal of the necessary implications of spatial relations and actions is only slowly and progressively reached through a cycle of continuous updating of spatial mappings” (ibid., p. 74). This is to say that the abductive process of detection and deception involves the collection, analysis, and continuous comparison and verification of information with the existing internal model of the situation, a process that resembles both the intelligence cycle and Thagard’s model of APS, with their iterative procedures of direction, collection, analysis, and dissemination. This dynamic of observation and comparison is part of an adversarial-abductive process, where detection and deception proceed by way of hypothetical inferences, with agents constantly on the lookout for signs of deception while continually transmitting signs that they have themselves hypothesized as being effective to deceive the internal model constructed of them by the adversary. This observation of the continuity in the deception and detection dynamics in biological and political ecologies brings us to the linguistic aspects of camouflage and deception.
Whereas in ecological camouflage the intention is to provide an adversarial observer with signs that lead to incorrect inferences, “communicative camouflage involves the display of a series of semantical and performative acts likely to mislead one’s interlocutors, by shielding from intellection [one’s] actual beliefs, intentions, etc.” (ibid., p. 77). This has implications, for example, for the practice of adversarial politics, where deception or malicious persuasion occurs through the media of speech and written communication. Whereas in a biological framework, a successful deception results in the organism’s survival, “debunking a fallacious argument has instead to do with the assessment of whether what is being uttered corresponds to a state of things in the world” (ibid., p. 79). In this case, the affordances that are exploited by the deceiver are beliefs, expectations, knowledge, and other epistemic features of human communication, which, in cooperative contexts, form the ethical and organizational aspects necessary to conduct inquiries whose end is truth, rather than advantage.
Deceiving Abductions
The insights into the inferential and semiotic aspects of camouflage lead us to the final piece of the puzzle regarding the role of abduction in deception. In a recent study of abduction and deceptive dynamics, Fanti Rovetta (2020) reveals that the inferential dynamics of deception are such that when one agent deceives another,
not only do they do so using the abductive methods of detection, modeling, and signification detailed above, but their deceptions ultimately succeed by manipulating their target’s hypothetical, that is, abductive, inferences (Fanti Rovetta, 2020; see also the chapter by Fanti Rovetta in this volume). Fanti Rovetta notes that:
In order to understand how abductive reasoning is tied to deception it is necessary to shift the focus of attention from the agent that makes the abductive inference to another agent, which is external to the process and wants to direct it. Since abduction is highly contextual, by modifying the context, e.g. by disseminating appropriate clues, it is possible in principle to indirectly suggest a hypothesis.
This suggestion reinforces our thesis that in addition to the abductions of detection and strategic modeling of affordances for deception, abduction plays a further role as the cognitive affordance of the adversary that is targeted by deceptive semiosis. Fanti Rovetta describes the cognitive operations necessary for this to take place. Firstly, it involves “mutual reinforcing clues and skill-dependency”; once we have modeled the background knowledge and skills possessed by the target of our deception (hypothetical beliefs we have acquired through our detection operations), then we can include in our model further hypotheses about what they will look for, where they will direct their attention, and what they will expect in a given situation (ibid., p. 5). Subsequently, “several intentionally crafted and mutually reinforcing signs are left to be discovered by the agent we desire to deceive” (ibid., p. 5). These deceptive signs function as “fraudulent-affordances,” designed to appear indexical and unintentional, and thus likely (i.e., hypothetically plausible) to induce a false hypothesis in the mind of the deception target. A further cognitive skill required of the deceiver is the capacity to “match drawn implications,” which involves “the act of forecasting the implications drawn from the deceitful hypothesis in order to ensure that subsequent empirical verification meets such expectations” (ibid., p. 5). Since a given hypothesis presents an infinite number of implications, “the deceiver needs to forecast those that will be drawn by the deceived person, in order to match these expectations in further investigations” (ibid., p. 5). To achieve this, “the deception has to be crafted by considering what the deceived holds as plausible and relevant. This also means that any information which may contradict the deceiving hypothesis and its implications need to be hidden” (ibid., pp. 5–6).
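The operations just described (modeling the target’s abductive habits, planting mutually reinforcing clues, and hiding contradicting information) can be given a toy computational reading. The Python sketch below is an assumption-laden illustration, not a formalization proposed by Fanti Rovetta: the functions `target_abduces` and `plan_deception`, the prior-weighted coverage score, the `max_planted` parameter, and the example clues are all hypothetical. The deceiver simulates the target’s abduction and searches for the smallest set of fabricated clues that leads the modeled target to the desired false hypothesis.

```python
from itertools import combinations


def target_abduces(clues, explains, prior):
    """Hypothetical model of the target: choose the hypothesis with the best
    prior-weighted coverage of the clues that are visible to the target."""
    def score(hypothesis):
        return prior[hypothesis] * (1 + len(explains[hypothesis] & clues))
    return max(sorted(explains), key=score)


def plan_deception(true_clues, fabricable, explains, prior, desired, max_planted=2):
    """Return (visible genuine clues, planted clues) that make the modeled
    target infer `desired`, or None if no such plan is found."""
    # Hide every genuine clue that the desired (false) hypothesis cannot explain.
    visible = {clue for clue in true_clues if clue in explains[desired]}
    # Try ever larger sets of planted clues, so the smallest sufficient set wins.
    for k in range(max_planted + 1):
        for planted in combinations(sorted(fabricable), k):
            if target_abduces(visible | set(planted), explains, prior) == desired:
                return visible, set(planted)
    return None


# Invented example: the target is biased towards expecting an attack in the north.
explains = {
    "attack_north": {"tracks_north", "radio_north"},
    "attack_south": {"tracks_south", "radio_south", "dummy_tanks_south"},
}
prior = {"attack_north": 0.7, "attack_south": 0.3}
result = plan_deception(
    true_clues={"tracks_north"},
    fabricable={"radio_south", "dummy_tanks_south"},
    explains=explains,
    prior=prior,
    desired="attack_south",
)
print(result)
```

Because the modeled target starts with a bias towards the true hypothesis, a single planted clue is not enough in this example; only two clues that reinforce each other tip the simulated inference, which loosely mirrors the role of mutually reinforcing signs and of the target’s biases in the account above.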
Fanti Rovetta’s description of this adversarial modeling mirrors Arrigo’s, Thagard’s, and Bertolotti et al.’s accounts of adversarial-inferential dynamics, in which the epistemic products of adversarial inquiries are exploited to support semiotic operations of deception. The final deceptive exploitation of abduction considered by Fanti Rovetta lies in the “exploitation of biases” (ibid., p. 6). Because “the link between abductive reasoning and deception resides in belief-formation,” knowledge about the deception target’s habitual biases affords the deceiver an opportunity to exploit these cognitive heuristics by predicting which hypotheses will be considered more plausible, which in itself suggests which semiotic cues to fabricate for the purpose of inducing particular inferences (ibid., p. 6). As Fanti Rovetta notes, “[t]his means that to deceive is not simply to impose a false hypothesis, because the process may easily
fail, but it also requires assessing what hypotheses the target of the deception may more easily infer.” Fanti Rovetta’s account gives the final details necessary to see that the adversarial role of abduction is not only to discover and anticipate dangerous surprises but is in fact necessary at every stage of the adversarial interaction, from the detection of information hidden by the adversary to the construction of hypothetical models that include their models of the situation and the other agents within it, and even extends to the act of deception itself, in which models of the adversary’s abductive habits are used to design the false semiotic signatures that mislead processes of hypothesis and belief-formation. In this sense, the logic of deception is the defeat of inquiry itself.
Conclusion: Abduction and Adversarial Rationality
In this chapter, the aim has been to develop adversarial conceptions of abductive cognition and reasoning. To do this, it was first necessary to create an adversarial conception of inquiry useful for determining how abduction might play a role in the discovery and communication that leads not to truth but to advantage over an adversary. This led to the conception of two modes of adversarial abduction: detection and deception. The section that followed elaborated the concept of detection by looking at a practical example of adversarial inquiry – the conduct of political-military intelligence – which revealed that abduction defines every stage of adversarial detection, from the perceptual observations of minute environmental details and affordances to the production of hypotheses and cognitive models of the adversary. The section on deception found that not only are abductions useful for discovering opportunities to deceive the adversary, but that through the manipulation of semiotic phenomena deception succeeds by misleading the abductive detections of the adversary during the course of their inquiries. Despite its relatively abstract mode of theorizing, this initial inquiry adds to the growing body of research into the adversarial aspects of abductive cognition (some of which is featured in this volume), indicating fruitful areas for future research. With clear conceptions of detection and deception in hand, it ought now to be possible to extend the theoretical and empirical study of abduction into domains where adversariality interrupts the cooperative conduct of inquiry.
References
Aliseda, A. (2004). Logics in scientific discovery. Foundations of Science, 9(3), 339–363. https://doi.org/10.1023/B:FODA.0000042847.62285.81
Arrigo, J. M. (2000). The ethics of weapons research – A framework for moral discourse between insiders and outsiders (1). Journal of Power and Ethics: An Interdisciplinary Review, 1(4), 302–327.
Auxier, R. E. (2018). Eco, Peirce, and the pragmatic theory of signs. European Journal of Pragmatism and American Philosophy, X(1), 1. https://doi.org/10.4000/ejpap.1112
Bardone, E. (2011). Seeking chances (Vol. 13). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-19633-1
Bertolotti, T., Magnani, L., & Bardone, E. (2014). Camouflaging truth: A biological, argumentative and epistemological outlook from biological to linguistic camouflage. Journal of Cognition and Culture, 14(1–2), 65–91. https://doi.org/10.1163/15685373-12342111
Betts, R. K. (1982). Surprise attack: Lessons for defense planning (1st ed.). Brookings Institution Press.
Danesi, M. (2014). Signs of crime: Introducing forensic semiotics. De Gruyter Mouton.
Davies, P. H. J., & Gustafson, K. (2013). The intelligence cycle is dead, long live the intelligence cycle: Rethinking intelligence fundamentals for a new intelligence doctrine. In Understanding the intelligence cycle. Routledge.
Detienne, M., & Vernant, J.-P. (1991). Cunning intelligence in Greek culture and society. University of Chicago Press.
Eco, U., & Sebeok, T. A. (Eds.). (1983). The sign of three: Dupin, Holmes, Peirce. Indiana University Press.
Fanti Rovetta, F. (2020). Framing deceptive dynamics in terms of abductive cognition. Profil, 21(1), 1. https://doi.org/10.5817/pf20-1-2043
Galison, P. (2000). Pragmatism at War. In J. Ockman (Ed.), The Pragmatist imagination: Thinking about ‘Things in the making’ (pp. 148–155). Princeton Architectural Press.
Gibson, J. J. (1986). The ecological approach to visual perception. Erlbaum.
Goffman, E. (1970). Strategic interaction. University of Pennsylvania Press.
Herman, M. (1996). Intelligence power in peace and war. Cambridge University Press.
Herman, M. (2001). Intelligence services in the information age. Routledge. https://doi.org/10.4324/9780203479667
Horn, E. (2003). Knowing the enemy: The epistemology of secret intelligence (S. Ogger, Trans.). Grey Room, 11, 58–85. https://doi.org/10.1162/15263810360661435
Jervis, R. (2010). Why intelligence fails: Lessons from the Iranian Revolution and the Iraq War. Cornell University Press.
Jullien, F. (2004). A treatise on efficacy: Between Western and Chinese thinking. University of Hawai’i Press.
Kent, S. (1966). Strategic intelligence for American world policy. Princeton University Press. http://archive.org/details/strategicintelli0000kent
Magnani, L. (2001). Abduction, reason and science. Springer US. https://doi.org/10.1007/978-1-4419-8562-0
Magnani, L. (2009). Abductive cognition: The epistemological and eco-cognitive dimensions of hypothetical reasoning (Vol. 3). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-03631-6
Magnani, L., & Bardone, E. (2008). Sharing representations and creating chances through cognitive niche construction. The role of affordances and abduction. In S. Iwata, Y. Ohsawa, S. Tsumoto, N. Zhong, Y. Shi, & L. Magnani (Eds.), Communications and discoveries from multidisciplinary data (Vol. 123, pp. 3–40). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-78733-4_1
Magnani, L., Nersessian, N. J., & Thagard, P. (Eds.). (1999). Model-based reasoning in scientific discovery. Springer US. https://doi.org/10.1007/978-1-4615-4813-3
Misak, C. (2000). Truth, politics, morality: Pragmatism and deliberation. Taylor & Francis. https://doi.org/10.4324/9780203283523
Neumann, J. V., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton University Press.
Peirce, C. S. (1934–63). In C. Hartshorne & P. Weiss (Eds.), Collected papers of Charles Sanders Peirce (Vol. 1–7). Belknap Press of Harvard University.
Phythian, M. (2013). Understanding the intelligence cycle (1st ed.). Routledge. https://doi.org/10.4324/9780203558478
Scott, L., & Jackson, P. (2004). The study of intelligence in theory and practice. Intelligence and National Security, 19(2), 139–169. https://doi.org/10.1080/0268452042000302930
Shakarian, P., & Subrahmanian, V. S. (2011). Geospatial abduction. Springer. https://doi.org/10.1007/978-1-4614-1794-1
Shakarian, P., Nagel, M. K., Schuetzle, B. E., & Subrahmanian, V. S. (2011). Abductive Inference for Combat: Using SCARE-S2 to Find High-Value Targets in Afghanistan. AAAI-11 / IAAI-11 Proceedings of the 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference, Proceedings of the National Conference on Artificial Intelligence, 2 November 2011, 1689–94.
Sims, J. E. (2022). Decision advantage: Intelligence in international politics from the Spanish Armada to cyberwar. Oxford University Press.
Thagard, P. (1992). Adversarial problem solving: Modeling an opponent using explanatory coherence. Cognitive Science, 16(1), 123–149. https://doi.org/10.1207/s15516709cog1601_4
Thomas, S. T. (1988). Assessing current intelligence studies. International Journal of Intelligence and Counter Intelligence, 2(2), 217–244. https://doi.org/10.1080/08850608808435061
Vrist Rønn, K., & Høffding, S. (2013). The epistemic status of intelligence: An epistemological contribution to the understanding of intelligence. Intelligence and National Security, 28(5), 694–716. https://doi.org/10.1080/02684527.2012.701438
Warner, M. (2002). Wanted: A definition of intelligence. Studies in Intelligence, 46(3), 15–22.
Warner, M. (2013). The past and future of the intelligence cycle. In Understanding the intelligence cycle. Routledge.
Weizman, E. (2017). Forensic architecture: Violence at the threshold of detectability. Zone Books. https://doi.org/10.2307/j.ctv14gphth
Wheaton, K. J., & Beerbower, M. T. (2006). Towards a New Definition of Intelligence 17 (n.d.): 12.
Wohlstetter, R. (1962). Pearl Harbor: Warning and decision. Stanford University Press.
69 Abduction and Violence: Hypothetical Cognition in the Entanglement of Morality and Violence
Lorenzo Magnani
Contents
Abduction, Pregnances, and Affordances in an Eco-cognitive Perspective
Abductive Cognition
Saliences and Pregnances as Biological and Cognitive Mediators of Morality and Violence
Moral Pregnances and Affordances in an Eco-cognitive Perspective
The Naturalness of Proto-morality and Violent Punishment
The Moral/Violent Abductive Powers of Pregnant Linguistic Signs
Abductive Cognition in Violent Mobbing and Scapegoating
Mimetic Desire, Abductive Emotions, and Sacrifices
Fallacies, Hypotheses, and Distributed “Military” Intelligence
Judging People Accused of Violence: Abductive Evaluation and Assessment of Evidence and of Fallacious Narratives
Conclusion
References
Abstract
The relationship between moral and violent behavior and the related role of abduction are still overlooked in current philosophical, epistemological, and cognitive studies. In this chapter, with the aim of clarifying the complex dynamics of this interplay and adopting an eco-cognitive perspective, the concepts of salience and pregnance and the concepts of abduction and affordance will be described. New light will be shed on the strict relationships between these behaviors by offering a wide and unified perspective rooted
L. Magnani, Department of Humanities, Philosophy Section and Computational Philosophy Laboratory, University of Pavia, Pavia, Italy
© Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_33
in a morphodynamical framework in which physical, biological, and cognitive processes can be simultaneously analyzed and the related role of abduction explained. The final part of the chapter will deepen the analysis of two relevant issues: (1) the role of salience and pregnances as linguistic instruments which are fundamental in substantiating that “military intelligence” in which typical moral and violent conducts, such as bullying and scapegoating, can be clearly and synthetically explained, in a unified abductive perspective, and (2) the role of fallacies (abduction included) in performing events of language-based violence and the analysis of the basic inferential abductive processes that substantiate the judgments regarding individuals accused of violence and the evaluation of the concomitant fallacious narratives. Finally, the role of the so-called coalition enforcement is delineated, especially useful in integrating the treated topics, thanks to the illustration of the relationship between morality and violence from a paleoanthropological perspective.
Keywords
Abduction · Violence · EC-Model of abduction · Salience · Pregnance · Cognitive mediators · Moral affordances · Proto-morality · Violent punishment · Linguistic signs · Mobbing · Scapegoating · Sacrifice · Mimetic desire · Military intelligence · Fallacies · Judging violence · Coalition enforcement
Abduction, Pregnances, and Affordances in an Eco-cognitive Perspective
Abductive Cognition
Abduction is a logical, epistemological, and cognitive concept that refers to all kinds of human and non-human animals’ hypothetical cognition. When hypotheses are generated – whether creatively invented, as in science, or merely of a diagnostic kind, such as in the case of physicians and detectives – we face a process of abductive cognition. This chapter will also explain why it is important to take this kind of reasoning into account when dealing with the entanglement between morality and violence. To know the conceptual details of abductive cognition, it is necessary to comprehend its role in the framework of salient/pregnant processes that will be illustrated in the following subsection. Stating that abduction, morality, and violence are entangled does not only mean that reasoning, morality, and violence can be studied together, but rather that it is beneficial to study them together. The word entanglement is clearly borrowed from the language of quantum physics: even if each of those three distinctively human aspects has its own theoretical dignity, many of the behaviors they deal with are deeply entangled, so that ignoring one aspect or the other may cause a philosophical misperception of the matter at stake.
This occurs, for instance, when one fails to appreciate how the inferential abductive dimension in a moral judgment and its enactment can lead to violent outcomes or, conversely, how moral priorities strongly inform and override our best hypothetical reasonings. To give an example, it is well known (and in recent years also well studied) that gossip displays such powerful entanglements (Bertolotti and Magnani, 2014), in which abductive hypothetical reasoning about absent people is fundamental. The same holds for any epistemological approach to religion, which cannot overlook how the violence entailed by religious cognition is rooted both in the moral assumptions and in the inferential regime that are typical of religion (Bertolotti, 2015), and, more generally, for the philosophical approach to the relationships between morality and violence (Magnani, 2011). The entanglement described above takes advantage of another, more fundamental one, between epistemology and ethics, that has tacitly emerged in recent years, transcending the philosophical impasses of the is/ought debate. It seems, for example, to be more strongly nested in applied epistemology: David Coady explicitly connects the origins of applied epistemology to the tradition of applied ethics (Coady, 2012, p. 1 and ff.), highlighting a theoretical practice of mutual borrowing that has characterized the different branches of philosophy since the very beginning. Clearing up the relationship, and the entanglement, between epistemology and ethics helps to shed light on the entangled relationship indicated above, that is, the one between abduction, morality, and violence. Indeed, the understanding of each theoretical entanglement (epistemology-ethics, abduction-morality-violence) rests on the understanding of the other, as the five poles are connected in a double system that will be explored in this chapter.
The problem of the relationship between abduction, morality, and violence can be usefully described taking advantage of Thom’s theory of morphogenesis, based on catastrophe theory. It is in this light that the relationship can be comprehended as an ordinary semiophysical process. Indeed, Thom considered the use of models in catastrophe theory as illustrating semiophysical processes, which in the case of cognition express what he called a “physics of meaning” (Thom, 1988, Foreword). Furthermore, in the framework of catastrophe theory, it is very simple and clear to see and understand the constitutive moral and at the same time violent nature of human natural language, also delineated in a philosophical perspective in Magnani (2011, Chapter 1). To understand the basic tenets of catastrophe theory, it is useful to exploit the concept of abduction, which refers to the role of “guessing hypotheses” in human and non-human animal cognition. Abduction is a popular term in many fields of AI, such as diagnosis, planning, natural language processing, motivation analysis, logic programming, and probability theory. Moreover, abduction is important in the interplay between AI and philosophy; cognitive science; historical, temporal, and narrative reasoning; decision-making; legal reasoning; and emotional cognition. Readers interested in the classical and more recent literature on abduction can refer to Magnani (2001, 2009, 2017, 2022). To illustrate the concept of abduction, let us consider the following interesting passage, from an article by Simon (1965), dealing with the logic of normative theories:
The problem-solving process is not a process of “deducing” one set of imperatives (the performance programme) from another set (the goals). Instead, it is a process of selective trial and error, using heuristic rules derived from previous experience, that is sometimes successful in discovering means that are more or less efficacious in attaining some end. If we want a name for it, we can appropriately use the name coined by Peirce and revived recently by Norwood Hanson (1958): it is a retroductive process. The nature of this process – which has been sketched roughly here – is the main subject of the theory of problem-solving in both its positive and normative versions (Simon, 1977, p. 151).
Simon states that discovering means that are more or less efficacious in attaining some end is performed by a retroductive process. He goes on to show that it is easy to obtain one set of imperatives from another set by processes of discovery or retroduction and that the relation between the initial set and the derived set is not a relation of logical implication. Simon is right: retroduction (i.e., abduction, cf. below) is the main subject of the theory of problem-solving, and developments in the fields of cognitive science and artificial intelligence have strengthened this conviction. Hanson (1958, p. 54) is perfectly aware of the fact that an enormous range of explanations (and causes) exists for any event: There are as many causes of x as there are explanations of x. Consider how the cause of death might have been set out by a physician as “multiple hemorrhage”, by the barrister as “negligence on the part of the driver”, by a carriage-builder as “a defect in the brakeblock construction”, by a civic planner as “the presence of tall shrubbery at that turning.”
The word “retroduction” used by Simon is the Hansonian neopositivistic one replacing the Peircean classical word abduction. Following Hanson’s perspective, Peirce “[. . . ] regards an abductive inference (such as ‘The observed position of Mars falls between a circle and an oval, so the orbit must be an ellipse’) and a perceptual judgment (such as ‘It is laevorotatory’) as being opposite sides of the same coin.” It is also well known that Hanson relates abduction to the role of patterns in reasoning and to the Wittgensteinian “seeing that” (Hanson, 1958, p. 86). As Fetzer has stressed, from a philosophical point of view, the main modes of argumentation for reasoning from premises to conclusions are expressed by these three general kinds of reasoning: deductive (demonstrative, non-ampliative, additive), inductive (non-demonstrative, ampliative, non-additive), and fallacious (neither, irrelevant, ambiguous). Abduction, which expresses likelihood in reasoning, is a typical form of fallacious inference (at least in the perspective of classical logic): “[. . . ] it is a matter of utilizing the principle of maximum likelihood in order to formalize a pattern of reasoning known as ‘inference to the best explanation’” (Fetzer, 1990, p. 103), studied, for example, by Harman (1965, 1968), Thagard (1987), and Lipton (2004). To conclude this short digression on abduction, it is worth recalling the distinction introduced in an article of 1988 (Magnani, 1988) and further illustrated in Magnani (2001) between creative abduction that generates new hypotheses, which could lead to finding an unexpected solution in the uninvestigated field, for example, in scientific discovery, and selective abduction – for example, in diagnostic reasoning, where abduction is merely seen as an activity of “selecting” from an encyclopedia
69 Abduction and Violence
of pre-stored hypotheses – and the one between two kinds of abduction, theoretical and manipulative, described in Magnani (2009). Theoretical abduction mainly takes advantage of internal cognitive resources; manipulative abduction also and primarily exploits all kinds of external representations, cognitive mediators, and cognitive artifacts (consider, for instance, the use of epistemic mediators in scientific practice, such as computational representations or in vitro models). In particular, manipulative abduction shows how it is possible to find methods of manipulative constructivity, with the aim of making hypotheses. In both cases, various kinds of representations can work, from model-based representations (e.g., icons, diagrams, spatial frameworks, etc.) to propositional ones. Still, in both cases, two main cognitive aspects of abduction can be found, as anticipated above: (1) abduction that only generates “plausible” hypotheses (“selective” or “creative”) (further analysis of this important concept is provided in Magnani (2009, Chapter 2)) and (2) abduction considered as inference “to the best explanation,” which also evaluates hypotheses (cf. Fig. 1). An illustration of creative abduction from the field of medical knowledge is represented by the discovery of a new disease and the manifestations it causes. Therefore, “creative” abduction deals with the whole field of the growth of scientific knowledge. This is irrelevant in medical diagnosis where instead the task is to abductively “select” from an encyclopedia of pre-stored diagnostic entities. Both kinds of inference, selective and creative, are called ampliative because in both cases the reasoning involved amplifies, or goes beyond, the information incorporated in the premises (Magnani, 1992).
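The two cognitive aspects just distinguished, generating plausible hypotheses and evaluating them as the best explanation, can be made concrete with a minimal sketch of selective abduction. It assumes, purely for illustration, a toy “encyclopedia” of diagnostic hypotheses with invented likelihood values, and it adopts the maximum-likelihood reading of inference to the best explanation mentioned by Fetzer. All names and numbers below are hypothetical and carry no medical meaning.

```python
from typing import Dict, List, Optional

# Hypothetical toy encyclopedia: P(finding | hypothesis).
# The values are invented for illustration only.
ENCYCLOPEDIA: Dict[str, Dict[str, float]] = {
    "influenza": {"fever": 0.9, "cough": 0.8, "rash": 0.05},
    "measles": {"fever": 0.9, "cough": 0.5, "rash": 0.95},
    "common_cold": {"fever": 0.2, "cough": 0.7, "rash": 0.01},
}


def likelihood(hypothesis: str, findings: List[str]) -> float:
    """Likelihood of the findings under one hypothesis (independence assumed)."""
    table = ENCYCLOPEDIA[hypothesis]
    result = 1.0
    for finding in findings:
        # A finding the hypothesis says nothing about gets likelihood 0.
        result *= table.get(finding, 0.0)
    return result


def generate_plausible(findings: List[str], threshold: float = 0.01) -> List[str]:
    """Aspect (1): keep only the pre-stored hypotheses that are plausible."""
    return [h for h in ENCYCLOPEDIA if likelihood(h, findings) >= threshold]


def select_best(findings: List[str]) -> Optional[str]:
    """Aspect (2): evaluate the plausible hypotheses and pick the best one."""
    candidates = generate_plausible(findings)
    if not candidates:
        return None  # nothing in the encyclopedia explains the findings
    return max(candidates, key=lambda h: likelihood(h, findings))


best = select_best(["fever", "rash"])
```

The sketch only selects among hypotheses that already exist. Creative abduction would correspond to adding a genuinely new entry to the encyclopedia, which this toy model does not attempt to capture.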
Fig. 1 Creative and selective abduction
Taking advantage of the concept of abduction, it is important first of all to clarify in the following paragraphs the notions of pregnance and salience, which play an important role in catastrophe theory. An example of a special case of abduction, instinctual (and putatively “unconscious”), is given by the case of certain cognitive abilities embodied in animals. These abilities are in turn capable of leading to some appropriate behaviors: as Peirce said, abduction even takes place
L. Magnani
when a newborn chick picks up the right sort of corn. Following Peirce, this is an example of spontaneous abduction – analogous to the case of other hardwired unconscious/embodied abductive processes in human beings: When a chicken first emerges from the shell, it does not try fifty random ways of appeasing its hunger, but within five minutes is picking up food, choosing as it picks, and picking what it aims to pick. That is not reasoning, because it is not done deliberately; but in every respect but that, it is just like abductive inference. (“The proper treatment of hypotheses: a preliminary chapter, toward an examination of Hume’s argument against miracles, in its logic and in its history” [1901], in (Peirce, 1966, p. 692)).
It is clear that Peirce considers hypothesis generation a largely instinctual endowment of human beings given by God or related to a kind of Galilean lume naturale: “It is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses it will light upon the correct hypothesis” (Peirce, CP, 7.220). Instinct is of course considered by Peirce as in part conscious: it is “always partially controlled by the deliberate exercise of imagination and reflection” (Peirce, CP, 7.381). Hence, the human mind is “akin to truth,” and this tendency is also present in animals. Again, the example of the innate ideas of “every little chicken” is of help to describe this abductive human instinctual endowment: How was it that man was ever led to entertain that true theory? You cannot say that it happened by chance, because the possible theories, if not strictly innumerable, at any rate exceed a trillion – or the third power of a million; and therefore the chances are too overwhelmingly against the single true theory in the twenty or thirty thousand years during which man has been a thinking animal, ever having come into any man’s head. Besides, you cannot seriously think that every little chicken, that is hatched, has to rummage through all possible theories until it lights upon the good idea of picking up something and eating it. On the contrary, you think the chicken has an innate idea of doing this; that is to say, that it can think of this, but has no faculty of thinking anything else. The chicken you say pecks by instinct. But if you are going to think every poor chicken endowed with an innate tendency toward a positive truth, why should you think that to man alone this gift is denied? (Peirce, CP, 5.591).
Saliences and Pregnances as Biological and Cognitive Mediators of Morality and Violence
What is the role played by abduction when dealing with saliences and pregnances as mediators of violence? First of all, it is necessary to provide an illustration of these two important concepts. The concept of pregnance, introduced by Thom (1972, 1980) on the basis of Wertheimer’s Gestaltic concept of Prägnanz, can shed further light on a kind of morphodynamical “physics” of abduction, first of all, in the case of the instinctual hardwired aspects illustrated in the previous subsection. Furthermore, pregnance and salience can in turn become clearer and richer when reframed in the perspective of abductive cognition. As will soon be shown, they are key concepts that can be adopted to analyze important aspects not only of the instinctual but also of the plastic nature of abductive hypothetical cognition. The plastic nature
of cognitive activities refers to all the skillful non-instinctual capacities to make hypotheses, which human beings are able to create, learn, and exploit. What is a pregnance? The complicated – and at first sight obscure – concept of pregnance is based on the concept of salience, which emerges in the dynamical framework of Thom’s “semiophysical” perspective. First of all, it is possible to say that, in general, phenomenal discontinuities are perceived by organisms as salient forms (e.g., in the auditive case, the eruption of a sound in the midst of silence), that is, as contextual effects between forms: “The simplest feature is the punctual discontinuity geometrically represented by a point dividing the real straight line R into two half lines” (Thom, 1988, p. 3). Discontinuities out there in the environment are basically translated into other more or less amplified discontinuities in the subjective sensorial state, as a kind of “echo” or “shock” of the physical environment within an organism. In the case of sensory systems, salience of course is at the basis of the first possibility of perceiving individuated forms. In this case, perception can also be appropriately influenced by a certain form of concept “[. . . ] that is to say a class of equivalence between forms referent to the same concept” (ibid.): the lack of the concept can annihilate the grasping of the individuated form, especially when analysis proceeds from the whole to the parts. To the aim of the present analysis, it is important to stress that the term pregnance can be applied to physical and biological phenomena, but also to the cognitive ones. 
Hence, it can further clarify the distinction between the instinctual chicken abduction above and other plastically acquired abductive ways of cognition: So we will get this general pattern of a world made up of salient forms and pregnances – salient forms being objects, very often individuated, that are impenetrable to one another, and pregnances being occult qualities, efficient virtues that emanate from source-forms and invest other salient forms in which they produce visible effects (that is the so-called “figurative” effects for the organisms invested) (Thom, 1988, p. 2).
Let us explain the passage. First of all, it is important to note that when Thom calls pregnances “occult qualities,” he does so only metaphorically; actually, Thom thinks that pregnances are not occult and mysterious qualities at all, because they could be accounted for as fully explainable psychological phenomena in neurological and biological terms and they can also be made intelligible through mathematical models. The description of the processes affected by pregnance activity aims at providing what Thom calls a “protophysics, source and reservoir of all permanent intuitions, of all those archetypal metaphors that have nourished man’s imagination over the ages” (p. 3). Thom further says: “Pregnances are non-localized entities emitted and received by salient forms. When a salient form ‘seizes’ a pregnance, it is invaded by this pregnance and consequently undergoes transformations in its inner state which can in turn produce outward manifestations in its form: we call these figurative effects” (p. 16). To clarify the two concepts of salience and pregnance, the following two examples can be of some utility [the wide range of events covered by the two concepts is testified by the fact that the first example does not have any cognitive/psychological significance]:
1. an infection (pregnance) contaminates healthy subjects (representing the “invested” form: salience). These subjects in turn re-emit the same infection (pregnance) into the environment. In this case, pregnance has in itself a material/biological support (e.g., a virus) – as a mediator – which in turn is transmitted, thanks to a suitable medium (e.g., air or blood);
2. worker honeybees communicate with each other by means of signs (through the iconic movements of a dance) – pregnance – that express the site where they have found food in order to inform the other conspecific individuals, the invested salience, about the location. In this second case, the pregnance is transmitted – mediated – through undulatory sounds and light signals and produces a neurobiological effect at the destination, that is, in other words, a “psychic” effect [of course in this case, the expression “psychic” can be used only if it is admitted, in a mentalistic and unorthodox way, that honeybees are endowed with a kind of animal psyche: an example regarding a cat or a boy would have been more convincing for the reader. . . ];
3. finally, fields in physics are the true paradigm of objective pregnances in modern science, because in that case it is possible to be theoretically able to calculate their variation in space-time, thanks to a mathematical description (based on an explicit geometrical definition of space-time) (Thom, 1988, p. 32).
As pregnance and salience acquire their meaning in a very naturalistic intellectual framework related to an analysis of complex systems, they are also useful to propose a unified perspective on moral and violent abductive processes and other various cognitive/psychic ones, seen as basic physico-biological events, also endowed with a profound eco-cognitive significance. A pregnance affects an organism, and when this is happening, various related abductive/hypothetical responses are promptly triggered, and they can be biological, physical, and cognitive.
Here, it is important to emphasize that in gregarious animals, the triggered response is often a proto-moral/proto-violent one, given the fact that some non-human animal behaviors can reasonably be called proto-moral, to lessen the anthropomorphic aura of the adjective “moral” (Waal et al., 2006). Hence, pregnances are genetically transmitted but can also be actively and plastically created, for example, through learning and high cognitive capacities, through the formation of multiple forms of hypothetical intelligence. However, to better grasp the concept of pregnance and its relationship with moral and violent social behaviors, a further analysis is needed, which also takes advantage of the concept of affordance.
Moral Pregnances and Affordances in an Eco-cognitive Perspective
The concepts of salience and pregnance have to be linked to the concept of affordance: this last concept can usefully further enrich the concept of abduction with the aim of finally explaining its role in the entanglement between morality and violence.
In general, in the case of salient forms, their impact on the organism’s sensory apparatus “remains transient and short lived” (Thom, 1988, p. 2), so they do not have relevant long-term effects on the behavior of the organisms. To continue and deepen our analysis, it is useful at this point to introduce the concept of affordance. If the fact that environments and organisms evolve and change is acknowledged, and also both their instinctual and cognitive plastic endowments, it is possible to argue that affordances (Gibson, 1951, 1979, 1982) can be related to the variable (degree of) “abducibility” of a configuration of signs. Adopting in this case the semiotic/Peircean lexicon which refers to cognition as sign activity, it is possible to say that a chair affords sitting in the sense that the action of sitting is a result of a sign activity (abductive) in which humans perceive some physical properties (flatness, rigidity, etc.), and therefore they can ordinarily infer that a possible way to cope with a chair is sitting on it (see also Magnani, 2009, Chapter 6). In the case of cognitive events, if the perspective of the affordances is adopted, it is possible to say that salient forms – contrary to pregnant forms – “afford” organisms without triggering relevant modifications either at the level of possible inner rumination or in terms of motor actions. Thom says that when salient forms carry “biological significance,” like in the form of prey for the hungry predator, or the predator for its prey, or in the case of sex and fear, or when a salient form is invested by an infection, the reaction is much bigger and involves the freeing of hormones, emotive excitement, and behavior (or an immune response in the case of the infection) devoted to attracting or repulsing the form: salient forms of this type are called pregnant. 
However, in the perspective of the complexity of animal behaviors, in the non-strictly biological case of cognitive functions, pregnances are still at stake. In this case, pregnances, no matter whether due to innate releasing processes or to complicated, more or less stable internal learned plastic processes and representations (or pseudorepresentations (Bermúdez, 2003), such is the case of non-human animals), are triggered by a very small sensory stimulus (a stimulus “with a little figuration, an olfactory stimulus for instance” (p. 6)). Hence, they represent a relationship with certain special phenomenological aspects that of course are stable to different extents and so can appear and disappear. At some times and in some cases, the special sensitivity to pregnances is disregarded. Like in the case of affordances, this variability and transience can be seen at the level of the differences of pregnance sensitivity among organisms and also at the level of the same organism at subsequent stages of its cognitive and biological development. It is possible to say that a pregnant stimulus is – so to say – highly diagnostic and a trigger to initiate abductive cognition, like in the case of the hardwired pregnance occurring to our Peircean chicken and its food: the chicken promptly reacts when perceiving it. When a pregnance affects an organism, the abductive reaction can be promptly triggered. It must be said that in this case what can be surely seen as a biological/instinctual reaction – reaching the food – is at the same time endowed with a kind of “compacted” cognitive value, like Peirce brilliantly contends: the non-human animal mind is already “akin to truth”: indeed, in this regard, let us reiterate the passage already reported above:
[. . . ] you think the chicken has an innate idea of doing this; that is to say, that it can think of this, but has no faculty of thinking anything else. The chicken you say pecks by instinct. But if you are going to think every poor chicken endowed with an innate tendency toward a positive truth, why should you think that to man alone this gift is denied? (Peirce, CP, 5.591).
Finally, it is necessary to recall that the pregnant character of a form is always “relative” to a receiving subject (or group of subjects), just as in the ecopsychological case of affordances. Pregnances can be abductively activated or created. When a bell ringing is repeated often enough together with the exhibition of a piece of meat to a dog, thanks to Pavlovian conditioning, the alimentary pregnance of meat spreads by contiguity to the salient auditive form, so that the salient form, in this case the sound of the bell, is invested by the alimentary pregnance of the meat; here, the metaphor of the invasive fluid – even if exoteric – can be useful: “So we can look on a pregnance as an invasive fluid spreading through the field of perceived salient forms, the salient form acting as a ‘fissure’ in reality through which seeps the infiltrating fluid of pregnance” (p. 7). To explain the formation of pregnances, Thom exploits the classical Pavlovian perspective. More recent approaches take advantage of Hebbian (Hebb, 1949) and other more adequate learning principles and models, cf., for example, Loula et al. (2010). The propagation can also occur through similarity, taking advantage of the mirroring force of some features. Once the reinforcement is established, the bell – Thom says – refers symbolically in a more or less stable way to the meat. In these cases, it is possible to say that, metaphorically and anthropomorphically, an example of “emergence of meaning” is at play. Of course, extinction of pregnances through the absence of reinforcement is possible, when an organism moves away for a long time from the source form or when the invested salient form is associated with another pregnant form still in the absence of reinforcement. 
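The spreading of the alimentary pregnance to the bell, and its extinction when reinforcement stops, can be illustrated with a minimal numerical sketch. It uses the Rescorla-Wagner update rule, a standard textbook model of Pavlovian conditioning that is not part of Thom’s account and is substituted here only for illustration. The learning rate, the number of trials, and the function name are assumptions.

```python
from typing import List


def rescorla_wagner(trials: List[bool],
                    learning_rate: float = 0.3,
                    initial_strength: float = 0.0) -> List[float]:
    """Track how strongly the bell is 'invested' by the pregnance of the meat.

    Each trial is True when the bell is followed by meat (reinforcement)
    and False when the bell sounds alone.
    """
    strength = initial_strength
    history = []
    for reinforced in trials:
        # The association moves toward 1 when reinforced, toward 0 otherwise.
        target = 1.0 if reinforced else 0.0
        strength += learning_rate * (target - strength)
        history.append(strength)
    return history


# Ten paired presentations (acquisition), then ten bell-only trials (extinction).
curve = rescorla_wagner([True] * 10 + [False] * 10)
```

In this toy run the association rises during the paired trials and decays during the bell-only trials, which mirrors in a very simplified way the investment and extinction of a pregnance described above. Hebbian or other learning models would give different curves.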
From this point of view the “symbolic activity” seen in the above situation is seen as fundamentally linked to biological control systems in two ways: (1) it is an extension of their efficacy (new favorable cognitive abductive chances – new pregnances – are added); and (2) an internal simulation concerning the relationship between the food and its index, the bell, is implemented so that the door is opened to the formation of multiple forms of abductive semiotic cognition (and/or intelligence): The fact that initially, as in the Pavlovian schema, this stimulation is no more than a simple association, does not stop us from considering that we have the first tremors in the plastic and competent dynamic of the psychism of [the actant] of an external spatiotemporal liaison interpreted not without reason as causal. [. . . ] Hence, from the beginning, the situation is not fundamentally different from that of language [. . . ]. Only these fundamental “catastrophes” of biological finality have the power of generating the symbols in animals (Thom, 1988, pp. 268–269).
A final note about the so-called coalition enforcement hypothesis must be added in this subsection, given the fact that it will be repeatedly quoted in this chapter because it plays a central role in rendering patent the entanglement between
morality and violence taking advantage of a paleoanthropological perspective. This hypothesis, put forward by Bingham (1999, 2000), aims at providing an explanation of the “human uniqueness” that is at the origin of human communication and language, in a strict relationship with the spectacular ecological dominance achieved by H. sapiens and of the role of cultural heritage. From this perspective, and due to the related constant moral and policing dimension of Homo’s coalition enforcement history (which has an approximately two million-year evolutionary history), human beings can be fundamentally seen as self-domesticated animals. The main speculative value of this hypothesis consists in stressing the role of the more or less stable stages of cooperation through morality (and through the related unavoidable violence). In hominids, cooperation in groups (which, contrary to the case of non-human animals, is largely independent of kinship) is fundamentally derived from the need to detect, control, and punish social parasites, who, for example, did not share the meat they hunted or partook of the food without joining the hunting party (Boehm, 1999) (also variously referred to as free riders, defectors, and cheaters). These social parasites were variously dealt with by violently killing or injuring them (and also by killing cooperators who refused to punish them) from a distance using projectile and clubbing weapons. In this case, violent injuring and killing are cooperative and remote (and at the same time they are “cognitive” activities). Of course, cooperative morality that generates “violence” against unusually “violent” and aggressive free riders and parasites can be performed in other weaker ways, such as denying future access to the resource, injuring a juvenile relative, gossiping to persecute dishonest communication and manipulative in-group behaviors, waging war against less cooperative groups, etc. (see below the second last section of this chapter.)
The Naturalness of Proto-morality and Violent Punishment
After having explained the concepts of pregnance, salience, and affordance in their intertwining with abduction, it will be relatively easy to show their dynamics in the processes of moral behaviors – in human and non-human animals – and related effects in terms of cooperation but also of more or less potential violent punishments (and possible violent conflicts with different moral frameworks). As already indicated, Thom sees pregnances not only as innate endowments (like in the case of the basic ones seen in birds and mammals: hunger, fear, sexual desire) but also as related to higher-level cognitive capacities, which also involve the role of proto-morality (in non-human animals) and morality (in human beings). “When animal pregnance is generalized in the direction of human conceptualization ‘conceptual’ or individuating pregnances will be revealed, the nature of which is close to ‘salience’” (p. 6). At this point, it should be clear that it is possible to synthetically account for both these processes in terms of different kinds of abductive hypothetical cognition. For example, Thom observes, reverberating the view of visual perception as informationally semi-encapsulated – that is, despite its
bottom-up character, it is not insulated from plastic cognitive processes and contents acquired through learning and experience, so it is possible to say it is also pre-wired; cf. Raftopoulos (2009) – that “[. . . ] it is doubtful whether genetics alone would be able to code a visual form [. . . ]. Whence the necessity of invoking cultural transmission, linked with the social or family organization of the community” (p. 10). In gregarious animals, the signals (which also have to be seen as referring to the explanation of the origin of the “pregnance-mirroring” functions of human language) are a vector of pregnances insofar as they transfer a pregnance from one individual to another or to several others. In such a way, they favor teaching and learning, working to constitute the collectively shared behavior needed, for example, to capture food and to ward off predators. In this perspective of gregarious animals, pregnances are de facto immediately related to the emergence of kinds of proto-moralities relying on shared proto-axiological features. The reader who is interested in the emergence of proto-morality and proto-violence in a naturalistically evolutionary perspective can refer to Magnani (2011, Chapter 1). When an organism – through abductive cognition – traces back a symbolic reference to a “source” form [in Thom’s sense as indicated above], often a motor reaction becomes necessary to bring satisfaction. As already illustrated above, in this case, abduction plays an inferential role similar to the one it plays in a physician’s diagnostic reasoning when a symptom is explained by a hypothesis, a diagnosis, suitably selected from an already available encyclopedia of diagnostic hypotheses referred to the corresponding diseases. On the contrary, when a pregnance is originally built, the process is akin to the case of creative abductive cognition, for example, in science, when a new successful hypothesis is established for the first time.
On these aspects of abductive cognition, see above and Magnani (2009, Chapter 2). Here is an example that is clearly and patently related to “sociality” (through morality, given the presence of the role of altruism): In a social group, one individual’s encounter with a source form S may give rise to a dilemma: whether to pursue the “individual interest” which consists in using the regulatory reflex that will result in selfish satisfaction, or to follow the altruistic community strategy by uttering the cry that will carry the pregnance S to the other members of the community; such a cry is then the signal by which the signal P of S experienced by individual 1 can be transferred to another individual 2 (p. 12).
Thom himself nicely adds that this kind of animal proto-moral conflict resonates with the more completely “moral” conflict of civilized societies: “This dilemma exists well and truly in our society. Witness the scruples most honest citizens have in making true declarations of their taxable revenues” (Thom, 1988, p. 12). Another example is provided by the case of a signal (or a proximal “clue”), which transfers the pregnance of fear in birds, which further prompts the motion of taking flight but that also incurs the risk of attracting the predator’s attention. Animals perceive the pregnant sign/clue (e.g., tracks or excreta of the predator) and then emit a further sign (cry) that mirrors that sign/clue and its pregnance.
At this point, it is clear that, in this Thomian perspective, the establishment of a proto-morality immediately depicts behaviors and reactions that are exposed to punishment and violence: at the same time, moral behavior creates the space of violent behavior.
The Moral/Violent Abductive Powers of Pregnant Linguistic Signs
Among the various semiotic processes that in humans and in animals give rise to pregnances, affordances, and abductive reactions, human language certainly plays a dominant moral role that must be analyzed with the aim of unveiling its strict relationship with violent outcomes. This section will also explain how human language is intertwined with the construction of the so-called cognitive niches in their function of promoting coalition enforcement. From the point of view of the functions of human language, Thom sees the birth of the “genitive” as the syntactical form that denotes the proximity of a being while denying its immediate presence. This syntactical form permits us to emit and receive alarm calls which provide individuals (and the group) with an adequate defense. By the way, in many animals, alarm calls/cries are the analogues of the second-person singular imperatives typical of human natural languages (Thom, 1980, p. 172). From this perspective, the presence of a pregnant sign associated with a form S can be considered as a fundamental kind of concept or class of equivalence between salient forms, which incorporates a primary, rudimentary, and prompt abductive power. As already stressed, the cultural acquisition of a sensitivity to source forms has to be hypothesized in both humans and various animals. In these cases, pregnance transmission occurs, beyond the hardwired cases, thanks to the presence of suitable artifactual cognitive niches, a concept introduced by Tooby and DeVore (1987) and later on reused by Pinker (1997, 2003), extensively illustrated in Magnani (2009, Chapter 5). Representational delegations to the external environment that are configured as parts of cognitive niches are those cognitive human actions that transform the natural environment into a cognitive one.
Humans have built huge cognitive niches, characterized by informational, cognitive, and, finally, computational processes, as described by the studies in the field of biosciences of evolution by Odling-Smee, Laland, and Feldman (Odling-Smee et al., 2003; Laland and Sterelny, 2006; Laland and Brown, 2006). Human natural languages can be seen as a cultural niche (Clark, 2006), functioning as pregnance mediators, where plastic teaching and learning is possible. These cognitive niches make plenty of cognitive tools available that in turn make the organisms who acquire them able to pregnantly manage signs (which consequently gain a special “meaning”). This process is clearly illustrated by the description of various aspects of “plastic” – and not merely hardwired – cognitive skills in animal abduction and by the relevance of the “mediated” character of several affordances. In these cases, both cognitive skills and sensitivity to suitable affordances require cultural learning/training imbued in appropriate cognitive niches.
A note about the cognitive niches must be added. A direct consequence of coalition enforcement is the development and the central role of cultural heritage (morality and sense of guilt included): in other words, the importance of cultural cognitive niches as new ways of arriving at diverse human adaptations. From this perspective, the long-lived and yet abstract human sense of guilt represents a psychological adaptation, abductively anticipating an appraisal of a moral situation to avoid becoming a target of violent coalitional enforcement. Again, it is important to recall that Darwinian processes are involved not only in the genetic domain but also (with a looser and lesser precision) in the additional cultural domain, through the selective pressure activated by modifications in the environment brought about by cognitive niche construction. According to the theory of cognitive niches, coercive human coalition as a fundamental cognitive niche constructed by humans becomes itself a major element of the selective environment and thus imposes new constraints (designed by extragenetic information) on its members. On the concept of extragenetic information, cf. Chapter 5, subsection “Gene/cognitive niche coevolution and moral decriminalization,” of Magnani (2011) and the recent Magnani (2021). Chapter 5 of the book on abductive cognition (Magnani, 2009) clearly emphasized that fleeting and evanescent internal pseudorepresentations (beyond reflex-based innate releasing processes, trial and error, or mere reinforcement learning) are needed to account for many animal “communication” performances even at the rudimentary level of chicken calls: Evans says that “[. . . ] chicken calls produce effects by evoking representations of a class of eliciting events [food, predators, and presence of the appropriate receiver] [. . . ]. The humble and much maligned chicken thus has a remarkably sophisticated system. Its calls denote at least three classes of external objects.
They are not involuntary exclamations, but are produced under particular social circumstances” (Evans, 2002, p. 321). In Thom’s words, these calls are of course pregnant signals which can be learned, which in turn play a proto-moral and a kind of “deontological” role by triggering reactions that are implicitly considered “good.” Of course, in the case of animal cheating, analogous calls trigger reactions that are basically negative for the receiver’s welfare, as described by El-Hani et al. (2009), thanks to the analysis of an interesting case of animal “deception.” Chickens form separate representations when faced with different events, and they are affected by prior experience (of food, e.g.). These representations are mainly due to internally developed plastic capacities to react to the environment and can be thought of as the fruit of learning. Many animals (especially gregarious ones) go beyond the use of sound signals in their cognitive performances; they, for example, reify and delegate cognitive/semiotic roles to true pregnant external artificial “pseudorepresentations” (e.g., landmarks, urine marks, etc.) which artificially modify the environment to consequently become affordances for themselves and other individuals of the group or of other species. In a study concerning natural human language as an adaptation, the cognitive scientist and evolutionary psychologist Pinker (2003, p. 28) says: “[. . . ] a species that has evolved to rely on information should thus also evolve a means to
exchange that information. Language multiplies the benefit of knowledge, because a bit of know-how is useful not only for its practical benefits to oneself but as a trade good with others.” The expression “trade good” seems related to a moral/economical function of language: let us explore this issue in the light of the coalition enforcement hypothesis introduced in the first chapter of Magnani (2011) and delineated in a previous subsection of this chapter. As illustrated above, Thom’s catastrophe theory provides a conceptual framework in which natural syntactical language is seen as the fruit of social necessity. Within this framework, the fundamental function of language can only be clearly seen if linked to an intrinsic moral (and at the same time violent) aim, which is basically rooted in a kind of military intelligence, an expression coined by Thom. This expression mainly relates to the problem of the role of language in the so-called coalition enforcement illustrated in a previous subsection of this chapter, that is, in the affirmation of morality and the related perpetration of violent punishment. It is in this sense that the importance of fallacies as “distributed military intelligence” will soon be pointed out. To anticipate the content of this section by means of a kind of motto: “when words distribute moral norms and habits, often they also wound and inflict harm.” Thom explicitly links the concept of “military intelligence” to the functions of human language, even if he warns the reader: “It may seem like an oversimplification to see the origins of language in the informational necessities of ‘military’ intelligence” (Thom, 1988, p. 27). 
However, Thom says language can simply and efficiently transmit vital pieces of information about the fundamental biological oppositions (life, death; good, bad): it is from this perspective that human language – even at the level of more complicated syntactical expressions – can be clearly seen as a carrier of information (pregnances) about moral qualities of persons, things, and events. Such qualities are always directly or indirectly related to the survival needs of the individual and/or of the group/coalition. Thom too is convinced of the important role played by language in maintaining the structure of societies, defending it, thanks to its moral and violent role: “information has a useful role in the stability or ‘regulation’ of the social group, that is, in its defence” (Thom, 1988, p. 279). When illustrating “military” and “fluid” societies, he concludes:

In a military type of society, the social stability is assured, in principle, by the imitation of the movement of the hierarchical superior. Here it is a question of a slow mechanism where the constraints of vital competition can impose rapid manoeuvres on the group. Also the chief cannot see everything and has need of special informers stationed at the front of the group who convey to him useful information on the environment. The invention of a sonorous language able to communicate information and to issue direction to the members of the group has enabled a much more rapid execution of the indispensable manoeuvres. By this means (it is not the only motivation of language), one can see in the acquisition of this function a considerable amelioration of the stability of a social group. If language has been substituted for imitation, we should note that the latter continues to play an important role in our societies at pre-verbal levels (cf. fashion). In addition, imitation certainly plays a primary part in the language learning of a child of 1 to 3 years (pp. 235–236).
Chapter 1 of Magnani (2011) illustrates at length that in human or pre-human groups, the appearance of coalitions dominated by a central leader quickly leads to the need for surveillance of surrounding territory to monitor prey and free-riders and watch for enemies who might jeopardize the survival of the coalition. This is an idea shared by Thom, who believes that language becomes a fundamental tool for granting stability and favoring the indispensable manipulation of the world: “thus the localization of external facts appeared as an essential part of social communication” (Thom, 1988, p. 26), a performance that is already realized by naming (the containing relationship) in divalent structures: “X is in Y is a basic form of investment (the localizing pregnance of Y invests X). When X is invested with a ubiquitous biological quality (favorable or hostile), then so is Y” (ibid.). A divalent syntactical structure of language becomes fundamental if a conflict between two outside agents must be reported. The trivalent syntactical structure subject/verb/object forges a salient “messenger” form that conveys the pregnance between subject and recipient. In sum, the usual abstract functions of syntactic languages, such as conceptualization, appear strictly intertwined with the basic military nature of communication. Moreover, it is important to stress that pregnant forms, as they receive names, tend to lose their alienating character.
Abductive Cognition in Violent Mobbing and Scapegoating

This subsection and the following one aim at illustrating some clear examples of the violent effects that can be created thanks to human language, together with the role played by the abductive capacities of making hypotheses that substantiate moral judgments and actions: mobbing, scapegoating, and sacrificing. The military nature of linguistic communication is intrinsically “moral” (protecting the group by obeying shared norms) and at the same time “violent” (e.g., killing or mobbing to protect the group). This basic moral/violent effect can be traced back to past ages, but it can also be witnessed in the “primitive” use of everyday natural language by current mobbers, who express strategic linguistic communications “against” the mobbed target. These strategic linguistic communications are often performed thanks to hypothetical reasoning, abductive or not. In this case, the use of natural language can take advantage of efficient abductive hypothetical cognition through gossip (full of hypotheses about people’s characters, true or false it does not matter: what counts is the “practical” result), fallacies, and so on, but also of the moral/violent exploitation of apparently more respectable and sound truth-preserving and “rational” inferences. The narratives used in a dialectic and rhetorical setting qualify the mobbed individual and his or her behavior in a way that is usually thought of by the mobbers themselves (and by the individuals of their coalition/group) as moral, neutral, objective, and justified while at the same time hurting the mobbed individual in various ways. Violence is very often subjectively dissimulated and paradoxically considered as the act of performing just, objective moral judgments and of persecuting moral targets. In this case people are in what Magnani (2011) called a moral bubble, as the result of a process of dissimulation
and unawareness of the actual or possible perpetrated violence. In sum, de facto the mobbers’ coordinated narratives harm the target (just as if she were being stoned in a ritual killing), very often without an appreciable awareness of the violence performed. This human linguistic behavior is clearly made intelligible when we analogously see it as echoing the anti-predatory behavior which “weaker” groups of animals (birds, e.g.) perform, for example, through the use of suitable alarm calls and aggressive threats. Of course, such behavior is mediated in humans through socially available ideologies (differently endowed with moral ideas) and cultural systems. Ideologies can be seen as fuzzy and ill-defined cultural mediators spreading pregnances that invest all those who put their faith in them and stabilize and reinforce the coalitions/groups: “[. . . ] the follower who invokes them at every turn (and even out of turn) is demonstrating his allegiance to an ideology. After successful uses the ideological concepts are extended, stretched, even abused” (Thom, 1988, p. 37), so that their meaning slowly changes in imprecise (and “ambiguous,” Thom says) ways, as it happens in the application of the archetypical principles of mobbing behavior. From this perspective, the massive moral/violent exploitation of equivocal fallacies in ideological discussions, oratories, and speeches is obvious and clearly explainable, as illustrated in Chapter 3, section “Fallacies as distributed ‘military’ intelligence” of Magnani (2011).
Mimetic Desire, Abductive Emotions, and Sacrifices

That part of the unconscious that individuals share with other human beings, i.e., the collective unconscious (a concept illustrated at length in Chapter 4, “Constructing morality through psychic energy mediators,” of Magnani (2011)), shaped by evolution, represents a good explanation, even if rather speculative, of the presence in human beings of archetypes of behaviors such as the “scapegoat” (mobbing) mechanism mentioned above. In this cognitive mechanism, a paroxysm of violence focuses on an arbitrary sacrificial victim, and a unanimous antipathy mimetically grows against him. The process leading to the ultimate bloody violence (which was, e.g., widespread in ancient and barbarian societies) is mainly carried out in current social groups through linguistic communication. Following Girard (1977, 1986), it can be said that in the case of ancient social groups, the extreme brutal elimination of the victim would reduce the appetite for violence that had possessed everyone just a moment before, leaving the group suddenly appeased and calm, thus achieving equilibrium in the related social organization (a sacrifice-oriented social organization may be repugnant to us but is no less “social” just because of its rudimentary violence). This kind of archaic brutal behavior is still present in civilized human conduct in rich countries and is almost always implicit and unconscious, for example, in the racist and mobbing behavior already mentioned. Again, given the fact that this kind of behavior is widespread and partially unconsciously performed, it is easy to understand how it can be implicitly “learned” in infancy and still implicitly
“pre-wired” in an individual’s cultural unconscious (in the form of ideology as well) that individuals share with others as human beings. The analysis of this archaic mechanism (and of other similar moral/ideological/violent mechanisms) might shed new light on what can be called the basic equivalence between engagement in morality and engagement in violence, since this kind of engagement, amazingly enough, is almost always hidden from the awareness of the human agents that are actually involved. Recent evolutionary perspectives on human behavior, taking advantage of neuroscience and genetics and illustrated by Taylor (2009), provide neuroscientific explanations of how brains process emotions, evoke associations, and stimulate reactions, which offer interesting data, at least in terms of neurological correlates, on why it is relatively easy for people to harm other people. These studies have also illustrated the related process of otherization, which decisively primes people for aggression, as a process grounded in basic human emotions, i.e., our bias toward pleasure and avoidance of pain. Perceiving others as the “others” causes fear, anger, or disgust, universal “basic” responses to threats whose physiological mechanisms are relatively well understood. It is hypothesized that these emotions (as kinds of abduction, Peirce would add) evolved to enable our ancestors to escape predators and fight enemies. Of course, the otherization process continues when structured in “moral” terms, as in the construction of that special other that becomes a potential or actual scapegoat. Abductive cognition is always at play in the behavior just described, even in the case of emotion, as already anticipated, quoting Peirce. 
Indeed, for Peirce, all sensations or perceptions participate in the nature of a unifying hypothesis, that is, in abduction, in the case of emotions too:

Thus the various sounds made by the instruments of the orchestra strike upon the ear, and the result is a peculiar musical emotion, quite distinct from the sounds themselves. This emotion is essentially the same thing as a hypothetic inference, and every hypothetic inference involves the formation of such an emotion (Peirce, CP, 2.643).
It is worth mentioning, in conclusion, the way Thom accounts for the social/moral phenomenon of scapegoating in terms of pregnances. “Mimetic desire,” in which Girard (1986) roots the violent and aggressive behavior (and the scapegoat mechanism) of human beings, can be seen as the act of appropriating a desired object which imbues that object with a pregnance, “the same pregnance as that which is associated with the act by which ‘satisfaction’ is obtained” (Thom, 1988, p. 38). Of course, this pregnance can be propagated by imitation through the strong abducibility granted by the mere sight of “superior” individuals or through the exposure to descriptions and narratives about them and their achievements, in which it is manifest: “In a sense, the pleasure derived from looking forward to a satisfaction can surpass that obtained from the satisfaction itself. This would have been able to seduce societies century after century (their pragmatic failure in real terms having allowed them to escape the indifference that goes with satiety as well as the ordeal of actual existence)” (ibid.). Recent cognitive research stresses the influence that intentional gaze processing has on “object processing”: objects falling under the gaze of others acquire
properties that they would not display if not looked at. Observing another person gazing at an object enriches it with motor, affective, and status properties that go beyond its chemical or physical structure. It can be concluded that gaze presents an abductive force: it plays the role of transferring to the object the intentionality of the person who is looking at it. This result further explains why mimetic desire can spread so quickly among people belonging to specific groups (Becchio et al., 2008). On gaze cueing of attention, the reader can refer to Frischen et al. (2007), who also established that in humans prolonged eye contact can be perceived as aggressive. Grounded in appropriate wired bases, “mimetic desire” is indeed a sophisticated template of abductive behavior that can be picked up from various appropriate cultural systems, available out there as part of the external cognitive niches built by many human collectives and gradually externalized over the centuries (and always transmitted through activities, explicit or implicit, of teaching and learning), as fruitful ways of favoring social control over coalitions. Indeed, mimetic desire triggers envy and violence, but at the same time, the perpetrated violence causes a reduction in appetite for violence, leaving the group suddenly appeased and calm, thus achieving equilibrium in the related social organization through a moral effect that is at the same time a carrier of violence, as already illustrated. Mimetic desire is related to envy (even if, of course, not all mimetic desire is envy, certainly all envy is mimetic desire): when humans are attracted to something that others have but that they cannot acquire because others already possess it (e.g., because they are rival goods), humans experience an offense that generates envy. In the perspective introduced by Girard, envy is a mismanagement of desire, and it is of capital importance for the moral life of both communities and individuals. 
As a reaction to offense, envy easily causes violent behavior. From this point of view, it is possible to psychoanalytically add that “[. . . ] the opposite of egotist self-love is not altruism, a concern for common good, but envy and resentment, which makes me act against my own interests. Freud knew it well: the death drive is opposed to the pleasure principle as well as to the reality principle. The true evil, which is the death drive, involves self-sabotage” (Žižek, 2009, p. 76).
Fallacies, Hypotheses, and Distributed “Military” Intelligence

In a previous subsection of this chapter, the role played by the so-called coalition enforcement hypothesis in the social distribution of morality and violence has been discussed, a perspective also stressing the intrinsic moral (and at the same time violent) nature of language (and of abductive and other hypothetical forms of reasoning and arguments intertwined with the propositional/linguistic level). As illustrated above, in this perspective, language is basically rooted in a kind of military intelligence, as maintained by Thom (1988). Indeed, it is important to note that many kinds of mechanisms of hypothesis generation (from abduction to hasty generalization, from ad hominem to ad verecundiam) are performed through inferences that embed formal or informal fallacies. For readers unfamiliar with the expression, “hasty generalization” occurs when a person infers a conclusion about
a group of cases based on a sample that is not large enough, for example, just one case; it is witnessed in animals as well, for example, in mice, where the form of hypothesis making can be ideally modeled as a hasty induction. It is also relevant to remember, as already described in the previous section, that language transfers not only morality and violence but also motor actions and emotions: it is well-known that overt hostility in emotions is a possible trigger to initiate violent actions. The “moral” role of emotions is well-known, and their potentially “violent” nature goes without saying. De Gelder and colleagues (2004) indicate that observing fearful body expressions produces increased activity in brain areas narrowly associated with emotional processes and that this emotion-related activity occurs together with activation of areas linked with the representation of action and movement. The mechanism of emotional contagion (fear) hereby suggested may automatically prepare the brain for action. In Thom’s terms, essentially, as already illustrated, language efficiently transmits vital pieces of information about the fundamental biological oppositions (life, death; good, bad), and hence, it is intrinsically involved in moral/violent activities. What is now important to note is that this conceptual framework can shed further light on some fundamental dialectical and rhetorical roles played by the so-called fallacies, roles that are of great relevance in stressing some basic aspects of human abductive cognition. In the following subsection, some of the roles played by fallacies will be considered that can be ideally related to the intellectual perspective of the coalition enforcement hypothesis mentioned above. Of course, the “military” nature is not evident in various aspects and uses of syntactilized human language. As previously illustrated, the military/violent nature is obviously manifest, for instance, in hateful, racist, homophobic speech. 
It is hard to directly see the coalition enforcement effect in the many epistemic functions of natural language, for example, when it is simply employed to transmit scientific results in an academic laboratory situation or when humans gather information from the Internet – expressed in linguistic terms and numbers – about the weather. However, it is necessary to remember that even the more abstract character of knowledge packages embedded in certain uses of language (and in hybrid languages, like in the case of mathematics, which involves considerable symbolic parts) still plays a significant role in changing the moral behavior of human collectives. For example, the production and the transmission of new scientific knowledge in human social groups not only operate on information but also implement and distribute roles, capacities, constraints, and possibilities of actions. This process is intrinsically moral because in turn it generates precise distinctions, powers, duties, and chances which can create new between-group and in-group (often violent) conflicts or reshape older pre-existent ones. To give an example, new abductively created theoretical biomedical knowledge about pregnancy and fetuses usually has two contrasting moral/social effects: (1) a better social and medical management of childbirth and related diseases and (2) the potential extension or modification of conflicts surrounding the legitimacy of abortion. In sum, even very abstract bodies of knowledge and more innocent pieces of information enter the semio/social process which governs the identity of groups
and their aggressive potential as coalitions: deductive reasoning and declarative knowledge are far from being exempt from argumentative, deontological, rhetorical, and dialectic aspects. For example, it is hard to distinguish, in an eco-cognitive setting, between a kind of “pure” (e.g., deductive) inferential function of language and an argumentative or deontological one, since the former can obviously play an associated argumentative role. However, it is in the arguments traditionally recognized as fallacious that it is possible to more clearly grasp the military nature of human language and especially of some hypotheses reached through fallacies. Searle (2001) considers eccentric that aspect of our cultural tradition according to which true statements, the fruit of sound deductive inferences, which also describe how things are in the world, can never imply a statement about how they ought to be. According to Searle, to say that something is true is already to say you ought to believe it: other things being equal, you ought not to deny it. This means that normativity is more widespread than expected. It is in a similar way that Thom clearly acknowledges the “general” (and intrinsic) “military” (and so moral, normative, argumentative, etc.) nature of language, by providing a justification in terms of catastrophe theory, as described in the first section of the present chapter. Woods contends that a fallacy is by definition considered a mistake in reasoning, a mistake which occurs with some frequency in real arguments and which is characteristically deceptive (Woods, 2013). Of course, deception – insofar as it is related to deliberate fallaciousness – does not have to be considered as part of the definition of what a fallacy is (Tindale, 2007). Traditionally recognized fallacies, like hasty generalization and ad verecundiam, are considered “inductively” weak inferences, while affirming the consequent is a deductively invalid inference. 
Nevertheless, when they are used by actual reasoners or “beings like us,” that is, in an ecological and eco-cognitive framework and not in an aseptic, ideal, and abstract logical one, they are no longer necessarily fallacies. In these cases, fallacies are seen in a social and real-time exchange of speech acts between parties/agents. Traditionally, fallacies are considered to be errors, attractive and seductive, but also universal, because humans are prone to commit them. Moreover, they are “usually” considered incorrigible, because the diagnosis of their incorrectness does not cancel their appearance of correctness: Woods remarked that, for example, if a person is prone to hasty generalization prior to its detection in a particular case, he will remain prone to it afterward (Woods, 2013). Woods calls this perspective the traditional – even if not classical/Aristotelian – “EAUI-conception” of fallacies. Further, he more subtly observes that the EAUI-conception is not a sufficiently clear notion of fallacyhood. Actually, not being a fallacy is not sufficient to vindicate an error of reasoning. Fallacies are errors of reasoning with a particular character. They are not, by any stretch, all there is to erroneous reasoning (2013, Chapter 3). If a sharp distinction between strategic and cognitive rationality is adopted, many of the traditional fallacies – for instance, a hasty generalization – demand an equalizing treatment. They are sometimes cognitive mistakes and strategic successes, and in at least some of those cases, it
is more rational to proceed strategically, even at the cost of cognitive error. As a matter of fact, hasty generalization or the fallacy of affirming the consequent – that is, abduction – instantiates the traditional concept of fallacy (for one thing, it is a logical error), but there are contexts in which committing the error is smarter than avoiding it. According to Woods’ latest observations, the traditional fallacies – hasty generalization included – do not really instantiate the traditional concept of fallacy (the EAUI-conception). In this perspective, it is not that it is “sometimes” strategically justified to commit fallacies (a perfectly sound principle, by the way), but rather that in the case of the Gang of Eighteen traditional fallacies, they are just not fallacies. The distinction is subtle and can be adopted in the following sense: the traditional conception of fallacies adopts – so to speak – an aristocratic (idealized) perspective on human thinking that disregards its profound eco-cognitive character, where the “military intelligence” quoted above in this chapter is fundamentally at play. Errors, from an eco-cognitive perspective, certainly are not the exclusive product of the so-called fallacies, and in this wider sense, a fallacy is an error that, as Woods remarks, virtually everyone is disposed to commit with a frequency that, even if low, is greater than the frequency of their errors in general. Woods’ “negative thesis” is clearer exactly in the light of the account of language’s military nature provided in the previous sections. As already illustrated,

1. human language possesses a “pregnance-mirroring” function;
2. in this sense, it is possible to say that vocal and written language can be a tool exactly like a knife, that is, full of moral functions but also of violent ones;
3. the so-called fallacies are linked to that “military intelligence,” which relates to the problem of the role of language in the coalition enforcement illustrated in a previous subsection of this chapter;
4. this “military” nature is not always evident in various aspects and uses of syntactilized human language.
Judging People Accused of Violence: Abductive Evaluation and Assessment of Evidence and of Fallacious Narratives

Some years ago, a television talk show was devoted to the case of a Catholic priest, Don Gelmini, accused of sexual abuse by nine young men hosted in a rehab facility belonging to Comunità Incontro in Italy; it represents an interesting example worth analyzing. He was the founder of that charitable organization, now present worldwide. Four journalists argued for and against Don Gelmini, using many of the so-called fallacious arguments (mainly ad hominem) centered on the past of both the accused and the witnesses. The description of this television program is useful to illustrate the role of abduction in the filtration and evaluation of arguments, when seen as distributed in real-life dialectical and rhetorical contexts. An individual belonging to the audience, at the end of the program, could have concluded in favor of the ad hominem arguments (also “recognizing” them
as fallacies) used by the journalists who argued against Don Gelmini, and so he could have accepted the abductive hypotheses put forward against him. Hence, in this case, the data and gossip embedded in the fallacious reasoning describing the “immoral” and “judiciary” past of Don Gelmini were considered more relevant than the ones which described the bad past of the witnesses. That TV viewer could have been aware of being in the midst of a riddle of hypotheses generated by various arguments. Of course, this would probably not have been the case for the average viewer, who may not have been trained in logic and critical thinking. It is easy to see, however, that this fact does not actually affect the rhetorical success or failure of arguments in real-time contexts, as was also the case for our ideal viewer. As listeners, humans are all in an “epistemic bubble,” compelled to think they know things even if they do not know them. In this case, the bubble forces human beings to quickly evaluate and pick up what they consider the best choice. It can be suggested that, at least in this case, the evaluation/explanation of the ad hominem arguments must be seen as the fruit of an abductive appraisal and that this abductive process is not rare in argumentative settings. An analogy to the situation of trials – in the common law tradition as described by Woods (2009) – can be of help. Like the judge in a trial, in our case, the audience (and our ideal viewer, as part of the audience) were basically faced with the circumstantial evidence mainly carried by the two clusters of ad hominem arguments, that is, faced with evidence from which a fact can be reasonably inferred but not directly proven. 
Furthermore, like the jury in trials, an audience is on average composed of individuals who are not experts capable of “overt calibration” of performance to criteria, but instead ordinary – untutored – people, reasonably used to reasoning in the way ordinary people do, performing a kind of “intuitive” and “unreflective” reasoning (Woods, 2009). As already illustrated above, in a situation of lack of information and knowledge and thus of constitutive “ignorance,” abduction is usually the best cognitive tool human beings can adopt to relatively quickly reach explanatory, non-explanatory, and instrumental hypotheses/conjectures: the reader who is interested in these various aspects of abductive reasoning can refer to Magnani (2009, Chapter 2). Moreover, it is noteworthy that the evidence – embedded in the ad hominem arguments – concerning the “past” (supposedly) reprehensible behavior of the priest and of the witnesses was far from being reliable, probably chosen ad hoc, and deceptively supplied, that is, so to say, highly circumstantial for the judge/audience. The cognitive process of abductive evaluation/explanation of that ideal TV viewer, regarding the fallacious dialectics between the two groups of journalists, was based on a kind of sub-abductive process of filtration of the evidence provided: the viewer chose what seemed to him the most reliable evidence in a more or less intuitive way and then performed an abductive appraisal of all the data. The filtration strategy is of course abductively performed and guided by various “reasons,” the conceptual ones, for example, being based on judgments of credibility. However, these reasons were intertwined with other reasons such as variously
conscious emotional reactions, based on feelings triggered by the entire distributed visual and auditory interplay between the audience and the scene, in itself full of body gestures, voices, and images (also variably and smartly mediated by the director of the program and the cameramen). Along these lines, it is possible to also observe that the journalists “fallaciously” discussing the case were concerned with the accounts they could trust, and certainly emotions played a role in their inferences as well. In summary, our ideal TV viewer was able to abductively make a selection (selective abduction) between the two hypothetical narratives about the priest, forming a kind of explanatory theory – of course, he could have avoided the choice, privileging “indifference,” thus stopping any abductive engagement. The guessed – and quickly accepted without further testing – theory of what was happening in the dialectic exchange further implied the hypothesis of guilt with respect to the priest. That is, the ad hominem of the journalists who were speaking about the priest’s past (he was, e.g., sentenced to 4 years for bankruptcy fraud, and some acceptable evidence – data and trial documents – was immediately provided by the staff of the television program) appeared convincing to the viewer, that is, no longer negatively biased but a plausible, acceptable argument. Was it still a fallacy from this eco-cognitive perspective? The answer is negative: it still was and is a fallacy only from the logical/intellectual/academic perspective! The “military” nature of the above interplay between contrasting ad hominem arguments is patent. Indeed, they were armed linguistic forces involving “military machines for guessing hypotheses” clearly aiming at forming an opinion in the viewer’s mind (and in that of the audience), which the viewer reached through an abductive appraisal, quickly able to explain one of the two narratives as more plausible. 
In the meantime, our ideal TV viewer became part of the wide coalition of individuals who strongly suspected that Don Gelmini was guilty and were in a position to potentially engage in further “armed” gossiping. In summary, it is possible to conclude with the following observations:
1. In special contexts where the so-called fallacies and various kinds of hypothetical reasoning are at play, at both the rhetorical and the dialectic level, their assessment can be established in a more general way, beyond specific cases (examples are provided in Tindale (2005)). An example is the case of a fallacy embedding patently false empirical data, which can easily be recognized as false at least by the standard intended audience; another example is when a fallacy is structured, from the argumentative point of view, in a way that renders it impossible to address the intended audience, and in these cases, the fallacy can be referred to as “always committed”;
2. not only abduction but also other kinds of (supposed) fallacious argumentation can be further employed to evaluate arguments in dialectic situations like the ones mentioned above, such as ad hominem, ad populum, ad ignorantiam, etc., but also hasty generalization and deductive schemes;
3. the success of a “fallacy” and of the inherent “fallacious” hypothetical argument can also be seen from the arguer’s perspective insofar as she is able to guess an accurate abductive assessment of the character of her actual or possible audience. From this perspective, an argument is put forward and “shaped” according to an abductive hypothesis about the audience, which the arguer guesses on the basis of available data (internal data, already stored in memory; external data, i.e., useful cues derived from audience features and suitably picked up; and other intentionally sought information). Misjudging the audience would jeopardize the efficacy of the argument, which would consequently be a simple error of reasoning/argumentation. These strategic appraisals and assessments can be considered as the basis of preaching: making appropriate judgments about the audience was the main object of teaching in rhetoric;
4. as clearly shown by the example of the TV priest, fallacious arguments are not only “distributed,” as illustrated above, but they are also embedded, nested, and intertwined in self-sustaining clusters, which individuate peculiar global “military” effects.
Conclusion
In this chapter, taking advantage of the concepts of salience and pregnance, derived from catastrophe theory, and of the concepts of abduction and affordance, some eco-cognitive aspects of moral and violent behavior have been illustrated. New insight into the strict relationships between these behaviors has been offered, by presenting a unified perspective rooted in a morphodynamical framework in which physical, biological, and cognitive processes can be simultaneously analyzed and the related role of abduction explained. The last part addresses two relevant problems: (1) the role of pregnances as linguistic tools which are essential in substantiating that “military intelligence” in which exemplar moral and violent behaviors, such as bullying and scapegoating, can be elegantly and synthetically explained, in a unified abductive perspective, and (2) the function of fallacies (abduction included) in executing episodes of language-based violence and the analysis of the basic inferential abductive routines that ground the judgments regarding people accused of violence and the evaluation of related fallacious narratives. The approach described in this chapter can be extended in many directions, beyond merely socio-moral aspects. This chapter also offered the chance to draw attention to the so-called coalition enforcement, especially useful in describing the relationship between morality and violence from a paleoanthropological perspective, as a clear example of their entanglement.
Acknowledgements
Research for this chapter was supported by the PRIN 2017 Research 20173YP4N3 – MIUR, Ministry of University and Research, Rome, Italy.
References
Becchio, C., Bertone, C., & Castiello, U. (2008). How the gaze of others influences object processing. Trends in Cognitive Science, 12(7), 254–258.
Bermúdez, J. L. (2003). Thinking Without Words. Oxford: Oxford University Press.
Bertolotti, T. (2015). Patterns of Rationality: Recurring Inferences in Science, Social Cognition and Religious Thinking. Berlin/Heidelberg: Springer.
Bertolotti, T., & Magnani, L. (2014). An epistemological analysis of gossip and gossip-based knowledge. Synthese, 191, 4037–4067.
Bingham, P. M. (1999). Human uniqueness: A general theory. The Quarterly Review of Biology, 74(2), 133–169.
Bingham, P. M. (2000). Human evolution and human history: A complete theory. Evolutionary Anthropology, 9(6), 248–257.
Boehm, C. (1999). Hierarchy in the Forest. Cambridge, MA: Harvard University Press.
Clark, A. (2006). Language, embodiment, and the cognitive niche. Trends in Cognitive Science, 10(8), 370–374.
Coady, D. (2012). What to Believe Now: Applying Epistemology to Contemporary Issues. New York: Blackwell.
de Gelder, B., Snyder, J., Greve, D., Gerard, G., & Hadjikhani, N. (2004). Fear fosters flight: A mechanism for fear contagion when perceiving emotion expressed by a whole body. PNAS (Proceedings of the National Academy of Science of the United States of America), 101, 16701–16706.
El-Hani, C. N., Queiroz, J., & Stjernfelt, F. (2009). Firefly femmes fatales: A case study in the semiotics of deception. Biosemiotics, 1, 33–55.
Evans, C. S. (2002). Cracking the code. Communication and cognition in birds. In M. Bekoff, C. Allen, & M. Burghardt (Eds.), The Cognitive Animal. Empirical and Theoretical Perspectives on Animal Cognition (pp. 315–322). Cambridge, MA: The MIT Press.
Fetzer, J. K. (1990). Artificial Intelligence: Its Scope and Limits. Dordrecht: Kluwer Academic Publisher.
Frischen, A., Bayliss, A. P., & Tipper, S. P. (2007). Gaze cueing of attention. Psychological Bulletin, 133(4), 694–724.
Gibson, J. J. (1951). What is a form? Psychological Review, 58, 403–413.
Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Boston, MA: Houghton Mifflin.
Gibson, J. J. (1982). A preliminary description and classification of affordances. In E. S. Reed & R. Jones (Eds.), Reasons for Realism (pp. 403–406). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Girard, R. (1977). Violence and the Sacred [1972]. Baltimore, MD: Johns Hopkins University Press.
Girard, R. (1986). The Scapegoat [1982]. Baltimore, MD: Johns Hopkins University Press.
Hanson, N. R. (1958). Patterns of Discovery. An Inquiry into the Conceptual Foundations of Science. London: Cambridge University Press.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.
Harman, G. (1968). Enumerative induction as inference to the best explanation. Journal of Philosophy, 65(18), 529–533.
Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
Laland, K. N., & Brown, G. R. (2006). Niche construction, human behavior, and the adaptive-lag hypothesis. Evolutionary Anthropology, 15, 95–104.
Laland, K. N., & Sterelny, K. (2006). Perspective: Seven reasons (not) to neglect niche construction. Evolution International Journal of Organic Evolution, 60(9), 4757–4779.
Lipton, P. (2004). Inference to the Best Explanation. London: Routledge. Originally published in 1991. New Revised edition.
Loula, A., Gudwin, R., El-Hani, C. N., & Queiroz, J. (2010). Emergence of self-organized symbol-based communication in artificial creatures. Cognitive Systems Research, 2, 131–147.
Magnani, L. (1988). Epistémologie de l’invention scientifique. Communication and Cognition, 21, 273–291.
Magnani, L. (1992). Abductive reasoning: Philosophical and educational perspectives in medicine. In D. A. Evans & V. L. Patel (Eds.), Advanced Models of Cognition for Medical Training and Practice (pp. 21–41). Berlin: Springer.
Magnani, L. (2001). Abduction, Reason, and Science. Processes of Discovery and Explanation. New York: Kluwer Academic/Plenum Publishers.
Magnani, L. (2009). Abductive Cognition. The Eco-Cognitive Dimensions of Hypothetical Reasoning. Heidelberg/Berlin: Springer.
Magnani, L. (2011). Understanding Violence. The Intertwining of Morality, Religion, and Violence: A Philosophical Stance. Heidelberg/Berlin: Springer.
Magnani, L. (2017). The Abductive Structure of Scientific Creativity. An Essay on the Ecology of Cognition. Cham: Springer.
Magnani, L. (2021). Cognitive niche construction and extragenetic information: A sense of purposefulness in evolution. Journal for General Philosophy of Science, 52, 263–276.
Magnani, L. (2022). Discoverability. The Urgent Need of an Ecology of Human Creativity. Cham: Springer.
Odling-Smee, F. J., Laland, K. N., & Feldman, M. W. (2003). Niche Construction. The Neglected Process in Evolution. Princeton, NJ: Princeton University Press.
Peirce, C. S. (1966). The Charles S. Peirce Papers: Manuscript Collection in the Houghton Library. Worcester, MA: The University of Massachusetts Press. Annotated Catalogue of the Papers of Charles S. Peirce. Numbered according to Richard S. Robin. Available in the Peirce Microfilm edition. Pagination: CSP = Peirce/ISP = Institute for Studies in Pragmaticism.
Peirce, C. S. (CP). Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press. (Vols. 1–6, C. Hartshorne & P. Weiss (Eds.); Vols. 7–8, A. W. Burks (Ed.), 1931–1958).
Pinker, S. (1997). How the Mind Works. New York: W. W. Norton.
Pinker, S. (2003). Language as an adaptation to the cognitive niche. In M. H. Christiansen & S. Kirby (Eds.), Language Evolution (pp. 16–37). Oxford: Oxford University Press.
Raftopoulos, A. (2009). Cognition and Perception. How Do Psychology and Neural Science Inform Philosophy? Cambridge, MA: The MIT Press.
Searle, J. (2001). Rationality in Action. Cambridge, MA: The MIT Press.
Simon, H. A. (1965). The logic of rational decision. British Journal for the Philosophy of Science, 16, 169–186.
Simon, H. A. (1977). Models of Discovery and Other Topics in the Methods of Science. Dordrecht: Reidel.
Taylor, K. (2009). Cruelty. Human Evil and the Human Brain. Oxford: Oxford University Press.
Thagard, P. (1987). The best explanation: Criteria for theory choice. Journal of Philosophy, 75, 76–92.
Thom, R. (1972). Stabilité structurelle et morphogénèse. Essai d’une théorie générale des modèles. Paris: InterEditions. Translated by D. H. Fowler, Structural Stability and Morphogenesis: An Outline of a General Theory of Models. Reading, MA: W. A. Benjamin, 1975.
Thom, R. (1980). Modèles mathématiques de la morphogenèse. Paris: Christian Bourgois. Translated by W. M. Brookes, & D. Rand, Mathematical Models of Morphogenesis. Chichester: Ellis Horwood, 1983.
Thom, R. (1988). Esquisse d’une sémiophysique. Paris: InterEditions. Translated by V. Meyer, Semio Physics: A Sketch. Redwood City, CA: Addison Wesley, 1990.
Tindale, C. W. (2005). Hearing is believing: A perspective-dependent view of the fallacies. In F. van Eemeren, & P. Houtlosser (Eds.), Argumentative Practice (pp. 29–42). Amsterdam: John Benjamins.
Tindale, C. W. (2007). Fallacies and Argument Appraisal. Cambridge, MA: Cambridge University Press.
Tooby, J., & DeVore, I. (1987). The reconstruction of hominid behavioral evolution through strategic modeling. In W. G. Kinzey (Ed.), Primate Models of Hominid Behavior (pp. 183–237). Albany: Suny Press.
Waal, F. D., Wright, R., Korsgaard, C. M., Kitcher, P., & Singer, P. (Eds.) (2006). Primates and Philosophers. How Morality Evolved. Princeton, NJ: Princeton University Press.
Woods, J. (2009). Ignorance, inference and proof: Abductive logic meets the criminal law. In G. Tuzet & D. Canale (Eds.), The Rules of Inference: Inferentialism in Law and Philosophy (pp. 151–185). Heidelberg/Berlin: Egea.
Woods, J. (2013). Errors of Reasoning. Naturalizing the Logic of Inference. London: College Publications.
Žižek, S. (2009). Violence [2008]. London: Profile Books.
Part XIII Abduction and Cognitive Neuroscience
70 Introduction to Abduction and Cognitive Neuroscience
Gustavo Cevolani
Contents
Abduction in Cognitive Neuroscience: Theoretical, Methodological, and Empirical Issues
References
Abstract
Cognitive neuroscience is the study of the biological, and especially neural, processes that underlie cognition and mental activities. The three chapters in this Section of the Handbook explore abductive reasoning with a special focus on inference patterns routinely employed by neuroscientists both in evaluating evidence concerning brain activity and in assessing cognitive hypotheses on its basis.
G. Cevolani, IMT School for Advanced Studies Lucca, Lucca, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_92

Abduction in Cognitive Neuroscience: Theoretical, Methodological, and Empirical Issues
The three chapters in this Section of the Handbook explore abductive reasoning in the context of cognitive neuroscience. Cognitive neuroscience is the study of the biological, and especially neural, processes that underlie cognition and mental activities. Neuroscientists aim at understanding how high-level cognitive processes like memory, language, or emotions are “implemented” at the lower level of brain regions and neural activity. To this purpose, they employ several different
techniques that allow researchers to study the activity in the brain of participants performing a wide variety of experimental tasks. One prominent example of such techniques is functional magnetic resonance imaging (fMRI), which allows researchers to find systematic correlations between cognitive processes plausibly engaged in experimental tasks and the increased activation of specific brain structures, as estimated by the BOLD (blood-oxygen-level-dependent) activity in the relevant areas (which measures, roughly, the release of oxygen from blood to neurons, which is greater for active than for inactive neurons). Cognitive neuroscience is a relative latecomer to the study of the mind, flourishing in the 1990s (but see the chapter by Coraci et al. on its older history). However, it has deeply changed the cognitive sciences in recent decades, bringing new experimental methods to tackle old problems and, at the same time, opening new perspectives and raising new theoretical and practical challenges. As often happens in the history of science, some time was needed to focus on the methodological issues arising from scientific practice. Today, the philosophy of neuroscience, and of cognitive neuroscience in particular, is a well-established and quickly developing research field, with philosophers and neuroscientists fruitfully interacting to explore foundational issues in the field, moral and social implications of ongoing research, and prospects and limitations of current techniques and experimental methods (Bechtel & Huang, 2022; Bickle et al., 2019; Calzavarini & Viola, 2020; Kästner, 2017; Roskies, 2021). However, many interesting issues remain to be studied, especially concerning how scientific inference and reasoning are performed in the field (cf., e.g., Ferretti & Viola, 2019). One such issue is certainly the role of abduction and abductive reasoning in cognitive neuroscience.
The chapters in the present Section of the Handbook are both a survey of what has been done so far and an invitation to philosophers of neuroscience and to neuroscientists with methodological interests to explore abductive inference (and scientific inferences more generally) in the field of cognitive neuroscience. The first chapter discusses both normative and empirical aspects of current theories of abduction; the second chapter focuses on the clearest example of abductive inference in cognitive neuroscience, i.e., so-called reverse inference; the third chapter offers a survey of the statistical and inferential methods commonly employed in the field, with a focus on Bayesian reasoning. In Chap. 72, “Abduction: Theory and Evidence,” Igor Douven presents an up-to-date assessment of current research on abduction, building on his own work on the topic during the last two decades (cf. Douven, 2022). He starts with a brief discussion of different accounts of abduction proposed by philosophers of science, and of the strong criticism raised by Bayesian philosophers against the very idea and viability of abductive reasoning in the last 40 years. Against this background, Douven then critically assesses the pros and cons of abduction, offering both theoretical and empirical arguments, with the transparent aim of defending the usefulness and importance of the notion of abduction for the purpose of understanding human reasoning, in philosophy as well as in psychology and cognitive science. Besides empirical evidence from psychological experiments with real participants, showing how explanatory considerations crucially modulate human reasoning and cognition,
Douven also presents some recent work on abduction using computer simulations, and in particular the methods of evolutionary computation, in order to assess the relative performance of different reasoning modes, both abductive and nonabductive (e.g., Bayesian). With Chap. 71, “Reverse Inference, Abduction, and Probability in Cognitive Neuroscience,” the discussion on abduction enters the field of cognitive neuroscience proper. In this area, neuroscientists routinely infer from specific neural activation patterns to the engagement of particular mental processes, relying on previously built brain maps obtained through fMRI. This so-called reverse inference is clearly an instance of abductive reasoning, from neural activities to their putative causes at the level of mental processes (Calzavarini & Cevolani, 2022). Reverse inference plays a crucial role in many applications of fMRI and has attracted a great deal of attention and criticism, especially after leading neuroscientist Russell Poldrack (2006) denounced an uncontrolled “epidemic” of this reasoning pattern, cautioned against its (improper) use, and pointed to its crucial weakness. In their chapter, Davide Coraci, Fabrizio Calzavarini, and Gustavo Cevolani first explain how reverse inference is currently performed in cognitive neuroscience, starting with a basic introduction to the field and its techniques for the nonexpert reader. Then, they survey recent work at the interface between neuroscience and the philosophy of science on how reverse inference can be best modeled and defended, discussing both abductive and Bayesian approaches, and concluding with a review of open problems and challenges for future research. Finally, in Chap. 73, “Plausible Reasoning in Neuroscience,” neuroscientists Tommaso Costa, Donato Liloia, Mario Ferraro, and Jordi Manuello offer an informative exposition of some of the main statistical techniques employed in the field of cognitive neuroscience and related areas.
The authors adopt and defend the Bayesian approach to statistical inference, which they favor over the more traditional, and still dominant, frequentist one. After discussing some foundational issues concerning the interpretation of probability and of statistical inferences, they survey how canonical statistical problems, like parameter estimation and model selection, are treated in applications related to cognitive neuroscience, e.g., in Bayesian analyses of perception, behavior, and brain functioning. The problem of reverse inference (the focus of the chapter by Coraci et al.) is an important issue in this area, where statistical inference is applied to assess the strength of the correlation between mental processes and brain activations as revealed by fMRI data. In their chapter, Costa and colleagues also briefly discuss BACON, a tool based on Bayes factors that they recently developed in order to perform automated reverse inference on large datasets of fMRI data. Overall, the three chapters in the present Section of the Handbook provide an up-to-date overview of recent work at the interface between the philosophy of abduction in general (Douven), the philosophy of neuroscience (Coraci, Calzavarini, and Cevolani), and Bayesian statistics as employed in neuroscientific research (Costa, Liloia, Ferraro, and Manuello). The main upshot is a clarification of the role of abductive inference as a mode of reasoning about neuroscientific hypotheses, thus adding cognitive neuroscience to the list of fields where researchers successfully
deal with abduction and that can serve as a rich source of case studies for the philosopher of science (cf. Niiniluoto, 2018). In this connection, one should note that there are other interesting ways in which abduction enters the area of (cognitive) neuroscience that could not be covered here. Interesting issues concern, for instance, the general study of explanatory and abductive reasoning in human cognition and learning (Lombrozo, 2012), as well as specific issues in neuroscientific research, like the study of the neural and psychological bases of tool-use behavior, that can benefit from conceptualization in terms of abductive inference (Caruana & Cuccio, 2017). While further research is needed to clarify the potentialities and limitations of abduction in cognitive science and neuroscience, the following chapters will hopefully give the interested reader a useful, informative, and up-to-date entry point to this kind of study.
References
Bechtel, W., & Huang, L. T.-L. (2022). Philosophy of neuroscience. Cambridge University Press.
Bickle, J., Mandik, P., & Landreth, A. (2019). The philosophy of neuroscience. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Fall 2019 edition), https://plato.stanford.edu/archives/fall2019/entries/neuroscience/.
Calzavarini, F., & Cevolani, G. (2022). Abductive reasoning in cognitive neuroscience: Weak and strong reverse inference. Synthese, 200. https://doi.org/10.1007/s11229-022-03585-2
Calzavarini, F., & Viola, M. (Eds.). (2020). Neural mechanisms: New challenges in the philosophy of neuroscience. Springer Nature.
Caruana, F., & Cuccio, V. (2017). Types of abduction in tool behavior. Phenomenology and the Cognitive Sciences, 16(2), 255–273.
Douven, I. (2022). The art of abduction. Massachusetts Institute of Technology.
Ferretti, G., & Viola, M. (2019). How philosophical reasoning and neuroscientific modeling come together. In M. Fontaine, C. Barés-Gómez, F. Salguero-Lamillar, L. Magnani, & Á. Nepomuceno-Fernández (Eds.), Model-based reasoning in science and technology. Springer.
Kästner, L. (2017). Philosophy of cognitive neuroscience: Causal explanations, mechanisms and experimental manipulations. De Gruyter.
Lombrozo, T. (2012). Explanation and abductive inference. In K. J. Holyoak & R. G. Morrison (Eds.), The Oxford handbook of thinking and reasoning (pp. 260–276). Oxford University Press.
Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Poldrack, R. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59–63.
Roskies, A. (2021). Neuroethics. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Spring 2021 edition), https://plato.stanford.edu/archives/spr2021/entries/neuroethics/.
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
Davide Coraci, Fabrizio Calzavarini, and Gustavo Cevolani
Contents
Introduction
What Is Cognitive Neuroscience?
From Cognitive Processes to Neural Regions and Back Again
The Problem of Reverse Inference
Kinds of Reverse Inference
Modeling Reverse Inference: Abductive Reasoning
Modeling Reverse Inference: Bayesian Reasoning
Open Problems and Future Work
Reverse Inference as a Discovery Procedure
A Likelihoodist Approach to Reverse Inference
Task Relativization and Bayesian Priors
Reverse Inference from Brain Networks
From Univariate to Multivariate Analysis
Cognitive Ontology
Conclusion
References

D. Coraci, Models, Inference, and Decisions (MInD) Group, MoMiLab Research Unit, IMT School for Advanced Studies Lucca, Lucca, Italy; Institut d’Histoire et de Philosophie des Sciences et des Techniques (IHPST), Paris 1 Panthéon-Sorbonne University, Paris, France
F. Calzavarini, Department of Letter, Philosophy, Communication, University of Bergamo, Bergamo, Italy; Center for Logic, Language, and Cognition (LLC), University of Turin, Turin, Italy
G. Cevolani, Models, Inference, and Decisions (MInD) Group, MoMiLab Research Unit, IMT School for Advanced Studies Lucca, Lucca, Italy; Center for Logic, Language, and Cognition (LLC), University of Turin, Turin, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_60
Abstract
The chapter presents the problem of reverse inference in cognitive neuroscience, discussing its analysis in terms of abductive and Bayesian reasoning and highlighting the main methodological issues and open problems surrounding this crucial inferential practice with reference to case studies from current research.
Keywords
Neuroscience · fMRI · Bayesian confirmation · Inference to the best explanation · Abduction · Likelihoodism · NeuroSynth · MVPA · Cognitive ontology
Introduction
Establishing associations between cognitive functions and neural activations is a pivotal aspect of neuroscientific research, in particular when functional magnetic resonance imaging (fMRI) techniques are used for studying the brain. Recent discussion in neuroscience and philosophy has, however, highlighted the limitations of some inferential strategies commonly used to draw such associations. The most important one is “reverse inference,” which occurs when researchers draw conclusions about the recruitment of a certain cognitive process from the brain activation observed experimentally. Interestingly, reverse inference can be modeled as an instance of both abductive and probabilistic (Bayesian) reasoning, thus raising relevant methodological and philosophical questions. This chapter provides an overview of the problem of reverse inference from both a neuroscientific and a philosophical perspective. After a short historical introduction to neuroscientific research, the problem of reverse inference is presented by means of concrete examples from the literature. Then, the two main conceptual interpretations of reverse inference, the one in terms of abduction and the one in terms of Bayesian inference, are discussed, highlighting their potentialities and limitations. Finally, a number of open problems on which future research will likely focus in the next few years are briefly surveyed.
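To make the Bayesian reading of reverse inference mentioned above more concrete, the following minimal Python sketch applies Bayes' theorem to compute the probability that a cognitive process is engaged, given that activation is observed in a brain region. The function name, the parameter names, and all numerical values are hypothetical and chosen only for illustration; they are not taken from any study or from the chapter itself.

```python
def posterior_given_activation(prior, p_act_given_cog, p_act_given_not_cog):
    """Return P(Cog | Act) by Bayes' theorem.

    prior               -- P(Cog): prior probability that the process is engaged
    p_act_given_cog     -- P(Act | Cog): activation probability when it is engaged
    p_act_given_not_cog -- P(Act | not Cog): activation probability when it is not
    """
    for name, value in (
        ("prior", prior),
        ("p_act_given_cog", p_act_given_cog),
        ("p_act_given_not_cog", p_act_given_not_cog),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")

    # Total probability of observing the activation, P(Act).
    evidence = p_act_given_cog * prior + p_act_given_not_cog * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the activation has probability zero under both hypotheses")
    return p_act_given_cog * prior / evidence


# Illustrative (made-up) numbers: a fairly selective region ...
selective = posterior_given_activation(0.5, 0.8, 0.4)
# ... versus a region that is just as active when the process is absent.
unselective = posterior_given_activation(0.5, 0.8, 0.8)

print(f"selective region:   P(Cog | Act) = {selective:.3f}")
print(f"unselective region: P(Cog | Act) = {unselective:.3f}")
```

Under these assumed values, the posterior rises above the prior only when the region is more likely to be active with the process than without it; when the two likelihoods coincide, observing the activation leaves the prior unchanged, so the reverse inference carries no evidential weight.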
What Is Cognitive Neuroscience?
Neuroscience is the study of the brain’s activity and of its relationships with behavior and cognitive functions. Today, neuroscience is a flourishing and highly progressive enterprise, thanks to the rapid development of increasingly
powerful tools to study the brain and the details of its functioning. The experimental and theoretical advancements of neuroscience also raise many interesting issues from a methodological and philosophical point of view. The present section briefly presents the historical development of modern neuroscience and some of its main techniques and points to some crucial conceptual questions, which will be discussed in the rest of the chapter.
A Bit of History
From a historical point of view, the roots of modern neuroscientific research can be traced back to the nineteenth century, with some seminal contributions from different disciplines, including experimental psychology, biology, physiology, and medicine. For instance, the studies of Richard Caton (1842–1926) were among the first describing the electrical phenomena in the brain of living animals, while Camillo Golgi (1843–1926) and Santiago Ramón y Cajal (1852–1934) contributed to the understanding of the structure of the neuron. The anatomical mapping of the human cerebral cortex is instead due to the work of neurologist Korbinian Brodmann (1868–1918), who divided the cortex into 52 distinct regions based on their histological characteristics. In the meantime, lesional studies on brain-damaged patients carried out by Paul Broca (1824–1880), and later by Carl Wernicke (1848–1905), provided the first evidence in favor of the localization of the language functions in specific regions of the brain. In those same years, the first example of a neuroimaging technique – allowing the study of cognitive activity in living human brains – was also introduced by physiologist Angelo Mosso (1846–1910), who invented an instrument for recording blood pulsation changes in the cortex of neurosurgical patients performing various mental activities.
However, it was in the twentieth century that, with the increasing availability of techniques for investigating the structure and the functioning of the nervous system, neuroscience became an autonomous and rapidly developing research field. During the first decades of the new century, some fundamental techniques still widely employed today, such as electroencephalography (EEG; Adrian & Matthews, 1934a,b; Teplan et al., 2002; Tudor et al., 2005), were developed. Then, during the 1980s and the 1990s, neuroscientific research received another big boost from the first experimental studies using positron emission tomography (PET) and magnetic resonance imaging (MRI) (Ferris et al., 1980; Belliveau et al., 1991; Bandettini et al., 1992; Kwong et al., 1992; for a historical overview of MRI-based experiments, see Bandettini, 2012). With the introduction and increasingly rapid improvement of such techniques, neuroscience made impressive progress in understanding the nervous system and its basic mechanisms. Moreover, one central question that has interested neuroscience since its very beginning – i.e., that about the functional characterization of brain regions – gained new relevance. Over the centuries, the idea that specific cognitive functions can be localized in different regions of the brain has animated heated debates both in science and philosophy. It inspired both research programs like phrenology, which is now regarded as pseudoscientific, and also important advances in neuropsychology, like Broca’s lesional studies. Despite recurrent skeptical voices
D. Coraci et al.
(Uttal, 2001), the question of the localization of cognitive functions is central to modern neuroscience and keeps motivating interesting methodological discussion, especially about the reliability of MRI-based studies for inferring the recruitment of cognitive functions based on patterns of brain activation observed during experiments, as discussed below.
Cognitive Neuroscience and Functional Magnetic Resonance Imaging

As just recalled, the development of PET and MRI techniques for investigating the brain was a key factor in the recent flourishing of neuroscience as a research enterprise. In particular, they increased interest in the study of the neural basis of higher cognitive functions, such as language, memory, and decision-making, leading to what is called “cognitive neuroscience.” Since the chapter focuses especially on cognitive neuroscience, a closer look at these techniques is in order. Both PET and MRI allow researchers to study brain activity in living humans; for this reason, the two techniques can appear quite similar, even if they rest on significantly different principles. On the one hand, PET imaging detects metabolic changes related to brain activity by scanning the gamma rays generated by the radioactive decay of a contrast agent (i.e., the radiotracer) injected into the subject’s bloodstream before data acquisition. On the other hand, MRI measures the electromagnetic behavior of oxygen in the blood flowing through brain regions and can be employed in essentially two different ways. First, the so-called structural MRI is used for morphological analysis of the brain, which is mainly useful for clinical purposes. Second, the same technique can be used within an experimental setting where one measures the neural activity of participants performing a given cognitive task as compared to neural activity recorded during a control task or the resting-state condition. In this case, neuroscientists talk of “functional” MRI, or simply fMRI, where the local consumption of oxygen is considered an indirect indicator of neural activity.
In these experiments one tries to measure activity-related changes in blood oxygenation of neural cells – the so-called BOLD signal or “blood oxygenation level-dependent” signal – and to interpret it as an indicator of the cognitive activity which is the object of experimental manipulation. In more detail, to study how the activity of brain regions is affected by the experimental manipulation, the experimenter uses the MRI scanner to detect the BOLD signal at the level of single voxels or clusters of them. A voxel is the minimal three-dimensional unit at which the MRI scanner is able to acquire the BOLD signal and to produce images of the active brain. The resolution of MRI images reflects the size of the voxel, which generally is a cube with sides between 1 and 4 millimeters long. In a typical experimental setting, the BOLD signal is measured in two different conditions: the one that is the object of the experiment (e.g., a participant being engaged in a memory task) and a “control” one, where no specific task is involved. Then, the resulting measures are compared voxel-by-voxel, and a statistical assessment of their difference is provided. If a single voxel, or more
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
commonly a cluster of voxels, shows a significantly different activation during the experimental condition, as compared to the control one, this is taken as evidence that the corresponding brain region is associated with the cognitive process of interest (e.g., memory). From this, the neuroscientist can infer that the relatively more activated brain region was recruited for processing the experimental stimuli and, in turn, associate that area with the engagement of the cognitive function supposedly involved in the task.

Methodological Issues in Cognitive Neuroscience

The approach to analyzing fMRI data just described is known as “univariate” or “voxel-based” analysis (for a more technical introduction to the topic, see Davis et al. 2014). Employing this and more sophisticated approaches, in recent decades, researchers have studied many different associations between specific brain regions and various cognitive processes, such as moral decision-making (Greene et al., 2001), language (Poldrack, 2006), pain (Lieberman & Eisenberger, 2015), or even pathological conditions, such as vegetative and minimally conscious states (Tomaiuolo et al., 2016). The aim of these studies is to identify the neural bases or neural correlates of an increasing number of cognitive functions and processes, thus ideally “mapping” the whole brain relative to the localization of all these different functions. Of course, such an ambitious project raises a number of thorny issues at many levels. Leaving aside the assessment of the complex experimental and statistical techniques involved in this kind of analysis, a heated debate is also ongoing among both philosophers and neuroscientists about the methodological foundations and the philosophical implications of current neuroscience studies (e.g., Calzavarini & Viola, 2020; Bechtel & Huang, 2022).
Doubts can be raised, for instance, about the interpretation of the BOLD signal as an adequate proxy of the engagement of a certain cognitive function (see Roskies 2007, 2008 on the controversial status of fMRI images as evidence of neural activity in the brain); about whether statistical associations between cognitive functions and neural activity can actually provide evidence of psychoneural “bridge laws” in the sense of Del Pinal & Nathan (2013) and Nathan & Pinal (2016); or about the relevance and impact of scientific results for central philosophical issues like the mind/body problem as discussed within contemporary philosophy of mind (Churchland, 2008). In this chapter, most of these important problems will be sidestepped. Instead, a philosophy of science perspective is taken in order to study some aspects of the current research practices of cognitive neuroscientists. Accordingly, fMRI-based imaging is taken for granted as one of the most widely used, advanced, and promising approaches to studying the relations between the brain and the mind, leaving in the background the discussion about its foundations and relevance. More specifically, the focus will be on the problem of inference in cognitive neuroscience, i.e.: How justified is reasoning from the activity observed in some brain region to the engagement of some cognitive process and vice versa? How reliable is this kind of reasoning? And how can researchers assess the patterns of inference involved?
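As an illustration of the univariate, voxel-by-voxel comparison described earlier in this section, the following minimal Python sketch computes a paired t-statistic for each voxel across participants and flags the voxels whose statistic exceeds a threshold. The function names, the simulated BOLD values, and the threshold are assumptions made purely for illustration. Actual fMRI pipelines rely on dedicated software, model the time course of the BOLD signal, and correct for multiple comparisons, none of which is attempted here.

```python
import math
import random
from statistics import mean, stdev


def paired_t(task, control):
    """Paired t-statistic for one voxel.

    task, control: per-participant BOLD values in the two conditions.
    Assumes the paired differences are not all identical (nonzero spread).
    """
    diffs = [t - c for t, c in zip(task, control)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def active_voxels(task_data, control_data, t_threshold):
    """Return the voxels whose paired t-statistic exceeds the threshold.

    task_data, control_data: dict mapping voxel id -> list of values.
    """
    return [voxel for voxel in task_data
            if paired_t(task_data[voxel], control_data[voxel]) > t_threshold]


# Simulated (hypothetical) data: 5 voxels, 20 participants.
random.seed(0)
n_participants = 20
control = {v: [random.gauss(100.0, 1.0) for _ in range(n_participants)]
           for v in range(5)}
task = {v: [x + random.gauss(0.0, 1.0) for x in control[v]]
        for v in range(5)}
# Inject a task-related increase into voxel 2 only.
task[2] = [x + 3.0 for x in task[2]]

# The threshold 3.0 is arbitrary and uncorrected for multiple comparisons.
print(active_voxels(task, control, t_threshold=3.0))
```

A voxel flagged in this way only licenses the forward-looking claim that the region responded more strongly in the task condition than in the control one; what such a difference says about cognitive processes is precisely the inferential question raised above.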
From Cognitive Processes to Neural Regions and Back Again

One crucial goal of neuroscientific studies based on fMRI is establishing associations between cognitive functions or processes (like memory, emotion, language, and so on), on the one hand, and patterns of neural activation in the brain, on the other hand. To this aim, neuroscientists employ different strategies for reasoning about the relationships between a certain cognitive function Cog and the evidence, provided by fMRI, concerning some pattern of brain activation Act. Recent discussion has highlighted, in particular, two kinds of inference that neuroscientists employ to investigate these relationships and which are, in a sense, complementary (e.g., Poldrack, 2006; Henson, 2006; Calzavarini & Cevolani, 2022). The first is called forward inference (FI in the following) and consists in reasoning from the putative engagement of some process Cog in an experimental task to some specific activation Act. The second is known as reverse inference (RI henceforth), since it follows the opposite direction: from a pattern of neural activity Act to the engagement of a cognitive process Cog. As clarified in this discussion, forward and reverse inference are not equivalent and hence require different kinds of analysis; moreover, there is virtually unanimous agreement that RI is much more problematic than FI from a methodological point of view. The next section focuses on this problem of reverse inference, presents the main issues involved, and discusses in some detail how RI is actually performed in current neuroscientific research. The rest of the chapter will be devoted to the different methodological analyses advanced in order to model and justify RI as a sound pattern of scientific inference.
The Problem of Reverse Inference

Suppose, for example, that neuroscientists are interested in studying which neural regions are involved in the processing of emotions. In order to investigate this specific function, they will design an experimental setting involving emotionally salient stimuli that will be presented to subjects during fMRI scanning. The results of the experiment might show the activation of certain regions in the brain. Then, the neuroscientists can conclude that these regions represent the neural bases of the supposedly recruited process, i.e., emotion processing. Such a conclusion rests on the “forward” direction of the Cog-Act association, that is, the one that proceeds from the Cog (emotion processing in our example) to the Act (as observed in fMRI). This kind of forward inference is a crucial strategy employed by neuroscientists when they are interested in isolating a certain cognitive function Cog by means of an experimental task and localizing which brain region Act is mainly activated when participants undertake that task as compared to a control task or a resting-state condition (Poldrack & Yarkoni, 2016). Indeed, by repeating the experiment with different stimuli and under different conditions, researchers may hope to find a consistent and robust association between the engagement of Cog and some
reasonably well-specified pattern of activation Act, which can then be viewed as the “neural correlate” of Cog. If successful, this kind of study will result in a sort of “map” of the brain, which would ideally allow one to associate each specific cognitive function with its most representative neural correlate, thus providing a list of “bridge laws” connecting psychological to neural processes in the brain (Nathan & Pinal, 2016). Coming back to the previous example, suppose now that neuroscientists are ready to accept that the processing of emotions, i.e., Cog, is associated with the activation of a certain brain region Act, let’s say the amygdala. In other words, the fact that “if emotion processing is engaged then the amygdala is activated” is accepted as a robust relation on the basis of a sufficient number of FI studies published in the literature and documenting this association between Cog and Act. Now, suppose that researchers run another fMRI experiment, in which they investigate, say, the processing of language. From the data acquired during this experiment, they observe an increased activity in the amygdala. In light of this evidence and of the association between the processing of emotions and the amygdala established by previous studies, neuroscientists could then infer that in the experiment at issue emotion processing was actually recruited. In other terms, the activation of the amygdala suggests the occurrence of emotion processing during the experimental manipulation, regardless of the fact that the experimental task was not explicitly related to emotions but to language processing. This kind of reasoning, from an observed Act to the corresponding Cog which is putatively engaged, is a typical example of a reverse inference: from previous knowledge that Cog implies Act and from observing Act, one would like to conclude that Cog was actually engaged during the target task.
Of course, reverse inference is a powerful way of interpreting fMRI data in order to draw conclusions about the cognitive functions engaged during experimental tasks. In principle, it allows researchers both to discover new, and perhaps surprising, connections between brain regions and mental operations and to confirm hypotheses concerning the neural localization of specific processes. This explains why RI has been very widely employed since the early days of fMRI, to the point that Poldrack (2006, p. 59) can speak of veritable “epidemics” of this reasoning pattern. The problem of reverse inference arises when one realizes that this kind of inference is not logically valid and faces some deep problems that undermine its reliability. The most relevant one is the well-known problem of the “selectivity” of brain regions (Poldrack, 2006; Nathan & Del Pinal, 2017), i.e., the plain fact that the same brain region can be, and usually is, activated by more than one cognitive process. In other words, in virtually no real case does observing Act allow one to selectively conclude that one specific Cog is engaged; it is impossible, in general, to establish a one-to-one association between Cog and Act and claim that Act is selective only for a specific Cog. Accordingly, observing Act is not a sufficient condition for inferring the recruitment of Cog, making the grounds of RI quite shaky. In the next sections, it is shown how one can analyze the logical structure and methodological status of RI, in order to assess and possibly improve its reliability, by mitigating the selectivity problem. Before doing this, however, it will be useful
to take a closer look at how neuroscientists routinely perform reverse inferences on their data.
Kinds of Reverse Inference

As mentioned above, RI is a widespread inferential practice among neuroscientists. As noted by Poldrack & Yarkoni (2016), however, there are quite different ways in which one can reason from patterns of brain activation to the engagement of cognitive processes. Accordingly, they distinguish between “qualitative” and “quantitative” RIs, where the latter rely on more sophisticated analyses and automated tools for synthesizing results from published literature. Here, a slightly more fine-grained taxonomy is introduced, distinguishing three different ways of employing fMRI data to reverse infer mental functions; for each of them, one case study from the relevant literature is discussed.

Informal Reverse Inference

How humans reason about morality, assessing whether actions and behaviors are “right” or “wrong,” is a central topic of investigation in traditional philosophy. Two of the leading accounts in this part of the philosophical literature are consequentialism and deontology (Alexander & Moore, 2021). In a nutshell, while, for consequentialism, a behavior has to be considered morally appropriate depending on the consequences it brings about, deontologists claim that whether an act or intention should be judged morally grounded also depends on the worthiness of the act per se or, for instance, on its conformity with particular moral norms. One of the most famous problems that triggered this theoretical opposition is the moral dilemma known as the “trolley problem” (Bruers & Braeckman, 2014). First introduced by Philippa Foot (1978), this dilemma and its numerous variants, such as the “footbridge” dilemma by Judith J. Thomson (1976), clearly highlight the contrasting decisions we make when our moral intuitions are at stake.
Both the trolley problem and the footbridge dilemma depict a scenario in which a runaway trolley is about to kill five people who are tied up on a railway track, while you are standing near the track. However, the two versions differ in one respect. In the classical trolley version, you can divert the trolley onto a side track by pulling a lever, but you will kill a person on the side track. Instead, in the footbridge variant, you are on a bridge under which the trolley is passing, and to stop it, you can push a person off the bridge, killing the person. Consequentialism and deontology rely on different moral intuitions about these cases. According to consequentialism, given that it is only the effect that matters, killing one person instead of five should always be considered the most morally appropriate decision to make. On the contrary, deontologists claim that a relevant distinction exists between the two versions of the dilemma and that killing someone, even in order to rescue others, is morally questionable, in particular when one is asked to push a person off a bridge as in the footbridge scenario.
Moral psychologists take an empirical approach to problems of this kind, investigating how people actually reason when faced with moral scenarios like the trolley. In 2001, Joshua D. Greene and collaborators (Greene et al., 2001) were the first to suggest addressing the problem by studying the neuropsychological processes involved in moral decision-making (see also Klein 2011 and Del Pinal & Nathan 2013). Greene and colleagues investigated trolley-like scenarios during an fMRI experiment in which subjects were presented with a series of dilemmas and asked to judge a certain behavior (e.g., pulling the lever in the classical trolley scenario) as morally appropriate or not. The background assumption was that the moral intuitions involved in the different versions of the dilemma were associated with different cognitive and emotional responses. According to this view, the footbridge dilemma would elicit a greater emotional response than the trolley version: pushing a person off a bridge is an “up close and personal” (Greene et al., 2001, p. 210) behavior compared to pulling a lever and changing the trajectory of a trolley. For this reason, one may be more emotionally engaged when asked to evaluate a footbridge-like dilemma. Furthermore, the authors supposed that the putatively different cognitive and emotional responses engaged by the trolley and the footbridge dilemmas were reflected in the different neural activation of relevant brain regions. The analysis of the fMRI data revealed that the BOLD signal recorded during judgments in different moral dilemmas was significantly different.
In particular, by focusing on the areas showing higher activation during footbridge-like scenarios as compared to trolley-like ones, the authors identified many brain regions as differently recruited (i.e., the medial frontal, the posterior cingulate, and the angular gyri showed higher activity during footbridge-like scenarios, while the middle frontal gyrus and the bilateral parietal lobe were less engaged). In order to interpret these findings, the authors surveyed the fMRI literature, looking for the functional specialization of the mentioned brain regions. They found that the areas more active during footbridge-like scenarios were associated with emotional processing, while those less active were mainly related to cognitive functions such as working memory. Therefore, Greene and colleagues concluded that moral decision-making in “up close and personal” contexts such as the footbridge one involves greater emotional processing. Such reasoning reflects the structure of RI. Indeed, it is supported by the association the authors draw between brain regions observed in the experiment and a particular function, i.e., the processing of emotions, based on the available literature. In other words, given that previous fMRI studies associate the activity of the medial frontal, the posterior cingulate, and the angular gyri with emotions, they infer that greater emotional processing is required in moral decision-making concerning footbridge-like dilemmas as compared to the other experimental conditions. To identify this association, Greene and colleagues examined the relevant literature that previously investigated the functional specialization of the brain regions recruited during the fMRI experiment. They did that informally, that is, by manually surveying published fMRI studies and providing an overview of the current state of the art concerning the cognitive functions that have been more
reliably associated with the observed brain activations. This type of RI for assessing the relationship between brain areas and cognitive hypotheses can be called “informal” RI.

Systematic Reverse Inference

To illustrate systematic strategies for drawing RIs, the paper by Poldrack (2006) on reverse inference will be briefly discussed. This work represents a seminal attempt to analyze the reliability of the association between a certain cognitive function, such as language processing, and relevant brain regions by using a systematic search strategy on available fMRI data. In particular, Poldrack (2006) proposed to quantify the strength of the association between the cognitive process language and Broca’s area based on a database of fMRI studies, i.e., BrainMap (Fox & Lancaster, 2002; Fox et al., 2005). BrainMap has been developed for running meta-analyses of the neuroscientific literature, that is, providing a synthesis of the literature by comparing results from published papers studying a particular cognitive function or brain region. Poldrack used BrainMap criteria for systematically classifying fMRI studies according to the cognitive hypotheses they test (e.g., about memory or language), the experimental paradigm they use (e.g., whether data are acquired from subjects affected by specific pathological conditions), and the observed peaks of activation (i.e., the voxels of activation in the brain). In that way, it is possible to quantitatively assess the reliability of a potential RI regarding Broca’s area (a region anatomically located in the inferior frontal gyrus of the dominant hemisphere of one’s brain) and language by taking into account all the studies reporting an activation in that region and investigating that specific cognitive function among all the studies included in the database. In contrast to informal or qualitative approaches, systematic analyses are grounded upon standardized procedures for surveying the literature.
For instance, Poldrack (2006) used a strategy that shares relevant common aspects with ordinary meta-analyses of the fMRI literature. These methods lead to more reliable RIs between brain activations and cognitive functions as compared to informal ones but do not refer to any dedicated platform or software. This form of RI can be called “systematic” RI. The possibility of developing specific tools for surveying the literature and addressing the associations between neural evidence and cognitive functions has, however, inspired more recent solutions in the field of neuroinformatics.

Automated Reverse Inference

The last decade has seen the rise of several proposals for analyzing fMRI data and drawing reliable large-scale, automated inferences on them (Poldrack, 2010, 2011). The most famous example among these attempts is NeuroSynth (Yarkoni et al. 2011; NS in the following), the software used in Tomaiuolo et al. (2016). NS is a specific tool that synthesizes results from published literature. Like BrainMap, it includes a large database of coordinates of brain activations and cognitive processes investigated in fMRI studies. Based on these data, NS performs meta-analyses that allow results to be compared across experiments. In particular, NS
automatically extracts and processes information reported in fMRI studies by means of text-mining algorithms (Yarkoni et al., 2011; Rubin et al., 2017) and implements both frequentist and Bayesian analyses of RI, leading to estimates of the strength of the associations between cognitive functions (i.e., terms as proxies of them) and brain activations as conditional probabilities. This is the main difference between NS and BrainMap: while the former has been specifically developed for estimating the strength of the associations between cognitive functions reported in papers and brain regions and, therefore, implements particular algorithms for performing such an analysis in an automated way, the latter is software with the more general scope of supporting neuroscientists in running systematic meta-analyses of the literature. In 2016, Tomaiuolo and colleagues used NS to single out brain regions, such as the left angular gyrus, that have been associated with the processing of linguistic stimuli, sentences, and comprehension. They were interested in finding clues of residual awareness in patients suffering from vegetative or minimally conscious states. The neuroimaging literature, indeed, showed that brain areas within the language network activate during passive listening tasks, even in subjects affected by these pathological conditions. The type of RI associating language processing and the left angular gyrus on which, for instance, Tomaiuolo et al. (2016) based the interpretation of their experimental results is automatically carried out by the NS algorithm. Therefore, this form of RI can be called “automated” RI. Even if NS represents the main and most used tool for making large-scale RIs on fMRI studies, some of its applications have been criticized, and some of its fundamental aspects (like, for instance, its implementation of a Bayesian analysis of reverse inference) appear controversial.
In the following, the current debate surrounding NS is surveyed in order to point to some of its weaknesses. For the moment, it is sufficient to note that, in recent years, further software has been developed to address the limitations of NS by offering finer (Bayesian) analyses of fMRI data. A relevant example is the “Bayes fACtor mOdeliNg” tool, i.e., BACON (Costa et al., 2021; Cauda et al., 2020), that performs inferences by implementing Bayes’ factor analysis on the probability distributions originated from the BrainMap database.
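The idea of quantifying the strength of a reverse inference from a database of studies can be illustrated with a short calculation. The Python sketch below estimates P(Act | Cog) and P(Act | not Cog) from study counts and combines them with a prior through Bayes’ theorem, yielding P(Cog | Act) together with the corresponding Bayes factor. The function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow; the code is not the procedure implemented in BrainMap, NS, or BACON.

```python
def reverse_inference(n_cog_act, n_cog, n_other_act, n_other, prior=0.5):
    """Estimate P(Cog | Act) and a Bayes factor from study counts.

    n_cog_act:   studies targeting Cog that report activation in Act
    n_cog:       all studies targeting Cog
    n_other_act: studies NOT targeting Cog that report activation in Act
    n_other:     all studies not targeting Cog
    prior:       prior probability that Cog is engaged (an assumption)
    """
    p_act_given_cog = n_cog_act / n_cog
    p_act_given_not_cog = n_other_act / n_other
    # How much more expected Act is when Cog is engaged than when it is not.
    bayes_factor = p_act_given_cog / p_act_given_not_cog
    # Bayes' theorem for the binary partition {Cog, not Cog}.
    numerator = p_act_given_cog * prior
    posterior = numerator / (numerator + p_act_given_not_cog * (1 - prior))
    return posterior, bayes_factor


# Hypothetical counts: Act is reported in 200 of 1000 studies targeting Cog
# and in 100 of 1000 studies targeting other processes.
posterior, bayes_factor = reverse_inference(200, 1000, 100, 1000)
print(round(posterior, 3), round(bayes_factor, 1))  # 0.667 2.0
```

With these invented counts, observing Act raises the probability of Cog from the assumed prior of 0.5 to about 0.67. The less selective the region, that is, the closer P(Act | not Cog) is to P(Act | Cog), the closer the posterior stays to the prior.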
Modeling Reverse Inference: Abductive Reasoning

Starting with Poldrack (2006), both philosophers and neuroscientists have tried to classify the structure of RI, its potentialities, and its limitations. From a logical point of view, RI can be reconstructed as an argument based on premises (P1) and (P2) and conclusion (C):

(P1) In the fMRI literature, when the cognitive process Cog was putatively engaged under the experimental task T, the brain area Act was active.
(P2) In the present fMRI study, under the experimental task T, the activation of brain area Act is observed.
(C) Therefore, in the present study, the cognitive process Cog under task T is engaged.
(P1) states the presence of an association between the cognitive function and the brain activation: previous fMRI literature shows that if Cog is engaged, then the activation of Act is elicited. By interpreting (P1) as an if-then statement, the recruitment of Cog would represent a sufficient (but not necessary) condition for having the activation of Act. From a logical point of view, it is natural to formalize this conditional statement by means of material implication as “Cog → Act” where the engagement of Cog represents the antecedent and the activation of Act represents the consequent. In the argument above, it is assumed that this conditional statement is true. (P2), instead, defines the evidence observed in the current fMRI experiment, that is, the activation of the brain region Act related to experimental manipulation T. From a logical point of view, (P2) is interpreted as a true individual proposition concerning the occurrence of Act. Given (P1) and (P2), the argument leads to the conclusion (C): under the experimental manipulation T, in the current fMRI experiment, when area Act is activated, the cognitive process Cog is engaged. However, the logical structure of RI does not guarantee that the conclusion (C) is true, even if (P1) and (P2) are true. The reason is syntactical: given the definition of logical validity and the logical rules governing the meaning of material implication, researchers are not entitled to derive the truth of the conclusion (C) from the truth of the premises. This means that RI is not a deductively valid argument. Indeed, given the logical structure of RI, if neuroscientists were to infer the truth of the conclusion, they would commit a well-known logical fallacy, that is, the “fallacy of affirming the consequent” (Poldrack, 2006).
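The invalidity of this argument form can be checked mechanically by enumerating truth values. The following Python sketch is offered only as an illustration (the helper names are hypothetical): it treats an argument as valid when no assignment of truth values makes all premises true and the conclusion false, and it lists the counterexamples otherwise. For contrast, it also checks modus ponens and a variant in which the first premise is strengthened to a biconditional, corresponding to an idealized one-to-one association between Cog and Act.

```python
from itertools import product


def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def counterexamples(premises, conclusion):
    """Assignments (cog, act) with all premises true and conclusion false."""
    return [(cog, act)
            for cog, act in product([True, False], repeat=2)
            if all(p(cog, act) for p in premises) and not conclusion(cog, act)]


def is_valid(premises, conclusion):
    """An argument form is valid when it has no counterexample."""
    return not counterexamples(premises, conclusion)


# Reverse inference: from "Cog -> Act" and "Act", conclude "Cog".
ri_premises = [lambda cog, act: implies(cog, act), lambda cog, act: act]
print(is_valid(ri_premises, lambda cog, act: cog))         # False
print(counterexamples(ri_premises, lambda cog, act: cog))  # [(False, True)]

# Modus ponens: from "Cog -> Act" and "Cog", conclude "Act".
mp_premises = [lambda cog, act: implies(cog, act), lambda cog, act: cog]
print(is_valid(mp_premises, lambda cog, act: act))         # True

# One-to-one association: from "Cog <-> Act" and "Act", conclude "Cog".
iff_premises = [lambda cog, act: cog == act, lambda cog, act: act]
print(is_valid(iff_premises, lambda cog, act: cog))        # True
```

The single counterexample to RI is the case in which Act is observed although Cog is not engaged, which is exactly the situation that arises when a region is activated by some other cognitive process.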
Of course, the fallible status of RI is also determined by empirical factors, such as the lack of a conclusive one-to-one relationship between Cog and Act, an issue known in the literature as the “problem of selectivity” of brain regions (Poldrack, 2006). Indeed, if there were a unique, one-to-one association between Cog and Act, the recruitment of Cog would represent both a sufficient and a necessary condition for having the activation of Act, and, in turn, the entire argument would be logically valid. For the above reasons, a more fruitful way to conceptualize reverse inference is to consider it as an instance of abduction (Poldrack, 2006; Bourgeois-Gironde, 2010). The term “abduction” was first coined by Peirce (1931). Peirce called “abduction” the pattern of reasoning – for which he also used the terms “retroduction” (CP 1.68) or “hypothesis” (CP 2.623) – involved in “the operation of adopting an explanatory hypothesis” (CP 5.189) for a given piece of evidence. For instance, “Fossils are found; say, remains like those of fishes, but far in the interior of the country. To explain the phenomenon, we suppose the sea once washed over this land” (CP 2.625). Peirce clearly saw that, even if the truth of the premises is taken for granted, the conclusion of an abductive argument may be false: in other words, like induction and contrary to deduction, abduction is a form of ampliative and
uncertain reasoning. The logical form of an abductive inference, according to Peirce (CP 5.189), is the following:

(P1) The surprising fact, E, is observed.
(P2) But if H were true, E would be a matter of course.
(C) Hence, there is reason to suspect that H is true.

The inferential pattern “if H then E; but E; therefore, H” clearly reflects the structure of the argument outlined above for RI and, consequently, is an instance of the fallacy of “affirming the consequent.” However, as noted by Peirce (CP 5.192), a scientific argument can be effective in making its conclusion worthy of further consideration, even though it is logically invalid. Accordingly, although their conclusions are always tentative and conjectural, Peirce argued that abductive arguments provide a fundamental form of inference both in scientific and everyday reasoning. In its modern reading, abductive reasoning is usually equated with “inference to the best explanation” (e.g., Harman, 1965). The “textbook version of abduction” (Douven, 2021) is something along the following lines: “[g]iven evidence E and candidate explanations H1, . . . , Hn of E, infer the truth of that Hi which best explains E.” This basic formulation raises a number of critical issues, having to do with the correct explications of the notions of candidate explanations and best explanation. A critical issue about the justificatory status of abduction has to do with the criteria for deciding which is the best among the alternative candidate explanations (cf. Lipton 2003). One immediate suggestion is to identify the best explanation with that hypothesis Hi that is most probable given the evidence E (i.e., it has the highest degree of posterior probability) or that is most strongly supported by E (i.e., it has the highest degree of confirmation).
In this sense, as some scholars have suggested (Salmon, 2001), abduction can be formalized using Bayes’ theorem or one of the measures of probabilistic support studied within Bayesian confirmation theory (see Crupi 2020 and Niiniluoto 2018). However, the relation, and even the compatibility, between abduction and Bayesian reasoning is quite controversial, for there is no direct and clear connection between probabilistic and explanatory considerations in assessing hypotheses (see Douven 2021 and Sprenger & Hartmann 2019 for discussion). Given its centrality in scientific investigation, the concept of abduction can shed light on the nature of RI and how it is effectively used by neuroscientists for interpreting neuroimaging evidence. Indeed, the Peircean schema can be used to rephrase RI and further analyze it as a form of abduction. In particular, this may shed new light on the problem of the lack of selectivity of brain regions for specific cognitive functions. To see how the problem of selectivity affects research practices in neuroscience, it will be useful to briefly consider the debate recently raised by Lieberman & Eisenberger (2015). In this work, the authors (L&E in the following) planned to investigate whether the dorsal anterior cingulate cortex (dACC), a region generally associated with a variety of cognitive functions, is selective for a specific one, that is, pain. As mentioned in the previous section, L&E used NS to study
D. Coraci et al.
the correlation between the activation of the dACC and four cognitive processes (including pain), known to be associated with the dACC. Their conclusion was that: Whereas psychological processes and tasks related to pain, executive processes, conflict, and salience all reliably activate the dACC, the only psychological phenomenon that can be reliably inferred given the presence of dACC activity is pain. (Lieberman & Eisenberger, 2015, p. 15522)
As noted by Calzavarini & Cevolani (2022), L&E’s reasoning constitutes a clear example of abductive reasoning as inference to the best explanation, where the activity of dACC represents the evidence E that should be explained, while the alternative psychological processes associated with the dACC activity represent the different candidate explanations H1, H2, ..., Hn of E. However, this conclusion has been widely criticized by other neuroscientists and by the developers of NS themselves (Lieberman, 2015; Shackman, 2015; Yarkoni, 2015a,b; Wager et al., 2016; Gelman, 2017), who pointed out both specific issues affecting L&E’s analysis and, more generally, the intrinsic epistemic unreliability of RI as a form of reasoning, even when such reasoning is based on the automated analyses implemented in NS.
Nevertheless, one could argue that having a logically valid structure or absolute epistemic reliability is not a necessary requirement for scientific reasoning. Science is ampliative, based on inductive rather than deductive inferences. Indeed, the majority of scientific hypotheses, the process of discovery of scientific statements, and their empirical testing do not rely on a rigid logical formalization. The same holds for neuroscience. These aspects, first noted by Poldrack (2006), have opened the debate about RI. The main questions under investigation have been: How justified are RIs? Can researchers systematically rely on RIs in neuroimaging research? How can reverse inference be improved? As shown in the following section, these questions have primarily been addressed in the probabilistic framework of Bayesian analysis.
Modeling Reverse Inference: Bayesian Reasoning
The most advanced proposal to make RI more epistemically reliable can be traced back, once again, to Poldrack’s seminal paper “Can cognitive processes be inferred from neuroimaging data?” (Poldrack, 2006). Poldrack’s work represents the turning point for the entire discussion of the problem of RI and, in particular, for the Bayesian approaches developed in the literature. This paper largely inspired the debate for three main reasons. First, it states the general conceptual framework for formalizing RI and provides the neuroscientific community with formal tools and measures for quantitatively analyzing the reliability of such a fallible inference. Second, it represents the background for further refinements of the Bayesian analysis of RI. For instance, it gave rise to relevant discussions among methodologists and philosophers offering competing alternatives for modeling RI (e.g., task relativization, see Hutzler 2014). Third, the Bayesian approach introduced by Poldrack motivated the development of meta-analytical platforms and software
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
aiming at drawing automated RIs based on literature-mining and large-scale fMRI databases, the most widely used being the NeuroSynth platform (Yarkoni et al., 2011) discussed earlier in the chapter.
Bayesian inference focuses on estimating the probability of a certain hypothesis in the light of some available empirical evidence. Applying Bayes’ theorem, one can compute the posterior probability p(H|E) of the hypothesis H, conditioned on the observation of some piece of evidence E. According to Poldrack (2006), this idea can be applied to assess the strength of the empirical evidence provided in favor of the recruitment of a specific cognitive process. In other words, reverse inference amounts to a Bayesian inference where the probability of the engagement of Cog is evaluated on the activation Act by computing p(Cog|Act) and further updated according to Bayes’ rule as more evidence is observed. Calling Cog the target cognitive hypothesis and Act the evidence relative to a neural activation, Bayes’ theorem gives us the desired posterior probability, as follows:

$$p(\mathrm{Cog} \mid \mathrm{Act}) = \frac{p(\mathrm{Act} \mid \mathrm{Cog})\, p(\mathrm{Cog})}{p(\mathrm{Act})} \tag{1}$$
As Bayes’ formula shows, p(Cog|Act) depends on two other probabilities: the probability of the hypothesis before evidence is observed, i.e., the “prior” probability p(Cog), and the conditional probability p(Act|Cog) of observing the evidence Act given that the hypothesis is true, i.e., the “likelihood” of Cog. The following alternative, but mathematically equivalent, formulation of Bayes’ theorem is more useful for the sake of applications:
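
$$p(\mathrm{Cog} \mid \mathrm{Act}) = \frac{p(\mathrm{Act} \mid \mathrm{Cog})\, p(\mathrm{Cog})}{p(\mathrm{Act} \mid \mathrm{Cog})\, p(\mathrm{Cog}) + p(\mathrm{Act} \mid \neg\mathrm{Cog})\, p(\neg\mathrm{Cog})} \tag{2}$$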
In the above formulation, p(Cog|Act) becomes a function of three (independent) probabilistic estimates: the prior probability of Cog, its likelihood p(Act|Cog), and the likelihood p(Act|¬Cog) of its negation. (The other prior probability appearing in Eq. 2, i.e., that of ¬Cog, is immediately calculated as p(¬Cog) = 1 − p(Cog).) The Bayesian analysis just outlined makes it possible to quantitatively estimate the credibility that a cognitive hypothesis receives from the observations made in fMRI experiments. In this way, it can mitigate the problem of selectivity (pointing to the most probable hypothesis among the ones that can explain the evidence), and it can help in assessing the “strength” of a given reverse inference, thus going beyond the purely qualitative analysis based on the notion of abduction.
In his paper, Poldrack offers a concrete example of how to apply the Bayesian analysis of reverse inference by discussing how to assess the selectivity of the so-called Broca’s area for the processing of language. To this purpose, Poldrack performs what we earlier called a systematic reverse inference, deploying the fMRI data contained in the BrainMap database (Fox & Lancaster, 2002; Fox et al., 2005).
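As a minimal sketch of Eq. 2, the posterior can be written as a function of the three estimates just listed. The function name and the numerical inputs below are illustrative assumptions, not values taken from any study; the loop only shows how strongly the result depends on the chosen prior.

```python
def posterior(prior: float, lik: float, lik_neg: float) -> float:
    """Eq. 2: p(Cog|Act) from p(Cog), p(Act|Cog), and p(Act|not-Cog)."""
    for name, value in (("prior", prior), ("lik", lik), ("lik_neg", lik_neg)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    # Denominator of Eq. 2: p(Act), expanded by the law of total probability.
    evidence = lik * prior + lik_neg * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("p(Act) is zero under both hypotheses")
    return lik * prior / evidence


# Hypothetical likelihoods (assumed for illustration only).
LIK, LIK_NEG = 0.4, 0.2
for prior in (0.1, 0.5, 0.9):
    print(f"p(Cog) = {prior:.1f} -> p(Cog|Act) = {posterior(prior, LIK, LIK_NEG):.3f}")
```

With the likelihoods held fixed, the posterior rises with the prior, which is why the choice of priors is a recurring point of discussion for the Bayesian analysis of RI.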
Table 1 A frequency table from the BrainMap database for language processing and the activation of Broca’s area (From Poldrack, 2006, p. 60)
                 Language study    Not language study
Activated        166               199
Not activated    703               2154
A systematic search in BrainMap produces a 2-by-2 frequency table (see Table 1) with the number of fMRI studies in the database that investigate the language function (or not) and that report activation in Broca’s area (or not). Using these numbers, Poldrack can estimate the likelihoods of the cognitive process “language” (Lan) and of its absence as the following ratios: $p(\mathrm{Act}_{\mathrm{Broca}} \mid \mathrm{Lan}) = \frac{166}{166+703} \approx .191$ and $p(\mathrm{Act}_{\mathrm{Broca}} \mid \neg\mathrm{Lan}) = \frac{199}{199+2154} \approx .084$. By assuming uniform prior probabilities p(Lan) = p(¬Lan) = 0.5 and filling these numbers in Eq. 2, one then obtains the following:

$$p(\mathrm{Lan} \mid \mathrm{Act}_{\mathrm{Broca}}) = \frac{p(\mathrm{Act}_{\mathrm{Broca}} \mid \mathrm{Lan})\, p(\mathrm{Lan})}{p(\mathrm{Act}_{\mathrm{Broca}} \mid \mathrm{Lan})\, p(\mathrm{Lan}) + p(\mathrm{Act}_{\mathrm{Broca}} \mid \neg\mathrm{Lan})\, p(\neg\mathrm{Lan})} = \frac{.191}{.191 + .084} \approx .694 \tag{3}$$
In this way, a Bayesian analysis of reverse inference shows that the (posterior) probability that language processing is engaged, given the observation of activation in Broca’s area, is roughly 0.69 (Poldrack, 2006, p. 62). This points toward a mild credibility for the language hypothesis given the available evidence: relative to the prior belief state where researchers assumed equal probability for both Lan and ¬Lan, after observing the activation in Broca’s area, they are now justified in believing that the engagement of language processing is clearly more probable than not.
Following a quite widespread practice, Poldrack (2006, p. 62) suggests using the so-called Bayes factor to assess how strong the researcher’s confidence in the conclusion of the above reverse inference should be. The Bayes factor BF is a measure of the “strength of evidence” in favor of (or against) a given hypothesis, and it can be interpreted as a measure of inductive support or empirical confirmation in the sense of Bayesian confirmation theory (Crupi, 2020; Sprenger & Hartmann, 2019; Coraci & Cevolani, 2022, L’analisi bayesiana dell’inferenza inversa in neuroscienza: una critica, Manuscript). In the example seen above, the Bayes factor in favor of Lan can be simply computed as the likelihood ratio, i.e., as the ratio of the likelihood of Lan on ActBroca to that of its negation, as follows:
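
$$\mathrm{BF}(\mathrm{Lan}, \mathrm{Act}_{\mathrm{Broca}}) = \frac{p(\mathrm{Act}_{\mathrm{Broca}} \mid \mathrm{Lan})}{p(\mathrm{Act}_{\mathrm{Broca}} \mid \neg\mathrm{Lan})} = \frac{.191}{.084} \approx 2.3 \tag{4}$$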
Table 2 The Bayes factor as a measure of empirical support (From Jeffreys, 1961)
BF             Support in favor of the hypothesis
BF > 10        Strong
3 < BF < 10    Moderate
1 < BF < 3     Weak
BF = 1         No support
The above calculation should be interpreted as a measure of the evidential support given by the observation of ActBroca to the engagement of Lan: following the convention introduced by the Bayesian statistician Harold Jeffreys (1961), this amounts to “weak” evidence in favor of Lan (see Table 2). This simple example discussed by Poldrack highlights the general idea and the potential of the Bayesian analysis of reverse inference as based on the notions of posterior probability and empirical support. While such analysis is surely one of the most promising frameworks for reasoning about reverse inference and other methodological issues in neuroscience practice, it is not the only proposal on the market, nor is it without problems. In the next, and final, section, some open problems in current discussion and some possible routes for further development are discussed.
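The Broca’s area example can be retraced step by step in a few lines. The sketch below takes the four counts from Table 1 and the uniform priors assumed by Poldrack; the function names, and the handling of the boundary values 3 and 10 (which Table 2 leaves open), are assumptions made here for illustration. Because the code works with unrounded likelihoods, its results differ from the rounded figures in the text only in the third significant digit.

```python
# Counts from Table 1 (Poldrack, 2006): BrainMap studies by type and Broca activation.
ACT_LAN, NOACT_LAN = 166, 703          # language studies
ACT_NOTLAN, NOACT_NOTLAN = 199, 2154   # non-language studies


def jeffreys_label(bf: float) -> str:
    """Support category in the spirit of Table 2.

    Assumption: the boundary values 3 and 10 are assigned to the lower
    category, and BF < 1 is reported as evidence against the hypothesis.
    """
    if bf > 10:
        return "strong"
    if bf > 3:
        return "moderate"
    if bf > 1:
        return "weak"
    if bf == 1:
        return "no support"
    return "against"


# Likelihoods estimated as relative frequencies.
lik_lan = ACT_LAN / (ACT_LAN + NOACT_LAN)               # p(Act_Broca | Lan)
lik_notlan = ACT_NOTLAN / (ACT_NOTLAN + NOACT_NOTLAN)   # p(Act_Broca | not-Lan)

prior_lan = 0.5  # uniform prior, as in Poldrack (2006)
post_lan = lik_lan * prior_lan / (lik_lan * prior_lan + lik_notlan * (1 - prior_lan))  # Eq. 2
bayes_factor = lik_lan / lik_notlan                                                   # Eq. 4

print(f"p(Act|Lan) = {lik_lan:.3f}, p(Act|not-Lan) = {lik_notlan:.3f}")
print(f"p(Lan|Act) = {post_lan:.2f}, BF = {bayes_factor:.1f} ({jeffreys_label(bayes_factor)})")
```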
Open Problems and Future Work
This section presents some recent trends in the discussion about reverse inference and methodological issues in (cognitive) neuroscience, with a focus on those that deserve more urgent attention from philosophers and practicing scientists and provide more promising routes for further research in this area.
Reverse Inference as a Discovery Procedure
As already discussed above, RI can be conceptualized as a form of abductive reasoning and often as a form of inference to the best explanation. In this “strong” interpretation, RI can be formulated as a rule of acceptance, since it gives reasons to tentatively accept its conclusion as the “best” explanatory hypothesis of a given brain activation among the available ones. Consequently, RI has been discussed from the very beginning in the context of justification rather than in the context of discovery. In the philosophical debate, however, it has become clear that there is at least one other way of assessing the proper role of abductive inference. According to this “weak” or “strategic” interpretation, abduction has a primary discovery (or “heuristic,” see Schurz 2017, p. 153) function: that of suggesting or finding promising or “test-worthy” hypotheses, which are then submitted to further inquiry or empirical testing. This interpretation is mainly related to the context of discovery and the formulation of new hypotheses for explaining evidence. In this sense, the heuristic or exploratory status of abductive reasoning has been examined, leading to
a relevant area of research discussing abductions as selective and creative patterns of reasoning (Magnani, 2009, 2011; Schurz, 2017). Given the twofold interpretation of abductive reasoning, the same can be expected for RI. Indeed, RI can apparently play two different roles: a “strong” one, in the context of justification of neuroscientific hypotheses, and a “weak” one, having strategic or heuristic purposes in the context of scientific discovery. According to this second interpretation, RI represents a strategy for identifying new hypotheses in need of further investigation. As noted by Calzavarini & Cevolani (2022), the current neuroscientific and philosophical literature on RI seems to have underestimated this strategic role of RI in favor of the strong interpretation, even if Poldrack recognized that “reverse inference might be useful in the discovery of interesting new facts about the underlying mechanisms” (Poldrack, 2006, p. 60). This is problematic because taking into account the strategic or discovery function of RI is crucial to make sense of current neuroscientific practice. In many cases, RI is indeed employed as a search strategy that tells us which hypotheses about the cognitive interpretation of a given brain activation neuroscientists should submit to further inquiry and/or as a tool for making new hypotheses and assisting discovery. Thus, investigating the nature and limits of weak RI remains an important open task. The exploratory investigation provided in Calzavarini & Cevolani (2022) suggests that weak RI is indeed performed in current neuroscientific research and instantiates most of the strategic functions that philosophers have traditionally assigned to abduction.
In particular, the authors examine three case studies, illustrating both the selective function of weak RI (i.e., individuating a restricted set of plausible hypotheses worthy of further empirical testing) and its creative function (i.e., suggesting a partially or radically new psychological interpretation of a given brain activation). As an example, a recent investigation by Pauli et al. (2016), using the NS database, of the cognitive functions associated with the human striatum can be interpreted as a case of “creative” reverse inference. Indeed, the results of this study revealed some associations that had not been highlighted in previous literature, thus “extend[ing] previous knowledge of the involvement of the striatum in reward-related decision-making tasks” (p. 1911). According to Calzavarini & Cevolani (2022), acknowledging the role of weak RI in current research practice thus sheds new light on its methodological role and may mitigate the skepticism that presently surrounds RI within the community.
A Likelihoodist Approach to Reverse Inference
The Bayesian analysis discussed above represents the main approach in the literature for reconstructing RI. However, other accounts have been proposed in recent years, such as the “likelihoodist” one defended by Machery (2014). Machery also reconstructs the association between cognitive hypotheses and brain activations in statistical terms; however, his approach differs in important respects from the Bayesian analysis offered by Poldrack.
Machery’s analysis of RI is based on a conception of evidential support that is different from the one assumed within a Bayesian framework. According to him, indeed, the main role of RI is not to quantitatively assess the empirical support given to the individual hypothesis Cog but to evaluate two hypotheses in a comparative way. Such a comparison is carried out in “likelihoodist” terms: the hypothesis that is more empirically supported is the one with the higher likelihood (Edwards, 1976; Royall, 1997; Forster, 2006; Forster & Sober, 2010; Gandenberger, 2016). On this approach, RI is modeled by means of the conditional probability, or likelihood, of the evidence Act given the hypothesis Cog, i.e., p(Act|Cog). When the RI is drawn, Machery claims, neuroscientists can conclude that a certain pattern of brain activation Act provides evidential support for the hypothesis Cog1 over another hypothesis Cog2 if and only if Act is more likely to be observed when Cog1 is engaged than when Cog2 is recruited. More formally, Act supports the hypothesis Cog1 over Cog2 if and only if p(Act|Cog1) > p(Act|Cog2). Therefore, RI succeeds in correctly explaining a certain observation if it selects, in a comparative way, the cognitive function associated with the highest likelihood. The likelihoodist approach advanced by Machery has the main advantage of avoiding the problem of setting the priors (for a critical analysis, see Coraci and Cevolani, 2022, L’analisi bayesiana dell’inferenza inversa in neuroscienza: una critica, Manuscript) but appears less compelling in other respects. For a discussion of the pitfalls of the likelihoodist approach, see Machery (2014) and Glymour & Hanson (2016).
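The comparative criterion p(Act|Cog1) > p(Act|Cog2) can be sketched as follows. The function names and the likelihood values are hypothetical, chosen only for illustration; the point of the sketch is that no prior probability enters the comparison.

```python
from typing import Dict, Optional


def favors(lik_1: float, lik_2: float) -> bool:
    """True if Act supports Cog1 over Cog2, i.e., p(Act|Cog1) > p(Act|Cog2)."""
    return lik_1 > lik_2


def best_supported(likelihoods: Dict[str, float]) -> Optional[str]:
    """Hypothesis with the highest likelihood p(Act|Cog); None if the top value is tied."""
    top = max(likelihoods.values())
    winners = [name for name, lik in likelihoods.items() if lik == top]
    return winners[0] if len(winners) == 1 else None


# Hypothetical likelihoods p(Act|Cog) for three candidate processes
# (illustrative numbers, not estimates from any database).
candidates = {"pain": 0.30, "conflict": 0.20, "salience": 0.25}

print(favors(candidates["pain"], candidates["conflict"]))
print(best_supported(candidates))
```

A tie is reported as `None` rather than resolved arbitrarily, since on this account equal likelihoods mean that the evidence does not discriminate between the hypotheses.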
Task Relativization and Bayesian Priors
Poldrack notes that the prior always depends on a specific task and should therefore be expressed as the conditional probability p(Cog|T). For the sake of simplicity, he does not implement this further conditionalization, even if it could represent a relevant aspect of the Bayesian analysis of RI, as underlined in Hutzler (2014). A revised formulation of the Bayesian approach to RI that takes into account the role of the task is discussed in several works (Christoff & Owen, 2006; Del Pinal & Nathan, 2013; Hutzler, 2014). Here, the proposal in Hutzler (2014) will be briefly outlined. Hutzler starts by noting that the reliability of RI is affected by the lack of “functional specificity” of brain activations. The level of functional specificity of a certain region Act, Hutzler argues, is reflected within the Bayesian approach by the likelihood p(Act|¬Cog), i.e., the false-alarm rate. This likelihood estimates how often Act is active in the absence of the cognitive process Cog. Of course, from a quantitative point of view, a high false-alarm rate decreases the overall posterior probability p(Cog|Act) computed via Bayes’ rule. Therefore, Hutzler proposes, by taking into account the specific task or experimental manipulations involved in fMRI studies, there is the chance of reducing the false-alarm rate. According to this perspective, RI would not be simply dependent on the functional specificity of the considered brain region but on its “task-specific
functional specificity” (Hutzler, 2014, p. 1063). Hutzler’s revised formulation of the Bayesian analysis of RI is outlined below, where the prior p(Cog), the likelihood p(Act|Cog), and the posterior probability from Eq. 2 are conditioned upon the experimental context under which the data are acquired, i.e., the task T:

$$p(\mathrm{Cog} \mid \mathrm{Act} \wedge T) = \frac{p(\mathrm{Act} \mid \mathrm{Cog} \wedge T)\, p(\mathrm{Cog} \mid T)}{p(\mathrm{Act} \mid \mathrm{Cog} \wedge T)\, p(\mathrm{Cog} \mid T) + p(\mathrm{Act} \mid \neg\mathrm{Cog} \wedge T)\, p(\neg\mathrm{Cog} \mid T)} \tag{5}$$
This approach offers several advantages. First, the experimental manipulation is surely pivotal for the recruitment of a certain cognitive function rather than another. So, it appears both conceptually and experimentally meaningful to assess how the task can increase the chance of engagement of Cog. Indeed, when neuroscientists design experiments, it is natural to suppose that their expectations about the recruitment of a particular cognitive function rather than another vary according to the specific task used. Second, even though Hutzler presents task relativization as only affecting likelihoods (see Hutzler 2014, p. 1064 and p. 1068), this proposal can easily be extended to prior probabilities as well. For instance, the prior related to the overall engagement of Cog would become the conditional probability of Cog given the task T, i.e., p(Cog|T). Task-related priors have two main advantages: (1) they can realistically reflect what we described above, that is, the researchers’ beliefs or expectations about Cog when a task is chosen for the experiment, and (2) they avoid setting both priors (for Cog and ¬Cog) at 0.5, as originally proposed in Poldrack (2006), with relevant consequences for the assessment of the strength of RI via BF. In general, by narrowing the set of fMRI studies considered in the analysis (i.e., only to those implementing a specific task), task relativization would reduce the number of potential cognitive explanations of the observed activation. Indeed, if the task at issue can effectively recruit the expected cognitive function, task relativization increases the overall accuracy of RI. The false-alarm rate decreases, while the prior and the likelihood, i.e., respectively, p(Cog|T) and p(Act|Cog ∧ T), increase. This affects the final results and, in turn, the precision and the reliability of the inference (Hutzler, 2014).
While Hutzler’s proposal appears highly relevant to improve the current Bayesian analysis of RI, other scholars (e.g., Del Pinal & Nathan, 2013; Poldrack, 2014) have pointed out the limits of this account, emphasizing conceptual and empirical aspects in need of further discussion.
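The effect of task relativization on Eq. 5 can be made concrete with a small sketch. All counts and priors below are hypothetical and were chosen only to mimic the situation described above, in which restricting the analysis to studies using task T lowers the false-alarm rate and raises both the likelihood and the prior; they are not drawn from any database, and the function name is an assumption.

```python
from typing import Tuple

# (activated & Cog, not activated & Cog, activated & not-Cog, not activated & not-Cog)
Counts = Tuple[int, int, int, int]


def posterior_from_counts(counts: Counts, prior: float) -> float:
    """Posterior p(Cog|Act) via Eq. 2; with task-restricted counts and prior p(Cog|T), Eq. 5."""
    act_cog, noact_cog, act_not, noact_not = counts
    hit_rate = act_cog / (act_cog + noact_cog)      # p(Act|Cog) or p(Act|Cog and T)
    false_alarm = act_not / (act_not + noact_not)   # p(Act|not-Cog) or p(Act|not-Cog and T)
    return hit_rate * prior / (hit_rate * prior + false_alarm * (1.0 - prior))


# Hypothetical counts over all studies, with a uniform prior.
all_studies: Counts = (200, 800, 200, 1800)   # hit rate 0.20, false-alarm rate 0.10
# Hypothetical counts restricted to studies using task T, with a task-related prior.
task_studies: Counts = (40, 60, 10, 190)      # hit rate 0.40, false-alarm rate 0.05

print(f"unrestricted:    {posterior_from_counts(all_studies, prior=0.5):.3f}")
print(f"task-relativized: {posterior_from_counts(task_studies, prior=0.7):.3f}")
```

In this toy setting the task-relativized posterior is markedly higher, but the result depends entirely on the assumed counts and priors; a task that does not reliably recruit Cog would not produce this gain.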
Reverse Inference from Brain Networks
As explained above, the main problem surrounding reverse inference is the lack of selectivity of brain regions. The activation of the very same region can occur in relation to the recruitment of different cognitive functions, during distinct tasks. Such a “pluripotency” of brain areas determines a one-to-many mapping
between activations and cognitive processes that dramatically reduces the reliability of inferences relying on these associations. As a way of mitigating this crucial issue, Klein (2012) proposes to pursue a brain network analysis as opposed to the classical perspective based on single neural regions. He illustrates this alternative by taking as an example the brain regions activated across moral decision-making tasks. In studies of this type, researchers commonly report the activation of a set of areas, including, among others, the posterior portion of the cingulate cortex, the temporoparietal junction, and the ventromedial prefrontal cortex (Klein, 2011). This evidence identifies a network of brain regions associated with experimental tasks in which moral decision-making and social self-projection are involved (Klein, 2011, p. 958). Klein argues that focusing on brain networks and the associated tasks can represent a promising way of addressing the issue of poorly selective brain areas. According to this approach, local activations associated with different types of moral decision-making tasks should be interpreted as specific responses within the network, depending on the type of stimuli used, and not as distinct brain patterns selective for different moral dilemmas (as argued, for instance, by Greene et al. 2001). Moreover, Klein continues, referring to single regions appears misleading, given the methodological issues still affecting neuroimaging techniques and the interpretation of fMRI evidence. This approach reflects a more general and recent attempt of the fMRI community to focus on the role of brain networks for understanding human cognition (e.g., Ramsey et al., 2010; Smith et al., 2011, 2013). For instance, a similar strategy grounding RI on the identification of causal networks in the brain has been proposed by Glymour & Hanson (2016).
However, such a proposal seems to be strongly dependent on the neuroscientists’ ability to categorize and describe experimental tasks and, thus, to provide an adequate “cognitive ontology” for inferring that similar experimental manipulations are associated with the recruitment of the same brain network (Klein, 2012, p. 957).
From Univariate to Multivariate Analysis
As discussed so far, reverse inference concerns the association between a cognitive process and an activation “localized” in a specific brain area. This “location-based” view of RI is the most standard one in the neuroscientific literature. However, more recent research has targeted a different form of RI that can be called “pattern-based” or “decoding-based” reverse inference (Del Pinal & Nathan, 2017; Nathan & Del Pinal, 2017; Ritchie et al., 2019; Weiskopf, 2021). The crucial aspect that distinguishes these two types of RI is how the acquired fMRI data are analyzed. The “location-based” RI rests on the univariate or voxel-based analysis of fMRI data. Suppose that a neuroscientist is interested in finding out the brain areas that show higher activation during a specific experimental manipulation in an fMRI setting. The simplest way to do that is to scan the brain activity in a group of subjects both under the experimental manipulation of interest and at rest. In this way, the neuroscientist would be able to compare subjects’ brains at work with neural data
from the same subjects when no specific experimental condition is presented. Such a comparison basically consists in calculating the difference between the neural activity recorded during the experimental condition and the neural activity recorded during the rest condition. Those brain areas that show a positive difference in this comparison will be interpreted as more active during the experimental manipulation. In the univariate approach, this difference is calculated on the neural activity recorded at the level of a single voxel, and the calculation is repeated voxel by voxel until the whole brain volume is covered. Therefore, by “location-based” RI, researchers basically refer to an RI in which the brain region Act is obtained by gathering together voxels that show a positive difference in neural activity between the two considered conditions (e.g., experimental manipulation and resting state). On the other hand, the “pattern-based” RI relies on multivariate pattern analysis (MVPA for short) of neural activity. Instead of the voxel-based comparison of brain activity used in the univariate approach, MVPA allows researchers to directly investigate populations of voxels. First, it takes the recording of the overall neural response associated with every stimulus presented during the experiment, that is, the maps of the activity of all the voxels (for instance, in a limited region of the brain) over the series of experimental conditions. Second, it analyzes the maps related to the series of stimuli in order to assess whether specific patterns of voxels vary in processing them. The aim is to spot similarities between the recorded neural responses over the series and, in turn, associate changes in the patterns of voxels with changes in the stimulus features.
Such an analysis usually rests on a classification task in which a machine learning algorithm, such as a support vector machine (SVM), is trained on a limited set of neural data already associated with labeled experimental stimuli and then used to predict to which experimental stimuli new neural data should be related (Haynes & Rees, 2006; Norman et al., 2006; Davis et al., 2014; Hebart & Baker, 2018; Ritchie et al., 2019). Without entering into the details of MVPA, it is worth noting that the introduction of machine learning and data-mining techniques for analyzing fMRI data is having an impressive impact also on the way in which standard, “location-based” reverse inferences are performed. On the one hand, MVPA offers new, promising methods for quantitatively assessing the reliability of the association between Cog and Act (Del Pinal & Nathan, 2017; Nathan & Del Pinal, 2017). On the other hand, “pattern-based” inferences do not seem to be free from methodological problems either, quite similar to the ones affecting “location-based” reverse inference (Weiskopf, 2021). In any case, the debate is not settled, and further work is needed to evaluate how these new developments will change current research practices and inferential strategies in neuroscience.
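The train-then-predict logic of decoding can be illustrated with a toy example. To keep the sketch self-contained, a nearest-centroid classifier is used here in place of the SVM mentioned above; this is a deliberate simplification, not the method used in the cited studies. The voxel patterns, the two stimulus labels, and all function names are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Sequence[float]          # activity of each voxel in a small region
Trial = Tuple[Pattern, str]        # (voxel pattern, stimulus label)


def centroid(patterns: List[Pattern]) -> List[float]:
    """Voxel-wise mean of a list of patterns."""
    return [sum(p[v] for p in patterns) / len(patterns) for v in range(len(patterns[0]))]


def sq_dist(a: Pattern, b: Pattern) -> float:
    """Squared Euclidean distance between two patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train: List[Trial], pattern: Pattern) -> str:
    """Assign the label whose training centroid is closest to the new pattern."""
    by_label: Dict[str, List[Pattern]] = {}
    for p, label in train:
        by_label.setdefault(label, []).append(p)
    return min(by_label, key=lambda lab: sq_dist(centroid(by_label[lab]), pattern))


def leave_one_out_accuracy(data: List[Trial]) -> float:
    """Train on all trials but one, test on the held-out trial, and average."""
    hits = sum(predict(data[:i] + data[i + 1:], p) == label for i, (p, label) in enumerate(data))
    return hits / len(data)


# Invented data: four voxels, three trials per stimulus type. The average
# activity across voxels is similar in both conditions; only the pattern differs.
DATA: List[Trial] = [
    ([0.9, 0.1, 1.1, 0.0], "faces"), ([1.0, 0.2, 0.8, 0.1], "faces"), ([1.1, 0.0, 1.0, 0.1], "faces"),
    ([0.1, 1.0, 0.0, 0.9], "houses"), ([0.0, 0.9, 0.2, 1.1], "houses"), ([0.2, 1.1, 0.1, 0.8], "houses"),
]

print(leave_one_out_accuracy(DATA))
print(predict(DATA, [1.0, 0.1, 0.9, 0.0]))
```

The cross-validated accuracy plays the role of the quantitative reliability measure: a pattern-based reverse inference is only as trustworthy as the decoder's ability to generalize to data it was not trained on.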
Cognitive Ontology
Finally, one fundamental problem affecting not only reverse inference and other inferential practices but virtually any aspect of neuroscientific research is the lack of a clear-cut vocabulary for describing cognitive functions and processes. This is
known as the problem of “cognitive ontology” (Price & Friston, 2005; Bilder et al., 2009; Klein, 2010, 2012; Poldrack, 2010), that is, the difficulty of clearly disentangling cognitive processes, their functional roles, and their connections to experimental settings. As Poldrack (2010) pointed out, the neuroimaging literature is unable to establish an unquestionable and specific structure-to-function mapping because the assumed cognitive ontology might be false (e.g., the assumed function may not correspond to anything implemented in the brain and may be a combination of other functions) and the experimental manipulations might fail to selectively cut mental functions at the joints. The uncertainty about how neuroscientists use concepts for isolating cognitive processes and the fact that many of these terms unavoidably come from our intuitive, folk psychological vocabulary for describing mental states strongly affect the mapping between the cognitive and the neural level. For these reasons, a number of specific tools aiming at developing supervised ontologies of cognitive states and experimental tasks, as well as software for mapping cognitive functions to brain regions, have been developed in recent years. Besides BrainMap (Fox & Lancaster, 2002) and NeuroSynth (Yarkoni et al., 2011), already described above, it is worth mentioning Cognitive Atlas (Poldrack et al., 2011), NeuroQuery (Dockès et al., 2020), and Neuroscience Knowledge Engine (Beam et al., 2021). These latter tools, based on unsupervised machine learning, are changing the traditional way of thinking about cognitive ontology and may have a lasting impact on the debate concerning how neuroscientists conceptualize cognitive domains (Turner & Turner, 2021).
Conclusion
The neuroscientific study of the associations between neural regions and cognitive functions faces important methodological challenges. This chapter has focused on the problem of reverse inference – inferring the engagement of cognitive processes from the activation of specific brain regions – that has attracted much attention among neuroscientists and philosophers during the last 15 years. While the debate is not settled, the contributions surveyed in the chapter have provided us with an improved understanding of reverse inference and increasingly reliable strategies for addressing its empirical and theoretical limitations. These include both formal modeling of reverse inference patterns in terms of abductive and probabilistic reasoning and the implementation of large-scale databases and automated analytic tools in the new field of neuroinformatics. These increasingly sophisticated approaches will require, in the near future, a further, joint effort from neuroscientists, philosophers, and computer scientists to tackle some crucially relevant methodological questions remaining open for future research.
Acknowledgements
The authors would like to thank Tommaso Brischetto Costa, Davide Bottari, Luca Cecchetti, Vincenzo Crupi, Igor Douven, Vincenzo Fano, Jordi Manuello, Jan Sprenger, and Marco Viola for very useful discussions on the topics of the chapter. Davide Coraci acknowledges funding from an ERASMUS+ Mobility Grant 2020/2021 and Gustavo Cevolani from the PRIN
2017 grant (n. 201743F9YE) “From Models to Decisions” of the Italian Ministry of Education, University and Research.
References
Adrian, E. D., & Matthews, B. H. (1934a). The interpretation of potential waves in the cortex. The Journal of Physiology, 81(4), 440–471.
Adrian, E. D., & Matthews, B. H. (1934b). The Berger rhythm: Potential changes from the occipital lobes in man. Brain, 57(4), 355–385.
Alexander, L., & Moore, M. (2021). Deontological ethics. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Winter 2021 edition.
Bandettini, P. A. (2012). Twenty years of functional MRI: The science and the stories. Neuroimage, 62(2), 575–588.
Bandettini, P. A., Wong, E. C., Hinks, R. S., Tikofsky, R. S., & Hyde, J. S. (1992). Time course EPI of human brain function during task activation. Magnetic Resonance in Medicine, 25(2), 390–397.
Beam, E., Potts, C., Poldrack, R. A., & Etkin, A. (2021). A data-driven framework for mapping domains of human neurobiology. Nature Neuroscience, 24(12), 1733–1744.
Bechtel, W., & Huang, L. T.-L. (2022). Philosophy of Neuroscience. Cambridge University Press.
Belliveau, J., Kennedy, D., McKinstry, R., Buchbinder, B., Weisskoff, R., Cohen, M., Vevea, J., Brady, T., & Rosen, B. (1991). Functional mapping of the human visual cortex by magnetic resonance imaging. Science, 254(5032), 716–719.
Bilder, R. M., Sabb, F. W., Parker, D. S., Kalar, D., Chu, W. W., Fox, J., Freimer, N. B., & Poldrack, R. A. (2009). Cognitive ontologies for neuropsychiatric phenomics research. Cognitive Neuropsychiatry, 14(4–5), 419–450.
Bourgeois-Gironde, S. (2010). Is neuroeconomics doomed by the reverse inference fallacy? Mind & Society, 9(2), 229–249.
Bruers, S., & Braeckman, J. (2014). A review and systematization of the trolley problem. Philosophia, 42(2), 251–269.
Calzavarini, F., & Cevolani, G. (2022). Abductive reasoning in cognitive neuroscience: Weak and strong reverse inference. Synthese, 200(2), 1–26.
Calzavarini, F., & Viola, M. (2020). Neural Mechanisms: New Challenges in the Philosophy of Neuroscience (Vol. 17). Springer Nature.
Cauda, F., Nani, A., Liloia, D., Manuello, J., Premi, E., Duca, S., Fox, P. T., & Costa, T. (2020). Finding specificity in structural brain alterations through Bayesian reverse inference. Human Brain Mapping, 41(15), 4155–4172.
Christoff, K., & Owen, A. (2006). Improving reverse neuroimaging inference: Cognitive domain versus cognitive complexity. Trends in Cognitive Sciences, 10(8), 352–353.
Churchland, P. S. (2008). The impact of neuroscience on philosophy. Neuron, 60(3), 409–411.
Costa, T., Manuello, J., Ferraro, M., Liloia, D., Nani, A., Fox, P. T., Lancaster, J., & Cauda, F. (2021). Bacon: A tool for reverse inference in brain activation and alteration. Technical report.
Crupi, V. (2020). Confirmation. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Spring 2020 edition.
Davis, T., LaRocque, K. F., Mumford, J. A., Norman, K. A., Wagner, A. D., & Poldrack, R. A. (2014). What do differences between multi-voxel and univariate analysis mean? How subject-, voxel-, and trial-level variance impact fMRI analysis. Neuroimage, 97, 271–283.
Del Pinal, G., & Nathan, M. J. (2013). There and up again: On the uses and misuses of neuroimaging in psychology. Cognitive Neuropsychology, 30(4), 233–252.
Del Pinal, G., & Nathan, M. J. (2017). Two kinds of reverse inference in cognitive neuroscience. In The Human Sciences After the Decade of the Brain (pp. 121–139). Elsevier.
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
1547
Dockès, J., Poldrack, R. A., Primet, R., Gözükan, H., Yarkoni, T., Suchanek, F., Thirion, B., & Varoquaux, G. (2020). NeuroQuery, comprehensive meta-analysis of human brain mapping. Elife, 9, e53385. Douven, I. (2021). Abduction. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Summer 2021 edition. Edwards, A. W. F. (1976). Likelihood: An Account of the Statistical Concept of “Likelihood” and Its Application to Scientific Inference. Cambridge University Press. Ferris, S. H., de Leon, M. J., Wolf, A. P., Farkas, T., Christman, D. R., Reisberg, B., Fowler, J. S., MacGregor, R., Goldman, A., George, A. E., et al. (1980). Positron emission tomography in the study of aging and senile dementia. Neurobiology of Aging, 1(2), 127–131. Foot, P. (1978). The Problem of Abortion and the Doctrine of Double Effect, in Her Virtues and Vices (pp. 19–32). Berkeley/Los Angeles: University of California Press. Forster, M., & Sober, E. (2010). Why likelihood? In The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations (pp. 153–165). University of Chicago Press. Forster, M. R. (2006). Counterexamples to a likelihood theory of evidence. Minds and Machines, 16(3), 319–338. Fox, P. T., Laird, A. R., Fox, S. P., Fox, P. M., Uecker, A. M., Crank, M., Koenig, S. F., & Lancaster, J. L. (2005). BrainMap taxonomy of experimental design: Description and evaluation. Human Brain Mapping, 25(1), 185–198. Fox, P. T., & Lancaster, J. L. (2002). Mapping context and content: The brainmap model. Nature Reviews Neuroscience, 3(4), 319–321. Gandenberger, G. (2016). Why I Am Not a Likelihoodist. Ann Arbor, MI: Michigan Publishing, University of Michigan Library. Gelman, A. (2017). Is the dorsal anterior cingulate cortex “selective for pain”? Glymour, C., & Hanson, C. (2016). Reverse Inference in Neuropsychology. The British Journal for the Philosophy of Science, 67(4), 1139–1153. Greene, J. D., Sommerville, R. 
B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment. Science, 293(5537), 2105–2108. Harman, G. H. (1965). The inference to the best explanation. The Philosophical Review, 74(1), 88–95. Haynes, J.-D., & Rees, G. (2006). Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7(7), 523–534. Hebart, M. N., & Baker, C. I. (2018). Deconstructing multivariate decoding for the study of brain function. Neuroimage, 180, 4–18. Henson, R. (2006). Forward inference using functional neuroimaging: Dissociations versus associations. Trends in Cognitive Sciences, 10(2), 64–69. Hutzler, F. (2014). Reverse inference is not a fallacy per se: Cognitive processes can be inferred from functional imaging data. NeuroImage, 84, 1061–1069. Jeffreys, H. (1961). Theory of Probability. Clarendon. Klein, C. (2010). Philosophical issues in neuroimaging. Philosophy Compass, 5(2), 186–198. Klein, C. (2011). The dual track theory of moral decision-making: A critique of the neuroimaging evidence. Neuroethics, 4(2), 143–162. Klein, C. (2012). Cognitive ontology and region-versus network-oriented analyses. Philosophy of Science, 79(5), 952–960. Kwong, K. K., Belliveau, J. W., Chesler, D. A., Goldberg, I. E., Weisskoff, R. M., Poncelet, B. P., Kennedy, D. N., Hoppel, B. E., Cohen, M. S., & Turner, R. (1992). Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. Proceedings of the National Academy of Sciences, 89(12), 5675–5679. Lieberman, M. D., & Eisenberger, N. I. (2015). The dorsal anterior cingulate cortex is selective for pain: Results from large-scale reverse inference. Proceedings of the National Academy of Sciences, 112(49), 15250–15255. Lierberman, M. D. (2015). Comparing pain, cognitive, and salience accounts of dacc. Lipton, P. (2003). Inference to the Best Explanation. Routledge.
1548
D. Coraci et al.
Machery, E. (2014). In Defense of Reverse Inference. The British Journal for the Philosophy of Science, 65(2), 251–267. Magnani, L. (2009). Creative abduction and hypothesis withdrawal. In Models of Discovery and Creativity (pp. 95–126). Springer. Magnani, L. (2011). Abduction, Reason and Science: Processes of Discovery and Explanation. Springer Science & Business Media. Nathan, M. J., & Del Pinal, G. (2017). The future of cognitive neuroscience? Reverse inference in focus. Philosophy Compass, 12(7), e12427. Nathan, M. J., & Pinal, G. D. (2016). Mapping the mind: Bridge laws and the psycho-neural interface. Synthese, 193(2), 637–657. Niiniluoto, I. (2018). Truth-Seeking by Abduction. Berlin/Heidelberg: Springer. Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multivoxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10(9), 424–430. Pauli, W. M., O’Reilly, R. C., Yarkoni, T., & Wager, T. D. (2016). Regional specialization within the human striatum for diverse psychological functions. Proceedings of the National Academy of Sciences, 113(7), 1907–1912. Peirce, C. S. (1931). Collected Papers of Charles Sanders Peirce. Harvard University Press. Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59–63. Poldrack, R. A. (2010). Mapping mental function to brain structure: How can cognitive neuroimaging succeed? Perspectives on Psychological Science, 5(6), 753–761. Poldrack, R. A. (2011). Inferring mental states from neuroimaging data: From reverse inference to large-scale decoding. Neuron, 72(5), 692–697. Poldrack, R. A. (2014). Is reverse inference a fallacy? A comment on Hutzler. Poldrack, R. A., Kittur, A., Kalar, D., Miller, E., Seppa, C., Gil, Y., Parker, D. S., Sabb, F. W., & Bilder, R. M. (2011). The Cognitive Atlas: Toward a knowledge foundation for cognitive neuroscience. Frontiers in Neuroinformatics, 5, 17. Poldrack, R. A., & Yarkoni, T. 
(2016). From brain maps to cognitive ontologies: Informatics and the search for mental structure. Annual Review of Psychology, 67, 587–612. Price, C. J., & Friston, K. J. (2005). Functional ontologies for cognition: The systematic definition of structure and function. Cognitive Neuropsychology, 22(3–4), 262–275. Ramsey, J. D., Hanson, S. J., Hanson, C., Halchenko, Y. O., Poldrack, R. A., & Glymour, C. (2010). Six problems for causal inference from fMRI. NeuroImage, 49(2), 1545–1558. Ritchie, J. B., Kaplan, D. M., & Klein, C. (2019). Decoding the brain: Neural representation and the limits of multivariate pattern analysis in cognitive neuroscience. The British Journal for the Philosophy of Science, 70(2), 581–607. Roskies, A. L. (2007). Are neuroimages like photographs of the brain? Philosophy of Science, 74(5), 860–872. Roskies, A. L. (2008). Neuroimaging and inferential distance. Neuroethics, 1(1), 19–30. Royall, R. (1997). Statistical Evidence: A Likelihood Paradigm. Routledge. Rubin, T. N., Koyejo, O., Gorgolewski, K. J., Jones, M. N., Poldrack, R. A., & Yarkoni, T. (2017). Decoding brain activity using a large-scale probabilistic functional-anatomical atlas of human cognition. PLOS Computational Biology, 13(10), e1005649. Salmon, W. C. (2001). Explanation and confirmation: A Bayesian critique of inference to the best explanation. In Explanation (pp. 61–91). Springer. Schurz, G. (2017). Patterns of abductive inference. In Springer Handbook of Model-Based Science (pp. 151–173). Springer. Shackman, A. J. (2015). The importance of respecting variation in cingulate anatomy: Comment on lieberman & eisenberger 2015 and yarkoni. Smith, S. M., Miller, K. L., Salimi-Khorshidi, G., Webster, M., Beckmann, C. F., Nichols, T. E., Ramsey, J. D., & Woolrich, M. W. (2011). Network modelling methods for fMRI. Neuroimage, 54(2), 875–891.
71 Reverse Inference, Abduction, and Probability in Cognitive Neuroscience
1549
Smith, S. M., Vidaurre, D., Beckmann, C. F., Glasser, M. F., Jenkinson, M., Miller, K. L., Nichols, T. E., Robinson, E. C., Salimi-Khorshidi, G., Woolrich, M. W., et al. (2013). Functional connectomics from resting-state fMRI. Trends in Cognitive Sciences, 17(12), 666–682. Sprenger, J., & Hartmann, S. (2019). Bayesian Philosophy of Science. Oxford University Press. Teplan, M. et al. (2002). Fundamentals of EEG measurement. Measurement Science Review, 2(2), 1–11. Thomson, J. J. (1976). Killing, letting die, and the trolley problem. The Monist, 59(2), 204–217. Tomaiuolo, F., Cecchetti, L., Gibson, R. M., Logi, F., Owen, A. M., Malasoma, F., Cozza, S., Pietrini, P., & Ricciardi, E. (2016). Progression from vegetative to minimally conscious state is associated with changes in brain neural response to passive tasks: A longitudinal single-case functional MRI study. Journal of the International Neuropsychological Society, 22(6), 620–630. Tudor, M., Tudor, L., & Tudor, K. I. (2005). Hans Berger (1873–1941) – The history of electroencephalography. Acta medica Croatica: casopis Hravatske akademije medicinskih znanosti, 59(4), 307–313. Turner, J. A., & Turner, M. D. (2021). Re-conceptualizing domains in neuroscience, hopes and utopias aside. Nature Neuroscience, 24(12), 1643–1644. Uttal, W. R. (2001). The New Phrenology: The Limits of Localizing Cognitive Processes in the Brain. The MIT Press. Wager, T. D., Atlas, L. Y., Botvinick, M. M., Chang, L. J., Coghill, R. C., Davis, K. D., Iannetti, G. D., Poldrack, R. A., Shackman, A. J., & Yarkoni, T. (2016). Pain in the ACC? Proceedings of the National Academy of Sciences, 113(18), E2474–E2475. Weiskopf, D. A. (2021). Data mining the brain to decode the mind. In Neural Mechanisms (pp. 85– 110). Springer. Yarkoni, T. (2015a). No, the dorsal anterior cingulate is not selective for pain: Comment on Lieberman and Eisenberger. Yarkoni, T. (2015b). Still not selective: comment on comment on comment on Lieberman & Eisenberger. 
Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature Methods, 8(8), 665–670.
Abduction: Theory and Evidence
72
Igor Douven
Contents
Introduction
What Is Abduction?
Abductive Reasoning: Empirical Support
Is Abduction a Recipe for Disaster?
  The Dynamic Dutch Book Argument
  Expected Error Minimization
  The End of Abduction?
The Case for Abduction
Conclusion
References
Abstract
This chapter looks at new theoretical work on abduction, with a special focus on arguments concerning the normative status of abduction, as well as at empirical results relevant to the question of whether theories of abduction are descriptively adequate.
Keywords
Abduction · Bayes’ rule · Belief change · Computer simulations · Explanatory reasoning · Inference · Probability
I. Douven
IHPST/Panthéon–Sorbonne University/CNRS, Paris, France
© Springer Nature Switzerland AG 2023
L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_61
Introduction

Broadly understood, abduction is the idea that explanatory considerations have confirmation-theoretic significance. What this means, to a first approximation, is that whenever we wonder how much confidence to invest in a hypothesis or theory, given the available evidence, we should also consider the question of how well the hypothesis or theory explains that evidence. Suppose, for instance, that we conduct an experiment and the results allow us to eliminate a number of theories in the relevant domain but still leave more than one contender in the running. Then if one of the remaining theories is a clearly better explanation of our experimental results than the others are, that is reason to put more confidence in the former than in any of the other theories; according to some authors, it is even reason to infer that the former is correct.

Abduction is different from the more widely studied inference form of deduction, if only because an abductive inference is revisable: we may receive additional evidence, and then some other theory may best explain our evidence, in which case we may become more confident, or infer categorically, that this other theory is correct. Abduction is also different from induction, another form of inference that is revisable. The key difference is that induction, as commonly understood, exploits only frequency information, whereas abduction relies crucially on judgments of explanation quality (which, note, is not to exclude that these judgments may rely, at least partly, on frequency information).

Until at least 1980, philosophers of science and epistemologists took abduction more or less for granted. This changed with the advent of Bayesianism, which came to dominate thinking about rationality in the 1980s and 1990s, even to the extent that any rule apart from Bayes’ rule (see below for details) came to appear suspect. The most general point of critique Bayesians raised against abduction was that whereas their position builds on a precise mathematical machinery, abduction is no more than a slogan. And even if abduction can be made formally precise, it could still only be subservient to Bayes’ rule, for instance, by recruiting explanatory intuitions to help determine prior probabilities, or by functioning as a heuristic shortcut to approximate probabilities whose exact calculation would take more time and effort than should be spent given the use case at issue. Any formally precise version of abduction that is more ambitious is doomed, according to Bayesians, if not because it makes the user “Dutch-bookable” (i.e., open to engaging in sets of bets that she is guaranteed to lose), then because it makes the user’s degrees of belief (i.e., subjective probabilities) more inaccurate than they would be were the user a Bayesian.

In view of these criticisms, should we still care about abduction? We should, for at least two reasons. First, there is evidence that people do reason abductively and that they do so in ways that lead them to violate Bayes’ rule; that makes the study of abduction worthwhile at least from a psychologist’s standpoint. Second, there is recent work casting doubt on the Bayesian arguments according to which abduction violates norms of good reasoning. Among other concerns raised about
these arguments, friends of abduction have countered that even if abduction has the flaws Bayesians attribute to it, there is reason to suspect that abduction can offer benefits in return which may more than make up for those flaws.

Section “What Is Abduction?” looks at various ways in which authors have proposed to make the broad idea of abduction precise. Section “Abductive Reasoning: Empirical Support” presents empirical evidence bearing on abduction. Section “Is Abduction a Recipe for Disaster?” critically discusses the main arguments that have been leveled against abduction. And section “The Case for Abduction,” finally, canvasses a recent defense of the idea that, in the right kind of circumstances, abduction is a rational mode of reasoning.
What Is Abduction?

Above, abduction was characterized broadly. In the literature, one finds various more precise statements of this mode of inference. According to what is probably the most common characterization, abduction licenses the acceptance of a hypothesis on the basis that it best explains the available evidence (e.g., Psillos, 2004, 83). As various authors have pointed out, however, this characterization is unsatisfactory for more than one reason. A first reason is that, thus conceived, abduction authorizes an absolute judgment (accepting a hypothesis as true) on the basis of a relative one, to wit, that the hypothesis better explains the evidence than the other available candidate explanations, which will typically not include all potential explanations of the evidence (van Fraassen, 1989, Ch. 6). A second reason why the previous characterization has been deemed inadequate is that in cases in which the best explanation of our evidence is still a poor one, or is satisfactory but hardly more so than the second-best explanation, an inference to that best explanation would pre-theoretically appear unwarranted.

These concerns have inspired authors to come up with more refined proposals. For instance, Kuipers (1992) has addressed the first concern by proposing a reformulation of abduction according to which it licenses the inference to the conclusion that the best explanation is closer to the truth than the other available candidate explanations. And in response to the second concern, Lipton (1993) strengthens the standard definition of abduction by adding to it the requirement that the best explanation be both sufficiently good and sufficiently much better than its closest rival.

A more general concern that has been raised about abduction is that it lacks precision, whether in its standard formulation or in the versions of Kuipers and Lipton, and that thereby it contrasts unfavorably with Bayes’ rule, which is its main contender. Admittedly, Bayes’ rule comes as a precise mathematical formula, in comparison with which abduction can easily appear as a vague suggestion. However, already van Fraassen (1989, Ch. 6) proposed a probabilistic version of abduction, and recently a number of variants of that version have appeared in the literature.
According to Bayes’ rule, a rational person updates her personal (or subjective) probabilities upon the receipt of new information $E$ by setting, for all propositions $H$ expressible in her language,

\[ \Pr\nolimits_E(H) = \Pr(H \mid E) = \frac{\Pr(H)\,\Pr(E \mid H)}{\Pr(E)}, \]

where $\Pr(\cdot)$ is the person’s probability function right before she receives $E$ and $\Pr_E(\cdot)$ her probability function immediately after that event. $\Pr$ is also referred to as the person’s prior probability function and $\Pr_E$ as her posterior probability function.

Van Fraassen’s rule, which will here be called “EXPL,” is like Bayes’ rule except that it attributes a bonus for explanatory superiority. Where $\{H_i\}_{i \leq n}$ is a set of self-consistent, mutually exclusive, and jointly exhaustive hypotheses, a person’s new probability for $H_i$ immediately after receiving $E$ is in accordance with EXPL if and only if

\[ \Pr'(H_i) = \frac{\Pr(H_i)\,\Pr(E \mid H_i) + f(H_i, E)}{\sum_{j=1}^{n} \bigl( \Pr(H_j)\,\Pr(E \mid H_j) + f(H_j, E) \bigr)}, \tag{EXPL} \]

where $\Pr$ and $\Pr'$ are the prior and posterior probability function, respectively, and $f$ is a function assigning a bonus $c$ ($c \geq 0$) to the hypothesis that best explains $E$ and nothing to the other hypotheses.

Whereas EXPL gives all credit to the best explanation, one could plausibly consider rules that credit a number of hypotheses in proportion to how well they explain the data. So for instance, the best explanation might get most of the credit, but the second-best explanation might also get some credit, and might even get almost as much credit if it is almost as good, qua explanation, as the best explanation. One could also consider giving some credit to the third-best, the fourth-best, and so on, explanation, where then the credit attributed gets less and less, again most plausibly reflecting the explanation quality of each individual hypothesis. Indeed, if a hypothesis would make for a particularly poor explanation of the data, one could even assign it a malus point.

Taking this idea as a starting point, Douven (2017, 2019, 2020a, 2022; also Douven & Wenmackers, 2017) formulates probabilistic versions of abduction that can credit individual hypotheses separately, in accordance with their explanation quality. Specifically, the rules he proposes are instances of the following schema:

\[ \Pr'(H_i) = \frac{\Pr(H_i)\,\Pr(E \mid H_i) + c\,\Pr(H_i)\,\Pr(E \mid H_i)\,M(H_i, E)}{\sum_{j=1}^{n} \bigl( \Pr(H_j)\,\Pr(E \mid H_j) + c\,\Pr(H_j)\,\Pr(E \mid H_j)\,M(H_j, E) \bigr)}, \tag{S} \]

where $\Pr$ and $\Pr'$ are as before, $M$ is a measure of explanation quality, and with again $c \geq 0$.

Note that, as stated here, the above schema as well as EXPL have Bayes’ rule as a limiting case, viz., if $c$ is set to 0. One could thus say that advocates of either schema
are committed to Bayesian updating in cases in which no explanatory considerations are at play. Alternatively, one could require $c$ to be strictly greater than 0, thereby leaving entirely open how to update one’s probabilities when explanation plays no role.

In principle, $M$ can be any measure of explanation quality. Douven (2022) considers two in particular, one building on Popper’s (1959) work and the other on Good’s (1960) work. According to Popper’s measure, hypothesis $H$’s power to explain evidence $E$ is given by

\[ \frac{\Pr(E \mid H) - \Pr(E)}{\Pr(E \mid H) + \Pr(E)}, \]

while according to Good’s measure it is given by

\[ \ln\!\left( \frac{\Pr(E \mid H)}{\Pr(E)} \right). \]

Douven uses these measures (to illustrate certain normative points about abduction, to be discussed in section “Is Abduction a Recipe for Disaster?”) because they performed well in empirical research (Douven & Schupbach, 2015a,b), not necessarily because he thinks they are “objectively best.”

The easiest way to think of these rules is that they first update a hypothesis’ probability according to Bayes’ rule, calculate that hypothesis’ explanatory goodness (or badness, as the case may be) according to one of the above measures, add (or subtract) a percentage of the hypothesis’ probability in proportion to its explanatory goodness (or badness), and then, as a final step, renormalize.

The details matter less than the general observation that there are precise versions of abduction, for example, instances of EXPL or the schema of Douven (2017, 2019, 2020a, 2022), and possibly many others. But although these schemata help to address the concern of lacking precision, they do raise a concern of their own, at least for anyone wishing to maintain the rationality of abductive reasoning. The new concern is that there appear to be many versions of abduction, without an indication of which of those is the right one, the one to be followed in our reasoning.

Douven (2017, 2022) proposes not to see this as a concern but rather to embrace the thought that abduction is a general idea (that explanation has a role to play in confirmation) that not only can be articulated in a diversity of ways but that has to be articulated differently in different contexts of use. Exactly how to reason abductively depends on what the reasoner’s goals are, on the environment in which she is pursuing those goals, as well as on her capacities. Indeed, if Foley (1993) and others are right that we sometimes reason qualitatively, in terms of what to (categorically) believe, and sometimes quantitatively, in terms of what probabilities to assign, there may be times when we rely on something like Kuipers’ or Lipton’s versions of abduction, referenced earlier in this section, and also times when instead we rely on EXPL or a kindred probabilistic rule.
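To make the probabilistic rules discussed in this section concrete, the following is a minimal Python sketch of Bayes’ rule, EXPL, and the schema with Popper’s and Good’s measures. The function names, the example numbers, and the values of c are illustrative assumptions and are not taken from the works cited. The sketch also assumes that the “best explanation” in EXPL can be identified with the hypothesis that confers the highest likelihood on the evidence, which is only one possible reading, and that all likelihoods are strictly positive.

```python
import math

def bayes_update(priors, likelihoods):
    """Bayes' rule: the posterior of H_i is Pr(H_i) Pr(E | H_i) / Pr(E)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    pr_e = sum(joint)  # Pr(E), by the law of total probability
    return [j / pr_e for j in joint]

def expl_update(priors, likelihoods, c=0.1):
    """EXPL: Bayes' rule plus a bonus c for the best explanation only.

    Assumption: the best explanation is the hypothesis with the highest
    likelihood Pr(E | H_i); ties are broken in favor of the first one.
    """
    scores = [p * l for p, l in zip(priors, likelihoods)]
    best = max(range(len(scores)), key=lambda i: likelihoods[i])
    scores[best] += c
    total = sum(scores)
    return [s / total for s in scores]

def popper_measure(lik, pr_e):
    """Popper's measure: (Pr(E|H) - Pr(E)) / (Pr(E|H) + Pr(E))."""
    return (lik - pr_e) / (lik + pr_e)

def good_measure(lik, pr_e):
    """Good's measure: ln(Pr(E|H) / Pr(E)); requires lik > 0."""
    return math.log(lik / pr_e)

def schema_update(priors, likelihoods, measure, c=0.5):
    """Schema S: each hypothesis is credited (or debited) by c * M(H_i, E).

    Assumption: c is small enough that 1 + c * M(H_i, E) stays non-negative
    for every hypothesis, so that the results are genuine probabilities.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    pr_e = sum(joint)
    scores = [j * (1 + c * measure(l, pr_e))
              for j, l in zip(joint, likelihoods)]
    total = sum(scores)
    return [s / total for s in scores]

if __name__ == "__main__":
    priors = [0.5, 0.5]        # hypothetical prior probabilities
    likelihoods = [0.8, 0.4]   # hypothetical values of Pr(E | H_i)
    print("Bayes: ", bayes_update(priors, likelihoods))
    print("EXPL:  ", expl_update(priors, likelihoods, c=0.1))
    print("Popper:", schema_update(priors, likelihoods, popper_measure))
    print("Good:  ", schema_update(priors, likelihoods, good_measure))
```

Setting c to 0 in either `expl_update` or `schema_update` reproduces the output of `bayes_update`, which mirrors the observation that both rules have Bayes’ rule as a limiting case. For positive c, the hypothesis that explains the evidence better ends up with a higher probability than Bayes’ rule would give it.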
The proposal to understand abduction as a broad idea, requiring further fleshing out depending on context and user, takes its cue from work by Gigerenzer (2000, 2001), Elqayam (2011, 2012), Schurz & Hertwig (2019), and others, arguing for an ecological conception of rationality. This work suggests that we must be willing to abandon the classical idea that rational reasoning is a matter of following a small number of universally valid principles and to acknowledge that the ability to pick the right learning tools for each particular situation is an important aspect of what we generally think of as human intelligence. In light of this work, the thought that rationality may require us to use one precisification of abduction in some contexts, another in other contexts, and perhaps Bayes’ rule in yet other contexts, makes a lot of sense.

Nevertheless, philosophers love generality, and so they may not be easily persuaded to let go of the one-size-fits-all solution that Bayesianism appears to offer. And then there are still the arguments that were mentioned in the introduction, which aim to show that any deviation from Bayesian reasoning leads to irrationality. Before turning to those, I discuss some evidence seemingly showing that, in quite ordinary learning situations, people tend to reason abductively, by taking explanatory factors into consideration in ways that lead them to violate Bayes’ rule. At a minimum, that puts some pressure on those wanting to stick to Bayesianism, given that it would require the attribution of massive error in people’s learning practices. (Bayesians may be quick to point out that it has long been known that people violate Bayesian principles; see the next section. However, many Bayesians still want to maintain that, by and large, their view is descriptively adequate (e.g., Oaksford & Chater, 2007), which becomes harder to maintain with every newly discovered violation.)
Abductive Reasoning: Empirical Support

Bayes’ rule and the probabilistic versions of abduction form the core of competing accounts of rational updating. It is not necessary for such accounts to be descriptively accurate to a tee. But they should be at least broadly predictive of how humans update their probabilities. If not, why think that these accounts have any bearing on human rationality, rather than being some highly idealized form of robot epistemology? So, how do these accounts hold up against the experimental results?

To start with Bayes’ rule, it is to be stressed that much of what is commonly advertised as evidence for Bayesianism is unrelated to the question of how people update their probabilities upon the receipt of new information and concerns probabilistic reasoning more broadly, for instance, whether people’s static assignments of probabilities are coherent, that is, whether people’s (subjective) probabilities are truly probabilities in that they conform to the probability calculus, at least by and large (Oaksford & Chater, 2007). Whereas there are quite a number of known results supporting the thought that people do obey Bayesian prescriptions, at least approximately, there are also reports of stark violations of these prescriptions, most famously in the work of Kahneman and various of his collaborators (e.g., Kahneman et al., 1982; Tversky & Kahneman, 1983). However, Bayesians have tried to explain away such violations as being due to people’s reliance on error-prone heuristics, or to their confusing the concept of probability with that of confirmation (see Tentori et al., 2013, for discussion).

Support specifically for the descriptive adequacy of Bayesian updating is hard to find. Griffiths & Tenenbaum (2006) present participants with random samplings from a closed interval (e.g., a random person’s age, or the length of a random couple’s marriage) and then ask for the upper bound of that interval (e.g., the person’s life span, or the total duration of the marriage). They show that their participants’ responses are close to what one would expect them to be on the assumption that they updated on the random outcome of the sampling via Bayes’ rule. Note, though, that in this setup, explanatory considerations nowhere enter the picture, meaning that the assumption that the participants updated via an instance of EXPL or of the other schema discussed in the previous section would lead to the same predictions.

Besides, there is older work on updating that also explicitly compared people’s updates with what those updates should be according to Bayesianism, and this work reported strong evidence against Bayes’ rule (Edwards, 1968; Marks & Clarkson, 1972; Fischhoff & Lichtenstein, 1978; Schum & Martin, 1982). Particularly worth mentioning in this regard is the research reported in Phillips & Edwards (1966), which involved a so-called bookbags-and-poker-chips experiment. In this type of experiment, participants are informed about the contents of two containers (e.g., bookbags or urns), where these containers hold two types of objects (e.g., black and red poker chips, or blue and green balls) in different ratios. For instance, they might be told that the bag composition is 70/30 versus 50/50.
The experimenters then randomly draw a number of objects from one of the bags, without disclosing to participants which bag it is. Finally, the participants are shown the sample and asked for their probability that the sample comes from the 70/30 bag rather than from the 50/50 bag. Using this setup, and comparing their participants’ probability estimates with the probabilities for the two bags given the sample that were mandated by Bayes’ rule, Phillips and Edwards found significant discrepancies between the former and the latter.

Here too, no attempt was made to contrast Bayesian updating with updating via some form of abduction. More recently, however, Douven & Schupbach (2015a) relied on basically the same experimental paradigm with the explicit aim of investigating whether deviations from Bayes’ rule in participants’ probability updates, if any were found, could be due to the participants’ taking into account explanatory considerations. These authors were specifically interested in three questions: first, how Bayesianism and explanationism (the normative view that people ought to reason abductively when explanatory factors are at play) compare in terms of descriptive adequacy; second, whether, if judgments of explanatory goodness are found to have an essential role in updating, probabilities still play an important role as well; and third, what kind of explanatory judgments figure in updating, if any do.
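As an illustration of the benchmark against which participants’ estimates are compared in a bookbags-and-poker-chips experiment, the following Python sketch computes the probability that Bayes’ rule mandates for the 70/30 bag given a sample. The function name and its parameters are hypothetical. The sketch assumes that the 70/30 bag contains 70% red chips, that both bags are equally likely to be chosen unless a different prior is supplied, and that chips are drawn with replacement; none of these details is specified in the description above.

```python
from fractions import Fraction

def posterior_70_30(n_red, n_black, prior=Fraction(1, 2)):
    """Probability of the 70/30 bag given n_red red and n_black black chips.

    Assumptions (illustrative): the 70/30 bag holds 70% red chips, the
    other bag holds 50% red chips, and draws are made with replacement,
    so the likelihood of a sample depends only on the counts. The binomial
    coefficient is the same under both hypotheses and cancels out.
    """
    lik_70_30 = Fraction(7, 10) ** n_red * Fraction(3, 10) ** n_black
    lik_50_50 = Fraction(1, 2) ** (n_red + n_black)
    numerator = prior * lik_70_30
    return numerator / (numerator + (1 - prior) * lik_50_50)

if __name__ == "__main__":
    # A hypothetical sample of four red chips and one black chip.
    p = posterior_70_30(4, 1)
    print(p, float(p))
```

Exact fractions are used so that the output is not affected by floating-point rounding. A participant’s stated probability can then be compared directly with the value returned for the sample she was shown.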
To answer these questions, Douven and Schupbach slightly extended Phillips and Edwards’ bookbags-and-poker-chips paradigm, the extension consisting of the additional gathering of judgments of explanatory goodness, alongside that of probability judgments. Specifically, the procedure was as follows: Participants were interviewed individually and were, at the start, presented with two urns, labeled “urn A” and “urn B.” They were shown that each urn contained forty balls, the composition being thirty black balls and ten white ones for urn A, and fifteen black balls and twenty-five white ones for urn B. Participants could consult this information at any time during the interview. Then the experimenter flipped a coin and, depending on the outcome, chose one or the other urn, outside of the participants’ view. Next, from the chosen urn ten balls were drawn, one after the other, and without replacement. The balls were lined up before the participant as they were drawn. After each draw, the participant was asked the following questions: (i) How well, in your opinion, does the hypothesis that urn A was selected explain the results from the drawings so far? (ii) How well, in your opinion, does the hypothesis that urn B was selected explain those results? (iii) How probable is it, in your opinion, that urn A was selected, given the results so far? The two questions about explanatory goodness had to be answered by making a mark on a continuous scale with “extremely poor explanation” and “extremely good explanation” as anchors. In their analysis, Douven and Schupbach fitted a number of linear regression models (in fact, so-called linear mixed-effects models; see Douven, 2022, for some background), each of which had the collected responses to question (iii) as dependent variable and at least the objective conditional probabilities that could be calculated for each participant and each drawing as predictor variable. The models differed in their further predictors.
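The objective conditional probabilities that serve as a predictor in all of these models follow from the urn compositions by Bayes’ rule. The following is a minimal sketch of that calculation, not the authors’ own code: the function names are hypothetical, the prior of 1/2 per urn is assumed from the coin flip, and the draws are encoded as a string of `b` (black) and `w` (white).

```python
from fractions import Fraction

# Urn compositions as described in the text: (black, white) balls.
URN_A = (30, 10)
URN_B = (15, 25)


def sequence_likelihood(urn, draws):
    """Probability of an ordered sequence of draws without replacement."""
    black, white = urn
    prob = Fraction(1)
    for colour in draws:
        total = black + white
        if colour == "b":
            prob *= Fraction(black, total)
            black -= 1
        else:
            prob *= Fraction(white, total)
            white -= 1
    return prob


def posterior_urn_a(draws, prior_a=Fraction(1, 2)):
    """Bayes' rule: probability that urn A was selected, given the draws so far."""
    like_a = sequence_likelihood(URN_A, draws)
    like_b = sequence_likelihood(URN_B, draws)
    return prior_a * like_a / (prior_a * like_a + (1 - prior_a) * like_b)


# Objective probability of urn A after one, two, and three black balls in a row.
for n in range(1, 4):
    print(n, float(posterior_urn_a("b" * n)))
```

On these assumptions, a single black ball raises the probability of urn A from 1/2 to 2/3, and a single white ball lowers it to 2/7. These are the values against which the participants’ answers to question (iii) are compared.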
In one model there were no further predictors beyond objective conditional probabilities. A second one included as further predictors both the collected responses to question (i) and the collected responses to question (ii). A third one, finally, had as predictors, besides objective conditional probabilities, the computed differences between the participants’ responses to question (i) and question (ii). Adding predictors to a model tends to yield a model with better fit. Therefore, Douven and Schupbach compared the aforementioned models using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which weigh model fit against model complexity. On both criteria, the third model, with objective probabilities and difference in judged explanatory goodness as predictors, did best, followed by the second model, with the first model (the one with only objective probabilities as predictor) coming in at a very distant third place. This result casts doubt on the claim that people update via Bayes’ rule rather than via some version of abduction. If they updated via Bayes’ rule, the smallest
model should have come out on top. (At least that is so assuming the so-called Principal Principle, according to which, roughly, subjective probabilities should equal objective probabilities if the latter are known; but this principle is almost universally endorsed in the Bayesian community.) What should make Douven & Schupbach’s (2015a) result particularly unsettling for Bayesians is that not only were the participants’ responses out of sync with Bayesian prescriptions (which could perhaps be explained away in terms of noise), but these deviations could be successfully accounted for in terms of the participants’ giving weight to explanatory considerations. This is evidence that people factor in judgments of explanation quality when they update, at least in some contexts, and that they do so in a way that is essentially non-Bayesian. However, it does not follow from Douven and Schupbach’s analysis that explanatory considerations have any systematic impact on people’s updates. In particular, it does not follow that people are following something like a probabilistic rule of abduction. This observation motivated follow-up research specifically directed at the question left open by Douven & Schupbach (2015a).
In this research, Douven & Schupbach (2015b) used the objective probabilities from their earlier study in conjunction with Popper’s and Good’s measures of explanatory power to compute, separately for each participant and each draw, the explanatory power of the hypotheses at play in the experiment reported in Douven & Schupbach (2015a), that is, that urn A had been selected and that urn B had been selected instead. They then used the results of those computations, together with the objective conditional probabilities, as predictor variables to again regress the updates from the participants in that experiment. (In fact, Douven & Schupbach (2015b) looked at some other measures of explanatory power as well, but these did significantly worse than Popper’s and Good’s measures.) In other words, where the analysis of Douven & Schupbach (2015a) had used subjective judgments of explanatory goodness as a predictor, in their new analysis, these authors fitted models that had computed explanatory goodness values as a predictor. More exactly, one model had the values computed according to Popper’s measure as a predictor and the other had the values computed according to Good’s measure as a predictor, while both shared objective conditional probabilities as a predictor. It was found that both models did considerably better in terms of AIC and BIC values than the model with only objective conditional probabilities as a predictor. That is compelling evidence that, at least in some contexts, explanatory considerations do play a systematic role in people’s probabilistic updates: the way they help shape those updates can be captured by formal measures of explanatory power. It was previously mentioned that we may not always reason quantitatively, in terms of probabilities, but may also sometimes reason in terms of categorical beliefs, and that therefore qualitative versions of abduction may have psychological reality as well.
Just as there is little empirical work on quantitative versions of abduction, there is little work on qualitative versions. To our knowledge, the only research directly concerned with a qualitative version of abduction—specifically, Lipton’s, as stated in the previous section—is to be found in Douven & Mirabile (2018). Recall
that Lipton’s version stressed the importance of the best explanation being not only good enough but also appreciably superior to the second-best explanation. To investigate the descriptive adequacy of this version, Douven and Mirabile focused on the following questions: (1) Does the quality of an explanation predict people’s willingness to accept that explanation, and is there a quality threshold such that an explanation must be above that threshold for people to infer to it? (2) Will it make a difference to people’s perception of the quality of an explanation if they are introduced to a rival explanation? Will that make a difference to their preparedness to accept the former? (3) If people are given two rival explanations, does it matter to their preparedness to infer to the best of those how much the explanations differ in terms of quality? Douven & Mirabile (2018) describe three experiments designed to answer these questions. The three experiments used materials deriving from six basic scenarios, each presenting a fact alongside one (in the first experiment) or two (in the second and third experiments) possible explanations of that fact. The explanations could vary in quality; where two explanations appeared, the explanations could vary in quality independently of each other. The participants were asked to answer three questions: (i) whether they were willing to infer to one of the explanations (or to the explanation, where only one was given); (ii) how likely, in their judgment, the explanations were; and (iii) how good the explanations were, qua explanations. The participants in the third experiment also always had to indicate how confident they were in their answer to question (i).
In their analysis, Douven and Mirabile found that how highly a person rates the quality of an explanation accurately predicts how willing she is to infer to that explanation, and also that the perceived quality needs to be above a certain threshold (which differed somewhat among participants) before the person will make the inference. Importantly, whereas the probability a participant assigned to an explanation was also a good predictor of whether the participant was willing to infer to that explanation, perceived explanation quality was a significantly better predictor. Furthermore, Douven and Mirabile found that people’s willingness to infer to an explanation is reliably affected by whether that explanation is presented on its own or is accompanied by a rival explanation, even though their judgment of the quality of an explanation is not affected by that. There was a further reliable effect of the quality of the rival explanation on people’s preparedness to infer to the other explanation. Specifically, Douven and Mirabile found that when their participants were presented with an explanation alongside a rival explanation that was more or less as good, the participants were reliably less inclined to infer to the former explanation, whereas the effect of introducing a rival explanation tended to be smaller when that rival was a clearly poorer explanation (note that this finding is in line with Douven & Schupbach’s (2015a) finding that their model
with differences in judged explanatory goodness as a predictor, next to objective probabilities, did best). In summary, Douven and Mirabile found positive answers to question (1), the second part of question (2), and question (3), but a negative answer to the first part of question (2). It was already known that, in some form or other, explanation is involved in various cognitive processes, such as categorization (e.g., Williams & Lombrozo, 2010, 2013; Edwards et al., 2019; Vasilyeva & Lombrozo, 2020), generalization (e.g., Lombrozo & Gwynne, 2014), and understanding (Keil, 2006; Legare & Lombrozo, 2014; Walker & Lombrozo, 2017). The studies discussed in this section are among the first to look specifically at the role of explanation in belief updating. The outcomes of these studies should, at least for psychologists, be reason to take abduction in the context of belief change more seriously than they have so far done. For example, it would be interesting to have more information about the degree to which abductive reasoning depends on context, and also to know more about the actual cognitive mechanisms underlying or involved in that type of reasoning. But whatever the outcomes of such (hopefully) future research, is there any reason for philosophers to care about it? Section “The Case for Abduction” makes a case for a positive answer to this question. But section “Is Abduction a Recipe for Disaster?” first discusses the two main arguments commonly taken to suggest that philosophers can safely ignore abduction.
Is Abduction a Recipe for Disaster?

People are susceptible to all sorts of biased thinking. They have a strong tendency to give more weight to information that favors their views than to information that challenges those views (the so-called confirmation bias); they tend to discount older information in favor of more recent information (the recency bias); they easily neglect prior probabilities in their quantitative reasoning (the base rate fallacy); they often overestimate their own abilities (the Dunning–Kruger effect); and on and on. Hence, the finding that people systematically attend to explanatory factors when changing their beliefs, or their probabilities, is of little significance from a normative standpoint. After all, it could just be one more bias, one more unfortunate but hard-to-unlearn cognitive habit. If, at the end of the day, we had to acknowledge as much, that would at most be a slight additional blow to our self-esteem. Philosophers have given two main arguments for the claim that abduction, if it has psychological reality, is a bias indeed, and a quite detrimental one at that. Before looking at these arguments in some detail, there are two remarks to be made. First, whereas abduction is nowadays almost universally derided, in the 1970s and 1980s it was almost equally universally considered a paradigmatically sound form of reasoning. McMullin (1992) referred to it as “the inference that makes science,” and Boyd (1984, 1985) argued that because (on his analysis) abduction is central to scientific reasoning and the methods of philosophy should be continuous with those
of science, abduction should be central to philosophy as well. That the appreciation of abduction changed so dramatically has everything to do with the meteoric rise of Bayesian philosophy of science and Bayesian epistemology, for reasons to be seen shortly. Second, it is to be noted that the arguments to be considered in the following are strictly concerned with probabilistic versions of abduction. As said, there may well be situations in which we want to rely on Lipton’s or Kuipers’ or a similar qualitative version of abduction. The present author is not aware of any arguments against these. Naturally, hard-nosed Bayesians will regard the fact that these versions are phrased in terms of categorical rather than graded belief as disqualifying in itself. But the view that the two notions of belief are both to be taken seriously (rather than dismissing the categorical notion as somehow having no place in scientific philosophy) is increasingly popular, and much recent work in epistemology has looked at how (in Foley’s, 1993, terms) the epistemology of beliefs and the epistemology of degrees of belief are connected (see, e.g., the papers in Douven, ed. 2021). Nevertheless, from here on, the focus will be on the probabilistic versions of abduction, on which most of the recent discussion about abduction has centered.
The Dynamic Dutch Book Argument

According to the widely endorsed betting concept of probability, the degree to which you believe that your favorite football team will win its next match is the price in cents at which you are willing to take either side in a bet that pays $1 if the team wins that match and nothing if it does not. So suppose that you have no preference between selling that bet (you have to pay $1 if the proposition turns out to be true) and buying it (you receive the dollar if the proposition turns out to be true) for the price of 30¢. Then your probability for your favorite football team winning its next match equals 0.3. Bayesians have relied on this concept to argue that any failure of our subjective probabilities to accord with the axioms of probability (that is, any failure of our subjective probabilities to be probabilities properly speaking) means that we are in an irrational belief state. That is because, they argue, any such failure exposes us to a so-called Dutch book, which is the standard name in the literature for a bet or set of bets that guarantees a negative net pay-off. For instance, according to one of the axioms of probability theory, logical truths, like “A or not A” (with A any proposition), should be assigned a probability of 1. According to another axiom, the probability of a disjunction of mutually incompatible propositions should equal the sum of the probabilities assigned to the separate disjuncts. Now suppose you believe A to a degree of 0.4 and its negation to a degree of 0.7. Obviously, A and its negation are mutually incompatible, so you should believe their disjunction to a degree of 0.4 + 0.7 = 1.1. On the other hand, that disjunction is a logical truth, and so you should believe it to a degree of 1. You are clearly violating the axioms of probability theory. Here is how that can be exploited: I offer you for the price of 40¢ a bet on A that pays $1 if A is true and nothing if A is false.
Given the degree to which you believe A,
you are willing to buy that bet. At the same time, I offer you for the price of 70¢ a bet that pays $1 if the negation of A is true (so if A is false) and nothing otherwise. Again given your degrees of belief, you are willing to buy that bet. Exactly one of A and its negation will turn out to be true, so you can be sure to receive exactly $1 from me. Note, however, that you have already paid me $1.10, meaning that, whatever the future brings, you will have a net loss of 10¢. Betting is risky: you can always lose money. What is different here, however, and what (Bayesians have argued) makes this an exhibit of your irrationality, is that you could have seen the loss coming. Not only that; you could have figured out how to avoid it, to wit, by making your probabilities for A and its negation align with the probability axioms. The axioms of probability theory have nothing to say about how to change your probabilities in response to new evidence. Bayesians proposed Bayes’ rule as an answer to that question, but it was already seen that there are alternatives to that rule, even ones which are very similar to it except that they take explanatory factors into account, such as the instances of EXPL and S stated in section “What Is Abduction?”. Bayesians have complemented the above Dutch book argument, which is static in that it only looks at probabilities held at the same time, with a dynamic Dutch book argument, which looks at the development of a person’s probabilities over time. In the typical presentation, someone who changes her degrees of belief by a non-Bayesian rule for belief change is offered a number of bets at different points in time. It is then argued that the bets will all appear fair to the person at the time they are offered to her but are, if she engages in all of them, guaranteed to make her lose money in the end. To make this concrete, here is an example.
Let it be given that a certain coin either is fair or has a perfect bias for heads (every toss lands heads) or has a perfect bias for tails (every toss lands tails). Suppose that, initially, each of these possibilities is equally likely. We are allowed to toss the coin, but first a bookie offers us two bets, one that pays $48 if the first two tosses do not both land heads, and one that pays $600 if the first two tosses do both land heads and, in addition, the third toss lands tails. In light of our prior probabilities, $28 and $25, respectively, appear reasonable prices for these bets. (For instance, our prior that the first two tosses land heads equals 5/12 and so our prior that they do not both land heads equals 7/12, and 7/12 × 48 equals 28; similarly for the other bet.) Suppose the bookie agrees to sell the bets at these prices. Then so far we have spent $53. Now let us start flipping the coin and update our probabilities as we watch the outcomes of the first two tosses. Suppose at least one of them comes up tails. Then we have won the first bet but lost the second one, which means that we receive $48 but still have a net loss of $5 (we paid $53 for the bets, after all). This would be unfortunate but nothing out of the ordinary: it is in the nature of betting that the bettor should be prepared to accept losses. But now suppose that the first two tosses do both come up heads. We have now lost the first bet but may still win the second one, which would allow us to pocket $600, thereby making a net profit of $547. However, before we toss the coin a third time, the bookie approaches us again and now, instead of proposing to sell any bets, proposes to buy one, viz., a bet that pays $600 if the third toss lands tails. For what price should we be willing to sell it?
Here it matters which rule we use to update our probabilities for the three hypotheses of interest (that the coin has a perfect bias for heads, that it is fair, and that it has a perfect bias for tails) on the outcomes of the first two tosses. Suppose we use EXPL, with an explanation bonus of 0.1. Then, as can be easily verified, our probability for the third toss landing tails will be approximately 0.08. Thus, we are willing to sell the designated bet for $48 (eight hundredths of what the bet pays if the third toss lands tails). But notice now that, whatever the outcome of the third toss, we will have lost money. If the third toss does land tails, we will receive $600 but have to pay the same amount; in the other case, we will not have to pay anything but will also not receive anything. However, whereas we have spent $53 on the bets we bought, we have only made $48 on the bet we sold. Thus, we are bound to lose $5 no matter what. One equally easily verifies that updating via Bayes’ rule would not have led to this result. For had we used that rule, our probability for the third toss landing tails after watching the first two landing heads would have been 0.1, so that we would only have been willing to sell the bet to the bookie had she offered to pay (at least) $60. And if she had bought the bet for that price, we would have made a profit of $7. This is easily generalized to any update rule deviating from Bayes’ rule, so in particular, to any instance of EXPL or S, or indeed to any other probabilistic version of abduction. What, according to Bayesians, makes this so damning to non-Bayesian update rules is that, again, the user can figure out herself, before deciding to update via a non-Bayesian rule, that the threat of engaging in bets that are bound to lose her money will always be lurking. From this, they conclude that non-Bayesian updating betokens irrationality.
And so in particular, using some probabilistic version of abduction betokens irrationality.
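The arithmetic behind the two Dutch books in this section can be checked with a short script. This is a sketch under stated assumptions, not a definitive implementation. In particular, it assumes that EXPL adds the explanation bonus to the unnormalized posterior of the best-explaining hypothesis (here taken to be the heads-biased coin) and then renormalizes, a form that reproduces the figures in the text. All names are hypothetical.

```python
from fractions import Fraction

# Static Dutch book: the agent buys a bet on A for 40 cents and a bet on
# not-A for 70 cents; exactly one of the two bets pays out 100 cents.
static_net_cents = 100 - (40 + 70)

# Dynamic Dutch book: three equally likely hypotheses about the coin.
HYPOTHESES = ("heads-biased", "fair", "tails-biased")
PRIOR = {h: Fraction(1, 3) for h in HYPOTHESES}
P_HEADS = {"heads-biased": Fraction(1), "fair": Fraction(1, 2),
           "tails-biased": Fraction(0)}


def joint_with_two_heads(h):
    """Prior probability of hypothesis h together with two heads in a row."""
    return PRIOR[h] * P_HEADS[h] ** 2


def update(bonus=Fraction(0), best="heads-biased"):
    """Probabilities after two heads; bonus=0 is Bayes' rule,
    bonus>0 is the assumed form of EXPL (bonus added, then renormalized)."""
    scores = {h: joint_with_two_heads(h) + (bonus if h == best else 0)
              for h in HYPOTHESES}
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}


def prob_third_tails(posterior):
    """Probability that the third toss lands tails under the given credences."""
    return sum(posterior[h] * (1 - P_HEADS[h]) for h in HYPOTHESES)


# Fair prices of the two bets bought before any toss.
p_two_heads = sum(joint_with_two_heads(h) for h in HYPOTHESES)
price_bet_1 = (1 - p_two_heads) * 48      # pays $48 unless both tosses land heads
price_bet_2 = sum(joint_with_two_heads(h) * (1 - P_HEADS[h])
                  for h in HYPOTHESES) * 600   # pays $600 on heads, heads, tails
outlay = price_bet_1 + price_bet_2

# Price at which the $600 bet on "third toss tails" is sold after two heads.
bayes_price = prob_third_tails(update()) * 600
expl_price = prob_third_tails(update(bonus=Fraction(1, 10))) * 600

print("static net (cents):", static_net_cents)
print("outlay:", outlay)
print("net after two heads, Bayes:", float(bayes_price - outlay))
print("net after two heads, EXPL:", float(expl_price - outlay))
```

Under these assumptions the Bayesian updater sells for $60 and ends $7 ahead, while the EXPL updater sells for less than the $53 outlay and is left with a sure loss. The text rounds the EXPL probability to 0.08, which gives the $48 price and the $5 loss; the unrounded figures differ slightly but the sign of the result is the same.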
Expected Error Minimization

The dynamic Dutch book defense of Bayes’ rule, and the Dutch book approach to defending Bayesianism generally, has lost much of its erstwhile popularity. Most Bayesians have come to regard this approach as addressing the wrong sort of rationality. Being liable to Dutch books is a practical problem and may therefore indicate that we fall short of meeting standards of practical rationality, that is, the rationality concerned with our actions. But the debate about how to update our probabilities concerns a question of epistemic rationality, that is, a question of what we can rationally believe and to what degree. Joyce (1998) was the first to point out this problem, and in the same paper he proposed an alternative to the Dutch book approach, one in terms of error minimization. Joyce was in effect only concerned with the “static” part of Bayesianism (the claim that rationality requires our subjective probabilities to be probabilities properly speaking) and not with updating. What he argued was, in essence, that any person whose epistemic state is not in accordance with the static norms of Bayesianism (i.e.,
whose subjective probabilities are not probabilities properly so called) falls short of realizing our epistemic goal, which Joyce understands in terms of inaccuracy minimization. That is to say, if a person’s subjective probabilities are not formally probabilities, the person could improve the accuracy of her epistemic state just by bringing her subjective probabilities in line with the formal requirements of probability theory. There is a variety of ways to measure the accuracy of subjective probabilities, but by far the most popular one is the so-called Brier scoring rule. To explain this rule, let $\{H_i\}_{i=1}^{n}$ be a set of self-consistent, mutually exclusive, and collectively exhaustive hypotheses, and let $\delta_{ij}$ (for $i, j \in \{1, \ldots, n\}$) equal 1 if $i = j$ and equal 0 otherwise. Then, where $H_j$ is the true hypothesis, a person who assigns subjective probabilities $p = (p_1, \ldots, p_n)$ to the members of $\{H_i\}_{i=1}^{n}$, with $p_i$ her probability for $H_i$, incurs a Brier score of $\frac{1}{n}\sum_{i=1}^{n}(\delta_{ij} - p_i)^2$. By way of illustration, suppose $H_1$, $H_2$, and $H_3$ are self-consistent, mutually exclusive, and jointly exhaustive, and your subjective probabilities for these hypotheses are 0.1, 0.5, and 0.5, respectively. Suppose that of these hypotheses $H_2$ is the truth. Then your Brier score equals $\left((0.1)^2 + (1 - 0.5)^2 + (0.5)^2\right)/3 = 0.17$. Because your subjective probabilities do not sum to 1, they are not probabilities in the formal sense. Suppose you bring them into alignment with the probability axioms by lowering your probability for $H_3$ from 0.5 to 0.4. Then your Brier score comes to equal $\left((0.1)^2 + (1 - 0.5)^2 + (0.4)^2\right)/3 = 0.14$.
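The two scores in this illustration can be reproduced with a few lines of code. The sketch below implements the Brier score as defined above; the function name and the zero-based `true_index` argument are illustrative choices, not part of the original text.

```python
def brier_score(probs, true_index):
    """Mean squared distance between the credences and the truth indicator."""
    n = len(probs)
    return sum(((1.0 if i == true_index else 0.0) - p) ** 2
               for i, p in enumerate(probs)) / n


incoherent = [0.1, 0.5, 0.5]   # credences for H1, H2, H3; they sum to 1.1
coherent = [0.1, 0.5, 0.4]     # H3 lowered so that the credences sum to 1

# H2 is the true hypothesis, so true_index is 1 (indices start at 0).
print(round(brier_score(incoherent, true_index=1), 2))
print(round(brier_score(coherent, true_index=1), 2))
```

The two printed values correspond to the scores of 0.17 and 0.14 given in the text, so the repaired assignment is indeed the more accurate of the two.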
Naturally, this could be a coincidence, and it is certainly not true that any way of making your subjective probabilities accord with the probability axioms would lower your Brier score; for instance, if you lower your probability for $H_2$ from 0.5 to 0.4, that will bring your subjective probabilities in line with the probability axioms, but your Brier score would go up by more than 0.03. However, Joyce’s point is that whenever your subjective probabilities fail to obey the probability axioms, there is a way to lower your Brier score, and thus take a step toward realizing your epistemic goal, just by bringing your subjective probabilities in line with those axioms. That, and not protection against Dutch bookies, is why conformity with the probability axioms is a rationality requirement for subjective probability, or so Joyce argues. While Joyce did not address the issue of the rationality of Bayes’ rule, others argued that Joyce’s general approach could also be used to show the rationality of Bayesian updating. Most notably, Leitgeb & Pettigrew (2010) sought to show that just as (according to Joyce) having subjective probabilities that are not really probabilities is sub-optimal from the perspective of realizing our epistemic goal, so updating in ways that stray from Bayesian prescriptions is sub-optimal from that perspective. More exactly, their claim is that updating via Bayes’ rule is both necessary and sufficient for minimizing the expected inaccuracy of our post-update subjective probabilities, where the expectation is relative to our pre-update subjective probabilities, and where inaccuracy is again measured by the Brier score. To illustrate the idea, consider again hypotheses $H_1$, $H_2$, and $H_3$, to which we assign probabilities 0.1, 0.5, and 0.4, respectively. One piece of evidence relevant to these hypotheses that we might obtain is $E$. Exactly how it is relevant is specified by the following probability distribution:
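$$
\begin{array}{ll}
\Pr(w_{H_1 E}) = 0.010 & \Pr(w_{H_1 \overline{E}}) = 0.09 \\
\Pr(w_{H_2 E}) = 0.250 & \Pr(w_{H_2 \overline{E}}) = 0.25 \\
\Pr(w_{H_3 E}) = 0.100 & \Pr(w_{H_3 \overline{E}}) = 0.3
\end{array}
$$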
Here, $w_{XY}$ is the possible world in which both $X$ and $Y$ are true, and $\overline{E}$ designates the negation of $E$. Note that we can get the probabilities of the various hypotheses from this simply by summing the probabilities of the worlds in which they hold true. For example, we derive from the above that our probability for $H_1$ is equal to $\Pr(w_{H_1 E}) + \Pr(w_{H_1 \overline{E}}) = 0.01 + 0.09$, which indeed equals 0.1. If we do obtain evidence $E$, then, according to Bayesians, we should update on that new information via Bayes’ rule. As one easily verifies, this would lead us to assign the following subjective probabilities to the relevant possible worlds:

$$
\begin{array}{ll}
\Pr_E(w_{H_1 E}) \approx 0.028 & \Pr_E(w_{H_1 \overline{E}}) = 0.0 \\
\Pr_E(w_{H_2 E}) \approx 0.694 & \Pr_E(w_{H_2 \overline{E}}) = 0.0 \\
\Pr_E(w_{H_3 E}) \approx 0.278 & \Pr_E(w_{H_3 \overline{E}}) = 0.0
\end{array}
$$
Right now, before receiving the evidence, what is our expectation for the Brier score we would incur if we updated on $E$? To calculate this, we consider what our score would be were $H_1$ to hold, calculate what it would be were $H_2$ to hold, calculate what it would be were $H_3$ to hold, and take the weighted average of the three scores, the weights being our probabilities for the worlds that would still be possible after the update. This yields an expected Brier score of approximately 0.158. Remarkably, if we minimize

$$
0.01\left[(1 - x)^2 + y^2 + z^2\right] + 0.25\left[x^2 + (1 - y)^2 + z^2\right] + 0.1\left[x^2 + y^2 + (1 - z)^2\right]
$$

subject to the constraint that $x + y + z = 1$, we find a minimum of (approximately) 0.158, and equally remarkably, we find this minimum precisely at $(0.028, 0.694, 0.278)$, which are our post-update probabilities for the remaining possible worlds. If instead of Bayes’ rule we use EXPL, again with a bonus of 0.1, to update on $E$, supposing we do receive that evidence, then that would lead us to assign different probabilities to those possible worlds. Suppose we find $H_1$ worthy of the explanation bonus. Then our probability assignment would become

$$
\begin{array}{ll}
\Pr_E(w_{H_1 E}) \approx 0.239 & \Pr_E(w_{H_1 \overline{E}}) = 0.0 \\
\Pr_E(w_{H_2 E}) \approx 0.543 & \Pr_E(w_{H_2 \overline{E}}) = 0.0 \\
\Pr_E(w_{H_3 E}) \approx 0.217 & \Pr_E(w_{H_3 \overline{E}}) = 0.0
\end{array}
$$
And we already know that, with those probabilities, we are not minimizing our expected inaccuracy. Indeed, in this case, our expected Brier penalty would be approximately 0.184, so greater than the penalty of 0.158 we would incur were
we to use Bayes’ rule. Leitgeb & Pettigrew (2010) show that none of this is a coincidence: any update rule that minimizes expected inaccuracy is equivalent to Bayes’ rule. The conclusion seems to be exactly parallel to the one Joyce drew from his argument: Bayes’ rule is rational because using it is most conducive to our epistemic goal of inaccuracy minimization and not because it serves some practical goal (such as offering protection against dynamic Dutch bookies).
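The numbers in this example can be recomputed as follows. This is a hedged sketch rather than Leitgeb and Pettigrew’s own formulation: the penalty function is the expression minimized in the text (pre-update probabilities of the $E$-worlds as weights, squared errors summed without the $1/n$ factor), and EXPL is again assumed to add the bonus to the favored hypothesis before renormalizing. Function and variable names are hypothetical.

```python
# Pre-update probabilities of the three worlds in which E is true:
# w_{H1 E}, w_{H2 E}, w_{H3 E}.
JOINT_WITH_E = [0.01, 0.25, 0.10]


def update_on_e(bonus=0.0, best=0):
    """Post-update probabilities of H1, H2, H3 after learning E.
    bonus=0 is Bayes' rule; bonus>0 is the assumed form of EXPL."""
    scores = [p + (bonus if i == best else 0.0)
              for i, p in enumerate(JOINT_WITH_E)]
    total = sum(scores)
    return [s / total for s in scores]


def expected_penalty(post):
    """Expected inaccuracy of `post`, as in the expression minimized in the text."""
    total = 0.0
    for j, weight in enumerate(JOINT_WITH_E):
        squared_error = sum(((1.0 if i == j else 0.0) - q) ** 2
                            for i, q in enumerate(post))
        total += weight * squared_error
    return total


bayes = update_on_e()
expl = update_on_e(bonus=0.1, best=0)   # H1 receives the explanation bonus

print([round(q, 3) for q in bayes], round(expected_penalty(bayes), 3))
print([round(q, 3) for q in expl], round(expected_penalty(expl), 3))

# Shifting a little probability between hypotheses (keeping the sum at 1)
# should only increase the penalty relative to the Bayesian assignment.
nudged = [bayes[0] + 0.01, bayes[1] - 0.01, bayes[2]]
print(expected_penalty(nudged) > expected_penalty(bayes))
```

The check with `nudged` is only a spot test of a single perturbation, not a proof of optimality; the general result is the one attributed to Leitgeb & Pettigrew (2010) above.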
The End of Abduction?

Both of the arguments discussed in the foregoing have done much to cement the popularity of Bayes’ rule, the inaccuracy minimization argument currently being considered the more compelling of the two, for the reasons explained. At first blush, one could indeed wonder how these arguments, and certainly the second one, leave any room for doubt about the irrationality, or at least sub-optimality (and how could using a sub-optimal rule not be irrational if an optimal rule is available?), of any form of non-Bayesian updating, including probabilistic versions of abduction. On closer inspection, however, the arguments leave much to be desired. Both have specific shortcomings, and they share a general one. I start with the shortcomings specific to each argument. As for the dynamic Dutch book argument, we already encountered the critique that it seems unrelated to what is or should be at issue, to wit, epistemic rationality. Moreover, some authors have questioned the betting concept of probability, or indeed the existence of any direct connection between probability and willingness to engage in bets, on which that argument, as any Dutch book argument, ultimately relies (e.g., Williamson, 1998). Finally, it has been argued that we should not think of update rules in isolation, but rather as parts of packages of further epistemic as well as decision-theoretic principles, and that there are such packages that include EXPL or a kindred version of abduction and that shield one from being exploited by dynamic Dutch bookies (Douven, 1999, 2022). Specifically, there are packages that will lead their users to deny that all bets offered by the bookie are fair, even though use of the package may also lead to violations of Bayes’ rule. As for the more specific problems facing the inaccuracy minimization argument, first note that it is not quite an extension of Joyce’s argument to the dynamic case.
According to Joyce, concordance with the probability axioms guarantees inaccuracy minimization, not expected inaccuracy minimization, which is what Leitgeb and Pettigrew claim obedience to Bayes’ rule guarantees. In fact, the difference is a bit more subtle still, given that what Leitgeb and Pettigrew actually argue for is that obedience to Bayes’ rule guarantees expected next-step inaccuracy minimization (so concerning the inaccuracy of our subjective probabilities immediately after the update), not expected inaccuracy minimization tout court. What this means is that they leave open the possibility that your expectation of how inaccurate your subjective probabilities will be at some point in the future is greater supposing you are committed to Bayes’ rule than if you commit to some non-Bayesian update
I. Douven
rule, like EXPL for instance. Not only that: they leave open the possibility that you will ultimately end up having more accurate subjective probabilities if you update via some non-Bayesian rule than if you update via Bayes’ rule. Their argument could still be compelling if they had given a reason to believe that we should only, or at least first and foremost, care about expected next-step inaccuracy minimization. But they have not, and pre-theoretically the claim appears rather implausible. (Independently, the Brier score is not as obviously compelling as the proponents of the inaccuracy minimization arguments take it to be; see Douven, 2020b, 2023. And there are possible alternatives relative to which versions of abduction, rather than Bayes’ rule, minimize expected inaccuracy; see Douven, 2022, Sect. 5.2.)
But there is a more general point to be made, which pertains to both arguments, to wit, that nothing yet follows about the rationality of non-Bayesian belief change from the mere fact that there are monetary (as per the dynamic Dutch book argument) or epistemic (as per the inaccuracy minimization argument) costs attached to it. Few things in life are free. There are costs attached to having dinner in a restaurant, but that does not prevent you from eating out: if you pick the right restaurant, you will find the meal you get there worth the money and will be happy to pay the bill. For some reason, Bayesians have never even bothered asking whether non-Bayesian updating could have any benefits compared to Bayesian updating. The next section addresses that question.
The Case for Abduction
To see how abduction can be preferable to Bayes’ rule, all things considered, let us start by asking what one may want from an update rule. We gather evidence in the hope of arriving at the truth concerning some matter of interest. Sometimes, the evidence informs us immediately about the truth of that matter. Is Susan in her office? We may be in the position to simply have a look and see Susan in her office, which settles the matter to everyone’s (but the skeptic’s) satisfaction. But often the matter is not so easily decided. Why did the dinosaurs go extinct? Piecing together various bits of evidence, we may become inclined to think that it was due to environmental changes brought about by some catastrophic event, like the impact of an asteroid on earth. In cases like this, instead of simply observing the truth of the matter, we try to infer the truth from the evidence, the inference typically being uncertain to a degree.
Bayesians and advocates of rules like EXPL or other probabilistic versions of abduction agree that the inferential mechanism at play is to be thought of as a rule that outputs new subjective probabilities on the basis of evidential input. A number of desiderata for rules of this sort naturally flow from the idea which also underlies the inaccuracy minimization arguments just discussed, viz., that truth is the ultimate epistemic goal: all our epistemic efforts are geared toward becoming certain of things that are true that they are true and of things that are false that they are false. The most general desideratum is, of course, that we want update rules to be conducive to realizing this goal. More specific ones are suggested by attending to
72 Abduction: Theory and Evidence
the most relevant dimensions along which update rules can vary with respect to their truth conduciveness. To begin with, we want such rules to be reliable in that they typically lead us to become more confident in truths and less confident in falsehoods, and the more so the more evidence we obtain. All else being equal, we prefer more reliable rules over less reliable rules. In practice, especially in scientific practice, it will often be difficult to arrive exactly at the truth and we may have to settle for getting close to the truth, or close enough for all practical purposes. All else being equal, we prefer an update rule that leads us to spread our confidence close to the truth over one that leads us to spread our confidence further away from the truth. A last important desideratum stems from the fact that, again in practice, we are frequently under some time pressure to arrive at the truth. That an update rule eventually will make us confident in the truth is not so helpful in situations in which, for instance, becoming confident in the truth, or just becoming more confident in the truth than in any of its false rivals, or becoming sufficiently confident in a hypothesis close enough to the truth, is a matter of life and death. So, all else being equal, we prefer an update rule that increases our confidence in the truth rapidly over one that does so more slowly. Ideally, an update rule makes us reliably and rapidly gain high confidence in the truth and nothing but the truth. More realistically, we have to be prepared to make trade-offs. A rule that rapidly concentrates our confidence in some small area of the space of possibilities may do so at the expense of accuracy; it may quickly get us into the vicinity of the truth but then be quite slow in taking us exactly to the truth. 
Other rules may be quicker in bringing us exactly to the truth, though it may take longer for them to gear up, and therefore they may actually be slower in bringing us into the vicinity of the truth. Or, one rule may often quickly take us quite close to the truth though also often make us invest high confidence in falsehoods, whereas another rule moves us toward the truth more slowly but also more reliably. We cannot say in general which trade-off or trade-offs we should be prepared to make and which we should not. In some circumstances, it may be of the utmost importance to be able to quickly concentrate our confidence in a smallish region of the space of possibilities (for instance, it may be important for a medical doctor to be 95 percent certain that a patient’s systolic blood pressure is between 110 and 130 mmHg) but relatively unimportant to concentrate our confidence any further (e.g., becoming highly confident that the systolic pressure is between 117 and 122 mmHg may have no further consequences for how the doctor will treat the patient). In other circumstances, it may be more important to estimate some given parameter with great accuracy but there may be no pressure to do so quickly. We can imagine how different update rules serve our purposes best in the different situations.
It was mentioned previously that philosophers are strongly inclined to aim at generality, even universality. Among other things, they have aimed to state rules of rational thinking and behavior that apply in each and every situation. In line with this tradition, Bayesians have tried to argue that Bayes’ rule is the rational update rule in all contexts, under any circumstances, regardless of who is to use it. As was also mentioned, however, many researchers, especially in psychology, have
recently warmed to the idea that rationality is a context- and even agent-dependent matter, an idea often going under the name of “ecological rationality.” While the proponents of this conception of rationality disagree on details, they share the view that rationality is a matter of picking the right tool in relation to whatever one’s goals and abilities happen to be in the context of use. And not only may different people have different goals or possess different abilities in the same context, one and the same person may have different goals or different abilities in different contexts of use. To illustrate, consider the possibility that, in some domains, there is a strong correlation between explanatoriness and truth in the sense that hypotheses concerning matters in those domains that strike us as being explanatorily powerful have a tendency to be true; that could be a contingent fact about us in relation to the world we inhabit, or it could be due to the workings of some evolutionary mechanisms. In other domains, there might be no such correlation, or a much weaker one. In domains of the former type, we might be better off using a version of abduction rather than a rule (like Bayes’ rule) that does not take explanatory factors into account. In domains of the latter type, it might be counterproductive to rely on any version of abduction. This observation is the starting point for the defense of abduction to be found in Douven & Mirabile (2018) and Douven (2020a, 2022). Rather than seeking to show that abduction is the rational update rule, Douven demonstrates that there are realistic circumstances under which probabilistic versions of abduction outperform Bayes’ rule in offering a better trade-off between speed and accuracy, that is, between how rapidly our confidence gets concentrated in a small region of the space of possibilities and how close that region is to where the truth is located in the space. 
Accordingly, in those circumstances, it would make more sense to use any of those versions of abduction than to use Bayes’ rule. The demonstration comes in the form of various computer simulations, pitting Bayes’ rule and a number of different versions of abduction (instances of the schemata labeled “EXPL” and “S” in section “What Is Abduction?”) against each other in contexts in which they are used to update sequentially on pieces of evidence received over time and related to some practical problem at issue. Here, one set of simulations will be described in detail and will also be generalized somewhat.
The simulations to be considered concern a setting in which medical doctors, working at an intensive care unit (ICU), are tasked with diagnosing the patients who are brought into the unit and with determining, based on test results, how to treat them. Time is of the essence, given that the probability that the patient will die increases as time passes, though that probability decreases if the doctor makes the right intervention. By contrast, the probability that the patient will die increases if the doctor decides upon the wrong intervention. How the probability of death increases with time, provided no intervention is made, can be modeled in various plausible ways. Douven (2020a, 2022) considers two options, one of which models this probability by the cumulative distribution function (CDF) of some Weibull distribution, and the other of which models that probability by the CDF of some Gamma distribution. Here, only the former will be described.
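As a minimal sketch of the Weibull option, the following Python snippet computes the probability that a patient has died by time t when no intervention is made. It uses the standard two-parameter Weibull CDF, 1 − exp(−(t/scale)^shape); the function name `weibull_cdf` and the example parameter values are illustrative assumptions and are not taken from the simulation code.

```python
import math


def weibull_cdf(t: float, shape: float, scale: float) -> float:
    """Standard two-parameter Weibull CDF: 1 - exp(-(t / scale) ** shape).

    Read here as the probability that the patient has died by time t
    if no intervention is made. Returns 0 for t <= 0.
    """
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


# Illustrative parameter values only (shape 1, a few different scales).
for scale in (50, 100, 250):
    p = weibull_cdf(100, shape=1.0, scale=scale)
    print(f"scale={scale}: P(death by t=100) = {p:.3f}")
```

With the shape held fixed, a larger scale parameter stretches the curve along the time axis, so the probability of death by any given time is lower.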
Also, most of the formal details of Weibull distributions are skipped; it is only noted that they are characterized by a shape parameter and a scale parameter. Figure 1 shows five examples of a Weibull distribution, all having a shape parameter of 1 but having different scale parameters. In the simulations to be considered, the probability of death for a given patient brought into the ICU is assumed to be modeled by some Weibull distribution, where the shape parameter is, for each patient individually, chosen randomly and uniformly from the [0.5, 5] interval and the scale parameter is, also per patient, chosen randomly and uniformly from the [50, 250] interval.
In Douven’s simulations, patients are further characterized by two parameters indicating how the right and, respectively, wrong intervention will impact the probability that the patient will die. Again skipping the formal details (for those, see Douven, 2020a, 2022), the idea is that making the right intervention lowers the probability of death by a certain percentage while making the wrong intervention increases that probability, the magnitude of the impact depending both on the patient and on the time of intervention. Figure 2 illustrates these effects for a specific parameter setting and a specific Weibull distribution.
Finally, what is wrong with a patient is, rather abstractly, taken to be a matter of the value a parameter α assumes for her, the idea being that, as the patient enters the ICU, her medical status is known except for the value of this parameter. It is given, however, that this parameter can take a value in {0, .1, .2, . . . , 1} only, that the doctor knows this, and that she initially deems each of these values equally likely. The doctor receives one new test result per unit of time, on the basis of which she is to estimate the value of α, the results being either “positive” or “negative,” and the tests being probabilistically independent of each other, with the same (unknown) probability of
Fig. 1 Examples of Weibull CDFs that give the probability of death of a patient as a function of time after admission into an intensive care unit (plot title: CDFs of Weibull distributions; curves shown for Weibull(1, 50), Weibull(1, 100), Weibull(1, 150), Weibull(1, 200), and Weibull(1, 250); axes: probability of death against time)
Fig. 2 Examples of the effect of right and wrong interventions for a Weibull distribution, where the orange graph is the probability of death of the patient over time if no intervention is performed; the green graph gives, for every point in time, the probability of death of the patient if at that point in time a wrong intervention is performed; and the blue graph does the same for the correct intervention (plot title: Effect of intervention: Weibull(1, 50); axes: probability of death against time)
being positive. The hypothesis that α = x states that the probability for any given test turning up positive is x. Doctors are fully characterized by the update rule they use to accommodate the test results. Some doctors are Bayesian updaters, others use an instance of EXPL, still other doctors use a version of “Popper’s rule,” which is an instance of S with M being Popper’s measure of explanation quality, and yet other doctors use a version of “Good’s rule,” with M being Good’s measure of explanation quality; in the case of the versions of abduction, different doctors can assume different explanation bonuses. The simulations assume that a doctor must be sufficiently certain about a hypothesis before she intervenes, where “sufficiently certain” was understood as having a subjective probability greater than 0.9 in the hypothesis. They further assume that a doctor will perform the correct intervention only if she becomes sufficiently certain about the true hypothesis; else, she will make an incorrect intervention, where it is stipulated that all incorrect interventions will have an equally big negative impact on the patient’s survival chances. The question the simulations then seek to answer is this: Given (as we may assume) that each doctor has the goal of saving her patients’ lives, which update rule should she use to accommodate the test results she receives? Rather than just summarize the simulations reported in Douven’s work, we would like to rerun them, adding a slight twist to them. The twist concerns the fact that, in Douven’s simulations, there is a fixed decision threshold of 0.9. The choice of this value was not entirely arbitrary: in the literature on the connection between categorical belief (or acceptance) and subjective probability, many authors have proposed 0.9
as the threshold for belief. Needless to say, however, this is at best an idealization. It is more realistic to assume that different people have different (possibly context dependent) thresholds for belief. Indeed, in Douven’s simulations, should the real question not have been which combination of update rule and decision threshold serves best the doctors’ shared goal of saving as many lives as possible? (This question was raised independently by Paul Thorn and Zina Ward.) As Douven (2020a, 2022) explains, this question can be thought of as a constrained optimization problem, the constraint coming from the fact that our choice of update rules is limited to the ones mentioned previously. There appears to be no closed form of the objective function (i.e., the function to be optimized), due to which analytical methods are not going to be of much help in solving the problem. For that reason, Douven recruits a form of evolutionary computation, which is a well-known optimization technique. As the name suggests, this technique seeks to exploit the basic principles at work in the process of natural selection, where instead of organisms struggling for survival the units of selection are different solutions to a given problem, which can differ in their “fitness,” the criterion of fitness being determined by the problem at hand. The algorithm starts by selecting from a pool of randomly generated solutions the “fittest” solutions to be retained and then typically lets the selected solutions “reproduce” in some specific way. The retained solutions together with their “children” form the pool for the next round of computations, in which the competition for survival and reproduction starts again. This is repeated either for a predetermined number of times or until a fixed point is reached at which all solutions are the same or at least are equally good (Barbati et al., 2012). 
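The generic select-and-reproduce loop just described can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the code used for the simulations: the names `evolve` and `toy_fitness` are hypothetical, "reproduction" is plain copying (chosen for simplicity), the pool size is assumed to be even, and the fitness function is an arbitrary stand-in.

```python
import random


def evolve(pool, fitness, generations):
    """Truncation selection with cloning.

    Each round keeps the fitter half of the pool and refills it with one
    copy ("child") of every survivor, so the pool size stays constant.
    Stops early once all solutions in the pool are identical.
    """
    for _ in range(generations):
        ranked = sorted(pool, key=fitness, reverse=True)
        survivors = ranked[: len(ranked) // 2]
        pool = survivors + list(survivors)
        if all(solution == pool[0] for solution in pool):
            break  # fixed point: every solution is the same
    return pool


def toy_fitness(x):
    # Arbitrary stand-in criterion: solutions closer to 0.7 are fitter.
    return -(x - 0.7) ** 2


rng = random.Random(0)  # fixed seed so the run is reproducible
initial_pool = [rng.random() for _ in range(8)]
final_pool = evolve(initial_pool, toy_fitness, generations=10)
print(final_pool[0])  # the fittest solution of the initial pool
```

Because this sketch copies survivors without varying them, it can only ever return the best solution already present in the initial pool; the selection pressure itself is what the example is meant to show.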
As in the simulations documented in Douven (2020a, 2022), our procedure starts with a pool of 200 “medical doctors” (the first generation of solutions), with fifty doctors using Bayes’ rule, fifty using an instance of EXPL, fifty using an instance of Good’s rule, and fifty using an instance of Popper’s rule. For all but the first of these groups, the value of the explanation bonus c is, for each doctor individually, chosen randomly and uniformly from the [0, 0.25] interval. In addition to what was done in Douven’s simulations, where each doctor had the same fixed threshold for belief of 0.9, here a threshold value is picked for each doctor separately, where this value is chosen randomly and uniformly from the [0.5, 1] interval. (It would make no sense to allow for values below 0.5, as such a value would mean that the doctor can believe things she deems less likely than their negation.)
Each doctor treats one hundred patients, whose relevant characteristics (probability of survival, how that probability is affected by right and wrong interventions, and value of α) are chosen randomly and separately per patient, in the way specified previously. The doctor can spend 100 units of time on the treatment of each patient, where at each moment, until the doctor decides to intervene (if at all), the doctor receives the outcome of a single test, which is positive with a probability determined by the value of α that was randomly picked for the patient. At start time, the doctor has the same subjective probability in all of the eleven hypotheses about the value of α. These probabilities are updated sequentially, as the test results come in, one per time step, and using the update rule associated with the doctor. As soon as the probability for one hypothesis exceeds the threshold associated with the given doctor, she
intervenes. If that probability is assigned to the true hypothesis, the doctor receives a score determined by the probability of death associated with the right intervention at the time the probability crossed the threshold; if the doctor assigns a probability above the threshold to a false hypothesis, her score is determined by the probability of death associated with the wrong intervention at the time the probability crosses the threshold; and if no hypothesis is assigned a probability above the threshold during the 100 time steps, the doctor receives the score of 1 minus the probability of death at the 100th time step. After a doctor has treated 100 patients, her overall score is simply the mean of the scores received for each patient, which can be interpreted as the average patient survival rate for that doctor. Then the 100 “fittest” doctors (the doctors with the highest average patient survival rate) are selected to go on to the next generation, which they form together with a copy of themselves (so that this generation again consists of 200 doctors). This is repeated for 250 generations, after which the simulation terminates.
Fifty of these simulations were run. As an illustration, Fig. 3 shows for one of those simulations how the pool of doctors evolved in the optimization process, with the generations represented on the x-axis and the count of doctors belonging to a certain group, characterized by the type of update rule they use, represented on the y-axis. It is seen that, in this simulation, Bayesians held up quite well for a while, but in the end the doctors updating via some instance of Popper’s rule wiped out the competition entirely. It is more informative to look at all simulations and consider the average number of doctors of the types at issue to be found in the 250 generations. These averages are shown in Fig. 4. It is already somewhat clear from this figure that the simulation shown in Fig. 3 is rather representative: Popperians were, overall, the clear winners,
Fig. 3 Counts of doctor types (Bayes, EXPL, Good, Popper) per generation for a randomly chosen simulation
Fig. 4 Percentages of doctor types (Bayes, EXPL, Good, Popper) per generation, averaged over the fifty simulations. Shaded areas indicate 95 percent confidence bands
with Bayesians being a distant second. The first result is very much in line with the findings reported in Douven (2020a, 2022), but Bayesians do in fact markedly better than in the previous simulations, where they ended up doing worse than the EXPL users, which is not the case here. Hence, there is an effect of “unfixing” the threshold for intervention, albeit not one which changes our view that, in the present context, it is more advisable to use Popper’s rule than any of the other rules, including Bayes’ rule.
Of course, threshold values were also subjected to evolutionary pressures in the new simulations. How did these impact them? The answer is highly surprising. As seen in Fig. 5, the mean threshold value converged to a value close to 0.9. To be precise, the average threshold value of the doctors in the last generation was 0.91 (± 0.02). As said, the choice of 0.9 as a threshold value in the previous simulations was not entirely arbitrary. However, it almost looks too good to be true that, when we make the threshold a parameter that can be optimized in the evolutionary process, we do find that this process drives this threshold to have an average of basically 0.9, with the vast majority of doctors having thresholds very close to that value. The author has been unable to find any bug in the code for the simulations that might account for this finding, though interested readers are invited to inspect the Julia code that was used for the simulations. (The code is publicly available at this repository: https://github.com/IgorDouven/Abduction-Theory-and-Evidence.git.)
In connection with these simulations, it is worth reiterating some of the observations already made in Douven (2020a, 2022). 
First, as noted there, whereas the evolutionary algorithm used in the simulations first and foremost serves as an optimization method, it can in the case at hand also be conceived as showing how evolution may have favored agents good at selecting the right update rule for the right environment. Second, a plausible explanation of why an explanation-based
Fig. 5 Log plot of average threshold value per generation, with 95 percent confidence band (axes: average threshold value against generation)
update rule is, in the context considered, preferable to Bayes’ rule is that it allows for adaptive learning (by letting users increase or decrease the bonus for explanatory goodness), which Bayes’ rule in itself does not do. Indeed, this point is reinforced by comparing the outcomes of the new simulations with those reported in Douven (2020a, 2022). In the former, Bayesians have acquired some flexibility, because of the flexible thresholds, that they did not have in the previous simulations. (But then why do EXPL users and users of Good’s rule do worse than Bayesians in the simulations, given that Bayesians still do not have as much flexibility as those other agents? As explained in Douven (2022), that has to do with the fact that EXPL users and users of Good’s rule are unable to bring the explanation bonus quickly enough close enough to what the optimal value for that bonus would be for them. For Popperians, the bonus value is, on average, already from the beginning of the evolutionary process quite close to what the optimal value for them is.)
Most importantly, neither the new simulations nor the ones reported in Douven (2020a, 2022) aim to show that it is always more rational to update via abduction (in some form) than via Bayes’ rule. Rather, their point is to help counter the claim made by Bayesians that it is never rational to update via abduction. Everything said in the foregoing is consistent with the insights of Elqayam, Gigerenzer, and others who have worked on ecological rationality, which imply that there is no one-size-fits-all norm of rationality and that instead different update rules may be called for in different contexts for different persons. In light of the work on ecological rationality, specifying a realistic type of situation in which we are better off by relying on a version of abduction is all a defense of this type of reasoning requires.
Finally, much ink has been spilled over the question of whether abduction is compatible with Bayesianism. 
The foregoing suggests that the answer is a resounding yes if we are willing to let go of the imperialist ideas that have
traditionally accompanied defenses of both Bayes’ rule and abduction. As was shown, there can be contexts in which abduction trumps Bayesian updating in all respects that matter in that context. But it is by no means ruled out that there are contexts in which one is better off using Bayes’ rule. So it is not only the case that Bayesians and explanationists can be friends (as Lipton, 2004, Ch. 7, argues); we can all in good conscience be Bayesians and explanationists, just not at the same time.
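To make the per-patient procedure described in this section concrete, the following Python sketch updates a doctor's probabilities over the eleven hypotheses about α one test result at a time and stops as soon as one hypothesis exceeds her threshold. With `bonus=0` it is plain Bayesian updating. The bonus branch is only an assumption about what an explanation-based rule in the spirit of EXPL might look like: it adds a fixed bonus to the unrefuted hypothesis closest to the observed fraction of positive results before renormalizing. The exact schemata are given in section “What Is Abduction?” and in Douven (2020a, 2022); the name `run_doctor` and the tie-breaking details are hypothetical.

```python
ALPHAS = [i / 10 for i in range(11)]  # hypotheses: alpha in {0, .1, ..., 1}


def run_doctor(results, threshold, bonus=0.0):
    """Update sequentially on test results (True = positive).

    Returns (time step, accepted value of alpha) once some hypothesis has
    probability above `threshold`, or None if that never happens.
    """
    probs = [1 / len(ALPHAS)] * len(ALPHAS)  # uniform prior
    positives = 0
    for t, positive in enumerate(results, start=1):
        positives += positive
        # Likelihood of this result under each hypothesis alpha = x.
        likes = [a if positive else 1 - a for a in ALPHAS]
        scores = [p * l for p, l in zip(probs, likes)]
        if bonus > 0:
            # ASSUMPTION: the "best explanation" is the not-yet-refuted
            # hypothesis closest to the observed frequency of positives.
            freq = positives / t
            live = [i for i, s in enumerate(scores) if s > 0]
            best = min(live, key=lambda i: abs(ALPHAS[i] - freq))
            scores[best] += bonus
        total = sum(scores)
        probs = [s / total for s in scores]
        top = max(range(len(ALPHAS)), key=lambda i: probs[i])
        if probs[top] > threshold:
            return t, ALPHAS[top]
    return None


all_positive = [True] * 100
bayes = run_doctor(all_positive, threshold=0.9)
with_bonus = run_doctor(all_positive, threshold=0.9, bonus=0.1)
print(bayes, with_bonus)
```

On a run of uniformly positive results, the bonus variant cannot reach the threshold later than the Bayesian one, which illustrates the speed side of the trade-off; on noisier sequences the same bonus can also push confidence toward a false hypothesis, which is the accuracy side.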
Conclusion
The evidence showing that explanatory considerations play a role in how people adapt their subjective probabilities on the receipt of new information is not necessarily evidence that people are irrational. The arguments purporting to show otherwise (the dynamic Dutch book argument and the expected inaccuracy minimization argument) were defused. What is most fundamentally wrong with these arguments is that they only look at costs and not at possible benefits that might be worth the costs (granting that the costs are real, which we are under no obligation to do). Starting from an ecological conception of rationality, it was possible to go beyond defusing the criticisms leveled at abduction and to make a positive case for this mode of reasoning. The conclusion is not that abduction is a universally rational mode of reasoning, but rather that there are situations in which rationality recommends its use, leaving open the possibility that there are other situations in which one does better to rely on some other form of reasoning. (I am grateful to two anonymous referees for helpful comments on a previous version of this paper.)
References
Barbati, M., Bruno, G., & Genovese, A. (2012). Applications of agent-based models for optimization problems: A literature review. Expert Systems with Applications, 39, 6020–6028.
Boyd, R. N. (1984). On the current status of scientific realism. Erkenntnis, 19, 45–90.
Boyd, R. N. (1985). Lex orandi est lex credendi. In P. Churchland & C. Hooker (Eds.), Images of science (pp. 3–34). Chicago: University of Chicago Press.
Douven, I. (1999). Inference to the best explanation made coherent. Philosophy of Science, 66, S424–S435.
Douven, I. (2017). Inference to the best explanation: What is it? And why should we care? In K. McCain & T. Poston (Eds.), Best explanations: New essays on inference to the best explanation (pp. 4–22). Oxford: Oxford University Press.
Douven, I. (2019). Optimizing group learning: An evolutionary computing approach. Artificial Intelligence, 275, 235–251.
Douven, I. (2020a). The ecological rationality of explanatory reasoning. Studies in History and Philosophy of Science, 79, 1–14.
Douven, I. (2020b). Scoring in context. Synthese, 197, 1565–1580.
Douven, I. (Ed.). (2021). Lotteries, knowledge, and rational belief: Essays on the lottery paradox. Cambridge: Cambridge University Press.
Douven, I. (2022). The art of abduction. Cambridge, MA: MIT Press.
Douven, I. (2023). Scoring, context, and value. Synthese, in press.
Douven, I., & Mirabile, P. (2018). Best, second-best, and good-enough explanations: How they matter to reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44, 1792–1813.
Douven, I., & Schupbach, J. N. (2015a). The role of explanatory considerations in updating. Cognition, 142, 299–311.
Douven, I., & Schupbach, J. N. (2015b). Probabilistic alternatives to Bayesianism: The case of explanationism. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.00459
Douven, I., & Wenmackers, S. (2017). Inference to the best explanation versus Bayes’ rule in a social setting. British Journal for the Philosophy of Science, 68, 535–570.
Edwards, W. (1968). Conservatism in human information processing. In B. Kleinmuntz (Ed.), Formal representation of human judgment (pp. 17–52). New York: Wiley.
Edwards, B. J., Williams, J. J., Gentner, D., & Lombrozo, T. (2019). Explanation recruits comparison in a category-learning task. Cognition, 185, 21–38.
Elqayam, S. (2011). Grounded rationality: A relativist framework for normative rationality. In K. I. Manktelow, D. E. Over, & S. Elqayam (Eds.), The science of reason (pp. 397–420). Hove: Psychology Press.
Elqayam, S. (2012). Grounded rationality: Descriptivism in epistemic context. Synthese, 189, 39–49.
Fischhoff, B., & Lichtenstein, S. (1978). Don’t attribute this to Reverend Bayes. Psychological Bulletin, 85, 239–243.
Foley, R. (1993). Working without a net. Oxford: Oxford University Press.
Gigerenzer, G. (2000). Adaptive thinking: Rationality in the real world. New York: Oxford University Press.
Gigerenzer, G. (2001). The adaptive toolbox. In G. Gigerenzer & R. Selten (Eds.), Bounded rationality: The adaptive toolbox (pp. 37–50). Cambridge, MA: MIT Press.
Good, I. J. (1960). Weight of evidence, corroboration, explanatory power, information and the utility of experiment. Journal of the Royal Statistical Society, B22, 319–331.
Griffiths, T. L., & Tenenbaum, J. B. (2006). Optimal predictions in everyday cognition. Psychological Science, 17, 767–773.
Joyce, J. (1998). A nonpragmatic vindication of probabilism. Philosophy of Science, 65, 575–603.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. Cambridge: Cambridge University Press.
Keil, F. C. (2006). Explanation and understanding. Annual Review of Psychology, 57, 227–254.
Kuipers, T. A. F. (1992). Naive and refined truth approximation. Synthese, 93, 299–341.
Legare, C. H., & Lombrozo, T. (2014). Selective effects of explanation on learning during early childhood. Journal of Experimental Child Psychology, 126, 198–212.
Leitgeb, H., & Pettigrew, R. (2010). An objective justification of Bayesianism II: The consequences of minimizing inaccuracy. Philosophy of Science, 77, 236–272.
Lipton, P. (1993). Is the best good enough? Proceedings of the Aristotelian Society, 93, 89–104.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). London: Routledge.
Lombrozo, T., & Gwynne, N. Z. (2014). Explanation and inference: Mechanistic and functional explanations guide property generalization. Frontiers in Human Neuroscience, 8. https://doi.org/10.3389/fnhum.2014.00700
Marks, D. F., & Clarkson, J. K. (1972). An explanation of conservatism in the bookbag-and-pokerchips situation. Acta Psychologica, 36, 145–160.
McMullin, E. (1992). The inference that makes science. Milwaukee: Marquette University Press.
Oaksford, M., & Chater, N. (2007). Bayesian rationality. Oxford: Oxford University Press.
Phillips, L. D., & Edwards, W. (1966). Conservatism in a simple probability inference task. Journal of Experimental Psychology, 72, 346–354.
Popper, K. R. (1959). The logic of scientific discovery. London: Hutchinson.
Psillos, S. (2004). Inference to the best explanation and Bayesianism. In F. Stadler (Ed.), Induction and deduction in the sciences (pp. 83–91). Dordrecht: Kluwer.
Schum, D. A., & Martin, A. W. (1982). Formal and empirical research on cascaded inference in jurisprudence. Law and Society Review, 17, 105–151.
Schurz, G., & Hertwig, R. (2019). Cognitive success: A consequentialist account of rationality and cognition. Topics in Cognitive Science, 11, 7–36.
Tentori, K., Crupi, V., & Russo, S. (2013). On the determinants of the conjunction fallacy: Probability versus inductive confirmation. Journal of Experimental Psychology: General, 142, 235–255.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293–315.
van Fraassen, B. C. (1989). Laws and symmetry. Oxford: Oxford University Press.
Vasilyeva, N., & Lombrozo, T. (2020). Structural thinking about social categories: Evidence from formal explanations, generics, and generalization. Cognition, 204, 104383.
Walker, C. M., & Lombrozo, T. (2017). Explaining the moral of the story. Cognition, 167, 266–281.
Williams, J. J., & Lombrozo, T. (2010). The role of explanation in discovery and generalization: Evidence from category learning. Cognitive Science, 34, 776–806.
Williams, J. J., & Lombrozo, T. (2013). Explanation and prior knowledge interact to guide learning. Cognitive Psychology, 66, 55–84.
Williamson, T. (1998). Conditionalizing on knowledge. British Journal for the Philosophy of Science, 49, 89–121.
Plausible Reasoning in Neuroscience
73
Tommaso Costa, Donato Liloia, Mario Ferraro, and Jordi Manuello
Contents
Introduction
What Is Probability?
Bayes’ Theorem
Probability Distributions
What Is Statistical Inference?
Some Applications of the Bayesian Approach to Inference Problems
Parameters Estimation
Models Comparison
The Bayesian Approach in Neuroscience
Behavioral Models
Brain Models
A Bayes Factor-Based Approach to Neuroimaging Coordinate-Based Meta-Analysis
Conclusions
References
Abstract
What is probability? How does it relate to statistical inference? These are fundamental questions, and the answers vary greatly depending on the approach adopted. This chapter takes the perspective of the Bayesian tradition and describes how it differs from the canonical frequentist one. By discussing the logical and mathematical foundations of probability introduced by Cox and Jaynes, it will be shown how the early objections to the Bayesian approach can be overcome. Next, it will be highlighted how the Bayesian approach offers a useful method for dealing with the choice of research hypotheses. Several examples of parameter estimation and model comparison will be described, whose solutions are based on the probability distributions introduced beforehand. Some relevant Bayesian theories from the neuroscientific literature, such as multimodal sensory integration and predictive coding, will be described. Finally, an application to neuroimaging meta-analytical methods will be presented.
Keywords
Bayesian inference · Bayes Factor · Brain models · Neuroimaging · Meta-analysis
T. Costa · D. Liloia · J. Manuello: Focus Lab, Department of Psychology, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
M. Ferraro: Department of Physics, University of Turin, Turin, Italy; GCS-fMRI, Koelliker Hospital, Turin, Italy
© Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_74
Introduction
Statistical inference can be defined as a procedure, or a collection of practices, that aims to extract information from data and to generalize the results beyond the observed sample. For this reason, statistical inference allows us to test scientific hypotheses or to evaluate competing models of a given phenomenon. Consider the following example: you happen to know that Alice and Bob recently had a terrible row that ended their engagement. Now someone tells you they just saw Alice and Bob jogging together. The best explanation you can think of is that they have reconciled. You therefore conclude that they are engaged again. This type of inference will probably strike most of us as entirely familiar. Philosophers as well as psychologists tend to agree that it is a form of abduction, a term that refers to a kind of explanatory reasoning also often called “Inference to the Best Explanation,” which had its origin in the work of Peirce (1974).
In theoretical terms, two kinds of statistical inference can be distinguished: classical inference, also referred to as frequentist, and Bayesian inference. Although both are based on the theory of probability, they differ in the way probability itself is conceptualized. Arguing about the “correct” definition would be pointless, as each simply reflects a different approach to the phenomenon under investigation. Elements of coherence between the two schools seem to emerge even on a mathematical level. Nevertheless, it is still possible to reason about which approach is most adequate for a specific topic given its nature, as well as about which of the two traditions better captures the logical process behind scientific research.
In this chapter, the concept of probability will be discussed first, and it will be shown how Bayes’ theorem can be derived and used. A similar conceptualization will be provided for the process of statistical inference. After a technical explanation of some useful distributions, several concrete examples of both Bayesian parameter estimation and model comparison will be discussed. Finally, some relevant Bayesian theories from the neuroscientific literature will be described, including recent applications of Bayesian statistics in the field of neuroimaging coordinate-based meta-analysis. This will show that Bayesian statistics is quite possibly the best way to tackle the problem of abductive inference in many scientific disciplines, including neuroscience. For relevant examples, see Calzavarini and Cevolani (2022) and Niiniluoto (2011).
What Is Probability?
At the end of the nineteenth century, mathematicians clearly felt the need for a conceptual systematization of the calculus of probability. Indeed, the sixth mathematical problem proposed by David Hilbert at the Second International Congress of Mathematicians in Paris (Scott, 1900) concerned the axiomatization of those branches of physics in which mathematics plays an important role, the theory of probability being among the most relevant. This was achieved 33 years later, when A.N. Kolmogorov axiomatized the theory of probability, putting it on a par with other axiomatized parts of mathematics (Kolmogorov & Bharucha-Reid, 2018). Kolmogorov’s approach was based solely on the following three postulates. Let S, the sample space, be an ensemble of elementary events and A a subset of S. Then:
1. The probability of the event A is a non-negative real number between 0 and 1: 0 ≤ P(A) ≤ 1.
2. The probability of the certain event, the sample space S, is equal to 1: P(S) = 1.
3. If A and B are two mutually exclusive events, then P(A ∪ B) = P(A) + P(B).
Two or more events are mutually exclusive when they cannot happen simultaneously, that is, when the occurrence of one precludes the occurrence of the others. From these axioms, some theorems and further concepts can be derived. First, the theorem of total probability: let E1, E2, ..., En be a set of mutually exclusive events; then the probability of their union E is the sum of the probabilities of the individual events:
$$P(E) = P(E_1) + P(E_2) + \cdots + P(E_n).$$
Next, the concept of conditional probability is introduced. The conditional probability of the event A given the event B, with P(B) > 0, is the probability that A occurs given that B has occurred:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \tag{1}$$
From Eq. (1), Bayes’ theorem can be derived. First, write the conditional probability of B given A in the same way:
$$P(B \mid A) = \frac{P(B \cap A)}{P(A)}.$$
Since P(A ∩ B) = P(B ∩ A), it follows that P(A|B) P(B) = P(B|A) P(A), most commonly written as:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},$$
which is the famous Bayes’ theorem. The advantage of Kolmogorov’s formulation is that it avoids any ambiguity about the definition of probability, which remains a primitive concept detached from its meaning. Moreover, each of the theorems presented is formally proved. The axiomatic foundation of probability theory, albeit of great mathematical relevance, leaves open the question of the meaning of probability, and different interpretations have emerged in a long, and often acrimonious, debate.
The frequentist interpretation maintains that probability represents the expected frequency of an event over a series of repeated trials, and is therefore a long-term property of a well-defined process. A classic example is the expected 50% of heads in a long series of coin tosses. However, such an interpretation does not allow us to deal with the probability of a single occurrence of an event. What is more, it implies strong assumptions about the repeatability of the phenomenon under study and about the independence between trials. Notably, relative frequencies, such as those of head/tail outcomes, are seen as real components of the physical world, and therefore as persistent and measurable properties, like size or weight (Neyman, 1977). This interpretation of probability excludes a large set of questions, such as: What is the probability of an eruption of Vesuvius? What is the probability that my favorite soccer team will win the championship? These are single events that cannot be reduced to a frequency over repeated events, and they cannot be studied within this framework.
A different and more general approach, which can be used for all kinds of events, was proposed by Cox (1946) and later expanded by Jaynes (2003). According to them, probability is a measure of the plausibility of a proposition under incomplete knowledge. On this view, inference problems pertain to inductive logic rather than deductive logic. The latter can be schematized as follows:
If A is true, then B is true.
A is true.
Therefore, B is true.
and its inverse:
If A is true, then B is true.
B is false.
Therefore, A is false.
A famous example of syllogistic reasoning is: “All men are mortal. Socrates is a man. Therefore, Socrates is mortal.”
This kind of reasoning can hardly be used in real circumstances, because in general the right kind of information is not available. Therefore, a weaker syllogism must be followed instead:
If A is true, then B is true.
B is true.
Therefore, A becomes more plausible.
In this case, which Peirce showed to be a form of abductive reasoning addressable through probability, the information does not allow us to prove that A is true, but the occurrence of one of its consequences makes the truth of A more plausible. Following Jaynes, consider this example:
A = it will start raining by 10 AM at the latest
B = the sky will become cloudy before 10 AM
If we observe clouds at 9:45 AM, this evidence does not give us logical certainty that it will rain by 10 AM. However, our belief in the coming rain has increased.
Cox focused on identifying the quantitative rules necessary for logical and coherent reasoning (Cox, 1946). In doing so, he considered how we can express our beliefs about the degree of truth of propositions. These rules are:
1. The plausibility of a proposition is a real number and depends on the information related to the proposition.
2. Plausibility should vary in a sensible way: larger numbers mean greater plausibility.
3. If the plausibility of a proposition can be derived in different ways, the results must be the same regardless of the way used to derive them.
Furthermore, the transitive property holds. Given the statements (a), (b), and (c), if we are more certain about (a) than about (b), and about (b) than about (c), then we must necessarily be more certain about (a) than about (c). This kind of ranking can easily be obtained by assigning a real number to each proposition, following rule 1, so that the surer we are about it, the higher the value, in line with rule 2. But what rules must these numbers obey to meet the criteria of logical consistency? Cox made two assumptions.
The first, and more intuitive, is that quantifying the degree of truth associated with a proposition implicitly quantifies the belief that the same proposition is false. In other words, if we strongly believe that it will rain tomorrow, we also think that this proposition can hardly be false. Cox did not specify a particular form for this relation, but he considered it reasonable that some form exists. The second assumption is slightly more complex: consider two propositions, X and Y, and state how sure we are that Y is true. If we now declare how certain we are that X is true given that Y is true, then we have implicitly specified how sure we are that X and Y are both true. Cox therefore affirmed the existence of a relation among these degrees of belief but, again, did not assume a specific form for it. In fact, according to him, the form of the relations described in the two assumptions could be derived from the already existing rules of algebra, both ordinary and Boolean. Cox eventually found that, to abide by these constraints, the real numbers used to express our beliefs about the degree of truth of any proposition (here denoted by P) must follow two rules of the theory of probability:
$$P(A \mid I) + P(\bar{A} \mid I) = 1$$
and
$$P(A, B \mid I) = P(A \mid B, I)\,P(B \mid I).$$
In these two formulae, $\bar{A}$ denotes the proposition that A is false, the comma stands for the logical “and,” and the symbol “|” reads “given,” meaning that everything on its right is taken to be true. Finally, I represents all the available information. The first equation is called the “sum rule”: the probability that proposition A is true plus the probability that proposition A is false is equal to 1. The second equation is called the “product rule”: the probability that propositions A and B are both true is equal to the probability that A is true given that B is true, multiplied by the probability that B is true (irrespective of A). The term I is the background information on which probability depends. For example, your belief in the proposition “it will rain this afternoon,” which can be understood as the belief that the event will happen, depends on whether the sky is currently cloudy or clear and on whether you are aware of the weather forecast. Although I is often omitted to lighten the notation, its presence in the formulae should never be forgotten. Failing to specify all available information and hypotheses is often the true cause of heated debates in statistical research.
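The sum rule, the product rule, and the weak syllogism discussed above can be checked numerically. The following is a minimal sketch in Python, assuming a small hypothetical joint table of plausibilities for the rain/cloud propositions; the numerical values and the helper names (`joint`, `prob`, `cond`) are illustrative assumptions and are not taken from Cox or Jaynes.

```python
from fractions import Fraction

# Hypothetical joint plausibilities P(A, B | I) for two propositions:
#   A = "it will start raining by 10 AM", B = "the sky is cloudy before 10 AM".
# The values are illustrative assumptions; P(A, not B) = 0 encodes "if A, then B".
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(0),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(5, 10),
}

def prob(event):
    """P(event | I): sum the joint table over the cases where event(a, b) holds."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

def cond(event, given):
    """P(event | given, I), obtained by rearranging the product rule."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)

def is_a(a, b):
    return a

def is_b(a, b):
    return b

p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda a, b: a and b)
p_a_given_b = cond(is_a, is_b)
p_b_given_a = cond(is_b, is_a)

# Sum rule: P(A | I) + P(not A | I) = 1
sum_rule_total = p_a + prob(lambda a, b: not a)

# Product rule: P(A, B | I) = P(A | B, I) P(B | I)
product_rule_rhs = p_a_given_b * p_b

# Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)
bayes_rhs = p_b_given_a * p_a / p_b

print("sum rule total:", sum_rule_total)
print("product rule holds:", p_a_and_b == product_rule_rhs)
print("P(A) =", p_a, " P(A | B) =", p_a_given_b, " via Bayes:", bayes_rhs)
```

With these assumed numbers, observing B raises the plausibility of A without making it certain, which is the behavior the weak syllogism describes. Exact fractions are used so that the equalities can be compared without rounding error.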
Bayes’ Theorem
Let us now describe Bayes’ theorem in the light of Cox’s interpretation. As previously shown, following Kolmogorov’s theory and considering two generic events A and B, Bayes’ theorem can be written as:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$
To show how it can be used in inference problems, let $\mathcal{H}$ be a set of hypotheses or models of the world, H an element of that set, and D a set of available data relevant for $\mathcal{H}$. The previous formula becomes:
$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}.$$
In this formula, the term P(H|D) is called the posterior probability, and it is the probability of the hypothesis H once the data have been observed. The probability distribution P(D|H) describes the probability of the data, assuming the hypothesis to be true; it is called the sampling distribution when seen as a function of the data, or the likelihood when considered a function of the hypothesis. In the latter case, it is sometimes written as L(D|H). The likelihood is thus the probability that the hypothesis or model H from the set $\mathcal{H}$ gives rise to precisely the observed data D. The term in the denominator, P(D), is the probability of the data irrespective of the hypothesis, called the marginal likelihood. Finally, P(H), called the prior, reflects any prior knowledge about the probability that H holds true, and thus it takes into account the fact that not all models or hypotheses are equally likely a priori. Therefore, Bayes’ theorem states that the probability of the hypothesis after the data have been observed, that is, P(H|D), depends on the probability of the data assuming that the hypothesis is true, P(D|H), on our prior beliefs about the hypothesis, P(H), and on the probability of the data, P(D). In other words, this theorem describes the process of learning, modeling how our beliefs should be updated as we gain information through the acquisition of further data. To avoid any confusion, it is relevant to clarify that here the terms prior and posterior refer to a logical, rather than temporal, relation. What distinguishes the prior probability from the posterior probability is the observation of the data in between the two. If the data arrive sequentially, it is possible to update the inference on the unknown hypothesis online. Therefore, denoting by D1, D2, ..., Dn the series of data, Bayes’ theorem can be written as:
$$P(H \mid D_1, \ldots, D_{n+1}) = \frac{P(D_{n+1} \mid H, D_1, \ldots, D_n)\,P(H \mid D_1, \ldots, D_n)}{P(D_{n+1} \mid D_1, \ldots, D_n)}$$
In other words, the formula means “the posterior of today will be the prior of tomorrow.” Therefore, the solution of inference problems, in a probabilistic setting, ideally consists in the computation of P(H|D), as it gives the probability that a given hypothesis H holds on the basis of the data. However, this is not always feasible, since the computation of P(D) may require the calculation of complex sums or integrals (see, for instance, Smith (1991) and Park (2021)). A common alternative is to find the element H corresponding to the maximum of P(H|D). In fact, when the posterior probabilities of two different hypotheses H1 and H2 are compared given the same data D, P(D) is irrelevant, as it does not depend on H. Thus, for inference purposes, only P(D|H) and P(H) are important. The likelihood P(D|H) depends on the data, whereas P(H) can be determined on the basis of some a priori information or some general theory concerning the phenomena under study. For instance, some astronomical observations can be explained by both geocentric and heliocentric models, but from a general theory of the universe we know a priori that a geocentric model is very unlikely to be true. The inclusion of the a priori probability differentiates the Bayesian approach to inference from the frequentist one, and it is often criticized because P(H) does not depend on the data and is therefore somewhat arbitrary. However, it can be argued that using a priori hypotheses or models is just the way science operates, and that the use of an a priori probability allows us to consider not just the data at hand but, more generally, the whole body of knowledge about the laws governing the phenomena (i.e., the data) under study. Furthermore, as explained previously, the a priori probability can be thought of as derived from the posterior probability of previous experiments.
Finally, let us consider how the likelihood and the prior probability contribute to the posterior probability. If there is little a priori information to constrain the inference process, P(H|D) is determined mostly by the likelihood, and the estimators computed from P(H|D) and P(D|H) give very close results; if all hypotheses are equally likely a priori, P(H|D) is proportional to P(D|H), so the two are maximized by the same hypothesis. In other words, if there is little or no prior information for preferring a particular hypothesis, the agreement with the data, represented by the likelihood, predominates. The opposite occurs if P(H) is sharply peaked around a given element of the set $\mathcal{H}$; in this case, the maximum a posteriori probability may correspond to a hypothesis whose likelihood, given the data, is relatively low. Some examples of this interplay between a priori probability and likelihood will be presented in the sequel. The relevance of the likelihood P(D|H) is also affected by the accuracy of the measurements with which the data are obtained. Accurate measurements imply that the data are reliable, and hence our beliefs are shaped mostly by P(D|H); in the case of inaccurate data, prior information, represented by P(H), becomes more relevant.
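The sequential use of Bayes’ theorem and the interplay between prior and likelihood described above can be sketched with a discrete set of hypotheses. This is a minimal illustration, assuming a hypothetical grid of five candidate success probabilities, made-up binary data, and hypothetical helper names (`update`, `batch_posterior`); it is not a method taken from the chapter.

```python
# Hypotheses: candidate values for the probability of success (a hypothetical grid).
hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]

def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def update(prior, datum):
    """One application of Bayes' theorem for a single binary datum (1 = success)."""
    likelihood = [h if datum == 1 else 1.0 - h for h in hypotheses]
    return normalize([l * p for l, p in zip(likelihood, prior)])

def batch_posterior(prior, data):
    """Posterior obtained by processing all (independent) data at once."""
    k, n = sum(data), len(data)
    likelihood = [h**k * (1.0 - h)**(n - k) for h in hypotheses]
    return normalize([l * p for l, p in zip(likelihood, prior)])

def argmax_hypothesis(dist):
    """Hypothesis with the largest probability (maximum a posteriori)."""
    return hypotheses[max(range(len(dist)), key=dist.__getitem__)]

data = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # illustrative: 3 successes in 10 trials

# Sequential updating: the posterior after each datum is the prior for the next.
uniform_prior = normalize([1.0] * len(hypotheses))
sequential = uniform_prior
for d in data:
    sequential = update(sequential, d)

batch = batch_posterior(uniform_prior, data)

# With a uniform prior, the MAP hypothesis is the one with the highest likelihood.
map_uniform = argmax_hypothesis(sequential)

# With a prior sharply peaked on 0.5 (assumed values), the MAP hypothesis can be
# one whose likelihood is lower than that of a competitor.
peaked_prior = [0.01, 0.01, 0.96, 0.01, 0.01]
map_peaked = argmax_hypothesis(batch_posterior(peaked_prior, data))

print("sequential:", [round(q, 4) for q in sequential])
print("batch:     ", [round(q, 4) for q in batch])
print("MAP with uniform prior:", map_uniform, " MAP with peaked prior:", map_peaked)
```

In this sketch the sequential and batch posteriors agree up to floating-point rounding, because the data are assumed to be independent given the hypothesis. Only unnormalized products of likelihood and prior are needed to find the maximum, which reflects the remark that P(D) plays no role when hypotheses are compared.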
Probability Distributions
A probability distribution is a function that assigns to each event in a set the probability that it happens. A wide range of probability distributions exists, each characterized by specific features and parameters. A probability distribution can be summarized by single numbers, the most important of which are the expected value (also referred to as the mean) and the variance. Consider a set X of discrete events {x_i} with probability distribution P(x_i); then the expected value is given by:
$$E[X] = \sum_i x_i\,P(x_i)$$
and the variance is defined by:
$$\sigma^2 = \sum_i \left(x_i - E[X]\right)^2 P(x_i).$$
In other words, the expected value is calculated by multiplying each possible event by its probability and then summing all of those values. The variance is the expected value of the squared difference between the events and the expected value. It therefore measures the dispersion of the events: the higher the variance, the farther the values tend to lie from the expected value. The extension to the continuous case follows immediately, by considering continuous probability density functions and replacing the summation with an integral. In the following paragraphs, the lowercase p will be used to indicate probabilities that can assume continuous values, while the uppercase P will be used if the probability can only assume discrete values. Here, three probability distributions that are particularly relevant in the field of plausible reasoning will be described.
The binomial distribution is used to compute the probability of obtaining a given number of positive outcomes in a series of repeated trials, given the probability of a positive outcome in a single trial. The prefix bi- refers to the two possible outcomes: the event can happen or not. If more than two results are possible for a single trial, the binomial generalizes to a multinomial distribution. Typical problems that can be modeled through a binomial distribution are:
• Obtaining two tails in three coin tosses
• Winning at least once with one million scratch cards
• Getting a 6 in four rolls of a six-sided die
The formula is:
$$B(k; n, p) = \binom{n}{k} p^k (1 - p)^{n-k}$$
where k is the number of positive outcomes, n is the number of trials, p is the probability of success in a single trial, and
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$
is the binomial coefficient. In the previous example of obtaining two tails in three coin tosses, the abbreviated notation would be B(2; 3, 1/2). The semicolon after k means that the probability is considered a function of k, for fixed values of n and p. However, the whole distribution is often expressed simply as B(n, p). Figure 1 depicts the binomial distribution for different values of n and p. The binomial distribution has expected value np and variance np(1 − p).
Fig. 1 Binomial distributions for different values of n (number of trials) and p (probability of success)
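The definitions of expected value and variance, together with the binomial formula, can be checked directly. The following is a minimal sketch in Python, assuming illustrative values n = 10 and p = 0.3 for the numerical check; the function names are hypothetical, and reading the die example as "at least one six" is an assumption about the intended problem.

```python
from math import comb

def binomial_pmf(k, n, p):
    """B(k; n, p): probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def expected_value(values, probs):
    """E[X] = sum_i x_i P(x_i)."""
    return sum(x * q for x, q in zip(values, probs))

def variance(values, probs):
    """sigma^2 = sum_i (x_i - E[X])^2 P(x_i)."""
    mean = expected_value(values, probs)
    return sum((x - mean) ** 2 * q for x, q in zip(values, probs))

# Two tails in three coin tosses, i.e. B(2; 3, 1/2).
two_tails_in_three = binomial_pmf(2, 3, 0.5)

# At least one six in four rolls of a die (assumed reading of the example):
# the complement of "no six in four rolls".
at_least_one_six = 1 - binomial_pmf(0, 4, 1 / 6)

# Numerical check of the mean np and the variance np(1 - p), for assumed n and p.
n, p = 10, 0.3
ks = list(range(n + 1))
pmf = [binomial_pmf(k, n, p) for k in ks]
mean = expected_value(ks, pmf)
var = variance(ks, pmf)

print("B(2; 3, 1/2) =", two_tails_in_three)
print("P(at least one six in four rolls) =", round(at_least_one_six, 4))
print("mean:", round(mean, 6), " expected n*p:", n * p)
print("variance:", round(var, 6), " expected n*p*(1-p):", n * p * (1 - p))
```

The mean and variance are computed from the generic summation formulas rather than from the closed forms, so the agreement with np and np(1 − p) serves as an independent check.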
In the previous example, k is a discrete random variable, in that it can assume only integer values. When dealing with continuous random variables, the probability distribution is replaced by a probability density function (PDF). In these cases, the computed probability is not that of a given value (which is zero), but rather the probability that a value falls in a given interval.
Let us now introduce the beta distribution. It can be described as a probability distribution over probabilities, and its domain is bounded between 0 and 1. The beta distribution is defined through the following PDF:
$$\mathrm{Beta}(p; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}$$
where p represents the probability of an event, α > 0 and β > 0 are shape parameters, and Γ(x) is the gamma function, which has a normalizing role (Fig. 2). Note that, as functions of p, the binomial distribution and the beta distribution have the same form, apart from a normalization constant. The difference between the two is that the binomial models the number of successes k, while the beta models the probability of success p.
Fig. 2 Beta distributions for different values of α and β
The third and last distribution to be introduced is the Gaussian distribution (also called normal). This is a continuous probability distribution, like the beta distribution. The Gaussian distribution gives the probability that a single random variable falls in a given interval, and it is defined through two parameters, the mean (μ) and the standard deviation (σ). Figure 3 depicts the normal distribution with μ = 0 and σ = 1. As shown in Fig. 3, the normal distribution is centered around the mean μ, while its width depends on the standard deviation σ. As the latter decreases, the distribution gets tighter around its mean. Therefore, the normal distribution describes how much the estimated mean value can be trusted: not that much when the observed values span a wide range; more and more as the observed values get closer to the mean and hence less scattered (i.e., small σ). When all that is known about a phenomenon is the mean and standard deviation of the observed data associated with it, the normal distribution is the most honest possible representation of our belief.
Fig. 3 The normal (or Gaussian) distribution. Here the values on the x-axis represent the data
The distributions described above are linked by the mathematical property of conjugacy. In the Bayesian theory of probability, when the posterior distribution belongs to the same family as the prior distribution, the two distributions are said to be conjugate, and the prior distribution is termed a conjugate prior for the likelihood function. This property is particularly relevant, as it simplifies the calculations needed to obtain the posterior distribution analytically. For example, if a likelihood represented through a binomial distribution is multiplied by a prior represented through a beta distribution, the posterior will be a beta distribution:
$$\mathrm{Beta}(\theta; \alpha + k, \beta + n - k) \propto B(k; n, \theta) \times \mathrm{Beta}(\theta; \alpha, \beta).$$
Or, in the case of the likelihood of a single datum x represented through a Gaussian distribution with unknown mean μ and known variance σ², multiplied by a prior on μ represented through a Gaussian distribution with parameters μ0 and σ0², the posterior will still be represented through a normal distribution, with mean and variance computed as:
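$$\mu_p = \frac{\dfrac{x}{\sigma^2} + \dfrac{\mu_0}{\sigma_0^2}}{\dfrac{1}{\sigma^2} + \dfrac{1}{\sigma_0^2}}$$
and
$$\sigma_p^{-2} = \sigma^{-2} + \sigma_0^{-2},$$
which can be extended to multiple data as:
$$\mu_p = \frac{\dfrac{n\bar{x}}{\sigma^2} + \dfrac{\mu_0}{\sigma_0^2}}{\dfrac{n}{\sigma^2} + \dfrac{1}{\sigma_0^2}}$$
with
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
and
$$\sigma_p^{-2} = n\sigma^{-2} + \sigma_0^{-2},$$
where n is the number of data and the summation is across all data values.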
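The two conjugate updates above can be sketched in a few lines. This is a minimal illustration in Python, assuming made-up prior parameters and data; the function names (`beta_binomial_update`, `gaussian_update`) are hypothetical, and the grid-based check is only a numerical sanity test of the beta-binomial result, not part of the chapter.

```python
from math import comb, gamma

def binomial_pmf(k, n, p):
    """B(k; n, p): probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def beta_pdf(p, a, b):
    """Beta(p; a, b) density, normalized through the gamma function."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * p**(a - 1) * (1 - p)**(b - 1)

def beta_binomial_update(a, b, k, n):
    """Beta(a, b) prior with k successes in n trials gives Beta(a + k, b + n - k)."""
    return a + k, b + n - k

def gaussian_update(prior_mean, prior_var, data, noise_var):
    """Posterior mean and variance for a Gaussian mean with known noise variance."""
    n = len(data)
    x_bar = sum(data) / n
    post_precision = n / noise_var + 1 / prior_var
    post_mean = (n * x_bar / noise_var + prior_mean / prior_var) / post_precision
    return post_mean, 1 / post_precision

# Beta-binomial example with assumed values: Beta(2, 2) prior, 3 successes in 10 trials.
a0, b0, k, n = 2, 2, 3, 10
a1, b1 = beta_binomial_update(a0, b0, k, n)

# Numerical check on a midpoint grid: likelihood times prior, once normalized,
# should match the Beta(a1, b1) density.
steps = 2000
grid = [(i + 0.5) / steps for i in range(steps)]
unnormalized = [binomial_pmf(k, n, t) * beta_pdf(t, a0, b0) for t in grid]
area = sum(unnormalized) / steps
max_gap = max(abs(u / area - beta_pdf(t, a1, b1)) for u, t in zip(unnormalized, grid))

# Gaussian example with assumed values: prior N(0, 1), unit noise variance, three data.
post_mean, post_var = gaussian_update(
    prior_mean=0.0, prior_var=1.0, data=[1.0, 2.0, 3.0], noise_var=1.0
)

print("posterior beta parameters:", (a1, b1))
print("largest gap between grid posterior and Beta(a1, b1):", max_gap)
print("Gaussian posterior mean and variance:", post_mean, post_var)
```

The practical benefit of conjugacy is visible in the sketch: the analytic update only adjusts two parameters, whereas the grid check needs thousands of evaluations to reach the same posterior.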
What Is Statistical Inference?
In the introduction, statistical inference was defined as the procedure of finding out some features of a population through the observation of a part of it, called a sample. This can also be extended to the process of comparing different populations through their observed samples. From a philosophical perspective, this is a process of learning through experience, quantified by means of mathematical techniques. It has also been mentioned that two main approaches exist in the field of statistical inference, differing in the way the concept of probability is interpreted. The first tradition originated from the historical contributions of R. Fisher, J. Neyman, and K. Pearson and is called classical inference (or frequentist inference). The second one, named Bayesian inference, is based on the application of Bayes’ theorem. The two approaches share the axioms of probability and the same mathematical structure. Indeed, Bayes’ theorem is valid even in the frequentist paradigm, although it is used and interpreted in a different way. The main points of divergence between the two traditions are the use of the information already available before the data are known and the way results must be interpreted.
Let us now describe the same experiment according to each of the two frameworks. An urn is filled with a given number of marbles, which differ only in color. An unknown fraction p of them is black. After 100 draws, with the picked marble being placed back in the urn every time, a black marble has been extracted 30 times. What fraction of the marbles in the urn is black (i.e., what is the value of p)? Irrespective of the approach, the problem can be solved through a binomial distribution, as the following criteria are fulfilled: (i) the event admits only two outcomes, true or false (black or not black in the example); (ii) each event is independent of any other past or future event; and (iii) the probability of each event is constant.
Following the frequentist school, a 95% confidence interval (CI, a range of estimates for an unknown parameter) is built for p. In this case, the CI is 0.21–0.39. This does not mean that p is inside that CI with a probability of 95%. Rather, it means that the procedure used to build the interval is correct in 95% of cases, that is, in 95% of repeated experiments the true value of p actually falls inside the computed CI. Therefore, the information obtained through the frequentist approach concerns whether or not p is included in the CI, but there is no probability attached to the outcome “being included” for the specific interval at hand. In frequentist statistics, it is also possible to use the method called maximum likelihood estimation. This method estimates the value of p that makes the observed data most probable. Applying maximum likelihood estimation to the binomial model, the estimate of p is p̂ = 30/100 = 0.3.
Let us now follow the Bayesian approach. First, it is necessary to formalize the expectation about the true value of p before any measurement. Assuming the specific case of complete ignorance about p, this is modeled through a uniform distribution in the range 0–1. Carrying out the calculations with the binomial distribution again, we obtain k/n = 30/100 = 0.3, as in the frequentist version.
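The computations for the urn example can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the text does not say how the confidence interval was built, so a normal-approximation (Wald) interval is assumed here, and the Bayesian posterior under the uniform prior is approximated on a hypothetical grid of 10,000 points. As an additional illustrative quantity, the sketch also sums the posterior mass that falls inside the same interval.

```python
from math import sqrt

k, n = 30, 100  # black marbles drawn and total draws, as in the urn example

# Frequentist side: maximum likelihood estimate and an assumed Wald 95% interval.
p_hat = k / n
half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - half_width, p_hat + half_width)

# Bayesian side: uniform prior on p, posterior proportional to the binomial
# likelihood. The binomial coefficient does not depend on p and cancels out
# in the normalization, so it is omitted.
grid_size = 10_000
grid = [(i + 0.5) / grid_size for i in range(grid_size)]
unnormalized = [p**k * (1 - p) ** (n - k) for p in grid]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]  # probability mass per grid cell

# Most probable value of p under the posterior (limited by the grid resolution).
posterior_mode = grid[max(range(grid_size), key=posterior.__getitem__)]

# Posterior mass inside the frequentist interval, for comparison.
mass_in_ci = sum(q for p, q in zip(grid, posterior) if ci[0] <= p <= ci[1])

print("MLE:", p_hat)
print("Wald 95% CI:", tuple(round(c, 2) for c in ci))
print("posterior mode:", round(posterior_mode, 3))
print("posterior mass inside the CI:", round(mass_in_ci, 3))
```

The two point estimates coincide because a uniform prior leaves the shape of the likelihood unchanged; what differs is the interpretation of the interval, since only the posterior assigns a probability to p lying in a given range.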
However, this time this is the most probable value of the posterior distribution, given the prior (i.e., our belief) and the experimental results. The posterior distribution shows that the probability that the unknown parameter p takes a value in the range 0.21–0.39 is 0.95. To summarize: the frequentist approach considers how often the selected technique generates true outcomes; the Bayesian approach assigns a measure of plausibility directly to a specific range of values. Despite being often ignored in practice, this difference is substantial from a theoretical point of view. As explained above, a peculiarity of the Bayesian approach is the inclusion of prior knowledge in the model, as in the following example. Assume that vampirism has an incidence of 1/1000, meaning that 1 in a thousand people is a vampire, and also that a “garlic test” exists that identifies vampires, with a 95% probability of detecting a vampire and a false-positive rate of 2%. A subject is screened with the test and a positive result is obtained. What is the probability that the subject truly is a vampire? In a frequentist framework, since disease status is not repeatable, the probability that the subject is a vampire is either 0 or 1. A solution relying on the test characteristics alone is given by the following formula: P(V|y) = P(y|V)
T. Costa et al.
that is, the probability of the subject being a vampire (V) given the positive result (y) is taken to be equal to the probability of having a positive result given that the subject is a vampire. Therefore, it would be concluded that the probability of facing a vampire is 95%. However, this solution completely neglects the incidence rate. On the contrary, following a Bayesian approach, this information can be modeled as a prior. As the known incidence is 1/1000, P(V) = 0.001 is the a priori probability of being a vampire. Consequently, $P(\bar{V}) = 0.999$ is the a priori probability of being healthy. Since the known false-positive rate of the garlic test is 2%, $P(y|\bar{V}) = 0.02$ is the probability of having a positive result given that the subject is not a vampire, while P(y|V) = 0.95. Applying Bayes’ theorem, we obtain:

$$P(V|y) = \frac{P(y|V)\,P(V)}{P(y|\bar{V})\,P(\bar{V}) + P(y|V)\,P(V)} = \frac{0.95 \times 0.001}{0.02 \times 0.999 + 0.95 \times 0.001}$$
which returns a probability of facing a real vampire of about 4.5%, much smaller than the previously obtained 95%. This discrepancy opens the way to a more general consideration of how a Bayesian approach could help to address cases of so-called “bad science,” which is also related to the crisis of reproducibility and replicability (Baker, 2016; Ioannidis, 2005). A typical schema of scientific inference should work as follows: (a) A hypothesis is considered true (or false). (b) A statistical procedure is applied to test it, producing an imperfect clue about the falsity of the hypothesis. (c) Bayes’ theorem is used to estimate how much the clue has modified the level of certainty about the initial hypothesis. However, the last step is rarely carried out. Let’s see how this would work for a toy model. First, assume that the probability of obtaining a positive result when the hypothesis is true is P(results|true) = 0.95. In classical statistics, this represents the power of the test. Second, the false-positive rate is P(results|false) = 0.05, that is, the probability of obtaining a positive result when the hypothesis is false. Finally, the supposed base rate for true hypotheses is P(true) = 0.01, meaning that 1 hypothesis out of 100 turns out to be true (Ioannidis, 2005). Applying the same calculations shown for the previous example on vampirism, the obtained posterior probability is P(true|results) = 0.16. Therefore, there is a 16% probability that the hypothesis is true given that a positive result has been observed. Even adopting a stricter false-positive rate of P(results|false) = 0.01, the posterior probability is about 0.49 and thus does not exceed 50%, the chance level of a coin toss. Therefore, the only way to improve the reliability of science is to formulate more trustworthy hypotheses, which would increase P(true) in the proposed example.
Crucially, this process requires more accurate and critical thinking, rather than extensive testing of any conceivable hypothesis.
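Both the garlic test and the toy model of hypothesis testing are instances of the same two-hypothesis form of Bayes’ theorem, which can be sketched in Python as follows. The function name `posterior` and its argument names are illustrative choices, not notation from the chapter.

```python
def posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(hypothesis | positive result) from Bayes' theorem for a binary hypothesis."""
    evidence = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / evidence

# Garlic test: incidence 1/1000, detection probability 0.95, false-positive rate 0.02.
p_vampire = posterior(0.001, 0.95, 0.02)

# Toy model of hypothesis testing: base rate 0.01, power 0.95, two false-positive rates.
p_true_fp05 = posterior(0.01, 0.95, 0.05)
p_true_fp01 = posterior(0.01, 0.95, 0.01)

print(f"P(vampire | positive)           = {p_vampire:.3f}")
print(f"P(true | results), FP rate 0.05 = {p_true_fp05:.3f}")
print(f"P(true | results), FP rate 0.01 = {p_true_fp01:.3f}")
```

Changing only the `prior` argument shows how strongly the base rate drives the result, which is the point made in the text about formulating more plausible hypotheses.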
Some Applications of the Bayesian Approach to Inference Problems

The above sections were mainly dedicated to theoretical aspects. The use of a Bayesian approach to solve concrete cases will now be detailed, with a focus on more practical features. It should be reiterated that, to adopt a Bayesian framework, it is necessary to take into account all the available information. This condition is fundamental to correctly quantify and assign probabilities in the calculations, following Cox’s teaching. In other words, this means building what are called the sample space and the hypothesis space. The former refers to all the possible results of a given experiment; the latter to all the competing hypotheses to be evaluated. The majority of the problems encountered in scientific research can be organized in two classes: model comparison and parameter estimation. Parameters are usually estimated for a specific model that is considered true. What really distinguishes the two classes is the way in which the hypothesis space is defined. For the sake of clarity, the solution of the following examples will be based only on the three probability distributions introduced above: binomial, beta, and normal. However, other strategies applying different distributions would be possible and equally correct.
Parameter Estimation

Every problem of parameter estimation involves a model that is true for an unknown value of its parameters. Bayes’ theorem can then be used to explore the constraints on each of the parameters. Therefore, a parameterized model can be seen as an ensemble of mutually exclusive hypotheses, each of them associated with a specific value of one or more parameters, either discrete or continuous. For conciseness, a model with a single parameter θ will be considered. The hypothesis space $H = \{\theta_i\}$ is therefore the ensemble of all the possible values of the parameter. The sample space $S = \{s_i\}$ consists of one or more samples of data. For example, let’s consider a survey asking whether or not students feel generally happy. The survey is conducted on n = 130 students, of which k = 110 replied that they were happy. This case can be modeled through a binomial distribution:

$$B(k; n, p) = \binom{n}{k}\, p^{k} (1 - p)^{n-k}$$

Here the hypothesis space is formed by the possible values of p in the interval between 0 and 1, for instance discretized as H = {0, 0.1, 0.2, . . . , 1}. The sample space is instead the number of happy students k. Let now $\Theta$ be the unknown real value of the parameter θ; D is a proposition affirming the actually observed values of the data; H is a proposition declaring that $\Theta = \theta$ is the real value of the parameter. Finally, to be thorough, I represents
the prior information about the problem under investigation. This includes, at least, some propositions affirming that the true value of the parameter is contained in H, that the observed data are N samples in the space $S^N$, specifying the link between the data and the parameter value, and any other information possibly relevant for the solution of the problem. Bayes’ theorem can now be written as:

$$P(\theta|D, I) = P(\theta|I)\,\frac{P(D|\theta, I)}{P(D|I)}$$
To solve the calculations, the three probabilities on the right must be known. Both the prior P(θ|I) and the likelihood P(D|θ, I) are direct probabilities, depending on the specific details of the problem. The denominator, which is instead independent of θ, can be computed by applying the axioms of probability. In the case of a discrete parameter, we can write:

$$P(D|I) = \sum_{i} P(\theta_i|I)\, P(D|\theta_i, I) \qquad (2)$$
This equation expresses the denominator as a function of the prior and the likelihood. In fact, each term of the summation is the numerator of Bayes’ theorem computed for one possible value of θ. If the parameter assumes continuous values, the summation becomes an integral, and Eq. (2) can be written as:

$$P(D|I) = \int P(\theta|I)\, P(D|\theta, I)\, d\theta$$

Therefore, in any parameter estimation problem, the denominator simply acts as a normalizing term, ensuring that the posterior is properly normalized, that is, that it sums (or integrates) to 1. On the contrary, it has a more relevant role in cases of model comparison, as will be shown later. In the literature, P(D|I) is also called the prior predictive distribution, as it represents the probability of the data based solely on the information provided by the a priori distribution of the model parameters. Alternative names are marginal likelihood and global likelihood. Once each probability has been computed, Bayes’ theorem can be applied and a posterior probability calculated, which represents the inference for the parameter θ, considering all its possible values. Depending on the context, this posterior distribution can be represented in graphic or tabular form, or described through some quantitative indexes, such as the mean, the median, or the mode (i.e., the maximum of the posterior). Although the last option is often preferable, especially for models with more than one parameter, the selection of the most appropriate indicator depends on the specific case. As an example, the mode would not be particularly meaningful for a wide and rather flat distribution with a small hump on one side (Fig. 4, top), as most of the probability mass lies away from the hump. The median would be more informative instead. In turn, the median would be a poor descriptor of a distribution with two narrow peaks (Fig. 4, bottom), as it may fall in between them.
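For the survey on student happiness introduced above (n = 130, k = 110), the discrete form of Bayes’ theorem and Eq. (2) can be sketched as follows. The uniform prior over the eleven candidate values is an assumption made for this illustration; the text does not specify a prior for this example.

```python
import math

n, k = 130, 110                                   # students surveyed, students who said they were happy
hypotheses = [i / 10 for i in range(11)]          # H = {0, 0.1, ..., 1}
prior = [1 / len(hypotheses)] * len(hypotheses)   # assumed uniform prior P(theta_i | I)

def likelihood(p):
    """Binomial probability of k successes in n trials, P(D | theta, I)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Eq. (2): the denominator is the sum of prior times likelihood over all hypotheses.
evidence = sum(pr * likelihood(p) for pr, p in zip(prior, hypotheses))
posterior = [pr * likelihood(p) / evidence for pr, p in zip(prior, hypotheses)]

for p, post in zip(hypotheses, posterior):
    print(f"theta = {p:.1f}  posterior = {post:.4f}")
```

With such a coarse grid, almost all the posterior mass falls on the two values closest to k/n, which illustrates why the choice of the hypothesis space matters.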
Fig. 4 Examples of posterior distributions for which point estimators can be inaccurate
Together with a point estimate, it is always preferable to include a measure of uncertainty. In common practice, this means reporting the standard deviation or the width of an interval containing a specified fraction of the posterior probability distribution. For models including continuous parameters, this is usually called a credible region or highest posterior density interval (HPD interval). Notably, since probability measures the plausibility of the values of a parameter, the bounds of the interval should be chosen so that each of the included values has higher probability than any excluded one. When dealing with multiparametric models, the Bayesian method allows the parameters of greatest interest to be treated differently from the less relevant ones. The latter, usually referred to as nuisance parameters, can be eliminated by means of a technique called marginalization. In the case of a model with two parameters $\theta_1$ and $\theta_2$, where $\theta_2$ is considered a nuisance parameter, $p(\theta_1|D, I)$ can be computed by applying the sum and product rules as:

$$p(\theta_1|D, I) = \frac{1}{P(D|I)} \int p(\theta_2|I)\, p(\theta_1|\theta_2, I)\, p(D|\theta_1, \theta_2, I)\, d\theta_2$$
Marginalization is therefore relevant both theoretically and practically, as it reduces dimensionality and simplifies the calculations and the graphical representation of the results. To conclude this section, consider the case of a satisfaction survey about an academic course in statistics. Of the 129 students interviewed, 118 (i.e., 91%) answered that they had liked the course and 11 (i.e., 9%) that they had not. In this scenario, the parameter θ, representing the probability of receiving a positive evaluation from a student not yet considered, can be inferred through a binomial distribution with n = 129 (i.e., the total number of records) and k = 118 (i.e., the positive outcomes). Applying Bayes’ theorem and assuming as prior a beta distribution with parameters α = 1 and β = 1, the obtained posterior follows a conjugate beta distribution:

$$\mathrm{Beta}(\theta \mid \alpha + k,\ \beta + n - k) = \mathrm{Beta}(\theta \mid 119, 12)$$
Fig. 5 The posterior probability density distribution of the parameter θ, based on the results of the survey
The results are graphically represented in Fig. 5, showing that the maximum a posteriori (MAP) estimate is around θ = 0.9. Note that the posterior and prior distributions are both beta distributions, with different shape parameters.
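Because the beta prior is conjugate to the binomial likelihood, the whole update for the course survey reduces to adding counts. The following sketch computes the posterior parameters and a few standard summaries of a beta distribution; the variable names are illustrative.

```python
alpha_prior, beta_prior = 1, 1   # uniform Beta(1, 1) prior
n, k = 129, 118                  # students interviewed, positive answers

# Conjugate update: Beta(alpha + k, beta + n - k)
alpha_post = alpha_prior + k
beta_post = beta_prior + (n - k)

total = alpha_post + beta_post
post_mean = alpha_post / total
post_mode = (alpha_post - 1) / (total - 2)   # MAP estimate, valid when both parameters exceed 1
post_sd = (alpha_post * beta_post / (total**2 * (total + 1))) ** 0.5

print(f"posterior: Beta({alpha_post}, {beta_post})")
print(f"mean = {post_mean:.3f}, mode (MAP) = {post_mode:.3f}, sd = {post_sd:.3f}")
```

With a uniform prior the MAP estimate coincides with the observed fraction k/n, while the posterior mean is pulled slightly toward 0.5.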
Model Comparison

As explained in the previous section, any case of parameter estimation considers the model as true. Conversely, identifying the most appropriate model among some alternatives is the goal of model comparison problems. To this aim, Bayes’ theorem is used to compute the probability associated with each option based on the observed data. Consider M alternative models, each of them identified through the index k. As usual, D represents the available data, while the hypothesis $M_k$ states that “the model k is true.” Bayes’ theorem can therefore be written as:

$$p(M_k|D, I) = p(M_k|I)\,\frac{p(D|M_k, I)}{p(D|I)}$$

where I is the background information, as introduced above. Solving this implies computing each of the probabilities in the formula for each of the M models. As stated previously, $p(D|M_k, I)$ is the marginal likelihood for model $M_k$, which can be calculated, using the sum rule, as:

$$p(D|M_k, I) = \int d\theta\, p(\theta|M_k, I)\, p(D|\theta, M_k, I)$$

The denominator is again a normalization constant, obtained by summing the products of the prior and the marginal likelihood over all the models considered:

$$p(D|I) = \sum_{k} p(M_k|I)\, p(D|M_k, I)$$

As can be noted, model comparison is analogous to parameter estimation. Just as the posterior probability of a parameter is proportional to its prior times the
likelihood, so the posterior probability of a model is proportional to its prior times the marginal likelihood. It is now useful to compare the probabilities of the models. Two competing models k and j can be compared through the ratio between their numerators:

$$O_{kj} = \frac{p(M_k|I) \int d\theta\, p(\theta|M_k, I)\, p(D|\theta, M_k, I)}{p(M_j|I) \int d\theta\, p(\theta|M_j, I)\, p(D|\theta, M_j, I)} \qquad (3)$$

This ratio, usually called the odds ratio, expresses the relative probability of the two hypotheses k and j. It can also be written as:

$$O_{kj} = \left[\frac{p(M_k|I)}{p(M_j|I)}\right] B_{kj},$$

where the term in brackets, referred to as the prior odds, is equal to 1 when the same prior probability is assigned to each alternative model, while $B_{kj}$, called the Bayes factor, is defined as

$$B_{kj} = \frac{\int d\theta\, p(\theta|M_k, I)\, p(D|\theta, M_k, I)}{\int d\theta\, p(\theta|M_j, I)\, p(D|\theta, M_j, I)}$$
The odds ratio is also useful when multiple models are available. For example, let’s compare the model $M_1$ with a class of alternative models $M_j$. In this case, the strategy is to compare the model $M_1$ against all the other models $M_j$ to obtain the odds ratios $O_{j1}$ in favor of each alternative model over $M_1$. The probability of each model can then be computed starting from the normalization condition

$$\sum_{j=1}^{N} P(M_j|D, I) = 1$$

where N is the number of models. Dividing by $P(M_1|D, I)$, one obtains

$$\frac{1}{P(M_1|D, I)} = \sum_{j=1}^{N} O_{j1}$$

and with some algebra

$$P(M_j|D, I) = \frac{O_{j1}}{\sum_{i=1}^{N} O_{i1}}.$$
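The last formula amounts to normalizing the list of odds ratios. A minimal sketch is given below; the function name and the three odds values are hypothetical and serve only to illustrate the conversion. Note that the list must include the reference model itself, whose odds against itself are equal to 1.

```python
def model_probabilities(odds_vs_reference):
    """Convert odds O_j1 (each model against the reference model M1) into posterior probabilities.

    The list must include the reference model itself, with odds equal to 1.
    """
    total = sum(odds_vs_reference)
    return [o / total for o in odds_vs_reference]

# Hypothetical odds for three models against M1 (illustrative numbers only).
odds = [1.0, 3.0, 0.5]
probs = model_probabilities(odds)

for j, (o, p) in enumerate(zip(odds, probs), start=1):
    print(f"M{j}: odds vs M1 = {o:.1f}, posterior probability = {p:.3f}")
```

The reciprocal of the first probability equals the sum of the odds, as stated by the intermediate equation in the text.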
A consequence of the marginalization procedure used to calculate the marginal likelihood of the models is that the Bayes factor automatically favors simpler models, unless more complex alternative models are justified by the data. This is reminiscent of the famous Occam’s razor (MacKay, 2003). For an intuitive description, consider two models: $M_1$, with a single parameter θ and a Gaussian prior on θ with mean equal to 0 and variance $\sigma_0^2$, and $M_0$, with θ fixed to the value θ = 0. Suppose that a measurement of θ is performed, described by a normal likelihood with variance $\sigma^2$ and with the maximum likelihood value $\theta_{\max}$ lying n standard deviations away from 0, that is, $|\theta_{\max}|/\sigma = n$. Then the Bayes factor between the models is

$$B_{01} = \sqrt{1 + \left(\frac{\sigma}{\sigma_0}\right)^{-2}}\; \exp\left(-\frac{n^2}{2\left[1 + \left(\frac{\sigma}{\sigma_0}\right)^{2}\right]}\right).$$

For $n \gg 1$, the exponential term dominates and $B_{01} \ll 1$, favoring the complex model with the extra parameter; for $n \le 1$ and $\sigma/\sigma_0 \ll 1$, meaning that the likelihood is much more sharply peaked than the prior and lies close to 0, the simpler model predicts better. Some concrete examples of model comparison will now be discussed. First, consider a set of unfair coins, some of them biased toward heads, some others toward tails. One of them is randomly selected, and 10 tosses are allowed to decide the direction of the bias of that specific coin. The two competing models are therefore:

• Model 1: the coin favors heads
• Model 2: the coin favors tails

After the 10 tosses, six heads have been obtained. The prior for model 1 can therefore be Beta(7.5, 2.5), and that for model 2 Beta(2.5, 7.5). The respective beta distributions are shown in Fig. 6.
Fig. 6 The Beta distributions for the priors of the two competing models
Based on those priors, the computed posterior for model 1 is Beta(13.5, 6.5), while the computed posterior for model 2 is Beta(8.5, 11.5). Following Eq. (3) to compare the two models, and assuming them to be equiprobable, the value 2.93 is obtained. This evidence favors the hypothesis of having picked a coin biased toward heads. The previous example was based on data obtained through several repetitions of the same test. However, one of the main advantages of Bayesian statistics, particularly for model comparison problems, is the possibility of computing the probability of events that cannot be experimentally manipulated, even in the absence of any prior information. A classic real example of this nature concerns the probability of a collision between two airplanes during a scheduled flight. This question, which can be seen as a case of so-called credibility theory, was presented by the chief of an American insurance company to his analyst Longley-Cook in 1950. To solve it, set as prior a beta distribution with parameters a = 1 and b = 1. This is, therefore, a uniform prior distribution assuming equiprobability. Data can be modeled through a binomial distribution, and the posterior will follow a beta distribution. At the time of Longley-Cook, commercial flights had only been in operation for 5 years, without any crash. Keeping in mind the previous section on conjugacy in the binomial case, the data contribute 0 to the first parameter (the number of previous crashes) and 5 to the second (the number of years without a crash). The posterior is therefore a beta distribution with parameters a′ = 1 and b′ = 6. The mean of this distribution, computed as m = a′/(a′ + b′), is equal to 1/7. Therefore, there was a 14.3% probability of a collision in the next year, with the 95% credible interval being [0, 39%]. Rounding the upper bound to 40%, the expected time window before a flight collision happens is 1/0.4 = 2.5 years, meaning four incidents in the next 10 years.
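The two calculations in the paragraph above can be sketched with the Python standard library alone. The helper names `log_beta` and `log_marginal_likelihood` are illustrative. For the coin example, the marginal likelihood under a beta prior has a closed form, and with it the Bayes factor evaluates to about 2.95, close to the 2.93 quoted above (the text does not state how that figure was computed). For the airplane example, the upper bound uses the closed-form distribution function of a Beta(1, b) distribution.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b), via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal_likelihood(heads, tosses, a, b):
    """Log of the binomial likelihood averaged over a Beta(a, b) prior."""
    return (math.log(math.comb(tosses, heads))
            + log_beta(a + heads, b + tosses - heads) - log_beta(a, b))

# Coin example: six heads in ten tosses, Beta(7.5, 2.5) versus Beta(2.5, 7.5) priors.
bayes_factor = math.exp(log_marginal_likelihood(6, 10, 7.5, 2.5)
                        - log_marginal_likelihood(6, 10, 2.5, 7.5))

# Airplane example: Beta(1, 1) prior, 0 crashes and 5 crash-free years give Beta(1, 6).
a_post, b_post = 1 + 0, 1 + 5
mean_crash_prob = a_post / (a_post + b_post)
# For Beta(1, b) the cumulative distribution is 1 - (1 - x)**b, so the bound is explicit.
upper_95 = 1 - 0.05 ** (1 / b_post)

print(f"Bayes factor (heads-biased vs tails-biased) = {bayes_factor:.2f}")
print(f"posterior mean = {mean_crash_prob:.3f}, 95% upper bound = {upper_95:.3f}")
```

With equal prior probabilities for the two coin models, the Bayes factor is also the odds ratio of Eq. (3).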
Tragically, two airplanes that had departed from Los Angeles International Airport collided on June 30, 1956, over the Grand Canyon. Four years later, on December 16, 1960, the same happened over New York City. As a further and less fatal example, Bayes’ theorem can be applied to investigate the actual clairvoyance of “Paul the psychic octopus.” It acquired fame during the 2010 football World Cup, as it seemed able to consistently predict the winner of an upcoming match (Garbett, 2010). Define the data through the statement:

D = Paul’s prediction was correct for 12 out of 14 matches

Does this evidence strongly support the existence of Paul’s psychic powers? Answering corresponds to computing the posterior probability of Paul being actually psychic (ESP) given the observed data, P(ESP|D). The two competing models, or hypotheses, are therefore:

H = Paul is truly psychic
R = Paul is randomly guessing

Set now the probabilities of the possible outcomes under each hypothesis. This is rather straightforward for model R:

P(make a correct prediction|R) = 0.5
P(make a wrong prediction|R) = 0.5

In fact, the chance level for random guessing is 50%. Things are less clear-cut for the competing hypothesis. What is the probability of a correct prediction for a real psychic? To concede Paul a margin of error, let:

P(make a correct prediction|H) = 0.9
P(make a wrong prediction|H) = 0.1

The last decision concerns the priors for each model. How likely do we think it is to meet a real psychic octopus? As this has probably never happened so far, it is natural to assign to this hypothesis a very low value:

P(H) = 1/100
P(R) = 99/100

Through a few calculations, the obtained posteriors for each model are:

P(H|D) = 0.32
P(R|D) = 0.68

Following from this, the posterior odds are:

$$\mathrm{OR}_D = \frac{P(R|D)}{P(H|D)} = 2.12$$
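The “few calculations” behind these posteriors can be sketched as follows, using a binomial likelihood for 12 correct predictions out of 14 under each hypothesis. The dictionary layout is an illustrative choice. Computed without intermediate rounding, the odds come out at about 2.14; the value 2.12 follows from dividing the rounded posteriors 0.68 and 0.32.

```python
import math

correct, matches = 12, 14
prior = {"H": 1 / 100, "R": 99 / 100}   # psychic versus random guessing
p_correct = {"H": 0.9, "R": 0.5}        # probability of a correct prediction under each model

def likelihood(p):
    """Binomial probability of the observed number of correct predictions."""
    return math.comb(matches, correct) * p**correct * (1 - p) ** (matches - correct)

joint = {m: prior[m] * likelihood(p_correct[m]) for m in prior}
evidence = sum(joint.values())
posterior = {m: joint[m] / evidence for m in joint}
odds_random_vs_psychic = posterior["R"] / posterior["H"]

print(f"P(H|D) = {posterior['H']:.2f}, P(R|D) = {posterior['R']:.2f}")
print(f"odds (random : psychic) = {odds_random_vs_psychic:.2f}")
```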
The hypothesis of random guessing is therefore about twice as probable as the hypothesis of psychic powers, although 12 out of 14 (86%) observed predictions were correct. Consider now, still in the scenario of “paranormal powers,” a more complex case that can be analyzed with a similar model. A friend of mine attended a show
where a so-called medium predicted the outcome of 100 coin tosses. Data can therefore be defined through the statement:

D = The medium predicted the outcome 100 times

or in short: D = predicts. Should we reasonably think the medium has real psychic powers (ESP)? The hypothesis space is therefore:

H = {ESP, ∼ESP}

where the symbol ∼ negates the existence of the psychic powers. Bayes’ theorem for our problem is then:

$$P(\mathrm{ESP}|\mathrm{predicts}) = \frac{P(\mathrm{predicts}|\mathrm{ESP})\, P(\mathrm{ESP})}{P(\mathrm{predicts})}$$

Breaking it down, the likelihood P(predicts|ESP) encodes the probability of predicting 100 outcomes when having psychic powers. As in the case of psychic Paul, let’s concede a 10% margin of error, setting the likelihood to 0.9. We are skeptical about clairvoyance, yet we want to give it the benefit of the doubt, setting $P(\mathrm{ESP}) = 10^{-12}$. To make a quantitative comparison, this means being surer about the nonexistence of psychic powers than about surviving our next flight. As explained above, the denominator can also be written as:

$$P(\mathrm{predicts}) = P(\mathrm{predicts}|\mathrm{ESP})\, P(\mathrm{ESP}) + P(\mathrm{predicts}|{\sim}\mathrm{ESP})\, P({\sim}\mathrm{ESP}) \qquad (4)$$

$P({\sim}\mathrm{ESP})$ can be easily obtained as $1 - P(\mathrm{ESP}) = 1 - 10^{-12}$. $P(\mathrm{predicts}|{\sim}\mathrm{ESP})$ is the probability of randomly guessing the outcome of 100 coin tosses. Since the probability for each trial is 0.5, for the 100 repetitions it is $2^{-100}$, roughly $7 \times 10^{-31}$. Inserting these values in Eq. (4), we have:

$$P(\mathrm{ESP}|\mathrm{predicts}) = \frac{0.9 \times 10^{-12}}{0.9 \times 10^{-12} + 7 \times 10^{-31} \left(1 - 10^{-12}\right)} \approx 1 - 10^{-18}$$
To our amazement, the result seems to suggest that the showman was almost surely a real medium! However, we are still not convinced: our friend could be delirious and therefore reporting events that never happened. Consequently, the hypothesis
space now includes the four possible combinations of clairvoyance existing or not and of our friend being delirious (C) or not (∼C):

H = {ESP & ∼C, ∼ESP & ∼C, ESP & C, ∼ESP & C}

The posterior now concerns the probability that psychic powers are real and that our friend is not delirious, given that the showman predicted those 100 outcomes. This can be expressed as:

$$P(\mathrm{ESP}\,\&\,{\sim}C|\mathrm{predicts}) = \frac{P(\mathrm{predicts}|\mathrm{ESP}\,\&\,{\sim}C)\, P(\mathrm{ESP})\, P({\sim}C)}{P(\mathrm{predicts})}$$
We know our friend well enough to consider his being delirious an unlikely event. Therefore, the prior $P({\sim}C)$ is set to $1 - 10^{-6}$. The complex computation of the denominator, which now accounts for the four different hypotheses, can be avoided through the odds ratio approach:

$$\frac{P({\sim}\mathrm{ESP}\,\&\,C|\mathrm{predicts})}{P(\mathrm{ESP}\,\&\,{\sim}C|\mathrm{predicts})} = \frac{P(\mathrm{predicts}|{\sim}\mathrm{ESP}\,\&\,C)\, P({\sim}\mathrm{ESP})\, P(C)}{P(\mathrm{predicts}|\mathrm{ESP}\,\&\,{\sim}C)\, P(\mathrm{ESP})\, P({\sim}C)}$$

Since, as said above, we trust our friend, the likelihood P(predicts|ESP & ∼C) can be considered tantamount to P(predicts|ESP) = 0.9, assuming a margin of error as above. The same value is used in the numerator for P(predicts|∼ESP & C). Therefore:

$$\frac{P({\sim}\mathrm{ESP}\,\&\,C|\mathrm{predicts})}{P(\mathrm{ESP}\,\&\,{\sim}C|\mathrm{predicts})} = \frac{0.9 \times \left(1 - 10^{-12}\right) \times 10^{-6}}{0.9 \times 10^{-12} \times \left(1 - 10^{-6}\right)} \approx 10^{6}$$

This result shows that it is a million times more probable that our friend is delirious than that psychic powers really exist. This is in line with what the philosopher David Hume had already affirmed in one of his essays: no testimony is sufficient to establish a miracle, unless the testimony be of such a kind, that its falsehood would be more miraculous, than the fact, which it endeavors to establish (Hume, 1902).
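The numbers in the medium example are too extreme for ordinary floating-point arithmetic: a posterior that differs from 1 by about $10^{-18}$ would simply be rounded to 1. The sketch below therefore uses exact rational arithmetic from Python’s `fractions` module. The variable names are illustrative, the exact value $2^{-100}$ is used instead of the rounded $7 \times 10^{-31}$, and, as in the numerical formula of the text, both likelihoods in the odds ratio are taken to be 0.9 so that they cancel.

```python
from fractions import Fraction

p_esp = Fraction(1, 10**12)                # prior probability of real psychic powers
p_delirious = Fraction(1, 10**6)           # prior probability that the friend is delirious
p_pred_given_esp = Fraction(9, 10)         # likelihood of 100 correct predictions with ESP
p_pred_given_chance = Fraction(1, 2**100)  # likelihood of guessing 100 coin tosses by luck

# Eq. (4): two-hypothesis posterior, computed exactly to avoid floating-point round-off.
evidence = p_pred_given_esp * p_esp + p_pred_given_chance * (1 - p_esp)
p_esp_given_pred = p_pred_given_esp * p_esp / evidence
shortfall = 1 - p_esp_given_pred           # distance of the posterior from certainty

# Odds of "no ESP, delirious friend" against "ESP, reliable friend".
# Both likelihoods equal 0.9 (as in the text's formula), so only the priors remain.
odds_delirious_vs_esp = ((1 - p_esp) * p_delirious) / (p_esp * (1 - p_delirious))

print(f"1 - P(ESP | predicts)  = {float(shortfall):.2e}")
print(f"odds (delirious : ESP) = {float(odds_delirious_vs_esp):.3e}")
```

The first result is of the order of $10^{-18}$ and the second of the order of $10^{6}$, matching the two approximations given in the text.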
Finally, the Bayesian approach can be useful to describe the scenario in which different people, observing the same data and applying valid lines of reasoning, nevertheless hold diverging opinions about the same fact (Jaynes, 2003). Consider the following example: Mister X affirms, during a TV show, that a given well-known medicine is dangerous. This event is therefore the data D. The three spectators A, B, and C have different opinions about both Mister X’s credibility and the dangerousness of that medicine. These discrepancies could be due to past experience or to some specialized knowledge, but the actual reason is practically irrelevant for the statistical model. Before hearing Mister X, A and C consider the medicine reasonably safe, while B does not. This can be schematized through the following priors:
$$P_A(\text{safe}) = 0.9 \qquad P_B(\text{safe}) = 0.1 \qquad P_C(\text{safe}) = 0.9$$

Moreover, they all agree that if the medicine is truly unsafe, Mister X would certainly say so. Therefore:

$$P_A(D|\text{not safe}) = 1 \qquad P_B(D|\text{not safe}) = 1 \qquad P_C(D|\text{not safe}) = 1$$

Finally, while A trusts Mister X, C strongly believes that he could be making a wrong claim. B is instead rather trusting:

$$P_A(D|\text{safe}) = 0.01 \qquad P_B(D|\text{safe}) = 0.30 \qquad P_C(D|\text{safe}) = 0.99$$

Let’s now compute how Mister X’s statement changed their evaluation of the medication, comparing the priors with the posterior probabilities:

$$P_A(\text{safe}) = 0.9 \rightarrow P_A(\text{safe}|D) = 0.083$$
$$P_B(\text{safe}) = 0.1 \rightarrow P_B(\text{safe}|D) = 0.032$$
$$P_C(\text{safe}) = 0.9 \rightarrow P_C(\text{safe}|D) = 0.899$$

B is now even more skeptical than before. But A and C, who were equally confident in the medicine, ended up with opposite opinions, with A completely changing his mind. Therefore, the same event had a different impact on the three spectators. It is interesting to note that this was not due to the content of Mister X’s declaration, but to the evaluation of his
reliability. Hence, when building a Bayesian model, it is fundamental to correctly identify all the variables coming into play as priors.
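The three updates in the Mister X example differ only in their inputs, which the following sketch makes explicit. The function name `posterior_safe` and the dictionary layout are illustrative choices.

```python
def posterior_safe(prior_safe, p_claim_if_safe, p_claim_if_unsafe=1.0):
    """P(safe | D): belief that the medicine is safe after hearing Mister X's claim D."""
    numerator = p_claim_if_safe * prior_safe
    return numerator / (numerator + p_claim_if_unsafe * (1 - prior_safe))

spectators = {
    "A": {"prior_safe": 0.9, "p_claim_if_safe": 0.01},
    "B": {"prior_safe": 0.1, "p_claim_if_safe": 0.30},
    "C": {"prior_safe": 0.9, "p_claim_if_safe": 0.99},
}
posteriors = {name: posterior_safe(**s) for name, s in spectators.items()}

for name, s in spectators.items():
    print(f"{name}: P(safe) = {s['prior_safe']:.1f} -> P(safe|D) = {posteriors[name]:.3f}")
```

A and C share the same prior and the same likelihood under “not safe,” so the divergence of their posteriors comes entirely from `p_claim_if_safe`, that is, from how reliable they judge Mister X to be.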
The Bayesian Approach in Neuroscience

The Bayesian approach can be reasonably used in any scientific domain. This section and the next one will be devoted to its role in the research field of neuroscience. On a first level, Bayesian statistics can be used as an alternative to frequentist statistics to obtain information about cerebral structure and function. In general, the significance of the observed phenomena is investigated through either the extraction of relevant parameters or the comparison of alternative models. Besides this technical usage, several Bayesian models have been developed to explain brain functioning and human behavior. For example, it has been hypothesized that a model of the world could be stored in the brain, and that cerebral circuits could then perform Bayesian inference themselves, based on perceived stimuli. A model of this kind would contain, on a higher hierarchical level, information about the objects usually met in our environment (e.g., a desk, some chairs, the tree outside the window). A lower level of this hierarchy would instead code the details of the objects, such as their shapes or colors. When a higher-level representation is triggered by the sight of an object, which could be interpreted, for example, as a chair, this in turn activates the lower level, which stores the configuration of straight lines and squared shapes that is expected to be received. The input detected through the primary visual cortices is therefore compared with the expected information available in the model. In case of a mismatch, the error is propagated back to the higher levels, a different interpretation of the object (e.g., a desk) is hypothesized, and the dynamical process is reiterated until the deviation between the expectation and the feedback is minimized (Mumford, 1992; Rao & Ballard, 1999). The same mechanisms of feedback and feed-forward have been proposed to describe human behavior in Bayesian terms as well.
Among various models, an interesting one was formalized by Wolpert and Ghahramani (2005) and explained through the example of tennis. Let’s take the perspective of a player while the opponent is ready to serve. One thing needed is to estimate the landing position of the ball (x). This can be done by observing its trajectory, which represents the available data y. Therefore, the maximum likelihood technique can be used to find the value of x that maximizes p(y|x). However, even before the stroke, the player may expect some regions of the court to be more probable landing zones than others, which is modeled by the prior p(x). Through this formalization, Bayes’ theorem can be used to compute the posterior probability and then estimate the most probable region through the maximum a posteriori (MAP) technique. During the match, the player will then reduce their uncertainty about the landing zone by means of a recursive application of Bayes’ theorem:
Fig. 7 The Gaussian distributions for the prior (red), the likelihood (blue), and the posterior probability (purple). The mean and the precision of each distribution are reported in the figure
$$p(x_n|Y_n) = \frac{p(Y_n|x_n)\, p(x_n|Y_{n-1})}{p(Y_n)}$$
where $Y_n = \{y_1, y_2, \ldots, y_n\}$ are the observations of the serves up to this moment. Note that these appear in $p(x_n|Y_{n-1})$, which now replaces the prior p(x) used to predict the first serve of the match in the absence of previous observations. In other words, the above formula expresses, as mentioned before, that today’s prior is yesterday’s posterior. When the random variables x and y follow a normal distribution, an exact analytical solution can be easily computed. If the prior is a Gaussian distribution with mean $m_0$ and precision (i.e., the inverse of the variance) $\lambda_0$, and the likelihood is a Gaussian distribution with mean $m_D$ and precision $\lambda_D$, as stated in a previous section, the posterior distribution will have:

$$\lambda = \lambda_0 + \lambda_D$$

$$m = \frac{\lambda_0}{\lambda}\, m_0 + \frac{\lambda_D}{\lambda}\, m_D$$
Therefore, the posterior precision is just the sum of the other two precisions, while the posterior mean is the sum of the mean of the prior and the mean of the data, each weighted by its relative precision (Fig. 7).
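The Gaussian update and its recursive use (“today’s prior is yesterday’s posterior”) can be sketched in a few lines. The function name, the three observed positions, and the precision assigned to each observation are hypothetical values chosen only for illustration.

```python
def gaussian_update(m0, lam0, m_d, lam_d):
    """Combine a Gaussian prior (mean m0, precision lam0) with a Gaussian likelihood."""
    lam = lam0 + lam_d                           # precisions add
    m = (lam0 / lam) * m0 + (lam_d / lam) * m_d  # precision-weighted mean
    return m, lam

# Hypothetical numbers: prior belief about the landing position, then three observed serves.
mean, precision = 0.0, 1.0
for observed_position in [2.0, 1.5, 1.8]:
    # Each observation is assumed to carry precision 4.0; the old posterior is the new prior.
    mean, precision = gaussian_update(mean, precision, observed_position, 4.0)
    print(f"after observing {observed_position}: mean = {mean:.3f}, precision = {precision:.1f}")
```

The precision grows with every observation, so each new serve moves the estimate less than the previous one did.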
Behavioral Models

A relevant feature of this kind of Bayesian model is that it allows us to describe the optimal behavior to solve a specific task. For this reason, such models are also called ideal observer models, as they quantify how a belief should be updated each time new data are observed. Any deviation from this optimal performance can then be explained in terms of computational complexity or individual differences, which influence the way the priors are derived from data (Stocker & Simoncelli, 2006). Interesting examples of behavioral Bayesian models relate to the topic of
sensory integration. In this framework, Ernst and Banks (2002) focused on the coupling between visual (v) and tactile (t) information. If the two modalities are independent given x, Bayes’ theorem can be written as:

$$p(x|v, t) = \frac{p(v|x)\, p(t|x)\, p(x)}{p(v, t)}$$

Assuming a uniform prior and normally distributed likelihoods, the posterior is still a normal distribution, with precision $\lambda_{vt} = \lambda_v + \lambda_t$ and mean

$$m_{vt} = \frac{\lambda_v}{\lambda_{vt}}\, m_v + \frac{\lambda_t}{\lambda_{vt}}\, m_t$$

as seen above. The last formula can also be written as $m_{vt} = w_v m_v + w_t m_t$, where $w_v$ and $w_t$ are the weights of each modality. The researchers carried out several experiments based on visual stimuli only, tactile stimuli only, or a combination of both modalities. Fitting a normal distribution to the unimodal data allowed them to estimate the two precisions $\lambda_t$ and $\lambda_v(i)$, where i quantifies the noise level of the visual stimuli. The predicted visual weight for the bimodal condition is therefore

$$\hat{w}_v(i) = \frac{\lambda_v(i)}{\lambda_v(i) + \lambda_t}$$

in good agreement with the data observed in the experiments. Similar integration models have been built for different modalities (Körding & Wolpert, 2004) and even to describe pain perception (Tabor et al., 2017).
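The cue-combination rule above can be illustrated with a short sketch. The function name `integrate_cues`, the size estimates, and the variances are hypothetical values, not data from the cited experiments; they only show how the predicted visual weight falls as visual noise increases.

```python
def integrate_cues(m_v, var_v, m_t, var_t):
    """Precision-weighted combination of a visual and a tactile estimate."""
    lam_v, lam_t = 1 / var_v, 1 / var_t   # precisions are inverse variances
    lam_vt = lam_v + lam_t
    w_v, w_t = lam_v / lam_vt, lam_t / lam_vt
    return w_v * m_v + w_t * m_t, 1 / lam_vt, w_v

# Hypothetical size estimates (arbitrary units) under low and high visual noise.
for noise_label, var_v in [("low visual noise", 1.0), ("high visual noise", 9.0)]:
    m_vt, var_vt, w_v = integrate_cues(m_v=55.0, var_v=var_v, m_t=50.0, var_t=4.0)
    print(f"{noise_label}: w_v = {w_v:.2f}, combined mean = {m_vt:.2f}, variance = {var_vt:.2f}")
```

In both conditions the combined variance is smaller than either unimodal variance, which is the benefit of integrating the two senses.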
Brain Models
Brain structure and function can be investigated at various spatial and temporal resolutions. Different disciplines favor different scales, although finding integrative perspectives is one of the main goals of neuroscientific research (Gordon, 2003). Scale-specific models are described below. At the microscale level, Gold and Shadlen (2002) proposed a theory of stimulus categorization based on the accumulation of information over time. This framework, which describes learning mechanisms well, has much in common with the so-called Banburismus used by Alan Turing and colleagues for cryptographic purposes (Bouchaudy, 2020). The decision algorithm described by Gold and Shadlen is based on the weight of evidence, formalized as the logarithm of the ratio between the likelihoods. Their model was derived from observations, in monkeys, of the activity of neurons in the superior colliculus and in the lateral intraparietal cortex while the animals were following visual stimuli (Gold & Shadlen, 2001). At the mesoscale level, Ma et al. (2006) studied the response of neuronal populations to different kinds of stimuli, suggesting that the variability measured in a single trial for the population may describe how the brain encodes uncertainty. Some authors (Amarasingham et al., 2006; Deger et al., 2009; Moreno-Bote, 2014) have suggested that neural activity can be approximated by a Poisson distribution, and that Bayesian inference could be used internally to implement optimal integration of information. Expanding on this, Fiser et al. (2010) showed that if a Bayesian model is used by neuronal populations to process sensory data, an interesting connection exists between spontaneous and stimulus-driven neuronal activity. In a probabilistic framework, the neural activity x evoked by external stimuli y can be seen as representing samples from the posterior distribution p(x | y), and the spontaneous activity has a natural interpretation as encoding beliefs about the external environment, that is, the prior p(x). The posterior is obtained, as usual, using Bayes' theorem, where the likelihood p(y | x) represents the probability of the sensory input given the model of the environment. In the absence of input, the prior should equal the posterior averaged over the possible inputs:
\[ p(x) = \int p(x \mid y)\, p(y)\, dy , \]
and this equality can be approximated by:
\[ p(x) \approx \sum_i p(x_i \mid y_i) , \]
where the sum runs over the data. This hypothesis has been confirmed by the analysis of activity in the visual cortex of ferrets (Berkes et al., 2011). Finally, at the macroscale, Rao and Ballard (1999) postulated the existence in the brain of a predictive coding mechanism. Their aim was to describe how the brain could solve the problem of reverse inference in perception, that is, going from sensory activity back to the stimulus. According to them, several internal models of specific features of the world are stored in different regions of the brain, each encoding the possible causes of sensory inputs as the parameters of a generative model. New sensory inputs are therefore represented through those known possible causes, and the best match between them is identified by means of an error minimization process. Predictive coding can be explained with a simple example (Mathys, 2016). Given N observations $\{m_1, m_2, \ldots, m_N\}$, their mean $\bar{m}$ can be computed as
\[ \bar{m} = \frac{1}{N} \sum_{i=1}^{N} m_i . \]
However, the mean can also be calculated iteratively as
\[ \bar{m}_{i+1} = \bar{m}_i + \frac{1}{i+1} \left( m_{i+1} - \bar{m}_i \right) . \]
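As a quick numerical check of the iterative formula, the following Python sketch (with a hypothetical function name and made-up observations) updates the mean one observation at a time and compares the final value with the batch mean.

```python
def running_mean(observations):
    """Return the mean after each observation, computed incrementally."""
    mean = 0.0      # placeholder; the first observation has weight 1 and overwrites it
    means = []
    for i, m_new in enumerate(observations):
        weight = 1.0 / (i + 1)                  # i observations have been seen so far
        mean = mean + weight * (m_new - mean)   # correct the old mean toward the new datum
        means.append(mean)
    return means


data = [2.0, 4.0, 6.0, 8.0]        # hypothetical observations
incremental = running_mean(data)
batch = sum(data) / len(data)
print(incremental[-1], batch)
```

The weight shrinks as more observations accumulate, so each new datum moves the estimate less than the previous one did.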
The components of this updating equation are the previous mean $\bar{m}_i$, representing the state of belief before the new observation, and the new observation $m_{i+1}$. The difference between the new observation and the current belief is a prediction error. The update of the mean can thus take the form:
new mean = old mean + weight × prediction error
To conclude this section, consider an example in which the value of a single variable is inferred from an observation of one of its properties, as in the case of inferring the dimension of an object (φ) from its light intensity (s) (Bogacz, 2017). Assume the following relation between φ and s:
\[ s = g(\phi) + \epsilon , \]
where $\epsilon$ is Gaussian noise associated with the sensory input, so that light intensity is a function, either linear or nonlinear, of the dimension of the object. Since s is distributed according to a Gaussian with mean g(φ) and variance $\Sigma_s$, the likelihood can be written as:
\[ p(s \mid \phi) = \mathcal{N}(s; g(\phi), \Sigma_s) = \frac{1}{\sqrt{2\pi\Sigma_s}} \exp\left( -\frac{(s - g(\phi))^2}{2\Sigma_s} \right) . \]
Given the presence of noise, to estimate φ correctly the brain must combine the information coming from the observation with prior knowledge about the possible dimensions of the object. For simplicity, assume the latter to follow a Gaussian distribution as well, with mean $\mu_p$ and variance $\Sigma_p$ (the subscript p referring to the prior):
\[ p(\phi) = \mathcal{N}(\phi; \mu_p, \Sigma_p) . \]
Therefore, applying Bayes' theorem:
\[ p(\phi \mid s) = \frac{p(\phi)\, p(s \mid \phi)}{p(s)} . \]
Now, rather than building the whole posterior distribution p(φ | s), it is possible to find the value of φ that maximizes the posterior. Since p(s) does not depend on φ, this means maximizing the numerator p(φ)p(s | φ). To simplify the calculations, the logarithm F of p(φ)p(s | φ) can be used; since the logarithm is a monotonic function, F attains its maximum at the same value of φ: F = ln p(φ) + ln p(s | φ). The resulting formula is:
\[ F = \ln \mathcal{N}(\phi; \mu_p, \Sigma_p) + \ln \mathcal{N}(s; g(\phi), \Sigma_s) = \frac{1}{2} \left( -\ln \Sigma_p - \frac{(\phi - \mu_p)^2}{\Sigma_p} - \ln \Sigma_s - \frac{(s - g(\phi))^2}{\Sigma_s} \right) + c , \]
where c denotes the constant terms. The value of φ at which F attains its maximum can be found by taking the derivative of F:
\[ \frac{\partial F}{\partial \phi} = \frac{\mu_p - \phi}{\Sigma_p} + \frac{s - g(\phi)}{\Sigma_s}\, g'(\phi) , \]
and applying gradient ascent, which follows the steepest direction toward the maximum of F. It can be shown that the optimal value of φ is reached as the stationary solution of the equation:
\[ \dot{\phi} = \frac{\partial F}{\partial \phi} , \]
describing the rate of change of φ. Defining $\epsilon_s$ as:
\[ \epsilon_s = \frac{s - g(\phi)}{\Sigma_s} , \]
it is possible to express how the prediction g(φ) deviates from the stimulus s, while $\epsilon_p$:
\[ \epsilon_p = \frac{\phi - \mu_p}{\Sigma_p} , \]
denotes the deviation between the hypothesized real dimension and the prior expectation. Therefore, $\dot{\phi}$ can be written as:
\[ \dot{\phi} = \epsilon_s\, g'(\phi) - \epsilon_p . \]
This formula highlights the process of recursive error reduction that is characteristic of the Bayesian implementation. As a possible neural implementation of this model, the parameters $\mu_p$, $\Sigma_p$, and $\Sigma_s$ can be seen as the strengths of synaptic connections, while φ, $\epsilon_s$, and $\epsilon_p$ refer to the neural activity encoding the sensory input s. The free energy theory (Friston, 2010; Friston et al., 2006) can be seen as a generalization of the model of predictive coding described here.
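The error-driven dynamics described above can be simulated with a simple Euler scheme. The Python sketch below is only an illustration under stated assumptions: the choice g(φ) = φ², the parameter values, the step size, and the function name `infer_phi` are all hypothetical and are not taken from the chapter.

```python
def infer_phi(s, mu_p, sigma_p, sigma_s, g, g_prime, dt=0.01, steps=5000):
    """Follow d(phi)/dt = eps_s * g'(phi) - eps_p with fixed-step Euler integration.

    s        : observed sensory input (e.g., light intensity)
    mu_p     : prior mean of phi
    sigma_p  : prior variance
    sigma_s  : sensory noise variance
    g        : assumed mapping from phi to the expected input
    g_prime  : derivative of g
    """
    phi = mu_p                                  # start from the prior expectation
    for _ in range(steps):
        eps_s = (s - g(phi)) / sigma_s          # sensory prediction error
        eps_p = (phi - mu_p) / sigma_p          # deviation from the prior
        phi += dt * (eps_s * g_prime(phi) - eps_p)
    return phi


def g(phi):
    return phi ** 2        # assumed nonlinear relation between size and intensity


def g_prime(phi):
    return 2.0 * phi


# Hypothetical values: observed intensity 2, prior mean 3, unit variances.
phi_hat = infer_phi(s=2.0, mu_p=3.0, sigma_p=1.0, sigma_s=1.0, g=g, g_prime=g_prime)
gradient = (2.0 - g(phi_hat)) * g_prime(phi_hat) - (phi_hat - 3.0)
print(phi_hat, gradient)
```

At convergence the two error terms balance, so the gradient printed at the end should be close to zero; the estimate settles between the value suggested by the observation alone and the prior mean.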
A Bayes Factor-Based Approach to Neuroimaging Coordinate-Based Meta-Analysis
Coordinate-based meta-analysis (CBMA) of human brain imaging data is a method of growing interest in statistics. Its ultimate goal is to quantitatively evaluate consensus across independent published experiments using magnetic resonance imaging (MRI) or positron emission tomography (PET). The increasing relevance of CBMA is due to the need to synthesize an exponentially growing number of publications, as well as to overcome the limited interpretability of single studies, attributable to several methodological factors (e.g., small sample size, heterogeneity in analytic pipeline, or thresholding procedure) that lead to low statistical power and poor reproducibility (Samartsidis et al., 2017). CBMA takes as input a dataset consisting of a list of x-y-z coordinates of the peak activations or alterations reported in the publications of interest. Among the available CBMA algorithms, activation likelihood estimation (ALE) is the most widely used (Tahmasian et al., 2019). To assess the convergence between studies, ALE models the probability that a brain voxel v is the true location of a coordinate as a Gaussian distribution centered on that coordinate (Eickhoff et al., 2016; Turkeltaub et al., 2002). Therefore, given a certain study i, the map $L_{ik}$ for each coordinate $x_{ik}$ is provided by the following formula:
\[ L_{ik}(v) = c \exp\left( -\frac{(v - x_{ik})^2}{\sigma_i^2} \right) , \]
that is, a Gaussian density centered on $x_{ik}$ with standard deviation $\sigma_i$, which varies with the sample size of each study (Eickhoff et al., 2009). The maps $L_{ik}$ are combined into a single study map $L_i$, also named Modeled Activation (MA) map (Laird et al., 2005); hence, the probability that a coordinate near v is truly located in v is quantified. Finally, the realization of the ALE statistic l is computed as follows:
\[ l(v) = 1 - \prod_i \left( 1 - L_i(v) \right) \]
Note that this formula expresses the probability that one or more coordinates are located in the voxel v. The significance test of ALE is carried out via a Monte Carlo procedure. In detail, multiple statistics are generated for each voxel v by sampling maps from random coordinates. The null ALE is generated as:
\[ l^*(v) = 1 - \prod_i \left( 1 - L_i(v^*) \right) \]
where $v^*$ is drawn uniformly from all possible voxels of the brain mask. The null ALE distribution is then used to compute p-values (Eickhoff et al., 2012; Eickhoff et al., 2016). Since its introduction in the early 2000s, ALE has been used successfully to characterize brain-behavior relationships, cerebral anatomy, and signatures of clinical conditions, yielding a corpus of literature containing more than 1,000 publications. Although ALE plays a crucial role in advancing the field of human brain mapping, its frequentist statistical background suffers from a major drawback, namely, the impossibility of drawing inferences about the selectivity of activation (or alteration) of brain areas in a given psychological process (or brain disorder) (Cauda et al., 2020; Poldrack, 2006). In other words, ALE is designed to answer research questions like “what is the cerebral pattern consistently involved in a given psychological process?” or “what is the atypical cerebral pattern in a given brain disorder?” It is well known, however, that a number of neural territories can be active during several psychological processes (Anderson et al., 2013; Cauda et al., 2012) or damaged by different pathologies (Cauda et al., 2019; Crossley et al., 2014). Ultimately, this limits the use of neuroimaging to distinguish between competing cognitive theories, as well as its effective contribution to diagnostic and treatment strategies.
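The ALE statistic and its Monte Carlo null distribution, as described above, can be sketched on a toy voxel grid. This is a simplified illustration, not the implementation used by any ALE software: the value of the constant c, the rule for merging the foci of one study (here the maximum), the independent sampling of a random voxel for each study, and all names and data are assumptions made here.

```python
import math
import random


def focus_value(voxel, focus, sigma, c):
    """Gaussian value c * exp(-|v - x|^2 / sigma^2) for one reported focus."""
    d2 = sum((a - b) ** 2 for a, b in zip(voxel, focus))
    return c * math.exp(-d2 / sigma ** 2)


def ma_value(voxel, foci, sigma, c=0.5):
    """Modeled activation of one study at a voxel.

    Assumption: the foci of a study are merged by taking their maximum;
    c = 0.5 is an arbitrary constant that keeps the value below 1.
    """
    return max(focus_value(voxel, f, sigma, c) for f in foci)


def ale(voxel, studies, c=0.5):
    """Union of the study maps: 1 - prod_i (1 - L_i(v))."""
    prod = 1.0
    for foci, sigma in studies:
        prod *= 1.0 - ma_value(voxel, foci, sigma, c)
    return 1.0 - prod


def null_ale(studies, mask, n_samples, rng, c=0.5):
    """Null values obtained by reading each study map at a random mask voxel."""
    null = []
    for _ in range(n_samples):
        prod = 1.0
        for foci, sigma in studies:
            v_star = rng.choice(mask)           # uniform draw from the mask
            prod *= 1.0 - ma_value(v_star, foci, sigma, c)
        null.append(1.0 - prod)
    return null


def p_value(observed, null):
    """Fraction of null values at least as large as the observed statistic."""
    return sum(1 for value in null if value >= observed) / len(null)


# Toy 10 x 10 x 10 mask and three hypothetical studies: (foci, sigma).
mask = [(x, y, z) for x in range(10) for y in range(10) for z in range(10)]
studies = [
    ([(5, 5, 5)], 2.0),
    ([(5, 5, 5), (1, 1, 1)], 2.0),
    ([(4, 5, 5)], 3.0),
]
rng = random.Random(0)
null = null_ale(studies, mask, 2000, rng)
observed = ale((5, 5, 5), studies)
print(observed, p_value(observed, null))
```

Voxels where several studies report nearby foci receive a high ALE value, and the p-value measures how rarely such a value arises when the study maps are sampled at unrelated locations.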
As suggested by different authors (Liloia et al., 2022; Poldrack, 2006), this issue can be overcome by using Bayesian statistics, which can answer questions like “to what extent is the activation of a cerebral pattern selective to a given psychological process?” or “to what extent is the cerebral pattern of alteration selective for a given disorder?” Although the use of Bayesian statistics is not new in the field of neuroimaging, only recently has the first CBMA tool of this kind, named Bayes fACtor mOdeliNg (BACON) (Costa et al., 2021; Liloia et al., 2022), been devised. BACON makes it possible to create 3-D maps reporting the voxel-wise selectivity of neural activation or alteration via posterior probability analysis. Specifically, this data-driven technique compares the likelihoods of two competing hypotheses, namely, the occurrence of the state under investigation, $H_0$ (e.g., the probability that a brain voxel is active because of a given psychological process), and the absence of the state, $H_1$ (e.g., the probability that a brain voxel is active because of one or more other psychological processes). This computation takes advantage of the ALE environment and is based on the Bayes factor (BF) (Jeffreys, 1961). Accordingly, Bayes' theorem can be formalized in terms of relative belief as follows:
\[ \frac{P(H_0 \mid D)}{P(H_1 \mid D)} = \frac{P(D \mid H_0)}{P(D \mid H_1)} \cdot \frac{P(H_0)}{P(H_1)} \qquad (5) \]
where D is the measured effect of activation in a brain voxel.
Since knowledge of the priors is not available for either $H_0$ or $H_1$, it is possible to set flat priors, $P(H_0) = P(H_1)$ (Cauda et al., 2020; Jaynes, 2003). The $BF_{01}$ can therefore be expressed as:
\[ BF_{01} = \frac{P(D \mid H_0)}{P(D \mid H_1)} \]
where a $BF_{01}$ value greater than 1 indicates evidence in favor of $H_0$, and a value less than 1 indicates evidence in favor of $H_1$. Considering Eq. (5), with flat priors the $BF_{01}$ can also be formalized as:
\[ BF_{01} = \frac{P(H_0 \mid D)}{P(H_1 \mid D)} \]
The sum of the two posterior probabilities is equal to 1:
\[ P(H_0 \mid D) + P(H_1 \mid D) = 1 \]
Therefore, we can rewrite the $BF_{01}$ as:
\[ BF_{01} = \frac{P(H_0 \mid D)}{1 - P(H_0 \mid D)} \]
Solving for the posterior probability gives the following expression:
\[ P(H_0 \mid D) = \frac{BF_{01}}{BF_{01} + 1} \]
This makes it possible to account for the evidence in favor of each of the hypotheses under consideration and, in doing so, to estimate the posterior probability of selectivity by taking into account the independent activation/alteration of the x-y-z coordinates (Costa et al., 2021). Graphical representations of the BACON pipeline, going from the x-y-z coordinates to the selectivity map of the condition/disease of interest, are shown in Fig. 8.
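The conversion from a Bayes factor to a posterior probability under flat priors can be written in a few lines of Python. The sketch below is not the BACON implementation: the function names, the dictionary-based "maps", and the likelihood values are hypothetical and only illustrate the algebra above.

```python
def bayes_factor_01(p_d_given_h0, p_d_given_h1):
    """BF01 as the ratio of the two likelihoods."""
    return p_d_given_h0 / p_d_given_h1


def posterior_h0(bf01):
    """P(H0 | D) = BF01 / (BF01 + 1), valid when P(H0) = P(H1)."""
    return bf01 / (bf01 + 1.0)


def selectivity_map(lik_h0, lik_h1, threshold=0.0):
    """Voxel-wise posterior of H0, keeping voxels at or above a threshold.

    lik_h0, lik_h1: dicts mapping a voxel to P(D | H0) and P(D | H1).
    """
    result = {}
    for voxel, p0 in lik_h0.items():
        posterior = posterior_h0(bayes_factor_01(p0, lik_h1[voxel]))
        if posterior >= threshold:
            result[voxel] = posterior
    return result


# Hypothetical likelihood values for two voxels.
lik_h0 = {(0, 0, 0): 0.6, (1, 0, 0): 0.1}
lik_h1 = {(0, 0, 0): 0.2, (1, 0, 0): 0.4}
selective = selectivity_map(lik_h0, lik_h1, threshold=0.5)
print(selective)
```

A Bayes factor of 1 maps to a posterior of 0.5, so thresholding the map above 0.5 keeps only the voxels where the evidence favors $H_0$.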
Conclusions
Bayesian methods are acquiring growing relevance in several scientific fields in which inference is a fundamental process. Their advantage is that they are simple and rational and can answer many scientific questions optimally for a given state of information. The procedure for applying the Bayesian method is well defined and can be schematized as follows:
(a) Clearly state the question and the prior information
(b) Apply the sum and product rules
Fig. 8 Workflow pipeline of BACON. The final map, obtained with the calculation of BF01, can be thresholded according to the desired level of selective probability. Data used are for visualization purposes only
Following this procedure, it is possible to calculate the probability of the hypothesis directly, while incorporating all the relevant prior information. Furthermore, the Bayesian approach provides a powerful way of assessing different and/or competing theories. Nonetheless, its impact on experimental psychology and neuroscience is still limited. Three main reasons account for this. First, as extensively shown in this chapter, building a Bayesian model to describe a phenomenon requires a particular way of reasoning and observing reality, resulting in a less straightforward solution compared to conventional statistics. Second, it is still common practice to demonstrate the reliability of the results of data analysis by means of statistics mainly based on the concept of the p-value. This is in turn reflected in the requirements of several scientific journals, although things have begun to change (Amrhein et al., 2019). The third and last point concerns the priors. In most Bayesian literature, the a priori probability is considered a purely subjective opinion about the hypothesis, even though priors can often be derived from well-founded scientific laws. This apparent lack of objectivity caused Bayesian statistics to be considered for a long time as inappropriate for the hard sciences. Jaynes was probably the only one to postulate that the fundamental element for the development of an objective theory of probability is the desideratum of Consistency: Equivalent states of knowledge should be represented through equivalent plausibility assignments. This principle is the key to solving the problem of assigning direct probabilities, such as the prior and the likelihood, which is essentially half of the whole theory of probability. The resulting theory will still be subjective, but in the sense that probability represents states of knowledge, and not properties of nature. On the other hand, the theory is founded on rigorous logical statements and probability laws and, at this level of interpretation, it can be considered objective.
References
Amarasingham, A., Chen, T. L., Geman, S., Harrison, M. T., & Sheinberg, D. L. (2006). Spike count reliability and the Poisson hypothesis. The Journal of Neuroscience, 26(3), 801–809. https://doi.org/10.1523/jneurosci.2948-05.2006
Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567(7748), 305–307. https://doi.org/10.1038/d41586-019-00857-9
Anderson, M. L., Kinnison, J., & Pessoa, L. (2013). Describing functional diversity of brain regions and brain networks. NeuroImage, 73, 50–58. https://doi.org/10.1016/j.neuroimage.2013.01.071
Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533, 452–454. https://doi.org/10.1038/533452a
Berkes, P., Orbán, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013), 83–87. https://doi.org/10.1126/science.1195870
Bogacz, R. (2017). A tutorial on the free-energy framework for modelling perception and learning. Journal of Mathematical Psychology, 76, 198–211. https://doi.org/10.1016/j.jmp.2015.11.003
Bouchaudy, J. F. (2020). Enigma, the XYZ period (1939–1940). Cryptologia, 46, 1–66. https://doi.org/10.1080/01611194.2020.1864681
Calzavarini, F., & Cevolani, G. (2022). Abductive reasoning in cognitive neuroscience: Weak and strong reverse inference. Synthese, 200. https://doi.org/10.1007/s11229-022-03585-2
Cauda, F., Torta, D. M., Sacco, K., Geda, E., D’Agata, F., Costa, T., et al. (2012). Shared “core” areas between the pain and other task-related networks. PLoS One, 7(8), e41929. https://doi.org/10.1371/journal.pone.0041929
Cauda, F., Nani, A., Manuello, J., Liloia, D., Tatu, K., Vercelli, U., et al. (2019). The alteration landscape of the cerebral cortex. NeuroImage, 184, 359–371. https://doi.org/10.1016/j.neuroimage.2018.09.036
Cauda, F., Nani, A., Liloia, D., Manuello, J., Premi, E., Duca, S., et al. (2020). Finding specificity in structural brain alterations through Bayesian reverse inference. Human Brain Mapping, 41(15), 4155–4172. https://doi.org/10.1002/hbm.25105
Costa, T., Manuello, J., Ferraro, M., Liloia, D., Nani, A., Fox, P. T., et al. (2021). BACON: A tool for reverse inference in brain activation and alteration. Human Brain Mapping, 42(11), 3343–3351. https://doi.org/10.1002/hbm.25452
Cox, R. T. (1946). Probability, frequency and reasonable expectation. American Journal of Physics, 14(1), 1–13. https://doi.org/10.1119/1.1990764
Crossley, N. A., Mechelli, A., Scott, J., Carletti, F., Fox, P. T., McGuire, P., et al. (2014). The hubs of the human connectome are generally implicated in the anatomy of brain disorders. Brain, 137(Pt 8), 2382–2395. https://doi.org/10.1093/brain/awu132
Deger, M., Cardanobile, S., Helias, M., & Rotter, S. (2009). The Poisson process with dead time captures important statistical features of neural activity. BMC Neuroscience, 10(1), P110. https://doi.org/10.1186/1471-2202-10-S1-P110
Eickhoff, S. B., Laird, A. R., Grefkes, C., Wang, L. E., Zilles, K., & Fox, P. T. (2009). Coordinate-based activation likelihood estimation meta-analysis of neuroimaging data: A random-effects approach based on empirical estimates of spatial uncertainty. Human Brain Mapping, 30(9), 2907–2926. https://doi.org/10.1002/hbm.20718
Eickhoff, S. B., Bzdok, D., Laird, A. R., Kurth, F., & Fox, P. T. (2012). Activation likelihood estimation meta-analysis revisited. NeuroImage, 59(3), 2349–2361. https://doi.org/10.1016/j.neuroimage.2011.09.017
Eickhoff, S. B., Nichols, T. E., Laird, A. R., Hoffstaedter, F., Amunts, K., Fox, P. T., et al. (2016). Behavior, sensitivity, and power of activation likelihood estimation characterized by massive empirical simulation. NeuroImage, 137, 70–85. https://doi.org/10.1016/j.neuroimage.2016.04.072
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429–433. https://doi.org/10.1038/415429a
Fiser, J., Berkes, P., Orbán, G., & Lengyel, M. (2010). Statistically optimal perception and learning: From behavior to neural representations. Trends in Cognitive Sciences, 14(3), 119–130. https://doi.org/10.1016/j.tics.2010.01.003
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. https://doi.org/10.1038/nrn2787
Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology, Paris, 100(1–3), 70–87. https://doi.org/10.1016/j.jphysparis.2006.10.001
Garbett, P. (2010). World Cup 2010: 10 things you didn’t know about Paul the psychic octopus. The Daily Telegraph, London. http://www.telegraph.co.uk/sport/football/world-cup/7877034/World-Cup-2010-10-things-you-didnt-know-about-Paul-the-psychicoctopus.html. Accessed 7 July 2010.
Gold, J. I., & Shadlen, M. N. (2001). Neural computations that underlie decisions about sensory stimuli. Trends in Cognitive Sciences, 5(1), 10–16. https://doi.org/10.1016/s1364-6613(00)01567-9
Gold, J. I., & Shadlen, M. N. (2002). Banburismus and the brain: Decoding the relationship between sensory stimuli, decisions, and reward. Neuron, 36(2), 299–308. https://doi.org/10.1016/s0896-6273(02)00971-6
Gordon, E. (2003). Integrative neuroscience. Neuropsychopharmacology, 28(1), S2–S8. https://doi.org/10.1038/sj.npp.1300136
Hume, D. (1902). Enquiries concerning the human understanding: And concerning the principles of morals. Clarendon Press.
Ioannidis, J. P. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge University Press.
Jeffreys, H. (1961). The theory of probability. Clarendon.
Kolmogorov, A. N., & Bharucha-Reid, A. T. (2018). Foundations of the theory of probability: Second English edition. Courier Dover Publications.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427(6971), 244–247. https://doi.org/10.1038/nature02169
Laird, A. R., Fox, P. M., Price, C. J., Glahn, D. C., Uecker, A. M., Lancaster, J. L., et al. (2005). ALE meta-analysis: Controlling the false discovery rate and performing statistical contrasts. Human Brain Mapping, 25(1), 155–164. https://doi.org/10.1002/hbm.20136
Liloia, D., Cauda, F., Uddin, L. Q., Manuello, J., Mancuso, L., Keller, R., et al. (2022). Revealing the selectivity of neuroanatomical alteration in autism spectrum disorder via reverse inference. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. https://doi.org/10.1016/j.bpsc.2022.01.007
Ma, W. J., Beck, J. M., Latham, P. E., & Pouget, A. (2006). Bayesian inference with probabilistic population codes. Nature Neuroscience, 9(11), 1432–1438. https://doi.org/10.1038/nn1790
MacKay, D. J. (2003). Information theory, inference and learning algorithms. Cambridge University Press.
Mathys, C. (2016). How could we get nosology from computation. Computational Psychiatry: New Perspectives on Mental Illness, 20, 121–138.
Moreno-Bote, R. (2014). Poisson-like spiking in circuits with probabilistic synapses. PLoS Computational Biology, 10(7), e1003522. https://doi.org/10.1371/journal.pcbi.1003522
Mumford, D. (1992). On the computational architecture of the neocortex. II. The role of corticocortical loops. Biological Cybernetics, 66(3), 241–251. https://doi.org/10.1007/bf00198477
Neyman, J. (1977). Frequentist probability and frequentist statistics. Synthese, 36, 97–131.
Niiniluoto, I. (2011). Abduction, tomography, and other inverse problems. Studies in History and Philosophy of Science, 42(1), 135–139. https://doi.org/10.1016/j.shpsa.2010.11.028
Park, J. (2021). Bayesian indirect inference for models with intractable normalizing functions. Journal of Statistical Computation and Simulation, 91(2), 300–315. https://doi.org/10.1080/00949655.2020.1814286
Peirce, C. S. (1974). Collected papers of Charles Sanders Peirce (Vol. 5). Harvard University Press.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? (Research Support, N.I.H.). Trends in Cognitive Sciences, 10(2), 59–63. https://doi.org/10.1016/j.tics.2005.12.004
Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87. https://doi.org/10.1038/4580
Samartsidis, P., Montagna, S., Nichols, T. E., & Johnson, T. D. (2017). The coordinate-based meta-analysis of neuroimaging data. Statistical Science, 32(4), 580–599. https://doi.org/10.1214/17-sts624
Scott, C. A. (1900). The international congress of mathematicians in Paris. Bulletin of the American Mathematical Society, 7(2), 57–79.
Smith, A. F. M. (1991). Bayesian computational methods. Philosophical Transactions of the Royal Society A, 337(1647), 369–386.
Stocker, A. A., & Simoncelli, E. P. (2006). Noise characteristics and prior expectations in human visual speed perception. Nature Neuroscience, 9(4), 578–585. https://doi.org/10.1038/nn1669
Tabor, A., Thacker, M. A., Moseley, G. L., & Körding, K. P. (2017). Pain: A statistical account. PLoS Computational Biology, 13(1), e1005142. https://doi.org/10.1371/journal.pcbi.1005142
Tahmasian, M., Sepehry, A. A., Samea, F., Khodadadifar, T., Soltaninejad, Z., Javaheripour, N., et al. (2019). Practical recommendations to conduct a neuroimaging meta-analysis for neuropsychiatric disorders. Human Brain Mapping, 40(17), 5142–5154. https://doi.org/10.1002/hbm.24746
Turkeltaub, P. E., Eden, G. F., Jones, K. M., & Zeffiro, T. A. (2002). Meta-analysis of the functional neuroanatomy of single-word reading: Method and validation. NeuroImage, 16(3 Pt 1), 765–780. https://doi.org/10.1006/nimg.2002.1131
Wolpert, D. M., & Ghahramani, Z. (2005). Bayes rule in perception, action and cognition. Science, 1–4.
Part XIV Miscellaneous
Introduction to Miscellaneous
74
Lorenzo Magnani
Contents
Introduction
References
Abstract
This part of the Handbook presents a variety of important studies on abduction that, as Editor-in-Chief of the Handbook, I was not able to place in the other parts but that deserve to be offered to the reader. Minazzi focuses on the role of abduction in Galileo Galilei, and Martínez-Bautista on the intertwining of abduction and phylogenetic inference. Haig studies the role of abduction in psychology. Cuccio and Caruana contribute a chapter on abduction in special neuroscientific areas. Brioschi deals with abduction and metaphysics. Jetli discusses the interplay of deduction, abduction, and induction in Plato’s Phaedo and Parmenides. Olmos deals with the role of abduction in inferences that involve “giving reasons.” Rivadulla further examines the role of abduction in scientific reasoning. Cabrera provides an exhaustive account of abduction and the inference to the best explanation. Finally, Apostolidis and Psillos deal with the relationship between Bayesianism and inference to the best explanation.
L. Magnani, Department of Humanities, Philosophy Section and Computational Philosophy Laboratory, University of Pavia, Pavia, Italy e-mail: [email protected]; [email protected] © Springer Nature Switzerland AG 2023 L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_93
Keywords
Abduction in Galileo Galilei · Abduction and phylogenetic inference · Abduction and neuroscience · Abduction and metaphysics · Abduction in Plato · Abduction in scientific reasoning · Abduction as best explanation · Abduction and Bayesianism
Introduction
This Handbook explores abduction (inference to explanatory hypotheses), an important but, at least until 2000, neglected topic in philosophy, logic, and epistemology. In my first book about abduction (Abduction, Reason and Science. Processes of Discovery and Explanation, Kluwer Academic Publishers, Dordrecht, 2001), my aim was to integrate philosophical, cognitive, and computational issues, while also discussing some cases of reasoning in science and medicine. The main thesis was that abduction is a significant kind of scientific reasoning, helpful in delineating the first principles of a new theory of science. In the preface of that book, I said that the status of abduction is very controversial. When dealing with abductive reasoning, misinterpretations and equivocations are common. What are the differences between abduction and induction? What are the differences between abduction and the well-known hypothetico-deductive method? What did Peirce mean when he considered abduction a kind of inference? Does abduction involve only the generation of hypotheses or their evaluation too? Are the criteria for the best explanation in abductive reasoning epistemic, pragmatic, or both? How many kinds of abduction are there? In the last decades, abduction has been extensively studied in logic, semiotics, philosophy of science, computer science, artificial intelligence, cognitive science, and other important disciplines (Aliseda, 2006; Gabbay & Woods, 2005; Magnani, 2009, 2017; Niiniluoto, 2018). The interest in abduction derived largely from the neglect of the logic of discovery among the neopositivists but also in so-called postpositivism, for example, in both Popper and Kuhn. Research on abduction immediately acquired a strong interdisciplinary character, and this Handbook respects this feature: I suggest that the reader first see the various parts of it that relate to the various disciplines.
However, this Miscellaneous part of the Handbook presents a variety of important studies on abduction that, as the Editor-in-Chief, I was not able to place in the other parts but that deserve to be offered to the reader. In summary, we can say that knowledge about abductive cognition increases year after year: I think this Handbook testifies that abduction has acquired a central status in various disciplines, surely in philosophy, logic, epistemology, and cognitive science, and that it has also demonstrated its capacity to shed new light on important issues in many other fields of current research, not only those of a scientific character. In this Miscellaneous part, the reader will find two welcome chapters of historical character related to important moments in the history of science and of philosophy, the first on the role of abduction in Galileo Galilei (Galilean methodology and abductive inference, by F. Minazzi) and the other on Plato (Deduction-abduction-induction chains in Plato’s Phaedo and Parmenides, by J. Jetli), but also four chapters that address special problems concerning abduction in science in general, and specifically in psychology and neuroscience (Abduction as phylogenetic inference: epistemological perspectives in scientific practices, by E. Martínez-Bautista; Abductive research methods in psychological science, by B. Haig; Motor simulation of facial expressions, but not emotional mirroring, depends on automatic sensorimotor abduction, by V. Cuccio and F. Caruana; Tracking abductive reasoning in the natural sciences, by A. Rivadulla). M. Brioschi, in Abduction and metaphysics, explores an interesting relationship that further demonstrates the relevant role of abduction in philosophical cognition as well. P. Olmos, in Reason-giving-based accounts of abduction, shows the centrality of abduction in every human cognitive activity that involves the role of “reasons.” Finally, two chapters address central topics that surely belong to the tradition of studies about abduction: a complete account of the role of the concept of inference to the best explanation in abductive cognition (Inference to the best explanation – An overview, by F. Cabrera) and the uncertain character of abduction in the light of Bayesianism (The limits of subjectivism: On the relation between IBE and (objective) Bayesianism, by A. Apostolidis and S. Psillos).
References Aliseda, A. (2006). Abductive reasoning. Logical investigations into discovery and explanation. Springer. Gabbay, D., & Woods, J. (2005). A Practical Logic of Cognitive Systems. The Reach of Abduction: Insight and Trial (Vol. 2). Elsevier. Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Kluwer. Magnani, L. (2009). Abductive cognition. The Epistemo-logical and Eco-cognitive dimensions of hypothetical reasoning. Springer. Magnani, L. (2017). The abductive structure of scientific creativity. Springer. Niiniluoto, I. (2018). Truth-seeking by abduction. Springer.
Galilean Methodology and Abductive Inference
75
Fabio Minazzi
Contents
The Galilean Dialogue as a Document of a Decisive Conceptual Change
Predictivity as the Rationalization of a Fact?
Galileo: The Scientist as a “Mathematical Philosopher”
Peirce: Abduction as a Rationalization of Any Fact
Descartes: The Deductive Method as an Instrument of Research
Vailati: The Deductive Method as an Instrument of Research
From the Dogma of the Method to a Critical-Historical-Normative Image of Knowledge
Galileo: Detective or Scientist?
Conclusions
References
Abstract
Galileo, in founding modern science, ended up attributing a decisive role to the deductive method by means of which human reason must do violence to sense in order to know the world. This emphasizes, in a highly admirable way, precisely the constitutively counterfactual aspect of modern scientific reasoning, which no longer starts from empirical observation, but rather conjecturally constructs a theory whose mathematically deduced conclusions are then placed in relation to the experimental dimension. This revolutionary approach to scientific research outlined by Galileo has not always been understood. On the contrary, the empiricist image of science has often been superimposed on the very way in which scientists work. Peirce’s elaboration of abduction corrected this empiricist reading of science, emphasizing the need to include abduction itself in a great multiplicity of perspectives, all very open and very broad. The essay retraces this complex knot concerning the most correct epistemological image of scientific knowledge, following and analyzing the contributions provided by both the empiricist and idealist traditions, and then focusing on the contribution of two authors such as Peirce and Vailati who, in two different contexts, grasped the importance of a normativist vision of scientific knowledge and the role of deduction and of Peircean abduction itself. In this way, the critical problematic nature of scientific knowledge is placed at the center of our image of scientific knowledge.
F. Minazzi, Dipartimento di Scienze Teoriche e Applicate, Università degli Studi dell’Insubria, Varese, Italy. © Springer Nature Switzerland AG 2023. L. Magnani (ed.), Handbook of Abductive Cognition, https://doi.org/10.1007/978-3-031-10135-9_45
Keywords
Human reason · Counterfactual · Abduction · Normativist knowledge · Conjectural knowledge
Poirot said placidly, ‘One does not, you know, merely employ the muscles. I do not need to bend and measure the footprints and pick up the cigarette ends and examine the bent blades of grass. It is enough for me to sit back in my chair and think. It is this’ – he tapped his egg-shaped head – ‘this, that functions.’ Agatha Christie, Five Little Pigs (1943, p. 2)
The Galilean Dialogue as a Document of a Decisive Conceptual Change
In the Dialogue Concerning the Two Chief World Systems (1632), Galileo imagines a critical comparison not only between the two principal astronomical systems of the time – the Ptolemaic and the Copernican – but also between two different traditions of thought, namely, the Aristotelian tradition and that connected with the genesis of the new modern science, which he himself inaugurated. Viewed in this hermeneutic perspective, Salviati, Simplicio, and their intelligent host Sagredo not only represent different traditions of thought, but place before the reader the dialogue, more intimate and profound, that induced Galileo himself, a Ptolemaic by formation and culture, to finally become a convinced exponent of Copernicanism. With these pages of the Dialogue Galileo illustrates and analytically documents, in short, a truly extraordinary page of his own radical conceptual change. The various objections that Simplicio puts forward are in fact the perplexities, doubts, and critical questions that Galileo himself had experienced in person, when he began to reflect critically on the Ptolemaic astronomical system of the scientific tradition preceding him, and then to gradually focus his attention on the revolutionary version presented by Copernicus. Or rather, in this precise hermeneutic perspective, the Galilean Dialogue enables us to follow, in slow motion (so to speak), step by step, the same extraordinary conceptual change by virtue of which Galileo finally decided to abandon the Ptolemaic system and instead make his own and openly defend the new and revolutionary Copernican system, namely, a heliocentric and heliostatic system which, at least according to what Isaac Newton would write in his De mundi systemate (of 1728), was very likely an opinion
spread by the Egyptians, who “were early observers of the heavens; and from them, probably, this philosophy was spread abroad among other nations; for from them it was, and the nations about them, that the Greeks, a people of themselves more addicted to the study of philology than of nature, derived their first, as well as their soundest, notions of philosophy; and in the vestal ceremonies we may yet trace the ancient spirit of the Egyptians; for it was their way to deliver their mysteries, that is, their philosophy of things above the vulgar way of thinking, under the veil of religious rites and hieroglyphic symbols” (Newton, 1728, §1). According to this Newtonian reconstruction, it was thinkers such as Anaxagoras and Democritus who finally conceived the Earth as an immobile celestial body, placed at the center of the universe, while, subsequently, the conception of “solid orbs was of a later date, introduced by Eudoxus, Calippus, and Aristotle; when the ancient philosophy began to decline, and to give place to the new prevailing fictions of the Greeks” (ibidem). But, leaving aside this very interesting Newtonian overview, in Galileo’s Dialogue we are faced with the precise conceptual and critical process which finally induced Galileo to abandon the traditional Ptolemaic astronomical system, in order to be able to share and finally make his own the new and revolutionary heliocentric and heliostatic perspective presented by Copernicus in De revolutionibus orbium coelestium (1543). To illustrate analytically the reasons that led him to embrace the new astronomical point of view advanced by Copernicus, Galileo, in the Dialogue, also dealt with another decisive problem, one that more directly concerned modern science and its new and revolutionary conceptual, critical, prospective, and methodical approach.
Simplicio, in fact, openly disputes the possibility of using mathematics to understand the physical world, subject to many, too many, incessant qualitative changes. In accordance with the traditional thought of the ancient Greeks, Simplicio does not believe that it is possible to use mathematics – namely, a discipline based on great intrinsic rigor – in the terrestrial physical world, which, by contrast, is dominated by incessant change and continuous qualitative transformation, hence by a process that constantly causes all things to pass from being to non-being and from non-being to being, since change, corruption, and death itself dominate completely unchallenged the sublunar world in which there is only imperfection and a continuous qualitative panta rei, as constant as it is indispensable and inevitable. Faced with this qualitative dimension, intrinsically ephemeral and inevitably transient, peculiar, and specific to the sublunar world, in the supralunar world we would instead find ourselves faced with the opposite reality, always immutable and perfect because it is made of a special material, namely, the ether, that by its intrinsic nature does not undergo any change or any physical and qualitative transformation. For this basic reason, the terrestrial physical world would constitute a reality that is at the same time too complex and always subject to an ontological and constitutive imperfection, which, moreover, appears to be one with our own destiny as mortal beings, born, in fact, to die. Against this traditional conception of the cosmos and of the precise central place occupied by the Earth within the finite universe, Galileo had already openly been fighting since the time of his first, revolutionary, terrestrial observations, made, as
is well known, with his “long-sighted cannon”, namely, his famous telescope, the instrument with which the Pisan scientist, in the winter of 1609 and in the early winter months of 1610, explored the starry sky, thus being the first man in the history of humanity to have seen new celestial objects that had never been observed by anyone else before him. Not for nothing did Galileo entrust his extraordinary report concerning the “celestial novelties” which he had discovered thanks to the telescope and this radically innovative exploration of the starry sky to his famous Sidereus nuncius (English title The Starry Messenger), a precious yet small book, published in the spring of 1610, which, in a few weeks, was sold out and transformed this serious and respected, but rather obscure scholar of mechanics at Padua University into the most famous European scientist of the seventeenth century, whose name immediately rose to a highly deserved international reputation. But if a technical instrument such as the Galilean telescope had undoubtedly helped him to achieve a new and revolutionary view of celestial bodies, this same unprecedented and innovative exploration of the starry sky was inserted, by its intrinsic nature, into the precise claim of a profound material continuity between the sublunar and supralunar worlds. In his exploratory research, Galileo took it for granted that the whole universe is always made of the same matter present on earth and in the sublunary world, thus conceiving the entire universe as a reality made up entirely of a homogeneous material. Galileo in fact denied that there was any possible qualitative leap between the terrestrial world and the supralunar world, which he conceived as made and formed out of the same physical material present on earth and in the entire universe. 
But in addition to maintaining this basic material homogeneity existing in the entire universe, Galileo also developed a new way of practicing and conceiving the scientific investigation of the physical world, in order to continuously increase the technical-cognitive patrimony which was actually available to humanity in the different historical phases of civilization.
Predictivity as the Rationalization of a Fact?
Seen in this perspective, the Galilean reflection on the new way of proceeding in scientific investigation helps us to understand better his own image of scientific progress. In this particular and decisive perspective, the objection that Simplicio puts forward, against the Galilean claim to be able to use the heuristic tool of mathematics to better investigate the physical world, constitutes a truly crucial, decisive and even quite strategic objection. It is, in fact, a completely radical epistemological observation, because it strikes at the heart of the very possibility of using the rigor of mathematics in order to study a sublunar physical world that is always imperfect and always changing, in constant mutation, precisely because it is said to lack any traditional “perfection”, which the ancients generally perceived as coinciding with a static, fixed, and substantially unalterable condition. For this precise reason Simplicio then puts forward a decisive objection, explicitly appealing to traditional Euclidean geometry, since Simplicio points out that “then whenever you apply a material sphere to a material plane in the concrete, you apply a sphere
which is not perfect to a plane which is not perfect, and you say that these do not touch each other in one point” (Galilei, 1962, 207 and Galilei, 1968, vol. VII, p. 233). Faced with this first but decisive objection, Galileo – as if “playing for time” within the pressing rhythm, typical of every authentic critical dialogue between interlocutors who take their stand on opposite positions – immediately sketches out a first possible answer, in order to be able to “immunize”, as far as possible, Simplicio’s decisive criticism. For this reason, Galileo then tries to immediately “minimize” the decisive significance of this objection presented by Simplicio. The objection is truly decisive precisely because it disputes the heuristic use of mathematics as an instrument which, by its intrinsic nature, would delineate a rather poor, diminished and completely distorted image of the infinite empirical-qualitative complexity of the real world. Faced with this criticism, Galileo reacts with the following, traditional reasoning: “but I tell you that even in the abstract, an immaterial sphere which is not a perfect sphere can touch an immaterial plane which is not perfectly flat in not one point, but over a part of its surface, so that what happens in the concrete up to this point happens the same way in the abstract”. In this way there would exist, in short, a sort of perfect parallel and an undeniable congruence between theory and reality: an equally imperfect theory would follow from an imperfect reality. In this way he tries to immunize Simplicio’s objection by imagining a theory that explicitly states how an imperfect sphere would necessarily touch an imperfect plane in several places. This is what actually happens in our imperfect world of the quinque sensibus in which, for example, imperfect wooden spheres rest, in several points, on an equally imperfect wooden surface.
In this case, Galileo then adds, in an almost “triumphant” way, “it would be novel indeed if computations and ratios made in abstract numbers should not thereafter correspond to concrete gold and silver coins and merchandise”. However, with this first reply Galileo did not really answer the objection of those who dispute, radically, the possibility of heuristically using a rigorous instrument like mathematics to study an imperfect, changeable and qualitatively changing reality like the one we live in in the sublunar world. Galileo himself is perfectly aware of the overall inadequacy of his first response, by which he only sought to repel, in the first instance, his adversary’s criticism, while also seeking to “stall for time”, in order to then develop a truer answer, much more fully articulated, deeper, and more relevant. What is more, with this first answer, in which a theory, in the final analysis, is limited only to “duplicating”, on a conceptual level, what happens in the physical world, Galileo relates, completely consciously, to the traditional and equally widespread “epistemological creed” typical of Greek antiquity, according to which our theories could only be isomorphic “copies” of the qualitative world that they seek to describe and know. In fact, for the traditional, and very widespread, correspondence theory, the statement “snow is white” turns out to be true “if and only if ‘snow is white,’” that is to say, if there is a physical body called “snow” which enjoys the property of being effectively “white”. For many centuries this correspondence thesis thus constituted the precise and fundamental epistemological standard criterion to which almost the whole tradition, albeit quite composite, of ancient pre-scientific thought always appealed.
Moreover, this correspondence criterion is still dominant even within our own common sense, as well as in our courts of law today, and also in almost all the reasoning typical of almost all the bureaucracies of the contemporary world. Moreover, because of this precise “correspondence” motive, the ancients adopted the Ptolemaic system and discarded the decidedly “counterfactual” system developed by the Pythagorean Aristarchus of Samos who, if we are to believe Newton’s observation mentioned earlier, instead drew directly on the astronomical concepts of the ancient Egyptians. The Ptolemaic system seemed to adhere much more closely to the reality directly observed by our senses, while the heliocentric system seemed to enter into open and flagrant contradiction precisely with our own sense experience. Moreover, seen in this perspective, even the perfect geometric equivalence subsisting between the geocentric and geostatic system with the heliocentric and heliostatic one seemed to constitute, in turn, a useless and bizarre “complication” of a simple and immediate physical-material reality directly attested by quinque sensibus, since all men can see and observe, with their own eyes, the movement of the sun in the sky in the course of a day. For all these reasons – variously stratified over time, within the widespread epistemological creed of ancient thought – Galileo therefore knew very well that his first response to Simplicio did not by any means constitute the true and authentic reply to the objection of his interlocutor. Also because this objection was very radical, contesting, ab imis fundamentis, the very possibility of being able to use mathematics to know the sublunary world, which is always imperfect and constantly changing. 
It is not for nothing that the entire tradition of ancient Western thought had always and systematically undervalued the critical-heuristic role of mathematics, to which it had, if anything, opposed syllogistic logic, which had moreover been formally well codified by Aristotle in the Organon.
Galileo: The Scientist as a “Mathematical Philosopher”
Hence Galileo introduces his true answer with a rapid and emblematic change in the literary style of his writing. He deliberately introduces his answer with a completely rhetorical and even rather ironic question, typical of those who are “very knowledgeable” and are now ready to unveil an innovative and revolutionary point of view that cannot fail to radically undermine the traditional framework within which Simplicio always thinks: “Do you know what does happen, Simplicio? Just as the computer who wants his calculations to deal with sugar, silk, and wood must discount the boxes, bales, and other packing, so the mathematical scientist [in Italian: filosofo geometra, see Minazzi 1994a], when he wants to recognize in the concrete the effects which he has proved in the abstract, must deduct the material hindrances, and if he is able to do so, I assure you that things are in no less agreement than arithmetical computations.” With this revolutionary and completely unexpected response, the scientific procedure inaugurated and practiced by Galileo appears to be completely at the antipodes of the traditional epistemological correspondence creed, peculiar and specific to antiquity and all ancient science. In fact, for Galileo the filosofo geometra
or “mathematical philosopher” must, first of all, be capable of demonstrating, in the abstract, the effects of his own theory. Therefore for Galileo the starting point of the scientist is precisely identified in his ability to know how to use his imagination and his creative fantasy to construct a new theory (or a new prospective-conceptual point of view) within which he must then be able to deduce some precise consequences. But how can a new theory be built? Naturally by starting from some precise and original theoretical assumptions. But, one might still ask: how can these new theoretical assumptions be introduced? Ex suppositione, Galileo would explicitly write in his famous letter to Giovanni Battista Baliani of 7 January 1639, an epistle in which the scientist from Pisa explains fully to his interlocutor how he constructed the physics of the motion of rigid bodies in the Dialogues Concerning Two New Sciences (1638). According to this epistemological explanation, Galileo first of all formulated a precise abstract definition, ex suppositione, of motion, a definition from which he then rigorously derived deductively the conclusions that can, in fact, be reached, thus constructing a highly articulated physical theory of motion. In this first preliminary phase, which is eminently theoretical, Galileo moves, therefore, in a sphere of thought that turns out to be, at the same time, creative-conventional (thanks to the introduction of conventional assumptions), but which is also, at the same time, deductive (according to the most rigorous requirements of mathematical deduction, which in fact makes it possible to construct a specific, accurate and precise physical theory). It is only after having deduced all the rigorous consequences that can be derived from this physical theory of his that the “mathematical philosopher” can finally investigate whether all these consequences are eventually confirmed (or refuted) by the experimental dimension. 
In starting his innovative experimental investigation, Galileo clearly knows, moreover, that the “mathematical philosopher” at all times “must deduct the material hindrances” if “he wants to recognize in the concrete the effects which he has proved in the abstract”. Consequently, for Galileo, the scientist’s way of proceeding can only be at the same time decisively counterfactual and also distinctly conceptual, to then compare the theoretical predictions with the experimental dimension, which is given the last, decisive, word. In any case, the cognitive process never starts from sense experience as such, because, if anything, the physical dimension to which constant and fundamental reference is made must always be “conquered” through the elaboration of a particular theory capable of making certain predictions, on the basis of which it will inevitably be judged and evaluated. In this way the facts have only rights towards theories, while the latter always have, if anything, precise duties towards empirical facts, which must always be scientifically “explained”, so making them known to us as the “results” of precise logical-random-deductive connections, by virtue of which we are able to also know them predictively. As is easy to understand, this new epistemological model gives us a much more complex and highly articulated image of human knowledge itself which, for Galileo, is always constructed in the fruitful and free problematic entwining between the specific dimension of scientific thought and that of experimental confirmation of a certain theoretical prediction. Thus in the scientific proceeding theorized and specified by Galileo, conventional moments ex
suppositione, theory, mathematical deduction and even the experimental dimension itself all constitute indispensable phases of a modern scientific procedure worthy of the name. This then achieves an authentic physical revolutionary reversal with respect to the traditional and established scientific procedure of antiquity: in the latter, an overall inductivist approach always predominated, while in the modern scientific mentality precisely the deductivist phase acquires a decisive prevalence. In this more complex and highly articulated scientific and epistemological universe, the scientist must therefore always know how to start from a precise and rigorous hypothetical definition, which he introduces, conventionally and in a completely creative way, in order to then be able to develop, deductively, a precise physical theory, through which he will finally have to be able to compare his own theoretical conclusions with the experimental material available, which, in turn, will arise from a precise mathematization of the reality by means of which he will construct the necessary experimental dimension. But it is then evident that even the latter – which also plays a decisive and indispensable role in verifying or refuting a theory – is by no means assumed in a neutral way, precisely because, thanks to the construction of a new theory, the scientist must always be able to interpret, normatively, different aspects of the real world, within which he must know how to construct and develop a new specific explanation of the “experimental facts” which he took into due consideration. 
In this perspective, the “facts” themselves then change their very epistemic nature, because they can no longer be conceived in an atomistic way, as “isolated empirical facts,” because, if at all, they must be traced back to a precise and equally rigorous “theoretical framework,” through which a certain “scientific fact” is actually rationalized to be duly related and interconnected with other “facts”, also thought of and conceived always within a determinate and equally precise theoretical perspective. In other words, it can then be argued that these same “facts,” known and studied within a specific and particular theoretical context, constitute, by themselves, a possible objectification of reality which does not, however, exhaust all its physical potential. In short, we are faced with an objectification of the real which, while never exhausting all the intrinsic and infinite potentialities of reality (see Agazzi, 2014), initiates a cognitive process that is articulated precisely in its ability to always be capable of critically extending the different levels of our possible knowledge of the real world. In this precise epistemological perspective, the experimental dimension therefore plays an absolutely decisive and insuppressible role for Galileo, but, in any case, it never comes before theories, because it follows them and judges them only afterwards, by confirming or disproving a determinate and specific prediction. In this way, for Galileo, the game of science appears much more complex, malleable and creative than ancient thought was ever able to suspect and conceive (with very few exceptions, among which we must naturally include the quite extraordinary work of a great scientist of antiquity, Archimedes of Syracuse who – and not by chance – always represented for Galileo an indispensable, fundamental and privileged strategic point of reference that was both theoretical and practical-experimental). After all, ancient thought was always based precisely
on the qualitative sense experience peculiar to the quinque sensibus after which they sought to rationalize it in the most appropriate way possible. From this point of view, Aristotle’s Physics remained for many centuries a reliable and truly emblematic paradigmatic model of reference that even today helps us to better understand the different conceptual approach existing between the traditional physical researches of antiquity and the modern ones inaugurated by Galileo in the seventeenth century, opening a new and revolutionary path that has never since been abandoned, thus introducing a completely fundamental and decisive turning point in human history, by virtue of which, to put it in Bertrand Russell’s words, three centuries of science have changed the world more than four thousand years of pre-scientific culture (see Russell, 1931). In this hermeneutic perspective, ancient science constitutes an undoubted model with respect to which modern science constitutes a decidedly alternative guide, since it is based on counterfactual theoretical innovations that authorize the delineation of some predictive statements which must then be rigorously checked and verified precisely through the experimental dimension. Thus in modern science the phase of conventionality, that of deductivism and also that of inductivism and of the experimental dimension itself all end up by being some decisive, and yet quite different phases, autonomous and profoundly different, through which a new and revolutionary methodology is constructed, in order to be able to better understand, namely in a somewhat deeper way, the physical world in which we live.
Peirce: Abduction as a Rationalization of Any Fact
To explain the nature of the scientific procedure, Charles Sanders Peirce observed that “we all know that as soon as a hypothesis has been settled upon as preferable to others, the next business in order is to commence deducing from it whatever experiential predictions are extremest and most unlikely among those deducible from it, in order to subject them to the test of experiment, and thus either quite to refute the hypothesis or make such corrections of it as may be called for by the experiments; and the hypothesis must ultimately stand or fall by the result of such experiments” (Peirce, 1960–1966, CP 7.182). In this case, however, we are not faced with what in the twentieth century became the overused Popperian slogan “daring conjectures, ruthless refutations” (in this regard see Minazzi, 1994b), since Peirce outlines, with a greater critical-interpretative refinement, a fruitful and articulated image of the scientific process, which, pace Popper, starts from the assumption that “science seeks to discover whatever there may be that is true”, since science only demands “solid truth, or reality” (CP 7.186). Consequently, for Peirce it is evident that “the different sciences deal with different kinds of truth”, precisely because, one could add in the manner of Husserl (see Husserl, 1928), all the various, possible and multiple disciplines construct, constitute and identify always different “regional ontologies” within which their own peculiar cognitive objectivity is progressively defined. After all, for Peirce “the work of reason consists in finding connection between facts” (CP 7.198).
In this precise perspective, à la Kant (and, this time, pace Peirce!), reason constitutes a specific function of critical integration of reality: a function of critical integration which critically probes the opacity and darkness of the real world, in order to delineate a specific hermeneutic comprehension of it. Thanks to this innovative scientific hermeneutics, the feeble critical focus of the rational human understanding of the universe makes it progressively less dark and, therefore, better known objectively. Better still: Peirce’s epistemological approach shows with great clarity that scientific knowledge always “renders the fact a conclusion, necessary or probable, from what is already well known. It might be called a regularization, explanation and regularization being the two types of rationalization” (CP 7.199). This brings out the intrinsically normative character of scientific knowledge, as was clearly illustrated by Kant in his illustrious epistemological examination of Newtonian physics entrusted to the pages of his Metaphysische Anfangsgründe der Naturwissenschaft (see Kant, 1786; Pollok, 2001). This entails that “now it is true that the effect of the regularization is that the fact observed is less isolated than before; but the purpose of the regularization is, I think, much more accurately said to be to show that it might have been expected, had the facts been fully known” (CP 7.199). In fact, for Peirce the authentic spark that sets the cognitive process in motion does not at all arise from the consideration of the irregularities of experience, but rather from perceiving and identifying an unexpected fact, namely a phenomenon that proves to be contrary to the expectations connected with our standard scientific explanation of the world. Exactly on this basis Peirce then introduces the role and function of abduction (or retroduction) in close connection with the heuristic role of induction.
Indeed, Peirce writes: “accepting the conclusion that an explanation is needed when facts contrary to what we should expect emerge, it follows that the explanation must be such a proposition as would lead to the prediction of the observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts, is what I call abduction. I reckon it as a form of inference, however problematical the hypothesis may be held” (CP 7.202). This problematicity of the hypothesis is one with the criticality of the scientific procedure, since, Peirce further points out, “a hypothesis adopted by abduction could only be adopted on probation, and must be tested”. This not only profoundly changes our traditional standard image of science, but also provides the opportunity for a very different reading of the history of science itself. For example, consider one of the most important steps that took place in modern times with the progressive affirmation of the Copernican theory. As is well known, the singular paradox of Copernicus (1473-1543) can perhaps be identified in the fact that this astronomer was decidedly original and very innovative in his astronomical conception, while his scientific method was, by contrast, somewhat conservative, to the point that Copernicus always believed, in profound agreement with the ancients and with the famous Platonic dogma, that the celestial bodies necessarily followed uniform
75 Galilean Methodology and Abductive Inference
and circular movements. Precisely because of his prejudicial attachment to the traditional astronomical dogmas of antiquity, his astronomical system ended up being, paradoxically, on the one hand “simpler” than that presented by Ptolemy, though, on the other hand, it was also much more complex, at least compared to the system later configured by Kepler (1571-1630), whose seven ellipses replaced the thirty-four circles to which Copernicus had to refer in order to explain the apparent motions of the celestial bodies. Kepler obtained this innovative and certainly decisive astronomical result while in the service and under the protection of his patron, the Emperor Rudolf II, and he was undoubtedly able to benefit from the most precise, extensive, and systematic astronomical observations, all made with the naked eye by the astronomer Tycho Brahe (1546-1601). Kepler’s Rudolphine Tables, published in 1627, were undoubtedly much more accurate than both the Prutenic Tables, devised by Reinhold in 1551 on the basis of the Copernican theory, and the Alfonsine Tables, dating back to the thirteenth century and still based on the Ptolemaic system. This greater accuracy stemmed precisely from the fact that Kepler had been able to use all of Tycho Brahe’s precious and rigorous observational results. But precisely this fact, which is completely correct, has led a substantial part of the historiographical tradition to believe that the three famous laws of Kepler were obtained by the German astronomer directly by a purely inductive method. This has given rise to a veritable “inductivist ‘legend’” (see Singer, 1959; Mason, 1962), which corresponds to an authentic “mental cramp” à la Wittgenstein. For what reason? For several reasons.
In the first place, because it was not in the least taken into account that in his Epitome astronomiae copernicanae, composed between 1618 and 1621, Kepler expounded with precision his astronomical method, which turns out to be quite different from the Copernican one. In this work Kepler argues that astronomy presupposes five parts, which are autonomous and quite different. The first is astronomical observation of the sky, which must be combined with the second, the ability to develop hypotheses to explain the apparent motions of the celestial bodies observed. Third, according to Kepler it is also necessary to know how to elaborate a physics and a metaphysics of cosmology, the latter of which must also be connected with the fourth part, the precise calculation of the positions – future and past – of the single celestial bodies. Finally, again according to Kepler, it is necessary to master a discipline such as mechanics, which enables us to construct optical instruments and places us in the best position to use them. Certainly Kepler also believed that the metaphysical component was not an essential factor in astronomical research: he was ready to declare that if the hypotheses were congruent with the metaphysical structure, this certainly constituted a result not to be despised; if, however, metaphysics turned out to be openly at odds with astronomical theory, then it was certainly metaphysics that needed to give way. In any case, Kepler was also convinced that the hypotheses used in the astronomical field should always be reasonable and always
capable of making possible “The demonstration of the phenomenon” taken into consideration, while also being able to show “its usefulness in practical life”. In the second place, because the image has gradually spread of an inductivist Kepler, one who constructed his new explanation of the solar system and the orbits of celestial bodies, emblematically epitomized in his famous three laws of the movement of the planets, by relying systematically on an extensive and highly articulated body of precise observational results. These were made by the Danish astronomer Tycho Brahe, with his numerous assistants, in the course of his research conducted over many years, from 1576 to 1597, on the island of Hven in the Sound, the strait of Copenhagen. Undoubtedly these observations by Brahe were far more accurate than those previously collected by Johannes Müller (1436-1476), the famous Regiomontanus, in collaboration with his friend (and patron) Bernhard Walther (1430-1504), with whom another friend of Müller’s collaborated, namely, the famous artist Albrecht Dürer (1471-1528). After all, this complex and highly articulated set of astronomical observations was already available to astronomers when Copernicus began his work. Faced with this basic framework, which has in general somehow “crystallized” and become a reference model for the main historical reconstructions of science, Peirce, by contrast, was able to reconsider the specific heuristic link established between the inductivist phase and the nomological-deductive one. In particular, in reconsidering the specific genesis of the new Keplerian image of the movement of the heavenly bodies, Peirce rightly felt the hermeneutic need to re-examine precisely this nomological-deductive link between a theory and the plane of experimental observations. In this regard, Peirce wrote:
all necessary reasoning consists of tracing out what is virtually asserted in the assumed premises.
While some of these may be new observations, yet the principal ones relate to states of things not capable of being directly observed. As has often been said, especially since Kant, such reasoning really does not amplify our positive knowledge; although it may render our understanding of our own assumptions more perfect. It is the kind of reasoning for any application of science. For example, it is by such reasoning that, assuming the law of gravitation to have been scientifically established, we go on to predict the time and place of an eclipse of the sun. Or, if our desire is to rectify our theory of the moon, we may do so by comparing such predictions, regarded as conditional, with observations. If, in making the correction, we assume that there can be no error discoverable by these observations except in the values of one or two constants employed, the correction is itself made by a mere application of principles assumed to be already scientifically established; and although it will be called a contribution to science, it leaves the frame-work of the theory untouched, and merely consists in incorporating the new observations into the places provided for them in our existing assumptions, so that there really is, in the logician’s sense, no enlargement of our knowledge, but merely an arrangement or preservation of the systematization of knowledge already established. (CP 7.180)
But then what is a hypothesis rooted in? Peirce replies: “the entire meaning of a hypothesis lies in its conditional experiential predictions: if all its predictions are true, the hypothesis is wholly true” (CP 7.203). Consequently, for Peirce, abduction cannot but be closely linked to induction, since “this sort of inference it is, from experiments testing predictions based on a hypothesis, that is alone properly entitled
to be called induction” (CP 7.206). The complex game that then takes place between abduction and induction within the scientific procedure is qualified by Peirce as follows:
Abduction seeks a theory. Induction seeks for facts. In abduction the consideration of the facts suggests the hypothesis. In induction the study of the hypothesis suggests the experiments which bring to light the very facts to which the hypothesis had pointed. The mode of suggestion by which, in abduction, the facts suggest the hypothesis is by resemblance, – the resemblance of the facts to the consequences of the hypothesis. The mode of suggestion by which in induction the hypothesis suggests the facts is by contiguity, – familiar knowledge that the conditions of the hypothesis can be realized in certain experimental ways. (CP 7.218)
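The interplay described in the passages quoted above (CP 7.182, 7.202, 7.203, 7.206) can be set out schematically. What follows is only an illustrative sketch: the symbols F (the unexpected fact), H (the hypothesis) and P_1, ..., P_n (its experiential predictions) are notation introduced here for convenience and do not appear in Peirce's text, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\textbf{Abduction:}\quad
  & F \text{ is observed, contrary to what was expected;} \\
  & \text{if } H \text{ were true, } F \text{ would follow necessarily, or at least very probably;} \\
  & \text{hence } H \text{ is adopted, but only on probation.} \\[6pt]
\textbf{Deduction:}\quad
  & H \vdash P_1, \dots, P_n
    \quad\text{(the most extreme and unlikely predictions first).} \\[6pt]
\textbf{Induction:}\quad
  & \text{each } P_i \text{ is put to the test of experiment;} \\
  & \text{if some } P_i \text{ fails, } H \text{ is refuted or corrected;} \\
  & \text{if every } P_i \text{ holds, } H \text{ stands.}
\end{align*}
```

Read in this way, abduction runs from the facts to the hypothesis, whereas induction, in Peirce's sense, runs from the hypothesis back to the facts by way of the predictions that deduction draws from it.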
Descartes: The Deductive Method as an Instrument of Research
But, as we have seen, Peircean abduction is not solely concerned with inductivist inference, since it also fully highlights the epistemological and heuristic role of deductivist inference, namely, the opposite and contrary operation, by virtue of which the object of knowledge is defined normatively. However, there has not always been a full epistemological awareness of the crucial role that deductivism performed within the very birth of modern science and its consequent research program, both philosophical and scientific (for this factor connected to the historical and theoretical link between scientific and philosophical thought, the reference to Geymonat 1970-1976 is still fundamental today). However, as Giovanni Vailati pointed out as early as 1898, in his inaugural lecture to the course of History of Mechanics at the University of Turin, year 1897-1898 (Vailati, 1911, pp. 118-148 and Id., 2010, pp. 23-62):
We can see that the distinction between the processes of induction or generalization and those of deduction and demonstration is already clearly acknowledged in the work of Greek philosophers, who may have made the first attempts at analysis and systematic classification of the processes and artifices used by the human mind in proceeding from the unknown to the known. (Vailati, 2010, p. 25, italics in the text)
In this way Vailati explicitly referred to the reflection present in Aristotle’s Organon, which clearly distinguishes between inductive and deductive inference. Induction (epagoghé) is thus “defined by Aristotle as that form of reasoning by means of which, from the examination and comparison of a series of particular cases, we arrive at a general proposition which contemplates not just the observed cases, but also an indeterminate number of other cases, which have a certain relation of similarity with the former cases” (ibidem). Conversely, deduction (apodeixis) is qualified by Aristotle “as any form of reasoning that can be reduced to the kind he has designated as syllogism (syllogismos), which, as we know, consists of the following: starting with two propositions, when one affirms a given property of an entire class of objects, and the other states that one or more objects belong to such a class, we get to a third proposition, where the initial property is also attributed to the aforementioned objects” (ibidem). On this basis Aristotle then stresses the
fundamental distinction existing between deductive and inductive inference: in the former, if one admits the truth of the premises, one is then obliged – by an evident logical necessity – to admit the truth of the consequences. Otherwise, one would end up violating the principle of non-contradiction, leading to theories that are logically too powerful, because they can demonstrate everything and the opposite of everything, showing, in their very “power”, their substantial irrelevance and uselessness. On the other hand, with results obtained inductively, one who admits the truth of the premises would never fall into any contradiction or inconsistency by rejecting the truth of the inferential generalization. For this basic reason Aristotle then emphasizes that deductive inference always leads to necessary and compulsory results, while only inductive inference, by contrast, enables us to start from sense data drawn from the direct testimony of the senses. Again for this underlying epistemological reason, Aristotle is fully convinced that the principles from which all our deductions must be articulated should always be based on the observation of some special facts (which, of course, change from discipline to discipline), precisely because he is convinced that all principles and axioms of a particular discipline have an inductive origin. He is so convinced of this that, in his opinion, even Euclidean geometry has an inductive origin. In any case, it should not be overlooked that Aristotle holds that deductive inference has a preeminent and fundamental function, namely that of rendering a certain statement more reliable, thereby increasing its certainty and decreasing its uncertainty, and so succeeding in reducing what is questionable and doubtful to what is indisputable and indubitable.
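The contrast just drawn can be displayed in a modern notation that is, of course, foreign to Aristotle's own text. In the following minimal sketch the predicate letters M (membership in the class), P (the property affirmed) and the individual constants a, a_1, ..., a_n are assumptions introduced purely for illustration.

```latex
\[
\text{Deduction (syllogism):}\qquad
\frac{\forall x\,\bigl(M(x)\rightarrow P(x)\bigr)\qquad M(a)}{P(a)}
\]
\[
\text{Induction (generalization):}\qquad
\frac{M(a_1)\wedge P(a_1),\ \dots,\ M(a_n)\wedge P(a_n)}
     {\forall x\,\bigl(M(x)\rightarrow P(x)\bigr)}
\quad\text{(not necessary)}
\]
```

In the first schema, accepting both premises while denying P(a) yields a contradiction. In the second, the observed cases remain perfectly consistent with the denial of the generalization, since a further unobserved case with M but without P is always conceivable.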
For this reason, Aristotle held that deduction constitutes an entirely privileged logical tool, being able to guarantee the truth of propositions that may appear to us, at first glance, only to be probable and partly plausible, relating them to indubitable, certain, reliable and incontestable propositions. In this respect, Geometry and Rhetoric again constituted for Aristotle two truly exemplary fields of intervention, since in both cases the different interlocutors nevertheless seek to corroborate their statements by deducing them from and/or supporting them by axioms or legislative provisions accepted and unquestioned by all. But precisely on this decisive point – which leads, therefore, to a clear contrast between deduction and induction – we can then see all the enormous epistemological and scientific distance that inevitably separates the image of scientific knowledge typical of antiquity from that developed by the scientists of modernity. For the former, in fact, one must always start from indubitable statements, to be derived directly from experience, while for the moderns, on the contrary, it is necessary in our reasoning to proceed scientifically from conjectural propositions that are always in need of proof. This innovative raising of critical awareness clearly emerges, moreover, from the texts of the founders of modern science. To give just one emblematic example, we can take the celebrated Discours de la méthode (1637) by René Descartes, in the second part of which the French thinker presents his famous four rules of scientific reasoning, which he polemically contrasts with the method of proceeding typical of the ancients. Now, his third rule explicitly refers to the function of order: “le troisième, de conduire par ordre mes pensées, en commençant par les objets les plus
simples et les plus aisés à connaître, pour monter peu à peu, comme par degrés, jusques à la connaissance des plus composés; et supposant même de l’ordre entre ceux qui ne se précèdent point naturellement les uns les autres” (Descartes, 1987, pp. 18–19). Seen in this new perspective, the order of the different elements – the fruit of human sagacity – therefore means that the explanation of the next link certainly requires that of the previous link, while the opposite is not true, precisely because the previous element can – and must – be explained without reference to the next. On this new function of order, therefore, we can construct a new knowledge that begins from a starting point that can also constitute – as we have already seen in Galileo’s work – a decidedly conjectural moment. In any case, also in Descartes the enumeration – that is, the operation of control by which the whole deductive chain of all the links in a determined and specific order is traced – is rooted in the order established to understand the world. In this horizon, the chains of deductions are also capable of connecting distant reasons to all the objects of human knowledge, provided that we are able to consider only those that are evident (the first Cartesian rule), respecting an order that starts from the simplest and then rises to the most complex (second and third Cartesian rules). Therefore, in this innovative reflection outlined in the Discours de la méthode, the enumeration (the fourth Cartesian rule) rigorously seals this new conceptual and scientific deductive approach, which on the heuristic plane undermines traditional inductivism, replacing it with a deductivism fully aware of all its innovative critical potential. In fact Descartes writes, precisely for the purpose of defending the deductivist approach he used both in the Dioptrique and in Les Météores:
Que si quelques-unes de celles [arguments, ed.]
dont j’ai parlé, au commencement de la Dioptrique et des Météores, choquent d’abord, à cause que je les nomme des suppositions, et que je ne semble pas avoir envie de les prouver, qu’on ait la patience de lire le tout avec attention, et j’espère qu’on s’en trouvera satisfait. Car il me semble que les raisons s’y entresuivent en telle sorte que, comme les dernières sont démontrées par les premières, qui sont leurs causes, ces premières le sont réciproquement par les dernières, qui sont leurs effets. Et on ne doit pas imaginer que je commette en ceci la faute que les logiciens nomment un cercle; car l’expérience rendant la plupart de ces effets très certains, les causes dont je les déduis ne servent pas tant à les prouver qu’à les expliquer; mais, tout au contraire, ce sont elles qui sont prouvées par eux. (Descartes, 1987, p. 76)
In defending this deductivist conjectural procedure, expounded and practiced both in the Dioptrique and Les Météores, Descartes draws on the scientific mode of proceeding of astronomers, who often make use of assumptions to develop their geometric constructions, by which they then try to explain the positions (past, present and future) of the various celestial bodies they have studied. Now, all these suppositions, even without ever pretending to be true, nevertheless enable us to identify a fruitful principle from which deductions can be drawn, as indeed Descartes did in the Dioptrique, in which from the initial suppositions he was finally able to deduce an explanation for refraction and for vision itself, etc., while from the suppositions of Les Météores he was able to explain the nature of vapors, winds, fumes, etc. Moreover, Descartes himself, in a letter dated 13 July 1638, addressed to Jean-Baptiste Morin (see Descartes, Correspondance, CXXVII, in
Oeuvres de Descartes 1996, vol. II, pp. 196-221, in particular pp. 197-198), had already defended himself from the accusation of having fallen into a “vicious circle”, specifying that, in reality, there is no vicious logical circularity when one proves a cause starting from the effects, known through experience, to then prove some other effect derived from this same supposed cause. In short: for Descartes, if one proceeds from conjectural principles, one does not end up with a dogmatic circularity, but with a reciprocity by means of which the hypotheses help to better explain and understand physical phenomena, while, in turn, it is precisely the physical phenomena that enable us to prove (or disprove) the hypotheses. It is precisely through this new consciously deductivist approach that modern science has critically emancipated itself from traditional metaphysical inductivism and inaugurated a new and fruitful program of scientific research that has profoundly changed our world of praxis, as well as our very image of knowledge. Moreover, the passages quoted should clearly bring out the full cultural and prospective harmony existing between the Cartesian and the Galilean positions. This harmony finally allowed Vailati to state that:
The characteristic difference between Aristotle’s ideas and those of the founders of modern science about the function of deduction in scientific research lies precisely in how little importance has been given to deduction as an instrument of explanation and anticipation of experience, compared to the large amount of trust put in it as a means of proof and ascertainment.
His arguments about natural phenomena, even in those cases where, instead of being used to demonstrate the conclusions they led to, are used to test the premise on which they are founded, aim to reach this purpose more by showing the contradictions and inconsistencies among the various statements, or by showing that they cannot be affirmed simultaneously, rather than by venturing to conclusions never before suspected, whose verifications would have been able to give rise to new observations, which would have contributed to a better clarification on the matter at issue. (Vailati, 2010, p. 30 and see Minazzi, 2011, 2022)
Vailati: The Deductive Method as an Instrument of Research
Modern science lives precisely on its predictive spirit, to the point where a chemist and microbiologist like Louis Pasteur, as Vailati rightly points out, “has appropriately defined the experiment as an observation guided by preconceptions, that is, in other words, an observation preceded and guided by deductive processes” (Vailati, 2010, p. 31, note 11). Moreover, Vailati, taking up his considerations relating to the history of science, also comes to remind us that:
the history of science clearly shows us that, amongst the causes which gradually led to the substitution of the modern experimental methods in place of the ancient methods of mere passive observation, the application of deduction has to be included as one of the most important, even in those cases where the propositions taken as a starting point were considered more in need of proof than the resulting ones, cases where, therefore, the resultant propositions were those which had to pass on, to the initial conjectures, the grade of certainty that they were directly acquiring from a comparison with facts and experimental verifications. (Vailati, 2010, p. 31)
Modern science has therefore clearly understood that experience, by itself, does not teach anything unless we are able to critically probe the opacity of the world by the critical light of inquiring reason. Or rather, modern science is fully aware that what experience teaches us, precisely through inductive inference, is mainly linked to a pragmatic and practical-sensitive dimension, within which a basic degree of knowledge is attained that is the preliminary and yet fundamental one that we share with the other forms of life present on earth. With respect to this primordial and vital knowledge of the world, we can then develop only an essentially qualitative and pragmatic rationalization, according to the model of Aristotle’s Physics, which, undoubtedly, from this point of view, provides an excellent frame of reference for understanding the values and limits of the ancient world’s physical knowledge. Hence this ancient way of proceeding, which is perhaps configured as very “natural”, produces a rationalization with a low conceptual content which completely lacks the ability to implement a computational synthesis of critical integration of reality, in order to achieve a greater abstraction of thought. To achieve this further, deeper and more paradoxical level of knowledge, one must in fact have the ability, intelligence, and will to establish a new way of practicing scientific investigation, namely, the very decidedly counterfactual one that was initially developed by Galileo Galilei and then provided a privileged heuristic frame of reference for all modern science. The intrinsic, but extremely fruitful paradox, of the Galilean scientific model – which has since increasingly become a frame of reference for modern science – is rooted precisely in that curious inverted relationship that is established between abstraction and the practical-operative fruitfulness of this approach. 
After all, three centuries of modern science amply testify to how science has initiated, from the seventeenth century onwards, a continuous process of abstraction that has made it possible to achieve increasingly abstract and increasingly counterfactual levels. But precisely this curve that constantly grows towards an abstraction that is ever more abstract has enabled us to extend, equally constantly, our knowledge of the world. The paradox of this counterfactual model is rooted precisely in this inversely proportional relationship: the more human thought tried to adhere to the world of praxis and sense experience, the less it was able to develop an in-depth objective knowledge of this same physical world, but when it detached itself from the most immediate sense experience and began to test different hypotheses that were increasingly conjectural, abstract and daring, it progressively managed to penetrate ever more deeply into the secrets of life and of reality itself, thus initiating a revolutionary but constant process of growth of our technical-cognitive heritage, by which the possibilities of life on earth have also been improved. In this way, the more abstract, counterfactual and empirically improbable our theories are, the more they succeed in enabling us to penetrate into the most secret and hidden mechanisms of the world and of physical reality itself. For this basic reason, in order to truly know reality, it is not enough to observe it carefully and very scrupulously (as the ancients believed), because, if anything, it is necessary to be able to develop new ideas and new theories to try to explain this reality. In short, without theories you cannot know the world, even if it is also true that these theories and these same audacious abstract ideas must be promptly
verified and subjected to an experimental control, within which the physical reality itself – as Galileo clearly understood – again plays an equally fundamental and decisive role since it is precisely through it that the world confirms or disproves our theories. For this underlying reason Vailati then observes that:
in other words, ancient physicists were not inclined to experiment, mostly because they were busier making sure of the certainty of the starting propositions than of the truthfulness of those that were deduced from them, and therefore they did not have any reason to question what happens in cases different from those that, presenting themselves spontaneously to their observations, immediately suggested the generalization on which their arguments were based. Therefore it can be assumed that, in a sense, the increasingly widespread and systematic application of deduction to the study of natural phenomena provided the first stimulus to the development of modern experimental methods, and it is not by chance that the first eminent initiators of such methods were also at the same time the greatest founders and advocates of the application of that powerful tool of deduction, mathematics, to the sciences of physics. (Vailati, 2010, p. 32)
Viewed in this precise perspective, mathematics – used as a privileged heuristic tool above all in physical investigations – then “lent wings” to scientific thought, meaning, initially, physics. Again in this perspective, Vailati emphasizes how, for the first scientists of modernity, the experimental dimension took on the appearance of an authentic cimento or “trial”, to be precise “trial by ordeal,” since cimento is the word commonly used by Galileo to indicate an experiment:
Maybe it has not been mentioned enough, by those who dealt with the history of Mechanics, that the first and most decisive experiences to determine the advancement, by those who undertook them, not as inquiries toward nature, but rather as challenges, some kind of cimenti, to use the word which has now become classic, to which nature was subjected in order to challenge it to answer differently from how it should have answered. In fact, for a great number of the most important cases, the experiences turned out to be mere verifications of conclusions that had already been reached independently by the experiments. They would have been really astonished if the answers of nature had not conformed to their anticipations, and such a lack of conformity, when it actually did occur, led them to wonder why the experiments had not worked, rather than immediately doubting the legitimacy of their assumptions. Moreover, sometimes they appeared to be drawn to the experiment more as a way of convincing others than as a way to convince themselves, because for them appealing to facts was, in a way, the line of least resistance in penetrating the stubborn minds of their adversaries, not being able to counter their preconceptions with their own, without basing such preconceptions on some less subjective basis than their own personal convictions. (Vailati, 2010, pp. 32–33, italics in the text)
Precisely this approach, consistently deductivist, then helps us better understand the primary role that ideas, thoughts and theories play within modern science in order to start an innovative and fruitful cognitive process. This was also openly recognized and emphasized by Galileo, who on the third day of the Dialogue Concerning the Two Chief World Systems makes Salviati stress the role that reason always plays in knowing how to do “violence to sense”:
But the experiences which overtly contradict the annual movement are indeed so much greater in their apparent force that, I repeat, there is no limit to my astonishment when I reflect that Aristarchus and Copernicus were able to make reason so conquer sense that, in
defiance of the latter, the former became mistress of their belief. (Galilei, 1962, p. 328 and Galilei, 1968, vol. VII, p. 355)
This emphasizes, in a highly admirable way, precisely the constitutively counterfactual aspect of modern scientific reasoning, which no longer starts from empirical observation, but rather conjecturally constructs a theory whose mathematically deduced conclusions are then placed in relation to the experimental dimension, so that a theory, to put it now with Imre Lakatos, can thus “stick its neck out” to the cleaver of experience. But, of course, this does not trigger some naive falsificationist game (à la Popper) because, if anything, it instead fits into that more complex heuristic game by which “sense experiences” and “necessary demonstrations” are cleverly linked to lead to an effective critical increase in our knowledge of the world. In working in this way Galileo is moreover so artful that in the Dialogue he even pretends to attribute this scientific method of proceeding to Aristotle. Thus on the first day he presents Simplicio, who points out that “Aristotle first laid the basis of his argument a priori, showing the necessity of the inalterability of heaven by means of natural, evident, and clear principles. He afterward supported the same a posteriori, by the senses and by the traditions of the ancients” (Galilei, 1962, p. 50 and Galilei, 1968, vol. VII, p. 75). Faced with this deductivist Aristotle, Galileo has the dialectical skill to twist this image against Simplicio and end up attributing his new scientific way of proceeding to the Stagirite. In fact, Galileo puts the following reflection into Salviati’s mouth:
What you refer to is the method he used in writing his doctrine, but I do not believe it to be that with which he investigated. Rather, I think it certain that he first obtained it by means of the senses, experiments, and observations, to assure himself as much as possible of his conclusions. Afterward he sought means to make them demonstrable.
That is what is done for the most part in the demonstrative sciences; this comes about because when the conclusion is true, one may by making use of analytical methods hit upon some proposition which is already demonstrated, or arrive at some axiomatic principle; but if the conclusion is false, one can go on forever without ever finding any known truth – if indeed one does not encounter some impossibility of manifest absurdity. And you may be sure that Pythagoras, long before he discovered the proof for which he sacrificed a hecatomb, was sure that the square on the side opposite the right angle in a right triangle was equal to the squares on the other two sides. The certainty of a conclusion assists not a little in the discovery of its proof – meaning always in the demonstrative sciences. (Galilei, 1962, p. 51 and Galilei, 1968, vol. VII, p. 75)
Naturally in this passage Galileo is making skillful polemical use of the experimental method precisely because he attributes its use directly to Aristotle, in order to try to critically displace the Peripatetics of his time, and also to demonstrate that it is he, and not the Peripatetics, who is in greater scientific harmony with the cognitive research developed by Aristotle. However, if this polemical and instrumental use is dropped, it also emerges from this page that it is precisely the new Galilean experimental method that is capable of critically integrating both “sense experiences” and “necessary demonstrations”, as well as induction and deduction within a new and revolutionary scientific procedure, capable of initiating a cognitive venture that has never found any stopping point from that time until the present.
F. Minazzi
From the Dogma of the Method to a Critical-Historical-Normative Image of Knowledge
This new, and certainly much more complex, theoretical-practical approach to the scientific study of reality is therefore based on many different and even contrasting elements that cannot be reduced at all to the well-known “discourse on method” of Cartesian memory. Certainly the famous work by Descartes, which appeared, as we have seen, in 1637, clearly proved seminal and, in its own way, it certainly contributed to spreading the modern scientific mentality. But in doing so it also contributed to spreading a precise and circumscribed idea of the same scientificity that made it substantially coincide with the method of science. Consequently this work of Descartes has contributed, in a way that is certainly fundamental and directly proportional to its cultural and theoretical success, to disseminating in modernity what has rightly been qualified as the “Cartesian syndrome” (see Pera, 1991, pp. 3–17). According to this Cartesian approach, science is essentially characterized by the development of a precise “scientific method”, by applying which it is possible to produce scientific knowledge worthy of the name. From this perspective, science and method thus end up by coinciding. This goes so far that it becomes a sort of “common sense” to believe that if the scientific method is not applied, no knowledge worthy of the name could be produced. Thus the Cartesian syndrome introduced a true dogma, or the dogma of method, which profoundly conditioned Western epistemological thought from Descartes’ Discours de la méthode (1637) all the way to Against Method by Paul K. Feyerabend (1975). Nor can it be overlooked that the spread of this “Cartesian syndrome” through the centuries has also given rise to a situation that today appears to be rather singular and bizarre. 
While the overwhelming majority of scientists and epistemologists have fully identified themselves with this syndrome, nevertheless the history of the last three centuries analytically documents how all those who have reduced modern science to the scientific method have been in profound disagreement among themselves in specifying what this same scientific method consisted of. Moreover, since the seventeenth century, an empiricist image of science was rapidly established which certainly did not help to grasp the role of deductivism within the dynamics of growth of the technical-scientific heritage of modernity. But then, in the face of the inductivist empiricism that has often played a hegemonic role in relation to the image of scientific knowledge, there have also been different and contrasting epistemological traditions that have variously reduced the “scientific method” to conjecturalism, conventionalism, mere deductivism, idoneism, pragmatism, materialism, instrumentalism, empiriocriticism, mathematics and even spiritualism, according to the most varied (and even curious) theoretical interpretations. In the light of this composite tradition of Western epistemological thought it can thus be affirmed that all these epistemologists, while converging in the thesis that modern science is invariably reducible to its scientific method, nevertheless have constantly quarrelled and disputed in trying to define, once and for all, the precise nature of this mythical “scientific method”. There was no doubt that it existed, while it was highly questionable that it would be possible to identify the precise nature of this
75 Galilean Methodology and Abductive Inference
same “scientific method”. The critical discussion was so highly articulated and the positions differed so widely that today almost a different problem arises, namely, the following: how is it possible to avoid and escape from this singular epistemological stranglehold? An epistemological stranglehold that, in systematically reducing modern science to its method, is then unable to establish, with rigor, the precise nature of this scientific method itself. In order to try to emerge critically from this somewhat paradoxical epistemological outcome that on the whole distinguishes the last three centuries of our history, it is necessary to try to elaborate a different theoretical perspective by taking a different epistemological path. First by observing that an eminent scientist such as Galileo Galilei, the acknowledged founder of modern science, despite having published, in 1623, an important methodological work such as The Assayer (original title Il Saggiatore, see Galilei, 2016 and Galilei, 1968, vol. VI), yet never codified procedure in science in a rigid method defined once and for all. Why? Did he forget to do this? Or did Galileo, having discovered the “scientific method”, consciously wish to conceal his “secret”? Or, again, did he want to create a smokescreen to make the knowledge of this mythical “scientific method” more difficult? The paradox of these same questions should lead us to adopt a different theoretical perspective. Because precisely in The Assayer Galileo (as also emerges in other works of his) deliberately confined himself to indicating how scientific knowledge arises from the linking of two different and antinomic poles that he identified in “necessary demonstrations” and “sense experiences”. In other words, for Galileo, scientific knowledge always springs from the critical intertwining between the experimental dimension and the more properly theoretical dimension that found in mathematics its privileged heuristic instrument. 
But why did Galileo limit himself to providing this general indication without specifying, more analytically and in detail, as Descartes did, the specific nature of the scientific method? In my opinion, this crucial and decisive question can be answered by observing that Galileo was far too skillful and aware a scientific researcher to pursue the chimera of being able to freeze the scientific method into a single aseptic formula, valid for all areas of different scientific investigations. Moreover, it is also true that, in the course of his life as a researcher, he explored different fields of scientific investigation, so dealing with astronomy, physics, the problem of the floating of rigid bodies on water, anatomy, mathematics, geometry and many other scientific disciplines and problems. Certainly the still initial and often quite preliminary degree of many of these scientific researches undoubtedly enabled the Pisan scientist to deal with different and profoundly different fields of study, in keeping with a scientific practice that today the hyper-specialization achieved by the various disciplines would certainly make impossible. Today, specialization has gone so far that often two scientists who deal with similar but different problems have serious problems in understanding each other precisely because specialization in their specific and particular field of research requires precisely such a monomaniacal concentration that it prevents a scientist from being able to deal with anything else. On the other hand, this was not the condition of science at the beginning of modernity and then this wealth of different themes and different scientific
problems – which Galileo dealt with variously throughout his scientific life – must have helped him to be very cautious in claiming to be able to reduce the scientific “method” to a single and indivisible methodological version. If anything, precisely the experience of profoundly different physical and scientific fields such as those to which his scientific research was devoted must have made Galileo understand the complexity of reality that the scientist can study from different and even opposite points of view, as well as within different disciplinary fields. The extraordinary richness of his scientific experience must then have led him to assume a prejudicial critical caution in the name of which he finally understood clearly that, within every scientific inquiry, precisely to be able to achieve a better objective knowledge of a particular sector of reality coinciding with what one wishes to investigate and study, one must inevitably forge, autonomously, a specific and peculiar program of scientific research, as well as a specific method of investigation, a specific language, together with a series of specific conceptual categories, specific methods of verification and also falsification, so giving rise to a precise conceptual disciplinary tradition, etc. Now, precisely the full awareness of the interaction that can therefore be established between all these different elements within a particular disciplinary field must have finally persuaded him that there is no presumed and mythical “scientific method” to be applied automatically in every different disciplinary field. On the contrary, it seems instead that every single scientific discipline must always know how to develop, independently, its own specific and peculiar method of critical-heuristic investigation to study the world from its own particular point of view. 
Besides, in The Assayer, in introducing the famous fable of the study of sounds, commenting on the 21st passage of the Sarsi, Galileo wrote, and certainly not by chance, this important preliminary warning: Long experience has taught me this about the status of mankind with regard to matters requiring thought: the less people know and understand about them, the more positively they attempt to argue concerning them, while on the other hand to know and understand a multitude of things renders men cautious in passing judgment upon anything new. (Galilei, 1992, p. 14)
In this admirable Galilean position, however, it is not necessary to perceive, as has also been done by some interpreters, either an (uncritical) exaltation of the function of experience, or an overall skeptical outcome of his reflection, which would even open the doors to his acceptance of the famous “argument of Urban VIII”, emblematically (but also polemically) recalled, with undoubted rhetorical ability, at the end of the Dialogue Concerning the Two Chief World Systems. These two references, often mentioned by critics in commenting on Galileo’s fable of the study of sounds, are not in fact pertinent precisely because Galileo’s intention is, if anything, in profound harmony with the ancient Socratic teaching. On the basis of this, as is well known, the more we get to know a particular sector of experience (and of the world), the more we become aware of our profound ignorance. Seen in this perspective, knowledge and ignorance, if anything, constitute two sides of the same coin of the cognitive process proper to humanity, a complex process within which the human knowledge of the world progresses always haltingly, learning from its
mistakes, always having to be critically warned of the partiality of its results, which are always revisable and critically investigable. Certainly this Galilean-Socratic approach is very distant from what in the final pages of the first day of Dialogue, distinguishing between extensive knowledge and intensive knowledge, almost seems to “load” the feeble scientific knowledge achieved by humanity with the infinite weight typical of absoluteness (and unchangeability) traditionally attributed to divinity. However, if one bears in mind the complexity of the cultural (and civil) battle sustained by Galileo almost all through his life, then it will not be difficult to understand that these different stages of his reflection are also explained above all in the light of the different phases and various forms that his tenacious commitment in favor of science, of Copernicanism, and of his own courageous cultural program necessarily had to adopt, configuring different curvatures and also different accentuations of perspective, which last, among them, may not be all precisely congruent or lacking in obvious contradictions. But, in any case, the transition from the fable of the study of sounds evidently attests to the possibility of always being able to critically extend our knowledge of a particular reality, with the conclusion that it is always necessary to increase our own critical spirit in order not to fall into any undue dogmatization of the cognitive results which, however, have actually been achieved. In other words, in this passage Galileo shows that he has always clearly understood the open and never concluded character of scientific research. It was precisely this critical awareness that led him to explain the scientific method at the time, making a deliberately loose but also indispensable reference. 
A programmatic and deliberately open reference, through which Galileo reminds us all how our objective knowledge of the world can only be extended by always leveraging both the relative autonomy of our thinking, with which we can precisely devise the most fruitful assumptions to dissect a certain problem, which we must then subject to an equally rigorous experimental control, since it is precisely the world, as we have seen, that has the function of confirming, or falsifying, our most daring forecasts and also our dearest and most challenging theoretical assumptions.
Galileo: Detective or Scientist?
Hence Galileo did not wish to present a “recipe” of his own for the scientific method, precisely because he was an excellent scientist who ventured into different and quite disparate fields of scientific investigation, drawing from all these different and contrasting scientific experiences of studying the real and effective world a full critical awareness that the scientist, if he wishes to adhere in a profound and fruitful way to his object of study, must always be able to pursue the object of his investigations with all the intelligence and all the critical-hypothetical openness of mind that such research always implies. In short: in the case of Galileo (as had already happened with Euclid) there were no comfortable roads or royal paths already traced that would enable us to comfortably reach knowledge, since every cognitive conquest always implies a fully original and often completely new way
of being able to conceive one’s own scientific research in a hypothetically fruitful way. The difficulty of the human condition consists precisely in this: no one explains to humanity how the world is made, and for this very reason humanity can only proceed through abductions, hypotheses, trial and error, to try to trace a new fruitful path, in order to be able to critically extend its technical-cognitive patrimony. In the light of all these considerations (at the same time Kantian, Vailatian and Peircean), it is then easy to realize how the Galilean description of the scientific procedure is largely congruent with the interlinking between hypothesis, abduction, deduction and induction, and experimental control. Certainly in Peirce’s reflection the suggestion of the hypothesis assumed as the basis of the hypothetical theory occasionally seems to derive mainly from the inductive sphere, through an operation of retroduction, while in Galileo, Kant and Vailati the theoretical hypothesis ex suppositione seems to be more rooted in a free and creative conventional act of thinking. Nor can it be overlooked that Peirce’s ideas on abduction are always very articulate and adapt to a great variety of perspectives. On the other hand, it should not be forgotten that from the time of John Stuart Mill onwards, the term induction has come to have different meanings. It has also come to be understood as reasoning that can identify various causal relations and explanations. Peirce also distinguishes different types of induction. For example, by means of induction, inductive generalizations can be made so that we can express general laws. But with induction we can also subject certain hypotheses to a test in order to confirm or falsify them. When Peirce emphasizes the controlling role of abductive hypotheses through induction, he is evidently referring to eliminative and falsifying induction and not to induction as creative inference. 
One cannot therefore forget the articulation of all these different meanings that induction has come to play in different argumentative contexts. But, in any case, in Galileo’s work these two different perspective accentuations actually find a fruitful critical balance because in his work and his epistemological reflection on the way that science proceeds, an innovative epistemological model of scientific knowledge emerges, within which the role of hypothetical deduction is always and necessarily entwined with that of abduction, induction and experimentation itself (cf. Magnani, 2017). Within these complex and highly articulated dialectical ties between all these different components, however, it is still hypothetical deduction that operates in a fruitful and decisive way, as Giovanni Vailati rightly pointed out, again in his lecture on the history of mechanics, held at the University of Turin in 1898:
It could seem paradoxical to say that the power of deduction in this regard is such that through it we come to discover not only the more general and elementary properties that we study, but even get to force them to be reproduced in our minds as if the laws that regulate them and the properties they possess were far more simple and general than they actually are. Nevertheless, this is literally true. So, for example, the fact that there are no perfectly rigid bodies or absolutely incompressible fluids does not prevent the physicist from studying and determining which properties they should have if they were to exist, or by means of deduction come to analyze them, connect them, and recognize them as independent from one another, exactly as though they were properties of bodies which actually existed. In this way we obtain conclusions that are still applicable to bodies that are
not perfectly rigid or absolutely incompressible, provided that, of course, their lack of rigidity or incompressibility is not such so as to make the difference between their actual behaviour and that of their ideal fictitious models so great that it generates errors of inconvenience that are not compensated by the advantages provided by the performed simplification. Perfectly similar to this is the procedure that had to be followed by those who started the study of shapes and figures of bodies, emancipating it from any consideration of the other properties of the bodies themselves or the matter from which the figures were made. This simple process of abstraction that made at the same time both possible and necessary the applications of deduction to the research of properties of space, seems to us so simple and natural that we almost cannot conceive that it required any labor or intellectual effort. (Vailati, 2010, pp. 55–56, italics in the text)
For this reason, the scientist’s way of proceeding is not then comparable to that generally adopted by the policeman or by an investigator à la Sherlock Holmes. As was rightly pointed out by Massimo Bonfantini and Giampaolo Proni: a detective is a riddle-solver, not an interpreter of “opaque” facts. His art of abduction must thus belong to puzzle-solving, not to hermeneutics. Puzzle-solving, like detective work, calls for keen observation and encyclopaedic knowledge in order to have at one’s fingertips the finite and predetermined set of immediate and clue-fitting possible hypothetical solution. Then one needs training in logical calculation, coolness, and patience for comparing and selecting the hypotheses until one finds the line of interpretation supplying the only solution that fits all the clues. (Bonfantini & Proni, 1983, pp. 127–128, italics in the text)
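The puzzle-solving procedure that Bonfantini and Proni attribute to the detective can be sketched, very schematically, as the filtering of a finite and predetermined set of hypotheses against a list of clues. The following Python fragment is only an illustration under stated assumptions: the suspects, their attributes, the clues, and the function name solve_puzzle are all hypothetical and do not come from the authors quoted above.

```python
from typing import Callable, Dict, List

# A hypothesis is a dictionary of attributes; a clue is a test applied to it.
Hypothesis = Dict[str, object]
Clue = Callable[[Hypothesis], bool]


def solve_puzzle(hypotheses: List[Hypothesis], clues: List[Clue]) -> List[Hypothesis]:
    """Keep only the hypotheses from the given finite set that fit every clue."""
    return [h for h in hypotheses if all(clue(h) for clue in clues)]


# Hypothetical, invented data: the set of candidate solutions is fixed in advance.
hypotheses = [
    {"name": "butler", "left_handed": True, "had_key": True},
    {"name": "gardener", "left_handed": False, "had_key": True},
    {"name": "cook", "left_handed": True, "had_key": False},
]

# Hypothetical clues: each one eliminates the candidates that do not fit it.
clues = [
    lambda h: bool(h["left_handed"]),
    lambda h: bool(h["had_key"]),
]

survivors = solve_puzzle(hypotheses, clues)
print([h["name"] for h in survivors])
```

The sketch can only select among candidates it has been given; it never produces a hypothesis that was not already in the list. This is the limitation the passage above ascribes to the detective as riddle-solver, in contrast with the scientist, who must first devise the hypothesis to be tested.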
Conclusions
For this basic reason, Holmes’s rigor is rooted, first of all, in an “imperative of simplicity and plausibility according to logical and empirical criteria firmly accepted by society” and, secondly, it “obeys a complementary ban ‘never guess!’” On the contrary, a scientist (but also a brilliant investigator like Poirot) must have the ability to formulate bold hypotheses, precisely because he must always be able to critically probe the opacity and passivity of the world, using the light of his intelligence and rational understanding of reality. For this reason Poirot always alerts his interlocutors by warning them that “the accepted version of certain facts is not necessarily the true one”. For what reason? Because, to put it again with Poirot, “you can really see only with the mind’s eye.” In this perspective, the facts may also be in the public domain, but not the interpretations: “[T]here are many ways of regarding, for instance, a historical fact. Take an example: many books have been written on your Mary Queen of Scots, representing her as a martyr, as an unprincipled and wanton woman, as a rather simpleminded saint, as a murderess and an intriguer, or again as a victim of circumstance and fate. One can take one’s choice” (Christie, 1943, p. 82). But in all these cases the scientist, like Poirot, but not Sherlock Holmes, is precisely the one who finally manages to explain the various facts by constructing an explanation capable of presenting his reasons together with a demonstration. These reasons will never be definitive and ultimate, precisely because ours is a knowledge that can always be explored and critically reviewed. It is not a question of knowledge capable of being valid for eternity because it is human knowledge made by mortal men, born to die (cf. also Magnani, 2017).
References
Agazzi, E. (2014). Scientific Objectivity and Its Contexts. Springer.
Bonfantini, M. A., & Proni, G. (1983). To Guess or not to Guess? In Eco & Sebeok (1983), pp. 119–134.
Christie, A. (1943). Five Little Pigs. Collins.
Descartes, R. (1987). Discours de la méthode. Texte et commentaire par Étienne Gilson, Sixième édition. Librairie Philosophique J. Vrin.
Descartes, R. (1996). Oeuvres de Descartes. Publiées par Charles Adam & Paul Tannery. Librairie Philosophique J. Vrin, 11 vols.
Eco, U., & Sebeok, T. A. (1983). The Sign of Three. Dupin, Holmes, Peirce. Indiana University Press, Bloomington and Indianapolis.
Feyerabend, P. K. (1975). Against method: Outline of an anarchistic theory of knowledge. NLB.
Galilei, G. (1962). Dialogue Concerning the Two Chief World Systems – Ptolemaic & Copernican (S. Drake, Trans., A. Einstein, Foreword). University of California Press.
Galilei, G. (1968). Le opere di Galileo Galilei, Edizione Nazionale (A. Favaro, Ed.). G. Barbèra Editore, 20 vols.
Galilei, G. (1992). Il Saggiatore. With an introduction and edited by Libero Sosio, Feltrinelli.
Galilei, G. (2016). The Assayer. Translated from the Italian by Stillman Drake. University of Pennsylvania. (1st ed., 1957).
Geymonat, L. (1970–76). Storia del pensiero filosofico e scientifico. Garzanti, 7 vols.
Husserl, E. (1928). Logische Untersuchungen. Max Niemeyer, 3 vols.
Kant, I. (1786). Metaphysische Anfangsgründe der Naturwissenschaft. Translated and edited by Michael Friedman. Cambridge University Press.
Magnani, L. (2017). The Abductive Structure of Scientific Creativity. An Essay on the Ecology of Cognition. Springer.
Mason, S. F. (1962). A History of the Sciences (New Rev. ed.). Macmillan Publishing Company.
Minazzi, F. (1994a). Galileo “filosofo geometra”. Rusconi.
Minazzi, F. (1994b). Il flauto di Popper. Franco Angeli.
Minazzi, F. (2011). Giovanni Vailati epistemologo. Mimesis.
Minazzi, F. (2022). Historical epistemology and European philosophy of science. Rethinking critical rationalism and transcendentalism. Springer. (in print).
Newton, I. (1728). De mundi systemate. Impensis J. Tonson, J. Osborn & T. Longman, Londini.
Peirce, C. S. (1960–1966). Collected Papers (A. W. Burks, Ed.). The Belknap Press of Harvard University Press, 8 vols.
Pera, M. (1991). Scienza e retorica. Laterza. (English translation by Clarissa Botsford, The Discourses of Sciences. University of Chicago Press, 1994).
Pollok, K. (2001). Kant “Metaphysische Anfangsgründe der Naturwissenschaft”. Ein Kritischer Kommentar, Felix Meiner Verlag.
Russell, B. (1931). The Scientific Outlook. George Allen & Unwin.
Singer, C. (1959). A Short History of Scientific Ideas to 1900. Clarendon Press.
Vailati, G. (1911). Scritti di G. Vailati (1863–1909) (M. Calderoni, U. Ricci, & G. Vacca, Eds.). Johann Ambrosius Barth Verlagsbuchhandlung - Successori B. Seeber Librai-Editori.
Vailati, G. (2010). Logic and Pragmatism. Selected essays (C. Arrighi, P. Cantù, M. De Zan, & P. Suppes, Eds.). CSLI Publications.
Abduction as Phylogenetic Inference: Epistemological Perspectives in Scientific Practices
76
Elizabeth Martínez-Bautista
Contents
Introduction