English Pages 180 Year 2020
Proceedings of the Canadian Society for History and Philosophy of Mathematics Société canadienne d’histoire et de philosophie des mathématiques
Maria Zack Dirk Schlimm Editors
Research in History and Philosophy of Mathematics The CSHPM 2018 Volume
Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques Series Editors Maria Zack, Point Loma Nazarene University, San Diego, CA, USA Dirk Schlimm, McGill University, Montreal, QC, Canada
More information about this series at http://www.springer.com/series/13877
Editors Maria Zack Point Loma Nazarene University San Diego, California, USA
Dirk Schlimm McGill University Montreal, Québec, Canada
ISSN 2366-3308 ISSN 2366-3316 (electronic) Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques ISBN 978-3-030-31196-4 ISBN 978-3-030-31298-5 (eBook) https://doi.org/10.1007/978-3-030-31298-5 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Editorial Board
The editors wish to thank the following people who served on the editorial board for this volume:
Amy Ackerberg-Hastings, Independent Scholar
Eisso Atzema, University of Maine Orono
Christopher Baltus, State University New York College of Oswego
Janet Heine Barnett, Colorado State University-Pueblo
Moritz Bodner, McGill University
Daniel Curtin, Northern Kentucky University
David DeVidi, University of Waterloo
Craig Fraser, University of Toronto
David Gaber, McGill University
Jean-Pierre Marquis, Université de Montréal
Duncan Melville, St. Lawrence University
Julien Ouellette-Michaud, McGill University
Dirk Schlimm, McGill University
James Tattersall, Providence College
Valérie Therrien, McGill University
Glen Van Brummelen, Quest University
Maria Zack, Point Loma Nazarene University
Preface
This volume contains ten papers that provide some interesting insights into contemporary scholarship in the history and philosophy of mathematics. The Canadian Society for History and Philosophy of Mathematics has compiled this research. The volume begins with V. Frederick Rickey’s “Professor Bolesław Sobociński and Logic at Notre Dame.” Sobociński was a Polish mathematician who was active in the Polish underground in World War II. He eventually escaped to Brussels and immigrated to the United States, where he joined the faculty of the University of Notre Dame. At Notre Dame, Sobociński started a thriving logic program and founded the Notre Dame Journal of Formal Logic. Rickey’s paper provides some fascinating glimpses into Sobociński’s life and work.
From this beginning, the volume moves on to a collection of papers about twentieth-century philosophy of mathematics. In “Fred Sommers’ Notations for Aristotelian Logic,” Daniel Lovsted discusses Fred Sommers’ (1923–2014) creation of a formal system for Aristotelian logic, which Sommers called Traditional Formal Logic (TFL). In this paper, Lovsted uses TFL’s early development as a valuable case study of the complexity which underlies notational decisions. In “L’équivalence duale de catégories: A Third Way of Analogy?” Aurélien Jarry uses the work of Alexander Grothendieck (1928–2014) as a starting point for discussing analogies between commutative algebra and algebraic geometry. Jarry looks at explaining analogy via the preservation/projection of structure (structure-mapping theory), in terms of common laws or axioms (the axiomatic approach), and via equivalence (in the technical sense of category theory).
José Antonio Pérez-Escobar continues the scholarship of the philosophy of mathematics with “Mathematical Modelling and Teleology in Biology.” This paper discusses the notion that the mathematization of biology is creating a process in biology that puts it in line with the standards of rigor of the physical sciences.
Pérez-Escobar challenges this idea by examining how teleological notions, which are common in biology, coexist and interact with modeling techniques in a very idiosyncratic scientific practice that does not exist in the physical sciences.
In “Arithmetic, Culture, and Attention,” Jean-Charles Pelland discusses the study of numerical cognition, which has accumulated scores of data on cognitive systems that could be involved in the uniquely human ability to practice formal arithmetic. An externalist point of view holds that our interaction with external supports for cognition, like fingers, numerals, and number words, explains what allows us to go beyond the size and precision limitations of the cognitive systems we are born with. This paper challenges the externalist answer to the origins of our arithmetical skills and argues in favor of an internalist approach to the development of formal arithmetical skills.
The section on philosophy closes with Gregory Lavers’ provocatively titled “Did Frege Solve One of Zeno’s Paradoxes?” Of Zeno’s book of forty paradoxes, it was the first that attracted Socrates’ attention. This is the paradox of the like and the unlike. Contemporary assessments of this paradox indicate that it is Zeno’s weakest surviving paradox. All of these assessments, however, rely heavily on reconstructions of the paradox. It is only relative to these reconstructions that there is nothing paradoxical involved. In this paper Lavers puts forward and defends a novel interpretation of this paradox, according to which the concept of a unit plays a central role. If this interpretation is correct, then the paradox that Zeno presented was the same as one discussed and solved in Gottlob Frege’s (1848–1925) Grundlagen der Arithmetik.
The volume continues with three papers on nineteenth-century history of mathematics. First, in “Charles Davies as a Philosopher of Mathematics Education,” Amy Ackerberg-Hastings examines Charles Davies’ (1798–1876) book The Logic and Utility of Mathematics, With the Best Methods of Instruction Explained and Illustrated (1850). This text has been called the “first American book on mathematics teaching methods,” and Ackerberg-Hastings provides detailed insights into this book and its historical significance.
In “Gauss et le modèle du champ magnétique terrestre,” Roger Godard and John de Boer discuss “Allgemeine Theorie des Erdmagnetismus,” Carl Friedrich Gauss’ (1777–1855) famous article on the modeling of the terrestrial magnetic field. Benefiting from earlier scientific knowledge of gravitational theory, Gauss assumed that the earth is surrounded by a magnetic potential which obeys the Laplace equation, and he solved this equation in spherical coordinates. In order to do this work, Gauss needed data from terrestrial magnetic observatories. This paper gives a brief history of magnetic observations and examines the validity of Gauss’ approach and his results.
In “A Gaussian Tale for the Classroom: Lemniscates, Arithmetic-Geometric Means, and More,” Janet Heine Barnett examines some of Gauss’ work on the lemniscate. Barnett argues that Gauss’ path to these discoveries is an example of the powerful role which analogy and numerical experimentation can play within mathematics, and one well worth sharing with today’s students. This paper describes a set of three “mini-primary source projects” based on excerpts from Gauss’ mathematical diary and related manuscripts, which are designed to tell that tale while also serving to consolidate student proficiency with several standard topics studied in first-year calculus courses.
The volume closes with Christopher Baltus’ “Philippe de la Hire: Was He Desargues’ Schüler?” Philippe de la Hire (1640–1718) was the third of the seventeenth-century pioneers of projective geometry, after Girard Desargues (1591–1661) and Blaise Pascal (1623–1662). Little is known about La Hire beyond what he tells us in his various published works and what Bernard de Fontenelle reported in his Eloge, issued soon after La Hire’s death. It has been claimed that Desargues’ work strongly influenced La Hire’s Nouvelle Méthode en Géométrie pour les Sections des Superficies coniques et Cylindriques (1673). Baltus compares the work of Desargues and La Hire and shows that Desargues’ influence was minimal.
This collection of papers contains several gems from the history and philosophy of mathematics, which will be enjoyed by a wide mathematical audience. This collection was a pleasure to assemble and contains something of interest for everyone.
Maria Zack, San Diego, CA, USA
Dirk Schlimm, Montreal, QC, Canada
Contents
Professor Bolesław Sobociński and Logic at Notre Dame
  V. Frederick Rickey
Fred Sommers’ Notations for Aristotelian Logic
  Daniel Lovsted
L’équivalence duale de catégories: A Third Way of Analogy?
  Aurélien Jarry
Mathematical Modelling and Teleology in Biology
  José Antonio Pérez-Escobar
Arithmetic, Culture, and Attention
  Jean-Charles Pelland
Did Frege Solve One of Zeno’s Paradoxes?
  Gregory Lavers
Charles Davies as a Philosopher of Mathematics Education
  Amy Ackerberg-Hastings
Gauss et le modèle du champ magnétique terrestre
  Roger Godard and John de Boer
A Gaussian Tale for the Classroom: Lemniscates, Arithmetic-Geometric Means, and More
  Janet Heine Barnett
Philippe de la Hire: Was He Desargues’ Schüler?
  Christopher Baltus
Contributors
Amy Ackerberg-Hastings, Independent Scholar, Rockville, MD, USA
Christopher Baltus, SUNY Oswego, Oswego, NY, USA
Janet Heine Barnett, Colorado State University-Pueblo, Pueblo, CO, USA
John de Boer, Royal Military College of Canada, Kingston, ON, Canada
Roger Godard, Royal Military College of Canada, Kingston, ON, Canada
Aurélien Jarry, Bergische Universität Wuppertal, Wuppertal, Germany
Gregory Lavers, Concordia University, Montreal, QC, Canada
Daniel Lovsted, McGill University, Montreal, QC, Canada
Jean-Charles Pelland, Université du Québec à Montréal, Montreal, QC, Canada
José Antonio Pérez-Escobar, ETH Zurich, Zurich, Switzerland
V. Frederick Rickey, United States Military Academy, West Point, NY, USA
Professor Bolesław Sobociński and Logic at Notre Dame
V. Frederick Rickey
Abstract Bolesław Sobociński (1906–1980) received his Ph.D. in 1936 under the direction of Jan Łukasiewicz (1878–1956) and then served as assistant to Stanisław Leśniewski (1886–1939). This close contact with the two founders of the Warsaw School of Logic determined the course of his research. Active in the Polish underground during WW II, he escaped to Brussels, where he worked for several years, and then emigrated to the USA. After a few years in St. Paul, MN, he joined the faculty at the University of Notre Dame, where he started a thriving logic program, directed students to their Ph.D.s, and founded the Notre Dame Journal of Formal Logic, which he edited for 19 years. We will discuss his interesting life and make some remarks about his contributions to logic.
1 Sobociński’s Early Life
Information about Sobociński’s early life and how he escaped from Poland is very sparse.1 In 1957, he prepared a curriculum vitae for a proposal to the Ford Foundation to edit a volume of English translations of Leśniewski’s works (which came to nought). It provides a good introduction to his early life and to his view of what he did during WW II, so I quote it in full:
I was born on June 28, 1906. I received my high school certificate in 1926 and in the same year entered Warsaw University in Poland. Primary emphasis in all my university work was placed on Symbolic Logic and the Foundations of Mathematics. In 1930 I received a Master’s degree in Philosophy and in 1936 the degree of Doctor of Philosophy. In 1938 Warsaw University granted me the status of Lecturer in Logic with the title of Privat Dozent. From 1934 I was assistant and from 1937 senior assistant at the chair of the Philosophy of Mathematics of Prof. S. Lesniewski at Warsaw University. During World War II, which I spent in my native Poland, I conducted courses in philosophy, logic and foundations of mathematics in the underground Warsaw University. In 1945 I accepted an invitation from the University of Lodz to become a Professor of the Theory of Deductive Sciences. Late in 1945 I was warned that because of my anticommunist activities I was wanted by Russian and Polish communist secret police. It was necessary to go into hiding and in 1946 I succeeded in escaping from Poland into Belgium. I lived there from 1946–1949 and was employed in the Polish Scientific Institute in Brussels. In December 1949 I emigrated to the U.S.A. and from the beginning of 1951 I have been associated with the Institute of Applied Logic (Minneapolis, Minn.) in connection with various research activities. My work for the Institute has included the development of interpretations of logical systems in terms of computing machine design. From the autumn of 1956 I have been lecturing in symbolic logic at Notre Dame University. I became a citizen of U.S.A. in 1955. South Bend, July 20, 1957.
1 This note was prompted by a biographical article entitled “Bolesław Sobociński: The Ace of the Second Generation of the LWS,” by Kordula Świętorzecka, pp. 599–613 in The Lvov-Warsaw School. Past and Present, Birkhäuser (2018), edited by Ángel Garrido and Urszula Wybraniec-Skardowska. It treats Sobociński’s life and work in Poland, but says little about his time after that. Consequently, this note deals primarily with his life in the USA. Full Disclosure: I attended Sobociński’s lectures on Leśniewski’s logical systems from the spring of 1962 until I received my Ph.D. under his direction in 1968 for a dissertation on An Axiomatic Theory of Syntax. I then kept in close contact with him until his death. Since he died 39 years ago in 1980, I may escape the curse of writing about recent history. Also, what is contained below is based on documents. Even most of my personal comments were recorded before 1983.
V. F. Rickey, United States Military Academy, West Point, NY, USA
© Springer Nature Switzerland AG 2020. M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_1
In this short autobiographical note many things were omitted, so we supply what we can. Born in St. Petersburg, Russia, Sobociński was the only child of Waleria and Antoni Sobociński, both of whom were Polish. He was educated at home by his father, an engineer, and by private tutors before he attended the Catholic Gymnasium of Saint Catherine of Alexandria Church in St. Petersburg from 1916 to 1918. His parents withdrew him when the gymnasium was nationalized and a Bolshevik curriculum was introduced. Partly because of this experience he held strong anticommunist views throughout his life. The family moved to Warsaw in October 1922, where Sobociński took the Academic Matriculation Courses and passed the matriculation exam as an external student on 23 February 1926.2 In March of 1926 he entered the University of Warsaw, where he studied philosophy and mathematics with Tadeusz Kotarbiński, Stanisław Leśniewski, Jan Łukasiewicz, Władysław Tatarkiewicz, and Władysław Witwicki. As a student in the Faculty of Humanities from 1926 to 1930, he received his Magisterium (M.A. degree) on June 30, 1930 for work on the theory of deduction, giving three new axioms for the equivalential calculus.3 He then transferred to the Faculty of Mathematics and Natural Science, receiving his Ph.D. on June 30, 1936 eximia cum laude, for work on multi-valued propositional calculi.4 He was examined by Leśniewski, Łukasiewicz, and Sierpiński. Both degrees were under the direction of the eminent Polish logician Jan Łukasiewicz (1878–1956).5 From 1934 he was assistant and from 1937 senior assistant to Stanisław Leśniewski (1886–1939), who held the chair of Philosophy of Mathematics at Warsaw University. At this time the University of Warsaw had two (full) professors of logic, more than any other school in the world. This close contact with the two founders of the Warsaw Logic School determined the course of his research.
In 1939 he received a prize from the Faculty of Sciences at Warsaw University for his paper Issues about A. Tarski’s weakest basis of the theory of deduction.6 He passed the exam for his Venia Legendi (Dozent of Logic) in 1939, but war broke out before it could be officially approved by the Ministry of Religious Affairs and Public Education (MWRiOP). This paper on Protothetic was to appear in the new journal Collectanea Logica, but the printing house in Warsaw, Piotr Pyz i S-ka, was bombed in September 1939 and the plates of the first issue were destroyed. All that survived were a few offprints which had been sent out for review.7 One went to Heinrich Scholz in Münster, who reviewed it in the Zentralblatt. It is hard to believe, but another of those offprints, inscribed to his future wife, survives in Sobociński’s Nachlaß:
P. Ewie Wrześniewskiej od autora 19 20/XII 32 r.
2 Internetowy Polski Słownik Biograficzny. Thanks to Roman Sznajder for an English translation.
3 “Z badań nad teorją dedukcji” (Investigations into the theory of deduction), Przegląd Filozoficzny, 35 (1932), 171–193. No English translation has been made.
4 “Aksjomatyzacja pewnych wielowartościowych systemów teorji dedukcji,” (Axiomatization of certain many-valued systems of the calculus of propositions), Roczniki prac Naukowych Zrzeszenia Asystentów Uniwersytetu Józefa Piłsudskiego w Warszawie, Vol. 1, Wydział matematyczno-przyrodniczy Nr. 1, Warszawa, 1936, pp. 399–419.
5 Sobociński once remarked to me that Alfred Tarski suggested that he ask Leśniewski to direct his dissertation. However, he had asked Łukasiewicz to advise him just the day before.
6 Zagadnienia A. Tarskiego o najsłabszej bazie teorii dedukcji [Noted in Krzysztof Tatarkiewicz (1923–2011), “Profesor Sobociński i kolega Bum” (Professor Sobociński and colleague Boom), Wiadomości Matematyczne, 34 (1998), 123–146]. Sobociński received the nickname “Boom” (“Bum” in Polish) when he calmly remarked during a bombardment “What a great boom” [Świętorzecka, p. 5].
7 “Z badań nad prototetyką” (An investigation of protothetic), Collectanea Logica 1 (1939), 171–177. Protothetic is the most basic of Leśniewski’s three logical systems, being the strongest possible extension of the propositional calculus. The term “pro venia legendi” is Latin for a “petition for permission to read,” i.e., to lecture. The German term is Habilitation. He was examined by Leśniewski, Łukasiewicz, Mazurkiewicz, and after Leśniewski died, by the physicist Czesław Białobrzeski. For details see the 21 page introduction to Sobociński’s reconstruction (likely from memory) and translation of the Polish original: “An investigation of protothetic,” Cahiers de l’Institut d’Études Polonaises en Belgique, no. 5. Polycopié. Brussels 1949, v + 44 pp. The technical portions of this paper have been printed in English translation in Storrs McCall, Polish Logic, 1920–1939, but I do not believe that this discussion of the fate of the journal has ever been reprinted.
This shows that Sobociński met his future wife as early as 1932. Leśniewski, who had been a chain smoker for decades, contracted thyroid cancer, and died on 13 May 1939 at the age of 53, just a few months before Germany invaded Poland on September 1. His voluminous papers were entrusted to his most knowledgeable and loyal student, Sobociński. Although Leśniewski had published 750 pages of his results, his logical systems were little known. Part of the reason, Sobociński wrote, was that Leśniewski “presented the results of his considerations in such an exact, formal and at the same time laconic form that it is almost impossible to understand it without an extensive introduction. Such an extensive introduction and commentary has been prepared by me, unfortunately however it has been destroyed during the Warsaw rising in 1944.”8 The earliest description of Sobociński comes from Czesław Lejewski:
Generally speaking my knowledge of Sobociński’s life and editorial activities before 1939 is very very limited. I think I met him no more than about three years earlier, i.e. in 1936, but I knew him by sight since my first year at the University of Warsaw as an undergraduate (1931). It was his appearance and a way of dressing that attracted one’s attention. He could easily be singled out in a crowd of students. He always looked lean and hungry, and had very pale, almost unhealthy complexion. He used to wear striped trousers, a black jacket, and a black bowler hat. He used to carry a bulging brief case and an umbrella or a stick. If he were bodily transferred from Warsaw to the City of London as it was at the times of Dickens, he would be indistinguishable from a low paid clerk in a bank or an accountant’s office. [Lejewski9 to Rickey, 25.9.1981]
2 Surviving WW II
Before WW II, Sobociński was a member of the National Radical Camp (ONR = Obóz Narodowo Radykalny), but he did not play a major role in it. Sobociński wrote:
The first year of the Second World War he spent with relatives in Lithuania, and after his return to Warsaw he took an active part in Polish underground organizations, fighting both German occupation and communism.10
Świętorzecka provides more information about his activities early in the war:
At the beginning of the war (around 6.09.1939), Sobociński fled from Warsaw and headed to the estate of W. Tatarkiewicz’s mother-in-law, Ksawera Potworowska, located near Lublin (Radoryż). There, he met Krystof Tatarkiewicz, who was sixteen at the time. Sobociński planned to reach the estate of the Skirmunt family11 (at the time in Poland, now Belarus), but he was stopped (by a “gang of peasants” – as he described the attackers) and taken to a town nearby (Motol).12 Sobociński kept his acquaintance with the Skirmunts secret (the whole family was murdered in 1939 by the Bolsheviks); in this way, he saved his life and was able to leave Motol. He traveled to Vilnius first (at the time under Lithuanian). There he gave a lecture entitled “O prototetyce prof. Leśniewskiego” (“On Professor Leśniewski’s Protothetics”) at a meeting of the Vilnius Philosophical Society (18.10.1939). He went to the estate belonging to his maternal relative, Karol Parczewski (1875–1957), where he stayed from December 1939 to mid-1941 (Stończe, Lithuania, at that time was not occupied by the Soviets). In 1941, he returned to Warsaw.13
8 From the typescript of a lecture given by Sobociński at the College of Saint Thomas in the Spring of 1950. Copy in Sobociński’s Nachlaß at Notre Dame.
9 Czesław Lejewski (1913–2001), of whom we will hear more below, studied with Leśniewski before the war. Sobociński’s attire, adding only a white shirt and black bow tie, is confirmed by Tatarkiewicz, “Bum”, p. 128.
10 This third person undated autobiographical note in his Nachlaß bears Sobociński’s typed signature.
During the war (precisely when is uncertain), Łukasiewicz continued the seminar he began in 1936 on Aristotle’s logic. The participants included Henry Hiż, Jan Salamucha, Jerzy Słupecki, and Bolesław Sobociński.14 In late 1943, Łukasiewicz withdrew from this secret teaching (as he feared for his life) and then, on July 17, 1944, departed Warsaw for Münster, arriving the next day.15 Sobociński took over. Krzysztof Tatarkiewicz, who attended these lectures, described Sobociński’s teaching style:
They were completely different than lectures by J. Łukasiewicz – perfect in terms of teaching. My notes from his lectures resembled the pages of Principia Mathematica by B. Russell and A. N. Whitehead, but they contained more than theses and their purely formal proofs; during the lecture there were also comments: both intuitive (explaining what these patterns really mean), as well as metalogical, methodological or philosophical.16
When Sobociński returned to Warsaw he became much more involved in the underground resistance. The ONR changed names several times and Sobociński’s involvement increased. Towards the end of the war, using the nom de guerre Rawicz, Sobociński was head of the National Political Department (IV A) in the Central Intelligence Service of the National Armed Forces (NSZ), which comprised some 75,000 combatants. Leopold Okulicki, Commander in Chief of the Armia Krajowa, was directed to accept a Soviet invitation to negotiations concerning Poland’s future. He went to those talks on March 27, 1945 fully aware that he might be walking into a trap. And he was right. Since Sobociński held a high position in the NSZ he was included in the “invitees.” He was late for the meeting and when he got close, he noticed a number of Soviet trucks nearby. Suspicious of the Soviets, he turned back. Thus the Trial of the Sixteen was not the Trial of the Seventeen.
Because of his resistance to both the Soviets and the Nazis, Sobociński was condemned to death in absentia nine times. So it was time to leave Poland. “In September 1946, he secretly left Poland, together with Ewa Wrześniewska. They traveled to Regensburg via Katowice and they eventually married in Germany” [Świętorzecka, p. 601]. On this trip they would have crossed Czechoslovakia.17 A footnote by Świętorzecka indicates that “At Łukasiewicz’s request, their travel was facilitated by Zbigniew Jordan (1911–1977).”18
While Sobociński was teaching modal logic in the fall of 1971, a student asked for an example of an impossible proposition. Sobociński responded that to walk 50 km in one night would be impossible. Then he paused and said that he had done that, but to walk 100 km in one night would be impossible. After Sobociński’s death, his wife told me that on their way out of Poland she was very nervous as he kept stopping in the small towns to read the inscriptions on the monuments. Part of the journey was by train and, being accustomed to traveling with a fake passport, he calmly put his briefcase on his lap and started writing. She asked what he was doing and his answer was telling: Logic.
11 This could be Raman Skirmunt (1868–1939), but his Wikipedia page indicates that “In 1939, upon annexation of West Belarus by the USSR, Raman Skirmunt was killed by some local people.” He was born in the village Parechcha near Pinsk, where the Parecca manor, Raman Skirmunt’s estate, is.
12 A footnote, with references, indicates that this information is from a letter of Sobociński to Bocheński.
13 More information about the whereabouts of Sobociński during this period is in Krzysztof Tatarkiewicz (1923–2011), “Logik i Polityk (Bolesław Sobociński),” Matematycy Polskiego Pochodzenia na Obczyźnie, 1998, pp. 168–169.
14 Jacek Jadacki, Polish Analytical Philosophy (2009), p. 271.
15 Schmidt am Busch and Wehmeier, “On the relations between Heinrich Scholz and Jan Łukasiewicz,” History and Philosophy of Logic, 28 (February 2007), 67–81, p. 78. See also, V. Frederick Rickey, “Polish logic from Warsaw to Dublin: The life and work of Jan Łukasiewicz,” CSHPM Proceedings, volume 24 (2001), 93–109.
16 Tatarkiewicz, “Bum,” 1998. In the autobiographical note at the beginning of this paper, Sobociński confirms that during the war he “conducted courses in philosophy, logic and foundations of mathematics in the underground Warsaw University.” In a letter to Henry Hiż dated 16 September 1981, I wrote: “Mrs. Sobociński said that he did teach in the underground university. This is also on his vita. Was the organization big enough and diverse enough so that people did not know what others were doing? I’m asking as I am puzzled by your remark that Sobociński wasn’t in the underground university.”
3 Brussels In October 1945, Łukasiewicz secured a temporary position to lecture on logic at the provisionally founded Polish Scientific Institute. It is not clear how he obtained this position or how he got to Brussels, but the individual in the background was,
17 Thomas
A. Sudkamp, Soboci´nski’s last Ph.D. student (1978) wrote “One of my big regrets is that I didn’t have a chance to learn more of his personal life. He was very reluctant to discuss it. Only one or two times did I get him to open up about his adventures during world war II, hiding out in Checkoslovakia (sp?).” [Sudkamp to Rickey, September 28, 1981] 18 Sayre notes, note 13, p. 337, that “According to rumors circulated by subsequent students at ND (my source here is Charles Quinn), Sobocinski brought valuable papers with him in a briefcase acquired from an SS officer he had killed during clandestine operations. Imagine how logic by day might have mixed with homicide by night.” I have never heard this rumor and find it hard to believe. Why would he be carrying secret papers while escaping? The comment about “logic by day” is Sayre’s imaginative hyperbole.
Professor Bolesław Soboci´nski and Logic at Notre Dame
7
possibly, Chaïm Perelman (1912–1984), a Polish Jew whose family had moved to Brussels when he was 12, who went to Warsaw in 1937 where he studied for the year with Łukasiewicz, who was Rector at the time, and Kotarbi´nski. He received his Ph.D. from Kotarbi´nski, writing on Frege’s metaphysics. Possibly Robert Feys was also involved in getting Łukasiewicz to Brussels.19 This is also likely how Soboci´nski got to Brussels, arriving a year after Łukasiewicz and probably with help from him. In the autumn of 1946, Łukasiewicz secured a position at the Irish Royal Academy in Dublin, where he lived the remainder of his life.20 Soboci´nski was a research member of the Polish Scientific Institute in Belgium (L’Institut d’Études Polonaises in Belgium), 1946–1949.21 While in Brussels, Soboci´nski again started to reconstruct Le´sniewski’s systems from memory. This is clear from several notebooks that survive in his Nachlaß which contain a careful development of the elementary portions of Le´sniewski’s Ontology, the first of which is dated “Bruksella, 5.II.1947.” There is a notebook containing deductions in Mereology as well as a notebook that treats three topics: Some examples of advanced definitions in Ontology, a section on Carnap’s Abriss der Logistik (1929), and deductions dealing with Le´sniewski’s analysis of the Russell antinomy. The last page is dated Bruksella, 10.IX.1948. The deductions here follow the early deductions in Soboci´nski’s paper “L’Analyse de l’antinomie Russellienne par Le´sniewski,” the first part of which was submitted on 15 December 1948.22 While in Belgium Soboci´nski gave lectures that Henry Hi˙z attended. When Soboci´nski visited Łukasiewicz in Dublin in 1947 he mentioned a thesis, Cδδ0δp, which was already known to Le´sniewski as a curiosity (here δ is a variable functor). This generated papers by Łukasiewicz and “Mr. C. A. Meredith who has attended my lectures on Mathematical Logic at the Royal Irish Academy since 1946.”23
19 Feys (1889–1961) was a Belgian logician who worked on modal logic. He was one of the founders of the journal Logique et Analyse. Sobociński cites a book about logical symbolism, Logistiek, geformaliseerde logica (1944), by Robert Feys in his 1949 republication of “An investigation of Protothetic.” Whether Sobociński knew him before he arrived in Belgium is unknown. See Louis De Raeymaeker, “In memoriam le chanoine Robert Feys,” Revue Philosophique de Louvain, vol. 59, no. 62 (1961), pp. 371–374. Feys was later invited to be a Visiting Professor at Notre Dame, but I don’t believe this came to fruition.
20 See Sobociński, “In memoriam Jan Łukasiewicz (1878–1956),” Philosophical Studies, VI, December 1956, 3–49, Maynooth Ireland, 1956.
21 “Boleslaw Sobocinski,” The Polish Review, Vol. 27, No. 1/2 (1982), pp. 183–225.
22 In Sobociński’s Nachlaß there are deductions, but no text, and there are differences in the deductions, so this will require further study. The paper appeared in the first two volumes (1949–1950) of the Italian journal Methodos. The reason the paper appeared in this new journal was that Bocheński was the Editor of the Logical Section. There is an English translation by Robert E. Clay, entitled “Leśniewski’s analysis of Russell’s paradox,” in Leśniewski’s Systems: Ontology and Mereology (1984).
23 Łukasiewicz, “On variable functors of propositional arguments,” Proceedings of the Royal Irish Academy, vol. 54, section A, no. 2 (January 1951), pp. 24–35. Reprinted in Jan Łukasiewicz: Selected Works, 1970.
V. F. Rickey
From the passports, note that the Sobocińskis visited Bocheński in Fribourg. In conformity with how Lejewski described Sobociński earlier, his face is rather thin. When I met Andrzej Mostowski in the summer of 1966 at the Séminaire de Mathématiques Supérieures at the Université de Montréal, he remarked “I hear that Sobociński is growing stout.” He also remarked that after he escaped being sent to a camp after the Warsaw Uprising, he heard a voice call out “Hello Professor Mostowski.” It was Sobociński. They talked for some time and then each went on his way.
On 21 April 1949, E. W. Beth sent Sobociński a confidential circular (not a personal letter) on North Holland Publishing Company stationery inviting him to contribute a monograph to the Studies in Logic series which was just beginning. This project was conceived at the Tenth International Congress of Philosophy, which was held in Amsterdam, 11–18 August 1948. The monographs were to be in English, French, or German and to consist of 50 to 75 pages. A circular which was distributed later lists Sobociński as the author of a monograph on the Logic of Propositions. A letter from Beth to Sobociński dated 1 July 1949 indicates that Sobociński replied on 18 May, offering to write a book on Méréologie. On 22 September 1949, M. D. Frank, the Managing Director of the series, wrote Sobociński asking “if you really
are moving to America,” for then royalties would need to be paid in “American currency.” The next letter, dated 13 January 1950, and addressed to Sobociński at the College of St. Thomas in St. Paul, Minnesota, contained a contract for Sobociński to sign. There are notes on propositional calculi in his Nachlaß which Sobociński prepared for this work. For several years while at Notre Dame, Sobociński listed these books as “to appear,” but they never did.
4 College of Saint Thomas
The September 23, 1949 issue of The Aquin, a publication of the College of Saint Thomas in Saint Paul, Minnesota, announced that eighteen individuals had joined the faculty.24 Here are the first two on the list:
The philosophy department has received several outstanding men. Dr. Boleslaw Sobocinski, who will teach logic, is a renowned European scholar. A graduate of the University of Warsaw, Dr. Sobocinski was teaching at the University of Lodz up until the time he fled from Poland to escape possible imprisonment because of his activity against Communism. Dr. Marian W. Heintzman [sic] has also arrived from Poland and will teach philosophy at the College. He received his Ph.D. degree at the University of Cracow in Poland. Dr. Heitzman has held many responsible diplomatic positions and has traveled extensively in Russia.
This announcement was premature, because the Sobocińskis had not yet arrived.25 The Aquin of January 13, 1950 reported that he “arrived last week to prepare for his duties . . . He will begin teaching next semester.” There is a picture, of too poor a quality to reproduce here, of Mr. and Mrs. Sobociński with Dr. Heitzman. Reportedly, Heitzman and Bocheński recommended Sobociński for the position, but documentation is lacking.26 That he taught at the University of Łódź was probably an embellishment to get the job; he was offered a professorship there but never showed up [Świętorzecka, p. 3].
One interesting thing about the photo is that Sobociński is holding a short cigarette holder. It was his custom to break a Camel cigarette in half and smoke one half at a time. He was a chain smoker who would smoke in classes and seminars.27
24 Much of the information in this section comes from the Special Collections Department at the University of St. Thomas, thanks to University Archivist, Ann M. Kenne.
25 Lejewski notes in “Ś. P. Bolesław Sobociński,” Znak (1984), No. 351–352, pp. 401 that the Sobocińskis left Belgium for the USA in December 1949.
26 Heitzman, a historian of Philosophy in Kraków, probably came to know Sobociński through philosophical circles. See the unsigned “In memoriam: Marian Heitzman 20.X.1899 – 18.XI.1964,” The Polish Review, Vol. 10, No. 1 (Winter, 1965), pp. 99–100.
27 My memory does not entirely agree with what Sayre writes (p. 89): “Another remarkable fact about Sobociński is that he smoked incessantly. His smoking routine was intriguing. He would take a cigarette from his box, break it in half, somehow manage to light one half from the stub still in the holder, and smoke it down to the hilt until it was time to light the other half. About the only time you could count on Sobo’s not smoking was when he was expounding on logic in public.”
From the “Faculty Record,” which is dated February 14, 1950, we learn that Sobociński was 47 years old, Catholic, married, “Stateless,” and had traveled in Poland, Russia, Finland, Lithuania, Czechoslovakia, Germany, Belgium, Ireland, France, Switzerland, and Holland. He had a reading and speaking knowledge of Polish, Russian, and French, but only a reading knowledge of German. Latin was not mentioned; perhaps it was expected at the time that every philosopher knew Latin. Sobociński, as a member of the Cracow Circle, certainly read Latin, an assumption confirmed by the course he taught at St. Thomas on Thomism.
The Faculty Record indicates that Sobociński was appointed as a “Lecturer of Philosophy” at St. Thomas in September 1949 for ten months for a salary of $3500. In the “Evidences of Scholarship” portion of the Faculty Record, eight publications are listed, all but two of which survive in Sobociński’s Nachlaß. But the most interesting thing is the note which follows:
Besides the above mentioned, up to 100 small notes and reviews published mainly in “Przeglad Filozoficzny” / Philosophical Review, Warsaw /, “Organon” / Warsaw /, “Przeglad Katolicki” / Catholic Review, Warsaw /, Journal of Symbolic Logic / Princeton /.
It is easy to identify the reviews in the JSL, but it would be difficult to identify the others as the publications are difficult to obtain and the reviews may not be signed. Nonetheless, the claim of writing this many reviews is quite plausible as he was the Managing Editor of Przegląd Filozoficzny from 1931 to 1939. “The Catholic Review” began publication in Warsaw in 1863, but there were interruptions for both World Wars (1915–1922 and 1938–1983).
Sobociński listed membership in three professional organizations:
• Warsaw Philosophical Society, admitted 1931. He was a member of the Executive Board. The word “admitted” is on the Faculty Record.
• Polish Logical Society, admitted 1935. He was a member of the Executive Board.28
• Polish Mathematical Society, admitted 1937.
During the one semester that Sobociński taught at St. Thomas, he did not publish any papers. But one paper seems to have begun there. It is a translation from the 1934 Polish original of a paper by Jan Salamucha (1903–1944) which uses first-order logic to reveal the tacit assumptions and analyze the defects in St. Thomas’s proof ex motu of the existence of God. The translators are Tadeusz Gierymski and Marian Heitzman of the College of St. Thomas.29 The translation is preceded by
28 The Polish Logical Society (Polskie Towarzystwo Logiczne) was founded in 1936 on the initiative of Łukasiewicz, who became the first president of the Society. It is curious that Sobociński claimed to be “admitted” already in 1935. The board of the society consisted of Jan Łukasiewicz, Adolf Lindenbaum, Andrzej Mostowski, Bolesław Sobociński, and Alfred Tarski. The Society started to publish the ill-fated journal Collectanea Logica.
29 Tadeusz Gierymski served as Assistant Professor of Psychology at the University of Saint Thomas from 1954 to 1989, when he retired. Marian Heitzman (1899–1964) is not listed among the retired faculty. From 1928 he taught at the Jagiellonian University in Cracow. During the war he was head of the Political Department of the Ministry of National Defense in London,
an interesting biographical note about Fr. Salamucha by Sobociński, who knew him from their interactions in the Cracow Circle. By the time this was published in The New Scholasticism in 1958, Sobociński was at Notre Dame.
Sobociński was actively involved in the Philosophy Department, giving lectures on Leśniewski’s philosophy of logic. There is a 29-page typescript of two talks on Leśniewski’s foundations of mathematics that give a nice introductory survey of Leśniewski’s Protothetic, Ontology, and Mereology. There is also a “Philosophy Seminar,” consisting of three lectures, 29 pages of typescript. The first of these is dated March 10, 1950. These lectures deal with the history of logic and display considerable erudition. They contain a very interesting discussion of the differences between Aristotelian logic, the logic of the Stoics, and traditional logic. Sobociński’s knowledge of the history of logic displayed here is impressive. These manuscripts are nicely written; it is likely that he had help with his English from Heitzman.
Sobociński’s command of English was not very strong when he was at St. Thomas. As an example, in a letter to “Very Reverend Father,” Prof. Dr. Robert Feys he asked about publishing a short “and not specially technical” paper in the Revue Philosophique de Louvain:
I would like to explain a problem concerning the bi-valued logic which, it seems to me, was not discussed in the literature. I observed it looking over some papers from modal logic and many-valued systems. I have not this paper written, but I can done it in very short time, if the answer will be positive.
Sobociński’s insistence on a rigorous, high-level approach to teaching Thomism and his difficulties with English led to his downfall: “The administration was not too thrilled with his scholarship (he lectured on Thomism from the original texts and not on the texts of later commentators) and finally his lectures were suspended and he was removed from the college after one semester.” The level of scholarship was simply too high for the undergraduates at St. Thomas.
5 Institute for Applied Logic
From 1951 to 1956 Sobociński served as director of research at the Institute of Applied Logic, which was located in the Oppenheim Building in Saint Paul, Minnesota. John Goodell was the Director of the Institute. Together they founded The Journal of Computing Systems, which appeared in June 1952. The authors who published in this short-lived journal (only one volume, four numbers, was published) reveal some of Sobociński’s connections: Jan Łukasiewicz, Alan Rose, Carew A. Meredith, A. N. Prior, Alan Ross Anderson, and Desmond Paul Henry.
military attaché in the USSR, and author of the first report on the Katyń massacre. After the war he emigrated to Canada where he taught at McGill University in Montreal before coming to the College of St. Thomas.
Of the 24 papers published in The Journal of Computing Systems, four were by Sobociński. One of these, “On a universal decision element,” appears to break new research ground for Sobociński, but on closer examination, it too deals with propositional logic. It investigates logic gates and the simplest ways to construct them from given gates. The last, “Axiomatization of the conjunctive-negative calculus of propositions,” was a reconstruction from memory of a paper originally “published” in the ill-fated journal Collectanea Logica in 1939.
Due to financial difficulties the Institute of Applied Logic was forced to close in 1956, leaving Sobociński unemployed. He contacted several universities and industries, sending his three-page “Life Abstract.” From this document we learn that his height was five feet nine inches and his weight was 170 pounds. He also described his work at the Institute of Applied Logic:
Research in Logics and Mathematics as applied to the Computer Systems; propositional calculus, Boolean algebra and algebra of logics. Consulting on application of Logical methods to the speeding of the arithmetical operations. Methods to the automatic programtheory and techniques of handling information, adaption of the theory of quantifiers, protothetic and the theory of recursive functions.
From correspondence in his Nachlaß it is clear that his applications were supported by Carnap, Church, Kleene, and Quine. Everyone he contacted reported that they had no openings or that he was overqualified for their positions. There was one exception, a job offer at General Mills, which he did not accept. On August 15, 1956, a friend wrote that “you have chosen well” to accept the offer at Notre Dame. He did obtain some additional employment in St. Paul: “I collaborated a long time with Minnesota Electronics Corporation, now defunct, concerning an application of symbolic logic to the theory of electronic computers. In 1956 I received a consultantship contract from Remington Rand Univac Corp., St Paul Minn.”
6 I. M. Bocheński
In 1952, Father Theodore Martin Hesburgh (1917–2015) was appointed the fifteenth president of the University of Notre Dame. One of his goals was to upgrade the quality of the university by attracting top scholars and so, in 1953, “With visions of Notre Dame becoming the ‘Catholic Princeton’ dancing in his head, Father Theodore Hesburgh, CSC, establishes the Distinguished Professors Program and begins barnstorming Europe to recruit anchoring talent.” On one of these tours he was specifically looking for philosophers.30 He visited Salamanca, Paris, Munich,
30 Hesburgh realized that the Mathematics Department did not need upgrading as it was headed by Arnold Ross, who had succeeded Karl Menger [Kenneth M. Sayre, Adventures in Philosophy at Notre Dame (2014), p. 57. This book has proved invaluable in preparing this paper but I have not cited it each time I used it.] Menger was a professor at Notre Dame from 1937 to 1946; he
Louvain (where he recruited Ernan McMullin, a young priest with a background in physics) and then headed for Fribourg to meet Bocheński, the best Catholic philosopher in Europe [Sayre 2014, p. 58].
In 1934, Bocheński received his doctorate in Sacred Theology at the Pontifical University of Saint Thomas Aquinas in Rome, also known as the Angelicum. Naturally it was run by the Dominicans, but it is not to be confused with the Gregorian, which is also in Rome and was run by the Jesuits. From 1934 to 1940 he taught logic there and then, from 1945 to 1972, he was a professor at the University of Fribourg in Switzerland.31 Around 1930, Bocheński became interested in mathematical logic. The Principia Mathematica was one of the most important books for him, but he also read work by Kazimierz Ajdukiewicz, Alonzo Church, and Haskell Curry.32
When Hesburgh visited Bocheński he was struck by the fact that his office was crammed with books from floor to ceiling and in a variety of languages, including fifteen that he had written. When Hesburgh asked what he was working on, Bocheński responded that he was working on a history of logic. When asked who might translate it into English, he immediately named a fellow Dominican, Father Ivo Thomas. This work, A History of Formal Logic, was indeed translated by Thomas and published by the University of Notre Dame Press in 1961 [Sayre 2014, p. 62].
It was the Cracow Circle [Koło Krakowskie] that tied Bocheński and Sobociński together intellectually. It was created to show that modern logic was needed to explain the arguments of Thomas Aquinas, who used only syllogistic logic. Its official origin can be traced to August 26, 1936 at a special meeting of the Third Philosophical Congress in Cracow, although there were earlier discussions of this goal.33 It was founded at the instigation of Łukasiewicz, and its most
then moved to the Illinois Institute of Technology in Chicago. Ross was chair of the mathematics department from 1946 until he moved to The Ohio State University in 1963.
31 During the war Bocheński served as a chaplain for the Polish army. I know little about his wartime activity other than comments in Sayre. He used various pseudonyms including Emil Majerski, Bogusław Prawdota, P. Banks, and K. Fred. I do not know if these were related to his wartime activity [Sayre 2014, p. 334].
32 Woleński, “Józef M. Bocheński and the Cracow Circle,” Studies in East European Thought, Vol. 65, No. 1/2 (September 2013), pp. 5–15, p. 6. Curry gave a series of five lectures at Notre Dame, April 12–15, 1948 which are published as A Theory of Formal Deducibility (1960), the sixth volume of the Notre Dame Mathematical Lectures series. At the time Menger was department head. I suspect that they had met in Europe when Curry was working on his 1930 Ph.D. under the direction of Hilbert.
33 For its history see Roman Murawski, “Cracow Circle and Its Philosophy of Logic and Mathematics,” Axiomathes, September 2015, Vol. 25, Issue 3, pp. 359–376; Jan Woleński, “Józef M. Bocheński and the Cracow Circle,” and Bocheński, “The Cracow Circle,” pp. 9–18 in K. Szaniawski (ed) The Vienna Circle and the Lvov-Warsaw School, 1989.
prominent members were Bocheński (1902–1995), Jan Salamucha (1903–1944),34 Jan Franciszek Drewnowski (1896–1978),35 and Sobociński.
Bocheński came to Notre Dame as a visitor in the fall of 1955, McMullin having joined the philosophy faculty a year earlier:
Notre Dame, Ind., Aug. 25 — Rev. I. M. Bochenski, O.P., an eminent mathematical logician who has been teaching at the University of Fribourg in Switzerland, will be a visiting professor of philosophy at the University of Notre Dame during the 1955–56 school year, it was announced here today by Rev. Philip S. Moore, C.S.C., vice president for academic affairs. Father Bochenski is the author of Problemgeschichte and is widely known for his writing in his field. He is one of several internationally recognized scholars being added to the Notre Dame faculty under the University’s Distinguished Professors Program. Courses to be conducted by Father Bochenski include “History of Logical Problems” and “Contemporary Logic.” During the first semester beginning in September he will also deliver a series of general lectures on Existentialism. He will give another lecture series on Dialectical and Historical Materialism during the second semester. [Press release, August 26, 1955.]
During the 1955–1956 academic year the Rev. I. M. Bocheński, O.P., was a visiting professor in the Department of Philosophy at the University of Notre Dame.36 He had been invited to organize a logic program at the university, as we shall see below. For his contributions he received an honorary degree from the university on June 5, 1966.37 Father Bocheński wrote about Sobociński:
I come now to another logician whom I knew intimately, Sobociński. An assistant of Leśniewski, he was said to be the only man in the world who really knew everything about his master’s logic — the search for the shortest axioms and the like — he left a considerable number of results which would surely merit republication. Let me tell something about him which happened in South Bend in 1956. Sitting at the fireside in his villa, he was bitterly complaining of “all the nonsense which is written in logical books”. “What books?”, I asked. “Well, your own books for instance”. You can
34 Jan Salamucha (1903–1944) studied philosophy, mathematics, and logic at the University of Warsaw where he attended the lectures by Łukasiewicz, Leśniewski, Kotarbiński, Władysław Tatarkiewicz, and Stefan Mazurkiewicz.
35 Between 1921 and 1927, Drewnowski studied philosophy, mathematics, and logic at the University of Warsaw under the supervision of Stanisław Leśniewski, Jan Łukasiewicz and Tadeusz Kotarbiński. In 1927, he obtained his doctor’s degree under Kotarbiński’s supervision with a dissertation dealing with Bolzano’s logic. Drewnowski is also important as he headed the institute in Belgium where Sobociński was unhappily employed before he came to the USA.
36 For biographical information see Guido Küng, “In Memoriam: Joseph (Innocent) M. Bochenski, O.P. 1902–1995,” The Review of Metaphysics, Vol. 49, No. 1 (Sep., 1995), pp. 217–218.
37 Department of Information Services Records (DIS), University of Notre Dame Archives (UNDA), Notre Dame, IN 46556. File “UDIS 129/03 Subject: Bochenski, Reverend I.M. Joseph M., OP — Honorary Degree Recipient 1902–.” Kenneth Sayre writes that “The commendation accompanying the presentation of the honorary degree was written by Fr. McMullin, and contained lighthearted references to Bochenski’s earlier stay at ND. Unaccustomed to levity on such portentous occasions, Bochenski accepted the certificate with visible displeasure. This reinforced Fr. Ernan’s unabashed memory of Bochenski as a crotchety old man ‘with little feel for normal human relations’.”
imagine that I asked him to explain. “Of course”, he said, “you do assert, like most of the crowd, formulae with free variables, which is nonsense”. This gives an idea of how true he was to the tradition of rigour of his masters in Warsaw. Sobociński was also a completely unpractical and most “scholarly” person. During the interview which preceded his nomination in Notre Dame, we had quite a lot of trouble understanding what he meant by saying “Son loves Mary”. We asked in vain “whose son?”. It appeared finally that not a son, but John was meant. When appointed he asked that the expenses for the transportation of his wife and of his cat be paid. And so on. Yet, once in the chair, he proved to be an excellent, brilliant teacher. It is true that he wrote on the blackboard practically all that he said.38
From these comments we learn that Bocheński was present when Sobociński was interviewed for a position at Notre Dame in the spring of 1956. They also reveal problems with Sobociński’s command of English, something that concerned those who were hiring a teacher. There were also concerns over rumors of Sobociński’s anti-Semitism.39 Together these doubtless account for his being hired as a research associate rather than at one of the professorial ranks. That he was hired at all indicates the influence that Father Bocheński had, the confidence that he had in Sobociński’s expertise as a logician, and his respect for what Sobociński had published already. This confidence benefited the university.
In the 1950’s, on the recommendation of Prof. I. M. Bocheński, the University of Notre Dame inaugurated a research program in logic, and invited [Father Ivo] Thomas to become a visiting professor, from 1958–1960. He collaborated with Prof. B. Sobociński in establishing Notre Dame Journal of Formal Logic and was a regular participant in his seminars. From 1959–1963 at Notre Dame and then from 1964–1974, at Ohio State University, Columbus, he taught in the summer school program for teachers and gifted students in mathematics, funded by the National Science Foundation. In 1963, he joined the faculty of the General Program of Liberal Studies at Notre Dame, becoming professor in 1970 and director of the Collegiate Seminar in 1973.40
7 Ernan McMullin
Ernan McMullin (1924–2011) has been mentioned several times before in passing. As a long time member of the governing board for the journal, he played the important role of encouraging Sobociński in this work. McMullin was born October 13, 1924 in Ballybofey, Donegal, Ireland. As a high school student he learned
38 Józef M. Bocheński, “Morals of thought and speech — reminiscences,” pp. 1–8 in Philosophical Logic in Poland, edited by Jan Woleński, 1994.
39 When I was preparing the obituary note “Bolesław Sobociński 1906–1980,” Jane Dunkle, the longtime mathematics department secretary who had moved with O. Timothy O’Meara (1928–2018) to the Office of the Provost, gave me access to Sobociński’s personnel file. She asked that I treat that information with care as it contained rumored information about Sobociński’s anti-Semitism. These records in the University Archives will not be available for 72 years after the date of their creation.
40 Otto Bird, “In Memoriam Ivo Thomas (1912–1976),” NDJFL, Vol. 18, April 1977, 193–194.
Gaelic. He attended Maynooth College, the National University of Ireland, earning a B.Sc. in Physics in 1945 and a B.D. in Theology in 1948. In 1949 he was ordained a Roman Catholic priest. He then received a fellowship in theoretical physics at the Institute for Advanced Studies in Dublin, 1949–1950. He was the only person at the Institute without an advanced degree. He worked under the 1933 Nobel laureate Erwin Schrödinger (1887–1961) and Lajos Jánossy (1912–1978).41 Then he studied philosophy at the Institut Supérieur de Philosophie of the University of Louvain (Leuven) in Belgium, 1950–1954, taking the baccalaureat, licence, and doctoral degrees in philosophy, summa cum laude. The topic of his 1954 doctoral dissertation was The Principle of Uncertainty. A preliminary critical study of the origin, meanings, and consequences of uncertainty.42
In 1954 McMullin joined the Notre Dame faculty as an instructor in philosophy, with joint duties in physics. He had a single office with the physicists, rather than an office with five philosophers. He taught philosophy of science in John FitzGerald’s philosophy sequence for physics majors, a course that was well received.43 John Derwent took this course the first time McMullin taught it and reports that “it was by a wide margin the best course I had from the College of Arts and Letters” [email of September 20, 2018].
McMullin’s reception by the philosophers did not go so well. As was customary at the time, he was appointed without consulting the department. While still at Louvain, he wrote a scathing review of Philosophical Physics (1950) by Vincent Smith, one of his new colleagues in philosophy.44 The textbook for the logic course that McMullin was asked to teach was Logic: The Art of Defining and Reasoning (1952, 1963) by John Oesterle, who joined the department in 1954.45 McMullin could only tolerate the book for a few days, so he switched to Copi’s Introduction to Logic.
Such things made McMullin persona non grata in the department. In 1954, President Eisenhower appointed Hesburgh to the National Science Board, where he was instrumental in the creation of postdoctoral grants in the philosophy of science. He recommended that McMullin apply. McMullin did, and this allowed him to spend three years at Yale. Hesburgh then asked McMullin to return to Notre Dame as chair of the Philosophy Department. Aware of his reputation in
41 For background, see “Eamon de Valera, Erwin Schrodinger, and The Dublin Institute for Advanced Studies,” Journal of Chemical Education, 1983, 60 (3), p. 199.
42 Notre Dame Archives, CEMM 16/13. McMullin’s Dissertation.
43 Sayre 2014, p. 66.
44 I have examined the book. I did force my way through the chapter on “Mathematics and the Infinite,” which begins with an idiosyncratic presentation of Cantor’s set theory, but the purpose of doing that is to put his “ideas under the lens of realistic philosophy” [p. 299]. Smith’s foreordained conclusion, based on ancient ideas of Aristotle and medieval ideas of Aquinas, is that infinite sets do not exist.
45 This book was specially designed for teaching syllogistic logic in Catholic colleges. It was cunningly designed with tear out work sheets. Since it could not be resold, Oesterle earned enough royalties to endow a chair.
the department he sensibly declined. But a formal offer was soon to arrive and he returned to Notre Dame in the fall of 1960, to remain for the rest of his career. He spent the 1964–1965 academic year at the University of Minnesota where he was well liked:
In the course of his relatively short stay here, Father Ernan’s wit and jovial disposition have become his most well known trademark. And not all of his accomplishments lie in the realm of scholarship. His forte is the harmonica, and he has a fine repertoire of Irish and American songs and ballads, as well befits a son on County Donegal, “the most sung about place in all of Erie.”46
Years later, when McMullin was famous as a philosopher of science, his colleague Ralph McInerny described him in The Philosophy Newsletter of December 1978 as follows: “the green tornado, made his ferocious energy felt here and there about the galaxy, touching down, among other places at . . . ”47
8 Sobociński Comes to Notre Dame
On June 7, 1956, Sobociński wrote to Fr. Herman Reith, head of the Philosophy Department: “I received this morning a letter from Father P. J. Moore offering me the position of Research Associate in your Department.” On June 12, 1956, Fr. Reith wrote to Sobociński that he was to teach basic symbolic logic, Phil 111, to students with no previous knowledge of logic. Much more important was the announcement that there was to be a committee with two members from mathematics and two from philosophy “to formulate a program in symbolic and mathematical logic.” “The idea is that we want to inaugurate a very strong center of symbolic logic here at Notre Dame.”48
At a faculty meeting of the Department of Philosophy on October 26, 1956, department chair “Father Reith Wilcomed [sic] Father Lu and Mr. Sobocinski into the department.” At this meeting Father Reith suggested that the faculty “would be delighted to have Mr. Sobocinski lead a discussion on Isomorphism and Analogy.” This suggestion was realized on February 20, when Sobociński led an “intra-departmental discussion” on “The logical notion of the isomorphism and its application to the notion of analogy.” Two weeks later, McMullin gave an intradepartmental discussion on “Is modern logic relevant to our teaching?”
46 Notre Dame Archives, CEMM 102 Articles [of McMullin] 1964–66.
47 There was levity in the Mathematics Department also. The minutes of a faculty meeting on December 11, 1974, which were likely written by Derwent, noted “Several faculty members have made trips to conferences or to deliver lectures. In particular, four attended the International Congress in Vancouver in August, but failed to win any Fields medals.”
48 When I saw this letter on 10 September 1981 it was in the Philosophy Department files. The above is the only part of the letter that I recorded. Due to regulations of the Notre Dame Archives, this letter is no longer accessible.
V. F. Rickey
On February 13, 1956, with Sobociński still in Minneapolis, Jan Łukasiewicz died. As the only one of his five doctoral students in the west, Sobociński took on the task of writing an obituary. He wrote two versions, one in Polish for the Polish Society of Arts and Sciences Abroad and another version in English, “In memoriam, Jan Łukasiewicz (1878–1956),”49 which contains more about Łukasiewicz’s work in logic and less about academic life in Poland. A footnote reveals its inspiration: “I would like to express here my gratitude to my colleague, Rev. Fr. Ernan McMullin of the University of Notre Dame, who not only encouraged me to make this version and arranged its publication, but also for correcting its English.” It is fitting that this was published in Ireland, for Łukasiewicz spent the last decade of his life in Dublin.50

What Sobociński actually taught during his first year at Notre Dame is not known, for the Notre Dame Archives has no copy of the Schedule of Courses for the 1956–1957 academic year. He was assigned an office in the Administration Building—the one with the golden dome—but was fearful of using it as he considered the building a fire hazard. He met students in the Huddle, the student center.

The “Committee for Symbolic Logic” met on February 3, 1957. This was likely their first meeting. Present at the meeting were two members of the Philosophy Department, John James FitzGerald and Father Ernan McMullin. There were five faculty from mathematics: Frederick Bagemihl, Arnold Ross, Vladimir Seidel, R. Catesby Taliaferro, and Sobociński. He must have been the secretary of the committee for there are several copies of the minutes in his Nachlaß. They include some of his fractured English: “Besides them it was invited prof A. Church from Princeton University.” Three topics were discussed at the meeting. The first item was to decide on the logic curriculum. They decided on an elementary course, Symbolic Logic, Philosophy 111 and 112.
For the academic year 1957–1958, Sobociński proposed that he teach Modal Logic in the Fall semester and Theory of Relations in the Spring. In the fall of 1957 he indeed did teach an undergraduate “Symbolic Logic” course, Phil 111, and a graduate course on “Modal Logic,” Phil 251.51 In the spring he
49 Philosophical Studies (Maynooth, Ireland), 6 (1956), 3–49. This English version contains an interesting “Curriculum vitae of Jan Łukasiewicz.”
50 For details see Rickey 2001.
51 In a small pamphlet on Philosophy at Notre Dame this course is described as follows: “A discussion of modal logic in the context of contemporary symbolic logic centering on Aristotelian modalities and the notion of strict implication. This course is open only to students who have taken introductory courses in symbolic logic. Others must receive permission of the Head of the Department of Philosophy or Mathematics. Offered in the Fall semester. Three credits.” The word “Aristotelian” in this description shows the domination of Thomistic philosophy in the department. Here is the description of the Calculus of Relations course: “This course will consist of a presentation and discussion of the elementary properties of relations, the fundamental notions and principle theorems in this area.” The same restrictions as in Modal logic are included. Sobociński’s Nachlaß contains a well organized set of notes on the Calculus of Relations; probably these are his notes for this course.
Professor Bolesław Soboci´nski and Logic at Notre Dame
again taught Phil 111 as well as a graduate course on the “Calculus of Relations,” Phil 242.52 He was promoted to associate professor in the Spring of 1958.53

Every year the head of the Philosophy Department asked the faculty what the department members wished to teach the next year. Sobociński was allowed to teach what he wanted, as is witnessed by memos sent to the head of the Philosophy Department over the years.

The second item on the agenda was a discussion of what courses Thoralf Skolem would teach as a visitor during the 1957–1958 academic year. They recognized the need for a short course of lectures about abstract set theory, to be given by Sobociński. The third and last item had a significant long term impact:

III. The Committee discussed also a preliminary project concerning the publication of a journal which will be devoted exclusively to the foundations of mathematics. It was decided that it will be useful for a better analysis of this problem to consult prof. S. Kleene from the University of Wisconsin, one of the editors of The Journal of Symbolic Logic.
Adjacent to these minutes in Sobociński’s Nachlaß is a typescript headed “Note of a critical edition of Lesniewski’s writings.” Sobociński remarked that this work is “extremely interesting and important not only from the point of view of mathematics, but also for philosophy.” Leśniewski published only a few papers about his systems, but commentary and supplementary remarks will be necessary for them to be more easily understood. The North Holland Publishing Company has expressed interest in publishing (provided they do not have to bear the cost). A translator can be hired for about $2,000 to translate the 300 pages “of a very difficult text.”

The Sobocińskis started a charming tradition by introducing a guest book at their home. The first entry is “2.VI.1959” (Sobociński almost always represented dates in this fashion). The first signature is that of Wacław Sierpiński. Other logicians there that evening were Ivo Thomas OP, S.T.M., Clay (Mathematics, 1961),54 and Skolem. The mathematicians included Frederick Bagemihl, Arnold E. Ross, and R. Catesby Taliaferro. This tradition continued until a few months before Sobociński died, so there are far more names than we can mention, but we shall note some of the others that are important to the history of logic at Notre Dame. Ernan McMullin and Milton Fisk were there at the next event, 13.XII.1959. Eugene C. Luschei (11.VI.1960), Czesław Lejewski (15.IX.1960), Arthur N. Prior and Otto Bird (3.II.1962), and Guido Küng (1.I.1963) were also guests. On 18.I.1964 several of Sobociński’s students made their first appearance: Sister Paula Marie, SSND (M, 1964), Fred
52 According to graduate student scuttlebutt, to prepare for this course, Sobociński used inter-library loan to obtain a copy of Ernst Schröder’s Vorlesungen über die Algebra der Logik. When it became due, he refused to return it until he had finished with it. He encouraged Chelsea to reprint it, so there is a copy of that edition in his personal library.
53 Notre Dame Alumnus, Vol. 36 No. 4, May–June 1958, p. 13.
54 Sobociński’s Ph.D. students are in italics together with whether in Mathematics or Philosophy and the year the degree was granted. Only Thomas Sudkamp (M, 1978) is missing.
Rickey (M, 1968), Charles F. Quinn (P, 1971), John Thomas Canty (Philosophy, 1967), and Thomas Scharle (P, 1973). There were many more guests: V. Vuckovic (30.V.1964), Karel Lambert (10.XII.1965), William J. Frascella (M, 1966 and P, 1978) (20.VII.1966), Ignacio Angelelli (8.V.1967), Anjan Shukla (M, 1967) (30.V.1967), José Alberto Coffa (27.XII.1967), Storrs McCall (2.II.1968), Albert Dou S.J. and Roman Suszko, Nino B. Cocchiarella (10.V.1969), I. M. Bochenski OP (7.X.1969), W. Russell Belding (M, 1972) and Biswambhar Pahi (14.IV.1970), Willard V. Quine (15.V.1970), Jean Drabbe, Richard Poss (M, 1970), J. R. Senft, E. William Chapin, Jr. (11.II.1970), C. Davis (P, 1973), Jack C. Boudreaux (P, 1975), James G. Kowalski (P, 1975) (8.VIII.1972).

There was another tradition: Sobociński insisted that the wives sign separately, joking that he wanted to be sure they could write.

The 1962–1963 faculty directory has the Sobocińskis living at 2405 Club Drive. They lived there for the rest of their lives. It seems reasonable to assume that the Sobocińskis did this because his position was secure and his finances were improved as in June of 1961 the following announcement was made: “Rev. Chester A. Soleta, C.S.C., vice president for academic affairs, announced the promotion of four faculty members to the rank of professor. They are . . . Boleslaw Sobocinski, philosophy”55

When “La génesis de la Escuela Polaca de Lógica” appeared, Sobociński sent an offprint to Philip S. Moore, C.S.C., Vice President of Academic Affairs at Notre Dame. He thanked Sobociński in a letter of September 23, 1956, and then added that

While in England this summer I met your friend Father Ivo Thomas who expressed to me his very deep esteem of you as a mathematical logician.
Later on my trip I met Father Bochenski, and he has outlined a plan which I hope we are going to be able to realize, and which would bring you, Father Thomas and Father Bochenski together as a team here at Notre Dame to attack some of the important philosophical problems in the area of mathematical logic.
At a Philosophy Department faculty meeting on May 22, 1957, it was announced—“to a brisk round of applause”—that Father McMullin had received a National Science Foundation Fellowship and will spend the next academic year at Princeton. I find this phrase ambiguous. Were they proud of him receiving the grant or happy he would no longer be a thorn in their sides?

Curiously, Sobociński sent a “List of Publications” for the period “From January 1 to September 1, 1956” to the head of the philosophy department, Father Reith. Curiously, because this dealt with the period just before Sobociński arrived at Notre Dame in the fall of 1956. Only one article is listed, “On well constructed axiom systems.” When I met Czesław Lejewski in his home in Manchester in the summer of 1976, I remarked about the good English style of this paper, a remark that pleased him as he had written the paper; Sobociński discussed the ideas with
55 The Notre Dame Scholastic, May 5, 1961, p. 11; Notre Dame Alumnus, Vol. 39 No. 3, June 1961, p. 10.
Lejewski and then he wrote the accompanying text. This memo also lists three published reviews, four “Papers completed and ready for publication” as well as six “In preparation.” One of these is “On the axiom systems of protothetic.” This is a reference to what I consider his best paper, “On the single axioms of the protothetic,” which was published in three parts in the NDJFL, 1960–1961.

Another memorandum dealt with the period “From September 1, 1956 to September 1, 1957,” Sobociński’s first year at Notre Dame.56 It lists three published papers, including a biographical article (there are both Polish and English versions) about his doctoral advisor, Jan Łukasiewicz, and a paper in Spanish about the genesis of the Polish school of logic. It also lists five more papers and four reviews that were completed and ready for publication (all of which did appear). But what is most interesting is the list of five papers and books that are in preparation. Three of these never appeared: “1) A remark on Leśniewski’s system of the foundations of mathematics, 2) Note on the, so-called, pseudo-definitions, and 5) A critical English edition of all Leśniewski’s writings. A project sponsored by the Mathematical Department of Notre Dame University.” Sobociński proposed to edit a “Critical English Edition of Lesniewski’s Work in Logic” and was “ready to prepare such a critical edition with remarks and commentaries and without royalties for his work.” Dr. Rose Rand (1903–1980) was to assist him.57 Had this been published it would have substantially altered the course of study of the logical systems of Leśniewski. The annotations would have been invaluable, for only Sobociński was familiar enough with the work of Leśniewski to supply them. There were two other items that never appeared: “4) Propositional Calculus. A textbook.” and “3) Mereology. A book.
will appear in the series “Studies in Logic”, published by North Holland Publishing Company.” This monograph on Mereology would have dealt primarily with the technical aspects of the system and would have included a carefully laid out series of deductions. This comment is based on what was presented in Sobociński’s classes on Mereology. When Lejewski, who was then a visiting professor, attended Sobociński’s class on Mereology in 1960 he “got the impression that the deduction of the Schröder-Bernstein Theorem from the theorem of the mean was Sobo’s own [. . . ] Sobociński never discussed with me his plans for a monograph on Mereology. I don’t think he had worked them out in any detail or in any outline for that matter” [Lejewski to Rickey, 25.9.1981 and 19.10.1981].
56 The minutes of a Departmental Meeting of October 11, 1957, report that Father Reith announced that “Members were asked to supply the chairman with lists of publications covering the period Sept. 1956–Sept. 1957.”
57 Sometime during the years 1955–1959 Rand was a research associate at Notre Dame. Her papers are at the University of Pittsburgh. Some of her unpublished translations of Leśniewski’s works are in Sobociński’s Nachlaß.
On January 9, 1959, Sobociński wrote to Father Reith “proposing the following graduate courses”:

1) Advanced Logic of Names. An analysis of the notions of similarity, isomorphism and homomorphism. Their definitions, principal theorems and applications. Inductive relations and their importance. Theory of logical chains. Finite and infinite class in logic. Families of classes. Theory of logical minima and maxima. The course is open to students who have taken an introductory course in symbolic logic. Three Credits.
In addition he proposed a course on “Introduction to metalogic” in the spring of 1960, and a “Seminary in Symbolic Logic” to be held both semesters. These grand publication plans came to naught for Sobociński’s time was overtaken by a new project, editing a logic journal.
9 Founding the Notre Dame Journal of Formal Logic

On February 9, 1959, Fr. Paul Beichner, Dean of the Graduate School, wrote to Fr. Reith that the question of an annual volume to be published by the Department of Philosophy had come up on several occasions over the past decade. The earliest reference he could find was in a note in 1949 from Fr. Philip Moore, then in his last year as head of the Philosophy Department, to Fr. John Cavanaugh, President of the University. The idea came up again when Fr. Gerald Phelan was chair (from 1949 to 1952), and there was strong support from administrators and faculty.

Beichner’s letter was prompted by a visit from Sobociński sometime in January 1959. He had a number of important papers in symbolic logic, some of which had been finished for years, and about which people had asked him. But the philosophy journals had much material on hand and sometimes publication was delayed for years, “so long that the papers lose their value.” Sobociński was concerned with the “log-jam” of papers awaiting publication. All agreed that it was important to have a publication that “comes out exclusively under our own name.” Sobociński also argued that it would make Notre Dame better known.

The Notre Dame Journal of Formal Logic was founded in 1959, with the first issue appearing in 1960. Sobociński launched it “as a labor of his love for symbolic logic and the foundations of mathematics.”58 The Notre Dame Alumnus reported that Sobociński’s “impeccable grasp of symbolic logic supports” the journal.59 When the third and fourth numbers of volume one of the journal were sent to Soleta, he responded on March 21, 1961, that he was “delighted to see that it is now being produced smoothly.” He had no doubt that Sobociński would make “a distinguished contribution to original thinking” and the reports he had heard from outside ND were “enthusiastic and highly laudatory.”
58 Notre Dame: A Magazine, Winter 1959, p. 6.
59 Notre Dame Alumnus, Vol. 43 No. 1, 1965, pp. 5 and 43.
Every semester Sobociński taught a graduate course in logic, often a course dealing with one of Leśniewski’s systems. The classes were never large, usually a dozen students or fewer. But the students liked his courses and returned semester after semester. They appreciated him as a teacher and as a mentor. In August of 1961, Robert E. Clay received his Ph.D. in mathematics for a dissertation entitled Contributions to Mereology. Although Sobociński was a member of the philosophy department, Clay’s degree was in Mathematics. He was the first of eight students to receive a Ph.D. in mathematics under Sobociński’s direction. There were seven in philosophy.
10 The End of an Era

After Sobociński died on October 31, 1980, I published a short biographical note noting that

During the war he taught in the underground university and was active in the underground resistance movement. Because of this he was wanted by the Russian secret police and so had to try to escape. On the third try he and his future wife, Ewa Wrześniewskia, successfully escaped.60
Shortly after this appeared, Tarski telephoned me, objecting to this comment.61 He said that Sobociński never taught in the underground university and that while he was active in the underground movement, his activities were not honorable, for he was involved in arranging for the murder of Jews. Tatarkiewicz notes, “Bum,” p. 170, that Sobociński “later strongly denied this.”

The day after Sobociński’s funeral, I wrote to Henry Hiż:

Went to South Bend Tuesday and returned last night after the Professor’s funeral. It was a very difficult trip for me to make, but I had to do it and am glad that I did. The funeral mass was at Sacred Heart Church on campus. Father Hesburgh, Provost O’Meara (the mathematician) and other University officials were there. I don’t know if that is standard procedure, but it pleased me to see them there. Ernan McMullin said the mass and gave the homily (he said that he has met you). My previous impression — and it was reinforced by many things over the past few years — was that people didn’t understand how important Sobociński was to the University. But now I think that at least Ernan understands, for he spoke with sincerity and eloquence. Later I mentioned to Mrs. Sobociński that he understood and she agreed. He spoke first of how the Professor was a scholar — a scholar in the traditional European sense, and that although the European educational system has its bad points and is often maligned, he exemplified all of its good points. He set for himself the highest standards, and went about the search for truth without regard for what others thought about his work. He spoke next of how the Professor had that rare ability to profoundly influence the students who gathered around him. He was able to bring out the best in them. It
60 “Bolesław Sobociński 1906–1980,” Proceedings and Addresses of the American Philosophical Association, 55 (1982), 498–499.
61 His annotated copy of my obituary is in the Alfred Tarski Papers at Berkeley: BANC MSS 84/69 c (Series 8, Box 1, File 13). Thanks to James T. Smith for alerting me to its existence.
pleased me to know that at least Ernan understood how important Sobo was to his students. I couldn’t help but cry with joy — Mrs. Sobociński said later that she was happy to look over and see me crying, for she was trying so hard not to. Next he spoke of the Journal and how the Professor had overcome such tremendous odds to found it and to make it succeed. So many others spoke of the Journal as it was his only accomplishment. In my mind it is only a small part of what he did. Finally, Ernan went back to those days in Warsaw, those days when the war was brutalizing — but not destroying the fiber of the country. He spoke of how Sobociński had to made a commitment that few of us in this country have ever had to make; he committed his life for his ideals. He risked his life to help lead the underground movement against the Russians and the Germans. I was deeply touched by what he said. I knew all of these things, understood all of these things, but I wasn’t sure that those at Notre Dame did. Now perhaps, they do. You are right Henry, he was the end of an era. He will always live in the hearts and minds of those of us who were influenced by him. We loved him. We shall remember him.

Your Fellow Leśniewskian
Fred
Acknowledgements Many thanks for helpful suggestions go to John Derwent, Florence Fasanelli, Kordula Świętorzecka, Jim Smith, Roman Sznajder, and two anonymous referees.
Fred Sommers’ Notations for Aristotelian Logic

Daniel Lovsted
Abstract In the mid-twentieth century, American philosopher Fred Sommers created a formal system for Aristotelian logic which he called Traditional Formal Logic, or TFL. In his early work on TFL from 1967 to 1970, Sommers developed three notations to represent Aristotelian sentences—the testing procedure notation, the fractional notation, and the arithmetic notation. Sommers eventually chose to work exclusively with the arithmetic notation, and he accounts for this decision by claiming it was the only viable notation of the three. Observing the flexibility of the testing procedure and fractional notations undermines this simple account. His decision was more likely the result of many factors: oversights of his own notations’ merits, biases against two-dimensional notations, historical precedents, typesetting constraints, and the belief that notations can symbolically mirror psychological processes. TFL’s early development thus serves as a valuable case study of the complexity which underlies notational decisions.
1 Introduction

The philosophical career of Fred Sommers (1923–2014) centered on a single and monumental task: the rehabilitation of Aristotelian logic for use in the twentieth and twenty-first centuries. As Sommers inherited it, Aristotelian logic had serious expressive and deductive shortcomings, compared with the style of logic that supplanted it around the turn of the twentieth century—a style which I will broadly call, after Sommers, “Fregean” logic. But Aristotelian logic, in Sommers’ view, also had important advantages over Fregean logic. Chief among these was its syntactical closeness to natural language, and the ease of learning and use that this virtue brought with it. Beginning in the 1960s, therefore, Sommers sought to forge a new
D. Lovsted, McGill University, Montreal, QC, Canada
© Springer Nature Switzerland AG 2020. M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_2
logic, which he called Traditional Formal Logic (TFL). His goal was to use the Aristotelian paradigm as a foundation, and hence retain its naturalness, but to greatly expand its expressive and deductive power. The notation of TFL is a crucial part of Sommers’ project. Sommers wanted his logic to be used. He envisioned it as a tool not only for philosophers, but also for non-philosophers—humanities students more broadly, scientists, and the general public—and their more quotidian reasoning needs: deciphering the arguments of political debates, reasoning through popular science writing, or making day-to-day decisions. Sommers taught introductory logic classes at Brandeis University using TFL and wrote a TFL textbook for use in similar courses at other schools. One of TFL’s great virtues, Sommers thought, was its practicality: unlike modern Fregean logic, it was simple enough to learn and use that it could gain the widespread use he hoped for it. With these pragmatic considerations in mind, notation was for Sommers more than a trapping for theory; a clear, easy, and flexible notation was critical to the success of his project. We can see this concern explicitly, in Sommers’ comments on his notation and on notation in general—a favorite quotation of Sommers’ was “A good notation is better than a live teacher,” attributed to Bertrand Russell (quoted, e.g., in Sommers 2005, p. 12). And we can see the concern implicitly, in the successive and careful revisions Sommers made to TFL notation through the 1960s, 1970s, and 1980s. This paper examines an episode in the history of TFL: Sommers’ development of three notations for Aristotelian logic in the late 1960s, and his choice of one of them to serve as the primary notation of TFL.1 The paper is organized as follows. Since Sommers’ work is best understood in its context, I begin with a brief history of Sommers’ early career, explaining what led him to break from Fregean logical practice (Sect. 2). 
I then present the three notations that Sommers developed in the late 1960s. In 1967 and perhaps before, Sommers used a simple notation which I will call the testing procedure notation (Sect. 3). In the same 1967 paper that introduces the testing procedure notation, Sommers also presents a second notation, the fractional notation (Sect. 4). And in 1970, Sommers published a third notation, the arithmetic notation, which he would use for the rest of his career (Sect. 5). I then discuss his choice of the arithmetic notation (Sect. 6). I challenge Sommers’ own version of events, which presents his choice as the simple selection of the only viable notation available to him. I argue that the virtues he found in the arithmetic notation, such as closeness to natural language and adaptability to more complex deductions, were already present in his earlier attempts. His decision was more likely motivated by a variety of factors, which I propose include oversights, biases, historical comparisons, constraints on typesetting, and deep, tacit philosophical
1 Though Sommers’ main concern was to promote Aristotelian logic over Fregean logic, and though most of the debates he engaged in with his contemporaries regard this distinction, we do not address this higher-level question at present. Rather, our concern is, once we have chosen to do Aristotelian logic, how are we to symbolize it?
convictions. Finally, I conclude that the complexity of this episode, as well as the care and thought Sommers expended on his notations, make the development of TFL a valuable case study in the history and philosophy of notation (Sect. 7).
2 Sommers on the Road to Aristotelianism

How did Sommers come to practice and champion Aristotelian logic, at a point in the history of logic when Aristotle’s work was overshadowed—even, in the eyes of many, discredited—by that of Frege? Sommers came to Aristotelian logic by way of a remarkably mainstream career. After he received his PhD in 1955 for a dissertation on the ontology of Alfred North Whitehead, Sommers embarked on an investigation into the semantic relations between natural language words, which culminated in his 1959 paper, “The Ordinary Language Tree.” It was a project self-consciously positioned within the bounds of ordinary language philosophy, which was indeed an ordinary kind of philosophy for an English speaker to do in the 1950s—what Kuhn might call “normal science.” Sommers at this stage was no maverick. But he became increasingly aware of dissonances between the logical relations he found in natural language, on the one hand, and the possibilities of explaining them offered by Fregean logic, on the other.

One of Sommers’ key observations in “The Ordinary Language Tree” is that a predicate has the same range of plausible predicability as its contrary. What does this mean? Given a predicate like “mortal,” we can plausibly predicate it of some subjects (“Joe is mortal,” “dogs are mortal,” and “Hera is mortal” are all plausible, though maybe not true), but not of others (“the theorem is mortal” and “purple is mortal” just do not make sense). We call the set of subjects a predicate P is plausibly predicable of the “range of plausible predicability” of P. Also, any predicate P has a contrary P*—for “mortal” this is “immortal,” and for “green” the contrary is “nongreen.” Sommers’ observation, then, is that P and P* are specially related in that they have identical ranges of plausible predicability.
A second key observation of the paper was that the converse of a plausible subject–predicate sentence is also plausible, and that the converse of an implausible sentence is implausible. For instance, given the plausible subject–predicate sentence “the man is rational,” we form its converse as “the rational thing is a man,” and observe that this, too, is plausible. Given “the theorem is tall” (implausible), the converse is “the tall thing is a theorem” (also implausible). Thus, plausibility is preserved when the subject becomes predicate and the predicate becomes subject. It is on these two fronts that Sommers found Fregean logic inadequate. First, Fregean logic has no primitive operation for forming the contrary of a predicate; instead, contrariety is expressed via contradictoriness. For example, to rewrite a sentence like ∀x P(x) with contrary predicate, we use the contradictory-forming connective ¬ and write ∀x ¬P(x). Since subjects are singular in Fregean logic, and
since contrariety and contradictoriness are equivalent for singular subjects,2 we can make do with just the one kind of negation (contradictoriness). But we make, for one, a stylistic sacrifice: insofar as we believe, with Sommers, that contrariety is an important relation in itself, we are depriving it of the primitive place it deserves in logic. And more materially, Sommers was concerned that, in fact, contrariety and contradictoriness are not always equivalent for singular subjects—for are not “Saturday is hungry” and “Saturday is satiated” (contrary propositions) both false? A logic that conflates contrariety with contradictoriness must import additional restrictions to avoid category-mistake cases like this one. Second, a logical subject is not interchangeable with a logical predicate in Fregean logic. Grammatical subjects and predicates may be interchangeable, depending on how they are rendered; if we write “All dogs are mammals” as ∀x [D(x) → M(x)], we can write “All mammals are dogs” as ∀x [M(x) → D(x)], because we have rendered the original grammatical subject (dogs) as a logical predicate (D(x)). But logical subjects in Fregean logic are constants or variables, and these cannot be transformed into predicates. If we have P(a)—say, “Aaron is prescient”—Fregean logic does not guarantee us an expression like A(p)—“the prescient thing is Aaron”; this latter sentence is not well-formed. (We might try something like ∃ x [P(x) ∧ (x = a)], but this transformation is cumbersome and nonobvious; moreover, the proposition it produces is distant in form from the natural language sentence, and it is unclear how to interchange its subject and predicate.) The fundamental problem here is that subjects and predicates in Fregean logic are different kinds of entity—predicates, like functions, have “gaps” that must be filled by terms, while logical subjects are gapless.3 Aristotelian logic provided Sommers with readymade solutions to many of his problems. 
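The contrast between contrariety and contradictoriness for general and singular subjects can be checked on a toy model. The following is only a minimal sketch under stated assumptions: the individuals, the extension of "clever," and the function names are all invented for illustration, and every subject is assumed to lie within the predicate's range of plausible predicability, so category-mistake cases such as "Saturday is hungry" are deliberately left out.

```python
# Toy model. The individuals and the extension of "clever" are
# illustrative assumptions, not data from Sommers or from this paper.

def all_are(subjects, extension):
    """'All S are P': every subject lies in the predicate's extension."""
    return all(s in extension for s in subjects)

def all_are_non(subjects, extension):
    """Contrary, 'All S are non-P': negation applied to the predicate term."""
    return all(s not in extension for s in subjects)

def not_all_are(subjects, extension):
    """Contradictory, 'It is not the case that all S are P'."""
    return not all_are(subjects, extension)

men = ["Socrates", "Plato", "Meletus"]
clever = {"Socrates", "Plato"}

# General subject: the sentence and its contrary can both be false,
# so the contrary is not the contradictory.
general = all_are(men, clever)                     # False
general_contrary = all_are_non(men, clever)        # False
general_contradictory = not_all_are(men, clever)   # True

# Singular subject: contrary and contradictory agree in truth value.
socrates = ["Socrates"]
singular_contrary = all_are_non(socrates, clever)
singular_contradictory = not_all_are(socrates, clever)

print(general, general_contrary, general_contradictory)
print(singular_contrary == singular_contradictory)
```

In this model a singular subject is simply a one-element list, which is why the two negations coincide there; nothing in the sketch addresses Sommers' worry about singular subjects that fall outside a predicate's range.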
It allows for (at least)4 two kinds of negation: contrary-forming term negation, and contradictory-forming sentence negation. For example, in the sentence “the man is tall,” the Aristotelian can primitively negate the subject term (“the non-man is tall”), the predicate term (“the man is non-tall”), or the entire sentence (“it is not the case that the man is tall”). Also, in Aristotelian subject– predicate sentences, the subject and predicate can be interchanged with ease. In sentences like “S is P,” “Some S is P,” and “All S are P,” S and P are the same kind of logical object, called terms, and so we can just as easily exchange the place of these terms in the sentences to form their converses: “P is S,” “Some P is S,” and “All P are S,” are all well-formed. We began this section by noting the deep unpopularity of Aristotelian logic in Sommers’ time, and asking how, then, Sommers came to practice it. We have now
2 This is not the case for sentences with general subjects. “All men are clever” is not the contradictory of its contrary, “All men are non-clever,” since both sentences may be (and in fact are) false. But a singular sentence, like “Socrates is clever,” is the contradictory of its contrary, “Socrates is non-clever,” since exactly one is false.
3 See Frege (1997) for his seminal paper on the treatment of predicates as functions.
4 A third kind of negation, called predicate denial, exists in some Aristotelian systems, though elsewhere it is collapsed into sentence negation.
Fred Sommers’ Notations for Aristotelian Logic
seen two possibilities offered by Aristotelian logic, and not by Fregean logic, to a philosopher interested in the logical structure of natural language. But we could have made a stronger case than this. We might just as well have asked why, given Aristotelian logic’s grammatical basis and its common-sense rendering of natural language sentences, more ordinary language philosophers did not abandon Fregean logic in the 1950s and 1960s. To answer this kind of question, Sommers observed that Aristotelian logic was not merely out of fashion in the mid-twentieth century; rather, antipathy towards Aristotelian logic, and confidence in the superiority of Fregean logic, had been elevated to the status of a dogma. Sommers alone recognized and challenged this dogma. In 1967, he published “On a Fregean Dogma,” which opens by quoting Russell: Traditional logic regarded the two propositions “Socrates is mortal” and “All men are mortal” as being the same in form; Peano and Frege showed that they are utterly different in form. [ . . . ] the philosophical importance of the advance which they made is impossible to exaggerate. (p. 47)
Russell and other Fregean logicians hold that both a singular statement (e.g., “Socrates is mortal”) and a general statement (e.g., “All men are mortal”) predicate mortality of a singular subject; the difference between these statements is structural, in that the singular statement predicates simply while the general statement predicates under quantification. Sommers argues that the two statements may justly be considered to have the same logical form, the singular statement predicating simply of a singular subject and the general statement predicating simply of a general subject. He is thus defending a basic tenet of Aristotelian logic, its rendering of subject–predicate statements.
3 The Testing Procedure Notation
Although “On a Fregean Dogma” is a precursor to Sommers’ work on TFL proper, the paper introduces several algorithmic techniques that resemble the foundations of a formal system. One such technique is Sommers’ “testing procedure,” which checks the validity of a given syllogism. The testing procedure is presented using a notation of its own,5 which is shown in the following reproduction of a section of Sommers’ text (Table 1). The headings in square brackets are not included by Sommers:
5 Developing notations seems to have been something of a penchant for Sommers. In “The Ordinary Language Tree,” he develops a simple notation to express plausibility and implausibility of predication between two terms—U (A, B) indicates plausibility of predicating B of A, N (A, B) implausibility. In a fascinating addendum, he develops a separate, more involved notation, a kind of Huffman coding, for representing a tree of terms—since, Sommers writes, “A simple notation that tells us [plausible predicability relations] at a glance is . . . a desideratum” (p. 184). In “Types and Ontology” (1963), Sommers extends his U/N notation from 1959 and throughout the paper freely defines and uses other symbolic notations. It is no surprise, then, that in 1967 he presents, almost
D. Lovsted
Table 1 Sommers’ testing procedure (p. 51)
[Categorical] | [Notation] | Perspicuous expression | Vernacular expression
A. | $Ps$ | S’s are P’s | All S are P
E. | $\bar{P}s$ | S’s are un-P’s | No S are P
I. | $\bar{P}'s$ | S’s are not un-P’s | Some S are P
O. | $P's$ | S’s are not P’s | Some S’s are not P
Thus, “Ps” represents predicating P of subject s, a bar above a term indicates its contrary, and a prime (′) indicates predicate denial. Note the use of two kinds of negation, term negation to form a contrary term, and predicate denial to form a contradictory statement. This notation is used to perform the testing procedure, which Sommers presents in three steps:
(i) If the statement contains [predicate] denials, transpose them [i.e., switch a denied premise with a denied conclusion and replace their denials with affirmations] until all statements are affirmative. This can be done only if the conclusion and one premise is a denial.
(ii) Next get the implication into transitive form [i.e., with terms ordered as Ab · Bc ⊃ Ac]. This can always be done by using inversion [i.e., by replacing $Ps$ with $\bar{S}\bar{p}$, or vice versa, and $\bar{P}s$ with $\bar{S}p$, or vice versa, as needed, noting that $\bar{\bar{P}}$ can be replaced by $P$ for any term P].
(iii) Count the [recurrent] terms. There should be no more than three [where $P$ and $\bar{P}$ count as different terms]. (p. 52)
The (somewhat dense) procedure is illuminated by an instructive example:
Suppose, for example, we wished to test EIO.3 [i.e., the syllogism of E, I, and O categoricals in the third figure, shown below] for validity. We have
$\bar{P}m \cdot \bar{S}'m \supset P's$ [this is EIO.3]
$\equiv \bar{P}m \cdot Ps \supset \bar{S}m$ [after transposing second premise and conclusion per (i)]
$\equiv \bar{M}p \cdot Ps \supset \bar{M}s$ [after inverting first premise and conclusion per (ii)]
The syllogism is valid since there are only three terms, $\bar{M}$, P, and S [counting per (iii)]. (p. 53)
In short, the testing procedure involves three steps: transposition, inversion, and counting. The first two steps get the syllogism into a certain form (called by Sommers the affirmative, transitive form). If these steps cannot be applied, the syllogism is invalid. Counting is the final validity check for a syllogism in affirmative, transitive form. The testing procedure thus provides a concise, formal method for checking the validity of syllogisms.
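The three steps of transposition, inversion, and counting can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, not Sommers’ own formulation: the names `Term`, `Statement`, and `check_syllogism` are hypothetical, a barred term is modelled as a `contrary` flag, a primed statement as a `denied` flag, and "transitive form" is interpreted as the ordering of term letters Ab · Bc ⊃ Ac.

```python
from itertools import product
from typing import NamedTuple


class Term(NamedTuple):
    letter: str
    contrary: bool = False  # True models a barred (contrary) term


class Statement(NamedTuple):
    pred: Term
    subj: Term
    denied: bool = False  # True models a primed (denied) statement


def negate(term):
    """Bar a term; barring twice restores the original term."""
    return Term(term.letter, not term.contrary)


def invert(stmt):
    """Inversion: predicate X of subject y becomes predicate bar-Y of subject bar-x."""
    return Statement(negate(stmt.subj), negate(stmt.pred), stmt.denied)


def transpose(premises, conclusion):
    """Step (i): swap a denied premise with a denied conclusion, dropping both denials."""
    denied = [p for p in premises if p.denied]
    if not denied and not conclusion.denied:
        return list(premises), conclusion
    if len(denied) == 1 and conclusion.denied:
        kept = [p for p in premises if not p.denied]
        return kept + [conclusion._replace(denied=False)], denied[0]._replace(denied=False)
    return None  # the statements cannot all be made affirmative


def is_transitive(a, b, c):
    """Check the letter pattern Ab . Bc > Ac (bars are ignored here)."""
    return (a.pred.letter == c.pred.letter
            and a.subj.letter == b.pred.letter
            and b.subj.letter == c.subj.letter)


def check_syllogism(premises, conclusion):
    step = transpose(premises, conclusion)
    if step is None:
        return False
    (p1, p2), c = step
    # Step (ii): try every combination of inversions and both premise orders.
    for flips in product([False, True], repeat=3):
        q1, q2, qc = (invert(s) if f else s for s, f in zip((p1, p2, c), flips))
        for a, b in ((q1, q2), (q2, q1)):
            if is_transitive(a, b, qc):
                # Step (iii): P and bar-P count as different terms.
                terms = {a.pred, a.subj, b.pred, b.subj, qc.pred, qc.subj}
                return len(terms) <= 3
    return False


# EIO.3 as in the worked example: bar-P of m, denied bar-S of m, hence denied P of s.
eio3 = check_syllogism(
    [Statement(Term("P", True), Term("M")),
     Statement(Term("S", True), Term("M"), denied=True)],
    Statement(Term("P"), Term("S"), denied=True),
)
print(eio3)
```

Under these assumptions the EIO.3 example reduces to three distinct terms, as in the quoted passage, whereas a second-figure syllogism with two universal affirmative premises ends up with five.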
offhandedly, two impressively workable notations for representing categoricals and algorithms for performing deduction with them.
4 The Fractional Notation
Also in “On a Fregean Dogma,” Sommers presents a second notation that was “suggested” to him by analogies to algebra that he found in the steps of the testing procedure (1967, p. 53). First, the method of inversion in step (ii) allows for the interchanging of $Ps$ with $\bar{S}\bar{p}$. Sommers observes that this manipulation has an analog in the algebraic law, $\frac{S}{P} = \frac{P^{-1}}{S^{-1}}$. Second, he noticed that when a deduction is in transitive form, i.e., when the terms are ordered Ab · Bc ⊃ Ac, there is an analogy to be made to the algebraic law, $\frac{B}{A} \cdot \frac{C}{B} = \frac{C}{A}$. Sommers presents his second notation by using it to write the four categoricals (p. 53)6 :
A. All A’s are B = df. $\frac{a}{b}$
E. No A’s are B = df. $\frac{a}{b^{-1}}$
I. Some A’s are B = df. $\left(\frac{a}{b^{-1}}\right)^{-1}$
O. Some A’s are not B = df. $\left(\frac{a}{b}\right)^{-1}$
A fraction thus represents universal predication (the denominator being predicated of the numerator), and a −1 exponent represents term contrariety when affixed to a numerator or denominator, and predicate denial when affixed to an entire fraction. It is important to note that we can only manipulate these expressions in restricted ways, compared to the manipulations allowed in algebra. For instance, given $\left(\frac{a}{b}\right)^{-1}$, we may not treat the exponent algebraically to yield $\frac{b}{a}$, as these are not logically equivalent expressions (“Some A’s are not B” is not equivalent to “All B’s are A”). The fractional notation is used to perform a similar validity check to that of the testing procedure, as follows. Define a positive proposition in the fractional notation as one without an external negative exponent, and define a negative proposition as one with an external negative exponent. Notice that positive propositions are logically universal, while negative propositions are logically particular. Then define an affirmative syllogism as any syllogism all of whose propositions are positive, or as any syllogism with exactly one negative premise and a negative conclusion. Finally, a syllogism is valid if and only if it is an affirmative syllogism wherein the product of the premises algebraically equals the conclusion, in the fractional notation.
6 Sommers uses an in-line fraction notation in this presentation, i.e., he types a/b rather than the columnar $\frac{a}{b}$, but this was likely a typesetting requirement. He consistently uses the in-line representation in the body text, but when he shows full illustrations of the fractional notation in use, he always uses the column representation. I have reproduced the text somewhat unfaithfully, using a column representation, since I believe this to have been Sommers’ preferred format, for the above reasons. Also, this is the only time Sommers uses the lower case to represent the terms in his fractional notation—capital letters are used in every other instance—but the reason for this discrepancy is not clear.
Sommers shows numerous examples of syllogism verifications using the fractional method. One is reproduced below (p. 54)7 :
$\frac{P}{M} \cdot \left(\frac{S}{M}\right)^{-1} = \left(\frac{S}{P}\right)^{-1}$
$\frac{P}{M} \cdot \frac{S}{P} = \frac{S}{M}$ valid.
In other words, we may verify syllogism validity almost entirely algebraically. We must first perform the non-algebraic but simple step of checking affirmativeness. Once affirmativeness has been verified, multiply the premises together algebraically, and check if they equal the conclusion. If they do, the syllogism is valid; if not, it is invalid.
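The two-part check just described (affirmativeness first, then algebraic equality) can be sketched as follows. This is a minimal illustration under stated assumptions, not Sommers’ presentation: each fraction is modelled as a map from term letters to integer exponents (numerator +1, denominator −1, with a −1 exponent flipping the sign), and the names `proposition`, `multiply`, and `is_valid` are hypothetical. The external exponent is also recorded separately as a flag, since the affirmativeness test is not algebraic.

```python
from typing import Dict, NamedTuple


class Proposition(NamedTuple):
    exponents: Dict[str, int]  # term letter -> net exponent
    negative: bool             # True if the whole fraction carries an external -1


def proposition(subject, predicate, *, subject_contrary=False,
                predicate_contrary=False, denied=False):
    """Encode subject/predicate, with optional -1 exponents, as an exponent map."""
    exps = {}
    exps[subject] = exps.get(subject, 0) + (-1 if subject_contrary else 1)
    exps[predicate] = exps.get(predicate, 0) + (1 if predicate_contrary else -1)
    if denied:  # external -1: invert the whole fraction
        exps = {t: -e for t, e in exps.items()}
    return Proposition({t: e for t, e in exps.items() if e}, denied)


def multiply(props):
    """Multiply fractions by adding exponents; cancelled terms are dropped."""
    total = {}
    for p in props:
        for term, exp in p.exponents.items():
            total[term] = total.get(term, 0) + exp
    return {t: e for t, e in total.items() if e}


def is_valid(premises, conclusion):
    negatives = sum(p.negative for p in premises)
    affirmative = ((negatives == 0 and not conclusion.negative)
                   or (negatives == 1 and conclusion.negative))
    return affirmative and multiply(premises) == conclusion.exponents


# All P are M; some S are not M; therefore some S are not P.
print(is_valid([proposition("P", "M"), proposition("S", "M", denied=True)],
               proposition("S", "P", denied=True)))
# All P are M; all S are M; therefore all S are P (the product does not match).
print(is_valid([proposition("P", "M"), proposition("S", "M")],
               proposition("S", "P")))
```

Keeping the `negative` flag apart from the exponent map reflects the restriction noted above: algebraically an inverted fraction equals its reciprocal, but the two are not logically interchangeable.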
5 The Arithmetic Notation
Though the name would not be used until later in the decade, the year 1970 has the strongest claim to being the birth year of TFL. It was an inauspicious birth. In 1970, Sommers published “The Calculus of Terms,” forty pages of dense prose, impenetrable formulae, unhelpful examples, and frequent errors. But it was this paper that first presented a formal system of Aristotelian logic of equal deductive power to Fregean logic.8 Prior to 1970, Aristotelian logic was not able to perform deductions with compound statements (handled by the Fregean propositional calculus) and deductions with relational statements (handled by multiply quantified statements in the Fregean predicate calculus). “The Calculus of Terms,” despite its defects, covers both territories, and also extends the forays into syllogistic reasoning made in 1967. Its presentation is poor only in proportion to its enormous scope. What concerns us most in “The Calculus of Terms” is a new notation for categorical statements, the arithmetic notation. As in the fractional notation, capital letters represent terms, but in the arithmetic notation terms are arranged horizontally
7 In every example where, as in the one above, the initial syllogism is affirmative with two negative propositions, Sommers displays the fractional algorithm as a two-step process, where the first step is transposing the syllogism to an all-positive form. It is unlikely that Sommers is simply “showing his work” in doing this, as he leaves other complex algebraic expressions for readers to work out themselves; more likely, he envisions the fractional method as an extension of the testing procedure, and the step which he shows explicitly is analogous to step (i) of the testing procedure. Regardless, it is certain that he was aware of the fact that this step is not necessary to make separately: since the transposition it involves preserves algebraic equality (it can be done by dividing both sides by the two negative propositions), we can simply check for affirmativeness and verify the equality in a single step.
8 A formal proof of TFL’s equivalence to a standard Fregean logic was given in Sommers (1982).
with + and − signs indicating quality and quantity. Sommers gives the “general form of a proposition” (p. 11) as
$\pm(\pm(\pm S) \pm (\pm P))$
Each ± may be filled by either a + or a − sign. There are five ± signs in the general formula, which we shall label for reference with subscripts as
$\pm_1(\pm_2(\pm_3 S) \pm_4 (\pm_5 P))$
Positions 1, 3, 4, and 5 represent various kinds of negation or affirmation. Position 1 indicates propositional affirmation or denial; a + here is read as “it is the case that . . . ” and a − is read as “it is not the case that . . . ” Position 4 indicates predicate affirmation and denial; + is read as the copula “is,” − as the copula “is not.” Positions 3 and 5 represent term-contrariety negation, with a + in either position indicating just the regular term S or P, and a − indicating “un-S” or “non-S” or “un-P” or “non-P.” Notice that we here have three kinds of negation, propositional, predicate, and term, whereas before (in the fractional notation) we have made use of only two. Position 2 indicates quantity, with a + representing particular quantity (read as “Some . . . ”) and a − indicating universal quantity (“All . . . ”).9 The meanings of + and − signs in the various positions of a proposition are summarized in Table 2. We are now in a position to read individual categoricals. For instance, −(−(−S)−(−P)) is read as “it is not the case that all non-S are not non-P.” +(−(+S)−(+P)) is read as “it is the case that all S are not P,” or more simply as “all S are not P.” +(+(+S)+(−P)) is, “some S are non-P.”
Table 2 Meanings of the + and − signs in various positions in the arithmetic notation
Sign | Position 1 | Position 2 | Position 3 | Position 4 | Position 5
+ | It is the case that . . . | . . . some . . . | ∅ | . . . is . . . | ∅
− | It is not the case that . . . | . . . all . . . | . . . un- . . . or . . . non- . . . | . . . is not . . . | . . . un- . . . or . . . non- . . .
The general form of a proposition, with positions labelled, is $\pm_1(\pm_2(\pm_3 S) \pm_4 (\pm_5 P))$
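The position-by-position reading summarized in Table 2 amounts to a small lookup. The sketch below is illustrative only: the function name `read_proposition` is hypothetical, the ASCII hyphen `-` stands in for the minus sign, and the copula is rendered literally as "is" / "is not" as in the table, without adjusting for number ("are").

```python
POSITION_1 = {"+": "it is the case that", "-": "it is not the case that"}
POSITION_2 = {"+": "some", "-": "all"}
TERM_SIGN = {"+": "", "-": "non-"}          # positions 3 and 5
POSITION_4 = {"+": "is", "-": "is not"}


def read_proposition(signs, subject="S", predicate="P"):
    """Read the five signs of the general form, in order of position 1 to 5."""
    if len(signs) != 5 or set(signs) - {"+", "-"}:
        raise ValueError("expected exactly five signs, each '+' or '-'")
    s1, s2, s3, s4, s5 = signs
    return (f"{POSITION_1[s1]} {POSITION_2[s2]} {TERM_SIGN[s3]}{subject} "
            f"{POSITION_4[s4]} {TERM_SIGN[s5]}{predicate}")


print(read_proposition("-----"))  # it is not the case that all non-S is not non-P
print(read_proposition("+-+-+"))  # it is the case that all S is not P
print(read_proposition("++++-"))  # it is the case that some S is non-P
```

Each sign is looked up in a table chosen by its position alone, which is the sense in which the order of occurrence disambiguates the overloaded + and − signs.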
9 The arithmetic notation thus heavily utilizes what might be called in computer science the “overloading” of the + and − signs, in that each sign has several possible meanings depending on its position in the proposition. Sommers writes, “The signs of opposition [i.e., + and −] do have different interpretations depending on their locations in the statement. But order of occurrence renders this ambiguity harmless” (1970, p. 12n.).
Table 3 Simplification of the four categoricals
Categorical | Notation | Vernacular expression
A. | −S+P | All S are P
E. | −S−P | All S are not P (No S is P)
I. | +S+P | Some S is P
O. | +S−P | Some S is not P
In practice, we often omit + signs in positions 1, 3, and 5, when they are positive. Signs in positions 2 and 4 are never omitted. We can therefore write the four categoricals more simply as is shown in Table 3.10 The same notation is used for the propositional calculus and for dealing with relational propositions. The propositional calculus represents “p and q” as +p+q, and negation of p as −p, and uses these two connectives to define the full range of statements. Relational propositions are expressed simply by extending the subject– predicate notation linearly; to express “All S is R to some P,” a form found in sentences like “Every dog loves some human,” we write −S+R+P. Deduction is performed arithmetically. In a syllogism, for example, we add together the premises, and check if they are equal to the conclusion. If they are, and if the same affirmative condition as in the fractional method is satisfied (either all propositions are positive, or exactly one premise and the conclusion are negative), the syllogism is valid. This is the notation Sommers would use from 1970 until his death.
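The arithmetic check for syllogisms can be sketched as below, for categoricals written in the simplified form of Table 3. Several things here are assumptions rather than anything stated in the text: the function names are hypothetical, the ASCII hyphen `-` stands in for the minus sign, terms are single capital letters, and a "negative" proposition is taken to be a particular one (leading +), by analogy with the fractional notation, where negative propositions are logically particular.

```python
import re


def parse(prop):
    """Turn a string such as '-S+P' into term coefficients and a 'particular' flag."""
    tokens = re.findall(r"([+-])([A-Z])", prop)
    if not tokens or "".join(sign + term for sign, term in tokens) != prop:
        raise ValueError(f"cannot parse proposition: {prop!r}")
    coeffs = {}
    for sign, term in tokens:
        coeffs[term] = coeffs.get(term, 0) + (1 if sign == "+" else -1)
    particular = tokens[0][0] == "+"  # assumption: leading '+' marks a particular
    return coeffs, particular


def syllogism_is_valid(premises, conclusion):
    total = {}
    particular_premises = 0
    for premise in premises:
        coeffs, particular = parse(premise)
        particular_premises += particular
        for term, c in coeffs.items():
            total[term] = total.get(term, 0) + c
    total = {t: c for t, c in total.items() if c}  # middle terms cancel out

    concl_coeffs, concl_particular = parse(conclusion)
    concl_coeffs = {t: c for t, c in concl_coeffs.items() if c}

    affirmative = ((particular_premises == 0 and not concl_particular)
                   or (particular_premises == 1 and concl_particular))
    return affirmative and total == concl_coeffs


# All M are P; all S are M; therefore all S are P.
print(syllogism_is_valid(["-M+P", "-S+M"], "-S+P"))
# Some M is P; some S is not M; therefore some S is P (sum matches, condition fails).
print(syllogism_is_valid(["+M+P", "+S-M"], "+S+P"))
```

The second call shows why the side condition is needed: the premises do sum to the conclusion, yet the inference has two particular premises and is rejected.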
6 Deciding on the Arithmetic Notation
We will try to explain why Sommers decided on the arithmetic notation as the notation for TFL. Sommers himself supplies one answer. Consider the opening passage of Sommers’ The Logic of Natural Language (1982), which gives a brief history of the notational development of TFL:
The essay before you is the fruit of some fifteen years of investigation into the logical syntax of natural language. In the summer of 1965 I read a paper to the Congress on Logic and Scientific Method at Bedford College, London, that presented an algorithm for the algebraic treatment of syllogistic arguments in which categorical propositions were transcribed as fractions and reciprocals [“On a Fregean Dogma,” published 1967]. I spent the next two years searching for a more general algorithm with greater expressive power, one that could transcribe relational, multi-general propositions as well as simple categoricals. The new algorithm – presented here in chapter 9 – came to me as I was sitting in a Tel Aviv shelter during an alert on the Monday morning of the Six Day Arab-Israeli War. This shows that logic, as well as philosophy in general, has its uses in times of stress. (p. vii)
10 Sommers comments on this simplification: “The situation with logical signs is much like that in arithmetic; one doesn’t often write out ‘+((+12)−(+5) = +7’ in full but it is nevertheless crucial to have a notation where one can do just that” (1970, p. 8).
This history, didactic though it is, errs on two key points. The first is an omission. Two perfectly workable notations were published in 1967. Why does the testing procedure notation go unmentioned? The second error is a misleading implication. Though the fractional notation, as it was presented in 1967, could not transcribe relational propositions, it could easily have been extended to do so. We will use these two errors as starting points for our discussion of Sommers’ notational decisions.11 Sommers did not regard the testing procedure notation as being on a par with the fractional and arithmetic notations. This is clear from his later historical remarks, such as in 1970 (p. 4) or 2005 (p. 22), which present the same story as the 1982 passage quoted above: Sommers writes the notational history of TFL as beginning with the fractional notation and ending with the arithmetic notation. And it is clear from his treatment of the testing procedure notation in 1967, when he introduces it without a word of explanation. The testing procedure notation was, for Sommers, not a notation at all, and one of his comments on the testing procedure itself reveals why: “It is not an algorithm,” he writes, “but a direct logical method” (1967, p. 53). This is cryptic. Why is the fractional method, which accomplishes the same thing as the testing procedure, an “algorithm,” while the testing procedure is not? This is not a comment on the procedure itself; Sommers is not using “algorithm” in its usual sense. The distinction Sommers is alluding to is notational: the testing procedure is a “direct logical method” in that it is written in a simple notation that directly transcribes the English syllogisms. This is how Sommers uses the word “direct.” He uses it twice in the 1970s, once in “The Calculus of Terms” (1970) and once in “The Grammar of Thought” (1978). 
In both cases, he uses it to talk about a notation that is close to natural language—he speaks of “direct symbolic transcription of the natural language original” in 1970 (p. 37) and the “direct transcription” of “every man is wise” in 1978 (p. 46). Observing that “direct” for Sommers is used exclusively in notational contexts allows us to recognize that, when he contrasts the “Direct method” (the testing procedure) with the “Fraction method” in one 1967 example (p. 55), these are parallel labels. He means that one method uses direct notation and the other uses fraction notation. The directness of the testing procedure notation is why Sommers ignores it. For him, it was closer to a shorthand for English than a true logical notation. But let us also observe that notational directness was unambiguously a desideratum for Sommers. His full comment is, “The testing procedure is fast and simple. Moreover, it is not an algorithm, but a direct logical method” (1967, p. 53, emphasis mine). In 1970, he calls directness a “clear advantage” (p. 37). And closeness to natural language was the greatest desideratum for Sommers’ logical syntax. Sommers’
11 There is a third error in the passage, which is the implication that the algorithm presented in Chap. 9 of The Logic of Natural Language is the same as the algorithm thought up in Tel Aviv in 1967. Sommers’ arithmetic notation in fact underwent many subtle changes in the 1970s, but this is a topic for another paper.
situation, then, was paradoxical. In his search for a direct notation, he overlooked his first and most direct notation, because it was too direct for him to consider it a notation. Sommers’ own accounts of how he landed on the arithmetic notation always mention that the fractional notation could not handle deductions with relational propositions. In 1970, in his first introduction of the arithmetic notation, he writes that the fractional notation “supplied no way of representing relational statements and no way of calculating with them,” and that the arithmetic notation “repairs that defect” (p. 4). In 1982, we have seen, he comes to the arithmetic notation after a search for “a more general algorithm with greater expressive power [than the fractional notation], one that could transcribe relational, multi-general propositions as well as simple categoricals” (p. vii). And in 2005, he succinctly comments, “The fraction algorithm proved of little value for handling relations” (p. 22). But the fractional notation can in fact handle relations, provided we allow our symbolism to extend into two dimensions. In the arithmetic notation, we handle relational statements by extending the subject–predicate notation linearly: instead of writing +S+P, to represent “Some S is P,” we simply write +S+R+P, to represent “Some S is R to some P.” This can be done repeatedly, to give statements of the form “R ± A ± B . . . ± K” (Sommers 1970, p. 21). What is crucial to observe is that the same thing could be done with the fractional notation. We would simply stack fractions on top of each other, to give expressions like
$\begin{array}{c} S \\ \hline R \\ \hline P \end{array}$
which could be read as “All S is R to all P,” or a general form of:
$\begin{array}{c} A \\ \hline B \\ \hline C \\ \hline \vdots \\ \hline K \end{array}$.
The possibility is there for a two-dimensional notation capable of handling relations, where stacked fractions are placed next to each other and multiplied, level by level. Why then did Sommers not use his earlier notation, the fractional notation, to represent relations? Why develop an entirely new system, rather than using one readily at hand? I will propose five possible reasons. First, perhaps the two-dimensional option simply did not occur to Sommers. Second, it is possible that Sommers tried this option and eventually found that it failed, due to deeper problems than my brief exposition reveals. Third, perhaps he held some bias against two-dimensional notations. Frege is the most famous developer of a two-dimensional logical notation (his Begriffsschrift) and Sommers was trying to break from Fregean practice; or perhaps the dominance of linear notations in math and logic led Sommers to believe that one dimension was somehow preferable to two. Fourth, perhaps Sommers was thinking pragmatically. He may have known of the
Begriffsschrift’s poor reception, or feared his audience held an anti-two-dimension bias, and avoided two-dimensionality for popularity’s sake. Or fifth, Sommers may have feared typesetting complex fractions. We have conjectured that he could not easily typeset simple fractions (see note 6), so fractions on many levels might have caused serious difficulties. Sommers supplies an easy account of his choice of TFL notation: he made a simple upgrade from the limited fractional notation to the flexible arithmetic notation. But Sommers had three notations on hand, not two, and at least two of them were, or had the potential to be made, fully functional. We could end our story here: Sommers did not consider the testing procedure notation because he did not consider it a real notation; it was too similar to natural language. He did not develop the fractional notation for one or more of several possible reasons. So, by elimination, he chose the arithmetic notation. There was, however, one more important factor in Sommers’ decision: his views on the psychology of reasoning. In about 1973, Sommers came across a quote that resonated deeply with some observations he had already made. Leibniz writes:
Thomas Hobbes, everywhere a profound examiner of logical principles rightly stated that everything done by the mind is a computation, by which is meant either the addition of a sum or the subtraction of a difference (De Corpore I.i.2). So, just as there are two primary signs of algebra and analytics, + and −, in the same way there are, as it were, two copulas “is” and “isn’t”. (Leibniz 1966, quoted in Sommers 1973 and Sommers 1976)
Sommers thought his arithmetic notation captured something about the way we think when we reason. In 2005, commenting on his decision to switch from the fractional to the arithmetic notation, Sommers writes, “Nor did [the fractional notation] cast much light on how we actually reckon, since we do not [solve syllogisms] by thinking in terms of fractions and reciprocals” (p. 22). The implication here is that we do solve syllogisms by thinking in terms of arithmetic. He writes also in 2005 about logic as “how we think” and the need for a “cognitively adequate logic” (pp. 8–9). What exactly Sommers means by this is unclear. On the one hand, it seems at times as if he is making the weaker claim that, since we perform logical manipulations instantly in our heads—Sommers mentions the example of 10-year-old children who can immediately recognize the equivalence of “No archer will hit every target” and “Every archer will miss some target” (2005, p. 8)—logical reasoning must be due to a faculty like that of simple arithmetic reasoning, which we can also do immediately. On the other hand, he appears occasionally to make a much stronger claim, namely, that logical reasoning is arithmetic reasoning, in that we use the same cognitive faculty and follow the same rules when we do each. He does not explain this further, and I will not try to guess at his meaning or problematize these claims. What is important is that, as of the early 1970s and perhaps earlier, Sommers had become convinced that his arithmetic notation reflected the psychological processes involved in reasoning. This advantage for the arithmetic notation would be the nail in the coffin of the fractional and testing procedure notations. By the mid-1970s, Sommers believed he had at last found the “good notation” of Russell’s dictum, “A good notation is better than a live teacher.”
7 Conclusion
The story I have told in this paper is complex. Sommers began his inquiries into Aristotelian logic with a simple notation, that of his 1967 testing procedure. This notation was so simple, in fact, that Sommers did not consider it to be a bona fide notation, but a mere shorthand. Thus, he did not consider it for use in his later work on TFL, even though simplicity and similarity to natural language were key desiderata for him. Instead, he saw his notational decision as one between two options, the fractional notation of 1967 and the arithmetic notation of 1970. For reasons that are unclear, but which could be as mundane as typesetting constraints, Sommers chose not to extend his fractional notation into two dimensions to handle relations. Instead, due at least in part to puzzling considerations about psychological realism, he chose to develop the arithmetic notation and use it as the notation for TFL. Its complexity is, in large part, the point of this story. Notational decisions are very rarely simple. They are not always well understood even by the makers of those decisions. And they are even more rarely well-documented. They connect to pragmatic issues of historical and practical contingencies; to deep philosophical issues of ontology and language; and to scientific issues of cognition. Sommers’ invention of TFL allows us an important opportunity to look at the creation of a notation. The care and attention he gave to notation make TFL’s notational development a valuable case study in the philosophy of notation. Finally, the story of TFL’s notation does not end here. The arithmetic notation was not a monolith. Sommers presented it many times after 1970; his manner of presenting it changes radically from paper to paper, and the notation itself changes subtly too. It would be a shame to limit our discussion of Sommers’ notations to a window of a few years at TFL’s beginnings. There is room for further work.
Bibliography
Frege, G. (1997). “Function and Concept,” in Michael Beaney (ed.), The Frege Reader, 130-148, Malden, MA, Blackwell Publishers Inc.
Leibniz, G. (1966). Leibniz Logical Papers, G. H. Parkinson (ed. and trans.), Oxford, Clarendon Press.
Sommers, F. (1959). “The Ordinary Language Tree,” Mind, 68, 160-85.
Sommers, F. (1963). “Types and Ontology,” Philosophical Review, 72, 327-363. Reprinted in P. F. Strawson (ed.) (1967), Philosophical Logic, 138-69, Oxford, Oxford University Press.
Sommers, F. (1967). “On a Fregean Dogma,” in I. Lakatos (ed.), Problems in the Philosophy of Mathematics, 47-81, Amsterdam, North-Holland.
Sommers, F. (1970). “The Calculus of Terms,” Mind, 79, 1-39. Reprinted in G. Englebretsen (ed.) (1987), The New Syllogistic, 11-56, New York, Peter Lang.
Sommers, F. (1973). “Existence and Predication,” in M. Munitz (ed.), Logic and Ontology, 159-74, New York, New York University Press.
Sommers, F. (1976). “Logical Syntax in Natural Language,” in A. MacKay and D. Merrill (eds.), Issues in the Philosophy of Language, 11-42, New Haven, Conn., Yale University Press.
Sommers, F. (1982). The Logic of Natural Language, Oxford, Clarendon Press.
Sommers, F. (2005). “Intellectual Autobiography,” in D. Oderberg (ed.), The Old New Logic: Essays on the Philosophy of Fred Sommers, 1-23, Cambridge, MA, MIT Press.
L’équivalence duale de catégories: A Third Way of Analogy? Aurélien Jarry
Abstract
According to the available historical reconstructions, Grothendieck’s invention of the notion of scheme in the late 1950s resulted from a double process of generalization in which the exploitation and extension of analogies between commutative algebra and algebraic geometry seems to have played a decisive role. This process culminated in the establishment of a kind of “dictionary” between these two domains. Starting from a distinction due to Schlimm, I propose here to test on this historical example the descriptive scope of two models of analogy, defined as a relation of similarity between two domains. The first model, predominant in the cognitive sciences, explains analogy in terms of the preservation or projection of structure (structure-mapping theory), the second in terms of common laws or axioms (axiomatic approach). I then show that these two approaches account only imperfectly for the specific features of the relation of analogy established by Grothendieck between commutative algebra and algebraic geometry. This is because that relation is expressed by an equivalence (in the technical sense of category theory) of a particular type called “duality.”
A. Jarry
Bergische Universität Wuppertal, Wuppertal, Germany
e-mail: [email protected]
© Springer Nature Switzerland AG 2020 M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_3
1 Project funded by the Deutsche Forschungsgemeinschaft (DFG): http://gepris.dfg.de/gepris/projekt/279002986. I would like to thank Ralf Krömer, Jean-Pierre Marquis, and Dirk Schlimm for their encouragement and valuable comments, as well as the two anonymous referees for their constructive criticism.
1 Introduction
As part of the project “Duality – An Archetype of Mathematical Thinking” at the University of Wuppertal (Germany),1 my research concerns the development of algebraic geometry in the late 1950s around Alexandre Grothendieck, whose name is associated with a paradigm shift. From 1956 onward, Grothendieck devoted most of his mathematical activity to refounding algebraic geometry, that is, to radically redefining both its objects and its methods. This undertaking, which resulted in an impressive number of publications, notably the several thousand pages of the treatise Éléments de Géométrie Algébrique (EGA),2 co-written with Jean Dieudonné, is readily described as a conceptual revolution,3 because it was accompanied by an enrichment of pre-existing concepts and the introduction of new notions, modifying the very objects under study: from a theory of algebraic varieties, algebraic geometry became, with Grothendieck, the theory of schemes (Schappacher 2007, p. 246). The invention of the notion of scheme is especially interesting because it constitutes, in a certain way, a culmination in the history of algebraic geometry. From the seventeenth century onward, with Descartes, Viète, and Fermat, mathematicians4 explored and sought to make explicit the connections between algebra and geometry. Gradually a kind of “dictionary” was established between, on the one hand, the various geometric objects studied (curves, surfaces, and more generally the so-called “algebraic” varieties) and, on the other, objects that may be called formal because they are defined by algebraic formulas (polynomials).
This correspondence finds its expression today in the language of category theory, and Grothendieck is credited with having established a double generalization within this framework. First, by introducing a new kind of geometric “space” (schemes), he generalized the notion of algebraic variety. Second, by proving the equivalence (in the technical sense of category theory) of the new “classes” of mathematical objects thus obtained, he generalized and, so to speak, extended a partial correspondence established before him between certain algebraic structures and certain geometric objects.5 This provided a general and precise framework of translation between the mathematical theories concerned, each being largely reducible to the other without reducing to it entirely. To the theory of algebraic varieties corresponds, unambiguously, a specific part of the theory of algebras, and to the theory of affine schemes, that of commutative rings. In this context, studying algebraic geometry and studying commutative algebra amount, at bottom, to studying the same objects from two different and complementary points of view, and the choice of method (algebraic or geometric) for solving certain problems is often a matter of personal affinity. It is in this sense that one should understand the following formula, commonly used in the specialized mathematical and historical community:

Algebraic geometry “is” commutative algebra (Krömer 2007, p. 169).

2 The plan announced in the first volume of the EGA was in fact never completed (Grothendieck/Dieudonné 1960, p. 6).
3 Cf. Perrin (2008), p. 1. Schappacher speaks of an “upheaval of algebraic geometry” (Schappacher 2007, p. 246).
4 Here and below in the text I use an inclusive form (mathématicien.ne.s) to make visible the women who have contributed in essential ways to the development of the sciences, where grammar tends to efface their role behind a generic masculine (cf. Arbogast/Condom 2017).
5 An equivalence of categories is given by a particular kind of functor between two categories. All this will be explained in more detail below, in section 4.
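In present-day notation, the correspondence between affine schemes and commutative rings mentioned above is commonly summarized as in the following sketch. It is a standard textbook statement rather than a formula taken from the text; the category names $\mathbf{CRing}$ (commutative rings with unit) and $\mathbf{AffSch}$ (affine schemes) are notational assumptions introduced here for illustration.

```latex
% Notation assumed here: CRing = commutative rings with unit, AffSch = affine schemes.
\[
  \mathbf{AffSch} \;\simeq\; \mathbf{CRing}^{\mathrm{op}},
  \qquad
  A \longmapsto \operatorname{Spec} A,
  \qquad
  X \longmapsto \Gamma(X, \mathcal{O}_X),
\]
\[
  \operatorname{Hom}_{\mathbf{AffSch}}(\operatorname{Spec} B, \operatorname{Spec} A)
  \;\cong\;
  \operatorname{Hom}_{\mathbf{CRing}}(A, B).
\]
```

The superscript “op” records that arrows are reversed: a ring homomorphism from A to B corresponds to a morphism from Spec B to Spec A, in the opposite direction. This reversal is what distinguishes a dual equivalence from an ordinary one.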
A central question in my work is the following: what role did analogy, as a mode of thought, play in the double process of generalization just described and in Grothendieck's development of these frameworks of translation and of the accompanying conceptual tools? In other words, can the notion of analogy contribute, from an epistemological point of view, to elucidating the conceptual development that took place in this area of algebraic geometry in the late 1950s, and under Grothendieck's impetus in particular? In this article I will show not only that this question is relevant, but also that it can be answered partly in the affirmative. However, current theories of analogy apply only imperfectly to the historical case presented here, for reasons that have to do with the fact that the equivalence established by Grothendieck between the theory of schemes and commutative algebra is of a particular type called “duality”.6 I therefore propose to examine, on the basis of this example, how to remedy these shortcomings, to see which of the two models of analogy considered below is better suited to being adapted, and to ask whether it would not be preferable to use this example as a model for characterizing a third form of the analogy relation.

This program is ambitious for at least three reasons. First, the profusion of Grothendieck's works, even restricted to algebraic geometry, is matched only by their reputation for abstraction, complexity and inaccessibility, if only because the terminology employed deliberately departs from that of his predecessors (cf. Grothendieck/Dieudonné 1960, p. 5–6 and p. 9). This is, among other things, a consequence of Grothendieck's choice to use the resources offered by category theory, a theory then still in full development, to reformulate and solve with new methods the mathematical problems that interested him (cf. Krömer 2007, chap. 3 and 4).
Second, attempting to make a new contribution to work on the notion of analogy is a real challenge, even leaving aside its philosophical antiquity and restricting oneself to contemporary debates, so vast is the literature on the subject. My contribution has no other ambition than to add a stone to the general edifice of clarifying the use of this notion or principle in mathematics, and more particularly in the history of mathematics.

Finally, there is a problem of methodological circularity, to which I am not sure I have the solution. If the type of correspondence established by an equivalence of categories (here between rings and schemes) is to serve as a model for characterizing the analogy relation between two domains (here between commutative algebra and algebraic geometry), it must at the very least be reasonable to speak of analogy in describing such an equivalence relation. One must therefore already have a sufficient preconception of analogy, the very notion that the model was supposed to explain. One thus risks swapping the explanatory principle and the object of explanation, in other words ending up explaining the (dual) equivalence of categories by itself.7

To avoid this pitfall, I will first try to justify the use of the term “analogy” on the basis of its use in the history and philosophy of science and, in doing so, explain the epistemological and historiographical advantages I hope to draw from it (section 2). Then, after a brief survey of the secondary literature on analogy, and relying on a dichotomy proposed by Schlimm (2008), I will present in section 3 the two main types of current models of analogy that might apply to the historical case analyzed here, of which I will present a partial historical reconstruction in section 4. In section 5 I will then be in a position to assess whether this example can be explained by one of the two approaches to analogy presented, whether these two approaches are incompatible, and whether it is worth hypothesizing a third model that would better account for the type of relation that a dual equivalence of categories establishes.

6 There are other examples in mathematics of dualities linking “spaces” and “functions” and, through them, different fields such as algebra and geometry (cf. Atiyah 2007; Krömer/Corfield 2014 and Corfield 2017). The choice of the example used here to illustrate the concept of duality is of course justified more by its historical importance than by its accessibility, which is admittedly very relative for non-specialists. Whether my analysis applies to other, broader areas of mathematics or to other historical examples is a question that obviously cannot be addressed here.
7 On the other hand, Brown/Porter (2006) consider that category theory provides precisely an abstract and precise mathematical framework for making explicit and analyzing the notions of comparison and analogy.

2 Motivations and Epistemological Stakes

First of all, why is it relevant to study the role of analogy as a mode of thought in Grothendieck's work in algebraic geometry? A first justification lies in the way Grothendieck himself describes the motivation behind some of his work. In the article known as “Tôhoku”, after the Japanese journal in which it was published, Grothendieck explains that he sought to exploit a perceived analogy between two theories in order to develop a general theory containing them both:

This work originates in an attempt to exploit the formal analogy between the cohomology theory of a space with coefficients in a sheaf [. . .] and the theory of derived functors of functors of modules [. . .], in order to find a common framework encompassing these theories and others (Grothendieck 1957, p. 119).
That Grothendieck explicitly speaks of analogy here to describe his method of invention in homological algebra does not, of course, allow us to infer that he proceeded in the same way in algebraic geometry. However, some commentators on Grothendieck's work describe his conceptual innovations in the latter field in precisely these terms. Among others, Brown and Porter speak of Grothendieck's exploitation of geometric analogies (Brown/Porter 2006, p. 16), and Krömer and Corfield state that Grothendieck repeatedly sought to exploit and complete analogies between different fields, for example between algebra and geometry (Krömer/Corfield 2014, p. 106). Using the term “analogy” to describe the process by which the notion of scheme was invented thus seems appropriate at first sight, since supporting evidence can be found in both primary and secondary textual sources. Using this term is even rather innocuous, since the omnipresence of analogy in every field gives it a familiar and intuitive character. For some decades now, numerous studies in cognitive science and linguistics8 have confirmed the near-universal involvement of analogy as a principle of organization and development of thought.
Analogy indeed plays a driving role in conceptual and linguistic innovations in all their forms, from metaphor to the development of the most elaborate scientific models.9 It has thus been assigned a fundamental role at the “core of cognition” in humans.10 It is therefore hardly surprising that many studies in the history or philosophy of science use precisely this notion to describe and explain the evolution of scientific concepts.11 Mathematics is no exception: several studies and articles attest that the notion of analogy is rather fashionable in the history and philosophy of this discipline.12 Analogy is readily granted a heuristic role as a tool of discovery (Durand-Richard 2008, p. 2, Corfield 2003, p. 81, Knobloch 1989, p. 35–36). According to Grosholz and Sinaceur, this notion, or more exactly certain particular analogies between algebra and geometry, have played an immense role in the development of mathematics. Analogy thus appears as a source of conceptual innovation and progress (cf. Grosholz/Breger 2000).13 According to some, analogy as a cognitive principle is at the very origin of mathematical concepts (Lakoff/Nuñez 2000; Marghetis/Nuñez 2013); others go so far as to make mathematics the proper domain of analogy (Knobloch 1989, p. 35, Atiyah 1976, p. 220). At the very least, analogy is recognized by mathematicians themselves as a source of generalization and abstraction in mathematics,14 which Corfield, summarizing Mac Lane's view, formulates with particular acuity:

An analogy perceived to exist between two or more theories suggests the possibility of generalizing constructions in one of these theories by transferring them to the other and furthermore implies the existence of a common structure to be captured by an abstraction (Corfield 2003, p. 83).

8 For a general overview, see Bartha (2016), Holyoak (2012), Gentner/Holyoak/Kokinov (2001) for the cognitive sciences and Itkonen (2005) for linguistics.
9 For theories of metaphor, cf. Hentschel (2010b), Grady (2010).
10 The expression “analogy as the core of cognition” is due to Douglas Hofstadter (cf. Hofstadter 2001 and Hofstadter/Sanders 2013, Prologue).
11 See, for example, the collective volumes Hallyn (2000), Durand-Richard (2008) and Hentschel (2010a).
12 See, for example, Durand-Richard (2008), the contributions of Grosholz, Knobloch and Sinaceur in Grosholz/Breger (2000), the article by Knobloch (1989) and his contribution in Hentschel (2010a).
Epistemologically speaking, proposing to analyze the development of the notion of scheme by means of the notion of analogy is therefore amply justified, since this mode of thought is widespread in mathematics. Whether Grothendieck consciously used the term “analogy” in this context is not relevant, for the aim here is not to reconstruct Grothendieck's individual psychological process. The aim is to see whether the evolution of the concepts can be described, and hence explained, in terms of the exploitation of an analogy. If such a description succeeds, one may then reasonably hypothesize that it also describes the individual and/or collective thought process; the invention of the notion of affine scheme was, moreover, not the work of a single person, as the protagonists themselves say (cf. McLarty 2007, p. 313). Assuming this will also have the merit of putting into perspective the aura of genius that surrounds Grothendieck, by making intelligible the thought processes at the origin of his reputedly inaccessible ideas.15 Hofstadter and Sanders show, for example, how cognitive processes of categorization and extrapolation on the basis of analogies make Einstein's conceptual innovations in physics (theory of relativity) retrospectively intelligible, which does not mean that they were within everyone's reach before they had been established (Hofstadter/Sanders 2013, chap. 8).16 Analogy thus serves as an a posteriori justification and as a means of rational reconstruction of conceptual development, since, once made explicit and clarified, it retrospectively reveals the deep links between two domains.

Often the sense of mystery accompanying a perceived analogy is dispelled when the structural similarity is explained away in some larger theory (Corfield 2003, p. 81).

13 Analogy is not, however, merely one source of discovery among others. By determining the importance accorded to this or that aspect of a theory, it imposes the direction taken by research. One may cite Poincaré here, who explains that the facts of interest to mathematicians and physicists alike are “those that can lead to the discovery of a law; [that is,] those that are analogous to many other facts, that do not appear to us as isolated but as closely grouped with others [by a] deep but hidden analogy [. . .]” (Poincaré 1908, p. 22, cf. Corfield 2003, p. 81–82).
14 In this form in particular, we have seen that Grothendieck says he sought to exploit an analogy in the “Tôhoku” article.
15 Among the abilities useful to mathematicians, some are probably not specific to the field of application, although applying them requires great familiarity with the concepts and objects of that field. Hofstadter/Sanders thus highlight the blurred boundary between the process of (re-)categorization and the perception of an analogy (Hofstadter/Sanders 2013, chap. 1–4). Corfield also gives several examples in which the perception and exploitation of an analogy go hand in hand with the ability to reinterpret certain facts, to change perspective, and to turn one's attention to other aspects of a problem (Mac Lane speaks of a shift of attention) (Corfield 2003, chap. 4).
Care must be taken here not to present merely a teleological version of the evolution of concepts, nor to sort the historical alternatives too quickly into those that failed and those that succeeded. The interest of the various theories and historical approaches preceding a conceptual innovation such as the invention of the notion of scheme is not limited to the contribution they made to the theories that succeeded them. Yet this is often the flaw of rational reconstructions in the history of mathematics which, illuminating as they may be, often account only very partially for the complexity of the process of conceptual development, because they deliberately smooth over terminological and conceptual variations and do not appreciate the historical alternatives at their true value.

The notion of analogy has the merit of offering conceptual resources for avoiding this methodological trap. Analogy has a dynamic and irreducibly dialectical aspect capable of describing the complex interactions between different branches of mathematics. Pickering, and Corfield after him, describe with the help of a metaphor the process by which an analogy is elaborated, that is, the working out of the “vague idea” or “intuition” of a connection between two apparently distinct theories. This process comprises at least three moments: the creation of a first “bridge” between two theories, the phase of “transport” or “transcription” of information from the first domain into the newly connected domain, and that of “completion” of the theory by missing elements (Corfield 2003, p. 85). Although the image is quite telling, these three aspects of the process do not take place in a vacuum. They are subject to various constraints linked to the historical and intellectual context of a period and of the people involved (cf. Durand-Richard 2008, p. 5–6), in particular to the information available to them and to the hypotheses they are prepared to formulate, accept or abandon (Corfield 2003, p. 85).17 As a source of conceptual innovation, analogy induces a process of assimilation and modification of the concepts it relates, of their meaning, and of the meaning and use of the symbols employed to denote them (cf. Durand-Richard 2008, p. 6–7, Corfield 2003, p. 84–86).18 Thus Durand-Richard explains, for example, that the acceptance of imaginary numbers led to giving up, for numbers, the universality of the order relation and their representation on a line (Durand-Richard 2008, p. 7). Similarly, Corfield holds that the partial success of Hamilton's efforts to find a generalization of the complex numbers was constrained by the kind of properties he wished to preserve (Corfield 2003, p. 85–86).

To sum up, in seeking to analyze the invention of the notion of scheme by means of the notion of analogy, one uses the latter term in accordance with its epistemological use in the history and philosophy of science. It remains to describe more precisely the “thought process” supposed to underlie the conceptual development being reconstructed. It is, for example, not at all clear that the notion of scheme results from exploiting the same type of analogy as the one to which Grothendieck explicitly refers in the introduction to “Tôhoku”. Now, if analogy, taken as a specific form of organization of thought and as a driving principle of semantic change, is to have an explanatory role in the history of ideas, one must specify in what form(s) it may have played a role, and therefore rely on descriptive models of the relation that it exploits or extrapolates. The purpose of the following section is to present the two most common approaches available.

16 Corfield makes a similar remark about the notion of a “space” of functions (Corfield 2003, p. 82).
17 Durand-Richard shows that the sometimes negative image associated with analogy is linked to that of an already established, normative science, little concerned with thematizing the conditions of interpretation of the models it presents (Durand-Richard 2008, p. 2).
18 Where applicable, this involves a phase of strictly formal manipulation of symbols designating “imaginary” or “impossible” quantities, as the history of numbers illustrates (cf. Durand-Richard 2008, p. 6).
3 Two Ways of Analogy

The notion of analogy is not only intuitive; it is also highly polysemous. Its meaning remains difficult to circumscribe despite numerous attempts at clarification and modeling. It has even become a commonplace of studies on analogy to recall, with John Stuart Mill, the following fact:

There is no word [. . .] which is used more loosely, or in a greater variety of senses, than analogy (Mill, quoted by Hentschel 2010b, p. 15).
The term “analogy” can thus refer, among other things:
• to a certain type of non-arbitrary19 relation or connection between two or more “domains”, that is, objects, phenomena, ideas, conceptual structures, theories, etc. (cf. Schlimm 2008);
• to the cognitive process of perceiving or establishing an analogy relation by comparing two domains (cf. Bartha 2016);
• to a certain mode of reasoning in which the comparison of particular cases serves as the basis for plausible though fallible inferences about new cases (induction, argument by analogy) (cf. Bartha 2016; Holyoak 2012).

Of course, these different aspects covered by the notion are interdependent. In studies in cognitive psychology, for example, if analogy is studied as a fundamental cognitive principle of organization of thought (sense 2), it is because in many cognitive tasks (categorization, reasoning, problem solving) the perception and use of analogy relations (sense 1) between distinct domains are preponderant and condition our behavioral responses. In the same way, describing and analyzing reasoning by analogy (sense 3) also presupposes clarifying the type of relations taken into account in the processes of comparison and inference.

Even when one restricts oneself to the first sense, matters become more complicated still. The analogy relation is most often defined as a certain type of similarity relation. But, apart from the fact that the notion of similarity must itself be specified on pain of merely pushing the problem back,20 the use of the term “analogy” does not always keep to this restriction, and it is difficult to delimit its meaning from that of the neighboring notions of “metaphor”, “interpretation” and “model” (cf. Epple 2016; Hentschel 2010b). The term “analogy” in fact designates rather a continuum of relations, ranging from some non-arbitrary connection or other (for example of a metonymic type), through a relatively superficial similarity relation based on the presence of common properties in the two domains concerned, to a deep or “structural” similarity relation involving relations between relations (cf. Gentner/Markman 1997, p. 48).

19 An analogy is not a matter of chance (cf. Gentner/Markman 1997, p. 48). As Corfield recalls, referring to Pólya: “[A]nalogy is based on the hope that a common ground exists between two domains” (Corfield 2003, p. 83). This does not say, however, what kind of reason or connection is required for a relation between two domains to qualify as an analogy.
Nevertheless, if Hentschel (2010b) and Holyoak (2012) are to be believed, the best framework for describing the analogy relation, viewed as a relation of “deep” similarity between two domains, is the one formulated in terms of “structure projection”. This type of approach is known under the label Structure Mapping Theory (SMT). To this model, which has become dominant in the cognitive sciences and in the history and philosophy of science, one should, according to Schlimm (2008), add at least one other, which describes the analogy relation in terms of axioms common to the domains compared and is better able to capture the meaning of certain analogies relevant in mathematics. I therefore propose to start from the dichotomy proposed by Schlimm, on whom I rely heavily for the presentation that follows, in order to see whether the two resulting models are suited to the historical case that interests me.
20 Hahn (2003) notes of the notion of similarity that, without a precise delimitation of its extension, it remains too vague and all-purpose and thereby loses all explanatory relevance (Hahn 2003, p. 386). On the methodological difficulties of modeling and measuring the similarity relation, cf. Goldstone/Son 2012, p. 171–172. The very broad use that Hofstadter/Sanders (2013) make of the notion illustrates this problem very well, in my opinion: in their presentation, analogy is at work in every situation of the type “That reminds me of something . . .”.
3.1 The Structural Approach: SMT

The SMT model was developed mainly by Gentner and her team (Gentner 1983) and has been the subject of numerous computer simulations. Several variants now exist (cf. Holyoak 2012), but on the whole all of them explain analogy in terms of structural alignment between the two domains compared, on the basis of a projection f, that is, a term-by-term matching of the first domain (called the source) onto the second (called the target). This structural alignment via f is characterized by three things: a requirement of structural consistency, the greater importance given to complex or higher-order relations, and a systematic character (cf. Schlimm 2008, p. 181). Let us examine these three points in detail.

According to SMT, an analogy associates with an object (respectively a relation) of the source domain an object (respectively a relation) of the target domain, in such a way that the relations between the objects thus matched are identical. More precisely, it suffices that the matched relations be very similar, that is, semantically very close, and in particular that they have arguments (objects) that correspond to each other. A minimal condition is, for example, the preservation of the number of arguments of the relations and of their order: with a two-place relation R is associated a two-place relation R′ such that if R(a, b), then R′(a′, b′), where a′ is the image of a and b′ that of b. This is what is meant by structural consistency. In doing so, SMT focuses on higher-order and/or more complex relations rather than on the properties of the objects themselves, and it also gives more weight to relations between relations. According to Holyoak (2012), various semantic and/or pragmatic constraints can play a role in selecting the admissible and/or relevant relations.
This is why the relevant relations can sometimes reduce to properties (that is, to one-place relations), the context selecting certain arguments by default. If the proponents of the SMT model focus on the preservation of relations proper, it is because they restrict the use of the term “analogy” to cases of “deep” analogy. An example given by Gentner, the analogy between Bohr's model of the atom and the solar system, illustrates the general idea. The elements of the two systems have no relevant properties in common (mass, color, size, . . .), but in each case they belong to a system characterized by a source of attraction and by elements subject to that attraction according to the “same” relation. The electrons stand in a relation to the nucleus of the atom similar to the one in which the planets stand to the sun.21
21 Here we find again the classical sense of analogy as ratio, the so-called analogy of “proportion”: A is to B as C is to D, even if A and C on the one hand, and B and D on the other, have no relevant properties in common (cf. Hentschel 2010b, p. 14 and Durand-Richard 2008, p. 16).
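The structural-consistency condition described in this section (if R(a, b) holds in the source, then the image relation holds of the images of a and b in the target) can be sketched in a few lines of Python. This is only an illustrative toy, not an implementation of Gentner's model or of any published SMT simulation: the encoding of facts as tuples, the relation names and the function `is_structurally_consistent` are all assumptions introduced here.

```python
from typing import Dict, Set, Tuple

# A fact is a relation name plus an ordered tuple of arguments (hypothetical encoding).
Fact = Tuple[str, Tuple[str, ...]]

# Toy descriptions of the two domains; the relation names are illustrative only.
SOLAR: Set[Fact] = {
    ("attracts", ("sun", "planet")),
    ("revolves_around", ("planet", "sun")),
    ("more_massive_than", ("sun", "planet")),
}
ATOM: Set[Fact] = {
    ("attracts", ("nucleus", "electron")),
    ("revolves_around", ("electron", "nucleus")),
    ("more_massive_than", ("nucleus", "electron")),
}


def is_structurally_consistent(
    source: Set[Fact],
    target: Set[Fact],
    obj_map: Dict[str, str],
    rel_map: Dict[str, str],
) -> bool:
    """True if the object map is one-to-one and every source fact has its image in the target."""
    if len(set(obj_map.values())) != len(obj_map):
        return False  # two source objects share one image: not a term-by-term matching
    for rel, args in source:
        if rel not in rel_map or any(a not in obj_map for a in args):
            return False  # the mapping is undefined on this fact
        # Arity and argument order are preserved by construction of the image.
        image = (rel_map[rel], tuple(obj_map[a] for a in args))
        if image not in target:
            return False
    return True


OBJ_MAP = {"sun": "nucleus", "planet": "electron"}
REL_MAP = {rel: rel for rel, _ in SOLAR}  # relations matched by name

print(is_structurally_consistent(SOLAR, ATOM, OBJ_MAP, REL_MAP))  # True
swapped = {"sun": "electron", "planet": "nucleus"}
print(is_structurally_consistent(SOLAR, ATOM, swapped, REL_MAP))  # False
```

Note that the check never compares attributes of the objects themselves (mass, color, size); only relations between matched objects count, which is the point of the Bohr atom example.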
SMT then explains the dynamic aspect of analogy as follows. The perception of an analogy relation is first established through a partial matching (the projection f), perceived or tested on subdomains of the source and target domains, which then serves as the basis for a process of extrapolation called analogical transfer. Where elements or relations in the source domain still have no equivalents in the target domain, one postulates the existence of corresponding elements or relations in the latter. This leads to hypotheses about the target domain that may be confirmed or refuted. Where necessary, this forces one to reconsider the relations taken into account and to readjust the projection establishing the correspondence. By the systematic character of the SMT model is meant this tendency to exploit an analogy to the full and to integrate the largest possible number of relations and interrelations.

In mathematical terms, the type of projection involved is modeled by a kind of “function”, and the notion of (at least partial) isomorphism seems able to serve as a model to a first approximation. As Schlimm recalls (Schlimm 2008, p. 183), citing Pólya:

[A]n isomorphism, understood as “one-to-one correspondence that preserves the laws of certain relations” provides “a fully clarified sort of analogy”.
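The transfer step just described can be sketched as follows, under strong simplifying assumptions: relations keep the same name in both domains, only relations (not new objects) are postulated, and the function name `transfer` and the toy facts are hypothetical, introduced here purely for illustration. Given a partial object mapping, the sketch lists the target facts that the source suggests but that are not yet known to hold in the target; these are the hypotheses to be confirmed or refuted.

```python
from typing import Dict, List, Set, Tuple

# A fact is a relation name plus an ordered tuple of arguments (hypothetical encoding).
Fact = Tuple[str, Tuple[str, ...]]


def transfer(source: Set[Fact], target: Set[Fact], obj_map: Dict[str, str]) -> List[Fact]:
    """Return candidate target facts suggested by the source but absent from the target."""
    candidates: List[Fact] = []
    for rel, args in sorted(source):  # sorted only to make the output order deterministic
        if all(a in obj_map for a in args):
            image = (rel, tuple(obj_map[a] for a in args))
            if image not in target:
                candidates.append(image)  # a hypothesis about the target domain
    return candidates


source = {
    ("attracts", ("sun", "planet")),
    ("revolves_around", ("planet", "sun")),
}
target = {("attracts", ("nucleus", "electron"))}
partial_map = {"sun": "nucleus", "planet": "electron"}

print(transfer(source, target, partial_map))
# [('revolves_around', ('electron', 'nucleus'))]
```

In this toy setting, a refuted candidate would lead one to revise `partial_map` or the set of relations considered, which corresponds to the readjustment of the projection mentioned in the text.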
In order to include cases of correspondences that are not one-to-one, the notions of homomorphism or of immersion/injection of one domain into another are also sometimes called upon.22

Despite its success, SMT is, according to Schlimm, unsuited to certain types of domains. Schlimm distinguishes two different types: object-rich domains/contexts and relation-rich domains/contexts (Schlimm 2008, p. 183–184). This distinction is determined by the ratio between the number of objects and the number of relevant relations in the domain concerned. For one and the same set of objects, one can therefore have domains that are more or less rich in objects or in relations. Typically, a mathematical “domain” is “object-rich”, because there are in general more elements than relevant relations taking the elements of the domain as arguments. Thus the number of elements of an algebraic structure or of a space is potentially infinite (numbers, points, functions, etc.), whereas the number of relations considered and relevant is in general small (numerical operations, relations of incidence and collinearity, etc.). According to Schlimm, the SMT model cannot explain the type of analogy that exists between two such object-rich domains, for example between two algebraic structures of the same type. Schlimm illustrates his point with the notion of group and shows that the analogy between two, and a fortiori several, groups is not explained by a projection between these groups, whatever the type of “projection” considered (isomorphism, homomorphism or immersion).23 If the approach in terms of a structure-preserving function between distinct groups cannot explain their being analogous qua groups, which approach is then more relevant for describing this type of analogy relation?

22 Considering only functions is, however, perhaps too restrictive. In section 5, I will show that between two categories a functor plays the same role as the type of projection envisaged here, but proves even more flexible in use and, above all, makes it possible to relate, without ad hoc constructions, relations that do not respect the order of the arguments.
3.2 The axiomatic approach

Schlimm defends an approach that characterizes analogy in axiomatic terms. On this approach, two domains are analogous to one another if they satisfy the same laws or axioms, that is, if they are models of the same set of axioms (Schlimm 2008, p. 180).24 The mathematician Pólya expresses this idea as follows: [I]n general, systems of objects subject to the same fundamental laws (or axioms) may be considered as analogous to each other, and this kind of analogy has a completely clear meaning (Pólya, quoted by Schlimm 2008, p. 183).
To take up Schlimm’s example, what two groups have in common is not so much a structure25 in the sense of the SMT model, that is, a structure preserved by some kind of “function” between groups, as a certain number of properties of the operations of these groups (existence of an identity element for the operation, the associative law and possibly the commutative law, etc.), properties which can themselves be expressed by axioms.
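The axiomatic criterion lends itself to a mechanical check on small examples. The following Python sketch is purely illustrative: the function name `is_group` and the chosen models (addition modulo 4, multiplication of the nonzero residues modulo 5, and subtraction modulo 4 as a non-example) are assumptions made here, not examples taken from Schlimm. It tests, by brute force, whether a finite set with a binary operation is a model of the group axioms.

```python
from itertools import product

def is_group(elements, op, identity):
    """Brute-force check of closure, associativity, identity and inverses."""
    elements = list(elements)
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(elements, repeat=3))
    neutral = all(op(identity, a) == a == op(a, identity) for a in elements)
    inverses = all(any(op(a, b) == identity for b in elements) for a in elements)
    return closed and assoc and neutral and inverses

# Two different "domains" that are models of the same axioms.
add_mod_4 = is_group(range(4), lambda a, b: (a + b) % 4, 0)
mul_mod_5 = is_group([1, 2, 3, 4], lambda a, b: (a * b) % 5, 1)

# Non-example: the same elements under subtraction are not a model.
sub_mod_4 = is_group(range(4), lambda a, b: (a - b) % 4, 0)
```

On this reading, the first two structures count as analogous because both checks succeed against one and the same list of axioms, without any mapping between their elements being exhibited.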
23 I refer to his article for the details of his argument. Schlimm makes clear that the SMT approach is disqualified only for this type of domain and under the assumption that the projection is made between the elements of the domains. More precisely: [It] is not that one could not base a characterization of analogies on mappings of some kind, but that the characterization of analogies based on mappings between the two analog domains is inadequate for object-rich domains (Schlimm 2008, p. 188, n. 15).

24 As Schlimm explains, this characterization of analogy originates in the work of Boltzmann and Maxwell in physics, and was defended in the last century by philosophers such as Duhem, Suppes, Nagel and Hempel. Schlimm’s terminology covers what Hempel calls nomic isomorphism and Hesse formal analogy.

25 The now customary use in mathematics of the term “structure” to designate abstract algebraic objects such as groups, rings, etc. is therefore liable to cause confusion, for if these abstract objects are so named, it is precisely because the identity criterion of such a “structure” does not depend on the intrinsic properties of its elements, that is, on their “ontological” identity criterion. Whether it is realized, that is, exemplified, by geometric transformations or by a set of numbers, a group is characterized by the relations its elements bear to one another (that is, its operations and the laws governing them, its “substructures” (subgroups), etc.). Here the term “structure” thus ultimately refers to the same general idea as Gentner’s: what matters are the relations, not the elements themselves.
L’équivalence duale de catégories: A Third Way of Analogy ?
Still according to Schlimm, the axiomatic approach has further advantages. It allows one to formulate explicitly both the commonalities and the relevant differences (positive vs. negative analogies). Next, the procedure for determining the axioms can be applied recursively, that is, step by step and without calling into question the results of earlier stages of axiomatization, in order to “deepen” the analogy. Finally, it provides an economical representation of the “structural”26 properties/relations between two domains (cf. Schlimm 2008, p. 192–195). The axiomatic characterization of the analogy relation has indeed served to formulate and introduce abstract notions (such as group, ring, ideal, lattice, etc. in algebra) whose instances are analogous to one another by virtue of their very membership in these “classes” of objects. Schlimm (2011) gives further examples drawn from the history of twentieth-century mathematics, and indeed a large part of the development of modern algebra seems to have aimed at defining ever more general and abstract mathematical objects (cf. Corry 1996), but on the basis of a form of analogy other than that defined by a structural alignment through projection between the objects themselves.27
26 The use of the term “structure” is here again ambiguous, but it reveals how difficult it is to maintain strict terminological consistency. In this context, Schlimm refers to the “deep”, non-superficial aspect of the analogy, and not to a “structure”, a “form” in the sense of the Greek μορφή, that would be preserved by a function or morphism in Gentner’s sense.

27 The examples given by Schlimm are closely tied to the use of a particular method in mathematics, the so-called axiomatic method, which developed considerably with the work of Hilbert in particular. Through the search for ever more precise axioms and the methodical exploration of their independence and their logical consequences, whole areas of mathematics were renewed. The axiomatic method has come in different variants, sometimes leading to the introduction of more abstract notions, but not always and not only. Taking up a distinction due to Mehrtens, Schlimm distinguishes three variants of the application of the axiomatic method: the variant he calls “by analogy”, in which the comparison of two or three objects or domains makes it possible to determine their commonalities, expressed as axioms; the variant “by abstraction”, in which certain aspects of an object or domain are set aside by giving a partial axiomatic formulation of the domain considered; and the variant “by modification”, in which one of the axioms of an already axiomatized theory is explicitly modified, as was the case with the introduction of non-Euclidean geometry (cf. Schlimm 2011). In an oral communication, Jean-Pierre Marquis suggested to me that on this point it would probably be useful to distinguish the axiomatic method proper from another, more general method, which Marquis calls the “abstract method”. According to him, while the axiomatic method has been one possible way of exploring different domains that led to abstraction, it is not the only one, and abstraction has not been its only result. Marquis therefore urges us not to conflate abstraction, formalism and the axiomatic method (cf. Marquis 2014, 2016).

4 From varieties to schemes

The aim of this section is to give a version of the history of the invention of the notion of scheme detailed enough to be analysed with the two models of analogy just presented. Let us first briefly recall the main object of algebraic geometry as Grothendieck would have encountered it in the mid-1950s.28 To simplify, and restricting ourselves to the “affine” case, classical algebraic geometry studies the properties of sets $V$ of geometric points whose coordinates are solutions of a set $S$ of polynomial equations with coefficients in a field $k$, for example $k := \mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$:

\[ V(S) := \{ x := (x_1, \ldots, x_n) \in k^n : P(x) = 0 \ \ \forall P \in S \}, \qquad S \subseteq k[X_1, \ldots, X_n]. \]

For $k := \mathbb{R}$, $n := 2$, one recovers the second-degree curves known as conics, for example the circle $V(X^2 + Y^2 - 1)$, the hyperbola $V(XY - 1)$ or the parabola $V(Y - X^2)$. As Schappacher (2007) among others reports, part of the work of Grothendieck’s predecessors in the 1930s and 1940s consisted in reducing the study of these geometric objects to that of algebraic structures that can be associated with them. To an algebraic set $V_0 := V(S)$ of points defined as above, one can indeed associate the set (in fact the ideal)

\[ I(V_0) := \{ P \in k[X_1, \ldots, X_n] : P(x) = 0 \ \ \forall x \in V_0 \} \]

of polynomials that vanish on this set of points, and the associated affine algebra $\Gamma(V_0)$, defined by

\[ \Gamma(V_0) := k[X_1, \ldots, X_n] / I(V_0), \]

that is, roughly, the set of polynomial functions on $V_0$, two polynomials being identified when their difference vanishes on $V_0$. These constructions $I$ and $\Gamma$, which can also be regarded as “functions” associating with every variety $V_0$ an ideal $I(V_0)$ and an affine algebra $\Gamma(V_0)$, are useful because they are interdependent and give information about the original set $V_0$. Conversely, $V$ can be regarded as a function associating with an ideal $I_0$ a set of points $V(I_0)$. Now, as functions, $V$ and $I$ have interesting properties that link them.
They are decreasing, that is, they reverse the order in the inclusion relation: if a variety $V$ is contained in another, $W$, then the set $I(W)$ of polynomials that vanish on $W$ is contained in the set $I(V)$ of those that vanish on $V$; and conversely, if an ideal $I$ is contained in another, $J$, then the variety $V(J)$ is contained in the variety $V(I)$. Better still, $V$ and $I$ are dual to one another, that is, they exchange the roles of set-theoretic union and intersection:29

\[ V(I) \cup V(J) = V(I \cap J), \qquad V(I) \cap V(J) = V(I \cup J), \qquad I(V) \cap I(W) = I(V \cup W). \]
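The order-reversing behaviour of $V$ can be checked by brute force on a small case. The sketch below is only an illustration under an assumption made here: it works over the finite field with five elements, so that the plane has just 25 points and can be enumerated. The names `V`, `circle`, `hyperbola` and `parabola` are illustrative and reuse the three conics mentioned earlier.

```python
from itertools import product

P = 5  # assumption for this sketch: coefficients in the finite field F_5

def V(polys):
    """Points of F_5 x F_5 at which every polynomial in `polys` vanishes."""
    return {pt for pt in product(range(P), repeat=2)
            if all(f(*pt) % P == 0 for f in polys)}

circle = lambda x, y: x * x + y * y - 1   # X^2 + Y^2 - 1
hyperbola = lambda x, y: x * y - 1        # XY - 1
parabola = lambda x, y: y - x * x         # Y - X^2

S = [circle]
T = [circle, parabola]                    # S is contained in T

# More equations cut out fewer points: V reverses inclusion.
assert V(T) <= V(S)

# Putting two sets of equations together intersects their solution sets.
assert V(S + [hyperbola]) == V(S) & V([hyperbola])

# The product of two polynomials vanishes exactly on the union of their zero sets.
product_poly = lambda x, y: circle(x, y) * hyperbola(x, y)
assert V([product_poly]) == V([circle]) | V([hyperbola])
```

The last check relies on the fact that a product vanishes in a field only if one of its factors does, which is why a prime modulus is used.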
28 In this section I rely on the technical presentation of Perrin (2008). A very similar and more succinct presentation can be found in the article (Görtz 2018), which appeared when the writing of the present article was essentially complete.

29 Strictly speaking, in the second equality $I \cup J$ is not an ideal, but this is of no consequence.
As one can see, this duality is nevertheless imperfect. The reason is that for a given initial ideal $I_0$, $I(V(I_0))$ is in general not equal to $I_0$, whereas conversely, for a variety $V_0$, $V(I(V_0)) = V_0$. This expresses algebraically the fact that, defined as above, the set of geometric points $V(S)$ underdetermines the set of equations $S$ of which these points are solutions. For example, for $k := \mathbb{R}$, $n := 1$, $V(X^2) = V(X) = \{x = 0\}$. Partly for this reason, the definition of an affine variety as an algebraic set has several defects. First, it is not intrinsic, since it is relative to an embedding in an ambient affine space $k^n$. Worse, the sets of points so defined depend on the field in which the equations are to be solved. For example, according to whether $k := \mathbb{R}$ or $k := \mathbb{C}$, $V(X^2 + Y^2 + 1)$ is either the empty set $\emptyset$ or a nonempty conic in the complex plane $\mathbb{C}^2$. Next, the definition restricts the type of coefficients, and hence of solutions, that are allowed, whereas in Diophantine arithmetic the equations have integer coefficients (that is, in $\mathbb{Z}$) or even coefficients in a field of nonzero characteristic (e.g. $\mathbb{Z}/p\mathbb{Z}$). Finally, when it comes to counting the number of intersection points of different varieties, there are problems of multiplicity, even without taking into account possible points “at infinity”. In the example $k := \mathbb{R}$, $n := 1$ above, $V(X^2)$ is a “double point”, $\{x = 0\}$ counted twice. Algebraically, this is reflected in the fact that the ideal $(X^2)$ generated by $X^2$ is not prime, so that the quotient algebra $k[X]/(X^2)$ has nilpotent elements. One says that it is not reduced. In the particular case where $k$ is an algebraically closed field, however, for example $k := \mathbb{C}$, the equality $I(V(I_0)) = I_0$ does hold for every radical ideal $I_0$ (in general, $I(V(I_0))$ is the radical of $I_0$). This is the content of Hilbert’s Nullstellensatz.
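The failure of $I(V(I_0)) = I_0$ in the one-variable example can be spelled out as follows. This is a sketch with $n := 1$ over an arbitrary field $k$, writing $(P)$ for the ideal of $k[X]$ generated by $P$ and $\bar X$ for the class of $X$ in a quotient; the notation is introduced here for the purpose of the illustration.

```latex
\[
I_0 := (X^2), \qquad V(I_0) = \{0\}, \qquad
I(V(I_0)) = I(\{0\}) = (X) \supsetneq (X^2) = I_0 .
\]
\[
k[X]/(X) \cong k \ \text{(no nilpotent elements)}, \qquad
\bar X \in k[X]/(X^2), \quad \bar X \neq 0, \quad \bar X^{2} = 0
\ \text{(a nilpotent element)} .
\]
```

The ideal $(X)$ is the radical of $(X^2)$: passing from $I_0$ to $I(V(I_0))$ forgets exactly the multiplicity that the quotient $k[X]/(X^2)$ still records.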
Under these hypotheses, it is possible to establish a partial correspondence between certain algebraic structures and certain objects of algebraic geometry. To irreducible algebraic varieties $V$ correspond prime ideals $I(V)$ and integral domains $\Gamma(V)$; to points $V = \{x\}$ correspond maximal ideals $\mathfrak{m}_x$; and for every point $x$, $\Gamma(\{x\})$ is identified with the field $k$. This is how Grothendieck characterized the state of knowledge in algebraic geometry at the International Congress of Mathematicians in Edinburgh in 1958: an affine algebraic variety [$V$] with ground field k is determined by its co-ordinate ring [$\Gamma(V)$], which is an arbitrary finitely generated A-algebra without nilpotent elements; therefore, any statement concerning affine algebraic varieties can be viewed also as a statement concerning rings A of the previous type (Grothendieck 1958, p. 105).
Grothendieck then sets out the reasons which, in his view (and in that of others before him), called for a generalization of the notion of variety. A good number of theorems in algebra could be proved under weaker hypotheses than those imposed by reduced finitely generated algebras, that is, algebras without nilpotent elements, and could therefore be extended to more general algebraic structures (for example the commutative rings with unit called Noetherian). On the assumption that a correspondence of the same type as that established under the hypotheses of Hilbert’s theorem exists, this raised the hope that there would be a geometric counterpart to Noetherian commutative rings, and perhaps even to commutative rings in general.
The question remained how to define this geometric counterpart and how far such a correspondence could be generalized. Before retracing the history30 of this double generalization attributed to Grothendieck,31 it is necessary to recall here a few definitions from category theory.32 A category $\mathcal{C}$ consists of a “family” of mathematical objects $\mathrm{Obj}(\mathcal{C}) := \{C, C', C'', \ldots\}$ and of sets of “morphisms”33 $\mathrm{Mor}(\mathcal{C})$ between these objects, that is, for each pair of objects $(C, C')$, a set $\mathrm{Mor}(C, C') := \{f, g, h, \ldots\}$, together with a composition rule:

\[ \circ : \mathrm{Mor}(C, C') \times \mathrm{Mor}(C', C'') \to \mathrm{Mor}(C, C''), \qquad (f, g) \mapsto g \circ f. \]

These data must satisfy minimal “coherence” conditions:
• First, the composition rule must be associative: for all morphisms $f \in \mathrm{Mor}(C, C')$, $g \in \mathrm{Mor}(C', C'')$ and $h \in \mathrm{Mor}(C'', C''')$ for which the composition is defined, $(h \circ g) \circ f = h \circ (g \circ f)$.
• For each object $C$, there is a morphism $\mathrm{id}_C$, called the “identity”, which behaves neutrally with respect to composition: for $f \in \mathrm{Mor}(C, C')$, $f \circ \mathrm{id}_C = f$ and $\mathrm{id}_{C'} \circ f = f$.

A prototypical case is that of a category of sets equipped with a “structure”. One thus has the category of groups Group, that of rings Ring, that of vector spaces $\mathrm{Vec}_k$ over a field $k$, that of topological spaces Top, etc., with their respective associated morphisms, that is, the functions that “respect the structures” of these sets.34

30 In fact, this is rather a rational reconstruction of the invention of this notion, that is, an a posteriori motivation of the modern definition, as presented for example by Dolgachev (1974), Dieudonné (1985), Cartier (2001), or modern textbooks such as Perrin (2008). This perspective masks the complexity of the historical process by deliberately erasing the other paths explored, for example by Chevalley and Nagata, as well as the terminological hesitations (cf. McLarty 2007).

31 Authorship of the “final” idea is attributed by the protagonists themselves (among others Cartier and Serre) to Grothendieck, although others had similar ideas at the same time (cf. Cartier 2001, p. 398, n. 29) or at any rate contributed to the climate of scientific ferment from which the notion of affine scheme as defined today emerged (cf. McLarty 2007, p. 313).

32 I rely here on the succinct presentations of Brown/Porter (2006) (p. 4–5) and Vakil (2017) (p. 23–31), since they suffice for my purposes.

33 I have chosen here to follow Vakil’s terminology and not to speak of “arrows”, because it is the terminology used by Grothendieck and is therefore better suited to the historical case analysed here. For an explanation of the drawbacks of the term “morphism”, in particular its lack of generality, see Brown/Porter (2006).

34 Not every category is of this type, however (cf. Brown/Porter 2006, p. 5 and Vakil 2017, p. 27).

A category is not, however, interesting in itself, but rather in comparison with others, through the “links” or analogies that exist between them. This idea of comparison and analogy between categories finds its mathematical expression in the notion of functor, which gives it a precise technical meaning (cf. Brown/Porter 2006). A (covariant) functor is a “rule of correspondence” $F$ between two categories $\mathcal{C}$ and $\mathcal{D}$, more precisely between their objects and their morphisms:

\[ F : \mathrm{Obj}(\mathcal{C}) \to \mathrm{Obj}(\mathcal{D}), \quad C \mapsto F(C) := D; \qquad \mathrm{Mor}(\mathcal{C}) \to \mathrm{Mor}(\mathcal{D}), \quad (f : C \to C') \mapsto (F(f) : F(C) \to F(C')). \]

In other words, for each pair of objects $(C, C')$ one has the inclusion $F(\mathrm{Mor}(C, C')) \subseteq \mathrm{Mor}(F(C), F(C'))$. This rule of correspondence must be compatible with the composition of morphisms, that is:
• For every object $C$, $F(\mathrm{id}_C) = \mathrm{id}_{F(C)}$.
• For morphisms $f : C \to C'$ and $g : C' \to C''$, $F(g \circ f) = F(g) \circ F(f)$.

There is a second type of functor, called contravariant, which reverses the “direction” of morphisms: $F(\mathrm{Mor}(C, C')) \subseteq \mathrm{Mor}(F(C'), F(C))$. It is this type of functor that arises in algebraic geometry, as will be explained below. An equivalence of categories between two categories $\mathcal{C}$ and $\mathcal{D}$, written $\mathcal{C} \simeq \mathcal{D}$, is established if there exists a functor $F : \mathcal{C} \to \mathcal{D}$ with the following properties:
• For each object $D$ of $\mathcal{D}$, there is an object $C$ of $\mathcal{C}$ such that $F(C)$ is isomorphic to $D$ ($F$ is called essentially surjective).
• For all $C, C' \in \mathrm{Obj}(\mathcal{C})$, $F$ induces a bijection from $\mathrm{Mor}(C, C')$ onto $\mathrm{Mor}(F(C), F(C'))$ (so to speak, no morphism is “lost”; $F$ is said to be fully faithful).

In the case of an equivalence of categories established by a contravariant functor $F$, one speaks of “duality”, “dual equivalence” or “anti-equivalence”.35 An equivalence of categories is a more flexible notion than that of isomorphism of categories, but the idea is all in all close, an equivalence of categories being, so to speak, an “isomorphism up to isomorphism”. Indeed, the correspondence established between two equivalent categories is not “one-to-one”: the object $D$ of the second category associated with each object $C$ of the first is not strictly unique, numerically speaking. Other “objects” $D', D'', D''', \ldots$, potentially infinite in number and all isomorphic to one another, have the same properties as $D$ as far as their role in the equivalence relation is concerned.
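The compatibility of a functor with composition, and the difference between the covariant and the contravariant case, can be tested on a toy example. The following sketch is an illustration constructed here, not an example from the sources cited: finite sets and functions (encoded as Python dictionaries) stand in for a category, `image` plays the role of a covariant rule and `preimage` that of a contravariant one; all names are hypothetical.

```python
def compose(g, f):
    """Composite g∘f of two functions given as dicts (apply f first, then g)."""
    return {x: g[f[x]] for x in f}

def image(f):
    """Covariant rule: f: A -> B is sent to 'subset of A |-> its image in B'."""
    return lambda subset: frozenset(f[x] for x in subset)

def preimage(f):
    """Contravariant rule: f: A -> B is sent to 'subset of B |-> its preimage in A'."""
    return lambda subset: frozenset(x for x in f if f[x] in subset)

f = {1: 'a', 2: 'a', 3: 'b'}     # f: {1, 2, 3} -> {'a', 'b'}
g = {'a': True, 'b': False}      # g: {'a', 'b'} -> {True, False}
gf = compose(g, f)

A0 = frozenset({1, 3})           # a subset of the source of f
C0 = frozenset({True})           # a subset of the target of g

# Covariant: F(g∘f) = F(g)∘F(f), same order of composition.
covariant_ok = image(gf)(A0) == image(g)(image(f)(A0))

# Contravariant: F(g∘f) = F(f)∘F(g), the order is reversed.
contravariant_ok = preimage(gf)(C0) == preimage(f)(preimage(g)(C0))
```

In the contravariant case the preimage along `g` must be taken first and the preimage along `f` second, which is the reversal of direction described above.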
35 The French terms « équivalence duale » and « antiéquivalence » are literal translations of the English terms “dual equivalence” and “anti-equivalence”.
Let us now return to the situation presented earlier. The class of affine algebraic sets $\mathrm{Aff\text{-}Sets}_k$ and that of finitely generated $k$-algebras $\mathrm{Alg}^f_k$ are prime candidates for an equivalence of categories:

\[ \mathrm{Aff\text{-}Sets}_k \overset{?}{\simeq} \mathrm{Alg}^f_k \]

since, when $k$ is algebraically closed, there is a well-defined correspondence between the objects of the two classes. A category, however, is defined not only by a class of objects but above all by the type of morphisms considered between these objects, as explained above. In the case of classical algebraic varieties, those defined over a base field of characteristic zero ($k = \mathbb{R}$ or $\mathbb{C}$), several types of morphisms are worth studying, depending on the geometric properties one wishes to emphasize and the methods one wishes to use to solve problems (algebraic methods, methods of analytic geometry, that is, of real or complex analysis, etc.). Only one approach has proved common to, and workable for, all sets of coefficients: the one in which the variety carries the so-called Zariski topology (cf. Perrin 2008). Owing to the specific properties of this topology, only the so-called polynomial functions36 remain of interest as morphisms between varieties. Once one restricts to this type of morphism, $\Gamma$ provides a functor between $\mathrm{Aff\text{-}Sets}_k$ and $\mathrm{Alg}^f_k$, and even a dual equivalence of categories in the case where $k$ is algebraically closed. Indeed, for $\varphi : V \to W$ a morphism of varieties (that is, a polynomial function) and $f \in \Gamma(W)$, one can define $\varphi^*(f) := f \circ \varphi$, and $\varphi^*$ is a morphism of $k$-algebras, $\varphi^* : \Gamma(W) \to \Gamma(V)$.37 This equivalence of categories is not entirely satisfactory, however, as noted above, because on the one hand it concerns only a restricted part of the varieties (those that are irreducible) and on the other hand one has to restrict to certain sets of coefficients (algebraically closed fields) (cf. Cartier 2001, p. 398).
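To make the construction $\varphi \mapsto \varphi^*$ concrete, here is a small worked example. It is offered as an illustration under the assumption that $k$ is infinite (for instance algebraically closed), with $\Gamma(V)$ denoting the algebra of polynomial functions on $V$; the choice of the parabola and of its parametrization is made here and is not taken from the sources cited.

```latex
\[
\varphi : k \longrightarrow W := V(Y - X^2) \subseteq k^2, \qquad \varphi(t) := (t, t^2).
\]
\[
\varphi^* : \Gamma(W) = k[X,Y]/(Y - X^2) \longrightarrow \Gamma(k) = k[T], \qquad
\varphi^*(\bar X) = T, \quad \varphi^*(\bar Y) = T^2 .
\]
\[
\varphi^*(\bar f\, \bar g) = (f g) \circ \varphi = (f \circ \varphi)(g \circ \varphi)
= \varphi^*(\bar f)\, \varphi^*(\bar g),
\qquad
(\psi \circ \varphi)^* = \varphi^* \circ \psi^*
\ \text{ for any polynomial map } \psi : W \to Z .
\]
```

In this example $\varphi^*$ is an isomorphism of $k$-algebras, with inverse $T \mapsto \bar X$, which reflects the fact that the parabola and the line are isomorphic as varieties. The last identity displays the reversal of direction: the map $\varphi$ goes from the line to the parabola, while $\varphi^*$ goes from the algebra of the parabola to that of the line.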
36 Let $V \subseteq k^n$ and $W \subseteq k^m$ be two affine algebraic sets equipped with their Zariski topology, and let $\varphi : V \to W$ be a function that can be written in the form $\varphi = (\varphi_1, \ldots, \varphi_m)$, where $\varphi_i : V \to k$. $\varphi$ is called polynomial if its components $\varphi_i$ are polynomial in the usual sense (that is, $\varphi_i \in \Gamma(V)$).

37 One sees easily that the functor $\Gamma$ reverses the “direction” of morphisms: $\Gamma : (\varphi : V \to W) \mapsto (\varphi^* : \Gamma(W) \to \Gamma(V))$.

Still under the hypotheses of Hilbert’s theorem ($k$ algebraically closed), one fact is nevertheless remarkable: if $\varphi(x) = y$, then $(\varphi^*)^{-1}(\mathfrak{m}_x) = \mathfrak{m}_y$. That is to say, the set of points of a variety $V$ is identified via the functor $\Gamma$ with the set of maximal ideals of the algebra $\Gamma(V)$, which amounts to saying that one can conversely define an (irreducible) variety from any reduced finitely generated algebra $A$ as the set of its maximal ideals: $\Gamma^{-1}(A) := \{ \mathfrak{m} : \mathfrak{m} \subseteq A, \ \mathfrak{m} \text{ a maximal ideal} \}$.38 Let us explain how, from there, Grothendieck was able to proceed in order to generalize the notion of variety.39 Suppose we have found the right geometric definition corresponding to an arbitrary commutative ring, and consider two such objects $S$ and $T$ and a morphism $\varphi : S \to T$, $\varphi(x) = y \in T$. Via the supposed equivalence of categories, to $\varphi$ corresponds a ring morphism $\varphi^*$. For every point $x$ of $S$, one should still be able to associate with the maximal ideal $\mathfrak{m}_x$ its preimage under $\varphi^*$: $(\varphi^*)^{-1}(\mathfrak{m}_x)$. In general this will no longer be a maximal ideal, but it is always a prime ideal, because a maximal ideal is prime and the preimage of a prime ideal under a ring morphism is prime. Starting from this observation, Grothendieck’s crucial idea40 was therefore to consider the set of prime ideals of an arbitrary ring $A$ as a geometric space (denoted $\mathrm{Spec}(A)$), that is, to define an appropriate topology on this set, or again to consider these ideals as “points”, which explains why the idea is at first sight hardly intuitive. What geometric meaning could be given to these new “points”? The main difficulty remained that of giving a “geometric interpretation” to these new objects (cf. Cartier 2001, p. 398–400 and Deligne 1998, p. 12). Grothendieck then proved the equivalence of the categories so defined, that is, the (dual) equivalence between the category of affine schemes Aff-Schemes and that of commutative rings with unit CRing. In fact, this equivalence follows from the very definition of schemes, chosen so as to respect ring morphisms.
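The reason why maximal ideals have to give way to prime ideals can be seen on a standard example, sketched here for illustration: the inclusion of rings $\iota : \mathbb{Z} \to \mathbb{Q}$. The example is chosen here for convenience and is not taken from the sources cited.

```latex
\[
\iota : \mathbb{Z} \hookrightarrow \mathbb{Q}, \qquad
(0) \subset \mathbb{Q} \ \text{is maximal, since } \mathbb{Q} \text{ is a field}, \qquad
\iota^{-1}\bigl((0)\bigr) = (0) \subset \mathbb{Z}.
\]
\[
(0) \subsetneq (2) \subsetneq \mathbb{Z}
\quad\Longrightarrow\quad
(0) \ \text{is prime in } \mathbb{Z}
\ (\mathbb{Z} \text{ is an integral domain}) \ \text{but not maximal.}
\]
```

A space whose points are only the maximal ideals would therefore leave the image of the unique point of $\mathrm{Spec}(\mathbb{Q})$ without a place in the space attached to $\mathbb{Z}$, whereas the set of all prime ideals accommodates it.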
38 Since an equivalence of categories is not an isomorphism, the notation $\Gamma^{-1}$ is improper and abusive, because the functor $\Gamma$ in fact only has a quasi-inverse, but the underlying idea is basically the same.

39 In fact, various textual clues, for example in the correspondence between Grothendieck and Serre, corroborate this historical hypothesis, but this article is not the place to develop the point.

40 Görtz (2018) presents the same rational reconstruction in the form of a commented rereading of the EGA.

The language of category theory thus gives a more elaborate meaning to the correspondence between geometric objects and algebraic objects that had already been established by Grothendieck’s predecessors, by relating not only the objects to one another but also, in a coherent manner, the relations, that is, the morphisms between these objects. According to various commentators, the search for good properties from the point of view of category theory was even paramount and guided the process of generalization and abstraction leading to Grothendieck’s definition of the notion of (affine) scheme. This is at any rate what Deligne (Deligne 1998, p. 13) or Gelfand and Manin assert: Good categorical properties of [. . . ] functors [between algebra and geometry] (e.g. equivalence) are so important that to save them one is often forced to change old structures or to introduce new ones. This is how affine schemes, nuclear vector spaces [. . . ] and objects of derived categories appeared in Mathematics (Gelfand and Manin, quoted by Krömer 2007, p. 190).
One can therefore say that, through the use of category theory, Grothendieck both sharpened and generalized the correspondence between reduced affine algebras and irreducible varieties established by his predecessors. To paraphrase Pólya, one could thus suggest that the “(dual) equivalence of categories” turned out to be a way of clarifying the analogy between algebraic geometry and commutative algebra. The question that then arises is what type of analogy is at issue in the case of an equivalence of categories, and in particular in the case of a dual equivalence. Of the two approaches presented above, which is better able to describe this relation of correspondence between two domains by duality? In what sense is there still resemblance or similarity in these equivalences between distinct domains (in fact, between distinct categories) which a priori, at least for non-specialists, have little to do with one another?
5 Dual equivalence of categories and analogy

Let us first examine whether the axiomatic approach is the more relevant one for describing analogy between categories. After all, a category is a mathematical domain, and according to Schlimm an axiomatic description of the analogy relation is more relevant in that case, since a mathematical domain is object-rich. The analysis here depends on what one decides to call “object” and “relation” in the domains considered, here categories. If one applies Schlimm’s model literally, taking the objects of the category as elements of the domain and the morphisms between these objects as relations, the distinction between object-rich and relation-rich domains used by Schlimm loses its relevance, because the notion of cardinality of a category is in general undefined. One can of course restrict oneself to a theory of “size-limited” categories, that is, categories that are sets all contained in a “universe”, but to my knowledge it is not evident that this approach is logically necessary. There are in any case other ways of avoiding the paradoxes linked to the assumption that classes of “indefinite” size exist. In any event, the number of relevant relations (the morphisms between objects) is itself infinite, and the ratio of the number of objects to the number of relations is indeterminate. By comparison with the notion of group, used as an example by Schlimm, one can nevertheless regard a category rather as an object-rich domain. For this, one must take the morphisms as elements, with the composition of morphisms playing the role of a relation between them, just as the operation of addition (or multiplication) is the only relevant relation of a group. But then any two categories are analogous by definition: two categories related by an equivalence are no more analogous than any two others, and Schlimm’s analysis fails to capture the resemblance that exists between two equivalent categories. Schlimm’s axiomatic approach therefore appears ill-suited here. For all that, the axiomatic approach certainly plays a secondary role, whether upstream or downstream. Category theory is amenable to axiomatic treatment, just as are the theory of rings (or of algebras of certain types) and those of algebraic varieties or schemes. And historically, this is what happened in algebra (the work of Emmy Noether and van der Waerden, cf. Corry 1996) and in category theory (William Lawvere). All these domains develop quasi-independently nowadays, since a domain can be explored for its own sake, and the axiomatic method certainly plays a role there in determining which properties expressed by axioms are important. As we have seen, it was precisely the desire to establish a more general dictionary between algebraic structures and geometric structures that motivated the introduction of the notion of scheme, but only once the theory of rings had been developed in full generality. Applying the axiomatic approach to varieties in order to generalize them to schemes is probably conceivable a posteriori, but as noted above, the various protagonists were reluctant to accept that such a generalization of the notion of variety had a workable geometric meaning, or even any geometric meaning at all. The axiomatic approach was therefore not the direct driving force behind the invention of the notion of scheme. It was indeed the exploitation and generalization of the correspondence between algebraic objects and geometric objects established by Grothendieck’s predecessors that played the predominant role.
Let us note here that even once the theory of schemes has been axiomatized, it would in any case remain to determine which axioms are preserved by the equivalence of categories and which are not. At least in the case of a dual equivalence, this question is not trivial, since the properties, laws and relations that correspond to one another are not identical; rather, through the equivalence functor they find counterparts that are, so to speak, inverse or “symmetric”, because of the reversal of the direction of morphisms. Let us give a first simple example. We have seen that the inclusion relation between varieties and ideals is reversed by $V$ and $I$; consequently, to the ascending chain condition in finitely generated algebras (and more generally in Noetherian rings) corresponds a descending chain condition in algebraic varieties. These two conditions cannot be given one and the same axiomatic expression. Even if they resemble one another in other respects and can perhaps be made the object of an abstraction (both are conditions of “finiteness” of chains), they obviously do not have the same semantic content. If they did, every Noetherian ring would automatically be Artinian, which is manifestly false.
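The asymmetry between the two chain conditions can be seen on a standard example, sketched here for illustration; the choice of the ring $\mathbb{Z}$ and of the ideals $(2^n)$ is made here and does not come from the sources cited.

```latex
\[
(2) \supsetneq (4) \supsetneq (8) \supsetneq \cdots \supsetneq (2^n) \supsetneq \cdots
\qquad \text{(a strictly descending chain of ideals of } \mathbb{Z}
\text{ that never stabilizes)}
\]
\[
(a_1) \subseteq (a_2) \subseteq \cdots
\ \Longrightarrow\ a_{n+1} \mid a_n \ \text{for all } n,
\ \text{so the chain stabilizes (a nonzero integer has finitely many divisors).}
\]
\[
I_1 \subseteq I_2 \subseteq \cdots \ \text{in } k[X_1, \ldots, X_n]
\quad\Longrightarrow\quad
V(I_1) \supseteq V(I_2) \supseteq \cdots \ \text{in } k^n .
\]
```

The ring $\mathbb{Z}$ thus satisfies the ascending condition without satisfying the descending one, and the last line shows how the ascending condition on ideals is transported by $V$ into a descending condition on the geometric side.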
For the categories given here as examples (Aff-Schemes, CRing, etc.), other categorical properties lend themselves to this game of translation or transposition: to the existence of an initial object corresponds the existence of a terminal object, to that of a product of two objects that of a coproduct, etc. The properties are not identical but dual to one another, and correspond to different axioms. In the case of the equivalence between commutative algebra and algebraic geometry this goes, so to speak, unnoticed, because the categories considered possess both certain properties and their “symmetric” properties. The notions of product and coproduct, for example, make sense in both categories, which therefore both satisfy the same existence axioms for these notions.41 Let us now examine whether the SMT approach better describes the analogy between commutative algebra and the theory of affine schemes. At first sight, since an equivalence of categories establishes a correspondence between the objects of two domains (here categories), analysing this relation in terms of structure mapping (SMT) seems quite relevant. The fact that category theory lays its main emphasis on morphisms, and hence in this sense on the “relations” between the objects of a category, speaks in favour of such a characterization. There are, however, at least two adjustments to make. First, as we have seen, the correspondence established between two equivalent categories is not “one-to-one”, or more exactly it is so only up to isomorphism. This is not a major obstacle: even though it only takes into consideration cases where the “projection” is a kind of function and gives more weight to one-to-one correspondences, SMT theory leaves room for correspondences that are neither isomorphisms nor embeddings.
When the number of objects in the source domain exceeds that of the target domain, but some objects are relationally equivalent, one can restrict attention to homomorphisms, that is, to a many-to-one correspondence (cf. Schlimm 2011, pp. 181–182; Gentner/Markman 2005, p. 2). In the case of an equivalence of categories, the situation can be reversed: the number of objects in the target domain may exceed that of the source domain. An object is then associated with a “class” of target objects. The notion of “functor” can therefore serve as a more general mathematical characterization than that of “function.”
Second, the model must be extended to the particular type of equivalence of categories that a “duality” is.42 If one takes the “relations” to be preserved to be the morphisms between objects, then the associated properties and relations between the two categories, i.e. the morphisms, go in opposite directions, so the equivalence functor does not respect the order of the arguments. Indeed, to a morphism from C to D is associated a morphism from F(D) to F(C). In the example of algebraic varieties, if we have two reduced finitely generated algebras A and B such that A ⊂ B, then the corresponding varieties, which by abuse of notation we write Spec(A) and Spec(B), stand in a reversed inclusion relation (Spec(B) ⊂ Spec(A)). Formally, one can of course solve this problem by redefining the relations, but this is an ad hoc solution, and an unsatisfactory one for semantic and pragmatic reasons. To every category C there does correspond a dual category C^op, whose objects are the same as those of C and whose morphisms go in the opposite direction. One can then formally rewrite the correspondence established by a dual equivalence of categories using a covariant functor instead of a contravariant one. The problem is that the morphisms of C^op lose their concrete meaning as morphisms between structures, on which mathematicians rely in practice when it comes to interpreting the meaning of the terms. Doing without that meaning would amount either to literally not knowing what one is talking about (or rather to abstracting from it), or to referring explicitly, by a needless detour, to the original morphisms of C. One may therefore just as well introduce a particular type of “mapping” into SMT, modeled on contravariant functors. Thus modified and adapted, the SMT model does apply to the type of relation expressed by a (dual) equivalence of categories. In fact, the equivalence of categories serves as a general paradigm for an extension of the SMT model, whose dynamic aspect it shares.
41 In fact, the search for “good” axioms to characterize the categories in which the notions of product and coproduct coexist and coincide led to another form of abstraction and to the invention of the notion of abelian category, introduced to unify the general theory of derived functors and that of cohomology with coefficients in a sheaf (cf. Grothendieck 1957, p. 119; Krömer 2007, chap. 3).
42 To my knowledge, SMT does not analyze any example of this kind.
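The reversal of arrows described in this passage can be stated compactly. This is a sketch in standard notation that is assumed here rather than taken from the text: $f$ and $g$ are homomorphisms of commutative rings, $\mathfrak{q}$ is a prime ideal, and $\mathbf{CRing}$ and $\mathbf{AffSch}$ are ad hoc names for the two categories.

```latex
\[
f : A \to B
\quad\rightsquigarrow\quad
\operatorname{Spec}(f) : \operatorname{Spec}(B) \to \operatorname{Spec}(A),
\qquad \mathfrak{q} \mapsto f^{-1}(\mathfrak{q}),
\]
\[
A \xrightarrow{\ f\ } B \xrightarrow{\ g\ } C
\quad\Longrightarrow\quad
\operatorname{Spec}(g \circ f) = \operatorname{Spec}(f) \circ \operatorname{Spec}(g).
\]
% Covariant rewriting through the opposite category:
\[
\operatorname{Spec} : \mathbf{CRing}^{\mathrm{op}} \to \mathbf{AffSch}.
\]
```

The second line shows the order of composition being reversed, which is the sense in which the functor does not respect the order of the arguments. The third line is the formal rewriting through the dual category that the text regards as ad hoc.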
In the case of the invention of the notion of scheme, establishing an equivalence of categories indeed led to the introduction of new concepts, but also to categorizing objects differently by redrawing the boundaries of the concepts employed, such as that of “point.” In this sense, the equivalence of categories functions as a full-fledged analogy relation: it systematically brings out the structural coherence between two domains, without exhausting the richness of either of the domains it connects. However, none of this prejudges the depth of the relation established by an equivalence of categories, still less that it constitutes the only pertinent type of analogy relation between two mathematical domains. It is rather a limiting case of analogy, which perhaps does not correspond to the sense commonly attributed to the term. An equivalence defines an identity, and not a mere relation of similarity, between two domains. Yet, even if it is only partial, this identity ends up completely erasing the differences between the domains considered, to the detriment of other aspects. Two geometric spaces may, for example, be isomorphic as algebraic varieties and yet have other geometric properties that differ and are interesting from another point of view,43 properties liable to give rise to other analogies, conceptual innovations, and re-categorizations.
6 Conclusion
Let us recapitulate. First, we have shown the value of explaining the development of algebraic geometry under the impetus of Grothendieck (among others) by means of the notion of analogy, situating such an approach within the broader context of studies in epistemology. The rational reconstruction of the invention of the notion of scheme given here provides enough evidence to suppose that the conceptual innovations of Grothendieck and his collaborators in algebraic geometry can be analyzed as a process of exploiting analogies between different domains. We have also shown that the correspondence established between (reduced) algebras and varieties on the one hand, and commutative rings and (affine) schemes on the other, which is expressed technically by a dual equivalence of categories, differs significantly enough from the current models of analogy that it cannot be analyzed adequately by them, at least as they stand, nor indeed exclusively by them. This would at the very least justify a revision of the SMT approach, the most pertinent approach in the case analyzed, so that it allows mappings that reverse the order of the arguments of a relation (as in the case of an antiequivalence), short of considering an antiequivalence as a paradigmatic example of a third type of analogy relation. In doing so, the structural approach and the axiomatic approach to analogy provide different criteria that historians can use to trace the evolution of mathematical concepts. In the case of Grothendieck, these criteria still need to be corroborated and nuanced by a more thorough analysis of the historical sources, in order to determine which aspects were the most important and at which levels. Made explicit in this way, and imperfect as they are, the two models of analogy presented should be regarded rather as complementary, since Grothendieck seems to have combined the axiomatic method, abstraction, and the search for “structural” correspondences in the sense of an equivalence of categories in order to achieve these aims.
43 The equivalence of categories between algebraic varieties and reduced algebras, or that between affine schemes and rings, captures a very specific part of the geometric properties of certain objects. From the point of view of modern real algebraic geometry, the circle $V(X^2 + Y^2 - 1)$ and the ellipse $V(X^2 + 2Y^2 - 1)$ are isomorphic, and hence quasi-identical, and no longer merely analogous to one another. These two objects nevertheless differ in terms of curvature (the point of view of real differential geometry), if one takes as metric the distance induced by the inclusion in the real plane ($\mathbb{R}^2$). It is precisely the wish to do without the relativity imposed by such an inclusion in a Euclidean space that makes interesting the modern definition of a scheme, and hence of a variety, as a ringed space, that is, to simplify, as a space equipped with a topology and a “good” set of “functions” defined on this space and compatible with this topology. As a ringed space, the line with a point removed, $\mathbb{R} \setminus \{0\}$, is isomorphic to the real hyperbola defined by $XY - 1 = 0$. This last example perhaps shows more clearly the limits (some would say the power) of the analogy presented here.
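The two isomorphisms mentioned in footnote 43 can be written down explicitly. The maps below are one possible choice, given for illustration only; the names $E$, $C$, $H$, $\varphi$, and $\psi$ are assumptions introduced here and do not appear in the text.

```latex
% Ellipse E = V(X^2 + 2Y^2 - 1) and circle C = V(X^2 + Y^2 - 1) in the real plane:
\[
\varphi : E \to C,\quad (x, y) \mapsto (x, \sqrt{2}\,y),
\qquad
\varphi^{-1} : C \to E,\quad (x, y) \mapsto (x, y/\sqrt{2}).
\]
% Check: if x^2 + 2y^2 = 1, then x^2 + (\sqrt{2}\,y)^2 = 1.
% Punctured line and hyperbola H = V(XY - 1):
\[
\psi : \mathbb{R}\setminus\{0\} \to H,\quad t \mapsto (t, 1/t),
\qquad
\psi^{-1} : H \to \mathbb{R}\setminus\{0\},\quad (x, y) \mapsto x.
\]
```

In both cases the maps and their inverses are given by functions that are regular on the spaces in question, which is what makes the objects indistinguishable from this point of view, whatever their differences in curvature or shape.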
Bibliography
Arbogast, M./Condon, S. (2017): La rédaction non-sexiste et inclusive dans la recherche : enjeux et modalités pratiques, (Documents de travail de l’Ined, 231), Paris: INED, http://hdl.handle.net/20.500.12204/AWRHybP-gpz89Adag4Yn
Atiyah, M. (1976): Global Geometry, (Bakerian Lecture 1975), Proceedings of the Royal Society of London A 347: 291–299, https://doi.org/10.1098/rspa.1976.0001
Atiyah, M. (2007): Duality in Mathematics and Physics, (Lecture notes from the Institut de Matematica de la Universitat de Barcelona), https://fme.upc.edu/ca/arxius/butlleti-digital/riemann/071218_conferencia_atiyah-d_article.pdf
Bartha, P. (2016): Analogy and Analogical Reasoning, in: Zalta, E.N. (ed.): The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), https://plato.stanford.edu/archives/win2016/entries/reasoning-analogy/
Brown, R./Porter, T. (2006): Category Theory: an abstract setting for analogy and comparison, in: Sica, G. (ed.): What is Category Theory? Advanced Studies in Mathematics and Logic, Monza: Polimetrica Publisher, 257–274
Cartier, P. (2001): A mad day’s work: from Grothendieck to Connes and Kontsevich. The evolution of concepts of space and symmetry, Bull. Am. Math. Soc. 38(4), 389–408, http://www.ams.org/journals/bull/2001-38-04/S0273-0979-01-00913-2.
Corfield, D. (2003): Towards a philosophy of real mathematics, Cambridge: Cambridge Univ. Press
Corfield, D. (2017): Duality as a category-theoretic concept, Studies in History and Philosophy of Modern Physics 59, 55–61, https://doi.org/10.1016/j.shpsb.2015.07.004
Corry, L. (1996): Modern algebra and the rise of mathematical structures, (Science networks, Historical studies (17)), Basel: Birkhäuser
Deligne, P. (1998): Quelques idées maîtresses de l’œuvre de A. Grothendieck, in: Audin, M. (ed), Matériaux pour l’histoire des mathématiques au XXe siècle, Actes du colloque à la mémoire de Jean Dieudonné (Nice, 1996), (Séminaires et congrès), Marseille: SMF, 11–19
Dieudonné, J. (1985): History of algebraic geometry: an outline of the history and development of algebraic geometry (The Wadsworth mathematics series), Monterey, CA: Wadsworth
Dolgachev, I. (1974): Abstract algebraic geometry, Journal of Soviet Mathematics 2(3), 264–303
Durand-Richard, M.-J. (ed.) (2008): L’analogie dans la démarche scientifique, Paris: L’Harmattan
Epple, M. (2016): “Analogien”, “Interpretationen”, “Bilder”, “Systeme” und “Modelle”: Bemerkungen zur Geschichte abstrakter Repräsentationen in den Naturwissenschaften seit dem 19. Jahrhundert, Forum Interdisziplinäre Begriffsgeschichte, Berlin: Zentrum für Literatur- und Kulturforschung Berlin (ZfL), http://www.zfl-berlin.org/tl_files/zfl/downloads/publikationen/forum_begriffsgeschichte/ZfL_FIB_5_2016_1_Epple.pdf
Gentner, D. (1983): Structure-mapping: A theoretical framework for analogy, Cognitive Science 7, 155–170
Gentner, D./Markman, A.B. (1997): Structure mapping in analogy and similarity, American Psychologist 52, 45–56
Gentner, D./Markman, A.B. (2005): Defining structural similarity, The Journal of Cognitive Science 6, 1–20
Gentner, D./Holyoak, K.J./Kokinov, B. (eds.) (2001): The Analogical Mind: Perspectives from Cognitive Science, Cambridge, MA: MIT Press
Goldstone, R./Son, J.Y. (2012): Similarity, in: Holyoak, K.J./Morrison, R.G. (eds.): The Oxford handbook of thinking and reasoning, New York: Oxford Univ. Press, 155–176
Grady, J.E. (2010): Metaphor, in: Geeraerts, D./Cuyckens, H. (eds): The Oxford Handbook of Cognitive Linguistics, https://doi.org/10.1093/oxfordhb/9780199738632.013.0008
Grosholz, E./Breger, H. (eds.) (2000): The Growth of Mathematical Knowledge, Dordrecht: Kluwer Academic Publishers
Grothendieck, A. (1957): Sur quelques points d’algèbre homologique, Tôhoku Mathematical Journal 2(9), 119–221, https://projecteuclid.org/euclid.tmj/1178244839
Grothendieck, A. (1958): The Cohomology Theory of Abstract Algebraic Varieties, in: Todd, J.A. (ed.): Proceedings of the International Congress of Mathematicians, 14–21 August 1958, Edinburgh, Cambridge: CUP, 103–118, http://www.mathunion.org/ICM/ICM1958/Main/icm1958.0103.0118.ocr.pdf
Grothendieck, A./Dieudonné, J. (eds) (1960): Eléments de géométrie algébrique I. Le langage des schémas, Publications mathématiques de l’IHÉS 4, 5–228, http://www.numdam.org/item?id=PMIHES_1960__4__5_0
Görtz, U. (2018): Classics Revisited: Éléments de Géométrie Algébrique, Jahresber Dtsch Math-Ver 120, 235–290, https://doi.org/10.1365/s13291-018-0181-1
Hahn, U. (2003): Similarity, in: Nadel, L. (ed.): Encyclopedia of Cognitive Science, London: Macmillan, 386–388, https://doi.org/10.1002/0470018860.s00616
Hallyn, Fernand (ed.) (2000): Metaphor and Analogy in the Sciences, Dordrecht: Kluwer Academic Publishers
Hesse, M. (1966): Models and Analogies in Science. Notre Dame, IN: Univ. Notre Dame Press
Hentschel, K. (2010a): Analogien in Naturwissenschaften, Medizin und Technik, (Acta Historica Leopoldina 56), Stuttgart: Wiss. Verl.-Ges.
Hentschel, K. (2010b): Die Funktion von Analogien in den Naturwissenschaften, auch in Abgrenzung zu Metaphern und Modellen, in: Hentschel (2010a), 13–66
Hofstadter, D. (2001): Analogy as the Core of Cognition, in: Gentner/Holyoak/Kokinov (2001), 499–538
Hofstadter, D./Sanders, E. (2013): Surfaces and Essences. Analogy as the fuel and fire of thinking, New York: Basic Books
Holyoak, K.J. (2012): Analogy and relational reasoning, in: Holyoak, K.J./Morrison, R.G. (eds.), The Oxford handbook of thinking and reasoning, New York: Oxford Univ. Press, 234–259
Itkonen, E. (2005): Analogy as structure and process. Approaches in linguistics, cognitive psychology and philosophy of science, (Human Cognitive Processing, 14), Amsterdam: John Benjamins
Knobloch, E. (1989): Analogie und Mathematisches Denken, Berichte zur Wissenschaftsgeschichte 12, 35–47
Krömer, R. (2007): Tool and object. A history and philosophy of Category theory, (Science Networks, Historical Studies 32), Basel: Birkhäuser
Krömer, R./Corfield, D. (2014): The Form and function of duality in modern mathematics, in: Schroeder-Heister, P./Heinzmann, G./Hodges, W./Bour, P.E. (eds): Logic and Philosophy of Science in Nancy (I). Selected contributed papers from the 14th International Congress of Logic, Methodology and Philosophy of Science, Philosophia Scientiæ 18(3), 95–109
Lakoff, G./Nuñez, R.E. (2000): Where mathematics comes from. How the embodied mind brings mathematics into being, New York: Basic Books
Marghetis, T./Nuñez, R. (2013): The motion behind the symbols: a vital role for dynamism in the conceptualization of limits and continuity in expert mathematics, Topics in Cognitive Science 5(2), 299–316
Marquis, J.-P. (2014): Mathematical Abstraction, Conceptual Variation and Identity, in: Schroeder-Heister, P./Heinzmann, G./Hodges, W./Bour, P.E. (eds): Logic and Philosophy of Science in Nancy (I). Selected contributed papers from the 14th International Congress of Logic, Methodology and Philosophy of Science, Philosophia Scientiæ 18(3), 1–24
Marquis, J.-P. (2016): Stairways to Heaven. The Abstract Method and Levels of Abstraction in Mathematics, The Mathematical Intelligencer 38(3), 41–51
McLarty, C. (2007): The Rising Sea: Grothendieck on simplicity and generality, in: Gray, J./Hunger Parshall, K. (ed.): Episodes in the history of modern algebra (1800–1950), (History of mathematics, 32), Providence, RI: American Mathematical Society, 301–322
Perrin, D. (2008): Algebraic Geometry. An Introduction, London: Springer-Verlag
Poincaré, H. (1908): Science et méthode, Paris: Flammarion
Schappacher, N. (2007): A Historical Sketch of B.L. Van der Waerden’s Work on Algebraic Geometry 1926–1946, in: Gray, J./Hunger Parshall, K. (ed.): Episodes in the history of modern algebra (1800–1950), (History of mathematics, 32), Providence, RI: American Mathematical Society, 245–278
Schlimm, D. (2008): Two ways of analogy: Extending the study of analogies to mathematical domains, Philosophy of Science 75(2), 178–200
Schlimm, D. (2011): On the creative role of axiomatics. The discovery of lattices by Schröder, Dedekind, Birkhoff, and others, Synthese 183, 47–68
Vakil, R. (2017): The Rising Sea: Foundations Of Algebraic Geometry Notes, http://math.stanford.edu/~vakil/216blog/FOAGnov1817public.pdf
Mathematical Modelling and Teleology in Biology
José Antonio Pérez-Escobar
Abstract
Mathematical modelling is a group of techniques that have been making their way into diverse biological fields. The incipient roles of these techniques in biology are transforming the scientific practice, and it is believed that the mathematization of biology is progressively putting it in line with the standards of rigor of the physical sciences. While the first statement is true, the second does not necessarily follow from it. In this paper, I will challenge the idea that mathematics brings biology closer to the standards of physics by showing how teleological notions, common in biology but not in today’s physics, coexist and interact with modelling techniques in a very idiosyncratic scientific practice. To this end, I will explore modelling techniques of the so-called brain’s internal compass, a component of the “brain GPS system,” in computational neuroscience.
1 Introduction
Teleology (telos: end, goal, purpose; logos: reason, explanation) is an explanatory strategy that appeals to the purpose of the object of study rather than its mechanical causes. Biology has traditionally incorporated not only mechanical explanations, but also teleological explanations. Yet, even modern biology, far away from vitalism (the metaphysical consideration that living beings are driven to purposes by an inner vital force) and intelligent design (teleology as the extension of God’s intentions), still includes teleological notions in its explanations either as metaphysical propositions or at least as a heuristic strategy, acting “as if” biological phenomena were subjected to design or had purposes (Ratzsch 2010). It is because of these nonmechanical components in the explanations of biology that it has been proposed to be irreducible to strictly mechanistic sciences such as physics (Ayala 1968, 1999).
J. A. Pérez-Escobar, ETH Zurich, Zurich, Switzerland
© Springer Nature Switzerland AG 2020. M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_4
It has been argued that the teleological component of biological explanations cannot be eliminated without loss of information and explanatory power (Ayala 1999). Therefore, it is not justified to do without it in order to render biology a strictly mechanical science. However, this has not deterred reductionist efforts. Yourgrau and Mandelstam (1955) claim that teleology is reflected in natural language, not in mathematical formulas. Indeed, formulas can describe the motion of a rock, but not its purpose. A popular idea among scientists and philosophers is that the more mathematical a science is, the more mature and rigorous it is (Storer 1967). Enquist and Stark (2007) fully endorse the development of a “quantitative, mechanistic and predictive biology” so that it becomes a “capital-S Science.” And indeed, biology has received mathematical methods with open arms in the last few decades.
In this paper, I argue that the inclusion of mathematical methods in biology does not render it free from teleology. On the contrary, mathematical modelling interacts with teleological notions in scientific practice and may even assist in anchoring teleological notions to physical phenomena. This, in turn, calls into question the role of mathematics as a central pillar of a project for the unification of the sciences. I will first offer a short overview of the so-called brain’s inner compass and its involvement in spatial computation and cognition. After that, I will discuss the research program around it and the roles of biophysical modelling, mathematical modelling, and simulations, dedicating a section to each one. I will present the sections in that order, establishing a canonicity between them, and discussing how teleological notions are present at all points and lead the research process. Finally, I will discuss how the harmonious coexistence of different modalities of representation in scientific practice may account for the preservation of teleological content in the later stages of the research program, its unproblematic conjunction with mechanical content, and the success of this hybrid strategy.
2 The Brain’s “Inner Compass”
The so-called inner compass is a key component of the “GPS system” of the brain, a system that has attracted massive attention from neuroscientists in the last few decades. The inner compass is composed of cells that encode the angular direction that the organism faces. These cells, called “head-direction cells,” present a very characteristic pattern of activity: each of these cells has a “preferred direction,” so that when the organism faces that direction, the activity of the cell reaches its peak firing rate. The cell still responds when the angular distance between the direction faced by the organism and the cell’s preferred direction is no greater than 45°. Beyond an angular distance of 45°, the activity of the cell diffuses and becomes sparse. Moreover, the tuning of head-direction cells typically fits a Gaussian distribution over their ∼90° response field (Fig. 1). The variability that head-direction cells (even samples of “representative” cells) express in this regard is illustrated in Fig. 2.
Fig. 1 Parameters of the directional tuning function. (a) The tuning curve of a head-direction cell represents the cell’s firing rate (Y-axis) as a function of a rat’s directional heading in a horizontal plane (X-axis). The directional heading is plotted on a scale of 0–360. (b) To compute the parameters of the directional tuning function, a Gaussian function is fitted to the curve in (a). The mean of the Gaussian gives the cell’s preferred firing direction, D; the standard deviation of the Gaussian is equal to half of the cell’s directional tuning width, W; the peak height of the Gaussian gives the cell’s peak directional firing rate, P; the baseline of the Gaussian gives the cell’s background firing rate, B. Taken from Blair et al. (1997)
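The four parameters named in the caption of Fig. 1 are enough to write down an idealised tuning curve. The following is a minimal sketch, not the fitting procedure of Blair et al. (1997): the function names are hypothetical, the example values are invented, and it is assumed that the Gaussian bump has height P − B above the baseline, so that the rate at the preferred direction equals P.

```python
import math


def circular_difference(angle_deg, reference_deg):
    """Signed angular difference in degrees, wrapped into [-180, 180)."""
    return (angle_deg - reference_deg + 180.0) % 360.0 - 180.0


def tuning_rate(heading_deg, D, W, P, B):
    """Idealised firing rate of a head-direction cell at a given heading.

    D: preferred direction (mean of the Gaussian), in degrees
    W: directional tuning width; the Gaussian's standard deviation is W / 2
    P: peak firing rate, reached when heading_deg == D
    B: background (baseline) firing rate
    Assumption: the Gaussian bump has height P - B on top of the baseline B.
    """
    sigma = W / 2.0
    delta = circular_difference(heading_deg, D)
    return B + (P - B) * math.exp(-(delta ** 2) / (2.0 * sigma ** 2))


# Hypothetical cell: prefers 90 degrees, 90-degree tuning width,
# 40 Hz peak rate, 1 Hz background rate.
for heading in (0, 45, 90, 135, 270):
    print(heading, round(tuning_rate(heading, D=90.0, W=90.0, P=40.0, B=1.0), 2))
```

Wrapping the angular difference keeps the curve continuous across the 0/360 boundary, so a cell preferring 10° responds identically at 350° and at 30°.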
Fig. 2 Firing rate as a function of head direction for 3 representative cells from 3 different animals. Each plot is based on 8 min of recording, and head direction was analyzed with a 6° bin width. Note that the preferred direction and peak firing rate are different for each cell. (a) low-peak firing rate cell. (b) medium-peak firing rate cell. (c) high-peak firing rate cell. Taken from Taube et al. (1990a)
In spite of such variability, there is a well-defined concept of the “ideal” head-direction cell against which all empirical observations are measured. But where does this concept come from? What exactly is a head-direction cell, then? The discovery/creation dichotomy of objects of study is very controversial. Here, several cells with similar electrophysiological characteristics are considered to belong to a category, namely “head-direction cell,” represented by an object with ideal characteristics. Such an object, of course, is fictitious, but it is appealed to in order to classify neurons as “head-direction cell” or “not a head-direction cell.” This is a relevant consideration in all forms of knowledge, but it is especially important in electrophysiological studies, for two reasons. First, because the recording of electrophysiological activity is a very indirect cell observation method and classification procedures vary depending on the criteria of researchers and the goals of studies. Normally, in order to be considered a head-direction cell, a given electrophysiological unit has to come “clean enough” out of the measuring procedure chosen, and provided that it does, it then has to meet more or less conservative criteria determining whether the activity of the unit resembles well enough that expected of an ideal head-direction cell. Second, because the construction of objects of study in biology often involves a second idealization in the form of a teleological judgment: a biological object is not just an ideal exemplar, but an ideal exemplar that serves an ideal purpose. In this sense, the “creation” of the biological object precedes actual observations, which operate under a lens of physical and teleological idealizations, and conditions further research.
Upon their “discovery” in 1990 (Taube et al. 1990a, b) and a previous short report in 1984 (Ranck 1984), the phenomenology of the electrophysiological characteristics of these cells and its correlation with the organism’s facing direction led to the consideration that they provide a sense of direction to the organism.1 Such a sense of direction would be a key element for spatial navigation, a critical ability of organisms for environmental adaptation. The early assignation of a role, function, or purpose to a biological object based on phenomenological characteristics and correlations is a common practice in the biological sciences, which guides and constrains critical aspects of the research process (for instance, what to look for and how to interpret whatever is found). Just a year after the discovery of head-direction cells, McNaughton et al. (1991) considered a spatial navigation problem that animals typically encounter, and proposed different computational approaches that may lead to its resolution.
The “geometrical solution,” although able to solve the spatial navigation problem, was promptly discarded in favor of the “compass solution,” among other reasons because of the latter’s economy of storage: “it is the economy of storage that is one primary argument in favor of the compass solution, assuming such a mechanism is available” (McNaughton et al. 1991). Another reason why the “compass solution” was preferred was the existence of a candidate cell type which could be responsible for the computation. The mechanism underlying compass computation would, of course, be based on the head-direction cells (the neurobiological substrate for a sense of direction) discovered just one year before. Here begins the teleologically guided research process, where purpose precedes mechanism,2 and where one finds explicit references to and inspiration from a deliberately designed artifact with a conferred purpose (a compass).
1 In the neuroscience of cognition, the ascription of teleological content to the biological object is less straightforward than in other biological areas due to the abstract character of information processing and cognition, and therefore the process relies even more heavily on intuition. Usually, the teleological judgment is based on observations of physiological activity at the single-cell or network level, and on the behavior of the organism.
3 Biophysical Modelling
In theoretical neuroscience, models usually have two aspects: a biophysical structure and a logico-mathematical representation. While the former represents the physical properties of the modelled system, the latter represents its abstract properties (such as information processing, Hebbian learning rules, or synaptic weights). However, as I will show in an upcoming example, biophysical models may sacrifice physical likelihood in order to achieve a compromise between the representation of mechanical properties and accepted teleological notions. Skaggs et al. (1995) put forward an influential biophysical model of the head-direction system based on the considerations of McNaughton et al. (Fig. 3). First, they arrange head-direction cells in a compass fashion as an illustration of their purpose (encoding facing direction), in such a way that the position of a given cell in the ring matches its preferred angular direction. Second, if head-direction cells are performing spatial computations relative to angular direction, then these cells likely need information inputs from the visual and vestibular systems. The biophysical model in Skaggs et al. does just that, integrating potential mechanisms of visual and vestibular inputs into the ring attractor arrangement of head-direction cells. Note how the neuron at the top, the one whose preferred angular direction is being faced by the organism, is in turn exciting neighboring neurons, thus accounting for the observed activity of head-direction cells (responding at up to a 45° angular distance from their preferred direction). This is a mechanism proposed for their electrophysiological characteristics. However, visual and vestibular synaptic inputs, as well as clockwise and anti-clockwise rotation cells, are mechanisms proposed not only for their observed electrophysiological characteristics, but also for their assumed purpose: if that purpose were different, the proposed physical realization of the system could be very different. In addition, the ring attractor arrangement is also a compromise between the particular teleological notions with which the scientists work and the non-exhaustive physical characteristics known about the system. The model adapts to the physical and teleological characteristics of the cells via a teleomechanical compromise: both the teleological notions and the mechanical information available constrain the possibilities of the model.
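The ring arrangement described above, in which a cell's position matches its preferred direction and an active cell excites its neighbours, can be sketched in a few lines. This is an illustration under stated assumptions and not the model of Skaggs et al. (1995): the Gaussian fall-off of excitation with angular distance, the 20° width, the number of cells, and all function names are hypothetical choices made here.

```python
import math


def angular_distance(a_deg, b_deg):
    """Shortest distance around the ring between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def ring_weights(n_cells, sigma_deg):
    """Excitatory weights between cells placed evenly on a ring.

    Cell i is assigned the preferred direction i * 360 / n_cells, so its
    position on the ring matches its preferred direction. The Gaussian
    fall-off with angular distance is an illustrative assumption.
    weights[i][j] is the excitation that cell i receives from cell j.
    """
    preferred = [i * 360.0 / n_cells for i in range(n_cells)]
    weights = [
        [math.exp(-angular_distance(p_i, p_j) ** 2 / (2.0 * sigma_deg ** 2))
         for p_j in preferred]
        for p_i in preferred
    ]
    return preferred, weights


preferred, weights = ring_weights(n_cells=36, sigma_deg=20.0)
# Excitation sent by the cell preferring 0 degrees to a few other cells:
# its two immediate neighbours, a cell 40 degrees away, and the opposite cell.
for j in (1, 35, 4, 18):
    print(preferred[j], round(weights[j][0], 3))
```

Because distance is measured around the ring, the cells preferring 10° and 350° receive the same excitation from the cell preferring 0°, while the cell on the opposite side of the ring receives almost none.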
2 This is not to say that the scientist explicitly commits to the metaphysical stance that the physical realization of the system is directed by purposiveness (although this may implicitly be the case), but that teleological intuitions in biological research guide the research process, including what is simplistically referred to as “to look for the mechanism.” The “mechanical commitment” of the neurosciences described by Kaplan (2011) thus depicts only part of the picture.
Fig. 3 Taken from Skaggs et al. (1995)
Fig. 4 Taken from Stringer et al. (2002)
4 Mathematical Modelling
Inspired by the model proposed by Skaggs et al., Stringer et al. (2002) developed a mathematical model of the head-direction system (Fig. 4). The model is as follows. The left-hand side of the equation represents the continuous activity of head-direction cell i. On the right-hand side of the equation, the first component is a decay term, the second describes the effects of the recurrent connections in the network,3 the third stands for visual input to cell i, and the fourth represents connections conveying idiothetic information (vestibular and proprioceptive information derived from motion that provides a sense of rotation) that accounts for rotations of the head-direction signal.4 When the visual input amounts to 0, for example in darkness, the idiothetic input can still account for the activation of the right head-direction cells when the organism changes its facing direction.
3 $\phi_0/C^{HD}$ stands for the overall strength of the recurrent inputs, where $C^{HD}$ is the number of inputs to one head-direction cell from other head-direction cells and $\phi_0$ is a constant; $w_{ij}^{RC}$ represents the excitatory synaptic weight from a given head-direction cell j to head-direction cell i; $w^{INH}$ is a constant that accounts for a global inhibitory effect of interneurons; and $r_j^{HD}$ is the firing rate of head-direction cell j.
This model yields several general predictions. However, due to the limitations of the techniques available back then (mostly based on electrophysiological recordings and histological examination) and even today (after adding techniques like optogenetics and advances in viral neuronal tracing and calcium imaging), an exhaustive quantitative and mechanical assessment of the model is unfeasible. What the mathematical model allows for, unlike the biophysical model, is to perform simulations, which can in fact be assessed quantitatively. Biophysical simulations cannot be performed due to technical limitations (they would require the synthesis of an artificial brain system). Mathematical models, on the other hand, provide a convenient solution by discarding the material aspect and preserving abstract relational structures of the systems. First, they can be used to perform quantitative simulations, although they cannot be assessed in terms of physical structure (not to mention the multiple realizability argument for computations). Second, such simulations can be contrasted quantitatively against the phenomenology of the original system (provided that an account of quantification of that phenomenology exists, as in the case of head-direction cell tuning). In this sense, the physical realization of the system takes a step back in importance. The biophysical model is an iconic representation: the items and structure it depicts are intended to bear physical resemblance to the system it models.
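For reference, the verbal description of the four right-hand terms given above can be assembled into a single differential equation. What follows is a reconstruction from that description and from the symbols defined in the footnotes, not a transcription of Fig. 4: the time constant $\tau$, the activation $h_i^{HD}$, the visual input $I_i^{V}$, and the normalising factor $\phi_1/C^{HD \times ROT}$ of the idiothetic term are notational assumptions introduced here, and the form printed in Stringer et al. (2002) should be taken as authoritative.

```latex
\[
\tau \frac{d h_i^{HD}(t)}{dt}
  = \underbrace{-\,h_i^{HD}(t)}_{\text{decay}}
  + \underbrace{\frac{\phi_0}{C^{HD}} \sum_j \bigl(w_{ij}^{RC} - w^{INH}\bigr)\, r_j^{HD}(t)}_{\text{recurrent connections}}
  + \underbrace{I_i^{V}}_{\text{visual input}}
  + \underbrace{\frac{\phi_1}{C^{HD \times ROT}} \sum_{j,k} w_{ijk}^{ROT}\, r_j^{HD}(t)\, r_k^{ROT}(t)}_{\text{idiothetic input}}
\]
```

Setting $I_i^{V} = 0$ gives the darkness case mentioned in the text: the last term, which multiplies head-direction and rotation-cell firing rates, can still shift the activity around the ring.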
The mathematical model, on the other hand, is a symbolic representation: it bears no physical resemblance to the system it models, and its pairing to objects is supported by convention, or relies importantly on descriptions in natural language.5 But no representation is exclusively iconic or symbolic (Goodman 1968; Klein 2003; Grosholz 2007), and the mathematical model is not completely emancipated from the iconicity of the biophysical model that precedes it. After all, the mathematical model is based on the biophysical model. It mathematically represents the same types of cells, the arrangement of inputs, and electrophysiological activity and implicitly assumes the same teleomechanical compromises. For instance, concerning inputs j to i, natural language is employed to clarify that “neurons that represent similar states of the agent in the physical world have strong connections.” That is, neurons that are situated nearby in the compass arrangement (which represent facing directions separated by small angular distances) are connected strongly. In addition, the ring structure is implicitly assumed by the introduction of rotation cells, and more evidently described in natural language, by specifying that these cells can be either “clockwise rotation cells” or “anti-clockwise rotation cells.” Moreover, the natural language surrounding the model in Stringer et al. shows teleological notions similar to those of Skaggs et al.: “Some neurons encode information about the
4 $r_k^{ROT}$ is the firing rate of rotation cell k and $w_{ijk}^{ROT}$ is the overall effective connection to head-direction cell i.
5 This contrast of iconic representations against symbolic representations is due to Peirce (1885).
orientation or position of an animal (...),” “A key challenge in these CANN models is how the bubble of neuronal firing representing one location in the continuous state space can be updated based on non-visual, idiothetic, cues to represent a new location in state space,” “These networks maintain a localized packet of neuronal activity representing the current state of the animal. We show how the synaptic connections in a one-dimensional continuous attractor network (of for example head direction cells) could be self organized (...).” As we see, the mathematical model is partially emancipated from the biophysical model. Due to its symbolic character, it is emancipated enough to allow for simulations and quantitative predictions. However, it is due to its iconicity that it preserves many of the traits of the biophysical model, and therefore, the teleological precedence is still present at this stage of the research process. The process of emancipation is, however, continuous, and a middle step of the process is illustrated in Fig. 5, where both the iconic (cells, synapses) and symbolic (mathematical terms, natural language) are explicitly manifest.
[Figure: “Synaptic connections for Sigma-Pi Model 1A.” The diagram shows head direction cells i and j, a clockwise rotation cell, and an anti-clockwise rotation cell, with recurrent connections to head direction cells from other head direction cells, and idiothetic connections to head direction cells from pairings of rotation cells and other head direction cells.]
Fig. 5 Recurrent and idiothetic synaptic connections to head-direction cells in the sigma–pi model 1A. In this figure there is a single clockwise rotation cell with firing rate $r_1^{ROT}$ and a single anticlockwise rotation cell with firing rate $r_2^{ROT}$. In addition, the idiothetic synaptic weights from the clockwise and anti-clockwise rotation cells are denoted by $w_{ij1}^{ROT}$ and $w_{ij2}^{ROT}$, respectively. Taken from Stringer et al. 2002
5 Simulations
We have seen that the partial emancipation of the mathematical model allows for simulations that can be assessed quantitatively. Indeed, this model has been used to perform simulations showing that several phenomena of head-direction cells can be approximated quantitatively: subjecting an artificial agent to clockwise and anti-clockwise rotations under these parameters, or having it face different directions while stationary, yields an activity packet in the artificial network similar to that observed in the brain’s head-direction system. How is this interpreted? The quantitative assessment of the simulation indicates that the proposed mechanism could account for a sense of angular direction. This interpretation, however, relies on the initial teleological notion that such is the purpose of head-direction cells, which directed the research process from the beginning: the interpretation and quantification of the phenomenology of cells when first discovered, the proposition of specific computational solutions to problems, the arrangement of feasible physical implementations of such computations, and finally, the elaboration of mathematical formulas and simulations that match quantitative aspects of the phenomenology. Therefore, to the extent that mathematical models and simulations turn out to be convincing, the initial teleological notions gain further support in the later stages of research.
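To make the idea of a quantitatively assessable simulation more concrete, the following Python fragment is a minimal sketch of the two kinds of input discussed above: a recurrent term and a sigma-pi idiothetic term in which head-direction firing rates are multiplied by a rotation-cell firing rate. It is not the Stringer et al. simulation. Their weights are self-organized through learning, whereas here they are hand-set Gaussian profiles, and every numerical value, the function names, the form of the recurrent term, and the convention that clockwise means increasing angle are assumptions made only for illustration. The sketch evaluates the input once rather than integrating the network dynamics over time.

```python
import math

N = 60          # number of model head-direction cells (assumed)
SIGMA = 20.0    # width in degrees of the weight and packet profiles (assumed)
PHI0 = 1.0      # stand-in for the constant phi_0 (assumed value)
W_INH = 0.3     # stand-in for the global inhibition constant w^INH (assumed value)
SHIFT = 12.0    # angular offset of the idiothetic weights in degrees (assumed)

# Preferred facing direction of each cell, evenly spaced around the ring.
PREFERRED = [i * 360.0 / N for i in range(N)]


def circ_diff(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def gauss(d):
    """Gaussian profile over an angular distance d given in degrees."""
    return math.exp(-d * d / (2.0 * SIGMA ** 2))


def packet(center):
    """Firing rates r_j^HD forming an activity packet around `center`."""
    return [gauss(circ_diff(p, center)) for p in PREFERRED]


def drive(rates, r_cw=0.0, r_acw=0.0):
    """Recurrent plus idiothetic input to every head-direction cell i."""
    total = []
    for p_i in PREFERRED:
        rec = cw = acw = 0.0
        for p_j, r_j in zip(PREFERRED, rates):
            d = circ_diff(p_i, p_j)
            # Recurrent term: (w_ij^RC - w^INH) * r_j^HD, with hand-set weights.
            rec += (gauss(d) - W_INH) * r_j
            # Sigma-pi terms: weight * r_j^HD * rotation-cell rate.
            # Clockwise is taken to mean increasing angle (an assumption).
            cw += gauss(d - SHIFT) * r_j * r_cw
            acw += gauss(d + SHIFT) * r_j * r_acw
        total.append((PHI0 * rec + cw + acw) / N)
    return total


def peak_direction(values):
    """Preferred direction of the cell receiving the largest input."""
    return PREFERRED[max(range(N), key=lambda i: values[i])]


rates = packet(90.0)
print("stationary:", peak_direction(drive(rates)))
print("clockwise:", peak_direction(drive(rates, r_cw=1.0)))
print("anti-clockwise:", peak_direction(drive(rates, r_acw=1.0)))
```

With the rotation cells silent, the input peaks where the activity packet already sits; with one rotation cell active, the peak is displaced in the corresponding direction. Note that reading this displacement as "updating the sense of direction" already presupposes the purpose ascribed to the cells, which is the point made in the text.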
6 Mediation Between Modalities of Representation
So far, it has been shown how teleological content is present at all stages of the research program, be it in the form of intuition or of models influenced by such intuition. But how do teleological notions implicitly end up in a symbolic representation like a mathematical model? And how can teleological, material, and formal content coexist in a single representation without turbulence, under control? A way to answer these questions is to analyze the relations between the different modalities of representation at stake. The first representations of teleological notions occur in natural language. Natural language is particularly useful for explicit descriptions of teleological content. For instance, after early observations of the phenomenology of a certain type of cell, “the purpose of the head-direction cell system is to provide a sense of angular direction” is a straightforward, early representation of a teleological notion in natural language. Later, we have iconic representations, which represent, among other types of content, teleological content. But the iconic modality of representation is less explicit and straightforward than the natural language representation, partly because it represents several types of content, not only teleological. The amalgamation of different types of content in a single representation is not necessarily a limitation of the iconic modality, but rather, a useful aspect of it: it is the integration of different
content and the representative ambiguity that may account for part of the success of science and mathematics (Grosholz 2007, Chaps. 2–5). This applies to the way that molecules are iconically represented in chemistry (icons representing, and making compromises in the representation of, different types of content such as kinds and number of atoms, structure, particularity but also generality). The icon of a molecule must compromise explicitness and physical resemblance to accommodate all this information. For example, hydrogen atoms are not depicted but presupposed, and the physical structure of the icon must sacrifice physical faithfulness to be able to present somewhat clearly the components of the molecule (so that the translation of the icon to a formal representation, the Berzelian formula, is not too bothersome). Likewise, the iconic representation of the head-direction system is not completely faithful to its physical properties, since it has to accommodate more content than just that: besides bearing a certain physical resemblance, it facilitates the translation to a formal system (so it places emphasis on what are considered relevant aspects such as cells and synapses) and integrates teleological notions earlier represented by natural language (depiction of a ring attractor network reminiscent of compass-like circularity, hypothetical synapses conveying information critical for the role that head-direction cells are supposed to play, and a rotatory component), all at the expense of physical faithfulness. In addition, the model does not substitute representations in natural language, but instead is presented together with natural language, which assists in the interpretation and includes clarifications on how the content of the iconic representation (material, abstract relational, and teleological) is to be understood. This becomes evident just by looking at the presentation of the models discussed in this paper.
However, the multifaceted and ambiguous character of the iconic representation demands more than just its coexistence with representations in natural language, which is not enough to control representative ambiguity. A certain tacit knowledge implicit in the scientific tradition and practices, and provided by apprenticeship and membership, is required. For instance, what is depicted in the iconic representation as a rotation cell is a compromise between physical structure (either as a proper cell or groups of cells and axons . . . ) and necessary function (the cognitive sense of direction must be subjected to angular rotations), and its interpretation varies depending on specific contexts and activities within the scientific practice: Neuroanatomical analyses focus on the physical facet (but do not completely disregard functional intuitions), while behavioral analyses prioritize cognitive functions (but the analysis is constrained to some degree by what is known about the physical). The translation of the iconic representation into a symbolic representation itself is another component of the scientific practice that is dependent on tacit knowledge. Even if presented amalgamated, different types of content from the iconic representation and natural language are carefully but unproblematically selected, rearranged, and transformed. Let us consider the rotation element again. Its mathematization in conjunction with the rest of elements in the equation is the result of a new, value-oriented integration of the physical, relational, and functional aspects. It is constrained by both notions of physical feasibility, like what kind of electrical activity is reasonable and what relations with other elements are likely,
and teleological notions, such as how the rotation element should modify the firing rate values of head-direction cells so that it contributes to the overall purpose of the head-direction system. Finally, there are the symbolic/formal representations. According to Grosholz (2007, Chap. 3, p. 79), the symbolic modality of representation is more tolerant than the iconic modality regarding the kind of content it can represent. This is, in part, because the symbolic modality is not as constrained by physical resemblance (although it is not completely detached from it). And while the iconic modality is better at representing physical structure, the symbolic modality is more suitable for the representation of abstract relational structure. For this reason, symbolic representations can further sacrifice physical structure and make other content more explicit (relations between components) and, as we have already seen, enable important techniques (simulations), while at the same time preserving teleological notions in the form of necessary elements to account for the purpose ascribed to head-direction cells (idiothetic and visual input and a rotatory component that together modify the firing rate values of head-direction cells, account for compass-like dynamics and explain changes in the cognition of angular directionality). And while accomplishing those feats, inklings of the physical structure are still represented (the rotatory component preserves the compass-like circularity of the ring attractor arrangement, while synapses are represented in terms of abstract relations, forming a relational structure). The mathematical model is not only about quantities, but is part of the context of a scientific practice, a bigger picture where it acquires meaning from, and confers meaning to, other elements of the practice (for example, but not only, other representations).
Yet again, and even if sometimes the mathematical model is regarded as a self-sufficient object, it does not substitute iconic representations or natural language, which help interpret the meaning of parameters and numerical values. And just like in the case of iconic representations, tacit knowledge must come into play to further control the ambiguity at issue. The mathematical model, even if conceived as an end product or the pinnacle of a research program, is a practice-embedded representation that enables techniques and unifies quantities and abstract relations with important intuitions of the scientists, in this particular case, structure and purpose. The symbolic representation is enacted by its ancillary iconicity and verbality and becomes defunct when regarded in isolation from its practical contingencies. We have a scenario where natural language, iconic representations, and symbolic representations coexist not only in broad contexts like scientific practices, but also in confined, simplified spaces like research papers. These representations, far from possessing univalent and straightforward meanings, include very different kinds of content, each important in its own way. Because they do not explicitly convey all the features of the phenomena they represent, but capture them only partially, they are ambiguous. Furthermore, the different representations in the practice are entangled with each other and cannot be dissolved without affecting their meanings and applications. Representational ambiguity, when controlled, is not faulty, but can help tackle the different aspects of heterogeneous and complex practices, like scientific practices. The harmonious coexistence of the different representations embedded
in the practice, facilitated by the modulation of tacit knowledge and convention, keeps ambiguity under control. The representations involved in the case discussed here, each multifaceted in its own way, enable the operationalization of multiple kinds of content (teleological intuitions, physical structure, abstract relations, quantities). Under this practical harmony, the various representations involved work their magic, gracefully wrapping up in the same package content as diverse and seemingly incompatible as teleology and mechanisms.
7 Conclusion
Through the discussion of the brain’s “inner compass” and the models presented here, we have seen how the teleological notions that typically guide biological research are present even when mathematical techniques are introduced. Instead of merely depicting a plausible mechanism, the models hold on to the very same teleological content to which researchers committed early in the research program. Moreover, mathematical modelling and computer simulations may further endorse the use of teleological content as it becomes canonical in the research program.6 In the biological scientific practice, it is common to observe reality through a teleological lens, which influences the process of constructing objects of study. In the example discussed in this paper, we have seen how teleological notions are present in all stages of the research program and precede new developments in the chain of progress. This includes the stages where mathematical modelling takes place. Mathematics is, therefore, compatible with teleology-based biological scientific practice and is not a resource that will necessarily make biology a non-teleological science. Its representative and justifying potential, often ambiguous, multifaceted, and in interaction with iconicity and natural language, is far from being limited to mechanisms, statistics, or abstract objects. And while mathematics is ontologically tolerant in principle, it becomes ontologically insistent when embedded in practices and surrounded by other representations. However, it remains to be seen how much this ontological tolerance of mathematics can be stretched, as it is currently under debate whether there are certain kinds of biologically relevant content (such as historicity, organization, variation, and certain conceptions of possibility and novelty) that current mathematics is unable to represent (see, for example, Longo 2018; Montévil 2018; Montévil et al. 2016).
6 Typically, in a research program, there is a teleological notion about a given biological phenomenon that stands dominant among alternatives, if there are alternatives. For example, regarding grid cells, it has been proposed that their function might be single-cell computation (and the feasibility of this has been backed by mathematical models as well) (Kropff and Treves 2008), but the canonical teleological notion is that they form a system that computes as a whole. In fact, “how the grid cell system processes spatial information” has been a source of inspiration for “actually designed” information processing neural networks (Banino et al. 2018), further blurring the line between “as if designed” and “actually designed.”
References
Ayala, F. J. (1968). Biology as an autonomous science. American Scientist, 56(3), 207-221.
Ayala, F. J. (1999). Adaptation and novelty: Teleological explanations in evolutionary biology. History and Philosophy of the Life Sciences, 21(1), 3-33.
Banino, A., Barry, C., Uria, B., Blundell, C., Lillicrap, T., Mirowski, P., ... & Wayne, G. (2018). Vector-based navigation using grid-like representations in artificial agents. Nature, 557(7705), 429-433.
Blair, H. T., Lipscomb, B. W., & Sharp, P. E. (1997). Anticipatory time intervals of head-direction cells in the anterior thalamus of the rat: Implications for path integration in the head-direction circuit. Journal of Neurophysiology, 78(1), 145-159.
Enquist, B. J., & Stark, S. C. (2007). Follow Thompson’s map to turn biology from a science into a Science. Nature, 446(7136), 611.
Goodman, N. (1968). Languages of art: An approach to a theory of symbols. Indianapolis, IN: Hackett Publishing.
Grosholz, E. R. (2007). Representation and productive ambiguity in mathematics and the sciences. Oxford, UK: Oxford University Press.
Kaplan, D. M. (2011). Explanation and description in computational neuroscience. Synthese, 183(3), 339-373.
Klein, U. (2003). Experiments, models, paper tools: Cultures of organic chemistry in the nineteenth century. Palo Alto, CA: Stanford University Press.
Kropff, E., & Treves, A. (2008). The emergence of grid cells: Intelligent design or just adaptation? Hippocampus, 18(12), 1256-1269.
Longo, G. (2018). How future depends on past and rare events in systems of life. Foundations of Science, 23(3), 443-474.
McNaughton, B. L., Chen, L. L., & Markus, E. J. (1991). “Dead reckoning,” landmark learning, and the sense of direction: A neurophysiological and computational hypothesis. Journal of Cognitive Neuroscience, 3(2), 190-202.
Montévil, M. (2018). A primer on mathematical modeling in the study of organisms and their parts. In M. Bizzarri (Ed.), Conceptual and methodological challenges in systems biology (pp. 41-55). New York, NY: Humana Press.
Montévil, M., Mossio, M., Pocheville, A., & Longo, G. (2016). Theoretical principles for biology: Variation. Progress in Biophysics and Molecular Biology, 122(1), 36-50.
Peirce, C. S. (1885). On the algebra of logic: A contribution to the philosophy of notation. American Journal of Mathematics, 7(2), 180-196.
Ranck, J. B. (1984). Head direction cells in the deep layer of dorsal presubiculum in freely moving rats. Society of Neuroscience Abstracts, 10, 599.
Ratzsch, D. (2010). There is a place for intelligent design in the philosophy of biology: Intelligent design in (philosophy of) biology: Some legitimate roles. In F. J. Ayala & R. Arp (Eds.), Contemporary debates in philosophy of biology (pp. 343-363). Malden, MA: Wiley-Blackwell.
Skaggs, W. E., Knierim, J. J., Kudrimoti, H. S., & McNaughton, B. L. (1995). A model of the neural basis of the rat’s sense of direction. In G. Tesauro, D. S. Touretzky & T. K. Leen (Eds.), Advances in neural information processing systems 7: Proceedings of the 1994 conference (pp. 173-180). Boston, MA: MIT Press.
Storer, N. W. (1967). The hard sciences and the soft: Some sociological observations. Bulletin of the Medical Library Association, 55(1), 75-84.
Stringer, S. M., Trappenberg, T. P., Rolls, E. T., & Araujo, I. (2002). Self-organizing continuous attractor networks and path integration: One-dimensional models of head direction cells. Network: Computation in Neural Systems, 13(2), 217-242.
Taube, J. S., Muller, R. U., & Ranck, J. B. (1990a). Head-direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis. Journal of Neuroscience, 10(2), 420-435.
Taube, J. S., Muller, R. U., & Ranck, J. B. (1990b). Head-direction cells recorded from the postsubiculum in freely moving rats. II. Effects of environmental manipulations. Journal of Neuroscience, 10(2), 436-447.
Yourgrau, W., & Mandelstam, S. (1955). Variational principles in dynamics and quantum theory. London, UK: Sir Isaac Pitman & Sons.
Arithmetic, Culture, and Attention
Jean-Charles Pelland
Abstract
The study of numerical cognition has undergone tremendous progress in recent years, accumulating scores of data on cognitive systems that could be involved in the uniquely human ability to practice formal arithmetic. Among the important questions tackled by this burgeoning domain of research is what happens to the limited cognitive systems that we share with many animal species to allow us to develop arithmetically-viable numerical content. While answers to this question have varied, most have attributed a constitutive role to culturally-inherited extracranial cognitive support in their explanation of how numerical content emerges from our innate cognitive machinery. The idea here is that we need to look at our interaction with external support for cognition like fingers, numerals, and number words, to explain what allows us to go beyond the size and precision limitations of the cognitive systems we are born with. In this paper, I challenge this externalist answer to the origins of our arithmetical skills and argue for an internalist approach to the development of formal arithmetical skills. I argue that culture-independent learning trajectories involved in learning the meaning of number words as well as individual differences in arithmetical abilities against fixed cultural backgrounds suggest adopting a pluralist approach to our formal numerical abilities, where externalism only holds beyond an initial segment of the natural numbers. In Sect. 1, I discuss the appeal of adopting cognitive externalism with respect to numerical cognition. Then, in Sect. 2, I discuss culture-invariant aspects of the learning trajectory we follow when learning the meaning of number words. Section 3 contains an introduction to research into Spontaneous Focusing on Numerosity (SFON), in order to illustrate how individual differences in arithmetical abilities could be explained by referring only to things going on inside the head. I close in Sect. 4 by offering a few remarks on why paying attention to numerical aspects of the world, as framed by SFON research, may be a good place to build an internalist explanation of how we learn to manipulate numbers.
J.-C. Pelland
Université du Québec à Montréal, Montreal, QC, Canada
© Springer Nature Switzerland AG 2020
M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_5
1 Introduction: Extended Numerical Cognition
The study of the cognitive and perceptual systems underlying our numerical abilities has progressed tremendously in the past few decades, yielding scores of data on the potential role played by the so-called approximate number system (ANS) and the object-file system (OFS) in the development of natural number concepts (Carey 2009; Dehaene 1997/2011; Feigenson et al. 2004; Cohen Kadosh and Dowker 2015). While there is still disagreement on the relationship between these systems and on the extent to which they produce representations with numerical content, there is overwhelming consensus that, on their own, neither of them produces representations with sufficient precision and numerical range to account for the development of natural number concepts. For example, not only is there compelling evidence that the OFS does not produce representations with explicit numerical content (Carey 2009), but it can also only keep track of a very small number of items (up to 4). On the other hand, the ANS’s representations are, as its name suggests, increasingly approximate as stimulus numerosity1 increases, so that it is unable to accommodate distinct numerical magnitudes that are too close together (e.g., 21 and 22). Given these obvious limitations to our innate cognitive systems, one of the most important questions in the study of numerical cognition is how we bridge the developmental gap between the limited and imprecise output of our innate cognitive machinery and the mathematically-viable content associated with the ability to think about (and with) natural numbers. Most solutions to this gap problem have adopted a form of externalism about cognition that focuses on our cognitive coupling with culturally-inherited symbols and extracranial artifacts like knot systems, tally systems, number words, numeral systems, abacuses, etc.
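The imprecision of the ANS mentioned above can be illustrated with a small numerical sketch. The Python fragment below assumes a common psychophysical idealization that is not spelled out in this chapter: each numerosity n is represented as a Gaussian with mean n and standard deviation w·n, where w is a Weber fraction. The value w = 0.2 and the function name are hypothetical choices made only for illustration.

```python
import math


def ans_accuracy(n1, n2, w=0.2):
    """Predicted proportion correct when judging which of n1, n2 is larger.

    Assumes Gaussian representations with standard deviation w * n
    (scalar variability); w is a hypothetical Weber fraction.
    """
    if n1 == n2:
        return 0.5  # nothing to discriminate, so performance is at chance
    # The difference of the two noisy estimates has sd w * sqrt(n1^2 + n2^2).
    d_prime = abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2))
    # Standard normal cumulative distribution evaluated at d_prime.
    return 0.5 * (1.0 + math.erf(d_prime / math.sqrt(2.0)))


for pair in [(2, 3), (20, 30), (21, 22)]:
    print(pair, round(ans_accuracy(*pair), 3))
```

Under these assumptions, predicted accuracy depends only on the ratio of the two numerosities, so 2 versus 3 and 20 versus 30 are equally discriminable, while 21 versus 22 is close to chance. This is one way of stating the gap between approximate representations and exact natural number concepts.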
Philosophical motivation for such externalist answers to the gap problem generally draws on variants of Clark and Chalmers’ (1998) extended mind thesis, which interprets the fact that cognition often involves dynamic interaction between individual brains, bodies, and the objects in their environment as justifying the claim that things outside our head can count as legitimate, constitutive parts of cognitive systems.2 It is by appealing to such extended cognitive loops involving things outside our heads that the majority of theories of the development of numerical cognition explain how we bridge the gap and go beyond the limitations of systems like the
1 While this term is rightfully criticized for being imprecise (Beck 2014; Schlimm 2018), I use it here to describe the number of items of a perceived collection of objects. In other words, when we focus on the number of objects in a perceptual scene, numerosity describes the number of things we perceive.
2 Common examples of such loops of reciprocal causation include using pen and paper to run through calculations and physically re-arranging letters when playing Scrabble: in both cases, we manipulate external media to complete cognitive tasks instead of doing so in our heads.
ANS and the OFS.3 For example, while he does not explicitly endorse externalism about cognition, Stanislas Dehaene (1997/2011) holds that when children learn to count, numerals such as those in a count list come to be associated with representations of the ANS. Learning the meaning of the number words is then a matter of mapping these to a precise location on what has come to be called a “mental number line” (MNL).4,5 For Dehaene, the answer to how we bridge the gap from the content of our evolved cognitive systems to the content of arithmetically-viable number concepts lies in culture: “Cultural inventions, such as the abacus or Arabic numerals . . . transformed [the intuition of number] into our fully-fledged capacity for symbolic mathematics.” (Dehaene 1997/2011, p. x) If Dehaene is right, without precise symbols, there would be no advanced number concepts. Implicitly or explicitly, then, answers to the gap problem have generally relied on attributing a constitutive,6 irreplaceable role to culturally-inherited extracranial objects, artifacts, or symbols in explaining how we move beyond the limits of innate systems like the ANS and the OFS. The appeal of such externalism with respect to formal numerical cognition is undeniable. The fact that the practice of arithmetic seems to rely on mastery of culturally-inherited numeration systems like the Indo-Arabic numerals makes it look like an obvious case of cognition necessarily relying on things outside the head and of culture allowing us to go beyond our innate cognitive limitations by supplying us with material extensions of our mind. Indeed, there seems to be no way for a person to express (let alone conceive of) the difference between numbers like 100,001 and 100,000 other than by mastering a numeration system with the appropriate syntax (Schlimm 2018). Further support for seeing the practice of arithmetic as a case of cognition involving things outside our head comes from brain imaging evidence showing
3 E.g. Hurford (1987), Dehaene (1997/2011), Lakoff and Núñez (2000), Wiese (2004), De Cruz (2008), Carey (2011), Coolidge and Overmann (2012), Menary (2015), Malafouris (2010, 2013), and Ansari (2008).
4 Galton (1880) was the first to describe human numerical representations as being ordered on a left-right line, while Restle (1970) was the first to explicitly tie in Moyer and Landauer’s (1967) work to an analog line. See also Gallistel and Gelman (1992, 2000), Dehaene (2003), Cantlon et al. (2009), and van Dijck et al. (2015).
5 Carey (2009) offers two main reasons to doubt that such mapping-based accounts succeed: first, they do not explain why learning the meaning of number words proceeds in a stepwise manner (Wynn 1990, 1992), as I discuss in Sect. 2 below. Second, while the evidence that such a mapping occurs is strong, there is also evidence that children have an understanding of what number words mean before this mapping occurs. See Carey (2009).
6 There is a lot to be said about what sort of thing constitutivity is here, and how it applies to numerical cognition, but this is not the place for such discussions. Dutilh Novaes (2013) and Schlimm (2018) are good places to get more in-depth discussion on what constitutivity means in this context. For current purposes, we can frame the notion as follows: in order to explain how children learn the meaning of the first few number words, do we need to appeal to aspects of our cognitive coupling with things outside our head, like the externalist would, or can we explain this learning process in terms of descriptions of purely intracranial processes, in a way analogous to how one would describe a digestive process?
the importance of culture-specific input on how we learn to manipulate numbers. Tang et al.’s (2006) oft-cited study, which used fMRI to determine whether there are culturally-induced differences in which parts of the brain are used to process simple arithmetical tasks presented in Indo-Arabic format, offers strong support for externalism about numerical cognition. The authors found increased activation in the left premotor cortex for Chinese speakers and increased activation of left perisylvian areas for English speakers. This difference in activation patterns likely reflects the difference in external artifacts and methods used when learning how to calculate, showing that our brain has integrated culture-specific practices involving external object manipulation in its circuitry. Another finding that seems to demonstrate the importance of culture-specific inputs in how we process numerical information is Cantrell et al.’s (2015) study, which asked participants to match displays based on the number of items they contained. Here, researchers found that Japanese speakers are more likely to attend to non-numerical cues than English speakers for numerosities larger than sixteen.7 This does not seem to bode well for internalist answers to the gap problem: if there are such culture-specific aspects to how we practice arithmetic, then it is difficult to frame arithmetic as being something that we can describe purely by referring to processes going on inside our head.
In short, there is good reason to appeal to cultural factors in explaining how we bridge the gap between natural number concepts and the content produced by systems like the ANS and the OFS: not only do we learn and manipulate numbers by relying on culturally-inherited symbols for these, but the practice of arithmetic as we know it is the result of a gradual, cumulative process of cultural evolution that took place over many generations, involving the transmission of cultural recipes and numeration systems from one person to another, and from one generation to the next. And yet, despite the obvious importance of external cognitive crutches like numerals and number words in the practice of arithmetic, in this paper, I want to argue that following the externalist approach and factoring in cultural effects on development is of limited use in explaining how we bridge the gap. Two main reasons motivate my skepticism towards externalist answers to the gap problem. The first is empirical and concerns culture-independent constants in the development of numerical thinking in children. I discuss these in the next section. The second has to do with individual differences in arithmetical abilities against fixed cultural backgrounds, which I examine in Sect. 3. I then offer a sketch of an internalist answer in Sect. 4.
7 The explanation given is that speakers of languages like Japanese that do not mark the count-mass distinction as much as English are more likely to attend to continuous properties of stimuli than speakers of languages that emphasize this distinction. However, this link between marking the count-mass distinction and attending to continuous properties has been questioned (e.g., Barner et al. 2009).
Arithmetic, Culture, and Attention
2 The Induction and the Limits of Externalism
In this section, I want to present data that suggest some of the most important aspects of the development of numerical thinking in children are the same across cultures. If this is true, I think this makes a strong case against focusing on how we become coupled to things outside our head to explain how we learn to manipulate numbers, given that what is outside the head can change dramatically without there being any associated change in a child’s learning trajectory. To see what I mean when I talk about culture-independent aspects of how we bridge the gap, consider Karen Wynn’s celebrated research on the development of numerical abilities (Wynn 1990, 1992),8 which shows that children reliably go through the same stages in learning what numbers are. These stages typically go as follows: the “No-Numeral-Knower” stage describes when children are unable to give even one object when asked to, even though they have memorized a list of number words and can recite these in order. Then, typically between 24 and 30 months of age, they correctly give one object when asked to, but fail at any other number, at which point they are “One-knowers.” Six to nine months later, they become “Two-knowers,” where they can correctly give one object when asked to, or two objects when asked for two, but give random numbers of objects for any number word larger than two. Another, distinct three-knower stage follows.9 The crucial knowledge stage almost invariably occurs once children have become three-knowers (in some rare cases, after being four-knowers), when what is sometimes called the Induction10 happens:
around age 3 1/2 on average, English middle-class children become cardinal principle knowers—they work out the numerical meaning of the activity of counting and can now reliably produce sets with the cardinal value of any numeral in their count list. (Carey 2009, p. 298)
In other words, once the induction happens, children’s knowledge of the meaning of number words is no longer stuck at individual numbers from 1 to 4. After the induction, children’s knowledge of numbers generalizes to their entire count list. After the induction, they know that adding one item to a collection means that the quantity of things in that collection can be described by using the next word in their count list.11
8 See also Sarnecka and Lee (2009).
9 Children at these stages are often referred to as “subset-knowers,” to highlight the fact that, for 12–18 months, they only know how to correctly apply an initial segment of their count list.
10 E.g. Margolis and Laurence (2008), Rips et al. (2008b). For criticism see Rips et al. (2008a).
11 I should mention that the induction does not lead directly to a full understanding of what numbers are. For example, there is evidence that it is possible to become a cardinal principle knower without understanding exact equality, and vice versa (Jara-Ettinger et al. 2016). This being said, one can question to what extent the understanding of exact equality detected by the tasks used in this research is an understanding of exact numerical equality. Also, the understanding that there is no largest number and that there are infinitely many of these comes much later than the induction, if at all (Chen and Mazzocco 2017). This being said, the induction does allow children to generalize their understanding of how the count list works to allow them to correctly identify the number of objects
While the knower stages are well documented, the mystery persists as to what changes occur when the induction happens: what role do number words play in the induction, and what effect do they have on systems like the ANS and the OFS, if any, to allow us to bridge the gap and learn what numbers are? I do not propose to answer these questions here. What I do want to highlight is that, while cultural background can certainly have a strong influence on the extent to which an individual pays attention to numerosity and the age at which this happens, there appears to be no evidence that cultural background can affect how the induction goes through. For example, as far as I know, there are no cultures where the induction occurs when children learn the meaning of “eight,” or “twelve,” or “two.” While there are data showing that Japanese speakers are slower to become one-knowers,12 this does not change the fact that they go through the same knower stages as children speaking Russian and English (Sarnecka et al. 2007). The same incremental learning trajectory occurs even later for learners living in less industrialized cultures like the Tsimane’, where early education is less of a priority than in more industrialized countries (Piantadosi et al. 2014). Nevertheless, here too we find that learning the meaning of number words is divided into the same knower stages, and the induction occurs at the same knower level.13 Such independence of cultural factors prompted Piantadosi and colleagues to conclude that “The presence of a similar developmental trajectory likely indicates that the incremental stages of numerical knowledge—but not their timing—reflect a fundamental property of number concept acquisition which is relatively independent of language, culture, age, and early education.” (Piantadosi et al. 2014, p.
1, emphasis added) Further, as mentioned earlier, Cantrell and colleagues found some cultural effects on the extent to which individuals pay attention to numerosity, but this result only held for larger numerosities (around 16). It is important to mention that well-known culture-independent effects were also found in this study, including that children’s attention is biased towards numerosity for small collections containing 1–4 items, and that attention to numerosity decreases with set size—in both cases, irrespective of cultural background. Evidence that there are aspects of the development of formal numerical cognition that do not require us to adopt externalism about cognition also comes from number-related deficits like dyscalculia. In such cases, it is not clear what sort of input cultural factors can have in identifying what the problem is, nor how
presented to them by enumeration for any number in their count list, which is not something they can do before the induction happens. 12 This could potentially be explained by the fact that classifier languages like Japanese do not emphasize the singular-plural distinction as much as English does, which means Japanese speakers’ attention is not as solicited by quantity-related information as it is for English speakers. See Carey (2009) or Sarnecka et al. (2007) for more on this. 13 Similarly, studies of bilingual learners suggest that while there is considerable evidence of language-specific effects on onset times of learning, “the logic and procedures of counting appear to be learned in a format that is independent of a particular language and thus transfers rapidly from one language to the other in development” (Wagner et al. 2015, p. 2).
to solve it. Hypothetically, it could turn out that dyscalculia is more prevalent in certain cultures than others due to culture-specific effects (say, certain eating habits that have a negative impact on the brain’s development), if one stretches one’s imagination far enough. But even in such cases, the reference to cultural practices in the explanation of why individuals with dyscalculia have problems processing numerical information would be secondary, in that reference to cultural factors is not required in order to explain what is not working well in these people’s brains. Rather, the active ingredient in the explanation would be relative to individual-level processes that can be described independently of cultural factors or things outside the head. The point here is that the insight that leads to an understanding of what numbers are occurs in individuals’ heads, without any accompanying change in their environment. Given this absence of change in the environment despite change in understanding, I claim that any answer to the specific developmental question of how we bridge the gap should focus on cognitive processes that do not depend on attributing a constitutive status to culturally-inherited extracranial numerical artifacts. While this may appear trivial, it flies in the face of externalist claims for the constitutivity of external objects to the practice of arithmetic, which are omnipresent in the literature. To strengthen my case against externalism with respect to how we bridge the gap, in the next section, I consider cases of individual differences in arithmetical abilities against a fixed cultural background, as illustrated in research on Spontaneous Focusing on Numerosity (SFON). This will also allow me to sketch an internalist answer to the gap problem in Sect. 4 by exploiting aspects of SFON research.
3 Spontaneous Focusing on Numerosity
Since 2000, Minna Hannula-Sormunen and colleagues have been studying what they dubbed Spontaneous Focusing on Numerosity, or SFON.14 In a nutshell, SFON describes the tendency to behave in reaction to numerical features of the environment without having been directed to do so (i.e., spontaneously). The aim of SFON studies is to determine if there is a distinct mental process that characterizes paying attention to numerosity versus other features of stimuli, and to what extent such a process is related to arithmetical development. One of the hypotheses motivating this research is that a child’s tendency to attend to numerosity instead of other features of the environment can be considered a stable behavioral trait that should manifest itself across many tasks and over developmental time. The point I want to make in this section is that SFON research is an example of a productive way to study the development of numerical cognition that not only does not rely on
14 For a review see Hannula-Sormunen (2015). See also Hannula (2000), Hannula and Lehtinen (2001, 2005), Hannula et al. (2010), and Hannula-Sormunen et al. (2016).
externalist assumptions, but that actually benefits from abstracting away from the influence of culture-specific factors and external aids to cognition, since finding individual differences in SFON tendencies involves testing individuals from the same cultural background. To understand this point, it will be helpful to take a look at some of the ideas and methods used in this research. The main idea behind SFON studies is that, in order for someone to learn what numbers are, they first have to pay attention to quantities of discrete objects in their environment, that is, to the numerosity of collections. If paying attention to numerosity is related to the development of numerical cognition, then people who tend to pay more attention to numerosity than others have a better chance of developing an understanding of what numbers are, other things being equal. This means that SFON research could shed some light on why some individuals find it easier to learn what numbers are by identifying the process responsible for a person’s attending to numerical aspects of their environment. It also means that if we find ways to help people pay more attention to numerosity, we might help them learn what numbers are. SFON studies, as well as the related study of Spontaneous Attention to Number (SAN),15 have tried to home in on potential effects of individual differences in attention to quantities of discrete objects by attempting to let test subjects choose which aspect of a stimulus they respond to, instead of explicitly telling them to respond to numerical aspects of stimuli, as is common in numerical cognition research. To do this, researchers in SFON studies give participants ambiguous instructions that can be interpreted in many ways, only some of which require paying attention to numerosity.
Unlike research that studies children’s numerical abilities via explicit linguistic instructions such as “tell me which pile has more” or “how many on this card?”, the use of ambiguous directives in SFON studies does not force participants to attend to a particular (numerical) aspect of stimuli, thereby allowing them to freely attend to any feature, including numerosity, to complete the task. By putting the burden of figuring out which aspect of the stimulus to respond to on the participants, it is possible to single out to which extent some participants have a tendency to react to numerical features of stimuli without being explicitly told to do so, and thus, to get an idea of the level at which they spontaneously16 attend to
15 Baroody et al. (2008) and Baroody and Li (2015). According to Baroody and colleagues, SFON research mainly focuses on children who have already developed some number concepts, which might mean that in SFON tasks, children can match collections based on numerosity in part because they have learned to attend to numerosity when learning the meaning of number words. It bears mentioning that there is an unusually public debate between Baroody and collaborators on one side and Hannula-Sormunen and colleagues on the other, regarding the extent to which SAN is a form of SFON and whether Baroody and colleagues have properly acknowledged the influence of SFON on SAN (Hannula-Sormunen et al. 2015, 2016; Baroody and Li 2015). Though I will remain neutral on this matter here, one apparently major difference between the two approaches that is worth mentioning is that SAN researchers claim that SFON ability does not predict numerical abilities.
16 Chen and Mazzocco (2017) doubt that SFON is truly spontaneous given that contextual factors other than instructions can affect where attention is focused. For example, depending on what
numerical features of their environment. This way, researchers can determine which children have a tendency to focus on quantities of discrete objects in stimuli, versus any of the many other continuous dimensions that can—or, according to critics of the ANS,17 must—co-vary with numerosity. For example, researchers will scatter small numbers of objects on a mat in front of them and then instruct children sitting in front of similar mats to “make your mat like mine” without specifying which aspect of the collection of objects they are meant to copy. Participants can then match the collection in front of the researcher for numerosity, but also for total area, orientation, color, composition, or other features of the objects in the collection. In other cases, researchers will feed a puppet a specific (small) number of morsels of food and instruct the child to “do the same,” monitoring whether the child copies the act of feeding or also copies the specific number of morsels fed. Using such ambiguous instructions, Hannula-Sormunen and colleagues set out to find evidence that SFON exists, that it is stable throughout a person’s development, and that it is related to the development of arithmetical abilities. To test the existence of lasting individual differences in SFON, longitudinal studies involving around 30 different experimental tasks were carried out with the same individuals over extended periods of time.18 The data collected strongly suggest that there are indeed individual differences in SFON (Hannula and Lehtinen 2005) and, importantly, that these are correlated with (Hannula and Lehtinen 2005) and even predictors of (Hannula et al. 2010; Hannula-Sormunen et al. 2015; Batchelor et al. 2015) arithmetical proficiency, though this latter claim has been questioned (Baroody and Li 2015). 
For example, children aged 3–7 with a more pronounced tendency to focus on numerosity tend to be better at counting and subitizing than those with less pronounced SFON (Hannula et al. 2005), while children aged 4–5 have been found to perform better on symbolic arithmetical tasks if they have stronger SFON tendencies (Batchelor et al. 2015). What is important to note here is that much of SFON research proceeds on the assumption that there is individual variation in numerical ability and that the cause of this variation is not outside the head. Research into SFON tries to ensure that whether or not an individual responds to numerical aspects of stimuli is not the result of linguistic cues or other extracranial effects on behavior. Rather, paying attention to numerosity is supposed to be up to the individual, regardless of what sort of objects or practices are involved in the experimental setup. In this sense, SFON research illustrates that some aspects of the development of numerical cognition can best be studied by adopting internalism with respect to cognition.
other features numerosity is pitted against in SFON studies, attention to numerosity will fluctuate, which suggests that it is not entirely spontaneous, since researchers can still guide it by varying non-numerical perceptual dimensions. This being said, they do admit that SFON is only malleable to a certain extent, which suggests that SFON research is still targeting self-initiated attention to numerosity.
17 See e.g. Leibovich et al. (2017) and Gebuis et al. (2016).
18 E.g. 3 years: Hannula and Lehtinen (2005). See also Hannula-Sormunen (2015).
Also, the fact that participants in many SFON studies share the same cultural background seems to imply that individual differences in tendency to focus on numerosity must be caused by intracranial processes. If one person has a stronger tendency to direct their attention to certain aspects of their environment as a result of differences in how their brain works, then it would seem inaccurate, not to mention explanatorily counterproductive, to attribute constitutive status to culturally-inherited extracranial cognitive aids in explaining why these differences are there, given that the differences are observed against a fixed cultural backdrop. Externalist insistence that all numerical cognition constitutively involves things outside the head seems incompatible with this. However, the reader familiar with SFON and SAN will no doubt object that I am proposing an inaccurate characterization of these research programs. After all, both explicitly mention culture as an important factor in shaping an individual’s SFON. For example, when discussing the possibility of remedying children’s delayed arithmetical development by enhancing their SFON, Hannula and Lehtinen write that “It would be of a particular interest to broaden our knowledge about the role of cultural environment, as well as that of adults (see Saxe et al. 1987) in how children learn to focus on numerosity and formulate the goals of quantitative tasks in social interaction.” (Hannula and Lehtinen 2005, p. 254).19 So it could appear inaccurate of me to say that SFON research supports my claim that we do not need to look outside the head to see how we bridge the gap. Of course, I am not claiming that culture has no effect on SFON.
The cultural backdrop in which individuals grow up will affect the extent to which they manage to understand what numbers are, as well as the extent to which they pay attention to numerical features of their environment.20,21 So why am I claiming that SFON illustrates the need to look inside the head to bridge the gap, rather than at cultural factors? The claim being made here is that the relevance of culture in SFON studies is limited to its effects on the extent to which an individual pays attention to numerosity in their environment, and that this means that focusing on cultural factors will not allow us to identify the active (psychological) process that transforms systems like the ANS and the OFS, since whatever effect culture has on this mechanism stops at the interface between the individual and the outside world, whereas the SFON-driven modification occurs inside the head.
19 Similarly, Baroody and colleagues write that “children’s understanding and functional use of even the intuitive numbers [i.e. 1, 2, 3] may not unfold naturally (i.e., readily or spontaneously) but may require scaffolding by parents, early childhood teachers, and others” (Baroody et al. 2008, p. 266) and that “a conceptual understanding of number only gradually directs [children’s] attention to collections larger than two because a concept of such numbers must be socially constructed.” (Baroody et al. 2008, p. 264) This of course raises the question of how such numbers could be socially constructed at all. For reasons to doubt that all numbers are the sort of thing that can be socially constructed, see Pelland (2018).
20 Assuming, of course, they do at all. Notwithstanding innovators and creative individuals, most people living in anumerate cultures will never develop an understanding of number.
21 See Menary (2015).
And yet, as mentioned earlier, it is impossible to even think about large enough numbers without mastering culturally-inherited numeration systems like the Indo-Arabic numerals. Given that internalism seems suited to the development of formal numerical cognition, while externalism certainly applies to the practice of sufficiently advanced arithmetic, what would seem more in line with the results considered here is a pluralist approach to numerical cognition, where an initial segment of the natural numbers can be processed without requiring help from outside the head, but beyond this limit, cognitive externalism applies. To explore this possibility, in the next section, I offer a few speculative comments on how internalism might provide some insight into how we bridge the gap by further exploring the role of attention in modifying cognitive systems in learning.
4 Attention and the Origin of Formal Numerical Cognition
In the previous section, I mentioned evidence that individual differences in SFON can predict individual differences in arithmetical development. If attending to numerical features of the environment somehow affects the development of formal numerical cognition, and if our task is to find out how systems like the ANS and the OFS are modified in this developmental process, then it may be worthwhile to take a look at what sort of effects attention can have on cognitive systems in development to help find out how we bridge the gap. After all, there are entire textbooks dedicated to attention and associative learning, so looking into the role of attention in learning could help understand what attention brings to the induction, if anything. Given that attention is an individual-level mental process (Allport 2011) whose functioning can be described without attributing constitutive status to extracranial support, an explanation of how we bridge the gap that relies on attention would seem to count as internalist. While the precise role of attention in the development of numerical content has yet to be identified, there is little disagreement that it is an essential component to the process. In fact, despite their focus on extracranial aids to explain how we bridge the gap, externalists do not deny the importance of internal processes of attention and noticing. On the contrary, if we look at some of the externalist accounts considered above, we quickly see that they appear to depend on a form of noticing that leads to the discovery of novel numerical content. For example, Dehaene claims that “[a]ll children spontaneously discover that their fingers can be put into one-to-one correspondence with any set of items” (1997/2011, p. 81).
Similarly, Carey’s description of the crucial step where children learn that words in count lists refer to precise quantities of objects centers around attending and noticing.22 Thus, on these externalist approaches, an important element in how individuals come to bridge the gap is the presence of a form of realization that
22 See Carey (2009, pp. 326–327).
follows noticing a correspondence between representations of quantities and objects in the environment. It is uncontroversial that attention is an important part of such noticing. For example, Wu (2014) offers a theory of attention as selection for task in which what we pay attention to determines what sorts of perception-based beliefs we end up forming, since what we pay attention to determines what we notice (e.g. Wu 2014, p. 249).23 Indeed, the idea that attention is necessary for the development of numerical content is neither new nor particularly controversial, as seen by the fact that a number of authors have already appealed to general attentional mechanisms to explain how infants get numerical content from one-to-one correspondence on the output of the OFS. For example, Simon (1997) offers “a ‘non-numerical’ account that characterizes infants’ competencies with regard to numerosity as emerging primarily from some general characteristics of the human perception and attention system.” (Simon 1997, p. 349), while Izard and colleagues similarly propose that “infants may also be able to use their attentional resources to extract numerical information from displays containing only a small number of objects.” (Izard et al. 2009, p. 492) Such explanations of infant behavior in terms of attentional mechanisms have been well received and lend credence to an internalist account where attentional learning can lead to the development of novel representational content, given that in order to explain how attention works, it is not necessary to assign a constitutive role to extracranial cognitive support. Of course, attention on its own would be far too general to be specifically responsible for novel numerical content. 
After all, it could be argued that attention is necessary for any action (e.g., Wu 2014, but see Jennings and Nanay 2014 who disagree), and that it is too general and poorly delineated a theoretical construct to warrant its ubiquity in psychological explanation (Walsh 2003). However, SFON is a much more specific construct than attention, and, in that sense, is a more promising alternative that can explain individual and cultural differences in numerical abilities. For example, the internalist can appeal to cultural effects on SFON (or lack thereof) to explain why anumerate cultures like the Pirahã (Frank et al. 2008) and the Mundurucu (Pica et al. 2004; Pica and Lecomte 2008) do not develop number concepts: their lifestyle and social organization neither require nor encourage paying close attention to quantities of objects. This means that SFON is not encouraged in such cultures and that the potential modification of systems like the ANS that could be caused by SFON does not occur. The importance of cultural context on SFON tendency can also be explained by reference to attention to quantity, since trade-oriented cultures would supply a context where paying attention to precise quantities of objects would be valued, which differentiates numerate cultures from anumerate ones. On this, Gelman and Butterworth speculate that anumerate cultures do not develop words for precise quantities because “numbers are not culturally important
23 Similarly, Hannula and colleagues also highlight the task-dependence of what we notice, writing that “The numerosity of items depends on the way one carves up the set of items and, thus, on the goal of quantification.” (Hannula et al. 2005, p. 238)
and receive little attention in everyday life” (Gelman and Butterworth 2005, p. 9, emphasis added) in such cultures. In short, given the important role of attention in learning and the specifically numerical aspect of attention involved in SFON research, SFON appears to be a promising place to look to explain how we bridge the gap. Ideally, looking at attention and its effects on systems like the OFS could explain why the induction proceeds in this stepwise manner, but at present, I know of no attempt to look at this issue from a purely internalist perspective. These are, of course, speculative comments about the potential effects of attention on other cognitive systems, aimed at illustrating how an internalist approach can help bridge the gap without attributing a constitutive role to external artifacts. Even if it were to turn out that SFON does not have such effects, the point is that individual differences and culture-independent processes involved in bridging the gap mentioned in Sects. 2 and 3 warrant paying closer attention to what goes on in our heads, rather than to what we inherit from our cultural background.
5 Conclusion
In this paper, I have tried to argue that research into one of the most important puzzles in the study of numerical cognition (how systems like the ANS and the OFS allow us to develop an understanding of what numbers are) should focus on what goes on inside our head rather than on what we get from our cultural background. To argue for this, I appealed to the existence of important cross-cultural developmental trajectories such as the stages that lead to the induction as well as to the existence of individual differences in SFON against fixed cultural backgrounds. Given that externalism seems limited in its ability to explain how we bridge the gap, I speculate that an internalist could appeal to the effects of attention in learning to make progress in answering this specific developmental question. It is important to emphasize here that I am not denying the importance of enculturation for the development of bodies of knowledge like mathematics, nor am I claiming that externalism cannot explain many important aspects of numerical cognition. Clearly, no single person could ever accomplish what we do as a species. But acknowledging the importance of cultural factors for the development of mathematics and arithmetic does not necessarily explain it. Of course, most externalists do not deny the central role played by the brain.24 But externalism seems like a counterproductive approach when it says that we can appeal to culturally-inherited cognitive aids to identify what bridges the gap, or that we bridge the gap because of cultural factors. As I tried to argue, while culture can indeed increase our chances of developing numerical content, the reasons for culture’s influence on how
24 This
being said, Menary (2015) and Malafouris (2013) can be read as denying the central importance of the brain in cognition.
96
J.-C. Pelland
we bridge the gap can be explained using vocabulary about how the brain works, and how attention to numerosity affects systems like the ANS and the OFS. If I am right, while it is true that we rely on external objects (and people) for our daily cognitive regime, including learning and using numerical content, this does not mean that we can explain the emergence of novel content in a person’s head by appealing to these general facts about how our minds work. It is not always helpful to appeal to the external aspects of our minds to explain particular phenomena, even if our minds are indeed extended, embodied, or enculturated.
References

Allport, A. (2011). Attention and Integration. In Mole, C., Smithies, D., & Wu, W. (eds.), Attention: Philosophical and Psychological Essays. Oxford University Press. pp. 24–59.
Ansari, D. (2008). Effects of development and enculturation on number representation in the brain. Nature Reviews Neuroscience 9, 278–291.
Barner, D., Inagaki, S., & Li, P. (2009). Language, thought, and real nouns. Cognition, 111, 329–344.
Baroody, A. J. & Li, X. (2015): The construct and measurement of spontaneous attention to a number, European Journal of Developmental Psychology, DOI: https://doi.org/10.1080/17405629.2016.1147345
Baroody, A. J., Li, X., & Lai, M. L. (2008). Toddlers’ spontaneous attention to number. Mathematical Thinking and Learning, 10, 240–270. https://doi.org/10.1080/10986060802216151
Batchelor, S., Inglis, M., Gilmore, C., & Batchelor, S. (2015). Spontaneous focusing on numerosity and the arithmetic advantage. Learning and Instruction, 40, 79–88.
Beck, J. (2014). Analog magnitude representations: A philosophical introduction. The British Journal for the Philosophy of Science 0 (2014), 1–27.
Cantlon, J.F., Platt, M.L., and Brannon, E.M. (2009). Beyond the number domain. Trends in Cognitive Sciences 13(2): 83–91. https://doi.org/10.1017/S1364-6613(08)00259-3
Cantrell, L., Kuwabara, M., & Smith, L. B. (2015). Set size and culture influence children’s attention to number. Journal of Experimental Child Psychology, 131, 19–37.
Carey, S. (2009). The origin of concepts. New York: Oxford University Press.
Carey, S. (2011). Précis of the origin of concepts. Behavioral and Brain Sciences, 34(3), 113–124. doi: https://doi.org/10.1017/S0140525X10000919
Chen, J. Y-C., Mazzocco, M. M. M. (2017). Competing features influence children’s attention to number. Journal of Experimental Child Psychology 156, 62–81. https://doi.org/10.1016/j.jecp.2016.11.008
Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58: 7–19.
Cohen Kadosh, R. & Dowker, A. (Eds.) (2015) The Oxford Handbook of Numerical Cognition. Oxford: Oxford University Press.
Coolidge, F. L., & Overmann, K. A. (2012). Numerosity, abstraction, and the emergence of symbolic thinking. Current Anthropology, 53(2), 204–225. doi: https://doi.org/10.1086/664818
De Cruz, H. (2008). An Extended Mind Perspective on Natural Number Representation. Philosophical Psychology 21, no. 4: 475–90.
Dehaene, S. (2003). The neural basis of the Weber-Fechner law: A logarithmic mental number line. Trends in Cognitive Sciences, 7, 145–147.
Dehaene, S. (1997/2011) The Number Sense: How the Mind Creates Mathematics. New York: Oxford University Press.
Dutilh Novaes, C. (2013). Mathematical reasoning and external symbolic systems. Logique & Analyse, 56 (21): 45–65.
van Dijck, J.-P., Ginsburg, V., Girelli, L., & Gevers, W. (2015). Linking numbers to space: from the mental number line towards a hybrid account. In R. Cohen Kadosh & A. Dowker (Eds.), The Oxford handbook of numerical cognition (pp. 89–105). Oxford, UK: Oxford University Press.
Feigenson, L., Dehaene, S., and Spelke, E. (2004) Core systems of number. Trends in Cognitive Sciences 8:307–14.
Frank, M., Everett, D., Fedorenko, E., & Gibson, E. (2008). Number as a cognitive technology: Evidence from Piraha language and cognition. Cognition 108: 819–824.
Gallistel, C. R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44, 43–74.
Gallistel, C. R., & Gelman, R. (2000). Non-verbal numerical cognition: from reals to integers. Trends in Cognitive Sciences 4(2):59–65.
Galton, F. (1880). Visualised numerals. Nature 21:252–256.
Gebuis, T., Cohen Kadosh, R. & Gevers, W. (2016) Sensory-integration system rather than approximate number system underlies numerosity processing: A critical review. Acta Psychologica 171:17–35.
Gelman, R. and Butterworth, B. (2005). Number and language: How are they related? Trends in Cognitive Sciences 9, 6–10.
Hannula, M. M. (2000). The role of tendency to focus on numerosities in the development of cardinality. In T. Nakahara & M. Koyama (Eds.), Proceedings of 24th conference of the international group for the psychology of mathematics education (Vol. 1, p. 155). Hiroshima, Japan: Nishiki.
Hannula, M. M., & Lehtinen, E. (2001). Spontaneous tendency to focus on numerosities in the development of cardinality. In M. Panhuizen-Van Heuvel (Ed.), Proceedings of 25th conference of the international group for the psychology of mathematics education (Vol. 3, pp. 113–120). Amersfoort, Netherlands: Drukkerij Wilco.
Hannula, M. M. & Lehtinen, E. (2005). Spontaneous focusing on numerosity and mathematical skills of young children. Learning and Instruction, 15(3), 237–256.
Hannula, M. M., Mattinen, A., & Lehtinen, E. (2005). Does social interaction influence 3-year-old children’s tendency to focus on numerosity? A quasi-experimental study in day-care. In L. Verschaffel, E. De Corte, G. Kanselaar, & M. Valcke (Eds.), Powerful learning environments for promoting deep conceptual and strategic learning. Studia Paedagogica (vol. 41, pp. 63–80). Leuven: Leuven University Press.
Hannula, M. M., Lepola, J., & Lehtinen, E. (2010). Spontaneous focusing on numerosity as a domain-specific predictor of arithmetical skills. Journal of Experimental Child Psychology, 107, 394–406.
Hannula-Sormunen, M. M. (2015). Spontaneous focusing on numerosity and its relation to counting and arithmetic. In R. Cohen Kadosh & A. Dowker (Eds.), Oxford Handbook of Mathematical Cognition. Oxford: Oxford University Press.
Hannula-Sormunen, M. M., McMullen, J., Räsänen, P., Lepola, J., & Lehtinen, E. (2015): Is the study about spontaneous attention to exact quantity based on studies of spontaneous focusing on numerosity? European Journal of Developmental Psychology. https://doi.org/10.1080/17405629.2015.1071252
Hannula-Sormunen, M. M., McMullen, J., Lepola, J., Räsänen, P., & Lehtinen, E. (2016) Studies on spontaneous attention to number (SAN) are based on spontaneous focusing on numerosity (SFON), European Journal of Developmental Psychology, 13:2, 179–182, https://doi.org/10.1080/17405629.2016.1151782
Hurford, J. R. (1987). Language and number. Oxford: Basil Blackwell.
Izard, V., Sann, C., Spelke, E. S. & Streri, A. (2009) Newborn infants perceive abstract numbers. Proceedings of the National Academy of Sciences of the United States of America 106(25):10382–85.
Jara-Ettinger, J., Piantadosi, S. T., Spelke, E. S., Levy, R., & Gibson, E. (2016). Mastery of the logic of natural numbers is not the result of mastery of counting: Evidence from late counters. Developmental Science.
Jennings, C. & Nanay, B. (2014). Action without Attention. Analysis. 76. https://doi.org/10.1093/analys/anu096.
Lakoff, G., & Núñez, R. E. (2000). Where mathematics comes from: How the embodied mind brings mathematics into being. New York: Basic Books.
Leibovich, T., Katzin, N., Harel, M., and Henik, A. (2017). From ‘sense of number’ to ‘sense of magnitude’ - The role of continuous magnitudes in numerical cognition. Behavioral and Brain Sciences. 1–62. doi:https://doi.org/10.1017/S0140525X16000960
Malafouris, L. (2010). Grasping the concept of number: how did the sapient mind move beyond approximation? In The archaeology of measurement: comprehending heaven, earth and time in ancient societies. C. Renfrew and I. Morley, (Eds). pp. 35–42. Cambridge: Cambridge University Press.
Malafouris, L. (2013). How things shape the mind. Cambridge, MA: MIT Press.
Margolis, E., & Laurence, S. (2008). How to learn the natural numbers: Inductive inference and the acquisition of number concepts. Cognition, 106, 924–939.
Menary, R. (2015). Mathematical Cognition - A Case of Enculturation. In T. Metzinger & J. M. Windt (Eds.) Open MIND. Frankfurt a. M., GER: MIND group.
Moyer, R.S. & Landauer, T.K. (1967). The time required for judgments of numerical inequality. Nature 215: 1519–1520.
Pelland, J.-C. (2018). Which came first, the number or the numeral? In S. Bangu (Ed.), The cognitive basis of logico-mathematical knowledge (pp. 179–194). New York: Routledge.
Piantadosi, S.T., Jara-Ettinger, J., & Gibson, E. (2014). Children’s learning of number words in an indigenous farming-foraging group. Developmental Science, 17 (4), 553–563.
Pica, P., Lemer, C., Izard, V., & Dehaene, S. (2004). Exact and approximate arithmetic in an Amazonian indigene group. Science, 306, 499–503.
Pica, P., & Lecomte, A. (2008). Theoretical Implications of the Study of Numbers and Numerals in Mundurucu. Philosophical Psychology 21:4, 507–522.
Restle, F. (1970). Speed of adding and comparing numbers. Journal of Experimental Psychology, 83, 274–278. doi:https://doi.org/10.1037/h0028573
Rips, L. J., Bloomfield, A., and Asmuth, J. (2008a). From numerical concepts to concepts of number. Behavioral and Brain Sciences, 31, 623–642.
Rips, L. J., Asmuth, J. & Bloomfield, A. (2008b) Do children learn the integers by induction? Cognition 106:940–51.
Sarnecka, B., Kamenskaya, V., Yamana, Y., Ogura, T., & Yudovina, Y. (2007). From grammatical number to exact numbers: Early meanings of ‘one’, ‘two’, and ‘three’ in English, Russian, and Japanese. Cognitive Psychology, 55(2), 136–168.
Sarnecka, B. W., & Lee, M. D. (2009). Levels of number knowledge in early childhood. Journal of Experimental Child Psychology, 103, 325–337.
Saxe, G. B., Guberman, S. R., & Gearhart, M. (1987). Social processes in early number development. Monographs of the society for research in child development, 52(Serial No. 216).
Schlimm, D. (2018). Numbers through Numerals. In Bangu, S. (Ed). The Cognitive Basis of Logico-Mathematical Knowledge. (pp 195–217). New York: Routledge.
Simon, T.J. (1997). Reconceptualizing the origins of number knowledge: A ‘non-numerical’ account. Cognitive Development, vol. 12: 349–372.
Tang, Y., Zhang, W., Chen, K., Feng, S., Ji, Y., Shen, J., Reiman, E., and Liu, Y. (2006). Arithmetic processing in the brain shaped by cultures. Proceedings of the National Academy of Sciences of the United States of America, 103:10775–10780.
Wagner, K., Kimura, K., Cheung, P., & Barner, D. (2015). Why is number word learning hard? Evidence from bilingual learners. Cognitive Psychology, 83, 1–76.
Walsh, V. (2003). A theory of magnitude: common cortical metrics of time, space and quantity. Trends in Cognitive Sciences, 7 (11): 483–488. doi:https://doi.org/10.1016/j.tics.2003.09.002
Wiese, H. (2004). Numbers, language, and the human mind. Cambridge, NY: Cambridge University Press.
Wu, W. (2014). Attention. London: Routledge.
Wynn, K. (1990). Children’s understanding of counting. Cognition, 36, 155–193.
Wynn, K. (1992). Children’s acquisition of the number words and the counting system. Cognitive Psychology. 24: 220–251.
Did Frege Solve One of Zeno’s Paradoxes?

Gregory Lavers
Abstract Of Zeno’s book of forty paradoxes, it was the first that attracted Socrates’ attention. This is the paradox of the like and the unlike. On contemporary assessments, this paradox is largely considered to be Zeno’s weakest surviving paradox. All of these assessments, however, rely heavily on reconstructions of the paradox. It is only relative to these reconstructions that there is nothing paradoxical involved, or that there is some rather obvious mistake being made. This paper puts forward and defends a novel interpretation of this paradox, according to which the concept of a unit plays a central role. There is every reason to think the paradox turns on the concept of a unit: after the presentation of the paradox the text of the Parmenides immediately turns to a discussion of units, and the concept of a unit is also central to the Greek conception of a plurality. If this interpretation is correct then the paradox that Zeno presented was the same as one discussed and solved in Frege’s Grundlagen.
1 Introduction

Bertrand Russell once described Zeno’s paradoxes of space and time as ‘immeasurably subtle and profound’. He went on to complain: ‘the grossness of subsequent philosophers pronounced him to be a mere ingenious juggler, and his arguments to be one and all sophisms’ (Russell 1903/2010, p. 352). To this day, one of Zeno’s paradoxes is still widely seen as a mere sophism: the paradox of the like and unlike. According to the text of the Parmenides, this was the first in Zeno’s book of paradoxes. The majority contemporary view on this paradox is that it is one of the weakest, if not the weakest, of Zeno’s paradoxes. It is somewhat odd that commentators are willing to make pronouncements about this paradox when we know so little about it; any criticism of the paradox must be based on a reconstruction. Showing that a reconstruction of the paradox has an obvious fault counts more against the reconstruction than it does against Zeno’s original paradox (whatever it might have been). On many reconstructions there is nothing paradoxical about this paradox: Zeno is just making an obviously false assumption or equivocating in some way. After surveying contemporary positions, I want to offer a new interpretation according to which there is something genuinely paradoxical that Zeno is latching on to. My suggestion is that central to the paradox is the conception of a plurality as made up of units. It is clear the ancient Greeks thought of number [ἀριθμός] as a plurality of units. It is the elements of the plurality, as units, that must be both alike and unlike. The existence of any plurality at all required the existence of at least two units of the same kind. Frege, in his Grundlagen der Arithmetik, surveys views on units, dating back to the sixteenth century, and shows that writers are compelled to claim both that the units are identical and that they are distinct. Frege shows that this type of confusion was rampant in discussions of units. While the views he is considering do not go back further than Hobbes, there is nothing particularly modern about these considerations. His solution to the problem is that it is a mistake to think of the unit as an object (one of the things to be counted). Once we see the unit as a concept, we see that the unit is identical in each case, but the individual things to be counted retain their differences.

G. Lavers
Concordia University, Montreal, QC, Canada
e-mail: [email protected]

© Springer Nature Switzerland AG 2020
M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_6
2 The Paradox and Its Contemporary Reception

Let us begin by considering the description of the paradox as it appears in the Parmenides, as this is the starting point for any reconstruction of it.

When the reading was finished, Socrates asked to hear the hypothesis of the first argument again. When it was read, he asked, What does this mean, Zeno? If things which are, are many, then it must follow that the same things are both like and unlike, but that is impossible; for unlike things cannot be like nor like things unlike. Isn’t that your claim? (127d-e)1
Here we see that it is described as his first paradox, and after all of the paradoxes were read, according to the dialogue, Socrates chose to focus on this one. The dialogue continues with Socrates saying:

Then if it is impossible for unlike things to be like and like things unlike, it is surely also impossible for there to be many things; for if there were many, they would undergo impossible qualifications. Isn’t this the point of your arguments, to contend, contrary to everything generally said, that there is no plurality? And don’t you suppose that each of your arguments is a proof of just that, so that you in fact believe you’ve given precisely as many proofs that there is no plurality as there are arguments in your treatise? Is that what you mean, or have I failed to understand you? (127e-128a)
1 These translations from the Parmenides are taken from Allen (1997).
After Zeno agrees with Socrates’ assessment, we see that Zeno regards all the other paradoxes as merely further, and likely often less general, ways of making the same point. This first paradox attempts to show something inconsistent in the very notion of a plurality, and does so in the most direct manner. In the remainder of this section I would like to give the reader a sense of how this paradox has been seen by recent commentators. Jonathan Barnes, in his book on the presocratic philosophers, gives us a brief discussion of this paradox. Here he provides a couple of reconstructions and then declares the lack of any real paradox:

We do not know how Zeno argued for [the claim that if pluralism is true, everything is both like and unlike], nor what he meant by ‘everything is alike’. The word for ‘alike’ is ‘homoios’. Perhaps: ‘If a and b are distinct existents, then they are similar (homoios) in so far as each exists—hence they are alike; and they are dissimilar (anhomoios) in so far as each is different from the other—hence they are unlike.’ Or perhaps rather: ‘If a and b are distinct existents, then as existent each will be homogeneous (homoios)—hence they are alike; and yet being distinct, they are heterogeneous and hence unlike’. Neither argument has any power; for neither conclusion is more than an apparent absurdity[.] (Barnes 1979/1982, p. 187)
At a few points in this work, Barnes explicitly discusses Frege, but none of these discussions have any connection to the paradox we are examining here. R. D. McKirahan in his piece on Zeno for the Cambridge Companion to Early Greek Philosophy also attempts a reconstruction and declares it a failure according to that reconstruction. Of course, in fairness, it should be noted that he clearly allows for the possibility of more charitable reconstructions:

This state of the evidence makes it impossible to reconstruct the argument with any confidence. On one account it went as follows. If there are many things, there are at least two. Pick two of them, A and B. A is unlike B because A differs from B in at least one way (A is different from B, but B is not different from B). Likewise, B is unlike A. But A is like A (since A is not different from A in any way), and B is like B. Therefore, A and B are both like and unlike. If this was Zeno’s reasoning, the argument fails because A and B can be like and unlike in the way indicated; the alleged impossibility would arise only if the same things are both like and unlike the same things in the same respect, at the same time, and so on. Zeno may have reached this conclusion validly, but if so, we have no clue how he did. (McKirahan 1999, p. 137)
The paradox of the like and the unlike is not even mentioned in the Stanford Encyclopedia of Philosophy entry on Zeno’s paradoxes (Huggett 2019) and is only just mentioned in the entry on Zeno of Elea (Palmer 2017). Simply ignoring the paradox is not altogether uncommon. The dismissive attitude toward this paradox is exemplified perhaps most forcefully in the entry on Zeno’s paradoxes in the Internet Encyclopedia of Philosophy:

Plato immediately accuses Zeno of equivocating. A thing can be alike some other thing in one respect while being not alike it in a different respect. [...] So, there is no contradiction, and the paradox is solved by Plato. This paradox is generally considered to be one of Zeno’s weakest paradoxes, and it is now rarely discussed. (Dowden 2017)
Of the various ways the argument has been reconstructed, each shows the argument to have a quite clear problem. On some reconstructions the problem is that of treating likeness as a property instead of, as it should be, a relation. Clearly a thing cannot be like or unlike simpliciter, but is only so in relation to something else. On others the problem is assuming the obviously dubious principle that if two things are alike in any respect they are alike in every respect. One standout in the literature is R. E. Allen’s discussion of this paradox in his book on the Parmenides. Allen warns those who might be too willing to dismiss the argument:

The argument is elliptical and appears to be a mere sophism. Many critics, overlooking its connection with what follows, have discounted its significance. It is to be remembered, however, that Socrates will reply to it with the theory of Ideas, and no man trundles in artillery to shoot fleas. (Allen 1997, p. 76)
Notice Allen’s focus on what comes after the discussion of the paradox, and of course Allen is right that to answer the paradox Socrates turns to the theory of Ideas. But before this Socrates turns to a discussion of the one and the many. According to many interpretations of the paradox, this discussion is a mere tangent. Sure, Socrates could be one of the seven men present, and Socrates could be many if we count his left and right side each as one, but what does this have to do with the paradox of the like and the unlike? My suggestion is that the concept of a unit plays a central role in this paradox. If this is correct, then the transition to Socrates talking of himself as one and many is not at all a tangent: what he is doing here is considering the number that applies to him while varying the unit. If ‘man present’ is taken as the unit, then Socrates is one. However, if we take ‘one side of Socrates (left or right)’ as our unit, then Socrates is many. The existence of a plurality is tied to what we take as our unit.

Allen ultimately, however, charges Zeno with holding a ‘primitive nominalism’, which identifies the characteristic with the thing. This would allow him to prove, from there being a white horse, that being white is identical to being a horse. ‘Zeno’s paradox, then, is a special case applied to opposites of a more general failure to distinguish characters from things characterized’ (Allen 1997, p. 91). Seeing the paradox as an instance of such poor reasoning is, then, again quite dismissive. It does have the advantage of making the appeal to the theory of Ideas relevant, but this is again an interpretation according to which there is nothing of interest in this paradox. In his 1964 article, Allen charges Zeno, and Eleatic reasoning generally, with a confusion between things characterized and characteristics (Allen 1964). On the reading I will be defending there is such a confusion, but it is not as blatant as on Allen’s reading. The confusion lies in understanding the unit as one of the things characterized rather than as a characteristic. In Fregean terms, the error is in thinking of the unit as one of the objects to be counted rather than as a concept.

Perhaps the most detailed discussion of Zeno’s ‘puzzle’ is given in Lee (2014). But on Lee’s reading the mistake is in thinking that the ‘puzzle’ is a paradox at all. Lee rejects interpreting the ‘puzzle’ as a reductio. I would like to put forward an interpretation in which the paradox is a genuine paradox, and one that does not involve an obviously false assumption or any equivocation. I am not aware of such an interpretation existing in the literature.
The reconstruction that I want to put forward is one where Zeno is dealing with the same types of considerations concerning the concept of a unit, which Frege clearly lays out, and solves, in his Grundlagen. Frege’s solution comes after he identifies what is genuinely paradoxical in a very common way to think of units. That is, the paradox arises from a confusion concerning our understanding of the unit. If we take the universe as a whole we can conceive of it as one. In order for there to be a plurality of any kind, we must choose a unit in such a way that there are several of those. For example, according to the Parmenides, there were seven men present for the discussion. So here we have an apparent plurality. Let us say we take Socrates, and we say ‘here is one of our units’; we cannot then, it seems, turn to Zeno and say ‘here is another one of those’. This is because Zeno is not another Socrates. Once we start thinking of a plurality as made up of units, we are quickly tempted to think of the units as both identical and distinct. It is the elements of a plurality, considered as units, which must be both alike and different. Frege’s Grundlagen discusses at length this seemingly hopeless problem involved in the notion of unit. That is not to say that Frege wishes to reject the idea of a unit. On the contrary, he recognizes that the choice of a unit is required before any ascription of number can be made. Let us turn now to that work to see what light it may shed on the problem at hand.
3 Frege on the Identity of and Difference Between Units

In the early sections of Grundlagen, Frege builds to what he referred to as his ‘fundamental thought’ with a discussion of units. The fundamental thought is the claim that an assertion of number asserts something about a concept. That is, when we say that there are four horses that pull the King’s carriage, for instance, we are saying something not about the animals, whether individually or as an agglomeration, but about the concept ‘horse that pulls the King’s carriage’. It is this concept that we take as the unit, and what we are claiming is that it is true of four things. Before discussing the topic of a unit more directly, however, Frege first demonstrates two important things regarding numerical ascriptions. First, he shows that number is not an objective property of things (like colour). He does this by showing that we can, while considering the same thing and varying the unit, arrive at different numbers. For example, a pile of cards could be considered some number of individual cards, or some other number of complete decks. A pair of boots can be two boots or one pair of boots. This is exactly analogous to the case of whether Socrates is one of seven men, or two halves of himself. Of course, this is not the case for an objective property like colour. Second, Frege shows that number is not something subjective. He does this by arguing that once the unit is chosen, it is not up to us what number applies. If, he argues, a botanist claims that a certain flower always has five yellow petals, then the claim that there are five is as objective as that they are yellow. Having thus established the centrality of the concept of unit for claims of number, Frege turns his attention to what other thinkers have said about units.
His first target is Schröder. ‘Why do we call things units, if “unit” is only another name for thing, if any and every thing is a unit or can be regarded as one? E. Schröder gives as the reason, that the word is used for ascribing to the items that are to be numbered the required identity’ (Frege 1884/1980, §34). Frege goes on to quote from Hobbes, Hume, and Thomae to show that thinkers see the need to treat units as identical. We must at least abstract away differences to arrive at identical units. In abstraction we are supposed to ignore the differences between the objects and treat them as identical. Concerning this possibility, Frege could say very much the same thing as he did about ignoring the possibility of dissection in treating something as one. Here he says ‘as though lack of thought could get us anywhere!’ (Frege 1884/1980, §33). Simply ignoring the problem is not the same as solving it. If abstraction is to work, it must not be about merely avoiding thinking about certain properties of the objects, but the units must be abstract objects that actually lack those features. This view, however, fares no better. Let us take the example of trying to count the fingers on my left hand. Let us say I start with my thumb, I consider my thumb to be one of my units, and then I move to my index finger. I am supposed to, apparently, treat it as another one of those. Of course, it is not another one of those; it is not, after all, a thumb. But in treating them both as my units, I am supposed to somehow abstract away their differences. However, if I abstract away all of the differences, then I no longer have two distinct things. That is, if we abstract away all of the differences, we never get past one. But if we do not abstract away all of the differences, then we do not seem to be any better off than before the abstraction. Our second unit is still not another instance of the first.
But while the units must be identical, the things counted must preserve their differences if there is ever to be more than one of them. In §35, Frege quotes Descartes, Jevons, and again Schröder to the effect that units need to be distinct from one another. The reason for saying that the units must be different is clear, and has already been mentioned. If the units cannot be distinguished from one another, then we never have two distinct things. But taking seriously the distinction between units would make arithmetic impossible:

Arithmetic would come to a dead stop, if we tried to introduce in place of the number one, which is always the same, different distinct things, however similar the symbols for them; yet to make the symbols identical would be, of course, a mistake, and surely we cannot suppose that the mainspring of arithmetic is a piece of faulty notation. (Frege 1884/1980, §38)
It is worthwhile now to review the paradox and see what assumptions the paradox depends on:

1. Units are the things being counted.
2. A thing, as a unit, lacks all of its properties except those that make it that type of thing.
3. For there to be at least two things on some choice of unit, there must be distinct objects to be counted.

These claims are jointly inconsistent with the possibility of counting to two. That is, they jointly imply the impossibility of a plurality of things of any type. The first seems forced on us by the fact that we cannot count things directly, but must count units, as different numbers can apply to the same thing depending on our choice of units. The second seems forced on us whenever, for example, we treat a thumb and an index finger as two instances of the same unit. The third is a straightforwardly obvious claim. Of course, pragmatically speaking we can simply ignore these problems and go on as before, but of course neither Frege nor Zeno would be happy with such a resolution. Notice, also, that simply saying that the things are similar in some respects and different in others is no resolution to the paradox. Of course the things retain all of their distinguishing features as we count them, but treating them as units is supposed to be about discounting, or stripping away, those differences. When counting our units, we seem to be in the paradoxical position of simultaneously treating them as identical and distinct.

Frege’s solution to the paradox is to argue that the unit is not one of the things to be counted. The unit is a concept (what we would call a property) under which the things fall. When I count the fingers on my left hand, the unit is the concept ‘x is a finger on my left hand’. As such the unit is exactly the same in the case of each thing to be counted. Numbers do not apply to things directly, nor to things considered as units. The unit is a concept and the number belongs to a concept. In §54 Frege summarizes his position as follows:

We can now easily solve the problem of reconciling the identity of units with their distinguishability. The word “unit” is being used here in a double sense. The units are identical if the word has the meaning just explained [as a (sortal) concept]. In the proposition “Jupiter has four moons”, the unit is “moon of Jupiter”. Under this concept falls moon I, and likewise also moon II, and moon III too, and finally moon IV. Thus we can say: the unit to which I relates is identical with the unit to which II relates, and so on. This gives us our identity. (Frege 1884/1980, §54)
That is to say, when choosing a unit, in asking a question of number, we are selecting a concept under which the things in question fall. In fact, this claim, that an assertion of number contains a claim about a concept, is described in Frege’s Grundgesetze (Frege 1893/1967) as the fundamental thought of the Grundlagen. We saw above that Allen speaks of Socrates bringing out the heavy artillery to deal with this paradox. If my suggestion is correct, then Frege as well brings out the heavy artillery to deal with this paradox—it is resolved via the fundamental thought. Notice also, that both Frege and Socrates see the need to move from talk of objects to the level of universals (Forms in Socrates’ case and concepts in Frege’s case) to address the paradox.
4 Could This Fregean Paradox Have Been Zeno’s?
So far, I have outlined the paradox as it is presented in the Parmenides and surveyed a number of discussions of it. I then presented Frege’s view on a paradox inherent in a quite standard view of units. My suggestion is that the paradoxical properties of units that Frege identifies are at the core of the paradox of the like and the unlike.
That is, it is only the elements of a plurality, considered as units, that must be both like and unlike. But Frege is speaking of modern views of number. Could the same considerations have applied to ancient views of number and units? This is the question I want to address in the present section. A plurality, for the ancient Greeks, would be a number of things (greater than one, if one is considered a number). And by definition, this plurality would be composed of units. Consider this survey of ancient Greek definitions of number from Thomas Heath’s book on philosophy of mathematics from Thales to Euclid:
The first definition of number is attributed to Thales, who defined it as a collection of units (μονάδων σύστημα), ‘following the Egyptian view’. The Pythagoreans ‘made number out of one’; some of them call it ‘a progression of multitude beginning from a unit and a regression ending in it’. [. . . ] Eudoxus defined number as a ‘determined multitude’ (πλῆθος ὡρισμένον). Nicomachus has yet another definition, ‘a flow of quantity made up of units’ (ποσότητος χύμα ἐκ μονάδων συγκείμενον). Aristotle gives a number of definitions equivalent to one or another of those just mentioned, ‘limited multitude’, ‘multitude (or ‘combination’) of units’, ‘multitude of indivisibles’, ‘multitude measured’, and ‘multitude of measures’ (the measure being the unit). (Heath 1921, pp. 69–70)
Clearly the concept of unit is central to the majority of these definitions. The ones that fail to mention a unit explicitly assume it implicitly. To assess whether such definitions suffer from the problems we saw Frege point out in the last section, we need to ask whether the units have the features identified above, as claims 1–3, that lead to the impossibility of a plurality. On Frege’s diagnosis, the fundamental problem is contained in claim 1. Thinking of units as the things to be counted is what leads to the paradox, and it is clearly involved in the above definitions that discuss units. Claim 3 is obvious, and at least the point that to show plurality we need to point to a difference is assumed in the dialogue: ‘When he wishes to show I am many, he says that my right side is one thing and my left another, that my front is different from my back, and my upper body in like manner different from my lower [.]’ (129c). So the only question is whether the ancient conception of unit adhered to claim 2. We saw that quite simple and general considerations lead to claim 2; after all, the unit is meant to be kept constant. There is at least no reason to think the ancient conception of a unit would have been exempt from it. There seems to be every reason to believe that the ancient concept of unit suffered from the same problems as the modern one that Frege criticizes.
But what reason is there for thinking that properties of units are under consideration in this paradox? Of course, all that we know of the paradox is a skeletal outline, and any interpretation of the paradox needs to put some flesh on the bone. The suggestion made in the present paper, and it is of course only a suggestion, has several advantages over reconstructions that exist in the literature. First, there is something genuinely paradoxical involved. The paradox points to a fundamental inconsistency in what seems to be a ubiquitous conception of units.
Second, any idea of plurality at the time would have been associated with the idea of a plurality of units. And thirdly, it makes sense of the rapid transition in the text between this paradox and the fact that anything could be thought of as one or many. If the concept of a unit was centrally involved in the reasoning to the paradoxical conclusion, then this transition would
be natural, as it is a case of considering the same thing while varying the unit. Notice also the generality involved here: anything may be taken as a unit, and the existence of a plurality requires two instances of one unit. Whereas Frege took the paradox he discusses as a reductio of the standard conception of units as (somewhat abstract) objects, Zeno assumes the standard definition of number and the standard concept of unit and uses the paradox as a refutation of the very idea of plurality.
References
Allen, R. E. (1964). The interpretation of Plato’s “Parmenides”: Zeno’s paradox and the theory of forms. Journal of the History of Philosophy, 2(2), 143.
Allen, R. E. (1997). Plato’s Parmenides: Translated with Comment. Chelsea, Michigan: Yale University Press, revised ed.
Barnes, J. (1979/1982). The Presocratic Philosophers. The Arguments of Philosophers (Ted Honderich ed.). New York: Routledge.
Dowden, B. (2017). Internet Encyclopedia of Philosophy: Zeno’s Paradoxes, 2017, http://www.iep.utm.edu/zeno-par/#SSH3bi, (2017-02-10).
Frege, G. (1884/1980). The Foundations of Arithmetic. Evanston, IL: Northwestern University Press, second revised ed.
Frege, G. (1893/1967). The Basic Laws of Arithmetic: Exposition of the System. Edited and translated by Montgomery Furth. Berkeley and Los Angeles: University of California Press.
Heath, T. (1921). A History of Greek Mathematics Volume 1 From Thales to Euclid. London: Oxford.
Huggett, N. (2019). Zeno’s paradoxes. In E. N. Zalta (Ed.) The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2019 ed.
Lee, D. (2014). Zeno’s puzzle in Plato’s Parmenides. Ancient Philosophy, 34(2), 255–273.
McKirahan, R. D. (1999). Zeno. In A. A. Long (Ed.) Cambridge Companion to Early Greek Philosophy, (pp. 134–157). Cambridge: Cambridge University Press.
Palmer, J. (2017). Zeno of Elea. In E. N. Zalta (Ed.) The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2017 ed.
Russell, B. (1903/2010). The Principles of Mathematics. London: Routledge Classics.
Charles Davies as a Philosopher of Mathematics Education
Amy Ackerberg-Hastings
Abstract Charles Davies (1798–1876), who taught at the United States Military Academy at West Point (USMA), Trinity College in Hartford, Connecticut, New York University, the State Normal College at Albany, New York, and Columbia University, was one of the most prolific and popular compilers of mathematics textbooks in the USA in the nineteenth century. This essay explores his The Logic and Utility of Mathematics, With the Best Methods of Instruction Explained and Illustrated (1850), which Bidwell and Clason (1970, p. 38) and Jones and Coxford Jr. (1970, p. 31) called the “first American book on mathematics teaching methods” and “the first book in the United States for secondary teachers of mathematics,” respectively. I will describe and analyze the three major parts of the text—“logic,” “mathematical science,” and “utility of mathematics”—and consider the historical significance of Davies’s entire project, which he intended as an explication of the USMA “system of mathematical instruction” (1850, p. 3).
1 Introduction
Textbooks with Charles Davies’s name on the title page were among the most commercially successful in the USA in the nineteenth century. Davies was the third American mathematics and natural philosophy professor to publish a series of mathematical textbooks, after Yale’s Jeremiah Day and Harvard’s John Farrar (Ackerberg-Hastings 2000). Although Day’s 1814 Introduction to Algebra sold widely after it was adopted into early high schools, the sum total of the books sold under Davies’s name was far larger and more commercially profitable. Parents, students, and teachers purchased a total of over five million copies of his nearly fifty separate schoolbooks and college mathematics texts. In particular, the popular
A. Ackerberg-Hastings, Independent Scholar, Rockville, MD, USA. e-mail: [email protected]
© Springer Nature Switzerland AG 2020. M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_7
shorthand title for 1828’s Elements of Geometry, “Davies’s Legendre,” was effectively synonymous with “the standard American textbook for Euclidean geometry,” with 300,000 copies sold by 1862, while 1852’s School Arithmetic had total sales of 1,250,000. Among names appearing on American textbooks around the middle of the nineteenth century, only Joseph Ray, whose texts for children may have appeared in as many as 100 million copies (Kullman 1998), substantially outsold Davies’s before the explosion of mass-produced textbooks in the last quarter of the nineteenth century, by teachers such as George Wentworth and from publishers such as American Book Company and Ginn (Kidwell et al. 2008). Davies’s textbooks were comprehensive in topics covered and in the age ranges addressed. They also carried the prestige of his 14 years (1823–1837) as the professor of mathematics at the United States Military Academy at West Point (USMA), the institution that was generally viewed in the time period as the leading American provider of technical education. Both of these characteristics were effectively exploited by the groundbreaking marketing techniques of Davies’s publisher, Alfred S. Barnes. Meanwhile, many USMA alumni became professors in newly founded colleges and technical institutes and utilized Davies’s textbooks, further extending their reach. Davies, too, parlayed his reputation as a leading mathematics educator into positions at Trinity College in Hartford, Connecticut, New York University, the State Normal College at Albany, New York, and Columbia University. Additionally, he joined organizational efforts to professionalize teaching, including the New York State Teachers’ Association, which he served as president from 1852 to 1853. Any synopsis of Davies’s influence must be accompanied by caveats. His capability as a mathematician was always suspect in the view of certain of his peers, his academic integrity was known to be lacking, and his large ego annoyed onlookers. 
For instance, in the series of textbooks that formed USMA’s curriculum he habitually replaced authors’ and translators’ names with his own. In 1845 Davies and Barnes paid an out-of-court settlement for plagiarizing Davies’s 1840 First Lessons of Arithmetic from Frederick Emerson’s 1830 North American Arithmetic, Part I (Emerson v. Davies 1845). When Davies presided over the August 1853 meeting of the New York State Teachers’ Association, his lofty opinion of his own expertise was on display as he wore full military dress uniform even though he had resigned his commission in 1845. The most notable event while he wielded the gavel may have been what proved to be Susan B. Anthony’s first public speech. After she rose from among the 300 female teachers sitting at the back of the room, behind 200 male experts, Davies dithered with several other leaders before recognizing her and then calling for a motion that provoked 30 minutes of discussion before she was finally able to point out that gender explained why teachers were not as well-respected as doctors, lawyers, or ministers. The next morning, Davies opened the meeting by stating that women should be worshipped as placid statues on pedestals, a view shared by many of the women and men in attendance. Anthony, on the other hand, received his comments as delivered with “majesty and pomposity”; her public disputes with him over equal pay and employment for women and over coeducation continued through at least 1857 (Husted 1899, pp. 98–99; Stanton et
al. 1889, pp. 513–517; Twelfth Annual Meeting 1857). In October 1863, Davies tried to secure the professorship of physics at Columbia for his son-in-law, William Guy Peck. In discussing Davies’s attempt at nepotism, F. A. P. Barnard’s brother, J. B. Barnard, described Davies as having “scientific attainments fit for a head schoolmaster” and Columbia Trustee G. Kemble told fellow Trustee Hamilton Fish, “With all his selfishness and mischief making, [Davies] is, and always was a fool. I have known him from boyhood” (Hamilton Fish Papers).
Davies’s The Logic and Utility of Mathematics (1850, Fig. 1) nicely illustrates the coexistence of his reasons for fame and for notoriety. The 375-page book was printed eight times; a second edition appeared in a single printing in 1873 (National Union Catalog 1976, pp. 361–362). The second edition was expanded to 419 pages and retitled The Nature and Utility of Mathematics. As is typical for Davies’s books, Logic and Utility was and continues to be well-regarded. By 1851, publisher A. S. Barnes put together an advertisement that appeared in the firm’s other books (Mansfield 1851, front matter). It consisted of six reviews reproduced in their entirety or in excerpt, the most prominent of which came from Harper’s New Monthly Magazine. More recently, two 1970 NCTM publications, Readings in the History of Mathematics Education, edited by Bidwell and Clason (1970) and A History of Mathematics Education in the United States and Canada, edited by Jones and Coxford (1970), described Logic and Utility as the “first American book on mathematics teaching methods” (p. 38) and the “first American methods book” (p. 29), respectively.
The textbook was divided into three “books.” The first, on logic, consisted of three chapters and took up 71 pages (less than 20% of the text). The third, on the utility of mathematics, also had three chapters but comprised only 48 pages, or 13% of the whole volume.
The middle book covered over half of the text (193 pages) but only had four chapters: general characteristics and definitions of “mathematical science”; arithmetic; geometry; and everything that could be classified as “analysis” (algebra, analytical geometry, and differential and integral calculus). Davies identified four potential audiences: “general reader[s] . . . professional men and students . . . students of mathematics and philosophy . . . [and] professional teachers” (Davies 1850, p. 19). Much of the material those readers encountered in Books I and III was copied from other authors, with varying degrees of attribution. Certain themes flowed through Logic and Utility — as they did through Davies’s entire career — such as emphatic support of the mental discipline justification for teaching mathematics and a belief that mathematical practice was defined by quantity. However, Davies’s piecing together of substantive sections from multiple sources left open the possibility of logical incoherence in part or in whole. He also sometimes contradicted himself in the original portions of his argument. Additionally, it is not always apparent how the view of mathematics presented in Logic and Utility had been translated from the USMA curriculum. This paper thus describes and analyzes this textbook’s three “books” and evaluates the impact of Davies’s whole project. Logic and Utility is rich enough as an intellectual artifact that other scholars will be able to consider and contextualize aspects of the content that must be overlooked at present due to constraints on time and length.
Fig. 1 Title page for Davies (1850). Public domain
2 The Logic of Mathematics
The concept of Book I was to assert the value of mathematical logic not only in its own right but as a necessary component of teaching the general public. Considerable space was given to the meanings of terms, including a definition of “definition”: “Definition is a metaphorical word, which literally signifies ‘laying down a boundary’” (Davies 1850, p. 27). Then, the text discussed how the mind operates while it is reasoning, providing examples from the conceptualization of geometric shapes. Next came material on how and why humans accept the validity of proofs, which for twenty-first-century readers might seem to fall between philosophy and psychology. Similarly, the second of the section’s three chapters began with a meditation on the meaning of “truth” and distinguished between intuitive and logical truths. The second half of Book I finally arrived at induction, deduction, and syllogisms. Davies ended by asserting that mathematical reasoning not only follows the rules of logic but also is identical to “logic applied to the abstract quantities, Number and Space” (Davies 1850, p. 96).
While Davies’s argument that mathematics and logic are the same and so logic should be part of mathematics education appears to have been at least somewhat novel for Americans, the prose he used was often not original. Large sections of Book I are copied verbatim from Richard Whately’s Elements of Logic (1826) and John Stuart Mill’s System of Logic (1843). Examples of the paragraphs Davies borrowed included Whately’s definition of “proposition” and Mill’s explanation of how to identify propositions (Davies 1850, pp. 54–55). Davies used no quotation marks around the reprinted sections and provided no citations in Book I, but he did acknowledge the copying in Logic and Utility’s preface: “The materials of Book I have been drawn, mainly, from the works of Archbishop [of Dublin] Whately and [British philosopher] Mr. Mill. . . .
In all cases where it could be done consistently with my own plan, I have adopted their exact language” (Davies 1850, p. 4). Whately compiled his main principles from a variety of sources, including friends’ manuscripts, explaining them so clearly that he is credited with revitalizing the subject in Great Britain (Whately 1827, pp. v–vii; Van Evra 2008). By 1844, American printings were selling so well that Whately commented that he had prepared “a volume which has been frequently reprinted, not only in England, but in the United States of America; where it is in use, I believe, in every one of their Colleges” (Whately 1844, p. xii). Mill’s work was one of those inspired by Whately’s volume, but he additionally wrote a thorough and technical defense of empiricism in philosophy, natural philosophy, and mathematics (Heydt n.d.). Beyond the general popularity of these books, it is unclear how Davies got the idea to use logic in teaching mathematics. He is not known to have had first-hand experience with teaching the subject. It was not covered in any of the mathematical topics in USMA’s curriculum during his tenure there, so it would not have been part of the “West Point method of mathematical instruction” that he claimed to be outlining in Logic and Utility. Although Augustus De Morgan began using logic to motivate the teaching of elementary geometry in the late 1830s, it is unlikely
Davies was aware of this development. It also is not obvious that Davies understood what he was repeating. Most of Book I reads more like a theory of knowledge than like a system of logic. Moreover, most of the content from Book I is dropped in Book II. In fact, there Davies contradicted the premise of Book I by stating, in italics no less, “There is not, in the whole range of mathematical science any logical test of truth, but in a conformity of the conclusions to the definitions and axioms” (Davies 1850, p. 111). Two pages later, he shifted direction again by stating, “Although the syllogism is the ultimate test in all deductive reasoning . . . still we do not find it convenient or necessary, in mathematics, to throw every proposition into the form of a syllogism” (Davies 1850, p. 113). This lack of consistency and coherence was typical of Davies’s prose, a tendency that can leave one wondering what inexperienced mathematics students got out of reading his textbooks. In this case, how many secondary school pupils preparing for college and men and women enrolled in teacher training institutes were equipped to comprehend Book I and connect its concepts to mathematical subjects?
3 Mathematical Science
The aim of Book II, to define mathematics and discuss the activities included in mathematics, was a very old one that had been explored from many different perspectives by 1850. (See, for instance, Ackerberg-Hastings 2018.) Davies probably wrote this part of Logic and Utility himself. Like Book I, though, some aspects suggest an intellectually muddled approach. For instance, he made charts to preface each of the chapters in Book II, including Fig. 2. One might assume this illustration depicts Davies’s conception of the branches of mathematics, but in
Fig. 2 Diagram of the branches of mathematics that Davies placed at the beginning of Book II (1850, p. 98). Public domain
the text he incoherently and simultaneously distinguished algebra from analysis, suggesting algebra might be its own branch; described algebra as “a species of universal Arithmetic”; and called algebra the “simplest” form of analysis, placing algebra within analysis (Davies 1850, pp. 105, 265). He also classified arithmetic, geometry, and algebra as “the elementary branches of Mathematical Science,” distinguishing these disciplines from the “higher and most advanced branches” such as analytical geometry, differential and integral calculus, and the theory of variations (Davies 1850, pp. 105–106). One reason for these competing notions about how to conceptualize the branches of mathematics is that Davies was fully committed to defining mathematics as the science of quantities, and so he believed that symbols only had merit insofar as they represented real things (Davies 1850, p. 133). This meant that, when he was reading ideas from multiple sources and attempting to reconcile contradictions, he was likely to fall back on his own concrete view of mathematical practice. Although he made a valiant attempt in the fourth chapter of Book II to define analytical geometry and differential and integral calculus (Davies 1850, p. 265) and paid lip service to “higher mathematics” elsewhere in the volume, Davies was really only concerned in Logic and Utility with explicating the basic principles of school mathematics. After all, arithmetic was the only mathematical subject shared by all four of his potential audiences, so he may have found it unnecessary to ensure his explanations of analysis were consistent. In fact, he published the textbook right at the time that American localities with well-established elementary and secondary school systems were starting to teach algebra and geometry to teenagers, while elite colleges like Harvard and Yale were removing those subjects from their own curricula and turning them into entrance requirements. 
Increasingly, white Americans of all classes and in all regions were receiving formal education and learning arithmetic throughout their first eight or so years of study. Davies himself was involved in efforts to provide schoolteachers with proper training before they gave these lessons in arithmetic: he attended and spoke at meetings of teachers and superintendents at least as early as 1843, served as president of the New York State Teachers’ Association (founded in 1845) from 1852 to 1853, and taught at the State Normal School at Albany (established in 1844) from 1855 to 1857 (State Convention 1843, p. 56; Teachers’ Institutes 1845, p. 102; Kirk 1883, p. 25; Historical Sketch 1894, p. 6). Thus, the chapter on arithmetic was allotted 54% of the second book and 28% of the entire volume (105 pages). He based the concepts of number and of place value on the notion of a unit: every number could be broken down into its constituent units (or “one”s), while adding zeroes to the right of a unit increased its value by powers of ten. He showed place value as a scale, which enabled him to develop other scales of units in currency, weights, and measures (Davies 1850, pp. 117–120, 130–148). Davies also related an understanding of how numbers were constructed to the activities of spelling and reading. “Spelling” referred to naming two integers to be added and “reading” to visualizing the total without listing the components, so reading was superior (Davies 1850, pp. 121–130). When computing a sum, he added a list of numbers from bottom to top as well as from right to left. Davies then treated fractions, ratios,
and proportions and discussed thought processes for solving word problems, which he called “applications” (Davies 1850, pp. 151–172). Finally, he devoted nearly half of the chapter on arithmetic (49 pages) to how to teach the subject. He returned to his conviction that the unit 1 was the fundamental concept and argued that the Rule of Three ought to continue to be taught before topics such as percentage, interest, and discounts (Davies 1850, pp. 173–179). Again, he emphasized the “grammar” of arithmetical language (Davies 1850, pp. 187–190). As he transitioned into the value of mathematics for developing students’ ability to reason properly, readers would have noticed the principles that, in Davies’s opinion, made USMA alumni such great college instructors. He defined mathematics as “the science of quantity” (Davies 1850, p. 100). He also thought that definitions and terms had to be clear and exact and that reasoning was only valid when it referred back to those definitions and terms (Davies 1850, p. 191ff). As the mention of proper reasoning suggests, like many of his American contemporaries, Davies was an advocate of the mental discipline justification for teaching mathematics, stating that the subject’s importance lay in “train[ing] the mind to habits of clear, quick and accurate thought . . . apprehend[ing] distinctly . . . discriminat[ing] closely . . . judg[ing] truly—and . . . reason[ing] correctly” (Davies 1850, p. 200). At the same time, he allotted equal weight to showing throughout its study that arithmetic “is the most important art of civilized life—being, in fact, the foundation of nearly all the others” (Davies 1850, p. 200, emphasis in source). He believed learning only occurred when teaching started with general principles and then moved to specific practices and applications (Davies 1850, p. 202). Unsurprisingly, he thought teachers and students should pay close attention to a well-designed graded series of textbooks (Davies 1850, pp. 203–221). 
The chapters on geometry and algebra were similar in structure and approach but considerably shorter (37 and 32 pages, respectively). Davies argued that geometry was about “the development of all the laws relating to space” and contained three types of truths: definitions, axioms, and demonstrations (Davies 1850, pp. 223–224). He then described the objects studied in elementary geometry and returned to his earlier topics of measurement and the importance of the unit 1 (Davies 1850, pp. 224–233). He proved two sample propositions and discussed the difference between a direct proof and reductio ad absurdum (Davies 1850, pp. 237–246). He illustrated proportions with figures and with algebra (Davies 1850, pp. 246–251). The chapter ended with a relatively brief list of sixteen “suggestions for those who teach geometry” (Davies 1850, pp. 256–259). As noted above, Davies began the fourth chapter of Book II with a discourse on various meanings of “analysis” and definitions of algebra, analytical geometry, and differential and integral calculus (Davies 1850, pp. 261–269). The chapter’s main focus, though, was algebra, which Davies said students learned easily if they paid attention to the symbols and definitions (Davies 1850, p. 270). The unit 1 reappeared in his discussion of zero and infinity, and he concluded with twelve tips for teaching algebra (Davies 1850, pp. 282, 289–292). None of the material on “the logic of mathematics” from Book I was referenced in Book II.
4 The Utility of Mathematics
In contrast, Book II’s promotion of the mental discipline justification for teaching mathematics recurred in the third book, “Utility of Mathematics.” After all, the concept of Book III, to demonstrate that mathematics had value both for training the mind in proper reasoning and for utilitarian applications, was a longstanding concern for Americans. Probably best-known is Jeremiah Day and James Luce Kingsley’s defense of the existing college curriculum, focused on the classics and mathematics, as best suited for imparting mental discipline and other worthy habits in “Original Papers in Relation to a Course of Liberal Education” (1829), which appeared in American Journal of Science and thus was widely distributed. Davies’s approach was mainly to string together a series of quotations (which did appear within quotation marks and accompanied by citations) by John Herschel, John Locke, Edward Deering Mansfield, Francis Bacon, Isaac Barrow, and Captain Basil Hall (1788–1844), a Scottish sailor known for travel writing who also collaborated with John Playfair on geological reports.
Mansfield (1801–1880) was Davies’s brother-in-law, as well as the son of mathematician Jared Mansfield. The two men overlapped at USMA, but after completing his studies in 1818 Mansfield then attended and graduated from Princeton, was admitted to the bar in Connecticut, and by the early 1830s had moved to Ohio. In 1834 he delivered a lecture titled “Discourse on the Utility of the Mathematics” to a Cincinnati organization called the College of Professional Teachers; he subsequently published the lecture in the group’s Transactions (Mansfield 1835). Book III contains at least two multi-page passages from the paper. In 1851 Mansfield expanded the article and his overall ideas about teaching into the book American Education, which he published with A. S. Barnes (Mansfield 1851).
Indeed, American Education also contains one of the first appearances of Barnes’s detailed advertisement for Logic and Utility. In 1877 Mansfield provided an obituary of Davies to the Association of the Graduates of the US Military Academy (Mansfield 1877).
The content of the three chapters in Book III could be reduced to “mathematics is deduction,” “mathematics is experiential,” and “mathematics is practical.” Davies talked about how children encounter numbers, space, and the natural world (Davies 1850, pp. 293–294, 309). He argued that mathematics fosters the two ways Locke said people learn: by absorbing “clear and distinct ideas with settled names” and by combining concepts which were already known to develop a new insight (Davies 1850, pp. 298–299). He noted that some endeavors in mathematics required reasoning via a synthetical process, while other areas necessitated analytical methods (Davies 1850, p. 303). He celebrated Baconian induction but also listed the accomplishments of analysis, particularly in mathematical astronomy (Davies 1850, pp. 311–314, 320–322). He defined “practical” activities as those which take ideal principles and convert them into actual results, as in commerce, the mechanic arts, the engineering and navigation of steamships, surveying, construction of railroads, and design of waterworks (Davies 1850, pp. 325–339). Yet, despite these evidences of “the power and skill of man” presented in the third chapter (Davies 1850, p. 339),
A. Ackerberg-Hastings
Davies had closed the first chapter with a standard statement of the mental discipline justification for teaching mathematics: We may claim for the study of Mathematics, that it impresses the mind with clear and distinct ideas; cultivates habits of close and accurate discrimination; gives, in an eminent degree, the power of abstraction; sharpens and strengthens all the faculties, and develops, to their highest range, the reasoning powers (Davies 1850, p. 307).
5 Davies’s Philosophy of Mathematics Education Did Logic and Utility form a coherent system? This historian of mathematics is skeptical. Book I suggests Davies turned to logic because it was trendy but did not really understand the subject. Book II laid out the fundamental beliefs about the nature of mathematics that had guided Davies’s compilation of textbooks, teaching career, and advocacy for training schoolteachers, but simultaneously, on certain topics, read as a hodgepodge of statements. Book III seems to try to make mathematics all things for all readers. However, Davies was certain about the success of his project. Even though logic was not a part of that curriculum, he thought that he had outlined the “system of mathematical instruction which has been steadily pursued at the Military Academy over a quarter of a century . . . which has given to that institution its celebrity as a school of mathematical science” (Davies 1850, p. 3). Since every textbook still in use at USMA had his name on it, Davies claimed responsibility not only for defining the institution’s approach to teaching but also for its widespread influence. Indeed, the appendix on “what [a course of mathematics] should be” with which Davies closed Logic and Utility recommended the order of the subjects found in volumes of his own textbook series (advertised as the “arithmetical,” “academic,” and “collegiate” courses in the end matter): arithmetic; algebra; geometry; plane and spherical trigonometry; surveying and leveling; descriptive geometry; shades, shadows, and perspective; analytical geometry; and differential and integral calculus (Davies 1850, pp. 341, 345–351, 377). He also promoted the use of a series rather than a single compendium textbook and affirmed the importance of uniform terminology throughout that series (Davies 1850, pp. 341–342). 
Finally, he faulted educators for lacking sufficient training to combine multiple systems for explaining mathematics into their own unified methods, strongly implying that they would be better served by following his beliefs about the basic principles of mathematics that were set out in this textbook (Davies 1850, pp. 343–345). Enough reviewers agreed with Davies’s self-assessment that publisher A. S. Barnes was able to assemble an advertisement full of positive comments within a year of Logic and Utility’s publication (Fig. 3). In addition to the front matter of Mansfield’s 1851 American Education, this set of six testimonials appeared in works as varied as Walter Colton’s Ship and Shore, in Madeira, Lisbon, and the Mediterranean (1851, p. 325), Henry Noble Day’s Elements of the Art of Rhetoric (1853, p. 312), and John Darby’s Botany of the Southern States (1855,
Charles Davies as a Philosopher of Mathematics Education
Fig. 3 Advertisement for Logic and Utility as it appeared in Mansfield’s American Education (1851). Public domain
p. 617). Reviewers from the Lutheran Observer and New York Evangelist both thought Logic and Utility compared favorably with Davies’s other textbooks. A writer for a periodical called Independent also recommended the book for training the reasoning faculties of theologians; it is not clear why the book attracted so much attention from Christian organizations, since by 1850 American colleges were doing much more than educating clergy. Davies attended church services regularly, but he is not known to have been especially active in the Episcopalian denomination to which he belonged. A two-sentence mention from The Student referenced the larger readership that Davies envisioned in his introduction (Davies 1850, p. 19): “It is not only designed for professional teachers, professional men, and students of mathematics and philosophy, but for the general reader who desires mental improvement” (Mansfield 1851, front matter). The most substantive review, reprinted in entirety, came from Harper’s New Monthly Magazine, which contained articles from Harper and Brothers publications as well as pirated British literature and should not be confused with the better-known Harper’s Weekly. This appraisal was also the most critical of those in the advertisement, suggesting that Logic and Utility only appealed to readers who were already mathematically inclined, although at the same time praising a style which was “chaste, simple, transparent, and in admirable harmony with the dignity of the subject” (Literary Notices 1850, p. 428). It is not clear how widely Logic and Utility was used, although its eight printings suggest a reasonable upper estimate of 8000 copies sold. By 1859, A. S. Barnes was claiming that Davies’s “complete course of mathematics” was “adopted and in successful use in the Normal Schools of New York, Michigan, Connecticut, and other States” (Peck 1859, p. 339). 
It has not yet been possible to verify this claim with college catalogues or journal articles about the structure and curriculum of normal schools and teachers’ institutes. In another advertisement that appeared at least as early as 1854, Barnes marketed Logic and Utility as part of a series of “standard library books” that presumably ought to be in every educational institution’s collection (The Student and Family Miscellany 1854, pp. 180ff). Potential customers were convinced; for instance, in the first 30 years after publication, Logic and Utility appeared in catalogues for the Providence Athenaeum (1853, p. 107), San Francisco Mercantile Library (1854, p. 35–36, 148), Mercantile Library of New York (1856, p. 28), Library of Congress (1864, p. 324), Young Men’s Association of Chicago (1865, p. 40), Young Men’s Mercantile Library of Cincinnati (1869, p. 76), Public Library of Indianapolis (1873, p. 88), and Apprentice’s Library of New York (1874, p. 301). Additionally, libraries in Ontario were encouraged to purchase copies for “professional reference” (General Catalogue 1854, p. 26). Some evidence suggests Logic and Utility stimulated conversations about the meaning and role of mathematics. One of the few reviews that A. S. Barnes did not include in the advertisement engaged with Davies’s claims about the role played by mathematics in education (Use of Mathematics 1851). While the anonymous reviewer thought Davies went too far in treating mathematics as the foundation for all knowledge and maintained that the mental discipline justification for teaching
mathematics was not widely held, the reviewer did name several advantages of mathematical study, including the abilities to decompose concepts into parts and to generalize from a set of observations as well as the development of habits of patient attention. Thus, he commented: “The fact is, that nothing tends more effectually than mathematical demonstration to the attainment of those habits of precision, neatness, and accuracy, that make and mark the scholar in any department of life” (Use of Mathematics 1851, p. 225). In his 1855 geometry textbook, Transylvania University president James B. Dodd added an appendix to critique Davies’s definition of ratio in Logic and Utility (1850, p. 187); Dodd wanted to show that dividing the first number by the second was as valid as dividing the second number by the first. He noted that Davies switched between the two definitions in several of his other textbooks and said Davies had reversed Legendre’s definition in Elements of Geometry (Dodd 1855, pp. 231–237).
6 Conclusion The assessment of Bidwell and Clason, and Jones and Coxford appears to be valid: that Logic and Utility was the first book-length work on how to teach mathematics in schools published in the United States — previous volumes, such as Anna Cabot Lowell’s Theory of Teaching, with a Few Practical Illustrations (1841) and David Perkins Page’s Theory and Practice of Teaching (1847), were general manuals covering the teacher’s demeanor, habits of daily classroom management, and techniques and methods for fostering learning in all subjects. The specific topic of mathematics education, though, was a well-established component of conversations about academics. American journals had been filled with articles debating whether the classics and mathematics should remain at the center of a liberal college education for the previous two decades. In addition to the essays by Day and Kingsley and by Mansfield mentioned above, two other examples were published by T. M. Post and Thomas Smith Grimké (brother of abolitionists Sarah Moore Grimké and Angelina Grimké Weld) in the same volume of Transactions of the Fourth Annual Meeting of the Western Literary Institute and College of Professional Teachers in which Mansfield’s paper appeared (Grimké 1835; Post 1835). Like these pieces, journal articles about why and how to teach mathematics tended to be wide-ranging philosophical discussions that related the choice of subjects to the values of the American system of government. They typically did not offer guides for explaining specific topics in classrooms. English-language books on mathematics teaching methods during this time period seem to have been written and printed only in England. Besides well-known authors such as De Morgan (1830) and Whewell (1835), teachers and tutors such as Gregory (1840) and Pullen (1820) assembled hints and techniques for helping children comprehend arithmetic and geometry problems—albeit in order to develop habits of clear reasoning so they
could be admitted to secondary schools and higher education — while University of Durham professor Chevallier (1836) defended Whewell against a critical evaluation from Edinburgh Review. Despite the availability of these British works and although there were methods in Logic and Utility, particularly in the classroom tips at the ends of the chapters in Book II, most of the text’s content conveyed Davies’s beliefs about what mathematics was and why it should be a central component of education at all levels. In Book I, he proposed using logic to teach mathematics by borrowing material from Richard Whately and John Stuart Mill, but he was not able to realize this goal in the rest of the volume. Book II, on the branches of mathematics, contained some contradictory claims and a rather strange focus on the unit. It also emphasized the subjects of school mathematics, argued for connecting techniques to practice, and provided general instructions to teachers. These guidelines addressed the preferred order of topics, the fundamental values of what Davies called the “West Point system,” the importance of the mental discipline justification for teaching mathematics, and Davies’s preference for proceeding from general concepts to specific information. In Book III, he described the benefits of mathematics as mental discipline, exposure to induction and deduction, and practical achievements and inventions. Therefore, it may be more apt to characterize Logic and Utility as a philosophy of mathematics education than as a handbook of teaching methods. To an extent, Davies acknowledged his endeavor was philosophical when in 1873 he retitled the book The Nature and Utility of Mathematics. 
The changes to the content were relatively modest: in Book II (“mathematical science”), he expanded the first chapter by recasting his definition of mathematics as quantity into quantity of number and of space, added a fifth chapter on differential calculus, and relocated the appendix on “A Course of Mathematics — What It Should Be” that had previously appeared at the end of Book III (Davies 1873, pp. 4, 98–117, 289–348). Throughout the three decades it was in use, Davies’s way of thinking about mathematics found an audience and had an impact on at least some of its readers. It also fit well within the contours of Davies’s overall career. He made his name as a compiler of textbooks at USMA, then marketed his connection to USMA’s prominence in technical education for the rest of his career. At the same time, with his penchant for plagiarism, intellectual incoherence, and challenging personality, Davies contributed to sabotaging his own reputation. The evaluation of Logic and Utility given here suggests that translating the principles he associated with his collegiate textbooks into school instruction was not a straightforward task. Still, Davies loomed large in the realm of American mathematics textbooks and teaching in the nineteenth century. It is likely that additional details to broaden and deepen the story of his educational activities remain to be found in the vast corpus of nineteenth-century books and journals that remain unexamined by historians. Acknowledgements The author thanks the anonymous referees as well as audiences at the 2018 Joint Mathematics Meetings, CIR-MATH-Americas workshop, CSHPM Annual Meeting, and Women’s Intellectual Network Research Symposium for their helpful questions and comments. David Orenstein and Cathy Kessel provided suggestions for lines of research and potential sources. Adrian Rice supplied the information about De Morgan.
References
Ackerberg-Hastings Amy (2000) Mathematics is a Gentleman’s Art: Analysis and Synthesis in American College Geometry Teaching, 1790–1840. Ph.D. diss., Iowa State University.
Ackerberg-Hastings Amy (2018) John Playfair’s Approach to ‘the Practical Parts of the Mathematics’. In: Zack Maria, Schlimm Dirk (Eds) Research in History and Philosophy of Mathematics: The CSHPM 2017 Annual Meeting in Toronto. Proceedings of the Canadian Society for History and Philosophy of Mathematics. Springer International Publishing.
Alphabetical Catalogue of the Library of Congress: Authors (1864) Washington, DC: Government Printing Office.
Bidwell James K, Clason Robert G (Eds) (1970) Readings in the History of Mathematics Education. Washington, DC: National Council of Teachers of Mathematics.
Catalogue of the Apprentice’s Library (1874) Schwartz, Jr J (Ed) New York: Chatterton & Parker.
Catalogue of Books in the Mercantile Library, of the City of New York (1856) New York: Baker & Godwin.
Catalogue of the Books Belonging to the Young Men’s Association of the City of Chicago (1865) Horton John M (Ed) (vol 1). Chicago.
Catalogue of the Library of the Providence Athenaeum (1853) Providence.
Catalogue of the Public Library of Indianapolis (1873) Indianapolis.
Catalogue of the San Francisco Mercantile Library (1854) San Francisco: Daily Evening News Office.
Catalogue of the Young Men’s Mercantile Library Association of Cincinnati (1869) Cincinnati.
Chevallier Temple (1836) The Study of Mathematics as Conducive to the Developement [sic] of the Intellectual Powers. Durham, England: John W. Parker.
Colton Walter (1851) Ship and Shore, in Madeira, Lisbon, and the Mediterranean (Cheever Henry Theodore (Ed)). New York: A. S. Barnes & Co.
Darby John (1855) Botany of the Southern States. In Two Parts. New York: A. S. Barnes & Co.
Davies Charles (1850) The Logic and Utility of Mathematics, with the best methods of instruction explained and illustrated. New York: A. S. Barnes & Co.
Davies Charles (1873) The Nature and Utility of Mathematics, with the best methods of instruction explained and illustrated. New York: A. S. Barnes & Co.
Day Henry Noble (1853) Elements of the Art of Rhetoric: Adapted for Use in Colleges and Academies, and for Private Study (3rd ed). New York: A. S. Barnes & Co.
Day Jeremiah, Kingsley James Luce (1829) Original Papers in Relation to a Course of Liberal Education. American Journal of Science 15: 297–351.
De Morgan Augustus (1830) The Study of Mathematics. Library of Useful Knowledge. London: Baldwin and Cradock.
Dodd James B (1855) Elements of Geometry and Mensuration. New York: Farmer, Brace & Co.
Emerson v. Davies, et al. (1845) 8 F. Cas. 615 (D. Mass.).
General Catalogue of Books for Public Libraries in Upper Canada (1854) Journal of Education, Upper Canada 7, no. 2.
Gregory Olinthus (1840) Hints, Theoretical, Elucidatory, and Practical, for the Use of Teachers of Elementary Mathematics, and of Self-Taught Students. London: Whittaker & Co.
Grimké Thomas Smith (1835) American Education: Oration, on the Subject ‘That Neither the Classics Nor the Mathematics Should Form a Part of a Scheme of General Education in Our Country’. In: Transactions of the Fourth Annual Meeting of the Western Literary Institute, and College of Professional Teachers, Held in Cincinnati, October, 1834, 99–137. Cincinnati: Josiah Drake.
Hamilton Fish Papers. Columbia University Rare Book and Manuscript Library.
Harper Ida Husted (1899) The Life and Work of Susan B. Anthony (vol 1). Indianapolis and Kansas City: The Bowen-Merrill Company.
Heydt Colin (n.d.) John Stuart Mill (1806–1873). The Internet Encyclopedia of Philosophy. http://www.iep.utm.edu/milljs/.
An Historical Sketch of the State Normal College at Albany, N.Y. ([1894]) Albany: Brandow Printing Co.
Jones Phillip S, Coxford, Jr Arthur F (Eds) (1970) From Colburn to the Rise of the Universities: 1821–94. In: A History of Mathematics Education in the United States and Canada, 24–35. 32nd Yearbook. Washington, DC: National Council of Teachers of Mathematics.
Kidwell Peggy A, Ackerberg-Hastings Amy, Roberts David Lindsay (2008) Tools of American Mathematics Teaching, 1800–2000. Baltimore: The Johns Hopkins University Press.
Kirk Hyland C (1883) A History of the New York State Teachers’ Association. New York: E. L. Kellogg & Co.
Kullman David E (1998) Joseph Ray–The McGuffey of Mathematics. Ohio Journal of School Mathematics 38: 5–10.
Literary Notices (1850) Harper’s New Monthly Magazine 1, no. 3: 425–430.
Lowell Anna Cabot (1841) Theory of Teaching, with a Few Practical Illustrations. Boston: E. P. Peabody.
Mansfield Edward Deering (1835) Discourse on the Utility of the Mathematics. In: Transactions of the Fourth Annual Meeting of the Western Literary Institute, and College of Professional Teachers, Held in Cincinnati, October, 1834, 139–159. Cincinnati: Josiah Drake.
Mansfield Edward Deering (1851) American Education. New York: A. S. Barnes & Co.
Mansfield Edward Deering (1877) Charles Davies. In: Eighth Annual Reunion of the Association of the Graduates of the United States Military Academy at West Point, New York, June 14, 1877, 23–27. New York: A. S. Barnes & Co.
The National Union Catalog: Pre-1956 Imprints (1976) (vol. 134) London: Mansell.
Page David Perkins (1847) Theory and Practice of Teaching: Or, The Motives and Methods of Good School-Keeping. Syracuse, NY: Hall & Dickson.
Peck William Guy (1859) Elements of Mechanics: For the Use of Colleges, Academies, and High Schools. New York: A. S. Barnes & Burr.
Post T M (1835) The Classics: Lecture Upon the Study of the Greek and Latin Languages as a Part in the Course of a Liberal Education. In: Transactions of the Fourth Annual Meeting of the Western Literary Institute, and College of Professional Teachers, Held in Cincinnati, October, 1834, 63–96. Cincinnati: Josiah Drake.
Pullen P H (1820) The Mother’s Book; Exemplifying Pestalozzi’s Plan of Awakening the Understanding of Children in Language, Drawing, Geometry, Geography, and Numbers. London.
Stanton Elizabeth Cady, Anthony Susan B, Gage Matilda Joslyn (Eds) (1889) History of Woman Suffrage (2nd ed, vol. 1) Rochester, NY: Charles Mann.
State Convention of County Superintendents (1843) District School Journal, of the State of New York 4: 49–64.
The Student and Family Miscellany (1854) 9, no. 5.
Teachers’ Institutes (1845) District School Journal, of the State of New York 6: 101–103.
Twelfth Annual Meeting of the New York State Teachers’ Association (1857) New York Teacher 6: 539–547.
The Use of Mathematics in Education (1851) Methodist Quarterly Review 33: 218–226.
Van Evra James (2008) Richard Whately and Logical Theory. In: Gabbay Dov M, Woods John (Eds) British Logic in the Nineteenth Century, 75–91. Handbook of the History of Logic (vol 4). Amsterdam: North Holland.
Whately Richard (1827) Elements of Logic (2nd ed). London: J. Mawman.
Whately Richard (1844) Elements of Logic (8th ed). London: B. Fellowes.
Whewell William (1835) Thoughts on the Study of Mathematics as a Part of a Liberal Education (2nd ed). Cambridge.
Gauss et le modèle du champ magnétique terrestre Roger Godard and John de Boer
Abstract In 1839, Carl Friedrich Gauss published his famous article on the modeling of the terrestrial magnetic field. Benefiting from earlier work on gravitational theory, Gauss assumed that the Earth was surrounded by a magnetic potential that obeys the Laplace equation, and he solved this equation in spherical coordinates. He assumed a trial solution of the form $\frac{1}{r^{n}}\,U(\theta, \phi)$ and generated a solution as a series of associated Legendre polynomials. To compute the coefficients in the spherical harmonic expression for the potential, Gauss selected 189 equations involving 24 unknown coefficients, which he then found by the method of least squares. Gauss also assumed that the magnetic potential V goes to zero as the radius r goes to infinity. Unfortunately, the magnetic potential is unknown at the Earth's surface, and only the three components of the Earth's magnetic field are accessible. Gauss therefore needed data from terrestrial magnetic observatories. We give a brief historical survey of magnetic observations, followed by comments on the gravitational theories, mainly from the works of Lagrange, Laplace, and Legendre on potential theory. Finally, we examine the validity of Gauss's approach and his results.
1 Introduction
Carl Friedrich Gauss (1777–1855) was not only the prince of mathematicians but also an applied mathematician, through his work on the method of least squares, the numerical solution of systems of linear equations, numerical integration, interpolation, the fast Fourier transform (FFT), and more. He was a mathematical physicist through his contributions to astronomy, geodesy, and geomagnetism (Garland 1979; Sheynin 2001), but also a talented experimentalist. This paper concerns Gauss's contribution to terrestrial magnetism. He was the first to model the magnetic field at the Earth's surface mathematically and to obtain numerical results. Gauss began to take an interest in the magnetic field as early as 1803. We shall comment on the successive theories of gravitational attraction that transformed mathematical physics and whose methods directly influenced the model of the magnetic potential.
R. Godard · J. de Boer, Royal Military College of Canada, Kingston, ON, Canada. © Springer Nature Switzerland AG 2020 M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_8
2 Observations of the Terrestrial Magnetic Field and the Instruments
In 1831, Wilhelm Weber joined the University of Göttingen as professor, replacing the late Tobias Mayer. This was the start of a fruitful collaboration with Gauss. In 1832, Gauss began his research on terrestrial magnetism. The initiative seems to have come from Alexander von Humboldt, who wanted to interest Gauss in his project of establishing a network of magnetic observation stations around the globe (Bühler 1981; Dunnington 1955) in order to observe and properly understand the laws governing the variation of the magnetic field. In 1836, von Humboldt also wrote to the President of the Royal Society in London to propose establishing magnetic observatories throughout the British dominions, and the mission was entrusted to Major Sabine. In 1833, Gauss had a very modern magnetic observatory built at Göttingen, next to the astronomical observatory. He was also interested in the temporal variations of the magnetic field. The instruments included declination, inclination, and intensity compasses. Also in 1833, Gauss published his new invention, a magnetometer for measuring the terrestrial magnetic field (Garland 1979). Magnetic data were essential as boundary conditions for Gauss's global mathematical model. Garland says: Writing in 1838, Gauss remarked that for many years he had wished to attempt the analysis, but that it was necessary for him to await the publication of sufficient observations. (Garland 1979, p. 14).
Sabine supplied these observations in 1837. Over the centuries there had been quite fanciful explanations of terrestrial magnetism. Christopher Columbus believed that the Pole Star attracted the magnetized needle. Others thought that a magnetic mountain existed in the Arctic. William Gilbert overturned all earlier hypotheses by publishing De Magnete in 1600, in which he explained that the Earth was a gigantic magnet with a north pole and a south pole. Gilbert was a protégé of Queen Elizabeth I. Gauss's model of 1839 would justify Gilbert's hypotheses a posteriori.
3 The Theory of the Gravitational Potential
To understand Gauss's contribution to the boundary-value problem, we must first retrace what the theory of gravitation had already achieved. In Isaac Newton's Principia, published in 1687, the gravitational forces considered were the forces of attraction between two point masses. If $\mu_1$ denotes a mass located at the point $M_1$ and $\mu_2$ a mass located at the point $M_2$, the force of mutual attraction is expressed by:

$$\vec{F} = \gamma\,\frac{\mu_1 \mu_2}{r^2}\,\vec{e}_r, \tag{1}$$

where r is the radial distance between $M_1$ and $M_2$, γ is a constant, and $\vec{e}_r$ is the unit vector indicating the direction from one mass to the other. After the observations of Pierre Louis Moreau de Maupertuis in 1736 concerning the sphericity of the Earth, the theory of the figure of the Earth became a subject of fundamental research. Alexis Clairaut then published his famous book on La Figure de la Terre (Clairaut 1743). For him, such a study of the Earth would be an important step toward understanding the system of the world. An object placed at some location outside the Earth is attracted by all the particles composing the Earth, each of them acting according to its position and, simultaneously, with more or less force according to its distance. Clairaut went so far as to consider the Earth as composed of heterogeneous matter. Geometers quickly realized that it was difficult to work with vector forces and began to work with the individual components. In 1738, Daniel Bernoulli introduced into fluid mechanics the concept of a scalar potential function (ascensis actualis et potentialis) from which a force is derived, and in 1773 Joseph Louis Lagrange presented the scalar quantity for the problem of a body attracted by a system of point masses (Burkhardt and Meyer 1900; Godard 2018):

$$\frac{M_1}{\Delta_1} + \frac{M_2}{\Delta_2} + \frac{M_3}{\Delta_3} + \dots = \frac{m_1 \mu}{r_1} + \frac{m_2 \mu}{r_2} + \dots, \tag{2}$$

where the $r_i$ or $\Delta_i$ (Lagrange's notation) are the distances from the point masses $m_i$, with coordinates $(x_i, y_i, z_i)$, to the point mass μ with coordinates (a, b, c), and the $M_i$ (Lagrange's notation) are $m_i \mu$. In more modern notation, (2) can be expressed as:

$$V(a, b, c) = \sum_i \frac{m_i\,\mu}{\sqrt{(a - x_i)^2 + (b - y_i)^2 + (c - z_i)^2}}. \tag{3}$$
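The practical gain from the scalar quantity in (3) can be checked numerically. The following is a minimal sketch, not anything taken from the historical sources: it assumes γ = 1, uses a hypothetical set of three point masses, and compares the vector sum of the Newtonian attractions (1) with a finite-difference gradient of V. With the sign convention of (3), the force on μ is the gradient of V. The function names and the step size h are illustrative choices.

```python
import math

def potential(point, masses, mu=1.0):
    """Eq. (3): V(a, b, c) = sum_i m_i * mu / distance_i."""
    return sum(m * mu / math.dist(point, pos) for m, pos in masses)

def force_by_vector_sum(point, masses, mu=1.0):
    """Add the Newtonian attractions of Eq. (1) component by component (gamma = 1)."""
    force = [0.0, 0.0, 0.0]
    for m, pos in masses:
        r = math.dist(point, pos)
        for axis in range(3):
            # Each attraction points from the attracted point toward the mass m_i.
            force[axis] += m * mu * (pos[axis] - point[axis]) / r**3
    return force

def force_from_potential(point, masses, mu=1.0, h=1e-5):
    """Central-difference gradient of V; with this sign convention F = grad V."""
    grad = []
    for axis in range(3):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        grad.append((potential(plus, masses, mu) - potential(minus, masses, mu)) / (2 * h))
    return grad

# Hypothetical configuration: (mass, position) pairs and an attracted point.
masses = [(2.0, (0.0, 0.0, 0.0)), (1.0, (1.0, 0.0, 0.0)), (0.5, (0.0, 2.0, 1.0))]
attracted_point = (3.0, 1.0, -2.0)

direct = force_by_vector_sum(attracted_point, masses)
from_v = force_from_potential(attracted_point, masses)
print(direct)
print(from_v)
```

The two printed vectors should agree to within the finite-difference error, which illustrates why a single scalar sum was easier to handle than three component sums.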
Recall that the words potential function for V were proposed only much later by George Green in his article of 1828, and that in 1840 Gauss proposed simply the word potential. The great advantage of working with a scalar function is that the different contributions simply add. The next step was to generalize formula (3) from the discrete case to the continuous case, as Laplace, Legendre, and Lagrange did. For example, in 1792–1793, Lagrange wrote the following: It is known that the attraction of a spheroid on any point whose position in space is determined by the coordinates a, b, c, referred to the same axes as the coordinates x, y, z, depends on the formula:

$$\iiint \frac{dx\,dy\,dz}{\sqrt{(a - x)^2 + (b - y)^2 + (c - z)^2}}, \tag{4}$$

which is called V, the integration being taken over the entire mass of the spheroid, so that in this quantity V is regarded as a function of a, b, c . . . (Lagrange 1798).
In (4), the Earth is assumed to be a homogeneous body and the mass is taken as the unit. This formula was already known to Pierre-Simon Laplace and Adrien-Marie Legendre in 1782. It was clear that the Cartesian coordinate system was not suited to the symmetry of the problem. From 1782 on, Legendre and Laplace worked in spherical coordinates, which were more convenient. These two articles, written by Legendre and Laplace in 1782, form the cornerstone of the theory of terrestrial attraction. Laplace showed that the convolution integral (4) is a solution of Laplace's equation $\nabla^2 V = 0$. In his article of 1782, published in 1785, Laplace wrote the equation directly in spherical coordinates, without explanation. Only in volume 1 of the Traité de mécanique céleste (Laplace 1799, vol. 1, pp. 156–157) did he derive Laplace's equation directly from (4), by differentiating under the integral sign with respect to the variables a, b, c. Laplace wrote the equation as:

$$\frac{ddV}{da^2} + \frac{ddV}{db^2} + \frac{ddV}{dc^2} = 0. \tag{5}$$

Laplace assumed that (5) held inside the Earth as well as outside. But in 1813, Siméon Denis Poisson observed that (4) diverged if the point (a, b, c) was located inside the spheroid, and that in this case Laplace's equation had to be replaced by Poisson's equation:

$$\nabla^2 V + 4\pi\rho = 0. \tag{6}$$
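The distinction between (5) and (6) can be illustrated with a short numerical sketch. It is an illustration under stated assumptions rather than a reconstruction of Laplace's or Poisson's arguments: the exterior case uses the potential 1/r of a unit point mass, and the interior case uses the standard textbook potential inside a homogeneous sphere, $2\pi\rho\,(R^2 - r^2/3)$, which is not given in the text above. The density, radius, sample points, and step size are hypothetical.

```python
import math

def laplacian(f, p, h=1e-3):
    """Central-difference estimate of d2f/da2 + d2f/db2 + d2f/dc2 at p = (a, b, c)."""
    center = f(p)
    total = 0.0
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        total += (f(plus) - 2.0 * center + f(minus)) / (h * h)
    return total

def point_mass_potential(p):
    """Potential 1/r of a unit mass at the origin (valid away from the origin)."""
    return 1.0 / math.dist(p, (0.0, 0.0, 0.0))

RHO, RADIUS = 1.0, 2.0  # hypothetical density and radius of a homogeneous sphere

def interior_potential(p):
    """Assumed textbook potential inside a homogeneous sphere: 2*pi*rho*(R^2 - r^2/3)."""
    r2 = sum(c * c for c in p)
    return 2.0 * math.pi * RHO * (RADIUS**2 - r2 / 3.0)

# Outside the attracting matter, Eq. (5) should hold: the Laplacian is close to 0.
outside = laplacian(point_mass_potential, (1.5, -0.7, 2.0))
# Inside the sphere, Eq. (6) should hold: Laplacian + 4*pi*rho is close to 0.
inside = laplacian(interior_potential, (0.3, 0.4, -0.5)) + 4.0 * math.pi * RHO
print(outside, inside)
```

Both printed residuals should be small compared with the individual second derivatives, up to discretization and rounding error.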
Poisson assumed that the density ρ inside the sphere was constant. Gauss extended this result in 1839 to the case of a variable density ρ(x, y, z), provided that ρ(x, y, z) had uniformly bounded first derivatives $\rho_x$, $\rho_y$, $\rho_z$. The Laplace operator Δ or $\nabla^2$ has a history of its own. Fourier wrote DV = 0, while Poisson and Green wrote δV = 0; Murphy used ΔV, Lamé $\Delta_2 V$, Betti $\Delta^2 V$, and Gibbs and Tait $\nabla^2 V$ (Burkhardt and Meyer 1900, p. 468). Finally, Laplace's equation was derived and studied for the first time not by Laplace but by Leonhard Euler, in his paper Principes du mouvement des fluides, composed in 1752 (Kline 1972, p. 525).
3.1 L’article de Legendre en 1782 Cet article de Legendre sur l’attraction des sphéroïdes homogènes, écrit en 1782, publié en (1785), est bien connu (Kline 2, pp. 525–531; Todhunter 1873 vol. 2 chapitre XX). Il constitue le premier mémoire de Legendre sur l’attraction gravitationnelle. Après un exposé sur les travaux antérieurs de Colin MacLaurin sur l’attraction gravitationnelle d’ellipsoïdes intitulé Sur le flux et le reflux de la mer, et couronnés par l’Académie des Sciences en 1740, Legendre exprime le potentiel donné par (4) en coordonnées sphériques (1785, paragraphe 14):
V = =
3ρ r
dρ r2
+ r2
− 2rr cos (γ )
α0 P0 (cos θ ) α2 P2 (cos θ ) α4 P4 (cos θ ) + + + . . . . r0 r2 r4
(7)
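Or, more precisely:
$$V = \frac{3\rho}{r}\left[\frac{1}{3} + \frac{\alpha}{5r^2}\left(\frac{3}{2}\cos^2\theta - \frac{1}{2}\right) + \frac{\varsigma}{7r^4}\left(\frac{5\cdot 7}{2\cdot 4}\cos^4\theta - \frac{3\cdot 5}{2\cdot 4}\,2\cos^2\theta + \frac{1\cdot 3}{2\cdot 4}\right) + \dots\right]. \tag{8}$$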
Here r is the distance to the attracted point P(r, θ, 0), r' is the radius vector to an arbitrary point inside the spheroid, and γ is the angle between the two vectors. θ is the angle between the vector to P and the z-axis, that is, the colatitude. Legendre assumes axial symmetry, which implies a homogeneous Earth. Because of the symmetry of the problem, he was thus able to express the potential in terms of the Legendre polynomials of even degree $P_{2n}(\cos\theta)$. The $\alpha_0, \alpha_2, \alpha_4, \dots$ are constants and ρ is the mass. Legendre defined the potential V as the quantity representing the sum of the molecules of the spheroid, each divided by its distance to the attracted point. In a second memoir, Recherches sur la figure des planètes, accepted in 1784 and published in (1787), Legendre studies the properties of his polynomials, in particular their orthogonality properties, already known for the trigonometric functions (page 374):
$$\int_{-1}^{+1} P_{2n}(x)\,P_{2m}(x)\,dx = \begin{cases} 0 & \text{if } m \neq n, \\ \dfrac{2}{4m+1} & \text{if } m = n. \end{cases} \tag{9}$$
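The orthogonality relation (9) can be checked exactly for small degrees. The following is a minimal sketch, not Legendre's own computation: it builds the polynomials from Bonnet's recurrence (a modern device) in exact rational arithmetic, and the helper names `legendre` and `integrate_product` are hypothetical. Over [−1, 1] the diagonal value is 2/(4m+1); because the even-degree polynomials are even functions, the same integral taken over [0, 1] gives half of that, 1/(4m+1).

```python
from fractions import Fraction

def legendre(n):
    """Coefficients (lowest degree first) of P_n, from Bonnet's recurrence
    (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        x_p = [Fraction(0)] + p  # coefficients of x * P_k
        padded = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_next = [((2 * k + 1) * a - k * b) / (k + 1)
                  for a, b in zip(x_p, padded)]
        p_prev, p = p, p_next
    return p

def integrate_product(p, q, lo, hi):
    """Exact integral of p(x) * q(x) over [lo, hi]."""
    prod = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    return sum(c * (Fraction(hi) ** (k + 1) - Fraction(lo) ** (k + 1)) / (k + 1)
               for k, c in enumerate(prod))

p2, p4 = legendre(2), legendre(4)
off_diagonal = integrate_product(p2, p4, -1, 1)  # case m != n
full_range = integrate_product(p4, p4, -1, 1)    # case m = n = 2, over [-1, 1]
half_range = integrate_product(p4, p4, 0, 1)     # same integrand over [0, 1]
```

The coefficients returned by `legendre(4)` are 3/8, −30/8 and 35/8, the same numbers that appear as 1·3/(2·4), 3·5/(2·4)·2 and 5·7/(2·4) in (8).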
R. Godard and J. de Boer
In a very fruitful fourth memoir, dating from 1789 and published in (1793), Legendre writes out his first seven polynomials, even and odd (page 377), the orthogonality properties (page 384), and the ordinary differential equation for the associated Legendre polynomials (page 427), with two parameters k and m:
$$\frac{d\left[(1 - xx)\,dV^{m,k}\right]}{dx^2} - \frac{k^2\,V^{m,k}}{1 - xx} + m(m+1)\,V^{m,k} = 0. \tag{10}$$
He also studies the solution of Legendre's differential equation (page 454), with a single parameter m:
$$\frac{d\left[(1 - xx)\,dX^{m}\right]}{dx^2} + m(m+1)\,X^{m} = 0. \tag{11}$$
3.2 Laplace's Paper of 1782 and the Traité de mécanique céleste

In 1782 Laplace submitted his paper Théorie des attractions des sphéroïdes et de la figure des planètes, which was published in 1785. It was his fourth memoir on the subject. He chose to solve Laplace's equation directly in spherical coordinates, and he considered a heterogeneous Earth (Laplace 1785, p. 363). Laplace explains his reasoning:

I observed in our Mémoires for the year 1779 that the integrals of linear second-order partial differential equations were often possible only by means of definite integrals, similar to the expression for V (Eq. 5); thus, when one has such integrals, it is easy, in a great many cases, to derive from them partial differential equations, the consideration of which may furnish interesting remarks and facilitate the reduction of the integrals to series.
Laplace then makes the change of variable cos θ = μ and writes directly:
$$0 = \frac{\partial}{\partial\mu}\left[(1-\mu^2)\frac{\partial V}{\partial\mu}\right] + \frac{1}{1-\mu^2}\frac{\partial^2 V}{\partial\phi^2} + r\,\frac{\partial^2 (rV)}{\partial r^2}. \tag{12}$$
Only in volume 1 of the Traité de mécanique céleste (1799, pp. 157–159) would Laplace carry out explicitly the change from Cartesian to spherical coordinates. He uses the method of separation of variables and, noting that the potential must decrease with distance from the Earth, chooses as trial solution:
$$V(r,\theta,\phi) = \frac{U^{(0)}}{r} + \frac{U^{(1)}}{r^2} + \frac{U^{(2)}}{r^3} + \dots \tag{13}$$
where $U^{(i)} = F^{(i)}(\theta, \phi)$. This trial value of V is then substituted into the partial differential equation, which eliminates the r component. The functions $U^{(i)}$ would come to be called spherical harmonics, or Laplace functions. Laplace thus obtains:
$$0 = \frac{\partial}{\partial\mu}\left[(1-\mu^2)\frac{\partial U^{(i)}}{\partial\mu}\right] + \frac{1}{1-\mu^2}\frac{\partial^2 U^{(i)}}{\partial\phi^2} + i(i+1)\,U^{(i)}. \tag{14}$$
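The step from (12) and (13) to (14) can be made explicit. The following short derivation is a check in modern notation, not Laplace's own presentation; the symbol $V_i$ for a single term of the series is introduced here only for illustration.

```latex
% One term of the trial series (13): V_i = U^{(i)}(\mu,\phi) / r^{i+1}.
% Only the radial part of (12) acts on the power of r:
\begin{align*}
r\,\frac{\partial^2 (r V_i)}{\partial r^2}
  &= r\,U^{(i)}\,\frac{\partial^2}{\partial r^2}\, r^{-i}
   = r\,U^{(i)}\, i(i+1)\, r^{-i-2}
   = \frac{i(i+1)\,U^{(i)}}{r^{i+1}} .
\end{align*}
% Every term of (12) then carries the common factor r^{-(i+1)};
% dividing it out leaves equation (14) for U^{(i)} alone.
```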
At this point he does not pursue the method of separation of variables to study the components in μ and in φ. Laplace proves the orthogonality properties of spherical harmonics and the following theorem: if $U^{(n)}$ and $U^{(m)}$ are two solutions of the differential equation (14), then in general $\iint U^{(n)} U^{(m)}\,d\mu\,d\phi = 0$, where n and m are two different positive integers, the integrals being taken from μ = 1 to μ = −1 and from φ = 0 to φ = 360° (Laplace 1785, p. 389). In the 1782 paper and in the second volume of the Traité de mécanique céleste (Laplace 1799), Laplace computes the first two solutions of (14). He finds that for i = 0 the solution is a constant, and that for i = 1 the solution has the form:
$$H\mu + H'\sqrt{1-\mu^2}\,\sin\phi + H''\sqrt{1-\mu^2}\,\cos\phi, \tag{15}$$
where H, H', H'' are constants; that is, an expansion in associated Legendre polynomials and trigonometric functions. Laplace's spherical functions $U_n$ are now called the functions $Y_n(\theta, \phi)$. Gauss (1839) would define them as:
$$Y_n(\theta,\phi) = \sum_{m=0}^{n} (a_m \cos m\phi + b_m \sin m\phi)\,P_n^m(\cos\theta), \tag{16}$$
where the $a_m$ and $b_m$ are arbitrary constants. After Legendre and Laplace there was a pause, but the efforts of Poisson (1835) and Lejeune-Dirichlet (1837) on spherical harmonics deserve mention. Both were interested in representing an arbitrary function f(θ, φ), given between certain limits, by a series of spherical functions of the type:
$$f(\theta,\phi) = Y_1 + Y_2 + \dots + Y_n + \dots, \tag{17}$$
and in its convergence properties. According to Poisson, this series expansion was of great importance in celestial mechanics, the theory of heat, and other questions of mathematical physics and mechanics. The functions Y were obtained from inner products between the function f(θ, φ) and Legendre polynomials. However, the proofs of Poisson and Lejeune-Dirichlet were very restrictive (Lambert 1904–1916). They worked by analogy with trigonometric series, but they did not have the correct definition of the functions $Y_n$. Chapter VIII of (Poisson 1835) was entitled: Suite de la digression sur la manière de représenter les fonctions arbitraires par des séries de quantités périodiques.
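As a numerical sanity check on Laplace's i = 1 solution (15), one can evaluate the left-hand side of (14) by finite differences and confirm that it vanishes. This is only a sketch under stated assumptions: the constants H, H', H'' are given arbitrary hypothetical values, the function names `u1` and `residual_eq14` are illustrative, and central differences with step h stand in for the exact derivatives.

```python
import math

def u1(mu, phi, H=1.3, H1=-0.7, H2=0.4):
    """Laplace's i = 1 solution (15); H, H1, H2 stand for H, H', H''."""
    s = math.sqrt(1.0 - mu * mu)
    return H * mu + H1 * s * math.sin(phi) + H2 * s * math.cos(phi)

def residual_eq14(U, mu, phi, i, h=1e-3):
    """Right-hand side of (14) for U at (mu, phi), by central differences.
    Requires |mu| + h < 1 so that every sample stays inside (-1, 1)."""
    def flux(m):
        # (1 - m^2) * dU/dmu evaluated at m
        return (1.0 - m * m) * (U(m + h / 2, phi) - U(m - h / 2, phi)) / h
    # d/dmu [ (1 - mu^2) dU/dmu ]
    term_mu = (flux(mu + h / 2) - flux(mu - h / 2)) / h
    # (1 / (1 - mu^2)) * d^2U/dphi^2
    d2_phi = (U(mu, phi + h) - 2.0 * U(mu, phi) + U(mu, phi - h)) / (h * h)
    term_phi = d2_phi / (1.0 - mu * mu)
    return term_mu + term_phi + i * (i + 1) * U(mu, phi)

# Close to zero for i = 1; clearly non-zero if the wrong degree is used.
r_ok = residual_eq14(u1, 0.5, 0.1, 1)
r_bad = residual_eq14(u1, 0.5, 0.1, 2)
```

With the wrong degree the residual is 4U rather than 0, which is why `r_bad` is far from zero.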
4 Carl Friedrich Gauss and the Model of the Earth's Magnetic Field in 1839

In parallel with progress on gravitational attraction, the application of mathematical analysis to the theories of electricity and magnetism had also advanced. For example, between 1811 and 1823 Poisson wrote five memoirs, notably on the theory of magnetism. He also had a strong influence on George Green. In (1839), Gauss published Allgemeine Theorie des Erdmagnetismus, on the mathematical modelling of the Earth's magnetic field. It is a paper in mathematical physics, or better in geomagnetism, of 72 pages in volume V of the Werke, with many tables. Gauss benefited from the mathematics developed for gravitational theory and gave very little explanation of the equations. He gave no references for the mathematics; on the other hand, he gave a brief history of magnetic surveys, notably Halley's declination chart, the work of Barlow, Hansteen's inclination chart of 1780, etc. In his work Gauss followed William Gilbert's hypotheses that the Earth is a magnet and that the Earth's surface and the space outside it contain no source of magnetism, so that the magnetic potential at the surface and outside obeys Laplace's equation. In doing so, he worked by analogy with the forces of gravitational attraction and with Charles-Augustin de Coulomb's laws of 1787 for electricity and magnetism. Moreover, Gauss was interested only in the surface of the Earth and not in the exterior. Gauss then had to solve Laplace's equation in spherical coordinates for a heterogeneous terrestrial surface in the shape of a sphere. He needed a magnetic chart as a boundary condition at the Earth's surface, hence the great care he took to obtain precise magnetic observations.
In our view, the only one to have posed so complicated a problem before Gauss was Joseph Fourier, in (1821–1822), in his memoir Théorie du mouvement de la chaleur dans les corps solides. Fourier needed the temperature distribution at the surface of the globe as a boundary condition for the heat conduction problem. Unfortunately for Gauss, the magnetic potential was not an observable quantity. Only the components of the Earth's magnetic field, now more commonly called the magnetic induction, could be measured. The model becomes:
$$\nabla^2 V = 0, \tag{18}$$
$$V(r \to +\infty) = 0, \tag{19}$$
$$-\frac{\partial V}{\partial r}(a,\theta,\phi) = Z, \tag{20}$$
$$\frac{1}{r}\frac{\partial V}{\partial\theta}(a,\theta,\phi) = X, \tag{21}$$
$$\frac{1}{r\sin\theta}\frac{\partial V}{\partial\phi}(a,\theta,\phi) = Y, \tag{22}$$
$$V(r,\theta,\phi+2\pi) = V(r,\theta,\phi), \tag{23}$$
where a is the Earth's radius. The declination is then given by D = arctan(Y/X), the inclination by I = arctan(Z/H), and the horizontal intensity by $H = \sqrt{X^2 + Y^2}$. Gauss thus writes the equation for spherical harmonics (14), then the formula for obtaining the associated Legendre polynomials $P_n^m$, and directly the correct solution in the form of a series of trigonometric functions and associated Legendre polynomials, without explanation or references:
$$Y^{(n)} = g^{n,0}P^{n,0} + (g^{n,1}\cos\phi + h^{n,1}\sin\phi)P^{n,1} + (g^{n,2}\cos 2\phi + h^{n,2}\sin 2\phi)P^{n,2} + \dots + (g^{n,n}\cos n\phi + h^{n,n}\sin n\phi)P^{n,n}. \tag{24}$$
Equation (24) represents considerable progress over the eighteenth-century work of Legendre and Laplace. The coefficients g and h came to be called the Gauss coefficients. In modern notation, the complete series for the potential is (Chapman and Bartels 1962; Thébault et al. 2015):
$$V(r,\theta,\phi) = a\sum_{n=0}^{N}\left(\frac{a}{r}\right)^{n+1}\sum_{m=0}^{n} P_n^m(\theta)\,(g_n^m\cos m\phi + h_n^m\sin m\phi), \tag{25}$$
where the Gauss coefficients are now given in nanotesla, that is, in units of the magnetic field. The potential V has $(N + 1)^2 - 1$ constants. Gauss limited himself to N = 4, that is, 24 constants for the derivatives of the potential. Gauss innovated in the computation of the coefficients g and h. He did not try to obtain them in the form of integrals, as Poisson or Lejeune-Dirichlet did, but by the numerical method of least squares. He also rejected the direct computation of (25) from the raw observations because the calculations would have been prohibitive. Gauss had observations from magnetic stations distributed all around the globe. An interpolation method was needed to bring the information onto the nodes of a grid θ = constant and φ = constant (Chapman and Bartels 1962, pp. 631–632; Barraclough 1978, pp. 5–6). This was one of the first examples of tessellation! He kept 12 points on each circle of latitude, on 7 circles of latitude, for 84 points in total. Gauss then decomposed his problem by working along the colatitudes θ = constant and carrying out a harmonic analysis (Heideman et al. 1985). For the component X, for example, he had:
$$X = \sum_{n=1}^{4}\sum_{m=0}^{n} (g_n^m\cos m\phi + h_n^m\sin m\phi)\,\frac{dP_n^m}{d\theta}, \tag{26}$$
that is, $X = \sum_{m=0}^{4} (\alpha_m^x \cos m\phi + \beta_m^x \sin m\phi)$. There are therefore 9 coefficients to estimate per circle of latitude. With 7 circles of latitude, this gives 63 coefficients to determine in all. Since the calculation had to be done for the three components of the magnetic field, there were 63 × 3 = 189 coefficients $\alpha_m$, $\beta_m$ to find. Gauss's method nevertheless allowed enormous savings in computation. By identification, he set (Garland 1979, p. 15):
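$$\begin{pmatrix}\alpha_m^x \\ \beta_m^x\end{pmatrix} = \sum_{n=m}^{4} \begin{pmatrix} g_n^m \\ h_n^m \end{pmatrix} \frac{dP_n^m(\theta)}{d\theta}. \tag{27}$$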
He thus used the 189 equations to find the 24 Gauss coefficients, which he called elements, by the method of least squares. He probably used his iterative method to solve this system of equations (Chabert et al. 1993, pp. 333–335). We have checked the first coefficients, and Fig. 1 shows their variation in time from 1835 (Gauss) to 1940. The variation in time is clearly visible, and Gauss's results for 1835 were accurate. There appears to be an artefact in the coefficient $g_1^1$ or $h_1^1$ for the year 1880. Since the series (25) converges rapidly, the choice of four harmonics was judicious. In Figs. 2 and 3 we have reproduced by computer Gauss's model with the 1835 coefficients, for the declination and the total intensity. These maps compare favourably with more precise modern calculations. Figure 3 shows that the intensity increases towards the poles.
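The counting in this section, and the harmonic analysis along a single circle of latitude, can be illustrated as follows. This is a minimal sketch and not a reconstruction of Gauss's actual computation: it assumes 12 equally spaced longitudes, for which the least-squares fit of the truncated trigonometric series reduces to discrete Fourier sums, and the function name `harmonic_analysis` is hypothetical.

```python
import math

N = 4  # truncation degree used by Gauss

n_gauss_coefficients = (N + 1) ** 2 - 1   # coefficients g and h in (25)
n_per_circle = 2 * N + 1                  # alpha_0..alpha_4 and beta_1..beta_4
n_equations = n_per_circle * 7 * 3        # 7 circles, 3 field components

def harmonic_analysis(samples, max_order=N):
    """Least-squares fit of sum_m (alpha_m cos m*phi + beta_m sin m*phi)
    to equally spaced samples on one circle of latitude.
    Valid when len(samples) > 2 * max_order (discrete orthogonality)."""
    k = len(samples)
    phis = [2.0 * math.pi * j / k for j in range(k)]
    alpha, beta = [], []
    for m in range(max_order + 1):
        c_sum = sum(s * math.cos(m * p) for s, p in zip(samples, phis))
        s_sum = sum(s * math.sin(m * p) for s, p in zip(samples, phis))
        scale = 1.0 / k if m == 0 else 2.0 / k
        alpha.append(scale * c_sum)
        beta.append(scale * s_sum)  # beta[0] is identically zero
    return alpha, beta
```

Feeding the function 12 values synthesized from known α and β returns those coefficients to rounding error, which is the discrete property attributed to Bessel in the next section.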
5 After Gauss

In 1857 a professor named Plarr published a two-and-a-half-page article in the Comptes Rendus. Its title was Note sur une propriété commune aux séries dont le terme général dépend des fonctions Pn de Legendre, ou des cosinus ou sinus des multiples de la variable. Plarr showed that if one wishes to minimize the global quadratic error between a bounded, piecewise continuous function and a truncated series of Legendre polynomials or of trigonometric polynomials, the associated coefficients are automatically the Fourier-Legendre coefficients or the Fourier coefficients. The link between Fourier-Legendre series and least squares was thus established, and Gauss's method for finding his coefficients was justified. In fact Bessel had already established this property in 1815 in the discrete case for trigonometric series, by minimizing the sum of the squares of the differences between the computed values and the observed ones (Esclangon 1904–1916).

Fig. 1 Variation in time of the Gauss coefficients from 1835 to 1940 (Chapman and Bartels 1962, p. 639; Thébault et al. 2015). The coefficients are in nanotesla

Fig. 2 Declination [in degrees] according to Gauss's model for 1835, degree and order 4. The map is in Mercator projection

Fig. 3 Intensity of the Earth's field [in microtesla] according to Gauss's model for 1835, degree and order 4. The map is in Mercator projection

A simple and elegant, though little used, way to find the coefficients g and h is to use the orthogonality properties of the spherical functions. One then has the following results for the inner products of the tesseral harmonic functions, and for the coefficients $g_n^m$ and $h_n^m$ in spherical coordinates (Chapman and Bartels 1962, pp. 610–612; Barraclough 1978, p. 15; Rexer and Hirt 2015):
$$\frac{1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi} \left[P_n^m(\cos\theta)\cos m\phi\right]^2 \sin\theta\,d\phi\,d\theta = \frac{1}{2n+1}, \qquad \frac{1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi} \left[P_n^m(\cos\theta)\sin m\phi\right]^2 \sin\theta\,d\phi\,d\theta = \frac{1}{2n+1}, \tag{28}$$
$$g_n^m = \frac{2n+1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi} V(a,\theta,\phi)\,P_n^m(\cos\theta)\cos m\phi\,\sin\theta\,d\phi\,d\theta, \qquad h_n^m = \frac{2n+1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi} V(a,\theta,\phi)\,P_n^m(\cos\theta)\sin m\phi\,\sin\theta\,d\phi\,d\theta. \tag{29}$$
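The normalization in (28) can be checked numerically for the lowest degrees. The sketch below rests on an assumption that the text does not state explicitly: that the $P_n^m$ are Schmidt quasi-normalized associated Legendre functions, the convention used in modern geomagnetic field models. The closed forms for n ≤ 2 and the function name `mean_square` are supplied here for illustration, and the integral is approximated by a midpoint rule.

```python
import math

# Schmidt quasi-normalized associated Legendre functions for n <= 2,
# written as functions of the colatitude theta.
# Assumption: this is the normalization implied by (28).
SCHMIDT = {
    (1, 0): lambda t: math.cos(t),
    (1, 1): lambda t: math.sin(t),
    (2, 0): lambda t: 1.5 * math.cos(t) ** 2 - 0.5,
    (2, 1): lambda t: math.sqrt(3.0) * math.sin(t) * math.cos(t),
    (2, 2): lambda t: 0.5 * math.sqrt(3.0) * math.sin(t) ** 2,
}

def mean_square(n, m, n_theta=2000, n_phi=16):
    """(1/4pi) times the integral over the sphere of
    [P_n^m(cos theta) cos(m phi)]^2 sin(theta), by the midpoint rule."""
    p = SCHMIDT[(n, m)]
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        w = p(theta) ** 2 * math.sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += w * math.cos(m * phi) ** 2
    return total * d_theta * d_phi / (4.0 * math.pi)

# Each value should be close to 1 / (2n + 1).
results = {nm: mean_square(*nm) for nm in SCHMIDT}
```

Under this normalization every harmonic of degree n has the same mean square over the sphere, which is what makes the projection formulas (29) take the same form for every order m.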
In conclusion, Gauss's 1839 memoir, by using spherical harmonics, gave mathematical form to Gilbert's hypothesis that the Earth's magnetic field is of internal origin; it also advanced the methods by presenting the spherical harmonics as a series of trigonometric functions and associated Legendre polynomials in order to find the coefficients g and h. In doing so, Gauss also contributed to the advancement of theories of gravitation. Moreover, Gauss contributed to the explosion of geomagnetic models: Barraclough (1978) found 264 spherical harmonic models of the geomagnetic field in the literature.
Finally, Kline (1972, pp. 522–531), in his book Mathematical thought from ancient to modern times, gave a long account of the work of Legendre and Laplace on potential theory without mentioning the work of Gauss, who nevertheless gave the definitive solution for magnetism and gravitation.
References

Barraclough D R (1978) Spherical harmonic models of the geomagnetic field. Geomagnetic Bulletin, Her Majesty's Stationery Office, London: 1-66
Burkhardt H and Meyer W F (1900) Potentialtheorie. Encyklopädie der Mathematischen Wissenschaften, Druck und Verlag von B G Teubner, Leipzig vol 2, pt 1, No 1: 466-502
Bühler W K (1981) Gauss, a Biographical Study. Springer-Verlag, Berlin
Chabert J L et al. (1993) Histoire d'algorithmes. Belin, Paris
Chapman S and Bartels J (1962) Geomagnetism. Oxford at the Clarendon Press Vol II
Clairaut A C (1743) Théorie de la figure de la terre: tirée des principes de l'hydrostatique. Gallica Internet site, Paris
Dunnington G W (1955) Carl Friedrich Gauss: Titan of Science. Hafner Publishing Co., New York
Esclangon E (1904-1916) Interpolation trigonométrique. Exposé d'après l'article de Burkhardt. Encyclopédie des sciences mathématiques, Jules Molk ed, tome II vol 5, Éditions Jacques Gabay, Paris
Fourier J B (1821-1822) Théorie du mouvement de la chaleur dans les corps solides. Mém. de l'Académie Royale Sci., tome V année 1826: 153-246
Garland G D (1979) The contributions of Carl Friedrich Gauss to geomagnetism. Historia Mathematica 6(1): 5-29
Gauss C F (1839) Allgemeine Theorie des Erdmagnetismus. In Resultate magn. Verein 1838. Werke, Band 5: 121-193
Godard R (2018) The convolution as a mathematical object. Research in History and Philosophy of Mathematics, M Zack and D Schlimm Editors, the CSHPM 2016 Annual Meeting in Calgary, Alberta, Birkhäuser Suisse: 199-212
Heideman M T, Johnson D H, and Burrus C S (1985) Gauss and the history of the Fast Fourier Transform. Arch Hist Exact Sci 34: 265-276
Kline M (1972) Mathematical thought from ancient to modern times. Oxford University Press, Oxford
Lagrange J L (1773) Sur l'équation séculaire. Œuvres, Gauthier-Villars, Paris (1867-1892) Vol. 6: 349
Lagrange J L (1798) Mémoire sur les sphéroïdes elliptiques. Nouveaux Mémoires de l'Acad. Royale Sci. et Belles-Lettres de Berlin, années 1792-1793: 652
Lambert A (1904-1916) Fonctions sphériques. Encyclopédie des sciences mathématiques, Tome II, vol 5, Jules Molk ed, Éditions Jacques Gabay, Paris
Laplace P S (1785) Théorie des attractions des sphéroïdes et de la figure des planètes. Mém de l'Académie des Sci, Paris, année 1782. Les Œuvres, 10: 341-419. Gallica Internet site
Laplace P S (1799) Traité de mécanique céleste. Duprat J B M, Paris. Reprinted by Éditions Jacques Gabay (2006) vol 1 and 2
Legendre A M (1785) Recherches sur l'attraction des sphéroïdes homogènes. Mémoires de mathématique et de physique ou Mém. des Sav. Étrangers, année 1782 10: 411-434. Biodiversity Heritage Library (BHL) Internet site
Legendre A M (1787) Recherches sur la figure des planètes. Mém. de l'Académie royale des Sciences, année 1784: 370-389. Biodiversity Heritage Library (BHL) Internet site
Legendre A M (1793) Suite des recherches sur la figure des planètes. Mém. de l'Acad. Royale des Sci. Paris, année 1789: 372-454. Biodiversity Heritage Library (BHL) Internet site
Lejeune-Dirichlet G (1837) Sur les séries dont le terme général dépend de deux angles, et qui servent à exprimer des fonctions arbitraires entre des limites données. J. de Crelle: 35-56
Poisson S D (1835) Théorie mathématique de la chaleur. Bachelier Imprimeur-Libraire, Paris
Rexer M and Hirt C (2015) Ultra-high degree surface spherical harmonics analysis using the Gauss-Legendre and the Driscoll/Healy quadrature theorem and application to planetary topography models of Earth, Mars and Moon. Surveys in Geophysics 36, 6: 803-830
Sheynin O B (2001) Carl Friedrich Gauss. Statisticians of the Centuries, C C Heyde and E Seneta, Eds: 119-122
Thébault E et al. (2015) International Geomagnetic Reference Field-The Twelfth Generation. Earth, Planets and Space 67: 79
Todhunter I (1873) A history of the mathematical theories of attraction and the figure of the earth. Macmillan and Co, London
A Gaussian Tale for the Classroom: Lemniscates, Arithmetic-Geometric Means, and More

Janet Heine Barnett
Abstract In a July 1798 entry in his mathematical diary, Gauss announced: On the lemniscate, we have found out the most elegant things exceeding all expectations and that by methods which open up to us a whole new field ahead.
Paving the way to the new field of elliptic integrals predicted by Gauss was an elegant relationship that he discovered between three particular numerical values: the ratio of the circumference of a circle to its diameter (π), a particular value of an elliptic integral associated with the lemniscate $\left(\int_0^1 \frac{1}{\sqrt{1-t^4}}\,dt\right)$, and the arithmetic-geometric mean of 1 and $\sqrt{2}$. As an example of the powerful role which analogy and numerical experimentation can play within mathematics, the tale of Gauss' path to these discoveries is one well worth sharing with today's students. This paper describes a set of three "mini-Primary Source Projects" based on excerpts from Gauss' mathematical diary and related manuscripts which are designed to tell that tale, while also serving to consolidate student proficiency with several standard topics studied in first-year calculus courses.
1 Introduction

In his Math Horizons article "Gaussian Guesswork" (Rice 2009), historian of mathematics Adrian Rice writes:

Before theorems are proved, conjectures must be made, and for that to happen, all kinds of experimentation, observation, invention and, indeed, imagination must come into play.
J. H. Barnett, Colorado State University-Pueblo, Pueblo, CO, USA
© Springer Nature Switzerland AG 2020. M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/Société canadienne d'histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_9
The historical story recounted in that article is a lovely example of how these ad hoc processes — experimentation, observation, invention, imagination — can lead to deep and important mathematical ideas. Rice also shares a modern proof of the theorem which Gauss ultimately proved as part of the answer to the question implied by the article’s subtitle: “why 1.19814023473559220744 . . . is such a beautiful number.” In this paper, we describe a collection of short (1–2 day) classroom projects that go back to primary sources written by Gauss and others as a means to bring the Gaussian tale told by Rice to first-year calculus classrooms.
2 A Tale in Three (Mini)-Acts: The Student Projects

Considered to be one of the most creative mathematicians of all time (alongside Archimedes and Newton), Carl Friedrich Gauss (1777–1855) kept a "mathematical diary" for nearly 20 years, from just before his 19th birthday in 1796 until July 1814. Among the discoveries that he recorded, there was the existence of a beautiful relationship between three particular numbers:

• the ratio of the circumference of a circle to its diameter (π);
• a particular value of an elliptic integral associated with the lemniscate; and
• a sophisticated form of average.

The journey of Gauss' discovery and eventual proof of this relationship provides the backdrop for three mini-Primary Source Projects (mini-PSPs) that share the same title¹:

• Gaussian Guesswork: Polar coordinates, Arc Length and the Lemniscate Curve
• Gaussian Guesswork: Sequences and the Arithmetic-Geometric Mean
• Gaussian Guesswork: Elliptic Integrals and Integration by Substitution

Designed to consolidate student proficiency with and understanding of the first-year calculus topics identified in their subtitles, each of these mini-PSPs follows the guided reading approach employed by the NSF-funded TRansforming Instruction in Undergraduate Mathematics via Primary Historical Sources (TRIUMPHS) project (Barnett et al. 2017). Within this approach, primary source excerpts are interspersed with mathematical tasks that students complete in order to unpack and extend the ideas in those excerpts.² The following subsections provide an overview of how this is done in these Gaussian Guesswork mini-PSPs completed to date.
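The relationship among the three numbers can be checked numerically. The sketch below assumes the form in which Gauss's result is usually stated, namely that the arithmetic-geometric mean of 1 and √2 equals π divided by twice the lemniscate integral $\int_0^1 dt/\sqrt{1-t^4}$. The function names `agm` and `lemniscate_integral`, the fixed iteration count and the trapezoid rule are illustrative choices, not taken from the mini-PSPs.

```python
import math

def agm(a, b, iterations=10):
    """Arithmetic-geometric mean: replace (a, b) by their arithmetic and
    geometric means repeatedly; the two sequences converge very quickly."""
    for _ in range(iterations):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0

def lemniscate_integral(n=64):
    """Integral of dt / sqrt(1 - t^4) from 0 to 1.
    The substitution t = sin(theta) turns it into the integral of
    1 / sqrt(1 + sin(theta)^2) over [0, pi/2], which has no singularity
    and is handled here by the trapezoid rule."""
    def f(theta):
        return 1.0 / math.sqrt(1.0 + math.sin(theta) ** 2)
    h = (math.pi / 2.0) / n
    interior = sum(f(k * h) for k in range(1, n))
    return h * (0.5 * f(0.0) + interior + 0.5 * f(math.pi / 2.0))

mean = agm(1.0, math.sqrt(2.0))
ratio = math.pi / (2.0 * lemniscate_integral())
```

Both `mean` and `ratio` agree with the value 1.19814023473559... quoted from Rice's article in the Introduction.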
1 A fourth mini-PSP entitled Gaussian Guesswork: Arc Length and the Numerical Approximation of Integrals is planned for later development, with a projected completion date of Summer 2020.
2 The guided reading approach to teaching and learning from primary source projects employed by TRIUMPHS emerged from work done under two prior NSF-funded projects. For more details about this approach and its evolution, see Barnett et al. (2014, 2016a,b).
2.1 A Mini-PSP on Polar Coordinates, Arc Length, and the Lemniscate Curve

The mini-PSP described in this subsection (Barnett 2018b) sets the stage with a brief look at an early investigation involving π, the most ancient of the three numbers in Gauss' trio. Archimedes, for instance, used circumscribed and inscribed regular polygons with 96 sides to establish the numerical estimate $3\frac{10}{71} \le \pi \le 3\frac{1}{7}$. In that same text, Measurement of a Circle (Archimedes), Archimedes also stated and proved the following:

Proposition 1 The area of any circle is equal to a right-angled triangle in which one of the sides about the right angle is equal to the radius, and the other base to the circumference [of the circle]. As quoted in (Dijksterhuis 1987, p. 222).
Letting r represent the radius of the circle and using the usual relationship between the circle's circumference and its radius, it is a straightforward task to translate Archimedes' statement into today's standard area computation formula for circles. After prompting students to do just that, the project remarks on two things that are worth noting in Archimedes' statement. First, it describes how we would construct a rectilineal geometrical figure (i.e., a triangle) which has the same area as the given curvilinear figure (i.e., a circle), rather than providing a computational formula of the type in common usage today. This type of construction is called a quadrature, or squaring, in keeping with the long-standing geometric tradition in which "finding the area" of a curvilinear figure literally meant constructing a polygon, often a square or quadrilateral, with the same area as the given figure. The second noticeable thing about Archimedes' description of how to "compute" the area of a circle is its interesting use of the circle's arc length (or perimeter). The problem of finding the arc length itself has historically been called rectification, from the Latin word "rectificare" which can be translated to mean "straighten."

Students next explore the integral $\int_0^t \frac{dx}{\sqrt{1-x^2}}$ from both a rectification (arc length) and a quadrature (area) perspective. In particular, they are prompted to set up the integral for the arc length (or rectification) of one quarter of the unit circle, and to interpret that integral as an area (or quadrature) problem. This latter process might be described as "reducing a rectification problem to a quadrature." In this case, the rectification question of finding the arc length of a quarter circle is actually far easier to think about than the quadrature problem of finding the area of the (unbounded!) region under the curve $y = \frac{1}{\sqrt{1-x^2}}$ on the interval (−1, 1). It would thus have made more sense to have reduced the (unbounded) quadrature problem to that of the (simpler) rectification problem. As noted in the PSP, this was precisely the sort of idea that led seventeenth-century mathematicians to become interested in the curve known as the lemniscate.
The mini-PSP continues with an excerpt from the paper³ in which the mathematician Jacob Bernoulli (1655–1705) christened the lemniscate with its name, in connection with his construction of another curve called the paracentric isochrone (Bernoulli 1694, p. 377).⁴

. . . because of this great desire, a curve of four dimensions is set up expressed by the equation $xx + yy = a\sqrt{xx - yy}$, and which around the axis . . . [2a] forms a shape resembling a figure eight lying on its side ∞, a ribbon folded into a knot, a lemniscus; in French, d'un nœud de ruban.
Bernoulli's reference in the preceding excerpt to "this great desire" was part of a methodological debate in which he was involved, concerning how best to solve integrals related to quadrature problems in general. Earlier in the same year that this paper was published, he had solved the same problem discussed in the excerpt by rectifying a different curve known as an elastica. Other geometers, including Bernoulli's younger brother Johann Bernoulli (1667–1748), criticized Jacob's earlier solution of the quadrature problem, not for its use of rectification (arc length), but because the elastica itself is a transcendental curve. The lemniscate, on the other hand, is an algebraic curve since its equation involves no transcendental functions. Quoting Jacob Bernoulli once more (Bernoulli 1694, p. 336):

It is a better [method] to employ a construction by rectification of an algebraic curve; for curves can be more quickly and accurately rectified, using a string or small chain wrapped around them, than areas can be squared.⁵
3 The English translation of the paper's full title is "Construction of a Curve with Equal Approach and Retreat, with the help of the rectification of a certain algebraic curve: Addendum to the June solution."
4 The paracentric isochrone is the curve with the property that a ball rolling down it approaches or recedes from a given point with uniform velocity. Although its construction did not require the finding of an area, it was viewed as a quadrature problem because the differential equation given by the physics involved led to the evaluation of an integral (as would a quadrature problem). The paracentric isochrone problem itself was originally posed by Gottfried Leibniz in 1689 as one of a series of problems involving balls rolling along curves with which geometers of the day challenged each other, as well as the power of the new calculus techniques. In addition to the two solutions published by Jacob Bernoulli in June and September of 1694, Johann Bernoulli (1667–1748) also independently solved it in that same year by rectifying the lemniscate. This incident of independent discovery, in which Jacob beat Johann to publication by just a month, along with the surrounding debate about proper methodology in which the two brothers participated on different sides of the argument, were factors in the increasingly uncivil sibling rivalry that existed between the two brothers.
5 Bernoulli's full view about the best method to use was more complicated than we have presented here, as was the debate in general among geometers of the time. For a thorough treatment of these issues, see (Blåsjö 2017, 160–167).

Of course, Bernoulli did not actually use string to find the arc length of the lemniscate, but instead used the differential techniques of his time to set up its integral. The mini-PSP does not follow his derivation directly, but instead includes a multi-part task that follows today's standard approach of studying the lemniscate using polar coordinates, which were only beginning to emerge in the late seventeenth century. It then moves ahead to the nineteenth century, and Gauss' interest in the elliptic integral which students will have derived for the arc length of the lemniscate. Between Bernoulli in the 1690s and Gauss' initial investigations, Leonhard Euler and other mathematicians continued to study both the lemniscate and the related elliptic integral for its arc length. From his notebooks, we know that Gauss was reading these works when he wrote his 51st diary entry⁶:
I have begun to investigate the elastic curve⁷ depending on $\int (1 - x^4)^{-1/2}\,dx$. January 8, 1797
At some later (unknown) date, Gauss crossed out the word “elastic” in this entry and wrote in the word “lemniscatic” in its place. In fact, a major motivation for Gauss in his study of the integral itself was the analogy that he saw between
• the integral $\int_0^t \frac{dx}{\sqrt{1 - x^4}}$ and its relation to the lemniscate; and
• the integral $\int_0^t \frac{dx}{\sqrt{1 - x^2}}$ and its relation to the unit circle.
Two particular facts about the unit circle integral were especially important to his guesswork:
1. $\pi = 2\int_0^1 \frac{dt}{\sqrt{1 - t^2}}$, with sin(u + 2π) = sin u for all u; and
2. If $u = \int_0^x \frac{dt}{\sqrt{1 - t^2}}$, then x = sin u.
A natural question that one might ask after pondering this list (at least, if one is Gauss!) is what happens with similar integrals? For instance, how might we complete the following sentence?
If $u = \int_0^x \frac{dt}{\sqrt{1 - t^4}}$, then x = . . .
Gauss’ response was to introduce a new non-elementary function, the lemniscatic sine, which he defined as the inverse of yet another non-elementary function with the latter function defined as an arc length integral (Gauss 1876a, p. 404):
We always denote the value of this integral $\int \frac{dx}{\sqrt{1 - x^4}}$ from x = 0 to x = 1 by $\tfrac{1}{2}\varpi$.8 We denote the variable x [when considered] with respect to [this] integral by the symbol sin lemn, but [when considered] with respect to the complement of [this] integral to $\tfrac{1}{2}\varpi$, by cos lemn. Therefore, whenever
\[ \operatorname{sin\,lemn}\left( \int \frac{dx}{\sqrt{1 - x^4}} \right) = x, \qquad \operatorname{cos\,lemn}\left( \tfrac{1}{2}\varpi - \int \frac{dx}{\sqrt{1 - x^4}} \right) = x \]
6 All excerpts from Gauss’ diary are taken from the English translation by Jeremy Gray that appears as an appendix in (Dunnington 2004, pp. 469–484).
7 This is the curve that Bernoulli called the “elastica.”
8 The symbol Gauss used here is a variant of the Greek letter π which is called “varpi.”
J. H. Barnett
The variable x can be considered as the radius vector of the curve, but also as the integral for the arc of the curve; the true curve will be that which is called [a] lemniscate.
Notice the explicit recognition here that the lemniscatic cosine function could then be defined by an analogous relationship with the circular trigonometric functions. The final section of the mini-PSP leads students through a guided exploration of this somewhat exotic excerpt. Although these particular non-elementary functions themselves are not part of the standard calculus curriculum, they provide an excellent opportunity for beginning students of calculus to apply and consolidate core concepts and techniques while witnessing their interplay within the context of some amazingly beautiful, and important, mathematics.
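The idea of defining a function as the inverse of an arc length integral can be tried out numerically. The following is only a minimal sketch, not part of the mini-PSP: the function names, the use of Simpson's rule, the bisection search and the cut-off 0.999 are all assumptions made for illustration. With exponent 2 the inverse should reproduce the ordinary sine; with exponent 4 it gives values of Gauss' sin lemn.

```python
import math

def arc_integral(x, power, steps=2000):
    """Simpson's-rule estimate of the integral of dt / sqrt(1 - t**power) from 0 to x (0 <= x < 1)."""
    f = lambda t: 1.0 / math.sqrt(1.0 - t**power)
    h = x / steps                      # steps is even, as Simpson's rule requires
    total = f(0.0) + f(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def inverse_arc(u, power, tol=1e-12):
    """Bisection: find x in [0, 0.999] with arc_integral(x, power) = u."""
    lo, hi = 0.0, 0.999                # hypothetical cut-off, stays clear of the singularity at x = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if arc_integral(mid, power) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

u = 0.5
print(inverse_arc(u, 2), math.sin(u))   # circular case: the two values should agree closely
print(inverse_arc(u, 4))                # lemniscatic case: a value of "sin lemn"
```

The sketch only handles arguments small enough that the inverse stays below the cut-off; values of u close to half of ϖ would need a more careful treatment of the endpoint singularity.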
2.2 A Mini-PSP on Sequences and the Arithmetic-Geometric Mean
The mini-PSP described in the previous subsection involves two of the three numerical values involved in the beautiful relationship that Gauss discovered in 1798:
• the ratio of a circle’s circumference to its diameter: $\pi = 2\int_0^1 \frac{dt}{\sqrt{1 - t^2}}$
• a particular value of an elliptic integral: $\varpi = 2\int_0^1 \frac{dt}{\sqrt{1 - t^4}}$
In the mini-PSP described in this subsection (Barnett 2017), the third numerical value of the trio takes center stage. The definition of this constant as a particular arithmetic-geometric mean again reveals a new and important role for another standard calculus concept: infinite sequences. Although it appears that Gauss discovered the arithmetic-geometric mean when he was only 14 years old9, he published very little about it during his lifetime.10 Much of what we know about his work in this area instead comes from his working notes which became known through the publication of his Nachlass only after his death. These notes include the manuscript of a paper which the mini-PSP itself follows closely, beginning with Gauss’ definition of two related infinite sequences (Gauss 1799, p. 361):
Let
\[ \begin{array}{l} a,\ a_1,\ a_2,\ a_3,\ \ldots \\ b,\ b_1,\ b_2,\ b_3,\ \ldots \end{array} \]
be two sequences11 of magnitudes formed by this condition: that the terms of either correspond to the mean between the preceding terms, and indeed, the terms of the upper sequence have the value of the arithmetic mean, and those of the lower sequence, the geometric mean, for example,
\[ a_1 = \tfrac{1}{2}(a + b),\quad b_1 = \sqrt{ab},\quad a_2 = \tfrac{1}{2}(a_1 + b_1),\quad b_2 = \sqrt{a_1 b_1},\quad a_3 = \tfrac{1}{2}(a_2 + b_2),\quad b_3 = \sqrt{a_2 b_2}. \]
But we suppose a and b to be positive reals and [that] the quadratic [square] roots are everywhere taken to be the positive values; by this agreement, the sequences can be produced so long as desired, and all of their terms will be fully determined and positive reals [will be] obtained.
9 Gauss reminisced about his 1791 discovery of this idea in a letter (Gauss 1816) that he wrote to his friend Schumacher much later, in 1816. Although his memory of the exact date may not be accurate, Gauss was certainly familiar with the arithmetic-geometric mean by the time he began his mathematical diary in 1796.
10 The algorithm for computing the arithmetic-geometric mean did appear in a 1785 paper by Lagrange on the calculation of integrals of the form $\int \frac{M\,dy}{\sqrt{(1 + p^2 y^2)(1 + q^2 y^2)}}$ (Lagrange 1785). However, Gauss was unaware of Lagrange’s earlier independent discovery, and Lagrange himself did not appear to realize the full potential of the arithmetic-geometric mean.
Gauss went on to give four specific examples of sequences (an) and (bn) defined by way of the arithmetic and geometric means as described above. The first of these (Gauss 1799, p. 363) provides the focus for the first student task, which we have included here as an exemplar of how the projects integrate primary source readings with student exercises:
Example 1: a = 1, b = 0.2
a  = 1.00000 00000 00000 00000 0    b  = 0.20000 00000 00000 00000 0
a1 = 0.60000 00000 00000 00000 0    b1 = 0.44721 35954 99957 93928 2
a2 = 0.52360 67977 49978 99964 1    b2 = 0.51800 40128 22268 36005 0
a3 = 0.52080 54052 86123 66484 5    b3 = 0.52080 78709 39876 24344 0
a4 = 0.52080 54052 86123 95414 3    b4 = 0.52080 16380 99375
a5 = 0.52080 16381 06187            b5 = 0.52080 16381 06187
Here a5, b5 differ in the 23rd decimal place.
Task 1 This task examines Example 1 from Gauss’ paper.
(a) Verify that the values given by Gauss in the previous excerpt are correct. Are you able to use your calculator to obtain the same degree of accuracy (21 decimal places!) that Gauss obtained by hand calculations?
(b) Write three observations about the two sequences in this example. Use a full sentence to state each of your observations.
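For readers who want to attempt part (a) of Task 1 with software rather than a calculator, here is a minimal sketch using Python's standard `decimal` module. The function name `agm_steps` and the working precision of 30 digits are assumptions chosen for illustration; the output is meant to be compared digit by digit with the values transcribed in Example 1.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30   # working precision: a few guard digits beyond 21 decimal places

def agm_steps(a, b, n):
    """Return the lists [a, a1, ..., an] and [b, b1, ..., bn] of arithmetic and geometric means."""
    a_terms, b_terms = [Decimal(a)], [Decimal(b)]
    for _ in range(n):
        a_prev, b_prev = a_terms[-1], b_terms[-1]
        a_terms.append((a_prev + b_prev) / 2)        # arithmetic mean of the previous pair
        b_terms.append((a_prev * b_prev).sqrt())     # geometric mean of the previous pair
    return a_terms, b_terms

a_terms, b_terms = agm_steps("1", "0.2", 5)
for k, (ak, bk) in enumerate(zip(a_terms, b_terms)):
    print(f"a{k} = {ak:.21f}   b{k} = {bk:.21f}")
```

Passing the starting values as strings keeps 0.2 exact in decimal arithmetic, which would not be the case with a binary floating-point literal.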
Following a second numerical example/task, the mini-PSP includes several tasks which prompt students to consider the following claims made by (Gauss 1799, p. 361):
I. If a = b, all of the terms of either sequence will be = a = b.
II. If however a, b are unequal, then $(a_1 - b_1)(a_1 + b_1) = \tfrac{1}{4}(a - b)^2$, whence it is concluded that $b_1 < a_1$, and also that $b_2 < a_2$, $b_3 < a_3$ etc., i.e. any term of the lower sequence will be smaller than the corresponding [term] of the upper. Wherefore, in this case, we suppose also that b < a.
III. By the same supposition it will be that $a_1 < a$, $b_1 > b$, $a_2 < a_1$, $b_2 > b_1$ etc.; therefore the upper sequence constantly decreases, and the lower constantly increases; thus it is evident that each [sequence] has a limit; these limits are conveniently expressed12 $a_\infty$, $b_\infty$.
11 Gauss himself used prime notation (i.e., a′, a″, a‴) to denote the terms of the sequence. In the project (and this paper), we instead use indexed notation (i.e., a1, a2, a3) in keeping with current notational conventions. To fully adapt Gauss’ notation to that used today, we could also write a0 = a and b0 = b.
In the multi-part task based on Property III, students are first asked to establish the monotonicity of the two sequences, and then to reflect upon Gauss’ assertion “. . . thus it is evident that each [sequence] has a limit.” After writing their own convincing explanation for why this conclusion must hold, the connection to what we today call the Monotone Convergence Theorem is brought out, and the importance of having a bounded sequence in order to apply this theorem is underscored. The mini-PSP’s reading of Gauss’ paper then culminates with one final excerpt, in which Gauss defined the arithmetic-geometric mean between a, b (Gauss 1799, pp. 361–362):
IV. Finally, from
\[ \frac{a_1 - b_1}{a - b} = \frac{a - b}{4(a_1 + b_1)} = \frac{a - b}{2(a + b) + 4b_1}, \]
it follows that $a_1 - b_1 < \tfrac{1}{2}(a - b)$, and in the same way, $a_2 - b_2 < \tfrac{1}{2}(a_1 - b_1)$ etc. Hence, it is concluded that $a - b$, $a_1 - b_1$, $a_2 - b_2$, $a_3 - b_3$ etc forms a strictly decreasing sequence and the limit itself is = 0. Thus $a_\infty = b_\infty$, i.e., the upper and lower sequences have the same limit, which always remains below the one and above the other. We call this limit the arithmetic-geometric mean between a and b, and denote it by M(a, b).
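Claim IV implies that the gap between the two sequences is at most half of the previous gap at every step. The sketch below tabulates the gap next to the bound obtained by repeated halving; it is an illustration only, and the function name `gap_table`, the 60-digit working precision and the choice of five steps are assumptions rather than anything taken from the project.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # generous working precision, so tiny gaps are not lost to rounding

def gap_table(a, b, n=5):
    """Return rows (k, a_k - b_k, (a - b) / 2**k) for k = 0..n, starting from a > b > 0."""
    a, b = Decimal(a), Decimal(b)
    first_gap = a - b
    rows = [(0, first_gap, first_gap)]
    for k in range(1, n + 1):
        a, b = (a + b) / 2, (a * b).sqrt()   # both means use the previous pair
        rows.append((k, a - b, first_gap / 2**k))
    return rows

for k, gap, bound in gap_table("1", "0.2"):
    print(k, f"{gap:.3E}", f"{bound:.3E}")
```

In the printed table the actual gap falls far below the halving bound after the first few steps, which is the rapid convergence that Gauss exploited in his hand computations.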
Once more, a multi-part task leads students through the algebraic steps of the proof that $a_n - b_n < \left(\tfrac{1}{2}\right)^n (a - b)$ for any arbitrary value n, and prompts them to reflect on how this shows that $\lim_{n \to \infty} (a_n - b_n) = 0$, thereby allowing Gauss to conclude that $a_\infty = b_\infty$. The concluding section of this particular project focuses on the fourth example of an arithmetic-geometric mean from Gauss’ paper (Gauss 1799, p. 364):
Example 4: a = √2, b = 1
a  = 1.41421 35623 73095 04880 2    b  = 1.00000 00000 00000 00000 0
a1 = 1.20710 67811 86547 52440 1    b1 = 1.18920 71150 02721 06671 7
a2 = 1.19815 69480 94634 29555 9    b2 = 1.19812 35214 93120 12260 7
a3 = 1.19814 02347 93877 20908 3    b3 = 1.19814 02346 77307 20579 8
a4 = 1.19814 02347 35592 20744 1    b4 = 1.19814 02347 35592 20743 9
12 Gauss’ use of superscripts ($a^\infty$, $b^\infty$) to denote the limiting values is again replaced with subscripts ($a_\infty$, $b_\infty$) throughout the project (and this paper).
Notice that the first 20 decimal places of the limit value for this example are precisely the number that appears in the subtitle of Adrian Rice’s Math Horizons paper. But why would Gauss or others have considered M(√2, 1) to be particularly beautiful? Gauss’ initial glimpse of this beauty was revealed by some guesswork based on the numerical values of M(√2, 1) and the other two numerical constants of this tale13:
π = 3.14159265358979323846 . . .
ϖ = 2.62205755429211981046 . . .
M(√2, 1) = 1.19814023473559220744 . . .
Perhaps you already see the relationship which Gauss discovered between these three constants; if not, pause for a moment to look for it (as students are asked to do in the project). Gauss’ own record of this discovery appeared in his mathematical diary:
We have established that the arithmetic-geometric mean between 1 and √2 is π/ϖ to 11 places; the proof of this fact will certainly open up a new field of analysis. May 30, 1799
It took Gauss another year to fully prove that his guesswork about the numerical relationship $M(\sqrt{2}, 1) = \frac{\pi}{\varpi}$ was correct. The third mini-PSP in this collection takes up the tale with a look at that proof.
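Gauss' numerical observation is easy to repeat. The sketch below is an illustration under stated assumptions: the 20-place decimal values of π and ϖ are standard values typed in by hand, and the helper name `agm`, the fixed count of six steps and the 40-digit precision are hypothetical choices.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40

# Standard 20-place values, entered by hand for this sketch.
PI = Decimal("3.14159265358979323846")
VARPI = Decimal("2.62205755429211981046")   # varpi = 2 * integral_0^1 dt / sqrt(1 - t^4)

def agm(a, b, steps=6):
    """Arithmetic-geometric mean of a and b after a fixed number of steps."""
    a, b = Decimal(a), Decimal(b)
    for _ in range(steps):
        a, b = (a + b) / 2, (a * b).sqrt()
    return a

m = agm(Decimal(2).sqrt(), 1)
print(m)            # arithmetic-geometric mean of sqrt(2) and 1
print(PI / VARPI)   # quotient of the two constants
```

Since the two constants are only given to 20 places, agreement between the printed values can be expected to roughly that many digits and no further.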
2.3 A Mini-PSP on Elliptic Integrals and Integration by Substitution
The third Gaussian Guesswork mini-PSP (Barnett 2018a) takes up the difficult question: how do we evaluate the (elliptic)14 integral $\int_0^1 \frac{1}{\sqrt{1 - x^4}}\,dx$? Gauss’ first idea for doing this is one that will naturally occur to students who have experience with trigonometric substitution:
\[ \ldots\ \ldots \qquad xx = \genfrac{}{}{0pt}{}{\sin}{\cos}\,\theta \qquad \int \frac{1}{\sqrt{\sin\theta}}\,d\theta = 2\int \frac{1}{\sqrt{1 - x^4}}\,dx. \]
January 7, 1797
13 Gauss approximated the value of $\varpi = 2\int_0^1 \frac{1}{\sqrt{1 - t^4}}\,dt$ using power series methods. These techniques will be the focus of the fourth Gaussian Guesswork mini-PSP described in footnote 1.
14 The terminology “elliptic integral” is introduced in all three mini-PSPs as the special cases of n = 3, n = 4 for integrals of the form $\int_0^x \frac{1}{\sqrt{1 - t^n}}\,dt$. For n ≥ 5, this integral is called hyperelliptic.
Gauss’ notation is a bit cryptic, as he seemed to be describing two different substitutions that would lead him to the result indicated. Students are first given the opportunity to verify the substitution $x^2 = \sin\theta$, both why it gives the result that Gauss recorded, and why it ultimately leads nowhere. Subsequently, they consider a slight variation which allowed Gauss to eventually succeed in evaluating the lemniscate arc length integral by showing that the substitution $x = \sin\theta$ transforms the original integral to $\int_0^{\pi/2} \frac{d\theta}{\sqrt{1 + \sin^2\theta}}$, which is then easily re-written as $\int_0^{\pi/2} \frac{d\theta}{\sqrt{\cos^2\theta + 2\sin^2\theta}}$. Although this last integral may not seem promising at first glance, the mini-PSP then works through Gauss’ treatment of the integral $\int_0^{\pi/2} \frac{d\theta}{\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}}$ to see how he was eventually able to evaluate it.
As you may have already surmised, the key to Gauss’ treatment of integrals of the form $\int_0^{\pi/2} \frac{d\theta}{\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}}$ is the arithmetic-geometric mean! Since this mini-PSP focuses on integration techniques (versus sequences), it treats the arithmetic-geometric mean only lightly. Indeed, although his Nachlass contains fairly extensive notes about the arithmetic-geometric mean and its properties, Gauss’ published works mentioned it only once, and then only briefly, in his important astronomical paper on the gravitational attraction of planets (Gauss 1818, p. 352):
Let m, n be two positive quantities, and set15
\[ m_1 = \tfrac{1}{2}(m + n), \qquad n_1 = \sqrt{mn} \]
so that $m_1$, $n_1$ represent the arithmetic mean and the geometric mean, respectively, of m and n. The geometric mean will always be taken to be positive. Similarly set
\[ m_2 = \tfrac{1}{2}(m_1 + n_1), \qquad n_2 = \sqrt{m_1 n_1} \]
\[ m_3 = \tfrac{1}{2}(m_2 + n_2), \qquad n_3 = \sqrt{m_2 n_2} \]
and so on, by which manner [are obtained] the sequences m, m1, m2, m3, etc., and n, n1, n2, n3, etc., converging rapidly to a common limit, which we denote μ, and call simply the arithmetic-geometric mean between m and n.
Just after his definition of μ = μ(m, n) as the arithmetic-geometric mean of the numbers m and n, Gauss stated the theorem and proof which form the centerpiece of this mini-PSP (Gauss 1818, pp. 352–353):
Now we shall demonstrate, $\frac{1}{\mu}$ to be the value of the integral
\[ \int \frac{d\theta}{2\pi\sqrt{mm\cos^2\theta + nn\sin^2\theta}} \]
from θ = 0 extended to θ = 2π.16
Proof We suppose the variable θ is expressed by another variable θ1, so that
\[ \sin\theta = \frac{2m\sin\theta_1}{(m + n)\cos^2\theta_1 + 2m\sin^2\theta_1} \]
[where] it is easily observed that while θ1 is increased from 0 to $\frac{\pi}{2}$, $\pi$, $\frac{3\pi}{2}$, $2\pi$, θ also (although by different intervals) increases from 0 to $\frac{\pi}{2}$, $\pi$, $\frac{3\pi}{2}$, $2\pi$. The expansion duly performed, it is found to be that
\[ \frac{d\theta}{\sqrt{mm\cos^2\theta + nn\sin^2\theta}} = \frac{d\theta_1}{\sqrt{m_1 m_1\cos^2\theta_1 + n_1 n_1\sin^2\theta_1}} \]
and indeed the values of the integrals
\[ \int \frac{d\theta}{2\pi\sqrt{mm\cos^2\theta + nn\sin^2\theta}}, \qquad \int \frac{d\theta_1}{2\pi\sqrt{m_1 m_1\cos^2\theta_1 + n_1 n_1\sin^2\theta_1}} \]
if each of the variables is extended continuously from the value 0 to the value 2π, equal to each other. And if this [process] is permitted to continue further, clearly these values also are equal to the integral value
\[ \int \frac{d\theta}{2\pi\sqrt{\mu\mu\cos^2\theta + \mu\mu\sin^2\theta}} \]
from θ = 0 to θ = 2π, which evidently becomes = $\frac{1}{\mu}$.
15 Gauss’ use of prime notation (i.e., m′, m″, m‴) to denote the terms of these sequences is again replaced here by indexed notation (i.e., m1, m2, m3).
Gauss’ idea of setting up the substitution $\sin\theta = \frac{2m\sin\theta_1}{(m + n)\cos^2\theta_1 + 2m\sin^2\theta_1}$ certainly gives us some insight into why he is considered a mathematical genius! After (twice!) reading this proof in Gauss’ own words, students complete a series of tasks designed to help them first interpret its meaning (and power), and then to work through the details of the substitution at the core of the proof. This latter task is carefully laid out as a sequence of manipulations that offer one path for showing how the “expansion duly performed” gives the following:
\[ \frac{d\theta}{\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}} = \frac{d\theta_1}{\sqrt{m_1^2\cos^2\theta_1 + n_1^2\sin^2\theta_1}} \qquad (\ast) \]
16 Gauss’ use of degree notation for the limits of integration has been replaced by radians throughout the project (and this paper), in order to reduce the potential for unnecessary confusion on the part of students and instructors.
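Before working through the algebra, the statement of Gauss' theorem can be tested numerically. The following sketch is not taken from the project: the function names, the tolerance and the use of an equally spaced average of the integrand (the trapezoid rule for a periodic function) are assumptions made for illustration. It compares the averaged integral with 1/μ, and also checks that replacing (m, n) by (m1, n1) leaves the integral unchanged.

```python
import math

def agm(m, n, tol=1e-15):
    """Arithmetic-geometric mean of positive floats m and n."""
    while abs(m - n) > tol * max(m, n):
        m, n = (m + n) / 2.0, math.sqrt(m * n)
    return (m + n) / 2.0

def mean_integral(m, n, samples=2000):
    """Estimate (1 / 2pi) * integral over [0, 2pi] of d(theta) / sqrt(m^2 cos^2 + n^2 sin^2)
    by averaging the integrand at equally spaced angles."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += 1.0 / math.sqrt((m * math.cos(theta))**2 + (n * math.sin(theta))**2)
    return total / samples

m, n = 3.0, 1.0
m1, n1 = (m + n) / 2.0, math.sqrt(m * n)      # one arithmetic-geometric step
print(mean_integral(m, n), mean_integral(m1, n1), 1.0 / agm(m, n))
```

The three printed numbers should agree to many digits; this is numerical evidence only and does not replace the proof.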
The particular path followed in this task is based on details provided by Jacobi in his celebrated 1829 work on elliptic functions (Jacobi 1829); Gauss himself left no record of his reasoning. We summarize the major steps of Jacobi’s reconstruction, as they appear in the project, in the next paragraph.17 An in-class worksheet that guides students through the derivations in this task and offers helpful suggestions at key junctures is also included with the Notes to Instructors for the project.
The key to setting up the new integrand in θ1 in equation (∗) is to find two different (but equal) expressions in θ1 for the differential d(sin θ). The initial steps of this process employ the (original) constants m, n from the dθ integral throughout. To begin, the Pythagorean Identity is used to rewrite Gauss’ given substitution for sin θ as follows:
\[ \sin\theta = \frac{2m\sin\theta_1}{(m + n) + (m - n)\sin^2\theta_1}. \]
Using the quotient rule to differentiate the right-hand side of this expression and simplifying gives the following:
Differential Equality #1:
\[ \cos\theta\,d\theta = \frac{2m\cos\theta_1\left[(m + n) - (m - n)\sin^2\theta_1\right]}{\left[(m + n) + (m - n)\sin^2\theta_1\right]^2}\,d\theta_1 \]
Setting aside the right-hand side of this equality for now, another application of the Pythagorean Identity and some additional algebra (the details of which are sketched out for the students within the task) gives the following expression for $\cos^2\theta$:
\[ \cos^2\theta = \frac{\cos^2\theta_1\left[(m + n)^2 - (m - n)^2\sin^2\theta_1\right]}{\left[(m + n) + (m - n)\sin^2\theta_1\right]^2}. \]
Taking a square root here gives an expression for cos θ that can be substituted into the left-hand side of Differential Equality #1, giving the following quite complicated looking equality:
Differential Equality #2:
\[ \frac{\cos\theta_1\sqrt{(m + n)^2 - (m - n)^2\sin^2\theta_1}}{(m + n) + (m - n)\sin^2\theta_1}\,d\theta = \frac{2m\cos\theta_1\left[(m + n) - (m - n)\sin^2\theta_1\right]}{\left[(m + n) + (m - n)\sin^2\theta_1\right]^2}\,d\theta_1 \]
17 If you enjoy algebraic challenges, try your hand at finding your own path for carrying out this substitution before reading further!
Before taking this step, however, the next part of the task guides students through the algebra needed to rewrite the expression $\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}$ in terms of sin θ1, with the following end result:
\[ \sqrt{m^2\cos^2\theta + n^2\sin^2\theta} = m\,\frac{(m + n) - (m - n)\sin^2\theta_1}{(m + n) + (m - n)\sin^2\theta_1} \]
Combining this with Differential Equality #2 (where the task guides students through doing this in two steps) allows us to then conclude the following key fact:
\[ \sqrt{(m + n)^2 - (m - n)^2\sin^2\theta_1}\;d\theta = 2\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}\;d\theta_1 \]
Some slight rearrangement then gives the following, which is quite close to Gauss’ equation (∗):
Differential Equality #3:
\[ \frac{d\theta}{\sqrt{m^2\cos^2\theta + n^2\sin^2\theta}} = \frac{2\,d\theta_1}{\sqrt{(m + n)^2 - (m - n)^2\sin^2\theta_1}} \]
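The chain of manipulations leading to Differential Equality #3 can be sanity-checked numerically at a few sample angles. In the sketch below, which is an illustration and not part of the project, θ is computed from θ1 through Gauss' substitution, the derivative of θ with respect to θ1 is approximated by a central difference, and the two sides of Differential Equality #3 (each divided by dθ1) are compared. The function name, the step size h and the sample values are all assumptions.

```python
import math

def check_substitution(m, n, theta1, h=1e-6):
    """Return (left, right) sides of Differential Equality #3, divided by d(theta1),
    at one angle theta1 strictly between 0 and pi/2."""
    def theta_of(t1):
        s = math.sin(t1)
        # Gauss' substitution, in the form rewritten with the Pythagorean Identity
        return math.asin(2 * m * s / ((m + n) + (m - n) * s * s))

    theta = theta_of(theta1)
    # central-difference estimate of d(theta)/d(theta1)
    dtheta = (theta_of(theta1 + h) - theta_of(theta1 - h)) / (2 * h)
    left = dtheta / math.sqrt((m * math.cos(theta))**2 + (n * math.sin(theta))**2)
    right = 2 / math.sqrt((m + n)**2 - ((m - n) * math.sin(theta1))**2)
    return left, right

for t1 in (0.3, 0.7, 1.1):
    print(t1, check_substitution(3.0, 1.0, t1))
```

Because a finite difference is used, the two sides are expected to agree only to about seven or eight digits; angles very close to π/2 are avoided since the inverse sine is poorly conditioned there.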
The final step of the task prompts students to use the definitions of m1, n1 as the arithmetic mean and geometric mean of m, n respectively in order to show that $4(m_1^2\cos^2\theta_1 + n_1^2\sin^2\theta_1) = (m + n)^2 - (m - n)^2\sin^2\theta_1$, thereby allowing them to obtain Gauss’ equation (∗) by bringing the (new) constants m1, n1 into the dθ1 integrand of Differential Equality #3.
To bring the mini-PSP to closure, students are reminded (from an earlier project task) that $\int_0^{\pi/2} \frac{d\theta}{\sqrt{\cos^2\theta + 2\sin^2\theta}} = \int_0^1 \frac{dx}{\sqrt{1 - x^4}}$, where $\varpi = 2\int_0^1 \frac{1}{\sqrt{1 - t^4}}\,dt$. Rearranging this a bit allows us to then conclude (along with Gauss) that $\mu(1, \sqrt{2}) = \frac{\pi}{\varpi}$ . . . a rather surprising connection between three apparently unrelated numbers!! Although the proof was published only in 1818, we know from Gauss’ diary that he completed a proof confirming his guesswork about this numerical relationship about a year after he discovered the relationship. And, as he predicted, this discovery went well beyond just this one numerical relationship. The theorem established in this mini-PSP, for instance, allows us to numerically estimate any integral of the form $\int_a^b \frac{d\theta}{2\pi\sqrt{mm\cos^2\theta + nn\sin^2\theta}}$ quite simply, by making use of the very rapid convergence of the arithmetic-geometric mean.18 Beyond this, the “new field of analysis” that opened up in connection with this proof led him well beyond the study of elliptic functions of a single real-valued variable, and into the realm of functions of several complex-valued variables.19 Today, the special class of such functions known as the “theta functions” provides another powerful tool that is used in a wide range of applications throughout mathematics.
18 The arithmetic-geometric mean is also used today to construct fast algorithms for calculating values of elementary transcendental functions and some classical constants, like π.
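To give a feel for how much the arithmetic-geometric mean simplifies numerical estimation, the sketch below computes ϖ in two ways: as π divided by the arithmetic-geometric mean of 1 and √2, and by applying a plain midpoint rule directly to the defining integral. Everything here is an illustrative assumption (the function names, the tolerance, the choice of the midpoint rule and the sample counts); the point is only to compare how quickly each approach settles down.

```python
import math

def agm_with_count(m, n, tol=1e-15):
    """Return (arithmetic-geometric mean of m and n, number of steps used)."""
    steps = 0
    while abs(m - n) > tol * max(m, n):
        m, n = (m + n) / 2.0, math.sqrt(m * n)
        steps += 1
    return (m + n) / 2.0, steps

def varpi_midpoint(samples):
    """Midpoint-rule estimate of 2 * integral_0^1 dt / sqrt(1 - t^4)."""
    h = 1.0 / samples
    return 2.0 * h * sum(1.0 / math.sqrt(1.0 - ((k + 0.5) * h)**4) for k in range(samples))

mu, steps = agm_with_count(1.0, math.sqrt(2.0))
print("varpi via AGM:", math.pi / mu, "after", steps, "steps")
for samples in (10, 1000, 100000):
    print("midpoint rule,", samples, "samples:", varpi_midpoint(samples))
```

The midpoint rule converges slowly here because the integrand is unbounded at the right endpoint, whereas the arithmetic-geometric mean reaches double precision within a handful of steps.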
3 Some Concluding Remarks: Bringing This Tale to the Classroom
The three mini-PSPs described in this paper are designed for use in a standard first-year calculus course. Each may be used alone or in conjunction with either or both of the others. Each tells a portion of the tale in some detail, and provides enough of a glimpse of the remaining ideas to give students a feel for the complete story. LaTeX code of each project is available from the author by request to facilitate preparation of reading guides or “in-class task sheets” based on the project tasks. The PSPs themselves can also be modified by instructors as desired to better suit their goals for the course. Their classroom implementation may be accomplished through individually assigned work, small group work, and/or whole class discussion; a combination of these instructional strategies is recommended in order to take advantage of the variety of questions included in the projects. To reap the full mathematical benefits offered by the PSP approach, students should also be required to read assigned sections in advance of in-class work, and to work through primary source excerpts together in small groups in class. In-class small group discussion is especially recommended for the more complicated algebraic manipulations of the mini-PSP on elliptic integrals and integration by substitution. More detailed implementation advice is provided in the Notes to Instructors that accompany each of the mini-PSPs.
Too often, students enrolled in introductory calculus courses fall into the habit of mundanely practicing techniques disconnected from the wonders of the mathematical discoveries in which those techniques first emerged. The Gaussian tale told in the three mini-PSPs described in this paper offers one antidote to that danger.
In addition to allowing students to witness how today’s standard concepts and techniques emerged as part of authentic mathematical discoveries, these projects also offer them opportunities for experimentation, observation, invention and, indeed, imagination of their own. Although few (if any) will rise to the level of Gauss’ genius for such guesswork, these are the very processes that make new mathematics possible today. How better to raise burgeoning mathematicians than to give them the chance to experience these same processes first hand at an early age?
19 For details about Gauss’ work on the arithmetic-geometric mean within the complex domain, see
(Cox 1984). Cox also treats Gauss’ early work on the real-valued case, and explores some of the pre-Gaussian history of these ideas.
Fig. 1 Diagram showing the lemniscate and the paracentric isochrone (Bernoulli, December 1695)
Fig. 2 Sketch of the lemniscate from Gauss’s notebook, from Gauss (1876b)
Acknowledgements The development of this paper has been partially supported by the National Science Foundation’s Improving Undergraduate STEM Education Program under Grant No. 1523494. Any opinions, findings, and conclusions or recommendations expressed in this project are those of the author and do not necessarily reflect the views of the National Science Foundation. The author wishes to thank George W. Heine III for technical assistance with the Latin-to-English translations of the primary source excerpts that appear in this paper and with the reproduction of the images that appear in Figs. 1 and 2. She is also immensely grateful to Adrian Rice for the inspiration that his Math Horizons paper (Rice 2009) provided.
References
Archimedes. Measurement of a Circle. In: Great Books of the Western World, pages 447–451. The Franklin Library, Pennsylvania, 1985. English translation by T. L. Heath
Barnett J (2017) Gaussian Guesswork: Sequences and the Arithmetic-Geometric Mean (Mini-Primary Source Project), available at https://blogs.ursinus.edu/triumphs/ (accessed September 15, 2018)
Barnett J (2018a) Gaussian Guesswork: Elliptic Integrals and Integration by Substitution (Mini-Primary Source Project), available at https://blogs.ursinus.edu/triumphs/ (accessed September 15, 2018)
Barnett J (2018b) Gaussian Guesswork: Polar Coordinates, Arc Length and the Lemniscate Curve (Mini-Primary Source Project), available at https://blogs.ursinus.edu/triumphs/ (accessed September 15, 2018)
Barnett JH, Lodder J, Pengelley D (2014) The Pedagogy of Primary Historical Sources in Mathematics: Classroom Practice Meets Theoretical Frameworks. Science and Education 23:7–27. https://doi.org/10.1007/s11191-013-9618-1
Barnett JH, Bezhanishvili B, Lodder J, Pengelley D (2016a) Teaching Discrete Mathematics Entirely From Primary Historical Sources. PRIMUS 26(7):657–675. https://doi.org/10.1080/10511970.2015.1128502
Barnett JH, Lodder J, Pengelley D (2016b) Teaching and Learning Mathematics From Primary Historical Sources. PRIMUS 26(1):1–18. https://doi.org/10.1080/10511970.2015.1054010
Barnett JH, Clark K, Klyve D, Lodder J, Otero D, Scoville N, White D (June 2017) A Series of Mini-projects from TRIUMPHS: TRansforming Instruction in Undergraduate Mathematics via Primary Historical Sources. Convergence
Bernoulli J (December 1695) Explicationes, annotationes et additiones ad ea quæ in Actis superiorum annorum de Curva Elastica, Isochrona Paracentrica, & Velaria, hinc inde memorata, & partim controversa leguntur; ubi de Linea mediarum directionum, aliisque novis [Explanations, notes and additions to that in the Acts of the preceding year about the Elastic, Paracentric Isochrone and Velara Curves, thence from this recounted, the controversial part read, where concerning the line of the middle directions]. Acta Eruditorum pp 537–553, also in Opera Omnia, Volume V.1, pp. 639–662
Bernoulli J (September 1694) Solutio Constructio Curvæ Accessus & Recessus æquabilis, ope rectificationis curvæ cujusdam algebraicæ: addenda nuperæ Solutioni mensis Junii [Construction of a Curve with Equal Approach and Retreat, with the help of the rectification of a certain algebraic curve: addenda to the June Solution]. Acta Eruditorum pp 336–338, also in Opera Omnia, Volume 1, pp. 608–612
Blåsjö V (2017) Transcendental Curves in the Leibnizian Calculus. Studies in the History of Mathematical Enquiry, Academic Press, series edited by Umberto Bottazzini
Cox D (1984) The Arithmetic-Geometric Mean of Gauss. L’Enseignement Mathématique 30:275–330
Dijksterhuis EJ (1987) Archimedes, with a new bibliographic essay. Princeton Legacy Library, Princeton, English translation by W. Knorr
Dunnington GW (2004) Carl Friedrich Gauss: Titan Of Science. The Mathematical Association of America, Washington DC, reprint of original 1955 publication. Includes the English translation of Gauss’ Diary by Jeremy Gray (pp. 469–484)
Gauss CF (1799) Arithmetisch Geometrisches Mittel [Arithmetic Geometric Mean]. In: Werke, vol III, Konigliche Gesellschaft der Wissenschaft, Göttingen, pp 361–432
Gauss CF (1816) April 1816 Letter from Gauss to Schumacher (in German). In: Werke, volume X:1, pages 247–248. Konigliche Gesellschaft der Wissenschaft, Göttingen, 1917
Gauss CF (1818) Determinatio attractionis, quam in punctum quodvis positionis datae exerceret planeta, si eius massa per totam orbitam ratione temporis, quo singulae partes describuntur, uniformiter esset dispertita [Determination of the Attraction, which a planet exerts on any point, if its mass is distributed uniformly through the time of the orbit]. In: Brendel M (ed) Werke, vol III, Gedruckt in der Dieterichschen universitätsdruckerei, Göttingen, pp 333–355
Gauss CF (1876a) Elegantiores Integralis $\int \frac{dx}{\sqrt{1 - x^4}}$ Proprietates [Very Excellent Properties of the Integral $\int \frac{dx}{\sqrt{1 - x^4}}$]. In: Brendel M (ed) Werke, vol III, Gedruckt in der Dieterichschen universitätsdruckerei, Göttingen, pp 404–412
Gauss CF (1876b) Teilung der Lemniskate [Division of the Lemniscate]. In: Brendel M (ed) Werke, vol X.1, Gedruckt in der Dieterichschen universitätsdruckerei, Göttingen, pp 160–163
Jacobi CGJ (1829) Fundamenta nova theoriae functionum ellipticarum [New foundations of the theory of elliptic functions]. Borntraeger, Königsberg, also in Gesammelte Werke, G. Reimer, Berlin, 1881, pp. 49–239
Lagrange JL (1785) Sur une nouvelle méthode de calcul intégral pour les différentielles affectées d’un radical carré sous lequel la variable ne passe pas le quatrième degré [On a new method of integral calculus for differentials involving a square root under which the variable does not surpass the fourth degree]. Mémoires de l’Académie royale des Sciences de Turin Tome II (1784–1785), pp. 218–290, also in Œuvres Complètes, Volume 1, Gauthier-Villars, Paris, 1868, pp. 253–312
Rice A (November 2009) Gaussian Guesswork, or why 1.19814023473559220744 . . . is such a beautiful number. Math Horizons pp 12–15
Philippe de la Hire: Was He Desargues’ Schüler? Christopher Baltus
Abstract Philippe de la Hire (1640–1718) was the third of the seventeenth century pioneers of projective geometry, after Girard Desargues (1591–1661) and Blaise Pascal (1623–1662). We know very little about La Hire beyond what he tells us in his various published works and what Bernard de Fontenelle reported in his Eloge, issued soon after La Hire’s death. That vacuum of information has been filled with misinformation and speculation. Beyond the annoying falsehoods, there is the very real issue of the degree of influence of Desargues in La Hire’s geometry, especially his Nouvelle méthode en Géométrie pour les Sections des Superficies coniques et Cylindriques of 1673. La Hire’s originality has been questioned, beginning, apparently, soon after the 1673 publication, reviving in the late nineteenth century after publication of Desargues’ long lost work of 1639 and, again, in the writing of eminent scholar René Taton around 1950. Much of the discussion of originality has centered on the availability of Desargues’ booklet to La Hire, rather than an examination of the work itself. The claim of this study is that comparison of the work of Desargues and La Hire shows that Desargues’ influence was minimal.
1 Introduction Richard Westfall ends the short entry for Philippe de la Hire on the Galileo website Westfall (2018): This extensive bibliography is misleading. It records my effort to find something about him beyond the small budget of information in Fontenelle’s Éloge, which is apparently the source of every biographical treatment of La Hire. There is an extraordinary dearth of material on this important man.
C. Baltus () SUNY Oswego, Oswego, NY, USA e-mail: [email protected] © Springer Nature Switzerland AG 2020 M. Zack, D. Schlimm (eds.), Research in History and Philosophy of Mathematics, Proceedings of the Canadian Society for History and Philosophy of Mathematics/ Société canadienne d’histoire et de philosophie des mathématiques, https://doi.org/10.1007/978-3-030-31298-5_10
C. Baltus
This vacuum on the life and work of Philippe de la Hire (1640–1718) has been filled with speculation, much improbable. Although we have no evidence that La Hire did any mathematics before the year of Desargues’ death, 1661, or that the two ever met, W.W. Rouse Ball wrote that La Hire was Desargues’ “favorite pupil” (Ball 1960, p. 317). And Wikipedia, 2018, has Upon his return to Paris [in 1664], [La Hire] became a disciple of Girard Desargues from whom he learned geometrical perspective. Wikipedia (2018)
Beyond unfounded statements, as illustrated above, there is an important question about the originality of La Hire’s first major work on the conic sections, the Nouvelle méthode (La Hire 1673) of 1673. Girard Desargues had produced a bold work (Desargues 1639) on conic sections, employing projective methods, in 1639. It was a short book in 50 copies, which all seem to have disappeared within 20 years. La Hire also employed projective methods. Further, both Desargues and Philippe de la Hire had collaborated with the engraver Abraham Bosse (1604–1676), and Philippe’s father was a friend of both Bosse and Desargues. Further still, in the nineteenth century, a handwritten transcript, dated 1679, by La Hire, of Desargues (1639) was found. La Hire claimed to have read Desargues’ work only in 1679. But did La Hire study Desargues’ work before 1673? Scholar René Taton (1915–2004) thought the answer was “yes.” Among various articles in which he questioned La Hire’s originality is the entry on La Hire for the Dictionary of Scientific Biography (Taton 1970). His judgment has been widely followed in online information. The MacTutor biography of La Hire (MacTutor 2018) simply quotes Taton’s article. On the other hand, the one broad study of La Hire’s mathematics, an unpublished thesis by Zbynek Sir, from 2002, does not support Taton (Zbynek 2002, p. 227). We argue that the goals and methods of La Hire’s Nouvelle méthode, of 1673, are very different from those of Desargues’ 1639 (Desargues 1639). Our argument centers on a detailed comparison of the handling of the pole and polar concept, perhaps the most central topic of projective geometry. We conclude that La Hire’s claim to not have seen Desargues’ mathematical work before 1679 is plausible. Section 2 provides background, starting with the Conics of Apollonius, on the pole/polar concept and the related concept of harmonic conjugates. 
Sections 3, 4, and 5 provide background on Girard Desargues and Philippe de la Hire and their work. Section 6, the longest, is the detailed comparison of development of the pole/polar concept by La Hire and Desargues. Then we are ready to turn, again, in Sect. 7, to a history of arguments, over the centuries, for Desargues’ influence on La Hire.
2 A Little Background from Apollonius

The Conics of Apollonius was written, about 200 BC, in the definition–proposition–proof form found in Euclid’s Elements. As with Euclid, the initial propositions of the Conics were already well established. Due to the high quality of the work, and
Philippe de la Hire: Was He Desargues’ Schüler?
good fortune, the Conics survived into the European renaissance, while other ancient works on conic sections were lost. Apollonius defined a conic surface as that traced out by a line, extending indefinitely in both directions, turning on a point, the vertex, A, and traveling along a base circle (not coplanar with the vertex). The intersection, or section, of a plane with this surface was called a conic section, although planes on the vertex or parallel to the base circle were excluded. When the slicing plane meets the base plane (plane of the base circle) in line FG, and BC is the diameter of the base circle perpendicular to FG, then the line ED, which is the intersection of the slicing plane and the plane ABC, is a diameter of the conic in that all the chords of the conic section which are parallel to FG are bisected by ED. See Fig. 1 Left. These chords are the ordinates corresponding to diameter ED. E and D, the intersections of the slicing plane with AB and AC, are vertices of the conic section corresponding to diameter ED, and the midpoint of ED is the center of the conic section. (When the slicing plane is parallel to AB or AC, there is only one vertex and no center, and the conic is called a parabola.) Apollonius spent much of Book 1 showing that any other chord on the center was also a diameter; in the case of the parabola, any parallel to the diameter found as above is another diameter.

Book 1 covers the relation of a point, A, and a line, LK, now called the polar of A with respect to a given conic. In the simplest case, let A lie outside the conic, and let the tangents from A meet the conic in K and L. See Fig. 1 Right. Then KL is the polar of A, and A is the pole of KL. (Note that these names are from the early nineteenth century.) When KL ∩ ED = M, then Apollonius showed (Apollonius of Perga 1696, Book 1 Prop 34, 36; Book 3 Prop 37) that

\[ \frac{AE}{AD} = \frac{ME}{MD}, \]

abbreviated H(AM, ED). Further, if a line on A meets the conic in H and N, and meets LK in G, then H(AG, HN).
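The harmonic property of a secant through the pole can be checked numerically in the special case of a circle. The following is only a sketch in modern coordinates, not Apollonius’ argument: the unit circle, the pole placed at (a, 0), and the function name `polar_harmonic_check` are assumptions made for the illustration. It returns the signed cross-ratio of H, N with respect to A, G, which equals −1 exactly when H(AG, HN).

```python
from fractions import Fraction as Fr

def polar_harmonic_check(a, h):
    """Unit circle x^2 + y^2 = 1 with pole A = (a, 0), a > 1, and polar x = 1/a.

    h is a rational point H on the circle that is not a point of tangency.
    The secant AH meets the circle again at N and meets the polar at G.
    Returns the signed cross-ratio of (H, N) with respect to (A, G).
    """
    a = Fr(a)
    hx, hy = Fr(h[0]), Fr(h[1])
    assert hx * hx + hy * hy == 1, "h must lie on the unit circle"
    # Points of the secant are A + t*(vx, vy); A is t = 0 and H is t = 1.
    vx, vy = hx - a, hy
    t_a, t_h = Fr(0), Fr(1)
    # The two circle parameters multiply to (|A|^2 - 1)/|v|^2 (Vieta), and t_h = 1.
    t_n = (a * a - 1) / (vx * vx + vy * vy)
    # G is the point of the secant whose x-coordinate is 1/a.
    t_g = (1 / a - a) / vx
    return ((t_h - t_a) / (t_n - t_a)) / ((t_h - t_g) / (t_n - t_g))

print(polar_harmonic_check(2, (Fr(3, 5), Fr(4, 5))))  # -1
```

Exact rational arithmetic is used so that the value −1 is obtained without rounding; a value of −1 is the signed form of the equality AH/AN = GH/GN.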
Fig. 1 Left: from Conics of Apollonius, 1696 ed. Right: Pole and polar property
C. Baltus
Definition (La Hire, 1673) Let A, E, M, D be collinear points with exactly one of A and M between E and D. Then A and M are harmonic conjugates of E and D if

\[ \frac{AE}{AD} = \frac{ME}{MD}, \]

abbreviated H(AM, ED).
When the order is clear, we refer to {A, E, M, D} as a harmonic set.
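A small worked example may make the definition concrete. The sketch below assumes the four points are given by coordinates on a number line; the function names `harmonic_conjugate` and `is_harmonic` are hypothetical and chosen only for this illustration. The closed formula comes from solving AE/AD = ME/MD for M with A and M on opposite sides of the segment boundary.

```python
from fractions import Fraction

def harmonic_conjugate(a, e, d):
    """Coordinate of the point M on the number line for which H(AM, ED)."""
    a, e, d = Fraction(a), Fraction(e), Fraction(d)
    denom = e + d - 2 * a
    if denom == 0:
        raise ValueError("A is the midpoint of ED; its conjugate is at infinity")
    return (2 * e * d - a * (e + d)) / denom

def is_harmonic(a, m, e, d):
    """True when exactly one of A, M lies between E and D and AE/AD = ME/MD."""
    a, m, e, d = (Fraction(v) for v in (a, m, e, d))
    lo, hi = min(e, d), max(e, d)
    a_inside = lo < a < hi
    m_inside = lo < m < hi
    # Compare AE * MD with ME * AD to avoid dividing by zero.
    ratios_agree = abs(e - a) * abs(d - m) == abs(e - m) * abs(d - a)
    return (a_inside != m_inside) and ratios_agree

m = harmonic_conjugate(0, 1, 3)
print(m, is_harmonic(0, m, 1, 3))  # 3/2 True
```

With A = 0, E = 1, D = 3 the conjugate is M = 3/2, and indeed AE/AD = 1/3 = ME/MD. The midpoint case raises an error because, in the affine setting of the definition, no finite conjugate exists.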
3 To 1639

The first four books of the Conics of Apollonius had been edited, translated into Latin, and published by the late sixteenth century. (Books 5–7 were later recovered from Arabic sources, while Book 8 remains lost.) This provoked several attempts to simplify the study, including short works by Werner (1522) and Maurolico (1575), and by Mydorge (1631/1639) in 1631 and 1639. In 1639, a short work by Girard Desargues (1591–1661) was published in 50 copies, with the title Brouillon project d’une atteinte aux événements des rencontres d’un cône avec un plan, henceforth referred to as Brouillon project (Desargues 1639). The title could be translated as Draft study of the intersections of a plane with a cone. More than a simplification, this innovative work intended to put the study of the conics on a new foundation, which we would call projective.

The mathematicians and scientists of Paris were in close communication, facilitated by the Minim priest Marin Mersenne. This “circle,” or informal “academy,” included Mydorge, Desargues, Roberval, Étienne Pascal and his son Blaise. By correspondence, Mersenne had ties to Descartes, Fermat, and the other major scientists of Europe. The painter Laurent de la Hyre (1606–1656), the father of Philippe, is reported to have been a good friend of Desargues, and his attention to perspective is attributed to this friendship (Sorensen 2009). Laurent would have known Father Mersenne if for no other reason than he made a series of 18 paintings to decorate the refectory of the Paris convent of the Minim fathers. This association of scientists did not survive Mersenne’s death in 1648, and these participants were dead by 1662; when Philippe de la Hire worked on conic sections in the late 1660s and the 1670s, he seems to have done so in isolation.

All copies of the Brouillon project apparently disappeared soon after its publication, a mark of its difficulty and unusual character.
Desargues and his ideas did influence at least one in Mersenne’s circle, the young Blaise Pascal. Pascal wrote in his Essay pour les coniques (Pascal 1640) of 1640, “I have tried as far as I could to imitate [Desargues’] method of approaching this material” (Field and Gray 1987, p. 183). However, a copy of the Brouillon projet came into the hands of Philippe de la Hire by 1679 and he made a transcription, which, in turn, disappeared until 1845. The existence of Desargues’ work was never in question, since it was discussed in various letters which survived, and was at the center of a vitriolic dispute in the 1640s (Taton 1951a).
4 On the Life of La Hire

Philippe de la Hire trained to be a painter, the profession of his father, but moved to mathematics. The evidence of this change is limited. In his short Éloge of 1718 (Fontenelle 1718), Fontenelle wrote that the teenage Philippe was interested in geometric aspects of painting, including perspective. In Italy, from 1660 until 1664, he developed his art, but also found a love for Greek geometry, especially the Conics of Apollonius. Fontenelle reported that, after 1664, La Hire continued his geometric studies, going deeper into the subject, and La Hire himself tells us in the initial Epistre to La Hire (1673) that he had applied himself to the study of geometry for plusieurs années. He must also have continued in painting, for he was admitted to the painters’ guild, l’Académie de Saint-Luc, in 1670 (Bénézit 1960, Vol 3, p. 137).

We know nothing specific until 1672, the year of his first publication in mathematics, a short work entitled Observations de Ph. de la Hire sur les points d’attouchement de trois Lignes droits qui touchent la Section d’un Cone. . ., et sur le centre de la mesme Section (La Hire 1672). This short work comprises seven propositions with corollaries. It was brought out by Abraham Bosse (1604–1676). The problem addressed is that of constructing an arc rampant, and that problem seems to have been the focus of a controversy between Bosse (see Bosse 1672) and (Nicolas)-François Blondel (1618–1686), where the disagreement centered on the value of approximate methods. The tracing of the arc rampant is the second problem of Blondel’s 1673 Résolution des quatres principaux problèmes d’architecture:

to find a Conic Section tangent to three given straight lines, in one plane, at a given point on two of these lines: in other words, to describe geometrically the arcs rampant of all types of foot segments (pieds droits) and heights. (Blondel 1673)

Bosse had worked with Desargues. In 1643 he brought out the first part of a treatise on stone cutting (Bosse 1643), incorporating rules found by Desargues. In that work of 1643 (p. 50), Bosse wrote that he hoped to produce a second part. In his preamble to La Hire (1672), he wrote that, working with diagrams for this second part with a “very capable person” (identified by Taton as master mason M. Rouget; Taton 1953, p. 94), he realized the need for a mathematical collaborator on construction of the arc rampant. Bosse turned to Philippe, the son of his friend. The result was La Hire (1672). (The preamble is in Taton 1953, p. 97; Rouget was mentioned by La Hire on p. 97 of La Hire 1673.)

Both La Hire and Blondel solved the problem; when the pieds droits are EC and DB, the point of tangency to be found on BC is A, in Fig. 2 Left, the harmonic conjugate of S, where CB and ED meet at S. That Bosse did not solve the problem shows his limited understanding of the Conics of Apollonius. La Hire and Blondel independently introduced the term harmonic in their works of 1673. La Hire’s work of 1672 employs classic Euclidean geometry to specify the construction, in three dimensions, of a cone, where a section of that cone gives the required arc rampant. Use of this geometry is puzzling, since La Hire seems to
Fig. 2 Left: arc rampant, based on La Hire 1672 Prop. 1. Center Left: La Hire’s Lemma 3, 1673. Right: La Hire’s Lemmas 8 and 9, 1673
have already been reworking the geometry of conic sections, which would appear in 1673, as the Nouvelle méthode. The construction of the arc rampant is a simple consequence of the projective theory of pole and polar as set out in Nouvelle méthode. We will turn to that treatment in Sect. 6.
5 Desargues

Girard Desargues came from a wealthy Lyon family. In the 1630s he lived in Paris and became a member of Father Mersenne’s circle. He acquired a thorough knowledge of the Greek heritage in geometry, and his works show an interest in applications of geometry, including a 12-page pamphlet on perspective drawing, Exemple de l’une des manières universelles du S.G.D.L. touchant la pratique de la perspective, of 1636, followed by a work on sundials and, with Abraham Bosse, the work on stone cutting. The theorem on perspective triangles which bears his name appeared in a work published by Bosse in 1648, but even that was little known. We know he was involved in several construction projects as architect, but it would be difficult to label his profession.
6 A Comparison: Pole and Polar in La Hire and Desargues

La Hire

We have already noted the pole-polar pairing of a point and a line, with respect to a given conic section, as found in the Conics of Apollonius. Both Desargues and La Hire approached the topic projectively, by handling the topic for a circle and then
showing that the relation found for the base circle was preserved in projection to the conic section in the slicing plane. Beyond this similarity, they proceeded in very different ways. We start with La Hire. Although later than Desargues, his method is simpler. He began his 1673 Nouvelle méthode with the definition, as we have seen, of harmonic:

I call the straight line AD cut in 3 parts harmonically when the rectangle contained by all AD and the middle part BC is equal to the rectangle contained by the two extreme parts AB, CD.

Note that when collinear points A, B, C are given, there is exactly one point D so H(AB, CD). La Hire immediately entered, in his 1673 work, into a sequence of lemmas. Lemmas 2 through 6 showed that in a line-to-line projection from a point in a plane, a harmonic set is projected to a harmonic set. He needed numerous lemmas because he did not treat parallel lines as concurrent (at infinity) as Desargues had. It is instructive to look at one of the proofs. Here is his Lemma 3, with proof. In modern terms, it is the claim that the midpoint of a segment, AH, and the collinear point at infinity are harmonic conjugates of A and H, but La Hire avoided any reference to infinity. La Hire’s proof is based on similar triangles and properties of proportion.

Theorem 1 (La Hire’s Lemma 3, 1673) Suppose H(AC, BD) on line AD and AD is projected from a point E onto line AH that is parallel to ED, B to G and C to H. Then G is the midpoint of AH.

Proof See Fig. 2 Center Left. By similar triangles,

\[ \frac{AG}{DE} = \frac{AB}{BD} \quad \text{and} \quad \frac{AH}{DE} = \frac{AC}{CD}. \]

Solving for DE in each equation gives

\[ \frac{BD \cdot AG}{AB} = \frac{CD \cdot AH}{AC}. \quad \text{So} \quad \frac{AH}{AG} = \frac{BD \cdot AC}{AB \cdot CD}. \]

The numerator is (CB + CD)(AB + CB) = AB · CD + CB(AB + BC + CD) = AB · CD + CB · AD, so division by AB · CD = AD · CB shows that AH ÷ AG = 2.

Lemma 7 follows, the claim that when two lines meet at B so H(BG, FH) and H(BD, CE) for points on those lines, then the lines on corresponding points, GD, FC, and HE, are concurrent (or parallel).

In Lemmas 8 and 9, we consider a point A outside a given circle, where tangents from A meet the circle in points F and G. See La Hire’s Fig. 12 and 13 of 1673, in our Fig. 2, when A is on the diameter on B. In La Hire’s Fig. 12, we have H(AC, BE), shown by similar triangles ALB and AHE, the tangent property:
Fig. 3 Left: La Hire’s Lemma 10, 1673. Center: La Hire’s Lemmas 13 and 14, 1673. Right: Euclid Book 3 Prop 36. For any line XAB on X, XA · XB is constant, equal to XC²
LB = LF and HF = HE, and a parallel projection from AH to AE. This is Lemma 8. In Lemma 9, the result of Lemma 8 is extended to any line on A meeting the circle in O and L, and meeting the polar in I. [La Hire’s Fig. 13.] The sphere for which the given circle is a great circle is cut perpendicular to the plane AGE, giving a circle for which OL is a diameter and DH the polar of A. So by Lemma 8, H(AI, OL).

For Lemma 10, we consider two lines on A that cut the circle. See La Hire’s Fig. 14, in our Fig. 3, where FG is the polar of A. We know H(BE, AC) and H(OL, AI). By Lemma 7, chords BO and EL meet, at D, on the polar of A. [Note that BL and EO will also meet on the polar of A.] Then La Hire observed that when point B coalesces with O, and, likewise, point E coalesces with L, then BO and EL become tangents to the circle at O and at L, still meeting on line FG. It follows that when a point D is on the polar of A and outside the circle, then the polar of D lies on A.

What happens when a point, C, is inside a circle? La Hire, in Lemmas 13 and 14, drew a tentative polar AL (see La Hire’s Fig. 19) perpendicular, at A, to the diameter on C that meets the circle at B and D, with H(AC, BD). We start with the secant on L and C, meeting the circle at E and F, then with GE drawn parallel to HO, La Hire showed line EF meets HO at C and, when continued to line LA, that H(LC, EF). In modern language, La Hire had proved the Pole-Polar Theorem:

Theorem 2 Given a circle, let x be the polar of X and y the polar of Y. Then Y is on x exactly when X is on y.

(Note that cases when X or Y is on the circle or at the center of the circle were not considered.)

What about the conic sections? The Pole-Polar Theorem applies just as well to the conic sections: “all that follows is a simple application of these lemmas . . . in all the conic and cylindrical sections” (p. 15). In particular (p. 35):

Suppose that, in the base plane,
line eg will be cut at points e, f, p, g in three harmonic parts. But the lines drawn joining these points of division to the vertex A [of the cone] will meet line EG on the slicing plane in points E, F, P, G and by Lemma 5 or 6 it will be cut in these points E, F, P, G harmonically in 3 parts. . . .
Desargues

How did Desargues handle that material? For him, the key figure was a quadrilateral inscribed in a conic, and he began with the relation that he called involution, on collinear points.

Definition (Desargues) Collinear pairs L, M; I, K; H, G are points in involution when there is a collinear point Q (called the souche) so QL · QM = QI · QK = QH · QG, with Q separating either all or none of the pairs.

In Fig. 4 Right, for example, pairs P, Q; L, M; H, G; I, K are in involution, where the souche would be between Q and L. Desargues characterized in the following lemma three pairs of points in involution, in terms of what is now called a cross-ratio (Field and Gray 1987, p. 87).

Lemma 1 Let B, H; C, G; D, F be three pairs in involution with souche A. Then

\[ \frac{GB \cdot GH}{CB \cdot CH} = \frac{GD \cdot GF}{CD \cdot CF} = \frac{AG}{AC}. \]

This is the same as CR(GC, BD) = CR(GC, FH) in absolute value, where CR(XY, WZ) denotes

\[ \frac{XW \cdot YZ}{XZ \cdot YW}. \]

[Corresponding equations hold for the other pairs.]

Proof By the definition of involution, AB · AH = AC · AG = AD · AF with A separating all pairs or none. Since

\[ \frac{AG}{AF} = \frac{AD}{AC}, \]

then by adding or subtracting in both the numerator and denominator, this fraction equals GD/FC. Likewise,

\[ \frac{AF}{AC} = \frac{AG}{AD} \]

gives the equal fraction GF/CD. Thus

\[ \frac{GD \cdot GF}{FC \cdot CD} = \frac{AD}{AC} \cdot \frac{AG}{AD} = \frac{AG}{AC}. \quad \text{In a similar way,} \quad \frac{GB \cdot GH}{CB \cdot CH} = \frac{AG}{AC}. \]

(One can retrace steps in the proof to show the converse, that

\[ \frac{GB \cdot GH}{CB \cdot CH} = \frac{GD \cdot GF}{CD \cdot CF} \]

implies that the pairs B, H; C, G; D, F are in involution with a souche A.)

Fig. 4 Left: Menelaus’ Theorem, based on La Hire’s Fig. 10 for Desargues 1639 (Taton 1951a, p. 126). Center: NG is polar of F. Right: based on La Hire’s Fig. 14 for Desargues 1639 (Taton 1951a, p. 142)
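The equalities of Lemma 1 can be tried on concrete numbers. The following is a sketch only: it assumes the points are given by coordinates on a number line, and the function name `lemma_ratios` and the sample coordinates are hypothetical. It checks the involution condition with signed products (so that the souche separates all pairs or none) and then computes the three quantities that the lemma asserts to be equal.

```python
from fractions import Fraction

def lemma_ratios(a, pairs):
    """pairs = [(B, H), (C, G), (D, F)], coordinates on a line; a is the souche A.

    Returns (GB*GH/(CB*CH), GD*GF/(CD*CF), AG/AC) as exact fractions.
    """
    a = Fraction(a)
    (b, h), (c, g), (d, f) = [(Fraction(p), Fraction(q)) for p, q in pairs]
    # Involution: AB*AH = AC*AG = AD*AF. Equal signed products also encode
    # the condition that A separates all of the pairs or none of them.
    k = (b - a) * (h - a)
    if (c - a) * (g - a) != k or (d - a) * (f - a) != k:
        raise ValueError("the pairs are not in involution with souche a")

    def dist(p, q):
        return abs(p - q)

    first = dist(g, b) * dist(g, h) / (dist(c, b) * dist(c, h))
    second = dist(g, d) * dist(g, f) / (dist(c, d) * dist(c, f))
    third = dist(a, g) / dist(a, c)
    return first, second, third

# Souche at 0, common product 12: pairs (1, 12), (2, 6), (3, 4).
print(lemma_ratios(0, [(1, 12), (2, 6), (3, 4)]))
```

For the sample pairs all three quantities equal 3; taking the common product to be −12 instead, with pairs (−1, 12), (−2, 6), (−3, 4), gives the case in which the souche separates every pair, again with common value 3.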
What for La Hire was a “line divided harmonically,” is found in Desargues’s work as four points in involution: When H, B; F, D; C, G are three pairs in involution, and points D and F coalesce into point F, and points B and H coalesce into point H, then pairs H, F; C, G are four points in involution. Since

\[ \frac{GH}{CH} = \frac{GF}{CF}, \]

we have a harmonic set with H(GC, HF) (Field and Gray 1987, pp. 74–84).

After the initial material on involution and an introduction to conic sections as sections of a roll or conic surface (rouleau), Desargues introduced the traversale, a line, p, where for a given figure (which need not be a conic) and a point, F, every line on F that meets the figure in points B and C must meet p in point O so H(FO, BC). So the point and traversale are pole and polar. We consider Fig. 4 Center, with quadrilateral BCDE inscribed in a circle. Initially, only BCDE matters, whose diagonals meet at G and opposite sides at N and F. FG meets opposite sides EB and DC at Y and X, respectively. Desargues showed H(FG, XY). And by projection from N, we get the same result for any line on F, making NG the traversale of F with respect to quadrilateral BCDE.

And if Desargues did not need the circle, then how did he prove it? He used Menelaus’s Theorem, applied four times to XYN (Field and Gray 1987, p. 115). Desargues proved what is now called Menelaus’s Theorem, and used it repeatedly, as in his proof that the involution relation among points is preserved in a line-to-line projection (Field and Gray 1987, p. 92). It follows that the harmonic relation would also be preserved. Desargues attributed the theorem to Ptolemy, who had proved a spherical case in his Almageste [I.13] (Field and Gray 1987, p. 91). La Hire did not use this theorem.

Here is Menelaus’s Theorem: If collinear points D, H, G lie, respectively, on sides (extended) 4h, hK, and K4 of triangle Kh4, then

\[ \frac{Dh}{D4} = \frac{Hh}{HK} \cdot \frac{GK}{G4}. \]

The proof is by triangle similarity in Fig. 4 Left, where KF is drawn parallel to line HDG.
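We return to the proof that H(FG, XY). When XYN is cut by line BGD, then

\[ \frac{GX}{GY} = \frac{DX \cdot BN}{DN \cdot BY}, \]

cut by line CGE, then

\[ \frac{GX}{GY} = \frac{CX \cdot EN}{CN \cdot EY}, \]

cut by line FED, then

\[ \frac{FX}{FY} = \frac{DX \cdot EN}{DN \cdot EY}, \]

cut by line FBC, then

\[ \frac{FX}{FY} = \frac{CX \cdot BN}{CN \cdot BY}. \]

It follows that

\[ \left(\frac{GX}{GY}\right)^2 = \left(\frac{FX}{FY}\right)^2, \quad \text{so} \quad \frac{GX}{GY} = \frac{FX}{FY}, \quad \text{so } H(GF, XY). \]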
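The conclusion H(GF, XY) does not depend on a circle, and this can be illustrated with an arbitrary quadrilateral. The sketch below is a coordinate check, not Desargues’ Menelaus argument; the helper names `line`, `meet`, and `quadrilateral_cross_ratio` and the sample vertices are assumptions made for the example. It builds G, F, X, Y from the vertices B, C, D, E as described in the text and returns the signed cross-ratio of X, Y with respect to F, G, which is −1 for a harmonic set.

```python
from fractions import Fraction

def line(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def meet(l, m):
    """Intersection point of two non-parallel lines given by coefficients."""
    a1, b1, c1 = l
    a2, b2, c2 = m
    w = a1 * b2 - a2 * b1
    if w == 0:
        raise ValueError("lines are parallel")
    return (Fraction(b1 * c2 - b2 * c1) / w, Fraction(c1 * a2 - c2 * a1) / w)

def quadrilateral_cross_ratio(B, C, D, E):
    G = meet(line(B, D), line(C, E))   # diagonals BD and CE
    F = meet(line(B, C), line(E, D))   # opposite sides BC and ED
    FG = line(F, G)
    X = meet(FG, line(D, C))           # FG meets side DC at X
    Y = meet(FG, line(E, B))           # FG meets side EB at Y
    # Measure positions along FG by one coordinate that actually varies.
    i = 0 if F[0] != G[0] else 1
    f, g, x, y = F[i], G[i], X[i], Y[i]
    return ((x - f) / (y - f)) / ((x - g) / (y - g))

print(quadrilateral_cross_ratio((0, 0), (4, 0), (3, 2), (1, 3)))  # -1
```

For the sample vertices one finds G = (12/5, 8/5), F = (7, 0), X = (64/19, 24/19), Y = (8/11, 24/11), so that GX/GY = FX/FY = 11/19. Degenerate inputs, such as a quadrilateral with a pair of parallel opposite sides, raise an error rather than being handled at infinity.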
(We note that La Hire did give essentially this result in Prop 20 of Book 1 of La Hire (1685), his most complete and best known work on conic sections. There we have a complete quadrilateral, like BNCG, producing a harmonic set on a line like FD.)

What else did Desargues prove? First, the points of intersection of a line with the six sides of a complete quadrilateral are points in involution. Menelaus’s Theorem is applied four times to produce the required equality of cross-ratios (Field and Gray 1987, p. 107). Next is Desargues’s Involution Theorem. Now the circle is involved. La Hire has nothing like this. In fact, it would not be seen again until the middle of the nineteenth century. In proving the Involution Theorem, Desargues first proved it for a circle.

Theorem 3 Let BCDE be inscribed in a conic, where BE meets CD at F. Let a line meet the conic at L and M, meet CD at Q, BE at P, CB at I, ED at K, BD at G, and CE at H. Then pairs Q, P; L, M; I, K; H, G are points in involution. [See Fig. 4 Right.] (Field and Gray 1987, p. 108).

Proof By Euclid Book 3 Prop 36 (see Fig. 3 Right), PL · PM = PB · PE, QL · QM = QC · QD, FB · FE = FC · FD. With, additionally, four applications of Menelaus’s Theorem to PQF, with lines EHC, BGD, BIC, and EKD, we reach both

\[ \frac{GQ \cdot HQ}{GP \cdot HP} = \frac{QL \cdot QM}{PL \cdot PM} \quad \text{and} \quad \frac{IQ \cdot KQ}{IP \cdot KP} = \frac{QL \cdot QM}{PL \cdot PM}. \]
This means that L, M; P , Q; G, H are points in involution, as are L, M; P , Q; I, K. We conclude that L, M; Q, P ; G, H ; I, K are in involution, for both involution relations would involve the same souche. (Field and Gray 1987, pp. 108–110). Desargues then considered the traversale when the figure in question is a conic. He simply observed that in the three dimensional cone, when points in involution on a line which cuts the circle in the base plane are joined by lines to the vertex of the cone, then those lines are concurrent at the vertex and so they meet the corresponding line in the plane of the section in, also, points in involution (Field and Gray 1987, p. 110). This, a generation before La Hire, was the revolutionary claim that those properties of a circle which are preserved under projection must hold for conic sections. Only at this point did Desargues handle the question of tangents to a conic. When F is the pole, or but, corresponding to a certain polar, or traversale, the tangents from F to the conic section meet it at the points on the traversale. The argument is a quick one: as a line on F moves across the conic section, the two points at which that line meets the conic coalesce into a single point, which must necessarily be a point of tangency, and must be a point where the traversale meets the conic. Where La Hire began with tangents, for Desargues they were close to an afterthought.
7 Was La Hire Desargues’ Schüler?

The similarity in approach of Desargues and La Hire apparently aroused suspicion by some contemporaries. We can read this concern in La Hire’s letter (Poudra 1864, p. 231) accompanying his 1679 transcription of Brouillon project. La Hire wrote, with typical politeness, that if he had known the work of Desargues then he would not have discovered the method used in 1673, for he would not have believed it possible to improve on something so simple and general. La Hire defended his independence, pointing out that Desargues had used long compositions of ratios which he, La Hire, had not, and for this reason it would not be wrong to judge his work superior. Further, Apollonius had already employed the harmonic division of lines associated with conics, and both he, La Hire, and Desargues would have learned from this same teacher.

We have some information, from G. W. Leibniz, about general awareness in the 1670s of Desargues’ work and its possible influence on La Hire’s 1673 (La Hire 1673). Henry Oldenburg wrote to Leibniz, in April 1673, about a 1670 criticism of Desargues by Grégoire Huret. It is clear that both Huret and Oldenburg had only second-hand knowledge of Desargues’ work, but that they had some sense of the projective character of the work: “Mr. [John] Collins feels that if we correctly gauge the mind and aim of the author, then the doctrine [of Desargues] merits praise rather than condemnation; the plan of this work was to treat the conic sections as projections of small circles which lie on the surface of a sphere. . . .” (Leibniz 1884, in Latin, pp. 40–41). In letters from the years 1674 or 1676, Leibniz noted
with approval Desargues’ universal treatment of the conics as one genre, where line constructions to resolve problems apply to all the conics, and that Desargues treated parallel lines as concurrent. Later (Le Goff 1994, p. 189) Oldenburg told Leibniz that he was unable to procure a copy of Desargues’ work. In a letter to Etienne Périer, nephew of Blaise Pascal, Leibniz wrote, in a crossed out section (Echeverría 1994, August 1676, p. 287), “not long ago there appeared a new Methods des sections Coniques whose author was a friend of Monsieur de Boss and disciple of Monsieur des Argues (who was a great friend of Mons. Pascal) and spoke also of the properties of lines cut harmonically and of their application to conics in a manner strongly approaching that of [Desargues].” This is evidence that La Hire’s 1673 work was known in mathematical circles, and that, despite the disappearance of Desargues’ work, the similarity of La Hire’s work to that of Desargues was already commented on.

When La Hire’s transcription of the Brouillon project was published by Poudra in 1864, the conic sections were part of a projective geometry that had been transformed by Poncelet and others. Particularly in this light, La Hire showed not the boldness, not the inventiveness, not the depth of understanding that could be read, with some effort, in the work of Desargues. La Hire surely had some exposure to the ideas of Desargues, for he had worked with Bosse, his father’s friend, and Bosse had worked with Desargues. How independent would he have been in his 1673 work? In short commentary, without any detailed examination, scholars Charles Taylor and Ernst Lehmann, in passages recognizing the achievement of La Hire, referred to him, respectively, as a “disciple” (Taylor 1881, p. lxiv) and a “Schüler” (Lehmann 1888, p. 1) of Desargues.

The issue returned about 1950 when an original copy of Brouillon project was found in the French Bibliothèque Nationale. Soon after, René Taton produced a study of the mathematical work of Desargues (Taton 1951a), and several related articles. While praising La Hire’s clarity of presentation, he wrote of La Hire’s treatises of 1673 and 1685 (Taton 1951b, p. 16) “their originality seems very contestable despite the contrary affirmation of Ph. de La Hire himself.” And in the Dictionary of Scientific Biography, “The ‘Nouvelle méthode’ clearly displayed Desargues’ influence, even though La Hire, in a note written in 1679 . . ., affirmed that he did not become aware of the latter’s work until after publication of his own. Yet what we know about La Hire’s training seems to contradict this assertion. Furthermore, the resemblance of their projective descriptions is too obvious for La Hire’s not to appear to have been an adaption of Desargues’.” (Taton 1970).

Le Goff (1994, p. 204) wrote, in 1994, that La Hire “seems to revert to the orthodoxy of the ancients, with his ‘line cut harmonically,’ which only retains involution in the case of harmonic division.” But after noting this difference with Desargues, he continued, “It is probable that Abraham Bosse, if not Laurent de La Hire, would have had a copy of Brouillon in their library. It is hard to believe the words of Philippe de la Hire, according to which he was only acquainted with this text in 1679, when he transcribed a copy.”
Our comparison of the development of the pole-polar concept at the hands of Desargues and La Hire shows, at least in this case, that Desargues’s work had minimal influence on La Hire. La Hire, as had Apollonius, began with tangents to the conic, by which the polar was characterized, and by which the harmonic division of a secant was shown; Desargues defined the polar by the harmonic division of a secant, and his concept of harmonic division began with involution, which is not found in La Hire.

Further comparison of the Brouillon project with La Hire’s 1673 work finds essentially nothing in common, in either detail or broad plan. Desargues used Menelaus’s Theorem repeatedly; La Hire never did. La Hire did not treat parallel lines as concurrent, adding much to the length of his work. La Hire is detailed in showing myriad resulting properties of the three conics, each treated separately, in the tradition of those, like Mydorge, who wished to simplify what Apollonius had done. Desargues, on the other hand, aimed to unify the treatment of the conics in developing common properties, as he wrote to Mersenne in a letter of 1638 (Taton 1951a, p. 83, 84). With the topic of the foci, La Hire in 1673 offered little that was not already covered by Apollonius; as interpreted by Hogendijk (1991), Desargues, in some of the most puzzling parts of his work, was revamping in a completely new, projective, way the place of foci.

If we look at La Hire’s 1685 work (La Hire 1685), we see specific borrowing from Desargues. Most prominent is the harmonic pencil of concurrent lines, which Desargues spoke of as rameau correspondants entr’eaux, branches (lines) corresponding among themselves (Field and Gray 1987, p. 96). This is a set of four concurrent lines where any other line intersecting the four does so in a harmonic set. La Hire did not have this concept in 1673, but he did in 1685 (La Hire 1685, Book 1 Prop XI), calling such concurrent lines harmonicales, and he made crucial use of the concept in, for example, his Book 8, on foci. One does not find any comparable borrowing in 1673.

On the one hand, yes, many mathematicians in the 1670s knew that Desargues treated parallel lines as concurrent, at a point at infinity, and they had some general idea that he used projections in work on conic sections. And La Hire, even in relative isolation, could not have avoided this. But those, like Taton, who question the originality and honesty of La Hire, base their opinion primarily on the availability of Desargues’ 1639 work to La Hire. We see that Le Goff, in 1994, could point to the significant difference in goals and style of Desargues and La Hire while at the same time claiming that La Hire could not have been independent since people he knew were likely to have had a copy of Desargues’ 1639 booklet. Zbynek Sir, in 2002, came to another conclusion: that the only influence of Desargues would be in the méthode projective spaciale, but of the other methods attributed to La Hire, “nous ne trouvons pas de trace chez Desargues.” (Zbynek 2002, p. 219).

On the other hand, notions of projective methods were in the air in the seventeenth century. Apollonius, once he developed the abscissa-ordinate equations of the conic sections, did not return to the three-dimensional cone except for a few problems, as at the end of Book 1, to describe the cone from which a given conic section could originate. Remember that there was no science of perspective when Apollonius wrote. However, Mydorge (1631/1639, Book 1 Prop 36) and Werner
(1522, Prop 15, 16) did resort in places to the three-dimensional context of a cone. They brought in a tangent plane to a cone, in which the tangent line to a conic section lies. Mydorge (1631/1639, p. 14) has a diagram of a cone sliced in an ellipse identical to that 1696 Apollonius diagram in Fig. 1 Left. How would an artist, conscious of techniques of perspective drawing, not see the circle and ellipse as projections of each other from the vertex? Recall that Desargues wrote a pamphlet on perspective drawing and La Hire was himself an artist. It is unlikely that we will know anything definitive on the degree of influence of Desargues on La Hire’s 1673 geometry. But we can conclude that the details have the character of an independent work.

Credits All reproduced figures are in the public domain. Thanks to the Bibliothèque Nationale de France, with the site gallica.bnf.fr, and to Google Books.
References Apollonius of Perga (1696) Apollonii Pergaei conicorum libri quattuor . . ., edited and transl. Federicus Commandinus, Bononiae, 1566. Reissued (1696) with new diagrams: Pistorii. W.W. Rouse Ball (1960) A Short Account of the History of Mathematics, 1908, reissued (1960), New York: Dover. E. Bénézit (1960), editor, Dictionnaire critique et documentaire des peintres, sculpteurs, dessinateurs et graveurs, Paris: Librairie Grund. (Nicolas)-François Blondel (1673) Résolution des quatres principaux problèmes d’architecture, Paris: Imprimerie Royale. Abraham Bosse (1643) La pratique du trait à preuves de M. Desargues, Lyonnois, pour la coupe des pierres en l’architecture, Paris: Pierre Des-Hayes. Abraham Bosse (1672) Regle universelle, pour décrire toutes sortes d’arcs rampants dans toutes les surjections que l’on puisse proposer, sans se servir des Axes, Des Foyers, ny du Cordeau, Paris: A. Bosse. Girard Desargues (1639) Brouillon project d’une atteinte aux événements des rencontres d’un cône avec un plan, in Field and Gray (1987) (trans Field) and original http://gallica.bnf.fr/ark:/12148/ bpt6k105071b. J. Dhombres and J. Sakarovitch (1994), editors, Desargues en son Temps, Paris: Blanchard. Javier Echeverría (1994) Leibniz, interprete de Desargues, in Dhombres and Sakarovitch (1994), 283–293. J. V. Field and J. J. Gray (1987) The Geometrical Work of Girard Desargues, New York: Springer. Bernard de Fontenelle (1718) Eloge de M. de la Hire, Histoire de l’Académie Royale de Sciences, 76–89. Jan P. Hogendijk (1991) Desargues’ Brouillon Project and the Conics of Apollonius, Centaurus 34, 1–43. Jean-Pierre Le Goff (1994), Desargues et la naissance de la géométrie projective, in Dhombres and Sakarovitch (1994), 157–206. Philippe de La Hire (1772) Observations de Ph. de la Hire sur les points d’attouchement de trois Lignes droits qui touchent la Section d’un Cone sur quelques-uns des Diametres, et sur le centre de la mesme Section, Paris: A. Bosse. 
Philippe de La Hire (1673) Nouvelle Méthode en Géométrie pour les Sections des Superficies coniques et Cylindriques, Paris.
Philippe de La Hire (1685) Sectiones Conicae in novem libros distributae, Paris; French translation by Jean Peyroux (1995), Grand Livre des Sections Coniques, Paris: Blanchard.
Ernst Lehmann (1888) De la Hire und seine Sectiones conicae, I. Teil, Abhandlung zu dem Jahresberichte des Königlichen Gymnasiums zu Leipzig auf das Schuljahr 1887 bis Ostern 1888, Leipzig: Edelmann.
G. W. Leibniz, Leibnizens gesammelte Werke, Leibnizens mathematische Schriften, editor C. I. Gerhardt (1884) Erste Abtheilung, Band I, Berlin: Asher.
Francesco Maurolico (1575) Opuscula mathematica: nunc primum in lucem aedita, cum rerum omnium notatu dignarum. . . Venice: Franciscum Franciscium Senensem.
MacTutor (2018) authors J. J. O’Connor and E. F. Robertson, Philippe de La Hire, http://www-history.mcs.st-and.ac.uk/Biographies/La_Hire.html.
Claude Mydorge (1631/1639) Prodromi catoptricorum et dioptricorum: sive conicorum operis ad abdita radii reflexi et refracti mysteria praevii et facem praeferentis, Books 1 and 2, 1631, Books 3 and 4, 1639.
Blaise Pascal (1640) Essay pour les coniques, in Taton (1951a), 190–194; English transl. in Field and Gray (1987), 180–184.
M. Poudra (1864) Oeuvres de Desargues Réunies et Analysées, Paris: Leiber.
Zbynek Sir (2002) Les sections coniques chez Philippe de La Hire. Ph.D. Thesis, Université Pierre et Marie Curie (Paris VI), http://www.karlin.mff.cuni.cz/~sir/papers/These.pdf
Madeleine Pinault Sorensen (2009) Laurent de La Hyre, Paris: Musée du Louvre Editions / 5 Continents.
René Taton (1951a) L’oeuvre mathématique de G. Desargues, Paris: Presses Universitaires de France.
René Taton (1951b) La géométrie projective en France de Desargues à Poncelet, conférence faite au Palais de la Découverte le 17 février 1951, Université de Paris.
René Taton (1953) La première oeuvre géométrique de Philippe de La Hire, Revue d’Histoire des Sciences et de Leurs Applications 6, 93–111.
René Taton (1970) Philippe de la Hire, in Dictionary of Scientific Biography (New York 1970–1990). http://www.encyclopedia.com/doc/1G2-2830902429.html.
Charles Taylor (1881) An Introduction to the Ancient and Modern Geometry of Conics, Cambridge: Deighton Bell and Co.
Johannes Werner (1522) Super vigintiduobus Elementis Conicis, Vienna: Lucae Alentsee Bibliopolae.
Richard Westfall (2018) La Hire, Philippe de. Available at the Galileo Project. galileo.rice.edu/Catalog/NewFile/lahire_phi.html.
Wikipedia (2018) Philippe de la Hire, https://en.wikipedia.org/wiki/Philippe_de_La_Hire.