“That Which a Minority Construct”: Abelian Mathematics in Abel, Galois, Noether, and Grothendieck Arkady Plotnitsky
We might as well say that minor no longer designates specific [mathematics] but the revolutionary condition for every [mathematics] within the heart of what is called great (or established) [mathematics]. —Gilles Deleuze and Félix Guattari, Kafka: Toward a minor literature. (Deleuze and Guattari 1996, p. 18; paraphrased)
Contents
1 Introduction
2 Abel and Galois: Minority Mathematics and the Rise of Modern Algebra
3 The Three Mathematics of Emmy Noether
4 Topos Theory and Grothendieck’s Parliamentary Mathematical Ontology
5 Conclusion
6 Cross-References
References
Abstract
This chapter considers the nature of radical transformations of mathematics, enabled by minority mathematics. It will be particularly concerned with modern mathematics, which emerged roughly around 1800, as abstract mathematics – abstracted from mathematics’ relations to the natural world and physics, relations that previously dominated mathematics. As, however, defined here (transferring Gilles Deleuze and Félix Guattari’s concept of a minor(ity) literature, as exemplified by F. Kafka’s work), a minority mathematics is not something that exists entirely outside a major mathematics, to be distinguished here from a majority mathematics. Instead, it is a mathematics that, while still exterior to the major mathematics to which it juxtaposes itself, constructs itself within and even at the very core of this major mathematics. This may be seen as merely a special form of revolutionary vis-à-vis normal mathematical practice in T. Kuhn’s sense. I shall argue, however, by using the work of N. H. Abel, É. Galois, E. Noether, and A. Grothendieck, as my main cases, that this “special” type of revolutionary practice is the primary and even the only form of revolutionary practice possible in mathematics. I designate this mathematics Abelian mathematics, the term commonly associated with formal mathematical properties (such as commutative groups or abelian categories), because Abel’s work was, arguably, the first manifested case of a minority mathematics in this sense in modern mathematics.
Keywords
Functoriality · Major mathematics · Minority mathematics · Modern algebra · Modern mathematics · Concept-form
AMS Classification
00A30 · 01A65 · 01A60 · 81P05
A. Plotnitsky (*) Literature, Theory, and Cultural Studies Program, Philosophy and Literature Program, Purdue University, West Lafayette, IN, USA e-mail: [email protected]
© Springer Nature Switzerland AG 2023 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_140-1
1 Introduction
This chapter considers the nature of radical transformations of mathematics, enabled by minority mathematics. It will be particularly concerned with modern mathematics, which emerged roughly around 1800, as abstract mathematics – abstracted from mathematics’ relations to the natural world and physics, relations that previously dominated mathematics. As, however, defined here (transferring Gilles Deleuze and Félix Guattari’s concept of a minor(ity) literature, as exemplified by Franz Kafka’s work), a minority mathematics is not something that exists entirely outside a major mathematics, to be distinguished here from a majority mathematics. Instead, it is a mathematics that, while still exterior to the major mathematics to which it juxtaposes itself, is created, constructed within and even at the very core of the major mathematics it transforms. This may be seen as merely a special form of revolutionary vis-à-vis normal mathematical practice in Thomas Kuhn’s sense. I shall argue, however, by using the work of Niels Henrik Abel, Évariste Galois, Emmy Noether, and Alexandre Grothendieck, as my main cases, that this “special” type of revolutionary practice is the primary and even the only form of revolutionary practice in mathematics. I designate this mathematics Abelian mathematics, the term commonly associated with formal mathematical properties (such as commutative groups or abelian categories), because Abel’s work was, arguably, the first manifested case of a minority mathematics in this sense in modern mathematics. (To highlight this difference, Abelian will be capitalized when designating a minority mathematics.)
I would like to stress from the outset that in the present view all such transformations of a major mathematics by a minority mathematics, and more generally all mathematical developments, continuous or discontinuous, are assumed to be local, even though they always have their (more or less) extended history. They are the product of a technical or technological mathematical practice: Each is a local invention of new mathematical technology or a local development of an already existing one, without having its origins in Platonist-like archetypal preexisting mathematical forms or any other preexisting mental reality, or being governed by a unique preordained form (however multifaceted) of mathematics in its historical developments. By the same token, the history of mathematics and its transformations becomes multiple. It comprises a manifold of histories proceeding along different trajectories, with multiple forms of interplay between continuities and discontinuities even within each such history, if one can ever rigorously, rather than provisionally, speak of a single history of any mathematical entity – a concept, a theory, or a field. In this chapter, I am specifically concerned with transformations enacted by means of what I define as minority mathematical practices. As I shall argue, however, such minority practices are always involved in and shape transformations, especially radical transformations, of mathematics. In speaking of a minority mathematics, I adopt Deleuze and Guattari’s concept of minor or, as I shall term it (for the reasons explained below), minority literature, introduced by them in their analysis of Kafka’s work, and expand this concept to other creative endeavors, with mathematics as my main subject. 
This expansion is virtually inherent in their concept and transpires in their treatments of many figures in different fields, including mathematics, for example, Abel and Galois, or Bernhard Riemann.1 Of course, a minority practice is specific to the nature of its field, such as literature and art, philosophy, or mathematics and science. Kafka’s work exemplifies both the generality of the concept and the specificity of its application to literature, and further specificity that accompanies the work of each particular figure, in this case Kafka vis-à-vis other modernists, such as James Joyce or Virginia Woolf. To cite Deleuze and Guattari’s definition of the concept, as applied to mathematics (which I substitute for literature in their formulation):
1 Deleuze and Guattari saw Riemann’s concept of a manifold (a form of minority mathematics, but its status as such would require a separate analysis) as heralding a revolutionary philosophical change, manifesting a minority philosophy, versus dialectic as a (post-Hegelian) major philosophy. This view is helped by the French term “multiplicité” as a translation of Riemann’s Mannigfaltigkeit (which the English translation of Deleuze and Guattari’s works often renders, understandably but incorrectly, as “multiplicity”). They say: “It was a decisive event when the mathematician Riemann uprooted the manifold from its predicative state and made it a noun, manifold [multiplicité]. It marked the end of dialectics and the beginning of the typology and topology of manifolds” (Deleuze and Guattari 1987, p. 483; translation modified). It could be shown, however, that to leave dialectics behind requires a traversal through dialectical thinking, which, or in any event, thinking in terms of unities (rather than multiplicities, even if unities that contain multiplicities within them) remains the dominant, major, philosophical thinking. Philosophically, this chapter advocates the philosophy of the multiple, ultimately uncontainable by a unity. For a discussion of Riemann’s conceptual thinking, including as parallel to that of Galois, see Plotnitsky (2022), which also contains further references.
A minor [minorité] [mathematics] doesn’t come from a minor [mathematical discourse]; it is rather that which a minority constructs within a major [mathematical discourse]. But the first characteristic of minor [mathematics] in any case is that in it [mathematical discourse] is affected with a high coefficient of deterritorialization. . . . We might as well say that minor no longer designates specific [mathematics] but the revolutionary condition for every [mathematics] within the heart of what is called great (or established) [mathematics]. (Deleuze and Guattari 1989, pp. 18–19)
A minority is an exteriority: A minority thinking comes from the outside of the majority or (as explained below, they are not the same) major thinking to which it relates, even if its representative is an “insider,” as, say, Riemann was at Göttingen in his time, rather than an “outsider,” as Abel or Galois was. But a minority mathematics (or a minority practice in any field in this definition) is an exteriority from within a major mathematics: It works through the interiority of a major mathematics, by interacting with a major mathematics and using it to create new mathematics. I do, thus, introduce a terminological asymmetry, which is, however, suitable and may be necessary here. The term “minority mathematics” is arguably more fitting. It is in effect suggested by Deleuze and Guattari by speaking of a minor literature as “that which a minority constructs within a major (or established) literature” [emphasis on minority added]. Thus defined, a minority mathematics qua mathematics, such as that of Abel or other figures considered here, is not minor or marginal (any more than a minor literature qua literature is), quite the contrary, although it may be and may remain marginal for a while, even a long while. (“Minoritarian” has been used as a translation of the French “minorité,” but the term has other meanings in English, such as those related to the political power of a minority in a society, which can make it misleading.) Hence, the term “major mathematics” (or “major literature”) is a more accurate reflection of the situation, insofar as a minority mathematics (or a minority literature, such as that of Kafka) transforms the central or the most advanced part of the existing mathematics, which may not necessarily be a majority mathematics. Accordingly, I shall adopt this asymmetrical terminology as more precise in reflecting what a minority or Abelian mathematics does. 
“Minor mathematics” is a linguistically more graceful term, but “minority mathematics,” or “minority mathematical practice,” is more precise in reflecting the role of a minority, sometimes consisting of a single person. A few immediate qualifications are in order, beginning with the concept of a single-person minority. A minority thinking is never entirely individual, even if the corresponding practice is individual or nearly so, as it was in the case of Abel or Galois. Either case is probably as close to a single-person minority in mathematics as one can find. One’s mathematical thinking, however, is always, to one degree or another, affected by the preceding history of mathematics, because even the most innovative concepts or theories have their history (which is always collective, however individualized it may become), and by extramathematical, such as cultural, factors, which give any individual thinking collective dimensions. Such exterior elements are manifested more in literature, especially, because of the role of ordinary language in shaping the difference between a major (or in this case, a majority) and minority literature (as German vs. Czech, or Hebrew, in Kafka). The role of extramathematical factors
impacting mathematical thinking and practice has been extensively considered in historical studies of mathematics. Indeed, the analysis of “exterior” personal and cultural, including political, factors in the workings and development of mathematics and science has been dominant in the historical scholarship of mathematics and science during the last half century or so, especially in the so-called constructivist school or rather schools. There have been several such schools, sometimes different in their approach, and the number of works in these areas is massive by now, although most of these works are devoted to science, rather than mathematics.2 These works challenge the uncritical separation (previously common in the history and philosophy of mathematics and science) of the “interior” and the “exterior” in mathematics and science. On the other hand, the role of (conventionally) “exterior” factors is often difficult to justify definitively as concerns their weight relative to the role of (conventionally) “interior” considerations, the role of mathematics itself. The latter can still be sufficiently demarcated as such, even if one adopts a more critical view of the exterior and the interior of mathematics as relative, case-dependent, rather than absolute. Was Galois’ concept of group or Grothendieck’s concept of topos impacted by extramathematical factors, for example, given that in both cases one deals with figures with strong political views, and in Galois’ case with a political revolutionary? Perhaps, and even likely! The degree of this impact is, however, usually difficult to establish, in contrast to the interaction of either figure with the institutional mathematical establishment, although even this aspect of either case still poses complexities and has been debated, especially in the case of Galois. In any event, the present chapter is not this kind of study. 
Mathematics itself, specifically as the interaction between a minority and a major mathematics, is my main concern, and it takes precedence throughout. While exterior elements will be brought up, for example, in commenting on the minority situation of the figures considered in relation to disciplinary or institutional mathematics, my main emphasis will be on mathematical thinking and primarily individual mathematical thinking. I will suggest certain political implications of my argument. This part of my argument is, however, very different from those of the works with the constructivist orientation just mentioned. It is not my aim to challenge this orientation and projects that it led to, some of which have been convincing and important. My aim is to offer a different type of view on the nature of the political in connection with mathematics. It is essential that, as defined here, a minority mathematics as the transformative, revolutionary, force for mathematics always arises within a given major mathematics, rather than (by definition) only in interaction with the latter from its intellectual exterior, as it does in the case of Abel or Galois. It would, however, be difficult to
2 I mention here some key precursors or founding figures, such as Imre Lakatos (one major philosopher in this field who was primarily concerned with mathematics), Thomas Kuhn, Paul Feyerabend, and a few key recent figures, such as Ian Hacking (who has also written on probability theory), Bruno Latour, Peter Galison, and the Edinburgh school, founded by David Bloor. While addressing minority figures or groups (in the conventional sense), none of these authors or, to my knowledge, any other authors in this field consider concepts similar to those of a minority mathematics (or science) in the present sense.
think of Carl Friedrich Gauss, Riemann, and David Hilbert, whose thinking led to revolutionary mathematical transformations, as minority figures in the way Galois and Abel were (for both mathematical and extramathematical reasons). Nevertheless, even in cases like those of Gauss, Riemann, or Hilbert, there must be (and in their cases was) some minority mathematics for a revolutionary transformation of mathematics to be possible, although Riemann was more overtly a minority figure. Such a transformation is only possible when a minority thinking restructures or reterritorializes major mathematics, whether this minority thinking is that of a manifested outsider or that of an “official” insider, who must move to a minority-type outside to change mathematics. Cases of the first type, however, such as those of Abel, Galois, Noether, and Grothendieck, allow one to see better how, to return to Deleuze and Guattari’s formulation, “minor [is] no longer a specific [form of mathematics] but [is] the revolutionary condition for every [mathematics] within the heart of what is called great (or established) [mathematics],” all major mathematics. Conversely, no minority mathematics is possible without the major mathematics which it affects. A minority mathematics is a revolutionary condition for every mathematics, but by the same token, it can only be this condition within a major mathematics. The exteriority of a minority mathematics is an exteriority from within the major mathematics with which this minority mathematics interacts. The term “deterritorialization,” uncommon outside the context of Deleuze and Guattari’s work, needs to be explained. While Deleuze and Guattari’s concept so designated, used by them in a broader set of contexts, would require an extended treatment, its meaning in the present context is reasonably straightforward. 
It means expanding a given territory, while also reshaping, reterritorializing, this territory by means of new conceptions and practices, such as minority ones, although it can also be done otherwise. A clear example is the deterritorialization/reterritorialization of the part of algebra dealing with solving polynomial equations by radicals by making it the mathematics of group theory by Galois. In this respect, Galois’ work went beyond Abel’s already radical deterritorializing and then reterritorializing (restructuring) the field of mathematics in the context of this problem itself, although, as discussed below, Abel’s overall thinking was reaching much further as well. In any event, Galois’ group theory is one of the greatest mathematical deterritorializations and reterritorializations ever. The deterritorializations and reterritorializations enacted by Noether and Grothendieck’s restructurings of, respectively, algebra and algebraic geometry are far reaching as well, although it is hard to match the concept of group in this respect. While, however, arguing for the interaction and in this respect continuity between minority and major mathematics, I do maintain that the practice of mathematics contains both continuity and discontinuity, in a complex and (from case to case) varied interplay, including between minority and major mathematics. This interplay always, in each local case (in the present view all such cases are, again, local), contains both a continuity and a discontinuity, even a radical, revolutionary discontinuity, just as it does in other fields, such as literature. Modernism, such as that of Kafka (or James Joyce, Samuel Beckett, and others in literature, Pablo Picasso and Wassily Kandinsky in art, or Arnold Schoenberg in music), was a revolution that
transformed established major literature and art, while retaining continuity with it. Although certain key modernist trends were shared by these figures, the interplay between the major and minority art was also specific in each case, given the individual thinking of each figure and the particular cultural situation in which their minority practice took place. The same is true in mathematics, for example, in Abel and Galois’ cases, which had both shared and strictly individual aspects in their thinking and their cultural situations. It may not always be possible, and arguably is never ultimately possible, to fully track how new trajectories emerge from and against previous trajectories of thought, including all elements of minority thinking involved in any revolutionary practice. This limitation, however, leaves room for meaningful assessments of this emergence in terms of both continuities and breaks. Algebraic geometry, as transformed by Grothendieck, is a case in point. Grothendieck took advantage of and, first, transformed, by a minority thinking, established concepts of category theory, cohomological algebra, and sheaf theory, already in interaction with each other, even before he transferred them into algebraic geometry, where his work was most transformative. Continuities do not prevent new theories from breaking with old ones and changing the nature of mathematics in the process but instead help these breaks. Continuities with a major mathematics are, by definition, necessary for breaks enacted by a minority mathematics, although never sufficient for these breaks. Galois’ invention of the concept of group was helped by the major algebra of polynomial equations, but as a minority concept a group comes from elsewhere in a broader landscape of Galois’ thinking. It was a very complex landscape, far exceeding mathematics and shaping the “elsewhere” that gave rise to the idea of a group. 
Also, previous major practices may continue alongside revolutionary changes or may change more continuously with previous practices. Not all major literature was transformed into modernism by a minority literature, such as that of Kafka or Joyce. In both cases the role of minority (they were not minor!) languages, Czech and Irish, was crucial. Joyce expressly played with this linguistic multiplicity, by adding other languages into that of his novel, in his final masterpiece, Finnegans Wake. Realist literature and “milder” modernist, or mixed, literature, including in good and important works, have continued to remain prominent or even dominant, and they still do. It may be that, as Yu. Manin contends, there is more continuity in mathematical practice than in scientific ones, specifically in physics (Manin 2002, 2010, 2019).3 It is certainly true that major accomplishments are possible by means of primarily, even if not entirely (which, I would argue, may not be possible), continuities with preceding developments, for example, with a major mathematics. This can, however, happen in physics as well: There are cases when the continuity succeeded, at least more so than expected, while attempts at revolutions failed, which also may (but need not) mean that the failure was in not finding a right revolutionary path.4 In general, it is not always clear, as Kuhn realized and has shown in the case of Max Planck’s discovery of quantum theory (Kuhn 1987), and may be a matter of interpretation and debate where, or whether, a normal practice ends and a revolutionary practice begins. I would, accordingly, argue that the balance between them is contingent, shaped by the circumstances of a given case. Thus, while set theory and its concepts are part of the history of category theory and its concepts, as Manin contends, there is, I would argue (contesting Manin’s view), just as much, if not more, discontinuity between them. A case in point is Grothendieck’s use of category theory in his work, vis-à-vis the previous history of algebraic geometry, as in part based on set theory, as in the case of André Weil, who, as will be seen, disparaged category theory (Weil 1962, p. 358). Grothendieck’s use of category theory or (and in) sheaf cohomology, by then part of a major mathematics in algebraic topology, was a minority mathematics there. (Admittedly, Jean-Pierre Serre already started to bridge both fields in terms of sheaf theory.) The same case can be made for Galois’ use of group theory or Noether’s rethinking of Dedekind’s ideal theory, an established part of the major algebra then, or, more radically, Noether’s algebraically defined minority ventures into topology. Her insights, with the help of her students and followers, such as Heinz Hopf and Pavel Alexandrov, had a crucial impact on algebraic topology. Abel and Galois’ mathematics might be seen as more radically exterior minority mathematics (still, again, inevitably related to the major mathematics of their time), while that of Noether and Grothendieck might be seen as being closer to the major mathematics in their respective situations, except for Noether’s minority intervention (from the exterior territory of abstract algebra) into algebraic topology. Still, topology needed topologists, some of whom were Noether’s students or followers, to build on her ideas to make them part of major topology. A minority mathematics and the major mathematics (or whatever field where both types of thinking are at work) need and depend on each other. Only through a minority mathematics, however, can mathematics transform itself. 
3 As an illustration of his thesis concerning the continuity of mathematical practice (keeping in mind that his article accomplishes a great deal more), Manin offers an analysis of the history of the mathematical theory of time and periodicity, from Ptolemy’s epicycles to Schrödinger’s quantum amplitude interference and Feynman path integrals, with a few key junctures in between, such as Fourier sums and integrals, which are also central to quantum theory (Manin 2019). One cannot deny this overarching continuity and local continuities with it, elegantly traced by Manin. It is also true that the role of such continuities deserves more attention than it is commonly given, versus revolutionary breaks within this history, which tend to have attracted most commentaries. I would argue, however, that it is difficult to see this history apart from several breaks, sometimes radical breaks, even as concerns the emergence of revolutionary mathematical concepts involved, such as those of Fourier’s analysis or the calculus of variations, used by Schrödinger, or the role of complex variables. (Each of these concepts has its own history of continuities and breaks with their predecessors.) These mathematical concepts also reflect and have shaped an even more radical change in physics during this history brought in by quantum physics versus classical physics. Manin acknowledges that such developments and their use of mathematics are “more sensitive to respective ‘revolutions’ and ‘paradigm shifts,’ whereas those parts that are closer to ‘pure mathematics’ show rather a kind of continuous development, as [he] argue[s] in [his] paper” (Manin 2019, p. 130). It is, as I said, possible that there is more such continuity in mathematics overall. I would, nevertheless, argue that either history, that of pure mathematics and that of physics, including in its use of mathematics, shows the interplay between continuities and breaks (again, more complex than that between normal and revolutionary practices), different at different junctures. Some of them in mathematics exhibit changes, such as those defined by new concepts, that are as revolutionary as in physics. This is, again, not to say that such changes do not have history and, hence, continuities with preceding concepts or ways of thinking, including those defined by the interactions between major and minority mathematics found throughout this history. In physics, too, one can see quantum mechanics as, mathematically, a reconstruction (it was called quantum-theoretical reinterpretation at the time) of Hamiltonian and Lagrangian mechanics, the dominant formulation of classical mechanics then. Mathematically, quantum mechanics or, in high-energy regimes, quantum field theory is a major and a majority quantum theory now. Attempts at alternatives, however, never stopped, primarily in view of the epistemological discontent with the theory, most famously expressed by Albert Einstein, but they have not thus far achieved the transformative minority role that quantum mechanics had. See also Note 1 above.
4 The history of renormalization in quantum electrodynamics (QED) offers an instructive example of this situation. It was believed by many, including Paul Dirac, the founder of the theory, that its problems, having to do with the appearance of infinities (divergent integrals) in QED, could only be resolved by means of an alternative theory, and hence by a revolutionary break, akin to that from classical to quantum theory. As it happened, the renormalization program was able by 1950 to handle these difficulties within QED itself, as it existed. In fact, QED is now the best confirmed physics theory ever, although the discontent with its nature, as requiring renormalization, has never entirely disappeared (Dirac never came to terms with it) and is still around. See (Schweber 1984) for a helpful historical account of QED and (Plotnitsky 2021, pp. 273–305) on the epistemological aspects of quantum field theory. See also (Plotnitsky 2022, pp. 271–274) on recent approaches to the mathematics of renormalization via motivic Galois theory, bringing both Galois and Grothendieck’s mathematics into it.
To return to Deleuze and Guattari’s expression of this relationship, minority mathematics is the revolutionary condition for every mathematics in the heart of great (or established) mathematics, in the heart of a major mathematics.5 In the remainder of this introduction, I outline a general philosophical view of mathematics, especially creative mathematics, as conceptual mathematics, following (Plotnitsky 2022). Focused (in addition to the role of concepts) on epistemological questions, that study did not consider, except by implication, the role of a minority mathematics, which is, as I argue here, essential to all creative mathematics, including the invention of concepts. This chapter primarily concerns modern mathematics, as abstract mathematics. This mathematics began to emerge around 1800, consolidated, as the dominant form of major mathematics, around 1900, and has continued as such to our own time. The abstraction in question is double: Modern mathematics is abstracted (A) from the physical world, the mathematical representation of which defined modern physics; and (B) from our daily phenomenal intuition. (A) and (B) are connected because our intuition also contains the phenomenal representation of the external world. It was in fact this representation that was mathematically idealized by classical physics, from which modern mathematics aimed to abstract
5 Manin might have been more in agreement with the present view of the role of a minority mathematics in transforming a major mathematics, which retains some interactive continuity between them or thus a general continuity of mathematical practice. While, however, admitting this element of continuity within the interaction between a minority and a major mathematics, I do maintain that the practice of mathematics contains both continuity and discontinuity, in a complex and varied interplay, including between minority and major mathematics.
itself as well. A phenomenal reality could of course be mathematically represented apart from physical reality, beginning with the reality of mathematics itself. The term “concept” is often used uncritically, without being defined or by merely assuming some common understanding of it, in mathematical (or scientific) or even philosophical literature. There is no single definition or concept of a concept. The concept of a concept assumed here is parallel to and in part follows Deleuze and Guattari’s definition of a philosophical concept as grounding a creative practice in mathematics (Deleuze and Guattari 1996). This practice is that of the invention of new philosophical concepts, in juxtaposition to mathematical and scientific concepts. The latter, while casually mentioned sometimes, are in fact not defined by Deleuze and Guattari. They even deny that mathematics or science have concepts in their sense and see them defined by logical propositions and formal mathematical or scientific elements, such as functions, a surprisingly conventional view for these unconventional philosophers, including in other aspects of their view of mathematics and science. “The concept,” they say, “belongs to philosophy and only to philosophy” (Deleuze and Guattari 1996, pp. 11–12, 33–34). By contrast, I shall define mathematical or scientific concepts on the model of their concept of a philosophical concept, following the argument given in (Plotnitsky 2022, pp. 52–76), which also offers a critique of Deleuze and Guattari’s view of the role of concepts in philosophy versus mathematics and science. This chapter, accordingly, gives preference to the role of conceptual thinking rather than to that of logical propositions or calculations, which, however, could also be shaped by a minority mathematics, as were some of Abel’s methods of calculations.
Both logic and calculations are essential to mathematics, including conceptual thinking there, because the structure of mathematical concepts is logical, and because calculations require concepts and can lead to new concepts. Nevertheless, while respecting these features, this chapter views creative mathematical thinking as most essentially defined by the invention of new concepts, which, I further argue, is only possible by means of a minority mathematics.6 In the present general understanding, following, as a general understanding, that of Deleuze and Guattari, a concept is not merely a generalization from particulars (which commonly defines concepts in linguistics, analytic philosophy, or cognitive psychology) or a general or abstract idea, although a concept may contain such generalizations or ideas, specifically abstract mathematical ideas, where they may also be concepts in their own right. A concept is a multicomponent structure, defined
6 There are still other aspects of mathematical thinking, for example, a narrative one, the role of which, as constitutive rather than merely auxiliary (which has always been recognized), has received a considerable amount of attention during recent decades. For a representative collection of essays on the subject, see Doxiadis and Mazur (2012). For a full disclosure, the present author was among the contributors, although his article questioned, on epistemological grounds, the degree of the efficacy of narratives in mathematical invention or argumentation, advocated by other contributors (Plotnitsky 2012). In any event, my emphasis in this chapter is on concepts as the primary creative vehicles of mathematics, and the role, and one indeed might say, the narrative of minority mathematics, a narrative not discussed by the contributors in (Doxiadis and Mazur 2012).
by the organization, composition, of its components, and some of these components may be concepts in turn. Of course, every concept, no matter how innovative, has a history defined by other concepts (or other forms of thought) from which this concept emerges and without which it would not be possible. Just as any theory (which, while it may have other components, is always a conglomerate of concepts), a new concept is a product of an interplay of continuities with and breaks from preceding concepts. Some concepts considered here, such as those of a group by Galois or a topos by Grothendieck, were more radically innovative. Still, however, they had connections with earlier concepts, such as (putting aside the earlier history of which) those of Adrien-Marie Legendre in the case of a group and those of category theory in the case of a topos. Riemann’s thinking and practice, discussed by this author on several previous occasions (e.g., Plotnitsky 2022, pp. 99–135), provide arguably the most prominent example of the conceptual approach to mathematics, especially in geometry, as based on the concept of a manifold [Mannigfaltigkeit]. This approach led him beyond all previous approaches to geometry, including those, axiomatic in nature, that led to non-Euclidean geometry. Riemann shunned axiomatic thinking in favor of conceptual thinking, in accord with the present general definition of a mathematical concept. Riemann himself did not offer such a general definition, but his concepts, such as and in particular that of manifold, fully conform to the definition adopted here and may, as indicated above (Note 1), be seen as a model for Deleuze and Guattari’s concept of a (philosophical) concept. Conceptual thinking plays, however, an essential role in the thinking of all the main figures considered here, although the relative balance between conceptual thinking and other forms of mathematical thinking, such as an axiomatic one, is different in each case. 
For example, axiomatic thinking plays a greater role in Grothendieck’s work, although the minority part of his thinking was essentially conceptual, also shaping his view of axiomatization. It is, again, not my intention to diminish the role of the logical or calculational aspects of mathematics or science, or to take anything away from the contribution of philosophical investigations, such as those in the analytic philosophy of mathematics, that focus primarily on logical and propositional structures of mathematics or science.7 My aim is to give a proper emphasis to the role of concepts, especially in creative thinking in mathematics or science, but not only in creative thinking, because working with already established concepts is indispensable in all mathematical or scientific practice. The unconditional opposition between the logical-propositional and conceptual structures, or between calculations and concepts, is not so easy to maintain, even in considering figures, such as Galois, Riemann, or Dedekind, who gave concepts a dominant role in their work. While their thinking was primarily governed by concepts vis-à-vis calculations, they were perfectly capable of and did perform difficult calculations. Besides, calculations involve
7 There are new powerful approaches to logic in mathematics, such as homotopy type theory (Voevodsky et al. 2013; Corfield 2020). Concepts, however, still play an essential and unavoidable role in them.
concepts and lead to new concepts, as they did in the case of Abel, or Leopold Kronecker, who insisted on their primary role in mathematics. I would still argue, however, that the invention of new concepts is the primary feature of creative mathematical practice, accompanied and defined (the degree may vary) by the role of a minority mathematics, again, in the present definition, working within a major mathematics. The definition of a concept given above was only a general, abstract, definition. What meaningfully establishes a concept is the specific character of its compositional structure, defined by both the nature of each component and the character of their relations within this composition, by how these components relate to each other in the structure of the concept. This composition is unique in a new concept, and it makes this concept unique. The creative essence of concepts is in their compositional individuality, which is only possible by virtue of their general compositional nature but is defined by the specificity of this composition. All musical works are compositions, but each gives us a different music. A group is composed of its elements, but in accordance with very specific rules, which give it its structure and role in mathematics in group theory and beyond, and then it multiplies further to still more specific concepts of a group, a finite group, a Lie group, an algebraic group, and so forth, and then into further concepts within each of these. A single-component concept is a product of a provisional suspension or cutoff of its multicomponent organization. In fact, there are always cutoffs in delineating a concept, which result from assuming some of the components of this concept to be primitive entities whose structure is not specified. These components could, however, be specified by alternative delineations, leading to a new concept, containing a new set of primitive (unspecified) components. Consider the mathematical concept of space.
Emerging historically at the intersection of mathematics, physics, philosophy, and general phenomenal intuition, and defined by such constitutive concept-components as point, line, plane, or distance, this concept or each of these components has a long history of definitions, modification, and transformations. As a general phenomenal concept, it is arguably as old as human consciousness. Kant did not see space (or time) as a concept, but rather as an a priori form of phenomenal intuition. He had a point, even if one rejects (as many have done) Kant’s a priori view of space and time: In his view, these phenomenal forms precede any concept of space and time. As a geometrically representable physical concept, space is at least as old as ancient Greek geometry, well preceding Euclid, who did not define, although, arguably, implied, the concept of space in the Elements. The history of the concept of space, as a mathematizable physical and eventually (in the nineteenth century) purely mathematical concept, extends from the Pythagoreans to Euclid to Descartes (a coordinate space) to Riemann (a space defined as a manifold) to Felix Hausdorff (topological space) to Grothendieck (topos), with further versions within each. In the first four cases, the concept of space qua space was, while mathematizable, still physical. After each transformation, which was a product of a minor mathematics, the resulting new concept of space became part of a major mathematics, or physics. A concept, then, is an assemblage of its components connected by its composition, which define its specific individual and, at the time of its invention, unique
nature. Some of its components may be unique but need not be. The components of a concept need not be new, but its composition must be new for a concept to be new. In this respect, a concept is akin to a work of art, always defined by its composition, in accord with the ancient Greek meaning of poiein (a making of an entity), as making something new, in which the term poetry originates. A theory, or a disciplinary field, mathematical, scientific, or philosophical, such as geometry, classical physics, or Kant’s philosophy (philosophy may be more individualized), becomes an organized assemblage of concepts. It is a product of a broader poiein, which in mathematics or science may be commonly collective, although a theory may be initiated, but never contained, by a single new concept, such as Galois theory, initiated by the concept of group. In the present view, however, a creative conceptual mimesis (which may borrow, “imitate,” elements of previous concepts but combines them in a new way) is a partial technical mimesis of previous concepts, rather than a Platonist mimesis of something that belongs to a preexisting independent reality. Of course, theoretical concepts (or theories) are, in their compositional nature and functioning, different from works of art. For one thing, a work of art need not contain concepts, although so-called conceptual art always does. In functioning, a concept can be used directly or with minor modifications in, again, Kuhn’s terms, the normal practice in mathematics and science. Philosophy, at least a certain form of philosophy, for example, as defined by Deleuze and Guattari (1996), may be associated strictly with the invention of concepts as a revolutionary and hence a minority practice, which thus becomes its normal or major practice. 
In this respect, this practice is closer to that of art, which, while it may use a tradition, is compelled to be innovative and in this case a minority art, but only in this respect, because the nature of the conceptual and artistic composition remains different. What is shared in the creative nature of art, philosophy, and mathematics and science is poetically (again, in the ancient Greek sense of poiein) the role of composition, and politically the minority nature of creative thought, the political essence of which is freedom. All these fields and all creative human endeavors first assume, as Deleuze and Guattari argue, the necessity of a war they, by their different means, wage against the dogmatism of opinion or any dogmatism, the greatest enemy of creative and, hence, free thought, which is always a minority thought (Deleuze and Guattari 1996, p. 202). Although Deleuze and Guattari did not expressly make this point, their argument concerning a minor (minority) literature, certainly in Kafka’s case, clearly implied it. A minority literature is, by definition, a creative practice of literature, and as such is always at war with, and is a war against opinion, and it poses a great danger to opinion and majority thinking, or even major thinking, when the latter becomes governed by opinion. The dogmatism of opinion is, in the first place, dogmatism, to which Hegel juxtaposes true thinking, as defined by a free creative mediation. “Dogmatism,” he says in the Preface to The Phenomenology of Spirit, “the way of thinking, in both knowing or in the study of philosophy, is nothing but the opinion that truth consists in a proposition which is a fixed result or else in a proposition which is immediately known [der unmittelbar gewußt]” (Hegel 2019, p. 31). In other words, dogmatism forgets or disregards the mediating and especially creative (all creation is mediation, but a special one) role of
thinking and makes one surrender one’s capacity for true thinking. The dogmatism of opinion was an enemy of thought for all figures considered in this chapter. They had to confront and fight opinion, not the least by means of new concepts, as Riemann argued in closing his Habilitation lecture. New concepts, such as those his lecture introduced, are, he said, needed “to insure that [our] work is not hindered by unduly restricted concepts and that progress in comprehending the connection of things is not obstructed by traditional prejudices” (Riemann 1854, p. 33). Such restricted, and restricting, concepts are often part of a majority mathematics and sometimes even of a major mathematics. On the other hand, new concepts or all creative thought always require one or another degree of a minority thinking and hence a freedom in deciding to pursue a minority thinking rather than follow a majority thinking, even when it is a major thinking. A decision is, I would argue, a better category than choice, at least a free choice, because there may always be factors, psychological, social, or cultural, that shape our decisions, restricting and sometimes precluding their freedom. It does not follow, however, that there is no freedom in our decisions, although this type of argument is made sometimes (usually in the form of denying the existence of free will). The category of decision allows for the possibility of freedom, which, while always relative, is crucial. Also, in saying that the essence of minority thinking is freedom, the possibility of freedom, I do refer to its political essence. I am not saying that mathematics (or science) always involves politics, either within mathematics itself or in relation between mathematics and culture. This may be true, but it means little, without giving such claims due specificity. 
My point here is specific and different in nature: The essence of minority thinking in mathematics or elsewhere is the possibility of freedom, because a minority thinking and hence creative mathematics would not be possible otherwise. I shall return to this point in closing this chapter to argue that mathematics allows for an extraordinary and even unique degree of freedom, and that a minority mathematics, defining all creative mathematics, is governed by this freedom rather than by a majority mathematics, even when it is a major mathematics. Deleuze and Guattari, in this connection, also speak of a majority mathematics or science, governed by a major mathematics, as “royal” or “state” mathematics (or science), and a minority mathematics or science, as nomadic, specifically juxtaposing Gaspard Monge’s (state) geometry to Girard Desargues’ (nomadic) projective geometry (Deleuze and Guattari 1987, pp. 363–365). Creative mathematics is grounded in and is an affirmation of a minority thinking and of freedom in mathematics. There is no creative mathematics otherwise. Although creative mathematical decision-making may, as George Polya argued, be based on plausible reasoning (Polya 1990), we still have the possibility of freedom to decide on the line of thought to pursue. There is no possibility of new mathematics without a minority mathematics and without the possibility of freedom to pursue it. Now, according to Deleuze and Guattari, a philosophical concept, as defined by them, is also a problem, a positing of a problem. This view has a mathematical genealogy, although the concept of a problem preceded mathematics and was adopted by it and given a mathematical specificity by ancient Greek mathematics. A problem in this sense is not something that, like a theorem, is derived from
assumed axioms by means of logical rules but is something that is posed, created, along with and as a concept. According to Deleuze and Guattari’s definition of these connections, however, as applied to philosophical concepts, a concept as a problem persists and thus remains a problem, rather than disappears, or is solved, in its solutions (Deleuze and Guattari 1996, p. 5). This is a complex conception, which is not easy to interpret, especially given the paradoxical nature of this formulation, and I shall not attempt to do so. Nor shall I adopt this concept as such, in part, because this conception is applied by them to philosophical concepts as problems, rather than to mathematical problems, which, in their view, are always, in principle, soluble, rather than continue to remain problems in their solutions as philosophical concepts do. Instead, I shall use a different mathematical concept proposed in (Plotnitsky 2022), that of “a concept-form,” without claiming that Deleuze and Guattari would have subscribed to this concept, either in relation to their concept of a philosophical concept or otherwise. A (new) concept-form, such as that of a group by Galois or a topos by Grothendieck, would be introduced, just as any concept, in order to solve an existing problem, which defied a solution by means of already existing concepts or had a solution that was unsatisfactory, conceptually, technically, or even aesthetically. Finding the algebraic solution to a polynomial equation (or proving the impossibility of doing so) in Galois’ case was such a problem, leading him to the concept of group. 
There are thus two approaches (with gray areas between them and sometimes combining both) to solving a problem, especially a difficult one: The first is essentially technical, which aims to solve it by means of logical and technical manipulations of previously existing concepts; and the second, adopted by Galois, is essentially conceptual, which uses a new concept or set of concepts, although technical and logical manipulations remain necessary, as they are in all mathematics. Such a concept may be relatively simple, such as that of a group, although this simplicity is retrospective. But, even if it is, such a concept usually arises in response to a complex problem. This was certainly so in the case of the concept of group, which was invented by Galois to find an algebraic solution to a polynomial equation (and, as it happened, proving the impossibility of doing so for the equations of degree five and higher). This was a very complex problem at the time. By means of this concept, Galois redefined this problem by establishing that whether such a solution exists or not is related to the structure of a group of permutations associated with the roots of the polynomial, known as the Galois group of this polynomial. Abel’s approach had elements of this thinking by posing the question whether a solution exists or not, given the structure of the equation, and finding that in general it does not. This finding is now known as the Abel-Ruffini theorem or Abel’s impossibility theorem. Paolo Ruffini gave an incomplete proof in 1799, and Abel was the first to prove it in 1824. Galois accomplished much more, by making a group a concept-form. As a concept-form, however, a new concept does more than merely solve a given (initial) problem. It defines a new problem-posing field in which new types of actual problems emerge and are being solved, while the concept-problem that initiated and governs this field is not something that is “solved” in the sense of no longer being
necessary for this field to continue to develop, which it does by defining new concepts and concept-forms, some of which are new forms of the original concept-form. In this way, one can maintain a certain affinity with Deleuze and Guattari’s understanding of a concept as a problem persisting in its solutions, rather than disappearing in them. I am, again, not claiming that the concept of a concept-form offers an interpretation of Deleuze and Guattari’s concept of a concept-problem. In fact, it cannot be because the latter concept is philosophical, while the concept of a concept-form is mathematical. All that matters here is that the concept of a concept-form works, as I think it does, in mathematics and science. Galois’ concept of group not only gave the Abel-Ruffini theorem a more general grounding but also gave algebra a new field, Galois theory, in which new problems were subsequently posed, as they still are, which give Galois theory its continuing life. The same is true about Grothendieck’s topos theory. This type of invention of a new concept in response to a concrete problem not only in solving this problem but also making this concept a concept-form can happen as an outcome of a deliberate strategy or it might happen contingently, in a complex interplay between chance and necessity. The concept of a concept-form is both a tribute to Plato’s thinking and a critique and ultimately abandonment of all Platonism, including Plato’s own, which is different from many other Platonisms, including mathematical Platonism, a twentieth-century conception, quite different from the way of Plato’s thought, including about mathematics. It is a tribute insofar as it affirms a creation of new conceptual forms which then shape the subsequent development of thought in a given field and beyond it. Thus, the concept of group has shaped and continues to shape both mathematics and other fields, such as physics or other natural sciences, and even social sciences. 
On the other hand, against Platonism (now including Plato’s own) a concept-form is not something that preexists the history of thinking related to it and is then discovered by human thought, which may only conceive of this something itself partially or approximately. It is still human thought that assumes the existence of this something as a real preexisting human thought. This assumption is Platonism in its ultimate form. By contrast, as explained, each concept-form emerges in and shapes this history in the complex interplay of chance and necessity in the work of individual figures and communities. It comes from, emerges in, a contingent history, even though it can shape the subsequent history in, one might say, quasi-Platonist fashion. I qualify by “quasi” because new concepts or concept-forms shaped by this concept-form are not mimetic shadow-images of preexisting (eternal) forms, as in Plato (Plotnitsky 2022). By the same token, a concept-form need not be seen as containing or fully determining a problem-posing field it creates, even if only as a potentiality. This field emerges in a contingent development, as a complex interplay of causal chains and chance occurrences. In general, in adopting a problem-posing versus axiomatic-theorematic view of mathematics, one is more likely to reject a Euclidean, essentially Platonist, view in which axioms are assumed as pregiven truths rather than created as concepts first, as axioms are from the present perspective, a distinction that was brought into the foreground by the discovery of non-Euclidean geometry. The present view is close to Imre Lakatos’ conception of a quasi-empirical theory versus
a Euclidean theory, modeled on Euclidean geometry. (“Quasi” reflects the fact that mathematics is “empirical” in dealing with mental rather than material phenomena.) Lakatos sees creative mathematics as quasi-empirical, consistently with or even implying the key role of a minority mathematics in it. Reciprocally, the present view is quasi-empirical, even though some and even most of the figures considered here, such as Abel, Noether, or Grothendieck, might have adopted a more Euclidean philosophy of mathematics. Lakatos also argues that no Euclidean ideology can avoid quasi-empirical practice, especially as creative practice, any more than a creative major mathematics can avoid becoming a minority mathematics. According to Lakatos:
The development of a quasi-empirical theory is very different. It starts with problems followed by daring solutions, then by severe tests, refutations. The vehicle of progress is bold speculations, criticism, controversy between rival theories, problem shifts. Attention is always focused on the obscure border. The slogans are growth and permanent revolution, not foundations and accumulations of eternal truths. The main pattern of Euclidean criticism is suspicion: Do the proofs really prove? Are the methods used too strong and therefore fallible? The main pattern of quasi-empirical criticism is proliferation of theories and refutations. (Lakatos 1980, pp. 29–30)
This proliferation invites and even gives a central role to a minority mathematics. As, correlatively, both quasi-empirical and grounded in the role of a minority mathematics, the present view of the history of mathematics is that of multiple trajectories, heterogeneous yet interactive, that is, trajectories with multiple connections between them, without assuming or even precluding unifying them in a single “grand” narrative, as Manin appears to see as at least possible to do (Manin 2010, p. 242).8 Every new mathematics, which, as new, is always a creation of a minority mathematics, takes a chance on the future of mathematics, just as every new minority science, literature, such as that of Kafka, or philosophy takes a chance on becoming a mathematics or science, art, or philosophy of the future. “A philosophy of the future” is Friedrich Nietzsche’s phrase describing his own philosophy, manifestly a minority philosophy, and all true philosophy in his subtitle to Beyond good and evil: A prelude to a philosophy of the future (Nietzsche 1966). To become a philosophy of the future, a philosophy, Nietzsche in effect argues, must start as a minority philosophy, which is quite true about that of most major figures in the history of philosophy – Plato, Aristotle, Descartes, Leibniz, Kant, and Hegel. One can take the same view of creative mathematics, such as that of Abel, Galois, Noether, and Grothendieck. Robert Langlands (of the Langlands program fame) compared Grothendieck to “Nietzsche’s Philosoph der Zukunft” (Langlands and Shelstad 2007, p. 486). This is no less true about Abel, Galois, or Noether, and other key figures mentioned here, or Langlands himself, as the Langlands program has proven.
8 The concept of a grand narrative is borrowed by Manin from Jean-François Lyotard, who saw it as challenged by the present-day, “postmodern,” world, in part in view of twentieth-century mathematics and science, which has shaped this world (Lyotard 1984).
2 Abel and Galois: Minority Mathematics and the Rise of Modern Algebra
Over half a century ago, Jules Vuillemin offered an influential analysis of Abel and Galois, and of the rise of (modern) algebra in La philosophie de l’algèbre (Vuillemin 1962), on which, along with other sources, such as Albert Lautman, Deleuze built his commentary on Abel and Galois in his 1968 Difference and repetition (Deleuze 1995). Vuillemin’s approach had a strong structuralist orientation shaped by the Bourbaki program (the book is dedicated to, among others, Pierre Samuel, a member of Bourbaki), an orientation different from the one of this chapter, on four counts. First, a minority mathematics was never considered, even by implication, by Vuillemin, while it is, as I argue, crucial for understanding both Abel’s and Galois’ roles in algebra and modern mathematics in general. Deleuze does not consider a minority mathematics in this earlier book either. The concept of minor literature was developed two decades later in the book on Kafka, cowritten with Guattari, building on the ideas of their earlier joint works. Second, this chapter adopts a conceptual approach versus a structuralist one. Third, Vuillemin’s book and the corresponding part of Deleuze’s book (a broader philosophical project overall) aim to transpose the methods of modern algebra, in particular those of Abel and Galois, into philosophy, which is not my aim here. Fourth, “ideas,” in Difference and repetition and then concepts in Deleuze and Guattari’s What is Philosophy? (Deleuze and Guattari 1996), are defined strictly as philosophical ideas and concepts, even when they are operative in mathematics, specifically as problems that persist rather than disappear in “solutions” to them. By contrast, as explained above, mathematics is, for Deleuze and Guattari, always about solutions of mathematical problems (Deleuze 1995, pp. 163–164, 179–182).
One might readily agree that mathematics is commonly in pursuit of solutions to its problems and that there may be a philosophical dimension to a new problem-posing field defined by a new mathematical concept-form, such as that of a group by Galois. In the present view, however, a concept-form is a mathematical concept rather than a philosophical concept. Concept-forms still enable one to solve, or to advance the possibility of solving, concrete problems, such as, in Galois’ case, the problem of finding the algebraic solution to a polynomial equation, which gave rise to the concept of group. This is made possible by embedding this original problem in a new mathematical concept-form, which defines a new problem-posing mathematical field, or even a new field of mathematics, such as Galois theory. An exact definition possible in mathematics is difficult to achieve in a philosophical concept, the rigor of which needs to be, accordingly, defined otherwise, which is a separate issue that will be put aside here. These differences notwithstanding, it may be helpful to summarize Vuillemin and Deleuze’s arguments concerning Abel and Galois’ work. According to Deleuze: Abel elaborated a whole method according to which solvability must follow from the form of the problem. Instead of seeking to find out by trial and error whether a given equation is solvable in general, we must determine the conditions of the problem which progressively specify the fields of solvability in such a way that ‘the statement contains the seeds of the
solution’. This is a radical reversal in the problem-solution relation, a more considerable revolution than the Copernican. It has been said [by Vuillemin] that Abel thereby inaugurated a new Critique of Pure Reason, in particular going beyond Kantian ‘extrinsicism’. This same judgement is confirmed in relation to the work of Galois: starting from a basic ‘field’ (R), successive adjunctions to this field (R′, R″, R‴ . . .) allow a progressively more precise distinction of the roots of an equation, by the progressive limitation of possible substitutions. There is thus a succession of ‘partial resolvents’ or an embedding of ‘groups’ which make the solution follow from the very conditions of the problem: the fact that an equation cannot be solved algebraically, for example, is no longer discovered as a result of empirical research or by trial and error, but as a result of the characteristics of the groups and partial resolvents which constitute the synthesis of the problem and its conditions (an equation is solvable only by algebraic means – in other words, by radicals, when the partial resolvents are binomial equations and the indices of the groups are prime numbers). The theory of problems is completely transformed and at last grounded, since we are no longer in the classic master-pupil situation where the pupil understands and follows a problem only to the extent that the master already knows the solution and provides the necessary adjunctions. For, as Georges Verriest remarks, the group of an equation does not characterize at a given moment what we know about its roots, but the objectivity of what we do not know about them. Conversely, this non-knowledge is no longer a negative or an insufficiency but a rule or something to be learnt which corresponds to a fundamental dimension of the object.
The whole pedagogical relation is transformed—a new Meno [Plato’s dialogue dealing with teaching mathematics]—but many other things along with it, including knowledge and sufficient reason. Galois’s progressive discernibility unites in the same continuous movement the processes of reciprocal determination and complete determination (pairs of roots and the distinction between roots within a pair). It constitutes the total figure of sufficient reason. (Deleuze 1995, pp. 179–180)
Manifestly the product of a minority mathematics, this is a radical transformation of mathematical, and, on this pattern, for Deleuze, of philosophical practice, including as the pedagogy of ideas or, in his later works, concepts. This view also implies that the pedagogy of mathematics should be transformed, certainly as it functions in mathematical practice (which, just as that of philosophy, always involves a teaching of concepts) and possibly in the actual teaching of mathematics. The latter, defined primarily by teaching a majority mathematics even more than a major mathematics, would benefit from giving more attention to minority thinking in mathematics. In the absence of a contemporaneous alternative education (possibly informal), such a teaching is not likely to lead to the emergence of mathematical thought such as that of Abel, Galois, and Noether. When B. L. van der Waerden’s Modern Algebra (van der Waerden 1930) was published in 1930, it was also an introduction to a new way of thinking in algebra, developed by Noether and Emil Artin, and at the time (and for a while) it helped new thinking in algebra and beyond, including as pursued by Bourbaki. But subsequent textbooks on algebra rarely perform this function, and Modern Algebra no longer in general does so either. I shall return to this subject below. Galois’ thinking had an even more revolutionary significance for both Vuillemin and Deleuze, given the role of the concept of group, which was (in present terms) a new concept-form and, as such, an even more consequential invention. All concept-forms require a minority practice to emerge, but Galois’ concept of group, or his thinking in general, more dramatically exhibited a minority mathematics. As noted
from the outset, the concept of group has radically deterritorialized and reterritorialized mathematics, well beyond algebra. This deterritorialization is, again, not addressed by Deleuze in Difference and repetition but is suggested by his later appeal to Galois’ “abstract machine,” alongside Riemann’s “abstract machine,” the term designating a vehicle (also in its literal sense of enabling a movement) of deterritorialization (Deleuze and Guattari 1987, p. 142). The term also designates a nomadic, and hence, minority, versus state or royal (major and often majority), practice of mathematics. I shall consider first Abel’s work, arguably the first manifested case of a minority mathematics in modern mathematics, prompting me to designate all minority mathematics as Abelian mathematics. According to Vuillemin, Abel transforms algebra by the invention of “a general method”: In the first place, one analyzes, in its most general form, a mathematical relation or a defined set of such relations, which make it possible to determine a property, of which we do not yet know whether or not we can attribute it to a class of beings: by example the character of being algebraically solvable, of being convergent, of being expressible by a defined number of functions of a certain class. In the second place, one considers the class of beings to which it is a question of attributing this property . . .; we analyze these beings from a general point of view, we define the relationships to which their nature allows them to be subjected. Finally, this double examination reveals the cases of incompatibility (demonstrations of impossibility) and, possibly, indicates the way to find the new required relationships in cases of possibility. (Vuillemin 1962, p. 214; translation mine)
Thus, rather than looking for the solution to equations of the fifth degree, one asks whether such a solution is possible, in the first place, for any polynomial equation. This shift is the key (minority) move by Abel. Abel’s new “idea of a general method consists in giving a problem a form such that it is always possible to solve it” (Vuillemin 1962, p. 209; translation mine). Abel thus reversed the method of his predecessors and of the major mathematicians among his contemporaries, who would proceed from the particular to the general, by proceeding instead from the general to the particular. As a result, Abel becomes one of the creators of the modern algebraic method. Vuillemin sees this method as, in effect, structuralist in Bourbaki’s sense. Against Kant, who limited his construction of the proof of impossibility by positing the possibility of an experience as an exclusive criterion of knowledge, “general demonstrations, in the sense of Abel, change the modality of the proof. Particular demonstrations are actual [réelles]: they assume as their grounding principle the possibility of the experience given in the affect of sensation. The general proofs deal with the possible and start from the concept alone, disregarding the restrictive conditions of sensitivity” (Vuillemin 1962, p. 216; translation mine). This initiating concept is, again, defined not by a generalization from particulars, but as a new structured entity, in accord with the present definition of concepts. In this way, Abel introduced a new critique of pure reason (in Kant’s sense of a critique as a rigorous examination of the legitimacy and limits of philosophical claims, such as those concerning causality, and hence concepts), at least in mathematics, but, as Vuillemin and Deleuze argue, extendable beyond mathematics. Mathematics needed Galois in algebra (a group) and then
Riemann in geometry (a manifold) to be expressly rethought in this way, or perhaps more expressly. For, one might see Abel’s thinking as reaching further in this direction and in general as more radically transformative than Vuillemin and Deleuze suggest. His minority mathematics was even more momentous in its transformative and, hence, deterritorializing potential. Mikhail Gromov considers Abel to be “a major figure, if not the major figure, in changing the course of mathematics from what could be visualized and immediately experienced to the next level, a level of deeper and more fundamental structure” and hence toward modern mathematics (Raussen and Skau 2010, p. 401).9 Abel’s thinking also becomes even closer to that of Galois in general, beyond the problem of solving polynomial equations. Gromov continues: [Abel] changed the perspective on how we ask questions. I do not know enough about the history of mathematics but it is obvious that the work of Abel and his way of thinking about spaces and functions has changed mathematics. I do not know enough history to say exactly when this happened, but the concept of underlying symmetries of structures comes very much from his work. We still follow that development. It is not exhausted yet. This continued with Galois theory and in the development of Lie group theory, due to Lie, and, in modern times, it was done at a higher level, in particular by Grothendieck. This will continue, and we have to go through all that to see where it brings us before we go on to the next stage. It is the basis of all we do now in mathematics. (Raussen and Skau 2010, p. 401)
As such, Abel’s minority mathematics becomes a crucial case of a deterritorialization of modern mathematics. Gromov’s statement is a reflection on this aspect of Abel’s work as well, even if without considering Abel’s works strictly from this perspective. My designation of all minority mathematics as Abelian mathematics fits well with Gromov’s assessment and with his comments, on the same occasion, on mathematical education. In responding to the question concerning mathematical education or education in general now, Gromov gives a somewhat unexpected answer, by using Abel as an example: Raussen and Skau: Education is apparently a key factor. You have earlier expressed your distress about realizing that the minds of gifted youths are not developed effectively enough. Any ideas about how education should change to get better adapted to very different minds? Gromov: Again I think you have to study it. There are no absolutes. Look at the number of people like Abel who were born two hundred years ago. Now there are no more Abels. On the other hand, the number of educated people has grown tremendously. It means that they have not been educated properly because where are those people like Abel? It means that they have been destroyed. The education destroys these potential geniuses – we do not have them! This means that education does not serve this particular function. The crucial point is that you have to treat everybody in a different way. That is not happening today. We don’t have more great people now than we had one hundred, two hundred, or five hundred years ago, starting from the Renaissance, in spite of a much larger population. This is probably due to education. . . .
Raussen and Skau: You point out that we don’t have anybody of Abel’s stature today, or at least very few of them. Is that because we, in our educational system, are not clever enough to take care of those who are exceptionally gifted because they may have strange ideas, remote from mainstream? Gromov: The question of education is not obvious. There are some experiments on animals that indicate that the way you teach an animal is not the way you think it happens. The learning mechanism of the brain is very different from how we think it works: like in physics, there are hidden mechanisms. We superimpose our view from everyday experience, which may be completely distorted. Because of that, we can distort the potentially exceptional abilities of some children. There are two opposite goals education is supposed to achieve: firstly, to teach people to conform to the society they live in; on the other hand, to give them freedom to develop in the best possible way. These are opposite purposes, and they are always in collision with each other. This creates the result that some people get suppressed in the process of adapting them to society. You cannot avoid this kind of collision of goals, but we have to find a balance between the two, and that is not easy, on all levels of education. (Raussen and Skau 2010, p. 402)
9 This assessment was given in the wake of Gromov’s reception of the Abel prize, which might have inflected Gromov’s remarks, without, in my view, undermining Gromov’s essentially correct assessment itself.
Gromov is right to qualify the question of education as not obvious and to note this conflict at the core of the Enlightenment ideal of education. In effect (whether or not Gromov had in mind or even was aware of this connection to the Enlightenment), Gromov describes this ideal by referring to giving people “freedom to [individually] develop in the best possible way.” (Conforming to the society one lives in is common to most conceptions of education.) This concept and, with it, the conflict in question emerge with the Enlightenment, especially following Kant, who introduced the term “Enlightenment [Aufklärung],” in part by grounding this ideal in, to use the title of his second critique, a “critique of practical reason” (Kant 2015). Two influential examples may be noted here. The first, before (and influencing) Kant, Jean-Jacques Rousseau’s Émile, ou de l’éducation [1762] emphasized the importance of this freedom for an individual to develop in the best possible way and the difficulty of achieving the aim of doing so in a society (Rousseau 1997). (Because of one section of the book, it was banned in Paris and Geneva, and publicly burned when published in 1762.) The second, following (and yet displacing) Kant, Friedrich Schiller’s Über die ästhetische Erziehung des Menschen in einer Reihe von Briefen [Letters on the Aesthetic Education of Man] (1795) (Schiller 2004), aimed at reconciling both aims within a framework, which, however, poses major questions of its own.10 The question of education is indeed far from obvious. Abel and others to whom Gromov is referring under an apt collective name “Abels,” certainly Galois, or any number of Renaissance figures who could be mentioned, such as Galileo, were not uneducated, quite the contrary. Nor were Abel and Galois merely self-educated in mathematics either, even if to some degree they, especially Galois, were more so than others.
But if one considers the likes of Gauss and Riemann, one encounters a reasonably conventional mathematical, as well as broader, education. (Kafka was remarkably educated and not only when it comes to knowledge of literature, and Gromov, perhaps not coincidentally, brings up Russian literature in his interview, and some of its greatest figures, comparable to Kafka, including as, in some cases, representing a minority literature.) Gromov must have been aware of these qualifications. His point was, I would surmise, different, and it may be expressed in terms of a minority thinking in mathematics or elsewhere. As indicated above, even more than individuality, our education resists and discourages orientations toward a minority thinking, not the least because while transforming major thinking it challenges even more majority thinking. As I argue, however, without a minority thinking in mathematics and a chance, and a freedom, to put it into practice the likes of Abel and Galois are not possible or in any event not likely to emerge. The problem is equally acute elsewhere, including in literature and art, which have still managed to produce their “Abels,” such as Kafka, throughout much of the twentieth century, although there are now no more Kafkas or Joyces around either. Still, minority literature or art might have a greater chance, a chance diminished by the contemporary nature of education in mathematics and science, although there are (hence, I qualify by “might”) societal and specifically educational forces operative against pursuing minority ways in literature and art.11 Mathematics or other fields could also be transformed more within major practices – more, but never entirely so. Instead, such cases show (and hide) more incremental multistep punctuated processes, with a lesser impact of minority thinking entering from a manifested exterior, such as that exhibited in Abel and Galois’ thinking.
10 For a critique of Schiller’s “aesthetic ideology,” including his misreading of Kant, see (De Man 1996), and, for the connections between the general problematic of “aesthetic ideology” and the idea of algebra, in connection with Paul de Man’s analysis, see (Plotnitsky 2001).
11 More generally, as I argue here, if there is no opening toward minority thinking, there is no creative thinking in any domain, but only a proliferation of opinion and forms of programming according to preestablished rules, which are essentially equivalent to or are generative of opinions. Although I mean the term “programming” in a broader sense here, I do imply that the recent development of and claims for the so-called artificial intelligence (a misleading term), such as the now ubiquitous ChatGPT and its avatars, have nothing to do with thought, but are merely forms of programming. In this respect, the term artificial intelligence (AI) is misleading, although it is difficult to replace it at this point, for example, by a much more accurate “digital robotics.” As such, such programs merely propagate knowledge defined by or defining opinions, with no space for a minority thinking and hence creativity. Even if such programs use things derived from minority thinking gathered from Internet archives, they will convert them to preestablished information or rigid articulations of ideas. While recognizing many extraordinary accomplishments of digital technology, one should not confuse a clever programming with thinking. A remarkable and in many respects distressing aspect of ChatGPT and related software is that they do what they do without thinking. Their use in mathematics, much discussed recently as well, is a separate issue, which I put aside. I merely note that, in my view, it does not seriously challenge my main point concerning the difference between programming and creative thinking, which is always a minority thinking. ChatGPT can only help to propagate opinions.
Galois is an equally exemplary case in both his uniqueness as a minority mathematician and in the role of “the concept of underlying symmetries of structures” in mathematics invoked by Gromov, who mentions Galois theory as an example following Abel’s “revolution.” It would, however, be difficult to argue that Galois theory merely follows Abel in this regard, and Gromov would probably admit its greater overall significance. It was the first expressly formulated actual theory of the type Gromov invokes. For one thing, it is a product of group theory,
which, as I have stressed throughout, enacts a broad deterritorialization of modern mathematics, as well as physics and other sciences, natural and social. It is hard to think of a broader one. Some elements of group-theoretical thinking are found in Abel, or Legendre, earlier, and Abel’s thinking in general has deeper affinities with Galois in rethinking the fundamentals of mathematics as such. But the concept of group carries Galois’ signature. It was also accompanied by another important concept, that of an action of a group on another structured entity, such as an algebraic field or vector space. Besides, Galois theory is much more than only part of group theory. For Vuillemin, deeply steeped in Bourbaki’s ideology, the emphasis is on the idea of structure, and he sees the concept of group as the first formal algebraic structure in the history of mathematics. Galois’ thinking, then, is both akin to that of Abel as concerns Abel’s philosophy of solving problems and reaches beyond Abel, even if it is, as Gromov argues, not as far beyond as is sometimes claimed. It does so, again, by bringing in the role of a structure and the concept of group as an entity (we would now say a set) with a structure, a relationally organized multiplicity. Galois, for Vuillemin, becomes the first modern algebraist, who associates with solving a problem the analysis of the structure from which this problem arises and which makes it possible or not to solve it, such as the problem of finding algebraic solutions by radicals of polynomial equations. Vuillemin’s structuralist perspective is not out of place. One cannot avoid structures or merely dismiss structuralism, an important movement, conceptually and historically (Corry 2006). For one thing, concepts are also structures. On the other hand, one might, as I do, see structures as emerging from and as concepts, which thus precede structures in creative thinking in mathematics. 
Indeed, as in the case of Abel’s thinking, one of Vuillemin’s main points in considering Galois is more about concepts than structures, keeping in mind that a concept is a structure, too, and conversely any given structure is a concept. Sometimes, the structure of a mathematical concept, such as that of a group, may coincide with the mathematical structure defined by this concept. Still, a structure qua structure must be defined and thus comes from a concept, possibly a concept-form. Vuillemin distinguishes two types of abstraction: The first is conventional, defined by an increasing generalization from the particular to the general, and the second, found in group theory, is conceptual in the present sense. It defines the differences between specific individual cases because it constructs these cases from the initial concept. Vuillemin sees the second abstraction, which he terms “formalization,” as a double abstraction: The first is that of the formal presentation of the group, as a concept, by algebraic symbols and their relations; the second is that defined by providing a method that makes it possible “to construct individuals, no longer in intuition according to imperfect [arbitrary] schemes, but [determinately] in the concepts themselves” (Vuillemin 1962, pp. 288–289; translation mine). Thus, one deals with an algebraically formalized conceptual practice and redefines algebra accordingly, as a deterritorialization of algebra by means of a minority algebra. In Galois’ case and, more extensively, in Abel’s case, one also deals with a parallel shift to the conceptual algebraization of analysis, as well as geometry, where
this shift was initiated with the analytic geometry of Fermat and Descartes. In noting that, in the wake of Galois, “modern mathematics is therefore regarded as based upon the theory of groups or set theory rather than upon differential calculus,” Deleuze adds: “Nevertheless, it is no accident that Abel’s method consisted in giving a problem a form such that it is always possible to solve it, concerned above all the integration of differential formulae” (Deleuze 1995, pp. 180–181). Galois’ concept of group as a multiplicity with a structure was an entirely new type of concept, created by a minority mathematics from within and transforming, reconstructing, the major mathematics of his time and of the future. As a concept-form, it defined a new problem-posing field through the new mathematical formulations of existing problems (such as that of the solution of polynomial equations by radicals) and posing new problems, and the simultaneous establishing of their conditions of solubility. This new thinking changes the future of mathematics as a field of solving problems, because these problems themselves and the conditions of their solubility change, by asking under what conditions a given problem can be solved. While Abel and already Ruffini did something similar, there was no specific concept of the type a group is in its power and generality. Galois, again, also creates the concept of an action of a group on another structural entity, such as a field in algebra or a manifold in geometry. One has now a new abstract concept, an invention of thought, that of a group. This new concept governs a previously established object, an equation, or eventually governs a field (another new concept-form), or a class of such objects, similarly to the way a Platonic form precomprehends a given concrete reality. As indicated earlier, however, there are also major differences (hence my emphasis) between these two conceptions.
The objects considered, such as equations or fields, are not “shadows” of a group as this new abstract form, which does not preexist them, as Plato’s form or idea would. Each such object or instantiation of a concept, such as an equation or a field, has its own definition. Just as that of a group, a more general concept of an equation (which has a long and complex history) or a field, like algebra, only preceded a particular equation, or area of algebra, locally, and not as a Platonic form, preexisting them in some “eternal” domain. (Where, one wonders, would a mathematical Platonist place such a domain?) In the present view, any new concept is a local invention of human thought and, to one degree or another, is always a product of a minority thinking. As such, it relates and handles the reality already in place and the problem this reality posed and, by doing so, also establishes a new problematic field, such as Galois theory in the case of a group. In fact, in the process it creates a new mathematical reality (Plotnitsky 2022, pp. 83–95). It always does so, however, locally and contingently, in the interaction between a minority and a major mathematics, rather than being governed by one or another preexisting Platonist reality or a single transforming Ur-reality governing all possible local transformations. Such Ur-reality would imply a possibility, at least in principle, of a single major mathematics to which a minority mathematics could contribute but that it could not change by an independent act of creation. A minority mathematics could, in this case, only participate in transformations of a single major mathematics, Ur-mathematics, defined by this Ur-reality. In the present view, the possibility of
either this type of Ur-reality or this type of Ur-mathematics is precluded in principle. So is, it follows, a grand narrative of mathematics, replaced by multiple narratives of its continuous and discontinuous developments, with multiple mathematical practices (minority, major, and majority ones, among them) in multiple interactions.
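The group-theoretic criterion of solvability quoted from Deleuze earlier in this section can be checked mechanically in small cases. The following is a minimal sketch in Python, not a reconstruction of Galois’ own procedure. It assumes the standard modern reformulation according to which a finite group is solvable exactly when its derived series (the iterated commutator subgroups) reaches the trivial group, and it takes for granted the standard fact that the general equation of degree n has the full symmetric group on n letters as its group. All function names are hypothetical and introduced only for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, degree):
    """Close a set of permutations under composition.

    In a finite group this closure is automatically a subgroup.
    """
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def commutator_subgroup(group, degree):
    """Subgroup generated by all commutators a b a^-1 b^-1 of the group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, degree)


def derived_series_orders(degree):
    """Orders of the successive derived subgroups of the symmetric group.

    Assumption (not from the chapter): the symmetric group stands in for
    the group of the general equation of the given degree.
    """
    group = set(permutations(range(degree)))
    orders = [len(group)]
    while True:
        derived = commutator_subgroup(group, degree)
        if len(derived) == len(group):  # the series has stabilized
            break
        group = derived
        orders.append(len(group))
    return orders


def is_solvable(degree):
    """Solvable exactly when the derived series ends in the trivial group."""
    return derived_series_orders(degree)[-1] == 1


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, derived_series_orders(n), is_solvable(n))
```

Under these assumptions, the series ends in the trivial group for degrees 3 and 4 but stalls at the alternating group of order 60 for degree 5. The impossibility of a general solution by radicals for the quintic is thus read off from the structure of the group rather than found by trial and error, which is the point of the passage from Deleuze.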
3 The Three Mathematics of Emmy Noether
Emmy Noether, a major figure in twentieth-century mathematics, is also an important case of a minority mathematics. Her case has additional complexities, first, because the relationships between a major and a minority mathematics in her work in different areas of mathematics were different, and second, because of the “geography” of mathematics at the time, with Göttingen, where she worked, as one of its centers or even the center of major mathematics in most key areas of modern mathematics then. Although her achievements and the importance of her work have always been acknowledged by mathematicians and historians of mathematics, her role as a shaping figure in twentieth-century mathematics has remained an underappreciated subject for a long time, although the situation has changed in recent years. Described by some of the greatest twentieth-century mathematicians and physicists, Hilbert, Einstein, and Weyl, among them, as the most important woman mathematician in the history of mathematics, she was once referred to, in The New York Times, as “the most significant mathematician you’ve never heard of” (Angier 2012). A decade old now and no longer as applicable, this assessment still reflects both the status of mathematics in our culture and the status of women in mathematics. While the second subject has been extensively considered in literature, including in connection with Noether’s work, the first has not received much attention, although it has been given more attention in recent years as well. This chapter does not intend to offer an analysis of either subject as such. It might, however, suggest a new perspective on both by considering Noether’s work as a minority mathematics. 
In geometry, the primary field of her earlier work, she followed the tradition, defined by the role of symmetry and group theory, established by Riemann, Felix Klein, Sophus Lie, and others, and by that time also shaping physics, specifically general relativity, in which Noether worked as well. In algebra, the primary field in her later work, she followed the tradition initiated by Galois and developed especially by Dedekind. She also made important and, in some respects, revolutionary contributions to algebraic topology. All three fields were by then parts of major mathematics, even though continuous innovations within each involved (in the present view, unavoidably) minority mathematical thinking and practices. Noether’s thinking and practice were minority ones both in their purely mathematical aspects and given her social and specifically institutional situation, because she was a female mathematician in a (vastly) male-dominated field, as mathematics was then, including in all areas of her work. Here, I shall be primarily concerned with her mathematics, without offering an analysis of the second aspect of her minority practice, addressed in several recent studies, including her biographies. I shall argue that,
while all three facets of her work still represented, in view of their innovative nature, minority mathematics, each case was different. Her work on her celebrated theorems was more firmly within what was major mathematics in geometry at the time, her work in algebra had much more significant minority dimensions, and her contribution to algebraic topology was the most pronounced manifestation of her minority thinking in mathematics. All three of Noether’s mathematics, however, also confirm that a minority mathematics depends on and is not possible without working with a major mathematics. One might read in this way Noether’s refrain “Es steht alles schon bei Dedekind [this is all already in Dedekind],” even if she herself meant it more directly, a claim that is difficult literally to apply to Noether’s ideas in algebra, many of which were too original to be thus attributed to Dedekind (McLarty 2006, p. 212). Instead, these connections to Dedekind, whose work was by then a well-established part of major algebra, reflect the dependence of Noether’s minority mathematics on its interaction with her contemporary major mathematics in making possible her contributions, profoundly innovative as they were. This, I argue here, is always the case. In sum, this section presents a new angle, that of a minority or Abelian mathematics, on Noether’s work. During the last half a century or so, Noether has become especially known, including in her recent public recognition, for her theorems, published in 1918. (Most of the studies devoted to her work focus on these theorems.) By contrast, she has always been world-famous for her contributions to modern algebra and (to a somewhat lesser degree) algebraic topology among mathematicians and historians of mathematics.
The significance of her contribution in topology was due to Noether’s realization that algebraic operations, related to homotopies and homologies, in topology should be considered in terms of abstract algebraic concepts, such as groups. This view, quickly adopted by her contemporary topologists and, at least in the case of homotopies, by earlier figures, such as Henri Poincaré, had proven exceptionally fruitful for the development of algebraic topology and has come to define both cohomology and homotopy theories there. (Both, especially cohomology theory, were also developed in algebra and algebraic geometry.) As discussed below, however, Noether’s thinking in this field and in algebra itself was more innovative and more radical, given her use of what Wolfgang Krull, a major algebraist in his own right, referred to as “Noether’s principle” (Krull 1935, p. 5; McLarty 2006, p. 218). The principle is grounded in the view that abstract mathematical concepts, such as groups, rings, or modules, should be primarily defined in terms of their structure, a view that came to define modern algebra and much of modern mathematics, in part following Bourbaki’s structuralism. This understanding as such was not especially new, although Noether was a key figure in establishing it. Noether’s principle, however, was new. It says that the study of these objects should focus on the relationships, morphisms, between algebraically structured objects (sets), such as groups, rings, or modules, rather than merely on these objects themselves. Innovative as it was in algebra, Noether’s work was even more so in topology at the time, although it has become routine since, including as translated into categorical terms. Noether’s principle and, in the first place, Noether’s ideas concerning the
role of groups in considering homotopies and homologies of topological spaces were developed in algebraic topology by her disciples and followers, such as Heinz Hopf and Pavel S. Alexandrov, as well as independently by others, such as Leopold Vietoris. Noether’s principle is a precursor of later categorical and functorial principles, as considered earlier in this study, especially in Grothendieck’s work. It would, in my view, be difficult to claim that Noether herself went that far. There is, in her work, no suggestion of the relations, functors, between categories or protocategories that she considered. In retrospect, algebraic topology is inherently categorical. What made it an exact mathematical discipline was the fact that one can associate algebraic structures (initially numbers and algebraic operations, eventually groups and other algebraic structures) to the architecture of spatial objects that are invariant under continuous transformations, independently of their geometrical properties, associated with measurements. It thus relates, functorially, objects of topological and algebraic categories. This nature of algebraic topology had, however, to be made categorical in its rigorous sense, which took a while and happened about a decade after Noether’s death (in 1935). Her thinking in algebraic topology was, nevertheless, a contribution to this history. Alexandrov, one of the topologists influenced by Noether, commented on this new, minority, thinking (addressing a Russian audience), in commemorating Noether, first by comparing her work on algebra and differential geometry with that of Sophia Kovalevskaya: It is often forgotten that in this period Emmy Noether obtained excellent results concerning the concrete algebraic problems of Hilbert.
These results and her work on differential invariants would have been enough by themselves to earn her the reputation of a first-class mathematician and are hardly less of a contribution to mathematics than the famous research of S. V. Kovalevskaya. (Alexandrov 1936, cited in Dick 1981, p. 156)
He added, giving Noether greater credit for her contribution to algebra and then algebraic topology: But when we think of Emmy Noether as a mathematician, we have in mind not these early works, important though they were in their concrete results, but rather the main period of her research, beginning in about 1920, when she became the creator of a new direction in algebra and the leading, the most consistent and prominent representative of a certain general mathematical doctrine – all that which is characterized by the words ‘begriffliche Mathematik’ [abstract mathematics]. (Alexandrov 1936, cited in Dick 1981, p. 186)
It is worth noting that “begriffliche” means conceptual, which does not exclude abstract, but is not quite the same (regardless of what Alexandrov thought). I shall, however, consider first Noether’s celebrated theorems by way of contrast, as an example of a primarily major mathematics, while, again, keeping in mind that, like any innovation, they contained ingredients of a minor mathematics, even in purely mathematical terms, aside from the gender aspects of Noether’s minor mathematics, which would require a separate analysis. Noether’s theorems, especially Noether’s first theorem, were mathematical generalizations of previously known mathematical expressions of physical conservation laws (those of momentum, energy, and angular
momentum) in classical mechanics and relativity. General relativity introduced mathematical complexities into the formulation of the energy-conservation law, which Noether handled in the article introducing her theorems. Her theorems were, however, formulated abstractly (a minority move), and they can be considered apart from any reference to physics, in the style of symplectic geometry. (The latter originates in the Hamiltonian formulation of classical mechanics, where the phase spaces of certain systems assume the structure of a symplectic manifold, defined by a closed, nondegenerate 2-form.) When applied in physics, Noether’s theorems state that every differentiable symmetry of the action of a physical system has a corresponding conservation law. The action of a physical system is an integral over time of a Lagrangian, which allows one to determine the system’s behavior by using the principle of least action (in its original formulation, the action was the quantity defined as the integral of the momentum over a given distance traveled by a body). The theorems are an application of the principle, arguably stated first by Pierre Louis Maupertuis in 1744 and mathematically developed by Leonhard Euler around the same time, and then by Joseph Louis Lagrange and Sir William Rowan Hamilton. Leibniz had related ideas earlier. Philosophically, the idea that nature follows the most efficient path possible can be traced much earlier, even to ancient Greek thinking. According to Yvette Kosmann-Schwarzbach’s account, closely following Noether’s article: Noether’s first theorem, a generalization of several conservation theorems that were already known in mechanics . . . She considers a multiple integral,
$$I = \int \cdots \int f\left(x, u, \frac{\partial u}{\partial x}, \frac{\partial^2 u}{\partial x^2}, \ldots\right) dx,$$
of a higher-order Lagrangian $f$ that is a function of $n$ independent variables, $x_1, \ldots, x_\lambda, \ldots, x_n$, and of $\mu$ dependent variables, $u_1, \ldots, u_i, \ldots, u_\mu$, as well as of their derivatives up to a fixed but arbitrary order, $\kappa$. She then considers a variation of $u$, $\delta u = (\delta u_i)$, and derives identity (3),
$$\sum_{i=1}^{\mu} \psi_i \, \delta u_i = \delta f + \operatorname{Div} A,$$
where the $\psi_i$ are the Lagrangian expressions, which is to say the components of the variational derivative (Euler–Lagrange derivative) of $f$, and where the components $A_\lambda$ of $A$ are linear in the variation $\delta u$ and in its derivatives. The opposite of the quantity $A$ is now called the Legendre transform of the Lagrangian $f$. Here Div is the ordinary divergence, $\operatorname{Div} A = \sum_{\lambda=1}^{n} \frac{\partial A_\lambda}{\partial x_\lambda}$, of $A = (A_1, \ldots, A_n)$ considered as a vector in $n$-dimensional space, and $\delta f$ is the variation of $f$ corresponding to the variation $\delta u$ of $u$, while the variation of $x$ is assumed to vanish. Identity (3) is obtained by an integration by parts. In the case where $n = 1$, the case of a simple integral, Noether gives an expression for $A$ for an arbitrary $\mu$, first for $\kappa = 1$, which yields what she calls Heun’s “central Lagrangian equation,” then for an arbitrary $\kappa$, and then she states her theorem:
I. If the integral $I$ is invariant under a [group] $G_\rho$, then there are $\rho$ linearly independent combinations among the Lagrangian expressions which become divergences – and conversely, this implies the invariance of $I$ under a [group] $G_\rho$. The theorem remains valid in the limiting case of an infinite number of parameters. Noether explains that “in the one-dimensional case,” that is, when $n = 1$, one obtains first integrals, while, “in higher dimensions,” i.e., when $n > 1$, “one obtains the divergence equations which, recently, have often been referred to as conservation laws.” By the “limiting case” included in the statement of Theorem I is meant the case in which the elements of the group depend on an infinite but denumerable set of parameters, as opposed to the case dealt with in her Theorem II (Kosmann-Schwarzbach 2011, p. 57).
Noether’s second theorem, in which this limitation no longer applies, is: II. If the integral $I$ is invariant under a [group] $G_{\infty\rho}$ depending upon arbitrary functions and their derivatives up to order $\sigma$, then there are $\rho$ identities among the Lagrangian expressions and their derivatives up to order $\sigma$. Here as well the converse is valid. (Kosmann-Schwarzbach 2011, p. 4)
The physical meaning of the theorems easily follows given that the mathematics used (ideally) represents the key physical features of the systems considered, in an essentially geometrical way, notwithstanding the algebra of the theorems, that of symmetry groups. For systems with suitable Lagrangians (which allow for this representation), as most classical and relativistic systems are, symmetry under continuous translations in time implies the conservation of energy, symmetry under continuous translations in space implies the conservation of linear momentum, and symmetry under continuous rotations implies the conservation of angular momentum. These formulations apply straightforwardly in classical physics, with the mathematical proof following automatically from Noether’s first theorem. As noted, general relativity involves certain mathematical complexities, in part leading Noether to her second theorem. Noether did important work on general relativity earlier, and her work on her theorems was in part prompted by Hilbert’s questions concerning energy conservation there, on which he consulted Noether and which Noether resolved by a subtle argument in her article (Noether 1918; Kosmann-Schwarzbach 2011, pp. 63–64). Noether closes her article by citing Klein’s famous pronouncement that relativity is an invariance under a group of transformations, a view that of course applies in mathematics itself (Noether 1918; Kosmann-Schwarzbach 2011, p. 22, note 27, pp. 63–64). These complexities do not affect the geometrically representational (realist) character of her theorems. Just like classical physics, relativity, special and general, remains a realist theory, which means that Noether’s theorems or symmetry principles apply in essentially the same way as in classical physics. In principle, her theorems are, again, independent of physics, with which, however, they have been primarily associated, including by Noether herself.
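To make the simplest of these cases concrete, the following worked equation sketches how invariance under time translation yields energy conservation for a single degree of freedom. The notation is an assumption introduced here for illustration and is not Noether’s or Kosmann-Schwarzbach’s: $L(q, \dot q)$ is a Lagrangian with no explicit time dependence, $\psi$ is its Lagrangian expression, and $E$ is the associated energy.

```latex
% Assumed notation (not from the source): one coordinate q(t), Lagrangian L(q, \dot q)
% with no explicit dependence on t, i.e. invariant under t -> t + \varepsilon.
\begin{align*}
\psi &:= \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}
  && \text{(Lagrangian expression)} \\
E &:= \dot q\,\frac{\partial L}{\partial \dot q} - L
  && \text{(energy)} \\
\frac{dE}{dt}
  &= \ddot q\,\frac{\partial L}{\partial \dot q}
   + \dot q\,\frac{d}{dt}\frac{\partial L}{\partial \dot q}
   - \frac{\partial L}{\partial q}\,\dot q
   - \frac{\partial L}{\partial \dot q}\,\ddot q \\
  &= -\,\dot q\,\psi .
\end{align*}
% Hence \dot q\,\psi = -dE/dt: a combination of the Lagrangian expression is a total
% derivative (the n = 1 form of a divergence), and on solutions (\psi = 0) E is a first integral.
```

For instance, with $L = \tfrac{1}{2} m \dot q^2 - V(q)$ the same computation gives $E = \tfrac{1}{2} m \dot q^2 + V(q)$, the familiar sum of kinetic and potential energy.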
The situation becomes different when Noether’s theorems appear in quantum mechanics or quantum field theory, where one can still use them. More accurately, one can use certain versions of these theorems, because they have to be reformulated even for the same dynamical variables – momentum, angular momentum, and energy – as those used in classical
physics in view of the nature of the formalism of quantum theory versus that of classical physics or relativity. In quantum theory, these variables are no longer functions of real quantities but are operators in Hilbert spaces over ℂ, in the case of these variables, infinite-dimensional Hilbert spaces. The theorems can also be generalized to variables, especially discrete variables, such as spin, and symmetries that are strictly quantum and have no classical counterparts. The subject will be put aside, because these versions of Noether’s theorems have only remote connections to Noether’s work; they are no longer Noether’s Noether’s theorems. I have considered this subject in detail in (Plotnitsky 2022, pp. 205–210). My point here is that her theorems as such, Noether’s Noether’s theorems, were by and large the products of what was then a major mathematics, important and innovative as these theorems were and, as such, unavoidably dependent on a minority mathematical thinking in their discovery. Thus, as noted, formulating them abstractly apart from any reference to physics was a minority move, characteristically Noetherian in nature, arguably, a link to her work in abstract algebra and then algebraic topology, where, however, the minority nature of her thinking becomes much more pronounced. Noether’s work in abstract algebra is one of the greatest representations of the idea, and the ideal, of modern algebra as abstract algebra. Algebra was part of and shaped modern mathematics in the present definition, as abstract mathematics in general (vs. representing natural objects, specifically in physics), from around 1800 on. On the other hand, the term “modern algebra” has immediate connections to and even originates in Noether’s work, because it appears to have been used for the first time in the famous book, under this title, “Modern Algebra [Moderne Algebra],” by her close associate and collaborator, van der Waerden (1930). 
The book was based on lectures given by Noether and Emil Artin. It was one of the first books to use an abstract (axiomatic) approach to algebraic structures, such as groups, rings, and fields, and as such became, transitioning from a minority to a major approach, one of the most influential texts on algebra in the twentieth century. It reflected Noether’s mathematical practice, again, initially as a minority practice, quickly transformed by Noether and her followers, into a major practice, with Galois and especially Dedekind as the main precursors. The grounding principle of modern algebra is what Vuillemin, as discussed above, defined, in considering Galois’ concept of group (but clearly following post-Noetherian algebra, as shaped by Bourbaki structuralism), as a double abstraction, or “formalization.” The first abstraction is that of the formal presentation of the group, as a concept, in terms of algebraic symbols and their relations; the second is that defined by a method that makes it possible “to construct individuals, no longer in intuition according to imperfect [more arbitrary] schemes, but [determinately] in the concepts themselves” (Vuillemin 1962, pp. 288–289; translation mine). In other words, this is an abstraction as a creation of new general (abstract), but in their construction concrete, intrinsically concrete, mathematical concepts. Such concepts can also borrow elements from elsewhere and mathematize them, thus abstracting them from their source areas. Symmetry is an example of such a concept: It was borrowed from general phenomenal spatial considerations, but made into invariance under transformations, which may form a group, an algebraic concept that can be
used, as it was by Galois, outside geometrical considerations, and, as a mathematical concept, defined by that of invariance, it came to geometry from algebra. In Noether’s work, then, and, in part under the influence of her work, in general, modern algebra is practiced as abstract algebra – algebra defined by means of its abstract concepts and structures. Commonly, as in van der Waerden’s book, and in most commentaries on Noether, the emphasis is on structures and axioms. I would argue, however, that concepts are equally important to Noether, thus making Noether an heir of Riemann as well as Dedekind, who was of course himself an heir of Riemann. For one thing, as noted, a structure is a concept or a set of concepts first, even though concepts are in turn defined by their structure as an organization, composition, of their elements. As such a concept and a structure often coincide in the case of purely mathematical concepts, but the concept of concept and the concept of structure are still different. According to van der Waerden’s (reported) recollection: “The maxim by which Emmy Noether was guided throughout her work might be formulated as follows: ‘Any relationships between numbers, functions, and operations become transparent, generally applicable, and fully productive only after they have been isolated [abstracted] from their particular objects and been formulated as universally valid concepts’” (Dick 1981, p. 101). Structures are, however, essential for defining morphisms between (in present-day terms) categorical entities, and morphisms are central for Noether’s work in algebra and algebraic topology. Thus, integers (which do not form a field) become studied as a commutative ring – a concept with a formally defined abstract structure. One of the outcomes of this approach is Noether’s proof of the unique factorization theorem, the Lasker-Noether theorem. 
The theorem extends the fundamental theorem of arithmetic, which states that every integer greater than 1 can be factored uniquely into primes. Such unique factorization does not hold for the ideals of commutative rings in general, but a suitable counterpart of it, the decomposition of ideals into primary ideals, holds for Noetherian rings, a very beautiful mathematical discovery. A Noetherian ring satisfies the ascending chain condition on left and right ideals. A ring is left-Noetherian or right-Noetherian if the chain condition is satisfied for the left or the right ideals, respectively, which means that every increasing sequence $I_1 \subseteq I_2 \subseteq I_3 \subseteq \cdots$ of left or right ideals stabilizes, that is, there exists $n$ such that $I_n = I_{n+1} = \cdots$. Equivalently, a ring is left- or right-Noetherian if every left ideal (or right ideal) is finitely generated, or Noetherian if both conditions are satisfied. The Lasker-Noether theorem states that every commutative Noetherian ring is a Lasker ring, a ring in which every ideal can be decomposed as an intersection, called a primary decomposition, of a finite number of primary ideals (which are not the same as prime ideals, which share many key properties of prime numbers).12 The theorem was proven by Emanuel Lasker (famous as a world chess champion) in 1905 for polynomial rings and convergent power series rings, and in full generality by Noether in 1921. Lasker’s proof, fully in accord with an established major algebra then, was no small achievement, but it was technical (calculational) and cumbersome, and of course limited. Noether’s proof is conceptual and elegant. It was also an example of how conceptual abstraction worked in commutative algebra as, at the time, a minority mathematics. Noether’s later remarkable and highly influential article Abstrakter Aufbau der Idealtheorie in algebraischen Zahl- und Funktionenkörpern [Abstract Structure of the Theory of Ideals in Algebraic Number and Function Fields] (Noether 1927) characterized the rings in which the ideals have unique factorization into prime ideals as the Dedekind domains: integral domains that are Noetherian, 0- or 1-dimensional, and integrally closed in their quotient fields. This article also contained what are now known as the isomorphism theorems, which concern certain fundamental natural isomorphisms, and other key results on Noetherian and Artinian modules. Noether was “the first to develop a general representation theory of groups and algebras, valid for arbitrary ground fields,” presented in the important article “Hyperkomplexe Grössen und Darstellungstheorie” (Noether 1929). According to van der Waerden: [The article] has had a profound influence on the development of modern algebra. . . . In the introduction Emmy Noether states that in recent publications the structure theory of algebras and the representation theory of finite groups have been separated completely. She, on the other hand, aims at a purely arithmetical foundation, in which the structure theory and the representation theory of groups and algebras appear as a unified whole, namely as a theory of modules and ideals in rings satisfying finiteness conditions. (Van der Waerden 1985, p. 244)
12
An ideal Q of a commutative ring A is primary if it is a proper ideal and, whenever xy is an element of Q, then x or $y^n$ is also an element of Q, for some n > 0. In the ring of the integers ℤ, $(p^n)$ is a primary ideal if p is a prime number and n > 0. An ideal P of a commutative ring R is prime if: (a) If a and b are two elements of R such that their product ab is an element of P, then a is in P or b is in P; and (b) P is not the whole ring R.
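The ascending chain condition and primary decomposition described above can be illustrated in the most familiar Noetherian ring, the integers. The specific numbers and the polynomial ring $k[x_1, x_2, \ldots]$ over a field $k$ are chosen here for illustration and do not come from Lasker’s or Noether’s articles.

```latex
% Illustrative example in \mathbb{Z} (numbers chosen for illustration; requires amsmath, amssymb).
\begin{align*}
&\text{Ascending chain: } (12) \subsetneq (6) \subsetneq (3) \subsetneq (1) = \mathbb{Z},
  \text{ which cannot be extended further.} \\
&\text{Primary decomposition: } (12) = (4) \cap (3) = (2^2) \cap (3). \\
&\text{Primary but not prime: } 2 \cdot 2 \in (4),\ 2 \notin (4),\ \text{but } 2^2 \in (4). \\
&\text{A non-Noetherian contrast: in } k[x_1, x_2, \ldots],\quad
  (x_1) \subsetneq (x_1, x_2) \subsetneq (x_1, x_2, x_3) \subsetneq \cdots
  \text{ never stabilizes.}
\end{align*}
```

In ℤ the decomposition simply restates the factorization $12 = 2^2 \cdot 3$, which is the sense in which the theorem generalizes the fundamental theorem of arithmetic.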
In her later work, Noether also made important contributions to the study of noncommutative algebras, including division algebras. The discussion just offered only considered a very limited set of examples of Noether’s work in abstract algebra, sufficient to argue that the high degree of abstraction in her approach could be seen as, at the time, a minority mathematics. I should add, however, that even these examples show (while the methods of modern algebra sometimes hide) the history that shaped these concepts, as a historical part of their intrinsic mathematical concreteness, and as such, also the history of the interaction between established major and minority mathematical practices. Only a few prominent names were mentioned here as part of this history, such as Galois, Dedekind, Hilbert, and Artin. Dozens of other figures, however, even if one thinks only of major ones, are part of this history, even if some of them, such as Kronecker, fought against abstractness, in part, I would argue, by failing to appreciate the intrinsic concreteness of this abstractness. I now move to Noether’s use of the concepts of abstract algebra, especially that of a group, in algebraic topology, a more dramatically minority intervention, if localized to the relationships between a “minor” role of abstract algebra and the major mathematics in topology. The history of Noether’s engagement with topology is well known, although there is more to Noether’s thinking than most accounts convey, and none of them considered this engagement from the viewpoint of a minority mathematics. The situation is complicated by the fact that Noether did not publish on the subject. She
only mentioned it in passing, as an application of group theory, in a single article, devoted to another subject: “in den Anwendungen des Gruppensatzes – z.B. Bettische und Torsionszahlen in der Topologie [in the applications of the group theorem, for example, Betti and torsion numbers in topology]” (Noether 1925). This mention is, however, revealing in conveying her contribution, that of bringing group theory into topology and thus helping to make it more fundamentally algebraic, although others, such as Vietoris, did this independently around the same time. Yet again, however, Noether’s contribution was more innovative and far-reaching than only that of rethinking homology theory by means of group theory. Equally significant was the way in which she did so, nearly, even if not quite, functorially. Her ideas had a great immediate influence, quickly reshaping the major practice of topology, through the work of such figures as Alexandrov and Hopf. According to a widely circulated statement by Alexandrov: When [Noether] first became acquainted with a systematic construction of combinatorial topology, she immediately observed that it would be worthwhile to study directly the groups of algebraic complexes and cycles of a given polyhedron and the subgroup of the cycle group consisting of cycles homologous to zero; instead of the usual definition of Betti numbers, she suggested immediately defining the Betti group as the complementary (quotient) group of the group of all cycles by the subgroup of cycles homologous to zero. This observation now seems self-evident. But in those years (1925–1928) this was a completely new point of view. (Alexandrov 1936, cited in Dick 1981, p. 174)
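In present-day notation, the definition that Alexandrov describes can be sketched as follows. The symbols $C_k$, $\partial_k$, $Z_k$, $B_k$, and $H_k$ are the modern conventions, assumed here for illustration; they are not Noether’s or Alexandrov’s own notation.

```latex
% Assumed notation: C_k = group of k-chains of a finite polyhedron,
% \partial_k : C_k \to C_{k-1} the boundary homomorphism, with \partial_k \circ \partial_{k+1} = 0.
\begin{align*}
Z_k &= \ker \partial_k && \text{(cycles)} \\
B_k &= \operatorname{im} \partial_{k+1} \subseteq Z_k && \text{(cycles homologous to zero)} \\
H_k &= Z_k / B_k && \text{(Betti group, now called the $k$-th homology group)} \\
H_k &\cong \mathbb{Z}^{b_k} \oplus \mathbb{Z}/t_1 \oplus \cdots \oplus \mathbb{Z}/t_r
  && \text{($b_k$: Betti number; $t_j$: torsion coefficients)}
\end{align*}
% Examples: for the circle, H_1 \cong \mathbb{Z}, so b_1 = 1 with no torsion;
% for the real projective plane, H_1 \cong \mathbb{Z}/2, so b_1 = 0 with torsion coefficient 2.
```

The Betti and torsion numbers of the older theory are thus recovered as invariants of a single group, which is why the quotient is the more basic object.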
It was also a minority algebraic thinking in topology, or rather a new minority algebraic thinking, because there is no algebraic topology without algebra. How can topology be algebraic otherwise? On the other hand, what algebra brings into topology may be and in this case was a matter of a minority thinking, and the very emergence of algebraic topology was of course a minority thinking. Noether reportedly also noted that the idea of a Betti group makes the Euler-Poincaré formula easier to understand (Dick 1981, pp. 174–175). There is a clear consensus, which included Alexandrov and Hopf themselves, that their work greatly benefited from Noether’s insights, unpublished but extensively discussed in her circle at Göttingen at the time. On the other hand, there is less consensus and some debate concerning whether homology groups were used before, specifically by Poincaré, who clearly used homotopies as groups and spoke of them as such. A helpful historical and conceptual discussion and references are found in several articles by Colin McLarty (2006, 2011, 2017). In assessing Noether’s contribution, McLarty goes beyond previous arguments concerning Noether’s work. He credits Noether with a more radical and significant innovation than only using groups. He locates her main achievement in the connection of this work to Noether’s development of, in her words, “set-theoretical foundations of algebra” (McLarty 2006, p. 212). This would, in present terms, be seen as a more radical example of a minority mathematics by virtue of bringing homomorphisms (rather than merely groups) to bear on topological maps. The “set-theoretical topology” of his title “Emmy Noether’s set-theoretical topology” (McLarty 2006) is a pun: Part of his argument concerns how algebraic topology (rather than set-theoretical topology, also known
as general topology) was affected by Noether’s concept of set. Luitzen E. J. Brouwer, who worked on these topological maps, becomes part of this history. McLarty’s opening is of some interest in the present context. He uses, as an epigraph, a famous statement by Weyl: “In these days the angel of topology and the devil of abstract algebra fight for the soul of each individual mathematical domain” (Weyl 1939, p. 500). The devil, as a fallen angel in the first place, is a quintessentially minority figure. McLarty then comments as follows: If Hermann Weyl ever put faces on these spirits they were his good friends, the angel Luitzen Brouwer, and the devil Emmy Noether. Weyl well describes the scope of their ambition. But topology and algebra were not fighting each other. They would come to share the soul of most of mathematics. And the angel and devil had worked together. In 1926 and 1927 Emmy Noether induced young topologists gathered around Brouwer to use her algebra to organize the kind of work on topological maps that Brouwer taught. This created the still current basis for ‘algebraic topology.’ It was a huge advance for the structuralist conception of mathematics. And it turned Noether from a great algebraist to a decisive figure for twentieth-century mathematics. (McLarty 2006, p. 211)
This history is an example of the interplay of geometrical and algebraic thinking (in the broad sense, so as to include topology), which sometimes haunted and sometimes inspired mathematics from the Pythagoreans on, as discussed in detail in Plotnitsky (2022). It is true that the angel (of topology) and the devil (of algebra) sometimes worked together. But they also fought against each other, and changed their roles, with algebra becoming a major (but not necessarily majority) angel and topology a minority (but not necessarily minor) devil, keeping here the asymmetry of terms that is part of my argument.13
13
This suspicion concerning algebra is often found in geometrically oriented mathematicians, such as Michael Atiyah, who goes as far as stating, quite beyond Weyl’s more genial ironic assessment, that “algebra is the offer made by the devil to mathematicians,” which even makes one “stop thinking” (Atiyah 2002, p. 7). This is not something that Grothendieck, let alone Noether (even in her early geometrical phase), would have been likely to agree with. It is not easy to so unconditionally distinguish algebra and geometry either, as the present chapter shows and as Atiyah indeed acknowledged, still, however, holding on to his negative sentiment toward algebra. It may well be, however, that, even for Atiyah and certainly for Weyl, the devil of algebra was only the devil of algebra as the “machine” (Atiyah’s word) of calculation, algebra deprived of geometry, or topology. In the present view, algebra may also embrace and create geometry and topology, just as geometry and topology embrace and sometimes create algebra. It would in fact be difficult to find greater manifestations of this interplay than Weyl’s or Atiyah’s own work, including, in Atiyah’s case, that on the mathematics of the Yang-Mills theory, which grounds the standard model. The Yang-Mills theory then played a key role in the topology of four-dimensional manifolds, courtesy of the work of Atiyah’s student Simon Donaldson, on which Atiyah reflects at some length in his article (Atiyah 2002, pp. 12–13) and which was considered in this context by this author in (Plotnitsky 2022, pp. 275–278). See also Athanase Papadopoulos’ article on René Thom, who was equally suspicious about algebra (Papadopoulos 2022, p. 15). Papadopoulos also commented on Atiyah’s view under discussion (Papadopoulos 2022, p. 15).
Formally, my emphasis is on concepts and the conceptual conception of mathematics, rather than on structures and the structuralist conception of mathematics (à la Bourbaki), which governs McLarty’s analysis, in part via (Corry 2006), referred to throughout his article. As noted above, the present emphasis does not invalidate this perspective on “modern algebra,” for which Noether was one of the founding figures, or the role of structures elsewhere. They are important. Morphisms, central to Noether’s (re)algebraization of modern algebra, require structures, as do functors, which are maps between categories as (higher-level) structures. Nevertheless, the present view shifts a perspective, including in Noether’s case. Morphisms and functors are concepts first. The same considerations would apply to Grothendieck, who appears to be seen by McLarty along structuralist lines in this article and on several other occasions (McLarty 2006, p. 218, n. 14; 2018). McLarty’s analysis of Noether could be easily accommodated by the present view. He summarized his argument as follows: The time is important because Noether was just then developing what she called her set-theoretic foundations for algebra. This was not what we now call set theory. It was not the idea of using sets in basic definitions and reasoning. She took that more or less for granted, as did other Göttingers by the 1920s. Rather, her project was to get abstract algebra away from thinking about operations on elements, such as addition or multiplication of elements in groups or rings. Her algebra would describe structures in terms of selected subsets (such as normal subgroups of groups) and homomorphisms. Alexandro[v] applied her tools to Brouwer’s use of continuous maps – though Vietoris was the first to make it work. Noether brought something much deeper and more comprehensive to topology than just the use of homology groups. The next section (and a more technical appendix) will show that groups were familiar in homology before her.
She brought an entire programme of looking at groups, and other structures in algebra, and other structures outside of algebra like topological spaces, in terms of the homomorphisms between them. She called this ‘set-theoretic foundations.’ (McLarty 2006, p. 212)
Noether’s “chief tools for making this work,” McLarty adds, are “her homomorphism and isomorphism theorems” (McLarty 2006, p. 212). In other words, abstract algebra is now the algebra of morphisms, which is “Noether’s principle,” as Krull called it. As McLarty says, citing Krull: Noether used the homomorphism theorems to prove isomorphism theorems, which show that certain relations among subsets imply that certain morphisms are isomorphisms. These and other ideas served Noether’s well-known goal, in Krull’s terms: Noether’s principle: base all of algebra so far as possible on consideration of isomorphisms. (Krull 1935, p. 4) Van der Waerden captured the method exactly: To understand ring ideals is to understand their analogy with normal subgroups, for which ‘we proceed from the concept of homomorphism!’ (van der Waerden 1930, p. 55). (McLarty 2006, p. 218)
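The kind of result Noether’s principle rests on can be stated compactly. The following is the standard modern formulation of the first isomorphism theorem, given as a sketch in assumed present-day notation rather than as Noether’s own statement.

```latex
% Assumed notation: f : G \to H a group homomorphism, e_H the identity of H (requires amsmath, amssymb).
\begin{align*}
\ker f &= \{ g \in G : f(g) = e_H \} \ \text{is a normal subgroup of } G, \\
G / \ker f &\;\cong\; \operatorname{im} f, \qquad g \ker f \mapsto f(g). \\
\text{Example: } & f : \mathbb{Z} \to \mathbb{Z}/n\mathbb{Z},\ f(a) = a \bmod n,\quad
  \ker f = n\mathbb{Z},\quad \operatorname{im} f = \mathbb{Z}/n\mathbb{Z}. \\
\text{For rings: } & \varphi : R \to S \text{ a ring homomorphism} \ \Rightarrow\
  \ker \varphi \text{ is a two-sided ideal and } R/\ker\varphi \cong \operatorname{im}\varphi .
\end{align*}
```

The parallel between the two cases is the analogy van der Waerden points to: ideals play for rings the role that normal subgroups play for groups, because both arise as kernels of homomorphisms.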
It is worth noting that van der Waerden, who, as noted, invoked concepts in reflecting on Noether’s work, speaks here of the concept of a homomorphism, which, or that of a morphism, is a concept-form in the present definition, and is seen as such by Noether, or along with the concept-form of a functor, functorially by
Grothendieck. This point does not affect McLarty’s argument as such, because Noether’s use of homomorphism implied that abstract algebraic concepts such as a group, field, ring, or module, in fact, function as concept-forms, without, in the present view, implying the Platonist view of these concepts (even if Noether had such a view, which is unclear). McLarty concludes: Noether emphasized homomorphisms, and her influence on homology forced topologists and algebraists to bring their methods together. So algebra shifted to an ever-greater focus on homomorphisms. All this algebra looked too abstract to a tough-minded geometer like Solomon Lefschetz. In the 1920s, he made tremendous but obscure progress applying homology in algebraic geometry. It was a major source of his reputation for never stating a false theorem or giving a correct proof. Realizing that his arguments needed serious improvement, he went to work on the homology of topological spaces. His first book on it says: The connection with the theory of abstract groups is clear. . .. Indeed everything that follows in this section can be, and frequently is, translated in terms of the theory of groups. It is of course a mere question of a different terminology. (Lefschetz 1930, p. 29) But Poincaré had said, in promoting analysis situs, that we must not “fail to recognize the importance of well-constructed language” (Poincaré 1908, p. 180). Whether or not Lefschetz noticed this passage, he shared the thought. Soon his topology was heavily algebraic (Lefschetz 1942). He kicked off the spate of articles on what became category theory when he asked for an appendix by Eilenberg and Mac Lane (1942). This became the standard foundation for algebraic topology and for the huge proliferation of cohomology theories in the 1950s and since. The unprecedentedly vast machinery gave unprecedented power in solving concrete problems from topology to abstract algebra to number theory.
Lefschetz wrote: As first pointed out by Emmy Noether, the proper and only adequate formulation of the relation between chains, cycles, . . . requires group theory. (Lefschetz 1949, p. 11, Lefschetz’s ellipsis) In fact, it required Noether’s formulation of group theory [in terms of morphisms], and that soon required functors. (McLarty 2006, p. 230).
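To make concrete the group-theoretic formulation that Lefschetz credits to Noether, the following is a minimal sketch in standard modern notation. The symbols C_n, ∂_n, Z_n, B_n, and H_n are conventional choices introduced here for illustration; they are not notation taken from McLarty or Lefschetz.

```latex
% Requires amsmath (\xrightarrow, \operatorname).
% Chains form abelian groups C_n, linked by boundary homomorphisms
% satisfying \partial_n \circ \partial_{n+1} = 0.
\[
\cdots \xrightarrow{\ \partial_{n+1}\ } C_n \xrightarrow{\ \partial_n\ } C_{n-1}
\xrightarrow{\ \partial_{n-1}\ } \cdots
\]
% Cycles and boundaries are subgroups; homology is their quotient group.
\[
Z_n = \ker \partial_n, \qquad
B_n = \operatorname{im} \partial_{n+1} \subseteq Z_n, \qquad
H_n = Z_n / B_n .
\]
% A chain map f (a family of homomorphisms commuting with \partial)
% induces homomorphisms on homology, compatibly with composition.
\[
f_* : H_n(X) \to H_n(Y), \qquad (g \circ f)_* = g_* \circ f_* .
\]
```

On this reading, the older numerical invariants are recovered from the groups (the Betti number is the rank of H_n), while the induced homomorphisms f_* carry information that the numbers alone cannot, which is where the emphasis on morphisms enters.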
It is difficult to say how close Noether came to functoriality, and McLarty does not appear to claim that she ever reached this concept. The concept, concept-form, of a functor is the next step, which is, however, far from obvious and was only, and could have only been, created as part of category theory. It then became one of Grothendieck’s key concepts, which, while not invented by him, was made by him his own, as I shall discuss in the next section. One might say that in Noether’s remarkable approach to algebra, or algebraic topology, there may already have been, if implicitly, categories, those of groups, rings, or modules, even if there were no functors between categories. In Grothendieck, by contrast, functoriality becomes dominant, as one of the most important concept-forms of his mathematics, as both (initially) a minority mathematics and (eventually) a major mathematics, a mathematics that became central to the advancement of mathematical thought in algebraic geometry and beyond.
4 Topos Theory and Grothendieck’s Parliamentary Mathematical Ontology
Grothendieck’s work on algebraic varieties over a finite field, ultimately allowing one to study them by the tools of algebraic topology, is among the greatest achievements of the twentieth-century major mathematics and a remarkable case of a minority mathematics, thus also a manifested example of the essential connections between major and minority mathematics, emphasized throughout this chapter. The project of extending these tools to finite fields was initiated by André Weil, a key figure in bringing together algebraic geometry and number theory, in which he was an heir of Fermat (and probably saw himself as one), as well as of Kronecker (in this case, Weil did see himself as one). Arguably, Weil was less of a practitioner of a minority mathematics, except to the degree the latter is always involved in a creation of new mathematics, and Weil has created plenty of it. The project has a complex history, including the development, beyond algebraic geometry itself, of algebraic and differential topology, group theory, elliptic functions, and other fields, in which Weil did important work, eventually leading him to his work on algebraic geometry, culminating in his 1946 Foundations of Algebraic Geometry, with the second edition published in 1962 (Weil 1962). The book is distinguished by many important technical contributions, which would be difficult and, for the present purposes, not necessary to discuss here. Eventually (in 1949), Weil suggested that a cohomology theory for algebraic varieties over finite fields, now known as Weil cohomologies, could be developed by analogy with the corresponding theories for complex algebraic varieties or topological manifolds (considered in algebraic topology) in general. Weil’s motivation was a set of conjectures (going back to Gauss, they were stated for algebraic curves by Emil Artin in 1924, and hence are part of the history of the major mathematics in this field), known since as the Weil conjectures. 
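As a minimal worked illustration of the point-counting behind these conjectures, one may take the projective line over a finite field with q elements. The notation N_m and Z(X, t) is the conventional one and is assumed here rather than taken from the chapter; the example is a sketch of the simplest case only.

```latex
% Requires amsmath and amssymb (\mathbb, \operatorname).
% N_m = number of points of X with coordinates in the field with q^m elements.
\[
Z(X, t) = \exp\!\Big( \sum_{m \ge 1} N_m \, \frac{t^m}{m} \Big),
\qquad N_m = \# X(\mathbb{F}_{q^m}).
\]
% Projective line: q^m affine points plus one point at infinity.
\[
X = \mathbb{P}^1 : \qquad N_m = q^m + 1 .
\]
% Using \sum_{m \ge 1} x^m / m = -\log(1 - x):
\[
\sum_{m \ge 1} (q^m + 1)\frac{t^m}{m} = -\log(1 - qt) - \log(1 - t),
\qquad
Z(\mathbb{P}^1, t) = \frac{1}{(1 - t)(1 - qt)} .
\]
% The cohomological reading hoped for: a Lefschetz-type trace formula,
% with F the Frobenius map acting on cohomology groups H^i.
\[
N_m = \sum_i (-1)^i \operatorname{Tr}\big( F^m \mid H^i(X) \big),
\qquad \text{here } 1 + q^m \text{ from } H^0 \text{ and } H^2 .
\]
```

The function is rational in t, and its two factors match the degrees 0 and 2 in which the projective line has cohomology. Rationality of this kind, for general varieties, is among the statements made by the Weil conjectures.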
They concern the so-called local ζ-functions, which are the generating functions derived from counting numbers of points on algebraic varieties over finite fields. These conjectures, Weil thought, could be attacked by means of a proper cohomology theory, although he did not propose such a theory itself. The idea, stated somewhat vaguely in the first place, was initially met with much skepticism. Obstacles were formidable, and the way of overcoming them advanced by Grothendieck was remarkable, almost miraculous. This way is, however, less of a miracle if viewed from within Grothendieck’s mathematical thinking and practice, as minority ones, in several respects unique to him. In this regard, his thinking was similar to that of Abel and Galois, both among earlier predecessors of Grothendieck’s work. As far as Weil’s work is concerned, my focus, defined by the differences between Weil and Grothendieck’s philosophy of algebraic geometry, is Weil’s concept of a “universal domain,” assumed by him to be necessary, versus Grothendieck’s multiple ontologies of topos theory. As against previous approaches to algebraic geometry, that of Weil required, in addition to the base field of definition of the objects (algebraic varieties) considered, an assumption of an (algebraically closed) “universal domain” encompassing all fields that may appear in any constructions made over the base field. The Platonist implications of this claim are tempered by his view that
such a domain need not be unique. Weil argued that the theorems of algebraic geometry will remain the same regardless of the choice of this domain if it has an infinite transcendence degree and is algebraically closed. Grothendieck abolished the idea of a universal domain on his way to solving the problem of cohomologies posed by Weil, by creating a remarkable array of concept-forms, such as a topos, that allow one to define Weil’s cohomologies. These concept-forms gave Grothendieck’s work a unique place in algebraic geometry and beyond. Grothendieck’s definition of “Weil’s cohomologies,” or more accurately Grothendieck’s cohomologies that accomplished the task in question, was a result of his, in its key aspects, minority mathematical practice, even when adopting major concepts of major mathematics of his time. This practice and the ways of thinking grounding it were initiated in his earlier work in functional analysis, his original field of research, although developed, as more directly connected to algebraic geometry, in his work on cohomological algebra and Teichmüller theory.14 A remarkable manifestation of this thinking in algebraic geometry was his invention of the concept of scheme, a radically innovative concept, a concept-form, even though it was almost a natural extension of what was by then the standard view of algebraic varieties. Nobody, however, thought about it at the time, which made it a minority concept. This type of event (which is not uncommon in creative mathematics and science) also requires a rethinking of the idea of “natural,” the naturalness of the natural. Grothendieck’s key idea was replacing maximal ideals with prime ideals of a ring, a natural, Noetherian, move in commutative algebra, by then an established major field.
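A minimal worked example may help show what changes when prime ideals replace maximal ones. The rings ℤ and ℚ and the notation Spec and Max are chosen here purely for illustration and are assumptions of this sketch, not part of the chapter’s exposition.

```latex
% Requires amsmath and amssymb.
% Prime ideals of the integers versus maximal ideals of the integers.
\[
\operatorname{Spec} \mathbb{Z} = \{ (0) \} \cup \{ (p) : p \text{ prime} \},
\qquad
\operatorname{Max} \mathbb{Z} = \{ (p) : p \text{ prime} \} .
\]
% The inclusion i : Z -> Q. Since Q is a field, (0) is its only proper
% ideal, and it is maximal. Its preimage in Z is prime but not maximal.
\[
i : \mathbb{Z} \hookrightarrow \mathbb{Q}, \qquad
i^{-1}\big( (0) \big) = (0) \in \operatorname{Spec} \mathbb{Z}
\setminus \operatorname{Max} \mathbb{Z} .
\]
% In general the preimage of a prime ideal under a ring homomorphism is
% prime, so every homomorphism induces a map of spectra.
\[
\varphi : A \to B \quad \text{induces} \quad
\operatorname{Spec} B \to \operatorname{Spec} A, \qquad
\mathfrak{q} \mapsto \varphi^{-1}(\mathfrak{q}) .
\]
```

Maximal ideals, in other words, are not preserved under preimages along arbitrary ring homomorphisms, while prime ideals are. This compatibility with all morphisms is one standard reason given for the replacement of maximal ideals by prime ideals.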
It is, however, a different matter, when one connects it to algebraic varieties, replacing the latter by schemes, thus introducing a new form of multiplicity insofar as the same algebraic variety can arise from different schemes. It also allows one, which may be seen as another post-Noetherian move, to define algebraic varieties over any commutative ring, such as the integers. In its ultimate formality (favored by Grothendieck), a scheme is a topological space with commutative rings associated to all of its open sets, obtained by gluing together spectra (defined as spaces of prime ideals) of commutative rings along their open subsets. Thus, a scheme is locally a spectrum of a commutative ring. This also leads to a relational view, according to which algebraic geometry should be developed for a morphism X → Y of schemes (designated as a scheme X over Y), rather than for an individual scheme. In some cases, the family of all varieties of a given type can itself be viewed as a variety or scheme, a moduli space. The approach led to an enormous amount of new mathematics, much of which quickly became major mathematics, although not majority mathematics. As is, again, always the case, these concepts, even apart from their interconnections, have their history (as do some of these interconnections, while others are new), for example, in Noether’s work or in more immediately
14 The genealogy of his work on Teichmüller space originates in Riemann’s moduli problem, recast by Grothendieck in his framework. The theory provided an important example of Grothendieck’s use of functoriality, a major concept by then, but given a new life by his minority use of it. For the discussion of this work, see A’Campo et al. (2016).
preceding works in algebraic geometry by Jean-Pierre Serre. A subtler earlier link is the Stone representation theorem, which states that every Boolean algebra is isomorphic to a certain field of sets. The theorem emerged from the spectral theory of operators in a Hilbert space, a subject close to Grothendieck’s early work in functional analysis. The concept of topos was introduced nearly simultaneously with that of scheme, by relating the history of algebraic geometry just sketched to the history of algebraic topology and adopting the method of the latter in algebraic geometry over the finite field. This (unlike their automatic application to algebraic manifolds over ℂ) proved to be impossible to do merely by shifting from varieties to schemes. To be able to do so, one needed a proper topology, which was nontrivial because the objects in question are topologically discrete. A “native” topology that could be defined by them, known as Zariski’s topology, did not work, because it had too few open sets, even if one uses schemes instead of varieties. The decisive insights came from Grothendieck, helped by sheaf-cohomology theory and category theory, “cohomological algebra,” by then the standard tool of algebraic topology, and as such part of major mathematics, already transformed by Grothendieck previously, on which I shall comment below. Using these tools, a hallmark of Grothendieck’s thinking throughout his career, and his previous concepts, such as that of “scheme,” led him to topos theory and étale cohomology as a viable candidate for Weil’s cohomology, which it had quickly proven to be. As Grothendieck said: “The two crucial driving ideas [idées forces] in launching and developing the new geometry were that of scheme, and that of topos. Appearing almost at the same time and in close symbiosis with each other, they were as if a single sinew in the spectacular flight of the new geometry, and this from the very year they appeared” (Grothendieck 2022, p. 
50, cited in McLarty 2018, p. 109; McLarty’s translation). Soon thereafter, he introduced a more general concept of motivic cohomology, a kind of ultimate cohomological concept-form. By using étale cohomology, Grothendieck (with Michael Artin and Jean-Louis Verdier) and Pierre Deligne (Grothendieck’s student) were able to prove Weil’s conjectures, a bit later generalized by Deligne. Grothendieck’s key insight was to generalize, in terms of category theory, the concept of “open set,” beyond a subset of an algebraic variety. This was possible because the concept of sheaf and of cohomology of sheaves could be defined for any category, rather than only that of open sets of a given space. Étale cohomology is this type of replacement, specifically by using the category of étale mappings of an algebraic variety or a scheme, which become “open subsets” of the finite unbranched covering spaces of the variety, a vast generalization of Riemann’s concept of a covering space. Grothendieck was in part building on ideas of Serre. The origin of this generalization was in the fact that the fundamental group of a topological space, for example, and (in the present context) in particular, a Riemann surface, could be defined in two ways. First, it can be defined, more geometrically but still by means of algebra, as a group of equivalence classes of sets of loops at a given point, with the equivalence relation given by homotopy. Alternatively, a fundamental group can be defined even more algebraically, as a group of transformations of covering spaces over the topological space considered. In this second definition, it is analogous to the
Galois group of the algebraic closure of a field, as Serre was first to consider for finite fields, which prepared grounds for Grothendieck’s work on étale cohomology. The idea has, thus, both Galois’ and Riemann’s concepts as important parts of its genealogy. The connection had been established in the case of Riemann surfaces long before Serre’s work, and found, for example, in Weyl’s 1913 book on a Riemann surface, undoubtedly known to Serre: The group of cover transformations, regarded as an abstract group, expresses purely and completely everything in the relation between the normal covering surface F and the base surface F which has the character of analysis situs [topology]. This group is also called the Galois group of F . It is in fact the analog of the Galois group of a normal algebraic field (of finite degree) over a base field. (Weyl 2013, p. 58)
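The analogy Weyl describes can be set out side by side in a short sketch. The circle and the finite field with q elements are examples chosen here for illustration, and the notation Aut, π₁, and Gal is the conventional one, assumed rather than drawn from Weyl’s text.

```latex
% Requires amsmath and amssymb.
% Topological side: for a connected normal covering p : \tilde{X} \to X,
% the group of cover transformations is a quotient of the fundamental group.
\[
\operatorname{Aut}(\tilde{X}/X) \;\cong\;
\pi_1(X, x) \,/\, p_* \pi_1(\tilde{X}, \tilde{x}) .
\]
% Example: the n-fold covering of the circle by itself.
\[
S^1 \to S^1, \ z \mapsto z^n : \qquad
\operatorname{Aut} \cong \mathbb{Z}/n\mathbb{Z},
\qquad \pi_1(S^1) \cong \mathbb{Z} .
\]
% Algebraic side: each finite extension of a finite field is Galois and
% cyclic, generated by the Frobenius map x \mapsto x^q.
\[
\operatorname{Gal}(\mathbb{F}_{q^n}/\mathbb{F}_q) \cong \mathbb{Z}/n\mathbb{Z},
\qquad
\operatorname{Gal}(\overline{\mathbb{F}}_q/\mathbb{F}_q)
\cong \varprojlim_n \mathbb{Z}/n\mathbb{Z} = \widehat{\mathbb{Z}} .
\]
```

The finite quotients on the two sides coincide, and the full Galois group is the profinite completion of the fundamental group of the circle. Read this way, a finite field behaves, for the purposes of covering theory, much like a circle, which is the kind of correspondence the étale point of view makes systematic.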
Étale mappings give a sufficient number of open sets to define adequate cohomology groups for some coefficients for algebraic varieties over finite fields. In the case of complex varieties, one recovers the standard cohomology groups with coefficients in any constructible sheaf. The category of étale mappings is a topos, a concept that is, for now, the most abstract algebraic form of spatiality in mathematics. It was seen as such by Grothendieck himself, who compared his contribution to the idea of space in mathematics to that of Einstein in general relativity in physics and that of Erwin Schrödinger in quantum mechanics (Grothendieck 2022, p. 68).15 Étale cohomologies could be defined for most practical uses in simpler settings, without using the concept of topos. The concept of topos has, however, never lost its importance. It also came to play a major role in mathematical logic. The subject cannot be addressed here, apart from noting that mathematical logic is the culminating development of the trajectory, initiated by Georg Cantor’s work, with radical epistemological implications, concerning the nature of mathematical reality, to which the concept of topos, as a logical concept, may be connected (e.g., Plotnitsky 2022, pp. 82–95). By contrast, Grothendieck’s use of his topoi in algebraic geometry is
15 A reference to quantum mechanics may appear strange, but it is astute. In quantum mechanics, one deals with Hilbert spaces over ℂ (infinite-dimensional ones in the case considered by Schrödinger in his discovery of the theory), the use of which enables one to predict the probabilities of quantum events. (Adding the so-called Born’s rule, which is, essentially, a form of complex conjugation, to this formalism over ℂ allows one to move to probabilities, which are real numbers.) The astuteness of Grothendieck’s reference comes from the fact that quantum mechanics establishes an entirely new relationship between a purely mathematical space, here Hilbert spaces over ℂ, and the actual (three-dimensional) physical space, mathematically represented as a real manifold, as considered in detail in Plotnitsky (2021). Technically, it was Werner Heisenberg (not mentioned by Grothendieck) who defined this situation as such in his discovery of quantum mechanics a few months before Schrödinger (who discovered it independently in a different but mathematically equivalent form). Ironically, Schrödinger became discontented with and repudiated quantum mechanics on account of its epistemological features defined by the situation just described, in particular the role of complex numbers and the probabilistic nature of the theory. He came to see quantum mechanics, as did Einstein, as merely a convenient method of calculations rather than as a proper fundamental theory of the ultimate constitution of nature (Plotnitsky 2021, pp. 145–166). I am not sure whether Grothendieck was aware of this attitude on the part of Schrödinger and Einstein.
essentially ontological rather than logical. His overall position concerning the nature of mathematical reality remains somewhat unclear, for example, as concerns whether it conforms or not to mathematical Platonism. On the other hand, as discussed below, the mathematical architecture of Grothendieck’s algebraic geometry, as different from that of Weil’s algebraic geometry, as defined by a universal domain, may, even if against Grothendieck’s own grain, lead to questioning all known forms of Platonism. To better see why this is the case, I shall offer brief informal remarks on category and topos theory, in connection (direct in the case of topos theory) with Grothendieck’s thinking. First, consider the following way of endowing a space with a structure, generalizing the definition of topological space in terms of open subsets. One begins with an arbitrarily chosen space, X, which may initially be left unspecified in terms of its properties and structure. What would be specified instead are the relationships between spaces, which are applicable to X, such as mapping or covering one or a portion of one by another. This structure is the arrow structure Y → X of category theory, where X is the space under consideration and the arrow designates the relationship(s) in question. One can also generalize the notion of neighborhood or of an open subspace of (the topology of) a topological space in this way, by defining it as a relation between a given point and space (a generalized neighborhood or open subspace) associated with it. This enables one to specify a given space not in terms of its intrinsic structure (e.g., a set of points with relations among them) but “sociologically,” through its relationships with other spaces of the same category, say, that of algebraic varieties or schemes over a finite field of characteristic p (Manin 2002, p. 7).
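The “sociological” specification of a space by its relationships has a precise categorical counterpart in the Yoneda lemma, which can be sketched as follows. The notation h_X, the category 𝒞, and the circle-like example are standard choices assumed here for illustration; they do not appear in Manin’s or the chapter’s text.

```latex
% Requires amsmath and amssymb.
% To each object X attach the collection of all arrows into it.
\[
h_X(Y) = \operatorname{Hom}_{\mathcal{C}}(Y, X),
\qquad h_X : \mathcal{C}^{\mathrm{op}} \to \mathbf{Set} .
\]
% Yoneda lemma: natural transformations between such functors are exactly
% the arrows between the objects, so h_X determines X up to isomorphism.
\[
\operatorname{Nat}(h_X, h_{X'}) \cong \operatorname{Hom}_{\mathcal{C}}(X, X'),
\qquad
h_X \cong h_{X'} \implies X \cong X' .
\]
% Example with affine schemes: X = Spec Z[x,y]/(x^2 + y^2 - 1).
% Arrows Spec R -> X are ring homomorphisms Z[x,y]/(x^2+y^2-1) -> R,
% that is, solutions of the equation in R.
\[
h_X(\operatorname{Spec} R) \cong \{ (a, b) \in R^2 : a^2 + b^2 = 1 \} .
\]
```

In the example, the single object X is known through its solution sets over every commutative ring R at once, rather than through a fixed set of points, which is one way of reading the relational description given above.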
Some among such spaces may play a special role in defining the initial space, X, and algebraic structures, such as homotopy and cohomology, as Riemann realized in the case of covering spaces over Riemann surfaces. (Proper homotopy and cohomology theories were developed later.) Riemann’s idea of a covering space was, as noted, one of the inspirations for Grothendieck’s concept of étale topos. To make this description more grounded, I shall briefly, and again, informally, sketch the key ideas of category theory. It was introduced as part of cohomology theory in algebraic topology in 1940 and extensively used by Grothendieck in his approach to cohomological algebra and algebraic geometry. Category theory considers multiplicities (which need not be sets) of mathematical objects conforming to a given concept, such as the category of differentiable manifolds or that of algebraic varieties, and the arrows or morphisms, the mappings between these objects that preserve this structure. Studying morphisms allows one to learn about the individual objects involved, often to learn more than we would by considering them only or primarily individually. As discussed earlier, a shift toward studying morphisms, even if not categories, was one of the key new features of Noether’s work in both algebra and topology. In a certain sense, in his Habilitation lecture (Riemann 1854), Riemann already thinks in terms of morphisms and even categories (although he did not of course have either concept) because he does not start with a Euclidean space. Instead, the latter is just one specifiable object of a large protocategorical
multiplicity, that of differentiable or, more narrowly, Riemannian manifolds, defined by a particularly simple way we can measure the distance between points. Categories themselves may be viewed as objects defined by the relationships between them, and in this case one speaks of “functors” between categories rather than “morphisms” between objects in a given category. Functoriality is a crucial aspect of category theory and, as emphasized here, of Grothendieck’s work. Grothendieck did not invent the concept of functor, but he made it his own and used it with great effectiveness in his work, including in his invention of new concepts, often, as in the case of his cohomological and homotopical concepts, defined by functoriality, as are, however, cohomology theories in general. Topology relates topological or geometrical objects, such as manifolds, to algebraic ones, such as homotopy and cohomology groups. Thus, in contrast to geometry (which relates its spaces to algebraic aspects of measurement), topology, by its very nature, in effect deals with functors between categories of topological objects, such as manifolds, and categories of algebraic objects, such as groups, rather than only with morphisms, which of course remain essential. There are no functors without morphisms, while morphisms do not require functors. Functors, however, give morphisms their categorical meaning and, as a result, a deeper and more fundamental mathematical meaning and significance. With Grothendieck, mathematical practice becomes that of functoriality: It no longer primarily deals with objects but categories of objects. Although, arguably, less of a minority thinking insofar as it had a tradition by then, it was still highly transformative vis-à-vis previous uses of functoriality, and much of it was new and hence a minority mathematics. 
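A short sketch may indicate how a functor from topological to algebraic objects does work that morphisms alone do not. The functor axioms are standard; the example (that a disk does not retract onto its boundary sphere) is chosen here for illustration and is not discussed in the chapter.

```latex
% Requires amsmath and amssymb.
% A functor respects identities and composition.
\[
F : \mathcal{C} \to \mathcal{D}, \qquad
F(\mathrm{id}_X) = \mathrm{id}_{F(X)}, \qquad
F(g \circ f) = F(g) \circ F(f) .
\]
% Example: homology H_{n-1} as a functor from spaces to abelian groups,
% with n >= 2. Suppose r : D^n -> S^{n-1} were a retraction onto the
% boundary, so that r \circ i = id for the inclusion i : S^{n-1} -> D^n.
\[
\mathrm{id}_{\mathbb{Z}}
= H_{n-1}(r) \circ H_{n-1}(i) : \
H_{n-1}(S^{n-1}) \cong \mathbb{Z}
\;\to\; H_{n-1}(D^n) = 0
\;\to\; H_{n-1}(S^{n-1}) \cong \mathbb{Z} .
\]
% The identity of Z cannot factor through the zero group,
% so no such retraction r exists.
```

The geometric statement follows from nothing more than the two functor axioms and the values of the functor on the sphere and the disk, which is the kind of work done by functoriality.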
Grothendieck’s use of it, as a concept-form, led him to numerous conceptual innovations, beginning with those in Sur quelques points d’algèbre homologique, a long article, now known as the “Tôhoku paper” (Grothendieck 1957). The article, as part of its minority and, thus, Abelian mathematics, also introduced abelian categories, an important new concept, and used this concept to show that sheaf cohomology can be defined as certain derived functors in this setting. The name “abelian category” merits a brief comment in the context of this chapter. The concept was independently introduced earlier (in 1955), by David Buchsbaum, under the name of “exact categories.” Grothendieck’s well-known penchant for selecting his mathematical names with both flair and precision (schemes, étale mapping, and motives, among them) also served him here, even apart from the present context, in which the name carries additional resonances linked to the idea of a minority mathematics. It is unlikely, although not inconceivable, that Grothendieck also had in mind Abel’s thinking in defining modern mathematics, as an example of Abelian mathematics in the present sense of a minority mathematics. The origin of the name had most likely to do with the fact that a prototypical example was the category of abelian (commutative) groups. Grothendieck gave an axiomatic definition, which is secondary for the moment. An abelian category is a category in which objects and morphisms can be added and in which kernels and cokernels exist (and possess certain useful properties). Their significance in Grothendieck’s works and elsewhere in algebraic geometry, cohomology theories, or category theory, as such, has to do with their strong stability
aspects, such as the fact that the category of chain complexes in an abelian category and the category of functors from a (small) category to an abelian category are themselves abelian. Among the examples of abelian categories is that of finitely generated left modules over a left-noetherian ring, which implies that the category of finitely generated modules over a noetherian commutative ring is abelian. This is one of several junctures in which Noether’s work in commutative algebra enters the history leading to Grothendieck’s work, as part of the sequence considered here from Abel to Galois to Noether to Grothendieck. There are, again, many other proper names in between and beyond (or before) this history, only a small sample of which were mentioned in this chapter. As already in his earlier work in functional analysis (where he introduced several new concepts, such as nuclear spaces, in the theory of topological vector spaces), but more significantly in the present context, Sur quelques points confirmed the essential role of inventing new concepts, in particular new concept-forms, in creative mathematics. So did Grothendieck’s subsequent work, defined by a massive proliferation of new concepts and concept-forms. Even those (a large number as well) of such concepts and concept-forms that are contained in thousands of pages of his unpublished works, in a sense reinstating his work as minor mathematics, had major impacts during the last half-century and continue to do so. Those of scheme and topos, however, remain arguably the greatest examples, again, seen as such by Grothendieck himself. A topos in Grothendieck’s sense is a category of spaces and arrows over a given space, used for the purpose of allowing one to define richer algebraic structures associated with this space, as explained above. There are additional conditions such categories must satisfy, but this is not essential at the moment. As a simple example, for any topological space S, the category of sheaves on S is a topos.
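The passage from open sets to arrows, which underlies the example of sheaves on S, can be made explicit in a brief sketch. The notation F, U, U_i, and Sh is the conventional one, assumed here for illustration, and the second display states only the general shape of the condition, not the full definition of a Grothendieck topology.

```latex
% Requires amsmath and amssymb (\rightrightarrows, \mathbf).
% Sheaf condition for an open cover U = \bigcup_i U_i of a space: sections
% over U are exactly the compatible families of sections over the U_i
% (the diagram is an equalizer).
\[
F(U) \to \prod_i F(U_i) \rightrightarrows \prod_{i,j} F(U_i \cap U_j) .
\]
% Generalization: replace inclusions of open subsets by arrows U_i -> U
% belonging to a chosen class of coverings (e.g., etale maps), and replace
% intersections by fiber products.
\[
F(U) \to \prod_i F(U_i) \rightrightarrows \prod_{i,j} F(U_i \times_U U_j) .
\]
% Simplest case: sheaves on a one-point space are just sets.
\[
\mathrm{Sh}(\mathrm{pt}) \simeq \mathbf{Set} .
\]
```

Since the second condition refers only to arrows and fiber products, it makes sense in categories whose objects are not subsets of any one space, and the category of sheaves so obtained is the topos attached to the chosen coverings.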
The concept of topos extends far beyond spatial mathematical objects (thus, the category of finite sets is a topos); it replaces the latter with a more algebraic structure of categorical and topos-theoretical relationships between objects. On the other hand, it derives from the properties of and (arrow-like) categorical relationships between properly topological objects. The conditions, just mentioned, that categories forming topoi must satisfy have to do with these connections. One might think of this ontology as an assembly, a “parliament,” of ontologies (Grothendieck’s concept of topos is, again, ontological, rather than logical, as is the concept of topos in mathematical logic), in the absence of any ultimate ontology, or even, as against physics, any ultimate mathematical or otherwise mental reality, as explained in Plotnitsky (2022, pp. 82–95). As against Weil’s universal domain of algebraic geometry, topos theory enables a new mathematical ontology by allowing for different (mathematical) universes associated with a given space, possibly a single-point one. Grothendieck’s topoi are possible worlds, or compossible worlds in Leibniz’s sense, without assuming, as Leibniz did (in dealing with the actual world), the existence of only one of them, the best possible one. Hence, I refer to this ontology as a parliamentary assembly of possible ontologies, in the absence of any encompassing ontology, implied by the idea of a universal domain, assumed in Weil’s “imperial” vision of algebraic geometry. According to David Reed’s summary:
Grothendieck’s constructions [beginning with schemes replacing the varieties of Weil] are far from the ‘palaces’ which Weil suggests belong to algebraic geometers by birthright. Rather than refurbishing and renewing the old constructions he has instead created an entirely new type of architecture and in the process is forced to make extensive use of the ‘makeshift algebraic constructions’ Weil carefully thought to avoid placing reliance upon. Both Weil and Grothendieck seek generality in order to be able to analyze rigorously geometric situations to which the older algebraic geometers paid little attention. This generality permits consideration of ‘degenerate’ cases, objects with singularities, ‘nonreduced’ objects and the like. Whereas Weil sought this generality through a large ‘domain’ in which the result of his constructions could be contained, Grothendieck seeks generality by expanding the category of objects under considerations until a certain ‘self-sufficiency’ is acquired, i.e. until a natural reflexivity can be found so that the objects in the category can be related to each other and operations on them can be undertaken without going outside the original category. Furthermore, the objects do not require a predetermined ‘fixed point’ outside of the category for their specification. . . . This relativity also extends to the set theoretical foundations of the theory. Grothendieck introduces a set theoretic ‘universe’ which provides the sets used in his constructions and then studies those properties which remain invariant under a change of universe! (Reed 1995, pp. 131, 177n.36)
In question, then, are two very different (mathematical) ontological philosophies – that of unification or encompassment and that of multiplicity, which is relational but has no single encompassing domain. In other words, this multiplicity is a multiplicity without unity, “the multiple without One,” as Alain Badiou would have it (Badiou 2006). This multiplicity was made possible in part by functoriality as a key part of Grothendieck’s mathematical practice, initially a minority mathematical practice. It is true that Grothendieck often speaks of “unity” (also associated with “simplicity”) as more profound than “generality” (e.g., Grothendieck 2022, p. 47; McLarty 2018, p. 124).16 This unity, however, appears to me more as a unity of his vision, although I cannot exclude that Grothendieck might have envisioned an ontology (different from that of Weil) unifying his multiple universes. I am, however, more concerned here with a philosophical view that Grothendieck’s mathematics may support, even if against his own grain. From this perspective, Grothendieck’s generality or, better (Reed is right to switch to this term), relativity of construction, a crucial aspect of his functorial thinking, is a mathematical instantiation or model of the capacity of mathematics as a human endeavor. This model is that of the creation of mathematical realities (as mental ontologies, of which Grothendieck’s topoi are examples), on the basis of, while creatively transforming, previously established realities, in the absence of an ultimate underlying or encompassing Platonist reality. It is a creation of mathematical worlds (plural!) from the materials of human thought, by endowing them with logically rigorous conceptual structures, the novelty of which is especially striking and prolific in Grothendieck.
16 McLarty’s article as well as his (McLarty 2016) offers an informative account of these concepts in Grothendieck, although neither article addresses the question of the multiple, or the implications of Grothendieck’s conceptions of mathematical multiplicities for the question of reality and realism in mathematics, considered, also in connection with Grothendieck’s thinking, in Plotnitsky (2022).
Topoi are conceptual multiplicities by virtue of bringing together the concepts of algebraic geometry and algebraic topology, a strategy that, while already in place in general, was, ironically, motivated by Weil’s cohomological ideas in the case of étale cohomology. One needed to get rid of the concept of a universal domain, but one also needed, in order to do so in the first place but also to be able to work in multiple domains, new concept-forms. The creation of such concept-forms was, again, prolific in Grothendieck’s mathematical practice, the grounding principle of which was functoriality, just as the grounding principle of Noether’s mathematical practice, Noether’s principle, was defined by the mathematics of morphisms between sets with structures (rather than only the structures of sets). It might be added that functoriality was not found in Weil either. Foundations of Algebraic Geometry (still following Noether’s approach, focusing on morphisms rather than functors) mentions categories and functors once, somewhat condescendingly, and that in 1962 (obviously aware of Grothendieck’s breakthroughs by using categories and functors): “Any reader interested in expressing these facts [concerning morphisms of algebraic varieties over topological fields] in the language of categories and functors will presumably be able to do so” (Weil 1962, p. 358). There are, however, no Weil cohomologies without functoriality, implicit in any cohomology theory, as Weil must have been aware. As noted, Weil cohomologies can be defined in simpler settings. However, they were invented through functoriality, a key tool of Grothendieck’s mathematical practice helping many of his conceptual inventions. The concept of functor itself was not one of them, but, as I said, Grothendieck made it his own by his innovative use of it throughout his work.
Grothendieck’s philosophy of mathematical practice, and perhaps his philosophy of mathematics, especially as manifested in the relativity of topos theory, may be seen as mirroring the ultimate task of mathematics, that of creating new mathematical worlds. The multiple ontologies of Grothendieck’s topos theory may be seen as all having their possible voices in the parliament of topos theory, and a similar view can be taken concerning mathematics in general, making it a parliament of various mathematics. These voices, again, can only be given to them by us, by our mathematical thought, individual or collective (Plotnitsky 2022, pp. 82–95).17
17 As I argue on the same occasion, an analogous multiplicity is also a consequence of Gödel’s undecidability, as exemplified by Cantor’s continuum hypothesis. This hypothesis was crucial not only for the question of continuity but also for the question of Cantor’s hierarchical order of infinities (the infinity of which was one of his discoveries, in, it might be added, his own minority mathematics at the time) and thus for the whole edifice of Cantor’s set theory. The hypothesis was proven undecidable by Cohen in 1963, with Gödel contributing a crucial part earlier. It follows, however, that one can extend classical arithmetic in two ways by considering Cantor’s hypothesis as either true or false, that is, by assuming either that there is no such intermediate infinity or that there is. This allows one, by decisions of thought, to extend arithmetic into mutually incompatible systems, ultimately infinitely many such systems, because each of them will contain at least one undecidable proposition. Gödel’s incompleteness theorems may not be a necessary condition of the multiplicity of possible mathematics, as Grothendieck’s ontological relativity suggests. The existence of Cantorian and non-Cantorian set theories, or the existence of Euclidean and the various
Grothendieck’s relativistic ontological architecture “formally” captures or provides a model for the architecture of mathematical practice as allowing the space for alternatives to a dominant major mathematics. It is not only a metaphorical model because this type of mathematical practice can be enacted within an architecture of the type defined by Grothendieck. In this chapter, this practice is expressly considered not merely as some alternative mathematics but as a minority mathematics defined as operative inside and transforming what is the established major mathematics at the time. I further argue that a major mathematics cannot be transformed otherwise, however hidden or implicit a minority practice may be in these transformations. This transformational dynamic is not contained by a model strictly based on Grothendieck’s relativistic architecture but instead underlies this architecture as a more general conception. His own mathematics, including that defining the ontological multiplicity in question, was a minority mathematics aimed to affect and transform an important and very rich part of the major mathematics of his time. As he realized, he was not always successful in this aim at the time when some of his new concepts were introduced (especially after he left the Institut des Hautes Études Scientifiques, where his role was akin to that of Hilbert in Göttingen in the early twentieth century), although some of his initially marginal ideas became more widespread subsequently. The view of a minority mathematics as something inside a major mathematics, rather than outside it, also helps to explain why Grothendieck continued to reflect on the nature of the major mathematics of his time, especially in the fields related to his work but, at least by implication, in general. Our major mathematics, he thought, did not pay enough attention to foundational issues, did not ask the right foundational questions, as Abel would have it.18
non-Euclidean geometries established earlier, does not depend on these theorems. Gödel’s theorems, however, imply the inevitability of such a multiplicity in principle, on mathematical grounds, whether one pursues such alternatives or not, in the way that, say, mathematics has pursued both Euclidean and non-Euclidean geometry. In this case, however, one could still unify them within the same broader framework, say, as defined, following Riemann, by the concept of manifold. In the case of undecidable propositions, the only thing that these alternative forms of mathematics share is the fact that they are forms of mathematics, which are, however, incompatible with each other. The role of a decision of thought, in general important for my argument here, especially in the case of Gödel’s theorems, is a key aspect of this situation. Allowing one, by such a decision and a freedom, even if relative, of making it, to extend arithmetic or any system containing it into two mutually incompatible systems, and (by iteration) to infinitely many such systems, has been seen as a troubling and even intolerable situation by some. One might, however, also find an appeal in it, as the present author or Badiou does (Badiou 2006, p. 99).
18 His long later manuscripts, such as, famously, Pursuing Stacks, a long letter to Daniel Quillen (Grothendieck 1983), may be seen along these lines. (Quillen’s work, inspired by that of Grothendieck, is another example of a creation of new category- and functoriality-based concept-forms.) On the other hand, Grothendieck’s return to Montpellier University, his alma mater, after he left Paris, might be seen as (or at least conjectured to be) aimed to reinstate his position as a minority mathematician and to affirm a minority mathematics, as a revolutionary condition of all creative mathematics and hence mathematics itself, as I argue here. An enormous amount of new mathematics, including new concept-forms, was created by him in his post-Paris years.
5
Conclusion
“Mathematics is a thought,” Badiou says (Badiou 2006, p. 43). I give this statement a meaning beyond Badiou’s own, which correctly contrasts mathematics with a merely rigorous and refined form of logic, necessary but not sufficient for mathematics. While I fully agree with this view, I use this statement more in accord with Deleuze and Guattari’s view of thought in What is Philosophy? as creative thinking (rather than merely referring to our mental states), specifically in art, philosophy, and mathematics and science (Deleuze and Guattari 1996, pp. 202–281). As I argue here, however, in accord with their view of creative thought as a minority thought, as exemplified by Kafka’s literary thought, all creative thought depends on the possibility, and freedom, of pursuing a minority thinking, sometimes against the odds. It was not easy for Kafka or Abel and Galois to have this freedom, but they had the capacity and boldness to be free and create new literature and new mathematics. Any instance of creative thinking is, in the present view, an instance of a minority thinking, and only a minority thinking is creative. If there is no minority thinking (however relative in its scope), there is no creative thought and, hence, no creative literature or mathematics, which still needs to be created, made by us. I have argued here that the creative, or poetical, essence of creative thought and hence a minority thought in mathematics is composition, in the sense of the ancient Greek poiein as making, just as it is for poetry or all creative endeavors, respecting their specificity as such; I have also argued here that the political essence of creative or minority thought in mathematics or elsewhere is freedom. In this sense, a minority mathematics and hence all mathematics are political: they require the politics of freedom, manifested in the possibility of a minority mathematics.
Hence, the essence of mathematics is ultimately the power of compositions, conceptual or other and the freedom to create them, not rules, logical or other, necessary as such rules are for mathematics as a human endeavor and a disciplinary field. In spite, however, and sometimes because of its logical and other constraints, mathematics gives human thought an extraordinary freedom. Mathematics is a combination of an extraordinary precision or exactitude and an extraordinary freedom. I would be tempted to say the ultimate precision or exactitude, if not the ultimate freedom, except that as creative thinking, science, philosophy, and literature and art have their own ways of precision and exactitude, and all creative thought is a manifestation of freedom, again, always the freedom of minority thought. This freedom is, as I argued, not merely a free choice in our decision in pursuing our projects, including as a minority mathematics, because, as in all human decisions, such decisions may involve factors that shape and complicate our choices, including as concerns their freedom. A choice, at least a free choice, is difficult, if ever entirely possible as free, even in a democracy, mathematical or other. This fact, as I explained, makes decision a better category. As I also argued, however, this does not mean there is no freedom in our decisions. At stake, politically, is creating the possibility for freedom in our endeavors, for pursuing a minority thinking, the necessary condition of creative thinking. No creative thought or justice anywhere is possible otherwise. It is this possibility of freedom that is the political essence of a
minority mathematics and hence of mathematics itself, which cannot advance by creating new mathematics and thus remain alive without a minority thinking and a freedom to pursue it.
6
Cross-References
▶ Grothendieck: A Short Guide to His Mathematical and Philosophical Work (1949–1991)
▶ Mathematical Practice as Philosophy, with Galois, Riemann, and Grothendieck
▶ Mathematics and the Method of Abstraction
▶ One Mathematic(s) or Many? Foundations of Mathematics in 20th Century Mathematical Practice
▶ Ontology in the History and Philosophy of Mathematical Practice: An Introduction
▶ The Notion of Space in Grothendieck
▶ What Mathematicians Do: Mathematics as Process and Creative Rationality
References
A’Campo N, Ji L, Papadopoulos A (2016) On Grothendieck’s tame topology. In: Papadopoulos A (ed) Handbook of Teichmüller theory, vol VI. European Mathematical Society, Zürich, pp 35–70
Alexandrov P (1936) In memory of Emmy Noether. In: Dick A (ed) Emmy Noether 1882–1935. Birkhäuser, Boston, 1981, pp 152–179
Angier N (2012, 26 March) The mighty mathematician you’ve never heard of. New York Times
Atiyah M (2002) Special article: Mathematics in the 20th century. Bull Lond Math Soc 34:1–15
Badiou A (2006) Briefings on existence (trans: Madarasz N). SUNY Press, Albany
Corfield D (2020) Modal homotopy type theory: the prospect for a new logic of philosophy. Oxford University Press, Oxford
Corry L (2006) Modern algebra and the rise of mathematical structures. Birkhäuser, Boston
De Man P (1996) Kant and Schiller. In: De Man P Aesthetic ideology. University of Minnesota Press, Minneapolis, pp 129–162
Deleuze G (1995) Difference and repetition (trans: Patton P). Columbia University Press, New York
Deleuze G, Guattari F (1987) A thousand plateaus (trans: Massumi B). University of Minnesota Press, Minneapolis
Deleuze G, Guattari F (1989) Kafka: toward a minor literature (trans: Polan D). University of Minnesota Press, Minneapolis
Deleuze G, Guattari F (1996) What is philosophy? (trans: Tomlinson H, Burchell G). Columbia University Press, New York
Dick A (1981) Emmy Noether 1882–1935. Birkhäuser, Boston
Doxiadis A, Mazur B (eds) (2012) Circles disturbed: the interplay of mathematics and narrative. Princeton University Press, Princeton
Eilenberg S, Mac Lane S (1942) Appendix A: on homology groups of infinite complexes and compacta. In: Lefschetz S Algebraic topology. American Mathematical Society, Providence, pp 344–349
Grothendieck A (1957) Sur quelques points d’algèbre homologique. Tôhoku Math J 9(2):119–221. https://doi.org/10.2748/tmj/1178244839
Grothendieck A (1983) Pursuing stacks (a letter to D. Quillen). thescrivener.github.io. Retrieved 07/01/2023
Grothendieck A (2022) Récoltes et semailles, I, II. Réflexions et témoignage sur un passé de mathématicien. Gallimard, Paris
Hegel GWF (2019) Hegel’s phenomenology of spirit (trans: Pinkard T). Cambridge University Press, Cambridge
Kant I (2015) Critique of practical reason (trans: Gregor M). Cambridge University Press, Cambridge
Kosmann-Schwarzbach Y (2011) The Noether theorems: invariance and conservation laws in the twentieth century (trans: Schwarzbach B). Springer, Berlin
Krull W (1935) Idealtheorie. Julius Springer, Berlin
Kuhn T (1987) Black body theory and the quantum discontinuity, 1894–1912. University of Chicago Press, Chicago
Lakatos I (1980) Mathematics, science and epistemology. Philosophical papers, vol 2 (ed: Worrall J, Currie G). Cambridge University Press, Cambridge
Langlands R, Shelstad D (2007) Descent for transfer factors. In: Cartier P, Katz NM, Manin YI, Illusie L, Laumon G, Ribet KA (eds) The Grothendieck festschrift. Birkhäuser, Boston, pp 485–563. https://doi.org/10.1007/978-0-8176-4575-5_12
Lefschetz S (1930) Algebraic topology. American Mathematical Society, Providence
Lefschetz S (1942) Algebraic topology, 2nd edn. American Mathematical Society, Providence
Lefschetz S (1949) Introduction to topology. Princeton University Press, Princeton
Lyotard J-F (1984) The postmodern condition: a report on knowledge (trans: Bennington G, Massumi B). University of Minnesota Press, Minneapolis
Manin Yu (2002) Georg Cantor and his heritage. arXiv:math.AG/0209244v1
Manin Yu (2010) What then? Plato’s ghost: the modernist transformation of mathematics. Notices Am Math Soc 57(2):239–243
Manin Yu (2019) Time and periodicity from Ptolemy to Schrödinger: paradigm shift vs. continuity in history of mathematics. In: Dani SG, Papadopoulos A (eds) Geometry in history. Springer/Nature, Berlin, pp 129–138
McLarty C (2006) Emmy Noether’s set-theoretical topology: from Dedekind to the rise of functors. In: Ferreirós J, Gray J (eds) The architecture of modern mathematics: essays in history and philosophy. Oxford University Press, Oxford, pp 211–236
McLarty C (2011) Emmy Noether’s first great mathematics and the culmination of first-phase logicism, formalism, and intuitionism. Arch Hist Exact Sci 65:99–117. https://doi.org/10.1007/s00407-010-0073
McLarty C (2016) How Grothendieck simplified algebraic geometry. Notices Am Math Soc 63(3):256–265
McLarty C (2017) The two mathematical careers of Emmy Noether. In: Beery J, Greenwald S, Jensen-Vallin J, Mast M (eds) Women in mathematics. Association for Women in Mathematics series, vol 10. Springer, Cham. https://doi.org/10.1007/978-3-319-66694-5_13
McLarty C (2018) The large structures of Grothendieck founded on finite-order arithmetic. Rev Symb Log 13(2):296–325
Nietzsche F (1966) Beyond good and evil: a prelude to a philosophy of the future (trans: Kaufmann W). Vintage, New York
Noether E (1918) Invariante Variationsprobleme, Göttinger Nachrichten (1918), pp 235–257 (presented by F. Klein at the meeting of 26 July 1918); Abhandlungen, pp 248–270. Abstract by the author in Jahrbuch über die Fortschritte der Mathematik, 46 (1916–1918), vol 1, IV.15 (Variationsrechnung), p 770
Noether E (1925) Ableitung der Elementarteilertheorie aus der Gruppentheorie, 27. Januar 1925, Jahresbericht der Deutschen Mathematiker-Vereinigung 34 (Abt. 2)
Noether E (1927) Abstrakter Aufbau der Idealtheorie in algebraischen Zahl- und Funktionenkörpern [Abstract structure of the theory of ideals in algebraic number fields]. Math Ann 96(1):26–61. https://doi.org/10.1007/BF01209152, S2CID 121288299
Noether E (1929) Hyperkomplexe Größen und Darstellungstheorie [Hypercomplex quantities and the theory of representations]. Math Z 30:641–692. https://doi.org/10.1007/BF01187794, S2CID 120464373
Papadopoulos A (2022) René Thom: from mathematics to philosophy. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer/Nature, Switzerland AG, Cham
Plotnitsky A (2001) Algebra and allegory: nonclassical epistemology, quantum theory, and the work of Paul de Man. In: Cohen T, Cohen B, Miller JH, Warminski A (eds) Material events: Paul de Man and the afterlife of theory. University of Minnesota Press, Minneapolis, pp 49–92
Plotnitsky A (2012) The disaster of the diagonal. In: Doxiadis A, Mazur B (eds) Circles disturbed: the interplay of mathematics and narrative. Princeton University Press, Princeton
Plotnitsky A (2021) Reality without realism: matter, thought, and technology in quantum physics. Springer/Nature, Heidelberg
Plotnitsky A (2022) Logos and Alogon: thinkable and the unthinkable in mathematics, from the Pythagoreans to the Moderns. Springer/Nature, Heidelberg
Poincaré H (1908) Science et méthode. Flammarion, Paris
Polya G (1990) Mathematics and plausible reasoning, volume 1: induction and analogy in mathematics. Princeton University Press, Princeton
Raussen M, Skau C (2010) Interview with Mikhail Gromov. Notices Am Math Soc 57:391–403
Reed D (1995) Figures of thought: mathematics and mathematical texts. Routledge, London
Riemann B (1854) On the hypotheses that lie at the foundations of geometry. In: Pesic P (ed) Beyond geometry: classic papers from Riemann to Einstein. Dover, Mineola, 2007, pp 23–40
Rousseau J-J (1997) Emile, or on education (trans: Bloom A). Basic Books, New York
Schiller F (2004) On the aesthetic education of man (trans: Snell R). Dover, Mineola
Schweber SS (1984) QED and the men who made it: Dyson, Feynman, Schwinger, and Tomonaga. Princeton University Press, Princeton
Van der Waerden BL (1930) Moderne algebra. Springer, Berlin
Van der Waerden BL (1985) A history of algebra: from al-Khwārizmī to Emmy Noether. Springer, Berlin
Voevodsky V et al (2013) Homotopy type theory: univalent foundations of mathematics. Univalent Foundations Program, Princeton
Vuillemin J (1962) La Philosophie de l’algèbre. Tome I: Recherches sur quelques concepts et méthodes de l’Algèbre moderne. Presses universitaires de France, Paris
Weil A (1962) Foundations of algebraic geometry, 2nd edn. American Mathematical Society, Providence
Weyl H (1939) Invariants. Duke Math J 5:489–502
Weyl H (2013) The concept of a Riemann surface (trans: MacLane GL). Dover, Mineola
A Tale of Three Cities: Thebes, Babylon, and Alexandria Maurice Burke
Contents
1 Introduction
2 Thebes
3 Babylon
4 Alexandria
4.1 The DMSS System
4.2 Rational Expression Arithmetic
5 Interweaving the Three Tales with Current Strands
5.1 Measurement Systems and Decimals
5.2 Common Fractions and Algebraic Reasoning
5.3 The Number Line, Measure Numbers, and Precision
6 Collecting the Threads and Tying the Present to Its Recent History
6.1 A Current State of Affairs
6.2 Observations from Recent History
6.3 Other Voices of Note
7 Conclusion
7.1 Suggestions for Consideration
7.2 The Lesson of Thebes
References
Abstract
The interface of the “scientific tradition” of mathematics with the traditions of American elementary education has grown in the past 200 years, with state curriculum standards currently a high point of interactions. Episodes from the history of measure number provide a perspective suggesting that current standards regarding the teaching of common fractions in American schools might be a bit wrongheaded and in need of reconsideration. Deeply rooted traditions are stubborn things, but the historical perspective provides clues for new directions.
M. Burke (*) Montana State University, Bozeman, Montana, USA e-mail: [email protected] © Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_100-1
Keywords
Common fractions · Decimals · Standards · Common core · Measurement · History of number
1
Introduction
The philosopher, Philip Kitcher (1984), presents a theory of mathematical knowledge residing in and warranted by evolving mathematical practices that are primordially rooted in human interactions with the world. Kitcher’s historical examples are drawn almost entirely from what Høyrup (1989) refers to as the “scientific tradition” of mathematical knowledge. This tradition is theoretical in contrast to the practical art of subscientific mathematical knowledge, employed in the trades, commerce, and elementary schooling well into the early modern world. While Høyrup explains the extinction of the subscientific mathematical tradition in Early Modern Europe of the late sixteenth century, DNA from that tradition is richly present in elementary mathematics education today in the USA, with its rather unique semiautonomy and guild-like practices of local control that stem from its germination in American soil in the early seventeenth century. In fact, this Americanized variant might respond to Høyrup, like Mark Twain, “The report of my death was an exaggeration” (New York Journal, June 2, 1897). Just as Kitcher’s theory must not ignore the “irrational” transitions or retrograde motions Høyrup (1989, 76) describes in the subscientific tradition, epicycles should also be noticed within the American tradition. This chapter shines a light on one of these epicycles relating to practices now driven by debatable curriculum standards. The interface of the mathematical science tradition with the mathematical practices of elementary education in the USA has radically changed in the last 200 years. One of the key developments is the relatively recent emergence of standards for school mathematics. The “standards movement” in the USA surged between 1980 and 2010, with state and national insistence on assessment-driven accountability in the outcomes of elementary mathematics schooling. Its most recent high point has been the Common Core State Standards for Mathematics (CCSSM). 
To borrow Høyrup’s terminology once again, such standards and their attendant frameworks and assessments are “index fossils” marking the boundary of a critical transition in the evolution of mathematical practice at the elementary level, a transition surely to be measured by many decades yet to come and not just a few years. The history of mathematics makes thoroughly clear how much we do not know and is riddled with undocumented gaps. Even so, it does offer contrasts with the present, and this chapter examines historical contrasts to the fraction schemas being taught and notoriously not well learned in USA schools in recent decades, if they ever were. Its brief survey of historical practices raises questions about the current attempts to demand of elementary school children, through what are effectively national standards, fluency and proficiency in common fraction representations, interpretations, notations, operations, and standardized algorithms. The CCSSM
stipulations for common fractions have been embedded, virtually word-for-word, in the standards of 44 states and imitated in the rest for over a decade (NGA Center and CCSSO 2010). From the perspective of history, this might be a bit wrongheaded and worthy of reconsideration. The chapter closes with suggestions for future standards and practices regarding common fractions. When using examples from ancient history, it is sometimes challenging to find consistent meaning in the words used by translators of ancient texts. This is not a criticism but an insinuation of the nature of a discipline. To paraphrase Wittgenstein, it is possible that the nuanced denotations of ancient symbols and written terms, passing through generations and cultures from the Bronze Age to us, like many strands twisted and braided into a rope, convey no thread of meaning running the full length (Wittgenstein 1958, 32). Therefore, like a statistician sampling a diverse population, this chapter attempts to connect the present to the past by interweaving multiple strands of meaning from the mathematical stories of three ancient cities with a sufficiently diverse collection of threads running through current elementary mathematics practice and its recent history. With these disclaimers and foreshadowing, please hang on to the gnarled rope that is woven, and welcome to Thebes.
2
Thebes
Imagine Thebes, the exotically picturesque capital of ancient Egypt in 1550 BC. There worked Ahmosé, the scribe who created the Rhind Mathematical Papyrus, a reproduction of a mathematical text from the time of Egypt’s Middle Kingdom. This ancient scribe wrote in hieratic script, a cursive related to hieroglyphics almost as a symbol-by-symbol shorthand. Ahmosé’s hieroglyphic for $249\,\frac{2}{7}$ would faintly resemble the script in Fig. 1. Wide spaces are used to separate the hieroglyphic into five symbols, each symbol a unique arrangement like the dots on a die. They are, from right to left: 200, 40, 9, $\frac{1}{4}$, and $\frac{1}{28}$. A special mark, ⌓, was used above 4 and 28 to distinguish them as submultiples of the unit, i.e., $\frac{1}{4}$ and $\frac{1}{28}$. Ahmosé’s hieratic script for this number would contain five corresponding symbols in the same order.

Fig. 1 Hieroglyphic for $249\,\frac{2}{7}$, read from right to left

There are several important things to note about Ahmosé’s fractions:

1. The five symbols forming the word $249\,\frac{2}{7}$ belonged to a larger palette containing 38 grouping symbols representing integer numerical sizes 1, 2, 3, . . ., 9, 10, 20, 30, . . ., 90, 100, 200, 300, . . ., 900, 1000, 2000, . . ., 9000, 10,000, and 100,000. The other grouping symbols on Ahmosé’s numeral palette were $\frac{2}{3}$ and those some would call unit fractions, like $\frac{1}{4}$ or $\frac{1}{28}$, generated from their corresponding reciprocals with the use of the special sign.
2. The Egyptian $\frac{1}{7}$ is not the same as the unit fraction $\frac{1}{7}$, which is one of seven “equal quantities” or “aliquot parts” that compose a whole. Ahmosé seems to have thought of $\frac{1}{7}$ as a grouping size in the sense that a whole, as a measured quantity, has only one size that is its seventh partsize, while each of the seven equal aliquot quantities composing the whole shares that size. Thus, the partsize insinuates a ratio of an aliquot quantity to the whole quantity. Grouping sizes are not counted in Ahmosé’s notation for sizing a quantity in terms of a single unit, though counting of unit quantities was certainly involved in the process of measuring and assigning grouping sizes. In contrast, the decimal 35.6, a multiplicative notation connoting $3 \times 10 + 5 \times 1 + 6 \times 0.1$, is a counting of grouping sizes: 3 tens 5 ones 6 tenths. Indeed, what appears to be an exception, $\frac{2}{3}$, was very useful in Ahmosé’s tools and methods, particularly in his doubling table and his methods for multiplication and division. Instead of being used as a count, it seems to be a special symbol for $\frac{1}{2}\ \frac{1}{6}$, perhaps preserving an earlier usage such as the size of the-unit-less-its-third. Unlike Ahmosé’s fractions, a common fraction, as a measure number, is a count of a unit fraction. As a quantity, the common fraction $\frac{1}{7}$ can be considered a subunit, counted like a unit (hence the name unit fraction), and expressed as such in language and notation as, for example, two-sevenths or $\frac{2}{7}$. Thus, the word partsize is coined for this chapter to convey the Wittgenstein-like warning that Ancient Egyptian interpretations of fraction words and symbols are not common usages today.
3. A measurement in Ancient Egypt was done in the way typical of most ancient cultures: as a simple measurement using a count of a single unit or as a compound measurement using a decreasing sequence of counted amounts in a decreasing sequence of units and subunits, e.g., 4 cubits 6 hands 1 finger. For computation purposes, given the algorithms of Egyptian practice, the latter measurement would be sized as a single measure number in a single unit: $4\ \frac{1}{2}\ \frac{1}{4}\ \frac{1}{7}$ cubit. In general, for Ahmosé to state the size of a quantity in terms of a single unit quantity meant to express the size as a strictly decreasing sequence of the grouping sizes from his numeral palette (Allen 2014, 123). Therefore, Ahmosé would not simply write $\frac{2}{7}$ or $2\left(\frac{1}{7}\right)$ or $\frac{1}{7}\ \frac{1}{7}$, not even in his scratch work, to size the quantity we would represent as $\frac{2}{7}$. He would no more do this than one would use the numeral 32,020 to represent 340 just because $20 + 20 = 40$. In a very consistent practice, Ahmosé sized two of the seventh parts of a unit as $\frac{1}{4}\ \frac{1}{28}$. It was his way of sizing the ratio of a quantity to a single unit quantity.

The contours of Ahmosé’s quantitative reasoning seem to move fluently from sizes expressed by his numerals to quantities having those sizes. In this regard, he was fully capable of reasoning with quantities in ways analogous to reasoning with
common fractions. For instance, when no measurement unit was given in a problem about an unknown quantity, he would identify the quantity with its size number. He could easily conceive of “5 hands, each the 7th partsize of a cubit,” a quantity he would size as $\frac{2}{3}\ \frac{1}{21}$ or as $\frac{1}{2}\ \frac{1}{7}\ \frac{1}{14}$ and would consider equivalent to the seventh aliquot part of a length that measured 5 cubits. (With common fractions, this corresponds to the insight that $\frac{5}{7} = 5 \div 7$, a theorem relating two separately defined numbers.) When performing arithmetic operations, Ahmosé could consider these five quantities as a “heap” of sevenths, an unsized quantity yet to be rendered as a single measure number of cubits. In fact, during a computation involving multiple heaps, Ahmosé sometimes jotted down lists of partsizes in no particular order and repeating a partsize several times. Ahmosé would often reify partsizes into easy-to-work-with quantities by stipulating they referred to a quantity measured by a select number of units. For example, he could render $4\ \frac{2}{3}\ \frac{1}{5} + 1\ \frac{1}{2}\ \frac{1}{5}\ \frac{1}{15}$ as $5 + \frac{2}{3} + \frac{1}{5} + \frac{1}{2} + \frac{1}{5} + \frac{1}{15}$ and stipulate these sizes referred to wholes that each measured 30 units of the same kind. Thus, ignoring for the moment the 5 wholes, the fraction part of the sum amounted to putting together $20 + 6 + 15 + 6 + 2$ into an unsized heap of 49 quantities each of the 30th partsize of one of the wholes. When sizing these 49 quantities by a single measure number, one of his many approaches would likely be to sort them into a heap of 30 quantities, which would be one whole, and then make heaps of size 15, 3, and 1, a decreasing sequence of quantities that were aliquot parts of 30. Accordingly, the size of all 49 quantities put together is $1\ \frac{1}{2}\ \frac{1}{10}\ \frac{1}{30}$, thereby giving $6\ \frac{1}{2}\ \frac{1}{10}\ \frac{1}{30}$ as his final sum. Ahmosé knew the sorting was not unique and tended not to use 1 as a “heap” size.
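The sorting just described can be traced in a few lines of Python. This is only a minimal sketch in modern terms, not a reconstruction of Ahmosé’s procedure: the reference whole of 30 and the partsizes come from the text, while the helper name `sort_heap` and its greedy largest-first rule are assumptions made for illustration.

```python
from fractions import Fraction

# Fractional partsizes of the two addends as read from the text
# (the whole-number parts, 4 + 1 = 5, are set aside).
parts = [Fraction(2, 3), Fraction(1, 5), Fraction(1, 2), Fraction(1, 5), Fraction(1, 15)]
WHOLE = 30  # reference whole: every partsize is a whole number of these units

# Reify each partsize as a count of thirtieths: 20, 6, 15, 6, 2.
counts = [int(p * WHOLE) for p in parts]
heap = sum(counts)

def sort_heap(heap, whole):
    """Hypothetical helper: set aside full wholes, then greedily take
    distinct aliquot parts (proper divisors) of `whole`, largest first."""
    wholes, rest = divmod(heap, whole)
    sizes = []
    for d in range(whole - 1, 0, -1):
        if whole % d == 0 and d <= rest:
            sizes.append(d)
            rest -= d
    return wholes, sizes, rest  # rest > 0 would mean the greedy pass failed

wholes, sizes, rest = sort_heap(heap, WHOLE)
partsizes = [Fraction(d, WHOLE) for d in sizes]
print(counts, heap)              # counts of thirtieths and the unsized heap
print(wholes, sizes, partsizes)  # one whole plus the sorted heaps
```

Under these assumptions the heap of 49 thirtieths sorts into one whole plus heaps of 15, 3, and 1, that is, the partsizes 1/2, 1/10, and 1/30 named in the text. The greedy rule is one choice among several and will not reproduce every sorting a scribe might have preferred.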
So, instead of 15, 3, and 1, he could have created heaps of size 15, 2½, and 1½, thereby rendering the combined fractional quantities as $1\ \tfrac{1}{2}\ \tfrac{1}{12}\ \tfrac{1}{20}$ of the whole, which also might be preferred because of the larger partsizes. Ahmosé’s conceptions of fractions and sizing probably evolved from peculiarities of an oral tradition of measurement, perhaps the all-important system of grain measures. Grain measures were based on hekat units (≈ 5 quarts), and the naming sequence for most other grain units was developed by repeated halving and doubling of the hekat (Chace 1986, 18). Such a binary system would lend itself to reducing a measurement in multiple units (e.g., 1 gallon 2 quarts 3 cups) to a decreasing sequence of partsizes in one unit ($1\ \tfrac{1}{2}\ \tfrac{1}{8}\ \tfrac{1}{16}$ gallon). With relatively efficient computational algorithms, such a binary scale might have been the impetus for the evolution of the Egyptian system of measure numbers. Unless disrupted by outside cultural influences, any oral tradition would develop computational algorithms symbiotically with its measurement systems. An arithmetic of halving and doubling seems “peculiarly” compatible (Chace 1986, 31) with the binary-scale system of grain-measure units. By the time of Ahmosé, Egyptians multiplied numbers using a doubling strategy, sometimes interleaved with multiplications by ten. If we think of multiplication originating as a scaling operation in the context of measurements, converting a grain measure from a larger unit to a smaller unit would involve a doubling strategy with the occasional multiplication by 10. (Ten entered with the ro, which was $\tfrac{1}{320}$ hekat.) Chace (1986, 31) suggests this
M. Burke
connection to grain measure as a possible source of the Egyptian multiplication technique. To multiply $21 \times 24$, Ahmosé built the table shown in Fig. 2, with marks by the addends of 21. Since $21 = 1 + 4 + 16$, Ahmosé knew $21 \times 24 = (1 + 4 + 16) \times 24 = 24 + 96 + 384 = 504$. With fractions, such a multiplication technique demanded a way to double the partsizes on Ahmosé’s numeral palette. Thus, the Rhind Papyrus opens by constructing a large table giving the double of 1/n for all odd integers n from 3 to 101. Ahmosé knew the table’s values were the same as those when 2 is divided by n, i.e., the number that is multiplied by n to give 2. For example, the double of $\tfrac{1}{97}$ is given as $\tfrac{1}{56}\ \tfrac{1}{679}\ \tfrac{1}{776}$. Several methods were probably employed in computing the table. The simplest pattern seems to be the use of $\tfrac{2}{3}$, i.e., $\tfrac{1}{2}\ \tfrac{1}{6}$, as the template for entries of the form $\tfrac{2}{3n}$, which are all rendered in the form $\tfrac{1}{2n}\ \tfrac{1}{6n}$. The even values of n, such as 26, were not included in the table. Ahmosé knew that $2 \times 13 = 26$, and hence the double of $\tfrac{1}{26}$ is $\tfrac{1}{13}$. Indeed, he found quite useful and frequently invoked this inverting rule: $m \times p = n$ implies $m \times \tfrac{1}{n} = \tfrac{1}{p}$. Not surprisingly, Ahmosé’s doubling table was a crucial tool in his computational algorithms. These preliminaries help us understand Problem 33 in the Rhind Papyrus: A quantity [sometimes translated as “heap”], its $\tfrac{2}{3}$, its $\tfrac{1}{2}$, and its $\tfrac{1}{7}$, to go in together, become 37. What is the quantity? Today, $X + \tfrac{2}{3}X + \tfrac{1}{2}X + \tfrac{1}{7}X = 37$ and, consequently, $2\tfrac{13}{42}\,X = 37$ come to mind. Hence, $X = 37 \div 2\tfrac{13}{42} = 16\tfrac{2}{97}$. For Ahmosé, however, the problem required finding a number that multiplied $1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}$ to get 37. He seemed satisfied to use $1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}$ instead of reducing it to $2\ \tfrac{1}{6}\ \tfrac{1}{7}$ since it satisfied his sizing criteria and it conveniently expressed the problem.
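Doubling $1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}$ four times, he gets $16 \times (1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}) = 36\ \tfrac{2}{3}\ \tfrac{1}{4}\ \tfrac{1}{28}$, which he found to be $\tfrac{1}{21}$ less than 37, possibly by using an abacus, and stipulating the fractions were partsizes of 84. But $\tfrac{1}{21}$ is the double of $\tfrac{1}{42}$, and he notes $(1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}) \times 42 = 97$. Therefore, adapting his inverting rule, $1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}$ times the double of $\tfrac{1}{97}$ is the double of $\tfrac{1}{42}$, or $\tfrac{1}{21}$. So, combining his products, $(1\ \tfrac{2}{3}\ \tfrac{1}{2}\ \tfrac{1}{7}) \times (16 + \text{double of } \tfrac{1}{97}) = 36\ \tfrac{2}{3}\ \tfrac{1}{4}\ \tfrac{1}{28} + \tfrac{1}{21} = 37$. Finally, getting the double of $\tfrac{1}{97}$ from his table, Ahmosé’s answer is $16\ \tfrac{1}{56}\ \tfrac{1}{679}\ \tfrac{1}{776}$, agreeing with our $16\tfrac{2}{97}$.
Fig. 2 Doubling table for the multiplication 21 × 24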
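The doubling multiplication and the steps of Problem 33 described above can be traced with exact rational arithmetic. The following is only a minimal Python sketch under stated assumptions: the function names `doubling_table` and `multiply_by_doubling` are hypothetical, and modern fractions stand in for Ahmosé’s sequences of partsizes, so it checks his numbers rather than reconstructing his procedure.

```python
from fractions import Fraction as F

def doubling_table(multiplicand, rows):
    """Return rows (1, m), (2, 2m), (4, 4m), ... as in a doubling table."""
    table = []
    k, value = 1, multiplicand
    for _ in range(rows):
        table.append((k, value))
        k, value = 2 * k, 2 * value
    return table

def multiply_by_doubling(multiplier, multiplicand):
    """Add the table rows whose left-hand entries sum to the multiplier."""
    rows = multiplier.bit_length()
    remaining, total, marked = multiplier, 0, []
    for k, value in reversed(doubling_table(multiplicand, rows)):
        if k <= remaining:          # this row gets a mark
            remaining -= k
            marked.append(k)
            total += value
    return total, sorted(marked)

product, marks = multiply_by_doubling(21, 24)

# Problem 33: find the number that multiplies 1 2/3 1/2 1/7 to give 37.
multiplier = 1 + F(2, 3) + F(1, 2) + F(1, 7)
sixteen_fold, _ = multiply_by_doubling(16, multiplier)   # four doublings
shortfall = 37 - sixteen_fold
double_of_97th = F(1, 56) + F(1, 679) + F(1, 776)        # table entry for 97
answer = 16 + double_of_97th

print(product, marks)
print(sixteen_fold, shortfall, answer)
```

Here the marked rows for 21 are 1, 4, and 16, and the shortfall after sixteen-folding is exactly the double of one forty-second, which is the observation the inverting rule exploits.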
Such was Ahmosé’s fraction arithmetic in Thebes, 1550 BC. He fluently calculated with “fractured numbers,” but his arithmetic was morphologically different from common fraction arithmetic as a cognitive and mathematical practice.
3 Babylon
When Hammurabi (1792–1750 BCE) elevated Babylon into one of the preeminent city-states of the world, his empire had already been the crossroads of cultures for millennia. In this milieu, the Sumerian base-60 numeration system and the Sumerian-Akkadian sexagesimal system of calculation evolved, probably to cope with the diversity of number bases and metrologies used for centuries by the peoples mingling there. Hammurabi’s scribe had 5 different numeral palettes for expressing measurements, down from 12 palettes used a thousand years before. His five numeral systems pertained to length, area and volume and bricks, liquid capacity, weight, and the counting numbers for sets of discrete objects (Robson 2008, 15). This meant, in context, a numeral referred not only to an amount of a quantity, but it also signified a type of quantity. In the discrete counting palette of Old Babylon, the numerals in Fig. 3 were used. Therefore, 7390 would be written as shown in Fig. 4, which represents 3600 + 3600 + 60 + 60 + 60 + 10. Like the Egyptian system, the discrete numeral system was not a place-value system. While having some base-10 grouping traits, it was the world’s only known base-60 system. The Babylonian scribe also possessed a sexagesimal place-value system (SPVS). Here, the number 7390 was represented as a sexagesimal, shown in Fig. 5, using only ones and tens. In today’s standard notation, this represents $2 \times 60^2 + 3 \times 60^1 + 10 \times 60^0$. However, the scribe had no numeral for zero, nor a sexagesimal point for distinguishing fraction from integer. These missing elements seem like deficits until we realize,
Fig. 3 Palette of numerals representing discrete counts
Fig. 4 Babylonian numeral for 7390
Fig. 5 Sexagesimal for 7390
“The SPVS . . . was only a calculational device: it was never used to record measurements or counts” (Robson 2008, 16). So, while the sexagesimal represented the number 7390, it could also represent $123\tfrac{1}{6}$ (i.e., $2 \times 60^1 + 3 \times 60^0 + 10 \times 60^{-1}$) and countless other quantities. SPVS evolved from calculations within the measurement systems of diverse Sumerian metrologies. By using smaller and smaller named and symbolized units in each metrology, they effectively dealt with fractional parts of a unit quantity without pursuing the path of Egyptian scribes. While Babylonians had a phrase for the equipartitioning of a unit or the nth part of a unit and indeed made tables showing the nth parts of numbers like 60 for various n’s, their scribes did not have a notation or arithmetic resembling our common fractions. Their palette of special symbols for fractional parts seemed to be limited to those most useful for their ordinary needs: $\tfrac{1}{2}$, $\tfrac{1}{3}$, $\tfrac{1}{4}$, $\tfrac{1}{5}$, $\tfrac{1}{6}$, $\tfrac{1}{10}$, $\tfrac{2}{3}$, and $\tfrac{5}{6}$. These fractional parts were only used to estimate remainders or when rounding off a measurement. They were easily converted to sexagesimals when calculations were needed. For example, the surveyor’s rod (≈ 6 meters) equaled 12 cubits, and each cubit equaled 30 fingers. Thus, the compound measurement 3 rods 5 cubits 23 fingers might be rounded to $3\tfrac{1}{2}$ rods or rounded to 3 rods $5\ \tfrac{1}{2}\ \tfrac{1}{4}$ cubits. By Hammurabi’s time, the Babylonians had adjusted their major metrologies to form “what might fairly be called the world’s first metric system” (Powell 1995, 1956). It was aligned to their base-60 numeration and used the rod-length as a common determinative element similar to the meter in our metric system. For example, the principal area unit, the sar, equaled one square rod. Fifty sar was 1 ubu, but, for calculation purposes using SPVS, sixty sar could be called a big sar, $\tfrac{1}{60}$ of a sar was called an area shekel, and $\tfrac{1}{60^2}$ (or $\tfrac{1}{3600}$) of a sar was called a little shekel.
So, there are notable parallels to the metric system. Moreover, the use of shekel as the ubiquitous 60th of a measure unit, when no other name was standardized, allowed Babylonians to compute fractional quantities in terms of a “generalized compound-measurement” arithmetic, the SPVS. Common fraction arithmetic, this is not. Babylonian calculations were performed on the side and not shown on the clay tablets (Robson 2008, 78). However, error analysis of the results of sexagesimal calculations strongly suggests the scribes used a base-60 abacus, perhaps just lines on a table or grooves in the sand, along with tokens standing for tens and ones (Høyrup 2002, 193).
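The point made earlier in this section, that a sexagesimal without a zero or a sexagesimal point fixes only a string of digits and not an absolute value, can be sketched in a few lines of Python. This is a modern illustration only: the function names `to_sexagesimal_digits` and `value_at` are hypothetical, and the choice of which power of 60 the leading digit counts is the scribe’s contextual reading, modeled here as a parameter.

```python
from fractions import Fraction

def to_sexagesimal_digits(n):
    """Base-60 digits of a positive integer, most significant first."""
    digits = []
    while n > 0:
        n, d = divmod(n, 60)
        digits.append(d)
    return digits[::-1]

def value_at(digits, top_power):
    """Value of the digit string if its leading digit counts 60**top_power."""
    return sum(Fraction(d) * Fraction(60) ** (top_power - i)
               for i, d in enumerate(digits))

digits = to_sexagesimal_digits(7390)
as_count = value_at(digits, 2)      # leading digit counts 3600s
as_shifted = value_at(digits, 1)    # same digits, one place lower

print(digits, as_count, as_shifted)
```

The same three digits yield 7390 under one reading and 123 1/6 under the next, matching the two interpretations given in the text.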
Imagine a scribe adding an area 1 ubu 13 sar $21\tfrac{1}{3}$ shekels to an area 47 sar $12\tfrac{5}{6}$ shekels. For simplicity, imagine the scribe placing Tens (T) tokens and Ones (|) tokens on a contrived abacus to represent the two areas. See Fig. 6. With the first area, the $\tfrac{1}{3}$ shekel equals 20 little shekels and the ubu combines with 10 of the sar to make 1 big sar, leaving three Ones in the sar column. Similarly, 49 sar $12\tfrac{5}{6}$ shekels is represented in the second row ($\tfrac{5}{6}$ shekel = 50 little shekels). At this stage, the specific metrology and its related numeral system, i.e., area measure, become irrelevant because the conversion ratio of 1 to 60, from column to column, is the same for all metrologies entered on this abacus. Thus, the scribe thinks of the columns simply as generalized compound measurement units during the calculation. The scribe combines the tokens in the two rows and simplifies to get the configuration shown in Fig. 7. Having completed the calculation, the scribe jots on the clay tablet the sexagesimal measure number shown in Fig. 8. The jottings exactly record the number and type of counting tokens in the abacus result. But the scribe is not finished. The answer should be reported as an absolute area measurement using the numeral palette for area. Therefore, since 50 sar combine to make an ubu, and 2 ubu equal 1 iku, the scribe records on the clay tablet 1 iku 2 sar $34\tfrac{1}{6}$ shekel, using a roundoff fraction. Thus, Babylon’s sexagesimals indicate no particular absolute numbers, counts, or measurements except when being interpreted by a scribe within a particular context, usually after translation into the notation and metrology of the given problem
Fig. 6 Ledger suggesting the layout of tens and ones tokens on an abacus
Fig. 7 Ledger showing the result of the addition on the abacus
Fig. 8 Sexagesimal representing results on abacus
(Robson 2008, 78). They were the inputs and outputs of formal calculations within a generalized compound-measurement arithmetic, peculiarly adapted to the Babylonian “metric” system. In addition, the SPVS operations and algorithms on fractional parts of units like the sar were neither instances of nor justified by a broader arithmetic of fraction numbers. It was an arithmetic of 60s-progression numerals aided by an abacus. (This contrasts significantly with the decimal system today when it is treated as a “corollary” of the system of common fraction arithmetic due to the stipulation that a decimal is a common fraction.) The Babylonians viewed sexagesimal division as multiplication by reciprocals and seemed to believe, on geometric grounds, that all sexagesimals had reciprocals, either exact or approximate, making no distinction between rational and irrational numbers. The product of the sexagesimals in a reciprocal pair was usually thought of as 60, which is easily viewed as an appropriately placed 1 in an abacus computation. For example, the reciprocal of 17;46,40 is 3;22,30, and the reciprocal of a whole number like 20 is 3, not a fraction $\tfrac{1}{20}$. From known reciprocals, the Babylonians used an area-based, geometric method to find new reciprocals (Robson 2008, 108). They were aware of rules like the reciprocal of a product is the product of the reciprocals since they frequently doubled one sexagesimal in a reciprocal pair while halving the other to produce a new reciprocal pair. The Babylonian SPVS, including its insights into the properties of sexagesimal reciprocals, had a significant indirect influence on the next city of our tale.
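The reciprocal pairs mentioned above can be checked with exact arithmetic. The sketch below is a modern illustration, not the scribes’ area-based method: the helper names `sexagesimal` and `is_power_of_60` are hypothetical, and the placement of the sexagesimal point in 17;46,40 and 3;22,30 is an assumption made so that the product is 60.

```python
from fractions import Fraction

def sexagesimal(whole, *places):
    """Value of whole;p1,p2,... with each place worth 1/60 of the previous."""
    total = Fraction(whole)
    for i, digit in enumerate(places, start=1):
        total += Fraction(digit, 60 ** i)
    return total

def is_power_of_60(q):
    """True if q equals 60**k for some integer k (an 'appropriately placed 1')."""
    q = Fraction(q)
    if q <= 0:
        return False
    while q > 1:
        q /= 60
    while q < 1:
        q *= 60
    return q == 1

a = sexagesimal(17, 46, 40)
b = sexagesimal(3, 22, 30)
pair_product = a * b

# Doubling one member and halving the other yields a new reciprocal pair.
a2, b2 = 2 * a, b / 2
new_pair_product = a2 * b2

print(pair_product, new_pair_product, is_power_of_60(20 * 3))
```

Because the pair’s product is unchanged by doubling one member and halving the other, each known pair generates a chain of further pairs.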
4 Alexandria
One thousand years after Ahmosé and Hammurabi, a Greek-centered Hellenistic culture emerged along the Mediterranean seaboard. Over several centuries, academies of learning and scholarship developed in prominent cities, with the greatest of these located in Alexandria, Egypt, at the Mouseion and Library created by Ptolemy I Soter (367–287 BCE). Two “fractional arithmetics” became prominent in this cosmopolitan city.
4.1 The DMSS System
The first arithmetic, the Degree-Minute-Second System (DMSS) or astronomers’ numbers, found its home in the definitive work of the Alexandrian scientist Claudius Ptolemy (100–170 CE). Inspired by the Babylonian SPVS, Ptolemy and his immediate predecessors expressed measures of arcs and lengths in a sixties-based number system of degrees, minutes, seconds, thirds, etc. While we think of degree as a unit of angle measure indicating a known amount of rotation, Ptolemy’s word for it was “section,” and he applied it to circular arcs (i.e., 360 sections in the circumference of a circle) as well as diameters (120 sections in a diameter), both of which were considered “lines” by Greek mathematicians. In a given circle, a degree of arc length
is incommensurable with a degree of diameter length, something Ptolemy might have believed but could not have proven. His words for minute and second are literally translated “first sixty” and “second sixty.” Thus, DMSS is built upon the notion of a generalized unit of line measure. The section is a generalized unit in that it does not represent any absolute, known amount of length, like cubits. Also, DMSS was not a pure base-60 system since it expressed degrees in the base-10 integer scale. Unlike sexagesimals, it is built around measure numbers that refer to absolute numbers in a generalized unit, while retaining the sense of being compound measurements in known standard units (degrees, minutes, seconds, etc.) when taken on a particular circle or segment. One might call it a compound-number arithmetic in generalized, denominatized units. (A pint is a known amount in its own right but is “denominatized” when expressed as an eighth of a gallon.) Even though Ahmosé’s system of fractions was still dominant in Alexandria and had been adapted by the Greeks, Ptolemy and other Alexandrian astronomers preferred to use DMSS for their scientific calculations. He explains, “In general, we shall use the sexagesimal system for the numerical calculations owing to the inconvenience of having fractional parts, especially in multiplications and divisions, and we shall aim at a continually closer approximation, in such a manner that the difference from the correct figure shall be inappreciable and imperceptible.” (Ivor Thomas 1993, 415). Ptolemy would use Egyptian fractions when rough estimates were sufficient and great precision not needed (Toomer 1984, 7). As evidenced by Al Uqlidisi’s tenth century Arithmetic (Saidan 1978), DMSS quickly adapted to Hindu-Arabic place-valued numerals for degrees, minutes, and seconds. 
Notably influenced by the Greek number theory found in Euclid, Al Uqlidisi viewed the numerical unit to be indivisible, which meant fractions, theoretically speaking, were not numbers, a position Ptolemy and all Greek mathematicians up to Diophantus seem to have taken (Gow 1968, 112). Al Uqlidisi wrote: “All the preceding work [referring to his own text] on whole numbers and fractions and all that we have mentioned on parts of numbers refer in fact to this concept, namely, the degree. Number is indivisible, but the degree is divided into parts and parts of parts indefinitely without end” (Saidan 1978, 84). Thus, the compound-measurement standpoint of the Babylonian system was carried forward into the compound-number arithmetic of the DMSS system, establishing it as a prototype of the positive real numbers. It finally bequeathed its position to the decimal system in the sixteenth and seventeenth centuries, the era when Egyptian fractions effectively went extinct and the use of common fractions in science started to recede. It is worthwhile tracing into modern times the Wittgensteinian strand of decimals that entwines with the DMSS threads. Al Uqlidisi’s Arithmetic is the earliest known Arab work that discusses decimal fractions and uses a notation similar to ours. Decimal arithmetic had evolved in Chinese and Indian cultures centuries before Al Uqlidisi and long before transmission to Europe. However, the efficiency and power of this arithmetic did not seem to be broadly appreciated in Europe until Simon Stevin published his influential pamphlet De Thiende (The Tenth) in 1585.
Significantly, Stevin, like Al Uqlidisi, modeled his development of decimals after the DMSS system. Stevin called each tenth part of unity a Prime, denoted with a ①, each tenth part of a Prime a Second (denoted with a ②), each tenth part of a Second a Third (denoted with a ③), and so on. He would write 27.843 as 27⓪8①4②3③. Thus, in the words of Stevin, decimals consisted “in characters of ciphers, whereby a certain number is described and by which also all accounts which happen in human affairs are dispatched by whole numbers, without fractions or broken numbers” (Stevin 1608, First Definition). Like Ptolemy’s comparison of DMSS to Egyptian fraction arithmetic, Stevin lauded the simplicity of decimal computation over common fraction representations and computations. However, he recognized an advantage of common fractions when reducing a compound measurement to a measure number with a single unit, such as 1 foot 7 inches to $1\tfrac{7}{12}$ feet instead of 1.5833 feet. But Stevin viewed this advantage as temporary and predicted sovereigns would eventually adopt measurement systems wherein the units formed progressions of powers of ten, paralleling decimal numeration. The French adopted the metric system some two centuries later. It bears mentioning that Stevin, contra Euclid, believed number, and hence decimals, flowed in a linear continuum. But Europe would have to wait for Descartes and Wallis in the seventeenth century before a real number line model was clearly articulated, particularly one that included negative and positive numbers. With his conveyance of the DMSS into the realm of decimal fractions, the symbols in a decimal suggest a conventional compound measurement in generalized, tens-based units analogous to degrees, minutes, and seconds. This conception of decimals is echoed in the research of Wearne and Hiebert (1988) where base-10 blocks, money, and metric measures provide the referents for decimal numerals.
Interestingly, their research has shown such compound-measurement thinking benefits the semantic processing of decimal notations by fourth graders, leading to better abstraction of decimals as measure numbers.
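Stevin’s notation makes the compound-measurement reading of a decimal explicit: each digit is tagged with the unit it counts. As a hedged illustration, the hypothetical helper `stevin` below re-tags a modern decimal string in that style, and the feet-and-inches example from the text is checked with exact fractions. The helper handles at most nine fractional places, an arbitrary limit of this sketch.

```python
from fractions import Fraction

CIRCLED = "⓪①②③④⑤⑥⑦⑧⑨"

def stevin(decimal_text):
    """Tag each digit of a decimal string with Stevin-style place marks."""
    whole, _, frac = decimal_text.partition(".")
    if len(frac) > 9:
        raise ValueError("this sketch supports at most nine fractional places")
    tagged = whole + CIRCLED[0]
    for place, digit in enumerate(frac, start=1):
        tagged += digit + CIRCLED[place]
    return tagged

# 1 foot 7 inches as a single measure number in feet.
feet = 1 + Fraction(7, 12)
feet_decimal = round(float(feet), 4)

print(stevin("27.843"), feet, feet_decimal)
```

The common fraction expresses the length exactly, while the four-place decimal is an approximation, which is the trade-off Stevin acknowledged.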
4.2 Rational Expression Arithmetic
The second “arithmetic” that emerged in Alexandria, practiced by Diophantus (200–284 CE), made use of “algebraic” reasoning in an arithmetic of numerical expressions containing one unknown. A close look at Diophantus’ works reveals that many of the problems he investigated were likely drawn from subscientific traditions (Høyrup 1989) and shared by a diverse community of mathematicians for whom his work was definitive. Consider Problem 36 of Book IV (Heath 1910, 194; Christianitis 2004, 334–335): To find three numbers such that the product of any two bears to the sum of those two a given ratio. Diophantus knew that his trove of find-the-numbers puzzles typically had multiple solutions, but he would only offer one. He represented each problem in terms of one unknown number, which we might call
“x” or “the unknown,” and proceeded to reason algebraically, stipulating restrictions until he reduced the problem to a single equation. In the above problem, Diophantus starts by stipulating that the product of the first and second numbers is three times their sum, the product of the second and third numbers is four times their sum, and the product of the first and third numbers is five times their sum. He designates the second number as his unknown, and therefore, the first number equals three of the unknown in part of the unknown less three units. In modern notation, where the first, second, and third numbers are denoted by f, x, and t, he is saying $f = \frac{3x}{x-3}$, and similarly, $t = \frac{4x}{x-4}$. Therefore, since the product of the first and third is five times their sum, $\frac{3x}{x-3} \cdot \frac{4x}{x-4} = 5\left(\frac{3x}{x-3} + \frac{4x}{x-4}\right)$. For most of its history in Alexandria, “2 in part of 7” or “of 2 the 7th part” connoted not only Ahmosé’s “2 quantities each of partsize $\tfrac{1}{7}$,” meaning an unsized “heaping,” but also an unclosed operation of 2 divided by 7. (Either way, in Alexandria, the closure was $\tfrac{1}{4}\ \tfrac{1}{28}$.) Because Diophantus was immersed in an arithmetic involving unknown numbers, closure was not an option. Therefore, he had rules for unclosed operations involving expressions in a single unknown number, analogous to rational expression arithmetic taught in high school algebra courses. Accordingly, he applies a rule for multiplying unclosed divisions to get $\frac{3x}{x-3} \cdot \frac{4x}{x-4} = \frac{12x^2}{x^2+12-7x}$. He then applies another rule to get $\frac{3x}{x-3} + \frac{4x}{x-4} = \frac{7x^2-24x}{x^2+12-7x}$. Thus, x must satisfy $12x^2 = 5(7x^2 - 24x)$, which he solves to get 120 in part of 23. He is content with this answer, instead of closing it in the Alexandrian manner to get $5\ \tfrac{1}{6}\ \tfrac{1}{46}\ \tfrac{1}{69}\ \tfrac{1}{92}\ \tfrac{1}{276}$, a result so unsightly that it evokes an image of Diophantus concurring with Ptolemy’s ghost on the inconvenience of Egyptian fraction divisions. Was Diophantus participating in an established tradition of common fraction arithmetic?
This is a disputed question (Christianitis 2004, 331–335). He clearly thinks the unit is divisible and that Egyptian fractions are numbers. He also thinks that unclosed divisions and unclosed heapings of addends determine unique numbers. However, there are several aspects of his work that increase prospects of the narrative stopping here, and not implying the embrace of a system of common fractions by Diophantus or by the mathematical literati of Alexandria. First, in his solution to the above problem, Diophantus felt compelled to explain to his mathematical audience his rule for adding two unclosed divisions. Thus, after indicating $\frac{3x}{x-3} + \frac{4x}{x-4} = \frac{7x^2-24x}{x^2+12-7x}$, he explains: $\frac{3x}{x-3} + \frac{4x}{x-4} = \frac{3x(x-4)+4x(x-3)}{(x-3)(x-4)}$. This detour shows that Diophantus possessed a rule for adding rational expressions, and, ipso facto, a rule for adding common fractions. It also suggests that this rule was not very familiar to his mathematical audience, or he would not have mentioned it. Second, Diophantus often expressed m in part of n with the shorthand notation mn, a notation scheme for unclosed divisions that had been in use in Alexandria for several centuries by people who apparently were thinking in terms of Egyptian fractions (Fowler 2004, 370–371). Diophantus’ usage seems no different in this regard (Fowler 2004, 371fn). In his rare use of the mixed numeral form, he used Egyptian fractions. For example, he would express 6041 in part of 16 as $377\ \tfrac{1}{2}\ \tfrac{1}{16}$.
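The solution to Problem 36 of Book IV discussed above can be verified with exact rational arithmetic. The following Python sketch is a modern check, not a rendering of Diophantus’ reasoning; the function name `add_unclosed` is hypothetical and simply encodes the rule for adding two unclosed divisions that he explains to his readers.

```python
from fractions import Fraction

def add_unclosed(a, b, c, d):
    """Rule for adding 'a in part of b' and 'c in part of d'."""
    return Fraction(a * d + c * b, b * d)

x = Fraction(120, 23)            # the unknown: 120 in part of 23
f = 3 * x / (x - 3)              # first number
t = 4 * x / (x - 4)              # third number

# The three stipulated ratios of product to sum.
ratio_first_second = (f * x) / (f + x)
ratio_second_third = (x * t) / (x + t)
ratio_first_third = (f * t) / (f + t)

# The Alexandrian closure of 120 in part of 23 as Egyptian fractions.
closure = 5 + sum(Fraction(1, n) for n in (6, 46, 69, 92, 276))

# The mixed numeral for 6041 in part of 16.
mixed = 377 + Fraction(1, 2) + Fraction(1, 16)

print(f, x, t)
print(ratio_first_second, ratio_second_third, ratio_first_third)
```

The three numbers come out as 120/17, 120/23, and 120/7, and the addition rule agrees with ordinary fraction addition for any nonzero parts.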
To summarize, it seems that Diophantus’ arithmetic of rational expressions in one unknown was not a generalization of an undocumented practice of common-fraction arithmetic in the Hellenic world. Indeed, the historian David Fowler (2004, 368), after an extensive review of primary sources, concluded: “Nowhere do I find any convincing evidence for the proposal that ‘the Greeks’ used anything like our notations for common fractions and our ways of performing fractional arithmetic.” The earliest documentable prototype of common-fraction arithmetic is found in the Chinese bamboo text Suàn shù shū (Cullen 2007) circa 200 BCE. Diophantus evidently grasped the concepts underlying the operational rules of division that would eventually govern common fraction algorithms. However, these algebraic rules used in solving equations may well have been discovered through reasoning about unclosed divisions, invoking properties of reciprocals known and transmitted through subscientific traditions since Old Babylon. So, while Diophantus could operate with mn expressions as if they were common fractions, it seems to be a pattern in the history of mathematics that concepts are often exploited long before their reference potentials are delimited with names and notations in mathematical definitions, and before they are constrained by systemic properties in rules or axioms (Kitcher 1984, 187–203). Hence, the most we can probably say about this second Alexandrian arithmetic, rational expression arithmetic in a single unknown it is, and common fraction arithmetic it is not.
5 Interweaving the Three Tales with Current Strands
5.1 Measurement Systems and Decimals
The Old Babylonian scribes were very proficient with the arithmetic of measure numbers, including highly precise calculations involving irrational numbers. They did this without the use of anything like common fractions or even Egyptian fraction arithmetic, a system they likely had encountered by Ahmosé’s time. The Babylonian experience suggests that decimal fluency does not depend on common fraction fluency, whether speaking mathematically, psychologically, or pedagogically. It also suggests how the semantical underpinnings of decimals diverge from those of common fractions. They share cognitive roots in equi-partitioning and unitizing, leading to measure numbers in a single unit. But from there the semantics of decimals, like the sexagesimals of Old Babylon, is situated within that of ideal measurement systems and abstracts the sense of a conventional compound measurement as an enumeration according to a decreasing sequence of groupings in a single conversion ratio. Not so for common fractions. When one thinks of a system of measurement like the apothecary pounds, ounces, drams, scruples, and grains, one appreciates that in this system the smaller units are divisors of all the larger units, but one frowns at the plethora of conversion ratios: 1: 12, 1:8, 1:3, and 1:20. More ideal are systems where the conversion ratios in the progression of units are a constant, like 1: 60 in the case of the DMSS system or 1:10 in the metric system. In the conceptualization of its groupings (tenths, hundredths,
tens, hundreds, etc.), decimal semantics is thus rooted in an ideal measurement system whose powers-of-10 place values resemble units used in actual metrologies, where conversions of measurements to equivalent measurements are easily imagined and comparisons of size easily established. Decimal operations and algorithms can thus be made very transparent as natural extensions of whole number operations and algorithms, but not without subtleties related to the uses of 0 and the decimal point. Common fractions have a more abstract relation to actual measurement systems and processes. Smaller unit fractions are typically not divisors of larger unit fractions and typically have no correspondence with units used in actual measurement systems. Indeed, unit fractions seem like a counting system of general units where successive units differ by one in the denominator, obscuring not only the notion of equivalence, but also the associated conversion and comparison processes. This is reinforced when, in the effort to build conceptions of common fractions as a system of measure numbers, emphasis is placed on students creating equi-partitions of geometric figures and discrete sets to represent artificial measurement systems inhabitable by arbitrary unit fractions. In these dissections, it is left to the student to “unitize” a component that they can count to create a measure number. To represent operations with their diagrams, students must then use highly variable, compound dissections with visual units merging to form other visual units. The sense of measurement by a decreasing sequence of units might easily be lost in all of this, particularly its approximate nature wherein the count of the final unit in the sequence is an estimate.
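The contrast drawn in this section between a plethora of conversion ratios and a single constant ratio can be made concrete with a small mixed-radix sketch in Python. The function names and the sample measurement of 1 pound 2 ounces 3 drams 1 scruple 5 grains are hypothetical illustrations; the apothecary ratios 12, 8, 3, and 20 are those cited in the text.

```python
def to_smallest_unit(counts, ratios):
    """Reduce a compound measurement to a count of its smallest unit.

    counts: amounts from largest unit to smallest.
    ratios: ratios[i] of unit i+1 make one of unit i.
    """
    total = counts[0]
    for count, ratio in zip(counts[1:], ratios):
        total = total * ratio + count
    return total

def from_smallest_unit(total, ratios):
    """Rebuild the compound measurement from a count of the smallest unit."""
    counts = []
    for ratio in reversed(ratios):
        total, remainder = divmod(total, ratio)
        counts.append(remainder)
    counts.append(total)
    return counts[::-1]

APOTHECARY = [12, 8, 3, 20]      # pound:ounce:dram:scruple:grain
grains = to_smallest_unit([1, 2, 3, 1, 5], APOTHECARY)
round_trip = from_smallest_unit(grains, APOTHECARY)

# With a constant ratio of 10, the reduction is just digit juxtaposition.
hundredths = to_smallest_unit([4, 3, 7], [10, 10])

print(grains, round_trip, hundredths)
```

In the constant-ratio case the counts 4, 3, 7 reduce to 437 with no visible computation, which is the transparency the text attributes to decimal and sexagesimal place values.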
5.2 Common Fractions and Algebraic Reasoning
The best algebra of the Bronze Age, a notable algebra at that, was being done in Babylon with sexagesimals and geometrically based identities of area and length, without common fractions. More pertinently, Diophantus’ story demonstrates that algebraic reasoning about rational expressions does not depend on prior common fraction algorithmic competency, or even common fraction representations of fractional measures. It also suggests that, once the connection is made between common fractions and the division operation in the elementary grades, proficiency in common fraction operations and algorithms can be developed much later as part of operating on real number fractions and rational expressions in a real variable, semantically anchored in the properties of the division operation. Indeed, after listing the usual “Why?” questions people have about the unique features of common fraction algorithms and their justifications, Lortie-Forgues et al. (2015, 208) explain: “All of these questions have answers, of course, but the answers are not immediately apparent, and they often require understanding algebra, which is generally taught after fractions, so that students lack relevant knowledge at the time when they learn fractions and might never learn how algebra can be used to justify fraction algorithms after they gain the relevant knowledge of algebra.” It is worth noting that the Dolciani high school algebra textbook series, perhaps the most popular textbook series in the USA between 1963 and 1986, derived the
operational rules of rational expressions in real variables using carefully scaffolded lessons, while citing only token examples of common fractions to illustrate (and certainly not to validate) the rules. The derivations were simple and based on assumptions about multiplicative inverses, i.e., reciprocals, and not common fractions, thereby supporting the notion that Diophantus might have derived his rules for unclosed divisions in a similar fashion (Dolciani et al. 1982).
5.3 The Number Line, Measure Numbers, and Precision
In one sense, as we survey the history of number, measure numbers were the result of successive abstractions from evolving measurement systems, often involving multiple numeral palettes depending on metrology. With traceable threads of meaning, this evolution connects our species’ earliest measurement systems to the very notion of “measure space” encountered in courses on real analysis. The notion of a measure space assumes two things, an underlying set of measurable quantities or magnitudes and a measure function assigning a size (a representation or description in terms of known quantities) to each measurable quantity. For most historical measurement systems that evolved arithmetized fractional numbers, when a compound measurement was reduced to a single measure number in a single unit, the measure number sized, in a manner compatible with the particular historical system, the ratio of the quantity to the known quantity expressed by the unit. Therefore, the measure function connects to the idea of a ratio scale. Some measurement devices like rulers, spring scales, and graduated cylinders provide a visual sense of ratio scales that pair measurable quantities to measure numbers. All ratio scales have an origin, a unit, a built-in linear ordering of (quantity, size) pairs, and a gradation system that assumes the ratio of two quantities being measured is proportional to the ratio of their gradation-based numerical measures on the scale. The historical development of the real number line can thus be characterized as an abstraction of an ideal ratio scale of measurement, the ideal ruler so to speak. It was not, at least in Western traditions, a generalization of the system of rational numbers. Indeed, the ancient Greeks discovered that the ratio of the diagonal of a square to the side of the square is inexpressible as a ratio of two counting numbers.
This implied that counts of a fixed unit, and its nth parts, were inadequate for the task of expressing the exact sizes of the magnitudes on a continuum with that unit. Simply put, ratios of integers, the precursors of rational numbers, fail to fully capture the essence of measure. Instead, the Eudoxian theory of proportions, commencing in Book V of Euclid, heralds the beginning of Greek measure theory and uses ratios of lines to represent the ratio of magnitudes. A century after Simon Stevin, measure numbers on a ratio scale, like the DMSS scale, were being abstracted as real numbers in a linear continuum. Like the particle-wave duality of quantum objects, this abstraction supports the notion that a real number is a point on a graduated line and the notion that a real number is the address or “coordinate” of such a point. Perhaps the duality of perspectives can be bridged by
simply saying a real number is a point on a number line, and an infinite decimal is an idealized decimal measure number associated with a point on a decimal-scaled number line. Both perspectives are important. It is easy to get lost in arguments about what a real number is and forget the fundamental raison d’être of real numbers, i.e., measurement. Long before the Greeks, ancient cultures were aware that physical measurements of continuous quantities were intrinsically approximate. Therefore, an important consideration in differentiating decimals and common fractions when taken as measure numbers is their pliancy in representing all real numbers, in computing derived (indirect) measure numbers, and in determining the precision of derived measure numbers obtained from calculations. Long ago, Ptolemy appreciated the infinite pliancy of the DMSS system for the indirect calculation of the chord sizes in his trigonometric table. Similarly, in modern school algebra, where the focus is on reasoning with continuous quantities, functions, and graphs, the Dolciani Modern Algebra Series chose to introduce real numbers using an infinite decimals approach, even stating a completeness axiom stipulating all real numbers have a decimal representation and all decimals represent real numbers. This was not a rational number approach along the lines Richard Dedekind defined in the nineteenth century (Dedekind 1901), and it was, pedagogically speaking, the better choice. While the notion of infinite decimals at first seems daunting, children encounter infinite decimals by necessity in the earliest grades when they attempt to complete simple long division problems of whole numbers or convert even the simplest common fractions to decimals.
Furthermore, science instruction at the precollege level for the most part uses decimals for measure numbers, usually with metric units, and relatively simple “significant figure” rules for roundoff precision, testifying to the efficiency of decimals in deriving measure numbers and in determining the precision of such derived numbers. In summary, history reveals that the real numbers grew from the needs of measurement, leading to the theories of geometric measure and to the algebra of operations needed for the indirect calculation of measures used in sciences like Ptolemy’s astronomy, where exactness demanded better number tools. Although all physical measurements of continuous quantities are approximate and their associated measure numbers usually can be regarded as rational, the real number system was not abstracted from a defined system of rational numbers but was rooted in Euclidean measure theory and even more ancient traditions.
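The remark above that long division of whole numbers leads children directly to infinite decimals can be made concrete with a short sketch. The function name `decimal_expansion` and its parameters are illustrative assumptions, not something taken from the sources cited here; the code simply carries out schoolbook long division and records each remainder, so that a repeated remainder signals the start of a repeating block.

```python
def decimal_expansion(numerator, denominator, max_digits=30):
    """Schoolbook long division of two positive whole numbers.

    Returns (integer_part, digits, repeat_start). repeat_start is the
    index in `digits` where the repeating block begins, or None if the
    division terminated (or max_digits was reached first).
    Hypothetical helper, written only for illustration.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position at which it first appeared
    while remainder != 0 and len(digits) < max_digits:
        if remainder in seen:
            # Same remainder as before: the digits from here on repeat.
            return integer_part, digits, seen[remainder]
        seen[remainder] = len(digits)
        # "Bring down a zero" and divide again.
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(digit)
    return integer_part, digits, None


print(decimal_expansion(1, 4))  # terminating decimal 0.25
print(decimal_expansion(3, 7))  # repeating block 428571 from the first digit
print(decimal_expansion(1, 6))  # 0.1666..., repeating from the second digit
```

Because there are only finitely many possible remainders for a given divisor, the division either terminates or falls into a cycle, which is one way to see why even very simple common fractions produce infinite decimals.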
6 Collecting the Threads and Tying the Present to Its Recent History
6.1 A Current State of Affairs
In 1989, the National Council of Teachers of Mathematics (NCTM) lent its weight to the fledgling standards movement in the USA. Not afraid to challenge the status quo, the Council published its rather bold Curriculum and Evaluation Standards for
School Mathematics. Unlike CCSSM, NCTM did not provide specific grade-level performance expectations. It set general goals for broad grade-level bands. The Standards did pronounce some controversial value statements in the form of topics deserving increased or decreased attention. One topic designated for decreased attention in grades K-4 was “Paper-and-Pencil fraction computation.” In grades 5–8, regarding common fraction operations, NCTM recommended mastery of a small number of basic facts, like 1/4 + 1/4 = 1/2, and that proficiency in operations be limited to fractions “with simple denominators that can be visualized concretely or pictorially and are apt to occur in real-world settings” (NCTM 1989, 96). Many state supervisors of school mathematics, who were members of NCTM, used the NCTM Standards as a template for developing the standards in their own states during the 1990s. In many respects, the CCSSM movement was a reaction to NCTM’s 1989 standards, as well as state officials grappling with No Child Left Behind calls for grade-specific expectations and assessments. Like the New Math movement, much of the leadership behind CCSSM came from well-intentioned university mathematicians (Høyrup’s scientific tradition) concerned about the failure of the elementary curriculum (subscientific tradition) to prepare students for algebra. Indeed, these deficiencies were at the heart of the National Mathematics Advisory Panel (NMAP) Report, which stipulated that all children should successfully complete the equivalent of Algebra II in high school. CCSSM embodies the changes called for in the Panel’s Benchmarks (NMAP 2008, 20) by making fluency with common fraction operations and algorithms the most urgent (NMAP 2008, 18) of the three pillars identified by the Panel in the critical foundation of algebra. While recommendations can be inspiring, actual assessments of student achievement are far more sobering.
The 2019 National Assessment of Educational Progress (NAEP) revealed that mathematics scores for fourth and eighth graders have essentially flatlined since 2009, the year before CCSSM emerged as the dominant standards in the USA. In NAEP 1978, only 24% of eighth graders tested chose 2 as the whole number closest to 12/13 + 7/8 when given the options 1, 2, 19, 21, and “I don’t know.” Lortie-Forgues et al. (2015, 202) posed this same item to 48 eighth graders enrolled in an algebra course in an affluent, suburban middle school, which had based its curriculum on CCSSM since 2010:
In 2014, 27% of the 8th graders identified ‘2’ as the best estimate of 12/13 + 7/8. Thus, after more than three decades, numerous rounds of education reforms, hundreds if not thousands of research studies on mathematics teaching and learning, and billions of dollars spent to effect educational change, little improvement was evident in students’ understanding of fraction arithmetic.
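For readers who want to check the NAEP estimation item, the following minimal sketch computes the exact sum 12/13 + 7/8 and picks the closest of the four numerical options. The variable names are illustrative assumptions; only the fractions and the answer choices come from the item as reported above.

```python
from fractions import Fraction

# Exact sum of the two fractions in the item: 96/104 + 91/104 = 187/104.
total = Fraction(12, 13) + Fraction(7, 8)

# The four numerical answer choices offered to students.
options = [1, 2, 19, 21]

# Choose the option whose distance from the exact sum is smallest.
closest = min(options, key=lambda n: abs(total - n))

print(total, float(total), closest)
```

Each addend is slightly less than 1, so the sum is slightly less than 2. It may also be worth noticing that 19 and 21 equal the sum of the two numerators and the sum of the two denominators, respectively.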
The difficulties children in the USA have in mastering common fractions have been well known for several hundred years. Young adults are faring no better. Kloosterman (2010, 50), reporting on trends in NAEP data for 17-year-olds between 1978 and 2004, notes the percent correct declined from 66.7% to 45.3% on the item stated as “3 2 13 = ?” The decline typified all the common fraction items he was
given access to. Kloosterman concluded: “This makes sense given that exposure to common fractions outside of school is decreasing, whereas exposure to decimals fractions outside of school is increasing.” Kloosterman is not the first to notice the decline, if not outright extinction, of common fractions in the world around the child. (A similar phenomenon is being noted by NAEP with regard to certain reading skills in the age of video and cell phones.) With metric tools and digital everything, the objections of the trade guilds to this extinction claim are not compelling. STEM professionals also object, but their counterexamples generally exist in the technical milieu of real number fractions, which include both rational and irrational fractions. Such STEM examples are critically important, but translating them into a deferred gratification argument justifying the CCSSM approach has serious problems, one being the “Free” in “Land of the Free” and another being the L.P. Benezet refutation of deferred gratification arguments. Consider first the “Free.” NAEP data is unsettling to some because students in East Asian countries fare much better on common fraction literacy. This was certainly on the minds of the National Mathematics Advisory Panel when it made its recommendations in 2008. However, if history reveals anything, it is that culture, geography, and social traditions matter in the mathematical values and practices of a people. Local control is part and parcel of the history and social fabric of education in the USA. One might, by analogy, mention the same phenomenon of local control when assessing the outcomes of this nation’s response to a pandemic in comparison to East Asian countries. Indeed, once the CCSSM standards became “national” in 2010, the political blowback was substantial. Ultimately, local control translates into an enlarged role for parents and students in curriculum decision-making, in contrast to highly centralized educational systems. 
Indeed, according to some experts, parents and students might play the most important role (Klein and M. 1991). NAEP suggests how American students and parents are casting their ballots vis-à-vis the learned curriculum. To most of them, except for the most basic examples like a half and a fourth, common fractions are hidden numbers, absolutely of no use, and never encountered outside the classroom. Though well written and the source of many improvements (e.g., nicely coherent learning progressions and minimal redundancy from grade to grade), CCSSM has not helped the situation by making common fraction computational fluency perhaps the central imperative of the K-6 elementary curriculum. In the CCSSM K-6 standards, the words whole number(s) and integer(s) appear fewer than 100 times; the words decimal(s) and decimal fraction(s) appear about 33 times, while (common) fraction(s) appears over 180 times! Decimals are defined by common fractions. Consequently, initial decimal experiences of students, instead of being grounded in measurement, presumably involve much work relating decimals back to their fraction definitions, operations, and notations, thereby creating a dependency of decimal understanding on a student’s common fraction understanding, which all too often is fragile. In a logical/structural sense, decimals can be defined by common fractions, and decimal algorithms can be viewed as simple corollaries of common fraction
algorithms within CCSSM’s well-thought-out common fraction development. But if we believe Morris Kline’s (Kline 1974, 100) critique of the New Math of the 1960s, an earlier attempt by the scientific tradition to rigorize elementary arithmetic, such structure “cannot be significant (to the child) at this stage.” If anything, the history of mathematics surveyed above suggests the claims of the National Mathematics Panel about the critical pillars of school algebra need qualification. Furthermore, research by Hiebert, Wearne, and others clearly articulates the deep complexity of developing decimal competency, regardless of any work on common fractions. This leads to “The Story of an Experiment” (Benezet 1935), an article which Harvard mathematics professor Dr. Andrew Gleason found so interesting that he shared it with many, including this author. Gleason is mentioned because he was a mathematics editorial adviser of the very popular Dolciani Modern Algebra Series previously mentioned and because he directed the Cambridge Conference of research mathematicians who met in 1963 to produce a set of recommendations, which will be discussed later, titled Goals for School Mathematics. In 1924, Benezet was the superintendent in Manchester, New Hampshire, facing a crisis wherein 20% of the first graders were not promoted to second grade because they had failed arithmetic. This was particularly pronounced in minority schools with immigrant families who often spoke limited English. After many informal assessments in the district’s schools, in which he asked children quantitative reasoning questions often dealing with fractions and rates, the embarrassing results of traditional instruction convinced half of the schools in the district, particularly the minority schools, to eliminate formal arithmetic until sixth or seventh grade (no use of primers, algorithms, etc.). 
They were, instead, to focus on relevant quantitative reasoning tasks, mediated by math talks, in all subjects taught. Benezet (Benezet 1935, 2410) wrote: “. . . we waste much time in the elementary schools, wrestling with stuff that ought to be omitted or postponed until the children are in need of studying it. . .The whole subject of arithmetic could be postponed until the seventh year of school and it could (then) be mastered in two years by any normal child.” After pre- and post-testing the children from the four experimental schools that resumed the use of primers in the sixth grade, the results were compelling. Upon entering sixth grade, the experimental children, who had yet to learn standard operation algorithms for whole numbers, fractions, and decimals, were far behind the traditional classes. By the end of the year, on the arithmetic skills expected of sixth graders, which included common fraction proficiency, the experimental students performed just as well as the traditionally trained children and were more fluent and confident in their thinking about numbers and operations. Perhaps Benezet was a bit underhanded using fraction and rate problems in his demonstrations that something was amiss in his Manchester School District’s elementary mathematics curriculum. But he at least deserves credit for making one question the traditions of that curriculum regarding common fractions and the demand for early fluency in fraction algorithms, particularly the recurring negative effects of this curriculum on certain populations within school districts. For that reason, several cursory remarks must be made about the history of the American precollege curriculum.
6.2 Observations from Recent History
At the end of the nineteenth century, according to Snyder (1993), the typical school district in the USA had 144 days in its school year. The typical student attended class 99 of those days and nearly completed eighth grade, with great disparities due to race and gender. (The typical black male completed fifth grade while the typical white male completed eighth grade.) Between 1800 and 1900, these numbers were much smaller, including length of school year and attendance levels. In 1900, few Americans completed a course in algebra or geometry and only about 7% ever graduated from high school. Algebra was considered utterly abstract and intended as a mental discipline for the college-bound, who were very few students. To say the least, during the nineteenth century and early twentieth century, arithmetic was the terminal mathematics course for the vast majority of students, and it was designed to be just that: “to prepare the young for the common avocations in life” (Greenleaf 1845, Preface). It is worthwhile exploring samples of arithmetic texts written in English from the seventeenth to the twentieth century to gain a better appreciation of this goal. (See Resourceaholic in References.) As their terminal mathematics course, children encountered substantial work with compound numbers, fractions, and decimals. The diversity of measurement units in common use required fractions when expressing a compound number in terms of its largest unit. Importantly, common fraction fluency was not intended nor generally thought of as an essential foundation for algebra. In the twentieth century, high school populations gradually expanded, especially during the Great Depression. By the mid-twentieth century, the surge of enrollments in colleges, driven by the G.I. Bill, and the calls for universal high school graduation led to the reevaluation of the high school curriculum as a preparation for college.
Mathematicians of prestigious universities became involved in the goals and redesign of secondary school mathematics as a preparation for college. Such efforts as the University of Illinois Committee on School Mathematics, founded in 1951, and the 1958 AMS School Mathematics Study Group began to produce sample high school text materials, eventually morphing into a K-12 reformation known as the New Math movement, which barely lasted into the 1970s at the K-8 level before being driven “Back to Basics.” A report produced at the time, the Goals for School Mathematics (1963), was distributed among mathematicians and mathematics educators at universities across the country but received little acclaim, in part because its stated purpose was to be futuristic, which it definitely was. However, the report is pertinent. Dr. William Martin of MIT and Dr. Andrew Gleason of Harvard cochaired the four-week conference. Most participants were well-known mathematics researchers from prestigious American institutions. The purpose of the conference was to produce a coherent and radical set of long-term goals for K-12 mathematics. Section 5 of the report is devoted to the K-6 mathematics curriculum. The first sentence of the section captures the spirit of their recommendations. “As we have indicated above, the objective for mathematics instruction in the elementary grades
[K-6] is familiarity with the real number system and the main ideas of geometry.” The report goes on to emphasize the following:
• Early use of the real number line in the context of much measurement activity.
• Reasoning with inequalities and “considerable” experience with approximations, including the effects of “round off” and significant figures.
• Fractions with small denominators to name additional points on the number line.
• “explicit study of the decimal system of notation including comparisons with other bases”
• Study of decimals for rational and irrational numbers.
• Early focus on mathematical properties of real number system with algorithms coming “considerably later in the curriculum”.
• Study of geometric transformations, symmetries, and similarity of figures.
What is very clear in the report is that the real numbers, the properties of their operations, the number line, measurement, decimals, and the geometry of transformations and similarity are dominant in the vision of the Cambridge Conference for the elementary curriculum. Common fractions can help to populate the number line but seem otherwise not to be emphasized. Fluency with common fraction operations and algorithms is clearly not a “critical” priority, and possibly not even a goal of these K-6 recommendations of professional research mathematicians. The back-to-basics movement in the 1970s pushed common fraction fluency back into the forefront of the elementary curriculum, until the ascendancy of the NCTM 1989 Standards and various systemic reform efforts of the 1990s seemed to stall the trend. But the backlash to the NCTM Standards and the related curricula spawned by systemic reform efforts was severe. As already mentioned, the CCSSM doubled down and firmly embedded fluency with common fractions and their standard algorithms as perhaps the central imperative of the K-6 curriculum.
Along with the National Mathematics Panel pronouncement about the critical foundation of algebra, CCSSM lent weight to the sense that the tradition of early common fraction fluency was a mathematical imperative, a status it had never previously had in American education. While lifting the burden of very large denominators, the CCSSM common fraction standards lock in, among several newer practices, the traditional computational practices of the decades before the “New Math.” So, while many in the country claimed the CCSSM standards were enforcing a national curriculum and stepping on states’ rights and local control, the CCSSM standards, except for nominal changes in branding, remain the standards in nearly all states. One reason, not surprisingly, may be that the CCSSM demands for fraction and decimal computational fluency using standard algorithms have strong appeal to many adults based on long-standing American traditions. Nonetheless, states are hard-pressed to enforce standards when faced with the low performance of their students on recognized exams like NAEP and the lack of improvement, if not the outright decline in the performance of a large portion of their students on topics emphasized in their standards, like common fractions. Local control is exercised at many levels, and state intrusions into the affairs of school
districts have their limits. Even with state backing, CCSSM advocates are now, after a decade of implementation, facing growing concerns by parents and local educators. These CCSSM advocates are resorting to calls for patience (Goldstein 2019). Though painted with sweeping strokes, this story of epicyclic motion within the thinking about common fraction fluency in American elementary education is salient.
6.3 Other Voices of Note
In his essays on Measure of Magnitudes published in a series between 1931 and 1935, Henri Lebesgue (1966) introduced a construction of the real numbers as infinite-decimal measures of segments on a number line. He sought to illuminate a pathway for the precollege curriculum to develop the real numbers while passing over the study of common fractions.
“But would we still speak of fractions at the primary level or in the first two grades of the secondary level? No, since this is not essential for the theory and serves no practical purpose. I believe that people will agree with me when I say that performing operations on twenty-seconds and thirty-sevenths is a martyrdom that we inflict on twelve-year-olds out of pure sadism without any justification based on usefulness as an extenuating circumstance. Yes, I know that by searching hard, one can find certain “applications” of fractions, that certain mechanics make calculations with their fractions in cutting threads on screws. . . . .”
“. . .The reform would be effective if we agreed to have the children no longer study two kinds of numbering, namely, numbering in terms of nths for commensurable numbers and the decimal numbering system, that is if we permitted them to answer 0.428 where the answer is 3/7.”
“True, a divided by b or a/b is still read “a bths” when a and b are integers, but this way of speaking no longer compels us to develop the entire theory of fractions any more than the word quatre-vingt-douze [French word for 92, literally “four-twenty (plus) twelve”] compels us to study the numbering system with base twenty.” (Lebesgue 1966, 31–32)
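As a small worked check on Lebesgue's example (an illustration added here, not part of his text), the size of the discrepancy between the three-place decimal answer 0.428 and the exact value 3/7 can be computed directly:

```latex
\frac{3}{7} = 0.428571\ldots, \qquad
\frac{3}{7} - 0.428 = \frac{3000}{7000} - \frac{2996}{7000}
                    = \frac{4}{7000} = \frac{1}{1750} \approx 0.00057 .
```

The truncated decimal therefore differs from the common fraction by less than one part in a thousand of the unit, which is the sense in which a decimal answer can stand in for 3/7 as a measure number.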
Lebesgue’s views impacted the Cambridge Conference thinking. They also influenced Russian psychologist V. V. Davydov and his colleagues as they developed an elementary curriculum based on measurement and the real numbers. The Davydov curriculum provides an existence proof that the elementary curriculum can embody the core recommendations regarding real numbers and fractions found in the Cambridge Conference Report and build real number arithmetic from the fundamental notions of measurement. One of the overarching principles Davydov (1975, 113) drew from Lebesgue is stated as follows: But what this means in terms of curriculum design is no less than an end to the arithmetic of fractions as it is interpreted in the school. The shift from whole numbers to real numbers is a shift from arithmetic to “algebra,” to laying the foundation for analysis.
The first half of Davydov’s first-grade curriculum starts with comparing, composing, and decomposing continuous quantities of the same kind, without numerical associations, while identifying measurable traits like volume and area and mass, and representing their actions on quantities symbolically and with lines. Children thereby develop algebraic relations of equality and inequality between composed and decomposed prenumeric quantities represented in symbols and on a linear scale, akin to the properties of general magnitudes investigated by Euclid (e.g., if a, b, and c are magnitudes of the same kind, a > b implies a + c > b + c). Only near the end of first grade are numbers introduced as multitudes of units by using tokens or tally marks to communicate the amount of a unit in a measurement, based on a 1–1 comparison. Counting numbers are then introduced as a sequence of names for those multitudes of units. As with the prenumeric quantities, the focus of work with numeric quantities is to develop algebraic relations of equality and inequality between composed and decomposed quantities, but this time akin to the properties of multitudes of general magnitudes investigated by Euclid. (E.g., if m is a number, i.e., “multitude of unity,” then m(a + b) = ma + mb.) On this foundation in the first grade, Davydov proceeds to expand his arithmetic of measure numbers, as he said, in a shift toward algebra and analysis. Davydov’s curriculum is sometimes classified as “early algebra” because of its early use of algebraic symbolism to express general patterns encountered by students in attempting to solve problems of measurement and quantity. The teacher’s role in this is not a case of handing students fully baked a priori abstractions, but more akin to providing words and symbols for patterns children have exploited in their problem-solving activities as well as guiding them in stipulating the general scientific principles governing those patterns.
In this regard, Davydov’s Vygotsky-based social constructivism seems to be following the historical roots of mathematical learning as described in Kitcher’s constructivist epistemology. Informative examinations of the philosophical/pedagogical details, the mathematical foundations, and the successful classroom implementations of Davydov’s curriculum can be found, respectively, in Schmittau and Morris (2004), Bass (2019), and Dougherty (2008). Dougherty’s words deserve mention here. Summarizing the Measure Up (MU) Project, a longitudinal experiment immersing elementary children, regardless of prior achievement or “ability level,” in a modified Davydov curriculum, she concludes: An approach to elementary mathematics that focuses on non-specified, generalized quantities is often thought to be too abstract and thus, not accessible by young children. MU preliminary results, however, support Davydov’s claims that it offers young children a meaningful foundation on which to build sophisticated and complex mathematics. Understanding the structure and properties of mathematics creates a way for children to construct solid underpinnings that lead to substantive mathematics. . .It builds confidence so that even within non-routine or unfamiliar situations, children can reason through the relationships in the problem. (p. 411)
Davydov’s approach is but one possible development of elementary mathematics on the basis of measurement and the real numbers without passing first through the
gauntlet of common fraction operational fluency. Significant approaches to the elementary mathematics curriculum centrally founded on measurement, and not on common fraction fluency, have already been developed and studied in this country, such as Developing Mathematical Processes (Romberg 1977). It seems reasonable that similar approaches to real number numeracy will prove more than viable if given a chance for development.
7 Conclusion
7.1 Suggestions for Consideration
The historical strands surveyed above underline the significant cultural variations in the fundamental concept of measure number. Indeed, successive abstractions employed in the human effort of expanding number to include the “continuous number domain,” i.e., the real numbers, led Richard Dedekind to philosophize: Numbers are free creations of the human mind; they serve as a means of apprehending more easily and more sharply the difference of things. (Dedekind 1901, Preface to First Edition). This general portrayal of number is quite appealing as a summary of number’s social history. Since the time of Plato and Euclid, the final stages of a mathematical creation in the scientific tradition have been those of stipulation and systematization, with definitions and axioms. In the context of education, Poincaré (1952, 116) cautions as follows: What is a good definition? For the philosopher or the scientist, it is a definition which applies to all the objects to be defined, and applies only to them; it is that which satisfies the rules of logic. But in education, it is not that; it is one that can be understood by the pupils. So, if we are to believe Poincaré and Dedekind, a major goal of the elementary curriculum, per the Cambridge Conference Report, is to guide students, brief stipulation by brief stipulation rooted in experimental facts, to a good definition of the continuous number domain. The task of pedagogy is to assure this is done with understanding and in a way that allows students to recognize that this free creation truly helps them in “apprehending more easily and more sharply the difference of things” in their world. To the child first developing understanding of magnitude, number, and measurement relationships, the things are in their physical environment, not simply the “manipulatives” used in school, and in the world of quantities and comparisons invoked by their parents and by their social groups.
In Kitcher’s philosophy, the mathematical reality of the “ideal human subject” is clear: It resides in the “ings” of mathematical practice, ontologically and epistemologically prior to the “ions”; e.g., abstract mathematical notions like collection are derived from idealizations of mathematical activities like collecting (Kitcher 1984, 110). Constructivist philosophies in education are consistent with this view: The protomathematical learning of children is located in reflective operating on their world. To engage reflectively, their actions must be purposeful. By way of an
analogy whereby numbers are paints on a palette, painting a picture of an actual object or scene chosen by the child with a palette of colors actually visible in the setting is usually more purposeful than a painting-by-the-numbers of an artificial, random scene handed to them and using colors that are never seen. Curriculum standards must pay attention to this. For years, educators involved with the learning issues surrounding minority students have called for curriculum practices that reflect relevant community, cultural, and social purposes in the learning activities of these students (e.g., Educating American Indian/Alaska Native Elementary and Secondary Students 1995). But such practices are needed by all students if they are to apprehend more easily and more sharply the difference of things actually in their world. Hence, given the near extinction of common fractions in the environment of the child, there needs to be a compromise between the thinking of the 1989 NCTM Standards and the 2010 CCSSM Standards. Here are some suggestions that hopefully can provoke discussions and lead to a better interface, with fewer epicycles, between the scientific mathematical tradition and the mathematical practices of elementary education in the USA.
1. (Echoes of Ptolemy) The real numbers are an appropriate milieu for all of school mathematics, starting in kindergarten. The decimal system, with its cognitive roots in the denominate thinking of ideal measurement systems and in the geometry of a number line, provides the appropriate numerical idiom. Properties of decimal operations can be developed by means of measurement activities and geometric definitions vis-à-vis the number line. Decimal algorithms can be developed from the very concrete compound measurement operations from which they arose historically without appeal to common fraction algorithms.
2. (Echoes of Babylon) Elementary school, through Grade 8, is the appropriate place to develop fluency in decimal operations and algorithms. Today’s children live in a world of decimal numbers. Simple common fraction literacy, without the goal of fluency in standard fraction algorithms, is still a relevant backdrop at this level. However, with the place-value system, calculators, the metric system, and other decimal referents, such as percentages, commonplace in their environment, the potential is considerable for providing children with semantical support of decimals as measure numbers on a real number line.
3. (Echoes of Diophantus) Algebraic thinking can be developed throughout elementary school, particularly with the operational properties of real numbers; but the algebra of grades 8 through 10 is a proper setting for developing fluency in common fraction algorithms as a corollary of the broader development of real numbers and simple rational expressions. All young adults need this foundation for a world in which so many career options depend on mathematical literacy. Moreover, based on the recommended certification practices at this level, teachers would be prepared to help students navigate the algebraic reasoning required.
4. (Echoes of Benezet) State standards must not preempt the advantage responsible local control can bring to education in America. Year by year, benchmarked
A Tale of Three Cities: Thebes, Babylon, and Alexandria
27
proficiencies can entrench a well-intentioned but not ideal curriculum for generations, and stifle the resourcefulness exhibited by Superintendent L.P. Benezet in thinking outside the box to address serious learning challenges in local environments. Regarding fractions, educational research (Moss and Case 1999) has found compelling evidence that the most effective progression for developing understanding of fractional measures is percentages – decimals – and then common fractions. Standards that are effectively national, like CCSSM, should not stifle such longitudinal approaches to curriculum experimentation.
7.2 The Lesson of Thebes
One of the important lessons of the tale of Thebes is the rather notable fact that the Egyptian fraction system survived untarnished for many thousands of years in even non-Egyptian cultures, well into the age of common fractions. It is a warning that the common-fractions-based, elementary curriculum, with its deep roots in American tradition, can outlive its usefulness for centuries, long after CCSSM standards are forgotten and regardless of the toll taken on large segments of the USA population. Looked at historically, the standards movement in the USA seems superficial when faced with the challenge of perturbing the deeper equilibrium of traditional practices. Even so, it is never too late to rethink what is being done. As Benezet discovered in his school district, unnecessary hurdles, like the demands of early computational fluency with common fractions, can only exacerbate inequities in the elementary education system and ultimately lead to very poor educational outcomes overall. This appears to be the case today. An agenda that prioritizes decimal number fluency and early algebraic reasoning, while delaying common fraction fluency, is a compromise that seems possible in spite of the stubbornness of traditional practices.
References

Allen JP (2014) Middle Egyptian: an introduction to the language and culture of hieroglyphs, 3rd edn. Cambridge University Press, Cambridge UK
Bass H (2019) Is the real number line something to be built, or occupied? In: Weigand HG (ed) The legacy of Felix Klein, ICME-13 monographs, pp 67–77. https://link.springer.com/chapter/10.1007%2F978-3-319-99386-7_5. Accessed 10 June 2021
Benezet LP (1935–1936) The story of an experiment (in three parts). J National Edu Assoc 24, no. 8: 241–243; 24, no. 9: 301–03; 25, no. 1: 241–43
Chace AB (1986) The Rhind mathematical papyrus. NCTM, Reston. The Rhind Papyrus Volume 1 (wikimedia.org). Accessed 10 June 2021
Christianitis J (ed) (2004) Classics in the history of Greek mathematics. Kluwer, Dordrecht
Cullen C (2007) The Suàn shù shu, “writings on reckoning”: rewriting the history of early Chinese mathematics in the light of an excavated manuscript. Hist Math 34:10–44
Davydov V (1975) The origin of concepts and its importance in structuring the school subject. In: Steffie L (ed) Soviet studies in the psychology of learning and teaching mathematics, vol 7. University of Chicago, Chicago, pp 110–122
Dedekind R (1901) The nature and meaning of number. In: Beman WW (trans) Essays on the theory of numbers. Open Court, Chicago, pp 14–58
Dolciani M, Sorgenfrey R, Brown R, Kane R (1982) Algebra and trigonometry structure and method, book 2, 2nd edn. Houghton Mifflin, Boston
Dougherty B (2008) Measure up: a quantitative view of early algebra. In: Kaput J, Carraher D, Blanton M (eds) Algebra in the early grades. Lawrence Erlbaum, New York, pp 389–412
Educating American Indian/Alaskan native Elementary and Secondary Students (1995) American Indian Science and Engineering Society. Available via ERIC. https://files.eric.ed.gov/fulltext/ED385404.pdf. Accessed 10 June 2021
Fowler D (2004) Logistic and fractions in early Greek mathematics: a new interpretation. In: Christianitis J (ed) Classics in the history of Greek mathematics. Kluwer, Dordrecht, pp 367–380
Goals for School Mathematics: The Report of the Conference on School Mathematics (1963) Educational Services Inc. Cambridge. https://www.maa.org/sites/default/files/pdf/CUPM/first_40years/1963CambConf.pdf. Accessed 10 June 2021
Goldstein D (2019) After 10 years of hopes and setbacks, what happened to the common core? In: New York Times Dec. 6 edition
Gow J (1968) A short history of Greek mathematics. Chelsea Publishing, New York
Greenleaf B (1845) Introduction to the National Arithmetic. Robert S. Davis Publisher, Boston. https://www.resourceaholic.com/p/digitised-antique-maths-textbooks.html. Accessed 10 June 2021
Heath TL (1910) Diophantus of Alexandria: a study in the history of Greek algebra. University Press, Cambridge UK. https://archive.org/details/diophantusofalex00heatiala/page/130/mode/2up. Accessed 10 June 2021
Høyrup J (1989) Sub-scientific mathematics: observations on a pre-modern phenomenon. Hist Sci 28:63–86
Høyrup J (2002) A note on old Babylonian computational techniques. Hist Math 29:193–198
Kitcher P (1984) The nature of mathematical knowledge. Oxford University Press, Oxford
Klein FM (1991) The politics of curriculum decision-making. In: Klein FM (ed) A conceptual framework for curriculum decision-making. SUNY Press, Albany, pp 24–41
Kline M (1974) Why Johnny Can’t add: the failure of the new math. Vintage Books, New York
Kloosterman P (2010) Mathematics skills of 17-year-olds in the United States: 1978 to 2004. J Res Math Educ 41:20–51
Lebesgue H (1966) Measure and the integral, May K (ed and trans). Holden-Day, San Francisco
Lortie-Forgues H, Tian JJ, Siegler RS (2015) Why is learning fraction and decimal arithmetic so difficult? Dev Rev 38:201–221
Moss J, Case R (1999) Developing children’s understanding of the rational numbers: a new model and an experimental curriculum. J Res Math Educ 30:122–147
National Council of Teachers of Mathematics (NCTM) (1989) Curriculum and evaluation standards for school mathematics. NCTM, Reston
National Governors Association Center for Best Practices and Council of Chief State School Officers (2010) Common Core state standards for mathematics. Common Core state standards (college- and career-readiness standards and K–12 standards in English language arts and math). NGA Center and CCSSO, Washington DC. http://www.corestandards.org. Accessed 10 June 2021
National Mathematics Advisory Panel (2008) The final report of the National Mathematics Advisory Panel. Department of Education, Washington DC
Poincaré H (1952) Science and method. Dover Publications, New York
Powell MA (1995) Metrology and mathematics in ancient Mesopotamia. In: Sasson JM (ed) Civilizations of the ancient near east, Vol. III. Charles Scribner’s Sons, New York, pp 1941–1957
Resourceaholic website. https://www.resourceaholic.com/p/digitised-antique-maths-textbooks.html. Accessed June 10
Robson E (2008) Mathematics in ancient Iraq: a social history. Princeton University Press, Princeton
Romberg TA (1977) Developing mathematical processes: the elementary mathematics program for individually guided education. In: Klausmeier HJ, Rossmiller RA, Saily M (eds) Individually guided elementary education: concepts and practices. Academic Press, New York
Saidan AS (1978) The arithmetic of Al-Uqlîdisî. D. Reidel Publishing, Dordrecht
Schmittau J, Morris A (2004) The development of algebra in the elementary mathematics curriculum of V. V. Davydov. Math Educ 8(1):60–87
Snyder TD (1993) 120 years of American education: a statistical portrait. National Center for Education Statistics, Washington DC
Stevin S (1608) The art of tenths, or decimal Arithmetike. Robert Norton, London. https://adcs.home.xs4all.nl/stevin/telconst/10ths.html. Accessed 10 June 2021
Thomas I (1993) Greek mathematical works, volume II. Harvard University Press, Cambridge
Toomer GJ (1984) Ptolemy’s almagest. Gerald Duckworth & Co., London. https://archive.org/details/PtolemysAlmagestPtolemyClaudiusToomerG.5114_201810. Accessed 10 June 2021
Wearne D, Hiebert J (1988) Constructing and using meaning for mathematical symbols: the Case of decimal fractions. In: Hiebert J, Behr M (eds) Number concepts and operations in the middle grade. NCTM, Reston, pp 220–235
Wittgenstein L (1958) Philosophical investigations. Basil Blackwell Ltd, Oxford
Abel’s Approach to Elliptic Integrals
The Addition Theorems
John K. Dagsvik

Contents
1 Introduction
2 Elliptic, Hyperelliptic, and Abelian Integrals
3 Early Approaches to Addition Theorems
3.1 Addition Theorems for Inverse Trigonometric Functions
3.2 Development of the Theory of Elliptic Integrals in the Eighteenth and Early Nineteenth Century
3.3 Results Obtained by Fagnano
3.4 Results Obtained by Euler
3.5 Lagrange’s Approach
3.6 Results Achieved by Legendre
4 Abel’s Addition Theorem: An Elementary Exposition
4.1 The Addition Theorem for Elliptic and Hyperelliptic Integrals
4.2 A More General Version of Abel’s Addition Theorem
4.3 Abel’s Addition Theorem in Its Most General Form
4.4 Elliptic Integrals and Multiplication
4.5 Examples
5 A Short Review of Abel’s Work on Elliptic Integrals and the Addition Theorem
6 Conclusion
References
Abstract
The theory of elliptic integrals and functions was a major research topic during the nineteenth century. Great mathematicians such as Euler, Lagrange, and Legendre made important contributions in this field. The Norwegian mathematician Niels Henrik Abel revolutionized this theory as he introduced novel ideas and approaches. Specifically, he proved what is called Abel’s addition theorem, which is a sweeping extension of previous addition theorems for elliptic integrals obtained in the eighteenth century. This chapter provides an elementary review of addition theorems for elliptic integrals before 1830 with special focus on Abel’s addition theorem. An important aim of the chapter is to bring out the intuition behind the various methodological approaches. The chapter also contains a short review of Abel’s life with special reference to his contribution to the theory of elliptic integrals and beyond.

Keywords
Elliptic integrals · Hyperelliptic integrals · Abelian integrals · Elliptic functions · Addition theorems · Theory of transformations · Multiplication of elliptic integrals

J. K. Dagsvik (*) Research Department, Statistics Norway, Oslo, Norway e-mail: [email protected]
© Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_99-1
1 Introduction
The life of the Norwegian mathematician Niels Henrik Abel (1802–1829) was short and dramatic. Nevertheless, he was able to make pathbreaking contributions in several fields within mathematics. A major part of Abel’s research was on the theory of elliptic functions and integrals. He revolutionized this theory as he introduced new ideas and approaches that later served as the starting point for seminal works by mathematicians such as Karl Theodor Wilhelm Weierstrass (1815–1897) and Georg Bernhard Riemann (1826–1866), among many others. One of Abel’s most important achievements is his addition theorem. The mathematician Carl Gustav Jacob Jacobi (1804–1851) described Abel’s theorem as “die grösste mathematische Entdeckung unserer Zeit, obgleich erst eine künftige grosse Arbeit ihre ganze Bedeutung aufweisen könne” (“the greatest mathematical discovery of our time, even though only a great work in the future will reveal its full significance”). The most general and complete exposition of the addition theorem is given in Abel’s famous Paris memoir. Due to a number of unfortunate circumstances, it was first published 12 years after his death (Abel 1841).

An integral is called elliptic if the integrand is a rational function of $u$ and $y$, where $y^2$ is a polynomial $p(u)$ (say) in $u$ of degree three or four. Specifically, when $p(u) = (1-u^2)(1-k^2u^2)$, where $k^2 < 1$ is a constant, one obtains what is called an elliptic integral of the first kind.

The purpose of this chapter is to give a brief elementary review of the development of the theory of elliptic integrals before 1830 with focus on Abel’s addition theorem. Special emphasis is placed on bringing out the key idea and intuition of Abel’s approach by means of elementary calculus. Accordingly, this exposition should be accessible to a general audience that is interested in the development of mathematics but has only an elementary knowledge of calculus.
A number of survey articles and books on Abel’s work cover the addition theorem. Some of them are simple without the ambition to cover essential details of the mathematical developments and proofs whereas others require considerable background knowledge of mathematics. The works of Houzel (1986, 2004) and
Sørensen (2004) provide deeper analyses and deal with Abel’s entire work within several areas. The paper written by Huntington Barnum (1910) discusses aspects of the theory of elliptic integrals and Abel’s addition theorem, but it is not that much easier to read than Abel’s original papers. The same might be said about the work by Houzel (2004). The paper written by Mittag-Leffler (1923) provides a nice introduction to the theory of elliptic integrals and early works on addition theorems. He also compares aspects of Abel’s work with the approach Weierstrass developed to study elliptic functions and integrals. However, he does not go through the details of the proof of Abel’s addition theorem. Hopefully, this chapter will fill a gap.

This chapter starts by showing how elliptic integrals emerged historically from problems in geometry and mechanics. During the seventeenth century, it became clear that integrals of this type could not be expressed in closed form by elementary functions. Subsequently, it was discovered that integrals of this type possess specific invariance properties, and the theory of such integrals became a central research area by the end of the eighteenth century. Before Abel, it was Giulio Carlo Fagnano (1682–1766), Leonhard Euler (1707–1783), Joseph Louis Lagrange (1736–1813), and Adrien-Marie Legendre (1752–1833) who were the main contributors to this theory. However, when Abel entered the scene, he introduced a totally new and powerful approach that enabled him to generalize the theory far beyond the previous contributions.

Consider, for example, a set of $m$ elliptic integrals of the first kind with the same parameter $k$, a constant lower integration limit, and upper limits $x_j$, $j = 1, 2, \ldots, m$. As will be discussed in detail in Sect. 4, Abel’s approach starts by making a change of variable where the variable of integration $u$ (say) is transformed to $z$ by $u = r_j(z)$, where $r_j(z)$ is chosen as a root of a suitably selected polynomial $\psi(u)$ (say), in which one or several of the coefficients are allowed to depend on $z$. The upper integration limit is changed from $x_j$ to $z_1$ (say), where $z_1$ does not depend on $j$. After this change of variable, Abel shows that the integrand of the $j$-th elliptic integral can be expressed as

$$\frac{r_j'(z)}{\sqrt{p(r_j(z))}} = \frac{q(r_j(z))}{\psi'(r_j(z))},$$

where $q(u)$ is a polynomial with degree less than the degree of $\psi(u)$ minus one. According to the theorem of partial fraction decomposition, the sum over $j$ of the fractions $\{q(r_j(z))/\psi'(r_j(z))\}$ is equal to zero, and therefore the sum of the corresponding elliptic integrals becomes equal to a constant. In other words, an addition theorem for this special case has been established.

The chapter is organized as follows: Section 2 contains a description of what is meant by elliptic integrals, together with examples of how elliptic integrals arise in selected applications in mechanics and geometry. Section 3 contains a review of the theory of elliptic integrals before Abel’s contribution. In Sect. 4, several versions of Abel’s addition theorem are presented, including examples. Section 5 contains a short review of Abel’s life with special reference to his contribution to the theory of elliptic integrals and extensions.
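The partial-fraction identity invoked in the sketch of Abel’s approach can be checked on a small example. The following is only an illustrative sketch: the roots of $\psi$ and the polynomial $q$ are hypothetical values chosen for the illustration, and $\psi$ is assumed to be monic with simple roots, so that $\psi'(r_j) = \prod_{i \neq j}(r_j - r_i)$. Exact rational arithmetic is used so that the sum is exactly zero rather than zero up to rounding.

```python
from fractions import Fraction

# Hypothetical simple roots r_j of a monic polynomial psi(u) of degree 4.
roots = [Fraction(1), Fraction(2), Fraction(4), Fraction(-3)]

# Hypothetical q(u) = u^2 + 3u - 5, coefficients listed from highest degree.
# Its degree (2) is less than deg(psi) - 1 = 3, as the identity requires.
q_coeffs = [1, 3, -5]


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def psi_prime_at_root(j):
    """psi'(r_j) for monic psi with simple roots: product of (r_j - r_i), i != j."""
    prod = Fraction(1)
    for i, r in enumerate(roots):
        if i != j:
            prod *= roots[j] - r
    return prod


# The individual fractions q(r_j) / psi'(r_j) and their sum.
terms = [poly_eval(q_coeffs, roots[j]) / psi_prime_at_root(j)
         for j in range(len(roots))]
total = sum(terms)

print("terms:", terms)
print("sum  :", total)
```

Raising the degree of `q_coeffs` to 3 (that is, to the degree of $\psi$ minus one) makes the sum equal to the leading coefficient of $q$ instead of zero, which is why the degree condition matters.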
2 Elliptic, Hyperelliptic, and Abelian Integrals
The class of elliptic integrals can be described as follows: let $p(x)$ be a polynomial of degree three or four and $R(x, y)$ be a rational function of $x$ and $y$, i.e., $R(x, y)$ is a quotient with numerator and denominator that are polynomials in $x$ and $y$. An integral of the form

$$\int R\left(x, \sqrt{p(x)}\right)dx$$

is called an elliptic integral. The historical reason for this is that when calculating the arc length of an ellipse, one gets an integral of this type. After Gottfried Wilhelm Leibniz (1646–1716) and Isaac Newton (1642–1727) had developed the field of infinitesimal calculus in the 1660s, and Johannes Kepler (1571–1630) had established the laws of celestial mechanics, intensive research into pure as well as applied mathematics, such as mechanics and spherical geometry (navigation), followed. Because Kepler established that planets move around the sun in elliptic orbits, the calculation of the length of these orbits corresponds to the calculation of the arc length of an ellipse. According to Hoffman (1949), Leibniz contacted Newton and the mathematician James Gregory (1638–1675) in 1675, asking whether it would be possible to compute the arc length of an ellipse. Leibniz was told that this problem could only be solved approximately, not in an exact way. At the time, Leibniz thought that he himself would be able to solve this problem, but he eventually realized that he could not. Much later, it was Joseph Liouville (1809–1882) who proved that elliptic integrals cannot be expressed by elementary functions.

When the degree of the polynomial $p(x)$ is greater than 4, the above integral is called hyperelliptic. Abel considered hyperelliptic integrals as well as much more general cases called Abelian integrals. An Abelian integral has the form

$$\int R(x, y(x))\,dx$$

where $y(x)$ is an algebraic function. Remember that an algebraic function $y(x)$ is a root of a polynomial in $y$, $P(x, y)$ (say), where the coefficients are polynomials in $x$ with integer coefficients. In other words, the algebraic function $y(x)$ is determined by $P(x, y(x)) = 0$.

Example 2.1. Arc Length of an Ellipse
Consider the problem of computing the arc length of an ellipse. Recall that the arc length, $s(u,v)$, from $x = u$ to $x = v$ of the graph of a differentiable function $y = f(x)$ can be computed by means of the formula

$$s(u, v) = \int_u^v \sqrt{1 + f'(x)^2}\,dx. \tag{1}$$
Recall also that the ellipse with parameters $a$ and $b$ can be expressed by

$$\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1. \tag{2}$$

From (2), it follows by applying (1) that

$$s(u, v) = \frac{1}{a}\int_u^v \frac{a^4 + (b^2 - a^2)x^2}{\sqrt{(a^2 - x^2)\left(a^4 + (b^2 - a^2)x^2\right)}}\,dx. \tag{3}$$
As mentioned above, Liouville’s work has made clear that the integral in (3) cannot be “solved” in the sense of being expressed by elementary functions. It may be instructive to see how the integral in (3) can be transformed to an elliptic integral of the second kind (see Sect. 3.5). Assume first that $b > a$. Let us make a change of variable, $x \to z$, $x = a\sqrt{1 - z^2}$. Then $dx = -az(1 - z^2)^{-1/2}dz$. The corresponding transformed integral limits become $z_1 = \sqrt{1 - (u/a)^2}$ and $z_2 = \sqrt{1 - (v/a)^2}$. As a result, it follows from (3) that

$$s(u, v) = b\int_{z_2}^{z_1} \frac{(1 - k^2z^2)\,dz}{\sqrt{(1 - z^2)(1 - k^2z^2)}}$$

where $k^2 = (b^2 - a^2)/b^2$. The last integral is an elliptic integral of the second kind with $m = k^2$. In the case where $b < a$, the proof is similar.

Example 2.2. Oscillation of the Pendulum
There is another well-known problem in mechanics that also leads to an elliptic integral, namely the analysis of the oscillation of the simple pendulum. Let $g$ denote the acceleration due to gravity, $l$ the length of the pendulum cord, and $\theta(t)$ the angle between the $y$-axis and the pendulum cord at time $t$. The challenge is to establish how $\theta(t)$ depends on $t$. By applying the standard laws of mechanics, one can show (see, for example, Stephenson 1960) that the following differential equation holds:

$$\theta''(t) = -\frac{g}{l}\sin\theta(t). \tag{4}$$
By multiplying (4) by $2\theta'(t)$, it follows that

$$2\theta''(t)\theta'(t) = -\frac{2g}{l}\sin(\theta(t))\,\theta'(t). \tag{5}$$
The left-hand side of (5) is the derivative of $\theta'(t)^2$ and the right-hand side is the derivative of $2\cos(\theta(t))\,g/l$. Hence

$$\theta'(t)^2 = \kappa\cos\theta(t) + c \tag{6}$$
where $c$ is a constant and $\kappa = 2g/l$. Let $\alpha$ denote the greatest angle between the $y$-axis and the pendulum cord: that is, the angle for which $\theta'(t) = 0$. From this, we obtain $c = -\kappa\cos\alpha$. By solving the differential equation in (6) and by making the change of variable $u = \cos\theta(t)$, it follows that the inverse solution can be written as

$$t(\theta) = \int_{\cos\theta}^{1} \frac{du}{\sqrt{(1 - u^2)(u - \cos\alpha)\kappa}} \tag{7}$$
where $t(\theta)$ is the time at which the pendulum cord makes an angle $\theta$ with the vertical axis. Note that this integral is an elliptic integral. The integral in (7) can be transformed into an elliptic integral of the first kind (see Sect. 3.5) by making the change of variable $u \to z$, $u = 1 + (\cos\alpha - 1)z^2$, under which $u = 1$ corresponds to $z = 0$ and $u = \cos\theta$ to $z = h(\theta)$. This gives

$$\int_{\cos\theta}^{1} \frac{du}{\sqrt{(1 - u^2)(u - \cos\alpha)\kappa}} = \sqrt{\frac{2}{\kappa}}\int_{0}^{h(\theta)} \frac{dz}{\sqrt{(1 - z^2)\left(1 - 0.5(1 - \cos\alpha)z^2\right)}}$$

where

$$h(\theta) = \sqrt{\frac{1 - \cos\theta}{1 - \cos\alpha}} = \frac{\sin(0.5\theta)}{\sin(0.5\alpha)}.$$

This last integral is an elliptic integral of the first kind with $k^2 = 0.5(1 - \cos\alpha)$.

Example 2.3. The Arc Length of the Lemniscate
A third example where an elliptic integral occurs is the computation of the arc length of the lemniscate. The graph of the lemniscate has the same form as the infinity sign, ∞, and can be expressed in Cartesian coordinates by the following formula

$$\left(x^2 + y^2\right)^2 = x^2 - y^2$$

or in polar coordinates as

$$r(\theta)^2 = \cos(2\theta).$$

Here, only the case where $\theta \in [-\pi/4, \pi/4]$ is considered, where $\theta$ is the angle between the $x$-axis and the radius $r(\theta)$ from the origin to the graph.
Since $x = r(\theta)\cos\theta$ and $y = r(\theta)\sin\theta$, this implies that $dx = (r'(\theta)\cos\theta - r(\theta)\sin\theta)\,d\theta$ and $dy = (r'(\theta)\sin\theta + r(\theta)\cos\theta)\,d\theta$, so that

$$\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \sqrt{dx^2 + dy^2} = \sqrt{r'(\theta)^2 + r(\theta)^2}\,d\theta = \frac{d\theta}{\sqrt{\cos(2\theta)}} = \frac{d\theta}{r(\theta)}.$$

Thus, the arc length $s(\theta_0)$ which corresponds to the angle from 0 to $\theta_0$ is given by

$$s(\theta_0) = \int_0^{\theta_0} \frac{d\theta}{r(\theta)}.$$

By the change of variable $\theta \to r$, $r = \sqrt{\cos 2\theta}$, one gets

$$dr = -\frac{\sin(2\theta)}{\sqrt{\cos(2\theta)}}\,d\theta = -\frac{\sqrt{1 - \cos^2(2\theta)}}{\sqrt{\cos 2\theta}}\,d\theta = -\frac{\sqrt{1 - r^4}}{r}\,d\theta$$

which yields

$$s(\theta_0) = \int_0^{\theta_0} \frac{d\theta}{r(\theta)} = -\int_1^{r_0} \frac{dr}{\sqrt{1 - r^4}} = \int_{r_0}^{1} \frac{dr}{\sqrt{1 - r^4}} \tag{8}$$

where $r_0 = \sqrt{\cos 2\theta_0}$. By making the change of variable $r \to w$, $r = \sqrt{1 - w^2}$, it follows that

$$s(\theta_0) = \frac{1}{\sqrt{2}}\int_0^{\sqrt{2}\sin\theta_0} \frac{dw}{\sqrt{(1 - w^2)(1 - 0.5w^2)}}.$$

This last integral is an elliptic integral of the first kind with $k^2 = 0.5$.
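The three reductions of this section can be checked numerically. The sketch below is not part of the original exposition: it compares each arc-length or time integral in its elementary form with the Legendre-type integral obtained after the change of variable, using a plain composite Simpson rule. The parameter values are arbitrary assumptions chosen to keep all integrands away from their endpoint singularities. The transformed integrals are written as the substitutions give them: factor $b$ and limits $z_2$ to $z_1$ for the ellipse, limits $0$ to $h(\theta)$ for the pendulum, and upper limit $\sqrt{2}\sin\theta_0$ for the lemniscate.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Example 2.1: ellipse (x/a)^2 + (y/b)^2 = 1 with b > a, arc from x = u to x = v.
a, b, u, v = 1.5, 2.0, 0.3, 1.0          # assumed sample values
k2 = (b * b - a * a) / (b * b)
z1 = math.sqrt(1 - (u / a) ** 2)
z2 = math.sqrt(1 - (v / a) ** 2)
# Formula (1) with f(x) = b*sqrt(1 - (x/a)^2), so f'(x)^2 = b^2 x^2 / (a^2 (a^2 - x^2)).
ellipse_direct = simpson(
    lambda x: math.sqrt(1 + (b * x) ** 2 / (a * a * (a * a - x * x))), u, v)
# Second-kind form: (1 - k^2 z^2)/sqrt((1 - z^2)(1 - k^2 z^2)) = sqrt((1 - k^2 z^2)/(1 - z^2)).
ellipse_legendre = b * simpson(
    lambda z: math.sqrt((1 - k2 * z * z) / (1 - z * z)), z2, z1)

# Example 2.2: pendulum, time to swing from the vertical to the angle theta < alpha.
g, l, alpha, theta = 9.81, 1.0, 1.0, 0.5  # assumed sample values
kappa = 2 * g / l
# From (6) with c = -kappa*cos(alpha): dt = d(theta) / sqrt(kappa (cos(theta) - cos(alpha))).
pendulum_direct = simpson(
    lambda t: 1 / math.sqrt(kappa * (math.cos(t) - math.cos(alpha))), 0.0, theta)
kp2 = 0.5 * (1 - math.cos(alpha))
h_theta = math.sin(0.5 * theta) / math.sin(0.5 * alpha)
pendulum_legendre = math.sqrt(2 / kappa) * simpson(
    lambda z: 1 / math.sqrt((1 - z * z) * (1 - kp2 * z * z)), 0.0, h_theta)

# Example 2.3: lemniscate r^2 = cos(2 theta), arc from angle 0 to theta0 < pi/4.
theta0 = 0.6                               # assumed sample value
lemniscate_direct = simpson(lambda t: 1 / math.sqrt(math.cos(2 * t)), 0.0, theta0)
w0 = math.sqrt(2) * math.sin(theta0)
lemniscate_legendre = simpson(
    lambda w: 1 / math.sqrt((1 - w * w) * (1 - 0.5 * w * w)), 0.0, w0) / math.sqrt(2)

print("ellipse   :", ellipse_direct, ellipse_legendre)
print("pendulum  :", pendulum_direct, pendulum_legendre)
print("lemniscate:", lemniscate_direct, lemniscate_legendre)
```

In each pair, the two numbers should agree to many digits. Choosing $a \neq 1$ in the ellipse example matters, since with $a = 1$ a wrong power of $a$ in the prefactor would go unnoticed.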
3 Early Approaches to Addition Theorems
By addition theorems, we mean theorems on how the sum or product of two or several functions (integrals) of given arguments can be expressed as a function of the same type with argument that is an explicit function of the given arguments. Familiar examples are the logarithm, exponential, and trigonometric functions.
3.1 Addition Theorems for Inverse Trigonometric Functions
The inverse trigonometric functions are special cases of elliptic integrals, cf. Sections 1 and 2. One example of an addition theorem is given by

$$\arcsin x + \arcsin z = \arcsin\left(x\sqrt{1 - z^2} + z\sqrt{1 - x^2}\right), \tag{9}$$

for $xz \le 0$ or $x^2 + z^2 \le 1$,

$$\arcsin x + \arcsin z = \pi - \arcsin\left(x\sqrt{1 - z^2} + z\sqrt{1 - x^2}\right), \tag{10}$$

for $x > 0$, $z > 0$, $x^2 + z^2 > 1$, and

$$\arcsin x + \arcsin z = -\pi - \arcsin\left(x\sqrt{1 - z^2} + z\sqrt{1 - x^2}\right), \tag{11}$$

for $x < 0$, $z < 0$, $x^2 + z^2 > 1$, where

$$\arcsin x = \int_0^x \frac{du}{\sqrt{1 - u^2}}.$$

Thus, $\arcsin x$ is an elliptic integral of the first kind with $k = 0$. The addition theorem in (9), (10), and (11) can be proved by applying the well-known formula

$$\sin(u + v) = \sin u\cos v + \cos u\sin v$$

where $u = \arcsin x$ and $v = \arcsin z$. A similar addition theorem holds for the integral

$$\operatorname{arctg} x = \int_0^x \frac{du}{1 + u^2}.$$

By the change of variable $u \to z$, $u = z(1 - z^2)^{-1/2}$, the integral above becomes

$$\int_0^x \frac{du}{1 + u^2} = \int_0^{x(1 + x^2)^{-1/2}} \frac{dz}{\sqrt{1 - z^2}}$$

which demonstrates that $\operatorname{arctg} x$ can be expressed by an elliptic integral of the first kind with $k = 0$. It can easily be demonstrated that the following addition theorem, namely

$$\operatorname{arctg} x + \operatorname{arctg} z = \operatorname{arctg}\frac{x + z}{1 - xz}$$

holds for $xz < 1$, and an analogous formula holds for $xz > 1$.
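The case distinctions in the arcsine addition theorem are easy to get wrong, so a quick numerical check may be helpful. The following is a minimal sketch using only the Python standard library. The function names and sample points are illustrative choices, and the three branches are taken to be $\arcsin w$, $\pi - \arcsin w$, and $-\pi - \arcsin w$ with $w = x\sqrt{1-z^2} + z\sqrt{1-x^2}$, for (9), (10), and (11) respectively.

```python
import math


def arcsin_sum(x, z):
    """arcsin x + arcsin z computed from the right-hand sides of (9)-(11)."""
    w = x * math.sqrt(1 - z * z) + z * math.sqrt(1 - x * x)
    w = max(-1.0, min(1.0, w))          # guard against rounding just outside [-1, 1]
    if x * z <= 0 or x * x + z * z <= 1:
        return math.asin(w)             # case (9)
    if x > 0:
        return math.pi - math.asin(w)   # case (10): x > 0, z > 0, x^2 + z^2 > 1
    return -math.pi - math.asin(w)      # case (11): x < 0, z < 0, x^2 + z^2 > 1


def arctan_sum(x, z):
    """arctg x + arctg z from the addition theorem, valid for x*z < 1."""
    return math.atan((x + z) / (1 - x * z))


# One sample point per case of the arcsine theorem, plus a mixed-sign point.
arcsin_samples = [(0.3, 0.4), (0.5, -0.9), (0.9, 0.8), (-0.9, -0.8)]
arcsin_errors = [abs(math.asin(x) + math.asin(z) - arcsin_sum(x, z))
                 for x, z in arcsin_samples]

arctan_samples = [(0.5, 0.7), (-2.0, 0.3), (3.0, 0.2)]
arctan_errors = [abs(math.atan(x) + math.atan(z) - arctan_sum(x, z))
                 for x, z in arctan_samples]

print("arcsin errors:", arcsin_errors)
print("arctg errors :", arctan_errors)
```

All reported errors should be at the level of floating-point rounding.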
3.2 Development of the Theory of Elliptic Integrals in the Eighteenth and Early Nineteenth Century
Since it seemed impossible to reduce elliptic integrals to elementary functions, mathematicians started to investigate what might be called the intrinsic properties of these transcendental functions. Note that in the eighteenth and early nineteenth century, elliptic integrals were often called elliptic functions, in contrast to the terminology established after Abel and Jacobi, in which the inverses of elliptic integrals are called elliptic functions. In the eighteenth century, the properties of the trigonometric and inverse trigonometric functions were well known to Euler and others, including the addition relations reviewed in Sect. 3.1. It is not known whether Euler was led to his approach in the study of elliptic integrals by considering the properties of the inverse trigonometric functions reviewed in Sect. 3.1, but it may very well have been so. To elaborate on this point, consider the relation in (9) and let

$$y = x\sqrt{1 - z^2} + z\sqrt{1 - x^2} \tag{12}$$

where $z$ now is a given constant. By differentiation, it follows from (9) that

$$\frac{dy}{\sqrt{1 - y^2}} = \frac{dx}{\sqrt{1 - x^2}}. \tag{13}$$

The solution of (13) is given by $y = \sin(\arcsin x + C)$, which is equivalent to

$$y = \sin(\arcsin x + \arcsin z)$$

where $z = \sin C$. But it is known from (9) that $y$ is also given by (12), which implies that $x$ and $y$ are related by the second-degree polynomial equation

$$x^2 + y^2 - 2\alpha xy - \beta = 0 \tag{14}$$

where $\alpha$ and $\beta$ are constants, $\beta = z^2$ and $\alpha^2 + \beta = 1$. In other words, the solution $y$ of (13) can be expressed as an algebraic function of $x$.

As mentioned above, it was Fagnano and Euler who were the first to study elliptic integrals systematically, see Ayoub (1984). When Euler started his investigations of elliptic integrals, inspired by the results obtained by Fagnano, his point of departure was the differential equation

$$\frac{dy}{\sqrt{1 - y^4}} = \frac{dx}{\sqrt{1 - x^4}}. \tag{15}$$
Euler’s conjecture was that the solution of (15) would satisfy the polynomial equation

$$x^2 + y^2 + \beta x^2y^2 - 2\alpha xy - \beta = 0 \tag{16}$$

where $\alpha$ and $\beta$ are constants. In fact, he proved that this is so with $\beta > 0$ and $\alpha^2 + \beta^2 = 1$. Since the differential equations in (13) and (15) are quite similar, Euler’s conjecture might very well have been motivated by the fact that the solution of (13) is determined by (14). Apart from the term $\beta x^2y^2$, the latter equation is identical to (16). Subsequently, Euler studied more general differential equations of the same type where the terms under the radicals of (15) were general polynomials of degree four, to be discussed below.

Among the approaches to addition theorems of elliptic functions in the eighteenth and early nineteenth centuries, it is the approach of Lagrange (1766–9) that appears to be the most interesting one. His method is constructive, whereas the approaches of Fagnano (1718), Euler (1761), and Legendre (1825–1828) were based on conjectures of what the addition theorem might look like. In this context, it is worth mentioning that John Landen (1719–1790) obtained what is known as Landen’s theorem (Landen 1775). Landen’s theorem expresses the length of the arc of a hyperbola in terms of the lengths of the arcs of two ellipses. For details, see Cayley (1961), who goes through Landen’s result. It seems, however, that Landen did not establish addition theorems explicitly and did not fully realize the value of his discovery (Mittag-Leffler 1923). Both Euler and Lagrange tried in vain to extend their approaches to more general situations where the terms under the radicals of (15) were polynomials of a higher degree than the fourth. Much later, Richelot (1842) demonstrated how Lagrange’s approach can be used to prove more general addition theorems.

After Euler and Lagrange, Legendre continued research on the theory of elliptic integrals. Although he produced a different proof of Euler’s addition theorem, he did not come up with fundamentally new ideas beyond those of Euler and Lagrange. One of his major results was the demonstration that all elliptic integrals can be reduced to the three fixed canonical forms given in Sect. 3.5 below.
3.3
Results Obtained by Fagnano
Fagnano wrote several papers that dealt with integrals that had geometric interpretations but could not be expressed by elementary functions. In 1718, he published the following remarkable relation for the arc length of the lemniscate (Fagnano 1718; compare Example 2.3),

$$2s(x) = s\!\left(\frac{2x\sqrt{1 - x^4}}{1 + x^4}\right) \tag{17}$$

for $x \in [-1, 1]$, where $s(u)$ is as defined in (8). The problem of establishing (17) is equivalent to showing that

$$y = \frac{2x\sqrt{1 - x^4}}{1 + x^4} \tag{18}$$
Abel’s Approach to Elliptic Integrals
is a solution of the differential equation

$$\frac{dy}{\sqrt{1 - y^4}} = \frac{2\,dx}{\sqrt{1 - x^4}}.$$

Apparently, it was the following problem formulated by Johann Bernoulli (1667–1748) that motivated Fagnano in his investigations, namely: what is the curve with the property that the time taken for a particle to traverse the curve is proportional to the distance from a given point? The solution of this problem yields the integral given in (8). Fagnano’s result in (17) can in fact be viewed as a special case within the theory of transformations of elliptic integrals. This topic was to become central in the competition between Abel and Jacobi. Although Fagnano demonstrated that (18) was a solution to (14), he was not able to find the general solution to this differential equation. For further discussion on Fagnano’s work on elliptic integrals, see Ayoub (1984).
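Fagnano’s duplication relation (17) can be checked numerically. The sketch below assumes that $s(u)$ in (8) is the lemniscate arc-length integral $\int_0^u dt/\sqrt{1 - t^4}$ (an assumption, since (8) is not reproduced here); the helper names `simpson`, `s`, and `fagnano_double` are illustrative and not taken from the text.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def s(x):
    """Assumed form of s in (8): integral of 1/sqrt(1 - t^4) from 0 to x."""
    return simpson(lambda t: 1.0 / math.sqrt(1.0 - t**4), 0.0, x)

def fagnano_double(x):
    """The argument on the right side of (17), i.e. y(x) in (18)."""
    return 2.0 * x * math.sqrt(1.0 - x**4) / (1.0 + x**4)

x = 0.3                      # small enough that fagnano_double(x) stays below 1
lhs = 2.0 * s(x)             # left side of (17)
rhs = s(fagnano_double(x))   # right side of (17)
print(lhs, rhs)              # the two values should agree to many digits
```

The check is restricted to small $x$ so that the upper limit of the second integral stays away from the singularity of the integrand at 1.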
3.4
Results Obtained by Euler
Euler first saw Fagnano’s work on 23 December 1751, which explains why Jacobi referred to this date as the birth of elliptic functions. According to Ayoub (1984), it was then that Euler was asked to examine the collected papers of Fagnano, who was being put forward for membership of the Berlin Academy. Euler had already encountered elliptic integrals through his work on elasticity and had published several papers on the topic before 1751. He had also written papers on the rectification of the ellipse, which leads to an elliptic integral (see Example 2.3). Ayoub (1984) notes that Euler had apparently not written anything about addition theorems for elliptic integrals before reading the work of Fagnano. Subsequently, Euler (1761) obtained the general solution to (15) and he established the following addition theorem

$$s(x) + s(y) = s\!\left(\frac{x\sqrt{1 - y^4} + y\sqrt{1 - x^4}}{1 + x^2 y^2}\right) \tag{20}$$

which generalizes (17), since (20) implies (17) when $x = y$. In the same paper, Euler demonstrated that the approach applied to prove (20) can also be applied to prove addition theorems for more general integrals $F(x)$ of the form

$$F(x) = \int_0^x \frac{du}{\sqrt{p(u)}}$$

where $p(x)$ is a polynomial of degree three or four.
Next, consider Euler’s proof of his addition theorem in the case with $p(x) = 1 + ax^2 + bx^4$, where $a$ and $b$ are real constants such that $p(x) > 0$ for $x \in [0, c]$, for some constant $c$. It will be clear that the corresponding addition theorem can be obtained by solving the differential equation

$$\frac{dy}{\sqrt{1 + ay^2 + by^4}} = \frac{\varepsilon\,dx}{\sqrt{1 + ax^2 + bx^4}} \tag{21}$$

where $y$ is an unknown function $f(x)$ (say), and $\varepsilon$ is equal to $1$ or $-1$. Here, it is sufficient to consider the case with $\varepsilon = 1$. However, it will be convenient to consider the case with $\varepsilon = -1$ below when discussing Legendre’s approach. With $\varepsilon = 1$, the equation in (21) becomes equivalent to $F(y) = F(x) + C$, where $C$ is a constant and $y$ is a function of $x$ and $C$. Thus, the solution can be expressed as $y = F^{-1}(F(x) + C)$. Without loss of generality, we can write $C = F(z)$, where $z$ is a constant. Hence, (21) becomes equivalent to

$$F(y) = F(x) + F(z) \tag{22}$$

where now $y$ is a function of $x$ and $z$. Euler (1761) proved that the solution of (21) also satisfies the equation

$$\alpha\left(x^2 + y^2\right) = 2\beta xy + \gamma x^2 y^2 + \delta \tag{23}$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are constants. We shall next prove this. By forming the total differential of (23), one obtains

$$\alpha(x\,dx + y\,dy) = \beta(x\,dy + y\,dx) + \gamma\left(xy^2\,dx + x^2 y\,dy\right)$$

which is equivalent to

$$\left(\alpha x - \beta y - \gamma x y^2\right)dx + \left(\alpha y - \beta x - \gamma x^2 y\right)dy = 0. \tag{24}$$

When solving (23) for $x$ and $y$, respectively, two possible solutions are

$$x = \frac{\beta y - \sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)y^2 + \alpha\gamma y^4}}{\alpha - \gamma y^2} \tag{25}$$

and

$$y = \frac{\beta x + \sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)x^2 + \alpha\gamma x^4}}{\alpha - \gamma x^2}. \tag{26}$$
There are in fact four possible solutions of (23) of $x$ ($y$) in terms of $y$ ($x$). However, it is essential for the subsequent results to pick opposite branches for $x$ and $y$: the root with the minus sign in (25) and the root with the plus sign in (26). From the equations above, by multiplying by the respective denominators in (25) and (26), it follows that

$$\alpha x - \beta y - \gamma x y^2 = -\sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)y^2 + \alpha\gamma y^4} \tag{27}$$

and

$$\alpha y - \beta x - \gamma x^2 y = \sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)x^2 + \alpha\gamma x^4}. \tag{28}$$

By inserting (27) and (28) into (24), one obtains

$$\frac{dy}{\sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)y^2 + \alpha\gamma y^4}} = \frac{dx}{\sqrt{\alpha\delta + \left(\beta^2 - \alpha^2 - \gamma\delta\right)x^2 + \alpha\gamma x^4}} \tag{29}$$

which proves that the relation in (23) in fact represents a solution of (21) with $\alpha\delta = 1$, $a = \beta^2 - \alpha^2 - \gamma\delta$, $b = \alpha\gamma$, and $\alpha = 1/z$. Hence, $\delta = z$, $\gamma = bz$, and $\beta = \sqrt{a + z^{-2} + bz^2}$, where $z$ is an arbitrary variable that for the moment is kept fixed. By using the above notation, it follows that (26) can be expressed as

$$y = \frac{x\sqrt{1 + az^2 + bz^4} + z\sqrt{1 + ax^2 + bx^4}}{1 - bz^2 x^2}. \tag{30}$$

Above, $z$ is treated as a constant, but since the relations above hold for any value of $z$, one can treat $z$ as a free variable. From (30) and (22), it is therefore the case that

$$F\!\left(\frac{x\sqrt{1 + az^2 + bz^4} + z\sqrt{1 + ax^2 + bx^4}}{1 - bz^2 x^2}\right) = F(x) + F(z). \tag{31}$$

Note that the equation in (20) is a special case of (31). By letting $a = -(1 + k^2)$ and $b = k^2$, the addition equation in (31) can be reduced to

$$H\!\left(\frac{x\Delta(z) + z\Delta(x)}{1 - k^2 z^2 x^2}\right) = H(x) + H(z) \tag{32}$$

where $H(x)$ denotes the (definite) elliptic integral of the first kind and $\Delta(x)^2 = (1 - x^2)(1 - k^2 x^2)$. Thus, an addition theorem for elliptic integrals has been established. Note that when $z = x$ in (32), an extension of Fagnano’s invariance relation in (17) is obtained.
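The addition equation (31) lends itself to a direct numerical check. The following is a minimal sketch under stated assumptions: the coefficient values, the sample points, and the helper names (`simpson`, `make_F`, `euler_argument`, `residual`) are hypothetical choices made for illustration, and the integrals are approximated with a plain Simpson rule.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def make_F(a, b):
    """Return F with F(x) = integral from 0 to x of du / sqrt(1 + a u^2 + b u^4)."""
    def F(x):
        return simpson(lambda u: 1.0 / math.sqrt(1.0 + a * u**2 + b * u**4), 0.0, x)
    return F

def euler_argument(x, z, a, b):
    """Algebraic combination of x and z that appears inside F on the left of (31)."""
    root = lambda u: math.sqrt(1.0 + a * u**2 + b * u**4)
    return (x * root(z) + z * root(x)) / (1.0 - b * z**2 * x**2)

def residual(x, z, a, b):
    """Left side of (31) minus right side; should vanish up to quadrature error."""
    F = make_F(a, b)
    return F(euler_argument(x, z, a, b)) - F(x) - F(z)

generic_gap = residual(0.4, 0.3, 0.5, 0.25)     # a quartic with p > 0 everywhere
lemniscate_gap = residual(0.3, 0.4, 0.0, -1.0)  # a = 0, b = -1 recovers (20)
print(generic_gap, lemniscate_gap)
```

The second call uses $a = 0$, $b = -1$, for which (31) reduces to the lemniscate addition theorem (20).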
Moreover, Euler (1761) demonstrated that the same technique as above works for solving differential equations of the type

$$\frac{m\,dx}{\sqrt{p(x)}} = \frac{n\,dy}{\sqrt{q(y)}} \tag{33}$$

where $m$ and $n$ are integers, and $p(x)$ and $q(x)$ are general fourth-order polynomials in $x$ that are not necessarily equal. The solutions of (33) constitute part of the theory of transformations, see Cayley (1961). For example, if in (33) $m = n = 1$ and $p(x) = (1 - \alpha^2 x^2)(1 - \beta^2 x^2)$, then the differential $dx/\sqrt{p(x)}$ is carried into a differential of the same form $dy/\sqrt{q(y)}$ by means of the transformation

$$y = \frac{x}{1 + \alpha\beta x^2}$$

where $q(y) = (1 - \alpha_1^2 y^2)(1 - \beta_1^2 y^2)$, $\alpha_1 = \alpha + \beta$, and $\beta_1 = 2\sqrt{\alpha\beta}$. Landen (1775) and Mittag-Leffler (1923) discovered a similar result, namely that when $m = n = 1$, $\alpha_1 = \alpha + \sqrt{\alpha^2 - \beta^2}$, and $\beta_1 = \alpha - \sqrt{\alpha^2 - \beta^2}$, then

$$y = x\sqrt{\frac{1 - \alpha^2 x^2}{1 - \beta^2 x^2}}.$$

Recall that part of Euler’s method is the trick of letting the coefficients of the original differential equation in (21) be suitable functions of the free variable $z$. At the outset, this variable is understood to be fixed, but it becomes clear later that it may vary freely since the corresponding differential equation is supposed to hold for any value of $z$. Although this observation is trivial, it is nevertheless very important because it makes it possible to express $y$ in the symmetric relation given in (30), which is a key element in the statement of the corresponding addition theorem. Below it will be shown that a similar trick, allowing one or several coefficients to depend on a free variable, is also essential for establishing Abel’s addition theorem.
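The first transformation quoted in this section, $y = x/(1 + \alpha\beta x^2)$ with $\alpha_1 = \alpha + \beta$ and $\beta_1 = 2\sqrt{\alpha\beta}$, can be tested by comparing the two integrals it is supposed to identify. This is only a sketch: the values of $\alpha$, $\beta$, and $x$ and the helper names are hypothetical, and both integrals are taken from 0 so that $x = 0$ corresponds to $y = 0$.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def integral_in_x(x, alpha, beta):
    """Integral of du / sqrt(p(u)) from 0 to x, p(u) = (1 - alpha^2 u^2)(1 - beta^2 u^2)."""
    p = lambda u: (1.0 - alpha**2 * u**2) * (1.0 - beta**2 * u**2)
    return simpson(lambda u: 1.0 / math.sqrt(p(u)), 0.0, x)

def integral_in_y(x, alpha, beta):
    """Integral of dv / sqrt(q(v)) from 0 to y(x), with the transformed moduli."""
    alpha1 = alpha + beta
    beta1 = 2.0 * math.sqrt(alpha * beta)
    q = lambda v: (1.0 - alpha1**2 * v**2) * (1.0 - beta1**2 * v**2)
    y = x / (1.0 + alpha * beta * x**2)
    return simpson(lambda v: 1.0 / math.sqrt(q(v)), 0.0, y)

alpha, beta, x = 0.5, 0.3, 0.8     # hypothetical values with p and q positive
gap = integral_in_x(x, alpha, beta) - integral_in_y(x, alpha, beta)
print(gap)                         # should vanish up to quadrature error
```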
3.5
Lagrange’s Approach
Lagrange introduced a method for solving a specific type of differential equations that can be applied constructively to obtain the addition theorem for elliptic integrals (Cayley 1961). Lagrange’s approach for solving (21) with $\varepsilon = 1$ starts with the introduction of a new variable $t$, defined by

$$t = t(x) = \int_0^x \frac{du}{\sqrt{p(u)}}. \tag{34}$$
Equations (21) and (34) imply that

$$dt = \frac{dx}{\sqrt{p(x)}} = \frac{dy}{\sqrt{p(y)}} \tag{35}$$

and

$$\frac{dx}{dt} = \sqrt{p(x)} \quad\text{and}\quad \frac{dy}{dt} = \sqrt{p(y)}. \tag{36}$$

By differentiating the first equation in (36) with respect to $t$, it follows that

$$\frac{d^2x}{dt^2} = \frac{d(dx/dt)}{dt} = \frac{d\sqrt{p(x)}}{dx}\,\frac{dx}{dt} = \frac{2ax + 4bx^3}{2\sqrt{p(x)}}\sqrt{p(x)} = ax + 2bx^3.$$

Similarly, it follows from (36) that

$$\frac{d^2y}{dt^2} = ay + 2by^3.$$

At this point, it becomes convenient to introduce the notation $r = x + y$ and $q = x - y$. By means of this notation, one gets

$$\frac{d^2q}{dt^2} = \frac{d^2x}{dt^2} - \frac{d^2y}{dt^2} = aq + \frac{b}{2}\left(q^3 + 3r^2 q\right) \tag{37}$$

and

$$\frac{dr}{dt}\frac{dq}{dt} = \left(\frac{dx}{dt} + \frac{dy}{dt}\right)\left(\frac{dx}{dt} - \frac{dy}{dt}\right) = \left(\frac{dx}{dt}\right)^2 - \left(\frac{dy}{dt}\right)^2 = a\left(x^2 - y^2\right) + b\left(x^4 - y^4\right) = arq + \frac{b}{2}rq\left(r^2 + q^2\right). \tag{38}$$

From (37) and (38), it follows that

$$r\frac{d^2q}{dt^2} - \frac{dr}{dt}\frac{dq}{dt} = arq + \frac{brq^3}{2} + \frac{3br^3q}{2} - arq - \frac{brq^3}{2} - \frac{br^3q}{2} = br^3q. \tag{39}$$
Dividing both sides of (39) by $r^3$ and multiplying both sides by $2\,dq/dt$ leads to

$$\frac{2}{r^2}\frac{dq}{dt}\frac{d^2q}{dt^2} - \frac{2}{r^3}\frac{dr}{dt}\left(\frac{dq}{dt}\right)^2 = 2bq\frac{dq}{dt}. \tag{40}$$
Equation (40) can readily be integrated, since its left side is the derivative of $r^{-2}(dq/dt)^2$ with respect to $t$, which gives

$$\frac{1}{r^2}\left(\frac{dq}{dt}\right)^2 = bq^2 + K \tag{41}$$

where $K$ is an arbitrary constant. When inserting for $r$, $q$, and $dq/dt$, as functions of $x$ and $y$, into (41) and applying the equation in (36), the resulting equation becomes

$$\left(\frac{\sqrt{p(x)} - \sqrt{p(y)}}{x + y}\right)^2 = b(x - y)^2 + K. \tag{42}$$

Equation (42) expresses the relationship between $x$ and $y$ in implicit form. It is, however, possible to simplify this equation considerably. After multiplying both sides of (42) by $(y + x)^2$ and rearranging, one obtains

$$-2\sqrt{p(x)p(y)} = (y + x)^2\left(K + (x - y)^2 b\right) - 2 - a\left(x^2 + y^2\right) - b\left(x^4 + y^4\right). \tag{43}$$

Squaring both sides of (43) yields

$$\begin{aligned}
0 &= \left[(y + x)^2\left(K + (x - y)^2 b\right) - 2 - a\left(x^2 + y^2\right) - b\left(x^4 + y^4\right)\right]^2 - 4\left(1 + ax^2 + bx^4\right)\left(1 + ay^2 + by^4\right) \\
&= \left(a^2 - 4b\right)\left(y^2 - x^2\right)^2 + K^2(y + x)^4 - 2aK(y + x)^2\left(x^2 + y^2\right) - 4K(y + x)^2 - 4bK(y + x)^2 x^2 y^2.
\end{aligned} \tag{44}$$

Note that the right-hand side of (44) has $(y + x)^2$ as a common factor. By dividing through and rearranging, (44) can be reduced to

$$\alpha\left(x^2 + y^2\right) = 2\beta xy + \gamma x^2 y^2 + \delta \tag{45}$$

where $\delta = 4K$, $\beta = a^2 - 4b - K^2$, $\gamma = 4bK$, and $\alpha = K^2 - 4b + a^2 - 2aK$. Equation (45) represents the solution $y$, as a function of $x$, of the differential equation in (21). Note that equation (45) is equivalent to equation (23). This means that there are no other solutions to Euler’s differential equation (21) than the one Euler postulated and subsequently verified. The rest of the analysis proceeds in the same way as Euler’s deductions.
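Lagrange’s first integral can be illustrated numerically: along the solution (30) of (21) with $\varepsilon = 1$, the constant $K$ solved from (42) should not depend on $x$, and with the coefficients listed below (45) the polynomial relation should hold. The sketch below assumes hypothetical values for $a$, $b$, and $z$; the function names are illustrative only.

```python
import math

A, B = 0.5, 0.25     # hypothetical coefficients a, b in p(x) = 1 + a x^2 + b x^4

def sqrt_p(u):
    return math.sqrt(1.0 + A * u**2 + B * u**4)

def y_of(x, z):
    """Solution (30) of (21) with epsilon = 1, normalised so that y = z at x = 0."""
    return (x * sqrt_p(z) + z * sqrt_p(x)) / (1.0 - B * z**2 * x**2)

def lagrange_K(x, y):
    """The integration constant K, solved from (42)."""
    return ((sqrt_p(x) - sqrt_p(y)) / (x + y))**2 - B * (x - y)**2

def relation_45(x, y, K):
    """Left side minus right side of (45) with the coefficients given below (45)."""
    alpha = K**2 - 4.0 * B + A**2 - 2.0 * A * K
    beta = A**2 - 4.0 * B - K**2
    gamma = 4.0 * B * K
    delta = 4.0 * K
    return alpha * (x**2 + y**2) - 2.0 * beta * x * y - gamma * x**2 * y**2 - delta

z = 0.3
xs = (0.0, 0.2, 0.4, 0.6)
ks = [lagrange_K(x, y_of(x, z)) for x in xs]                    # should all coincide
gaps = [relation_45(x, y_of(x, z), ks[0]) for x in xs]          # should all vanish
print(ks, gaps)
```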
3.6
Results Achieved by Legendre
As mentioned above, it was Legendre who most systematically and extensively conducted research on elliptic integrals before Abel and Jacobi. In his monumental three-volume work, Traité des fonctions elliptiques et des intégrales eulériennes, which was published in the period 1825–1828, Legendre presented the state of the theory of elliptic integrals (which he called elliptic functions) at that time, and he also discussed a number of applications in geometry and mechanics. Furthermore, he made a comprehensive contribution to numerical analysis, leading to practical approximation formulas that could be used to calculate numerical values of various functions, such as the Gamma function and elliptic integrals. In volume 2 of Legendre’s work, there are approximately 141 pages of tables calculated with a precision of between 10 and 15 decimals. An impressive amount of work was thus put into making these tables. Some of the results presented by Legendre in 1825–1828 were previously published by Legendre (1793) and Legendre (1811–1817).

In 1825, Legendre proved that elliptic integrals can always be expressed by linear combinations of elementary functions and the following three types of elliptic integrals, namely:

$$\int \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}, \quad \int \frac{(1 - k^2 x^2)\,dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}, \quad\text{and}\quad \int \frac{dx}{(1 - nx^2)\sqrt{(1 - x^2)(1 - k^2 x^2)}}$$

where $k$ and $n$ are constants and $|k| \le 1$. In Legendre’s classification, such integrals are called elliptic integrals of the first, second, and third kind, respectively.

In Chapter 6 of Legendre (1825), a proof is given of the addition theorem established by Euler and Lagrange for elliptic integrals. He shows that, by making the substitution $x = \sin\varphi$ and $y = \sin\psi$, Euler’s differential equation in (21) with $a = -(1 + k^2)$, $b = k^2$, and $\varepsilon = -1$ becomes

$$\frac{d\psi}{\sqrt{1 - k^2\sin^2\psi}} + \frac{d\varphi}{\sqrt{1 - k^2\sin^2\varphi}} = 0. \tag{46}$$

Here, it is understood that $\psi$ is a function of $\varphi$. The problem now is to solve (46), that is, to establish the relationship between $\varphi$ and $\psi$. By integrating (46), it follows that $G(\psi) + G(\varphi) = C$ where

$$G(\varphi) = \int_0^{\varphi} \frac{dv}{\sqrt{1 - k^2\sin^2 v}} \tag{47}$$
and $C$ is a constant. As with Euler, Legendre’s proof is based on a conjecture about what a solution of (46) might look like. Specifically, Legendre’s conjecture was that the relationship between $\psi$ and $\varphi$ can be expressed by

$$\cos\varphi\cos\psi - \cos\mu = (\sin\varphi\sin\psi)\sqrt{1 - k^2\sin^2\mu} \tag{48}$$

where $\mu$ is a constant. The equation in (48) is equivalent to each of the equations

$$\cos\psi - \cos\mu\cos\varphi = (\sin\mu\sin\varphi)\sqrt{1 - k^2\sin^2\psi} \tag{49}$$

and

$$\cos\varphi - \cos\mu\cos\psi = (\sin\mu\sin\psi)\sqrt{1 - k^2\sin^2\varphi}. \tag{50}$$

The simplest way of verifying that (48), (49), and (50) are equivalent is to square both sides of the equations and then make suitable rearrangements. By dividing (48) by $\sin\varphi\sin\psi$ and subsequently differentiating with respect to $\varphi$, treating $\psi$ as a function of $\varphi$, one gets

$$\frac{1}{\sin\varphi}\left(\cos\psi - \cos\mu\cos\varphi\right) + \frac{d\psi/d\varphi}{\sin\psi}\left(\cos\varphi - \cos\mu\cos\psi\right) = 0.$$

Moreover, by using (49) and (50), it follows that

$$\frac{d\varphi}{\sin\varphi}(\sin\mu\sin\varphi)\sqrt{1 - k^2\sin^2\psi} + \frac{d\psi}{\sin\psi}(\sin\mu\sin\psi)\sqrt{1 - k^2\sin^2\varphi} = 0$$

which reduces to the differential equation in (46). By letting $\varphi = 0$, it follows from (48) that $\psi(0) = \mu$, implying that $C = G(\mu)$. If one instead considers $\mu$ as a function of $\varphi$ and $\psi$, determined by (49) and (50), it follows that one can establish addition theorems by solving the equations in (49) and (50) for $\mu$. Specifically, by dividing the equation in (49) by the equation in (50), $\sin\mu$ cancels on the right-hand side and one can solve for $\cos\mu$, which yields

$$\cos\mu = \frac{(\sin\psi\cos\psi)\Lambda(\varphi) - (\sin\varphi\cos\varphi)\Lambda(\psi)}{(\sin\psi\cos\varphi)\Lambda(\varphi) - (\sin\varphi\cos\psi)\Lambda(\psi)} \tag{51}$$

where

$$\Lambda(\varphi) = \sqrt{1 - k^2\sin^2\varphi}.$$
By multiplying the denominator and numerator of (51) by $(\sin\psi\cos\varphi)\Lambda(\varphi) + (\sin\varphi\cos\psi)\Lambda(\psi)$ and rearranging, we get that

$$\cos\mu = \frac{\cos\varphi\cos\psi - (\sin\varphi\sin\psi)\Lambda(\varphi)\Lambda(\psi)}{1 - k^2\sin^2\varphi\sin^2\psi}. \tag{52}$$
Similarly, it is possible to show that

$$\sin\mu = \frac{(\sin\varphi\cos\psi)\Lambda(\psi) + (\sin\psi\cos\varphi)\Lambda(\varphi)}{1 - k^2\sin^2\varphi\sin^2\psi} \tag{53}$$

and

$$\operatorname{tg}\mu = \frac{(\operatorname{tg}\varphi)\Lambda(\psi) + (\operatorname{tg}\psi)\Lambda(\varphi)}{1 - (\operatorname{tg}\varphi\operatorname{tg}\psi)\Lambda(\varphi)\Lambda(\psi)}. \tag{54}$$

Next, make the change of variable back to the original variables: that is, $\varphi \to x$ and $\psi \to y$, where $\varphi = \arcsin x$ and $\psi = \arcsin y$. Accordingly, $G(\varphi) = H(x)$, $G(\psi) = H(y)$, and $G(\mu) = H(\sin\mu)$, where $H(x)$ is the elliptic integral of the first kind where the integration goes from zero to $x$. Thus, it is possible to write (47) as $H(x) + H(y) = H(\sin\mu)$. Furthermore, (53) can be expressed as

$$\sin\mu = \frac{x\Delta(y) + y\Delta(x)}{1 - k^2 x^2 y^2}$$

which is equivalent to the result in (32). Legendre has also given an alternative proof in which he shows that the relations in (47) to (50) follow from spherical geometry, and more specifically, the relations between the sides of a triangle on the surface of a ball with respective lengths $\varphi$, $\psi$, and $\mu$. In addition to being of independent interest, this proof also provides a good illustration of how problems within spherical geometry lead to elliptic integrals.
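Legendre’s relations can be checked numerically in the angular variables. The sketch below computes $\mu$ from (52) and (53) and then tests both the addition statement $G(\varphi) + G(\psi) = G(\mu)$ and the conjectured relation (48). The modulus and the two angles are hypothetical sample values, and the names `K_MOD`, `Lam`, `G`, and `mu_of` are illustrative.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

K_MOD = 0.6   # hypothetical modulus k

def Lam(phi):
    """Lambda(phi) = sqrt(1 - k^2 sin^2(phi)) as defined after (51)."""
    return math.sqrt(1.0 - K_MOD**2 * math.sin(phi)**2)

def G(phi):
    """Elliptic integral of the first kind in angular form, equation (47)."""
    return simpson(lambda v: 1.0 / Lam(v), 0.0, phi)

def mu_of(phi, psi):
    """Angle mu recovered from cos(mu) in (52) and sin(mu) in (53)."""
    den = 1.0 - K_MOD**2 * math.sin(phi)**2 * math.sin(psi)**2
    cos_mu = (math.cos(phi) * math.cos(psi)
              - math.sin(phi) * math.sin(psi) * Lam(phi) * Lam(psi)) / den
    sin_mu = (math.sin(phi) * math.cos(psi) * Lam(psi)
              + math.sin(psi) * math.cos(phi) * Lam(phi)) / den
    return math.atan2(sin_mu, cos_mu)

phi, psi = 0.5, 0.7
mu = mu_of(phi, psi)
addition_gap = G(phi) + G(psi) - G(mu)
conjecture_gap = (math.cos(phi) * math.cos(psi) - math.cos(mu)
                  - math.sin(phi) * math.sin(psi) * Lam(mu))   # relation (48)
print(mu, addition_gap, conjecture_gap)
```

For $k = 0$ the same code reduces to the elementary statement $\mu = \varphi + \psi$, which is a convenient sanity check.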
4
Abel’s Addition Theorem: An Elementary Exposition
When Abel began working on the theory of elliptic integrals, his approach was radically different from previous contributions in this field. His results are not limited to elliptic integrals but also extend to hyperelliptic integrals and the much more general case of Abelian integrals. The next section discusses the addition theorem in the elliptic and hyperelliptic case.
4.1
The Addition Theorem for Elliptic and Hyperelliptic Integrals
Consider an integral of the form given by
$$F(x) = \int_c^x \frac{du}{\sqrt{p(u)}}$$

where $p(u)$ is a polynomial of degree higher than 2 and $c$ is a constant. When the degree of $p(u)$ is equal to 3 or 4, $F$ reduces to an elliptic integral. As discussed above, Euler, Lagrange, and Legendre proved the addition theorem

$$F(x_1) + F(x_2) = F(x_3)$$

where $x_3$ is a particular algebraic function of $(x_1, x_2)$, and $(x_1, x_2)$ are two arbitrary distinct (real) variables subject to the condition that $p(x_j) > 0$, $j = 1, 2$. In the general case, $p(x)$ may be negative, have complex roots, and have complex coefficients. It is, however, beyond the scope of this chapter to analyze the complex case. In this section, integrals of the form

$$F(x, a) = \int_c^x \frac{\mu(u)\,du}{(u - a)\sqrt{p(u)}}$$

will also be treated, where $a$ is a constant smaller than $c$ and $\mu(u)$ is a polynomial such that the degree of $\mu(u)^2$ is less than the degree of $p(u)$.

As will be clear shortly, the method of partial fraction decomposition of rational functions plays a key role in Abel’s approach. It was Gottfried Leibniz (1646–1716) and Johann Bernoulli (1667–1748) who, independently, in 1702 discovered the method known as partial fraction decomposition of rational functions, which is stated in Lemma 1 below. Abel was of course familiar with this result.

Lemma 1 (Partial Fraction Decomposition) Let $A(x)$ and $B(x)$ be polynomials where the degree of $A(x)$ is less than the degree of $B(x)$. Furthermore, assume that all the roots $x_1, x_2, \ldots, x_K$ of $B(x)$ are distinct and different from the roots of $A(x)$. Then

$$\frac{A(x)}{B(x)} = \sum_{j=1}^{K} \frac{A(x_j)}{(x - x_j)B'(x_j)}. \tag{i}$$

If $B(0) \ne 0$ and $A(x) = xg(x)$, then

$$\sum_{j=1}^{K} \frac{g(x_j)}{B'(x_j)} = 0. \tag{ii}$$
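Both parts of Lemma 1 can be illustrated with a small worked instance. The polynomials below are hypothetical examples chosen so that the hypotheses hold: $B$ is monic with distinct non-zero roots, so that $B'(x_j)$ is the product of the differences $x_j - x_i$, and $A(x) = xg(x)$ has lower degree than $B$.

```python
ROOTS = [1.0, 2.0, -3.0]      # distinct, non-zero roots of the monic cubic B

def B(x):
    """B(x) = (x - 1)(x - 2)(x + 3)."""
    value = 1.0
    for r in ROOTS:
        value *= x - r
    return value

def B_prime_at_root(j):
    """B'(x_j) for a monic polynomial with simple roots: product of (x_j - x_i), i != j."""
    value = 1.0
    for i, r in enumerate(ROOTS):
        if i != j:
            value *= ROOTS[j] - r
    return value

def g(x):
    return 2.0 * x + 1.0

def A(x):
    """A(x) = x g(x), of degree 2, lower than the degree of B."""
    return x * g(x)

x = 0.5                                           # any point that is not a root of B
lhs_i = A(x) / B(x)                               # left side of (i)
rhs_i = sum(A(r) / ((x - r) * B_prime_at_root(j)) for j, r in enumerate(ROOTS))
sum_ii = sum(g(r) / B_prime_at_root(j) for j, r in enumerate(ROOTS))   # left side of (ii)
print(lhs_i, rhs_i, sum_ii)
```

Here the three terms of (ii) are $-3/4$, $1$, and $-1/4$, which cancel exactly.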
A proof of Lemma 1(i) is given in several mathematical textbooks, see, for example, Miller et al. (1990). The proof of Lemma 1(ii) follows by letting $x = 0$ in (i). The next theorem is a special case of Abel’s general addition theorem.

Theorem 1 Let $f(x)$, $\varphi(x)$, and $p(x)$ be polynomials where the degree of $p(x)$ is greater than 2, let $x_j$, $j = 1, 2, \ldots, K$, be the roots of the polynomial $f(x)^2 - \varphi(x)^2 p(x)$, and let $\varepsilon_j = \pm 1$ be determined by

$$f(x_j) = \varepsilon_j \varphi(x_j)\sqrt{p(x_j)}.$$

If all the roots $\{x_j\}$ are real, distinct, and different from zero, and $p(x) > 0$ for $x \in [c, \max_j x_j]$, then

$$\sum_{j=1}^{K} \varepsilon_j F(x_j) = C \tag{55}$$

where $C$ is a constant.

As demonstrated below, some of the roots $\{x_j\}$ can be chosen freely, but not all. Note that the roots $\{x_j\}$ must either satisfy $f(x_j) = \varphi(x_j)\sqrt{p(x_j)}$ or $f(x_j) = -\varphi(x_j)\sqrt{p(x_j)}$. The role of $\varepsilon_j$ is thus to identify which of these two relations applies. Note also that the degree of the polynomial $p(x)$ can be greater than 4, which means that Theorem 1 holds in the hyperelliptic case. This shows that Abel’s result is far more general than the previous contributions by Euler, Lagrange, and Legendre.

A very brief summary of Abel’s approach goes as follows. A key step is the introduction of an auxiliary variable $z$ and the change of variable $u = r_j(z)$, where $r_j(z)$ is a suitable function of $z$. This change of variable implies that it is possible to express the integrand of the elliptic integral $F(x)$, namely $r_j'(z)/\sqrt{p(r_j(z))}$, as a rational function in $r_j(z)$ where the denominator is the derivative of a specific polynomial of which $r_j(z)$ is a root. Since the root $r_j(z)$ depends on $z$, one or several coefficients of this polynomial must depend on $z$. Then, by Lemma 1(ii) (provided the conditions of Lemma 1(ii) are met), the sum of the integrands becomes zero, which implies that the sum of the corresponding integrals in (55) becomes a constant. The auxiliary variable $z$ is only needed in the proof and does not have to be included in the statement of Theorem 1. In the following, the exposition of the proof is broken up into a number of additional lemmas.
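Theorem 1 can be illustrated in the simplest elliptic setting. The sketch below assumes the hypothetical choices $p(x) = x^3 + 1$, $\varphi \equiv 1$, $f(x) = a_0 + a_1 x$, and $c = 0$, none of which come from the text. Two roots $x_1, x_2$ are chosen freely with $\varepsilon_1 = \varepsilon_2 = 1$, the line $f$ is fitted through the corresponding points, and the third root of $f(x)^2 - p(x)$ follows from the sum of the roots. The signed sum in (55) should then be the same for different choices of $(x_1, x_2)$.

```python
import math

def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def p(x):
    """Hypothetical cubic p(x) = x^3 + 1, positive for x > -1."""
    return x**3 + 1.0

def F(x):
    """F(x) with lower limit c = 0."""
    return simpson(lambda u: 1.0 / math.sqrt(p(u)), 0.0, x)

def abel_sum(x1, x2):
    """Return (sum of eps_j F(x_j), x3, eps3) for the line through the two chosen points."""
    y1, y2 = math.sqrt(p(x1)), math.sqrt(p(x2))
    a1 = (y2 - y1) / (x2 - x1)          # slope of f(x) = a0 + a1 x
    a0 = y1 - a1 * x1
    # f(x)^2 - p(x) = -(x^3 - a1^2 x^2 - 2 a0 a1 x + 1 - a0^2), so the three
    # roots sum to a1^2 and the third root follows from the two chosen ones.
    x3 = a1**2 - x1 - x2
    eps3 = 1.0 if a0 + a1 * x3 > 0 else -1.0   # sign in f(x3) = eps3 * sqrt(p(x3))
    return F(x1) + F(x2) + eps3 * F(x3), x3, eps3

total_a, x3_a, eps_a = abel_sum(2.0, 3.0)
total_b, x3_b, eps_b = abel_sum(1.5, 4.0)
print(total_a, total_b)    # the two signed sums should coincide
```

In both configurations the third root is positive and carries the sign $\varepsilon_3 = -1$, so the statement being tested is $F(x_1) + F(x_2) - F(x_3) = C$.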
Lemma 2 Let $u = r_j(z)$, $j = 1, 2, \ldots, K$, be differentiable functions such that $x_j = r_j(v)$ where $\{x_j\}$ are given. Then

$$\sum_{j=1}^{K} \varepsilon_j F(x_j) = \sum_{j=1}^{K} \varepsilon_j \int_c^{x_j} \frac{du}{\sqrt{p(u)}} = \int_{v_0}^{v} \xi(z)\,dz + C_\varepsilon \tag{56}$$

where

$$\xi(z) = \sum_{j=1}^{K} \frac{\varepsilon_j r_j'(z)}{\sqrt{p(r_j(z))}} \tag{57}$$

and $v_0$, $C_\varepsilon$ are constants.

The proof of Lemma 2 follows readily by the change of variable $u = r_j(z)$, where the function $r_j(z)$ is unspecified (but differentiable) apart from the condition that $x_j = r_j(v)$. The constant $C_\varepsilon$ is given by

$$C_\varepsilon = \sum_{j=1}^{K} \varepsilon_j \int_{v_{0,j}}^{v_0} \frac{r_j'(z)\,dz}{\sqrt{p(r_j(z))}}$$

where $c = r_j(v_{0,j})$. Although the integral on the right side of (56) does not immediately appear to represent a simplification of the original integral, it has the advantage that the original sum of integrals in (55) can be expressed as a single integral. The next lemma shows that after a suitable choice of the functions $\{r_j(z)\}$, one can get rid of the square root in the integrand of the elliptic integral and express $\sqrt{p(r_j(z))}$ as a rational function of $r_j(z)$.

Lemma 3 Let $f(x, z)$ and $\varphi(x, z)$ be polynomials in $x$ of degrees $m_1$ and $m_2$, respectively, where at least one of the coefficients of $f(x, z)$ and/or $\varphi(x, z)$ depends on a free variable $z$. If $r_j(z)$, $j = 1, 2, \ldots, K$, are the roots (distinct and real) of the polynomial $\psi(x, z) = f(x, z)^2 - \varphi(x, z)^2 p(x)$, where $K = \max(2m_1, 2m_2 + q)$, then

$$\frac{\varepsilon_j}{\sqrt{p(r_j(z))}} = \frac{\varphi(r_j(z))}{f(r_j(z))}. \tag{58}$$
If $q$ is the degree of $p(x)$, then the first $m_1 + m_2 + 1$ roots of $\psi(x)$ can be chosen freely whereas the remaining $K - m_1 - m_2 - 1 = \max(m_1 - m_2, m_2 - m_1 + q) - 1$ roots become functions of the first $m_1 + m_2 + 1$ roots.

Proof of Lemma 3: For simplicity, the variable $z$ is suppressed in the proof. Note that (58) is equivalent to $f(r_j)^2 - p(r_j)\varphi(r_j)^2 = 0$, that is, to the statement that $\{r_j\}$ are the roots of $\psi(x)$. A key question is which constraints (if any) the roots of the polynomial $\psi(x)$ must satisfy. Let $f = a_0 + a_1 x + \ldots + x^{m_1}$ and $\varphi = b_0 + b_1 x + \ldots + b_{m_2} x^{m_2}$, where the coefficient $a_{m_1}$ is normalized such that $a_{m_1} = 1$. Furthermore, (58) implies that

$$a_0 + a_1 r_j + \ldots + r_j^{m_1} = \varepsilon_j\left(b_0 + b_1 r_j + \ldots + b_{m_2} r_j^{m_2}\right)\sqrt{p(r_j)} \tag{59}$$

for $j = 1, 2, \ldots, K$. In (59), there are $m_1 + m_2 + 1$ unknown coefficients, namely the coefficients of the polynomials $f$ and $\varphi$. Prescribing $m_1 + m_2 + 1$ of the roots therefore gives $m_1 + m_2 + 1$ linear equations that determine these coefficients, after which the remaining roots of $\psi(x)$ are functions of the prescribed ones. Hence, the proof of Lemma 3 is complete.

In the following, the notation $g_k'(x, z)$ means the partial derivative of the function $g$ with respect to argument $k$, $k = 1, 2$. The next lemma is of crucial importance because it shows that after the change of variable and a suitable choice of the functions $\{r_j(z)\}$, it is possible to express the integrand of $F(x)$ as a rational function of $\{r_j(z)\}$ such that Lemma 1 can be applied.

Lemma 4 Let $f(x, z)$ and $\varphi(x, z)$ be polynomials in $x$ where one or several coefficients depend on a free variable $z$, and let $\psi(x, z) = f(x, z)^2 - \varphi(x, z)^2 p(x)$. Then

$$\frac{\varepsilon_j r_j'(z)}{\sqrt{p(r_j(z))}} = \frac{2\theta(r_j(z), z)}{\psi_1'(r_j(z), z)} \tag{60}$$

where

$$\theta(x, z) = f(x, z)\varphi_2'(x, z) - \varphi(x, z) f_2'(x, z). \tag{61}$$

Proof of Lemma 4: By differentiating the equation $\psi(r_j(z), z) = 0$ with respect to $z$, it follows that

$$\psi_1'(r_j(z), z)\,r_j'(z) + \psi_2'(r_j(z), z) = 0.$$

From (60) and (61), and the fact that Lemma 3 yields

$$p(r_j(z)) = \frac{f(r_j(z), z)^2}{\varphi(r_j(z), z)^2}, \tag{62}$$
it follows that

$$\begin{aligned}
\frac{2\theta(r_j(z), z)}{\psi_1'(r_j(z), z)\,r_j'(z)} &= -\frac{2\theta(r_j(z), z)}{\psi_2'(r_j(z), z)} = -\frac{2\theta}{2f f_2' - 2p(r_j(z))\,\varphi\varphi_2'} \\
&= \frac{\varphi f_2' - f\varphi_2'}{f f_2' - (f/\varphi)^2 \varphi\varphi_2'} = \frac{\varphi(r_j(z), z)}{f(r_j(z), z)} = \frac{\varepsilon_j}{\sqrt{p(r_j(z))}},
\end{aligned}$$

where, to shorten the notation, $f$, $\varphi$, $f_2'$, $\varphi_2'$, and $\theta$ without arguments are evaluated at $(r_j(z), z)$. Hence, the proof of Lemma 4 is complete.

The results obtained above show that it is possible to express $\xi(z)$, given in (57), as

$$\xi(z) = \sum_{j=1}^{K} \frac{\varepsilon_j r_j'(z)}{\sqrt{p(r_j(z))}} = \sum_{j=1}^{K} \frac{2\theta(r_j(z), z)}{\psi_1'(r_j(z), z)}. \tag{63}$$

The right side of (63) is particularly interesting because it has the structure that enables the use of Lemma 1(ii). However, to be able to do so, it must be the case that $\psi(0, z) \ne 0$, which means that zero cannot be a root of $\psi(x, z)$. Furthermore, the degree of $\theta(x, z)$ (as a polynomial in $x$) must be less than or equal to the degree of $\psi(x, z)$ (as a polynomial in $x$) minus 2. The next lemma establishes the properties needed for the application of Lemma 1(ii).

Lemma 5 The degree of $\theta$ is lower than or equal to the degree of $\psi$ minus 2.

Proof of Lemma 5 Since both $f_2'(x, z)$ and $\varphi_2'(x, z)$ do not have constant terms, it follows from (61) that $\theta(0, z) = 0$. Recall that $q$ is the degree of $p(x)$, $m_1$ the degree of $f(x, z)$, and $m_2$ the degree of $\varphi(x, z)$ (as polynomials in $x$). Then the degree of $\psi$ becomes equal to $\max(2m_1, 2m_2 + q)$. For the condition of Lemma 1(ii) to hold, the inequality $\max(2m_1, 2m_2 + q) \ge m_1 + m_2 + 2$ must hold, which is equivalent to $\max(m_1 - m_2, q - (m_1 - m_2)) \ge 2$. Since $q \ge 3$, this inequality is evidently fulfilled for all $m_1 - m_2 \le 0$ and for $m_1 - m_2 \ge 2$. Furthermore, it also holds when $m_1 - m_2 = 1$. This means that the inequality $\max(2m_1, 2m_2 + q) \ge m_1 + m_2 + 2$ holds for all integer values of $m_1$ and $m_2$. Thus, we have proved that (55) also holds in the hyperelliptic case, that is, when $q > 4$. This completes the proof of Lemma 5.
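The degree inequality used in the proof of Lemma 5 is easy to confirm by brute force over a finite range. The sketch below is only an illustration (the scanned ranges and the function name are arbitrary choices); it also shows that the assumption $q \ge 3$ cannot be dropped, since the inequality fails for $q = 2$ when $m_1 - m_2 = 1$.

```python
def lemma5_inequality(m1, m2, q):
    """True when max(2 m1, 2 m2 + q) >= m1 + m2 + 2."""
    return max(2 * m1, 2 * m2 + q) >= m1 + m2 + 2

# Scan non-negative degrees m1, m2 and degrees q >= 3 of p(x).
violations = [
    (m1, m2, q)
    for m1 in range(40)
    for m2 in range(40)
    for q in range(3, 13)
    if not lemma5_inequality(m1, m2, q)
]
print(violations)                    # an empty list: no counterexample in the scanned range
print(lemma5_inequality(1, 0, 2))    # with q = 2 the inequality can fail
```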
From Lemmas 1, 2, 3, 4, and 5, it follows that $\xi(z) = 0$. It thus follows from (58) and (60) that

$$\sum_{j=1}^{K} \varepsilon_j F(x_j) = C_\varepsilon + \sum_{j=1}^{K} \varepsilon_j F(c_j) \equiv C.$$

Hence, the proof of Theorem 1 is complete.

Let $x_K$ be a root that corresponds to $\varepsilon_K = -1$. From Theorem 1, the next corollary follows.

Corollary 1 Assume that the degree of $p(x)$ is equal to 3 or 4. Then

$$F(x_K) = \sum_{j=1}^{K-1} \varepsilon_j F(x_j) + C$$

where $C$ is a constant, $\varepsilon_j = \pm 1$ is determined such that $f(x_j) = \varepsilon_j \varphi(x_j)\sqrt{p(x_j)}$, $j = 1, 2, \ldots, K - 1$, and $x_K$ is determined by $f(x_K) + \varphi(x_K)\sqrt{p(x_K)} = 0$.

Next, consider the addition theorem for the integral $F(x, a)$. To this end, the following lemma is needed.

Lemma 6 Let $h_1(z)$ and $h_2(z)$ be two differentiable functions and $\kappa$ a positive constant. Then

$$\int \frac{h_2(z)h_1'(z) - h_1(z)h_2'(z)}{h_2(z)^2 - h_1(z)^2\kappa}\,dz = C + \frac{1}{2\sqrt{\kappa}}\log\!\left(\frac{h_2(z) + h_1(z)\sqrt{\kappa}}{h_2(z) - h_1(z)\sqrt{\kappa}}\right)$$

where $C$ is a constant.

The proof of Lemma 6 follows immediately by differentiation of the right side of the equation stated in Lemma 6. By means of Lemma 1(i) and Lemma 2, Abel (1828d) used the same approach as in the proof of Theorem 1 to prove the following result.

Theorem 2 Let $f(x)$, $\varphi(x)$, and $p(x)$ be polynomials, let $x_j$, $j = 1, 2, \ldots, K$, be the roots of the polynomial $f(x)^2 - \varphi(x)^2 p(x)$, with $p(x_j) > 0$, and let $\varepsilon_j = \pm 1$ be determined by $f(x_j) = \varepsilon_j \varphi(x_j)\sqrt{p(x_j)}$. If all the roots $\{x_j\}$ are distinct, real, and different from zero, then
$$\sum_{j=1}^{K} \varepsilon_j F(x_j, a) = C - \frac{\mu(a)}{\sqrt{p(a)}}\log\!\left(\frac{f(a) + \varphi(a)\sqrt{p(a)}}{f(a) - \varphi(a)\sqrt{p(a)}}\right) \tag{64}$$

where $C$ is a constant.

Proof of Theorem 2 By proceeding in the same way as in the proof of Theorem 1, it turns out that, similarly to (60), one has

$$\sum_{j=1}^{K} \frac{\varepsilon_j \mu(r_j(z))\,r_j'(z)}{(a - r_j(z))\sqrt{p(r_j(z))}} = \sum_{j=1}^{K} \frac{2\mu(r_j(z))\,\theta(r_j(z), z)}{(a - r_j(z))\,\psi_1'(r_j(z), z)}, \tag{65}$$

where, because the integrand of $F(x, a)$ contains the factor $1/(u - a)$, the left side of (65) is the negative of the integrand of the left side of (64). Due to Lemma 1(i), (60), and (61), the right side of (65) equals

$$\frac{2\mu(a)\theta(a, z)}{\psi(a, z)} = \frac{2\mu(a)\left(f(a, z)\varphi_2'(a, z) - \varphi(a, z) f_2'(a, z)\right)}{f(a, z)^2 - p(a)\varphi(a, z)^2}.$$

Using Lemma 6, with $\kappa = p(a)$, $h_2(z) = f(a, z)$, and $h_1(z) = \varphi(a, z)$, we therefore obtain that

$$-\int \frac{2\mu(a)\theta(a, z)}{\psi(a, z)}\,dz = C - \frac{\mu(a)}{\sqrt{p(a)}}\log\!\left(\frac{f(a, z) + \varphi(a, z)\sqrt{p(a)}}{f(a, z) - \varphi(a, z)\sqrt{p(a)}}\right)$$

from which (64) follows. Note that in Theorem 2, the dependence of the functions $f$ and $\varphi$ on the free variable $z$ is suppressed. This completes the proof.
4.2
A More General Version of Abel’s Addition Theorem
A great advantage of Abel’s approach is that it applies to much more general cases than the ones discussed. Consider now the addition theorem for Abelian integrals of the form

$$G(x) = \int_c^x \frac{\mu(u)\,du}{y(u)}$$

where $\mu(x)$ is a polynomial of degree $t$ and $y(x)$ is a root of the equation $Q(y) - p(x) = 0$, where $Q(y)$ is a polynomial of degree $m$ that has constant coefficients and does not contain a constant term. Moreover, $p(x)$ is a polynomial in $x$ of degree $q$, which is equal to or greater than 3. The question to be investigated here is whether one can, similarly to the results obtained in the previous section, write

$$\sum_{j=1}^{K} G(x_j) = C \tag{66}$$

where $C$ is a constant and where some of the distinct real numbers $x_1, x_2, \dots, x_K$ can be chosen freely whereas the remaining numbers are algebraic functions of the former ones. When $Q(y) = y^2$, the problem reduces to the case considered in Sect. 4.1. In the same way as in Sect. 4.1, one attempts to express $y(r_j(z))$ in the following form

$$y(r_j(z)) = \frac{f(r_j(z), z)}{\varphi(r_j(z), z)} \tag{67}$$

where $r_j(z)$, $j = 1, 2, \dots, K$, are the roots of a suitable polynomial, and $\varphi(x, z)$ and $f(x, z)$ are polynomials in $x$ of degrees $m_2$ and $m_1$, respectively. As above, it is understood that at least one of the coefficients of these polynomials depends on a free variable $z$. This means that the polynomials $\varphi$ and $f$ must satisfy the equation

$$p(r_j(z)) = Q\left(\frac{f(r_j(z), z)}{\varphi(r_j(z), z)}\right). \tag{68}$$

Let $\psi(x, z)$ be defined as

$$\psi(x, z) = \varphi(x, z)^m \left[ Q\left(\frac{f(x, z)}{\varphi(x, z)}\right) - p(x) \right]. \tag{69}$$

Since $Q$ has degree $m$, $\psi(x, z)$ becomes a polynomial in $x$. Let $r_j(z)$, $j = 1, 2, \dots, K$, be the roots of $\psi(x, z)$ where, as above, $K$ is the degree of $\psi(x, z)$, which satisfies $K \le \max(m m_1, m m_2 + q)$. By differentiation with respect to $z$ and subsequently inserting the root $r_j(z)$ into (69), (68) and (69) imply that

$$\psi_2'(r_j(z), z) = \varphi(r_j(z), z)^{m-2}\, Q'\!\left(\frac{f(r_j(z), z)}{\varphi(r_j(z), z)}\right) \theta(r_j(z), z) \tag{70}$$

where, as above,

$$\theta(x, z) = \varphi(x, z) f_2'(x, z) - \varphi_2'(x, z) f(x, z).$$

It thus follows from (70) that

$$\frac{r_j'(z)}{y(r_j(z))} = \frac{r_j'(z)\,\varphi(r_j(z), z)}{f(r_j(z), z)} = \frac{r_j'(z)\,\Lambda(r_j(z), z)}{\psi_2'(r_j(z), z)} = -\frac{\Lambda(r_j(z), z)}{\psi_1'(r_j(z), z)} \tag{71}$$

where

$$\Lambda(r_j(z), z) = \frac{\varphi(r_j(z), z)\,\psi_2'(r_j(z), z)}{f(r_j(z), z)} = \varphi(r_j(z), z)^{m-1}\, Q'\!\left(\frac{f(r_j(z), z)}{\varphi(r_j(z), z)}\right) \frac{\theta(r_j(z), z)}{f(r_j(z), z)}.$$

Since the polynomial $Q$ does not contain a linear term, $\Lambda(x, z)$ becomes a polynomial in $x$. Furthermore, it is easily verified that the degree of $\mu(x)\Lambda(x, z)$, which may be denoted by $T$, is given by $T = t + m_2 + (m - 1)m_1$. As discussed in Sect. 4.1, one must have that $T \le K - 2$ in order for Lemma 1 to apply. This requirement implies that $t$, $m_1$, and $m_2$ must satisfy the inequality

$$2 + t + m_2 + (m - 1)m_1 \le \max(m m_1, m m_2 + q) \tag{72}$$

for given values $q \ge 3$, $t \ge 0$ and $m \le 4$. Inequality (72) will not always hold, in contrast to the corresponding case analyzed in Sect. 4.1. When (72) holds, then Lemma 1(ii) implies that

$$\sum_{j=1}^{K} \frac{\mu(r_j(z))\,\Lambda(r_j(z), z)}{\psi_1'(r_j(z), z)} = 0$$

which, according to (71), implies that

$$\sum_{j=1}^{K} \frac{\mu(r_j(z))\, r_j'(z)}{y(r_j(z))} = 0. \tag{73}$$

In the same way as in the derivation of Theorem 1, we obtain the equation in (66) where $x_1, x_2, \dots, x_K$ satisfy the equation

$$Q\left(\frac{f(x_j)}{\varphi(x_j)}\right) = p(x_j) \tag{74}$$

for $j = 1, 2, \dots, K$. Thus, it has been demonstrated that the more general version of Abel’s addition theorem, expressed in (66), can be obtained by the same approach as the one applied for the case of elliptic and hyperelliptic integrals. It is, however, beyond the scope of this chapter to investigate how many of the roots $x_1, x_2, \dots, x_K$ can be chosen freely and how the remaining roots depend on the polynomials $Q$ and $p$. The version of the addition theorem discussed in this section is, however, only a special case of Abel’s most general version of his theorem.
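
The degree condition $T \le K - 2$ can be illustrated with a small computation. The Python sketch below assumes that Lemma 1(ii) is the classical partial-fraction identity: for a polynomial $\psi$ with $K$ distinct roots $r_j$ and any polynomial $g$ of degree at most $K - 2$, the sum of $g(r_j)/\psi'(r_j)$ vanishes. The function names (`residue_sum`, `degree_condition`) and the sample roots are hypothetical. Exact rational arithmetic is used so that the zero is exact rather than approximate.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def psi_prime_at_root(roots, j):
    """Derivative of prod_i (x - r_i) at x = r_j, i.e. prod_{i != j} (r_j - r_i)."""
    value = Fraction(1)
    for i, r in enumerate(roots):
        if i != j:
            value *= roots[j] - r
    return value

def residue_sum(g_coeffs, roots):
    """Sum over all roots r_j of g(r_j) / psi'(r_j), with psi monic."""
    return sum(poly_eval(g_coeffs, r) / psi_prime_at_root(roots, j)
               for j, r in enumerate(roots))

def degree_condition(t, m, m1, m2, q):
    """Return (T, K_max, T <= K_max - 2) for inequality (72)."""
    T = t + m2 + (m - 1) * m1
    K_max = max(m * m1, m * m2 + q)
    return T, K_max, T <= K_max - 2
```

With four distinct roots, a quadratic $g$ gives a sum of exactly zero, whereas $g(x) = x^3$ (degree $K - 1$) gives 1, which shows why the bound on $T$ matters. For the data of Example 4.5.4 ($t = 0$, $m = 3$, $m_1 = 1$, $m_2 = 0$, $q = 4$), `degree_condition` returns $T = 2$ and $K = 4$, so (72) holds.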
4.3 Abel’s Addition Theorem in Its Most General Form
A summary of Abel’s addition theorem in its most general form will be presented in this section, but without going through the proof. Let $y = y(x)$ be the algebraic function determined by

$$Q(x, y) = 0 \tag{75}$$

where $Q(x, y)$ is a polynomial in $y$ of degree $n$ with coefficients that are polynomials in $x$. Furthermore, let

$$V(x, y) = r_0(x) + r_1(x)y + \dots + r_{n-1}(x)y^{n-1}$$

be another polynomial in $y$ of degree $n - 1$ where the coefficients $\{r_j(x)\}$ are polynomials in $x$, and let $\{\beta_k\}$ denote the coefficients of $\{r_j(x)\}$. Assume that

$$V(x, y) = 0. \tag{76}$$

When both (75) and (76) hold, $y$ can be eliminated and it follows that $x$ must satisfy an equation $\rho(x) = 0$ (say) where $\rho(x)$ is a polynomial in $x$ of degree $K$ with constant coefficients.

Theorem 3 Let $x_j$, $j = 1, 2, \dots, K$, be the roots of $\rho(x)$ and define

$$M(x) = \int_0^x R(u, y(u))\,du$$

where $R(x, y)$ is a rational function in $(x, y)$. Then

$$\sum_{j=1}^{K} M(x_j) = v_0 + \sum_{j=1}^{\eta} c_j \log w_j$$

where $\{w_j\}$ are rational functions of $\{\beta_k\}$ and $\{c_j\}$ are constants.

When $Q(x, y) = Q(y) - p(x)$, where $Q(y)$ is a polynomial in $y$ with constant coefficients, $p(x)$ is a polynomial in $x$ with constant coefficients, and $R(x, y) = \mu(x)/y$, then Theorem 3 reduces to the result discussed in Sect. 4.2. The proof of Theorem 3 is given in the article “Mémoire sur une propriété générale d’une classe très étendue de fonctions transcendantes” (Abel 1841). An extremely compact version of the proof is given in the article “Démonstration d’une propriété générale d’une certaine classe de fonctions transcendantes” (Abel 1829b).
4.4 Elliptic Integrals and Multiplication
From trigonometry, it is well known that $\sin(my)$ can be expressed as a function of $\sin(y)$ by a simple algebraic formula where $m$ is a natural number. Abel proved that similar formulas hold for elliptic integrals. As reviewed above, Fagnano obtained multiplication formulas, such as the one given in (17). He also considered cases of complex multiplication, see Ayoub (1984). Let $H(x)$ denote the elliptic integral of the first kind, namely

$$H(x) = \int_0^x \frac{du}{\Delta(u)}$$

where $\Delta(u) = \sqrt{(1 - u^2)(1 - k^2 u^2)}$, and $k$ is a real constant, $|k| < 1$. In the addition theorems above the roots $\{x_j\}$ were assumed to be distinct. However, by letting $x_1 \to x_2$ (say), it becomes clear that the corresponding limiting version of Theorem 1 will hold when two or several roots are equal, because the functions involved are continuous.

Corollary 2 Let $m > 1$ be an integer. Then $mH(x) = H(y)$ where the relation between $x$ and $y$ is determined by

$$f(u)^2 - \varphi(u)^2 \Delta(u)^2 = (u^2 - x^2)^m (u^2 - y^2). \tag{77}$$

One can determine the coefficients in the polynomial on the left-hand side of (77) and subsequently determine $y$ as a function of $x$. However, Abel used an alternative approach that is considerably simpler (Abel 1829a, pp. 256–258).

Corollary 3 Let $g_m$ be defined by $H(g_m(x)) = mH(x)$. Then $g_m$ is given recursively by

$$g_{m+1}(x) = -g_{m-1}(x) + \frac{2 g_m(x)\Delta(x)}{1 - k^2 x^2 g_m(x)^2}, \tag{78}$$


$$g_{2m-1}(x) = \frac{g_m(x)^2 - g_{m-1}(x)^2}{x\left(1 - k^2 g_m(x)^2 g_{m-1}(x)^2\right)} \tag{79}$$

and

$$g_{2m}(x) = \frac{2 g_m(x)\Delta(g_m(x))}{1 - k^2 g_m(x)^4} \tag{80}$$

where $g_0(x) = 0$, $g_1(x) = x$. The function $g_m$ is rational when $m$ is odd and has the form $\xi(x)\Delta(x)$ when $m$ is even, where $\xi(x)$ is rational.

Proof of Corollary 3 From Corollary 2, it follows, for natural numbers $m$ and $r$, that

$$H(g_{m+r}) = (m + r)H(x) = mH(x) + rH(x) = H(g_m) + H(g_r)$$

which implies that

$$H(g_{m+r}) = H(g_m) + H(g_r). \tag{81}$$

From (81) and the addition theorem, it follows that

$$g_{m+r} = \frac{g_m \Delta(g_r) + g_r \Delta(g_m)}{1 - k^2 g_r^2 g_m^2}. \tag{82}$$

Note furthermore that $H(-x) = -H(x)$. From (81), it also follows that $H(g_m) = H(g_{m-r}) + H(g_r)$, which is equivalent to

$$H(g_{m-r}) = H(g_m) - H(g_r) = H(g_m) + H(-g_r). \tag{83}$$

Hence, (82) and (83), with $g_r$ replaced by $-g_r$, yield

$$g_{m-r} = \frac{g_m \Delta(g_r) - g_r \Delta(g_m)}{1 - k^2 g_r^2 g_m^2} \tag{84}$$

for $m > r$. By letting $r = 1$ in (82) and (84), one obtains

$$g_{m+1} = \frac{g_m \Delta(x) + x\Delta(g_m)}{1 - k^2 x^2 g_m^2} \tag{85}$$

and

$$g_{m-1} = \frac{g_m \Delta(x) - x\Delta(g_m)}{1 - k^2 x^2 g_m^2}. \tag{86}$$

When adding together the expressions in (85) and (86), the following recursion formula for $\{g_m\}$ is obtained, namely

$$g_{m+1} = -g_{m-1} + \frac{2 g_m \Delta(x)}{1 - k^2 x^2 g_m^2}. \tag{87}$$

By calculating $g_2, g_3, \dots$, it becomes clear that $g_m$ is rational when $m$ is odd and has the form $\xi(x)\Delta(x)$ when $m$ is even, where $\xi(x)$ is rational. Furthermore, by multiplying the expressions in (82) and (84), one obtains

$$g_{m+r}\, g_{m-r} = \frac{g_m^2 - g_r^2}{1 - k^2 g_m^2 g_r^2}. \tag{88}$$

By letting $r = m - 1$ in (88), one gets $g_{m-r} = g_1 = x$ and $g_{m+r} = g_{2m-1}$, so that (88) gives

$$g_{2m-1} = \frac{g_m^2 - g_{m-1}^2}{x\left(1 - k^2 g_m^2 g_{m-1}^2\right)}.$$

Furthermore, letting $r = m$ in (82) gives

$$g_{2m} = \frac{2 g_m \Delta(g_m)}{1 - k^2 g_m^4}.$$

Hence, the proof is complete.
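
The recursion (87) and the relations (79) and (80) can be cross-checked against each other numerically. The following Python sketch is a minimal illustration under the assumption that $x$ is small enough for all $g_m$ involved to stay in $(0, 1)$, so that $\Delta(g_m)$ is the positive square root. The function names are hypothetical, and `g_triple` uses the standard triplication formula as an independent closed form for $g_3$.

```python
import math

def delta(u, k):
    """Delta(u) = sqrt((1 - u^2)(1 - k^2 u^2))."""
    return math.sqrt((1.0 - u * u) * (1.0 - k * k * u * u))

def g_sequence(x, k, m_max):
    """Return [g_0, ..., g_{m_max}] from the three-term recursion (87)."""
    g = [0.0, x]
    for m in range(1, m_max):
        step = 2.0 * g[m] * delta(x, k) / (1.0 - (k * x * g[m]) ** 2)
        g.append(-g[m - 1] + step)
    return g

def g_double(gm, k):
    """Right side of (80): g_{2m} expressed through g_m."""
    return 2.0 * gm * delta(gm, k) / (1.0 - k * k * gm ** 4)

def g_odd(gm, gm_prev, x, k):
    """Right side of (79): g_{2m-1} expressed through g_m and g_{m-1}."""
    numerator = gm * gm - gm_prev * gm_prev
    return numerator / (x * (1.0 - k * k * gm * gm * gm_prev * gm_prev))

def g_triple(x, k):
    """Closed form for g_3 (triplication formula)."""
    k2 = k * k
    num = 3 * x - 4 * (1 + k2) * x ** 3 + 6 * k2 * x ** 5 - k2 * k2 * x ** 9
    den = 1 - 6 * k2 * x ** 4 + 4 * k2 * (1 + k2) * x ** 6 - 3 * k2 * k2 * x ** 8
    return num / den
```

For example, with $x = 0.2$ and $k = 0.5$ the values $g_3$, $g_4$, and $g_5$ produced by the recursion should agree with `g_triple`, `g_double(g[2], k)`, and `g_odd(g[3], g[2], x, k)` up to rounding error.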
4.5 Examples
To fix ideas, we shall discuss a few examples on elliptic and hyperelliptic integrals in this section.

Example 4.5.1 Let $f(x) = bx + x^3$, $\varphi(x) = d$, and $p(x) = 1 + \gamma_1 x^2 + \gamma_2 x^4 + \gamma_3 x^6$, where $\gamma_1$, $\gamma_2$, $\gamma_3$, $b$, and $d$ are constants. The results achieved in Sects. 4.1 and 4.2 will now be applied to work out the details of the addition theorem for the hyperelliptic integral

$$F(x) = \int_0^x \frac{du}{\sqrt{p(u)}}.$$

Notice that $f(x)^2 - \varphi(x)^2 p(x)$ is of degree 6, implying that this polynomial has 6 roots. Note also that this polynomial contains solely even powers, so that it is a cubic polynomial in $x^2$. According to Theorem 1, the roots of $\psi(x)$ are determined by the equations

$$f(x_j) - \varepsilon_j \varphi(x_j)\sqrt{p(x_j)} = b x_j + x_j^3 - \varepsilon_j d\sqrt{p(x_j)} = 0, \tag{89}$$

$j = 1, 2, \dots, 6$, and these equations link the given polynomial $p(x)$, the roots $x_j$, $j = 1, 2, \dots, 6$, and the coefficients $b$ and $d$ together. With no loss of generality, let $\varepsilon_1 = \varepsilon_2 = \varepsilon_6 = 1$ and $\varepsilon_3 = \varepsilon_4 = \varepsilon_5 = -1$. Moreover, since the roots are determined by three equations in $x^2$, we can put $x_4 = -x_1$, $x_5 = -x_2$, and $x_6 = -x_3$. Let $x_1$ and $x_2$ be given values. From (89) for $j = 1, 2$, it follows that $b$ and $d$ are determined by

$$b = b(x_1, x_2) = \frac{x_1^3\sqrt{p(x_2)} - x_2^3\sqrt{p(x_1)}}{x_2\sqrt{p(x_1)} - x_1\sqrt{p(x_2)}} \tag{90}$$

and

$$d = d(x_1, x_2) = \frac{x_2 x_1^3 - x_1 x_2^3}{x_2\sqrt{p(x_1)} - x_1\sqrt{p(x_2)}}. \tag{91}$$

From (89), (90), and (91), it follows that the root $x_3$ is determined by the equation

$$b(x_1, x_2)x_3 + x_3^3 = -d(x_1, x_2)\sqrt{p(x_3)}$$

where it is understood that the expressions for $b$ and $d$ given in (90) and (91) are inserted. Consider next how to obtain a simplified explicit formula for $x_3$ as a function of $x_1$ and $x_2$. Specifically, an alternative way to determine $x_3$ is to use that the constant term of the polynomial $f(x)^2 - \varphi(x)^2 p(x)$ equals the product of the roots of the polynomial. Since the product of the roots equals $x_1 x_2 x_3 x_4 x_5 x_6 = -(x_1 x_2 x_3)^2$ and the constant term of the polynomial is equal to $-d^2$, it follows that

$$x_3 = -\frac{d(x_1, x_2)}{x_1 x_2} = \frac{x_1^2 - x_2^2}{x_1\sqrt{p(x_2)} - x_2\sqrt{p(x_1)}} = \frac{x_1\sqrt{p(x_2)} + x_2\sqrt{p(x_1)}}{1 - \gamma_2 x_1^2 x_2^2 - \gamma_3 x_1^2 x_2^2\left(x_1^2 + x_2^2\right)}. \tag{92}$$

The third equality in (92) follows by multiplying the numerator and the denominator of the expression in the middle by $x_1\sqrt{p(x_2)} + x_2\sqrt{p(x_1)}$. The change of variable $u \to -w$ yields $F(-x) = -F(x)$. Hence, the addition theorem implies that

$$C = \sum_{j=1}^{6} \varepsilon_j F(x_j) = F(x_1) + F(x_2) - F(x_3) - F(x_4) - F(x_5) + F(x_6). \tag{93}$$

The constant $C$ becomes equal to zero due to the property that $F(0) = 0$. Hence, from (93) it follows that

$$0 = F(x_1) + F(x_2) - F(x_3) - F(x_4) - F(x_5) + F(x_6) = F(x_1) + F(x_2) - F(x_3) - F(-x_1) - F(-x_2) + F(-x_3) = 2F(x_1) + 2F(x_2) - 2F(x_3)$$

which yields

$$F(x_1) + F(x_2) = F(x_3) \tag{94}$$

where $x_3$ is given by (92). By letting $x_1 = x_2 = x$ and $y = x_3$, it follows from (92), by using l’Hôpital’s rule, and (94) that if $y$ is given by

$$y = \frac{2x\sqrt{p(x)}}{1 - \gamma_2 x^4 - 2\gamma_3 x^6}$$

then $F(y) = 2F(x)$.

Example 4.5.2 When $\gamma_1 = -1 - k^2$, $\gamma_2 = k^2$, and $\gamma_3 = 0$, it follows that $p(x) = (1 - x^2)(1 - k^2 x^2)$ where $k$ is a constant, $|k| < 1$, and $F(x)$ becomes an elliptic integral of the first kind. In this case, the equation in (92) reduces to

$$x_3 = \frac{x_1\sqrt{p(x_2)} + x_2\sqrt{p(x_1)}}{1 - k^2 x_1^2 x_2^2}$$

which, combined with (94), yields the addition theorem for elliptic integrals of the first kind.

Example 4.5.3 This example is a slightly modified version of the example discussed in a letter Abel wrote on 6 August 1826 to August Leopold Crelle (1780–1855), editor and founder of the Journal für die reine und angewandte Mathematik, see Abel (2012). In this letter, Abel mentions an application of his addition theorem to the hyperelliptic integral

$$F(x) = \int_b^x \frac{du}{\sqrt{p(u)}}$$

where $p(x) = a + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + x^6$ and $a, a_1, \dots, a_5$ are given numbers such that $p(x)$ is positive on a suitable set. Thus, the integral above is a hyperelliptic integral. However, Abel provides no details in his letter of how he achieved his results; this task is left to the reader. Here, a proof is provided of the assertion put forward in Abel’s letter by applying the addition theorem. Let

$$\psi(x) = \left(c + c_1 x + c_2 x^2 + x^3\right)^2 - p(x) = c^2 - a + (2cc_1 - a_1)x + \left(c_1^2 + 2cc_2 - a_2\right)x^2 + (2c + 2c_1 c_2 - a_3)x^3 + \left(2c_1 + c_2^2 - a_4\right)x^4 + (2c_2 - a_5)x^5 \tag{95}$$

where $c$, $c_1$, and $c_2$ are free coefficients (variables). From this it follows that $\psi(x)$ has only five roots ($K = 5$) because the term $x^6$ cancels. Let $x_1$, $x_2$, $x_3$, $y_1$, and $y_2$ denote the roots of $\psi(x)$. As in the examples given above, the free coefficients $c$, $c_1$, and $c_2$ can be expressed as functions of $x_1$, $x_2$, and $x_3$ by solving the equations $\psi(x_j) = 0$, for $j = 1, 2, 3$, which implies that the free coefficients must satisfy the linear equations

$$c + c_1 x_j + c_2 x_j^2 = \sqrt{p(x_j)} - x_j^3$$

for $j = 1, 2, 3$. The remaining roots $y_1$ and $y_2$ are not free variables but are determined by $\psi(y_j) = 0$ for $j = 1, 2$, after the coefficients $c$, $c_1$, and $c_2$ have been determined as functions of $x_1$, $x_2$, and $x_3$. From (95) it follows by using the relationship between roots and coefficients in a polynomial that

$$y_1 + y_2 = \frac{2c_1 + c_2^2 - a_4}{a_5 - 2c_2} - x_1 - x_2 - x_3 \quad \text{and} \quad y_1 y_2 = \frac{c^2 - a}{(a_5 - 2c_2)\,x_1 x_2 x_3}$$

which imply that $y_1$ and $y_2$ are roots of the equation

$$y^2 - \left(\frac{2c_1 + c_2^2 - a_4}{a_5 - 2c_2} - x_1 - x_2 - x_3\right) y + \frac{c^2 - a}{(a_5 - 2c_2)\,x_1 x_2 x_3} = 0. \tag{96}$$

Hence $y_1$ and $y_2$ can be determined by solving the quadratic equation in (96). As a result, Theorem 1 gives

$$F(x_1) + F(x_2) + F(x_3) = C - F(y_1) - F(y_2)$$

where $C$ is a constant. Thus, the claim expressed in Abel’s letter is verified. Recall that in the hyperelliptic case where $p(x)$ has degree $q$, $q \ge 5$, it is, in contrast to the elliptic case, not generally possible to specify $K - 1$ independent roots with the $K$-th root being an algebraic function of the independent roots, as we have seen in the current example. However, in special cases of hyperelliptic integrals, such as in Example 4.5.1, it is still possible to obtain a similar addition theorem as in the case with elliptic integrals.
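
For the elliptic case of Example 4.5.2, the addition theorem $F(x_1) + F(x_2) = F(x_3)$ can be checked by direct numerical integration. The Python sketch below is only an illustration, assuming $x_1$ and $x_2$ are positive and small enough that $x_3 < 1$. The quadrature (composite Simpson's rule) and the function names `F` and `x3_from` are choices made here, not part of Abel's argument.

```python
import math

def p(x, k):
    """p(x) = (1 - x^2)(1 - k^2 x^2), the elliptic case of Example 4.5.2."""
    return (1.0 - x * x) * (1.0 - k * k * x * x)

def F(x, k, n=2000):
    """F(x) = int_0^x du / sqrt(p(u)) by composite Simpson's rule (n even)."""
    h = x / n
    total = 1.0 / math.sqrt(p(0.0, k)) + 1.0 / math.sqrt(p(x, k))
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / math.sqrt(p(i * h, k))
    return total * h / 3.0

def x3_from(x1, x2, k):
    """Third root from the elliptic addition formula of Example 4.5.2."""
    numerator = x1 * math.sqrt(p(x2, k)) + x2 * math.sqrt(p(x1, k))
    return numerator / (1.0 - k * k * x1 * x1 * x2 * x2)
```

Taking, say, $k = 0.6$, $x_1 = 0.3$, and $x_2 = 0.4$, the difference between `F(x1, k) + F(x2, k)` and `F(x3_from(x1, x2, k), k)` should be at the level of quadrature error only.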
Example 4.5.4 Let $Q(y) = y^3$, $f(x) = b + x$, $\varphi(x) = d$, $\mu(x) = 1$, and $p(x) = 1 + \alpha x^2 + \beta x^4$. Consider the Abelian integral

$$L(x) = \int_c^x \frac{du}{\sqrt[3]{p(u)}}.$$

Note that $L(x)$ is not an elliptic integral because the integrand contains the cubic root of a polynomial $p(x)$. In this case

$$\psi(x) = f(x)^3 - \varphi(x)^3 p(x) \tag{97}$$

which is a polynomial of degree 4. In this case, $K = 4$ and $T = 2$ so that (72) is fulfilled. From (74), it follows that the roots must satisfy the equation

$$b + x_j = d\,\sqrt[3]{p(x_j)} \tag{98}$$

for $j = 1, 2, 3, 4$. From (98), it follows that the coefficients $b$ and $d$ are determined by

$$b = b(x_1, x_2) = \frac{x_1\sqrt[3]{p(x_2)} - x_2\sqrt[3]{p(x_1)}}{\sqrt[3]{p(x_1)} - \sqrt[3]{p(x_2)}} \quad \text{and} \quad d = d(x_1, x_2) = \frac{x_1 - x_2}{\sqrt[3]{p(x_1)} - \sqrt[3]{p(x_2)}}. \tag{99}$$

Moreover, the constant term in the polynomial $-\psi(x)/\beta d^3$ turns out to be equal to $\left(d(x_1, x_2)^3 - b(x_1, x_2)^3\right)/\beta d(x_1, x_2)^3$ and the coefficient associated with $x^3$ is equal to $-1/\beta d(x_1, x_2)^3$. By utilizing the relationship between the roots and the coefficients of a polynomial, it follows that

$$x_3 + x_4 = \frac{1}{\beta d(x_1, x_2)^3} - x_1 - x_2 \quad \text{and} \quad x_3 x_4 = \frac{d(x_1, x_2)^3 - b(x_1, x_2)^3}{x_1 x_2\,\beta d(x_1, x_2)^3}$$

which implies that $x_3$ and $x_4$ are functions of $(x_1, x_2)$, and they are determined by the quadratic equation

$$x_j^2 + \left(x_1 + x_2 - \frac{1}{\beta d(x_1, x_2)^3}\right)x_j - \frac{b(x_1, x_2)^3 - d(x_1, x_2)^3}{x_1 x_2\,\beta d(x_1, x_2)^3} = 0 \tag{100}$$

for $j = 3, 4$, where $b$ and $d$ are given in (99). Consequently, the following addition theorem

$$\sum_{j=1}^{4} L(x_j) = C$$

has been established where $C$ is a constant and the roots $x_1$ and $x_2$ can be selected freely whereas the remaining two roots $x_3$ and $x_4$ are determined by (100). Consider next the special case where $x_1 = x_2 = x$. To examine this case, it is convenient to let $x_1 \to x_2$ in the formulas in (99) and then apply l’Hôpital’s rule. Hence

$$b(x, x) = \frac{3 + \alpha x^2 - \beta x^4}{2\alpha x + 4\beta x^3} \quad \text{and} \quad d(x, x) = \frac{3p(x)^{2/3}}{2\alpha x + 4\beta x^3}.$$

Furthermore, (100) implies that $x_3$ and $x_4$ are given by the roots of the quadratic polynomial

$$x_j^2 + \left(2x - \frac{1}{\beta d(x, x)^3}\right)x_j - \frac{b(x, x)^3 - d(x, x)^3}{x^2\,\beta d(x, x)^3} = 0$$

for $j = 3, 4$.

Example 4.5.5 From Corollary 3 with $m = 2$ and $m = 3$, respectively, it follows that $H(y_2) = 2H(x)$ and $H(y_3) = 3H(x)$ where

$$y_2 = \frac{2x\Delta(x)}{1 - k^2 x^4} \quad \text{and} \quad y_3 = \frac{3x - (4 + 4k^2)x^3 + 6k^2 x^5 - k^4 x^9}{1 - 6k^2 x^4 + 4k^2(1 + k^2)x^6 - 3k^4 x^8}.$$
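
The algebra of Example 4.5.4 can be checked numerically. The Python sketch below is only an illustration with hypothetical coefficients `ALPHA` and `BETA` and hypothetical function names. It computes $b$ and $d$ from (99), obtains $x_3$ and $x_4$ from the sum and product relations behind (100), and verifies that all four numbers are roots of $\psi$ in (97). For a given choice of $x_1$ and $x_2$ the remaining roots may be complex, so complex arithmetic is used; the check concerns only the polynomial identities, not the integrals themselves.

```python
import cmath

ALPHA, BETA = 0.5, 0.25   # hypothetical coefficients of p(x)

def p(x):
    """p(x) = 1 + alpha*x^2 + beta*x^4 (also valid for complex x)."""
    return 1.0 + ALPHA * x ** 2 + BETA * x ** 4

def cbrt_p(x):
    """Real cube root of p(x) for real x (p is positive here)."""
    return p(x) ** (1.0 / 3.0)

def b_and_d(x1, x2):
    """Coefficients b and d from (99)."""
    c1, c2 = cbrt_p(x1), cbrt_p(x2)
    d = (x1 - x2) / (c1 - c2)
    b = (x1 * c2 - x2 * c1) / (c1 - c2)
    return b, d

def psi(x, b, d):
    """psi(x) = (b + x)^3 - d^3 p(x), as in (97)."""
    return (b + x) ** 3 - d ** 3 * p(x)

def remaining_roots(x1, x2):
    """Solve the quadratic (100) for x3 and x4."""
    b, d = b_and_d(x1, x2)
    root_sum = 1.0 / (BETA * d ** 3) - x1 - x2                      # x3 + x4
    root_product = (d ** 3 - b ** 3) / (x1 * x2 * BETA * d ** 3)    # x3 * x4
    disc = cmath.sqrt(root_sum ** 2 - 4.0 * root_product)
    return (root_sum + disc) / 2.0, (root_sum - disc) / 2.0
```

With $x_1 = 0.5$ and $x_2 = 1$, evaluating `psi` at $x_1$, $x_2$ and at both values returned by `remaining_roots` should give zero up to rounding error.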
5 A Short Review of Abel’s Work on Elliptic Integrals and the Addition Theorem
Several biographers have written about Niels Henrik Abel: see, for example, Ore (1957), Stubhaug (2000), and Sørensen (2004). Sørensen (2004) also provides an overview of the state of mathematics at the turn of the eighteenth century. Here, a brief review is provided of selected parts of Abel’s life and work that are relevant to his contribution to the theory of elliptic integrals and elliptic functions.

Abel’s life was short and dramatic. He was born in 1802 near the town of Stavanger in Norway. In 1804, his father was appointed pastor at Gjerstad, near the town of Risør, and the family moved there. In 1815, he entered the Cathedral School in Christiania at the age of 13. (Oslo was called Christiania at that time.) A new mathematics teacher, Bernt Michael Holmboe (1795–1850), was appointed in 1818, and he set the students mathematical tasks to do at home. He quickly became aware of Abel’s talent and encouraged him to study mathematics at an advanced level. Abel entered the university in Christiania in 1821. At that time, he was already the foremost mathematician in Norway. He had studied all the latest mathematical literature in the university library and had also started work on his own mathematical problems, such as the solution to the general quintic equation in radicals. In a letter dated 21 May 1821 from Carl Ferdinand Degen (1766–1825), professor of mathematics at the University of Copenhagen, to Christopher Hansteen (1784–1873), professor of astronomy and geophysics at the University of Christiania, the content of which was made known to Abel, Degen wrote that such a talented young man as Abel should devote himself to the theory of “elliptic transcendents” (elliptic integrals), adding that he would “discover Magellanian thoroughfares to large portions of a vast analytical ocean.” This almost prophetic statement might well have been influential for the direction of Abel’s work.

From 1823 to 1826, Abel wrote three papers, one of which was published in 1826 in Magazin for Naturvidenskaberne (Journal of the Natural Sciences), Norway’s first scientific journal, which had been co-founded by Hansteen at the University of Christiania. This paper was later translated into French and published in Abel (2012a). The two other papers were not published before Abel’s death (Abel 2012b, c). In the summer of 1823, he was awarded a scholarship that allowed him to make his first trip abroad, to Copenhagen. From the letters he wrote to his former teacher Holmboe, it is clear that he had started to work on the theory of elliptic integrals and had already achieved new results. It would appear that the article “Petite contribution à la théorie de quelques fonctions transcendantes” (Abel 2012a) is the first article he published that, among other topics, concerns elliptic and hyperelliptic integrals.
In a letter to Degen dated 2 March 1824, Abel writes about the progress he has made since the last time they met in Copenhagen, in the summer of 1823. From this letter and a paper that Abel wrote a short time afterwards, it seems that he had already by that time formulated the most general version of his addition theorem (Skau 2011). In his article “Petite contribution à la théorie de quelques fonctions transcendantes” (Abel 2012a), Abel starts by discussing the integral

$$\int \frac{\varphi(x)\exp(f(x))\,dx}{x - a}$$

where $a$ is a constant, $f(x)$ is a rational function, and $\varphi(x)$ has the form $\varphi(x) = \kappa(x + \alpha_1)^{\beta_1}(x + \alpha_2)^{\beta_2}\cdots(x + \alpha_r)^{\beta_r}$, where $\alpha_1, \alpha_2, \dots$, $\beta_1, \beta_2, \dots$, and $\kappa$ are given coefficients and neither of the functions $f$ and $\varphi$ depends on $a$. By ingenious manipulations, including suitable operations (differentiation and integration), Abel obtains a set of surprising equations that can be applied to produce a number of amazing results. For example, when $p(x) = \gamma_0 + \gamma_1 x + \dots + \gamma_s x^s$, where $\gamma_0, \gamma_1, \dots$ are given coefficients and $s$ is an integer, he obtains as a special case

$$\int \frac{dx}{(x - a)\sqrt{p(x)}} = \frac{1}{2\sqrt{p(a)}} \sum_{n}\sum_{m} (m - n)\,\gamma_{n+m+2} \int \frac{a^m\,da}{\sqrt{p(a)}} \int \frac{x^n\,dx}{\sqrt{p(x)}} \tag{101}$$

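for $s$ greater than 2 and $x$ being a root of $p$. Consider, for example, the case where $p(x) = (1 - x^2)(1 - k^2 x^2) = \Delta(x)^2$. Then $\gamma_0 = 1$, $\gamma_1 = \gamma_3 = 0$, $\gamma_2 = -1 - k^2$ and $\gamma_4 = k^2$, which by (101) yields

$$\int \frac{dx}{(x - a)\Delta(x)} = \frac{k^2}{2\Delta(a)} \int \frac{da}{\Delta(a)} \int \frac{x^2\,dx}{\Delta(x)} - \frac{k^2}{2\Delta(a)} \int \frac{a^2\,da}{\Delta(a)} \int \frac{dx}{\Delta(x)}. \tag{102}$$

Since

$$\int \frac{dx}{(x^2 - a^2)\Delta(x)} = \int \frac{dx}{2a(x - a)\Delta(x)} - \int \frac{dx}{2a(x + a)\Delta(x)},$$

it follows immediately from (102) (by successively changing the sign of $a$) that

$$\int \frac{dx}{(x^2 - a^2)\Delta(x)} = \frac{k^2}{2a\Delta(a)} \int \frac{da}{\Delta(a)} \int \frac{x^2\,dx}{\Delta(x)} - \frac{k^2}{2a\Delta(a)} \int \frac{a^2\,da}{\Delta(a)} \int \frac{dx}{\Delta(x)}. \tag{103}$$
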
The equation in (103) shows that Legendre’s elliptic integral of the third kind can be expressed in a simple way by elliptic integrals of the first and second kind when $x$ is a root of $p$.

In 1825, Abel was awarded a second scholarship to travel abroad. The plan was to visit Carl Friedrich Gauss (1777–1855) in Göttingen and then continue on to Paris, which at that time was the leading center for mathematics. He initially intended to travel with four of his friends. However, when he got as far as Copenhagen, he changed his mind, following his friends to Berlin instead, having decided to visit Gauss and go on to Paris afterwards. For some unknown reason, though, he never met with Gauss. On the way, he visited the astronomer Heinrich Christian Schumacher (1780–1850) in Altona, now a district of Hamburg. Schumacher was at that time the editor of Astronomische Nachrichten, where Abel and Jacobi published important papers. He then spent 4 months in Berlin, where he became well acquainted with August Leopold Crelle (1780–1855). Crelle was about to establish a new mathematical journal, Journal für die reine und angewandte Mathematik, which would become very important for the dissemination of Abel’s work. Crelle also became a close personal friend. In the journal’s first year, Abel contributed seven articles. However, he saved what he regarded as his most important work, the Paris memoir, for the French Academy of Sciences.

Abel revolutionized the theory of elliptic integrals by proposing to study elliptic functions instead of elliptic integrals. The elliptic functions may be defined as the inverse of corresponding elliptic integrals. More precisely, let

$$H(x, k) = \int_0^x \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}} \tag{104}$$

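denote the elliptic integral of the first kind with $k$ as parameter. Then the corresponding elliptic function, say $g(y, k)$, is defined as the inverse of the integral $H(x, k)$: that is, $H(g(y, k), k) = y$ for $y \in [0, 1]$. The elliptic functions can be viewed as a generalization of the trigonometric functions. To realize this, note that when $k = 0$, then $H(x, 0)$ reduces to $\arcsin x$. The so-called Jacobian elliptic functions were introduced by Jacobi, and the notation $\operatorname{sn} y$, $\operatorname{cn} y$, $\operatorname{dn} y$, and $\operatorname{tn} y$ proposed by Christoph Gudermann (1798–1852) is nowadays common for elliptic functions. Specifically, $\operatorname{sn} y = g(y, k)$, $\operatorname{cn}^2 y = 1 - \operatorname{sn}^2 y$, $\operatorname{dn}^2 y = 1 - k^2 \operatorname{sn}^2 y$, and $\operatorname{tn} y = \operatorname{sn} y/\operatorname{cn} y$. When $k = 0$, $\operatorname{sn} y$, $\operatorname{cn} y$, and $\operatorname{tn} y$ reduce to the trigonometric functions $\sin y$, $\cos y$, and $\tan y$, respectively. An addition theorem for elliptic functions that is dual to Euler and Legendre’s addition theorem for elliptic integrals can be expressed as

$$\operatorname{sn}(y + v) = \frac{\operatorname{sn} y \operatorname{cn} v \operatorname{dn} v + \operatorname{sn} v \operatorname{cn} y \operatorname{dn} y}{1 - k^2 \operatorname{sn}^2 y \operatorname{sn}^2 v}.$$
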
Note that this formula reduces to the usual addition formula for $\sin(y + v)$ when $k = 0$.

Another area in which fierce competition between Abel and Jacobi took place was the theory of transformations. This theory includes the study of solutions (transformations) $\omega(x)$, say, satisfying

$$H(\omega(x), k) = bH(x, k') + C$$

where $H$ is given in (104), $k$ and $k'$ are the respective parameters, which may be equal or different, and $b$ and $C$ are constants. The transformation $\omega(x)$ will of course depend on $b$, $k$, and $k'$, but this is suppressed in the notation here. Note that Corollaries 2 and 3 can be viewed as part of the theory of transformations in the case where $k = k'$. See Stubhaug (2000) and Sørensen (2004) for more details about the rivalry between Abel and Jacobi.

In Paris, Abel continued his work on his memoir, which he finished in October 1826 and submitted to the academy on 30 October. It was to be reviewed by Augustin-Louis Cauchy (1789–1857) and Legendre. Unfortunately, the Paris memoir, regarded as Abel’s masterpiece, was put aside among Cauchy’s papers and forgotten. In early 1827, Abel returned to Berlin, where he stayed several months before returning to Norway. In Berlin, he completed an extensive paper which was published in two parts (Abel 1827, 1828a). In it he does not, however, apply his addition theorem but rather the one established by Euler and Legendre. In this paper, Abel develops the theory of elliptic functions and obtains many new results, and he also discusses the theory of transformations and complex multiplication. He also considered ways to solve for $\operatorname{sn} y$ when $\operatorname{sn}(my)$ is given, where $m$ is an integer. As an example of multiplication of elliptic functions, one can show that

$$\operatorname{sn}(3y) = \frac{3\operatorname{sn} y - (4 + 4k^2)\operatorname{sn}^3 y + 6k^2 \operatorname{sn}^5 y - k^4 \operatorname{sn}^9 y}{1 - 6k^2 \operatorname{sn}^4 y + 4k^2(1 + k^2)\operatorname{sn}^6 y - 3k^4 \operatorname{sn}^8 y}.$$

The continuation of the article “Recherches sur les fonctions elliptiques” (Abel 1827) was finished and published in the following year (Abel 1828a). By May 1827, Abel was back in Norway. Jacobi wrote a paper on the theory of transformations of elliptic integrals in which he also studies elliptic functions (Jacobi 1827). Although Abel (1827) had discussed the general theory of transformation and the properties of elliptic functions, and his paper had been published several months before Jacobi sent his paper to the journal, Jacobi (1827) did not publicly refer to the work of Abel. Therefore, Abel wrote a second paper and then a third one with special emphasis on the theory of transformations (Abel 1828b, 1828c). In 1828, he published a paper (Abel 1828d) in which he demonstrates his addition theorem for hyperelliptic integrals by applying the ideas from his Paris memoir (Abel 1841). Subsequently, he wrote another and very long paper (Abel 1829a) in which he presented his addition theorem, the elliptic functions, and the theory of transformations in a way that differed from his earlier work. Due to a combination of unfortunate circumstances (Stubhaug 2000), the Paris memoir was not published before 1841, 12 years after Abel’s early death from tuberculosis. In the last paper Abel wrote before he died on 6 April 1829 (Abel 1829b), he gave a very compact proof of the addition theorem in its most general form (presented in Sect. 4.3), upon which his Paris memoir is based. It consists of only two pages, the first containing the statement of the theorem and the second containing the proof.
The French mathematician Émile Picard (1856–1941) wrote the following about Abel’s theorem at the end of the nineteenth century: “Le théorème paraît tout à fait élémentaire, et il n’y a peut-être pas, dans l’histoire de la Science, de proposition aussi importante obtenue à l’aide de considérations aussi simples” (“The theorem appears utterly elementary, and perhaps there has never been in the history of science a proposition so important which can be reached by such simple considerations.”) The Norwegian mathematician Atle Selberg (1917–2007) wrote in 2002: “Det har alltid stått for meg som den rene magi. Hverken Gauss eller Riemann, eller noen annen, har noe som riktig kan måle seg med dette.” (“For me this has always appeared as pure magic. Neither Gauss nor Riemann nor anyone else has anything that really measures up to this.”)

After Abel’s death, Jacobi’s book on elliptic integrals and functions (Jacobi 1829) was published. It became a key reference for mathematicians working on the theory of elliptic integrals and functions in the following years. However, it was later criticized for making insufficient reference to Abel’s work. After Abel had published some of his groundbreaking articles, it became known that Gauss had also been working on the theory of elliptic integrals and functions and that as early as 1798 he had achieved some of the same results as Abel (Sørensen 2004). However, Gauss never published his results and thus the mathematical community had no knowledge of his work before the publication of Abel’s articles.
Whereas Abel’s theory solves the problem of defining and analyzing the properties of elliptic functions (and more generally abelian functions) with elementary means, it does not immediately extend to the complex field, such as in the case where the polynomial p(x) in Theorems 1 and 2 is negative, has complex roots or has complex coefficients. Still there is no essential difference between the functions which correspond to real and to complex values of the coefficients and constants. It became the fate of Karl Theodor Wilhelm Weierstrass (1815–1897) to remedy this deficiency and provide a rigorous foundation of Abel’s theory, outlined in Weierstrass (1856). Unfortunately, the theory then loses its elementary character.
6 Conclusion
To learn about ideas and approaches in mathematics is a highly demanding task for the nonspecialist. Often the background knowledge necessary for an understanding beyond a fragmentary and superficial level is formidable. However, as regards the development of the theory of elliptic integrals and functions some 200 years ago, it is possible to understand key results, including proofs, solely with a background in elementary calculus, as demonstrated in this chapter. As noted above, Abel’s approach was fundamentally different from earlier contributions, and, as has been mentioned by many mathematicians in the past, it is astonishing that such important results can be achieved by such simple means. Abel’s approach is so powerful that it extends to the case of hyperelliptic integrals of any degree and more generally to Abelian integrals and functions. Abel’s work in this field inspired a great number of mathematicians in the years that followed and eventually led to Riemann and Weierstrass developing the theory of algebraic functions and algebraic curves further.
Agency in Mathematical Practice Yacin Hamami
Contents
1 Introduction
2 Why Take Agents Seriously
3 Agency and Mathematical Activities
4 Agency and Mathematical Artifacts
5 Agency and Mathematical Proofs
6 Agency and Mathematical Texts
7 Conclusion
References
Abstract
A characteristic feature of the philosophy of mathematical practice is to attend to what people do when they do mathematics. But what does it mean to do mathematics? This question raises several issues regarding the nature of action, activity, and agency in mathematical practice. The present chapter reviews contributions in the field that have attempted to theorize about these notions. It begins with some motivations for taking agents seriously in the philosophical study of mathematical practice. The core of the chapter discusses, in turn, what it means to carry out mathematical activities, do things with mathematical artifacts, engage with mathematical proofs, and perform mathematical actions prescribed by mathematical texts. Taken together, the various lines of work reported here provide an initial, but already sophisticated, picture of what it means to do mathematics. The chapter ends with some suggestions for future research on agency in mathematical practice.
Y. Hamami (*) Postdoctoral Researcher FNRS, Philosophy Department, University of Liège, Liège, Belgium e-mail: [email protected] © Springer Nature Switzerland AG 2023 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_48-1
Keywords
Philosophy of mathematical practice · Agency in mathematical practice · Actions in mathematical practice · Mathematical agents
1 Introduction
Mathematics is something that people do, from the research mathematician trying to prove Goldbach’s conjecture to the student in elementary school working on her mathematics homework. But what does it mean to do mathematics? One may expect philosophical accounts of the nature of mathematics to provide some clues on this question. And yet, as Kenneth Manders remarked: “[c]urrent philosophical conceptions of mathematics are predominantly agentless” (Manders 2008, p. 118). Paradigmatic examples are the main developments in the foundations of mathematics where the objects of study are the notions of mathematical theories, proofs, and computations. The branches of mathematical logic investigating these notions – model theory, proof theory, and recursion theory – do not usually require appealing to agents doing things.1 The same observation can be made for most philosophical theorizing on the nature of mathematical objects and mathematical truths.2 The situation seems different for epistemological considerations aiming to explain how we can possess mathematical knowledge, since knowledge needs to be attributed to an epistemic subject.3 But here again, it is easy to set aside considerations on actions, activities, and agency and to only focus on the conditions that need to be met to attribute knowledge to an epistemic subject, without describing what a subject needs to do to acquire such knowledge in the first place.

One way the philosophy of mathematical practice4 departs from traditional approaches in the philosophy of mathematics is precisely by taking into consideration the active dimension of mathematics, i.e., by attending to what people do when they do mathematics. This has become a leitmotif in describing the field, for instance:

The focus has turned thus to a consideration of what mathematicians are actually doing when they produce mathematics. [. . .] These historians and philosophers agree that there is more to understanding mathematics than a study of its logical structure and put much emphasis on mathematical activity as a human activity. (Mancosu et al. 2005, p. 1)
1 Notable exceptions are logicians that take into account the active dimension of deductive inference, e.g., Frege (1879), Gentzen (1934), Prawitz (1965), and Martin-Löf (1984).
2 Here again, there are exceptions, for instance, works from the precursors of the philosophy of mathematical practice – e.g., Wilder (1950), Lakatos (1976), and Kitcher (1984). Also, philosophical views that take mathematical objects and facts as being the result of social constructions – e.g., Hersh (1997), Feferman (2009), and Cole (2013) – may implicitly or explicitly appeal to mathematical agents. On social construction and constitution in mathematics, see the chapter by Cantù in this handbook (Cantù 2023).
3 On the conceptualization of epistemic subjects in the context of mathematics, see the chapter by De Toffoli in this handbook (De Toffoli 2023).
4 For recent overviews of the field, see Carter (2019) and Hamami and Morris (2020).
The philosophy of mathematics has experienced a renewal in recent years due to a more open and interdisciplinary way of asking and answering questions. Traditional philosophical concerns about the nature of mathematical objects and the epistemology of mathematics are combined and fructified with the study of a wide variety of issues about the way mathematics is done, evaluated, and applied [. . .]. (Ferreirós 2016, p. xi)
Once it has been acknowledged that the active dimension of mathematics is a proper object of philosophical inquiry, the next step is to figure out how to study it. This raises methodological issues on how to investigate the whole range of actions and activities present in mathematical practices as well as the different forms of agency involved.5 It also raises theoretical questions regarding the nature of actions, activities, and agency in the specific context of mathematical practice. The aim of this chapter is to review various ways in which the notions of action, activity, and agent have played a role in philosophical studies of mathematical practice. The focus will be on contributions that have attempted to theorize about these notions. We begin, in Sect. 2, with some motivations for taking agents seriously when investigating mathematical practice. Section 3 is concerned with the notion of mathematical activity, both as an object of study in itself and as a way of building an epistemology of mathematics where mathematical knowledge is rooted into elementary mathematical activities. In Sects. 4 and 5, we will see how the notions of action and agent have played an important role in the philosophical study of mathematical artifacts and mathematical proofs. Section 6 will be concerned with mathematical agency and mathematical texts, and more specifically with how mathematical actions can be prescribed by mathematical texts. Taken together, these contributions already provide important clues on what it means to do mathematics. Section 7 concludes the chapter with some suggestions on how to pursue further the study of agency in mathematical practice.
2 Why Take Agents Seriously
An obvious reason to take agents seriously is that virtually all the mathematical practices one may be interested in investigating would not exist without agents engaging in them at a given point in human history. As Ferreirós emphasized: “[m]athematical practice always involves communities of agents in interaction” and “there is no practice without practitioners” (Ferreirós 2016, p. 45).6 One may then expect that the notion of agent will, one way or another, figure in any philosophical analysis of mathematical practice. Still, one may object that the essence of a mathematical practice lies in the norms or rules that are constitutive of it, and not in its particular, contingent implementation(s). From this perspective, one could investigate a given mathematical practice independently of its implementation(s), in the same way as one could study a judiciary system or an algorithm which may never have been implemented concretely. In opposition to this view, Manders (2008) has argued, in the context of his analysis of diagram-based geometric reasoning in ancient Greece, that taking into account the different roles of the agents engaged in this practice is important in explaining its success. Manders distinguishes two roles that he considers of theoretical importance (Manders 2008, p. 122): that of asserting the different geometrical claims made in the text and taking responsibility for them, and that of providing critical scrutiny or probing. One way of probing a geometrical claim is to propose a case, that is, a diagram which satisfies the stipulations of the claim but which is topologically different from the diagram(s) considered by the protagonist. According to Manders, proposing cases is theoretically significant because it is an essential strategy to cope with the open-ended nature of case-branching in diagram-based geometric practice, i.e., with the fact that such a practice does not possess a procedure to make sure that all topologically different diagrams that are relevant in a geometric proof have been considered. As evidence for the importance of the role of probing, Manders argues that several propositions in Euclid’s Elements can be conceived as responses to probing claims that could potentially be raised with respect to other geometrical proofs.

5 The nature of actions, activities, and agency is investigated in the branch of philosophy known as the philosophy of action (for overviews, see Schlosser 2019; Piñeros Glasscock and Tenenbaum 2023). The theme of this chapter naturally invites connections between the philosophy of mathematical practice and the philosophy of action. So far, interactions between the two fields have been rare, but see Chemla and Virbel (2015) and Hamami and Morris (2021, 2022a) for some attempts to connect the two.
6 On Ferreirós’ view of what a mathematical practice is, see also Ferreirós’ chapter in this handbook (Ferreirós 2022).
Another reason to take agents seriously is highlighted by cases where the mathematics themselves are primarily about actions. Typical examples are mathematical practices geared towards geometric constructions or concrete arithmetical/algorithmic procedures. For instance, ancient Greek geometry has been described by Reviel Netz as “the study of spatial action, not of visual representation” (Netz 1999, p. 60). Some mathematical traditions, such as those from Mesopotamia or ancient China, are primarily concerned with procedures for computations which are, by definition, sets of actions to be performed (see, e.g., Chemla 2012). When analyzing mathematical practices of this kind, one may naturally be led to reflect on the nature of the actions involved and on the structure of the agents performing them.7 But one of the main reasons to take agents seriously is that the notions of action, activity, and agent have proved theoretically useful and fruitful when addressing a number of issues that appear prominently in the research agenda of the philosophy of mathematical practice. As we shall see in the following sections, this bundle of notions occupies an important place in philosophical work on mathematical activities, mathematical artifacts, mathematical proofs, and mathematical texts.

7 We can also mention here practices that could be qualified as mathematical such as string figure-making (see, e.g., Vandendriessche 2015) or paper folding (see, e.g., Friedman 2018), though of course whether such practices can be qualified as mathematical is a debated question.
3 Agency and Mathematical Activities
Focusing on what it means to do mathematics amounts to seeing mathematics as an activity. In this section, we will be concerned with two lines of research that naturally emerge from this perspective. The first one consists in developing an epistemology of mathematical activities. The second one explores the possibility that mathematical knowledge is rooted or grounded in elementary human activities.

In a contribution entitled “Mathematical activity,” Giaquinto (2005) starts from the observation that, although philosophers of mathematical practice have set for themselves the task of investigating the activities that mathematicians engage in, they have so far focused on a narrow set of activities. Giaquinto then proposes a “preliminary map” of mathematical activities worthy of philosophical investigation, recognizing that this list is not meant to be exhaustive. His list consists of the following mathematical activities: discovery, explanation, formulation, application, justification, and representation. Once such a list is established, numerous questions arise: What is the nature of these activities? Are they really different kinds of activities? What are the relations between them? Can they be decomposed into smaller activities? Giaquinto tackles some of these questions. In particular, he argues that discovering, explaining, and justifying are distinct kinds of activities which do not reduce to proving. He also proposes that, for each of the activities in the list, one can distinguish between making it, presenting it, and taking it in. For instance, in the case of discovering, one can distinguish between making the discovery, presenting the discovery to other people, and being the one who is trying to “get” the discovery, that is, who is receiving or taking in the discovery. Giaquinto conceives of his contribution as a “springboard” to pursue further philosophical investigations into the nature of mathematical activities.
Some authors have argued that the origins, grounds, or roots of mathematical knowledge are to be found in certain elementary human activities. The mathematician Saunders Mac Lane put the general idea as follows8:

[M]athematics started from various human activities which suggest objects and operations (addition, multiplication, comparison of size) and thus lead to concepts (prime number, transformation) which are then embedded in formal axiomatic systems (Peano arithmetic, Euclidean geometry, the real number system, field theory, etc.). These systems turn out to codify deeper and nonobvious properties of the various originating human activities. (Mac Lane 1981, p. 463)
Mac Lane offered the following list of correspondences between elementary human activities and domains of mathematics:

Counting: to arithmetic and number theory
Measuring: to real numbers, calculus, and analysis
Shaping: to geometry and topology
Forming (as in architecture): to symmetry and group theory
Estimating: to probability, measure theory, and statistics
Moving: to mechanics, calculus, and dynamics
Calculating: to algebra and numerical analysis
Proving: to logic
Puzzling: to combinatorics and number theory
Grouping: to set theory and combinatorics

8 See also Mac Lane (1986, chapter XII).
There are different ways in which this general idea can be exploited to articulate a view of mathematics and mathematical knowledge as rooted or grounded in elementary human activities. I will now describe two of the most developed proposals in this direction, namely those of Kitcher (1984) and Ferreirós (2016).

In chapter 6 of The Nature of Mathematical Knowledge, Kitcher (1984) endeavors to develop an account of mathematical truth according to which mathematical statements are true in virtue of certain operations or manipulations humans can perform in the world. In the case of arithmetic, two such operations are collecting and correlating: collecting is the activity of segregating or putting objects together; correlating is the activity of matching, relating, or connecting objects. According to Kitcher, the first type of operation leads to notions like set, number, and arithmetical operations, while the second type leads to notions like that of a function. Thinking in terms of mathematical activities such as collecting and correlating is key to Kitcher’s mathematical ontology: “One central idea of my proposal is to replace the notions of abstract mathematical objects, notions like that of a collection, with the notion of a kind of mathematical activity, collecting” (Kitcher 1984, p. 110). Kitcher emphasizes that the operations or manipulations he has in mind are not those of any given human agent but those of an ideal agent. This ideal agent should be thought of as an idealized version of human agents. Kitcher then conceives of arithmetic as an “idealizing theory”: “the relation between arithmetic and the actual operations of human agents parallels that between the laws of ideal gases and the actual gases which exist in our world” (Kitcher 1984, p. 109). This approach allows Kitcher to argue that arithmetic constitutes a description of the structure of reality, that is, of some operations that can be done in the physical world. In a slogan, Kitcher says that arithmetic is true: “in virtue not of what we can do to the world but rather of what the world will let us do to it” (Kitcher 1984, p. 108). A main goal of The Nature of Mathematical Knowledge is then to show how advanced mathematics can emerge from such proto-mathematical knowledge through various stages of rational transitions.

Like Kitcher, Ferreirós holds that an account of mathematical knowledge should attribute a key role to certain elementary activities or practices: “I shall argue that our knowledge of mathematics cannot be understood without emphasizing the practical roots of math, including its roots in scientific practices and technical practices” (Ferreirós 2016, p. 5). But contra Kitcher, Ferreirós’ approach is not reductionist: central to Ferreirós’ account is the “interplay” or “interactions” between practices, not the transitions between different mathematical practices. Another important difference with Kitcher is that Ferreirós emphasizes the cognitive dimension of elementary mathematical practices and the fact that the cognitive abilities involved must be reenacted for an individual to gain mathematical knowledge, while Kitcher situates elementary mathematical practices at the historical roots of his genealogical account.
Ferreirós also offers a precise definition of what he calls a “technical practice,” namely: “a recognizable type of activity that is done – and can be taught and learned – by human agents, involving direct manipulation of objects in the world, through the use of human-made instruments” (Ferreirós 2016, pp. 40–41). The three technical practices discussed by Ferreirós (2016) are those of counting, measuring, and drawing geometrical forms. An important thesis in Ferreirós’ view is that knowledge of mathematics requires the mastering of these elementary techniques.

The notion of mathematical activity, intimately connected with that of a mathematical practice, is bound to play an important role in the philosophy and history of mathematical practice. As a matter of fact, many studies in the field consist precisely in the analysis of mathematical activities. The focus in this section has been on philosophical developments that have attempted to theorize about the notion of mathematical activity itself and/or to use it in the pursuit of larger goals. Here, much is to be gained by pursuing further the inquiry initiated by Giaquinto (2005). In particular, it would be theoretically and practically useful to identify and characterize the main components of what constitutes mathematical activities as well as the mathematical agents able to perform them.9 This could not only yield methodological tools to investigate specific mathematical activities; it could also provide a perspective to compare and relate different mathematical activities, a theme central to approaches like that of Ferreirós, which emphasize the interplay between mathematical practices, and thus between mathematical activities. Possessing a rich and precise account of the nature of mathematical activities will, in turn, be directly useful for building philosophical conceptions of mathematics rooted in mathematical activities.
4 Agency and Mathematical Artifacts
Doing mathematics involves doing things with mathematical artifacts such as diagrams, symbols, and graphs. A significant part of the literature in the philosophy and history of mathematical practice is dedicated to understanding and reconstructing what agents did or do with mathematical artifacts in various mathematical practices, from the use of geometric diagrams in ancient Greek mathematics (e.g., Netz 1999; Manders 2008) to that of commutative diagrams in contemporary homological algebra (e.g., De Toffoli 2017). The aim of this section is not to review this literature. Rather, the focus will be on contributions that have attempted to conceptualize and theorize about what it means to do things with mathematical artifacts. This line has been pursued mainly in the cases of mathematical diagrams and mathematical symbols,10 and I will review here some representative studies in this trend. Before that, I will present the notion of epistemic action from Kirsh and Maglio (1994), which has played a central role in these discussions.

9 As an example, Carter (2008) argues that investigating the activities that mathematicians perform with mathematical structures provides a new perspective on mathematical structuralism.
10 These two kinds of mathematical artifacts are some of the most common ones in mathematical practices past and present, which is probably why they have received most of the attention in the literature. But there are also other kinds of mathematical artifacts that invite epistemic actions, for instance abacuses, geometric instruments such as ruler and compass, and computer simulations.

The distinction between pragmatic and epistemic actions was introduced by Kirsh and Maglio (1994) in their psychological study of the video game Tetris. Pragmatic actions are physical actions performed in the world whose aim is to bring a physical system closer to a specific goal. In Tetris, the pragmatic actions are those aiming to orient the falling shapes in order to create full rows, the goal of the game being to fill up rows. Epistemic actions are also physical actions performed in the world, but their goal is different: their aim is to gain information by performing computations externally. In Tetris, the epistemic actions consist in rotating a falling shape in order to decide where and in which orientation to position it. The main advantage of relying on epistemic actions in Tetris is that rotating the shape in the game is less cognitively costly and more reliable than trying to rotate it mentally. Kirsh and Maglio argue that epistemic actions play an important role in many human activities, and that acting in the environment to gain information, knowledge, and understanding is an essential part of human cognition.

One area of contemporary mathematics where diagrams are omnipresent is knot theory.11 De Toffoli and Giardino (2014) have conducted a detailed analysis of the use of knot diagrams in the practice of knot theory. They have argued that knot diagrams are entities that support epistemic actions such as computations and inferences. These epistemic actions require a certain cognitive faculty on the part of the agent that they call “manipulative imagination” which, they suggest, is akin to the sort of concrete manipulations one can perform on deformable objects. The moves that can be performed on knot diagrams are codified in the mathematical theory.
Accordingly, agents need to be appropriately trained not only to recognize the possible or legitimate moves one can perform on a knot diagram but also to be able to perform these moves. De Toffoli and Giardino have further argued that this manipulative imagination also plays a role in the practice of low-dimensional topology (see De Toffoli and Giardino 2015). Here again, the agent must be able to recognize the permissible actions that can be performed on a given representation and so needs to be appropriately trained to do so. To show this, they have conducted a detailed analysis of Rolfsen’s proof of the equivalence of two presentations of the Poincaré homology sphere. In this particular case, some of the permissible actions consist of continuous transformations. According to De Toffoli and Giardino, this faculty of manipulative imagination builds on preexisting spatial and motor cognitive capacities but needs to be appropriately trained to play its epistemic function in the practice of knot theory and low-dimensional topology.

11 Knot theory is a branch of topology that studies knots – a mathematical knot is defined as an embedding of a circle in ℝ3.

Mathematical symbols are artifacts present in virtually every branch of mathematics. De Cruz and De Smedt (2013) have made the case that a primary function of mathematical symbols is to support epistemic actions. These epistemic actions consist in typical symbol manipulations such as those related to negative and imaginary numbers. These symbol manipulations are governed by specific rules associated with the relevant symbolic systems, rules that need to be learned by the agents. Once acquired, they provide the agents with an efficient way to perform various forms of mathematical thinking. De Cruz and De Smedt go even further and argue that mathematical symbols support forms of mathematical cognition that would not be possible without resorting to external representations, a view that they articulate by building on the so-called extended mind thesis (Clark and Chalmers 1998; Clark 2008). Their main source of evidence for this claim comes from the history of mathematics. One of their key examples is the case of negative numbers, which met with strong opposition in the past from distinguished mathematicians such as Vieta, Pascal, and De Morgan. De Cruz and De Smedt argue that the invention of the negative numbers, which appear counterintuitive, was made possible by the symbolization associated with the minus sign and the operation of subtraction, a symbolic system with which one can calculate.

This perspective naturally raises the question of what form of cognition underlies thinking and acting with mathematical symbols. Landy et al. (2014) have proposed a cognitive theory of symbolic reasoning, called the perceptual manipulations theory, that attributes central roles to perception and action. According to this theory, external symbols and notations are treated like physical objects by the cognitive system, that is, objects that can be perceived and manipulated. Symbolic reasoning would then rely on a wide range of sensorimotor abilities such as affordance learning, pattern-matching, object tracking, symmetry detection, etc. This theory could explain the importance of careful and effective design of mathematical notations.
More specifically, a well-designed notational system is, from this perspective, one that can take advantage of the capacities of the sensorimotor system in order, for instance, to encourage valid manipulations and discourage invalid ones, or to facilitate the application of structurally similar rules from one domain to another.
Giardino (2018) has advanced an encompassing framework aiming to account for different types of mathematical artifacts, including both mathematical diagrams and mathematical symbols. Her proposal is to conceive mathematical artifacts as representational cognitive tools whose main characteristic is to play the double function of representation and instrument. As instruments, the main function of representational cognitive tools is to carry out inferences, i.e., epistemic actions. Giardino offers an account of how epistemic actions operate on representational cognitive tools by building on the notions of material anchor from Hutchins (2005) and affordances from Gibson (1979). Thinking of representational cognitive tools as material anchors amounts to considering them as material entities which have been designed so that constraints associated with their epistemic functions, as well as with what they represent conceptually, are built into the artifacts themselves. The idea of affordances is that the epistemic actions that can be performed on a given representational cognitive tool are those that are afforded by the tool. These affordances depend on the context or practice in which the cognitive tool is used, which means that agents need to be trained to recognize these affordances, a necessary prerequisite to be able to perform legitimate epistemic actions with the tool.
What these different contributions show is that accounting for what we do with mathematical artifacts in mathematical practice naturally leads to considerations of mental and epistemic actions, and thus of mental and epistemic agency, and requires paying attention to the cognitive underpinnings of the mental activities under consideration. There is thus room on these issues for fruitful interactions between the philosophy of mathematical practice and related developments in the philosophy of mind, philosophy of action, and cognitive science.
5 Agency and Mathematical Proofs
Interacting with mathematical proofs12 is an essential part of what mathematical agents do. This includes evaluating, verifying, communicating, explaining, understanding, and reframing mathematical proofs, among others. But mathematical proofs themselves can also be conceived as being primarily about action. Hamami and Morris (2021) proposed the correspondence displayed in Fig. 1 between the static notions of deductive step and mathematical proof and the dynamic notions of deductive inference and proof activity which belong to the realm of action. That deductive inferences are first and foremost actions of an epistemic nature has been emphasized by several logicians and philosophers (see, e.g., Sundholm 2012; Prawitz 2012; Boghossian 2014; Wright 2014). Hamami and Morris introduced the term proof activity to refer to the sequence of deductive inferences corresponding to a mathematical proof. In this section, we will review contributions that propose an action-based perspective on mathematical proofs, focusing in turn on the notions of deductive inference and proof activity.
Fig. 1 Mathematical proof, proof activity, deductive step, and deductive inference. [Diagram: a mathematical proof P is a sequence of deductive steps S, each an elementary component of P; the corresponding proof activity A_P is a sequence of deductive inferences I_S, each an elementary component of A_P.]
12 By “mathematical proofs” I mean here the kind of proofs one commonly finds in mathematical practice, also commonly referred to as “informal proofs.”
Larvor (2012) has argued that, in order to better understand the nature of informal proofs, it is not sufficient to approach them as bodies of propositions connected in specific ways. Rather, one should see a proof as a sequence of dynamic transitions, that is, one should focus on inferential actions. This change of perspective is particularly fruitful because there is no reason to think of inferential actions as being restricted to linguistic representations – inferential actions can operate on different kinds of representations such as diagrams, symbolic expressions, physical models, computer models, etc.13 Larvor builds on this observation to argue that the validity or invalidity of informal proofs can depend on their content. The main reason advanced is that inferential actions are not always possible in all domains; only inferential actions that consist of logical inferences are. This means that the validity of these inferential actions must depend somehow on features of the domain under consideration. Larvor argues that inferential actions that are context-dependent can perfectly well be rigorous insofar as they come with means of control which govern the inferential actions that are possible and legitimate in a given domain. In this respect, Manders’ analysis of Euclid’s diagram-based geometric practice (Manders 2008) is an archetypal example of a study aiming to make explicit the means of control governing inferential actions operating on text and diagram in the context of Euclid’s geometric proofs. Larvor points out that this perspective shapes and organizes a research program on informal proofs in the philosophy of mathematical practice: the task is to identify the various types of inferential actions, to isolate the features of the domain on which they operate that are responsible for their validity, and to identify the means of control that govern them.14
If it is particularly fruitful to approach individual deductive steps as inferential actions, then it must be equally fruitful to approach entire mathematical proofs as sequences of inferential actions, i.e., as proof activities (see Fig. 1). Hamami and Morris (2021) have undertaken a detailed analysis of proof activities and the form of agency underlying them.
Their investigations start from the observation that proof activities have two noticeable characteristics, namely they are goal-directed and temporally extended activities. They are goal-directed in the sense that any proof activity is always directed towards a specific goal, namely to prove or establish the mathematical proposition at hand. They are temporally extended in the sense that any action in a proof activity depends on what happened before and constrains what will happen next, that is, any action is fully integrated into the temporal structure of the activity taken as a whole. In this respect, proof activities are similar to many of our ordinary activities such as travelling or cooking which are obviously goal-directed and which unfold over time. Now, the philosopher of action Michael Bratman (1987) has observed that, for human agents, the realization of goal-directed and temporally extended activities most often requires a form of planning agency. This means that the agents engaged in goal-directed and temporally extended activities are guided by a plan that they construct, revise, and execute over time. In the case of proof activities, this means that the sequence of actions does not come from nowhere but is the result of the execution of a plan that has been constructed rationally. This has some interesting consequences for the epistemology of mathematical proofs in practice. First, it means that mathematical proofs possess what Poincaré (1908) called an architecture or a
13 There is here a direct connection with the topic of agency and mathematical artifacts discussed in the previous section.
14 This line is further pursued in Larvor (2019).
unity that can be grasped as a whole (see also Detlefsen 1992). This is due to the fact that a mathematical proof is the result of the execution of a plan produced by a mathematical agent, in the same way as a building is the result of executing the plan of an architect. Second, because proof plans are rational constructs, this also means that mathematical proofs produced by rational planning agents possess what Mac Lane (1935) called a rational structure – a mathematical proof in practice is not a “mere” sequence of deductive steps. Hamami and Morris (2022a) have argued that this notion of rational structure can be fleshed out by identifying the norms of rationality governing the construction of proof plans. Third, if we accept proof plans as meaningful epistemological entities, then we may expect that some of the activities carried out with proofs will have to do with proof plans. Indeed, several authors have pointed out an intimate connection between proof plans and proof understanding (e.g., Poincaré 1908; Robinson 2000; Folina 2018). Following along this line, Hamami and Morris (2022b) have advanced an account of proof understanding according to which understanding a mathematical proof amounts to being able to rationally reconstruct the proof’s underlying plan. Finally, Hamami and Morris (2021) have also argued that proof plans play an important role in the activities of presenting and communicating mathematical proofs. The action-based perspective on mathematical proofs fits very naturally with the general idea of approaching mathematics as a human activity. As we saw in this section, the static and agentless notions of deductive steps and mathematical proofs can profitably be analyzed through their active counterpart, namely the notions of deductive inferences or inferential actions and proof activities. This perspective is particularly suited to study the spectrum of activities carried out in interaction with mathematical proofs. 
As an illustration, we saw here that Larvor’s view on inferential actions naturally leads to an account of what it means to evaluate the validity of mathematical proofs, while Hamami and Morris’ analysis of proof activities can be used to shape an account of what it means to understand, present, and communicate mathematical proofs. But these are only some of the many activities related to mathematical proofs. An obvious research program to foster this action-based perspective will be to pursue and extend the analysis of these different activities. This, in turn, can shed light on the nature of agency involved when interacting with mathematical proofs.
6 Agency and Mathematical Texts
Mathematical texts are a privileged vehicle for the transmission of mathematical knowledge.15 If we approach the nature of mathematics through the lens of mathematical practice, a key issue is then to understand how mathematical agents can realize mathematical actions and activities on the basis of mathematical texts. In this section, we will review studies in the field that address exactly this question. As we shall see, such investigations can yield important insights on key questions in the philosophy of mathematical practice.
15 This is not to say that mathematical knowledge cannot be transmitted by other means. Agent-to-agent communications through oral transmissions, possibly accompanied by gestures and external representations, have been historically dominant in some mathematical cultures and are certainly very important in present-day mathematical practice. Thanks to José Ferreirós for suggesting this clarification.
Computing and proving are two archetypical mathematical activities. The former is central to certain mathematical traditions such as those of Mesopotamia, ancient China, and the Indian subcontinent, while the latter has been privileged in other traditions such as that of ancient Greece (Chemla 2012). Some commentators have downplayed the activity of computing as compared to that of proving for several reasons (e.g., Hacking 2000), a main one being that computations can be carried out blindly, in a step-by-step fashion, without any form of mathematical understanding. Chemla (2015) argues against this view by undertaking a detailed investigation of how texts of mathematical procedures in ancient China prescribe mathematical actions. The focus is on two key mathematical texts, namely the Writings on mathematical procedures and The Nine Chapters on Mathematical Procedures. Her analysis yields two important results. The first one is that procedures in these texts cannot be executed in a step-by-step fashion because the specification of certain steps requires attending to other instructions appearing later on in the sequence. This means that a certain form of knowledge is required to “circulate” – Chemla’s term – within these texts of procedures. This circulation is necessary to infer from the text the actual sequence of actions to be carried out in specific cases or situations. Chemla argues that this feature attests to the generality of the procedures – i.e., the fact that the same procedure can be applied to a wide range of different cases and situations.
It is thus essential that the reader be able to infer from the text which actions to take in her specific situation, a competence that goes beyond a blind step-by-step following of the procedure. The second result is that, to infer the sequence of actions to be carried out, the reader needs a certain understanding of the reason(s) why certain steps are carried out. This follows from a meticulous analysis of what it takes to interpret steps in procedures that involve the term “likewise,” that is, steps that require an understanding of what exactly is to be replicated. Taken together, this analysis of the way actions are prescribed in texts of mathematical procedures challenges, according to Chemla, the view that computations are mathematical activities that are carried out blindly without any understanding.
Tanswell (forthcoming) has offered an analysis of imperatives in mathematical proofs – introduced by terms such as “let,” “assume,” “solve,” “observe,” etc.16 – which, he argues, yields important insights into the nature of informal proofs. Imperatives issue instructions or commands to carry out mathematical actions. Tanswell identifies three kinds of imperatives in mathematical proofs. The first kind are imperatives which directly refer to the activity to be carried out, for instance, “solve” this equation, “differentiate” or “integrate” this function, “multiply” these two matrices, etc. The second kind are imperatives corresponding to standard
16 See the chapter by Inglis and Tanswell in this Handbook (Inglis and Tanswell 2022) for a corpus analysis of imperatives and instructions as they occur in mathematical texts.
instructions, but for which some information is left implicit, such as an “assume” clause without stating exactly what the goal is (e.g., to establish a conditional, to reach a contradiction, to do a reasoning by cases, etc.). The third kind are imperatives whose formulation does not correspond exactly to what is to be done and, for this reason, require background and expertise to be interpreted appropriately. As an example, Tanswell mentions a situation where one is told, “using the Axiom of Choice, for each n choose an enumeration,” while “the whole point of the Axiom of Choice is that we cannot actually go about choosing infinitely many times” (Tanswell forthcoming, p. 7). Acknowledging the importance of imperatives in mathematical proofs motivates, according to Tanswell, what he calls the recipe model of informal proofs.17 The model is introduced through a number of analogies between proofs and ordinary cooking recipes. It is noted that proofs, like cooking recipes, (1) employ the imperative mood, (2) are secondary to the associated activities, and (3) involve a clear distinction between the process the author(s) went through to produce them and the way they are used by the readers or consumers. Tanswell argues that this recipe model of informal proofs is particularly well suited to account for the role of diagrams in proofs, mainly because diagrams are used to provide instructions in many contexts, and so there is no particular reason to see diagrams as more problematic than texts when it comes to yielding instructions.
The studies by Chemla (2015) and Tanswell (forthcoming) show that much is to be learned by conducting detailed analyses of how mathematical texts can lead to mathematical actions and activities.
Here, we have seen that this can provide evidence for revising received views on the relation between computation and proof as well as on the nature of informal proof.18 But, more generally, such investigations are a privileged way to better understand mathematical activities themselves. In fact, for many (most?) mathematical traditions, mathematical texts are all we have. A key task for the historians of mathematics is thus to reconstruct past mathematical activities on the basis of the mathematical texts that came down to us. As Chemla’s study illustrates here, this should not only consist of identifying the actions constituting these activities, it should also amount to characterizing the knowledge, competence, and understanding that mathematical agents need to possess to carry out the relevant mathematical activities. From this perspective, the issue of mathematical agency necessarily arises whenever one undertakes to analyze or reconstruct mathematical activities, and more generally mathematical practices, on the basis of mathematical texts.19
17 See Weber and Tanswell (2022) for further developments of this model and its applications to issues in mathematics education.
18 In this respect, the studies by Chemla and Tanswell fit perfectly within the action-based perspective on mathematical proofs discussed in the previous section.
19 Here, mathematical texts should be understood in the broad sense encompassing all the different kinds of representations used in mathematical communication, namely texts but also diagrams, symbols, graphs, etc. There is then a direct connection between the theme of this section and the ones developed in Sect. 4 on agency and mathematical artifacts.
7 Conclusion
What does it mean to do mathematics? Doing mathematics obviously involves doing a wide range of different things, and so the only way to progress on this question is to decompose the problem into chunks amenable to philosophical investigations. As we saw in this chapter, philosophers and historians of mathematical practice have made significant progress in this direction by investigating what it means to carry out mathematical activities, do things with mathematical artifacts, engage with mathematical proofs, and perform mathematical actions prescribed by mathematical texts. But this may only be the tip of the iceberg, and it is likely that the question of agency in mathematical practice covers a vast research territory that remains to be uncovered and explored. In this conclusion, I will suggest potential avenues for future work in this direction. Perhaps one of the most pressing issues is to reflect on what mathematical agents are, a problem that should be addressed for every specific mathematical activity and practice. One of the few authors who have tackled this question directly is Ferreirós (2016), the notion of mathematical agent being central to the philosophical view he advances.20 For Ferreirós, mathematical agents are first and foremost human agents with ordinary cognitive abilities and limitations, and immersed in a physical, social, and cultural world. Ferreirós argues that all those aspects are relevant to his account of mathematical knowledge and mathematical practices. Taken together, the contributions reviewed in this chapter already highlight many salient aspects of mathematical agents. 
In addition to the features identified by Ferreirós, mathematical agents are also planning agents (Hamami and Morris 2021, 2022a), endowed with a faculty of manipulative imagination (De Toffoli and Giardino 2014, 2015, 2016), capable of performing inferential actions (Larvor 2012, 2019), grasping and understanding the reasons behind mathematical actions (Chemla 2015), and following recipes (Tanswell forthcoming; Weber and Tanswell 2022). A challenge for the philosophy of mathematical practice is to develop an account of mathematical agents capable of articulating these different dimensions in a systematic way.
Research on agency in mathematical practice has so far focused mainly on individual human agency, but several other forms of agency are present in mathematical practice – the most obvious ones are social agency, computer agency, and extended agency. Social agency is present whenever mathematical agents are doing things together, typically when several mathematicians are collaborating to prove a theorem. This raises the question of what it means for a group of mathematical agents to do mathematics together. As philosophers of action have shown,21 social agency most often requires subtle mechanisms of coordination and interaction. Investigating these
20 In Ferreirós’ words: “The thesis that multiple practices coexist and are interrelated, in a way that is crucial for the conformation of mathematical knowledge, has one interesting consequence. It means that my analysis of mathematical knowledge has to be crucially centered on the agents” (Ferreirós 2016, p. 59).
21 See Roth (2017) for a review of the literature on social agency in the philosophy of action.
mechanisms in the context of mathematical practice will certainly lead to important insights into the social dimension of mathematical practice. Additionally, it may be interesting to investigate the agency displayed by the mathematical community or subgroups thereof, for instance, when attributing credits and rewards (see, e.g., Jaffe and Quinn 1993; Rittberg et al. 2020). Similarly, the increasing role of computers in mathematical practice raises the questions of what it means for computers to do mathematics, if at all, and what it means to do mathematics with computers.22 These issues arise concretely whenever one investigates cases of proving or discovering where computers are involved in an essential way – e.g., in the computer-assisted proof of the four color theorem. Finally, the many studies on mathematical artifacts in mathematical practice raise the question of what exactly the agent who does mathematics consists in. When a mathematical activity is carried out by a human agent relying on some mathematical artifacts, what is doing mathematics is a sort of integrated system composed of the coupling of the agent and the artifacts, in which case the mathematical agent may better be conceived as an extended agent. Such an approach could be developed, for instance, along the lines of the cyborg conception of agency proposed by Clark (2003), following the famous extended mind thesis much discussed in the philosophy of mind (Clark and Chalmers 1998). Another challenge for philosophers and historians of mathematical practice is thus to identify and characterize the different forms of agency at play in mathematical practice.
The question of agency in mathematical practice is both a theoretical and a methodological issue for the philosophy of mathematical practice. It is a theoretical issue in the sense that any account of what it means to do mathematics may be expected to say something about the agents doing mathematics.
But it is also a methodological issue in the sense that our underlying conception of mathematical agency guides, in part, our investigations into specific mathematical activities and practices. More specifically, it provides us with a “template” to address questions such as: What does this mathematical activity or practice consist in? What does it take for an agent to be able to properly engage in it? These two aspects are intimately intertwined: progress on the theoretical front may yield new tools and perspectives to investigate specific mathematical activities and practices;23 in turn, these investigations may yield new empirical data to constrain and revise our theoretical conceptions. Given that the nature of agency is one of the outstanding issues spanning the Humanities, progress in the case of mathematical agency can certainly be made by recruiting concepts and resources from other fields, both within philosophy – especially from the philosophy of action, philosophy of mind, epistemology, and the philosophy of science – and from areas of the social sciences such as sociology and anthropology. Understanding the nature of agency in mathematical practice is thus likely to require the full interdisciplinary perspective that was claimed as a characteristic feature of the philosophy of mathematical practice.
22 See Avigad (2008) for some investigations in this direction.
23 A perfect illustration of this is provided by the studies in Chemla and Virbel (2015) which build, in part, on the theoretical framework developed by Virbel and colleagues (Grandaty et al. 2000; Virbel 2000).
References
Avigad J (2008) Computers in mathematical inquiry. In: Mancosu P (ed) The philosophy of mathematical practice. Oxford University Press, Oxford, pp 302–316
Boghossian P (2014) What is inference? Philos Stud 169(1):1–18
Bratman ME (1987) Intention, plans, and practical reason. Harvard University Press, Cambridge, MA
Cantù P (2023) The social constitution of mathematical knowledge: objectivity, semantics and axiomatics. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer, Cham
Carter J (2008) Structuralism as a philosophy of mathematical practice. Synthese 163(2):119–131
Carter J (2019) Philosophy of mathematical practice: motivations, themes and prospects. Philos Math 27(1):1–32
Chemla K (ed) (2012) The history of mathematical proof in ancient traditions. Cambridge University Press, Cambridge
Chemla K (2015) Proof, generality and the prescription of mathematical action: a nanohistorical approach to communication. Centaurus 57(4):278–300
Chemla K, Virbel J (eds) (2015) Texts, textual acts and the history of science. Springer, Cham
Clark A (2003) Natural-born cyborgs: minds, technologies, and the future of human intelligence. Oxford University Press, Oxford/New York
Clark A (2008) Supersizing the mind: embodiment, action, and cognitive extension. Oxford University Press, Oxford/New York
Clark A, Chalmers D (1998) The extended mind. Analysis 58(1):7–19
Cole J (2013) Towards an institutional account of the objectivity, necessity, and atemporality of mathematics. Philos Math 21(1):9–36
De Cruz H, De Smedt J (2013) Mathematical symbols as epistemic actions. Synthese 190(1):3–19
De Toffoli S (2017) ‘Chasing’ the diagram: the use of visualizations in algebraic reasoning. Rev Symb Log 10(1):158–186
De Toffoli S (2023) The epistemological subject(s) of mathematics. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer, Cham
De Toffoli S, Giardino V (2014) Forms and roles of diagrams in knot theory. Erkenntnis 79(3):829–842
De Toffoli S, Giardino V (2015) An inquiry into the practice of proving in low-dimensional topology. In: Lolli G, Panza M, Venturi G (eds) From logic to practice: Italian studies in the philosophy of mathematics, Boston studies in the philosophy and history of science, vol 308. Springer, Cham, pp 315–336
De Toffoli S, Giardino V (2016) Envisioning transformations – the practice of topology. In: Larvor B (ed) Mathematical cultures: the London meetings 2012–2014. Birkhäuser, Cham, pp 25–50
Detlefsen M (1992) Poincaré against the logicians. Synthese 90(3):349–378
Feferman S (2009) Conceptions of the continuum. Intellectica 51(1):169–189
Ferreirós J (2016) Mathematical knowledge and the interplay of practices. Princeton University Press, Princeton
Ferreirós J (2022) What are mathematical practices? The web-of-practices approach. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer, Cham
Folina J (2018) Towards a better understanding of mathematical understanding. In: Piazza M, Pulcini G (eds) Truth, existence and explanation. Springer, Cham, pp 121–146
Frege G (1879) Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Louis Nebert, Halle a. S.
Friedman M (2018) A history of folding in mathematics: mathematizing the margins, Science networks. Historical studies, vol 59. Springer, Cham
Gentzen G (1934) Untersuchungen über das logische Schließen I & II. Math Z 39:176–210, 405–431
Giaquinto M (2005) Mathematical activity. In: Mancosu P, Jørgensen KF, Pedersen SA (eds) Visualization, explanation and reasoning styles in mathematics. Springer, Dordrecht, pp 75–87
Giardino V (2018) Tools for thought: the case of mathematics. Endeavour 42(2):172–179
Gibson JJ (1979) The ecological approach to visual perception. Houghton Mifflin, Boston
Grandaty M, Debanc C, Virbel J (2000) Evaluer les effets de la mise en page sur la compréhension et la mémorisation de textes procéduraux (règles de jeux) par des adultes et des enfants de 9 à 12 ans. PArole (special issue “Langage et Cognition”) 13:3–38
Hacking I (2000) What mathematics has done to some and only some philosophers. In: Smiley T (ed) Mathematics and necessity: essays in the history of philosophy. Oxford University Press, Oxford
Hamami Y, Morris RL (2020) Philosophy of mathematical practice: a primer for mathematics educators. ZDM 52(6):1113–1126
Hamami Y, Morris RL (2021) Plans and planning in mathematical proofs. Rev Symb Log 14(4):1030–1065
Hamami Y, Morris RL (2022a) Rationality in mathematical proofs. Australas J Philos
Hamami Y, Morris RL (2022b) Understanding in mathematics: the case of mathematical proofs. Unpublished manuscript
Hersh R (1997) What is mathematics, really? Oxford University Press, New York/Oxford
Hutchins E (2005) Material anchors for conceptual blends. J Pragmat 37(10):1555–1577
Inglis M, Tanswell FS (2022) The language of proofs: a philosophical corpus linguistics study of instructions and imperatives in mathematical texts. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer, Cham
Jaffe A, Quinn F (1993) “Theoretical mathematics”: toward a cultural synthesis of mathematics and theoretical physics. Bull Am Math Soc 29(1):1–13
Kirsh D, Maglio P (1994) On distinguishing epistemic from pragmatic action. Cogn Sci 18(4):513–549
Kitcher P (1984) The nature of mathematical knowledge. Oxford University Press, New York
Lakatos I (1976) Proofs and refutations: the logic of mathematical discovery. In: Worrall J, Zahar E (eds). Cambridge University Press, Cambridge
Landy D, Allen C, Zednik C (2014) A perceptual account of symbolic reasoning. Front Psychol 5(275):1–10
Larvor B (2012) How to think about informal proofs. Synthese 187(2):715–730
Larvor B (2019) From Euclidean geometry to knots and nets. Synthese 196(7):2715–2736
Mac Lane S (1935) A logical analysis of mathematical structure. Monist 45(1):118–130. Oxford University Press
Mac Lane S (1981) Mathematical models: a sketch for the philosophy of mathematics. Am Math Mon 88(7):462–472
Mac Lane S (1986) Mathematics: form and function. Springer, New York
Mancosu P, Jørgensen KF, Pedersen SA (eds) (2005) Visualization, explanation and reasoning styles in mathematics. Springer, Dordrecht
Manders K (2008) The Euclidean diagram (1995). In: Mancosu P (ed) The philosophy of mathematical practice. Oxford University Press, Oxford, pp 80–133
Martin-Löf P (1984) Intuitionistic type theory. Bibliopolis, Naples
Netz R (1999) The shaping of deduction in Greek mathematics: a study in cognitive history. Cambridge University Press, Cambridge
Piñeros Glasscock JS, Tenenbaum S (2023) Action. In: Zalta EN, Nodelman U (eds) The Stanford encyclopedia of philosophy, Spring 2023. Metaphysics Research Lab, Stanford University
Poincaré H (1908) Science et Méthode. Ernest Flammarion, Paris
Prawitz D (1965) Natural deduction. A proof-theoretical study. Almqvist & Wiksell, Stockholm
Prawitz D (2012) The epistemic significance of valid inference. Synthese 187(3):887–898
Rittberg CJ, Tanswell FS, Van Bendegem JP (2020) Epistemic injustice in mathematics. Synthese 197(9):3875–3904
Robinson JA (2000) Proof = guarantee + explanation. In: Hölldobler S (ed) Intellectics and computational logic. Applied Logic Series, vol 19. Kluwer Academic Publishers, Dordrecht, pp 277–294
Roth AS (2017) Shared agency. In: Zalta EN (ed) The Stanford encyclopedia of philosophy, Summer 2017
Schlosser M (2019) Agency. In: Zalta EN (ed) The Stanford encyclopedia of philosophy, Winter 2019. Metaphysics Research Lab, Stanford University
Sundholm G (2012) “Inference versus consequence” revisited: inference, consequence, conditional, implication. Synthese 187(3):943–956
Tanswell FS (forthcoming) Go forth and multiply: on actions, instructions and imperatives in mathematical proofs. In: Bueno O, Brown J (eds) Essays on the philosophy of Jody Azzouni. Springer, Cham
Vandendriessche E (2015) String figures as mathematics? In: Studies in history and philosophy of science, vol 36. Springer, Cham
Virbel J (2000) Un type de composition d’actes illocutoires directifs et engageants dans les textes de type “consigne”. PArole (special issue “Langage et Cognition”) 11–12:200–221
Weber K, Tanswell FS (2022) Instructions and recipes in mathematical proofs. Educ Stud Math 111(1):73–87
Wilder RL (1950) The cultural basis of mathematics. Proc Intl Congr Math 1:258–271
Wright C (2014) Comment on Paul Boghossian, “what is inference”. Philos Stud 169(1):27–37
Algebraic Versus Geometric Thought and Expression in the Early Calculus
Viktor Blåsjö
Contents
1 Introduction
2 Absence of Trigonometric Functions in Early Calculus
3 Introduction of Trigonometric Functions by Euler
4 Logarithmic Functions
5 Dimensional Homogeneity
6 Did Barrow Prove the Fundamental Theorem of Calculus?
7 Conclusion
8 Cross-References
References
Abstract
The language of the early calculus was much more geometrical than the analytic and algebraic style that was pioneered by Euler and still dominates today. For instance, functions such as sin(x) and log(x) were largely absent from the early calculus, with geometric paraphrases used in their place. From a modern standpoint, one may be inclined to assume that the eventual triumph of the more analytic perspective was a straightforward case of progress, and that the geometric aspects of the early calculus were a historical artifact ultimately hampering this development. Interestingly, however, in private notes, the pioneers of the calculus showed a readiness to disregard traditionalism and operate freely in a more protomodern style than they allowed themselves in their publications. This suggests that the adherence to the geometrical mode in published works was a deliberate choice selected with full awareness of the analytic alternative. Indeed, the geometrical paradigm was no mere blind conservatism or lip service to classical foundations; rather, it arguably had genuine merits, for example, as an intuition-boosting heuristic strategy. This aspect of the early calculus can serve as a case study that illuminates the relation between official expression and informal thought in mathematics more generally. For one thing, it complicates the common historiographic assumption that fidelity to historical thought is best achieved by following the original text’s mode of expression as closely as possible.

Keywords
Leibniz · Newton · Barrow · Euler · History of infinitesimal calculus · Fundamental theorem of calculus · Dimensional homogeneity · Anachronism · Historiography

V. Blåsjö
Universiteit Utrecht, Utrecht, The Netherlands
e-mail: [email protected]
© Springer Nature Switzerland AG 2020
B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_14-1
1 Introduction
When studying a historical mathematical text, should we attempt to interpret it in terms of later mathematical concepts? Rampant anachronism can be a mark of insensitivity and a closed mind; it can drown out subtleties and rob us of one of the greatest rewards of studying history in the first place – that of expanding our own mindset and perspective. But banning all anachronistic analysis as illegitimate risks making historical scholarship conceptually impoverished and pathologically preoccupied with incidental details. The question is ultimately one that not only the historian but also the philosopher of mathematical practice must grapple with: Can a mathematician’s written word be taken as constitutive of their underlying thought? The early development of the infinitesimal calculus in the late seventeenth century provides a rich store of case studies for these questions. In addition to interesting differences between the original and modern forms of the calculus, we are fortunate that extensive personal manuscripts by the pioneers of the calculus have been preserved (and in many cases recently published for the first time), helping us further triangulate the relations between thought and expression, tradition and innovation.
2 Absence of Trigonometric Functions in Early Calculus
The early practitioners of the calculus virtually never used any trigonometric expressions such as sin(x) in their calculus. Neither Newton nor Leibniz ever wrote sin(x) in any calculus formula in any of their published works. They and many others did write expressions like this in purely geometric contexts, such as referring to sine tables. But they never treated sin(x) or any other trigonometric expression as a function in a calculus context. Unlike us, they did not see the differentiation and anti-differentiation of such expressions as among the most basic and commonly used calculation rules. They never described curves or solutions to differential equations in terms of such expressions.
Fig. 1 Uniqueness of circular orbit through a given point in a F = kr force field
How can this be, when any modern calculus textbook is packed with such uses of trigonometric functions? How can one have a functioning calculus without this tool? In fact, the early practitioners of the calculus were not in any way handicapped by this choice. Let us consider some problems whose treatment in a modern calculus textbook seems most essentially based on trigonometric functions, and see how Newton and Leibniz managed to treat these problems just as well, if not better, in other terms.

Simple harmonic motion, such as the bobbing motion of a weight on a spring, is a prototype example of a physical phenomenon described by a sine curve. The harmonic motion of a spring follows from Hooke’s law that force is proportional to extension, $F \propto x$. Combined with Newton’s law $F = ma = m\ddot{x}$, this means that the differential equation for the motion of the weight on the spring is $x = k\ddot{x}$, where the constant $k$ is negative because the spring force opposes the extension. Today, we would express the solutions to this equation in terms of sine and cosine functions. Yet in the seventeenth century people managed without them. Newton dealt with this problem as follows.

Newton considered a more general situation: motion in a force field where the force is proportional to distance from the origin, and directed toward the origin. This is analogous to a gravitational force field, but with a different force law. For any given starting position, a body can be launched into a circular orbit if it is given just the right sideways velocity. If the speed is increased or diminished by any amount, this would cause a deviation from the circular path (Fig. 1). Thus a specific orbital speed is naturally and intrinsically associated with this force field and this starting position.
Table 1 Behavior of a body being dropped from (x, y) = (0, R) in a F = kr force field

| Quantity | In terms of formulas | In terms of the associated circular-orbit motion |
| Position at given time | $y(t) = R\cos(\sqrt{k}\,t)$ | y-position of the orbital motion at that time |
| Time to reach given position | $t(y) = \arccos(y/R)/\sqrt{k}$ | Time in which orbital motion reaches that y |
| Speed at given position | $\dot{y}(y) = -\sqrt{k}\,R\,\sin(\arccos(y/R))$ | y-direction speed of orbital motion at that y |
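The correspondence recorded in Table 1 can be checked numerically. The following Python sketch is an illustration added here, not something found in Newton. It assumes unit mass, so that the circular orbit of radius R in the F = kr field has angular speed √k; the function names and the sample values k = 2, R = 3 are hypothetical choices. It compares each formula in the table with the corresponding feature of the orbital motion.

```python
import math


def drop_position(t, k, R):
    """Formula column, row 1: y(t) = R cos(sqrt(k) t)."""
    return R * math.cos(math.sqrt(k) * t)


def drop_time(y, k, R):
    """Formula column, row 2: t(y) = arccos(y / R) / sqrt(k)."""
    return math.acos(y / R) / math.sqrt(k)


def drop_velocity(y, k, R):
    """Formula column, row 3, with sign: negative while the body falls."""
    return -math.sqrt(k) * R * math.sin(math.acos(y / R))


def orbit_point(t, k, R):
    """Associated circular orbit, started at (0, R); unit mass is assumed."""
    w = math.sqrt(k)
    return (R * math.sin(w * t), R * math.cos(w * t))


def orbit_velocity(t, k, R):
    """Velocity vector of the orbital motion at time t."""
    w = math.sqrt(k)
    return (R * w * math.cos(w * t), -R * w * math.sin(w * t))


def check_table(k=2.0, R=3.0, samples=50):
    """Largest mismatch between the formulas and the orbital description."""
    worst = 0.0
    half_period = math.pi / math.sqrt(k)
    for i in range(1, samples):
        t = half_period * i / samples
        _, orbit_y = orbit_point(t, k, R)
        _, orbit_vy = orbit_velocity(t, k, R)
        y = drop_position(t, k, R)
        worst = max(
            worst,
            abs(y - orbit_y),                          # position row
            abs(drop_time(y, k, R) - t),               # time row
            abs(drop_velocity(y, k, R) - orbit_vy),    # speed row
        )
    return worst
```

Calling `check_table()` returns the largest discrepancy over the sampled times; a value at the level of floating-point rounding indicates that the three formulas and the three orbital descriptions agree.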
By taking this speed as a reference, one can describe harmonic motion in this field in a natural and precise way without the need for formulas. This is precisely what Newton does in the Principia, Book I, Prop. 38, which is about the simple harmonic motion of a dropped object in a $F \propto r$ force field. He expresses everything in terms of the associated circular-orbit motion. Table 1 puts this approach side by side with the modern approach using formulas.

Newton’s way has a number of advantages. For instance, the equivalence of the three different results shown is immediately evident from the geometric descriptions but takes some algebra to derive from the formulas. Furthermore, Newton’s mode of expression uses language intrinsic to the system itself: the naturally associated orbital motion. The formula approach, on the other hand, becomes ugly and opaque precisely because it uses extrinsic frames of reference. The constants k and R depend on our choice of units of force and position, and the trigonometric functions assume a unit radius and hence have to be scaled in various ways. Those are conventions that are external to the specific scenario at hand, yet they dominate the formulas. This is why the formulas are opaque and hide the simple dynamics of the system: the formulas are primarily focused on accommodating the system to a fixed external reference frame rather than on describing the system in the clearest and most intuitive terms.

Another example where modern calculus textbooks make trigonometric functions seem indispensable is the cycloid. “The only convenient way of representing a cycloid is by means of parametric equations,” one standard textbook proclaims (Simmons 1996, 592). In the discussion that follows, the expressions sin and cos occur 40 times in the space of two pages, making the cycloid a good candidate for the most trig-intensive example in the standard calculus repertoire.
Leibniz did not think this was “the only convenient way of representing a cycloid.” In fact, the very opposite is the case: Leibniz explicitly and repeatedly used the cycloid as his prime example of how “perfectly” the calculus could represent curves, but his representation is very different from the “only convenient” one of modern textbooks. In his very first paper on the integral calculus, he stresses that his “equation expresses the relation between the ordinate y and the abscissa x perfectly” and “from it all the properties of the cycloid can be derived” (Blåsjö 2017, 65).

Fig. 2 The cycloid (labels in the figure: (x, y), (X, Y), s, ΔX, ΔY)

Leibniz’s “perfect” way of expressing the cycloid involves no trigonometry and no parametric equations. It is based instead on a simple geometrical property of the cycloid shown in Fig. 2: The three dashed lengths are all equal, since the arc s measures both the amount of rotation needed to bring the tracing point from the top position down to (x, y), and the amount of arc that has been in contact with the ground during this rotation, that is, the distance traveled, or the horizontal displacement X − x. Therefore an equation for the cycloid is X = x + s. This equation can be expanded, if desired, by expressing x using the Pythagorean Theorem and s as an arc-length integral. Assuming that the generating circle has unit radius, this gives

$$X(y) = x + s = \sqrt{1-y^2} + \int_y^1 \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy = \sqrt{1-y^2} + \int_y^1 \frac{dy}{\sqrt{1-y^2}}$$
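As a sanity check, the equation X = x + s can be compared with the parametric form found in modern textbooks. The Python sketch below is an illustration under stated assumptions, not Leibniz’s procedure. It takes a unit generating circle centered at the origin, measures the arc s from the top point by an inscribed polygon (so that no trigonometric function enters the computation of X), and compares the result with the parametrization (s + sin s, cos s), which is the textbook form adapted to these coordinates. The function names and the step count are hypothetical choices.

```python
import math


def arc_from_top(y, n=20000):
    """Arc length on the unit circle from the top point (0, 1) down to
    (sqrt(1 - y^2), y), approximated by an inscribed polygon."""
    total = 0.0
    prev_x, prev_y = 0.0, 1.0
    for i in range(1, n + 1):
        yi = 1.0 - (1.0 - y) * i / n
        xi = math.sqrt(max(0.0, 1.0 - yi * yi))
        total += math.hypot(xi - prev_x, yi - prev_y)
        prev_x, prev_y = xi, yi
    return total


def leibniz_X(y, n=20000):
    """X = x + s, with x from the Pythagorean theorem and s as an arc length."""
    x = math.sqrt(max(0.0, 1.0 - y * y))
    return x + arc_from_top(y, n)


def parametric_cycloid(s):
    """Textbook-style parametrization in the same coordinates (assumed here)."""
    return (s + math.sin(s), math.cos(s))


def max_discrepancy(samples=12):
    """Largest gap between the two descriptions over half an arch."""
    worst = 0.0
    for j in range(samples + 1):
        s = math.pi * j / samples
        X, Y = parametric_cycloid(s)
        worst = max(worst, abs(leibniz_X(Y) - X))
    return worst
```

The discrepancy returned by `max_discrepancy()` reflects only the polygonal approximation of the arc and shrinks as `n` grows, which is consistent with the two descriptions defining the same curve.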
The equation for X(y) just obtained is essentially Leibniz’s “perfect” equation for the cycloid, except he centers his coordinate system on our y = 1, so he gets the equivalent $\sqrt{2y - y^2} + \int dy/\sqrt{2y - y^2}$ instead. Leibniz does not spell out how “all the properties of the cycloid can be derived” from this equation, but this can indeed be done. Let us consider how such an approach can lead to the same results that are derived using trigonometric expressions in modern textbooks such as Simmons (1996).

What is the area A of the cycloid, in terms of the area a of the generating circle? Integrating the equation X = x + s, we get

$$\frac{A}{2} = \int_{-1}^{1} X\,dy = \int_{-1}^{1} x\,dy + \int_{-1}^{1} s\,dy = \frac{a}{2} + \big[ys\big]_{y=-1}^{y=1} - \int_{-1}^{1} y\,\frac{ds}{dy}\,dy = \frac{3a}{2} + \int_0^{\pi} y\,ds = \frac{3a}{2} + \big[x\big]_{s=0}^{s=\pi} = \frac{3a}{2}$$

so the area of the cycloid is three times the area of the generating circle, A = 3a. (The boundary term $[ys]$ equals π = a, since s = 0 at y = 1 and s = π at y = −1.) We see that Leibniz’s equation for the cycloid indeed lends itself very well to calculation. The steps of this calculation do not correspond to the standard modern solution and
arguably compare favorably to it. A key ingredient of the modern proof is the use of a trigonometric addition formula to express $\cos^2(\theta)$ in terms of $\cos(2\theta)$. The Leibnizian proof does not need this technical machinery. The modern approach ultimately reduces the integral to the standard antiderivative $\int \cos(\theta)\,d\theta = \sin(\theta)$. The Leibnizian approach does the same thing in the step $\int y\,ds = x$, which is nothing but a different way of saying $\int \cos(\theta)\,d\theta = \sin(\theta)$. Thus, the Leibnizian calculus contains the exact equivalent of our standard trigonometric derivatives and antiderivatives, but expresses them in other terms.

What is the tangent of the cycloid at any given point? In fact, the tangent passes through the top point of the rolling circle. This too is derived with quite elaborate trigonometric machinery in Simmons (1996) and other standard textbooks. From Leibniz’s equation, we can get this result without any trigonometry. For we find that

$$\frac{d}{dy}X = \frac{d}{dy}(x + s) = \frac{d}{dy}\left(\sqrt{1-y^2} + \int_y^1 \frac{dy}{\sqrt{1-y^2}}\right) = -\frac{y}{\sqrt{1-y^2}} - \frac{1}{\sqrt{1-y^2}} = -\frac{1+y}{\sqrt{1-y^2}} = -\frac{(1+y)\sqrt{1-y^2}}{1-y^2} = -\frac{\sqrt{1-y^2}}{1-y} = -\frac{x}{1-y}$$

which indeed corresponds to ΔX/ΔY in the triangle in Fig. 2, as claimed. So, in this case too, the Leibnizian calculus handles the problem quite elegantly without any need for trigonometric functions. (In modern terms the differentiation of the integral $\int_y^1 dy/\sqrt{1-y^2}$ is nothing but the derivative of the arccosine, but re-expressing it in such terms adds nothing of value or substance.)

These examples show that although the calculus of Leibniz and Newton is different from ours, it should not be assumed inferior. Modern calculus books leave us with the impression that “the only convenient way of representing a cycloid is by means of parametric equations” involving trigonometric functions, and so on for various other problems such as harmonic oscillation. Seeing, then, that Newton and Leibniz used other methods, one may be inclined to assume that their methods are inherently clumsier, and that these historical mathematicians would have recognized as much on the spot if we explained it to them; on this view, the only reason they didn’t take this step must have been that they were conceptually limited by old ways of thinking. But the above examples suggest that the sense of superiority of the anachronistic modern perspective is hubristic. There are in fact ways in which the older approach not only readily matches what the modern approach can do, but even has a number of outright advantages over it. This suggests that Newton and Leibniz may very well have adhered to their style of calculus as a matter of conscious choice, with full awareness of the possibility of a more formula- or function-based approach. In fact, there is even one intriguing bit of direct textual support for this interpretation.
Although trigonometric formulas are completely absent from all calculus publications for several decades from the inception of the calculus onwards, there is one obscure early manuscript in which Newton does precisely what the hypothesis of
a conceptual limitation postulated that he could not conceive: namely, he does use trigonometric formulas in calculus calculations, very much in the style of Euler or a modern textbook. In this manuscript from around 1680, Newton repeatedly writes “fl Cos A” – the fluxion of the cosine of the variable angle A – in the course of one particular problem (MS Add.3963.7, 55r; MP.IV.459). Despite seemingly never working with such expressions before or since, Newton handles them effortlessly and incorporates them in his calculations without ado. It seems that Newton opted for this mode of expression in this particular problem because of its complexity; the more geometric phrasing he normally used (including in the other problems in this very manuscript) would be very verbose in this case, which would probably impede the clarity and overview of the calculations. Thus, Newton seems to have treated “fl Cos A” as a trivial shorthand. This shows that there was certainly no conceptual obstacle that held Newton back from using such expressions more extensively. Clearly, he realized that calculus could be built on such expressions, but consciously opted against it.
3 Introduction of Trigonometric Functions by Euler
Trigonometric functions were eventually introduced into the standard calculus repertoire by Euler. As late as 1754, this was still referred to as the “new calculus of sines” (Katz 1987, 316; Euler Opera 1.14.543). The context that led Euler (1739) to introduce this “new calculus” was the differential equation

$$2a\ddot{s} + \frac{s}{b} + \frac{a}{g}\sin\frac{t}{a} = 0,$$

a periodically forced harmonic oscillator, whose solution is

$$s(t) = D\cos\frac{t}{\sqrt{2ab}} + C\sin\frac{t}{\sqrt{2ab}} - \frac{a^2 b\,\sin\frac{t}{a}}{g(a - 2b)},$$

unless 2b = a, in which case

$$s(t) = D\cos\frac{t}{a} + C\sin\frac{t}{a} + \frac{at}{4g}\cos\frac{t}{a}.$$
Thus, unlike the simple harmonic oscillator that had long been handled by geometric paraphrase, the solutions of this differential equation involve combinations of trigonometric functions with different periods and trigonometric components multiplied by a nonconstant function (Fig. 3). The complexity of these solutions makes it very difficult to replace them with a geometrical description or to do without formulas of this type. As Euler put it, “there appear . . . motions so diverse and astonishing that one is unable altogether to foresee until the calculation is finished” (Katz 1987, 318). Just as Newton had been led to use cosine formulas more than half a century earlier in a particularly convoluted problem, so also for Euler the need to introduce such functions arose from the complexity of the problem. In this way, it is possible to regard Euler’s step not as a change in outlook, but merely as a consequence of the same priorities that were present even in Newton’s time: the same outlook, consistently applied, leads to different styles in different contexts.

Fig. 3 Examples of solutions to the periodically forced harmonic oscillator considered by Euler (1739). Panel parameters: a = 3, b = 1, g = 1; a = 1, b = 3, g = 1; a = 1, b = 1, g = 3; a = 2, b = 1, g = 1; each with s(0) = 0 and s′(0) = 1
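The closed-form solutions quoted in this section can be checked against the differential equation numerically. The Python sketch below is an added illustration, not Euler’s own computation. It evaluates the residual of 2a·s̈ + s/b + (a/g)·sin(t/a) with a finite-difference second derivative, and it picks the constants C and D for the initial conditions s(0) = 0, s′(0) = 1 reported for Fig. 3. The function names and the step size h are hypothetical choices, and the formulas for C follow from differentiating the quoted solutions at t = 0.

```python
import math


def euler_solution(t, a, b, g, C, D):
    """Solution of the forced oscillator as quoted in the text."""
    if math.isclose(2 * b, a):
        # Resonant case 2b = a: a trigonometric term times the factor t.
        return (D * math.cos(t / a) + C * math.sin(t / a)
                + a * t / (4 * g) * math.cos(t / a))
    w = 1.0 / math.sqrt(2 * a * b)
    forced = a * a * b * math.sin(t / a) / (g * (a - 2 * b))
    return D * math.cos(w * t) + C * math.sin(w * t) - forced


def fig3_constants(a, b, g):
    """Constants (C, D) giving s(0) = 0 and s'(0) = 1."""
    if math.isclose(2 * b, a):
        return (a * (1.0 - a / (4 * g)), 0.0)
    w = 1.0 / math.sqrt(2 * a * b)
    return ((1.0 + a * b / (g * (a - 2 * b))) / w, 0.0)


def ode_residual(t, a, b, g, C, D, h=1e-3):
    """Left-hand side of the equation; close to zero for a true solution."""
    def s(u):
        return euler_solution(u, a, b, g, C, D)
    s_ddot = (s(t + h) - 2 * s(t) + s(t - h)) / (h * h)
    return 2 * a * s_ddot + s(t) / b + (a / g) * math.sin(t / a)
```

For example, `ode_residual(3.0, 2, 1, 1, *fig3_constants(2, 1, 1))` evaluates the resonant case 2b = a at t = 3; a residual near zero (up to finite-difference error) supports the reconstruction of the formulas.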
4 Logarithmic Functions
Just as trigonometric functions were absent from the early calculus, so was the logarithm function log(x). Despite explicitly dealing with the solution to the differential equation $dy/dx = a/x$ and referring to logarithmic curves by name in this connection, early calculus works by, for example, Leibniz (1684) and Johann Bernoulli (1692) refrain from using any kind of expression such as log(x) in a calculus setting. This could be interpreted as a shortcoming that hampered these works (Jahnke 2003, 111; Kowalewski 1914, 169). Interestingly, however, manuscript evidence shows that the reluctance to treat log(x) as one of the fundamental functions of the calculus cannot be attributed to an inability to conceive of the possibility of doing so. For Leibniz demonstrably did conceive of it years before, when he wrote “Log y” in a calculus formula in a 1675 manuscript (AA.VII.5.325). Clearly, then, his decision not to write any formulas involving log(x) even when dealing with problems involving logarithms must be considered a conscious choice rather than unreflective adherence to older modes of expression.

Despite these notable early absences, the logarithm function entered official calculus discourse much sooner than trigonometric functions. The first published use of a logarithm expression in a calculus formula appears to be Leibniz (1694, 369), who writes “log. x.” Despite never having used this notation before in print, Leibniz simply starts using it in medias res without any indication that he has expanded the core repertoire of calculus functions, and with no qualms about taking its properties (such as “log. x = ∫ dx : x”) for granted as if they were common knowledge. This makes sense if he regarded “log. x” as a trivial shorthand for what was already well known in other (primarily verbal) terms. Explicit uses of the logarithm function in calculus formulas did not catch on very quickly.
Leibniz himself only used it passingly on one other occasion in print (Leibniz 1695, 314). Manfredi (1707), in what is effectively the first published integral calculus textbook, used the logarithm function quite extensively. Manfredi’s
notation is “lx” for log(x). Johann Bernoulli also used this notation on a number of occasions, yet as late as 1716 he still felt the need to explain that “by lx I understand the logarithm of that x” (Bernoulli 1716, 228). Thus, even two decades after its first appearance in print, Johann Bernoulli evidently considered the logarithm as a calculus function still not fully established as an elementary part of the standard vocabulary and notation of the calculus. To conclude, trigonometric and logarithmic functions are seen as an indispensable part of calculus today, but historically the calculus was up and running for a decade before the logarithm function was used in a calculus context, and half a century before trigonometric functions followed suit. This was not for lack of occasion to use such functions: on the contrary, plenty of situations in which we would use these functions were encountered, and often explicitly recognized as concerning trigonometric or logarithmic relations. But this was expressed in verbal and geometric prose, rather than by explicit formulas. From a modern point of view, this may strike us as potentially a sign of a cognitive limitation: perhaps these early practitioners of the calculus were inhibited by an older geometrical and prosaic paradigm of mathematical thought that prevented them from embracing the power of a more modern formal and purely analytic approach. If we knew only the published works of these mathematicians, then the historical record could be construed as fitting this hypothesis. After all, the calculus community relatively soon decided to favor the analytic style rather than the original approach. Nevertheless, a strong case can be made that this anachronistically tempting hypothesis misses the mark. 
Manuscript evidence shows that the analytic approach – in the form of explicit use of trigonometric and logarithmic expressions in calculus formulas – was in fact considered by the creators of the calculus at a very early date, even though they generally opted against this approach in their published works. Indeed, one could argue that the approach they did take was the one best suited for their purposes. The strength of the modern analytic approach over the older style lies especially in handling complicated relationships with many components and parameters. So instead of seeing the transition from geometric to analytic expression as one from ignorance to enlightenment, one can interpret it as the natural outcome of sound preferences that remained consistent throughout: the analytic approach was introduced precisely when the occasion called for it – that is, when the efficiency and generality of formula-crunching outweighed the intuitive appeal of geometric expression.
5 Dimensional Homogeneity
Requiring equations to be dimensionally homogeneous was another geometrically motivated choice persistent in the early calculus but at odds with later analytic practice (Bos 1974, 7). Thus a “length” such as y cannot equal an “area” such as $x^2$, for instance, so one would rather write $ay = x^2$ to get a dimensionally balanced equation. This goes hand in hand with an emphasis on geometric interpretation
rather than analytic formulas. Thus, for example, Johann Bernoulli, in his calculus lectures of 1692, writes the differential equation for the exponential curve as ydx = ady, and when he separates the variables to dx = ady : y he then proceeds, for the sake of geometrical interpretation, to explicitly multiply by a to get adx = aady : y, which he then interprets visually as an equality of areas (Johann Bernoulli Opera III.421; Blåsjö 2017, 84). Bernoulli describes the resulting curve verbally and geometrically as a “Logarithmica,” but has no symbolic notation for this at this time. By contrast, Euler, in his calculus textbook, happily works with dimensionally unbalanced equations such as dy = dx/x (Euler 1755, § 180) and treats them purely analytically.

But again it would be a mistake to think of the geometric conservatism of the early calculus as a limitation of thought. Just as Leibniz privately used analytic notation for logarithm functions, so also in early manuscripts he was much more lax about working with equations violating dimensional homogeneity than in his later published works (Bos 1974, 12; Leibniz AA VII.5.xxxvi). A 1694 letter by Johann Bernoulli (Leibniz AA III.6.167) is explicit about what was probably widespread practice: working with a differential equation, he finds that a certain expression “= dy” but then immediately adds: or for the sake of dimensional homogeneity “= bdy” where b is a new constant introduced solely to balance the equation dimensionally. That is to say, Bernoulli evidently had no problems working in an analytic mode that ignored formal geometric convention when expedient, only to then translate the end result into official form as a veritable afterthought. One finds the same type of reasoning in Leibniz’s letters as well (Leibniz AA III.5.188). This suggests that the convention of dimensional homogeneity was consciously adopted for the positive insights it could bring, without prejudice to other modes of reasoning.
Indeed, dimensional analysis is a useful heuristic tool to this day (Pólya 1957, 202–205). Leibniz explicitly points out its usefulness as a check on calculations (AA VII.5.292). It was also very useful for purposes of geometric interpretation (the two examples from Bernoulli just mentioned are instances of this), which was often sensibly pursued in the early calculus as we have seen.

A related respect in which informal private practice was more flexible than formal expression was with regard to curve plotting. Impressively, accurate figures are quite common in the early calculus, but accompanying text is often couched in formal language quite aloof from the concrete, hands-on perspective that must have gone into producing the figures. Again, private manuscripts reveal a greater flexibility of thought than official discourse. Leibniz’s description of the catenary $y = (e^x + e^{-x})/2$, for example, is in its published form rather stiltedly formulated in terms of the classical language of ratios. Privately, however, he operated with a more freewheeling numerical approach, including using a decimal representation of e that he never revealed in print (Raugh and Probst 2019).

A similar example concerns how Jakob Bernoulli corrected his mistaken belief that the radius of curvature of any curve must be infinite at an inflection point. On intuitive grounds, Leibniz and Bernoulli had believed this to be a general principle. Indeed, it holds for simple inflection points, but there are exceptions. For instance, $x^3 = y^5$ has an inflection point at the origin, yet the radius of curvature there is zero.
This is because there are really multiple singularities in one at that point. Bernoulli deals with this by resolving the singularity. Instead of $x^3 = y^5$, consider $x^3 = y^5 - b^2y^3$ for small b. This pulls the singularities apart. Now, the inflection point at the origin is simple again and the general rule holds: infinite radius of curvature. Bernoulli published this finding, but it is interesting to compare his manuscript notes with the published account (Bernoulli 1999, 151–154, 255–258). First of all, it is striking that Bernoulli writes his equations in dimensionally homogeneous form ($aax^3 = y^5$ and $aax^3 = y^5 - bby^3$) in his published article, but uses unbalanced equations ($x^3 = y^5$ and $x^3 = y^5 - bby^3$) in his private notes. Furthermore, in his notes Bernoulli used detailed numerical calculations to plot the curve and explore its behavior. A table of x and y values, computed to two decimal places, occurs right next to the figure. This is suppressed or deemed unworthy of inclusion in the published version, where the figure appears without such supporting calculations and the formal presentation gives the impression that the conclusions were reached by abstract reasoning in eminently classical form. The look behind the scenes, at the numerical curve plotting and visual checks in Bernoulli’s private notebook, reminds us once again that official published expression can give a misleading impression of the underlying thought process.
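The kind of numerical exploration described above is easy to imitate. The Python sketch below is an added illustration in the spirit of such a table, not a reconstruction of Bernoulli’s notes: the value of b, the grid of y values, and all function names are hypothetical. It tabulates points of the curve x³ = y⁵ − b²y³ to two decimal places and evaluates the radius of curvature near the origin, using derivatives of x with respect to y worked out by hand from the equivalent form x = y·∛(y² − b²).

```python
import math


def cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def bernoulli_x(y, b):
    """x on the curve x^3 = y^5 - b^2 y^3, written as x = y * cbrt(y^2 - b^2)."""
    return y * cbrt(y * y - b * b)


def radius_of_curvature(y, b):
    """Radius of curvature at height y, treating the curve as x(y)."""
    c = cbrt(y * y - b * b)
    if c == 0.0:
        raise ValueError("derivatives of x(y) are undefined at this point")
    x1 = c + 2 * y * y / (3 * c * c)                    # dx/dy
    x2 = 2 * y / (c * c) - 8 * y ** 3 / (9 * c ** 5)    # d2x/dy2
    if x2 == 0.0:
        return math.inf
    return (1 + x1 * x1) ** 1.5 / abs(x2)


def table(b, ys):
    """Rows (y, x) rounded to two decimal places."""
    return [(round(y, 2), round(bernoulli_x(y, b), 2)) for y in ys]
```

With b = 0 the radius returned by `radius_of_curvature` shrinks as y approaches 0, while for a small positive b it grows without bound, which matches the contrast between the degenerate and the resolved inflection point described in the text.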
6 Did Barrow Prove the Fundamental Theorem of Calculus?
Isaac Barrow (1670) proved certain geometric theorems that could be interpreted as equivalents of the fundamental theorem of calculus. Barrow was writing before the introduction of the concepts and symbolism of derivatives and integrals. He speaks in purely geometrical terms, such as areas and tangents of curves. Should he nevertheless be regarded as having had an insight effectively equivalent to the fundamental theorem of calculus? Some have answered yes (Child 1916, vii, 31; Heath 1917, 133; Nauenberg 2014, 343), and seem to have judged that the direct translatability of Barrow’s theorems into this modern form is in and of itself a compelling reason to accept this conclusion. At the other extreme, some emphasize precisely the geometric form and absence of symbolism as a conclusive reason to deny Barrow this insight (Bos 1980, 64–65; Wagner 2001; Sonar 2018, 55). It is advisable to steer clear of both of these extremes when trying to understand the meaning and significance of historical texts. Though these two interpretations are opposites, they share an excessive emphasis on the statement of theorem and proof in isolation. A sounder approach is to investigate how these theorems functioned in Barrow’s thought as a whole, and how he saw them fitting into a bigger picture. Historians who have taken such a perspective have tended to came down against the claim that Barrow’s theorems are equivalent to the fundamental theorem of calculus (Whiteside 1961, 367–368; Mahoney 1990, 236; Katz 2009, 539). When taking this balanced approach, the historian can use modernized notation with benefit, as a tool for clarification and analysis, without falling into either of the opposite traps of the “mathematician’s” anachronistic naiveté and the “historian’s” knee-jerk rejection of all anachronistic devices. Let us take Barrow’s relation to the
Fig. 4 The subtangent σ and subnormal η of a given curve
fundamental theorem of calculus as a case study illustrating this methodology. Thus, we shall first translate Barrow’s results into modernized form, while remaining agnostic as to whether this is a faithful representation of his thought or not. This will enable us to see what the global organization of Barrow’s text says about how he viewed the theorems in question. The fundamental theorem of calculus has two parts:

$$\frac{d}{dt}\int_a^t y(x)\,dx = y(t) \tag{FTC1}$$

$$\int_a^b y'(x)\,dx = y(b) - y(a) \tag{FTC2}$$

By choosing the coordinate system so that $y(0) = 0$ (which corresponds to Barrow’s geometrical treatment), and using the notation $Y(x) = \int_0^x y(t)\,dt$, we can write this as

$$\frac{d}{dx}Y = y \tag{FTC1}$$

$$\int_0^x y'(t)\,dt = y(x) \tag{FTC2}$$
What we would interpret as results about derivatives, Barrow expresses in terms of subtangents and subnormals (i.e., the distance from the point where the tangent or normal cuts the axis to the point on the axis perpendicularly below the point on the curve in question; Fig. 4). In modernized language, the subtangent is $\sigma(y) = y/y'$ and the subnormal is $\eta(y) = yy'$. The purported equivalent of FTC1 is Barrow’s X.11. In our modernized form, it says $\sigma(Y) = Y/y$. Using the expression for the subtangent in terms of derivatives, we see that this is indeed equivalent to $Y' = y$, or FTC1. The question is: is this theorem, in Barrow’s mind, tied to the specific geometrical configuration, or is it a more structural and general result that it would be natural to call upon in any situation when we seek the rate of change of any anti-derivative or accumulation function? Barrow’s own text – and in particular his treatment of a polar version of X.11 – gives us strong reason to favor the former rather than the latter interpretation, as follows.
Fig. 5 The polar subtangent ς and polar subnormal ν
Throughout his treatise, Barrow interleaves results of the above form (in effect, assuming perpendicular rectilinear axes) with analogous results for curves defined in polar terms. In modernized form, $R(\theta) = \int_0^\theta (r(\vartheta))^2/2\,d\vartheta$ is the area-counting function of the polar curve $r(\theta)$ (that is to say, $R(\theta)$ equals the area swept out by $r(\theta)$ from 0 to $\theta$). For polar curves, the subnormal and subtangent are defined not as segments of the x-axis, but as segments of the line perpendicular to the radial line through the origin point (Fig. 5). The polar subtangent is $\varsigma(r) = r^2/r'$. The polar subnormal is $\nu(r) = r'$. Barrow’s Proposition X.13 is a polar analog of X.11. It says: For the tangent at the point where $r(\theta)$ and $R(\theta)$ intersect, $\varsigma(R) = 2$ (equivalently: $R' = R^2/2$). (Formally, the theorem only enables us to find the tangent at one particular point, but this is not an essential restriction. To find the tangent at another point on the curve $R(\theta)$, say where $R = kr$, we can let $\bar{r} = r/k$, and apply the theorem to the point of intersection of $\bar{r}(\theta)$ and $R(\theta)$, giving $\varsigma(R) = 2/k^2$.) Barrow proves X.13 from first principles, using the geometry of the polar differential triangle of $R(\theta)$. He presents X.13 as a “theorem of the same kind” as X.11, but he makes no use of the latter in the proof of X.13. In modern terms, we could instead reason as follows. We know that $\varsigma(R) = R^2/R'$ for any polar curve from the geometry of the polar subtangent. In this case, $R(\theta)$ is the area-counting function of $r(\theta)$, which means that $R(\theta) = \int_0^\theta (r(\vartheta))^2/2\,d\vartheta$. We can easily find its derivative using FTC1. It is $R'(\theta) = r(\theta)^2/2$. So the polar subtangent is $\varsigma(R) = 2R^2/r(\theta)^2$. In particular, when $R = r$, this simplifies to $\varsigma(R) = 2$, which is Barrow’s theorem X.13. This is a much shorter and easier proof than the elaborate deduction from first principles used by Barrow.
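The modern argument just given can be checked numerically for a sample curve. The following Python sketch is only an illustration of the modernized reasoning, not of Barrow’s procedure: the choice $r(\theta) = e^{\theta}$, the helper names, and the finite-difference derivative are all assumptions made for the example.

```python
import math

def r(theta):
    """Sample polar curve (an arbitrary illustrative choice)."""
    return math.exp(theta)

def R(theta):
    """Area-counting function of r: integral of r(t)^2 / 2 from 0 to theta.
    For r = e^t this has the closed form (e^(2*theta) - 1) / 4."""
    return (math.exp(2 * theta) - 1) / 4

def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def polar_subtangent(f, theta):
    """Modernized polar subtangent f^2 / f'."""
    return f(theta) ** 2 / derivative(f, theta)

# Intersection of r and R: with u = e^theta, (u^2 - 1) / 4 = u gives u = 2 + sqrt(5).
theta_star = math.log(2 + math.sqrt(5))

gap = abs(R(theta_star) - r(theta_star))
subtangent_at_intersection = polar_subtangent(R, theta_star)
print(gap, subtangent_at_intersection)
```

Under these assumptions the printed gap should be negligible and the subtangent should come out close to 2, in line with the modernized statement of X.13.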
The fact that Barrow did not use X.11 to prove X.13 in this way is thus quite compelling evidence that he lacked precisely that way of thinking about X.11 that warrants giving FTC1 the epithet “fundamental.” Let us now turn to FTC2. In Lecture XI, Barrow presents “some Theorems . . . relating to the Mensuration of Magnitudes by Tangents or Perpendiculars to Curves.” That is to say, this lecture explains how to find areas in terms of tangents – or integrals in terms of derivatives. Indeed, we find a geometric equivalent of FTC2 in Barrow’s XI.19. As in the case of FTC1, Barrow’s statement and proof of
Table 2 Results proved by Barrow (1670) in modernized form

Proposition   Result
XI.1          $\int \eta\,dx = y^2/2$
XI.2          $\int \eta y\,dx = y^3/3$
XI.3          $\int \eta y^2\,dx = y^4/4$
XI.4          $\int \eta y^3\,dx = y^5/5$
XI.5          $\int Yy\,dx = Y^2/2$
              $\int \sqrt{Y}\,y\,dx = 2\sqrt{Y}^{\,3}/3$
XI.7          $\int (\eta + x)\,dx = \tfrac{1}{2}\left(\sqrt{x^2 + y^2}\right)^2$
XI.10         $\int \sigma\,dy = Y$
XI.12         $\int \sigma y\,dy = \int y^2\,dx$
XI.13         $\int \sigma y^2\,dy = \int y^3\,dx$
XI.16         $\int \sigma^2\,dy = \int \sigma y\,dx$
XI.17         $\int \sigma^3\,dy = \int \sigma^2 y\,dx$
XI.19         $\int y'\,dx = y$
XI.20         $\int (y')^2\,dx = \int y'\,dy$
XI.21         $\int (y')^3\,dx = \int (y')^2\,dy$
Theorem XI.19 could quite reasonably be construed as corresponding directly to FTC2. Unless one takes a fundamentalist zero-tolerance hardline against any interpretation that transcends surface expression, there is nothing in Barrow’s formulation of XI.19 that explicitly indicates any essential conceptual difference between it and FTC2. In other words, Barrow’s XI.19 could be thought of as a full-fledged FTC2. But the more interesting question is: did Barrow think of it this way? To answer this question, we must not focus on XI.19 itself, but rather look at the context in which this theorem occurs. Much as in the case of FTC1, a contextual perspective strongly suggests that Barrow did not think of XI.19 as fundamental or a conceptual cornerstone of the entire chapter. Rather, he seems to have viewed it as one particular geometrical result among many of a similar kind. Excluding polar curves, the area results that Barrow presents in Lecture XI are shown in Table 2. In modern terms, it is trivial to verify these results by substituting the respective derivative expressions for $\eta$ and $\sigma$ and using FTC2 as well as basic “Leibnizian algebra” with differentials such as $\frac{dy}{dx}\,dx = dy$. Today, we think of FTC2 as much more generic and prototypical than the other theorems in Barrow’s Lecture XI. But in the geometrical language used by Barrow, it does not have an evidently exceptional status. XI.19 is not any more generic than XI.1 or XI.10 or even XI.7, which has a very direct geometrical meaning since $\eta + x$ is the distance from origin to end of subnormal and $\sqrt{x^2 + y^2}$ is the distance from origin to the relevant point $(x, y)$ on the graph of $y(x)$. Barrow proves these results by direct area considerations and not by verifying by differentiation that the right-hand sides are antiderivatives of the integrands, as we would in light of FTC2. He never uses XI.19 for such a purpose.
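To make the modern verification concrete, the following Python sketch checks four of the modernized results (XI.1, XI.7, XI.10, and XI.19) numerically for one sample curve. It is an illustration under stated assumptions, not a reconstruction of Barrow’s method: the curve $y(x) = e^x - 1$ (chosen so that $y(0) = 0$ and $y'$ never vanishes), the function names, and the use of Simpson’s rule are all hypothetical choices.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def y(x):
    """Sample curve with y(0) = 0 (illustrative choice)."""
    return math.exp(x) - 1

def dy(x):
    """Derivative y'(x) of the sample curve."""
    return math.exp(x)

def Y(x):
    """Area function: integral of y from 0 to x, in closed form."""
    return math.exp(x) - 1 - x

def eta(x):
    """Subnormal y * y'."""
    return y(x) * dy(x)

def sigma(x):
    """Subtangent y / y'."""
    return y(x) / dy(x)

x1 = 1.0
# Each entry pairs a numerically integrated left-hand side with the closed-form
# right-hand side; integrals "with respect to y" are rewritten using dy = y'(x) dx.
checks = {
    "XI.1": (simpson(eta, 0, x1), y(x1) ** 2 / 2),
    "XI.7": (simpson(lambda x: eta(x) + x, 0, x1), (x1 ** 2 + y(x1) ** 2) / 2),
    "XI.10": (simpson(lambda x: sigma(x) * dy(x), 0, x1), Y(x1)),
    "XI.19": (simpson(dy, 0, x1), y(x1)),
}
for name, (lhs, rhs) in checks.items():
    print(name, lhs, rhs)
```

For each proposition the two printed numbers should agree to within the accuracy of the quadrature, which is the numerical counterpart of verifying the table by differentiation.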
There is no indication that he conceived of the possibility of taking XI.19 to be the centerpiece
of the theory in this sense (or any other sense for that matter). Admittedly, XI.19 and its variations occur at the end of this sequence of results, which might suggest that it is the key result that crowns the entire development. Conceivably, Barrow might have perceived the fundamental nature of this result, yet opted to include and prove the previous theorems independently, perhaps for pedagogical, illustrative, or reference purposes. This possibility can be ruled out, however, by some revealing remarks later in the work. In Appendix III to Lecture XII, Barrow offers a certain Theorem IV which he praises as “the most Fertile of all the Propositions foregoing. The greater Part of them being either contained in it, or deduced easily from it.” The theorem in question may be considered a general result on change of variables. In slightly modernized terms, it states that if $y(x)$, $g(x)$, $h(y)$ are functions such that $\frac{\eta(y)}{y(x)} = \frac{g(x)}{h(y)}$ for all $x$ (which in modern terms is equivalent to $g = y'h$) then $\int g\,dx = \int h\,dy$. Barrow notes the following special cases. If $h = 1$, we get XI.19. If $h = y$, we get XI.1. If $y = g$, we get XI.10. From this Barrow concludes: “I cannot but accuse my Foresight, for not having first laid down this Theorem . . . and then having deduced the rest from it, . . . which I observe may be done.” This very strongly suggests that Barrow did not see the possibility of using his FTC2-equivalent XI.19 to unify the entire sequence of results in Lecture XI. He expressly admits that he would have loved to give such a unified treatment if he had seen a way of doing so. Only later did he realize that this was in fact possible, and even then he still did not see XI.19 as fundamental but rather saw the unification as coming from another theorem altogether that does not correspond to the modern view of the centrality of the FTC.
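The three special cases can be spelled out by substituting into the modernized condition $g = y'h$. The lines below are a worked reconstruction in the notation used in this section, not Barrow’s own formulation:

```latex
\begin{align*}
h = 1 &\;\Rightarrow\; g = y',
  & \int y'\,dx &= \int dy = y
  && \text{(XI.19)}\\
h = y &\;\Rightarrow\; g = yy' = \eta,
  & \int \eta\,dx &= \int y\,dy = \tfrac{1}{2}y^2
  && \text{(XI.1)}\\
g = y &\;\Rightarrow\; h = y/y' = \sigma,
  & \int \sigma\,dy &= \int y\,dx = Y
  && \text{(XI.10)}
\end{align*}
```

In each case the right-hand column is the corresponding entry of Table 2, which is the sense in which Theorem IV contains these results.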
Thus, in the case of Barrow’s purported equivalents of both FTC1 and FTC2, Barrow misses major opportunities to use them to do the kind of work that FTC1 and FTC2 would do for us. This suggests that Barrow’s theorems should not be considered equivalent to FTC1 and FTC2. The mere fact that Barrow used geometrical language does not in and of itself entail this conclusion. On the contrary, the early pioneers of the calculus often used geometrical language similar to that of Barrow, yet showed no signs of being constrained in their thinking when the occasion called for formal, nongeometrical applications of calculus principles abstractly in ways that are not visualizable in terms of tangents and areas (e.g., Engelsman 1984, Chap. 2). Hence a similar surface form of expression can be in one case indicative of a genuine conceptual limitation and in another case not. Rather than focusing on the surface form in which an idea is expressed, attention to its functional role in the author’s broader argument is a better indicator of which aspects of the form of expression are incidental or essential to the underlying thought. Reconstructions in modernized mathematical terms can be a useful tool for such purposes.
7 Conclusion
The early calculus used geometrical language and shunned the analytic style that came to predominate later. Many historians have tended to view this as a major conceptual divide (e.g., Speiser 2008, 108, 130). More mathematically inclined scholars, however, have been inclined to push back against this perspective and perceive more continuity in the underlying ideas than the surface form of expression would suggest (e.g., Fraser 2020, 198). The examples from Newton, Leibniz, and Bernoulli analyzed above weigh in favor of the mathematical interpretation. With regard to Barrow’s purported geometrical version of the fundamental theorem of calculus, on the other hand, we have a reverse state of affairs. Mathematically inclined readers have been those most ready to accept the equivalence, while the historians’ perception of conceptual discontinuity is borne out by closer analysis. In the former case, the mathematician’s intuition was best supported by the historian’s methods: The mathematician’s sense that the early practitioners of the calculus did not lack the ability to reason in a manner functionally equivalent to the modern analytic approach is borne out by textual evidence that they indeed explicitly used more analytic approaches in private manuscripts that in many cases have only recently been published through meticulous efforts of specialized historians. Conversely, in the Barrow case, the historian’s intuition is arguably best supported by the mathematician’s methods: A thoroughly modernistic reconstruction of Barrow’s reasoning, far from being tantamount to a simplistic anachronistic fallacy, makes a compelling case against accepting the equivalence of Barrow’s theorems and their modern counterparts. Efforts to understand past mathematical thought are well served by drawing on the strengths of each of these diverse perspectives and approaches.
8 Cross-References
▶ Christiaan Huygens: A XVIIth Century Mathematician Working in the Tradition of Archimedes and Apollonius
▶ Descartes’ Transformation of Greek Notions of Proportionality
▶ Heuristics and Mathematical Practice
▶ Historiography of Mathematics
References
Barrow I (1670) Lectiones geometricae: in quibus (praesertim) generalia curvarum linearum symptomata declarantur, London. Reprinted in William Whewell (ed) The mathematical works of Isaac Barrow. Cambridge University Press, 1860, 155–316. English translations in Barrow (1735) and Child (1916)
Barrow I (1735) Geometrical lectures: explaining the generation, nature and properties of curve lines. Stephen Austen, London
Bernoulli J (1999) Die Werke von Jakob Bernoulli, Bd. 5. Differentialgeometrie, edited by David Speiser, André Weil, & Martin Mattmüller. Birkhäuser, Basel
Bernoulli J (1692) Lectiones Mathematicae de Methodo Integralium, aliisque. In: Bernoulli J (ed) Opera III, pp 385–558. Partial German translation in Kowalewski (1914)
Bernoulli J (1716) Problema: data serie linearum per rectae in eadem Linea constantis variationem prodeunte invenire aliam seriem linearum, quarum quaevis priores omnes ad angulos rectos secabit. Acta Eruditorum:226–230
Blåsjö V (2017) Transcendental curves in the Leibnizian calculus. Elsevier, Oxford, United Kingdom
Bos HJM (1974) Differentials, higher-order differentials and the derivative in the Leibnizian calculus. Arch Hist Exact Sci 14(1):1–90
Bos HJM (1980) Newton, Leibniz and the Leibnizian tradition, chapter 2. In: Grattan-Guinness I (ed) From the Calculus to set theory, 1630–1910. Princeton University Press, Princeton
Child JM (1916) The geometrical lectures of Isaac Barrow. Open Court Publishing, Chicago and London
Engelsman SB (1984) Families of curves and the origins of partial differentiation, Elsevier North-Holland mathematics studies, vol 93
Euler L (1739) De novo genere oscillationum. Commentarii academiae scientiarum Petropolitanae 11:128–149. (presented 1739). Opera Omnia, Series 2, vol 10, 78–97. E126, 1750
Euler L (1755) Institutiones calculi differentialis, Saint Petersburg
Fraser C (2020) Review of Shank (2018). J Mod Hist 92(1):197–198
Heath TL (1917) Review of Child (1916). Math Gaz 9(130):131–134
Jahnke HN (2003) A history of analysis. American Mathematical Society, Providence, Rhode Island, USA
Katz VJ (1987) The Calculus of the trigonometric functions. Hist Math 14(4):311–324
Katz VJ (2009) A history of mathematics: an introduction, 3rd edn. Addison-Wesley
Kowalewski G (1914) Die erste Integralrechnung: Eine Auswahl aus Johann Bernoullis Mathematischen Vorlesungen über die Methode der Integrale und anderes, Ostwald’s Klassiker der exakten Wissenschaften 194. Engelmann, Leipzig & Berlin
Leibniz GW (1684) Nova methodus pro maximis et minimis, itemque tangentibus, quae nec fractas, nec irrationales quantitates moratur, & singulare pro illis calculi genus. Acta Eruditorum:467–473
Leibniz GW (1694) Constructio propria problematis de Curva Isochrona Paracentrica. Acta Eruditorum:364–375
Leibniz GW (1695) Responsio ad nonnullas difficultates, a Dn. Bernardo Niewentiit circa methodum differentialem seu infinitesimalem motas. Acta Eruditorum:310–316
Leibniz GW (AA) (1923–present ongoing) Sämtliche Schriften und Briefe, Gottfried-Wilhelm-Leibniz-Gesellschaft, Akademie Verlag, leibnizedition.de
Mahoney MS (1990) Barrow’s mathematics: between ancients and moderns. In: Feingold M (ed) Before Newton: the life and times of Isaac Barrow. Cambridge University Press, Cambridge
Manfredi G (1707) De constructione aequationum differentialium primi gradus, Bologna
Nauenberg M (2014) Barrow, Leibniz and the geometrical proof of the fundamental theorem of the Calculus. Ann Sci 71(3):335–354
Newton I (MP) (1967–1981) The mathematical papers of Isaac Newton, 8 vols, edited by D. T. Whiteside. Cambridge University Press, Cambridge
Pólya G (1957) How to solve it, 2nd edn. Princeton University Press, Princeton
Raugh M, Probst S (2019) The Leibniz catenary and approximation of e – an analysis of his unpublished calculations. Hist Math 49:1–19
Simmons GF (1996) Calculus with analytic geometry, 2nd edn. McGraw-Hill, New York, USA
Sonar T (2018) The history of the priority dispute between Newton and Leibniz. Birkhäuser, Basel, Switzerland
Speiser D (2008) Discovering the principles of mechanics 1600–1800, edited by Kim Williams & Sandro Caparrini. Birkhäuser
Wagner J (2001) Barrow’s fundamental theorem. Coll Math J 32(1):58–59
Whiteside DT (1961) Patterns of mathematical thought in the later seventeenth century. Arch Hist Exact Sci 1(3):179–388
An Ethnoarithmetic Excursion into the Javanese Calendar
Natanael Karjanto and François Beauducel
Contents
1 Introduction
2 Ancient and Modern Calendars
2.1 Pre-Gregorian Calendars
2.2 Gregorian Calendar
3 Javanese Calendar
3.1 Where Is Java?
3.2 Who Are Javanese People?
3.3 A Background of the Javanese Calendar
3.4 Some Characteristics of the Javanese Calendar
3.5 Computer Implementation of the Javanese Calendar
4 Discussion and Epilogue
4.1 Javanese View of Time
4.2 Pranatamangsa Revisited
4.3 Conclusion
References
Abstract
A perpetual calendar, a calendar designed to find out the day of the week for a given date, employs a rich arithmetical calculation using congruence. Zeller’s congruence is a well-known algorithm to calculate the day of the week for any Julian or Gregorian calendar date. Another, lesser-known perpetual calendar has been used for nearly four centuries among Javanese people in Indonesia. This Javanese calendar combines the Saka Hindu, lunar Islamic, and western Gregorian calendars. In addition to the regular 7-day, lunar month, and lunar year cycles, it also contains 5-day pasaran, 35-day wetonan, 210-day pawukon, octo-year windu, and 120-year kurup cycles. The Javanese calendar is used for cultural and spiritual purposes, including couples’ decisions about marriage. In this chapter, we will explore the relationship between mathematics and the culture of Javanese people and how they use their calendar and the arithmetic aspect of it in their daily lives. We also propose an unprecedented congruence formula to compute the pasaran day. We hope that this excursion provides an insightful idea that can be adopted for teaching and learning of congruence in number theory.

Keywords
Javanese calendar · Ethnomathematics · Pasaran · Wetonan · Pawukon · Windu · Kurup · Congruence · Indonesia

N. Karjanto (*)
Department of Mathematics, University College, Sungkyunkwan University, Suwon, Republic of Korea
F. Beauducel
Université de Paris, Institut de physique du globe de Paris, CNRS, Paris, France
Institut de recherche pour le développement, Research and Development Technology Center for Geological Disaster, Balai Penyelidikan dan Pengembangan Teknologi Kebencanaan Geologi (BPPTKG), Yogyakarta, Indonesia
© Springer Nature Switzerland AG 2021
B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_82-1
1 Introduction
Arithmetic and number theory find applications in various cultures throughout the world. In addition to solving everyday problems using elementary arithmetic operations, our ancient ancestors also developed and invented perpetual calendars without the aid of modern electronic calculators and computers. A perpetual calendar is a system dealing with periods of time that occur repeatedly. It is organized into days, weeks, months, and years. A date of a calendar structure refers to a particular day within that system. A calendar is used for various civil, administrative, commercial, social, and religious purposes. The English word “calendar” is derived from the Latin word calendae, which refers to the first day of the month in the Roman calendar. It is related to the verb calare (“to announce solemnly, to call out”), which refers to the “calling” of the new moon when it was visible for the first time (Brown 1993). Another source mentions that the modern English “calendar” comes from the Middle English calender, which was adopted from the Old French calendier. It originated from the Latin word calendarium, which meant a “debt book, account book, register.” In Ancient Rome, interest was tracked in such books, accounts were settled, and debts were collected on the first day (calends) of each month (Stakhov 2009). Generally, the calculations and periods in a calendrical system are synchronized with the cycle of the Sun or Moon. Hence the names solar, lunar, and lunisolar calendars, where the last combines both the Moon phase and the tropical (solar) year. Our current Gregorian calendar and the earlier Julian calendar are solar-based, as was the ancient Egyptian calendar. An example of the lunar calendar is the Islamic calendar. Prominent examples of the lunisolar calendar are the Hebrew,
Chinese, Hindu, and Buddhist calendars (Richards 1998; Reingold and Dershowitz 2018). Studying calendar systems from any culture cannot be separated from attempting to gain insight and understanding not only in the arithmetic behind them but also in the interconnectivity with mathematics, the history of mathematics, astronomy, mathematics education, and sociocultural aspects. Hence, the study of perpetual calendars is encompassed by the fields of ethnomathematics and ethnoastronomy, the research areas where diverse cultural groups embrace, practice, and develop mathematics and astronomy in their daily lives. Figure 1 summarizes a theoretical framework used for the discussion on the Javanese calendar considered in this chapter. Although the title of this chapter contains the term “ethnoarithmetic,” a study of the Javanese calendar involves aspects beyond the cultural and the mathematical (in this case, arithmetical) per se. As we observe at the third level of the Venn diagram presented in Fig. 1, an excursion into the Javanese calendar intersects and includes four distinct but related disciplines: mathematics, education, culture, and astronomy. Fundamental work in the area of ethnomathematics has been documented by Ascher (2002). The author elaborated on several mathematical ideas and their cultural embedding among people in traditional or small-scale cultures, with some emphasis on time structure and the logic of divination. In particular, the book also covered an explanation of the Balinese calendar. For discussion of the Balinese
Fig. 1 A theoretical framework for an ethnoarithmetic study of the Javanese calendar. At the first level of the intersection, ME denotes mathematics education, EM denotes ethnomathematics, MA denotes mathematics astronomy, AE denotes astronomy education, EA denotes ethnoastronomy, and CE denotes cultural education. At the second level of the intersection, EME denotes ethnomathematics education, MAE denotes mathematics astronomy education, EMA denotes ethnomathematics astronomy, and EAE denotes ethnoastronomy education. The third level of the intersection is the heart of discussion; we have JC as the Javanese calendar
calendrical system and its multiple cycles, consult Vickers (1990), Darling (2004), Proudfoot (2007), Ginaya (2018), and Gislén and Eade (2019c). To the best of our knowledge, the only monograph that discusses an intersection of three components of the theoretical framework is a volume edited by Rosa et al. (2017). The book covers diverse approaches and perspectives from ethnomathematics to mathematics education, particularly in several non-Western cultures. Since these topics stimulate debates not only on the nature of mathematical knowledge and the knowledge of a specific cultural group but also on the pedagogy of the mathematics classroom, the field of ethnomathematics certainly offers a possibility for improving mathematics education across cultures. A collection of essays dealing with the mathematical knowledge and beliefs of cultures outside the Western world is compiled by Selin and D'Ambrosio (2000). The essays address the connections between mathematics and culture, relate mathematical practices in various cultures, and discuss how mathematical knowledge is transferred from East to West. Calendars in various cultures are also briefly touched upon. Since many calendrical systems are based on the movement of celestial bodies, particularly the Sun and Moon, we cannot dismiss the role of astronomy in the study of calendars, including the Javanese one. Various civilizations incorporate the cyclical movements of both the Sun and Moon into their calendrical systems (Ruggles 2015). Some examples of this calendrical-astronomical relationship can be observed, among others, in the Jewish (Cohn 2007), Indian (Dershowitz and Reingold 2009), Chinese (Martzloff 2000, 2016), mainland Southeast Asian (Eade 1995), and Islamic calendars (Proudfoot 2006). The mathematical and astronomical details of how many calendars function have been covered extensively by Reingold and Dershowitz (2018).
A companion to the work on non-Western ethnomathematics is a book on non-Western ethnoastronomy edited by Selin and Sun (2000). In particular, the book dedicates one chapter to an astronomical feature of the Javanese calendar, where a season keeper (pranotomongso or pranata mangsa) guides agricultural activities among rural peasants in Java (Daldjoeni 1984; Ammarell 1988; Hidayat 2000). Readers might be interested to compare this with the cultural production of Indonesian skylore across three ethnic groups: Banjar Muslim, Meratus Dayak, and Javanese peoples (Ammarell and Tsing 2015). Integrating the Javanese calendar into elementary school education as an ethnomathematics study has been attempted by Utami et al. (2020). Another attempt embeds some topics related to calendrical systems into a first-year seminar on the mathematics of the pre-Columbian Americas (Catepillán 2016). A study from Taipei, Taiwan, on the movements of the Moon at the primary level, introduced pupils to both the Gregorian and Chinese calendars in the context of physical classroom environments (Hubber and Ramseger 2017). Although we propose an intersection of only four disciplines in our theoretical framework, the list is by no means exhaustive. Another possibility is to include psychology or the interaction between mathematics and psychology. For instance, a theoretical foundation on the calendrical system and the psychology of time has been
modeled by Rudolph (2006). In particular, the author proposed that Balinese (and equally Javanese) time might be neither “circular” nor “linear,” but profinite. Since the Javanese calendrical systems involve 5-day pasaran, 7-day saptawara/dinapitu, and 30 wuku cycles, it could be modeled by the group of profinite integers $\hat{\mathbb{Z}}$. This group bundles together all different sets of the ring of p-adic integers $\mathbb{Z}_p$ and various finite sets $\mathbb{Z}/(p)$ of integers modulo p, where p denotes prime integers (Milne 2020). A discussion of calendar timekeeping schemes in Southeast Asia is covered by Gislén (2018). An overview of the calendars in Southeast Asia is given by Gislén and Eade (2019a). In their subsequent papers, they and Lân also discussed calendars from Burma, Thailand, Laos, and Cambodia (Gislén and Eade 2019b), Vietnam (Lân 2019), and Malaysia and Indonesia (Gislén and Eade 2019c), eclipse calculation (Gislén and Eade 2019d), and chronicle inscriptions (Gislén and Eade 2019e). See also Eade (1995), Ôhashi (2009), and Golzio (2012) for basic facts and further explanations of the calendrical systems in India and (Mainland) Southeast Asia. This chapter is organized as follows. After this introduction, the following section briefly covers the calendars from ancient and contemporary times, which includes the pre-Gregorian and Gregorian calendars. After that, one section is dedicated exclusively to a discussion of the Javanese calendar in the context of ethnoarithmetic. The final section presents discussion and epilogue.
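The interplay of the cycle lengths mentioned in this introduction can be illustrated with elementary modular arithmetic. The Python sketch below is a hedged illustration only: the day index counts days from an arbitrary, unspecified epoch, the function and constant names are hypothetical, and it assumes that each of the 30 wuku lasts one 7-day week, so that the pawukon has 210 days. It shows that a (pasaran, weekday) pair repeats every 35 days and that a pair determines the day within the 35-day wetonan uniquely, as the Chinese remainder theorem guarantees for coprime moduli.

```python
from math import gcd

PASARAN_LENGTH = 5   # 5-day pasaran cycle
WEEK_LENGTH = 7      # 7-day week
WUKU_COUNT = 30      # 30 wuku, assumed here to last one week each

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

wetonan = lcm(PASARAN_LENGTH, WEEK_LENGTH)   # 35-day wetonan cycle
pawukon = WUKU_COUNT * WEEK_LENGTH           # 210-day pawukon cycle

def weton(day_index):
    """(pasaran position, weekday position) of a day counted from an arbitrary epoch."""
    return (day_index % PASARAN_LENGTH, day_index % WEEK_LENGTH)

def day_from_weton(p, w):
    """Invert weton() within one wetonan using the Chinese remainder theorem.
    21 is 1 mod 5 and 0 mod 7; 15 is 0 mod 5 and 1 mod 7."""
    return (21 * p + 15 * w) % wetonan

distinct_pairs = {weton(n) for n in range(wetonan)}
print(wetonan, pawukon, len(distinct_pairs))
```

Because 35 divides 210, the pasaran, weekday, and wuku positions all realign after one pawukon under these assumptions.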
2 Ancient and Modern Calendars
This section briefly covers several calendars in the pre-Gregorian era and Zeller’s congruence algorithm for the Gregorian calendar.
2.1 Pre-Gregorian Calendars
Before the Gregorian calendar that we use today was adopted, numerous calendars had been used in various parts of the world. The Egyptian calendar was among the first solar calendars, with a history dating back to the fourth millennium BCE. The Mesoamerican civilization of the Maya peoples also developed a calendar system where the year was divided into 18 months of 20 days (Ascher 2002; Stakhov 2009). The Mayan calendar system contains three separate calendars, and although they are not related mathematically, all of them are linked in a single calendar system. The first one is called the “long count,” used by the Mayans to measure date chronology for history recording. The second one is the Haab calendar, a nonchronological civil calendar. The third one is the Mayan religious calendar, also nonchronological, called the Tzolkin (Ascher 2002; Cohn 2007). Like the Maya, the Inca Empire also developed its own calendar system, which was based on several different astronomical cycles,
including the solar year, synodic and sidereal lunar cycles, and local zenith period (Urton and Llanos 2010). A variety of mathematical developments among the native Americans from prehistoric times to the present has been compiled by Closs (1996). In particular, one chapter of the book discusses the calendrical system of the Nuu-chah-nulth (formerly referred to as the Nootka), one of the Indigenous peoples of the Pacific Northwest Coast in Canada (Folan 1986). Some calendars are lunisolar, and one of them is the Jewish calendar. Also called the Hebrew calendar, it is still used today, mainly for Jewish religious observance. Although the Jewish calendar was developed in its current format during the Talmud period in the fifth century CE by Rabbi Hillel II, the up-to-date counting has accumulated more than 5000 years (Ascher 2002; Cohn 2007). For reference, AM 5781 began at sunset on Friday, 18 September 2020, and will end at sunset on Monday, 6 September 2021 CE. Here, AM means Anno Mundi, the Latin phrase for “in the year of the world.” Although the traditional Chinese calendar is also a lunisolar type, the most fundamental component is the sexagenary cycle, i.e., a cycle of 60 terms marked by coordination between 10 celestial stems and 12 terrestrial branches, which is also known as the “Stems-and-Branches” or gānzhī (干支) (Martzloff 2000). A discussion on the mathematical aspect of the Chinese calendar has been covered extensively by Aslaksen (2001, 2002, 2006). A historical aspect of the Chinese calendar is discussed by Sun (2015). Astronomical aspects and the mathematical structures underlying the calculation techniques of the Chinese calendar are highlighted by Martzloff (2016). The Hijri, or Islamic calendar, is a lunar calendar consisting of 12 lunar (synodic) months in a year of 354 or 355 days. It is still widely used in predominantly Muslim countries alongside the Gregorian calendar, primarily for religious purposes.
The current counting started from 622 CE, commemorating the emigration of Prophet Muhammad and his companions from Mecca to Medina (hijra) (Hassan 2017). In the Gregorian calendar reckoning, the current Islamic year is 1422 AH, which approximately runs from 20 August 2020 until 9 August 2021. AH means Anno Hegirae, the Latin phrase for “in the year of the Hijra.” In particular, the history of Muslim calendars in Southeast Asia in the historical and cultural context has been successfully traced by Proudfoot (2006).
2.2 Gregorian Calendar
The language of congruences is a basic building block in arithmetic and number theory. It allows us to work with divisibility relationships much as we work with equalities. Congruences have many applications, one of which is determining the day of the week for any date in a given perpetual calendar. In particular, the procedure for finding the day of the week for a given date in the Gregorian calendar has been discussed by Carroll (1887), Conway (1973), Burton (2011), and Rosen (2011). See also Gardner (1996) and Cohen (2000).
An Ethnoarithmetic Excursion into the Javanese Calendar
The Gregorian calendar was promulgated by Pope Gregory XIII as a replacement for the Julian calendar, proposed by Julius Caesar in 46 BC. Starting in 1582, the Catholic states were among the first countries to adopt the Gregorian calendar by skipping 10 days in October: Thursday, 4 October 1582, was followed by Friday, 15 October 1582. Greece and Turkey were among the last countries to adopt the Gregorian calendar, changing on 1 March 1923 and 1 January 1926, respectively.

Let W denote the day of the week from Saturday = 0 to Friday = 6, k the day of the month, m the month, and N the year. For the month, the convention is March = 3, April = 4, ..., but January = 13 and February = 14. For the year, N is the current year unless the month is January or February, for which N is the previous year. The relationship between year and century is given by N = 100C + Y, where C denotes the zero-based century and Y denotes the two-digit year. Note that C should not be confused with the standard century number, which is C + 1 for the first 99 years of each century. The formula for finding the day of the week W of day k of month m of year N is given by Zeller’s congruence algorithm (Zeller 1882, 1883, 1885, 1887):

\[
W \equiv \left( k + \left\lfloor \frac{13(m+1)}{5} \right\rfloor + Y + \left\lfloor \frac{Y}{4} \right\rfloor + \left\lfloor \frac{C}{4} \right\rfloor - 2C \right) \pmod{7}. \tag{1}
\]
Here, $\lfloor x \rfloor$ denotes the floor function or greatest integer function, and mod is the modulo operation or remainder after division.

As an example, we find the days of the week on which the Javanese heroine and educator Raden Adjeng Kartini was born and passed away. She was born in Jepara, a town on the north coast of Java, located around 80 km to the northeast of Semarang, the present capital of Central Java province in Indonesia. Kartini was a pioneer for girls’ education and women’s emancipation rights in Indonesia, at the time when Indonesia was still a part of the Dutch East Indies colonial empire. She may be compared with her European counterparts, including the English advocate of women’s rights Mary Wollstonecraft (27 April 1759 – 10 September 1797) and the Finnish social activist Minna Canth (19 March 1844 – 12 May 1897) (cf. Kartini 1911, 1920; Wollstonecraft 1792; Canth 1885).

Kartini was born on 21 April 1879, so we have k = 21, m = 4, N = 1879, C = 18, and Y = 79. She passed away on 17 September 1904, at the age of 25, for which k = 17, m = 9, N = 1904, C = 19, and Y = 4. Using Zeller’s congruence formula (1), the days when Kartini was born, $W_b$, and passed away, $W_d$, can be calculated as follows, respectively:

\[
\begin{aligned}
W_b &\equiv 21 + \left\lfloor \frac{13(5)}{5} \right\rfloor + 79 + \left\lfloor \frac{79}{4} \right\rfloor + \left\lfloor \frac{18}{4} \right\rfloor - 36 \pmod{7} \\
    &\equiv (21 + 13 + 79 + 19 + 4 - 36) \pmod{7} \equiv 100 \pmod{7} \equiv 2 \pmod{7}, \\
W_d &\equiv 17 + \left\lfloor \frac{13(10)}{5} \right\rfloor + 4 + \left\lfloor \frac{4}{4} \right\rfloor + \left\lfloor \frac{19}{4} \right\rfloor - 38 \pmod{7} \\
    &\equiv (17 + 26 + 4 + 1 + 4 - 38) \pmod{7} \equiv 14 \pmod{7} \equiv 0 \pmod{7}.
\end{aligned}
\]

Hence, Kartini was born on Monday, 21 April 1879 and passed away on Saturday, 17 September 1904. She was buried at Bulu Village, Rembang, Central Java, around 100 km east of Jepara.
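The two computations above can be reproduced with a short Python sketch of Zeller’s congruence (1). This is only an illustration under the conventions stated for (1); the function name `zeller_weekday` and the list `WEEKDAYS` are hypothetical names chosen here and do not come from the source.

```python
# Weekday names indexed as in formula (1): W = 0 is Saturday, W = 6 is Friday.
WEEKDAYS = ["Saturday", "Sunday", "Monday", "Tuesday",
            "Wednesday", "Thursday", "Friday"]


def zeller_weekday(year: int, month: int, day: int) -> int:
    """Return W from congruence (1) for a Gregorian date."""
    if month < 3:
        # January and February count as months 13 and 14 of the previous year.
        month += 12
        year -= 1
    c, y = divmod(year, 100)  # N = 100 C + Y
    w = day + 13 * (month + 1) // 5 + y + y // 4 + c // 4 - 2 * c
    return w % 7  # Python's % is non-negative for a positive modulus


print(WEEKDAYS[zeller_weekday(1879, 4, 21)])  # Kartini's birth
print(WEEKDAYS[zeller_weekday(1904, 9, 17)])  # Kartini's death
```

Run as written, the two calls print Monday and Saturday, in agreement with the values $W_b \equiv 2$ and $W_d \equiv 0$ obtained by hand.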
3 Javanese Calendar
This section presents the main characteristics of the Javanese calendar. We start with the geographical location of the island of Java, the Javanese people, and a brief historical background of the calendar. After a detailed discussion of its cycles, we close the section with a computer implementation of the Javanese calendar.
3.1 Where Is Java?
Java is an island in Indonesia, not a programming language, although the latter was named after Javanese coffee by its creators. It is located in the southern hemisphere, around 800 km from the Equator. It extends from latitude 6° to 8° south and from longitude 105° to 114° east. With an area of 150,000 square kilometers, it is about 1000 km long from west to east and around 200 km wide from north to south.
Fig. 2 Location of the island of Java, its main cities, and the languages presently spoken: Javanese (Central and East Java, and a small enclave in North-West Java), Sundanese (West Java), Betawi (in and around the Jakarta metropolitan area), Madurese (Madura Island and a part of North-Eastern Java), and Balinese (Bali Island and a small part of Eastern Java). The basemap uses ETOPO5 and SRTM3 topographic data and shaded relief mapping code (Beauducel 2020a). ETOPO5 is a 5 arc-minute resolution relief model of the Earth’s surface that integrates land topography and ocean bathymetry datasets. SRTM3, from the Shuttle Radar Topography Mission, is a 3 arc-second resolution digital topographic database of land elevation limited to latitudes from 60° south to 60° north
The island lies between Sumatra to the west and Bali to the east. It is bordered by the Java Sea on the north and the Indian Ocean on the south (see Fig. 2). It is the world’s 13th largest island (Cribb 2000).
3.2 Who Are Javanese People?
The Javanese people are an ethnic group native to the island of Java. They form the largest ethnic group in Indonesia, with more than 95 million people living in Indonesia and approximately 5 million people living abroad. Although they predominantly reside in the central and eastern parts of the island, they are also scattered in various parts of the country (Taylor 2003; Ananta et al. 2015). See Fig. 2. Javanese people have their own language, which is distinct from Indonesian. The Javanese language is a member of the Austronesian family of languages and is written in the Javanese script, hanacaraka or dentawyanjana. Owing to the long history and legacy of Hinduism and Buddhism in Java, the language adopted a large number of Sanskrit words (Marr and Milner 1986; Errington 1998).
3.3 A Background of the Javanese Calendar
Javanese people use the Javanese calendar alongside two other perpetual calendars, the Gregorian and Islamic calendars. The former is the official calendar of the Republic of Indonesia, and the latter is used mainly for religious purposes. Prior to the adoption of the Javanese calendar in 1633 CE, Javanese people used a calendrical system based on the lunisolar Hindu Saka calendar (Ricklefs 1993). The Javanese calendar was inaugurated by Sultan Agung Adi Prabu Hanyakrakusuma (1593–1645 CE), or simply Sultan Agung, the third Sultan of Mataram, who ruled Central Java from 1613 CE to 1645 CE. Although the counting of the years follows the Saka calendar, the Javanese calendar employs a lunar year similar to that of the Islamic Hijri calendar instead of the solar year of the Saka calendar (Gislén and Eade 2019a). Javanese years are sometimes marked AJ (Anno Javanico), the Latin phrase for Javanese Year. Since 2008, the difference between the Gregorian and Javanese year counts has been about 67 years; the current year 2020 CE corresponds to 1953 AJ (Oey 2001; Raffles 1817).
3.4 Some Characteristics of the Javanese Calendar
Unlike many other calendars, which employ a 7-day week cycle, the Javanese calendar adopts a 5-day week cycle, known as pancawara. Combining it with the 7-day week cycle of the Gregorian and Islamic calendars, namely the saptawara cycle, one obtains the 35-day cycle known as wetonan (Darling 2004). This foundation cycle is combined with additional cycles:
Table 1 Main cycles of the Javanese calendar

| Cycle name | Length | Unit | Comment |
|---|---|---|---|
| Pancawara | 5 | Day | Javanese week; 5 pasaran names (see Table 2) |
| Wuku | 7 | Day | Gregorian/Islamic week; 7 dinapitu names (see Table 3); 30 wuku names |
| Wetonan | 35 | Day | 35-day names as Dinapitu and Pasaran |
| Wulan | 29 or 30 | Day | Day number in a wulan is dina; 12 wulan names (see Table 5) |
| Pawukon | 30 | Wuku | 30 weeks = 210 days |
| Taun | 12 | Wulan | 354 or 355 days; taun number starts on 1555 AJ; 8 taun names (see Table 6) |
| Windu | 8 | Taun | 96 wulan = 81 wetonan = 2,835 days |
| – | 4 | Windu | 32 taun, 4 windu names |
| Lambang | 2 | Windu | 16 taun, 2 lambang names |
| Kurup | 15 | Windu | 120 taun − 1 day = 42,524 days; 4 kurup names until today (see Table 8) |
• A 210-day cycle of 30 weeks, named the pawukon.
• A more complex combination of the lunar month wulan, which has 29 or 30 days; the lunar year taun, which is 12 lunar months; the windu, which is 8 lunar years; and finally the kurup of 15 windu, or 120 lunar years minus 1 day, which matches exactly the Islamic calendar cycle (Proudfoot 2007).

There are also additional cycles, but they are no longer used in the Javanese tradition. For completeness, we list them here (Richmond 1956; Zerubavel 1989):

• A 6-day cycle called the Paringkelan: “Tungle,” “Aryang,” “Wurukung,” “Paningron,” “Uwas,” and “Mawulu”.
• An 8-day cycle called the Padewan: “Sri,” “Indra,” “Guru,” “Yama,” “Rudra,” “Brama,” “Kala,” and “Uma”.
• A 9-day cycle called the Padangon: “Dangu,” “Jagur,” “Gigis,” “Kerangan,” “Nohan,” “Wogan,” “Tulus,” “Wurung,” and “Dadi”.

The main cycles and their characteristics are summarized in Table 1 and detailed in the following sub-subsections. Figure 3 shows an original table of the main cycles’ names written in the Javanese script hanacaraka.
3.4.1 Pancawara and Pasaran

The Javanese 5-day cycle is named pancawara and is made of 5 days known as pasaran: “Lêgi,” “Pahing” (or “Paing”), “Pon,” “Wage,” and “Kliwon.” The word comes from “pasar,” which means market. Historically, a market was held and operated on a 5-day cycle based on a pasaran day, e.g., “Pasar Kliwon” or “Pasar Lêgi”. To this day, most of the markets in Java still carry a pasaran name, like the Kliwon Market in Kudus, Central Java, although they usually operate every day (Oey 2001). Each pasaran is associated with some symbols, in particular, the five classical elements of Aristotle, colors, cardinal directions (note that the Javanese culture has five of them, including a center), and human posture (Pigeaud 1977) (cf. Brinton 1893). See Table 2.

Fig. 3 Table list of 8 taun, 12 wulan, 7 dinapitu, and 5 pasaran names in Javanese script, after Warsapradongga (1892)

Table 2 Names of the five pasaran days in the Javanese week and commonly associated symbols

| Ngoko | Krama | Meaning | Element | Color | Direction | Posture |
|---|---|---|---|---|---|---|
| Pon | Petak | – | Water | Yellow | West | Sleep |
| Wage | Cemeng | Dark | Earth | Black | North | Sit down |
| Kliwon | Asih | Affection | Spirit | Mixed color | Focus/center | Stand up |
| Lêgi | Manis | Sweet | Air | White | East | Turn back |
| Pahing | Pahit | Bitter | Fire | Red | South | To face |

The pasaran day can be computed from a Gregorian calendar date using a strategy similar to Zeller’s congruence algorithm. Indeed, the formula is based on the observation that the day of the week progresses predictably with each subpart of the date, i.e., day, month, and year. In this case, we consider the 5-day Javanese week pasaran. Each term within the formula is used to calculate the offset needed to obtain the correct day. Let P denote the pasaran day, in the order Pon = 0, Wage = 1, Kliwon = 2, Lêgi = 3, and Pahing = 4. We now propose the following pasaran day congruence formula:

\[
P \equiv \left( k + \left\lfloor \frac{3(m+1)}{5} \right\rfloor + \left\lfloor \frac{Y}{4} \right\rfloor + 4C - 4\left\lfloor \frac{C}{4} \right\rfloor \right) \pmod{5}, \tag{2}
\]
where k, m, C, and Y denote the same variables as the ones defined for Zeller’s congruence formula (see expression (1) in the previous section). Each term in (2) can be analyzed as follows:

• The variable k represents the progression of the day of the week based on the day of the month, since each successive day results in an additional offset of one.
• The term $+\lfloor 3(m+1)/5 \rfloor$ adjusts for the variation in the days of the month. Indeed, starting from March to February, the days in a month are {31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 31, 28/29}. The last element, February’s 28/29 days, is not a problem since the formula has rolled it to the end. The number of days for the first 11 elements of this sequence modulo 5 (still starting with March) is {1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1}, which repeats the subsequence {1, 0, 1, 0, 1} every 5 months and gives the number of days that should be added to the next month. The fraction 3/5 = 0.6 applied to m + 1, together with the floor function, has that effect and adds the proper number of days.
• Since there are 365 days in a nonleap year, and 365 ≡ 0 (mod 5), there is no need to add an offset for a normal year.
• For leap years, 366 ≡ 1 (mod 5), so 1 day must be added to the offset value by the term $+\lfloor Y/4 \rfloor$.
• There are 36,524 days in a normal century and 36,525 days in each century divisible by 400; since 36,524 ≡ 4 (mod 5), the term $+4C$ adds 4 days for any century, and the term $-4\lfloor C/4 \rfloor$ removes these 4 days for a century divisible by 400.
• The overall operation, (mod 5), normalizes the result to the range from 0 to 4, which yields the index of the correct day for the date being analyzed.

For example, to find the pasaran of the very first day in the Javanese calendar (1 Sura 1555 AJ Alip), which corresponds to 8 July 1633 CE, we have k = 8, m = 7, C = 16, and Y = 33. Using (2), we obtain P ≡ 8 + 4 + 8 + 64 − 16 ≡ 68 (mod 5) ≡ 3 (mod 5). Hence, 8 July 1633 CE was a “Lêgi.” For the previously considered example, the pasaran days when Kartini was born, $P_b$, and passed away, $P_d$, can be calculated as follows:

\[
\begin{aligned}
P_b &\equiv (21 + 3 + 19 + 72 - 16) \pmod{5} \equiv 99 \pmod{5} \equiv 4 \pmod{5}, \\
P_d &\equiv (17 + 6 + 1 + 76 - 16) \pmod{5} \equiv 84 \pmod{5} \equiv 4 \pmod{5}.
\end{aligned}
\]

Hence, both 21 April 1879 and 17 September 1904 fall on a “Pahing.”
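The pasaran congruence (2) translates almost term by term into code. The following Python sketch is offered only as an illustration of the formula as stated above; the names `PASARAN` and `pasaran_index` are hypothetical, and the January/February shift follows the convention introduced for Zeller’s congruence (1).

```python
# Pasaran names indexed as in formula (2): P = 0 is Pon, P = 4 is Pahing.
PASARAN = ["Pon", "Wage", "Kliwon", "Lêgi", "Pahing"]


def pasaran_index(year: int, month: int, day: int) -> int:
    """Return P from congruence (2) for a Gregorian date."""
    if month < 3:
        # January and February count as months 13 and 14 of the previous year.
        month += 12
        year -= 1
    c, y = divmod(year, 100)  # N = 100 C + Y
    p = day + 3 * (month + 1) // 5 + y // 4 + 4 * c - 4 * (c // 4)
    return p % 5


for ymd in [(1633, 7, 8), (1879, 4, 21), (1904, 9, 17)]:
    print(ymd, PASARAN[pasaran_index(*ymd)])
```

The three dates evaluate to Lêgi, Pahing, and Pahing, matching the hand computations of P, $P_b$, and $P_d$.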
3.4.2 Dinapitu, Wuku, and Pawukon

Dinapitu literally means “day 7” in Javanese and corresponds exactly to the day names in the Gregorian/Islamic calendar week, i.e., from Monday to Sunday: “Sênèn,” “Selasa” (or “Slasa”), “Rêbo,” “Kêmis,” “Jemuwah” (or “Jumungah”), “Sêtu,” and “Ngahad” (or “Ahad”). In recent literature, we often find the names of the days in the Indonesian language (from Monday to Sunday: “Senin,” “Selasa,” “Rabu,” “Kamis,” “Jumat,” “Sabtu,” and “Minggu”). They are also associated with particular symbolic meanings. See Table 3. The names of the days of the week are similar in both languages, as they are borrowed from Arabic, except for Sunday in Indonesian, which was assimilated from the Portuguese “Domingo.” The 7-day cycle is called saptawara or padinan, and a week is named a wuku. A period of 30 wuku makes the pawukon cycle, i.e., 210 days (Proudfoot 2007). There are 30 different names of wuku (Soebardi 1965; Headley 2004) (see Table 4). Figure 4 shows an example of the Javanese calendar with padinan, pasaran, wuku, and paringkelan information, among others.

Table 3 List of names of the 7 days in a wuku Gregorian week (cf. Rizzo 2020)

| Dinapitu | Padinan | Week day | Symbol |
|---|---|---|---|
| Ngahad | Dite | Sunday | Silent |
| Sênèn | Soma | Monday | Forward |
| Selasa | Anggara | Tuesday | Backward |
| Rêbo | Buda | Wednesday | Turn left |
| Kêmis | Respati | Thursday | Turn right |
| Jemuwah | Sukra | Friday | Up |
| Sêtu | Tumpak | Saturday | Down |
Table 4 List of names of the 30 different wuku in a pawukon

| No | Wuku | No | Wuku | No | Wuku | No | Wuku | No | Wuku |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Sinta | 7 | Warigalit | 13 | Langkir | 19 | Tambir | 25 | Bala |
| 2 | Landep | 8 | Warigagung | 14 | Mandasiya | 20 | Medangkungan | 26 | Wugu |
| 3 | Wukir | 9 | Julungwangi | 15 | Julungpujut | 21 | Maktal | 27 | Wayang |
| 4 | Kurantil | 10 | Sungsang | 16 | Pahang | 22 | Wuye | 28 | Kulawu |
| 5 | Tolu | 11 | Galungan | 17 | Kuruwelut | 23 | Manahil | 29 | Dukut |
| 6 | Gumbreg | 12 | Kuningan | 18 | Marakeh | 24 | Prangbakat | 30 | Watugunung |
3.4.3 Wetonan

The wetonan cycle combines the 5-day pancawara cycle with the 7-day wuku week cycle. Each wetonan cycle lasts for 7 × 5 = 35 days, with 35 distinct combinations of the couple “dinapitu pasaran,” which is called the weton. Figure 5 displays the 35-day wetonan cycle of the dual dinapitu pasaran. The 7-day wuku cycle is arranged clockwise from Monday to Sunday, and the 5-day pasaran cycle progresses outward from the center of the disk, from Lêgi to Kliwon. The figure shows a spiral pattern that repeats seven times, each spiral sweeping five sectors, indicated by solid black, solid blue, dashed black, dashed blue, dashed-dotted black, dashed-dotted blue, and dotted black spirals, respectively. Although the weton can be calculated using Eqs. (1) and (2) independently, it is also possible to propose a single congruence formula:

\[
\begin{aligned}
w &= k + \left\lfloor \frac{153(m+1)}{5} \right\rfloor + 15Y + \left\lfloor \frac{Y}{4} \right\rfloor + 19C + \left\lfloor \frac{C}{4} \right\rfloor + 5, \\
W &\equiv w \pmod{7}, \\
P &\equiv w \pmod{5},
\end{aligned} \tag{3}
\]

where k, m, C, and Y are the same variables as defined for the previous congruence formulas, w is a 35-day offset congruence, and W and P correspond to the index of the dinapitu and pasaran day, respectively. Each term of the congruence can be analyzed as follows:

• The variable k represents the progression of the day of the week based on the day of the month, since each successive day results in an additional offset of one.
• The operation $+\lfloor 153(m+1)/5 \rfloor$ adds the proper number of days for each month, considering January and February as the 13th and 14th months of the previous year, respectively.
• The operation $+15Y$ adds an offset of 365 ≡ 15 (mod 35) days for the common, nonleap years.
• The operation $+\lfloor Y/4 \rfloor$ adds 1 more day for the leap years, since 366 ≡ 16 (mod 35).
• The operation $+19C$ adds 36,524 ≡ 19 (mod 35), i.e., 19 days for any regular century (a century whose century year is not a leap year).
• The operation $+\lfloor C/4 \rfloor$ adds 1 more day for a century leap year, i.e., one divisible by 400, since 36,525 ≡ 20 (mod 35).
Fig. 4 An example of the Javanese calendar for December 2020 CE issued by the Kraton palace in Yogyakarta. It contains information on Jimakir taun, Bakdamulud and Jumadilawal wulans, pasaran day, padinan weekday, wuku, and paringkelan, among others. Each column in the wuku row lists wuye, manahil, prangbakat, bala, and wugu, respectively. Each column in the paringkelan row lists paningron, uwas, mawulu, tungle, and aryang, respectively. The information on pranatamangsa reads as “kanem (the sixth), 43 dinten (days), 9 November–21 December 2020.” The text that follows candranipun (interpretation) says “rasa mulya kasucian; wit woh–wohan sami mawoh, kathah bun upas mejahi taneman; peksi kuntul sami neba, nyebar winih yen lintang luku sampun katingal hing wetan.” It can be translated as follows “a noble sense of holiness; fruit trees– the same fruit grows, a lot of poisonous dew drops kills plants; heron birds are identical, spread the seed if the plowing star has been observed in the east.”
Fig. 5 A 35-day cycle of wetonan in the Javanese calendar. The regular Gregorian 7-day cycle is arranged clockwise from Monday to Sunday. The 5-day pasaran cycle is arranged from inside to outside. The white disk, red, blue, green, and yellow rings correspond to “Lêgi,” “Pahing,” “Pon,” “Wagè,” and “Kliwon,” respectively
• The operation +5 adjusts the final offset to fit the indexes defined for dinapitu W (Sênèn = 0, ..., Ngahad = 6) and pasaran P (Pon = 0, ..., Pahing = 4) after (mod 7) and (mod 5), respectively.

Let us take the same example of 8 July 1633 CE with k = 8, m = 7, C = 16, and Y = 33. Using (3), we have w = 8 + 244 + 495 + 8 + 304 + 4 + 5 = 1068, W ≡ 1068 (mod 7) ≡ 4 (mod 7), and P ≡ 1068 (mod 5) ≡ 3 (mod 5). Hence, with the dinapitu counted from Sênèn = 0, 8 July 1633 CE was a “Jemuwah Lêgi.” Notice that the single congruence relationship (3) replaces both congruence formulas (1) and (2). Obtaining both dinapitu and pasaran days using a tabular method has been attempted by Arciniega (2020), and a Java application based on the Indian calendar for calculating the Javanese calendar has been developed by Gislén and Eade (2019e).

The wetonan cycle is especially important for divinatory systems, celebrations, and rites of passage such as birth or death. Commemorations and events are held on days considered to be auspicious. In particular, the weton of birth is considered to play an important role in an individual’s personality, in a similar way as a zodiac sign does in Western astrology. The two weton days of future spouses are supposed to determine the compatibility of their natures and are used to compute the best date of marriage using a strict arithmetic formula (Utami et al. 2019). The weton also figures in the timing of many slametan ritual meals and in many other traditional divinatory systems (Utami et al. 2020). The eve of “Jemuwah Kliwon” is considered particularly popular and auspicious for magical and spiritual matters (Darling 2004; Arciniega 2020).

The Javanese birthday recurs every 35 days, that is, about a thousand times in a century (exactly 1043 times). For a newborn baby, the first recurrence of the weton, i.e., at the age of 35 days, is named “selapanan,” when the parents cut the hair and nails of their child for the first time. For adults, the weton of birth is considered an eminent day, not festive but expressing humility and blissfulness. For example, a person may fast (sometimes also the day before and the day after), suspend commercial activity, avoid taking any major decision, or simply be more generous to surrounding people through philanthropic actions such as sharing a blessed meal or some “jajan pasar” (early morning fresh sweet snacks from the market).

The weton for the birth and death of Sultan Agung is “Jemuwah Lêgi,” which is also the first day of the Javanese calendar he created. This weton is therefore one of the recurrent noble days (see sub-subsection Dina Mulya) and is considered an important night for pilgrimage. Indeed, the weton of every King’s birth is a special day; the present Sultan Hamengkubuwana X was born on 2 April 1946 CE, a “Selasa Wage.” This weton was chosen for his ascension to the throne, on 7 March 1989 CE.
Every “Selasa Wage,” the lively tourist center of Yogyakarta, Jalan Malioboro, is closed to motor vehicles, and the covered sidewalks “kaki lima,” where commercial activity usually abounds, are entirely cleaned. As another especially prominent example, the present palace of Kraton Yogyakarta was first occupied by Sultan Hamengkubuwana I and his regal suite on “13 Sura 1682 AJ” (7 October 1756 CE), a “Kêmis Pahing.” In 2015, the Governor decided that every “Kêmis Pahing,” schoolchildren and public servants, in particular those working in the territorial services of Yogyakarta, must wear the traditional costume all day long, as a reminder of their regional culture. Merchants from traditional markets are encouraged to do the same.
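The single congruence (3) introduced earlier in this sub-subsection can be sketched in Python as follows. This is an illustrative sketch, not the authors’ implementation: the names `DINAPITU`, `PASARAN`, and `weton` are hypothetical, and the dinapitu index is assumed to run from Sênèn = 0 to Ngahad = 6, which is the convention under which (3) agrees with Zeller’s congruence (1).

```python
# Assumed index conventions for formula (3):
# dinapitu from Sênèn (Monday) = 0, pasaran from Pon = 0.
DINAPITU = ["Sênèn", "Selasa", "Rêbo", "Kêmis", "Jemuwah", "Sêtu", "Ngahad"]
PASARAN = ["Pon", "Wage", "Kliwon", "Lêgi", "Pahing"]


def weton(year: int, month: int, day: int) -> tuple[int, str, str]:
    """Return (w, dinapitu, pasaran) from congruence (3) for a Gregorian date."""
    if month < 3:
        # January and February count as months 13 and 14 of the previous year.
        month += 12
        year -= 1
    c, y = divmod(year, 100)  # N = 100 C + Y
    w = (day + 153 * (month + 1) // 5 + 15 * y + y // 4
         + 19 * c + c // 4 + 5)
    return w, DINAPITU[w % 7], PASARAN[w % 5]


print(weton(1633, 7, 8))   # first day of the Javanese calendar
print(weton(1946, 4, 2))   # birth of Sultan Hamengkubuwana X
print(weton(1756, 10, 7))  # 13 Sura 1682 AJ
```

For the three dates, the sketch returns “Jemuwah Lêgi” (with w = 1068), “Selasa Wage,” and “Kêmis Pahing,” in agreement with the weton quoted in the text.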
3.4.4 Wulan

The lunar month is named a wulan and lasts for 29 or 30 days. There are 12 different names: “Sura,” “Sapar,” “Mulud,” “Bakdamulud,” “Jumadilawal,” “Jumadilakir,” “Rejeb,” “Ruwah,” “Puasa,” “Sawal,” “Dulkangidah” (or “Sela”), and “Besar.” The length of each wulan is attributed as follows (see Table 5 and the next sub-subsections):

• “Sura,” “Rejeb,” “Puasa,” and “Dulkangidah” are always 30 days.
• “Bakdamulud,” “Jumadilakir,” “Ruwah,” and “Sawal” are always 29 days.
• “Sapar,” “Mulud,” “Jumadilawal,” and “Besar” have lengths depending on the taun and kurup.
3.4.5 Taun

A taun is a cycle of 12 wulan and corresponds to the Javanese lunar year. There are eight different taun names: “Alip,” “Ehé,” “Jimawal,” “Jé,” “Dal,” “Bé,” “Wawu,” and “Jimakir,” formed by different wulan length sequences (see Table 5). For seven of the taun, the sequence alternates regularly between 30- and 29-day lengths. The final wulan “Besar” can be either 29 or 30 days, depending not only on the taun but also on the kurup, the 120 lunar year cycle, which also has different day length sequences for the fifth taun “Dal.” As a result, the total day length of a taun varies from 354 (short or normal year, named “Taun Wastu”) to 355 (long or leap year, named “Taun Wuntu”), as described in Table 6:

• “Alip,” “Jimawal,” “Bé,” and “Wawu” are always normal years, with a wulan “Besar” of 29 days.
• “Ehé” is always a leap year, with a wulan “Besar” of 30 days.
• “Jé” and “Dal” are normal or leap depending on the kurup.
• “Jimakir” is a leap year for the first 14 windu but becomes a normal year for the final windu of a kurup cycle.

Each taun is assigned a monotonically increasing number, based on the Indian Saka calendar. The reason is that Sultan Agung decided to continue the counting from the Shalivahana era, which had reached 1555 when the Javanese calendar was inaugurated (cf. Nuraeni and Azizah 2017). Thus, the Javanese calendar began on “1 Sura Alip 1555 AJ,” which corresponds to 8 July 1633 CE.
Table 5 List of names of the 12 wulan Javanese lunar months and the associated number of days, depending on the taun and kurup (see Table 6)

| No | Wulan name | Taun 1–4, 6–8 (all kurup) | Taun 5, kurup 1 | Taun 5, kurup 2 | Taun 5, kurup 3 | Taun 5, kurup 4 |
|---|---|---|---|---|---|---|
| 1 | Sura | 30 | 30 | 30 | 30 | 30 |
| 2 | Sapar | 29 | 29 | 30 | 30 | 29 |
| 3 | Mulud | 30 | 30 | 29 | 29 | 30 |
| 4 | Bakdamulud | 29 | 29 | 29 | 29 | 29 |
| 5 | Jumadilawal | 30 | 30 | 30 | 29 | 30 |
| 6 | Jumadilakir | 29 | 29 | 29 | 29 | 29 |
| 7 | Rejeb | 30 | 30 | 30 | 30 | 30 |
| 8 | Ruwah | 29 | 29 | 29 | 29 | 29 |
| 9 | Pasa | 30 | 30 | 30 | 30 | 30 |
| 10 | Sawal | 29 | 29 | 29 | 29 | 29 |
| 11 | Dulkangidah | 30 | 30 | 30 | 30 | 30 |
| 12 | Besar | 29/30 | 30 | 30 | 30 | 29 |
| | Total (days) | 354/355 | 355 | 355 | 354 | 354 |
Table 6 List of names of the 8 taun Javanese lunar years forming a windu, and the associated number of days, alternating between 354 (short or normal year) and 355 (long or leap year) days, depending on the kurup

| No | Name | Krama | Meaning | Kurup 1 (days) | Kurup 2 (days) | Kurup 3 (days) | Kurup 4 (days) |
|---|---|---|---|---|---|---|---|
| 1 | Alip | Purwana | intention | 354 | 354 | 354 | 354 |
| 2 | Ehé | Karyana | action | 355 | 355 | 355 | 355 |
| 3 | Jimawal | Anama | work | 354 | 354 | 354 | 354 |
| 4 | Jé | Lalana | destiny | 354 | 354 | 355 | 355 |
| 5 | Dal | Ngawanga | life | 355 | 355 | 354 | 354 |
| 6 | Bé | Pawaka | back and forth | 354 | 354 | 354 | 354 |
| 7 | Wawu | Wasana | orientation | 354 | 354 | 354 | 354 |
| 8 | Jimakir | Swasana | empty | 355 | 355 | 355 | 355 |
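The taun lengths of Table 6 can be checked numerically: in every kurup, the eight taun of a regular windu should add up to the same number of days, and that number should be a whole number of 35-day wetonan cycles. The short Python sketch below performs this check; the dictionary `TAUN_LENGTHS` simply transcribes Table 6, and its name, like `windu_days`, is a hypothetical choice made for this illustration.

```python
# Days per taun for kurup 1, 2, 3, and 4, transcribed from Table 6.
TAUN_LENGTHS = {
    "Alip":    (354, 354, 354, 354),
    "Ehé":     (355, 355, 355, 355),
    "Jimawal": (354, 354, 354, 354),
    "Jé":      (354, 354, 355, 355),
    "Dal":     (355, 355, 354, 354),
    "Bé":      (354, 354, 354, 354),
    "Wawu":    (354, 354, 354, 354),
    "Jimakir": (355, 355, 355, 355),
}

# Sum each kurup column to get the length of a regular windu in days.
windu_days = [sum(column) for column in zip(*TAUN_LENGTHS.values())]

print(windu_days)                        # one total per kurup
print([days % 35 for days in windu_days])  # remainder after whole wetonan
print(windu_days[0] // 35)               # number of wetonan in a windu
```

Each column sums to 2,835 days with remainder 0 modulo 35, i.e., exactly 81 wetonan, as listed for the windu in Table 1.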
3.4.6 Windu and Lambang

Eight taun make a windu (Proudfoot 2006). Despite the variability of each taun length, the total length of a normal windu is constant since it always contains five short and three long taun, for a total of 5 × 354 + 3 × 355 = 2,835 days (about 7 years 9 months in the Gregorian calendar). This corresponds to exactly 81 wetonan, which means that each “New Windu” day, dated as “1 Sura Alip,” falls on the same weton. There is an exception to that rule: the final windu of a kurup cycle (see the next sub-subsection) is always shortened by 1 day, with a 29-day wulan “Besar” during the final taun “Jimakir.” This induces a shift in the wetonan cycle at each kurup. Furthermore, there are four different names of windu, “Adi,” “Kuntara,” “Sengara,” and “Sancaya,” that compose a 32-taun cycle. Another cycle of 16 taun is formed by two different names of windu, called lambang: “Kulawu” and “Langkir.” These two cycles are summarized in Table 7.

3.4.7 Kurup

The longest cycle in the Javanese calendar is called a kurup, formed by 15 windu, which is equivalent to 120 taun or 1440 wulan (Gislén and Eade 2019c). But the very last wulan of the cycle, i.e., the twelfth wulan “Besar” of the eighth taun “Jimakir” of the fifteenth windu, has only 29 days, so that the total length of a kurup is 2,835 × 15 − 1 = 42,524 days (about 116 years and 6 months in the Gregorian calendar) (Rosalina 2013). The period has the same number of days as 120 lunar years of the Tabular Islamic calendar, a rule-based variation of the Islamic Hijri calendar in which the number of years and months are identical but the months are determined by arithmetical rules instead of observation or astronomical calculations. Moreover, each kurup determines the following, as given in Tables 5 and 6:

• Which of the taun “Jé” or “Dal” has a long wulan “Besar”.
• The sequence of wulan lengths in the taun “Dal”.
Table 7 List of names of the four windu and two lambang

| Windu name | Lambang name |
|---|---|
| Adi | Langkir |
| Kuntara | Kulawu |
| Sêngara | Langkir |
| Sancaya | Kulawu |
Hence, the full date sequences in the calendar vary between kurup. As the weton of the first day of a kurup repeats at each first day of the windu, a kurup is named using the corresponding weton falling on “1 Sura Alip.” Table 8 lists the first five kurup.

Meanwhile, the Sultanate of Mataram was divided under the Treaty of Giyanti between the Dutch and Prince Mangkubumi in 1755 CE. The agreement divided ostensible territorial control over Central Java between the Yogyakarta and Surakarta Sultanates. The former was ruled by Prince Mangkubumi, also known as Raden Mas Sujana or Hamengkubuwana I (1717–1792 CE), and the latter was administered by Sinuhun Paliyan Negari, who was known as Pakubuwana III (1732–1788 CE) (Ricklefs 1974; Soekmono 1981; Frederick and Worden 1993; Brown 2004).

During the second kurup (1749–1821 CE), some experts realized that the Javanese calendar was still 1 day behind the Islamic Hijri calendar. Hence, the King of Surakarta, Susuhunan Pakubuwana V (1784–1823 CE), decided to end the kurup “Amiswon” in the year 1748 AJ, even though it had only been running for nine windu and two taun. So, the taun “Ehé” 1748 AJ, which was supposed to be a leap year, was given only 354 days, and the third kurup “Aboge” started on the taun “Jimawal” 1749 AJ. Some noted that it would have been more appropriate to change the kurup 2 lunar years earlier, namely on the taun “Alip” 1747 AJ. As a consequence of this delay, the third kurup “Aboge” is only 118 taun long. However, the Sultanate of Yogyakarta did not make a similar decision and pursued the second kurup normally, so that the calendars of the two neighboring regions differed for 46 years (see Table 9). On taun “Jimakir” 1794 AJ, the Sultan of Yogyakarta, Hamengkubuwana VI (1821–1877 CE), finally agreed and decided that the third kurup “Aboge” would also end with taun “Jimakir” 1866 AJ, reconciling the two calendars.

The fourth and present kurup “Asapon” is planned to last a normal length of 120 lunar years and will end on “29 Besar Jimakir 1986 AJ,” which is 25 August 2052 CE. The following day will start the fifth kurup “Anenhing” (“Alip Senen Pahing”) on “1 Sura 1987 AJ Alip,” but the sequence of leap years has not yet been decided, so it is formally impossible to calculate the exact dina, wulan, and taun for such a distant future date. Nevertheless, there is no such obstacle for the strictly regular cycles such as weton, wuku, and windu.
3.4.8 Dina Mulya

Dina mulya are the noble days in the Javanese calendar. Except for the “Siji Sura,” which is the new lunar year and falls on the first day of the first wulan of every taun, they are associated with a specific weton and a specific taun or wuku (see Table 10).
Table 8 List of names of the first five kurup Javanese 120 lunar year cycles, their short names (a contraction of the weton at each new windu, i.e., on “1 Sura Alip”), the first and last taun, the total number of taun, and the starting dates in the Gregorian calendar

| No | Kurup name | Short name | First taun (AJ) | Last taun (AJ) | Taun | Start date (CE) |
|---|---|---|---|---|---|---|
| 1 | Jamingiyah | A’ahgi | Alip 1555 | Jimakir 1674 | 120 | 8 July 1633 |
| 2 | Kamsiyah | Amiswon | Alip 1675 | Ehé 1748 | 74 | 11 December 1749 |
| 3 | Arbangiyah | Aboge | Jimawal 1749 | Jimakir 1866 | 118 | 28 September 1821 |
| 4 | Salasiyah | Asapon | Alip 1867 | Jimakir 1986 | 120 | 24 March 1936 |
| 5 | Isneniyah | Anenhing | Alip 1987 | Jimakir 2106 | 120 | 26 August 2052 |
Table 9 List of dates of the second and third kurup in the Sultanate of Yogyakarta

No  Kurup name   Short name  First taun (AJ)  Last taun (AJ)  Taun  Start date (CE)
2   Kamsiyah     Amiswon     Alip 1675        Jimakir 1794    120   11 December 1749
3   Arbangiyah   Aboge       Alip 1795        Jimakir 1866    72    16 May 1866
Table 10 List of the noble days dina mulya

Dina mulya     Weton           Wuku        Dina  Wulan  Taun  Occurrences
Siji Sura      –               –           1     Sura   –     1 every 354/355 days (new lunar year)
Aboge          Rêbo Wage       –           –     –      Alip  10 during the taun, every 7/8 years
Daltugi        Sêtu Lêgi       –           –     –      Dal   10 during the taun, every 7/8 years
Kuningan       Sêtu Kliwon     Kuningan    –     –      –     Every 210 days
Hanggara Asih  Selasa Kliwon   Dukut       –     –      –     Every 210 days
Dina Mulya     Jemuwah Kliwon  Watugunung  –     –      –     Every 210 days
Dina Purnama   Jemuwah Lêgi    –           –     –      –     Every 35 days
3.5 Computer Implementation of the Javanese Calendar
The computation of the full Javanese calendar has been implemented in GNU Octave, a MATLAB®-compatible scientific language, as a single function, “weton.m” (Beauducel 2020c). On a computer, however, determining the weekday or pasaran day from a date in the Gregorian calendar does not require the congruence formula (3). In fact, most computer languages are able to calculate
the exact number of days elapsed since any reference date, correctly taking leap years into account. For the Javanese calendar, the linear timeline is the number of days counted from 8 July 1633 CE, which falls on “Jemuwah Lêgi 1 Sura Alip 1555 AJ Jamingiyah,” the first day of the first kurup. With that day count, the 7-day week cycle follows from a simple modulo 7 operation, the pasaran cycle from modulo 5, the wetonan cycle from modulo 35, and the pawukon cycle from modulo 210. Moreover, since any modern digital calendar can set a periodic event, a specific weton date repeated every 35 days will reproduce the corresponding wetonan cycle over the whole calendar.

On the other hand, the computation of dina, wulan, taun, windu, and kurup is more complicated, since the exact sequences vary throughout history following human decisions, making these cycles neither exactly cyclic nor monotonic. Hence, the proposed computing strategy is to construct, for each kurup, a windu table as an 8 × 12 matrix with rows for taun and columns for wulan, each entry containing the length in days of that specific wulan. This matrix is repeated 15 times to form a complete kurup, or fewer times for the second and third kurup. Then, all the matrices are concatenated. The cumulative sum of the table elements, taken in row order, gives the total number of days elapsed from the origin to the beginning of each lunar month, and can be compared with the linear timeline described above. Thus, a simple “table lookup” function gives the corresponding kurup, windu, taun, and wulan indexes, and the dina is given by the remainder.

We have discussed earlier that the Javanese wulan follow the lunar month; they are closely related to, but do not necessarily correspond exactly with, the months of the Islamic Hijri and Tabular Islamic calendars. For the former, the length of the months, whether 29 or 30 days, depends on the moon’s visibility or weather conditions.
The latter, however, adopts a convention of 30-day months for odd-numbered months (and for the 12th month in a leap year) and 29-day months for even-numbered ones. On the other hand, the number of days in a Javanese wulan depends on a complex combination of taun and kurup, as we observed in Table 5. As a consequence, discrepancies are inevitable between the Javanese and Islamic calendars. The two calendars correspond exactly only during the first kurup, i.e., the first 120-lunar-year cycle. Table 11 displays the corresponding Gregorian dates of the Javanese and (Tabular) Islamic New Year, 1 Sura and 1 Muharram, respectively, for the past half century. While more than 85% of the dates match, we observe that, in general, a 1- or 2-day gap may occur between the two calendars. The Gregorian dates in Table 11 can be obtained using the function “weton.m” by typing

>> weton('all',' 1 Sura')
>> weton(datenum(1633,7,8):now,' 1 Sura')
The two syntaxes are strictly equivalent. The output lists 400 items, from 1555 AJ to 1954 AJ. Note that the space before “1” is important; otherwise, 11 and 21 Sura will be returned as well. Additionally, it is essential to start the time window from 8 July 1633 CE, since earlier dates are considered invalid in the Javanese calendar. Here are other examples (recall Kartini’s birthdate):
Table 11 Examples of the Javanese and (Tabular) Islamic New Year for the past 52 years and their corresponding Gregorian dates. The bracketed (6) December 2010 was the observed date

Javanese year  Gregorian date (CE)  Islamic year  Gregorian date (CE)
1900 AJ        30 March 1968        1388 AH       31 March 1968
1906 AJ        24 January 1974      1394 AH       25 January 1974
1914 AJ        29 October 1981      1402 AH       30 October 1981
1922 AJ        3 August 1989        1410 AH       4 August 1989
1927 AJ        11 June 1994         1415 AH       10 June 1994
1930 AJ        8 May 1997           1418 AH       9 May 1997
1935 AJ        16 March 2002        1423 AH       15 March 2002
1944 AJ        8 December 2010      1432 AH       (6) 7/8 December 2010
1951 AJ        23 September 2017    1439 AH       21/22 September 2017
1954 AJ        20 August 2020       1442 AH       19/20 August 2020
>> weton(1:now,'28 Bakdamulud 1808')
Error using weton (line 247)
Some dates are invalid (before 08-Jul-1633)
>> weton('all','28 Bakdamulud 1808')
Senen Pahing Langkir 28 Bakdamulud 1808 AJ Be Adi Langkir Arbangiyah (Aboge), 21 April 1879 CE
>> weton(1879,4,21)
Senen Pahing Langkir 28 Bakdamulud 1808 AJ Be Adi Langkir Arbangiyah (Aboge), 21 April 1879 CE
>>
Observe that the last two distinct inputs produce identical outputs. In the Islamic Hijri calendar, 21 April 1879 CE corresponds to 28 Rabi‘ ath-Thani 1296 AH. More details of the Islamic calendar and its relationship with the Gregorian calendar are beyond the scope of this chapter. Interested readers may consult Proudfoot (2006), Cohn (2007), Reingold and Dershowitz (2018), and the references therein.
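The day-count approach described in this section can be sketched in a few lines. The example below is Python, not the “weton.m” function itself, and covers only the strictly periodic cycles. The names `weton_of`, `DINA`, and `PASARAN` are illustrative, and the spellings “Ngahad,” “Kêmis,” and “Pon” are assumptions, since those names do not appear in this section. The epoch and its weton (Jemuwah Lêgi) are taken from the text.

```python
from datetime import date

# Epoch from the text: 8 July 1633 CE = Jemuwah Lêgi, 1 Sura Alip 1555 AJ.
EPOCH = date(1633, 7, 8)

# Both cycles are listed starting from the epoch's own weekday and pasaran.
# "Ngahad", "Kêmis" and "Pon" are assumed spellings (not given in this section).
DINA = ["Jemuwah", "Sêtu", "Ngahad", "Senen", "Selasa", "Rêbo", "Kêmis"]
PASARAN = ["Lêgi", "Pahing", "Pon", "Wage", "Kliwon"]


def weton_of(d):
    """Return (weekday, pasaran, offset in 35-day cycle, offset in 210-day cycle)."""
    n = (d - EPOCH).days  # linear timeline: days elapsed since the epoch
    if n < 0:
        raise ValueError("date precedes 8 July 1633 CE")
    return DINA[n % 7], PASARAN[n % 5], n % 35, n % 210


if __name__ == "__main__":
    for d in (date(1879, 4, 21), date(1944, 1, 1), date(2021, 4, 18)):
        print(d.isoformat(), *weton_of(d))
```

For 21 April 1879 this gives Senen Pahing, in agreement with the session above. The offsets in the 35- and 210-day cycles are relative to the epoch and are not mapped to weton or wuku names here.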
4 Discussion and Epilogue
4.1 Javanese View of Time
Unlike other calendars, whose calculation is often based mainly on the movement of celestial bodies such as the Sun, the Moon, and the stars, the Javanese calendar is composed of several cycles, including the solar-based Gregorian and lunar-based Islamic calendars. The Javanese people have a different perspective on time, in which some of these cycles are unrelated to the movement of celestial bodies or to seasonal changes, such as the 5-day cycle pasaran, 35-day
cycle wetonan, and 210-day cycle pawukon. This could be compared with the 260-day cycle Tzolkin in the Maya calendar (Jenkins 1994; Rice 2007; Cohn 2007; Ascher 2002). On the other hand, dina, wulan, taun, windu, and kurup correspond to the familiar cycles we encounter in other ancient and modern calendars, i.e., daily, (lunar) monthly, yearly, 8-yearly, and 120-yearly cycles, respectively.

Because the island of Java lies less than 1000 km from the Equator, the change of seasons should be interpreted differently from that in the subtropical regions. The climate tends to be relatively even year-round. There is only a slight variation in daylight hours, sunrise, and sunset throughout the year. For example, the longest and shortest days in Yogyakarta (7.78° S, 110.37° E), at the southern hemisphere summer (December) and winter (June) solstices, last 12 hours and 38 minutes and 11 hours and 43 minutes, respectively (Beauducel 2020b). Although one could hardly notice this 1-hour difference over the 6-month period, Javanese residents do notice the shift in the times of sunrise and sunset. The morning civil twilight, when the geometric center of the Sun is 6° below the horizon, begins at 05:30 when the Sun reaches the Tropic of Cancer in June, but it can be observed before 05:00 when the Sun reaches the Tropic of Capricorn in December. Neither the solstices nor the equinoxes are marked by significant events in the Javanese community. For people who live in subtropical regions, these astronomical events can be considerably important and mark significant occasions, both chronologically and culturally. For example, the traditional East Asian lunisolar calendars divide a year into 24 solar terms, jiéqì (節氣).
Traditionally, East Asian communities celebrate the Dōngzhì (冬至) festival at the winter (December) solstice, on one of the days between 21 and 23 December, when people make and eat tāngyuán (湯圓), balls of glutinous rice flour served in a hot broth or syrup. Nonetheless, this does not mean that seasonal change is completely absent in Java. The following subsection briefly discusses another “calendrical” system uniquely designed for the Javanese climate.
4.2 Pranatamangsa Revisited
Pranatamangsa means “the regulation or arrangement of the seasons.” It is a “calendrical” system with two millennia of history, used not only for agricultural activities among rural peasants in Java but also for fishing among Javanese fishermen. Thanks to the regular seasonal rhythms, farmers and fishermen were able to organize their activities productively using this “season keeper” (Daldjoeni 1984; Hidayat 2000). The pranatamangsa calendar is arranged to fit the solar calendar with its 365- or 366-day annual period, but the periods (mangsa) are of unequal length, ranging from 23 to 43 days. The numbers of days exhibit a symmetric pattern between the first and second semesters. Coincidentally, the 1st, 2nd, and 6th mangsa, and hence the 12th, 11th, and 7th mangsa, have prime numbers 41, 23, and
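Table 12 Pranatamangsa at a glance

No.  Starting day   Name of mangsa  Length (days)  Season
1    23 June        Kaso            41             Kemarau or Ketiga (Dry season)
2    3 August       Karo            23             Kemarau or Ketiga (Dry season)
3    26 August      Katelu          24             Kemarau or Ketiga (Dry season)
4    19 September   Kapat           25             Labuh (Transition; beginning of rainy season)
5    14 October     Kalima          27             Labuh (Transition; beginning of rainy season)
6    11 November    Kanem           43             Labuh (Transition; beginning of rainy season)
7    23 December    Kapitu          43             Penghujan or Rendheng (Wet season)
8    4/5 February   Kawolu          26/27          Penghujan or Rendheng (Wet season)
9    2 March        Kasanga         25             Penghujan or Rendheng (Wet season)
10   27 March       Kasadasa        24             Maréng (Transition; ending of rainy season)
11   20 April       Désta           23             Maréng (Transition; ending of rainy season)
12   13 May         Saddha          41             Maréng (Transition; ending of rainy season)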
43, respectively, as their lengths in days. The cycle starts at the beginning of the dry season on 23 June, around the winter solstice in the southern hemisphere (Van Den Bosch 1980; Daldjoeni 1984). Table 12 lists the names of the mangsa, the period of each within pranatamangsa, and the four main seasons. When the eighth mangsa, Kawolu, has 26 days, the year is a common year or wuntu. When a year is a leap year or wastu, Kawolu has 27 days.

According to the most popular climate classification system, the Köppen-Geiger scheme, the island of Java features both tropical rainforest and monsoon climates, i.e., Af and Am, respectively (Beck et al. 2018, 2020). Thus, dry and wet seasons alternate throughout the year. However, as we observed in Table 12, the two seasons of Labuh and Maréng serve as transitions from one distinct season to another (cf. Kristoko et al. 2012; Zaki et al. 2020).

Several developments bear on the future of the pranatamangsa system. First, due to climate change and global warming, seasonal rhythms tend to be more irregular than they used to be. Second, with the advent of technology, the historically twice-yearly harvest may be extended to more than two harvests a year. Third, with regional and global industrialization, many farmers have abandoned or sold their land, taken work in factories, and relocated to urban areas. Pranatamangsa, these implications, and other possible ramifications are topics in themselves; an in-depth discussion is beyond the scope of this chapter and should be covered elsewhere. We close this section and the chapter with a brief conclusion.
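Returning to Table 12, the starting days can be cross-checked by accumulating the mangsa lengths from 23 June. The following Python sketch does this; `MANGSA` and `mangsa_starts` are illustrative names, and the choice of 2021 as the starting year is arbitrary.

```python
from datetime import date, timedelta

# (name, length in days) as listed in Table 12; Kawolu is given its
# common-year length of 26 days here and can be overridden below.
MANGSA = [
    ("Kaso", 41), ("Karo", 23), ("Katelu", 24), ("Kapat", 25),
    ("Kalima", 27), ("Kanem", 43), ("Kapitu", 43), ("Kawolu", 26),
    ("Kasanga", 25), ("Kasadasa", 24), ("Désta", 23), ("Saddha", 41),
]


def mangsa_starts(year, kawolu_days=26):
    """Return ([(name, start date), ...], start of the next cycle).

    The cycle is taken to begin on 23 June of `year`, as stated in the text.
    """
    start = date(year, 6, 23)
    starts = []
    for name, length in MANGSA:
        starts.append((name, start))
        days = kawolu_days if name == "Kawolu" else length
        start += timedelta(days=days)
    return starts, start


if __name__ == "__main__":
    starts, next_cycle = mangsa_starts(2021)
    for name, day in starts:
        print(f"{name:9s} {day.day} {day.strftime('%B')}")
    print("next cycle:", next_cycle.isoformat())
```

The lengths sum to 365 days, so the cycle closes on 23 June of the following year, and a 27-day Kawolu absorbs 29 February. Computed this way, Kanem begins on 10 November, one day earlier than the 11 November listed in Table 12; the other eleven starting days agree (taking 4 February for Kawolu).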
4.3 Conclusion
In this chapter, we have discussed the cultural, historical, and arithmetic aspects of the Javanese calendar. Along with the internationally acknowledged Gregorian and
the Islamic Hijri calendars, the latter embraced by the religious majority, the Javanese calendar holds a unique place in the hearts of many Javanese people in Indonesia as well as the Javanese diaspora overseas. Although many Javanese people have adopted a modern lifestyle, the Javanese calendar is still used in various daily affairs, including choosing the best possible time for a wedding day. While the day of the week for any given date can be computed using Zeller’s congruence algorithm modulo 7, the pasaran day of the Javanese calendar can be calculated using a new congruence formula modulo 5. Additionally, we have proposed a unique combined congruence formula for calculating both the day of the week and the pasaran day for any given date in the Gregorian calendar. Furthermore, using the GNU Octave function “weton.m” (Beauducel 2020c), all the cycles of the Javanese calendar, i.e., wetonan, wuku, wulan, windu, lambang, and kurup, can also be determined straightforwardly.

Dedication NK would like to dedicate this chapter to his late father Zakaria Karjanto (Khouw Kim Soey, 許金瑞), who introduced and taught him the alphabet, numbers, and the calendar in his early childhood. Karjanto senior was born in Tasikmalaya, West Java, Japanese-occupied Dutch East Indies on 1 January 1944 (Saturday Pahing) and died in Bandung, West Java, Indonesia on 18 April 2021 (Sunday Wage).

Acknowledgment The authors would like to thank Matthew Arciniega (Vortx, Inc., Ashland, Oregon, United States of America) for sharing the contents of his old website and Roberto Rizzo (University of Milano–Bicocca, Italy) for pointing out the article written by Proudfoot (2007). FB warmly thanks Alix Aimée Triyanti for her cultural influence and inspiration, and all his Javanese friends for their enthusiastic support.
References
Ammarell G (1988) Sky calendars of the Indo-Malay archipelago: Regional diversity/local knowledge. Indonesia 45:85–104
Ammarell G, Tsing AL (2015) Cultural production of skylore in Indonesia. In: Ruggles CLN (ed) Handbook of Archaeoastronomy and Ethnoastronomy. Springer, New York, pp 2207–2214
Ananta A, Arifin EN, Hasbullah MS, Handayani NB, Pramono A (2015) Demography of Indonesia’s Ethnicity. Institute of Southeast Asian Studies, Pasir Panjang, Singapore
Arciniega M (2020) Personal communication. Matthew Arciniega has written various articles on the Javanese calendar on his defunct website www.xentana.com from the 1990s until around 2004. Although the website is no longer accessible, the readers may still find his articles in the Internet Archive Wayback Machine web.archive.org
Ascher M (2002) Mathematics Elsewhere: An Exploration of Ideas Across Cultures. Princeton University Press, Princeton
Aslaksen H (2001) Fake leap months in the Chinese calendar: From the Jesuits to 2033. In: Chan AKL, Clancey GK, Loy HC (eds) Historical Perspectives on East Asian Science, Technology, and Medicine. World Scientific, Singapore, pp 387–393
Aslaksen H (2002) When is Chinese New Year? Griffith Observer 66(2):1–11, National University of Singapore preprint, last updated 13 March 2009
Aslaksen H (2006) The mathematics of the Chinese calendar. National University of Singapore pp 1–52, (Preprint), last updated 17 July 2010
Beauducel F (2020a) Dem: Shaded relief image plot (digital elevation model). https://github.com/IPGP/mapping-matlab. Retrieved 31 Dec 2020
Beauducel F (2020b) Sunrise: sunrise and sunset times. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/64692-sunrise-sunrise-and-sunset-times. Retrieved 31 Dec 2020
Beauducel F (2020c) Weton: Javanese calendar. https://github.com/beaudu/weton. Retrieved 31 Dec 2020
Beck HE, Zimmermann NE, McVicar TR, Vergopolan N, Berg A, Wood EF (2018) Present and future Köppen-Geiger climate classification maps at 1-km resolution. Sci Data 5:180214
Beck HE, Zimmermann NE, McVicar TR, Vergopolan N, Berg A, Wood EF (2020) Publisher correction: Present and future Köppen-Geiger climate classification maps at 1-km resolution. Sci Data 7(1):1–2
Brinton DG (1893) The native calendar of Central America and Mexico. A study in linguistics and symbolism. Proc Am Philos Soc 31(142):258–314
Brown C (2004) A Short History of Indonesia: The Unlikely Nation? Allen & Unwin, Crows Nest
Brown LE (1993) The New Shorter Oxford English Dictionary. Oxford University Press, Oxford
Burton DM (2011) Elementary Number Theory, 7th edn. McGraw-Hill Education, New York
Canth M (1885) Työmiehen vaimo (The worker’s wife). CreateSpace Independent Publishing Platform, (in Finnish)
Carroll L (1887) To find the day of the week for any given date. Nature 517
Catepillán X (2016) An ethnomathematics course and a first-year seminar on the mathematics of the pre-Columbian Americas. Mathematics Education, Springer, pp 273–290
Closs MP (1996) Native American Mathematics. University of Texas Press, Austin
Cohen EL (2000) What day of the week is it? Cubo Matemática Educacional. Math J 2(1):01–11
Cohn M (2007) The mathematics of the calendar. Lulu Press, Morrisville. www.lulu.com
Conway JH (1973) Tomorrow is the day after doomsday. Eureka 36:28–31
Cribb R (2000) Historical Atlas of Indonesia. Routledge Curzon Press/University of Hawaii Press, Surrey/Honolulu
Daldjoeni N (1984) Pranatamangsa, the Javanese agricultural calendar–Its bioclimatological and sociocultural function in developing rural life. Environmentalist 4(7):15–18
Darling D (2004) Marking time. Latitudes 36:unpaginated
Dershowitz N, Reingold EM (2009) Indian calendrical calculations. In: Yadav BS, Mohan M (eds) Ancient Indian Leaps into Mathematics. Springer, New York, pp 1–31
Eade CJ (1995) The Calendrical Systems of Mainland South-East Asia. Brill, Leiden
Errington JJ (1998) Shifting Languages: Interaction and Identity in Javanese Indonesia. Cambridge University Press, Cambridge
Folan WJ (1986) Calendrical and numerical systems of the Nootka. In: Closs MP (ed) Native American Mathematics. University of Texas Press, Austin, pp 93–108
Frederick WH, Worden RL (1993) Indonesia: A Country Study. United States Government Publishing Office for the Library of Congress, Washington, DC
Gardner M (1996) The Universe in a Handkerchief: Lewis Carroll’s Mathematical Recreations, Games, Puzzles, and Word Plays. Copernicus, Springer, New York
Ginaya G (2018) The Balinese calendar system: From its epistemological perspective to axiological practices. Int J Linguistics Lit Cult 4(3):24–37
Gislén L (2018) On lunisolar calendars and intercalation schemes in Southeast Asia. J Astron Hist Herit 21(1):2–6
Gislén L, Eade JC (2019a) The calendars of Southeast Asia. 1: Introduction. J Astron Hist Herit 22(3):407–416
Gislén L, Eade JC (2019b) The calendars of Southeast Asia. 2: Burma, Thailand, Laos and Cambodia. J Astron Hist Herit 22(3):417–430
Gislén L, Eade JC (2019c) The calendars of Southeast Asia. 4: Malaysia and Indonesia. J Astron Hist Herit 22(3):447–457
Gislén L, Eade JC (2019d) The calendars of Southeast Asia. 5: Eclipse calculations, and the longitudes of the Sun, Moon and planets in Burmese and Thai astronomy. J Astron Hist Herit 22(3):458–478
Gislén L, Eade JC (2019e) The calendars of Southeast Asia. 6: Calendrical records. J Astron Hist Herit 22(3):479–491
Golzio KH (2012) The calendar systems of ancient India and their spread to Southeast Asia. In: Boschung D, Wessels-Mevissen C (eds) Figurations of Time in Asia. Wilhelm Fink Verlag, München, pp 204–225
Hassan A (2017) Muslim calendar. In: Çakmak C (ed) Islam–A Worldwide Encyclopedia. ABC-CLIO, Santa Barbara/Denver, pp 1129–1130
Headley SC (2004) The Javanese wuku weeks: Icons of good and bad time. In: Le Roux P, Sellato B, Ivanoff J (eds) Poids et mesures en Asie du Sud-Est (Weights and Measures in Southeast Asia–Metrological Systems and Societies), vol 1, Institut de Recherche sur le Sud-Est Asiatique (Southeast Asian Research Institute), Centre national de la recherche scientifique (CNRS) et Université de Provence (The French National Center for Scientific Research and University of Provence Aix-Marseille I), Marseille, France, pp 211–236
Hidayat B (2000) Indo-Malay astronomy. In: Astronomy Across Cultures: The History of Non-Western Astronomy. Springer, Dordrecht, pp 371–384
Hubber P, Ramseger J (2017) Physical learning environments for science education: An ethnographic field study of primary classrooms in Australia, Germany and Taiwan. In: Hackling M, Ramseger J, Chen HL (eds) Quality Teaching in Primary Science Education. Springer, Cham, pp 51–77
Jenkins JM (1994) Tzolkin: Visionary Perspectives and Calendar Studies. Borderland Sciences Research Foundation, Eureka
Kartini RA (1911) Door duisternis tot licht: Gedachten over en voor het Javaansche volk (Through darkness to light: Thoughts about and for the Javanese people). G.C.T. van Dorp, Semarang and Soerabaja, Dutch East Indies (Indonesia) and ’s-Gravenhage (The Hague), the Netherlands, (in Dutch, with a foreword by J. H. Abendanon)
Kartini RA (1920) Letters of a Javanese princess. Wentworth Press and University Press of America, Sydney, Australia and Lanham, Maryland, United States of America, (translated from Dutch by A. L. Symmers, with a preface by E. Roosevelt, edited and with an Introduction by H. Geertz)
Kristoko H, Eko S, Sri Y, Bistok S (2012) Updated pranata mangsa: Recombination of local knowledge and agro meteorology using fuzzy logic for determining planting pattern. Int J Comput Sci Issues (IJCSI) 9(6):367–372
Lân LT (2019) The calendars of Southeast Asia. 3: Vietnam. J Astron Hist Herit 22(3):431–446
Marr DG, Milner AC (1986) Southeast Asia in the 9th to 14th Centuries. Institute of Southeast Asian Studies, Singapore
Martzloff JC (2000) Chinese mathematical astronomy. In: Selin H, D’Ambrosio U (eds) Mathematics Across Cultures: The History of Non-Western Mathematics. Springer, Dordrecht, pp 373–407
Martzloff JC (2016) Astronomy and Calendars – The Other Chinese Mathematics 104 BC–AD 1644. Springer, Berlin/Heidelberg
Milne J (2020) Class field theory (v4.03). www.jmilne.org/math/
Nuraeni Z, Azizah N (2017) Application of number theory in the calculation of Java calendar. In: Proceedings of Ahmad Dahlan International Conference on Mathematics and Mathematics Education, Yogyakarta, pp 177–181
Oey E (2001) Java, 3rd edn. Periplus Editions Publishing Group/Tuttle Publishing, Jakarta/North Clarendon
Ôhashi Y (2009) Mainland Southeast Asia as a crossroads of Chinese astronomy and Indian astronomy. In: Yadav B, Mohan M (eds) Ancient Indian Leaps into Mathematics. Birkhäuser, Boston, pp 193–200
Pigeaud TGT (1977) Javanese divination and classification. In: de Josselin de Jong PE (ed) Structural Anthropology in the Netherlands: A Reader, Koninklijk Instituut voor Taal-, Land- en Volkenkunde (Royal Institute for Linguistics, Geography, and Ethnology)–Translation Series, Foris, Dordrecht, pp 64–82
Proudfoot I (2006) Old Muslim Calendars of Southeast Asia. Brill, Leiden
Proudfoot I (2007) In search of lost time: Javanese and Balinese understandings of the Indic calendar. Bijdragen tot de Taal-, Land-en Volkenkunde (Journal of the Humanities and Social Sciences of Southeast Asia) 163(1):86–122
Raffles TS (1817) The History of Java. Black, Parbury, and Allen, London, United Kingdom, (two volumes)
Reingold EM, Dershowitz N (2018) Calendrical Calculations, The ultimate (Fourth) Edition. Cambridge University Press, Cambridge
Rice PM (2007) Maya Calendar Origins: Monuments, Mythistory, and the Materialization of Time. University of Texas Press, Austin
Richards EG (1998) Mapping Time: The Calendar and Its History. Oxford University Press, Oxford
Richmond B (1956) Time measurement and calendar construction. Brill Archive
Ricklefs MC (1974) Jogjakarta under Sultan Mangkubumi, 1749-1792: A history of the division of Java, London Oriental Series, vol 30. Oxford University Press, Oxford
Ricklefs MC (1993) A History of Modern Indonesia since c. 1300, 2nd edn. Macmillan Press, Houndmills/Basingstoke/Hampshire/London
Rizzo R (2020) What’s in a name? How Indonesian Buddhism gained its second ‘D’. Globe–Lines of Thought Across Southeast Asia (Published 14 October 2020. The Wednesday Buda can sometimes also be spelled as Budha. Some government officials and civil servants in Indonesia seem to be ignorant by using this term when registering the religious data for the Buddhist community in the country. The correct Indonesian word for Buddhist is Buddha, with double ‘d’. The error has been rectified in a letter issued on 31 December 2018 by the Indonesian Ministry of Religion No. B.3727/DJ.VII/Dt.VII.I.2/BA.00/12/2018)
Rosa M, Shirley L, Gavarrete ME, Alangui WV (2017) Ethnomathematics and Its Diverse Approaches for Mathematics Education. Springer, Cham
Rosalina I (2013) Aplikasi kalender Islam Jawa dalam penentuan bulan Qomariyah: Penyesuaian kalender Saka dengan kalender Hijriyah (Application of Javanese Islamic calendar in determining Qamariyyah month: Adjustment of Saka and Hijri calendars). Undergraduate thesis, Universitas Islam Negeri Maulana Malik Ibrahim, Malang, East Java, Indonesia. http://etheses.uin-malang.ac.id/87/ (in Indonesian)
Rosen KH (2011) Elementary Number Theory and Its Applications, 6th edn. Pearson, London
Rudolph L (2006) The fullness of time. Culture & Psychology 12(2):169–204
Ruggles CLN (2015) Calendars and astronomy. In: Ruggles CLN (ed) Handbook of Archaeoastronomy and Ethnoastronomy. Springer, New York, pp 15–30
Selin H, D’Ambrosio U (2000) Mathematics Across Cultures: The History Of Non-Western Mathematics, Science Across Culture: The History of Non-Western Science Series, vol 2. Springer, Dordrecht
Selin H, Sun X (2000) Astronomy Across Cultures: The History of Non-Western Astronomy, Science Across Culture: The History of Non-Western Science Series, vol 1. Springer, Dordrecht
Soebardi (1965) Calendrical traditions in Indonesia. Madjalah Ilmu-Ilmu Sastra Indonesia (Indonesian Literature Magazine) 3:49–61
Soekmono R (1981) Pengantar Sejarah Kebudayaan Indonesia (An Introduction to Cultural History of Indonesia). Kanisius, Sleman, Special Administrative Region of Yogyakarta, Indonesia, (in Indonesian)
Stakhov A (2009) The Mathematics of Harmony: From Euclid to Contemporary Mathematics and Computer Science, Series on Knots and Everything, vol 22. World Scientific, Singapore, (assisted by Olsen, S)
Sun X (2015) Chinese calendar and mathematical astronomy. In: Ruggles CLN (ed) Handbook of Archaeoastronomy and Ethnoastronomy. Springer, New York, pp 2059–2068
Taylor JG (2003) Indonesia: Peoples and Histories. Yale University Press, New Haven
Urton G, Llanos PN (2010) The Social Life of Numbers: A Quechua Ontology of Numbers and Philosophy of Arithmetic. University of Texas Press, Austin
Utami NW, Sayuti SA, Jailani J (2019) Math and mate in Javanese primbon: Ethnomathematics study. J Math Educ 10(3):341–356
Utami NW, Sayuti SA, Jailani J (2020) An ethnomathematics study of the days on the Javanese calendar for learning mathematics in elementary school. Elem Educ Online 19(3):1295–1305
Van Den Bosch F (1980) Der Javanische Mangsakalender (The Javanese mangsa calendar). Bijdragen tot de Taal-, Land-en Volkenkunde (Journal of the Humanities and Social Sciences of Southeast Asia) 136(2/3de Afl):248–282, (in German)
Vickers A (1990) Balinese texts and historiography. Hist Theory 29(2):158–178
Warsapradongga MD (1892) Koleksi Warsadiningrat (MDW1892a). Yayasan Sastra Lestari, Surakarta
Wollstonecraft M (1792) A Vindication of the Rights of Woman: With Strictures on Political and Moral Subjects. Cambridge University Press, Cambridge
Zaki MK, Noda K, Ito K, Komariah K, Sumani S, Senge M (2020) Adaptation to extreme hydrological events by Javanese society through local knowledge. Sustainability 12(24):10,373
Zeller JCJ (1882) Die Grundaufgaben der Kalenderrechnung auf neue und vereinfachte Weise gelöst (The basic tasks of the calendar calculation solved in a new and simplified manner). Württembergische Vierteljahrshefte für Landesgeschichte (Württemberg Quarterly Books for Regional History), W Kohlhammer, Stuttgart, Germany 5:313–314, (in German)
Zeller JCJ (1883) Problema duplex Calendarii fundamentale (The fundamental of the calendar double problem). Bull Soc Math France (Bull French Math Soc) 11:59–61, (in Latin)
Zeller JCJ (1885) Kalender-Formeln (Calendar formulas). Math-naturwissenschaftliche Mitteilungen des mathematisch-naturwissenschaftlichen Vereins Württemberg (Math Sci Rep Math Sci Assoc Württemberg) 1(1):54–58, (in German)
Zeller JCJ (1887) Kalender-Formeln (Calendar formulas). Acta Math 9(1):131–136, (in German)
Zerubavel E (1989) The Seven Day Circle: The History and Meaning of the Week. University of Chicago Press, Chicago
An Uncertain Travel
Elijah Liflyand

Contents
1 Introduction
2 What Is This About?
2.1 Mathematics Versus Music
2.2 Mathematics Versus Fiction
3 Individual Work and Joint Work
4 Applications: pro et contra
5 Concluding Remarks
References

Abstract
In these notes, I discuss my personal reflections on the practice and working philosophy of mathematics. Some of the issues reviewed definitely have a philosophical sense, or at least flavor. Some are general enough to be matched by similar observations of recognized specialists, which are quoted open-handedly.

Keywords
Social side of mathematical activity · Music · Language · Fiction · Individual work · Joint work · Insignificant variables · Function of bounded variation · Hilbert transform · Stirling numbers · Applications · Seminar
E. Liflyand (*) Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel © Springer Nature Switzerland AG 2023 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_141-1
1 Introduction
The article by Etnyre (2019) concludes with the statement: “Good introductions can elevate the community’s appreciation to your work and quite possibly you, so make sure to write great ones.”
Without casting even a shadow on this instructive article, one may (half-)jokingly ask whether that specific text itself needs a great introduction. Asking this, we come to one of the borders between mathematical and philosophical paradoxes, in the sense that such a border may appear anywhere, making one hesitate over which side of it is more attractive at that very moment. I am not going to discuss purely philosophical paradigms and their reflections in mathematical ways of thinking in full, so to speak, philosophical depth. Instead, I shall try to recall those cases and aspects of my practice as an active researcher in pure mathematics which, in my opinion, are of a general nature common to all humanity, and which I dare, with full justification or not, to call philosophical. Recall Littlewood (1957):
“I asked Prof. Wittgenstein was this not a profound philosophical joke, and he said it was.”
On the other hand, this is probably the moment for a first reference to Persson (2021), where not only the interviewee is heard but the interviewer as well; in particular, the latter said:
“Popper failed really to consider mathematics seriously, probably because like most modern philosophers, and here I very much include Wittgenstein, he did not know much about mathematics and had certainly done no work in mathematics, which is a prerequisite for understanding mathematics.”
Arguing neither against nor for this statement itself, I cherish the hope that I have the mentioned prerequisite, which may allow me to grasp something pertinent in a less familiar area. However, I would like to begin by taking a roundabout approach to the subject. Bulat Okudzhava was a famous Russian poet (of Georgian-Armenian origin). Some of his poems became very fine and popular songs. However, he was also a successful and very interesting prose writer. Some of his novels go back to certain events of nineteenth-century Russian history. Once, interviewed in written form about how a writer like him works on historical issues, he preferred to present a metaphorical song instead of just writing down more or less standard words. Here is this song in my equirhythmic translation (one that can be sung, unlike a non-rhythmical word-for-word translation such as that in Okudzhava (1982)), entitled “I am writing an historical novel”:

In a bottle, darkly brown,
once full of imported beer,
of a rose the scarlet crown
bloomed with soft and proud cheer.
I was writing of the past,
slowly letting it unravel;
from the first page to the last
one was my uncertain travel.

Refrain:
One’s bequeathing as it’s easing.
It is easing as one’s breathing.
As one’s breathing one’s bequeathing,
never bothering to please.
This for nature is a withness.
What for – is not our business,
why – is not for us to quiz.

Distances were colored blue,
pure fiction was in plenty,
and from my own fate I drew,
any thread if it was handy.
Helped my heroes’ ways to weave,
on the history inquired,
and an officer retired
was sometimes my make-believe.

Refrain:

Fiction is not a deceit;
nor a finish is the matter.
Let me keep on writing it
to the very final letter.
And before the rose-leaves die,
which inside the bottle grow,
let me cry the words that lie
stashed away since long ago.

Refrain:
I am aware of the translation by Evgenia Sarkisyants, and some of her phrasing inevitably coincides with mine, but I mainly tried to render the refrain, first of all, and some other pieces as I see them, either for a better meaning (closer to the source), for a better rhythm, or just to my taste, and, of course, keeping in mind the present topic and how this illustration fits it. Back to our subject: mathematical work can, in a sense, be described similarly. In fact, I once tried to parody Okudzhava’s verses as “I am writing a mathematical paper,” but let us leave the poetry aside and return to mathematics and philosophy.
2 What Is This About?
However, it is not that easy to switch to purely mathematical impressions, because of the social side of mathematical activity. The point is that it is impossible to hide in an ivory tower, and very often a mathematician faces the need to explain to laypeople (meaning, of course, people who lack a certain mathematical education) what such a person is doing (and even gets paid for). Unfortunately, an immense majority identify mathematics with numbers and arithmetic calculations (something like an accountant’s work) or with figures and measurements (say, a land surveyor’s). No wonder many people identify mathematicians with computer programmers, although this is by no means the case, and interesting exceptions are just interesting exceptions. On the other hand, on getting acquainted, many people immediately start telling how good or bad they are (or were) at mathematics.
2.1 Mathematics Versus Music
Some time ago, my former tennis partner, a musician, firmly told me that music is something creative, in contrast to mathematics, which is dry routine work. Leaving aside the fact that he had no idea what mathematics is (see the previous paragraph), that pointless discussion drew my attention, and not for the first time, to possible relations between mathematics and music. I am sure that far more advanced minds than mine have thought about these interrelations. However, my feeling is that only mathematicians have been troubled by such reflections. Indeed, many mathematicians are fond of music; some of them play at a professional level. For example, Per Enflo and Louis Rowen used to give concerts, playing the piano and the violoncello, respectively.

Closer to our topic, it seems that research in pure mathematics can justifiably be compared with composing music. Of course, the means are different, as are the ways of perception; however, some similarities are striking. Does a theorem (“theorem” is an intentional simplification; the question may equally concern a new approach, a new notion, or a new theory) exist “somewhere” before it is formulated, proved, and presented (published) by a mathematician? By the way, these steps may sometimes be fulfilled (or, to use a more musical word, performed) by different people or groups. In fact, the mathematicians involved evoke a “melody” from “somewhere,” fixing it by means of “notes”: notations and words. The observation that the number of notes and other musical signs is essentially smaller than the number of mathematical notions and notations should not be a serious objection. After all, there is a diversity of musical instruments, performers, etc. Or, which is simpler and maybe more correct, music can be compared with a specific area of mathematics rather than with mathematics as a whole: combinatorics, for instance. Nor should the produced sound be considered a determining factor. Indeed, there is the notion of inner hearing of music. And there are “professional” music lovers who come to a concert and listen to the music with the score in hand. Anticipating possible objections, let us mention that both manifestations can be emotional.

Looking at the matter the other way round, recall that mathematics is frequently identified with language. To my taste, this is almost a perfect definition (at least, to some extent). Maybe it is more precise to say that, being a language for various phenomena, mathematics is inseparable from the language in which the results, proofs, and conclusions are given. But music is no less frequently called a language of the soul, of feelings, or the like. A distinctive feature of music is the need for performance, with masterly performance being characteristic of the phenomenon. However, mathematicians are very often on stage too, even leaving aside their teaching activity. As V.K. Beloshapka wrote in his reminiscences about Vitushkin (posted at https://arzamas.academy/mag/1051-math?fbclid=IwAR04xUpED2FSKfnEOkCkyz7ZZPFV9PWhFgtpK3P8SkzwvvhR13PRuzo8rrQ):
“The life of a mathematician takes shape as a life of seminars. It is something like concerts, where mathematicians gather and tell something to each other.”
And there are coryphaei, masters, and apprentices in this activity. We shall return to this issue in the sequel.
2.2 Mathematics Versus Fiction
One more activity which, at first glance, seems to have nothing in common with mathematics is fiction. In fact, the process is sometimes exactly the same, or at least the published result of the process is; this concerns crime fiction or detective novels in particular. Let us analyze this assertion. There is something intriguing and hidden; it is searched for all the time, with certain clues found, real or false, and with the proof of the “main result” as the apotheosis. One may object that the audiences are very different, especially in size. That is obviously correct, and the reason is just as obvious: special terms, special language in general, and a greater or lesser amount of symbolism. However, special language and symbolism, important and sometimes decisive, can also be found in modern fiction. Here are a couple of examples that immediately come to mind. In the famous “Jurassic Park” by Michael Crichton, the action cannot move forward without reading and understanding a computer printout. In fact, not only the hero must understand the printout but the reader too. In the no less famous “The Flanders Panel” by Arturo Pérez-Reverte, the reader must understand chess, including chess notation. Clearly, most readers do not understand these matters; they just skip them and, to be honest, get their main pleasure from killing time. The main objection might be that such readers simply enjoy pleasant and talented writing. However, along with talented authors, there is an immense number of authors of a middle or low level. They publish their writings more or less successfully and find their grateful readers. But the same holds among mathematicians: there are very talented authors (in the sense of writing; not all the authors of great results are such) and many more who are not great at writing. Therefore, the main difference is the specially prepared audience, but the same is true of the aforementioned listeners who follow music with the score, or of the spectators of so-called “elite” movies.
To be precise, the comparison with fiction concerned the plot, and it was mostly research papers that were meant. But mathematicians also write books, and quite a few authors write a good many of them. The process is discussed, and various recommendations on the subject are given, in Silverman (2019). The wish to make one’s book a delightful and popular read, with many copies in demand, is apparently the same for any author, regardless of the topic. To mention a recent successful example, Iosevich (2007) is a very impressive one. However, personal experience and conclusions should be mentioned in order to be complete and honest. Having published three books and having one more in progress can be considered material for possible generalizations. First of all, a strong wish to present a readable text imposes certain demands, self-demands. There must be a plot, a scenario, and a thread running through the whole text. There must be something beyond a collection of facts and results. My first book, Iosevich and Liflyand (2014), was an attempt to give an overview of the results in which the Fourier transform can be characterized by its asymptotic behavior. The second one, Liflyand (2019), appeared when the separate facts on the behavior of the Fourier transform of a function of bounded variation seemed to come together in a comprehensive theory; in other words, when the puzzle was almost completed. The third book, Liflyand (2021), is more a textbook than a research monograph. The idea was to extend the first chapter of Liflyand (2019), given there as a necessary brief toolkit for the subsequent material, into a full-fledged textbook on harmonic analysis on the real line. In each of these cases, the goal was to present the author’s view of a certain topic in a readable way and in more or less full generality. In short, each time this is a comprehension of the topic as a whole: with a slight abuse of terminology, a philosophical comprehension. On the other hand, within the classification of mathematicians into theory builders and problem solvers, the one who writes a monograph (not a textbook) switches, in a sense, from the latter to the former.
3 Individual Work and Joint Work
The problem of the general and the individual is considered one of the fundamental problems of philosophy and psychology. I believe that in mathematics it expresses itself in research. Apart from the very clear individual manner of thinking about a mathematical problem, there are different models of collaboration. Luckily, publications authored by 20 or 40 people are really marginal in mathematics and, when they do appear, are mostly not of a research nature. Two or three authors form the most typical collaborative groups in mathematics; by the way, the Research in Pairs program in Oberwolfach assumes two to four collaborators. I do have experience of working with three other coauthors, but it is unique and probably not enough for conclusions. So, let us discuss the cases where two or three people struggle with a problem or write a survey text.

The most natural and most fruitful collaboration seems to be that of two colleagues. Very often such collaborators are friends with experience in discussing problems of mutual interest or even in joint work on other problems. On the other hand, very often an accidental collaboration becomes the basis for a long-standing friendship. In either case, the new or renewed collaboration starts with mutual interest in a certain problem and, what is maybe more beneficial, with the understanding that the collaborators have an idea, or ideas and tools, for a successful attack on the problem. In fact, in most cases there is not much difference if three people participate. In many situations, two of them can be characterized as a pair of the kind described above, while the reasons for the third one to join them are the same as those above for two. To be honest, sometimes such an activity of two or three (or more) is accidental, valid for one occasion only, and does not result in further collaboration. It sometimes turns out that the participants, or some of them, are incompatible; sometimes they just fail to find more problems and tools of truly mutual interest. In all of these situations, successful or not, the issues of psychology or social philosophy turn out to be more important than the professional ones.
Last but not least, what collaboration may protect better than individual work is novelty. The possibility of working on something already discovered can be very troubling. Similarly, there is the danger of elaborating something formally new but trivial in essence. This is well illustrated in the underestimated book Lem (2013) (translated into English only in 2013, decades after its first appearance):
“Not everything constitutes scientific truth: an ocean of insignificant variables is larger than an ocean of stupidity – and this is already saying something.”
Let me describe a case from my practice, where some concrete mathematics is in order. There is a well-known theorem of Hardy and Littlewood. It asserts that if both a periodic function and its conjugate are of bounded variation, then the Fourier series of each converges absolutely. One day the desire arose to extend this statement to the non-periodic case. However, the start of this project was postponed for a long time. That time was spent asking far more knowledgeable colleagues and looking through the literature for such an extension. There was a feeling of disbelief: how could it be that nobody had tried it? Finally, the work was done with one of my colleagues and published in Liflyand and Stadtmüller (2013). And only then, some time later, I was terrified to find a non-periodic extension in a well-known paper, Hille and Tamarkin (1935), which I had looked through many times. Luckily, that story had a happy end: that paper assumed a quite restrictive additional condition, which is not needed in our work or in general. The point was that we benefited from using the modified Hilbert transform, which was not known to Hille and Tamarkin (1935). Hence, the only negative feature of Liflyand and Stadtmüller (2013) turned out to be the absence of a reference to Hille and Tamarkin (1935); the result, however, remained safely new.

On a different occasion, going to a conference, I planned to devote my free time to a particular problem in harmonic analysis. However, already during the flight, I started thinking about a problem of a combinatorial nature and eventually spent the whole week overcoming the relevant obstacles. Flying back, I already had a well-composed theory, with results and proofs. Luckily, I did not believe in the novelty of my speculations for a single moment. Indeed, some time later one of my more experienced and knowledgeable colleagues enlightened me: what I had “discovered” were the Stirling numbers (of the second kind). One can read about them in every book on combinatorics.
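For orientation: the Stirling number of the second kind S(n, k) counts the partitions of an n-element set into k non-empty blocks and satisfies the standard recurrence S(n, k) = k S(n−1, k) + S(n−1, k−1). The following Python sketch merely illustrates that textbook recurrence; the function names `stirling2` and `bell` are chosen here for illustration and are not taken from the calculations described above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of partitions of an n-element set into k non-empty blocks."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    if n == k:
        return 1  # covers S(0, 0) = 1 and S(n, n) = 1
    if k == 0 or k > n:
        return 0
    # The last element either joins one of the k existing blocks
    # or forms a new block on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n: int) -> int:
    """Total number of partitions of an n-element set (sum over all k)."""
    return sum(stirling2(n, k) for k in range(n + 1))
```

For instance, `stirling2(4, 2)` evaluates to 7, the number of ways to split a four-element set into two non-empty blocks.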
However, I was delighted by the coincidence of my calculations with the corresponding piece of the general theory. Unfortunately, similar troubles have happened from time to time, and nobody knows when such an event may occur in the future.

The fears of false novelty and of trivial novelty have already been mentioned. There is one more fear, that of possible mistakes. The whole third chapter of Littlewood (1957) is devoted to a collection of various misunderstandings and misprints, mostly of a logical or linguistic nature. However, in §13 of that chapter a misprint, a wrong sign, resulted in the use and application of a wrong statement. This is closer to what is meant here by fear: the use of a false argument, the omission of an argument, and the like.

Back to the problem of the general and the individual: probably the most common permanent form of the general is the seminar. It has already been mentioned in Beloshapka’s saying above. In fact, the complete citation continues with a funny statement:
“Mathematics is probably transmitted through airborne droplets. Some have possessed immunity to it since their school days, while a few others are ailing: a sort of limited epidemic.”
There are two roles at seminars that everybody plays, frequently or rarely: the speaker and the listener. I have also experienced a third role, that of the long-standing organizer of a seminar. It seems that having a continuously working and regularly attended seminar is entirely beneficial. Sharing one’s own results with colleagues and learning new facts and approaches from them: what can be more fruitful for a mathematician, in addition to the main business of searching for new results and solving challenging problems? As for possible mistakes, a good and erudite audience sometimes saves the speaker from certain (not all, alas) mistakes in time, prior to publication. Of course, there are plenty of people who avoid playing these roles regularly. Well, social philosophy will apparently classify them as a group enjoying equal rights with those mentioned before.
4 Applications: pro et contra
Einstein’s observation (Bloch 1995, p. 243)
“Inasmuch as the mathematical theorems are related to reality, they are not sure; inasmuch as they are sure, they are not related to reality.”
adds fuel to this everlasting controversy. One episode in my personal experience seems very instructive. It concerns the work that finally resulted in Liflyand and Trebels (1998). Here more concrete mathematics comes into play. We consider radial functions on n-dimensional Euclidean space, that is, functions depending only on the Euclidean norm of the variable. As is well known, their Fourier transform is also radial and can be expressed via Bessel functions, in a form more sophisticated than that of the standard Fourier transform. The problem is under what conditions on the functions (or for what classes of radial functions) the multidimensional Fourier transform can be expressed asymptotically in such a way that the leading term is a purely one-dimensional Fourier transform of a function related to the given one. (By the way, such results a priori have numerous obvious applications, in that many important one-dimensional results can be extended “for free” to the radial case.) One day my collaborator demanded examples concerning a certain assertion. I would have been happy to construct some but failed. My position instead was that the Fourier transform is so important and recognized an object that any new result on it speaks for itself and does not need immediate illustration. After a few days of this conflict, neither of us was able to produce an example, and my collaborator had to agree with me (well, the circumstances forced him to do so). This particular situation has, in my opinion, a philosophical side, and not only in the sense of the interrelations of theory and practice. There is one more feature of new results which may justify them regardless of their immediate applicability. What is meant is the aesthetic aspect; of course, a specifically mathematical aesthetics is understood. It cannot substitute for a proof but can be a landmark on the road to a possible proof, or just to interest, or lack thereof, in the issue in question. Along with the place of the problem considered, the aesthetic side may give an a priori belief that certain applications will appear in due time. This is well expressed in Stewart (1995):
“Mathematics is a harmonious whole: except that the harmony is incomplete, because there are always gaps in our knowledge and vaguely understood hints of new interrelations. In this sense, an application of any mathematics from this central body is an application of the whole. If you insist that mathematics justify its existence by providing applications, then an application of one part will justify the whole. We do not cut off a violinist’s feet just because he doesn’t use them in playing the violin: in the same way we ought not to dismiss group theory just because it won’t pay the rent.”
(A remark aside: it does pay, as far as I know!) There is an additional point in this controversial issue: does a person have the right to do whatever he or she is interested in, without thinking much about possible applications or, in a wider sense, about possible relations with practice and other areas? Of course, I mean the professional side of this arbitrariness, not a discussion in Dostoyevsky’s style. I believe that these problems are related to such social issues as the existence of a middle class as opposed to an elite group, its weight, and its role.
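For reference, the Bessel-function representation of the radial Fourier transform mentioned earlier in this section can be sketched as follows. The normalization of the Fourier transform (no factor 2π in the exponent) and the symbols f_0 and J_ν are assumptions made here for illustration; Liflyand and Trebels (1998) may use a different normalization.

```latex
% Assumed convention: \widehat{f}(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-i\langle x,\xi\rangle}\,dx,
% with f radial, f(x) = f_0(|x|), and J_\nu the Bessel function of the first kind.
\[
  \widehat{f}(\xi)
  = (2\pi)^{n/2}\, |\xi|^{1-\frac{n}{2}}
    \int_0^\infty f_0(r)\, J_{\frac{n}{2}-1}\bigl(r|\xi|\bigr)\, r^{\frac{n}{2}}\, dr .
\]
% Sanity check for n = 1: J_{-1/2}(z) = \sqrt{2/(\pi z)}\,\cos z, so the formula
% reduces to the cosine transform of an even function:
\[
  \widehat{f}(\xi) = 2\int_0^\infty f_0(r)\,\cos\bigl(r|\xi|\bigr)\, dr .
\]
```

The one-dimensional case shows why a purely one-dimensional Fourier transform is a natural candidate for the leading term when the Bessel function is replaced by its asymptotics.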
5 Concluding Remarks
Of course, many of the above analogies and comparisons, like those with music and fiction, are relative and serve more for dialogs with strangers. All these similarities and differences are disputable, with numerous pro et contra in each of them. Last but not least, one may ask where the philosophy is here. The answer is: everywhere. Even pondering over these problems makes a serious difference and becomes a decisive step toward philosophical reflection. This, as well as a possible way of getting down to philosophy in mathematics, is well illustrated by Hersh’s explanation in Persson (2021):
“I had gotten hooked on philosophy of math when I volunteered to teach a course that was listed in my department’s catalogue as Foundations of Mathematics. No one had ever offered it, before or since. I expected to just do my usual thing when teaching a subject I know nothing about – pick the best textbook I can find, and stay a chapter ahead of the students. Not this time! All the textbooks I found simply presented three viewpoints – logicist, intuitionist, and formalist, and left it plain that all three were inadequate, unsatisfactory, failures. End of course!! As a teacher I found that situation deeply unacceptable. After all, I ought to at least know what was my own personal philosophy of math. But I found that I simply didn’t know. So I had to find out where I stood, what was my understanding of the nature of the subject to which I devoted my life. On my part, The Mathematical Experience was a stage in my struggle to figure out my own answers. Then also, my career as a mathematician had given me a special kind of experience, which had not been much exploited in a literary way. I was very lucky to find Phil Davis as a collaborator. We never dreamed that the book would make such a splash. It was far short of our original intentions, but we were desperate and submitted what we had. It seemed only a rag-bag at the time, but nevertheless, it worked, after all!”
Hoping that these notes will also work to a certain extent, and noting that the topic is inexhaustible, both theoretically and on the level of individual experience, I can recommend for further reading and thinking (in addition to one’s main research work) the publications that attracted my attention, in general and while working on these notes: Bochner (1966), Gromov (2018), Manin (2007), Shapiro (2005), and Stewart (1995), to mention some, and, of course, Hersh’s numerous books and articles. I would also like to mention Wigderson (2019): it is no less philosophical than “technological.” For sure, none of these will be boring.
References
Bloch A (1995) Murphy’s law complete. Mandarin Paperbacks, Berkshire
Bochner S (1966) The role of mathematics in the rise of science. Princeton University Press, Princeton
Etnyre J (2019) The art of writing introductions. Notices AMS 66:361–362
Gromov M (2018) Great circle of mysteries. Mathematics, the world, the mind. Birkhäuser/Springer, Cham
Hille E, Tamarkin JD (1935) On the absolute integrability of Fourier transforms. Fundam Math 25:329–352
Iosevich A (2007) A view from the top. Analysis, combinatorics and number theory, Student mathematical library, 39. American Mathematical Society, Providence
Iosevich A, Liflyand E (2014) Decay of the Fourier transform: analytic and geometric aspects. Birkhäuser
Lem S (2013) Summa technologiae. University of Minnesota Press, Minneapolis
Liflyand E (2019) Functions of bounded variation and their Fourier transforms. Birkhäuser
Liflyand E (2021) Harmonic analysis on the real line – a path in the theory. Birkhäuser/Springer, Cham
Liflyand E, Stadtmüller U (2013) On a Hardy-Littlewood theorem. Bull Inst Math Acad Sinica (New Series) 8:481–489
Liflyand E, Trebels W (1998) On asymptotics for a class of radial Fourier transforms. Z Anal Anwendungen (J Anal Appl) 17:103–114
Littlewood JE (1957) A mathematician’s miscellany. Mandarin Paperbacks, London
Manin Y (2007) Mathematics as metaphor. Selected essays of Yuri I. Manin. With a foreword by Freeman J. Dyson. American Mathematical Society, Providence
Okudzhava B (1982) 65 songs, 2nd edn. Ardis, Ann Arbor
Persson U (2021) A conversation with Reuben Hersh. EMS Mag 121:20–35
Shapiro HT (2005) A larger sense of purpose. Princeton University Press, Princeton
Silverman JH (2019) To write or not to write ... a book, and when. Notices AMS 66:357–358
Stewart I (1995) Concepts of modern mathematics. Dover Publications Inc, New York
Wigderson A (2019) Mathematics and computation. A theory revolutionizing technology and science. Princeton University Press, Princeton
Argumentation in Mathematical Practice Andrew Aberdein and Zoe Ashton
Contents
1 Introduction
2 What Is an Argument?
3 Proof as Argumentation
3.1 Toulmin Layouts
3.2 Argumentation Schemes
4 Mathematical Reasoning as Argumentative
4.1 Mathematics and Rhetoric
4.2 Problem Choice
4.3 Reasoning About Refutations
4.4 Presenting Reasoning
5 Mathematical Communities as Argumentative
References
Abstract
Formal logic has often been seen as uniquely placed to analyze mathematical argumentation. While formal logic is certainly necessary for a complete understanding of mathematical practice, it is not sufficient. Important aspects of mathematical reasoning closely resemble patterns of reasoning in nonmathematical domains. Hence the tools developed to understand informal reasoning, collectively known as argumentation theory, are also applicable to much mathematical argumentation. This chapter investigates some of the details of that application. Consideration is given to the many contrasting meanings of the word “argument”; to some of the specific argumentation-theoretic tools that have been applied to mathematics, notably Toulmin layouts and argumentation schemes; to some of the different ways that argumentation is implicated in mathematical practices; and to the social aspects of mathematical argumentation.
Keywords
Argumentation schemes · Argumentation theory · Mathematical argumentation · Mathematical reasoning · Mathematical rhetoric · Problem choice · Toulmin layouts
A. Aberdein (*) School of Arts and Communication, Florida Institute of Technology, Melbourne, FL, USA e-mail: aberdein@fit.edu Z. Ashton Department of Philosophy, Ohio State University, Columbus, OH, USA © Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_12-1
1 Introduction
Since logic developed the tools to adequately represent formal derivations, many philosophers of mathematics have been tempted to conclude that formal derivation suffices to account for all interesting features of mathematical practice. However, there have always been other philosophers who perceived the shortcomings of such a reduction. Here, for example, is Henri Poincaré:
“If you are present at a game of chess, it will not suffice, for the understanding of the game, to know the rules for moving the pieces. That will only enable you to recognize that each move has been made conformably to these rules, and this knowledge will truly have very little value. Yet this is what the reader of a book on mathematics would do if he were a logician only. To understand the game is wholly another matter; it is to know why the player moves this piece rather than that other which he could have moved without breaking the rules of the game. It is to perceive the inward reason which makes of this series of successive moves a sort of organized whole. This faculty is still more necessary for the player himself, that is, for the inventor” (Poincaré 1913, 218).
One response to this limitation of formal logic is to recognize an analogy with a similar limitation in another domain: formal logic is also an imperfect tool for understanding everyday reasoning. Solutions have been proposed for that problem: systems of informal logic, argumentation theory, or dialectic have been devised since antiquity to address ordinary reasoning (van Eemeren et al. 2014). Hence some philosophers of mathematical practice have reasoned that these theories might also lend themselves to the understanding of mathematical reasoning (Aberdein 2009). This chapter surveys the uses to which argumentation theory has been put in order to understand mathematical practice. Section 2 addresses the many ambiguities implicit in the word “argument” – and how they are specifically related to mathematics. Section 3 discusses two prominent proposals for the application of particular tools from argumentation theory to mathematics: Toulmin layouts and argumentation schemes. Section 4 considers the argumentative aspects of mathematical practices beyond proof and Sect. 5 focuses on the contribution of communities of mathematical practice to such argumentation.
Argumentation in Mathematical Practice
2 What Is an Argument?
Several overlapping distinctions may be drawn in our understanding of arguments.
• Argument-that/Argument-about: A sequence of statements whereby premises offer support for a conclusion is an argument. But, an exchange of conflicting views held by different people is also an argument. We may refer to the former as an argument-that and to the latter as an argument-about. Arguments-that are also known as arguments₁ and arguments-about as arguments₂ (O’Keefe 1977). As Michael Gilbert helpfully glosses the distinction, “one person makes an argument₁ and [at least] two people have an argument₂” (Gilbert 2014, 21). Although the most familiar mathematical arguments tend to be arguments-that, arguments-about arise in mathematics too, such as priority disputes, contested axioms or principles, or debates over the legitimacy of a technique or the admissibility of a proof. A salient recent example is the contested status of Shinichi Mochizuki’s claimed proof of the abc conjecture (Aberdein 2023).
• Process/Product: Arguments-that are sometimes represented as products of the argument-about process, but this is arguably a mischaracterization: “If, as part of organizing the domain of argumentation theory, we merely want to distinguish acts of arguing from arguments-as-objects, we should not use the misleading process/product labels to do so. At the very least such labels imply a relationship that does not exist and so distort our perceptions of the domain of study” (Goddu 2011, 87). Nonetheless, we may usefully distinguish between an argument understood as an act of arguing and the informational trace that act leaves behind (a transcript, a recording, etc.), also often called an argument (Sundholm 2012, 948). This distinction straightforwardly applies to mathematics. (For further reflection on proofs as acts of proving or proof-events, see Goguen 2001; Stefaneas and Vandoulakis 2014, 2015.)
• Monologue/Dialogue/Polylogue: The number of participants in an argument may vary considerably. Most attention has traditionally been paid to monologues and dialogues: arguments-that are characteristically presented as monologues, arguments-about as dialogues. If dialogues are understood loosely, as also covering argumentation with more than two participants, that distinction would exhaust the options. However, some authors have made a case for differentiating the two-participant dialogue from the many-participant polylogue (Lewiński 2014). The Polymath Project, a series of crowd-sourced proofs of open conjectures, is a rich source for research on mathematical polylogues (Allo et al. 2021).
• Small scale/Large scale: Arguments can vary significantly in scale. The scale of an argument may be measured in several conceptually distinct ways. The duration of an argument may range from arguments that take seconds to arguments which last for hundreds of years. The size of an argument may range from a few inferences expressed in short sentences about a simple issue to inferentially complex structures involving a great many very long sentences. Likewise, mathematical proofs can vary in length from a few lines to tens or even hundreds of pages. In exceptional cases, proofs can be so long as to defy the capacity of any single mathematician to survey the whole (Coleman 2009). This can arise in proofs achieved by traditional means, such as that of the classification of finite simple groups, the components of which comprised thousands of pages in several hundred articles by dozens of mathematicians (Steingart 2012). It is even more acute in the case of computer-assisted proofs, such as that of the four-color theorem or the Kepler conjecture, or indeed the subsequent computer verification of these proofs (Gonthier 2008; Hales et al. 2017).
• Static/Dynamic: Static arguments have achieved a final and definite form; dynamic arguments are fluid and ongoing. In general, the evolution of knowledge may be understood as the product of dynamic argumentation. Dynamic arguments are common in everyday life – but they are also central to the development of scientific thought. A well-known and influential analysis of a dynamic argument in mathematics is Imre Lakatos’s Proofs and Refutations. Lakatos shows how the protracted search for a proof of the Descartes–Euler conjecture, which relates the quantities of vertices, edges, and faces of convex polyhedra, involved significant redefinition of most of the concepts used in that conjecture (Lakatos 1976). (For further discussion, see Sect. 4.3 below or Reyes 2021.)
• Centralized/Distributed: Arguments may be either centralized or distributed with respect to several factors including time, people, space, and media. A highly centralized argument may be restricted to what one person communicates in one place at one time in one mode of expression. But arguments may involve a varying cast of arguers and be drawn out over long periods, in multiple locations and media.
Large-scale arguments are characteristically distributed: As a mathematical example, the classification of finite simple groups involved many years of work by a large, geographically widespread collective of mathematicians (Steingart 2012).
• Sequential/Parallel: Logic, whether formal or informal, tends to reconstruct arguments as a linear sequence of premises from which intermediate statements are derived, culminating in a final conclusion. (Some logical systems prefer to invert this sequence.) However, argumentation in natural contexts often occurs in a more parallel fashion: Several strands of argument may be developed simultaneously, the conclusion may be derived from initial premises and then reinforced by subsequent subarguments, and so forth. Parallel arguments are also more likely to arise in projects with many participants, whether by accident or design: “To permit a large collaboration, . . . long proofs have been broken up into series of shorter lemmas” (Hales et al. 2017, 11). The educationists Christine Knipping and David Reid have proposed more fine-grained subdivisions of parallel arguments in mathematics (Knipping and Reid 2019; for further discussion, see Sect. 3.1 below.)
The seven overlapping distinctions addressed so far have received unequal interest in the philosophy of mathematics. The formal logical approach mentioned in the introduction best coheres with a small-scale, static, centralized, sequential product like a published proof. Much recent work in the philosophy of mathematical practice focuses on mathematical arguments along the other dimensions. Some further distinctions arise from consideration of the goals of the arguers:
• Persuasive/Directive/Polemic: That arguments may be distinguished by their objective is an ancient idea: Aristotle distinguished forensic arguments (concerned with past acts), display arguments (concerned with present circumstances), and deliberative arguments (concerned with future acts) (Aristotle 1991, 1358b). Erik C. W. Krabbe and Jan Albert van Laar propose an updated distinction between three different functions of reasoning: persuasive (to convince the other party), directive (to get the other party to act), and polemic (to intimidate the other party) (Krabbe and van Laar 2007, 29 f.). They contrast these “inherently argumentative” functions with three further functions of reasoning: explanatory (to enhance understanding), explorative (to investigate connections between statements), and probative (to establish new knowledge).
• Adversarial/Nonadversarial: On a strict interpretation, all arguments begin in conflict: a difference in belief, or concerning how to act, or of some other kind. On a broader interpretation, arguments need not be strictly adversarial, hence they may proceed from other situations, including shared uncertainty or one party knowing what another does not.
These nine different dimensions of comparison interact in important ways. Firstly, they are not pairwise independent. For example, as the scale of an argument increases, we may tentatively expect both the number of participants and the number of different objectives to increase, but the likelihood of the argument being either static, centralized, or sequential to tend to zero. Secondly, the last two dimensions combine to produce what Douglas Walton calls dialogue types (see Fig. 1, adapted from Walton and Krabbe 1995, 81).
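The questions posed in Fig. 1 can be read as a small decision procedure. The following Python sketch is offered only as an illustration of that reading: the function name `dialogue_type` and its Boolean parameters are hypothetical, and the order of the branches is an assumption based on the labels in the figure (conflict first, then resolution and settlement on one fork, common problem and theoretical problem on the other).

```python
def dialogue_type(conflict, resolution=False, settlement=False,
                  common_problem=False, theoretical=False):
    """Classify a dialogue by answering the yes/no questions of Fig. 1.

    All parameter names are illustrative assumptions; each one stands for
    the answer to one question in the figure.
    """
    if conflict:
        # Adversarial fork: is resolution, or at least settlement, the goal?
        if resolution:
            return "persuasion"
        return "negotiation" if settlement else "eristics"
    # Nonadversarial fork: is there a common problem, and is it theoretical?
    if common_problem:
        return "inquiry" if theoretical else "deliberation"
    return "information seeking"


# A joint attempt to settle an open conjecture: no conflict, a shared
# theoretical problem.
print(dialogue_type(conflict=False, common_problem=True, theoretical=True))
```

On this reading, each of the six dialogue types corresponds to exactly one path through the questions, which is why a dialogue shift can be pictured as a change in the answer to a single question.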
Fig. 1 Determining the type of dialogue (after Walton and Krabbe 1995, 81)
Is there a conflict?
  YES: Is resolution the goal?
    YES: PERSUASION
    NO: Is settlement the goal?
      YES: NEGOTIATION
      NO: ERISTICS
  NO: Is there a common problem to be solved?
    YES: Is this a theoretical problem?
      YES: INQUIRY
      NO: DELIBERATION
    NO: INFORMATION SEEKING
Further complicating this picture, Walton and Krabbe observe that dialogues can shift from one type to another (for example, an inquiry might turn into a persuasion dialogue if one inquirer becomes an advocate for a particular result) or be embedded in a dialogue of a different type (a deliberation over which course of action to pursue might contain an inquiry into the merits of one action, say). Walton maintains that arguments may arise in any dialogue type (Walton 1998); Krabbe is more conservative and regards argumentation as restricted to adversarial contexts (the lefthand fork of Fig. 1) (Krabbe and van Laar 2007, 33).
How do these distinctions apply to argumentation in mathematics? The most widely discussed case is that of mathematical proof, which many authors have maintained is intrinsically argumentative. But even here, we may distinguish multiple distinct activities which give rise to arguments of different kinds. For example, Krabbe observes the following distinct stages:
1. Thinking up a proof to convince oneself of the truth of some theorem
2. Thinking up a proof in dialogue with other people (inquiry dialogue; probative functions of reasoning)
3. Presenting a proof to one’s fellow discussants in an inquiry dialogue (persuasion dialogue embedded in inquiry dialogue; persuasive and probative functions of reasoning)
4. Presenting a proof to other mathematicians, e.g., by publishing it in a journal (persuasion dialogue; persuasive and probative functions of reasoning)
5. Presenting a proof when teaching (information-seeking and persuasion dialogue; explanatory, persuasive, and probative functions of reasoning) (Krabbe 2008, 457)
This sequence is familiar from many mathematicians’ descriptions of the proving process, although in actual examples some steps may be repeated as proof attempts come unstuck. In deference to his intrinsically adversarial conception of argument, Krabbe only considers three dialogue types as hospitable to proofs. In other work, one of us has suggested that proofs (or other mathematical arguments) may be found in other dialogue types too (see Table 1, from Aberdein 2021, 165; see also Aberdein 2007b, 148).
Table 1 Some mathematical dialogue types
Dialogue type | Initial situation | Main goal | Goal of proponent | Goal of respondent
Inquiry | Open-mindedness | Prove or disprove conjecture | Contribute to main goal | Obtain knowledge
Persuasion | Difference of opinion | Resolve difference of opinion with rigor | Persuade respondent | Persuade proponent
Pedagogical information-seeking | Respondent lacks information | Transfer of knowledge | Disseminate knowledge of results and methods | Obtain knowledge
Oracular information-seeking | Proponent lacks information | Transfer of knowledge | Obtain information | Inscrutable
Deliberation | Open-mindedness | Reach a provisional conclusion | Contribute to main goal | Obtain warranted belief
Negotiation | Difference of opinion | Exchange resources for a provisional conclusion | Contribute to main goal | Maximize value of exchange
Eristic | Personal conflict | Reveal deeper conflict | Win in the eyes of onlookers | Win in the eyes of onlookers
“Oracular” information-seeking owes its inspiration to an influential aside of Alan Turing concerning a machine “supplied with some unspecified means of solving number-theoretic problems; a kind of oracle as it were” (Turing 1939, 172). An oracle is a “black box” – it supplies answers but not explanations. For some skeptics of computer-assisted proofs, this is a compelling analogy for the role that the computer plays in such proofs (e.g., Tymoczko 1979).
Deliberation differs from inquiry in seeking only a provisional conclusion. Mathematicians aspire to more permanent stability for their results. Nonetheless, there are circumstances where they are obliged to settle for less than they would wish, despite the rigor of their arguments. These include the “architectural conjectures” upon which many mathematical research programs depend (Mazur 1997, 198). Negotiation characteristically adds resource sensitivity to the provisional outcome typical of deliberation.
While idealized accounts of mathematical practice disregard such factors, they are unavoidable in some contexts, especially in applied mathematics. (And it has been controversially suggested that “semi-rigorous” proofs might come with price tickets, costing the computational resources necessary for certainty [Zeilberger 1993].) Even eristic dialogues can be the context for mathematical reasoning, as demonstrated in the mathematically inventive quarrels of early modern mathematicians such as Girolamo Cardano and Niccolò Tartaglia (Toscano 2020).
A final distinction among the different senses of argument cuts across most of those discussed above and arises not from argumentation theory but directly from mathematical practice. Here it is presented by Joel David Hamkins:
• Hard/Soft: “A hard argument is one that is technically difficult; perhaps it involves a laborious construction or a difficult calculation; perhaps it involves bringing disparate fine details together in just the right combination in order to succeed; or perhaps it involves proving various specific facts about a comparatively abstract construction, perhaps relating disparate levels of abstraction. A soft argument, in contrast, is one that appeals only to very general abstract features of the situation, and one needs hardly to construct or compute anything at all” (Hamkins 2020, 166).
Hamkins’s use of “hard” and “soft” echoes G. H. Hardy’s division of analysis into “the ‘hard, sharp, narrow’ kind as opposed to the ‘soft, large, vague’ kind” (Hardy 1929, 64). It also owes something to Alexander Grothendieck’s celebrated analogy between two strategies for solving mathematical problems and two ways of opening a nut: cracking it with a hammer or softening it in water until it opens with light pressure (McLarty 2007, 301). Grothendieck favored the latter strategy, of immersing problems in a much wider theory from which a solution could (eventually) be readily inferred. This in turn is suggestive of Freeman Dyson’s division of mathematicians into birds and frogs: “Birds fly high in the air and survey broad vistas of mathematics out to the far horizon. They delight in concepts that unify our thinking and bring together diverse problems from different parts of the landscape. Frogs live in the mud below and see only the flowers that grow nearby. They delight in the details of particular objects, and they solve problems one at a time” (Dyson 2009, 212). Hard arguments play to the tightly focused strengths of the frogs; soft arguments require the birds’ sweeping perspective.
3 Proof as Argumentation
The view mentioned in Sect. 1, that formal derivation suffices to account for all interesting features of mathematical practice, squarely focuses on the role of mathematical proof. Pure mathematicians trade in proofs. But, even in the domain of mathematical proof, formal logic does not capture all there is of interest. Toulmin layouts and argumentation schemes, as we will see below, are useful methods of examining mathematical proofs. But, notably, such techniques tease apart the descriptive and normative aspects of (apparent) proof. Whereas a formal derivation is either correct or not, strictly speaking, a derivation at all, the methods of argumentation theory provide the means to describe proofs independently of whether they succeed as proofs.
3.1 Toulmin Layouts
One of the most influential treatments of informal argumentation is that of Stephen Toulmin (1958). His “layout” can represent deductive inference, but encompasses many other species of argument besides. In its simplest form, shown in Fig. 2a, the layout represents the derivation of a claim (C), from data (D), in accordance with a warrant (W). This DWC pattern resembles a deductive inference rule, such as modus ponens, but it can be used to represent looser inferential steps. The differences between the types of inference which the layout may represent are made explicit by the additional elements of the full layout shown in Fig. 2b. The warrant is justified by its dependence on backing (B); possible exceptions, or rebuttals (R), are indicated; and the resultant force of the argument is stated in the qualifier (Q). Hence the full layout may be understood as “Given that D, we can Q claim that C, since W (on account of B), unless R.” In a frequently cited example (derived from Toulmin 1958, 104), “Given that HARRY WAS BORN IN BERMUDA, we can PRESUMABLY claim that HE IS BRITISH, since ANYONE BORN IN BERMUDA WILL GENERALLY BE BRITISH (on account of VARIOUS STATUTES . . .), unless HIS PARENTS WERE ALIENS, SAY.”
Fig. 2 Toulmin layouts: (a) Basic Layout, relating data (D), warrant (W), and claim (C); (b) Full Layout, adding backing (B), qualifier (Q), and rebuttal (R)
Toulmin wrote The Uses of Argument in England in the 1950s as a critique of what he perceived as a formalizing trend in contemporary philosophy; the Toulmin layout was subsequently adopted by communication theorists in America in the 1960s and 1970s; from there it seems to have passed to mathematics educationists in Germany in the 1990s. (Toulmin himself does, however, briefly consider a mathematical example, Theaetetus’s proof that there are exactly five Platonic solids [Toulmin et al. 1979, 89]. For discussion, see [Aberdein 2005, 290 ff.].) In particular, Götz Krummheuer is usually credited with the first sustained application of the Toulmin layout to mathematical argumentation (Krummheuer 1995; for a recent survey, see Krummheuer 2015).
Toulmin draws a distinction between analytic and substantial arguments depending on whether the claim is already at least implicit in the backing. This may suggest that only analytic arguments occur in mathematics.
But that overlooks the emphasis Toulmin places on how out of the ordinary analytic arguments are: “it begins to be a little doubtful whether any genuine, practical argument could ever be properly analytic” (Toulmin 1958, 127). Granted, immediately after this passage, Toulmin places arguments in (pure) mathematics among the analytic arguments. However, the philosophers and educationists who have applied Toulmin’s work to mathematics endorsed his account of argument, not his philosophy of mathematics. They have typically treated (at least some) mathematical arguments as substantial rather than analytic. For example, “It is the substantial argumentation that is seen here as more adequate for the reconstruction” of mathematics classroom situations (Krummheuer 1995, 236). In other words, the part of Toulmin’s work that we should apply to many mathematical arguments is what he has to say about nonmathematical arguments. Much subsequent work applying the Toulmin layout to mathematical reasoning has concerned ways in which it may be extended to cover cases that he does not directly address – including many of the less favored sides of the distinctions drawn in Sect. 2 above. Toulmin’s own later work discusses how more than one layout may be chained together to represent a multistep argument (Toulmin et al. 1979, 79). Other authors have shown how more complicated structures, including linked and convergent arguments, may be represented by combinations of layouts (Aberdein
2006, 214). Christine Knipping and David Reid have paid particular attention to larger-scale structures of parallel argument, distinguishing source-like argumentation structure, where “arguments and ideas arise from a variety of origins, like water welling up from many springs,” from “reservoir structure,” wherein arguments “flow towards intermediate target-conclusions that structure the whole argumentation into parts that are distinct and self-contained,” and spiral argumentation structure in which “the final conclusion is proven in many ways” (Knipping and Reid 2019, 18 ff.). Matthew Inglis and colleagues have argued persuasively for the relevance of the full Toulmin model to mathematical arguments, rather than the simplified DWC or DWBC versions that have found widest application among mathematics educationists (Inglis et al. 2007). In particular, they observe that nondeductive warrants can play an essential role in mathematical argumentation, just so long as this is signaled by the use of appropriate qualifiers. Work has also been done to explore the connections between the Toulmin layout and other models of argument applicable to mathematics (Pease and Aberdein 2011), or to link it to broader conceptual analyses of mathematical cognition, such as the “ck¢-enriched” Toulmin model (Pedemonte and Balacheff 2016).
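The reading of the full layout given earlier in this section ("Given that D, we can Q claim that C, since W (on account of B), unless R") lends itself to a simple data structure. The following Python sketch is illustrative only: the class name `ToulminLayout`, its field names, and the `render` method are hypothetical conveniences, not notation used by Toulmin or by the authors cited above. Leaving the optional fields empty gives the basic DWC layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToulminLayout:
    """One Toulmin layout; backing, qualifier, and rebuttal are optional."""
    data: str
    claim: str
    warrant: str
    backing: Optional[str] = None
    qualifier: Optional[str] = None
    rebuttal: Optional[str] = None

    def render(self) -> str:
        # Assemble the verbal gloss of the layout, skipping absent elements.
        q = f"{self.qualifier} " if self.qualifier else ""
        b = f" (on account of {self.backing})" if self.backing else ""
        r = f", unless {self.rebuttal}" if self.rebuttal else ""
        return (f"Given that {self.data}, we can {q}claim that {self.claim}, "
                f"since {self.warrant}{b}{r}.")


# Toulmin's frequently cited example, in the full layout.
harry = ToulminLayout(
    data="Harry was born in Bermuda",
    claim="he is British",
    warrant="anyone born in Bermuda will generally be British",
    backing="various statutes",
    qualifier="presumably",
    rebuttal="his parents were aliens",
)
print(harry.render())
```

Keeping the qualifier as an explicit field reflects the point made by Inglis and colleagues: a nondeductive warrant is acceptable so long as the qualifier signals it, which a DWC-only record could not express.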
3.2 Argumentation Schemes
Argumentation schemes are stereotypical patterns of reasoning. Although their origins lie in the topoi of classical rhetoric, they have lately found extensive application in the analysis and evaluation of argumentation. This revival is substantially due to the work of the argumentation theorist Douglas Walton. Most attention has been paid to defeasible schemes typical of informal reasoning, although deductive inference rules can also be considered special cases of argumentation schemes. The defeasible nature of the reasoning is not made explicit among the premises, but captured by an additional device, critical questions, which point to possible exceptions. Many of the defeasible schemes may ultimately be understood as more or less specialized instances of the very general scheme of Defeasible Modus Ponens (Walton et al. 2008, 366). In Scheme 1, we have presented it in a way designed to bring out its similarities to the Toulmin layout:
Argumentation Scheme 1 Defeasible Modus Ponens
Data: P.
Warrant: As a rule, if P, then Q.
Therefore,
Qualifier: presumably,
Conclusion: . . . Q.
Critical Questions
1. Backing: What reason is there to accept that, as a rule, if P, then Q?
2. Rebuttal: Is the present case an exception to the rule that if P, then Q?
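The division of labor between premises and critical questions in Scheme 1 can be sketched in code. The Python fragment below is a hedged illustration, not an implementation of any existing argumentation library: the names `CriticalQuestion`, `SchemeInstance`, `defeasible_modus_ponens`, and the `defeats` flag are all hypothetical, and treating a conclusion as standing until some critical question is marked as defeating it is a modeling assumption made here for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalQuestion:
    label: str
    text: str
    defeats: bool = False  # assumption: set to True when the question exposes a flaw


@dataclass
class SchemeInstance:
    name: str
    premises: dict
    conclusion: str
    questions: list = field(default_factory=list)

    def status(self) -> str:
        # The conclusion stands presumptively unless a question defeats it.
        defeated = [q.label for q in self.questions if q.defeats]
        if defeated:
            return "defeated by: " + ", ".join(defeated)
        return f"presumably, {self.conclusion}"


def defeasible_modus_ponens(p: str, q: str) -> SchemeInstance:
    """Instantiate Scheme 1 for data p and conclusion q."""
    return SchemeInstance(
        name="Defeasible Modus Ponens",
        premises={"Data": p, "Warrant": f"As a rule, if {p}, then {q}."},
        conclusion=q,
        questions=[
            CriticalQuestion("Backing", f"What reason is there to accept that, as a rule, if {p}, then {q}?"),
            CriticalQuestion("Rebuttal", f"Is the present case an exception to the rule that if {p}, then {q}?"),
        ],
    )


arg = defeasible_modus_ponens("Harry was born in Bermuda", "Harry is British")
print(arg.status())
arg.questions[1].defeats = True  # suppose the case turns out to be an exception
print(arg.status())
```

The defeasibility lives entirely in the critical questions rather than in the premises, which mirrors the way the scheme itself is stated.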
The strength of the argumentation scheme approach lies in its heterogeneity: An influential (but not exhaustive) survey identifies over one hundred different schemes (Walton et al. 2008, 308 ff.). Hence schemes are typically presented with much greater specificity than Scheme 1. For example, Scheme 2 is a scheme for Argument from Analogy:
Argumentation Scheme 2 Argument from Analogy
Similarity Premise: Generally, case C₁ is similar to case C₂.
Base Premise: A is true (false) in case C₁.
Conclusion: A is true (false) in case C₂.
Critical Questions
1. Are there differences between C₁ and C₂ that would tend to undermine the force of the similarity cited?
2. Is A true (false) in C₁?
3. Is there some other case C₃ that is also similar to C₁, but in which A is false (true)? (Walton et al. 2008, 315).
Analogies in mathematics can be formal and thereby capable of rigorous proof (for a specific example and further discussion of this scheme, see Aberdein 2013b, 373). They can also be informal heuristics, for example, the “strong analogy between the pluralist nature of set theory and what has emerged as an established plurality in the foundations of geometry” (Hamkins 2020, 296; for further discussion of this analogy, see Berry 2020). Analogical reasoning has been a topic of wide interest and mathematical analogies in particular have been the subject of focused discussion (Schlimm 2008; Bartha 2013; Priestley 2013).
In more recent work, Walton proposed a partial taxonomy of schemes (Walton and Macagno 2015, 22), although he acknowledged that some schemes remained outside this classification. Many of these schemes have been applied to mathematical arguments: Table 2 is based on (Walton and Macagno 2015, 22, Table 1), but adds references to prior work in which such applications have been developed. Just as Walton’s classification of schemes is incomplete, so is their application to mathematics.
Table 2 Walton and Macagno’s partial classification of schemes (adapted from Walton and Macagno 2015, 22), with prior applications to mathematical argumentation indicated
Discovery arguments
1. Arguments establishing rules
  Argument from a random sample to a population
  Argument from best explanation
2. Arguments finding entities
  Argument from sign (f, h, i, k)
  Argument from ignorance (c)
Applying rules to cases
1. Arguments based on cases
  Argument from established rule (f)
  Argument from verbal classification (b, i, j)
  Argument from cause to effect
2. Defeasible rule-based arguments
  Argument from example (c, e, i, j)
  Argument from analogy (d, j)
  Argument from precedent (k)
3. Chained arguments connecting rules and cases
  Argument from gradualism (c)
  Precedent slippery slope argument
  Sorites slippery slope argument
Practical reasoning
1. Instrumental argument from practical reasoning
  Argument from an action to motive
2. Argument from values
  Argument from fairness
3. Value-based argument from practical reasoning (g)
  a. Argument from positive or negative consequences (c, i, j)
  Argument from waste
  Argument from threat
  Argument from sunk costs
Source-dependent arguments
1. Arguments from position to know
  a. Argument from expert opinion (a, c, i)
  b. Argument from position to know (c)
  Argument from witness testimony
2. Ad hominem arguments
  a. Direct ad hominem
  b. Circumstantial ad hominem
  Argument from inconsistent commitment
  Arguments attacking personal credibility
    i. Arguments from allegation of bias
    ii. Poisoning the well by alleging group bias
3. Arguments from popular acceptance
  Argument from popular opinion (b)
  Argument from popular practice (b)
Prior applications: (a) Aberdein (2007a); (b) Aberdein (2010); (c) Aberdein (2013a); (d) Aberdein (2013b); (e) Aberdein (2019); (f) Aberdein (2021); (g) Cantù (2013); (h) Dove (2009); (i) Metaxas (2015); (j) Metaxas et al. (2016); (k) Pease and Aberdein (2011)
Some of these schemes may be of limited usefulness in the analysis of specifically mathematical argumentation, but others have direct application. By varying which schemes are treated as admissible, it is possible to capture different conceptions of mathematical rigor. To this end, one of us has proposed a threefold distinction among the ways schemes may be employed in mathematical reasoning (Aberdein 2013b, 366 f.):
• A-schemes correspond directly to derivation rules. (Equivalently, we could think in terms of a single A-scheme, the “pointing scheme” which picks out a derivation whose premises and conclusion are formal counterparts of its data and claim.)
• B-schemes are exclusively mathematical arguments: high-level algorithms or macros. Their instantiations correspond to substructures of derivations rather than individual derivations (and they may appeal to additional formally verified propositions).
• C-schemes are even looser in their relationship to derivations, since the link between their data and claim need not be deductive. Specific instantiations may still correspond to derivations, but there will be no guarantee that this is so and no procedure that will always yield the required structure even when it exists.
Thus, where the qualifier of A- and B-schemes will always indicate deductive certainty, the qualifiers of C-schemes may exhibit more diversity. Indeed, different instantiations of the same scheme may have different qualifiers.
B-schemes are essentially what Saunders Mac Lane calls “processes of proof” or general rules, algorithmic procedures that are ultimately reducible to elementary logical inferences although not necessarily so analyzed by the mathematicians who routinely employ them: “whenever a group of elementary processes of proof occurs repeatedly in the course of many proofs, it is desirable to formulate this group of steps once for all as a new process” (Mac Lane 1935, 123). Much more recently, Yacin Hamami has used B-schemes, which he terms “hl-rules” or higher-level rules of inference, to defend the “standard view” of mathematical rigor, that rigorous proofs are those for which there is a routine translation into a formal derivation (Hamami 2022). Hamami’s account of rigor has three components: a descriptive thesis, a normative thesis, and a philosophical thesis asserting the conformity of the other two theses. Relative to some mathematical practice $\mathcal{M}$, these theses may be stated as follows.
The descriptive thesis states that a mathematical proof $P$ is rigorous$_D$ if and only if for every mathematical inference $I$ in $P$, there exist $D \in \mathcal{D}^*$ and $V_1, \ldots, V_n \in \mathcal{V}^*$ such that (1) $D(I) = \langle I_1, \ldots, I_n \rangle$ and (2) $V_i(I_i) = \text{valid}$ for all $i \in \llbracket 1, n \rrbracket$, where the set $\mathcal{D}^*$ consists of decomposition (or proof search) processes, whereby a mathematical inference may be rewritten as a sequence of immediate mathematical inferences, and the set $\mathcal{V}^*$ consists of hl-rules, whereby immediate mathematical inferences are judged valid if they correspond to instances of the hl-rules.
The normative thesis states that a mathematical proof $P$ is rigorous$_N$ if and only if $P$ can be routinely translated into a formal proof. Hamami defines “routine translation” as the composition of three successive translations between proofs understood at four levels of granularity: vernacular level proofs, comprised of inferences presented at the level of formality normal to $\mathcal{M}$; higher-level proofs, comprised of inferences instantiating hl-rules in $\mathcal{M}$; intermediate-level proofs, comprised of inferences instantiating primitive rules of inference in $\mathcal{M}$; and lower-level proofs, comprised of inferences instantiating rules of inference in a purely formal system.
The conformity thesis states that if $P$ is rigorous$_D$ then $P$ is rigorous$_N$. Hamami’s account of rigor corresponds to one of four alternatives that one of us has discussed elsewhere (Aberdein 2013b, 369). It may be contrasted with the more conservative policy of only admitting A-schemes and thereby treating only formalized mathematics as truly rigorous and more liberal options in which C-schemes are also admitted, thereby tolerating a greater diversity of innovation in mathematical proof.
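The quantifier structure of the descriptive thesis (for every inference, some decomposition whose steps are each validated by some hl-rule) can be made concrete with a toy check. The Python sketch below is only an illustration under strong simplifying assumptions: inferences are represented as plain strings, a decomposition process is modeled as a function that returns a list of immediate inferences or `None` when it does not apply, and an hl-rule is modeled as a Boolean test. None of these representations, nor the names `is_rigorous_d`, `split_by_table`, and `accepted_pattern`, come from Hamami's account.

```python
from typing import Callable, List, Optional, Sequence

Inference = str  # assumption: an inference is identified by a plain label
Decomposition = Callable[[Inference], Optional[List[Inference]]]
HLRule = Callable[[Inference], bool]


def inference_is_rigorous(inference: Inference,
                          decompositions: Sequence[Decomposition],
                          hl_rules: Sequence[HLRule]) -> bool:
    """True if some decomposition yields steps that are each validated by some hl-rule."""
    for decompose in decompositions:
        steps = decompose(inference)
        if not steps:  # this decomposition does not apply, or yields nothing
            continue
        if all(any(rule(step) for rule in hl_rules) for step in steps):
            return True
    return False


def is_rigorous_d(proof: Sequence[Inference],
                  decompositions: Sequence[Decomposition],
                  hl_rules: Sequence[HLRule]) -> bool:
    """Every inference in the proof must pass the check above."""
    return all(inference_is_rigorous(i, decompositions, hl_rules) for i in proof)


# Toy data, purely illustrative.
IMMEDIATE = {"a = b, so b = a", "b = a, so f(b) = f(a)"}


def split_by_table(inference: Inference) -> Optional[List[Inference]]:
    table = {"a = b, so f(b) = f(a)": ["a = b, so b = a", "b = a, so f(b) = f(a)"]}
    return table.get(inference)


def accepted_pattern(step: Inference) -> bool:
    return step in IMMEDIATE


print(is_rigorous_d(["a = b, so f(b) = f(a)"], [split_by_table], [accepted_pattern]))
print(is_rigorous_d(["a = b, so a = 0"], [split_by_table], [accepted_pattern]))
```

In this toy setting, restricting `hl_rules` to formal derivation rules would correspond to the more conservative A-scheme policy; C-schemes have no counterpart here, since the sketch assumes every accepted step is deductively valid.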
4
A. Aberdein and Z. Ashton
Mathematical Reasoning as Argumentative
In Sect. 3, we saw that to understand proof through the lens of argumentation theory is to see it in the context of a greater diversity of types of mathematical argument. While proofs are a vital component of modern mathematical practice, it is not the only aspect of mathematical reasoning to which argumentation theory may be applied. We now turn to these other aspects. Our discussion begins with general claims about mathematical reasoning and rhetoric. We then turn to a number of issues which surround proof including problem choice, reasoning about refutations, and presentation of mathematical information.
4.1
Mathematics and Rhetoric
A first connection between mathematical reasoning and argumentation theory involves rhetoric. Rhetoric, the study of the art of persuasive argument, has long been set in opposition to mathematics. It was thought that mathematics, as an objective, rational, and atemporal field, has little to do with the study of persuasion. But a number of authors have challenged this idea. An early paper by Philip J. Davis and Reuben Hersh identified two areas where mathematics and rhetoric intersect (Davis and Hersh 1986). The first involves importing or applying mathematics to theories in the social sciences, such as psychodynamics and economics. Such appeals to mathematization are rhetorical and argumentative moves, but they are not argumentation within mathematical reasoning. However, as Davis and Hersh point out, there is rhetoric in mathematics too, since “all proofs are incomplete, from the viewpoint of formal logic” (Davis and Hersh 1986, 66). Each proof requires rhetorical elements to convince the intended audience that the result is true. The mathematician relies on the audience’s background knowledge or intuition to patch up gaps in the proof and understand the intentions of the prover. Like Davis and Hersh, Edward Schiappa discusses multiple ways in which mathematical reasoning can be rhetorical. The first intersection, again, is the rhetorical use of mathematics. Mathematical methods can be used to persuade in a variety of arguments. Schiappa cites examples ranging from mundane advertisement – “four out of five dentists agree” – to the discovery of Neptune to argue that mathematical reasoning plays a role in lending credibility to arguments outside its purview (Schiappa 2021). The second intersection is the role of rhetoric within mathematics: the argumentative and stylistic modes of persuasion in mathematical arguments. Each aspect of mathematical practice historically required a social and persuasive component. 
Acceptance of axioms and stipulated definitions depends on the audience one aims to persuade. Even the available concepts which mathematicians reason about can be the result of rhetoric. For example, G. Mitchell Reyes argues that, since there was no empirical or geometric verification for infinitesimals, their substance was found in the rhetorical argumentation which surrounded them (Reyes 2004). Schiappa also argues that the language of mathematics is rhetorical since it is human-made. Symbols and concepts like the infinitesimal or the number zero were
additions where “social acceptance was not assured, meaning was contested, and alternatives competed” (Schiappa 2021, 49). Relatedly, in this collection, Reyes examines the relationship between rhetoric and mathematics in Lakatos’s Proofs and Refutations (Reyes 2021).
4.2
Problem Choice
In addition to the rhetorical components pervasive in nonproof, there is room for argumentation in the process around proofs. Perhaps the most fundamental part of solving a problem, and of proving a theorem, is selecting a problem. Problem choice in mathematics has frequently been associated with the beauty and intrinsic worth of problems. Under this view, there is a special sensibility mathematicians employ when choosing a subject. According to Jacques Hadamard, mathematicians “feel that such a direction of investigation is worth following; [they] feel that the question in itself deserves interest . . . everybody is free to call or not to call that a feeling of beauty” (Hadamard 1945, 127). This approach to problem choice has been supplemented in recent years. Morten Misfeldt and Mikkel Willum Johansen interviewed research mathematicians about the factors which influence problem choice (Misfeldt and Johansen 2015). In line with Hadamard’s claims, certain external factors, like funding, were not very influential on problem choice. Misfeldt and Johansen found that problem choice was largely motivated by personal interest, perceived ability to solve the problem, and the values of the community. It was not enough for mathematicians to be interested in the problem; they had to be assured that the mathematical community would be interested. Mathematicians expressed “the need to have an audience—the right audience—for their work” (Misfeldt and Johansen 2015, 368). This connection between an audience and an arguer is an area which argumentation theory is primed to explore. Elsewhere, one of us has looked at problem choice through an argumentative lens by applying Chaïm Perelman and Lucie Olbrechts-Tyteca’s notion of the contact of minds to problem choice in mathematics (Ashton 2018). The contact of minds is a set of conditions which must be met before argumentation can occur. 
Contact of minds requires four things: the arguer must attach importance to the audience, the speaker must not be beyond question, the audience must be willing to consider being convinced, and they must share a common language. Contact of minds is a prerequisite of any argumentation and mathematical arguments are no different. Ashton argues that it is an audience-based factor that influences problem choice alongside traditional considerations of beauty, intrinsic worth, and practical benefits. But the contact of minds was not originally meant to be applied to mathematics: Perelman and Olbrechts-Tyteca specifically oppose their study of argumentation to the mathematical sciences. Argumentation, they claimed, was distinct from mathematics since argumentation was social and its conclusions were probable (Perelman and Olbrechts-Tyteca 1969). But barring mathematical practice from the domain of argumentation is inappropriately limiting (Dufour 2013; Ashton 2018, 2021). Misfeldt and Johansen’s interviews indicate that mathematicians consider the
interest of other mathematicians while choosing problems. Given that problem choice is broader than merely the structure of proof, a choice about what to research is a choice of what to argue about. In this way, the process of choosing a problem to research is distinctly social and related to the mathematical community (Ashton 2018; discussed further in Sect. 5).
4.3
Reasoning About Refutations
The influential informal logician Ralph Johnson (2000) follows the rhetoricians Perelman and Olbrechts-Tyteca (1969) in denying that proofs can be arguments because there are features of proof, such as necessity, that are incompatible with the social dimension of arguments. For Perelman and Olbrechts-Tyteca, mathematics deals in demonstrations which are mechanically checkable and result in certainty, regardless of the audience. Mathematical reasoning, for them, does not involve uncertainty or controversy. Likewise, Johnson viewed proofs as lacking a dialectical tier. For Johnson, arguments have two components. The first component of argument is the illative core which is a discursive structure where reasons support the conclusion. But this logical aspect alone is not a full argument. A dialectical tier must supplement the illative core. Dialectical tiers are where arguers discharge their dialectical obligations by responding to objections, criticisms, or implications of their view. According to Johnson, proofs do not have dialectical tiers since they are conclusive and anyone trained in the field must recognize that they are conclusive. But refutations are a natural part of mathematical reasoning. Reasoning about purported refutations may best be understood in terms of argumentation. In an idealized view of proof, one in which proof cannot be an argument, the reader of a proof is an expert who “needs nobody to grasp a proof, otherwise she is not an expert” (Dufour 2013, 69). Such a view, according to Dufour, ignores the important role of checking a proof for correctness. There may still be legitimate room for refutations at this stage. As Fallis (2003) notes, proofs have many intentional gaps. The gaps are purportedly something a reader could fill in, with enough time and background knowledge. Verifying that a proof is correct requires checking these gaps. If the gaps cannot be traversed by either the reader or author, the proof itself may be refuted. 
Sometimes large gaps may result in the apparent refutation of otherwise good proofs. According to Dufour, Galois experienced such problems of communication (Dufour 2013, 71). Galois’s exaggerated brevity led his audience to believe that the mathematics itself was insufficiently developed. Proofs may be refuted because of unintelligible or untraversable gaps. These gaps, and their relationships to audience understanding, are evidence that reasoning about the correctness of proofs, and their refutations, involves argumentation. Contra Johnson, Ian Dove has argued that proofs do have dialectical tiers (Dove 2007). In particular, Dove argues that the method of monster barring seen in Lakatos’s Proofs and Refutations is part of a dialectical tier (Lakatos 1976). Lakatos reconstructs a series of purported proofs of Euler’s formula. Cauchy’s proof of Euler’s formula for polyhedra is considered and then counterexamples are raised
which are not convex and not simply connected. Dove argues that there is a dialectical tier for Cauchy’s proof of Euler’s formula since (a) objections are raised to the proof and (b) the objections receive responses in the proof. The method of response is to bar exceptions to the formula so that Euler’s formula holds for simply connected, convex polyhedra. In other words, monster barring to improve a conjecture is one way a proof can have a dialectical tier. Both Dove and Dufour found that reasoning about refutations is an argumentative practice. Purported proofs are not always uncontroversial and incorporating refutations involves filling out a dialectical tier or considering a relevant audience.
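The counterexamples at issue can be checked by direct counting. The sketch below assumes the usual statement of Euler's formula, V − E + F = 2, and computes V − E + F from a list of faces for three solids: a cube, a pair of nested cubes (a cube with a cubical cavity), and a "picture frame" with a hole through it, both of which figure among Lakatos's counterexamples. The face encodings and the names `CUBE`, `NESTED_CUBES`, and `PICTURE_FRAME` are illustrative choices, not taken from the source.

```python
from itertools import product


def euler_characteristic(faces):
    """Return V - E + F for a polyhedral surface given as a list of faces.

    Each face is a tuple of vertex labels listed in order around the face.
    """
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        # Consecutive vertices (wrapping around) form the edges of the face.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add(frozenset((a, b)))
    return len(vertices) - len(edges) + len(faces)


# A cube: vertices 0-3 on the bottom, 4-7 directly above them.
CUBE = [
    (0, 1, 2, 3), (4, 5, 6, 7),
    (0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),
]

# Nested cubes: the outer surface plus a disjoint inner surface (labels shifted by 8).
NESTED_CUBES = CUBE + [tuple(v + 8 for v in face) for face in CUBE]

# Picture frame: a 4 x 4 grid of quadrilaterals wrapped around in both directions,
# i indexing the corner of the frame and j the position around its cross-section.
PICTURE_FRAME = [
    ((i, j), ((i + 1) % 4, j), ((i + 1) % 4, (j + 1) % 4), (i, (j + 1) % 4))
    for i, j in product(range(4), repeat=2)
]

print(euler_characteristic(CUBE))           # 2
print(euler_characteristic(NESTED_CUBES))   # 4
print(euler_characteristic(PICTURE_FRAME))  # 0
```

Monster barring, on this picture, amounts to restricting the domain of the conjecture so that solids like the last two no longer count as polyhedra for its purposes.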
4.4
Presenting Reasoning
After a problem has been chosen and its solution has been verified, mathematical reasoning must be disseminated to a larger public. There is much work to be done to discover how mathematicians incorporate audience consideration while presenting their solutions. But Line Edslev Andersen’s interviews with working mathematicians indicate that audience consideration does feature into how papers are revised. Andersen’s interviews provide insight into how mathematicians write for mathematicians (Andersen et al. 2021), how peer reviewers receive and evaluate papers (Andersen 2017), and how mathematicians choose which gaps to include in their papers (Andersen 2020). Additionally, mathematical results must be translated to students in a pedagogical setting. Results also often need to be communicated to scientists or mathematicians in other fields. In each of these cases, it is common to reword or recast a proof to aid in communication. All “strategic rewordings belong to the field of mathematical argumentation” (Dufour 2013, 73). Some of these rewordings may even be usefully cast within Johnson’s concept of a dialectical tier. This is a rich area for further research, but, as Dufour points out, one must first admit mathematical proofs into the realm of argumentation. Section 5 examines how communities play a role in mathematical practice broadly, not just in presenting reasoning.
5
Mathematical Communities as Argumentative
Throughout this chapter, we have looked at the different aspects of mathematical practice that involve argumentation. Mathematicians present and produce arguments. Their choice of problems and reasoning while solving problems also involve argumentation. But arguments are presented to, or developed with, an audience in mind. We now turn to mathematical audiences and communities. We argue that mathematical audiences influence which investigations are undertaken and how they are undertaken. Given that the interest of this section is mathematical audiences, the receivers of mathematical arguments, one might begin by asking what kind of interaction mathematicians have with their audiences. Mathematics is often portrayed as an
isolated activity. Mathematicians lock themselves in an attic and spend days proving complicated conjectures alone. Indeed, stories of famous mathematicians seem to support such an idea. Andrew Wiles did work on his proof of Fermat’s last theorem in an attic. And mathematical advances can be so particular, as in the case of Mochizuki’s purported proof of the abc conjecture, that only a handful of people can verify or understand them. But this isolated view is not an accurate portrayal of mathematical practice. Without an appropriate community, or engagement with that community, mathematical investigations can falter. Besides the obvious importance of mentors or coauthors, there needs to be a certain level of engagement with a larger community of mathematicians. Take, for example, William Thurston’s discussion in On Proof and Progress in Mathematics (Thurston 1994). According to Thurston, mathematicians tend to follow fads. Fad-following is in line with his claim that there is a vital social component to mathematical progress. He drew this conclusion from his own experiences. When Thurston first entered the field of foliations, he rapidly solved a number of open problems. But the field seemed to empty out within a few years of his entrance and Thurston lost interest soon after. In response to this experience, he tried to develop and present infrastructure instead of proving theorems. He writes that: I have put a lot of effort into non-credit-producing activities that I value just as I value proving theorems: mathematical politics, revision of my notes into a book with a high standard of communication, exploration of computing in mathematics, mathematical education, development of new forms for communication of mathematics . . . I think that what I have done has not maximized my “credits” . . . I do think that my actions have done well in stimulating mathematics. (Thurston 1994, 177)
We can see that, for Thurston, the health of the mathematical community is invaluable to retaining mathematicians who are interested in that area. This is part of their tendency to “follow fads” to newly exciting communities. Audience and community involvement seem important, but what exactly does argumentation theory have to offer in this area? As one of us has argued, Thurston’s story exemplifies a broader issue, namely that problem choice in mathematics rests on the assurance that there exists a “contact of minds” between the audience and the mathematician (Ashton 2018). As we can see, the interest and activity of a mathematical community is vital to successful mathematical practice. In addition to the role that communities play in problem choice, they play an important role in the ongoing dialectic involved in problem-solving. The method of proofs and refutations described by Lakatos relies heavily on community involvement in problem-solving. A first “proof” is suggested to show that, for polyhedra, $V - E + F = 2$, where $V$ is the number of vertices, $E$ the number of edges, and $F$ the number of faces of the polyhedron. The proof faces a number of global and local counterexamples, that is, counterexamples to the conjecture and to specific proof steps, respectively. Accommodating these counterexamples requires a reconceptualization of the definition of polyhedron. In Section 4, we saw that reasoning about those refutations involved filling out a dialectical tier. It is also
important that community involvement plays an essential role in the reasoning. The history of the proof is recounted using the story of a class attempting to solve the problem. But it is clear that the refinements of definition, the counterexamples, and the resulting proofs are all products of dialectical engagement between the initial prover and an audience. This community involvement plays out on a historical scale, as in (Lakatos 1976), but also in other collaborative problem-solving. For example, communities play a key role in crowdsourced mathematics such as Mathoverflow and (Mini-)Polymath. These activities have been studied by Ursula Martin and Alison Pease who connect the activities undertaken to the method of proofs and refutations (Martin and Pease 2013a, b). Building on these datasets, in later work with Joseph Corneli and other colleagues, they model mathematical arguments by analyzing how the discourse unfolds (Corneli et al. 2019). They use what they call Inference Anchoring Theory + Content to further understand how these dialogues introduce and track salient features of mathematical progress. This is done by examining the dialogue within a community of mathematicians. So far we have considered the role that an external audience plays in problem choice. We have also looked at how we could identify communities in terms of the argument schemes that they allow or even disavow. But the final role of mathematical audiences is best described as an internal one. The core idea is that mathematical audiences play an important role in the development of proofs and other mathematical arguments. Valentin Bazhanov, for example, has argued that proofs are an appeal to a scientific community (Bazhanov 2012). A new result is offered to the community as a purported proof. To become a proof, the community must be persuaded that the argument is reliable and reproducible. This decision is borne out in dialogue. 
Catarina Dutilh Novaes argues for a dialogical conception of proof (Dutilh Novaes 2021). According to the dialogical conception, the concept of proof is an embodiment of a semiadversarial dialogue between two people: Prover and Skeptic. Skeptic grants certain premises to Prover. Prover then puts forward statements claimed to follow from the premises. At each move, or inference, Skeptic has three potential moves. He brings up objections, counterexamples, and requests for clarification. If all of the steps are indefeasible, that is, without counterexamples, the proof is a winning strategy for Prover. Of course proofs are not usually dialogues between two interlocutors following Prover–Skeptic rules. Dutilh Novaes argues that the Skeptic is actually internalized into the method itself. In this sense, Skeptic, who is a particular kind of audience, plays a vital role in what constitutes proof. In addition to the internalized skeptic, one of us has argued that the audience under consideration in a proof is actually a universal audience (Ashton 2021). This is a concept drawn from Perelman and Olbrechts-Tyteca: An argument to the universal audience is one that is meant to convince all people. The universal audience itself is an imagined audience – an arguer can never stand before all people and ask whether or not they assent to his argument. Rather, the universal audience is an abstraction from experiences with real audiences that the arguer has encountered. The account applies also to mathematics: mathematicians encounter real audiences in their education and practice. In addition, they learn what certain groups of real audiences
react to – knot theorists accept inferential moves involving diagrams that certain algebraists might not. By considering all these different, real audiences, mathematicians construct an internalized audience which reflects the standards of reasonableness found within each audience. In this way, mathematicians construct their own universal audience. According to both Dutilh Novaes and Ashton, there is an internalized audience which, through experience with real audiences, is vital to proof development. In one case, a proof is an argument to an internalized skeptic. In the other, it is an argument to an internalized standard of reasonableness. Whether the mathematician aims to convince the adversarial skeptic or “all reasonable people,” there is clearly a role for the audience in standards of proof. In other words, proofs should be examined in terms of the audiences who could have influenced them. Communities, conceived of as audiences to mathematical arguments, are a vital component of the practice on a number of levels. The assurance of an interested audience and the contact of minds can influence research programs and problem choice. The ongoing dialogue between provers and the mathematical community helps to verify results, introduce new methods, and clear hidden assumptions. In addition to these explicit, external roles, the internalized audience plays an important role in proof development and the associated normativity. Given all this, it’s clear that no account of argumentation in the philosophy of mathematics could be complete without a thorough discussion of the role of these audiences.
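The Prover–Skeptic dialogue described above can be illustrated with a deliberately small model. The sketch assumes a propositional setting, represents statements as hypothetical Python predicates over truth-value assignments, and gives Skeptic only one of the three moves (raising counterexamples); objections and requests for clarification are left out. It is a toy rendering of the idea of a winning strategy, not Dutilh Novaes's own formalization.

```python
from itertools import product


def find_counterexample(accepted, claim, atoms):
    """Skeptic's move: look for an assignment making all accepted statements
    true and the new claim false. Returns the assignment, or None."""
    for values in product([False, True], repeat=len(atoms)):
        valuation = dict(zip(atoms, values))
        if all(s(valuation) for s in accepted) and not claim(valuation):
            return valuation
    return None


def prover_wins(granted, steps, atoms):
    """Prover wins if no step admits a counterexample relative to what has
    been granted or established so far."""
    accepted = list(granted)
    for claim in steps:
        if find_counterexample(accepted, claim, atoms) is not None:
            return False  # Skeptic defeats this step.
        accepted.append(claim)  # The step is indefeasible and may be reused.
    return True


atoms = ["p", "q", "r"]
# Premises granted by Skeptic: p, p -> q, q -> r.
granted = [
    lambda v: v["p"],
    lambda v: (not v["p"]) or v["q"],
    lambda v: (not v["q"]) or v["r"],
]

# Prover asserts q, then r: each step follows from what precedes it.
print(prover_wins(granted, [lambda v: v["q"], lambda v: v["r"]], atoms))  # True

# Without the premise q -> r, asserting r is open to a counterexample.
print(prover_wins(granted[:2], [lambda v: v["r"]], atoms))  # False
```

On the internalized reading, the search performed by `find_counterexample` is something the prover carries out against their own steps before any actual audience is involved.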
References
Aberdein A (2005) The uses of argument in mathematics. Argumentation 19(3):287–301
Aberdein A (2006) Managing informal mathematical knowledge: techniques from informal logic. In: Borwein JM, Farmer WM (eds) MKM 2006, vol 4108. Springer, Berlin, LNAI, pp 208–221
Aberdein A (2007a) Fallacies in mathematics. Proceedings of the British Society for Research into Learning Mathematics 27(3):1–6
Aberdein A (2007b) The informal logic of mathematical proof. In: Van Kerkhove B, Van Bendegem JP (eds) Perspectives on mathematical practices: bringing together philosophy of mathematics, sociology of mathematics, and mathematics education. Springer, Dordrecht, pp 135–151
Aberdein A (2009) Mathematics and argumentation. Found Sci 14(1–2):1–8
Aberdein A (2010) Observations on sick mathematics. In: Van Kerkhove B, Van Bendegem JP, De Vuyst J (eds) Philosophical perspectives on mathematical practice. College Publications, London, pp 269–300
Aberdein A (2013a) Mathematical wit and mathematical cognition. Top Cogn Sci 5(2):231–250
Aberdein A (2013b) The parallel structure of mathematical reasoning. In: Aberdein A, Dove IJ (eds) The argument of mathematics. Springer, Dordrecht, pp 361–380
Aberdein A (2019) Evidence, proofs, and derivations. ZDM 51(5):825–834
Aberdein A (2021) Dialogue types, argumentation schemes, and mathematical practice: Douglas Walton and mathematics. J Appl Logics 8(1):159–182
Aberdein A (2023) Deep disagreement in mathematics. Global Philosophy 33(1):17
Allo P, Van Bendegem JP, Van Kerkhove B (2021) Polymath as an epistemic community: analyzing the digital traces of the polymath project. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer, Cham
Andersen LE (2017) On the nature and role of peer review in mathematics. Account Res 24(3):177–192
Andersen LE (2020) Acceptable gaps in mathematical proofs. Synthese 197(1):233–247
Andersen LE, Johansen MW, Sørensen HK (2021) Mathematicians writing for mathematicians. Synthese 198(Suppl 26):6233–6250
Aristotle (1991) The art of rhetoric. Penguin, London, translated by H. Lawson-Tancred
Ashton Z (2018) Mathematical problem choice and the contact of minds. In: Zack M, Schlimm D (eds) Research in history and philosophy of mathematics: the CSHPM 2017 annual meeting in Toronto, Ontario. Birkhäuser, Cham, pp 191–203
Ashton Z (2021) Audience role in mathematical proof development. Synthese 198(Suppl 26):6251–6275
Bartha P (2013) Analogical arguments in mathematics. In: Aberdein A, Dove IJ (eds) The argument of mathematics. Springer, Dordrecht, pp 199–237
Bazhanov VA (2012) Mathematical proof as a form of appeal to a scientific community. Russ Stud Philos 50(4):56–72
Berry S (2020) Taking the analogy between set theory and geometry seriously, online at https://seberry.org/Hamkins.pdf
Cantù P (2013) An argumentative approach to ideal elements in mathematics. In: Aberdein A, Dove IJ (eds) The argument of mathematics. Springer, Dordrecht, pp 79–99
Coleman E (2009) The surveyability of long proofs. Found Sci 14(1–2):27–43
Corneli J, Martin U, Murray-Rust D, Nesin GR, Pease A (2019) Argumentation theory for mathematical argument. Argumentation 33(2):173–214
Davis PJ, Hersh R (1986) Mathematics and rhetoric. In: Davis PJ, Hersh R (eds) Descartes’ dream: the world according to mathematics. Penguin, London, pp 57–73
Dove IJ (2007) On mathematical proofs and arguments: Johnson and Lakatos. In: Van Eemeren FH, Garssen B (eds) Proceedings of the Sixth Conference of the International Society for the Study of Argumentation, vol 1. Sic Sat, Amsterdam, pp 346–351
Dove IJ (2009) Towards a theory of mathematical argument. Found Sci 14(1–2):137–152
Dufour M (2013) Arguing around mathematical proofs. In: Aberdein A, Dove IJ (eds) The argument of mathematics. Springer, Dordrecht, pp 61–76
Dutilh Novaes C (2021) The dialogical roots of deduction: historical, cognitive, and philosophical perspectives on reasoning. Cambridge University Press, Cambridge
Dyson F (2009) Birds and frogs. Not Am Math Soc 56(2):212–223
van Eemeren FH, Garssen B, Krabbe ECW, Henkemans AFS, Verheij B, Wagemans JHM (2014) Handbook of argumentation theory. Springer, Dordrecht
Fallis D (2003) Intentional gaps in mathematical proofs. Synthese 134(1–2):45–69
Gilbert MA (2014) Arguing with people. Broadview Press, Peterborough, ON
Goddu GC (2011) Is ‘argument’ subject to the product/process ambiguity? Informal Logic 31(2):75–88
Goguen J (2001) What is a proof? Online at http://cseweb.ucsd.edu/~goguen/papers/proof.html
Gonthier G (2008) Formal proof—the four color theorem. Not Am Math Soc 55(11):1382–1393
Hadamard J (1945) An essay on the psychology of invention in the mathematical field. Princeton University Press, Princeton, NJ
Hales TC, Adams M, Bauer G, Dang DT, Harrison J, Hoang TL, Kaliszyk C, Magron V, McLaughlin S, Nguyen TT, Nguyen TQ, Nipkow T, Obua S, Pleso J, Rute JM, Solovyev A, Ta AHT, Tran TN, Trieu DT, Urban J, Vu KK, Zumkeller R (2017) A formal proof of the Kepler conjecture. Forum Math Pi 5(e2):1–29
Hamami Y (2022) Mathematical rigor and proof. Rev Symb Log 15(2):409–449
Hamkins JD (2020) Lectures on the philosophy of mathematics. The MIT Press, Cambridge, MA
Hardy GH (1929) Prolegomena to a chapter on inequalities. J Lond Math Soc 4(1):61–78
Inglis M, Mejía-Ramos JP, Simpson A (2007) Modelling mathematical argumentation: the importance of qualification. Educ Stud Math 66(1):3–21
Johnson RH (2000) Manifest rationality: a pragmatic theory of argument. Lawrence Erlbaum Associates, Mahwah
Knipping C, Reid DA (2019) Argumentation analysis for early career researchers. In: Kaiser G, Presmeg N (eds) Compendium for early career researchers in mathematics education. Springer, Cham, pp 3–31
Krabbe ECW (2008) Strategic maneuvering in mathematical proofs. Argumentation 22(3):453–468
Krabbe ECW, van Laar JA (2007) About old and new dialectic: dialogues, fallacies, and strategies. Informal Logic 27(1):27–58
Krummheuer G (1995) The ethnography of argumentation. In: Cobb P, Bauersfeld H (eds) The emergence of mathematical meaning: interaction in classroom cultures. Lawrence Erlbaum Associates, Hillsdale, pp 229–269
Krummheuer G (2015) Methods for reconstructing processes of argumentation and participation in primary mathematics classroom interaction. In: Bikner-Ahsbahs A, Knipping C, Presmeg N (eds) Approaches to qualitative research in mathematics education: examples of methodology and methods. Springer, Dordrecht, pp 51–74
Lakatos I (1976) Proofs and refutations: the logic of mathematical discovery. Cambridge University Press, Cambridge
Lewiński M (2014) Argumentative polylogues: beyond dialectical understanding of fallacies. Stud Log Gramm Rhetoric 36(1):193–218
Mac Lane S (1935) A logical analysis of mathematical structure. Monist 45(1):118–130
Martin U, Pease A (2013a) Mathematical practice, crowdsourcing, and social machines. In: Carette J, Aspinall D, Lange C, Sojka P, Windsteiger W (eds) Intelligent computer mathematics, LNAI, vol 7961. Springer, Berlin, pp 98–119
Martin U, Pease A (2013b) What does mathoverflow tell us about the production of mathematics?, presented at SoHuman, 2nd International Workshop on Social Media for Crowdsourcing and Human Computation. ACM Web Science, Paris. Online at http://arxiv.org/abs/1305.0904
Mazur B (1997) Conjecture. Synthese 111(2):197–210
McLarty C (2007) The rising sea: Grothendieck on simplicity and generality. In: Gray JJ, Parshall KH (eds) Episodes in the history of modern algebra (1800–1950). American Mathematical Society, Providence, RI, pp 301–325
Metaxas N (2015) Mathematical argumentation of students participating in a mathematics–information technology project. Int Res Educ 3(1):82–92
Metaxas N, Potari D, Zachariades T (2016) Analysis of a teacher’s pedagogical arguments using Toulmin’s model and argumentation schemes. Educ Stud Math 93(3):383–397
Misfeldt M, Johansen MW (2015) Research mathematicians’ practices in selecting mathematical problems. Educ Stud Math 89(3):357–373
O’Keefe D (1977) Two concepts of argument. J Am Forensic Assoc 13(3):121–128
Pease A, Aberdein A (2011) Five theories of reasoning: interconnections and applications to mathematics. Log Log Philos 20(1–2):7–57
Pedemonte B, Balacheff N (2016) Establishing links between conceptions, argumentation and proof through the ck¢-enriched Toulmin model. J Math Behav 41:104–122
Perelman C, Olbrechts-Tyteca L (1969) The new rhetoric: a treatise on argumentation. University of Notre Dame Press, Notre Dame
Poincaré H (1913) The foundations of science. The Science Press, New York, translated by G. B. Halstead
Priestley WM (2013) Wandering about: analogy, ambiguity and humanistic mathematics. J Humanist Math 3(1):115–135
Reyes GM (2004) The rhetoric in mathematics: Newton, Leibniz, the calculus, and the rhetorical force of the infinitesimal. Q J Speech 90(2):163–188
Reyes GM (2021) Rhetorical approaches to the study of mathematical practice. In: Sriraman B (ed) Handbook of the history and philosophy of mathematical practice. Springer
Schiappa E (2021) In what ways shall we describe mathematics as rhetorical? In: Wynn J, Reyes GM (eds) Arguing with numbers: the intersections of rhetoric and mathematics. Penn State Press, Philadelphia, PA, pp 33–52
Schlimm D (2008) Two ways of analogy: extending the study of analogies to mathematical domains. Philos Sci 75(2):178–200
Stefaneas P, Vandoulakis IM (2014) Proofs as spatio-temporal processes. Philos Sci 18(3):111–125
Stefaneas P, Vandoulakis IM (2015) On mathematical proving. J Artif Gen Intell 6(1):130–149
Steingart A (2012) A group theory of group theory: collaborative mathematics and the ‘uninvention’ of a 1000-page proof. Soc Stud Sci 42(2):185–213
Sundholm G (2012) “Inference versus consequence” revisited: inference, consequence, conditional, implication. Synthese 187(3):943–956
Thurston WP (1994) On proof and progress in mathematics. Bull Am Math Soc 30(2):161–177
Toscano F (2020) The secret formula: how a mathematical duel inflamed renaissance Italy and uncovered the cubic equation. Princeton University Press, Princeton, NJ
Toulmin S (1958) The uses of argument. Cambridge University Press, Cambridge
Toulmin S, Rieke R, Janik A (1979) An introduction to reasoning. Macmillan, London
Turing AM (1939) Systems of logic based on ordinals. Proc Lond Math Soc 45(1):161–228
Tymoczko T (1979) The four-color problem and its philosophical significance. J Philos 76(2):57–83
Walton DN (1998) The new dialectic: conversational contexts of argument. University of Toronto Press, Toronto
Walton DN, Krabbe ECW (1995) Commitment in dialogue: basic concepts of interpersonal reasoning. State University of New York Press, Albany, NY
Walton DN, Macagno F (2015) A classification system for argumentation schemes. Argum Comput 6(3):219–245
Walton DN, Reed C, Macagno F (2008) Argumentation Schemes. Cambridge University Press, Cambridge
Zeilberger D (1993) Theorems for a price: tomorrow’s semi-rigorous mathematical culture. Not Am Math Soc 40(8):978–981
Bayesian Perspectives on Mathematical Practice
James Franklin
Contents
1 Introduction
2 The Relation of Conjectures to Proof
3 Applied Mathematics and Statistics: Understanding the Behavior of Complex Models
4 The Objective Bayesian Perspective on Evidence
5 Evidence for and Against the Riemann Hypothesis
6 Probabilistic Relations Between Necessary Truths
7 The Problem of Induction in Mathematics
8 Conclusion
9 Cross-References
References
J. Franklin (*) School of Mathematics and Statistics, University of New South Wales, Sydney, NSW, Australia
© Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_84-2
Abstract
Mathematicians often speak of conjectures as being confirmed by evidence that falls short of proof. For their own conjectures, evidence justifies further work in looking for a proof. Those conjectures of mathematics that have long resisted proof, such as the Riemann hypothesis, have had to be considered in terms of the evidence for and against them. In recent decades, massive increases in computer power have permitted the gathering of huge amounts of numerical evidence, both for conjectures in pure mathematics and for the behavior of complex applied mathematical models and statistical algorithms. Mathematics has therefore become (among other things) an experimental science (though that has not diminished the importance of proof in the traditional style). We examine how the evaluation of evidence for conjectures works in mathematical practice. We explain the (objective) Bayesian view of probability, which gives a theoretical framework for unifying evidence evaluation in science and law as well as in mathematics. Numerical evidence in mathematics is related to the problem of induction; the occurrence of straightforward inductive reasoning in the purely logical material of pure mathematics casts light on the nature of induction as well as of mathematical reasoning.
Keywords
Mathematical conjectures · Bayesian · Logical probability · Problem of induction · Riemann hypothesis · Numerical evidence
1 Introduction
Mathematical practice cannot consist solely of someone’s writing down mathematical proofs and someone else’s publishing and reading them. That may be the final output (in formal pure mathematics), but there are many preliminary activities that must take place on the way. Much of this work involves evaluating inconclusive evidence as to whether a hoped-for theorem is true and whether one’s attempts at proving it are likely to succeed. It is similar in the art of navigation (prior to Google Maps), and for the same reasons: navigation cannot consist just of paths from A to B. Back of house (in the metaphor of Reuben Hersh 1991), there must be work to plan how to find them, to choose among alternatives, and to estimate whether a developing path is on the right track. In mathematical research too, one must navigate a path to the final result, a polished proof, and there are probabilistic skills needed to evaluate progress.
Especially in the early stages of work on a problem, mathematicians need not merely hunches and beliefs, but ones that are as a matter of fact well supported by the evidence. The working mathematician must begin by asking questions like: Which of the many conjectures that could be generated are worth investigating? Which ones are relevant to the problem at hand? Which can be confirmed or refuted in some easy cases, so that there will be some indication of their truth in a reasonable time? Which might be capable of proof by a method in the mathematician’s repertoire? Which might follow from someone else’s theorem? Which are unlikely to yield an answer until after the next review of tenure? The mathematician must answer these questions to allocate time and effort. But not all answers to these questions are equally good. To stay employed, a mathematician must answer a proportion of these questions well (Franklin 1987).
The area where a mathematician must make the finest discriminations of this kind – and where he might, in theory, be guilty of professional negligence if he makes poor decisions – is as a supervisor advising a prospective Ph.D. student. It is usual for a student beginning a Ph.D. to choose some general field of mathematics and then to approach an expert in the field as a supervisor. The supervisor (after estimating that the student is capable of a Ph.D.) then selects a problem in that field for the student to investigate. In mathematics, more than in any other discipline, the initial choice of problem is the crucial event in the Ph.D.-gathering process. The problem
must be unsolved at present, not being worked on by someone who is likely to solve it soon, but most importantly, tractable, that is, probably solvable, or at least partially solvable, by three years’ work at the Ph.D. level. It is recognized that of the enormous number of unsolved problems that have been or could be thought of, the tractable ones form a small proportion, and that it is difficult to discern which they are. The skill in rational evaluation of conjectures required of a supervisor is high. The next stage in successful mathematical practice is applying for a grant. While it is not unknown to ask for money for already completed work, a grant application normally talks up the importance of an unsolved problem or general area, its “impact,” and the applicant’s prospects of making progress on it. Expert referees for grants will usually have rather fixed ideas on what problems in their fields are important, and as fellow pure mathematicians they will ignore the applicant’s claims that the research will eventually lead to a cure for cancer and world peace. Thus much of the applicant’s effort must go into arguing persuasively that the problem is at least largely solvable with the resources the grant will provide and that the applicant and team are the perfect people to do it. For established researchers, the correct strategy is to exhibit their track records and speak confidently of their expected timeline of results (several examples in Zeilberger 2012). It is more difficult in the case of early-career researchers, both for the applicant and the evaluators, as there is less track record to go on so the evaluators need to consider whether the applicant’s approach is realistic in the light of both the problem and the applicant’s background. They may be tempted to fall back on proxy indicators of “promise” such as “having been supervised by a Fields medallist.” None of that argument involves presenting proofs. 
It is probabilistic in nature but bears on the provability of theorems – and their provability with given resources, not their provability in principle.
With the grant won, or not, the next task is to work on the actual prospective theorems, developing insights and talking to fellow experts on the problem, and constantly evaluating the evidence that progress towards a solution is being made. We look at examples of evidence for several conjectures below. It is not adequate to describe the relation of evidence to hypothesis as “subjective,” “heuristic,” or “pragmatic,” although those descriptions may be true as far as they go. For the practice of evaluating conjectures to be successful there must be an element of what it is rational to believe on the evidence. The Bayesian perspective on evidence, to be explained below, gives a philosophical account of that.
2 The Relation of Conjectures to Proof
Although it is not the only possibility, in the practice of pure mathematical research consideration of the probability of conjectures is usually allied to the traditional view that the main aim of pure mathematics is to construct semiformal but rigorous proofs of theorems.
As one researcher describes the process:
“The computer is often used as an exploratory tool in problem solving. This exploratory analysis is very important in gaining insight into problems and theorems and often leads to new conjectures. Heuristic reasoning may then lead from the conjecture to a formal theorem whose validity lends itself to further numerical investigation. When these numerical investigations present overwhelming numerical evidence that the proposed theorem is true, the researcher sets upon a search for a mathematical proof.” (Schuster 1985; a more extensive account in Wilf 2008, many examples in Borwein and Bailey 2004.)
That is how mathematicians speak in cases such as the Riemann hypothesis, considered below, where evidence has been collected and published over a long timespan because a famous conjecture has resisted proof. Although purely computational evidence is not generally publishable by itself, it is possible to prove some restricted cases of a conjecture and exhibit computational evidence that it holds more generally (e.g., Boij 1999).
However, there are other ways in which probabilistic reasoning can work in mathematical practice, beyond simply providing evidence towards a traditional formal or semiformal proof. One is that there may be probabilistic proofs, that is, proofs or “proofs” that in their nature provide only a high probability that the result is true. A well-known case is probabilistic testing of primality, widely used in cryptography, where an algorithm delivers only a high probability that a number is a prime (Rabin 1980). In the light of such examples, there has been some discussion of whether mathematicians should abandon their model of formal deductive proof and rest content with some probabilistic substitute (e.g., Fallis 1997; discussion in Easwaran 2009; related arguments in Paseau 2015; Sørensen 2016). A popular article on the topic titled “The death of proof” by the journalist John Horgan (1993) attracted some heated responses (e.g., Krantz 2011, but some support in Zeilberger 1993), and by and large mathematical practice has remained committed to the deductive model of proof as the ideal. That ideal is likely to gain support from current projects on fully automated theorem provers (a progress report in Ornes 2020). The results may be hard for humans to survey, but they are fully deductive proofs which do not contain human errors and sloppiness.
We should also not forget the role of arguments from authority in mathematics (Inglis and Mejía-Ramos 2009).
While mathematics in principle offers the possibility of understanding any result by following its proof, it is impossible in practice to do that for all the results one must know. Mathematicians, like other people, rationally attribute high probability to the theorems signed off by respected experts. Refereeing depends on it. So does learning mathematics. (Consequences for pedagogy considered in (Aberdein 2019), section 5.)
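The probabilistic primality testing mentioned above can be illustrated with a minimal sketch of the Miller-Rabin test in Python. This is an illustration under stated assumptions, not the procedure of any particular cryptographic library: the function name `is_probable_prime`, the default of 20 rounds, and the short list of trial divisors are choices made here. For a composite input, each random base exposes compositeness with probability at least 3/4, so a verdict of "prime" after k rounds is wrong with probability at most 4^(-k); a verdict of "composite" is certain.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Miller-Rabin test (illustrative sketch).

    Returns False only when n is certainly composite.
    Returns True when n is prime, or (with probability at most
    4**-rounds) when n is a composite that fooled every round.
    """
    rng = rng or random.Random()
    if n < 2:
        return False
    # Trial division by a few small primes settles small cases quickly.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)      # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                     # this base reveals nothing
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a witnesses that n is composite
    return True


if __name__ == "__main__":
    for candidate in (2**61 - 1, 10**9 + 7, 294409):
        print(candidate, is_probable_prime(candidate))
```

The asymmetry is the point of the example: the output "composite" is a deductive conclusion, while the output "prime" is a conclusion that is only highly probable on the evidence of the rounds performed.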
3 Applied Mathematics and Statistics: Understanding the Behavior of Complex Models
Even more than in pure mathematics, numerical evidence has come to the fore in the large areas of applied mathematics and computational statistics that have been opened up by the vast increases in computational power in recent decades. In such fields, one is typically dealing with a very complex mathematical object such as a simulation of climate, whose mathematical properties are well beyond the reach of analytic methods or proof. Understanding how it works – that is, its internal behavior rather than its match with the modeled outside reality – can only be approached by experimenting with its output and applying normal (probabilistic) scientific methods for drawing conclusions from experiments. A typical area for such work is computational fluid dynamics. One begins with a complex physical phenomenon which one hopes to understand, such as turbulence generated by nonbreaking surface waves in the ocean (Tsai et al. 2015). One replaces the physical phenomenon with some standard numerical simulation of it, which one can run on a powerful computer with various choices of parameters. The simulation is a strictly mathematical object whose relation to the physical phenomenon being modeled needs understanding; for one thing, typically the model is discrete and the physical phenomenon continuous (at least at the scale being considered). But experimental evidence is available that the model and phenomenon behave quite similarly, or there would be no point in the simulation. One can therefore hope to understand the modeled reality by experimenting with and analyzing the model. The behavior of the simulation (now considered in abstraction from its role in imitating the phenomenon) is a purely mathematical question. In principle it might be addressed by solving equations and writing proofs, but almost always the simulation is too complex for that to be possible. 
One can only perform (numerical) experiments, that is, run the simulation with different choices of parameters (such as total energy in the system and various initial conditions), and see what happens. If the phenomenon to be explained sometimes occurs and sometimes does not, progress has been made in understanding its causes. But the different simulation runs are only samples of the simulation’s mathematical behavior, so there is an essentially probabilistic inference from the runs to the overall behavior, just as there is for any conclusion from finite experimental evidence to a general law. Something similar occurs in “big data” applications where one sets up an algorithm intended to solve some statistical problem such as fraud detection or handwritten character recognition. Before testing the system “in the wild” on real data, one often seeks first to understand its performance in principle – does it have the inherent capacity to find certain structures in data? (The question is particularly urgent if there is a shortage of real data; example in (Hoffmann et al. 2019).) That is a strictly mathematical question and so in theory subject to proof, but with a complex system of the sort typically used, proof is not attainable. So one tests it on synthetic data, that is, computer-generated data into which one has inserted structures of the kind one hopes the system will be able to detect. If the system detects them,
something is known about its performance. A run with purely synthetic data is an experiment that gives rise to purely mathematical knowledge. Or one may want to compare different available mathematical methods for a type of task, such as spatial prediction where again there is a shortage of real data. Simulated data give a sufficient quantity of data to evaluate the capacities of the different methods (e.g., Harris et al. 2010). Again, the numerical experiments result in a purely mathematical conclusion.
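The synthetic-data experiment described in this section can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the "system" under test is a simple robust outlier detector (median and median absolute deviation), the planted structure is a set of points shifted by ten standard deviations, and the names `make_synthetic`, `detect_outliers`, and `run_experiment` are hypothetical. The point is only the shape of the inference: structure is inserted at known positions, the detector is run, and its success rate is evidence about its mathematical capacities.

```python
import random
import statistics


def make_synthetic(n=1000, n_planted=20, shift=10.0, seed=0):
    """Gaussian noise with a known structure planted at known positions."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    planted = set(rng.sample(range(n), n_planted))
    for i in planted:
        data[i] += shift                 # the inserted structure
    return data, planted


def detect_outliers(data, threshold=4.0):
    """Flag points far from the median, measured in robust scale units."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    scale = 1.4826 * mad                 # MAD rescaled to match a normal sd
    return {i for i, x in enumerate(data) if abs(x - med) / scale > threshold}


def run_experiment(seed=0):
    """One numerical experiment: how much planted structure is recovered?"""
    data, planted = make_synthetic(seed=seed)
    found = detect_outliers(data)
    recall = len(found & planted) / len(planted)
    false_alarms = len(found - planted)
    return recall, false_alarms


if __name__ == "__main__":
    for seed in range(3):
        print(seed, run_experiment(seed))
```

Each run is a single sample of the detector's behavior, so conclusions about its performance in general remain probabilistic inferences from finitely many runs.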
4 The Objective Bayesian Perspective on Evidence
There is no agreed account of what Bayesianism means, but here we present a version that is suitable as a theoretical framework for the evaluation of evidence for conjectures in pure mathematics (taken from Franklin 2011, 2016, but based on Keynes 1921 and Jaynes 2003). It is the version called (fully) objective Bayesianism or logical probability. It holds that the relation of evidence to conclusion is a matter of strict logic, like the relation of axioms to theorems in mathematics but less conclusive. Given a fixed body of evidence – say in a trial in court, or in a dispute about a scientific theory, or in evaluating numerical evidence for a mathematical conjecture – and given a conclusion, there is a fixed degree to which the evidence supports the conclusion. If we could agree just what the standard of “proof beyond reasonable doubt” is, then, in a given trial, it is an objective matter of logical fact whether the evidence presented does or does not meet that standard, and so a jury is either right or wrong in its verdict on the evidence.
It is not essential to the Bayesian perspective that the relation of evidence to conclusion should be given a precise number. It is unlikely that in typical legal, scientific or mathematical cases a numerical probability could be calculated, even if it existed in principle. It is sufficient for objective Bayesianism that it is sometimes intuitively evident that some hypotheses, on some bodies of evidence, are highly likely, almost certain, or virtually impossible. The most central theses of Bayesianism do not concern numbers but are certain qualitative principles of evidence.
The first is the simplest principle of logical probability, called by Pólya (1954, 4; further in Mazur 2014) “the fundamental inductive pattern” or “verification of a consequence.” It is as follows:
q is a (nontrivial) consequence of hypothesis p
q is found to be true
So, p is more likely to be true than before
(In short, “Theories are confirmed by their consequences or predictions.”) That is taken for granted as rational in scientific experiment or historical inquiry (except by strict Popperian falsificationists). The point of drawing predictions from theories and checking if they are true is normally taken to be that confirmation of predictions supports a theory. That is as applicable to mathematical conjectures as to scientific
theories: the simplest possible consequences of a general mathematical statement such as “All odd numbers have odd squares” are its instances such as “The square of 131 is odd,” which can be checked by calculation. Finding the consequences to be true gives some degree of support for a generalization and can justify further work.
It is true that in some clearly defined circumstances, it is natural to associate a number with the degree to which evidence supports a conclusion. Those circumstances are cases of the “proportional syllogism” or “statistical syllogism,” where the sole relevant evidence is a proportion in a set. If the sole evidence bearing on whether a certain patient will be cured of disease A by drug B is that 89% of patients with disease A are cured by drug B, then it is natural to assign a number to the probability, on that evidence, that the patient will be cured; namely 0.89. (Even with the proportional syllogism, however, arguments with imprecise numbers are often the most applicable, such as “The vast majority of flights arrive safely, therefore I can relax on takeoff.”)
When numbers are applicable, it is usual to model the relation of evidence e to conclusion h by a number P(h | e), between 0 and 1 inclusive, “the (logical) probability of h given e,” which satisfies the usual axioms of conditional probability:
\[ P(\text{not } h \mid e) = 1 - P(h \mid e) \]
\[ P(h_1 \text{ and } h_2 \mid e) = P(h_1 \mid e) \times P(h_2 \mid h_1 \text{ and } e) \]
The theorem of probability that is most useful for updating evidence is the celebrated Bayes’ theorem, which expresses how the probability of evidence given a hypothesis relates to the probability of the hypothesis given the evidence:
\[ P(h \mid e \,\&\, b) = P(e \mid h \,\&\, b) \times P(h \mid b) \,/\, P(e \mid b) \]
(where h is read as hypothesis, e as (new) evidence, and b as the background knowledge that is taken for granted in the context of the problem). Thus, new evidence supports a hypothesis which makes the evidence likely (but only if the evidence is surprising).
While Bayes’ theorem gives its name to Bayesianism, we emphasize that for the purpose of theory evaluation in general, the numerical theorem is much less important than qualitative consequences of it such as Pólya’s “fundamental inductive pattern.” Another intuitively appealing such consequence, also easily derivable, is that the more surprising a consequence of h (that is, the lower its “prior” probability P(e | b) on the background knowledge b), the more it increases the probability of h. (Thus Einstein’s surprising prediction of the gravitational bending of light gave strong support to the theory of general relativity when it was observed to be true.) If one takes those principles to be logical, they are as applicable to evidence for conjectures in pure mathematics as they are for theories in science or allegations in courts.
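Both qualitative consequences just mentioned can be checked numerically with a small Python sketch. The function name `posterior` and the particular numbers are assumptions chosen for illustration; the only step not stated in the text is the standard expansion of P(e | b) by the law of total probability, P(e | b) = P(e | h & b) P(h | b) + P(e | not-h & b) P(not-h | b). The example takes e to be a consequence of h, so that P(e | h & b) = 1.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(h | e & b) from P(h | b) and the two likelihoods.

    P(e | b) is expanded by the law of total probability.
    """
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    if p_e == 0.0:
        raise ValueError("evidence has probability zero on the background")
    return p_e_given_h * prior_h / p_e


if __name__ == "__main__":
    prior = 0.5
    # e is a consequence of h, so P(e | h & b) = 1 in both cases.
    unsurprising = posterior(prior, 1.0, 0.9)   # P(e | b) = 0.95
    surprising = posterior(prior, 1.0, 0.1)     # P(e | b) = 0.55
    print(round(unsurprising, 4))   # about 0.5263
    print(round(surprising, 4))     # about 0.9091
```

In both cases the posterior exceeds the prior of 0.5 (the fundamental inductive pattern), and the consequence with the lower prior probability raises it much further.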
The appearance in Bayes’ theorem of the “prior” probabilities P(h | b) and P(e | b) calls attention to the need to have some knowledge of them. For example, if one takes a “dogmatic” prior according to which a hypothesis is impossible, P(h | b) = 0, then no amount of evidence will dig one out of that hole: according to Bayes’ theorem, the “posterior” probability P(h | e & b) will still be zero after any evidence e. Some logical intuition will be needed to evaluate whether a mathematical proposition is inherently unlikely or not, but such intuition seems to be available in simple cases; for example, it seems intuitively unlikely that the square of every odd number should be odd unless there was some good mathematical reason for that.
Other versions of Bayesianism are possible, such as subjective Bayesianism which requires only the axioms and does not constrain prior probabilities (Howson and Urbach 2006) and a version of objective Bayesianism that falls short of claiming that the probabilities involved in evidence evaluation are matters of strict logic (Williamson 2010). These are less attractive in the very abstract reasoning of pure mathematics where it seems that any considerations that genuinely count can only be logical; but in any case those versions agree on the qualitative principles of probability such as Pólya’s inductive pattern, which are the main concern in explaining rational methods of evaluating the strength of conjectures.
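The effect of a dogmatic prior can be seen by applying Bayes’ theorem repeatedly, as in the following Python sketch. The function names and the numbers are hypothetical: each checked instance is assumed to be entailed by the hypothesis (probability 1 if h is true) and to have probability 0.5 if h is false, and successive instances are treated as independent given h or given not-h. Those are simplifying assumptions for illustration, not claims about any real conjecture.

```python
def bayes_update(prior_h, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem (total probability for P(e))."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


def update_on_instances(prior_h, n_instances, p_instance_if_false=0.5):
    """Update on n verified instances, each entailed by the hypothesis."""
    p = prior_h
    for _ in range(n_instances):
        p = bayes_update(p, 1.0, p_instance_if_false)
    return p


if __name__ == "__main__":
    # A sceptical but open-minded prior is overwhelmed by ten confirmations.
    print(round(update_on_instances(0.01, 10), 4))   # about 0.9118
    # A dogmatic prior of zero is never moved by any amount of evidence.
    print(update_on_instances(0.0, 10))              # 0.0
```

Under these assumptions each confirmation doubles the odds on the hypothesis, which is why a prior of 0.01 rises above 0.9 after ten instances while a prior of exactly zero stays at zero.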
5 Evidence for and Against the Riemann Hypothesis
Much of what has been said so far – both about the process of evaluating evidence for conjectures and the general principles of Bayesian reasoning – can be illustrated by the evidence that has accumulated for and against the Riemann hypothesis, currently the most famous unsolved problem of mathematics. Because the hypothesis has resisted proof for so long, the mathematical community has been much more explicit than usual about the evidential situation. Expert opinion is that the Riemann hypothesis is almost certainly true, and that a proof is not exactly imminent but may not be far off. Naturally, some of the considerations are only comprehensible to true experts, but some of them are comparatively shallow and can be widely appreciated. Very relevant in this case are the shallowest arguments, those from authority. How people think about arguments from authority in mathematics is evident in reactions to people who claim to have proved the Riemann hypothesis. If someone unknown presents his “result” on BBC news and at a conference run by someone who has accepted fake papers, he will not be taken seriously (Steckles and Lawson-Perfect 2018). If the aged Fields Medallist Sir Michael Atiyah presents a sketch of a proof, it is still met with skepticism, because of the difficulty of the problem, the minimal amount of information provided, and the author’s track record of making unestablished proof claims in his old age (Amit 2018). On the other hand, when Alain Connes (2016) learnedly explains the comparative strengths of different strategies that may lead to a solution, mathematicians in general will nod in agreement, whether they understand it or not. He is the expert.
Table 1 Number of roots of the Riemann zeta function found to have real part ½
Worker                                  Number of roots found to have real part ½
Gram (1903)                             15
Backlund (1914)                         79
Hutchinson (1925)                       138
Titchmarsh (1935/6)                     1041
Lehmer (1956)                           25,000
Meller (1958)                           35,337
Lehman (1966)                           250,000
Rosser, Yohe, and Schoenfeld (1968)     3,500,000
Brent (1979)                            81,000,001
Te Riele, van de Lune et al. (1986)     1,500,000,001
Gourdon (2004)                          10^13
Platt and Trudgian (2020)               3 × 10^12 with more rigor
The Riemann hypothesis states that “With certain trivial exceptions, all the (infinitely many) roots of the Riemann zeta function have real part equal to ½.” For the present purpose an understanding of the Riemann zeta function as a function in the complex plane is not necessary; it is only important that the hypothesis is a simple universal proposition like “all ravens are black.” It is also true that the infinitely many nontrivial roots of the Riemann zeta function have a natural order, so that one can speak of “the first million roots.” (One of many popular but informed accounts is Derbyshire 2003.)
As a straightforward generalization, the Hypothesis is susceptible to purely inductive evidence, that is, just calculating many zeros starting from the first one and checking that they have real part ½. (We consider inductive evidence in mathematics in section 7 below.) That process was begun by Riemann after he made the conjecture in 1859 and has of course been much assisted by increasing computer power. The results, as they developed, are shown in Table 1. Inductions in mathematics can certainly dwarf those in science.
While this inductive evidence is generally taken to be quite strong (to some indeterminate degree), there is also a reason to deny it is close to certainty. That comes from the close connection of the Hypothesis with the prime number theorem. This theorem states that the distribution of primes thins out logarithmically, or more exactly, the number of primes less than x is (for large x) approximately equal to the integral
\[ \int_2^x \frac{dt}{\log t}. \]
If tables are drawn up for the number of primes less than x and the values of this integral, for x as far as calculations can reach, then it is always found that the number of primes less than x is actually less than the integral. On this evidence, it was thought for many years that this was true for all x. Nevertheless, Littlewood proved that this is false. While he did not produce an actual number for which it is false, it appears that the first such number is extremely large – well beyond the range of
computer calculations. That gives some reason – it is very hard to say how strong – for believing that there might be a very large counterexample to the Riemann Hypothesis even though there are no small ones.
The reasons why most mathematicians’ confidence in the Riemann hypothesis does approach certainty are not taken from further numerical evidence but from its connections with other theses that are or “ought to be” right. The first of these was the prime number theorem just mentioned. It is a consequence of the Riemann Hypothesis and was proved independently in 1896. The support given by this proof is thus an instance of Pólya’s schema, mentioned above, “theories are confirmed by their consequences or predictions.” However, the prime number theorem is not a very strong consequence of the Riemann hypothesis: the Riemann Hypothesis “says a lot more,” so the support given by the prime number theorem is only moderate.
Two other considerations (among those accessible to ordinary mathematicians) are much stronger. One is the remarkable “Denjoy’s probabilistic interpretation of the Riemann Hypothesis,” an easily understood thesis in number theory which is equivalent to the Riemann Hypothesis and is independently intuitively plausible. We will not review it here (for a very brief account see Franklin 2014, Chap. 15). The other is again a result of its connection with the prime number theorem. That theorem gives a crude estimate of the density of primes (namely, the density of primes around a large number N is 1/log N). The Riemann Hypothesis implies a much more fine-grained knowledge of the distribution of primes (namely, that the difference between the number of primes up to x and the integral above is O(√x log x), the best possible). If the Riemann Hypothesis is true, that fine-grained distribution is the simplest and most natural one, but if the Hypothesis is false, some more complicated and unnatural phenomenon must be present.
The type of probabilistic reasoning involved is what Mazur (2014) calls “reasoning from randomness”: that after known constraints have been taken into account, the rest should be random. (To take a simpler example, given that all primes except 2 and 5 have last digits 1, 3, 7, or 9, one can initially expect these digits to occur equally often; that is, uniformly distributed as they would be if chosen at random.)
The Riemann Hypothesis is also supported by an argument from analogy (identified by Mazur as another probabilistic argument form common in mathematics). In some famous and difficult work, André Weil proved that the analogue of Riemann’s Hypothesis is true for certain other zeta functions, and his related conjectures for an even more general class of zeta functions were proved to widespread applause in the 1970s. One expert says “It seems that they provide some of the best reasons for believing that the Riemann hypothesis is true – for believing, in other words, that there is a profound and as yet uncomprehended number-theoretic phenomenon, one facet of which is that the roots ρ all lie on Re s = ½” (Edwards 1974, 298). However, not all experts are convinced of the closeness of the analogy and outsiders are in no position to judge.
The Riemann Hypothesis, then, is a remarkable testbed for displaying the diversity of probabilistic evidence that can bear on a mathematical conjecture.
Similar stories could be told with the other famous unsolved problems of mathematics. For example, Goldbach’s conjecture is generally strongly believed to be true (on the basis of both simple numerical and other evidence) but also to be very unlikely to be proved soon (Review in (Baker 2009), section 3.2; evidence for some other currently unsolved conjectures listed in (Aberdein 2019), section 1). It is also instructive to follow the historical development of evidence for now-proved conjectures, like Fermat’s last theorem and the classification of finite simple groups. In those cases, the growing confidence of experts both that the conjecture was true and that proof was imminent turned out to be justified (Franklin 2014, Chap. 15). It has indeed been argued that there is a certain observable regularity in the time-to-proof of mathematical conjectures in general (Hisano and Sornette 2013). It would be hard to deny that probabilistic evidence in mathematics is worthwhile, in the face of evidence that it has proved in the past to be a reliable guide to the (provable) truth.
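Two of the small-scale numerical observations discussed in this section can be reproduced with a short Python sketch: the comparison of the number of primes below x with the integral of 1/log t from 2 to x, and the roughly equal frequency of the last digits 1, 3, 7, and 9 among primes. The function names, the use of a sieve of Eratosthenes, and the use of Simpson's rule for the integral are choices made here for illustration; the ranges are far too small to bear on the large counterexample that Littlewood's theorem guarantees.

```python
import math


def primes_below(n):
    """All primes less than n (n >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n - 1) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [i for i in range(n) if sieve[i]]


def log_integral(x, steps=200_000):
    """Simpson's rule approximation to the integral of 1/log t from 2 to x."""
    if steps % 2:
        steps += 1                       # Simpson's rule needs an even count
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2 + k * h)
    return total * h / 3


def last_digit_counts(primes):
    """Count final digits of the primes other than 2 and 5."""
    counts = {1: 0, 3: 0, 7: 0, 9: 0}
    for p in primes:
        if p not in (2, 5):
            counts[p % 10] += 1
    return counts


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        n_primes = len(primes_below(x))
        print(x, n_primes, round(log_integral(x), 1))
    print(last_digit_counts(primes_below(10**5)))
```

In each row printed, the prime count falls below the value of the integral, which is the pattern that was once conjectured to hold for all x.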
6 Probabilistic Relations Between Necessary Truths
There is one difficult question that needs consideration in applying Bayesian theory or nondeductive logic in mathematics in particular. P(h | e) is intended to be a measure of the support that evidence e gives to hypothesis h. If e entails h, then P(h | e) is 1, since h is certain given e. But in mathematics, the typical case is that e does entail h, though that is perhaps as yet unknown. If, however, P(h | e) is really 1, how is it possible in the meantime to discuss the (nondeductive or probabilistic) support that e may give to h, that is, to treat P(h | e) as less than 1? In other words, if h and e are necessarily true or false, how can P(h | e) be other than 0 or 1?
The answer is that, in both deductive and nondeductive logic, there can be many logical relations between two propositions. Some may be known and some not. To take an artificially simple example in deductive logic, consider the following argument:
If all men are mortal, then this man is mortal
All men are mortal
----------
Therefore, this man is mortal.
The premises entail the conclusion, certainly, but there is more to it than that. They entail the conclusion in two ways: firstly, by modus ponens, and secondly by instantiation from the second premise alone. That is, there are two logical paths from the premises to the conclusion. More realistic cases are common in mathematics, when different mathematicians produce different proofs of the same theorem, that is, different logical paths from the axioms to the theorem.
Now just as there can be two deductive paths between premises and conclusion, so there can be a deductive and nondeductive path, with only the latter known. Before the Greeks’ development of deductive geometry, it was possible to argue the following:
All equilateral (plane) triangles so far measured have been found to be equiangular
This triangle is equilateral
----------------------------------------
Therefore, this triangle is equiangular

There is a nondeductive logical relation between the premises and the conclusion: the premises inductively support the conclusion. But when deductive geometry appeared, it was found that there was also a deductive relation, since the second premise alone entails the conclusion. This discovery in no way vitiates the correctness of the previous nondeductive reasoning or casts doubt on the existence of the nondeductive relation. That relation cannot be affected by discoveries about any other relation. So the answer to the question “How can there be probabilistic relations between necessary truths?” is simply that those relations are additional to any deductive relations. They may be known independently of them (and are often easier to know). Once that is established, it is possible to understand Pólya’s remark that nondeductive logic is better appreciated in mathematics than in the natural sciences (1954, vol II, 24). In mathematics there can be no confusion over natural laws, the uniformity of nature, propensities, the theory-ladenness of observation, pragmatics, scientific revolutions, the social relations of science, or any other red herrings. There are only the hypothesis, the evidence, and the logical relations between them.
7 The Problem of Induction in Mathematics
As we have seen, induction, or inference from the observed to the unobserved, is found in pure mathematics as well as in the natural and social sciences. That has significance for understanding the nature of induction as well as for mathematics. If induction works in just the same way in pure mathematics as in science, its rationality and justification would seem to be independent of any contingent facts about the world, such as the obtaining of natural laws or the uniformity of nature. Let us take an easy example, even simpler than the ones above concerning the roots of the Riemann zeta function.

The first million digits of π are random
Therefore, the second million digits of π are random

The first few digits of π are 3.141592653589793238462643383279502884197169399375105820974944 ... It can be seen that they are random in the sense of lacking pattern. (That notion is formalized in several ways, as in “passes standard statistical tests for randomness” or “incompressible,” not “probabilistically generated,” “stochastic” (Eagle 2018); of course the digits of π are generated by a deterministic formula. For present purposes, “looking patternless” is sufficient.) It is easily checked that the first million digits of π do both look random and pass statistical tests for randomness (such as having about the same number of each digit). (Calculations now stretch to 4 trillion digits in hexadecimal (Bailey et al. 2012).) Intuitively, that is good evidence that the patternlessness will continue into the second million digits, just in the same way that observing a million black ravens is good inductive evidence that further observed ravens will be black. It is widely believed that the digits continue to be random indefinitely, although it is not proved even that there are about equal numbers of each digit (Marsaglia 2005). (A claim that the digits of π fail a more subtle statistical test for randomness is made in (Ganz 2014), with debate in (Bailey et al. 2017) and (Ganz 2017).) Gronau and Wagenmakers (2018) argue that this is a perfect case for appreciating the Bayesian perspective in pure mathematics. They apply a formal Bayesian analysis to the hypothesis that π is normal (in base 10), that is, that the ten digits appear equally often in its decimal expansion. They conclude that “After all [the first] 100 million digits [of π] have been taken into account, the observed data are 1.86 × 10^30 times more likely to occur under [the hypothesis that π is normal] than under [the alternative hypothesis that it is not]. The extent of this support is overwhelming.” As with any Bayesian analysis with exact numbers, the final conclusion depends on a choice of prior probabilities, and the authors discuss possible choices. It is true, as argued by Baker (2007), that there is a special problem with inductive arguments in number theory in that all the observed cases are of small numbers. Any number that can be calculated with is very small, compared to numbers in general.
That bias in the evidence could raise a question as to whether any induction of the form “All observed numbers have property X, therefore all numbers have property X” could have high probability. That does not imply, however, that inductive arguments in mathematics are generally poor. Firstly, a bias in the evidence towards small numbers does not affect inductive arguments with more modest conclusions, such as “All observed numbers have property X, so the next number calculated will have property X.” (For example, the argument above about the randomness of the digits of π only extrapolated a finite distance, thus keeping to small numbers.) Secondly, many other inductive arguments have a bias in the evidence, without thereby becoming worthless (though they may become less secure). For example, extrapolative inductive inference like “All observed European swans are white, therefore all swans in the world are white” is a worthwhile inductive argument, even though the extrapolation beyond the observed range weakens it. It is nevertheless true that purely inductive evidence is less credible in mathematics than it might be elsewhere, because of the possibility that some mathematical reason against a generalization might apply only for huge numbers. As we saw, the purely inductive evidence for the Riemann hypothesis was a possible instance of that, and it is possible to have “overwhelming” numerical evidence for a conjecture, where the reasons for it being false also reveal why there should be such good evidence for it (Schuster 1985). That is not the same, however, as saying that the evidence was not good in the first place or that a blanket skepticism is justified.
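The digit-frequency check on π discussed earlier in this section can be tried on a small scale. The following is a minimal sketch, not a reconstruction of any published analysis: it assumes Machin's formula for generating the digits, and for the Bayesian comparison it assumes a flat Dirichlet(1, ..., 1) prior over the ten digit probabilities under the alternative hypothesis. Neither choice is taken from Gronau and Wagenmakers (2018), the function names are illustrative only, and the numbers it prints are not meant to reproduce theirs.

```python
from collections import Counter
from math import lgamma, log


def arctan_inv(x, unity):
    """arctan(1/x) scaled by `unity`, summed from its alternating Taylor series."""
    term = unity // x
    total = term
    x_squared = x * x
    n = 1
    sign = -1
    while term:
        term //= x_squared           # next odd power of 1/x, still scaled by unity
        n += 2
        total += sign * (term // n)  # add or subtract the term divided by n
        sign = -sign
    return total


def pi_digits(n):
    """The first n decimal digits of pi after the decimal point."""
    guard = 10                       # extra digits that absorb rounding error
    unity = 10 ** (n + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239).
    pi_scaled = 16 * arctan_inv(5, unity) - 4 * arctan_inv(239, unity)
    return str(pi_scaled // 10 ** guard)[1:]   # drop the leading "3"


def digit_counts(digits):
    """How often each of the digits 0..9 occurs in the string."""
    tally = Counter(digits)
    return [tally[str(d)] for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def log_bayes_factor_uniform(counts):
    """Natural log of the Bayes factor for H0 (every digit has probability 1/10)
    against H1 (unknown probabilities, flat Dirichlet prior: an assumption)."""
    n = sum(counts)
    k = len(counts)
    log_m0 = -n * log(k)
    log_m1 = lgamma(k) - lgamma(n + k) + sum(lgamma(c + 1) for c in counts)
    return log_m0 - log_m1


digits = pi_digits(2000)
counts = digit_counts(digits)
print("counts of 0..9:", counts)
print("chi-square statistic:", round(chi_square_uniform(counts), 2))
print("log Bayes factor (H0 vs H1):", round(log_bayes_factor_uniform(counts), 2))
```

A positive log Bayes factor means the observed counts are more probable under equal digit frequencies than under the assumed alternative. As the text notes for any exact Bayesian figure, the value depends on the prior, so a different prior for H1 would give a different number.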
The occurrence of inductive arguments in pure mathematics suggests something about their nature as well as the nature of mathematics. If inductive arguments work in the necessary matter of pure mathematics, apparently on the same basis as they work in science, it seems they cannot depend on any contingent principles, such as the uniformity of nature or the holding of natural laws. The digits of π are surely the same in all possible worlds, and the rationality of arguments about them independent of any differences between possible worlds. That is more compatible with logical justifications of induction (e.g., Stove 1986) than ones that rely on either contingent facts like natural laws or sociological facts about what reasoners do. Again, it is easier to appreciate the force of probabilistic reasoning in mathematical cases, free of distractions.
8 Conclusion
The Bayesian perspective brings some order into part of the vast backstory of mathematical practice that underpins published research output. It allows us to understand the nature of the variety of methods of evaluating evidence – for pure mathematical conjectures, for the behavior of mathematical models, and for the performance of statistical algorithms. Experimental evidence, checking of consequences, weighing of intuitions, arguments from analogy, and expectations as to uniform distributions work the same way in mathematics as they do in natural and social sciences. The Bayesian perspective explains the unity of those methods and why they have rational force. The fact that they do work in the necessary matter of mathematics suggests that they are matters of pure logic.
9 Cross-References
▶ Experiments in Mathematics: Fact, Fiction or the Future?
▶ Heuristics and Mathematical Practice
▶ How Experiments in Mathematics Can Contribute to the Learning of Mathematics
▶ Numerical Calculations as Experiments in 19th Century Mathematics
▶ On Computer Simulations as Experiments in the Natural Sciences
▶ Proofs, Arbitrary Exemplifications, and Inductive Generalizations in Euler’s Mathematical Practice
References
Aberdein A (2019) Evidence, proofs, and derivations. ZDM 51:825–834
Amit G (2018) Riemann hypothesis likely remains unsolved despite claimed proof. New Scientist, 24 Sept. https://www.newscientist.com/article/2180504-riemann-hypothesis-likely-remains-unsolved-despite-claimed-proof/
Bailey DH, Borwein JM, Calude CS, Dinneen MJ, Dumitrescu M, Yee A (2012) An empirical approach to the normality of π. Exp Math 21(4):375–384
Bailey DH, Borwein JM, Brent RP, Reisi M (2017) Reproducibility in computational science: a case study: randomness of the digits of pi. Exp Math 26(3):298–305
Baker A (2007) Is there a problem of induction for mathematics? In: Leng M, Paseau A, Potter M (eds) Mathematical knowledge. Oxford University Press, Oxford, pp 59–73
Baker A (2009) Non-deductive methods in mathematics. Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/mathematics-nondeductive
Boij M (1999) Betti numbers of compressed level algebras. J Pure Appl Algebra 134(2):111–131
Borwein J, Bailey D (2004) Mathematics by experiment: plausible reasoning in the 21st century. AK Peters, Natick
Connes A (2016) An essay on the Riemann hypothesis. In: Nash JF, Rassias MT (eds) Open problems in mathematics. Springer, Cham, pp 225–257
Derbyshire J (2003) Prime obsession: Bernhard Riemann and the greatest unsolved problem in mathematics. Joseph Henry Press, Washington, DC
Eagle A (2018) Chance versus randomness. Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/chance-randomness/
Easwaran K (2009) Probabilistic proofs and transferability. Philosophia Mathematica 17(3):341–362
Edwards HM (1974) Riemann’s zeta function. Academic, New York
Fallis D (1997) The epistemic status of probabilistic proof. J Philos 94(4):165–186
Franklin J (1987) Non-deductive logic in mathematics. Br J Philos Sci 38(1):1–18
Franklin J (2011) The objective Bayesian conceptualisation of proof and reference class problems. Sydney Law Rev 33(3):545–561
Franklin J (2014) An Aristotelian realist philosophy of mathematics: mathematics as the science of quantity and structure. Palgrave Macmillan, Basingstoke
Franklin J (2016) Logical probability and the strength of mathematical conjectures. Math Intell 38(3):14–19
Ganz RE (2014) The decimal expansion of π is not statistically random. Exp Math 23(2):99–104
Ganz RE (2017) Reply to “reproducibility in computational science: a case study: randomness of the digits of Pi” [Bailey et al. 17]. Exp Math 26(3):306–307
Gronau QF, Wagenmakers E-J (2018) Bayesian evidence accumulation in experimental mathematics: a case study of four irrational numbers. Exp Math 27(3):277–286
Harris P, Fotheringham AS, Crespo R, Charlton M (2010) The use of geographically weighted regression for spatial prediction: an evaluation of models using simulated data sets. Math Geosci 42(6):657–680
Hersh R (1991) Mathematics has a front and a back. Synthese 88(2):127–133
Hisano R, Sornette D (2013) Challenges to the assessment of time-to-proof of mathematical conjectures. Math Intell 35(4):10–17
Hoffmann J, Bar-Sinai Y, Lee LM, Andrejevic J, Mishra S, Rubinstein SM, Rycroft CH (2019) Machine learning in a data-limited regime: augmenting experiments with synthetic data uncovers order in crumpled sheets. Sci Adv 5(4):eaau6792
Horgan J (1993) The death of proof. Sci Am 269(4):92–103
Howson C, Urbach P (2006) Scientific reasoning: the Bayesian approach, 3rd edn. Open Court, Chicago
Inglis M, Mejía-Ramos JP (2009) The effect of authority on the persuasiveness of mathematical arguments. Cogn Instr 27(1):25–50
Jaynes ET (2003) Probability theory: the logic of science. Cambridge University Press, Cambridge
Keynes JM (1921) A treatise on probability. Macmillan, London
Krantz SG (2011) John Horgan and “the death of proof?”. In: The proof is in the pudding. Springer, New York, pp 219–222
Marsaglia G (2005) On the randomness of pi and other decimal expansions. http://www.yaroslavvb.com/papers/marsaglia-on.pdf
Mazur B (2014) Is it plausible? Math Intell 36(1):24–33
Ornes S (2020) How close are computers to automating mathematical reasoning? Quanta Magazine, Aug 27. https://www.quantamagazine.org/how-close-are-computers-to-automating-mathematical-reasoning-20200827/
Paseau A (2015) Knowledge of mathematics without proof. Br J Philos Sci 66:775–799
Platt D, Trudgian T (2020) The Riemann hypothesis is true up to 3·10^12, arXiv:2004.09765
Pólya G (1954) Mathematics and plausible reasoning (vol. I, Induction and analogy in mathematics, and vol. II, Patterns of plausible inference). Princeton University Press, Princeton
Rabin MO (1980) Probabilistic algorithm for testing primality. J Number Theory 12(1):128–138
Schuster EF (1985) On overwhelming numerical evidence in the settling of Kinney’s waiting-time conjecture. SIAM J Sci Stat Comput 6(4):977–982
Sørensen HK (2016) ‘The end of proof’? The integration of different mathematical cultures as experimental mathematics comes of age. In: Larvor B (ed) Mathematical cultures. Birkhäuser, Cham, pp 139–160
Steckles K, Lawson-Perfect C (2018) Atiyah Riemann Hypothesis proof: final thoughts, aperiodical.com, Sept 28. https://aperiodical.com/2018/09/atiyah-riemann-hypothesis-proof-final-thoughts/
Stove D (1986) The rationality of induction. Clarendon, Oxford
Tsai W, Chen S, Lu G (2015) Numerical evidence of turbulence generated by nonbreaking surface waves. J Phys Oceanogr 45(1):174–180
Wilf HS (2008) Mathematics: an experimental science. In: Gowers T (ed) Princeton companion to mathematics. Princeton University Press, Princeton, pp 991–999
Williamson J (2010) In defence of objective Bayesianism. Oxford University Press, Oxford
Zeilberger D (1993) Theorems for a price: tomorrow’s semi-rigorous mathematical culture. Notices Am Math Soc 40(8):978–981
Zeilberger D (2012) Appendix to Doron Zeilberger’s Opinion 117: Links to posted Grant Proposals. https://sites.math.rutgers.edu/~zeilberg/Opinion117Appendix.html
Bolzano’s Theory of meßbare Zahlen: Insights and Uncertainties Regarding the Number Continuum
Elías Fuentes Guillén
Contents
1 Introduction
2 Bolzano’s Uncertainties Regarding the Notions of Quantity and Number
3 The Conceptual Framework of Bolzano’s Measurable Numbers
4 Bolzano’s meßbare Zahlen
5 Final Remarks
References
Abstract
During the first half of the 1830s, and as part of his project for a Größenlehre, Bernard Bolzano worked on a manuscript entitled Reine Zahlenlehre in which he introduced the notion of what he called “meßbare Zahlen.” The various additions and corrections to its three extant versions are evidence of an unfinished work, the definitive edition of which was not published until 1976. The present chapter casts light upon the links between, on the one hand, his theory of “measurable numbers” and its conceptual framework, and, on the other hand, his insights and uncertainties with regard to the notions of number and quantity prior to the writing of that work. While Bolzano’s proposal has usually been considered as an attempt at a theory of what nowadays is called the real-number continuum, this chapter shows that a more faithful reading must consider it as a pioneering and transitional theory of the number continuum which provided relevant insights into this latter but which remained, nonetheless, still bound to a not-yet-modern conception of mathematics and numbers.

Keywords
Bernard Bolzano · Measurable numbers · Number continuum · Real numbers · Nineteenth-century mathematics

E. Fuentes Guillén (*)
Institute of Philosophy of the Czech Academy of Sciences, Prague, Czech Republic
Department of Mathematics, Faculty of Sciences, UNAM, Mexico City, Mexico
© Springer Nature Switzerland AG 2022. B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_96-2
1 Introduction
Throughout the eighteenth century mathematics was usually defined as the “science of measuring everything that can be measured” or the “science of the quantities, that is, all those things that can be augmented or reduced.”1 However, by the end of the nineteenth century such an understanding of mathematics, which entailed the division of quantities into discrete (i.e., numbers) and continuous, was no longer advocated by most mathematicians. So, while certain appellations used by some of the authors who developed the first theories of real numbers, such as Zahlengrössen (numerical quantities; cf. Cantor 1872; Weierstraß 1868/1986), still evoked that earlier understanding, they nevertheless involved a different, relatively abstract conception of quantities and, ultimately, mathematics (cf. Ferreirós 2007, p. 125). The disagreement between Richard Dedekind and Rudolf Lipschitz regarding a note included by the former in the introduction to his Sur la théorie des nombres entiers algébriques (1876–77) reveals the prevalence of the idea that Euclid’s theory of proportions, and in particular definition 5 from Book V of his Elements, was enough “to guarantee the continuity of the domain of ‘incommensurable magnitudes’” (Benis-Sinaceur 2015, p. 25; cf. Lipschitz 1876/1986, p. 72). Whereas Lipschitz advocated that idea, Dedekind explicitly rejected it and pointed out that, from his point of view, “arithmetic is to be kept free of any [mixture] of foreign elements,”2 something on which he insisted in his own 1888 work (cf. Dedekind 1888, p. X). This chapter draws attention to some of the complexities involved in the transition from a traditional notion of numbers to the definitions of real numbers published from c. 1870 onward, including the ones by Dedekind and Cantor. This is done by focusing specifically on the work of Bernard Bolzano, whose theory of “measurable numbers” (meßbare Zahlen) is considered to be an early “attempt at a theory of real numbers” (Russ 2004, p. 
348; cf. Šebestík 1992, p. 354; Laugwitz 1982, p. 667; Rusnock 2000, p. 177). The work by Bolzano which is considered by some scholars to contain such an attempt is entitled Pure Theory of Numbers (Reine Zahlenlehre; henceforth RZ). Although this work was written in the first half of the 1830s and its manuscript was rediscovered at the beginning of the twentieth century, its definitive edition was published only in 1976 as part of the Bernard Bolzano Gesamtausgabe (Bernard Bolzano: The Complete Edition; henceforth BBGA),3 publication of which, by the publisher Frommann-Holzboog, has been ongoing from 1969 up to the present day.4

1 (Wolff 1716, p. 863): “Ist eine Wissenschafft alles auszumessen, was sich ausmessen läst. Insgemein beschreibet man sie per scientiam quantitatum, durch eine Wissenschafft der Größen, das heisset, aller derjenigen Dinge, die sich vergrössern oder verkleinern lassen.”
2 (Dedekind 1876, p. 284): “l’Arithmétique doit être maintenue exempte de tout mélange d’éléments étrangers.”

As will be argued, Bolzano’s RZ, which was developed within the framework of his project for a theory of quantities (Größenlehre), and in particular his theory of measurable numbers, was still tied to a not-yet-modern conception of numbers, quantities and, ultimately, mathematics. Thus, Sect. 2 discusses his insights and uncertainties with regard to the notions of quantity and number mainly as expressed in the mathematical works that preceded RZ, while also paying attention to certain documents (both posthumously published and as yet unpublished) preceding, contemporary with, and succeeding RZ, such as Bolzano’s mathematical notebooks (Miscellanea Mathematica), which were published for the first time in the BBGA. Then, Sect. 3 goes on to focus on the introductory volume to Bolzano’s theory of quantities, which precedes RZ, as well as on the first six sections of the latter work in order to better understand his project of a pure theory of numbers, while also paying attention to his mathematical notebooks written during the 1820s. Finally, Sect. 4 addresses his insights and uncertainties with regard to the number continuum by bringing to the fore the particularities of his “measurable numbers” and the other core notions of the seventh and last section of RZ. The need to study Bolzano’s manuscripts has recently been highlighted by both Steve Russ and Kateřina Trlifajová, with specific reference to RZ (cf. Russ and Trlifajová 2016, p. 
44), and Jan Šebestík, with reference to Bolzano’s project of a Größenlehre. As noted by Šebestík in the “Afterword” to Bernard Bolzano: His Life and Work, which he co-wrote with Paul Rusnock, in some cases such a study could “modify our understanding of Bolzano’s thought” (Rusnock and Šebestík 2019, p. 597). This chapter will show that this is the case with Bolzano’s notion of numbers and that the present research not only contributes to its elucidation but also accounts for the complexities of the emergence of our number continuum insofar as, to paraphrase Russ, Bolzano’s theory epitomizes those complexities vividly (cf. Russ 2004, p. 348). In this sense, this chapter follows Epple’s recommendation to thoroughly investigate the “eventful atmosphere” that led to “the end of the paradigm of the science of quantity”5 by addressing the subtleties of the work of an author writing somewhat earlier than those whom Epple discusses in Das Ende der Grössenlehre. Eine Einführung in die Geschichte der Grundlagen der Analysis, 1860–1930.

3 For a detailed account of the publication of RZ and its reception throughout the second half of the twentieth century, cf. (Spalt 1991, pp. 16–24; Russ 2004, pp. XX & 348). According to two letters written by Bolzano in the early 1840s and quoted by Eduard Winter, Bolzano resumed work on his project on mathematics between mid-1840 and mid-1841 (cf. Winter 1933, pp. 215–216).
4 Founded by Jan Berg, Friedrich Kambartel, Jaromír Louzil, Bob van Rootselaar and Eduard Winter, and currently edited by Edgar Morscher, the BBGA comprises some introductory volumes (Einleitungsbände) and, up to the present day, about 100 published volumes divided into four series: (I) Schriften, the “works published during Bolzano’s lifetime”; (II) Nachlaß, his “posthumous works” (subpart A) and his “scientific diaries” (subpart B); (III) Briefwechsel, his correspondence with various correspondents; and (IV) Dokumente, which includes various documents ranging from portraits and biographies to documents on Bolzano’s trial (https://www.frommann-holzboog.de/editionen/20?lang=de). All references to volumes of the BBGA, therefore, contain: (a) the initials BBGA; (b) the letter “E” (for the introductory volumes) or the number of the series (1–4) and, in the case of the Reihe II, the subpart letter (i.e., 2A or 2B); and (c) the volume number and, in some cases, the supplement number.
2 Bolzano’s Uncertainties Regarding the Notions of Quantity and Number
Bolzano entered the Faculty of Philosophy at the Carl-Ferdinandischen Universität (Prague) as a student in 1796 and, as is known from his lecture notes for the first academic year, was taught there the notions that were customary throughout the eighteenth century: mathematics was the “science of quantities,” that is, “that which is capable of increase or decrease,” while numbers were defined as “a multitude of things of one kind.”6 Such traditional notions can be found in almost all German-language mathematics textbooks of the time, including Abraham Gotthelf Kästner’s Anfangsgründe, which had been the official textbook in Prague since 1784 (cf. Handbuch 1786, p. 407).7 As Bolzano explained in his 1804 Considerations on Some Objects of Elementary Geometry (henceforth BG), such a conception of “quantity” entailed that a thing could only be quantified if “regarded as consisting of a number (plurality) of things that are equal to the unit (or the measure).”8 Thus, just as natural numbers would be a plurality of things equal to 1, a straight line would be a plurality of things equal to a certain straight line segment taken as unit of measurement. However, as he noted in his written examination for the chair of Elementary Mathematics at Prague University, which took place shortly after the publication of that work, certain things that at the time were regarded as “quantities” in fact posed a problem both conceptually and operationally, as in the case of the so-called infinitely small quantities. For Bolzano, the notion of these quantities was “incomprehensible” (unverständlich) inasmuch as they were either to be considered as “nothing” (Nichts), in which case they would not be a quantity but rather the absence of a quantity, or to be considered as something, in which case the usual procedure, according to which a piece of a curved line equals a piece of a straight line if their difference is an infinitesimal, would be strictly speaking unacceptable (ČG Publicum 1796–1805 98/755, p. [13v]). As he went on to explain further, such a procedure as this latter “does not fundamentally remove the difficulty; rather it only removes it for the eye, though not for the understanding.”9 Such reluctance to accept infinitesimals can also be found in Bolzano’s second published mathematical work, namely his 1810 Contributions to a Better-Grounded Presentation of Mathematics (henceforth BD).10 There, addressing the so-called infinitely small and great quantities, he wrote that it was yet to be decided whether the “infinite” (whose notion comprised both), “or the differential,” was “nothing else but a symbolic expression,” as in the case of “√−1 and suchlike [expressions].”11 At the time, “symbolic expression” was an appellation commonly used to refer to so-called imaginary quantities, as is attested to by the corresponding entry in the supplement to the famous mathematical dictionary initially developed by Georg Simon Klügel (cf. Grunert 1836, p. 387), as well as by the term’s usage in Augustin-Louis Cauchy’s Cours d’Analyse (cf. Cauchy 1821, p. 173ff.). Nevertheless, in the case of these two references, as in the case of so many others, there seems to have been no deep reflection, as there was in Bolzano, on why mathematical infinites and imaginaries could not legitimately be regarded as quantities, let alone as numbers, but at most as “symbolic expressions.”12

5 (Epple 1996, pp. 3 & 1): “Ende des Paradigmas der Größenlehre”; “bewegten Atmosphäre.”
6 (LA PNP, C II 14/2, [2r] & [4v]): “Die Größe ist das, was einer Vermehrung, oder einer Verminderung fähig ist”; “Zahl ist eine Menge von Dingen einerley Art.” LA PNP is the abbreviation for the Literární archiv Památníku národního písemnictví (Literary Archive of the Museum of National [Czech] Literature), in Prague, where Bolzano’s notes are preserved.
7 The above-cited definitions can be found almost verbatim in (Kästner 1786, pp. 1 & 21). I refer to the fourth edition of Kästner’s work since it is the one which is known, with certainty, to have been available at the time at Prague’s Public and University Library, which eventually became the National Library of the Czech Republic (cf. Catalogus Mathematicorum IX A 19 1781ff., p. 130 [r]).
8 (Bolzano 1804, pp. 3–4; cf. Russ 2004, p. 37): “Größe heißt ein Ding, insofern es angesehen wird als bestehend aus einer Anzahl (Vielheit) von Dingen, die der Einheit (oder dem Maße) gleich sind.” The title of Bolzano’s work is Betrachtungen über einige Gegenstände der Elementargeometrie.
9 (ČG Publicum 1796–1805 98/755, p. [13v]): “hebt die Schwierigkeit nicht im Grunde; sondern entrückt sie nur dem Auge, nicht aber dem Verstande.” For a detailed account of Bolzano’s examination, as well as its transcription and English translation, cf. (Fuentes Guillén and Crippa 2021).
10 The title of Bolzano’s work is Beyträge zu einer begründeteren Darstellung der Mathematik. This was the first installment (Lieferung) of that work, the second (and unfinished) installment of which was reconstructed by Jan Berg, based on two versions, and not published until 1977 in BBGA 2A5.
11 (Bolzano 1810, p. 30): “das Unendliche, oder das Differenzial, nichts anders als ein symbolischer Ausdruck sey, gerade wie √−1, dgl.”
12 A noteworthy precedent here may possibly be found in Lambert, who in a letter to Kant, dated October 13, 1770, wrote: “The sign √−1 represents an unthinkable non-thing. And yet it can be used very well in finding theorems” (Zweig 1999, p. 118). It is worth mentioning that it had been only a short time before, in 1767/1768, that Lambert had composed and published his famous Mémoire sur quelques propriétés remarquables des quantités transcendantes, circulaires et logarithmiques, in which he seems to avoid the use of imaginaries (cf. Lambert 1761/1768, p. 319; Barnett 2004, pp. 24–25). I am grateful to Eduardo Dorrego López for bringing this to my attention.
Interestingly, among the extant notes for a section entitled “On the Properties of Numbers,” which was to be part of the second installment of BD and would have been written around 1810 (cf. Berg 1977, pp. 7–8), Bolzano stated that “if a number is conceived as augmented by 1, a new number arises,” so that, as he added in a marginal note, a new number would arise whenever two numbers were “conceived together.”13 As noted by Johan Blok, “such a statement would seem akin to the modern notion of a successor function” (Blok 2016, p. 256), inasmuch as the addition of the unit to each number would originate the subsequent one. Nonetheless, considering the aforementioned traditional notions of both number and quantity, the extension of the former beyond the sequence of the naturals still required, from Bolzano’s point of view, some justification—something that, in the case of the irrationals, posed problems. In a couple of notes included in his mathematical notebooks and dating from late 1814, Bolzano addressed the question of how to determine irrational quantities, irrationality being for him, as in the corresponding cases of the infinite and of fractions, a “predicat[e] [that] refer[s] to quantities” and not to numbers.14 Thus, on October 26, he pointed out the possibility of considering an irrational as “a quantity for which the rule to determine it by numbers is indeed given but requires an operation which can never be brought to an end,” which will allow one to approximate to said quantity “as closely as one desires.”15 Moreover, he said, an irrational would be a quantity that lies between twopgiven quantities which involve ffiffiffi two such rules, so that, for example, in the case of 2, one could establish the two following “rules (manners)” for determining it: (a) to “[l]ook at any power of 10 as a denominator, and seek out the greatest possible number which, considered as numerator, gives a fraction the square of which would be smaller than 2”; (b) to “look at any 
power of 10 as a denominator, and seek out the smallest possible number which, considered as numerator, gives a fraction the square of which would be greater than 2.”16 This would mean that $\sqrt{2}$ could be determined as the “limit” (Grenze) to which, following these rules, the quantities 1.4 (or $\tfrac{14}{10}$, i.e., the greatest fraction the square of
13 (BBGA 2A5, p. 61 & fn. r): “Wenn zu einer Zahl noch 1 hinzugedacht wird, entsteht eine neue Zahl”; “Wenn 2 Zahlen zusammengedacht werden, entsteht eine neue Zahl.” The title of the note is Von den Eigenschaften der Zahlen.
14 (BBGA 2B6/2, p. 195): “alle diese Prädicate beziehen sich meiner Ansicht nach auf Größen.”
15 (BBGA 2B6/2, p. 195): “ist eine Größe, für welche die Regel sie durch Zahlen zu bestimmen, zwar gegeben ist, aber eine nie zu beendigende Operation erfordert”; “als man nur immer will.”
16 (BBGA 2B6/2, p. 196): “Regeln (Arten)”; “Man sehe irgend eine Potenz der 10 als Nenner an, und suche dazu die möglichst größte Zahl, welche als Zähler betrachtet einen Bruch gibt, dessen Quadrat kleiner als 2 wäre”; “man sehe irgend eine Potenz der 10 als Nenner an, und suche dazu die möglichst kleinste Zahl, welche als Zähler betrachtet einen Bruch gibt, dessen Quadrat größer als 2 wäre.”
Bolzano’s Theory of meßbare Zahlen: Insights and Uncertainties. . .
which, according to the first rule, would be smaller than 2) and 1.5 (or $\tfrac{15}{10}$, i.e., the smallest fraction the square of which, according to the second rule, would be greater than 2), 1.41 and 1.42, and so on, could approximate as closely as one desired (cf. BBGA 2B6/2, pp. 195–197).17 But then, in a note dated three days later (i.e., October 29, 1814) and entitled “Irrational, transcendental quantities,” Bolzano added that, according to his previous note, “the irrational quantities are to numbers (arithmetic) what the transcendent ones are to functions, that is to say, to finite functions, i.e. to algebraic functions (algebra),” inasmuch as the former “cannot be expressed by a finite multitude of numbers,” while the latter “cannot be expressed by a finite multitude of powers and products.”18 This was a quite interesting remark and meant that, strictly speaking, irrationals were not to be regarded as numbers but rather as quantities that could only be determined numerically according to the aforementioned rules. Indeed, he even suggested that his previous explanation could be shortened “by using only one instead of two bounding [or limit] quantities,” so that an irrational could be regarded as a “constant quantity to which the variable [quantity] X, which arises from a rule that contains a number of repetitions of one and the same operation which can be increased to any extent one wishes, can approximate as closely as one desires.”19 As we will discuss in Sect. 4, Bolzano’s later procedure for determining what he would call “measurable numbers” resembles the one first outlined in his notes of 1814. However, in the first place (as noted by Bob van Rootselaar in the introduction to the volume of BBGA containing the first of these notes), Bolzano’s early procedure “provides a means of deciding whether a given quantity is measurable [...]
[,] not a means of constructing a measurable quantity” since, “[f]or Bolzano, the quantities are already there [and] one needs only to determine them by numbers.”20 In the second place, moreover, such a procedure still required some further
17 This procedure could be rooted in the successful use of continued fractions, by means of which approximations of irrationals, both by excess and by defect, could be provided. For a detailed account of continued fractions at the time, as well as a historical sketch of their use, cf. (Klügel 1808, pp. 43–87; cf. also Dorrego López and Fuentes Guillén, forthcoming).
18 (BBGA 2B7/1, p. 35): “Irrationale, transcendente Größen”; “sind die irrationalen Größen das für die Zahlen (Arithmetik), was die transcendenten für die Functionen, die endlichen, d.h. algebraischen Functionen, (Algebra) sind”; “durch keine endliche Menge von Zahlen ausgedrückt werden kann; so kann [...] durch keine endliche Menge von Potenzen und Producten ausgedrückt werden.”
19 (BBGA 2B7/1, p. 35): “In Ansehung beyder aber werfe ich die Frage auf: ob man ihre Erklärung nicht dadurch bedeutend verkürzen könnte, daß man statt ihren Werth durch zwey Grenzgrößen [...], zu bestimmen vielmehr nur eine einzige gebrauchte”; “beständige Größe, der die veränderliche X, welche durch eine Regel entsteht, die eine beliebig »zu vermehrende Anzahl von Wiederholungen einer und derselben Operation enthält, so nahe kommen »kann, als man nur immer will.” It should be noted that, immediately afterward, Bolzano wrote that “however, sometimes one has to use another quantity Y as a help.”
20 (van Rootselaar 1995, pp. 27–28): “sie ergibt ein Mittel zu entscheiden, ob eine vorgegebene Größe meßbar sei. Es ist kein Mittel, eine meßbare Größe zu konstruieren. Für Bolzano sind die Größen schon da, man muß sie nur durch Zahlen bestimmen.”
E. Fuentes Guillén
clarifications in order to operate with numerically determined irrationals, in particular with regard to the difference between these latter and their limits, that is to say, concerning the decreasing difference between an irrational quantity and its corresponding rational limits. As Bolzano stated in another note included in his mathematical notebooks and written at the turn of 1815, he had, up to this point, still not been able to “get things clear in his mind” either as regards the concept of the infinite or as regards the concepts of 0 and $\sqrt{-1}$: all of them, he wrote, “denote a kind of nothing,” even though “they are essentially different.”21 This last note was to be followed by the discussion of an example showing the nature of the difference between these three concepts; in the end, however, Bolzano did not write it. Of course, this does not mean that Bolzano did not resort to, for example, the irrationals and zero. For example, in his 1816 work on the binomial theorem (henceforth BL),22 he argued that the validity of the binomial series expansion for $(1 + x)^{n}$ could also be proved for an irrational value $n = i$, on the basis that such a value could be approximated to by a fraction $\frac{n}{m}$, with both n and m integers, so that “$(1 + x)^{\frac{n}{m}}$ comes as close as one might desire to the value $(1 + x)^{i}$.”23 Thus, considering “x < 1” (or, as noted by Paul Rusnock, “|x| < 1” (Rusnock 2000, p. 68)), Bolzano obtained the series
\[
(1 + x)^{\frac{n}{m}} = 1 + \frac{n}{m}\,x + \frac{n}{m}\cdot\frac{\frac{n}{m} - 1}{2}\,x^{2} + \dots + \frac{n}{m}\cdot\frac{\frac{n}{m} - 1}{2}\cdots\frac{\frac{n}{m} - r + 1}{r}\,x^{r} + \Omega^{2},
\]
where $\Omega^{2}$ can become smaller than any given quantity (or, from a modern perspective, $\Omega^{2} \to 0$), from which, considering that the difference between $i$ and $\frac{n}{m}$ can therefore be made as small as desired, he substituted the latter by the former and obtained the series24
\[
(1 + x)^{i} = 1 + i\,x + i\cdot\frac{i - 1}{2}\,x^{2} + \dots + i\cdot\frac{i - 1}{2}\cdots\frac{i - r + 1}{r}\,x^{r} + \Omega^{2}.
\]
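The approximation step in this argument can be illustrated numerically. What follows is a minimal modern sketch, not Bolzano’s own procedure: it sums the first terms of the binomial series for rational exponents approaching $\sqrt{2}$ and compares the result with the power taken at the irrational exponent. The function names, the choice x = 0.5, the exponent $\sqrt{2}$, and the cut-off of 60 terms are all illustrative assumptions.

```python
import math


def binomial_partial_sum(exponent, x, terms):
    """Sum the first `terms` terms of 1 + e*x + e*(e-1)/2 * x**2 + ..."""
    total = 0.0
    coefficient = 1.0  # coefficient of x**0
    for r in range(terms):
        total += coefficient * x ** r
        # Next coefficient: multiply by (e - r) / (r + 1).
        coefficient *= (exponent - r) / (r + 1)
    return total


def gap_to_irrational_power(n, m, x=0.5, terms=60):
    """Distance between the truncated series for n/m and (1 + x)**sqrt(2)."""
    return abs(binomial_partial_sum(n / m, x, terms) - (1 + x) ** math.sqrt(2))


if __name__ == "__main__":
    # Decimal fractions n/m approaching sqrt(2) from below (illustrative choice).
    for n, m in [(14, 10), (141, 100), (1414, 1000), (14142, 10000)]:
        print(f"n/m = {n}/{m}: gap = {gap_to_irrational_power(n, m):.2e}")
```

For |x| < 1 the printed gap shrinks as the fraction approaches the irrational exponent, which is the behaviour the substitution of the irrational value for the fraction relies on.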
Leaving aside the fact that, as Rusnock has pointed out, there is a subtle mistake in Bolzano’s proof (since Bolzano “uses the expansion for n = 1 as the basic convergence lemma [,] [b]ut this series diverges for x = 1”; Rusnock 2000, p. 69), there are two things that are particularly interesting about it nonetheless. Firstly, although the meaning of $\Omega^{2}$ might seem clear, it remains to be explained just
21 (BBGA 2B7/1, p. 68): “kann ich noch immer nicht ins Reine kommen”; “bezeichnen eine Art von Nichts”; “sie doch wesentlich verschieden sind.”
22 The title of Bolzano’s work is Der binomische Lehrsatz, und als Folgerung aus ihm der polynomische, und die Reihen, die zur Berechnung der Logarithmen und Exponentialgrößen dienen, genauer als bisher erwiesen.
23 (Bolzano 1816, p. 76): “$(1 + x)^{\frac{n}{m}}$ dem Werthe $(1 + x)^{i}$ so nahe, als man will.”
24 It should be noted, firstly, that in the original the superscript in $\Omega^{2}$ is placed at the top and that, as Steve Russ wrote, “[a]s often the case with Bolzano, these superscripts only distinguish, they are not powers” (Russ 2004, p. 349). Secondly, in the printed version the substitution of $1 + \frac{n}{m}x$ by $1 + ix$ appears as “1” (cf. Bolzano 1816, p. 77).
what this sign signified for Bolzano. Secondly, at the very end of that work, he stated that the concept of “the irrationality of a quantity,” along with that of “imaginary expressions” and some others, had yet to be clearly developed and that, in fact, this lack of clarity about these concepts themselves prevented him from addressing some cases in which they were used, for example, those involving the so-called imaginary quantities or “when a negative quantity is to be raised to a power with an irrational exponent.”25 On the one hand, throughout his works of 1816–17 Bolzano referred to the irrationals as quantities and not as numbers.26 Such an unequivocal use, then, in these works—which include his Purely Analytic Proof (henceforth RaB)27—of the appellation “quantity” to refer to the irrationals would not be merely a conceptual or even a nominal issue but would have practical, operational implications. On the other hand, the notion of quantities ω and Ω, which he introduced in BL, was a key element not only in that work, but also in his works of 1817 and in particular in RaB, a work in which he took very significant and ground-breaking steps toward demonstrating highly relevant theorems, such as the one known as the intermediate value theorem (i.e., if a continuous function of x is positive for a value of x and negative for another value, it must be 0 for an intermediate value). As Bolzano explained in the preface to BL, he was opposed to the notion of infinitesimals or quantities that “are supposed to be smaller than any [...] alleged, i.e. conceivable quantity,” instead of which he proposed the use of the notion of variable quantities “which can become smaller than any given [one].”28 This reluctance with regard to infinitesimals was nothing strange among mathematicians at the time (cf. Lagrange 1797; du Bourguet 1810). 
However, as discussed before in connection with his 1804 written exam for a chair at Prague University, Bolzano’s criticism had to do not only with what he considered the lack of rigor in the procedures associated with infinitesimals and the “attempted calculation of the infinite,” but also with the concept itself, which, he said, was “self-contradictory.”29 Thus, as the first version of his alternative notion to infinitely small quantities explicitly states (a notion that would correspond to the quantities that in §14 of BL he designated by the symbol ω, Ω being the “algebraic sum or difference” of a
25 (Bolzano 1816, pp. 143–144; cf. Russ 2004, p. 247): “wenn eine negative Größe auf eine Potenz mit irrationalem Exponenten erhoben werden sollte”; “imaginäre[n] Ausdrücke”; “Irrationalität einer Größe.”
26 Indeed, in his early mathematical works (1804–17) Bolzano only once used the expression “irrational[e] Zahl,” namely in the formulation of one of the last problems in his 1816 work, in the solution of which, however, he again referred to irrational quantities (Bolzano 1816, p. 137).
27 The full title of Bolzano’s work is Rein analytischer Beweis des Lehrsatzes, daß zwischen je zwey Werthen, die ein entgegengesetztes Resultat gewähren, wenigstens eine reelle Wurzel der Gleichung liege.
28 (Bolzano 1816, p. V): “schon kleiner seyn soll[en], als eine jede [...] angebliche d. h. gedenkbare Größe”; “die kleiner als jede gegebene werden können.”
29 (Bolzano 1816, pp. IV & XI): “versuchte Berechnung des Unendlichen”; “selbst widersprechenden.”
“(finite) multitude” of ω’s),30 the idea underlying it was that “one can assume an even smaller ([or] greater) [quantity] for each already assumed one.”31 In other words, Bolzano stressed the fact that one could, at most, use quantities that “can be” smaller than a “given” one, but not quantities that are de facto smaller than any “conceivable” one, while resisting the idea of identifying such quantities (or any quantity) with 0 (cf. Fuentes Guillén and Martínez Adame 2020). It should be emphasized that at the time, and in spite of the innovative features of the mathematics that he was developing, Bolzano still advocated a traditional notion of number. As he wrote in “On the Concept of Quantity and the Different Kinds Thereof,”32 dated 1816, “only that which mathematicians usually call a whole number deserves to be called a true number,” while the rest were “not actually numbers, but quantities, or not even quantities.”33 Moreover, in that paper Bolzano pointed out that in previous works, BL among them, he had relied on the “common use of language” to refer to numbers and quantities, despite having already “recognized its incorrectness,” since he “really feared” that by deviating from it his works would be “completely incomprehensible.”34 As he went on to explain, for him: Any thing X is to be called a quantity (Quantum) insofar as it possesses a property, in consideration of which another thing M, called measure, behaves with respect to X in the same way as two divisible things of the same kind behave with respect to each other, one of which is the unit [and] the other an arithmetically determinable whole N, i.e. a thing for which a rule can be given, according to which, for any arbitrary multitude of equal and aliquot parts into which the unit is decomposed, one can determine how many of those parts are contained in N.35
From Bolzano’s point of view, such a relation could also be expressed by pointing out that “X is measured [...] by N,” the latter being called “the number that expresses the quantity of X,” albeit “very improperly” since “[o]nly in the rarest cases is N an actual [or real] number, although it is always something determinable by numbers (several,
30 (Bolzano 1816, p. 15): “algebraische Summe oder Differenz”; “(endliche) Menge.”
31 (BBGA 2B7/1, p. 79): “man zu jedem schon angenommenen ein noch kleineres (größeres) annehmen kann.”
32 The title is “Uiber den Begriff der Größe und die verschiedenen Arten derselben.”
33 (BBGA 2A5, p. 191): “nur dasjenige, was die Mathematiker gewöhnlich eine ganze Zahl nennen, den Nahmen einer wahren Zahl verdiene”; “sind eigentlich nicht Zahlen, sondern Größen, oder nicht einmahl diese.”
34 (BBGA 2A5, p. 191): “gewöhnlichen Sprachgebrauch”; “die Unrichtigkeit desselben [...] erkannt”; “auch wirklich befürchten mußte, hiedurch ganz unverständlich zu werden.”
35 (BBGA 2A5, p. 199): “Größe (Quantum) heißt jede Ding X, so fern es irgend eine Eigenschaft besitzt, in Betracht deren ein anderes M, das Maß genannt, sich zu X so verhält, wie sich zwey theilbare Dinge derselben Art verhalten, deren eines die Einheit, das andere ein arithmetisch bestimmbares Ganze N ist, d.h. ein Ding, von dem eine Regel angeblich ist, nach welcher man bey jeder beliebigen Menge gleicher und aliquoter Theile, in die man die Einheit zerlegt, bestimmen kann, wie viele dergleichen in N enthalten sind.” This passage makes it clear that, at least in the case of Bolzano, the appropriate translation for “Größe” is “quantity” and not “magnitude” (from the Latin magnitūdō).
often infinitely many).”36 As will be discussed in the following sections, this is basically the idea of measurement that underlies Bolzano’s later project of a theory of quantities and, within it, a theory of numbers. This means that for him, both then and later, “the concept of quantity contain[ed] that of number as a characteristic.”37 Interestingly, in his 1816 paper Bolzano maintained that, according to such a definition of quantity and as he had already stated in BL, the concept of infinitely small quantities was self-contradictory, while the infinitely great and the imaginaries were “improperly so-called” (“uneigentlich so genannt”) due to certain similarities with quantities (cf. BBGA 2A5, pp. 201–203). However, two years later, in a note included in his mathematical notebooks and dated September 9, 1818, he showed himself inclined to accept the notion of infinitely small quantities (cf. BBGA 2B9/2, p. 126). Many years later, in Paradoxes of the Infinite (henceforth PU),38 the final version of which dates from the late 1840s and was published posthumously, Bolzano went back to the notion of the measurement of a quantity by another one with equal status standing in a certain ratio to it (cf. Bolzano 1851/1920, pp. 22 & 59) and insisted on the fact that, in a sense, “all numbers [are] also quantities [and] there are even more quantities than numbers,” “because also the fractions $\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \dots$, as well as the so-called irrational expressions $\sqrt{2}, \sqrt[3]{2}, \dots, \pi, e, \dots$ designate quantities.”39 Furthermore, in that work he addressed the discussion that he had left hanging in his 1815 note on the similarities and differences between 0, $\sqrt{-1}$ (both “objectless” quantity ideas or expressions)40 and the infinitely small quantities, which, he insisted, should not be regarded as equivalent (Bolzano 1851/1920, p. 56).
In particular, Bolzano devoted several paragraphs in PU to determining, in a precise manner, the concept of zero41 and, in connection with this, to discussing the idea of infinitely small quantities. Thus, on the one hand, he explained that while
36 (BBGA 2A5, p. 199): “X werde (bey dem Maße M ) durch N gemessen”; “die Zahl, welche die Größe von X ausdrückt”; “sehr uneigentlich”; “Nur in den seltensten Fällen ist N eine wirkliche Zahl; wohl aber ist es jederzeit etwas durch Zahlen (mehrere, oft selbst unendlich viele) Bestimmbares.”
37 (BBGA 2A5, p. 208): “der Begriff der Größe jenen der Zahl als ein Merkmahl.”
38 The full title of Bolzano’s work is Paradoxien des Unendlichen herausgegeben aus dem schriftlichen Nachlasse des Verfassers von Dr. Fr. Příhonský.
39 (Bolzano 1851/1920, p. 21): “alle Zahlen zugleich auch Größen, sondern es gibt noch weit mehr Größen als Zahlen, weil auch die Brüche $\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \dots$, ingleichen die sogenannten irrationalen Ausdrücke $\sqrt{2}, \sqrt[3]{2}, \dots, \pi, e, \dots$ Größen bezeichnen.”
40 Concerning the translation of “gegenstandlos” as “objectless” (gegenstand-los), it is worth noting, as Russ does, that Bolzano uses it “in contrast to gegenständlich meaning, of an idea, that it does have objects associated with it” (Russ 2004, p. XXIX).
41 Here Bolzano acknowledges Martin Ohm for “having been the first who drew the attention of the mathematical public to the difficulties in the concept of zero” (Bolzano 1851/1920, p. 55). Interestingly, in the work to which Bolzano refers, Ohm stated that “so würde die 0, als benannte Zahl gedacht, Nichts vorstellen,” having established before that a “benannte Zahl” would be “diejenige Zahl, bey welcher wir uns eine völlig bestimmte und genannte Einheit denken,” so that “[e]ine Größe A wird gemessen, wenn man sie als eine benannte Zahl aE ausdrückt, oder doch die Elemente findet, aus denen diese benannte Zahl aE leicht gebildet werden kann” (Ohm 1828, pp. 48–49).
0 could only be considered a quantity in a translatitious or figurative sense (uneigentlichem Sinne), it could be determined as the “symbol” (Zeichen) for which the equations $A - A = 0$ and $A \pm 0 = A$ hold in general (i.e., regardless of whether the quantity expression A corresponds to an actual quantity or is “objectless”), so that $A \cdot 0 = 0$ (the concept of multiplication) and $B \cdot \frac{A}{B} = A$ (the concept of division) could be formed, as long as $B \neq 0$ (Bolzano 1851/1920, pp. 50–51 & 56–57).42 In other words, Bolzano drew attention to the fact that, although strictly speaking 0 only accounts for the absence of a quantity, one might be entitled to consider it as a quantity since, leaving aside division by 0, the four basic operations can be carried out using it. On the other hand, Bolzano insisted that the notion of infinitely small quantities led to contradictions (Bolzano 1851/1920, pp. 59–60). However, he pointed out that problems could be overcome if, in the expression $\frac{\Delta y}{\Delta x}$, both $\Delta y$ and $\Delta x$ were regarded as $= 0$, while the corresponding notation for the derivative $\frac{dy}{dx}$ was not considered “as a quotient of dy by dx,” that is, not as a ratio between zeros, “but only as a symbol of the derivative of y by x.”43 From Bolzano’s point of view, this justified the practice of calculating using such signs inasmuch as these latter were considered “not as signs of actual quantities, but rather as equal to zero,” and he added that, “on entirely similar principles,” the same could be said about “calculation[s] using so-called imaginary quantities,”44 the values of which he had already identified (along with the value of the “infinitely great”) as non-measurable (cf. Bolzano 1930, p. 16) in a work composed prior to PU, namely his Theory of Functions (Functionenlehre; henceforth F).
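A standard textbook illustration, not one taken from PU, may help to show what is at stake in this reading of the notation. For the hypothetical example $y = x^{2}$, the difference quotient is first formed and simplified for $\Delta x \neq 0$, and only afterwards is $\Delta x$ regarded as 0 in the resulting expression:

```latex
\[
\frac{\Delta y}{\Delta x}
  = \frac{(x + \Delta x)^{2} - x^{2}}{\Delta x}
  = 2x + \Delta x ,
\qquad\text{so that}\qquad
\frac{dy}{dx} = 2x .
\]
```

On this reading, which is offered here only as a rough gloss, $\frac{dy}{dx} = 2x$ names the value that the simplified expression $2x + \Delta x$ takes when $\Delta x$ is regarded as 0; the symbol is read as a whole and is not obtained by dividing one zero by another.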
The following sections will discuss Bolzano’s project of a Pure Theory of Numbers (RZ) in which he presented the notion of “measurable numbers” and which, coupled with F, is to be understood within the framework of his more ambitious project of a general theory of quantities. The development of such a project was carried out precisely between the publication of his early mathematical works (i.e., the period 1804–17) and the publication of PU (i.e., 1851), in which latter work he noted, as regards his considerations on the legitimacy of making calculations with certain symbols, that, ultimately, such considerations had led to “important truths of the general theory of quantities.”45
42 Bolzano noted that 0 should only be used as divisor in identity equations, such as $\frac{A}{0} = \frac{A}{0}$ (Bolzano 1851/1920, pp. 57–58).
43 (Bolzano 1851/1920, p. 67; Russ 2004, p. 639): “als einen Quotienten von dx in dy, sondern nur eben für ein Symbol der abgeleiteten von y nach x.”
44 (Bolzano 1851/1920, pp. 69 & 71): “nicht als Zeichen wirklicher Größen, sondern sie vielmehr als der Null gleichgeltend betrachten”; “ganz ähnlichen Grundsätzen”; “sogenannten imaginären Größen.”
45 (Bolzano 1851/1920, pp. 64 & 69): “meistens ganz richtige Ergebnisse”; “wichtige Wahrheiten der allgemeinen Größenlehre.”
3 The Conceptual Framework of Bolzano’s Measurable Numbers
According to Jan Berg, who is the editor of both the Introduction to the Theory of Quantities and First Concepts of the General Theory of Quantities (henceforth EG)46 and RZ in BBGA, it is known from Bolzano’s correspondence that he worked on his project of a theory of quantities from around 1830 onward and that by 1835 he had almost finished writing EG and RZ (Berg 1975, pp. 9–10; cf. Winter 1933, pp. 213–216; Russ 2004, p. 347). Thus, in order to better understand Bolzano’s proposal of “measurable numbers,” which he introduced in the seventh and last section of RZ, it seems necessary first to shed light on the conceptual groundwork of such a proposal—which Bolzano developed in EG and the previous sections of RZ—as well as to address some of the reflections on quantities and numbers included in Bolzano’s mathematical notebooks from the early 1820s through to the early 1830s. In an extant note from his mathematical notebooks entitled “Concept and Division of Mathematics,” dated May 1830, Bolzano explicitly refers to the fact that, whereas in BD he had favored an alternative definition of mathematics,47 by the time of this note, 20 years later, mathematics had become for him “once again a theory of quantities alone.”48 As noted by Russ in his English translation of Bolzano’s mathematical works, the very title “for his mature comprehensive work on mathematics” (i.e., Größenlehre) makes it clear that Bolzano had embraced, by this time, once again the traditional definition of mathematics as the “science of quantities” (Russ 2004, p. 347). 
Accordingly, Bolzano began EG by pointing out that “the word quantity” could be understood in both a narrow sense (enger Sinn), to refer only to so-called continuous quantities, and a broad sense (weiter Sinn), which would include also so-called discrete quantities, and that it was in this latter, broader sense that mathematics could be defined as “a science of quantities,” a science which would therefore comprise the theory of numbers, “one of the most important mathematical disciplines.”49 For Bolzano, however, this did not mean that all mathematical disciplines dealt only with quantities, as would be shown by the case of the theory of combinations, which dealt with the combinations of which any given plurality of
46 The title is Einleitung in die Größenlehre und erste Begriffe der allgemeinen Größenlehre.
47 In 1810 Bolzano defined mathematics as “eine Wissenschaft [...], die von den allgemeinen Gesetzen (Formen) handelt, nach welchen sich die Dinge in ihrem Daseyn richten müssen” (Bolzano 1810, p. 11). For Paola Cantù, referring to Bolzano’s Größenlehre, “[i]t is exactly for the sake of generality that Bolzano abandons the previous definition of mathematics and modifies the meaning of ‘Größe’ from quantity to quantity in general, as it was common in the tradition of the mathesis universalis” (Cantù 2014, p. 313).
48 (BBGA 2B12/1, p. 132): “Mathematik ist mir jetzt wieder nur Größenlehre, und zerfällt zunächst in die reine und angewandte.” The title of Bolzano’s note is “Begriff und Eintheilung der Mathematik.” For an earlier note of Bolzano on this subject, cf. (BBGA 2B2/1, p. 23).
49 (BBGA 2A7, p. 25): “eine Wissenschaft der Größen”; “eine der wichtigsten mathematischen Disciplinen.”
objects were susceptible (BBGA 2A7, pp. 27–28); this was a consideration, it must be said, that he had already introduced in BD (cf. Bolzano 1810, p. 4). In addition to this, and with respect to the particular disciplines of the general theory of quantities, Bolzano noted that the theory of the infinitely small or great quantities, which for him represented a “special type” (besondere Gattung) of quantities, should be regarded as a peculiar but useful part of pure mathematics, of which the theory of numbers (another special type of quantities) would also be a part (cf. BBGA 2A8, pp. 33–34). Moreover, he explained that the latter could be used to accurately determine “all other quantities,” provided only that—and this was certainly a proviso he intended to abide by in his own work—numbers were not considered just in the strict sense of the term, that is, provided that “special forms” (Arten) such as rationals and irrationals were also taken into account and thus “the properties of numbers [were] mixed with the properties common to all quantities, as well as with those that [were] only common to certain other forms.”50 But, had Bolzano by this point already sorted out his uncertainties regarding the notions of number and quantity? 
In a broad sense, he wrote in EG, quantity would be every object “belonging to a kind of things such that, for each two, only one of the following two relations to each other can ever hold: they must either be equal to one another or one of them must contain a part equal to the other.”51 Thus, on the one hand, an irrational i could be regarded either as a quantity equal to a part of a rational quantity from which a certain quantity would have to be subtracted in order to obtain i, or as a quantity to one of whose parts a rational quantity could be equated and to which a certain quantity would have to be added in order to obtain i; whereas, on the other hand, the corresponding irrational number would be the numerical determination of i, which should comply with certain properties common to all quantities and to numbers. The mathematical notebooks of Bolzano written throughout the previous decade seem very precisely to account for the process which eventually led him to such a stance in his Größenlehre. For example, in a note from 1821 he wrote that, while infinite quantities could be determined by saying that “a quantity is $\infty$ great if every number is a part of it and $\infty$ small if it is [...] < than every fraction,” imaginary quantities (Größen) such as “$\sqrt{-1}$, $\sqrt{-2}$, etc.” should be regarded as “attributes that are determined by the numbers 1, 2. . ., but [...] cannot be assigned to actual objects” (in other words, while such expressions somehow involve numbers, there is no real object to which they correspond and so they are “objectless”), both the latter and the
50 (BBGA 2A7, p. 34): “aller übrigen Größen”; “wo (wie wir in diesem Buche selbst zu thun gesonnen sind) die Eigenschaften der Zahlen untermischt mit den Eigenschaften, die allen Größen gemeinschaftlich, ingleichen auch mit solchen, die nur gewissen anderen Arten (z.B. den irrationalen) Größen zukommen, abgehandelt werden.”
51 (BBGA 2A7, pp. 25–26): “als gehörig zu einer solchen Art von Dingen betrachten, deren je zwei immer nur eines von folgenden zwey Verhältnissen gegen einander behaupten können: sie müssen entweder einander gleich seyn, oder das Eine derselben muß einen dem andern gleichen Theil enthalten.” Bolzano was to go back to the principle of trichotomy in his theory of measurable numbers (BBGA 2A8, pp. 136–137 & 139; cf. Rychlík 1962, p. 5 & 56–59; Berg 1990, p. 151).
former being “unmeasurable quantities.”52 Then, in a note written in 1823, he defined quantity as “a collection of parts, among which two parts that are not equal to each other have instead the relation in which one of them contains a part that is equal to the other,”53 but he noted that such a definition would rule out certain ideas (Vorstellungen), so that, in order to include “0, $\infty$, $\sqrt{-1}$, . . .,” a more general “concept of quantity or rather of a quantity expression (a quantity idea)” would be required.54 Because of that, in a later note written between 1823 and 1824, Bolzano asked himself if it would not “be easier to say: A quantity (in abstracto) is such an attribute of a thing which is determinable by the thought of a sum, i.e. by the thought of a collection of parts in which no order among the same is perceived,”55 from which he concluded that the so-called imaginary quantities, zero and the infinite (both great and small) would not be quantities but “mere quantity ideas” (“bloße Größenvorstellungen”) (BBGA 2B11/2, p. 91). Finally, in a note on “[t]he theorem that factors with modified order give one and the same product” (written between 1828 and 1829), Bolzano argued that such a theorem was valid not only for whole numbers, but also for fractions and “irrational quantities” such as “$i = \frac{n}{m} + \omega$, where m, n are whole numbers and $\omega$ can become as small as one desires,” while adding that imaginary quantities “are not actual quantities at all, but only signs that have been obtained according to the rules that apply to all real quantities.”56 And then, in a note on “imaginary quantities” (written between 1829 and 1830), he insisted that these latter were not quantities but “quantity concepts,” inasmuch as they represented a quantity “of which there is none.”57
52 (BBGA 2B11/2, pp. 22–23): “eine Größe ist $\infty$groß, wenn jede Zahl ein Theil von ihr; und $\infty$klein, wenn sie ein Theil von jedem Bruch ist oder wie man sagt < als jeder Bruch”; “$\sqrt{-1}$, $\sqrt{-2}$, etc sind Beschaffenheiten, die durch Zahlen 1, 2. . . bestimmt werden; obgleich Beschaffenheiten, die keinen wirklichen Gegenständen zukommen können”; “unermeßlichen Größen.” This note is entitled “Begriff der Größe,” dated September 1.
53 (BBGA 2B11/2, p. 86): “Die Größe ist ein Inbegriff von Theilen, zwischen deren je zweyen, die einander nicht gleich sind, das Verhältniß Statt hat, daß der Eine derselben einen dem anderen gleichen Theil enthält.” This note is entitled “Begriff einer Größe” and is dated March 10, 1823.
54 (BBGA 2B11/2, p. 86): “einer Größe oder vielmehr eines Größenausdrucks (einer Größenvorstellung).” I opted to translate “Vorstellung” by the term “idea,” and not “representation,” in order to emphasize its mental character and, at the same time, to distinguish it from “Darstellung,” which can be associated with a “representation” by signs (cf. BBGA 2A8, p. 22). For a detailed account of “Vorstellung” in the work of both Kant and Bolzano, cf. (Rusnock 2000, p. 102ff.).
55 (BBGA 2B11/2, p. 91): “Sollte man denn nicht einfacher sagen können: Eine Größe (in abstracto) sey eine solche Beschaffenheit einer Sache, welche bestimmbar ist durch den Gedanken einer Summe, d.h. durch den Gedanken eines Inbegriffs von Theilen, bei welchen auf keine Ordnung unter denselben gesehen wird?” This note is entitled “Begriff der Größe” and it is not dated.
56 (BBGA 2B12/1, pp. 84–85): “Der Lehrsatz, daß Faktoren mit veränderter Ordnung einerlei Produkt geben”; “Irrationalgrößen”; “$i = \frac{n}{m} + \omega$ wo m, n ganze Zahlen, und $\omega$ so klein werden kann, als man nur immer will”; “sind nähmlich gar keine wirklichen Größen, sondern nur Zeichen, die nach denjenigen Regeln, welche für alle reellen Größen gelten erhalten worden sind.”
57 (BBGA 2B12/1, p. 103): “Imaginäre Größen”; “Größenbegriffe”; “deren es keine gibt.”
E. Fuentes Guillén
All this helps to understand the position taken by Bolzano on the notion of number in RZ and, in particular, in the first section of this latter work, entitled “On the Concept of Numbers, Their Most General Properties and the Manner of Their Designation.”58 There, he began by stating that numbers are sums formed from a unit (i.e., 2, 3, 4, 5, 6, . . .), which meant that, strictly speaking, the “number sequence” (Zahlenreihe) was the “natural sequence of numbers” or the “sequence of natural numbers,”59 which would be infinite, although in a broad sense one could also consider “objectless ideas of numbers” and, ultimately, numbers could be identified with “whole or actual” (wirkliche) ones (BBGA 2A8, p. 17). On the one hand, this explains Bolzano’s subsequent discussion of several definitions of number, ranging from the definition common at the time (i.e., that of Kästner, although he only mentions Stahl) and the traditional definition in which “things of the same kind” was replaced by “units” (Euclid, Kraushaar), through to those definitions in which the meaning of “number” was extended also to fractions (Schröder, Stein) and even to the irrationals, the infinitely great and the infinitely small, but excluded zero and imaginaries (Stevin, Newton) (BBGA 2A8, pp. 18–20). On the other hand, this led Bolzano to declare that by the “appellation” (Benennung) of a “number idea” he wished his readers to understand “any idea of the form ‘[a] number, which has the attributes $b'$, $b''$, $b'''$’, regardless of whether these attributes can in fact be found combined [vereinigt] in a number or not; i.e. regardless of whether this idea has a real [or actual] object or not.”60 Moreover, he made it clear that he would also call these ideas “number concepts for a change” (zur Abwechslung), the representation (Darstellung) of which was carried out by means of “signs” which he called “number expressions” (BBGA 2A8, p. 22).
He then went on to write:
Finally, to abbreviate, in cases where there is no misunderstanding to be worried about, we will also allow ourselves to simply call the number ideas and number expressions numbers themselves, and to do so indeed even at times when there are no numbers corresponding to these ideas or signs in reality.61
58 The title is “Von dem Begriffe, den allgemeinsten Beschaffenheiten und der Bezeichnungsart der Zahlen.”
59 (BBGA 2A8, p. 15): “die natürliche Reihe der Zahlen, oder mit Einigen auch die Reihe der natürlichen Zahlen.”
60 (BBGA 2A8, pp. 18–21): “eine jede Vorstellung von der Form: »Eine Zahl, welche die Beschaffenheiten $b'$, $b''$, $b'''$. . .hat,« verstehen wollen, gleichviel ob diese Beschaffenheiten bei einer Zahl in der That vereinigt angetroffen werden können oder nicht; d.h. gleichviel ob diese Vorstellung einen wirklichen Gegenstand hat oder nicht.”
61 (BBGA 2A8, p. 22): “Zur Abkürzung endlich werden wir uns in Fällen, wo eben kein Mißverstand zu besorgen stehet, erlauben, die Zahlenvorstellungen und Zahlenausdrücke auch schlechtweg Zahlen selbst zu nennen, u. dieß zwar zuweilen selbst dann, wenn es dergleichen dieser Vorstellung oder Zeichen entsprechende Zahlen in Wahrheit gar nicht gibt.”
Bolzano’s Theory of meßbare Zahlen: Insights and Uncertainties. . .
After such explanations as these, Bolzano goes on in the second section of RZ to address the addition and subtraction of (natural) numbers (which, for him, did not comprise 0 or the unit), and shows that, given two numbers S and M: (a) with $S > M$, the difference $S - M$ will be another number; (b) with $S = M$, their difference (i.e., 0) will be an “objectless idea,” as he had already pointed out in EG (cf. BBGA 2A7, p. 164); (c) with $S < M$, and given the “arbitrary proposition” (willkürliche Satz) or “convention” (cf. Russ 2004, p. 106) according to which “in all numerical expressions we want to regard the unit to which they refer as a unit capable of opposition,” the difference $S - M$ will nevertheless represent an “actual number” (wirkliche Zahl), although from an “opposite i.e. negative” (entgegengesetzt d.h. negativ) unit (BBGA 2A8, pp. 52–53). In the third section of RZ he explains the product (and proves some properties, including distributivity and commutativity) and “quotients” of numbers, introducing the “designation” or “symbol” (Zeichnung) $\frac{m}{p}$ (BBGA 2A8, p. 67). Moreover, in the fourth section (on Rationale Zahlen), at the beginning of which Berg (editor of RZ in BBGA) notes that there is a marginal note in the manuscript in which Bolzano states that in these sections things could be “shortened by considering quantities instead of just numbers,”62 Bolzano turns to look at number ideas (Zahlenvorstellungen) in which only a finite “multitude” (Menge)63 of the four basic operations are to be performed and explains “simple fractions” (einfache Brüche), addressing their calculation rules (avoiding division by 0) and stating that an “elementary expression” will be any “number expression” (Zahlenausdruck) in which only a finite or infinite multitude of these operations occur (cf. BBGA 2A8, pp. 73–84).
He does not discuss the latter case in this section (namely, expressions which involve an infinite multitude of the four basic operations), although he gives the following examples: “$1 - 1 + 1 - 1 + 1 \ldots$ in inf. or $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \ldots$ in inf.” (BBGA 2A8, p. 73).
62 (BBGA 2A8, p. 73 fn. u): “In allen diesen Abschnitten läßt sich Manches verkürzen, indem statt bloßer Zahlen, Größen überhaupt betrachtet werden.”
63 Here I agree with both Peter Simons and Russ about the translation of “Menge” as “multitude,” and not as “set,” which emphasizes its primitive meaning as a “bunch” of something and avoids its association with the set theory developed decades later (cf. Simons 1997, pp. 95–96; Russ 2004, pp. XXVIII–XXIX; see also Ferreirós 2007, pp. XX–XXI & 72–73).
The fifth section of RZ focuses on the “Relations of Height Between Rational Numbers,”64 meaning that form of comparison between different rational numbers whereby one will be “greater or higher” and the other “smaller or lower,” so that for two unequal rational numbers one is always higher than the other and “there always lies [another] one between them,” indeed there always lies an infinite “multitude” of such numbers.65 For Bolzano this section was necessary in order to “justify” the extension of the meaning of that “relation of being greater” (“Verhältniß des Größerseyns”), constitutive of the “concept of number” “taken as a basis in this theory of numbers,” so that it could be applied to rationals, which he regarded as “mere number ideas” with “no [associated] object, because in fact there is no number that is $= \frac{3}{5}$ or $= \frac{2}{5}$ or $= \frac{1}{5}$.”66 Otherwise, he points out, one would be led to consider relations of height involving “objectless” (gegenstandlos) number ideas, as in the case of “$1 + \sqrt{-1} > 1$, which no mathematician will admit” insofar as the combination of numbers and imaginaries would not admit such a relation, or expressions such as “$1 + \frac{1}{1 + 1 + 1 + \ldots\ \text{in inf.}}$,” which, not being a “finite elementary number expression,” would entail an extension of the concept of number.67 Such an extension is performed in the sixth section of RZ, entitled “Rational Numbers Which Can Increase or Decrease to Infinity,”68 where Bolzano explains the “concepts” of these numbers which, he writes, should not “[be confused] with the concepts of an infinitely great and an infinitely small number”: whereas the latter
64 The title is “Verhältnisse der Höhe zwischen den rationalen Zahlen.” In his translation of the seventh section of RZ, Russ translates “Höhe” as “order” (Russ 2004, p. 395). Undoubtedly, the notion of “height” involves a certain relation of order among the multitude of rational numbers. However, in order to provide a faithful account of Bolzano’s proposal and to avoid any possible (and misleading) set-theoretical interpretation, I opted not to translate “Höhe” as “order.” Similarly, and with regard to a paragraph of Bolzano’s Theory of Science (Wissenschaftslehre; henceforth WL) on the relation of subordination (Unterordnung), Jan Šebestík has translated “Höhe” as “hauteur” (Šebestík 1992, p. 177; cf. Bolzano 1837, p. 451). Later, Dedekind and Cantor were to use what they called the “height” (Höhe) of algebraic numbers in order to enumerate these latter (cf. Cantor 1874, pp. 259–260; Ferreirós 2007, pp. 178–180).
65 (BBGA 2A8, pp. 86, 90 & 92): “größer oder höher”; “kleiner oder niedriger”; “so gibt es jederzeit eine, die zwischen ihnen liegt”; “Die Menge der Rationalzahlen, die zwischen je zwey von einander verschiedenen Rationalzahlen A und Z eingeschaltet werden können, geht ins Unendliche.” Nowadays this can be expressed by saying that the set of rational numbers is densely ordered in the reals. Such an approach, however, is foreign to Bolzano’s proposal.
66 (BBGA 2A8, p. 87): “ich in dieser Zahlenlehre einmahl zu Grunde gelegt habe”; “bloße Zahlenvorstellungen”; “die keinen Gegenstand haben, weil es in der That keine Zahl gibt, die $= \frac{3}{5}$ oder $= \frac{2}{5}$ oder $= \frac{1}{5}$ wäre.” Indeed, Bolzano once again acknowledges Ohm “for having first recognized this need for a particular conceptual determination [Begriffsbestimmung] of the signs > and < when applied to general numbers [allgemeine Zahlen]” (BBGA 2A8, p. 87). It should be noted that, in the paragraph that Bolzano refers to, Ohm discusses the relations of “being greater than” and “being smaller than” between “real numbers” (reele Zahlen), which for him, as he explained before, consisted of the integers, fractions and zero (Ohm 1828, pp. 123–124).
67 (BBGA 2A8, pp. 87 & 88): “was doch kein Mathematiker zugeben wird”; “endliche[r] elementare[r] Zahlenausdru[ck].”
68 The title is “Rationale Zahlen, die ins Unendliche zu- oder ab-nehmen können.”
involve “thinking” of a “multitude that consists of infinitely many units,” the former involve thinking of variable numbers that “can take on infinitely many values, including those that are greater [or smaller] than any other given number.”69 Consequently, from Bolzano’s point of view, whereas the “multitude of clear ideas that can be cultivated over time in the mind of a thinking being” would be a number idea that can increase to infinity while “being and always remaining finite,”70 “the multitude of points lying on a given line,” thought “as a number,” would “form the (admittedly self-contradictory) concept of an infinitely great number.”71 Or, as he later explained in WL, no “attempted enumeration” (versuchten Abzählung) of the members of an infinite multitude of points on a bounded line could ever be achieved (Bolzano 1837, p. 413). The idea of rational numbers that can increase or decrease to infinity is a very interesting one not only because the concept of such numbers, along with Bolzano’s criticism of self-contradictory concepts of number, reveals a link with Bolzano’s earlier works (cf. Sect. 2), but also because of what he goes on to explain about the former in this section of RZ. For him, the idea associated with a “sign” (Zeichen) designating a number that increases or decreases to infinity was a “multi-form number idea” (mehrförmige Zahlenvorstellung), given that it involves “infinitely many values,” none of which would be the greatest or smallest (BBGA 2A8, pp. 95–96). This meant that, as he had already explained in the first section of that work, such a number idea and its concept would comprise infinitely many numbers and thus a particular sign would be required to denote said number idea, while, for example, a seven-form number concept would be that of the number between 3 and 11, or 21 and 29, comprising in each case seven different numbers (4, 5, 6, 7, 8, 9, 10 and 22, 23, 24, 25, 26, 27, 28, respectively; cf. BBGA 2A8, p. 24).
As a consequence, in the case of the aforementioned number idea involving infinitely many small values, Bolzano introduced an arbitrary proposition or convention (willkürliche Satz) according to which the signs “ω, Ω, ω1, ω2, etc. [denote]
69 (BBGA 2A8, p. 95): “verwechsel[t] [werden] mit den Begriffen einer unendlich großen und unendlich kleinen Zahl”; “eine Menge denke, die aus unendlich vielen Einheiten bestehet”; “die der Werthe unendlich viele und darunter auch solche annehmen kann, die größer als eine jede andere gegebene Zahl sind” (emphasis added in “bestehet” and “kann”). 70 It is interesting to note the subtle contrast between Bolzano’s stance on this issue and the proof of Dedekind’s theorem 66 in his Was sind und was sollen die Zahlen?, where the latter stated that “the totality S of all things that can be the object of my thought is infinite” (Dedekind 1888, p. 17): whereas for Dedekind this proves that there exist “infinite systems” (unendliche Systeme), as in the case of “my world of thoughts,” for Bolzano the number idea of one’s own clear ideas can increase to infinity but remains finite. Nevertheless, as Dedekind himself pointed out in a footnote with regard to (Bolzano 1851/1920, pp. 13–14), Bolzano does consider “infinite multitudes” (cf. Bolzano 1817a, pp. 4–6). 71 (BBGA 2A8, p. 95): “[d]ie Menge der klaren Vorstellungen, welche sich in dem Gemüthe eines Denkenden Wesens mit der Zeit ausbilden können, ist und bleibt immer endlich”; “die Menge der Punkte, die in einer gegebenen Linie liegen; und denken wir uns sonach diese Menge als eine Zahl; so bilden wir uns den (freylich sich selbst widersprechenden) Begriff einer unendlich großen Zahl.”
numbers that can decrease indefinitely.”72 The theorem that follows this proposition, which deals with “the algebraic sum of a finite and non-variable multitude of rational numbers ω1, ω2, . . .ωn,” makes it clear that, at least at that point, those signs stand for rational numbers that decrease and thus vary to infinity, which in turn allows Bolzano to hold that each non-variable rational number is to be enclosed by a pair of variable numbers that “can come as close to each other as one desires.”73 In other words, as the proof (Beweis) of the latter theorem shows, the non-variable rational number A would be the limit to which the pair of rational numbers
$$X = A - \frac{1}{2N} \quad\text{and}\quad Y = A + \frac{1}{2N}$$
would approximate, with N denoting a number that can become as great as one desires (BBGA 2A8, p. 99), a procedure which resembles that which he had already proposed in his mathematical notebooks as a procedure for determining an irrational quantity. As will be discussed in the following section, it is with reference to the proposition about enclosed non-variable numbers that, in the last section of RZ, Bolzano goes back to the possibility pointed out at the beginning of this book’s fourth section regarding numerical expressions in which an infinite multitude of the four basic operations occur, which he calls “infinite quantity concepts.”
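The enclosure of a non-variable rational number A by the pair X and Y can be illustrated numerically. The following is only a minimal sketch in modern terms, not Bolzano's own procedure: the function name `enclosing_pair` and the sample value A = 3/5 are assumptions introduced for illustration, and exact rational arithmetic is used so that no rounding intervenes.

```python
from fractions import Fraction

def enclosing_pair(a, n):
    """Hypothetical helper: return (X, Y) = (A - 1/(2N), A + 1/(2N)) for a fixed rational A."""
    half_width = Fraction(1, 2 * n)
    return a - half_width, a + half_width

a = Fraction(3, 5)  # sample non-variable rational number (an assumption)
for n in (1, 10, 1000):
    x, y = enclosing_pair(a, n)
    # X < A < Y holds for every N, and the gap Y - X equals 1/N,
    # so the two variable numbers come as close to each other as desired.
    print(n, x, y, y - x)
```

As N grows, the gap between X and Y shrinks while A stays enclosed, which is the sense in which A is the limit of the pair.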
4 Bolzano’s meßbare Zahlen
As pointed out by Berg, whereas the final title that Bolzano assigned to the seventh section of RZ was “Infinite Quantity Concepts (Quantity Expressions),”74 he had initially used the expression “Infinite Number Concepts” (Unendliche Zahlenbegriffe; BBGA 2A8, p. 100 fn. m). In this section, Bolzano sets out to consider expressions that represent numerical concepts involving an infinite multitude of basic operations, that is, infinite quantity expressions, such as the sum of the infinite “multitude of all actual [or real] numbers,” the expression of which would be “1 + 2 + 3. . .in inf.,” or even such as the sum of the fractions (Brüche) “$\frac{1}{2} - \frac{1}{4} + \frac{1}{8} - \frac{1}{16} + \ldots$ in inf.,” which arises by the division of the unit by 2 and the multiplication of the denominators of the subsequent fractions by 2.75
72 (BBGA 2A8, p. 97): “Willk. S. Zahlen, die ins Unendliche abnehmen können, wollen wir zur Abkürzung zuweilen durch Zeichen wie ω, Ω, ω1, ω2, u.s.w. andeuten.”
73 (BBGA 2A8, pp. 97 & 98): “Die algebraische Summe aus einer endlichen und unveränderlichen Menge von Rationalzahlen ω1, ω2, . . .ωn”; “Zu jeder unveränderlichen Rationalzahl A gibt es ein Paar veränderliche Zahlen, die sie als ihre Grenzen zwischen sich einschließen (d.h. deren die Eine höher, die andern niedriger ist) während sie selbst einander so nahe rücken können, als man nur immer will.”
74 The title is “Unendliche Größenbegriffe. (Größenausdrücke).”
75 (BBGA 2A8, p. 100): “die Menge aller wirklichen Zahlen.” A similar statement can be found in (Bolzano 1851/1920, p. 20).
However, Bolzano notes in §3 that an infinite number concept does not require that the idea of every one of the operations that it involves be comprised in it as “singular constituent parts,” in which case it would be “unthinkable for a finite mind such as ours.”76 This remark is connected with his previous and subsequent discussion of concepts and determinable objects. For example, in BD (1810) he established that composite concepts were decomposable (zerlegbaren) and definable, unlike simple concepts (e.g., a point) or “indeterminate or infinite” concepts (unbestimmte oder unendliche; concepts of the form “[e]verything which is not A [...] is also not M”; Bolzano 1810, pp. 43, 56–57 & 84–85), so that, in the case of a concept composed of an “infinite number of parts” (unendlich vielen Theilen), this would not be definable (BBGA 2A8, p. 101). Later, in The Three Problems (henceforth DP; 1817),77 he introduced the notion of an infinite multitude of quantities and explained that such a concept was not contradictory and indeed that such a multitude could be “determinable” (bestimmbar) given a rule (or law) or a finite multitude of rules that determines all its parts, as in the case of the functions “ fx, f x, f x, . . . 
, ” which “vary according to the law of continuity,” or a straight line, “determined by the mere specification of its two endpoints and by a single law.”78 As he explained further, in such cases “only the multitude of its parts is indeterminable, but not the multitude itself, if there is a rule that determines every single one of those parts.”79 And, finally, when addressing, in PU, the “paradox” whereby, according to the concept established for them, the—natural—numbers would constitute an infinite multitude of all the finite multitudes that said numbers are, he pointed out that this multitude was determined by “the rule of formation” according to which “each of its terms has a succeeding one.”80 More importantly, it is in §5 that Bolzano posits for the first time the property that, as he explains in §6, will allow him to consider what he calls “measurable expressions,” namely: Theo[rem]. Among the infinite number concepts there are also some that are of such a nature that, for any arbitrary actual number q that we want to regard as the denominator of a
76 (BBGA 2A8, p. 101): “eigenthümlichen Bestandtheile”; “und eben deßhalb für einen endlichen Verstand, wie es der unsrige ist, undenkbar.”
77 The full title of Bolzano’s work is Die drey Probleme der Rectification, der Complanation und der Cubirung, ohne Betrachtung des unendlich Kleinen, ohne die Annahmen des Archimedes, und ohne irgend eine nicht streng erweisliche Voraussetzung gelöst; zugleich als Probe einer gänzlichen Umstaltung der Raumwissenschaft, allen Mathematikern zur Prüfung vorgelegt.
78 (Bolzano 1817a, p. 4): “nach dem Gesetze der Stetigkeit ändern.” For Bolzano’s use of superscripts, cf. footnote 24.
79 (Bolzano 1817a, p. 6): “nur die Menge seine Theile unbestimmbar, nicht aber er selbst, wenn anders irgend ein Gesetz vorhanden ist, das jeden einzelnen aus diesen Theilen bestimmt”; “durch die bloße Angabe ihrer zwey Endpunkte und durch ein einziges Gesetz [...] bestimmt.”
80 (Bolzano 1851/1920, p. 21; cf. Russ 2004, p. 611): “Denn nach dem, in der Erklärung jener Reihe (§8) angegebenen Bildungsgesetze derselben hat jedes ihrer Glieder wieder ein folgendes.”
fraction, a numerator p can be found which is again a positive or negative actual number, or sometimes even a zero, [and] with the outcome that we obtain the two equations $S = \frac{p}{q} + P$ and $S = \frac{p+1}{q} - P^1$, in which the sign S denotes the infinite number expression, but the $P$ and $P^1$ denote a pair of strictly positive number expressions or, [in the case of] the former, sometimes even a mere zero.81
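The condition stated in this theorem can be made concrete with a small computation. What follows is a hedged, modern sketch rather than Bolzano's own method: the names `measure_rational` and `measure_sqrt2` are hypothetical, and the code simply finds, for a given denominator q, the numerator p together with the complements P and P¹ such that S = p/q + P = (p+1)/q − P¹, with P ≥ 0 and P¹ > 0. The second helper handles √2, the non-rational case taken up in this section, using exact integer arithmetic.

```python
from fractions import Fraction
from math import floor, isqrt

def measure_rational(s, q):
    """Hypothetical helper: return (p, P, P1) with s = p/q + P = (p+1)/q - P1."""
    p = floor(s * q)                 # largest integer p with p/q <= s
    P = s - Fraction(p, q)           # complement; zero when the measure is complete
    P1 = Fraction(p + 1, q) - s      # strictly positive by the choice of p
    return p, P, P1

def measure_sqrt2(q):
    """Hypothetical helper: numerator p with p/q <= sqrt(2) < (p+1)/q."""
    # p/q <= sqrt(2) is equivalent to p*p <= 2*q*q, so an integer square root suffices.
    return isqrt(2 * q * q)

print(measure_rational(Fraction(7, 5), 3))
print(measure_rational(Fraction(1, 2), 4))
print(measure_sqrt2(100))
```

In the rational case the complement P is zero exactly when q is a multiple of the denominator of S; for √2 it never vanishes, so only the approximate measure is available.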
As Bolzano explains further in the following paragraph, this “means of determining infinite number concepts” was in fact very useful, since it implied “infinitely many operations” that would otherwise be impossible for us.82 Moreover, he says, such a means could be used to “determine or measure by approximation,” that is, “up to a $\frac{1}{q}$,” “every not only infinite but also finite number expression S,” which he calls “measurable or estimable” (meßbar oder ermeßlich).83 Therefore, whereas in the case of rational numbers the “measuring” (messenden) fraction $\frac{p}{q}$ would be “complete or perfect” (voll oder vollkommen), so that the “complement” (Ergänzung) P would consequently be $= 0$, in other cases (leaving aside the “unmeasurable or inestimable” expressions, that is, those for which the aforementioned equations do not hold) one would require not only the measuring fraction plus a complement to measure S but also the “next greater fraction” (nächst[e] größer[e] Bruch) $\frac{p+1}{q}$ (BBGA 2A8, p. 105). In other words, those equations would constitute the rules that would allow one to determine numerically a certain quantity, either completely or approximately.
In order to better understand Bolzano’s proposal, let us consider the concrete case of $\sqrt{2}$, an example of a non-rational measurable expression (or “number,” as he states in §7) and an example, moreover, which, as discussed in Sect. 2, he had addressed years earlier in his mathematical notebooks. He is no longer here, as he was back in 1814, providing some rules for finding out if a quantity is measurable; instead, he is requiring that the aforementioned equations be satisfied in order to determine or measure the number expression S, without, however, identifying the latter with anything specific, as for example Cantor (with a fundamental sequence of
81 (BBGA 2A8, p. 102; cf. Russ 2004, pp. 358–359): “Lehrs. Unter den unendlichen Zahlenbegriffen gibt es auch einige, die von einer solchen Beschaffenheit sind, daß sich zu jeder beliebigen wirklichen Zahl q, die wir als Nenner eines Bruches betrachten wollen, ein Zähler p, der abermahls eine positive oder negative wirkliche Zahl, oder zuweilen auch eine Null ist, mit dem Erfolge auffinden läßt, daß wir die beyden Gleichungen erhalten $S = \frac{p}{q} + P$ und $S = \frac{p+1}{q} - P^1$, in welchen das Zeichen S den unendlichen Zahlenausdruck, die $P$ und $P^1$ aber ein Paar durchaus positive Zahlenausdrücke oder das erstere zuweilen auch eine bloße Null bedeutet.”
82 (BBGA 2A8, pp. 103–104): “dieß Mittel zur Bestimmung unendlicher Zahlenbegriffe”; “unendlich vielen Verrichtungen.”
83 (BBGA 2A8, p. 104): “näherungsweise bestimmen oder messen”; “bis auf ein $\frac{1}{q}$”; “jeden, nicht nur unendlichen, sondern auch endlichen Zahlenausdrucke S.” Here I follow Russ’ translation of “ermeßlich” as “estimable” (in measure) and therefore of “unermeßlich” as “inestimable” (Russ 2004, p. 361). According to the corresponding entry in the last volume of the Deutsches Wörterbuch by Jacob and Wilhelm Grimm, “ermeszlich” meant “mensurabilis” (Grimm and Grimm 1862, p. 915).
rational numbers)84 and Dedekind (with cuts)85 were to do a few decades later. Thus, given $S = \sqrt{2}$, one would be able to measure it up to $\frac{1}{100}$ by means of, on the one hand, the measuring fraction $\frac{141}{100}$ plus the complement $P$ and, on the other hand, $\frac{142}{100}$ (the next greater fraction) from which $P^1$ should be subtracted, with $P + P^1 = \frac{1}{100}$ (so that each of $P$ and $P^1$ is smaller than $\frac{1}{100}$). Clearly, one would be able to carry out such a measurement as far as desired, so that, following the “visualization” proposed by Russ and Trlifajová, one would always be able to approximate more closely to the rational division marks enclosing its value on a line of length >S (cf. Russ and Trlifajová 2016, p. 43).
Throughout the following paragraphs Bolzano proves several theorems, including the following: that “[a]ny rational number is a [complete] measurable number” and vice versa (§7); that “[i]f A is measurable, then –A is also measurable” (§10); and that “there are [...] not two different [numerators] p1 and p2” of a pair of measuring fractions with the same q as denominator and which correspond to the same number expression S so that there are two different pairs of what could be called measuring equations (although this is not Bolzano’s terminology) corresponding to S (§11).86 Then, however, in the theorem of §21 and the definition of §22, and with reference to an addendum (Zusatz) to the previous theorem in which he had pointed out that the numbers
$$A = 1 \quad\text{and}\quad B = 1 - \frac{1}{1 + 1 + 1 + \ldots\ \text{in inf.}}$$
have different measuring fractions (§20; cf. BBGA 2A8, pp. 111–112), Bolzano notes that “[a]mong the infinite number concepts there are also some” that “represent” numbers that can be called “infinitely small,” such as
$$S = \frac{1}{1 + 1 + 1 + \ldots\ \text{in inf.}},$$
and which, “according to the concepts [introduced] so far” (Bolzano once more refers to (BBGA 2A7, p. 164)), should not be considered as $= 0$.87 In other words, such number concepts were those which, “in the attempted process of measuring”
84 A fundamental sequence $a_1, a_2, \ldots, a_n, \ldots$ of rational numbers has the property that the difference $|a_{n+m} - a_n|$ becomes as small as desired as n increases, b (a numerical quantity or Zahlengrösse) being the limit of the sequence and B (the totality of such numerical quantities) being the domain of real numbers (Cantor 1872, pp. 123–124).
85 A Dedekind cut decomposes all rational numbers into two classes, such that each rational number in the class to the left is smaller than each one in the class to the right, the cut being either a rational or an irrational number (Dedekind 1872, pp. 19–21).
86 (BBGA 2A8, pp. 105–107; cf. Russ 2004, pp. 361 & 363): “Jede Rationalzahl ist eine meßbare Zahl, und zwar läßt sich ein volles Maß für sie angeben”; “Ist A meßbar, so ist auch –A meßbar”; “gibt es [...] nicht zwey von einander verschiedene Zahlen p1 und p2, welche als Zähler des messenden Bruches.”
87 (BBGA 2A8, pp. 112–113): “Unter den unendlichen Zahlenbegriffen gibt es auch einige”; “vorstell[en]”; “unendlich kleine”; “nach den bisherigen Begriffen.”
and with the numerator of the measuring fraction $= 0$ (so that the measuring fraction is $\frac{0}{q} + P^1 = P^1$ or, following Bolzano’s initial notation, $= P$), “the two equations $S = P^1 = \frac{1}{q} - P^2$ [or $-P^1 = -\frac{1}{q} + P^2$, for negative S] hold, where q is taken to be as great as one desires.”88 Bolzano goes on to say, first of all, that he will call a “finite number” (endliche Zahl, which should not be confused with the appellation “finite expression”) any non-infinitely-small measurable number (end of §22), and, secondly, that infinitely small numbers are not “actual numbers” (wirkliche Zahlen; §23). Then, in the theorem of §26 and the definition of §27, he notes that, conversely, “[a]mong the infinite number concepts there are also some” for which, “in the attempted process of measuring,” “there is a p for every q, such that one of the two equations $S = \frac{p}{q} + P^1 = \frac{p+1}{q} - P^2$ is satisfied, but not both of them at once,”89 infinite number concepts the representation of which he calls “infinitely great” positive or negative numbers, depending on whether they satisfy the first or second equation, respectively (BBGA 2A8, p. 114). Examples of these numbers would be those designated by the expression $S = 1 + 2 + 3 + 4 + \ldots$ in inf. and, for negative S, $-1 - 2 - 3 - 4 - \ldots$ in inf. (cf. BBGA 2A8, p. 114). However, unlike the infinitely small numbers, the infinitely great numbers would be “unmeasurable” (unermeßlich) inasmuch as they would not comply with the requirement of the two measuring equations set out in §§5–6 (cf. BBGA 2A8, p. 115). Interestingly, Bolzano points out that there would, nevertheless, still also be unmeasurable numbers that are not infinitely great, such as the number represented by the expression $1 - 1 + 1 - 1 + \ldots$ in inf. and 0 (§§29–30; cf. BBGA 2A8, p. 115). Thus, and at least up to this point in RZ, his theory of numbers (in the broad sense of this term) would not entirely correspond to our modern continuum of real numbers, inasmuch as it comprises:
(a) the unit, that is, 1;
(b) the numbers in the strictest sense of the term, that is, 2, 3, 4, 5, 6, . . .;
(c) the negative numbers;
(d) the rational numbers;
(e) the non-rational measurable numbers, which so far seem to hint at our irrational numbers but also include the infinitely small numbers;
88 (BBGA 2A8, p. 113): “bey dem versuchten Geschäfte des Messens”; “die beyden Gleichungen $S = P^1 = \frac{1}{q} - P^2$ bestehen, q werde so groß als man nur immer will genommen.” Bolzano introduces the denomination of “infinitely small negative number” (unendlich kleine negative Zahl) and its corresponding measuring equations in this very same paragraph.
89 (BBGA 2A8, pp. 112–114): “Unter den unendlichen Zahlenbegriffen gibt es auch einige”; “bey dem versuchten Geschäfte des Messens [...] für welche die beyden Gleichungen $S = P^1 = \frac{1}{q} - P^2$ bestehen, q werde so groß als man nur immer will genommen”; “für jedes q wohl ein p gibt, das Einer der beyden Gleichungen $S = \frac{p}{q} + P^1 = \frac{p+1}{q} - P^2$, keines aber, das beyden zugleich entspräche.”
(f) the unmeasurable numbers, such as the infinitely great numbers, 0 and what nowadays would be looked on as “divergent series.”
Furthermore (although, as must once again be emphasized, this holds true of Bolzano’s proposal in RZ only up to this point, namely §30 of the seventh section), his “measurable numbers” would not strictly correspond either to our modern continuum of real numbers, since they would include the infinitely small numbers but not 1 and 0, or to our nonstandard reals, since Bolzano’s measurable numbers would indeed not include 1, 0 or the infinitely great numbers. In the following paragraphs Bolzano focuses on operations involving infinitely great and small numbers, without saying anything further about $1 - 1 + 1 - 1 + \ldots$ in inf. and suchlike expressions. To mention just a few results, in §32 he proves that the sum of an infinitely great and a rational number is a number of the first kind; in §§34–35 he proves that the sum of two, or of a “finite multitude” (endlich[e] Menge) of either infinitely small numbers or infinitely great numbers of the same sign, is infinitely small or infinitely great, respectively; in §36 he states that the difference between a pair of infinitely great numbers can be zero (if they are equal), infinitely small (if an infinitely small number is added to one or the other of them), finite (if a finite number is added to one or the other of them), or even infinitely great (if an infinitely great number is added to one or the other of them); in §§37–38 he proves that the product of an infinitely small or great number by a finite one (thus excluding 0) or by another infinitely small or great number is infinitely small or great, respectively; in §39 he proves that the product of a finite multitude of what he calls finite numbers cannot be either infinitely small or infinitely great, while in §40 he proves that the quotient of a finite number divided by another finite number cannot be either infinitely small or infinitely great;
and in §44 he states that the “signs” (Zeichen) $P^1$ and $P^2$ that appear in the measuring equations are also measurable numbers (cf. BBGA 2A8, pp. 115–122). But then in §53, after arguing that the sum, the difference (although he neglects this case) or the product of two (or of a finite multitude of) measurable numbers must necessarily be another measurable number (§§45–46 & 49–50), Bolzano states that, “according to [the] concepts [introduced] so far,” there are number expressions that cannot be compared with one another, that is, between which there obtains no relation of equality or of height.90 As he goes on to say, such would be the case, for example, of
$$A \quad\text{and}\quad A + \frac{1}{1 + 1 + \ldots\ \text{in inf.}}$$
since, according to a definition set out in the fifth section of RZ, the difference $A - B$, with $A > B$ and with both of these being rational numbers, “can be reduced to a
90 (BBGA 2A8, p. 128): “nach unsern bisherigen Begriffen.”
fraction of the form $+\frac{m}{n}$, where $m$ and $n$ denote actual numbers,”91 which is not the case here since
$$A + \frac{1}{1 + 1 + \dots \text{ in inf.}} - A = \frac{1}{1 + 1 + \dots \text{ in inf.}},$$
with $n$ (i.e., $1 + 1 + \dots$ in inf.) not being an actual number. As a consequence, in §54 he poses the “problem” (Aufg[abe]) of investigating whether the previously discussed concepts could be extended in order to allow us to establish such a relation, and he presents the following solution: given any two finite or infinite number expressions $A$ and $B$, if both are measurable and if “for any arbitrary $q$, the same positive or negative $p$ is always found for both expressions $A$ and $B$, which satisfies the equations $A = \frac{p}{q} + P^1 = \frac{p+1}{q} - P^2$ and $B = \frac{p}{q} + P^3 = \frac{p+1}{q} - P^4$,”92 with $P^1$ and $P^3$ zero or positive expressions and $P^2$ and $P^4$ positive expressions, then $A = B$; but if the difference between those two number expressions is positive and not infinitely small, then the one being subtracted will be smaller than the other (cf. BBGA 2A8, pp. 133–134). The problem with such a solution, as Bolzano notes in an additional sheet which was not included in the 1962 edition of RZ (cf. Rychlík 1962) and which indicates that §54 (and by extension the definition in §55, as well as perhaps some other paragraphs) should be rewritten (BBGA 2A8, p. 129 fn. n), arises when one considers that “numbers that differ only by an infinitely small [number?] can behave differently in the process of measuring.”93 This would be the case, for example, of the sum of $.33333\ldots$ and $.66666\ldots$, which, according to Bolzano’s initial formulation of the relations of equality and height, would be $= .99999\ldots$ and so $< 1$, which in turn would result in
$$\frac{1}{3} + \frac{2}{3} < 1.$$
91 (BBGA 2A8, p. 128; cf. Russ 2004, p. 390): “auf einen Bruch von der Form $+\frac{m}{n}$ bringen läßt, wo m und n wirkliche Zahlen bedeuten.”
92 (BBGA 2A8, p. 131): “zu jedem beliebigen q immer dasselbe positive oder negative p für beyde Ausdrücke A und B vorfindet, welches die Gleichungen $A = \frac{p}{q} + P^1 = \frac{p+1}{q} - P^2$ und $B = \frac{p}{q} + P^3 = \frac{p+1}{q} - P^4$ erfüllet.”
93 (BBGA 2A8, p. 130): “Zahlen, die sich nur um ein unendlich Kleines unterscheiden, können sich bei dem Geschäfte des Messens verschiedentlich verhalten.”
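The role of the measuring numerator in this difficulty can be illustrated with a short Python sketch. It is a modern illustration under stated assumptions, not Bolzano's own procedure: it assumes that the two measuring equations amount to choosing, for each denominator q, the integer p with p/q ≤ A < (p+1)/q, and it uses the finite decimal 0.99…9 with k nines as a stand-in for .99999…, since an exact fraction cannot hold an infinitely small difference. The names `measuring_numerator` and `truncated_nines` are hypothetical and do not come from RZ.

```python
from fractions import Fraction
from math import floor


def measuring_numerator(a: Fraction, q: int) -> int:
    """Return the integer p with p/q <= a < (p+1)/q.

    Assumption: this is a modern reading of the measuring equations
    a = p/q + P1 = (p+1)/q - P2 with P1 >= 0 and P2 > 0.
    """
    return floor(a * q)


def truncated_nines(k: int) -> Fraction:
    """0.99...9 with k nines, a finite stand-in for .99999..."""
    return 1 - Fraction(1, 10 ** k)


if __name__ == "__main__":
    k = 12  # number of nines kept in the stand-in
    for q in (1, 2, 3, 7, 10, 1000):
        p_one = measuring_numerator(Fraction(1), q)
        p_nines = measuring_numerator(truncated_nines(k), q)
        # Print the denominator and the two numerators side by side.
        print(q, p_one, p_nines)
```

For every q up to 10^k the stand-in yields the numerator q − 1, while 1 itself yields q. On the initial formulation of equality the two expressions would therefore count as unequal, although they differ by less than any measuring fraction in that range.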
Following Bolzano, this would also be the case for $1$ and
$$1 - \frac{1}{1 + 1 + \dots \text{ in inf.}},$$
which, “for every measuring fraction $q$, give the numerators $p = q$ and $p = q - 1$ [respectively]”94 and yet should be regarded as equal. Because of this, he adds, if the difference between $A$ and $B$, “considered absolutely,” “behaves like zero,” “one finds that the numerator of the measuring fraction [is] $= 0$, and so [...] $A = B$.”95 First of all, it should be stressed that Bolzano’s corrections and amendments, such as the one just presented, are evidence of a work the final version of which he did not manage to finish. Secondly, the amendment introduced in §54 allows Bolzano to revisit the case posited in §20, namely $A \neq B$, with
$$A = 1 - \frac{1}{1 + 1 + 1 + \dots \text{ in inf.}} \quad \text{and} \quad B = 1,$$
and to state that $A + J = A$, with $J$ an infinitely small number or $= 0$ (BBGA 2A8, p. 136). Such an amendment, as he explains further in the note of §58, involves reconsidering the notion of infinitely small numbers insofar as, “in the process of measuring,” they “behave” (he uses the noun Verhalten) as “equal to zero,” even though they are not, and thus, he says, in order to distinguish them from zero in the strict sense of the term, they should be called “relative or respective zero.”96 As pointed out by Detlef Laugwitz in his second paper on RZ, this last formulation of the equality relation between measurables is crucial in order to overcome some significant difficulties in Bolzano’s theory (cf. Laugwitz 1982, p. 669), although, as noted by Berg and, following him, Rusnock, it may at the same time pose some difficulties, in particular due to Bolzano’s reference to the difference $A - B$ “considered absolutely” (cf. BBGA 2A8, pp. 134–135 fn. 64; Rusnock 2000, p. 185). Moreover, as pointed out by Rusnock, “[t]he condition on $P_1$ reveals an important weakness in Bolzano’s approach” (Rusnock 2000, p. 182), not only from the modern perspective of how our real numbers must behave, but also from the perspective of what Bolzano’s own theory of numbers dictates. As noted by Russ and Trlifajová, whereas according to §§45–46 the sum of a finite multitude of measurable numbers will also be a measurable number (so, from a modern point of view, there is closure under addition), problems emerge when one considers an
94 (BBGA 2A8, p. 130): “bei jedem messenden Bruche q, den Zähler $p = q$, und $p = q - 1$ geben.”
95 (BBGA 2A8, p. 130): “absolut betrachtet”; “wie Null sich verhält”; “man zu Jedem auch noch so großen Nenner q den Zähler des messenden Bruches $= 0$ findet, und somit [...] daß $A = B$ sey.”
96 (BBGA 2A8, p. 136): “bey dem Geschäfte des Messens”; “Null gleichgelten”; “relative oder beziehungsweise Null.”
expression that Bolzano quotes in §2 as an example of an infinite quantity expression, namely “$\frac{1}{2} - \frac{1}{4} + \frac{1}{8} - \frac{1}{16} + \dots$ in inf.” (BBGA 2A8, p. 100; Russ and Trlifajová 2016, p. 49). Following Russ and Trlifajová, we might interpret such an expression “as the sequence of partial sums [so that] we obtain the sequence $\frac{1}{2}, \frac{1}{4}, \frac{3}{8}, \frac{5}{16}, \dots$ that converges to $\frac{1}{3}$” (Russ and Trlifajová 2016, p. 49), but for which the two corresponding measuring equations cannot be given. Nonetheless, if this type of expression is unmeasurable, the sum of “$A = 2 - 2 + \frac{1}{2} - \frac{1}{2} + \frac{1}{8} - \frac{1}{8} + \dots$ in inf.” and “$B = -1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots$ in inf.,” “$A + B = 1 - \frac{3}{2} + \frac{3}{4} - \frac{3}{8} + \frac{3}{16} - \dots$ in inf.,” would likewise be unmeasurable and, as a consequence, measurable numbers would not be closed under addition (Russ and Trlifajová 2016, p. 49).97 Whereas from a modern perspective it may be desirable to repair Bolzano’s proposal in such a way that his “measurable numbers” turn out to be equivalent to our real numbers, we consider that it would be a more faithful reading of RZ to understand it as a transitional work that was not leading inexorably to the modern post-Weierstraßian or nonstandard frameworks.98 It is undeniable that Bolzano himself was struggling to overcome some problems in his theory of numbers, as is shown by a handwritten note that he added at the very end of RZ:
For the theory of measurable numbers
Should it not be possible to simplify the theory of measurable numbers if one establishes the definition of them so that $A$ is called measurable if one has 2 equations of the form $A = \frac{p}{q} + P = \frac{p+n}{q} - P$, where for the same $n$, $q$ can be increased to infinity [or indefinitely]?99
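The partial-sum reading and the closure counterexample discussed by Russ and Trlifajová can be checked with exact rational arithmetic. The Python sketch below is a modern illustration, not Bolzano's procedure. It assumes that an expression is read as its sequence of partial sums, that the measuring numerator for a denominator q is the integer p with p/q ≤ s < (p+1)/q, and that the quoted expressions read A = 2 − 2 + 1/2 − 1/2 + 1/8 − 1/8 + … and B = −1 + 1/2 + 1/4 + 1/8 + …, so that termwise A + B = 1 − 3/2 + 3/4 − 3/8 + …. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate, count, islice
from math import floor


def alternating_terms():
    """Terms 1/2, -1/4, 1/8, -1/16, ... of the expression quoted from §2."""
    for n in count(1):
        yield Fraction((-1) ** (n + 1), 2 ** n)


def a_terms():
    """Terms 2, -2, 1/2, -1/2, 1/8, -1/8, ... (assumed reading of A)."""
    for j in count(0):
        yield Fraction(2, 4 ** j)
        yield Fraction(-2, 4 ** j)


def b_terms():
    """Terms -1, 1/2, 1/4, 1/8, ... (assumed reading of B)."""
    yield Fraction(-1)
    for n in count(1):
        yield Fraction(1, 2 ** n)


def partial_sums(terms, how_many):
    """First `how_many` partial sums of an iterable of terms, as exact fractions."""
    return list(islice(accumulate(terms), how_many))


def numerators(sums, q):
    """For each partial sum s, the integer p with p/q <= s < (p+1)/q."""
    return [floor(s * q) for s in sums]


if __name__ == "__main__":
    s = partial_sums(alternating_terms(), 8)
    print(s[:4])             # first partial sums of 1/2 - 1/4 + 1/8 - ...
    print(numerators(s, 3))  # numerators for the denominator q = 3

    # Termwise sum of the assumed readings of A and B.
    ab_terms = (a + b for a, b in zip(a_terms(), b_terms()))
    print(numerators(partial_sums(a_terms(), 12), 5))
    print(numerators(partial_sums(b_terms(), 12), 5))
    print(numerators(partial_sums(ab_terms, 12), 5))
```

With q = 3 the numerators for the first expression alternate between 1 and 0, because the partial sums fall on either side of 1/3 without ever reaching it. For q = 5 the numerators of A and of B settle (at 0 and at −1, respectively), whereas those of A + B keep alternating between 0 and −1, which is one way of seeing why the sum fails to be measurable on this reading. Only a finite prefix of each sequence is inspected, so the output illustrates the pattern rather than proving it.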
In his first work on RZ, which was based on the 1962 edition, Laugwitz suggests an amendment in line with those which Bolzano introduced but which were only included in the second edition of that work (as in the case of the note just quoted), whereby “[a]n expression $A$ is called measurable if, for every natural (or ‘actual’) number $q$, there is an integer $p = p(q)$, so that $\frac{p}{q} < A < \frac{p+2}{q}$.”100 Berg interprets the last of Bolzano’s amendments as meaning that a measurable number expression corresponds “to a sequence $S$ ($= \langle S_n \rangle$) of rational numbers in such a way that, for every natural number $q$, there is an integer $p(q, S)$ so that, for all $n$, holds $S_n = \frac{p(q, S)}{q} + P_n(q, S) = \frac{p(q, S) + m}{q} - P_n(q, S)$, where $m$ is a natural number and $\langle P_n(q, S) \rangle$ is a purely positive sequence.”101 Similarly, and following (Rychlík 1962) and (van Rootselaar 1964), who also suggested such an interpretation of Bolzano’s measurable numbers, Rusnock proposes that $A$ be considered as measurable “iff for all $q$ there is exactly one integer $p$ and number concepts $P_1 \gtrsim 0$ and $P_2 > 0$ such that $A = \frac{p}{q} + P_1 = \frac{p+1}{q} - P_2$” (Rusnock 2000, p. 186). And, finally, Russ and Trlifajová suggest that Bolzano’s definition of measurable numbers can be repaired either by considering the numerator $p$ in both the measuring fraction and the next greater fraction of the measuring equations as a rational number, or by interpreting that Bolzano “generally considered number expressions with oscillating values and approaching a rational number as being equal to that rational number” (Russ and Trlifajová 2016, p. 50).102 Undoubtedly, each of those proposals has its advantages; but they all involve certain conjectures regarding issues that are not entirely clear or explicit in Bolzano’s theory. In particular, whereas the consideration of measurable numbers as sequences of rational numbers would be “in harmony with Bolzano’s remarks on infinite series from the paper on the binomial theorem” (Rusnock 2000, p. 181), where, from a modern point of view, Bolzano treated infinite series in a finite manner by focusing on their partial sums or finite segments of them (cf. Rusnock 2000, p. 63; Russ 2004, p. 144), it should be stressed that the latter does not explicitly endorse such an interpretation in RZ.
97 Previously, Rusnock posed the example of “$1 - \frac{3}{2} + \frac{3}{4} - \frac{3}{8} + \frac{3}{16} - \dots + \frac{3 \cdot (-1)^n}{2^n}$,” which gives rise to the sequence of values $1, -\frac{1}{2}, \frac{1}{4}, -\frac{1}{8}, \frac{1}{16}, \dots$, a sequence that converges to zero and that would not be measurable even though “it can be obtained as the sum or difference of two expressions each of which is measurable” (Rusnock 2000, p. 182).
98 For a similar stance with regard to both the modern post-Weierstraßian and nonstandard interpretations of Augustin-Louis Cauchy’s analytical proposal, cf. (Lützen 2003, p. 164).
99 (BBGA 2A8, p. 168): “Zur Lehre von den meßbaren Zahlen. Sollte die Lehre von den meßbaren Zahlen nicht vielleicht vereinfacht werden können, wenn man die Erklärung derselben so errichtet, daß A meßbar heißt, wenn man 2 Gleichungen von der Form $A = \frac{p}{q} + P = \frac{p+n}{q} - P$ hat, wo bey einerlei n, q ins Unendliche zunehmen kann?”
100 (Laugwitz 1965, p. 407): “Ein Ausdruck A heißt meßbar, wenn es zu jeder natürlichen (oder „wirklichen“) Zahl q eine ganze Zahl $p = p(q)$ gibt, so daß $\frac{p}{q} < A < \frac{p+2}{q}$.”
So, while this chapter is not intended to demean the challenging and praiseworthy work of reinterpreting Bolzano’s theory, it rather concurs with the suggestion of both Spalt and Šebestík that this theory should be understood, first and foremost, “on its own terms,”103 that is, “in its specificity, including its stylistic peculiarities.”104 Thus, on the one hand, in the case of convergent (in modern terminology) infinite number expressions with alternating values, Bolzano—as Rusnock has already suggested—seems to have “never fully mastered the use of inequalities in conjunction with absolute values, and never seems to have understood the behaviour of
101 (BBGA 2A8, p. 168 fn. 77): “entspricht nun eine Folge $S$ ($= \langle S_n \rangle$) von rationalen Zahlen derart, daß es für jede natürliche Zahl q eine ganze Zahl p(q, S) gibt, so daß für alle n gilt: $S_n = \frac{p(q, S)}{q} + P_n(q, S) = \frac{p(q, S) + m}{q} - P_n(q, S)$, wobei m eine natürliche Zahl und $\langle P_n(q, S) \rangle$ eine rein positive Folge ist.”
102 Russ and Trlifajová base the second interpretation on an example provided by Bolzano in §45, “which could be interpreted as non-monotonic sequences,” namely 1 2 p1 +p2 +2 p1 +p2 +1 1 2 13 14 ” 1 1 + Ω + P = + Ω P , from which “A + B = ” (there “A + B = p +p q q q q q would be a mistake here, either in the manuscript or in the 1976 edition, since the numerator is written “p1 + p1 + 1”; BBGA 2A8, p. 123; Russ and Trlifajová 2016, pp. 50–51; Russ 2004, p. 384).
103 (Spalt 1991, p. 68): “Sie [Bolzano’s Zahlenlehre] muß nur in ihrem eigenen Recht verstanden werden.”
104 (Šebestík 1992, p. 358): “l’historien doit de prime abord restituer une théorie du passé dans sa spécificité, y compris dans ses particularités stylistiques; c’est ensuite seulement qu’il peut tenter un rapprochement avec la pratique mathématique de notre temps.”
alternating series” (Rusnock 2000, p. 183). On the other hand, as regards the infinitely great and other unmeasurable numbers (i.e., those that cannot be placed in a relation of equality or height), such as the unmeasurable number represented by the divergent (in modern terminology) number expression $1 - 1 + 1 - 1 + \dots$ in inf. (§§29–30), Bolzano says very little more about these throughout the whole of the rest of RZ. First, in §62 he notes that “we are not entitled, even according to the broader concepts now established,” to maintain that all infinite number expressions can be placed in a relation of equality or height, as shown by “the difference $B - A = 1 - 2 + 3 - 4 + 5 - \dots$ in inf.,” with $A$ being “an arbitrary expression.”105 Then, in §69 he points out that it can be said that every positive or negative infinitely great number will be greater or smaller than any “actual number” (wirkliche Zahl), respectively (BBGA 2A8, p. 138), so that (a) it can be considered that every positive or negative measurable number lies between 0 and a positive or negative infinitely great number, respectively (theorem of §78; BBGA 2A8, p. 140); (b) given two numbers $L$ and $R$, and a “property B [that] is to belong to all measurable numbers from L to R,” if both are infinitely great and one is positive and the other negative, then B will belong to all measurable numbers; but if one of them is $= 0$ and the other a positive or negative infinitely great number, B will belong to all positive or negative measurable numbers, respectively (theorem of §89).106 Throughout the remaining paragraphs of the seventh section of RZ, proceeding in a way similar to the way in which he proceeds in the previous sections with regard to natural numbers, integers and rationals (cf. Sect. 3; Šebestík 1992, pp. 355–356), Bolzano discusses the extension of several properties to measurable numbers (cf. Russ and Trlifajová 2016, p. 45). For example, he states that “[i]f $A > B$ and $B > C$, then also $A > C$” (§63) and that107 “either $A = B$ or $A > B$ or $A < B$” (theorem of §73), a proposition which, as in other cases, is false with his first formulation of the equality relation (cf. Russ 2004, p. 399 fn. q; BBGA 2A8, pp. 134–135 fn. 64). In addition to this, he takes up once again some important theorems which he had already addressed in RaB, such as the theorem on what nowadays is called a least upper bound property (§109) and that about what (depending on whether or not Bolzano’s infinitesimals are considered to be equivalent to 0) can be considered equivalent to the so-called Cauchy convergence criterion (§107). Nevertheless, inasmuch as the elucidation of the extent to which Bolzano succeeds in grounding these properties in a rigorous manner (from his own point of view) in RZ is something that requires and deserves a thorough study,108 whereas the objective of this section has been to address the insights and uncertainties with regard to the number continuum underlying his notion of “measurable numbers,” readers are referred to (Spalt 1991), (Šebestík 1992) and (Rusnock 2000) for a careful assessment of Bolzano’s discussion of such theorems.
105 (BBGA 2A8, p. 137; cf. Russ 2004, p. 396): “wir selbst nach den jetzt aufgestellten weiteren Begriffen nicht berechtiget sind”; “der Unterschied $B - A = 1 - 2 + 3 - 4 + 5 - \dots$ in inf.”; “beliebiger Ausdruck.”
106 (BBGA 2A8, p. 142; Russ 2004, p. 403): “die Beschaffenheit B allen meßbaren Zahlen von L bis zu R ein- oder ausschließlich zukommen soll.” Bolzano’s formulation of this theorem contains a fairly subtle mistake since, strictly speaking, he says at the beginning of the paragraph that if L and R are infinitely great numbers, with one of them positive and the other negative, and the property B belongs “to all measurable numbers from L to R, inclusively or exclusively,” then it belongs to all measurable numbers. However, it is clear that L and R cannot be counted among the measurable numbers and thus L and R cannot be included among “all measurable numbers.”
107 (BBGA 2A8, pp. 137 & 139): “Wenn $A > B$ und $B > C$, so ist auch $A > C$”; “entweder $A = B$ oder $A > B$ oder $A < B$.”
5 Final Remarks
In his introduction to the 1976 edition of RZ, Berg pointed out a dilemma faced by the scholar of Bolzano’s theory of numbers: either one updates Bolzano’s proposal to bring it into line with present-day knowledge, thereby running the risk of misinterpreting some of its basic characteristics, or, if one attempts to be faithful to Bolzano’s proposal, some of those characteristics remain unclear (cf. Berg 1976, p. 8). This chapter has attempted to cast light upon Bolzano’s insights and uncertainties regarding the number continuum by paying attention not only to the conceptual framework of RZ, but also to the introductory volume to his general theory of quantities (i.e., EG), his mathematical notebooks and some other of his works, both those published during his lifetime and those published only posthumously. Moreover, this chapter has tried to provide a more faithful assessment of RZ by showing how Bolzano’s stance on numbers evolved from his student days through to his maturity, when he wrote this latter work. It should be pointed out, however, that there are some volumes of Bolzano’s Größenlehre and Miscellanea Mathematica (comprising, indeed, those notebooks which cover the period from 1830 onward) which have not yet been published and which might provide further evidence regarding certain aspects of RZ. Coupled with this, it is worth noting that, due to its limited scope, this chapter has left out of account certain interesting and not uncontroversial issues that are either addressed in RZ or linked to it. For example, this chapter did not delve into Bolzano’s
108 In her presentation at the symposium “Bolzano’s Mathematics and the General Methodology of the Sciences,” which was organized by Steve Russ and held as part of the 16th CLMPST, in Prague, Anna Bellomo examined §107 and provided appealing evidence in favor of a non-set-theoretical interpretation of RZ by focusing on Bolzano’s reasoning “in terms of parts and wholes.”
characterization of numbers as abstract (the property or nature to which the numbers are due) and concrete (the “actual multitude of things”), or “nominated” (benannte or genannte; e.g., “the concept three points”) and “unnominated” (unbenannte or ungenannte; in themselves, e.g., “the concept three,” so there is no unnominated number idea that has only one object, because “one” is not a number), these latter being those which are “primarily” (vornehmlich) studied in the “pure theory of numbers” (cf. BBGA 2A8, pp. 19–26). For a careful study of these topics, readers are referred to (Benis-Sinaceur 1975) and (Šebestík 1992, pp. 345–350). Likewise, this chapter has not engaged in any deeper discussion of Bolzano’s distinction between multitudes and collections (not sets), for which readers are referred to (Simons 1997), nor of Bolzano’s vindication of the so-called part-whole principle (cf. Trlifajová 2018). Regarding this last issue: (a) in PU, Bolzano defended the idea that an infinite multitude is greater than those infinite multitudes contained as parts of it (cf. Bolzano 1851/1920, pp. 26–27 & 30–31); (b) in WL, he stated that in the “infinite series of concepts [of whole numbers] [...] $n, n^2, n^4, n^8, \dots$,” “every succeeding [concept] is always subordinated to [i.e., smaller than] the preceding one”;109 and (c) he is said to have retracted his previous stance on the part-whole principle in a letter dated a few months before his death (cf. Mancosu 2009, p. 625). Paolo Mancosu has challenged the interpretation according to which “Bolzano saved his mathematical soul in extremis and joined the rank of the blessed Cantorians by repudiating his previous sins,” and he has pointed out that “recent mathematical work gives us theories that can (cum grano salis) be seen as formally capturing (parts of) the intuitive concept of infinity found in [...] all those who believed that the size of the natural numbers is larger than the size of the even numbers” (Mancosu 2009, pp.
626–627). Instead, and based not only on the passages alluded to but also on a paper by Bolzano dated 1840–41, Rusnock and Šebestík have recently argued that “[m]aintaining the whole–part principle would thus appear to force Bolzano” to pay a price that he presumably would not have been willing to pay (Rusnock and Šebestík 2019, pp. 537–538). There is a well-known passage—“well-known” at least among Bolzano’s scholars—toward the end of the preface to RaB where, with reference to his theorem on what today is designated as a “least upper bound,” Bolzano claims that, according to a “correct concept of quantity,” “the thought [or idea] of an i which is the greatest of those [quantities] of which it may be said that all those standing below it possess the property M, is the thought [or idea] of a real, i.e. of an actual quantity.”110 From the point of view expressed in this chapter, the fact that in this passage Bolzano refers to i as a “quantity” and not as a “number,” whereas in §109 of RZ he refers to it as a “measurable number,” not only constitutes evidence of both a terminological and a
109 (Bolzano 1837, p. 474): “unendlichen Reihe von Begriffen”; “ist also jeder folgende immer den vorhergehenden untergeordnet.”
110 (Bolzano 1817a, p. 23): “für Jeden, der einen richtigen Begriff von Größe hat, [...] daß der Gedanke eines i, welches das größte derjenigen ist, von denen gesagt werden mag, daß alle unter ihm stehende die Eigenschaft M besitzen, der Gedanke einer reellen d.h. wirklichen Größe sei.”
conceptual change, but ultimately points to the substratum on which his later notion of “measurable numbers” was to rest, namely that of quantities. Therefore it may be concluded that, on the one hand, from his early works onward Bolzano showed “a profound insight into the nature of [quantities]” (Russ 2004, p. 150), indeed, one would even want to be entitled to say “into the nature of numbers,” which eventually led him to the development of his notion of “measurable numbers.” On the other hand, however, even in the case of the latter notion, which reflects a transition to a non-traditional notion of numbers and represents “a surprising achievement for its time” (Russ 2004, p. 348), one still cannot help but note significant traits of a not-yet-modern understanding of the number continuum, of quantities and of mathematics itself. In this sense, Šebestík’s remark that the relevance of Bolzano’s theory of numbers can be better appreciated if it is compared both with later and with earlier proposals in this subject seems to be extremely appropriate (cf. Šebestík 1992, p. 385). Following Šebestík, it is worth considering, for example, “Kästner’s decomposition of irrational numbers,”111 first published in 1758. For the latter, irrational (algebraic) numbers, “which originate from roots, can be approximated to just as closely as one desires,” inasmuch as they are composed of both a rational part and of another part that can be considered “an innumerable multitude of infinitely small parts.”112 Such an approach, which was not novel,113 certainly had its shortcomings, starting with the fact that Kästner only referred to the algebraic irrationals and took for granted the operations on them (Šebestík 1992, pp. 353–355).
Nonetheless, as pointed out by Šebestík, Kästner’s presentation can be regarded as a step forward, inasmuch as he would seem to have intended to develop “[a]ll the concepts of arithmetic” on the basis of the notion of whole numbers.114 The ending of “the paradigm of the science of quantity” (to echo Epple’s expression mentioned in the introduction), which was still strongly advocated in the early nineteenth century and which, as has been shown to some extent in this chapter, entailed profound changes, was actually a protracted process which went on well beyond Bolzano’s own lifetime, extending for the whole of that century. As a consequence, remnants of that previous understanding of mathematics as Größenlehre were to continue to prevail, in one way or another, in several key proposals that are usually linked to the development of the modern notion of real
111 (Šebestík 1992, p. 353): “Lorsque nous parlerons de la représentation bolzanienne des «grandeurs mesurables», on devra se rappeler la décomposition des nombres irrationnels selon Kästner.”
112 (Kästner 1786, pp. 126–127; cf. Šebestík 1992, pp. 352–353): “Irrationalzahlen, die von Wurzeln herrühren, kann man sich, so weit man will, nähern”; “eine unzählige Menge unendlich kleiner Theile.”
113 In the mid-1810s, Bolzano explicitly rejected such an approach, which was still common at the turn of the nineteenth century (cf. van Rootselaar 1995, p. 28; Šebestík 1992, p. 355), because otherwise “one would arrive at the consideration of the infinite” (“käme man da in die Betrachtung des Unendlichen”; BBGA 2B6/2, p. 195).
114 (Kästner 1786, p. [4]): “Alle Begriffe der Arithmetik gründen sich meines Erachtens auf die von ganzen Zahlen.” Cf. (Šebestík 1992, pp. 52 & 54).
numbers. A couple of examples shall be mentioned. Firstly and as noted by Ferreirós, whereas Weierstraß continued to differentiate between, on the one hand, numbers in the strict sense of the term and, on the other, quantities—leading to “[h]is theory of rational and irrational numbers [which dates from the 1860s] [being] formulated as a theory of ‘numerical [quantities]’”—, there nevertheless lies “under [such a] traditional denomination [. . .] an abstract conception of [quantities]” (Ferreirós 2007, pp. 64–65 & 125). Secondly, as pointed out by Epple, whereas in his Theorie der complexen Zahlensysteme Hermann Hankel intended to develop a formal concept of real numbers following “the British tradition of Peacock and de Morgan,” he nevertheless claimed that “[t]he irrational require[d] for its systematic treatment the concept of quantity”115 and that irrationals, therefore, should be developed “within the theory of quantities.”116 Nevertheless, as the works of Dedekind, Cantor, and Heine show, by the 1870s the notion of number had changed significantly. Hence Dedekind’s insistence in 1872 that the “gradual extension of the concept of number” should be carried out without using “foreign ideas,” such as that of “measurable quantities.”117 As he was later to write, in the preface to the first edition of Was sind und was sollen die Zahlen?, even though he agreed that the “ancient conviction” that an “irrational number [could be] understood as a ratio of measurable quantities” was ultimately “the source” of any attempt at a theory of irrational numbers, for Dedekind there was an essential difference between his own theory based on cuts and certain earlier proposals.118 In particular, Dedekind disagreed with Jules Tannery’s claim that the latter’s theory (which the former recognized as equivalent to his) represented “the development of [the] idea originating from J. Bertrand [...] 
of defining an irrational number by indicating all rational numbers which are smaller and all those which are greater than the number to be defined.”119 As he went on to explain, Joseph Bertrand’s proposal, according to which any “incommensurable number” “is necessarily placed between two [commensurable numbers], $\frac{x}{n}$ and $\frac{x+1}{n}$, and [the integer]
115 (Epple 1996, pp. 6–7; cf. Hankel 1867, p. 47): “die britische Tradition Peacocks und de Morgans”; “Das Irrationale verlangt zu seiner systematischen Fassung den Grössenbegriff.”
116 The third and fourth sections of Hankel’s work are entitled “Die reellen Zahlen in ihrem formalen Begriffe” and “Die reellen Zahlen in der Grössenlehre”, respectively (Hankel 1867, pp. 35 & 48), while the full title of that work is Theorie der complexen Zahlensysteme insbesondere der gemeinen imaginären Zahlen und der Hamilton’schen Quaternionen nebst ihrer geometrischen Darstellung.
117 (Dedekind 1888, p. X): “die schrittweise Erweiterung des Zahlbegriffes”; “fremdartiger Vorstellungen”; “der meßbaren Größen.”
118 (Dedekind 1888, p. XIV): “diese uralte Ueberzeugung”; “die irrationale Zahl als Verhältniß meßbarer Größen auffaßt”; “die Quelle.”
119 (Dedekind 1888, pp. XIII–XIV): “die Entwickelung eines von Herrn J. Bertrand herrührenden Gedankens nennt, welcher in dessen Traité d’arithmétique enthalten sei und darin bestehe, eine irrationale Zahl zu definiren durch Angabe aller rationalen Zahlen, die kleiner, und aller derjenigen, die größer sind als die zu definirende Zahl.”
$n$ can be taken great enough, so that their difference $\frac{1}{n}$ is as small as desired,”120 was still a proposal which “ha[d] recourse to the existence of measurable quantities” and “present[ed] some essential gaps.”121 Bolzano’s proposal of “measurable numbers,” which was unknown to all these mathematicians, certainly constitutes a step forward, made already within the first half of the nineteenth century, in the elucidation of the number continuum. However, several crucial steps remained still to be taken, beginning with the abandonment of the substratum of quantities and by extension of the core notion of variable quantities (cf. Fuentes Guillén and Martínez Adame 2020). The position defended in this chapter, therefore, is that a proper assessment of Bolzano’s theory of numbers should always take into account its original setting and, as fruitful as an “updating” approach may sometimes appear to be, should always avoid dislocating this theory by reinterpreting it within a modern, post-Weierstraßian or nonstandard framework.
Acknowledgments The author would like to thank Steve Russ, José Ferreirós and Carlos Álvarez for their very helpful remarks and suggestions, as well as Michèle Friend and Bharath Sriraman for all their support for the publication of this chapter. He would also like to thank Davide Crippa for his help in obtaining some of the documents cited in this chapter, as well as Alejandrina Viesca Ramírez for her careful revision of it and Alexander Reynolds for his excellent proofreading.
Funding This chapter was made possible thanks to the support by the Postdoctoral Scholarship Program of the Dirección General de Asuntos del Personal Académico (DGAPA) and the Faculty of Sciences at Universidad Nacional Autónoma de México, as well as thanks to the support for the grant project “Mathematics in the Czech Lands: From the Jesuit Teaching to Bernard Bolzano” (GA ČR 19-03125Y) and the support by the Centre for Science, Technology and Society Studies of the Filosofický ústav Akademie věd České republiky.
References Barnett JH (2004) Enter, Stage Center: The Early Drama of The Hyperbolic Functions. Math Mag 77(1):15–30. https://www.tandfonline.com/doi/abs/10.1080/0025570X.2004.11953223 Benis-Sinaceur H (1975) Bolzano est-il le précurseur de Frege? Archiv für Geschichte der Philosophie 57(3):286–303 Benis-Sinaceur H (2015) Is Dedekind a Logicist? Why Does Such a Question Arise? In: BenisSinaceur H, Panza M, Sandu G (eds) Functions and generality of logic. Reflections on Dedekind’s and Frege’s Logicisms. Springer, Cham, pp 1–57 Berg J (1975) Einleitung des Herausgebers. In: Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A7. Frommann-Holzboog, Stuttgart-Bad Cannstatt, pp 9–16 Berg J (1976) Einleitung des Herausgebers. In: Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A8. Frommann-Holzboog, Stuttgart-Bad Cannstatt, pp 7–9
(Bertrand 1849, pp. 204–205): “nombre incommensurable”; “est nécessairement compris entre 1 deux d’entre eux, nx et xþ1 n , et l’on peut prendre n assez grand, pour que leur différence n soit aussi petite qu’on le voudra.” 121 (Dedekind 1888, pp. XIV–XV): “sie sogleich ihre Zuflucht zu der Existenz einer meßbaren Größe nimmt”; “einige so wesentliche Lücken darzubieten.” 120
Berg J (1977) Einleitung des Herausgebers. In: Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A5. Frommann-Holzboog, Stuttgart-Bad Cannstatt, pp 7–11 Berg J (1990) Zur logischen und mathematischen Ontologie. Geneseologie und Resultatismus in der Analysis der Grundlagen der Bolzanoschen Zahlenlehre. In: Spalt DD (ed) Rechnen mit dem Unendlichen. Beiträge zur Entwicklung eines kontroversen Gegenstandes. Springer, Basel, pp 123–155 Bertrand J (1849) Traité d’arithmétique. L. Hachette et cie, Paris Blok J (2016) Bolzano’s Early Quest for A Priori Synthetic Principles. Mereological Aspects of the Analytic-Synthetic Distinction in Kant and the Early Bolzano. PhD thesis. Rijksuniversiteit Groningen Bolzano B (1804; BG) Betrachtungen über einige Gegenstände der Elementargeometrie. In Commission bey Karl Barth, Prag Bolzano B (1810; BD) Beyträge zu einer begründeteren Darstellung der Mathematik. Erste Lieferung. Caspar Widtmann, Prag Bolzano B (1816; BL) Der binomische Lehrsatz, und als Folgerung aus ihm der polynomische, und die Reihen, die zur Berechnung der Logarithmen und Exponentialgrößen dienen, genauer als bisher erwiesen. C.W. Enders, Prag Bolzano B (1817a; DP) Die drey Probleme der Rectification, der Complanation und der Cubirung, ohne Betrachtung des unendlich Kleinen, ohne die Annahmen des Archimedes, und ohne irgend eine nicht streng erweisliche Voraussetzung gelöst; zugleich als Probe einer gänzlichen Umstaltung der Raumwissenschaft, allen Mathematikern zur Prüfung vorgelegt. Paul Gotthelf Kummer, Leipzig Bolzano B (1817b; RaB) Rein analytischer Beweis des Lehrsatzes, daß zwischen je zwey Werthen, die ein entgegensetztes Resultat gewähren, wenigstens eine reelle Wurzel der Gleichung liege. Gottliebe Haase, Prag Bolzano B (1837; WL) Wissenschaftslehre. Versuch einer ausführlichen und größtentheils neuen Darstellung der Logik mit steter Rücksicht auf deren bisherige Bearbeiter. Erster Band. 
Seidelschen Buchhandlung, Sulzbach Bolzano B (1851/1920; PU) Paradoxien des Unendlichen herausgegeben aus dem schriftlichen Nachlasse des Verfassers von Dr. Fr. Příhonský, A. Höfler (ed). Felix Meiner, Leipzig Bolzano B (1930; F) Functionenlehre. In: Rychlík K (ed) Spisy Bernarda Bolzana/Bernard Bolzano’s Schriften, Svazek 1. Královská česká společnost nauk, Praha Bolzano B (1975; EG BBGA 2A7) Einleitung in die Größenlehre und erste Begriffe der allgemeinen Größenlehre. In: Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A7. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (1976; RZ BBGA 2A8) Größenlehre II: Reine Zahlenlehre. In Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A8. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (1977; BBGA 2A5) Mathematische und philosophische Schriften 1810–1816. In: Berg J (ed) Bernard Bolzano Gesamtausgabe, Band 2A5. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (1995; BBGA 2B6/2) Miscellanea Mathematica 10. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B6/2. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (1996; BBGA 2B7/1) Miscellanea Mathematica 11. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B7/1. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (2001; BBGA 2B9/2) Miscellanea Mathematica 16. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B9/2. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (2005; BBGA 2B11/1) Miscellanea Mathematica 19. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B11/1. Frommann-Holzboog, Stuttgart-Bad Cannstatt
Bolzano B (2006; BBGA 2B11/2) Miscellanea Mathematica 20. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B11/2. Frommann-Holzboog, Stuttgart-Bad Cannstatt Bolzano B (2007; BBGA 2B12/1) Miscellanea Mathematica 21. In: van Rootselaar B, Berg J (eds) Bernard Bolzano Gesamtausgabe, Band 2B12/1. Frommann-Holzboog, Stuttgart-Bad Cannstatt Cantor G (1872) Ueber die Ausdehnung eines Satzes aus der Theorie der trigonometrischen Reihen. Math Ann 5:123–132 Cantor G (1874) Ueber eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. J Reine Angew Math 77:258–262 Cantù P (2014) Bolzano versus Kant: Mathematics as a Scientia Universalis. In: Reboul A (ed) Mind, Values, and Metaphysics. Philosophical essays in Honor of Kevin Mulligan, vol 1. Springer, Cham, pp 295–316 Catalogus Mathematicorum IX A 19 (1781ff) Catalogus Mathematicorum Anno Era Christiano 1781 elucubratus. Katalogy Národní knihovna České republiky Cauchy A-L (1821) Cours d’Analyse de l’École Royale Polytechnique. 1er Partie, Analyse Algébrique. Debure frères, Paris Dedekind R (1872) Stetigkeit und irrationale Zahlen. Friedrich Vieweg und Sohn, Braunschweig Dedekind R (1876) Sur la théorie des nombres entiers algébriques. Bull Sci Math Astron 11:278– 288 Dedekind R (1888) Was sind und was sollen die Zahlen? Friedrich Vieweg und Sohn, Braunschweig Dorrego López E, Fuentes Guillén E (forthcoming) Irrationality and transcendence in the 18th and 19th centuries. A contextualised study of J. H. Lambert’s work. Prologue by José Ferreirós. Springer Du Bourguet JBE (1810) Traités élémentaires de calcul différentiel et de calcul intégral, indépendans de toutes notions de quantités infinitésimales et de limites, Tome premier. Courcier, Paris Epple M (1996) Das Ende der Grössenlehre. Eine Einführung in die Geschichte der Grundlagen der Analysis, 1860-1930. Preprint-Reihe des Fachbereichs Mathematik 8. 
Johannes Gutenberg-Universität Mainz, Mainz Ferreirós J (2007) Labyrinth of Thought. A History of Set Theory and Its Role in Modern Mathematics, 2nd edn. Birkhäuser, Germany Fuentes Guillén E, Crippa D (2021) The 1804 examination for the chair of Elementary Mathematics at the University of Prague. Hist Math 57:24–54.e18 Fuentes Guillén E, Martínez Adame C (2020) The notion of variable quantities ω in Bolzano’s early work. Hist Math 50:25–49 Grimm J, Grimm W (1862) Deutsches Wörterbuch, Dritter Band, E – Forsche. S. Hirzel, Leipzig Grunert JA (ed) (1836) Supplemente zu Georg Simon Klügel’s Wörterbuche der reinen Mathematik, Zweite Abtheilung, E bis Z. E. B. Schwickert, Leipzig Handbuch (1786) Handbuch aller unter der Regierung des Kaisers Joseph II für die K.K. Erbländer ergangener [sic] Verordnungen und Gesetze in einer Systematischen Verbindung enthält die Verordnungen und Gesetze von Jahre 1784, Sechster Band. Georg Moesle, Wien Hankel H (1867) Theorie der complexen Zahlensysteme insbesondere der gemeinen imaginären Zahlen und der Hamilton’schen Quaternionen nebst ihrer geometrischen Darstellung. Leopold Voss, Leipzig Kästner AG (1786) Anfangsgründe der Arithmetik, Geometrie, ebenen und sphärischen Trigonometrie und Perspectiv, 1sten Theils erste Abtheil, Vierte vermehrte Auflage. Wittwe Vandenhoeck, Göttingen Klügel GS (1808) Mathematisches Wörterbuch oder Erklärung der Begriff, Lehrsätze, Aufgaben und Methoden der Mathematik, Erste Abtheilung, Dritter Theil von K bis P. Schwickertschen Verlage, Leipzig Lagrange J-L (1797) Théorie des fonctions analytiques, contenant les principes du calcul différentiel, dégagés de toute considération d’infiniment petits ou d’évanouissans, de limites
ou de fluxions, et réduits a l’analyse algébrique des quantités finies. Imprimerie de la République, Paris Lambert JH (1761/1768) Mémoire sur quelques propriétés remarquables des quantités transcendantes, circulaires et logarithmiques. Mémoires de l’Académie Royale des Sciences et Belles-Lettres, Tom. XVII. Haude et Spener, Berlin, pp 265–322 Laugwitz D (1965) Bemerkungen zu Bolzanos Größenlehre. Arch Hist Exact Sci 2(5):398–409 Laugwitz D (1982) Bolzano’s infinitesimal numbers. Czechoslov Math J 32(4):667–670 Lipschitz R (1876/1986) Briefwechsel mit Cantor, Dedekind, Helmholtz, Kronecker, Weierstrass und anderen, Winfried Scharlau (ed). Springer, Braunschweig/Wiesbaden Lützen J (2003) The foundation of analysis in the 19th century. In: Jahnke HN (ed) A History of Analysis. American Mathematical Society/London Mathematical Society, pp 155–195 Mancosu P (2009) Measuring the size of infinite collections of natural numbers: was Cantor’s theory of infinite number inevitable? Rev Symb Logic 2(4):612–646 Ohm M (1828) Versuch eines vollkommen consequenten Systems der Mathematik, Erster Theil. T. H. Riemann, Berlin Rusnock P (2000) Bolzano’s Philosophy and the Emergence of Modern Mathematics. Rodopi, Amsterdam Rusnock P, Šebestík J (2019) Bernard Bolzano. His Life and Work. Oxford University Press, Oxford Russ S (2004) The Mathematical Works of Bernard Bolzano. Oxford University Press, Oxford Russ S, Trlifajová K (2016) Bolzano’s measurable numbers: are they real? In: Zack M, Landry E (eds) Research in History and Philosophy of Mathematics, The CSHPM 2015 annual meeting in Washington D.C. Cham, Birkhäuser, pp 39–56 Rychlík K (1962) Theorie der reellen Zahlen im Bolzanos handschriftlichen Nachlasse. Verlag der Tschechoslowakischen Akademie der Wissenschaften, Prag Šebestík J (1992) Logique et mathématique chez Bernard Bolzano. Vrin, Paris Simons P (1997) Bolzano on collections. In: Künne W, Siebel M, Textor M (eds) Bolzano and Analytic Philosophy. 
Rodopi, Amsterdam, pp 87–108 Spalt DD (1991) Bolzanos Lehre von den meßbaren Zahlen 1830–1989. Arch Hist Exact Sci 42: 15–70 Trlifajová K (2018) Bolzano’s Infinite Quantities. Found Sci 23:681–704 van Rootselaar B (1964) Bolzano’s Theory of Real Numbers. Arch Hist Exact Sci 2(2):168–180 van Rootselaar B (1995) Einleitung. In: van Rootselaar B, van der Lugt A (eds) Bernard Bolzano Gesamtausgabe, Band 2B6/2. Frommann-Holzboog, Stuttgart-Bad Cannstatt, pp 11–32 Weierstraß K (1868/1986) Einführung in die Theorie der analytischen Functionen nach einer Vorlesungsmitschrift von Wilhelm Killing aus dem Jahr 1868, Remmert R (ed). Universität Münster, Münster Winter E (1933) Bernard Bolzano und sein Kreis. Jakob Hegner, Leipzig Wolff C (1716) Mathematisches Lexicon. Johann Friedrich Gleditschens seel. Sohn, Leipzig Zweig A (1999) Immanuel Kant. Correspondence. Cambridge University Press, Cambridge
Brouwer’s Intuitionism: Mathematics in the Being Mode of Existence
Victor Pambuccian
Contents
1 Introduction
2 A Very Short Introduction to L. E. J. Brouwer’s Life
3 East and West, to Have or to Be
3.1 Haas on the Main Characteristic of the Mind East and West
3.2 Fromm on to Have or to Be
4 Language
5 The Eastern Origin of Subjective Idealism
6 Brouwer’s View on Mathematics
7 Wisdom and Mysticism: Beyond Mathematics and Time
8 Conclusions
References
Abstract
It is argued that Brouwer’s philosophy of mathematics makes perfect sense if viewed from an Eastern philosophical perspective, as a mathematics in what Erich Fromm called “the being mode of existence.” The difficulty Western philosophers have in accepting its validity under Brouwer’s own justifications is that mathematics is one of the most highly prized treasures of Western philosophy (those footnotes to Plato’s dialogues).
Keywords
L.E.J. Brouwer · Intuitionism · Erich Fromm · William S Haas · Being mode of existence · Eastern mind
V. Pambuccian (*) School of Mathematical and Natural Sciences, Arizona State University – West Campus, Phoenix, AZ, USA e-mail: [email protected] © Springer Nature Switzerland AG 2023 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_103-3
1 Introduction
L. E. J. Brouwer and his intuitionism are perhaps unique in the annals of the history of mathematics and its philosophy in the quality of the hostility they have encountered from mathematicians and philosophers alike, from his own time all the way to the present day. Brouwer’s case is also unusual for the absence of restraint in the verbal abuse, ridicule, and belittling of his philosophy, even from mathematicians who have been attracted to intuitionistic mathematics sans its underlying philosophy. Craig Smoryński, a mathematical logician, relates in Smoryński (1994) that “some years ago my topology teacher told the class that Brouwer had been a competent topologist who went off the deep end and invented intuitionism” and that “Brouwer is still widely believed to have been a Nazi, and indeed the pre-publication version of one major obituary of him reported that Brouwer had joined the SA and marched in its parades, his hair flowing majestically in the wind.” The latter part of this slanderous piece of disinformation came from a draft of Kreisel and Newman (1969). There is not a drop of truth in it. Brouwer never had sympathies of any kind for any nationalist movement, to say nothing of a biologically driven one, as anyone reading Van Dalen (2013) will learn. The claim simply fits well into the simplistic picture of the world painted by ideologists, according to which counter-modernists (see below) are right-wingers. Kreisel sent that draft to Arend Heyting and, after receiving the following reply on March 15, 1969 (copy in Heyting Papers, Haarlem; unpublished, courtesy of Mark van Atten), removed that passage, thanking Heyting:
It is not true that Brouwer supported National-Socialism. I have very good reasons to say this emphatically. Brouwer was a friend of my wife’s father, who was a fervent anti-nationalsocialist. 
(He was the conductor of the Residentie-orkest, the philharmonic orchestra of The Hague, and he got into trouble by refusing to let the orchestra play the Horst Wessellied in honour of Prince Bernhard, who then, bethroted to Princess Juliana, was still a German subject.) If Brouwer had in the least supported National-Socialism, my wife’s father would have broken all links with him, and my wife would have known that.
By 1994, Smoryński had considerably softened his view on Brouwer and pointed out that the above were misrepresentations, that the author of the book he was reviewing had “done a hatchet job on Brouwer,” and that “van Stigt’s psychological analysis of Brouwer is excessive and excessively negative.” In 1977, his take on Brouwer had been more negative, for he had written (Smoryński 1977, p. 822):
L.E.J. Brouwer was making the rounds in a bizarre attempt to turn mathematics into a religion. When, in 1920, Hermann Weyl fell prey to Brouwer’s lunacy, David Hilbert decided to intervene.
Charles McCarty, an expert on intuitionistic logic, found Brouwer’s philosophy to be an “idiosyncratic idealism” (McCarty 2008, p. 39) and argued as follows that one
ought to found intuitionism on philosophical assumptions that bear no resemblance to Brouwer’s own (McCarty 1987, p. 542). Now, we seem to be left with only one alternative: that constructions are themselves phenomenal objects akin to pains and afterimages. Their intrinsic properties are only those which they seem to have and these are solipsistic: unshared, unsharable and, on some outlooks, incommunicable. As is well known, Brouwer encouraged just this sort of mathematical solipsism. He declared intuitionistic mathematics to be the record of the fruits of the creative activity of a single individual. Constructions are phenomenal objects which, in the mind of the mathematician, make up a kind of Lego set, pieces of which can be assembled and displayed on the inner visual field. Ultimately, the facts of mathematics are just those which he can put together from his fundamental elements. It is clear that we ought not to follow Brouwer.
We will see later that “single individual,” “solipsistic,” and “mathematical solipsism” are misunderstandings of a superficial reading of Brouwer’s writings. And Giovanni Sambin, a constructivist mathematician, tells us (Sambin 2008, p. 301): We are here to celebrate a master, namely L.E.J. Brouwer, and one hundred years from the birth of his intuitionism. I believe that the best way to keep Brouwer alive is to try and go beyond Brouwer himself. [. . .] To go beyond Brouwer means to learn his lesson, but also try to fix his mistakes and to soften (in ourselves) some hardness in his personality. And in this way to keep him alive. To confine Brouwer into the prison of what he has materially written, would mean to kill him (again).
The philosopher Carl Posy, who devotes a book to mathematical intuitionism (Posy 2020), asks in Posy (1998, p. 308), ironically: First, suppose, as I have argued, that Brouwer’s position is primarily mathematical. Why does he encumber it with this bugaboo about language in general and about Hilbert’s concentration on formal language in particular?
And Georg Kreisel, a mathematical logician who contributed to intuitionistic logic, has mostly derogatory things to say about Brouwer, for example: Gödel was utterly bored by Brouwer unlike several logicians and mathematicians who, being dry themselves, were buoyed by Brouwer’s probably genuine exuberance. (Kreisel 1987, p. 146) Apparently, he [Brouwer] did not like to be interrupted anyway; fittingly, for a good solipsist. (Kreisel 1987, p. 147)
With little knowledge of the actual biographical facts that led to the pause in Brouwer’s publications, which were not available at that time, Kreisel proposes an entirely implausible explanation, if one takes Brouwer’s own philosophical writings seriously (which Kreisel obviously does not, and makes his disdain explicit in Kreisel (1977)):
Be that as it may, it is a fact that Brouwer himself devoted the ten years after publication of the papers by Heyting and Gödel to non-scientific activities, although, objectively, these papers concerned essential points of his life’s work – and although he was only fifty and in good health. It does not seem unreasonable to suppose that he had received an intellectual shock. (Kreisel and Newman 1969, p. 45)
Gödel’s immensely natural proofs did not need anything like the heavy distinctions which Brouwer thought essential to his development of constructive mathematics (nor, incidentally, the ingenious constructions that Hilbert had introduced into proof theory). What they did involve was a clear perception of aims and principles, a philosophical, non-doctrinaire analysis, alien to the logical activists of the time. Seeing the incomparable superiority of this kind of analysis Brouwer had to face the question, consciously or unconsciously, to what extent he had even begun to master his own logical ideas. (Kreisel and Newman 1969, p. 44)
The absurdity of Brouwer mastering “his own logical ideas” and the supposed Gödel-envy in Brouwer will become apparent when we get to know Brouwer’s views on logic. That Heyting’s formalization of intuitionistic logic caused an “intellectual shock” is ludicrous but very much in line with the tone one felt free to use whenever the subject of discussion was Brouwer. A historian of mathematics found the need to chip in as well while reviewing Van Dalen (1999b): “Part of the mystique around Brouwer is caused by the notorious obscurity (one could say incoherence) of this philosophy.” (Grattan-Guinness 1999, p. 529) “Arend Heyting [. . .] became a major protagonist of intuitionism without the supernatural.” (Grattan-Guinness 1999, p. 531) In addition to these squarely hostile remarks about Brouwer’s philosophy of mathematics, there is a subtler way of rejecting his philosophy while trying to find firm foundations for a highly valued intuitionistic mathematics: ignoring Brouwer altogether, the way Michael Dummett, a philosopher, did in Dummett (1975) and Dummett (2000). This is all the more remarkable, given the achievement of the man whose philosophical thought has inspired so little respect in the mathematical and philosophical world. As Gerrit Mannoury summarized one of Brouwer’s achievements, in the retelling of Dirk van Dalen (Van Dalen 2013, p. 747):
In 1947 it was forty years ago that Brouwer got his doctorate. Friends, students and colleagues who had not deserted the old master, organised a jubilee symposium on 19 February. The ceremony took place in the lecture hall of the Geological Institute of Brouwer’s brother Aldert. 
Mannoury addressed his famous student and colleague; he sketched how Bolyai, Riemann and Peirce had broken the hegemony of the Euclidean and Archimedean axioms, how Einstein had ‘widened the visual field of physical science’, but how Aristotelian logic had survived all disturbances until Brouwer ‘emancipated human thought from the authority of the logical principles and so ran down that stronghold itself’.
Sure, the discovery of non-Euclidean geometry had not been greeted with enthusiasm either. But 100 years after its discovery, all doubts regarding its validity and significance had vanished. Why, with many more professional mathematicians and philosophers interested in mathematics active in the twentieth and twenty-first centuries, did the passing of time not similarly validate Brouwer’s revolutionary
achievement? It had never attracted more than a handful of devotees and survived into the twenty-first century only in the form of a formal logical theory, far removed from Brouwer’s own conception of what intuitionist mathematics ought to be. In the words of Meister Eckhart, who was dear to Brouwer, his intuitionistic mathematics was a work connected with him, which vanished with him. For as soon as the work is accomplished, it has instantly vanished, as has the time in which it happened and there is nothing left to do with the work. Should it somehow continue to act, then it must do so with other works and also in another time. (Trans. V. P.) [Denn sobald das Werk war, ist es sogleich zunichte geworden, ebenso die Zeit, in der es geschah, und ist nun mit dem Werke nichts mehr zu tun. Soll es irgendwie weiter wirken, so muss es mit anderen Werken geschehen und auch in einer anderen Zeit.]. (Eckhart 1934, p. 204)
The task that we set for ourselves is to find out the deep reasons for the rejection of Brouwer’s intuitionistic mathematics and of his philosophy of mathematics. We refer to the deep reasons to distinguish them from reasons connected with his biography, with Hilbert’s animosity and inability to withstand professional criticism, which we take to be superficial in the sense that, even in their absence, rejection would have occurred. We could not find any such explanation in the literature on Brouwer’s philosophy, which contains works in which Brouwer’s intuitionism is taken seriously and is treated with utmost respect. The literature that takes Brouwer seriously tries to establish links between his philosophy and that of other figures in the history of philosophy. The first such attempt is due to Oskar Becker who found in Becker (1927) that Brouwer’s intellectual predecessors were Aristotle, Descartes, and Kant and who understands Brouwer’s intuitionism through the prism of a thoroughly verheideggert phenomenology. Similarities between Brouwer’s philosophy and the Lebensphilosophie of the 1920s, as well as with Bergson (which is also emphasized in Largeault (1993)), were put forward in Hesseling (2003). Parallels with Husserl (sans Heidegger) are the subject of Van Atten (2007), van Atten (2015, pp. 237–281) (where “a close intellectual kinship between [Husserl and Brouwer]” is detected), and Tieszen (2008). We find these analyses very interesting, yet their aim is to bring Brouwer closer to some sort of Western philosophical mainstream and thus do not aim to explain the historical phenomenon of exclusion and derision of Brouwer’s philosophy. Aristotle, Descartes, Kant, and Husserl are pillars of Western philosophy, and hardly anyone dares to write about them in the style sampled above for Brouwer’s philosophy. Bergson and the Lebensphilosophie had their decades of glory. 
Moreover, Husserl was a close friend of Hilbert, and their philosophies of mathematics, despite differences, “share a ‘mathematics first,’ non-revisionist approach toward mathematics” (Hartimo 2017, p. 245). Was the Grundlagenstreit between Hilbert and Brouwer, as Heyting would have it (as reported in Van Dalen (2013, p. 829)), only “a historical accident, a blatant failure of common sense and of communication”? Would a mediation by Husserl – who wrote in a letter of May 9, 1928, to Heidegger that “the most interesting in Amsterdam were the long conversations with Brouwer, who made a very important impression on me, one of
a thoroughly original, radically sincere, genuine, very modern man” [“das Interessanteste waren in Amsterdam die langen Gespräche mit Brouwer, der einen ganz bedeutenden Eindruck auf mich machte, den eines völlig originellen, radikal aufrichtigen, echten, ganz modernen Menschen” (Husserl 1994, p. 156)] – have avoided its outcome? Such a mediation could have certainly helped with raising the level of civility that Hilbert was lacking, but we will argue that the deeper reasons for rejecting Brouwer’s philosophy were not of a nature that could be mediated away. Schopenhauer as a possible influence is dealt with in Eggenberger (1976), Largeault (1993), Detlefsen (1998), Koetsier (2005), and, in-depth, Van Belle (2021). Largeault also finds that Another feature that relates Schopenhauer to Brouwer: the discredit that struck their work. Should we believe that each type of culture has its own incompatibilities? [Trans. V. P.] [Autre trait qui apparente Schopenhauer et Brouwer: le discrédit qui a frappé leur œuvre. Faut-il croire que chaque type de culture comporte ses incompatibilités propres?]. (Largeault 1993, p. 166)
Could a similarity with Schopenhauer – which is entirely hypothetical, as Brouwer mentions very few Western philosophers, and, from those of times past, only Kant and Spinoza – explain the outsider status of his philosophy and the animosity it elicits? Schopenhauer’s influence in his lifetime and long past his death has been considerable. His philosophy influenced in the nineteenth century some late Romantics (such as Eminescu), Tolstoy, and philosophers (such as Nietzsche) and in the twentieth century Wittgenstein and Cioran, psychologists (such as Freud and Jung), writers and poets (such as Thomas Mann, Hermann Hesse, Samuel Beckett, Jorge Luis Borges, and Rainer Maria Rilke), and even physicists (such as Einstein or Wolfgang Pauli). Nothing remotely similar can be said about Brouwer’s philosophy. One should also consider that Brouwer himself refers to a proximity to “perennial philosophy” (Brouwer 1948, p. 487), not to some dernier cri in philosophy. However, we hold onto the observation that “each type of culture has its own incompatibilities” as a particularly insightful remark. One may think, with Herbert Mehrtens, that Brouwer’s philosophy of mathematics and his intuitionism were doomed by being counter-modernist. What this means is spelled out succinctly in Mehrtens (1996, pp. 521, 522) (the long version can be found in Mehrtens (1990)):
The two common traits of the various modernisms that I identify as central are, first, the autonomy of cultural production and, second, the departure from the vision of an immediate representation of the world of experience. The artist, the composer or the scientist define what is good and true in their respective fields, and they define it autonomously, independently of political, religious, or philosophical authority, and largely not only independently from but against the beliefs dictated by common sense. [. . .] The counter-modernist attitude arises with modernism. It is part of modernity, of the modern world. 
That is why I chose the term counter-modernism instead of anti-modernism. It is the counterpart to modernism, insisting on the question whether there is not some natural
substance to the truth and meaning of mathematics. The counter-modernist attitude claims that there is such substance, mostly called ‘intuition’.
However, counter-modernism, on its own, does not doom a philosopher’s posterity, as Gottlob Frege’s posthumous destiny clearly shows. Frege was not only counter-modern, he failed to even understand what Hilbert was doing, as his correspondence with Hilbert on the latter’s Grundlagen der Geometrie of 1899 and his outlandish polemic in the pages of the Jahresbericht der Deutschen Mathematiker-Vereinigung clearly show. Frege was not only counter-modern by insisting that axioms ought to be true (in an absolute way), and not arbitrarily chosen, and by his rejection of non-Euclidean geometry at a time when the issue had been long resolved among mathematicians (see Tóth 2009), but he was also a nationalist (Mehrtens suggests (Mehrtens 1996, p. 520) that there is a correlation between anti-modernism and nationalism, although he acknowledges the existence of exceptions, such as Hermann Weyl; Brouwer is falsely depicted as an anti-internationalist, which anyone reading Van Dalen (2013) will conclude is flatly wrong) and held anti-Semitic views. Although ignored during his lifetime, causing him to adopt a strident polemic and satiric writing style in several publications (see Wille 2020a, b), Frege has enjoyed a posterity as the founding figure of the dominant style of philosophy in the Anglo-Saxon world, analytic philosophy. His writings have generated a veritable exegetical Frege industry, in which even attempts to find how he could have been right in the dispute with Hilbert abound (to the disbelief of mathematicians, such as Hans Freudenthal, who wrote already in 1960 that “Frege, rebuking Hilbert like a schoolboy, also joins the Bœotians. (I have never understood why he is so highly esteemed today.)” (Freudenthal 1962, p. 618)). A major aspect of modernism is a marked relativism regarding truth claims. Truth is generated either individually or by groups as deemed convenient and almost anything goes. 
There are no universal truths that can be discerned from the human condition, whose very characteristics are considered debatable, to say nothing of an ability to “discern.” Contrary claims are considered oppressive. In the field of logic, for instance, this is expressed by Carnap’s principle of tolerance (Toleranzprinzip):
In logic, there are no morals. Everyone is at liberty to build his own logic, i.e. his own form of language, as he wishes. All that is required of him is that, if he wishes to discuss it, he must state his methods clearly, and give syntactical rules instead of philosophical arguments. (Carnap 1959, § 17) [In der Logik gibt es keine Moral. Jeder mag seine Logik, d. h. seine Sprachform, aufbauen wie er will. Nur muß er, wenn er mit uns diskutieren will, deutlich angeben, wie er es machen will, syntaktische Bestimmungen geben anstatt philosophischer Erörterungen. (Carnap 1934, § 17)]
In summarizing the results of his study of Brouwer’s philosophy, Michael Detlefsen observes how far from the modernist credo Brouwer’s understanding of mathematics is (Detlefsen 1998, p. 331):
He [Brouwer] was a kind of mathematical ‘existentialist’ in that he sought a conception of mathematical knowledge that makes it basic to our human existence. For this to be so, he believed, mathematical knowledge must be intimately related to our most basic knowledge of ourselves – that is, to that ‘existential’ awareness that we have of ourselves as willing, acting beings. Such views are, to be sure, extraordinary when viewed in the light cast by the bland and vapid views that have dominated the philosophy of mathematics of the past few decades. They are not, however, for all that, either silly or out of place.
We find that Largeault’s remark on incompatibilities related to “culture” and Detlefsen’s remark on the closeness of mathematical knowledge to our human existence, to our being, come very close to the two reasons we will put forward for the rejection of Brouwer’s philosophy. The first reason, somewhat similar to that mentioned by Largeault, is connected with the geography of the mind, along the lines of Holenstein (2004), with a mistake made by Central Casting, which placed L. E. J. Brouwer in Europe rather than in “the East” and in mathematics rather than in the humanities. The basics of the characteristic features of thought East and West are taken from Haas (1956) and will be summarized below. We are fully aware that speaking of East and West is not well taken these days, but given that we know only of ideological concerns related to that geography, we will ignore the prohibition to refer to those cardinal points. The second reason is connected with what Erich Fromm called “the being mode of existence” in Fromm (1976), for we claim that both the philosophy and the mathematics Brouwer envisioned are reflections of a mind in the being mode of existence, and his reason for rejecting classical mathematics and formalism is that he sees them tainted with the having mode of existence. Given the dominance of Greek philosophy in Western philosophy and mathematics (for, as Whitehead noted “The safest general characterization of the European philosophical tradition is that it consists of a series of footnotes to Plato” (Whitehead 1979, p. 39)) and of the having mode of existence in society, a philosophy and a proposal for a reform in mathematics that go against these two pillars were doomed to fail. Making sense of the connection of the two reasons mentioned above with Brouwer’s philosophy and pointing out why that Trojan horse cannot be accepted in the proudest citadel of Western thought, which is mathematics, are, in essence, the aims of this essay.
2 A Very Short Introduction to L. E. J. Brouwer’s Life
Luitzen Egbertus Jan Brouwer was born on February 27, 1881, at Overschie, the Netherlands, the first child of a schoolmaster and his wife. He enrolled at the University of Amsterdam as a 16-year-old, passing his final examination in 1904, the year he also got married to Lize de Holl (after being engaged to her since 1902), 11 years his senior, who had a daughter from a previous marriage. His close friends during that time were poets, not mathematicians: Carel Adama van Scheltema, Jan Lockhorst, and Albert Plasschaert. He went three times, on foot, to Italy and back.
The experience as a mathematics student left him wondering whether mathematics would be the field to pursue, whether it would have a life significance. As happens so often, I began my academic studies as it were, with a leap in the dark. After two or three years, however full of admiration for my teachers, I still could see the figure of the mathematician only as a servant of natural science or as a collector of truths: – truths fascinating by their immovability, but horrifying by their lifelessness, like stones from barren mountains of disconsolate infinity. And as far as I could see there was room in the mathematical field for talent and devotion, but not for vocation and inspiration. (Brouwer 1946a, p. 474)
The lectures of Gerrit Mannoury, a self-taught mathematician, solidified his interest in mathematics, as he no longer felt he was facing a lifeless discipline but rather one intimately connected to life: [. . .] The undertone of Mannoury’s argument had not whispered: ‘Behold, some new acquisitions for our museum of immovable truths’, but something like this: Look what I have built for you out of the structured elements of our thinking. – These are the harmonies I desired to realize.[. . .] Behold the harmonies, neither desired nor surmised, which after the completion [of the construction] surprised and delighted me’. (Brouwer 1946b, p. 475)
In a letter written on September 7, 1906, to Adama van Scheltema, Brouwer gives voice to his feelings of estrangement: Life is a magic garden. With wondrous, softly shining flowers, but among the flowers the gnomes are walking, and I am so afraid of them. They stand on their head and the worst is that they call out to me that I must also stand on my head; once in a while I try to do so and die with shame; but then sometimes the gnomes cry that I do it very well, and that I am indeed a real gnome too. But on no account will I fall for that. (Van Dalen 2013, p. 196)
He got his doctorate in 1907 with a thesis on both foundational topics and Hilbert’s fifth problem. He was appointed a privaatdocent at the University of Amsterdam in 1909. In 1912 he was appointed extraordinary professor at that same university and, a year later, ordinary professor on the chair that had been vacated for him by his doctoral advisor Korteweg. He turned down full professorships from Göttingen and Berlin (“I feel far too much a Dutchman and in particular a Friesian Dutchman. And I would rather live here between Dutch friends to enjoy, and Dutch enemies to see through, than far away among strangers!” (Van Dalen 2013, p. 191), from an interview from Wiessing (1960)). Between 1909 and 1913, he published a series of papers that revolutionized topology. One of the results is Brouwer’s fixed-point theorem: A continuous map $f: D \to D$, where $D$ is the $n$-dimensional closed disk, has a fixed point, i.e., there exists an $x_0 \in D$ such that $f(x_0) = x_0$. After 1913, his only major contribution to topology came in 1939: an intuitionistic proof of the triangulation of any differentiable manifold. Starting with 1913, he returned to the subject of his doctoral dissertation: the rebuilding of mathematics on a sound foundation, intuitionistic mathematics. Appointed to the editorial board of the leading mathematical journal of the time, the Mathematische
Annalen in 1914, at the invitation of Felix Klein, he was removed from that board in 1928 by David Hilbert on fabricated grounds. The editorial board was completely revamped, with Albert Einstein and Constantin Carathéodory, who had resisted Brouwer’s removal, removed as well. Einstein had tried to convince Hilbert that one does not choose one’s personality or one’s health, regardless of the afflicted organs or systems, and that it is in bad taste to make any fuss about such matters. On October 18, 1928, he wrote to Hilbert Mr. Brouwer is an unwilling defender of Lombroso’s theory of the close connection between genius and madness. I cannot understand how you can take this man, who in his country not only enjoys the reputation of a great mathematician but also that of a hopeless troublemaker, so bitterly seriously (Trans. V P.) [. . .] I consider him [Brouwer], with all due respect for his mind, a psychopath and it is my opinion that it is neither objectively justified nor appropriate to undertake anything against him. I would say: ‘Sire, give him the liberty of a jester!’ If you cannot bring yourself to this, because his behaviour gets too much on your nerves, for God’s sake do what you have to do. I, myself, cannot sign, for the above reasons such a letter. (Van Dalen 2013, p. 556) [Herr Brouwer ist ein unfreiwilliger Verfechter von Lombrosos Theorie der nahen Verbindung von Genie und Wahnsinn. Ich kann es nicht begreifen, dass Sie diesem Mann, der in seinem Lande neben dem Rufe eines grossen Mathematikers auch den eines hoffnungslosen Querulanten geniesst, so bitter ernst nehmen koennen. [. . .] Ich betrachte ihn bei aller Hochachtung vor seinem Geiste als einen Psychopathen und halte es weder fuer objektiv gerechtfertigt noch fuer zweckmaessig etwas gegen ihn zu unternehmen. Ihnen moechte ich sagen:,Sire, geben Sie ihm Narrenfreiheit’! 
Wenn Sie dies aber nicht vermoegen, weil sein Verhalten Ihnen zu sehr auf die Nerven gegangen ist, so tun Sie in Gottes Namen, was sie muessen. Ich selbst kann aus besagten Gruenden einen solchen Brief nicht unterzeichnen.] (Rowe and Felsch 2019, p. 278f)
A few days later, on October 23, 1928, Einstein complemented the above with a somewhat different evaluation of Brouwer in a letter to Carathéodory: Furthermore, I admire him not only as a particularly clairvoyant spirit, but also as a forthright and characterful man. (Trans. V. P.) [Im uebrigen aber verehre ich ihn nicht nur als ueberaus hellsehenden Geist, sondern auch als geraden und charaktervollen Menschen.]. (Rowe and Felsch 2019, p. 120)
After his removal from the editorial board of the Mathematische Annalen, Brouwer worked tirelessly to launch a competing journal with an international editorial board. He became, in 1934, a founding editor of Compositio Mathematica, a truly international journal (unlike the Annalen, which, even before Brouwer’s removal, had only two non-German members on its editorial board) – with all German editors resigning in protest in 1936 due to Brouwer’s refusal to purge its Jewish editors – which ceased publication in 1940 due to the German invasion of the Netherlands. When it resumed publication, in 1951, he was removed from the editorial board of his own journal by his colleagues. The effect that episode had on him is expressed in a letter of May 8, 1953, to Mannoury as follows:
I too have been in bed for months during this winter at a relapse of a heart attack, which I suffered in 1950 being suddenly informed of the theft of my journal and the unbelievable means of deceit and fraud used. It is, I believe, through this event which I on the one hand out of self-preservation must try to forget and to which I can, on the other hand, not resign myself, that I have definitely become misanthropic within our national borders. Remarkable, how little philosophy sometimes protects against psycho-somatic reflexes and reactions. (Van Dalen 2013, p. 789) (Dutch original in Van Dalen (2011b, p. 2789))
In 1918 a group consisting of Brouwer; Frederik Van Eeden, a writer and psychiatrist; Jacob de Haan, a Dutch-Jewish writer; and Gerrit Mannoury, a mathematician, invited many renowned figures, such as Rabindranath Tagore, Martin Buber, Giuseppe Peano, and Henri Borel, a sinologist and journalist, to join it, with the aim of “the creation of new words, which form a code of elementary means of communication for the systematic activity of a new and holier society” (Van Dalen 2013, p. 264). The choice of a sinologist was connected with Brouwer’s belief that, unlike Western languages, Eastern languages allow for the expression of a different philosophy (Van Stigt 1990, p. 200). The International Institute for Philosophy saw its founding that year and was dissolved in 1922 to be replaced by the Signific Circle. Buber had declined the invitation, questioning the possibility of a collective creation of words, given that “The creation of a word is for me one of the most mysterious processes of spiritual life” (Van Dalen 2013, p. 262) [Wortschöpfung, Erschaffung des Wortes ist für mich einer der geheimnisvollsten Vorgänge des geistigen Lebens (Van Dalen 2011b, p. 793)]. Brouwer’s reply to Buber is of great interest for our thesis of the Eastern nature of Brouwer’s mindset, as it is one of several places in which he deplores shortcomings in the Western cultural setup. 
The word of the Occident does indeed in various cases have in addition to its material, a spiritual value, but the latter is always subjugated to the first, and where the first one has acquired a more certain and permanent orienting influence on the activity of society, in the sense that it induces isolated individuals, in their pursuit of physical security and material comfort, to hinder each other as little as possible and wherever possible to support each other; the latter lacks any influence on the legal state of affairs (except possibly when it is used itself for the devious realisation of injustice); therefore its influences are weak, fleeting and local. Words which have exclusively a spiritual value and which are suitable for the in- and exhaling of the Weltgeist and for the orientation on the observance of Tao are non-existent in Occidental languages; should they exist, then their influence would be paralysed by the mutual physical hatred, rooted in mutual distrust of the purity of their birth, of people who live too closely together, and which hampers the pursuit of material comfort by the single individual only mildly, but which considerably hampers the in- and exhalation of the Weltgeist. (Van Dalen 2011a, p. 262) [Das abendländische Wort besitzt zwar in mehreren Fällen neben seinem materiellen einen seelischen Wert, aber letzterer ist immer dem ersteren untergeordnet, und während erstere eine sichere und dauerhaft orientierende Wirkung auf die Aktivität der Gemeinschaft erworben hat, in dem Sinne dass es die einzelnen Individuen dazubringt, im Erstreben körperlicher Sicherheit und materiellen Komforts einander möglichst wenig zu hindern und womöglich zu unterstützen, entbehrt letzterer jeden Einflusz auf die Rechtsverhältnisse
(es sei denn insofern er daselbst zur Erschleichung von Unrecht missbraucht wird); demzufolge sind seine Wirkungen schwach, vorübergehend und lokalisiert. Wörter, welche ausschliesslich seelische Werte besitzen und dazu geeignet sind, die Gemeinschaft auf das Ein- und Ausatmen des Weltgeistes und das Innehalten von Tao hin zu orientieren, gibt es in den abendländischen Sprachen nicht; wenn solche existierten, würde übrigens ihre Wirkung lahmgelegt werden durch den wechselseitigen leiblichen Hass der zu dicht bei einander lebenden Menschen, welcher im gegenseitigen Misstrauen gegen die Reinheit ihrer Geburt wurzelt und das Erstreben materiellen Komforts durch die einzelnen Individuen nur unerheblich, das Ein- und Ausatmen des Weltgeistes aber in hohem Masse hindert. (Van Dalen 2011b, p. 792)]
He expressed himself similarly in 1948: Eastern devotion has perhaps better expressed this wisdom than any western man could have done. For instance in the following passages of the Bhagavad-Gita which even in translation have conserved their electrifying power. (Brouwer 1948, p. 486)
Another one who declined the invitation to join the project of word creation was Erich Gutkind, the author of a prophetic, mystical book Siderische Geburt – Seraphische Wanderungen vom Tode der Welt zu Taufe der Tat (Sideric birth – Seraphic wanderings from the death of the world to the baptism of the deed), and a friend of Walter Benjamin and Gershom Scholem. Gutkind “doubted if European languages could simply incorporate or assimilate words of a high spiritual value” (Van Dalen 2013, p. 263) for “words like the Chinese Tao or the Hebrew Torah, so strikingly analogue in meaning and sound, can never be grown on a European foundation and presuppose a radical reform of society, require a higher society” (Van Stigt 1990, p. 200), to which Brouwer replied that “The word cannot wait for the higher form of society, because the higher form of society waits for the word” (Van Dalen 2013, p. 263). Brouwer became a lifelong friend of Erich and Lucia Gutkind. After Brouwer’s first visit of the Gutkinds in Berlin, they wrote to Van Eeden “Brouwer is with us and he will bring you our greetings. We get along wonderfully and we are enormously pleased with him” (Van Dalen 2013, p. 323). By 1946 Brouwer had come to the conclusion that “Buber was right in his denial of the creative power of collective work in this domain” (Van Stigt 1990, p. 196) of creating a language that would, on the one hand, be “purified of rhetoric, subjective, demagogic and in general of deceptively emotional admixtures” and, on the other hand, contain “words for basic immaterial notions which can serve as elements of a more indicative language, suitable to express general mental values and human emotions of fraternity and solidarity” (Brouwer 1946a, p. 452). 
Besides the injustice visited upon him by his removal without any drop of a valid reason from the editorial board of the Mathematische Annalen, Brouwer suffered another blow in 1929, which he related in a letter to Hans Hahn of August 9, 1929: Due to a disturbance in the execution of my travel plan, your letter from Bellagio reached me with a long delay. The disturbance was caused by a great calamity: four days ago my briefcase, which also contained my scientific diary, was stolen from me on the front platform
of a Brussels tram, by a pickpocket, and both the police and the detectives consider the case as hopeless. Since in this diary my collective scientific thoughts and ideas of the last three years, which have largely disappeared from my memory, and of which only a few have already found a registration elsewhere, had been recorded, this event means for my scientific personality a serious personal mutilation, in a way that is like the decapitation (elimination of the central process) for a pine tree. To my amazement, I remain so far, fairly calm under this blow of fate; I believe, however, from certain phenomena, that I have nonetheless suffered a nervous collapse, the consequences of which will perhaps only later become visible, together with a disorganisation of my scientific thoughts. (Van Dalen 2013, p. 600f) [Infolge einer Störung in der Abwickelung meines Reiseplanes hat mich Ihr Brief aus Bellagio mit grosser Verzögerung erreicht. Die Störung wurde von einer grossen Kalamität verursacht: vor 4 Tagen wurde mir meine Brieftasche, welche auch mein wissenschaftliches Tagebuch enthielt, auf dem Vorderbalcon der Brüsseler Strassenbahn von Taschendieben gestohlen, und sowohl Polizei wie Detektive betrachten die Angelegenheit als hoffnungslos. Weil in diesem Tagebuch meine sämtlichen wissenschaftlichen Gedanken und Einfälle der letzten drei Jahren archiviert waren, welche grossenteils meinem Gedächtnis entschwunden sind und von denen nur wenige schon anderweitige Festlegung gefunden haben, so bedeutet das Geschehnis für meine wissenschaftliche Persönlichkeit eine schwere Verstümmelung, gewissermassen dasselbe was für eine Tanne die Enthauptung (Entfernung des Zentralsprosses) bedeutet. 
Zu meiner Verwunderung bleibe ich unter diesem Schicksalsschlag bisher ziemlich ruhig; ich glaube aber nach gewissen Erscheinungen, dass ich trotzdem eine Nervenerschütterung erlitten haben muss, deren Folgen vielleicht erst später, zusammen mit der Desorganisierung meiner wissenschaftlichen Gedankenwelt, ans Licht treten werden. (Van Dalen 2011b, p. 1747)]
Later on, his house caught fire twice, in 1941 and 1944 (Van Dalen 2013, pp. 674, 693), so that Brouwer would write in a letter to De Loor on June 20, 1945, that “as a consequence my scientific archive and with it the manuscripts of my unpublished research have been lost” (Van Dalen 2013, p. 702). If one is looking for the cause of the long mathematical silence that ensued, then it is not Gödel’s theorems that brought it about, but these events. Among the seven doctoral students Brouwer had, two developed intuitionistic mathematics further: Maurits Belinfante and Arend Heyting. The former, a Portuguese Jew, Brouwer’s first intuitionistic follower, who developed intuitionistic analysis, was deported first to Theresienstadt and then in 1944 to Auschwitz, where he was killed. Heyting wrote, at Brouwer’s suggestion, his dissertation on an intuitionistic axiomatic development of projective geometry. Later Heyting developed a formal logical calculus that should capture the essence of intuitionistically valid inferences. Brouwer was “highly pleased” with Heyting’s work. However, as related by Heyting, “He [Brouwer] always maintained that formalization is unproductive, a sterile exercise. He never changed his mind about that” (Van Stigt 1990, p. 290). Developments of intuitionistic mathematics after World War II proceeded along formal-logic lines. In the 1930s, Brouwer served on the municipal council of Blaricum and was involved in several lawsuits and other time-consuming non-mathematical and unpleasant activities that he felt a compulsion to be involved in out of a very strong sense of justice. After the war, he was suspended for some months as a suspected collaborationist, reprimanded, and then re-instated by the University. At the end of 1945, Brouwer
contacted his old friend Erich Gutkind, now living in the United States. Gutkind wrote a letter to Einstein, letting him know that Brouwer had expressed a strong wish to emigrate to the United States. He mentioned “that Brouwer had hidden some Jews in his institute and took care of them” as well as that “I recall that Brouwer travelled a couple of times to Germany to get German mathematicians out of German concentration camps” (Van Dalen 2013, p. 716). Gutkind asked Einstein to help Brouwer find a position in the United States. Einstein refused to help Brouwer, given that he was under investigation for collaboration with the occupier. “There is no doubt that Brouwer suffered under the post-war accusations, – he, one of the leading Dutch scientists, who had always come to the defence of the underdog, was classified as a spineless collaborationist!” (Van Dalen 2013, p. 716). He published a few more papers, one (Brouwer 1952a) showing that his own fixed-point theorem is intuitionistically false but can be rephrased intuitionistically by stating that, given a natural number $n$, one can find a point $x_0$ whose distance to $f(x_0)$ is less than $2^{-n}$. He traveled extensively to give lectures on intuitionism. In 1959 his wife Lize died. In 1966, while crossing the street in front of his house in Blaricum to deliver some presents for Saint Nicholas’ eve to friends across the road, he was run over by three cars. Van Dalen writes that “Somehow he managed to inspire either love and admiration, or hatred and repugnance” (Van Dalen 2013, p. 714). In the first category was the homeopathic physician Ralph Twentyman, who wrote that “the high spiritedness Brouwer brought into any room is one of my most treasured memories. Everything moved up a floor or two when he appeared” (Van Dalen 2013, p. 600f). We should also add that Brouwer was very far from having adopted the European professorial persona. 
He could disappear in the middle of a discussion in the garden of his house, with women present, enter his house, and re-appear naked (as he enjoyed nude open-air baths), continuing the previous conversation, with no one commenting (Van Dalen 2013, p. 645). Or, during a visit to Canada, at 72, he would disappear at a picnic to climb a tree (Van Dalen 2013, p. 8f). “When he stayed with [Georg Henrik] von Wright, he asked his host one night, ‘would you mind if I slept out on the ground tonight?’ Von Wright may have had his doubts about his aged guest, but he saw no reason to refuse the request. And so Brouwer once more was one with nature under the sky, as he loved to be” (Van Dalen 2013, p. 788f). During an interview with Wiessing, translated from Wiessing (1960, p. 143f) in Van Dalen (2013, p. 190f), Brouwer explained succinctly his intellectual journey: Basically my mathematical thinking is non-sensory internal architecture. You may compare these forms of thought to music or poetry. My first inklings of the possibility of such a mathematics emerged, I think, from discussions with my teachers at the time of my HBS [high school] and gymnasium study. But only in my dissertation of 1907 have I started to give these thoughts a definite formulation. Since then this mathematics, nowadays called ‘intuitionistic’, has developed with interruptions. The recognition in professional circles of this work of mine came only in a rather slow tempo, with many ups and downs. It has by no means found general acceptance! Many view it, even now, as charlatanism. There are also people who say that it may be correct, but that it is totally uninteresting and not even new.
If I had not, now and then, written about ‘ordinary mathematics’, I don’t think a place at a Dutch University could have been found for me. [. . .] [. . .] I don’t like mathematics and it basically bores me. Wiessing: What would you, if it comes to that, rather have done than practice mathematics? Brouwer: That is hard to say. Let me say: to have no subject and to let my thoughts roam freely. Every attachment to a subject brings, as you will agree, that your realm of thought suffers a certain mutilation. And it is obvious that then one can only have pleasure in such a profession, if one is, as I sometimes observe with some people, supported and driven by ambition or conviction. But that has never been the case with me. Anyhow, life demands that you choose a profession. Well, then I think that science is for a man like me, who is by nature solitary, not such a bad sanctuary. One is less dependent on the public, and one can more easily preserve one’s solitude, than if one takes up literature or the visual arts, not to mention music. For no matter how much pleasure and satisfaction art by itself may give to a person, society, I think, demands more violating concessions from artists than science does.
It is worth mentioning that Brouwer, who almost single-handedly developed geometric topology and could have continued on that path to certain glory, decided to return to the concerns of his youth (he had hoped, during his topological years, that his topological work could be made intuitionistically sound (see Dubucs 1988; Johnson 1981)), repudiated his topological work, and never lectured on it, although the preoccupation with intuitionism made him appear, in the eyes of many mathematicians, a “charlatan,” as he put it. To give a specific example, which also shows that an admiration for the Bhagavad Gita (see Weil 1991) in two European mathematicians can coexist with very different philosophies of mathematics, André Weil attended Brouwer’s Berlin lectures on intuitionism and wrote to Fréchet on January 31, 1927, that He has declared in his first lecture that the principle of the excluded third is a superstition which is about to disappear. It is a pity that such a remarkable man devotes himself exclusively to such bizarre things. (Van Dalen 2013, p. 595)
Brouwer is one of the very rare cases who not only felt a slight hesitation at the alternative presented in the thought experiment devised by Simone Weil but actually chose to be an object of derision rather than betray the truth. Here is the thought experiment of a manifest counter-modern, one who believed in truth: If one were to propose to all those whose profession is to think, priests, pastors, philosophers, writers, scholars, professors of all kinds, the choice, from the present moment on, between two destinies: either to sink immediately and definitively into idiocy, literally, with all the humiliations that such a collapse entails, and keeping only enough lucidity to feel all its bitterness; or a sudden and prodigious development of intellectual faculties, which ensures them an immediate global celebrity and glory after death for millennia, but with the inconvenience that their thought would always stay a little outside of the truth; can one believe that a lot of them would experience for such a choice even a slight hesitation? [Trans. V. P.] [Si l’on proposait à tous ceux qui ont pour profession de penser, prêtres, pasteurs, philosophes, écrivains, savants, professeurs de toute espèce, le choix, à partir de l’instant
présent, entre deux destinées: ou sombrer immédiatement et définitivement dans l’idiotie, au sens littéral, avec toutes les humiliations qu’un tel effondrement entraine, et gardant seulement assez de lucidité pour en éprouver toute l’amertume; ou un développement soudain et prodigieux des facultés intellectuelles, qui leur assure une célébrité mondiale immédiate et la gloire après la mort pendant des millénaires, mais avec cet inconvénient que leur pensée séjournerait toujours un peu en dehors de la vérité; peut-on croire que beaucoup d’entre eux éprouveraient pour un tel choix même une légère hésitation?]. (Weil 1949, p. 328)
Needless to say, this very short introduction does not relieve the reader, in particular the reader who thinks negatively about Brouwer, of the need to consult Van Dalen (2013) or Van Dalen (1999a) and Van Dalen (2005), Van Dalen (2011a), and Van Dalen (2011b) to get whatever picture some footprints on a sandy beach can reveal about a life. Unsurprisingly, both the reviews for Mathematical Reviews and for Zentralblatt für Mathematik (Dauben 2000; 2007; Soifer 2007) find that Van Dalen was too kind to Brouwer, for they would have seen some facts in a significantly more detrimental light. The much more substantial reviews (Smoryński 2015; Johnson 2014; Kaneko 2002b) are free from that kind of critique. While praising Van Dalen’s monumental achievement, Gray (2015, p. 134) finds that “there is a slight feeling of sympathy for Brouwer that the reviewer does not share, being perhaps less enamoured of intuitionism” and that there was a solipsistic streak in him that strengthened his inability to get people on his side, that elevated a reasonable but by no means secure philosophical opinion to one of inflexible rectitude, that kept him illiberal and unable to see that his allies were increasingly rightwing in toxic times, and that made him capable of cruelly selfish behaviour. (Gray 2015, p. 134)
Gray is here very selective in the decision regarding who Brouwer’s “allies” were, thinking of the National-Socialists Ludwig Bieberbach and Theodor Vahlen, but forgetting his friends with far-left leanings, the communists Henri Wiessing (Van Dalen 2013, p. 281) and Gerrit Mannoury (Van Dalen 2013, p. 513), and devotes significant space in his review to the post-war accusations against Brouwer, which were, to a careful reader of Van Dalen (2013), without any basis in fact, or rather a dispute about the most appropriate resistance strategy. It was by no means clear what that strategy should be, as a reader of Hermans (1949) would come to realize. Equating accusations with guilt would also be ill-advised in an atmosphere in which, even as late as 1963 – not 1945 – Miep Gies would be considered by the police “one of the suspected people” for the betrayal on August 4, 1944, of the Frank family (Gies and Gold 1987, p. 250) on the basis of her having been born in Vienna. The “illiberal” part is even more mysterious, given that Brouwer was in 1935 and 1939 elected to the council of the town Blaricum, with a significant artist population. What was he advocating for? The preservation of the local heath (a nature reserve), [. . .] better access for walkers while keeping out cars [. . .] a raise for the local police [. . .], added improvements to a plan for the construction of a cycle path, [. . .] the improvement of the soccer field. [. . .] When [. . .] the committee of action ‘For God’ demanded the barring of the periodical of the freethinkers
from the public library, Brouwer remarked that according to the state’s instructions all groups must be represented. (Van Dalen 2013, p. 641)
And do these thoughts sound like those of an “illiberal” person? Power over fellow-creatures will be avoided. Firstly because one would get mixed up with limitations of other people’s liberty of action. And secondly because those fellow-creatures are part of the reflex image held out to mind from its deepest home, therefore have to be respected, and must not be judged, let alone condemned, despised or rejected, even if they are enemies to be fought against. (Brouwer 1948, p. 486)
If one were to think that van Dalen’s biography has cleared the air of the cobwebs of lies that have been spun around Brouwer, one must have forgotten that, as Thomas Francklin wrote: Falsehood will fly, as it were, on the wings of the wind, and carry its tales to every corner of the earth; whilst truth lags behind; her steps, though sure, are slow and solemn, and she has neither vigour nor activity enough to pursue and overtake her enemy. (Francklin 1787, Sermon XI (On Vigilance), p. 233)
Defamations, heard through the grapevine, are passed on, regardless of the absence of any mention of anything remotely resembling the accusation in the extensive correspondence related to the Mathematische Annalen affair and picked up in AD 2021 in Budiansky (2021, p. 82) (where Brouwer becomes a “committed Aryan nationalist”) and Misak (2021) (where Brouwer is mentioned as “a budding Nazi collaborator”) as facts, with The dispute took on a particular nuance in the sense that the Dutchman Brouwer set himself up as a champion of Aryan Germanness. Consequently Hilbert removed him from the editorial board of the Mathematische Annalen after he objected to what he felt were too many Eastern European Jewish (“Ostjuden”) authors. (Fraenkel 2016, p. 137)
as source.
3 East and West, to Have or to Be
3.1 Haas on the Main Characteristic of the Mind East and West
To substantiate why an approach from an “Eastern” position to mathematics is unacceptable to the philosophical establishment, we need to define what we mean by the distinction between East and West in philosophy. We think that the following quotations from Haas (1956) will give an idea of what is meant by these different fundamental approaches. However, the decisive act which constitutes Eastern mind and civilization – the positing or the fixing of the subject – implies the idea that an existence entirely separated from and
independent of the subject cannot be conceived. So what becomes in the Western mind and civilization the autonomous object, is paralleled in the East by the other which is coherent with and held in existence by the subject. (Haas 1956, p. 114) But there lives on in the world concept and philosophy of the East a definite aversion to those pairs of opposites which dominate the history of Western thought – the living and the inanimate, mind and matter, soul and body, the conscious and the unconscious. And the absence of these antitheses is the expression of the closeness in which the other despite its otherness is held to the subject.[. . .] Being a form of the mind the subjective attitude is not restricted to any field of existence. It is active everywhere [. . .] Its most penetrating consequence is its denial to the external world of an existence in its own right. Such an objective existence, independent of the perception of the subject, seems to the Asiatic mind absurd and paradoxical – a monstrous idea. (Haas 1956, p. 116) On the other hand, radical idealism with its theory of the illusory character of the external world is not innate in, let alone characteristic of, Asia. It belongs as do so many other extremist theories to the West. (Haas 1956, p. 117) The East takes opposites as they are immediately experienced from life and nature. Thus while the principle of the contraries is recognized, they are left in their original context and to their original operation. For this reason they have been interpreted by contemplative thought as polarities whose contrariety, which is more apparent than real, conceals a deeper meaning of balance and harmony. (Haas 1956, p. 145) The structures of the Eastern mind leads it to regard reality as somehow co-ordinated and akin to the subject. The innate tendency is to bridge and diminish the distance between reality and the subject. (Haas 1956, p.
130) Ignoring the objectifying process of thought, the East dismisses the outer world as the all important object of speculation. Instead it stresses the necessary relationship of the Real with the subject. This of course, radically changes the position and meaning of the term subject.[. . .] the subject, for the Easterner, is relative not to the object, that is not to an objective world as the essence of reality, but to an entity close to him [. . .]. (Haas 1956, p. 131) The history of the Western mind in general and of its philosophical thought in particular is characterized by two great conceptions. Neither assumed a prominent role in the East. The one is mathematics – the other the idea of evolution. Important as the invention and knowledge of mathematics are in China and particularly in India, the East cannot claim anything comparable to the systematic development of mathematical theory and applied mathematics in the West. (Haas 1956, p. 150) In the history of Western philosophy the importance of mathematical thought on the one hand, and the idea of evolution on the other, is vividly illustrated by the fact that with few exceptions the most important philosophers can be grouped as those who were inspired by mathematical thinking and those who follow the line of the idea of evolution. The first group is represented by men like Pythagoras, Plato, Descartes, Leibniz and Kant while the other can claim the names of Heraclitus, Aristotle and Hegel. If philosophical mysticism is credited with having vaulted this chasm, it did so at the price of ignoring the structure of the Western mind as well as its premises and aspirations. In this fashion it succeeded in avoiding the issue. It is no mere coincidence that the two outstanding representatives of this are to be found at the periphery of Western civilization – Plotinus, who was born in Egypt and inspired by the currents of Eastern thought, and Spinoza, of Jewish descent. (Haas 1956, p. 
151f) Eastern cognition is interested in consciousness itself. Western cognition is interested in the objects of consciousness. (Haas 1956, p. 167) In the mind of the East the subject holds the predominant place occupied by the object in the West (Haas 1956, p. 180). Reduced to its essence Eastern knowledge is a form of being, a state of consciousness, that is lucid and self-sufficient (Haas 1956, p. 182).
Western knowledge is a form of having (Haas 1956, p. 182). At the height of his fulfillment, the man of the East stands utterly alone (Haas 1956, p. 232).
James P. Allen, an Egyptologist, expresses a similar view on the nature of Eastern cognition in (Allen 1988, p. ix): We have divorced philosophy, as a discipline, from religion. In the former we appreciate reality objectively, as something capable of study; in the latter we understand it subjectively, as something that can only be experienced. This dichotomy did not govern ancient Near Eastern thought. To it all appreciation of reality was subjective – “I-Thou” rather than “I-It.”
One should not conclude from the above that there is no room in the Eastern mind for forms of realism. The Nyāya (one of the six orthodox schools of Hinduism) believed “in the external existence and mind-independence, not just of the ordinary objects of daily experience, but also of the universals or properties which reside in those objects. They combined this ‘metaphysical realism’ with a correspondingly realist or externalist approach towards cognitive and semantic content” (Ganeri 1996, p. 111f). This is precisely the kind of view repudiated by Brouwer and the Buddha. While several schools of Indian philosophy embrace ananta (infinity) – Patañjali recommends meditating on ananta, for the Jain “the theory of multiplexity of reality (anekānta-vāda) rests on [. . .] the conviction that things relate to each other by an infinite number of relations” (Balcerowicz 2001, p. 379) – both Brouwer and José Ortega y Gasset allow no place for actual infinity in mathematics, the latter, who also rejected the use of the principle of the excluded middle in mathematics, insisting that mathematicians should “focus only on objects that are ‘immediately present’ for the human intuition” (Rabi 2016, p. 65) and that “the infinite cannot be immediately present for the human intuition” (Rabi 2016, p. 68).
3.2 Fromm on to Have or to Be
Erich Fromm, a radical-humanistic psychoanalyst, has described in Fromm (1976) two basic character orientations, which he calls “the having mode of existence” and “the being mode of existence.” The former is the dominant mode of existence in contemporary society. Our claim is that the spirit of Brouwer’s philosophy – “the spirit out of which the work takes shape” [“der Geist, aus dem das Werk geschieht” (Eckhart 1934, p. 204)] – is that of the latter. Fromm cites from the works or sayings of Jesus, the Buddha, Meister Eckhart, Spinoza, and Albert Schweitzer, to which he refers as “Masters of Living,” to circumscribe the being mode of existence. Fromm was decidedly counter-modern, for he considered, much like Spinoza – for whom “mental health is, in the last analysis, a manifestation of right living; mental illness, a symptom of the failure to live according to the requirements of human nature” (Fromm 1976, p. 78) – that there exists a “human nature” and wrote about the “pathology of normalcy”; in other words, unlike standard psychiatry, he
found that being adapted to a society that he found insane and acting in its spirit is not a sign of sanity. Although he grew up in Frankfurt in the twentieth century, in reality he grew up, as he put it in interviews, in a “medieval atmosphere, in which everything is dedicated to traditional learning” (Friedman 2013, p. 4). Moreover: Between 1916 and 1921, Fromm actively studied topics linked to the Hebrew Bible in a small group, led by Rabbi Nehemia Nobel in Frankfurt, that included Martin Buber, Leo Baeck, and Gershom Scholem. (Friedman 2013, p. xxxiii)
Here are a few quotations from Fromm (1976) that refer to themes that occur in Brouwer’s philosophy as well. Optimum knowledge in the being mode is to know more deeply. In the having mode it is to have more knowledge. (Fromm 1976, p. 34) Having refers to things and things are fixed and describable. Being refers to experience, and human experience is in principle not describable. What is fully describable is our persona – the mask we each wear, the ego we present – for this persona is in itself a thing. In contrast, the living human being is not a dead image and cannot be described like a thing. In fact, the living human being cannot be described at all. Indeed, much can be said about me, about my character, about my total orientation to life. This insightful knowledge can go very far in understanding and describing my own or another’s psychical structure. But the total me, my whole individuality, my suchness that is as unique as my fingerprints are, can never be fully understood, not even by empathy, for no two human beings are entirely alike. Only in the process of mutual alive relatedness can the other and I overcome the barrier of separateness, inasmuch as we both participate in the dance of life. Yet our full identification of each other can never be achieved. (Fromm 1976, p. 71) Language is an important factor in fortifying the having orientation. The name of a person – and we all have names (and maybe numbers if the present-day trend toward depersonalization continues) – creates the illusion that he or she is a final, immortal being. The person and the name become equivalent; the name demonstrates that the person is a lasting, indestructible substance – and not a process. Common nouns have the same function: i.e., love, pride, hate, joy give the appearance of fixed substances, but such nouns have no reality and only obscure the insight that we are dealing with processes going on in a human being. 
But even nouns that are names of things, such as “table” or “lamp,” are misleading. The words indicate that we are speaking of fixed substances, although things are nothing but a process of energy that causes certain sensations in our bodily system. But these sensations are not perceptions of specific things like table or lamp; these perceptions are the result of a cultural process of learning, a process that makes certain sensations assume the form of specific percepts. We naively believe that things like tables and lamps exist as such, and we fail to see that society teaches us to transform sensations into perceptions that permit us to manipulate the world around us in order to enable us to survive in a given culture. Once we have given such percepts a name, the name seems to guarantee the final and unchangeable reality of the percept. (Fromm 1976, p. 67) If my self is constituted by what I have, then I am immortal if the things I have are indestructible. (Fromm 1976, p. 67)
It should be noted that the having mode of existence is by no means restricted to the possession of objects. As part of the Four Noble Truths, tṛ́ṣṇa (Sanskrit for desire or thirst), a cause of dukkha (unsatisfactoriness, suffering), is not primarily directed to material things.
It is important to keep in mind that for Buddhism, the problem is not desire for material things; more fundamentally, desire is a thirst for ideas, theories (dṛṣṭi), since it is these that give us a sense of identity and define the values we impute to material things, such that we then find them desirable. [. . .] Those ideas, theories (dṛṣṭi) are part of the play of a linguistic-cognitive web of closure (prapañca, āsava, anuśaya, saṃvṛti, etc.). (Lusthaus 2002, p. 61)
Similarly, for upādāna (appropriation, grasping, clinging, attachment), the result of tṛ́ṣṇa, we have: We appropriate in order to expand, strengthen and affirm our selves. Self-definition, self-identification arises through appropriation. What is other either becomes mine, or else it serves as the boundary marker that circumscribes my limits, and thus what I am. The ‘more’ I possess, the ‘greater’ I am. (Lusthaus 2002, p. 65)
4 Language
An important element of Brouwer’s philosophy is his position with respect to language, a position that changed little throughout his life. It is also a major source of misunderstanding by the “bland and vapid views that have dominated the philosophy of mathematics,” for it has been read in an absolute, un-nuanced sense. It is that misunderstanding that brands Brouwer with the “solipsist” label. We will see that, even though some might actively dislike it (Posy’s “bugaboo”), it is by no means an idiosyncratic idea that sprang out of Brouwer’s head and is not encountered in other reasonable minds. That his views on language are of high relevance to his philosophy of mathematics and to his rejection of formalism and of logicism can be best read from his own explanation of the “first act of intuitionism”: The first act of intuitionism completely separates mathematics from mathematical language, in particular from the phenomena of language which are described by theoretical logic, and recognizes that intuitionist mathematics is an essentially languageless activity of the mind having its origin in the perception of a move of time, i.e. of the falling apart of a life moment into two distinct things, one of which gives way to the other, but is retained by memory. If the two-ity thus born is divested of all quality, there remains the empty form of the common substratum of all two-ities. It is this common substratum, this empty form, which is the basic intuition of mathematics. (Brouwer 1952b, p. 509f)
His views can be summarized as follows: (i) language originates in will-transmission, in commands given to others, and is thus of questionable morality, as its original aim is to “break the will of others,” making others obey; (ii) it is not, on its own, a means of communication, but in rare instances it could become an aid in a communication that has been established by non-verbal means; (iii) language is very inadequate at expressing the inner life of a human being; (iv) language does not have the miraculous power of conjuring a life-like world, and it does not have a descriptive function in which it has shaken off the original sin of its birth as will-transmission; (v) even in the very restricted field of mathematics, language cannot
express faithfully the creations of a human subject, as these are living processes. Of these, (i), (iv), and (v) are insights that cannot be found expressed in such stark terms by other thinkers who find language inadequate. There is one mathematician who, after Brouwer, made a (v)-like statement. René Thom, another mathematician-philosopher, coming from an entirely different philosophical perspective, namely, that of mathematical Platonism, wrote in Thom (1970) (see also Papadopoulos 2018) that the world of ideas infinitely exceeds our “technical possibilities.” It is in the intuition that the ultima ratio of our faith in the truth of a theorem resides. And, according to a now-forgotten etymology, a theorem is above all the object of a vision. (Thom 1970, p. 697) [Car le monde des Idées excède infiniment nos possibilités opératoires, et c’est dans l’intuition que réside l’ultima ratio (la raison dernière) de notre foi en la vérité d’un théorème – un théorème étant, selon une étymologie aujourd’hui bien oubliée, l’objet d’une vision.]
Already in Brouwer’s earliest writing of 1905, the pamphlet Life, Art, and Mysticism, we read: The immediate companion of the intellect is language. From life in the intellect follows the impossibility of any form of direct communication with others – instinctively by gesture or looks, or even more spiritually through all separation of distance. People therefore start training themselves and their offspring in some crude sign language, painfully and with little success, for never has anyone been able to communicate with others, soul to soul. Language can only be the accompaniment of an already existing mutual understanding. [. . .] Only in the very narrowly restricted domains of the imagination such as in the exclusively intellectual sciences – which are completely separated from the world of perception and therefore touch the least upon the essentially human – only there can mutual understanding be maintained for some time. There is little scope for misunderstanding notions such as “equal” and “triangle,” but even then two different people will never feel them in exactly the same way. Even in the case of the most restricted sciences, logic and mathematics – a sharp distinction between these two is hardly possible – no two different people will have the same conception of the fundamental concepts on which these two sciences are constructed; and yet their wills are parallel, and in both there is a small, unimportant part of the brain which forces their attention in the same way. (Brouwer 1996, p. 401) But ridiculous is the use of language when one tries to express subtle nuances of will which are not part of the living reality of those concerned, when for example so-called philosophers or metaphysicians discuss among themselves morality, God, consciousness, immortality, or the free will. These people do not even love each other, let alone share the same subtle movements of the soul; sometimes they even do not know each other personally. 
They either talk at cross-purposes or each builds his own little logical system which lacks any connection with reality. For logic is life in the human brain; it may accompany life outside the brain but it can never guide it by virtue of its own power. (Brouwer 1996, p. 401f) Words are no more than commando-signals [. . .] means of training and controlling All verbal utterances are more-or-less developed verbal imperatives, i.e. speaking can always be reduced to commands or threats, and understanding to obeying. (Van Stigt 1990, p. 197, unpublished papers) Language can accompany man’s will to dominate the will of others or his will to keep the movements of wills together; for example, the war cry of Red Indians accompanies the will to break the will of others.
Language by itself has no meaning; any philosophy which searched for a firm foundation based on that presumption has come to grief; lulled into sleep by the assurance of such firm foundation, one was rudely awakened by the appearance later of deficiencies and contradictions. A language which does not derive its certainty from the human will but which claims to live on in the ‘pure concept’ is an absurdity. (Brouwer 1996, p. 403) The practical reliability of the logical principles relies on the fact that a significant part of the world of intuition shows much more fidelity and satisfaction with respect to its finite organization than humanity itself. The fact that one has been, from time immemorial, blind to this sober interpretation, has been caused by the fact that the exclusive character of words as a means of will-transmission had not been recognized, and these were seen as a result of a rash superstition as suggestive means of fetish-like ‘concepts’. These ‘concepts’ as well as the links existing between them, are supposed to have an existence independently of the causal stance of man, and the logical principles should represent the concepts and the a priori laws governing their connections. [Trans. V P.] [Die praktische Zuverlässigkeit der logischen Prinzipien beruht darauf, dass ein großer Teil der Anschauungswelt in bezug auf ihre endliche Organisation viel mehr Treue und Zufriedenheit zeigt als die Menschheit selbst. Dass man von alters her vor dieser nüchternen Interpretation blind war, wurde dadurch verursacht, dass man den ausschließlichen Charakter der Worte als Willensübertragungsmittel nicht erkannte und dieselben infolge eines unbesonnenen Aberglaubens als Andeutungsmittel fetischartiger ‘Begriffe’ betrachtete. 
Diese ‘Begriffe’ sowie die zwischen ihnen bestehenden Verknüpfungen sollten unabhängig von der kausalen Einstellung des Menschen eine Existenz besitzen, und die logischen Prinzipien sollten die Begriffe und ihre Verknüpfungen beherrschende aprioristische Gesetze darstellen. (Brouwer 1929, p. 423)] In default of a plurality of mind, there is no exchange of thought either. Thoughts are inseparably bound up with the subject. So-called communicating-of-thoughts to somebody, means influencing his actions. Agreeing with somebody, means being contended with his cooperative acts or having entered into an alliance. [. . .] By so-called exchange of thought with another being the subject only touches the outer wall of an automaton. This can hardly be called mutual understanding. Only through the sensation of the other’s soul sometimes a deeper approach is experienced. And when wisdom revealed by the beauty of this sensation, finds expression in the antiphony of words exchanged, then there may be mutual understanding. (Brouwer 1948, p. 485)
These quotes encourage analytic philosophers to heap the usual verbal abuse, such as “solipsist,” on Brouwer. This is a thorough misunderstanding of what Brouwer is writing against. What he writes against is a language that is viewed as possessing self-existence, as having content independently of the human mind, and as capable of transmitting that content. The rather mainstream Western view that language does have these miraculous powers is expressed by Franz Brentano, Gottlob Frege, Nicolai Hartmann, and Karl Popper. In Hartmann (1964), Hartmann assigns to the spiritual (das Geistige) its own “real stratum” in the composition of the world, one which cannot be reduced to subjective consciousness. Subjective is only the carrying out of the act of thinking, the content of the thought itself is the spiritual content that is detachable from the subject and transferable, a fact without which communication between humans would not be possible. And Popper, whose engagement with Brouwer’s thought is surveyed in (Naraniecki 2015), finds that
we may distinguish the following three worlds or universes: first, the world of physical objects or physical states; secondly the world of states of consciousness or of mental states, or perhaps of behavioural dispositions to act; and thirdly the world of objective contents of thought, especially of scientific and poetic thoughts and of works of art. (Popper 1967, p. 106)
Popper finds that his “third world” has “more in common still with Bolzano’s theory of a universe of propositions in themselves and of truths in themselves” and “resembles most closely the universe of Frege’s objective contents of thought” (Popper 1967, p. 106). He argues for “the independent existence of the third world” (Popper 1967, p. 107). “Theories or propositions or statements are the most important third-world linguistic entities” (Popper 1968, p. 157). “The activity of understanding consists, essentially, in operating with third-world objects” (Popper 1968, p. 164). The Buddha finds that such a separation of the contents of the second and third worlds is impossible: Feeling, perception, and consciousness, friend – these states are conjoined, not disjoined, and it is impossible to separate each of these states from the other to describe the difference between them. For what one feels, that one perceives, and what one perceives, that one cognizes. Wisdom and consciousness, friend – these states are conjoined, not disjoined, and it is impossible to separate each of these states from the other to describe the difference between them. For what one wisely understands, that one cognizes, and what one cognizes, that one wisely understands. [. . .] wisdom is to be developed, consciousness is to be fully understood. (Nanamoli and Bodhi 1995, Mahāvedalla Sutta, pp. 388–389 (PTS M i 292–293))
While the Buddha mentions a “theory” that does not depend on human discovery, namely, the doctrine of dependent arising (or dependent origination) (see Kalupahana 1975), its content, when clothed in language, is subject to the vicissitudes that visit any language-composed doctrine practiced by humans. It is due to precisely the historical destiny of all such doctrines that he prophesied the demise of the dharma in India (see Lamotte (1976, p. 210f) for more details). And what, bhikkhus, is dependent origination? “With birth as condition, aging-and-death [comes to be]”: whether there is an arising of Tathāgatas or no arising of Tathāgatas, that element still persists, the stableness of the Dhamma, the fixed course of the Dhamma, specific conditionality. A Tathāgata awakens to this and breaks through to it. (Bodhi 2000, II.25, Nidānasaṃyutta, p. 551)
As pointed out by David Kalupahana, language, as understood by the Buddha, does not work on its own, in the absence of what Brouwer calls “an already existing mutual understanding.” Language cannot be a means of communication unless the language users agree upon the various elements of a language as being expressive of their experiences. Thus, agreement or concurrence (sammuti), whether it be of a small group or of the world (loka) at large, is an
important characteristic of language. The use of the term sammuti, which implies agreement in the sense of “thinking together,” makes language not merely a behavioural phenomenon but a psychological one as well. (Kalupahana 1999, p. 52)
The “propositions in themselves,” the “truths in themselves,” the “objective contents of thought,” and, in general, “third-world objects” claim a svabhāva existence (self-nature in Kalupahana’s translation, own being in Robinson’s translation), which is not only found to be contradictory in Chap. XV (Examination of Self-Nature) of Nāgārjuna’s Mūlamadhyamakakārikā (for the extent to which mādhyamika can be seen as a critique of language, see Napper (1989, pp. 75–100)): Svabhāva is by definition the subject of contradictory ascriptions. If it exists, it must belong to an existent entity, which means that it must be conditioned, dependent on other entities, and possessed of causes. But a svabhāva is by definition unconditioned, not dependent on other entities, and not caused. Thus the existence of a svabhāva is impossible. (Robinson 1957, p. 299)
but is also considered by the Buddha and his followers to be the origin of all mental confusion: The root of all the troubles of cyclic existence and solitary peace is the ignorance conceiving true existence, as well as its predispositions. (Gyatso 1990, p. 78)
If I think in terms of “contents of thought,” this objectifying process allows ownership of that “content.” If thoughts are fleeting by their very nature, are inseparably bound up with the subject, and disappear with the subject, then they are of no value in the having mode of existence. Similar thoughts on the inadequacy of language were expressed not just by Erich Fromm but also by a wide variety of twentieth-century thinkers. Here is what the founder of “general semantics” had to say on the matter: Words are not the things they represent. (7) The above [. . .] considerations lead to unexpectedly far-reaching consequences. From (7) – it follows that the objective levels which include the events, ordinary objects, objective actions, processes, immediate feelings, ‘instincts’, ‘ideas’, [. . .] represent un-speakable levels, are not words. (9) From (9) – that the use of the ‘is’ of identity, as applied to objective, un-speakable levels, appears invariably structurally false to facts and must be entirely abandoned. Whatever we might say a happening ‘is’, it is not; (10) From (10) – structure appears as the only possible link between the objective, unspeakable, and the verbal levels. (11) From (11) – the only possible ‘content of knowledge’ becomes exclusively structural. (12) From (12) – the only aim of ‘knowledge’ and science appears as the empirical search for, and verbal formulation of, structure. The only method for acquiring ‘knowledge’ is found in an empirical investigation of the potentially unknown structure of the world, ourselves included, only afterwards adjusting the structure of languages so that they would be similar, and so of maximum usefulness;
instead of the delusional reversed order of ascribing to the world the structure of an inherited primitive language. (Korzybski 1931, p. 751f)
Although Krishnamurti does not explicitly refer to the will-transmission character of language, it is implicit when he admonishes his audiences repeatedly that the speaker, although on a podium, tries to figure out the answer to a question with his audience, that all following of someone else’s advice on matters of wisdom, life, love, ethics, or meditation is counter-productive, for only oneself can discover for oneself the truth of those, that all else is “conditioning.” “There is no right guru. There is only wrong guru, because nobody can teach you anything except for yourself. [. . .] nobody can teach you what you are” (Krishnamurti and Bohm 1983). He also sees the will-transmission component when language is not even used between two people but rather in a monologue, as in the traditional approaches to “meditation”: Who is meditating? In most systems of meditation, Japanese, Indian, Tibetan, and so on, there is the controller and the controlled. The controller tries to control thought, quieten thought, shape it according to a purposeful direction. [. . .] The speaker has gone into this for the last sixty years or more. He has discussed this question with Zen pundits, with Hindus, Tibetans and all the rest of the gang, and he refutes that kind of meditation because their idea of meditation is to achieve an end. The end being complete control of the brain so that there is no movement of thought. Because when the brain is still, deliberately disciplined, deliberately sought after, it is not silent. It is like achieving something, which is the action of desire. (Krishnamurti 1984)
Krishnamurti is also very clear about the inadequacy and the limitations of language: Understanding is not verbal, nor is there such a thing as intellectual understanding. Intellectual understanding is only on the verbal level, and so no understanding at all. Understanding does not come as a result of thought, for thought after all is verbal. (Krishnamurti 1956, p. 194) Sir, love is not a word; the word is not the thing, is it? God is not the word “god,” love is not the word “love.” But you are satisfied with the word, because the word gives you a sensation. When somebody says “God,” you are psychologically or nervously affected, and that response you call the understanding of God. So, the word affects you nervously and sensuously, and that produces certain action. But the word is not the thing, the word “god” is not God; you have merely been fed on words, on nervous, sensuous responses. Please see the significance of this. How can you act if you have been fed on empty words? For words are empty, are they not? They can only produce a nervous response, but that is not action. Action can take place only when there is no imitative response, which means the mind must enquire into the whole process of verbal life. (Krishnamurti 1991, Bangalore fourth public talk July 25, 1948) Can we say the word is not the thing? Whatever the description, it is not the real, not the truth, however much you embellish or diminish it. We recognize that the word is not that, then what is there beyond all this? Can my mind be so desireless that it won’t create an illusion, something beyond? (Krishnamurti and Bohm 1999, p. 121) Naming only strengthens and gives continuity to the experiencer, to the desire for permanency, to the characteristic of particularizing memory. There must be silent awareness
of naming, and so the understanding of it. We name not only to communicate, but also to give continuity and substance to an experience, to revive it and to repeat its sensations. This naming process must cease, not only on the superficial levels of the mind, but throughout its entire structure. This is an arduous task, not to be easily understood or lightly experienced; for our whole consciousness is a process of naming or terming experience, and then storing or recording it. It is this process that gives nourishment and strength to the illusory entity, the experiencer as distinct and separate from the experience. Without thoughts there is no thinker. Thoughts create the thinker, who isolates himself to give himself permanency; for thoughts are always impermanent. (Krishnamurti 1956, p. 69)
Here is what Albert Schweitzer wrote on the possibility of understanding one another: None of us can claim that we really know another person, even if we have lived together with that person every day for years. Of what makes up our inner experience, we cannot communicate even to our closest others more than fragments. We can neither give the entirety of that experience nor could the others grasp it. We walk in a half darkness, in which neither can recognize the other’s features accurately. Only from time to time, through an experience we had with our companion, or through a word that falls between us, does the other stand next to us for a moment, as if illuminated by a lightning bolt. That’s when we see him as he is. Afterwards we’ll go again, perhaps for a long time, side by side in the dark and try in vain to imagine the other’s features. (Trans. V. P.) [Keiner von uns darf behaupten, daß er einen andern wirklich kenne und wenn er seit Jahren täglich mit ihm zusammen lebt. Von dem, was unser inneres Erleben ausmacht, können wir auch unseren Vertrautesten nur Bruchstücke mitteilen. Das Ganze vermögen wir weder von uns zu geben, noch wären sie imstande, es zu fassen. Wir wandeln in einem Halbdunkel, in dem keiner die Züge des anderen genau erkennen kann. Nur von Zeit zu Zeit, durch ein Erlebnis, das wir mit dem Weggenossen haben, oder durch ein Wort, das zwischen uns fällt, steht er für einen Augenblick neben uns, wie von einem Blitze beleuchtet. Da sehen wir ihn, wie er ist. Nachher gehen wir wieder, vielleicht für lange, im Dunkel nebeneinander her und suchen vergeblich, uns die Züge des andern vorzustellen. (Schweitzer 1974, p. 385f)]
And here is what two neurobiologists wrote about communication by means of language (one of them, Francisco Varela, was a co-founder of the Mind and Life Institute to promote dialog between science and Buddhism, so guilty by association of accepting the validity of the Eastern approach): So long as language is considered to be denotative it will be necessary to look at it as a means for the transmission of information, as if something were transmitted from organism to organism, in a manner such that the domain of uncertainties of the ‘receiver’ should be reduced according to the specifications of the ‘sender’. However, when it is recognized that language is connotative and not denotative, and that its function is to orient the orientee within his cognitive domain without regard for the cognitive domain of the orienter, it becomes apparent that there is no transmission of information through language. It behooves the orientee, as a result of an independent internal operation upon his own state, to choose where to orient his cognitive domain; the choice is caused by the ‘message’, but the orientation thus produced is independent of what the ‘message’ represents for the orienter. In a strict sense then, there is no transfer of thought from the speaker to his interlocutor; the listener creates information by reducing his uncertainty through his interactions in his cognitive domain. Consensus arises only through cooperative interactions in which the
resulting behavior of each organism becomes subservient to the maintenance of both. (Maturana and Varela 1980, p. 32)
And here is a quote from Henri Poincaré about language, quoted at the end of Brouwer (1913): People do not understand each other because they do not speak the same language and because there are languages that cannot be learned. (Trans. V. P.) [Les hommes ne s’entendent pas, parce qu’ils ne parlent pas la même langue et qu’il y a des langues qui ne s’apprennent pas. (Poincaré 1913, p. 161)]
All of Brouwer’s utterances on language are taken literally and absolutely by Brouwer’s detractors, who deliberately refuse to take notice of the fact that the same person who wrote dismissively about the capabilities of language to provide exact communication by means of “objective contents of thought” and wrote about its inherent limitations, was involved in the significs project and was a master of a highly evocative language, as related in a recollection of Mannoury’s daughter. Mrs. Vuysje, the daughter of Mannoury, described Brouwer’s conversation as an almost artistic process: Brouwer could in the middle of a sentence pause to savour a particular expression, and to replace it after consideration by a more suitable one. She remembered him, standing in the garden stretching out his hand, tasting a particular word and contemplating fitting equivalents. The result was not a faltering flow of words; Brouwer had mastered the technique of refinement to such a degree that he managed to produce gentle, flowing sentences of incredible length without discernible interruptions. (Van Dalen 2013, p. 263)
Not even in this area is Brouwer given his due. Reviewing Van Stigt (1990), Smoryński mentions that he could not make any sense of Van Stigt’s remark that “Brouwer lacked the ability to express his feelings artistically,” for it “certainly runs counter to what every Dutchman I’ve spoken to has said of Brouwer’s correspondence with Adama van Scheltema: Brouwer’s prose is uniformly favourably compared with that of the poet” (Smoryński 1994, p. 802). There are, however, mathematicians and philosophers who did not fall into the trap of concluding “solipsism” or “bugaboo” from Brouwer’s writings on language. In addition to the works already mentioned by Largeault and Detlefsen, Van Dalen (1999a), Placek (1999), and Kaneko (2002a) treat this aspect of Brouwer’s philosophy in earnest, and none come to the negative conclusions mentioned in the Introduction. Placek (1999) refutes the claim that Brouwer was a solipsist and shows that intersubjectivity is possible in Brouwer’s philosophy. There is one more restricted use Brouwer finds for language: that of a memory aid. Even if language in its origin and in the first place is a function of the activity of social man it plays a significant role also in the reflective thinking and mnemotechnical processes of the solitude of singular man. (Brouwer 1933; Van Stigt 1990, p. 422f)
We can say that, while Brouwer realizes that language is, unfortunately, not a means for mutual understanding and is endowed with an original sin that renders its use morally questionable, we have no choice but to try the impossible. As Verena Huber-Dyson put it: Formalization and Poetry are the two most direct means we have of externalizing our thoughts. (Huber-Dyson 1996, p. 67)
Brouwer chose poetry, for at least it has no pretension to convey anything exactly and opts instead for hinting at a state of mind, a feeling, or an atmosphere. This he found, in his own experience as a student at the University of Amsterdam, listening to Gerrit Mannoury, more helpful than the precision of strings of symbols. Brouwer finds that one is, by the very nature of language, in the position of a Sisyphean, absurd hero: thrown upon the use of language, one makes desperate use of a tool ill-suited for the task, much like a character in Franz Kafka’s world, as seen by Albert Camus: Kafka’s world is in truth an indescribable universe in which man allows himself the tormenting luxury of fishing in a bathtub, knowing that nothing will come of it. (Camus 1955) [Le monde de Kafka est à la vérité un univers indicible où l’homme se donne le luxe torturant de pêcher dans une baignoire, sachant qu’il n’en sortira rien.]
5 The Eastern Origin of Subjective Idealism
Why is subjective idealism so scandalous in Western philosophy concerned with mathematics? After all, Schopenhauer made statements that put him squarely in the subjective idealist, decidedly Eastern, camp, such as On the other hand, the Kantian teaching, even without the antinomies, leads to the insight that things and their whole mode and manner of existence are inseparably associated with our consciousness of them. Therefore he who has clearly grasped this soon reaches the conviction that the assumption that things exist as such, even outside and independently of our consciousness, is really absurd. [Trans. E. F. J. Payne] [Hingegen leitet die Kantische Lehre, auch ohne die Antinomien, zu der Einsicht, daß die Dinge und die ganze Art und Weise ihres Daseyns mit unserm Bewußtseyn von ihnen unzertrennlich verknüpft sind; daher wer Dies deutlich begriffen hat, bald zu der Ueberzeugung gelangt, daß die Annahme, die Dinge existierten als solche auch außerhalb unseres Bewußtseyns und unabhängig davon, wirklich absurd ist. (Schopenhauer 1859, Erstes Buch, Kap. 1, p. 708)]
and this did not prevent him from being influential for a while, nor did Emerson’s embrace of subjective idealism (Lyttle 1997) harm his position in American letters and philosophy. That someone presenting general advice on life, ethics, or, in general, what usually passes for wisdom, embraces subjective idealism presents no threat to the
main pride of Western philosophy: mathematics and evolution. What is scandalous and calls forth the entire arsenal of derision, belittling, and name-calling in Brouwer’s case is the proposal of an unmistakably Eastern foundation as the only sound basis for mathematics, a jewel of the Western philosophical approach. As was observed by Aram Frenkian in 1946, there is not a hint of subjective idealism in any Greek philosophical work that precedes the contact with the East brought about by Alexander. He detected in Frenkian (1946) the first text suggestive of subjective idealism in the Memphite Theology, as inscribed on the Shabaka Stone (see Allen (1988), Hoffmeier (1983), and Finnestad (1976) for a detailed study of its contents), which dates from the 25th dynasty of Egypt. There we are told that “Great and important is Ptah,// who gave life to all the gods and their kas as well// through this heart and this tongue” (Allen 1988, p. 43). Similar utterances can be found in earlier Egyptian texts, such as the Coffin Texts (Allen 1988, p. 45) (late 18th or early 19th dynasty). The conclusions Allen draws in (Allen 1988, pp. 56–61) regarding the subjective idealist nature of Egyptian thought and cosmology, a “construct of opposites in balance” (Allen 1988, p. 57), without citing Frenkian (1946) or Haas (1956), can be taken to be an added, independent confirmation of Frenkian’s thesis. The appearance of something akin to subjective idealism, in various mind-only schools of Mahāyāna Buddhism, has been the subject of several studies, such as Guenther (1989), Suzuki (1998), and Lusthaus (2002). Perhaps the most telling line is one by Nāgārjuna, who states: The [material world], in the final analysis, cannot exist. (Inagaki 1998, Chap. 33, p. 124)
The first appearance of a “vacillation” between “the ontic-ontological aspect” and “its mentalist-idealist aspect” (McEvilley 1980, p. 181) in the world of Greek letters can be detected in the philosophy of Plotinus, the first Greek philosopher who is seen as similar to, on the one hand, an Indian current of thought of the subjective idealism variety, Vijñānavāda Buddhism, and, on the other hand, to an Indian version of objective idealism, the Upanishadic-Vedāntic philosophy, in the opus magnum comparing Greek and Indian thought (McEvilley 2002, Chaps. 22 and 23) (cf. also Sabo 2017). The dearth of reports of Greek philosophers traveling East before Alexander’s conquests, mentioned in Männlein-Robert (2009), indicates that Greek philosophers of pre-Hellenistic times were unaware of the very existence of subjective idealist thought. Subjective idealism is very rare in the history of Western philosophy. Although Berkeley was labeled by Kant and Hegel as a subjective idealist, that designation is, given the role played by God in his philosophy, certainly not accurate, as pointed out in Callahan (2015). The varieties of idealism favored in Western philosophy tend to leave the physical world in its place. Husserl emphasized in his Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie the European roots of his transcendental phenomenology not just because the year in which the lectures were held was 1936 and he had been branded non-Aryan but, as pointed out in
(Majer 2010), because the roots of his philosophy are unmistakably Western. In fact, there is a version of Husserl minus idealism, developed by one of his students, Ingarden (1975), which is viable. In that sense, the philosophy favored for comparisons with Brouwer escapes the anathema of an Eastern origin.
6 Brouwer’s View on Mathematics
As we have seen, the first act of intuitionism distinguishes mathematics from the language of mathematics; establishes the source of mathematics, which is an intuition, the intuition of time; and establishes the nature of the mathematical activity, which is a “languageless activity of the mind.” This is Brouwer’s version of Fromm’s “Being refers to experience, and human experience is in principle not describable.” The second act of intuitionism defines the intuitionistic continuum in terms of choice sequences, which are freely chosen sequences of rational numbers. We will first look at the wide-ranging consequences of the first act. The first part, the statement that mathematics is not to be confused with the language of mathematics, might seem the most shocking. Brouwer provides the following comparison to explain why the two are distinct: “the formal language which accompanies mathematics as the weather-map accompanies the atmospheric processes” (Brouwer 1937, p. 451). We have seen that René Thom expressed similar views on the inability of our possibilités opératoires to express the world of our ideas in Thom (1970, p. 697). Needless to say, the support comes from a counter-modern, who looked to Aristotle for guidance, followed none of the intellectual fashions of his time, and asked his daughter, then just a child, excited about the events of May 1968: “Mais où prends-tu qu’après une révolution les choses iront mieux?” (Papadopoulos 2018) [“But where did you get it that after a revolution things will get better?” (Trans. V. P.)] Brouwer is very explicit on the difference between mathematics and the language of mathematics, both in the sense that there is language-specific rubbish that he does not count as mathematics and that there exists ineffable mathematics, making its communication impossible. [. . .] 
I do not recognize as true, hence as mathematics, everything that can be written down in symbols according to certain rules, and conversely I can conceive mathematical truth which can never be fixed down in any system of formulas. (Brouwer 1937, p. 452) Thus for a human mind equipped with an unlimited memory, pure mathematics, practised in solitude and without using linguistic signs, would be exact, but the exactness would be lost in mathematical communication between human beings with an unlimited memory, because they would still be thrown upon language as their means of understanding. (Brouwer 1933, p. 443)
Although we have seen that other thinkers deny language the ability to express certain processes, it is mostly our inner lives that are deemed ineffable by Fromm or Schweitzer. On the other hand, even something that has been clothed in language cannot be decoded by a naive reading, as pointed out by Dilthey, who asks for
special faculties, such as Verstehen (understanding) for the very comprehension of literary textual material, for which the reader needs the abilities of Hineinversetzen, Nachbilden, Nacherleben (Dilthey 1965, pp. 213–216) [putting oneself in somebody else’s position, replicating, reliving]. In other words, texts do not communicate by means of a simple transmission of sequences of words. Yet one would perhaps think that mathematics is far removed from these texts belonging to the humanities, that comprehension is straightforward, requires no Verstehen in Dilthey’s sense, because mathematics is not about human life. If, however, mathematics is a “languageless activity of the mind,” then it is subject to the same communication difficulties a literary text encounters. Things are further complicated by the fact that Mutual understanding between two people is a matter of degree, but in mathematics their mutual understanding is either total or it does not exist, like being-asleep, one either is or is not. (Van Stigt 1990, p. 208, unpublished manuscript)
On an empirical level, anyone who has tried to teach mathematics has been confronted with this issue, as have university administrators who have thought that recording the teaching of a course by a luminary would enable tens of thousands to learn the content of that course. The same holds for the purported proof of the abc-conjecture or for the highly complex mathematics of our time, in which certain subjects are understood by no more than a dozen people, and it is practically impossible to learn those subjects from texts. One needs to be present in the centers where there are experts in them. This is a version of Polanyi’s “we can know more than we can tell” (Polanyi 2009, p. 4). When Brouwer refers to “everything that can be written down in symbols according to certain rules,” he means what he called in Brouwer (1913) formalism. The name refers to Hilbert’s understanding of the foundations of mathematics, although Hilbert never used the term. Hilbert understood, for the purposes of proof theory, a mathematical theory as consisting of a set of axioms expressed in a formal language, together with a set of rules of inference, which allow the deduction of new sentences from the axioms. All sentences are just strings of meaningless symbols, satisfying certain syntactic rules. The name “formalism” turned out to be a very popular one and has been in use by everyone except Hilbert and members of his inner circle up to the present day. Hilbert would say that he does not mean that mathematics is concerned with deriving meaningless strings of symbols from other meaningless strings of symbols, that this is just a tool for reaching certain metamathematical goals. However, formalism is, nowadays, the official foundation of mathematics. 
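The proof-theoretic picture just described (axioms, rules of inference, and sentences treated as strings of meaningless symbols) can be made concrete with a toy sketch. The example below does not use any system of Hilbert's; it borrows Hofstadter's well-known MIU puzzle purely as an illustration, and the names `AXIOM`, `successors`, and `derive`, as well as the length bound `max_len`, are assumptions introduced here.

```python
from collections import deque

# Toy formal system (Hofstadter's MIU puzzle), used only as an illustration:
# one axiom and four purely syntactic rules of inference.
AXIOM = "MI"


def successors(s):
    """Return every string obtainable from s by applying one rule once."""
    out = set()
    if s.endswith("I"):                 # rule 1: xI   -> xIU
        out.add(s + "U")
    if s.startswith("M"):               # rule 2: Mx   -> Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):         # rule 3: xIIIy -> xUy
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):         # rule 4: xUUy -> xy
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def derive(target, max_len=8):
    """Breadth-first search for a derivation of target from AXIOM.

    Strings longer than max_len are discarded, so the search is finite.
    Returns the list of strings in the derivation, or None if no
    derivation exists within the bound.
    """
    parent = {AXIOM: None}
    queue = deque([AXIOM])
    while queue:
        s = queue.popleft()
        if s == target:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in successors(s):
            if len(t) <= max_len and t not in parent:
                parent[t] = s
                queue.append(t)
    return None


if __name__ == "__main__":
    print(derive("MUI"))   # a derivation found by blind symbol manipulation
    print(derive("MU"))    # no derivation within the length bound
```

Nothing in the search refers to what the symbols mean: a "theorem" is simply a string reachable from the axiom. That the string `MU` is underivable at any length is shown by an argument about the system (the number of `I`s is never divisible by three), not by a derivation inside it, which is a small-scale analogue of the metamathematical goals mentioned above.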
The other competing foundation, logicism, which claimed that mathematics is a part of logic, has faded away, as it was not able to account for modern algebra or even for geometry, as noticed by Bertrand Russell himself, a major proponent of logicism (Gandon 2012; Musgrave 1977). Now, even this highly controlled and unambiguous language is, in Brouwer’s view, fraught with the shortcomings of any language.
There is neither exactness nor certainty for will-transmission, especially for one mediated through language. And this state of affairs remains undiminished if will-transmission refers to the construction of purely mathematical systems. So even for pure mathematics there is no certain language, i.e., no language that excludes misunderstandings in conversation and protects from mistakes in the act of memory support (i.e. from the confusion of different mathematical entities). This circumstance cannot be remedied, in the manner done by the formalistic school, that one subjects the mathematical language (i.e. the system of signs that is used to evoke purely mathematical constructions in other people) itself to a mathematical examination, confers it through reworking the accuracy and stability of a material instrument or a phenomenon of exact science and in so doing communicates in a second-order language or metalanguage about it. (Trans. V. P.) [Nun gibt es aber für Willensübertragung, insbesondere für durch die Sprache vermittelte Willensübertragung, weder Exaktheit, noch Sicherheit. Und diese Sachlage bleibt ungeschmälert bestehen, wenn die Willensübertragung sich auf die Konstruktion reinmathematischer Systeme bezieht. Es gibt also auch für die reine Mathematik keine sichere Sprache, d. h. keine Sprache, welche in der Unterhaltung Mißverständnisse ausschließt und bei der Gedächtnisunterstützung vor Fehlern (d. h. vor Verwechslungen verschiedener mathematischer Entitäten) schützt. Diesem Umstande ist nicht dadurch abzuhelfen, daß man, wie es die formalistische Schule macht, die mathematische Sprache (d. h. das zur Hervorrufung reinmathematischer Konstruktionen bei anderen Menschen dienende Zeichensystem) selber einer mathematischen Betrachtung unterzieht, ihr durch Umarbeitung die Genauigkeit und Stabilität eines materiellen Instrumentes oder eines Phänomens der exakten Wissenschaft verleiht und sich dabei in einer Sprache zweiter Ordnung oder Übersprache über sie verständigt. (Brouwer 1929, p. 421f)]
Now, what could be the ineffable mathematical truths Brouwer referred to? I thought it was some truth belonging to geometric topology, given that I always felt that the idea that geometric topology could be formalized is a pious wish. The fact that Thom, another geometric topologist, stated the same belief in ineffable mathematical truths seems to provide additional support. Van Dalen (1999a) thinks that choice sequences, which are unfinished objects, were what Brouwer had in mind. Also part of the first act is the realization that mathematics originates in intuition, in particular in a single intuition, that of time, “a single aprioristic basic intuition, which may be called invariance in change as well as unity in multitude” (Brouwer 1907, p. 97). That arithmetic originates in [the intuition of] time was stated by Meister Eckhart as well. The one is eternity, which always holds itself alone and is without change. However, the two, that is time, which changes and multiplies itself. (Trans. V. P.) [Das Eine ist die Ewigkeit, die sich allezeit allein hält und wandellos ist. Die Zwei aber, das ist die Zeit, die sich wandelt und vermannigfaltigt. (Eckhart 1934, p. 240)]
It may appear at first sight, perhaps even after reading Brouwer (1909), that Brouwer dropped the intuition of space, present in Kant’s philosophy, given the discovery of non-Euclidean geometries and the untenability of an intuition of Euclidean space (there would still be a possibility to assume the existence of an intuition of absolute geometry, the common part of Euclidean and hyperbolic space). In his thesis, even though there is a unique “basic intuition of mathematics,” there is
an acknowledgment of a distinction between the discrete and the continuous intuition of time which does not occur in his later writings: Since continuity and discreteness occur as inseparable complements, both having equal rights and being equally clear, it is impossible to avoid one of them as a primitive entity, trying to construe it from the other one, the latter being put forward as self-sufficient; in fact it is impossible to consider it as self-sufficient. Having recognised that the intuition of continuity, of ‘fluidity’, is as primitive as that of several things conceived as forming together a unit, the latter being at the basis of every mathematical construction, we are able to state properties of the continuum as a ‘matrix of points to be thought of as a whole’. (Brouwer 1907, p. 17)
To justify the untenability of an intuition of space, Brouwer brings arguments taken from “mechanics,” such as a reference to the “relativity postulate” due to Lorentz (Brouwer 1909, p. 115) (he was probably unaware of Einstein’s work at that time) or the impossibility of separating time and space, against both Frege’s and Russell’s attempted defense of Euclidean geometry in the form that “our faculty of cognition can assimilate the world of experience only in the form of Euclidean geometry” (Brouwer 1909, p. 109). This is surprising, given that time and space are supposed to be those of intuition, not those of physics. He mentions this explicitly for time: “Of course we mean here intuitive time which must be distinguished from scientific time” (Brouwer 1907, p. 61). It is apparent that he favored a view of geometry that is only about the continuum, and the continuum can be developed from the intuition of time. As he put it in Brouwer (1913, p. 128): “For since Descartes we have learned to reduce all these geometries to arithmetic by means of the calculus of coordinates.” Brouwer’s insistence on a single basic intuition might also have been motivated by an aversion to dualism, felt by many of those whom Fromm calls Masters of Living. We know he was in awe not just of the writings of Meister Eckhart and Jakob Böhme and of the Bhagavad-Gita (all of which he mentioned already in 1905 in Brouwer (1996)), but also of Spinoza, for he wrote, as a case in point for his “logic is not a reliable instrument to discover truths and cannot deduce truths which would not be accessible in another way as well” (Brouwer 1948, p. 488) that “we feel the wisdom that blooms in Spinoza’s work as completely independent of his logical system” (Brouwer 1908, p. 107). 
We think it is the same “definite aversion to those pairs of opposites” found by Haas in “the world concept and philosophy of the East,” that Nina Berberova confessed to, that made the mystically inclined Brouwer avoid treating time and space as generating distinct intuitions. All dualism is painful for me, all splitting or bisecting contrary to my nature. When Lenin speaks about matter in opposition to energy, when Berdyaev speaks about material principle (reaction) and spiritual principle (revolution), when idealist philosophers speak about spirit and flesh, I am jarred, as by a false note. My commandment was the truth that matter is energy. My whole life has been the reconciliation within myself of the old dichotomy. (Berberova 1992, pp. 23–24)
This monistic view is not generally accepted. It is clear that the intuition of time allows the creation of natural numbers and thus a foundation for number theory. While, in Atiyah’s view, modern algebra is also dependent on time, he finds geometry to be distinctly not related to time. Algebra, on the other hand (and you may not have thought about it like this), is concerned essentially with time. Whatever kind of algebra you are doing, a sequence of operations is performed one after the other and ‘one after the other’ means you have got to have time. In a static universe you cannot imagine algebra, but geometry is essentially static. I can just sit here and see, and nothing may change, but I can still see. Algebra, however, is concerned with time, because you have operations which are performed sequentially and, when I say ‘algebra’, I do not just mean modern algebra. Any algorithm, any process for calculation, is a sequence of steps performed one after the other; the modern computer makes that quite clear. The modern computer takes its information in a stream of zeros and ones, and it gives the answer. Algebra is concerned with manipulation in time and geometry is concerned with space. These are two orthogonal aspects of the world, and they represent two different points of view in mathematics. (Atiyah 2002, p. 6f)
However, the state of mind Atiyah describes for geometry, “I can just sit here and see, and nothing may change, but I can still see,” sounds awfully close to that undifferentiated state that one is graced with in meditation, not the kind called up by the I who has decided to “meditate,” but the one that establishes its dominion by itself; it reminds one of what Krishnamurti describes as “the state of experiencing.” In the state of experiencing, there is neither the experiencer nor the experienced. The tree, the dog and the evening star are not to be experienced by the experiencer; they are the very movement of experiencing. There is no gap between the observer and the observed; there is no time, no spatial interval for thought to identify itself. Thought is utterly absent, but there is being. This state of being cannot be thought of or meditated upon, it is not a thing to be achieved. The experiencer must cease to experience, and only then is there being. In the tranquillity of its movement is the timeless. (Krishnamurti 1956, p. 32)
That is not the state in which geometry happens. In geometry itself, one has to separate two things “one of which gives way to the other, but is retained by memory,” as Brouwer put it, to discover any geometric truth. The mind must still be traveling from one entity to the other; its stillness may give rise to awe, but will never generate an insight, languageless or not, about the structure of the observed. We have here identified the intuition that allows one to “mentally differentiate between objects” with the intuition of time. Placek is not sure whether that identification is allowed, as he finds no confirmation for it in Brouwer’s writings: although one may introduce a number of other intuitions as well, say those allowing the subject to separate a mathematical entity, or mentally differentiate between objects, Brouwer does not identify the intuition of two-ity with any of them. Even more, he does not even mention any such faculty in his writings. Whether these faculties are needed in his doctrine is [a] separate question. (Placek 1999, p. 42)
Only if Brouwer understood geometry strictly as the science of the continuum could he have considered that the intuition of time can serve to construct space by means of the construction of the coordinate space. However, he was too well-informed about the axiomatic foundations of geometry and certainly knew that ordered geometry, as axiomatized by the axioms of incidence and order, in the two-dimensional case, without the addition of any configuration theorem, does not allow for an algebraization. He had read the entire relevant literature, had a regular course on the subject, and was the advisor of a thesis in classical axiomatic geometry. There is a striking similarity with the confluence of time and space in Buddha’s language, which was pointed out by Monier Williams. Another interesting feature of the Buddha’s language of temporality is his use of the term for ‘space’ to refer to time as well. In the traditional literature, the term adhvan is used exclusively in the sense of road, way, orbit, journey or course. In fact, Monier Williams has noted that Buddhism and Jainism used it in the sense of time. The “Discourse on Convocation” (Sangīti-suttanta) enumerates the three addhā [paths] as past, present and future. Elsewhere it is said: A person, wandering for a long stretch of time with craving as companion, does not overcome the continued cycle of wandering (saṁsāra) in this or other modes of life. (Kalupahana 1999, p. 59)
Now, leaving aside the fact of a single origin, what about the fact that mathematics originates in an intuition at all? This is Brouwer’s middle way between nominalism (for which mathematical objects, relations, and structures do not exist – as physical objects or as entities having causal powers) and Platonism (for which there are abstract mathematical objects whose existence is independent of human beings, of the language or thought processes of those beings – Plato was, incidentally, not a Platonist in this sense, as pointed out in Pritchard (2010, p. 111)), shying away from both extremes. One can also see it as a middle way between a naturalism that declares “all mathematics” to be “ultimately a product of the human nervous system, the best product produced at each stage of our development” (Korzybski 1931, p. 276), a position defended by the neurobiologist Jean-Pierre Changeux, and a Platonism, backed in conversations with Changeux (Changeux and Connes 1995) by Alain Connes. It is Brouwer’s version of denying mathematics a svabhāva, a self-existence; it is his version of “dependent origination” (paṭiccasamuppāda), his insistence on the interrelatedness of all there is. So how is this intrusion of paṭiccasamuppāda on the altar of Western philosophy, with its relentless search for the object, received? We present two quotes, from critiques of intuitionism (which Hartmann calls “Intuitivismus”) by Nicolai Hartmann and by Felix Kaufmann, that read as if the authors had read Haas’s book, had internalized how Western philosophy has to proceed, and had come up with the predictable answers.
‘Intuition’ is from the start a mode of cognition, i.e. a ‘grasping’, a transcendent act; in that it differs essentially from the ‘setting’, and it is just the theory that misjudges that. The truth of intuition is that it is not a giving act, but a receiving (receptive) act, and that the giving authority behind it ought to be searched for at the object.
It is the one who determines the intuition to the extent that it ‘presents’ (appears) itself to it, and that happens as an object that
is indifferent to the very act of intuition. It is thus already assumed as an existent-in-itself. In case such an object were not to present itself, in case no existent were available, that would have its particular suchness already as its own, then the act itself would not be a seeing apprehension either. (Trans. V. P.) [‚Intuition‘ ist nämlich von vornherein ein Modus des Erkennens, also ein ‚Erfassen‘, ein transzendenter Akt; sie ist darin von Grund aus anders gestellt als die ‚Setzung‘, und es ist nur die Theorie, die das verkennt. Die Wahrheit der Intuition ist, dass sie nicht gebender, sondern aufnehmender (rezeptiver) Akt ist, und dass die gebende Instanz hinter ihr beim Gegenstande zu suchen ist. Dieser bestimmt die Anschauung, insoweit er sich ihr ‚darbietet‘ (erscheint), und zwar als ein gegen den Anschauungsakt selbst indifferenter. Er ist also schon als ansichseiender vorausgesetzt. Bietet sich ein solcher Gegenstand nicht dar, liegt also kein Seiendes vor, das sein bestimmtes Sosein schon an sich hätte, so ist auch der Akt kein schauendes Erfassen. (Hartmann 1965, p. 236)]
For Brouwer’s view that mathematical facts change with mathematical knowledge implies that there is something cognizable that is created only by being cognized, which runs counter to the nature of cognition. For [. . .] all cognition presupposes an object that must be thought of as existing independently of its being cognized. (Kaufmann 1978, p. 56f) [Denn Brouwers Auffassung, daß sich die mathematischen Tatsachen mit der mathematischen Erkenntnis wandeln, schließt ein, daß es ein zu Erkennendes gebe, welches erst durch die darauf bezogene Erkenntnis geschaffen wurde, was dem Wesen der Erkenntnis zuwiderläuft. Denn jede Erkenntnis setzt [. . .] einen unabhängig von seinem Erkanntwerden als bestehend zu denkenden Gegenstand voraus. (Kaufmann 1930, p. 65)]
Here is what the theoretical physicist David Bohm had to say about this objectifying tendency of Western philosophy: The weakness of thought is that thought inevitably separates itself from what it thinks about. It creates an imaginary other, which it calls the object – which is still really thought. Let’s say I’m thinking about the image of a tree. Now that which I am thinking about seems to be separated from me. It seems the image is over there somewhere and I am here. Therefore it seems that I have created two images, one is the tree and the other is me. (Krishnamurti and Bohm 1999, p. 49)
Going even deeper into the process of object-creation, as well as into the phenomenon of tṛ́ṣṇa, we find in Brouwer (1948, p. 480) that As mind it [consciousness] takes the function of a subject experiencing the present as well as the past sensation as object. And by reiteration of this two-ity-phenomenon, the object can extend to a world of sensations of motley plurality. In measure of the irreversibility with which the subject has receded from an element of the object, this element loses its egoicity, i.e. gets estranged from the subject, and in measure of this estrangement, mind becomes disposed to desire and apprehension, and consequently to positive or negative conative activity with respect to the element in question.
There is a striking emphasis on the yang, on the object, on the external, the visible, which goes hand in hand with a de-emphasis of the yin, the internal, the invisible in these critiques of Brouwer’s intuitionism. The same emphasis on external features is prominent in formalism, in which the origin of the strings of meaningless symbols, that are its subject matter, is irrelevant. Formalism is not specific to mathematics as a modernist doctrine. Its counterpart in philosophy is analytic
philosophy, which focuses on the language of philosophy: what is philosophy, after all, other than a philosophical text, and what other purpose could philosophy serve than producing exact sequences of words? Its counterpart in psychology is behaviorism, which ignores the inner life of a human being and focuses on the observable part, namely, behavior. To put it in Hannah Arendt’s terms, “behavior has replaced action as the foremost mode of human relationship” (Arendt 1998, p. 41). In legal theory, the modernist approach is that of the “Pure Theory of Law” (Reine Rechtslehre) of Hans Kelsen, against the counter-modernist legal theory of Natural Law (Naturrecht), with Leo Strauss as its prominent twentieth-century proponent. This rejection of mathematics’ dependency on intuition, of its birth from intuition, the preference for thinking of mathematics as of a thing – for in its formalist understanding it is a thing, namely, the system consisting of strings of meaningless symbols, together with the formal rules that allow for the generation of other strings of meaningless symbols – is another case of what Günther Anders called “the shame of not being a thing,” in which he sees “a second level of alienation,” that on which “man has recognized the superiority of things, brings himself in conformity with them, approves of his own having-become-a-thing, respectively condemns his not-having-become-a-thing as a shortcoming” (Anders 1980, p. 30) (Trans. V. P.). The shame of being born – an unavoidable fact in any philosophy of complete interdependency – has been already noticed by Schelling (as mentioned in Anders 1980, p. 325), who writes that “Der Eigendünkel des Menschen sträubt sich gegen diesen Ursprung aus dem Grunde [. . .]” (Schelling 1809, p. 360) [“The self-conceitedness of man rejects this deep origin [. . .]” (Trans. V. P.)]. How does formalism avoid having to admit that time is playing a decisive role? Having created a “material instrument” (Brouwer 1929, p.
422), namely, the strings of meaningless symbols, that one can imagine as being written on paper or stone, it has transformed mathematics into the physical object that is represented by the strings of symbols. The deceptive part in this procedure is that one cannot thus consider that this spatial artifact is free of time, for the strings of symbols are concatenations, so time in the sense of this after that and a direction of reading are required. Without a direction of reading and the notion of a successor, the notion of string itself falls apart. On a very concrete level, philosophically inclined mathematicians fear that, if one were to accept Brouwer’s origin story, mathematics would – from being capable of bringing forth the envy of a Christian theologian (who is afraid that a Platonic understanding “precludes God’s being the sole ultimate reality,” the existence of “uncreatable” abstract objects being “incompatible” with God’s aseity (Craig 2017, p. 13)) – become part of psychology. This fear is unjustified (and has been dealt with in both Placek (1999) and Van Atten (2004)) and has been addressed in a very condensed form by Largeault: intuitionism has something aristocratic about it which is connected with the exceptional nature of intuition. (Despite this fact, intuition, pure and disinterested knowledge that it is, is
universal rather than individual, which refutes the legend of solipsism which has been attributed to the Dutch geometer.) (Trans. V. P.) [l’intuitionisme a quelque chose d’aristocratique qui tient au caractère exceptionnel de l’intuition. (Malgré cela, l’intuition, connaissance pure et désintéressée, est universelle plutôt qu’individuelle, ce qui réfute la légende du solipsisme attribué au géomètre hollandais.). (Largeault 1993, p. 224)]
This universal nature of mind, as well as the rejection of what Brouwer refers to as “the plurality of mind,” is emphasized by Krishnamurti in many of his talks. One of these is a talk with David Bohm:
DB: You are using the word mind, it means it is not my mind.
JK: Oh, mind, mind, it is not mine.
DB: Universal or general.
JK: Yes. It is not my brain either.
DB: No, but there is a particular brain, this brain or that brain. Would you say there is a particular mind?
JK: No.
DB: Now, you see that is an important difference. You are saying mind is really universal.
JK: Mind is universal – if you can use that word, ugly word.
DB: Unlimited and undivided.
JK: It is unpolluted, not polluted by thought.
DB: But I think for most people there will be a difficulty in saying how do we know anything about this mind. I only know of my mind is the first feeling – right?
JK: You cannot call it your mind. You only have your brain which is conditioned. You can’t say, ‘It is my mind’.
DB: Yes, well whatever is going on inside I feel is mine and it is very different from what is going on inside somebody else.
JK: No, I question whether it is different.
DB: At least it seems different.
JK: Yes. I question whether it is different, what is going on inside me as a human being and you as another human being, we both go through all kinds of problems, suffering, fear, anxiety, loneliness, suffer, and so on and so on. We have our dogmas, beliefs, superstitions, and everybody has this.
DB: Well we’ll say it is all very similar but still it seems each one of us is isolated from the other.
JK: By thought. My thought has created that I am different from you, because my body is different from you, my face is different from you, so we carry that same – we extend that same thing into the psychological area.
(Krishnamurti and Bohm 1983)
That universal nature of introspectively discovered truth is also stated in the following “reflection” of Goethe, which appeared posthumously in 1833 (Goethe 1953; Maximen und Reflexionen 1060, p. 514):
In case I know my relation to myself and to the outside world, I call it truth. And in that way everyone can have their own truth, and it is nevertheless always the same. (Trans. V. P.) [Kenne ich mein Verhältnis zu mir selbst und zur Außenwelt, so heiß ich’s Wahrheit. Und so kann jeder seine eigene Wahrheit haben, und es ist doch immer dieselbige.]
Brouwer himself wrote to Van Dantzig on August 24, 1949, that he sees a psychological interpretation of intuitionism as a misunderstanding: I am glad to see that these developments make the essentially negative properties meaningful also to those who do not recognize the intuitionistic creating subject, because with respect to mathematics they hold either a psychologistic point of view, or in any case stick to the ‘plurality of mind’. As I told you in conversation, my example in question is for fundamental intuitionism so much more unassailable than for those of a different persuasion, because the intuitionistic creating subject can certainly, and from the outset put restrictions (or prohibitions of restrictions) on a specific growing mathematical entity, but not on his own possibilities of creation. My belief that psychological pictures of intuitionistic mathematics, however interesting they may be, never can be adequate, has, if possible, been even strengthened by your comment. (Van Dalen 2011a, p. 439)
Now why and how does this philosophical stance of Brouwer alter concrete mathematics? Why does his own fixed-point theorem become false under this understanding? Brouwer had proved that the assumption of a continuous self-map of the closed disk without a fixed point leads to a contradiction. So, if we denote by A the fixed-point theorem, then Brouwer proved the truth of ¬¬A, or, put differently, “the absurdity of the absurdity of A.” In classical logic, the double negation of a sentence is equivalent to it. Intuitionistically, that equivalence is not valid: A implies ¬¬A, but ¬¬A does not in general imply A. Why? In classical mathematics, one thinks of the closed disk as a set of points, so if a continuous self-map f without a fixed point is an absurdity, then there must exist a point in that closed disk that is not moved by f. We might never be able to find that point for a specific f, but its existence is assured. This thinking – even if only encoded in the axiom system chosen, and thus purely formal and not making any overt claim of existence – is tributary to Platonism, to the belief that the points of the closed disk exist independently of our ability to construct them. However, already in his doctoral dissertation, Brouwer has a section heading that reads “Mathematics can deal with no other matter than that which it has itself constructed” (Brouwer 1907, p. 51) and we find out that “to exist in mathematics means: to be constructed by intuition; and the question whether a corresponding language is consistent, is not only unimportant in itself, it is also not a test for mathematical existence” (Brouwer 1907, p. 96). In short, there are no “non-experienced truths” (Brouwer 1948, p. 488) in mathematics. This is best illustrated by Brouwer’s account of the intuitionistic continuum, whose very nature is that of an unfinished entity, one that is becoming. Brouwer found the need to introduce a new language to deal with intuitionistic entities.
For example, he does not use the word “set.” This has similarities with the Buddha’s tendencies in matters of language.
The Buddha needed a language that avoided [. . .] substantialist and nihilist implications. This was the language of becoming (bhava) that more accurately reflected his experience of change and continuity. It is the language of ‘dependent arising’ (paṭiccasamuppāda) that steered clear of the extremes of permanent existence and nihilistic non-existence. Thus the Buddha can be credited with revolution in linguistic philosophy when he consistently utilized the ‘language of becoming’ to take the venom off the ‘language of existence.’ (Kalupahana 1999, pp. ii–iii)
Again, Largeault, with his keen sensitivity for the cultural aspect of philosophy, had noticed this fact, for he asks: Does this mean that our mathematics is based on a substance-attribute ontology, reflected in our language (subject-copula-attribute sentences), inherited from the ancient Greeks, while intuitionist mathematics presupposes an ontology of processes, which is repelled by our thinking habits? (Trans. V. P.) [Cela signifie-t-il que nos mathématiques reposent sur une ontologie substance-attribut, reflétée dans notre langue (phrases sujet-copule-attribut), héritée des Grecs anciens, tandis que les mathématiques intuitionistes supposent une ontologie de processus, qui répugne à nos habitudes de pensée? (Largeault 1993, p. 175, footnote 1)]
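The logical point behind the discussion of the fixed-point theorem, namely that a proof of the double negation of a statement is weaker than a proof of the statement itself, can be checked in a proof assistant whose core logic is constructive. What follows is a minimal sketch in Lean 4, offered as an illustration only and not as a rendering of Brouwer's own, essentially languageless, conception of mathematics. The theorem names `dn_intro`, `triple_neg`, and `dn_elim` are chosen here for the example. The first two proofs use no axioms, while the third goes through only by appeal to the classical principle `Classical.byContradiction`.

```lean
-- From a proof of A one can always build a proof of ¬¬A:
-- given a purported refutation of A, apply it to the proof we hold.
theorem dn_intro (A : Prop) (a : A) : ¬¬A :=
  fun notA => notA a

-- Triple negation reduces to single negation without any axiom:
-- the "absurdity of the absurdity of the absurdity" of A yields the absurdity of A.
theorem triple_neg (A : Prop) (h : ¬¬¬A) : ¬A :=
  fun a => h (fun notA => notA a)

-- The converse of dn_intro is obtained here only through a classical principle;
-- for an arbitrary proposition A it has no axiom-free proof.
theorem dn_elim (A : Prop) (h : ¬¬A) : A :=
  Classical.byContradiction h
```

Asking Lean for the axioms used (`#print axioms dn_elim`) should report a dependence on classical axioms, whereas the same query for `dn_intro` or `triple_neg` should report none. In this formal setting, that difference is the gap between having shown ¬¬A and having constructed a fixed point.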
One could wonder why it was and still is so hard for mathematicians and philosophers of mathematics to allow the observer inside the halls of mathematics. Did not physicists do just that? True, they did, but they were forced to do so by the very nature of experiments, not out of an abundance of interest in the Eastern approach. And even physics tends to exclude those who overstep the boundaries of what is, due to experimental outcomes, strictly necessary to accept, such as David Bohm, with his Spinoza-like embrace of the interconnectedness of mind and matter (see also Espinosa Rubio 2000), and his statement, reminiscent of Teilhard de Chardin’s “within of things” that Mind and matter are inseparable, in the sense that everything is permeated with meaning. The whole idea of the somasignificant or signasomatic is that at no stage are mind and matter ever separated. There are different levels of mind. Even the electron is informed with a certain level of mind. (Bohm and Weber 1987, p. 443)
But, one might ask, what is the value of the fact that Brouwer’s philosophy displays characteristics similar to that of an Indian who lived more than 2500 years ago, when science or mathematics were as good as non-existent? And how is the thesis of Brouwer’s Eastern disposition helped by having the support of Krishnamurti, to whom Walpola Rahula said in Rahula et al. (1996, p. 19) “there is hardly any difference between your teaching and the Buddha’s”? Is that same Eastern mind still present in the twentieth century, in Indians who had extensive contact with the Western mind, and who did not discover “the liberation of insight” (Krishnamurti and Bohm 1999, p. 101)? Pupul Jayakar, in conversation with Krishnamurti, referred to an Indian tendency of “delving into the self, delving into the within, insights into things” (Jayakar and Krishnamurti 1996, p. 154). And then there is Rabindranath Tagore – who, just like Krishnamurti, whose talks Brouwer
frequented (see Van Dalen (2013, p. 364) and Van Atten (2015, p. 179), where we find out that “When visiting Krishnamurti, Brouwer said to a friend: ‘Oh my, this is the baby room of philosophy’”), played a role in Brouwer’s life – in conversation with Einstein “in the afternoon of July 14, 1930, at the latter’s residence in Kaputh”:
E: There are two different conceptions about the nature of the universe: (1) The world as a unity dependent on humanity. (2) The world as a reality independent of the human factor.
T: When our universe is in harmony with Man, the eternal, we know it as truth, we feel it as beauty.
E: This is a purely human conception of the universe.
T: There can be no other conception. This world is a human world – the scientific view of it is also that of the scientific man. There is some standard of reason and enjoyment which gives it truth, the standard of the Eternal Man whose experiences are through our experiences.
E: This is a realization of the human entity.
T: Yes, one eternal entity. We have to realize it through our emotions and activities. We realize the Supreme Man who has no individual limitations through our limitations. Science is concerned with that which is not confined to individuals; it is the impersonal human world of truths. Religion realizes these truths and links them up with our deeper needs; our individual consciousness of truth gains universal significance. Religion applies values to truth, and we know truth as good through our own harmony with it.
E: Truth, then, or Beauty, is not independent of Man?
T: No.
E: If there would be no human beings any more, the Apollo of Belvedere would no longer be beautiful?
T: No.
E: I agree with regard to this conception of Beauty, but not with regard to Truth.
T: Why not? Truth is realized through man.
E: I cannot prove that my conception is right, but that is my religion.
T: Beauty is in the ideal of perfect harmony which is in the Universal Being; truth the perfect comprehension of the Universal Mind. We individuals approach it through our own mistakes and blunders, through our accumulated experience, through our illumined consciousness – how, otherwise, can we know Truth?
E: I cannot prove scientifically that truth must be conceived as a truth that is valid independent of humanity; but I believe it firmly. I believe, for instance that the Pythagorean theorem in geometry states something that is approximately true, independent of the existence of man. Anyway, if there is a reality independent of man there is also a truth relative to this reality; and in the same way the negation of the first engenders a negation of the existence of the latter.
T: Truth, which is one with the Universal Being, must essentially be human, otherwise whatever we individuals realize as true can never be called truth – at least the truth which is described as scientific and can only be reached through the process of logic, in other words, by an organ of thoughts which is human.
According to Indian Philosophy there is Brahman the absolute Truth, which cannot be conceived by the isolation of the individual mind or described by words, but can only be realized by completely merging the individual in its infinity. But such a truth cannot belong to Science. The nature of truth which we are discussing is an appearance – that is to say what appears to be true to the human mind and therefore is human, and may be called māyā or illusion.
E: So according to your conception, which may be the Indian conception, it is not the illusion of the individual, but of humanity as a whole.
T: In science we go through the discipline of eliminating the personal limitations of our individual minds and thus reach that comprehension of truth which is in the mind of the Universal Man.
E: The problem begins whether Truth is independent of our consciousness.
T: What we call truth lies in the rational harmony between the subjective and objective aspects of reality, both of which belong to the super-personal man.
E: Even in our everyday life we feel compelled to ascribe a reality independent of man to the objects we use. We do this to connect the experiences of our senses in a reasonable way. For instance, if nobody is in this house, yet that table remains where it is.
T: Yes, it remains outside the individual mind, but not outside the universal mind. The table which I perceive is perceptible by the same kind of consciousness which I possess.
E: Our natural point of view in regard to the existence of truth apart from humanity cannot be explained or proved, but it is a belief which nobody can lack – no primitive beings even. We attribute to Truth a super-human objectivity; it is indispensable for us, the reality which is independent of our existence and our experience and our mind though we cannot say what it means.
T: Science has proved that the table as a solid object is an appearance, and therefore that which the human mind perceives as a table would not exist if that mind were naught. At the same time it must be admitted that the fact, that the ultimate physical reality of the table is nothing but a multitude of separate revolving centres of electric forces, also belongs to the human mind. In the apprehension of truth there is an eternal conflict between the universal human mind and the same mind confined in the individual. The perpetual process of reconciliation is being carried on in our science and philosophy, and in our ethics. In any case, if there be any truth absolutely unrelated to humanity then for us it is absolutely non-existing. It is not difficult to imagine a mind to which the sequence of things happens not in space, but only in time like the sequence of notes in music. For such a mind its conception of reality is akin to the musical reality in which Pythagorean geometry can have no meaning. There is the reality of paper, infinitely different from the reality of literature. For the kind of mind possessed by the moth, which eats that paper, literature is absolutely non-existent, yet for Man’s mind literature has a greater value of truth than the paper itself. In a similar manner, if there be some truth which has no sensuous or rational relation to the human mind it will ever remain as nothing so long as we remain human beings.
E: Then I am more religious than you are!
T: My religion is in the reconciliation of the Super-personal Man, the Universal human spirit, in my own individual being. This has been the subject of my Hibbert Lectures, which I have called ‘The Religion of Man’. (Gosling 2007, p. 161f)
For the mathematician who is not philosophically inclined, and who could not care less about East or West, the rejection of intuitionistic mathematics is connected with the fact that it is considered to be a mutilation of classical mathematics, given that many of the results of classical mathematics become false in intuitionistic mathematics. The fact that, on the other hand, there are results in intuitionistic mathematics that are flatly wrong in classical mathematics, such as the theorem proved by Brouwer in 1924 that “every full function is uniformly continuous,” does not endear it to mathematicians either, given that such results remove from the body of mathematics all results about discontinuous functions. Brouwer’s answer to this criticism was:
In general intuitionism brings about a complete recasting of mathematics, with the result, to our regret, that in many places its supple and elegant character is lost, and it has to assume much harsher, more tortuous and more complicated forms. Alas, the spheres of truth are less transparent than those of illusion. (Brouwer 1933, p. 444)
This displeasure with “fewer” results is indicative of the approach to knowledge in the having mode of existence. More knowledge is preferred to knowing more deeply, which is what Brouwer proposes as “optimum knowledge in the being mode.” As Meister Eckhart put it: That one is much more blessed, who can dispense with all things and does not need them, than he, who has taken possession of all things, because he needs them. (Trans. V. P.) [Der ist viel seliger, der alle Dinge entbehren kann und ihrer nicht bedarf, als der, welcher alle Dinge in Besitz genommen hat, weil er sie braucht. (Eckhart 1934, p. 109)]
Brouwer never claimed that classical mathematics or formalistic mathematics is false. He claimed only that the mere fact that something is free of error does not make it worth pursuing, that in a finite life, one should pursue “the structured elements of our thinking” (Brouwer 1946b, p. 475), the structures that arise from some lived intuition, not just any possible game of symbols. It is, after all, a moral argument for intuitionistic mathematics. He was opposed, also on moral grounds, to the cooperation of mathematicians with society:
Of course art and philosophy continually illustrating such wisdom cannot participate in cooperation, in particular should not communicate with the state. Supported by the state, they will lose their independence and degenerate. (Brouwer 1948, p. 487)
The two games [logic and mathematics], by dint of their origin, influence each other. By their very nature, they should not interfere with social life. Since the latter has nevertheless claimed them, they undergo the influence of the pragmatic sciences by cooperating, against their nature, to the transformations of social life that are called progress. Luckily, their most
beautiful developments will probably never have any connection with technological, economical, or political matters. [Les deux jeux [logique et mathématique], en vertu de leur origine, s’influencent mutuellement. De par leur nature, ils ne devraient pas s’immiscer dans la vie sociale. Celle-ci les ayant néanmoins réclamés, ils subissent l’influence des sciences pragmatiques tout en coopérant, contre leur nature, aux transformations de la vie sociale qu’on appelle le progrès. Heureusement, leurs plus beaux développements n’auront probablement jamais aucun rapport avec les questions techniques, économiques ou politiques. (Brouwer 1950)]
The same disdain for applications and the corrupting influence of money was present in René Thom’s thinking (Papadopoulos 2018) and is certainly more widespread than meets the eye, given that many mathematicians will likely keep such thoughts private or never get the opportunity to make them public.
7 Wisdom and Mysticism: Beyond Mathematics and Time
While mathematics originates in the intuition of time, Meister Eckhart tells us that No thing is as despised by God as time. Not only time, the clinging to time as well. And not only the clinging, already the touch of time. And not only the touch, already the simple smell or taste of time. (Trans. V. P.) [Gott ist kein Ding so sehr zu wider als die Zeit. Nicht allein die Zeit, auch das Haften an Zeit. Und nicht allein das Haften, schon das Berühren von Zeit. Und nicht allein das Berühren: schon der bloße Geruch und Geschmack von Zeit. (Pfeiffer 1857, 308, 5)] If you are neither this nor that, then you are all things. – What has number or is number, from that he is far. (Trans. V. P.) [Bistu weder dies noch das, so bistu alle Dinge. – Was Zahl hat oder Zahl ist, von dem ist er fern.] (Pfeiffer 1857, 162, 7)
And Krishnamurti tells us that “a human being can see for himself that compassion is out of time, truth is out of time, and the depth from which that compassion comes is out of time. And therefore it is not cultivable” (Krishnamurti and Bohm 1999, p. 97). Brouwer writes already in his dissertation (although his doctoral advisor crosses it out, as not relevant for mathematics): Mathematics is a free creation; it exists in developing a primordial intuition which can be called ‘permanence in change’ or ‘the discrete in the continuous’. Applying it to the exterior world is creating the objective world; this is characteristic of the human strategy in the general struggle for life. As such it is clearly inferior and has nothing to do with religion or with wisdom. (Van Stigt 1990, p. 414f, The Rejected Parts of Brouwer’s Dissertation) Perhaps the greatest merit of mysticism is its use of language independent of mathematical systems of human collusion, independent also of the direct animal emotions of fear and desire. If it expresses itself in such a way that these two kinds of representations cannot be detected, then the contemplative thoughts – whose mathematical restriction appears as the only live element in the mathematical system – may perhaps again come through without obscurity, since there is no mathematical system that can replace them. When the mystical
author joins representations of this kind into more central affections, of which they were one-sided restrictions, he can, with the most ordinary words, gradually break down the barriers round the contemplative sphere and guide us back to the ‘all-embracing’ which every poet seeks to approach. His language will therefore appear as meaningless to those who expect to find in words only the communication of mathematical systems or a stimulus to mathematical activity. (Van Stigt 1990, p. 409, The Rejected Parts of Brouwer’s Dissertation)
In religious truth, i.e. in wisdom, which abolishes the discernment between the subject and something different, and where the perception of time is no longer admitted, there is no mathematical understanding, let alone reliability of logic. (Brouwer 1908, p. 108)
Searching for wisdom, we may find it in knowing that causal thinking and acting is non-beautiful and hard to justify, and that in the long run it brings disappointment. And in knowing that the exterior world with its innumerable individuals and with its hypertrophied cooperation is wedded to mind, its disharmonies reflecting mind’s free-willguilt. As a consequence of this knowing the exterior world and one’s own position in it are accepted as they are, so that towards the exterior world generally only acts as reversible as possible aiming at maintenance, but no acts let alone causal acts aiming at change, are undertaken of one’s free will. Repair of disadjustments, averting of danger and relief of need, all this negative intervening in human society is justified in itself and sometimes prescribed. But positive activity to change the structure of human society governed by so many unknown forces, will always be checked by the self-admonition: ‘not to improve her works has Providence placed thee in this world’, and only vocation and inspiration tested in detachment-concentration will be stronger than this admonition. (Brouwer 1948, p. 485f)
This understanding leads to a “sceptical prognosis” for humanity, one that was the central concern of Fromm or Krishnamurti as well. In times past, this would have been dismissed as “pessimism,” perhaps no longer as out of hand. Brouwer had expressed concerns regarding the equilibrium-upsetting interventions of humans in nature, already in 1905, in the opening lines of Brouwer (1996). It is important to notice that Brouwer offers not only a somber prognosis but also the means to escape it by abandoning the delusion of being able to steer and control life in desired directions by means of “causal acts,” by “cunning.” The categorical imperative prescribing the aforesaid attitude towards life has its counterpart in a sceptical prognosis that mankind, possessed by the delusion of causality, will slide away in a deteriorative process of overpopulation, industrialization, serfdom, and devastation of nature, and that when hereby first its spiritual and then its physiological conditions of life will have been destroyed, it will come to its end like a colony of bacteria in the earth crust having fulfilled its task. All this though timeless art and perennial philosophy continually suggest that the unknown forces governing the destiny of individual and community, are not subject to causality; that in particular the ways of fate cannot be paved with causality, and that security is as unattainable as it is unworthy; that intensification of organization increases vulnerability, that new vulnerability asks for protection through new organization, and that thus for organization which is believed in, there is no end of growth; finally that if the delusion of causality could be thrown off, nature, gradually resuming her rights, would be (except for her bondage to destiny) generous and forgiving to a mankind decausalized and subsiding to more modest and more harmonious proportions. (Brouwer 1948, p. 487)
We find already in 1905, in the last sentence of Brouwer (1996), his categorical rejection of the having mode of existence: Only he who recognizes that he has nothing, that he cannot possess anything, that absolute certainty is unattainable, who completely resigns himself and sacrifices all, who gives everything, who does not know anything, does not want anything and does not want to know anything, who abandons and neglects everything, he will receive all; to him the world of freedom opens, the world of painless contemplation and of – nothing. (Brouwer 1996, p. 429)
While it might seem strange that Brouwer searches for the ultimate answers, whether for the source of mathematical intuition or for the source of “wisdom,” in himself, in introspection, he is, again, in good company in doing so: that of Albert Schweitzer and of Salman Baruch Rabinkow, a Talmud teacher and Chassidic Jew of the Chabad branch of Chassidism, which originated in Lithuania, with whom Fromm studied the Talmud for 6 years, and about whom he writes (Fromm 1987, p. 103) that he “influenced my life more than any other man.”
The great secret is to go through life as an unconsumed person. Such an outcome is possible to the one who, instead of reckoning with people and facts, is thrown back on oneself and seeks the ultimate reason for things in oneself. (Trans. V. P.) [Das große Geheimnis ist, als unverbrauchter Mensch durchs Leben zu gehen. Solches vermag, wer nicht mit den Menschen und Tatsachen rechnet, sondern in allen Erlebnissen auf sich selbst zurückgeworfen wird und den letzten Grund der Dinge in sich sucht. (Schweitzer 1974, p. 312)]
Judaism ascribes absolute value in its teaching only to this autonomous human being, who connects nature and spirit in a continuous synthesis. To the extent that the condition of autonomy is satisfied, Jewish doctrine does not recognize any difference in rank between humans. Everyone is entitled and obliged to say “The world was created for my own sake” (Sanhedrin, Mishnah, chap. 4), because every human being is an end in itself and is, as it were, burdened with the responsibility for the whole of Creation. (Trans. V. P.) [Allein diesem autonomen, Natur und Geist zur durchgängigen Synthese verbindenden Menschen spricht das Judentum in seiner Lehre absoluten Wert zu. Soweit aber die Voraussetzungen der Autonomie gegeben sind, kennt die jüdische Lehre keinerlei Rangunterschiede zwischen Mensch und Mensch. Jedermann ist berechtigt und verpflichtet zu sagen „Meinetwegen ist die Welt erschaffen” (Sanhedrin, Mischna, Kap. 4), denn jedes menschliche Wesen ist Selbstzweck und ist gleichsam mit der Verantwortung für die gesamte Schöpfung belastet. (Rabinkow 1929, p. 808f)]
Brouwer’s emphasis on the inadequacy of language for communication and the overcoming of even the languageless activity that is mathematics with mysticism is not shared by all mystics. While mystics of the negative theological persuasion, such as Meister Eckhart, agree with Brouwer’s take on the ineffability of the mystical experience, not all mystics regard language as inadequate. As Rudolf Otto pointed out, mysticism “preserves its special character, which explains itself clearly from the ground over which it domes up” [“wahrt ihren Sondercharakter, der sich deutlich aus dem Grunde erklärt, über dem sie sich wölbt”] (Otto 1926, p. 231). Given the high regard for language in Judaism (“Thy word (or: essence) is true from the beginning” (Psalm 119: 160)), Jewish mysticism highly values and is heavily reliant on
language, as pointed out in Scholem (1972), Ben-Sasson (2018), and Kilcher (1998, pp. 31–94). The point of departure of all mystical linguistic theories, among which we should also number those of the Kabbalists, is constituted by the conviction that the language – the medium – in which the spiritual life of man is accomplished, or consummated, includes an inner property, an aspect which does not altogether merge or disappear in the relationships of communication between men. (Scholem 1972, p. 60)
None of this diminished in any way the deep friendship between Brouwer and Erich Gutkind, which was based to a great extent on a common interest in the mystical experience. Questions regarding “the common core thesis on mysticism” (which “holds that all forms are in the end different ways to express the same thing” (Barendregt 2008, p. 131)) were addressed in Van Atten and Tragesser (2015) and in Barendregt (2008).
8 Conclusions
What we have attempted to show in this study is that Brouwer’s philosophy of mathematics is coherent and does not stand alone in the arena of philosophies. It is perceived as a foreign body by philosophers trained in the Western tradition, who usually do not even confer the title “philosophy” on its Eastern branch, let alone allow it to have anything to say about Western philosophy’s oldest and most valued possession: mathematics. We have seen that Brouwer’s philosophy converses well with that of the Buddha, Erich Fromm, Krishnamurti, Meister Eckhart, Albert Schweitzer, David Bohm, Humberto Maturana, Francisco Varela, or Spinoza. Critics will say: fine, all you have shown is that he hung out with the wrong crowd, with the wrong side of C. P. Snow’s “two cultures.” What is a modern mind to do with the Buddha, Krishnamurti, Bohm, and Spinoza, who maintain that, although the world is highly causal, and there is thus a great amount of determinism, insight liberates (see Bittner (1994) for the train of thought that led Spinoza to this conclusion)? All of them defend countermodernist positions (certainly an anachronistic statement in the case of the Buddha, Spinoza, or Meister Eckhart); no contemporary ethics study in the English language, written by an analytic philosopher, even mentions Schweitzer’s name. Both the being mode of existence and the Eastern approach are considered to be outside of the area of acceptable academic discourse. Incidentally, whoever thinks that countermoderns must lie on the right of the political spectrum should consider that Bohm was a communist until the Hungarian Revolution of 1956. This is also not the first instance in which Brouwer and Bohm have been brought together in an essay: a proposal for a model of indeterminism in physics (Gisin 2021) has called for the intuitionistic continuum as “an alternative mathematical language that is both powerful enough to
allow scientists to compute predictions and compatible with indeterminism and the passage of time” and mentions Bohmian mechanics. What we have not attempted to do is to take sides. It is clear to us that Platonism, intuitionism, and formalism are all coherently expressed philosophies, and we are not at all interested in poking holes in any one of them with the aim of demolishing it or exposing internal contradictions. The conversation between Jean-Pierre Changeux and Alain Connes, in which neither is able to convince the other that the other’s position is mistaken, shows that such conversations make sense as exercises in informing the other, not in convincing the other of some mistake in the other’s thinking (Changeux and Connes 1995). Although we clearly stated this in a previous, much shorter essay on this topic (Pambuccian 1992), every attempt to defend Brouwer’s position as coherent will likely bring on the same reaction that Brouwer’s philosophy elicits, as can be read from the following review:
This essay exhibits the efforts of a mathematician to articulate a philosophy justifying doing mathematics under the constraints of the outlook of the intuitionist L. E. J. Brouwer. But the philosophy does not concern having the proper concept of mathematical reality and, then, having the proper techniques for having truths about it. The word “having” is repeated in the preceding sentence to emphasize that a more fundamental philosophy, based on insights from (Fromm 1976), is pursuit of detachment from the way of existing which is dependent upon having things: the having mode of existence. This includes detachment from having the proper metaphysics and epistemology. The fruit of such detachment is an independent way of being: the being mode of existence. Presumably, those who are seeking this detached way of existing will do mathematics in Brouwer’s way because they are detached from having access to a mathematical reality. (Kielkopf 1993)
The writer never attempted to justify his own activity, as he has written all his papers in a strict formalistic tradition and has used a (formal) intuitionistic result only once. The result used was arrived at by Michael Beeson in Beeson (2018). The writer’s own use of it in solving an open problem going back to 1852 was straightforward (Pambuccian 2018). This essay is also not written to justify the writer’s own or anyone else’s activity. That would be pathetic, for, as Schweitzer put it: We are in truth when we experience conflicts increasingly deeper. The good conscience is an invention of the devil. (Trans. V. P.) [In der Wahrheit sind wir, wenn wir die Konflikte immer tiefer erleben. Das gute Gewissen ist eine Erfindung des Teufels.] (Schweitzer 1923, p. 249)
What we hope to have shown is that, if a formalist, whose papers are – unlike those of the majority of mathematicians, who are formalists only on Sundays – filled with actual strings of meaningless symbols, can understand what Brouwer was intending and why, then anyone can. There is no expectation that anyone will change their mind on the validity of Brouwer’s philosophy after having read this essay.
References
Allen JP (1988) Genesis in Egypt. The philosophy of ancient Egyptian creation accounts. Yale Egyptological Seminar, New Haven
Anders G (1980) Die Antiquiertheit des Menschen, vol I. C. H. Beck, München
Arendt H (1998) The human condition, 2nd edn. The University of Chicago Press, Chicago
Atiyah MF (2002) Mathematics in the 20th century. Bull Lond Math Soc 34:1–15
Balcerowicz P (2001) The logical structure of the naya method of the Jainas. J Indian Philos 29:379–403
Barendregt H (2008) Buddhist models of the mind and the common core thesis on mysticism. In: van Atten M, Boldini P, Bourdeau M, Heinzmann G (eds) One hundred years of intuitionism (1907–2007). The Cerisy conference. Birkhäuser, Basel, pp. 131–145
Becker O (1927) Mathematische Existenz. Max Niemeyer Verlag, Halle a.S
Beeson M (2018) Brouwer and Euclid. Indag Math (NS) 29:483–533
Ben-Sasson HH (2018) “The name of God and the linguistic theory of the Kabbalah” revisited. J Relig 98:1–28
Berberova N (1992) The italics are mine. Alfred A Knopf, New York
Bittner R (1994) Spinozas Gedanke, daß Einsicht befreit. Deutsche Zeitschrift für Philosophie Berlin 42:963–871
Bodhi B (2000) The connected discourses of the Buddha. A translation of the Saṃyutta Nikāya. Wisdom, Boston
Bohm D, Weber R (1987) Meaning as being in the implicate order philosophy of David Bohm: a conversation. In: Hiley BJ, Peat FD (eds) Quantum implications. Essays in honour of David Bohm. Routledge, London/New York
Brouwer LEJ (1907) Over de grondslagen der wiskunde. Academisch proefschrift. Maas & van Suchtelen, Amsterdam; [On the foundations of mathematics] [Brouwer 1975, pp. 11–101]
Brouwer LEJ (1908) De onbetrouwbaarheid der logische principes. Tijdschrift voor Wijsbegeerte 2:152–158. [The unreliability of the logical principles] [Brouwer 1975, pp. 107–111]
Brouwer LEJ (1909) Het wezen der meetkunde. Openbare Les. Clausen, Amsterdam; [The Nature of Geometry. Public Lecture.] [Brouwer 1975, pp. 112–120]
Brouwer LEJ (1913) Intuitionism and formalism. Bull Am Math Soc 20:81–86. [Brouwer 1975, pp. 123–138]
Brouwer LEJ (1929) Mathematik, Wissenschaft und Sprache. Monatshefte für Mathematik 36:153–164. [Brouwer 1975, pp. 417–428]
Brouwer LEJ (1933) Willen, Weten, Spreken. Euclides 9:177–193. [Brouwer 1975, pp. 443–446] [English translation in [Van Stigt 1990, Appendix 5]]
Brouwer LEJ (1937) Signifische dialogen [Signific dialogues]. Synthese 2:168–174, 316–324. [Brouwer 1975, pp. 447–456]
Brouwer LEJ (1946a) Synopsis of the signific movement in The Netherlands. Synthese 5:201–208. [Brouwer 1975, pp. 465–471]
Brouwer LEJ (1946b) Address delivered on September 16th, 1946, on the conferment upon Professor G. Mannoury of the honorary degree of Doctor of Science. [Brouwer 1975, pp. 472–476]
Brouwer LEJ (1948) Consciousness, philosophy and mathematics. In: Proceedings of the 10th international congress of philosophy, Amsterdam, pp. 1235–1249; [Brouwer 1975, pp. 480–494]
Brouwer LEJ (1950) Discours final. In: Les méthodes formelles en axiomatique. Centre National de la Recherche Scientifique, Paris, p. 75. [Brouwer 1975, p. 503]
Brouwer LEJ (1952a) An intuitionist correction of the fixed-point theorem of the sphere. Proc R Soc Lond A 213:1–2. [Brouwer 1975, pp. 506–507]
Brouwer LEJ (1952b) Historical background, principles and methods of intuitionism. S Afr J Sci 49:139–146. [Brouwer 1975, pp. 508–515]
Brouwer LEJ (1975) In: Heyting A (ed) Collected works, vol 1. North-Holland, Amsterdam
Brouwer LEJ (1996) Life, art, and mysticism. Van Stigt WP (trans). Notre Dame J Form Log 37:389–429
Budiansky S (2021) Journey to the edge of reason. The life of Kurt Gödel. Oxford University Press, Oxford
Callahan G (2015) Was Berkeley a subjective idealist? Collingwood and British Idealism Studies 21:157–184
Camus A (1955) The myth of Sisyphus, and other essays (trans: O’Brien J). Alfred A Knopf, New York
Carnap R (1934) Logische Syntax der Sprache. Springer Verlag, Wien
Carnap R (1959) The logical syntax of language (trans: Smeaton A). Littlefield, Adams and Co, Paterson
Changeux JP, Connes AF (1995) Conversations on mind, matter, and mathematics. DeBevoise MB (ed & trans). Princeton University Press, Princeton
Craig WL (2017) God and abstract objects. The coherence of theism: aseity. Springer, Cham
Dauben JW (2000) Review of [Van Dalen 1999b]. MR1669279 (2000g,01037)
Dauben JW (2007) Review of [Van Dalen 2005]. MR2171883 (2007i,01004)
Detlefsen M (1998) Constructive existence claims. In: Schirn M (ed) The philosophy of mathematics today. Oxford University Press, New York, pp. 307–335
Dilthey W (1965) Der Aufbau der geschichtlichen Welt in den Geisteswissenschaften. Gesammelte Schriften, VII. Band. B. G. Teubner/Vandenhoeck & Ruprecht, Stuttgart/Göttingen
Dubucs J-P (1988) L. E. J. Brouwer: topologie et constructivisme. Revue d’Histoire des Sciences 41:133–155
Dummett M (1975) The philosophical basis of intuitionistic logic. In: Rose HE, Shepherdson JC (eds) Logic colloquium ‘73 (Bristol, 1973). North-Holland, Amsterdam, pp. 5–40
Dummett M (2000) Elements of intuitionism, 2nd edn. Oxford University Press, Oxford
Eckhart M (1934) Deutsche Predigten und Traktate. Insel, Leipzig
Eggenberger PB (1976) The philosophical background of L.E.J. Brouwer’s intuitionistic mathematics. Ph.D. Thesis. University of California, Berkeley
Espinosa Rubio L (2000) Spinoza y David Bohm: un diálogo sobre filosofía natural. Estudios Filosóficos 49:309–328
Finnestad RB (1976) Ptah, creator of the gods: reconsideration of the Ptah sections of the Denkmal. Numen 23:81–113
Fraenkel AA (2016) Recollections of a Jewish mathematician in Germany. Edited by Jiska Cohen-Mansfield. Translated by Allison Brown. Birkhäuser, Cham
Francklin T (1787) Sermons on various subjects, and preached on several occasions by Thomas Francklin, vol 3. T. Cadell, in the Strand, London
Frenkian AM (1946) L’Orient et les origines de l’idéalisme subjectif dans la pensée européenne. T. I: La doctrine théologique de Memphis (L’inscription du roi Shabaka). Librairie Orientaliste Paul Geuthner, Paris
Freudenthal H (1962) The main trends in the foundations of geometry in the 19th century. In: Nagel E, Suppes P, Tarski A (eds) Logic, methodology and philosophy of science. Proceedings of the 1960 international congress. Stanford University Press, pp. 613–621
Friedman LJ (2013) The lives of Erich Fromm. Love’s prophet. Columbia University Press, New York
Fromm E (1976) To have or to be? Harper & Row, New York
Fromm E (1987) Reminiscences of Shlomo Barukh Rabinkow. In: Jung L (ed) Sages and saints. Ktav Publishing House, Hoboken, pp. 99–105
Gandon S (2012) Russell’s unknown logicism. Palgrave Macmillan, New York
Ganeri J (1996) Numbers as properties of objects: Frege and the Nyāya. Studies in Humanities and Social Sciences 3:111–121
Gies M, Gold AL (1987) Anne Frank remembered. The story of the woman who helped to hide the Frank family. Simon and Schuster, New York
Gisin N (2021) Indeterminism in physics and intuitionistic mathematics. Synthese 199:13345–13371. https://doi.org/10.1007/s11229-021-03378-z
Goethe JW (1953) Goethes Werke. Band XII. Schriften zur Kunst, Schriften zur Literatur, Maximen und Reflexionen. Christian Wegner Verlag, Hamburg
Gosling DL (2007) Science and the Indian tradition. When Einstein met Tagore. Routledge, London
Grattan-Guinness I (1999) Review of [Van Dalen 1999b]. Bull Am Math Soc 36:529–532
Gray J (2015) Brouwer’s certainties: mysticism, mathematics, and the ego. Essay review of [Van Dalen 2013]. Metascience 24:127–134
Guenther HV (1989) Mentalism and beyond in Buddhist philosophy. In: Guenther HV (ed) Tibetan Buddhism in Western perspective. Dharma Publishing, Berkeley, pp. 162–177
Gyatso T (the Fourteenth Dalai Lama) (1990) Opening the eye of new awareness. Wisdom, Boston
Haas WS (1956) Destiny of the mind. East and West. Macmillan, New York
Hartimo M (2017) Husserl and Hilbert. In: Centrone S (ed) Essays on Husserl’s logic and philosophy of mathematics. Springer, Dordrecht, pp. 245–263
Hartmann N (1964) Der Aufbau der realen Welt. Walter de Gruyter, Berlin
Hartmann N (1965) Zur Grundlegung der Ontologie. Walter de Gruyter, Berlin
Hermans WF (1949) De tranen der acacia’s. Uitgeverij G. A. van Oorschot, Amsterdam
Hesseling DE (2003) Gnomes in the fog. The reception of Brouwer’s intuitionism in the 1920s. Birkhäuser, Basel
Hoffmeier JK (1983) Some thoughts on Genesis 1 & 2 and Egyptian cosmology. J Anc Near East Soc 15:39–49
Holenstein E (2004) Philosophie-Atlas. Orte und Wege des Denkens. Ammann Verlag, Zürich
Huber-Dyson V (1996) Thoughts on the occasion of Georg Kreisel’s 70th birthday. In: Odifreddi P (ed) Kreiseliana. About and around Georg Kreisel. A. K. Peters, Wellesley, pp. 51–73
Husserl E (1994) Brief von Husserl an Heidegger, 9. V. 1928 (1). In: Schuhmann K (ed) Husserliana. Dokumente. Band III. Briefwechsel. Teil 4: Die Freiburger Schüler. Springer-Science, Dordrecht
Inagaki H (1998) Nāgārjuna’s discourse on the ten stages (Daśabhumika-vibhasa). Ryukoku Gakkai, Kyoto
Ingarden R (1975) On the motives which led Husserl to transcendental idealism. Martinus Nijhoff, Den Haag
Jayakar P, Krishnamurti J (1996) Is there an Eastern mind and a Western mind? In: Questioning Krishnamurti. J. Krishnamurti in dialogue. Thorsons, San Francisco, pp. 153–169
Johnson DM (1981) The problem of the invariance of dimension in the growth of modern topology II. Arch Hist Exact Sci 25:85–267
Johnson DM (2014) Review of [Van Dalen 2013]. Not Am Math Soc 61:607–610
Kalupahana DJ (1975) Causality. The central philosophy of Buddhism. The University Press of Hawaii, Honolulu
Kalupahana DJ (1999) The Buddha’s philosophy of language. Sarvodaya Vishva Lekha, Moratuwa
Kaneko H (2002a) Brouwer’s conception of language, mind and mathematics. Ann Jpn Assoc Philos Sci 11:35–49
Kaneko H (2002b) Review of [Van Dalen 1999b]. Ann Jpn Assoc Philos Sci 11:51–56
Kaufmann F (1930) Das Unendliche in der Mathematik und seine Ausschaltung. Eine Untersuchung über die Grundlagen der Mathematik. F. Deuticke, Wien
Kaufmann F (1978) The infinite in mathematics. McGuinness B (ed), Foulkes P (trans). D. Reidel, Dordrecht
Kielkopf CF (1993) Review of [Pambuccian 1992]. MR1174963 (93f:00005)
Kilcher A (1998) Die Sprachtheorie der Kabbala als ästhetisches Paradigma. J. B. Metzler, Stuttgart
Koetsier T (2005) Arthur Schopenhauer and L. E. J. Brouwer: a comparison. In: Koetsier T, Bergmans L (eds) Mathematics and the divine. A historical study. Elsevier, Amsterdam, pp. 569–593
Korzybski A (1931) A non-Aristotelian system and its necessity for rigour in mathematics and physics. In: Korzybski A (ed) Science and sanity, 5th edn. Institute of General Semantics, Brooklyn, pp. 747–761
Kreisel G (1977) Book review of L. E. J. Brouwer. Collected works, volume I, philosophy and foundations of mathematics. Bull Am Math Soc 83:86–93
Kreisel G (1987) Gödel’s excursion into intuitionistic logic. In: Weingartner P, Schmetterer L (eds) Gödel remembered. Bibliopolis, Napoli, pp. 67–186
Kreisel G, Newman MHA (1969) Luitzen Egbertus Jan Brouwer 1881–1966. Biogr Mem Fellows R Soc 15:39–68
Krishnamurti J (1956) Commentaries on living, first series. The Theosophical Publishing House, Wheaton
Krishnamurti J (1984) Conversation 1588, Los Alamos, New Mexico, USA, 21 March 1984. http://legacy.jkrishnamurti.org/krishnamurti-teachings/view-text.php?tid=1588&chid=1285
Krishnamurti J (1991) Collected works 5. Choiceless awareness. Kendall Hunt, Dubuque
Krishnamurti J, Bohm D (1983) Is there evolution of consciousness? Dialogue 2 Brockwood Park, England – 20 June 1983. https://jkrishnamurti.org/content/there-evolution-consciousness
Krishnamurti J, Bohm D (1999) The limits of thought. Routledge, London
Lamotte É (1976) Histoire du bouddhisme Indien. Des origines à l’ère Śaka. Institut Orientaliste, Louvain-la-Neuve
Largeault J (1993) Intuition et intuitionisme. Librairie Philosophique J. Vrin, Paris
Lusthaus D (2002) Buddhist phenomenology. RoutledgeCurzon, London
Lyttle D (1997) “The world is a divine dream”: Emerson’s subjective idealism. The Concord Saunterer, New Series 5:92–110
Majer U (2010) The origin and significance of Husserl’s notion of the Lebenswelt. In: Hyder D, Rheinberger H-J (eds) Science and the life-world. Essays on Husserl’s ‘crisis of European sciences’. Stanford University Press, Stanford, CA
Männlein-Robert I (2009) Griechische Philosophie in Indien? Reisewege zur Weisheit. Gymnasium 116:331–357
Maturana HR, Varela FJ (1980) Autopoiesis and cognition. The realization of the living. D. Reidel, Dordrecht
McCarty C (1987) Variations on a thesis: intuitionism and computability. Notre Dame J Form Log 28:536–580
McCarty C (2008) The new intuitionism. In: van Atten M, Boldini P, Bourdeau M, Heinzmann G (eds) One hundred years of intuitionism (1907–2007), the Cerisy conference. Birkhäuser, Basel, pp. 37–49
McEvilley T (1980) Plotinus and Vijñānavāda Buddhism. Philos East West 30:181–193
McEvilley T (2002) The shape of ancient thought. Allworth Press, New York
Mehrtens H (1990) Moderne – Sprache – Mathematik. Eine Geschichte des Streits um die Grundlagen der Disziplin und des Subjekts formaler Systeme. Suhrkamp Verlag, Frankfurt am Main
Mehrtens H (1996) Modernism vs. counter-modernism, nationalism vs. internationalism: style and politics in mathematics 1900–1950. In: Goldstein C, Gray J, Ritter J (eds) L’Europe mathématique/mathematical Europe. Histoires, mythes, identités/history, myth, identity. Éditions de la Maison des Sciences de l’Homme, Paris
Misak C (2021) What are the limits of logic? How a groundbreaking logician lost control. Review of (Budiansky 2021). Times Literary Supplement, November 5, 2021
Musgrave A (1977) Logicism revisited. Br J Philos Sci 28:99–127
Nanamoli B, Bodhi B (transl) (1995) The middle length discourses of the Buddha. A new translation of the Majjhima Nikāya. Wisdom Publications, Boston
Napper E (1989) Dependent arising and emptiness. A Tibetan-Buddhist interpretation of Mādhyamika philosophy emphasizing the compatibility of emptiness and conventional phenomena. Wisdom Publications, Boston
Naraniecki A (2015) L. E. J. Brouwer and Karl Popper: two perspectives on mathematics. Cosm Hist 11:239–255
Otto R (1926) West-östliche Mystik. Leopold Klotz Verlag, Gotha
Pambuccian V (1992) Mathematik, Intuition und die Existenzweise des Seins. Wissenschaft vom Menschen 3:87–119
Pambuccian V (2018) Negation-free and contradiction-free proof of the Steiner-Lehmus theorem. Notre Dame J Form Log 59:75–90
Papadopoulos A (ed) (2018) René Thom. Portrait mathématique et philosophique. CNRS Éditions, Paris
Pfeiffer F (1857) Meister Eckehart. J.G. Goschen, Leipzig
Placek T (1999) Mathematical intuitionism and intersubjectivity. A critical exposition of arguments for intuitionism. Kluwer Academic Publishers, Dordrecht
Poincaré H (1913) Dernières pensées. Flammarion, Paris
Polanyi M (2009) Tacit knowledge. The University of Chicago Press, Chicago
Popper KR (1967) Epistemology without a knowing subject. In: Popper KR (ed) Objective knowledge. Oxford University Press, Oxford, pp. 106–152
Popper KR (1968) On the theory of the objective mind. In: Popper KR (ed) Objective knowledge. Oxford University Press, Oxford, pp. 153–190
Posy C (1998) Brouwer versus Hilbert: 1907–1928. Sci Context 11:291–325
Posy CJ (2020) Mathematical intuitionism. Cambridge University Press, Cambridge
Pritchard P (2010) Plato’s philosophy of mathematics, 2nd edn. Academia Verlag, Sankt Augustin
Rabi L (2016) Ortega y Gasset on Georg Cantor’s theory of transfinite numbers. Kairos J Philos Sci 15:46–70
Rabinkow SB (1929) Individuum und Gemeinschaft im Judentum. In: Brugsch T, Lewy FH (eds) Die Biologie des Menschen. Band 4: Soziologie der Person. Urban & Schwarzenberg, Berlin/Wien, pp. 799–824
Rahula W et al (1996) Are you not saying what the Buddha said? In: Questioning Krishnamurti. J. Krishnamurti in dialogue. Thorsons, San Francisco, pp. 18–38
Robinson RH (1957) Some logical aspects of Nāgārjuna’s system. Philos East West 6:291–308
Rowe DE, Felsch V (2019) Otto Blumenthal: Ausgewählte Briefe und Schriften II. 1919–1944. Springer Spektrum, Berlin
Sabo T (2017) Plotinus and Buddhism. Philos East West 67:494–505
Sambin G (2008) Two applications of dynamic constructivism: Brouwer’s continuity principle and choice sequences in formal topology. In: van Atten M, Boldini P, Bourdeau M, Heinzmann G (eds) One hundred years of intuitionism (1907–2007). The Cerisy conference. Birkhäuser, Basel, pp. 301–315
Schelling FWJ (1809) Philosophische Untersuchungen über das Wesen der menschlichen Freiheit und die damit zusammenhängenden Gegenstände. In: Schelling FWJ, Sämtliche Werke, I. Abteilung, Bd VII
Scholem G (1972) The name of God and the linguistic theory of the Kabbalah I and II. Diogenes 79:59–80; 80:164–194
Schopenhauer A (1859) Die Welt als Wille und Vorstellung. Grossherzog Wilhelm Ernst Ausgabe. II. Teil. Insel Verlag, Leipzig
Schweitzer A (1923) Kultur und Ethik. C. H. Beck’sche Verlagsbuchhandlung, München
Schweitzer A (1974) Aus meiner Kindheit und Jugendzeit. In: Schweitzer A (ed) Gesammelte Werke, vol I. C.H. Beck, München, pp. 253–313
Smoryński C (1977) The incompleteness theorem. In: Barwise J (ed) Handbook of mathematical logic. North-Holland, Amsterdam, pp. 821–865
Smoryński C (1994) Review of [Van Stigt 1990]. Am Math Mon 101:799–802
Smoryński C (2015) Review of [Van Dalen 2013]. Math Intell 37:103–107
Soifer (2007) Review of [Van Dalen 1999b] and [Van Dalen 2005]. Zbl 1122.01019
Suzuki DT (1998) Studies in the Laṅkāvatāra sutra. Munshiram Manoharlal Publishers, New Delhi
Thom R (1970) Les mathématiques “modernes”: une erreur pédagogique et philosophique? L’Âge de la Science 3:225–236. [English translation: American Scientist 59:695–699]
Tieszen R (2008) The intersection of intuitionism (Brouwer) and phenomenology (Husserl). In: van Atten M, Boldini P, Bourdeau M, Heinzmann G (eds) One hundred years of intuitionism (1907–2007). The Cerisy conference. Birkhäuser, Basel, pp. 78–95
Tóth I (2009) Liberté et vérité. Pensée mathématique et spéculation philosophique. Éditions de l’Éclat, Paris
van Atten M (2004) On Brouwer. Wadsworth, Belmont
van Atten M (2007) Brouwer meets Husserl. On the phenomenology of choice sequences. Springer, Dordrecht
van Atten M (2015) Essays on Gödel’s reception of Leibniz, Husserl, and Brouwer. Springer, Cham
van Atten M, Tragesser R (2015) Mysticism and mathematics: Brouwer, Gödel, and the common core thesis. In: Van Atten M (ed) Essays on Gödel’s reception of Leibniz, Husserl, and Brouwer. Springer, Cham, pp. 173–187
van Belle M (2021) Schopenbrouwer. De rehabilitatie van een miskend genie. Marcel van Belle, Tilburg
van Dalen D (1999a) The role of language and logic in Brouwer’s work. In: Orłowska E (ed) Logic at work. Essays dedicated to the memory of Helena Rasiowa. Physica-Verlag, Heidelberg
van Dalen D (1999b) Mystic, geometer, and intuitionist. The life of L. E. J. Brouwer, 1881–1966. Volume 1: The dawning revolution. Oxford University Press, Oxford
van Dalen D (2005) Mystic, geometer, and intuitionist. The life of L. E. J. Brouwer, 1881–1966. Vol. 2. Hope and disillusion. Oxford University Press, Oxford
van Dalen D (2011a) The selected correspondence of L.E.J. Brouwer. Springer, London
van Dalen D (2011b) Companion to the selected correspondence of L.E.J. Brouwer. Springer, London. extras.springer.com
van Dalen D (2013) L. E. J. Brouwer – topologist, intuitionist, philosopher. How mathematics is rooted in life. Springer, London
van Stigt WP (1990) Brouwer’s intuitionism. North-Holland, Amsterdam
Weil S (1949) L’enracinement. Gallimard, Paris
Weil A (1991) Souvenirs d’apprentissage. Birkhäuser, Basel
Whitehead AN (1979) Process and reality.
The Free Press, New York
Wiessing H (1960) Bewegend portret. Moussault, Amsterdam
Wille M (2020a) Nach 1879. Über die Anfänge der Polemik im Werk Gottlob Freges. In: Wille M (ed) Fregesche Variationen. Essays zu Ehren von Christian Thiel. Mentis Verlag, Münster, pp. 189–220
Wille M (2020b) >alles in den Wind geschriebene

\[ \max(i, j) \qquad f_i(g(t)) > f_j(g(t)). \]

Remark 1 We will not provide a proof of Skolem’s lemma here. However, we will later modify a proof of the existence of a cohesive set, a result by Dekker and Myhill (see Theorem VI in 12.3 of Rogers (1967)), in order to show its similarity to Lemma 1.

Next, fix a function $g$ the existence of which is given by Lemma 1, and let $f_i =_g f_j$ iff
\[ \{ t : f_i(g(t)) = f_j(g(t)) \} = N \quad \text{almost everywhere.} \]
It is easy to see that ¼g is an equivalence relation. For any f F1, let [f] denote the equivalence class of f: ½ f ¼ fh F1 : h ¼g f g: Let the domain N* be the set of all equivalence classes: N ¼ f½ f : f F1g: Let ℕ* be the structure in the language {+, , 1, ¼} with domain N* where ¼ℕ*, 1ℕ*, +ℕ*, and ℕ* are defined as follows. The equality ¼ℕ* is the equality between equivalence classes (the elements of N*). The constant 1ℕ* is the equivalence class of the function with constant value 1, denoted by [1], and ½ f þℕ ½h ¼ ½ f þ h, ½ f ℕ ½h ¼ ½ f h: Skolem then verifies that the sentences in Th(ℕ) hold in ℕ*. His verification can be viewed in comparison with a proof of Łoś’s Theorem for ultrapowers. The fact that ℕ* is a nonstandard model of arithmetic which has ℕ as its submodel is verified as follows. For n N, let [n] ℕ* be the equivalence class of the function with constant value n. The set {[n]: n N} of equivalence classes of the constant
functions provides a copy of $\mathbb{N}$ in $\mathbb{N}^*$. The sentence $\forall x \exists y (x = y)$ defines the identity function

$\mathrm{id}(x) = x$.

It is obvious that $[\mathrm{id}] \in \mathbb{N}^*$ and $[\mathrm{id}] \neq [n]$ for any $n \in N$, and hence $\mathbb{N}^*$ cannot be isomorphic to $\mathbb{N}$:

$\mathbb{N}^* \not\cong \mathbb{N}$.

The class of arithmetically definable functions is countable because the language is countable. The model $\mathbb{N}^*$ is countable because its universe consists of equivalence classes of an equivalence relation $=_g$ defined on the countable family $F_1$. A theory is countably categorical if it has exactly one countable model, up to isomorphism. Skolem’s result shows that neither PA nor $\mathrm{Th}(\mathbb{N})$ is countably categorical. Hence, the hope that $\mathrm{Th}(\mathbb{N})$ (or even PA) has only one countable model up to isomorphism, the familiar structure $\mathbb{N}$, is shattered. However, $\mathbb{N}$ is the only computable such model.

In the introduction to his 1955 paper, Skolem refers to the Löwenheim-Skolem theorem and Skolem’s paradox. For him “the notion set and particularly the notion subset in the case of infinite sets can only be asserted to exist in a relative sense.” Since an axiomatization of the natural number series exists within set theory, the axiomatization of the number series also cannot be expected to be absolute. The model $\mathbb{N}^*$ constructed by Skolem is remarkable also because it can be described using computability-theoretic notions, which were developed much later than Skolem’s original 1934 paper.
3.2
Computability and Models of Fragments of Arithmetic
In this section we will exploit notions of modern computability theory to build nonstandard models of theories. We will start by comparing Skolem’s nonstandard model $\mathbb{N}^*$ with the following models:
(i) A structure denoted by $\mathcal{R}/{\sim_A}$, which was proposed by Feferman, Scott, and Tennenbaum (1959) as a nonstandard model of arithmetic,
(ii) A structure denoted by $\Pi_C \mathbb{N}$, called a cohesive power of $\mathbb{N}$, which was introduced by Dimitrov (2009a).
4
Cohesive Sets
The two approaches (i)–(ii) use various notions of algorithmically indecomposable sets. Skolem’s Arithmetical Lemma 1 also implicitly uses this notion. We will now give relevant definitions and results from computability theory.
Definition 2 (1) An infinite set C of natural numbers is called cohesive if for every c.e. set W, either $W \cap C$ or $(N - W) \cap C$ is finite. That is, a set is cohesive if it is indecomposable into two infinite parts by a c.e. set.
(2) Some cohesive sets have c.e. complements called maximal sets. That is, a set M is maximal if M is c.e. and its complement $(N - M)$ is cohesive.

The notion of cohesiveness was introduced by Rose and Ullian (1963), and is related to the notion of indecomposability introduced by Dekker and Myhill (1960). Friedberg (1958), answering a question of Myhill from 1956, constructed a maximal set without using an explicit reference to the notion of a cohesive set. Similarly, r-cohesive and r-maximal sets are defined, where r stands for recursive, which is a synonym for computable.

Definition 3 (1) An infinite set $C \subseteq N$ is r-cohesive if for every computable set W, either $W \cap C$ or $(N - W) \cap C$ is finite.
(2) A set M is r-maximal if M is c.e. and $(N - M)$ is r-cohesive.

Clearly, maximal sets are r-maximal since computable sets are c.e., but the converse does not hold.

Theorem 5 There is a function $g: N \to N$ such that its range rng(g) is a cohesive set.

Proof Let $E_0, E_1, E_2, \ldots$ be any list of all c.e. sets. We simultaneously define by induction an infinite sequence of sets $C_0 \supseteq C_1 \supseteq C_2 \supseteq \cdots$ and a function g. Let $C_0 = N$. Let $g(0) = 0$. Assume that we have defined $C_n$ and $g(n)$. Let

$C_{n+1} = \begin{cases} C_n \cap E_n & \text{if } C_n \cap E_n \text{ is infinite,} \\ C_n \cap (N - E_n) & \text{otherwise.} \end{cases}$

Let $g(n+1)$ be the least $c \in C_{n+1}$ such that $c > g(n)$.

Remark 2 The proof of Theorem 5 has been presented as a construction of a function $g: N \to N$ such that rng(g) is a cohesive set. The original theorem of Dekker
and Myhill establishes that every infinite set has a cohesive subset. While there are countably many maximal sets, there are continuum many cohesive sets.

We can prove Skolem’s Lemma 1 using the procedure outlined in the proof of Theorem 5. Using Skolem’s list of arithmetical functions we define the following sets for $i, j \geq 1$ and $i \neq j$:

$A_{ij}^{1} = \{n : f_i(n) < f_j(n)\}$,
$A_{ij}^{2} = \{n : f_i(n) = f_j(n)\}$,
$A_{ij}^{3} = \{n : f_i(n) > f_j(n)\}$.

While the sets $A_{ij}^{k}$ may not be c.e. (although they are arithmetical), we can construct a function g such that rng(g) is indecomposable for the sequence $A_{ij}^{k}$ where $k = 1, 2, 3$ and $i, j \geq 1$ and $i \neq j$. We can call rng(g) an arithmetically indecomposable set.
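The inductive step in the proof of Theorem 5 can be illustrated on a toy family of sets. The sketch below is only an illustration under strong simplifying assumptions: the list of all c.e. sets is replaced by a short finite list of periodic sets (unions of residue classes), for which the test "is the intersection infinite?" is decidable, whereas for arbitrary c.e. sets no algorithm decides it in general. The names `Periodic` and `build_g` are hypothetical and do not come from the chapter.

```python
from math import gcd


class Periodic:
    """Hypothetical helper: the set {n in N : n mod modulus is in residues}."""

    def __init__(self, modulus, residues):
        self.modulus = modulus
        self.residues = frozenset(r % modulus for r in residues)

    def __contains__(self, n):
        return n % self.modulus in self.residues

    def is_infinite(self):
        # A periodic set is infinite exactly when it is nonempty.
        return bool(self.residues)

    def complement(self):
        return Periodic(self.modulus, set(range(self.modulus)) - self.residues)

    def intersect(self, other):
        # The intersection is periodic with period lcm of the two moduli.
        m = self.modulus * other.modulus // gcd(self.modulus, other.modulus)
        return Periodic(m, {r for r in range(m) if r in self and r in other})


def build_g(sets, steps):
    """Return [g(0), ..., g(steps)] following the induction of Theorem 5."""
    c = Periodic(1, {0})          # C_0 = N
    g = [0]                       # g(0) = 0
    for n in range(steps):
        if n < len(sets):         # assumption: past the finite list, C_{n+1} = C_n
            e = sets[n]
            both = c.intersect(e)
            c = both if both.is_infinite() else c.intersect(e.complement())
        nxt = g[-1] + 1           # least element of C_{n+1} above g(n)
        while nxt not in c:
            nxt += 1
        g.append(nxt)
    return g


evens = Periodic(2, {0})
threes = Periodic(3, {0})
odds = Periodic(2, {1})
values = build_g([evens, threes, odds], 5)
print(values)
```

With these three sets the sketch prints `[0, 2, 6, 12, 18, 24]`: from the third stage on, every value lies in the evens and in the multiples of 3 and none lies in the odds, so no listed set splits the tail of the range into two infinite parts. The same refinement, applied to the three-way split into the sets defined from pairs of arithmetical functions, is what the adaptation to Lemma 1 requires.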
4.1
Feferman-Scott-Tennenbaum Model
Following Skolem’s construction of a countable nonstandard model of arithmetic, Tennenbaum asked whether starting with computable functions and an r-cohesive set C, we could produce a nonstandard model of arithmetic. Below we give a description of such a structure and the negative answer to Tennenbaum’s question provided in (Feferman et al. 1959) by Feferman, Scott, and Tennenbaum. The structure they built is a model of only a fragment of arithmetic but is very interesting due to its constructive properties.

Let $\mathcal{R}$ be the set of all computable (recursive) functions, and let C be a fixed r-cohesive set. Feferman, Scott, and Tennenbaum constructed a structure, denoted by $\mathcal{R}/{\sim_C}$, consisting of equivalence classes of computable functions modulo an equivalence relation $\sim_C$ on $\mathcal{R}$. More specifically, for computable unary functions f and g let

$f \sim_C g$ iff $C \subseteq^* \{n : f(n) = g(n)\}$.

We write $X \subseteq^* Y$ if all but finitely many elements of X are also elements of Y. The domain of the structure $\mathcal{R}/{\sim_C}$ consists of the equivalence classes of computable functions under $\sim_C$. The arithmetic operations of addition and multiplication are defined naturally (pointwise). Feferman, Scott, and Tennenbaum proved that $\mathcal{R}/{\sim_C}$ is a model of only a fragment of arithmetic. They constructed a particular $\Pi_3$ sentence σ such that

$\mathbb{N} \models \sigma$ but $\mathcal{R}/{\sim_C} \models \neg\sigma$.
4.2
Cohesive Powers of Computable Structures
Recent developments in the area of computability-theoretic ultrapowers of structures arose from Dimitrov’s study (2004, 2008, 2009b) of c.e. vector spaces over computable fields. The study of the computable infinite-dimensional vector space $V_\infty$ has been ongoing since Metakides and Nerode (1977) introduced it. The vectors in the space $V_\infty$ are the finitely nonzero infinite sequences of elements from a computable field F. The most natural infinite computable field is the field $\mathbb{Q}$ of rational numbers. In (Dimitrov 2008), we discovered an interesting class of nonstandard fields, later called cohesive powers of $\mathbb{Q}$, and denoted by $\Pi_C \mathbb{Q}$, where C is a cohesive set. The field $\Pi_C \mathbb{Q}$ has certain similarities with the classical ultrapower construction. In (Dimitrov 2009a), we introduced the notion of cohesive powers of computable structures in general. While cohesive powers arose from investigating lattices of c.e. vector spaces (Dimitrov and Harizanov 2016, 2017, n.d.), it became clear that $\Pi_C \mathbb{N}$ was an interesting structure in itself. We later discovered the exact connection between $\Pi_C \mathbb{N}$ and the structure $\mathcal{R}/{\sim_C}$ of Feferman, Scott, and Tennenbaum.

A partial computable function φ can be thought of as having a partial algorithm that takes elements $n \in N$ as inputs and either halts and outputs the value φ(n), or computes forever, in which case φ(n) is undefined. We denote the halting computation by $\varphi(n)\downarrow$ and computing forever by $\varphi(n)\uparrow$. The domain of φ, the set of all inputs on which the partial algorithm halts, is a c.e. set. Two partial functions φ and ψ are equal, denoted by $\varphi \simeq \psi$, if on every input either both are undefined, or both are defined and have the same value, denoted by $\varphi(n)\downarrow = \psi(n)\downarrow$. Determining whether a partial computable function is total (hence computable) is an undecidable problem.

Definition 4 (Dimitrov 2009a) Let $\mathcal{L}$ be a finite language and $\mathcal{A}$ be a computable $\mathcal{L}$-structure with domain $A \subseteq N$. Let $C \subseteq N$ be cohesive. The cohesive power of $\mathcal{A}$ over C, denoted by $\Pi_C \mathcal{A}$, is the $\mathcal{L}$-structure $\mathcal{B}$ defined as follows:
• Let $D = \{\varphi : (\varphi: N \to A) \text{ is partial computable with } C \subseteq^* \mathrm{dom}(\varphi)\}$.
• For $\varphi, \psi \in D$, let $\varphi =_C \psi$ iff $C \subseteq^* \{i : \varphi(i)\downarrow = \psi(i)\downarrow\}$. The relation $=_C$ is an equivalence relation on D. Let $[\varphi]$ be the equivalence class of $\varphi \in D$ with respect to $=_C$.
• The domain of $\mathcal{B}$ is the set $B = \{[\varphi] : \varphi \in D\}$.
• Let R be an n-ary predicate symbol of $\mathcal{L}$. For $[\varphi_1], \ldots, [\varphi_n] \in B$, define $R^{\mathcal{B}}([\varphi_1], \ldots, [\varphi_n])$ by

$R^{\mathcal{B}}([\varphi_1], \ldots, [\varphi_n]) \iff C \subseteq^* \{ i : \bigwedge_{1 \leq m \leq n} \varphi_m(i)\downarrow \;\wedge\; R^{\mathcal{A}}(\varphi_1(i), \ldots, \varphi_n(i)) \}$.

• Let F be an n-ary function symbol of $\mathcal{L}$. For $[\varphi_1], \ldots, [\varphi_n] \in B$, let ψ be the partial computable function defined by

$\psi(i) \simeq F^{\mathcal{A}}(\varphi_1(i), \ldots, \varphi_n(i))$.

Define $F^{\mathcal{B}}([\varphi_1], \ldots, [\varphi_n]) = [\psi]$.
• Let c be a constant symbol of $\mathcal{L}$. Then $c^{\mathcal{B}}$ is the equivalence class of the total computable function with constant value $c^{\mathcal{A}}$.

The cohesive power of a finite structure is isomorphic to the structure. The cohesive power of an infinite computable (hence countable) structure is an infinite countable structure. In (Dimitrov et al. 2014), the authors demonstrated that $\mathcal{R}/{\sim_C}$ and $\Pi_C \mathbb{N}$ are isomorphic when the complement of the set C is a maximal set. In that case for every partial computable function $\varphi \in D$ there is a total computable function f such that $\varphi =_C f$, hence $[\varphi] = [f]$. The following theorem is a part of a more general result in (Dimitrov 2009a).

Theorem 6 (1) If σ is a $\Pi_2$ sentence or a $\Sigma_2$ sentence, then $\Pi_C \mathcal{A} \models \sigma$ if and only if $\mathcal{A} \models \sigma$.
(2) If σ is a $\Pi_3$ sentence and $\Pi_C \mathcal{A} \models \sigma$, then $\mathcal{A} \models \sigma$.

The converse of (2) does not hold. In (Dimitrov et al. n.d.), the authors produced natural counterexamples for linear orders. Using the previous theorem it cannot be deduced that $\Pi_C \mathbb{N}$ is a model of the full theory $\mathrm{Th}(\mathbb{N})$, but just of a fragment of $\mathrm{Th}(\mathbb{N})$. As mentioned before, the cohesive power of a computable field is a field. This is because the field axioms are of complexity at most $\Pi_2$. This is the case with many other familiar structures. In (Dimitrov et al. n.d.), we have also investigated the relationship between additional algorithmic properties of a computable structure and the isomorphism type of its cohesive power.
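The remark that the field axioms have complexity at most Π2 can be made concrete with a short worked illustration, offered here as a sketch rather than as part of the cited results. Most field axioms (associativity, commutativity, distributivity, the identities, and 0 ≠ 1) are purely universal; the axioms for additive and multiplicative inverses have one universal quantifier followed by one existential quantifier over a quantifier-free matrix, which is the Π2 shape. Part (1) of Theorem 6 can then be applied to each axiom separately, here with the computable field of rationals as the base structure:

```latex
% Multiplicative inverses: a \Pi_2 sentence
% (\forall\exists over a quantifier-free matrix).
\forall x \, \exists y \, \bigl( x = 0 \;\lor\; x \cdot y = 1 \bigr)

% Theorem 6(1) applied to this axiom, with the rationals as base structure:
\mathbb{Q} \models \forall x \, \exists y \, ( x = 0 \lor x \cdot y = 1 )
\quad\Longleftrightarrow\quad
\Pi_C \mathbb{Q} \models \forall x \, \exists y \, ( x = 0 \lor x \cdot y = 1 )
```

By contrast, a sentence with three quantifier alternations, such as the Π3 sentence of Feferman, Scott, and Tennenbaum, falls outside part (1), and only the one-directional transfer of part (2) is available.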
4.3
Concluding Remarks
In this paper we focused on the existence of various nonstandard models. The algorithmic complexity of these models varies. Uncountable models are considered noncomputable because of their size. Moreover, every countable nonstandard model of arithmetic is noncomputable. The elements of Skolem’s countable model of arithmetic are equivalence classes of functions of arbitrary arithmetic complexity. It is not a computable model but we can measure the complexity of the elements of the model. The elements of the Feferman-Scott-Tennenbaum structure, which is only a model of a fragment of arithmetic, are equivalence classes of computable functions. Unfortunately, there is no algorithmic enumeration of all computable functions. The elements of cohesive powers of a computable structure are equivalence classes of partial computable functions. In general, the preservation of satisfaction of sentences
in cohesive powers, as well as in the Feferman-Scott-Tennenbaum structure, is limited to lower levels of the arithmetical hierarchy because it depends on the existence of uniform procedures for finding witnesses of existential quantifiers. Compared to Skolem’s model, these structures favor a more constructivist approach at the expense of satisfiability of sentences at higher levels of the arithmetical hierarchy.

Acknowledgments The first author would like to thank F. Stephan and D. Skordev for useful discussion regarding Skolem’s 1934 paper. The second author was partially supported by the Simons Foundation collaboration grant 429466 and GW Dean’s Research Chair award. The authors are listed lexicographically, indicating equal contribution to the chapter.
References
Ash CJ, Knight JF (2000) Computable structures and the hyperarithmetical hierarchy. North-Holland, Amsterdam
Chang CC, Keisler HJ (1990) Model theory. North-Holland, Amsterdam
Bell JL, Slomson AB (1974) Models and ultraproducts: an introduction, 3rd revised printing. North-Holland, Amsterdam
Dekker JCE, Myhill J (1960) Recursive equivalence types. University of California Publications in Mathematics, N.S., vol 3, pp 67–213
Dimitrov R, Harizanov V, Miller R, Mourad KJ (2014) Isomorphisms of non-standard fields and Ash’s conjecture, 10th Conference on Computability in Europe. In: Beckmann A, Csuhaj-Varjú E, Meer K (eds) Language, life, limits, Lecture Notes in Computer Science 8493. Springer, pp 143–152
Dimitrov R, Harizanov V, Morozov A, Shafer P, Soskova AA, Vatev SV (n.d.) On cohesive powers of linear orders, submitted. https://arxiv.org/abs/2009.00340
Dimitrov R, Harizanov V (2016) Orbits of maximal vector spaces. Algebra and Logic 54:440–477
Dimitrov R, Harizanov V (2017) The lattice of computably enumerable vector spaces. In: Day A, Fellows M, Greenberg N, Khoussainov B, Melnikov A, Rosamond F (eds) Computability and complexity. Springer, pp 366–393
Dimitrov RD, Harizanov V (n.d.) Effective ultraproducts and applications, accepted for publication in Aspects of computation, Greenberg N, Ng KM, Wu G, Yang Y (eds), IMS, National University of Singapore, World Scientific
Dimitrov R (2004) Quasimaximality and principal filters isomorphism between $\mathcal{E}^*$ and $\mathcal{L}^*(V_\infty)$. Arch Math Log 43:415–424
Dimitrov R (2008) A class of $\Sigma^0_3$ modular lattices embeddable as principal filters in $\mathcal{L}^*(V_\infty)$. Arch Math Log 47:111–132
Dimitrov R (2009a) Cohesive powers of computable structures, Annuaire de l’Université de Sofia “St. Kliment Ohridski.” Fac Math Inf, Tome 99:193–201
Dimitrov R (2009b) Extensions of certain partial automorphisms of $\mathcal{L}^*(V_\infty)$, Annuaire de l’Université de Sofia “St. Kliment Ohridski.” Fac Math Inf, Tome 99:183–191
Feferman S, Scott DS, Tennenbaum S (1959) Models of arithmetic through function rings, Notices of the American Mathematical Society 6:173–174. Abstract #556–31
Fokina E, Harizanov V, Melnikov A (2014) Computable model theory. In: Downey R (ed) Turing’s Legacy: Developments from Turing Ideas in Logic. Cambridge University Press/ASL, pp 124–194
Frayne T, Morel AC, Scott DS (1962) Reduced direct products. Fundam Math 51:195–228
Friedberg RM (1958) Three theorems on recursive enumeration. I. Decomposition. II. Maximal set. III. Enumeration without duplication. Journal of Symbolic Logic 23:309–316
Hirshfeld J (1975) Models of arithmetic and recursive functions. Israel Journal of Mathematics 20:111–126
Hirshfeld J, Wheeler WH (1975) Forcing, arithmetic, division rings. Lecture Notes in Mathematics 454. Springer, Berlin
Kaye R (1991) Models of Peano arithmetic. Oxford University Press, New York
Kossak R, Schmerl JH (2006) The structure of models of Peano arithmetic. Oxford University Press, New York
Lerman M (1970) Recursive functions modulo co-r-maximal sets. Transactions of the American Mathematical Society 148:429–444
Łoś J (1955) Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres. Mathematical Interpretation of Formal Systems, North-Holland, Amsterdam, pp 98–113
McLaughlin TG (1989) Some extension and rearrangement theorems for Nerode semirings. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 35:197–209
McLaughlin TG (1990) Sub-arithmetical ultrapowers: a survey. Annals of Pure and Applied Logic 49:143–191
McLaughlin TG (2007) Δ1 ultrapowers are totally rigid. Archive for Mathematical Logic 46:379–384
Metakides G, Nerode A (1977) Recursively enumerable vector spaces. Annals of Mathematical Logic 11:147–171
Mostowski A (1952) On models of axiomatic systems. Fundam Math 39:133–158
Nelson GC (1992) Constructive ultraproducts and isomorphisms of recursively saturated ultrapowers. Notre Dame Journal of Formal Logic 33:433–441
Peano G (1889) Arithmetices Principia, Nova Methodo Exposita
Rogers H (1967) Theory of recursive functions and effective computability. McGraw-Hill Book Company, New York
Rose GF, Ullian JS (1963) Approximations of functions on the integers. Pacific Journal of Mathematics 13:693–701
Ryll-Nardzewski C (1952) The role of the axiom of induction in elementary arithmetic. Fundam Math 39:239–263
Schmerl JH (2011) Tennenbaum’s theorem and recursive reducts. In: Kennedy J, Kossak R (eds) Set Theory, Arithmetic, and Foundations of Mathematics – Theorems, Philosophies. Lecture Notes in Logic 36, Association for Symbolic Logic, La Jolla, pp 112–149
Skolem Th (1934) Über die Nicht-charakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen. Fundam Math 23:150–161
Skolem Th (1955) Peano’s axioms and models of arithmetic. Mathematical Interpretation of Formal Systems, North-Holland, Amsterdam, pp 1–14
Smoryński C (1991) Logical number theory I: an introduction. Springer-Verlag, Berlin
Soare RI (1987) Recursively enumerable sets and degrees. A study of computable functions and computably generated sets. Springer-Verlag, Berlin
Tennenbaum S (1959) Non-Archimedean models for arithmetic. Notices of the American Mathematical Society 6:270. Abstract #556-42
Counterpossibles in Mathematical Practice: The Case of Spoof Perfect Numbers Alan Baker
Contents
1 Introduction
2 Reductio Arguments Are Not Countermathematicals
3 Classifying Countermathematicals
3.1 Metamathematical
3.2 Axiomatic
3.3 Stipulational
3.4 Counterarithmetical
4 Spoof Perfect Numbers
5 Vacuist Responses: Analysis
5.1 Mathematical Practice Is Irrelevant
5.2 “Spoof Perfect Number Theory” Is Not Part of Serious Mathematics
5.3 Mathematically Acceptable Spoof Perfect Numbers Are Interesting
5.4 Vacuously True Counterarithmeticals Do Not Imply That All Numbers Are Spoof Perfect
5.5 Spoof Perfect Numbers Need Not Be Defined Using Counterarithmeticals
6 Paraphrasing Away Counterarithmeticals
6.1 Sigma Functions
6.2 Discussion
7 Nonvacuism and the Evaluation Problem
8 Conclusions
References
Abstract
Philosophical theories of counterfactuals have had relatively little to say about counterfactual reasoning in mathematics. Partly this is because most mathematical counterfactuals seem also to be counterpossibles, in that their antecedents deny some necessary truth. In this chapter, I delineate several different categories of mathematical counterfactual (or “countermathematical”) and then examine in detail a case study from mathematical practice that features counterfactual reasoning about “spoof perfect” numbers. I argue that reasoning about spoof perfect numbers presents both a challenge to philosophical analyses of counterfactuals and a resource for thinking more productively about the role of counterfactual reasoning in mathematics.
Keywords
Counterfactuals · Counterpossibles · Mathematical practice · Perfect numbers
A. Baker (*) Department of Philosophy, Swarthmore College, Swarthmore, PA, USA
© Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_24-1
1
Introduction
It has been suggested that a good “philosophical heuristic,” when trying to attack a proffered analysis, is to look at extreme cases (Hájek 2016). When the analysis in question concerns counterfactual conditionals, one class of extreme cases are counterfactuals with impossible antecedents – so-called “counterpossibles.” (It has been pointed out to me (by Ralph Wedgwood) that the term “counterpossible” here runs counter to the analogy with “counterfactual.” If the antecedent of a counterfactual denies an actual fact, should not the antecedent of a counterpossible deny a possible fact? In other words, a more appropriate term for these impossible counterfactuals is “counternecessary.” Unfortunately, current usage is too entrenched at this stage to be open to being changed.) As is well-known, the standard possible-worlds analysis of counterfactuals delivers the verdict that all counterpossibles are true. Vacuously true, one might say, since it matters not what the content of the consequent is. Nor is this the only route to vacuity. Syntactic approaches, based on the derivability of the consequent from the antecedent, will also class all counterpossibles as uninterestingly true, if the background logic is classical.

So what? A full analysis of counterfactuals may need to say something about counterpossibles. But does anything substantive hang on what is said? In particular, can it be shown to be a liability for an analysis to classify all counterpossibles as vacuously true? In his 1973 book, Counterfactuals, David Lewis devotes several pages to the issue of counterpossibles. Here is an excerpt:

[O]ne sometimes asserts counterfactuals by way of reductio in philosophy, mathematics, and even logic. . . . Their antecedents deny what are thought to be philosophical, mathematical, or even logical truths, and must therefore be thought not only false but impossible. These asserted counterphilosophicals, countermathematicals, and counterlogicals look like examples of vacuously true counterfactuals. (Lewis 1973, p. 24)
In the above passage, Lewis simultaneously coins (at least I am not aware of any earlier usage) the term “countermathematical” for the class of counterpossibles that have mathematical subject matter and makes explicit the implication of his analysis that all counterpossibles are vacuously true. On Lewis’s definition, a countermathematical is a conditional whose antecedent denies a mathematical truth. In order to come up with a definition which hinges less on the realism/antirealism debate in the philosophy of mathematics, let us drop reference to truth and adopt the following definition:

(1) A countermathematical is a subjunctive conditional whose antecedent denies an accepted mathematical claim.

(I plan to ignore, for present purposes, the question of what degree of agreement across the mathematical community is required in order for a mathematical claim to count as “accepted”.)

As is well known, Lewis is a champion of a possible-worlds analysis of counterfactuals. According to his analysis, a counterfactual A □→ B is true if and only if the closest A-worlds to the actual world are all B-worlds. But if A is impossible, then there are no A-worlds, so the standard analysis makes all counterfactuals with impossible antecedents vacuously true. Note that there are also other routes to this conclusion that have nothing directly to do with possible worlds. For example, consider any analysis of counterfactuals rooted in classical logic which makes the consequent of a counterfactual true if it can be deduced from the antecedent together with appropriately amended auxiliary hypotheses. If the antecedent is contradictory, then every consequent is deducible, regardless of what other hypotheses are added.

In what follows, I shall use the term vacuist for any analysis according to which all counterpossibles are vacuously true, and nonvacuist for any analysis which makes room for some counterpossibles to be true and others false. (Although I shall not pursue the issue in this paper, I think that most of what I say below against the vacuist position would apply equally well against the converse view that all counterpossibles are vacuously false (or that all counterpossibles lack a truth value). The point of the examples from mathematical practice is that any uniform classification of counterpossibles is insufficient.) Pressure on the vacuist account comes from three main sources.
However, or so I shall argue, a compelling case against vacuism has yet to be made.

First, some philosophers have argued that classifying all counterpossibles as trivially true does violence to our intuitions concerning specific examples (and especially pairs of examples with appropriately contrasting consequents). The problem with this line of objection is that our intuitions about extreme cases tend to be weak to the point of uselessness, and easily manipulated by peripheral contextual factors.

A second source of pressure on vacuism is the increasing use of nonvacuist accounts of counterpossibles to do substantive philosophical work, especially in the analysis of mathematical explanation. Recent work by Sam Baron, Mark Colyvan, David Ripley, Alexander Reutlinger, and others has sought to extend a difference-making account of scientific explanation to cover mathematical explanation. This involves making sense of mathematical facts “making a difference,” and a natural way to do this is to consider what would be the case if a certain mathematical fact were different. For this analysis to work, the resulting counterpossibles must be nontrivial, hence it requires some kind of nonvacuist account. This approach has been explored both for mathematical explanation in science (Baron et al. 2017; Baron 2020; Reutlinger et al. 2020) and for intramathematical explanation of one mathematical fact by another (Baron et al. 2020). Although important progress has
been made in fleshing out this analysis, there remains much work to be done in clarifying just how the counterpossibles featuring in mathematical explanations ought to be evaluated. Some general guidelines have been proposed, but without further grounding in mathematical practice there remains the issue of overreliance on vague and only partially articulated intuitions. This brings us to the third – and in my opinion most crucial – source of pressure on the vacuist view, which is that counterpossibles play a substantive role in actual mathematical practice. If this claim is correct, then it not only casts doubt on the cogency of vacuism by exposing its tensions with practice, but it also bolsters the first two sources of pressure on vacuism. On the one hand, patterns of acceptance and rejection of counterpossibles in mathematical practice can provide a firmer scaffolding for our intuitions about individual cases. On the other hand, this same kind of evidence from practice can flesh out a positive analysis of mathematical explanation within mathematics, and potentially also in science. The problem, however, is that the evidence provided to date of the role played by counterpossibles in mathematical practice is equivocal at best. Although some examples have been given, in many cases – as we shall see – they turn out either not to involve genuine counterfactuals, or not to be significantly linked in to broader aspects of mathematical practice. My focus in this chapter is on counterpossibles with mathematical content, and in particular arithmetically oriented examples. I shall argue that cases of this sort can be found in actual mathematical practice and that their role in this practice constitutes a prima facie challenge to mainstream contemporary accounts of counterfactuals. 
My principal targets are accounts that are vacuist, in the sense defined above, because such accounts make it particularly difficult to accommodate any interesting role for counterpossibles in practice. But I think that the examples discussed also pose a challenge for existing nonvacuist accounts. Accounts of this latter sort can allow for true counterpossibles and false counterpossibles, but this is not much use in itself if it is not accompanied by workable guidelines for evaluating the truth or falsity of particular examples. Broadly speaking, then, the challenge for the vacuist concerns analysis, and the challenge for the nonvacuist concerns evaluation. (“We can ask two types of questions about counterfactual conditionals: What is the meaning of the statement, and how do we determine whether it is true or false?” (Lewis-Beck et al. 2004).) I am not the first to look to mathematics for examples of counterpossibles, nor the first to stress “practice-based” examples as a way to put pressure on certain analyses of counterfactuals. However, as I shall argue in the next section, the purported examples of countermathematicals that have been discussed in the philosophical literature to date fail to stand up to scrutiny.
2
Reductio Arguments Are Not Countermathematicals
I have argued that one way to advance the debate over the proper analysis of counterfactuals is to find examples of counterpossibles from nonphilosophical practice. But does not the beginning of the Lewis quote from the previous section
Counterpossibles in Mathematical Practice: The Case of Spoof Perfect Numbers
indicate one area, straightforwardly distinct from philosophical practice, in which such examples can be found in abundance, namely in proofs by reductio ad absurdum? In the first part of this section, I shall argue, against what is the prevailing view in the literature on counterfactuals, that reductio arguments are not countermathematicals and thus, more generally, that they are not counterpossibles either. Then, at the end of this section, I shall show that even if I am wrong and reductio arguments are counterpossibles, they still cannot be used to put any real pressure on the vacuist position.
The core of my argument against reductio arguments being counterpossibles is that they are not even counterfactuals. Counterfactuals are subjunctive conditionals, but reductio conditionals do not function as subjunctives. For example, consider the following condensed argument:
(2) Assume (for purposes of reductio) that there is a largest prime p. Then (p! + 1) is prime.
This is true because we can derive the primality of (p! + 1) from known facts about divisibility – in particular, that (p! + 1) leaves remainder 1 when divided by any number greater than 1 and less than or equal to p. Hence (p! + 1) has no prime factors less than or equal to p and, since by assumption there are no primes greater than p, it has no prime factor other than itself. This is in sharp contrast to the corresponding claim expressed as a subjunctive conditional. If there were to be a largest prime, then (intuitively) all sorts of other established mathematical facts might be different too. But, if so, then we cannot just “automatically” appeal to any such other established mathematical fact in trying to derive a contradiction from the assumption that there is such a p. One might argue, for example, that if there were finitely many primes then there would also be only finitely many numbers. In particular, the following might be an acceptable subjunctive conditional:
(3) If there were a largest prime p, then there would be a largest number q such that q < (p! + 1).
But if this is right, then it is false that if there were a largest prime p then (p! + 1) would be prime. It is false because in these circumstances there would be no number (p! + 1). (It is striking that Lewis uses the subjunctive formulation when discussing this exact example in his 1973, p. 25.) It is enough for present purposes to show that reductio arguments are not “disguised” subjunctive conditionals. For, if they are not subjunctive conditionals, they are not counterfactuals, and if they are not counterfactuals then they are not counterpossibles. Are reductio arguments “disguised” conditionals of some other sort? I used to think that reductio conditionals are more like indicative conditionals. But this now does not seem right to me either. If there is a largest prime, then I have been radically misled by the various proofs I have seen which purport to establish the contrary result. So who knows what I should accept as a (mathematical) consequence of this assumption! (This connects with the behavior of indicative conditionals in other contexts. I might accept the subjunctive conditional, “If I hadn’t vacationed in
A. Baker
France last summer then I would have vacationed in Spain,” but this does not commit me to accepting the corresponding indicative conditional, “If I didn’t vacation in France last summer then I vacationed in Spain.” If I did not vacation in France, then something very strange is going on because I have lots of vivid memories of doing just that.) (More generally, the role of this conditional in the context of a reductio proof is very different from the normal role of a counterfactual. Roughly speaking, we are interested in the “reductio conditional” only as a means toward demonstrating the falsity of the antecedent. By contrast, our interest typically in a conditional, whether indicative or subjunctive, is in exploring the ramifications of the truth of its antecedent. So, even when there are conditionals in the context of reductio proofs, they do not have the normal role of counterfactuals.) My thesis that reductio arguments do not involve counterpossibles is, I acknowledge, controversial. Fortunately for the broader debate over counterpossibles, there are independent grounds for thinking that the case of reductio arguments is not especially useful in settling the issue of how properly to analyze counterfactuals. Consider how an argument against the vacuist analysis of counterfactuals might proceed. The basic form is as follows: (1) The standard possible-worlds analysis makes all counterfactuals with impossible antecedents vacuously true, since if A is impossible then there are no A-worlds. (2) Reductio arguments in mathematics start from impossible premises. [e.g., the assumption that there is a largest prime number, p]. (3) But not all conclusions advanced in the context of a reductio argument are mathematically acceptable. (4) Hence, there are mathematically unacceptable claims of the form “If P then Q” where P is impossible. (5) Hence, not all counterfactuals with impossible antecedents are vacuously true, so the standard analysis is untenable. 
The typical response to the above argument, for those who favor a possible-worlds-based analysis of counterfactuals, is to make a distinction between what can truly be asserted and what can appropriately be asserted. In a 1997 paper, Daniel Nolan argues that for a proof to be acceptable it is not enough that it be formally valid; it must also be obviously formally valid. Thus a defender of the standard analysis of counterfactuals can argue that the problem with “If √2 = a/b then 42 is prime,” is not that it is false but that the steps from premise to conclusion have not been made clear. Nolan concludes that “reductio proofs do not show the need for nontrivial reasoning about impossibilities” (Nolan 1997, p. 538). (The above line of defense draws a distinction between mathematical acceptability and truth, but not in a way that presupposes any particular background philosophy of mathematics. It should be noted that antirealists about mathematics have independent reasons for rejecting the original argument because on their view mathematical acceptability does not even aim for literal truth.) In this way, the vacuist can hold onto the claim that all countermathematicals are vacuously true, but add that a countermathematical is (mathematically) acceptable if and only if a derivation is provided of its
consequent from its antecedent using other demonstrable mathematical facts. This is in contrast to the nonvacuist, who holds that some countermathematicals are not just mathematically unacceptable but also false. Can this debate be moved forward, and – if so – how? I think that it can, and one of my main aims in this chapter is to indicate why and how. The key to making progress is to realize that there are other examples of countermathematicals which play a role in mathematical practice but are not part of reductio arguments.
3
Classifying Countermathematicals
Recall our earlier definition, that a countermathematical (henceforth, CM) is a subjunctive conditional whose antecedent denies an accepted mathematical claim. Before proceeding to look at some case studies, it will be useful to have in place a preliminary classification of different types of CM. I do not claim that the taxonomy that follows is either optimal or exhaustive, but hopefully it will allow for a more focused discussion of different examples.
3.1
Metamathematical
Some mathematical claims are not claims of a particular mathematical theory but claims about the properties of a theory. A subjunctive conditional whose antecedent denies some established metamathematical claim is a metamathematical CM (or, perhaps, a countermetamathematical). For example,
(4) If ZF set theory were a conservative extension of Peano Arithmetic, then Goodstein’s Theorem would be provable in PA.
It is unclear to what extent metamathematical CMs play a role in mathematical practice. Sometimes, metamathematical CMs are used in reductio-style reasoning, for example, showing that if a certain formal theory were complete and consistent then it would be strong enough to embed arithmetic, and hence Gödel’s Incompleteness Theorem would be violated. I would be inclined to resist classifying such cases as CMs for the reasons canvassed in Sect. 2. Another reason for distinguishing this category from “genuine” CMs comes from stressing the division between mathematics and metamathematics. Perhaps reasoning of this sort is not “internal” enough to mathematical practice, and too often bound up with broader issues in logic or philosophy.
3.2
Axiomatic
An axiomatic CM is a conditional whose antecedent denies an established axiom of an established mathematical theory. For example,
(5) If the Axiom of Choice were false, then there would be sets S and T of incomparable size (i.e., neither can be mapped in a one-to-one fashion onto a subset of the other).
Evaluating the truth or falsity of axiomatic CMs seems relatively straightforward, at least in principle. It is a matter of determining whether the consequent is a theorem of the amended theory produced by replacing the axiom in question with its negation. Jonathan Bennett has argued – convincingly, I think – that evaluation is possible in this sort of case, despite the lack of analytical support, precisely because the axioms of a given theory provide a framework for carving up logical space in a specified manner:
Consider . . . how counterlogicals are handled. . . . [A] speaker can say things of the form ‘If conjunction were not commutative, then C would be the case,’ and be right for some values of C and wrong for others, just so long as he is talking about the power structure of some system of logic – some set of rules and independent axioms – and saying that C is a theorem in the system that results from the original one by deleting its commutativity axiom. Such conditionals are saved from triviality by being made relative to some formulation of logical truth. (Bennett 2003, pp. 228–229)
Independence does seem to be a crucial feature of what makes axiom-based CMs relatively unproblematic to evaluate. It means that an axiom can be dropped (or even negated) without having to alter any other axioms of the theory. (Although I agree with Bennett’s general point here, the specific example in the given quote does not work quite as he claims. If all that is done is that the denied axiom is “deleted,” then the new system will be weaker than the old system. To get any new results, the negation of the deleted axiom must be added in its place. Otherwise, the most we could legitimately say would be along the lines of, “Even if conjunction were not commutative, C would still be the case.”) Another way of putting this point is that there is another (logically) consistent theory in the vicinity which can be used as a framework for evaluation. Even if it is conceded that evaluation of axiomatic CMs is straightforward, there remains the issue of how such CMs are to be analyzed. It is true that this may raise problems for platonists: for example, if the Axiom of Choice is necessarily true, then what can be meant by drawing conclusions from the supposition that it is false? (Nolan 1997) Nonetheless, I doubt that axiomatic CMs will provide good case studies for the debate over counterpossibles because this problem of analysis does not extend across other positions in the philosophy of mathematics. Indeed, even some other versions of mathematical realism, for example, Quinean indispensabilism, will have no problem with conditional claims that deny the truth of established axioms.
3.3
Stipulational
A CM is stipulational if its antecedent asserts that a “borderline” case (or cases) of some mathematical concept be reclassified in a different way. One of the clearest
examples of this sort concerns the nonprimality of 1. Here is a sample quote taken from a mathematics textbook (there are also plenty of more technical examples, such as “Take the Euler Fi-function for example. it can be shown the Fi[p] = p − 1 when p is a prime. If 1 was prime that wouldn’t be true for Fi[1] = 1”):
(6) If 1 were prime, then the Fundamental Theorem of Arithmetic, which says that every whole number greater than one can be written uniquely as the product of prime numbers, would be false.
The most natural way of reading this example, I suggest, is as contradicting not actual mathematical facts but actual mathematical terminology. Since the mathematical facts can therefore be left untouched, evaluation is again relatively straightforward. Each old theorem mentioning primality (or a property defined using primality) must be retranslated using the new terminology and then assessed to see if it remains true. Some do (for example, the theorem that there are an infinite number of primes) and others do not (for example, the theorem about the uniqueness of prime factorization). Crucial to this reading is that the reclassified cases be borderline; otherwise the new concept becomes too far removed from the original one. Compare (6) above with:
(7) If 4 were prime, then some composite numbers would have no prime factors.
As things stand, 4 is a paradigmatic example of a nonprime number. If the meaning of “prime” were altered to the extent that 4 is counted as prime, then it becomes unclear what would happen to related terms such as “composite” and “factor,” and this interferes with the evaluation of (7). In light of the above remarks, there is a sense in which stipulational CMs are not really counterpossibles at all. A platonist, for example, can entertain the antecedent of (6) without supposing that anything is different in the realm of mathematical abstracta.
This suggests that even the vacuist can allow for such conditionals to be nontrivial by focusing on change of terminology rather than change in intrinsic facts about mathematics. (Perhaps there is a way to make the distinction between stipulational CMs and other sorts of CM in terms of Kripke’s rigid designator/nonrigid designator distinction.) If so, then stipulational CMs cannot provide a definitive challenge to competing analyses of counterpossibles.
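The retranslation exercise described for (6) can be made concrete with a small computation. The following Python sketch is illustrative only: the helper names `phi` and `factorizations`, and the cap `max_len` on the length of a factorization, are assumptions introduced here and do not come from any of the sources cited. It lists the ways of writing 6 as a nondecreasing product of “primes,” first with the usual list and then with 1 added to the list, and it compares Euler’s function at 1 with the value p − 1 that it takes at a prime p.

```python
from math import gcd


def phi(n):
    """Euler's function: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def factorizations(n, allowed, max_len):
    """Nondecreasing tuples drawn from `allowed` (sorted ascending),
    of length at most `max_len`, whose product is n.

    `max_len` is a hypothetical cap: once 1 is allowed as a factor,
    the list of factorizations would otherwise never end.
    """
    results = []

    def extend(prefix, remaining, start):
        if remaining == 1 and prefix:
            results.append(tuple(prefix))
        if len(prefix) == max_len:
            return
        for i in range(start, len(allowed)):
            factor = allowed[i]
            if remaining % factor == 0:
                extend(prefix + [factor], remaining // factor, i)

    extend([], n, 0)
    return results


# Usual terminology: 6 has exactly one factorization into primes.
print(factorizations(6, [2, 3], max_len=4))

# Terminology in which 1 counts as prime: uniqueness fails.
print(factorizations(6, [1, 2, 3], max_len=4))

# phi(p) = p - 1 for the primes 2, 3, 5, 7, whereas phi(1) is 1, not 0.
print([(p, phi(p)) for p in (2, 3, 5, 7)], phi(1))
```

On this reading nothing about the numbers themselves changes between the two calls; only the list of numbers labeled “prime” differs, which is the sense in which (6) concerns terminology rather than mathematical fact.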
3.4
Counterarithmetical
Our discussion of the above two categories suggests that the most effective examples of CMs are likely to be neither axiomatic nor stipulational. A further desideratum is that the CM feature in an area of mathematics in which there is a single, well-established theory, and about which there is little debate among mathematicians concerning the correctness of its basic results. Penelope Maddy has usefully
distinguished areas of mathematics in which there is a presumption of a single correct theory in terms of a background maxim she calls UNIFY (Maddy 1997). In an area not governed by this maxim, such as group theory, it seems that subjunctive conditionals do not generally correspond to counterfactuals. For example, consider the conditional:
(8) If the operation of the group Z6 were not commutative, then the group would form the symmetry group of an equilateral triangle.
This is best read, I suggest, as simply shifting the focus from an abelian group of order 6 to the (only) nonabelian group of order 6, namely S3, and making a true assertion about a feature of this latter group. For our purposes, a good place to look is arithmetic since it seems to have the above feature; moreover, there is little doubt over its core axioms and theorems. (According to Maddy, set theory is one paradigm area of mathematics where UNIFY is operative. However, although there may be a presumption of a single correct theory, there is ongoing debate about which set-theoretical axioms ought to be included (especially in the area of large cardinal axioms).) I coin the term “counterarithmetical” to cover the special case of a CM which pertains to arithmetic. More precisely:
(9) A counterarithmetical is a subjunctive conditional whose antecedent denies an accepted arithmetical claim.
(Rather than trying to build into the above definition some distinction between counterarithmeticals (CAs) and the previous two categories, I shall just accept that some CAs are axiomatic (e.g., “If there were a number without a unique successor then there would be a largest prime number.”), and some CAs are stipulational (e.g., (6) above).) Linking examples of counterarithmeticals to mathematical practice is crucial here, and it is in this emphasis that I part company with what prior discussion there has been of the topic.
Various examples of counterpossibles with arithmetical subject matter have been given in the philosophical literature, but these “philosophical” examples tend to bear little relation to mathematical practice and hence must be evaluated almost entirely on the basis of intuitions that are typically fairly weak, even conflicting. For example, Almerindo Ojeda asserts in passing, while arguing against the standard possible worlds analysis of counterpossibles, that (10) If 3 were even, 4 would be odd is intuitively true, but that (11) If 3 were even, 4 would be prime
is intuitively false (Ojeda 2005, p. 31). Beyond the bare appeal to intuition, Ojeda’s argument is brief. Roughly, his idea is that the successor of an even number need not be prime, so if 3 were even this would not “force” 4 to be prime. The problem with using intuitions as the sole data for philosophical analysis is that it provides no guidance when faced with conflicting intuitions. For example, I happen not to share Ojeda’s intuition that the second of the above counterarithmeticals is false. In support, I offer the following argument: I accept that if 3 were even then 4 would be odd. But if 4 were odd, then it would not be divisible by 2 (from the definition of oddness). Also, 4 would not be divisible by 3, because 3 would be even and only even numbers are divisible by other even numbers. So 4 would not be divisible by any smaller number except for 1. Hence, 4 would be prime. I am not claiming that my argument is decisive. In fact my whole point is that one cannot expect either decisive arguments or robust intuitions concerning such “free-floating” CMs. For this reason, it is unlikely that concocting philosophically motivated examples of CMs will provide a firm enough basis for undermining the standard view. This applies not just to CMs but to any counterfactual conditional with an “extreme” antecedent. (Another class of extreme counterfactuals are those whose antecedents deny a law of nature. For an interesting analysis of such “counterlegals,” see Kvart (1986, p. 208). Also, for an attempt to link counterlegals to certain global kinds of mathematical counterfactual (such as “If there were no mathematical objects, then . . .”) via Quinean indispensability considerations, see Baker (2003).) One problem is that our intuitions concerning counterfactuals tend to weaken as the counterfactual conditions increase in breadth (the scope and magnitude of discrepancies from the actual world) and depth (the modal strength of the actual claims being denied).
A second problem is that extreme antecedents do not tend to feature in everyday counterfactual claims. Outside of the philosophy classroom, people rarely go around making assertions about what would be the case if there were round squares, or if some objects were both blue and red all over. Hence, there is little in the way of extraphilosophical “commonsense” usage to measure our analyses against. It is here, I think, that consideration of counterarithmeticals in particular has the potential to cast light on the broader debate over counterpossibles. For – as we shall see in the next section – there are examples of mathematicians asserting counterarithmeticals in the course of normal mathematical practice, in what appears to be a philosophically unselfconscious manner. Looking at the pattern of acceptance and rejection of counterarithmeticals by practicing mathematicians may provide a potentially important body of data from a source external to, and no less reliable than, philosophy itself.
4
Spoof Perfect Numbers
A natural number is perfect if and only if it is equal to the sum of its proper divisors, including 1. Perfect numbers have been an object of interest for mathematicians since ancient times, and the Greeks in particular attached mystical significance to
them. Partly, this is because of the rarity of perfect numbers. Just four perfect numbers were known to the ancient Greeks, and even with the aid of today’s powerful computers fewer than 50 more have been added to that list. There are also several quite basic questions about perfect numbers which remain unsettled. These include whether there are infinitely many perfect numbers, and whether there are any odd perfect numbers. It is this latter question which provides the starting point for our first example. In 1638, Descartes corresponded with Mersenne concerning the issue of the existence of odd perfect numbers. Descartes argues that there are likely to be odd perfect numbers, writing at one point that
I think I am able to prove that . . . there are no odd perfect numbers, unless they are composed of a single prime number multiplied by a square whose root is composed of several other prime numbers. But I can see nothing which would prevent one from finding numbers of this sort. For example, if 22,021 were prime, in multiplying it by 9,018,009, which is a square whose root is composed of the prime numbers 3, 7, 11, 13, one would have 198,585,576,189, which would be a perfect number. (Quoted in Crubellier and Sip 1997)
As Descartes was well aware, 22,021 is a composite number (its prime factorization is $19^2 \cdot 61$). Focus for the moment on the core claim which Descartes makes here, namely
(12) If 22,021 were prime, then 198,585,576,189 would be perfect.
This is a subjunctive conditional whose antecedent denies an accepted arithmetical claim; in other words, it is a counterarithmetical. In recent years, number theorists have become interested in numbers of this sort, which they have termed spoof perfect numbers. Specifically, a spoof perfect number is defined to be a number that is perfect if you assume, contrary to mathematical fact, that one or more of its composite factors are prime. To make the mathematical side of the discussion more tractable, I shall henceforth focus on a much smaller example of a spoof perfect number:
(13) If 4 were prime, then 60 would be perfect.
At first blush, one might wonder what difference it makes whether a certain factor is prime or not. After all, the perfectness of a number depends on the sum of all its factors, regardless of primality. The idea of the spoof antecedent condition is that the spoof prime number, in the above case 4, is now assumed not to break down into any smaller factors. Under this hypothesis, there are fewer factors of 60, because only factors expressible as products of 3, 4, and 5 are permitted. Thus, for example, 10 would no longer count as a factor of 60. The “eligible” factors of 60, under the hypothesis that 4 is prime, are 1, 3, 4, 5, 12, 15, and 20. These sum to 60, hence 60 is spoof perfect. My core argument against the vacuist account of counterpossibles can now be formulated:
1. Spoof perfect numbers are a mathematically interesting subset of the natural numbers.
2. Each spoof perfect number is defined using a counterarithmetical.
3. If all counterpossibles are true, then counterarithmeticals do not define a mathematically interesting subset of the natural numbers.
4. Hence, not all counterpossibles are true.
The remainder of the chapter will be devoted to potential ways of responding to this argument on behalf of someone who holds a vacuist view of counterpossibles.
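The bookkeeping behind examples (12) and (13) can be checked mechanically. The Python sketch below is an illustration under stated assumptions rather than anything taken from the number-theoretic literature: the function names `spoof_divisors` and `is_spoof_perfect` are hypothetical, and a “spoof factorization” is represented here as a list of (base, exponent) pairs in which every base is treated as if it were prime, whether or not it is. It requires Python 3.8 or later for `math.prod`.

```python
from itertools import product
from math import prod


def spoof_divisors(factorization):
    """Divisors generated from (base, exponent) pairs, treating every
    base as if it were prime (so 4 is never split into 2 * 2)."""
    powers = [[base ** k for k in range(exp + 1)] for base, exp in factorization]
    return sorted(prod(choice) for choice in product(*powers))


def is_spoof_perfect(factorization):
    """True if the generated divisors, leaving out the number itself,
    sum to the number."""
    n = prod(base ** exp for base, exp in factorization)
    return sum(spoof_divisors(factorization)) - n == n


sixty_spoof = [(3, 1), (4, 1), (5, 1)]    # 60 = 3 * 4 * 5, with 4 treated as prime
sixty_genuine = [(2, 2), (3, 1), (5, 1)]  # 60 = 2^2 * 3 * 5, the actual factorization
descartes = [(3, 2), (7, 2), (11, 2), (13, 2), (22021, 1)]  # 22,021 treated as prime

print(spoof_divisors(sixty_spoof))
print(is_spoof_perfect(sixty_spoof))
print(is_spoof_perfect(sixty_genuine))
print(is_spoof_perfect(descartes))
```

The same function is applied to the spoof factorization and to the genuine factorization of 60; only the first yields a divisor sum equal to the number, which is why the status of 4 makes a difference to the verdict.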
5
Vacuist Responses: Analysis
5.1
Mathematical Practice Is Irrelevant
Perhaps the most extreme response to the above challenge is to dismiss it entirely on the grounds that appeal to mathematical practice ought to carry no weight in the debate over the proper analysis of counterpossibles. I shall be brief here, and not just because this response effectively puts an end to the present discussion. It is a widely held belief among both contemporary metaphysicians and contemporary philosophers of mathematics that mathematical practice is entirely relevant to the evaluation of competing analyses of concepts which appear within the borders of mathematics, which is not to say that practice overrides all other factors. But if a philosophical analysis implies that significant portions of the mathematical community are systematically wrong about some area of mathematical discourse, then this is – and should be – taken to be a serious drawback for the analysis in question. The earlier discussion of reductio reasoning in mathematics as a potential thorn in the side of the vacuist account of counterpossibles was predicated on the assumption that if it were to be shown that mathematicians use counterpossibles in a way that conflicts with the vacuist account, then it would be a problem for the latter analysis. And David Lewis, an archetypal vacuist about counterpossibles, has made numerous remarks about the importance of respecting mathematical practice as a condition of adequacy on philosophical theorizing. One of his better-known proclamations is the following:
Mathematics is an established, going concern. Philosophy is as shaky as can be. To reject mathematics for philosophical reasons would be absurd. . . . I’m moved to laughter at the thought of how presumptuous it would be to reject mathematics for philosophical reasons. (Lewis (1991, pp. 58–59). The intervening passage between the two quoted segments reads, “If we philosophers are sorely puzzled by the classes that constitute mathematical reality, that’s our problem. 
We shouldn’t expect mathematics to go away to make our life easier. Even if we reject mathematics gently – explaining how it can be a most useful fiction, ‘good without being true’ – we still reject it, and that’s still absurd. Even if we hold onto some mutilated fragments of mathematics that can be reconstructed without classes, if we reject the bulk of mathematics, that’s still absurd.”)
5.2
“Spoof Perfect Number Theory” Is Not Part of Serious Mathematics
A second line of counterargument is to acknowledge that systematic patterns of reasoning exhibited in mathematical practice do carry weight, at least in principle, but to argue that the investigation of spoof perfect numbers is not a legitimate part of “mainstream” mathematical practice. If so, then the core argument may perhaps still be dismissed. It is true that discussions of spoof perfect numbers are uncommon, even within number theory. On the other hand, results and conjectures concerning spoof perfect numbers can be found in “respectable” mathematical publications – for example, in R.K. Guy’s influential survey of open questions in number theory (Guy 2004), and in refereed journals such as Integers and Mathematics of Computation (see Nielsen (2003, 2015) and Dittmer (2014). For additional recent work focused specifically on spoof perfect numbers, see also Arnaldo and Dris (2017), and BYU Computational Number Theory Group (2020)). What about links between spoof perfect numbers and other issues of demonstrable significance in number theory (or indeed elsewhere in mathematics)? Links of this sort are one aspect of what G.H. Hardy has termed the “seriousness” of a piece of mathematics. (“A chess problem is genuine mathematics, but it is in some way ‘trivial’ mathematics. However ingenious and intricate, however original and surprising the moves, there is something essential lacking. Chess problems are unimportant. The best mathematics is serious as well as beautiful – ‘important’ if you like, but the word is very ambiguous, and ‘serious’ expresses what I mean much better” (Hardy 1940, p. 16).) Unsurprisingly, the area to which research into spoof perfect numbers is most commonly connected is research into perfect numbers simpliciter, and in particular odd perfect numbers. As already mentioned, whether there exist any odd perfect numbers is an important and open question in number theory. 
The consensus seems to be that it is unlikely that there are any odd perfect numbers, partly because various increasingly strong constraints have been proved concerning the size and properties of any such number. For example, it is known that any odd perfect number must be larger than $10^{1500}$, must have at least 101 (not necessarily distinct) prime divisors, and must have the form $p^{4a+1} m^2$, where $p \equiv 1 \pmod{4}$ is prime. (For these and further conditions on odd perfect numbers, see Holdener and Rachfal (2019, p. 541). The nineteenth-century mathematician Sylvester is quoted as saying that “the existence of [an odd perfect number] – its escape, so to say, from the complex web of conditions which hem it in on all sides – would be little short of a miracle.”) In Sect. 6 below, I shall discuss an example where results concerning spoof odd perfect numbers are used to draw conclusions about “genuine” odd perfect numbers. A third indicator of the mathematical interest of spoof perfect numbers is the extent to which the basic notion of spoofness has been generalized or modified to yield other potentially fruitful mathematical concepts. One generalization of Descartes’ notion is to that of a multispoof perfect number. As the name suggests, these are numbers which are perfect if two or more of their composite factors are assumed to be prime. For example,
(14) If 4 and 6 were prime, then 840 would be perfect.
(This example is due to Wang (2002). The full “prime” factorization is 840 = 4 · 5 · 6 · 7. It is straightforward (though tedious) to verify that the various combinations of these factors sum to 840, as required.) Moving even further in this direction, a number is completely spoof perfect if it can be expressed as a product of composite numbers, and it would be perfect if all of these composite factors were prime. For example,
390405312000 = 4 · 8 · 9 · 10 · 15 · 22 · 46 · 94 · 95 is an even spoof perfect number if you suppose incorrectly that all these factors are primes. (ibid)
Reformulating this in conditional form yields the following counterarithmetical:
(15) If 4 and 8 and 9 and 10 and 15 and 22 and 46 and 94 and 95 were prime, then 390,405,312,000 would be perfect.
Another extension that is worth mentioning, partly because it is more radical in nature, is to negative spoof perfect numbers. This involves allowing negative numbers as potential prime factors, for example, when we write 6 = (−2) · (−3), thus opening up a whole new class of cases. An example of a negative spoof perfect number is given by Voight: N = 22,017,975,903 = $3^4 \cdot 7^2 \cdot 11^2 \cdot 19^2 \cdot (-127)$ “is perfect if we are willing to ignore the fact that 127 is negative” (Voight 2003). Finally, there are various theorems and conjectures concerning spoof perfect numbers, results that we will discuss in more detail in the next subsection. These provide more evidence, at least prima facie, that the topic – though small in scope – is of genuine mathematical interest. Lurking in the background here are broader questions concerning the relation between mathematical practice and philosophy. What force does appeal to evidence from mathematical practice have in attacking or defending philosophical claims about mathematics? And can distinctions be made between different sorts of mathematical practice? Ought we to privilege “serious” academic mathematics over recreational mathematics? Published mathematics over unpublished? Statements made in the course of a proof over “stage-setting” introductory remarks? At this juncture, I will limit myself to a fairly modest claim: that there is enough going on in the mathematical investigation of spoof perfect numbers for it to deserve to be taken seriously as data for the analysis and evaluation of counterpossibles.
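The “straightforward (though tedious)” verifications mentioned for (14) and (15), and for Voight’s negative example, can be shortened by multiplying out one geometric sum per spoof prime instead of listing every divisor. The Python sketch below does this under assumptions that are mine rather than the cited authors’: the names `spoof_value`, `spoof_sigma`, and `is_spoof_perfect` are hypothetical; perfection is tested in the equivalent form “sum of all spoof divisors equals twice the number”; and Voight’s example is read as $3^4 \cdot 7^2 \cdot 11^2 \cdot 19^2 \cdot (-127)$, so that the signed product is the negative of the figure 22,017,975,903. It requires Python 3.8 or later for `math.prod`.

```python
from math import prod


def spoof_value(factorization):
    """The number obtained by multiplying out the (base, exponent) pairs."""
    return prod(base ** exp for base, exp in factorization)


def spoof_sigma(factorization):
    """Sum of all spoof divisors: one sum 1 + b + ... + b^e per base b,
    multiplied together, each base being treated as if it were prime."""
    return prod(sum(base ** k for k in range(exp + 1)) for base, exp in factorization)


def is_spoof_perfect(factorization):
    """Proper spoof divisors sum to n exactly when all spoof divisors sum to 2n."""
    return spoof_sigma(factorization) == 2 * spoof_value(factorization)


multispoof = [(4, 1), (5, 1), (6, 1), (7, 1)]                      # example (14)
completely_spoof = [(b, 1) for b in (4, 8, 9, 10, 15, 22, 46, 94, 95)]  # example (15)
negative_spoof = [(3, 4), (7, 2), (11, 2), (19, 2), (-127, 1)]     # assumed reading of Voight's N

for name, f in [("multispoof", multispoof),
                ("completely spoof", completely_spoof),
                ("negative spoof", negative_spoof)]:
    print(name, spoof_value(f), is_spoof_perfect(f))
```

The product form is used here only because it keeps the check short; it gives the same total as summing every combination of the listed factors, provided each listed factor is treated as indivisible.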
5.3
Mathematically Acceptable Spoof Perfect Numbers Are Interesting
What about responding to the core argument by borrowing one of the vacuist strategies from the discussion of reductio reasoning, and making a pragmatic distinction between truth and mathematical acceptability? The idea is to take aim
at Premise 1 and argue that, strictly speaking, the set of spoof perfect numbers is not interesting, because (so the vacuist maintains) all counterarithmeticals are true. However, what mathematicians are picking out in their examples of spoof perfect numbers are part of a subset of mathematically acceptable spoof perfect numbers. And these may well be quite interesting.

What makes a particular spoof perfect number mathematically acceptable? Just as in the reductio case, acceptability hinges on the availability of a step-by-step derivation of the consequent (that the number is perfect) from the antecedent (that a particular composite factor is prime).

There are at least a couple of reasons for thinking that this pragmatic strategy will not be as effective here as it was for reductio conditionals. First, it turns out that there is always a simple derivation of the consequent from the antecedent available, because – unlike in typical reductio proofs – we already have the negation of the antecedent in our stock of basic theorems and results. (Of course, if the claim being proved by reductio has already been proved by some other method, then we may indulge in the pretense that it is not available as a theorem that can be introduced during the course of the new proof. Nonetheless, the canonical case of reductio reasoning proceeds from genuine ignorance over the truth-value of the initial assumption.) For example, we can quickly prove that 59 is spoof perfect as follows.

(1) 4 is prime                           Assumption
(2) So 4 is prime or 59 is perfect       1, Addition
(3) 4 is not prime                       Theorem
(4) Hence, 59 is perfect                 2, 3, Disjunctive Syllogism
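The four-step derivation above is an instance of a fully general pattern, which can be stated without any reference to arithmetic. The following is a minimal sketch, assuming Lean 4; the propositions `P` (standing for “4 is prime”) and `Q` (standing for “59 is perfect”) are left abstract, and the theorem name is hypothetical.

```lean
-- P plays the role of "4 is prime", Q the role of "59 is perfect".
theorem spoof_by_explosion (P Q : Prop) (h1 : P) (h3 : ¬P) : Q :=
  -- Step (2): Addition turns the assumption h1 into a disjunction.
  have h2 : P ∨ Q := Or.inl h1
  -- Step (4): Disjunctive Syllogism, using h3 to discard the left disjunct.
  h2.elim (fun hp => absurd hp h3) (fun hq => hq)
```

Nothing in the proof depends on what `Q` says, which is why the derivation goes through for any consequent whatsoever.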
Clearly, this same pattern of reasoning can be applied to prove the literal truth of any claim about spoof perfectness. So the vacuist is in a bind because it is not just the truth of all counterarithmeticals that is uninteresting but also the trivial easiness of proving individual cases.

Second, and more important, present in the mathematical literature are theorems and conjectures concerning the existence of spoof perfect numbers of various sorts. Many of these results concern completely spoof perfect numbers (recall from the previous section that a number is completely spoof perfect if it would be perfect under the assumption that all of its composite factors are prime). For example,

Theorem 1: For a given number of composite factors, the number of even completely spoof perfect numbers is finite.
Theorem 2: Every completely spoof perfect number has at least six composite factors.
Theorem 3: 907,200 is the smallest even completely spoof perfect number.

The details and proofs of these theorems can be set aside for the moment. What is crucial is that they each make claims that are simply false if – as the vacuist
maintains – all numbers are spoof perfect (and multispoof perfect, and completely spoof perfect). Not only does the vacuist position render the above “theorems” literally false, but it also settles in a trivial manner certain conjectures which are taken by the number theoretic community to be open (and interesting). For example,

Conjecture: There are no odd completely spoof perfect numbers.

To summarize, the inclusion in number theoretic practice of theorems and conjectures of the above sort brings the pragmatic strategy of the vacuist into direct conflict with mathematical practice. If all counterarithmeticals are indeed literally true, then the bulk of these theorems are literally false. And no appeal to a mathematical acceptability/unacceptability distinction will get around this fact.

One glimmer of hope does remain for the proponent of a pragmatic vacuist strategy. She could attempt to formalize the notion of “interestingly true counterarithmetical” and then interpret the above theorems as making claims that are implicitly restricted to the class of “interesting” spoof perfect numbers (I owe this suggestion to Bill Childs). However, the burden of proof is then on the vacuist to come up with an adequate characterization of “interesting.” I shall postpone discussion of this option until we have some more technical apparatus in place.
5.4
Vacuously True Counterarithmeticals Do Not Imply That All Numbers Are Spoof Perfect
This response on behalf of the vacuist takes aim at Premise 3 of the core argument. The idea is to question the inference from vacuously true counterarithmeticals to all numbers being (trivially) spoof perfect. As we shall see, there is something to this idea, depending on how exactly one decides to define spoof perfectness. But even on a definition which allows for some numbers not to be spoof perfect, even for the vacuist, the resulting set of numbers still seems to have little independent mathematical interest. Hence, the overall core argument is unaffected.

The point of this response is to focus on the precise definitional link between spoof perfect numbers and counterarithmeticals. Recall the sorts of cases that have been presented, for example,

(16) If 4 were prime, then 60 would be perfect.

The weakest way of defining spoof perfectness is to say that a number is spoof perfect if and only if it features in the consequent of a counterarithmetical of the above sort. More precisely,

Definition 1: A number, n, is spoof perfect if and only if there is some m such that (If m were prime, then n would be perfect) is true.
On this definition, a vacuist account of counterpossibles implies that all numbers are spoof perfect. However, we could also put further constraints on the conditional claim. One obvious constraint is to add that the conditional in question be a counterfactual, in other words that m not be prime. In itself, this does not reduce the extent of spoof perfect numbers. But it does if we combine it with a further plausible constraint that m be a proper divisor of n. Adding these twin constraints yields the following alternative definition:

Definition 2: A number, n, is spoof perfect if and only if there is some composite m such that m is a proper divisor of n and (If m were prime, then n would be perfect) is true.

It is worth noting that for the nonvacuist, the only difference between these two definitions is whether actually perfect numbers are counted as spoof perfect since, in Definition 1, presumably only ms that are proper factors of n will feature in true counterfactuals about the perfectness of n. But for the vacuist, the difference between the two definitions is more significant. Whereas Definition 1 implies that all numbers are spoof perfect, Definition 2 restricts the spoof perfect numbers to those which have at least one composite proper divisor (since this is just the condition placed on m).

So, to summarize, there is nothing to stop the vacuist from picking Definition 2 and thus making room for not all numbers being spoof perfect. However, it is easy to see that the resulting set consists of all and only those numbers that have at least one composite proper divisor, in other words numbers with at least three prime factors (counted with multiplicity). Even on this refinement of the vacuist position, therefore, spoof perfect numbers must be equated with a much more basically defined set, leaving no reason for mathematicians to characterize them in such a convoluted, counterarithmetical way.
I conclude that the refined vacuist position still does not fit with mathematical practice and hence that this response on behalf of the vacuist is insufficient.
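The equivalence invoked in this argument, between having a composite proper divisor and having at least three prime factors, can be spot-checked by brute force. The following is a minimal sketch, assuming prime factors are counted with multiplicity; the function names are hypothetical, and the check covers only a finite range rather than proving the general claim.

```python
def is_prime(k):
    """Trial-division primality test for small integers."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def has_composite_proper_divisor(n):
    """True if some composite m with 4 <= m < n divides n."""
    return any(n % m == 0 and not is_prime(m) for m in range(4, n))

def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return count

# On a vacuist reading of Definition 2, these are the "spoof perfect" numbers.
vacuist_spoof = [n for n in range(1, 31) if has_composite_proper_divisor(n)]
print(vacuist_spoof)

# The two characterizations agree on every number in the tested range.
print(all(has_composite_proper_divisor(n) == (big_omega(n) >= 3)
          for n in range(1, 2000)))
```

The list that results for the vacuist contains numbers such as 8 and 12, which no mathematician would single out as spoof perfect.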
5.5
Spoof Perfect Numbers Need Not Be Defined Using Counterarithmeticals
Thus far, we have looked at challenges to Premise 1 and to Premise 3 of the core argument, as well as attempts to dismiss the force of the argument outright. I now want to turn to possible attacks on what is in many ways the crucial claim of the argument, encapsulated in Premise 2:

2. Each spoof perfect number is defined using a counterarithmetical.

In the original formulation of the core argument, this premise was left intentionally vague. In particular, the “is” in Premise 2 could be read (in increasing order of strength) as
. . . may be defined using a counterarithmetical.
. . . is always defined using a counterarithmetical.
. . . must be defined using a counterarithmetical.
The idea behind challenging this claim is to argue that the counterfactual locutions that tend to be used when introducing the notion of spoof perfect number – “if 22,021 were prime,” “if 4 were prime,” etc. – are rhetorical “fluff.” Not only can they be dispensed with when giving a more precise and explicit definition of spoof perfectness, but they also ought to be dispensed with. If it turns out that precise definitions can be given which make no use of counterpossibles, then the vacuist would seem to be off the hook. Vacuous counterpossibles can then happily coexist with a contentful role for spoof perfect numbers in mathematics.
6
Paraphrasing Away Counterarithmeticals
6.1
Sigma Functions
What has been presented at the end of Sect. 5 is a program for a defense against the core argument, not a defense in itself. Its success will depend crucially on whether counterpossible-free definitions of spoof perfect number, multispoof perfect number, and so on can be formulated. The approach I shall consider here is based on the apparatus of sigma-functions. I have chosen this approach for two reasons. First, the use of sigma-functions is already well-entrenched in the theory of perfect numbers. Second (and relatedly), it is to sigma-functions that mathematicians most often turn when explicating the more intricate aspects of spoof perfect numbers (see Nielsen (2003, p. 9) and Guy (2004, p. 72)).

Definition: The sigma-sum of a natural number, n, written $\sigma(n)$, is the sum of all of the proper and improper divisors of n.

It will be useful for subsequent discussion to note the following theorem concerning sigma-sums:

Theorem: If the prime decomposition of a number, n, is $p_1^{a} \cdot p_2^{b} \cdots$, then $\sigma(n) = (1 + p_1 + p_1^2 + \dots + p_1^{a})(1 + p_2 + p_2^2 + \dots + p_2^{b})(\dots)$.

The proof is too involved to give here, but the result is intuitive if one thinks of the divisors of a number being made up of all possible combinations of the elements of its prime decomposition. It can be seen that multiplying out the $\sigma$ product yields each divisor as one element of the final sum.

Lemma: n is a perfect number if and only if $\sigma(n) = 2n$.
This follows immediately from the definition of the sigma-sum, since by definition a number, n, is perfect if it is the sum of all its divisors other than itself. Hence, the sum of all its divisors, including itself, is equal to 2n.

The basic strategy for defining spoof perfect numbers is to construct a new “sigma*-sum” corresponding to each counterarithmetical antecedent, such as “If 4 were prime.” The sigma* function corresponding to this assumption treats 4 as if it were a prime number for the purposes of calculating sums of divisors. This involves first expressing the number whose spoof perfectness we are investigating as the product of powers of 4 together with a product of prime factors. (In order to guarantee uniqueness, the highest power of 4 possible must be included in the factorization. For example, $16 = 2^2 \cdot 4$ would not be a legitimate “4-factorization.”) For example, $20 = 4 \cdot 5$; $24 = 4 \cdot 2 \cdot 3$; $75 = 3 \cdot 5^2$; $80 = 4^2 \cdot 5$, and so on. If the components of such a factorization are $c_1^{a} \cdot c_2^{b} \cdots$, then $\sigma^*(n) = (1 + c_1 + c_1^2 + \dots + c_1^{a})(1 + c_2 + c_2^2 + \dots + c_2^{b})(\dots)$. As can be seen, this directly mirrors the Theorem for sigma-sums that applies to (purely) prime factorizations. If $\sigma^*(n) = 2n$, then n is spoof perfect.

For illustrative purposes, let us run through the above procedure with 4 as our “counterarithmetical prime” and 60 as our number being investigated. $60 = 4 \cdot 3 \cdot 5$, so $\sigma^*(60) = (1 + 4)(1 + 3)(1 + 5) = 5 \cdot 4 \cdot 6 = 120$. So $\sigma^*(n) = 2n$, and 60 is spoof perfect.

With this apparatus in place, we can agree to accept as correct those counterarithmeticals which correctly describe a $\sigma^*$-function of this sort. So “If 4 were prime then 60 would be perfect” is correct, but “If 4 were prime then 59 would be perfect” is not correct. (The latter counterarithmetical is not correct because the 4-factorization of 59 is 59, so $\sigma^*(59) = 1 + 59 = 60 \neq 2 \cdot 59$.)
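The procedure just described can be sketched in code. This is a minimal illustration, assuming trial division is adequate for the small numbers involved and that the supposed prime m is greater than 1; the function names (`prime_factorization`, `geometric_sum`, `sigma`, `sigma_star`) are hypothetical and are not taken from the number-theoretic literature.

```python
def prime_factorization(n):
    """Trial-division factorization of n >= 1 as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def geometric_sum(base, exponent):
    """1 + base + base**2 + ... + base**exponent."""
    return sum(base**k for k in range(exponent + 1))

def sigma(n):
    """Ordinary sum of divisors, via the product formula in the Theorem."""
    total = 1
    for p, a in prime_factorization(n).items():
        total *= geometric_sum(p, a)
    return total

def sigma_star(n, m):
    """Sigma*-sum of n under the supposition that m (> 1) is prime.

    The highest possible power of m is divided out first, as the text
    requires for uniqueness; the cofactor is then treated normally.
    """
    power = 0
    while n % m == 0:
        n //= m
        power += 1
    return geometric_sum(m, power) * sigma(n)

print(sigma_star(60, 4), 2 * 60)   # the two values agree
print(sigma_star(59, 4), 2 * 59)   # the two values differ

# Descartes' example, with 22021 treated as if it were prime.
descartes = 3**2 * 7**2 * 11**2 * 13**2 * 22021
print(sigma_star(descartes, 22021) == 2 * descartes)
```

Dividing out the supposed prime before factoring the remainder is what makes the “m-factorization” unique for a single composite m.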
The counterarithmeticals are not to be read literally (since, given the vacuist account, they are all trivially true) but rather as gesturing at the corresponding claims about $\sigma^*$-functions. All that remains is to define the absolute notion of a number being spoof perfect, when this is not tied to any particular counterarithmetical claim. If we introduce subscript notation to indicate the composite number whose primality is being assumed, so $\sigma^*_4(n)$ is the $\sigma^*$-function for n under the supposition that 4 is prime, then we can define spoof perfectness as follows:

Definition: A number, n, is spoof perfect if and only if $\exists m$ such that m is a composite proper divisor of n and $\sigma^*_m(n) = 2n$.

What about the other varieties of spoofness that were mentioned earlier? I will not pause at this point to formulate a sigma-function definition of multispoof perfect number, though I think it would be quite straightforward to do so. As for completely spoof perfect number, this requires extending the $\sigma^*$ notation to allow for multiple subscripts. Recall that a number is completely spoof perfect if and only if it can be expressed as the product of composite factors and would be perfect under the assumption that all of these composite factors are prime. So, for composite numbers a, b, c, . . ., let the (a, b, c, . . .)-factorization of n be a factorization into the product of powers of a, b, c, . . . (This involves abandoning uniqueness of the result of the $\sigma^*$
operation because in many cases there will be more than one way to express a number as the product of given composite factors. For example, consider factoring 1024 into a product of the composite numbers 4 and 16. Not only $1024 = 4 \cdot 16^2$, but also $1024 = 4^3 \cdot 16$. It can be verified that these two factorizations yield distinct $\sigma^*$-sums.) Then define $\sigma^*_{a,b,c,\dots}(n)$ to be the $\sigma^*$-function for n under the supposition that a, b, c, . . . are prime.

Definition: A number, n, is completely spoof perfect if and only if $\exists a, b, c, \dots$ such that a, b, c, . . . are composite proper divisors of n and $\sigma^*_{a,b,c,\dots}(n) = 2n$.

Finally, to illustrate the expressive power of the sigma-notation approach, we shall show how one of the theorems concerning spoof perfect numbers that was mentioned in Sect. 5.3 above can be “translated” into the language of $\sigma$-functions.

Theorem 2: Every completely spoof perfect number has at least six composite factors.

Theorem 2 ($\sigma^*$ version): $\forall n\ \forall a, b, c, \dots$ (If a, b, c, . . . are composite proper divisors of n, and if $\sigma^*_{a,b,c,\dots}(n) = 2n$, then the number of composite factors of n $\geq 6$).
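The loss of uniqueness noted above for the multiple-subscript case can be made concrete by enumerating every factorization of a number into powers of the given bases. The following is a minimal sketch, assuming that an exponent of zero is allowed (so that a base may be left unused) and that n is at least 1; the function names are hypothetical.

```python
from math import prod

def factorizations(n, bases):
    """Yield every exponent tuple e with prod(b**e_i) == n, one e_i per base.

    Exponents may be 0. Every base must exceed 1 and n must be at least 1,
    otherwise the loop below would not terminate.
    """
    if not bases:
        if n == 1:
            yield ()
        return
    base, rest = bases[0], bases[1:]
    exponent = 0
    while True:
        for tail in factorizations(n, rest):
            yield (exponent,) + tail
        if n % base != 0:
            break
        n //= base
        exponent += 1

def sigma_star(bases, exponents):
    """Sigma*-sum for one factorization, every base treated as if prime."""
    return prod(sum(b**k for k in range(e + 1)) for b, e in zip(bases, exponents))

def sigma_star_values(n, bases):
    """All sigma*-sums that n receives across its factorizations."""
    return {sigma_star(bases, e) for e in factorizations(n, bases)}

# 1024 as a product of powers of 4 and 16: more than one factorization.
for e in factorizations(1024, (4, 16)):
    print(e, sigma_star((4, 16), e))

# Example (15) has a single factorization over its nine composite bases.
nine = (4, 8, 9, 10, 15, 22, 46, 94, 95)
print(sigma_star_values(390405312000, nine) == {2 * 390405312000})
```

Because the definition of completely spoof perfect number is existential, it suffices that at least one factorization yields $\sigma^* = 2n$; the set returned by `sigma_star_values` makes that condition straightforward to test.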
6.2
Discussion
It is clear, I think, that the sigma-function approach sketched above provides the vacuist with a powerful tool for resisting the force of the core argument. It uses apparatus that is already quite familiar from mainstream number theory, and it allows definitions and theorems concerning spoof perfect numbers to be formulated without involving any countermathematical elements. No proof has been given that this method of “sigma-paraphrase” can always be applied to claims concerning spoof perfect numbers, but there is no reason to think that it cannot, and plenty of reason to view this as a quite general method for eliminating reference to counterarithmeticals from mathematical presentations of spoof perfect numbers.

The vacuist sees the core argument against their position as resting on a claim of indispensability: Spoof perfect numbers must be defined using counterarithmeticals, hence if counterarithmeticals are vacuous then spoof perfect numbers are uninteresting. Let us grant for the moment that the vacuist has succeeded in undermining this indispensability claim. Does this eliminate this challenge to the vacuist position? Talk of indispensability and of paraphrasing things away brings to mind the Quine-Putnam argument for the existence of mathematical objects, and nominalist challenges to the argument such as Hartry Field’s Science Without Numbers (Field 1980). The presumption of this debate is that the goal for the nominalist is to come up with an adequate, mathematical-object-free formulation of our best scientific
theories. However, there is another strand of antinominalist argument, advocated by neo-Fregeans such as Crispin Wright, which questions whether even a successful recipe for paraphrasing away mathematical objects would bolster the nominalist position. Consider one area where it does seem to be straightforwardly possible to eliminate explicit quantification over mathematical objects, namely in using (finite) numbers to count objects. As is well known, the reference to number in an expression such as

(17) The number of cows in the field is 3

can be eliminated, using just first-order logic with identity, by re-expressing (17) as

(18) $\exists x\, \exists y\, \exists z\, [Cx \mathbin{\&} Cy \mathbin{\&} Cz \mathbin{\&} x \neq y \mathbin{\&} x \neq z \mathbin{\&} y \neq z \mathbin{\&} \forall w\, (Cw \rightarrow (w = x \vee w = y \vee w = z))]$
The nominalist takes this to show that utterance of (17) does not ontologically commit us to numbers. The neo-Fregean reply is that if (18) really has the same content as (17), does this not – conversely – show that (18) has ontological commitment to numbers?

Returning to the case of spoof perfect numbers, the analogous question concerns the precise relationship between a counterarithmetical assertion about spoof perfect numbers, such as

(19) If 4 were prime, then 60 would be perfect,

and its “sigma-paraphrase”

(20) $\sigma^*_4(60) = 120\ (= 2 \cdot 60)$.

The issue of what to say about the relation between (19) and (20) presents something of a dilemma for the vacuist. If, for example, she postulates some strong sort of equivalence between the two, then this would seem to conflict with her broader commitment to the vacuous truth of all counterpossibles. For this would in turn imply that, strictly speaking, all claims of the form of (19) are true, whereas clearly many assertions concerning the value of $\sigma^*$ functions are false. In other words, the vacuist cannot rationally maintain both that spoof counterarithmeticals are equivalent to claims about sigma-functions and that all spoof counterarithmeticals are true.

A second, more appealing option, is to drop any claim of equivalence between (19) and (20) and say instead that (20) gives the literal meaning of what is expressed, in rhetorical and mathematically imprecise language, by (19). In other words, the model for the relation is more like that which putatively holds between an expression such as “Bob had butterflies in his stomach” and its more literal explication, “Bob felt nervous.” Does this literal/metaphorical distinction give the right kind of relation between (20) and (19) for the vacuist’s purpose?
The answer to this question depends partly on how the vacuist intends her claims to link to mathematical practice. On the one hand, she might be claiming that what mathematicians really mean when they assert a spoof counterarithmetical such as (19) is a claim about sigma-functions along the lines of (20). Following John Burgess’s terminology (Burgess 1983) (in the context of the platonism-nominalism debate), call this hermeneutic vacuism. On the other hand, the vacuist might be claiming that what mathematicians ought to do when making assertions about counterpossibles is to replace claims such as (19) with their more mathematically precise analogs such as (20). Call this revolutionary vacuism.

Each of these alternatives faces some difficulties. The hermeneutic vacuist must face the fact that many appearances of spoof perfect numbers in the mathematical literature are not accompanied by any mention of sigma functions. Even more seriously, the original example from Descartes predates the development of the sigma-function formalism (by Euler) by over one hundred years! What has the hermeneutic vacuist to say about such cases, where mathematicians make spoof counterarithmetical claims without having any awareness of the sigma-function alternative? This would seem to indicate that mathematicians’ understanding of spoofness is not dependent on translation into sigma notation. (Or should we say that Descartes was writing purely metaphorically in his letter to Mersenne . . .?)

The revolutionary vacuist faces a different sort of difficulty. For, unlike the hermeneutic vacuist, she is advocating that mathematicians change certain aspects of their practice. In particular, they should stop using counterarithmetical locutions in introducing and specifying spoof perfect numbers. But another way of viewing this call for revision is as a conflict between revolutionary vacuism and mathematical practice.
And – although this is not a fatal flaw – this is enough to cast serious doubts on the cogency of this brand of vacuism. There is much more to be said here on both sides, but hopefully enough has been done to show the connections between this issue and some broader debates in metaphysics concerning indispensability and ontological commitment.

I shall close this section with a separate objection to the sigma-paraphrase strategy, namely that it fails to account for a certain kind of use to which mathematicians put spoof perfect numbers. In a 2003 paper, Pace Nielsen proves an upper bound on the size of an odd perfect number, expressed in terms of its number of distinct prime factors. At the end of the paper, Nielsen discusses Descartes’ spoof odd perfect number (and shows its spoof perfectness using sigma-function notation). He then makes the following remarks:

To demonstrate . . . the weakness of the bound above, we have the following: . . . If we restrict ourselves to numbers that can be expressed as a power of 22021 and other factors relatively prime to 19 and 61, and if we replace $\sigma$ by $\sigma^*$, then the proof of the above theorem [on upper bounds] still goes through. So it should be true that for $N = 3^2 \cdot 7^2 \cdot 11^2 \cdot 13^2 \cdot 22021$ we have $N < 2^{1024}$. In fact $2^{1024}/N \approx 4.407 \times 10^{317}$. This demonstrates one of the difficulties in proving the non-existence of odd perfect numbers. (Nielsen 2003, p. 9)
What Nielsen does here is first show that the upper bound he has proved gives a very weak result for Descartes’ spoof odd perfect number, since the bound is larger
than the number itself by a factor of $10^{317}$. He then uses this fact to motivate the claim that the bound is likely also to be very weak for nonspoof odd perfect numbers. To be sure, this latter move is not a formal proof of the weakness of the bound. But the use of investigations into the properties of spoof odd perfect numbers to cast light on the properties of genuine odd perfect numbers (if any there be) is not uncommon in this area of mathematics. The question for the vacuist, then, is whether this sort of “heuristic” use of spoof perfect numbers can be justified once everything has been translated into sigma-functions.

There are at least a couple of reasons for thinking this might be a problem for the vacuist. First, the sigma-function approach is rather formalistic in nature. It is true that the relevant $\sigma^*_m$-function can be precisely defined, for each composite number m, but why should the values of these $\sigma^*_m$-functions have any bearing on the regular $\sigma$-function expressing the sum of divisors? Second, the sigma-function approach abandons any role for the property of being a genuine perfect number in defining the notion of spoof perfect number. This is in contrast with the counterarithmetical approach, which defines spoof perfectness by pointing to a counterfactual situation in which the given number is literally perfect. This encourages the view that spoof perfect numbers are closely connected to genuine perfect numbers – they almost were perfect! – and hence provides motivation for heuristic arguments such as Nielsen’s which apply conclusions from one to the other.
7
Nonvacuism and the Evaluation Problem
Thus far, I have deliberately focused attention on the vacuist account of counterpossibles because the challenge posed by spoof counterarithmeticals in mathematical practice is most immediate for views which classify all counterpossibles as vacuously true. However, I think that the counterarithmetical examples presented earlier also offer resources for nonvacuist analyses of counterpossibles. In this section, I shall briefly outline why. To make the discussion more concrete, I shall use a specific nonvacuist analysis to illustrate the general problem. Consider an analysis along the lines proposed by Daniel Nolan and others, which adds impossible worlds to the Lewisian framework (Nolan 1997). The idea is to keep the form of Lewis’s definition of counterfactuals, but to save counterpossibles from vacuity by allowing their antecedents to pick out the closest impossible worlds. A counterpossible is false if its consequent is false at one of these closest impossible worlds. In allowing for false counterpossibles, this analysis makes room for nontrivial counterarithmeticals and thus for a mathematically interesting class of spoof perfect numbers. So far so good, but the nonvacuist is not yet off the hook. Ideally, we would like not just a framework for analyzing counterpossibles but also guidance about how to evaluate the truth or falsity of specific examples. It is important not to set the bar unreasonably high here when we ask the nonvacuist for guidelines for evaluating counterarithmeticals. If the framework is one of impossible worlds, then counterpossibles will be evaluated at those
impossible worlds that are most similar to the actual world. Thus the request is mainly for guidance about how to assess comparative similarity. And the issue of how to cash out similarity is a thorny problem for possible-worlds approaches in general. Nonetheless, there are reasons for thinking that counterarithmeticals present two distinctive extra difficulties, stemming from their contradictory antecedents and from their mathematical subject matter. For the moment, I will focus just on the second of these features.

One approach to giving guidance is to provide rules of thumb for assessing relative similarity between two worlds. David Lewis provides some such rules in a 1979 paper (Lewis 1979, p. 472):

1. It is of first importance to avoid big, widespread, diverse violations of law.
2. It is of second importance to maximize the spatio-temporal region throughout which perfect match of particular fact prevails.
3. It is of third importance to avoid even small, localized, simple violations of law.
4. It is of little or no importance to secure approximate similarity of particular fact, even in matters that concern us greatly.

There are several reasons why applying these rules to counterfactuals with mathematical subject matter is problematic. For example, is there an intuitively clear divide in mathematics between “laws” and “particular facts”? Distinguishing laws in terms of their modal status clearly will not work, since “All even numbers are divisible by 2” and “7 is a prime number” are both necessary, if either is – nor will making a distinction in terms of generality, or being universally quantified in form. (Compare, “All numbers between 6 and 8 are prime,” which is universally quantified but seems nonlawlike, and “0 is a natural number” which seems both particular and lawlike.) A second problem is how to make sense of – or provide analogs for – notions of spatiotemporal matching for mathematical facts. (See Baker (2003, pp. 256–260) for more discussion of how to apply Lewis-style rules of thumb to counterfactuals with mathematical subject matter.)

Perhaps, it will be possible to come up with rules of thumb for assessing similarity between worlds where different mathematical facts obtain. But at present, this is an open question, and – as such – a challenge to the nonvacuist position. Note also that to be useful in the context of evaluating spoof counterarithmeticals, such guidelines will need to answer quite specific questions, concerning, for example, whether a world in which 6 is prime and 6 is a factor of 12 is more similar to the actual world than a world in which 6 is prime and 6 is not a factor of 12. (The difficulty is that if 6 is a factor of 12, then (presumably) $12 = 2 \cdot 6$. But also $12 = 2 \cdot 2 \cdot 3$, so this implies that uniqueness of prime factorization does not hold.)

It may be that some alternative, nonsimilarity-based analysis will be more effective in analyzing counterarithmeticals. Baron et al. (2020), for example, consider an approach inspired by counterfactual analyses of causation in which a distinction is made between “upstream” and “downstream” facts. They briefly consider a number of ways of trying to flesh out these notions in the mathematical case, where no causal notions are in play, but end up acknowledging that they are
leaving “the distinction between upstream and downstream facts at an intuitive level” (Baron et al. 2020, p. 8).
8
Conclusions
I hope to have shown in this chapter how the role of counterarithmetical reasoning in mathematical investigations of spoof perfect numbers presents both a prima facie challenge and a potentially valuable resource to philosophical theories of counterfactuals. To be adequate, any such theory must have something to say about counterpossibles – counterfactuals with impossible antecedents. If the analysis implies that all counterpossibles are vacuously true, then the challenge is to explain the systematic patterns of acceptance and rejection of different arithmetical counterpossibles by practicing mathematicians. If the analysis allows for nonvacuously true counterpossibles, then the task is to provide guidelines for evaluating arithmetical counterpossibles that are precise enough to underpin their role in actual mathematical practice.
References
Arnaldo J, Dris B (2017) The abundancy index of divisors of spoof odd perfect numbers. Indian J Number Theory 2017:13–26
Baker A (2003) Does the existence of mathematical objects make a difference? Australasian J Philos 81(2):246–264
Baron S (2020) Counterfactual scheming. Mind 129:535–562
Baron S, Colyvan M, Ripley D (2017) How mathematics can make a difference. Philosophers’ Imprint 17:1–19
Baron S, Colyvan M, Ripley D (2020) A counterfactual approach to explanation in mathematics. Philos Math 28(1):1–34
Bennett J (2003) A philosophical guide to conditionals. Oxford University Press, New York
Burgess J (1983) Why I am not a nominalist. Notre Dame J Formal Logic 24:93–105
BYU Computational Number Theory Group (2020) Odd, spoof perfect factorizations. arXiv:2006.10697v1
Crubellier M, Sip J (1997) Looking for perfect numbers. History of Mathematics: History of Problems, Paris, pp 389–410
Dittmer S (2014) Spoof odd perfect numbers. Math Comput 83:2575–2582
Field H (1980) Science without numbers. Princeton University Press, Princeton
Guy R (2004) Unsolved problems in number theory. Springer, New York
Hájek A (2016) Philosophical heuristics and philosophical methodology. In: Cappelen et al (eds) The Oxford handbook of philosophical methodology. Oxford University Press, Oxford
Hardy G (1940) A mathematician’s apology. Cambridge University Press, Cambridge
Holdener J, Rachfal E (2019) Perfect and deficient perfect numbers. Am Math Mon 126(6):541–546
Kvart I (1986) A theory of counterfactuals. Hackett, Indianapolis
Lewis D (1973) Counterfactuals. Blackwell, Oxford
Lewis D (1979) Counterfactual dependence and time’s arrow. Noûs 13:455–476
Lewis D (1991) Parts of classes. Blackwell, Oxford
Lewis-Beck M et al (eds) (2004) The Sage encyclopedia of social science research methods. Sage, Thousand Oaks
Counterpossibles in Mathematical Practice: The Case of Spoof Perfect Numbers
27
Maddy P (1997) Naturalism in mathematics. Oxford University Press, Oxford Nielsen P (2003) An upper bound for odd perfect numbers. Dermatol Int 3:A14 Nielsen P (2015) Odd perfect numbers, diophantine equations, and upper bounds. Math Comput 84: 2549–2567 Nolan D (1997) Impossible worlds: A modest approach. Notre Dame J. Formal Logic 38:535–572 Ojeda A (2005) Are intensions necessary? Sense as the construction of reference, Semantics Archive Reutlinger A, Colyvan M, Krzyzanowska K (2020) The prospects for a monist theory of non-causal explanation in science and mathematics. Erkenntnis. https://doi.org/10.1007/s10670-02000273-w Voight J (2003) On the nonexistence of odd perfect numbers. In: Katok et al (eds) MASS Selecta: Teaching and learning advanced undergraduate mathematics. American Mathematical Society 293–300 Wang P (2002). http://www.primepuzzles.net
Cultures of Mathematical Practice in Alexandria in Egypt: Claudius Ptolemy and His Commentators (Second–Fourth Century CE) Alberto Bardi
Contents
1 Introduction
2 Claudius Ptolemy and His Time: A Brief Overview of His Life and Works
3 Ptolemy’s Legacy in Alexandria: Pappus and Theon
4 Ptolemy’s Philosophy of Mathematical Practice: Astronomy, Astrology, and Astronomical Tables
5 Ptolemy’s Mathematical Practice in the Alexandrian Context: Texts, Languages, and Genres
6 Socio-historical and Material Aspects of Ptolemy’s Mathematical Practice
7 Conclusion
8 Cross-References
References
Abstract
Claudius Ptolemy’s mathematical astronomy originated in Alexandria in Egypt under Roman rule in the second century CE and held for more than a millennium, even beyond the Copernican theories (sixteenth century). To trace the flourishing of such mathematical creativity requires an understanding of Ptolemy’s philosophy of mathematical practice, the ancient commentators of Ptolemaic works, and the historical context of Alexandria in Egypt, a multicultural city which became a cradle of cultures of mathematical practices and blossomed into the Ptolemaic system and its several related outputs between the second and fourth centuries CE. Ptolemy’s mathematical practice will be explored under three lenses: (a) an examination of the mathematical prose of Ptolemy’s Almagest alongside the corresponding passages in its Alexandrian commentaries; (b) the social, material, and semantic layers of Ptolemy’s mathematics; and (c) the conception of true philosophy in Ptolemy’s theory of knowledge. Examining the history and philosophy of Ptolemy’s mathematical practice in Alexandria (1) shows that Ptolemy’s ideals were usually betrayed by the practitioners of his mathematics and (2) proves that its main aspect is constituted by mathematical knowledge arranged in tabular format, that is, the so-called astronomical tables, to which Ptolemy’s mathematics owes its successful legacy.

Keywords
Alexandria in Egypt · Almagest · Astronomical tables · Commentaries · Handy Tables · Pappus · Ptolemy · Tables · Theon of Alexandria

A. Bardi (*)
Department of the History of Science, Tsinghua University, Beijing, China
e-mail: [email protected]
© Springer Nature Switzerland AG 2023
B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_111-1
1 Introduction
The mathematical astronomy of Claudius Ptolemy originated in Alexandria in Egypt under Roman rule in the second century CE and held sway for more than a millennium, even beyond the Copernican theories of the sixteenth century (Boll 1894; Toomer 1975; Taub 1993; Feke 2018; Juste et al. 2020). Tracing the origins of such mathematical creativity calls for an examination of Ptolemy’s philosophy of mathematical practice, the ancient commentators of Ptolemaic works, and the historical context of Alexandria in Egypt, a multicultural city which became a cradle of cultures of mathematical practices and flourished into the Ptolemaic system and its several related outputs (e.g., astronomical tables) between the second and the fourth centuries CE. Alexandria was founded in 331 BCE by Alexander the Great during his campaign in Africa and the Middle East. Because of the city’s favorable strategic position at the crossroads of Asia, Africa, and Europe, Alexander established the capital of his empire there. The city would soon benefit from cross-cultural encounters, which made it a prominent cosmopolitan center. Historical documents describe Alexandria’s population as composed of native Egyptians, Greeks, Jews, Persians, Ethiopians, Syrians, Romans, and Arabs (Fraser 1972; Netz 2020, 257–305). The credit for making Alexandria an intellectual hub of the Hellenistic world goes to Ptolemy the First (ruled 305–283 BCE), the general who took over control of Egypt upon Alexander’s death. Inspired by the model of famous Greek intellectual circles such as those formed around Pythagoras, Plato, and Aristotle, Ptolemy I decided that Alexandria should become the center of Greek culture in this new world. His patronage of the arts and sciences took shape in the Museion (museum), a sacred place devoted to the Muses, daughters of Zeus and Mnemosyne, the personification of memory, and their protector Apollo.
Adjacent to the Museion, Ptolemy I constructed a library, which was intended not only to house and preserve important works but also to be open to the general public. This famous library contained a huge number of volumes for the time and was meant to contain all human knowledge in one place (Blum 1991, 95–123). Ptolemy I invited scholars from all over the world to Alexandria and supported them with scholarship grants. Consequently, all manner of scholars gathered at this Museion, which was akin to
a society of fellows or an institute of advanced studies: mathematicians, poets, philosophers, philologists, astronomers, geographers, physicians, historians, and artists. How such exceptional circumstances came about remains a matter of historiographical discussion, but two factors seem to have deeply influenced the character of the culture which grew out of the mixture of peoples and scholars and the broadened physical horizons. First, the Alexandrians’ commercial interests brought geographical and navigational problems to the fore and directed attention to materials, methods of production, and the improvement of skills. Second, because commerce was carried out by free people who were not segregated socially from the scholars, the latter became aware of and involved in the problems facing the people at large. As a result, scholars were induced to unite their flourishing theoretical studies with concrete, practical, scientific, and engineering investigations. Technical fields were pursued and extended, training schools were established, and sciences were fostered and cultivated. Later, the wars among Alexander’s successors ended with the ascendancy of Rome, and Egypt became a Roman province in 30 BCE. Alexandria under Roman rule was the environment in which Claudius Ptolemy worked, and it formed the background against which his mathematical creativity could thrive and spread. The science of that time, usually described as “Hellenistic,” benefitted from encounters with several traditions that merged and mingled thanks to the cultural institutions in Alexandria, namely, the Museion and the Library (Neugebauer 1957, 145–190). The literary culture of Alexandria became so rich and elaborate that texts would reflect connections between arts and sciences; mathematics played alongside poetry in new ways (Netz 2009).
The accumulated mass of data in literature and science became so vast that early systems of bibliography and cataloguing had to be invented, and the art of philology could thrive (Blum 1991). In astronomy, scholars could draw on knowledge stemming from the Egyptian, Greek, and Babylonian traditions. Besides providing a huge volume of records of observational data for astronomers, the Greek and the Babylonian traditions had conceived theoretical frameworks which were influential for Ptolemy and his generation: the Greeks conceived the cosmos geometrically, while the Babylonians dealt with celestial phenomena arithmetically (Neugebauer 1957, 97–144). Ptolemy’s mathematical astronomy would produce new knowledge by combining the two traditions (Neugebauer 1957, 191–207; Goldstein 2007). Ptolemy’s mathematical practice will be explored under three lenses: (a) an examination of the mathematical prose of Ptolemy’s Almagest alongside the corresponding passages in its Alexandrian commentaries; (b) the social, material, and semantic layers of Ptolemy’s mathematics; and (c) the conception of true philosophy in Ptolemy’s theory of knowledge.
2 Claudius Ptolemy and His Time: A Brief Overview of His Life and Works
Claudius Ptolemy lived in Alexandria in the second century (ca. 100–ca. 170), a period which historiography usually describes with the term pax romana (Roman peace). The time was characterized by bilingualism (Greek and Latin) and saw four renowned Roman emperors in power: Trajan (ruled 98–117), Hadrian (ruled 117–138), Antoninus Pius (ruled 138–161), and Marcus Aurelius (ruled 161–180) (Goldsworthy 2016). People with access to education usually studied toward a full command of both Greek and Latin, because the spheres of influence of the two languages had been shaped through cultural-historical processes: Greek had become the language of science and philosophy, Latin the language of law, administration, and trade. Any Roman of the second century who was intellectually ambitious had to learn Greek; this was achieved chiefly with the help of Greek tutors or through years of study in Athens, Alexandria, or any other city in the eastern provinces (Brown 1971; Cribiore 2001; Watts 2012). Ptolemy wrote his works in Greek, and they are the only source we have about his life, apart from later indirect sources. For instance, the tenth-century Byzantine lexicon Suda mentions Ptolemy as an Alexandrian philosopher. This may sound surprising to a modern readership, but astronomy and mathematics were considered part of philosophy for centuries, and this was a common framework in Hellenism and the Middle Ages (Taub 1993; Feke 2018; Jones 2020). Without pretending to map all the semantic possibilities of philosophy in Antiquity and the Middle Ages, it is worth noting that Ptolemy himself provides a philosophical statement in Almagest Book 1 in which he claims that mathematics is the highest form of philosophy and thus provides the highest form of knowledge (see below). Ptolemy’s polymathy is indeed well reflected in his scholarly outputs.
To mention some of his works, on the sciences of the heavens he authored the Almagest or Mathematical Syntaxis (Toomer 1984), the Handy Tables (Tihon and Mercier 2011), the Tetrabiblos (Robbins 1940), and the Planetary Hypotheses (Goldstein 1967; Hamm 2011). On other sciences, he wrote the Geography (Berggren and Jones 2000), the Optics (Mark Smith 1996), the Harmonics (Swerdlow 2004), and also a philosophical work, the Criterion, alongside further, less explored works (Toomer 1975; Jones 2020). The Almagest is Ptolemy’s most successful work and is essential to understanding his mathematical practice (Toomer 1984; Goldstein 2007; Pedersen 2011). The text makes evident how Ptolemy drew on Babylonian, Egyptian, and Greek astronomical traditions to generate a new mathematical system to be applied to astronomy: in sum, the Babylonian sexagesimal system and astronomical parameters alongside observational records, the Egyptian arrangement of the calendar and division of the day, and Greek geometrical devices to account for celestial motions. The opus consists of 13 books. The first two are introductory, explaining astronomical assumptions and mathematical methods. Ptolemy proves the sphericity of the Earth and postulates the sphericity of the heavens and their revolution around the Earth immobile in the center. He discusses and redetermines the obliquity of the
ecliptic by using trigonometry and relying on previous astronomers (or philosophers), Hipparchus and Menelaus of Alexandria. Every distance on the sphere is an angular one; the measurement of angles is replaced by the consideration of the chords subtending the corresponding arcs. The circle is divided into 360 parts and the diameter into 120 parts. Ptolemy used sexagesimal numbers: each of the 60 parts of the radius was divided into 60 small parts, and these again were divided into 60 smaller ones. A table of chords (sides of regular polygons) was computed for every half degree, each chord being expressed in parts of the radius, minutes, and seconds. The table of chords is followed by a geometrical argument leading to the calculation of the relations of arcs of the equator, ecliptic, horizon, and meridian. The same kind of discussion is continued in Book 2 with reference to the length of the longest day at a given latitude. Book 3 deals with the length of the year and the theory of the Sun, with the exposition of the models of eccentrics and epicycles, both probably already developed by the Greek astronomer Apollonius of Perga around the second century BCE. Book 4: length of the month and theory of the Moon. Book 5: construction of the astrolabe; theory of the Moon continued; diameters of the Sun, Moon, and Earth’s shadow; distance of the Sun; dimensions of the Sun, Moon, and Earth. Book 6: solar and lunar eclipses. Books 7–8: catalogue of the stars (Graßhoff 1990); precession of the equinoxes; the Milky Way; and the construction of a celestial globe. Books 9–13: planetary motions. Book 9: order of the planets according to their distances from the Earth and periods of revolution; Mercury. Book 10: Venus and Mars. Book 11: Jupiter and Saturn. Book 12: stationary points and retrogressions. Book 13: motions of planets in latitude, inclinations, and magnitudes of their orbits.
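The chord table described above can be sketched numerically. The following is only a minimal illustration and not Ptolemy’s own procedure: it assumes the modern identity crd(a) = 2R sin(a/2) with R = 60 parts, whereas the Almagest derives its chords geometrically. The function names are hypothetical and chosen for this example.

```python
import math

RADIUS_PARTS = 60  # the radius has 60 parts, so the diameter has 120


def chord(arc_degrees: float) -> float:
    """Length of the chord subtending an arc, in parts of the radius.

    Assumption: uses the modern identity crd(a) = 2 R sin(a / 2)
    instead of the geometrical derivation found in the Almagest.
    """
    return 2 * RADIUS_PARTS * math.sin(math.radians(arc_degrees) / 2)


def to_sexagesimal(value: float) -> tuple[int, int, int]:
    """Round a non-negative value to (parts, minutes, seconds) in base 60."""
    total_seconds = round(value * 3600)
    parts, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return parts, minutes, seconds


def chord_table(step_degrees: float = 0.5) -> list[tuple[float, tuple[int, int, int]]]:
    """Tabulate chords for arcs from one step up to 180 degrees."""
    n_rows = round(180 / step_degrees)
    return [
        (i * step_degrees, to_sexagesimal(chord(i * step_degrees)))
        for i in range(1, n_rows + 1)
    ]


if __name__ == "__main__":
    # Print the first few rows in the "parts;minutes,seconds" notation.
    for arc, (p, m, s) in chord_table()[:4]:
        print(f"{arc:5.1f} deg -> {p};{m},{s}")
```

With a half-degree step the table has 360 rows; the chord of a 60° arc comes out as 60;0,0 (the side of the regular hexagon equals the radius), and the chord of a 90° arc as 84;51,10.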
The goal of the Almagest was to educate its readers to reach a contemplative status of the laws governing the heavens and thus become god-like, in the Platonic sense. Not accidentally, the collection of Greek poems from Late Antiquity known under the title Palatine Anthology provides an epigram ascribed to Ptolemy, in which the possibility of reaching divine realms in Platonic terms through the study of astronomy and detachment of the self from the earth is put into verse (Page 1981, 112). One outcome of the Almagest is to provide, to use a current word, the “know-how” to construct astronomical tables. A full command of this “know-how” enables the drawing of schematic maps of the heavens for the present, the past, and the future. Ptolemy himself produced such a set of tables, the Handy Tables, a structured set of astronomical tables that he composed after completing the Almagest, largely adapting them from the tables embedded in that treatise (Tihon and Mercier 2011). An astronomical table is an arrangement of numbers (plus other graphic signs), which stand for the values assumed by a quantity as a function of values of an argument. The tables can be considered a quantitative representation of the geometric model, in a tabular format. Practically, on the one hand, a table is approachable to people uninterested in the mathematical theory; on the other hand, it is also deemed the aim of tabular astronomy, for Ptolemy and his peers regard table-making as a good astronomical practice. These two aspects will prove crucial for the outcomes of Ptolemy’s mathematical practice. While a table never appears on its own in the Almagest, but rather as a constitutive element of a
mathematical argument that shows how the table is derived from the geometric model (Sidoli 2014), by contrast, the Handy Tables are tables on their own. The goal of Ptolemy’s tables, in both Almagest and Handy Tables, is to show that apparently irregular motions are actually based on the periodic motions of circles. Ptolemy’s tables superseded other formats of astronomical knowledge, such as the tables of Babylonian tradition, which had been produced according to arithmetic models (Jones 1997a, b, 1998, 2000). The tables acquired immense importance in antiquity and in the medieval traditions of the Western and Eastern Mediterranean and the Middle East: kings, caliphs, and emperors alike granted funding to have the best and most accurate astronomical tables (Kennedy 1956; King et al. 2001; Chabás and Goldstein 2012).
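The idea of a table as the values of a quantity tabulated against an argument, derived from a geometric model, can be illustrated with a short sketch. Everything in it is an assumption made for the example: the eccentric model with radius 60 and eccentricity 2;30 parts, the 6° step, the use of modern trigonometric functions, and the linear interpolation between rows. It is not a reconstruction of any table in the Almagest or the Handy Tables.

```python
import math

RADIUS = 60.0        # radius of the circle, in parts
ECCENTRICITY = 2.5   # 2;30 parts: an illustrative value, assumed here


def equation_deg(mean_anomaly_deg: float) -> float:
    """Angle between mean and apparent position in a simple eccentric model."""
    a = math.radians(mean_anomaly_deg)
    return math.degrees(
        math.atan2(ECCENTRICITY * math.sin(a), RADIUS + ECCENTRICITY * math.cos(a))
    )


def build_table(step_deg: int = 6) -> list[tuple[int, float]]:
    """Tabulate (argument, value) rows for arguments from 0 to 180 degrees."""
    return [(arg, equation_deg(arg)) for arg in range(0, 181, step_deg)]


def look_up(table: list[tuple[int, float]], argument_deg: float) -> float:
    """Read a value off the table, interpolating linearly between rows."""
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= argument_deg <= x1:
            return y0 + (y1 - y0) * (argument_deg - x0) / (x1 - x0)
    raise ValueError("argument outside the tabulated range")
```

A user of the table only needs `look_up`; the geometry is confined to `build_table`. This mirrors the distinction drawn above between a table embedded in a derivation and a table that stands on its own.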
3 Ptolemy’s Legacy in Alexandria: Pappus and Theon
The fourth-century Alexandrian mathematicians Pappus (Cuomo 2000) and Theon (Toomer 1976) are essential points of reference when exploring Ptolemy’s mathematical practice. Between 300 and 350 CE, Pappus authored several works, the most famous being the Synagoge or Collection of mathematical problems (Jones 1986); his Commentary on the Almagest survives in fragments (Rome 1931–1945). Theon was active in the second half of the fourth century. Remembered as the last attested member of the Museion, he was either a contemporary of Pappus or belonged to the generation immediately following him. He co-authored or completed the Commentary on the Almagest initiated by Pappus, and he also published two commentaries on the Handy Tables, which are known as The Great Commentary on the Handy Tables (Mogenet and Tihon 1985; Tihon 1991; Tihon 1999) and The Little Commentary on the Handy Tables (Tihon 1978). The former is extensive, spanning five books: it explains not only how to use the tables but also the reasons behind the operations to be carried out and the theory behind the tables’ construction and, accordingly, provides geometrical proofs. By contrast, The Little Commentary is just a collection of instructions: each chapter provides procedures on how to use the tables. Due to its practice-oriented nature, it would soon become the model for all subsequent generations of authors writing handbooks on how to use astronomical tables, from Late Antiquity till Early Modernity, in Christian, Jewish, and Islamicate contexts. Pappus’s and Theon’s commentaries were chiefly meant to explain the difficult content of the Almagest to students and provide them with the basic knowledge to access its theorems (Jones 1999).
It turned out that their practice of commenting on Ptolemy generated a new kind of mathematical practice, which, though conservative in nature, granted the tools to further develop Ptolemy’s mathematics to subsequent generations – similar to what happened to other commentaries on Greek mathematics, the “deuteronomic texts,” as Netz (1998) dubs them.
4 Ptolemy’s Philosophy of Mathematical Practice: Astronomy, Astrology, and Astronomical Tables
As anticipated, according to Ptolemy’s theory of knowledge, the highest form of philosophy is mathematics. More precisely, he attaches to this conception not just a mere theoretical dimension but a practical one too, intending the two sides to be mutually beneficial: all this is evident in the first passage of the Almagest:

The true philosophers, Syrus, were, I think, quite right to distinguish the theoretical part of philosophy from the practical. For even if practical philosophy, before it is practical, turns out to be theoretical, nevertheless one can see that there is a great difference between the two: in the first place, it is possible for many people to possess some of the moral virtues even without being taught, whereas it is impossible to achieve theoretical understanding of the universe without instruction; furthermore, one derives most benefit in the first case [practical philosophy] from continuous practice in actual affairs, but in the other [theoretical philosophy] from making progress in the theory. Hence we thought it fitting to guide our actions (under the impulse of our actual ideas [of what is to be done]) in such a way as never to forget, even in ordinary affairs, to strive for a noble and disciplined disposition, but to devote most of our time to intellectual matters, in order to teach theories, which are so many and beautiful, and especially those to which the epithet ‘mathematical’ is particularly applied. (Toomer 1984, 35)
Furthermore, the notion of mathematics as the highest form of philosophy in Ptolemy is related to the knowledge and practice of astrology. Astrology, in Ptolemy’s theory of knowledge, is the “physical” branch of the sciences of the stars, and the most influential systematization of knowledge labelable as “astrology” was redacted by Ptolemy himself in his Tetrabiblos (Robbins 1940). To understand astrology through the eyes of Ptolemy and the Alexandrian scholars, let us read his introduction to the Tetrabiblos:

Of the means of prediction through astronomy, O Syrus, two are the most important and valid. One, which is first both in order and in effectiveness, is that whereby we apprehend the aspects of the movements of sun, moon, and stars in relation to each other and to the earth, as they occur from time to time; the second is that in which by means of the physical character of these aspects themselves we investigate the changes which they bring about in that which they surround. The first of these, which has its own science, desirable in itself even though it does not attain the result given by its combination with the second, has been expounded to you as best we could in its own treatise by the method of geometrical proofs. We shall now give an account of the second and less self-sufficient method in a properly philosophical way, so that one whose aim is the truth might never compare its perceptions with the sureness of the first, unvarying science, for he ascribes to it the weakness and unpredictability of material qualities found in individual things, nor yet refrain from such investigation as is within the bounds of possibility, when it is so evident that most events of a general nature draw their causes from the enveloping heavens.
But since everything that is hard to attain is easily assailed by the generality of men, and in the case of the two before-mentioned disciplines the allegations against the first could be made only by the blind, while there are specious grounds for those leveled at the second—for its difficulty in parts has made them think it completely incomprehensible, or the difficulty of escaping what is known has disparaged even its object as useless—we shall try to examine briefly the measure of both the possibility and the usefulness of such prognostication before offering detailed instruction on the subject. (Translation Robbins 1940, 2–4; slightly reworked, my emphasis)
In sum, the science of the stars has a mathematical branch, whose higher degree of certainty is granted by geometrical proofs, and a less self-sufficient physical branch, that is, astrology, which hinges upon the certainty of the former. Astrology is knowable on the grounds of the regular occurrence of the same effects of the Sun and the Moon and of other regular occurrences observed by farmers and sailors. For these observational records to be scientific, one needs to add an accurate knowledge of the movements of the heavenly bodies. In other words, one needs to study the mathematics contained in the Almagest, which, as said, was meant to teach readers the mathematics needed to reach a contemplative status of the laws governing the heavens and thus become god-like, in the Platonic sense. Those who can reach the contemplative status are, by implication, able to construct astronomical tables as taught by the Almagest. As noted above, having a full command of the “know-how” of the astronomical tables allows one to have schematic maps (horoscopes) of the heavens for the present, the past, and the future (North 1986). These maps can be interpreted properly after studying the Tetrabiblos. On all these grounds, the true philosopher, according to Ptolemy, must learn both the Almagest and the Tetrabiblos, and that is the way to reach full contemplation, the highest form of knowledge granted by the study of mathematics.
5 Ptolemy’s Mathematical Practice in the Alexandrian Context: Texts, Languages, and Genres
An analysis of Ptolemy’s mathematical practice can be undertaken via several approaches: for instance, (a) by recomputing the mathematical procedures and tables provided in Ptolemy’s works (e.g., Van Brummelen 1993 recomputed Ptolemy’s tables) or (b) by examining the Almagest and the corresponding passages in Alexandrian commentaries, paying attention to the language the authors employed and the literary contexts in which they were working (they did not have to write “papers”; their canonical genres were different). The latter has never been done in these terms so far. A case study in this regard is offered here. Within such an approach, Ptolemaic texts must be understood in the cultural context of the production of knowledge in which Ptolemy, Pappus, and Theon were living, that is, the Museion and the Library of Alexandria, which embody and symbolize the institutionalization of the culture of mathematical practice that Ptolemy and his successors had developed. The products that have survived from that age are the texts, which act as both goals and tools for further development of the knowledge which in Alexandria was acquired, catalogued, and systematized (Renn 2020, 302–303, 308–309). In the case of astronomy, the knowledge stemming from Egyptian, Babylonian, and Greek astronomy could be systematized and used to generate the Ptolemaic system. On this account, the focus will be on the mathematical prose of Ptolemy, Pappus, and Theon. The working questions are as follows: (1) What is the nexus between the Almagest and the commentaries of Pappus and Theon of Alexandria, and why are the
commentaries needed? (2) Are the commentaries didactic books, or how should they be conceived? (3) What is the role of the tables in Ptolemaic treatises? (4) What changes between Ptolemy and his commentators can be explained by historical circumstances, and how? To study Greek mathematical prose is to study a language that is basically composed of words, numbers, and diagrams. For historical reasons, it lacks the radical formalization of contemporary mathematics, but its powerful linguistic resources were the basis for the development of Western mathematics, and its hypothetico-deductive structure shaped a solid cognitive substratum that pervaded Western mathematics till Modernity, as set out in The Shaping of Deduction in Greek Mathematics: A Study in Cognitive History (Netz 1999). A study of Ptolemy’s mathematical discourse in terms of the function of individual units provided a classification of types of mathematical prose: theorem, problem, analysis, computation, table, and description (Sidoli 2004). A seminal study on the Greek language of Euclid has detected three different stylistic codes: the demonstrative code, the procedural code, and the algorithmic code (Acerbi 2021). The three studies mentioned in this paragraph have offered insightful interpretative tools for the present examination of Ptolemy’s mathematical practice. The present case study focuses on one of the most renowned and influential topics in the history of astronomy, namely, the assumption that celestial bodies are subject to regular and circular motion, according to Aristotle’s physical assumptions for the superlunar world. This assumption finds one of its most renowned textual witnesses in Almagest Book 3, in which the theory of the Sun is set out. Hinging upon the Aristotelian assumption, Ptolemy provides an epicyclic and an eccentric hypothesis in order to account geometrically for the apparent variations in the motions of the Sun.
This subject, furthermore, was chosen because it is one of the most prominent in the history of astronomy. On the one hand, the assumption of regular circular motion was one of the issues for which several medieval astronomers criticized Ptolemy. On the other hand, the very same assumption was the basis for Copernicus’s hypothesis of heliocentrism. The equivalence of the eccentric and the epicyclic models is expounded, inter alia, in Almagest Book 3:

Our next task is to demonstrate the apparent anomaly of the sun. But first we must make the general point that the rearward displacements of the planets with respect to the heavens are, in every case, just like the motion of the universe in advance, by nature uniform and circular. That is to say, if we imagine the bodies or their circles being carried around by straight lines, in absolutely every case the straight line in question describes equal angles at the centre of its revolution in equal times. The apparent irregularity [anomaly] in their motions is the result of the position and order of those circles in the sphere of each by means of which they carry out their movements, and in reality there is in essence nothing alien to their eternal nature in the ‘disorder’ which the phenomena are supposed to exhibit. The reason for the appearance of irregularity can be explained by two hypotheses, which are the most basic and simple. When their motion is viewed with respect to a circle imagined to be in the plane of the ecliptic, the centre of which coincides with the centre of the universe (thus its centre can be considered to coincide with our point of view), then we can suppose, either that the uniform motion of each [body] takes place on a circle which is not concentric with the universe, or that they have
such a concentric circle, but their uniform motion takes place, not actually on that circle, but on another circle, which is carried by the first circle, and [hence] is known as the ‘epicycle’. It will be shown that either of these hypotheses will enable [the planets] to appear, to our eyes, to traverse unequal arcs of the ecliptic (which is concentric to the universe) in equal times. (Translation Toomer 1984, 141)
This is followed by an exposition of the eccentric and the epicyclic hypotheses and their equivalence with reference to the motions of the Sun. Theorems, proofs, and diagrams are the basic elements of the prose. Theorems and proofs are written in a Greek that follows the canonical language of Greek geometry as found in Euclid’s Elements (the demonstrative code, following Acerbi 2021). The following diagram summarizes the equivalence of the two hypotheses graphically. Imagine that we are on Earth, center of the universe, at point D, center of the circle ADG. We will end up finding the Sun at point Z at a certain time from a certain latitude whether we assume that it moves along the circle EΘH, with center Θ, eccentric from D, or that it moves along the small circle (epicycle) whose center is B, which moves along the circle ADG.
[Diagram of the eccentric and epicyclic hypotheses for the Sun, from Toomer 1984, 149]
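The equivalence of the two hypotheses can also be checked numerically. The following is a minimal sketch in modern coordinates, not a reconstruction of Ptolemy's own procedure. It assumes a radius of 60 parts and an eccentricity of 2.5 parts (the solar value 2;30 used here purely for illustration), places Earth at the origin, and measures the angle from the direction of the apogee. All function and constant names are hypothetical and introduced only for this example.

```python
import math

RADIUS = 60.0        # radius of the eccentric / concentric circle, in "parts" (assumed)
ECCENTRICITY = 2.5   # distance of the eccentric's center from Earth (assumed)


def eccentric_position(mean_angle):
    """Sun moving uniformly on a circle whose center is displaced from Earth."""
    center_x, center_y = ECCENTRICITY, 0.0  # center lies toward the apogee
    return (center_x + RADIUS * math.cos(mean_angle),
            center_y + RADIUS * math.sin(mean_angle))


def epicyclic_position(mean_angle):
    """Sun on an epicycle carried by a circle concentric with Earth."""
    # The epicycle's center moves uniformly on the concentric circle.
    center_x = RADIUS * math.cos(mean_angle)
    center_y = RADIUS * math.sin(mean_angle)
    # Relative to the line from Earth to the epicycle's center, the Sun turns
    # at the same rate in the opposite sense, so the epicycle radius keeps a
    # fixed direction in space (here, the direction of the apogee).
    absolute_angle = mean_angle + (-mean_angle)
    return (center_x + ECCENTRICITY * math.cos(absolute_angle),
            center_y + ECCENTRICITY * math.sin(absolute_angle))


def apparent_longitude(position):
    """Direction of the Sun as seen from Earth, in degrees from the apogee."""
    x, y = position
    return math.degrees(math.atan2(y, x)) % 360.0


def max_discrepancy(steps=360):
    """Largest distance between the two models' positions over one revolution."""
    worst = 0.0
    for k in range(steps):
        angle = 2.0 * math.pi * k / steps
        ex, ey = eccentric_position(angle)
        px, py = epicyclic_position(angle)
        worst = max(worst, math.hypot(ex - px, ey - py))
    return worst


if __name__ == "__main__":
    for degrees in (0, 90, 180, 270):
        angle = math.radians(degrees)
        print(degrees,
              round(apparent_longitude(eccentric_position(angle)), 3),
              round(apparent_longitude(epicyclic_position(angle)), 3))
    print("max discrepancy:", max_discrepancy())
```

Under these assumptions both functions return the same point for every mean angle, which is the content of the equivalence. At a mean angle of 90 degrees the apparent longitude falls slightly short of 90 degrees in both models: equal angles at the center of revolution appear as unequal arcs when seen from Earth, which is the anomaly described in the quoted passage.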
Both hypotheses are structured according to a common scheme shared by practitioners of Greek geometry, initially developed by Euclidean scholars and later fixed into writing by Proclus (fifth century) in his Commentary on the First Book of Euclid’s Elements (Morrow 1970). It reads as follows: Every problem and every theorem that is furnished with all its parts should contain the following elements: an enunciation, an exposition, a determination, a construction, a proof, and a conclusion. Of these the enunciation states what is given and what is being sought from it, for a perfect enunciation consists of both these parts. The exposition takes separately what is given and prepares it in advance for use in the investigation. The determination takes separately the thing that is sought and makes clear precisely what it is. The construction adds what is lacking in the given for finding what is sought. The proof draws the proposed inference by reasoning scientifically from the propositions that have been admitted. The conclusion reverts to the enunciation, confirming what has been proved. So many are the parts of a problem or a theorem. The most essential ones, and those which are always present, are enunciation, proof, and conclusion; for it is alike necessary to know in advance what is being sought, to prove it by middle terms, and to collect what has been proved. It is impossible that any of these three should be lacking; the other parts are often brought in but are often left out when they serve no need. (Translation Morrow 1970, 159, slightly revised; my emphasis)
Throughout the Almagest, Ptolemy takes Euclidean geometry for granted, and this seems to generate misunderstandings in the canonical style of Greek mathematics, especially if the reader is not trained in Euclidean geometry. Pappus and Theon intervene and provide the hidden knowledge in their commentaries (Jones 1999). The corresponding passages of Almagest Book 3, in Pappus’s and Theon’s Commentary on the Almagest (Rome 1931–1945: III, 845 ff.) feature the demonstrative code and the procedural one. But, most importantly, their commentary is a paraphrase and summary of the subjects. The common elements of sentences are constituted by expressions like: “here he [meaning Ptolemy] says” + “direct quotations from the text of the Almagest.” Moreover, Pappus and Theon provide passages from Euclid or other canonical works in circumstances which Ptolemy did not explain. It is evident that the focus is on the tables, and certain keywords – such as kanonographia (writing tables) and kanonopoiia (constructing tables) – are emphasized (Rome 1931–1945: III, 891). The computations of the tables of the Sun feature the procedural code (Rome 1931–1945: III, 907 ff.), but also computations in the second person of the imperative mood to describe an operation, in a paratactic syntax (the algorithmic code, following Acerbi 2021), and these computations are aimed at summing up the operations expounded in the preceding part by applying them to a computation sample, according to the values on which tables are structured, that is, the cycle of years, months, days, hours, and latitude of Alexandria (e.g., Rome 1931–1945: III, 911). Notably, the next chapter of the commentary (Rome 1931–1945: III, 912 ff.) is devoted to the correspondences between the computations in the Almagest and the Handy Tables. This shows that such computations were not just copied from the Almagest but reworked by Ptolemy himself.
Theon’s Great Commentary on the Handy Tables treats the theory of the Sun (Mogenet and Tihon 1985, 103) by explaining how to derive the regular motions of the Sun in order to build tables, and it teaches how to read the tables. Proofs are combined with procedures. By contrast, the Little Commentary on the Handy Tables just provides procedures on how to use the Handy Tables, that is, how to work out the operations and which table to choose for a given problem, which column, which line. In the latter work, there is absolutely no requirement of understanding the theory underlying the tables. To sum up:
1. There is a shift from complexity of syntax and morphology to a simplification (e.g., the same computations in different codes).
2. The notions of epicycle and eccentric are progressively lost.
3. The diagram is unnecessary, thus absent, in the tables and in the commentary thereon.
4. According to Sidoli’s units, while Ptolemy’s Almagest and Pappus’s and Theon’s commentary thereon contain units 1 to 6 (see above), the focus on the tables leads to a deployment of just 5 (the tables) and 6 (description) in the Handy Tables. At the same time, there is a loss of theorems and problems but an increase in (3) analysis (of tables) and (4) computation (within the system of the tables) in the commentaries on tables.
5. The texts reflect the orientation toward practice that astronomy underwent after Ptolemy’s Almagest:
a. greater attention placed on the tables and how to use them (kanonopoiia and kanonographia);
b. prevalence of the procedural code.
6. Process of systematization: in commenting on the Almagest, Pappus and Theon not only follow Ptolemy’s division into chapters but also enumerate theorems to allow readers to orient themselves in the theory.
7. Commentators likely intervened where they perceived the prose as incomplete.
The commentaries to the Almagest are therefore not to be understood as mere didactic tools or simplified versions of the work they are commenting on, but as proper, self-standing works (Chemla and Most 2022), which allowed the preservation and continuation of mathematical practice (“deuteronomic texts,” as Netz (1998) puts it).
6 Socio-historical and Material Aspects of Ptolemy’s Mathematical Practice
The case study offered above shows a shift of focus from the construction of tables toward the use of tables. At the same time, a huge change occurs at the epistemological level, after the second century, as an outcome of what practitioners of Ptolemaic mathematics were experiencing. The proofs that one was supposed to master to become a true philosopher were perceived as being too difficult. Therefore,
practitioners chose to focus on the use of tables and to omit theorems and proofs. To be sure, the ideals of Ptolemy’s philosophy of mathematical practice were betrayed, already in the Alexandrian context. Pappus and Theon provide evidence of the difficulty of the subject and the problems encountered by students in Alexandria. The social history of astronomy in the fourth century in Alexandria is recounted in Theon’s Commentary on the Almagest, Book 1: What has been left without a commentary seems difficult. As for those people about whom Ptolemy, at the beginning of the work [i.e. the Almagest], says are willing to prove the whole topic geometrically, they end up managing most of the topics in handy tables through simple methods. On the contrary, we have not only put our best efforts into going through each topic through geometrical proofs as far as possible, but have also left nothing at all of difficult teaching, and we are not among those people who hold this opinion on the subject. (My translation; Rome 1931–1945: III, 318, lines 8–14, my emphasis)
Furthermore, Theon writes in the Little Commentary on the Handy Tables: My Epiphanius, we have accurately provided a more rational method for the computation of the stars in the Handy Tables in another treatise in five books. However, because the majority of those who followed our classes in order to learn the subject not only are unable to follow sufficiently the multiplications and divisions of the numbers, but also are completely ignorant of geometrical proofs, we have done our best to systematically comment on them [the tables] by providing plain methods so that the exposition of the subject would be clearer to them [i.e. those who do not understand geometry]. (My translation; Tihon 1978, 199, lines 1–10; my emphasis)
Not surprisingly, students, the future practitioners of Ptolemaic astronomy, readily resorted to tables and learned how to use them, without having any interest in the geometrical proofs underlying their construction. After the second century, the Hellenistic world saw the practice of using tables without training in geometrical proofs become pervasive in scholarly circles. Such a practice was commonly perceived as “astronomy” and “science” among those who had access to education. This conception of mathematical practice in astronomical sciences became the most successful and spread in the subsequent centuries in several cultural contexts (Bardi 2021). This social aspect of mathematical practice would attract criticism from successive generations of scholars after Ptolemy. For instance, Ptolemaic works reached the Arabic world through translations from Syriac and Middle Persian. The elements one finds in Ptolemy and his commentators in terms of theory, practice, and differences in the nature of the two branches of the science of the stars were transmitted almost unchanged. For example, in ninth-century Baghdad, Abū Maʿšar composed The Great Introduction to Astrology, where he claims that true mathematicians/philosophers must have a full command of the Almagest and not just be able to use the astrological part and that, moreover, they must be able to construct tables and not just learn how to use them (Burnett 2002). Another later and emblematic example, in a European context this time, is Regiomontanus (Johannes Müller) in the fifteenth century, in his lecture at the University of Padua, where he lamented that
the scholars of his time claimed to be astrologers just because they knew how to use the tables without having mastered the mathematics behind them (Byrne 2006; Rutkin 2019, 369–373; Omodeo 2021). The shift from proofs to procedures as well as the increased attention being paid to the practical dimension of tables are also reflected in a semantic shift in the Greek word to designate the table, canon, a shift attested to in commentaries on the tables. Passages which define the concept of astronomical table show that, while for Pappus and Theon the word canon means both the object and the knowledge contained in the table, in the first commentary modeled on Theon’s, the word canon means just the object, not the mathematical knowledge provided by and systematized into the table. More precisely, in the fourth century, Theon of Alexandria wrote: we call “table” the whole knowledge of this kind of work, which is arranged into the table, even if it happens to occupy more than one page; [we call] “line” what is labeled with this name in the current fashion; [we call] “column” what is placed in vertical, which happens to have only degrees or only sixties – we call them again both “column.” (My translation; Tihon 1978, 201, lines 4–9)
In the seventh century, the Byzantine scholar Stephen of Alexandria, rephrasing Theon’s definition, built a new Commentary on the Handy Tables: we call “table” this kind of work, even if it happens to occupy more than one page; [we call] “line” what is labeled with this name in the current fashion, that is read from the left to the right; [we read] “column” what is read from above to the bottom. (My translation; Lempire 2016, p. 84, lines 15–18)
The semantic shift occurs in parallel with a material-historical factor, which becomes understandable when the historical role assumed by tables is seen in the light of the transmission process of Ptolemaic works between the second and fourth centuries. Briefly, the history of Ptolemy’s mathematical practice took shape alongside one of the most prominent changes in the transmission of knowledge, namely, the shift from scroll to codex (Cavallo 1975; Roberts and Skeat 1983). This transition was not immediate, much like the present-day transition from the physical book to the digital format. Initially, the Almagest was conceived for presentation on scrolls, but Theon composed his commentaries with the codex in mind. In fact, in his commentary on the tables, Theon states that a table might occupy several pages, and scrolls do not have pages. This transition from scroll to codex is also reflected in the change in the format of the tables in the Handy Tables compared to that of the Almagest, which goes from 45 to 30 lines in each table, as well as in the change in the numbers of year cycles, from 18 to 25 (Acerbi 2020, 597). Ptolemy’s commentators adapted Ptolemaic mathematical content to the new material support, thus irreversibly influencing the minds of new practitioners of Ptolemaic astronomy.
7 Conclusion
The historical and philosophical features of Ptolemy’s mathematical practice are the outcome of exceptionally favorable historical, cultural, and material circumstances, which emerged in Alexandria in Egypt under Roman rule from the second to the fourth centuries CE, and thanks to patrons of arts and sciences and institutions which preserved culture, such as the Museion and the Library. In this context, astronomical knowledge stemming from Babylonian, Egyptian, and Greek traditions could be preserved and employed to develop a new mathematical astronomy, whose major products are to be found in Ptolemy’s Almagest, the Handy Tables, and Pappus’s and Theon’s commentaries thereon. The astronomical tables became the main object of the culture of mathematical practice which took the Almagest as its main reference. In fact, all the examined texts refer to the tables in one way or another, but the way in which they referred to tables changed significantly. On this account, examinations of mathematical practice should consider the Almagest and the Commentaries as a corpus of texts as well as a collective effort of Alexandrian mathematicians to produce and use the best set of tables in their institutional and scholarly contexts. To be sure, Ptolemy’s tabular format has been successful over the tradition of tables in Alexandria. Non-Ptolemaic tables are generally not based on the principle of regular circular motion and are aimed at tracking apparent motions of celestial bodies, while the novelty of Ptolemy consists in providing tables which correspond to the mathematical principles of the geometrical models of the motions (Jones 1997a, b, 1998, 2000; Goldstein 2007). However, the success of the Handy Tables and, consequently, that of Theon’s Little Commentary on the Handy Tables lies not just in the tabular format and its user-friendliness; it also derives from the stylistic resources within the language of Greek mathematics. 
The transmission of Theon’s Little Commentary is far more extensive than that of other commentaries, because his text employed a prose that was most accessible to its readers and could completely omit the theory. In a way, it is a fully practical text. From the second to the fourth century, the relevance of the tables for Ptolemy’s mathematical practice, already present in the Almagest, increased in the commentaries thereon. This historical phenomenon can be explained by two interrelated factors: first, the difficulty of Ptolemy’s mathematical theory (and the laziness of the majority of students and practitioners); second, the material requirement to have tables and to merely learn how to use them (practitioners of astrology did not need any knowledge of mathematical theory and astrological activity was an easy source of income). To further explore material aspects, the enhanced focus on tables in the history of Ptolemaic mathematical practice parallels the passage from scroll to codex, which has shaped the format and organization of the mathematical content of the tables. Therefore, the history of Ptolemy’s mathematical practice is inextricably linked to the material features of scrolls and manuscripts, which are indeed “stratified social objects” (Ronconi 2018). The changes that occurred in the practice of tables form part of a social history which is at odds with Ptolemy’s ideals of philosophy of
mathematical practice. For Ptolemy, the true mathematical practice consists of a combination of a full command of mathematical astronomy and astrology, and it is aimed at reaching a contemplative status of divine realms (in a Platonic sense). In both branches of the Ptolemaic science of the heavens, the role of tables is indeed crucial, for the true philosopher must be able to master the theorems and proofs to build tables in addition to having the knowledge to interpret the horoscopes that one can obtain from them. However, as noted, the practical dimension of tables became prevalent, and most practitioners aimed at learning how to use them. In sum, the astronomical tables, as an object of study of mathematical practice, are the place in which ideal, social, and material layers of Ptolemaic mathematics meet. It is no accident, then, that history would go on to see the table become the most influential object for the reception of Ptolemaic astronomy in later centuries, in the Islamicate, Byzantine, and European contexts of the Middle Ages, as well as in Early Modernity. The commentaries on the Almagest and the Handy Tables are part of the culture of “deuteronomic texts” (Netz 1998), which preserved mathematics and its canons in Late Antiquity but also changed mathematical practice. In the case of Ptolemy and his commentators, the change is evident in the role that tables assumed from the Almagest through to Theon’s commentary on the Handy Tables: initially thought of as objects inserted into a discursive mathematical prose, and as goals of the mathematical prose of the Almagest, they became the very first object of the mathematical practice, thus deemed worthy of commentaries thereon and sufficient knowledge to be mastered in order to become a professional astronomer.
8 Cross-References
▶ Experimental Philosophy Approaches to the Study of Mathematical Practice and Language
▶ Mathematics, Natural Languages, Artificial Languages and Formalisms, Viewed from the Ancient World
▶ “Shaping” Revisited
▶ The Metaphysics of Platonism
▶ The Social Epistemology of Mathematical Proof
▶ The Sociology of Mathematical Practice
▶ What Happens When We Read a Mathematical Text from a Historical Point of View?
References
Acerbi F (2020) Interazioni fra testo, tavole e diagrammi nei manoscritti matematici e astronomici greci. La conoscenza scientifica nell’Alto Medioevo, Centro Italiano di Studi sull’Alto Medioevo, Settimane di Studio del CISAM 67:585–621
Acerbi F (2021) The logical syntax of Greek mathematics. Springer, Cham
Bardi A (2021) Hybrid knowledge and the historiography of science: rethinking the history of astronomy between second-century CE Alexandria, Ninth-Century Baghdad, and Fourteenth-Century Constantinople. Transversal: Int J Historiogr Sci 2021(11):1–13
Berggren JL, Jones A (2000) Ptolemy’s geography. An annotated translation of the theoretical chapters. Princeton University Press, Princeton
Blum R (1991) Kallimachos. The Alexandrian Library and the origins of bibliography. Translated from the German by Hans H. Wellisch. University of Wisconsin Press, Madison
Boll F (1894) Studien über Claudius Ptolemäus. Ein Beitrag zur Geschichte der griechischen Philosophie und Astrologie. Jahrbücher für classische Philologie, Suppl 21:49–224
Brown P (1971) The world of late antiquity. Thames and Hudson, London
Burnett C (2002) The certitude of astrology: The scientific methodology of al-Qabīṣī and abū Maʿshar. Early Sci Med 7(3):198–213
Byrne JS (2006) A humanist history of mathematics? Regiomontanus’s Padua Oration in context. J Hist Ideas 67(1):41–61
Cavallo G (ed) (1975) Libri, editori e pubblico nel mondo antico. Laterza, Roma-Bari
Chabás J, Goldstein BR (2012) A survey of European astronomical tables in the late middle ages. Brill, Leiden/Boston
Chemla K, Most GW (eds) (2022) Mathematical commentaries in the ancient world. A global perspective. Cambridge University Press, Cambridge
Cribiore R (2001) Gymnastics of the mind: Greek education in Hellenistic and Roman Egypt. Princeton University Press, Princeton
Cuomo S (2000) Pappus of Alexandria and the mathematics of late antiquity. Cambridge University Press, Cambridge
Feke J (2018) Ptolemy’s philosophy: mathematics as a way of life. Princeton University Press, Princeton
Fraser PM (1972) Ptolemaic Alexandria. Clarendon Press, Oxford
Goldstein BR (1967) The Arabic version of Ptolemy’s Planetary hypotheses. American Philosophical Society, Philadelphia
Goldstein BR (2007) What’s new in Ptolemy’s Almagest? Nuncius 22(2):261–285
Goldsworthy A (2016) Pax Romana. War, peace, and conquest in the Roman World. Yale University Press, New Haven
Graßhoff G (1990) The history of Ptolemy’s star catalogue. Springer, New York
Hamm EA (2011) Ptolemy’s planetary theory. An English Translation of Book One, Part A of the Planetary hypotheses with introduction and commentary. PhD dissertation, University of Toronto
Jones A (ed) (1986) Pappus of Alexandria. Book 7 of the collection. Sources in the history of mathematics and physical sciences, vol 8. Springer, New York
Jones A (1997a) Studies in the Astronomy of the Roman period. Centaurus 39:1–36
Jones A (1997b) Studies in the Astronomy of the Roman period. Centaurus 39:211–229
Jones A (1998) Studies in the Astronomy of the Roman period. Centaurus 40:1–41
Jones A (1999) Uses and users of astronomical commentaries in antiquity. In: Most GW (ed) Commentaries – Kommentare. Vandenhoeck & Ruprecht, Göttingen, pp 147–172
Jones A (2000) Studies in the Astronomy of the Roman period. Centaurus 42:77–88
Jones A (2020) The ancient Ptolemy. In: Juste et al (eds) Ptolemy’s science of the stars in the middle ages. Brepols, Turnhout, pp 13–34
Juste D, van Dalen B, Hasse DN, Burnett C (eds) (2020) Ptolemy’s science of the stars in the middle ages. Brepols, Turnhout
Kennedy ES (1956) A survey of Islamic Astronomical tables. Trans Am Philos Soc, New Series 46(2):123–177
King DA, Samsó J, Goldstein B (2001) Astronomical handbooks and tables from the Islamic World (750–1900): an interim report. Suhayl 2:9–105
Lempire J (2016) Le commentaire astronomique aux tables faciles de Ptolémée attribué à Stéphanos d’Alexandrie
Mark Smith A (1996) Ptolemy’s theory of visual perception: an English translation of the “Optics” with introduction and commentary. Trans Am Philos Soc, New Series 86(2) p vii-xi+1-300
Mogenet J, Tihon A (1985) Le “Grand Commentaire” de Théon d’Alexandrie aux Tables Faciles de Ptolémée. Livre I. Biblioteca Apostolica Vaticana, Città del Vaticano
Morrow GR (1970) Proclus. A commentary to the first book of Euclid’s elements (trans: Morrow GR). Princeton University Press, Princeton
Netz R (1998) Deuteronomic texts: late antiquity and the history of mathematics. Revue d’histoire des mathématiques 4(2):261–288
Netz R (1999) The shaping of deduction in Greek mathematics. A study in cognitive history. Cambridge University Press, Cambridge
Netz R (2009) Ludic proof. Greek mathematics and the Alexandrian aesthetic. Cambridge University Press, Cambridge
Netz R (2020) Scale, space and canon in ancient literary culture. Cambridge University Press, Cambridge
Neugebauer O (1957) The exact sciences in antiquity, 2nd edn. Dover, New York
North J (1986) Horoscopes and history. The Warburg Institute, University of London, London
Omodeo PD (2021) Johannes Regiomontanus and Erasmus Reinhold. Shifting perspectives on the history of Astronomy. In: Brentjes S, Fidora A (eds) Premodern translation: comparative approaches to cross-cultural transformations. Brepols, Turnhout, pp 165–186
Page DL (1981) Further Greek epigrams. Cambridge University Press, Cambridge
Pedersen O (2011) A survey of the Almagest. With annotation and new commentary by Alexander Jones. Springer, New York/Dordrecht/Heidelberg/London
Renn J (2020) The evolution of knowledge. Princeton University Press, Princeton
Robbins F (1940) Ptolemy. Tetrabiblos. Harvard University Press, Cambridge, MA
Roberts CH, Skeat TC (1983) The birth of the Codex. The British Academy, London
Rome A (1931–1943) Commentaires de Pappus et de Théon d’Alexandrie sur l’Almageste, 3 vols. Biblioteca Apostolica Vaticana, Città del Vaticano
Rome A (1931–1945) Commentaires de Pappus et de Théon d’Alexandrie sur l’Almageste, vols I–III. Biblioteca Apostolica Vaticana, Città del Vaticano
Ronconi F (2018) Manuscripts as stratified social objects. Scandinavian J Byzantine Modern Greek Stud 4:19–40
Rutkin D (2019) Sapientia Astrologica: Astrology, magic and natural knowledge, ca 1250–1800. I. Medieval structures (1250–1500): conceptual, institutional, socio-political, theologico-religious and cultural. Springer, Cham
Sidoli N (2004) Ptolemy’s mathematical approach: applied mathematics in the second century. PhD dissertation, University of Toronto
Sidoli N (2014) Mathematical tables in Ptolemy’s Almagest. Hist Math 41:13–37
Swerdlow NM (2004) Ptolemy’s Harmonics and the “Tones of the Universe” in the Canobic inscription. In: Burnett C, Hogendijk JP, Plofker K, Yano M (eds) Studies in the history of the exact sciences in honour of David Pingree. Brill, Leiden, pp 137–180
Taub L (1993) Ptolemy’s universe. The natural philosophical and ethical foundations of Ptolemy’s Astronomy. Open Court Publishing Company, Chicago/La-Salle
Tihon A (1978) Le “Petit Commentaire” de Théon d’Alexandrie aux Tables Faciles de Ptolémée. Biblioteca Apostolica Vaticana, Città del Vaticano
Tihon A (1991) Le “Grand Commentaire” de Théon d’Alexandrie aux Tables Faciles de Ptolémée. Livres II et III. Biblioteca Apostolica Vaticana, Città del Vaticano
Tihon A (1999) Le “Grand Commentaire” de Théon d’Alexandrie aux Tables Faciles de Ptolémée. Livres IV. Biblioteca Apostolica Vaticana, Città del Vaticano
Tihon A, Mercier R (2011) Πτολεμαίου Πρόχειροι Κανόνες. Les Tables Faciles de Ptolémée, 2 vol. Peeters, Louvain
Toomer GJ (1975) Ptolemy. In: Gillispie CC (ed) Dictionary of scientific biography, vol XI. Scribner, New York, pp 186–206
Toomer GJ (1976) Theon of Alexandria. In: Gillispie CC (ed) Dictionary of scientific biography, vol XIII. Scribner, New York, pp 321–325
Toomer GJ (1984) Ptolemy’s Almagest. Duckworth, London
Van Brummelen G (1993) Mathematical tables in Ptolemy’s Almagest. PhD dissertation, Simon Fraser University
Watts E (2012) Education: speaking, thinking, and socializing. In: Johnson SF (ed) The Oxford handbook of late antiquity. Oxford University Press, Oxford, pp 467–486
Definitions (and Concepts) in Mathematical Practice V. J. W. Coumans
Contents
1 Introduction
2 The Relation Between Definitions and Concepts
3 Theme I: The Nature of Mathematical Definitions
4 Theme II: The Relation Between a Term and the Concept It Represents
5 Theme III: Concepts and Definitions as Communal Notions
5.1 Descriptive Accounts
5.2 Normative Accounts
5.3 Why Do Mathematicians Use Certain Definitions and Not Others?
6 Theme IV: Values Regarding Mathematical Definitions and Concepts
6.1 Naturalness
6.2 Fruitfulness
6.3 Explanatoriness
7 Conclusion
References
Abstract
Definitions are traditionally seen as abbreviations, as tools for notational convenience that do not increase inferential power. From a Philosophy of Mathematical Practice point of view, however, there is much more to definitions. For example, definitions can play a role in problem solving, definitions can contribute to understanding, sometimes equivalent definitions are appreciated differently, and so on. This chapter reviews the literature on definitions and (to a certain extent) concepts in mathematical practice. It is structured according to four themes through which definitions (and concepts) in mathematical practice have been studied. These themes concern (1) the nature of definitions, (2) whether and how concepts evolve, (3) definitions and concepts from a communal perspective, and (4) different values relating to definitions and concepts.
V. J. W. Coumans (*)
Institute for Science in Society, Radboud University, Nijmegen, The Netherlands
© Springer Nature Switzerland AG 2021
B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_94-1
Keywords
Definition · Concept · Mathematical practice · Philosophy
1 Introduction
Definitions are a central part of mathematics and mathematical practice. Traditionally, they have been studied in logic for their inferential role (see Belnap 1993). However, the role of definitions in mathematical practice goes beyond this. As Tappenden (2008a) remarks, “mathematicians often set finding the ‘right’/‘proper’/‘correct’/‘natural’ definition as a research objective and success – finding the ‘proper’ definition – can be counted as a significant advance in knowledge” (Tappenden 2008a, p. 256). Understanding proofs in practice is a prominent topic in the philosophy of mathematical practice. To give an incomplete overview, there is literature on the various functions of proofs (De Villiers 1990; Baldwin 2018; Dawson 2006), on the difference between the ideal conceptions of proofs and proofs in practice (Andersen 2018; Larvor 2012; Hamami 2019), on virtues of proofs (Detlefsen 2008; Inglis and Aberdein 2015; Kleiner 1991), and on proofs and the peer review process (Andersen 2018; Frans and Kosolosky 2014; Geist et al. 2010). By contrast, definitions in practice have not been studied systematically as of yet. This does not mean that definitions have not been investigated at all. In fact, there is a reasonable literature on definitions in mathematical practice. Usually, however, the main point of focus in that literature is not to get a better understanding of definitions in practice, but to discuss a different topic in which definitions are a particular case. As a result, definitions in mathematical practice have been discussed, but not systematically, and it is therefore difficult to navigate through the existing literature that relates to definitions in mathematical practice. This chapter is an effort to make the literature on definitions more accessible by reviewing and organizing the literature on definitions in mathematical practice (from now on referred to as simply “definitions”).
Such an overview can help structure future research, in the hope of obtaining a better understanding of definitions in practice. The chapter will also discuss the notion of “concept.” This is because definitions and concepts are intricately related. One of the main functions of definitions is to specify or introduce a concept. In that sense, definitions are sometimes evaluated on the basis of the concept they specify. Vice versa, mathematicians try to understand certain concepts, and definitions can serve as means to do that. In that sense, some definitions are considered better than others for understanding a concept. In short, to meaningfully discuss definitions, one also needs to take concepts into account to a certain extent. This chapter is outlined as follows. It starts with a brief discussion of the relation between definitions and concepts. This discussion serves two purposes. First, it helps to determine to what extent literature on concepts should be included in this review
Definitions (and Concepts) in Mathematical Practice
(as the literature on concepts is vast and goes well beyond the scope of discussing definitions in practice). Second, the term “definition” is used by various authors, possibly meaning different things. Similarly, the usage of the term “concept” varies slightly across authors. By fixing a meaning for these terms, one can compare the usages. After these initial remarks, the literature on definitions and concepts in mathematical practice will be discussed. The overview is structured according to four themes that can be identified in the literature. These themes concern (1) the nature of definitions, (2) whether and how concepts evolve, (3) definitions and concepts from a communal perspective, and (4) different values relating to definitions and concepts. N.B. Definitions are also studied in the context of mathematics education. However, a discussion of this is beyond the scope of this chapter. The interested reader is referred to (Ouvrier-Buffet 2013), which includes a summary of the mathematics education literature on definitions. This chapter is concluded with some remarks on the relation between the different themes.
2
The Relation Between Definitions and Concepts
Before looking into the literature regarding definitions, it is useful to first get an idea of the distinction between definitions and concepts. Unfortunately, the discussion regarding definitions and concepts in general is too diverse to find an uncontroversial description of both (Belnap 1993; Gupta 2019; Robinson 1954; Cappelen and Plunkett 2020; Margolis and Laurence 2019). For the purpose of this chapter, it suffices to take the notion of “definition” to refer to linguistic entities that specify the meaning or usage of a term. When it comes to the ontological status of concepts, there are also various competing theories (Margolis and Laurence 2019). The relation between definitions and concepts as used in this chapter is that definitions are linguistic entities that give rise to concepts, mental or otherwise, and in doing so they can concur with or differ from pre-existing/intended concepts. As an illustration, consider the example of prime numbers, which is also considered in (Tappenden 2008a) for different purposes. For the natural numbers, one can define prime numbers through the following definitions:
• A natural number P > 1 is prime if and only if the only divisors of P are 1 and P.
• A natural number P > 1 is prime if and only if for every pair of natural numbers a, b we have that if P divides ab, then P divides a or P divides b.
These are different definitions which give rise to the same concept of “prime number.” Another example can be found in Lakatos’ Proofs and Refutations (2015). In this well-known work, which will be discussed later in more detail, the notion of a polyhedron is investigated: What is a polyhedron? Several definitions are discussed, and they are often evaluated on the criterion of whether they concur with some intended concept of what a polyhedron is.
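Returning to the two definitions of prime number given above, the claim that they pick out the same concept can be checked mechanically on a finite range. The following is only a sketch under stated assumptions: the function names and the bound LIMIT are hypothetical, the convention that 0 and 1 are not prime is built in, and the second definition is tested only for a, b between 1 and P − 1, which suffices because divisibility by P depends only on remainders modulo P.

```python
# Sketch: compare two definitions of "prime" on a finite range.
# Function names and LIMIT are illustrative assumptions, not from the chapter.

def is_prime_by_divisors(p: int) -> bool:
    """Definition 1: the only divisors of p are 1 and p (with p > 1 assumed)."""
    return p > 1 and all(p % d != 0 for d in range(2, p))


def is_prime_by_product_property(p: int) -> bool:
    """Definition 2: if p divides a*b, then p divides a or p divides b.

    Only 1 <= a, b < p are checked. For such a and b, p divides neither,
    so any pair with p dividing a*b is a counterexample.
    """
    if p <= 1:
        return False
    for a in range(1, p):
        for b in range(1, p):
            if (a * b) % p == 0:
                return False
    return True


LIMIT = 60  # hypothetical bound for the comparison
by_divisors = [p for p in range(LIMIT) if is_prime_by_divisors(p)]
by_product = [p for p in range(LIMIT) if is_prime_by_product_property(p)]

print(by_divisors)
print(by_divisors == by_product)
```

A finite check of this kind is of course no proof of equivalence; it only illustrates that two differently worded definitions can have the same extension.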
V. J. W. Coumans
In the following, the terms “definition” and “concept” are used in the way described above. However, some of the authors whose work is discussed in this chapter also use these terms, possibly meaning different things. What is meant by the term is usually clear from the context. However, when I want to stress the meaning from the above formulation, I will write definition and concept. To illustrate the intricacy of the relation between the notions of definition and concept, and to show the usefulness of distinguishing between definition and concept, one can look at the history of the topological notion of compactness (Raman-Sundström 2015). Nowadays, compactness usually refers to open-cover compactness (X is compact if each open cover U has a finite subcover). For metric spaces, the notion of open-cover compactness is equivalent to so-called sequential compactness (a space X is sequentially compact if every sequence (x_n) in X has a subsequence that converges in X). For topological spaces in general, this is not the case. This means that in some contexts, different definitions can lead to the same concept, whereas their generalizations can give rise to different concepts. Furthermore, Raman-Sundström (2015) describes definitions of compactness that are analogous to the definition of sequential compactness. In other words, these versions are equivalent to open-cover compactness but, in some way, do justice to the formulation of sequential compactness in terms of convergence and sequences. One of these generalizations is in terms of nets. Whereas sequences are usually indexed by the natural numbers (0, 1, 2, 3, etc.), nets allow elements to be indexed by a far broader class of sets. By using a notion of convergence that relies on the notion of net, one can transform the formulation of sequential compactness in such a way that the result is equivalent to open-cover compactness.
In conclusion, it can occur that different concepts have analogous definitions (in this case, the definition of sequential compactness is analogous to the definition of compactness in terms of nets). For more detail on the history of compactness, the reader is referred to (Raman-Sundström 2015).
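For readers who prefer symbols, the three formulations of compactness discussed above can be placed side by side. This is a sketch in standard textbook notation and not a quotation from Raman-Sundström (2015); the index set I, assumed to be a directed set, and the labels of the three conditions are notation introduced here. The block assumes the amsmath and amssymb packages.

```latex
% X is a topological space; I is a directed set (notation assumed here).
\begin{align*}
\text{(open-cover)} \quad & \text{for every open cover } \mathcal{U} \text{ of } X
  \text{ there are } U_1, \dots, U_k \in \mathcal{U}
  \text{ with } X = U_1 \cup \dots \cup U_k; \\
\text{(sequential)} \quad & \text{every sequence } (x_n)_{n \in \mathbb{N}} \text{ in } X
  \text{ has a subsequence converging to a point of } X; \\
\text{(nets)} \quad & \text{every net } (x_i)_{i \in I} \text{ in } X
  \text{ has a subnet converging to a point of } X.
\end{align*}
% In every topological space:  (open-cover) <=> (nets).
% In metric spaces:            (open-cover) <=> (sequential).
% In general topological spaces, neither of (open-cover) and (sequential)
% implies the other.
```

The third condition keeps the shape of the second, with nets and subnets in place of sequences and subsequences, while being equivalent to the first. This is the sense in which analogous definitions can belong to different concepts.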
3
Theme I: The Nature of Mathematical Definitions
As emphasized in the above account of the relation between definitions and concepts, a definition is a linguistic expression, but not every linguistic expression is a definition. Therefore, the question is: What do we expect definitions to do? What differentiates definitions from linguistic expressions in general? In this section, three points of view on this issue are discussed: the traditional view, the heuristic view, and the view of Ouvrier-Buffet (2013). The traditional view and the heuristic view are discussed by Cellucci (2018). He describes the traditional view as a combination of five statements (that he traces back to the works of Frege):
(1) A definition merely stipulates the meaning of a term [. . .] (2) A definition is an abbreviation [. . .] (3) A definition is always correct [. . .] (4) A definition can always be eliminated [. . .] (5) A definition says nothing about the existence of the thing defined [. . .] (Cellucci 2018, p. 608).
Claim (3) is intended to mean that definitions are not right or wrong, as they are not assertions to begin with. Claims (2) and (4) are also discussed in the context of definitions in general (as opposed to mathematical definitions). They are referred to as the criteria of eliminability and conservativeness, respectively (Belnap 1993). (Note that, although item (4) uses the word “eliminated,” it corresponds to the criterion of conservativeness and not to the criterion of eliminability.) The idea behind eliminability is that a definition can be formulated in terms that are already known. The criterion of conservativeness says that no new conclusions can be derived by introducing a certain definition. More precisely, let L be a language and T a theory. Then a definition consists of expanding the language, resulting in L′, and adding sentences to the theory, resulting in T′. The criterion of eliminability is then given by: with respect to theory T′, every L′-sentence should be equivalent to an L-sentence. Conservativeness can be formulated as: for every L-sentence S, if S follows from T′, then it should also follow from T. Cellucci discusses the traditional view of mathematical definitions in order to refute it and to defend the heuristic conception of definitions. The latter fits within the heuristic conception of mathematics, which, although not a mainstream view of mathematics, has gained some traction (also see Goethe and Friend 2010). In this view, mathematical problem solving is not a matter of deductively proving statements from axioms, but a process in which problems are solved by the analytic method. This method consists of presenting plausible hypotheses that solve the problem. These hypotheses, in turn, are problems that need to be solved as well. This process of providing hypotheses for open problems can continue indefinitely. In that regard, Cellucci argues that definitions are hypotheses too.
He formulates this as: “[a] definition is the hypothesis that there exists something which satisfies the condition stated in the definiens” (Cellucci 2018, p. 618). Cellucci claims that the heuristic view of definitions is preferable to the traditional one because it can describe mathematics better. By mathematics he means “the ‘real’ mathematics of the ‘real’ mathematicians” (Cellucci 2018, p. 617). He contrasts this with formal conceptions of mathematics in which there is no room for the logic of discovery. According to Cellucci, the heuristic conception of definitions can explain various phenomena that are not explained by the traditional view of definitions, namely that: “(a) Two extensionally equivalent definitions of the same concept may have different heuristic values; (b) Definitions in mathematics are not starting points but arrival points in the solution to problems; (c) There is no circularity of definitions and theorems; (d) Many definitions are proof-generated; (e) Definitions in mathematics can be justified” (Cellucci 2018, p. 620). Although Cellucci portrays the two points of view as competitors, one might argue that they are complementary. The traditional view focuses on the inferential role of definitions and the context of justification, whereas the heuristic view zooms in on preformal mathematics and the context of discovery. In that sense, both accommodate an investigation of mathematical practice, albeit of different aspects thereof.
Which view is more appropriate differs for each situation and depends on contextual factors. The third view on the nature of definitions is offered by Ouvrier-Buffet (2013). She aims to understand the act of defining in order to develop didactical situations for mathematics students. To this end, she develops three conceptions of definitions: the Aristotelian, the Popperian, and the Lakatosian. In the following, only sketches of these conceptions are presented. The interested reader is referred to (Ouvrier-Buffet 2013) for a detailed discussion and references to her earlier work. In the Aristotelian conception, definitions serve as tools for classification. One starts such a classification by looking at the set of objects and for specific differences within this set. Such an analysis should lead to an “aesthetic definition.” Some principles for defining are logical, for example that there should be no vicious circles and that previous terms should be defined. Other principles in this conception include that redundancies and metaphors should be avoided and that the concepts that are defined should actually exist. Whereas the Aristotelian conception considers definitions as tools for classification, the Popperian conception considers them useful for choosing between competing theories. In that sense, one can try to reduce the number of postulates of a theory and generate counterexamples. The principles that guide the defining process include, for instance, the resistance to refute theories and criteria for when one theory is better than another. The Lakatosian conception combines aspects from the previous ones. According to this conception, definitions are employed to solve intramathematical problems (e.g., finding the domain of validity for a given theorem) and for classification.
Some of the actions that guide the defining process are “generating examples and counterexamples” and consequently dealing with these counterexamples by monster-barring and exception-barring (see Sect. 4). These three conceptions are part of a model of the process of defining. This model also consists of different moments in the definition process (“in-action,” “zero,” “formalized,” and “axiomatic”) and an account of the type of problems that definitions can help solve. This model is used to create strategies for developing educational contexts for defining. Finally, Ouvrier-Buffet (2015), on the basis of interviews with mathematicians, distinguishes different types of definitions. For instance, she distinguishes between definitions that are temporarily used to shorten a talk and those that “remain and will belong in the public domain” (Ouvrier-Buffet 2015, p. 2216). Another distinction is made between working definitions that enable the mathematician to start investigating the object and the formal definition that one might arrive at later. In conclusion, we saw various descriptions of what definitions are and what they (ought to) do. What all these views have in common, however, is that they see definitions as a device that links a term to a concept. This connection between a term and its concept is the subject of the following section.
4
Theme II: The Relation Between a Term and the Concept It Represents
Definitions are a way to assign to a term a concept/meaning. (Technically, concepts and meanings are different entities, but for the purpose of this chapter, I group them as one.) In this section, the relation between terms and their associated concepts is discussed. First, two points of view explicated by Schlimm (2012), regarding the status of this relation, are discussed: Is it fixed or is it subject to change? After establishing that, from a particular point of view, this relation can change over time, we turn to the work of Lakatos (2015) and Kitcher (1983), who discuss two ways in which this relation changes. Finally, the work of Tanswell (2018) is discussed. He not only demonstrates that this relation is not as definite as is sometimes assumed, but also suggests a way to steer conceptual change. Although the title of Schlimm’s Mathematical Concepts and Investigative Practice (2012) speaks of concepts, the article does not concern concepts (cf. Sect. 2), but the relationship between terms and what they refer to. Schlimm describes this in terms of whether “an object a falls under a concept P” (Schlimm 2012, p. 128). He then discusses two points of view regarding this relation: the Fregean view and the Lakatosian view. (Schlimm stresses that these perspectives are not per se those held by Frege and Lakatos, respectively, but that these are nonetheless inspired by their work.) The Fregean view sees concepts as definite and fixed. That a concept is definite means that for every object, it is ontologically determined whether it falls under that concept or not. This does not suggest, however, that mathematicians themselves can immediately determine whether an object falls under that concept or not. That mathematical concepts are fixed means that if a term refers to a specific concept at one time, it does so always. Mathematical concepts should be fixed, because otherwise logical deduction is not possible (Schlimm 2012). 
Arguments like “All A’s are B’s, this X is an A, therefore X is a B” might suffer if concept A changes between the first and second premise. According to the Lakatosian view, by contrast, concepts are fluid and inexact: They may change, and sometimes it is not clear whether a certain object falls under the concept or not. One of the main motivations for this view is its ability to capture heuristic aspects of mathematics as well and not just the deductive parts. Instead of arguing for either the Fregean or the Lakatosian view, Schlimm suggests that both of these descriptions are very useful in making sense of mathematical developments. He argues that “[i]n the history of mathematics many concepts were introduced informally and went through considerable changes, redefinitions, splittings, etc. However, other concepts were introduced by formal, determinate definitions from the start” (Schlimm 2012, p. 134). Accordingly, the Fregean and Lakatosian notions can complement each other. One example that Schlimm discusses concerns the term “number.” With the introduction of new number systems, “number” refers to different concepts over time. From a Lakatosian view, this is unproblematic. However, the Fregean view
would have us write different terms for these different systems. For instance, the natural numbers can refer to the integers greater than or equal to 0, or to the integers greater than 0. These should be indicated by Numbers≥0 and Numbers>0, respectively. Schlimm concludes that both views should be used to highlight different parts of mathematical practice, and therefore, he opts for a pluralistic approach. Although Schlimm does not do this, one might relate these two views to the points of view regarding definitions listed in Sect. 3. Although the Fregean/Lakatosian views concern the relation between a term and a concept, and the traditional/heuristic views concern the nature of definitions, one can argue that the Fregean view corresponds to the traditional view, and the Lakatosian view to the heuristic view. Both the Fregean and the traditional view highlight the inferential role of definitions and concepts, whereas the Lakatosian and the heuristic view explicitly accommodate the intervention of mathematicians in the course of defining/fixing a concept for a term. Furthermore, analogous to the case of traditional vs. heuristic, one can make similar remarks about the accuracy of the Fregean and the Lakatosian view: It is a matter of what these views try to describe. If we are looking at an idealized version of mathematics, the Fregean version might be more accurate, whereas the Lakatosian view might be more accurate for preformal mathematics or the human side of mathematics. Which view is more appropriate is dependent on the particulars of a situation. If we do indeed want to approach conceptual change from the preformal perspective, we cannot ignore Lakatos’ seminal Proofs and Refutations (2015). As is well known, Proofs and Refutations constitutes a rational reconstruction of the research on Euler’s polyhedron formula: V − E + F = 2 (i.e., for polyhedra, the number of vertices (V) minus the number of edges (E) plus the number of faces (F) equals 2).
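To make the formula concrete, the following sketch evaluates V − E + F for a few solids. The function and variable names are hypothetical, and the counts for the “picture frame” assume one particular realization (a square frame with rectangular cross-section whose top and bottom are each divided into four trapezoids); they are given as an illustration of the kind of solid for which the formula fails, not as a quotation from Lakatos.

```python
# Sketch: evaluate V - E + F for a few solids.
# Names and the specific picture-frame counts are illustrative assumptions.

def euler_characteristic(vertices: int, edges: int, faces: int) -> int:
    """Return V - E + F for the given counts."""
    return vertices - edges + faces


# (V, E, F) for each solid
solids = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    # Frame with a hole through it: 16 vertices, 32 edges, 16 faces (assumed
    # realization with trapezoidal top and bottom faces).
    "picture frame": (16, 32, 16),
}

results = {name: euler_characteristic(*counts) for name, counts in solids.items()}

for name, value in results.items():
    verdict = "satisfies" if value == 2 else "violates"
    print(f"{name}: V - E + F = {value} ({verdict} the formula)")
```

Whether the last solid counts as a counterexample or is excluded as not being a genuine polyhedron depends on which definition of polyhedron is adopted.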
The format of the book is a hypothetical discussion between a teacher and his pupils and is connected with historical events via its footnotes. The book contains a wealth of insights regarding mathematical practice. Due to space restrictions, this chapter cannot contain a review of the entire book. Nonetheless, what follows is the main idea of Proofs and Refutations regarding conceptual change. For in-depth discussions of Lakatos’ philosophy of mathematics, see (Koetsier 1991) and (Larvor 1998). The discussion in Proofs and Refutations starts right after the conjecture V − E + F = 2 has been formulated, with the teacher presenting a “proof” of the theorem. However, his sense of a proof is not the standard conception of something that establishes the truth of the theorem beyond a doubt, but a “thought experiment – or ‘quasi-experiment’ – which suggests a decomposition of the original conjecture into subconjectures or lemmas” (Lakatos et al. 2015, p. 10). The students express doubt regarding this “proof” and the theorem and suggest several counterexamples, either to the theorem itself (global counterexamples) or to one of the lemmas in the proof (local counterexamples). It turns out that there is no consensus with respect to what a polyhedron actually is. The students then try to pin down what polyhedra are by presenting various definitions. It turns out that there is a dialectic between the proof and the concept of a polyhedron. What is understood by polyhedron leads to reformulations of the theorem (and its “proof”), and the steps in the proof suggest
properties that polyhedra should have. In short, improving a “proof” is a process that alters the theorem, the “proof,” and the concepts involved. A concept that is the result of such a process is referred to as a proof-generated concept. To get an idea of how this change occurs, one can look at the phenomena of “monster-barring” and “exception-barring.” During the hypothetical discussion, some students come up with counterexamples to the polyhedron formula, i.e., polyhedra for which either the proof fails or the theorem does not hold. One possible reaction to such a counterexample is to label it a “monster,” i.e., not a real polyhedron. By doing this, the concept of polyhedron is revised and the validity of the proof and the theorem is preserved. This is referred to as “monster-barring.” A more nuanced approach is exception-barring, where the counterexample is not dismissed as a monster, but is investigated to determine the domain of validity for the theorem. Another account of conceptual change is presented by Kitcher (1983). This description is given in a general discussion of how mathematical practices change (see Sect. 5). Kitcher argues that concepts (in particular, expressions and sentences) get meaning through so-called initiating events. In other words, an initiating event is the way in which a token of a type term gets appointed a referent. Due to this type/token distinction, the same concept (type term) can have multiple initiating events, and the set of these events is called the reference potential. Ideally, this reference potential is homogeneous, i.e., all of its initiating events lead to the same entity. Sometimes, however, the reference potential is heterogeneous, and this can lead to refined expressions by adapting the reference potential so as to make it homogeneous again. One of Kitcher’s examples concerns Cantor’s introduction of the transfinite ordinal ω.
Cantor’s contemporaries protested against his move from the natural numbers, a sequence that continues indefinitely, to a completion of that sequence in the form of ω. Kitcher suggests that there is a heterogeneous reference potential at work. Cantor describes ω as a number. But what do we consider numbers to be? One might for instance refer to the complex numbers and say that numbers constitute expressions of the form a + bi. In that sense, ω has trouble fitting the bill. However, when one considers numbers as entities on which one can perform certain arithmetical operations, then ordinal arithmetic shows that transfinite ordinals are numbers too. Therefore, this episode stretched the concept of number. If one assumes that mathematical concepts are subject to change, then one can investigate how to deal with these changes. This is done by Tanswell (2018). After establishing that mathematical concepts can be inexact, he suggests that conceptual engineering can play a valuable role in making concepts more exact. Conceptual engineering is a branch of philosophy that looks at how concepts acquire meaning, how they are used, and how one can steer the usage of certain terms (see Cappelen and Plunkett 2020). A specific mathematical issue for which conceptual engineering can be useful concerns the concept of sets. Some argue that there exists a single set-theoretic universe, whereas others suggest that there exist many. Tanswell argues that “the best hope for a resolution is through the active deployment of the tools of conceptual engineering” (Tanswell 2018, p. 904).
In conclusion, according to some, the relation between terms and their related concepts is fixed, and according to others, this relation is subject to change. The next section zooms out and looks at concepts and definitions from the perspective of a mathematical community. A single mathematician might investigate a particular concept or use a definition, but what makes other mathematicians take an interest in that concept or adopt that definition as well?
5
Theme III: Concepts and Definitions as Communal Notions
Mathematics is often framed as an objective and individual activity. The philosophy of mathematical practice has shown two things: There are many (inter)subjective factors in mathematics, and it is for a large part also a communal activity. In this section, we look at definitions and concepts from a communal perspective. As in many other activities, there is some sense of fashion in mathematics (Wilder 1953; Thurston 1994). Therefore, one question that presents itself is as follows: What makes concepts or definitions fashionable? Corfield (2003) formulates this accurately through a hypothetical case:
[i]f I define a snook to be a set with three binary, one tertiary and a couple of quaternary operations, satisfying this, that and the other equation, I may be able to demonstrate with unobjectionable logic that all finite snooks possess a certain property, and then proceed to develop snook theory right up to noetherian centralizing snook extensions. But, unless I am extraordinarily fortunate and find powerful links to other areas of mathematics, mathematicians will not think my work worth a jot. By contrast, my articles may well be in demand if I contribute to the understanding of Hopf algebras, perhaps via noetherian centralizing Hopf algebra extensions. (Corfield 2003, p. 11)
As the line “mathematicians will not think my work worth a jot” suggests, Corfield talks as if there is some general consensus as to what is considered valuable or not. Whether such a consensus should exist is a topic of debate. Nonetheless, the discussion in the literature is framed as though there is such a consensus. Assuming this is indeed the case, the two questions that are discussed in this section are as follows:
1. Why do mathematicians study the concepts that they study and not others?
2. Given a concept with a variety of equivalent definitions, why would/do mathematicians prefer one definition over another?
This section starts by investigating the first question. The literature regarding this question can be split into descriptive and normative accounts. Descriptive accounts describe the processes by which concepts are introduced, by which they are (possibly) accepted, and how they “survive.” Normative accounts have the implicit assumption that mathematical concepts are accepted when they are considered to be valuable. These normative accounts therefore consist of general criteria of
valuable mathematics. Finally, this section concludes with a review of the literature on the second question.
5.1
Descriptive Accounts
In discussing how mathematics develops, metaphors are often used. Hacking (2014), for instance, discusses two metaphors that explain how mathematics grows and which concepts are studied: the butterfly model and the Latin model. According to the butterfly model, mathematics develops teleologically, like a caterpillar becoming a cocoon and then turning into a butterfly. In that sense, mathematical developments are predetermined, although failures can hinder its development. By contrast, the cornerstone of the Latin model is that mathematics is not predetermined and that it evolves through contingencies. The comparison is made with how Latin evolved into various other languages. A more concrete approach can be found in Wilder (1953), which specifically addresses the introduction and lifespan of concepts. Wilder argues that when it comes to the introduction of concepts, the communal or social aspects of mathematics are just as important as the individual aspects, if not more. He makes the case that mathematical concepts are not created out of nothing, but that they are the result of synthesizing established mathematics. “A concept doesn’t just pop up full grown “like Venus from the waves,” although it may seem to, to the individual mathematician who does the conceiving. Usually its elements are lying in [. . .] the mathematical culture stream” (Wilder 1953, p. 426). Wilder’s goal is to “inquire, above the individual level, the manner in which mathematical concepts originate, and to study those factors that encourage their formation and influence their growth” (Wilder 1953, p. 425). To this end, Wilder discusses several case studies of the introduction and evolution of concepts. From these, he identifies factors that can lead to new concepts. 
First of all, mathematical concepts arise from interaction with the “non-mathematical environment.” He points to primitive mathematics as solutions to practical problems, in farming or taxes, but notes that contemporary mathematics is also influenced by practical, e.g., economic, problems. Second, Wilder argues that while mathematics is partly individual (the final step in the development of a concept is by the individual), mathematics can only flourish by cooperation. Here, cooperation should be read both in a literal and in a broader sense: the literal sense refers to small groups of mathematicians collaborating, the broader one to the efforts of previous mathematicians in generating the needed mathematical tools (e.g., “a Newton can carry on only from the level which the mathematics of his time has reached” (Wilder 1953, p. 438)). Therefore, the greatest factor for mathematical growth is what Wilder calls conceptual contacts. Wilder differentiates conceptual contacts on the level of the individual from those on the level of the group. The first one represents meetings of mathematicians, like colloquia or international collaboration. The second refers to “diffusion of
concepts,” like the fusion of two mathematical subfields (he discusses, for instance, the combination of algebra and geometry into algebraic geometry). The crucial difference is that on the group level, these developments depend on a far larger number of mathematicians. Wilder then suggests two possible ways in which conceptual contacts can lead to new concepts. First, concepts can arise from the observation of similar structures and patterns in different branches of mathematics. He describes this as a mathematical equivalent of the observation of patterns in nature in physics. The second example of concept formation is that tools from one branch of mathematics are imported into another branch. One can also look at the axiomatic method as a tool for concept formation. Given that a certain structure is characterized by a set of axioms, one can change one of the axioms and get another structure. Wilder mentions, for instance, how non-Euclidean geometry is obtained by substituting a nonparallel axiom for the parallel axiom. The final example of concept formation that Wilder discusses is the “well known and much abused generalization” (Wilder 1953, p. 442). He remarks that this does not only work at the individual level, but also at the group level, namely via concept splitting. He argues that mathematicians sometimes gradually investigate certain objects from particular points of view, each leading to different generalized versions of that concept. In addition to these various ways in which concepts are introduced, a problem-solving-oriented model for concept introduction is discussed in (Carter 2013) and (Kjeldsen and Carter 2012). Using the example of Riemann surfaces, Carter (2013) sets out to study how mathematicians handle mathematical objects, and how they prove, but ends up suggesting a model for how mathematical objects are introduced. Note, however, that where Carter writes ‘objects’, we can replace this with concepts (cf. Sect. 2).
This model is crystallized and tested on the introduction of convex bodies in (Kjeldsen and Carter 2012). Despite several differences in comparison to Riemann surfaces, the introduction of convex bodies is seen as evidence for the validity of the model. The model consists of three stages. Stage 1: The new object is defined “with reference to already accepted objects” (Kjeldsen and Carter 2012, p. 360). In Carter’s case study on Riemann surfaces, these surfaces are used to study Abelian functions. Stage 2: Correspondence theorems between the already accepted objects and the new object are obtained, and the already accepted objects are studied with the use of the new object and these correspondence theorems. Stage 3: The new object is defined without reference to the already accepted objects and studied for its own sake. The essence of this model is a motif that can be found in several other sources as well. The idea is that in order to solve a particular problem, mathematicians might go in a certain direction and that that direction turns into an established part of mathematics itself. For instance, Corfield (2003) differentiates between concepts that are fundamental, convenient, or pointless. The first type corresponds to concepts
that are “intrinsically worth studying” (Corfield 2003, p. 228) and the second to concepts that are “valuable as a tool” (Corfield 2003, p. 228). According to Corfield, the status of mathematical entities can change from useful tools to intrinsically worth studying. In effect, this is the same message as the model by Kjeldsen and Carter. A general account of mathematical practices, which can also accommodate the aforementioned model, is presented in Kitcher’s The Nature of Mathematical Knowledge (1983). Even though the main goal of this book is to show that mathematical knowledge is not a priori, this book is also well known for the introduction of the notions of mathematical practice and interpractice transitions. A mathematical practice is a five-tuple (L, M, Q, R, S), “where L is the language of the practice, M the set of metamathematical views, Q the set of accepted questions, R the set of accepted reasonings, and S the set of accepted statements” (Kitcher 1983, pp. 163–164). An interpractice transition is a transition from a practice (L, M, Q, R, S) to a practice (L′, M′, Q′, R′, S′). For our purposes, it is interesting that, with this theory, Kitcher describes three ways in which expressions get introduced. In the first, definitions are abbreviations of other expressions and therefore serve a pragmatic purpose. This concurs with the traditional account of mathematical definitions as described in (Cellucci 2018). The second type is where a particular notion does not have a clearly defined referent and the community of mathematicians looks for a suitable interpretation of that term. This concurs with the account described by Lakatos (2015). As an example, Kitcher discusses how the term “function” evolved. The third type is Kitcher’s description of conceptual change in which a heterogeneous reference potential is transformed into a homogeneous one (as described in Sect. 4).
In general, Kitcher presents five patterns of how practices change, namely question-answering, question-generating, generalization, rigorization, and systematization. The motif that some mathematical concepts that were introduced as a means to study another concept become the subject of investigation themselves arises via the patterns of question-answering and question-generating. Concretely, mathematical concepts are sometimes introduced by question-answering. However, these mathematical concepts can lead to questions about those concepts, i.e., a case of question-generating. These patterns of change are also reviewed by Schlimm (2012). He uses them for the construction of five patterns of conceptual change: “Clarifications of informal concepts, systematizations of concepts and results, investigations of sharp concepts (defined by axiom systems), and generalizations and abstractions” (Schlimm 2012, p. 138). He suggests that these can lead to new or modified mathematical concepts. However, Schlimm only mentions these patterns and does not investigate them any further. Next to looking at how certain concepts can become topics of study, one can also inquire into what Wilder (1953) calls the “life span of a concept.” He argues that usually, concepts become fashionable and are studied intensely until they are no longer fashionable. When that happens, the concept’s life span has ended. However, concepts can become very old or even immortal when they are connected to other mathematical concepts. Unfortunately, Wilder cannot give examples of concepts that
have died out completely. He remarks that mathematics has a tendency to reorganize itself, making various distinct parts of mathematics coherent. In that way, concepts never really die out but survive by being absorbed by a general theory.
5.2 Normative Accounts
Several authors tacitly assume that valuable mathematics becomes part of the shared mathematical theories and mathematics with little to no value does not. Muntersbjorn (2003) discusses this relation explicitly. She argues that the history of mathematics is sometimes wrongfully portrayed as teleological: Earlier mathematics is developed to facilitate later mathematics. Nowadays, mathematicians might value historical mathematical theories because of the possibilities they provide for our contemporary mathematics. However, that need not be why the theory was appreciated originally. She writes:
“Mathematics is a goal-oriented activity. In general, mathematicians strive to solve problems as quickly, accurately, and generally as possible. But the immediate goals of early individuals cannot be identified solely with the particular accomplishments of later individuals. [. . .] An important measure of mathematical success is the extent to which a particular result enables subsequent results. Good mathematics is fruitful. Yet, mathematical results must have immediate value for their community of origin before they can become candidates for further elaboration, as results without immediate value do not get passed down to the next generation.” (Muntersbjorn 2003, p. 163)
Therefore, Muntersbjorn claims that there is a connection between mathematical concepts being accepted and cultivated, on the one hand, and being deemed valuable, on the other. Regarding these values, Muntersbjorn suggests “that the value of a mathematical innovation depends on its provenance and potential. Good mathematics originates from the identification of tacit insights responsible for prior success and generates further success” (Muntersbjorn 2003, p. 175). In addition to provenance and potential, she also discusses the importance of psychological factors such as brevity and clarity. The quality or value of concepts is also discussed by Carballo (2020). His main question is as follows: In virtue of what is a concept better or worse, epistemically? Carballo notes that several claims have been made that some concepts are better because they are more “natural,” “sparse,” “joint-carving,” etc. He refers to these terms as elite properties. Carballo first puts forward the hypothesis that concepts are epistemically better if and because they are more elite than others. He then argues against that hypothesis. He presents several arguments. For instance, good concepts figure in good explanations. However, good explanations do not necessarily need elite concepts. Therefore, eliteness is not necessary for good concepts. Whether Carballo makes a convincing case is beyond the scope of this chapter, but the fact is that when it comes to evaluating concepts (or definitions), mathematicians often refer to values like naturalness or fruitfulness. We will come back to this in Sect. 6.
One can also look at the success of mathematical developments in general. Corfield (2001, 2003), for instance, argues that mathematicians often have to present the value of their work. He presents five criteria for valuable mathematical developments:
1. when a development allows new calculations to be performed in an existing problem domain, possibly leading to the solution of old conjectures;
2. when a development forges a connection between already existing domains, allowing the transfer of results and techniques between them;
3. when a development provides a new way of organising results within existing domains, leading perhaps to a clarification or even a redrafting of domain boundaries;
4. when a development opens up the prospect of new, conceptually motivated domains;
5. when a development reasonably directly leads to successful applications outside of mathematics.
(Corfield 2001, pp. 508–509)
Developments that score high on all criteria are considered to be uncontroversially valuable. When developments do not score high on all criteria, the value depends on a tacit weighing of the criteria. As these criteria are formulated quite generally, they do not only apply to concepts, but also to definitions. Some definitions can be valuable because they provide “a new way of organising results within existing domains” (see point 3) or because they forge “a connection between already existing domains” (see point 2). However, there is also literature that addresses the value of definitions directly.
5.3 Why Do Mathematicians Use Certain Definitions and Not Others?
An approach that addresses the value of definitions can be found in (Werndl 2009). Her main question is: “in what ways are definitions in mathematics justified, and are these kinds of justification reasonable?” (Werndl 2009, p. 313). Werndl argues that it is important to justify definitions, since unjustified ones are less meaningful to “us.” Note that Werndl’s use of the word “us” suggests some form of consensus on what are meaningful definitions. Therefore, she concludes, justified definitions are the ones we should focus on. The first type of justification she identifies is the Lakatosian proof-generated definition. Note, however, that in Proofs and Refutations, proof-generated definitions are seldom discussed; Lakatos mainly discusses proof-generated concepts. Nonetheless, Werndl suggests that “[Lakatos’] main idea is the notion of a proof-generated definition. Here his main example is definitions of polyhedron which are justified because they are needed to make the proof of Eulerian conjecture work” (Werndl 2009, p. 315). Extrapolating the meaning of proof-generated concepts to proof-generated definitions, one might argue that the latter are definitions that are the result of improving a proof. Werndl phrases this as “a definition which is needed in order to prove a specific conjecture regarded as valuable” (Werndl 2009, p. 315). Note that in suggesting that definitions can be justified by being proof-generated, the
term “definition” should be interpreted as concept. A part of proof-generated definitions is that they capture the right domain of validity for the theorem. This then involves not the definitions themselves, but the concepts they correspond to. Next to the justification of definitions being proof-generated, Werndl identifies three other types of justifications of definitions:
• Natural-world justified definitions: definitions that “capture a preformal idea regarded as valuable for describing or understanding the natural world” (Werndl 2009, p. 320)
• Condition-justified definitions: “a definition . . . justified by the fact that it is equivalent in an allegedly natural way to a previously specified condition which is regarded as mathematically valuable” (Werndl 2009, p. 326)
• Redundancy-justified definitions: a definition that “eliminates at least one redundant condition in an already accepted definition” (Werndl 2009, p. 330)
In these cases, the term “definition” does refer to definitions. Natural-world justifications are about what captures a preformal idea, condition-justified definitions show that a certain expression captures a relevant condition as well, and redundancy-justified definitions are more economical than others.
Furthermore, one can examine how certain definitions developed over time. As mentioned, Raman-Sundström (2015) does this by looking at “the origin and development of open-cover and sequential compactness” and “how and why open-cover compactness came to be favoured” (Raman-Sundström 2015, p. 619). She shows how in general topology, open-cover and sequential compactness are not equivalent and that the open-cover version was preferred because it is “more general and applicable” (Raman-Sundström 2015, p. 629).
In the final comments, she remarks that “the story of how open-cover compactness came to be seen as the right one is a story of developing mathematics without always knowing where it was going, how important terms should be defined and how widely they might be applied” (Raman-Sundström 2015, p. 633). Even though the mathematicians involved might not always have known where the path would lead them, Raman-Sundström also shows that many choices were based on values. This becomes clear from the following excerpts:
• “[Fréchet] preferred definitions that had an intuitive feel rather than analytic power” (Raman-Sundström 2015, p. 625)
• “Alexandroff claimed the accumulation point characterization was most important initially, due to the dominance of the Bolzano–Weierstrass property, but after some years it became clear that the open-cover property was more fruitful” (Raman-Sundström 2015, p. 625)
• “some mathematicians find nets more intuitively appealing and useful, while others prefer filters” (Raman-Sundström 2015, p. 631).
The next section discusses several values regarding definitions and concepts.
6 Theme IV: Values Regarding Mathematical Definitions and Concepts
As the above section shows, values play an important role when it comes to concepts and definitions. Not only is mathematics often evaluated on the basis of values (see Sect. 5.2), mathematical developments are also often steered by these values (see Sect. 5.3). Usually, these values are referred to in an imprecise and informal way. As a response, there are several efforts to make these notions more precise. In this section, philosophical accounts of three values are discussed: naturalness, fruitfulness, and explanatory value.
6.1 Naturalness
Mathematicians often describe constructions and concepts as “natural.” Although there are methodological difficulties involved in studying an informal concept like “natural” (Mauro and Venturi 2015), several authors have reflected on this term. For instance, Corfield (2003) briefly differentiates between three types of naturalness. The first interpretation sees natural as the opposite of artificial. Natural concepts are those that we can encounter “in nature.” This, obviously, does not refer to the same nature in which one encounters plants and birds, but to “reasonably well frequented regions of mathematics” (Corfield 2003, p. 224). In the second interpretation, naturalness is linked to inevitability. Natural concepts are those concepts that mathematicians will discover at some point. Formulated more strongly, natural concepts “will necessarily appear and that, by virtue of their nature, they rather than the user will determine their use” (Corfield 2003, p. 226). If mathematicians from various backgrounds discover the same concept, then this indicates that the concept might be a natural one. Lastly, Corfield discusses a narrow interpretation of naturalness, based on the origins of category theory. In this interpretation, naturalness constitutes the absence of arbitrary choices. For instance, the proof that every finite-dimensional vector space is isomorphic to its double dual is considered natural as it requires no choice of basis. Where Corfield discusses naturalness of concepts, constructions, proofs, and a variety of mathematical phenomena, Tappenden (2008a) zooms in on concepts and definitions. He describes naturalness in terms of “carving at the conceptual joints.” Then he looks at two examples, the Legendre symbol and prime numbers. Regarding the first, he asks (1) whether this concept is natural and (2) how we can establish its naturalness.
At first sight, the Legendre symbol seems to be an artificial concept due to the case distinction in the definition (The Legendre symbol is defined by: for an odd prime $p$ and an integer $a$, $\left(\frac{a}{p}\right) = 1$ if $a$ is a quadratic residue modulo $p$; $\left(\frac{a}{p}\right) = -1$ if $a$ is a nonquadratic residue modulo $p$; and $\left(\frac{a}{p}\right) = 0$ if $a \equiv 0 \pmod{p}$.). However, it also appears to unify the theory of quadratic reciprocity. Tappenden remarks this and concludes that there are two possible perspectives on the symbol: Either it is a useful notion which helps to streamline the theory, in which case it is mathematically natural,
or it is only useful because it is a gerrymander in a particular context and not generally useful, and therefore, not natural. Tappenden argues for the first approach by showing that the Legendre symbol is useful in various different mathematical contexts and that it proves to be a key notion in various generalizations. He therefore concludes, regarding questions (1) and (2) that “[i]t is a mathematical question whether the Legendre symbol carves mathematical reality at the joints and the verdict is unequivocally yes” (Tappenden 2008a, p. 264). Tappenden also considers the carving at the conceptual joints in the case of prime numbers. He provides two definitions:
1. A natural number P is prime if and only if the only divisors of P are 1 and P.
2. A natural number P is prime if and only if for every pair of natural numbers a, b we have that if P divides ab, then P divides a or P divides b.
Although these definitions are equivalent in the context of natural numbers, the definition that is taught to high school students is the first. In contexts like algebraic number theory, however, these definitions are no longer equivalent. In order to generalize the notion of primeness, mathematicians have to choose between these definitions. To make this decision, Tappenden argues, mathematicians look at the original theorems involving prime numbers to see which property was actually doing the heavy lifting so that these theorems can be generalized. It turns out that in the general context, the high school definition is referred to as irreducibility whereas the other is referred to as primeness. The second definition therefore carves at the conceptual joints of primeness.
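The divergence of the two definitions can be made concrete with a small computational sketch. The ring Z[√−5] used below is a standard textbook example and is an assumption of this illustration, not an example taken from the sources discussed here; the helper names (`mul`, `norm`, `divides`, `elements_of_norm`, `is_irreducible`) are likewise hypothetical. Elements a + b√−5 are stored as integer pairs (a, b). In this ring, 2 satisfies the analogue of the first definition (it has no nontrivial factorization) but fails the second, since it divides (1 + √−5)(1 − √−5) = 6 without dividing either factor.

```python
from math import isqrt

def mul(x, y):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), as integer pairs."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def norm(x):
    """Multiplicative norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = x
    return a * a + 5 * b * b

def divides(x, y):
    """True if nonzero x divides y in Z[sqrt(-5)].

    y / x = y * conj(x) / N(x); x divides y exactly when both
    coordinates of y * conj(x) are multiples of N(x).
    """
    n = norm(x)
    p, q = mul(y, (x[0], -x[1]))
    return p % n == 0 and q % n == 0

def elements_of_norm(n):
    """All elements (a, b) with a^2 + 5*b^2 == n."""
    found = []
    bound = isqrt(n // 5)
    for b in range(-bound, bound + 1):
        rest = n - 5 * b * b
        a = isqrt(rest)
        if a * a == rest:
            found.extend({(a, b), (-a, b)})  # set avoids a duplicate when a == 0
    return found

def is_irreducible(x):
    """First definition, generalized: no factorization into two non-units.

    Units are exactly the elements of norm 1, so a proper factor must have
    a norm d with 1 < d < N(x) and d dividing N(x).
    """
    n = norm(x)
    if n <= 1:
        return False  # zero and units are excluded by convention
    for d in range(2, n):
        if n % d == 0 and any(divides(e, x) for e in elements_of_norm(d)):
            return False
    return True

two = (2, 0)
u, v = (1, 1), (1, -1)          # 1 + sqrt(-5) and 1 - sqrt(-5)
product = mul(u, v)             # equals (6, 0), i.e. the integer 6

print("2 is irreducible:", is_irreducible(two))
print("2 divides the product:", divides(two, product))
print("2 divides a factor:", divides(two, u) or divides(two, v))
```

On this reading, the high school definition generalizes to `is_irreducible`, while the second definition is the one that fails for 2 here, which is why the two notions receive different names in the general setting.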
6.2 Fruitfulness
Fruitfulness is seen by Tappenden (2012) as a bridge between the foundational works of Frege and the study of mathematical practice. The main connection is Frege’s view that “logical reasoning needs ‘fruitful concepts’ to extend knowledge, and that identifying and studying the fine structure of such concepts can produce further discoveries” (2012, p. 205). Several authors have tried to make sense of Frege’s usage of the term fruitful for definitions (Shieh 2008; Tappenden 1995, 2012; Horty 2007). The consensus seems to be that some definitions are merely collections of unordered characteristics, whereas others display a logical structure that, when analyzed, leads to new knowledge. In that sense, fruitful definitions are those that enable the extension of knowledge. Horty (2007), on the other hand, takes this notion and contrasts it with other parts of Frege’s theory of concepts. He concludes that there is a friction between the conception of definitions as abbreviations (cf. Sect. 3) and definitions as being fruitful. If definitions should not increase the inferential power of a logical system, how can they lead to new knowledge? Horty then assesses Frege’s work to explain this seeming contradiction, the details of which are beyond the scope of this text.
After discussing Dedekind’s remarks on fruitfulness, which are similar to Frege’s, Tappenden (2012) asks: “how can Frege/Dedekind’s recognition of fruitfulness as a practical basis for choice of concepts and definitions in mathematics helpfully guide us?” (2012, p. 211). He hopes to show that by looking at fruitfulness, we can better understand the reliance of mathematicians on “beauty.” More generally, Tappenden argues that many judgments like “elegance” or “simplicity” have a guiding role in science, even though it is not clear why these notions would lead to truth. Therefore, he concludes, we should look at those notions in detail. In Tappenden (2012), he makes a start by exploring a connection between beauty and fruitfulness. He argues that seeing the way one result might lead to others can produce an aesthetic reaction. Fruitfulness is also discussed by Yap (2011). She first tries to grasp the notion by looking at remarks by Gauss and provides a mathematical interpretation of those remarks. In this interpretation, mathematical calculi are fruitful if they can be integrated in existing methods, if they help to split mathematical problems into subproblems, and if they help to capture the core of the issues we frequently encounter. Note that a “calculus” is not, in general, the same as a “concept.” However, one of the cases that Yap discusses is the Legendre symbol, which is also discussed in Tappenden (2008a, b). Therefore, Yap’s article on the fruitfulness of calculi is also discussed here. She compares the Legendre symbol with Gauss’ congruence notation, i.e., $a \equiv b \pmod{n}$. She argues that Gauss’ congruence notation and not the Legendre symbol was useful for proving the quadratic reciprocity theorem.
Yap presents three arguments for the usefulness of Gauss’ congruence notation: It rather straightforwardly leads to partitioning the integers in equivalence classes, equivalence behaves much like equality, and this calculus enabled the induction strategy that is essential in Gauss’ first proof of the theorem. Yap argues that the Legendre symbol is not as fruitful as the congruence notation, because it is too specific. She concludes that the congruence notation has the right balance between generality and specificity. Nonetheless, the Legendre symbol has proved to be useful, but that usefulness is “tied to the truth of the reciprocity theorem itself, whereas the usefulness of congruences, being more general, is wider” (Yap 2011, p. 414).
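The objects compared in this discussion can be sketched computationally. The sketch below is illustrative only: the function names are hypothetical, and the use of Euler's criterion (the Legendre symbol of a modulo an odd prime p is congruent to a raised to the power (p − 1)/2, modulo p) is a standard number-theoretic fact brought in as an assumption, not something discussed in the sources above. The first function follows the case definition of the Legendre symbol literally, the second computes it with congruence arithmetic alone, and the last lines group a range of integers into congruence classes modulo 5 and check the quadratic reciprocity law on a few small primes.

```python
def legendre_by_cases(a, p):
    """Legendre symbol (a/p), read off directly from the case definition."""
    a %= p
    if a == 0:
        return 0
    squares = {(x * x) % p for x in range(1, p)}  # nonzero quadratic residues
    return 1 if a in squares else -1

def legendre(a, p):
    """Legendre symbol via Euler's criterion; p is assumed to be an odd prime."""
    r = pow(a, (p - 1) // 2, p)   # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

ODD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23]

# The two ways of computing the symbol agree on a window of integers.
agree = all(
    legendre(a, p) == legendre_by_cases(a, p)
    for p in ODD_PRIMES
    for a in range(-p, 2 * p)
)

# The reciprocity law, checked (not proved) on the sample primes.
reciprocal = all(
    reciprocity_holds(p, q)
    for p in ODD_PRIMES
    for q in ODD_PRIMES
    if p != q
)

# Congruence modulo 5 partitions the integers -5..9 into five classes.
classes_mod_5 = {r: [n for n in range(-5, 10) if n % 5 == r] for r in range(5)}

print("definitions agree:", agree)
print("reciprocity holds on sample:", reciprocal)
print("class of 2 modulo 5:", classes_mod_5[2])
```

A finite check of this kind is of course no substitute for a proof; it only makes visible that the symbol is defined for one specific question, whereas the congruence arithmetic underlying both functions is reused everywhere.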
6.3 Explanatoriness
It has been established that many mathematicians refer to proofs as being explanatory. Consequently, there have been many efforts to describe what mathematical explanation actually is (see D’Alessandro 2019). Another question regarding mathematical explanation is about which mathematical techniques are (or can be) explanatory or not. Lange (2009), for instance, argues that mathematical induction, in general, is not explanatory. Lehet (2019) disagrees and looks at induction in mathematical definitions to argue that it can be explanatory in some cases. To do this, Lehet first expands the theory of mathematical explanation to cover explanatory definitions as well. “A mathematical definition is explanatory if it makes
the mathematical concept being defined more accessible—i.e., it explains some feature or property of the relevant concept” (Lehet 2019, p. 6). She gives two examples of explanatory definitions. The first concerns the derivative. She focuses on which definition might be most suitable for an introductory calculus course. She argues that a definition of the derivative as a limit of secants and tangents is more explanatory than one in which the algebraic manipulations are highlighted (e.g., $\frac{d}{dx} x^n = n x^{n-1}$). For those only interested in calculating derivatives, the algebraic example is preferable. However, for a more profound understanding, the limit example is preferable, and as the limit definition makes the “deeper meaning accessible” (Lehet 2019, p. 8), it is explanatory. The second example concerns CW-complexes. The details are beyond the scope of this text. The crux is that CW-complexes are larger spaces, constructed out of smaller spaces. By doing this, mathematicians can reduce complex properties of these larger spaces into a number of simpler properties of smaller spaces. In effect, these larger spaces become accessible by reducing them to a number of accessible smaller spaces. Although Lehet (2019) was the first to explicitly discuss explanatory definitions, the aforementioned article by Raman-Sundström (2015) in a sense already shows how explanatoriness is an important factor for definitions. Given the vast literature on mathematical explanation in general, there are many opportunities for further investigating explanatory definitions.
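The contrast between the two presentations of the derivative can be illustrated numerically. The following sketch is an illustration only; the function names and the choice of the cube function at x = 2 are assumptions made for the example and are not taken from Lehet's text. The secant slopes show what the limit definition describes (slopes approaching a single value as the step shrinks), while the power rule simply returns that value without indicating where it comes from.

```python
def secant_slope(f, x, h):
    """Slope of the secant through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

def power_rule(n, x):
    """Algebraic rule: the derivative of x^n is n * x^(n - 1)."""
    return n * x ** (n - 1)

def cube(x):
    return x ** 3

# Shrinking step sizes h = 0.1, 0.01, ..., 0.00001.
steps = [10.0 ** -k for k in range(1, 6)]
slopes = [secant_slope(cube, 2.0, h) for h in steps]

exact = power_rule(3, 2.0)
errors = [abs(s - exact) for s in slopes]

for h, s in zip(steps, slopes):
    print(f"h = {h:g}: secant slope = {s:.6f}")
print(f"power rule value: {exact}")
```

For the step sizes chosen here the gap between secant slope and power-rule value shrinks at every step; much smaller steps would eventually run into floating-point rounding, which is a limitation of the sketch and not of the definition.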
7 Conclusion
Definitions (and concepts) in mathematical practice is not a standard topic of investigation in the philosophy of mathematical practice. Usually, they are discussed in light of other matters; for example, Kitcher (1983) discusses conceptual change in his description of mathematical practices. In the above, the literature on definitions (and concepts) is structured using four themes. Whereas definitions and concepts are intricately related, the first theme focuses on definitions: What is the nature/role of definitions? As one important function of definitions is specifying the meaning of a term, the relationship between terms and concepts constitutes theme two. From a broader perspective, one can look at how definitions and concepts are accepted and studied by mathematical communities, which is theme three. Within this theme, one can further differentiate between descriptive and normative accounts of how concepts enter the shared mathematical scope and literature on why certain definitions and not others are used. Theme four is about various values that are attributed to concepts and definitions. This is summarized in Fig. 1. These themes are not disjoint. For instance, the work by Tappenden (2008a) shows why the mathematical community might choose to work with a particular definition (cf. theme III) as well as the role of the naturalness considerations in this decision (cf. theme IV). Moreover, some types of conceptual change (cf. theme II)
[Fig. 1 The four themes and their relations. Theme I: What is the role/nature of definitions? Theme II: The relation between terms and concepts. Theme III: Concepts and definitions from a communal perspective (descriptive accounts; normative accounts). Theme IV: Values of definitions and concepts.]
can also be considered cases of certain concepts becoming fashionable and others falling out of fashion (cf. theme III). The themes represent different paths that lead to the discussion of definitions and concepts in mathematical practice. However, these paths are connected. For instance, beneficial properties of definitions (cf. theme IV) are dependent on the expectations and aims of definitions (cf. theme I). Different values (cf. theme IV) can guide the process of choosing which concept to relate to a term (cf. theme II). Definitions can become fashionable because they capture a relevant concept in a natural way (cf. themes II, III, and IV). Studying the relations between the different themes is therefore a promising avenue for future research.
References
Andersen LE (2018) Acceptable gaps in mathematical proofs. Synthese:1–15. https://doi.org/10.1007/s11229-018-1778-8
Baldwin J (2018) The explanatory power of a new proof: Henkin’s completeness proof. In: Piazza M, Pulcini G (eds) Truth, existence and explanation: FilMat 2016 studies in the philosophy of mathematics. Springer International Publishing, Cham, pp 147–162. https://doi.org/10.1007/978-3-319-93342-9_9
Belnap N (1993) On rigorous definitions. Philos Stud 72(2):115–146. https://doi.org/10.1007/BF00989671
Cappelen H, Plunkett D (2020) Introduction: a guided tour of conceptual engineering and conceptual ethics. In: Conceptual engineering and conceptual ethics. Oxford University Press, Oxford, pp 1–26. https://doi.org/10.1093/oso/9780198801856.003.0001
Carballo AP (2020) Conceptual evaluation: epistemic. In: Conceptual engineering and conceptual ethics. Oxford University Press, Oxford, pp 304–332. https://doi.org/10.1093/oso/9780198801856.003.0015
Carter J (2013) Handling mathematical objects: representations and context. Synthese 190(17):3983–3999
Cellucci C (2018) Definition in mathematics. Eur J Philos Sci 8(3):605–629
Corfield D (2001) The importance of mathematical conceptualisation. Stud Hist Philos Sci Part A 32(3):507–533. https://doi.org/10.1016/S0039-3681(01)00007-3
Corfield D (2003) Towards a philosophy of real mathematics. Cambridge University Press, Cambridge
D’Alessandro W (2019) Explanation in mathematics: proofs and practice. Philos Compass 14(11):e12629. https://doi.org/10.1111/phc3.12629
Dawson JW Jr (2006) Why do mathematicians re-prove theorems? Philos Math 14(3):269–286. https://doi.org/10.1093/philmat/nkl009
De Villiers M (1990) The role and function of proof in mathematics. Pythagoras 24:17–24
Detlefsen M (2008) Purity as an ideal of proof. In: Mancosu P (ed) The philosophy of mathematical practice. Oxford University Press, Oxford, pp 179–197
Frans J, Kosolosky L (2014) Mathematical proofs in practice: revisiting the reliability of published mathematical proofs. Theoria Int J Theory Hist Found Sci 29(3(81)):345–360
Geist C, Löwe B, Van Kerkhove B (2010) Peer review and knowledge by testimony in mathematics. In: PhiMSAMP: philosophy of mathematics: sociological aspects and mathematical practice. College Publications, London, pp 155–178
Goethe NB, Friend M (2010) Confronting ideals of proof with the ways of proving of the research mathematician. Stud Logica 96(2):273–288. https://doi.org/10.1007/s11225-010-9284-0
Gupta A (2019) Definitions. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Winter 2019 edn. Metaphysics Research Lab, Stanford University
Hacking I (2014) Why is there philosophy of mathematics at all? Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9781107279346
Hamami Y (2019) Mathematical rigor and proof. Rev Symb Log:1–41. https://doi.org/10.1017/S1755020319000443
Horty JF (2007) Frege on definitions: a case study of semantic content. Oxford University Press, Oxford
Inglis M, Aberdein A (2015) Beauty is not simplicity: an analysis of mathematicians’ proof appraisals. Philos Math 23(1):87–109. https://doi.org/10.1093/philmat/nku014
Kitcher P (1983) The nature of mathematical knowledge. Oxford University Press, New York
Kjeldsen TH, Carter J (2012) The growth of mathematical knowledge—introduction of convex bodies. Stud Hist Philos Sci Part A 43(2):359–365. https://doi.org/10.1016/j.shpsa.2011.12.031
Kleiner I (1991) Rigor and proof in mathematics: a historical perspective. Math Mag 64(5):291–314. https://doi.org/10.2307/2690647
Koetsier T (1991) Lakatos’ philosophy of mathematics: a historical approach. Studies in the history and philosophy of mathematics, 0928–2017, vol 3. North-Holland, Amsterdam
Lakatos I, Worrall J, Zahar E (2015) Proofs and refutations: the logic of mathematical discovery. Cambridge University Press, Cambridge
Lange M (2009) Why proofs by mathematical induction are generally not explanatory. Analysis 69(2):203–211
Larvor B (1998) Lakatos: an introduction. Routledge, London & New York
Larvor B (2012) How to think about informal proofs. Synthese 187(2):715–730. https://doi.org/10.1007/s11229-011-0007-5
Lehet E (2019) Induction and explanatory definitions in mathematics. Synthese. https://doi.org/10.1007/s11229-019-02095-y
Margolis E, Laurence S (2019) Concepts. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Summer 2019 edn. Metaphysics Research Lab, Stanford University
Mauro LS, Venturi G (2015) Naturalness in mathematics. In: Lolli G, Panza M, Venturi G (eds) From logic to practice: Italian studies in the philosophy of mathematics. Springer International Publishing, Cham, pp 277–313. https://doi.org/10.1007/978-3-319-10434-8_14
Muntersbjorn MM (2003) Representational innovation and mathematical ontology. Synthese 134(1/2):159–180
Ouvrier-Buffet C (2013) Modeling of the defining activity in mathematics and of its dialectic with the proving process epistemological study and didactical challenges. Université Paris-Diderot – Paris VII, Paris
Ouvrier-Buffet C (2015) A model of mathematicians’ approach to the defining processes. In: CERME 9 – ninth congress of the European Society for Research in Mathematics Education, Prague, Czech Republic, 2015-02-04. pp 2214–2220
Raman-Sundström M (2015) A pedagogical history of compactness. Am Math Mon 122(7):619–635. https://doi.org/10.4169/amer.math.monthly.122.7.619
Robinson R (1954) Definition. Clarendon Press, Oxford
Schlimm D (2012) Mathematical concepts and investigative practice. In: Scientific concepts and investigative practice. De Gruyter, Berlin/Boston, pp 127–148. https://doi.org/10.1515/9783110253610
Shieh S (2008) Frege on definitions. Philos Compass 3(5):992–1012. https://doi.org/10.1111/j.1747-9991.2008.00167.x
Tanswell FS (2018) Conceptual engineering for mathematical concepts. Inquiry 61(8):881–913. https://doi.org/10.1080/0020174X.2017.1385526
Tappenden J (1995) Extending knowledge and ‘fruitful concepts’: Fregean themes in the foundations of mathematics. Noûs 29(4):427–467. https://doi.org/10.2307/2216281
Tappenden J (2008a) Mathematical concepts and definitions. In: Mancosu P (ed) The philosophy of mathematical practice. Oxford University Press, Oxford, pp 256–275. https://doi.org/10.1093/acprof:oso/9780199296453.003.0010
Tappenden J (2008b) Mathematical concepts: fruitfulness and naturalness. In: Mancosu P (ed) The philosophy of mathematical practice. Oxford University Press, Oxford, pp 276–301
Tappenden J (2012) Fruitfulness as a theme in the philosophy of mathematics. J Philos 109(1/2):204–219
Thurston WP (1994) On proof and progress in mathematics. Bull Am Math Soc 30(2):161–177. https://doi.org/10.1090/S0273-0979-1994-00502-6
Werndl C (2009) Justifying definitions in mathematics – going beyond Lakatos. Philos Math 17(3):313–340
Wilder RL (1953) The origin and growth of mathematical concepts. Bull Am Math Soc 59(5):423–448
Yap A (2011) Gauss’ quadratic reciprocity theorem and mathematical fruitfulness. Stud Hist Philos Sci Part A 42(3):410–415. https://doi.org/10.1016/j.shpsa.2010.09.002
Descartes’ Transformation of Greek Notion of Proportionality Piotr Błaszczyk
Contents
1 Introduction
2 Euclidean Proportions
2.1 Foundations
2.2 The Use of Proportions
3 Descartes’ Algebra of Line Segments
3.1 From Greek Proportion to Descartes’ Algebra of Line Segments
4 Arithmetic of Line Segments
4.1 Defining Descartes’ Operations
5 The Field of Line Segments
5.1 True and False Roots
5.2 Zero and Sign Rules
5.3 Constructing False Roots
5.4 Ordered Field of Line Segments
6 Applications
6.1 Pythagorean Theorem
6.2 Formula for a Curve
7 Pappus Problem
7.1 Formula for Locus
7.2 Constructing Locus
7.3 Numerical Example
7.4 Magnitudes, Numbers, and Quantities Versus Elements of an Ordered Field
8 Ancient Greek Numerical Practice Based on Geometry
8.1 Euclidean Arithmetic and Beyond
8.2 Heron’s Formula for the Area of a Triangle
9 Euclidean Versus Descartean Plane
10 Summary
References
The author is supported by the National Science Centre, Poland grant 2018/31/B/HS1/03896. P. Błaszczyk (*) Institute of Mathematics, Pedagogical University of Cracow, Cracow, Poland e-mail: [email protected] © Springer Nature Switzerland AG 2021 B. Sriraman (ed.), Handbook of the History and Philosophy of Mathematical Practice, https://doi.org/10.1007/978-3-030-19071-2_16-1
Abstract
In ancient Greek mathematics, the general term μέγεθος stands for geometric magnitudes of all kinds. Euclid’s book V develops a theory of magnitudes, their ratios, and proportions. Viewed from the modern perspective, the theory of proportions is a technique of processing ratios. In early modern mathematics, it was replaced by implicit rules of an ordered field, then, in the nineteenth century, by the arithmetic of real numbers. We formalize Euclid’s theory from book V and show how it was reshaped in Descartes’ La Géométrie.
Keywords
Magnitude · Ratio · Proportion · Algebra of line segments · Arithmetic of line segments · Ordered field · Real closed field · Descartes’ plane
1 Introduction
Descartes’ 1637 Discours de la Méthode pour bien conduire sa raison, et chercher la vérité dans les sciences, plus la Dioptrique, les Météores et la Géométrie qui sont des essais de cette Méthode (Descartes 1637b) consists of the philosophical essay Discourse on Method and three scientific treatises, as we would call them today. Each of them contains a spectacular discovery. In Météores, Descartes provides a widely accepted explanation of the rainbow phenomenon. In Dioptrics, he derives the law of refraction, currently known as Snell’s law, as well as fundamental laws concerning elliptic and hyperbolic lenses. In Geometry, he presents a solution to the Pappus problem for any number of lines. The problem itself and the general solution suggested by Descartes are by now forgotten. Yet it was his novel method of solving problems in geometry that had a far-reaching impact on mathematics and made La Géométrie a masterpiece of mathematical writing. Descartes’ method is commonly referred to as analytic geometry. In fact, it took centuries to turn his original procedures into the modern analytic form. One aspect of that method consists of operations on line segments, operations that replaced the ancient technique of proportions.
2 Euclidean Proportions
In ancient Greek mathematics, the general term μέγεθος (magnitude) covers line segments, triangles, convex polygons, circles, angles, arcs of circles, and solids. Euclid’s book V develops a rigorous treatment of magnitudes, their ratios, and proportions. It is one of the most thorough theories making up the Elements. At the same time, it calls for a formal examination. Technical terms like greater, lesser, multitude, and proportion are applied consistently throughout the book.
2.1 Foundations
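We formalize magnitudes of the same kind (line segments being of one kind, triangles being of another, etc.) as an additive semigroup with a total order, $(M, +, <)$, characterized by the following axioms:

E1 $(\forall x, y)(\exists n \in \mathbb{N})(nx > y)$,
E2 $(\forall x, y)(\exists z)(x < y \Rightarrow x + z = y)$,
E3 $(\forall x, y, z)(x < y \Rightarrow x + z < y + z)$,
E4 $(\forall x)(\forall n \in \mathbb{N})(\exists y)(x = ny)$.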
The term nx as applied in these formulas is defined by

\[ nx = \underbrace{x + \dots + x}_{n\ \text{times}}. \]
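As a rough illustration of the definition of nx and of the axioms on addition and order, the following sketch checks the existence of a difference (x < y implies x + z = y for some z), the compatibility of order with addition, and the existence of an n-th part (x = ny) on a finite sample. Modeling magnitudes as positive rationals is an assumption made only for this example, and the names multiple and check_axioms are hypothetical; neither comes from the source.

```python
from fractions import Fraction
from itertools import product


def multiple(n, x):
    """Return nx, built as the n-fold sum x + ... + x (n >= 1)."""
    total = x
    for _ in range(n - 1):
        total = total + x
    return total


def check_axioms(samples, max_n=5):
    """Check E2, E3 and E4 on a finite sample of positive rationals."""
    for x, y in product(samples, repeat=2):
        if x < y:
            z = y - x                      # E2: witness z with x + z = y
            if not (z > 0 and x + z == y):
                return False
            for w in samples:              # E3: adding w preserves the order
                if not x + w < y + w:
                    return False
    for x in samples:
        for n in range(1, max_n + 1):
            part = x / n                   # E4: witness y with x = ny
            if multiple(n, part) != x:
                return False
    return True


# Assumed model: positive rationals stand in for magnitudes of one kind.
samples = [Fraction(1, 3), Fraction(1, 2), Fraction(5, 4), Fraction(7)]
print(multiple(3, Fraction(1, 2)))   # 3/2
print(check_axioms(samples))         # True
```

A finite check of this kind only illustrates the axioms; it proves nothing about magnitudes in general, and the positive rationals are merely one convenient model among many.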
The relation greater than is characterized by transitivity and the law of trichotomy; both Euclid’s and Aristotle’s texts provide evidence for it. Equality of line segments, angles, and arcs means congruence. Equality of figures should be interpreted within Euclid’s theory of area, as explained in (Błaszczyk 2018); equality of solids involves an analogous theory developed in book XII of the Elements.

The symbol + interprets arguments like the following: “And since AG [a] is equal to E [a], and CH [c] to F [c], thus AG, F [a + c] is equal to CH, E [c + a]” (V.25), that is,¹

\[ a + c = c + a. \]

Thus, in the text, instead of AG + F, or AG and F, we find the term AG, F. Moreover, we interpret the equality AG, F = CH, E as the commutative property. In a similar way, we could show that the addition of magnitudes is an associative operation. By a similar textual analysis, we could certify axioms E2 and E3.

Note that in book V, magnitudes are represented by line segments; hence the term AG, E finds an obvious interpretation as a concatenation of two lines. Yet the results of book V are applied to magnitudes of any kind. In general, viewed from the methodological perspective, the addition of magnitudes should be treated as a primitive concept. Nonetheless, within Euclid’s theory of equal figures, the sum of squares could be defined as a square determined on the ground of proposition I.47, and the sum of similar figures as a figure similar to the ones being added, determined on the ground of proposition VI.31.²

E1 interprets definition V.4, the so-called Archimedean axiom: “Magnitudes [x, y] are said to have a ratio with respect to one another which, being multiplied [nx] (πολλαπλασιαζόμενα), are capable of exceeding one another [nx > y].” The term nx stands here for the phrase multiple of the magnitude. Axiom E1 is applied in book V only once: in the proof of proposition V.8. Then, in a modified form, it is applied in proposition X.1 on incommensurable lines.
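The content of the Archimedean axiom E1 can be made concrete with a small sketch that searches for the least multiple nx exceeding y. As before, positive rationals are only an assumed stand-in for magnitudes, and the function name least_multiple_exceeding is hypothetical.

```python
from fractions import Fraction


def least_multiple_exceeding(x, y):
    """Return the least n >= 1 with nx > y, for positive x and y."""
    if x <= 0 or y <= 0:
        raise ValueError("magnitudes are modeled as positive quantities")
    n = 1
    total = x            # invariant: total equals nx, the n-fold sum of x
    while not total > y:
        total = total + x
        n += 1
    return n


# Assumed model: positive rationals stand in for magnitudes of one kind.
print(least_multiple_exceeding(Fraction(1, 3), Fraction(7)))   # 22
print(least_multiple_exceeding(Fraction(7), Fraction(1, 3)))   # 1
```

The loop terminates precisely because the chosen model is Archimedean; in a non-Archimedean ordered structure there are pairs x, y for which no such n exists.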
¹ English translations of the Elements follow (Euclid 2007). The accuracy of this edition is easy to verify, since it also includes the Greek text of Heiberg’s classic edition (Heiberg 1883).
² See (Błaszczyk 2018; Błaszczyk and Petiurenko 2020), respectively.
Geometry developed in books I–IV of the Elements does not rely on E1 and can be modeled on a non-Archimedean plane. It can be shown that without E1, proposition V.8, crucial for the theory, does not hold. Yet the theory of similar figures as developed in book VI builds on the results of book V; hence Euclidean similar figures should be modeled on an Archimedean plane.

E4 is implicitly applied in proposition V.5. This axiom is not essential, as it can be derived from the remaining four, i.e., E1–E3, and E5 as presented below.

Next to sums and the relation greater than, book V introduces yet another relation on magnitudes, namely proportion. In Greek manuscripts, it is rendered by the schematic phrase τὸ Α πρὸς τὸ Β, οὕτως τὸ Γ πρὸς τὸ Δ (“as A is to B, so is C to D”). For a symbolic representation of proportion, we adopt the term x : y :: z : v. In fact, this symbol was introduced already in the seventeenth century.³ Descartes did not apply any symbolic shortcuts for proportions and mimicked Euclid’s phrase. Definition V.5 clearly brings in the proportion in a verbal form. We interpret it by the following formula:

\[ x : y :: z : v \Leftrightarrow_{df} (\forall m, n \in \mathbb{N})[(nx >_1 my \Rightarrow nz >_2 mv) \wedge (nx = my \Rightarrow nz = mv) \wedge (nx <_1 my \Rightarrow nz <_2 mv)]. \]