English Pages 430 Year 2014
Formalism and Beyond
Logos
Studien zur Logik, Sprachphilosophie und Metaphysik Herausgegeben von/Edited by Volker Halbach, Alexander Hieke, Hannes Leitgeb, Holger Sturm
Volume / Band 23
Formalism and Beyond
On the Nature of Mathematical Discourse Edited by Godehard Link
ISBN 978-1-61451-829-7
e-ISBN (PDF) 978-1-61451-847-1
e-ISBN (EPUB) 978-1-61451-996-6
ISSN 2198-2201

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2014 Walter de Gruyter Inc., Boston/Berlin
Printing: CPI books GmbH, Leck
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com
Contents

Preface

Duality, Epistemic Efficiency & Consistency
Michael Detlefsen
  1 Introduction
  2 Abstract Duality or Dualization?
  3 The Contentual Addition Model of Dualization
  4 Proofs & Proof Developments
  5 The Contentual Addition Model & The Traditional Contentualist View of Proof
  6 Contentual Addition in an Abstract Setting
  7 Non-Trivial Axiom Systems
  8 Conclusion

Frege on Quantities and Real Numbers in Consideration of the Theories of Cantor, Russell and Others
Matthias Schirn
  1 Introduction
  2 The concept of quantity in Frege’s writings between 1874 and 1884
  3 Cantor’s theory of irrational numbers and Frege’s critique
  4 Russell on quantities and real numbers in Principles of Mathematics and Principia Mathematica
  5 Quantities and real numbers in Grundgesetze
  6 Frege’s plan carried out: von Kutschera’s account

Frege on Formality and the 1906 Independence-Test
Patricia A. Blanchette
  1 Introduction
  2 The Proposal
  3 The Import of the 1910 Notes
  4 The Anti-Metatheory Explanation
  5 The Similarity with Hilbert
  6 Conclusion

Formal Discourse in Russell: From Metaphysics to Philosophical Logic
Godehard Link
  1 Introduction
  2 Setting the Stage: Russell’s Early Ontology
  3 On the Nature of Functions
  4 The Substitutional Theory
  5 Principia Mathematica
  6 Ramification: Gödel’s Gestalt Switch
  7 Lessons for Ontology
  8 Conclusion

On Live and Dead Signs in Mathematics
Felix Mühlhölzer
  1 A Mess Concerning the Reference, Interpretation and Application
  2 How can Intended Models be Singled Out?
  3 Strings of Strokes in Hilbertian Finitism

Generalization and the Impossible
Paul Ziche
  1 “Contradictions are emotions”: The example of the complex numbers
  2 Russell on symbolism: Making the simple complicated
  3 Ways into logic: Generalization and abstraction
  4 Pure logic and meta-scientific induction
  5 Scientistic liberalism and interesting generalizations

Assumptions of Infinity
Karl-Georg Niebergall
  1 Introduction
  2 “T makes an assumption of infinity”, “T merely assumes the finite”
  3 Expressing infinity: a preliminary suggestion
  4 Axioms for and definitions of “finite”
  5 Elaboration of (DIiii)
  6 The potentially infinite
  7 Conclusion
  8 Appendix

The Interpretation of Classes in Axiomatic Set Theory
Daniel Roth, Gregor Schneider
  1 Introduction
  2 Set Theories
  3 Interpreting Classes
  4 Concluding Remarks

Purity in Arithmetic: some Formal and Informal Issues
Andrew Arana
  1 Introduction
  2 Topical purity
  3 The infinitude of primes
  4 Incompleteness and the possibility of purity
  5 Closing thoughts

Domain Extensions and Higher-Order Syntactical Interpretations
Marek Polański
  1 Introductory remarks
  2 Domain extensions: some paradigmatic examples
  3 L-operations and L-constructions
  4 Higher-order syntactical interpretations and their constructions
  5 Concluding remarks

Finite Methods in Mathematical Practice
Laura Crosilla, Peter Schuster
  1 Introduction
  2 Hilbert’s programme now and then
  3 Finite methods for constructive algebra
  4 Geometric formulas and dynamical proofs
  5 Realising Hilbert’s programme in commutative algebra
  6 Appendix

List of Contributors
Name Index
Preface

The present volume grew out of a cooperation between German, Italian, and US-American researchers from 2005 to 2011, which was funded in equal parts by the Alexander-von-Humboldt Foundation and a matching fund from the University of Notre Dame within the format of one of the Foundation’s TransCoop projects. The general theme of the cooperation was called: “Imaginary and Ideal Elements and Limit Concepts in Mathematics: Their Theory, History, and Philosophical Understanding”. The title was motivated by an ongoing philosophical interest in the role of formalist aspects in mathematical theorizing and practice. Accordingly, the research activities ranged from studies approaching historical topics with modern logico-philosophical tools to systematic conceptual and logical analyses.

The contributions presented here cover this ground and to some degree extend it in various directions. Papers focussing on central historical figures in the field, like Frege, Russell, Hilbert, and Wittgenstein, are accompanied by those dealing with issues like infinity, finiteness, and proof procedures and ones putting formalist mathematics into historical perspective. More general information is given in the abstracts preceding each paper.

A note on the name index: Names of authors might not explicitly appear on a page mentioned in the index if their work is merely referred to via numbers in square brackets. In such a case the names can be retrieved from the bibliography of the essay concerned.

In the name of all contributors, I wish to thank the Humboldt Foundation and the University of Notre Dame for funding the project, and the Munich Center for Mathematical Philosophy (MCMP) for generous additional financial support. I also thank several anonymous reviewers for their expert opinion; Jesse Tomalty for looking over the English of non-native speakers; Johannes Stern and Roland Poellinger for administrational work at various phases of the project. In particular, Mic Detlefsen is to be thanked for originally initiating the transatlantic cooperation, and Gregor Schneider for serving as the general LaTeX editor of the volume. Finally, I thank the editors of the series and the publisher for agreeing to produce the book with them.

Munich, June 2014, G. L.
Duality, Epistemic Efficiency & Consistency
Michael Detlefsen
Duality has often been described as a means of extending our knowledge with a minimal additional outlay of investigative resources. I attempt to construct a serious argument for this view. Certain major elements of this argument are then considered at length. They’re found to be out of keeping with certain widely held views concerning the nature of axiomatic theories (both in projective geometry and elsewhere). They’re also found to require a special form of consistency requirement.
1 Introduction

Duality or reciprocity principles in projective geometry have been described as being among the most significant developments in modern geometry.¹ The geometer H. S. M. Coxeter stated the basic rationale behind such claims as follows:

One of the most attractive features of projective geometry is the symmetry² and economy with which it is endowed by the principle of duality: fifty detailed proofs may suffice to establish as many as a hundred theorems. [9, p. 25]³,⁴

¹ Cf. [36, p. 24].
² ‘Symmetry’ is the term Klein, Veblen and Young (cf. [32, p. 7]) and others used to describe an important structural feature of the theorem-set of projective geometry they took to be induced by duality.
³ For a sampling of similar statements see [25, p. 25]; [23, p. 217]; [20, pp. 3-4]; [19, p. 15].
⁴ For convenience, I’ll refer to the basic epistemological idea expressed in this claim – that dualization effectively provides for the doubling of the knowledge represented by a given body of primary proofs – as the doubling idea.

Duality has thus been seen as a means of effecting a dramatic economy in our geometrical thinking. It has in fact been linked to economy or efficiency from the very start. A supposed connection is clear, for example, in the writings of Gergonne, who emphasized that efficiency and organization are just as important to the development of scientific thinking as the discovery and justification of “new truths”. Indeed, Gergonne suggested, for extensively developed sciences, improved internal organization may be an even more compelling consideration than the development of further new knowledge. In his view, in fact, this was the case for the mathematics of his day.

[A]t the point which mathematics has reached today, . . . encumbered as we are with theorems of which even the most intrepid memory cannot flatter itself it retains the statements, it would perhaps be less useful to science to seek new truths than to reduce the truths already discovered to a small number of guiding principles. In any case a science perhaps recommends itself less by the multitude of propositions which make it up than by the manner in which these propositions are related and connected to one another. [14, pp. 150-151]⁵
⁵ An obvious question, of course, is why the improvements in internal organization should not themselves be seen as constituting discoveries of new truths. Gergonne need not have denied that they are. His point could be reformulated by making a distinction between new truths of the type typically identified as such by mathematicians and truths not so recognized. His suggestion would then have been that development of the former is in certain circumstances as important as development of the latter.

Roughly a century after Gergonne’s observation that accumulated knowledge sometimes stands in need of the type of simplifying “reduction” suggested by duality, Veblen and Young suggested making dual structuring, or something akin to it, an ideal (or at least a virtue) of axiomatic theorizing in geometry more generally. They termed this ideal symmetry, and they described it as a type of internal structuring of a theory in which proofs and/or theorems are “paired” in such a way that, from a given element of a given pair, the other element can be obtained by application of a routine mechanical transformation.⁶ They listed this ideal alongside such better-known general standards and ideals of scientific theorizing as consistency, independence and categoricity.⁷

There is, further, the desideratum of utmost symmetry . . . in the whole body of theorems.⁸ . . . Symmetry can frequently be obtained by a judicious choice of terminology. This is well illustrated by the concept of “points at infinity” which is fundamental in any treatment of projective geometry. . . . Let us now consider the . . . two propositions:
1. Any two distinct points of a plane are on one and only one line.
1′. Any two distinct lines of a plane are on one and only one point.
Either of these propositions is obtained from the other by simply interchanging the words point and line. . . . In view of the symmetry of these two propositions it would clearly add much to the symmetry and generality of all propositions derivable from these two, if we could regard them both as true without exception. This can be accomplished by attributing to two parallel lines a point of intersection. [32, pp. 7-8]⁹

⁶ Veblen and Young spoke of mathematical ‘sciences’ where I speak of mathematical ‘theories’. “We understand the term a mathematical science to mean any set of propositions arranged according to a sequence of logical deductions.” [32, p. 2]
⁷ Cf. [32, §§1, 2].
⁸ Veblen and Young also introduced a further condition they termed ‘generality.’ They described it as follows: “[T]he applicability of a theorem shall be as wide as possible. This has relation to the arrangement of the assumptions, and can be attained by using in the proof of each theorem a minimum of assumptions.” This description suggests that generality too might have been intended to ‘compress’ or tighten a theory’s internal organization by maximizing the specificity of connections between theorems and axioms.
⁹ It should perhaps be noted that what we’re here calling duality, or related (though not always identical) notions going by the same name, have been suggested as ideals for theories other than projective geometry. A recent example is an appeal for duality in algebraic geometry (cf. [2]). There are more remote examples as well, one of which concerns duality in set theory and its relationship to duality in projective geometry (cf. [29]).

Veblen’s and Young’s suggestion thus seems to have been that such dualities as the familiar point-line duality of plane projective geometry provide for a resource-conserving conversion of certain proofs into certain other proofs, and that this convertibility, in turn, provides for increased efficiency in the development of our knowledge.¹⁰

¹⁰ Other dualities (e.g. point-plane) were taken to offer similar benefits. These dualities took various forms. We have already noted dualities for particular types of geometrical figures (e.g. the above-mentioned point-line duality for planar figures, or the point-plane duality for spatial figures). Some also offered more abstract statements of dualities such as the following: “[F]rom any statement or theorem concerning the relative positions of the elements composing a geometrical configuration, another statement or theorem can be obtained by a simple interchange of the elements of the configuration with their reciprocals.” [10, p. 10].

The reasoning behind remarks such as those just surveyed has not been set out carefully and explicitly. A first task, then, is to try to get clearer on what such reasoning might look like. The following argument, I believe, is an at least credible first candidate. Its clarification and evaluation will be the chief concerns of this paper.

Dualization Argument

Comparative Epistemic Gain: To make use of dualization provides an opportunity to significantly increase the extent of our knowledge over what it would be were we not to make use of it.¹¹,¹²
Modest Developmental Costs: To make use of dualization requires only a modest outlay of developmental resources or “labor” beyond those needed to obtain the primary proofs to which it (i.e. dualization) is to be applied.¹³
Nature of Epistemic Efficiency: If the gains-to-costs ratio of one method of epistemic development is greater than that of another, we say that the epistemic efficiency of the one is higher than that of the other.
∴ Improved Epistemic Efficiency: Dualization presents a compelling opportunity to improve epistemic efficiency.
∴ Dualization: We ought to make use of dualization when we have the opportunity to do so.

¹¹ Or, if not the quantity or extent of what is strictly speaking knowledge, the quantity of some other type of epistemic good.
¹² The following remark by Smart is an example of the use of explicitly epistemic language to describe the virtues of duality. “From any known theorem we are . . . able to write down a reciprocal theorem whose truth we can at once assert; and we have thus a useful and valuable method of extending our knowledge of geometrical properties.” [28, p. 260].
¹³ The following remark by Mathews is an example of how the term ‘labor’ has been used to describe the benefits of dualization. “[P]ractically the principle of duality halves our labour, because all we have to do is to translate, so to speak, the enunciation of a proved proposition into that of its correlative, and then infer the latter at once”. [20, p. 4].

This or something like it seems to have been the basic reasoning behind the claims concerning the significance of duality just surveyed. There are a number of points at which it invites closer examination. My focus will be the first premise, Comparative Epistemic Gain, whose justification, I believe, is problematic. Before turning to this premise and the question of its justification, however, I want briefly to consider what may be an even more basic matter – namely, whether it is duality principles per se (i. e., duality principles conceived essentially as abstract existence claims), or dualization processes, that are supposed to sustain whatever epistemic economies there are that may credibly be attributed to duality.¹⁴

¹⁴ There is another basic matter I want to mention, even though I will not discuss it further here. This concerns an assumption that is implicit in the Dualization Argument, namely, that there are reasons to want to economize on the expenditure of cognitive resources. This leads fairly directly to the question of (i) whether and, if so, in what ways cognitive resources are “limited” or “depletable”, and to questions concerning (ii) how limits on and/or depletion of resources for individuals may compare to limits on and/or depletion of resources for communities. These are important and difficult questions, but questions I will not go into here.
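The arithmetic implicit in the Dualization Argument can be made explicit in a rough way. What follows is a sketch only. The symbols g, g′, c, ε and E are introduced here purely for illustration and are not the author's notation, and the sketch assumes, as the argument itself seems to, that epistemic gains and developmental costs admit of at least rough quantitative comparison.

```latex
% Assumed notation (hypothetical, for illustration only):
%   g  > 0  epistemic gain of a primary proof development
%   c  > 0  developmental cost of that primary proof development
%   g'      additional gain obtained by dualizing it
%   \varepsilon c   additional cost of dualizing it (\varepsilon \ge 0)
\[
  E_{\mathrm{primary}} = \frac{g}{c},
  \qquad
  E_{\mathrm{dualized}} = \frac{g + g'}{c + \varepsilon c}.
\]
% Dualization raises the gains-to-costs ratio exactly when the proportional
% gain exceeds the proportional cost:
\[
  E_{\mathrm{dualized}} > E_{\mathrm{primary}}
  \iff g + g' > (1 + \varepsilon)\, g
  \iff g' > \varepsilon\, g .
\]
% Under the doubling idea (g' \approx g) this becomes:
\[
  E_{\mathrm{dualized}} \approx \frac{2}{1 + \varepsilon}\, E_{\mathrm{primary}},
  \qquad
  \text{e.g. } \varepsilon = 0.1 \;\Rightarrow\; \frac{2}{1.1} \approx 1.82 .
\]
```

On this reading, Modest Developmental Costs corresponds to a small ε, and Comparative Epistemic Gain corresponds to g′ being comparable to g. The second of these is the premise examined in what follows.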
2 Abstract Duality or Dualization?

There is, I think, a difference between taking the economy that duality is supposed to represent to be a product of what might be described as “abstract” knowledge of the existence of dual proofs and taking it to be a product of such knowledge of dual proofs as might be obtained through application of known processes of dualization to known primary proofs. For cases where the knowledge represented by having and comprehending a proof of a theorem differs materially from that represented by merely knowing that there is a proof of that theorem, the above distinction can be expected to matter. How extensive and important such differences may be is difficult to say in any general way, and I will make no attempt to cut through this difficulty here.

This notwithstanding, the relevance of such a distinction seems an important point to bear in mind. Generally speaking, its importance will be proportional to the extent to which knowing a proof of a proposition provides for better knowledge of it (and perhaps also more extensive knowledge of related propositions) than merely knowing that there is a proof of it. This being so, I will consider dualization rather than mere abstract duality as the focal form duality should be considered to take for purposes of reckoning the extent to which it may increase epistemic efficiency. Those who have written on the significance of duality have not always been careful to mark such a distinction.

[G]iven a theorem and its proof, we can immediately assert the dual theorem; for a proof of the latter could be written down mechanically by dualizing every step in the proof of the original theorem. [9, p. 231]
Such statements are in certain respects puzzling. Coxeter describes the process as beginning with a given theorem and a given proof of it. This presumed theorem and its presumed proof are what I will generally refer to as the primary theorem and proof of a dualization process. Coxeter says that these “givens” immediately justify assertion of the dual theorem. What seems curious to me is his description of what is given. He does not say that what is given is that there is a proof of the primary theorem. Rather, he describes the given as the primary proof itself. This naturally suggests an intention to build not on the abstract fact of the existence of a primary proof, but on the substance of that proof itself. In other words, it suggests the following plan of epistemic expansion:

Dualization: Given a primary proof of a primary theorem, we mechanically transform it into a dual proof of the dual theorem. Knowledge of this newly obtained proof then justifies, among other things, assertion of the dual theorem.
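The mechanical character of the plan just stated can be illustrated with a small sketch. It assumes, purely for illustration, that dualization in the plane amounts to a lexical interchange of the words "point" and "line" in each step, as in the two propositions Veblen and Young give. The names `SWAPS`, `dualize` and `dualize_proof` are hypothetical, and a realistic treatment would also have to pair incidence idioms and handle capitalization.

```python
import re

# Hypothetical vocabulary for the plane point-line duality (an assumption of
# this sketch). Only lowercase singular and plural forms are handled.
SWAPS = {"point": "line", "line": "point", "points": "lines", "lines": "points"}
_WORD = re.compile(r"\b(" + "|".join(SWAPS) + r")\b")


def dualize(sentence):
    """Interchange 'point' and 'line' in a single pass over the sentence."""
    # A single pass matters: two successive replacements would undo each other.
    return _WORD.sub(lambda m: SWAPS[m.group(1)], sentence)


def dualize_proof(steps):
    """Dualize every step of a proof given as a list of sentences."""
    return [dualize(step) for step in steps]


primary = "Any two distinct points of a plane are on one and only one line."
dual = dualize(primary)
print(dual)  # Any two distinct lines of a plane are on one and only one point.
```

Applying `dualize` twice returns the original sentence, which reflects the pairing of theorems that Veblen and Young call symmetry. Note that `dualize_proof` yields the dual proof itself, step by step, rather than a mere assurance that such a proof exists.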
This is not how Coxeter describes things though. He does not appeal to the transformation of the primary proof into a dual proof. Rather he appeals to knowledge that the proof can be dualized, and concludes the assertability of the dual theorem from this. The procedure that Coxeter describes is therefore not one which appears to transform a given primary proof into a corresponding dual proof. Rather, it is something seemingly intended to transform knowledge of a primary proof directly into knowledge of a dual theorem. What matters, he suggests, is not that knowledge of a primary proof is transformed into knowledge of a dual proof, but that knowledge of a primary proof assures us that a dual proof “could be written down mechanically” (loc. cit., emphasis added).

Coxeter’s conception of the epistemic expansion underwritten by duality thus seems a curious mixture of the scheme described in Dualization with a more abstract understanding of duality – an understanding which sees duality as proceeding from known existence of a primary proof (as distinct from knowledge of the proof itself) to known existence of a dual of it to, finally, assertion of the dual theorem. In other words, it appears to combine Dualization with the following:

Abstract Duality: Given a sentence that is known to have a primary proof, we may immediately assert both the provability of its dual and the dual itself.

Dualization and Abstract Duality seem to me to offer substantially different models for the type of epistemic expansion that may be supported by a duality phenomenon. These differences seem generally to be as significant as the combined differences between (i) knowing that a primary proof exists vs. knowing such a proof itself and (ii) knowing that a transformation of a primary to a dual proof exists vs. knowing and applying such a transformation to obtain a dual proof.
This notwithstanding, there may be settings where the differences are not so great, or where such differences as there are suggest a preference for Abstract Duality. Consider, for example, the possibility of a theorem of which the following conditions hold: (a) the complexity of even the simplest proof of a theorem (whether primary or secondary) is such as to make it largely unknowable to humans, but (b) the prospects for human knowledge of the existence of such proofs are significantly better. It may be that there are propositions with respect to which the best we can hope for is knowledge of the existence of a proof. For such propositions, applicable forms of duality might be limited to Abstract Duality.¹⁵

Generally speaking though (e.g., when simplest proofs are not so complex as to practically afford only abstract knowledge of their existence), the epistemic expansion underwritten by Dualization can be expected to be greater than that underwritten by Abstract Duality. The reason is that dualization can generally be expected to yield knowledge not only of dual theorems but of dual proofs, and such knowledge seems generally to go beyond mere knowledge of a dual theorem (and/or abstract knowledge of the existence of a proof of it).¹⁶ At the same time, though, the sum of the resources consumed by dualization may be greater than that required for knowledge of Abstract Duality.¹⁷ The mere fact, if it is a fact, that the overall epistemic product of Dualization should be greater than that of Abstract Duality does not in itself imply that Dualization generally sustains greater gains in efficiency than does Abstract Duality. To determine that it does, if (and where) it does, would require more intricate analysis than I will provide here. What seems clear, though, is that Dualization and Abstract Duality generally represent different plans for epistemic expansion. I note this because, in what follows, I will consider only dualizational plans and such efficiencies as they may or may not support.

¹⁵ Perhaps Appel’s and Haken’s computer-assisted proof of the four color theorem is a case in point. This at any rate if we suppose that the proof defies human comprehension in a relevant sense(s) in which the program verifying that there is a proof does not. In general, one of the questions raised by the Appel-Haken proof is how the knowledge of a theorem that is normally represented by knowing a proof of a theorem compares to the knowledge represented by knowing that there is a proof of it. This is an interesting and important question, though there is not room enough to explore it further here.
¹⁶ To mention an obvious additional component, it would generally include knowledge of the premises of the dual proof.
¹⁷ To dualize an entire proof might generally require more resources than merely to dualize the theorem proved. In addition, the demands of grasping a dual proof, and of recognizing it as a dual proof once it has been generated, might be considerably greater than those of knowing that a given primary proof can be dualized.

3 The Contentual Addition Model of Dualization

As I am conceiving of it, dualization is a process of proof development. It starts with a primary proof, treated as given, and transforms it into another proof. This transformation, it is commonly believed, engenders a body of knowledge which represents the epistemic product of the process of dualization. The (or at least a) common view is that this product effectively “doubles” the knowledge represented by the primary proof from which the dualization process proceeds. It does so, moreover, with little additional expenditure of investigative resources beyond those expended in the development of the primary proof mentioned.

A little more specifically, what is taken to be doubled is the extent or amount of knowledge that is represented by a supposed primary proof. To put it another way, the contents of the knowledge that is a direct product of dualizational development of a proof is presumed to be roughly equal in extent to the contents of the knowledge that is a similarly direct product of the primary proof from which dualization departs. These contents are also presumed to be “new”. A proof obtained by dualization is thus presumed to constitute an addition to the contents of the dualizer’s knowledge that is roughly equal in extent to the epistemic product of the primary proof with which it begins.

For present purposes, then, I am conceiving of the epistemic product of a proof or proof development as the total body of knowledge which it represents. I am thinking of the extent of such a product, moreover, as a measure of some type of combination of the propositional contents of the several constituent pieces of knowledge whose combination constitutes the product mentioned.

What I am calling the Contentual Addition Model of dualization is a view which sees it (i.e. dualization) as a means of increasing the extent of the knowledge that is represented by a primary proof development. Indeed, if Coxeter and others of like persuasion are correct, dualization provides for what is essentially a doubling of the extent of the knowledge provided by a primary proof development. It is supposed to provide this, moreover, with little additional expenditure of cognitive resources beyond that which is required for the primary proof development to whose product it is to be applied.
Both primary and dualizational developments of proofs of dual theorems, then, require expenditure of cognitive resources.¹⁸ Those who believe in the epistemic benefits of duality generally hold a view to the effect that the ratio of the extent of the epistemic product to the cost of a dualizational development of a proof of a dual theorem generally exceeds that of a primary development of a proof for it.

¹⁸ As with developed content, so too with developmental expenditures, I will assume them to be quantitatively measurable or estimatable. It may also be necessary to suppose that their supply is in certain respects limited.

4 Proofs & Proof Developments

We must be careful, though, to distinguish the knowledge represented by a proof (be it primary or a proof by dualization) per se, and the generally more extensive knowledge represented by a proof development. That this is so is due to the fact that, in the end, the choices presented by duality phenomena are choices between proof developments, and not merely choices between proofs per se.

Dualizational proof development has a mechanical character which may generally make it surer and more straightforward than primary development of a proof of a dual theorem. It may therefore seem to offer advantages of surety and simplicity relative to primary developments of proofs for dual theorems. Be this as it may, dualizational and primary developments also have associated epistemic products, and were the epistemic products of primary developments systematically (or at least predictably) greater than those of their dualizational counterparts, the surety and straightforwardness of dualizational development might not be sufficient to warrant an overall preference for dualizational over primary developments.

Even supposing that the costs of dualizational development are as a rule substantially lower than those of primary developments, primary development of proofs for dual theorems might predictably generate a greater epistemic product than does dualizational development. Were the expected increases in epistemic product great enough, primary development of proofs of dual theorems might prove to be rationally preferable to dualizational development even if its costs were generally higher, perhaps even much higher.

When considering the opportunities for epistemic economy that may be offered by duality, then, we ought to compare the relative epistemic products of primary and dualizational proof development for dual theorems and not merely their primary and secondary proofs. If duality is to represent an opportunity for epistemic economy, it must presumably be the case that the product-to-cost ratio of dualizational proof development for dual theorems is generally greater than that of their primary counterparts.
This, at any rate, is the key idea of what I will call the Contentual Addition Model of dualization. According to it, dualization adds to the knowledge represented by a primary proof development by adding to the extent or quantity of the contents known. On some versions of this view, in fact, it provides for what is essentially a doubling of the (extent of) such contents.

If the Contentual Addition Model is to be convincing, the idea of contentual addition on which it is based must be plausible. As we will now see, though, there are a number of questions which can be raised concerning such a view of contentual addition. These include:

(i) Are the axioms of projective geometry themselves propositions or, as per the common conception of projective geometry as a so-called abstract science, are they rather something more schematic (e.g. propositional functions of some type)?
(ii) (a) Are proofs in projective geometry finite sequences of judgments each element of which (including the conclusions proved) adds to the extent of the total epistemic product of the proof? Or (b) do proofs in projective geometry rather establish what is generally a logical relationship between the contents of a set of possibly non-judgmentive premises and the content of a non-judgmentive conclusion?

In addition to these, there are questions concerning the requirements the Contentual Addition Model places on our knowledge of models of the theories (e.g. projective geometry) to which it (the Contentual Addition Model) is applied. Here one thing seems sure – namely, that mere consistency is not in itself enough to sustain the Contentual Addition Model. It seems instead to require knowledge of a model – indeed, particular knowledge of a model, as distinct from mere or abstract knowledge of its existence. This type of knowledge seems to involve more than that which would at least standardly be provided by what Hilbert referred to as “direct” (i.e. proof-theoretic) proofs of consistency. This complicates the question of what form a consistency requirement ought best to take for theories where dualization is wanted as a means of efficiently extending our knowledge. More on this later.
5 The Contentual Addition Model & The Traditional Contentualist View of Proof

Let us conceive of the epistemic product of a proof development as the body of epistemic goods in whose production it figures significantly.19 Likewise, let’s conceive of the cost of a proof development as the total body of cognitive resources expended in carrying it out. These are rough characterizations, of course, but they suggest an at least correspondingly rough way to think of the epistemic efficiency of a proof development – namely, as the ratio of (the extent of) its epistemic product to (the extent of) its cost, however these might more exactly be conceived.20

19 More (though not perfectly) exactly, we may think of it as the total body of epistemic products that are (a) reasonably seen as being regular concomitants of the development in question and (b) reasonably taken to be generally valuable to pursuers of the proof development in question as pursuers of their epistemic-developmental type.

20 We could, and ultimately should, of course, consider not only matters of extent or quantity but also of quality. In a more fully developed account, we would want a ratio of the value of the product (i.e. a composite of its quantity and (relative) quality) of a proof development and the value of the investigative resources consumed by it (i.e. some composite of the scarcity of these resources and of the relative value of expending them in this way).
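The rough notion of efficiency just described can be written out as a ratio. The symbols are not the author's and are introduced here only for illustration: $P(D)$ is assumed to stand for the extent of the epistemic product of a proof development $D$, $C(D)$ for the extent of its cost, and $D_p$, $D_d$ for a primary and a dualizational development of a proof of the same dual theorem. On this reading, duality offers an epistemic economy only if the inequality on the right generally holds.

```latex
\[
E(D) = \frac{P(D)}{C(D)},
\qquad
E(D_d) > E(D_p)
\iff
\frac{P(D_d)}{C(D_d)} > \frac{P(D_p)}{C(D_p)} .
\]
```

As footnote 20 indicates, a fuller account would replace the extents $P$ and $C$ by values that also reflect quality and scarcity.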
We must therefore attempt to determine what are the typical epistemic products of primary and dualizational proof developments, and what are the expenditures typically needed for their respective executions. This may be subtler than it may at first appear to be. To begin with, we must decide certain larger questions concerning the overall character of the type of proof being developed. On one view, proofs are sequences of judgments the propositional contents of which are judged to stand in certain logical relationships to one another. The judgments mentioned are affirmations of propositional contents. Some of these are the contents of premises, one (typically) the content of the conclusion and some the contents of the judgments of logical relationships mentioned. For convenience, I’ll call this the Traditional Contentualist View (TCV) of proof. On the TCV, the epistemic product of a primary proof development can be roughly divided into four types of components – the premisory, logical, conclusory and auxiliary components, respectively. These are what the names suggest. The premisory component is thus knowledge of the developed proof’s premises. The logical subproduct is the knowledge that the development gives of the logical relationships between the several premises and sets of premises of the proof and between (sets of) its premises and its conclusion. The conclusory part of the product is the knowledge the proof development gives of the conclusion of the proof. The auxiliary subproduct of a proof development, finally, is comprised of such other items of knowledge as are non-incidentally engendered by it, even though they should not be part of that knowledge which in some sense constitutes the proof itself.21 As an example of auxiliary knowledge, consider the type of knowledge that may result from making a false start on developing a primary proof, learning from its failure, and eventually successfully developing a primary proof. 
Such knowledge seems to be a common (if not universal) element of primary proof development – common enough, at least, to deserve consideration as a possibility generally to be considered in realistic estimations of the comparative epistemic merits of primary and dualizational proof developments.

21 To avoid misunderstanding, let me note that the auxiliary items I have in mind are not what earlier writers have sometimes referred to as intervenient knowledge – that is, knowledge of such a type as equips a knower to better develop her knowledge in the future, even though it may not itself constitute a contentual addition to her present knowledge. Mathematics has traditionally been prized for such benefits. Cf. [1, Bk. 2, VIII, 2]; the anonymously written dedication to [27, xiv-xv]; and [4, p. 172]. What I am here terming auxiliary items are, by contrast, contentual additions to an agent’s present knowledge, and not (or at least not merely) enhanced capacity for future contentual extension of her knowledge.

A particular point to consider in this connection is that dualizational and primary proof development may not be equal in their auxiliary elements. In particular, dualization might not regularly offer the same rich potential for
development of auxiliary knowledge of the type just mentioned as primary proof development does. Auxiliary knowledge is of course just that, auxiliary. It is not knowledge that is properly a part of that knowledge which constitutes the proof developed in a proof development. This notwithstanding, it may nonetheless contribute to the realization of a more comprehensive set of ends which extend, but also continue (in an appropriate way), the ends of the given proof development(s). Auxiliary gains of the general type mentioned above (i.e. those involved in making and correcting false starts) do not of course come at no cost. The underlying trial-and-error procedures can be expected to consume cognitive resources, and it would not be surprising if the amounts consumed sometimes exceed those consumed by their more mechanical dualizational counterparts.22 This notwithstanding, the main point is this: in estimating the relative benefits of primary and dualizational developments, the possibility of auxiliary gains and costs ought generally to be considered. Dualization may generally offer advantages over primary development in terms of lowered cognitive costs. If, however, it were also to offer less by way of auxiliary gains, it might be that primary development of a proof for a dual theorem would be overall preferable to dualizational development.

This being so, the idea that dualization should generally offer what is essentially a doubling of the epistemic benefits of a parallel primary development seems dubious, or at least in need of special argument. Whatever the correct calculation of the alleged efficiency of a dualization might be, it will require knowing of the primary development(s) to which a dualizational alternative is to be compared (a) whether it offers auxiliary epistemic benefits, (b) what the extent of such benefits is, and (c) what their cumulative cognitive cost is.
This suggests a significantly more complicated calculation than that suggested by the simple doubling idea.
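The calculation gestured at here can be illustrated with a toy computation. The sketch below is not the author's model: the `Development` class, its fields, and every number are hypothetical, chosen only to show how auxiliary gains can defeat the simple doubling idea even when dualization is far cheaper.

```python
from dataclasses import dataclass


@dataclass
class Development:
    """A proof development with hypothetical, unitless magnitudes."""
    core_product: float       # premisory + logical + conclusory knowledge
    auxiliary_product: float  # e.g. what is learned from corrected false starts
    cost: float               # cognitive resources expended

    @property
    def product(self) -> float:
        # Total epistemic product: core plus auxiliary components.
        return self.core_product + self.auxiliary_product

    @property
    def efficiency(self) -> float:
        # Product-to-cost ratio of the development.
        return self.product / self.cost


# Illustrative numbers only (assumptions, not measurements).
primary = Development(core_product=10.0, auxiliary_product=6.0, cost=8.0)
dualized = Development(core_product=10.0, auxiliary_product=0.0, cost=2.0)

# Two routes to proofs of a theorem and of its dual.
both_primary_product = 2 * primary.product                      # 32.0
primary_plus_dual_product = primary.product + dualized.product  # 26.0
doubling_holds = primary_plus_dual_product == 2 * primary.product

print(primary.efficiency, dualized.efficiency)  # 2.0 5.0
print(both_primary_product, primary_plus_dual_product, doubling_holds)
```

With these made-up figures the dualizational development has the higher product-to-cost ratio, yet the combined product of a primary development plus its dualization falls short of twice the primary product. Which route is preferable then depends on how the extra auxiliary knowledge is valued against the extra cost.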
22 It does not seem inevitable that they should do so, though. Even mechanical procedures can be long and intricate.

6 Contentual Addition in an Abstract Setting

The TCV is likely the most accommodating view as regards the Contentual Addition Model of duality. As we have just seen, though, even it poses certain difficulties for the Contentual Addition Model. In addition, the TCV does not embody a conception of theory that has generally been taken to be the (or an) appropriate conception for that theory (or cluster of theories) with respect to which dualization has featured most prominently – namely, that pertaining to projective geometry. The
prevailing view since the late nineteenth and early twentieth centuries is that projective geometry should be conceived and formulated as a so-called “abstract science” (abstracte Wissenschaft).23 Veblen and Young gave the following summary statement of this view.

The starting point of any strictly logical treatment of geometry (and indeed of any branch of mathematics) must . . . be a set of undefined elements and relations, and a set of unproved propositions involving them; and from these all other propositions (theorems) are to be derived by the methods of formal logic. . . . [T]he undefined elements are to be regarded as mere symbols devoid of content, except as implied by the fundamental propositions. Since it is manifestly absurd to speak of a proposition involving these symbols as self-evident, the unproved propositions referred to above must be regarded as mere assumptions. [32, pp. 1-2, emphases in text]
They went on to remark, however, that though mathematical sciences are not generally intended as descriptions of particular domains identified in advance, their significance as bodies of knowledge nonetheless depends on their having application to (parts of) our larger experience.

We understand the term a mathematical science to mean any set of propositions arranged according to a sequence of logical deduction. From the point of view developed above such a science is purely abstract. If any concrete system of things may be regarded as satisfying the fundamental assumptions, this system is a concrete application or representation of the abstract science. The practical importance or triviality of such a science depends . . . on the importance or triviality of its possible applications. [Loc. cit., emphases in text]
This became the commonly accepted understanding of projective geometry in the early years of the twentieth century. A clear and clearly stated example was Whitehead, who described the axioms of projective geometry as “statements about relations between points” [34, p. 1]. He quickly added, though, that “they are not statements about particular relations between particular points” [loc. cit.]. Rather, the points and relations mentioned “are not otherwise specified than by the supposition that the axioms are true propositions when they are considered as referring to them” [loc. cit.]. In the abstract science of projective geometry, then,

[T]he points mentioned in the axioms are not a special determinate class of entities; but they are in fact any entities whatever, which happen to be inter-related in such a manner, that the axioms are true when they are considered as referring to those entities and their inter-relations. [34, p. 2]

23 This conception was given impetus by Pasch’s 1882 lectures on what he and others referred to as the “new” geometry ([22]), and also by various writings of Peano and his students. See also [35, p. 46] for a use of the phrase “abstracte Wissenschaft” that Blumenthal reported as having impressed Hilbert. Wiener argued that geometry ought to be constructed as an abstract science whose propositions are in some sense “independent” of the usual axioms of geometry.
Accordingly, the axioms of an abstract science are not truly propositions at all, but propositional functions. As such they are not strictly capable of being either true or false and are therefore not properly regarded as truths. This was the typical view of projective geometry after the turn of the twentieth century, though not every statement of it was so clear as Whitehead’s, or so explicit in its use of a distinction like that between propositions and propositional functions. The important point for our purposes is that on the abstract conception of projective geometry,24 the premises and conclusions of proofs are not propositions but propositional functions. As such they do not have contents to which other contents might be added, or which might be added to other contents, in the way presumed by the Contentual Addition Model.

In addition to being conceived as abstract sciences, it has been common in the late nineteenth and twentieth centuries to conceive of mathematical theories hypothetically. In an axiomatic theory thus conceived, proofs are not intended to establish the truth of their conclusions but only to reveal a relationship of logical implication between the axioms of the theory and them.

[A] mathematical demonstration, strictly speaking, is not concerned with the truth of the proposition at all; it is concerned merely with the logical relation that exists between the given proposition and certain other propositions called the axioms – in other words, all that a mathematical demonstration tells us is that if the axioms are true, then the theorem in question will also be true – provided, of course, that our deductive reasoning is sound. [17, §8, p. 159]
Such an understanding of proof does not seem to fit well with the Contentual Addition Model. The reason, briefly, is that, on this understanding, primary proofs are proofs of tautologies and dualized proofs of tautologies do not generally constitute proofs of new tautologies in any relevant way.25 The sentences proved and their proofs may be syntactically distinct, but this does not imply a genuine newness of the tautology proved or the proof by which it is proved.

24 This is a conception that has been taken to apply not only to projective geometry but to mathematics more generally. Cf. [3, p. 359].

25 Still less do they constitute new proofs of new tautologies.

26 This is intended to include, of course, theories that are dually axiomatized (i.e. whose axioms are given in dual pairs). For simplicity, in fact, we may assume that T is of this type.

To see why, consider a dualizable theory T in a language $L_T$.26 Let π be a primary proof (i.e. a proof developed by primary means) in T of a sentence θ of $L_T$. For convenience, let’s say that an axiom of T is “in” π if it is one of the elements of the finite sequence of sentences of $L_T$
that constitutes π. For simplicity, let us further stipulate that knowing π may be regarded as giving knowledge that the axioms of T that are in π logically imply the conclusion of π. Suppose, then, that we dualize π to obtain a proof d(π) of the sentence d(θ). The pivotal question is this: What knowledge may d(π) and its dualizational development reasonably be expected to add to the knowledge assumed to be given by π and its primary development?27 Whatever the details, the core of the answer seems to be this: “d(π) and its dualizational development cannot be expected to add anything new of comparable extent to the knowledge provided by π and its development.” The reason is that substituting terms into a tautology to make it a new sentence (i.e. a new instance of an already recognized tautologous form) is not in itself enough to make knowledge either of it or of its tautologousness new knowledge. Thus, for example, the knowledge represented by developing a primary proof of, say,

1. Either [any two distinct points of a plane are on one and only one line] or it is not the case that [[any two distinct points of a plane are on one and only one line]].

is not doubled by making the substitutions necessary to dualize that proof into a proof of

1D. Either {any two distinct lines of a plane are on one and only one point} or it is not the case that {{any two distinct lines of a plane are on one and only one point}}.
Considered as a tautology, in fact, 1D is not new when compared to 1. Neither is the dualized proof of 1 a new proof of a tautology as compared with the assumed primary proof of 1. The reason, generally speaking, is that dualization of tautologies preserves their recognizable logical forms. That is, it preserves the forms that make the dualized sentences the tautologies they are. This being so, and knowledge of tautologousness being mainly a matter of (a) knowing of a sentence that it has a certain logical form F and (b) knowing that all sentences having that form are tautologies, the dual of a tautology will not be a genuinely “new” tautology. It will be only a syntactically distinct instance of an already recognized tautologous form. What would make it a new tautology would be its having a new tautologous form. Dualization, though, does not generally produce sentences having new logical forms as compared to their primary counterparts. Rather, it produces only sentences which, despite their syntactical distinctness from their primary counterparts, are nonetheless of recognizedly the same logical form.

27 Here, as expected, d(θ) is the dual of θ, and d(π) the proof which results from dualizing each element of π.
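The claim that dualization preserves recognizable logical form can be checked mechanically on the example of sentences 1 and 1D. The following is a minimal sketch under simplifying assumptions: sentences are plain strings, atomic clauses are marked with single square brackets, and duality is modelled as a word-level swap of "point(s)" and "line(s)". The helper names `dualize` and `logical_form` are hypothetical and are not drawn from the text.

```python
import re

# Word-level swap table for the plane duality "point" <-> "line".
SWAP = {"point": "line", "points": "lines", "line": "point", "lines": "points"}


def dualize(sentence: str) -> str:
    """Swap 'point(s)' and 'line(s)' wherever they occur as whole words."""
    return re.sub(r"\b(points?|lines?)\b", lambda m: SWAP[m.group(1)], sentence)


def logical_form(sentence: str) -> str:
    """Replace each distinct bracketed clause with a schematic letter."""
    letters = {}

    def to_letter(match):
        clause = match.group(1)
        if clause not in letters:
            # Supports up to four distinct clauses, enough for this sketch.
            letters[clause] = "PQRS"[len(letters)]
        return "[" + letters[clause] + "]"

    return re.sub(r"\[([^\[\]]+)\]", to_letter, sentence)


ATOM = "any two distinct points of a plane are on one and only one line"
S1 = f"Either [{ATOM}] or it is not the case that [{ATOM}]."
S1D = dualize(S1)

print(S1D)
print(logical_form(S1))
print(logical_form(S1) == logical_form(S1D))
```

The two sentences differ as strings, but both reduce to the same schematic form, "Either [P] or it is not the case that [P].", which is the sense in which 1D is only a new instance of an already recognized tautologous form.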
Much the same is true of proofs of tautologies and their dualizations. If the displayed logical forms of a finite family of sentences $\alpha_1, \dots, \alpha_n$ are perceived to stand in certain formal-logical relationships to one another, the displayed logical forms of $d(\alpha_1), \dots, d(\alpha_n)$, generally speaking, can be expected to perceivedly stand in the same relationships.28

A more detailed description of the general situation may make things clearer. Consider, then, the general scenario in which a primary development of a proof π of θ is based on the following knowledge:

(A) (i) knowledge of a certain syntactically displayable logical form $F_c$, and (ii) knowledge that $F_c$ is displayed by θ,
(B) (i) knowledge of certain syntactically displayable logical forms $F_{a_1}, \dots, F_{a_n}$ and (ii) knowledge that these are severally the forms displayed by the axioms in π and, finally,
(C) knowledge that a set of sentences of $L_T$ which severally have the forms $F_{a_1}, \dots, F_{a_n}$ logically imply any sentence that has the form $F_c$.

Consider further a dualizing prover who knows of her dualizing process that for all sentences σ and all finite sets of sentences Σ of $L_T$, Σ logically implies σ only if d(Σ) logically implies d(σ).29 For such a prover, dualization would seem not to double the knowledge produced by a primary development of π.30 In particular, it could not reasonably be expected to yield new knowledge to match the knowledge described in (C). The knowledge corresponding to the (C) component under dualization would not be new knowledge. Rather, as mentioned earlier, it would be essentially the same knowledge as that represented by the (C) component of the primary proof development. Neither, for that matter, would we expect the (A)(i) and (B)(i) components to be new. What might more reasonably be seen as new are the (A)(ii) and (B)(ii) components – the knowledge that $F_c$ is a form of d(θ) and that $F_{a_1}, \dots, F_{a_n}$ are forms of the axioms $d(a_1), \dots, d(a_n)$ that are in d(π).
28 I say ‘can be expected to’ rather than ‘will’ because it is possible that, as a matter of human perception, dualization may alter our perception of even logical form. Application of the Contentual Addition Model of duality might thus require premises concerning our perception of logical forms and what stands to influence it.

29 Here, of course, d(σ) is the dual of σ, Σ is a class of sentences of $L_T$, and the elements of d(Σ) are the duals of the elements of Σ.

30 It seems quite plausible that a dualizing prover should have the knowledge mentioned. The usual dualization processes for projective geometry clearly preserve standard logical forms and whether or not Σ implies σ is at least normally determined by the standard logical forms of σ and the sentences in Σ. My use of “normally” here is intended to reflect the fact that, like other norms, norms concerning what is central to rational judgments of implication are at least in principle subject to change.

Such knowledge being new would not, however, be enough to warrant a claim that dualization of π effectively doubles the knowledge yielded by a primary development of π. Rather, for such a claim to be plausible, the (C)
component of a dual proof would also have to constitute new knowledge of comparable extent to the (C) component of its primary counterpart, and this seems generally not to hold. The Contentual Addition Model of dualization thus seems incapable of sustaining the Dualization Argument. Specifically, it seems incapable of sustaining the Comparative Epistemic Gain premise when projective geometry is viewed either as an abstract science or as a hypothetical science.31
31 Historically, the hypothetical and abstract conceptions seem often to have been combined.

7 Non-Trivial Axiom Systems

The above argument having been given, it is important to note that both Veblen and Young [32] (cf. p. 2) and Whitehead [34] (cf. p. 2) expressed general concerns regarding what they termed the possible “triviality” of abstract axiomatic systems. They described this triviality as consisting in the absence of “concrete system[s] of things which may be regarded as satisfying” [32] the axioms of an abstract axiomatic system. The standard usage of ‘concrete’ in mathematics in the late nineteenth and early twentieth centuries was intended to express a contrast with the meaning of ‘abstract’ in ‘abstract science’. Accordingly, a concrete model of an abstract science was taken to be a system of propositions – as distinct from propositional functions or proposition-schemata – obtainable from an abstract science by substituting particular propositions for its propositiono-schematic axioms.

Some, however, have suggested that non-triviality requires more than mere existence of a model. They have suggested such additional requirements as the “practical importance” (cf. [17, §16]; [32, p. 2]) of a nontrivializing interpretation. Huntington, for example, described possession of an “interesting concrete interpretation” [16, p. 4, fn. ∗] as a condition of an abstract science’s significance or study-worthiness. It has thus been fairly common to see the existence of a model as necessary though not in itself sufficient for non-triviality.

The relevant meanings of such terms as “concrete”, “interesting” and “practical importance” are not altogether clear of course.32 Let us suppose, however, at least for the sake of argument, that a “concrete” and/or “interesting” and/or “practically important” model can always be found for a set of projective axioms.33 Under such an assumption, is the Comparative Epistemic Gain premise plausible? Not clearly so, it seems. Even given an interpretation of an abstract theory under which its axioms may be evident, there is no guarantee that dualization will essentially double the extent of one’s knowledge, or otherwise substantially increase it.

For one thing, the details of the particular dualization may matter. Suppose, for example, that the system in question is one in which each axiom has a dual that is also an axiom of the system. Some, but not all, dualizations are of this axiomatically dual type. An interpretation of an axiomatically dual system that makes its axioms evident, interesting or practically important will generally confer as much of these qualities on dual proofs as it confers on their primary counterparts. For dualizable systems that are not axiomatically dual, there seems to be no similar guarantee. Being evident, interesting or practically important are not properties that dualization may generally be expected to preserve. Finding an evident, interesting or practically important interpretation of a non-axiomatically dual dualizable system cannot therefore generally be expected to sustain epistemic doubling (or other substantial epistemic increase) under dualization.

In addition to this, there is the question of what proportion of the knowledge represented by a primary proof is constituted by knowledge of logical implication. If the proportion is relatively high, the factor by which dualization can be expected to increase or extend the knowledge represented by a primary proof will be relatively low. The reason, as stated in section 6, is that the knowledge of logical implication represented by a dual proof cannot generally be counted on to be genuinely new knowledge.

If the above reasoning is correct, requiring of a dualizable system that it have not merely a model, but an evident, interesting or practically important model does not in itself do much to improve the plausibility of the Comparative Epistemic Gain premise.

32 Nor did mathematical writers do much to clarify their meanings. The philosopher Josiah Royce offered a more substantive description making use of a contrast he found compelling between mathematical science and games such as chess. He wrote (cf. [26, pp. 451-452]): “The exactly stated ideal hypotheses whose consequences the mathematician develops must possess, as is sometimes said, sufficient intrinsic importance to be worthy of scientific treatment. They must not be trivial hypotheses. The mathematician is not, like the solver of chess problems, merely displaying his skill in dealing with the arbitrary fictions of an ideal game. His truth is, indeed, ideal; his world is, indeed, treated by his science as if this world were the creation of his postulates, a ‘freie Schöpfung.’ But he does not thus create for mere sport. On the contrary, he reports a significant order of truth. As a fact, the ideal systems of the pure mathematician are customarily defined with an obvious, even though often highly abstract and remote, relation to the structure of our ordinary empirical world. Thus the various algebras which have been actually developed have, in the main, definite relations to the structure of the space world of our physical experience. The different systems of ideal geometry, even in all their ideality, still cluster, so to speak, about the suggestions which our daily experience of space and of matter give us.”

33 Alternatively, let us suppose the Dualization Argument to be restricted to systems for which this is true.
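The point in section 7 about the proportion of implication knowledge can be put as a rough bound. The notation is an assumption made here for illustration and does not appear in the text: $K$ is the extent of the knowledge represented by a primary proof, $\lambda$ is the fraction of $K$ that consists of knowledge of logical implication, and dualization is supposed to contribute at most a fresh copy of the remaining, non-implicational part.

```latex
\[
K_{\mathrm{after}} \;\le\; K + (1-\lambda)\,K \;=\; (2-\lambda)\,K ,
\qquad 0 \le \lambda \le 1 .
\]
```

On these assumptions the factor of increase is at most $2-\lambda$: it reaches doubling only when $\lambda = 0$ and shrinks toward $1$ (no increase) as the share of implication knowledge grows.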
8 Conclusion If the preceding discussion is correct, the question of whether and to what extent dualization may be capable of extending our knowledge in an especially efficient way is complex and difficult. Perhaps the central concern is how rightly to conceive quantity or extent of knowledge. If extent of knowledge is understood contentually (i.e. as consisting in the extent of its contents), we need better ways of estimating, comparing and measuring it. How significant dualization may be as a means of increasing the contentual extent of our knowledge also depends on questions concerning the nature of the theories being dualized. In particular, it depends on whether theories are taken to be particular 34 (i.e., their axioms are taken to be particular propositions) or abstract (i.e. their axioms are taken to be propositional functions). It depends as well on questions concerning the epistemic aim of proof within a dualized theory. In particular, it depends on whether the aim of proof is taken to be (i) the justification of the theorems proved, or (ii) the discovery of relationships of logical implication which obtain between the axioms of a theory and other propositions expressible in its language. Among questions we have not thus far properly attended to are the demands the Contentual Addition Model places on dualized theories and how these demands compare to the traditional requirements placed on axiomatic theories. This goes particularly for the traditional consistency requirement. For Contentual Addition to be possible, more seems to be required than mere consistency. The dualized theory must at the very least have a model, and it seems, in fact, that it must have a model that is known in an appropriate way to be a model. This last remark calls for elaboration. It was not uncommon for earlier writers to identify consistency with model-existence. Such identification was perhaps particularly common when it came to matters of practical proof. 
The (or at least a) common belief was that, practically speaking, the only way to prove the consistency of a theory was to provide a model for it.35

34 Or ‘concrete’ to employ a more commonly used term.

35 Cf. [12, §95]; [13, §143]; [6, p. 530]; [7, p. 629]; [8, p. 77]; [32, p. 3]; [37, pp. 43-44]; [21, pp. 10-11, 14].

Definitionally, the two were often enough distinguished, even if the significance of their differences may not generally have been clear. It was also common practice to characterize the consistency of a set of axioms in terms of what was logically deducible from it. Hilbert thus described the consistency of the five groups of axioms in his Geometrie as consisting
in the fact that they “do not contradict one another, i.e. it is not possible, through logical inference, to derive (abzuleiten) from them a fact (Thatsache) which contradicts any of the axioms” [15, p. 19].36,37 Despite common acceptance of a difference in meaning between consistency and model-existence, then, there was no similarly common belief in their practical difference. There was virtually universal agreement that model-existence implies consistency and therefore that proofs of model-existence are in effect proofs of consistency. Some, in fact, described this as a basic dictate of logic. Whitehead, for example, saw it as following from the Law of Contradiction.

A set of axioms must be consistent, that is to say, it must not be possible to deduce the contradictory of any axiom from the other axioms. According to the logical ‘Law of Contradiction’, a set of entities cannot satisfy inconsistent axioms. Thus the existence theorem for a set of axioms proves their consistency. Seemingly this is the only possible method of proof of consistency. [34, p. 3]
That model-existence implies consistency was thus a widely held belief.38 So, too, as noted a couple of paragraphs above, was the view expressed in the last sentence of Whitehead’s remark. There seems to have been no comparably widespread belief that consistency implies model-existence. These were, of course, the decades just before Gödel’s proof of his completeness theorem – the major result of recent times (and probably ever) addressing the connection between deducibility concepts of consistency and model-existence. Still less common, perhaps, was the idea that there might be a practical way to prove consistency that didn’t involve construction of a model. Here, of course, Hilbert was an important exception.39

36 For similar characterizations of consistency in terms of what is deducible from a set of propositions see [34, p. 3]; [18, p. 13]; [21, pp. 10-11]. There were others who, at least sometimes, defined consistency not as non-contradictoriness of deducible consequences, but as possession of a model (cf. [32, p. 3]; [17, p. 165]).

37 In one respect this characterization of consistency is curious. It identifies contradiction with the axioms as that which is to be avoided. In truth, though, we should want to avoid not only contradiction between a theorem and an axiom, but contradiction between any two theorems. This same curiously narrow characterization of consistency was adopted by other writers too.

38 It was not universal, however. The American philosopher Paul Weiss asked: “Is it possible that the only way we can determine whether a set is consistent is by seeing all the postulates actually exemplified in some one object?” [33, p. 468]. “If so,” he answered, “we must arbitrarily assume that the object is self-consistent, so that the proof of consistency must ultimately rest on a dogma.” [ibid.].

39 Whitehead alluded to another class of exceptions as well.
“Some mathematicians solve the difficult problem of existence theorems by assuming the converse relation between existence theorems and consistency, namely that, if a set of axioms are consistent, there exists a set of entities satisfying them. Then consistency can only be guaranteed by a direct appeal to intuition, and by the fact that no contradiction has hitherto been deduced from the axioms. Such a procedure in the deduction of existence theorems seems to be founded on a rash reliance on a particular philosophical doctrine respecting the creative activity of the mind.” [34, pp. 3-4].

Even had Gödel’s completeness theorem been known at the time, it would not have suggested an alternative to model-construction as a practical means of proving consistency. For first-order axioms, it showed that consistency implies model-existence. It did not, however, provide or suggest an alternative to model-construction as a means of proving consistency. Neither, for that matter, does it provide or suggest actual models for given consistent sets of first-order axioms. Consistency and model-existence are thus theoretically and practically distinct and they were generally taken to be so by nineteenth and twentieth century foundational writers who prized duality. It is to me, then, a little surprising that these same writers did not more carefully distinguish consistency and model-existence as constraints on the adequacy of geometrical axiom-systems. The Contentual Addition Model of duality requires not only that a model of the dualizable axioms exist, but that it be known. Those, therefore, who take dualization to be among the cardinal virtues of projective geometry, and who see Contentual Addition (or something like it) as the basis for this virtue, cannot settle for consistency as the basic constraint on axiom systems. They require not merely consistency or even existence of a model. Rather, at the very least, they require the existence of a known model. A direct proof of consistency of the type envisioned by Hilbert for the axioms of arithmetic would not give them this. Consequently, it would not give them what they need – namely, a means of sustaining Contentual Addition by dualization. Only construction of a model would suffice, and for theoretical reasons as well as for such practical reasons as there may be.

Acknowledgments

It is a pleasure to acknowledge the generous financial support of the TransCoop Programme of the Alexander von Humboldt Stiftung and the Agence nationale de la recherche (ANR) of France under their chaires d’excellence programme. It is also a pleasure to thank the members of the Imaginary and Ideal Elements and Limit Concepts in Mathematics TransCoop project, the Ideals of Proof (IP) research group, the philosophy department and logic group at the University of Notre Dame, the HPS and SPHERE groups at the Université de Paris 7–Diderot, the Philosophy Department and the Archives Henri Poincaré at the Université de Lorraine, the philosophy department and logic and HPS groups at the École Normale Supérieure and the groups attached to the past and current chairs in the
philosophy of language and epistemology at the Collège de France. These groups supported my students and me in a variety of ways and provided a welcoming and stimulating environment. Among individuals, special thanks are due to Paddy Blanchette, Henk Bos, James Cargile, Martin Carrier, Marcus Giaquinto, Jeremy Gray, Tim McCarthy, Colin McLarty, Michael Potter, Greg Restall, Peter Schröder-Heister, Göran Sundholm and Jean-Jacques Szczeciniarz for useful discussion of various points.
References

[1] F. Bacon. The two bookes of Francis Bacon. Of the proficience and advancement of learning, divine and humane. Printed for Henrie Tomes, London, 1605.
[2] J. Becker and D. Gottlieb. A History of Duality in Algebraic Topology. In I. M. James (ed.), History of Topology, pp. 725–745. North-Holland, Amsterdam, 1999.
[3] P. Bernays. Hilbert, David. In Borchert (ed.), Encyclopedia of Philosophy, volume IV. MacMillan Reference USA, Detroit, 2006.
[4] B. Bolzano. Preface to Considerations on some Objects of Elementary Geometry. Reprinted in [11], volume 1. Page references are to this reprinting.
[5] L. E. J. Brouwer. Über die Bedeutung des Satzes vom ausgeschlossenen Dritten in der Mathematik, insbesondere in der Funktionentheorie. Journal für die reine und angewandte Mathematik, 154:1–7, 1925. English translation in [30], 334–345. Page references are to this translation.
[6] H. C. Brown. Review of [24]. Journal of Philosophy, Psychology and Scientific Methods, 3:530–531, 1906.
[7] H. C. Brown. Infinity and the Generalization of the Concept of Number. Journal of Philosophy, Psychology and Scientific Methods, 5:628–634, 1908.
[8] J. Coolidge. The Elements of Non-Euclidean Geometry. Clarendon Press, Oxford, 1909.
[9] H. S. M. Coxeter. Introduction to Geometry. Wiley, New York, 2nd edition, 1969.
[10] L. Dowling. Projective Geometry. McGraw-Hill Book Co., New York, 1917.
[11] W. Ewald. From Kant to Hilbert: A Source Book in the Foundations of Mathematics, two volumes. Oxford University Press, Oxford, 1996.
[12] G. Frege. Die Grundlagen der Arithmetik. Eine logisch mathematische Untersuchung über den Begriff der Zahl. W. Koebner, Breslau, 1884.
[13] G. Frege. Grundgesetze der Arithmetik, Begriffsschriftlich abgeleitet II. H. Pohle, Jena, 1903.
[14] J. D. Gergonne. Géométrie de Situation. Annales de Mathématiques, 18:149–154, 1827–28.
[15] D. Hilbert. Grundlagen der Geometrie. Teubner, Leipzig, 1899.
[16] E. Huntington. The fundamental laws of addition and multiplication in elementary algebra. Annals of Mathematics, 8:1–44, 1906.
[17] E. Huntington. The fundamental propositions of algebra. In J. W. A. Young (ed.), Monographs on topics of modern mathematics relevant to the elementary field, pp. 151–210. Longmans, Green, and Co., London, 1911.
[18] E. Huntington. The Continuum and other types of serial order. Harvard University Press, Cambridge, MA, 2nd edition, 1917.
[19] G. Ling, G. Wentworth, and D. Smith. Elements of Projective Geometry. Ginn & Co., New York, 1922.
[20] G. Mathews. Projective Geometry. Longmans, Green & Co., New York, 1914.
[21] C. O’Hara and D. Ward. Introduction to Projective Geometry. Oxford University Press, Oxford, 1937.
[22] M. Pasch. Vorlesungen über Neuere Geometrie. Teubner, Leipzig, 1882.
[23] A. Pickford. Elementary projective geometry. Cambridge University Press, Cambridge, 1909.
[24] M. Pieri. Sur la compatibilité des axiomes de l’arithmétique. Revue de Métaphysique et de Morale, 14:196–207, 1906.
[25] T. Reye. Lectures on the Geometry of Position, Part I. The Macmillan Co., New York, 1898.
[26] J. Royce. The Sciences of the Ideal. Science, 20(510):449–462, 1904.
[27] N. Saunderson. The elements of algebra. Cambridge University Press, Cambridge, 1740.
[28] E. H. Smart. A First Course in Projective Geometry. Macmillan & Co., London, 1913.
[29] E. Specker. Dualität. Dialectica, 12:451–465, 1958.
[30] J. van Heijenoort. From Frege to Gödel: A source book in mathematical logic 1879–1931. Harvard University Press, Cambridge, 1967.
[31] O. Veblen and J. W. Young. A set of Assumptions for Projective Geometry. American Journal of Mathematics, 30:347–380, 1908.
[32] O. Veblen and J. W. Young. Projective Geometry, volume I. Ginn & Co., Boston, 1910.
[33] P. Weiss. The Nature of Systems. II. The Monist, 39(3):440–472, 1929.
[34] A. N. Whitehead. The Axioms of Projective Geometry, Cambridge Tracts in Mathematics and Mathematical Physics. Cambridge University Press, London, 1906.
[35] H. Wiener. Über Grundlagen und Aufbau der Geometrie. Jahresbericht der Deutschen Mathematiker-Vereinigung, 1:45–48, 1892.
[36] J. W. Young. Projective Geometry, Mathematical Association of America. Open Court Publishing Co., Chicago, 1930.
[37] J. W. Young, W. W. Denton, and U. G. Mitchell. Lectures on Fundamental Concepts of Algebra and Geometry. Macmillan Co., New York, 1911.
Frege on Quantities and Real Numbers in Consideration of the Theories of Cantor, Russell and Others1

Matthias Schirn
The core of this essay is a detailed account of Frege’s theory of real numbers in the second volume of his opus magnum Grundgesetze der Arithmetik [21]. I begin with introductory comments on Frege’s standpoint vis-à-vis the conception of analysis by some of his contemporaries and remarks about his platonism. In section 2, I first take a look at Frege’s theory of quantity in his Habilitationsschrift Rechnungsmethoden, die sich auf eine Erweiterung des Größenbegriffes gründen [16]. I deal then with some critical observations that Frege makes in Die Grundlagen der Arithmetik [18] with respect to Hankel and Newton’s treatment of the concept of quantity and make a few remarks on Frege’s review of H. Cohen’s book Das Prinzip der Infinitesimal-Methode und seine Geschichte [7], in which Cohen comments on the Kantian distinction between extensive and intensive magnitudes. In section 3, I describe the essential features of Cantor’s theory of irrational numbers and examine the main points of the critique deployed by Frege. In section 4, I take a look at Russell’s theory of real numbers in [39] as well as in [56]. Section 5 (5.1–5.4) is devoted to a detailed reconstruction of Frege’s conception of the notion of quantity and his theory of real numbers with an eye to both the sketchy informal and the meticulous formal account as far as it goes in [21]. Special emphasis is placed on the considerations that led him to set up the definitions of the concepts positival class and positive class and the problem of proving the mutual independence of the clauses that make up the definition of the former concept which is only preparatory to the latter. Due to Russell’s paradox, Frege’s logical foundation of analysis remained a fragment. Section 5.2 is an interlude in which I deal briefly with the concept of quantity in the work of Euclid, Aristotle and Euler. 
In the final section 6, I give a brief account of von Kutschera’s proposal of how Frege might have carried on with the logical construction of analysis in a projected third volume of Grundgesetze, had he not been shocked by Russell’s paradox.
1 I dedicate this essay to Christian Thiel on the occasion of his 75th birthday.
1 Introduction: the targets of Frege’s critique in Grundgesetze (vol. II) and a question concerning his Platonism

The method of introducing the real numbers proposed by Frege in the second volume of Grundgesetze der Arithmetik [21] lies between the traditional geometrical approach and the theories developed by Cantor, Weierstraß, and Dedekind. The latter purport to be purely arithmetical – hence the label “the arithmetization of analysis”. From the geometrical approach Frege adopts the characterization of the real numbers as ratios of quantities or, as he also says, as measurement numbers (Maßzahlen). And taking up a key idea of his fellow mathematicians, he detaches the real numbers from all special kinds or types of quantity. The rationale for doing this, so we are told, is that the application of the real numbers is not restricted to any special types of quantity, but rather relates to the domain of the measurable, which embraces all types of quantity whatsoever. On the face of it, this seems to be largely in the spirit of Frege’s logicism which he had laid out informally, and by paying much attention to its philosophical underpinnings, in Die Grundlagen der Arithmetik [18]. In this splendid work as well as in the short essay ‘Formale Theorien der Arithmetik’ [19], Frege argued with great cogency that his logicist project rests crucially on the insight that, if arithmetic is to be regarded as a branch of logic, both the application of the numbers and the laws governing them must exhibit the most salient feature of logic, which is utmost generality. At that time (and certainly for several years to come), Frege was deeply convinced that arithmetic meets the logicist requirement of unrestricted generality and, moreover, enjoys the likewise distinguished status of possessing unmatched objectivity: “There is nothing more objective than the laws of arithmetic” [18, § 105].
Frege’s way of discussing the foundations of analysis in [21] bears a striking methodological similarity to his treatment of number theory in [18]. Neither in [18] nor in [21] does he begin by propounding his own theory, but rather by launching a vigorous attack on rival theories. In [21], the main targets are Heine and Thomae’s radical version of formalism (Frege calls it game formalism), Cantor’s theory of real numbers as well as Weierstraß’s view of the natural numbers. Any reader of this volume who is expecting a thorough examination of the theory of irrational numbers “of such a distinguished mathematician as Weierstraß” [21, § 148] is bound to be disappointed. Frege takes the easy route. He basically confines himself to making critical remarks, spiced with plenty of irony, about Weierstraß’s treatment of the natural numbers2 and eventually tries to convince us that, due to its shaky foundations, Weierstraß’s theory of irrational numbers need not be examined in greater detail. Likewise, Frege pays comparatively little attention to Dedekind’s theory of real numbers, although he praises his sharp distinction between sign and reference [Bedeutung] and the view, disavowed by the formalists, that numbers are what numerical signs refer to and not those signs themselves.3 However, endorsing arithmetical platonism himself, as I think that he does, Frege naturally finds fault with Dedekind’s creation of new mathematical objects by abstraction.4

2 All translations from the work of Frege, Hankel and Euler into English are my own. In a very few cases of Frege’s work, I have only modified and corrected the existing translations. As far as I can tell, most of the passages that I translated from [21] have so far not been published in English translation. The translation of [20, 21] by P. Ebert and M. Rossberg is forthcoming from Oxford University Press.
3 Peter Simons, in his stimulating essay ‘Frege’s Theory of Real Numbers’ [52, p. 359], contends that Frege “brings perceptive criticisms of then current theories of reals, among others those of Cantor, Dedekind and Weierstrass, which are not without contemporary relevance”. However, as far as these three mathematicians are concerned, this claim holds at most for Cantor’s theory, but even in that case it must be relativized; see my assessment of Frege’s critique of Cantor’s theory of the reals in section 3.
4 See [9, § 4] and [10, § 6]. Abstraction à la Dedekind (which he characteristically weds to structure) differs significantly from Fregean abstraction. The latter consists in the transformation of a given equivalence relation into an identity between abstract objects. Note that Frege does not speak of abstract objects on his own account, but rather of non-real objects (nicht-wirklichen Gegenständen) when he deals with those objects which we would call abstract today, for example, the axis of the Earth, the equator, the centre of mass of the solar system (cf. [18, p. 35]).

In the current literature, it is not undisputed that Frege was a fully fledged platonist. Yet putting his platonism in the right perspective is, to my mind, of considerable importance for appropriately assessing his overall philosophy of arithmetic, including his foundational approach to analysis. Thus, some clarifying words about Frege’s platonism may be in order here. I hold that his logicism goes hand in hand with his endorsement of an arithmetical version of ontological platonism. Frege is convinced that all numbers are logical objects which exist independently of human minds. In particular, his ontological platonism in the period 1893–1902 is meant to apply to logical objects of a fundamental and irreducible kind, namely to courses-of-values of functions. According to his logicist manifesto, all numbers are to be identified with logical objects of this prototype. The common view that Frege was a realist with respect to logical objects has been challenged unsuccessfully, I think, by several Frege scholars such as Sluga, Currie, Resnik, and others. The arguments which they advance I take to be far-fetched, or awkward, or both. Perhaps the clearest expression of Frege’s arithmetical platonism can be encountered in the context of his repudiation of Hankel’s formal arithmetic in [18]. There he says that even the mathematician cannot create something arbitrarily, any more than the geographer; “he too can only discover what is there and name
it” [18, pp. 107 f.]. In the Preface to [20] (p. XIII), Frege argues exactly in the same vein. Elsewhere [50] I have pointed out that in a passage in [21], where Frege considers the issue of how we have cognitive access to the objects of arithmetic, he does not give a clear-cut answer to the question whether the step of logical abstraction from right to left in Axiom V could – reasonably and acceptably – be called a creation. At the same time, I deliberately refrained from speculating about the reason(s) that might have motivated Frege (a) to ask this question at all and (b) to desist from giving a straightforward answer to it. Nonetheless, three points seem clear to me. First, there is ample evidence that in [21] Frege’s platonism did not undergo any significant change, in spite of (a). Second, by his own lights, Frege should never have conceded that discussing the question of whether his introduction of courses-of-values by way of logical abstraction via Axiom V can be called a creation, may easily degenerate to a quarrel over words. To my mind, he should have avoided raising this issue at all in § 146 of [21] – in fact, there was no recognizable need to do so – instead of backing himself into a corner by leaving it undecided. Plainly, once the issue was brought up, Frege should have given a definite answer, not an evasive one. In the light of the available evidence about the status and the role that he assigns to Axiom V in his logical system, a coherent answer would have been one along these lines: Axiom V is designed to function as the appropriate means of coming into epistemic contact with courses-of-values, of grasping them; it is not intended to call them into being. Like any explicit definition that meets Pascal’s classical requirements of eliminability and non-creativity (cf. [38, pp. 356 f.]) it would be powerless to achieve this anyway. 
It is true that, unlike proper definitions which are immediately turned into epistemically trivial assertoric sentences once the definiendum has been defined, Frege considers axioms to contain real knowledge.5 Nonetheless, nowhere does he unambiguously claim that they have any creative potential. When he raises the epistemological key question “How do we grasp logical objects?”, he presupposes that they exist prior to their apprehension. The answer to the question, though not exactly in Frege’s words, is of course as follows: they are grasped by means of logical abstraction.

5 He does not justify this view. It is, for example, difficult to see how Frege could convince us that the axiom ⊢ a → a (cf. [20, § 18]) possesses genuine epistemic value; see [48] for a discussion of this and related issues.

Third, in the light of his undisguised fondness for ontological platonism, Frege would have been well-advised to refrain from conceding that one might perhaps call the procedure of logical abstraction a creation, if creation is meant in a rigid sense, implying a barrier to its executability. For even if a creation of logical objects (if it is possible at all) were to proceed in a regulated, non-arbitrary fashion and thus within sharp boundaries,
it would nevertheless be a creation and as such clash with the platonist aspirations that Frege manifests in several places of his writings, not least in Grundgesetze. In other words: prohibiting or condemning any arbitrary and boundless creation of mathematical objects and, in the same breath, licensing in certain cases a creation of such objects, if the mode of carrying it out and its admissibility are established once and for all (cf. [21, p. 149]), marks a position that Frege could not consistently maintain, quite apart from the fact that he fails to spell out what “admissibility” is to mean here precisely and how it could be established. In short, he could not have accepted any creation of mathematical or logical objects, no matter how it were performed. (Concerning Frege’s attack on the practice of bringing numbers into existence by means of definition, which was apparently widespread among his fellow mathematicians, see also [51, pp. 156 ff.].) In my view, another issue raised in § 147 would require clarification. When in § 147 Frege asks whether “our procedure can be called a creation” and responds by saying that this question may easily degenerate to a quarrel over words, it is not absolutely clear what he means by “our procedure”. Considering the entire context of his remarks, I presume that he appeals to the step of logical abstraction inherent in Axiom V, that is, to the transformation of the generality of an equality into a course-of-values equality. Yet I do not wish to vouch for this option. (Frege explains that in carrying out this transformation we acknowledge something in common to the two functions – namely their course-of-values; this is how he characterizes the move of abstraction in Axiom V in [21, § 146]; cf. also [24, p. 198]). 
It is true that in the third passage and especially in the first half of the fourth and concluding passage of § 147 Frege focuses entirely on the alleged virtues of Axiom V: (1) that it is the appropriate cognitive means of grasping logical objects, if there are such objects at all; (2) that it is scientifically indispensable or, more specifically, that without it a scientific justification of arithmetic would be impossible; (3) that it serves the same ends that other mathematicians seek to attain by creating new numbers. (Concerning (3), Frege had already made it clear that the transformation in Axiom V differs fundamentally from the unregulated and arbitrary creation of numbers by other mathematicians.) He goes on to say: “We thus hope to be able to develop the whole wealth of objects and functions dealt with in mathematics out of the functions whose names are listed in I, § 31, as from a seed. Can our procedure be called a creation?” I find this transition irritating, especially since Frege spares himself the trouble of explaining it to his readers. In particular, I fail to see why and how Frege’s hope of being able to develop all the objects and functions dealt with in mathematics out of the primitive functions of the system of Grundgesetze should derive from the virtues that he claims for Axiom V. Admittedly, one might perhaps say that courses-of-values are developed out of the primitive course-of-values function ἐϕ(ε) via Axiom V. And in a sense, Axiom V is designed to “yield” (not to create) all objects dealt with in arithmetic. Recall that according to Frege’s logicist credo all numbers are to be defined as or identified with courses-of-values. Nonetheless, at least the way the functions that occur in arithmetic sprout from the seed of the primitive, logically simple functions is a matter quite distinct from logical abstraction inherent in Axiom V.
In any event, at this point of Frege’s exposition it seems that we cannot definitely rule out that with “our procedure” he intends to refer quite generally to the development of the objects and functions dealt with in arithmetic out of the primitive functions and not exclusively to the transformation as represented by Axiom V. However, the content of the last sentence of § 147, vague as it is, appears to speak again in favour of my presumption that with the use of the phrase “our procedure” a few sentences earlier Frege intends to refer only to Axiom V. In this sentence (“And with this, all the difficulties and doubts [concerns] that otherwise call into question the logical possibility of creation disappear, and we may hope that with our courses-of-values we achieve everything that has been missed by following those other paths”), he mentions explicitly courses-of-values and thus appeals implicitly also to Axiom V. Be this as it may, the question as to how the development of the objects and functions treated of in mathematics out of the primitive functions is to proceed is passed over in silence by Frege. This is unfortunate because he missed the chance of dispelling any remaining doubt about what he meant by “our procedure” in the relevant context. As to the development of objects and functions out of the primitive functions, I conjecture that what Frege had in mind was the construction of logically complex function-names and object-names by iterated application of the formation rules of his system, which are “gap formation” and “insertion”.
In this way, he does indeed obtain special functions – for example, the relation of an object falling within the extension of a concept, the single-valuedness of a relation, the following [succession] of an object after an object in the series of a relation – and likewise special objects – for example, equivalence classes of equinumerosity, extensions of relations (= Relations), Relations of Relations – that are required for laying the foundations of arithmetic, and he is able to define them via constructive definitions. See the table of definitions in [20, pp. 240 f.]; see also [21], for example, §§ 167, 173, 175, 193.
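As a reading aid, the transformation expressed by Axiom V, from the generality of an equality to a course-of-values equality, can be rendered in a modernized linear notation roughly as follows. This is only a sketch and not Frege’s own two-dimensional concept-script: the acute accent over the bound Greek letter is an assumed stand-in for Frege’s smooth breathing, and f and g are assumed to range over first-level functions of one argument.

```latex
% Axiom V in a modernized linear notation (a sketch, not Frege's own typography).
% Assumption: \acute{\varepsilon} stands in for the smooth breathing over epsilon.
% In Frege's system the outer "=" holds between truth-values.
\[
  \vdash \bigl(\acute{\varepsilon} f(\varepsilon) = \acute{\alpha} g(\alpha)\bigr)
         = \bigl(\forall a\,(f(a) = g(a))\bigr)
\]
% The same content read as a biconditional:
\[
  \acute{\varepsilon} f(\varepsilon) = \acute{\alpha} g(\alpha)
  \;\leftrightarrow\;
  \forall a\,\bigl(f(a) = g(a)\bigr)
\]
```

Read from right to left, the formula takes us from the fact that two functions always have the same value for the same argument to the identity of their courses-of-values, which is the step of logical abstraction at issue in the discussion of § 147; read from left to right, it licenses the converse step.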
Admittedly, those who still wish to raise doubts about Frege’s platonism in the period 1893–1903 might think they have an easy task by drawing attention to Frege’s apparently insouciant stipulations when he comes to laying out his formal system. What I have in mind is first and foremost his stipulation at the end of § 10 in [20]: the True and the False are identified with their own unit classes in order to remove, in a first essential step, the referential indeterminacy of course-of-values terms, arising from Frege’s metalinguistic stipulation concerning the informal analogue of the name of the course-of-values function “ἐϕ(ε)” in § 3, later to be enshrined in the formal version of Axiom V.6

6 The stipulation in § 3 reads as follows: “I use the words ‘the function Φ(ξ) has the same course-of-values as the function Ψ(ξ)’ generally as coreferential [gleichbedeutend] with the words ‘the functions Φ(ξ) and Ψ(ξ) always have the same value for the same argument’.” Axiom V appears for the first time at the end of § 20, clad in formal garb. – Above I wrote deliberately “in a first essential step”. In § 10, Frege proposes to achieve a more exact specification of courses-of-values, that is, to remove the referential indeterminacy of course-of-values terms, by determining for every primitive first-level function, when introducing it, which values it receives for courses-of-values as arguments, just as for all other arguments. At the stage of § 10, the proposed procedure boils down to determining the values of ξ = ζ for courses-of-values and the two truth-values as arguments. In §§ 11-12, Frege introduces the last two primitive first-level function-names of his system, the description operator and the conditional sign, by determining the values of the corresponding functions for courses-of-values as arguments, and for all other arguments. Note that neither function is completely reducible to a primitive first-level function that has already been elucidated. Although Frege passes the issue over in conspicuous silence, it could seem that with these two additional stipulations the piecemeal process of fixing completely the reference of the name of the second-level course-of-values function ἐϕ(ε) has come to an end for him. Note that the determination of the values of ξ = ζ for courses-of-values and the two truth-values as arguments plays a key role in Frege’s method of fixing completely the references of courses-of-values terms. This applies even independently of the fact that for negation, the determination of the function-values for the two truth-values and all other arguments (of type 1) proves to be unnecessary and the horizontal function (which is a concept under which only the True falls) is reducible to ξ = ζ; plainly, this concept is co-extensive with ξ = (ξ = ξ). – By the way, if for Frege a sound elucidation of the primitive function-name “ἐϕ(ε)” had proved to be feasible, that is, one which did not rest on a presupposed acquaintance with courses-of-values, then he could have defined the predicate “a is a course-of-values” (“CV(a)”), modelled on his definition of “n is a cardinal number” in § 72 of [18]: CV(a) := ∃ϕ (ἐϕ(ε) = a). (As far as I can see, this was first noted in [41] and [44].) Equipped with this definition, which, let us suppose, satisfies Frege’s principle of completeness, he would have been in a position to decide, in principle, for every given object a whether or not it is a course-of-values. If a is a course-of-values and is given to us as such, Axiom V would tell us whether a is identical with a course-of-values b referred to by a canonical course-of-values name. Unfortunately, the prospects for devising an impeccable elucidation of “ἐϕ(ε)” were not encouraging for Frege.

The sceptic might object that the identification just mentioned flies in the face of ontological platonism. According to Frege’s alleged platonist stance – so he or she might argue – it should be an objective fact whether, say, the True is a course-of-values or not, and if it is one, which one it is. From the point of view of the platonist, this has to be fixed once and for all in the mind-independent universe of logical objects and, hence, can never be a matter of arbitrary stipulation. On the one hand, I do not think that we are entitled to claim, by appealing to Frege’s practice of making certain stipulations in the course of constructing his mature logical theory, that he was not a platonist, at least not during the period of Grundgesetze. On the other hand, I do not wish to deny that there is indeed a tension between Frege’s platonism and certain stipulations that he makes in [20]. To be sure, we have no evidence that he was fully aware of this conflict; nor do we know whether he thought that he could lightly pass over it, insisting that he was at liberty to make certain stipulations – consistent with the set of assumptions underlying the theory – in order to secure a unique reference (Bedeutung) for every well-formed expression of his formal language. Before I turn to Frege’s view vis-à-vis Cantor’s approach to analysis, let me briefly illustrate the tension that I mentioned by considering just one
special aspect. On the face of it, it seems consistent for Frege (1) to dismiss as indefensible the general proposal, made in the second footnote to [20, § 10], of identifying with their unit classes all and only those objects which are not given to us as courses-of-values (that is, which are not referred to by canonical course-of-values terms) and yet (2) to allow certain particular identifications which the general proposal, if accepted, would also license. On closer reflection, however, this is less clear. The identification of the True and the False with their unit classes is, from Frege’s point of view, indeed consistent with Axiom V, as is established by his “permutation argument” in § 10. Yet following his line of thought in the second footnote, it seems that, before we make this stipulation, we are bound to rule out that the True and the False are courses-of-values or classes containing more than one object. For according to the argument presented there, the fact that an object is not given to us as a course-of-values does not imply that it is not one. In particular, we have no guarantee that such an object is not a course-of-values distinct from its unit class. But why should this argument not apply to Frege’s favourite logical object, referred to by “∀x (x = x)”, for example? And if it does, how can Frege then legitimately identify the True with its unit class? So much for Frege’s platonism which in my view overarches his entire philosophy of arithmetic. It is for this reason that in the present introduction I tried to shed some new light on it.7 Let us return to Frege’s critique of rival theories of real numbers. His discussion of Cantor’s theory of irrational numbers appears to be a trifle less polemical than both his crusade against the formalists and his attempt to make fun of and pull to pieces Weierstraß’s theory of the natural numbers.
The discussion consists in large part in demonstrating that Cantor offends against two principles of correct explicit definitions that Frege lays down in [20] and considers at length in [21, §§ 56-67]: the principle of completeness and the principle of simplicity (of the definiendum).8 As to the first principle, he confines himself to considering the case of first-level concepts and first-level relations. The principle then states that a definition of a concept must uniquely determine, with respect to any object, whether or not it falls under the concept. Similarly, a definition of a dyadic relation must unambiguously determine, with respect to any one object and any other object, whether or not the one stands in that relation to the other. The principle of simplicity states that the sign or name defined may not be composed of any
7 My motivation to take a closer look at the concluding passage of [21, § 147] from the point of view of Frege’s platonism derives from the talk ‘“The discussion of this question can easily degenerate into a quarrel about words”: Platonism in Frege’s Grundgesetze?’ that Marcus Rossberg and Philip Ebert delivered in May 2011 in a conference on Frege’s philosophy of mathematics in Bucharest (organized by Sorin Costreie), and especially from our subsequent discussion.
8 In [20, § 33], Frege states seven principles that he considers to be relevant for definitions.
familiar signs that are yet to be defined.9 To all appearances, it was not Frege’s primary concern to comment on the very substance of Cantor’s theory of real numbers. His resumé that this theory in no way reaches its aim seems, apart from a few sound but comparatively minor objections, strongly exaggerated.10 In my eyes, the momentousness of the theories of real numbers developed by Cantor, Weierstraß and Dedekind is beyond doubt. As a matter of fact, these theories had a decisive impact on later approaches to the foundations of analysis. It is mainly for this reason that they deserve to be called “classical”. Frege distinguishes between three notions of the essence of cardinal numbers in the writings of Weierstraß, all of which he dismisses as untenable. (1) A number is an aggregate of concrete things. “If you roused a man, who had never contemplated the matter, from his sleep with the question, ‘What is a number?’, he would likely put forth, in his initial state of perplexity, expressions similar to those of Weierstraß: ‘set’, ‘mass’, ‘series of things’, ‘object consisting of homogeneous parts’, etc. [. . . ] Both of the possible major errors have thereby been committed. The first consists in confusing the number with its bearer or substrate [. . . ] The second lies in the fact that neither the concept nor the extension of the concept are taken to be the bearer of the number, but rather that which should be denoted by the words ‘aggregate’, ‘series of things’, ‘object consisting of homogeneous parts’” [21, p. 150]. (2) The number is a property (value, validity) of such an aggregate. Frege’s comment is this: “The value or validity of an aggregate or a number is distinguished from the aggregate itself and, hence, it seems obvious that with this the actual [eigentliche] number is meant. This is also a way in which it is smuggled in; nowhere is it said what the value or validity might be” [21, p. 151]. 
(3) The number is an aggregate of abstract things or of a single, repeatedly occurring abstract thing. “As a row of books consists of books, so the number 3 consists then of abstract units, or better yet, of the – of course repeatedly occurring – One. We do not learn what this might be, though. It is probably so abstract that in order to think it, one must not think anything at all” [21, p. 152]. It is in view of these deficiencies that Frege feels free to leave out an examination of Weierstraß’s theory of irrational numbers. The basis of this theory is simply not firm, Frege surmises. In my view, Frege is not taking the matter seriously enough here. As I have implied above, Weierstraß’s construction of analysis – the terminological shortcomings aside – makes sense. Thus, it would have deserved more careful consideration by Frege beyond criticizing Weierstraß’s use of the word “aggregate” and providing evidence of definition-theoretic errors. By the way, one may speculate what Frege himself would have answered if he had been roused in the night with the question: “What is a number?” 9
Frege maintains that the simplicity of the definiendum does not rule out that it may be regarded as consisting of parts. Its simplicity does exclude, however, that the reference of the definiendum follows from the references of the parts and, furthermore, that these parts occur also in other combinations and are treated as independent signs with a reference of their own (cf. [21, § 66]).
10 Of course, it must be assumed in the first place that one is prepared to accept Frege’s theory of definition.
Having said that, let me also mention, in fairness to Frege, that in his critical remarks on Weierstraß’s conception of natural numbers (see also [24, pp. 232 ff.]), just as when he inveighs, rather dismissively, against Cantor’s description of how to arrive at the cardinal number of a given set by carrying out a double act of abstraction (cf. [24, pp. 76-80]) or takes sides against Biermann and Schubert’s views of the numbers (cf. [24, pp. 81-95]; [23, pp. 240-261]), he plays masterly on the keyboard of irony and sarcasm. And for the most part, I am inclined to acknowledge his arguments as sound; a few of them even strike me as devastating. Some might complain that Frege’s critique of Weierstraß, Cantor, Husserl, Heine, Thomae, Biermann, Schubert and other contemporaries of his lacks charity and occasionally overshoots the mark (see, for example, [53] and [54]). Be this as it may, compared with the academically longwinded and stilted writing of many of his fellow mathematicians (I recommend [27] as a delightful sample) I find Frege’s way of criticizing rival theories of number both refreshing and insightful.
Needless to say, Frege’s own theory of real numbers in his mature period, although it differs markedly from the theories of his fellow mathematicians, did not emerge out of the blue. Typically enough, he refers to it as “Größenlehre” (“theory of quantity”). In a sense, the concept of quantity was a constant companion of his when he developed his foundational project in several stages. As a matter of fact, this concept already plays a key role for Frege at the beginning of his career, namely in his “Habilitationsschrift” Rechnungsmethoden, die sich auf eine Erweiterung des Größenbegriffs gründen of 1874, is at least touched upon in his philosophical masterpiece of 1884 and again plays a dominant role in [21]. So far these topics have not received the attention they deserve. This applies also to Frege’s critique of Cantor’s theory of irrational numbers.11 In sections 2 and 3, I shall try to fill this gap to some extent. Admittedly, as I indicated above, due to its focus on alleged definition-theoretic errors and its neglect of certain issues germane to the quintessence of Cantorian analysis, Frege’s assessment of Cantor’s approach suffers from one-sidedness. Moreover, to my mind it is not free from bias. All the same, I think that it deserves to be discussed by paying a little more attention to some of the details of Cantor’s doctrine than Frege does.
I shall now proceed as follows. In section 2, I shall be concerned with Frege’s understanding of the concept of quantity in his work between 1874 and 1884. In a first step, I deal with his theory of quantity in his Habilitationsschrift. In a second step (likewise in section 2), I comment on Hankel’s conception of quantity and a remark of Frege’s on Hankel’s theory of real numbers ([18, § 12]). In addition, I consider a passage in [18, § 19], where Frege makes some remarks about Newton’s view of number in terms of “the abstract relation between any given quantity and another of the same kind that is taken as a unity”. In section 3, I describe the essential features of Cantor’s theory of irrational numbers and examine the main points of the critique deployed by Frege. In section 4, I take a look at Russell’s theory of real numbers in [39] as well as in [56]. Section 5 is devoted to a detailed reconstruction of Frege’s conception of the notion of quantity and his theory of real numbers with an eye to both the sketchy informal and the meticulous formal account as far as it goes in [21]. Special emphasis is placed on the considerations that led him to set up the definitions of the concepts positival class and positive class and the problem of proving the mutual independence of the clauses that make up the definition of the former concept, which is only preparatory to the latter. Due to Russell’s paradox, Frege’s logical foundation of analysis remained a fragment. In the final section 6, I give a brief account of von Kutschera’s proposal of how Frege might have carried on with the logical construction of analysis in a projected third volume of Grundgesetze, had he not been shocked by Russell’s paradox.
11 To the best of my knowledge, so far only Dummett [12, pp. 63 ff.] has dealt with Frege’s critique of Cantor’s theory of irrational numbers. Yet his account differs very much from my own. As to the concept of quantity in [16], the only treatment that I have seen in the literature is the one given by Currie [8, pp. 353 ff.]. However, I have never come across any discussion of Frege’s remarks on Hankel’s theory of real numbers in [18, § 12].
2 The concept of quantity in Frege’s writings between 1874 and 1884
2.1 Methods of calculation and the concept of quantity: Frege’s Habilitationsschrift (1874)
At the outset of his Habilitationsschrift Rechnungsmethoden, die sich auf eine Erweiterung des Größenbegriffs gründen (1874), Frege aims at illustrating the remarkable difference between geometry and arithmetic in the way in which their fundamental principles are grounded. As the title suggests, it is by investigating the concept of quantity that he pursues this aim. Frege points out that this concept had gradually been detached from intuition and finally gained the status of a self-subsistent concept. Its range of application is indeed so comprehensive that he is certainly right in denying that it stems from intuition. Frege argues as follows. Since we have no intuition of the object of arithmetic, its principles cannot rest on intuition either. One might add by way of analogy: Since we have no sense perception of the object of arithmetic, its principles cannot rest on sense perception either. Frege does not tell us directly from which source of knowledge the principles of arithmetic are supposed to originate, but I trust that he would have said something like this: these principles derive from conceptual or pure thinking. In an instructive letter to Anton Marty written in 1882, Frege
mentions for the first time the notion of a source of knowledge, which he presumably borrowed from Kant. However, as early as in [16, p. 50] Frege speaks of intuition as the source of the axioms of geometry. I am almost certain that the word “source” is meant there to refer to what in his letter to Marty of 1882 he characterizes as the source of knowledge of spatial intuition. Generally speaking, I suppose that Frege uses the term “source of knowledge” to refer to a cognitive faculty of the human mind, and in doing so he is following deliberately in Kant’s footsteps. Yet unlike Kant, he explicitly characterizes (though only in his late fragments) a source of knowledge as that which justifies the acknowledgement of truth, the judgement [24, p. 286]. In the letter to Marty, Frege emphasizes that a source of knowledge more restricted in scope than conceptual thinking (begriffliches Denken), like spatial intuition or sense perception, would not suffice to guarantee the general validity of the arithmetical sentences. Thus, in this letter Frege already classifies three sources of knowledge, a classification that reappears, save for one modification, in an undated letter to E. V. Huntington (presumably written in 1902) and in his late fragments ‘Erkenntnisquellen der Mathematik und der mathematischen Naturwissenschaften’ and ‘Neuer Versuch der Grundlegung der Arithmetik’.12 In these fragments, he acknowledges the logical source of knowledge, the geometrical source of knowledge (that is, spatial intuition) and sense perception as constituting the third source of knowledge. Due to the lack of available evidence, I hesitate to suggest that what in the letter to Marty and again in [18] (cf. § 14) Frege calls conceptual thinking coincides with the logical source of knowledge.
However, on plausible grounds I assume that in Frege’s eyes our ability and actual performance of conceptual thinking, in particular our practice of drawing deductive inferences, is very much akin to what in his late fragments he terms the logical source of knowledge. I presume, however, that in his view the logical source of knowledge is not only the faculty of drawing deductive inferences13 and, hence, of providing deductive justifications for truths. My hunch is that he also takes it to be that cognitive faculty which enables us to grasp, in a direct, non-inferential way, primitive laws of logic.14 This is not to say that he regards the logical source of knowledge at the same time as furnishing justifying grounds for acknowledging primitive laws of logic to be true. In the Preface to [20], Frege raises the question why and with what right we acknowledge a logical law to be true. His answer is that logic can respond only by reducing it to other logical laws. When this is not possible – as can be seen whenever the act of acknowledging a primitive law of logic as true is at issue – Frege claims that logic can give no answer. In summary then, I presume that during his entire career, from his first writings until his last fragments, Frege adhered unwaveringly and invariably to the Kantian idea that the human mind is endowed with certain specific faculties of attaining knowledge, with sources of knowledge. In [18, § 12], Frege uses also the terms “Erkenntnisgrund” (“ground of knowledge”) and “Erkenntnisprinzip” (“principle of knowledge”) when he refers to (pure) intuition.
There are several places in the Kritik der reinen Vernunft (Critique of Pure Reason, [35]) where Kant employs the term “Erkenntnisquelle”. Following Kemp Smith’s translation of [35], I render it as “source of knowledge”; “source of cognition” is another possible translation, and it is chosen by Guyer and Wood in their translation of [35]. Thus, for example, Kant writes in [35, B4] (I quote again from the translation by Guyer and Wood): “. . . strict universality belongs to a judgement essentially; this points to a special source of cognition [knowledge], namely a faculty of a priori cognition [knowledge]. Necessity and strict universality are therefore secure indications of an a priori cognition [knowledge].” Such a faculty of a priori knowledge is space and time: “Time and space are accordingly two sources of cognition [knowledge], from which different synthetic cognitions can be drawn a priori, of which especially pure mathematics in regard to the cognitions of space and its relations provides a splendid example” (A38-9/B55-6). Much later, in the Transcendental Dialectic, Kant writes (A294/B350-1): “But the formal aspect of all truth consists in agreement with the laws of the understanding. In the senses there is no judgement at all, neither a true nor a false one. Now because we have no other sources of cognition [Erkenntnisquellen] besides these two, it follows that error is effected only through the unnoticed influence of sensibility on understanding, through which it happens that the subjective grounds of the judgement join with the objective ones, and make the latter deviate from their destination. . . ” Earlier in the Critique (A94/B127), Kant had mentioned “three original sources (capacities or faculties of the soul), which contain the conditions of the possibility of all experience, and cannot themselves be derived from any other faculty of the mind, namely sense, imagination, and apperception.” Later, at the very outset of the second book of the Analytic of Principles (A130-1/B169), he underscores that “general logic is constructed on a plan that corresponds quite precisely with the division of the higher faculties of cognition [Erkenntnisvermögen]. These are: understanding [Verstand], the power of judgement [Urteilskraft], and reason [Vernunft]. In its Analytic that doctrine accordingly deals with concepts, judgements, and inferences, corresponding exactly to the functions and the order of those powers of mind [Gemütskräfte] which are comprehended under the broad designation of understanding in general.”
I presume that Kant uses the terms “source of knowledge [cognition]” and “faculty of knowledge [cognition]” largely in the same sense. According to him, there are basically two sources or faculties of knowledge, a lower one, namely sense, and a higher one, which is understanding, taken in a comprehensive sense. Understanding, conceived of in this wider sense, comprises both the power of judgement and reason. On the face of it, Frege’s logical source of knowledge bears a notable similarity to Kant’s source or faculty of knowledge of the understanding. According to Frege, the logical source of knowledge is involved when inferences are drawn, and thus is almost always involved. Similarly, in Kant’s view, both judgements and inferences fall, by their very nature, in the domain and activity of the understanding, construed in the broader sense.
12 In his letter to Huntington, Frege writes [25, p. 89]: “I have set myself the goal of grounding arithmetic on logic alone. For this it is essential to exclude with certainty everything that stems from other sources of knowledge (intuition, sense experience).”
13 According to Frege, deductive inference is to judge by being aware of other truths as grounds of justification. In his fragment ‘Logik’ (I), he underscores that deductive inference cannot be the only mode of justifying truths. “There must be judgements whose justification rests on something else, if they stand in need of justification at all” [24, p. 3]. The task of investigating non-deductive or non-logical forms of justification is assigned to epistemology. Logic and epistemology are thus put on a par only insofar as both disciplines are concerned with justifying grounds of truths. Admittedly, in ‘Logik’ (I), Frege does not mention other forms of justification besides deductive inference or deductive proof. In particular, he does not say there that epistemology can provide a non-deductive justification of a primitive law of logic. It therefore remains unclear on what the justification of truths (if there are any), which are capable and (or) in need of justification, but resist justification through deductive proof, is supposed to rest. It seems, however, that if there were no such truths, epistemology, as characterized by Frege, would lack a proper domain of investigation. For he can hardly see its task in supplying justifying grounds for truths which do not stand in need of justification. Notice that in ‘Logik’ (I) Frege does not explicitly claim or demand the existence of truths that need neither deductive nor non-deductive justification.
14 I am inclined to ask: if it is not the logical source of knowledge that enables us to grasp primitive truths of logic, which other source of knowledge should then enable us to do this? Moreover, it is perfectly possible that in Frege’s view the logical source of knowledge comprises also the faculty of conceptual analysis.
As far as the role of concepts in Kant’s Analytic is concerned – recall that he establishes a correspondence between understanding (construed in the narrower sense), the power of judgement, and reason and the specific function belonging to each of these higher faculties of cognition – Frege’s use of the term “conceptual thinking” may come to mind. This term is perhaps a kind of forerunner of the term “logical source of knowledge”, bearing in mind that Frege uses the latter term only in an undated letter to Huntington and his last fragments. Let me emphasize that despite the similarity I just mentioned Frege’s conception of logic does not coincide with Kant’s. Kant distinguishes between general or formal logic and transcendental logic. We know from a remark of Frege’s in ‘Über die Grundlagen der Geometrie’ II, (1906) that, despite first appearances, logic is for him not purely formal (cf. [23, p. 322]). I must leave a thorough comparison of the conceptions of logic of Kant and Frege for another occasion.
It is true that logic is not even mentioned in [16]. Yet stressing the comprehensive range of application of the concept of quantity, as Frege does, seems to foreshadow his later argument from the universal applicability of arithmetic to its purely logical nature ([18] and [19]). To be sure, it is not more than this. In [16], Frege does not anticipate, let alone explicitly state, the central thesis of his philosophical masterpiece still to come in 1884. The thesis is as follows: The fundamental laws of arithmetic are (in all likelihood) analytic, that is, they can be derived exclusively from
primitive laws of logic and definitions.15 I hasten to add that the key idea underlying Frege’s logicist project does already appear in [17], although there it is not yet framed in terms of the notion of analyticity.16 After having classified two kinds of truths which require a proof for their justification – the proof of a truth of the first kind can proceed purely logically, while the proof of a truth of the second kind must be supported by empirical facts – the question to be settled for the laws of arithmetic is to which of these two kinds of truths they belong. As Frege points out, the answer requires us to test “how far one could get in arithmetic by means of inferences alone, relying only on the laws of thought, which are beyond all particularities. The procedure for this test was that I sought first to reduce the concept of ordering in a series to the concept of logical consequence, in order to advance from here to the concept of number” [17, p. X].17 15 In Frege’s view, primitive truths of logic are maximally general truths which, thanks to their evidence, neither need proof nor admit or are capable of proof in a theory T in which they are laid down as axioms. While the property of unprovability depends on a particular system, Frege seems to regard the property of not needing proof as something that belongs intrinsically to certain distinguished, (self-)evident general truths, quite independently of the system in which they are singled out as axioms. Note that in [18, § 3] Frege defines the concept of analyticity only for truths which are capable of being proved; no provision is made for the first premises of the deductive proof (of an arithmetical truth), namely the basic laws of logic figuring as axioms in a theory T and the definitions framed in T . I assume, however, that Frege, had his attention been drawn to the omission, would have characterized both the primitive laws or axioms of logic and the definitions as analytic. 
Frege insists that in his definition of analyticity in terms of deducibility from fundamental logical laws and definitions it is presupposed that we take into consideration also those propositions on which the permissibility of a definition rests. (Both Austin and Jacquette’s translations of the relevant passage in [18, § 3] are inaccurate; see [22, p. 4] and [26, p. 19].) 16 In his Begriffsschrift of 1879, Frege does not yet employ the term “analytic” in the sense in which he defines it in [18, § 3]. In [17, § 24], he explains his conception of definition by taking as an example the definition of a hereditary property in a series (or sequence). It is only in this context that he uses the term “analytic”, and he does so along Kantian lines, where the analyticity of a judgement implies its epistemic triviality. Once the content of the definiens has been bestowed upon the definiendum, the definition is immediately turned into an analytic judgement, because it displays only what was put into the new symbols in the first place. By contrast, Frege’s definition of the notion of analyticity in terms of the notion of deductive proof in [18], § 3 allows that an analytic truth extends our knowledge. It therefore differs essentially from Kant’s explanation, despite the fact that in [18, § 3 (footnote)] Frege tries to play down the difference by saying that he does not intend to confer a new sense on the term “analytic”, but only to state accurately what Kant has meant by it. Yet in [18, § 88] Frege finds fault with what he sees as the narrowness of Kant’s definition of analyticity. It is already in his letter to Marty that Frege criticizes Kant for having placed too little value on analytic judgements because the examples on which he draws are too simple. I doubt that the basic laws of arithmetic, if they “can be proved from definitions by means of logical laws alone. . . may have to be regarded as analytic judgements in the Kantian sense” [25, p. 163]. 
17 In part III of [17] entitled ‘Einiges aus der allgemeinen Reihenlehre’ (‘Some Topics from a General Theory of Sequences’), Frege derives a number of theorems about sequences to provide a general idea of how to handle his concept-script and underscores the extensive applicability of the theorems obtained. He makes it clear that the range of validity or application of a truth is as wide as the scope of the source of knowledge from which it derives. For the sake of convenience, I use here the term “source of knowledge”; recall that in 1879 Frege does not yet use this term. When he embarks on commenting on theorems about sequences, he mentions pure thinking and intuition which only a few years later he terms sources of knowledge.
After having argued against the intuitive character of the subject matter of arithmetic in [16], Frege goes on to write (p. 51):
If, as we have shown, we do not find the concept of quantity in intuition, but create it ourselves, then we are justified in trying to formulate its definition so as to permit as manifold an application as possible, in order to extend the domain that is subject to arithmetic as far as possible. Now to what do those principles, from which the whole of arithmetic grows as from a seed, refer? To addition; for the other kinds of calculation arise from this one. This is why there is such an intimate connection between the concepts of addition and quantity that the latter cannot be grasped at all without the former. Quite generally speaking, the process of addition is the following: we replace a group of things by a single one of the same kind. This gives us a determination of the concept of quantitative identity. If we can decide in every case when objects agree in a property, then we have obviously the correct concept of the property. Thus in specifying under what conditions there is a quantitative identity, we determine thereby the concept of quantity. A quantity of a certain kind – for example, a length – is accordingly a property in which a group of things can agree with a single thing of the same kind, independently of their internal structure.
Frege adds that the proposed determination of the concept of quantity can be regarded as sound only if the property we are thinking of allows such a scope that it is possible for things not to agree in it. He calls the multiplicity enclosed within this scope the quantitative domain. This exposition of the concept of quantity is far from being a paragon of clarity and definiteness. (1) To begin with, it springs to mind that Frege speaks of a creation of the concept of quantity by ourselves. To the best of my knowledge, this is the only place in his entire work where he acknowledges a creation at all concerning concepts, numbers or logical objects in general, within the bounds of his own philosophy of arithmetic. I do not know how much importance we should attach to this remark, which is at variance with everything that Frege says in his later work about the formation of concepts in general and of mathematical and logical concepts in particular. Perhaps Frege only wanted to convey that the concept of quantity does not originate in intuition, but is rather something that we find only in rational, conceptual thinking. Be this as it may, it is true that only a few years later in ‘Boole’s rechnende Logik und die Begriffsschrift’ of 1880-81 and likewise in [18] the usefulness of definitions in mathematics and logic is not generally seen as restricted to their function as abbreviations and simplifications as in Frege’s work, say, after 1891, when he comes to develop and present a systematic theory of (explicit) definition, based on a few clear-cut principles. According to [18], the distinguishing mark of “really good” definitions lies in the fact that they embody a process of fruitful concept formation. This process proceeds by analyzing a judgeable content into a constant and a variable part or in other words: by applying the method which in [20] Frege describes by stating three rules of constructing function-names in his formal language and which I term rules of gap formation. Yet no matter how fruitful concept formation via gap formation is taken by him to be (at least during the period 1880-1884), he nowhere characterizes it as a creation.18
(2) So much at least is clear: Addition is regarded as the key operation on which every other arithmetical operation rests. The concepts of quantity and of addition are inextricably intertwined. It is through addition that the relation of quantitative identity is fixed. In Frege’s theory of real numbers in [21], addition again plays a distinguished role. Here the demarcation of the quantitative domain results from the requirement that the commutative and associative laws for addition hold.
(3) The reader who is expecting that a definition of the concept of quantity is finally forthcoming is bound to be disappointed. As to Frege’s claim “If we can decide in every case when objects agree in a property, then we have obviously the correct concept of the property”, Currie [8, p. 354] has pointed out that it is ambiguous. He argues that “agreement in a property” may mean: (a) “a and b have F”, or (b) “a and b have F to the same degree”, or (c) “a and b are the same F”.
He considers (b) to be the most likely option: “Because from ‘a and b have F to the same degree’ we can infer (A): ‘the magnitude of a’s F ness = the magnitude of b’s F ness’. . . ” (p. 354). Perhaps this is right; it is hard to tell. Note that in Frege’s later work to have a grasp of a first-level concept or property would amount to saying: we can decide for every given object whether it falls under the concept or not/whether it has the property or not. (After 1891, Frege construed the concepts under which a given object falls as its properties.) Now, on the face of it, the way Frege characterizes the intended definitional introduction of the concept of quantity is reminiscent 18 According to [18], the characteristic marks of fruitful definitions are as follows: (1) they represent a kind of concept formation in which, to use Frege’s geometrical image, entirely new boundary lines are drawn; (2) they enable us to carry out gapless proofs, something that would have been impossible without them; (3) we may draw inferences from them which extend our knowledge. This, however, is not to say that a fruitful definition as such adds to our knowledge. Frege nowhere claimed that it does. As I indicated above, in his mature period after 1891 Frege abandoned his thesis about the systematic fruitfulness of good definitions in mathematics and logic. For details regarding this change see [43].
of the method of introducing (tentatively) a concept (more precisely: a function-name or a singular term forming operator) by means of a contextual definition, in terms of an abstraction principle. Unfortunately, Frege fails to specify a criterion of identity for quantities. If Currie’s proposal is correct, then we are left with “The quantity of a’s F ness = the quantity of b’s F ness if and only if. . . ”, and have to find out what could or should be put into the empty place, marked by the three dots. If it were the sign of a suitable equivalence relation, then the contextual definition so construed would define the operator “the quantity of a’s ϕness”.19 To be sure, Frege spares himself the trouble of showing how the content of arithmetic is contained in the properties of quantity (he thinks) he has set out, and how special kinds of quantity, such as cardinal number and angle, can also be defined from his standpoint. He confines himself to drawing the conclusion that quantity can be ascribed to operations. In general, he says, it is possible to search for the operation which, when applied n times, can replace a given operation, and for the operation which reverses the given one. “We can easily see that these operations and the ones that can arise from them in the ways indicated form a quantitive domain” [16, p. 52]. Frege goes on to point out that there are several examples of the repetition of the same operation to be found in arithmetic. Thus, addition is said to lead to multiplication, and multiplication to involution. So much for Frege’s treatment of the concept of quantity in [16]. Let us now turn to Hankel and Newton and Frege’s comments on their doctrines.
2.2 Frege on Hankel and Newton in Frege 1884 Guided by his earlier definitions of the terms “analytic truth”, “synthetic truth”, “a posteriori truth” and “a priori truth”, Frege raises the question, in the heading of § 12 of [18], whether the laws of arithmetic are synthetic a priori or analytic. He comments mainly on Hankel’s theory of real numbers and Kant’s notion of intuition. Before I turn to Frege’s comment on Hankel’s theory, let me take a look at Hankel’s introduction of the concept of magnitude in [27]. At the outset of the section entitled “The real numbers in the theory of magnitude”, Hankel claims that the relation-concept magnitude (Grösse) 19 We do not know when Frege began composing and writing [18]. Although it is a fairly small book, it contains almost his entire philosophy of mathematics in a rather condensed form. Furthermore, Frege went through a fair amount of literature before or when writing several chapters of the book. Thus, I presume that soon after the completion of his Begriffsschrift of 1879 he began working on his philosophical masterpiece. The fact that apart from his Begriffsschrift he published relatively little during the period, say, 1875–1883, lends perhaps further support to my presumption. In short, assuming that in [16] Frege thought that the concept of quantity would be best defined in terms of an abstraction principle is by no means far-fetched, let alone out of place.
is immediately given in pure intuition. He concludes from this that he need not provide a metaphysical definition of this concept, that is, a definition that reveals completely its essence, and that an exposition of it will suffice for his purposes. After having made a somewhat cloudy remark on the nature of mathematical definitions, Hankel goes on to say that regarding the concept of magnitude we need not define the concept of quantity (Quantität ), but must rather define the concept of a quantum (Quantum). He adds that these two concepts are united in the word “magnitude” and finally suggests that it is not the concept of magnitude that requires a definition, but rather “what ‘large’ is” (“was ‘gross’ sei” ). I find this hard to follow. Hankel possibly wishes to convey that the meaning of the word “magnitude” contains two distinct constituents or elements, namely the concept of quantity and the concept of a quantum. Alternatively, it could seem that he wants to say that the concept of magnitude has two intimately related conceptual components, namely the concepts just mentioned. I presume that what has to be defined, according to Hankel, is the predicate “is large” (“ist gross”). In what follows, Hankel refers to Euclid. He says that an analysis of the use that Euclid makes of the concept of being large or of the concept of largeness (Begriff des Grossen) yields the following definition [27, pp. 48 f.]: Grösse heisst ein Object, wenn es grösser, kleiner als ein anderes, oder ihm gleich ist, und in letzterem Falle ihm überall substituiert werden kann; wenn es ausserdem durch wiederholte Position vervielfacht (und geteilt) werden kann. Gleichartig heissen Grössen, wenn die eine vervielfältigt, die andere übertreffen kann. We call magnitude an object if it is greater, smaller as another or equal to another object, and in the latter case can always be substituted through it; if furthermore it can be multiplied (and divided) by iterated position. 
Magnitudes are of the same kind, if the one multiplied can exceed the other.
The last sentence expresses the same idea as definition 4 of book V of Euclid’s Elements (I first quote from the original Greek text and then from the translation provided by Heath): Λόγον ἔχειν πρὸς ἄλληλα μεγέθη λέγεται, ἃ δύναται πολλαπλασιαζόμενα ἀλλήλων ὑπερέχειν. Magnitudes are said to have a ratio to one another which are capable, when multiplied, of exceeding one another.
Roberto Torretti has pointed out to me that Hankel’s way of phrasing Euclid’s definition is imprecise. According to Torretti, the correct translation of definition 4 into German should have been this: Gleichartig heissen Grössen, wenn eine jede vervielfältigt die andere übertrifft.
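The difference between the two German phrasings can be made explicit in symbols. What follows is only a sketch under assumptions not found in the sources: a and b stand for magnitudes, m and n for positive integers, ma for the m-fold multiple of a, and the labels (H) and (E) are introduced here merely for convenience.

```latex
% (H): Hankel's wording on the reading Torretti objects to:
%      it suffices that one of the two magnitudes, multiplied, exceeds the other.
\text{(H)}\quad \exists m\,(ma > b) \;\lor\; \exists n\,(nb > a)

% (E): Euclid V, Def. 4 as amended by Torretti:
%      each of the two magnitudes, multiplied, exceeds the other.
\text{(E)}\quad \exists m\,(ma > b) \;\land\; \exists n\,(nb > a)
```

On this rendering (E) entails (H), but not conversely: a pair of magnitudes for which only one disjunct of (H) holds would satisfy Hankel's wording without having a ratio in Euclid's sense.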
I think that Torretti is right in stressing that this amendment is not sheer pedantry. He argues that if it were sufficient for satisfying the definition
that any of the two magnitudes, when multiplied, were to exceed the other, we might say that there is a ratio between a right angle α and any curved angle β contained in α since α multiplied with 1 is clearly greater than β. On the other hand, though, it is obvious that β multiplied with n will never exceed α, no matter how large the factor n may be. As to Hankel’s explanation “Grösse heisst ein Object, wenn es grösser, kleiner als ein anderes, oder ihm gleich ist, und in letzterem Falle ihm überall substituiert werden kann; wenn es ausserdem durch wiederholte Position vervielfacht (und geteilt) werden kann”, it does not correspond to any of the passages of the Elements where Euclid uses the word “µέγεθος” (“magnitude”). Now it is correct that in Book V of the Elements Euclid does not define the concept of magnitude, but rather analyzes its properties and structure by setting up a group of definitions and by subsequently proving a number of propositions involving the concepts of magnitude, of ratio, of multiple, of proportion, proportional, etc. However, instead of presenting a definition of the concept of a quantum or of the predicate “is large”, Hankel expressly offers a definition (not an exposition!) of the concept of magnitude when he makes his stipulation by appealing to Euclid. Recall that Hankel considered a definition of this concept to be unnecessary in the first place. However, I refrain here from trying to disentangle what strikes me as a confusion of terms and turn now to Frege’s comments on Hankel in [18, § 12]. In this section, Frege mentions that Hankel [27] bases the theory of real numbers on three principles, to which he ascribes the character of notiones communes.20 He then quotes from [27, p. 
54]: They become perfectly evident through explication, are valid for all domains of magnitudes, according to the pure intuition of magnitude; and they can, without forfeiting their character, be transformed into definitions, by saying: By the addition of magnitudes one understands an operation that satisfies these principles.
Frege objects that in the last claim there is an unclarity. He is willing to grant that the proposed definition can perhaps be framed. Yet he also points out [18, § 12] that it cannot serve as a substitute for those principles; for in the application it would always be at issue: are the cardinal numbers magnitudes, and is, what one ordinarily calls addition of cardinal numbers, addition in the sense of this definition? And to answer it, one would already need to know those sentences about the cardinal numbers.
The three principles that, according to Hankel, can be transformed into definitions are the following (cf. [27, pp. 54 ff.]):
(1) a + (b + c) = (a + b) + c.
(2) a + b = b + a.
(3) If a = Ae, b = Be and a′ = Ae′, b′ = Be′, then (a + b) of e is the same multiple as (a′ + b′) of e′.
20 Hankel appeals here to Kant’s conception of notiones communes.
Indeed, Hankel’s way of presenting the matter falls short of clarity, but not necessarily for the first reason that Frege mentions. To begin with, strictly speaking, it is not correct to claim that the principles (1), (2), and (3) are transformed into definitions. What Hankel suggests is rather a single definition of the operation of addition of magnitudes, consisting of several clauses. Understood in this way, the definition does not give rise to objections on formal grounds. Thus, at least from a formal point of view, Frege’s cautious phrase “can perhaps be made” seems to be misplaced. Frege is of course right in speaking of a definition in the singular and thus in tacitly correcting Hankel’s phrasing. Hankel’s definition should read as follows: The addition of magnitudes is an operation that satisfies the following principles:
(1) a + (b + c) = (a + b) + c.
(2) a + b = b + a.
(3) If a = Ae, b = Be and a′ = Ae′, b′ = Be′, then (a + b) of e is the same multiple as (a′ + b′) of e′.
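Read as a single definition with three clauses, the stipulation can be set out schematically. This is a hedged reconstruction, not Hankel's notation: the explicit quantifiers and the auxiliary multiplier C in clause (3) are assumptions introduced here to make the phrase "the same multiple" explicit.

```latex
\begin{align*}
&\text{(1)}\quad \forall a\,\forall b\,\forall c:\; a + (b + c) = (a + b) + c \\
&\text{(2)}\quad \forall a\,\forall b:\; a + b = b + a \\
&\text{(3)}\quad \text{if } a = Ae,\ b = Be,\ a' = Ae',\ b' = Be', \\
&\phantom{\text{(3)}\quad} \text{then there is a } C \text{ such that } a + b = Ce \text{ and } a' + b' = Ce'
\end{align*}
```

On this reading, clause (3) says that the multiplier of a sum depends only on the multipliers A and B of the summands, not on the unit chosen.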
Now a word about Frege’s first objection. I take it that Hankel regarded the cardinal numbers as magnitudes/quantities just as Frege did in [16] (cf. p. 51) and in [18]. The definition of the addition of magnitudes given above would then fully apply to the cardinals constituting just one type of magnitude. Moreover, I fail to see why Hankel should care much about the question of whether addition of cardinal numbers in the “ordinary” sense is addition in the sense of his definition. (Note that Frege fails to spell out what “ordinary” is to mean here precisely.) Hankel’s definition just lays down how the operation of addition of magnitudes of any type should be understood in his theory of magnitude. And to be sure, at least the properties of associativity and commutativity fully apply to what we “ordinarily” regard as the operation of addition of cardinal numbers. Frege writes [18, p. 18]: If we consider everything that is called a magnitude: cardinal numbers, lengths, surface areas, volumes, angles, curvatures, masses, velocities, forces, light intensities, galvanic currents, and so forth, we can well understand how they can all be brought under one concept of magnitude; but the expression “intuition of magnitude”, and even more so “pure intuition of magnitude”, cannot be acknowledged as correct.
In the light of Frege’s later work on the foundations of analysis in [21], it might come as a surprise that in this quotation he also mentions cardinal numbers as forming a kind of magnitude. From his later point of view, this is illicit. In [21], in the course of comparing the reals with the cardinals,
he makes it clear that a cardinal number serves to answer a question of the form “How many objects of a certain kind are there?”, while a real number answers the question “How great is a magnitude (or quantity) compared with a unit magnitude (or unit quantity)?”. In § 19 of [18], when Frege comes to discuss Newton’s conception of number, he claims – erroneously – that the number that gives the answer to the question “how much?” can also determine how many units are contained in a length. Finally, returning to Frege’s critique of Hankel’s theory of magnitude, I think that Frege is right in denying that the expression “intuition of magnitude”, and even more so the term “pure intuition of magnitude”, can be acknowledged as correct. Unfortunately, it remains obscure why Hankel appeals to a pure intuition at all in this context. Perhaps he was strongly influenced by Kant’s notion of a pure intuition in the Critique of Pure Reason. However this may be, from a mathematical point of view, there was no need for Hankel to invoke a pure intuition when introducing and characterizing the concept of magnitude. Thus, I think that Frege’s objection has after all little weight, since Hankel could easily have refrained from using the phrase “according to the pure intuition of magnitude”, indeed without any loss for his mathematical theory of magnitude in general and his proposed definition of the operation of addition of magnitudes in particular. Let us now turn to Frege’s comment on Newton’s conception of number in [18, § 19]. There Frege bases his comments on Baumann’s account of Newton’s ideas (cf. [4]). While I was composing this essay, I did not succeed in getting hold of Newton’s original work; nor did I manage to cast a glance at Baumann’s book. It is for this trivial reason that at present I cannot judge whether Baumann represents Newton’s conception of number faithfully.
In § 19, Frege argues against the attempt to conceive numbers geometrically, as ratios between lengths or surfaces. He cites Newton as proposing to define number as the abstract ratio between quantities, namely between any given quantity and another quantity of the same kind, taken as unity. This tallies with Frege’s own characterization of real numbers two decades later in [21]. Frege observes that Newton’s definition applies to numbers in the wider sense, including fractions and irrational numbers, adding the proviso that in this case the concepts of magnitude and of ratio of magnitudes are presupposed. Frege concludes from this: “Accordingly, it appears that the explanation of number in the narrower sense, of cardinal number, will not be superfluous.” 21 Now, the supposed fact that Newton’s general 21 As so often, Austin deviates a trifle too far from Frege’s original text when he translates [22, p. 25]: “This should presumably mean. . . ” This is simply inaccurate. By contrast, Jacquette [26, p. 34] gets it right here: “Accordingly, it appears. . . ” However, both Austin and Jacquette render the term “Grössenverhältnis” somewhat awkwardly as “relation in respect of magnitude” (Austin) and as “magnitude
definition of number presupposes the concepts of magnitude and of ratio of magnitudes would not cause any problem if prior to that definition Newton provided a proper explication or definition of the concepts of magnitude and of ratio of magnitudes (which at present I do not know). At any rate, Frege tries to justify the apparent need for explaining the concept of cardinal number within a Newtonian setting by referring to Euclid: “for Euclid needs the concept of equimultiple [“des Gleichvielfachen” ] in order to define the identity of two ratios of lengths; and the equimultiple amounts again to a numerical identity.” 22 Frege does not exclude that the identity of ratios of lengths can be defined independently of the concept of number. He goes on to say that if it could be defined in this way, then we might remain in uncertainty in which relation the geometrically defined number would stand to the number of ordinary life. He adds that a further problem might arise, namely the question of whether arithmetic itself can get along well with a geometrical concept of number, especially if one thinks of the number of roots of an equation or the numbers prime to a number and smaller than it. However, as I already pointed out, the sharp distinction between the application of the reals and the application of the cardinals along the lines of [21] is missing in [18]. This is obvious from Frege’s remark that the number that gives the answer to “How many?” can also determine how many units are contained in a length. In conclusion, he raises another objection to what, in his view, might have been Newton’s understanding of the notion of magnitude: Calculation with negative, fractional, irrational numbers can be reduced to calculation with the natural numbers. Yet what NEWTON perhaps wished to understand by magnitudes, as whose ratio the number is defined, was not only geometrical magnitudes but also sets. 
In that case, however, his definition is useless for our purposes, since of the expressions “number through which a set is determined” and “ratio of a set to the unit of the set” the latter provides no better information than the first.
So much for Frege’s critical assessment of Hankel’s and Newton’s conceptions of quantity.
relations” (Jacquette). 22 Jacquette’s translation of “Gleichvielfaches” as “equinumerosity” is incorrect, since Frege distinguishes between Gleichvielfaches and Gleichzahligkeit (equinumerosity).
2.3 Extensive and intensive magnitudes: Frege on Cohen
In a review of [7], Frege criticizes Cohen’s treatment of the distinction between extensive and intensive magnitudes.23 This distinction has Kantian roots. In [35] (“Systematic representation of all synthetic principles”), Kant distinguishes between extensive and intensive magnitudes. The principle of the axioms of intuition is: All intuitions are extensive magnitudes, whereas the principle of the anticipations of perception is: In all appearances the real which is an object of the sensation, has intensive magnitude, that is, a degree. Kant calls an “extensive magnitude that in which the representation of the parts makes possible the representation of the whole (and therefore necessarily precedes the latter)” (A162/B203). He calls that magnitude “which can only be apprehended as a unity, and in which the multiplicity can only be represented through approximation to negation = 0, intensive magnitude” (A168/B210). In his review of [7], Frege writes [23, p. 101]: Now the distinction between intensive and extensive magnitudes has no sense in pure arithmetic. Nor does it seem to matter anywhere else in the whole of mathematics. The number 3, for example, can serve as a measurement number for a distance with respect to a unit of length; but it can also serve as the measurement number for an intensive magnitude, for example, for a light-intensity measured in terms of a unit of light-intensity. The calculation proceeds in both cases according to exactly the same laws. The number 3 is therefore neither an extensive nor an intensive magnitude but it rather stands above this contrast. The same holds also for the infinitesimal. Cohen would perhaps respond to this: Light-intensity is not an intensive but an extensive magnitude; yet it seems that such a response would fly in the face of linguistic usage.
23 See in this connection [7], section 19 “Differential und intensive Größe”, section 33 “Intensive Realität”, section 58 “Das Intensive und das Inextensive” and section 79 “Die intensive Größe und das Infinitesimale bei Kant”; cf. also [6, pp. 211 ff.]. Just one year before he published his Begriffsschrift Frege delivered in Jena a short
There is not much to add to this assessment from my point of view. I agree with Frege that the distinction at issue does not have any proper place in arithmetic. Following Frege again, I further hold that this distinction is unsuited for playing any significant and fruitful role in other branches of mathematics. Finally, I think that both the concept of extensive magnitude and that of intensive magnitude are far from being sharply defined either in [35] or in [7]. To form a sustainable judgement concerning the legitimacy of Frege’s objections, I have taken the trouble of reading half way through [7], but found Cohen’s cloudy style of writing and arguing hard to digest. It seems to me that in raising his objections to certain ideas presented by Cohen Frege exercised verbal restraint. Despite the massive shortcomings of Cohen’s account, Frege’s critique is not accompanied by irony, let alone by sarcasm. On several other occasions, when he shoots the arrows of his mordant criticism on mathematicians and philosophers alike, just the opposite is the case. lecture on a way of conceiving the shape of a triangle as a complex quantity. He argues that, despite first appearances, the shape of a triangle can be conceived of not only as a quality, but also as a quantity. He underscores that the second option ought not to be confused with the fact that the shape can be characterized by quantitative determinations. “What we are concerned with here is to obtain one and only one measurement number [Meßzahl ] for each triangular shape, so that one can speak of the addition of two triangular shapes to yield a new triangular shape” [23, p. 90].
3 Cantor’s theory of irrational numbers and Frege’s critique
In what follows, I shall characterize Cantor’s theory of irrational numbers and take a closer look at the objections that Frege raises to this theory. In his essay ‘Über die Ausdehnung eines Satzes aus der Theorie der trigonometrischen Reihen’ of 1872, Cantor develops for the first time – albeit in a rather condensed form – his theory of irrational numbers [5, pp. 92-101]. He construes them as limit values of convergent sequences of rational numbers. In later work (see, for example, [5, p. 186]), he calls these sequences fundamental sequences (Fundamentalreihen). In his essay of 1872, Cantor takes the rational numbers as given and defines an infinite sequence of rationals $a_1, a_2, \ldots, a_n, \ldots$ (that is, a Cauchy-sequence $\{a_n\}$ of rational numbers) by appealing to the condition that the difference $a_{n+m} - a_n$ becomes infinitely small with increasing $n$, whatever the positive integer $m$ may be. In other words: he defines an infinite sequence of rationals by appealing to the condition that in case of an arbitrarily chosen positive rational $\varepsilon$ there is a positive integer $n_1$ such that $|a_{n+m} - a_n| < \varepsilon$, if $n \ge n_1$ and $m$ is an arbitrary positive integer. Cantor expresses this condition of $\{a_n\}$ succinctly as follows: The sequence $\{a_n\}$ has a certain limit $b$ [5, p. 93]. In his seminal work ‘Grundlagen einer allgemeinen Mannigfaltigkeitslehre’ of 1883, Cantor discusses three main forms of introducing the real numbers in a strict arithmetical fashion: the definitions suggested by Weierstraß, Dedekind and himself. All three definitions are said to share the common characteristic that to the definition of an irrational real number there always belongs a well-defined (countably) infinite set of rational numbers.
Cantor points out that the difference between the three forms of definition is due to the momentum of generation (Erzeugungsmoment) through which the set of rational numbers is linked to the number it defines, and to the conditions which the set must satisfy in order to qualify as a foundation for the definition of the number in question [5, p. 184]. As to his own definition of the real numbers, Cantor likewise proceeds from a countably infinite set of rational numbers $(a_\nu)$. Every such set $(a_\nu)$, which can also be characterized by the requirement
$\lim_{\nu = \infty} (a_{\nu+\mu} - a_\nu) = 0$ (for arbitrary $\mu$),
he calls a fundamental sequence and assigns to it a number $b$, to be defined through it, “for which one can expediently use the sign $(a_\nu)$ itself, as suggested by Heine” [5, p. 186].24
24 Note that here Cantor himself does not use quotation marks.
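The requirement on a fundamental sequence can be illustrated by a simple case. The example is not Cantor's: the particular sequence and the estimate are supplied here only for illustration, with ν a non-negative integer and μ a positive integer.

```latex
% Illustrative sequence of rationals (an assumption of this sketch, not Cantor's example):
a_\nu = \sum_{k=0}^{\nu} \frac{1}{k!}

% For every positive integer \mu:
0 < a_{\nu+\mu} - a_\nu
  = \sum_{k=\nu+1}^{\nu+\mu} \frac{1}{k!}
  \le \frac{1}{(\nu+1)!} \sum_{j=0}^{\mu-1} \frac{1}{(\nu+2)^{j}}
  < \frac{1}{(\nu+1)!} \cdot \frac{\nu+2}{\nu+1}
  \le \frac{2}{(\nu+1)!}
```

Since the bound 2/(ν+1)! does not depend on μ and tends to 0 as ν increases, the sequence meets the requirement for arbitrary μ. Each member is rational, yet the sequence has no rational limit, which is the kind of case the definition is meant to cover.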
Quite in the spirit of Frege, one might object to this explanation that it fails to spell out what entitles us to use the sign ‘(aν )’ in place of the number b we assign to a set (aν ) that meets the requirement mentioned above. Cantor’s appeal to Heine as an authority in this matter must have raised a red flag for Frege. Strictly speaking, a sign cannot take over the status and the function of a number; or, in other words: the former cannot replace the latter. This applies even if by a sign one does not understand the actualized sign type, that is, the concrete, physical occurrence or inscription of the sign, but rather the sign type qua abstract object. Yet Cantor’s subsequent explanations suggest that he originally intends to correlate numbers b, b′ and not their signs with his fundamental sequences. He writes [5, p. 186]: Such a fundamental sequence presents three cases, as can be rigorously deduced from its concept: either its members aν for sufficiently large values of ν are smaller in absolute value than any arbitrarily given number; or, from a certain ν on they are greater than a determinable positive rational number ρ; or, from a certain ν on they are less than a determinable negative rational quantity −ρ. In the first case, I say that b is equal to zero, in the second, that b is greater than zero or positive, in the third that b is less than zero or negative.
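The three cases in the passage just quoted can be spelled out with explicit quantifiers. The rendering is a sketch: ε and ρ range over positive rationals, N over positive integers, and the quantifier notation is a modern gloss, not Cantor's own.

```latex
\begin{align*}
b = 0 &:\iff \forall \varepsilon > 0\ \exists N\ \forall \nu \ge N:\ |a_\nu| < \varepsilon \\
b > 0 &:\iff \exists \rho > 0\ \exists N\ \forall \nu \ge N:\ a_\nu > \rho \\
b < 0 &:\iff \exists \rho > 0\ \exists N\ \forall \nu \ge N:\ a_\nu < -\rho
\end{align*}
```

For a sequence that satisfies the fundamental-sequence requirement, exactly one of the three conditions holds. This is presumably what is meant by saying that the cases can be rigorously deduced from the concept of such a sequence.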
Definitions of the relations of identity, greater than and less than for two numbers b and b′ are provided only after the sum and the difference b ± b′ as well as the product b · b′ have been defined. Cantor stipulates that b = b′ or b > b′ or b < b′, depending on whether b − b′ is equal to zero or greater than zero or less than zero [5, p. 186]. It seems obvious that he does not intend to set up definitions of the relations of being identical with, being greater than or being less than for two numerical signs “b” and “b′”; for in this case his definitions would be nonsensical. Not surprisingly, it is by invoking his principles of definition that Frege raises objections to all three groups of Cantor’s definitions. In [21, § 69], he purports to have unmasked the definitions of the first group as flawed, on the grounds that the definienda are not simple. In fact, the definienda contain the words “greater” and “less”, with which acquaintance prior to the act of framing the definitions must be assumed. Hence, according to Frege, this offends against his two principles of definition. He writes ([21, § 69]; see also [21, §§ 77 f.]): But acquaintance with the words “zero” and “equal” must also be assumed; and then the expressions “equal to zero”, “greater than zero”, and “less than zero” are completely known and must not be explained again. If they were not [completely known], then the previous definitions would have been incomplete – a violation of our first principle of definition.
For someone who is prepared to endorse Frege’s theory of definition, in particular his prohibition on piecemeal definitions, these criticisms would certainly go through.
Frege goes on to take Cantor to task for his definitions of the elementary operations. Among other things (cf. [21, §§ 79 f.]), he objects that the expressions “sum”, “difference”, and “product” are explained through themselves. Since they had thus been explained only incompletely until now, his principle of completeness was breached. Cantor is rebuked for having passed something off as a definition that he would have needed to prove as a theorem. Frege also deals with the third group of definitions in great detail (cf. [21, §§ 81-83]). His critique essentially boils down to the point that in the definitions he is considering the expressions “equal”, “greater”, and “less” are shifted back and forth between being known and being unknown (p. 94). However, in this way, he thinks, his principle of completeness is infringed. Furthermore, he makes it clear that a definition can never be used to define two things. In the present case, this would mean: the greater-than relation and the irrational numbers. The reason is that every attempt to do this ignores the principle of simplicity of the definiendum. Regarding Cantor’s definitions of sum, difference, product, being equal to, being less than, and being greater than, Frege jettisons them on the grounds that they have a kind of Protean ring to them (note that he uses a different, though somewhat related metaphor, [21, § 82]): What first presents itself as an explanation of the signs “+”, “>”, etc., claims, in the next instance, to determine more exactly that which, according to Cantor, should be assigned to the fundamental sequences. However, this deception is only possible due to the fact that those signs are now, once again, considered to be known. Thus, those definitions shimmer in two colours, by sometimes defining the sum, the product, being greater than, etc., and by sometimes being intended to determine the new numbers. But this is incompatible.
In any event, Frege’s complaint that Cantor does not always clearly distinguish between the sign and its denotation or reference appears to be justified. To see this, let us consider a passage from Cantor’s ‘Bemerkung mit Bezug auf den Aufsatz: Zur Weierstraß-Cantorschen Theorie der Irrationalzahlen’ of Illigens (1889) [5, p. 114]25 on which Frege likewise comments. . . . but I have never asserted, nor has anyone else ever asserted that the signs b, b′ , b′′ , . . . are concrete quantities in the literal sense. As abstract thought objects they are only quantities in the non-literal or figurative sense. What must be considered decisive here is that, as anyone familiar with my theory already knows, with the help of these abstract quantities b, b′ , b′′ , . . . concrete quantities in the literal sense, for example, geometric distances, etc., can be quantitatively determined in a precise manner.
I find this a little hard to follow. To begin with, Cantor refers here to b, b′ , b′′ etc., as signs, and, in the same breath, as abstract thought objects 25 It is a short reply to the criticism that E. Illigens [34, pp. 155-160] levelled against Cantor’s theory of irrational numbers.
or as abstract quantities. At the beginning of his ‘Bemerkung’, he even calls b, b′ , b′′ irrational number concepts.26 Cantor distinguishes between quantities in the literal sense and abstract quantities in the figurative sense but, as the use of the word “figurative” may already indicate, he seems to acknowledge only the former as quantities in the proper sense of this word. To my mind, the distinction, vague as it is, makes little sense. It seems to me that the talk of certain signs as abstract quantities is only a façon de parler on which nothing should be grounded. Moreover, I fail to see how with the aid of certain signs conceived of as abstract quantities one could bring it about to determine concrete quantities quantitatively in a precise manner. Frege rightly objects to Cantor’s explanation that it would indeed take a strong faith to construe signs qua physical objects as abstract thought objects. At the same time, he grants that Cantor probably considered the signs “b”, “b′ ”, “b′′ ”, etc. to refer to abstract thought objects. It goes without saying that signs written on a piece of paper with pencil or on a blackboard with chalk cannot be regarded as abstract objects. By contrast, a sign construed as a sign type can and even must be considered an abstractum. Thus, if in the preceding quotation Cantor had the sign qua type rather than the sign qua token in mind, then Frege’s complaint would, at least prima facie, lose some of its force. Nevertheless, my hunch is that Cantor meant in fact the sign qua token. I even doubt that he was aware of the type-token distinction (was Frege aware of it?), but these are issues that we cannot and need not decide here. In any case, no matter how a sign is construed – as type or as token – it remains that, taken by itself, it cannot do the job of determining quantities such as geometric distances, time spans, electric charges, light-intensities, and so on in a precise manner. 
26 It is not surprising that Cantor and other contemporaries of Frege’s do not distinguish between concept and object as clearly and systematically as Frege does in his work after 1891, if they draw any such distinction at all. Cantor characterizes individual cardinal numbers and order types as general concepts. In the third of his three explanations of the terms “power” and “cardinal number” (which he apparently takes to be synonymous) in ‘Mitteilungen zur Lehre vom Transfiniten’ [5, pp. 411 f.], he speaks also of a set whose elements are particular, well-distinguished abstract concepts. It should be possible, he says, to conceive as objects not only concrete things qua elements of a set, but also abstract concepts qua elements of a set [5, p. 420]. Unfortunately, Cantor does not give clear examples of concepts that he considers to be abstract. However, I assume that he would classify, for example, the concepts cardinal number and order type as abstract ones. One is inclined to point out here that all concepts are, by their very nature, abstract entities and not only those concepts under which abstract objects fall or, let us say, those higher order concepts under which concepts fall.
My (partly) conciliatory proposal is then this: when Cantor refers to b, b′, b′′, . . . (without using quotation marks) as abstract quantities, he means, in contrast to the wording he chooses, that the numbers b, b′, b′′, . . .
or the objects designated by “b”, “b′ ”, “b′′ ”, . . . are abstract thought objects with the help of which concrete quantities can be quantitatively determined.27 If this is correct, then by straightening out his terminology Cantor could have escaped blatant incoherence. Frege writes [21, p. 86]: Now if by the expression “abstract thought object” Cantor understands that what we call logical object, then there seems to be perfect agreement between us. Yet it is too bad that these abstract objects do not occur at all in Cantor’s explanation! We have fundamental sequences and signs b, b′ , etc. We cannot, with the best will in the world, consider these to be abstract thought objects, nor can the fundamental sequences be meant.
27 In the case of individual numbers, Cantor also speaks of number concepts. Notice in this connection that he does not clearly distinguish between concept and object, let alone along Fregean lines.
28 I say this although Cantor’s concept of an abstract thought object is not as clear as it should be. In fact, it is less clear than Frege’s concept of a logical object. It is true that Frege says relatively little about this concept. With the exception of the two truth-values, he introduces logical objects by way of what I call logical abstraction (see [48, pp. 172-178]; concerning Frege’s view of logical objects see also [49]). In [18], cardinal numbers are tentatively introduced via Hume’s Principle – which at a later stage of his logical-mathematical investigation into the concept of cardinal number (cf. § 73) is very sketchily and incompletely “derived” from the explicit definition of the cardinality operator – and in § 3 of [20] courses-of-values by means of an informal semantic stipulation later to be embodied in the formal version of Basic Law V. Like Hume’s Principle, Basic Law V is a logical abstraction principle of second order; both the second-level equivalence relation of equinumerosity between two first-level concepts and the second-level relation of coextensiveness between two (monadic) first-level functions can be defined in second-order logic. Yet unlike Hume’s Principle, Axiom V is acknowledged by Frege to be a primitive law of logic. As far as Cantor’s notion of an abstract thought object is concerned, I presume that in general he construed an abstract object in accordance with the customary view as one that is non-spatial and non-temporal and, hence, not capable of involvement in causal or physical interaction. Cardinal or ordinal numbers, for example, are abstract thought objects for Cantor. If we cast a glance at his description of how one may arrive at, let us say, the cardinal number $\overline{\overline{M}}$ of a given set M, we may gain approximately an idea of what he means by an abstract thought object. Cantor construes $\overline{\overline{M}}$ as a definite set, comprised of nothing but units, which exist in our mind as an intellectual copy or a projection of M. $\overline{\overline{M}}$ is obtained by carrying out the process of abstraction from both the nature of the elements of M and their order. To be sure, Frege would have refrained from calling numbers, or more generally courses-of-values, abstract thought objects. Note also his disdain for Cantorian abstraction as a variety of psychological abstraction.
According to Cantor, we have the fundamental sequences and the numbers b, b′, etc. assigned to them, and the latter should be defined through the former. However, as we have already agreed upon, for Cantor the numbers b, b′, etc. are in fact abstract thought objects. I hasten to add that his notion of an abstract thought object does not match Frege’s concept of a logical object (under which Frege subsumes numbers, courses-of-values, and the two truth-values).28 In particular, the question arises as to why the fundamental sequences themselves could not have been meant by the
abstract thought objects. As to Cantor’s fundamental sequences, Frege rightly conjectures in another place that they are taken to consist of abstract thought objects (cf. [21, § 86]). However, I fail to see why the fundamental sequences themselves could not be classified correctly as abstract thought objects. Plainly, a sequence or set of abstract objects is likewise to be considered an abstractum. In [21, § 77], Frege calls into question the idea that the numbers assigned to fundamental sequences are signs. The following view seems more plausible to him [21, p. 89]: Related to each fundamental sequence there is a certain number that need not be a rational. These numbers are thus, to some extent, new and not yet considered, and should be determined by the fundamental sequences to which they are related. The sign ‘b’, then, does not designate the fundamental sequence, but rather the number related to it. Hence, this number is, itself, not a sign, but rather that which Cantor calls an abstract thought object.
Frege recognizes the weak point of the view that he attributes to Cantor. It is the fact that the assignment of a number to a fundamental sequence and the definition of a new number “are contracted into one act”. Surely, one can assign a previously defined number to a given fundamental sequence, but one cannot assign a determinate number to it that has yet to be defined through it. It is time to summarize Frege’s critique of Cantor’s theory of irrational numbers (cf. [21, § 84]). Frege distinguishes between two views. According to the first, the numbers assigned to fundamental sequences are signs; according to the second, they are abstract thought objects. In the first case, the correlation of numbers qua signs with certain fundamental sequences is said to be inessential. In Frege’s view, Cantor disposes only of his fundamental sequences, while the ratios of quantities are lacking. “First we must know the ratios of quantities, the real numbers; then we may discover how we can determine the ratios by means of the fundamental sequences” [21, § 76]. Frege adds that Cantor’s theory is by no means purely arithmetical since its application to geometry is crucial for it. That is to say: Cantor considered it essential that with the help of abstract quantities b, b′ , b′′ , . . . concrete quantities, such as geometrical distances, be precisely determinable. Cantor’s introduction of abstract numerical quantities is purely arithmetical, but is said to miss the decisive point. The description of how one could determine distances through numerical quantities includes the decisive point, but is alleged to be not purely arithmetical. According to Frege, in the second case we do not succeed in grasping (fassen) the numbers assigned to the fundamental sequences qua abstract thought objects. One cannot assign the numbers to the sequences until one is in possession of these numbers. Frege’s critique of Cantor’s definitional practice need not be repeated here.
Thanks to his introduction of the fundamental sequences and the three forms of definition of the real numbers mentioned earlier, Cantor arrives at the following theorem (I): If b is the number determined by a fundamental sequence $(a_\nu)$, then $b - a_\nu$ with increasing $\nu$ becomes less in absolute value than any conceivable rational number, or, what amounts to the same thing: $\lim_{\nu=\infty} a_\nu = b$.
Cantor underscores that in his definition of the real numbers the number b is not defined as the limit of the members $a_\nu$ of a fundamental sequence $(a_\nu)$. If this were the case, he would be committing the logical error of presupposing the existence of the limit $\lim_{\nu=\infty} a_\nu$. According to Cantor, the situation is precisely this: his previous definitions have assigned to the concept b such properties and such relations in which it stands to the rational numbers that from this we can infer with logical evidence that the limit $\lim_{\nu=\infty} a_\nu$ exists and is equal to b.
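Cantor’s point can be illustrated with a small computational sketch. The particular sequence used below (the rational iteration that replaces a member a by (a + 2/a)/2, starting from 1) is an assumption chosen for illustration and does not occur in Cantor’s text; the names `fundamental_sequence`, `gaps` and `residuals` are likewise hypothetical. The sketch inspects only finitely many members and therefore proves nothing. It merely shows that the members can be given entirely within the rationals, that successive members draw together, and that no inspected member has the square 2, so that whatever number the sequence determines is not among its members.

```python
from fractions import Fraction


def fundamental_sequence(n_terms):
    """Return the first n_terms members of an illustrative sequence of rationals.

    Each member is obtained from its predecessor a by (a + 2/a) / 2, so every
    member is an exact rational number (no floating point is involved).
    """
    a = Fraction(1)
    terms = []
    for _ in range(n_terms):
        terms.append(a)
        a = (a + 2 / a) / 2
    return terms


terms = fundamental_sequence(6)

# Distances between successive members: these shrink as the index grows.
gaps = [abs(terms[i + 1] - terms[i]) for i in range(len(terms) - 1)]

# Distance of each member's square from 2: never zero for a rational member.
residuals = [abs(t * t - 2) for t in terms]

for index, (term, residual) in enumerate(zip(terms, residuals), start=1):
    print(index, term, float(residual))
```

Only exact rational arithmetic is used, so nothing in the sketch presupposes the number that the sequence is meant to determine; this mirrors, in a very modest way, Cantor’s insistence that b is not introduced as an already existing limit.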
Cantor holds that the irrational number, in virtue of the property ascribed to it by the definitions, has just as definite a reality in our minds (he calls it intrasubjective or immanent reality) as the rational numbers. We need not acquire an irrational number through a limiting process, but, on the contrary, “by possession of it we become convinced of the practicability and evidence of limiting processes in general” [5, p. 187]. This reflection leads Cantor to the following extension of theorem (I): (II) If $(b_\nu)$ is any set of rational or irrational numbers with the property that $\lim_{\nu=\infty} (b_{\nu+\mu} - b_\nu) = 0$ (for any $\mu$), then there is a number b determined by a fundamental sequence $(a_\nu)$ such that $\lim_{\nu=\infty} b_\nu = b$.
From (I) and (II) it emerges that the same numbers b that are defined on the basis of fundamental sequences $(a_\nu)$ – Cantor calls them fundamental sequences of the first order – and which are defined as limits of the $a_\nu$, can also be represented in various ways as limits of fundamental sequences $(b_\nu)$, where each of the $b_\nu$ is defined by a fundamental sequence of the first order $(a_\mu^{(\nu)})$ (with fixed $\nu$) [5, p. 188]. Accordingly, Cantor calls such a set $(b_\nu)$, if it has the property that $\lim_{\nu=\infty} (b_{\nu+\mu} - b_\nu) = 0$ (for any $\mu$), a fundamental sequence of the second order. Furthermore, one may construct not only fundamental sequences of the third, fourth, . . . , nth order, but also fundamental sequences of the αth order, where α is any number of what Cantor refers to as the second number class.29 All fundamental sequences of higher order do the same job for the determination of a real number b as do the fundamental sequences of the first order. Hence, by appealing to fundamental sequences of the higher order we do not introduce numbers which could not already have been determined through the fundamental sequences of the first order.30
29 Cantor’s first number class is the set of all finite ordinal numbers {ν}, which has the type ω (the smallest transfinite ordinal number). The second number class $Z(\aleph_0)$ is the set {α} of all order types α of well-ordered sets of cardinality $\aleph_0$ and thus the set of all transfinite denumerable ordinal numbers. (ω is thus the smallest number of the second number class.) The type of $Z(\aleph_0)$ is the smallest non-denumerable ordinal number, its power the second smallest transfinite cardinal number $\aleph_1$ (cf. [5, pp. 325, 331]).
30 See Cantor’s explanation [5, pp. 188 ff.] of why he thinks that his definition of the real numbers is suitable. Cf. also § 10, entitled ‘Die in einer transfiniten geordneten Menge enthaltenen Fundamentalreihen’ (‘The fundamental sequences contained in a transfinite ordered set’), of his ‘Beiträge zur transfiniten Mengenlehre’ [5, pp. 307 ff.].
4 Russell on quantities and real numbers in Principles of Mathematics and Principia Mathematica
In [39, p. 285], Russell attempts to undermine Cantor’s claim that his theory of irrational numbers renders his theorem (I) (to which I referred above) strictly demonstrable. According to Russell, Cantor’s supposed proof is fallacious precisely because it fails to show that a rational can be subtracted from a real number. Russell makes it clear that connected with every rational number a there is a real number defined by the fundamental sequence whose terms are all identical with a; if b is the real number defined by a fundamental sequence $(a_\nu)$, and if $b_\nu$ is the real number defined by a fundamental sequence whose terms are all equal to $a_\nu$, then $(b_\nu)$ is a fundamental sequence of real numbers whose limit is b. Russell points out that, contrary to what Cantor assumes, we cannot infer from this reasoning that $\lim a_\nu$ exists. Asserting the existence of $\lim a_\nu$ is justified only if $(a_\nu)$ has a rational limit. “The limit of a series of rationals either does not exist, or is rational; in no case is it a real number. But in all cases a fundamental series of rationals defines a real number, which is never identical with any rational” [39, p. 285].
Russell [39, pp. 270 ff.] defines the irrational numbers as classes of segments of rational numbers. According to his theory, we can define, with respect to a given rational number r, four infinite classes of rationals: (1) those less than r, (2) those not greater than r, (3) those greater than r, (4) those not less than r. Classes of rationals that have the property of (1) are called segments. A segment of rationals is defined as a class of rationals which is not empty, nor yet coextensive with the rationals themselves, and which is identical with the class of rationals x such that there is a rational y of the said class such that x is less than y. It is in this connection that Russell refers to G. Peano’s work Formulaire de Mathématiques (vol. II, 1899, part III, § 61). Moreover, in his letter to Frege of 20 February 1903, Russell observes that Frege’s criticism of the arithmetical theory of irrational numbers in [21] seems sound. He adds that he himself has a purely arithmetical theory which is free of logical errors: “Let k be a class of rational numbers; I then call the class of all rational numbers smaller than at least one member of k the real number determined by k. Some hints of this theory are to be found in Peano” [25, p. 237]. In saying this, Russell refers first and foremost to Peano’s essay ‘Sui numeri irrazionali’ (Rivista di Matematica 6 (1896-99), pp. 126-140). Russell [39, p. 270] observes that in ‘Beiträge zur Begründung der transfiniten Mengenlehre’ (Mathematische Annalen 46 (1895), pp. 481-512) Cantor comes very close to his own theory of real numbers. As to Peano, Russell complains [39, pp. 274 f.] that Peano’s way of characterizing the relation between segments and irrational numbers lacks clarity. Another point he makes is that Peano goes astray in construing the real numbers as the limits of classes of rationals: a segment is in no sense a limit of a class of rationals [39, pp. 274 f.]. Although Peano maintains that a complete theory of irrational numbers can be constructed by appeal to segments, he does not seem to be aware of the philosophical reasons why this must be done. (See Russell’s arguments for the necessity of supplying such a construction in chapter XXXIV of his [39].)
There is an interesting comment by Frege on Russell’s theory of irrational numbers in [39]. In his letter of 21 May 1903 to Russell, Frege recognizes this theory as “logically unassailable”, but makes one important proviso: it must be guaranteed that the word “class” has been given a proper meaning. From Frege’s point of view, this means that classes are to be conceived of as logical objects in accordance with his own conception of logical objects, and thus not as aggregates, systems or wholes consisting of parts (cf. [20, pp. 1-3]; [23, pp. 104 f.; 193 ff.]; [25, pp. 222 f., 225]).
He attributes the latter conception to Russell and argues that it does not allow a logical foundation for arithmetic. He writes [25, p. 239]: When you define an irrational number as a class of rational numbers, it is, of course, something different from what I call an irrational number according to my definition, although there is naturally a connection. It seems to me that you need a double transition: (1) from numbers to rational numbers, and (2) from rational to real numbers in general. I want to go at once from numbers to real numbers as ratios of quantities.
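Russell’s definition of the real number determined by a class k, as quoted from his letter of 20 February 1903, lends itself to a brief sketch. The choice of k (the positive rationals whose square is less than 2) and the names `in_k`, `witness` and `in_segment` are assumptions made for illustration only; they are not Russell’s. For this particular k, membership in the determined class can be decided by exhibiting, where possible, a member of k that exceeds the given rational.

```python
from fractions import Fraction


def in_k(q):
    """Hypothetical class k: the positive rationals whose square is below 2."""
    return q > 0 and q * q < 2


def witness(x):
    """Return a member q of k with x < q, or None if no such member exists."""
    if x <= 0:
        # 1 belongs to k and exceeds every non-positive rational.
        return Fraction(1)
    if x * x < 2:
        # (2x + 2) / (x + 2) is larger than x, and its square is still below 2.
        return (2 * x + 2) / (x + 2)
    # Here x is positive with square at least 2, so x exceeds every member of k.
    return None


def in_segment(x):
    """Membership in the class of rationals smaller than at least one member of k."""
    return witness(x) is not None


print(witness(Fraction(7, 5)), in_segment(Fraction(3, 2)))
```

Since every witness returned for a positive argument itself belongs to k, the determined class contains, for each of its positive members, a greater member. It is non-empty and does not contain every rational, which agrees with the conditions that the text above reports for a segment. Nothing in the sketch bears on Frege’s proviso about the meaning of the word “class”.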
In Part VI, entitled “Quantity”, of volume III of Principia Mathematica (1913), Russell presents his mature theory of quantities and real numbers, which he had worked out in collaboration with Whitehead. As to their treatment of quantities, they proceed from a simple guiding principle: “No quantity of any kind without a comparison of different quantities of that kind” (p. 261). In section B, Russell and Whitehead concern themselves with “kinds” of quantity: masses, spatial distances, velocities. They regard each kind of quantity as what they call a “vector-family”. A vector-family is a class of
one-one relations all of which have the same converse domain and, moreover, have their domain contained in their converse domain. Russell and Whitehead argue that in a case that relates to spatial distances, the applicability of this view is obvious; concerning masses, the view is said to become applicable by considering, for example, one gramme as + one gramme, that is, as the relation of a mass m to a mass m′ when m exceeds m′ by one gramme. What is commonly called one gramme will then be the mass which has the relation + one gramme to the zero of mass. Section C is dedicated to measurement, that is, to the discovery of ratios (see the definition of ratios *303.01 on p. 260) or of the relations expressed by real numbers, between the members of a vector-family. A vector-family is measurable if it contains a member T (the unit) such that any other member S stands to T in a relation which is either a ratio or a real number. Section D is concerned with cyclic families of vectors, such as angles or elliptic straight lines. When Russell and Whitehead come to consider the real numbers as opposed to ratios, they point out that the former are said to be required first and foremost to obtain a Dedekindian series, so as to secure limits to sets of rationals having no rational limit. If rationals and irrationals are to form one series, it is necessary to give some definition of “rationals” other than “ratios”, since the series of ratios (assuming the acceptance of the axiom of infinity) is not Dedekindian, and is not part of any arithmetically definable Dedekindian series. Whitehead and Russell stress (p. 316, *310) that the properties which real numbers must possess from their point of view will be forthcoming if they are identified with segments of H (see the explanation below), and if segments of the form H′X, that is, segments which have ratios as limits, will be termed ‘rational real numbers’.
They likewise emphasize in this context that hardly any of the properties of real numbers can be proved without acknowledging the axiom of infinity. “Thus H ′ X is the rational real number corresponding to the ratio X, and a real number in general is of the form H ′′ λ, where λ is a class of ratios. H ′′ λ will be irrational when λ has no limit or maximum in H” [56, p. 316]. Note that for the relation “less than” among rationals of a given type (excluding 0q ), Whitehead and Russell use the letter “H”, “to suggest η . . . because, if the axiom of infinity holds, the series of rationals of a given type is an η” (p. 278); cf. the definition of “H” on p. 278, *304.02. Following Cantor, the authors use “η” for the class of rational series (cf. the definition *273.01 on p. 202) and define a rational series as one which is compact, has no beginning or end, and has ℵ0 terms in its field. Consequently, the field of rational series can be arranged in a progression, and this is the source of the special properties by means of which rational series can be distinguished from other compact series (cf. p. 199).
5 Quantities and real numbers in Grundgesetze
5.1 Informal considerations
I shall now turn to Frege’s theory of real numbers in [21]. Going through its details will provide an appropriate idea of his sustained efforts; the theory did not just fall into his lap. In §§ 157-164, we find a series of preliminary considerations, followed by the formal construction of analysis under the heading “Grössenlehre” (“theory of quantity”). In section A, Frege proves the associative and commutative laws for the composition of extensions of relations. The latter, unlike the former, are not generally valid. Section B provides the definitional introduction of the quantitative domain, of the concept positival class, as well as the derivation of several theorems – among others, devoted to greater than and less than in a positival class. In section Γ, Frege defines the upper limit in a positival class; in section ∆, he defines the concept positive class and proves Archimedes’ Axiom. Section E contains the proof of the commutative law in a positive class. Finally, in section Z, Frege proves the commutative law in the domain of a positive class. Having completed this proof, Frege rather abruptly breaks off the logical construction of analysis, which he apparently had hoped to bring to a happy ending in a third Grundgesetze volume. The reason is Russell’s discovery of a contradiction in Frege’s logical system. In a first preliminary reflection, Frege discusses the reason why, in his view, the domain of the cardinals cannot be extended to that of the reals. The cardinal numbers are not ratios, and must therefore be distinguished from the positive integers. Surely, we use cardinal numbers to count, and the domain of what is countable is, according to Frege, the widest domain of all. In fact, he considers it to be all-embracing, because everything thinkable belongs to it.
Cardinal numbers, by their very nature, provide answers to questions of the type “How many objects of a certain kind are there?” By contrast, the reals are to be construed as ratios of quantities;31 they measure how large a given quantity is compared with a unit quantity. Thus, in Frege’s view, the mode of application of the reals differs fundamentally from that of the cardinals.32 And just as in [18] he attempted to account for the application of the natural numbers in counting in their definition, so in [21] he takes pains to ensure that the application of the real numbers in measurement is appropriately built into their definition.
31 In [21, § 157 (footnote 2)] Frege approvingly mentions Newton when he emphasizes again that he has construed the real numbers as ratios of quantities and has thus pointed to the quantities as those objects between which such a ratio holds.
32 Currie [8, p. 349] claims that the standpoint of [20, 21] presupposes a sharp distinction – both methodological and ontological – between the natural and the real numbers. Only the first half of this claim is correct. According to Frege, both cardinal numbers and real numbers are to be defined as courses-of-values, the cardinals as equivalence classes of equinumerosity and the reals as Relations of Relations.
It is noteworthy that the f -relation in which a cardinal number stands to its successor differs from the relation ξ + 1 = ζ. The former only holds between cardinal numbers, while the latter holds also between numbers other than positive integers. Hence, the formula “aS(bSf )” – “b directly follows a in the series of cardinal numbers” – cannot be replaced by “a+1 = b”. By inverting ξ + 1 = ζ, guided by the positive integers, we can go back via zero to the negative numbers. However, going back beyond the cardinal number 1 is impossible. This motivates Frege to distinguish between the cardinal numbers 0 and 1 and what he simply calls the numbers 0 and 1.33 Frege likewise stresses that his way of introducing the real numbers cannot be grounded on geometrical configurations.34 Taken literally, this remark goes practically without saying, for if the theory of real numbers rested intrinsically on geometrical constructions or configurations, then the logicist thesis could hardly apply to analysis, contrary to Frege’s own express opinion. To be sure, both the second and an envisioned third Grundgesetze volume were designed to establish the logicist thesis for the theory of real numbers as well, bearing in mind that Frege presumably considered cardinal arithmetic to be the paradigm case and at the same time the acid test for logicism. If the real numbers are construed as ratios of quantities, then they could not be distances, for example. For it is imperative to distinguish between a distance and the measurement number that belongs to it in proportion to a unit distance. A numerical symbol does not denote a distance; it does not refer to anything geometrical. The same ratio of quantities associated with distances is also present in other types of quantities, for example, in masses, angles, volumes, electric charges, light-intensities, time periods, velocities, moments of inertia, forces, curvatures, etc. 
33 Frege introduces here a special notation for the designation of cardinal numbers. I shall ignore it henceforth.
34 “For if arithmetical sentences can be proved independently of geometrical axioms, then they must be so proved” [21, § 158].
Thus, the application of the real numbers is not restricted to any special kinds of quantity (for example, to geometrical ones), but rather relates to the domain of the measurable, which encompasses all kinds of quantity whatsoever. In the course of discussing Cantor’s theory of irrational numbers, Frege accordingly mentions two advantages that the conception of real numbers as ratios of quantities can lay claim to [21, p. 85]: If we take a closer look, we note that a numerical sign, taken by itself [für sich allein], cannot denote a length, a force etc., but only in connection with the designation of a measure, a unit, such as a metre, a gram, etc. What, then, does the numerical sign, taken by itself, [thereby] denote? Obviously a ratio of quantities . . . If now by ‘number’ we understand the reference [Bedeutung] of a numerical sign, then a real number is the same as a ratio of quantities. Now what have we gained by defining real number as ratio of quantities? [The second emphasis is mine.] At first it seems only that one expression has been replaced by another. This is, nevertheless, a step forward. Firstly, no one will confuse a ratio of quantities with a written or printed sign; and thus a source of countless misunderstandings and errors is blocked. Secondly, with the expression ‘ratio of quantities’ or ‘ratio of a quantity to a quantity’ we indicate the mode in which real numbers are linked with quantities. Of course, the main work remains to be done. Initially, we have merely words that indicate to us only approximately the direction in which the solution is to be sought. The meaning [Bedeutung] of these words has yet to be fixed more precisely. But we shall even from now on no longer say that a number or numerical sign denotes, now a length, now a mass, now a light-intensity. We shall say, rather, that a length can have to a length the same ratio as a mass has to a mass, or as a light-intensity has to a light-intensity; and this same ratio is the same number and can be denoted by the same numerical sign.
Two comments may be in order here. (a) To note that a numerical sign, taken by itself, cannot denote a length, etc. does not require that we look more carefully at its use; this fact is obvious. In any event, the initial statement in the quoted passage does not imply that a numerical sign, taken by itself, does not denote anything. Thus, in the relevant context it is consistent for Frege to claim further that a numerical sign, taken by itself, denotes a ratio of quantities (a number). Now, the second sentence in the original German reads as follows: “Was bedeutet nun dabei das Zahlzeichen allein?” (My emphasis; I render here the word “allein” in the sense of the phrase “für sich allein” .) At least at first glance it is not clear what the word “dabei” is intended to mean in this context. Is it to mean “in Verbindung mit der Bezeichnung eines Maasses. . . ” (“in connection with the designation of a measure. . . ”) – which is the very phrase that Frege employs in the preceding sentence – or is it merely a dispensable filler, that is, a word whose sense is not intended to contribute anything essential to the expression of the thought? I tend to vote for the second option. The numerical sign, taken by itself, denotes a ratio of quantities. Plainly, it makes little sense to assume that in the quoted sentence Frege uses the word “dabei” with the intention to refer to the case in which a numerical sign is considered just by itself, since this would amount to asking: What does a numerical sign, taken by itself, denote when taken by itself? 
(b) It is worth pointing out that in raising the question “What, then, does the numerical sign, taken by itself, denote?” Frege appears to ignore his dictum in [21, § 97]: “One may ask about meanings only where the signs are constituents of sentences which express thoughts.” (“Nach Bedeutungen kann nur gefragt werden, wo die Zeichen Bestandtheile von Sätzen sind, die Gedanken ausdrücken.”) This dictum, which Frege states in the course of criticizing Thomae’s game formalism, is immediately reminiscent of the
context principle in [18], especially when this is clad in the garb of a methodological maxim: “The meaning of words must be asked for in the context of a sentence, not in isolation” (p. XXIII). It is chiefly for reasons of space that I do not pursue this conflict further here. Suffice it to observe that Frege did not dismiss the context principle from his mind; it is still in force in [20, 21], although it is no longer highlighted as a guiding principle of his foundational project (cf. [42] and [46]).35 As I mentioned at the very outset of this essay, Frege’s method of introducing the real numbers lies between the traditional geometrical approach and the new theories developed by Cantor, Weierstraß, and Dedekind. Concerning the geometrical foundation, Frege retains the characterization of the real numbers as ratios of quantities, but, following a key idea of the new theories, he detaches them from all special kinds of quantity. The application of the real numbers in measuring quantities may not, of course, simply be externally patched onto them, because we would then have to state separately for each kind of quantity how the measurement is to be carried out, and we would lack general criteria for the applicability of the real numbers as measurement numbers. Frege mentions one serious doubt that might be raised to the middle course he has adopted. If the positive square root of 2 is a ratio of quantities, then it seems indispensable for its definition that it provide quantities which in fact stand in this relation to one another. The question is how this can be accomplished if appeal to geometrical or physical quantities is inadmissible. One still needs a ratio of quantities like the positive square root of 2 because otherwise our use of the symbol “$\sqrt{2}$” would not be justified. In Frege’s view, offering a solution to this difficulty requires that we elucidate the meaning of the word “quantity”.
35 Whenever it is beyond doubt that Frege uses the word “Bedeutung” in the technical sense that he associates with it in his official semantics of the Sinn and the Bedeutung of linguistic expressions, I render it as “reference” (“denotation” might be another option). Note that in his writings after 1892 Frege’s use of the word “Bedeutung” on a few occasions differs from the technical sense which vastly predominates in them. In the logical investigation ‘Der Gedanke’, for example, he writes: “Die Bedeutung des Wortes ‘wahr’ scheint ganz einzigartig zu sein” [23, p. 345]. I believe that in this context “Bedeutung” is used in a neutral, nontechnical sense; it should therefore be rendered as “meaning”: “The meaning of the word ‘true’ seems to be altogether unique.” Note that nowhere in his writings does Frege say anything about the reference of “true”; he confines himself to emphasizing the special, “non-contributory” sense of this word whenever it occurs in standard linguistic environments such as “The thought that p is true” or “It is true that p”. Bearing all this in mind, I decided to translate the second occurrence of “Bedeutung” in the quotation above as “meaning”, because I presume that Frege does not use it here strictly in the technical sense. In this context, “Bedeutung” might even be meant to comprise, possibly in a somewhat loose sense, the two semantic components sense and reference. Analogous remarks apply to Frege’s dictum in [21, § 97].
We are told that all previous attempts to define the term “quantity” have miscarried. It is likely that this claim is meant to include Frege’s
Frege on Quantities and Real Numbers
own earlier suggestion (in [16]) of how the notion of quantity should be explicated, namely by stating the conditions under which identity of quantity holds [16, p. 51]. In a moment, we shall see that one of the criticisms that Frege levels against previous and current attempts to define or explicate the concept of quantity, is that the use of the phrase “of the same kind” proves to be an idle wheel. Yet it is precisely this phrase that occurs in Frege’s early explanation of what type of property a quantity of a certain kind such as length, force, mass or velocity would be, if the determination of the general concept of quantity were to proceed via the formulation of identity conditions for quantities. Recall that it would be a property in which a group of things, independently of their internal structure, can agree with a single thing of the same kind. In short, if in [21] Frege still fully subscribed to this early proposal, it would be hard to understand why he did not explicitly exempt it from his wholesale rejection of all attempts to define “quantity”. Be this as it may, his principal objection is that the term “quantity” is usually explained with the help of another term that stands in equal need of explanation as the explicandum itself. As a consequence, we are left with no better an understanding of the term “quantity” than before. Frege illustrates this by means of some examples, but mentions only Otto Stolz and Hermann Hankel by name. If one looks at the attempted explanations of the term “quantity”, Frege says, one often comes across the word “of the same kind” or “homogeneous” (“gleichartig” ) or the like. It is required of quantities that those of the same kind can be compared, added and subtracted, also that a quantity can be decomposed into parts of the same kind. 
Frege’s objection is that here the phrase “of the same kind” does not explain anything, for things can be of the same kind in one respect, but of different kinds in another (which, to my mind, is almost trivially true).36 Yet the fact that we cannot decide unambiguously whether an object is of the same kind as another goes against his logical requirement of sharp delimitation of a concept or a relation. Others, Frege points out, define the concept of quantity by using the words “greater” and “smaller” or “increase” and “diminish”, but they are said to spare themselves the trouble of explaining in what the relation of being greater or smaller or the activity of increasing or diminishing consists. The same is said to apply to the use of the words “addition”, “sum”, “duplicate”.
36 In the section entitled “The Mathematical-Sublime” of his Kritik der Urteilskraft (Critique of the Power of Judgement) [36, p. 92], Kant says (I slightly paraphrase): That something is a quantity can be recognized by restricting one’s attention to the thing itself, without comparing it at all with other things, namely when plurality of that which is homogeneous constitutes a unit. However, to determine how large it is always requires something else for its measure, which likewise is a quantity.
Matthias Schirn
5.2 Interlude: The concept of quantity/magnitude in the work of Euclid, Aristotle, and Euler – some remarks
In what follows, I shall take a look at the expositions of the concept of magnitude or quantity in the work of Euclid, Aristotle, and Euler and consider briefly Frege’s possible response to them. Frege was undoubtedly familiar with Euclid’s Elements and thus also with book V in which Euclid deals with the concept of magnitude (μέγεθος). It is perfectly possible and perhaps even probable that he also knew the relevant passages in which Aristotle explicates the concept of quantity (ποσόν). Finally, I take it to be rather likely that Frege knew (at least part of) the work Vollständige Anleitung zur Algebra (1770) of the famous mathematician Leonhard Euler. At the very beginning of it, Euler gives an explication of the notion of quantity (Grösse) in three parts. When Frege jettisons across the board all previous and present explanations of the notion of quantity in mathematics, he may also have had in mind Euclid or Aristotle or Euler or even the exposition of that concept by all three of them. To all appearances, Frege greatly admired Euclid’s work on the foundations of mathematics for its methodological rigour and its groundbreaking results. This is already evident from the fact that he unconditionally endorsed Euclid’s theory of geometry and declared his conception of axioms to be sacrosanct. I mentioned earlier (in 2.2) that in Book V of the Elements Euclid does not define the concept of magnitude, but rather analyzes its properties and structure by setting up a group of definitions and by subsequently proving a number of propositions involving the concepts of magnitude, of ratio, of multiple, of proportion, of proportional, etc. At the outset of book V, Euclid states altogether eighteen definitions.
Regarding the very first definition, one may have expected a definition of “magnitude”, but learns instead under which condition a magnitude is a part of a magnitude. Definition 1: Μέρος ἐστὶ μέγεθος μεγέθους τὸ ἔλασσον τοῦ μείζονος, ὅταν καταμετρῇ τὸ μεῖζον. A magnitude is a part of a magnitude, the less of the greater, when it measures the greater.
Definition 3 defines the concept of a ratio: Λόγος ἐστὶ δύο μεγεθῶν ὁμογενῶν ἡ κατὰ πηλικότητά ποια σχέσις. A ratio is a sort of relation in respect of size between two magnitudes of the same kind.
Definition 4 is as follows (it was already quoted earlier in 2.2): Λόγον ἔχειν πρὸς ἄλληλα μεγέθη λέγεται, ἃ δύναται πολλαπλασιαζόμενα ἀλλήλων ὑπερέχειν.
Magnitudes are said to have a ratio to one another which are capable, when multiplied, of exceeding one another.
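In modern notation, Definition 4 is commonly read as an Archimedean condition. The following is a sketch of that reading and not Euclid’s own formulation; the symbols a, b, m, n and the restriction to natural-number multiples are assumptions of the paraphrase:

```latex
a \text{ and } b \text{ have a ratio to one another}
\;\Longleftrightarrow\;
\exists\, m, n \in \{1, 2, 3, \dots\} :\quad m \cdot a > b \ \text{ and } \ n \cdot b > a .
```

On this reading, a magnitude no multiple of which ever exceeds another magnitude has no ratio to it.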
Definition 5 stipulates what it is for magnitudes to be in the same ratio, while Definition 6 introduces the term “proportional” for magnitudes that have the same ratio. Definitions 12–16 are concerned with ratios; they determine the meaning of the terms “alternate ratio”, “converse ratio”, “composition of a ratio”, “separation of a ratio”, “conversion of a ratio”.37 After having set up the group of eighteen definitions, Euclid proceeds to prove twenty-five propositions starting with the proof of the proposition “If there be any number of magnitudes whatever which are, respectively, equimultiples of any magnitudes equal in multitudes, then whatever multiple one of the magnitudes is of one, that multiple also will all be of all” and ending with the proof of proposition 25 “If four magnitudes be proportional, the greatest and the least are greater than the remaining two”. So much for Euclid on the concept of magnitude. Let us now cast a glance at Aristotle’s conception of quantity. Aristotle deals with the concept of quantity (ποσόν, which literally means: how much) briefly in his Metaphysics and more extensively in the Categories. I begin with his exposition in the Metaphysics (1020a). I first quote from the original Greek text and shall then present the English translation by W.D. Ross:38 Ποσὸν λέγεται τὸ διαιρετὸν εἰς ἐνυπάρχοντα ὧν ἑκάτερον ἢ ἕκαστον ἕν τι καὶ τόδε τι πέφυκεν εἶναι. Πλῆθος μὲν οὖν ποσόν τι ἐὰν ἀριθμητὸν ᾖ, μέγεθος δὲ ἂν μετρητὸν ᾖ. λέγεται δὲ πλῆθος μὲν τὸ διαιρετὸν δυνάμει εἰς μὴ συνεχῆ, μέγεθος δὲ τὸ εἰς συνεχῆ· μεγέθους δὲ τὸ μὲν ἐφ' ἓν συνεχὲς μῆκος τὸ δ' ἐπὶ δύο πλάτος τὸ δ' ἐπὶ τρία βάθος. We call a quantity (ποσόν) that which is capable of being divided in two or more constituent parts of which each is by nature a one and a “this”. A quantity is a plurality (πλῆθος) if it is numerable, a magnitude (μέγεθος) if it is measurable.
We call a plurality that which is divisible potentially into non-continuous parts, a magnitude that which is divisible into continuous parts.
Thus, Aristotle considers both the measurability and the divisibility into continuous parts to be defining properties of a quantity qua magnitude. 37 One special aspect concerning Euclid’s treatment of the concept of magnitude in Book V is that it explains how to work with ratios of lengths, areas, etc. without defining numbers which occur in these ratios. 38 Speaking for myself, I consider it essential to include, at least in some places, the original Greek text. As a rule, in my work I rely on original texts whenever I can. One of the reasons is that I never unconditionally trust any translation of a philosophical text. The available standard English translations of Frege’s works, for example, contain numerous mistakes (actually too many for my taste) which distort the meaning of the original. Another more practical reason is that a reader who is interested in casting at least a glance at the original text will presumably appreciate the comfort of having immediate access to it.
Frege does not follow Aristotle in distinguishing between two basic kinds of quantity: plurality and magnitude. For Frege, a quantity is, by its very nature, measurable. I assume that Frege would be reluctant to accept Aristotle’s explanation of “quantity”; he would definitely not adopt it as his own, although Aristotle does not speak of the divisibility (or decomposition) of a magnitude into parts of the same kind, but rather of its divisibility into continuous parts. Unlike the Metaphysics, Aristotle’s work Categories (4b) does not yet provide an explication (definition) of the term “quantity”. Here Aristotle confines himself to distinguishing between discrete and continuous quantities and further between quantities that are composed of parts which have position in relation to one another and quantities that are not composed of parts which have position in relation to one another. He mentions number and language as instances of discrete quantities, and lines, surfaces, bodies, time and place as instances of continuous quantities. In the original text, it reads as follows: Τοῦ δὲ ποσοῦ τὸ μέν ἐστι διωρισμένον, τὸ δὲ συνεχές· καὶ τὸ μὲν ἐκ θέσιν ἐχόντων πρὸς ἄλληλα τῶν ἐν αὐτοῖς μορίων συνέστηκε, τὸ δὲ οὐκ ἐξ ἐχόντων θέσιν. ἔστι δὲ διωρισμένον μὲν οἷον ἀριθμὸς καὶ λόγος, συνεχὲς δὲ γραμμή, ἐπιφάνεια, σῶμα, ἔτι δὲ παρὰ ταῦτα χρόνος καὶ τόπος.
In what follows, Aristotle attempts to explain each of these categories of quantities. A quantity is further characterized quite generally as something that has no contrary and, furthermore, as something that does not seem to admit of a more and a less. Finally, the most important distinguishing mark of a quantity is, Aristotle claims, its being called both equal and unequal. Ἔτι τῷ ποσῷ οὐδέν ἐστιν ἐναντίον . . . Οὐ δοκεῖ δὲ τὸ ποσὸν ἐπιδέχεσθαι τὸ μᾶλλον καὶ τὸ ἧττον, οἷον τὸ δίπηχυ . . . Ἴδιον δὲ μάλιστα τοῦ ποσοῦ τὸ ἴσον τε καὶ ἄνισον λέγεσθαι.
As to the last point, I am not exactly sure what Aristotle has in mind. I presume that he wants to convey that a quantity can be said to be equal or unequal to another quantity of the same kind, as the case may be, and that this is the most significant feature of any quantity and of paramount importance for apprehending the concept of quantity.39 39 In a written comment, Michael Scanlan agrees with me that “τὸ ἴσον” is best rendered as “equal” and not as “identical”. He points out that Aristotle is talking about the amount of things, a group of men or the length of a given line. Here one group of men is equal or unequal to another in number, one line is equal or unequal to another in length. Scanlan suggests that here we have an equivalence relation, not an identity. He writes: “Any time that two objects share some, but not necessarily all, properties, then I prefer to talk of an equivalence relation. One way to do this is to treat identity as one among many other equivalence relations. This is definitionally smoother, but outside a formal system, I think it is clearer to save “equivalence” for talking about relations in which a and b share some, but not all, characteristics. Thus, we have an equivalence relation of sameness in height among
Let me end this interlude with a few words on Leonhard Euler’s concept of quantity. The first sentence of Euler’s work Vollständige Anleitung zur Algebra (1770) begins with an explanation of the concept of quantity: 1. Erstlich wird alles dasjenige eine Größe genannt, was einer Vermehrung oder Verminderung fähig ist oder wozu sich noch etwas hinzusetzen oder wovon sich etwas hinwegnehmen läßt . . . 2. Es giebt sehr viele verschiedene Arten von Größen, welche sich nicht wohl aufzählen lassen; und daher entstehen die verschiedenen Theile der Mathematik, deren jeder mit einer besonderen Art von Größen beschäftigt ist. Die Mathematik ist überhaupt nichts anderes als eine Wissenschaft der Größen, welche Mittel ausfindig macht, wie man letztere ausmessen kann, 3. Es läßt sich aber eine Größe nicht anders bestimmen oder ausmessen, als daß man eine andere Größe derselben Art als bekannt annimmt, und das Verhältniß angiebt, in dem diese zu jener steht.
Here is my English translation: 1. In the first place, everything is called a quantity which is capable of an increase or a decrease or to which something can be added or from which something can be taken away . . . 2. There are very many different kinds of quantities which cannot be properly enumerated; and it is for this reason that the different parts of mathematics emerge, each of which is concerned with a special kind of quantity. Mathematics is indeed nothing but a science of quantities that traces ways in which one can measure the latter. 3. However, a quantity can be determined or measured only by assuming that another quantity of the same kind is known, and by giving the ratio in which the first stands to the second.
As I have already said, it is possible that Frege had also Euler’s explication of the concept of quantity (in section 1 of the quotation above) in mind when he criticized those explanations of this concept in which the terms “to increase” and “to diminish” occur essentially. Note that what Euler says in section 2 is reminiscent of and basically in line with certain remarks of Frege’s in [16]. As to Euler’s section 3, I suppose that Frege would have basically agreed that a quantity can be measured only by giving the ratio in which it stands to another quantity, but would probably have criticized Euler’s use of the words “of the same kind”.40
a group of people, who are all distinct individuals, or in geometry we can think of two distinct triangles that are equivalent with respect to the properties involved in congruence.” Scanlan proposes that Aristotle seems to be thinking of equivalence relations in this latter sense in his use of the word “equal” in the chapter on quantity in the Categories. 40 I assume that an explanation of “quantity” such as the one listed in The Encyclopedia of Philosophy (ed. P. Edwards, vol. 5, Macmillan Publishing Co., New York, London 1967, p. 242): “If one thing can be said to be greater than, equal to, or less than another in a certain respect, then this respect may be called a quantity” would likewise have fallen prey to Frege’s critique.
So much for my comments on Euclid, Aristotle, Euler and Frege’s possible reaction to their views of the notion of quantity. It is time to return to Frege’s treatment of the concept of quantity in [21].
5.3 Informal considerations continued Frege considers the source of the failures concerning the explications of the concept of quantity to lie in the manner in which the fundamental question is posed. Instead of asking: “What properties must an object have in order to be a quantity?”, we ought to ask: “How must a concept be constituted if its extension is to be a quantitative domain?” 41 If we substitute “class” for “extension of a concept”, then the question becomes: “What properties must a class possess in order to be a quantitative domain?” A thing is not a quantity taken by itself, but only in so far as it belongs, with other objects, to a class which is a quantitative domain. For the sake of convenience, Frege expressly disregards absolute quantities. He confines himself to considering only those quantitative domains, in which a contrast [Gegensatz ] occurs, to which the contrast of the positive and the negative corresponds when it comes to dealing with measurement numbers (cf. [21, § 162, p. 159]). It is in this connection that Frege refers approvingly to Gauss and quotes a longer passage from him (Gauss, works, vol. II, p. 170). In it, Gauss proceeds from the observation that positive and negative numbers can only be applied where that which has been counted [das Gezählte] has an opposite or a contrary [ein Entgegengesetztes]. He points out that, on closer examination, this presupposition applies only where it is not substances (that is, objects conceivable for themselves) [für sich denkbare Gegenstände], but rather the relations, each holding between two objects, that are taken as that which has been counted. It is thereby postulated that these objects are, in a certain way, ordered in a series S, for example, A, B, C . . . and that the relation of A to B can be regarded as being equal to the relation of B to C, etc. 
Gauss goes on to say that to the concept of opposition [Entgegensetzung] belongs only the permutation [Umtausch]42 of the members of the relation, such that if the relation of 41 More literally, but perhaps less elegantly: “. . . so that its extension is a quantitative domain.” Another option is: “. . . for its extension to be a quantitative domain.” Frege [21, p. 158, footnote 2] acknowledges that Stolz makes a move in this direction when he writes that a quantitative concept is a concept of such a kind that any two of the objects falling under it are explained as being equal or unequal. Frege argues that in the sentence that immediately follows this explanation (or definition), namely: “In other words: ‘Quantity means every object which should be set equal to or unequal to another object’ ” [“Mit andern Worten: ‘Grösse heisst jedes Ding, welches einem andern gleich oder ungleich gesetzt werden soll ’ ”], shows that Stolz abandons his (initially promising) attempt. 42 I presume that the term “permutation” fits the bill here; “interchange” may be another option.
A to B is considered to be +1, the relation of B to A must be represented as −1. Thus, insofar as such a series S is unbounded on both sides, every real number represents the relation of a member, arbitrarily chosen as the beginning (of the series), to a given member of S. Frege signalizes that he basically agrees with this thought, but adds that (for his own purpose) he leaves out the limitation to the integers and prefers to replace the phrase “that which has been counted” with the phrase “that which has been measured” (“das Gemessene”). Moreover, he points out that unlike Gauss he considers the equality of the relations to be definable without regard to certain objects that may stand to one another in the relation. If a relation is given in which A stands to B, then – so Frege argues further – it is at the same time determined whether B stands to C and C stands to D in this same relation, and this yields automatically an ordering of objects in a series. Let me make a couple of brief remarks on this. First, Gauss employs the term “Relation” , not the term “Beziehung”, but he obviously uses the first in the usual sense of the second. In his comment on Gauss, Frege uses the two terms indiscriminately in the sense of “Beziehung” . It is only a few lines later (in the concluding passage of § 162) that he introduces “Relation” as a shorthand expression for “Umfang einer Beziehung” (“extension of a relation”). Second, we know from Frege’s remarks elsewhere in his work that identity of relations cannot, at least in a strict sense of identity, be defined, contrary to what he claims in his comment on Gauss. The reason is that Frege considers identity to be a relation that holds only between objects. However, he claims (cf. [24, p. 131]) that there is a second-level relation between (first-level) concepts which is akin to the first-level relation of identity, namely the mutual subordination or coextensiveness of first-level concepts. 
Note that in this connection Frege does not mention the analogous case of the coextensiveness of (first-level) relations. In his letter to the American mathematician Huntington which I mentioned earlier in section 2, Frege draws attention to some features of his own theory of real numbers and makes a few critical comments on the theory of magnitudes and real numbers that his addressee had presented in a trio of short articles (see [31, 32, 33]). It is in his theory of real numbers in Grundgesetze that Frege sees the points of contact with Huntington’s foundational work.
I am just now busy with the printing of the second volume of my Basic Laws of Arithmetic, which partly contains some considerations similar to the ones in your papers, especially with respect to Archimedes’ Axiom and the commutative principle, even though our points of departure are different.
Frege goes on to point out that in his forthcoming work he raises the (fundamental) question “What properties must a class have in order to be a quantitative domain?” and adds that there are some points where
his treatment of analysis also diverges from Huntington’s. Moreover, he claims greater simplicity for his account; he writes [25, p. 89]: I too take into account at once the contrast between positive and negative, taking from Gauss the hint that this contrast occurs only among relations. The question now arises in this form: What properties must a class of Relations have in order to be a quantitative domain? This way of putting the question is somewhat simpler than yours, because something corresponding to your ‘rule of combination’ is given from the outset, namely the composition of Relations, which was already defined in the first volume. You take two things into account: the class or ‘assemblage’ and the ‘rule of combination’ in this class, and this rule is not given through the class. This is also a point where, from a logical point of view, your account does not seem to be perfectly correct.
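The two ingredients named in this letter, the contrast of positive and negative located among relations and a composition that is available from the outset, can be illustrated with a toy model. The sketch below is only an illustration under loud assumptions: relations are modelled as Python sets of ordered pairs over a small cyclic universe of twelve positions, and the names UNIVERSE, shift, converse and compose are hypothetical. It is not Frege’s construction, in which Relations are extensions of relations and the underlying series is unbounded.

```python
# Toy model (an assumption for illustration): a relation is a set of ordered pairs.
UNIVERSE = range(12)  # hypothetical finite, cyclic stand-in for a series of objects


def shift(b):
    """Relation in which x stands to the object b places further along (cyclically)."""
    return frozenset((x, (x + b) % len(UNIVERSE)) for x in UNIVERSE)


def converse(p):
    """Converse of a relation: interchange the two members of every pair."""
    return frozenset((y, x) for (x, y) in p)


def compose(p, q):
    """x stands to z if x stands in p to some y and that y stands in q to z."""
    return frozenset((x, z) for (x, y) in p for (y2, z) in q if y == y2)


plus_one = shift(1)
minus_one = converse(plus_one)

# The converse plays the role of the opposite sign.
assert minus_one == shift(-1)
# Composition plays the role of addition.
assert compose(shift(2), shift(3)) == shift(5)
# A relation composed with its converse gives the identity relation (the null case).
assert compose(plus_one, minus_one) == shift(0)
```

Nothing beyond the class of relations itself is needed to state these facts: converse and composition are defined for any sets of pairs, which is the sense in which no separate rule of combination has to be supplied in this toy setting.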
The quantities to be considered by Frege are “Relationen”. Henceforth, I use the term “Relation(s)” with a capital “R” as a shorthand for “extension of a relation” or “extensions of relations”.43 Accordingly, in the quotation above I have rendered Frege’s term “Relationen” as “Relations”, with the only exception of its occurrence in the first sentence where he refers to Gauss. While in [16] Frege somewhat vaguely characterized a quantitative domain as the multiplicity enclosed within the scope of a quantitative kind (for example, length), conceived of as a property in which a group of things can agree with a single thing of the same kind, he now specifies quantitative domains more succinctly and more precisely as classes of Relations, that is, as extensions of concepts subordinate to the concept Relation. Thus, both the quantities themselves and the quantitative domains are now, in pursuit of the logicist programme, taken to be logical objects à la Frege par excellence. Frege lays down that the converse of what he calls “sign” (“Vorzeichen” ) corresponds to the converse of the relation (K and UK). The addition of the measurement numbers corresponds to the composition of Relations (KLΠ). Hence, the symbol “U ” is comparable to the minus sign and “L” is comparable to the sign for addition. The formula “ALUB” corresponds to “a − b”, and the formula “ALUA” corresponds to the null sign. (See the definitions of the converse of a relation Up and the composite relation pLq in section 5.4.) In [21, § 164], Frege addresses the question of where quantities whose ratios are irrational numbers might be found. They will have to be nonempty Relations, that is, the required quantities must not be extensions of those (first-order) relations in which no objects stand to one another. For it is plain that such relations are coextensive; there is only one empty 43 Frege sees no need to introduce a special axiom governing double courses-of-values, and R. Heck [28, pp. 283 f.] 
explains correctly why this is so. Note in this context that the terms for double courses-of-values can be formed by means of the notation available for the “simple” courses-of-values introduced in [20, § 9]; cf. § 36.
Relation. Yet no real number can be defined by means of the empty Relation. If q is the empty Relation, then both the converse of q and the composition of q with its converse coincide with q. Furthermore, the composition of Relations of the quantitative domain under consideration must not yield the empty Relation. However, this would be the case if there were no object ∆ to which an object would stand in the first Relation and which would stand in the second Relation to an object. The upshot so far is obvious: “We thus need a class of objects which stand to one another in the Relations of our quantitative domain, and in fact this class must comprise infinitely many objects” [21, § 164]. Frege observes that the required class must have a cardinality greater than the class of natural numbers (finite cardinal numbers), and draws attention to the fact that the cardinal number belonging to the concept class of natural numbers is, in effect, greater than the cardinal number of the concept natural number. Somewhat surprisingly, Cantor’s proof that for any set M, the cardinality of the power set ℘(M) is greater than the cardinality of M is passed over in silence. (In his short article ‘Über eine elementare Frage der Mannigfaltigkeitslehre’ (1890-91), Cantor had already announced his diagonal argument for proving the result just mentioned; cf. [5, pp. 278 ff.]). Having arrived at this point, Frege sketches his plan for the envisaged introduction of the real numbers. In order to render his exposition more accessible to the reader, he temporarily assumes that the irrational numbers are known. Every positive real number a can be represented in the form
$$r + \sum_{k=1}^{\infty} \frac{1}{2^{n_k}}$$
where r is a positive integer or 0, and $n_1, n_2, \ldots$ form an infinite, monotone increasing sequence of positive integers. To every positive rational or irrational number a there belongs an ordered pair $\langle r, R \rangle$, where r is a positive integer or 0, and R an infinite class of positive integers (class of the $n_k$). If instead of the integers we take cardinal numbers, then to every positive real number there belongs an ordered pair whose first member is a cardinal number and whose second member is a class of cardinal numbers which does not contain the cardinal number 0. Suppose now that a, b and c are positive real numbers and that a + b = c holds. Then for every b there is a relation holding between the pairs belonging to a and to c. This relation is said to be definable without presupposing any knowledge of the real numbers. Thus, we have relations, each of which is again characterized by a pair (belonging to b), to which we add the converses. As Frege further points out, the extensions of these relations (that is, these Relations) correspond single-valuedly44 (eindeutig) to the positive and negative real 44 Here I am indebted to the translators of Frege’s Grundgesetze, Philip Ebert and
numbers; and to the addition of the numbers b and b′ corresponds the composition of the corresponding (or associated) Relations. He eventually observes that the class of these Relations is a domain which suffices for his plan, but hastens to add that it is not thereby said that he will hold precisely to this route. Thus, following Frege’s exposition in [21, § 164] one could define the real numbers as ordered pairs $\langle m, M \rangle$, formed from an integer m and an infinite set M of natural numbers not including 0. If with regard to this representation of the real numbers as ordered pairs $\langle m, M \rangle$ one defines addition (‘+’), the relation less-than (‘ a. This proof has several steps that themselves require proof. Examples are the step that consists in the assertion that if $b \mid (p_1 \cdot p_2 \cdots p_n)$ and $b \mid S(p_1 \cdot p_2 \cdots p_n)$, then b = 1, or the step in which it is asserted that if a|b and a|S(b), then a = 1. Whether or not the Euclidean solution is topically pure thus depends on whether or not the main proof, and all of the subproofs needed to establish the steps of the main proof, draw only on what belongs to the topic of IP. On the face of it, there is nothing unarithmetic about these proofs, and so a favored initial diagnosis is that the Euclidean solution is pure. However, there are good reasons to think this too quick. A first reason for concern about the purity of this solution is that it appeals to multiplication in generating Q, though multiplication was not included in the preliminary specification of IP’s topic. A second reason for concern is that the subproofs have not been given fully, and so appeal to elements foreign to IP’s topic cannot yet be ruled out. A reply to both concerns would be to note that the main proof and each of the needed subproofs can be carried out from the first-order Peano axioms (PA), as can be (tediously) checked.
Provided that the axioms of PA (augmented by definitions of the appropriate ordering and primality) belong to the topic of IP, its sufficiency for expressing the main proof and each subproof answers the second concern, and its inclusion of a definition of multiplication answers the first concern. Thus, provided that the Peano axioms belong to the topic of IP, and that the proofs when carried out in PA remain faithful to the Euclidean proof and subproofs, the Euclidean proof is topically pure. This is not an especially convincing reply, however, because it begs both questions. The sufficiency of PA for the Euclidean solution was not in question; this is indicative of PA’s being widely considered an adequate axiomatization of elementary arithmetic. What is in question is the topicality of the commitments engendered in accepting PA for IP. The reply simply asserts that these commitments are topical for IP, but that is exactly what is being questioned. What is needed is a more fine-grained analysis of the topic of IP, in particular investigation of what operations (divisibility? multiplication? addition?) and modes of inference (classical logic? how much induction?) belong to IP’s topic. To this the essay now shifts.
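For readers who want the steps of the Euclidean solution in concrete form, the following is a minimal numerical sketch. It assumes the usual construction Q = S(p1 · p2 · · · pn); the function names are hypothetical, and the code makes no attempt to respect topical purity, since it freely uses the host language’s remainder and addition operations.

```python
from math import prod


def successor(n):
    """S(n): the next natural number."""
    return n + 1


def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0


def smallest_divisor_above_one(n):
    """Least divisor of n greater than 1 (necessarily prime), for n > 1."""
    d = 2
    while not divides(d, n):
        d += 1
    return d


def euclid_witness(primes):
    """Given a non-empty finite list of primes, return a prime not in the list."""
    q = successor(prod(primes))  # Q = S(p1 * p2 * ... * pn)
    return smallest_divisor_above_one(q)


# Step cited in the text: if a | b and a | S(b), then a = 1 (checked on a small range only).
for b in range(1, 200):
    for a in range(1, b + 2):
        if divides(a, b) and divides(a, successor(b)):
            assert a == 1

print(euclid_witness([2, 3, 5, 7, 11, 13]))  # 30031 = 59 * 509, so this prints 59
```

The finite check is evidence for the divisibility step and not a proof of it; the question pursued in the text is which resources a proof of that step may legitimately draw on.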
Andrew Arana
3.1.1 The topicality of arithmetic operations for IP
The second concern just raised is that the fully spelled-out Euclidean solution may contain elements that do not belong to IP’s topic. When spelling out the solution in PA this is the case, since addition is used in establishing the needed properties of multiplication, while addition is not explicitly mentioned in the problem as formulated. But this is not merely an issue with PA. Any proof that uses multiplication must either take to belong to the definition of multiplication the properties of multiplication that it needs, or prove them on some other basis. If the latter, proof via addition is the obvious choice since multiplication is ordinarily defined as iterated addition (as in PA, for instance). If the former, then some plausible nonadditive definition of multiplication is needed; and moreover some answer will be needed for the reply that multiplication is also not mentioned explicitly in IP. We will return to the issue concerning multiplication; let us for now discuss the additive case. Concerning the use of addition in the Euclidean solution, one response would be to defend the use of addition as topically pure for IP. One could do so on the grounds that addition (and multiplication) are “basic” to understanding the natural numbers, because we are talking about a discretely ordered ring in usual practice, that is, about a structure with both an addition and a multiplication operator. But this seems wrong: Presburger and Skolem arithmetic (with just addition and multiplication, respectively) are just as “basic” as Peano arithmetic. Indeed, children seem to grasp the natural numbers before they understand the concepts of addition and multiplication. The sequence starting with 1 and generated by successors is more plausibly basic (though not necessarily the most basic). Another response to this objection concerning the use of addition in the Euclidean solution notes that the only need for addition in the Euclidean solution is to establish properties of multiplication such as commutativity and associativity. We may thus isolate these properties of multiplication and find a proof directly from them, without adverting to addition.
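The dependence of multiplication on addition under the ordinary definition can be made concrete with a short sketch. It is an illustration under assumptions: the recursion equations are adapted to a number sequence starting with 1, matching the convention just mentioned, whereas standard presentations of PA start from 0; the names S, add and mul are hypothetical.

```python
def S(x):
    """Successor; the host language's + 1 is used only to implement it."""
    return x + 1


def add(x, y):
    """Addition by recursion on y:  x + 1 = S(x);  x + S(y) = S(x + y)."""
    result = S(x)
    for _ in range(y - 1):  # loop counter is bookkeeping, not part of the definition
        result = S(result)
    return result


def mul(x, y):
    """Multiplication as iterated addition:  x * 1 = x;  x * S(y) = (x * y) + x."""
    result = x
    for _ in range(y - 1):
        result = add(result, x)
    return result


assert add(2, 3) == 5
assert mul(3, 4) == 12
assert mul(5, 1) == 5
```

In this sketch mul cannot even be stated without add, which is the dependence that a direct, non-additive treatment of the multiplicative properties has to avoid.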
The following seventeen assumptions are an attempt to do this. They include assumptions regarding successor and the ordering in addition to multiplicative assumptions, in order to yield a set of assumptions sufficient for solving IP without using addition.
Assumption 1 For all x, there exists y such that y = S(x).
Assumption 2 For all x, y, x = y if and only if S(x) = S(y).
Assumption 3 For all x, y, there exists z such that z = x · y.
Assumption 4 For all x, x · 1 = x.
Assumption 5 For all x, y, z, (x · y) · z = x · (y · z).
Assumption 6 For all x, y, x · y = y · x.
Assumption 7 For every sequence of primes p_1, ..., p_n, there exists z such that z = p_1 · p_2 ··· p_n.
Purity in Arithmetic: some Formal and Informal Issues
Assumption 8 For all x, y, z, if x < y and y < z, then x < z.
Assumption 9 For all x, x ≮ x.
Assumption 10 For all x, y, either x < y, x = y, or y < x.
These three assumptions together imply that if x < y, then y ≮ x, and hence that the trichotomy asserted in Assumption 10 is exclusive, i.e. for each x, y, exactly one of x < y, x = y, and y < x holds.
Assumption 11 For all x, 1 ≤ x.
Assumption 12 For all x, y, x < y if and only if S(x) < S(y).
Assumption 13 For all x, x < S(x).
Assumption 14 For all x, y, if x < y then S(x) ≤ y.
Assumption 15 For all x, y, z, x < y if and only if xz < yz.
Assumption 16 For all y ≠ 1 and all x, S(yx) < y · S(x).
Assumption 17 For each formula ϕ(x, y), where x is a free variable and the y are terms, if ϕ(1, y) and if for all a and all b < a, ϕ(b, y) implies that ϕ(a, y), then for all a, ϕ(a, y).
These assumptions may be grouped as follows: Assumptions 1 and 2 concern successor, 3–7 concern multiplication, 8–10 concern the ordering, 11–16 concern how successor and multiplication respect the ordering, and 17 is an induction schema.5
5 We are not claiming that these assumptions are mutually independent of each other.
Next we will give a non-additive solution to IP using these assumptions. To simplify the structure of the main proof, we separate out the following three lemmas and prove them separately.
Lemma 3.1 S(1) is prime.
Lemma 3.2 Every natural number a ≠ 1 has a prime divisor p ≤ a.
Lemma 3.3 For all a, b, if a|b and a|S(b), then a = 1.
Using these lemmas, the main result, that for all a, there exists b > a such that b is prime, can be proved as follows, with the assumptions referred to therein listed afterwards.
1. Either a = 1 or a > 1. [Assumption 11]
2. a) Say a = 1.
   b) By Lemma 3.1, S(1) is prime. [Assumption 1]
   c) S(1) > 1. [Assumption 13]
3. a) Say a > 1.
   b) Let p_1, p_2, ..., p_n be all the primes less than or equal to a.
   c) Let Q = S(p_1 · p_2 ··· p_n). [Assumptions 1, 7]
   d) By Lemma 3.2, Q has a prime divisor b.
   e) i. Suppose b = p_i.
      ii. Then b|(p_1 · p_2 ··· p_n). [Assumptions 5, 6]
      iii. Since b also divides Q = S(p_1 · p_2 ··· p_n), Lemma 3.3 gives b = 1, contradicting the primality of b.
   f) Thus for each i, b ≠ p_i.
   g) Either a < b, or b ≤ a. [Assumption 10]
   h) b ≤ a contradicts that the p_i were all the primes less than or equal to a.
   i) Thus a < b.
Proof of Lemma 3.1, S(1) is prime:
1. For all n, n < S(1), n = S(1), or S(1) < n. [Assumptions 1 and 10]
2. a) Suppose n = S(1).
   b) S(1)|S(1). [Assumption 4]
3. a) Suppose n < S(1).
   b) S(n) ≤ S(1). [Assumptions 1, 14]
   c) If S(n) < S(1), then n < 1, a contradiction. [Assumptions 11 and 12; and 8–10 which imply that only one of the cases of trichotomy of < obtains]
   d) If S(n) = S(1), then n = 1. [Assumption 2]
   e) n = 1.
   f) 1|S(1). [Assumptions 4, 6]
4. a) Suppose S(1) < n.
   b) i. Suppose n|S(1).
      ii. There exists x such that nx = S(1).
      iii. S(1) · x < nx. [Assumption 15]
      iv. S(1) · x < S(1) · 1. [Assumption 4]
      v. x < 1, a contradiction. [Assumptions 11, 15]
   c) Thus, if S(1) < n, then n ∤ S(1).
5. So the only numbers dividing S(1) are 1 and S(1), and so S(1) is prime.
Proof of Lemma 3.2, every natural number a ≠ 1 has a prime divisor p ≤ a:
1. We proceed by strong induction on a.
2. Base case: S(1) is prime by Lemma 3.1.
3. Inductive case:
   a) Suppose that every y < a with y ≠ 1 has a prime divisor p ≤ y.
   b) Either a is prime or composite.
   c) If a is prime, we are finished.
   d) So suppose a is composite, i.e. that there is some b such that 1 < b < a and b|a.
   e) By the inductive hypothesis, b has a prime divisor p ≤ b.
   f) Since p|b and b|a, p|a. [Assumption 5]
   g) Since p ≤ b and b < a, p ≤ a. [Assumption 8]
4. So for all a ≠ 1, a has a prime divisor p ≤ a. [Assumption 17]
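The strong induction in the proof of Lemma 3.2 can be read as a recursion: a prime a is its own prime divisor, and a composite a hands the search to a proper divisor b. The following Python sketch is only an illustration of that reading, not part of the formal argument. The function names are hypothetical, divisibility is tested multiplicatively (a divides b if a · x = b for some x), and Python's built-in integers and ranges stand in for the successor structure.

```python
def divides(a: int, b: int) -> bool:
    """a | b, read multiplicatively: some x with 1 <= x <= b satisfies a * x == b."""
    return any(a * x == b for x in range(1, b + 1))


def is_prime(n: int) -> bool:
    """n != 1 and the only divisors of n are 1 and n itself."""
    return n != 1 and all(d in (1, n) for d in range(1, n + 1) if divides(d, n))


def prime_divisor(a: int) -> int:
    """Mirror the strong induction: return a prime p with p | a and p <= a."""
    if a == 1:
        raise ValueError("1 has no prime divisor")
    if is_prime(a):
        return a  # step 3c: a prime number is its own prime divisor
    # Step 3d: a is composite, so some b with 1 < b < a divides a.
    b = next(b for b in range(2, a) if divides(b, a))
    # Step 3e: the inductive hypothesis becomes a recursive call on b < a.
    return prime_divisor(b)
```

The recursion terminates because each call is made on a strictly smaller argument, which is the computational counterpart of the appeal to Assumption 17.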
Proof of Lemma 3.3, for all a, b, if a|b and a|S(b), then a = 1:
1. Suppose a|b and a|S(b).
2. Then there exist x and y such that ax = b and ay = S(b).
3. ax = b < S(b) = ay. [Assumption 13]
4. x < y. [Assumptions 6 and 15]
5. a) Suppose a ≠ 1.
   b) S(ax) < a · S(x). [Assumptions 1, 3, and 16]
   c) S(x) ≤ y. [Assumptions 1 and 14]
   d) a · S(x) ≤ ay. [Assumptions 3, 6 and 15]
   e) S(ax) < ay. [Assumption 8]
   f) S(b) < S(b), a contradiction. [Assumption 9]
6. a = 1.
To argue that this solution is pure, we would need to argue that each of the seventeen assumptions belongs to IP’s topic, that is, that each assumption partly determines the content of IP as formulated (for an ordinary investigator). If we are willing to grant PA as topical for IP, then this is trivial, since each of these assumptions may be derived in PA. Otherwise, this is a difficult task, because it is hard to say definitively whether an assumption is determinative of the content of a problem formulation, even for an ordinary investigator. Indeed it is not even clear what the standards are for making such a determination. Toward this, we note in particular that Assumption 7 is provable by induction, that is, from Assumption 17; and that Assumption 16 asserts that multiplication grows faster than successor, which seems essential to the typical contemporary understanding of the relation of these two functions.
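The main argument of this section can also be traced as a small computation. The sketch below is an illustration under stated assumptions rather than a formalization: the names S, divides, is_prime and prime_above are hypothetical, Python's + is used only inside the successor function S, and math.prod is used only to form the product p_1 · p_2 ··· p_n whose existence Assumption 7 asserts.

```python
from math import prod  # used only to form the product p_1 * p_2 * ... * p_n


def S(x: int) -> int:
    """Successor; the only place where Python's + appears."""
    return x + 1


def divides(a: int, b: int) -> bool:
    """a | b, read multiplicatively: some x with 1 <= x <= b satisfies a * x == b."""
    return any(a * x == b for x in range(1, S(b)))


def is_prime(n: int) -> bool:
    """n != 1 and the only divisors of n are 1 and n itself."""
    return n != 1 and all(d in (1, n) for d in range(1, S(n)) if divides(d, n))


def prime_above(a: int) -> int:
    """Follow the main proof: return a prime b with b > a."""
    if a == 1:
        return S(1)  # case 2: S(1) is prime and 1 < S(1)
    primes = [p for p in range(1, S(a)) if is_prime(p)]  # step 3b: all primes <= a
    Q = S(prod(primes))  # step 3c
    # Step 3d: the least divisor of Q other than 1 is a prime divisor of Q.
    return next(d for d in range(S(1), S(Q)) if divides(d, Q))
```

For a = 5, for example, the primes up to a are 2, 3 and 5, so Q = S(30) = 31, which is itself prime. The search is deliberately naive; it is meant to follow the steps of the proof, not to be efficient.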
While the issue concerning the topicality of addition for IP remains of interest, let us expand the discussion by considering the topicality of multiplication for IP as well. We raised two issues for said topicality earlier: firstly, some plausible non-additive definition of multiplication would be needed if multiplication is to belong “natively” to IP’s topic; and secondly, some answer would be needed to the point that like addition, multiplication is also not mentioned explicitly in IP; rather, only division is, in the definition of prime number. Both points may be met by introducing work in mathematical logic. To that work we now turn.
We first point out that since primality is defined in terms of divisibility, a definition of divisibility uncontroversially belongs to IP’s topic. This leaves open precisely which such definition is included. Divisibility is often defined in terms of multiplication – a divides b if and only if there exists x such that a · x = b – but this is not required. Informally, a divides b if a collection of b many objects can be divided into a groups with none left over. While this can be expressed in terms of multiplication, we have just shown that it need not be. The question then is what definitions and axioms ground (our understanding of) divisibility. There has been some logical work on axiomatizations of the arithmetic of divisibility, in which divisibility is taken as a primitive, notably by Cegielski (cf. [9], [10]), but this work takes the infinitude of primes as an axiom and so is not fine-grained enough for the question in focus here. We turn instead to work of Julia Robinson which offers a more promising direction for our investigation. Robinson showed how to define addition (and multiplication) for the natural numbers in terms of just successor and divisibility, both of which are explicitly referenced in the problem’s formulation (cf. [32, pp. 100-2]).
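The first step of Robinson's construction characterizes a + b = c by the condition S(a · c) · S(b · c) = S[(c · c) · S(a · b)]. A brute-force check makes it easy to see that the condition does pick out addition on the numbers 1, 2, 3, .... The sketch below is only a sanity check over a finite range, not a proof, and the names robinson_sum and mismatches are hypothetical. Expanding both sides shows the condition amounts to c · (a + b) = c · c, so it would hold for every a, b when c = 0; this is why the check starts at 1.

```python
def S(x: int) -> int:
    """Successor."""
    return x + 1


def robinson_sum(a: int, b: int, c: int) -> bool:
    """Robinson's condition: S(a*c) * S(b*c) == S((c*c) * S(a*b))."""
    return S(a * c) * S(b * c) == S((c * c) * S(a * b))


def mismatches(bound: int) -> list:
    """Triples (a, b, c) in 1..bound where the condition disagrees with a + b == c."""
    rng = range(1, bound + 1)
    return [
        (a, b, c)
        for a in rng
        for b in rng
        for c in rng
        if robinson_sum(a, b, c) != (a + b == c)
    ]
```

An empty result from mismatches(20) indicates agreement between the condition and ordinary addition on every triple up to 20.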
She firstly showed that addition is definable in terms of successor and multiplication as follows: a + b = c if and only if S(a · c) · S(b · c) = S[(c · c) · S(a · b)]. She next showed both that two numbers being relatively prime, and that a number being the least common multiple of other numbers, can be defined in terms of successor and divisibility, without appeal to addition. She lastly showed how to define multiplication using successor, relative primality, and least common multiple. Using these explicit definitions, the Euclidean solution to IP (as carried out in PA) may be translated into a language with just 1, S, |, and [...]
[...] 0 as the basic open sets. The following can then be shown (but we omit the details here): these sets B_{a,b} together form a basis for a topology on the integers; and each B_{a,b} is closed as well as open. By the latter it follows that the union of finitely many B_{a,b} is closed since in a topological space, unions of finitely many closed sets are closed.
We now consider the set $A = \bigcup_{p} B_{0,p}$ for p ≥ 2 prime. Since every integer besides ±1 has a prime factor (by the Fundamental Theorem of Arithmetic), every integer besides ±1 is contained in some B_{0,p}. Thus, A = Z − {−1, 1}. If A were a union of finitely many B_{0,p}, then it would be a closed set in our topology. Then {−1, 1}, being the complement of a closed set, would be open. But this is impossible, since the basic open sets B_{a,b} are all infinite, and by the definition of basis, each open set is a superset of some basic open set. Thus A is not a union of finitely many B_{0,p}. Hence there are infinitely many primes.
[...] statement involves only finitary mathematical objects (i.e., what logicians call an arithmetical statement) can be proved in elementary arithmetic.” (cf. [5, p. 258])
Our argument that Furstenberg’s solution to IP is topically impure simply observes that its commitments to definitions of topological space, topological basis, and open and closed sets in a topology do not belong to IP’s topic. Each could be retracted without a corresponding change in our understanding of IP.
One might object that some set-theoretic commitments are necessary for understanding IP. For instance, one might reply that the “proper” definition of natural number is set-theoretic, as in the original second-order Dedekind-Peano axioms, or in Frege or Russell’s work. This makes it clear how difficult it is to say definitively what belongs to a problem’s topic, for a full response to this objector would be an argument against this understanding of number, a significant philosophical achievement in its own right. Rather than offer such a response, we observe that it is open whether what is defined by set-theoretic definitions of natural number is the same as what is defined by purely first-order definitions. It is consistent with what is presently accepted that we are discussing (at least) two different problems, one with set-theoretic commitments in its topic, one without.8 In that case, the topical purity of Furstenberg’s proof, with respect to its set-theoretic commitments, comes down to which of these problems is the IP being considered.
Another objection of this type would be that while set-theoretic commitments may not be necessary to understanding natural number, they are necessary for understanding the arithmetic functions appealed to by IP.
In reply we point out that arithmetic functions can be understood algorithmically, without appeal to set theory. We see no good reason why a set-theoretic understanding of function should take precedence, particularly in the case of IP where the functions are merely used for computations.
Note also that the topology used in Furstenberg’s proof, when carried out in set theory, is quite weak, i.e. it can be carried out in a fragment of set theory that uses just boolean operations on “simple” sets of natural numbers. On this point, D. Cass and G. Wildenberg have shown that Furstenberg’s proof can be reformulated in terms of periodic functions on integers, “avoiding the language of topology” (cf. [8, p. 203]). In reply, we observe that the issue again is whether any set-theoretic commitments are engendered by understanding arithmetic problems. Whether or not these set-theoretic commitments are “weak” is only relevant inasmuch as it bears on whether those commitments belong to IP’s topic, and we see no reason to think that they do.
8 We say “at least” because it is conceivable that IP’s topic may not contain all of the commitments needed to permit Furstenberg’s proof, while containing other set-theoretic commitments.
In Section 5 of [15] we surveyed a further objection to the topical impurity of Furstenberg’s solution, articulated by Colin McLarty in correspondence. McLarty’s view is that to have a full understanding of IP, one must include not merely set-theoretic commitments but indeed topological commitments of the type appealed to in Furstenberg’s proof. Hence Furstenberg’s proof should not be regarded as impure simply because it appeals to topological principles. We want to revisit this objection here in order to clarify further McLarty’s point, and to pinpoint how it suggests a notion of content that differs from the one at play in topical purity, in which the understanding of ordinary practitioners is foundational.
In taking this view, McLarty is aligning himself with the Bourbakiste tradition of arithmetic research, a tradition to which Furstenberg’s work also belongs. McLarty suggests that Furstenberg developed his proof from then-current work of Claude Chevalley in class field theory, work that was considered cutting-edge arithmetic despite the central role of topology in it. In [11] Chevalley took himself to have made progress in realizing a purist ideal; as he remarked to open the paper, “Class field theory is presented a little more simply today than a few years ago, in particular because of the elimination of ‘transcendental means’” [La théorie du corps de classes se présente un peu plus simplement aujourd’hui qu’il y a quelques années, notamment du fait de l’élimination des “moyens transcendants”] (cf. [11, p. 394]). The “transcendental means” in question are ζ-functions; as Olga Taussky-Todd remarks in her review of this paper, a striking achievement of this paper was “the exclusion of analytical methods. . .
the theory of the ζ-functions which for so long a time seemed necessary can now be omitted.”9 Chevalley thus sought and achieved the elimination of complex analysis from what he considered arithmetic, though he had an expansive view of what counts as arithmetic; as Taussky-Todd puts it, “topological methods play an important part in this new presentation of class field theory.” From this standpoint, Furstenberg’s proof is a way to illustrate these sophisticated methods to non-experts, and by providing a simple solution to a classical arithmetical problem, a demonstration that they are arithmetic.
9 Cf. Math. Reviews MR0002357 (2, 38c). The Riemann zeta function $\zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^{s}}$ is a well-known ζ-function used heavily in analytic number theory (cf. [21]). The general notion of ζ-function (or L-function, as such functions are sometimes known) arose from work of Euler and Dirichlet, and ζ-functions have been used heavily in analytic and algebraic number theory (cf. [29], [7]). H. Weber had used ζ-functions in class field theory (cf. [34]); it was this use of analysis that Chevalley’s work was an attempt to purify. For a historical discussion of ζ-functions in Chevalley’s context, cf. [12]; and for a detailed technical discussion of ζ-functions in this context, cf. [26, Chapter 11].
This approach yielded new results on the cutting-edge of arithmetic, but Furstenberg’s solution to IP shows that the approach also yielded new solutions to elementary arithmetic problems. Furthermore, the topological means it draws upon are topological only in an axiomatic, lattice-theoretic sense, rather than in the sense typical of the Poincaré-Lefschetz topology according to which essential use is made of continua such as the real or complex lines. Chevalley judged topology in the latter sense to be nonarithmetic, but in the former sense to be arithmetic. This, then, is the view that McLarty offers in objection to our determination that Furstenberg’s solution to IP is topically impure. He claims that the topological elements in Furstenberg’s solution belong to IP’s topic, and as a result Furstenberg’s solution should not be judged topically impure on the basis of its use of these elements. McLarty’s Bourbakiste point is an important one. Work like Chevalley’s and Furstenberg’s shows that IP is not merely a problem of significance for arithmetic, but of topology as well. It shows that there are “deep” connections between arithmetic and topology, connections that were hidden to previous investigators. It brings to light the fusion of what were once thought separate domains. McLarty’s point is that solutions to problems that draw on commitments concerning domains that are “deeply” connected to the topic of a problem are of special epistemic importance. Their importance seems to be twofold: firstly, they improve our knowledge of the connections between domains by showing how one domain can be used to solve problems in another; and secondly, through this gain of knowledge of connections, they afford the investigator “considerable economy of thought” [économie de pensée considérable] by providing her with results applicable to multiple domains of investigation rather than to just a single one (cf. [6, § 5]). 
Such solutions help combat the splintering of mathematics into autonomous disciplines with different methods and aims (cf. [6, § 1]).
We have argued (in Section 5 of [15]) against McLarty’s view, on the grounds that a topological solution of IP provides a “deep” solution but not its most “basic” solution, where “basicness” reflects the conceptual resources corresponding most closely to those which are needed to grasp the problem. Our diagnosis rests on McLarty’s claim that the Chevalley-inspired reading properly articulates IP’s topic, even though an ordinary understanding of IP would seem to contain no topological commitments. There are thus two competing notions of problem understanding that might be thought to be determinative of topics and hence of the content of problems, what could be called “basic” and “deep” understanding.10 On the latter, Bourbakiste notion suggested by McLarty, Furstenberg’s solution qualifies as topically pure. While this notion of deep understanding is important and deserves further investigation, the view here is that this notion should not replace the “basic” sense of understanding in topic determination. This is, in short, because McLarty and Bourbaki make clear that they see deep understanding as articulating connections between the domain of the problem being investigated and other domains, rather than articulating just the content of the problem being investigated. Commitments of the latter type are the ones relevant to topical purity, however, since a topically pure solution of a problem is best thought of as a solution of precisely that problem, not some different problem – even if there are good reasons to pursue the solution of that different problem, as the McLarty/Bourbaki view argues.
10 There is a discussion of a related distinction in [4] between “informal” or “intuitive” content of a statement, by which is meant what someone with a casual understanding of geometry would (be able to) grasp, and “formal” or “axiomatic” content, by which is meant the inferential role of that statement in an axiomatic system.
4 Incompleteness and the possibility of purity
In an article [28], Georg Kreisel explained the consequences of Gödel’s work for purity as follows:
    Gödel’s paper [18] established that logical purity can be achieved in principle, and [1931] that arithmetic purity cannot be achieved; in fact, the result in [19] is so general that it is quite insensitive to any genuine ambiguities in the notion of purity of method. (pp. 163–4)
The idea seems to be that Gödel sentences are arithmetical sentences, and that a pure proof of an arithmetical sentence must draw only upon arithmetical means. But since Gödel sentences are unprovable by just arithmetical means, they do not admit of pure proof. Such, at any rate, seems to have been Kreisel’s view.
Daniel Isaacson has articulated a view concerning the content of Gödel sentences that can be used to argue against Kreisel that Gödel sentences can be proved purely. In [27] Isaacson argued that sentences in the first-order language L_PA of arithmetic may have purely arithmetical content, or may have in addition “higher-order”, i.e. infinitary, non-arithmetical content. Since the ordinary understanding of sentences in L_PA involves only arithmetical content, he says that their higher-order content, if any, is only “implicit” or “hidden”. This follows from his view that the content of arithmetical sentences is determined by what is necessary and sufficient for “perceiving” that that sentence is true, where said “perception” amounts either to “articulating” our grasp of the structure of the natural numbers, as he claims yields the axioms of PA, or to the recognition of a proof of that sentence. Since we have that Gödel sentences are PA-provably equivalent to sentences expressing by coding metamathematical properties of arithmetic (such as unprovability or consistency), it follows that these sentences are
provably unprovable in PA but provable by higher-order means. Such equivalences reveal “the implicit (hidden) higher-order content” of truths in the language of arithmetic, Isaacson writes. He holds that “the understanding of these sentences rests crucially on understanding this coding and our grasp of the situation being coded.” Hence, he concludes, Gödel sentences are not arithmetical sentences, but rather have higher-order content. If correct, this would seem to imply that pure proofs of Gödel sentences could draw on non-arithmetical resources and hence are available, contra Kreisel. In reply, we point out that Isaacson’s view seems muddled in the following respect: on the one hand, the non-arithmetical nature of Gödel sentences is the result of their provable unprovability in PA, and on the other hand, of their having coded metamathematical content. In identifying these two, is Isaacson’s view committed to maintaining that every arithmetical sentence independent of PA has coded metamathematical content? The first criterion seems to embody the view that the content of a sentence is determined by the inferential role it plays within an axiomatic theory (in this case a theory in which the metamathematics of PA can be carried out, for instance ZFC). This view does not permit obviously arithmetic sentences like the Goldbach conjecture to be judged as arithmetical at present, since there is at present no reason to believe its (plus-minus) truth is “directly perceivable” from our grasp of the structure of the natural numbers, nor from any other truths, arithmetic or not. This tells against the first criterion as a compelling view concerning the content of sentences in the language of arithmetic. The second criterion, that Gödel sentences are higher-order rather than arithmetical in virtue of having coded metamathematical content, is more promising. However, it suffers from the following problem. 
While we cannot see that Gödel sentences are Gödel sentences without grasping their coded metamathematical content, we can grasp them the way we do ordinary universally quantified sentences in the language of arithmetic without seeing that they are Gödel sentences. For instance, we could reasonably try to prove such sentences while only accepting the axioms of PA. It is true that our interest in Gödel sentences stems from their metamathematical content, generally speaking, but whether a sentence is arithmetical should be independent of our reasons for interest in it. We could encounter Gödel sentences in mainstream number-theoretic work, without knowing beforehand that these sentences are equivalent to metamathematical sentences, and could in that case grasp these sentences without grasping any higher-order content (which is not to say we could prove them without such grasp). A defender of Isaacson’s view could draw on the distinction between “basic” and “deep” content made in the previous section. While the basic
content of Gödel sentences would seem to be arithmetical, their “deep” content would seem to be metamathematical or higher-order. On this view, the basic content of any sentence expressible in the language of arithmetic is arithmetical, while its deep content depends on other theoretical factors such as its inferential role in axiomatic arithmetic. However, for evaluating Kreisel’s claim that Gödel sentences cannot be proved purely, it is basic rather than deep content that is relevant, at least if the type of purity at issue is topical. As we have explained, grasp of their deep content is unnecessary for grasping these sentences in the ordinary way sufficient for attempting their proof, for instance. But it is precisely the latter type of grasp that determines what belongs to a problem’s topic, and hence what may be drawn upon in a topically pure proof. Consequently Isaacson’s observations indicate in another way the two types of content to which we have already drawn attention, but do not pose a convincing argument against Kreisel’s claim that Gödel sentences have no topically pure proofs.
5 Closing thoughts
The case of the infinitude of primes is valuable because it highlights several key issues important for getting clearer on topical purity. How topics of problems are determined awaits further systematic study. Case studies like the one presented here are necessary and important preludes to this type of investigation.
This particular case study highlights the difficulty of determining exactly what belongs to the topic of even a quite elementary problem. While addition does not explicitly appear in the problem’s formulation, it is natural to think that addition belongs to the topic of every arithmetic problem, in virtue of the natural numbers’ identity as an additive structure. The discussion of the Euclidean solution here was meant to show how to argue for its topical purity without just granting this point about the additive identity of the natural numbers.
The discussion of Furstenberg’s topological solution illustrates two competing notions of problem content that might be thought to be determinative of topics, what could be called “basic” and “deep” content. We argue that what belongs to the deep content of a problem may not necessarily be drawn upon by a topically pure solution of that problem, and so in particular that Furstenberg’s solution of IP is not topically pure. We then consider Kreisel’s claim that Gödel sentences have no pure proofs and observe that Isaacson’s point that these sentences have hidden higher-order content does not contradict this claim, since this hidden content is again deep rather than basic and so does not bear on the purity or impurity of proofs drawing on these higher-order means.
References
[1] Andrew Arana. Logical and semantic purity. Protosociology, 25:36–48, 2008. Reprinted in Philosophy of Mathematics: Set Theory, Measuring Theories, and Nominalism, Gerhard Preyer and Georg Peter (eds.), Ontos, 2008.
[2] Andrew Arana. On formally measuring and eliminating extraneous notions in proofs. Philosophia Mathematica, 17:208–219, 2009.
[3] Andrew Arana. Elementarity and purity. In Andrew Arana and Carlos Alvarez (eds.), Analytic Philosophy and the Foundations of Mathematics. Palgrave/Macmillan, 2011. Forthcoming.
[4] Andrew Arana and Paolo Mancosu. On the relationship between plane and solid geometry. Review of Symbolic Logic, 5(2):294–353, June 2012.
[5] Jeremy Avigad. Number theory and elementary arithmetic. Philosophia Mathematica, 11:257–284, 2003.
[6] Nicolas Bourbaki. L’architecture des mathématiques. In François Le Lionnais (ed.), Les grands courants de la pensée mathématique. Éditions des Cahiers du Sud, 1948.
[7] Kevin Buzzard. L-functions. In Gowers [20].
[8] Daniel Cass and Gerald Wildenberg. A novel proof of the infinitude of primes, revisited. Mathematics Magazine, 76(3):203, 2003.
[9] Patrick Cegielski. La théorie élémentaire de la divisibilité est finiment axiomatisable. C. R. Acad. Sci. Paris Sér. I Math., 299(9):367–369, 1984.
[10] Patrick Cegielski, Yuri Matijasevich, and Denis Richard. Definability and decidability issues in extensions of the integers with the divisibility predicate. Journal of Symbolic Logic, 61(2):515–540, 1996.
[11] Claude Chevalley. La théorie du corps de classes. Annals of Mathematics, 41:394–418, 1940.
[12] J.W. Cogdell. On Artin L-functions. http://www.math.ohio-state.edu/~cogdell/artin-www.pdf, 2007.
[13] Paola D’Aquino. Local behaviour of the Chebyshev theorem in models of IΔ0. Journal of Symbolic Logic, 57(1):12–27, 1992.
[14] Paola D’Aquino. Weak fragments of Peano arithmetic. In The Notre Dame Lectures, volume 18 of Lecture Notes In Logic, pp. 149–185. Association for Symbolic Logic, Urbana, IL, 2005.
[15] Michael Detlefsen and Andrew Arana. Purity of methods. Philosophers’ Imprint, 11(2):1–20, 2011.
[16] Herbert B. Enderton. A mathematical introduction to logic. Harcourt/Academic Press, Burlington, MA, second edition, 2001.
[17] Harry Furstenberg. On the infinitude of primes. American Mathematical Monthly, 62(5):353, 1955.
[18] Kurt Gödel. Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik, 37(1):349–360, 1930. Reprinted and translated in Collected Works Volume 1, Solomon Feferman et al. (eds.), Oxford University Press, 1986.
[19] Kurt Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38:173–198, 1931. Reprinted and translated in Collected Works Volume 1, Solomon Feferman et al. (eds.), Oxford University Press, 1986.
[20] Timothy Gowers (ed.). The Princeton companion to mathematics. Princeton University Press, Princeton, 2008.
[21] Andrew Granville. Analytic Number Theory. In Gowers [20].
[22] Petr Hájek and Pavel Pudlák. Metamathematics of first-order arithmetic. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, second edition, 1998.
[23] Michael Hallett and Ulrich Majer (eds.). David Hilbert’s Lectures on the Foundations of Geometry, 1891–1902. Springer-Verlag, Berlin, 2004.
[24] G. H. Hardy and E. M. Wright. An introduction to the theory of numbers. Oxford University Press, New York, fifth edition, 1979.
[25] A. E. Ingham. The distribution of prime numbers. Cambridge University Press, Cambridge, 1932.
[26] Kenneth Ireland and Michael Rosen. A classical introduction to modern number theory, volume 84 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1990.
[27] Daniel Isaacson. Arithmetical truth and hidden higher-order concepts. In W. D. Hart (ed.), The Philosophy of Mathematics, pp. 203–224. Oxford University Press, New York, 1996. First published in Logic Colloquium ’85, the Paris Logic Group (eds.), Amsterdam, North-Holland, 1987, pp. 147–169.
[28] Georg Kreisel. Kurt Gödel. Biographical Memoirs of Fellows of the Royal Society, 26:149–224, 1980.
[29] Barry Mazur. Algebraic Numbers. In Gowers [20].
[30] Rohit Parikh. Existence and feasibility in arithmetic. Journal of Symbolic Logic, 36:494–508, 1971.
[31] J. B. Paris, A. J. Wilkie, and A. R. Woods. Provability of the pigeonhole principle and the existence of infinitely many primes. Journal of Symbolic Logic, 53(4):1235–1244, 1988.
[32] Julia Robinson. Definability and decision problems in arithmetic. Journal of Symbolic Logic, 14:98–114, 1949.
[33] William W. Tait. Finitism. The Journal of Philosophy, 78(9):524–546, 1981.
[34] Heinrich Weber. Lehrbuch der Algebra, volume III. F. Vieweg und Sohn, Braunschweig, second edition, 1908.
[35] Alan Woods. Some problems in logic and number theory and their connections. PhD thesis, University of Manchester, 1981.
Domain Extensions and Higher-Order Syntactical Interpretations Marek Polański
The paper is concerned with the logical analysis of a broad family of mathematical constructions which fall under the vague term “domain extension”. The aim of the author is to contribute to a clarification of this notion and to provide an explication in both syntactic and model-theoretic terms. The paper is mainly motivated by examples such as Whitehead’s definition of point or Russell’s construction of instants of time from events. The paper begins with a short exposition of some general model-theoretic conditions which can serve as a first approximation of an adequate explication of the notion in question. The explication proposed by the author is based on a very general notion of syntactic interpretation. This notion is introduced and characterized semantically in the second part of the paper.
1 Introductory remarks
The phrase “domain extension” has many connotations. The present paper is intended to contribute to its clarification in model-theoretic terms. Our aim is to provide a model-theoretic description and a purely syntactical characterization of a broad family of mathematical constructions which fall under this rather vague term. The common use of the term “domain extension” is not entirely clear, but it can be roughly characterized by a series of well-known algebraic and geometric examples. The latter are closely related to the so-called method of extensive abstraction (as described by Whitehead in [10, 11] and developed by Russell in [7] and Tarski in [9]). In this paper we introduce a concept of syntactic interpretation between higher-order theories which turns out to be a syntactic counterpart to constructions of this kind. The paper is organized as follows. In the next section some paradigmatic examples of domain extensions are presented. In the third section we introduce some conceptual preliminaries and formulate some very general conditions on abstract operations on relational structures which can serve
as model-theoretic counterparts of the more concrete operations briefly discussed in the second section. We introduce a very general notion of an L-construction and define a very general notion of an elementary construction. Both sections have a rather expository character. The fourth section provides a new conceptual framework which is motivated by the paradigmatic examples discussed earlier.
2 Domain extensions: some paradigmatic examples
Let us begin with a list of well-known mathematical examples which motivate the conceptual framework described and developed in the next two sections. Our paradigmatic examples can be divided into roughly three categories. The first one contains extensions of algebraic structures. This category comprises some well-known cases where an algebraic structure A is extended to an algebraic structure B for the same vocabulary. A is then (up to isomorphism) a substructure of B in the usual model-theoretic sense. The universe of B is the result of adding new objects which satisfy some conditions expressed in the related vocabulary. The new objects can be (and usually are) roots of polynomials over A. The motivation for such an extension is then purely algebraic. Typical and well-known cases of this sort are extensions of number structures: the passage from natural numbers to integers, from integers to the field of rationals, from rationals to reals, and from reals to complex numbers. The second category embraces cases where a new domain is the result of filling gaps in an ordered structure. Typical examples of such a domain extension are completions of Boolean algebras and completions of linear orderings. To the third category belong constructions related to the method of extensive abstraction originated by Whitehead and further developed by Russell, Tarski (compare [9]), and some contemporary authors working on the point-free foundations of geometries (compare [4], [5]). Usually, a point-free geometry is a theory of spatial regions which is formulated in a language extending the language of mereology. Points are introduced via an approximation procedure which consists in defining a set whose elements are certain sets of regions of space linearly ordered by mereological inclusion.
Constructions of this kind, together with some further definitional steps, transform models of point-free geometries into models of point-based geometrical theories. The latter are in a sense definable over the original models of a given point-free geometry. The essence of the method of extensive abstraction can be illustrated by simple examples. Consider the class of all so-called interval orders, which are structures of the form A = (A, ⊲A), where A is a non-empty set and ⊲A is a binary
relation on A such that: (1) for all a ∈ A: not (a ⊲A a), and (2) for all a, b, c, d ∈ A: if a ⊲A b and c ⊲A d then a ⊲A d or c ⊲A b. Elements of the universe of an interval order can be regarded as time intervals. For each interval order A let us define a relation of overlapping ◦A as follows: a ◦A b iff not (a ⊲A b) and not (b ⊲A a). The following two examples show how point structures (being linear orderings) can be constructed from interval orders by means of the method of extensive abstraction. The constructions exemplify the ideas of Whitehead and Russell.
Example 1: Points as maximal sets of mutually overlapping intervals
A subset D of A is called an antichain in A if and only if for all a, b ∈ D: a ◦A b. By a maximal antichain in A we mean an antichain in A which cannot be properly extended to another antichain. Let matc(A) be the set of all maximal antichains in A. For each interval order A let F(A) be the structure (matc(A), ⋖A), where ⋖A is a binary relation on matc(A) such that D1 ⋖A D2 iff for some a ∈ D1, b ∈ D2: a ⊲A b.
Example 2: Points as equivalence classes of abstractive sets
Let us define for all a, b ∈ A: a ≺A b :⇔ a ≠ b and for all c: if c ◦A a then c ◦A b.
A subset D of A is called an abstractive class in A if D is linearly ordered by the relation ≺A and there is no a ∈ A such that for all b ∈ D: a ≺A b. Let ac(A) be the set of all abstractive classes in A. Now let us define the binary relation ∼A on ac(A) as follows: D1 ∼A D2 iff for each a ∈ D1 there is a b ∈ D2 such that a ≺A b and for each a ∈ D2 there is a b ∈ D1 such that a ≺A b. Clearly, ∼A is an equivalence relation on ac(A). For each interval order A let F(A) be the structure (ac(A)/∼A, ⋖A), where ac(A)/∼A is the corresponding set of all equivalence classes, and for all [D1]∼A and [D2]∼A: D1 ⋖A D2 :⇔ for some a ∈ D1, b ∈ D2: a ⊲A b.
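For a finite interval order, the construction of Example 1 can be carried out by brute force. The following is a minimal sketch, not part of the original text: the function names (`is_interval_order`, `maximal_antichains`, `point_order`) and the four sample intervals are hypothetical, and the relation ⊲A is assumed to be given as a set of ordered pairs.

```python
from itertools import combinations

def is_interval_order(A, prec):
    """Check conditions (1) and (2) for a finite structure (A, prec)."""
    irreflexive = all((a, a) not in prec for a in A)
    # (2): if a < b and c < d, then a < d or c < b
    condition_2 = all(
        (a, d) in prec or (c, b) in prec
        for (a, b) in prec
        for (c, d) in prec
    )
    return irreflexive and condition_2

def overlaps(a, b, prec):
    """a and b overlap iff neither precedes the other."""
    return (a, b) not in prec and (b, a) not in prec

def maximal_antichains(A, prec):
    """All sets of mutually overlapping elements that cannot be properly extended."""
    elements = sorted(A)
    antichains = [
        frozenset(D)
        for r in range(1, len(elements) + 1)
        for D in combinations(elements, r)
        if all(overlaps(a, b, prec) for a in D for b in D)
    ]
    return [D for D in antichains if not any(D < E for E in antichains)]

def point_order(points, prec):
    """D1 precedes D2 iff some a in D1 precedes some b in D2."""
    return {
        (D1, D2)
        for D1 in points
        for D2 in points
        if any((a, b) in prec for a in D1 for b in D2)
    }

# Hypothetical sample data: four real intervals given by their endpoints.
ends = {"p": (0, 2), "q": (1, 4), "r": (3, 5), "s": (6, 7)}
A = set(ends)
# a precedes b iff a ends strictly before b begins.
prec = {(a, b) for a in A for b in A if ends[a][1] < ends[b][0]}

points = maximal_antichains(A, prec)
order = point_order(points, prec)
```

On this sample the maximal antichains are {p, q}, {q, r} and {s}, and the induced relation orders them linearly, as one would expect of a point structure. The exhaustive search over subsets is exponential and is meant only to mirror the definition, not to be efficient.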
In the next section we shall consider a list of conditions which approximate the class of domain extensions as characterized by the above examples. We start with some very general conditions and try to narrow down the class of operations they define. In section 4 we shall provide a precise concept which can serve as a candidate for an adequate model-theoretic explicatum of the term “domain extension”.
3 L-operations and L-constructions
Let L1 and L2 be finite and purely relational vocabularies. For each vocabulary L let StrL denote the class of all relational structures which can serve as models for L. Let F be a partial operation from StrL1 to StrL2. We call F regular if it preserves isomorphisms in the following sense.
(ISOM) For all A, B ∈ dom(F): if A ≅ B then F(A) ≅ F(B).
The above condition is purely algebraic in character. It does not refer to any particular languages which can be evaluated in the structures in question. There is of course a wide variety of languages which can be associated with a given relational vocabulary. Let L be a logical system which extends first-order logic. Let us denote by FmL(L) and SentL(L) the classes of all L-formulas and all L-sentences (respectively) in the vocabulary L. We call a partial operation F from StrL1 into StrL2 an L-operation if and only if the following condition holds.
(TRANS) There is a mapping τ from SentL(L2) into SentL(L1) such that for all ϕ ∈ SentL(L2) and all A ∈ dom(F): A |= τ(ϕ) iff F(A) |= ϕ.
The condition (TRANS) says intuitively that there is a uniform way to correlate all L-expressible properties of a constructed structure F(A) with (or to reduce them to) some L-expressible properties of the original structure A. It is apparent that (TRANS) implies the following condition.
(ELEM) For all A, B ∈ dom(F): if A ≡L B then F(A) ≡L F(B).
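The step from (TRANS) to (ELEM) can be spelled out as a short chain of equivalences. The following is a sketch in LaTeX notation, where \mathcal{L} stands for the logic and L_2 for the target vocabulary; it assumes A, B ∈ dom(F) with A ≡L B and an arbitrary sentence ϕ of the target vocabulary.

```latex
\begin{align*}
F(A) \models \varphi
  &\iff A \models \tau(\varphi)  && \text{by (TRANS) applied to } A \\
  &\iff B \models \tau(\varphi)  && \text{since } A \equiv_{\mathcal{L}} B
        \text{ and } \tau(\varphi) \in \mathrm{Sent}_{\mathcal{L}}(L_1) \\
  &\iff F(B) \models \varphi     && \text{by (TRANS) applied to } B
\end{align*}
```

Since ϕ was arbitrary, F(A) and F(B) satisfy the same sentences of the target vocabulary, which is (ELEM).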
(TRANS) and (ISOM) are independent of each other. It is easy to show that (TRANS) does not imply (ISOM). To see this, let L be first-order logic and F be an operation such that (1) dom(F) is the class of all models of the first-order theory of some particular finite structure A, (2) F(A) is the linear ordering ω, and (3) for each B ∈ dom(F) such that A ≠ B, F(B) is the linear ordering ω + (ω* + ω). Since each model of the first-order theory of A is isomorphic to A, and ω is not isomorphic to ω + (ω* + ω), such an operation F does not satisfy (ISOM). However, (TRANS) is satisfied. Let τ be the function which assigns to each sentence which is true in ω the sentence verum, and to each sentence which is false in ω the sentence falsum. Since ω and ω + (ω* + ω) are first-order equivalent, for each sentence ϕ in an appropriate vocabulary for ω and ω + (ω* + ω), and each model B in dom(F): B |= τ(ϕ) just in case F(B) |= ϕ. It is equally easy to show that (ISOM) does not imply (TRANS). Let L again be first-order logic and let F be the operation which assigns to each linear ordering its automorphism group. Each such automorphism group F(A) of a linear ordering A is here a structure of the form (B, R), where B is the set of all automorphisms of A and R is a ternary relation on B such that for all elements i, j, k ∈ B: (i, j, k) ∈ R if and only if k is the composition of i with j. Hence, each such automorphism group is a structure for a
relational vocabulary. Now the orderings ω and ω + (ω* + ω) are first-order equivalent, but the corresponding automorphism groups F(ω) and F(ω + (ω* + ω)) are not (the first one contains only one element and the second one is infinite). This implies that F does not even satisfy (ELEM), let alone (TRANS).
All paradigmatic examples of domain extensions described above take the form of regular L-operations. Some of them do not, strictly speaking, fulfil the condition of uniqueness. For example, each model A of Peano arithmetic has many extensions which are models of the theory of integers. Nevertheless, any two such extensions B1 and B2 of a given model A of Peano arithmetic are isomorphic over A (i.e. isomorphic to each other via a mapping that is the identity on A). Similarly, any two completions of a given Boolean algebra A are isomorphic over A. However, the concept of a regular L-operation is too wide to serve as a satisfactory explicatum of “domain extension”. It can be easily demonstrated that the conditions (ISOM) and (TRANS) taken together are not strong enough to exclude many trivial correlations between two classes of structures. Consider, for example, an operation F which is such that: (i) for all A, B ∈ dom(F): A ≡L B, and (ii) there is a single structure C such that F(A) = C for each A ∈ dom(F). Clearly, any such operation F satisfies both (ISOM) and (TRANS). Therefore further conditions should be taken into account.
We call a partial operation F from StrL1 to StrL2 an L-construction if and only if there exist functions f, D, and θ such that θ transforms injectively the set of all L-variables into itself, the domains of f and D are identical with the domain of F, and the following conditions hold (we write fA and DA for the values of f and D at A, respectively).
(PROXY) For each A ∈ dom(F), fA is a function from DA onto the universe of F(A).
(DEF) For each atomic formula ϕ ∈ FmL(L2) with free variables x1, ..., xn there is a formula ψ ∈ FmL(L1) with free variables θ(x1), ..., θ(xn) such that for all A ∈ dom(F) and all a1, ..., an ∈ DA:
(⋆) A |= ψ[θ(x1) : a1, ..., θ(xn) : an] iff F(A) |= ϕ[x1 : fA(a1), ..., xn : fA(an)].
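A small finite case may help to see how (PROXY) and (DEF) work together. The sketch below is an illustration chosen here, not an example from the text: F sends a finite preorder to its quotient partial order, DA is the whole universe, fA maps each element to its class under mutual comparability, and θ is the identity on variables. All names (`rank`, `leq`, `f`, `psi_leq`, `psi_eq`, `star_holds`) are hypothetical.

```python
from itertools import product

# Hypothetical finite preorder (A, <=): a <= b iff rank[a] <= rank[b].
A = [0, 1, 2, 3]
rank = {0: 0, 1: 0, 2: 1, 3: 2}
leq = {(a, b) for a, b in product(A, repeat=2) if rank[a] <= rank[b]}

def f(a):
    """Proxy function f_A: send a to the class of elements mutually <= with a."""
    return frozenset(b for b in A if (a, b) in leq and (b, a) in leq)

D = A                                        # D_A: here the whole universe of A
universe_FA = {f(a) for a in D}              # universe of F(A); f is onto by construction
leq_FA = {(f(a), f(b)) for (a, b) in leq}    # ordering of the quotient F(A)

def psi_leq(a, b):
    """Counterpart in the source vocabulary of the atomic formula x <= y."""
    return (a, b) in leq

def psi_eq(a, b):
    """Counterpart of x = y: an indiscernibility relation on D_A."""
    return (a, b) in leq and (b, a) in leq

def star_holds():
    """Check the biconditional (*) for both atomic formulas on all pairs from D_A."""
    for a, b in product(D, repeat=2):
        if psi_leq(a, b) != ((f(a), f(b)) in leq_FA):
            return False
        if psi_eq(a, b) != (f(a) == f(b)):
            return False
    return True
```

Calling `star_holds()` confirms on this sample that each atomic fact about F(A) is mirrored by a formula evaluated in A. The counterpart of identity is not identity itself but the relation of mutual comparability, which is why a surrogate for identity is needed in general.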
The above definition resembles various conditions discussed in the literature (compare, for example, [3], [1]). Some comments are needed. Firstly, the value of the function f for each model A for which F is defined assigns elements of the universe of F(A) to elements of DA in such a way that each element of the universe of F(A) has at least one counterpart. Hence, for each A in dom(F), fA is a kind of functional proxy relation. Secondly, the set DA is not necessarily a subset of the set of all individuals in the universe of A. For example, if A is an interval order (as defined above), DA may be the set of all maximal antichains or the set of all abstractive
classes in A. In such a case θ would assign to each first-order variable a second-order variable. The above definition has some simple but useful consequences.
1. There is an L-formula δ in L1 whose extension in A is identical with DA. To see this, apply the condition (DEF) to the formula ‘(x = x)’. We call δ a domain formula for F. Observe that δ depends essentially on f.
2. There is an L-formula χ in L1 which defines a relation which is a surrogate for identity. Let us call it an indiscernibility formula for F. To see this, apply (DEF) to the formula ‘(x = y)’. Obviously, the extension of χ in A is the kernel of the function fA, i.e. the set of all pairs (a, b) such that fA(a) = fA(b). The relation ‖χ‖A is clearly an equivalence relation on DA and it is a congruence with respect to the L1-counterpart of each basic predicate in L2.
3. Each L-construction is regular. To see this, consider A, B ∈ dom(F) and an isomorphism h from A onto B. Now let us define a function g from rge(fA) into rge(fB) as follows. For each d ∈ dom(g) choose an object a ∈ ‖δ‖A such that fA(a) = d and let g(d) := fB(h̄(a)), where h̄ is a suitably defined extension of h such that for all L-formulas ϕ in L1: h̄[‖ϕ‖A] = ‖ϕ‖B. The claim that such an extension of an isomorphism h from A onto B exists is a kind of (model-theoretic) isomorphism theorem. It can easily be shown that g is a bijection that preserves the extensions of formulas which are L1-counterparts of basic predicates in L2.
4. F is also an L-operation. This follows from the fact that the condition (⋆) can be easily shown to hold for all formulas ϕ ∈ FmL(L2).
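One way to see consequence 4 is to extend the atomic translation from (DEF) to all formulas by recursion and to relativize quantifiers to a domain formula δ. The clauses below are a sketch under the assumption that L is first-order logic, or at least closed under the connectives and quantifiers used; the subscripted notation ψ_ϕ for the formula supplied by (DEF) is introduced here for illustration only.

```latex
\begin{align*}
\tau(\varphi) &:= \psi_{\varphi}
   && \text{for atomic } \varphi,\ \psi_{\varphi} \text{ as supplied by (DEF)} \\
\tau(\neg\varphi) &:= \neg\,\tau(\varphi) \\
\tau(\varphi_1 \wedge \varphi_2) &:= \tau(\varphi_1) \wedge \tau(\varphi_2) \\
\tau(\exists x\,\varphi) &:= \exists\,\theta(x)\,\bigl(\delta(\theta(x)) \wedge \tau(\varphi)\bigr)
\end{align*}
```

An induction on ϕ then gives (⋆) for every formula. The only step that uses more than the induction hypothesis is the quantifier clause, where (PROXY) guarantees that every element of the universe of F(A) has the form fA(a) for some a ∈ DA. Restricting to sentences yields a mapping τ as required by (TRANS).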
We call F an elementary construction if F is an L-construction and L is first-order logic. Strictly speaking, according to this explication of an elementary (i.e. first-order) construction, the canonical constructions of so-called n-dimensional elementary interpretations (as defined in [8]) are not elementary constructions. The above definition of an L-construction could be modified in such a way that ψ in (⋆) would be allowed to have nk free variables. In such a case the domain formula δ for F would contain n free variables. Although we do not follow this explication strategy, it should be stressed that the notion of interpretation introduced in the present paper is motivated by Szczerba’s account. All canonical constructions in Szczerba’s sense are L-constructions in our sense. Moreover, many of the operations from our paradigmatic examples are elementary according to Szczerba’s account. In the first category there is only one exception: the operation transforming rationals to reals is non-elementary (this can be shown with the help of a simple cardinality argument). However, no operation belonging to the second category is elementary. But all of them are L-constructions for some logics L (for instance, for second-order logic). The examples from the third group are more complicated and a definitive answer with regard to some of them would require a deeper analysis. As it seems, they are also not elementary, both in our sense and in the sense defined by Szczerba. This shows that the class of elementary constructions does not embrace many of the paradigmatic cases of domain extensions. This fact is the main motivation for a more systematic study of non-elementary L-constructions. This topic will occupy the rest of the present paper. Our strategy will consist in defining a special kind of higher-order constructions instead of trying to add further general conditions in the style of (ISOM) etc. However, it is interesting to notice that there is another strategy which seems equally promising but which we do not follow here. Each of the paradigmatic examples can be regarded as a class of structures H which is elementary in a logic L and such that there is a vocabulary L which extends L1 ⊎ L2 ⊎ {P } where ⊎ stands for the disjoint sum and P is a new unary predicate. For example, the completion operation for linear ordering can be regarded as the class H consisting of structures of the form A = (A, P A ,