English Pages 542 [544] Year 2012
Ulrich Berger, Hannes Diener, Peter Schuster, Monika Seisenberger (Eds.) Logic, Construction, Computation
ontos mathematical logic edited by Wolfram Pohlers, Thomas Scanlon, Ernest Schimmerling Ralf Schindler, Helmut Schwichtenberg Volume 3
Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliographie; detailed bibliographic data is available in the Internet at http://dnb.ddb.de
North and South America by Transaction Books Rutgers University Piscataway, NJ 08854-8042 [email protected] United Kingdom, Ire Iceland, Turkey, Malta, Portugal by Gazelle Books Services Limited White Cross Mills Hightown LANCASTER, LA1 4XS [email protected]
Livraison pour la France et la Belgique: Librairie Philosophique J.Vrin 6, place de la Sorbonne ; F-75005 PARIS Tel. +33 (0)1 43 54 03 47 ; Fax +33 (0)1 43 54 48 18 www.vrin.fr
©
2012 ontos verlag P.O. Box 15 41, D-63133 Heusenstamm www.ontosverlag.com ISBN 978-3-86838-158-0 2012 No part of this book may be reproduced, stored in retrieval systems or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use of the purchaser of the work Printed on acid-free paper ISO-Norm 970-6 FSC-certified (Forest Stewardship Council) This hardcover binding meets the International Library standard Printed in Germany by CPI buch bücher.de
Helmut Schwichtenberg, April 2012
Preface

Over the last few decades the interest of logicians and mathematicians in constructive and computational aspects of their subjects has been steadily growing. Moreover, researchers from disparate areas have started fruitful collaborations because they realized that, despite being driven by different motivations, they can benefit enormously from the mutual exchange of techniques concerned with constructivity and computability. A key figure in this exciting development is the logician and mathematician Helmut Schwichtenberg to whom this volume is dedicated. Helmut Schwichtenberg has pioneered the interaction of logic, mathematics, and computer science and has made crucial contributions to these disciplines, in particular to proof theory, recursion theory, normalisation and term rewriting, computable analysis, constructive mathematics, program extraction, and interactive and automated theorem proving. The volume is dedicated to him on the occasion of his retirement and his 70th birthday.

Helmut Schwichtenberg was born on the 5th of April, 1942 in Sagan, Silesia. From 1961 he studied Mathematics at the FU Berlin and from 1964 at Münster University. He completed his PhD thesis (“Eine Klassifikation der mehrfach rekursiven Funktionen”) in 1968 and his Habilitationsschrift (“Einige Anwendungen von unendlichen Termen und Wertfunktionalen”) in 1973, both at Münster University under the supervision of Dieter Rödding. In 1974 he was appointed Professor at the University of Heidelberg and in 1978 he became the successor of Kurt Schütte as Chair for Mathematical Logic at the Ludwig-Maximilians University in Munich. He has been a member of the Bayerische Akademie der Wissenschaften since 1986. Helmut Schwichtenberg has written two books, Basic Proof Theory (with Anne Troelstra, 2000), and Proofs and Computations (with Stan Wainer, 2012), and coedited several volumes (amongst them six Summer School Marktoberdorf proceedings).
To date, he has written more than 80 publications. He has been co-editor of the journals Archive for Mathematical Logic, and Annals of Pure and Applied Logic, as well as of the series Mathematical Logic of the Ontos-Verlag, and Studies in Proof Theory of Bibliopolis. He also serves on the advisory board of Higher-Order and Symbolic Computation. Tirelessly, and very successfully, he has set up research projects, networks, and cooperations with other universities and industry. During a research visit at Carnegie Mellon University, Pittsburgh in 1987–88, he started to develop Minlog, a well-respected interactive proof assistant which, among
other things, is known for its strengths in normalization, constructive analysis, and program extraction.

A primary concern of Helmut Schwichtenberg has always been the promotion of young scientists. He taught at the Summer School Marktoberdorf for two decades and (co-)directed its biennial logic-oriented ‘blue series’ from 1991 until 2007. He was a member of the Munich Graduiertenkolleg ‘Sprache, Information, Logic’ and spokesman and co-founder of the Graduiertenkolleg ‘Logic in Computer Science’. For many years he organized logic meetings at the ‘Mathematisches Forschungsinstitut Oberwolfach’ and served on the Scientific Committee of the Institute. Most recent is his involvement in the European Marie Curie PhD training network in Mathematical Logic, MATHLOGAPS and MALOA. Helmut Schwichtenberg’s PhD students were: Karl-Adolf Höwel, Peter Päppinghaus, Ulf Schmerl, Klaus Martin Hörnig, Martin Ruckert, Lew Gordeev, Páll Egerz, Ulrich Berger, Karl-Heinz Niggl, Ralph Matthes, Thomas Rudlof, Monika Maidl, Wolfgang Zuber, Klaus Weich, Felix Joachimski, Matthias Eberl, Monika Seisenberger, Favio Miranda Perea, Martin Kübler, Dan Hernest, Stefan Schimanski, Luca Chiarabini, Diana Ratiu, Trifon Trifonov. He is still active as a PhD supervisor.

The plan for this Festschrift was formed during the workshops Program Extraction and Constructive Proofs (PECP) and Classical Logic and Computation (CLAC) which were held in Brno, Gödel’s birthplace, on 21st–22nd August 2010 on the occasion of Helmut Schwichtenberg’s retirement. We asked collaborators and friends of his to contribute to a book which should be presented to him in the year of his 70th birthday. The result is a collection of 20 refereed articles written by 30 authors which cover a good deal of current research on constructive and computational aspects of logic, and reflect the breadth of Helmut Schwichtenberg’s research and his scientific network.
We would like to thank the speakers and participants of PECP and CLAC who made these workshops scientifically high profile and memorable events. We are also grateful to the Deutsche Vereinigung für Mathematische Logik und für Grundlagenforschung der Exakten Wissenschaften (DVMLG), the Kurt Gödel Society (KGS), and the organizers at the University of Brno for their support. We thank Rafael Hüntelmann from the Ontos Verlag and Ralf Schindler for being very helpful with publishing this volume, Reinhard Kahle and Stan Wainer for their respective advice on editorial matters, and Ursula Schwichtenberg for answering questions on the biographical data. Our special thanks go to the referees, who made a crucial contribution to the high scientific standard of this volume.

The articles in this book can be grouped into eight themes: Constructive set theory (Andrea Cantini and Laura Crosilla, Justus Diller, Solomon Feferman, Gerhard Jäger and Rico Zumbrunnen), Provably recursive functions (Wilfried Buchholz and Andreas Weiermann, Wolfram Pohlers and Jan-Carl Stegert, Elliott Spoors and
Stan Wainer), Program extraction (Federico Aschieri and Stefano Berardi, Mark Bickford and Robert Constable), Theories of truth (Sebastian Eberhard and Thomas Strahm, Solomon Feferman), Constructive mathematics (Douglas Bridges, Joan Moschovakis), Classical vs. intuitionistic logic (Gilda Ferreira and Paulo Oliva, Hajime Ishihara), Inductive definitions (Grigori Mints, Anton Setzer and Fredrik Nordvall Forsberg), Continuous functionals and domains (Dag Normann, Dieter Spreen). The authors and editors of this Festschrift, and those who responded to our call but were unable to contribute within the restricted time frame, wish Helmut Schwichtenberg all the best for his 70th birthday and hope that he will enjoy many more years of fruitful research.
Leeds, Siegen, and Swansea
Spring 2012
Ulrich Berger Hannes Diener Peter Schuster Monika Seisenberger
Contents

Preface
Contributors
A New Use of Friedman’s Translation: Interactive Realizability
    Federico Aschieri and Stefano Berardi
Polymorphic Logic
    Mark Bickford and Robert Constable
Constructive Solutions of Ordinary Differential Equations
    Douglas S. Bridges
A Nonstandard Hierarchy Comparison Theorem
    Wilfried Buchholz and Andreas Weiermann
Transitive Closure in Operational Set Theory
    Andrea Cantini and Laura Crosilla
Baire Space in CZF
    Giovanni Curi and Michael Rathjen
Functional Interpretations of Classical and Constructive Set Theory
    Justus Diller
Weak Theories of Truth and Explicit Mathematics
    Sebastian Eberhard and Thomas Strahm
Axiomatizing Truth: Why and How?
    Solomon Feferman
On the Strength of some Semi-Constructive Theories
    Solomon Feferman
On the Relation Between Various Negative Translations
    Gilda Ferreira and Paulo Oliva
A Finite Axiomatisation of Inductive-Inductive Definitions
    Fredrik Nordvall Forsberg and Anton Setzer
Some Conservative Extension Results
    Hajime Ishihara
About the Strength of Operational Regularity
    Gerhard Jäger and Rico Zumbrunnen
Epsilon Substitution for ID1
    Grigori Mints
Another Unique Weak König’s Lemma WKL!!
    Joan Rand Moschovakis
The Continuous Functionals as Limit Spaces
    Dag Normann
Provably Recursive Functions of Reflection
    Wolfram Pohlers and Jan–Carl Stegert
A Hierarchy of Ramified Theories Below PRA
    Elliott J. Spoors and Stanley S. Wainer
Representing L-Domains as Information Systems
    Dieter Spreen
Contributors

Federico Aschieri, Dipartimento di Informatica, Università degli Studi di Torino, Italy
Stefano Berardi, Dipartimento di Informatica, Università degli Studi di Torino, Italy
Mark Bickford, Computer Science Department, Cornell University, Ithaca, NY, USA
Douglas S. Bridges, Department of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
Wilfried Buchholz, Mathematisches Institut, Ludwig-Maximilians-Universität München, Germany
Andrea Cantini, Dipartimento di Filosofia, Università degli Studi di Firenze, Italy
Robert Constable, Computer Science Department, Cornell University, Ithaca, NY, USA
Laura Crosilla, Department of Philosophy, University of Leeds, UK
Giovanni Curi, Dipartimento di Matematica, Università degli Studi di Padova, Italy
Justus Diller, Institut für math. Logik und Grundlagenforschung, Westfälische Wilhelms-Universität, Münster, Germany
Sebastian Eberhard, Institut für Informatik und angewandte Mathematik, Universität Bern, Switzerland
Solomon Feferman, Department of Mathematics, Stanford University, Stanford, USA
Gilda Ferreira, Departamento de Matematica, Faculdade de Ciencias da Universidade de Lisboa, Portugal
Fredrik Nordvall Forsberg, Department Computer Science, Swansea University, UK
Hajime Ishihara, School of Information Science, Japan Advanced Institute of Science and Technology, Japan
Gerhard Jäger, Institut für Informatik und angewandte Mathematik, Universität Bern, Switzerland
Grigori Mints, Department of Philosophy, Stanford University, Stanford, CA, USA
Joan Rand Moschovakis, Occidental College (Emerita), Graduate Program in Logic and Algorithms, Athens, Greece
Dag Normann, Department of Mathematics, University of Oslo, Norway
Paulo Oliva, School of Electronic Engineering and Computer Science, Queen Mary University of London, UK
Wolfram Pohlers, Institut für math. Logik und Grundlagenforschung, Westfälische Wilhelms-Universität, Münster, Germany
Michael Rathjen, Department of Pure Mathematics, University of Leeds, UK
Anton Setzer, Department Computer Science, Swansea University, UK
Elliott J. Spoors, Department of Pure Mathematics, University of Leeds, UK
Dieter Spreen, Department Mathematik, Universität Siegen, Germany
Jan–Carl Stegert, Institut für math. Logik und Grundlagenforschung, Westfälische Wilhelms-Universität, Münster, Germany
Thomas Strahm, Institut für Informatik und angewandte Mathematik, Universität Bern, Switzerland
Stanley S. Wainer, Department of Pure Mathematics, University of Leeds, UK
Andreas Weiermann, Vakgroep Wiskunde, Universiteit Ghent, Belgium
Rico Zumbrunnen, Institut für Informatik und angewandte Mathematik, Universität Bern, Switzerland
A New Use of Friedman’s Translation: Interactive Realizability

Federico Aschieri and Stefano Berardi

Friedman’s translation is a well-known transformation of formulas. It has two properties: (i) it validates intuitionistic theorems: if a formula is intuitionistically provable, then so is its Friedman translation; (ii) it is suitable for program extraction from classical proofs: the intuitionistic provability of the Friedman translation of the negative translation of a for-all-exists formula implies the intuitionistic provability of the formula itself. However, the Friedman translation does not validate classical principles, like the Excluded Middle. Here, we define a restricted Friedman translation which both validates the Excluded Middle and Skolem axiom schemata restricted to $\Sigma^0_1$-formulas and is also suitable for program extraction from classical proofs using such principles: the intuitionistic provability of the restricted Friedman translation of a for-all-exists formula implies the intuitionistic provability of the formula itself. Then we introduce a learning-based Realizability Semantics for Heyting Arithmetic with all finite types, extended with the two previous axiom schemata. We call this semantics “Interactive Realizability”, and we characterize it as the composition of our restricted Friedman translation with Kreisel modified realizability. As a corollary, we show that Interactive Realizability is, in a sense, “axiom-driven”, while the other Realizability Semantics for Classical Arithmetic, like the semantics of Krivine, are “goal-driven”.
1 Introduction
In the past years, many computational interpretations of Classical Arithmetic have been put forward. Under a first classification, they fall into two large categories: direct and indirect interpretations. Among the indirect interpretations one finds the negative translations followed either by Dialectica interpretations [13], [30] (see e.g. Kohlenbach [19]) or by intuitionistic realizability interpretations combined with Friedman’s translation [13] (see e.g. Berger and Schwichtenberg [10]). Among the direct interpretations, there are different versions of Classical Realizability (Krivine’s [22] and Avigad’s [6]), there is Coquand game semantics [11],
cut-elimination and normalization of classical proofs [14] (under Curry-Howard correspondence or not), and the epsilon substitution method [23] (the Kreisel no-counterexample interpretation [20] is an easy corollary of the other ones). More recently, another Classical Realizability interpretation for Heyting Arithmetic with Excluded Middle and Skolem axioms over $\Sigma^0_1$-formulas has been introduced by Aschieri, Berardi and de’ Liguoro [8], [1], [9]: it is based on the notion of learning and is called “Interactive Realizability”. The goal of this paper is to compare Interactive Realizability with the other notions of Classical Realizability, using Friedman’s translation and Kreisel modified Realizability as tools.

At first sight, these computational interpretations of Classical Arithmetic may appear completely different. However, this is not the case and it is often possible to find unifying concepts. A common way of studying and relating the various computational interpretations of Classical Arithmetic is, first, to characterize them in terms of translations of Classical into Intuitionistic Arithmetic and, secondly, to compare the resulting translations. In the case of Classical Realizability interpretations one usually has

Classical Realizability = Negative Translation + Friedman’s Translation + Modified Realizability

For example, Avigad [7] characterized his own Classical Realizability in terms of a special negative translation followed by Friedman’s translation, again followed by Kreisel’s modified realizability [21]; Towsner [32] did similarly. Oliva and Streicher [27] managed to do the same for Krivine’s classical realizability for Classical Analysis. Miquel [25] used their characterization to compare the algorithms extracted from proofs of $\Sigma^0_1$-formulas, obtained either by using Krivine’s realizability or Oliva and Streicher’s characterization, and to conclude that they are basically the same.
All these results characterize classical realizability in the following way: a classical realizer of a formula B is an intuitionistic realizer of some Friedman translation of some negative translation of B. In this paper we build on this line of research and investigate the relationship between Interactive Realizability and Friedman’s translation. We shall prove (from an idea of the first author) that Interactive Realizability for Heyting Arithmetic in all finite types $\mathsf{HA}^\omega$, with Excluded Middle $\mathsf{EM}_1$ and Skolem axioms $\mathsf{SK}_1$ over $\Sigma^0_1$-formulas, can be understood as a new way of using Friedman’s translation, a way that avoids the use of negative translations for program extraction purposes. Interactive Realizability restricts the family of goal-formulas in Friedman’s translation, in order to interpret each instance of Excluded Middle used in the proof by some constructive principle. While in the usual Friedman’s A-translation the
goal-formula A is some $\Sigma^0_1$-property that one wants to prove (it is “goal-driven”), in ours the goal is fixed once and for all, does not depend on the particular proof one is considering, and consists in learning something about the Skolem functions interpreting the Excluded Middle. More precisely, we will show that learning-based realizability can be decomposed into a fixed Friedman $\mathcal{A}$-translation followed by Kreisel’s modified realizability, as follows:

Interactive Realizability = Friedman’s $\mathcal{A}$-Translation + Modified Realizability

The fixed formula $\mathcal{A}$ has a free variable $s : \mathbb{N}^2 \to \mathbb{N}$ and says: “there exists a counterexample to the fact that s is a Skolem function solving the Halting problem”. More precisely, $\mathcal{A}$ asserts that s is not a Skolem function for the formula $\forall x^{\mathbb{N}} \forall y^{\mathbb{N}} \exists z^{\mathbb{N}}\, Txyz$, where T is Kleene’s predicate. In other words, $\mathcal{A}$ says that the familiar full type structure built over the natural numbers is not a Tarski model of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ whenever the function s is interpreted as a Skolem function solving the Halting problem, and that there is a counterexample supporting this assertion. We observe that a counterexample to the assertion that s is a Skolem function solving the Halting problem is just a triple of numbers (n, m, l) such that $Tnml = \mathsf{True}$ and $Tnms(n, m) = \mathsf{False}$. This is so because, if s is not such a Skolem function, it falsifies the Skolem axiom:

$$\forall x^{\mathbb{N}} \forall y^{\mathbb{N}} \forall z^{\mathbb{N}}.\ Txyz \to Txys(x, y)$$

We shall use this characterization to stress the similarities, but also the differences, between Interactive Realizability and the other Classical Realizability semantics proposed so far. Namely, all these Realizability semantics are related to some Friedman translation, but in Interactive Realizability the negative translation is eliminated, and the restricted Friedman translation implicit in Interactive Realizability validates the Excluded Middle and Skolem axioms (it is “axiom-driven”).
In particular, an interactive realizer of a formula B is characterized just as an intuitionistic realizer of the Friedman $\mathcal{A}$-translation of B. The result seems to accord with the intuition that learning-based realizability describes “locally” the constructive ideas hidden in classical proofs, thanks to the interpretation of classical principles with Skolem functions and learning algorithms.
1.1 Plan of the Paper
In section §2 we prove that there exists a restricted Friedman’s translation validating Excluded Middle and Skolem axiom schemata (both restricted to $\Sigma^0_1$-formulas),
which at the same time still allows program extraction from proofs of $\Pi^0_2$-formulas. In the rest of the paper we define the Interactive Realizability Semantics, corresponding to such restricted Friedman’s translation. In section §3 we introduce a term calculus in which Interactive, learning-based realizers are written, namely an extension of Gödel’s system T plus a constant symbol for a Skolem function Φ. In section §4, we extend Interactive Realizability, as described in [1], to $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$, an arithmetical system with functional variables. In section §5, we compare Interactive Realizability with Kreisel’s no-counterexample interpretation and we give an example of an interactive realizer. In section §6 we conclude our characterization of Interactive Realizability in terms of Kreisel’s modified Realizability + restricted Friedman’s translation.
2 A restricted version of Friedman’s Translation
In this section we first remark how Friedman’s translation does not validate Excluded Middle for $\Sigma^0_1$-formulas, and then how a restricted version of it does. In the rest of the paper we define the Interactive Realizability Semantics, which may be defined as the composition of Kreisel’s modified Realizability with our restricted Friedman’s translation.
2.1 The Friedman Translation
The Friedman translation is a strikingly simple device introduced by Friedman [13] in order to prove closure of intuitionistic systems S under Markov’s rule:

$$S \vdash \neg\forall x^{\mathbb{N}}\, \neg P(x) \implies S \vdash \exists x^{\mathbb{N}}\, P(x)$$

where P is any decidable quantifier-free formula, possibly with some other free variables besides x. The translation gives a reasonable semantics to formulas that are derived in a possibly inconsistent universe, in which some arbitrary universal statement is assumed to be true. In such a world, false statements can be proved by perfectly valid arguments, since the assumption of the world may be false. For example, one may prove $\mathsf{HA} + \forall x^{\mathbb{N}}\, \neg P(x) \vdash \bot$ as a lemma in a classical proof of $\exists x^{\mathbb{N}}\, P(x)$. Tarskian semantics is therefore no longer adequate. The idea of Friedman’s translation is to change the meaning of formulas in such a way that even false formulas are interpreted by true ones, carrying interesting constructive information. In a universe in which a false assumption is made,
the only way of recovering from the disaster of some derived false atomic formula Q is pointing out the concrete false consequence of the false assumption that is to blame for deriving Q. If the false assumption is $\forall x^{\mathbb{N}}\, \neg P(x)$, the new meaning of Q is

$$Q \lor \exists x^{\mathbb{N}}\, P(x)$$

In words, if Q is derived from the assumption $\forall x^{\mathbb{N}}\, \neg P(x)$, either Q is true, or Q is false and thus some false consequence $\neg P(n)$ of $\forall x^{\mathbb{N}}\, \neg P(x)$ must have been used in the derivation of Q. Thus P(n) must be true and $\exists x^{\mathbb{N}}\, P(x)$ must hold. The following theorem, from [13], is well known and holds for many systems. Here, we shall focus on the system $\mathsf{HA}^\omega$ (see section §4.1 for details):

Theorem 1 (Friedman’s A-Translation). Given any formulas A, B, where A has no free variables occurring bound in B, let us denote by $B^A$ the formula resulting from B by replacing every atomic formula Q of B with $Q \lor A$. If Γ is a set of formulas, $\mathsf{HA}^\omega + \Gamma \vdash B$, and $\mathsf{HA}^\omega \vdash F^A$ for every $F \in \Gamma$, then

$$\mathsf{HA}^\omega \vdash B^A$$
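The syntactic operation in Theorem 1 can be illustrated with a small sketch. The tuple encoding of formulas and the names `a_translate` and `neg` are assumptions made for this illustration only; falsum is treated as an atomic formula, negation is encoded as implication into falsum, and the side condition on the free variables of A is not checked.

```python
# Hypothetical encoding of formulas as nested tuples (not from the paper):
#   ("atom", name) | ("bot",)
#   ("and", F, G) | ("or", F, G) | ("imp", F, G)
#   ("forall", var, F) | ("exists", var, F)

def a_translate(b, a):
    """Return B^A: every atomic subformula Q of B (falsum included) becomes Q or A.

    Assumption: no free variable of A occurs bound in B (not checked here).
    """
    tag = b[0]
    if tag in ("atom", "bot"):
        return ("or", b, a)
    if tag in ("and", "or", "imp"):
        return (tag, a_translate(b[1], a), a_translate(b[2], a))
    if tag in ("forall", "exists"):
        return (tag, b[1], a_translate(b[2], a))
    raise ValueError(f"unknown connective: {tag!r}")


def neg(f):
    """Negation encoded as implication into falsum."""
    return ("imp", f, ("bot",))


# The instance used for Markov's rule: B = not forall x not Q(x), A = exists x Q(x).
q = ("atom", "Q(x)")
a = ("exists", "x", q)
b = neg(("forall", "x", neg(q)))
translated = a_translate(b, a)
```

In this instance every occurrence of falsum is replaced by a disjunction with $\exists x\, Q(x)$, which is why the translated formula can be shown to imply $\exists x\, Q(x)$ intuitionistically.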
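The theorem is proved by straightforward induction on proof length (see Friedman [13]), and crucially depends on the fact that the A-translation of every axiom of $\mathsf{HA}^\omega$ is provable in $\mathsf{HA}^\omega$. Now suppose that

$$\mathsf{HA}^\omega \vdash \neg\forall x^{\mathbb{N}}\, \neg Q(x)$$

where Q is any quantifier-free formula. Then, by theorem 1,

$$\mathsf{HA}^\omega \vdash \left(\neg\forall x^{\mathbb{N}}\, \neg Q(x)\right)^{\exists x^{\mathbb{N}}\, Q(x)}$$

But this latter formula implies $\exists x^{\mathbb{N}}\, Q(x)$ in HA (see Friedman [13]) and thus

$$\mathsf{HA}^\omega \vdash \exists x^{\mathbb{N}}\, Q(x)$$

Therefore $\mathsf{HA}^\omega$ is closed under Markov’s rule. This closure property of $\mathsf{HA}^\omega$ is exploited for program extraction purposes, in connection with Gödel’s double negation translation (again, see [13]). In particular, if $\mathsf{PA}^\omega$ is the classical version of $\mathsf{HA}^\omega$, one can show that

$$\mathsf{PA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y) \implies \mathsf{HA}^\omega \vdash \forall x^{\mathbb{N}}\, \neg\forall y^{\mathbb{N}}\, \neg P(x, y)$$

and thus by closure of $\mathsf{HA}^\omega$ under Markov’s rule

$$\mathsf{PA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y) \implies \mathsf{HA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$$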
We notice that the preliminary use of the negative translation before Friedman’s translation is necessary to obtain the above result, since the following generalization of theorem 1 is false:

$$\mathsf{PA}^\omega \vdash B \implies \mathsf{HA}^\omega \vdash B^A$$
This is due to the fact that $\mathsf{HA}^\omega$ does not prove the A-translation of the Excluded Middle for every formula A, because if A is refutable then $B^A \Leftrightarrow B$. A-translation proves only the A-translation of the double negation translation of Excluded Middle. Therefore, A-translation alone cannot eliminate classical reasoning. It is thus intriguing to ask whether Friedman’s translation alone is enough for program extraction from proofs of $\Pi^0_2$-formulas in the system $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$. More precisely: is there a formula A such that

$$\mathsf{HA}^\omega \vdash \mathsf{EM}_1^A \qquad \mathsf{HA}^\omega \vdash \mathsf{SK}_1^A$$

and for all atomic formulas P(x, y) we have the following correctness property:

$$\mathsf{HA}^\omega \vdash \left(\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)\right)^A \implies \mathsf{HA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)\ ?$$
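If there is such a formula, then by theorem 1 one has:

$$\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1 \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y) \implies \mathsf{HA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$$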
allowing program extraction from classical proofs in the system $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ just by using Kreisel’s modified realizability for $\mathsf{HA}^\omega$. The answer, as we shall see, is positive: Interactive Realizability defines in a natural way a formula A with the desired properties. We observe that this formula does not vary with the particular $\Pi^0_2$-formula one wants to prove, whereas in the standard use of Friedman’s translation the goal formula always does. For the sake of simplicity, in this section we assume

$$\mathsf{EM}_1 := \forall x^{\mathbb{N}} \forall y^{\mathbb{N}}.\ \exists z^{\mathbb{N}}\, T(x, y, z) \lor \forall z^{\mathbb{N}}\, \neg_{\mathsf{Bool}} T(x, y, z)$$

since all other instances of the Excluded Middle on $\Sigma^0_1$-formulas can be derived from the instance over Kleene’s predicate (see definition 8 of section §3). Similarly, we also assume

$$\mathsf{SK}_1 := \forall x^{\mathbb{N}} \forall y^{\mathbb{N}} \forall z^{\mathbb{N}}.\ Txyz \Rightarrow_{\mathsf{Bool}} Txy\Phi(x, y)$$
2.2 A New Way of using Friedman’s Translation
To the same extent that Friedman’s translation deals with provability under possibly false assumptions, Interactive, learning-based realizability (see section §4) deals
with computations under possibly false computational hypotheses. The first repairs false proved statements by pointing out the actual concrete assumption that causes the inconsistency; the second repairs wrong computational results by spotting some wrong value of the Skolem function that produced some mistake. In particular, the very idea behind learning-based realizability is to make assumptions about the values of the Skolem function Φ for the predicate T and, thanks to them, carry out computations even in situations in which one cannot effectively compute the right values of Φ. By a continuity argument, given any approximation $s : \mathbb{N}^2 \to \mathbb{N}$ of Φ, one knows that if s satisfies the following axiom

$$\mathsf{SK}_1[s] = \forall x^{\mathbb{N}} \forall y^{\mathbb{N}} \forall z^{\mathbb{N}}.\ Txyz \Rightarrow_{\mathsf{Bool}} Txys(x, y)$$
for a sufficient number of particular choices for x, y, z, then a realizer of a $\Sigma^0_1$-formula will be able to compute a right witness when using s in place of Φ. If, instead, the witness computed is incorrect, then one knows that $\mathsf{SK}_1[s]$ is false, and the task of the realizer is to spot a wrong value of s and to correct it with a right one. The realizer effectively finds numerals n, m, l such that

$$Tnml \land \neg_{\mathsf{Bool}} Tnms(n, m)$$

and thus recognizes that

$$\mathcal{A}(s) := \exists x^{\mathbb{N}} \exists y^{\mathbb{N}} \exists z^{\mathbb{N}}.\ Txyz \land \neg_{\mathsf{Bool}} Txys(x, y)$$

holds, which is classically equivalent to the negation of $\mathsf{SK}_1[s]$. $\mathcal{A}(s)$ asserts that there exists a counterexample to the fact that s is a Skolem function for T. In general, the behavior of a learning-based realizer of an atomic formula Q is to realize, in Kreisel’s sense, the formula $Q \lor \mathcal{A}(s)$ (but possibly providing multiple witnesses of $\mathcal{A}(s)$). We choose $\mathcal{A}(s)$ as the formula of the Friedman translation we were seeking, where s is a free variable denoting any map of type $\mathbb{N}^2 \to \mathbb{N}$. We now prove that $\mathcal{A}(s)$ is exactly the formula for the Friedman translation we asked for.

Theorem 2 (A New Use of Friedman’s Translation). Let $s : \mathbb{N}^2 \to \mathbb{N}$ be a variable and let

$$\mathcal{A}(s) := \exists x^{\mathbb{N}} \exists y^{\mathbb{N}} \exists z^{\mathbb{N}}.\ Txyz \land \neg_{\mathsf{Bool}} Txys(x, y)$$

Then we have
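1. $\mathsf{HA}^\omega \vdash (\mathsf{SK}_1[s])^{\mathcal{A}(s)}$
2. $\mathsf{HA}^\omega \vdash (\mathsf{EM}_1)^{\mathcal{A}(s)}$

Proof.

1. Since $(\mathsf{SK}_1[s])^{\mathcal{A}(s)}$ is equal to

$$\forall x^{\mathbb{N}} \forall y^{\mathbb{N}} \forall z^{\mathbb{N}}.\ \left(Txyz \Rightarrow_{\mathsf{Bool}} Txys(x, y)\right) \lor \exists x^{\mathbb{N}} \exists y^{\mathbb{N}} \exists z^{\mathbb{N}}.\ Txyz \land \neg_{\mathsf{Bool}} Txys(x, y)$$

it is immediate to show that $\mathsf{HA}^\omega \vdash (\mathsf{SK}_1[s])^{\mathcal{A}(s)}$.

2. By definition

$$\mathsf{EM}_1^{\mathcal{A}(s)} = \forall x^{\mathbb{N}} \forall y^{\mathbb{N}}.\ \exists z^{\mathbb{N}}.\ \left(T(x, y, z) \lor \mathcal{A}(s)\right) \lor \forall z^{\mathbb{N}}.\ \left(\neg_{\mathsf{Bool}} T(x, y, z) \lor \mathcal{A}(s)\right)$$

We reason by cases according to whether $Txys(x, y)$ is true or not:

(a) $Txys(x, y)$ is true. Then $\exists z^{\mathbb{N}}.\ T(x, y, z)$ is also true, and so is $\mathsf{EM}_1^{\mathcal{A}(s)}$.

(b) $Txys(x, y)$ is false. Then $\neg_{\mathsf{Bool}} Txys(x, y)$ is true. Let us consider an arbitrary z: we want to show that $\neg_{\mathsf{Bool}} T(x, y, z) \lor \mathcal{A}(s)$ holds. If $Txyz$ holds, then $\mathcal{A}(s)$ is true and we have finished; if $\neg_{\mathsf{Bool}} T(x, y, z)$ holds, we are done again.

We have thus concluded that

$$\mathsf{HA}^\omega \vdash (\mathsf{EM}_1)^{\mathcal{A}(s)} \qquad (1)$$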
In the rest of the paper we will define Interactive Realizability, with hindsight a semantics corresponding to the Friedman translation we just outlined. Using Interactive Realizability, at the end of this paper we will prove that the restricted Friedman translation of any $\Pi^0_2$-formula is correct, that is, that for any atomic formula P(x, y) we have:

Theorem 3 (Correctness of the Restricted Friedman Translation).

$$\mathsf{HA}^\omega \vdash \left(\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)\right)^{\mathcal{A}(s)} \implies \mathsf{HA}^\omega \vdash \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$$
It is very easy to see that, classically, if the formula $(\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y))^{\mathcal{A}(s)}$ is true for every s, then the formula $\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$ is also true. Indeed, from the Axiom of Choice, one obtains the existence of a Skolem function Φ solving the Halting problem. Thus, $\mathcal{A}(\Phi)$ must be false, since by construction there cannot be any counterexample to the fact that Φ is a Skolem function for the Halting problem (if we trust classical logic!). Thus P(x, y) is equivalent to $P(x, y) \lor \mathcal{A}(\Phi)$ and thus

$$\left(\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)\right)^{\mathcal{A}(\Phi)} \to \forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$$

is true. However, it requires more work to prove theorem 3. A synopsis of what one has to show is the following. From a proof in $\mathsf{HA}^\omega$ of the formula $(\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y))^{\mathcal{A}(s)}$ one can extract an interactive realizer t of the formula $\forall x^{\mathbb{N}} \exists y^{\mathbb{N}}\, P(x, y)$. If one fixes x = n, from that realizer t one directly obtains an update procedure (in the language of [3, 6]), which is a functional

$$U : (\mathbb{N}^2 \to \mathbb{N}) \to \mathbb{N}^3 \cup \{\emptyset\}$$

that, given any function $s : \mathbb{N}^2 \to \mathbb{N}$ as argument, returns either the empty set or a triple (i, j, l) consisting of a new correct value l of the aforementioned Skolem function Φ on argument (i, j). The idea is that with an update procedure one can construct a good enough finite approximation of Φ, which turns out to be a function f such that $U(f) = \emptyset$. The existence of such a function f can be proven in $\mathsf{HA}^\omega$ for any update procedure representable in Gödel’s system T. Then, using f, the interactive realizer t can compute a witness for the formula $\exists y^{\mathbb{N}}\, P(n, y)$. The first step of our program is to introduce a term calculus $\mathcal{T}_{\mathsf{Class}}$ in which the realizers of Interactive Realizability live.
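The learning loop sketched in the synopsis above can be illustrated by a toy program. Everything in it is an assumption made for illustration: `T` is a simple decidable stand-in for Kleene’s predicate, the update procedure only inspects a fixed finite list of candidate triples (a real one is extracted from a proof), and approximations of Φ are finite tables with default value 0.

```python
def T(x, y, z):
    """Toy decidable predicate standing in for Kleene's T (assumption)."""
    return z * z == x + y


def update_procedure(s, candidates):
    """Return a triple (i, j, l) with T(i, j, l) true but T(i, j, s(i, j)) false,
    i.e. a counterexample to SK1[s], or None if no candidate is one."""
    for (x, y, z) in candidates:
        if T(x, y, z) and not T(x, y, s(x, y)):
            return (x, y, z)
    return None


def learn(candidates):
    """Iterate the update procedure from the everywhere-0 approximation until
    it returns None; the result is a finite table f with U(f) = None."""
    table = {}
    while True:
        s = lambda x, y: table.get((x, y), 0)
        update = update_procedure(s, candidates)
        if update is None:
            return table
        i, j, l = update
        table[(i, j)] = l  # the new value is correct, so (i, j) is never updated again


candidates = [(1, 3, 2), (2, 2, 2), (0, 0, 5), (4, 5, 3)]
f = learn(candidates)
```

In this toy setting termination is evident: each update fixes one argument pair with a correct value, and only finitely many pairs occur among the candidates. The general termination argument for update procedures representable in Gödel’s system T is the one referred to in the text and is not reproduced by this sketch.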
3 The Term Calculus $\mathcal{T}_{\mathsf{Class}}$
The content of this section is based on Aschieri and Berardi [1], with a few simplifications, namely in the notion of state. We shall review the typed lambda calculi $\mathcal{T}$ and $\mathcal{T}_{\mathsf{Class}}$ in which learning-based realizers are written in [1]. $\mathcal{T}$ is a completely
standard extension of Gödel’s system T (see Girard [16]) with some syntactic sugar. The basic objects of $\mathcal{T}$ are numerals and booleans, and its basic computational constructs are primitive recursion at all types, if-then-else, and pairs, as in Gödel’s T. $\mathcal{T}$ also includes as basic objects finite partial functions over $\mathbb{N}$ and simple primitive recursive operations over them. $\mathcal{T}_{\mathsf{Class}}$ is obtained from $\mathcal{T}$ by adding on top of it a Skolem function symbol $\Phi : \mathbb{N} \to \mathbb{N} \to \mathbb{N}$, denoting some map Turing-equivalent to the oracle for the Halting problem. The symbol is totally inert from the computational point of view and realizers are always computed with respect to some approximation of the Skolem map represented by Φ.
3.1 Updates
In order to define $\mathcal{T}$, we start by introducing the concept of "update", which is nothing but a finite partial function over $\mathbb{N}$. Realizers of atomic formulas will return these finite partial functions, or "updates", representing new pieces of information that they have learned about the Skolem function $\Phi$. Skolem functions, in turn, are used as "oracles" during computations in the system $\mathcal{T}_{\mathrm{Class}}$. Updates are new input-output associations that are intended to correct, and in this sense to update, wrong oracle values used in a computation.

Definition 4 (Updates and Consistent Union). We define:
1. A binary predicate of $\mathsf{T}$ is any closed term $P : \mathsf{N}^2 \to \mathsf{Bool}$ of Gödel's $\mathsf{T}$.
2. We assume $P_0, P_1, P_2, \ldots$ is any sufficiently expressive recursive enumeration of binary predicates of $\mathsf{T}$. That is, we assume that for each numeral $n$, $T n = P_m$ for some $m$.
3. An update set $U$, shortly an update, is a finite set of triples of natural numbers representing a finite partial function from $\mathbb{N}^2$ to $\mathbb{N}$. We say that $U$ is sound if for every $(i, n, m) \in U$, we have $P_i\, n\, m = \mathsf{True}$.
4. Two triples $(a, n, m)$ and $(a', n', m')$ of numbers are consistent if $a = a'$ and $n = n'$ implies $m = m'$.
5. Two updates $U_1, U_2$ are consistent if $U_1 \cup U_2$ is an update.
A New Use of Friedman’s Translation: Interactive Realizability
6. $\mathbb{U}$ is the set of all updates.
7. The consistent union $U_1\, \mathcal{U}\, U_2$ of $U_1, U_2 \in \mathbb{U}$ is $U_1 \cup U_2$ minus all triples of $U_2$ which are inconsistent with some triple of $U_1$.

We think of a triple $(a, n, m)$ belonging to a sound update as the code of a witness for $\exists y^{\mathsf{N}}.\, P_a(n, y)$. The fact that every update is a partial function allows in each update at most one witness for each formula $\exists y^{\mathsf{N}}.\, P_a(n, y)$. We remark that the enumeration $P_0, P_1, \ldots$ can be arbitrary, as long as it is recursive, and will not play any significant role throughout the paper: it is just a simple way to give "names" to the predicates of $\mathsf{T}$ and to store witnesses. Only in section 6, for simplicity and for theoretical purposes, shall we assume a particular enumeration. For implementation purposes, we may assume the enumeration $P_0, P_1, \ldots$ to be just a computable list of every binary predicate of $\mathsf{T}$. We also remark that we could have defined an update to be just a single triple of numbers: all the results of this paper would hold in this case. However, it will be clear in the following that one obtains more efficient programs with our definition of updates: realizers of Post rules will avoid losing precious witnesses.

The consistent union $U_1\, \mathcal{U}\, U_2$ is a non-commutative operation: whenever a triple of $U_1$ and a triple of $U_2$ are inconsistent, we arbitrarily keep the triple of $U_1$ and we reject the triple of $U_2$; therefore for some $U_1, U_2$ we have $U_1\, \mathcal{U}\, U_2 \neq U_2\, \mathcal{U}\, U_1$. $\mathcal{U}$ is a "learning strategy", a way of selecting a consistent subset of $U_1 \cup U_2$, such that $U_1\, \mathcal{U}\, U_2 = \emptyset \implies U_1 = U_2 = \emptyset$. Any operator $\mathcal{U}$ selecting a consistent subset of $U_1 \cup U_2$ and satisfying $U_1\, \mathcal{U}\, U_2 = \emptyset \implies U_1 = U_2 = \emptyset$ would produce an alternative Realizability Semantics.
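To make Definition 4 concrete, the following Python sketch models an update as a frozenset of triples and implements the consistent union. The encoding and the names `is_update`, `consistent`, `consistent_union` and `is_sound` are illustrative assumptions, not part of the calculus; the predicates $P_i$ are stood in for by ordinary Python functions.

```python
# Sketch only: an update is modelled as a frozenset of triples (a, n, m),
# read as the finite partial function (a, n) -> m. All names are hypothetical.

def is_update(triples):
    """A set of triples is an update if no pair (a, n) gets two values."""
    seen = {}
    for (a, n, m) in triples:
        if seen.setdefault((a, n), m) != m:
            return False
    return True

def consistent(u1, u2):
    """Two updates are consistent if their plain union is again an update."""
    return is_update(u1 | u2)

def consistent_union(u1, u2):
    """Keep all of u1, plus the triples of u2 that do not clash with u1."""
    left = {(a, n): m for (a, n, m) in u1}
    kept = {(a, n, m) for (a, n, m) in u2 if left.get((a, n), m) == m}
    return frozenset(u1) | kept

def is_sound(u, preds):
    """Sound: P_i(n, m) holds for every triple (i, n, m); preds maps i to P_i."""
    return all(preds[i](n, m) for (i, n, m) in u)

u1 = frozenset({(0, 1, 5)})
u2 = frozenset({(0, 1, 7), (2, 3, 4)})
left_biased = consistent_union(u1, u2)   # the clash on (0, 1) is won by u1
right_biased = consistent_union(u2, u1)  # the clash on (0, 1) is won by u2
```

On these two inputs the two argument orders give different results, which is the non-commutativity remarked on above; the result is empty only when both arguments are empty.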
3.2 The System $\mathcal{T}$
$\mathcal{T}$ is formally described in figure 1. Terms of the form $\mathsf{if}_A\, t_1\, t_2\, t_3$ will be written in the more legible form $\mathsf{if}\ t_1\ \mathsf{then}\ t_2\ \mathsf{else}\ t_3$. For every update $U \in \mathbb{U}$, we assume having in $\mathcal{T}$ a constant $\overline{U} : \mathsf{U}$, where $\mathsf{U}$ is a new base type representing $\mathbb{U}$. We write $\emptyset$ for $\overline{\emptyset}$. In $\mathcal{T}$, there are four operations involving updates (see figure 1):
1. The first operation is denoted by the constant $\mathsf{is} : \mathsf{U} \to \mathsf{N}^2 \to \mathsf{Bool}$. $\mathsf{is}$ takes as arguments an update constant $U$ and two numerals $a, n$; it returns $\mathsf{True}$ if $(a, n, m) \in U$ for some $m \in \mathbb{N}$ (that is, if the pair $(a, n)$ is in the domain of the partial map $U$); it returns $\mathsf{False}$ otherwise.
2. The second operation is denoted by the constant $\mathsf{get} : \mathsf{U} \to \mathsf{N}^2 \to \mathsf{N}$. $\mathsf{get}$ takes as arguments an update constant $U$ and two numerals $a, n$; it returns $m$ if $(a, n, m) \in U$ for some $m \in \mathbb{N}$ (that is, if $(a, n)$ belongs to the domain of the partial function $U$); it returns $0$ otherwise.
3. The third operation is denoted by the constant $\mathsf{mkupd} : \mathsf{N}^3 \to \mathsf{U}$. $\mathsf{mkupd}$ takes as arguments three numerals $a, n, m$ and transforms them into (the constant coding in $\mathcal{T}$) the update $\{(a, n, m)\}$.
4. The fourth operation is denoted by the constant $\Cup : \mathsf{U}^2 \to \mathsf{U}$. $\Cup$ takes as arguments two update constants and returns the update constant denoting their consistent union.

We observe that the constants $\mathsf{is}$, $\mathsf{get}$, $\mathsf{mkupd}$ are just syntactic sugar and may be avoided by coding finite partial functions into natural numbers; their behaviour does not even depend on the enumeration $P_0, P_1, \ldots$ of binary predicates, since the updates in the language of $\mathcal{T}$ are not assumed to be sound. System $\mathcal{T}$ may thus be coded in Gödel's $\mathsf{T}$. System $\mathcal{T}$ is obtained from system $\mathsf{T}$ by adding a new atomic type and new operations on it. The following definition formalizes what has been done in extending $\mathsf{T}$ to $\mathcal{T}$, and it is useful for defining arbitrary extensions of $\mathcal{T}$ with arbitrary maps over natural numbers. We shall need such extensions when we add the non-computable map $\Phi$ to $\mathcal{T}$.

Definition 5 (Functional set of rules). Let $C$ be any set of constants, each one of some type $A_1 \to \ldots \to A_n \to A$, for some $A_1, \ldots, A_n, A \in \{\mathsf{Bool}, \mathsf{N}, \mathsf{U}\}$. We say that $\mathcal{R}$ is a functional set of reduction rules for $C$ if $\mathcal{R}$ consists, for all $c \in C$ and all closed normal terms $a_1 : A_1, \ldots, a_n : A_n$ of $\mathcal{T}$, of exactly one rule $c\, a_1 \ldots a_n \mapsto a$, for some closed normal term $a : A$ of $\mathcal{T}$.

Any extension of $\mathcal{T}$ with constants and even non-computable functional sets of rules is strongly normalizing and has the uniqueness-of-normal-form property.

Theorem 6. Assume that $\mathcal{R}$ is a functional set of reduction rules for $C$ (def. 5). Then $\mathcal{T} + C + \mathcal{R}$ enjoys strong normalization and weak-Church-Rosser (uniqueness of normal forms) for all closed terms of atomic types.
Proof. As in [3].
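The four update operations described in section 3.2 can be mimicked directly on finite sets of triples. This is a hedged sketch: the Python names `is_`, `get`, `mkupd` and `cup` are hypothetical stand-ins for the constants $\mathsf{is}$, $\mathsf{get}$, $\mathsf{mkupd}$ and the consistent-union constant, and update constants are represented by frozensets rather than by terms.

```python
# Sketch only: update constants are modelled as frozensets of triples (a, n, m).
# The Python names is_, get, mkupd and cup are hypothetical stand-ins for the
# constants is, get, mkupd and the consistent-union constant of the calculus.

def is_(u, a, n):
    """True iff (a, n) lies in the domain of the partial function u."""
    return any((a2, n2) == (a, n) for (a2, n2, _m) in u)

def get(u, a, n):
    """The value of u at (a, n), or 0 when (a, n) is outside the domain."""
    for (a2, n2, m) in u:
        if (a2, n2) == (a, n):
            return m
    return 0

def mkupd(a, n, m):
    """The one-element update {(a, n, m)}."""
    return frozenset({(a, n, m)})

def cup(u1, u2):
    """Consistent union: on a clash the triple of u1 is kept."""
    left = {(a, n): m for (a, n, m) in u1}
    kept = {(a, n, m) for (a, n, m) in u2 if left.get((a, n), m) == m}
    return frozenset(u1) | kept

u = cup(mkupd(0, 1, 5), cup(mkupd(0, 1, 7), mkupd(2, 3, 4)))
```

Note that `get` returns 0 both for a stored value 0 and for a missing pair, exactly as in the description above; `is_` is what distinguishes the two situations.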
The following normal form theorem also holds.

Lemma 7 (Normal Form Property for $\mathcal{T} + C + \mathcal{R}$). Assume that $\mathcal{R}$ is a functional set of reduction rules for $C$. Assume $A$ is either an atomic type or a product type. Then any closed normal term $t \in \mathcal{T}$ of type $A$ is: a numeral $n : \mathsf{N}$, or a boolean $\mathsf{True}, \mathsf{False} : \mathsf{Bool}$, or an update constant $\overline{U} : \mathsf{U}$, or a constant of type $A$, or a pair $\langle u, v \rangle : B \times C$.
Proof. As in [3].
In section 2 we made use of the fact that every instance of $\mathsf{EM}_1$ (Excluded Middle over all $\Sigma^0_1$-formulas) can be proved from the Excluded Middle over Kleene's predicate.

Definition 8 (Kleene's Predicate $T$). With $T : \mathsf{N}^3 \to \mathsf{Bool}$ we denote a predicate of Gödel's $\mathsf{T}$ representing Kleene's predicate (see e.g. Odifreddi [26]). That is, for any numerals $n, m, l$, $T\, n\, m\, l = \mathsf{True}$ if and only if the $n$-th partial recursive function is defined on argument $m$ and $l$ codes a computation proving it.
3.3 The System $\mathcal{T}_{\mathrm{Class}}$
We now define a classical extension of $\mathcal{T}$, which we call $\mathcal{T}_{\mathrm{Class}}$, with a constant symbol $\Phi : \mathsf{N}^2 \to \mathsf{N}$ denoting a non-computable map of the same Turing degree as an oracle for the Halting problem. We shall use the elements of $\mathcal{T}_{\mathrm{Class}}$ to represent non-computable realizers.

Definition 9 (The System $\mathcal{T}_{\mathrm{Class}}$). Define $\mathcal{T}_{\mathrm{Class}} = \mathcal{T} + \Phi$, where $\Phi : \mathsf{N}^2 \to \mathsf{N}$ is a new constant symbol. For every numeral $a$, the term $\Phi a$, which we shall denote with $\Phi_a$, represents a Skolem function for the formula $\exists y^{\mathsf{N}}\, P_a\, x\, y$, taking as argument a number $x$ and returning some $y$ such that $P_a\, x\, y$ if any exists, and an arbitrary value otherwise.

There is no set of computable reduction rules for the constant $\Phi$, and therefore no set of computable reduction rules for $\mathcal{T}_{\mathrm{Class}}$. Each (in general, non-computable) term $t \in \mathcal{T}_{\mathrm{Class}}$ is associated with a set $\{t[s] \mid s \in \mathcal{T},\ s : \mathsf{N}^2 \to \mathsf{N}\} \subseteq \mathcal{T}$ of computable terms that we call its "approximations", one for each term $s : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$, which is thought of as a computable approximation of the oracle $\Phi$.

Definition 10 (Approximation at state $s$). We define:
1. A state is a closed term of type $\mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$.
2. Assume $t \in \mathcal{T}_{\mathrm{Class}}$ and $s$ is a state. The "approximation of $t$ at state $s$" is the term $t[s]$ of $\mathcal{T}$ obtained from $t$ by replacing each constant $\Phi$ with $s$.
We interpret any $t[s] \in \mathcal{T}$ as a learning process evaluated with respect to the information taken from an approximation $s$ of $\Phi$. Here we consider an approximation of $\Phi$ to be an arbitrary term $s : \mathsf{N}^2 \to \mathsf{N}$; $s$ may agree with $\Phi$ on some arguments, but be wrong on others. Consequently, we are going to consider the set of $(a, n)$ such that $P_a\, n\, s_a(n) = \mathsf{True}$ as the real "domain" of $s$ (where $s_a(n)$ denotes $s\, a\, n$). We are also going to define a term $\oplus$, which takes as argument a term $f : \mathsf{N}^2 \to \mathsf{N}$ and an update $U$, and changes the values of $f$ according to $U$. This is the fundamental operation of our computational model: realizers correct wrong oracle approximations with new correct values that they have previously learned and stored in the updates. Last, using $\Phi$, we are going to define for every numeral $a$ the oracle $X_a$, which takes as argument a numeral $n$ and returns a guess for the truth value of $\exists y^{\mathsf{N}}\, P_a\, n\, y$.

Definition 11 (Domain, Updates of Functions, Oracle $X_a$). We define:
1. If $s$ is a state, we denote with $\mathrm{dom}(s)$ the set of pairs of numerals $(a, n)$ such that $P_a\, n\, s_a(n) = \mathsf{True}$.
2. We define a term $\oplus : (\mathsf{N}^2 \to \mathsf{N}) \to \mathsf{U} \to (\mathsf{N}^2 \to \mathsf{N})$ as follows:
$$\oplus := \lambda f^{\mathsf{N}^2 \to \mathsf{N}}\, \lambda u^{\mathsf{U}}\, \lambda x^{\mathsf{N}}\, \lambda y^{\mathsf{N}}\ \mathsf{if}\ (\mathsf{is}\ u\ x\ y)\ \mathsf{then}\ (\mathsf{get}\ u\ x\ y)\ \mathsf{else}\ f\, x\, y$$
We will write $t_1 \oplus t_2$ in place of $\oplus\, t_1\, t_2$.
3. For every numeral $a$, we define a term $X_a : \mathsf{N} \to \mathsf{Bool}$ as follows:
$$X_a := \lambda x^{\mathsf{N}}\, P_a\, x\, (\Phi_a x)$$
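The three notions of Definition 11 can be illustrated by a small Python sketch. It is only a model under stated assumptions: states are Python functions rather than closed terms, `preds[a]` stands in for the predicate $P_a$, and the names `oplus`, `in_dom` and `oracle_guess` are hypothetical.

```python
# Sketch only: a state is a Python function s(a, n), an update is a frozenset
# of triples (a, n, m), and preds[a] plays the role of the predicate P_a.
# The names oplus, in_dom and oracle_guess are hypothetical.

def oplus(f, u):
    """f (+) u: answer from the update where it is defined, else from f."""
    table = {(a, n): m for (a, n, m) in u}
    return lambda x, y: table[(x, y)] if (x, y) in table else f(x, y)

def in_dom(s, preds, a, n):
    """(a, n) is in dom(s) when s_a(n) really is a witness: P_a(n, s_a(n))."""
    return preds[a](n, s(a, n))

def oracle_guess(s, preds, a):
    """X_a approximated at state s: x |-> P_a(x, s_a(x))."""
    return lambda x: preds[a](x, s(a, x))

preds = {0: lambda n, m: m * m == n}    # toy P_0: "m is a square root of n"
s0 = lambda a, n: 0                     # a state that always answers 0
s1 = oplus(s0, frozenset({(0, 9, 3)}))  # s0 corrected at the pair (0, 9)
```

In this toy setting the guess of the oracle for "9 has a square root" is wrong at `s0` and becomes right at `s1`, while every other value of the state is left unchanged.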
We now introduce a notion of convergence for families of terms $\{t[s_i]\}_{i \in \mathbb{N}} \subseteq \mathcal{T}$, defined by some $t \in \mathcal{T}_{\mathrm{Class}}$ and indexed over a set $\{s_i\}_{i \in \mathbb{N}}$ of states. Informally, "$t$ convergent" means that the normal form of $t[s_i]$ eventually stops changing when the approximation $s_i$ of $\Phi$ gets better and better. If $s$ and $r$ are states, we formalize what it means that $r$ is at least as good an approximation as $s$ by defining:
$$s \leq r \iff \forall a, n.\ s_a(n) \neq r_a(n) \implies (a, n) \notin \mathrm{dom}(s) \land (a, n) \in \mathrm{dom}(r)$$
Intuitively, if $s \leq r$, then $r$ can be obtained by changing some of the values of $s$ that make $s$ a wrong approximation of $\Phi$. We say that a sequence $\{s_i\}_{i \in \mathbb{N}}$ of states is a weakly increasing chain of states (is w.i. for short) if $s_i \leq s_{i+1}$ for all $i \in \mathbb{N}$.
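The ordering on states can be probed on finitely many arguments. The sketch below is an assumption-laden illustration: the real relation quantifies over all pairs $(a, n)$, whereas the hypothetical helper `leq_on` only inspects a finite list of pairs, with `preds[a]` standing in for $P_a$.

```python
# Sketch only: the relation s <= r quantifies over all pairs (a, n); this
# hypothetical helper checks it on a finite list of pairs, nothing more.

def leq_on(s, r, preds, pairs):
    """Check s <= r on `pairs`: r may differ from s only where s was wrong
    and r is right, with preds[a] standing in for P_a."""
    for (a, n) in pairs:
        if s(a, n) != r(a, n):
            s_right = preds[a](n, s(a, n))   # (a, n) in dom(s)
            r_right = preds[a](n, r(a, n))   # (a, n) in dom(r)
            if s_right or not r_right:
                return False
    return True

preds = {0: lambda n, m: m * m == n}          # toy P_0
pairs = [(0, n) for n in range(20)]
s = lambda a, n: 0
better = lambda a, n: 3 if (a, n) == (0, 9) else 0   # fixes s at (0, 9)
worse = lambda a, n: 2 if (a, n) == (0, 9) else 0    # changes s, still wrong
```

Here `better` is above `s` in the ordering, while `worse` is not: it changes a value of `s` without making it correct.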
Definition 12 (Convergence). Assume that $\{s_i\}_{i \in \mathbb{N}}$ is a w.i. sequence of states, and $u \in \mathcal{T}_{\mathrm{Class}}$.
1. $u$ converges in $\{s_i\}_{i \in \mathbb{N}}$ if $\exists i \in \mathbb{N}.\ \forall j \geq i.\ u[s_j] = u[s_i]$ in $\mathcal{T}$.
2. $u$ converges if $u$ converges in every w.i. sequence of states.

We remark that if $u$ is convergent, we do not ask that $u$ converge to the same value on all w.i. chains of oracle approximations. The limit value of $u$ may depend on the information contained in the particular chain from which $u$ gets the knowledge. The chain of approximations, in turn, is selected by the particular definition we use for the "learning strategy" $\mathcal{U}$. Different "learning strategies" may produce different limit values.

Theorem 13 (Convergence Theorem). Assume $t \in \mathcal{T}_{\mathrm{Class}}$ is a closed term of atomic type $A$ ($A \in \{\mathsf{Bool}, \mathsf{N}, \mathsf{U}\}$). Then $t$ is convergent.
Proof. As in [3].
Remark. The idea of the proof of theorem 13 corresponds exactly to the intuition that during any computation, the oracle $\Phi$ is consulted a finite number of times and hence asked for a finite number of values. When the approximation $s_n$ of $\Phi$ is good enough, we can substitute $\Phi$ with $s_n$ and we obtain the same oracle values and hence the same results.
4 An Interactive Learning-Based Notion of Realizability for $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$
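In this section we introduce a learning-based notion of realizability for $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$, Heyting Arithmetic in all finite types (see e.g. Troelstra [31]) plus the Excluded Middle scheme for all $\Sigma^0_1$-formulas:
$$\mathsf{EM}_1 := \forall x^{\mathsf{N}}.\ \exists y^{\mathsf{N}}\, P_a\, x\, y \lor \forall y^{\mathsf{N}}\, \neg P_a\, x\, y$$
and Skolem axioms for all $\Sigma^0_1$-formulas:
$$\mathsf{SK}_1 := \forall x^{\mathsf{N}}\, \forall y^{\mathsf{N}}.\ P_a\, x\, y \to P_a\, x\, \Phi_a(x)$$
Then we prove our main theorem, the Adequacy Theorem: "if a closed arithmetical formula is provable in $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$, then it is realizable".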
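We first define the formal system $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$. We represent atomic predicates of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ with (in general, non-computable) closed terms of $\mathcal{T}_{\mathrm{Class}}$ of type $\mathsf{Bool}$. Terms of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ may include the function symbol $\Phi$, with $\Phi_a$ denoting the Skolem function for $\exists y^{\mathsf{N}}\, P_a\, x\, y$. We assume having in Gödel's $\mathsf{T}$ some terms $\Rightarrow_{\mathsf{Bool}} : \mathsf{Bool} \to \mathsf{Bool} \to \mathsf{Bool}$, $\neg_{\mathsf{Bool}} : \mathsf{Bool} \to \mathsf{Bool}$, $\lor_{\mathsf{Bool}} : \mathsf{Bool} \to \mathsf{Bool} \to \mathsf{Bool}$, ..., implementing boolean connectives. If $t_1, \ldots, t_n, t \in \mathsf{T}$ have type $\mathsf{Bool}$ and are built from free variables, all of type $\mathsf{Bool}$, using boolean connectives, we say that $t$ is a tautological consequence of $t_1, \ldots, t_n$ in $\mathsf{T}$ (a tautology if $n = 0$) if all boolean assignments making $t_1, \ldots, t_n$ equal to $\mathsf{True}$ in $\mathsf{T}$ also make $t$ equal to $\mathsf{True}$ in $\mathsf{T}$.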
4.1 Language of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$
We now define the language of the arithmetical theory $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$.

Definition 14 (Language of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$). The language $\mathcal{L}_{\mathrm{Class}}$ of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ is defined as follows.
1. The terms of $\mathcal{L}_{\mathrm{Class}}$ are all $t \in \mathcal{T}_{\mathrm{Class}}$.
2. The atomic formulas of $\mathcal{L}_{\mathrm{Class}}$ are all $Q \in \mathcal{T}_{\mathrm{Class}}$ such that $Q : \mathsf{Bool}$.
3. The formulas of $\mathcal{L}_{\mathrm{Class}}$ are built from atomic formulas of $\mathcal{L}_{\mathrm{Class}}$ by the connectives $\lor, \land, \to, \forall, \exists$ as usual, with quantifiers possibly ranging over variables $x^\tau, y^\tau, z^\tau$ of arbitrary finite type $\tau$ of $\mathcal{T}$.

We denote with $\bot$ the atomic formula $\mathsf{False}$. If $P$ is an atomic formula of $\mathcal{L}_{\mathrm{Class}}$ in the free variables $x_1^{\tau_1}, \ldots, x_n^{\tau_n}$ and $t_1 : \tau_1, \ldots, t_n : \tau_n$ are terms of $\mathcal{L}_{\mathrm{Class}}$, with $P(t_1, \ldots, t_n)$ we shall denote the atomic formula $P[t_1/x_1, \ldots, t_n/x_n]$. We defined $\Rightarrow_{\mathsf{Bool}} : \mathsf{Bool} \to \mathsf{Bool} \to \mathsf{Bool}$ as a term implementing implication; therefore, to be accurate, formulas of the form $P_a(t, u) \Rightarrow_{\mathsf{Bool}} P_a(t, \Phi_a t)$ are not an implication between two atomic formulas, but are equal to the single atomic formula $Q$, where
$$Q := \Rightarrow_{\mathsf{Bool}} (P_a\, t\, u)(P_a\, t\, (\Phi_a t))$$
Any atomic formula $A$ of $\mathcal{L}_{\mathrm{Class}}$ is a boolean term of $\mathcal{T}_{\mathrm{Class}}$; therefore for any $s : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$ we may form the "approximation" $A[s] : \mathsf{Bool}$, $A[s] \in \mathcal{T}$, of $A$. In $A[s]$ we replace the Skolem function $\Phi$ occurring in $A$ by its approximation $s$. From now onwards, for every pair of terms $t_1, t_2$ of system $\mathcal{T}$, we shall write $t_1 = t_2$ if they are the same term modulo the equality rules corresponding to the reduction rules of system $\mathcal{T}$ (equivalently, if they have the same normal form).
4.2 Interactive (or Learning-Based) Realizability
For every formula $A$ of $\mathcal{L}_{\mathrm{Class}}$, we are now going to define what type $|A|$ a realizer of $A$ must have.

Definition 15 (Types for realizers). For each formula $A$ of $\mathcal{L}_{\mathrm{Class}}$ we define a type $|A|$ of $\mathcal{T}_{\mathrm{Class}}$ by induction on $A$:
1. $|P| = \mathsf{U}$, if $P$ is atomic,
2. $|A \land B| = |A| \times |B|$,
3. $|A \lor B| = \mathsf{Bool} \times (|A| \times |B|)$,
4. $|A \to B| = |A| \to |B|$,
5. $|\forall x^\tau A| = \tau \to |A|$,
6. $|\exists x^\tau A| = \tau \times |A|$.

Let now $p_0 := \pi_0 : \sigma_0 \times (\sigma_1 \times \sigma_2) \to \sigma_0$, $p_1 := \pi_0 \pi_1 : \sigma_0 \times (\sigma_1 \times \sigma_2) \to \sigma_1$ and $p_2 := \pi_1 \pi_1 : \sigma_0 \times (\sigma_1 \times \sigma_2) \to \sigma_2$ be the three canonical projections from $\sigma_0 \times (\sigma_1 \times \sigma_2)$. We define the realizability relation $t \Vdash A$, where $t \in \mathcal{T}_{\mathrm{Class}}$, $A \in \mathcal{L}_{\mathrm{Class}}$ and $t : |A|$.

Definition 16 (Interactive Realizability). Assume $s$ is a state, $t$ is a closed term of $\mathcal{T}_{\mathrm{Class}}$, $C \in \mathcal{L}_{\mathrm{Class}}$ is a closed formula, and $t : |C|$. We first define the relation $t \Vdash_s C$ by induction and by cases according to the form of $C$:
1. $t \Vdash_s Q$ for some atomic $Q$ if and only if $t[s] = U$ implies:
   • $U$ is sound and $\mathrm{dom}(U) \cap \mathrm{dom}(s) = \emptyset$
   • $U = \emptyset$ implies $Q[s] = \mathsf{True}$
2. $t \Vdash_s A \land B$ if and only if $\pi_0 t \Vdash_s A$ and $\pi_1 t \Vdash_s B$
3. $t \Vdash_s A \lor B$ if and only if either $p_0 t[s] = \mathsf{True}$ and $p_1 t \Vdash_s A$, or $p_0 t[s] = \mathsf{False}$ and $p_2 t \Vdash_s B$
4. $t \Vdash_s A \to B$ if and only if for all $u$, if $u \Vdash_s A$, then $tu \Vdash_s B$
5. $t \Vdash_s \forall x^\tau A$ if and only if for all closed terms $u : \tau$ of $\mathcal{T}$, $tu \Vdash_s A[u/x]$
6. $t \Vdash_s \exists x^\tau A$ if and only if for some closed term $u : \tau$ of $\mathcal{T}$, $\pi_0 t[s] = u$ and $\pi_1 t \Vdash_s A[u/x]$

We define $t \Vdash A$ if and only if for all closed $s : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$, $t \Vdash_s A$.

Remark. The ideas behind the definition of $\Vdash_s$ in the case of $\mathsf{HA} + \mathsf{EM}_1 + \mathsf{SK}_1$ are those we already explained in [3], [1]. The system $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ has, with respect to the system $\mathsf{HA} + \mathsf{EM}_1 + \mathsf{SK}_1$ considered in our previous papers, a new feature: its quantifiers range over terms of arbitrary finite type, i.e. over the functionals definable in $\mathcal{L}_{\mathrm{Class}}$. These functionals, in general, are not recursive. However, in each particular world/state $s$ what is in general uncomputable becomes computable, since the Skolem function $\Phi$ is interpreted by a computable approximation $s$. Thus, in the state $s$ every term of $\mathcal{L}_{\mathrm{Class}}$ becomes recursive and, in fact, a term of Gödel's system $\mathsf{T}$. This is the reason why in the definition of the realizability relation $\Vdash_s$ all quantifiers range over terms of system $\mathcal{T}$.

The next proposition tells us that realizability at state $s$ respects the notion of equality of $\mathcal{T}_{\mathrm{Class}}$ terms, when the latter is relativized to state $s$. That is, if two terms are equal at the state $s$, then they realize the same formulas at the state $s$.

Proposition 17 (Saturation). If $t_1[s] = t_2[s]$ and $u_1[s] = u_2[s]$, then $t_1 \Vdash_s B[u_1/x]$ if and only if $t_2 \Vdash_s B[u_2/x]$.
Proof. By straightforward induction on $B$.
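The type assignment of Definition 15 is a plain structural recursion, which the following Python sketch spells out. The tuple encoding of formulas, the ASCII rendering of types (`*` for products, `->` for arrows) and the name `realizer_type` are assumptions made for illustration only.

```python
# Sketch only: formulas are nested tuples and types are rendered as strings.
# The tuple encoding and the name realizer_type are hypothetical.

def realizer_type(formula):
    tag = formula[0]
    if tag == "atom":                       # |P| = U
        return "U"
    if tag == "and":                        # |A and B| = |A| * |B|
        return f"({realizer_type(formula[1])} * {realizer_type(formula[2])})"
    if tag == "or":                         # |A or B| = Bool * (|A| * |B|)
        return (f"(Bool * ({realizer_type(formula[1])} * "
                f"{realizer_type(formula[2])}))")
    if tag == "imp":                        # |A -> B| = |A| -> |B|
        return f"({realizer_type(formula[1])} -> {realizer_type(formula[2])})"
    if tag == "forall":                     # ("forall", tau, body)
        return f"({formula[1]} -> {realizer_type(formula[2])})"
    if tag == "exists":                     # ("exists", tau, body)
        return f"({formula[1]} * {realizer_type(formula[2])})"
    raise ValueError(f"unknown connective: {tag!r}")

em1 = ("forall", "N",
       ("or", ("exists", "N", ("atom", "P x y")),
              ("forall", "N", ("atom", "not P x y"))))
em1_type = realizer_type(em1)
```

Applied to the shape of an instance of $\mathsf{EM}_1$, the function returns the type of a map from numerals to triples made of a boolean flag, a pair (witness, update) and a function from numerals to updates.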
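Example. The most remarkable feature of our Realizability Semantics is the existence of a realizer for $\mathsf{EM}_1$. Assume that $P_a$ is a predicate of $\mathsf{T}$ and define
$$E_a := \lambda \alpha^{\mathsf{N}}\, \langle X_a \alpha,\ \langle \Phi_a \alpha, \emptyset \rangle,\ \lambda n^{\mathsf{N}}\ \mathsf{if}\ P_a\, \alpha\, n\ \mathsf{then}\ \mathsf{mkupd}\ a\ \alpha\ n\ \mathsf{else}\ \emptyset \rangle$$
Indeed $E_a$ realizes its associated instance of $\mathsf{EM}_1$.

Proposition 18 (Realizer $E_a$ of $\mathsf{EM}_1$).
$$E_a \Vdash \forall x^{\mathsf{N}}.\ \exists y^{\mathsf{N}}\, P_a(x, y) \lor \forall y^{\mathsf{N}}\, \neg_{\mathsf{Bool}} P_a(x, y)$$
Proof. Let $m$ be any numeral. We have
$$E_a m = \langle X_a m,\ \langle \Phi_a m, \emptyset \rangle,\ \lambda n^{\mathsf{N}}\ \mathsf{if}\ P_a\, m\, n\ \mathsf{then}\ \mathsf{mkupd}\ a\ m\ n\ \mathsf{else}\ \emptyset \rangle$$
and we want to prove that
$$E_a m \Vdash_s \exists y^{\mathsf{N}}\, P_a(m, y) \lor \forall y^{\mathsf{N}}\, \neg_{\mathsf{Bool}} P_a(m, y)$$
We have $p_0 E_a m = X_a m = P_a(m, \Phi_a(m))$. There are two cases.
1. $X_a m[s] = \mathsf{True}$. Let $n = s_a(m)$. Then $P_a(m, n) = \mathsf{True}$ and we have to prove
$$p_1 E_a m \Vdash_s \exists y^{\mathsf{N}}\, P_a(m, y)$$
By definition
$$p_1 E_a m = \langle \Phi_a m, \emptyset \rangle$$
Thus
$$\pi_0 (p_1 E_a m)[s] = \pi_0 \langle s_a(m), \emptyset \rangle = n$$
and
$$\pi_1 (p_1 E_a m) \Vdash_s P_a(m, n)$$
because $P_a(m, n) = \mathsf{True}$. We conclude
$$p_1 E_a m \Vdash_s \exists y^{\mathsf{N}}\, P_a(m, y)$$
2. $X_a m[s] = \mathsf{False}$. We have to prove
$$p_2 E_a m = \lambda n^{\mathsf{N}}\ \mathsf{if}\ P_a\, m\, n\ \mathsf{then}\ \mathsf{mkupd}\ a\ m\ n\ \mathsf{else}\ \emptyset\ \Vdash_s\ \forall y^{\mathsf{N}}\, \neg_{\mathsf{Bool}} P_a(m, y)$$
i.e. that, given any numeral $n$,
$$\mathsf{if}\ P_a\, m\, n\ \mathsf{then}\ \mathsf{mkupd}\ a\ m\ n\ \mathsf{else}\ \emptyset\ \Vdash_s\ \neg_{\mathsf{Bool}} P_a(m, n)$$
If $P_a(m, n) = \mathsf{False}$, the thesis follows since $\neg_{\mathsf{Bool}} P_a(m, n) = \mathsf{True}$. If $P_a(m, n) = \mathsf{True}$, then
$$\mathsf{if}\ P_a\, m\, n\ \mathsf{then}\ \mathsf{mkupd}\ a\ m\ n\ \mathsf{else}\ \emptyset = \{(a, m, n)\}\ \Vdash_s\ \neg_{\mathsf{Bool}} P_a(m, n)$$
since $\{(a, m, n)\}$ is sound, $\mathrm{dom}(\{(a, m, n)\}) = \{(a, m)\}$ and, because $P_a(m, s_a(m)) = \mathsf{False}$, $(a, m) \notin \mathrm{dom}(s)$.

We now prove that if we start from any term $s_0 : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$ and we repeatedly apply any atomic realizer $t : \mathsf{U}$ of $\mathcal{T}_{\mathrm{Class}}$, we obtain a "zero" of $t$, that is, a term $s_n : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$ such that $t[s_n] = \emptyset$. We interpret this by saying that any atomic realizer $t$ represents a terminating learning process.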
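Returning to Proposition 18, the behaviour of the realizer $E_a$ at a fixed state can be sketched in Python. This is an illustration under assumptions: `P` stands in for $P_a$, a state is a Python function, updates are frozensets of triples, and the name `em_realizer` and the toy predicate are hypothetical.

```python
# Sketch only: E_a approximated at a state s, written as a Python function.
# P stands in for P_a, s(a, n) for the state, frozensets of triples for updates.

def em_realizer(a, P, s):
    def at(alpha):
        guess = s(a, alpha)               # Phi_a alpha, read off the state
        flag = P(alpha, guess)            # X_a alpha at state s
        left = (guess, frozenset())       # <Phi_a alpha, empty update>
        def right(n):                     # realizer of: for all y, not P_a(alpha, y)
            return frozenset({(a, alpha, n)}) if P(alpha, n) else frozenset()
        return (flag, (left, right))
    return at

P = lambda n, m: m * m == n               # toy P_0
s = lambda a, n: 0                        # state answering 0 everywhere
E = em_realizer(0, P, s)
flag9, (left9, right9) = E(9)             # s_0(9) = 0 is not a root of 9
flag0, (left0, right0) = E(0)             # s_0(0) = 0 is a root of 0
```

When the flag is false, the right component does not prove the universal statement outright; it only promises to return a correcting update as soon as it is handed a counterexample, which is the second case of the proof.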
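Theorem 19 (Zero Theorem). Let $Q$ be an atomic formula of $\mathcal{L}_{\mathrm{Class}}$ and suppose $t \Vdash Q$. Let $s : \mathsf{N}^2 \to \mathsf{N}$ be a closed term of $\mathcal{T}$. Define, by induction on $n$, a sequence $\{s_n\}_{n \in \mathbb{N}}$ of terms such that:
$$s_0 := s$$
$$s_{n+1} := s_n \oplus t[s_n] \overset{\text{def. 11}}{=} \lambda x^{\mathsf{N}}\, \lambda y^{\mathsf{N}}\ \mathsf{if}\ (\mathsf{is}\ t[s_n]\ x\ y)\ \mathsf{then}\ (\mathsf{get}\ t[s_n]\ x\ y)\ \mathsf{else}\ s_n(x, y)$$
Then there exists an $n$ such that $t[s_n] = \emptyset$.

Proof. We first prove that $s_0, s_1, s_2, \ldots$ is a weakly increasing chain. Suppose $s_i(a, n) \neq s_{i+1}(a, n)$: we have to prove that $(a, n) \in \mathrm{dom}(s_{i+1})$ and $(a, n) \notin \mathrm{dom}(s_i)$. By definition of $s_{i+1}$, if it were $\mathsf{is}\ t[s_i]\ a\ n = \mathsf{False}$, then we would have $s_i(a, n) = s_{i+1}(a, n)$, a contradiction. Thus, $\mathsf{is}\ t[s_i]\ a\ n = \mathsf{True}$, and if we choose an update $U$ such that $U = t[s_i]$, we have:
$$\mathsf{is}\ U\ a\ n = \mathsf{True}$$
that is, $(a, n) \in \mathrm{dom}(U)$, and for some $m$, $(a, n, m) \in U$. By definition of $t \Vdash_{s_i} Q$ we deduce that $U$ is sound and $\mathrm{dom}(s_i) \cap \mathrm{dom}(U) = \emptyset$. From $\mathrm{dom}(s_i) \cap \mathrm{dom}(U) = \emptyset$ and $(a, n) \in \mathrm{dom}(U)$ we obtain $(a, n) \notin \mathrm{dom}(s_i)$. From the soundness of $U$ and $(a, n, m) \in U$ we obtain $P_a\, n\, m = \mathsf{True}$. By definition,
$$s_{i+1}(a, n) = \mathsf{get}\ t[s_i]\ a\ n = \mathsf{get}\ U\ a\ n = m$$
Therefore, $s_{i+1}(a, n) = m$ and by definition 11 of $\mathrm{dom}$, we have that $(a, n) \in \mathrm{dom}(s_{i+1})$. We conclude that $s_0, s_1, s_2, \ldots$ is weakly increasing.

Now, by theorem 13, $t$ converges over the chain $\{s_i\}_{i \in \mathbb{N}}$: there exists $k \in \mathbb{N}$ such that for every $j \geq k$, $t[s_j] = t[s_k]$. By choice of $k$,
$$s_{k+1} \oplus t[s_{k+1}] = (s_k \oplus t[s_k]) \oplus t[s_{k+1}] = (s_k \oplus t[s_k]) \oplus t[s_k] = s_k \oplus t[s_k] = s_{k+1}$$
and hence it must be that $t[s_k] = \emptyset$. Indeed, let $U = t[s_k] = t[s_{k+1}]$: by soundness of $U$ and by definition of $s_{k+1}$, every $(a, n) \in \mathrm{dom}(U)$ belongs to $\mathrm{dom}(s_{k+1})$, while $t \Vdash_{s_{k+1}} Q$ gives $\mathrm{dom}(U) \cap \mathrm{dom}(s_{k+1}) = \emptyset$; therefore $U = \emptyset$, which is the thesis.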
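The iteration in the Zero Theorem is an ordinary loop, which the following Python sketch illustrates. It is not the construction of the proof: an atomic realizer is modelled simply as a Python function from states to updates, `max_steps` is a guard added for arbitrary inputs (termination is only guaranteed for genuine realizers), and the names `oplus`, `find_zero` and the toy realizer `t` are hypothetical.

```python
# Sketch only: a state is a function s(a, n), an update is a frozenset of
# triples, and an atomic realizer is modelled by the map s |-> t[s].
# The names oplus, find_zero and the toy realizer t are hypothetical.

def oplus(s, u):
    table = {(a, n): m for (a, n, m) in u}
    return lambda x, y: table[(x, y)] if (x, y) in table else s(x, y)

def find_zero(t, s, max_steps=1000):
    """Iterate s_{n+1} = s_n (+) t[s_n] until t[s_n] is the empty update.
    Returns the final state and the number of updates applied."""
    for steps in range(max_steps):
        u = t(s)
        if not u:
            return s, steps
        s = oplus(s, u)
    raise RuntimeError("no zero found within max_steps")

P = lambda n, m: m * m == n               # toy P_0

def t(s):
    """Toy atomic realizer for P_0(9, 3) => P_0(9, Phi_0 9) at state s."""
    if (not P(9, 3)) or P(9, s(0, 9)):
        return frozenset()
    return frozenset({(0, 9, 3)})

s_final, steps = find_zero(t, lambda a, n: 0)
```

In this toy run a single update suffices: the initial state answers 0 at the pair (0, 9), the realizer replies with the triple (0, 9, 3), and at the corrected state it has nothing more to learn.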
The Zero Theorem could be expressed, in an equivalent way, as a fixed point result, but we skip this reformulation for the sake of brevity. As usual for a realizability interpretation, we may extract from any realizer $t \Vdash \forall x^\sigma\, \exists y^\tau\, P x y$, with $P : \sigma \to \tau \to \mathsf{Bool}$ a closed term of system $\mathcal{T}$, some recursive map $f$ from the set of terms of type $\sigma$ to the set of terms of type $\tau$, such that $P u f(u) = \mathsf{True}$ for all $u : \sigma$.
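Theorem 20 (Program Extraction via Learning-Based Realizability). Let $t$ be a term of $\mathcal{T}_{\mathrm{Class}}$ and suppose that $t \Vdash \forall x^\sigma\, \exists y^\tau\, P x y$, with $P : \sigma \to \tau \to \mathsf{Bool}$ a closed term of system $\mathcal{T}$. Then:
1. From $t$ one can effectively define a recursive function $f$ such that for every closed term $u : \sigma$ of system $\mathcal{T}$, $f(u) : \tau$ is a term of system $\mathcal{T}$ such that $P u (f(u)) = \mathsf{True}$.
2. If $\sigma \in \{\mathsf{N}, \mathsf{N} \to \mathsf{N}\}$, then $f$ can be represented in system $\mathcal{T}$.
3. If $\sigma \notin \{\mathsf{N}, \mathsf{N} \to \mathsf{N}\}$, then $f$ can be represented in system $\mathcal{T}$ plus Spector's bar recursion (see Spector [30]).

Proof.
1. Let
$$v := \lambda m^\sigma\, \pi_1(t m)$$
$v$ is of type $\sigma \to \mathsf{U}$. Since for every closed $u : \sigma$ of $\mathcal{T}$
$$v u \Vdash P u\, \pi_0(t u)$$
by the Zero Theorem 19 there exists a recursive function zero from the set of type-$\sigma$ terms of system $\mathcal{T}$ to the set of type-$(\mathsf{N}^2 \to \mathsf{N})$ terms of $\mathcal{T}$ such that
$$v u[\mathrm{zero}(u)] = \emptyset$$
for every closed $u : \sigma$ of $\mathcal{T}$. Define the function
$$f := w \mapsto \pi_0(t w)[\mathrm{zero}(w)]$$
and fix a closed term $u : \sigma$ of $\mathcal{T}$. By unfolding the definition of realizability with respect to $\mathrm{zero}(u)$, we have that
$$t u \Vdash_{\mathrm{zero}(u)} \exists y^\tau\, P u y$$
and hence
$$\pi_1(t u) \Vdash_{\mathrm{zero}(u)} P u (f(u))$$
that is to say
$$v u[\mathrm{zero}(u)] = \emptyset \implies P u (f(u)) = \mathsf{True}$$
and therefore
$$P u (f(u)) = \mathsf{True}$$
which is the thesis.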
2. The fact that $f$ can be represented in system $\mathcal{T}$ follows by the methods of Aschieri [5]. In particular, by theorem 12 of [5], the function zero is representable in system $\mathcal{T}$ when $\sigma = \mathsf{N}$, because $\lambda m^\sigma\, \lambda g^{\mathsf{N} \to \mathsf{N}}\, v m[g]$ is a numerable collection of update procedures (see Avigad [6], Aschieri [5]). A straightforward generalization of the aforementioned theorem 12 of [5], taking care of collections of update procedures indexed by terms of type $\mathsf{N} \to \mathsf{N}$, extends the result to $\sigma = \mathsf{N} \to \mathsf{N}$.
3. See Aschieri [3], [4] for a proof that the function zero is representable in system $\mathcal{T}$ plus bar recursion.

Remark. The function $f$ described in theorem 20, point 1, reduces the problem of finding a witness for the formula $\exists x^\tau\, P u x$ to the problem of computing a zero of the atomic realizer
$$v u := \pi_1(t u)$$
This latter problem is solved by $f$ by computing the sequence
$$s_0 := s \qquad s_{n+1} := s_n \oplus v u[s_n]$$
until an $n$ is found such that $v u[s_n] = \emptyset$. The translation of $f$ into a term of system $\mathcal{T}$, which exists by theorem 20, point 2, yields the very same algorithm. The crucial fact is that the number $n$ can be computed directly in system $\mathcal{T}$, and thus the iteration that computes $s_n$ can be expressed by the primitive recursor of $\mathcal{T}$. For details see [3, 5].
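The algorithm described in the remark can be sketched as a Python loop. The sketch rests on simplifying assumptions: a realizer of a $\forall\exists$ statement is modelled as a function `t(u, s)` returning the candidate witness and the update computed at state `s`, the starting state answers 0 everywhere, and the predicate, the realizer `t` and the names `oplus` and `extract` are hypothetical toy choices.

```python
# Sketch only: a realizer of "for all x there is y with P(x, y)" is modelled
# as a function t(u, s) returning the pair (candidate witness, update), both
# computed at state s. All names and the toy predicate are hypothetical.
from math import isqrt

def oplus(s, u):
    table = {(a, n): m for (a, n, m) in u}
    return lambda x, y: table[(x, y)] if (x, y) in table else s(x, y)

def extract(t, u, s=lambda a, n: 0, max_steps=1000):
    """f(u): iterate s := s (+) update until the update is empty, then
    return the witness computed at that final state."""
    for _ in range(max_steps):
        witness, update = t(u, s)
        if not update:
            return witness
        s = oplus(s, update)
    raise RuntimeError("no zero found within max_steps")

R = lambda x, y: y * y == x                       # toy P_0
P = lambda x, y: (not R(x, isqrt(x))) or R(x, y)  # R(x, isqrt x) => R(x, y)

def t(x, s):
    """Toy realizer: propose y = s_0(x); if the implication fails, learn isqrt(x)."""
    y = s(0, x)
    update = frozenset() if P(x, y) else frozenset({(0, x, isqrt(x))})
    return y, update
```

For instance, `extract(t, 9)` first proposes 0, learns the triple (0, 9, 3) and then returns 3, whereas `extract(t, 10)` returns the initial guess 0, because 10 has no integer square root and the implication holds trivially.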
4.3 Curry-Howard Correspondence for $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$
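In figure 2, we define a standard natural deduction system for $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ (see [29], for example) together with a term assignment in the spirit of the Curry-Howard correspondence for classical logic. We replace purely universal axioms (i.e., $\Pi^0_1$-axioms) with Post rules, which are inferences of the form
$$\frac{\Gamma \vdash A_1 \quad \Gamma \vdash A_2 \quad \cdots \quad \Gamma \vdash A_n}{\Gamma \vdash A}$$
where $A_1, \ldots, A_n, A$ are atomic formulas of $\mathcal{L}_{\mathrm{Class}}$ such that for every substitution $\sigma = [t_1/x_1, \ldots, t_k/x_k, s/\Phi]$ of closed terms $t_1, \ldots, t_k$ of $\mathcal{T}$ and closed $s : \mathsf{N}^2 \to \mathsf{N}$ of $\mathcal{T}$, $A_1\sigma = \ldots = A_n\sigma = \mathsf{True}$ implies $A\sigma = \mathsf{True}$. Let now $\mathsf{eq} : \mathsf{N}^2 \to \mathsf{Bool}$ be a term of Gödel's system $\mathsf{T}$ representing equality between natural numbers. Among the Post rules, we have the Peano axioms
$$\frac{\Gamma \vdash \mathsf{eq}\ \mathsf{S}(x)\ \mathsf{S}(y)}{\Gamma \vdash \mathsf{eq}\ x\ y} \qquad \frac{\Gamma \vdash \mathsf{eq}\ 0\ \mathsf{S}(x)}{\Gamma \vdash \bot}$$
and axioms of equality
$$\frac{}{\Gamma \vdash \mathsf{eq}\ x\ x} \qquad \frac{\Gamma \vdash \mathsf{eq}\ x\ y \quad \Gamma \vdash \mathsf{eq}\ y\ z}{\Gamma \vdash \mathsf{eq}\ x\ z} \qquad \frac{\Gamma \vdash A(x) \quad \Gamma \vdash \mathsf{eq}\ x\ y}{\Gamma \vdash A(y)}$$
and for every $A_1, A_2$ such that $A_1 = A_2$ is an equation of Gödel's system $\mathsf{T}$ (equivalently, $A_1, A_2$ have the same normal form in $\mathsf{T}$), we have the rule
$$\frac{\Gamma \vdash A_1}{\Gamma \vdash A_2}$$
We also have a Post rule
$$\frac{\Gamma \vdash A_1 \quad \Gamma \vdash A_2 \quad \cdots \quad \Gamma \vdash A_n}{\Gamma \vdash A}$$
for every classical propositional tautology $A_1 \to \ldots \to A_n \to A$, where for $i = 1, \ldots, n$, $A_i, A$ are atomic formulas obtained as combinations of other atomic formulas by the boolean connectives of Gödel's system $\mathsf{T}$. By way of example, we have the rules
$$\frac{\Gamma \vdash \bot}{\Gamma \vdash P}$$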
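$$\frac{\Gamma \vdash B}{\Gamma \vdash A \Rightarrow_{\mathsf{Bool}} B} \qquad \frac{\Gamma \vdash A \land_{\mathsf{Bool}} B}{\Gamma \vdash A}$$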
Finally, we have a rule of case reasoning for booleans. For any atomic formula $P$ and any formula $A[P]$ we have:
$$\frac{\Gamma \vdash A[\mathsf{True}] \quad \Gamma \vdash A[\mathsf{False}]}{\Gamma \vdash A[P]}$$
The connectives $\lor_{\mathsf{Bool}}$ and $\lor$ have the same meaning but they are syntactically different: for every atomic formula $P$, we consider $P \lor_{\mathsf{Bool}} \neg_{\mathsf{Bool}} P$ an atomic formula and $P \lor \neg_{\mathsf{Bool}} P$ a compound formula. $P \lor_{\mathsf{Bool}} \neg_{\mathsf{Bool}} P$ is an axiom, while we may derive $\mathsf{HA}^\omega \vdash P \lor \neg_{\mathsf{Bool}} P$ by case reasoning.

Assume $u_1, \ldots, u_n$ are realizers of the assumptions of a Post rule. Then a realizer of the conclusion of the Post rule is of the form $u = u_1 \Cup \cdots \Cup u_n$. In this case, we have $n$ different realizers, whose learning capabilities are put together through a sort of union. In order to prove that $u$ realizes $A$, assume that $u[s] = \emptyset$; then $u_1[s] = \ldots = u_n[s] = \emptyset$, i.e. all $u_i$ "have nothing to learn". In that case, each $u_i$ must guarantee $A_i$ to be true, and therefore the conclusion of the Post rule is true, because true premises $A_1, \ldots, A_n$ spell a true conclusion $A$. Thus, $u$ realizes $A$.

If $T$ is any type of $\mathcal{T}$, we denote with $d^T$ a dummy term of type $T$, defined by $d^{\mathsf{N}} = 0$, $d^{\mathsf{Bool}} = \mathsf{False}$, $d^{\mathsf{U}} = \emptyset$, $d^{A \to B} = \lambda z^A. d^B$ (with $z^A$ any variable of type $A$), $d^{A \times B} = \langle d^A, d^B \rangle$.

We now prove our main theorem, that every theorem of $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$ is realizable. As usual in adequacy proofs for realizability, we prove a stronger version of the theorem, suitable to be proved by induction on proofs.

Theorem 21 (Adequacy Theorem). Suppose that $\Gamma \vdash w : A$ in the system $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1$, with $\Gamma = x_1 : A_1, \ldots, x_n : A_n$, and that the free variables of the formulas occurring in $\Gamma$ and $A$ are among $\alpha_1 : \tau_1, \ldots, \alpha_k : \tau_k$. For all states $s$ and for all closed terms $r_1 : \tau_1, \ldots, r_k : \tau_k$ of system $\mathcal{T}$, if there are terms $t_1, \ldots, t_n$ such that for $i = 1, \ldots, n$,
$$t_i \Vdash_s A_i[r_1/\alpha_1 \cdots r_k/\alpha_k]$$
then
$$w[t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k] \Vdash_s A[r_1/\alpha_1 \cdots r_k/\alpha_k]$$
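Proof. Notation: for any term $v$ and formula $B$, we denote
$$v[t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k]$$
with $\overline{v}$ and
$$B[r_1/\alpha_1 \cdots r_k/\alpha_k]$$
with $\overline{B}$. We have $|\overline{B}| = |B|$ for all formulas $B$. We denote with $=$ the provable equality in $\mathcal{T}_{\mathrm{Class}}$. We proceed by induction on $w$. Consider the last rule in the derivation of $\Gamma \vdash w : A$:
1. If it is the rule for variables, then for some $i$, $w = x_i^{|A_i|}$ and $A = A_i$. So $\overline{w} = t_i \Vdash_s \overline{A_i} = \overline{A}$.
2. If it is the $\land I$ rule, then $w = \langle u, t \rangle$, $A = B \land C$, $\Gamma \vdash u : B$ and $\Gamma \vdash t : C$. Therefore, $\overline{w} = \langle \overline{u}, \overline{t} \rangle$. By induction hypothesis, $\pi_0 \overline{w} = \overline{u} \Vdash_s \overline{B}$ and $\pi_1 \overline{w} = \overline{t} \Vdash_s \overline{C}$; so, by definition, $\overline{w} \Vdash_s \overline{B} \land \overline{C} = \overline{A}$.
3. If it is a $\land E$ rule, say left, then $w = \pi_0 u$ and $\Gamma \vdash u : A \land B$. So $\overline{w} = \pi_0 \overline{u} \Vdash_s \overline{A}$, because $\overline{u} \Vdash_s \overline{A} \land \overline{B}$ by induction hypothesis.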
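4. If it is the $\to E$ rule, then $w = ut$, $\Gamma \vdash u : B \to A$ and $\Gamma \vdash t : B$. So $\overline{w} = \overline{u}\, \overline{t} \Vdash_s \overline{A}$, for $\overline{u} \Vdash_s \overline{B} \to \overline{A}$ and $\overline{t} \Vdash_s \overline{B}$ by induction hypothesis.
5. If it is the $\to I$ rule, then $w = \lambda x^{|B|} u$, $A = B \to C$ and $\Gamma, x : B \vdash u : C$. Suppose now that $t \Vdash_s \overline{B}$; we have to prove that $\overline{w} t \Vdash_s \overline{C}$. By induction hypothesis on $u$, $\overline{u}[t/x^{|B|}] \Vdash_s \overline{C}$. By trivial equalities
$$\overline{w} t[s] = (\lambda x^{|B|} u)[t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k]\, t[s]$$
$$= (\lambda x^{|B|} u)\, t\, [t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k][s]$$
$$= u[t/x^{|B|}][t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k][s]$$
$$= \overline{u}[t/x^{|B|}][s]$$
Then by $\overline{u}[t/x^{|B|}][s] = \overline{w} t[s]$ and saturation (prop. 17), $\overline{w} t \Vdash_s \overline{C}$.
6. If it is a $\lor I$ rule, say left (the other case is symmetric), then we have $w = \langle \mathsf{True}, u, d^{|C|} \rangle$, $A = B \lor C$ and $\Gamma \vdash u : B$. So, $\overline{w} = \langle \mathsf{True}, \overline{u}, d^{|C|} \rangle$ and hence $\pi_0 \overline{w}[s] = \mathsf{True}$. We indeed verify that $\overline{u} \Vdash_s \overline{B}$ with the help of the induction hypothesis.
7. If it is a $\lor E$ rule, then
$$w = \mathsf{if}\ p_0 u\ \mathsf{then}\ (\lambda x^{|B|} w_1)(p_1 u)\ \mathsf{else}\ (\lambda y^{|C|} w_2)(p_2 u)$$
and $\Gamma \vdash u : B \lor C$, $\Gamma, x : B \vdash w_1 : D$, $\Gamma, y : C \vdash w_2 : D$, $A = D$. Assume $p_0 \overline{u}[s] = \mathsf{True}$. By inductive hypothesis $\overline{u} \Vdash_s \overline{B} \lor \overline{C}$. Therefore, $p_1 \overline{u} \Vdash_s \overline{B}$. Hence
$$\overline{w}[s] = (\lambda x^{|B|} w_1)\, p_1 u\, [t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k][s]$$
$$= w_1[p_1 u/x^{|B|}][t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k][s]$$
$$= w_1[p_1 \overline{u}/x^{|B|}\ t_1/x_1^{|A_1|} \cdots t_n/x_n^{|A_n|}\ r_1/\alpha_1 \cdots r_k/\alpha_k][s]$$
$$= \overline{w_1}[p_1 \overline{u}/x^{|B|}][s]$$
By induction hypothesis, $\overline{w_1}[p_1 \overline{u}/x^{|B|}] \Vdash_s \overline{D}$. Thus, by $\overline{w_1}[p_1 \overline{u}/x^{|B|}][s] = \overline{w}[s]$ and saturation (prop. 17), also $\overline{w} \Vdash_s \overline{D}$. Symmetrically, if $p_0 \overline{u}[s] = \mathsf{False}$, we obtain again $\overline{w} \Vdash_s \overline{D}$.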
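8. If it is the $\forall E$ rule, then $w = ut$, $A = B[t/\alpha^\tau]$ and $\Gamma \vdash u : \forall \alpha^\tau B$. So, $\overline{w} = \overline{u}\, \overline{t}$. Let $v = \overline{t}[s]$. By inductive hypothesis $\overline{u} \Vdash_s \forall \alpha^\tau \overline{B}$ and so $\overline{u} v \Vdash_s \overline{B}[v/\alpha^\tau]$. Since $\overline{u}\, \overline{t}[s] = \overline{u} v[s]$, by saturation (prop. 17), we conclude that $\overline{u}\, \overline{t} \Vdash_s \overline{B}[\overline{t}/\alpha^\tau]$.
9. If it is the $\forall I$ rule, then $w = \lambda \alpha^\tau u$, $A = \forall \alpha^\tau B$ and $\Gamma \vdash u : B$ (with $\alpha^\tau$ not occurring free in the formulas of $\Gamma$). So, $\overline{w} = \lambda \alpha^\tau \overline{u}$, since $\alpha \neq \alpha_1, \ldots, \alpha_k$. Let $t : \tau$ be a closed term of $\mathcal{T}$; by saturation (prop. 17), it is enough to prove that $\overline{w} t = \overline{u}[t/\alpha^\tau] \Vdash_s \overline{B}[t/\alpha^\tau]$, which amounts to showing that the induction hypothesis can be applied to $u$. For this purpose, we observe that, since $\alpha \neq \alpha_1, \ldots, \alpha_k$, for $i = 1, \ldots, n$ we have
$$t_i \Vdash_s \overline{A_i} = \overline{A_i}[t/\alpha^\tau]$$
10. If it is the $\exists E$ rule, then
$$w = (\lambda \alpha^\tau \lambda x^{|B|} t)(\pi_0 u)(\pi_1 u)$$
$\Gamma, x : B \vdash t : A$ and $\Gamma \vdash u : \exists \alpha^\tau. B$. Assume $v = \pi_0 \overline{u}[s]$. Then
$$\overline{t}[v/\alpha^\tau, \pi_1 \overline{u}/x^{|B|}] \Vdash_s \overline{A}[v/\alpha^\tau] = \overline{A}$$
by inductive hypothesis, whose application is justified by the fact, also by induction, that $\overline{u} \Vdash_s \exists \alpha^\tau. \overline{B}$ and hence $\pi_1 \overline{u} \Vdash_s \overline{B}[v/\alpha^\tau]$. We thus obtain by
$$\overline{w}[s] = \overline{t}[\pi_0 \overline{u}/\alpha^\tau\ \pi_1 \overline{u}/x^{|B|}][s]$$
and saturation (prop. 17) that $\overline{w} \Vdash_s \overline{A}$.
11. If it is the $\exists I$ rule, then $w = \langle t, u \rangle$, $A = \exists \alpha^\tau B$, $\Gamma \vdash u : B[t/\alpha^\tau]$. So, $\overline{w} = \langle \overline{t}, \overline{u} \rangle$; and, indeed, $\pi_1 \overline{w} = \overline{u} \Vdash_s \overline{B}[\pi_0 \overline{w}/\alpha^\tau] = \overline{B}[\overline{t}/\alpha^\tau]$ since by induction hypothesis $\overline{u} \Vdash_s \overline{B}[\overline{t}/\alpha^\tau]$. By saturation we conclude the thesis.
12. If it is the induction rule, then $w = \lambda \alpha^{\mathsf{N}}\, \mathsf{R}\, u\, v\, \alpha$, $A = \forall \alpha^{\mathsf{N}} B$, $\Gamma \vdash u : B(0)$ and $\Gamma \vdash v : \forall \alpha^{\mathsf{N}}. B(\alpha) \to B(\mathsf{S}(\alpha))$. So, $\overline{w} = \lambda \alpha^{\mathsf{N}}\, \mathsf{R}\, \overline{u}\, \overline{v}\, \alpha$. We have to prove that $\overline{w} z \Vdash_s \overline{B}[z/\alpha]$ for every closed term $z$ of type $\mathsf{N}$ of $\mathcal{T}$. Let $n$ be the normal form of $z[s]$: then $n$ is a numeral by Lemma 7. A plain induction on $i$ shows that $\overline{w} i = \mathsf{R}\, \overline{u}\, \overline{v}\, i \Vdash_s \overline{B}[i/\alpha]$ for all numerals $i$, for $\overline{u} \Vdash_s \overline{B}(0)$ and $\overline{v} i \Vdash_s \overline{B}(i) \to \overline{B}(\mathsf{S}(i))$ for all numerals $i$ by induction hypothesis. If we set $i = n$, the thesis follows by saturation and $\overline{w} z[s] = \overline{w} n[s]$.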
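13. If it is a Post rule, then $w = u_1 \Cup u_2 \Cup \cdots \Cup u_n$ and $\Gamma \vdash u_i : A_i$. So, $\overline{w} = \overline{u_1} \Cup \overline{u_2} \Cup \cdots \Cup \overline{u_n}$. First, suppose that, for $i = 1, \ldots, n$, $\overline{u_i}[s] = U_i$ and $\overline{w}[s] = U$. By induction hypothesis, $\mathrm{dom}(U_i) \cap \mathrm{dom}(s) = \emptyset$, and thus also $\mathrm{dom}(U) \cap \mathrm{dom}(s) = \emptyset$. Suppose now that $U = \emptyset$; then we have to prove that $\overline{A}[s] = \mathsf{True}$. It suffices to prove that $\overline{A_1}[s] = \overline{A_2}[s] = \cdots = \overline{A_n}[s] = \mathsf{True}$. We have $U_1 = \cdots = U_n = \emptyset$ and by induction hypothesis $\overline{A_1}[s] = \cdots = \overline{A_n}[s] = \mathsf{True}$, since $\overline{u_i} \Vdash_s \overline{A_i}$, for $i = 1, \ldots, n$.
14. If it is the excluded middle axiom $\mathsf{EM}_1$, then $w = E_a$ realizes $\mathsf{EM}_1$ by Prop. 18.
15. If it is a $\Phi$-axiom rule, then
$$w = \lambda x^{\mathsf{N}}\, \lambda y^{\mathsf{N}}\ \mathsf{if}\ (P_a\, x\, y \Rightarrow_{\mathsf{Bool}} P_a\, x\, (\Phi_a x))\ \mathsf{then}\ \emptyset\ \mathsf{else}\ (\mathsf{mkupd}\ a\ x\ y)$$
and
$$A = \forall x^{\mathsf{N}}\, \forall y^{\mathsf{N}}.\ P_a(x, y) \Rightarrow_{\mathsf{Bool}} P_a(x, \Phi_a x)$$
Let $n, m$ be two arbitrary numerals. We have to prove that
$$w n m \Vdash_s P_a(n, m) \Rightarrow_{\mathsf{Bool}} P_a(n, \Phi_a n)$$
There are two cases:
(a) $P_a(n, m) \Rightarrow_{\mathsf{Bool}} P_a(n, s_a n) = \mathsf{True}$. In this case, $w n m[s] = \emptyset$ and we have only to check that $\mathrm{dom}(s) \cap \mathrm{dom}(\emptyset) = \emptyset$, which is trivial.
(b) $P_a(n, m) \Rightarrow_{\mathsf{Bool}} P_a(n, s_a n) = \mathsf{False}$. Then, $P_a\, n\, m = \mathsf{True}$ and $P_a\, n\, s_a(n) = \mathsf{False}$. Moreover
$$w n m[s] = \mathsf{mkupd}\ a\ n\ m = U$$
with $U = \{(a, n, m)\}$. We have first to check that $U$ is sound (see definition 4): this follows from $P_a\, n\, m = \mathsf{True}$. Then we have to check that $\mathrm{dom}(s) \cap \mathrm{dom}(U) = \emptyset$: indeed, $\mathrm{dom}(U) = \{(a, n)\}$, and by definition 11, $P_a\, n\, s_a(n) = \mathsf{False}$ implies $(a, n) \notin \mathrm{dom}(s)$. Last, we have to check that $U \neq \emptyset$, which is immediate by $U = \{(a, n, m)\}$.

As a corollary of the Adequacy theorem 21, we obtain the main theorem.

Theorem 22. If $A$ is a closed formula such that $\mathsf{HA}^\omega + \mathsf{EM}_1 + \mathsf{SK}_1 \vdash t : A$, then $t \Vdash A$.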
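The two cases (a) and (b) of the $\Phi$-axiom rule can be replayed on a toy instance. The following Python sketch is illustrative only: `P` stands in for $P_a$, the state is a Python function, and the name `skolem_realizer` is hypothetical.

```python
# Sketch only: the realizer of the Skolem axiom for P_a, approximated at a
# state s. P stands in for P_a; the function names are hypothetical.

def skolem_realizer(a, P, s):
    def at(x, y):
        holds = (not P(x, y)) or P(x, s(a, x))   # P_a x y => P_a x (s_a x)
        return frozenset() if holds else frozenset({(a, x, y)})
    return at

P = lambda n, m: m * m == n          # toy P_0
s = lambda a, n: 0                   # state answering 0 everywhere
w = skolem_realizer(0, P, s)

case_a = w(9, 2)   # P_0(9, 2) is false, so the implication holds at s
case_b = w(9, 3)   # P_0(9, 3) holds but P_0(9, s_0(9)) fails: learn (0, 9, 3)
```

In case (b) the returned update is sound, because the supplied value really is a witness, and its domain is disjoint from the domain of the state, because the state was wrong at that pair.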
Federico Aschieri and Stefano Berardi

5 Interactive Realizability and Kreisel’s No Counterexample Interpretation of Σ⁰₂-formulas: an example of realizer
We now want to relate Interactive Realizability with Kreisel’s No-Counterexample Interpretation. We construct some terms transforming any witness of Kreisel’s Interpretation [20] of any Σ⁰₂-formula into an interactive realizer of that formula, and vice versa. Let us fix a Σ⁰₂-formula, with P(x, y) a predicate of Gödel’s system T:

∃x^N ∀y^N P(x, y)    (2)
We start from the no-counterexample interpretation.
5.1 From the No-Counterexample Interpretation to Interactive Realizability
A witness of the no-counterexample interpretation of (2) is a realizer Ψ, in the sense of Kreisel’s modified realizability, of the Herbrand normal form of (2):

∀f^{N→N} ∃x^N P(x, f(x))    (3)

That is, Ψ : (N → N) → N is a term of T such that:

∀f^{N→N} P(Ψ f, f(Ψ f))    (4)
As is well known, one has

HA^ω + EM1 + SK1 ⊢ ∀f^{N→N} P(Ψ f, f(Ψ f)) → ∃x^N ∀y^N P(x, y)

The proof is classical and goes as follows. Suppose (4) and assume without loss of generality that P0(x, y) ≡ ¬_Bool P(x, y). We recall that Φ0 is the Skolem map for ∃y^N P0(x, y). Let x = ΨΦ0. Then, by (4) with f = Φ0, we deduce P(ΨΦ0, Φ0(ΨΦ0)), that is P(x, Φ0(x)), and equivalently

¬_Bool P0(x, Φ0(x))

By SK1 we obtain the Skolem axiom for Φ0; therefore from ¬_Bool P0(x, Φ0(x)) we get ∀y^N ¬P0(x, y), which is equivalent to ∀y^N P(x, y). Thus we conclude (2).

By the Adequacy theorem 21, we know that an interactive realizer Ω of

∀f^{N→N} ∃x^N P(x, f(x)) → ∃x^N ∀y^N P(x, y)

does indeed exist. Since

λf^{N→N} ⟨Ψ f, ∅⟩ ⊩ ∀f^{N→N} ∃x^N P(x, f(x))

we have that Ω(λf^{N→N} ⟨Ψ f, ∅⟩) is an interactive realizer of (2). It is instructive, however, to build Ω directly, as an example, using the aforementioned classical proof as a source of inspiration. The witness to (2) coming from the proof is just Ψ(Φ0). This latter term is not computable, but we suppose we have an approximation s : N² → N of Φ. When we compute

Ψ(Φ0)[s] = Ψ(s0) = n

if we are lucky we obtain a numeral n which is a witness to (2). But in general this is not the case, and we must be prepared to test the result of our computation and to correct s if n is a wrong witness. Indeed, the term

λy^N if Pny then ∅ else mkupd 0 n y

does the job, as the next proposition says.

Proposition 23 (From the N.C.I. to Learning-Based Realizability). Suppose Ψ : (N → N) → N is a term of T such that:

∀f^{N→N} P(Ψ f, f(Ψ f))

Then

⟨Ψ(Φ0), λy^N if PΨ(Φ0)y then ∅ else mkupd 0 Ψ(Φ0) y⟩ ⊩ ∃x^N ∀y^N P(x, y)

Proof. Let s : N² → N be a closed term of T, s0 = Φ0[s] be the corresponding approximation of the Skolem map Φ0 for ∃y^N P0(x, y), and n be the normal form of Ψ(s0). Then n is a numeral by Lemma 7. By definition of interactive realizability, we have to show that, if we compute

Ψ(Φ0)[s] = Ψ(s0) = n
then we have

λy^N if PΨ(Φ0)y then ∅ else mkupd 0 Ψ(Φ0) y ⊩_s ∀y^N P(n, y)

Thus we must show that, for any fixed numeral m,

if PΨ(Φ0)m then ∅ else mkupd 0 Ψ(Φ0) m ⊩_s P(n, m)

Compute

(if PΨ(Φ0)m then ∅ else mkupd 0 Ψ(Φ0) m)[s] = if Pnm then ∅ else mkupd 0 n m = U

for some U ∈ U. There are two cases for U:

1. U = ∅. Then Pnm = True and thus we obtain the thesis

if PΨ(Φ0)m then ∅ else mkupd 0 Ψ(Φ0) m ⊩_s P(n, m)

by definition of realizability for atomic formulas.

2. U = {(0, n, m)}. Then Pnm = False, and hence P0 nm = True by P0(x, y) ≡ ¬_Bool P(x, y). Since ∀f^{N→N} P(Ψ f, f(Ψ f)), by letting f = s0 and replacing Ψs0 with n, we obtain P n s0(n) = True; therefore P0 n Φ0[s](n) = P0 n s0(n) = False. Therefore U is sound and dom(U) ∩ dom(s) = ∅. That is, we have the thesis:

if PΨ(Φ0)m then ∅ else mkupd 0 Ψ(Φ0) m ⊩_s P(n, m)
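As an informal illustration of Proposition 23, the following Python sketch evaluates the realizer at a given approximation of the Skolem map, for one concrete predicate. Everything concrete here is an assumption made for the example and does not come from the paper: the function `g`, the choice P(x, y) ≡ g(x) ≤ g(y), the functional `psi` playing the role of Ψ, and the encoding of an update {(0, n, m)} as the dictionary `{(0, n): m}`.

```python
# Illustrative sketch only: g, P, psi and the dict encoding of updates are
# hypothetical choices for this example, not definitions from the paper.

def g(x):
    # An arbitrary function N -> N whose minimum we look for.
    return abs(x - 5)

def P(x, y):
    # P(x, y): x is at least as good a minimiser of g as y.
    return g(x) <= g(y)

def psi(f):
    # A no-counterexample witness: returns x with P(x, f(x)).
    # The loop terminates because g strictly decreases along x -> f(x).
    x = 0
    while not P(x, f(x)):
        x = f(x)
    return x

def realizer(s0):
    """Evaluate the pair <Psi(Phi0), lambda y. if P n y then {} else mkupd 0 n y>
    when the Skolem map Phi0 is approximated by the function s0."""
    n = psi(s0)

    def check(m):
        # Empty update if n passes the test against m, otherwise the
        # one-element update recording the counterexample m at (0, n).
        return {} if P(n, m) else {(0, n): m}

    return n, check
```

With the trivial approximation `lambda x: 0`, the candidate witness is `n = 0`; testing it against `m = 4` fails and yields the update `{(0, 0): 4}`, which is the correction case 2 of the proof describes.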
5.2 From Interactive Realizability to the No-Counterexample Interpretation
It is trivial to show that

HA^ω + EM1 + SK1 ⊢ ∃x^N ∀y^N P(x, y) → ∀f^{N→N} ∃x^N P(x, f(x))

since any x such that ∀y^N P(x, y) is such that P(x, f(x)), whatever f is. Hence, given any interactive learning-based realizer

t ⊩ ∃x^N ∀y^N P(x, y)

and by following the idea of the trivial proof above, we obtain that

λf^{N→N} ⟨π0 t, (π1 t) f(π0 t)⟩ ⊩ ∀f^{N→N} ∃x^N P(x, f(x))

By theorem 20, we obtain a term with the properties of Ψ. According to the proof of theorem 20, that term takes an f : N → N and computes the sequence

s_0 := λx^N 0
s_{n+1} := s_n ⊕ (π1 t) f(π0 t)[s_n]

until an m such that (π1 t) f(π0 t)[s_m] = ∅ is found; then it returns π0 t[s_m]. It is more efficient, however, to start with s_0 := f, since by theorem 19 the produced sequence converges as well to a zero.
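The iteration just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the term extracted in the paper: the predicate `P`, the function `g`, the functional `psi`, the realizer `realizer`, the dictionary encoding of states and updates, and the `max_steps` guard are all hypothetical choices for the example. The state starts as the empty dictionary, read as the constant-zero function s_0.

```python
# Illustrative sketch only: all concrete names below are assumptions.

def g(x):
    return abs(x - 5)

def P(x, y):
    # P(x, y): g(x) <= g(y); the formula "exists x, forall y, P(x, y)"
    # then says that g has a minimum point.
    return g(x) <= g(y)

def psi(f):
    # Returns x with P(x, f(x)); g strictly decreases along the loop.
    x = 0
    while not P(x, f(x)):
        x = f(x)
    return x

def realizer(state):
    # An interactive realizer of "exists x, forall y, P(x, y)", evaluated at
    # a state: returns the candidate witness and the test on a given y.
    s0 = lambda x: state.get((0, x), 0)
    n = psi(s0)
    return n, (lambda m: {} if P(n, m) else {(0, n): m})

def witness_for(f, max_steps=1000):
    """Compute s_{k+1} = s_k (+) update until the update is empty, then
    return the current witness x, which satisfies P(x, f(x))."""
    state = {}                          # s_0 = constant-zero function
    for _ in range(max_steps):
        n, check = realizer(state)
        update = check(f(n))            # (pi1 t) f(pi0 t) [s_k]
        if not update:                  # empty update: a zero is reached
            return n
        state = {**state, **update}     # s_k (+) update
    raise RuntimeError("no empty update within max_steps")
```

For the counterexample function `f = lambda x: x + 1`, the loop learns the values 1, 2, 3, 4, 5 at the arguments 0, 1, 2, 3, 4 in turn and then stops with the witness 5.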
6 Interactive Realizability as Friedman’s Translation + Kreisel’s Modified Realizability
In this section we show that the notion of Interactive Realizability for HA^ω + EM1 + SK1 represents a new way of using Friedman’s translation. More precisely, Interactive Realizability is exactly the same as Kreisel’s modified realizability for HA^ω applied to our restricted Friedman translation of formulas: we claim that the notion t ⊩ B is equivalent to the notion

∀s^{N→N} ∈ T. t[s] mr B[s]^{A(s)}

Before we begin our argument, we recall the definition of modified realizability mr (Kreisel [20]). We denote with L the set of terms and formulas obtained from the terms and formulas of L_Class by replacing the constant Φ with some term s : N² → N of T.

Definition 24 (Modified Realizability). Assume s : N² → N is a closed term of T, t is a closed term of T, D ∈ L is a closed formula, and t : |D|. We define by induction on D the relation t mr D:

1. t mr Q if and only if Q = True
2. t mr A ∧ B if and only if π0 t mr A and π1 t mr B
3. t mr A ∨ B if and only if either p0 t = True and p1 t mr A, or p0 t = False and p2 t mr B
4. t mr A → B if and only if for all u, if u mr A, then tu mr B
5. t mr ∀x^τ A if and only if for all closed terms u : τ of T, tu mr A[u/x]
6. t mr ∃x^τ A if and only if for some closed term u : τ of T, π0 t = u and π1 t mr A[u/x]

It is technically more convenient to define directly a realizability relation mrf_s such that t mrf_s B[s] is equivalent (modulo some inessential adjustments in the atomic case) to the relation t mr B[s]^{A(s)}. When Q is atomic, by definition the relation t mr Q[s]^{A(s)} is

t mr Q[s] ∨ ∃x^N ∃y^N ∃z^N. Txyz ∧ ¬Txys(x, y)

So, either Q[s] = True or p2 t contains a triple of numerals n, m, l such that Tnml ∧ ¬Tnms(n, m) is true. Thus it is better to define t directly as a term of type U which reduces to an update, non-empty and containing such numerals n, m, l whenever Q is false.

Definition 25 (mr Combined with the A(s)-Translation). Assume s : N² → N is a closed term of T, t is a closed term of T, D ∈ L is a closed formula of L, and t : |D|. We define by induction on D the relation t mrf_s D:

1. t mrf_s Q if and only if
   • t = U implies that for all (n, m, l) ∈ U, Tnml = True and Tnms(n, m) = False
   • t = ∅ implies Q = True
2. t mrf_s A ∧ B if and only if π0 t mrf_s A and π1 t mrf_s B
3. t mrf_s A ∨ B if and only if either p0 t = True and p1 t mrf_s A, or p0 t = False and p2 t mrf_s B
4. t mrf_s A → B if and only if for all u, if u mrf_s A, then tu mrf_s B
5. t mrf_s ∀x^τ A if and only if for all closed terms u : τ of T, tu mrf_s A[u/x]
6. t mrf_s ∃x^τ A if and only if for some closed term u : τ of T, π0 t = u and π1 t mrf_s A[u/x]

We are now able to characterize learning-based realizability as a Friedman translation combined with Kreisel’s modified realizability.

Theorem 26 (Characterization of Interactive Realizability). Let t ∈ T_Class and s : N² → N ∈ T. Then, for every B ∈ L_Class

t ⊩_s B ⟺ t[s] mrf_s B[s]

Proof. The thesis is proved by routine induction on B.

1. B = Q, with Q atomic. Then t[s] mrf_s Q[s], by definition 25, holds if and only if:
   • t[s] = U implies that Tnml = True and Tnms(n, m) = False for all (n, m, l) ∈ U
   • t[s] = ∅ implies Q[s] = True
   Indeed, this is exactly the definition 16 of t ⊩_s Q, provided one makes some additional hypothesis on the enumeration P0, P1, … of definition 4. For example, it is enough to assume that for each numeral n, P_n = Tn.

2. B = C ∧ D. Then t ⊩_s C ∧ D if and only if π0 t ⊩_s C and π1 t ⊩_s D if and only if (by induction hypothesis) π0 t[s] mrf_s C[s] and π1 t[s] mrf_s D[s] if and only if t[s] mrf_s (C ∧ D)[s].

3. B = C ∨ D. Assume p0 t[s] = True (the case p0 t[s] = False is symmetrical). Then, t ⊩_s C ∨ D if and only if p1 t ⊩_s C if and only if (by induction hypothesis) p1 t[s] mrf_s C[s] if and only if t[s] mrf_s (C ∨ D)[s], by the very definition 25 of mrf_s.

4. B = C → D. Assume t ⊩_s C → D. We want to prove that t[s] mrf_s (C → D)[s]. Thus, we have to suppose u mrf_s C[s] and conclude that t[s]u mrf_s D[s]. Since u = u[s] (u is a closed term of T by definition 25), by induction hypothesis we obtain that u ⊩_s C and hence that tu ⊩_s D. By induction hypothesis, t[s]u = tu[s] mrf_s D[s], which is what we wanted to show.
Conversely, assume t[s] mrf_s (C → D)[s]. We want to prove that t ⊩_s C → D. Thus, we have to suppose u ⊩_s C and conclude that tu ⊩_s D. By induction hypothesis, we obtain that u[s] mrf_s C[s] and hence that tu[s] mrf_s D[s]. By induction hypothesis again, tu ⊩_s D, which is what we wanted to show.

5. B = ∀x^τ C. Assume t ⊩_s ∀x^τ C and let u : τ be an arbitrary closed term of T. Then tu ⊩_s C[u/x] and by induction hypothesis

t[s]u = tu[s] mrf_s C[u/x][s] = C[s][u/x]

We have thus proved that t[s] mrf_s ∀x^τ C[s]. Similarly, one proves that t[s] mrf_s ∀x^τ C[s] implies t ⊩_s ∀x^τ C.

6. B = ∃x^τ C. Assume π0 t[s] = u. Then t ⊩_s ∃x^τ C if and only if π1 t ⊩_s C[u/x] if and only if (by induction hypothesis)

π1 t[s] mrf_s C[u/x][s] = C[s][u/x]

if and only if t[s] mrf_s ∃x^τ C[s].

Finally, we are able to prove the correctness property of the restricted Friedman translation which we claimed in §2.

Theorem 27 (Correctness of the Restricted Friedman Translation). Given any atomic predicate P(x, y), the following holds:

HA^ω ⊢ (∀x^N ∃y^N P(x, y))^{A(s)} ⟹ HA^ω ⊢ ∀x^N ∃y^N P(x, y)
Proof. Let us consider an arbitrary Π⁰₂-formula ∀x^N ∃y^N P(x, y). By applying Kreisel’s modified realizability to any proof in HA^ω of (∀x^N ∃y^N P(x, y))^{A(s)}, one obtains a term t[s] of Gödel’s system T such that

∀s^{N→N}. t[s] mr (∀x^N ∃y^N P(x, y))^{A(s)}

and thus

∀s^{N→N}. t[s] mrf_s ∀x^N ∃y^N P(x, y)

by definition of mrf. By theorem 26, for some t0 we have

t0 ⊩ ∀x^N ∃y^N P(x, y)

Moreover, by theorem 20, from t0 one can extract a term u : N → N of Gödel’s system T such that for every numeral n, P(n, u(n)) = True. The proof is carried out in HA^ω, hence one obtains that

t0 ⊩ ∀x^N ∃y^N P(x, y) ⟹ HA^ω ⊢ ∀x^N ∃y^N P(x, y)

Therefore,

HA^ω ⊢ (∀x^N ∃y^N P(x, y))^{A(s)} ⟹ HA^ω ⊢ ∀x^N ∃y^N P(x, y)

which is the thesis.
References

[1] F. Aschieri, S. Berardi, Interactive Learning-Based Realizability for Heyting Arithmetic with EM1, Logical Methods in Computer Science, 2010.
[2] F. Aschieri, Interactive Learning Based Realizability and 1-Backtracking Games, Proceedings of Classical Logic and Computation 2010, Electronic Proceedings in Theoretical Computer Science, 2011.
[3] F. Aschieri, Learning, Realizability and Games in Classical Arithmetic, PhD Thesis, 2011. http://arxiv.org/abs/1012.4992
[4] F. Aschieri, Transfinite Update Procedures for Predicative Systems of Analysis, Proceedings of Computer Science Logic, 2011.
[5] F. Aschieri, A Constructive Analysis of Learning in Peano Arithmetic, Annals of Pure and Applied Logic, 2011, doi:10.1016/j.apal.2011.12.04
[6] J. Avigad, Update Procedures and 1-Consistency of Arithmetic, Mathematical Logic Quarterly, volume 48, 2002.
[7] J. Avigad, A Realizability Interpretation for Classical Arithmetic, in Buss, Hájek, and Pudlák (eds.), Logic Colloquium ’98, Lecture Notes in Logic 13, AK Peters, 57-90, 2000.
[8] S. Berardi, Classical Logic as Limit Completion, MSCS, Vol. 15, n. 1, 2005, pp. 167-200.
[9] S. Berardi and U. de’ Liguoro, Interactive Realizers. A New Approach to Program Extraction from non-Constructive Proofs, TOCL, 2010.
[10] U. Berger, H. Schwichtenberg, Program Extraction from Classical Proofs, Logic and Computational Complexity Workshop 1994, Lecture Notes in Computer Science, vol. 960, Springer, 1995.
[11] T. Coquand, A Semantic of Evidence for Classical Arithmetic, Journal of Symbolic Logic 60, pp. 325-337, 1995.
[12] H. Friedman, Classically and Intuitionistically Provable Recursive Functions, Lecture Notes in Mathematics, 1978, Volume 669/1978, 21-27.
[13] K. Gödel, Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica 12, 280-287 (reproduced with English translation in [Gödel 1990], 240-251).
[14] G. Gentzen, Die Widerspruchsfreiheit der reinen Zahlentheorie, Mathematische Annalen, 112:493-565, 1935. English translation: The consistency of elementary number theory, in Szabo [465], pages 132-200.
[15] G. Gentzen, Untersuchungen über das logische Schliessen, Mathematische Zeitschrift, 39:176-210, 405-431, 1935. English translation: Investigations into logical deduction, in Szabo [465], pages 68-131.
[16] J.-Y. Girard, Proofs and Types, Cambridge University Press (1989).
[17] E. M. Gold, Limiting Recursion, Journal of Symbolic Logic 30, pp. 28-48 (1965).
[18] S. C. Kleene, On the Interpretation of Intuitionistic Number Theory, Journal of Symbolic Logic 10(4), pp. 109-124 (1945).
[19] U. Kohlenbach, Applied Proof Theory, Springer-Verlag, Berlin, Heidelberg, 2008.
[20] G. Kreisel, On the Interpretation of non-Finitist Proofs, Part II: Interpretation of Number Theory, Applications, Journal of Symbolic Logic, vol. 17, 1952.
[21] G. Kreisel, On Weak Completeness of Intuitionistic Predicate Logic, Journal of Symbolic Logic, vol. 27, 1962.
[22] J.-L. Krivine, Typed lambda-calculus in classical Zermelo-Fraenkel set theory, Archive for Mathematical Logic, 40(3):189-205, 2001.
[23] G. Mints, S. Tupailo, W. Buchholz, Epsilon Substitution Method for Elementary Analysis, Archive for Mathematical Logic, volume 35, 1996.
[24] G. Mints, S. Tupailo, Epsilon Substitution Method for the Ramified Language and Δ¹₁-Comprehension Rule, Logic and Foundations of Mathematics, 1999.
[25] A. Miquel, Relating classical realizability and negative translation for existential witness extraction, in Typed Lambda Calculi and Applications (TLCA 2009), pp. 188-202, 2009.
[26] P. Odifreddi, Classical Recursion Theory, Studies in Logic and Foundations of Mathematics, Elsevier, 1989.
[27] P. Oliva, T. Streicher, On Krivine Realizability Interpretation of Second-Order Classical Arithmetic, Fundamenta Informaticae, 2008.
[28] H. Schwichtenberg, A. Troelstra, Basic Proof Theory, Cambridge University Press, 1996.
[29] M. H. Sorensen, P. Urzyczyn, Lectures on the Curry-Howard isomorphism, Studies in Logic and the Foundations of Mathematics, vol. 149, Elsevier, 2006.
[30] C. Spector, Provably Recursive Functionals of Analysis: a Consistency Proof of Analysis by an Extension of Principles in Current Intuitionistic Mathematics, in Dekker (ed.), Recursive Function Theory: Proceedings of Symposia in Pure Mathematics, vol. 5, AMS, Providence, 1962.
[31] A. Troelstra, D. van Dalen, Constructivism in Mathematics, vol. I, North-Holland, 1988.
[32] H. Towsner, A Realizability Interpretation of Classical Analysis, Archive for Mathematical Logic, vol. 43, 2005.
[33] A. Troelstra, Metamathematical Investigations of Intuitionistic Arithmetic and Analysis, Lecture Notes in Mathematics, Springer-Verlag, Berlin-Heidelberg-New York, 1973.
Types
  σ, τ ::= N | Bool | U | σ → τ | σ × τ

Constants
  c ::= R_τ | if_τ | 0 | S | True | False | is | get | mkupd | ⋓ | U (∀U ∈ U)

Terms
  t, u ::= c | x^τ | tu | λx^τ u | ⟨t, u⟩ | π0 u | π1 u

Typing Rules for Variables and Constants
  x^τ : τ
  0 : N
  S : N → N
  True, False : Bool
  U : U (for every U ∈ U)
  ⋓ : U → U → U
  is : U → N → N → Bool
  get : U → N → N → N
  mkupd : N → N → N → U
  if_τ : Bool → τ → τ → τ
  R_τ : τ → (N → (τ → τ)) → N → τ

Typing Rules for Composed Terms
  t : σ → τ,  u : σ  ⟹  tu : τ
  u : σ,  t : τ  ⟹  ⟨u, t⟩ : σ × τ
  u : τ  ⟹  λx^σ u : σ → τ
  u : τ0 × τ1,  i ∈ {0, 1}  ⟹  πi u : τi

Reduction Rules
  All the usual reduction rules for simply typed lambda calculus (see Girard [16]) plus the rules for recursion, if-then-else and projections:
  R_τ u v 0 ↦ u
  R_τ u v S(t) ↦ v t (R_τ u v t)
  if_τ True u v ↦ u
  if_τ False u v ↦ v
  πi ⟨u0, u1⟩ ↦ ui,  i = 0, 1
  plus the following ones, assuming a, n, m are numerals:
  is U a n ↦ True if ∃m. (a, n, m) ∈ U, and False otherwise
  get U a n ↦ m if ∃m. (a, n, m) ∈ U, and 0 otherwise
  U1 ⋓ U2 ↦ U1 𝒰 U2
  mkupd a n m ↦ {(a, n, m)}

Figure 1: the extension T of Gödel’s system T
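To make the reduction rules for the update constants concrete, here is a small Python sketch. It is only a model under assumptions that are not taken from the figure: an update is encoded as a dictionary from pairs (a, n) to values m, the names `is_defined` and `merge` are hypothetical, and the merge is assumed to be left-biased (when both updates define the same pair, the first one wins).

```python
# Illustrative model of the update constants; the dict encoding, the names
# is_defined/merge and the left-biased merge are assumptions of this sketch.

EMPTY = {}

def mkupd(a, n, m):
    # mkupd a n m  reduces to the one-element update {(a, n, m)}.
    return {(a, n): m}

def is_defined(U, a, n):
    # is U a n  reduces to True exactly when some (a, n, m) belongs to U.
    return (a, n) in U

def get(U, a, n):
    # get U a n  reduces to the stored m, and to 0 when nothing is stored.
    return U.get((a, n), 0)

def merge(U1, U2):
    # Union of the two updates; on a clash at (a, n) the entry of U1 is kept
    # (assumed behaviour of the merge operation).
    return {**U2, **U1}
```

For instance, `merge(mkupd(0, 3, 7), mkupd(0, 3, 9))` keeps the value 7 at (0, 3), while merging updates on different pairs simply collects both entries.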
Contexts
  With Γ we denote contexts of the form x1 : A1, …, xn : An, with x1, …, xn proof variables and A1, …, An formulas of L_Class.

Axioms
  Γ, x : A ⊢ x^{|A|} : A

Conjunction
  Γ ⊢ u : A,  Γ ⊢ t : B  ⟹  Γ ⊢ ⟨u, t⟩ : A ∧ B
  Γ ⊢ u : A ∧ B  ⟹  Γ ⊢ π0 u : A
  Γ ⊢ u : A ∧ B  ⟹  Γ ⊢ π1 u : B

Implication
  Γ ⊢ u : A → B,  Γ ⊢ t : A  ⟹  Γ ⊢ ut : B
  Γ, x : A ⊢ u : B  ⟹  Γ ⊢ λx^{|A|} u : A → B

Disjunction Intro.
  Γ ⊢ u : A  ⟹  Γ ⊢ ⟨True, u, d^{|B|}⟩ : A ∨ B
  Γ ⊢ u : B  ⟹  Γ ⊢ ⟨False, d^{|A|}, u⟩ : A ∨ B

Disjunction Elim.
  Γ ⊢ u : A ∨ B,  Γ, x : A ⊢ w1 : C,  Γ, x : B ⊢ w2 : C  ⟹  Γ ⊢ if p0 u then (λx^{|A|} w1)(p1 u) else (λx^{|B|} w2)(p2 u) : C

Universal Quantification
  Γ ⊢ u : ∀α^τ A  ⟹  Γ ⊢ ut : A[t/α^τ]
  Γ ⊢ u : A  ⟹  Γ ⊢ λα^τ u : ∀α^τ A
  where t is a term of L_Class and α^τ does not occur free in any formula B occurring in Γ.

Existential Quantification
  Γ ⊢ u : A[t/α^τ]  ⟹  Γ ⊢ ⟨t, u⟩ : ∃α^τ.A
  Γ ⊢ u : ∃α^τ.A,  Γ, x : A ⊢ t : C  ⟹  Γ ⊢ (λα^τ λx^{|A|} t)(π0 u)(π1 u) : C
  where α^τ is not free in C nor in any formula B occurring in Γ.

Induction
  Γ ⊢ u : A(0),  Γ ⊢ v : ∀α^N. A(α) → A(S(α))  ⟹  Γ ⊢ λα^N Ruvα : ∀α^N A

Post Rules
  Γ ⊢ u1 : A1,  Γ ⊢ u2 : A2,  …,  Γ ⊢ un : An  ⟹  Γ ⊢ u1 ⋓ u2 ⋓ ··· ⋓ un : A
  where n > 0 and A1, A2, …, An, A are atomic formulas of L_Class, and the rule is a Post rule for equality, for a Peano axiom, for a classical propositional tautology or for booleans.

Post Rules with no Premises
  Γ ⊢ ∅ : A
  where A is an atomic formula of L_Class and an axiom of equality or a classical propositional tautology.

EM1
  Γ ⊢ E_a : ∀x^N. ∃y^N P_a(x, y) ∨ ∀y^N ¬_Bool P_a(x, y)

SK1
  Γ ⊢ S_P : ∀x^N ∀y^N. P_a(x, y) ⇒_Bool P_a(x, Φ_a x)
  with S_P := λx^N λy^N if (P_a xy ⇒_Bool P_a x(Φ_a x)) then ∅ else (mkupd a x y).

Figure 2: Term Assignment Rules for HA^ω + EM1 + SK1
Polymorphic Logic

Mark Bickford and Robert Constable

In this article we explore uses of the intersection type as another form of universal quantification. This concept can be expressed naturally in type theories that allow polymorphic terms, such as Computational Type Theory and Intuitionistic Type Theory. We have found this quantifier to be very useful both in theory and in practice. When we use the uniform universal quantifier, we obtain more efficient realizers of constructive content. Moreover, we have been able to find the computational content in classical results by restating them using these quantifiers. Theorems stated in terms of the usual universal quantifier and implication can sometimes be restated with the corresponding polymorphic versions and given new proofs that construct more uniform, polymorphic, witnesses that are also more efficient. We illustrate these ideas in the realm of pure logic. We first show how to prove a lemma from Smullyan’s book First Order Logic and extract its well-hidden computational content. Then we show how to precisely characterize the computational content of theorems in minimal first-order logic, the logic underlying Minlog, which Helmut Schwichtenberg and his collaborators have used to create many beautiful examples of how to find efficient computational content from both constructive and classical proofs.
1 Introduction
It is well known from the propositions as types principle that methods for constructing types can also be seen as methods for constructing propositions. For example the dependent function type constructor can be seen as universal quantifier and the subtype relation as a new form of implication [9, 11, 10]. These turn out to be very natural logical operators which in many cases express a computational interpretation that is precisely what is needed to express an idea and to create an efficient computational realizer for it. In this article we explore uses of the intersection type [13, 9] as another form of universal quantification. This concept can be expressed naturally in type theories that allow polymorphic terms, such as Computational Type Theory [8, 1] and Intuitionistic Type Theory [12]. We have found this quantifier to be very useful both in
theory and in practice. When we use the uniform universal quantifier instead of the normal constructive one, we obtain more efficient realizers of constructive content. Moreover, we have been able to find the computational content in classical results by restating them using these quantifiers. Many theorems stated in terms of the usual universal quantifier and implication can be restated with the corresponding polymorphic versions and given new proofs that construct more uniform, polymorphic, witnesses that are also efficient. We illustrate these ideas in the realm of pure logic. We first show how to prove a lemma from Smullyan’s book First Order Logic [16] and extract its well-hidden computational content. We show that the same idea can be used to indicate when numerical arguments are used simply as indexes into computation and need not be present in the computational content. We illustrate this with a theorem whose computational extract is precisely the Y combinator [3]. Then we show how to precisely characterize the computational content of theorems in minimal first-order logic, the logic underlying Minlog, which Helmut Schwichtenberg and his collaborators have used to create many beautiful examples of how to find efficient computational content from classical proofs [5, 6, 4, 14, 15]. This article is good background for our recent results on intuitionistic completeness of minimal and intuitionistic first-order logic. In the article Intuitionistic Completeness of First-Order Logic [7], we show that these logics are complete with respect to uniform validity using the intended BHK realizability semantics.
2 Universal quantification

The standard universal quantifier ∀x : T. P(x) is defined to be the Π-type, Πx : T. P(x), which we prefer to call the dependent function type and write as x : T → P(x). A witness f ∈ ∀x : T. P(x) is therefore a function f ∈ x : T → P(x) that maps any x ∈ T to a witness for P(x). In some cases, there may be a single p that is a uniform witness for P(x) for any x ∈ T. In this case, p is a member of the intersection type ⋂_{x : T} P(x). Such a p is not a function with input x ∈ T, but is instead a witness for P(x), polymorphic or uniform over all x ∈ T.
2.1 Polymorphic universal quantification

We define the polymorphic universal quantifier ∀[x : T]. P(x) to be ⋂_{x : T} P(x). The brackets around the bound variable indicate that the witness does not “use” the parameter x. Classically, ∀x : T. P(x) and ∀[x : T]. P(x) have the same meaning, but constructively they differ. A witness p for the proposition with the polymorphic
quantifier is likely to be more efficient since it does not need to be given an input x ∈ T. In an extensional computational type theory like Nuprl, types are members of a hierarchy of universes, Ui , i ∈ {0, 1, 2, . . . }. When the universe level i is unimportant or can be inferred from context, we write Type for Ui . Since propositions are defined to be types, we define Pi = Ui and write P when the level is unimportant or can be inferred from context. P is the type of propositions which we can think of as truth values. A false proposition is an empty type, so it is extensionally equal to False = Void. A true proposition is a non-empty type and the members of the type are the witnesses for the truth of the proposition.
2.2 Rules for ∀[x : T]. P(x)

The rules for proving ∀[x : T]. P(x) are the rules for proving ⋂_{x : T} P(x). These make use of contexts with hidden declarations. To prove Γ ⊢ ⋂_{x : T} P(x) we must prove Γ, [x : T] ⊢ P(x). The brackets on the declaration [x : T] added to the context Γ indicate that it is hidden. To prove this sequent, we use whatever rules are appropriate for proving P(x), and no rules use hidden declarations. The hidden declarations are automatically unhidden once the sequent is refined to one with a conclusion of the form t1 = t2 ∈ T. Because the rules for proving an equality proposition all extract a fixed witness term Ax (because we consider equality propositions to have no constructive content), the extract of any proof of Γ, [x : T] ⊢ P(x) will not include the hidden parameter x. In particular, the proposition t ∈ T is simply an abbreviation for t = t ∈ T, so when proving a typing judgement, the hidden declarations are unhidden and may be used. More generally, a conclusion C has trivial computational content if we can construct a closed witness w (independent of the hypotheses) such that C ⇔ w ∈ C. Tactics (usually) recognize when the conclusion has trivial computational content and replace the conclusion with w ∈ C, so the hidden declarations can be used while proving such conclusions as well. These include all of the so-called “Harrop” formulae that are built from ∀, ⇒, ¬, ∧ but do not use ∃, ∨.
3 A polymorphic induction principle
The principle of complete induction over the natural numbers, N, can be written in higher-order logic as ∀P : N → P. (∀n : N. (∀m : Nn . P(m)) ⇒ P(n)) ⇒ (∀n : N. P(n))
Here, the type N_n is the set type {m : N | m < n} whose members are the natural numbers less than n. A witness for the induction principle is a member Ind of the corresponding dependent function type

P : (N → P) → (n : N → (m : N_n → P(m)) → P(n)) → (n : N → P(n))

The witness Ind will have the form λP. λG. λn. … . It takes inputs P, G, and n, where G has type (n : N → (m : N_n → P(m)) → P(n)), and produces a witness, Ind(P, G, n), for P(n). If we restate the induction principle using the polymorphic universal quantifier, we get

∀[P : N → P]. (∀[n : N]. (∀[m : N_n]. P(m)) ⇒ P(n)) ⇒ (∀[n : N]. P(n))

Proving this is equivalent to the construction of a witness W of type

⋂_{P : (N→P)} (⋂_{n : N} (⋂_{m : N_n} P(m)) → P(n)) → (⋂_{n : N} P(n))

W will have the form λF. … and take an F ∈ (⋂_{n : N} (⋂_{m : N_n} P(m)) → P(n)) and produce a member, W(F), of (⋂_{n : N} P(n)). The input F is a function that takes an x ∈ (⋂_{m : N_n} P(m)) and produces a witness, F(x), for P(n). The result W(F) is a uniform witness for all the P(n), n ∈ N. Such a W appears to be a fixed point operator, and we can, in fact, prove the polymorphic induction principle using any fixed point combinator fix that satisfies

fix(F) ∼ F(fix(F))

The relation ∼ is the symmetric-transitive closure of ↦, where t1 ↦ t2 if a single primitive computation step such as β-reduction, expanding definitions (δ-reduction), or reducing another primitive (+, ∗, …, on numbers, projections on pairs, etc.) transforms t1 into t2. In computational type theory all types are closed under ∼, so we have subject reduction:

x ∈ T, x ∼ y ⊢ y ∈ T

Lemma 1. ∀[P : N → P]. (∀[n : N]. (∀[m : N_n]. P(m)) ⇒ P(n)) ⇒ (∀[n : N]. P(n))

Proof. Given [P ∈ N → P] and f : ∀[n : N]. (∀[m : N_n]. P(m)) ⇒ P(n) we must construct a member of (∀[n : N]. P(n)) (without using P).
We show that fix(f) (which is independent of P) is in (∀[n : N]. P(n)). Since this is a proof of a typing judgement, we may now use the declarations that were formerly hidden. Let Γ be the context P : N → P, f : ⋂_{n : N} (⋂_{m : N_n} P(m)) ⇒ P(n). We must show Γ, n : N ⊢ fix(f) ∈ P(n) and we use the complete induction principle on n. Thus, we show that fix(f) ∈ P(n) follows from the assumptions

Γ, n : N, ∀m : N_n. fix(f) ∈ P(m)

But these assumptions imply fix(f) ∈ (⋂_{m : N_n} P(m)), and therefore, using the polymorphic type of f, f(fix(f)) ∈ P(n). Since f(fix(f)) ∼ fix(f), we have fix(f) ∈ P(n).

We carried out this proof in Nuprl using for the fixed point combinator the Y combinator,

Y = λf. (λx. f(xx))(λx. f(xx))

The extract of the proof, computed by the system, is simply the term Y.
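The equation fix(F) ∼ F(fix(F)) can be tried out in Python with a small sketch. Two caveats, both assumptions of this example rather than statements about the Nuprl proof: Python evaluates arguments eagerly, so the Y combinator as written above would loop, and the sketch uses the η-expanded call-by-value variant (often called Z) instead; and the functional `step` below takes n as an explicit argument, whereas the extract of Lemma 1 does not receive n at all. The names `fix`, `step` and `fib` are hypothetical.

```python
# Illustrative sketch only: a call-by-value fixed point combinator and a
# hypothetical functional to which it is applied.

def fix(f):
    # Z combinator: like Y, but the self-application x(x) is delayed behind
    # a lambda so that eager evaluation does not diverge.
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

def step(rec):
    # Builds the value at n from a "uniform witness" rec, which is only
    # ever consulted at arguments m < n (complete induction on n).
    def at(n):
        return n if n < 2 else rec(n - 1) + rec(n - 2)
    return at

fib = fix(step)
```

Here `fib` behaves like `step(fib)`: each call at n unfolds the fixed point once and appeals to it only below n, mirroring the use of the induction hypothesis in the proof.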
4 Uniform wellfoundedness and Brouwer ordinals

We can generalize these results. For a type T, a relation R on T, and a member x ∈ T, we define T_{Rx} to be the subtype {y : T | R(y, x)}. We say that R is uniformly wellfounded on T if the following polymorphic induction principle holds:

∀[P : T → P]. (∀[x : T]. (∀[y : T_{Rx}]. P(y)) ⇒ P(x)) ⇒ (∀[x : T]. P(x))

Thus, Lemma 1 states that Y is the witness to the fact that < is uniformly wellfounded on N. The type N can be seen as an instance of the class of types called Brouwer ordinals, or, following Martin-Löf, W types. These types are parameterized by a type A and a family of types B[a] indexed by a ∈ A. In Nuprl we use the general recursive type constructor to define the W type

W(A; a.B[a]) = rec(W. a : A × (B[a] → W))

Thus every member of type W = W(A; a.B[a]) is a pair ⟨a, f⟩ where a ∈ A and f ∈ B[a] → W, and we call this pair wsup(a, f). The members of the set {f(b) | b ∈ B[a]} are the immediate predecessors of wsup(a, f). Following Martin-Löf, we define two mutually recursive relations < and ≤ on W by:

wsup(a, f) ≤ w ⇔ ∀x : B[a]. f(x) < w
w < wsup(a, f) ⇔ ∃x : B[a]. f(x) ≤ w

Using the induction principle for the recursive type, we can show that these definitions are well formed and define transitive relations on W(A; a.B[a]). Then, by essentially the same argument as in Lemma 1, we proved

Lemma 2. For any type A and type family B ∈ A → Type, the relation < is uniformly wellfounded on W(A; a.B[a]) (and any fixed point combinator, such as Y, is a witness).

Using Lemma 2 we can show that if R is an ordering on a type T and there is an order-preserving map from ⟨T, R⟩ to ⟨W, […]

[…] 0 such that if |x − x0| ≤ r and |y − y0| ≤ r, then (x, y) ∈ A. Let f : A → R be uniformly continuous on each compact subset of A, let

M > sup {|f(x, y)| : |x − x0| ≤ r, |y − y0| ≤ r},

and let h = min {r, r/M}. Then the differential equation (*) has a solution y on the interval I ≡ [x0 − h, x0 + h] ([15], Theorem 6, page 10).
Douglas S. Bridges
In this paper, we discuss Peano’s existence theorem in the setting of Bishop’s constructive mathematics (BISH), by which we mean mathematics with intuitionistic logic and some appropriate set- or type-theoretic foundation such as those in [4, 13, 17, 18, 19].¹ We prove that (*) can be solved approximately as closely as we wish,² but that the existence of exact solutions in the general case is equivalent to an essentially nonconstructive principle. We also show how adding different forms of the hypothesis that there exists at most one solution to (*) helps to provide an exact solution. The natural form of that extra hypothesis gives the solution if we accept a version of Brouwer’s fan theorem; a stronger, sequential form of the ‘at most one solution’ hypothesis works without any fan-theoretic assumptions.
2 Peano’s theorem and LLPO

If X is a compact metric space and f : X → R a uniformly continuous function, we write

‖f‖_X ≡ sup_{x∈X} |f(x)|.

Lemma 1. Under the hypotheses of Peano’s theorem, if y is a solution of the differential equation (*) on the interval I ≡ [x0 − h, x0 + h], then

|y(x1) − y(x2)| ≤ M |x1 − x2|    (x1, x2 ∈ I)

and

‖y‖_I ≤ |y0| + Mh.

Proof. For x1, x2 ∈ I we have

|y(x1) − y(x2)| ≤ |∫_{x1}^{x2} f(t, y(t)) dt| ≤ M |x1 − x2|.

Also,

‖y‖_I ≤ |y0| + sup_{x∈I} |∫_{x0}^{x} f(t, y(t)) dt| ≤ |y0| + Mh,

as we wanted.
Lemma 2. Under the hypotheses of Peano’s existence theorem, let

S = {y ∈ C(I) : ‖y − y0‖ ≤ Mh ∧ ∀x1,x2∈I (|y(x1) − y(x2)| ≤ M |x1 − x2|)}.    (5)

¹ For more on BISH, see [6, 7, 11, 12]. For information about other varieties of constructive mathematics and their relation to BISH, see [5, 11, 22].
² For a different approach to the existence of approximate solutions for (*), using the Schauder fixed-point theorem, see Section 4 of [14].
Constructive Solutions of Ordinary Differential Equations
Then S is a compact subset of C(I). Moreover, if $\Phi : S \to \mathbf{R}$ is defined by
\[ \Phi(y) = \sup \left\{ \left| y(x) - y_0 - \int_{x_0}^{x} f(t, y(t))\, dt \right| : x \in I \right\}, \tag{6} \]
then Φ is uniformly continuous on S, and $\inf_{y \in S} \Phi(y) = 0$.

Proof. First observe that, by (5.6) on page 102 of [7],
\[ T \equiv \left\{ z \in C(I) : \|z\| \le Mh \ \wedge\ \forall_{x_1, x_2 \in I} \left( |z(x_1) - z(x_2)| \le M |x_1 - x_2| \right) \right\} \]
is compact. Since $S = y_0 + T$, it follows that S is compact. For each y ∈ S and each x ∈ I we have $|y(x) - y_0| \le Mh \le r$, so f(x, y(x)) is defined. Hence Φ(y) is well defined at (6). Next, since
\[ |\Phi(g_1) - \Phi(g_2)| \le \|g_1 - g_2\| + h \sup_{t \in I} |f(t, g_1(t)) - f(t, g_2(t))| \]
for all $g_1, g_2$ in C(I), we see that Φ is uniformly continuous on S.

The rest of the proof follows part of the classical proof of the Peano existence theorem in [15] (page 10) or [10] (4.7.6). Since f is uniformly continuous on the compact set
\[ K \equiv \left\{ (x, y) \in \mathbf{R}^2 : |x - x_0| \le r,\ |y - y_0| \le r \right\}, \]
we can apply the Stone-Weierstraß Theorem to construct a sequence $(p_n)_{n \ge 1}$ of polynomial functions of two variables such that
\[ \sup_{(x,y) \in K} |f(x, y) - p_n(x, y)| < 2^{-n} \]
for each n. We may assume that
\[ \sup_{(x,y) \in K} |p_n(x, y)| \le M \]
for each n. Since $p_n$ satisfies a Lipschitz condition, there exists a unique solution $y_n$ of the differential equation
\[ y' = p_n(x, y), \qquad y(x_0) = y_0 \]
on I; see [15] (page 8) or [10] (4.7.4). Note that
\[ h \equiv \min\left\{ r, \frac{r}{M} \right\} \]
depends on M and not on n. By Lemma 1, $y_n \in S$. Moreover, for each x ∈ I and each n,
\[ \left| y_n(x) - y_0 - \int_{x_0}^{x} f(t, y_n(t))\, dt \right| \le \left| y_n(x) - y_0 - \int_{x_0}^{x} p_n(t, y_n(t))\, dt \right| + \left| \int_{x_0}^{x} \left( f(t, y_n(t)) - p_n(t, y_n(t)) \right) dt \right| \le 2^{-n} |x - x_0| \le 2^{-n} h, \]
so
\[ \left| y_n(x) - y_0 - \int_{x_0}^{x} f(t, y_n(t))\, dt \right| \le 2^{-n} h. \]
Hence $\Phi(y_n) \to 0$ as $n \to \infty$, and therefore $\inf_{y \in S} \Phi(y) = 0$.
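The behaviour of Φ along a sequence of approximate solutions can be illustrated numerically. The following is a minimal sketch and not part of the argument above: it assumes the hypothetical right-hand side f(x, y) = x − y with $x_0 = y_0 = 0$, samples only the right half $[x_0, x_0 + h]$ of I, uses the trapezoidal rule, and takes Picard iterates as approximants in place of the polynomial-equation solutions $y_n$ used in the proof. All function names are choices of the sketch.

```python
def phi(y_vals, xs, f, y0):
    """Approximate Phi(y) = sup_x |y(x) - y0 - int_{x0}^x f(t, y(t)) dt| on a grid.

    xs is an increasing grid whose first point is x0 (assumption of this sketch).
    """
    integral = 0.0
    worst = abs(y_vals[0] - y0)
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        # trapezoidal rule for the running integral of f(t, y(t))
        integral += 0.5 * dx * (f(xs[i - 1], y_vals[i - 1]) + f(xs[i], y_vals[i]))
        worst = max(worst, abs(y_vals[i] - y0 - integral))
    return worst


def picard_step(y_vals, xs, f, y0):
    """One Picard iteration: y_new(x) = y0 + int_{x0}^x f(t, y(t)) dt."""
    out = [y0]
    integral = 0.0
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        integral += 0.5 * dx * (f(xs[i - 1], y_vals[i - 1]) + f(xs[i], y_vals[i]))
        out.append(y0 + integral)
    return out


def residuals(n_iter=6, n_grid=2001, h=0.5):
    """Values of Phi on the first n_iter Picard iterates."""
    f = lambda x, y: x - y          # hypothetical sample right-hand side
    x0, y0 = 0.0, 0.0
    xs = [x0 + h * i / (n_grid - 1) for i in range(n_grid)]
    y = [y0] * n_grid               # start from the constant function y0
    res = []
    for _ in range(n_iter):
        res.append(phi(y, xs, f, y0))
        y = picard_step(y, xs, f, y0)
    return res
```

Calling `residuals()` returns a decreasing list of residuals, which matches the statement $\inf_{y \in S} \Phi(y) = 0$ for this particular f; it says nothing about whether the infimum is attained in general, which is the point of Theorem 3.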
Bishop called the following essentially nonconstructive proposition the lesser limited principle of omniscience:

LLPO: For each binary sequence $(a_n)_{n \ge 1}$ with at most one term equal to 1, either $a_{2n} = 0$ for all n or $a_{2n+1} = 0$ for all n.

This principle is equivalent to the statement
\[ \forall_{x \in \mathbf{R}}\, (x \ge 0 \vee x \le 0) \]
and to the full form of Peano’s theorem. In fact, it is well known (cf. the proof of Theorem 2 in [1]) that if, for any a > 0 and some h > 0, the initial value problem
\[ y' = y^{1/3}, \qquad y(0) = a \]
has a solution on [−h, h], then we can derive LLPO. We now present an alternative Brouwerian example, whose force derives not from a fuzziness in the initial condition, but from the behaviour of the solution between x = 0 and x = 1.

Theorem 3. Peano’s existence theorem is equivalent to LLPO.

Proof. The proof is based on ideas of Aberth [3]. First, for a in R and r > 0 we define the spike function spike(·, a, r) : R → R as the unique (uniformly) continuous function on R that vanishes outside the interval [a − r, a + r], takes the value 1 at a, and is linear on the intervals [a − r, a],
[a, a + r]. Let $(a_n)_{n \ge 1}$ be a binary sequence with at most one term equal to 1. Define the mapping h : [0, 1] → [−1, 1] by
\[ h(x) \equiv \sum_{n=1}^{\infty} (-1)^n 2^{-n} a_n\, \mathrm{spike}\!\left( x, \tfrac{1}{2}, 2^{-n-1} \right) \qquad (0 \le x \le 1). \]
Then h is uniformly continuous on [0, 1]. Note that at most one term of the series defining h(x) is nonzero; if it is the nth term, then h is 0 everywhere in [0, 1] except for an interval of length $2^{-n}$ centred at 1/2, on which it has a (possibly negative) triangular pulse of height $2^{-n}$. It follows that if y : [0, 1] → R satisfies $y'(x) = h(x)$, y(0) = 0, then
\[ y(1) = \sum_{n=1}^{\infty} (-1)^n 2^{-2n-1} a_n. \]
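The value of y(1) can be checked on finite prefixes of the sequence $(a_n)$. The sketch below is illustrative only: it works with a finite list in place of the infinite binary sequence, integrates h with the trapezoidal rule on a dyadic grid, and compares the result with the closed form $(-1)^n 2^{-2n-1} a_n$. The function names are hypothetical.

```python
def spike(x, a, r):
    """Piecewise-linear spike: value 1 at a, 0 outside [a - r, a + r]."""
    return max(0.0, 1.0 - abs(x - a) / r)


def h(x, a_seq):
    """h(x) = sum_n (-1)^n 2^{-n} a_n spike(x, 1/2, 2^{-n-1}); a_seq[0] plays the role of a_1."""
    total = 0.0
    for n, a_n in enumerate(a_seq, start=1):
        if a_n:
            total += (-1) ** n * 2.0 ** (-n) * spike(x, 0.5, 2.0 ** (-n - 1))
    return total


def y_at_1(a_seq, steps=1 << 12):
    """Trapezoidal approximation to y(1) = int_0^1 h(t) dt on a dyadic grid."""
    dx = 1.0 / steps
    vals = [h(i * dx, a_seq) for i in range(steps + 1)]
    return dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


def y_at_1_closed_form(a_seq):
    """The closed form sum_n (-1)^n 2^{-2n-1} a_n restricted to the given prefix."""
    return sum((-1) ** n * 2.0 ** (-2 * n - 1) * a_n
               for n, a_n in enumerate(a_seq, start=1))
```

Because the breakpoints of the pulse lie on the dyadic grid for small n, the trapezoidal rule is exact up to rounding there; the sign of the result records whether the nonzero term, if any, has even or odd index.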
Next define a continuous function f : [0, 2] × [−2, 2] → [−1, 1] such that
\[ f(x, y) = \begin{cases} h(x) & \text{if } 0 \le x \le 1 \\ (x - 1)\, y^{1/3} & \text{if } 1 < x \le 2. \end{cases} \]
Setting
\[ f(-z, y) = f(z, y) \qquad (-2 \le z \le 0), \]
extend f by uniform continuity to a uniformly continuous mapping of [−2, 2] × [−2, 2] into [−1, 1]. (Note that f glues nicely at the lines x = −1, x = 0, and x = 1.) Next, observe that if $y(1) \ne 0$, then the differential equation $y' = (x - 1)\, y^{1/3}$ has a unique solution
\[ y = \mathrm{sgn}(y(1)) \left( \tfrac{1}{3} (x - 1)^2 + |y(1)|^{2/3} \right)^{3/2} \]
on the interval [1, 2]. Then
\[ y(2) = \mathrm{sgn}(y(1)) \left( \tfrac{1}{3} + |y(1)|^{2/3} \right)^{3/2}. \]
Suppose that the initial value problem
\[ y' = f(x, y), \qquad y(0) = 0 \tag{7} \]
has a solution y on the interval [−2, 2]. (In the notation of the Peano existence theorem, we have $x_0 = 0 = y_0$, r = 2.) Either $y(2) < (1/3)^{3/2}$ or $y(2) > -(1/3)^{3/2}$. In the first case, if $a_n = 1$ for some even n, then $y(1) = 2^{-2n-1}$, so $y(2) > (1/3)^{3/2}$, a contradiction. Hence $a_n = 0$ for all even n. A similar argument shows that in the second case, $a_n = 0$ for all odd n. We conclude that Peano’s existence theorem implies LLPO.

In view of Lemma 2, the converse implication is a consequence of Ishihara’s result that LLPO is equivalent, over BISH, to every uniformly continuous, real-valued mapping on a compact set attaining its infimum [16].
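The closed-form solution on [1, 2] used in the proof can be compared with a numerical integration of $y' = (x - 1)\, y^{1/3}$. The following sketch is only a sanity check under stated assumptions: it uses the classical fourth-order Runge-Kutta method (a choice of the sketch, not of the paper) and the two sample values $y(1) = 2^{-5}$ and $y(1) = -2^{-7}$, which correspond to a single nonzero term at n = 2 and n = 3 respectively.

```python
import math


def cbrt(v):
    """Real cube root, defined for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def closed_form(x, y1):
    """y(x) = sgn(y1) * ((x - 1)^2 / 3 + |y1|^(2/3))^(3/2) for 1 <= x <= 2 and y1 != 0."""
    u = (x - 1.0) ** 2 / 3.0 + abs(y1) ** (2.0 / 3.0)
    return math.copysign(u ** 1.5, y1)


def rk4(y1, steps=2000):
    """Integrate y' = (x - 1) * y^(1/3) from x = 1 to x = 2, starting at y(1) = y1."""
    f = lambda x, y: (x - 1.0) * cbrt(y)
    x, y, dx = 1.0, y1, 1.0 / steps
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + dx / 2, y + dx * k1 / 2)
        k3 = f(x + dx / 2, y + dx * k2 / 2)
        k4 = f(x + dx, y + dx * k3)
        y += dx * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += dx
    return y
```

For a positive start value the computed y(2) lies above $(1/3)^{3/2}$ and for a negative one below $-(1/3)^{3/2}$, which is the separation exploited in the case distinction above.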
3 Adding uniqueness
The problem in the first part of the proof of Theorem 3 is the potential non-uniqueness of the solution of the differential equation (7). In an attempt to get over this barrier, we say that the general differential equation (*) has at most one solution on an interval [a, b] containing $x_0$ if
\[ \max_{i=1,2}\ \sup_{x \in [a,b]} \left| y_i(x) - y_0 - \int_{x_0}^{x} f(t, y_i(t))\, dt \right| > 0 \]
whenever $y_1, y_2$ are uniformly continuous on [a, b], $y_1(x_0) = y_2(x_0)$, and $y_1 \ne y_2$. We now show that if, with the function f as in the first part of the proof of Theorem 3, the initial value problem (7) has at most one solution on [−2, 2], then it has a solution. To that end, let
\[ a \equiv \sum_{n=1}^{\infty} (-1)^n 2^{-2n-1} a_n, \]
which, if (7) has a solution, is the value of that solution at x = 1. Also, define uniformly continuous functions $y_+, y_-$ : [−2, 2] × [−2, 2] → [−1, 1] such that
\[ y_+(x) = \begin{cases} \int_0^x |h(t)|\, dt & \text{if } 0 \le |x| < 1 \\ \left( \tfrac{1}{3} (|x| - 1)^2 + |y_+(1)|^{2/3} \right)^{3/2} & \text{if } 1 \le |x| \le 2 \end{cases} \]
and
\[ y_-(x) = \begin{cases} -\int_0^x |h(t)|\, dt & \text{if } 0 \le |x| < 1 \\ -\left( \tfrac{1}{3} (|x| - 1)^2 + |y_-(1)|^{2/3} \right)^{3/2} & \text{if } 1 \le |x| \le 2. \end{cases} \]
Then $y_+(2) \ne y_-(2)$, so $y_+ \ne y_-$. Now assume that the initial value problem (7) has at most one solution. Then either
\[ \sup_{|x| \le 2} \left| y_+(x) - \int_0^x f(t, y_+(t))\, dt \right| > 0 \]
or
\[ \sup_{|x| \le 2} \left| y_-(x) - \int_0^x f(t, y_-(t))\, dt \right| > 0. \tag{8} \]
Consider the first alternative. There exists x ∈ [−2, 2] such that
\[ y_+(x) \ne \int_0^x f(t, y_+(t))\, dt. \]
By continuity, we may assume that x lies in one of the four intervals (−2, −1), (−1, 0), (0, 1), (1, 2). Take first the case where $0 \le x \le 1$. We have
\[ \int_0^x |h(t)|\, dt = y_+(x) \ne \int_0^x f(t, y_+(t))\, dt = \int_0^x h(t)\, dt, \]
so there exists t ∈ (0, x) such that $h(t) \ne |h(t)|$ and therefore h(t) < 0. It follows that (7) has the unique solution $y_-$ on [−2, 2]. Next, consider the case where 1 < x < 2. We have
\[ \int_0^x f(t, y_+(t))\, dt = \int_0^1 |h(t)|\, dt + \int_1^x (t - 1) \left( \tfrac{1}{3} (t - 1)^2 + |y_+(1)|^{2/3} \right)^{1/2} dt = |y(1)| + \int_1^x (t - 1) \left( \tfrac{1}{3} (t - 1)^2 + |y_+(1)|^{2/3} \right)^{1/2} dt. \]
With $u \equiv \tfrac{1}{3} (t - 1)^2 + |y(1)|^{2/3}$, the indefinite version of the integral on the right becomes
\[ \int \tfrac{3}{2}\, u^{1/2}\, du = u^{3/2}. \]
Hence
\[ \int_0^x f(t, y_+(t))\, dt = |y(1)| + \left( \tfrac{1}{3} (x - 1)^2 + |y_+(1)|^{2/3} \right)^{3/2} = |y_+(1)| + y_+(x). \]
Our choice of x yields
\[ y_+(x) \ne |y_+(1)| + y_+(x), \]
from which it follows that $y_+(1) \ne 0$. If $y_+(1) > 0$, then $y_+$ is the unique solution of our initial value problem, which is a contradiction. Hence $y_+(a) < 0$, and the unique
solution on I is, in fact, $y_-$. Similar considerations to the foregoing cover the cases where −2 < x < −1 and −1 < x < 0. Likewise, if (8) obtains, then (*) has the unique solution $y_+$ on I.

This example suggests that if (*) has at most one solution on an interval I containing $x_0$, then it has a solution on I. However, this is not possible, in view of Aberth’s recursive example³ in which (*) has no solution [3] (pages 125–139). We can, however, prove a weaker result in this vein. For this, we require some more definitions.

Let (X, ρ) be a metric space, and f : X → R a sequentially continuous function that has an infimum. A sequence $(x_n)_{n \ge 1}$ in X is said to be minimising for f if $f(x_n) \to \inf f$ as $n \to \infty$. We say that f has
– sequentially at most one minimum point if any two minimising sequences $(x_n)_{n \ge 1}$, $(x'_n)_{n \ge 1}$ for f are eventually close in the sense that $\rho(x_n, x'_n) \to 0$ as $n \to \infty$;
– uniformly at most one minimum if for each ε > 0, there exists δ > 0 such that if $x, x' \in X$, $f(x) < \inf f + \delta$, and $f(x') < \inf f + \delta$, then $\rho(x, x') < \varepsilon$;
– a strong minimum at ξ (the strong minimum point) if for each ε > 0, there exists δ > 0 such that if x ∈ X and $f(x) < f(\xi) + \delta$, then $\rho(x, \xi) < \varepsilon$.

Clearly, if f has uniformly at most one minimum, then it has sequentially at most one minimum. Referring to Proposition 3 of [8], we see that if X is compact and f is uniformly continuous, then the converse holds, so these two ‘at most one minimum’ conditions are equivalent. In that case there exists a unique ξ ∈ X such that $f(\xi) = \inf f$; moreover, with ε, δ as above, if x ∈ X and $f(x) < \inf f + \delta$, then $\rho(x, \xi) < \varepsilon$ ([20], Proposition 2.1), so ξ is a strong minimum point for f. In view of Lemma 2, these observations, applied to the compact space S and the uniformly continuous mapping Φ on S, give us the following result.

Proposition 4. Under the hypotheses of Peano’s theorem, let S and Φ : S → R be as defined at (5) and (6). Suppose that Φ has sequentially at most one minimum in S. Then
(i) the differential equation (*) has a unique solution y on the interval I;

³ It is worth checking to see whether Aberth’s example actually has at most one solution in our strong sense. If, as is permissible when working in the recursive model of BISH (see Chapter 3 of [11]), we assume Markov’s principle, then it does. For if $y_1, y_2 \in S$ and $\max\{\Phi(y_1), \Phi(y_2)\} = 0$, then $\Phi(y_1) = \Phi(y_2) = 0$, contradicting Aberth’s result; whence, by Markov’s principle, $\max\{\Phi(y_1), \Phi(y_2)\} > 0$.
(ii) for each ε > 0, there exists δ > 0 such that if z ∈ S and Φ(z) < δ, then $\|y - z\|_I < \varepsilon$.

Now, it is a truth universally acknowledged that a constructive mathematician in possession of parametrised solutions must be in want of continuity in those parameters:

Theorem 5. Let $A \subset \mathbf{R}^2$ be closed, $(x_0, y_0) \in A^{\circ}$, and r > 0 such that
\[ A \supset K \equiv \left\{ (x, y) \in \mathbf{R}^2 : |x - x_0| \le r \wedge |y - y_0| \le r \right\}. \]
Let M > 0, $h = \min\{r, r/M\}$, and $I \equiv [x_0 - h, x_0 + h]$. With S as at (5), for each f ∈ S define $\Phi_f$ on S by
\[ \Phi_f(y) \equiv \sup \left\{ \left| y(x) - y_0 - \int_{x_0}^{x} f(t, y(t))\, dt \right| : x \in I \right\}. \]
Let $S_u$ be the set of those f ∈ S such that $\|f\|_K \le M$ and $\Phi_f$ has a strong minimum in S, and for each $f \in S_u$ let $y_f$ denote the strong minimum point of $\Phi_f$. Then the mapping $f \mapsto y_f$ is pointwise continuous on $S_u$.

Proof. Fix $f \in S_u$ and ε > 0. Choose δ > 0 such that if $g \in S_u$ and $\Phi_f(g) < \delta$, then $\|y_f - g\| < \varepsilon$. Consider any $g \in S_u$ with $\|f - g\|_K < \delta / r$. For each x ∈ I we have
\[ \left| y_g(x) - y_0 - \int_{x_0}^{x} f(t, y_g(t))\, dt \right| \le \left| y_g(x) - y_0 - \int_{x_0}^{x} g(t, y_g(t))\, dt \right| + \left| \int_{x_0}^{x} \left( f(t, y_g(t)) - g(t, y_g(t)) \right) dt \right| \le \Phi_g(y_g) + r\, \|f - g\|_K. \]
Since $\Phi_g(y_g) = 0$, it follows that
\[ \Phi_f(y_g) \le r\, \|f - g\|_K < \delta \]
and therefore that $\|y_f - y_g\| < \varepsilon$.
4 Concluding remarks
If we adopt Brouwer’s fan theorem for detachable bars, FTD (see [8] for more on this), then we can establish the existence of a solution to (*) on the interval
$[x_0 - h, x_0 + h]$ with the hypothesis ‘any two minimising sequences for Φ are eventually close’ in Proposition 4 replaced by ‘there is at most one solution of (*)’: we just apply Theorem 5 of [8] to the uniformly continuous function Φ on the compact space S. It is tempting to believe that the ‘at most one solution implies there is a solution’ version of Peano’s existence theorem is equivalent, over BISH, to FTD. However, a proof of this equivalence is elusive, perhaps because the Peano existence problem deals with a very specific compact space, namely S, whereas equivalents of FTD normally deal with statements that apply to all compact metric spaces (see [9]).

Acknowledgements. Numerous conversations with Josef Berger, Iris Loeb, Peter Schuster, and Helmut Schwichtenberg have helped clarify my thoughts on the problems discussed in this paper. The work was supported by the Marsden Fund of the Royal Society of New Zealand (project UOC0502) and the recurrent, generous hospitality of the Schwichtenberg logic group at Ludwig-Maximilians-Universität, München.
References
[1] O. Aberth: ‘Computable analysis and differential equations’, in Intuitionism and Proof Theory (A. Kino, J. Myhill, R.E. Vesley, eds), 47–52, North-Holland Publ. Co., Amsterdam, 1970.
[2] O. Aberth: Computable Analysis, McGraw-Hill, New York, 1980.
[3] O. Aberth: ‘The failure in computable analysis of a classical existence theorem for differential equations’, Proc. Amer. Math. Soc. 30, 151–156, 1971.
[4] P. Aczel and M. Rathjen: Notes on Constructive Set Theory, Report No. 40, Institut Mittag-Leffler, Royal Swedish Academy of Sciences, 2001.
[5] M.J. Beeson: Foundations of Constructive Mathematics, Springer Verlag, Heidelberg, 1985.
[6] E.A. Bishop: Foundations of Constructive Analysis, McGraw-Hill, New York, 1967.
[7] E.A. Bishop and D.S. Bridges: Constructive Analysis, Grundlehren der Math. Wiss. 279, Springer Verlag, Heidelberg, 1985.
[8] J. Berger, D.S. Bridges, and P.M. Schuster: ‘The fan theorem and unique existence of maxima’, J. Symbolic Logic 71(2), 713–720, 2006.
[9] J. Berger and H. Ishihara: ‘Brouwer’s fan theorem and unique existence in constructive analysis’, Math. Logic Quart. 51(4), 360–364, 2005.
[10] D.S. Bridges: Foundations of Real and Abstract Analysis, Graduate Texts in Mathematics 174, Springer Verlag, Heidelberg-Berlin-New York, 1998.
[11] D.S. Bridges and F. Richman: Varieties of Constructive Mathematics, London Math. Soc. Lecture Notes 97, Cambridge Univ. Press, 1987.
[12] D.S. Bridges and L.S. Vîță: Techniques of Constructive Analysis, Universitext, Springer New York, 2006.
[13] H.M. Friedman: ‘Set Theoretic Foundations for Constructive Analysis’, Ann. Math. 105(1), 1–28, 1977.
[14] M. Hendtlass: ‘Fixed point theorems in constructive mathematics’, preprint, University of Leeds, 2011.
[15] W. Hurewicz: Lectures on Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass., 1958.
[16] H. Ishihara: ‘An omniscience principle, the König lemma and the Hahn-Banach theorem’, Z. Math. Logik Grundl. Math. 36, 237–240, 1990.
[17] P. Martin-Löf: ‘An Intuitionistic Theory of Types: Predicative Part’, in Logic Colloquium 1973 (H.E. Rose and J.C. Shepherdson, eds), 73–118, North-Holland, Amsterdam, 1975.
[18] J. Myhill: ‘Constructive Set Theory’, J. Symb. Logic 40(3), 347–382, 1975.
[19] G. Sambin and J. Smith: Twenty Five Years of Constructive Type Theory, Oxford Logic Guides 36, Clarendon Press, Oxford, 1998.
[20] P.M. Schuster: ‘Unique solutions’, Math. Logic Quart. 52(6), 534–539, 2006. Corrigendum: Math. Logic Quart. 53(2), 214, 2007.
[21] A.S. Troelstra and D. van Dalen: Constructivism in Mathematics: An Introduction (two volumes), North Holland, Amsterdam, 1988.
A Nonstandard Hierarchy Comparison Theorem for the Slow and Fast Growing Hierarchy
Wilfried Buchholz and Andreas Weiermann∗
It is folklore that the slow and fast growing hierarchy match up for the first time at the proof-theoretic ordinal of $(\Pi^1_1\text{-CA})_0$. By results of Schütte and Simpson it is known that the underlying notation system loses its strength when the ordinal addition function is no longer present. In this article we will show that a hierarchy comparison can still be established. Surprisingly, the match of the slow and fast growing hierarchy can be arranged, by using standard fundamental sequences, to happen at $\omega^2$, which is much smaller than the ordinal of $(\Pi^1_1\text{-CA})_0$. We will also show that the slow growing hierarchy consists of elementary functions only when it is based on a Buchholz style system of fundamental sequences for the Schütte-Simpson ordinal notation system.
1 Introduction
With Helmut Schwichtenberg (who wrote his PhD thesis about this subject) we share a deep interest in subrecursive hierarchies. Schwichtenberg [5] and independently Wainer gave in the seventies a classification of the $<\varepsilon_0$-recursive functions which is still a classic today and which is very useful not only in hierarchy theory. Over the years Schwichtenberg (and Wainer) also showed continued interest in results related to the comparison of the slow and fast growing hierarchies [6]. This article provides a somewhat surprising result on hierarchy comparisons which is driven by pure curiosity. What happens in hierarchy comparison results when the addition is deleted from the context? We show that a modification of a new proof of the hierarchy comparison theorem goes through almost word for word but the match between the hierarchies now occurs at $\omega^2$.
∗ This author’s research was partially supported by the John Templeton Foundation and the FWO and it was partially done whilst this author was a visiting fellow at the Isaac Newton Institute for the Mathematical Sciences in the programme ‘Semantics & Syntax’.
2 Tree ordinals for $ID_\omega$
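In this section we recall some facts from the theory of tree ordinals for $ID_\omega$ (cf., e.g., [2, 3]).

Definition 1. Inductive definition of tree classes $O_\nu$ ($\omega \ne \nu \le \omega + 1$).
1. $0 := \emptyset \in O_\nu$.
2. $\alpha \in O_\nu \Longrightarrow \alpha + 1 := \{(0, \alpha)\} \in O_\nu$.
3. $\mu < \nu \ \&\ \forall \xi \in O_\mu\, (\alpha_\xi \in O_\nu) \Longrightarrow (\alpha_\xi)_{\xi \in O_\mu} \in O_\nu$.
We identify $O_0$ and $\mathbb{N}$.

Definition 2. Inductive definition of $|\alpha|$ for $\alpha \in O_1$.
1. $|0| := 0$.
2. $|\alpha + 1| := |\alpha| + 1$.
3. $|(\alpha_i)_{i \in \mathbb{N}}| := \sup\{|\alpha_i| + 1 : i \in \mathbb{N}\}$.

Definition 3. Inductive definition of $\alpha + \beta$ for $\alpha, \beta \in O_{\omega+1}$.
1. $\alpha + 0 := \alpha$.
2. $\alpha + (\beta + 1) := (\alpha + \beta) + 1$.
3. $\alpha + (\beta_\xi)_{\xi \in O_\mu} := (\alpha + \beta_\xi)_{\xi \in O_\mu}$.

Definition 4. Definition of $\dot\Omega_\mu$, $\dot\omega$, $\dot\Omega$.
1. $\dot\Omega_{\mu+1} := (\xi)_{\xi \in O_{\mu+1}}$.
2. $\dot\Omega_0 := \dot\omega := (\xi)_{\xi \in O_0}$.
3. $\dot\Omega_\omega := (\dot\Omega_i)_{i \in \omega}$.
4. $\dot\Omega := \dot\Omega_1$.

Definition 5. Inductive definition of $D_\omega : O_{\omega+1} \to O_{\omega+1}$.
1. $D_\omega 0 := \dot\Omega_\omega$.
2. $D_\omega(\alpha + 1) := D_\omega \alpha + D_\omega \alpha$.
3. $D_\omega((\alpha_\xi)_{\xi \in O_\mu}) := (D_\omega \alpha_\xi)_{\xi \in O_\mu}$.

Definition 6. Inductive definition of $D^m_\omega(\alpha)$ for $\alpha \in O_{\omega+1}$.
$D^0_\omega(\alpha) := \alpha$, $D^{m+1}_\omega(\alpha) := D_\omega(D^m_\omega(\alpha))$.
We set $\dot\varepsilon_{\Omega_\omega + 1} := (D^i_\omega(\dot\Omega_\omega + \dot\omega + \dot\omega))_{i \in \mathbb{N}}$.

Definition 7. Inductive definition of $D_m : O_{\omega+1} \to O_{m+1}$ ($m < \omega$).
1. $D_m 0 := \dot\Omega_m$.
2. $D_m(\alpha + 1) := D_m(\alpha) + 1$.
3. $D_m((\alpha_\xi)_{\xi \in O_\rho}) := (D_m(\alpha_\xi))_{\xi \in O_\rho}$, if $\rho \le m$.
4. $D_m((\alpha_\xi)_{\xi \in O_{\rho+1}}) := (D_m \beta^{\rho+1,m}_\xi)_{\xi \in O_m}$ if $m < \rho + 1$, where $\beta^{\rho+1,\rho+1}_\xi := \alpha_\xi$ and $\beta^{\rho+1,n}_\xi := \beta^{\rho+1,n+1}_{D_n \beta^{\rho+1,n+1}_\xi}$ for $n < \rho + 1$.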
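Remark: $|D_0 \dot\varepsilon_{\Omega_\omega + 1}|$ is the proof-theoretic ordinal of $ID_\omega$ [cf. [2, 3]].

Definition 8. Inductive definition of a set T of tree notations.
1. $0, 1 \in T$, $\mathrm{lev}(0) = \mathrm{lev}(1) = 0$ and $1 \in P$.
2. If $\alpha \in T$ and $\nu \le \omega$, and $\mathrm{lev}(\alpha) \le \nu + 1$ then $D_\nu \alpha \in T$, $\mathrm{lev}(D_\nu \alpha) = \nu$ and $D_\nu \alpha \in P$.
3. If $\alpha_0, \ldots, \alpha_k \in P$ ($k \ge 1$), then $\alpha_0 + \cdots + \alpha_k \in T$ and $\mathrm{lev}(\alpha_0 + \cdots + \alpha_k) = \max\{\mathrm{lev}(\alpha_i) : i \le k\}$.

In the sequel we work only with tree ordinals which are denoted by elements of T. For those tree ordinals we have in addition a term structure along which we can carry out syntactical definitions. Note that $\alpha, \beta \in T$ implies $\alpha + \beta \in T$. For notational reasons we will write in the sequel $\alpha[[\xi]]$ for $\alpha_\xi$ at several places.

Definition 9. Inductive definition of $\mathrm{tp}(\alpha)$ for $\alpha \in T$ and $\alpha \in O_{\omega+1}$.
1. $\mathrm{tp}(0) := 0$.
2. $\mathrm{tp}(\alpha + 1) := 1 := \{\emptyset\}$.
3. $\mathrm{tp}((\alpha_\xi)_{\xi \in O_n}) := \dot\Omega_n$.

Lemma 10. Recursive description of $\alpha[[\xi]]$ for $\alpha \in T$ with $\mathrm{tp}(\alpha) > 1$ and $\xi \in \mathrm{tp}(\alpha)$.
1. $(D_n 0)[[\xi]] = \xi$.
2. $(\alpha_0 + \cdots + \alpha_k)[[\xi]] = \alpha_0 + \cdots + \alpha_k[[\xi]]$.
3. $(D_n \alpha)[[\xi]] = D_n \alpha[[\xi]]$ if $\mathrm{lev}(\mathrm{tp}(\alpha)) \le n$.
4. $(D_n \alpha)[[\xi]] = D_n(\alpha[[D_n(\alpha[[\xi]])]])$ if $\mathrm{lev}(\mathrm{tp}(\alpha)) > n$.
Note that this conforms to the standard interpretation for tree ordinals.

Lemma 11. $\alpha, \xi \in T$ and $\xi \in \mathrm{tp}(\alpha)$ implies $\alpha[[\xi]] \in T$.

Proof. If $\alpha = \alpha_0 + \cdots + \alpha_k$ then $\mathrm{tp}(\alpha) = \mathrm{tp}(\alpha_k)$ and the i.h. yields $\alpha_k[[\xi]] \in T$. Then $\alpha[[\xi]] = \alpha_0 + \cdots + \alpha_k[[\xi]] \in T$. If $\alpha = D_m \beta$ and $\mathrm{tp}(\alpha) = \mathrm{tp}(\beta) = \Omega_{m+1}$ then the i.h. yields $\beta[[\xi]] \in T$, hence $\alpha[[\xi]] = D_m \beta[[\xi]] \in T$. If $\alpha = D_m \beta$ and $\mathrm{tp}(\alpha) = \mathrm{tp}(\beta) = \Omega_{\rho+1}$ with $\rho + 1 > m$ then the i.h. yields $\beta[[\xi]] \in T$, hence $D_m \beta[[\xi]] \in T$. The i.h. yields $\beta[[D_m \beta[[\xi]]]] \in T$, hence $\alpha[[\xi]] = D_m \beta[[D_m \beta[[\xi]]]] \in T$.

Lemma 12. If $\alpha, \beta \in T$ and $\mathrm{tp}(\alpha) = \dot\Omega_{\rho+1}$ and $\beta \in O_{\rho+1}$ and $\mathrm{tp}(\beta) = \dot\Omega_m$ then $\mathrm{tp}(\alpha[[\beta]]) = \dot\Omega_m$ and $(\alpha[[\beta]])[[\xi]] = \alpha[[\beta[[\xi]]]]$ for $\xi \in O_m$.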
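Proof. If $\alpha = \alpha_0 + \cdots + \alpha_k$ then $\mathrm{tp}(\alpha) = \mathrm{tp}(\alpha_k)$ and
\begin{align*}
\alpha[[\beta[[\xi]]]] &= \alpha_0 + \cdots + \alpha_k[[\beta[[\xi]]]] \\
&= \alpha_0 + \cdots + (\alpha_k[[\beta]])[[\xi]] \\
&= (\alpha_0 + \cdots + \alpha_k[[\beta]])[[\xi]] \\
&= ((\alpha_0 + \cdots + \alpha_k)[[\beta]])[[\xi]] \\
&= (\alpha[[\beta]])[[\xi]].
\end{align*}
If $\alpha = D_m \gamma$ with $\mathrm{tp}(\gamma) = \dot\Omega_n$ and $n \le m$ then $\mathrm{tp}(\alpha) = \mathrm{tp}(\gamma)$ and
\begin{align*}
\alpha[[\beta[[\xi]]]] &= D_m(\gamma[[\beta[[\xi]]]]) \\
&= D_m((\gamma[[\beta]])[[\xi]]) \\
&= (D_m \gamma[[\beta]])[[\xi]] \\
&= (\alpha[[\beta]])[[\xi]].
\end{align*}
If $\alpha = D_m \gamma$ with $\mathrm{tp}(\gamma) = \dot\Omega_{m+1}$ then $\mathrm{tp}(\alpha) = \dot\Omega_m$ and
\begin{align*}
\alpha[[\beta[[\xi]]]] &= D_m \gamma[[D_m \gamma[[\beta[[\xi]]]]]] \\
&= D_m \gamma[[D_m(\gamma[[\beta]][[\xi]])]] \\
&= D_m \gamma[[D_m \gamma[[\beta]][[\xi]]]] \\
&= D_m(\gamma[[D_m \gamma[[\beta]]]][[\xi]]) \\
&= (D_m \gamma[[D_m \gamma[[\beta]]]])[[\xi]] \\
&= ((D_m \gamma)[[\beta]])[[\xi]] \\
&= (\alpha[[\beta]])[[\xi]].
\end{align*}

Lemma 13. If $\alpha \in T$ and $\mathrm{tp}(\alpha) = \dot\Omega_{m+1}$ then $D_m \alpha = D_m \alpha[[D_m \alpha[[\dot\Omega_m]]]]$.

Proof.
\begin{align*}
D_m \alpha[[D_m \alpha[[\dot\Omega_m]]]] &= D_m \alpha[[D_m((\alpha[[\xi]])_{\xi \in O_m})]] \\
&= D_m \alpha[[(D_m \alpha[[\xi]])_{\xi \in O_m}]] \\
&= D_m((\alpha[[D_m \alpha[[\xi]]]])_{\xi \in O_m}) \\
&= (D_m \alpha[[D_m \alpha[[\xi]]]])_{\xi \in O_m} \\
&= D_m \alpha.
\end{align*}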
Definition 14. Inductive definition of $F_\alpha$ for $\alpha \in O_1$ (cf. [1]).
1. $F_0(n) := n$.
2. $F_{\alpha+1}(n) := F_\alpha(n) + 1$.
3. $F_{(\alpha_i)_{i \in \mathbb{N}}}(n) := F_{\alpha_{F_{\alpha_n}(n)}}(n)$.
This is a recursion along the rank $|\alpha|$ of α. In the sequel we carry out most calculations along representations for tree ordinals. But at some places we use induction on the ranks.

Remark: Results in [3] indicate that for every function $f : \mathbb{N} \to \mathbb{N}$ that is provably total in $ID_\omega$ there is an $i < \omega$ such that $f(n) < F_{D_0(D^i_\omega(\dot\Omega_\omega + \dot\omega + \dot\omega))}(n)$ holds for all $n \in \mathbb{N}$ [cf. [3]]. To obtain a majorization of the Hardy-hierarchy used in [3] and the F hierarchy used in this article one can roughly employ an estimate of the form $F_\beta(F_\alpha(x)) \le F_{\alpha+\beta}(x)$. In the sequel we consider $(F_\alpha)$ as one suitable version of the fast growing hierarchy.

Definition 15. Inductive definition of $G_\alpha$ for $\alpha \in O_1$.
1. $G_0(n) := 0$.
2. $G_{\alpha+1}(n) := G_\alpha(n) + 1$.
3. $G_{(\alpha_i)_{i \in \mathbb{N}}}(n) := G_{\alpha_n}(n)$.
This is again a recursion along $|\alpha|$.
3 Proof of the hierarchy comparison theorem following the classical lines
We give a proof of the hierarchy comparison theorem using ideas of Wainer ([7]). The following definition is carried out by recursion on the (length of the) notation for a tree ordinal. (In the sequel we identify these notations with the denoted ordinal. This causes no intrinsic difficulty but one has to be aware of the fact that different notations can denote the same tree ordinal.)

Definition 16. Inductive definition of $C_x(\alpha)$ for $\alpha \in T$.
1. $C_x(\alpha) := G_\alpha(x)$ if $\alpha \in O_1$.
2. $C_x(\alpha_0 + \cdots + \alpha_k) := C_x(\alpha_0) + \cdots + C_x(\alpha_k)$.
3. $C_x(D_{m+1} \alpha) := D_m C_x(\alpha)$.
4. $C_x(\dot\Omega_\omega) := \dot\Omega_x$.

Lemma 17. If $x \in \mathbb{N}$, $\alpha \in T$, $\mathrm{lev}(\alpha) \le n$ for some $n < \omega$ and $\mathrm{tp}(\alpha) = \dot\omega$ then $C_x(\alpha) = C_x(\alpha_x)$.
Proof. By induction on the length of the notation for α.
1. $\mathrm{lev}(\alpha) = 0$. Then $C_x(\alpha) = G_\alpha(x) = G_{\alpha_x}(x) = C_x(\alpha_x)$.
2. $\alpha = \beta + \gamma$ where $\mathrm{tp}(\gamma) = \dot\omega$. Then the induction hypothesis yields $C_x(\alpha_x) = C_x(\beta) + C_x(\gamma_x) = C_x(\beta) + C_x(\gamma) = C_x(\alpha)$.
3. $\alpha = D_{m+1} \beta$ where $\mathrm{tp}(\beta) = \dot\omega$. Then the induction hypothesis yields $C_x(\alpha) = D_m C_x(\beta) = D_m C_x(\beta_x) = C_x(\alpha_x)$.
4. $\alpha = D_{m+1} \beta$ where $\mathrm{tp}(\beta) = \dot\Omega_{m+2}$. Then $\mathrm{tp}(\alpha) = \dot\Omega_{m+1}$ and this case does not occur.

Lemma 18. If $\alpha \in T$, $\mathrm{tp}(\alpha) = \dot\Omega_m$, $m \ge 1$ and $\xi \in O_m$ then the tree ordinal $C_x(\alpha_\xi)$ is the same as the tree ordinal $C_x(\alpha)[[C_x(\xi)]]$.

Proof. By induction on the length of the notation for α.
1. $\alpha = D_m 0$. Then $\alpha[[\xi]] = \xi$ and the result follows.
2. $\alpha = \beta + \gamma$ where $\mathrm{tp}(\gamma) = \mathrm{tp}(\alpha) = \dot\Omega_m$. Then $C_x(\alpha[[\xi]]) = C_x(\beta) + C_x(\gamma[[\xi]]) = C_x(\beta) + C_x(\gamma)[[C_x(\xi)]] = C_x(\alpha)[[C_x(\xi)]]$.
3. $\alpha = D_{n+1} \beta$ and $\mathrm{tp}(\beta) = \mathrm{tp}(\alpha) = \dot\Omega_m$ where $m \le n + 1$. Then $C_x(\alpha[[\xi]]) = D_n C_x(\beta[[\xi]]) = D_n C_x(\beta)[[C_x(\xi)]] = C_x(\alpha)[[C_x(\xi)]]$.
4. $\alpha = D_{n+1} \beta$ where $\mathrm{tp}(\beta) = \dot\Omega_{n+2}$. Let $\beta' := \beta[[D_{n+1} \beta[[\dot\Omega_{n+1}]]]]$. Then α is the same tree ordinal as $D_{n+1} \beta'$ and $\mathrm{tp}(\alpha) = \dot\Omega_{n+1}$. Moreover $C_x(D_{n+1} \beta) = D_n C_x(\beta)$. Let $\beta'' := \beta'[[D_n \beta'[[\dot\Omega_n]]]]$. Then $C_x(\alpha)$ is the same tree ordinal as $D_n \beta''$ and $\mathrm{tp}(C_x(\alpha)) = \dot\Omega_n$. For $\xi \in O_{n+1}$ we obtain the following identity between tree ordinal values: $C_x(\alpha[[\xi]]) = C_x(D_{n+1} \beta'[[\xi]]) = D_n \beta''[[C_x(\xi)]] = C_x(\alpha)[[C_x(\xi)]]$.

Theorem 19. Let $\alpha \in T$ and $\mathrm{lev}(\alpha) \le 1$. Then $G_{D_0 \alpha}(x) = F_{C_x(\alpha)}(x)$.

Proof. By induction on the tree ordinal which is denoted by α.
1. $\alpha = 0$. Then $G_{D_0 0}(x) = G_x(x) = x = F_0(x) = F_{C_x(0)}(x)$.
2. If $\alpha = \beta + 1$, then
$G_{D_0 \alpha}(x) = G_{D_0 \beta + 1}(x) = G_{D_0 \beta}(x) + 1 = F_{C_x(\beta)}(x) + 1 = F_{C_x(\beta)+1}(x) = F_{C_x(\alpha)}(x)$.
3. If $\mathrm{tp}(\alpha) = \dot\omega$ then $G_{D_0 \alpha}(x) = G_{D_0 \alpha_x}(x) = F_{C_x(\alpha_x)}(x) = F_{C_x(\alpha)}(x)$.
4. If $\mathrm{tp}(\alpha) = \dot\Omega_1$ then
\begin{align*}
G_{D_0 \alpha}(x) &= G_{D_0 \alpha[[D_0 \alpha_{\dot\Omega_0}]]}(x) \\
&= F_{C_x(\alpha[[D_0 \alpha_{\dot\Omega_0}]])}(x) \\
&= F_{(C_x(\alpha))[[C_x(D_0 \alpha_{\dot\Omega_0})]]}(x) \\
&= F_{(C_x(\alpha))[[F_{C_x(\alpha_{\dot\Omega_0})}(x)]]}(x) \\
&= F_{(C_x(\alpha))[[F_{(C_x(\alpha))[[x]]}(x)]]}(x) \\
&= F_{C_x(\alpha)}(x).
\end{align*}

Corollary 20. $G_{D_0 D_1 \ldots D_m \Omega_{m+1}}(x) = F_{D_0 \ldots D_{m-1} \dot\Omega_m}(x)$.

Remark: If one replaces $D_m$ by a similar function $D'_m$ with $D'_m(\alpha + 1) := D'_m \alpha + D'_m \alpha$, then the proofs of Theorem 19 and Corollary 20 go through when one considers a variant $F'_\alpha$ of the fast growing hierarchy which satisfies $F'_{\alpha+1}(n) := F'_\alpha(n) \cdot 2$.

Our result raises some immediate questions: What is the height of $D_0 \dot\Omega_\omega$? (By [4] it is known that the height will be bounded by $\varepsilon_0$.) Is $G_{D_0 \dot\Omega_\omega}$ in fact slow- or fast growing, or is $F_{D_0 \dot\Omega_\omega}$ slow- or fast growing? These questions are answered in the following section and the answer is somewhat surprising.
4 A direct proof of the hierarchy comparison theorem
This section is the result of a fruitful interaction of the second author with the referee during the refereeing procedure after which the referee became the first author. (We follow the tradition of using the lexicographic ranking of authors.) The first author calculated the order type of $D_0 \Omega_\omega$, which is $\omega^2$. He further proved some technical lemmata and showed $G_{D_0 \Omega_2}(x) \ne F_{D_0 \Omega_1}(x)$, which provided a counterexample to a claim of the second author in the first version of this article. The second author then took up the first author’s suggestions to start with some direct calculations and was able to correct his original proof. Some further results which have independent interest are documented in this section. We drop in this section the superscript dot in $\dot\Omega$ to simplify the notation.

Definition 21. $p(n) := 2^{n+1} - 1$.

Remark. $p(n + 1) = p(n) + p(n) + 1$.

Lemma 22.
a) $\mathrm{lev}(\beta) < n + 1 \Rightarrow D_n(\alpha + \beta) = D_n \alpha + \beta$.
b) $D_n(\Omega_{n+1} \cdot k) = \Omega_n \cdot p(k)$.
c) $F_{\omega \cdot k}(x) = x \cdot p(k)$.
d) $G_{\Omega \cdot k}(x) = x \cdot k$.

Proof. a) By induction on β. Proof of b) by induction on k: $D_n(\Omega_{n+1} \cdot 0) = \Omega_n \cdot 1 = \Omega_n \cdot p(0)$. Lemma 13 and assertion a) yield $D_n(\Omega_{n+1} \cdot (k+1)) = D_n(\Omega_{n+1} \cdot k) + D_n(\Omega_{n+1} \cdot k) + \Omega_n = \Omega_n \cdot p(k) + \Omega_n \cdot p(k) + \Omega_n = \Omega_n \cdot p(k+1)$. c) Similarly to b). d) By induction on k.

Lemma 23. $D_0 \ldots D_n(\Omega_{n+1} \cdot k) = \Omega_0 \cdot p^{n+1}(k)$.

Proof. By induction on n. n = 0: By Lemma 22b, $D_0(\Omega_1 \cdot k) = \Omega_0 \cdot p(k)$. n ≥ 1: $D_0 \ldots D_n(\Omega_{n+1} \cdot k) = D_0 \ldots D_{n-1}(\Omega_n \cdot p(k)) \overset{\mathrm{IH}}{=} \Omega_0 \cdot p^n(p(k)) = \Omega_0 \cdot p^{n+1}(k)$.

Lemma 22 and Lemma 23 yield the following version of the hierarchy comparison theorem.

Corollary 24. $n \ge 1 \Rightarrow G_{D_0 \ldots D_n(\Omega_{n+1} \cdot k)}(x) = x \cdot p^{n+1}(k) = F_{D_0 \ldots D_{n-1}(\Omega_n \cdot k)}(x)$.

This calculation indicates the surprising fact that F remains rather modestly growing in the current context and this will be verified in somewhat more detail.
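The closed forms in Lemma 22 c), d) and Corollary 24 can be checked mechanically for small arguments. The sketch below is illustrative and rests on an assumption of its own: ordinals below $\omega^2$ are coded as pairs (k, l) standing for $\omega \cdot k + l$, with the fundamental sequence $(\omega \cdot k)[i] = \omega \cdot (k-1) + i$, which is how $\Omega_0 \cdot k$ unfolds under clauses 1 and 2 of Lemma 10. The functions follow the recursion equations of Definitions 14 and 15; the names `F`, `G`, `p` and `iterate_p` are choices of the sketch.

```python
def p(n):
    """p(n) = 2^(n+1) - 1 (Definition 21)."""
    return 2 ** (n + 1) - 1


def iterate_p(times, k):
    """The iterate p^times(k)."""
    for _ in range(times):
        k = p(k)
    return k


def G(k, l, x):
    """G_{omega*k + l}(x) following Definition 15."""
    if k == 0:
        return l                       # G_0 = 0 and each successor step adds 1
    if l > 0:
        return G(k, 0, x) + l          # successor steps above the limit omega*k
    return G(k - 1, x, x)              # limit: G_{alpha_x}(x) with alpha_x = omega*(k-1) + x


def F(k, l, x):
    """F_{omega*k + l}(x) following Definition 14."""
    if k == 0:
        return x + l                   # F_0(x) = x and each successor step adds 1
    if l > 0:
        return F(k, 0, x) + l
    inner = F(k - 1, x, x)             # F_{alpha_x}(x)
    return F(k - 1, inner, x)          # F_{alpha_{F_{alpha_x}(x)}}(x)
```

For instance, with n = 1 and k = 1 in Corollary 24, one compares `G(iterate_p(2, 1), 0, x)` with `F(iterate_p(1, 1), 0, x)`; both evaluate to 15·x in this coding.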
Definition 25.
a) $T(\ge n) := \{\Omega_l \cdot j_l + \Omega_{l-1} \cdot j_{l-1} + \cdots + \Omega_n \cdot j_n : l \ge n\}$
b) $T(\le n) := \{\Omega_n \cdot j_n + \Omega_{n-1} \cdot j_{n-1} + \cdots + \Omega_0 \cdot j_0 + r\}$
c) $T(< \omega) := \{\Omega_l \cdot j_l + \Omega_{l-1} \cdot j_{l-1} + \cdots + \Omega_0 \cdot j_0 + r : l \ge 0\}$

Lemma 26.
a) $\alpha \in T(\ge n) \Rightarrow (\exists j)[D_n \alpha = \Omega_n \cdot j]$
b) $\alpha \in T(< \omega) \Rightarrow D_n \alpha \in T(\le n)$

Assertion a) yields that the height of $D_0 \Omega_\omega$ is $\omega^2$. Finally we arrive at an independent proof of the main result of the last section.

Theorem 27. $\alpha \in T(\le 1) \Rightarrow G_{D_0 \alpha}(x) = F_{C_x(\alpha)}(x)$.

Proof. Assume $\alpha = \Omega_1 \cdot j + \Omega_0 \cdot k + l$. Then $G_{D_0 \alpha}(x) = G_{D_0(\Omega_1 \cdot j) + \Omega_0 \cdot k + l}(x) = G_{\Omega_0 \cdot (2^{j+1} - 1) + \Omega_0 \cdot k + l}(x) = x \cdot (2^{j+1} - 1 + k) + l$. Moreover $F_{C_x(\alpha)}(x) = F_{\Omega_0 \cdot j + x \cdot k + l}(x) = F_{\Omega_0 \cdot j}(x) + x \cdot k + l = x \cdot (2^{j+1} - 1 + k) + l$.

Corollary 28. $G_{D_0(D_1(\Omega_{n+1} \cdot k))}(x) = F_{D_0(\Omega_n \cdot k)}(x)$.

Proof. Note that the term $D_1(\Omega_{n+1} \cdot k)$ is not an official member of T. But we have $D_1(\Omega_{n+1} \cdot k) = \Omega_1 \cdot j$ iff $D_0(\Omega_n \cdot k) = \Omega_0 \cdot j$. The last theorem then yields the assertion.

We close this section with a technical calculation of the value of $D_k \alpha$. This allows for rather precise estimates for calculating the values of the involved hierarchies.

Definition 29. $g() := 1$, $g(x_0, \ldots, x_m) = p_0(g(x_0, \ldots, x_{m-1})) + x_m$, where $p_0(n) := 2^n - 1$.

Lemma 30. $k \le m \ \&\ x_r > 0 \ \&\ k \le r \Rightarrow g(x_m, \ldots, x_r - 1, g(x_m, \ldots, x_r - 1, 0) + 1, 0^i) = g(x_m, \ldots, x_r, 0, 0^i)$.

Proof. Proof by induction on i:
1. i = 0:
\begin{align*}
g(x_m, \ldots, x_r - 1, g(x_m, \ldots, x_r - 1, 0) + 1) &= p_0(g(x_m, \ldots, x_r - 1)) + g(x_m, \ldots, x_r - 1, 0) + 1 \\
&= p_0(g(x_m, \ldots, x_r - 1)) + p_0(g(x_m, \ldots, x_r - 1)) + 1 \\
&= p_0(g(x_m, \ldots, x_r - 1) + 1) \\
&= p_0(g(x_m, \ldots, x_r)) = g(x_m, \ldots, x_r, 0).
\end{align*}
2. Induction step: trivial.
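The identity of Lemma 30 is easy to test on small inputs. The following sketch assumes only Definition 29; the helper `lemma30_sides` and its argument names are hypothetical. The number i of trailing zeros should be kept small when experimenting, since each additional argument applies $p_0$ once more and the values grow like a tower of exponentials.

```python
def p0(n):
    """p0(n) = 2^n - 1."""
    return 2 ** n - 1


def g(*xs):
    """g() = 1 and g(x_0, ..., x_m) = p0(g(x_0, ..., x_{m-1})) + x_m (Definition 29)."""
    value = 1
    for x in xs:
        value = p0(value) + x
    return value


def lemma30_sides(prefix, x_r, i):
    """Both sides of Lemma 30 for leading arguments `prefix`, x_r > 0 and i trailing zeros."""
    head = tuple(prefix) + (x_r - 1,)
    lhs = g(*head, g(*head, 0) + 1, *([0] * i))
    rhs = g(*prefix, x_r, 0, *([0] * i))
    return lhs, rhs
```

As a further consistency check, g(k, 0) equals $2^{k+1} - 1$, which is the value p(k) of Definition 21 appearing in Lemma 22 b).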
A straightforward modification of the proof for Lemma 13 shows the following Lemma.

Lemma 31. If r > 0 then $D_k(\beta + \Omega_{k+r}) = D_k(\beta + D_{k+r-1}(\beta + \Omega_{k+r-1}))$.

This yields the following general description of the values of the collapsing function on tree ordinals denoted by elements from T.

Theorem 32. $D_k(\Omega_{m+k} \cdot j_m + \Omega_{m+k-1} \cdot j_{m-1} + \cdots + \Omega_k \cdot j_0) = \Omega_k \cdot g_{m+1}(j_m, \ldots, j_0)$.

Proof. By induction on $\alpha = \Omega_{m+k} \cdot j_m + \Omega_{m+k-1} \cdot j_{m-1} + \cdots + \Omega_k \cdot j_0$. If $\alpha = 0$ then $D_k \alpha = \Omega_k = \Omega_k \cdot g_{m+1}(0, \ldots, 0)$. If $\alpha \ne 0$ let r be minimal such that $j_r > 0$. Let $\beta = \Omega_{m+k} \cdot j_m + \Omega_{m+k-1} \cdot j_{m-1} + \cdots + \Omega_{r+k} \cdot (j_r - 1)$. If r = 0 then
\begin{align*}
D_k \alpha &= D_k(\beta + \Omega_k) = D_k(\beta) + \Omega_k \\
&= \Omega_k \cdot (g_{m+1}(j_m, \ldots, j_0 - 1) + 1) = \Omega_k \cdot g_{m+1}(j_m, \ldots, j_0).
\end{align*}
If r > 0 then
\begin{align*}
D_k \alpha &= D_k(\beta + D_{k+r-1}(\beta + \Omega_{k+r-1})) \\
&= D_k(\beta + D_{k+r-1}(\beta) + \Omega_{k+r-1}) \\
&= D_k(\beta + \Omega_{k+r-1} \cdot g_{m+2-r}(j_m, \ldots, j_r - 1, 0) + \Omega_{k+r-1}) \\
&= \Omega_k \cdot g_{m+1}(j_m, \ldots, j_r - 1, g_{m+2-r}(j_m, \ldots, j_r - 1, 0) + 1, 0, \ldots, 0) \\
&= \Omega_k \cdot g_{m+1}(j_m, \ldots, j_r, 0, \ldots, 0).
\end{align*}

Corollary 33.
a) $G_{D_0(\Omega_{n+2} \cdot k)}(x) \ge F_{D_0(\Omega_n \cdot k)}(x)$.
b) The function $x \mapsto F_{D_0(\Omega_n \cdot x)}(x)$ is elementary recursive for every fixed $n < \omega$.
c) The function $x \mapsto G_{D_0 \Omega_x}(x)$ is not elementary recursive.

Remark. The results of the last two sections of this article show that it is possible to match the slow and fast growing hierarchies at level $\omega^2$, which thence might be considered as subrecursively inaccessible. To achieve this goal we used a slow growing hierarchy which a posteriori turned out to be fast growing in the sense
that it matches up with the elementary functions at level $\omega^2$. But our underlying choice of fundamental sequences is not artificial since we used a system of natural fundamental sequences from the existing standard literature.
5 Another hierarchy comparison result
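Let us consider the following variant $\hat{D}_m : O_{\omega+1} \to O_{m+1}$ of the collapsing functions $D_m$.
1. $\hat{D}_0 0 := 1$, $\hat{D}_{k+1} 0 := \Omega_{k+1}$.
2. $\hat{D}_m(\alpha + 1) := \hat{D}_m(\alpha) + 1$.
3. $\hat{D}_m((\alpha_\xi)_{\xi \in O_n}) := (\hat{D}_m(\alpha_\xi))_{\xi \in O_n}$, if $n \le m$.
4. $\hat{D}_m((\alpha_\xi)_{\xi \in O_{k+1}}) := (\hat{D}_m(\alpha_{\zeta_i}))_{i \in \mathbb{N}}$ with $\zeta_0 := 0$, $\zeta_{i+1} := \hat{D}_k \alpha_{\zeta_i}$, if $m \le k$.

The significance of this variant lies in the fact that $\hat{D}_m$ is, so to speak, the tree analogue of the ordinal function $\pi_m$ in [4]. This means that, if $\pi_0 \pi_{i_1} \ldots \pi_{i_l} 0$ is an ordinal term in the sense of [4], then $|\hat{D}_0 \hat{D}_{i_1} \ldots \hat{D}_{i_l} 0| = \pi_0 \pi_{i_1} \ldots \pi_{i_l} 0$.

In [4], among others, the following result is proved:
(1) If $a = \pi_0 \pi_{i_0} \ldots \pi_{i_l} 0$ is an ordinal term $\ge \omega$, then $\omega^a = \pi_0 \pi_{i_0+1} \ldots \pi_{i_l+1} 0$.
This can be sharpened to the following

Theorem 34. If $\alpha = \hat{D}_0 \hat{D}_{i_0} \ldots \hat{D}_{i_l} 0$ with $i_0 \ge 1$, then $\tilde\omega^\alpha = \hat{D}_0 \hat{D}_{i_0+1} \ldots \hat{D}_{i_l+1} 0$, where $\tilde\omega^\alpha \in O_1$ for $\alpha \in O_1$ is defined by
\[ \tilde\omega^\alpha := \begin{cases} 1 & \text{if } \alpha \in \{0, 1\} \\ (\tilde\omega^{\alpha_0} \cdot (i+1))_{i \in \mathbb{N}} & \text{if } \alpha = \alpha_0 + 1 \ne 1 \\ (\tilde\omega^{\alpha_i})_{i \in \mathbb{N}} & \text{if } \alpha = (\alpha_i)_{i \in \mathbb{N}}. \end{cases} \]

On the other side one easily shows

Lemma 35. For all $\alpha \in O_1$ we have $G_{\tilde\omega^\alpha} = \hat{F}_\alpha$, where $\hat{F}_\alpha : \mathbb{N} \to \mathbb{N}$ is defined by
\[ \hat{F}_\alpha(x) := \begin{cases} 1 & \text{if } \alpha \in \{0, 1\} \\ \hat{F}_{\alpha_0}(x) \cdot (x+1) & \text{if } \alpha = \alpha_0 + 1 \ne 1 \\ \hat{F}_{\alpha_x}(x) & \text{if } \alpha = (\alpha_i)_{i \in \mathbb{N}}. \end{cases} \]

In the same way as in Section 4 we have derived Corollary 24 from Lemmata 22 and 23, we now obtain Corollary 36 from Theorem 34 and Lemma 35.

Corollary 36. $G_{\hat{D}_0 \Omega_{n+1}} = \hat{F}_{\hat{D}_0 \Omega_n}$ for $n \ge 1$.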
References
[1] W. Buchholz: Three contributions to the conference on recent advances in proof theory. Preprint, Oxford 1980.
[2] W. Buchholz: Ordinal analysis of ID_ν. Lecture Notes in Mathematics 897, 234–260.
[3] W. Buchholz: An independence result for (Π^1_1–CA) + BI. Annals of Pure and Applied Logic 33 (1987), 131–155.
[4] K. Schütte and S.G. Simpson: Ein in der reinen Zahlentheorie unbeweisbarer Satz über endliche Folgen von natürlichen Zahlen. Archiv für mathematische Logik und Grundlagenforschung 25 (1985), 75–89.
[5] H. Schwichtenberg: Eine Klassifikation der ε0-rekursiven Funktionen. Z. Math. Logik Grundlagen Math. 17 (1971), 61–74.
[6] H. Schwichtenberg, S.S. Wainer: Ordinal bounds for programs. Feasible mathematics, II (Ithaca, NY, 1992), 387–406, Progr. Comput. Sci. Appl. Logic, 13, Birkhäuser Boston, Boston, MA, 1995.
[7] S.S. Wainer: Slow growing versus fast growing. The Journal of Symbolic Logic 54 (2) (1989), 608–614.
[8] A. Weiermann and G. Wilken: Goodstein sequences for prominent ordinals up to the ordinal of (Π^1_1–CA)_0. Preprint 2012. Submitted.
Conservativity of transitive closure over weak constructive operational set theory
Andrea Cantini and Laura Crosilla∗
Dedicated to Prof. Helmut Schwichtenberg
Constructive set theory à la Myhill–Aczel has been extended in [10, 11] to incorporate a notion of (partial, non–extensional) operation. Constructive operational set theory is a constructive and predicative analogue of Beeson’s Intuitionistic set theory with rules and of Feferman’s Operational set theory [4, 15, 16, 17, 18]. This paper is concerned with an extension of constructive operational set theory [11] by a uniform operation of Transitive Closure, τ. Given a set a, τ produces its transitive closure τa. We show that the theory ESTE of [11] augmented by τ is still conservative over Peano Arithmetic.
1 Introduction
This article is a follow–up of [10, 11, 9], where we introduced a number of systems of constructive set theory with operations and studied their proof–theoretic strength. Constructive Operational Set Theory is a constructive (thus generalised predicative) theory of sets and operations which has similarities with Feferman’s classical Operational Set Theory [15, 16, 17, 18] and Beeson’s impredicative Intuitionistic Set Theory with Rules [4]. It is an operational analogue of Constructive Set Theory in the style of Myhill and Aczel [23, 1]. One motivation behind constructive operational set theory is to merge a constructive notion of set [23, 1, 3] with some aspects which are typical of Explicit Mathematics [14]. In particular, one has non–extensional operations (or rules) alongside extensional constructive sets. Operations are in general partial and a limited form of self–application is permitted. In [11] a fully explicit fragment, called ESTE, of operational set theory was singled out. This system is finitely axiomatized and was shown to be proof–theoretically as strong as Peano Arithmetic, PA [11].

∗ The research is supported by MIUR, under the national project Thinking and Computing, PRIN 2008 and within the frame of the University of Florence local research unit, sub–project Abstraction and computation: logical and epistemological aspects. The second author is supported by EPSRC grant EP/G029520/1.

The aim of this note is to investigate an extension of elementary explicit constructive set theory ESTE [11] with an operator τ, which uniformly assigns to any given set a its transitive closure, τa. Formally, we strengthen ESTE with the axiom TC:
(∃z)(τa ≃ z ∧ Trans(z) ∧ a ⊆ z ∧ (∀c)(Trans(c) ∧ a ⊆ c → z ⊆ c))
where Trans(z) stands for (∀x)(∀y)(x ∈ z ∧ y ∈ x → y ∈ z). We prove that the resulting extension ESTEt is still conservative over PA in the sense that it has the same computational content as PA:
• for a closed application term f, if ESTEt proves that f : N → N, then f defines a recursive function that is provably total in PA.

The result is achieved in two steps. First of all we consider an extension of Aczel and Rathjen’s elementary constructive set theory ECST [3] with Myhill’s exponentiation axiom and transitive closure. We recall that ECST is the fragment of constructive set theory CZF, which contains, besides extensionality, pairing, union and strong infinity, the schemata of bounded separation and full replacement. Therefore, compared with full CZF, the ∈–induction and the subset collection schemata are dropped and replacement is used instead of the strong collection schema. We also recall that Myhill’s exponentiation axiom states that for any sets a and b the collection of all functions from a to b is a set. We here also consider the following axiom TRANS: ∀a∃b(a ⊆ b ∧ Trans(b)), where Trans(b) is defined as above. We show that if ECST is enriched with the exponentiation axiom and TRANS then the resulting system ECSTt is conservative over PA. The proof of conservativity is carried out by interpreting the set theory in a theory Tc of classical Frege structures with generalized induction principle GID, similarly as in [10, 11]. This latter theory is conservative over PA [7, 8].
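To see what the axiom TC asks of τ, it can help to compute transitive closures of hereditarily finite sets. The sketch below is an illustration under stated assumptions: sets are modelled as Python `frozenset` values, and `is_transitive` and `transitive_closure` are hypothetical helper names, not operations of ESTEt.

```python
def is_transitive(z: frozenset) -> bool:
    """Trans(z): every element of an element of z is itself an element of z."""
    return all(y in z for x in z for y in x)


def transitive_closure(a: frozenset) -> frozenset:
    """Collect a together with all elements of elements, hereditarily."""
    result = set(a)
    frontier = list(a)
    while frontier:
        x = frontier.pop()
        for y in x:
            if y not in result:
                result.add(y)
                frontier.append(y)
    return frozenset(result)


# The first von Neumann numerals as hereditarily finite sets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

a = frozenset({two})          # the singleton {2}, which is not transitive
print(transitive_closure(a) == three)
```

Here the closure of {2} is {0, 1, 2}: it is transitive, contains {2}, and is included in every transitive set containing {2}, which are the three conjuncts of TC.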
The first step in the proof is the extension to ECSTt of the realizability interpretation of [11], which is carried out within Tc and recalls in this context Aczel’s type–theoretic interpretation of the constructive set theory CZF [1]. We wish to underline two aspects of this proof. First of all, we use the fixed point theorem of Tc to define a transitive closure operator underlying the interpretation of the axiom TRANS. This interpretation is thus very much in the spirit of the operational set theory we here introduce in the first place.
Transitive Closure in Operational Set Theory
Secondly, the proof relies on the fact that the theory Tc is equipped with the principle GID (Generalized Inductive Definitions). As a consequence a useful induction principle is provable in the model construction VN (proposition 17). However, due to a separation, in the model, between natural numbers and sets, the VN–induction scheme of proposition 17 is acquired at no cost from a proof–theoretic perspective (see also section 2.2). In the second step of the proof we prove that ESTEt is conservative over ECSTt by means of proof–theoretic techniques (asymmetric interpretation, partial cut elimination). Since the argument is a routine extension of the one given in [11], it will only be sketched here.

We conclude this introduction by recalling that the status of transitive closure in weak classical set theory has recently gained considerable attention. The (quite surprising) role of transitive closure for determining the relation between classical set theory without infinity and Peano Arithmetic has been investigated in [20]; see also [21, 12]. Models of Zermelo set theory (with foundation) in which Transitive Closure fails have been given for example in [5, 6, 22]. In [13] the authors study Antifoundation and Transitive Closure on the basis of Zermelo set theory (without foundation). We are not aware of specific studies on the proof–theoretic strength of transitive closure on the basis of weak constructive set theories. For the reader’s convenience we first briefly recall the theories we shall be working in.
1.1 Elementary operational set theory
1.1.1 Language and conventions
The language of ESTE is the following applicative extension, LO, of the usual first order language of Zermelo–Fraenkel set theory, L. The language includes the predicate symbols ∈ and =. The logical symbols are all the intuitionistic operators: ⊥, ∧, ∨, →, ∃, ∀. We have in addition:
• the combinators K and S;
• a ternary predicate symbol, App, for application; App(x, y, z) is read as x applied to y yields z;
• el for the ground operation representing membership;
• pair, un, im, sep, exp, for set operations;
• ∅, ω, set constants.
For convenience we also use the bounded quantifiers ∃x ∈ y and ∀x ∈ y, as abbreviations for ∃x (x ∈ y ∧ . . .) and ∀x (x ∈ y → . . .). As customary, we define ϕ ↔ ψ by (ϕ → ψ) ∧ (ψ → ϕ) and ¬ϕ by ϕ → ⊥. We also write a ⊆ b for ∀z (z ∈ a → z ∈ b).

Terms and formulas. Terms and formulas are inductively defined as usual. To increase perspicuity, we consider a definitional extension of LO with application terms, defined inductively as follows.¹
(i) Each variable and constant is an application term.
(ii) If t, s are application terms, then ts is an application term.
Application terms will be used in conjunction with the following abbreviations.
(i) t ≃ x for t = x when t is a variable or constant.
(ii) ts ≃ x for ∃y ∃z (t ≃ y ∧ s ≃ z ∧ App(y, z, x)).
(iii) t ↓ for ∃x (t ≃ x).
(iv) t ≃ s for ∀x (t ≃ x ↔ s ≃ x).
(v) ϕ(t, . . . ) for ∃x (t ≃ x ∧ ϕ(x, . . . )).
(vi) t1 t2 . . . tn for (. . . (t1 t2) . . . )tn.
To ease readability we sometimes use the notation t(x, y) for txy. In the language LO, the notion of bounded formula needs to be appropriately modified.

Definition 1 (Bounded formulas). A formula of LO is bounded, or ∆0, if and only if all quantifiers occurring in it, if any, are bounded and in addition it does not contain application App.

Classes are introduced similarly as in ordinary set theory: they are abbreviations for abstracts {x : ϕ(x)}, for any formula ϕ of the language LO. In particular, we let V := {x : x ↓}. For A and B sets or classes, we write f : A → B for ∀x ∈ A (f x ∈ B) and f : V → B for ∀x (f x ∈ B). By f : A² → B and f : V² → B we indicate ∀x ∈ A ∀y ∈ A (f xy ∈ B) and ∀x ∀y (f xy ∈ B), respectively. This can be clearly extended to arbitrary exponents n > 2. Finally, for a set a, f : a → V means that f is everywhere defined on a.

1 The use of application terms goes back to Feferman [14].
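The abbreviations (ii) and (iii) say that an application term has a value only when all of its subterms do. A small evaluator makes this strictness concrete. It is a sketch under assumptions: the partial function `app` plays the role of App for K and S only, atoms are plain strings, and `Ap`, `ev` and `term` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ap:
    """The application term ts."""
    fun: object
    arg: object


def app(x, y):
    """Return the z with App(x, y, z), or None when x applied to y is undefined."""
    if x == "K":
        return ("K1", y)                  # K y, waiting for one more argument
    if x == "S":
        return ("S1", y)                  # S y
    if isinstance(x, tuple):
        if x[0] == "K1":
            return x[1]                   # K y z = y
        if x[0] == "S1":
            return ("S2", x[1], y)        # S x y is always defined
        if x[0] == "S2":
            xz, yz = app(x[1], y), app(x[2], y)
            if xz is None or yz is None:
                return None
            return app(xz, yz)            # S x y z ≃ x z (y z)
    return None                           # other atoms are not operations here


def ev(t):
    """Value of an application term, or None when the term is undefined."""
    if not isinstance(t, Ap):
        return t                          # variable or constant: t ≃ x is t = x
    f, a = ev(t.fun), ev(t.arg)
    if f is None or a is None:
        return None                       # ts ≃ x needs values for both t and s
    return app(f, a)


def term(*ts):
    """t1 t2 ... tn, associated to the left as in abbreviation (vi)."""
    t = ts[0]
    for s in ts[1:]:
        t = Ap(t, s)
    return t


print(ev(term("S", "K", "K", "a")))
```

In this model `term("K", "c", term("a", "b"))` has no value, because the subterm `a b` is undefined, even though K would discard it.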
Truth values. We may represent false and truth by the empty set and the singleton empty set, respectively; that is, we let ⊥ := ∅ and ⊤ := {∅} := pair ∅ ∅. Let Ω be the class P⊤ := {x : x ⊆ ⊤}. The class Ω intuitively represents the class of truth values (or of propositions). Note that in the presence of exponentiation, if Ω is taken to be a set, then full powerset follows (see Aczel [1], Proposition 2.3).

Relations and set–theoretic functions. The notions of relation between two sets, of domain and range of a relation can be defined in the obvious way in ESTE. In the following we write Dom(R) and Ran(R) to denote the domain and the range of a relation, respectively.² We also have a standard notion of set–theoretic function which we can express by a formula, Fun(F), stating that F is a set encoding a total binary relation which satisfies the obvious uniqueness condition. We shall use upper case letters F, G, . . . for set–theoretic functions and lower case letters f, g, . . . for operations (that is, if they formally occur as operators in application terms or as first coordinates in App–contexts). In [10, 11] we have investigated the relation between the notions of operation and set–theoretic functions. Finally, in defining the axiom of infinity we shall make use of the following successor operation.

Definition 2. Let Suc := λx.un (pair x (pair x x)).

1.1.2 Axioms of ESTE
Definition 3. ESTE is the LO theory whose principles are all the axioms and rules of first order intuitionistic logic with equality, plus the following principles.
Extensionality
• ∀x (x ∈ a ↔ x ∈ b) → a = b
General applicative axioms
• App(x, y, z) ∧ App(x, y, w) → z = w
• Kxy = x ∧ Sxy ↓ ∧ Sxyz ≃ xz(yz)
2 In [11] we have shown that in ESTE there is an operator opair internally representing the ordered pair of two sets. In addition, also the range and the domain of a relation correspond to internal operations, respectively.
Membership operation
• el : V² → Ω and el xy ≃ ⊤ ↔ x ∈ y
Set constructors
• ∀x (x ∉ ∅)
• pair xy ↓ ∧ ∀z (z ∈ pair xy ↔ z = x ∨ z = y)
• un a ↓ ∧ ∀z (z ∈ un a ↔ ∃y ∈ a (z ∈ y))
• (f : a → Ω) → sep f a ↓ ∧ ∀x (x ∈ sep f a ↔ x ∈ a ∧ f x ≃ ⊤)
• (f : a → V) → im f a ↓ ∧ ∀x (x ∈ im f a ↔ ∃y ∈ a (x ≃ f y))
Strong infinity
• (ω1) ∅ ∈ ω ∧ ∀y ∈ ω (Suc y ∈ ω)
• (ω2) ∀x (∅ ∈ x ∧ ∀y (y ∈ x → Suc y ∈ x) → ω ⊆ x)
Exponentiation
• exp ab ↓ ∧ ∀x (x ∈ exp ab ↔ (Fun(x) ∧ Dom(x) = a ∧ Ran(x) ⊆ b))
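The set constructors can be read as ordinary functions on hereditarily finite sets. The following sketch assumes sets are Python `frozenset` values and operations are total Python functions, which ignores the partial and non-extensional character of operations in ESTE. The lowercase names mirror the constants of the language; `numeral` is a hypothetical helper.

```python
EMPTY = frozenset()


def pair(x, y):
    """z ∈ pair x y  iff  z = x or z = y."""
    return frozenset({x, y})


def un(a):
    """z ∈ un a  iff  z ∈ y for some y ∈ a."""
    return frozenset(z for y in a for z in y)


BOT = EMPTY                 # ⊥ := ∅
TOP = pair(EMPTY, EMPTY)    # ⊤ := {∅}


def el(x, y):
    """Membership as an operation into the truth values."""
    return TOP if x in y else BOT


def sep(f, a):
    """The elements x of a with f x ≃ ⊤."""
    return frozenset(x for x in a if f(x) == TOP)


def im(f, a):
    """The image of a under the operation f."""
    return frozenset(f(y) for y in a)


def suc(x):
    """Suc := λx.un (pair x (pair x x)), which is x ∪ {x}."""
    return un(pair(x, pair(x, x)))


def numeral(n):
    """The n-th iterate of suc on ∅, for illustration only."""
    x = EMPTY
    for _ in range(n):
        x = suc(x)
    return x


print(suc(EMPTY) == TOP, len(numeral(3)))
```

For example, `sep(lambda x: el(EMPTY, x), numeral(3))` picks out the elements of 3 that contain ∅, and `im(suc, numeral(2))` is the set {1, 2}.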
Remark. The principles ruling sep and im embody the explicit character of the separation and replacement schemata in the present operational context: sep provides – uniformly in any given f : a → Ω – the set of all elements satisfying the “propositional function” defined by f; on the other hand, im yields – uniformly in any given operation f defined on a set a – the image of a under f.

Definition 4 (The theory ESTEt). The theory ESTEt is obtained from ESTE by adding a new constant τ to the language together with the axiom TC:
(τa ↓ ∧ Trans(τa) ∧ a ⊆ τa ∧ (∀c)(Trans(c) ∧ a ⊆ c → τa ⊆ c))
where Trans(z) stands for (∀x)(∀y)(x ∈ z ∧ y ∈ x → y ∈ z).

1.1.3 Elementary Constructive Set Theory
In [3] the authors introduce a subsystem of CZF called ECST (for Elementary Constructive Set Theory). Such a system is intended to be somehow minimal. On the one hand, Aczel and Rathjen show that many standard set–theoretic constructions may be carried out already in this fragment of constructive set theory. On the
other hand, ECST is very weak as for example it does not prove the existence of the addition function on ω [24]. We shall here be interested in a strengthening of ECST by addition of exponentiation, as such a theory is of the same proof–theoretic strength as Peano Arithmetic. The language of ECST is the same language as that of Zermelo–Fraenkel set theory. In this context, the notion of ∆0 formula is the standard one, that is, a formula is ∆0 or bounded if no unbounded quantifier occurs in it.

Definition 5. The theory ECST includes the principles of first order intuitionistic logic plus the following set–theoretic principles.
1. Extensionality;
2. Pair;
3. Union;
4. ∆0–Separation (that is separation restricted to ∆0 formulas only);
5. Replacement: for arbitrary ϕ, ∀x ∈ a ∃!y ϕ(x, y) → ∃b ∀y (y ∈ b ↔ ∃x ∈ a ϕ(x, y));
6. Strong Infinity: ∃a [Ind(a) ∧ ∀z (Ind(z) → a ⊆ z)],
where we use the following abbreviations:
• Empty(y) for (∀z ∈ y) ⊥;³
• Suc(x, y) for ∀z [z ∈ y ↔ z ∈ x ∨ z = x];
• Ind(a) for (∃y ∈ a)Empty(y) ∧ (∀x ∈ a)(∃y ∈ a)Suc(x, y).
We write ω also for the set defined by strong infinity (which is unique by extensionality). Note that ECST differs from the better known system CZF in that it only has Replacement in place of Strong Collection and it omits both Subset Collection and ∈–Induction.

Let Exponentiation be the axiom: ∀a, b ∃c ∀z (z ∈ c ↔ (Fun(z) ∧ Dom(z) = a ∧ Ran(z) ⊆ b)), where as usual Fun(z) is a bounded formula expressing the fact that z is a set–theoretic function, Dom(z) and Ran(z) are the domain and range of z, respectively.

3 ⊥ stands for the absurd sentence and should be distinguished from ⊥, as used above for the empty set, or the minimum truth value.
Definition 6. The theory ECSTt is obtained from ECST by adding the axiom of exponentiation and the axiom TRANS: ∀a∃b(a ⊆ b ∧ Trans(b)).
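For finite sets the exponentiation axiom can be checked by brute force: the set of all functions from a to b is enumerated and compared with the defining condition Fun(z) ∧ Dom(z) = a ∧ Ran(z) ⊆ b. This is a sketch under assumptions. Ordered pairs are modelled by Python tuples rather than by set-theoretic pairs, and `exp_set`, `is_function`, `dom` and `ran` are hypothetical names.

```python
from itertools import product


def is_function(F) -> bool:
    """Fun(F): a set of ordered pairs with at most one value per argument."""
    args = [x for (x, _) in F]
    return len(args) == len(set(args))


def dom(F):
    return frozenset(x for (x, _) in F)


def ran(F):
    return frozenset(y for (_, y) in F)


def exp_set(a, b):
    """All set-theoretic functions from a to b, each as a frozenset of pairs."""
    xs = list(a)
    return frozenset(
        frozenset(zip(xs, ys)) for ys in product(b, repeat=len(xs))
    )


a = frozenset({0, 1, 2})
b = frozenset({0, 1})
c = exp_set(a, b)
print(len(c))
```

With a of size 3 and b of size 2 the result has 2³ = 8 elements. For empty a the result is the singleton of the empty function.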
2 Constructing the model
The proof–theoretic strength of ECSTt is determined by a realisability interpretation into a classical axiomatic theory of abstract self–referential truth, Tc . This is conservative over PA [7, 8]. First of all let us recall the theory Tc .
2.1 The theory Tc
The basic first order language LT of Tc comprises the predicate symbols =, T, N, the binary function symbol ap (application), combinators K, S, successor, predecessor, definition by cases on numbers, pairing with projections. Terms are inductively generated from variables and individual constants via application. As usual ts := ap(t, s); missing brackets are restored by associating to the left. Formulas are inductively generated from atoms of the form t = s, T(t), N(t) by means of sentential operations and quantifiers. We adopt the following conventions:
(i) By [ϕ] we denote a term representing the propositional function associated with ϕ and such that FV([ϕ]) = FV(ϕ). We fix distinct closed terms $\hat\forall$, $\hat\exists$, $\hat\neg$, $\hat\wedge$, . . . , naming the logical constants. In addition, $\hat=$, $\hat N$ name the equality and the number predicates, respectively. Then [ϕ] is inductively defined by stipulating [t = s] = ($\hat=$ ts), [N(s)] = $\hat N$s, [T(s)] = s and closing under application of the “small hat” operations, noting that [∀xϕ] = $\hat\forall$(λx[ϕ]), [∃xϕ] = $\hat\exists$(λx[ϕ]).
(ii) Given a formula ϕ we define abstraction by letting {x : ϕ} := λx.[ϕ].
(iii) We define intensional membership, η, as follows: x η a := T(ax); x η̄ a := T($\hat\neg$(ax)).
(iv) The notion of class (or classification) is so specified⁴: Cl(a) := ∀x (x η a ∨ x η̄ a).

4 A warning: here ‘class’ is understood in the (non–extensional) sense of the theory of Frege structures [2, 8].
(v) A formula ϕ is T–positive iff ϕ is inductively generated from prime formulas of the form T(t), t = s, ¬t = s, N(t), ¬N(t) by means of ∨, ∧, ∀, ∃.
(vi) A formula ϕ is T–positive operative in v (in short, T–positive or a positive operator) iff ϕ belongs to the smallest class of formulas inductively generated from prime formulas of the form T(t), s η v, t = s, ¬t = s, N(t), ¬N(t) by means of ∨, ∧, ∀y, ∃y, where y is distinct from v and v does not occur in t, s.
(vii) For each formula ϕ, fixed points are defined by letting: I(ϕ) := Y(λv.{x : ϕ(x, v)}) where Y is Curry’s fixed point combinator.

The system Tc comprises the following principles, besides classical predicate calculus with equality.
1. The base theory TON− (see e.g. [19]), which formalises the notion of total extensional combinatory algebra expanded with natural numbers. This includes the obvious axioms on combinators, pairing, projections. In addition, closure axioms for the predicate N defining a copy of the natural numbers, together with number theoretic conditions on the basic operations of successor SUC, predecessor PRED, 0, definition by cases on the natural numbers.
2. A fixed point axiom (Tr) for abstract truth: Tr(x, T) ↔ T(x). Here Tr(x, T) is a formula encoding the following inference rules
$$\frac{a = b}{T[a = b]} \qquad \frac{\neg(a = b)}{T[\neg(a = b)]} \qquad \frac{N(a)}{T[N(a)]} \qquad \frac{\neg N(a)}{T[\neg N(a)]}$$
for the basic atomic formulas with = and N. Further, the following additional clauses for the compound formulas:
$$\frac{T(a)}{T(\hat\neg\hat\neg a)} \qquad \frac{Ta \quad Tb}{T(a \hat\wedge b)} \qquad \frac{T(\hat\neg a)\ [\text{or } T(\hat\neg b)]}{T(\hat\neg(a \hat\wedge b))} \qquad \frac{\forall x\, T(ax)}{T(\hat\forall a)} \qquad \frac{\exists x\, T(\hat\neg ax)}{T(\hat\neg\hat\forall a)}$$
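Formally Tr(x, T) is spelled out by the formula:
$$\begin{aligned} \exists u \exists v \exists w\, [\ & (x = [u = v] \wedge u = v)\\ \vee\ & (x = [\neg u = v] \wedge \neg u = v)\\ \vee\ & (x = [Nv] \wedge N(v))\\ \vee\ & (x = [\neg N(v)] \wedge \neg Nv)\\ \vee\ & (x = \hat\neg\hat\neg v \wedge Tv)\\ \vee\ & (x = v \hat\wedge w \wedge Tv \wedge Tw)\\ \vee\ & (x = \hat\neg(v \hat\wedge w) \wedge (T(\hat\neg v) \vee T(\hat\neg w)))\\ \vee\ & (x = \hat\forall v \wedge \forall z (T(vz)))\\ \vee\ & (x = \hat\neg\hat\forall v \wedge \exists z (T(\hat\neg vz)))\ ] \end{aligned}$$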
3. Consistency axiom: ¬(Tx ∧ T$\hat\neg$x).
4. Induction on natural numbers Cl−IND_N for classes: Cl(a) ∧ Clos_N(a) → ∀x (N(x) → x η a) with Clos_N(a) := 0 η a ∧ ∀x (x η a → (SUC x) η a).
5. The principle GID, ensuring the minimality of the fixed points: if ϕ(x, v) is a positive operator, Clos_ϕ(ψ) → ∀x (x η I(ϕ) → ψ(x)) with Clos_ϕ(ψ) := ∀x (ϕ(x, ψ) → ψ(x)).⁵

T− is the theory Tc without number theoretic induction. Let CL be {x : Cl(x)} (which is provably not a class). Then we can show that CL has natural closure conditions which are essential for the interpretation of ECSTt. That is, in T−, CL is closed under elementary comprehension, generalized disjoint union, generalized disjoint product. It satisfies a form of positive comprehension: if ϕ is T–positive, then T[ϕ] ↔ ϕ and ∀x (x η {u : ϕ} ↔ ϕ[u := x]). Also a version of the second recursion theorem holds: if ϕ is positive then ∀x (x η I(ϕ) ↔ ϕ(x, I(ϕ))). For the proofs, see [8], II.9B, II.10A.

5 Here ϕ(x, ψ) is the formula obtained by replacing each occurrence of the formula t η v in ϕ(x, v) by means of ψ(t).
Theorem 7. Tc is proof–theoretically equivalent to PA. Proof. See [10], Theorem 7.3, or [7].
2.2 Tc and the natural numbers

We wish to remark that the principle GID ensures the minimality of the fixed points that are expressible in the language of Tc. We also underline that Tc includes induction on natural numbers for classes only. In the following, let N be the class {x : N(x)}. Clearly, in Tc there are a priori two distinct notions of natural numbers: on the one hand, there is the class N on which one can argue by induction only relative to classes; on the other hand, by the fixed point theorem one can find I_N such that Clos_N(ψ) → ∀x (x η I_N → ψ(x)), where Clos_N(ψ) is the formula ψ(0) ∧ ∀x (ψ(x) → ψ(SUC(x))). Clearly by GID one can show that I_N ⊆ N, but we are not allowed to argue conversely (as there are models where indeed I_N is strictly contained in N and hence N contains non–standard natural numbers). The main trick in the definition of the set–theoretic universe VN below amounts to embedding non–standard numbers in it so that they form a set and satisfy the strong induction axiom, but not the full mathematical induction schema. In the following we shall work informally in the theory Tc. Let (x, y) denote the basic pairing operation which is built into the axioms of Tc; (x, y, z) stands for (x, (y, z)), and, if u = (x, y, z), u_0 = x, u_1 = y and u_2 = z.

Recall that N is the class {x : N(x)}; for k ∈ N let N_k := {m : m η N ∧ m : 0 = 1};
‖a = b‖ = {e : e = 0 ∧ ∃k (k η N ∧ N_k = ā = b̄ ∧ ã = b̃ = ν)}
  ⊕ {e : e = (e_0, e_1) ∧ ¬(Nat(a) ∧ Nat(b))
      ∧ ∀x η ā ((e_0 x)_0 η b̄ ∧ (e_0 x)_1 η ‖ãx = b̃(e_0 x)_0‖)
      ∧ ∀y η b̄ ((e_1 y)_0 η ā ∧ (e_1 y)_1 η ‖ã(e_1 y)_0 = b̃y‖)};
‖a ∈ b‖ = {e : e = (e_0, e_1) ∧ e_0 η b̄ ∧ e_1 η ‖a = b̃e_0‖};
‖ϕ ∧ ψ‖ = {e : e = (e_0, e_1) ∧ e_0 η ‖ϕ‖ ∧ e_1 η ‖ψ‖};
‖ϕ ∨ ψ‖ = ‖ϕ‖ ⊕ ‖ψ‖;
‖ϕ → ψ‖ = {e : ∀q η ‖ϕ‖ (eq η ‖ψ‖)};
‖∃u ∈ a ϕ(u)‖ = {e : e = (e_0, e_1) ∧ e_0 η ā ∧ e_1 η ‖ϕ(ãe_0)‖};
‖∀u ∈ a ϕ(u)‖ = {e : ∀x η ā (ex η ‖ϕ(ãx)‖)}.
Formally speaking, the definition of ‖ϕ‖ above makes sense only after showing by a fixed point argument that there exists an operation H(a, b) satisfying the equation for ‖a = b‖ (hence the definition inductively extends H to arbitrary bounded conditions).

Definition 21. Let ϕ be an arbitrary formula of ECSTt; we inductively define a formula e ⊩ ϕ of Tc with the same free variables as ϕ and a fresh variable e:
1. if ϕ is a bounded formula of ECSTt, then e ⊩ ϕ iff e η ‖ϕ‖;
2. else:
e ⊩ ϕ → ψ iff ∀f (f ⊩ ϕ → ef ⊩ ψ);
e ⊩ ϕ ∧ ψ iff e = (e_0, e_1) ∧ e_0 ⊩ ϕ ∧ e_1 ⊩ ψ;
e ⊩ ϕ ∨ ψ iff (e = (0, e_1) ∧ e_1 ⊩ ϕ) ∨ (e = (1, e_1) ∧ e_1 ⊩ ψ);
e ⊩ ∀u ∈ a ϕ(u) iff ∀x η ā (ex ⊩ ϕ(ãx));
e ⊩ ∃u ∈ a ϕ(u) iff e = (e_0, e_1) ∧ e_0 η ā ∧ e_1 ⊩ ϕ(ãe_0);
e ⊩ ∃u ϕ iff e = (e_0, e_1) ∧ e_0 η VN ∧ e_1 ⊩ ϕ(e_0);
e ⊩ ∀u ϕ iff ∀u η VN (eu ⊩ ϕ(u)).
Lemma 22. Let ϕ be a bounded formula of ECSTt. Then Tc proves
$$\vec{x} \in V_N \to Cl(\|\varphi(\vec{x})\|). \qquad (19)$$
Proof. By induction on the complexity of the formula ϕ. The atomic case a = b is proved by VN induction. The other cases use the fact that classes are closed under elementary comprehension, generalized disjoint union, generalized disjoint product.
3.1 Realizability: preparatory computations
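In this subsection we assume that a η VN with a_0 = 2 (that is we look at the transitive closure for non–natural numbers) and perform some preparatory computations. Recall that if a_0 = 2, then
$$\tau a = a \mathbin{\dot\cup} \dot{\bigcup} \sup(\bar a, \lambda y.\tau(\tilde a y)).$$
We obtain:
$$\begin{aligned} \tau a &= \dot{\bigcup}\{a, \dot{\bigcup}\sup(\bar a, \lambda y.\tau(\tilde a y))\}_V && (20)\\ &= \dot{\bigcup} M(a) \equiv \dot{\bigcup}\sup(2, \lambda z.\,D\,a\,(\dot{\bigcup}\sup(\bar a, \lambda y.\tau(\tilde a y)))\,z\,0) && (21)\\ &= \sup(e(M(a)), \lambda z.\,\widetilde{\widetilde{M(a)} z_0}\, z_1) && (22) \end{aligned}$$
Now observe that
$$\overline{M(a)} = 2, \qquad \widetilde{M(a)} = \lambda z.\,D\,a\,(\dot{\bigcup}\sup(\bar a, \lambda y.\tau(\tilde a y)))\,z\,0,$$
hence
$$\widetilde{M(a)}0 = a, \qquad \widetilde{M(a)}1 = \dot{\bigcup}\sup(\bar a, \lambda y.\tau(\tilde a y)).$$
Lemma 23 (provably in Tc).
$$\overline{\tau a} = \bar a \oplus \{(u_0, u_1) \mid u_0\,\eta\,\bar a \wedge u_1\,\eta\,\overline{\tau(\tilde a u_0)}\}, \qquad \widetilde{\tau a} = \lambda z.\,\widetilde{\widetilde{M(a)} z_0}\, z_1.$$
Proof. The second equation follows by definition of τa and (22) above. We then complete the computation:
$$\begin{aligned} \overline{\tau a} &= e(M(a)) = \{(u_0, u_1) \mid u_0\,\eta\,2 \wedge u_1\,\eta\,\overline{\tau(\tilde a u_0)}\}\\ &= \{(0, u_1) \mid u_1\,\eta\,\overline{\widetilde{M(a)}0}\} \cup \{(1, u_1) \mid u_1\,\eta\,\overline{\widetilde{M(a)}1}\}\\ &= \{(0, u_1) \mid u_1\,\eta\,\bar a\} \cup \{(1, u_1) \mid u_1\,\eta\,\overline{\dot{\bigcup}\sup(\bar a, \lambda y.\tau(\tilde a y))}\}\\ &= \bar a \oplus \{(u_0, u_1) \mid u_0\,\eta\,\bar a \wedge u_1\,\eta\,\overline{\tau(\tilde a u_0)}\}. \end{aligned}$$
Remark. Let $x \mathrel{\dot\in} \tau a$: then
$$\exists u\,(u\,\eta\,\overline{\tau a} \wedge x \doteq \widetilde{\tau a}\,u).$$
If $u\,\eta\,\overline{\tau a}$, then by Lemma 23, u is of the form (u_0, u_1) and there are two cases to take into account: u_0 = 0 or u_0 = 1.
Case 1. u_0 = 0. Then $u_1\,\eta\,\bar a$ and $x \doteq \tilde a u_1$ and hence $x \mathrel{\dot\in} a$.
Case 2. u_0 = 1. Then u_1 = (v_0, v_1), $v_0\,\eta\,\bar a$ and $v_1\,\eta\,\overline{\tau(\tilde a v_0)}$. Hence $x \doteq \widetilde{\tau(\tilde a v_0)}\,v_1$. But this witnesses $\exists c \mathrel{\dot\in} a.\ x \mathrel{\dot\in} \tau c$.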
3.2 Realizability: Soundness Theorem
Lemma 24. There are operators F, G such that, provably in Tc, for every a, b ∈ VN:
(i) G(a) ⊩ a ⊆ τa;
(ii) F(a) ⊩ Trans(τa).
Proof. Essentially this holds because the proof of Lemma 19 is constructive.
(i) Let us consider the first property. We have to show that, for some G(a), G(a) realizes a ⊆ τa. That is, we must check that some G(a) inhabits ‖∀u ∈ a. u ∈ τa‖. Writing e for G(a), this is:
$$\forall i\,\eta\,\bar a\,\bigl((ei)_0\,\eta\,\overline{\tau a} \wedge (ei)_1\,\eta\,\|\tilde a i = \widetilde{\tau a}\,(ei)_0\|\bigr).$$
In case a_0 = 2, for i ∈ ā (by Lemma 23) we can take (ei)_0 = (0, i), and $(ei)_1 \in \|\tilde a i = \widetilde{\tau a}(0, i)\|$. But (again by Lemma 23):
$$\widetilde{\tau a}(0, i) = (\lambda z.\,\widetilde{\widetilde{M(a)} z_0}\, z_1)(0, i) = \tilde a i.$$
Thus we can take e such that for every i η ā, ei = ((0, i), f_{ãi}), where f_{ãi} is a realizer of the identity ãi = ãi. In case a_0 = 1, for every i η ā, ei = (i, f_{ãi}) verifies the inclusion.
(ii) As to the second property, we show:
Sublemma. Assume that
1. there exists an operation J, such that, for every i η N, J(i) η ‖Trans(τ(ν(i)))‖;
2. there exists an operation K, such that, whenever Cl(a) ∧ ∀x η a. hx η ‖Trans(τ(f x))‖, then K(h, a, f) η ‖Trans(τ(sup(a, f)))‖.
Then there exists F such that ∀a η VN. F(a) η ‖Trans(τ(a))‖.
Verification of (ii). Using definition by cases on N, there exists an operation Φ, such that
• Φ(F, a) = J(i), if a_0 = 1, a_1 = i;
• Φ(F, a) = K(F, a_1, a_2), if a_0 = 2.
Choose F(a) = Φ(F, a) by the fixed point theorem; F is the required operation (argue by VN–induction using the assumptions on J and K). Thus it remains to prove that there are operations J, K satisfying the hypotheses of the sublemma. As to J, we make the realizers in the proof of Lemma 19 explicit and obtain:
• there exists an operation Ω such that, for every k η N, for every f, if ∀i < k. f(i) η ‖Trans(τ(νi))‖, then Ω(f, k) η ‖Trans(τ(νk))‖.
Then, by the fixed point theorem and Cl−IND_N, there exists J such that, for every i η N, J(i) realizes Trans(τ(ν(i))) (see (17), lemma 19). Concerning K, assume that a is a class, f : a → VN and
∀x η a. hx η ‖Trans(τ(f x))‖.   (23)
We wish to find K(h, a, f) η ‖Trans(τ(sup(a, f)))‖. In the following we argue informally. Let e η ‖v ∈ u‖; g η ‖u ∈ τ(sup(a, f))‖. Read the realizer g of u ∈ τ(sup(a, f)). If (g_0)_0 = 0, we have a realizer of u ∈ sup(a, f) and also a realizer of u ⊆ τu (by the previous step (i)), whence a realizer of v ∈ τu (using e). This readily yields a realizer of $v \in \dot{\bigcup}\sup(a, \lambda y.\tau(f y))$ and hence one of v ∈ τ(sup(a, f)). Else, there is a realizer of u ∈ τ(f x), for some x η a. Then by (23) and using e, we get a realizer of $v \in \dot{\bigcup}\sup(a, \lambda y.\tau(f y))$, and hence one of v ∈ τ(sup(a, f)).

Theorem 25. Every theorem of ECSTt is realized in Tc, i.e. if ECSTt ⊢ ϕ($\vec{x}$), then there exists a closed term e such that, provably in Tc, for $\vec{a}$ ∈ VN, e$\vec{a}$ ⊩ ϕ($\vec{a}$).
Proof. See Theorem 8.22 of [10]. The new case – the axiom TRANS – is taken care of by Lemma 24.
4 Reducing ESTEt to ECSTt

We have to show that
Theorem 26. ESTEt is interpretable in ECSTt. Moreover, every ≃–free sequent provable in ESTEt is already provable in ECSTt.
Theorem 26 is verified by means of proof–theoretic methods. This is achieved through two steps. First of all, we give a Gentzen–style formulation of ESTEt, called Γt, so that the predicate ≃ occurs positively, both in the active formulas and the minor formulas of the relevant inferences for the set constructors and application. Consequently, a partial cut elimination theorem holds. Then we give an asymmetric interpretation of Γt in ECSTt, which yields the final result.
4.1 Step 1: sequent style formulation
We only give a sketch of the theory Γt. As usual, capital Greek letters Γ, Λ, . . . denote finite sequences of formulas of Γt. Sequents are of the form Γ ⇒ Λ. The system Γt is an extension of the intuitionistic Gentzen calculus [25]. The logical rules consist of the usual rules for intuitionistic logic, including cut and rules for =. In addition, there are the structural rules of weakening, exchange and contraction. In the following we first present the axioms and rules involving application; in particular, we include trivial independence conditions on constants for operations. Then we state the main rules for the set–theoretic constructors of Γt.

Definition 27. In order to simplify the statements, we extend the language by adding new terms as follows:
(*) if t, s are terms, so are K_t, S_t, pair_t, im_t, sep_t, el_t, exp_t, S_{ts}.⁶

Finally, note that in the following, separation, explicit replacement and transitive closure are split into distinct rules to ease the asymmetric interpretation of section 4.2. Indeed, each rule subsumes a clause for generating the application relation, e.g. the separation and TC–rules show how the separation and transitive closure operators are introduced.

6 Formally, the special terms can be eliminated by means of a set–theoretically defined ordered pairing operation ⟨−, −⟩ and 8 distinct sets c_1, . . . , c_8, e.g. to be identified with distinct elements of ω. For example, K_t can be identified with ⟨c_1, t⟩.
Gentzen–style presentation of non–logical axioms and rules. Γt includes (the closure under substitution of) the following sequents and rules:
1. Uniqueness: Γ, ts ≃ p, ts ≃ q ⇒ p = q
2. Let C be a constant among K, S, pair, im, sep, el, exp; then
   Γ ⇒ Ct ≃ C_t
   Γ ⇒ S_t s ≃ S_{ts}
3. Combinatory completeness:
   Γ ⇒ K_t s ≃ t
   From the premisses Γ ⇒ tr ≃ u, Γ ⇒ sr ≃ v and Γ ⇒ uv ≃ w infer: Γ ⇒ S_{ts} r ≃ w
4. Independence:
   • let C1, C2 be distinct constants in {K, S, pair, un, im, sep, el, exp, τ}; then Γ, C1 = C2 ⇒
   • let C1, C2 ∈ {K, S, pair, im, sep, el, exp}; then Γ, C1_t = C2_s ⇒ t = s ∧ C1 = C2
   • let C1, C2 ∈ {S}; then C1_{ts} = C2_{pq} ⇒ t = p ∧ s = q ∧ C1 = C2
5. Extensionality: Γ, ∀x (x ∈ p ↔ x ∈ q) ⇒ p = q
6. Empty–set: Γ ⇒ ∀x (x ∉ ∅)
7. Representing elementhood: Γ ⇒ ∃z[z ⊆ ⊤ ∧ el_a b ≃ z ∧ ∀u (u ∈ z ↔ u = ⊥ ∧ a ∈ b)]
8. Union: Γ ⇒ ∃z[un a ≃ z ∧ ∀u (u ∈ z ↔ ∃y ∈ a (u ∈ y))]
9. Pairing: Γ ⇒ ∃z[pair_a b ≃ z ∧ ∀u (u ∈ z ↔ u = a ∨ u = b)]
10. Strong infinity:
   Γ ⇒ ∅ ∈ ω
   Γ, t ∈ ω ⇒ Suc t ∈ ω
   Γ, ∅ ∈ t ∧ ∀y (y ∈ t → Suc y ∈ t) ⇒ ω ⊆ t
11. Separation:
   From the premiss Γ ⇒ (∀u ∈ a)(∃y ⊆ ⊤)(f u ≃ y) infer: Γ ⇒ ∃z[(∀u ∈ z)(f u ≃ ⊤ ∧ u ∈ a) ∧ (∀u ∈ a)(∀y (f u ≃ y → y = ⊤) → u ∈ z)]
   From the premisses
   • Γ ⇒ (∀u ∈ a)(∃y ⊆ ⊤)(f u ≃ y)
   • Γ ⇒ (∀u ∈ z)(f u ≃ ⊤ ∧ u ∈ a)
   • Γ ⇒ (∀u ∈ a)(∀y (f u ≃ y → y = ⊤) → u ∈ z)
   infer: Γ ⇒ sep_a f ≃ z
12. Explicit replacement:
   From the premiss Γ ⇒ (∀x ∈ a)∃y (f x ≃ y) infer: Γ ⇒ ∃z[(∀y ∈ z)(∃x ∈ a)(f x ≃ y) ∧ (∀x ∈ a)(∃y ∈ z)(f x ≃ y)]
   From the premisses
   • Γ ⇒ (∀u ∈ a)∃y (f u ≃ y)
   • Γ ⇒ (∀y ∈ z)(∃x ∈ a)(f x ≃ y)
   • Γ ⇒ (∀x ∈ a)(∃y ∈ z)(f x ≃ y)
   infer: Γ ⇒ im_a f ≃ z
13. Exponentiation: Γ ⇒ ∃z[exp_a b ≃ z ∧ ∀F (F ∈ z ↔ (Fun(F) ∧ Dom(F) = a ∧ Ran(F) ⊆ b))]
14. Transitive closure:
   TC–introduction: from the premisses Γ ⇒ Trans(b) and Γ ⇒ (∀c)(Trans(c) ∧ a ⊆ c → b ⊆ c) infer: Γ ⇒ τa ≃ b
   TC–existence: Γ ⇒ (∃z)(Trans(z) ∧ a ⊆ z ∧ (∀c)(Trans(c) ∧ a ⊆ c → z ⊆ c))
   Proviso: a, b ∉ FV(Γ).
We stress that the active formulas of the inferences and axioms are positive in ≃.

Theorem 28 (Quasi–normal form). A Γt–derivation D can be effectively transformed into a Γt–derivation D∗ of the same sequent, such that every cut formula occurring in D∗ is positive in ≃.
In order to state the required form of the partial cut elimination theorem, one introduces in the usual way the collections of formulas positive (respectively, negative) in the application predicate ≃. Γ ⇒ ∆ is ≃–positive if every formula occurring in Γ, ∆ is ≃–positive.

Lemma 29 (Quasi–normal form). Every Γt–derivation D of an arbitrary sequent Γ ⇒ ∆ can be effectively transformed into a Γt–derivation D∗ of the same sequent, such that every cut formula occurring in D∗ is positive in ≃.

Corollary 30. Every Γt–derivation D of a ≃–positive sequent Γ ⇒ ∆ can be effectively transformed into a Γt–derivation D∗ of the same sequent, which only contains ≃–positive sequents.
4.2 Step 2: the asymmetric interpretation
The starting point is that in the intended interpretation the application predicate App is inductively generated by an operator, which can be defined by a formula A(x, y, z, P) positive in P.⁷ Indeed, A(x, y, z, P) can be directly read off from the rules of Γt, which take care of introducing the combinators and the set–theoretic operators (im, sep, el, exp, pair, un and τ); and so we do not bore the reader with its explicit formalization. If we temporarily use ⊥ also as an abbreviation for K = S, we can define:
App_0(x, y, z) := ⊥
App_{k+1}(x, y, z) := A(x, y, z, App_k).
Here A(x, y, z, App_k) is obtained from A(x, y, z, P) by replacing P everywhere with App_k. Now the asymmetric interpretation amounts to replacing the application predicate by its finite stages App_n which, for each given n, can be explicitly defined and proved to exist in the pure set–theoretic language of ECSTt. Thus the finite approximations of the rules – τ rules included – can be justified in the application–free system ECSTt. The interpretation is asymmetric in the sense that it depends on a pair of number parameters m ≤ n: the positive occurrences of application are separated from the negative ones (the former being replaced by App_n and the latter by App_m).
⁷ This formula belongs to the language of ECSTt, except for the ternary predicate symbol P and for the terms of the form Ct, Sts, where C is a constant among K, S, im, sep, el, exp, pair; see Definition 27 and the related footnote.
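The separation of positive and negative occurrences described above (and made precise in Definition 31) can be sketched as a small recursive translation. The following Python fragment is only an illustration under stated assumptions: the tuple encoding of formulas and the names `interpret` and `app_stage` are hypothetical and not part of the paper.

```python
# Formulas are nested tuples (a hypothetical encoding chosen for this sketch):
#   ("atom", text)              an App-free atom such as t = s or t ∈ s
#   ("app", t, s, r)            the application atom App(t, s, r)
#   ("and", A, B), ("or", A, B), ("imp", A, B)
#   ("all", x, A), ("ex", x, A)

def interpret(formula, m, n):
    """Return formula[m, n]: App becomes the stage with the second index,
    and the two indices swap on the left-hand side of an implication."""
    tag = formula[0]
    if tag == "atom":
        return formula
    if tag == "app":
        return ("app_stage", n) + formula[1:]
    if tag in ("and", "or"):
        return (tag, interpret(formula[1], m, n), interpret(formula[2], m, n))
    if tag in ("all", "ex"):
        return (tag, formula[1], interpret(formula[2], m, n))
    if tag == "imp":
        # The antecedent is a negative position, so the indices are exchanged.
        return ("imp", interpret(formula[1], n, m), interpret(formula[2], m, n))
    raise ValueError(f"unknown connective: {tag!r}")

premise = ("app", "f", "x", "y")
conclusion = ("app", "g", "y", "z")
example = interpret(("imp", premise, conclusion), 1, 5)
```

In `example`, the application atom in the antecedent is tagged with stage 1 and the one in the succedent with stage 5, matching the intended reading with m = 1 and n = 5.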
Definition 31. (i) We inductively define A[m, n], where A is a formula of Γt, uniformly in n, m, by stipulating that A ↦ A[m, n] commutes with ∧, ∨, ∀, ∃, and in addition:
A[m, n] := A, provided A has the form t = s or t ∈ s;
App(t, s, r)[m, n] := App_n(t, s, r);
(A → B)[m, n] := (A[n, m] → B[m, n]).
(ii) If Γ := {A1, . . . , Ap}, then Γ[m, n] := {A1[m, n], . . . , Ap[m, n]};
(iii) (Γ ⇒ ∆)[m, n] := Γ[n, m] ⇒ ∆[m, n].
Lemma 32. (i) For each k ∈ ω, App_k is a formula of ECSTt.
(ii) If A is App–positive (negative), then A[m, n] := An (respectively A[m, n] := Am); if A is App–free, A[m, n] := A.
(iii) Persistence: let m ≤ p ≤ q ≤ n. Then, provably in ECSTt: A[p, q] → A[m, n]; A[n, m] → A[q, p].
Below it is convenient to adopt the more suggestive notation xy ≃_m z instead of App_m(x, y, z). Let (Γ ⇒ ∆)[m, n] be the asymmetric interpretation of the sequent Γ ⇒ ∆ in the language of Γt. Then by the above, one proves the fundamental lemma, which readily implies Theorem 26.
Lemma 33. Let D be a Γt–derivation of Γ ⇒ ∆. Then there exists a natural number c ≡ c_D such that, for every m > 0 and every n such that n ≥ c + m, Γ[n, m] ⇒ ∆[m, n] is derivable in ECSTt.
Proof. Assume we are given a Γt–derivation D of Γ ⇒ ∆. Then, by Lemma 29 of step 1, we can assume that every cut formula occurring in D is positive in ≃. By persistence (Lemma 32), it is enough to check, for some constant c depending on D,
(Γ ⇒ ∆)[m, c + m].    (24)
We only deal with the interpretation of the rules involving the transitive closure operator τ (for the interpretation of the other rules see [11]). Let us consider the TC–introduction rule. We argue informally. By IH, we have for some constant e
Γ[m, e + m] ⇒ Trans(b)
Γ[m, e + m] ⇒ (∀d)(Trans(d) ∧ a ⊆ d → b ⊆ d)
But the two conditions on the right hand side imply, for all n ≥ max(e, 1):
Γ[m, n] ⇒ τa ≃_n b
As to the rule of TC–existence, let TRANS be the axiom stating the mere existence of a transitive superset for every set:
∀a∃b(a ⊆ b ∧ Trans(b))
ECST with exponentiation EXP and TRANS proves ([3], 19.4):
∀a∃b(a ⊆ b ∧ Trans(b) ∧ ∀d(Trans(d) ∧ a ⊆ d → b ⊆ d)).    (25)
(25) immediately implies the asymmetric interpretation of TC–existence, choosing c_D = 0.
Remark. Theorem 26 still holds in the presence of full subset collection and strong collection.
Acknowledgements. We would like to thank the referee for a careful reading of the paper and for suggesting a number of improvements in the presentation of the material. We are grateful to Peter Schuster for valuable comments.
References
[1] P. Aczel, The type theoretic interpretation of constructive set theory, Logic Colloquium '77 (A. MacIntyre, L. Pacholski, and J. Paris, eds.), North–Holland, Amsterdam-New York, 1978, pp. 55–66.
[2] P. Aczel, Frege structures and the notion of proposition, truth and set, The Kleene Symposium (J. Barwise, H. J. Keisler, and K. Kunen, eds.), North–Holland, Amsterdam-New York, 1980, pp. 31–59.
[3] P. Aczel and M. Rathjen, Notes on constructive set theory, 2000.
[4] M. Beeson, Towards a computation system based on set theory, Theoretical Computer Science 60 (1988), 297–340.
[5] M. Boffa, Axiome et schéma de fondement dans le système de Zermelo, Bulletin de l'Académie Polonaise des Sciences 17 (1969), no. 2, 113–15.
[6] M. Boffa, Axiom and scheme of foundation, Bulletin de la Société Mathématique de Belgique 22 (1970), 242–47.
[7] A. Cantini, Levels of implication and type free theories of partial classifications with approximation operator, Zeitschrift für mathematische Logik und Grundlagen der Mathematik 38 (1992), 107–141.
[8] A. Cantini, Logical frameworks for truth and abstraction, North–Holland, Amsterdam, 1996.
[9] A. Cantini, Extending constructive operational set theory by impredicative principles, Math. Log. Q. 57 (2011), no. 3, 299–322.
[10] A. Cantini and L. Crosilla, Constructive set theory with operations, Logic Colloquium 2004 (A. Andretta, K. Kearnes, and D. Zambella, eds.), Lecture Notes in Logic, vol. 29, Cambridge University Press, Cambridge, 2008.
[11] A. Cantini and L. Crosilla, Elementary constructive operational set theory, Ways of Proof Theory (R. Schindler, ed.), Ontos Series in Mathematical Logic, Frankfurt, 2010, pp. 199–240.
[12] A. Enayat, J. H. Schmerl, and A. Visser, Omega–models of finite set theory, Set theory, Arithmetic, and Foundations of Mathematics: Theorems, Philosophies, Cambridge University Press, 2011.
[13] O. Esser and R. Hinnion, Antifoundation and transitive closure in the system of Zermelo, Notre Dame Journal of Formal Logic 40 (1999), no. 2, 197–205.
[14] S. Feferman, A language and axioms for explicit mathematics, Algebra and Logic (J. Crossley, ed.), Lecture Notes in Mathematics, vol. 450, Springer, Berlin, 1975, pp. 87–139.
[15] S. Feferman, Operational set theory and small large cardinals, Information and Computation 207 (2009), 971–979.
[16] G. Jäger, On Feferman's operational set theory OST, Annals of Pure and Applied Logic 150 (2007), 19–39.
[17] G. Jäger, Full operational set theory with unbounded existential quantification and powerset, Annals of Pure and Applied Logic 160 (2009), 33–52.
[18] G. Jäger, Operations, sets and classes, Logic, Methodology and Philosophy of Science - Proceedings of the Thirteenth International Congress, College Publications, 2009.
[19] G. Jäger and T. Strahm, Totality in applicative theories, Annals of Pure and Applied Logic 74 (1995), 105–120.
[20] R. Kaye and T. L. Wong, On interpretations of arithmetic and set theory, Notre Dame J. Formal Logic 48 (2007), no. 4, 497–510.
[21] A. Mancini and D. Zambella, A note on recursive models of set theories, Notre Dame Journal of Formal Logic 42 (2001), no. 2, 109–115.
[22] A. R. D. Mathias, Weak systems of Gandy, Jensen and Devlin, Set Theory: Centre de Recerca Matemàtica, Barcelona 2003-4 (Joan Bagaria and Stevo Todorcevic, eds.), Trends in Mathematics, Birkhäuser Verlag, Basel, 2006, pp. 149–224.
[23] J. Myhill, Constructive set theory, Journal of Symbolic Logic 40 (1975), 347–382.
[24] M. Rathjen, The natural numbers in constructive set theory, Mathematical Logic Quarterly 54 (2008), 83–97.
[25] A. S. Troelstra and H. Schwichtenberg, Basic proof theory, 2nd ed., Cambridge University Press, Cambridge, 2000.
Formal Baire Space in Constructive Set Theory Giovanni Curi and Michael Rathjen∗
Introduction
Constructive topology is generally based on the notion of locale, or formal space (see [10, 9, 8], and [22, pg. 378], for an explanation of why this is the case). Algebraically, locales are particular kinds of lattices that, like other familiar algebraic structures, can be presented using the method of generators and relations, cf. e.g. [23]. Equivalently, they may be described using covering systems [9, 13]. Set-theoretically, 'generators and relations' and covering systems can be regarded as inductive definitions. Classical or intuitionistic fully impredicative systems, such as intuitionistic Zermelo-Fraenkel set theory, IZF [2], or the intuitionistic theory of a topos [11], are sufficiently strong to ensure that such inductive definitions do give rise to a locale or formal space. This continues to hold in (generalized) predicative systems such as the constructive set theory CZF augmented by the weak regular extension axiom wREA (where the covering systems give rise to so-called inductively generated formal spaces, [1]). However, although much weaker than classical set theory ZF, the system CZF + wREA is considerably stronger than CZF. As it turns out, CZF + wREA is a subsystem of classical set theory ZF plus the axiom of choice AC, but not of ZF alone (cf. [17]). Naturally, this raises the question of what can be proved in the absence of wREA.
In this note we show that working in CZF alone, a covering system may fail to define a formal space already in a familiar case. It is easy to see that CZF can prove that, e.g., the covering systems used to present formal Cantor space C, and the formal real line R, do define formal spaces; this is essentially because the associated inductive definition is a finitary one for C, and can be replaced by a finitary one, plus an application of restricted Separation, for R (as apparently first noted by T. Coquand, see [7, Section 6] for more details). There has been for some time the expectation that the same does not hold for formal Baire space B. The main result in this note, Theorem 3.4, will confirm this expectation. This result, in conjunction with [3, Proposition 3.10] (or [4, B.4]), also answers in the negative the question, asked in [3], whether CZF proves that the Brouwer (or constructive) ordinals form a set (although this could in fact also be inferred from previous results, see Section 3). Independently, this was also shown in [4, Corollary B.5].¹
A corollary of Theorem 3.4 is moreover that the full subcategory FSpi of the category FSp of formal spaces defined by the inductively generated formal spaces [1, 5] fails to have infinitary products in CZF, according at least to the received construction. The notion of inductively generated formal space was introduced in [5] to make it possible to predicatively perform basic constructions on formal spaces, such as that of the product formal space; these constructions indeed do not appear to be possible for general formal spaces without recourse to some strong impredicative principle [5]. In CZF augmented by the weak regular extension axiom wREA, the category FSpi of inductively generated formal spaces can instead be proved to have infinitary products (and more generally, all limits). Exploiting the isomorphism of the product ∏_{n∈N} N with B, one shows that this need no longer be the case in CZF alone.
Although CZF does not prove that B is a formal space, it does prove that B is an imaginary locale [7]. More generally, every covering system defines an imaginary locale in CZF. Imaginary locales give rise to a category ImLoc that extends the category FSp of formal spaces, and that has all limits (in particular all products) already assuming a fragment of CZF. Like the categories FSp and FSpi, ImLoc is equivalent to the ordinary category of locales in a fully impredicative system such as classical set theory.
∗ This material includes work supported by the EPSRC of the UK through Grant No. EP/G029520/1.
1 Constructive set theory and inductive definitions
The language of Constructive Zermelo-Fraenkel Set Theory, CZF, is the same as that of Zermelo-Fraenkel Set Theory, ZF, with ∈ as the only non-logical symbol. CZF is based on intuitionistic predicate logic with equality, and has the following axioms and axiom schemes:
1. Extensionality: ∀a∀b(∀y(y ∈ a ↔ y ∈ b) → a = b).
2. Pair: ∀a∀b∃x∀y(y ∈ x ↔ y = a ∨ y = b).
3. Union: ∀a∃x∀y(y ∈ x ↔ (∃z ∈ a)(y ∈ z)).
4. Restricted Separation scheme: ∀a∃x∀y(y ∈ x ↔ y ∈ a ∧ φ(y)), for φ a restricted formula. A formula φ is restricted if the quantifiers that occur in it are of the form ∀x ∈ b, ∃x ∈ c.
5. Subset Collection scheme: ∀a∀b∃c∀u((∀x ∈ a)(∃y ∈ b)φ(x, y, u) → (∃d ∈ c)((∀x ∈ a)(∃y ∈ d)φ(x, y, u) ∧ (∀y ∈ d)(∃x ∈ a)φ(x, y, u))).
6. Strong Collection scheme: ∀a((∀x ∈ a)∃yφ(x, y) → ∃b((∀x ∈ a)(∃y ∈ b)φ(x, y) ∧ (∀y ∈ b)(∃x ∈ a)φ(x, y))).
7. Infinity: ∃a(∃x(x ∈ a) ∧ (∀x ∈ a)(∃y ∈ a)x ∈ y).
8. Set Induction scheme: ∀a((∀x ∈ a)φ(x) → φ(a)) → ∀aφ(a).
¹ At the end of this paper there is a post scriptum explaining the relationship between some of the findings in [3], [4], and the present paper.
See [2] for further information on CZF and related systems. We shall denote by CZF− the system obtained from CZF by leaving out the Subset Collection scheme. Note that from Subset Collection one proves that the class ᵃb of functions from a set a to a set b is a set, i.e., Myhill's Exponentiation Axiom. Intuitionistic Zermelo-Fraenkel set theory based on collection, IZF, has the same theorems as CZF extended by the unrestricted Separation Scheme and the Powerset Axiom. Moreover, the theory obtained from CZF by adding the Law of Excluded Middle has the same theorems as ZF.
As in classical set theory, we make use of class notation and terminology [2]. The set N of natural numbers is the unique set x such that ∀u[u ∈ x ↔ (u = ∅ ∨ (∃v ∈ x)(u = v ∪ {v}))].
A major role in constructive set theory is played by inductive definitions. An inductive definition is any class Φ of pairs. A class A is Φ-closed if (a, X) ∈ Φ and X ⊆ A imply a ∈ A. The following theorem is called the class inductive definition theorem [2].
Theorem 1.1 (CZF−). Given any class Φ, there exists a least Φ-closed class I(Φ), the class inductively defined by Φ.
Given any inductive definition Φ and any class U, there exists a smallest class containing U which is closed under Φ. This class will be denoted by I(Φ, U). Note that I(Φ, U) is the class inductively defined by Φ′ = Φ ∪ (U × {∅}), i.e., I(Φ, U) = I(Φ′). Given a set S, we say that Φ is an inductive definition on S if Φ ⊆ S × Pow(S), with Pow(S) the class of subsets of S. An inductive definition Φ is finitary if, whenever (a, X) ∈ Φ, there exists a surjective function f : n → X for some n ∈ N. Φ is infinitary if it is not finitary.
As is shown in Section 3, even when Φ is a set, I(Φ) need not be a set in CZF. For this reason, CZF is often extended with the Regular Extension Axiom, REA.
REA: every set is a subset of a regular set.
A set c is regular if it is transitive, inhabited, and for any u ∈ c and any set R ⊆ u × c, if (∀x ∈ u)(∃y)⟨x, y⟩ ∈ R, then there is a set v ∈ c such that
(∀x ∈ u)(∃y ∈ v)((x, y) ∈ R) ∧ (∀y ∈ v)(∃x ∈ u)((x, y) ∈ R).    (26)
c is said to be weakly regular if in the above definition of regularity the second conjunct in (26) is omitted. The weak regular extension axiom, wREA, is the statement that every set is a subset of a weakly regular set. In CZF + wREA, the following theorem can be proved.
Theorem 1.2 (CZF + wREA). If Φ is a set, then I(Φ) is a set.
The foregoing result holds in more generality for inductive definitions that are bounded (see [2]). The theory CZF has the same strength as classical Kripke-Platek set theory or the theory of non-iterated inductive definitions ID1, by [15, Theorem 4.14]. It is therefore much weaker than Π¹₁-comprehension. The strength of CZF + REA and CZF + wREA is the same as that of the subsystem of second order arithmetic with Δ¹₂-comprehension and Bar induction (see [15, Theorem 5.12] and [16, Theorem 4.7]). Thus it is much stronger than CZF, but still very weak compared to ZF. ZF + AC proves REA, whereas wREA (and a fortiori REA) is not provable in ZF alone by [17, Corollary 7.1].
Sometimes one considers extensions of CZF by constructively acceptable choice principles, such as the principle of countable choice:
ACω: for every class A, if R ⊆ N × A satisfies (∀n ∈ N)(∃a ∈ A)R(n, a), then there exists f : N → A such that f ⊆ R.
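For a finite, finitary inductive definition the least closed class I(Φ, U) of this section can be computed by simple iteration. The sketch below is an illustration only: the function name `inductively_defined` and the example rule are hypothetical, and the code says nothing about the infinitary or class-sized cases that the theorems above are concerned with.

```python
def inductively_defined(phi, u=frozenset()):
    """Least set containing u and closed under phi, where phi is a FINITE
    collection of pairs (a, X) with X a frozenset of premises."""
    closed = set(u)
    changed = True
    while changed:
        changed = False
        for a, premises in phi:
            # (a, X) in phi and X a subset of the current set forces a in.
            if a not in closed and premises <= closed:
                closed.add(a)
                changed = True
    return closed

# Hypothetical example on {0, ..., 9}: the single rule "from n infer n + 2".
step = [(n + 2, frozenset({n})) for n in range(8)]
evens = inductively_defined(step, {0})
odds = inductively_defined(step, {1})

# I(Phi, U) coincides with I(Phi'), where Phi' adds (u, {}) for each u in U.
phi_prime = step + [(0, frozenset())]
```

Here `evens` and `odds` are the closures of {0} and {1} under the same rule, and `phi_prime` encodes the passage from I(Φ, U) to I(Φ′) noted after Theorem 1.1.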
2 Constructive locale theory
Unless stated otherwise we will be working in CZF−. The notion of locale [11, 8, 9] provides the concept of topological space adopted in intuitionistic fully impredicative systems such as topos logic (Higher-order Heyting arithmetic). In the absence of the Powerset Axiom, however, as for instance in constructive generalized predicative systems, this notion splits into inequivalent concepts.
A preordered set is a pair (S, ≤) with S and ≤ sets, and ≤ a reflexive and transitive relation. For U a subset, or a subclass, of S, ↓U abbreviates {a ∈ S : (∃b ∈ U) a ≤ b}. We also use U ↓ V for ↓U ∩ ↓V. A generalized covering system on a preordered set (S, ≤) is an inductive definition Φ on S such that, for all (a, X) in Φ,
1. X ⊆ ↓{a},
2. if b ≤ a then there is (b, Y) ∈ Φ with Y ⊆ ↓X.
An imaginary locale is a structure of the form X ≡ (S, ≤, Φ), with Φ a generalized covering system on the preordered set (S, ≤). The set S is called the base of X. Given an imaginary locale X ≡ (S, ≤, Φ), we let Φ≤ denote the class of pairs Φ ∪ {(b, {a}) | b ≤ a}. As Φ≤ is an inductive definition, given any subclass U of S, by Theorem 1.1, there exists (in CZF−) A(U) ≡ I(Φ≤, U), i.e. the smallest class containing U closed under Φ≤.
Theorem 2.1 (CZF−). For every a, b ∈ S, and for all subclasses U, V of S, the following hold:
0. ↓{a} ⊆ A({a}),
1. U ⊆ A(U),
2. U ⊆ A(V) implies A(U) ⊆ A(V),
3. A(U) ∩ A(V) ⊆ A(U ↓ V).
See [7] for the proof of this result and further information on imaginary locales. Equipped with a suitable notion of continuous function, imaginary locales form the (superlarge) category ImLoc. Two full subcategories of this category had been considered earlier as possible counterparts of the category of locales in constructive (generalized) predicative settings. Let FSp be the full subcategory of ImLoc given by those imaginary locales X ≡ (S, ≤, Φ) which satisfy
(A-smallness) for every U ∈ Pow(S), A(U) is a set,
and let FSpi be the full subcategory of ImLoc given by those X ≡ (S, ≤, Φ) which satisfy the smallness condition above, and are such that
(Φ-smallness) Φ is a set,
i.e., such that Φ is an ordinary covering system. FSp and FSpi are respectively (equivalent to) the category of formal spaces and the category of inductively generated formal spaces [1, 5].
Formal spaces are generally presented in terms of a covering relation on a preordered set. Given a (class-)relation ◁ ⊆ S × Pow(S), let the saturation of a subset U of S be defined as the class A(U) = {a ∈ S : a ◁ U}, where we write a ◁ U for ◁(a, U). Then, by definition, ◁ is a covering relation if, with the class A(U) thus re-defined, the A-smallness condition is satisfied, and the conditions in Theorem 2.1 are satisfied for every U, V ∈ Pow(S). One passes from one definition of formal space to the other by associating to an imaginary locale (S, ≤, Φ) satisfying A-smallness the structure (S, ≤, ◁), where a ◁ U ⇐⇒ a ∈ I(Φ≤, U); in the other direction, given a covering relation ◁ on (S, ≤), one obtains a generalized covering system (S, ≤, Φ) satisfying A-smallness by letting Φ ≡ {(a, U) | a ◁ U & U ⊆ ↓{a}}. Note that the same correspondence exists more generally between imaginary locales and covering relations that are not required to satisfy the A-smallness condition.
Assuming the full Separation scheme Sep, the categories ImLoc and FSp coincide, since the A-smallness condition is always satisfied. On the other hand, even in CZF + Sep, ImLoc is not the same as FSpi, as there are formal spaces of various types that cannot be inductively generated in this system [6].
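The saturation operator A(U) = I(Φ≤, U) and the four conditions of Theorem 2.1 can be checked by brute force on a small finite example. The following sketch assumes a hypothetical three-element base (a top element with two elements `a`, `b` below it); all names are illustrative, and a finite check of this kind is of course no substitute for the proof in [7].

```python
from itertools import combinations

S = ["top", "a", "b"]
# (x, y) in LEQ means x ≤ y; the preorder is reflexive with a, b below top.
LEQ = {(x, x) for x in S} | {("a", "top"), ("b", "top")}
# A covering system: top is covered by {a, b}; a and b cover themselves.
PHI = [("top", frozenset({"a", "b"})),
       ("a", frozenset({"a"})),
       ("b", frozenset({"b"}))]

def down(u):
    """Downward closure of u in (S, ≤)."""
    return {x for x in S if any((x, y) in LEQ for y in u)}

def saturate(u):
    """A(U): least set containing U and closed under PHI plus the pairs (b, {a}) for b ≤ a."""
    phi_leq = PHI + [(b, frozenset({a})) for (b, a) in LEQ]
    closed = set(u)
    changed = True
    while changed:
        changed = False
        for x, premises in phi_leq:
            if x not in closed and premises <= closed:
                closed.add(x)
                changed = True
    return closed

def subsets(xs):
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def theorem_2_1_holds():
    """Check conditions 0-3 of Theorem 2.1 for all subsets of the finite base."""
    if any(not down({a}) <= saturate({a}) for a in S):
        return False
    for u in subsets(S):
        if not u <= saturate(u):
            return False
        for v in subsets(S):
            if u <= saturate(v) and not saturate(u) <= saturate(v):
                return False
            if not (saturate(u) & saturate(v)) <= saturate(down(u) & down(v)):
                return False
    return True
```

In this toy example `saturate({"a", "b"})` contains `top`, whereas `saturate({"a"})` does not, which reflects that `top` is covered only by both elements together.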
In CZF, every formal space X has an associated set-generated class-frame Sat(X), see [1]; the carrier of Sat(X) is given by the class {A(U) | U ∈ Pow(S)} of saturated subsets of X (meets and joins are given by U ∧ V ≡ U ↓ V, and ⋁_{i∈I} U_i ≡ A(⋃_{i∈I} U_i)). Note that if X is instead an imaginary locale, Sat(X) cannot be constructed in a generalized predicative setting, as A(U) may fail to be a set for some U. With the full Separation scheme and the Powerset axiom available, as e.g. in IZF, Sat(X) is an ordinary frame (locale); in such a system, the categories ImLoc, FSp, FSpi (coincide, and) are all equivalent to the category of locales.
The concept of formal space was the first to be introduced, in [21]. The reason for considering the stronger notion of inductively generated formal space is that it does not appear to be possible to carry out, in a generalized predicative setting, standard basic constructions for general formal spaces, such as that of the product space [5]. The category FSpi has been shown to have all products and equalizers (hence all limits) [23, 14]. However, this only holds in CZF + REA, and we shall see in the next section that the given construction of products may in fact fail to yield, in CZF, an inductively generated formal space from inductively generated formal spaces. Moreover, as already recalled, the restriction to FSpi rules out several types of formal spaces of interest [6]. By contrast, the category ImLoc is complete (has all limits) already over CZF− [7], and, as seen, is an extension of the category of formal spaces which in a fully impredicative system such as IZF is still equivalent to the category of locales.
We conclude this section by noting that, in the absence of REA, an imaginary locale satisfying Φ-smallness need not satisfy A-smallness (with REA it does, recall Theorem 1.2). The next section presents an example of such a phenomenon.
Imaginary locales of this kind determine a full subcategory of ImLoc, called the category of geometric locales; this category is itself complete in CZF− [7].
3 Formal Baire space
Recall that Baire space is the set N^N, endowed with the product topology. Its point-free version, formal Baire space, is defined as follows. Let N∗ be the set of finite sequences of natural numbers; formal Baire space B is the generalized covering system B ≡ (N∗, ≤, ΦB), where, for s, t ∈ N∗, s ≤ t if and only if t is an initial segment of s, and ΦB = {(s, {s ∗ ⟨n⟩ | n ∈ N}) | s ∈ N∗}.
B is then an imaginary locale in CZF−. Note that ΦB is an infinitary inductive definition. The class ΦB is a set in CZF−, so that it is a set in CZF + REA. By Theorem 1.2, then, AB(U) ≡ I(ΦB≤, U) is a set for every U ∈ Pow(N∗). Thus:
Proposition 3.1. CZF + REA proves that B is a formal space.
Recall that a point of a formal space (or, more generally, of an imaginary locale) X ≡ (S, ≤, ΦX) is an inhabited subset α of S such that (i) for every a, b ∈ α there is c ∈ {a} ↓ {b} with c ∈ α, (ii) for every a ∈ S, U ∈ Pow(S), if a ∈ α and a ∈ I(ΦX≤, U) then there is b ∈ U such that b ∈ α. Points of B may be identified with infinite sequences of non-negative integers. For a ∈ S, let ext(a) denote the class of points 'in a', i.e. the class of points to which a belongs. Then X ≡ (S, ≤, ΦX) is spatial whenever, for every a ∈ S and U ∈ Pow(S), one has a ∈ I(ΦX≤, U) if and only if ext(a) ⊆ ⋃_{b∈U} ext(b).
Note that the principle of Monotone Bar Induction BI_M is exactly the statement that B is spatial [8]. Moreover, spatiality of B implies the spatiality of formal Cantor space and of the formal Real Unit Interval; spatiality of these formal spaces is in turn respectively equivalent to the Fan Theorem, and to compactness of the real unit interval (see [8], or [3]). It is well-known that the compactness of the Real Unit Interval (and hence Monotone Bar Induction) is inconsistent with Church's Thesis, so that the spatiality of the above-mentioned formal spaces is independent of the systems we are considering.
Contrary to what happens with the Fan Theorem, FT, adding decidable Bar Induction BI_D (which is a consequence of monotone Bar Induction, BI_M) to CZF has a marked effect.
Theorem 3.2. ([20, Corollary 4.8], [19, Theorem 9.10(i)]) (i) CZF + BI_D proves the 1-consistency of CZF. (ii) CZF and CZF + FT have the same proof-theoretic strength.
On the other hand, BI_M has no effect on the proof-theoretic strength in the presence of REA.
Theorem 3.3. ([19, Theorem 9.10(ii)]) CZF + REA and CZF + REA + DC + BI_M have the same proof-theoretic strength.
Formal Cantor Space (defined as formal Baire space, but with N∗ replaced everywhere by {0, 1}∗, and with N replaced by {0, 1} in the covering system), and the formal Real Unit Interval involve finitary inductive definitions, and can be proved
formal spaces already over CZF− (the covering system for the formal Real Unit Interval is in fact given by an infinitary inductive definition, but this can be seen to have the same effect as a finitary one plus an application of the restricted Separation scheme, cf. [7, Section 6]). It has been an open question for some time whether CZF alone proves that B is a formal space.
Theorem 3.4. CZF + ACω does not prove that, for every U ∈ Pow(N∗), I(ΦB≤, U) is a set. The unprovability result obtains even if one adds the Dependent Choices Axiom, DC (cf. [2]), and the Presentation Axiom, PA (cf. [2]), to CZF.
Proof. We plan to show, using the axioms of CZF, that from the assertion
∀U ⊆ N∗ (I(ΦB≤, U) is a set)    (27)
it follows that the well-founded part, WF(≺), of every decidable ordering ≺ on N is a set. Here decidability means that ∀n, m ∈ N (n ≺ m ∨ ¬ n ≺ m), and by an ordering we mean any transitive and irreflexive binary relation (which is also a set). Recall that WF(≺) is the smallest class X such that for all n ∈ N,
∀m ∈ N (m ≺ n → m ∈ X) implies n ∈ X.    (28)
If one then takes ≺ to be the ordering which represents the so-called Bachmann-Howard ordinal, it follows from [18, 4.13, 4.14] that (27) implies the 1-consistency of CZF (actually the uniform reflection principle for CZF and more), and therefore, in light of [15, Theorem 4.14], also the 1-consistency of CZF + ACω + DC + PA. As a result, (27) is not provable in CZF + ACω + DC + PA.
It remains to show that, assuming (27), WF(≺) is a set. Define U to be the subset of N∗ consisting of all sequences ⟨n1, . . . , nr⟩ with r > 1 such that ¬ nj ≺ ni for some 1 ≤ i < j ≤ r. Observe that ∀s ∈ N∗ (s ∈ U ∨ s ∉ U) owing to the decidability of ≺. Let s ∈ N∗. We say that n is in s if s is of the form s = ⟨n1, . . . , nr⟩ with r ≥ 1 and n = ni for some 1 ≤ i ≤ r. Another way of saying that n is in s is that s = s1 ∗ ⟨n⟩ ∗ s2 for some s1, s2 ∈ N∗. Let V ⊆ N be the class defined as follows:
n ∈ V iff ∀s ∈ N∗ (n in s → s ∈ I(ΦB≤, U)).    (29)
Claim 1: WF(≺) ⊆ V. To show this, assume n ∈ N and m ∈ V for all m ≺ n. Suppose n is in s. For an arbitrary k ∈ N we then have s ∗ ⟨k⟩ ∈ I(ΦB≤, U), for ¬k ≺ n implies s ∗ ⟨k⟩ ∈ U and hence s ∗ ⟨k⟩ ∈ I(ΦB≤, U), whereas k ≺ n entails s ∗ ⟨k⟩ ∈ I(ΦB≤, U) by assumption. Thus s ∈ I(ΦB≤, U) owing to its inductive definition. Whence n ∈ V.
⟨⟩ denotes the empty sequence of N∗. Let Y = {s ∈ N∗ | s ∈ U ∨ ∃m ∈ WF(≺) (m in s) ∨ ∀m m ∈ WF(≺)}.
Claim 2: I(ΦB≤, U) ⊆ Y. By definition of Y, we have U ⊆ Y, and whenever s ∈ Y and t ∈ N∗ then s ∗ t ∈ Y. To confirm the claim it thus suffices to show that s ∗ ⟨n⟩ ∈ Y for all n ∈ N implies s ∈ Y. So assume s ∗ ⟨n⟩ ∈ Y for all n ∈ N. s ∈ U implies s ∈ Y. If s = ⟨⟩, then ⟨n⟩ ∈ Y for all n. Thus s = ⟨⟩ implies that for all n, n ∈ WF(≺) or ∀m m ∈ WF(≺), and therefore n ∈ WF(≺), yielding s ∈ Y. Henceforth we may assume that s ∉ U and s ≠ ⟨⟩. In particular s = t ∗ ⟨k⟩ for some t ∈ N∗, k ∈ N, and the components of s are arranged in ≺-descending order. Let n ≺ k. We then have s ∗ ⟨n⟩ = t ∗ ⟨k⟩ ∗ ⟨n⟩ ∉ U and hence there is an l ∈ WF(≺) such that l is in s ∗ ⟨n⟩ or else ∀m m ∈ WF(≺). Thus n ≺ l ∨ n = l for some l ∈ WF(≺) or ∀m m ∈ WF(≺), yielding n ∈ WF(≺). In consequence, ∀n ≺ k n ∈ WF(≺), thus k ∈ WF(≺), and hence s ∈ Y. This finishes the proof of Claim 2.
As n ∈ V implies ⟨n⟩ ∈ I(ΦB≤, U), we deduce with the help of Claim 2 that ⟨n⟩ ∈ Y. Hence n ∈ WF(≺) or ∀m m ∈ WF(≺), whence n ∈ WF(≺). Thus V ⊆ WF(≺), and in view of Claim 1, we have V = WF(≺). If I(ΦB≤, U) were a set, V would be a set, too. Hence (27) entails that WF(≺) is a set.
Corollary 3.5. CZF + ACω + DC + PA does not prove that the imaginary locale B is a formal space.
This in particular answers the question in footnote 2 of [3]. Note that B is in fact a geometric locale, since ΦB is a set in CZF−.
Remark 3.6. Direct calculations show that the construction of products for inductively generated formal spaces [23] gives B ≅ ∏_{n∈N} N, where N is the discrete formal space of the natural numbers, which is trivially inductively generated in CZF. It is easy to see that the assumption that the imaginary locale ∏_{n∈N} N is a formal space implies that I(ΦB≤, U) is a set, for U as in the proof of Theorem 3.4, so that ∏_{n∈N} N also cannot be proved to be a formal space in CZF + ACω + DC + PA. This fact shows that the category of inductively generated formal spaces, which is complete in CZF + REA, is not closed under infinitary products (at least according to the received construction) in (CZF and) CZF + ACω + DC + PA. Note that ∏_{n∈N} N ≅ B is the product in the category of imaginary locales [7].
The above theorem also answers the question in footnote 3 of [3], whether CZF proves that the Brouwer (or constructive) Ordinals form a set. Call a relation R ⊆ S × Pow(S) set-presented if a mapping D : S → Pow(Pow(S)) is given satisfying R(a, U) ⇐⇒ (∃V ∈ D(a)) V ⊆ U, for every a ∈ S, U ∈ Pow(S).
Lemma 3.7 (CZF−). Let X ≡ (S, ≤, Φ) be an imaginary locale. If R(a, U) ≡ a ∈ I(Φ≤, U) is set-presented, then X is a formal space.
Proof. The class I(Φ≤, U) is a set for every U ∈ Pow(S), since I(Φ≤, U) = {a ∈ S | (∃V ∈ D(a)) V ⊆ U}, and the latter class is a set by Replacement and Restricted Separation.
The following result, without mentioning PA, was also independently established in [4, Corollary B.5].
Corollary 3.8. CZF + ACω + DC + PA does not prove that the Brouwer Ordinals form a set.
Proof. The proof of [3, Proposition 3.10] or [4, Proposition B.4] shows that, in CZF + ACω plus the assertion that the Brouwer Ordinals form a set, the relation RB(s, U) ≡ s ∈ I(ΦB≤, U) is set-presented. But then I(ΦB≤, U) is a set for every U ∈ Pow(N∗), by Lemma 3.7. So if CZF proves that the Brouwer Ordinals form a set, CZF + ACω proves that B is a formal space.
Implicitly, it has been known for a long time that CZF + ACω + DC + PA does not prove that the Brouwer ordinals form a set, owing to [15, Theorem 4.14] and an ancient result due to Kreisel. In [12] Kreisel showed that the intuitionistic theory ID^i(O) of the Brouwer ordinals O is of the same proof-theoretic strength as the classical theory of positive arithmetical inductive definitions ID1. ID^i(O) is an extension of Heyting arithmetic via a predicate for the Brouwer ordinals and axioms pertaining to O's inductive nature. Thus from [15, Theorem 4.14] it follows that ID^i(O) and CZF + ACω + DC + PA have the same proof-theoretic strength. But if the latter theory could prove that the Brouwer ordinals form a set, it could easily prove that ID^i(O) has a set model, and in particular the consistency of ID^i(O).
Post scriptum. The research reported in this paper was carried out in June 2011. The question of whether CZF proves that the Brouwer ordinals form a set, raised in [3], was also answered (in the negative) by the authors of [3], Benno van den Berg and Ieke Moerdijk. They posted a new version of their paper on arxiv.org in November 2011, which in the meantime has been published as [4]. In [4, Corollary B.5] they show that the Brouwer ordinals cannot be proved to be a set on the basis of CZF + DC. Their proof and ours both hinge on proof-theoretic results from [20]. Our non-provability result, Theorem 3.4, however, is stronger than [4, Corollary A.3]. The latter shows that CZF + DC cannot prove that Baire space has a presentation, whereas the former shows that this theory (or even CZF + PA) does not prove that B is a formal space.
Giovanni Curi and Michael Rathjen
References

[1] P. Aczel, “Aspects of general topology in constructive set theory”, Ann. Pure Appl. Logic 137, 1-3 (2006), pp. 3–29.
[2] P. Aczel, M. Rathjen, “Notes on Constructive Set Theory”, Mittag-Leffler Technical Report No. 40, 2000/2001.
[3] B. van den Berg, I. Moerdijk, “Derived rules for predicative set theory: an application of sheaves”. Version of September 18, 2010 [http://arxiv.org/abs/1009.3553v1].
[4] B. van den Berg, I. Moerdijk, “Derived rules for predicative set theory: an application of sheaves”, Ann. Pure Appl. Logic (2012), doi:10.1016/j.apal.2012.01.010.
[5] T. Coquand, G. Sambin, J. Smith, S. Valentini, “Inductively generated formal topologies”, Ann. Pure Appl. Logic 124, 1-3 (2003), pp. 71–106.
[6] G. Curi, “On some peculiar aspects of the constructive theory of point-free spaces”, Math. Log. Q. 56, 4 (2010), pp. 375–387.
[7] G. Curi, “Topological inductive definitions”, Ann. Pure Appl. Logic, doi:10.1016/j.apal.2011.12.005.
[8] M. Fourman and R. Grayson, “Formal Spaces”, in: A.S. Troelstra and D. van Dalen (eds.), The L.E.J. Brouwer Centenary Symposium, pp. 107–122. North Holland (1982).
[9] P. T. Johnstone, “Stone Spaces”, Cambridge University Press, 1982.
[10] P. T. Johnstone, “The points of pointless topology”, Bull. Am. Math. Soc. 8, 1 (1983), pp. 41–53.
[11] P. T. Johnstone, “Sketches of an elephant. A topos theory compendium. II”, Oxford Logic Guides 44; Oxford Science Publications. Oxford: Clarendon Press, 2002.
[12] G. Kreisel, “A survey of proof theory”, J. Symbolic Logic 33 (1968), pp. 321–388.
[13] S. MacLane, I. Moerdijk, “Sheaves in Geometry and Logic - A First Introduction to Topos Theory”, Springer, 1992.
[14] E. Palmgren, “Predicativity problems in point-free topology”, in: V. Stoltenberg-Hansen et al. (eds.), Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Helsinki, Finland, August 14–20, 2003, Lecture Notes in Logic 24, ASL (AK Peters Ltd, 2006), pp. 221–231.
[15] M. Rathjen, “The strength of some Martin-Löf type theories”, Department of Mathematics, Ohio State University, Preprint series (1993), 39 pages.
[16] M. Rathjen, “The anti-foundation axiom in constructive set theories”, in: G. Mints, R. Muskens (eds.), Games, Logic, and Constructive Sets (CSLI Publications, Stanford, 2003), pp. 87–108.
[17] M. Rathjen, R.S. Lubarsky, “On the regular extension axiom and its variants”, Math. Log. Q. 49, 5 (2003), pp. 511–518.
[18] M. Rathjen, “Replacement versus collection in constructive Zermelo-Fraenkel set theory”, Ann. Pure Appl. Logic 136 (2005), pp. 156–174.
[19] M. Rathjen, “Constructive set theory and Brouwerian principles”, J. Universal Computer Science 11 (2005), pp. 2008–2033.
[20] M. Rathjen, “A note on bar induction in constructive set theory”, Math. Log. Q. 52, 3 (2006), pp. 253–258.
[21] G. Sambin, “Intuitionistic formal spaces - a first communication”, in: D. Skordev (ed.), Mathematical Logic and its Applications, Plenum (1987), pp. 187–204.
[22] A. Troelstra and D. van Dalen, “Constructivism in mathematics, an introduction, Vol. I, II”, Studies in Logic and the Foundations of Mathematics 121, 123 (Amsterdam etc., North-Holland, 1988).
[23] S. Vickers, “Some constructive roads to Tychonoff”, in: L. Crosilla and P. Schuster (eds.), From Sets and Types to Topology and Analysis: Practicable Foundations for Constructive Mathematics, pp. 223–238, Oxford Logic Guides 48, OUP (2005).
Functional Interpretations of Classical and Constructive Set Theory

Justus Diller

Whereas the Diller-Nahm interpretation of Peano arithmetic in finite types $PA^\omega$ is transferred directly to a ∧-interpretation of Kripke-Platek set theory $KP\omega$ as well as of its finite type extension $KP\omega^\omega$ in the classical version of the theory of constructive set functionals $T_\in$, the Dialectica interpretation already of $KP\omega$ requires an additional non-constructive choice function. By a suitable translation of existential quantification, the ∧-interpretation is further extended to a functional interpretation of Aczel’s constructive set theory $CZF^-$ and its finite type extension $CZF^{\omega-}$ in $T_\in$. We also characterize the strength of these functional translations and sketch a hybrid interpretation ∧q of $CZF^{\omega-}$ in itself which shows closure of $CZF^{\omega-}$ under, e.g., a weak form of existential definability.
1 Introduction
Generalizing Gödel’s Dialectica interpretation D of Heyting arithmetic HA in his theory T [12], consider a mathematical theory $Th$ and a theory $FT$ of functionals of finite types such that the quantifier-free formulae of $L(Th)$ are also in $L(FT)$. A functional translation $I$ of $Th$ in $FT$ is a recursively defined map which, with each formula $A$ of $L(Th)$, associates an expression

$A^I \equiv \exists v \forall w\, A_I[v, w]$

where $A_I[v, w]$ is a formula of $L(FT)$ and $v, w$ are disjoint tuples of variables of finite type not occurring in $A$. Moreover, $A^I \equiv A_I \equiv A$ for quantifier-free $A \in L(Th)$. The functional translation $I$ is a functional interpretation of $Th$ in $FT$, written

$Th \overset{I}{\hookrightarrow} FT$,

if for $Th \vdash A$ and $A^I$ as above, there is a tuple of terms $b$ of $L(FT)$ (with variables among the variables free in $A$) such that

$FT \vdash A_I[b, w]$.

If this last line holds, $A$ is called I-interpretable in $FT$, and the terms $b$ are called (a tuple of) I-interpreting terms of $A$.
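As a small illustration of this definition, consider a ∀∃-statement with quantifier-free matrix. The sketch below assumes the standard quantifier clauses of Gödel’s translation, which are not spelled out in the text; the formula $P$ and the term $b$ are placeholders chosen for the example.

```latex
% Assumption: P[x,y] is quantifier-free; the usual clauses for
% \forall and \exists of Gödel's Dialectica translation are used.
\begin{align*}
A &\equiv \forall x\,\exists y\; P[x,y] \\
A^{D} &\equiv \exists Y\,\forall x\; P[x, Yx],
  \qquad A_{D}[Y,x] \equiv P[x, Yx] \\
\text{$A$ is $D$-interpretable in $T$}
  &\iff T \vdash P[x, bx] \ \text{ for some term } b
\end{align*}
```

In this simple case an interpreting term is a functional that computes, for every $x$, a witness $y = bx$.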
In this notation, Gödel proves in [12]

$HA \overset{D}{\hookrightarrow} T$.

As we identify Peano arithmetic PA with the negative fragment of HA, this implies

$PA \overset{D}{\hookrightarrow} T$.

Under this definition, Shoenfield’s translation [17] is, strictly speaking, not a functional translation. We also do not follow [8] where the Shoenfield translation is considered the prototype of functional translations of classical theories. We rather concentrate on functional translations which apply to constructive theories as well as to their classical counterparts, the latter being formulated in an ∃-free fragment. For discussions of the conceptual and foundational background, see [2], [6], [8], and [20].

In [12], Gödel in particular justifies his choice of the translation of implications $A \to B$,

$(A \to B)^D \equiv \exists W Y\, \forall v z\, (A_D[v, Wvz] \to B_D[Yv, z])$,

by the existence of characteristic terms for $A_D[v, w]$, $B_D[y, z]$, which are type o formulae of $L(T)$ and hence also decidable in $T$. In the absence of equality functionals - which reduce all equations to equations of type o - this argument does not extend to Heyting arithmetic in finite types $HA^\omega$, the natural span of HA and $T$. In $HA^\omega$, as in $T$, equations of higher type do not possess characteristic terms; they are undecidable and yet translated identically. By Howard’s example (cf. [11]), $HA^\omega$ is not D-interpretable in $T$. A way out is shown by the ∧-translation, which differs from the Dialectica translation essentially only in the case of implication:

$(A \to B)^\wedge \equiv \exists X W Y\, \forall v z\, (\forall x < Xvz\; A_\wedge[v, Wxvz] \to B_\wedge[Yv, z])$

This translation, of course, requires an extension of Gödel’s theory $T$ by a bounded universal quantifier $\forall x < t$ to the theory $T^\wedge$. The ∧-translation solves a contraction problem: the D-interpretation of $\forall w A[w] \to A[b] \wedge A[c]$, even for quantifier-free $A$, calls for a term $w_0$ such that $\vdash A[w_0] \to A[b] \wedge A[c]$ which, for undecidable $A$, may not exist in $T$. The more liberal ∧-interpretation, however, calls for terms $X$ and $W$ satisfying $\forall x < X\, A[Wx] \to A[b] \wedge A[c]$. Such terms are $X = 2$ and $W$ with $W0 = b$ and $W1 = c$. More generally, it is shown in [11] that

$HA^\omega \overset{\wedge}{\hookrightarrow} T^\wedge$.
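The contraction example can be checked by unfolding the bounded quantifier. The following is a sketch only; it assumes that a term $W$ taking the values $b$ and $c$ at the arguments 0 and 1 is definable by recursion in $T^\wedge$.

```latex
% Contraction example: \forall w A[w] \to A[b] \wedge A[c], A quantifier-free.
\begin{align*}
(\forall w\, A[w] \to A[b] \wedge A[c])^{\wedge}
  &\equiv \exists X\, W\, \bigl(\forall x < X\; A[Wx] \to A[b] \wedge A[c]\bigr) \\
\text{choose}\quad X &= 2, \qquad W0 = b, \qquad W1 = c \\
\forall x < 2\; A[Wx]
  &\;\leftrightarrow\; A[W0] \wedge A[W1]
  \;\equiv\; A[b] \wedge A[c]
\end{align*}
```

With this choice the implication holds trivially, and no decision between $A[b]$ and $A[c]$ is needed.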
Due to the undecidability of equations of higher type, $T$ and $T^\wedge$ have to be extended by the schema

$Stab(=) :\equiv \{\neg\neg\, a = b \to a = b \mid a, b \text{ terms of the same type}\}$

to yield classical theories $T^c$ and $T^{\wedge c}$. Similarly, Peano arithmetic in all finite types $PA^\omega$ is the negative fragment of $HA^\omega$, again extended by the schema $Stab(=)$. While, as above, we do not have $PA^\omega \overset{D}{\hookrightarrow} T^c$, we do have

$PA^\omega \overset{\wedge}{\hookrightarrow} T^{\wedge c}$.

Whereas in [10] we stressed the analogy between functional interpretations of $PA^\omega$ and $KP\omega^\omega$, here we present the ∧-interpretation of $CZF^{\omega-}$ as an extension of the ∧-interpretation of $KP\omega^\omega$.
2 Constructive set functionals
An intuitionistic theory $T_\in$ of constructive set functionals in finite types was first developed by Burr in [4], following a suggestion by A. Weiermann to study transfinite recursors $R_\tau$ with higher types $\tau$ as a tool for a functional interpretation of Kripke-Platek set theory. As in Gödel’s $T$, types are $o$ and, with $\sigma$ and $\tau$, also $(\sigma \to \tau)$. We write $c : \tau$ to indicate that $c$ is a term of type $\tau$. Application is type conforming: if $a : \sigma \to \tau$ and $b : \sigma$, then $(ab) : \tau$. Basic functionals of $T_\in$ will be given below together with their defining axioms. Within the language of $T_\in$, we distinguish the class of $\Delta_0$-formulae:

(1) $\Delta_0$-formulae are built up from atomic formulae $s \in t$, $s = t$ ($s, t : o$) and $\bot$ by $\wedge$, $\vee$, $\to$, $\forall x \in t$, $\exists x \in t$.
(2) $T_\in$-formulae are built up from $\Delta_0$-formulae and equations $a = b$ ($a, b : \tau$) by $\wedge$, $\to$, $\forall x \in t$.

The logic of $T_\in$ is intuitionistic predicate logic with identity, with the laws for quantification restricted to bounded quantifiers (cf. [4]). The logic calculus is a Hilbert-style calculus with ’short’ axioms and rules. It is presented in detail in [4] and [11] and is essentially the calculus used by Gödel in [12]. Basic functionals are

1. combinators $K$, $S$ - we suppress type indices - with axioms
(K) $Kab = a$
(S) $Sabc = ac(bc)$
2. arithmetic constants $0 : o$, $Suc : o \to o \to o$, $\omega : o$ with axioms
(0) $\forall x \in 0\; \bot$
(Suc) $Suc\, s\, t = s \cup \{t\}$
The standard one-place successor $suc$ is defined by $suc\, t = Suc\, t\, t$.
(ω) $0 \in \omega \wedge \forall x \in \omega\; suc\, x \in \omega \wedge \forall x \in \omega\, (x = 0 \vee \exists y \in \omega\; x = suc\, y)$

3. constants reflecting $\Delta_0$-logic for union-replacement, intersection, and implication, $U : (o \to o) \to o \to o$, $Int : o \to o \to o$, $Imp : o \to o \to o \to o$, with axioms
(U) $x \in U f t \leftrightarrow \exists y \in t\; x \in f y$, shorthand $U f t = \bigcup \{f x \mid x \in t\}$
(Int) $x \in Int\, t\, s \leftrightarrow x \in s \wedge \forall y \in t\; x \in y$, shorthand $Int\, t\, s = s \cap \bigcap t$
(Imp) $x \in Imp\, r\, s\, t \leftrightarrow x \in r \wedge (x \in s \to x \in t)$, shorthand $Imp\, r\, s\, t = \{x \in r \mid x \in s \to x \in t\}$

4. ∈-recursors $R_\tau : ((o \to \tau) \to o \to \tau) \to o \to \tau$. These require restriction functionals $\restriction_\tau$ of type $(o \to \tau) \to o \to o \to \tau$ for the formulation of their defining axioms. The restriction functionals are defined by recursion on $\tau$. Writing $(f \restriction_\tau t)$ or just $(f \restriction t)$ for $\restriction_\tau f t$, their defining equations are:
$(f \restriction_o t)\, x = U f (\{x\} \cap t)$
$(f \restriction_{\sigma \to \tau} t)\, x\, u^\sigma = ((\lambda y. f y u^\sigma) \restriction_\tau t)\, x$
The axiom for the ∈-recursor $R_\tau$ now is
(R) $R_\tau h t = h ((R_\tau h) \restriction_\tau t)\, t$

Further axioms and rules of $T_\in$ are the $\Delta_0$-axiom of set-extensionality
(ext) $(\forall x \in s\; x \in t \wedge \forall x \in t\; x \in s) \to s = t$,
the rule of type-extensionality
(T-EXT) $A \to f x = g x \vdash A \to f = g$ with $x : \tau$ not in $A, f, g$,
and finally the rule of transfinite or ∈-induction
(TIND) $\forall x \in y\; F[x] \to F[y] \vdash F[t]$ for all formulae $F[y]$ of $L(T_\in)$.
This completes the description of the theory $T_\in$.
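To see how axiom (R) and the restriction functional work together, here is a sketch of a familiar definition by ∈-recursion, the transitive closure. The names $h$ and $TC$ are introduced only for this illustration, and binary union is assumed to be definable from $U$ and $Suc$.

```latex
% Hypothetical illustration: transitive closure via the \in-recursor R_o.
% Needs amssymb for \restriction. h and TC are names chosen for this example.
\begin{align*}
h\, f\, t &:= t \cup U f t, \qquad TC := R_o\, h \\
TC\; t &= h\, (TC \restriction t)\, t
        && \text{axiom (R)} \\
       &= t \cup U\, (TC \restriction t)\, t \\
       &= t \cup \bigcup \{ (TC \restriction t)\, x \mid x \in t \}
        && \text{axiom (U)} \\
       &= t \cup \bigcup \{ TC\; x \mid x \in t \}
        && (f \restriction_o t)\, x = U f (\{x\} \cap t) = f x \text{ for } x \in t
\end{align*}
```

The last step uses that $\{x\} \cap t = \{x\}$ whenever $x \in t$, so the restricted functional agrees with the original one on the elements of $t$, which is exactly where the recursor consults it.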
The classical version of $T_\in$ is simply

$T_\in^c :\equiv T_\in + Stab(=)$

The restriction functionals $\restriction_\tau$, defined above, satisfy
(1) $T_\in \vdash \forall x \in t\; (f \restriction t)\, x = f x$
(2) $T_\in \vdash \forall x \in t\; f x = g x \to f \restriction t = g \restriction t$
These properties of $\restriction_\tau$ justify its name restriction and also give axiom (R) its intuitive meaning. The constants reflecting $\Delta_0$-logic allow for a simple proof of an explicit form of $\Delta_0$-separation:

2.1 Explicit $\Delta_0$-separation. To any $\Delta_0$-formula $A[x]$ and any term $t : o$ with $x : o$ not in $t$, there exists a separation term $\{x \in t \mid A[x]\} : o$ such that

$T_\in \vdash y \in \{x \in t \mid A[x]\} \leftrightarrow y \in t \wedge A[y]$

Any $\Delta_0$-formula $A$ possesses a characteristic term $\{0 \mid A\} = \{x \in 1 \mid A\}$ with $x$ not in $A$ such that

$T_\in \vdash 0 \in \{0 \mid A\} \leftrightarrow \{0 \mid A\} = 1 \leftrightarrow A$

For a proof, see [4] or [9]. Thus, every formula $A$ of $L(T_\in)$ is equivalent to a negative formula, and

$T_\in^c \vdash \neg\neg A \to A$

Ordered pairs $\langle\, ,\, \rangle$ and their projections $(\,)_0$, $(\,)_1$ may be defined as usual. The disjoint sum $\{\langle x, y \rangle \mid y \in Y x, x \in X\}$ of $\{Y x \mid x \in X\}$ satisfies:

2.2 Contraction of bounds. The following is provable in $T_\in$:

$\forall x \in X\; \forall y \in Y x\; F[x, y] \leftrightarrow \forall z \in \{\langle x, y \rangle \mid y \in Y x, x \in X\}\; F[(z)_0, (z)_1]$

A principle of generalized induction holds in $T_\in$ which is the essential technical tool for a functional interpretation of (TIND). It is closely related to the corresponding principle in $T^\wedge$ (Satz 1 in [11]). Its proof requires some recursion and induction schemata which we state here without proof. As these schemata - like functional translations - often handle tuples of terms and types, we insert a

Convention on tuple notation
1. If $\sigma$ is a type tuple $\sigma_1, ..., \sigma_l$ and $\tau$ is a type tuple $\tau_1, ..., \tau_k$, then $\sigma \to \tau$ denotes the type tuple $\sigma_1 \to ... \to \sigma_l \to \tau_1, ..., \sigma_1 \to ... \to \sigma_l \to \tau_k$ (associating to the right). Hence, for $l = 0$, $\sigma \to \tau$ is $\tau$, and for $k = 0$, $\sigma \to \tau$ is empty.
2. If $a$ is a term tuple $a_1, ..., a_k$ of type tuple $\sigma \to \tau$ and $b$ is a term tuple $b_1, ..., b_l$ of type tuple $\sigma$, then $(ab)$ denotes the term tuple $a_1 b_1 ... b_l, ..., a_k b_1 ... b_l$ (associating to the left) of type tuple $\tau$.

2.3 Recursion and induction schemata in $T_\in$

Simultaneous ∈-recursion: Given a non-empty type tuple $\tau = \tau_0, ..., \tau_n$ and terms $g_i : (o \to \tau) \to o \to \tau_i$ $(i \le n)$, there are terms $f_i : o \to \tau_i$ $(i \le n)$ which solve the $n + 1$ equations (SimR)
$f_i\, t = g_i\, (f_0 \restriction t) ... (f_n \restriction t)\, t \qquad (i \le n)$

ω-induction: In $T_\in$ the following rule is admissible:

$F[0],\ \forall y \in \omega\, (F[y] \to F[suc\, y]) \vdash t \in \omega \to F[t]$

Simultaneous ω-recursion: Given $\tau = \tau_0, ..., \tau_n$ and terms $a_i : \tau_i$ and $b_i : \tau \to o \to \tau_i$ $(i \le n)$, there are terms $f_i : o \to \tau_i$ $(i \le n)$ which solve the $n + 1$ pairs of equations

$f_i\, 0 = a_i$ and $\forall x \in \omega\; f_i (suc\, x) = b_i\, (f_0 x) ... (f_n x)\, x$

We are now in a position to prove

2.4 Generalized transfinite induction. Given a term $X$, a term tuple $W$ with variables $a, u, x : o$ and a tuple of variables $z$, all not in $X, W$, such that

(1) $T_\in \vdash \forall u \in a\; \forall x \in Xaz\; B[u, Wxaz] \to B[a, z]$,

then

$T_\in \vdash B[t, z]$

for any term $t : o$.
Proof. By simultaneous ω-recursion, we define a term tuple $X_1, Z$ depending on $z$ and $a$ by

$Z y 0 = z$, $\quad \forall n \in \omega\; Z \langle x, y \rangle (suc\, n) = W x a (Z y n)$
$X_1 0 = 1$, $\quad \forall n \in \omega\; X_1 (suc\, n) = \{\langle x, y \rangle \mid x \in X a (Z y n),\ y \in X_1 n\}$

For $n \in \omega$, we have by contraction of bounds 2.2

(2) $\forall u \in a\, (\forall y \in X_1 (suc\, n)\; B[u, Z y (suc\, n)] \leftrightarrow \forall y \in X_1 n\; \forall x \in X a (Z y n)\; B[u, W x a (Z y n)])$

After substituting $Z y n$ for $z$ in (1) and distributing $\forall y \in X_1 n$, we rewrite (1) under this equivalence as

$\forall n \in \omega\, (\forall u \in a\; \forall y \in X_1 (suc\, n)\; B[u, Z y (suc\, n)] \to \forall y \in X_1 n\; B[a, Z y n])$

This is of the form $\forall n \in \omega\, (\forall u \in a\; C[u, suc\, n] \to C[a, n])$, which implies

$T_\in \vdash \forall u \in a\; \forall n \in \omega\; C[u, n] \to \forall n \in \omega\; C[a, n]$

(TIND) and $0 \in \omega$ now yield $\vdash C[t, 0] \equiv \forall y \in X_1 0\; B[t, Z y 0]$, i.e. $\vdash B[t, z]$.
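A quick sanity check on 2.4 is the degenerate case in which the bound is the one-element set $1 = \{0\}$ and $W$ ignores its first argument. The sketch below assumes exactly this choice of $X$ and $W$; it shows that the premise (1) then collapses to the premise of ordinary (TIND) with parameters $z$.

```latex
% Special case of generalized transfinite induction: X a z = 1, W x a z = z.
\begin{align*}
\text{premise (1):}\quad
  & \forall u \in a\; \forall x \in 1\; B[u, z] \to B[a, z] \\
\text{since } 1 = \{0\} \text{ is inhabited:}\quad
  & \forall u \in a\; B[u, z] \to B[a, z] \\
\text{conclusion:}\quad
  & B[t, z] \qquad \text{(the conclusion of (TIND) for } F[y] \equiv B[y, z])
\end{align*}
```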
3 Functional interpretations of Kripke-Platek set theory
Kripke-Platek set theory (with infinity) $KP\omega$ (cf. Barwise [3]) is a classical set theory which lacks strong non-constructive principles like the powerset axiom or full separation, but still has considerable expressive power. In a sense, $KP\omega$ collects constructive set theoretic principles in a classical theory. We underline this aspect by ∧-interpreting $KP\omega$ as well as its finite type extension $KP\omega^\omega$ (cf. [5], [10]), the natural span of $KP\omega$ and $T_\in^c$, in the classical theory $T_\in^c$.

$KP\omega^\omega$ is defined as an extension of $T_\in^c$. Its language is the closure of $L(T_\in)$ under $\wedge$, $\to$, $\forall x \in t$, and $\forall u^\tau$ for all types $\tau$; its axioms and rules are those of $T_\in^c$, extended - with the exception of the rule (T-EXT) - to the full language of $KP\omega^\omega$, plus the laws for the universal quantifier and the axiom schema of $\Delta_0$-collection:

($\Delta_0$-collection) $\forall x \in t\; \neg\forall y\, \neg A[x, y] \to \neg\forall z\, \neg\forall x \in t\; \exists y \in z\; A[x, y]$, where $A[x, y]$ is a $\Delta_0$-formula and $x, y : o$

The rule (T-EXT) remains restricted to the language of $T_\in$ (weak extensionality). Being an extension of $T_\in^c$, $KP\omega^\omega$ inherits several properties from this subtheory:
(1) $KP\omega^\omega$ enjoys explicit $\Delta_0$-separation.
(2) In $KP\omega^\omega$, every formula is equivalent to a negative formula.
(3) $KP\omega^\omega \vdash \neg\neg A \to A$ for all formulae $A$ of $KP\omega^\omega$.
(4) $KP\omega^\omega$ satisfies classical ∃-free predicate logic.
Moreover, it is easily seen:
(5) $KP\omega^\omega$ is an extension of $KP\omega$.

We consider functional translations D and ∧, as they are obviously transferred from classical arithmetic $PA^\omega$ to $KP\omega^\omega$ (cf. [10] where this transfer is treated in detail and which contains some of the material of the present section). To avoid ambiguities, we give a simultaneous definition of D and ∧ on $L(KP\omega^\omega)$.

Recursive definition of the Dialectica- and the ∧-translation

Let $I$ stand for the translations D and ∧ simultaneously. To any formula $A$ of $KP\omega^\omega$, $I$ assigns an expression $A^I \equiv \exists v \forall w\, A_I[v, w]$, with $A_I[v, w]$ a formula of $L(T_\in)$, as follows:

$(T_\in)^I$: $A^I \equiv A$ for $A \in L(T_\in)$

Let $A^I$ be as above and $B^I \equiv \exists y \forall z\, B_I[y, z]$, then

$(\wedge)^I$: $(A \wedge B)^I \equiv \exists v y\, \forall w z\, (A_I \wedge B_I)$
$(\to)^D$: $(A \to B)^D \equiv \exists W Y\, \forall v z\, (A_D[v, Wvz] \to B_D[Yv, z])$
$(\to)^\wedge$: $(A \to B)^\wedge \equiv \exists X W Y\, \forall v z\, (\forall x \in Xvz\; A_\wedge[v, Wxvz] \to B_\wedge[Yv, z])$ in case the tuple $w$ is not empty
$(\to)^\wedge_0$: $(A \to B)^\wedge \equiv \exists Y\, \forall v z\, (A_\wedge[v] \to B_\wedge[Yv, z])$ for empty $w$
$(\forall \in)^I$: $(\forall x \in t\; B[x])^I \equiv \exists Y\, \forall z\, (\forall x \in t\; B_I[Yx, z])$
$(\forall)^I$: $(\forall u A[u])^I \equiv \exists V\, \forall u w\, A_I[u, Vu, w]$
3.1 Proposition. Already the type o theory $KP\omega$ is not Dialectica interpretable in $T_\in^c$.

Proof by example. $KP\omega$ (with a constant 0 for the empty set) proves

$\forall x\, (\forall y\; \neg y \in x \to x = 0)$

This has the D-translation

$\exists Y\, \forall x\, (\neg\, Y x \in x \to x = 0)$

Any $Y$ satisfying this formula is necessarily a classical choice function which is not a constructive set functional. However, the ∧-translation of this formula is

$\exists Z Y\, \forall x\, (\forall z \in Zx\; \neg\, Yz \in x \to x = 0)$,

and ∧-interpreting functionals $Z, Y$ are given by $Zx = x$ and $Yz = z$.

Besides the contraction problem, the ∧-translation also solves a $\Delta_0 \subset \Pi_1$-problem: Given an $L(T_\in)$-formula $A[x^o]$,

$KP\omega^\omega \vdash \forall y\, (y \in t \to A[y]) \to \forall y \in t\; A[y]$

The D-translation of this formula calls for a term $s$ satisfying

$(s \in t \to A[s]) \to \forall y \in t\; A[y]$

which may not exist in $T_\in$, as the example shows. The ∧-translation, however, is (equivalent to)

$\exists Y\, (\forall y \in Y\, (y \in t \to A[y]) \to \forall y \in t\; A[y])$

and is therefore ∧-interpreted by $Y = t$.
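The claim that $Zx = x$ and $Yz = z$ ∧-interpret the example of 3.1 can be verified directly from the axioms (0) and (ext) of $T_\in$. The following is a sketch of that verification, not a quotation from the cited sources.

```latex
% Check that Z x = x, Y z = z interpret \forall x (\forall y \neg y \in x \to x = 0).
\begin{align*}
\text{matrix:}\quad
  & \forall z \in Zx\; \neg\, Yz \in x \;\to\; x = 0 \\
\text{with } Zx = x,\ Yz = z:\quad
  & \forall z \in x\; \neg\, z \in x \;\to\; x = 0 \\
\text{from the antecedent:}\quad
  & \forall z \in x\; z \in 0 \quad \text{(ex falso)} \\
\text{from axiom } (0):\quad
  & \forall z \in 0\; z \in x \\
\text{by (ext):}\quad
  & x = 0
\end{align*}
```

The bound $Zx = x$ collects all candidate counterexamples at once, so no single element has to be selected, which is what the D-translation would demand.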
Here we made use of a simplification of the ∧-translation:

3.2 Lemma. A formula $\forall y\, A[y] \to B$ with $A, B$ in $L(T_\in)$, $y : o$ may be ∧-translated as

$\exists Y\, (\forall y \in Y\; A[y] \to B)$

without changing the concept of ∧-interpretability.

Proof. $(\forall y\, A[y] \to B)^\wedge$ is literally $\exists X Y'\, (\forall x \in X\; A[Y' x] \to B)$. $Y$ is obtained from $X, Y'$ by putting $Y = \{Y' x \mid x \in X\}$, and $X, Y'$ are obtained from $Y$ by $X = Y$ and $Y' x = x$. Moreover, in the translation of longer formulae, the tuple of variables $X, Y'$ and the variable $Y$ are handled the same way.

3.3 ∧-Interpretation Theorem for $KP\omega^\omega$

$KP\omega^\omega \overset{\wedge}{\hookrightarrow} T_\in^c$

Proof by induction on deductions. The ∧-interpretation of the logic of $KP\omega^\omega$, including identity, may be taken over from [11], replacing $<$ by $\in$, augmented by the above solution of the $\Delta_0 \subset \Pi_1$-problem. Results from $T^\wedge$ which are used there have to be transferred to $T_\in$. That, however, is easily done, in particular by exploiting explicit $\Delta_0$-separation 2.1 and contraction of bounds 2.2 (cf. [4], [5], and [10]). Axioms and rules of $T_\in^c$, including type extensionality (T-EXT), are all interpreted by the empty tuple. Only (TIND) and $\Delta_0$-collection remain to be ∧-interpreted.

(TIND) $\forall u \in a\; F[u] \to F[a] \vdash F[t]$

By I.H., there are terms and term tuples $X, W, Y_0$ such that

(1) $T_\in^c \vdash \forall x \in Xavz\; \forall u \in a\; F_\wedge[u, vu, Wxavz] \to F_\wedge[a, Y_0 va, z]$

We define terms $Y$ by simultaneous transfinite recursion 2.3, $Ya = Y_0 (Y \restriction a)\, a$, substitute $Y \restriction a$ for $v$ in (1), and obtain

$T_\in^c \vdash \forall x \in Xa(Y \restriction a)z\; \forall u \in a\; F_\wedge[u, Yu, Wxa(Y \restriction a)z] \to F_\wedge[a, Ya, z]$

Here, the terms $Xa(Y \restriction a)z$, $Wxa(Y \restriction a)z$ are terms $X'az$, $W'xaz$, and $F_\wedge[a, Ya, z]$ is a formula $B[a, z]$ satisfying (1) in proposition 2.4. So, by generalized transfinite induction, $T_\in^c \vdash F_\wedge[t, Yt, z]$.

The ∧-translation of the schema of ($\Delta_0$-collection) is, simplified by an application of lemma 3.2,

$\exists Z\, \forall Y\, (\forall x \in t\; \exists y \in Yx\; A[x, y] \to \exists z \in ZY\; \forall x \in t\; \exists y \in z\; A[x, y])$
Given $Y$ satisfying the antecedent, there is one canonical $z$ satisfying the consequent, namely $z = \bigcup \{Yx \mid x \in t\} = UYt$. ($\Delta_0$-collection) is therefore ∧-interpreted by the term $Z$ with the value $ZY = \{UYt\}$. This completes the proof of the ∧-interpretation theorem. For related proofs using, however, different translations, cf. [5], [7]. An immediate consequence is:

3.4 Conservativity and relative consistency. $KP\omega^\omega$ is a conservative extension of $T_\in^c$. The consistency of $T_\in^c$ implies the consistency of $KP\omega^\omega$.

In order to characterize a functional translation $I$ in a classical theory $Th$, it must be kept in mind that an expression $A^I \equiv \exists v \forall w\, A_I[v, w]$ is, for non-empty tuple $v$, not a formula in $L(Th)$. Instead, one has to consider its negative version $A^{I-} \equiv \neg\forall v\, \neg\forall w\, A_I[v, w]$, and the schema to be characterized is $\{A \leftrightarrow A^{I-}\}$. This complicates the situation in comparison to the constructive case. For $I = D$ and the type o fragment $PA^\omega_0$ of Peano arithmetic in finite types $PA^\omega$, this schema is characterized by

(qf-AC) $\forall x\, \neg\forall y\, \neg A[x, y] \to \neg\forall Y\, \neg\forall x\; A[x, Yx]$

with $A[x, y]$ in $L(T_0)$ and non-empty tuples $x, y$ of variables, which is $(AC)^-$ with quantifier free matrix $A$:

3.5 Proposition (Kreisel [13]). $PA^\omega_0 + (\text{qf-AC}) \equiv PA^\omega_0 + \{A \leftrightarrow A^{D-}\}$

Up to a double negation, (qf-AC) - which, with $A$ in $L(T_\in)$, may also be read as an axiom schema in $KP\omega^\omega$ - is of the form $B \to B^{D-}$ with $B \equiv (\forall x \exists y\, A[x, y])^-$. Similarly, for $A$ in $L(T_\in)$ and tuples $x, y$ as above, let

(qf-ARC) $\forall x\, \neg\forall y\, \neg A[x, y] \to \neg\forall S, Y\, \neg\forall x\, \neg\forall s \in Sx\; \neg A[x, Ysx]$

This quantifier free axiom of restricting choice is literally of the form $B \to B^{\wedge-}$ with $B \equiv (\forall x \exists y\, A[x, y])^-$. We therefore put (qf-AC$^D$) :≡ (qf-AC) and (qf-AC$^\wedge$) :≡ (qf-ARC) and let (qf-AC$^I$) refer to either.

3.6 Lemma. For $A, B, A[u]$ in $L(KP\omega^\omega)$, $u$ a - possibly empty - tuple of variables, $KP\omega^\omega$ proves:
1. $A^{I-} \leftrightarrow A$ for $A$ in $L(T_\in)$
2. $(A \wedge B)^{I-} \leftrightarrow A^{I-} \wedge B^{I-}$
3. $(A \to B)^{I-} \to A^{I-} \to B^{I-}$
4. $(\forall u\, \neg A[u])^{I-} \to \forall u\, \neg (A[u])^{I-}$; furthermore
5. $KP\omega^\omega + (\text{qf-AC}^I) \vdash \forall u\, \neg (A[u]^{I-}) \to (\forall u\, \neg A[u])^{I-}$

Since prenex formulae $B$ are of the form $\forall u_1 \neg ... \forall u_n \neg\, C[u_1, ..., u_n]$ with $n \ge 0$, - possibly empty - tuples $u_1, ..., u_n$ of variables, and $C[u_1, ..., u_n] \in L(T_\in)$, this lemma implies:

3.7 Partial characterization of $I = D$ and $I = \wedge$

$KP\omega^\omega + (\text{qf-AC}^I) \equiv KP\omega^\omega + \{B \leftrightarrow B^{I-} \mid B \text{ prenex}\}$

Proof. Let $B$ be prenex as above. Then $KP\omega^\omega + (\text{qf-AC}^I) \vdash B \leftrightarrow B^{I-}$ follows by 1. and $n$ applications of 4. and 5. in Lemma 3.6. Conversely, the schema (qf-AC$^I$), as mentioned above, is a set of formulae $B \to B^{I-}$ with prenex $B$.

Due to the ∧-interpretation theorem 3.3, this implies for $I = \wedge$:

3.8 Characterization theorem for ∧ on $KP\omega^\omega$

$KP\omega^\omega + (\text{qf-ARC}) \equiv KP\omega^\omega + \{A \leftrightarrow A^{\wedge-}\}$

Proof. For $A \in L(KP\omega^\omega)$, let $B$ be a prenex normal form of $A$. Then $KP\omega^\omega \vdash A \leftrightarrow B$. Therefore, by theorem 3.3, $KP\omega^\omega \vdash (A \leftrightarrow B)^{\wedge-}$, which by 2. and 3. in 3.6 implies $KP\omega^\omega \vdash A^{\wedge-} \leftrightarrow B^{\wedge-}$, and, as shown in 3.7, $KP\omega^\omega + (\text{qf-ARC}) \vdash B \leftrightarrow B^{\wedge-}$. Put together, these equivalences imply $KP\omega^\omega + (\text{qf-ARC}) \vdash A \leftrightarrow A^{\wedge-}$.

3.9 Corollary. $KP\omega^\omega + (\text{qf-ARC})$ proves any formula ∧-interpretable in $T_\in^c$.

It may be conjectured that $KP\omega^\omega + (\text{qf-ARC}) \nvdash (\text{qf-AC})$ and that therefore (qf-AC) is not ∧-interpretable in $T_\in^c$. On the other hand, it can be shown:

3.10 Extended ∧-Interpretation Theorem for $KP\omega^\omega$

$KP\omega^\omega + (\text{qf-ARC}) \overset{\wedge}{\hookrightarrow} T_\in^c$

As stated above, this proof does not transfer to a Dialectica interpretation in $T_\in^c$, even of $KP\omega$ only. Following Burr-Hartung [7], we therefore extend the theory $T_\in^c$ by a non-constructive choice function.

3.11 The axiom of uniform choice. Let $F_C$ be a new constant of type $o \to o$ with the axiom
(uniform AC) $x \# 0 \to F_C\, x \in x$

$t \# 0$, read: $t$ is inhabited, stands for $\exists y \in t\; y = y$. The extension of the type o fragment of $T_\in^c$ by this axiom suffices for a Dialectica interpretation of $KP\omega$, even of the type o fragment of $KP\omega^\omega$:

3.12 Dialectica interpretation theorem

$KP\omega^\omega_0 + (\text{uniform AC}) \overset{D}{\hookrightarrow} T^c_{\in 0} + (\text{uniform AC})$

The proof by induction on deductions parallels the proof of the ∧-interpretation theorem above and is, in some aspects, technically simpler. In particular, the induction steps for A3 $A \to B,\ B \to C \vdash A \to C$ and for (TIND) are special cases of the corresponding steps in 3.3, with the given bounds being equal to 1. Two induction steps, however, those for contraction A7 and for introduction of the bounded universal quantifier Q3b, require new arguments.

A7 $A \to B,\ A \to C \vdash A \to B \wedge C$. By I.H., there are term tuples $W_1, Y_1$ and $W_2, Y_2$ D-interpreting $A \to B$ and $A \to C$, i.e.

$\vdash A_D[v, W_1 v z_1] \to B_D[Y_1 v, z_1]$ and $\vdash A_D[v, W_2 v z_2] \to C_D[Y_2 v, z_2]$

We look for a term tuple $W$ with

$\vdash B_D[Y_1 v, z_1] \to W v z_1 z_2 = W_2 v z_2$ and $\vdash \neg B_D[Y_1 v, z_1] \to W v z_1 z_2 = W_1 v z_1$,

because, for such a $W$,

$\vdash \neg B_D[Y_1 v, z_1] \to A_D[v, W v z_1 z_2] \to \bot$ and $\vdash B_D[Y_1 v, z_1] \to A_D[v, W v z_1 z_2] \to C_D[Y_2 v, z_2]$,

hence, by classical logic,

$\vdash A_D[v, W v z_1 z_2] \to B_D[Y_1 v, z_1] \wedge C_D[Y_2 v, z_2]$

Since $B_D$ is $\Delta_0$, by 2.1, the term tuple $W$ can be defined by

$W v z_1 z_2 = D_\tau (W_1 v z_1)(W_2 v z_2) \{0 \mid B_D[Y_1 v, z_1]\}$

with $D_\tau$ a tuple of case distinction functionals of appropriate type tuple $\tau$. Then $W, Y_1, Y_2$ D-interpret $A \to B \wedge C$. - This argument, cut back to Heyting arithmetic HA, is in fact Gödel’s original argument for his Dialectica interpretation of A7. It depends on the formula $B$ being of type o.

Q3b $A \to y \in t \to B[y] \vdash A \to \forall y \in t\; B[y]$. By I.H., there are term tuples $W', Y$ such that
$\vdash A_D[v, W' v y z] \to y \in t \to B_D[y, Y v y, z]$.

This certainly implies

$\vdash \forall y \in t\; A_D[v, W' v y z] \to \forall y \in t\; B_D[y, Y v y, z]$.

If the antecedent of this implication does not hold, then we pick a $y \in t$ such that $\neg A_D[v, W' v y z]$ by applying the choice function $F_C$, that is, we define

$W v z = W' v (F_C \{y \in t \mid \neg A_D[v, W' v y z]\}) z$

and obtain

$\vdash \neg\forall y \in t\; A_D[v, W' v y z] \to A_D[v, W v z] \to \bot$.

These two implications imply by classical logic

$\vdash A_D[v, W v z] \to \forall y \in t\; B_D[y, Y v y, z]$

(uniform AC) is D-interpreted trivially. That completes the Dialectica interpretation of $KP\omega^\omega_0 + (\text{uniform AC})$.

By an argument analogous to the proof of the ∧-characterization theorem 3.7, it can be shown:

3.13 Characterization theorem for the D-translation

$KP\omega^\omega_0 + (\text{uniform AC}) + (\text{qf-AC}) \equiv KP\omega^\omega_0 + (\text{uniform AC}) + \{A \leftrightarrow A^{D-}\}$

Any formula D-interpretable in $T^c_{\in 0}$ is derivable in $KP\omega^\omega_0 + (\text{uniform AC}) + (\text{qf-AC})$.

Moreover, also the Dialectica interpretation extends to this larger theory, in analogy to 3.10:

3.14 Extended D-interpretation theorem

$KP\omega^\omega_0 + (\text{uniform AC}) + (\text{qf-AC}) \overset{D}{\hookrightarrow} T^c_{\in 0} + (\text{uniform AC})$
4 Functional interpretations of Constructive Set Theory
Constructive set theory CZF goes back to Myhill [14] and Aczel [1]. Its schema of subset collection, as shown by Burr [4], needs an additional fullness functional for a functional interpretation. Here we consider its subtheory $CZF^-$ which is CZF without subset collection. Its finite type version $CZF^{\omega-}$ is again defined as an extension of $T_\in$. Its language is the closure of $L(T_\in)$ under $\wedge$, $\to$, $\forall x \in t$, and under unbounded quantifiers $\forall x^\tau$, $\exists x^\tau$ for finite types $\tau$. The bounded existential quantifier $\exists x \in t$ and disjunction $\vee$ are primitive connectives only within $\Delta_0$-formulae. The logic is intuitionistic predicate logic with identity. The only axiom schema added to $T_\in$ is

(Strong collection) $\forall x \in a\; \exists y\, F[x, y] \to \exists b\, F'[a, b]$

where

$F'[a, b] \equiv \forall x \in a\; \exists y \in b\; F[x, y] \wedge \forall y \in b\; \exists x \in a\; F[x, y]$

The rule (TIND) is extended to the full language of $CZF^{\omega-}$, whereas the rule (T-EXT) remains restricted to the language of $T_\in$. As explicit versions of the axioms of Infinity, Pair, Union and of the schema of $\Delta_0$-Separation of CZF are derivable already in $T_\in$, we see:

4.1 Lemma. $CZF^{\omega-}$ is an extension of $CZF^-$.

A detailed ∧-interpretation of $CZF^{\omega-}$ is presented in [9]. We refer to this source for most of the proofs. The first functional interpretation of $CZF^{\omega-}$ is given by Burr [4], based on his ×-translation improved by Schulte [16]. We extend the ∧-translation of $KP\omega^\omega$ by a clause for the existential quantifier which we take over from Schulte’s ×-translation in [16]. For $A[u]^\wedge \equiv \exists v \forall w\, A_\wedge[u, v, w]$, the clause runs:

$(\exists)^\wedge$: $(\exists u A[u])^\wedge \equiv \exists X U V\, \forall w\, (X \# 0 \wedge \forall x \in X\; A_\wedge[Ux, Vx, w])$

Due to this clause, we do not have $A^{\wedge\wedge} \equiv A^\wedge$ for all $A$, but one easily sees:

4.2 Lemma.
1. If $A \equiv \forall w B$ with $B \in L(T_\in)$, then $A^\wedge \equiv A$.
2. $CZF^{\omega-} \vdash (A \wedge B)^\wedge \leftrightarrow A^\wedge \wedge B^\wedge$
3. $CZF^{\omega-} \vdash (\exists u A[u])^\wedge \leftrightarrow \exists u\, (A[u])^\wedge$
4. $CZF^{\omega-} \vdash A^{\wedge\wedge} \leftrightarrow A^\wedge$
4. follows from 1. by iterated application of 3.

The ∧-translation of implication given in section 3 solves the $\Delta_0 \subset \Pi_1$-problem for $CZF^{\omega-}$ in the same way as for $KP\omega^\omega$. Its dual is a $\Delta_0 \subset \Sigma_1$-problem: a $\Delta_0$-formula $\exists x \in t\; A[x]$ may be read as a $\Sigma_1$-formula $\exists x\, (x \in t \wedge A[x])$ with an unbounded quantifier $\exists x$, having a different translation again. A Dialectica-like translation D of the implication

$B \equiv \exists x \in t\; A[x] \to \exists x\, (x \in t \wedge A[x])$

would lead to a matrix

$B_D \equiv \exists x \in t\; A[x] \to (s \in t \wedge A[s])$,
and an interpreting term $s$ which - unlike the situation in arithmetic where $s = \mu y < t\; A[y]$ is a primitive recursive solution - again requires the choice function $F_C$ (cf. 3.11), since $T_\in \vdash \exists x \in t\; A[x] \to \{x \in t \mid A[x]\} \# 0$:

$T_\in + (\text{uniform AC}) \vdash B_D$ for $s = F_C \{x \in t \mid A[x]\}$

An adequate translation is given in [4] and [16] by the ×-translation which in this case coincides with the ∧-translation above: In translating $\exists x^o\, C[x]$ with $C[x]^\wedge \equiv C[x]$, one basically looks for an inhabited set $X$ such that all elements $x \in X$ satisfy $C[x]$. There is no need to pick an element from the set $X$. For the above $B$, this asks for a set $X$ satisfying

$B_\wedge \equiv \exists x \in t\; A[x] \to X \# 0 \wedge \forall x \in X\, (x \in t \wedge A[x])$

Obviously, in this case, the $\Delta_0$-set $X = \{x \in t \mid A[x]\}$ will do. So it is just the introduction of an inhabited bound $X$ and of the combination $X \# 0 \wedge \forall x \in X$ that solves the $\Delta_0 \subset \Sigma_1$-problem.

In general, in translating $\exists u A[u]$, the variable $u$ may be of arbitrary type $\tau$ and the translation of $A[u]$ may already start with some existential quantifiers. Here, we follow [16] in defining $(\exists u A[u])^\wedge$ as above, with the variable $X$ bounding all of $U, V$. The special case above, however, is compatible with the general clause $(\exists)^\wedge$:

4.3 Lemma. $\exists x\, A[x]$ with $A[x] \in L(T_\in)$, $x : o$ may be ∧-translated as

$\exists X\, (X \# 0 \wedge \forall x \in X\; A[x])$

without changing the concept of ∧-interpretability. The proof parallels the proof of lemma 3.2.

Looking at $(\to)^\wedge$ and $(\exists)^\wedge$, we see that every existential variable in the prefix of $A^\wedge$ - which is not itself a bound - has an accompanying bound, inhabited or not. Therefore, if an additional inhabited bound is added in front of $A_\wedge$, it may, after distribution over the subformulae of $A_\wedge$, be contracted with the bounds already present and be finally absorbed altogether:

4.4 Lemma on absorption of bounds. Let $A^\wedge$ be $\exists v \forall w\, A_\wedge[v, w]$. Given terms $S, V$ (not containing variables from $w$) there are terms $v_0$ (also not containing variables from $w$) such that

$T_\in \vdash S \# 0 \wedge \forall s \in S\; A_\wedge[Vs, w] \to A_\wedge[v_0, w]$

For a proof, we refer to [9]. A weaker version of this lemma, adapted to the ×-translations, appears in [4] and [16] as ’distribution lemma’. The absorption lemma
is useful in the proof of

4.5 ∧-Interpretation Theorem for $CZF^{\omega-}$

$CZF^{\omega-} \overset{\wedge}{\hookrightarrow} T_\in$

Proof by induction on deductions. The ∧-interpretation of the axioms and rules of the negative fragment of logic with identity, including (TIND), is the same as for $KP\omega^\omega$ in 3.3. As the axioms and rules of $T_\in$, including (T-EXT), are interpreted trivially, the only laws that still have to be ∧-interpreted are those governing disjunction, (bounded and unbounded) existential quantification, and (Strong collection). Here, we consider the laws for disjunction. As disjunctions occur only between $\Delta_0$-formulae, $A \to A \vee B$ and $B \to A \vee B$ are $\Delta_0$ and hence interpreted by the empty tuple. To illustrate the impact of the absorption lemma, we look at

$A_0 \to B,\ A_1 \to B \vdash A_0 \vee A_1 \to B$

Here, $A_0, A_1$ are again $\Delta_0$, and let $B^\wedge \equiv \exists y \forall z\, B_\wedge[y, z]$. By I.H. and case distinction, we have terms $Y$ such that for $i \in 2$

$T_\in \vdash A_i \to B_\wedge[Yi, z]$

By $\Delta_0$-separation, we define a bound $S := \{x \in 1 \mid A_0\} \cup \{x \in \{1\} \mid A_1\}$. For this $S \subset 2$, $T_\in \vdash A_0 \vee A_1 \to S \# 0$ and $T_\in \vdash \forall s \in S\; B_\wedge[Ys, z]$, hence

$T_\in \vdash A_0 \vee A_1 \to S \# 0 \wedge \forall s \in S\; B_\wedge[Ys, z]$

By absorption of bounds 4.4, there are terms $y_0$ such that

$T_\in \vdash A_0 \vee A_1 \to B_\wedge[y_0, z]$:

a combination of $\Delta_0$-separation and absorption of bounds gives us the term tuple $y_0$ ∧-interpreting $A_0 \vee A_1 \to B$. For the interpretation of laws governing the existential quantifier and of the schema of (Strong collection), we refer to [9].

Since ∧ is the identity on $L(T_\in)$, the ∧-interpretation theorem implies:

4.6 Corollary. $CZF^{\omega-}$ is a conservative extension of $T_\in$. The consistency of $T_\in$ implies the consistency of $CZF^{\omega-}$.

As in $HA^\omega$, we adapt Markov’s rule to the ∧-translation so that its conclusion is the ∧-translation of its premise. Admissibility of this rule is then also an immediate consequence of the ∧-interpretation theorem:
Functional Interpretations of Classical and Constructive Set Theory
4.7 Corollary. (Rule-M_∈) is admissible in CZF^{ω−}:

(Rule-M_∈) If ⊢ ∀wA[w] → B with A, B ∈ L(T_∈), then ⊢ ∃XW(∀x ∈ X A[W x] → B)

In contrast to the situation in KPω^ω, the schema {A ↔ A^∧} is axiomatized on the basis of CZF^{ω−} along the same lines as in HA^ω. Due to the 'weak' translation (∃)^∧ of the existential quantifier, however, choice appears here only as a weak axiom of choice (WAC) which still carries more constructive information than the schema (ARC) in the previous section.

4.8 The theory CZF^{ω+}

(WIP_∈) (∀wA → ∃yB[y]) → ∃XY(∀wA → X#0 ∧ ∀x ∈ X B[Y x]) for A ∈ L(T_∈)
(M_∈) (∀wA[w] → B) → ∃XW(∀x ∈ X A[W x] → B) for A, B ∈ L(T_∈)
(WAC) ∀x∃yA[x, y] → ∃SY∀x(S x#0 ∧ ∀s ∈ S x A[x, Y sx])

Here, w, y, and in WAC also x are to be read as tuples of variables of arbitrary type. Bounds X, S x are single variables resp. terms of type o.

CZF^{ω+} :≡ CZF^{ω−} + WIP_∈ + M_∈ + WAC

In analogy to the characterization of ∧ on HA^ω (cf. [11]), we obtain:

4.9 Characterization Theorem CZF^{ω+} ≡ CZF^{ω−} + {A ↔ A^∧}

Since CZF^{ω−} ⊢ A^∧ → B^∧ implies that A → B is ∧-interpretable in T_∈, the characterization theorem implies:

4.10 Extended ∧-Interpretation Theorem

CZF^{ω+} ↪^∧ T_∈

Also CZF^{ω+} is a conservative extension of T_∈. The consistency of T_∈ implies the consistency of CZF^{ω+}.

In order to show closure of CZF^{ω−} under relevant rules beyond (Rule-M_∈), a hybrid translation is studied in [16] which we present as a q-hybrid ∧q which is related to ∧ as modified q-realizability mq is related to modified realizability mr in [19] by Troelstra. The clauses distinguishing it from the ∧-translation are:

(→)^{∧q} (A → B)^{∧q} ≡ ∃XWY∀vz(A ∧ ∀x ∈ Xvz A_{∧q}[v, W xvz] → B_{∧q}[Yv, z])
(∃)^{∧q} (∃uA[u])^{∧q} ≡ ∃XUV∀w(X#0 ∧ ∀x ∈ X(A[U x] ∧ A_{∧q}[U x, V x, w]))

Also for this translation, lemma 4.4 on absorption of bounds holds, and in analogy to the ∧q-interpretation of HA^ω (cf. Stein [18]), Schulte [16] essentially shows:

4.11 ∧q-Interpretation Theorem

CZF^{ω−} ↪^{∧q} CZF^{ω−}

The proof parallels the proof of the ∧-interpretation theorem 4.5. Even more: The same term-tuples ∧- and ∧q-interpret the theorems of CZF^{ω−}. For details, see [16], though there the theorem is proved for a ×q-translation.

As the ∧q-translation 'remembers' more of the translated formulae than the ∧-translation, we obtain admissibility of rules beyond (Rule-M_∈):

4.12 Proposition CZF^{ω−} satisfies weak existential definability (WED) and is closed under rules of weak choice and of weak independence of premise:

(WED) If CZF^{ω−} ⊢ ∃y^τ A[y], then there are terms S : o, Y : o → τ s.t. CZF^{ω−} ⊢ S#0 ∧ ∀s ∈ S A[Y s]
(Rule-WAC) If CZF^{ω−} ⊢ ∀x^σ ∃y^τ A[x, y], then there are terms S, Y s.t. CZF^{ω−} ⊢ S x#0 ∧ ∀s ∈ S x A[x, Y xs]
(Rule-WIP_∈) If CZF^{ω−} ⊢ ∀wA[w] → ∃yB[y] with A ∈ L(T_∈), then there are terms X, Y s.t. CZF^{ω−} ⊢ ∀wA[w] → X#0 ∧ ∀x ∈ X B[Y x]

One may ask whether we could not do better and get rid of the inhabited bounds that occur in and weaken these three admissibility statements: Could we not prove (ED) instead of (WED)? At least, (WED) reduces the problem of (ED) in general to the question of (ED) for inhabited sets [16]:

4.13 Lemma If for every term S : o for which T_∈ ⊢ S#0, there is a term t : o such that T_∈ ⊢ t ∈ S, then CZF^{ω−} satisfies (ED):

(ED) If CZF^{ω−} ⊢ ∃y^τ A[y], then there is a term b : τ s.t. CZF^{ω−} ⊢ A[b]

The argument above which led us to translate the existential quantifier according to (∃)^∧ does not decide the question, but it leaves little hope for establishing (ED) for CZF^{ω−}:
4.14 Conjecture (¬ED) There is a term S : o such that T_∈ ⊢ S#0, but for no term t, T_∈ ⊢ t ∈ S.

In the forthcoming paper [15], M. Rathjen proves the existence property (EP) for CZF^−, a version of (ED) for sentences only:

4.15 Theorem (Rathjen) CZF^− has the existence property:

(EP) If CZF^− proves a sentence ∃x A[x], then there is a formula C[x] with exactly x free, such that CZF^− ⊢ ∃!x(C[x] ∧ A[x])

It may be conjectured that this result transfers from CZF^− to CZF^{ω−} and from set variables x to variables of arbitrary finite type, and that - in the richer language of CZF^{ω−} - the defining formula C[x] may be chosen to be simply x = b for some functional b of T_∈. That would imply that CZF^{ω−} satisfies (ED) for sentences ∃y^τ A[y]. Yet, conjecture 4.14 may still hold for some term S : o containing free parameters.
References
[1] Aczel, P.H.G.: The type theoretic interpretation of constructive set theory, in: A. McIntyre, L. Pacholski, J. Paris (Eds.), Logic Colloquium '77, North Holland, Amsterdam 1978, 55 - 66.
[2] Avigad, J., and S. Feferman: Gödel's functional ('Dialectica') interpretation, in: S. Buss (Ed.), Handbook of Proof Theory, Studies in Logic and the Foundations of Mathematics, Vol. 137, Elsevier, Amsterdam 1998, 337 - 406.
[3] Barwise, J.: Admissible Sets and Structures, Springer-Verlag, Berlin Heidelberg New York 1975.
[4] Burr, W.: Functional interpretation of Aczel's constructive set theory, APAL 104 (2000) 31 - 73.
[5] Burr, W.: A Diller-Nahm-style functional interpretation of KPω, Arch. Math. Logic 39 (2000) 599 - 604.
[6] Burr, W.: Concepts and aims of functional interpretations: Towards a functional interpretation of constructive set theory, Synthese 133 (2002) 257 - 274.
[7] Burr, W., and V. Hartung: A characterization of Σ1-definable functions of KPω + (uniformAC), Arch. Math. Logic 37 (1998) 199 - 214.
[8] Diller, J.: Logical problems of functional interpretations, APAL 114 (2002) 27 - 42.
[9] Diller, J.: Functional interpretations of constructive set theory in all finite types, Dialectica 62 (2008) 149 - 177.
[10] Diller, J.: Functional interpretations of classical systems, in: R. Schindler (Ed.), Ways of Proof Theory, Ontos Mathematical Logic, vol. 2, Ontos Verlag, Heusenstamm 2010, 241 - 255.
[11] Diller, J., and W. Nahm: Eine Variante zur Dialectica-Interpretation der Heyting-Arithmetik endlicher Typen, Arch. Math. Logik Grundl. 16 (1974) 49 - 66.
[12] Gödel, K.: Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica 12 (1958) 280 - 287.
[13] Kreisel, G.: Interpretation of analysis by means of constructive functionals of finite type, in: A. Heyting (Ed.), Constructivity in Mathematics, North Holland Publ. Co., Amsterdam 1959, 101 - 128.
[14] Myhill, J.: Constructive set theory, J. Symb. Logic 40 (1975) 347 - 382.
[15] Rathjen, M.: The weak existence property as a path to the strong existence property. To appear.
[16] Schulte, D.: Hybrids of the ×-translation for CZF^ω, J. Applied Logic 6 (2008) 443 - 458.
[17] Shoenfield, J.R.: Mathematical Logic, Addison-Wesley Publ. Comp., Reading, MA, 1967.
[18] Stein, M.: Eine Hybrid-Interpretation der Heyting-Arithmetik endlicher Typen, Master's thesis, University of Münster, 1974.
[19] Troelstra, A.S.: Metamathematical investigation of intuitionistic arithmetic and analysis, Lecture Notes in Mathematics 344, Springer, Heidelberg/New York 1973.
[20] Troelstra, A.S.: Introductory Note to 1958 and 1972, in: K. Gödel, Collected Works, vol. II, Publications 1938 - 1974, S. Feferman (Ed.), The Clarendon Press, Oxford University Press, New York 1990.
Weak Theories of Truth and Explicit Mathematics
Sebastian Eberhard and Thomas Strahm∗
Dedicated to Helmut Schwichtenberg on his retirement
We study weak theories of truth over combinatory logic and their relationship to weak systems of explicit mathematics. In particular, we consider two truth theories TPR and TPT of primitive recursive and feasible strength. The latter theory is a novel abstract truth-theoretic setting which is able to interpret expressive feasible subsystems of explicit mathematics.
1 Introduction
∗ Research supported by the Swiss national Foundation.

The theories of truth and explicit mathematics considered in this article are all based on a common applicative ground language for operations in the sense of combinatory logic; operations can freely be applied to other operations and strong principles of recursion are available due to the known expressivity of combinatory algebras. The first order applicative base describes the operational core of Feferman's explicit mathematics, cf. [17, 18, 19]; instead of a predicate N for natural numbers we will consider a predicate W in order to single out those operations which denote binary words. Types (or classifications) in explicit mathematics are extensional collections of operations. They are generated successively and linked to the applicative ground structure by a naming relation: the names of a type constitute its intensional or computational representations. The interplay of types and names on the level of combinatory operations makes the framework of explicit mathematics very expressive. An alternative means to extend first order applicative theories by a typing discipline is to extend them by a unary truth predicate T and interpret naive set theory by stipulating x ∈ a as T(ax). The so-obtained axiomatic frameworks of partial,
self-referential truth are rooted in Kripke’s seminal work and also yield an interpretation of classical Frege structures (cf. Aczel [1], Beeson [3], and Hayashi and Kobayashi [27]). For detailed background on the type of truth theories considered here, see Cantini [6, 7] and Kahle [30, 31]. Of course, the work on axiomatic truth over combinatory logic is also strongly related to corresponding work in the area of arithmetical truth theories, see e.g. Feferman [15, 21], Friedman and Sheard [25], and Halbach [26]. The focus of the present paper is to discuss various weak (positive) truth theories and systems of explicit mathematics as well as their mutual relationship. Namely we will address two families of theories, capturing the primitive recursive and polynomial time computable functions, respectively. We will see that the truth theories can interpret corresponding systems of explicit mathematics very directly, whereas reverse embeddings of truth theories into explicit mathematics are more elaborate and require additional assumptions. A further novelty of this paper is the introduction of a natural feasible truth theory TPT , whose provably total operations are the polynomial time computable ones, as is shown in Eberhard [12]. TPT can only reflect initial segments of the class W of binary words, but features unrestricted truth induction; it is obtained as a natural restriction of a truth theory TPR of the strength of primitive recursive arithmetic. Moreover, TPT can interpret very expressive feasible systems of explicit mathematics. We conclude the introduction with a detailed outline of the paper. In Section 2 we will introduce the basic applicative framework which is common to all systems studied in this paper. Section 3 presents the two central truth theories of this paper, TPR and TPT . The first one was previously introduced in Cantini [7, 10]. Both systems rely on a form of positive truth and embody truth induction. 
Whereas TPR can reflect the whole predicate W of binary words, TPT only reflects initial segments. In Section 4 we will present two natural systems of explicit mathematics of polynomial and primitive recursive strength, respectively: the system PETJ of Spescha and Strahm [36, 37, 38] and a natural explicit system EPCJ; both of these frameworks are direct subsystems of Feferman’s EM0 plus the join principle (cf. [17, 19]). For the embedding of truth theories into explicit mathematics, further principles will be needed, for example, the existence of universes, and Cantini’s uniformity principle. Section 5 will be devoted to mutual embeddings of weak truth theories and systems of explicit mathematics. Firstly, we will see that PETJ and EPCJ are very directly contained in TPT and TPR , respectively. The reverse embeddings are more difficult: (i) for the direct embedding of TPR into EPCJ we assume the existence of a universe and the uniformity principle; (ii) the reduction of TPT to PETJ proceeds via an intermediate leveled truth theory, which in turn can be directly modeled in an extension of PETJ by universes. In Section 6 we turn
to an extended discussion of the proof theory of the systems considered in this paper; this includes the review of some known results and a discussion of work in preparation, namely Eberhard's novel realizability interpretation of T_PT, which also yields that the extensions of the systems of explicit mathematics mentioned before do not raise the proof-theoretic strength. We conclude this article with an outlook on future work, namely the application of the feasible truth theory T_PT in order to obtain proof-theoretic upper bounds for the unfolding of schematic systems of feasible arithmetic.
2 The basic applicative framework
The theories of truth and explicit mathematics studied in this paper are based on a common applicative base theory. It includes the axioms for a partial or total combinatory algebra and a basic data type of binary words.
2.1 The applicative language L
Our basic language L is a first order language for the logic of partial terms which includes:
• variables a, b, c, x, y, z, u, v, f, g, h, . . .
• constants k, s, p, p0, p1, dW, ε, s0, s1, pW, c⊆, ∗, ×
• relation symbols = (equality), ↓ (definedness), W (binary words)
• arbitrary term application ◦
The meaning of the constants will become clear in the next paragraph. The terms (r, s, t, p, q, . . . ) and formulas (A, B, C, . . . ) of L are defined in the expected manner. We assume the following standard abbreviations and syntactical conventions:

\begin{align*}
t_1 t_2 \ldots t_n &:= (\ldots(t_1 \circ t_2) \circ \cdots \circ t_n)\\
s(t_1, \ldots, t_n) &:= s\,t_1 \ldots t_n\\
t_1 \simeq t_2 &:= t_1{\downarrow} \vee t_2{\downarrow} \to t_1 = t_2\\
\langle t \rangle &:= t\\
\langle t_1, \ldots, t_{n+1} \rangle &:= p\,\langle t_1, \ldots, t_n \rangle\, t_{n+1}\\
t \in W &:= W(t)\\
t : W^k \to W &:= (\forall x_1 \ldots x_k \in W)\; t x_1 \ldots x_k \in W\\
s \le t &:= c_{\subseteq}(1{\times}s, 1{\times}t) = 0\\
s \le_W t &:= s \le t \wedge s \in W
\end{align*}
In the following we often write A[~x] in order to indicate that the variables ~x = x1 , . . . , xn may occur free in A. Finally, let us write w for the canonical closed L term denoting the binary word w ∈ W.
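To make the abbreviations s ≤ t and s ≤_W t concrete, here is a minimal Python sketch that models binary words as strings over {0, 1}. It assumes that c⊆ returns 0 exactly when its first argument is an initial subword of the second, and that x×y repeats x once per symbol of y; the function names are illustrative only and are not part of L.

```python
def is_word(w):
    """W(w): w is a finite string over the alphabet {0, 1}."""
    return isinstance(w, str) and set(w) <= {"0", "1"}

def times(x, y):
    """Word multiplication x×y: x concatenated with itself once per symbol of y."""
    return x * len(y)

def c_sub(x, y):
    """Initial subword test; the value 0 codes 'x is an initial subword of y' (assumption)."""
    return 0 if y.startswith(x) else 1

def leq(s, t):
    """s ≤ t, unfolded as c_sub(1×s, 1×t) = 0: s is at most as long as t."""
    return c_sub(times("1", s), times("1", t)) == 0

def leq_w(s, t):
    """s ≤_W t: s is a word and s ≤ t."""
    return is_word(s) and leq(s, t)
```

Under this reading, s ≤ t compares lengths only: 1×s and 1×t are blocks of ones, and one block is an initial subword of the other exactly when it is not longer.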
2.2 The basic theory of operations and words B
The applicative base theory B has been introduced in Strahm [40, 41]. Its logic is the classical logic of partial terms due to Beeson [2, 3]. The non-logical axioms of B include:
• partial combinatory algebra: k xy = x, s xy↓ ∧ s xyz ≃ xz(yz)
• pairing p with projections p0 and p1
• defining axioms for the binary words W with ε, the binary successors s0, s1 and the predecessor pW
• definition by cases dW on W
• initial subword relation c⊆
• word concatenation ∗, word multiplication ×¹
These axioms are fully spelled out in [40, 41]. Below we will be mainly interested in extensions of B by the axioms of totality of application and extensionality of operations:

Totality of application:
(Tot) (∀x)(∀y)(xy↓)

Extensionality of operations:
(Ext) (∀f)(∀g)[(∀x)(f x ≃ gx) → f = g]

Observe that in the presence of the totality axiom, the logic of partial terms reduces to ordinary classical predicate logic. In the following we write B+ for the extension of B by (Tot) and (Ext).

¹ x×y signifies the length of y fold concatenation of x with itself; note that we write x×y instead of ×xy.
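The two combinator equations for k and s can be tried out in any language with first-class functions. The sketch below models them as curried Python functions; it only illustrates the equations in a total setting and does not model definedness (↓) or partial application. The helper names add and double are ours.

```python
# Curried combinators; the application t1 t2 is modelled as the call t1(t2).
k = lambda x: lambda y: x                      # k x y = x
s = lambda x: lambda y: lambda z: x(z)(y(z))   # s x y z = x z (y z)

# s k k behaves as the identity, since s k k z = k z (k z) = z.
identity = s(k)(k)

# Two sample operations used to exercise s.
add = lambda a: lambda b: a + b
double = lambda a: 2 * a
```

For instance, s(add)(double)(3) unfolds to add(3)(double(3)), in line with the equation for s.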
Various extensions of B or B+ by suitable induction principles on W have been proposed in the past. Most relevant for the systems studied in this article are the theories PT and PR, cf. Strahm [41]. The former includes a form of bounded induction, namely Σ^b_W induction, whereas the latter features induction for arbitrary positive formulas.

Let us turn to the crucial consequences of the axioms about a partial combinatory algebra. For proofs of these standard results, the reader is referred to Beeson [3] or Feferman [17].

Lemma 1 (Explicit definitions and fixed points).
1. For each L term t there exists an L term (λx.t) so that B ⊢ (λx.t)↓ ∧ (λx.t)x ≃ t
2. There is a closed L term fix so that B ⊢ fixg↓ ∧ fixgx ≃ g(fixgx)
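One standard way to obtain a term (λx.t) as in part 1 is bracket abstraction. The sketch below is an illustration only, under the assumption that terms are atoms (Python strings) or applications (pairs); it uses k a only for atomic a, so every abstraction has one of the shapes s k k, k a or s a b. It need not coincide with the exact definition used in [3] or [17], and the names abstract and evaluate are ours.

```python
K, S = "k", "s"   # constant names; variables must not be called "k" or "s"

def abstract(x, t):
    """Return a term built from k and s that behaves like t when applied to x."""
    if t == x:
        return ((S, K), K)                           # λx.x := s k k
    if isinstance(t, str):
        return (K, t)                                # λx.a := k a for an atom a other than x
    t1, t2 = t
    return ((S, abstract(x, t1)), abstract(x, t2))   # λx.(t1 t2) := s (λx.t1) (λx.t2)

def evaluate(t, env):
    """Interpret a term in a total model where k and s are curried functions."""
    if t == K:
        return lambda a: lambda b: a
    if t == S:
        return lambda a: lambda b: lambda c: a(c)(b(c))
    if isinstance(t, str):
        return env[t]
    t1, t2 = t
    return evaluate(t1, env)(evaluate(t2, env))

# Example: abstract x from the term f x x.
term = (("f", "x"), "x")
lam = abstract("x", term)
```

Evaluating lam with f interpreted as curried addition and applying the result to 3 gives the same value as f 3 3, which is the content of (λx.t)x ≃ t in this toy model.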
Let us quickly remind the reader of two standard models of B, namely the recursion-theoretic model PRO and the term model M(λη). For an extensive discussion of many more models of the applicative basis, the reader is referred to Beeson [3] and Troelstra and van Dalen [43]. Example 2 (Recursion-theoretic model PRO). Take the universe of binary words W = {0, 1}∗ and interpret application ◦ as partial recursive function application in the sense of ordinary recursion theory. Example 3 (The open term model M(λη)). Take the universe of open λ terms and consider the usual reduction of the extensional untyped lambda calculus λη, augmented by suitable reduction rules for the constants other than k and s. Interpret application as juxtaposition. Two terms are equal if they have a common reduct and W denotes those terms that reduce to a “standard” word w. Note that M(λη) satisfies both (Tot) and (Ext).
2.3 Provably total functions
We intend to measure the proof-theoretic strength of all the systems treated in this article by ascertaining their provably total functions. In the following let L be a language extending our first-order language L. The notion of a provably total function is introduced for an arbitrary L theory Th.
Definition 4. A function F : W^n → W is called provably total in an L theory Th, if there exists a closed L term t_F such that (i) Th ⊢ t_F : W^n → W and, in addition, (ii) Th ⊢ t_F w_1 · · · w_n = F(w_1, . . . , w_n) for all w_1, . . . , w_n in W.

The notion of a provably total word function is divided into two conditions (i) and (ii). The first condition (i) expresses that t_F is a total operation from W^n to W, provably in the L theory Th. Condition (ii), on the other hand, claims that t_F indeed represents the given function F : W^n → W, for each fixed tuple of words w⃗ in W^n. To give an example, the provably total functions of the above-mentioned theories PT and PR are the polynomial time computable and primitive recursive functions, respectively.
3 Positive truth
Theories of truth contain a predicate T that mimics the properties of truth. The axiomatization of this predicate relies on a coding mechanism for formulas. In the applicative framework, we code formulas using new constants designating logical operations. In the weak theories of truth discussed in this paper, the Tarski biconditionals hold only for positive formulas. Therefore no liar paradoxes occur. In the following we will introduce two weak truth theories TPR and TPT . The theories will be presented simultaneously since their axioms differ only slightly.
3.1 The language L_T of positive truth
The (first order) language L_T is an extension of the language L by
• a new unary predicate symbol T for truth
• new individual constants =̇, Ẇ, ∧̇, ∨̇, ∀̇, ∃̇
The new constants allow only the coding of positive formulas since we do not add a constant ¬̇ to code negation. As usual, we will use infix notation for =̇, ∧̇ and ∨̇.
3.2 Two theories of positive truth
All truth theories considered in this article are based on the applicative theory B+. Accordingly, their underlying logic is simply first order classical predicate logic. The truth axioms for T_PR and T_PT differ only in the Tarski biconditionals which are available for the word predicate W. The truth axioms for T_PR spell out the expected clauses according to the compositional semantics of truth as follows.

Compositionality:

\begin{align*}
T(x \mathbin{\dot{=}} y) &\leftrightarrow x = y && (\dot{=})\\
T(\dot{W}x) &\leftrightarrow W(x) && (\dot{W}_{PR})\\
T(x \mathbin{\dot{\wedge}} y) &\leftrightarrow T(x) \wedge T(y) && (\dot{\wedge})\\
T(x \mathbin{\dot{\vee}} y) &\leftrightarrow T(x) \vee T(y) && (\dot{\vee})\\
T(\dot{\forall} f) &\leftrightarrow (\forall z)\,T(fz) && (\dot{\forall})\\
T(\dot{\exists} f) &\leftrightarrow (\exists z)\,T(fz) && (\dot{\exists})
\end{align*}

For the feasible theory T_PT, we use instead of (Ẇ_PR) the following axiom:

\begin{align*}
x \in W \to \bigl(T(\dot{W}xy) \leftrightarrow y \le_W x\bigr) && (\dot{W}_{PT})
\end{align*}

In contrast to (Ẇ_PR), it allows only the reflection of initial segments of the set of words. Both theories contain unrestricted truth induction.

Truth induction:

\begin{align*}
T(r\varepsilon) \wedge (\forall x \in W)\bigl(T(rx) \to T(r(s_0 x)) \wedge T(r(s_1 x))\bigr) \to (\forall x \in W)\,T(rx)
\end{align*}
Next we would like to determine the classes of formulas for which the Tarski truth biconditionals hold. In the case of T_PT this is the class of so-called simple formulas, which are patterned after similar classes of formulas in explicit mathematics, see [34, 37] and the next section of this paper.

Definition 5 (Simple formulas). Let A be a positive L_T formula and u be a variable not occurring in A. Then the formula A^u which is obtained by replacing each subformula of the form t ∈ W of A by t ≤_W u is called simple.

Next we define coding operations for T_PR and T_PT which map the positive, respectively the simple formulas to their codes.

Definition 6. For each positive formula A of L_T we inductively define a term [A] whose free variables are exactly the free variables of A:

\begin{align*}
[t = s] &:= t \mathbin{\dot{=}} s\\
[T(t)] &:= t\\
[s \in W] &:= \dot{W}s\\
[A \wedge B] &:= [A] \mathbin{\dot{\wedge}} [B]\\
[A \vee B] &:= [A] \mathbin{\dot{\vee}} [B]\\
[(\forall x)A] &:= \dot{\forall}(\lambda x.[A])\\
[(\exists x)A] &:= \dot{\exists}(\lambda x.[A])
\end{align*}
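Definition 7. For each simple formula A^u of L_T we inductively define a term ⟨A⟩ whose free variables are exactly the free variables of A:

\begin{align*}
\langle t = s \rangle &:= t \mathbin{\dot{=}} s\\
\langle T(t) \rangle &:= t\\
\langle s \le_W u \rangle &:= \dot{W}us\\
\langle A \wedge B \rangle &:= \langle A \rangle \mathbin{\dot{\wedge}} \langle B \rangle\\
\langle A \vee B \rangle &:= \langle A \rangle \mathbin{\dot{\vee}} \langle B \rangle\\
\langle (\forall x)A \rangle &:= \dot{\forall}(\lambda x.\langle A \rangle)\\
\langle (\exists x)A \rangle &:= \dot{\exists}(\lambda x.\langle A \rangle)
\end{align*}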
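The two coding operations are plain structural recursions, which the following Python sketch makes explicit. It assumes a hypothetical representation of formulas as tagged tuples and prints codes as prefix strings, with eq_dot, W_dot, and_dot, or_dot, all_dot and ex_dot standing for the dotted constants; none of these names come from the text. The function relativize performs the replacement of t ∈ W by t ≤_W u from the definition of simple formulas, without checking that u is fresh.

```python
def code(formula):
    """Code of a positive or simple formula, as a prefix term string."""
    tag = formula[0]
    if tag == "eq":                      # t = s
        _, t, s = formula
        return f"(eq_dot {t} {s})"
    if tag == "T":                       # T(t) is coded by t itself
        return formula[1]
    if tag == "W":                       # s ∈ W
        return f"(W_dot {formula[1]})"
    if tag == "leqW":                    # s ≤_W u is coded with the bound u first
        _, s, u = formula
        return f"(W_dot {u} {s})"
    if tag in ("and", "or"):
        _, a, b = formula
        return f"({tag}_dot {code(a)} {code(b)})"
    if tag in ("all", "ex"):             # quantifiers are coded via λ-abstraction
        _, x, a = formula
        return f"({tag}_dot (lam {x}. {code(a)}))"
    raise ValueError(f"not a positive formula: {formula!r}")

def relativize(formula, u):
    """Replace each subformula t ∈ W by t ≤_W u (u is assumed not to occur in formula)."""
    tag = formula[0]
    if tag == "W":
        return ("leqW", formula[1], u)
    if tag in ("and", "or"):
        return (tag, relativize(formula[1], u), relativize(formula[2], u))
    if tag in ("all", "ex"):
        return (tag, formula[1], relativize(formula[2], u))
    return formula

# Example: A is (∃x)(x ∈ W ∧ x = y).
A = ("ex", "x", ("and", ("W", "x"), ("eq", "x", "y")))
```

Here code(A) corresponds to [A], and code(relativize(A, "u")) corresponds to the code of the simple formula A^u.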
We have that λx.[A], respectively λx.⟨A⟩ can be interpreted as the propositional function defined by the formula A. For both theories of truth, the Tarski biconditionals can be proved for the positive, respectively simple formulas.

Lemma 8 (Biconditionals for T_PR). Let A be a positive L_T formula. We have T_PR ⊢ T([A]) ↔ A

Lemma 9 (Biconditionals for T_PT). Let A^u be a simple L_T formula. We have T_PT ⊢ u ∈ W → (T(⟨A^u⟩) ↔ A^u)
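As a small worked instance of the biconditional for T_PR, take the positive formula A ≡ (∃x)(x ∈ W ∧ x = y); this choice of A is ours, for illustration only. Unfolding the coding clauses and applying the compositional axioms one at a time gives:

```latex
\begin{align*}
[A] &= \dot{\exists}\bigl(\lambda x.\,(\dot{W}x) \mathbin{\dot{\wedge}} (x \mathbin{\dot{=}} y)\bigr)\\
T([A]) &\leftrightarrow (\exists z)\,T\bigl((\dot{W}z) \mathbin{\dot{\wedge}} (z \mathbin{\dot{=}} y)\bigr)
  && \text{by } (\dot{\exists}) \text{ and } \lambda\text{-abstraction}\\
&\leftrightarrow (\exists z)\bigl(T(\dot{W}z) \wedge T(z \mathbin{\dot{=}} y)\bigr)
  && \text{by } (\dot{\wedge})\\
&\leftrightarrow (\exists z)\bigl(W(z) \wedge z = y\bigr)
  && \text{by } (\dot{W}_{PR}) \text{ and } (\dot{=})
\end{align*}
```

The last line is A up to renaming of the bound variable. In T_PT the third step is not available as stated, which is why the word predicate has to be bounded by a variable u first.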
An interesting consequence of the biconditionals is a second recursion or fixed point theorem for positive, respectively simple predicates. This theorem can be obtained by lifting the fixed point theorem for combinatory logic (cf. Lemma 1) to the truth-theoretic language, cf. Cantini [6, 10].
4 Explicit mathematics
Types in explicit mathematics are collections of operations and must be thought of as being generated successively from preceding ones. They are represented by operations via a suitable naming relation

[...]

G(x), and we let H_n(x) = F(x), which will be F_t(x) since t > G(x). By the same argument, we get that Θ_n(y) = Φ_t(y) for all y. This ends the proof of the claim.

From the information given in the claim, we can compute the least modulus of the sequence using the µ-operator. We do not use that Ψ and G are continuous in this proof, but since the least modulus function of a convergent sequence from Ct(k) is continuous, we cannot expect to make use of this observation.

Theorem 18. Assume that {e}(Φ_1, . . . , Φ_n) ≃ a and that Φ_i = lim_{t→∞} Φ_{i,t} with modulus functions Ψ_i for i = 1, . . . , n. Assume further that {e}(Φ_{1,t}, . . . , Φ_{n,t}) ≃ a_t for each t ∈ N. Then a = lim_{t→∞} a_t, and we can compute the modulus uniformly in the data.

Proof. We use the recursion theorem to provide the uniform algorithm, and we use induction on the ordinal rank of the computation tree for {e}(Φ_1, . . . , Φ_n) in order to prove that the algorithm works. Our construction of the algorithm is by 9 cases, following S1 - S9. In the case of S8 we use Lemma 17 and the other cases are trivial in view of Observation 15. We leave the details for the reader.

Remark 19. The µ-operator is S1 - S9-computable. This is so, because the recursion theorem is available from S1 - S9. There are actually good reasons for replacing S9 with a scheme Sµ (see below) for the µ-operator. The main reason in this context is that there is hardly any need for the full enumeration scheme, the µ-operator seems always to be sufficient for the positive applications. Kleene's original motivation for introducing S1 - S9 was to investigate transfinite definability, e.g.
Dag Normann
generalizing the hyperarithmetical hierarchy to a hyperanalytical hierarchy. In this endeavor, S9 is essential. His motivations for introducing the countable functionals might be to understand S1 - S9-computability better. Thus, when we now see the continuous functionals as a mathematical structure of interest in itself, it may be that replacing S9 with Sµ gives us a more natural concept of computing. Bergstra [2] showed that S9 is strictly stronger than Sµ over the continuous functionals in the context of S1 - S8.

Sµ: If e = ⟨10, d⟩, then {e}(Φ_1, . . . , Φ_n) ≃ µx({d}(x, Φ_1, . . . , Φ_n) = 0) with the standard interpretation of µx for partial functions.
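The standard interpretation of µx for partial functions is unbounded search from 0 upwards. A minimal Python sketch, in which the name mu and the way parameters are passed are our own choices:

```python
def mu(d, *params):
    """Least x with d(x, *params) == 0.

    The search tries x = 0, 1, 2, ... in order, so it diverges if d never
    returns 0, or if d itself fails to terminate before a zero is reached.
    """
    x = 0
    while d(x, *params) != 0:
        x += 1
    return x
```

For example, mu(lambda x, n: 0 if x * x >= n else 1, 10) searches for the least x whose square is at least 10.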
5 Fixed point definability
In the previous section we showed that Kleene’s definition of computability can be defined and justified without any reference to representing associates or ideals, i.e. without introducing algorithms on approximations to the continuous functionals. We do know, however, that there are externally computable objects that are not Kleene computable. In this section we will define a class that we call fixed point definable functionals. We will discuss a few examples from the literature on the continuous functionals. It turns out that if we try to use fixed point constructions as a tool for defining new functionals, there is a serious risk for us to get too much. We will come back to this.
5.1 Examples
Since Platek's thesis [19], partial functionals and the least fixed point operator have been important tools in the computability theory of functionals. We will see, by considering some special and general examples, that we do not need the superstructure of partial functionals in order to construct fixed points. The simple idea, standard in topology, is that if ∆ : Ct(k) → Ct(k) is continuous with certain contraction properties, then we may find a fixed point as the limit of the iterated sequence ∆^n(F), where F ∈ Ct(k) is arbitrarily chosen. We aim at proving the existence of such fixed points avoiding domain theory or other means of representations, but of course, we will essentially give the same proofs when we imitate a theorem known from domain theory. One of the methods is bar induction, in its most crude form:

Definition 20. If s⃗ = (s_0, . . . , s_{k−1}) is a sequence from N, we let B_s⃗ = {f : N → N | ∀i < k (f(i) = s_i)}.
The Continuous Functionals as Limit Spaces
Let F ∈ Ct(2). Let T_F = {s⃗ | F is not constant on B_s⃗}. T_F will be a well founded tree, and by bar induction we will mean induction on the ordinal rank of this tree. Since the map F ↦ T_F is not even continuous, our version of bar induction is not constructive, and there is no corresponding form of bar recursion.

5.1.1 Gandy's Gamma-function
We define the functional Γ of pure type 3 by the equation

Γ(F) = F_0(λx ∈ N.Γ(F_{x+1})),

where F_x(f) = F(x∗f) and ∗ is concatenation. If we try to compute Γ using the recursion theorem, we simply get the everywhere undefined function. However, using the partial functionals, it is easy to see that there is an object satisfying this equation, there is a fixed point of the functional

∆_Γ(Φ)(F) = F_0(λn.Φ(F_{n+1})).

We will see that this can be established directly on the basis of our approach via limit spaces.

Claim 1. ∆_Γ has at most one fixed point.

Proof. If Γ_1 and Γ_2 are two fixed points, we see that for each object F = (F)_m of finite character we have that Γ_1(F) = Γ_2(F). This is proved by induction on the complexity of F, or by induction on m. If two continuous functions into N agree on a dense set, they are equal.

Claim 2. There exists a Γ ∈ Ct(3) that is a fixed point of ∆_Γ.

Proof. We can prove a much stronger statement. Let F = lim_{n→∞} G_n be a convergent sequence from Ct(2) and let Γ_0 be an arbitrary element of Ct(3). We see by bar induction on F that

lim_{n→∞} ∆_Γ^n(Γ_0)(G_n)

exists, and is independent of {G_n}_{n∈N}. Moreover, we may prove, also by bar induction, that there is a modulus for this sequence, independent of Γ_0. Thus Γ = lim_{n→∞} ∆_Γ^n(Γ_0) is well defined and continuous, and with some modulus function independent of Γ_0. The details of the argument are left for the reader.
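The defining equation of Γ can be run directly in a language where a function passed as an argument is only evaluated when it is called. The Python sketch below does this; it corresponds to the partial-functional reading of the equation rather than to a computation by S1 - S9, and it is only claimed to terminate for inputs F that query finitely many values of their argument. The helper names cons and shift are ours.

```python
def cons(x, f):
    """The sequence x*f: value x at 0, and f(n - 1) at n > 0."""
    return lambda n: x if n == 0 else f(n - 1)

def shift(F, x):
    """F_x, defined by F_x(f) = F(x*f)."""
    return lambda f: F(cons(x, f))

def gamma(F):
    """Gamma(F) = F_0(lambda x. Gamma(F_{x+1})); the inner calls run only on demand."""
    return shift(F, 0)(lambda x: gamma(shift(F, x + 1)))
```

If F is constant, the argument sequence is never inspected and gamma(F) returns at once. If F asks for position 1 of its argument, exactly one nested call gamma(shift(F, 1)) is triggered, and so on along the positions that F actually inspects.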
5.1.2 The fan functional
The fan functional discussed in the introduction will be the functional Φ ∈ Ct(3) defined by

Φ(F) is the least n such that whenever f and g in {0, 1}^N agree on the n first arguments 0, . . . , n − 1, then F(f) = F(g).

Φ is well defined because a continuous F will be uniformly continuous on the compact topological space {0, 1}^N, and it is easy to see that Φ is continuous, using that any sequence on {0, 1}^N will have a convergent subsequence. Tait [22] proved that Φ is not Kleene computable in the original sense, while Berger [1] showed that there is a partial continuous representation of Φ that is computable in the Scott sense. We will transform Berger's proof to a fixed point representation of Φ. In order to do so, we must first define a test function δ : Ct(2) → Ct(1) that for every F ∈ Ct(2) will check if F is constant on the Cantor space or not. Let

\delta(F) = \begin{cases}
0^\omega & \text{if } F(0^\omega) = F(0{*}\delta(F_0)) = F(1{*}0^\omega) = F(1{*}\delta(F_1))\\
1{*}\delta(F_1) & \text{if } F(0^\omega) = F(0{*}\delta(F_0)) = F(1{*}0^\omega) \neq F(1{*}\delta(F_1))\\
1{*}0^\omega & \text{if } F(0^\omega) = F(0{*}\delta(F_0)) \neq F(1{*}0^\omega)\\
0{*}\delta(F_0) & \text{if } F(0^\omega) \neq F(0{*}\delta(F_0))
\end{cases}

δ is the unique fixed point of the operator

\Delta_\delta(\eta)(F) = \begin{cases}
0^\omega & \text{if } F(0^\omega) = F(0{*}\eta(F_0)) = F(1{*}0^\omega) = F(1{*}\eta(F_1))\\
1{*}\eta(F_1) & \text{if } F(0^\omega) = F(0{*}\eta(F_0)) = F(1{*}0^\omega) \neq F(1{*}\eta(F_1))\\
1{*}0^\omega & \text{if } F(0^\omega) = F(0{*}\eta(F_0)) \neq F(1{*}0^\omega)\\
0{*}\eta(F_0) & \text{if } F(0^\omega) \neq F(0{*}\eta(F_0))
\end{cases}

It follows by induction on the rank of T_F restricted to 0 - 1-sequences that

lim_{n→∞} ∆_δ^n(δ_0)

exists, is independent of δ_0 and has a continuous modulus function for the sequence, independently of δ_0. We can then define the fan functional Φ, via the equation

\Phi(F) = \begin{cases}
0 & \text{if } F(0^\omega) = F(\delta(F))\\
\max\{\Phi(F_0), \Phi(F_1)\} + 1 & \text{otherwise,}
\end{cases}

as the fixed point of a total operator ∆_Φ. This fixed point will be the unique limit of sequences ∆_Φ^n(Ψ), where a continuous modulus function for {∆_Φ^n(Ψ)}_{n∈N} can be found independently of Ψ. This ends the construction.
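The equations for δ and Φ can be transliterated into Python if sequences are represented as functions that are evaluated on demand. The sketch below is such a transliteration under our own conventions (the names zeros, cons, shift, delta and fan are not from the text); it tests the four cases for δ in reverse order, which gives the same case distinction, and it makes no attempt at efficiency, since subcomputations are repeated freely.

```python
def zeros(n):
    """The constant zero sequence 0^omega."""
    return 0

def cons(b, f):
    """The sequence b*f: value b at 0, and f(n - 1) at n > 0."""
    return lambda n: b if n == 0 else f(n - 1)

def shift(F, b):
    """F_b, defined by F_b(f) = F(b*f)."""
    return lambda f: F(cons(b, f))

def delta(F):
    """A 0-1 sequence on which F differs from F(0^omega), if F is not constant on 0-1 sequences."""
    def seq(n):
        F0, F1 = shift(F, 0), shift(F, 1)
        base = F(zeros)
        if base != F(cons(0, delta(F0))):
            return cons(0, delta(F0))(n)   # witness found below the branch 0
        if base != F(cons(1, zeros)):
            return cons(1, zeros)(n)       # 1*0^omega is a witness
        if base != F(cons(1, delta(F1))):
            return cons(1, delta(F1))(n)   # witness found below the branch 1
        return 0                           # no witness: answer with 0^omega
    return seq

def fan(F):
    """Phi(F): 0 if F passes the constancy test, else one more than the maximum over F_0 and F_1."""
    if F(zeros) == F(delta(F)):
        return 0
    return max(fan(shift(F, 0)), fan(shift(F, 1))) + 1
```

For F(f) = f(2), for instance, the value of F is decided by the first three arguments and by no shorter initial segment, so fan returns 3.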
In the examples above, we have taken the liberty to define functionals of several typed variables as unique fixed points of basic computable operators ∆, and the ∆'s in question have to take functionals of mixed types as arguments in order for the constructions to make sense. We may, however, redefine each ∆ as an operator on some Ct(k), using the fact that the interpretation of a type is a computable retract of the interpretation of the pure type of the same level.
5.2 Fixed points
We will now use the experience from the examples considered.

Definition 21. Let ∆ : Ct(k) → Ct(k) be continuous. We say that ∆ is a contractor if
i) lim_{n→∞} ∆^n(F) exists in Ct(k) for all F ∈ Ct(k), and the limit is independent of F.
ii) There is a continuous modulus function for {∆^n(F)}_{n∈N} independently of F ∈ Ct(k).

Remark 22. If ∆ is of type Ct(k) → Ct(k) and k > 0, it makes perfect sense to say that ∆ is Kleene computable, since ∆ can be viewed as a function from Ct(k) × Ct(k − 1) to N.

Remark 23. The existence of a contractor does not make the space contractive in the standard sense of topology. Still, the intuition is that a contractor ∆ will bring any point closer to the unique fixed point, in the sense that, like we have in fixed point theory for metric spaces, any iteration of ∆ will bring any point as close to the fixed point as we like.

We will explore the following concept, a concept that seems natural from the point of view of the category of limit spaces:

Definition 24. We define the class of fixed point definable functionals as the least class of functionals closed under µ-recursion, that is S1 - S8, Sµ, and in addition satisfying:

Let ∆ : Ct(k) → Ct(k) be fixed point definable and a contractor. Then the unique fixed point of ∆ is fixed point definable.

We use the term definable, and not computable for the elements of this class, since there is no genuine mentioning of internal algorithms in the definition. In fact, there will be non computable fixed point definable functions, and in order to use the extraction of fixed points as a means for extending the class of computable functionals in the limit space approach, we must put further restrictions on the set of contractors that we consider.
5.3 The overflow
The class of fixed point definable functionals has far too liberal closure properties for our purposes. We prove a general theorem, and a consequence will be that the set of functions f : N → N that are fixed point definable will be closed under the jump operator.

Theorem 25. Let $\{\Phi_n\}_{n\in\mathbb{N}}$ be a fixed point definable sequence from Ct(k) and assume that $\Phi = \lim_{n\to\infty} \Phi_n$.
Then Φ is fixed point definable.

Proof. If k = 0, there is nothing to prove, so we let k > 0, and we let x range over Ct(k − 1). We will show that the least modulus function for the sequence will be fixed point definable, and thus Φ will be so. If $\{a_i\}_{i\in\mathbb{N}}$ is a sequence from N, we let $\mathrm{mod}_n(\{a_i\}_{i\in\mathbb{N}})$ be the least number m such that $\forall i\,(m \le i \le n \Rightarrow a_i = a_n)$. This approximation to the modulus will of course be bounded by n. Now let ξ : N × Ct(k − 1) → N. We define ∆ by

$$\Delta(\xi)(n, x) = \mathrm{mod}_{\max\{n+1,\ \xi(i,x) \,\mid\, i \le n+1\}}\bigl(\{\Phi_i(x)\}_{i\in\mathbb{N}}\bigr).$$
Let Ψ be the minimal modulus functional for the given sequence. For any x, ξ and n ≥ Ψ(x) we see that ∆(ξ)(n, x) = Ψ(x). It then follows by reversed recursion that if n < Ψ(x) we will, for any ξ, have that $\Delta^{\Psi(x)-n}(\xi)(n, x) = \Psi(x)$. As a consequence $\Delta^m(\xi)(n, x) = \Psi(x)$ whenever m ≥ Ψ(x), so ∆ is a contractor with $\lambda(n, x).\Psi(x)$ as its unique fixed point.
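The behaviour of the operator in this proof can be observed on a toy instance. The following Python sketch is only an illustration under simplifying assumptions: the argument x is fixed and suppressed, the sequence Φ_i(x) is replaced by a hypothetical eventually constant number sequence `a`, and the names `mod`, `delta` and `iterate` are ours, not the paper's.

```python
from functools import lru_cache

def mod(n, a):
    """Least m such that a(i) == a(n) for all i with m <= i <= n."""
    m = n
    while m > 0 and a(m - 1) == a(n):
        m -= 1
    return m

def delta(xi, a):
    """One application of the operator with x suppressed:
    delta(xi)(n) = mod_N(a), where N = max{n+1, xi(i) | i <= n+1}."""
    @lru_cache(maxsize=None)  # avoid recomputing values of nested iterates
    def new_xi(n):
        bound = max([n + 1] + [xi(i) for i in range(n + 2)])
        return mod(bound, a)
    return new_xi

def iterate(xi, a, times):
    """Apply delta `times` times, starting from an arbitrary xi."""
    for _ in range(times):
        xi = delta(xi, a)
    return xi

# Hypothetical toy sequence: 0, 1, 2, 3, 4, 5, 5, 5, ...
# Its least modulus (the analogue of Psi(x)) is 5.
def a(i):
    return min(i, 5)

psi = iterate(lambda n: 0, a, 5)
print([psi(n) for n in range(8)])
```

In this toy case five iterations suffice from the starting point ξ = 0, in line with the bound m ≥ Ψ(x) in the proof, and a starting point with large values reaches the constant after a single step. Note that each iterate inspects the sequence further out than the previous one, which is the feature the text goes on to criticize.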
In order to separate the good examples from this general construction, we notice that with the ∆ constructed above, the iteration requires more and more information from the original input, in the sense that if we try to use a partial ξ, the whole construction of the fixed point will be ruined. It would of course be nice to have a characterization of when a contractor is good that does not refer to external representations of the total objects, but we must leave
that for further research. For us, this is the time to realize that in order to give a general treatment of when a contractor can be used to define functionals that are in a sense computable, we need the underlying representations, and we will stick to domain representations.
6 Domain representations
Fortunately, we do not need to introduce the full machinery of domain theory. We will define what will correspond to the compact elements of the domain representations of these functionals, and then represent the functionals directly as ideals of compact elements. This is close to the original construction due to Kreisel [10]. Historically, Kleene [9] defined the continuous functionals using associates, and Kreisel [10] defined them using filters of formal neighborhoods. Ershov [3] characterized the functionals essentially using domain theory, and Scarpellini [20] characterized them using limit spaces. Our exposition is based on the fact that both Ershov and Scarpellini characterized the functionals defined by Kleene and by Kreisel, and we will not give specific reference to claims following from this fact.

Recall that N⊥ is the set of integers plus an element ⊥ for the “undefined”, and that N⊥ is ordered in a flat way, with ⊥ ⊑ a for all a, while the integers are not ordered among themselves.

Definition 26. We let D(0) = N⊥ considered as a partial ordering. By recursion, we let D(k + 1) consist of all monotone functions p : D(k) → N⊥ with finite support. We use the pointwise ordering to order D(k + 1).

Each D(k) will be countable, due to the restriction to functionals with finite support, and there will be an enumeration of the elements such that the following are primitive recursive:
i) p ⊑ q.
ii) {p, q} is bounded.
iii) r is the least upper bound of p and q.
When a finite set in D(k) is pairwise bounded, there will be a least upper bound of the whole set. An ideal in D(k) will be a set of pairwise bounded elements closed downwards and under least upper bounds of finite subsets. If α is an ideal in D(k + 1) and β is an
ideal in D(k), we let α(β) be the least upper bound of the elements p(q) ∈ N⊥ for p ∈ α and q ∈ β.

Definition 27. The ideal {⊥, a} will represent the number a. An ideal α in D(k + 1) represents a function F : Ct(k) → N if whenever β is an ideal that represents x ∈ Ct(k) then α(β) = F(x).

Ct(k + 1) will be exactly the set of functions that have ideal representations. The proof is not particularly hard, but it is too long for this paper. Actually, the set ID(k) of all ideals in D(k), with the inclusion ordering, is isomorphic to the Scott-continuous partial functionals of type k, and the ideals representing elements in Ct(k) will correspond to the hereditarily total objects.
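At the base level these notions are concrete enough to write down directly. The following Python sketch covers only D(0) = N⊥ with the flat ordering; it encodes ⊥ as `None`, and the function names are our own choices for illustration, not part of the paper.

```python
BOT = None  # stands for the undefined element of the flat domain

def leq(p, q):
    """Flat ordering: BOT is below everything, numbers only below themselves."""
    return p is BOT or p == q

def bounded(p, q):
    """Two elements have a common upper bound iff one is BOT or they are equal."""
    return p is BOT or q is BOT or p == q

def lub(p, q):
    """Least upper bound of a bounded pair."""
    if not bounded(p, q):
        raise ValueError("no upper bound")
    return q if p is BOT else p

def ideal_of(a):
    """The ideal {BOT, a} representing the number a."""
    return frozenset({BOT, a})

def is_ideal(s):
    """Pairwise bounded, closed downwards, closed under finite least upper bounds."""
    if BOT not in s:  # downward closure (and the lub of the empty set) forces BOT
        return False
    if not all(bounded(p, q) for p in s for q in s):
        return False
    return all(lub(p, q) in s for p in s for q in s)

print(is_ideal(ideal_of(3)), is_ideal(frozenset({BOT, 1, 2})))
```

The only ideals at this level are {⊥} and the sets {⊥, a}; a set containing two distinct numbers fails to be pairwise bounded. At higher levels the same three primitives (ordering, boundedness, least upper bound) would be computed pointwise on finite tables, which this sketch does not attempt.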
Definition 28. Let Φ be a continuous functional. We say that Φ is externally computable if Φ is represented by a computably enumerable ideal of compacts.

We use the expression “externally” to express that this concept of being computable is imposed on the structure from the outside. µ-recursion and S1 - S9 computability are internal concepts, because they grow out of the structure itself. Kleene’s schemes S1 - S9 make sense also for the typed structure $\{\mathrm{ID}(k)\}_{k\in\mathbb{N}}$, and if we use the recursion theorem to find an index for the fixed points used to define Γ or the fan functional, we actually get an index for computing a representative for these total functionals within this superstructure. This shows that the interpretations of S1 - S9 are not absolute. One further reason to restrict the attention to µ-recursion is that with this restriction the interpretations will be absolute. Normann and Rørdam [18] proved

Theorem 29. Let Φ ∈ Ct(k) have a representative α that is µ-recursive in the sense of the partial continuous functionals. Then Φ is µ-recursive in the sense of the Ct(k)-hierarchy, using the same algorithm.

S1 - S9 is suitable for investigating the computability theory of the continuous functionals, but when we consider the superstructure of partial continuous functionals we actually add objects that are completely irrelevant for the interpretation of S1 - S9. There are elements in D(k) that will not be in any S1 - S9-definable functional, total or partial. Computer scientists say that the typed structure of partial continuous functionals is not fully abstract for S1 - S9 (or PCF, the programming language they prefer instead). In particular, the set of primitive recursive functionals is not a dense set in ID(k) for k ≥ 2.

Definition 30. We define the relation p ≺ x where p ∈ D(k) and x ∈ Ct(k) as follows:
If p ∈ N⊥ and x ∈ N we let p ≺ x if p ⊑ x. If p ∈ D(k + 1) and x ∈ Ct(k + 1) we let p ≺ x if whenever q ∈ D(k), p(q) = a and q ≺ y ∈ Ct(k) we have that x(y) = a.

The classical density theorem actually tells us that uniformly in p ∈ D(k) there is a primitive recursive x ∈ Ct(k) such that p ≺ x. We will give a guide to a simple argument:
• If we replace N with $N_n = \{0, \ldots, n\}$ we may construct both the partial functionals $D_n(k)$ of type k and the total functionals $\mathrm{Fin}_n(k)$ of type k, and both type structures will be finite at each level.
• We may then restrict the definition of ≺ to this situation, and with ease prove that for any $p \in D_n(k)$ there is an $x \in \mathrm{Fin}_n(k)$ such that p ≺ x.
• Using the canonical embedding $e_{k,n}$ of $\mathrm{Fin}_n(k)$ into Ct(k) as defined in Section 3, we get $p \prec e_{k,n}(x)$.

If ∆ : Ct(k) → Ct(k) is continuous, there will be a $\tilde{\Delta}$ : ID(k) → ID(k) representing ∆. If the least fixed point of $\tilde{\Delta}$ is hereditarily total, it will represent a unique fixed point for ∆. We actually have

Lemma 31. If $\tilde{\Delta}$ : ID(k) → ID(k) represents a total ∆ : Ct(k) → Ct(k) and the least fixed point of $\tilde{\Delta}$ is hereditarily total, then ∆ is a contractor.

Proof. Let $p_n = \tilde{\Delta}^n(\bot_k)$ where $\bot_k$ is the everywhere undefined object in ID(k). For every Ψ ∈ Ct(k), $\Delta^n(\Psi)$ will have a representative extending $p_n$. The lemma then follows from the fact that $\{p_n\}_{n\in\mathbb{N}}$ converges to a total object, and that the modulus function for this sequence will be continuous.

When we define the fixed point computable functionals, we will define them as a class of pairs $(F, \tilde{F})$ where $\tilde{F}$ is a domain representative of F, F is fixed point definable and $\tilde{F}$ is defined simultaneously over the ID(k)-hierarchy using the least fixed point operator. Since the least fixed point operator is continuous, this will ensure that all fixed point computable functionals are continuous.

Definition 32. We let the fixed point computable pairs be the least family of pairs $(\Phi, \tilde{\Phi})$, where Φ ∈ Ct(k) for some k and $\tilde{\Phi}$ is a representative in ID(k) for Φ, such that the class is simultaneously closed under S1 - S8 and Sµ, and such that the following holds: If $(\Delta, \tilde{\Delta})$ is a fixed point computable pair of type Ct(k) → Ct(k) such that the least fixed point of $\tilde{\Delta}$ is total, then $(\Phi, \tilde{\Phi})$ is fixed point computable, where Φ is the unique fixed point of ∆ and $\tilde{\Phi}$ is the least fixed point of $\tilde{\Delta}$.
We say that Φ ∈ Ct(k) is fixed point computable if there is a representative $\tilde{\Phi}$ such that the pair is fixed point computable.

It is more or less a consequence of the definition that every fixed point computable functional is externally computable. Our final result will be that the converse is also true: every externally computable functional will be fixed point computable. Our argument will be an adjustment of the proof in Normann [16], and one reason for including it here is to give an updated exposition of the proof. We do this by separating out a lemma that requires the key idea behind the proof.

Lemma 33. Let f : N → N be computable. Let $T \subseteq \mathrm{Ct}(k) \times \mathbb{N}^2$ be a fixed point computable predicate such that for all φ ∈ Ct(k) and all $n_1$ and $n_2$, if $\forall m\, T(\varphi, n_1, m) \wedge \forall m\, T(\varphi, n_2, m)$ then $f(n_1) = f(n_2)$. Assume further that the corresponding predicate $\tilde{T}$ satisfies that for all total α ∈ ID(k) there are infinitely many n such that $\tilde{T}(\alpha, n, \bot)$. Finally assume
(∗) It is decidable when $\xi^k_i$, i.e. the i’th element of the countable dense subset of Ct(k), satisfies $\forall m\, T(\xi^k_i, n, m)$.
Let Φ(φ) = f(n) when $\forall m\, T(\varphi, n, m)$. Then Φ is fixed point computable.

Proof. Let Ψ : Ct(k) × N → N, φ ∈ Ct(k) and n ∈ N. Define ∆(Ψ)(φ, n) by the following algorithm:
- Search for the least m such that 1. or 2. below holds:
  1. ¬T(φ, n, m)
  2. T(φ, n, m), m > n and T(φ, m, Ψ(φ, m)).
- In case of 1., let ∆(Ψ)(φ, n) = m, while case 2. splits into two sub cases:
  - If f(n) = f(m) we let ∆(Ψ)(φ, n) = m.
  - If f(n) ≠ f(m) we search for the least t such that ¬T(φ, n, t) and let ∆(Ψ)(φ, n) = t.
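The search procedure defining ∆(Ψ)(φ, n) can be sketched in Python. This is an illustration only: T, f and Ψ are passed in as ordinary Python functions, the toy predicate below is a hypothetical stand-in that ignores φ, and the `fuel` bound is our addition so that a non-terminating search raises an error instead of looping forever.

```python
def make_delta(T, f, fuel=10_000):
    """Build the operator Psi -> delta(Psi) following the two-case search."""
    def delta(Psi):
        def new_psi(phi, n):
            for m in range(fuel):
                if not T(phi, n, m):                  # case 1
                    return m
                if m > n and T(phi, m, Psi(phi, m)):  # case 2 (T(phi, n, m) holds here)
                    if f(n) == f(m):
                        return m
                    for t in range(fuel):             # f(n) != f(m): look for a refutation
                        if not T(phi, n, t):
                            return t
                    raise RuntimeError("final search exceeded fuel")
            raise RuntimeError("main search exceeded fuel")
        return new_psi
    return delta

# Hypothetical toy data: n is "good" (T holds for all m) iff n is a multiple of 3;
# for other n the predicate is refuted at m = 2. f is constant on the good n.
def T(phi, n, m):
    return n % 3 == 0 or m < 2

def f(n):
    return 7 if n % 3 == 0 else n

delta = make_delta(T, f)
psi = delta(lambda phi, m: 2)   # an input that already knows the refutation point
print([psi(None, n) for n in range(4)])
```

With an uninformed input such as the constant 0, the final search for n = 0 never succeeds in this toy case, so the sketch raises `RuntimeError`. This mirrors the remark in the proof that ∆ as defined need not be total.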
We let $\tilde{\Delta}$ be the corresponding functional on the partial objects. As defined, ∆ need not be total, since the final search may go on forever. We will show that the least fixed point of $\tilde{\Delta}$ is total, and then we will see how ∆ can be adjusted to a total $\Delta_1$ with the same least fixed point.

Let $\psi_0$ be the ⊥-element of the type k × 0 → 0 and let $\psi_{t+1} = \tilde{\Delta}(\psi_t)$. Let $\psi = \bigsqcup_{t\in\mathbb{N}} \psi_t$. Let α ∈ ID(k) be a representative for φ ∈ Ct(k). Let $X = \{n \in \mathbb{N} \mid \tilde{T}(\alpha, n, \bot)\}$ and let a = f(n) for n ∈ X. Then Φ(φ) = a. We will show that for all n, ψ(α, n) ∈ N and if $\tilde{T}(\alpha, n, \psi(\alpha, n))$ then f(n) = a. This will not only ensure that ψ is the representative of a fixed point computable Ψ, but also that we can compute Φ by Φ(φ) = f(n) for the least n such that T(φ, n, Ψ(φ, n)).

We will prove, simultaneously for all n and by induction on t > 0, that if n + t ∈ X then
- $\psi_t(\alpha, n) \in \mathbb{N}$
- If $\tilde{T}(\alpha, n, \psi_t(\alpha, n))$ then f(n) = a.
So assume that the claim holds for all t′ with 0 < t′ < t and let n + t ∈ X. Let us consider the algorithm for $\tilde{\Delta}(\psi_{t-1})(\alpha, n)$.
- First we search up to n for an m such that $\neg\tilde{T}(\alpha, n, m)$, and if we find one, we let $\psi_t(\alpha, n)$ be that value. In that case there is nothing more to prove.
- If we do not find such m ≤ n we continue to search, and this search will stop for three possible reasons:
  i) We find m such that n < m < n + t and $\neg\tilde{T}(\alpha, n, m)$.
  ii) We find m such that n < m < n + t and $\tilde{T}(\alpha, m, \psi_{t-1}(\alpha, m))$.
  iii) We find m = n + t and realize that $\tilde{T}(\alpha, m, \psi_{t-1}(\alpha, m))$ because $\tilde{T}(\alpha, m, \bot)$.
By the induction hypothesis, the test in the search will terminate for all m with n < m < n + t since $\psi_{t-1}(\alpha, m) \in \mathbb{N}$ for these m, and our search will halt with m = n + t at the latest. We will now split the rest of the proof into these three cases:
i) This is like the first case: we let $\psi_t(\alpha, n) = m$, and since $\neg\tilde{T}(\alpha, n, \psi_t(\alpha, n))$ there is nothing more to prove.
ii) The construction now splits into two sub cases, f(n) = f(m) and f(n) ≠ f(m).
In the first sub case, the induction hypothesis tells us that f (m) = a. It follows that f (n) = a, so part two of the claim holds. In the second sub case, we know
by the induction hypothesis that f(m) = a, so by the assumptions there must be a t such that $\neg\tilde{T}(\alpha, n, t)$, and we will find one via the continued search.
iii) We can argue as in ii), except that we do not need any induction hypothesis to ensure that f(m) = a when we are in case iii).

Now let Ψ be the total object represented by the least fixed point of $\tilde{\Delta}$, and let Ξ be an arbitrary functional of the same type. We will define $\Delta_1(\Xi)$ using exactly the same algorithm as we used for ∆(Ξ) unless there is information available demonstrating that Ξ ≠ Ψ. Recall the definition of the trace functions $h_\Psi$ and $h_\Xi$. $h_\Psi$ is outright computable and $h_\Xi$ is computable in Ξ. In case 2 with f(m) ≠ f(n), the idea is to define $\Delta_1(\Xi)(\varphi, n)$ by searching for the least t such that ¬T(φ, n, t) or $h_\Psi(t) \neq h_\Xi(t)$. The obstacle here is that the totality of $\tilde{\Delta}_1^{\omega}(\bot)$ does not follow anymore. Unless we choose the enumeration of the countable dense subset of Ct(k) × N with some care, we risk that $h_{\tilde{\Delta}_1^{\omega}(\bot)}$ is not total. This is where the condition (∗) will be used. If $\forall m\, T(\xi^k_i, n, m)$, our algorithm will ensure that $\tilde{\Delta}_1(\tilde{\xi}^k_i, n)$ is defined. Let g be computable such that $\exists m\, \neg T(\xi^k_i, n, m) \Rightarrow \exists m \le g(i, n)\, \neg T(\xi^k_i, n, m)$. If we enumerate all pairs (i, n) as t(i, n) such that t ≥ g wherever g is defined, and use this enumeration when we define $h_\Xi$, we avoid this obstacle. We prove by induction on t that $h_\psi(t) \in \mathbb{N}$ where ψ is the least fixed point of $\tilde{\Delta}_1$. Thus the pair $(\tilde{\Delta}_1, \Delta_1)$ shows that the unique fixed point Ψ of ∆ is fixed point computable.

Our main application of Lemma 33 is

Theorem 34. Let Φ ∈ Ct(k) be externally computable. Then Φ is fixed point computable.

Proof. We give the details for k = 3. For k < 3, the result is trivial. For k > 3 we refer to Normann [16], where the adjustments needed for the general theorem were given in detail.
Let $\{(p_n, a_n)\}_{n\in\mathbb{N}}$ be a computable sequence, where each $p_n \in D(2)$, $a_n \in \mathbb{N}$ and if φ ∈ Ct(2) and α ∈ ID(2) then Φ(φ) = a if and only if there is an n ∈ N such that $p_n \in \alpha$ and $a = a_n$. We may well assume that if there is one such n, there will be infinitely many. Since each $p_n$ has finite support, there will be $(q_{n,1}, b_{n,1}), \ldots, (q_{n,r_n}, b_{n,r_n})$ where each $q_{n,r} \in D(1)$, each $b_{n,r} \in \mathbb{N}$ and $p_n$ is the least monotone extension of the map $\{q_{n,r} \mapsto b_{n,r} \mid 1 \le r \le r_n\}$. We let T(φ, n, m) when φ agrees with $p_n$ on the m’th total extension (in a dense
set of possible extensions) of $q_{n,1}, \ldots, q_{n,r_n}$. We design the algorithm for the m’th extension of $q_{n,r}$ such that $q_{n,r}$ itself is the ⊥’th extension. In this way we obtain the required property of $\tilde{T}$. For φ ∈ Ct(2) we will have that $\forall m\, T(\varphi, n, m)$ is equivalent to $p_n \prec \varphi$. Since this is decidable when φ is a known element of our countable dense subset of Ct(2), the assumption (∗) in Lemma 33 is also satisfied. Finally, we let $f(n) = a_n$, apply the lemma, and see that Φ is fixed point computable.

We can use the same argument to prove the following characterization:

Corollary 35. Let Φ ∈ Ct(k) for some k. Then the following are equivalent:
1. Φ is fixed point definable.
2. Φ has a representation in ID(k) that is arithmetical.

Proof. 1. ⇒ 2. is proved by induction on the rank of the construction of Φ as fixed point definable. 2. ⇒ 1. is proved by a relativized version of Lemma 33.
7 Summary and conclusions
We have shown that the classical computability theory of the Kleene-Kreisel continuous functionals can be developed within the framework of limit spaces, i.e. in a strictly topological setting. We may strengthen S1 - S9 by adding unique fixed points of computable functionals. However, our conclusion so far is that we need an external way of representing the functionals in order to decide when we allow one such fixed point to be computable. This paper is partly a survey paper and partly an experiment with the concept of fixed point definable functional. Our conclusion so far is that we may construct all externally computable functionals using constructions that are absolute for the functionals themselves and the commonly used set of realizers, the domains of partial continuous functionals. It is a fact that the least fixed point operator is continuous as an operator on domains, while the unique fixed point operator is discontinuous on the set of contractors. We leave it as an open problem to find out if this naive observation can be used to add one partial computable functional of type (k → k) → k for each k, representing legal fixed point constructions such that they, together with µ-recursion, generate all externally computable functionals. We know from Normann [13] that
there is a sequence of computable master functionals, one for each type ≥ 4, that together with µ-recursion generate all externally computable functionals. These are constructed in an ad hoc way using Theorem 13, and do not serve as a justifiable basis for an internal computability theory on the continuous functionals.

Acknowledgement
I am grateful to an anonymous referee for suggesting improvements of the exposition.
References
[1] U. Berger, Total sets and objects in domain theory, Annals of Pure and Applied Logic 60: 91 - 117, 1993.
[2] J. Bergstra, Computability and Continuity in Finite Types, Thesis, University of Utrecht, 1976.
[3] Yu. L. Ershov, Computable functionals of finite type, Algebra and Logic 11: 203 - 277, 1972.
[4] M. H. Escardó, Exhaustible sets in higher-type computation, Logical Methods in Computer Science, Volume 4, Issue 3, Paper 3, 2008.
[5] R. O. Gandy and J. M. E. Hyland, Computable and recursively countable functions of higher type, in Gandy and Hyland (eds.) Logic Colloquium ’76: 407 - 438, North-Holland, 1977.
[6] T. Grilliot, On effectively discontinuous type-2 objects, Journal of Symbolic Logic 36: 245 - 247, 1971.
[7] J. M. E. Hyland, Recursion on the countable functionals, Dissertation, The University of Oxford, 1975.
[8] S. C. Kleene, Recursive functionals and quantifiers of finite types I, Trans. Amer. Math. Soc. 91: 1 - 52, 1959.
[9] S. C. Kleene, Countable functionals, in A. Heyting (ed.) Constructivity in Mathematics: 81 - 100, North-Holland, 1959.
[10] G. Kreisel, Interpretation of analysis by means of functionals of finite type, in A. Heyting (ed.) Constructivity in Mathematics: 101 - 128, North-Holland, 1959.
[11] K. Kuratowski, Topologie, vol 1, Warszawa, 1952.
[12] J. R. Longley and D. Normann, Computability at Higher Types (tentative title), in preparation.
[13] D. Normann, The Continuous Functionals; Computations, Recursions and Degrees, Annals of Mathematical Logic 21: 1 - 26, 1981.
[14] D. Normann, Recursion on the Continuous Functionals, Lecture Notes in Mathematics, Vol 811, Springer Verlag, 1980.
[15] D. Normann, The Continuous Functionals, in E. R. Griffor (ed.) Handbook of Computability Theory: 251 - 275, 1999.
[16] D. Normann, Computability over the partial continuous functionals, The Journal of Symbolic Logic 65: 1133 - 1142, 2000.
[17] D. Normann, Computing with functionals - Computability theory or Computer science?, The Bulletin of Symbolic Logic 12: 43 - 59, 2006.
[18] D. Normann and C. Rørdam, The computational power of M^ω, Mathematical Logic Quarterly 48: 117 - 124, 2002.
[19] R. A. Platek, Foundations of recursion theory, Thesis, Stanford University, 1966.
[20] B. Scarpellini, A model for bar recursion of higher types, Comp. Math. 23: 123 - 153, 1971.
[21] D. Scott, A type-theoretical alternative to ISWIM, CUCH, OWHY, unpublished notes, Oxford, 1969.
[22] W. W. Tait, Continuity properties of partial recursive functionals of finite type, unpublished notes, 1958.
Provably Recursive Functions of Reflection Wolfram Pohlers and Jan–Carl Stegert
In this paper we give a characterization of the recursive number theoretic functions whose totality is provable within the theory Ref, which is Kripke–Platek set theory with the full reflection scheme. This is done in two steps. In the first part, Sections 2 through 5, we compute the proof theoretic ordinal of Ref. In the second part, Section 6, we show how this computation can be modified into a characterization of the provably recursive functions of Ref.
Contents
1 Introduction
2 Preliminaries
2.1 The theory Ref
2.2 Reflecting– and indescribable ordinals
2.3 Heuristic
2.4 Iterated Skolem–hull operators
3 Reflection configurations and their instances
3.1 Reflection configurations and their instances
3.2 Basic Structure Theory
3.3 The existence proof
3.4 Ordinal comparison
3.5 A primitive recursive characterization of V^α({0, Ξ})
3.6 Fine Structure Theory
4 Ramified set theory
4.1 The language of ramified set theory
4.2 The infinitary calculus $\Pi^{\infty}_{\omega}$
4.3 Embedding of Ref
5 $\Pi^1_1$–ordinal analysis of Ref
5.1 Predicative cut–elimination
5.2 Elimination of reflection rules
5.3 The ordinal analysis
6 Characterization of the $\Pi^0_2$–Skolem functions of $\Pi_\omega$–reflection
6.1 The theory Ref∗
6.2 The subrecursive hierarchy
6.3 Fragmented Skolem hull operators
6.4 Embedding of Ref∗
6.5 Predicative cut–elimination
6.6 Reflection elimination for fragmented controlled derivations
6.7 The characterization theorem
Index

1 Introduction
The content of this paper should be viewed as a contribution to the part of Hilbert’s Programme that deals with the elimination of “ideal elements”. In his 1927 talk in Hamburg [12] Hilbert compares the work of a mathematician with the work of a physicist, states that only few conclusions of physical theories are verifiable by experiments, and compares these conclusions to the “real statements” of his proof theory.1 The mathematical analog of a physical statement that is verifiable by experiments is a $\Pi^0_2$–theorem. Whenever we have a $\Pi^0_2$–theorem (∀x)(∃y)F(x, y) there is an algorithm that computes y from a given input x. Here the algorithm corresponds to the “experiment” done by the physicist. In the present paper we present such an algorithm for the $\Pi^0_2$–theorems that are provable within a Kripke–Platek set theory with full reflection scheme. The algorithm is given in terms of a subrecursive hierarchy. Clearly this algorithm is only of hypothetical value since its computation requires resources that are widely outside the realm of all realizable possibilities. Nevertheless we regard it as an interesting problem to gauge the amount of resources that are needed for verifying a $\Pi^0_2$–theorem proved by abstract (i.e., ideal) principles.

1 The original citation in German is: “Der Physiker verlangt gerade von einer Theorie, daß ohne die Heranziehung anderweitiger Bedingungen aus den Naturgesetzen oder Hypothesen die besonderen Sätze allein durch Schlüsse, also auf Grund eines reinen Formelspiels abgeleitet werden. Nur gewisse Kombinationen und Folgerungen der physikalischen Gesetze können durch das Experiment kontrolliert werden – so wie in meiner Beweistheorie nur die realen Aussagen unmittelbar einer Verifikation fähig sind.” Loosely translated, this says: “The physicist requires for a theory that its theorems can be formally derived from the laws of nature and its hypotheses alone without referring to outside perceptions. Only certain combinations and conclusions of physical laws are checkable by experiments; this is also true for my proof theory, in which only ‘real statements’ are verifiable.”

The subrecursive hierarchy developed in this paper is based on the ordinals that arise in an ordinal analysis of the theory of reflections. By an ordinal analysis of a theory we understand the computation of its proof theoretic ordinal, i.e., the supremum of the ordinals that can be represented by a recursive ordering on the natural numbers whose well–foundedness is provable in the theory. For reasons which we will not discuss here this ordinal is also known as the $\Pi^1_1$–ordinal of the theory. The key in the ordinal analysis of a theory is the elimination of all “ideal” principles that are used in a proof of a $\Pi^1_1$–statement. This is achieved by unravelling a formal proof into a proof within the framework of an infinitary system that allows the elimination of all “ideal” principles by a reduction procedure. The eventually obtained irreducible infinitary derivation of a $\Pi^1_1$–sentence is then freed from all “ideal” assumptions. When dealing with subsystems of set theory, the $\Pi^1_1$–sentences correspond to $\Sigma_1^{L_{\omega_1^{CK}}}$–sentences and an irreducible infinitary derivation of such a sentence can be viewed as a verification of the sentence in the constructible hierarchy. Building up the constructible hierarchy we need no ideal principles. The $\Pi^1_1$–ordinal of subtheories of set theory can therefore equivalently be characterized as the least stage in the constructible hierarchy at which its provable $\Sigma_1^{L_{\omega_1^{CK}}}$–sentences become true. This is clearly an ordinal below the first non–recursive ordinal $\omega_1^{CK}$.

Sections 3 through 5 are devoted to the ordinal analysis of the theory Ref. We compute $|\mathrm{Ref}|_{\Pi^1_1}$ as $\Psi_{\Omega}^{\varepsilon_{\Xi+1}}$, the collapse below $\omega_1^{CK}$ of the first ε–number bigger than Ξ, where Ξ is the first ordinal that is $\Pi^1_n$–indescribable for all finite n. In Section 6 we show how the ordinal analysis can be extended to a $\Pi^0_2$–analysis of Ref, i.e., to a characterization of the subrecursive hierarchy majorizing the Skolem functions of the $\Pi^0_2$–sentences provable in Ref. We obtain that the $\Pi^0_2$–Skolem functions, which coincide with the provably recursive functions of Ref, are the primitive recursive hull of a hierarchy $\{F_\alpha \mid \alpha < |\mathrm{Ref}|_{\Pi^1_1}\}$ which is an extension of the Hardy hierarchy.

The paper is based on the first part of the doctoral thesis [23] of the second author which, in turn, is mostly based on the papers [17], [21] by Michael Rathjen, the dissertation [3] of Benjamin Blankertz, which builds on the work of Andreas Weiermann (cf. [8], [24], [4], [25]), and the dissertation [10] of Christoph Duchhardt. The results in the thesis of the second author are, however, considerably further–reaching and include an analysis of a Kripke–Platek set theory with stability axiom.
The structure and the notions of this paper differ from those given in [23]. This is especially true for the notion of fragmented controlled derivations, one of the innovative key notions in [23]. Due to an alternative approach to subrecursive hierarchies it was easy to build a collapsing feature into the definition of a fragmented controlled derivation. In [23] the same result is obtained by an extra collapsing theorem.
2 Preliminaries

2.1 The theory Ref
The theory of reflection is a subtheory of Zermelo–Fraenkel set theory. The language of Ref is first order logic with equality whose only non–logical constant is the membership symbol ∈. Its ontological axioms are the axiom of extensionality

(Ext) (∀x)(∀y)[(∀z ∈ x)[z ∈ y] ∧ (∀z ∈ y)[z ∈ x] → x = y],

and the scheme of foundation, which is the universal closure of

(FOUND) (∃x)F(x) → (∃x)[F(x) ∧ (∀y ∈ x)[¬F(y)]].

Its set–existence axiom is the null–set axiom

(Nullset) (∃x)(∀y)[y ∉ x].

Its closure axioms are ‘closure under unordered pairs’

(Pair) (∀x)(∀y)(∃z)[x ∈ z ∧ y ∈ z],

‘closure under unions’

(Union) (∀u)(∃z)(∀x ∈ u)[x ⊆ z],2

the ‘scheme of ∆0–separation’ which is the universal closure of

(∆0–Sep) (∀u)(∃z)[(∀x ∈ z)[x ∈ u ∧ F(x)] ∧ (∀x ∈ u)[F(x) → x ∈ z]],

where F is a ∆0–formula, i.e., a formula that contains only bounded quantifiers, and the ‘reflection scheme’

(REF) $(\forall \vec{y})\bigl[F(\vec{y}) \to (\exists z)\bigl[(\exists u \in z)[u \in z] \wedge \mathrm{Tran}(z) \wedge \vec{y} \in z \wedge F^z\bigr]\bigr]$

where Tran(z) is the formula (∀u ∈ z)(∀x ∈ u)[x ∈ z] and $F^z$ stands for the formula that is obtained from F by restricting all unbounded quantifiers to z. For convenience we abbreviate the right formula in the above implication by (∃z)[z |= F].

2 Where z ⊆ u stands for (∀x ∈ z)[x ∈ u].
One should observe that the familiar formulations of pairing and union are derivable from the above versions by the ∆0–separation scheme. Ordinals can be characterized as hereditarily transitive sets. Agreeing that lower case Greek letters range over ordinals we obtain (∀α)(∃β)[α ∈ β] with α ∪ {α} as a witness for β that exists by (Pair) and (Union). Reflecting this sentence we get

(∃z)(∀α ∈ z)(∃β ∈ z)[α ∈ β]

which is an infinity axiom. Although still essentially weaker than full ZFC, the theory Ref is already a pretty strong theory. In the following we will give an ordinal analysis of Ref and, based on this ordinal analysis, a characterization of its $\Pi^0_2$–Skolem functions. This includes a characterization of the provably recursive functions of Ref.
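For finite sets, the characterization of ordinals as hereditarily transitive sets and the witness α ∪ {α} can be checked mechanically. The following Python sketch models hereditarily finite sets as nested `frozenset` values; the function names are our own and serve only as an illustration of the definitions above.

```python
def is_transitive(z):
    """Tran(z): every element of an element of z is an element of z."""
    return all(x in z for u in z for x in u)

def is_ordinal(z):
    """Hereditarily transitive: z and all members of z (recursively) are transitive."""
    return is_transitive(z) and all(is_ordinal(u) for u in z)

def succ(a):
    """The witness used in the text: alpha union {alpha}."""
    return a | frozenset({a})

zero = frozenset()
one = succ(zero)
two = succ(one)
three = succ(two)

print(is_ordinal(three), two in three, is_transitive(frozenset({one})))
```

Every finite ordinal α satisfies α ∈ succ(α), which is the finite shadow of (∀α)(∃β)[α ∈ β]. A set closed under `succ` cannot be finite, which is why reflecting that sentence yields an infinity axiom.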
2.2 Reflecting– and indescribable ordinals
All (published) developments of the ordinal theory needed in the ordinal analysis of impredicative axiom systems suffer from the same drawback. In the proof theoretical analysis we work in Gödel’s constructible hierarchy $L$. The relevant ordinals needed there are “reflecting ordinals”, which can be understood as “recursive” counterparts of “large” cardinals in the von Neumann hierarchy $V$. Most important is $\omega_1^{CK}$, the first “$\Pi_2$-reflecting” ordinal, which is equal to the first ordinal that cannot be represented by a recursively definable well-ordering on the natural numbers. So $\omega_1^{CK}$ is the “recursive” counterpart of $\Omega$, the first ordinal that cannot be represented by a well-ordering on the natural numbers. Simultaneously $\omega_1^{CK}$ is the first admissible ordinal above $\omega$, and the hierarchy of admissible ordinals is, roughly speaking, the “recursive” counterpart of the hierarchy of regular cardinals. So it would be consistent to develop the ordinal theory on the basis of “recursive” ordinals. Although possible in principle, this turns out to be extremely complicated. Therefore the necessary ordinal theory has always been developed on the basis of cardinals instead of admissible ordinals, and only afterwards were attempts made to justify the theory so obtained on the basis of admissible ordinals instead of cardinals (cf. [18]). However, this situation should perhaps not only be viewed as a drawback but rather as another example of the beauty of introducing “ideal” notions in mathematical work. The use of the “ideal” large cardinals simplifies all considerations considerably. However, once the ordinal notation system is obtained, it turns out to be primitive recursively definable. We thus may forget all we used about large cardinals and work only with ordinal notations (which we will not do in this paper).3 This may be viewed as another example for an “elimination of ideal elements” in Hilbert’s sense on the meta-level. Here we will use the correspondence between indescribable cardinals and reflecting ordinals, their “recursive” counterparts. The “recursive counterpart” of a $\Pi^1_{n+1}$-indescribable cardinal is a $\Pi_{n+3}$-reflecting ordinal. A profound study of large cardinals and their recursive counterparts by Peter Aczel and Wayne Richter is in [22].
3 This is the argument against the objection that “large” cardinals are used for the justification of theories considerably weaker than ZFC. However, consistency is not the goal of our studies, although the consistency of the analyzed theory follows from the well-foundedness of the eventually obtained ordinal notation system. A “constructive” consistency proof therefore would only need a constructive proof of the well-foundedness of the ordinal notation system.
2.1 Definition A formula $F$ in the language of set theory is a $\Delta_0$-formula if all quantifiers in $F$ are bounded. We call $F$ a (pure) $\Pi_n$-formula if it has the form $(\forall x_1)\dots(Qx_n)\,G(x_1,\dots,x_n)$ for a block of $n$ alternating quantifiers and a $\Delta_0$-formula $G(x_1,\dots,x_n)$. Dually we define (pure) $\Sigma_n$-formulas as formulas $(\exists x_1)\dots(Qx_n)\,G(x_1,\dots,x_n)$ with a block of $n$ alternating quantifiers and a $\Delta_0$-formula $G(x_1,\dots,x_n)$.
A formula $F$ in the second order language of set theory is a (pure) $\Pi^1_n$-formula iff $F$ has the form $(\forall X_1)\dots(QX_n)\,G(X_1,\dots,X_n)$ for a block of $n$ alternating second order quantifiers and a first order formula $G(X_1,\dots,X_n)$. Dually we define (pure) $\Sigma^1_n$-formulas. The $\Pi_n$ ($\Sigma_n$, $\Pi^1_n$, $\Sigma^1_n$)-formulas are the closure of the pure $\Pi_n$ ($\Sigma_n$, $\Pi^1_n$, $\Sigma^1_n$)-formulas under the positive boolean operations $\wedge$ and $\vee$.
An ordinal $\pi$ is $\Pi_n$-reflecting iff for any $\Pi_n$-formula $F(a_1,\dots,a_n)$ with parameters $a_1,\dots,a_n$ in $L_\pi$ such that $L_\pi \models F(a_1,\dots,a_n)$ there is an ordinal $\kappa < \pi$ such that $a_1,\dots,a_n \in L_\kappa$ and $L_\kappa \models F(a_1,\dots,a_n)$. We call $\pi$ $\Pi_n$-reflecting on a class $M \subseteq On$ iff $L_\pi \models F(a_1,\dots,a_n)$ entails that there is an ordinal $\kappa \in M \cap \pi$ such that $a_1,\dots,a_n \in L_\kappa$ and $L_\kappa \models F(a_1,\dots,a_n)$.
An ordinal $\pi$ is $\Pi^1_n$-indescribable if for every $\Pi^1_n$-formula $F(P_1,\dots,P_n)$ with second order parameters $\vec P := P_1,\dots,P_n$ such that $(V_\pi, \vec P) \models F(\vec P)$ there is an ordinal $\kappa < \pi$ such that $(V_\kappa, \vec P \cap \kappa) \models F(\vec P)$. The ordinal $\pi$ is called $\Pi^1_n$-indescribable on a class $M \subseteq On$ if the above mentioned ordinal $\kappa$ is in $M \cap \pi$.
Clearly these definitions carry over to all other possible complexity classes for formulas. It is easy to see that all $\Pi^1_n$-indescribable ordinals are cardinals.
To simplify notation we write $F^{\alpha}$ instead of $F^{L_\alpha}$. Observe that $L_\alpha \models F$ is equivalent to $L \models F^{\alpha}$ as soon as all parameters in $F$ belong to $L_\alpha$. Similarly we mostly write $(\exists x^{\alpha})$ and $(\forall x^{\alpha})$ instead of $(\exists x \in L_\alpha)$ and $(\forall x \in L_\alpha)$. By $\Pi^{\pi}_n$ ($\Sigma^{\pi}_n$) we denote the complexity class $\{F^{\pi} \mid F \text{ is a } \Pi_n\ (\Sigma_n)\text{-formula}\}$.
We need a few of the classical results on indescribable cardinals.
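As an illustration of the notation just introduced, the following worked example classifies one concrete sentence and writes out its relativization. The sentence $F$ is chosen here purely for illustration and does not occur in the source; only the definitions of pure $\Pi_n$-formulas, of $F^{\alpha}$ and of $\Pi^{\alpha}_n$ are taken from the text.

```latex
% Illustrative example (the sentence F is an assumption, not from the source).
% A pure \Pi_2-sentence: the matrix  x \in y  has no quantifiers, hence is \Delta_0,
% and the prefix \forall\exists is a block of two alternating quantifiers.
\[
  F \;:\equiv\; (\forall x)(\exists y)\,[\,x \in y\,]
\]
% Relativization to L_\alpha: every unbounded quantifier is bounded by L_\alpha.
\[
  F^{\alpha} \;\equiv\; (\forall x^{\alpha})(\exists y^{\alpha})\,[\,x \in y\,]
  \;\equiv\; (\forall x \in L_{\alpha})(\exists y \in L_{\alpha})\,[\,x \in y\,],
  \qquad F^{\alpha} \in \Pi^{\alpha}_2 .
\]
% F has no parameters, so  L_\alpha \models F  is equivalent to  L \models F^{\alpha}.
```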
2.2 Theorem An ordinal $\kappa$ is $\Pi^1_0$-indescribable iff $\kappa$ is strongly inaccessible.
Proof Cf. [9] Chapter 9, §1, Theorem 1.3.
2.3 Theorem Let $\kappa$ be a strongly inaccessible cardinal. Then the set $\{\sigma \in \kappa \mid (V_\sigma, P \cap V_\sigma) \prec (V_\kappa, P)\}$4 is closed and unbounded in $\kappa$.
Proof Cf. e.g. [13] Lemma 6.1.
2.4 Theorem For every $n \in \omega$ there is a $\Pi^1_n$-formula $\psi_n(X_1,\dots,X_k,x)$ that is universal for the $\Pi^1_n$-sentences, i.e., for any $\Pi^1_n$-sentence $\varphi(\vec X)$ there is a parameter $\ulcorner\varphi\urcorner$ such that for any limit ordinal $\lambda > \omega$ and parameters $P_i \subseteq V_\lambda$ we have
$(V_\lambda, \in, P_1,\dots,P_k) \models \varphi(X_1,\dots,X_k) \leftrightarrow \psi_n(X_1,\dots,X_k,\ulcorner\varphi\urcorner)$.
Proof Cf. [9] Chapter 9, §1, Lemma 1.9 for the case $n > 0$. A reinspection of the proof given there also shows the existence of a $\Delta^1_0$-formula $\psi_0$ that is universal for $\Pi^1_0$-sentences.
2.5 Corollary The $\Pi^1_n$-indescribability on a set $M$ is definable by a $\Pi^1_{n+1}$-formula with parameter $M$, i.e., there is a $\Pi^1_{n+1}$-formula $\varphi(P)$ such that $(V_\pi, \in, M) \models \varphi(P)$ iff $\pi$ is $\Pi^1_n$-indescribable on $M$.
Proof Define
$\varphi(P) :\Leftrightarrow (\forall \vec X)(\forall x)\big[\psi_n(\vec X, x) \to (\exists \kappa \in P)[\kappa \neq 0 \wedge \psi_n(\vec X \cap V_\kappa, x)^{V_\kappa}]\big]$
where $\psi_n$ is the formula universal for $\Pi^1_n$-formulas.
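The proof leaves the complexity count implicit. The following is a sketch of one way to see that the defining formula is $\Pi^1_{n+1}$. It is not taken from the source, it is meant up to logical equivalence, and it assumes that the relativized second order quantifiers in $\psi_n(\dots)^{V_\kappa}$ may be read as first order quantifiers over $V_\pi$, which is plausible since every subset of $V_\kappa$ lies in $V_{\kappa+1} \subseteq V_\pi$ for $\kappa < \pi$.

```latex
% Sketch of the complexity count (an assumption-laden reading, not the authors' proof).
\begin{align*}
  &\psi_n(\vec X, x) \in \Pi^1_n
     && \text{by Theorem 2.4,} \\
  &\neg\psi_n(\vec X, x) \in \Sigma^1_n
     && \text{by dualizing the quantifier block,} \\
  &(\exists \kappa \in P)\,[\kappa \neq 0 \wedge \psi_n(\vec X \cap V_\kappa, x)^{V_\kappa}]
     \ \text{is first order}
     && \text{since all quantifiers are bounded by } V_{\kappa+1} \subseteq V_\pi, \\
  &\neg\psi_n(\vec X, x) \vee (\exists \kappa \in P)\,[\dots] \in \Sigma^1_n
     && \text{a first order disjunct does not raise the complexity,} \\
  &(\forall \vec X)(\forall x)\,\big[\neg\psi_n(\vec X, x) \vee (\exists \kappa \in P)\,[\dots]\big] \in \Pi^1_{n+1}
     && \text{one further universal second order block.}
\end{align*}
```

For $n = 0$ the same count applies with $\psi_0$ first order, which gives a $\Pi^1_1$-formula.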
2.3 Heuristic
The characterization of the provably recursive functions is based on the ordinal analysis of Ref. Since already the ordinal analysis is technically rather involved, we want to give a heuristic (and thus pretty vague) description of how it is obtained. An ordinal analysis of a subtheory of set theory can be understood as a computation of the least upper bound for the stages in the constructible hierarchy at which all provable $\Pi_2^{\omega_1^{CK}}$-sentences are satisfied. This is commonly achieved by defining a language $\mathcal{L}_{RS}$ of ramified set theory in which there is a name for every constructible set. This canonically induces an infinitary verification calculus whose essential infinite rule is
$\models^{\alpha_t} \Delta, F(t)$ for all $t \in L_\pi$ and $\alpha_t < \alpha$ $\ \Rightarrow\ $ $\models^{\alpha} \Delta, (\forall x \in L_\pi)F(x)$
where $\alpha_t$ and $\alpha$ are ordinals and $\Delta$ is a finite set of $\mathcal{L}_{RS}$-formulas, the side formulas of the rule, and $\Delta, F$ is to be interpreted as the finite disjunction of the formulas in $\Delta$ and $F$. This infinitary calculus is correct, i.e., $\models^{\alpha} \Delta$ entails $L \models \bigvee\Delta$. Whenever we succeed in showing that $\mathrm{Ref} \vdash F^{\omega_1^{CK}}$ implies $\models^{\alpha} F^{\pi}$ for a $\Pi_2^{\omega_1^{CK}}$-sentence, we know that $\pi$ is an upper bound for the proof theoretic ordinal of Ref.5
4 where $\prec$ stands for elementary substructure.
5 This of course only yields relevant information if we succeed in finding a $\pi < \omega_1^{CK}$.
In order to obtain an embedding of Ref we have to augment the verification calculus by reflection rules and a cut rule. We denote by $\vdash^{\alpha}_{\rho} \Delta$ that there is a derivation of $\Delta$ in this enriched calculus whose cut formulas have complexities less than $\rho$. Then a derivation $\vdash^{\alpha}_{\rho} \Delta^{\pi}$ with $\rho$ and $\pi$ less than $\omega_1^{CK}$ cannot contain applications of reflection rules (since these rules always lead to stages $\geq \omega_1^{CK}$). A derivation $\vdash^{\alpha}_{\rho} \Delta^{\pi}$ with $\rho, \pi < \omega_1^{CK}$ is therefore “nearly” a verification $\models^{\alpha} \Delta^{\pi}$ and thus entails $L \models \bigvee\Delta^{\pi}$.6 Therefore it suffices to show that $\mathrm{Ref} \vdash F^{\omega_1^{CK}}$ for a $\Pi_2^{\omega_1^{CK}}$-sentence entails $\vdash^{\alpha}_{\rho} F^{\pi}$ for $\rho, \pi < \omega_1^{CK}$ to obtain reasonable upper bounds.
6 By “semantical cut elimination” (cf. [15] Theorem 11.10.2) it is in fact easily transformed into a verification.
The central problems thus consist in the elimination of the reflection rules that are needed for the embedding of a Ref-proof of a $\Pi_2^{\omega_1^{CK}}$-sentence and in collapsing the involved ordinals below $\omega_1^{CK}$. For the collapsing task we can rely on the proven technique of derivations controlled by iterated Skolem hull operators, developed by W. Buchholz in [7], which we introduce in Section 2.4. The breakthrough in the elimination of reflections with complexities above $\Pi_2$ is due to M. Rathjen in [17]. In his later papers he extended this method to much stronger systems. Although the basic idea of his technique is beautifully simple, its eventual realization turns out to be extremely cumbersome. Therefore we try to give a heuristic description of its basic ideas. Its main feature is the introduction of thinning hierarchies.7 An example for a thinning operation for a set $X$ of ordinals is
$R_n(X) := \{\xi \in X \mid \xi \text{ is } \Pi_n\text{-reflecting on } X\}$.8
7 This is the crucial progress in comparison to local predicativity, which led to ordinal analyses of theories below $\Pi_3$-reflection.
8 According to the already mentioned fact that we will develop the ordinal theory on the basis of indescribable cardinals instead of reflecting ordinals, the corresponding thinning operations will rather be based on the Mahlo-operations mentioned in Section 2.4.
Since we are working with infinitely long derivations we need transfinitely iterated thinning hierarchies, e.g. hierarchies given by $R^0_n(\pi) = \pi$, $R^{\alpha+1}_n(\pi) :=$ […] we have to work harder. Here we observe that (ii) holds for all $\zeta \in R^{\alpha_0}_n(\pi) \cap L_\xi$, by which we obtain
(iii) $\vdash^{\alpha_3}_{0} (\forall x^{\xi})[x \in R^{\alpha_0}_n(\pi) \to \bigvee\Delta^{\pi,x}],\ (\exists x^{\pi})[x \models F]$.
9 We suppress all ordinal computations in the heuristic considerations.
Clearly the infinitary calculus derives the tautology $\vdash^{\beta}_{0} \Delta^{\pi,\xi}, \neg\bigvee\Delta^{\pi,\xi}$ for a not too big $\beta$. The formula $\neg\bigvee\Delta^{\pi,\xi}$ is a $\Pi^{\xi}_n$-formula. Now we need additional reflection rules (R […] $\beta_0$.
Applying an instance of the additional rule to the above tautology we obtain
(iv) $\vdash^{\beta_1}_{0} \Delta^{\pi,\xi},\ (\exists x^{\xi})[x \in R^{\alpha_0}_n(\pi) \wedge \neg\bigvee\Delta^{\pi,x}]$.
Cutting (iii) and (iv) yields $\vdash^{\alpha}_{0} \Delta^{\pi,\xi},\ (\exists x^{\xi})[x \models F]$. Here $\xi \in R^{\alpha_0}_n(\pi)$ secures $\xi < \alpha$, and the complexity of the cut formula is essentially $\xi$. We have thus shown that we can avoid a $\Pi^{\pi}_{n+1}$-reflection rule at the cost of introducing a family of relativized $\Pi^{\xi}_n$ rules for $\xi < \pi$. Of course this is not the end of the story. We now also have to get rid of the new reflection rules. Using the above strategy to eliminate a new reflection rule as well, we have to prove (A) also for $\pi$ replaced by $\kappa \in R^{\delta}_{n+1}(\pi)$ and thus need a thinning hierarchy $K^{\alpha}_n(\kappa)$ for $\kappa$. Instead of (ii) we then have to prove
(v) $\vdash^{\alpha_2}_{0} \Delta^{\pi,\zeta},\ (\exists x^{\xi})[x \in R^{\beta}_{n+1}(\pi) \wedge x \models F]$
for all $\xi \in K^{\alpha}_n(\kappa)$ and all $\beta < \delta$. But for this we need to know that $K^{\alpha}_n(\kappa) \subseteq R^{\beta}_{n+1}(\pi)$ holds true for all $\beta < \delta$. This is of course impossible since for a limit ordinal $\delta$, $\kappa := \psi_{R_{n+1}(\pi)}(\delta)$ and $\xi \in K^{\beta}_n(\kappa)$ we get ξ ∈ ⋂ β