English Pages xxxiii, 461 [489] Year 2023
Intelligent Systems Reference Library 245
Lech T. Polkowski
Logic: Reference Book for Computer Scientists The 2nd Revised, Modified, and Enlarged Edition of “Logics for Computer and Data Sciences, and Artificial Intelligence”
Intelligent Systems Reference Library Volume 245
Series Editors Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
The aim of this series is to publish a Reference Library, including novel advances and developments in all aspects of Intelligent Systems in an easily accessible and well structured form. The series includes reference works, handbooks, compendia, textbooks, well-structured monographs, dictionaries, and encyclopedias. It contains well integrated knowledge and current information in the field of Intelligent Systems. The series covers the theory, applications, and design methods of Intelligent Systems. Virtually all disciplines such as engineering, computer science, avionics, business, e-commerce, environment, healthcare, physics and life science are included. The list of topics spans all the areas of modern intelligent systems such as: Ambient intelligence, Computational intelligence, Social intelligence, Computational neuroscience, Artificial life, Virtual society, Cognitive systems, DNA and immunity-based systems, e-Learning and teaching, Human-centred computing and Machine ethics, Intelligent control, Intelligent data analysis, Knowledge-based paradigms, Knowledge management, Intelligent agents, Intelligent decision making, Intelligent network security, Interactive entertainment, Learning paradigms, Recommender systems, Robotics and Mechatronics including human-machine teaming, Self-organizing and adaptive systems, Soft computing including Neural systems, Fuzzy systems, Evolutionary computing and the Fusion of these paradigms, Perception and Vision, Web intelligence and Multimedia. Indexed by SCOPUS, DBLP, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
Lech T. Polkowski Department of Mathematics and Informatics University of Warmia and Mazury Olsztyn, Poland
ISSN 1868-4394 ISSN 1868-4408 (electronic) Intelligent Systems Reference Library ISBN 978-3-031-42033-7 ISBN 978-3-031-42034-4 (eBook) https://doi.org/10.1007/978-3-031-42034-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
To Maria and Marcin To Andrzej Skowron
Foreword
The new book LOGIC: Reference Book for Computer Scientists by Prof. Lech Polkowski is another remarkable contribution that the author has offered to readers. It follows his earlier book Logics for Computer and Data Sciences, and Artificial Intelligence, which was published in the series Studies in Computational Intelligence. The new book is a substantial extension of the earlier one and covers different domains of classical and non-classical logic: sentential logic, first-order logic, intuitionistic and modal logics, temporal logics, many-valued logics, dynamic logic, epistemic logics, mereology and rough mereology (with important results of Prof. Polkowski), logic models of knowledge in relation to the rough set approach, and some advanced issues concerning second-order logic, in particular monadic second-order logic, to name a few. What is very impressive is its wide extent, covering all these domains with so many deep mathematical results; usually such a wide range cannot be covered even in a monograph of lectures on mathematical logic. I see this book as a further and deeper step, in comparison to the first one, towards the realization of the idea of Prof. Helena Rasiowa, who in the 1960s formulated ideas emphasizing the fundamental role of logic for the development of CS and AI and, conversely, the very significant role of CS and AI for the development of logic. From this perspective, the new book concentrates on the presentation of the basic domains of logic which have great importance for the contemporary state of development of CS and AI. This concerns, e.g., the application of logic as a very important tool for expressing concepts on which different forms of reasoning can be performed, the deep characterization of the computational complexity of many important problems in CS and AI, as well as the application of Boolean reasoning in different domains of CS and AI.
Moreover, the book also covers theoretical issues which have not yet been considered in the scope of CS or AI applications. Exploiting these issues further may lead to the discovery of new technologies. It is worth mentioning that researchers in AI are gradually discovering the usefulness of advanced tools related to different
branches of advanced mathematics and logic, e.g., topology and reasoning, in solving advanced problems related to Machine Learning. One can also observe that nowadays there is an important call to logicians, especially from AI, for developing new reasoning tools based on the relevant computing model necessary for solving challenges related to, e.g., Intelligent Systems dealing with complex phenomena; the book How Smart Machines Think by Sean Gerrish (MIT Press, 2019) may be considered an example of such reflection. The research related to the emerging computing paradigm of the Interactive Granular Computing model, different from the classical Turing model, may also be counted as recognition of the importance of research on such new reasoning tools. In this direction, the section on rough mereological logic, essentially devoted to computing with granules of knowledge, may provide a starting point towards more complex logics. The material is presented in a very condensed and precise way. This concerns the whole book, starting from the preliminary chapter, which can serve as a basis for quite a few courses for graduate and Ph.D. students in CS and AI. It should also be noted that all theorems are presented together with detailed proofs. Moreover, the historical comments added to the chapters will help readers to better understand how different branches of logic were born and developed. The problems included in all the chapters will also stimulate readers to deeper studies. This book will certainly work as a reference book for graduate and Ph.D. students; for researchers working in CS or AI, it has the potential to guide them in their studies through different areas of applied and theoretical issues of logic. The content of the book is so varied that researchers working in different areas of CS or AI will surely find selected contents relevant to their areas of interest.
Warsaw, Poland February 2023
Andrzej Skowron
Preface
Repetitio mater studiorum est. Repetition allows one to perceive some facts not seen at the first study. It allows for connecting some threads not connected at first and for better seeing the main dominating lines in apparently distinct reasoning moods. For those moods, let us be allowed to say that logic is about consistency. Satisfaction of consistent sets implies completeness, maximal consistent sets provide models, consistency guarantees incompleteness in the Gödel-Rosser sense. Another mood is the exploitation of closures of formulae, also appearing in many places in this text; we should mention as well the role of the filter separation theorem, which is applied in many important proofs of completeness, witness the Chang theorem. The aim of exhibiting those moods to a fuller degree has been the reason that this author has undertaken an enlarged and at the same time corrected edition of his book titled Logics for Computer and Data Sciences, and Artificial Intelligence (the title abbreviated to ‘Logics’ in the sequel), published by Springer Nature Switzerland in the series Studies in Computational Intelligence as no. 992. The present text is larger by about 35% in comparison to the text in ‘Logics’ and much richer topically. It is in fact a new book, which implies a new title better reflecting the content. Let us briefly present that content to the Reader. It opens with Chap. 1, which contains prerequisites in order to provide the Reader with necessary information so they have almost no need to reach for external sources. Section 1.1, Set Theory Recapitulated, contains an account of ZFC set theory with the theory of relations, functions, ordering, complete ordering, well ordering, equipollence theory, cardinal and ordinal numbers, transfinite induction, elements of the theory of graphs and trees, and a new section on topology.
We include proofs of fundamental theorems: the Knaster-Tarski theorem in Theorem 1.2, the Dedekind-MacNeille theorem in Theorem 1.3, the Cantor diagonal theorem in Theorem 1.5, the Cantor-Bernstein theorem in Theorem 1.6, the Zermelo theorem on the well ordering in Theorem 1.12, many of them in the basic repertoire of set-theoretic tools in computer science, the Zorn maximal principle theorem in Theorem 1.13, the Teichmüller-Tukey principle in Theorem 1.14, the König theorem in Theorem 1.15, the Ramsey theorem in Theorem 1.16. In Sect. 1.2 about Rewriting Systems, we address topics of grammars, finite-state automata, regular expressions and grammars, automata on infinite words
(Büchi automata), showing relations between regular grammars and automata. We prove theorems on relations between regular finite languages and finite-state automata in Theorem 1.21, and on relations between regular ω-languages and Büchi automata. Section 1.3 on Computability brings basic information on Turing machines, primitive recursive and recursive functions. Arithmetization of Turing machines by means of the Gödel numbering is the content of Sect. 1.4. The central part of Sect. 1.4 is the proof of the primitive recursiveness of the Kleene predicate, which allows for proofs of undecidability of certain predicates (Theorem 1.34). Section 1.5 brings a characterization of recursively enumerable sets. In Sect. 1.6, we address the undecidability issues, and Sect. 1.7 recalls the Smullyan account of frameworks for the Gödel incompleteness theorems in Theorems 1.43, 1.46 and for the Tarski theorem on non-arithmeticity of truth in Theorem 1.45. We recall in Sect. 1.8 the main issues of Complexity theory. We introduce basic classes of complexity: L, NL, PTIME, PSPACE, P, NP, EXP, NEXP. We include proofs of many statements whenever possible, including the proof of the Savitch theorem in Theorem 1.59 and of the Cook-Levin theorem in Theorem 1.53, with examples of problems: CNT-CONTRA-UR, DGA, SAT(CNF), SAT(3-CNF), TQBF, SAT(BSR) of Bernays-Schönfinkel-Ramsey. We will see in the following chapters the role of the algebraic approach, notably the role of the separation theorem in proofs of completeness in Chaps. 6 and 7. For this reason, we include Sect. 1.9 on Algebraic Structures, in which we introduce lattices, complete lattices with filters and ideals in lattices, distributive lattices with the proof of the separation theorem in Theorem 1.68, relatively pseudo-complemented algebras and Boolean algebras. A new topic is the Marshall Stone theory of representation of Boolean algebras in Theorem 1.80.
Another new topic is the Topology section, in which we introduce the basic notions and results of set-theoretic topology: the notion of a topological space, open and closed sets, interior and closure operators, properties like the T2 (Hausdorff) property, compactness with the Tychonoff theorem, degrees of disconnectedness, and elements of the Marshall Stone theory of topological representations of Boolean algebras in Hausdorff zero-dimensional compact Stone spaces in Theorems 1.86, 1.87. The last topic will be useful in Chap. 7 in the discussion of the mereological scheme for fuzzy computing. We are of the opinion that Chap. 1 may serve as a basis for a one-semester course in mathematical foundations of Computer Science. Chapter 2 is dedicated to Sentential Logic. We use this name in order to distinguish this much more elaborate chapter from its predecessor in ‘Logics’. Nevertheless, in the subsequent chapters we often return to the synonymous term ‘propositional’. It is the first of two cornerstone chapters on classical logic. It introduces main ideas, notions, and results which permeate all logic. We begin this chapter with a section giving a brief account of history. We aim at turning attention to the systems of Syllogistics of Aristotle and the sentential logic of Stoics and Megarians. In both cases, we meet an automatized system of deduction: medieval scholastic scholars gave names to figures of syllogisms, which allowed the reduction of syllogisms to those of the first group, regarded as accepted unconditionally,
predecessors of today’s axiom schemes. Stoics introduced five moods of reasoning and means for reduction of complex expressions to those moods. Justly Fr. I. M. Bochenski wrote that leading scholars of Stoic and Megarian schools like Philo of Megara, Chrysippus, Zeno of Elea can be counted among the greatest thinkers in logic. The next two Sects. 2.2 and 2.3 are devoted, respectively, to syntax and semantics of sentential logic. In the section on syntactic properties of sentential logic, we discuss the language of SL and the notion of a well-formed formula, and we introduce sentential connectives and minimal sets of them. In the next section on semantics, we introduce truth tables, formation trees, notions of satisfiability, validity, unsatisfiability, as well as the notions of a model and of a logical consequence. All those notions will appear in discussions of all other logics. Section 2.4 is devoted to normal forms; we discuss negation, conjunctive, and disjunctive normal forms along with algorithms for conjunctive and disjunctive normal forms. In Sect. 2.5 we introduce axiomatic schemes and inference rules. We adopt for the rest of the chapter the axiomatic system of Jan Łukasiewicz along with detachment, substitution, and replacement as inference rules. We also introduce the notion of a proof and of a provable formula. We conclude this section with the statement and a proof of the Herbrand-Tarski deduction theorem in Theorem 2.4, the notion of logical matrices serving in independence proofs of axiom schemes, and a discussion of the Tarski-Lindenbaum algebra of the sentential logic in Definition 2.28. Sections 2.6, 2.7 open a discussion of Natural Deduction in which we discuss the method of sequents due to Gentzen and the related variant of decomposition diagrams of Rasiowa-Sikorski. We continue this topic in Sect. 2.8 with analytic tableaux in the sense of Smullyan. Tableaux offer a milieu for proving basic theorems on properties of sentential logic.
We prove in this section the theorem on tableau-completeness of sentential logic. The notion of consistency of sets of formulae, fundamental for logic, is introduced in Sect. 2.9 along with the notion of maximal consistency. We prove the Lindenbaum Lemma 2.18 on the existence of maximal consistent extensions for consistent sets. We prove the strong completeness of sentential logic. In Sect. 2.10, we recall the classical proof by Kalmár of completeness of sentential logic and we state and prove the Craig interpolation lemma. Section 2.11 brings the topic of resolution. We give a proof of its completeness based on the notion of consistency and on the interplay between this notion and satisfiability. Horn clauses are discussed in Sect. 2.12 along with the reasoning methods of forward and backward chaining. We conclude this section with the Haken theorem on exponential complexity of resolution in Theorem 2.29. Complexity of satisfiability problems in sentential logic is approached in Sect. 2.13. Due to the NP-completeness of the SAT problem, we present heuristic approaches by Davis-Putnam and Davis-Logemann-Loveland. We also bring the case of Horn formulae, with the APT algorithm for solvability of the 2-SAT problem in linear time. At the end of Chap. 2, we mention the physical realization of validity proofs by means of logical circuits and McCulloch-Pitts neuron nets with their improvement by perceptron nets. We prove the theorem on perceptron learning in Theorem 2.32.
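As a rough illustration of the Davis-Logemann-Loveland idea mentioned above, the following minimal Python sketch combines unit propagation with splitting on a literal. It is a generic textbook-style sketch, not the presentation of Sect. 2.13; the encoding of literals as nonzero integers and the names `simplify`, `dpll` and `satisfies` are assumptions made only for this example.

```python
from typing import Dict, List, Optional

Clause = List[int]            # assumed encoding: literal n is a variable, -n its negation
Assignment = Dict[int, bool]


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Assume `lit` is true: drop satisfied clauses, delete the opposite literal.
    Returns None when an empty clause (a conflict) arises."""
    out: List[Clause] = []
    for clause in clauses:
        if lit in clause:
            continue                      # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                   # conflict: empty clause
        out.append(reduced)
    return out


def dpll(clauses: List[Clause],
         assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(not c for c in clauses):
        return None                       # an empty clause in the input
    # Unit propagation: a one-literal clause forces the value of its variable.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return assignment                 # every clause has been satisfied
    # Splitting: try the first literal of the first clause, then its negation.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None


def satisfies(clauses: List[Clause], model: Assignment) -> bool:
    """Check that every clause contains a literal made true by `model`."""
    return all(any(model.get(abs(l)) == (l > 0) for l in c) for c in clauses)
```

For instance, `dpll([[1, 2], [-1, 3], [-2, -3]])` returns an assignment that `satisfies` confirms, while `dpll([[1], [-1]])` returns `None`. The worst-case running time remains exponential, in line with the NP-completeness of SAT.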
Chapter 3 deals with first-order logic (FO), the culmination of classical logic and the environment for the deepest results in logic. The chapter consists of 22 sections. We begin with the syntax of FO in Sect. 3.1, and Sect. 3.2 brings an account of semantics. We define the notion of a model, and the notions of satisfiability, validity and unsatisfiability. In Sect. 3.3 we enter the realm of Natural Deduction with the sequent calculus, and in Sect. 3.4 we continue this topic with diagrams of formulae, due to Rasiowa-Sikorski, being yet another rendering of the Gentzen idea. This last method allows us to prove a weak form of completeness for predicate logic. Section 3.5 continues the discussion of methods in Natural Deduction with the method of tableaux. As with SL, also in predicate logic tableaux allow for proving the basic properties of the logic. This is initiated in Sect. 3.6, in which we introduce Hintikka sets and prove their satisfiability, which in turn proves tableau-completeness of predicate logic in Theorem 3.6. Continuing this line, we prove the Löwenheim-Skolem Theorem 3.8 and the compactness property of predicate logic. Relations between sequents and tableaux are discussed in Sect. 3.7: by means of them we prove the completeness of the sequent calculus. Normal forms are introduced in Sect. 3.8. We introduce the prenex normal form and the Skolem functions. We introduce conjunctive and disjunctive normal forms and we discuss the techniques of skolemization, renaming and unification in obtaining these normal forms. In Sect. 3.9, we prove theorems on the existence of the most general unifier, and on soundness and completeness of predicate resolution. Horn formulae return for predicate logic in Sect. 3.10 along with SLD-resolution and logic programs. We give examples of reasoning in these structures. Section 3.11 begins the discussion of the deepest problems in FO, beginning with undecidability of satisfaction and validity problems.
We apply the fact that the decision problem of membership is undecidable for type 0 grammars and we construct an undecidable formula of predicate logic. We include a classical undecidable problem PCP of Post with a proof by Sipser in Theorem 3.22 and the Church theorem with the proof by Floyd in Theorem 3.19. Section 3.12 mentions complexity issues in a nutshell. In Sect. 3.13, we prove that a specialization of FO, the monadic FO, is decidable. Section 3.14 is devoted to the exposition of the Herbrand theory and the proof of the Herbrand theorem in Theorem 3.29, which reduces the validity problem for predicate logic to the validity problem for SL. Many proofs of completeness for various logics apply the idea of building models as sets of maximal consistent sets of formulae. This pioneering idea of Henkin was applied by him in the proof of the Gödel completeness theorem for FO, and in Sect. 3.15 we include this proof. Henkin proves a generalization of the Löwenheim-Skolem theorem to higher cardinalities. The idea of Henkin will appear in chapters on modal logics and on many-valued logics. We include in Sect. 3.16 the Smullyan proof of the Tarski theorem on unprovability of truth in the arithmetic theory L_E.
The famous Gödel incompleteness theorem is proved, again after Smullyan, in Sect. 3.17. The Rosser incompleteness theorem along with the famous Rosser sentence, both proved under the mere consistency assumption, are included in Sect. 3.18. Section 3.19 takes up the theme of finite models for FO, important for applications. We define finite models with the analysis of the expressive power of FO in mind. We define the basic notions of an isomorphism and a partial isomorphism, m-equivalence and FO-definability. The tool for checking the expressive power of FO is the Ehrenfeucht game, described and applied in examples in Sect. 3.20. We include a proof of the Ehrenfeucht theorem on the equivalence of the existence of a winning strategy in the game G_m for the player called Duplicator with m-equivalence of the structures on which the game is played. This opens up a way to check whether a given class of structures is definable within FO. In Sect. 3.21, we prove a form of the deduction theorem for FO. The Craig interpolation theorem and the Beth definability theorem for FO are discussed and proved in Sect. 3.22. Chapters 2 and 3 close our treatment of classical logic and we enter the realm of logics which introduce new elements, be it new models or new operators. We begin non-classical logics with Chap. 4, which treats modal and intuitionistic logics, of which modal logics are close to classical first-order logic. In Sect. 4.1, we introduce the syntax of modal logics with operators L of necessity and M of possibility along with detachment and necessity as inference rules. We introduce the notion of a normal modal logic and the modal system K as the basic system on which other more complex systems are built. Semantics for modal logics is introduced in Sect. 4.2. We define Kripke structures, which inherit some features of first-order semantics but formalize in a new way domains of structures as sets of worlds related one to another by a relation of accessibility.
We define the notion of satisfaction for modal logics, which shows a dependence of validity of modal formulae on properties of the accessibility relation. For modal logics K, T, B, 4, D, 5, 4C, DC and G, we show the corresponding relation and prove the 1-1 correspondence for pairs logic-relation. We define logics S4 and S5. We focus in our discussion of modal logic on the canonical sequence K, T, S4, S5, occasionally using B and D. In Sect. 4.3, we discuss natural deduction for modal logics in the form of tableaux for logics K, T, S4 and for S5. Section 4.4 opens a discussion of meta-theoretic properties of modal logics. We define Hintikka sets, proving their satisfiability and proving the tableau-completeness of modal logics K, T, S4 in Theorem 4.16. Special treatment of S5, due to the condition of symmetry of the corresponding accessibility relation, follows. The theme of natural deduction is continued in Sect. 4.5 with the sequent calculus. We bring modal sequent rules for K, T and S4. We establish tableau-completeness of the modal sequent calculus by means of the relation between sequent and tableaux calculi.
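The dependence of validity on the accessibility relation can be made concrete with a small Python sketch of satisfaction in a finite Kripke structure. It follows the standard semantics of L (true at a world when the argument holds at all accessible worlds) and M (true when it holds at some accessible world); the tuple encoding of formulae and the names `holds`, `access` and `valuation` are assumptions of this sketch rather than notation from Chap. 4.

```python
from typing import Dict, Set, Tuple, Union

# Assumed encoding: an atom is a string; compound formulae are tagged tuples
# ("not", f), ("and", f, g), ("imp", f, g), ("L", f), ("M", f).
Formula = Union[str, Tuple]


def holds(world: str, formula: Formula,
          access: Dict[str, Set[str]],
          valuation: Dict[str, Set[str]]) -> bool:
    """Decide whether `formula` is satisfied at `world` of a finite Kripke structure."""
    if isinstance(formula, str):
        return formula in valuation.get(world, set())
    op = formula[0]
    if op == "not":
        return not holds(world, formula[1], access, valuation)
    if op == "and":
        return (holds(world, formula[1], access, valuation)
                and holds(world, formula[2], access, valuation))
    if op == "imp":
        return (not holds(world, formula[1], access, valuation)
                or holds(world, formula[2], access, valuation))
    successors = access.get(world, set())
    if op == "L":   # necessity: true at every accessible world
        return all(holds(v, formula[1], access, valuation) for v in successors)
    if op == "M":   # possibility: true at some accessible world
        return any(holds(v, formula[1], access, valuation) for v in successors)
    raise ValueError(f"unknown connective: {op!r}")


# Formula T: Lp -> p, which is tied to reflexivity of the accessibility relation.
T_FORMULA = ("imp", ("L", "p"), "p")
valuation = {"w0": set(), "w1": {"p"}}          # p is true only at w1

non_reflexive = {"w0": {"w1"}, "w1": set()}
reflexive = {"w0": {"w0", "w1"}, "w1": {"w1"}}

print(holds("w0", T_FORMULA, non_reflexive, valuation))
print(all(holds(w, T_FORMULA, reflexive, valuation) for w in ("w0", "w1")))
```

In the non-reflexive structure, Lp holds at w0 while p does not, so the formula T fails there; once reflexive loops are added it holds at both worlds, which illustrates the pairing of the logic T with reflexive accessibility.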
In Sect. 4.6, we discuss notions of provability and consistency for modal logics, maximal consistent sets and the relation of accessibility among them as preparation for the completeness theorem in Sect. 4.7. The proof of completeness is in the Henkin style via maximal consistent sets. Section 4.8 contains a discussion of the notion of filtration, which leads to the proof of decidability for logics K, T and S4 in Theorem 4.34. A further step for S5 requires the notion of bisimulation, which allows for a proof of decidability for S5 in Theorem 4.37. In Sect. 4.9, devoted to satisfiability, we recall the Ladner proof that SAT(S5) is NP-complete. With Sect. 4.10 we begin a discussion of quantified modal logic (QML). We mention the de re and de dicto readings of statements and on this basis we introduce the Barcan formula, Definition 4.36, and the converse Barcan formula, Definition 4.37. We define the notion of satisfaction for QML, which blends elements of FO with Kripke structures. Section 4.11 brings an analysis of tableaux for QML. Here one discerns between constant and variable domains of QML structures. We recall elements of the Fitting approach to either case. For K, we prove tableau-completeness. With Sect. 4.12 we begin an introduction to propositional intuitionistic logic (SIL). This logic is constructed on the ontological insight of L. E. J. Brouwer that truth means provability. This logic models mathematical truth, which requires proofs. SIL, for instance, rejects the law of the excluded middle. Owing to Gödel’s result that SIL can be interpreted within S4, and to the subsequent Kripke models for SIL, S4 serves as a structure for SIL with necessary modifications. In this section, we define the notion of satisfaction for SIL. Tableaux for SIL are introduced in Sect. 4.13 with a proof of satisfaction for Hintikka families and the resulting proof of tableau-completeness for SIL. In Sect. 4.14, we recall the Henkin style proof of strong completeness of SIL.
Section 4.15 opens a discussion of the last logic in Chap. 4: the first-order intuitionistic logic (FOIL). It is set in Kripke structures for S4. We introduce the structure for FOIL, which adds to the structure for S4 elements of first-order theory. This serves as a prelude to Sect. 4.16, in which we prove the completeness of FOIL. Temporal logics, which attempt to define models for time-related events, are often regarded as forms of modality. Chapter 5 opens the second part of the book, in which we meet logics for time, many-valued logics and logics for action and knowledge, which have a strong application appeal. Though built on principles inherited to some extent from first-order and propositional logic, they show distinct principles for their construction. In Chap. 5, we give a description of the temporal logic triad LTL, CTL, CTL*. Their importance stems from their applications in descriptions of system properties and in consequence in model checking. In Sect. 5.1, we recall the logic of tenses due to Arthur Prior, which inspired Pnueli and Kamp to transfer it to computer science as LTL: Linear Temporal Logic. In Sect. 5.2, we discuss syntactic properties of LTL in a few sub-sections. Sub-section 5.2.1 brings a description of the operators of LTL: G (always) and F (eventually), modeled on the universal and existential quantifiers of FO, and additional
new operators X (next) and the binary operator U (until). In sub-section 5.2.2 we define formulae of LTL: Gp, Fp, Xp, pUq. Sub-section 5.2.3 brings a description of a Kripke structure for LTL as the linear set S = {s_i : i ≥ 0} of states endowed with the transition relation modeled as the successor relation, with an assignment s_i → 2^AP for each i, where AP is the set of atomic propositions. For branching time models, Kripke structures for CTL are transition systems, i.e., sets of states S = {s_i : i ≥ 0} with transition relations S → 2^S, sets of initial states I and an assignment L : S → 2^AP, and CTL adds operators A (for all) and E (exists). The need for considering various paths calls for a complex syntax with the formulae set split into state and path formulae. We define satisfaction for states and the global satisfaction for transition systems. We conclude Sect. 5.3 with examples of valid CTL formulae. Section 5.4 is dedicated to Full Computation Tree Logic CTL*, which subsumes LTL and CTL. Its models are transition systems. As for the previous systems LTL and CTL, we discuss in 5.4.1–5.4.4 the syntax, semantics, and satisfaction relation for CTL*. Section 5.5 begins the discussion of the meta-theory of temporal logics. We begin with LTL, for which we prove the basic result about the ultimately periodic structure in Definition 5.16 and we define the Fischer-Ladner closure FLC(φ) in Definition 5.17. We define the notion of consistency and we define maximal consistent sets for FLC(φ) in Definition 5.19. We endow maximal consistent sets with a transition relation, Definition 5.20, and we prove the Sistla-Clarke theorem about satisfiability of a satisfiable formula in the ultimately periodic model in Theorem 5.7. The consequence is decidability of LTL in Theorem 5.8. We conclude with the Sistla-Clarke result in Theorem 5.10 that SAT(LTL) is in PSPACE. We also sketch the proof by Markey that SAT(CTL) is in PTIME in Theorem 5.11. In Sect.
5.6, we discuss the logic LTL+PAST, which adds past operators to LTL. In Sect. 5.7, we begin the description of the model checking problem, beginning with model checking by automata. We give examples of model checking of some system properties. In Sects. 5.8–5.10, we discuss tableaux for LTL and CTL with examples of tableaux and proofs of tableau-completeness for those systems. With regard to the role of automata in model checking, we open in Sect. 5.11 a discussion of automata on infinite words (Büchi automata) in Definition 5.27. We define languages accepted by automata in Definition 5.28 and we include basic results on Büchi automata: constructions of the union, the intersection, and the complement for Büchi automata in Theorems 5.18–5.21 due to Choueka. Decision problems for automata are considered in Sect. 5.12 with the Vardi-Wolper theorem on linear time decidability and NL-completeness of the non-emptiness decision problem in Theorem 5.22. Section 5.13 begins with definitions of alternation in Definition 5.34 and of the structure of labeled trees in Definition 5.35. For alternating automata on labeled trees, which are defined in Definition 5.36, we define runs and acceptance conditions in Definition 5.37. We recall in Theorem 5.23 the result by Miyano-Hayashi about the equivalence of an alternating Büchi automaton on n states with a non-deterministic
Büchi automaton on 2^O(n) states. Section 5.14 is concerned with the Vardi-Wolper theorem on translation of LTL to a non-deterministic Büchi automaton. This requires the notion of the extended closure ECL(φ) in Definition 5.38. Maximal consistent sets of ECL(φ) serve as states for the automaton defined in Definition 5.39. We include the theorem (Kupferman) about the existence, for an LTL formula φ, of an alternating Büchi automaton on O(|φ|) states with the same induced language in Theorem 5.24. In Sect. 5.15, we again address LTL model checking. It is about the problem whether a given Kripke structure satisfies a given formula. We give a proof of the Sistla-Clarke theorem on PSPACE-completeness of the problem by indicating a passage from the Kripke structure to a non-deterministic Büchi automaton of complexity proportional to 2^O(|φ|). Section 5.16 addresses the parallel problem for branching time logics. This time structures are labeled trees and automata are alternating tree automata (ata) (Thomas, Muller-Schupp), see Definition 5.40. Runs of (ata) are defined in Definition 5.41, in Definition 5.42 we define weak alternating automata and in Definition 5.43 we introduce structures of trees in Kripke structures. The model checking problem for CTL is defined in Definition 5.44, and Theorem 5.26 brings the theorem by Kupferman-Vardi-Wolper on the existence of a weak alternating automaton A(φ, Δ) which induces the same language as the set of trees with spread Δ that satisfy the formula φ. Section 5.17 is of a different character as it addresses the method OBDD (Ordered Binary Decision Diagrams) for symbolic model checking. Its essence is in building graphs for the representation of propositional formulae. In Definition 5.45 we present the OBDD methods for representation of sets and transitions in OBDD, and Definition 5.46 treats the notions of images and pre-images, essential in model checking.
In Definition 5.47, we give examples of computation and we include OBDD schemes for LTL and CTL model checking after Bryant and Chaki-Gurfinkel. We return in a sense to classical logics in Chap. 6, which is about many-valued logic, whose existence was proposed by Jan Łukasiewicz in March 1918, followed by a detailed exposition by Łukasiewicz in 1920 and by the Post logics at about that time. We discuss the propositional case. We begin with the Łukasiewicz formulae for negation, implication, and other connectives in Sect. 6.1. In Sect. 6.2, we define auxiliary functions: T-norms and T-co-norms. Their role may be compared to the role of accessibility relations in modal logics. We define the classical T-norms of Łukasiewicz, Goguen, and Gödel, along with the corresponding T-co-norms; these T-norms later define the corresponding logics. T-norms induce residua which serve as implications for the respective logics in Definition 6.4. In the end, we recall all basic connectives for the three T-norms in Theorem 6.1. The interesting idea due to Hájek of studying the many-valued logic (BL-logic) introduced by the smallest T-norm, which would inherit properties common to all T-norms, is presented in Sect. 6.3. We reproduce axiom schemes for BL and give a list of selected valid formulae. The deduction theorem for BL is proved in Theorem 6.4. An algebraic proof of completeness of BL is reproduced in Sect. 6.4. In Sect. 6.5, we embark on the theory of the 3-valued logic of Łukasiewicz. We begin with axiom schemes due to Wajsberg (Sect. 6.5, introduction) and we proceed with the completeness proof by Goldberg-Leblanc-Weaver on the lines of the Henkin method of
maximal consistent sets. We list the valid formulae of $3_L$ established by Wajsberg that are necessary for the proof. We recapitulate the Wajsberg deduction theorem in Theorem 6.10 and we prove in Theorem 6.12 the strong completeness property of the Łukasiewicz 3-valued logic. In Sect. 6.6, we briefly recall some other 3- or 4-valued logics: the Kleene logic $3_K$, the Bochvar logics $3_{B_E}$ and $3_{B_I}$, and the 4-valued modal logic $4_{L_M}$ of Łukasiewicz. In Sect. 6.7, we treat the n-valued logics for n > 2. The scheme we present is due to Rosser-Turquette. In the beginning, we outline their convention of inverted truth values and in Definition 6.21 we list their axiom schemes. The completeness theorem for the logic $n_L$ is proved in Theorem 6.14. The very short Sect. 6.8 defines Post logics and Sect. 6.9 brings us to the core of the chapter: the infinite-valued Łukasiewicz logic $[0, 1]_L$. We begin with definitions and properties of strict and Archimedean T-norms in Definition 6.23 and in Theorem 6.15, and we prove the Ling theorem on the Hilbert style representations of Archimedean T-norms in Theorem 6.17. We define residua and prove their properties, and we conclude with the Menu-Pavelka theorem on the Łukasiewicz T-norm in Theorem 6.24. Section 6.10 begins our discussion of the logic $[0, 1]_L$ of Łukasiewicz. We recall its syntax and we formulate the axioms of $[0, 1]_L$ due to Łukasiewicz. We insert a comparison (Hájek) of BL to $[0, 1]_L$. Proofs of basic valid formulae of $[0, 1]_L$ are collected in Theorem 6.26. We are on the threshold of the completeness proof of $[0, 1]_L$, which is algebraic, and this compels us to study the relevant algebraic structures. We begin with Sect. 6.11 about Wajsberg algebras, an algebraic rendition of the Wajsberg axiom schemes. They are defined in Definition 6.27 and in Theorem 6.27 the principal valid formulae are proved. MV-algebras, the algebras crucial for the proof of completeness, are presented in Sect. 6.12. Ideals in MV-algebras are discussed in Sect. 6.13.
Theorem 6.34 brings the proof of the separation theorem. The Chang representation theorem is proved in Sect. 6.14. Steps to it are as follows: the notion of a sub-direct product in Definition 6.33, a criterion for sub-directness in Theorem 6.37, and the Chang theorem that each MV-algebra is a sub-direct product of MV-chains in Theorem 6.38. The Chang completeness theorem is at the core of the completeness proof for the logic $[0, 1]_L$. It asserts that an equation t = 0 holds true in every MV-algebra if and only if it is true in the algebra over [0, 1] (called the Łukasiewicz residuated lattice). This technical proof is demonstrated in Sect. 6.15 in the version due to Cignoli-Mundici. Steps to the completeness proof for the infinite-valued logic of Łukasiewicz are included in Sects. 6.16, 6.17. In Sect. 6.16, we recall the Łukasiewicz residuated lattice and we show that Wajsberg algebras and MV-algebras can be represented in each other. By means of the Wajsberg algebra structure in MV-algebras, we define MV-formulae and MV-terms. The completeness theorem for $[0, 1]_L$ is finally proved in Sect. 6.17. The algebraic proof is based on the Chang completeness theorem and the algebraic prerequisites of Sects. 6.16 and 6.17. Infinite-valued logics of Goguen and Gödel are briefly commented upon in Sect. 6.18. Their peculiar features are highlighted.
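As a small illustration of the three classical T-norms of Łukasiewicz, Goguen, and Gödel and of the residua that serve as implications in the respective logics, consider the Python sketch below. The formulas are the standard textbook ones (the Goguen T-norm is taken here to be the product T-norm); the function names are our own, and exact rational arithmetic via `Fraction` is an implementation choice, not something prescribed by the book.

```python
from fractions import Fraction as F

# Each T-norm T comes with its residuum x -> y = max{z : T(x, z) <= y}.


def luk_tnorm(x, y):
    return max(F(0), x + y - 1)


def luk_residuum(x, y):
    return min(F(1), 1 - x + y)


def goguen_tnorm(x, y):
    return x * y


def goguen_residuum(x, y):
    return F(1) if x <= y else y / x


def godel_tnorm(x, y):
    return min(x, y)


def godel_residuum(x, y):
    return F(1) if x <= y else y


def adjoint_on_grid(tnorm, residuum, steps=4):
    """Check T(x, z) <= y  iff  z <= (x -> y) on the grid {0, 1/steps, ..., 1}."""
    grid = [F(k, steps) for k in range(steps + 1)]
    return all((tnorm(x, z) <= y) == (z <= residuum(x, y))
               for x in grid for y in grid for z in grid)


x, y = F(3, 4), F(1, 2)
print(luk_tnorm(x, y), luk_residuum(x, y))        # 1/4 3/4
print(goguen_tnorm(x, y), goguen_residuum(x, y))  # 3/8 2/3
print(godel_tnorm(x, y), godel_residuum(x, y))    # 1/2 1/2
```

The helper `adjoint_on_grid` tests the adjointness property that characterizes a residuum, but only on a finite grid of truth values; it is a sanity check, not a proof.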
The complexity issue is discussed in Sect. 6.19. We consider satisfiability problems SAT(Goguen), SAT(Gödel), SAT(Luk) for the respective logics. We recall in Theorem 6.47 Hájek’s proof that SAT, SAT(Goguen), and SAT(Gödel) are equivalent, hence all are NP-complete. The problem of the complexity of SAT(Luk) is much harder. It seems that one cannot avoid a functional approach. We begin the exposition of a proof due to Mundici which employs McNaughton polynomial functions, and Theorem 6.48 formulates the McNaughton theorem about the equivalence between formulae of $[0, 1]_L$ and McNaughton functions. We recall McNaughton functions for the connectives of $[0, 1]_L$. In Definition 6.43, we recall the formula $\zeta_{n,t}$ of Mundici which embeds propositional logic into $[0, 1]_L$ by means of the equivalence: φ is valid in propositional logic if and only if $\zeta_{n,t} \supset \varphi$ is valid in $[0, 1]_L$. This allows for a proof that SAT(Luk) is NP-hard in Theorem 6.50. As all coefficients of McNaughton functions are integers, and all functions define regions which cover the cube $[0, 1]^n$, by connectedness two distinct functions either have disjoint regions or their regions have a common border defined by a set of algebraic equations with rational coefficients. This is exploited in the proof by Mundici in which it is shown that for the McNaughton function $f_\varphi$ corresponding to a satisfiable formula φ of $[0, 1]_L$ there exists a rational point $b \in [0, 1]^n$. Its estimate is obtained via the Hadamard determinant theorem for the set of linear equations defining b. A polynomial guessing procedure for b shows that SAT(Luk) is in NP. Chapter 7 leads us into a few areas of logic related to logics for programs and actions, logics for episteme, and logics related to decision tables which describe in logical terms the notion of functional dependence among groups of features and exhibit the modal character of relations hidden in data tables.
A separate area is Boolean reasoning applied to data tables with the aim of minimizing the set of features or of discretizing continuously valued features. The last topic is a logic of approximate concepts couched in terms of approximate mereology, an extension of the mereological theory of concepts. In Sect. 7.1, we describe the propositional dynamic logic (PDL) in its regular variant. We define the syntax and semantics of PDL along with the axiomatization due to Segerberg in Definition 7.4. We introduce the Fischer-Ladner closure and in this framework we render the Kozen-Parikh proof of completeness of PDL in Theorem 7.4. We discuss filtration for PDL in Definition 7.8 and we prove the small model existence for PDL after Fischer-Ladner in Theorem 7.6. We conclude our review of PDL with the result by Fischer-Ladner that satisfiability of a formula φ of PDL can be checked in $NTIME(c^{|\varphi|})$. Section 7.2 returns us to modal logics as we define epistemic logics which are clones of modal ones. We mention epistemic logics EX for X = K, T, S4, S5, D for one reasoner, and then we discuss the case of n agents. En passant, we discuss the epistemic-doxastic logics after Hintikka and Lenzen. The case of n agents is new: here we have greater combinatorial freedom. The simple case is when we regard each agent as a separate individual reasoning unit, hence, each agent owns a separate
accessibility relation. Under this assumption we define epistemic logics $EX_n$. We prove completeness of logics $EX_n$ (2.16–2.24) with proofs in the Henkin style by applying maximal consistent sets of formulae in Theorems 7.17 and 7.18. One more method of generating knowledge for a group of agents is the group knowledge, where knowledge is expressed by the knowledge operator $E\varphi = \bigwedge_i K_i \varphi$, where $K_i$ is the knowledge operator of the i-th agent. In this way, we secure that all agents know each formula φ. Finally, we may secure that each agent knows that each agent knows that ..., i.e., we construct the transitive closure $TC(E) = \bigcup_{k \geq 0} E^k$ and we let $C\varphi = \bigwedge_{k \geq 0} E^k \varphi$ as the common knowledge operator. These constructs are described in Definition 7.22. The satisfaction relation for E and C is defined in Definition 7.23 and in Theorem 7.20 we introduce an additional inference rule involving the common knowledge operator C. In this framework, we render the proof of completeness of logics $ECX_n$ in Theorem 7.21. In Theorems 7.22 and 7.23, we include results by Halpern-Moses about the small model existence for logics $ECX_n$ as well as results on the complexity of these logics. Section 7.3 departs from the two previous sections. It invokes the theory of mereology (Leśniewski), i.e., a theory of concepts, i.e., sets of things, whose primitive operator is expressed as the predicate of a part. This theory has roots in Aristotle and in medieval logics; nevertheless, it is a fruitful tool in describing relations between concepts in a sense close to modern topology as far as it does not enter into the contents of sets. We define basic notions of mereology in Definitions 7.21–7.30, which include the notions of a part, of an ingredient, and of the overlap relation, and the inference rule of mereology. In particular, Definition 7.29 introduces the notion of a class, a mereological counterpart to the notion of the union of sets. Definition 7.30 brings the axiom about the class existence.
The universal class V is defined in Definition 7.31 and it allows for definitions of the relative complement and the complement in Definitions 7.35, 7.36. The Tarski complete Boolean mereological algebra, endowed with the Tarski fusion operations of addition, multiplication and negation, with the class V as the unit and without the null element, is defined in Definition 7.37. In Definition 7.38, we introduce the mereological implication. In Sect. 7.4, we develop the theory of rough mereology whose primitive predicate is the part to a degree. We introduce into the mereological universe the notion of a mass assignment axiomatized in Definition 7.40. Theorem 7.33 gathers provable consequences of this scheme. We introduce the notion of a part to a degree by means of the relation of rough inclusion. Rough inclusions have features of fuzzy membership functions and their theory parallels the infinite-valued logic of Łukasiewicz. We include provable formulae of this theory in Theorem 7.34. We conclude with the counterpart of the Stone representation theorem for the Tarski Boolean algebra and with its consequence, the compactness theorem for mereological spaces, in Theorem 7.39. We return to problems of logical models of knowledge in Sect. 7.5. We consider data tables, formally known as information systems, and we introduce the notion of indiscernibility in the sense of Leibniz in Definition 7.45. Classes of indiscernibility constitute the smallest definable concepts. The important notion of a rough concept (set)
as a non-definable set appears in Definition 7.47. The basic notion of a functional dependence between sets of features of things is defined in Definition 7.47 in which we also define approximations by definable concepts to non-definable ones. After this introduction, Sect. 7.6 brings the rendering of the Rauszer FD-logic of functional dependence set in the framework of the sequent calculus. An algebraic proof of completeness of this logic fills the rest of this section. Boolean reasoning which goes back to Boole and owes its rediscovery to Blake and Quine is a very useful method of proofs of existence of various constructs of Data Science. Such constructs are, e.g., reducts, i.e., minimal sets of features which preserve knowledge induced by the whole of features, both in information and decision systems. In Definitions 7.60 and 7.61, we recall algorithms for detection of reducts based on Boolean reasoning. We add a discussion of algorithms for finding minimal decision rules and optimal cuts in discretization of continuous features in Definitions 7.62 and 7.66. Chapter 7 proceeds with Sect. 7.8 about Information Logic (IL) which studies relations that emerge in generalized information systems which allow for multivalued features. Relations of indiscernibility I, tolerance T, and inclusion C along with the notion of the deterministic set D are defined in Definition 7.67. Theorem 7.52 lists dependencies among those relations. Relations I, T, C give rise to modal operators [I], [T], [C]. Axiom schemes for IL are given in Definition 7.69, and an algebraic proof of completeness of IL closes Sect. 7.8. Chapter 7 comes to a close with Sect. 7.9, a short account of Pavelka’s scheme for propositional fuzzy logic. The huge bulk of very interesting and deep results in the realm between FO and second-order logic SO, especially in the theory of finite models for FO deserves to be addressed and this is done in Chap. 8. 
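The notions of indiscernibility, functional dependence, and reducts recalled above for information systems can be illustrated on a toy data table. The Python sketch below is only an illustration under our own assumptions: the table `TABLE`, the function names, and the brute-force search over subsets of features are hypothetical and stand in for, but do not reproduce, the Boolean reasoning algorithms of Definitions 7.60 and 7.61.

```python
from itertools import combinations


def signature(row, attrs):
    """Values of the chosen features, in a fixed order."""
    return tuple(row[a] for a in sorted(attrs))


def ind_classes(table, attrs):
    """Partition of objects into classes indiscernible with respect to attrs."""
    classes = {}
    for obj, row in table.items():
        classes.setdefault(signature(row, attrs), set()).add(obj)
    return {frozenset(c) for c in classes.values()}


def depends(table, determining, determined):
    """Functional dependence: agreement on `determining` forces agreement on `determined`."""
    seen = {}
    for row in table.values():
        key, val = signature(row, determining), signature(row, determined)
        if seen.setdefault(key, val) != val:
            return False
    return True


def reducts(table, attrs):
    """Minimal feature sets inducing the same partition as all of attrs (brute force)."""
    full = ind_classes(table, attrs)
    found = []
    for size in range(1, len(attrs) + 1):
        for cand in combinations(sorted(attrs), size):
            subset = frozenset(cand)
            if any(r <= subset for r in found):
                continue  # a smaller reduct is contained in it: not minimal
            if ind_classes(table, subset) == full:
                found.append(subset)
    return found


# Toy information system (hypothetical): feature c duplicates feature a.
TABLE = {
    "u1": {"a": 0, "b": 0, "c": 0},
    "u2": {"a": 0, "b": 1, "c": 0},
    "u3": {"a": 1, "b": 0, "c": 1},
    "u4": {"a": 1, "b": 0, "c": 1},
}
print(ind_classes(TABLE, {"a", "b", "c"}))
print(depends(TABLE, {"a"}, {"c"}), depends(TABLE, {"b"}, {"a"}))
print(reducts(TABLE, {"a", "b", "c"}))
```

Here the objects u3 and u4 are indiscernible, feature c depends functionally on feature a, and the two reducts are {a, b} and {b, c}. The exhaustive search is exponential in the number of features and is meant for small examples only.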
All logics considered up to this point have been modeled on FO or SL. The addition to FO of quantifiers acting on concepts yields the second-order logic SO. Though SO is too complex for analysis parallel to that for FO yet some fragments are susceptible for analysis like monadic SO (MSO). The other way to transcend limitations of FO is to endow FO with inductive definitions which locates such logics midway between FO and SO and opens a way to relate logics to complexity problems in Descriptive Complexity and to problems of Model Checking. In Chap. 8, we give a review of the resulting logics with some basic facts about FO, MSO, and SO. We begin our account with a recapitulation of sections of Chap. 3 covering the expressive power of FO notably with Ehrenfeucht’s games and Ehrenfeucht-Fraïsse’s theorem in Definitions 8.1–8.5 and in Theorem 8.1. We prove the Hanf locality theorem. We begin with the notion of a Gaifman graph in Definition 8.6 and we recall the metric structure of a Gaifman graph in Definition 8.7. In Definition 8.8, we recall the notion of m-equivalence. A sufficient criterion for m-equivalence is given in the Hanf Theorem 8.5. The counterpart of the Hanf theorem for finite structures was proposed by Fagin, Stockmeyer, and Vardi and we state it and enclose a proof in Theorem 8.6. These authors also simplified their result to an elegant criterion stated in Theorem 8.7.
In Sect. 8.2, we introduce the syntax and the semantics of the second-order logic SO. In Definition 8.11, we define the monadic second-order logic MSO. In Definitions 8.12–8.15 we introduce Ehrenfeucht games for MSO and in Theorem 8.10 we enclose the proof that the class EVEN is not definable in ∃MSO. Theorem 8.12 due to Kanellakis states that REACH is expressible in ∃MSO for undirected finite graphs and we include an argument by Ajtai and Fagin. We define Ajtai-Fagin games for ∃MSO in Definition 8.17 and Theorem 8.13 states the equivalence of the winning strategy by Duplicator to non-definability in ∃MSO with a proof due to Immerman. As an example, we include the proof by Arora and Fagin that dREACH is not definable in ∃MSO. Section 8.4 is dedicated to the status of strings in MSO. We begin with Definition 8.23 of languages induced by formulae and in Theorem 8.15 we include the proof by Ladner of the fundamental theorem due to Büchi that a language is MSO-definable if and only if it is regular. A contrasting theorem due to McNaughton-Papert that a language is FO-definable if and only if it is regular star-free is given with a proof by Ebbinghaus-Flum in Theorem 8.16. Section 8.5 contains two fundamental theorems due to, respectively, Trakhtenbrot and Fagin. The Trakhtenbrot theorem on undecidability of the finite satisfiability problem over vocabularies containing a binary predicate is proved in Theorem 8.17. The Fagin theorem that ∃SO = NP is stated and proved in Theorem 8.18. With Sect. 8.6, we open the vast area of augmentations of FO by additional constructs which lead to logics stronger than FO and subsumed by SO. We consider in Sect. 8.6 FO with inductive definitions; an example of such a definition is the Transitive Closure TC, which is non-definable in FO. In Definition 8.24, we return to the Knaster-Tarski fixed point theorem (see Chap. 1) in order to introduce the fixed point operators LFP and GFP and the logic LFP = (FO+LFP+GFP).
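The transitive closure mentioned above is a convenient example of an inductive definition: it is the least fixed point of a monotone operator on sets of pairs. The Python sketch below iterates such an operator from the empty set until it stabilizes, which on a finite structure yields the least fixed point guaranteed by the Knaster-Tarski theorem. The names `lfp`, `tc_operator`, and the sample edge set `EDGES` are our own and purely illustrative.

```python
def lfp(operator):
    """Least fixed point of a monotone operator on subsets of a finite set,
    obtained by iterating from the empty set until nothing changes."""
    current = frozenset()
    while True:
        nxt = frozenset(operator(current))
        if nxt == current:
            return current
        current = nxt


def tc_operator(edges):
    """Operator F(X) = E united with {(a, c) : (a, b) in X and (b, c) in E}."""
    def step(pairs):
        extended = {(a, c) for (a, b) in pairs for (b2, c) in edges if b == b2}
        return set(edges) | extended
    return step


# Sample directed graph (hypothetical): a path 1 -> 2 -> 3 -> 4.
EDGES = {(1, 2), (2, 3), (3, 4)}
closure = lfp(tc_operator(EDGES))
print(sorted(closure))
```

The operator is monotone because enlarging X can only enlarge the set of derived pairs, so the stages of the iteration form an increasing chain that must stop on a finite domain.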
The theorem, due independently to Immerman and Vardi, that LFP = PTIME is stated in Theorem 8.20 with a proof by Immerman. We notice that LFP ⊆ ∃MSO. Inductive fixed points IFP and inflationary fixed points, introduced by Gurevich and Shelah, are defined in Definitions 8.28 and 8.29. Finally, in Definition 8.30 partial fixed points PFP are defined. Theorem 8.24 by Kreutzer that IFP ≡ LFP simplifies the landscape of fixed points. Section 8.6 concludes with Theorem 8.25, due independently to Immerman and Vardi, which states that on ordered finite structures IFP = LFP = PTIME, and with Theorem 8.26, due to Vardi, which establishes that PFP = PSPACE. Up to now, FO and fixed point logics have been unable to count, as witnessed by EVEN. The logics with counting introduced by Immerman and Lander make up for this deficiency. The logic (FO+Count) is defined in Definitions 8.31–8.33. Infinitary connectives are defined in Definition 8.34 and their addition leads to the logic $(FO+Count)_{inf}$ in Definition 8.35. A further step is the logic in Definition 8.36. Its strength is curtailed by the restriction to formulae of a finite rank (Definition 8.37) in the logic $(FO+Count)^{*}_{inf,\omega}$ (Libkin) in Definition 8.38. Bijective Ehrenfeucht games (Hella, Libkin) are introduced in Definition 8.39. In Theorem 8.27, we prove that the winning strategy by Duplicator on structures A, B is equivalent to the agreement of those structures on closed formulae of $(FO+Count)_{inf,\omega}$
of rank m. Logics (FO+TC) and (FO+DTC) due to Immerman are described in Sect. 8.8 along with the results, also due to Immerman, that (FO+TC) = NL and (FO+DTC) = L. Section 8.9 introduces yet another game in Definition 8.42, the Pebble game (Immerman, Poizat) in Definition 8.43, yet another variant of the basic idea of the Ehrenfeucht game with the partial isomorphism of obtained structures as the winning criterion for Duplicator. We notice that Pebble games characterize logics with finitely many variables (Barwise). Definability versus locality is studied in Sect. 8.10 with locality in the sense of Hanf in Definition 8.50 and locality in the sense of Gaifman in Definition 8.52. Hanf-based and Gaifman-based criteria for definability are stated in Theorems 8.32 and 8.33, respectively. Relations between the two localities are explained in Corollary 8.4: each Hanf-local Query for m > 0 is Gaifman-local, and in Theorem 8.36: each FO-definable Query is Hanf-local. Given a collection $M_n$ of structures with n-element domains and an $M_n$ closed formula φ, we define the estimate $\mu_n(\varphi)$ of the probability that a randomly selected structure in $M_n$ will satisfy φ. The limit of $\mu_n(\varphi)$ at ∞ is denoted $\mu(\varphi)$. A language L obeys the 0-1 law if and only if for each φ in L, either $\mu(\varphi) = 0$ or $\mu(\varphi) = 1$. In Sect. 8.11, we give the proof by Fagin that FO obeys the 0-1 law. To this end, we define Gaifman’s extension formulae in Definition 8.57, and we consider their set $T_G$. For completeness’ sake, we introduce the Łoś-Vaught test in Theorem 8.38 and we show that $T_G$ is complete save a proof of categoricity, to be given in Sect. 8.12. Theorem 8.39 states and proves the Fagin theorem that if $T_G \models \varphi$, then $\mu(\varphi) = 1$. The conclusion is Theorem 8.41 by the same author that FO obeys the 0-1 law. The predicate BIT and the Random Graph are defined in Sect. 8.12. We include a proof that $T_G$ is categorical in ω in Theorem 8.43. In preparation for the Parity game, we define games on graphs in Sect. 8.13.
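The quantity $\mu_n(\varphi)$ from the discussion of the 0-1 law can be computed exactly for very small n by exhaustive enumeration. The Python sketch below does so under assumptions of our own: $M_n$ is taken to be the set of all undirected loop-free graphs on the vertices 0, ..., n-1 with the uniform distribution, and the sample sentence is "there exists an isolated vertex", i.e., ∃x ∀y ¬E(x, y). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def all_graphs(n):
    """Yield the edge set of every undirected loop-free graph on {0, ..., n-1}."""
    pairs = list(combinations(range(n), 2))
    for bits in product([False, True], repeat=len(pairs)):
        yield {pair for pair, present in zip(pairs, bits) if present}


def adjacent(edges, u, v):
    return (min(u, v), max(u, v)) in edges


def has_isolated_vertex(n, edges):
    """Sample FO sentence: some vertex x is adjacent to no other vertex y."""
    return any(all(not adjacent(edges, x, y) for y in range(n) if y != x)
               for x in range(n))


def mu(n, sentence):
    """Exact fraction of graphs on n vertices that satisfy the sentence."""
    total = satisfied = 0
    for edges in all_graphs(n):
        total += 1
        satisfied += bool(sentence(n, edges))
    return Fraction(satisfied, total)


for n in range(1, 6):
    print(n, mu(n, has_isolated_vertex))
```

The enumeration visits $2^{n(n-1)/2}$ graphs, so it is feasible only for a handful of vertices; it illustrates the definition of $\mu_n$ and is no substitute for the proof of the 0-1 law.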
The architecture of a game is defined in Definition 8.60, winning strategies and positions are defined in Definition 8.62, and the Query GAME is defined in Definition 8.63. The Parity game is defined in Definition 8.64 and Theorem 8.44 states the result by Emerson-Jutla and, independently, by Mostowski, about positional determinacy. The modal μ-calculus $L_\mu$, important for applications in model checking, is introduced in Sect. 8.14. Kripke models are here labeled transition systems and we define formulae of $L_\mu$ in Definition 8.65; Definition 8.66 brings the satisfaction rules. Formulae of $L_\mu$ are sometimes difficult to disentangle, and we give some examples of them. Operators μ.X, ν.X of $L_\mu$ are in fact LFP and GFP, respectively, and we use definitions of these fixed points to define the unfolding of μ.X and ν.X in Definition 8.67. Well-named formulae and alternation depth are defined in Definition 8.68. Model checking by means of the μ-calculus is considered in Sect. 8.15, in which we define parity games in Definition 8.69 with rules stating the moves by players E and A. Also defined are priority assignments. The discussion culminates in the theorem, which we state and prove after Bradfield and Walukiewicz, that a transition system T at a state s satisfies a formula φ if and only if the player E has a winning strategy from the position (s, φ).
The last Sect. 8.16 is dedicated to DATALOG. We define DATALOG programs in Definition 8.71 and we show that the outcome is a simultaneous fixed point $LFP_{sim}$. We prove the Bekić lemma that $LFP_{sim}$ is equivalent to a single LFP. Each chapter is provided with a list of problems. Our philosophy in the selection of problems is not very original: we borrow them mostly from research papers in the belief that they fully reflect the important additional results. The problems, about 150 in total, span Chaps. 2–8. As problems are provided with links to the respective works, the Reader will have the opportunity to consult original sources. The full list of references counts about 270 positions, providing the reader with links to influential and often breakthrough works. The debt and pleasure of the author is to thank those who helped. We owe our gratitude to many though we cannot mention all. We are grateful to Prof. Maria Semeniuk-Polkowska for Her patience during many months of work on the book and for Her unrelenting support. To Prof. Andrzej Skowron we offer thanks for His friendship and for the scientific cooperation which lasted for years as well as for His kind willingness to provide the Foreword to our book. Author’s thanks and gratitude go to Prof. Janusz Kacprzyk, a Series Editor, for His support of our books and we send thanks to Prof. Lakhmi C. Jain, a Series Editor, for kind acceptance of this book. Ms Grażyna Domańska-Żurek provided invaluable expertise in LaTeX and expertly set the book. Let us be allowed to remember the late Profs. Helena Rasiowa and Zdzisław Pawlak who were very kind to offer their invaluable support.
We want to express our gratitude to the team at Springer Nature: Thomas Ditzinger, Syvia Scheider and Sabine Schmitt for their support of the book project and to Varsha Prabakar and Jegadeeswari Diravidamani at Springer Nature Chennai for their professional involvement in various stages of book production and readiness to help. The names of Adolf Lindenbaum, Gerhard Gentzen and Mordechai Wajsberg are often mentioned in this text. We would like to dedicate this book to their memories as representatives of the legion of Polish and other European logicians and mathematicians and scientists from all venues of science who perished or suffered during World War II and other wars, often in unknown circumstances. Warsaw, Poland March–April 2023
Lech T. Polkowski
Contents
1 Introduction: Prerequisites
  1.1 Set Theory Recapitulated
    1.1.1 ZFC Set Theory. Basic Constructs
    1.1.2 Equipollence and well-Ordered Sets. Cardinal and Ordinal Numbers
    1.1.3 Graph Structures
  1.2 Rewriting Systems
  1.3 Computability
  1.4 Arithmetization of Turing Machines
  1.5 Recursively Enumerable Sets
  1.6 Undecidability
  1.7 Incompleteness. Non-provability of Truth
  1.8 Complexity
  1.9 Algebraic Structures
  1.10 Topological Structures
  References
2 Sentential Logic (SL)
  2.1 A Bit of History
  2.2 Syntax of Sentential Logic
  2.3 Semantics of Sentential Logic
  2.4 Normal Forms
  2.5 Sentential Logic as an Axiomatic Deductive System
  2.6 Natural Deduction
  2.7 Natural Deduction: Decomposition Diagrams
  2.8 Tableaux
  2.9 Meta-Theory of Sentential Logic. Part I
  2.10 Meta-Theory of Sentential Logic. Part II
  2.11 Resolution, Logical Consequence
  2.12 Horn Clauses. Forward and Backward Chaining
  2.13 Satisfiability, Validity and Complexity in Sentential Logic. Remarks on SAT Solvers
  2.14 Physical Realization of SL. Logic Circuits, Threshold Logic
  2.15 Problems
  References
3 Rudiments of First-Order Logic (FO)
  3.1 Introduction to Syntax of FO
  3.2 Introduction to Semantics of FO
  3.3 Natural Deduction: Sequents
  3.4 Natural Deduction: Diagrams of Formulae
  3.5 Natural Deduction: Tableaux
  3.6 Meta-Theory of Predicate Logic. Part I
    3.6.1 Tableau-Completeness of Predicate Logic
  3.7 Analytic Tableaux Versus Sequents
  3.8 Normal Forms
  3.9 Resolution in Predicate Calculus
  3.10 Horn Clauses and SLD-Resolution
  3.11 Meta-Theory of FO. Part II
    3.11.1 Undecidability of Satisfiability and Validity Decision Problems
  3.12 Complexity Issues for Predicate Logic
  3.13 Monadic Predicate Logic
  3.14 The Theory of Herbrand
  3.15 The Gödel Completeness Theorem. The Henkin Proof
  3.16 The Tarski Theorem on Inexpressibility of Truth
  3.17 Gödel Incompleteness Theorems
  3.18 The Rosser Incompleteness Theorem. The Rosser Sentence
  3.19 Finite FO Structures
  3.20 Ehrenfeucht Games
  3.21 The General form of Deduction Theorem for FO
  3.22 Craig’s Interpolation Theorem. Beth’s Definability Theorem
  3.23 Problems
  References
4 Modal and Intuitionistic Logics
  4.1 Sentential Modal Logic (SML)
  4.2 Semantics of the Modal System K
  4.3 Natural Deduction: Analytic Tableaux for Modal Logics
  4.4 Meta-Theory of Modal Logics: Part I: K,T,S4
  4.5 Natural Deduction: Modal Sequent Calculus
4.6 Meta-Theory of Modal Logics. Part II
4.7 Model Existence Theorem and Strong Completeness
4.8 Small Model Property, Decidability
4.9 Satisfiability, Complexity
4.10 Quantified Modal Logics (QML)
4.11 Natural Deduction: Tableaux for Quantified Modal Logics
4.12 Sentential Intuitionistic Logic (SIL)
4.13 Natural Deduction: Tableaux for SIL
4.14 Consistency of Intuitionistic Sentential Logic
4.15 First-Order Intuitionistic Logic (FOIL)
4.16 Natural Deduction: FOIL-Tableaux
4.17 Problems
References
5 Temporal Logics for Linear and Branching Time and Model Checking
5.1 Temporal Logic of Prior
5.2 Linear Temporal Logic (LTL)
5.3 Computational Tree Logic (CTL)
5.4 Full Computational Tree Logic (CTL*)
5.5 Meta-theory of Temporal Logics
5.6 Linear Time Logic with PAST (LTL+PAST)
5.7 Properties of Systems, Model Checking by Means of Automata
5.8 Natural Deduction: Tableaux for LTL
5.9 Tableaux Construction for Linear Temporal Logic
5.10 Tableaux for Computational Tree Logic CTL
5.11 Non-deterministic Automata on Infinite Words
5.12 Decision Problems for Automata
5.13 Alternation. Alternating Automata on Infinite Words
5.14 From LTL to Büchi Automata on Infinite Words
5.15 LTL Model Checking
5.16 Model Checking for Branching-time Logics
5.17 Symbolic Model Checking: OBDD (Ordered Binary Decision Diagram)
5.18 Problems
References
6 Finitely and Infinitely Valued Logics
6.1 3-Valued Logic of Łukasiewicz (3 L). Introduction
6.2 T-Norms, T-Co-norms
6.3 Basic Logic (BL)
6.4 Meta-Theory of BL
6.5 Meta-Theory of 3 L
6.6 Some Other Proposals for 3, 4-Valued Logics
6.7 The n-Valued Logic n L: The Rosser-Turquette Theory
6.8 The Post Logics
6.9 Infinite-Valued Logics
6.10 Infinite-Valued Logic [0,1] L of Łukasiewicz
6.11 Wajsberg Algebras
6.12 MV-Algebras
6.13 Ideals in MV-Algebras
6.14 The Chang Representation Theorem
6.15 The Chang Completeness Theorem
6.16 Wajsberg Algebras, MV-Algebras and the Łukasiewicz [0,1] L Logic
6.17 The Completeness Theorem for Infinite-Valued Sentential Logic [0,1] L
6.18 Remarks on Goguen and Gödel Infinite-Valued Logics
6.19 Complexity of Satisfiability Decision Problem for Infinite-Valued Logics
6.20 Problems
References
7 Logics for Programs and Knowledge
7.1 Sentential Dynamic Logic (SDL)
7.2 Epistemic Logics
7.3 Mereology Based Logic for Granular Computing and Fuzzy Reasoning
7.4 Rough Mereology. Rough Inclusions
7.5 Knowledge as an Ability to Classify. Information/Decision Systems
7.6 The Logic of Functional Dependence (FD-Logic)
7.7 Boolean Reasoning in Data Analysis
7.8 Information Logic (IL)
7.9 Pavelka’s Fuzzy Sentential Logic
7.10 Problems
References
8 Beyond FO Within SO
8.1 Introduction: Expressive Power of FO. Recapitulation
8.2 Syntax and Semantics of Second-Order Logic (SO)
8.3 Graph Structures in MSO
8.4 Strings in FO and MSO
8.5 Theorems of Trakhtenbrot and Fagin
8.6 FO+Inductive Definitions. Fixed Point Logics
8.7 Logics with Counting (FO+Count)
8.8 FO+Transitive Closure (FO+TC)
8.9 Logics with a Finite Number of Variables
8.10 Definability Versus Locality
8.11 0-1 Laws
8.12 BIT Relational Symbol. Random Graph
8.13 Games on Graphs and Transition Systems
8.14 Modal μ-Calculus (Lμ)
8.15 μ-Calculus Model Checking
8.16 DATALOG
8.17 Problems
References
Author Index
Index
List of Figures
Fig. 1.1 Automaton for concatenation
Fig. 2.1 Formation tree
Fig. 2.2 Diagram of a formula
Fig. 2.3 Forms of decomposition
Fig. 2.4 A closed tableau
Fig. 2.5 An open tableau
Fig. 2.6 Logical gates
Fig. 2.7 A logical circuit
Fig. 2.8 The artificial neuron
Fig. 3.1 A formation tree
Fig. 3.2 A diagram of decomposition
Fig. 3.3 Types of decomposition
Fig. 3.4 A closed predicate tableau
Fig. 3.5 An open predicate tableau
Fig. 3.6 The pattern for SLD resolution
Fig. 3.7 The resolution tree for the Example 3.9
Fig. 3.8 Connectedness example
Fig. 4.1 A modal closed K-tableau
Fig. 4.2 A modal closed T-tableau
Fig. 4.3 A modal closed 4-tableau
Fig. 5.1 Sentential decomposition forms: A reminder
Fig. 5.2 Decomposition forms for LTL
Fig. 5.3 The initial LTL-tableau
Fig. 5.4 The final LTL-tableau
Fig. 5.5 Decomposition forms for CTL
Fig. 5.6 The initial CTL-tableau
Fig. 5.7 The final CTL-tableau
Fig. 5.8 BDD(φ)
Fig. 5.9 Kripke’s structure K
Fig. 8.1 Graphs G 1, G 2
List of Tables
Table 1.1 Automaton B
Table 1.2 Automaton NB
Table 2.1 Truth-functional description of connectives and of ⊥
Table 2.2 Truth table for the formula φ
Table 2.3 Truth table for the formula φ
Table 2.4 C,N truth matrix
Table 3.1 Sequent rules rendered as tableau rules
Table 4.1 Proof of (i)
Table 4.2 Proof of (iii)
Table 5.1 Complexity of satisfiability decision problem
Table 5.2 Automaton B1
Table 5.3 Automaton B2
Table 5.4 Transition system TS
Table 5.5 Automaton B3
Table 5.6 Product TS B1
Table 6.1 The truth function of the Łukasiewicz implication
Table 6.2 The truth function of the Łukasiewicz negation
Table 6.3 Connectives of L 3
Table 6.4 T-norms, related residua and t-co-norms
Table 6.5 T-norms, related negation operators
Table 6.6 The Kleene implication
Table 6.7 The truth table for conjunction ∧ B I
Table 6.8 The truth table for 4 L M
Table 6.9 The truth table for modality M
Table 6.10 The truth table for modality L
Table 7.1 The partial decision system PLAY
Table 7.2 The decision system minimal
Chapter 1
Introduction: Prerequisites
In this chapter, we collect basic information on set theory, rewriting systems, computability, complexity, algebraic structures, and topological structures, which will serve as references in what follows.
1.1 Set Theory Recapitulated
The naïve notion of a set comes from Georg Cantor, the creator of set theory, who defined a set as a collection of objects united by a common property. This definition was sufficient in the first period of development, up to the end of the 19th century; as reasoning on its basis led to antinomies, the beginning of the 20th century brought attempts at formal definitions, which resulted in formal theories of sets accepted as satisfactory for developing basic mathematical theories.
Definition 1.1 (ZFC theory of sets) The syntactic constituents of ZFC belong to a few categories of symbols:
(i) the letters X, Y, Z, ... denote sets;
(ii) the letters x, y, z, ... denote elements of sets;
(iii) the symbol ∈ (the Greek ‘esti’) denotes the phrase ... is an element of ...;
(iv) the symbol = denotes identity of sets;
(v) the connectives ∨, ∧, ⊃, ¬, ≡ mean intuitively or, and, if ... then ..., not, if and only if ....
Some relations between sets are derived first. The primitive formula of set theory is x ∈ X (‘a thing x is an element of the set X’).
Definition 1.2 (Containment of sets) The symbol for this notion is ⊆; X ⊆ Y if and only if x ∈ X ⊃ x ∈ Y no matter what thing is substituted for x. When (X ⊆ Y) ∧ ¬(Y ⊆ X), one uses the symbol ⊂ and writes X ⊂ Y, meaning that X is a proper subset of Y.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_1
Definition 1.3 (Set algebra) We define elements of set algebra: the union X ∪ Y is the set defined by the formula x ∈ X ∪ Y ≡ x ∈ X ∨ x ∈ Y; the intersection X ∩ Y is defined by the formula x ∈ X ∩ Y ≡ x ∈ X ∧ x ∈ Y; the difference X \ Y is defined by the formula x ∈ X \ Y ≡ x ∈ X ∧ ¬(x ∈ Y).
Definition 1.4 (Identity of sets) X = Y if and only if X ⊆ Y ∧ Y ⊆ X; equivalently, (X = Y) ≡ (x ∈ X ≡ x ∈ Y) for each thing x.
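The operations of Definitions 1.2 to 1.4 can be tried out on finite sets. The following is a minimal sketch, assuming finite sets are modelled by Python's built-in `frozenset`; the helper names `is_subset`, `is_proper_subset` and `identical` are illustrative and do not come from the text.

```python
# Finite sets modelled with Python's built-in frozenset (an illustrative choice).
X = frozenset({1, 2, 3})
Y = frozenset({3, 4})

union = X | Y           # x in X ∪ Y  iff  x in X or x in Y
intersection = X & Y    # x in X ∩ Y  iff  x in X and x in Y
difference = X - Y      # x in X \ Y  iff  x in X and not (x in Y)


def is_subset(a, b):
    """Containment (Definition 1.2): each element of a is an element of b."""
    return all(x in b for x in a)


def is_proper_subset(a, b):
    """Proper containment: a ⊆ b and not (b ⊆ a)."""
    return is_subset(a, b) and not is_subset(b, a)


def identical(a, b):
    """Identity (Definition 1.4): mutual containment."""
    return is_subset(a, b) and is_subset(b, a)
```

Identity is deliberately expressed through mutual containment rather than through Python's `==`, so that the sketch follows the wording of Definition 1.4.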
1.1.1 ZFC Set Theory. Basic Constructs
We now list the operations allowed in ZFC for forming new sets. We give them in the format
$$\frac{X, Y, Z, \ldots, x, y, z, \ldots}{\text{new construct on basis of } X, Y, Z, \ldots, x, y, z, \ldots}$$
calling them axioms. The set with elements x, y, ... is denoted {x, y, ...}.
(A1) Axiom of unordered pair: $\frac{x,\ y}{\{x, y\}}$; for each pair of things x, y, there exists the set {x, y} containing exactly the things x and y;
(A2) Axiom of separation: $\frac{X,\ P}{X_P}$, where $(x \in X_P) \equiv (x \in X \cap P)$; we are allowed to single out of X the elements which are in P. The set P is called a property; later on, we meet its symbolic rendition in logic under the name of a predicate;
(A3) Axiom of the power set: $\frac{X}{2^X}$, where $(x \in 2^X) \equiv (x \subseteq X)$; it follows that the power set $2^X$ consists of all subsets of the set X;
(A4) Axiom of the union: $\frac{X}{\bigcup X}$, where $(x \in \bigcup X) \equiv [(x \in Y) \wedge (Y \in X)]$ for some set Y; therefore, the set $\bigcup X$ consists of all elements which are elements of an element of X;
(A5) Axiom of the empty set: $\frac{\ }{\emptyset}$, where the empty set ∅ consists of no element: for each x it is true that ¬(x ∈ ∅);
(A6) Axiom of infinity: $\frac{\emptyset}{\emptyset^{\infty}}$, where $\emptyset^{\infty}$ has the following properties:
(i) $\emptyset \in \emptyset^{\infty}$;
(ii) for each x, if $x \in \emptyset^{\infty}$, then $\{x, \{x\}\} \in \emptyset^{\infty}$; thus, ∅, {∅, {∅}}, ... are elements of the set $\emptyset^{\infty}$;
(A7) Axiom of replacement: let X, Y be sets and F(x, y) be defined for x ∈ X and be such that for each x ∈ X there exists exactly one y ∈ Y satisfying F(x, y). Then $\frac{X,\ Y}{Y_{X,F}}$, where $Y_{X,F} \subseteq Y$ and $y \in Y_{X,F}$ if and only if F(x, y) holds true for some x ∈ X;
(A8) Axiom of choice (C): $\frac{X}{X_C}$, where $X_C$ contains exactly one element from each element of X; $X_C$ is a selector for X. This supposes that the elements of X are non-empty sets themselves and the set $X_C$ selects exactly one element from each of these sets. We call X a family of sets.
Let us add that the letters ZF stand for Zermelo-Fraenkel set theory and the letter C stands for Choice.
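For finite sets, several of the constructs above can be computed directly. The sketch below is only an illustration under the assumption that sets are hereditarily finite and modelled by `frozenset`; the function names are hypothetical, and nothing here replaces the axioms, which also govern infinite sets.

```python
from itertools import combinations


def unordered_pair(x, y):
    """(A1): the set {x, y}."""
    return frozenset({x, y})


def separation(X, P):
    """(A2): the elements of X which are in P."""
    return frozenset(x for x in X if x in P)


def power_set(X):
    """(A3): the set of all subsets of X."""
    items = list(X)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def big_union(X):
    """(A4): the elements of the elements of X."""
    return frozenset(x for Y in X for x in Y)


def successor(x):
    """The step x -> {x, {x}} used in (A6)."""
    return frozenset({x, frozenset({x})})
```

Applying `successor` repeatedly to `frozenset()` produces the first few elements listed in (A6)(ii); the whole set in (A6) is infinite and cannot be built this way.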
A number of new constructs follow from the axioms.
Definition 1.5 (Ordered pairs) $\frac{x,\ y}{\{x, \{x, y\}\}}$. The set {x, {x, y}} is denoted ⟨x, y⟩ and it is called the ordered pair. The name is justified by the property: $(\langle x_1, y_1 \rangle = \langle x_2, y_2 \rangle) \equiv (x_1 = x_2 \wedge y_1 = y_2)$.
Definition 1.6 (Cartesian product) $\frac{X,\ Y}{\{\langle x, y \rangle : (x \in X) \wedge (y \in Y)\}}$. The resulting set of ordered pairs is denoted X × Y and called the Cartesian product of the sets X and Y. It exists as a subset of the set $2^{X \cup Y}$ (existing by (A2), (A3), (A4)).
Definition 1.7 (Binary relations) For sets X, Y, a relation R between X and Y is a subset R ⊆ X × Y. In case ⟨x, y⟩ ∈ R, we write R(x, y) or xRy. We list definitions of the most frequently used relations. Consider a relation R ⊆ X × X; R is:
(i) reflexive: R(x, x) for each x;
(ii) linear: R(x, y) ∨ R(y, x) for each pair x, y;
(iii) symmetric: R(x, y) ⊃ R(y, x) for each pair x, y;
(iv) transitive: R(x, y) ∧ R(y, z) ⊃ R(x, z) for each triple x, y, z;
(v) serial: for each x, there exists y such that R(x, y);
(vi) Euclidean: R(x, y) ∧ R(x, z) ⊃ R(y, z) for each triple x, y, z;
(vii) directed: if R(x, y) and R(x, z), then there exists t such that R(y, t) and R(z, t), for each triple x, y, z;
(viii) functional: R(x, y) ∧ R(x, z) ⊃ (y = z) for each triple x, y, z.
A relation satisfying (i) and (iii) is a tolerance relation, which expresses similarity; if R satisfies in addition (iv), then it is an equivalence relation. The relation $R^{-1}$ inverse to R is defined by the equivalence $(R^{-1}(x, y)) \equiv (R(y, x))$. Clearly, $(R^{-1})^{-1} = R$. Relations can be composed: $((R \circ S)(x, y)) \equiv (R(x, z) \wedge S(z, y))$ for some z. Obviously, $(R \circ S)^{-1} = S^{-1} \circ R^{-1}$. For a binary relation R, we denote by $R^n$ the composition of n copies of R; the union $\bigcup_{n=0}^{\infty} R^n$ is the transitive closure of R, denoted TC(R). Two things x, y are in the relation TC(R) if and only if there exists n such that $R^n(x, y)$.
Definition 1.8 (Functions) For sets X, Y, a relation R ⊆ X × Y is a function if and only if it observes the property $R(x, y_1) \wedge R(x, y_2) \supset y_1 = y_2$; for functions, the symbols f, g, ... are used and the notation f(x) = y is applied when the instance f(x, y) holds. Functions are denoted more suggestively by symbols of the form f : X → Y. We now define basic notions related to the notion of a function, like image, counterimage, domain, range, and functions total (surjective), injective, bijective.
Definition 1.9 For a function f : X → Y and a subset Z ⊆ X, the set f(Z) = {y ∈ Y : y = f(x) for some x ∈ Z} is the image of the set Z. By analogy, for a subset W ⊆ Y, the subset $f^{-1}(W)$ = {x ∈ X : f(x) = y ∧ y ∈ W} of X is the counterimage of W. The set X is the domain dom(f) of f and the set f(X) is the range rng(f) of f; in case the range of f is Y, the function f is total.
The function f is injective if for each y ∈ Y there exists at most one x ∈ X with f(x) = y; f is bijective when it is injective and total.
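The notions of Definitions 1.7 to 1.9 can be checked mechanically on finite relations. Here is a minimal sketch, assuming a relation is stored as a Python set of pairs; all function names are illustrative. Note one assumption: `transitive_closure` accumulates only the positive powers of R, so if the power with n = 0 is read as the identity relation, the identity pairs would have to be added separately.

```python
def inverse(R):
    """R^{-1}(x, y) iff R(y, x)."""
    return {(y, x) for (x, y) in R}


def compose(R, S):
    """(R ∘ S)(x, y) iff R(x, z) and S(z, y) for some z."""
    return {(x, y) for (x, z) in R for (z2, y) in S if z == z2}


def is_reflexive(R, X):
    return all((x, x) in R for x in X)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


def is_functional(R):
    return all(y1 == y2 for (x1, y1) in R for (x2, y2) in R if x1 == x2)


def transitive_closure(R):
    """Union of the positive powers of R (assumption: n starts at 1)."""
    closure = set(R)
    while True:
        extended = closure | compose(closure, R)
        if extended == closure:
            return closure
        closure = extended


def image(f, Z):
    """f(Z): values f(x) for x in Z, with f given as a set of pairs."""
    return {y for (x, y) in f if x in Z}


def counterimage(f, W):
    """f^{-1}(W): arguments x whose value f(x) lies in W."""
    return {x for (x, y) in f if y in W}
```

For instance, with R = {(1, 2), (2, 3)} and S = {(2, 5), (3, 6)}, the identity (R ∘ S)^{-1} = S^{-1} ∘ R^{-1} can be confirmed by comparing `inverse(compose(R, S))` with `compose(inverse(S), inverse(R))`.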
Definition 1.10 (Orderings) A binary relation R ⊆ X × X is a partial ordering on the set X in case the following properties hold: (i) R(x, x); (ii) R(x, y) ∧ R(y, z) ⊃ R(x, z); (iii) R(x, y) ∧ R(y, x) ⊃ x = y. If, in addition, R satisfies (iv) R(x, y) ∨ R(y, x), then R is a linear ordering. An ordering R is strict linear in case it satisfies (ii), (iv) and (v) R(x, y) ⊃ ¬R(y, x). A partial ordering is often denoted ≤, with the inverse $\leq^{-1}$ denoted ≥, and a strict linear ordering is denoted <.
[...]
[...] > $n_0$, if all natural numbers m such that $n_0 < m < n$ are in A, then n ∈ A. Then n ∈ A for each $n \geq n_0$. In particular, if $n_0 = 0$, then A = N.
Definition 1.16 (Cartesian products) The principle of induction allows us to define Cartesian products of n sets $X_1, X_2, \ldots, X_n$ by induction: for n = 2, the product $\prod_{i=1}^{2} X_i$ is $X_1 \times X_2$, already defined in Definition 1.6. For n ≥ 2, if the product $\prod_{i=1}^{n} X_i$ is defined, then $\prod_{i=1}^{n+1} X_i$ is defined as $\left(\prod_{i=1}^{n} X_i\right) \times X_{n+1}$.
Definition 1.17 (Arity) An n-ary relation R is a subset of a Cartesian product $\prod_{i=1}^{n} X_i$. A function of arity n is defined on a Cartesian product $\prod_{i=1}^{n} X_i$.
1.1.2 Equipollence and Well-Ordered Sets. Cardinal and Ordinal Numbers
Definition 1.18 (Equipollence) Sets X, Y are equipollent (are of the same cardinality) if there exists a bijection f : X → Y. The cardinality type of a set X is denoted |X|. The equipollence relation between sets X and Y is denoted |X| = |Y|.
For instance, no natural number n is equipollent with the set N of natural numbers. Each set equipollent with N is said to be countably infinite. The cardinality of N is denoted by the symbol ω, the first infinite cardinal number. We say that a cardinal number |X| is greater than a cardinal number |Y| if the set Y is a bijective image of a subset of the set X and there is no bijection of Y onto X. The question whether there exist cardinal numbers greater than ω is settled by the Cantor theorem.
Theorem 1.5 (Cantor) For each set X, the set X and the power set $2^X$ are not equipollent and $|X| < |2^X|$.
Proof To the contrary, suppose that $f : X \to 2^X$ is a bijection, and let A = {x : x ∉ f(x)}; there is y ∈ X with A = f(y). If y ∈ f(y) then y ∉ f(y), and if y ∉ f(y) then y ∈ f(y), a contradiction. Thus, f cannot be a bijection. We meet here the celebrated diagonal argument, an echo of the ancient Liar paradox, repeated after Cantor in many proofs and constructions including the Gödel incompleteness theorem and the Tarski theorem on non-definability of truth.
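The diagonal argument can be watched at work on a small finite set. The sketch below is an illustration only, under the assumption that a map f from X to the power set of X is stored as a Python dict; the name `diagonal_set` is hypothetical. It enumerates every such map on a three-element set and confirms that the diagonal set A = {x : x ∉ f(x)} is never among the values of f.

```python
from itertools import product


def diagonal_set(X, f):
    """A = {x in X : x not in f(x)} for a map f from X to subsets of X."""
    return frozenset(x for x in X if x not in f[x])


X = [0, 1, 2]

# All 8 subsets of X, built from bit patterns.
subsets = [
    frozenset(x for x, bit in zip(X, bits) if bit)
    for bits in product([0, 1], repeat=len(X))
]

# Exhaustive check over every map f: X -> 2^X (8^3 = 512 maps):
# the diagonal set is never one of the values f(0), f(1), f(2).
missed_every_time = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
```

The exhaustive check is feasible only because X is tiny; the proof above is what covers arbitrary sets.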
The set X embeds into the set $2^X$ by the injection x ↦ {x}, hence the cardinality of X is smaller than the cardinality of $2^X$. Clearly, |X| = |X| and |X| = |Y| ∧ |Y| = |Z| ⊃ |X| = |Z|; what about condition (iii) of Definition 1.10? It turns out that it also holds.
Theorem 1.6 (Cantor-Bernstein) If |X| ≤ |Y| and |Y| ≤ |X|, then |X| = |Y|.
Proof We exploit the fact that the power set $2^X$ is completely ordered by inclusion ⊆ with ⋃ as the join and ⋂ as the meet. By assumptions, there exist an injection f : X → Y and an injection g : Y → X. We define a function $h : 2^X \to 2^X$ by the formula
$$h(A) = g[Y \setminus f(X \setminus A)].$$
Then h is order-preserving: if A ⊆ B then (X \ B) ⊆ (X \ A), hence, f(X \ B) ⊆ f(X \ A), thus [Y \ f(X \ A)] ⊆ [Y \ f(X \ B)] and g(Y \ f(X \ A)) ⊆ g(Y \ f(X \ B)), i.e., h(A) ⊆ h(B). By the Knaster-Tarski theorem, h(C) = C for some C ⊆ X. Then, $Y \setminus g^{-1}(C) = f(X \setminus C)$ and we can define the function k : X → Y as follows: k(x) = f(x) if x ∈ X \ C and $k(x) = g^{-1}(x)$ if x ∈ C. Clearly, k is a bijection of X onto Y, so |X| = |Y|.
Definition 1.19 (Countability) A set is countable if it embeds into the set N of natural numbers.
Theorem 1.7 If for each n ∈ N, the set $X_n$ is countable, then the union $\bigcup_n X_n$ is countable.
Proof Consider a set $X = \{X_n : n \in N\}$ such that each $X_n = \{x_{n,k} : k \in N\}$. First, we change the set $X_n$ into the set $X_n \times \{n\}$ for each n. Clearly, $|X_n| = |X_n \times \{n\}|$. Thus, each set $X_n \times \{n\}$ is countable and now all sets $X_n \times \{n\}$ are pairwise disjoint. Therefore, we may assume that no two elements of the union $X = \bigcup \{X_n \times \{n\} : n \in N\}$ have the same pair of indices. By the axiom of choice, we can select for each n ∈ N
an injection $g_n : X_n \times \{n\} \to N$. The function $g : \bigcup_{n \in N} X_n \times \{n\} \to N \times N$ defined as $g(x_{n,k}) = (n, g_n(x_{n,k}))$ for n ∈ N is an injection of the union X into N × N, and N × N is equipollent with N. To be more specific, we prove
Theorem 1.8 The Cartesian product N × N is equipollent with N.
Proof We denote by bn(x, y) the binomial coefficient $\frac{x!}{y! \cdot (x - y)!}$. We claim that the function f : N × N → N defined as
$$f(x, y) = bn(x + y + 1, 2) + x = \frac{1}{2}\left[(x + y)^2 + 3x + y\right]$$
is a bijection from N × N onto N. Suppose that f(x, y) = f(u, w). Then x = u: if we had x > u, then x = u + q for some q > 0 and we would have bn(u + q + y + 1, 2) + q = bn(u + w + 1, 2), hence, w > y + q, i.e., w = y + q + t with t > 0. Then,
$$bn(u + q + y + 1, 2) + q = bn(u + y + q + t + 1, 2)$$
and letting d = u + q + y + 1 we have bn(d, 2) + q = bn(d + t, 2). But q < d, hence, bn(d, 2) + q < bn(d, 2) + d = bn(d + 1, 2) ≤ bn(d + t, 2), a contradiction. Hence, it cannot be x > u; by symmetry, it cannot be u > x, hence, u = x. Similarly, y = w and f is injective. It remains to prove that f is total. We have f(0, 0) = 0, f(0, 1) = 1. If f(x, y) = n and y > 0, then f(x + 1, y − 1) = bn(x + 1 + y − 1 + 1, 2) + (x + 1) = bn(x + y + 1, 2) + (x + 1) = f(x, y) + 1 = n + 1. If y = 0, then n = bn(x + 1, 2) + x, and n + 1 = bn(x + 1, 2) + (x + 1) = bn(x + 2, 2) = f(0, x + 1). This proves that f is total and finally a bijection.
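The pairing function of Theorem 1.8 is easy to test numerically. The following is a small sketch, assuming Python 3.8 or later for `math.comb`; the name `pair` is illustrative. It evaluates f on all pairs (x, y) with x + y ≤ 19 and checks that the values fill an initial segment of N without gaps or repetitions, which is what the proof predicts for each finite family of anti-diagonals.

```python
from math import comb


def pair(x, y):
    """f(x, y) = bn(x + y + 1, 2) + x, the pairing function of Theorem 1.8."""
    return comb(x + y + 1, 2) + x


# All pairs with x + y <= 19: there are 20 * 21 / 2 = 210 of them.
values = sorted(pair(x, y) for x in range(20) for y in range(20 - x))

# The closed form ((x + y)^2 + 3x + y) / 2 agrees with the binomial form.
closed_form_agrees = all(
    pair(x, y) == ((x + y) ** 2 + 3 * x + y) // 2
    for x in range(20)
    for y in range(20)
)
```

A finite check of this kind cannot replace the proof, but it is a quick guard against slips in the formula.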
A combinatorial sequel is applied in complexity theory. Let us represent the product N × N as $N_1 \times N_2$, and let $pr_1 : N_1 \times N_2 \to N_1$ and $pr_2 : N_1 \times N_2 \to N_2$ be the projections onto $N_1$, $N_2$, respectively, given by $pr_1(x, y) = x$ and $pr_2(x, y) = y$. As f(x, y) is a bijection, there exists the inverse bijective function $f^{-1} : N \to N_1 \times N_2$. Composing $f^{-1}$ with $pr_1$ and $pr_2$ we obtain functions K(n), L(n) such that f(K(n), L(n)) = n.
To get K(n) and L(n) explicitly, let us represent f(x, y) as
$$\frac{(x + y + 1)(x + y)}{2} + x = \frac{(x + y)^2 + 3x + y}{2} = z.$$
Then $8z + 1 = (2x + 2y + 1)^2 + 8x$. From this we get
$$(2x + 2y + 1)^2 \leq 8z + 1 < (2x + 2y + 3)^2,$$
hence,
$$2x + 2y + 1 \leq (8z + 1)^{\frac{1}{2}} < 2x + 2y + 3.$$
It follows that
$$\left\lfloor \frac{(8z + 1)^{\frac{1}{2}} + 1}{2} \right\rfloor = x + y + 1,$$
finally,
$$x + y = \left\lfloor \frac{(8z + 1)^{\frac{1}{2}} + 1}{2} \right\rfloor - 1 = P(z)$$
and
$$3x + y = 2z - \left[ \left\lfloor \frac{(8z + 1)^{\frac{1}{2}} + 1}{2} \right\rfloor - 1 \right]^2 = Q(z).$$
Solving this system of linear equations, we obtain explicit expressions for K(z) = x and L(z) = y as
$$K(z) = \frac{1}{2} \cdot (Q(z) - P(z)); \quad L(z) = \frac{3}{2} \cdot P(z) - \frac{1}{2} \cdot Q(z).$$
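The explicit expressions for K and L can be checked against the pairing function. The sketch below is a minimal illustration, assuming Python 3.8 or later for `math.isqrt` and `math.comb`; the integer square root is used in place of the real square root, which is safe here because taking the integer part of the root first does not change the integer part of (root + 1)/2. The function names mirror the symbols in the text.

```python
from math import comb, isqrt


def pair(x, y):
    """f(x, y) = bn(x + y + 1, 2) + x."""
    return comb(x + y + 1, 2) + x


def P(z):
    """x + y recovered from z = f(x, y)."""
    return (isqrt(8 * z + 1) + 1) // 2 - 1


def Q(z):
    """3x + y recovered from z = f(x, y)."""
    return 2 * z - P(z) ** 2


def K(z):
    """First coordinate x of the pair coded by z."""
    return (Q(z) - P(z)) // 2


def L(z):
    """Second coordinate y of the pair coded by z."""
    return (3 * P(z) - Q(z)) // 2


# Round trip on an initial segment of N: f(K(z), L(z)) = z.
round_trip_ok = all(pair(K(z), L(z)) == z for z in range(1000))
```

For example, z = 7 gives P(7) = 3 and Q(7) = 5, so K(7) = 1 and L(7) = 2, and indeed f(1, 2) = bn(4, 2) + 1 = 7.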
Corollary 1.1 For each natural number k, the set $N^k$ is equipollent with N.
Proof It follows by Theorem 1.8 and induction on k.
A set X is uncountable if it is not equipollent with any subset of N. An example is the set $2^N$. Iteration of the power set operation yields an infinite sequence of larger and larger cardinalities of the sets $N, 2^N, 2^{2^N}, \ldots$.
Definition 1.20 (Algebra of cardinal numbers) A cardinal number is a symbol assigned to a set X, denoted |X|; sets of the same cardinality are assigned the same cardinal number. We denote cardinal numbers with the letters m, n, ..., as natural numbers are cardinal numbers of finite sets, and we extend this notation over all cardinal numbers. We consider disjoint sets X of cardinality m and Y of cardinality n. Then we define the algebraic operations on cardinal numbers m, n as follows:
(i) the sum m + n is the cardinal number of the union X ∪ Y;
(ii) the product m · n is the cardinal number of the Cartesian product X × Y;
(iii) the power of m to the exponent n, denoted $m^n$, is the cardinal number of the set $X^Y$ of all functions from the set Y to the set X.
The algebra of cardinal numbers has the same properties as the algebra of natural numbers:
(i) m + n = n + m, m · n = n · m;
(ii) $m \cdot (n_1 + n_2) = m \cdot n_1 + m \cdot n_2$;
(iii) $(m^n)^k = m^{n \cdot k}$;
(iv) $(m \cdot n)^k = m^k \cdot n^k$; $m^{n+k} = m^n \cdot m^k$;
(v) m · 1 = m, where 1 is the cardinal number of the singleton set {∅}.
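For finite sets, the three operations of Definition 1.20 reduce to ordinary arithmetic, which can be confirmed by counting. The sketch below is illustrative only; the helper `functions` and the sample sets are assumptions made for the example, and the case of infinite cardinals is of course not covered by such counting.

```python
from itertools import product


def functions(Y, X):
    """All functions from Y to X, each stored as a dict; this is the set X^Y."""
    domain = list(Y)
    return [dict(zip(domain, values)) for values in product(list(X), repeat=len(domain))]


X = {"a", "b", "c"}   # cardinality m = 3
Y = {0, 1}            # cardinality n = 2, disjoint from X

sum_card = len(X | Y)                  # m + n, cardinality of the union
prod_card = len(set(product(X, Y)))    # m · n, cardinality of X × Y
power_card = len(functions(Y, X))      # m^n, cardinality of X^Y
```

With m = 3 and n = 2 the three counts are 5, 6 and 9, in agreement with 3 + 2, 3 · 2 and 3².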
The cardinal number $2^{\omega}$, i.e., the cardinality of the set $2^N$, is called the cardinality of continuum. As each real number can be represented as an infinite sequence over the set {0, 1}, the cardinality of the set of real numbers is $2^{\omega}$, the cardinality of continuum.
Definition 1.21 (Order types) In addition to cardinality, linearly ordered sets can be compared with respect to their orderings; if there exists an isotone function f : X → Y, then the order type of X is not greater than the order type of Y; in case f is an isotone bijection, we call it an isomorphism and then the order type of X is equal to the order type of Y. We recall that a linearly ordered set X is well-ordered when each non-empty subset of X has the least element.
Theorem 1.9 Consider well-ordered structures (X, ≺) and (Y, [...]
The symbol x denotes a new initial symbol, not in the alphabets of L or M. Clearly, the fusion realizes productions of either grammar, so it is regular and generates either language. That (iii) implies (i) is more technical and we refer this part to (Salomaa loc.cit.).
Regular languages are characterized by regular expressions.
Definition 1.34 (Regular expressions) Regular expressions (RE's) are defined recursively, by letting
(i) (∅) is RE;
(ii) (a) is RE for each a ∈ A;
(iii) if (u), (v) are RE's, then (u)*, (uv), (u ∪ v) are RE's.
Regular expressions code languages.
Definition 1.35 We denote by L(α) the language defined by the RE α: L(∅) = ∅, L(a) = {a}, L(u*) = (L(u))*, L(uv) = L(u)L(v), L(u ∪ v) = L(u) ∪ L(v).
It remains to tie each regular language to a characterizing regular expression.
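Before doing so, the semantics L(α) of Definition 1.35 can be illustrated by a short program. The following is a sketch under stated assumptions: regular expressions are encoded as nested Python tuples (a hypothetical encoding, not the book's), and since L(α) may be infinite, the function returns only the words of length at most `max_len`.

```python
def language(alpha, max_len):
    """Words of L(alpha) of length at most max_len.

    Encoding (an assumption of this sketch):
    ("empty",), ("sym", a), ("star", u), ("cat", u, v), ("union", u, v).
    """
    kind = alpha[0]
    if kind == "empty":
        return set()                      # L(∅) = ∅
    if kind == "sym":
        return {alpha[1]} if max_len >= 1 else set()   # L(a) = {a}
    if kind == "union":
        return language(alpha[1], max_len) | language(alpha[2], max_len)
    if kind == "cat":
        left = language(alpha[1], max_len)
        right = language(alpha[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        base = language(alpha[1], max_len)
        words = {""}                      # the empty word is in every star
        frontier = {""}
        while frontier:
            frontier = {
                w + u for w in frontier for u in base if len(w + u) <= max_len
            } - words
            words |= frontier
        return words
    raise ValueError(f"unknown constructor: {kind}")


# The expression ((a ∪ b)* a): words over {a, b} ending in a.
ends_in_a = ("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
short_words = language(ends_in_a, 2)
```

Bounding the length is a design choice that keeps every intermediate set finite; it loses nothing for words up to the bound, because each factor of a word is no longer than the word itself.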
Theorem 1.22 A language L is regular if and only if it is defined by a regular expression.
Proof We know already from Definition 1.34 that L(u ∪ v) is regular. We consider a language L of type u generated by the grammar G(L) = ⟨N(L), T(L), a(L), P(L)⟩ and a language M of type v generated by the grammar G(M) = ⟨N(M), T(M), a(M), P(M)⟩. In P(M), insert in place of each production of the form x ⇒ w, where x ∈ N(M) and w ∈ T(M)*, the production x ⇒ a(L)w. Let the set of the new productions be P*. The grammar G = ⟨N(L) ∪ N(M), T(L) ∪ T(M), a(M), P(L) ∪ P*⟩ generates the language LM.
In order to obtain a grammar which generates the language of type (v*), form the set of productions $P^+$ by replacing each production x ⇒ w, where x ∈ N(M) and w ∈ T(M)*, by the production x ⇒ a(M)w. Let x* be a symbol neither in the alphabet of L nor in the alphabet of M and form the grammar G* = ⟨N(M) ∪ {x*}, T(M), x*, {x* ⇒ ε, x* ⇒ a(M)} ∪ P(M) ∪ $P^+$⟩, which generates the language M*.
If a language L is defined by a regular expression over the alphabet A = {a1, a2, ..., ak}, then L is a result of a finite number of applications of the regular operations (u ∪ v), (uv), (u*) to the elementary languages {∅}, {a1}, {a2}, ..., {ak}, hence, L is regular.
Assume now that L is regular. By 1.2(i), L is accepted by an FDA; assume that the final states of the FDA are $q_1, q_2, \ldots, q_k$ and the initial state is $q_0$. Accepted words are obtained on a path from $q_0$ to some $q_j$ for j ≤ k. Consider the case of some $q_j$. On a path to $q_j$ we have a concatenation $a_{i^1_1} \ldots a_{i^1_m} = w_1$ followed by a periodic state of order n, say, so we may have a word $w_2^*$, then again a word $w_3$, and so on up to a finite number of such words. This path determines a regular expression for this particular language, and the language L is a finite union of those languages, hence it is defined by a regular expression.
Definition 1.36 (Infinite words) For an alphabet A, an infinite word, also called an ω-word, over A is an infinite sequence of symbols in A; the set of all ω-words is denoted $A^{\omega}$. We can concatenate finite words with infinite ones: if u ∈ A* and v ∈ $A^{\omega}$, then uv ∈ $A^{\omega}$. Processing infinite words requires a new type of automata with modified acceptance conditions: Büchi automata, Büchi [11].
Definition 1.37 (Büchi automata) A deterministic Büchi automaton DBA is a tuple ⟨Q, q0, A, F, tr⟩, where, as in the case of DFA, Q, q0, A, F, tr are, respectively, the set of states, the initial state, the alphabet, the set of accepting states, and the transition function tr : Q × A → Q. The non-deterministic variant NBA differs in that tr is a transition relation tr ⊆ Q × A × Q, i.e., for each pair (q, a) there exists a set {q′ : tr(q, a, q′)}, and there is defined a set of initial states I, so the signature of NBA is ⟨Q, I, A, F, tr⟩. In the case of Büchi automata, executions are called runs, which are infinite sequences $q_0 \Rightarrow_{a_0} q_1 \Rightarrow_{a_1} \ldots$ with $q_0 \in I$ and $q_{i+1} \in tr(q_i, a_i)$ in the case of NBA, and $q_{i+1} = tr(q_i, a_i)$ in the case of DBA, for each i ≥ 0; the ω-word $a_0 a_1 \ldots a_i \ldots$ is called the label of the run.

Definition 1.38 (The Büchi acceptance) For a run σ, the set inf(σ) is the set {q : q occurs infinitely often in σ}. The run σ is accepted by a Büchi automaton if and only if the set F ∩ inf(σ) is non-empty. Equivalently, there exists an accepting state q* which occurs infinitely many times in σ. The language L(B) accepted by the Büchi automaton B is the set of labels of accepted runs, which are then called ω-words in the language L(B) of the automaton B.

Definition 1.39 (ω-regular expressions and sets) We add to regular expressions from the finite case: (∅), (a), (u ∪ v), (uv), (u)*, the expression $(u)^{\omega}$. We call these expressions ω-regular. For an alphabet A, we consider the set $A^{*,\omega} = A^{*} \cup A^{\omega}$. A set $C \subseteq A^{*,\omega}$ is ω-regular if and only if
(i) C = ∅ or C = {a} for some a ∈ A, or
(ii) C = X ∪ Y for some ω-regular sets X, Y (hence any finite union of ω-regular sets is ω-regular), or
(iii) C = XY for some ω-regular $X \subseteq A^{*}$ and some ω-regular $Y \subseteq A^{*,\omega}$, or
(iv) $C = X^{*}$ or $C = X^{\omega}$ for some ω-regular $X \subseteq A^{*}$.
We denote by the symbol REG(ω) the smallest family of sets satisfying (i)-(iv); therefore, REG(ω) contains the empty set and the elements of the alphabet A, and it is closed under finite unions, concatenations, and finite and infinite powers of its elements, i.e., it contains the sets $X^{*}$ and $X^{\omega}$ for each of its elements X. In search of analogies between finite and infinite expressions, we consider the analogy to Theorem 1.22. It turns out that the analogy holds true.

Theorem 1.23 A set $X \subseteq A^{\omega}$ is in REG(ω) if and only if it is denoted by an ω-regular expression $\bigcup_{i=1}^{k} (u_i)^{*}(v_i)^{\omega}$ for some k, i.e., $X = \bigcup_{i=1}^{k} X_i^{*} Y_i^{\omega}$ for some regular subsets $X_i, Y_i$ of $A^{*}$.
Proof Clearly, each set of the given form is ω-regular. Conversely, one checks directly that sets of this form constitute the class satisfying Definition 1.39(i)–(iv), hence, they are contained in the class R E G(ω).
Table 1.1 Automaton B

States:   q0    q1    q2
a         q1    ∅     ∅
b         ∅     q2    q2

Fig. 1.1 Automaton for concatenation
Example 1.1 In Table 1.1, we give an example of a deterministic Büchi automaton DBA. The final state is in boldface and the initial state is denoted q0. The language L(DBA) is $(a)(b^{\omega})$.

The relation between Büchi automata and ω-regular languages is expressed by the counterpart to Theorem 1.21 for the infinite case.

Theorem 1.24 A language $L \subseteq A^{\omega}$ is ω-regular if and only if it is of the form L(B) for some finite Büchi automaton B.

Proof If a language L is ω-regular, then by Theorem 1.23 it is of the form $\bigcup_{i=1}^{k} (u_i)^{*}(v_i)^{\omega}$ and it is sufficient to recall the automata for recognition of ω-regular languages $u^{*}v^{\omega}$ and of finite unions of ω-regular languages. The automaton for the concatenation $u^{*}v^{\omega}$ is sketched in Fig. 1.1. For the union u ∪ v, if u is recognized by the Büchi automaton ⟨A, Q, q0, tr, F⟩ and v is recognized by the Büchi automaton ⟨A, Q′, q0′, tr′, F′⟩, where we may assume that Q ∩ Q′ = ∅, then u ∪ v is recognized by the Büchi automaton ⟨A, Q ∪ Q′, q0, q0′, tr ∪ tr′, F ∪ F′⟩.

Conversely, if L = L(B) for some automaton B, then for each initial state q1 and each accepting state q2, the fragment of L containing paths beginning at q1 and ending at q2 can be written as $L(q_1, q_2) = L^{*}(q_1, q_2)(L(q_2, q_2))^{\omega}$, i.e., as an ω-regular expression, and the language L is the union $\bigcup \{L(q_1, q_2) : (q_1, q_2) \in I \times F\}$.

Contrary to the finite case (Theorem 1.21), there are languages accepted by non-deterministic Büchi automata but not accepted by any deterministic Büchi automaton.
Table 1.2 Automaton NB

States:   q0          q1
a         q0          ∅
b         {q0, q1}    q1
An example is given in Landweber [12] of the language $L = (\{a\} \cup \{b\})^{*}(b^{\omega})$, which is accepted by the automaton NBA in Table 1.2 but not by any deterministic automaton DBA. The proof that no deterministic Büchi automaton accepts the language L can be sketched as follows: let tr* be the transitive closure of tr. To simplify notation, we denote by b* a finite string of symbols b, irrespective of its length. As $b^{\omega} \in L$, some b* reaches an accepting state s1 ∈ F; we define for each k a sequence σk : b*ab*a...ab* with k symbols a such that tr*(s0, b*ab*a...ab*) = sk ∈ F. As F is finite, there exists s* ∈ F which accepts infinitely many strings σk, hence the language L would contain words with infinitely many occurrences of a, a contradiction.
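The Büchi acceptance condition can be tested mechanically on ultimately periodic words $u v^{\omega}$. The sketch below is only an illustration, not a construction from the text: the function name `accepts_lasso`, the dictionary encoding of the transition relation, and the reading of Table 1.2 (q0 initial, q1 accepting) are assumptions. It searches the finite graph of pairs (state, position in the word) for an accepting state that is reachable and lies on a cycle, which is the same as occurring infinitely often in some run.

```python
def accepts_lasso(delta, initial, accepting, u, v):
    """Does the NBA accept the omega-word u v^omega (v non-empty)?

    delta maps (state, symbol) to a set of successor states.
    A node (q, i) means: in state q, about to read position i of u+v;
    after the last position the word wraps around to the start of v.
    """
    word = u + v
    n = len(word)

    def successors(node):
        q, i = node
        j = i + 1 if i + 1 < n else len(u)
        return [(p, j) for p in delta.get((q, word[i]), ())]

    def reachable(starts):
        seen, stack = set(), list(starts)
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(successors(node))
        return seen

    for node in reachable((q, 0) for q in initial):
        # An accepting node on a cycle can be visited infinitely often.
        if node[0] in accepting and node in reachable(successors(node)):
            return True
    return False


# Assumed reading of Table 1.2: q0 initial, q1 accepting.
TABLE_1_2 = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},
    ("q1", "b"): {"q1"},
}

in_language = accepts_lasso(TABLE_1_2, {"q0"}, {"q1"}, "ab", "b")      # a b b^omega
not_in_language = accepts_lasso(TABLE_1_2, {"q0"}, {"q1"}, "", "ab")   # (ab)^omega
```

The first word ends in $b^{\omega}$ and is accepted; the second contains infinitely many symbols a and is rejected, in line with the language $(\{a\} \cup \{b\})^{*}(b^{\omega})$.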
1.3 Computability

Rewriting systems are computing devices: they perform a series of operations on words in order to verify whether a given word belongs to the language accepted by the system. On the other hand, many computing devices fall into the realm of rewriting systems. A visual model of computation is the Turing machine (TM), Turing [13]; we first offer its description and properties along with a description of objects computable by it and demonstrations of some undecidable problems.

Definition 1.40 (One-tape Turing machines) A Turing machine consists of a symbolic part and of a mechanical part. The symbolic part has the signature ⟨A, Q, q0, L, R⟩, where A is a finite alphabet, Q is a finite set of states, q0 is the initial state, and L, R are symbols denoting some actions of the mechanical part. Among states, the halting state q(halt) and the accepting state q(accept) can be singled out. The mechanical part consists of a tape divided into cells (or squares). The tape may extend to the left and to the right indefinitely; in some versions, markers are used to mark the left and the right ends of the tape and the markers can be shifted left or right when more space is needed. We assume that the tape is simply infinite in both directions. In addition to the tape, TM is equipped with a reading-writing head which can scan the cell just below it, erase the symbol written in the cell, write a new symbol in the cell, or leave the cell intact; after selecting the option, the head can move left (L), move right (R), or stay on the current cell (which is marked by the lack of either R or L in the instruction).
Definition 1.41 (Actions of the one-tape Turing machine) TM operates on words; rewriting of words is steered by instructions. Instructions are of the form uv, where u and v are words built from symbols of states, symbols of the alphabet A, and the symbols L, R. Antecedents of instructions are words of the form qa, meaning that TM is currently in the state q and is scanning the cell in which the symbol a is written; consequents denote actions of TM, which are of a few kinds: TM can erase the symbol a, write a new symbol into the cell, move left or right, or make no move. A sequence of actions constitutes a computation; TM reaches the result of a computation if it halts after a finite number of actions, otherwise it produces no result. TM can decide the language computed, i.e., output YES if the final string is in the language and NO otherwise; TM can also accept the string into the language. In the first case the language is recursive, in the second case it is recursively enumerable.

Definition 1.42 (Instructions of TM) We assume that the alphabet A consists of symbols $a_0, a_1, \ldots, a_k$; we denote $a_0$ by the symbol B standing for blank, i.e., the empty cell, and $a_1$ will denote 1. Instructions for TM are of the form:
(1) $q_i a_j a_k q_n$;
(2) $q_i a_j R q_n$;
(3) $q_i a_j L q_n$.
Each instruction, when performed, alters the content of the tape. The content of the tape is the word currently written on the tape. It is called formally an instantaneous description, denoted by the symbol ID. For two ID's, ID1 and ID2, the symbol ID1 ⊢ ID2 denotes that the instantaneous description ID1 has changed into ID2 as the result of an applied instruction.

Definition 1.43 (Instantaneous descriptions vs. instructions) A general form of an ID is $P q_i a_j Q$, where $q_i$ is the current state of TM, $a_j$ is the alphabet symbol currently scanned, P is the word to the left of $a_j$, and Q is the word to the right of $a_j$. Below, we list all cases of ID1 ⊢ ID2 brought forth by instructions (1)–(3).
(4) if ID1 is $P q_i a_j Q$ and an instruction is $q_i a_j a_k q_n$, then ID1 ⊢ ID2 : $P q_n a_k Q$;
(5) if ID1 is $P q_i a_j a_k Q$ and an instruction is $q_i a_j R q_n$, then ID1 ⊢ ID2 : $P a_j q_n a_k Q$;
(6) if ID1 is $P q_i a_j$ and an instruction is $q_i a_j R q_n$, then ID1 ⊢ ID2 : $P a_j q_n B$;
(7) if ID1 is $P a_k q_i a_j Q$ and an instruction is $q_i a_j L q_n$, then ID1 ⊢ ID2 : $P q_n a_k a_j Q$;
(8) if ID1 is $q_i a_j Q$ and an instruction is $q_i a_j L q_n$, then ID1 ⊢ ID2 : $q_n B a_j Q$.

An ID is terminal if there is no ID1 such that ID ⊢ ID1. A sequence of instantaneous descriptions related by ⊢ is a computation.

Definition 1.44 (Computations by TM) A computation by TM is a finite sequence $ID_1, ID_2, \ldots, ID_m$ such that $ID_i \vdash ID_{i+1}$ for i = 1, 2, ..., m − 1 and $ID_m$ is terminal. For the purpose of computations, the symbol $a_1$ is 1, and the natural number n is coded as the sequence of n + 1 symbols 1: n = 11...1 (n + 1 repetitions, for short $1^{n+1}$). A sequence $n_1, n_2, \ldots, n_k$ of natural numbers is written down as $1^{n_1+1} B 1^{n_2+1} B \ldots B 1^{n_k+1}$. This long sequence can be replaced with the shortcut ⟨$n_1, n_2, \ldots, n_k$⟩.
An output of TM is presented as the number $|ID_m|$ of 1's in the final tape expression $ID_m$, i.e., in the word left on the tape at the conclusion of the computation. A computation begins with $ID_1 = q_0 1^{n_1+1} B 1^{n_2+1} B \ldots B 1^{n_k+1}$ and it ends with $ID_m$, so the result is $|ID_m|$. A function computed by TM is denoted $f^{(k)}_{TM}$; it is total when it has the value computed for each input, otherwise it is partial. As an example we show the computation of the function $id^{(1)}_{TM}$, where $id^{(1)}_{TM}(k) = k$. The initial $ID_1$ is $q_0 1^{k+1}$ and the instruction is $q_0 1 B q_0$. Under this instruction $ID_1$ changes into $ID_2 = q_0 B 1^{k}$, which is terminal, and $|ID_2| = k$. It is known that a TM can accept any language generated by a generative grammar (a grammar of type 0 in the Chomsky hierarchy, Salomaa [10]).

Definition 1.45 (Computable functions) A k-ary function f is computable if and only if there exist a Turing machine TM and a function $f^{(k)}_{TM}$ such that $f = f^{(k)}_{TM}$ and the function $f^{(k)}_{TM}$ is total; if $f^{(k)}_{TM}$ is not total and $f = f^{(k)}_{TM}$, then f is partially computable. For a set X, its characteristic function $\chi_X$ is defined as $\chi_X(x) = 0$ if x ∈ X, otherwise $\chi_X(x) = 1$.

Definition 1.46 (Computable sets) A set X is computable if and only if its characteristic function $\chi_X$ is computable. This definition covers relations as well. One shows, see Davis [14], that the family of computable functions is closed under composition. We denote by the shortcut $x^n$ the sequence ⟨$x_1, x_2, \ldots, x_n$⟩.

Definition 1.47 (Minimization) To a total function $f(y, x^n)$, minimization assigns the function $g(x^n) = \mathrm{argmin}_y [f(y, x^n) = 0]$ in case the set $\{y : f(y, x^n) = 0\}$ is non-empty, otherwise the function g is undefined. The notation for g is $\min_y [f(y, x^n) = 0]$. One shows that if the function f is computable, then the function g is partially computable. The function f is called regular if the function g is total.
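The quadruple instructions of Definition 1.42, the unary coding of inputs, and the output convention (the number of 1's left on the tape) can be tried out with a small simulator. This is a minimal sketch under stated assumptions: the names `run_tm`, `encode`, `IDENTITY` and `ADD` are hypothetical, a program is a dictionary from (state, scanned symbol) to (action, next state), and the addition machine is an illustration constructed here rather than one taken from the text.

```python
from collections import defaultdict

BLANK, ONE = "B", "1"


def encode(*numbers):
    """Unary input coding: n becomes n+1 ones, arguments are separated by B."""
    return BLANK.join(ONE * (n + 1) for n in numbers)


def run_tm(program, tape_word, state="q0", max_steps=10_000):
    """Run quadruple instructions until a terminal ID is reached.

    program maps (state, scanned symbol) to (action, next state), where
    action is "R", "L", or a symbol to write. The result is the number
    of 1's on the final tape. max_steps is a safety cut-off (an
    assumption of this sketch, since a machine need not halt).
    """
    tape = defaultdict(lambda: BLANK, enumerate(tape_word))
    head = 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in program:  # no applicable instruction: terminal ID
            return sum(1 for symbol in tape.values() if symbol == ONE)
        action, state = program[key]
        if action == "R":
            head += 1
        elif action == "L":
            head -= 1
        else:
            tape[head] = action
    raise RuntimeError("no terminal ID reached within max_steps")


# The identity machine: the single instruction q0 1 B q0.
IDENTITY = {("q0", ONE): (BLANK, "q0")}

# Illustrative machine for x + y: erase one 1 from each of the two blocks.
ADD = {
    ("q0", ONE): (BLANK, "q0"),
    ("q0", BLANK): ("R", "q1"),
    ("q1", ONE): ("R", "q1"),
    ("q1", BLANK): ("R", "q2"),
    ("q2", ONE): (BLANK, "q2"),
}

identity_of_3 = run_tm(IDENTITY, encode(3))
two_plus_one = run_tm(ADD, encode(2, 1))
```

Since the input $1^{x+1} B 1^{y+1}$ carries x + y + 2 ones, erasing one symbol 1 from each block leaves exactly x + y of them.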
Theorem 1.25 Among computable functions are: $f^{(2)}_{TM}(x, y) = x + y$; $f^{(2)}_{TM}(x, y) = x -^{*} y$, which is x − y in case x ≥ y, and 0 otherwise; $f^{(1)}_{TM}(x) = x + 1$ (the successor function, usually denoted S(x)); $f^{(n)}_{TM}(x_1, x_2, \ldots, x_n) = x_i$ for some i ≤ n (usually denoted U(n, i)); $f^{(2)}_{TM}(x, y) = xy$.

The proof of computability consists in constructing appropriate Turing machines. Their description is lengthy even for simple functions, hence we refer the reader to Davis [14].

Definition 1.48 (Partially recursive functions) A function f is (partially) recursive if it can be obtained from the set of functions $\{S(x), U(n, i), x + y, x -^{*} y, xy\}$ by finitely many applications of the operations of minimization and composition. A function is recursive if all functions in this process are regular.
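Minimization together with the truncated subtraction $x -^{*} y$ already yields familiar functions. The following sketch is an illustration only: the names `monus`, `A`, `minimize`, `floor_sqrt` and `quotient` are hypothetical, and A(x) abbreviates $1 -^{*} x$, which equals 1 at 0 and 0 elsewhere. The unbounded search in `minimize` does not terminate when no zero exists, which mirrors the partiality of the operation.

```python
from itertools import count


def monus(x, y):
    """Truncated subtraction: x - y if x >= y, else 0."""
    return x - y if x >= y else 0


def A(x):
    """A(x) = 1 -* x: equals 1 at x = 0 and 0 otherwise."""
    return monus(1, x)


def minimize(f):
    """Return g with g(xs) = least y such that f(y, xs) = 0.

    The search is unbounded, so g loops forever where it is undefined.
    """
    def g(*xs):
        for y in count():
            if f(y, *xs) == 0:
                return y
    return g


# Least y with (y + 1)^2 > x, i.e. the integer part of the square root.
floor_sqrt = minimize(lambda y, x: A(monus((y + 1) ** 2, x)))

# Least z with y * (z + 1) > x, i.e. the integer quotient; 0 when y = 0.
quotient = minimize(lambda z, x, y: y * A(monus(y * (z + 1), x)))

root_of_10 = floor_sqrt(10)
seven_by_two = quotient(7, 2)
```

Both defining functions are regular in the sense of Definition 1.47, so the two searches always stop.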
It follows from Theorem 1.25 that each (partial) recursive function is (partially) computable. As the converse statement holds true, recursiveness is equivalent to computability.

Theorem 1.26 The following functions are recursive, hence computable:
(i) $Z(x) \equiv 0$;
(ii) $A(x) = 1 -^{*} x$;
(iii) $x^2$;
(iv) $\lfloor x^{1/2} \rfloor$;
(v) $|x - y|$;
(vi) $\lfloor x/y \rfloor$ if y ≠ 0, else 0;
(vii) $x -^{*} y \cdot \lfloor x/y \rfloor$ (the remainder of x divided by y);
(viii) $f(x, y) = \frac{1}{2}[(x + y)^2 + 3x + y]$ of Theorem 1.8;
(ix) functions K(z) = x, L(z) = y such that f(K(z), L(z)) = z of Theorem 1.8.
Proof For:
(i) $Z(x) = U(1, 1)(x) -^{*} U(1, 1)(x)$.
(ii) $A(x) = S(Z(x)) -^{*} U(1, 1)(x)$.
(iii) $x^2 = U(1, 1)(x) \cdot U(1, 1)(x)$.
(iv) $\lfloor x^{1/2} \rfloor = \min_y [A((S(U(2, 2)(x, y)))^2 -^{*} U(2, 1)(x, y)) = 0]$.
(v) $|x - y| = (x -^{*} y) + (y -^{*} x)$.
(vi) $\lfloor x/y \rfloor = \min_z [y \cdot A(y(z + 1) -^{*} x) = 0]$.
(vii) it is already defined.
(viii), (ix) the claim follows as the functions P(z) and Q(z) defined in the proof of Theorem 1.8 are recursive, and the equations established there, x + y = P(z) and 3x + y = Q(z), yield $x = K(z) = \frac{1}{2}(Q(z) -^{*} P(z))$ and $y = L(z) = P(z) -^{*} \frac{1}{2}(Q(z) -^{*} P(z))$.

Definition 1.49 (Definitions by primitive recursion) For total functions $f(x^n)$ and $g(y^{n+2})$, the function $h(x^{n+1})$ is defined in two steps as $h(0, x^n) = f(x^n)$, $h(z + 1, x^n) = g(z, h(z, x^n), x^n)$.

Theorem 1.27 If the functions f, g in Definition 1.49 are recursive, then the function h is recursive, too. For a proof, please consult Davis [14].

Definition 1.50 (Primitive recursive functions) A function f is primitive recursive if it can be obtained by finitely many operations of composition and primitive recursion from the functions S(x) = x + 1, Z(x) ≡ 0, $U(n, i)(x^n)$.

Clearly, each primitive recursive function is recursive. Among primitive recursive functions, we find familiar ones.

Theorem 1.28 The functions $x + y$, $xy$, $x^y$, $x!$, $A(x)$, $|x - y|$ are primitive recursive.
Proof For:
(i) $x + y$: $x + 0 = U(1, 1)(x)$, $x + (y + 1) = S(x + y)$;
(ii) $xy$: $x \cdot 0 = Z(x)$, $x(y + 1) = xy + U(2, 1)(x, y)$;
(iii) $x^y$: $x^0 = S(Z(x))$, $x^{y+1} = x^y \cdot U(2, 1)(x, y)$;
(iv) $x!$: $0! = S(Z(x))$, $(x + 1)! = x! \cdot S(x)$;
(v) $A(x)$: one lets $A(x) = S(Z(x)) -^{*} U(1, 1)(x)$;
(vi) finally, $|x - y| = (x -^{*} y) + (y -^{*} x)$.
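The scheme of primitive recursion can be written as a higher-order function, after which the recursion equations for addition, multiplication, exponentiation and factorial become one-liners. This is a sketch, not the book's construction: `prim_rec` and the argument order (recursion variable first, as in h(z, x)) are choices made here, and the loop merely unfolds the equations h(0, x) = f(x), h(z + 1, x) = g(z, h(z, x), x).

```python
def S(x):
    """Successor."""
    return x + 1


def Z(x):
    """Constant zero."""
    return 0


def U(n, i):
    """Projection of an n-tuple onto its i-th coordinate (1-indexed)."""
    return lambda *xs: xs[i - 1]


def prim_rec(f, g):
    """h(0, xs) = f(xs); h(z + 1, xs) = g(z, h(z, xs), xs)."""
    def h(z, *xs):
        value = f(*xs)
        for k in range(z):
            value = g(k, value, *xs)
        return value
    return h


# The recursion variable comes first: add(y, x) = x + y, and so on.
add = prim_rec(U(1, 1), lambda z, acc, x: S(acc))
mul = prim_rec(Z, lambda z, acc, x: add(x, acc))
power = prim_rec(lambda x: S(Z(x)), lambda z, acc, x: mul(x, acc))  # power(y, x) = x**y
fact = prim_rec(lambda: S(Z(0)), lambda z, acc: mul(S(z), acc))

three_plus_four = add(3, 4)
two_cubed = power(3, 2)
five_factorial = fact(5)
```

Each function is built only from S, Z, U and previously built functions, so the definitions stay inside the class described in Definition 1.50.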
Theorem 1.29 The class of primitive recursive functions is closed under bounded sums and bounded products: for a primitive recursive function $f(y, x^n)$, the functions $g(k, x^n) = \sum_{m \le k} f(m, x^n)$ and $h(k, x^n) = \prod_{m \le k} f(m, x^n)$ are primitive recursive.

Proof Indeed, $g(0, x^n) = f(0, x^n)$ and $g(k + 1, x^n) = g(k, x^n) + f(k + 1, x^n)$. Similarly for h, with the exception that addition is replaced by multiplication.

We recall that relations are represented by their characteristic functions: $\chi_R(x^n) = 0$ if $R(x^n)$ holds, otherwise $\chi_R(x^n) = 1$. We use relations in place of predicates. A predicate is a property of things denoted in models as a relation. The bounded existential symbol $\exists_{y=0}^{m} R(y, x^n)$ means that there exists a value k of y bounded by m such that $R(k, x^n)$ holds. Similarly, for the bounded universal symbol $\forall_{y=0}^{m} R(y, x^n)$ the meaning is: for each value of y ≤ m, $R(y, x^n)$ holds.

Theorem 1.30 If $R(y, x^n)$ is primitive recursive, then $\exists_{y=0}^{m} R(y, x^n)$ and $\forall_{y=0}^{m} R(y, x^n)$ are primitive recursive.

Proof We have $\chi_{\exists_{y=0}^{m} R(y, x^n)} = \prod_{k=0}^{m} \chi_{R(k, x^n)}$. For the second part, we recall the negation symbol ¬, meaning 'it is not true that ...', and we exploit the duality $(\forall y.R(y, x^n)) \equiv (\neg \exists y.\neg R(y, x^n))$ along with the relation $\chi_{\neg R} = 1 -^{*} \chi_R$.
In the same vein we prove that R ∨ P and R ∧ P are primitive recursive, as $\chi_{R \vee P} = \chi_R \cdot \chi_P$ and $\chi_{R \wedge P} = (\chi_R + \chi_P) -^{*} (\chi_R \cdot \chi_P)$. Minimization also preserves primitive recursiveness. For a relation $R(y, x^n)$, we let $M(z, x^n) = \min_{y \le z} R(y, x^n)$ in case such y exists, otherwise the result is 0.

Theorem 1.31 If the relation $R(y, x^n)$ is (primitive) recursive, then the function $M(z, x^n)$ is (primitive) recursive. For the proof see Davis [14].

We notice that the relations x = y, x < y and x ≤ y are primitive recursive: $\chi_{x=y} = A(A(|x - y|))$, $\chi_{x<y}$ … $Pr(x) \equiv ((x > 1) \wedge \forall_{z \le x}(z = 1 \vee z = x \vee \neg(z|x)))$. For the proof for Prime(n), the argument of Euclid in his proof that there exist infinitely many prime numbers is used: $Prime(0) = 0$, $Prime(n + 1) = M_{y=0}^{Prime(n)!+1}(Pr(y) \wedge y > Prime(n))$.
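Bounded minimization and the recursion for Prime(n) can be sketched directly. The code is an illustration under assumptions: the names `bounded_min`, `divides`, `Pr` and `Prime` are hypothetical, and the search bound Prime(n)! + 1 is the one suggested by Euclid's argument (every prime divisor of Prime(n)! + 1 exceeds Prime(n)).

```python
from math import factorial


def bounded_min(pred, z, *xs):
    """M(z, xs): least y <= z with pred(y, xs); 0 when there is none."""
    for y in range(z + 1):
        if pred(y, *xs):
            return y
    return 0


def divides(z, x):
    """z | x, with the convention that 0 divides nothing here."""
    return z != 0 and x % z == 0


def Pr(x):
    """x is prime: x > 1 and every z <= x is 1, x, or a non-divisor."""
    return x > 1 and all(
        z == 1 or z == x or not divides(z, x) for z in range(x + 1)
    )


def Prime(n):
    """Prime(0) = 0; Prime(n + 1) = least prime above Prime(n).

    The bound Prime(n)! + 1 is an assumption taken from Euclid's argument.
    """
    p = 0
    for _ in range(n):
        p = bounded_min(lambda y, p=p: Pr(y) and y > p, factorial(p) + 1)
    return p


first_primes = [Prime(n) for n in range(6)]
```

The bound grows very quickly, but the search stops at the first hit, so only the small initial part of the range is ever inspected.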
It follows from the above that the functions K(z), L(z) : N → N of Theorem 1.8 are primitive recursive. We now propose to discuss a function which is total and not recursive, known as the 'Busy Beaver' function, BB for short.

Definition 1.51 (Busy Beavers) An example of a function which is not recursive is the 'Busy Beaver' function, Radó [15], Aaronson [16]. BB for the given natural number n begins on a Turing machine $TM_{BB}$ with the alphabet {1, B}, B denoting blank, with n states, not counting the halting (blocking) state $q_f$, and with the clean tape. BB determines the value of the function Σ(n) = the maximal number of 1s on the tape when a machine $TM_{BB}$ halts. Endless loops do not count. An impression of the hardness of the problem can be given if we realize that the number of Turing machines satisfying the conditions for the given n is $(4(n + 1))^{2n}$. The function Σ is accompanied by the function S(n) = the maximal number of moves by the machine $TM_{BB}$. Not much is known about the function Σ as well as about the function S. The largest n is about 6, for which one of the best current results is Kropitz [17]: $\Sigma(6) \ge 3.515 \cdot 10^{18267}$.

Theorem 1.33 Functions Σ and S are total and non-computable.

Proof The idea of the proof is to show that Σ grows faster than any computable function.

Claim For each computable function f there exists a natural number $c_f$ such that $\Sigma(n + c_f) \ge f(n)$ for each n ≥ 0.

Proof of Claim. Let TM(f) compute the function f and let $c_f$ be the number of states of TM(f). We define the Turing machine TM(f, n): it writes n 1's to the tape and then begins to emulate TM(f); clearly, (i) $\Sigma(n + c_f) \ge f(n)$ for n ≥ 0.

Let h(n) = Σ(2n); were Σ computable, h would be computable and by (i) we would have (ii) $\Sigma(n + c_h) \ge \Sigma(2n)$. As Σ is monotonically increasing, for $n > c_h$ we have (iii) $\Sigma(n + c_h) < \Sigma(2n)$, contrary to (ii).
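For very small n the Busy Beaver values can be found by exhaustive search. The sketch below rests on assumptions that should be stated plainly: it uses Radó's quintuple convention (each step writes a symbol, moves, and changes state, and the halting transition also counts as a move) rather than the quadruple machines of Definition 1.42, and it treats every machine still running after `max_steps` steps as non-halting, which is a heuristic cut-off and not a proof. The names `run` and `busy_beaver` are hypothetical.

```python
from itertools import product

HALT = -1


def run(machine, max_steps):
    """Run from a blank tape; return (ones, steps) or None at the cut-off."""
    tape, pos, state = {}, 0, 0
    for step in range(1, max_steps + 1):
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if nxt == HALT:
            return sum(tape.values()), step
        state = nxt
    return None  # treated as non-halting (assumption of this sketch)


def busy_beaver(n_states, max_steps=100):
    """Maximal number of ones and of moves over all halting n-state machines."""
    keys = [(q, s) for q in range(n_states) for s in (0, 1)]
    targets = list(range(n_states)) + [HALT]
    # 4 * (n + 1) choices per (state, symbol) pair, 2n pairs in total.
    choices = [(w, m, t) for w in (0, 1) for m in (-1, 1) for t in targets]
    best_ones = best_steps = 0
    for combo in product(choices, repeat=len(keys)):
        result = run(dict(zip(keys, combo)), max_steps)
        if result is not None:
            best_ones = max(best_ones, result[0])
            best_steps = max(best_steps, result[1])
    return best_ones, best_steps


sigma_2, s_2 = busy_beaver(2)
```

For n = 2 the search runs through $(4 \cdot 3)^{4} = 20736$ machines and reproduces the known values Σ(2) = 4 and S(2) = 6; the same approach is already hopeless for n = 5 or 6, which is the point of the discussion above.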
BB leads to some interesting results; for instance, as shown in Yedidia and Aaronson [18], a BB machine on 748 states halts if and only if the set theory ZFC is inconsistent. We are now in a position to arithmetize Turing machines, so that they can, in a sense, compute on themselves. This approach opens up a way to the most important results in computability theory.
1.4 Arithmetization of Turing Machines

To begin with, we assign an initial segment of odd natural numbers to symbols of TM: R → 3, L → 5, $a_0$ → 7, $q_0$ → 9, $a_1$ → 11, $q_1$ → 13, ...; in general, the alphabet symbol $a_i$ is assigned the number 4i + 7 and the state symbol $q_i$ is assigned the number 4i + 9. This enumeration extends over sequences of symbols, e.g., to the expression $q_0 1 R q_1$, the sequence assigned is ⟨9, 11, 3, 13⟩.

Definition 1.52 (Gödel numbering) For an expression $s = s_1 s_2 \ldots s_k$ enumerated by the sequence ⟨$m_1, m_2, \ldots, m_k$⟩, the Gödel number is $gn(s) = \prod_{n=1}^{k} Prime(n)^{m_n}$. This numbering extends over sequences of expressions $E : E_1 E_2 \ldots E_k$: $gn(E) = \prod_{n=1}^{k} Prime(n)^{gn(E_n)}$.

Due to the uniqueness of decomposition of a natural number into a product of primes, no two distinct sequences of expressions have the same Gödel number. As a set X of n expressions can be ordered in n! distinct ways, X has n! distinct Gödel numbers. The central place in our discussion in this section will be taken by the Kleene predicate, Kleene [19].

Definition 1.53 (The Kleene predicate) For n > 0, the Kleene predicate is $T_n(z, x^n, y)$, where z = gn(Z), Z is a Turing machine TM which begins with $ID_1 : q_0 1^{x_1+1} B 1^{x_2+1} B \ldots B 1^{x_n+1}$, and y is the Gödel number of the resulting computation.

The Kleene predicate captures all essential information: full information about TM is encoded in z, and y codes the computation with the initial ID. The fundamental result is the Kleene theorem, Kleene [19].

Theorem 1.34 (Kleene) The predicate $T_n(z, x^n, y)$ is primitive recursive.

Proof Though the proof is lengthy, we will sketch it, bearing in mind the importance of the predicate. The proof consists in a series of statements describing essential aspects of computation (our exposition follows the exposition in Davis [14]).
(P1) outputs the Gödel number of a given member of a sequence of symbols or expressions. Its definition is
$G(n, E) = M_{y=0}^{gn(E)} [(Prime(n)^{y} \mid gn(E)) \wedge \neg(Prime(n)^{y+1} \mid gn(E))]$,
where E is a sequence of expressions or symbols;

(P2) returns for a given sequence the number of its members. Its definition, with x = gn(E), is
$C(x) = M_{y=0}^{gn(E)} [(G(y, E) > 0) \wedge \forall_{j=0}^{gn(E)} (G(y + j + 1, E) = 0)]$;

(P3) asserts that x is a Gödel number with positive exponents. Its definition is
$GN(x) \equiv \forall_{y=1}^{C(x)+1} [(G(y, x) > 0) \vee (G(y + 1, x) = 0)]$;

(P4) asserts that x is the exponent at some prime factor in the Gödel number z. Its formal definition is
$Exp(x, z) \equiv [GN(z) \wedge \exists_{i=1}^{C(z)} (x = G(i, z))]$;

(P5) expresses the Gödel number of the concatenation of two sequences of expressions $E_1$ with $gn(E_1) = x$ and $E_2$ with $gn(E_2) = y$. Its definition is
$x \otimes y = x \cdot \prod_{i=0}^{C(y) -^{*} 1} [Prime(C(x) + i + 1)^{G(i+1, y)}]$;

(P6) holds if its argument is a Gödel number of some state $q_i$. Its definition is
$SN(x) \equiv \exists_{y=0}^{x} (x = 4y + 9)$;

(P7) holds if its argument is a Gödel number of an alphabet symbol. Its definition is
$Al(x) \equiv \exists_{y=0}^{x} (x = 4y + 7)$;

(P8) holds if its argument is an odd number. Its definition is
$Odd(x) \equiv \exists_{y=0}^{x} (x = 2y + 3)$;

(P9) holds if its argument is a Gödel number of a quadruple of the format of an instruction of a TM. Its definition is
$Quad(x) \equiv GN(x) \wedge (C(x) = 4) \wedge SN(G(1, x)) \wedge Al(G(2, x)) \wedge Odd(G(3, x)) \wedge SN(G(4, x))$;
(P10) holds if its arguments are Gödel numbers of two distinct quadruples having the same first two symbols. Its definition is
$ND(x, y) \equiv Quad(x) \wedge Quad(y) \wedge (G(1, x) = G(1, y)) \wedge (G(2, x) = G(2, y)) \wedge (x \neq y)$;

(P11) holds if its argument is a Gödel number of a Turing machine TM. Its definition is
$TM(x) \equiv GN(x) \wedge \forall_{n=1}^{C(x)} [Quad(G(n, x)) \wedge \forall_{m=1}^{C(x)} (\neg ND(G(n, x), G(m, x)))]$;
(P12) yields the Gödel number of the code for the machine representation of its argument. Its definition is
$Code(0) = 2^{11}$, $Code(n + 1) = 2^{11} \otimes Code(n)$,
hence $Code(n) = gn(1^{n+1})$, the Gödel number of ⟨n⟩;

(P13) indicates whether the n-th symbol of the expression with Gödel number x is the symbol 1. Its definition is
$CU(n, x) = 1$ if $G(n, x) = 11$, else $CU(n, x) = 0$;

(P14) is a function which returns the number of ones in an expression with Gödel number x. Its definition is
$Corn(x) = \sum_{n=1}^{C(x)} CU(n, x)$;
(P15) returns the result of a computation by a Turing machine TM. More generally, for a sequence of expressions $E_1, E_2, \ldots, E_n$ with the Gödel number x, it returns the number of ones in $E_n$. Its definition is
$U(x) = Corn(G(C(x), x))$;

(P16) holds if its argument is a Gödel number of an instantaneous description. Its definition is
$ID(x) \equiv GN(x) \wedge \exists_{i=1}^{C(x) -^{*} 1} [SN(G(i, x)) \wedge \forall_{j=1}^{C(x)} ((j = i) \vee Al(G(j, x)))]$;
In order to shorten descriptions that follow, we introduce the extension of the predicate Code of (P12); Code(x n ) will denote the machine representation of x n .
(P17) represents the Gödel number of the initial sequence $q_0 Code(x^n)$. Its definition is
$Init(x^n) = 2^{9} \otimes Code(x_1) \otimes 2^{7} \otimes Code(x_2) \otimes 2^{7} \otimes \ldots \otimes Code(x_n)$;

(P18) The next group of predicates is related to computations by TM. The first predicate holds true when x, y are Gödel numbers of ID1 and ID2 with ID1 ⊢ ID2 according to instruction (1) of Definition 1.42 of the TM with gn(TM) = z. Its definition is
$\vdash_1(x, y, z) \equiv TM(z) \wedge ID(x) \wedge ID(y) \wedge \exists_{A,B,a,b=0}^{x} \exists_{c,d=0}^{y} [(x = A \otimes 2^{a} \otimes 2^{b} \otimes B) \wedge (y = A \otimes 2^{c} \otimes 2^{d} \otimes B)] \wedge SN(a) \wedge SN(c) \wedge Al(b) \wedge Al(d) \wedge Exp(2^{a} \cdot 3^{b} \cdot 5^{d} \cdot 7^{c}, z)$;

(P19) expresses ID1 ⊢ ID2 according to instruction (2) in Definition 1.42. Its definition is
$\vdash_2(x, y, z) \equiv ID(x) \wedge ID(y) \wedge TM(z) \wedge \exists_{A,B,a,b=0}^{x} \exists_{c,d=0}^{y} [(x = A \otimes 2^{a} \otimes 2^{b} \otimes 2^{c} \otimes B) \wedge (y = A \otimes 2^{b} \otimes 2^{d} \otimes 2^{c} \otimes B)] \wedge SN(a) \wedge SN(d) \wedge Al(b) \wedge Al(c) \wedge Exp(2^{a} \cdot 3^{b} \cdot 5^{3} \cdot 7^{d}, z)$;

(P20) holds when ID1 ⊢ ID2 according to instruction (3) in Definition 1.42. It is defined as
$\vdash_3(x, y, z) \equiv ID(x) \wedge ID(y) \wedge TM(z) \wedge \exists_{A,a,b=0}^{x} \exists_{c=0}^{y} [(x = A \otimes 2^{a} \otimes 2^{b}) \wedge (y = A \otimes 2^{b} \otimes 2^{c} \otimes 2^{7})] \wedge SN(a) \wedge SN(c) \wedge Al(b) \wedge Exp(2^{a} \cdot 3^{b} \cdot 5^{3} \cdot 7^{c}, z)$;

(P21) collects (P18)–(P20) into one predicate which holds when x, y are Gödel numbers of ID1, ID2, z is gn(TM), and ID1 ⊢ ID2. Formally,
$\vdash(x, y, z) \equiv \exists_{i=1}^{3} \vdash_i(x, y, z)$;

(P22) holds, where z is a Gödel number of a TM and x a Gödel number of an instantaneous description ID, when ID is a final ID in a computation by TM. Formally,
$Fin(x, z) \equiv ID(x) \wedge TM(z) \wedge \exists_{A,B,a,b=0}^{x} \{(x = A \otimes 2^{a} \otimes 2^{b} \otimes B) \wedge SN(a) \wedge Al(b) \wedge \forall_{i=1}^{C(z)} [(G(1, G(i, z)) \neq a) \vee (G(2, G(i, z)) \neq b)]\}$;

(P23) holds when z is a Gödel number of a TM and y is a Gödel number of a computation by TM. Its formal definition is
$Comp(y, z) \equiv TM(z) \wedge GN(y) \wedge \forall_{i=1}^{C(y) -^{*} 1} \vdash(G(i, y), G(i + 1, y), z) \wedge Fin(G(C(y), y), z)$;

(P24) the final step, in which the Kleene predicate is expressed by the primitive recursive predicates defined above:
$T_n(z, x^n, y) \equiv Comp(y, z) \wedge G(1, y) = Init(x^n)$.

This concludes the proof that the Kleene predicate is primitive recursive.
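The basic coding tools of the proof, namely the symbol codes, the Gödel number of an expression, the exponent function G of (P1), the length function C of (P2) and the concatenation ⊗ of (P5), can be sketched as follows. The function names and the string spelling of symbols ("q0", "a1", "R", "L") are assumptions of this illustration, and the exponents are found by plain factorization rather than by bounded minimization.

```python
from itertools import count, islice
from math import prod


def prime(n):
    """Prime(n) for n >= 1, with Prime(1) = 2."""
    candidates = (
        k for k in count(2)
        if all(k % d for d in range(2, int(k ** 0.5) + 1))
    )
    return next(islice(candidates, n - 1, None))


def symbol_code(symbol):
    """R -> 3, L -> 5, a_i -> 4i + 7, q_i -> 4i + 9."""
    if symbol == "R":
        return 3
    if symbol == "L":
        return 5
    kind, index = symbol[0], int(symbol[1:])
    return 4 * index + (7 if kind == "a" else 9)


def gn(symbols):
    """Goedel number of an expression given as a list of symbols."""
    return prod(
        prime(n) ** symbol_code(s) for n, s in enumerate(symbols, start=1)
    )


def G(n, x):
    """Exponent of Prime(n) in x, as in (P1); x must be positive."""
    p, exponent = prime(n), 0
    while x % p == 0:
        x //= p
        exponent += 1
    return exponent


def C(x):
    """Index of the largest prime dividing x, as in (P2); C(1) = 0."""
    length, k = 0, 1
    while x > 1:
        p = prime(k)
        if x % p == 0:
            length = k
            while x % p == 0:
                x //= p
        k += 1
    return length


def concat(x, y):
    """x (*) y of (P5): shift the exponents of y past those of x."""
    return x * prod(prime(C(x) + i + 1) ** G(i + 1, y) for i in range(C(y)))


instruction = gn(["q0", "a1", "R", "q1"])   # the expression q0 1 R q1
decoded = [G(n, instruction) for n in range(1, 5)]
```

Decoding recovers the sequence ⟨9, 11, 3, 13⟩ from the single number, which is the uniqueness property on which the whole arithmetization rests.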
The Kleene normal form, Kleene [20], expresses values of computable functions in terms of the Kleene predicate.

Theorem 1.35 (The Kleene normal form) For a function $f^{(n)}_{TM}$ computed by a Turing machine Z with gn(Z) = z, the following holds: $f^{(n)}_{TM}(x^n) = U(\min_y [T_n(z, x^n, y)])$. Hence, a function $f(x^n)$ is (partially) computable if and only if $f(x^n) = U(\min_y [T_n(z, x^n, y)])$ for some Gödel number z. Thus, a function is (partially) computable if and only if it is (partial) recursive.
1.5 Recursively Enumerable Sets

Definition 1.54 (Recursively enumerable predicates) For a predicate $P(x^n)$, we call its denotation the set $I(P) = \{x^n : P(x^n) \text{ holds}\}$. A predicate $P(x^n)$ is recursively enumerable (abbrev. RE) if and only if there exists a partially computable function $f(x^n)$ whose domain is the set I(P).

We mention some important properties of RE predicates.

Theorem 1.36 The following hold:
(i) computable predicates are recursively enumerable;
(ii) if a predicate $P(y, x^n)$ is computable, then the predicate $\exists y.P(y, x^n)$ is recursively enumerable;
(iii) if a predicate $P(x^n)$ is recursively enumerable, then there exists a computable predicate $Q(y, x^n)$ such that $P(x^n) \equiv (\exists y.Q(y, x^n))$;
(iv) the Kleene enumeration theorem: if a predicate $P(x^n)$ is recursively enumerable, then $(P(x^n)) \equiv (\exists y.T_n(z, x^n, y))$ for some z.

Proof For (i): Suppose that the predicate $P(x^n)$ is computable, hence its characteristic function $\chi_P$ is computable. Then the set I(P) is the domain of the function $\min_y [\chi_P(x^n) + y = 0]$.
For (ii): the denotation of $\exists y.P(y, x^n)$ is the domain of the function $\min_y [\chi_P(y, x^n) = 0]$.
For (iii), (iv): Suppose that the predicate $P(x^n)$ is recursively enumerable, hence the set I(P) is the domain of a partially computable function $f(x^n)$. By Theorem 1.35, $f(x^n) = U(\min_y [T_n(z, x^n, y)])$ for some z. It follows that $I(P) = dom(f(x^n)) = \{x^n : \exists y.[T_n(z, x^n, y)]\}$, hence $(P(x^n)) \equiv (\exists y.T_n(z, x^n, y))$.
The important relation between computability and recursive enumerability is provided by the following theorem.

Theorem 1.37 A predicate $P(x^n)$ is computable if and only if the predicates $P(x^n)$ and $\neg P(x^n)$ are recursively enumerable.

Proof One direction is simple: if $P(x^n)$ is computable, then $\neg P(x^n)$ is computable and both predicates are recursively enumerable by Theorem 1.36. Conversely, if $P(x^n)$ and $\neg P(x^n)$ are recursively enumerable, then there exist computable predicates $R(y, x^n)$ and $Q(y, x^n)$ such that $(P(x^n)) \equiv (\exists y.R(y, x^n))$ and $(\neg P(x^n)) \equiv (\exists y.Q(y, x^n))$. Clearly, for each argument $x^n$ there exists y for which either $R(y, x^n)$ or $Q(y, x^n)$ is true, so the function $f(x^n) = \min_y [R(y, x^n) \vee Q(y, x^n)]$ is defined everywhere, i.e., it is total, hence computable. As $(P(x^n)) \equiv (R(f(x^n), x^n))$, P is computable.
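The converse direction of the proof is effectively an algorithm: search for the least witness y of either R or Q and report which of the two it satisfies. The sketch below illustrates this with a deliberately simple, hypothetical example (P(x): "x is a perfect square"); the names `decider`, `R` and `Q` follow the proof, while the concrete witness predicates are assumptions chosen for illustration.

```python
from itertools import count


def decider(R, Q):
    """Build a decision procedure for P from witness predicates.

    R(y, x) witnesses P(x) and Q(y, x) witnesses not-P(x); the search
    terminates for every x because one of them must have a witness.
    """
    def P(x):
        f = next(y for y in count() if R(y, x) or Q(y, x))
        return R(f, x)
    return P


def R(y, x):
    """y witnesses that x is a perfect square."""
    return y * y == x


def Q(y, x):
    """y witnesses that x lies strictly between two consecutive squares."""
    return y * y < x < (y + 1) * (y + 1)


is_square = decider(R, Q)
answers = [is_square(x) for x in (0, 1, 2, 49, 50)]
```

If only R were available, the search would run forever on non-squares; it is the second semi-decision procedure that turns recognition into decision.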
It turns out that RE predicates can be characterized as ranges of computable functions. We recall that the class of recursive predicates is closed under the propositional connectives ∧, ∨, ¬: indeed, for (partially) recursive predicates P, Q, the characteristic functions of, respectively, P ∧ Q, P ∨ Q, ¬P are, respectively, $\chi_{P \wedge Q} = (\chi_P + \chi_Q) -^{*} (\chi_P \cdot \chi_Q)$, $\chi_{P \vee Q} = \chi_P \cdot \chi_Q$, $\chi_{\neg P} = 1 -^{*} \chi_P$.
Theorem 1.38 (i) If the denotation of a predicate P(x) is the range of a (partially) computable function f(x), then P is recursively enumerable; (ii) if P is a non-vacuous RE predicate, then there exists a computable function f(x) such that I(P) is the range of f(x).

Proof For (i): By the Kleene normal form, Theorem 1.35, $f(x) = U(\min_y [T(z, x, y)])$ for some z. Then y is a value of f if and only if for some x, u, y = U(u) and T(z, x, u) holds. Hence, $\{y : P(y)\} = \{y : \exists x, u.(y = U(u) \wedge T(z, x, u))\}$, i.e., P is RE.
For (ii) (after Davis [14]): we know that $(P(x)) \equiv (\exists y.T(z, x, y))$ for some z. We denote by $\chi_T(x, y)$ the characteristic function of the predicate T with z fixed. The function f is defined as follows: f(0) = the least element $x_0$ in the denotation I(P); $f(n + 1) = \chi_T(K(n + 1), L(n + 1)) \cdot f(n) + A[\chi_T(K(n + 1), L(n + 1))] \cdot K(n + 1)$ for n ≥ 0. When $\chi_T = 0$, K(n + 1) gives the next value in I(P).
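The construction in (ii) runs through all pairs (x, y) by means of the pairing functions K, L and outputs x whenever y is a witness for x. The sketch below assumes the pairing $f(x, y) = \frac{1}{2}[(x + y)^2 + 3x + y]$ quoted from Theorem 1.8 and inverts it with an integer square root; the names `pair`, `enumerate_range` and `chi_square`, and the toy predicate standing in for the Kleene predicate, are assumptions of this illustration.

```python
from math import isqrt


def pair(x, y):
    """f(x, y) = ((x + y)^2 + 3x + y) / 2."""
    return ((x + y) ** 2 + 3 * x + y) // 2


def K(z):
    """First component of the pair coded by z."""
    t = (isqrt(8 * z + 1) - 1) // 2
    return z - t * (t + 1) // 2


def L(z):
    """Second component of the pair coded by z."""
    t = (isqrt(8 * z + 1) - 1) // 2
    return t - K(z)


def A(x):
    """A(x) = 1 -* x."""
    return 1 if x == 0 else 0


def enumerate_range(chi_T, x0):
    """f(0) = x0; f(n+1) = chi * f(n) + A(chi) * K(n+1), chi = chi_T(K(n+1), L(n+1))."""
    def f(n):
        value = x0
        for m in range(1, n + 1):
            chi = chi_T(K(m), L(m))
            value = chi * value + A(chi) * K(m)
        return value
    return f


def chi_square(x, y):
    """Characteristic function (0 = holds) of 'y is a witness that x = y*y'."""
    return 0 if y * y == x else 1


f = enumerate_range(chi_square, 0)
values = {f(n) for n in range(100)}
```

The function f is total and repeats its previous value until the next witnessing pair appears, so its range is exactly the denotation of the predicate (here, restricted to the first hundred arguments, the squares 0, 1, 4, 9).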
It follows that a predicate is RE when its denotation is the domain of a partially computable function or, equivalently, the range of a computable function. The important upshot of the above discussion is that a set is recursive if and only if both the set and its complement are recursively enumerable.
1.6 Undecidability

We will relate some deep results obtained by means of the Kleene predicate. We denote the predicate $T_1(z, x, y)$ as T(z, x, y).

Theorem 1.39 The predicate Q(x) : ∃y.T(x, x, y) is recursively enumerable but not computable.

Proof Q(x) is recursively enumerable. Suppose that ¬Q(x) is recursively enumerable. By the Kleene enumeration Theorem 1.36(iv), ¬Q(x) ≡ ∃y.T(z, x, y) for some z. Letting x = z, we obtain a contradiction: ∃y.T(z, z, y) ≡ ¬∃y.T(z, z, y). Hence ¬Q(x) is not recursively enumerable and, by Theorem 1.37, Q(x) is not computable.

Definition 1.55 (Decision problems) For a predicate $P(x^n)$, the decision problem is to obtain the answer Yes or No to the question: given an arbitrary argument $a^n$, is it true that $P(a^n)$ holds?

Clearly, the decision problem is closely related to computability: in order to pass the test of the decision problem, the predicate $P(x^n)$ must be computable, i.e., recursive. Hence, a predicate $P(x^n)$ for which the decision problem has a positive answer
is said to be recursively solvable, otherwise it is recursively unsolvable. Other terminology in use is decidable or undecidable.

Theorem 1.40 The predicate ∃y.T(x, x, y) is undecidable.

The other 'classical' undecidable computing problem is the Halting Problem.

Definition 1.56 (The Halting Problem) Consider a Turing machine TM and an ID. The problem is to decide whether TM started with ID as the initial word will halt.

Theorem 1.41 There exists a Turing machine TM for which the Halting Problem is undecidable.

Proof Consider a Turing machine TM which computes the function $f_M(x) = \min_y [T(x, x, y)]$ and a predicate $P_{TM}(x)$: x is the Gödel number of an ID with which TM begins a computation. Then, x ∈ I(P) if and only if ∃y.[T(x, x, y)], also x ∈ I(P) if and only if $P_{TM}(gn(q_0 ⟨x⟩))$. Were P computable, ∃y.[T(x, x, y)] would be computable. As it is not computable, P is not computable.

There are many problems about recursive sets which are undecidable, see, e.g., Rozenberg and Salomaa [21]. For each predicate P which is RE but not recursive, its negation ¬P is not RE. We complete this section with yet another example of a set which is not RE, cf. Davis [14].

Theorem 1.42 The set of Gödel numbers of Turing machines TM whose functions $f_{TM}$ are total is not recursively enumerable.

Proof Let G = {gn(M) : M is a TM, $f_M$ is total}. Were G RE, G would be the range of a computable function f(n), and the function $U(\min_y [T(f(x), x, y)])$ would be computable (total); hence $U(\min_y [T(f(x), x, y)]) + 1$ would be computable and total, so for some $n_0$ we would have $U(\min_y [T(f(x), x, y)]) + 1 = U(\min_y [T(f(n_0), x, y)])$ for each x. For $x = n_0$, we arrive at a contradiction.
1.7 Incompleteness. Non-provability of Truth

Gödel numbering is used in proving fundamental results about non-provability of truth due to Gödel [22] and Tarski [23]. We owe to Smullyan [24] an elegant abstract exposition of the most general framework for these results. They are set in the framework of formal systems. We have met such systems when discussing grammars. The following definition gives the ingredients of a formal system, which of course are realized in different ways for different systems.

Definition 1.57 (Formal systems) A formal system F consists of the following components:
(i) a language L(F);
(ii) a countable set EXP of expressions of the language L(F);
(iii) a subset S ⊆ EXP of statements;
(iv) a subset P ⊆ S of provable statements;
(v) a subset R ⊆ S of refutable statements;
(vi) a subset T ⊆ S of valid statements;
(vii) a set 𝒬 of predicates which express some properties of natural numbers;
(viii) a set of expressions X(n) for X ∈ EXP and n ∈ N;
(ix) a set of expressions of the form Q(n) ∈ S, where Q ∈ 𝒬 and n satisfies Q if and only if Q(n) ∈ T.
Definition 1.58 (Expressibility) A set A of natural numbers is expressed by a predicate Q if and only if A = {n ∈ N : Q(n) ∈ T}, i.e., A is expressed by Q if A collects the numbers n for which Q(n) is valid. A set A is expressible if it is expressed by some predicate Q.

We now bring Gödel numbering into the discussion. We assume that we have infinitely many expressions, sentences and predicates, and we enumerate all these sets with countably many Gödel numbers. Hence, we denote by E_n the expression E with gn(E) = n. Now, we apply the Cantor diagonal argument.

Definition 1.59 (Diagonal expressions) For any expression E_n, the diagonalization of E_n is the expression E_n(n), where n = gn(E_n). For any predicate Q, Q_n(n) is the predicate Q_n acting on its own Gödel number.

Definition 1.60 (Diagonal function) We define δ(n) = gn(E_n(n)). δ is the diagonal function.

Definition 1.61 (Diagonal sets) For any subset A ⊆ N, we denote by A^δ the set of natural numbers n such that δ(n) ∈ A. In fact, A^δ = δ^{-1}(A).

We say that the system F is regular if and only if P ⊆ T and T ∩ R = ∅, i.e., what is provable is valid and what is refutable is not valid; we denote the fact of regularity by the symbol REG(F). Now, we have the Gödel theorem about incompleteness.

Theorem 1.43 (Gödel) If REG(F) and the set (N \ P)^δ is expressible, then there exists a sentence in L(F) which is valid but not provable.

Proof Let Q be the predicate which expresses the set (N \ P)^δ and let n = gn(Q), so Q(n) is true if and only if n ∈ (N \ P)^δ, i.e., if and only if δ(n) ∈ N \ P. As δ(n) ∈ P if and only if Q(n) is provable and δ(n) ∉ P if and only if Q(n) is not provable, it follows that Q(n) is true if and only if Q(n) is not provable. Were Q(n) false, it would be provable, hence true by the assumption of regularity, a contradiction. We are left with the conclusion that the sentence Q(n) is true but not provable.

We now recall the Tarski theorem on non-definability of truth. First some technicalities.
Definition 1.62 (Gödel sentences) A sentence E_n is a Gödel sentence for a set A of natural numbers if and only if E_n is true if and only if n ∈ A; hence, if E_n is false then n ∉ A.

A criterion for a set of natural numbers to have a Gödel sentence is as follows.

Theorem 1.44 For a set A, if the set A^δ is expressible, then there exists a Gödel sentence for A.

Proof Let Q_n be the predicate which expresses A^δ, so Q_n(n) is true if and only if n ∈ A^δ, hence Q_n(n) is true if and only if δ(n) ∈ A. As δ(n) is the Gödel number of Q_n(n), this means that Q_n(n) is a Gödel sentence for A.

We denote by G(T) the set of Gödel numbers of true sentences.

Theorem 1.45 (Tarski)
(i) the set (N \ G(T))^δ is not expressible;
(ii) if for each expressible set A the set A^δ is expressible, then the set N \ G(T) is not expressible;
(iii) if for any expressible set A the set N \ A is expressible, then the set G(T) is not expressible.

Proof Essentially, only (i) is in need of a proof. Were the set (N \ G(T))^δ expressible, there would exist, by Theorem 1.44, a Gödel sentence for N \ G(T), i.e., a sentence which would be true if and only if its Gödel number were not the Gödel number of any true sentence, which is impossible. (ii) and (iii) follow immediately.

The upshot of the Tarski theorem is that in sufficiently strong systems the notion of truth is not definable. Finally, we address the Gödel incompleteness theorem in the general setting.

Definition 1.63 (Consistency) A system is consistent if no sentence is simultaneously provable and refutable, i.e., P ∩ R = ∅.

Definition 1.64 (Decidability) A sentence is decidable if it is either provable or refutable, otherwise it is undecidable.

Definition 1.65 (Completeness) A system is complete if each sentence is decidable, otherwise it is incomplete.

Theorem 1.46 (The Gödel incompleteness theorem) If REG(F) and the set (N \ P)^δ is expressible, then F is incomplete.

Proof Under the same assumptions, by Theorem 1.43, there exists a sentence which is true and not provable. As it is true, it is not refutable by the assumption of regularity (T ∩ R = ∅). Hence, the sentence is undecidable and the system is incomplete.

We have recalled the basic meta-theoretic results on formal systems. Further discussion will take place in Ch. 3 on first-order logic.
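The argument behind Theorem 1.43 can be compressed into one chain of equivalences. The block below is only a restatement, with n = gn(Q), under the reading (an assumption drawn from Definitions 1.59 and 1.60) that δ(n) is the Gödel number of the sentence Q(n), and with P identified, as in the text, with the set of Gödel numbers of provable sentences.

```latex
\begin{align*}
Q(n) \in T
  &\iff n \in (\mathbb{N}\setminus P)^{\delta}
     && \text{($Q$ expresses $(\mathbb{N}\setminus P)^{\delta}$)} \\
  &\iff \delta(n) \in \mathbb{N}\setminus P
     && \text{(Definition 1.61)} \\
  &\iff Q(n) \notin P
     && \text{($\delta(n) = gn(Q(n))$)}
\end{align*}
```

If Q(n) were not in T, the chain would give Q(n) ∈ P ⊆ T, which is impossible; hence Q(n) ∈ T \ P.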
1.8 Complexity

Complexity theory is concerned with assessment of the resources of time and space necessary to perform computations in a given computation model. We adopt the orthodox approach, i.e., computations by Turing machines. We speak therefore of time- and space-complexity and classify problems into classes with respect to the sizes of those resources. Concerning the computing device, there are many models, like lambda-calculus, Post canonical systems, Markov or semi-Thue rewriting systems, DNA computing, Membrane Computing, Quantum computing, Molecular computing, Swarm computing. We restrict ourselves to Turing machines.

There are basically two types of Turing machines (not mentioning some specialized variants of them): single-tape machines and many-tape machines. Though apparently distinct, they are equivalent in the sense to be explicated below. A k-tape TM consists of k tapes, of which tape 1 is the input tape, which is read-only, and tape k is the output tape, on which the machine gives values of solutions or answers Yes or No when the problem is a decision one. Tapes from 2 to k are read-write: the machine can read cell contents and modify them by writing new symbols in place of the current ones. As with one-tape machines, the k-tape TM can find itself in one of the states from the state set Q; one of the states is q_init, which marks the beginning of a computation, and in some problems one of the states is q_halt, whose reaching denotes the end of computation when the machine stops (halts/accepts). As with one-tape TMs, the head on each of the tapes can move one cell left or right in every step. In our description of the one-tape machine, the tape was infinitely extendable in both directions; with k-tape TMs it is often assumed that the tape extends only to the right, so the first move must be to the right.

Theorem 1.47 If on an input of length n the k-tape TM1 uses at most T(n) steps, then the one-tape TM2 emulating TM1 uses at most 4 · k · T(n)^2 steps.
Proof The idea for constructing TM2 is simple: take a segment of length k · T(n) of cells on the tape of TM2 and subdivide it into k sub-segments of length T(n) each. Reproduce on the i-th segment the instantaneous description of the i-th tape of TM1 by marking the positions of heads and performing a double sweep to the right and then back to the left; during the right sweep, read the positions of heads, and after returning make the second forth-and-back sweep in order to change the instantaneous descriptions to the new ones resulting from the instructions. Each of the T(n) steps of TM1 thus costs four sweeps over k · T(n) cells, so the emulation requires at most 4 · k · T(n)^2 steps; if some movements are required for technical reasons, then a small overhead is added.

Due to this result, we will stay with 1-tape TMs, as by this the polynomial complexities will not be affected. Complexity theory makes use of Bachmann's notation (Bachmann [25]) for orders of growth of functions from the set of natural numbers N into itself.
Definition 1.66 (Orders of growth) For functions f, g : N → N,
(i) we say that f is O(g) (f is 'big O' of g) if there exist a constant C > 0 and n_0 ∈ N such that f(n) ≤ C · g(n) for n ≥ n_0;
(ii) in case (i), we denote this case by writing dually that g(n) is Ω(f(n));
(iii) if there exists a finite limit $\lim_{n\to\infty} \frac{f(n)}{g(n)} = C \neq 0$, then f(n) is O(g(n)) and g(n) is O(f(n)). In this case we write that f(n) is Θ(g(n)) and g(n) is Θ(f(n)).
If $\lim_{n\to\infty} \frac{f(n)}{g(n)} = 0$, then we denote this fact by writing that f(n) is o(g(n)).
This convention means that constant coefficients are eliminated; for instance, the estimate for TM2 of 4 · k · T(n)^2 is of order T(n)^2, shortly written down as Θ(T(n)^2).

Definition 1.67 (Complexity classes DTIME and P) We consider first the deterministic case when no two instructions of the machine have a common prefix, hence, in each step of computation at most one instruction may be active. We first consider languages that can be decided in polynomial time, i.e., the time complexity function T(n) is bounded by a polynomial p(n), where n is the length of the input. This case splits into sub-cases bound by polynomials of fixed degree. Thus, classes DTIME(n^k) for each natural number k are introduced first. We say that a TM decides a language L if and only if it computes the characteristic function χ_L(x) of L. We let (χ_L(x) = 0) ≡ (x ∈ L): the value 0 of χ_L(x) means that the string x ∈ L, otherwise x is rejected. The language L is in the class DTIME(n^k) if and only if χ_L(x) can be computed in time T(n) ≤ n^k. The union of classes DTIME(n^k) is the class P of languages having deterministic polynomial time complexity: $\mathrm{P} = \bigcup_{k \in \mathbb{N}} \mathrm{DTIME}(n^k)$.

In the class P we find problems of sorting, graph problems like searching, finding spanning trees, shortest paths etc. There are three issues related to classes of complexity: reduction, hardness and completeness.

Definition 1.68 (Karp reducibility; X-hard, X-complete languages) We say that a language L_1 is time-polynomially Karp reducible to the language L_2 if there exists a function f : {0, 1}* → {0, 1}* computable in polynomial time and such that for each x ∈ {0, 1}*, x ∈ L_1 if and only if f(x) ∈ L_2. This relation is denoted $L_1 \le^{K}_{p} L_2$. Clearly, the relation $\le^{K}_{p}$ is reflexive and transitive, as the composition of two polynomials is a polynomial. For a complexity class X, a language L is X-hard if each language L′ ∈ X is $\le^{K}_{p}$ reducible to L; if in addition L ∈ X, then L is X-complete.
We describe a P-complete problem. We refer to the resolution in propositional logic and we add that the unit resolution is a particular variant of resolution in which the resolution rule is the following: $\dfrac{x,\quad \neg x \vee y_1 \vee y_2 \vee \ldots \vee y_k}{y_1 \vee y_2 \vee \ldots \vee y_k}$, i.e., one of the clauses is a singleton.

Definition 1.69 (The problem CNF-CONTRA-UR) It is the set of propositional formulae in CNF that can be shown unsatisfiable by the unit resolution.
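To make the unit resolution rule concrete, here is a minimal sketch in Python. It is not the procedure of the proof that follows but a common variant in which each clause is replaced in place by its resolvent; the integer encoding of literals (v for a variable, -v for its negation) and the name `refutable_by_unit_resolution` are assumptions made for this illustration only.

```python
from typing import Iterable, List, Optional, Set


def refutable_by_unit_resolution(clauses: Iterable[Iterable[int]]) -> bool:
    """Return True if the empty clause can be derived by unit resolution alone.

    Assumed encoding: a literal is a non-zero int, -v is the negation of v,
    and a clause is a collection of literals.
    """
    work: List[Set[int]] = [set(c) for c in clauses]
    if any(not c for c in work):
        return True  # the empty clause is already present
    processed: Set[int] = set()
    while True:
        # Pick a unit clause whose literal has not been used yet.
        unit: Optional[int] = next(
            (next(iter(c)) for c in work
             if len(c) == 1 and next(iter(c)) not in processed),
            None,
        )
        if unit is None:
            return False  # no further unit resolution step is possible
        processed.add(unit)
        for clause in work:
            if -unit in clause:
                # Resolve {unit} with the clause; the resolvent replaces it.
                clause.discard(-unit)
                if not clause:
                    return True  # empty clause derived
```

Each literal is processed at most once and each pass scans all clauses, so the number of steps is polynomial in the size of the formula. The second assertion in the test below shows a formula that is unsatisfiable yet contains no unit clause, so unit resolution alone cannot refute it.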
Theorem 1.48 CNF-CONTRA-UR is in P.

Proof Consider a formula φ in CNF with k clauses; let C be the set of clauses and C = U ∪ N, where U is the set of unit clauses and N = C \ U. Consider the first unit clause c ∈ U and check c against clauses in N: if c cannot be resolved with any of them, then delete c from the list U; otherwise let c_1 be the first clause in the list N with which c can be resolved. Let c_2 be the resolvent from c and c_1; delete c_1 from the list N. If c_2 is the empty clause, then halt and report that φ is contradictory. If c_2 is a unit clause, attach it to the list U, otherwise attach it to the list N. Repeat while U is non-empty; in the end output that φ is not contradictory. The algorithm requires at most k^2 steps, hence, CNF-CONTRA-UR is in P.

Theorem 1.49 CNF-CONTRA-UR is P-hard.

Proof Let L be a language in P, p(n) a polynomial and TM a Turing machine deciding L in time p(n), one of whose states is q_halt, the halting state. For an input x, we have to construct in polynomial time a formula ψ(x) such that x ∈ L if and only if ψ(x) ∈ CNF-CONTRA-UR. Clearly, the formula ψ(x) must reflect the computation process on x. To this end, we define matrices: $A_{p(|x|) \times |Q|}$, $B_{p(|x|) \times (2p(|x|)+1)}$, $C_{p(|x|) \times (2p(|x|)+1)}$. The matrix A registers states of the machine in consecutive steps of computation, the matrix B registers the position of the head in consecutive steps, and the matrix C carries records of instantaneous descriptions on the tape in consecutive steps. The number 2p(|x|) + 1 of cells stems from the position of the head in the beginning of computation and p(|x|) cells to the right and to the left of this original position of the head, which is the maximal reach of the head during computation.

Let I = {1, 2, . . . , p(|x|)}, J = {1, 2, . . . , 2p(|x|) + 1}. The entries in matrices A, B, C are:
(i) for the matrix A = [a_{i,q}], a_{i,q} = 1 if s(i) = q, else 0, where s(i) is the state of TM at the i-th step;
(ii) for the matrix B = [b_{i,j}], b_{i,j} = 1 if the head is at position j in the i-th step, else 0;
(iii) for the matrix C = [c_{i,j,a}], c_{i,j,a} = 1 if c_{i,j} = a, else 0, where c_{i,j} is the symbol of the alphabet in the step i in the cell j.

We can now start the construction of the formula ψ(x). First, we build the formula ψ_1(x) which will reflect the fact that at each step the head is in exactly one position, there is exactly one state TM is in, and in each position there is exactly one symbol of the alphabet. Thus, we need a formula which, if satisfied, selects exactly one variable to be true: $E(y_1, y_2, \ldots, y_k) \equiv \exists i.\,(y_i \wedge \forall j \neq i.\,\neg y_j)$.
Then, ψ_1(x) is $\forall i \in I.\,E\{a_{i,q} : q \in Q\} \wedge \forall i \in I.\,E\{b_{i,j} : j \in J\} \wedge \forall i \in I, j \in J.\,E\{c_{i,j,a} : a \in \Gamma\}$, where Γ is the alphabet of TM.

The formula ψ_2(x) confirms that the first row of the matrices (i = 1) reflects the initial instantaneous description of the tape. Let x = x_1 x_2 . . . x_m and p(m) = n. Then ψ_2(x) is $a_{1,q_0} \wedge b_{1,n+1} \wedge \forall 1 \le j \le n.\,c_{1,j,B} \wedge c_{1,n+1,x_1} \wedge c_{1,n+2,x_2} \wedge \ldots \wedge c_{1,n+m,x_m} \wedge \forall n+m+1 \le j \le 2n+1.\,c_{1,j,B}$, where B denotes blank, i.e., the empty cell.

The formula ψ_3(x) testifies that the content of a cell may change only if the head is over the cell. Thus, ψ_3(x) is $\forall i \in I, j \in J.\,\forall a \in \Gamma.\,(b_{i,j} = 1 \vee c_{i,j,a} = c_{i+1,j,a})$.

The formula ψ_4(x) certifies that any change, be it of the state, of the head position, or of the contents of the tape, can be effected only in accordance with an instruction. Let an instruction be (q, a, b, m, q′), where m ∈ {−1, 0, +1} encodes the movement, respectively, to the left, no move, to the right. Then, ψ_4(x) is $\forall i \in I, j \in J.\,(\neg a_{i,q} \vee \neg b_{i,j} \vee \neg c_{i,j,a} \vee (a_{i+1,q'} \wedge b_{i+1,j+m} \wedge c_{i+1,j,b}))$.

The formula ψ_5(x) describes the halting step: in it, the unit resolution has to derive $a_{p(|x|),q_f}$ from $\bigwedge_i \psi_i(x)$, hence, ψ_5(x) is $\neg a_{p(|x|),q_f}$.

The formula ψ(x) is ψ_1(x) ∧ ψ_2(x) ∧ ψ_3(x) ∧ ψ_4(x) ∧ ψ_5(x). It is in P and the reduction to it of x is polynomial in time and logarithmic in space: log(|x|) cells are sufficient to record numbers up to 2|x| hence up to |J|.

Corollary 1.3 It follows that CNF-CONTRA-UR is P-complete.

Definition 1.70 (Classes L and NL) Class L is the class of languages which are decided by a deterministic Turing machine in space bounded by the logarithm of the length of the input. Class NL is the class of languages which can be decided by a
non-deterministic Turing machine in space bounded by the logarithm of the length of the input. Clearly, L ⊆ NL. We give an example of a language which is NL-complete.

Definition 1.71 (Log-space reducibility) Language L_1 is log-space reducible to language L_2 if there exists a function f : {0, 1}* → {0, 1}* computable by a log-space bounded Turing machine such that ∀x.((x ∈ L_1) ≡ (f(x) ∈ L_2)) and |f(x)| ≤ c · |x| for some constant c. This relation is denoted ≤_L.

Definition 1.72 (Language DGA) DGA = {((V, E), u, V′) : (V, E) is a directed graph, u is a node in V, V′ ⊆ V, and there exists v ∈ V′ such that there exists a path from u to v}.

Theorem 1.50 DGA is in NL.

Proof Select non-deterministically an edge (u, w) ∈ E; if w ∈ V′, then stop and accept; else, let w take the place of u and repeat. If no node of V′ is reached, output 'No'. Only the current node and a step counter are stored, so the required space is O(log(|V| + |E|)).

Theorem 1.51 DGA is ≤_L-hard for the class NL.

Proof Let A ∈ NL and TM be a log-space bounded Turing machine that accepts A. We construct an instance ((V, E), u, V′) of DGA. Let the computation of TM on an instance x of A be the set of instantaneous descriptions ID_1, ID_2, . . . , ID_k. For each ID_i, let v_i be a node in the graph and V = {v_i : i ≤ k}. Let u = v_1. It is convenient to assume that TM has two tapes: the read-only input tape and the work tape of space O(log|x|). IDs are read and vertices are written to the work tape. The edges are then written to the work tape; an edge (v_i, v_j) is enlisted to E if ID_i ⊢ ID_j, i.e., ID_j is a successor to ID_i. Finally, nodes in V′ are written to the work tape; a node v_i is enlisted into V′ if the state q_i in ID_i is final. Clearly, x ∈ A if and only if there is a path from u into V′, and the space required is O(log|x|).

Corollary 1.4 DGA is NL-complete.

Definition 1.73 (Class NP) A language L ⊆ {0, 1}* is in NP if there exist a polynomial p and a polynomial-time bounded Turing machine TM such that for each x ∈ {0, 1}*, $(x \in L) \equiv (\exists w \in \{0,1\}^{p(|x|)}.\,TM(x, w) = 1)$.
The successful w is called the certificate. Thus, L ∈ NP if it is possible to check in polynomial time for each w whether the proposed w is a solution to the problem. Experience shows, however, that for no NP-complete problem has a polynomial-time algorithm been found. Whether P = NP is a long-standing open problem. An archetypal NP-complete problem is SAT(CNF) along with its variant SAT(3-CNF).
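The role of the certificate can be illustrated on CNF formulae, where a natural certificate is a truth assignment. The following is a minimal sketch of such a verifier; the integer encoding of literals and the name `verify_cnf` are assumptions made for this illustration, not notation from the text.

```python
from typing import Dict, List


def verify_cnf(clauses: List[List[int]], certificate: Dict[int, bool]) -> bool:
    """Check that the assignment `certificate` satisfies every clause.

    Assumed encoding: literal v > 0 stands for variable v, -v for its negation.
    The check visits each literal once, so it runs in time linear in the
    size of the formula.
    """
    def literal_value(lit: int) -> bool:
        value = certificate[abs(lit)]
        return value if lit > 0 else not value

    return all(any(literal_value(lit) for lit in clause) for clause in clauses)


# (x1 or not x2 or x3) and (not x1 or x2 or x3) and (not x1 or not x2 or not x3)
phi = [[1, -2, 3], [-1, 2, 3], [-1, -2, -3]]
good = {1: True, 2: True, 3: False}
bad = {1: True, 2: True, 3: True}
```

Checking a proposed certificate is cheap; what is not known to be possible in polynomial time is finding one among the exponentially many candidates.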
Definition 1.74 (Problems SAT(CNF) and SAT(3-CNF)) Problem SAT(CNF) consists in deciding whether a given propositional formula in CNF is satisfiable. Problem SAT(3-CNF) is SAT for propositional formulae in the normal CNF form in which each clause contains exactly three literals.

Theorem 1.52 SAT(3-CNF) is in NP.

Proof Given an instance φ : C_1 ∧ C_2 ∧ . . . ∧ C_k, where C_i is $l^i_1 \vee l^i_2 \vee l^i_3$ and $l^i_j$ are literals, a certificate is a valuation $V : \{l^i_j : i \le k, j = 1, 2, 3\} \to \{0, 1\}$. Verification whether V(φ) = 1 can be done in time polynomial in the size of φ, measured as the number of functors in φ.

NP-completeness of SAT(CNF) was established in Cook [26] and Levin [27].

Theorem 1.53 SAT(CNF) is NP-complete.

Proof We refer to the proof of P-completeness of the problem CNF-CONTRA-UR; formulae ψ_1(x) − ψ_4(x) are the same in this case, and the formula ψ_5(x) is now $a_{p(|x|),q_f}$. The formula ψ(x) is the conjunction $\bigwedge_{i=1}^{5} \psi_i(x)$.

We now address space complexity.

Definition 1.75 (The class PSPACE) A language L is in PSPACE if there exists a deterministic Turing machine TM that decides L using O(n^k) non-blank cells for some natural number k. Hence, one can define PSPACE as $\bigcup_k \mathrm{PSPACE}(n^k)$, where PSPACE(n^k) is the class of languages that can be decided by deterministic Turing machines in space bounded by n^k.

Definition 1.76 (The class NPSPACE) A language L is in NPSPACE if there exists a non-deterministic Turing machine TM that decides L using O(n^k) non-blank cells for some natural number k. Hence, one can define NPSPACE as $\bigcup_k \mathrm{NPSPACE}(n^k)$, where NPSPACE(n^k) is the class of languages that can be decided by non-deterministic Turing machines in space bounded by n^k.

We will show later that PSPACE = NPSPACE, hence, we restrict ourselves now to the class PSPACE. Our example of a PSPACE-complete problem is again related primarily to logic. It is the TQBF (True Quantified Boolean Formula) problem (Stockmeyer and Meyer [28]).
Definition 1.77 (Formulae QBF, problem TQBF) QBF is a formula in prenex form Q_1 x_1.Q_2 x_2 . . . Q_n x_n.φ(x_1, x_2, . . . , x_n), where each Q_i is either ∀ or ∃ and each x_i is an atomic proposition. Clearly, each QBF is a closed formula, hence either valid or invalid. Problem TQBF (true QBF) is the satisfiability problem for QBF.

Theorem 1.54 TQBF is in PSPACE.
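The truth value of a QBF from Definition 1.77 can be computed by a straightforward recursion on the quantifier prefix. The sketch below is an illustration under assumed conventions: the prefix is a list of pairs (quantifier, variable) with "E" for ∃ and "A" for ∀, the matrix φ is given as a Python function of an assignment, and the name `eval_qbf` is hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Prefix = List[Tuple[str, str]]            # e.g. [("A", "x"), ("E", "y")]
Matrix = Callable[[Dict[str, bool]], bool]


def eval_qbf(prefix: Prefix, matrix: Matrix,
             env: Optional[Dict[str, bool]] = None) -> bool:
    """Evaluate Q1 x1 ... Qn xn . matrix by recursion on the prefix.

    The recursion depth equals the number of quantifiers and each level keeps
    one partial assignment, so the space used is polynomial in the formula
    size, while the running time may be exponential.
    """
    env = dict(env or {})
    if not prefix:
        return matrix(env)
    (quantifier, var), rest = prefix[0], prefix[1:]
    # Evaluate the two restrictions var = 0 and var = 1 lazily, one at a time.
    results = (eval_qbf(rest, matrix, {**env, var: value})
               for value in (False, True))
    return any(results) if quantifier == "E" else all(results)


def differ(e: Dict[str, bool]) -> bool:
    return e["x"] != e["y"]


forall_exists = eval_qbf([("A", "x"), ("E", "y")], differ)  # for all x exists y
exists_forall = eval_qbf([("E", "y"), ("A", "x")], differ)  # exists y for all x
```

The two sample formulae share the same matrix and differ only in the order of quantifiers, which is enough to change the truth value.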
Proof Following (Stockmeyer and Meyer [28]) and (Arora and Barak [29]), we consider a formula Φ : Q_1 x_1 Q_2 x_2 . . . Q_n x_n φ(x_1, x_2, . . . , x_n), where the size of φ in CNF, measured in the number of functors, is m. We consider two cases for Q_1: if Q_1 is ∃, then call the restricted formulas Φ↑[x_1 = 0] and Φ↑[x_1 = 1] and return true if at least one of the restrictions yields true. If Q_1 is ∀, then return true if both restrictions yield true. Assuming that work space is reused, one gets that all computations take space s(n, m) expressed by the recurrence s(n, m) = s(n − 1, m) + O(m), whence s(n, m) = O(n · m), i.e., it is polynomial in the size of the problem.

Theorem 1.55 TQBF is ≤_p-hard for PSPACE.

Proof Consider a language L decided by a deterministic Turing machine in space s(n) and let x ∈ {0, 1}^n. Let m = O(s(n)) be the space needed for instantaneous descriptions for computations on input of length n. We recall the formula ψ_4(x) from the proof of Theorem 1.49. It testifies that two IDs are adjacent IDs in the computation on x. We denote here ψ_4(x) simply as ψ(x). We let ψ_0 be ψ(x) and we let ψ_i(x)(c_1, c_2) be true if there is a path of length at most 2^i from ID c_1 to ID c_2 in the computation on x. Clearly, ψ_i(x)(c_1, c_2) if and only if there exists an ID c such that ψ_{i−1}(x)(c_1, c) ∧ ψ_{i−1}(x)(c, c_2). This allows for an inductive definition of ψ_i(x) as: $(\psi_i(x)(c_1, c_2)) \equiv (\exists c.\,\forall c', c''.\,\{[(c' = c_1) \wedge (c'' = c)] \vee [(c' = c) \wedge (c'' = c_2)]\} \supset \psi_{i-1}(x)(c', c''))$. Then, size(ψ_i(x)) ≤ size(ψ_{i−1}(x)) + O(m). For ψ_m(x), which testifies about an accepting computation, this formula yields size(ψ_m(x)) ≤ O(m^2). Clearly, ψ_m(x) can be converted in polynomial time to the prenex form.

Let us observe that in games of perfect information, like chess, hex or go, the order of appearance of quantifiers codes the result of play. Suppose that the order is: Q_i is ∃ for i odd and it is ∀ for i even.
This means that the first player has a winning strategy in case the described QBF is true. Thus, the existence of a winning strategy is PSPACE-complete.

We now address relations among complexity classes.

Theorem 1.56 $\mathrm{NDSPACE}(f) \subseteq \bigcup_{k>0} \mathrm{DTIME}(k^{f})$.

Proof Consider a non-deterministic f-space bounded Turing machine TM, deciding a language L. For x ∈ L of size n, IDs in the computation on x occupy space at most f(n). The number of IDs is bounded by $n \cdot |Q| \cdot |A|^{f(n)} \le a^{f(n)}$ for some constant a, where A is the alphabet of the machine.
On the basis of this estimate, it is possible to design a deterministic Turing machine, which we denote DTM, which will decide the language L. The machine DTM uses an inductive procedure to check all IDs of TM, inducting on their used space l from l = 1 on. For each l, there are at most d^l IDs of used space l. DTM begins with the initial ID and partitions all IDs into three classes: untried (not yet seen and not known to be reachable from the initial ID), used (already tried and having determined successors), and active (reachable from the initial ID but with successors not yet determined). DTM examines all active IDs in turn. If the ID is final, DTM decides x and halts. If not, then the active ID is labelled used, its untried successor IDs are labelled active, and the next active ID is examined. When all active IDs of space l are examined, DTM repeats the procedure with IDs using space l + 1. There are at most a^l active IDs and at most a^l successor IDs. As l does not exceed f(n), the procedure takes at most b^{f(n)} steps, which proves that DTM is b^{f(n)}-time bounded.

Theorem 1.57 $\mathrm{DTIME}(f) \subseteq \mathrm{NDTIME}(f) \subseteq \mathrm{DSPACE}(f) \subseteq \mathrm{NDSPACE}(f) \subseteq \bigcup_{c>0} \mathrm{DTIME}(c^{f})$.

In this statement only the inclusion NDTIME(f) ⊆ DSPACE(f) requires a comment. A technical result is helpful here.

Theorem 1.58 The following hold: (i) DSPACE(f) = DSPACE(c · f); (ii) NDSPACE(f) = NDSPACE(c · f).

Proof For an f-space bounded deterministic Turing machine TM and c > 0, choose a natural number r such that $\frac{f(n)}{r} \le c \cdot f(n)$. A machine TM1 simulating TM has alphabet V ∪ V^r, where V is the alphabet of TM, and the set of states Q × {1, 2, . . . , r} (we assume the one-tape TM for simplicity). Hence, TM1 can represent in a single cell the content of r cells of TM. IDs of TM are modified appropriately. In consequence, DSPACE(f) ⊆ DSPACE(c · f). By symmetry, DSPACE(c · f) ⊆ DSPACE((1/c) · c · f) = DSPACE(f). Finally, DSPACE(f) = DSPACE(c · f). In the case of NDSPACE we follow along the same lines.
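The compression step in the proof of Theorem 1.58 amounts to packing r consecutive cells of TM into one cell of TM1 whose symbol is an r-tuple. A minimal sketch of that bookkeeping follows; the function names `compress_tape` and `locate`, the blank symbol "B", and the 0-based offsets are assumptions of this illustration (the proof itself uses the state component {1, . . . , r}).

```python
from typing import List, Sequence, Tuple


def compress_tape(cells: Sequence[str], r: int,
                  blank: str = "B") -> List[Tuple[str, ...]]:
    """Pack r consecutive cells into one 'super-cell' (an r-tuple of symbols).

    The last block is padded with blanks, so a tape of s cells is represented
    by ceil(s / r) super-cells.
    """
    if r < 1:
        raise ValueError("r must be a positive integer")
    padded = list(cells) + [blank] * (-len(cells) % r)
    return [tuple(padded[i:i + r]) for i in range(0, len(padded), r)]


def locate(position: int, r: int) -> Tuple[int, int]:
    """Map a head position of TM to (super-cell index, offset inside it).

    The offset plays the role of the second component of the state set
    Q x {1, ..., r}, here counted from 0.
    """
    return divmod(position, r)


packed = compress_tape("abcdefg", 3)
block, offset = locate(6, 3)
```

With r = 3 a tape of seven cells is stored in three super-cells, and the symbol under the head is recovered as `packed[block][offset]`.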
Now, we return to the proof that NDTIME(f) ⊆ DSPACE(f) and we consider a non-deterministic f-time bounded Turing machine TM which in any computation on input of length n can access at most f(n) + 1 cells. For a language L decided by TM and x ∈ L, the deterministic machine TM1 decides x by checking, for successive values of k, the non-deterministic computations of length k by TM until a computation deciding x is found. Computations are specified by strings of symbols of TM indicating possible choices during computation. Space needed is f(|x|) cells to represent successive
strings of length k and space f(|x|) + 1 cells for the computation specified by a given string. By Theorem 1.58, we can compress the space to f(n). This concludes the proof of Theorem 1.57.

For PSPACE versus NPSPACE, we have the result due to Savitch [30].

Theorem 1.59 (The Savitch theorem) If the function s is space-constructible (meaning that there exists a Turing machine that uses s(n) cells on input of length n and s(n) ≥ log n), then NDSPACE(s) ⊆ DSPACE(s^2).

Proof The idea for a proof is similar to one already used. Let TM be a non-deterministic Turing machine that decides a language L of space-complexity s(n). For x ∈ L, with |x| = n, the number of IDs of TM does not exceed 2^{O(s(n))}. Deciding x in deterministic fashion will mean that the accepting ID is reached from the starting ID in at most 2^{O(s(n))} steps. Given two IDs, ID1 and ID2, the recurrence R(ID1, ID2, i) ≡ (∃ID.R(ID1, ID, i − 1) ∧ R(ID, ID2, i − 1)), where R(ID1, ID2, i) means that ID2 is reached from ID1 in at most 2^i steps, requires finding ID. This is achieved by enumerating all IDs in O(s(n)) space and searching for an ID satisfying the recurrence. At i = O(s(n)) the procedure stops and yields a path from the starting ID to a final ID, which results in deciding x. The space complexity of this procedure is given by the recurrence space(O(s(n)), i) = space(O(s(n)), i − 1) + O(s(n)), hence it is O(s(n)^2).

Corollary 1.5 (i) PSPACE = NPSPACE; (ii) L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE = NPSPACE.

Definition 1.78 (Classes co-X, EXPTIME, NEXPTIME) The class co-X for a class X consists of languages L which are of the form {0, 1}* \ L′ for L′ ∈ X. From the logic point of view, the class co-NP is especially interesting, as the complement of the language SAT consists of unsatisfiable formulae, whose negations are formulae which are valid (are tautologies), and the language TAUT consisting of valid formulae is in co-NP. One proves that TAUT is co-NP-complete. The class EXPTIME of exponential time is defined as $\bigcup_k \mathrm{DTIME}(2^{n^k})$. The parallel definition of NEXPTIME is $\bigcup_k \mathrm{NDTIME}(2^{n^k})$. These classes are interestingly related to classes with a linear exponent in place of a polynomial one. The class E is defined as DTIME(k^n), the class NE is NDTIME(k^n).

Theorem 1.60 Each language L in NEXPTIME reduces to a language L′ in NE.

For a proof, see Papadimitriou [31]. An example of a NEXPTIME-complete problem is the Bernays-Schönfinkel-Ramsey SAT(BSR) problem (cf. Ramsey [9]).

Definition 1.79 The SAT(BSR) problem: A formula φ of predicate logic is in BSR form if it is in the prenex form ∃x_1 ∃x_2 . . . ∃x_m ∀y_1 ∀y_2 . . . ∀y_n ψ
and the formula ψ contains only constant and predicate symbols without the predicate of identity. The problem is: given a formula in this language, decide whether it is satisfiable. To this end, we have the following statement.

Theorem 1.61 Consider a formula φ in BSR form with p constant symbols in ψ. Then, the formula φ is satisfiable if and only if it has an interpretation with at most m + p elements.

Proof If M = (D, I) is a model for φ, then let elements u_1, u_2, . . . , u_m of D satisfy the formula ∀y_1 ∀y_2 . . . ∀y_n ψ. Let W be the subset of D which consists of u_1, u_2, . . . , u_m and of all elements of the form I(c) for each constant c in ψ. Then W is a model of cardinality at most m + p for φ.

Theorem 1.62 SAT(BSR) is in NEXPTIME.

Proof If the formula φ requires length q of representation, then m + p < q, and predicates in ψ are of arity less than q; the model of cardinality m + p needs the length O(q^{2q}) to be described, and verification that a guessed set of m + p elements is a satisfying interpretation requires O(q^{m+n}) steps.

Theorem 1.63 SAT(BSR) is NEXPTIME-complete.

Proof We sketch the proof after Papadimitriou [31]. We may assume that a language L in NE is decided by a non-deterministic Turing machine TM with two choices at each step in time 2^n. For each x ∈ L, the formula φ(x) in the BSR form is constructed as follows: the formula ψ(x), the matrix of φ(x) on 2n variables x_1, . . . , x_n, y_1, . . . , y_n, is quantified universally, so φ(x) : ∀x_1, x_2, . . . , x_n.∀y_1, y_2, . . . , y_n.ψ(x). The burden is now on ψ(x). This formula is the conjunction of formulae specifying the computation of TM on x in the way we used for the proofs of completeness of SAT and CNF-CONTRA-UR. For details, please see (Papadimitriou [31], 8.3).

Polynomial hierarchy was introduced in (Meyer and Stockmeyer [32]).
Definition 1.80 (Polynomial hierarchy) A language L is in the class $\Sigma_i^p$ if and only if for a polynomial-time Turing machine TM and a polynomial p,

$x \in L \equiv \exists u_1 \in \{0,1\}^{p(|x|)}\, \forall u_2 \in \{0,1\}^{p(|x|)} \ldots Q_i u_i \in \{0,1\}^{p(|x|)}.\ TM(x, u_1, u_2, \ldots, u_i) = 1$,

where Q_i is ∀ for i even and it is ∃ for i odd. Thus, $\Sigma_1^p$ = NP and $\Sigma_i^p \subseteq \Sigma_{i+2}^p$. Between $\Sigma_i^p$ and $\Sigma_{i+2}^p$, the class $\Pi_{i+1}^p$ = co-$\Sigma_{i+1}^p$ finds itself.

The polynomial hierarchy PH is the union $\bigcup_{i>0} \Sigma_i^p$.

It was shown in (Meyer and Stockmeyer [32]) that if for some i, $\Sigma_i^p = \Pi_i^p$, then PH = $\Sigma_i^p$.

Definition 1.81 (The problem SAT(Σ_i)) It consists in verification of satisfiability of ∃u_1.∀u_2. . . . Q_i u_i.ψ(u_1, u_2, . . . , u_i) = 1.

(Meyer and Stockmeyer [32]) offer us the following result.

Theorem 1.64 The problem SAT(Σ_i) is $\Sigma_i^p$-complete for i = 1, 2, . . ..

A discussion can be found in (Arora and Barak [29]).
1.9 Algebraic Structures

Some completeness proofs for logics are carried out in the algebraic setting; let us mention here the proof by Chang of completeness of the infinite-valued logic of Łukasiewicz or the Rasiowa-Sikorski algebraic approach to meta-theory of logic (Rasiowa and Sikorski [33]). For this reason, we include a short introduction to algebraic structures. In this survey a most abstract rendering of notions and results about algebraic structures is presented, which in some chapters that follow will be given in more specialized contexts.

Definition 1.82 (Lattices) A set L partially ordered by a relation ≤ is a lattice if and only if for each pair x, y ∈ L there exist the least upper bound x ∪ y and the greatest lower bound x ∩ y, with the following properties:
(i) x ≤ x ∪ y;
(ii) y ≤ x ∪ y;
(iii) (x ≤ z) ∧ (y ≤ z) ⊃ (x ∪ y) ≤ z;
(iv) x ∩ y ≤ x;
(v) x ∩ y ≤ y;
(vi) (z ≤ x) ∧ (z ≤ y) ⊃ (z ≤ x ∩ y);
(vii) x ∪ y = y ∪ x;
(viii) x ∩ y = y ∩ x;
(ix) x ∪ (y ∪ z) = (x ∪ y) ∪ z;
(x) x ∩ (y ∩ z) = (x ∩ y) ∩ z;
(xi) (x ∪ y) ∩ x = x;
(xii) (x ∩ y) ∪ x = x;
(xiii) (x ≤ y) ≡ (x ∪ y = y);
(xiv) (x ≤ y) ≡ (x ∩ y = x);
(xv) (x ∪ y = y) ≡ (x ∩ y = x).
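Properties (i)-(xv) can be checked mechanically on a small concrete lattice. The sketch below uses, as an assumed example not taken from the text, the divisors of 60 ordered by divisibility, where the join is the least common multiple and the meet is the greatest common divisor; all names in the block are illustrative.

```python
from itertools import product
from math import gcd

N = 60
L = [d for d in range(1, N + 1) if N % d == 0]   # carrier: divisors of 60


def leq(x: int, y: int) -> bool:
    return y % x == 0                  # x <= y  iff  x divides y


def join(x: int, y: int) -> int:
    return x * y // gcd(x, y)          # least upper bound: lcm


def meet(x: int, y: int) -> int:
    return gcd(x, y)                   # greatest lower bound: gcd


def check_lattice_laws() -> bool:
    """Verify properties (i)-(xv) for all triples of elements."""
    for x, y, z in product(L, repeat=3):
        laws = [
            leq(x, join(x, y)),                                        # (i)
            leq(y, join(x, y)),                                        # (ii)
            not (leq(x, z) and leq(y, z)) or leq(join(x, y), z),       # (iii)
            leq(meet(x, y), x),                                        # (iv)
            leq(meet(x, y), y),                                        # (v)
            not (leq(z, x) and leq(z, y)) or leq(z, meet(x, y)),       # (vi)
            join(x, y) == join(y, x),                                  # (vii)
            meet(x, y) == meet(y, x),                                  # (viii)
            join(x, join(y, z)) == join(join(x, y), z),                # (ix)
            meet(x, meet(y, z)) == meet(meet(x, y), z),                # (x)
            meet(join(x, y), x) == x,                                  # (xi)
            join(meet(x, y), x) == x,                                  # (xii)
            leq(x, y) == (join(x, y) == y),                            # (xiii)
            leq(x, y) == (meet(x, y) == x),                            # (xiv)
            (join(x, y) == y) == (meet(x, y) == x),                    # (xv)
        ]
        if not all(laws):
            return False
    return True
```

Such a finite check is of course no proof of the general properties, but it is a convenient sanity test when experimenting with other candidate orderings.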
Lattices are related by homomorphisms.
Definition 1.83 (Homomorphisms) A mapping h : L_1 → L_2 from the lattice L_1 into the lattice L_2 is a homomorphism if it preserves joins and meets, i.e.,
(i) h(x ∩ y) = h(x) ∩ h(y);
(ii) h(x ∪ y) = h(x) ∪ h(y).
It follows from these identities that the image h(L_1) is a lattice, a sub-lattice of L_2. Lattices L_1 and L_2 are isomorphic if there exists a homomorphism h which is a bijection and h(x) ≤ h(y) if and only if x ≤ y.

Definition 1.84 (Maximal elements in a lattice. Complete lattices) An element a ∈ L is maximal if there is no element greater than it, hence, for x ∈ L, a = a ∪ x, hence, x ≤ a for each x ∈ L. It follows that an element is maximal if and only if it is the greatest element in the lattice L. If it exists, then it is called the unit and denoted 1. Dually, the minimal element is the least element in the lattice L and it is denoted 0 (zero). The notion of the greatest (dually, of the least) element can be defined relative to a subset X of the lattice L. We denote by sup X the join of the subset X, i.e., the least element a greater than any element of X: x ≤ a for each x ∈ X, and if x ≤ b for each x ∈ X then a ≤ b. Dually, we denote by inf X the meet of X, i.e., the greatest element a smaller than any element in X: a ≤ x for each x ∈ X, and if b ≤ x for each x ∈ X then b ≤ a. If sup X and inf X exist for each non-empty set X ⊆ L, then the lattice L is complete. Clearly, if sup L exists, then it is the unit 1, and if inf L exists, then it is the zero 0. Suprema sup X and infima inf X have the properties:
(i) sup X ∪ sup Y = sup{x ∪ y : x ∈ X, y ∈ Y};
(ii) inf X ∩ inf Y = inf{x ∩ y : x ∈ X, y ∈ Y};
(iii) inf X ∪ inf Y ≤ inf{x ∪ y : x ∈ X, y ∈ Y};
(iv) sup{x ∩ y : x ∈ X, y ∈ Y} ≤ sup X ∩ sup Y.
An important consequence of completeness is the Knaster-Tarski fixed point theorem (Theorem 1.2). The Dedekind-MacNeille theorem (Theorem 1.3) asserts that any lattice can be isomorphically embedded into a complete lattice. Filters and their duals, ideals, constitute a powerful tool in deciding completeness.

Definition 1.85 (Filters, ideals) A filter (dually, an ideal) in a lattice L is a subset F of L (respectively, a subset I of L) such that
(i) if x, y ∈ F, then x ∩ y ∈ F; dually, if x, y ∈ I, then x ∪ y ∈ I;
(ii) if x ∈ F and x ≤ y, then y ∈ F; dually, if x ∈ I and y ≤ x, then y ∈ I.
A principal filter F(a) is the set {x ∈ L : a ≤ x}. Dually, a principal ideal I(a) is the set {x ∈ L : x ≤ a}. If a lattice L contains the unit 1, then 1 ∈ F for each filter F. Dually, if the lattice L contains zero 0, then 0 ∈ I for each ideal I. A filter (an ideal) is maximal if it is not contained as a proper subset in any filter (ideal). The intersection of a family of filters is a filter. Given a subset A ⊆ L, there exists the smallest filter F(A) containing A, given as the intersection of all filters containing A. The explicit expression for this filter is given in the theorem that follows.
Theorem 1.65 The filter F(A) is the set {a : ∃x_1, x_2, ..., x_k ∈ A. a ≥ x_1 ∩ x_2 ∩ ... ∩ x_k}. The dual statement for ideals replaces ≥ with ≤ and ∩ with ∪.

If a lattice L contains the zero element 0, then by excluding 0 from the filter in the filter definition, i.e., by adding the condition 0 ∉ F, we define a proper filter. Dually, the condition 1 ∉ I defines the proper ideal I. As the union of any increasing chain of proper filters is a proper filter, it follows by the Zorn maximal principle that

Theorem 1.66 In any lattice L containing 0, each proper filter is contained in a maximal proper filter. Dually, in any lattice containing the unit 1, any proper ideal is contained in a maximal proper ideal.

Definition 1.86 (Prime filters and prime ideals) A proper filter F is prime if from x ∪ y ∈ F it follows that either x ∈ F or y ∈ F. A proper ideal I is prime if from x ∩ y ∈ I it follows that either x ∈ I or y ∈ I.

Definition 1.87 (Distributive lattices) A lattice L is distributive if it satisfies the condition x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z). Then the dual condition x ∪ (y ∩ z) = (x ∪ y) ∩ (x ∪ z) holds also.

Distributivity bears on properties of filters.

Theorem 1.67 Suppose that the lattice L is distributive. Then each maximal filter F (respectively, each maximal ideal I) is prime.

Proof Suppose to the contrary that a maximal filter F is not prime. Then there exist x, y ∈ L such that x ∪ y ∈ F but x ∉ F and y ∉ F. Consider the filter G = {z ∈ L : ∃u ∈ F. z ≥ x ∩ u}. Then F ⊆ G and we have to check that the inclusion is proper and G is a proper filter.
Claim. y ∉ G. Indeed, were y ∈ G, we would have y ≥ x ∩ u for some u ∈ F; as x ∪ y ∈ F and u ∪ y ∈ F, we would have y = (x ∩ u) ∪ y = (x ∪ y) ∩ (u ∪ y) ∈ F, a contradiction. Thus G is a proper filter; as x ∈ G and x ∉ F, F is a proper subset of G, hence F is not maximal, a contradiction.
A dual proof, in which we replace ≤ with ≥ and ∪ with ∩, proves the part for ideals.

We now state and prove the theorem on separation of elements in a lattice by a prime filter, a fact of crucial importance for the proofs of completeness of various logics by algebraic tools in the following chapters.

Theorem 1.68 (Filter separation theorem) Suppose that L is a distributive lattice and x ≠ y are elements of L such that it is not true that x ≤ y. Then there exists a prime filter F such that x ∈ F and y ∉ F.

Proof Consider the set ℱ of all filters on L which contain x and do not contain y. The principal filter F(x) ∈ ℱ, hence, ℱ ≠ ∅. As any linearly ordered chain in ℱ has an upper bound, by the Zorn maximal principle there exists a maximal filter F_m satisfying x ∈ F_m, y ∉ F_m.
Claim. The filter F_m is prime. Suppose to the contrary. We have u, v ∈ L with u ∪ v ∈ F_m but u ∉ F_m and v ∉ F_m. Let F_u be the filter generated by {u} ∪ F_m, and F_v be the filter generated by {v} ∪ F_m.
Sub-claim. Either y ∉ F_u or y ∉ F_v. Were y in F_u and in F_v, we would have q_1, q_2 ∈ F_m with y ≥ u ∩ q_1 and y ≥ v ∩ q_2, hence, for q = q_1 ∩ q_2 ∈ F_m we would have y ≥ u ∩ q and y ≥ v ∩ q, hence, y ≥ (u ∩ q) ∪ (v ∩ q) = (u ∪ v) ∩ q ∈ F_m, so it would follow that y ∈ F_m, a contradiction. This proves Sub-claim.
Suppose that, for instance, y ∉ F_u. Then F_u contains x but not y and F_m ⊆ F_u. As u ∈ F_u and u ∉ F_m, F_u ≠ F_m, contrary to maximality of F_m. This proves Claim and the theorem.

The following corollary is the dual statement for prime ideals.

Corollary 1.6 If x, y are elements of a distributive lattice L, x ≠ y and it is not true that y ≤ x, then there exists a prime ideal I with the property that x ∈ I and y ∉ I.

Definition 1.88 (Complements) Existence of zero or unit elements allows for introduction of complements in a lattice L.
For an element x ∈ L, if 1 ∈ L, then a ∪-complement to x is the least element a ∈ L such that a ∪ x = 1; dually, if 0 ∈ L, then a ∩-complement to x is the greatest element a ∈ L such that a ∩ x = 0. An element a is the complement to x if it is simultaneously the ∩-complement and the ∪-complement. It is denoted by −x.

Theorem 1.69 The following are properties of the complement:
(i) if a ∩ x = 0 and a ∪ x = 1, then a = −x;
(ii) if x ≤ y, then −y ≤ −x;
(iii) −(−x) = x;
(iv) −(x ∩ y) = −x ∪ −y;
(v) −(x ∪ y) = −x ∩ −y;
(vi) −0 = 1; −1 = 0.
Proof Suppose that a ∩ x = 0 and a ∪ x = 1. If x ∩ y = 0, then y = y ∩ (x ∪ a) = (y ∩ x) ∪ (y ∩ a) = y ∩ a, implying that y ≤ a, which shows that a is the ∩-complement to x. Dually, one proves that from x ∪ y = 1 it follows that y ≥ a, hence, a is the ∪-complement to x. Finally, it shows that a is the complement to x.
Suppose that x ≤ y. Then x ∩ −y = (x ∩ y) ∩ −y = 0, i.e., −y ≤ −x.
If −x is the complement to x, then x ∪ −x = 1, x ∩ −x = 0, i.e., x is the complement to −x, i.e., −(−x) = x.
We have (x ∪ y) ∩ (−x ∩ −y) = (x ∩ −x ∩ −y) ∪ (y ∩ −x ∩ −y) = 0 ∪ 0 = 0 and (x ∪ y) ∪ (−x ∩ −y) = (x ∪ y ∪ −x) ∩ (x ∪ y ∪ −y) = 1 ∩ 1 = 1. From the last two facts, we deduce that −(x ∪ y) = −x ∩ −y, which is claim (v); claim (iv) follows by duality. The last claim is obvious as 1 ∩ 0 = 0, 1 ∪ 0 = 1.

The ∩-complement is also called the pseudo-complement.

Definition 1.89 (Relative pseudo-complement) A relative variant of the pseudo-complement is the pseudo-complement of x relative to y, defined as the greatest element a such that x ∩ a ≤ y, denoted x ⇒ y and called the relative pseudo-complement. We will dwell on it awhile because of its role in many-valued logics. We have the fundamental duality (x ⇒ y ≥ z) ≡ (z ∩ x ≤ y).

Theorem 1.70 The following are among properties of the relative pseudo-complement:
(i) x ⇒ y ≥ y;
(ii) x ≤ y if and only if x ⇒ y = 1;
(iii) 1 ⇒ y = y;
(iv) −x = x ⇒ 0.
All these properties follow in a straightforward way from definitions. Theorem 1.71 If in a distributive lattice there exists the complement −x of x, then x ⇒ y = −x ∪ y for each y.
Definition 1.90 (Relatively pseudo-complemented lattices) A lattice L is relatively pseudo-complemented if and only if x ⇒ y exists for each pair x, y of elements of L. As x ⇒ x = 1, each relatively pseudo-complemented lattice is endowed with the unit 1.

Theorem 1.72 Every relatively pseudo-complemented lattice satisfies the following properties (those involving −x assume that the lattice contains the zero 0):
(i) x ⇒ y = 1 if and only if x ≤ y;
(ii) x ⇒ x = 1;
(iii) 1 ⇒ x = x;
(iv) (x ⇒ x) ∩ y = y;
(v) x ∩ (x ⇒ y) = x ∩ y;
(vi) (x ⇒ y) ∩ y = y;
(vii) (x ⇒ y) ∩ (x ⇒ z) = x ⇒ (y ∩ z);
(viii) (x ⇒ z) ∪ (y ⇒ z) ≤ (x ∩ y) ⇒ z;
(ix) x ⇒ (y ⇒ z) = y ⇒ (x ⇒ z);
(x) x ≤ −−x;
(xi) −−−x = −x;
(xii) x ⇒ y ≤ −y ⇒ −x;
(xiii) x ⇒ −y = y ⇒ −x;
(xiv) x ⇒ (y ⇒ (x ∩ y)) = 1;
(xv) (x ⇒ y) ∩ (y ⇒ z) ≤ x ⇒ z;
(xvi) (x ∪ y) ⇒ z = (x ⇒ z) ∩ (y ⇒ z);
(xvii) (x ∩ y) ⇒ z = y ⇒ (x ⇒ z).
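The relative pseudo-complement can be computed directly from its definition in a finite lattice. The sketch below is an illustration under assumptions made here: it uses the divisors of 12 ordered by divisibility (a distributive but not Boolean lattice, with zero 1 and unit 12), and the names imp and neg are hypothetical. It checks a selection of the properties of Theorem 1.72, namely (i), (ii), (iii), (v), (vii), (ix), (x), (xi), (xiv), (xv), (xvi) and (xvii).

```python
from itertools import product
from math import gcd

# Assumed example: divisors of 12 under divisibility; zero is 1, unit is 12.
L = [d for d in range(1, 13) if 12 % d == 0]
ZERO, ONE = 1, 12


def leq(x, y):
    return y % x == 0            # x divides y


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


def imp(x, y):
    """x => y: the greatest element a with meet(x, a) <= y (Definition 1.89)."""
    candidates = [a for a in L if leq(meet(x, a), y)]
    greatest = [c for c in candidates if all(leq(a, c) for a in candidates)]
    assert len(greatest) == 1    # exists and is unique in this lattice
    return greatest[0]


def neg(x):
    """Pseudo-complement, computed as x => 0 (Theorem 1.70(iv))."""
    return imp(x, ZERO)


def check_selected_properties():
    for x, y, z in product(L, repeat=3):
        assert (imp(x, y) == ONE) == leq(x, y)                     # (i)
        assert imp(x, x) == ONE and imp(ONE, x) == x               # (ii), (iii)
        assert meet(x, imp(x, y)) == meet(x, y)                    # (v)
        assert meet(imp(x, y), imp(x, z)) == imp(x, meet(y, z))    # (vii)
        assert imp(x, imp(y, z)) == imp(y, imp(x, z))              # (ix)
        assert leq(x, neg(neg(x)))                                 # (x)
        assert neg(neg(neg(x))) == neg(x)                          # (xi)
        assert imp(x, imp(y, meet(x, y))) == ONE                   # (xiv)
        assert leq(meet(imp(x, y), imp(y, z)), imp(x, z))          # (xv)
        assert imp(join(x, y), z) == meet(imp(x, z), imp(y, z))    # (xvi)
        assert imp(meet(x, y), z) == imp(y, imp(x, z))             # (xvii)
    return True


print(check_selected_properties())
print(neg(neg(2)))   # differs from 2, so x <= --x can be strict here
```

In this lattice the element 2 satisfies −2 = 3 and −3 = 4, so −−2 = 4 is strictly greater than 2; this shows that the inequality in (x) cannot in general be replaced by an equality.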
Filters in relatively pseudo-complemented lattices can be defined by means of relative pseudo-complements.

Theorem 1.73 For a relatively pseudo-complemented lattice L, a subset F of L is a filter if and only if the following conditions are satisfied:
(i) 1 ∈ F;
(ii) if x ∈ F and x ⇒ y ∈ F, then y ∈ F.

Proof Suppose that both conditions hold; by Theorem 1.72(xiv), x ⇒ (y ⇒ (x ∩ y)) = 1 ∈ F. If x, y ∈ F, then, applying (ii) twice, first y ⇒ (x ∩ y) ∈ F and then x ∩ y ∈ F. If x ∈ F and x ≤ y, then x ⇒ y = 1 ∈ F and y ∈ F, so F is a filter. Conversely, if F is a filter, then 1 ∈ F, and if x ∈ F and x ⇒ y ∈ F, then x ∩ y = x ∩ (x ⇒ y) ∈ F by Theorem 1.72(v), hence, y ∈ F as x ∩ y ≤ y.

Definition 1.91 (Filter ordered lattices) Filters induce orderings by letting (x ≤_F y) ≡ (x ⇒ y ∈ F). The relation ≤_F is reflexive as x ⇒ x = 1 ∈ F and transitive as, by Theorem 1.72(xv), x ≤_F y and y ≤_F z imply x ≤_F z.
Definition 1.92 (Quotient spaces modulo filters) The relation
(Q) (x ≡_F y) ≡ [(x ⇒ y ∈ F) ∧ (y ⇒ x ∈ F)]
is an equivalence relation; we denote its classes by [x]_F and the quotient space by the symbol L/≡_F. The quotient space is a lattice ordered by the relation [x]_F ≤ [y]_F if and only if x ⇒ y ∈ F. The lattice operations factor through ≡_F.

Theorem 1.74 The following properties hold:
(i) [x]_F ∪ [y]_F = [x ∪ y]_F;
(ii) [x]_F ∩ [y]_F = [x ∩ y]_F;
(iii) [x]_F ⇒ [y]_F = [x ⇒ y]_F;
(iv) −[x]_F = [−x]_F.
Proof Let us recall the arguments. For (i), x ⇒ x ∪ y = 1, y ⇒ x ∪ y = 1, hence, [x]_F ≤ [x ∪ y]_F and [y]_F ≤ [x ∪ y]_F, thus [x]_F ∪ [y]_F ≤ [x ∪ y]_F. For the converse, suppose that [x]_F ≤ [z]_F and [y]_F ≤ [z]_F, so both x ⇒ z, y ⇒ z ∈ F, hence, by Theorem 1.72(xvi), (x ∪ y) ⇒ z ∈ F, i.e., [x ∪ y]_F ≤ [z]_F, i.e., [x ∪ y]_F is [x]_F ∪ [y]_F. In the proof of (ii), one applies in a similar way properties (i) and (vii) in Theorem 1.72. To prove (iii), one makes use of property (xvii) in Theorem 1.72. The unit in L/≡_F is [1]_F as x ⇒ 1 = 1 for each x, and the zero in L/≡_F is [0]_F as 0 ⇒ x = 1 for each x (both facts under the assumption that L possesses 1, 0, respectively). Finally, −[x]_F = [x]_F ⇒ [0]_F = [x ⇒ 0]_F = [−x]_F.

Remark 1.1 (i) If x ∈ F, then 1 ⇒ x = x ∈ F, i.e., [1]_F ≤ [x]_F, so [1]_F = [x]_F; (ii) from (i), we infer that the filter F is proper if and only if the quotient lattice L/≡_F is at least a two-element set.

We recall that given an element x in a lattice L and a filter F, the set
(G) G(F, x) = {y ∈ L : ∃z ∈ F. y ≥ x ∩ z}
is the least filter containing x and F. We add a property of G(F, x).

Theorem 1.75 G(F, x) is proper if and only if −x ∉ F.

Proof Indeed, were −x ∈ F, both x and −x would be elements of G(F, x), hence, 0 = x ∩ −x ∈ G(F, x), i.e., G(F, x) = L. Conversely, if G(F, x) is not proper, then 0 ≥ x ∩ z for some z ∈ F, i.e., x ∩ z = 0, hence, z ≤ −x and −x ∈ F.

Theorem 1.76 If F is a maximal filter, then for each x ∈ L, F contains either x or −x, hence, L/≡_F is a two-element {0, 1} lattice.
Proof Concerning maximal filters, by Theorem 1.75, for a maximal filter F, given x ∈ L, the filter G(F, x) extends F as a proper filter if and only if −x ∉ F, and then it coincides with F. Hence, a maximal filter F contains x or −x for each element x ∈ L. From this fact it follows that for each x ∈ L, either [x]_F = 1 (if x ∈ F) or [−x]_F = −[x]_F = 1 (if −x ∈ F), hence, [x]_F = 0; thus the quotient L/≡_F contains exactly two elements 0 and 1.

Dually, all properties of filters are true for ideals; in particular, for each element x ≠ 1 there exists a maximal ideal I such that x ∈ I.

Definition 1.93 (Boolean algebras) A Boolean algebra is a distributive lattice in which every element x has the complement −x satisfying properties x ∪ −x = 1, x ∩ −x = 0. The new construct is the difference of elements defined as x − y = x ∩ −y. The relative pseudo-complement ⇒ satisfies the formula x ⇒ y = −x ∪ y.

An example of a Boolean algebra is the two-element algebra L/≡_F defined in Theorem 1.76.

Theorem 1.77 The operations in the algebra L/≡_F are as follows:
(i) 0 = 0 ∪ 0 = 0 ∩ 0 = 0 ∩ 1 = 1 ∩ 0 = 1 ⇒ 0 = −1;
(ii) 1 = 0 ∪ 1 = 1 ∪ 0 = 1 ∪ 1 = 1 ∩ 1 = 0 ⇒ 0 = 0 ⇒ 1 = 1 ⇒ 1 = −0.

Theorem 1.78 The following properties hold in any Boolean algebra:
(i) −(x ∩ y) = −x ∪ −y; −(x ∪ y) = −x ∩ −y;
(ii) −−x = x;
(iii) x ⇒ y = −y ⇒ −x;
(iv) −x ⇒ y = −y ⇒ x;
(v) x ⇒ y = −(x − y).
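The identities of Theorem 1.78 lend themselves to an exhaustive check in small Boolean algebras. The sketch below assumes, for illustration only, the algebra of all subsets of an n-element set with subsets encoded as bitmasks; the function name and the encoding are choices made here. For n = 1 this is the two-element algebra of Theorem 1.77.

```python
from itertools import product


def boolean_laws_hold(n):
    """Check Theorem 1.78 in the algebra of all subsets of an n-element set.

    Subsets are encoded as bitmasks 0 .. 2**n - 1 (an assumed encoding):
    union is |, intersection is &, the zero is 0 and the unit is 2**n - 1.
    """
    top = (1 << n) - 1

    def neg(x):
        return top ^ x                 # complement -x

    def imp(x, y):
        return neg(x) | y              # x => y = -x ∪ y (Definition 1.93)

    def diff(x, y):
        return x & neg(y)              # x - y = x ∩ -y

    for x, y in product(range(top + 1), repeat=2):
        laws = (
            (x | neg(x)) == top and (x & neg(x)) == 0,   # complement axioms
            neg(x & y) == neg(x) | neg(y),               # (i), first part
            neg(x | y) == neg(x) & neg(y),               # (i), second part
            neg(neg(x)) == x,                            # (ii)
            imp(x, y) == imp(neg(y), neg(x)),            # (iii)
            imp(neg(x), y) == imp(neg(y), x),            # (iv)
            imp(x, y) == neg(diff(x, y)),                # (v)
        )
        if not all(laws):
            return False
    return True


print(boolean_laws_hold(1))   # the two-element algebra {0, 1}
print(boolean_laws_hold(3))   # the eight-element algebra of subsets of a 3-set
```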
Concerning filters, in Boolean algebras it is true that

Theorem 1.79 In each Boolean algebra, each prime filter is maximal, hence, notions of a prime filter and a maximal filter are equivalent.

Proof Indeed, if a filter F is prime, then for each x ∈ L it contains 1 = x ∪ −x, hence either x or −x is in F, which is equivalent to maximality.

In (Stone [34]) a representation theory of algebraic structures in topological structures was developed and we include here its algebraic aspects.

Definition 1.94 (The Stone set lattice) For a distributive lattice L, consider the set F(L) of all prime filters on L. For each element x ∈ L, consider the set h(x) = {F ∈ F(L) : x ∈ F} and let St(L) = {h(x) : x ∈ L}. St(L) is the Stone set lattice of L and F(L) is the Stone set of L.

Theorem 1.80 St(L) is a set lattice and h establishes an isomorphism between L and St(L).
Proof That h is injective follows from the separation property of filters: if x ≠ y, then one of x ≤ y, y ≤ x cannot hold, hence there exists a prime filter F which contains only one of x, y, hence, h(x) ≠ h(y).
To prove that h(x ∪ y) = h(x) ∪ h(y), assume that x ∪ y ∈ F, F a prime filter. Then either x ∈ F or y ∈ F, hence, F ∈ h(x) or F ∈ h(y), therefore, F ∈ h(x) ∪ h(y). Conversely, if F ∈ h(x) or F ∈ h(y), then either x ∈ F or y ∈ F, so x ∪ y ∈ F, hence, F ∈ h(x ∪ y).
The proof that h(x ∩ y) = h(x) ∩ h(y) follows on similar lines. If F ∈ h(x ∩ y), then x ∩ y ∈ F, so x ∈ F and y ∈ F, so F ∈ h(x) ∩ h(y). The proof of the converse goes in the reversed direction.

Corollary 1.7 Each distributive lattice is isomorphic to a set lattice.

The reader will find in (Rasiowa and Sikorski [33]) an extensive and deep discussion of the topics in this section.
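The Stone representation of Theorem 1.80 can be traced by brute force on a small distributive lattice. The following sketch is an illustration under assumptions made here: the lattice is the set of divisors of 12 ordered by divisibility (its zero is 1), filters are additionally required to be non-empty, and all names are hypothetical. It enumerates the prime filters, builds the map h, and checks that h is injective and preserves joins and meets.

```python
from itertools import combinations, product
from math import gcd

# Assumed example: divisors of 12 under divisibility; the zero element is 1.
L = [1, 2, 3, 4, 6, 12]
ZERO = 1


def leq(x, y):
    return y % x == 0


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


def is_filter(F):
    """Definition 1.85, with non-emptiness added as an assumption."""
    return (
        len(F) > 0
        and all(meet(x, y) in F for x in F for y in F)
        and all(y in F for x in F for y in L if leq(x, y))
    )


def is_prime_filter(F):
    """Definition 1.86: proper, and x ∪ y in F forces x in F or y in F."""
    return (
        is_filter(F)
        and ZERO not in F
        and all(x in F or y in F
                for x, y in product(L, repeat=2) if join(x, y) in F)
    )


SUBSETS = [frozenset(c) for r in range(len(L) + 1) for c in combinations(L, r)]
PRIME_FILTERS = [F for F in SUBSETS if is_prime_filter(F)]


def h(x):
    """The Stone map of Definition 1.94: all prime filters containing x."""
    return frozenset(F for F in PRIME_FILTERS if x in F)


def stone_map_is_isomorphism():
    injective = len({h(x) for x in L}) == len(L)
    preserves_operations = all(
        h(join(x, y)) == h(x) | h(y) and h(meet(x, y)) == h(x) & h(y)
        for x, y in product(L, repeat=2)
    )
    return injective and preserves_operations


print(sorted(sorted(F) for F in PRIME_FILTERS))
print(stone_map_is_isomorphism())
```

In this example the prime filters turn out to be the principal filters generated by 2, 3 and 4, so each element of the lattice is represented by a subset of a three-element set.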
1.10 Topological Structures

The role played by topology in certain realms of logic calls for an introduction of basic notions and results into this chapter; for more information, please consult Kelley [7].

Definition 1.95 (Topological structures) A topological structure is a pair (Ω, O), where Ω is a non-empty set and O is a family of subsets of Ω which satisfies the following conditions:
(i) O is closed under finite intersections;
(ii) O is closed under arbitrary unions.
In particular, the empty set and Ω are in O. We will denote open sets with symbols F, G, H, ...

Definition 1.96 (Open and closed sets) A set X ⊆ Ω is open if and only if X ∈ O. A set Y is closed if and only if the set X = Ω \ Y is open. It follows by Definition 1.95 (i), (ii) that the collection C of closed sets satisfies the following conditions:
(i) C is closed under arbitrary intersections;
(ii) C is closed under finite unions.
Closed sets will be denoted with symbols K, P, Q, ...

Definition 1.97 (Neighborhoods, interiors, closures) For an element x ∈ Ω, an open neighborhood is an open set G such that x ∈ G. A neighborhood of x is a set X such that there exists an open set G with properties: x ∈ G ⊆ X.
An interior Int X of a set X is the union of all open sets contained in X: Int X = ⋃{G ∈ O : G ⊆ X}. A closure Cl X of a set X is defined as Ω \ Int(Ω \ X). In order to reveal a more digestible characterization, we resort to the local version: x ∈ Cl X if and only if for each neighborhood Y of x, Y ∩ X ≠ ∅. The duality between interior and closure expressed in the formula Cl X = Ω \ Int(Ω \ X) can be expressed as well in the form: Int X = Ω \ Cl(Ω \ X). Similarly, we have a local characterization of interiors: x ∈ Int X if and only if there exists a neighborhood of x contained in X. Clearly, Int X ⊆ X and X ⊆ Cl X.

Theorem 1.81 The following are properties of interiors and closures:
(i) Int(X ∩ Y) = (Int X) ∩ (Int Y); dually, Cl(X ∪ Y) = (Cl X) ∪ (Cl Y);
(ii) (X ⊆ Y) implies Int X ⊆ Int Y and Cl X ⊆ Cl Y;
(iii) Int(Int X) = Int X; dually, Cl(Cl X) = Cl X;
(iv) Int ∅ = ∅ = Cl ∅;
(v) for any open set G and any set X, if G ∩ X = ∅, then G ∩ Cl X = ∅, because X ⊆ (Ω \ G), the set Ω \ G is closed, and Cl X ⊆ (Ω \ G) by (ii);
(vi) for any open set G and any set X, Cl(G ∩ Cl X) = Cl(G ∩ X).
Proof We prove (vi). From right to left: (a) G ∩ X ⊆ G ∩ Cl X; (b) Cl(G ∩ X) ⊆ Cl(G ∩ Cl X). From left to right: we assume that x ∈ Cl(G ∩ Cl X). Take any open neighborhood H of x. Then H ∩ (G ∩ Cl X) = (H ∩ G) ∩ Cl X ≠ ∅, and as H ∩ G is an open set which intersects Cl X, by (v) it intersects X, i.e., H ∩ (G ∩ X) ≠ ∅; as H was arbitrary, x ∈ Cl(G ∩ X).

Definition 1.98 (Boundary sets) The boundary Fr X of a set X is the set (Cl X) ∩ Cl(Ω \ X). Hence, locally, x ∈ Fr X if and only if each neighborhood of x intersects both X and Ω \ X.

As the boundary of a set represents in Chap. 7 the region of uncertain/incomplete knowledge, we deem it useful to include some properties of the Fr operator.

Theorem 1.82 The following are selected properties of boundaries:
(i) Int X = X \ Fr X; dually, Cl X = X ∪ Fr X;
(ii) Fr Int X ⊆ Fr X; dually, Fr Cl X ⊆ Fr X;
(iii) Fr X = Cl X \ Int X;
(iv) Fr(X ∪ Y) ⊆ Fr X ∪ Fr Y;
(v) Fr(X ∩ Y) ⊆ ((Fr X) ∩ Cl Y) ∪ ((Cl X) ∩ Fr Y).
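The operators Int, Cl and Fr can be computed literally from their definitions in a finite topological space, which gives a quick way to test Theorem 1.82 on examples. The sketch below is an illustration only: the four-point space and its family of open sets are assumptions chosen here, as are the function names.

```python
from itertools import combinations

# Assumed example space: four points with five open sets (closed under
# unions and intersections, as Definition 1.95 requires).
OMEGA = frozenset({0, 1, 2, 3})
OPEN = [frozenset(s) for s in ([], [0], [1, 2], [0, 1, 2], [0, 1, 2, 3])]
SUBSETS = [frozenset(c) for r in range(len(OMEGA) + 1)
           for c in combinations(sorted(OMEGA), r)]


def interior(X):
    """Int X: the union of all open sets contained in X."""
    return frozenset().union(*(G for G in OPEN if G <= X))


def closure(X):
    """Cl X = Ω minus Int(Ω minus X)."""
    return OMEGA - interior(OMEGA - X)


def boundary(X):
    """Fr X = Cl X ∩ Cl(Ω minus X) (Definition 1.98)."""
    return closure(X) & closure(OMEGA - X)


def boundary_laws_hold():
    """Check properties (i)-(v) of Theorem 1.82 for all subsets X, Y."""
    for X in SUBSETS:
        fr = boundary(X)
        if interior(X) != X - fr or closure(X) != X | fr:              # (i)
            return False
        if not (boundary(interior(X)) <= fr
                and boundary(closure(X)) <= fr):                       # (ii)
            return False
        if fr != closure(X) - interior(X):                             # (iii)
            return False
        for Y in SUBSETS:
            if not boundary(X | Y) <= fr | boundary(Y):                # (iv)
                return False
            if not boundary(X & Y) <= ((fr & closure(Y))
                                       | (closure(X) & boundary(Y))):  # (v)
                return False
    return True


print(boundary_laws_hold())
print(sorted(boundary(frozenset({0, 1}))))
```

For X = {0, 1} in this space, Int X = {0} and Cl X = Ω, so the boundary is {1, 2, 3}: the element 1 belongs to X but not to its interior.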
Definition 1.99 (Complexity of topologies) Complexity of a topology can be measured by its ability to separate subsets of Ω by open sets. We denote by τ the topology on Ω and we define the basic T2 property:
(i) τ is T2 (Hausdorff) if and only if for each pair x, y of distinct elements in Ω, there exist open disjoint sets G, H such that x ∈ G and y ∈ H.

Definition 1.100 (Continuous mappings) A mapping f : Ω_1 → Ω_2 of a topological space (Ω_1, O_1) into a topological space (Ω_2, O_2) is continuous if and only if f⁻¹(G) ∈ O_1 for each G ∈ O_2. By this definition,
(i) f⁻¹(Int Y) ⊆ Int f⁻¹(Y) for each Y ⊆ Ω_2;
(ii) f(Cl X) ⊆ Cl f(X) for each X ⊆ Ω_1;
(iii) Cl f⁻¹(Y) ⊆ f⁻¹(Cl Y) for each Y ⊆ Ω_2.

Definition 1.101 (Filters) A filter on a space (Ω, τ) is a family of sets F ⊆ 2^Ω which satisfies the following conditions:
(i) if A ∈ F and A ⊆ B, then B ∈ F;
(ii) if A, B ∈ F, then A ∩ B ∈ F;
(iii) ∅ ∉ F.
By the Zorn maximal principle, each filter is a subset of a maximal filter, called an ultrafilter.

Definition 1.102 (Compactness) An open covering of Ω is a family C of open sets with the property that ⋃C = Ω; a space (Ω, O) is compact if each open covering of it contains a finite sub-family which is an open covering.

Definition 1.103 (Centered families. Limit and cluster points) A family of sets is centered if and only if each finite sub-family has a non-empty intersection. An element x is a limit point of a filter F if and only if each neighborhood of x is an element of F. An element x is a cluster point of F if and only if x is in the closure of each element of F.

Theorem 1.83
(i) A space τ is compact if and only if each centered family of closed sets has a non-empty intersection;
(ii) A space τ is compact if and only if each filter has a cluster point;
(iii) A space τ is compact if and only if each ultrafilter has a limit point.

Definition 1.104 (Induced topological structures) Set-theoretic constructs applied to topological spaces produce new structures which inherit many properties. We examine the most important one.
Cartesian products: given a Cartesian product T = ∏_{s∈S} Ω_s with topologies O_s on Ω_s for each s, we call a finite box the Cartesian product ∏_{s∈J} G_s × ∏_{s∉J} Ω_s, where J ⊆ S is finite and G_s ∈ O_s for each s ∈ J. The product topology on T is defined by declaring a subset G of T open if and only if it is the union of some family of finite boxes. The Cartesian product T is related to factors Ω_s by projections p_{s_0} : (x_s)_s ↦ x_{s_0} for s_0 ∈ S.
Projections are continuous: p_{s_0}⁻¹(G_{s_0}) = G_{s_0} × ∏_{s≠s_0} Ω_s. For a filter F on T, and for each s ∈ S, the projection p_s(F) = F_s is a filter; if F is an ultrafilter, then F_s is an ultrafilter.

The Tychonoff theorem (Tychonoff [35]) is one of the most important theorems in topology, and it also finds use in some areas of computer science.

Theorem 1.84 The Cartesian product of compact spaces is compact.

Proof It suffices to verify that for an ultrafilter F, if x_s is a limit point of the ultrafilter F_s for s ∈ S, then (x_s)_{s∈S} is a limit point of F.
Metric spaces are present in many reasoning schemes; we will find them in Gaifman's graphs in Chap. 8.

Definition 1.105 A metric on a set X is a function δ : X × X → R^1 such that
(i) δ(x, y) ≥ 0; (δ(x, y) = 0) ≡ (x = y);
(ii) δ(x, y) = δ(y, x);
(iii) δ(x, y) ≤ δ(x, z) + δ(z, y).

Definition 1.106 (Limit and cluster points) A set X endowed with a metric δ is a metric space. For x ∈ X and r > 0, an open ball is the set B(x, r) = {y : δ(x, y) < r}. In the topology induced by the metric δ, a set is declared open if and only if it is a union of a family of open balls. In metric spaces filters are replaced with sequences: x is a cluster point of a sequence (x_n) if and only if there exists a subsequence (x_{n_k}) with x = lim_k x_{n_k}. That sequences define the topology of a metric space follows from the fact: x ∈ Cl A if and only if x = lim_n a_n for some sequence (a_n) with a_n ∈ A. This is true because each x has a countable neighborhood base: open balls B(x, q_n), where (q_n)_n is a sequence of positive rational numbers with limit 0, have the property that each neighborhood of x contains such a ball for some q_n. Hence, pick by the axiom of choice an element x_n from the intersection A ∩ B(x, q_n) for each n to obtain a sequence converging to x.

Theorem 1.85 If τ_n is a metric space with the metric δ_n bounded by 1 for each n, then the Cartesian product ∏_{n=1}^∞ τ_n is a metric space with the metric $\delta((x_n)_n, (y_n)_n) = \sum_{n=1}^{\infty} \frac{\delta_n(x_n, y_n)}{2^n}$, which yields the finite box topology of ∏_{n=1}^∞ τ_n.

The restriction that metrics δ_n are bounded by 1 is immaterial because the metric min{1, δ_n} introduces the same topology as the metric δ_n.

Definition 1.107 (Degrees of connectedness) A topological space τ = (Ω, O) is connected if the set Ω cannot be represented as the union G ∪ H of two non-empty open disjoint sets G and H; otherwise τ is disconnected. A space τ is totally disconnected if and only if the set Ω contains no connected subset of at least two elements. A space τ is extremely disconnected if and only if, for each open set G, the closure Cl G is open.
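For finite spaces, the notions of Definition 1.107 can be tested by enumeration. The sketch below is an illustration under assumptions made here: the three example topologies on a three-point set and all function names are hypothetical, and the two sets of a disconnection are required to be non-empty.

```python
from itertools import combinations

OMEGA = frozenset({0, 1, 2})


def interior(X, opens):
    """Union of all open sets contained in X."""
    return frozenset().union(*(G for G in opens if G <= X))


def closure(X, opens):
    """Cl X = Ω minus Int(Ω minus X)."""
    return OMEGA - interior(OMEGA - X, opens)


def is_connected(opens):
    """True if Ω is not a union of two non-empty disjoint open sets."""
    return not any(
        len(G) > 0 and G != OMEGA and (OMEGA - G) in opens
        for G in opens
    )


def is_extremely_disconnected(opens):
    """True if the closure of every open set is open."""
    return all(closure(G, opens) in opens for G in opens)


# Three assumed example topologies on the same three-point set.
NESTED = {frozenset(), frozenset({0}), OMEGA}
DISCRETE = {frozenset(c) for r in range(len(OMEGA) + 1)
            for c in combinations(sorted(OMEGA), r)}
TWO_POINTS_OPEN = {frozenset(), frozenset({0}), frozenset({1}),
                   frozenset({0, 1}), OMEGA}

for name, opens in [("nested", NESTED), ("discrete", DISCRETE),
                    ("two open points", TWO_POINTS_OPEN)]:
    print(name, is_connected(opens), is_extremely_disconnected(opens))
```

In the third example the closure of the open set {0} is {0, 2}, which is not open, so that space is connected but not extremely disconnected.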
We now return to the Stone theory (Stone [34]) in its topological aspect.

Theorem 1.86 (The Stone topology) Suppose that a lattice L is a Boolean algebra. The Stone set F(L), with the sets h(a) for a ∈ L taken as a base of open sets, is a compact totally disconnected Hausdorff (T2) topological space. Each element of the Stone set lattice St(L) is open-and-closed and St(L) is a Boolean algebra (a field of sets).

Proof Compactness follows by the existence of the unit 1: suppose that F(L) = ⋃_{s∈S} G_s, each G_s open. As each G_s is a union ⋃_{i∈I_s} h(a_{i,s}), we may assume that sets h(a_{i,s}) for all i, s form an open covering. Suppose, to the contrary, that no finite collection of sets of the form h(a_{i,s}) can cover F(L). As F(L) = h(1), any finite collection a_1, a_2, ..., a_m of elements a_{i,s} satisfies h(⋃_{i=1}^m a_i) ≠ h(1), hence, ⋃_{i=1}^m a_i ≠ 1. Denote by J the ideal generated by the collection of all elements a_{i,s}. Then 1 ∉ J, J is proper, hence, J extends to a maximal ideal J*, which is prime, and the collection {a : a ∉ J*} is a prime filter F*. It follows that no a_{i,s} is an element of F*, i.e., F* belongs to no set h(a_{i,s}), a contradiction. Hence, F(L) is compact.
As h(a) ∪ h(−a) = F(L) and h(a) ∩ h(−a) = ∅, each set h(a) is closed-and-open, hence F(L) is totally disconnected. That F(L) is a Hausdorff space (T2) follows by the same fact: any two distinct filters F_1, F_2 point to some a ∈ L which is an element of only one of them; then h(a) contains one of the filters, say, F_1, and h(−a) contains F_2.
A Boolean algebra is complete if and only if each subset has the least upper bound.

Theorem 1.87 The Stone space of a complete Boolean algebra is extremely disconnected.

Proof Consider an open set G in the Stone space F(L). G is a union ⋃_{i∈J} h(a_i); let a be the l.u.b. of the collection {a_i : i ∈ J}. Then h(a) is closed, contains G and is the smallest closed set with this property, hence, Cl G = h(a) is open.
References
1. Knaster, B., Tarski, A.: Un théorème sur fonctions d'ensembles. Ann. Soc. Polon. Math. 6, 133–134 (1928)
2. Zermelo, E.: Beweis dass jede Menge wohlgeordnet werden kann. Math. Ann. 59, 514–516 (1904)
3. Zorn, M.: A remark on method in transfinite algebra. Bull. Am. Math. Soc. 41, 667–670 (1935)
4. Kuratowski, C.: Une méthode d'élimination des nombres transfinis des raisonnements mathématiques. Fund. Math. 3, 89 (1922)
5. Hausdorff, F.: Grundzüge der Mengenlehre, Leipzig (1914)
6. Tukey, J.: Convergence and uniformity in topology. Ann. Math. Stud. 2 (1940)
7. Kelley, J.L.: General Topology. Springer, New York (1991). (reprint of Van Nostrand (1955))
8. König, D.: Über eine Schlussweise aus dem Endlichen ins Unendliche. Acta Litt. ac. Sci. Hung. Fran. Josephinae, Sect. Sci. Math. 3, 121–130 (1927)
9. Ramsey, F.P.: On a problem of formal logic. Proc. Lond. Math. Soc. 30, 264–286 (1930). (2nd. ser.)
10. Salomaa, A.: Formal Languages. Academic Press, New York (1973)
11. Büchi, J.R.: Weak second-order arithmetic and finite automata. Z. Math. Logik und Grundl. Math. 6, 66–92 (1960)
12. Landweber, L.H.: Decision problems for ω-automata. Math. Syst. Theory 3, 376–385 (1969)
13. Turing, A.M.: On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc. 2(42), 230–265 (1937)
14. Davis, M.: Computability and Unsolvability. McGraw-Hill Book Co., New York (1958)
15. Radó, T.: On non-computable functions. Bell Syst. Tech. J. 41(3), 877–884 (1962)
16. Aaronson, S.: The BB frontier. www.scottaaronson.com//papers//bb.pdf
17. Kropitz, P.: Busy Beaver problem. B.Sc Thesis. University Karlova, Prague (2011). https://is.cuni.cz//webappa//zzp//detail//49210
18. Yedidia, A., Aaronson, S.: A relatively small Turing machine whose behavior is independent of set theory. Complex Syst. 25(4) (2016)
19. Kleene, S.C.: Recursive predicates and quantifiers. Trans. Am. Math. Soc. 53, 41–73 (1943)
20. Kleene, S.C.: Introduction to Metamathematics. Van Nostrand, Princeton NJ (1952)
21. Rozenberg, G., Salomaa, A.: Cornerstones of Undecidability. Prentice Hall (1994)
22. Gödel, K.: Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme 1. Monatshefte für Mathematik und Physik 38, 173–188 (1931)
23. Tarski, A.: Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philos. 1, 261–405 (1936). (also in Eng. transl.: Logic, Semantics, Metamathematics. Oxford University Press, New York (1956))
24. Smullyan, R.M.: Gödel's Incompleteness Theorems. Oxford Logic Guides. Oxford University Press, New York-Oxford (1992)
25. Bachmann, P.: Zahlentheorie. Zweiter Teil: Die Analytische Zahlentheorie. Teubner, Leipzig (1894)
26. Cook, S.A.: The complexity of theorem proving procedures. In: Proceedings of 3rd Annual ACM Symposium on Theory of Computing, pp. 151–158. ACM (1971)
27. Levin, L.A.: Universal sequential search problems. Problems Inf. Transm. 9 (1973)
28. Stockmeyer, L., Meyer, A.R.: Word problems requiring exponential time. In: STOC, pp. 1–9. ACM (1973)
29. Arora, S., Barak, B.: Computational Complexity. Cambridge University Press, Cambridge UK (2009)
30. Savitch, W.J.: Relationships between nondeterministic and deterministic tape complexities. J. Comput. Syst. Sci. 4, 177–192 (1970)
31. Papadimitriou, C.: Computational Complexity, 2nd edn. Longman, Reading MA (1995)
32. Meyer, A.R., Stockmeyer, L.: The equivalence problem for regular expressions with squaring requires exponential time. In: FOCS, pp. 125–129. IEEE (1972)
33. Rasiowa, H., Sikorski, R.: The Mathematics of Metamathematics. PWN-Polish Scientific Publishers, Warszawa (1963)
34. Stone, M.H.: The theory of representations for Boolean algebras. Trans. Am. Math. Soc. 40, 37–111 (1936)
35. Tychonoff, A.: Über die topologische Erweiterung von Räumen. Math. Ann. 102, 544–561 (1929)
Chapter 2
Sentential Logic (SL)
Called also Propositional Logic/Calculus, or Sentential Calculus, this logic is concerned with sentences, i.e., statements to which the value of truth or falsity can be assigned. It provides a cornerstone for other logics and this determines its role as a pioneering field for introduction of notions and concepts later transferred to other logics. At the same time, its simple formulation allows one to see clearly the structure of a logical theory. We use standard notation for connectives; in particular, we apply the ‘horseshoe’ symbol ⊃ in order to denote implication. We treat the adjectives ‘sentential’ and ‘propositional’ as synonyms and we give preference to the adjective ‘sentential’ to stress that logics are about sentences, i.e., statements which are either true or false.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_2

2.1 A Bit of History

Rudiments of SL emerged in Greece in the times of Plato, who undertook an analysis of language, roughly preceding the original system of Syllogistic by Aristotle (384-322 B.C.E.), i.e., a logic of terms built on lines of logic, as figures of Syllogistic are implications of the form ‘if A and B, then C’, where A, B, C have the form of formulae Xab, where a, b are terms like ‘a man’, ‘an animal’, ‘a Greek’ etc., and X is one of four operators: A for ‘all’, E for ‘no one’, I for ‘some’, O for ‘some is not’; in effect, elementary statements are Aab: ‘all a is b’, Eab: ‘no a is b’, Iab: ‘some a is b’, Oab: ‘some a is not b’, yielding 256 possible figures of which 24 are valid. En passant, Aristotle introduced variables into logic (cf. Łukasiewicz [1]).

Sentential logic emerged from attempts by Diodorus Cronus (before 307 B.C.E.) and Philo of Megara (about 400 B.C.E.). Their efforts brought forth the implication understood to be true in all cases except when the antecedent is true and the consequent is false and, in consequence, the truth-functional view on statements. Problems related to contingency and modalities were also initiated by their school. A descendant of this school (called by some the Megaric School), Zeno of Chition (ca. 336-246 B.C.E.), founded the Stoic School (Stoa Poikile (the Painted Porch) was the site for their meetings).

Stoics, principally their member Chrysippus of Soloi (ca. 281-205 B.C.E.), created a system of sentential logic in which they discerned, among sentential connectives, disjunction ‘or’ (though Chrysippus understood disjunction as the exclusive disjunction, which bears on the form and meaning of Moods 4 and 5, below), conjunction ‘and’, and negation ‘it is not the case that ...’, in addition to the already constructed implication ‘if ..., then ...’, and they built a deductive system based on five ‘indemonstrables’ and four ‘themata’. They represented logical arguments as ‘moods’, i.e., sets of numbered terms representing statements either true or false. These were:
Mood 1. If 1st, 2nd; but 1st; therefore 2nd.
Mood 2. If 1st, 2nd; but not 2nd; therefore not 1st.
Mood 3. Not 1st and 2nd; but 1st; therefore not 2nd.
Mood 4. 1st or 2nd; but 1st; therefore not 2nd.
Mood 5. 1st or 2nd; but not 2nd; therefore 1st.
Mood 1 is Modus Ponens (detachment), Mood 2 is Modus Tollens. In addition, four themata were used to reduce complex statements to one of the five indemonstrables. In this way, Stoics created a deduction system of the kind proposed in modern times by Gottlob Frege (Bocheński [2], Bobzien [3], Łukasiewicz [5]). Let us complete this succinct account with the words from (Bocheński [2], p.5): ‘the leading Megaricians and Stoics are among the greatest thinkers in Logic’.
2.2 Syntax of Sentential Logic

Definition 2.1 (The language of SL) The alphabet A of the language L of sentential logic consists of the following categories of symbols.
(i) A countable set P of atomic propositions: P = {p0, p1, p2, ..., pk, ...}. In practice we denote the first few of them by p, q, r, s, ... and use them in formulae;
(ii) The set C of sentential connectives ⊃ (implication), ¬ (negation);
(iii) The set B of parentheses ( ) [ ] { } and the set M of punctuation marks;
(iv) The symbol ⊥.
An expression of SL is any word over A. Some expressions, like pq∨, are not meaningful; therefore, we single out well-formed expressions, denoted wffs (standing for well-formed formulae), defined recursively.

Definition 2.2 (Well-formed formulae of SL) The set of wffs is the smallest set which satisfies the following conditions:
(i) each atomic proposition is a wff;
(ii) if φ and ψ are wffs, then ¬φ and φ ⊃ ψ are wffs;
(iii) ⊥ is a wff.
It follows by (ii) and (iii) that ⊤, defined as ¬⊥, is a wff.

Definition 2.3 (Auxiliary connectives) We introduce connectives and formulae which simplify wffs and by themselves are important in formalization of utterances or written sentences. These connectives are the following:
(i) φ ∨ ψ is (φ ⊃ ψ) ⊃ ψ; ∨ is the connective of disjunction;
(ii) φ ∧ ψ is ¬(φ ⊃ ¬ψ); ∧ is the connective of conjunction;
(iii) φ ≡ ψ is (φ ⊃ ψ) ∧ (ψ ⊃ φ); ≡ is the connective of equivalence.
By rules (i)-(iii) in Definition 2.2, and by Definition 2.3, all formulae involving connectives of Definition 2.3 and ⊤ are defined by means of wffs, hence, they are wffs.

Theorem 2.1 By conditions of Definitions 2.2 and 2.3, the following list consists of well-formed formulae:
(1) (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)];
(2) p ⊃ p;
(3) p ⊃ (q ⊃ p);
(4) p ⊃ (q ⊃ p ∧ q);
(5) [p ⊃ (q ⊃ r)] ⊃ [q ⊃ (p ⊃ r)];
(6) [p ⊃ (q ⊃ r)] ⊃ [(p ⊃ q) ⊃ (p ⊃ r)];
(7) p ⊃ ¬¬p;
(8) (¬¬p) ⊃ p;
(9) p ⊃ p ∨ q;
(10) q ⊃ p ∨ q;
(11) (p ∧ q) ⊃ p;
(12) (p ∧ q) ⊃ q;
(13) (p ⊃ r) ⊃ [(q ⊃ r) ⊃ ((p ∨ q) ⊃ r)];
(14) [p ⊃ (q ⊃ r)] ⊃ (p ∧ q ⊃ r);
(15) (p ∧ q ⊃ r) ⊃ [p ⊃ (q ⊃ r)];
(16) (p ∧ ¬p) ⊃ q;
(17) p ∨ ¬p;
(18) [p ⊃ (p ∧ ¬p)] ⊃ ¬p;
(19) ¬(p ∧ ¬p);
(20) (p ⊃ q) ⊃ (¬p ∨ q);
(21) (¬p ∨ q) ⊃ (p ⊃ q);
(22) (p ⊃ q) ⊃ (¬q ⊃ ¬p);
(23) (¬q ⊃ ¬p) ⊃ (p ⊃ q);
(24) (p ⊃ ¬q) ⊃ (q ⊃ ¬p);
(25) (¬p ⊃ q) ⊃ (¬q ⊃ p);
(26) [(p ⊃ q) ⊃ p] ⊃ p;
(27) ¬(p ∨ q) ⊃ (¬p ∧ ¬q);
(28) (¬p ∧ ¬q) ⊃ ¬(p ∨ q);
(29) ¬(p ∧ q) ⊃ (¬p ∨ ¬q);
(30) (¬p ∨ ¬q) ⊃ ¬(p ∧ q);
(31) [(p ∧ q) ∨ r] ⊃ [(p ∨ r) ∧ (q ∨ r)];
(32) [(p ∨ r) ∧ (q ∨ r)] ⊃ [(p ∧ q) ∨ r];
(33) [(p ∨ q) ∧ r] ⊃ [(p ∧ r) ∨ (q ∧ r)];
(34) [(p ∧ r) ∨ (q ∧ r)] ⊃ [(p ∨ q) ∧ r];
(35) (p ⊃ q) ∧ (q ⊃ p) ⊃ (p ≡ q);
(36) (p ≡ q) ⊃ (p ⊃ q) ∧ (q ⊃ p).
Some formulae in this list bear historic names: (27)–(30) are De Morgan laws, (17) is the law of excluded middle, (19) is the law of contradiction, (31)–(34) are laws of distribution, (1) is transitivity of implication (the hypothetical syllogism), (16) is the Duns Scotus formula.

Definition 2.4 (Minimal sets of connectives) A set C of connectives is minimal (a base) if all connectives can be defined from C and no proper subset of C has this property. For instance, the set {⊃, ¬} is minimal; if we accept ⊥, then ¬φ can be defined as φ ⊃ ⊥, and the set {⊃, ⊥} is minimal. Other minimal sets are {∨, ¬}, {∧, ¬}. Indeed, from the set {∨, ¬}, we obtain the following definitions:
(i) p ∧ q is ¬(¬p ∨ ¬q);
(ii) p ⊃ q is ¬p ∨ q;
(iii) p ≡ q is (¬p ∨ q) ∧ (¬q ∨ p), i.e., ¬[¬(¬p ∨ q) ∨ ¬(¬q ∨ p)].
For the set {∧, ¬}, the definitions are the following:
(i) p ∨ q is ¬(¬p ∧ ¬q);
(ii) p ⊃ q is ¬(p ∧ ¬q);
(iii) p ≡ q is [¬(p ∧ ¬q)] ∧ [¬(q ∧ ¬p)].
That the set {⊃, ¬} suffices to define the remaining connectives was shown in Definition 2.3. If we define ¬p as p ⊃ ⊥, then we may use this definition for the set {⊃, ⊥} to obtain
(i) ¬p is p ⊃ ⊥;
(ii) p ∨ q is (p ⊃ ⊥) ⊃ q;
(iii) p ∧ q is (p ⊃ (q ⊃ ⊥)) ⊃ ⊥.
We have deliberately left ≡ off.

Definition 2.5 (Other connectives. Sheffer’s stroke, Peirce’s arrow) These two connectives have the property that each of them defines all 16 Boolean functions of two variables. The Sheffer stroke D (the alternative denial) is defined as follows: p D q is p ⊃ ¬q. From this definition, we are able to express sentential connectives in terms of D.
(i) ¬p is p D p;
(ii) p ⊃ q is p D (p D q).
Having ⊃ and ¬ defined in terms of D, we can use Definition 2.3 in order to express ∨, ∧, ≡ in terms of D. The Peirce arrow ↓ (denoted also NOR, ‘not OR’) is expressed in terms of D as: p ↓ q is ¬(¬p D ¬q), hence,
(i) ¬p is p ↓ p;
(ii) p ⊃ q is [(p ↓ p) ↓ q] ↓ [(p ↓ p) ↓ q];
(iii) p ∨ q is (p ↓ q) ↓ (p ↓ q);
(iv) p ∧ q is (p ↓ p) ↓ (q ↓ q).
Peirce’s arrow in terms of our standard connectives is a conjunction of negations.
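The definitions of Definition 2.5 can be checked mechanically by comparing truth functions on all arguments. The following is a minimal sketch in Python; the function names (nand, nor, imp, agree) are ours and serve only this illustration.

```python
from itertools import product

def nand(p, q):
    """Sheffer stroke D: p D q is p ⊃ ¬q, i.e., not both p and q."""
    return 1 - (p & q)

def nor(p, q):
    """Peirce arrow: p ↓ q is the conjunction of negations ¬p ∧ ¬q."""
    return (1 - p) & (1 - q)

def imp(p, q):
    """Implication p ⊃ q on truth values 0, 1."""
    return max(1 - p, q)

def agree(lhs, rhs, arity):
    """True if both truth functions agree on all 0/1 arguments."""
    return all(lhs(*v) == rhs(*v) for v in product((0, 1), repeat=arity))

sheffer_ok = all([
    agree(lambda p: 1 - p, lambda p: nand(p, p), 1),           # ¬p is p D p
    agree(imp, lambda p, q: nand(p, nand(p, q)), 2),           # p ⊃ q is p D (p D q)
])
peirce_ok = all([
    agree(nor, lambda p, q: 1 - nand(1 - p, 1 - q), 2),        # p ↓ q is ¬(¬p D ¬q)
    agree(lambda p: 1 - p, lambda p: nor(p, p), 1),            # (i)
    agree(imp, lambda p, q: nor(nor(nor(p, p), q),
                                nor(nor(p, p), q)), 2),        # (ii)
    agree(lambda p, q: p | q,
          lambda p, q: nor(nor(p, q), nor(p, q)), 2),          # (iii)
    agree(lambda p, q: p & q,
          lambda p, q: nor(nor(p, p), nor(q, q)), 2),          # (iv)
])
print(sheffer_ok, peirce_ok)
```

Each check enumerates at most four argument pairs, so the comparison is exhaustive.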
2.3 Semantics of Sentential Logic

Sentential logic is truth-functional: the truth value of a formula depends solely on truth values of atomic propositions from which it is built. There are two truth values: truth denoted 1 and falsity denoted 0; accordingly, under a given assignment a formula may only be true or false. A discussion of semantics begins with truth tables for connectives. Truth tables give values of formulae of the form p X q, where X is a binary connective of disjunction, conjunction, implication or equivalence, and values of formulae of the form ¬p, as truth functions of truth values of atomic propositions p, q. An assignment A is a function which assigns to each atomic proposition a truth value 0 or 1. The truth table gives values of formulae for all assignments.

Definition 2.6 (Truth tables for connectives) Table 2.1 collects truth functions for connectives. The 0-ary connective ⊥ (‘falsum’) is constantly valued 0, i.e., it is always invalid (false). The negation of ⊥, ¬⊥, denoted ⊤ (‘verum’, a ‘thumbtack’), is always valid by the truth table for negation.

Table 2.1 Truth-functional description of connectives and of ⊥

p  q  ¬p  p∨q  p∧q  p→q  p≡q  ⊥
0  0  1   0    0    1    1    0
0  1  1   1    0    1    0    0
1  0  0   1    0    0    0    0
1  1  0   1    1    1    1    0
Values of truth functions for atomic propositions that occur in a formula φ determine the value of the truth function for φ. The notation φ(p1, p2, ..., pn) indicates that φ is built from atomic propositions p1, p2, ..., pn. We denote by A∗(φ) the truth value of a formula φ under the assignment A; then A∗(φ) = A∗(A(p1), A(p2), ..., A(pn)). Clearly, for a formula with n atomic propositions, the number of possible assignments is 2^n.

Definition 2.7 (Sub-formulae) The set Sub(φ) of sub-formulae of the formula φ is defined by the following conditions:
(i) Sub(p) = {p} for each atomic proposition p;
(ii) Sub(¬φ) = Sub(φ) ∪ {¬φ};
(iii) Sub(φ ◦ ψ) = Sub(φ) ∪ Sub(ψ) ∪ {φ ◦ ψ}, where ◦ is a binary connective;
(iv) Sub(⊥) = ∅ = Sub(⊤).
We denote members of the set Sub(φ) by the generic symbol sub(φ).

Definition 2.8 (Immediate sub-formulae) The notion of an immediate sub-formula, in symbols: im-sub(φ), is defined as follows:
(i) im-sub(p) does not hold for any atomic proposition p;
(ii) im-sub(¬φ) holds for φ only;
(iii) im-sub(φ ◦ ψ) holds only for φ and ψ, for each binary connective ◦.

Definition 2.9 (Subordination relation sub. The formation tree) For formulae η, χ, we say that η is subordinated to χ, η sub χ, if there exists a sequence χ1, χ2, ..., χk such that (i) χ1 is η; (ii) χk is χ; (iii) for each i ≤ k − 1, χi im-sub χi+1 holds, i.e., χi is an immediate sub-formula of χi+1. We call such a sequence a branch. The formation tree of a formula is the graph of the relation im-sub.

By the Kuratowski-Hausdorff lemma (Theorem 1.40 (ii)), each branch extends to a maximal branch. By its definition, each maximal branch for a formula φ has φ as the greatest element and an atomic proposition as the least element. Therefore the undirected graph of the relation sub is connected. For any sub-formula of the form η ◦ χ, formulae η and χ initiate two disjoint branches (η ◦ χ is the bifurcation point). The relation sub is transitive. The graph of the relation sub is acyclic. Hence, the graph of the relation im-sub for any formula φ is a tree with the root φ. In Fig. 2.1, we show the formation tree for the formula φ : (p ⊃ q) ⊃ (¬p ∨ q). We may use this tree to check satisfiability of the formula φ by a procedure called labelling. For an assignment A which assigns truth values at the leaves of the formation tree, we go up the tree and at each node we compute the value of A at that node from already computed values of A at immediate sub-formulae for that node. When we have reached the root, we have found the truth value of the formula at the root of the tree for the assignment A.
Fig. 2.1 Formation tree
This procedure shows that the truth function for the formula φ at the root of the tree is actually the composition of truth functions of sub-formulae, composed according to the pattern of the formation tree. Labelling allows us to avoid composing these functions explicitly by walking the tree bottom-up. This is an example of structural induction: inferring that a formula φ satisfies a property P from the fact that P is satisfied by sub-formulae of φ. Yet another characteristic of a formula is its size.

Definition 2.10 (The size of a formula) The size of a formula, size(φ), is defined by structural induction as follows:
(i) size(p) = 0;
(ii) size(¬φ) = 1 + size(φ);
(iii) size(φ ◦ ψ) = size(φ) + size(ψ) + 1;
(iv) |Sub(φ)| ≤ 2^size(φ).
It is easy to realize that si ze(φ) is the number of occurrences of connectives in φ. As each sub-formula takes a subset of these connectives, we obtain the inequality (iv). Definition 2.11 (Validity, satisfiability, unsatisfiability) We denote by the symbol A∗ (φ, A) the value of the truth function A∗ (φ) under assignment A. In case A∗ (φ, A) = 1, we say that φ is satisfied by A. A formula φ is valid (is a tautology) if and only if it is satisfied by each assignment A. A formula φ is unsatisfiable if and only if the formula ¬φ is valid. A formula φ is satisfiable if and only if there exists an assignment by which the formula φ is satisfied. Definition 2.12 (Truth tables) A truth table for a formula φ is a tabular form of the formation tree: for each assignment A, we compute values of sub-formulae by structural induction and place them in the table under appropriate headings. A formula is valid when the column headed by the formula contains only values equal to 1.
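The notions of Definitions 2.7, 2.10 and 2.11 and the labelling procedure can be sketched in a few lines of Python. The encoding of formulae as nested tuples and all function names below are our assumptions for the sake of illustration, not notation of the text.

```python
from itertools import product

# Assumed encoding: an atom is a string such as 'p'; compound formulae are
# ('not', f) or (op, f, g) with op in {'imp', 'or', 'and', 'iff'}.
BINARY = {
    'imp': lambda a, b: max(1 - a, b),
    'or': max,
    'and': min,
    'iff': lambda a, b: int(a == b),
}

def atoms(f):
    """Atomic propositions occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def sub(f):
    """The set Sub(f) of Definition 2.7."""
    if isinstance(f, str):
        return {f}
    return {f}.union(*(sub(g) for g in f[1:]))

def size(f):
    """Number of occurrences of connectives in f (Definition 2.10)."""
    if isinstance(f, str):
        return 0
    return 1 + sum(size(g) for g in f[1:])

def value(f, A):
    """Labelling: compute the value of f bottom-up under assignment A."""
    if isinstance(f, str):
        return A[f]
    if f[0] == 'not':
        return 1 - value(f[1], A)
    return BINARY[f[0]](value(f[1], A), value(f[2], A))

def assignments(f):
    """All 2^n assignments to the n atoms of f, as dictionaries."""
    names = sorted(atoms(f))
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))

def is_valid(f):
    return all(value(f, A) == 1 for A in assignments(f))

def is_satisfiable(f):
    return any(value(f, A) == 1 for A in assignments(f))

# The formula of Fig. 2.1: (p ⊃ q) ⊃ (¬p ∨ q)
phi = ('imp', ('imp', 'p', 'q'), ('or', ('not', 'p'), 'q'))
print(size(phi), len(sub(phi)), is_valid(phi))
```

For the formula of Fig. 2.1 the sketch reports size 4 and six sub-formulae, in agreement with the bound (iv) of Definition 2.10, and confirms validity.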
Table 2.2 Truth table for the formula φ

p  q  r  p→q  q→r  p→r  (q→r)→(p→r)  φ
0  0  0  1    1    1    1            1
0  0  1  1    1    1    1            1
0  1  0  1    0    1    1            1
1  0  0  0    1    0    0            1
0  1  1  1    1    1    1            1
1  0  1  0    1    1    1            1
1  1  0  1    0    0    1            1
1  1  1  1    1    1    1            1
Table 2.2 is the truth table for the formula φ : ( p ⊃ q) ⊃ [(q ⊃ r ) ⊃ ( p ⊃ r )]. It follows that φ is valid. One may check that formulae listed above as (1)-(36) are valid. Definition 2.13 (Models. Logical consequence) For a formula φ, an assignment A such that A∗ (φ, A) = 1 determines a model for φ, we may write this fact down as A |= φ. This notion extends to sets of formulae. For a set Γ of formulae, we write A |= Γ if and only if A∗ (φ, A) = 1 for each formula φ ∈ Γ. We say that a formula φ is a logical consequence of a set Γ of formulae if and only if for each assignment A, if A |= Γ , then A |= φ. We express this fact of logical consequence with the formula Γ |= φ.
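Logical consequence of Definition 2.13 can be decided for finite sets of formulae by inspecting all assignments. A minimal sketch follows, with an assumed tuple encoding of formulae (atoms as strings, ('not', f), and (op, f, g) for op among 'imp', 'or', 'and'); the name entails is ours.

```python
from itertools import product

def value(f, A):
    """Truth value of f under assignment A (a dict from atoms to 0/1)."""
    if isinstance(f, str):
        return A[f]
    if f[0] == 'not':
        return 1 - value(f[1], A)
    a, b = value(f[1], A), value(f[2], A)
    return {'imp': max(1 - a, b), 'or': max(a, b), 'and': min(a, b)}[f[0]]

def atoms(f):
    """Atomic propositions occurring in f."""
    return {f} if isinstance(f, str) else set().union(*map(atoms, f[1:]))

def entails(gamma, phi):
    """Check Γ |= φ: every assignment satisfying all of Γ satisfies φ."""
    names = sorted(set().union(atoms(phi), *map(atoms, gamma)))
    for bits in product((0, 1), repeat=len(names)):
        A = dict(zip(names, bits))
        if all(value(g, A) == 1 for g in gamma) and value(phi, A) == 0:
            return False          # A is a model of Γ but not of φ
    return True

premises = [('imp', 'p', 'q'), ('imp', 'q', 'r')]
print(entails(premises, ('imp', 'p', 'r')))   # hypothetical syllogism
print(entails([('imp', 'p', 'q')], 'p'))
```

With an empty set of premises, the check reduces to validity of φ.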
2.4 Normal Forms

Definition 2.14 (Semantic equivalence) Formulae φ and ψ are semantically equivalent when they have the same value of the truth function A∗ for each assignment A, i.e., A∗(φ, A) = A∗(ψ, A) for each assignment A. We denote semantic equivalence with the symbol ≡_A. This means that the formula φ ≡ ψ is valid.

Normal forms of formulae are distinguished due to their specialized structure. We are going to discuss negation normal forms, conjunctive normal forms and disjunctive normal forms; the latter two are of importance for later developments, the first is put here just for the record. Each normal form of a formula is semantically equivalent to it.

Definition 2.15 (Negation normal form) A negation normal form is a form of a formula in which the connective of negation is applied only to atomic propositions and the only other connectives are those of conjunction and disjunction. This form is obtained by means of the semantic equivalences:
(i) p ⊃ q is equivalent to ¬p ∨ q;
(ii) De Morgan laws: ¬(p ∨ q) is equivalent to (¬p) ∧ (¬q), ¬(p ∧ q) is equivalent to (¬p) ∨ (¬q);
(iii) ¬(¬p) is equivalent to p.
As an example, we obtain the negation normal form for the hypothetical syllogism (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)]: applying (i) to each implication gives ¬(¬p ∨ q) ∨ [¬(¬q ∨ r) ∨ (¬p ∨ r)], and then (ii) and (iii) give the formula (p ∧ ¬q) ∨ [(q ∧ ¬r) ∨ (¬p ∨ r)]. A further application of the distribution law, (p ∧ q) ∨ r ≡_A (p ∨ r) ∧ (q ∨ r), to the bracketed part yields (p ∧ ¬q) ∨ [(q ∨ ¬p ∨ r) ∧ (¬r ∨ ¬p ∨ r)].

Much more important are the conjunctive normal form (CNF) and the disjunctive normal form (DNF).

Definition 2.16 (Literals) A literal is an expression l which is either an atomic proposition p or its negation ¬p. A pair (l, ¬l) forms a pair of contradictory literals.

Definition 2.17 (Clauses) A clause is a formula l1 ∨ l2 ∨ ... ∨ lk where each li is a literal.

Definition 2.18 (Conjunctive normal form (CNF)) CNF is a formula C1 ∧ C2 ∧ ... ∧ Ck, for some k, where each Ci is a clause.

Definition 2.19 (Implicants) An implicant is an expression of the form I : l1 ∧ l2 ∧ ... ∧ lm for some m, where each li is a literal.

Definition 2.20 (Disjunctive normal form (DNF)) DNF is a formula I1 ∨ I2 ∨ ... ∨ In, for some n, where each Ii is an implicant, called in this case a prime implicant.

We may use shortcut symbols: ⋁_{i=k}^{j} means the disjunction of expressions numbered from k to j for k ≤ j, and ⋀_{i=k}^{j} is a conjunction of expressions, so we may write down DNF as ⋁_{i=1}^{k} [⋀_{j=1}^{m_i} l_j^i] and CNF as ⋀_{i=1}^{p} [⋁_{j=1}^{n_i} l_j^i].
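Returning to Definition 2.15, the rewriting by equivalences (i)-(iii) can be sketched as a recursive function. The tuple encoding of formulae (atoms as strings, ('not', f), (op, f, g) with op among 'imp', 'or', 'and') and the name nnf are our assumptions.

```python
def nnf(f):
    """Negation normal form by equivalences (i)-(iii) of Definition 2.15."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == 'imp':                            # (i): p ⊃ q becomes ¬p ∨ q
        return ('or', nnf(('not', f[1])), nnf(f[2]))
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    g = f[1]                                   # here op == 'not'
    if isinstance(g, str):
        return f                               # a negated atom is a literal
    if g[0] == 'not':                          # (iii): ¬¬p becomes p
        return nnf(g[1])
    if g[0] == 'imp':                          # (i) then (ii): ¬(p ⊃ q) becomes p ∧ ¬q
        return ('and', nnf(g[1]), nnf(('not', g[2])))
    dual = 'and' if g[0] == 'or' else 'or'     # (ii): De Morgan laws
    return (dual, nnf(('not', g[1])), nnf(('not', g[2])))

# ¬(p ⊃ (q ∨ ¬r)) is rewritten to p ∧ (¬q ∧ r)
example = nnf(('not', ('imp', 'p', ('or', 'q', ('not', 'r')))))
print(example)
```

The function pushes negations inward and terminates because each recursive call acts on a strictly smaller formula or removes a double negation.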
Let us observe that

Theorem 2.2 A clause is valid if and only if it contains a pair of contradictory literals, and a CNF is valid if and only if all of its clauses are valid. Dually, an implicant is satisfiable if and only if it contains no pair of contradictory literals, and a DNF is satisfiable if and only if at least one of its prime implicants is satisfiable.

From this it follows that CNF and DNF of a formula are dual in the sense: CNF(φ) ≡_A ¬DNF(¬φ).

Theorem 2.3 For each formula φ of SL, there exist semantically equivalent normal forms CNF(φ) and DNF(φ).

Proof The following algorithm produces DNF for any formula φ.

Algorithm (A) which returns DNF of a formula
1 Write the truth table for the given formula φ(p1, p2, ..., pn); if φ is unsatisfiable, write the formula ⋁_{i=1}^{n} (pi ∧ ¬pi) as DNF(φ)
2 For each assignment A with value 1 of φ, form the conjunction of literals: if an atomic proposition p has value 0 under A, then insert into the conjunction the literal ¬p, otherwise insert the literal p
3 Form the disjunction of the conjunctions obtained for all such assignments
4 Return the obtained DNF(φ)

Correctness of the algorithm follows from the fact that, by virtue of the construction of DNF(φ), φ and DNF(φ) are semantically equivalent. As for CNF, we have two possibilities.

Algorithm (B1) which returns CNF of a formula
1 Apply algorithm (A) to the formula ¬φ to obtain DNF(¬φ)
2 Negate the formula DNF(¬φ) and obtain ¬(DNF(¬φ)) which, after application of De Morgan laws, is CNF(φ)

We can also obtain a dual to algorithm (A) for the formula φ.

Algorithm (B2) which returns CNF of a formula
1 Write the truth table for the given formula φ(p1, p2, ..., pn); if φ is valid, then write the formula ⋀_{i=1}^{n} (pi ∨ ¬pi) and return it as CNF(φ)
2 For each assignment A with value 0 for φ, form the disjunction of literals: if a sentential variable p has the value 1 under A, insert into the disjunction the literal ¬p, otherwise insert p
3 Form the conjunction of the obtained disjunctions
4 Return the obtained CNF(φ)
Correctness of the algorithm follows, as φ and obtained CNF(φ) are semantically equivalent. For example, consider Table 2.3 for a formula φ: Following algorithm (A), we produce the equivalent form DNF: (¬ p ∧ ¬q ∧ ¬r ) ∨ ( p ∧ ¬q ∧ ¬r ) ∨ ( p ∧ ¬q ∧ r ). Algorithm (B2) yields the following form CNF: ( p ∨ q ∨ ¬r ) ∧ ( p ∨ ¬q ∨ r ) ∧ ( p ∨ ¬q ∨ ¬r ) ∧ (¬ p ∨ ¬q ∨ r ) ∧ (¬ p ∨ ¬q ∨ ¬r ). This demonstrates a merit of normal forms, they allow us to reconstruct an explicit form of a formula from its truth table.
Table 2.3 Truth table for the formula φ

p  q  r  φ
0  0  0  1
0  0  1  0
0  1  0  0
1  0  0  1
0  1  1  0
1  0  1  1
1  1  0  0
1  1  1  0
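Algorithms (A) and (B2) can be sketched directly from their descriptions. In the Python sketch below a truth table is assumed to be a dictionary from 0/1 tuples to values, and a literal is a pair (name, positive); these representations and the function names are ours.

```python
from itertools import product

def dnf_from_table(names, truth):
    """Algorithm (A): one implicant per assignment with value 1."""
    rows = [v for v in product((0, 1), repeat=len(names)) if truth[v] == 1]
    if not rows:                                   # unsatisfiable formula
        return [[(n, True), (n, False)] for n in names]
    return [[(n, b == 1) for n, b in zip(names, v)] for v in rows]

def cnf_from_table(names, truth):
    """Algorithm (B2): one clause per assignment with value 0."""
    rows = [v for v in product((0, 1), repeat=len(names)) if truth[v] == 0]
    if not rows:                                   # valid formula
        return [[(n, True), (n, False)] for n in names]
    return [[(n, b == 0) for n, b in zip(names, v)] for v in rows]

def show(form, inner, outer):
    """Render a list of lists of literals with the given connectives."""
    def lit(name, positive):
        return name if positive else '¬' + name
    return outer.join('(' + inner.join(lit(n, s) for n, s in part) + ')'
                      for part in form)

names = ('p', 'q', 'r')
table_2_3 = {(0, 0, 0): 1, (0, 0, 1): 0, (0, 1, 0): 0, (1, 0, 0): 1,
             (0, 1, 1): 0, (1, 0, 1): 1, (1, 1, 0): 0, (1, 1, 1): 0}

dnf = dnf_from_table(names, table_2_3)
cnf = cnf_from_table(names, table_2_3)
print(show(dnf, ' ∧ ', ' ∨ '))
print(show(cnf, ' ∨ ', ' ∧ '))
```

Applied to Table 2.3, the sketch yields three implicants and five clauses, the same normal forms as obtained by hand.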
2.5 Sentential Logic as an Axiomatic Deductive System

The Stoic school elaborated themata as a means of reducing expressions to the five indemonstrables, medieval scholastics gave names to valid syllogisms in order to reduce them to those of the first group, and Euclid built a system of geometry as an axiomatic system. This idea of axiom schemes and inference rules was implemented by Frege [4], who, according to Łukasiewicz [5], was the first to clearly separate axiom schemes from rules of inference. Gottlob Frege proposed the following system of axiom schemes:
1 p ⊃ (q ⊃ p).
2 [p ⊃ (q ⊃ r)] ⊃ [(p ⊃ q) ⊃ (p ⊃ r)].
3 [p ⊃ (q ⊃ r)] ⊃ [q ⊃ (p ⊃ r)].
4 (p ⊃ q) ⊃ [(¬q) ⊃ (¬p)].
5 (¬(¬p)) ⊃ p.
6 p ⊃ (¬(¬p)).
It was shown (cf. Łukasiewicz [5]) that scheme 3 follows from schemes 1 and 2. Kleene [6] proposed a set of 13 axiom schemes. David Hilbert proposed a few axiom systems; we quote the system of (Hilbert and Ackermann [7]):
1 (p ∨ p) ⊃ p;
2 p ⊃ (p ∨ q);
3 (p ∨ q) ⊃ (q ∨ p);
4 (p ⊃ q) ⊃ [(p ∨ r) ⊃ (q ∨ r)].
The Hilbert second system (H2) was as follows:
1 p ⊃ (q ⊃ p);
2 (p ⊃ (q ⊃ r)) ⊃ [(p ⊃ q) ⊃ (p ⊃ r)];
3 (¬q ⊃ ¬p) ⊃ [(¬q ⊃ p) ⊃ q].
In Church [8], a system P1 is discussed, among other systems, whose axiom schemes involve falsum ⊥:
1 p ⊃ (q ⊃ p);
2 [p ⊃ (q ⊃ r)] ⊃ [(p ⊃ q) ⊃ (p ⊃ r)];
3 [(p ⊃ ⊥) ⊃ ⊥] ⊃ p.
Two extremes in this area are the single axiom scheme in Meredith [9]:
(((((p ⊃ q) ⊃ (¬r ⊃ ¬s)) ⊃ r) ⊃ u) ⊃ ((u ⊃ p) ⊃ (s ⊃ p)))
(cf. a discussion of organic schemata in (Łukasiewicz and Tarski [10])), and a proposition by Herbrand to accept all valid formulae as axiom schemes. This proposal would eradicate syntactic considerations: rules of inference, provability problems, and a large part of metatheory. We adopt in the sequel the Łukasiewicz system (Łukasiewicz [11]).

Definition 2.21 (The Łukasiewicz axiom schemes)
(L1) (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)] (hypothetical syllogism);
(L2) (¬p ⊃ p) ⊃ p;
(L3) p ⊃ (¬p ⊃ q) (the Duns Scotus formula).

A set of axiom schemes is one ingredient of a deductive system; the other is a set of rules of inference, which are functions acting on sets of formulae and producing new formulae. Some systems, like Hilbert’s or Kleene’s, apply the detachment rule solely, the first Hilbert system admits substitution, others, like Łukasiewicz’s (L1)–(L3), use also the substitution rule along with the replacement rule. We define these rules formally.

Definition 2.22 (Rules of inference)
(MP) the rule of detachment (Modus Ponens): (MP) is exactly Mood 1 of the Stoic logic. It is written now in the form (p, p ⊃ q) / q, i.e., q follows from p and p ⊃ q;
(S) the rule of substitution: in each wff, one may substitute for all equiform occurrences of an atomic proposition a well-formed expression, in such manner that equiform occurrences of that expression coincide with equiform occurrences of the sentential variable;
(R) the rule of replacement allows for substitutions of definiendum p in any definition p = q in place of definiens q in any wff.
For example, we have defined the connective p ∨ q = (¬ p) ⊃ q and, in consequence, we can substitute p ∨ q for (¬ p) ⊃ q in any wff.
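The rules (MP) and (S) of Definition 2.22 are simple operations on the syntax of formulae. The sketch below, in Python with an assumed tuple encoding (atoms as strings, ('not', f), ('imp', f, g)) and function names of our choosing, replays a derivation of p ⊃ p from the schemes (L1)–(L3).

```python
def imp(a, b):
    return ('imp', a, b)

def neg(a):
    return ('not', a)

def substitute(f, var, g):
    """Rule (S): replace every occurrence of the atom var in f by g."""
    if isinstance(f, str):
        return g if f == var else f
    return (f[0],) + tuple(substitute(h, var, g) for h in f[1:])

def detach(premise, implication):
    """Rule (MP): from premise and premise ⊃ q infer q."""
    if implication[0] != 'imp' or implication[1] != premise:
        raise ValueError('detachment does not apply')
    return implication[2]

L1 = imp(imp('p', 'q'), imp(imp('q', 'r'), imp('p', 'r')))
L2 = imp(imp(neg('p'), 'p'), 'p')
L3 = imp('p', imp(neg('p'), 'q'))

step1 = substitute(L1, 'q', imp(neg('p'), 'q'))  # instance of (L1)
step2 = detach(L3, step1)       # ((¬p ⊃ q) ⊃ r) ⊃ (p ⊃ r)
step3 = substitute(step2, 'q', 'p')   # ((¬p ⊃ p) ⊃ r) ⊃ (p ⊃ r)
step4 = substitute(step3, 'r', 'p')   # ((¬p ⊃ p) ⊃ p) ⊃ (p ⊃ p)
theorem = detach(L2, step4)           # p ⊃ p
print(theorem)
```

The function detach raises an error when the first argument is not the antecedent of the second, so a wrongly ordered derivation is rejected rather than silently accepted.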
Let us mention that substitution is present in axiomatic systems, if not as a genuine inference rule then at first stages of proofs when substitutions are made in axiom schemes to produce instances of schemes. The crucial notion for deductive systems is the notion of a proof.

Definition 2.23 (The notion of a proof) A proof of a formula φ from a set AX of axiom schemes is a sequence of formulae ψ1, ψ2, ..., ψn with the properties that (i) ψn is φ; (ii) each ψi for i ≤ n is an instance of an axiom scheme in AX or it is obtained from formulae in the set {ψj : j < i} by means of an inference rule.

Definition 2.24 (Provability) A formula is provable in the given axiomatic system if and only if it has a proof in this system. Provable formulae are called theorems of the system.

Example 2.1 As an example, we recall a proof of the formula p ⊃ p.
1 Begin with (L1): (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)]
2 Substitute (q/¬p ⊃ q) to obtain the instance
3 (L1*) [p ⊃ (¬p ⊃ q)] ⊃ [((¬p ⊃ q) ⊃ r) ⊃ (p ⊃ r)]
4 Apply detachment with (L3) to (L1*) and then substitution (q/p) to obtain
5 [(¬p ⊃ p) ⊃ r] ⊃ (p ⊃ r)
6 Substitute (r/p) in (5) to obtain
7 [(¬p ⊃ p) ⊃ p] ⊃ (p ⊃ p)
8 Apply the detachment rule with (L2) to (7) to obtain
9 p ⊃ p.

Definition 2.25 (The notion of a relative proof) For a set Γ of formulae and a formula φ, a proof of φ from Γ is a sequence ψ1, ψ2, ..., ψn of formulae such that: (i) ψn is φ; (ii) each ψi is either an instance of an axiom scheme, or is in Γ, or follows from some preceding formulae by a rule of inference. The fact that φ is provable from Γ is denoted Γ ⊢ φ. In particular, for a theorem φ the fact of provability is denoted ⊢ φ.

We come to the stage in our exposition in which we will rely on a number of provable formulae. As provability proofs are often quite complex and lengthy, we will point to proofs in Łukasiewicz [11]. One of the most important results about provability is the Herbrand deduction theorem (Herbrand [12]). Independently, a parallel result was obtained by Tarski [13], in which deduction was related to the notion of consequence.

Theorem 2.4 (The deduction theorem) If Γ ∪ {ψ} ⊢ φ, then Γ ⊢ ψ ⊃ φ.

Proof Consider a proof ψ1, ψ2, ..., ψn of φ from Γ ∪ {ψ}, with φ equiform with ψn. The proof goes by induction with respect to i for the statement (∗) Γ ⊢ ψ ⊃ ψi. We consider particular cases.
Case 1. i = 1: ψ1 is (a) an instance of an axiom scheme, (b) an element of Γ, or (c) ψ. In sub-cases (a) and (b), we apply the provable instance ψ1 ⊃ (ψ ⊃ ψ1) (derivation 18 in Łukasiewicz [11]) and the detachment rule yields Γ ⊢ ψ ⊃ ψ1. In sub-case (c), the formula ψ ⊃ ψ1 is ψ ⊃ ψ, an instance of the formula p ⊃ p proved in Example 2.1. This completes the proof of Case 1.
Suppose that (∗) has been proved for i < j.
Case 2. i = j: ψj is as in Case 1 or (d) ψj is obtained by an application of the detachment rule from some ψm and ψk : ψm ⊃ ψj with m, k < j. Hence, by the hypothesis of induction, (e) Γ ⊢ ψ ⊃ ψm and (f) Γ ⊢ ψ ⊃ (ψm ⊃ ψj). From the instance (derivation 35 in [11]) (ψ ⊃ (ψm ⊃ ψj)) ⊃ ((ψ ⊃ ψm) ⊃ (ψ ⊃ ψj)) we obtain, by the detachment rule applied twice, the conclusion Γ ⊢ ψ ⊃ ψj.
Case 3. i = n: we obtain Γ ⊢ ψ ⊃ φ. The proof is concluded.
In case Γ is finite, Γ = {γi : i = 1, 2, ..., n}, Theorem 2.4 takes the following form.

Theorem 2.5 If γ1, γ2, ..., γn ⊢ φ, then γ1, γ2, ..., γn−1 ⊢ γn ⊃ φ.

We introduce a symbol Γ_{i=1}^{k} γi p in order to denote a sequence of implications γ1 ⊃ (γ2 ⊃ (... ⊃ (γk ⊃ p))). Then, by iteration of Theorem 2.5, we obtain the following form of the deduction theorem.

Theorem 2.6 If γ1, γ2, ..., γn ⊢ φ, then ⊢ Γ_{i=1}^{n} γi φ.

We state a lemma from Church [8].

Lemma 2.1 If γ1, γ2, ..., γn ⊢ φ and δ1, δ2, ..., δm are formulae such that among them is each of γi, then δ1, δ2, ..., δm ⊢ φ.

Lemma 2.2 Formula p ⊃ (q ⊃ p ∧ q) is provable (derivation 91 in Łukasiewicz [11]).
Using Lemma 2.1 and Definition 2.21 along with detachment, we can obtain the deduction theorem in the following form.

Theorem 2.7 If γ1, γ2, ..., γn ⊢ φ, then ⊢ ⋀_{i=1}^{n} γi ⊃ φ.

Now, we allow ourselves a bit of digression.

Definition 2.26 (The Polish notation) This notation, introduced by Jan Łukasiewicz around 1922 and called also the prefix notation, writes down formulae by prefixing arguments with operators, e.g., the formula p ⊃ q is written as Cpq (C was the symbol applied for implication); similarly, negation was written as Np (here, of course, the prefix notation and standard notation coincide except for the symbol). For example, the axiom scheme (L1) would be written as CCpqCCqrCpr (notice that the symbol C always acts on two succeeding groups of symbols or single symbols). The Polish notation and its dual, the Reverse Polish notation, prompted the introduction of the stack data structure.

Definition 2.27 (Logical matrices) Invented independently by a few authors beginning with C. S. Peirce, among them Bernays [14] and Łukasiewicz (some authors point also at Post and Wittgenstein), logical matrices have been used to ascertain the independence of axiom schemes in the sense that no instance of a scheme can be derived from other axiom schemes by rules of inference. We explain the usage of those matrices with an example in Łukasiewicz [11]. Consider a matrix which is a specially designed truth table for C, that is, ⊃, and N, that is, ¬ (Table 2.4). We may look at this matrix as a multiplication table, e.g., C12 = 0, C20 = 1. If we compute the value of (L1) for p = 2, q = 0, r = 2, we obtain the result CC20CC02C22 = 0, while for (L2) and (L3), any substitution of values yields the result 1. It remains to notice that the result of computations is invariant under rules of inference, hence, axiom scheme (L1) is independent of (L2)+(L3).
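The independence argument of Definition 2.27 can be replayed by evaluating formulae written in the Polish notation over the three-valued matrix. In the Python sketch below, the entries of the matrix are our reading of Table 2.4, chosen to agree with the values C12 = 0, C20 = 1 and CC20CC02C22 = 0 quoted in the text; the function names are ours.

```python
from itertools import product

# Assumed matrix: C[x][y] is the value of Cxy, N[x] is the value of Nx.
C = {0: {0: 1, 1: 1, 2: 1},
     1: {0: 0, 1: 1, 2: 0},
     2: {0: 1, 1: 1, 2: 0}}
N = {0: 1, 1: 0, 2: 2}

def evaluate(word, env):
    """Evaluate a formula in Polish notation under the valuation env."""
    def parse(i):
        symbol = word[i]
        if symbol == 'C':                 # C acts on two succeeding groups
            a, i = parse(i + 1)
            b, i = parse(i)
            return C[a][b], i
        if symbol == 'N':
            a, i = parse(i + 1)
            return N[a], i
        return env[symbol], i + 1         # a sentential variable
    result, end = parse(0)
    if end != len(word):
        raise ValueError('not a well-formed word')
    return result

def all_values(word):
    """Set of values taken by the formula over all valuations in {0, 1, 2}."""
    names = sorted(set(word) - {'C', 'N'})
    return {evaluate(word, dict(zip(names, vals)))
            for vals in product((0, 1, 2), repeat=len(names))}

L1, L2, L3 = 'CCpqCCqrCpr', 'CCNppp', 'CpCNpq'
# Detachment preserves the value 1: if x = 1 and Cxy = 1, then y = 1.
mp_preserves = all(y == 1 for x in C for y in C[x] if x == 1 and C[x][y] == 1)

print(evaluate(L1, {'p': 2, 'q': 0, 'r': 2}), all_values(L2), all_values(L3))
```

Under this reading of the matrix, (L2) and (L3) take only the value 1, detachment preserves the value 1, and (L1) takes the value 0 at p = 2, q = 0, r = 2.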
Table 2.4 C, N truth matrix

C  0  1  2  N
0  1  1  1  1
1  0  1  0  0
2  1  1  0  2

Definition 2.28 (The Lindenbaum-Tarski algebra) By the deduction theorem, the provable formula p ⊃ p, and the hypothetical syllogism (L1), p ⊃ q, q ⊃ r ⊢ p ⊃ r, it follows that the relation ∼, defined by p ∼ q if and only if p ⊃ q and q ⊃ p are provable, is an equivalence relation on the theory T of wffs. The formula p ⊃ q, if provable, defines the ordering p ≤ q. The quotient set T/∼ is the carrier of the Lindenbaum-Tarski algebra (called also the Lindenbaum algebra (unpublished by Lindenbaum)). The ordering ≤ factorizes through ∼: [φ]∼ ≤∼ [ψ]∼ if and only if φ ≤ ψ. Please notice that while the relation ≤ is only reflexive, by provability of p ⊃ p, and transitive, by (L1), the factorization by ∼ yields antisymmetry, i.e., ≤∼ is a partial ordering. The Lindenbaum-Tarski construction leads to a Boolean algebra in the case of SL.

Theorem 2.8 The Lindenbaum-Tarski algebra of sentential logic is a Boolean algebra. The unit of this algebra is the class of all provable formulae, the zero is the class of all unsatisfiable formulae. We have the following relations between operations of logic and lattice operations:
(i) [φ]∼ ∪ [ψ]∼ = [φ ∨ ψ]∼;
(ii) [φ]∼ ∩ [ψ]∼ = [φ ∧ ψ]∼;
(iii) −[φ]∼ = [¬φ]∼.

Proof First, we verify the lattice structure: that [φ ∨ ψ]∼ is the join [φ]∼ ∪ [ψ]∼ follows from provability of formulae:
(1) p ⊃ p ∨ q;
(2) q ⊃ p ∨ q;
(3) (p ⊃ r) ⊃ ((q ⊃ r) ⊃ ((p ∨ q) ⊃ r)).
Similarly, [φ]∼ ∩ [ψ]∼ is [φ ∧ ψ]∼ due to provability of formulae:
(4) p ∧ q ⊃ p;
(5) p ∧ q ⊃ q;
(6) (p ⊃ q) ⊃ ((p ⊃ r) ⊃ (p ⊃ q ∧ r)).
We introduce into the lattice T/∼ the relative pseudo-complementation ⇒, defined as [φ]∼ ⇒ [ψ]∼ is [φ ⊃ ψ]∼, due to provability of formulae:
(7) (p ⊃ (q ⊃ r)) ⊃ ((p ∧ q) ⊃ r);
(8) ((p ∧ q) ⊃ r) ⊃ (p ⊃ (q ⊃ r)).
The existence of the complement − is secured by the provable formulae:
(9) p ∧ ¬p ⊃ q;
(10) (p ⊃ (p ∧ ¬p)) ⊃ ¬p.
By (9), [p ∧ ¬p]∼ is the element zero 0. By (10), −[φ]∼ = [φ]∼ ⇒ [φ ∧ ¬φ]∼ = [φ ⊃ (φ ∧ ¬φ)]∼ = [¬φ]∼. Finally, the unit element 1 is provided by the provable formula p ∨ ¬p, which implies [φ]∼ ∪ [¬φ]∼ = 1. Provability of formulae (1)–(10) may be asserted by syntactic means, cf. e.g., Łukasiewicz [11], or by semantic means via the completeness property which will be proved in the sequel.
Please consult (Rasiowa and Sikorski [15]) for more detailed analysis.
2.6 Natural Deduction

In verifying validity of a formula we often apply an ‘informal’ approach, e.g., for the hypothetical syllogism φ : (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)], we could argue as follows: φ may be false only if the truth value of the antecedent p ⊃ q is 1, i.e., when the truth value of p is 0, or the truth value of p is 1 and the truth value of q is 1. In the first case, the truth value of p ⊃ r is 1, which makes the consequent true and φ valid. In the second case, the truth value of the consequent is decided by the truth value of r; but regardless of whether r is valued 0 or 1, the consequent is true and φ is valid. Such informal ways of estimating validity prompted Jan Łukasiewicz to pose at the seminar in Warsaw University in 1926 the problem of formalization of ‘natural deduction’. Solutions were proposed by Jaśkowski [16, 17] and Gentzen [18]. We begin with the Gentzen sequent calculus for sentential logic.

Definition 2.29 (Sequents) A sequent is a formula of the form Γ ⇒ Δ, where Γ = {φ1, φ2, ..., φn} and Δ = {ψ1, ψ2, ..., ψm} are sets of formulae. As with implications in SL, Γ is the antecedent and Δ is the consequent of the sequent.

Definition 2.30 (Validity notion for sequents) Each sequent of the generic form Γ ⇒ Δ as in Definition 2.29 has the truth value equal to the truth value of the implication ⋀_{i=1}^{n} φi ⊃ ⋁_{j=1}^{m} ψj, due to Gentzen [18]. Hence, the sequent Γ ⇒ Δ is valid if each formula φi is valid and some formula ψj is valid, in case sets Γ and Δ are non-empty. Limiting cases are: if Γ = ∅, then ⋀_{i=1}^{n} φi has truth value 1 and for the sequent to be valid there should exist at least one valid formula ψj; if there is no such ψj, then the sequent is of the form ∅ ⇒ ∅, hence it is invalid. If Δ = ∅, then the sequent has truth value 1 only if there exists a formula φi with truth value 0.

From Definition 2.30, the criterion of validity for sequents follows.

Theorem 2.9 A sequent Γ ⇒ Δ is valid if and only if either there exists an invalid formula in Γ or there exists a valid formula in Δ.

There are many variants of the Gentzen sequent calculus: the original Gentzen calculus LK, the variant K of Ketonen (see Indrzejczak [19]), the systems G, G∗, G1, G2 (see Smullyan [20]), diagrams of (Rasiowa and Sikorski [15]), among others. We will discuss the system K.

Definition 2.31 (The sequent system K) The system K consists of the following rules, each written in the form premises / conclusion, with two premises separated by a semicolon:
(left ¬) Γ ⇒ Δ, φ / ¬φ, Γ ⇒ Δ
(right ¬) φ, Γ ⇒ Δ / Γ ⇒ Δ, ¬φ
(left ∧) φ, ψ, Γ ⇒ Δ / φ ∧ ψ, Γ ⇒ Δ
(right ∧) Γ ⇒ Δ, φ ; Γ ⇒ Δ, ψ / Γ ⇒ Δ, φ ∧ ψ
(left ∨) Γ, φ ⇒ Δ ; Γ, ψ ⇒ Δ / Γ, φ ∨ ψ ⇒ Δ
(right ∨) Γ ⇒ Δ, φ, ψ / Γ ⇒ Δ, φ ∨ ψ
(left ⊃) Γ ⇒ Δ, φ ; Γ, ψ ⇒ Δ / Γ, φ ⊃ ψ ⇒ Δ
(right ⊃) Γ, φ ⇒ Δ, ψ / Γ ⇒ Δ, φ ⊃ ψ

Actually, the system K is the system G0 in Smullyan [20]. An axiom is any sequent of the form Γ, φ ⇒ φ, Δ. Obviously, every instance of the axiom schema is a valid sequent. Let us observe that as Γ, Δ are sets, the order of listing their elements is irrelevant. Terminology associated with sequent figures is as follows: the formula obtained as a result of introducing a connective is principal, terms which have constituted the principal formula are side formulae, other formulae form the context for the figure (they are also called parametric formulae).

Theorem 2.10 Each rule of the sequent system K is valid in the sense that validity of the antecedent implies validity of the consequent.

Proof We give a pattern for the proof; all other rules are checked along similar lines. For (left ¬): suppose the sequent Γ ⇒ Δ, φ is valid. Then, either a formula in Γ is invalid, in which case the consequent ¬φ, Γ ⇒ Δ is valid, or a formula in the set Δ ∪ {φ} is valid, in which case either φ is valid, and then ¬φ is invalid, which makes the consequent valid, or a formula in Δ is valid, which again makes the consequent valid. This pattern holds as well for rules with two antecedents.

A proof of a formula φ is a proof of the sequent ∅ ⇒ φ. Proofs in sequent calculus are carried from axioms to the formula to be proved. Let us observe that rules of sequent calculus are invertible: for the rule S / S∗, the rule S∗ / S is valid; in case of the rule S1 ; S2 / S∗, rules S∗ / S1 and S∗ / S2 are valid.

Example 2.2 We prove the contraposition (p ⊃ q) ⊃ [(¬q) ⊃ ¬p]. The proof may go like below.
1 p ⇒ q, p and p, q ⇒ q (axioms)
2 p, p ⊃ q ⇒ q by applying (left ⊃) to 1
3 p ⊃ q ⇒ q, ¬p by applying (right ¬) to 2
4 p ⊃ q, ¬q ⇒ ¬p by applying (left ¬) to 3
5 p ⊃ q ⇒ ¬q ⊃ ¬p by applying (right ⊃) to 4
6 ⇒ (p ⊃ q) ⊃ [(¬q) ⊃ ¬p] by applying (right ⊃) to 5.
We add a proof of the formula (p ⊃ q) ⊃ ((¬p) ∨ q).
1 p ⇒ q, p and p, q ⇒ q (axioms)
2 p, p ⊃ q ⇒ q by applying (left ⊃) to 1
3 p ⊃ q ⇒ ¬p, q by applying (right ¬) to 2
4 p ⊃ q ⇒ (¬p ∨ q) by applying (right ∨) to 3
5 ⇒ (p ⊃ q) ⊃ [(¬p) ∨ q] by applying (right ⊃) to 4.
These examples introduce us to the notion of a theorem of the sequent calculus.
Definition 2.32 (Sequent provability) A formula φ is a theorem of the sequent calculus if and only if the sequent ⇒ φ has a sequent proof. That sequent calculus is sound follows from validity of the sequent rules and validity of the axioms. Completeness of sequent calculus will be proved in Sect. 2.8. We also include a variant of the Gentzen system due to Rasiowa and Sikorski [15], in which the implication p ⊃ q is replaced by its equivalent (¬p) ∨ q.
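Since the rules of K are invertible, a sequent can be tested for provability by reading the rules backwards until only atomic propositions remain. The following is a minimal sketch of such a search, not the book's own procedure; it assumes the standard Gentzen-style left and right rules for ¬, ∧, ∨, ⊃, and the tuple encoding of formulae and the name `provable` are illustrative choices.

```python
# Formulae: an atom is a string; compound formulae are tuples
# ('not', a), ('and', a, b), ('or', a, b), ('imp', a, b).
# Assumption: standard Gentzen-style left/right rules for each connective.

def provable(gamma, delta):
    """Backward proof search for the sequent gamma => delta."""
    gamma, delta = frozenset(gamma), frozenset(delta)
    if gamma & delta:                       # axiom: Gamma, phi => phi, Delta
        return True
    for f in gamma:                         # left rules, read bottom-up
        if isinstance(f, tuple):
            g, op = gamma - {f}, f[0]
            if op == 'not':
                return provable(g, delta | {f[1]})
            if op == 'and':
                return provable(g | {f[1], f[2]}, delta)
            if op == 'or':
                return (provable(g | {f[1]}, delta)
                        and provable(g | {f[2]}, delta))
            if op == 'imp':
                return (provable(g, delta | {f[1]})
                        and provable(g | {f[2]}, delta))
    for f in delta:                         # right rules, read bottom-up
        if isinstance(f, tuple):
            d, op = delta - {f}, f[0]
            if op == 'not':
                return provable(gamma | {f[1]}, d)
            if op == 'and':
                return (provable(gamma, d | {f[1]})
                        and provable(gamma, d | {f[2]}))
            if op == 'or':
                return provable(gamma, d | {f[1], f[2]})
            if op == 'imp':
                return provable(gamma | {f[1]}, d | {f[2]})
    return False                            # only atoms left and no axiom


contraposition = ('imp', ('imp', 'p', 'q'),
                  ('imp', ('not', 'q'), ('not', 'p')))
print(provable([], [contraposition]))
```

Because every rule is invertible, it does not matter which compound formula is decomposed first, so the sketch simply takes the first one it meets; each step removes one connective, which guarantees termination.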
2.7 Natural Deduction: Decomposition Diagrams
By the symbol Γ we denote finite sequences of formulae. A sequence is final if it consists solely of literals and it is valid if it contains a pair of contradictory literals.

Definition 2.33 (Decomposition rules) Each of these rules decomposes a sequence either into a conjunction of two sequences (we call such rules rules of type α) or into a disjunction of two sequences (rules of type β). We denote disjunction by a comma ',' and conjunction by a semicolon ';'. Each rule is written in the form: sequence to be decomposed / result of decomposition. The rules are:
(∨) Γ1, φ ∨ ψ, Γ2 / Γ1, φ, ψ, Γ2
(¬∨) Γ1, ¬(φ ∨ ψ), Γ2 / Γ1, ¬φ, Γ2 ; Γ1, ¬ψ, Γ2
(∧) Γ1, φ ∧ ψ, Γ2 / Γ1, φ, Γ2 ; Γ1, ψ, Γ2
(¬∧) Γ1, ¬(φ ∧ ψ), Γ2 / Γ1, ¬φ, ¬ψ, Γ2
(→) Γ1, φ → ψ, Γ2 / Γ1, ¬φ, ψ, Γ2
(¬→) Γ1, ¬(φ → ψ), Γ2 / Γ1, φ, Γ2 ; Γ1, ¬ψ, Γ2
(¬) Γ1, ¬¬φ, Γ2 / Γ1, φ, Γ2.
In the above schemes, the sequence Γ1 is always an indecomposable sequence, if any. A sequence is a leaf sequence if it is either indecomposable or valid. In Fig. 2.2, the decomposition of the formula φ : (p ⊃ q) ⊃ (¬p ∨ q) is presented. We obtain disjunctions of literals in the leaves of the tree, i.e., clauses. As we obtain over the leaves a conjunction of clauses (a CNF form of the formula), we obtain
Fig. 2.2 Diagram of a formula
Theorem 2.11 A formula φ is valid if and only if the tree of decomposition is finite and all leaf sequences (leaf clauses) in the decomposition of φ are valid.

Proof It is easy to grasp that, as decomposition rules are defined by valid equivalences, the formula φ is equivalent to the conjunction of end clauses. Hence, validity of φ implies validity of all leaf clauses and an upper bound on the height of the tree: suppose not, and let the tree contain branches of any height. Then, by the König Theorem 1.15, there exists an infinite branch B. Branch B cannot contain any pair of contradictory literals because in that case, after the appearance of a complementary literal making a contradictory pair, the branch would end by the validity condition for a leaf sequence. It follows that the branch B contains, for each atomic proposition, either only non-negated or only negated occurrences. But then B would be not valid and the formula φ would be not valid, a contradiction which proves the finiteness of the tree. The converse is obvious.

Corollary 2.1 The set of valid formulae of SL is the least set containing all valid clauses and closed under the decomposition rules.

Theorem 2.12 (a form of completeness theorem for SL) Each valid clause is provable. If a formula φ is valid, then its semantically equivalent CNF is provable.

Proof The formula p ∨ ¬p is provable (derivation 9 in [11]). Consider the clause C : p ∨ ¬p ∨ Q, where Q is the sub-clause ⋁_i q_i. The formula p ⊃ p ∨ q is provable (derivation 73 in [11]) under substitution (p/p ∨ p), which is equivalent to the formula ¬p ∨ p ∨ q. Substituting for q in the latter formula the sub-clause Q, we obtain that the clause C is provable. Suppose that CNF(φ) is a conjunction of k clauses, from C1 to Ck. On the strength of the provable formula p ⊃ (q ⊃ p ∧ q) (derivation 91 in [11]), with the clause C1 substituted for p and C2 substituted for q, we obtain by applying detachment twice that the conjunction C1 ∧ C2 is provable.
This argument repeated with C1 ∧ C2 for p and C3 for q yields that the conjunction C1 ∧ C2 ∧ C3 is provable; by continuing this argument to Ck, we prove that CNF(φ) is provable.

Yet another offspring of the Gentzen sequent calculus is the method of tableaux (Beth [21], Smullyan [20]).
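The decomposition of a formula into leaf clauses, and the validity test of Theorem 2.11, can be sketched in a few lines. This is an illustrative sketch only: the tuple encoding of formulae and the names `decompose`, `clause_valid` and `valid` are assumptions made here, and, unlike the diagrams in the text, the sketch decomposes every sequence down to literals instead of stopping early at a valid sequence.

```python
# Formulae: an atom is a string; compound formulae are tuples
# ('not', a), ('and', a, b), ('or', a, b), ('imp', a, b).

def is_literal(f):
    return isinstance(f, str) or (f[0] == 'not' and isinstance(f[1], str))

def decompose(seq):
    """Return the leaf sequences (clauses) obtained from the sequence seq."""
    for i, f in enumerate(seq):
        if is_literal(f):
            continue
        g1, g2 = seq[:i], seq[i + 1:]       # g1 is indecomposable here
        if f[0] == 'or':                    # rule (or)
            return decompose(g1 + [f[1], f[2]] + g2)
        if f[0] == 'imp':                   # rule (->)
            return decompose(g1 + [('not', f[1]), f[2]] + g2)
        if f[0] == 'and':                   # rule (and): two sequences
            return decompose(g1 + [f[1]] + g2) + decompose(g1 + [f[2]] + g2)
        h = f[1]                            # f is the negation of a compound h
        if h[0] == 'not':                   # double negation
            return decompose(g1 + [h[1]] + g2)
        if h[0] == 'and':
            return decompose(g1 + [('not', h[1]), ('not', h[2])] + g2)
        if h[0] == 'or':
            return (decompose(g1 + [('not', h[1])] + g2)
                    + decompose(g1 + [('not', h[2])] + g2))
        if h[0] == 'imp':
            return (decompose(g1 + [h[1]] + g2)
                    + decompose(g1 + [('not', h[2])] + g2))
    return [seq]                            # only literals: a leaf clause

def clause_valid(clause):
    """A clause is valid if it contains a pair of contradictory literals."""
    return any(('not', l) in clause for l in clause if isinstance(l, str))

def valid(phi):
    return all(clause_valid(c) for c in decompose([phi]))


phi = ('imp', ('imp', 'p', 'q'), ('or', ('not', 'p'), 'q'))
for clause in decompose([phi]):
    print(clause)
```

For the formula (p ⊃ q) ⊃ (¬p ∨ q) the sketch yields two leaf clauses, each containing a pair of contradictory literals, so the formula is reported valid.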
2.8 Tableaux
Tableaux realize, in a similar but distinct form, the idea of natural deduction of Jaśkowski-Gentzen. A tableau is a tree with the given formula φ at the root. In the subsequent steps, the formula is decomposed in a manner of Definition 2.9. The difference is that formulae are signed, i.e., prefixed with the symbol T or F, meaning valid or invalid. Depending on the sign, decomposition takes distinct forms. We continue with our convention of denoting disjunction with ',' and conjunction with ';'. Tableaux we discuss are called analytic in Smullyan [20] (this name is given to tableaux also in Fitting [22], in distinction to semantic tableaux in Beth [21]). Analyticity of tableaux means that no external formulae are allowed in the decomposition process, contrary to semantic tableaux of Beth in which such intervention is allowed.

Definition 2.34 (Tableau rules) The schemes for decomposition of signed formulae are defined in accordance with truth tables for connectives and with valid equivalences; each scheme is written in the form: signed formula / its signed components. They are as follows:
(1) T(p ∧ q) / T p; T q    F(p ∨ q) / F p; F q    F(p ⊃ q) / T p; F q
(2) T(p ∨ q) / T p, T q    F(p ∧ q) / F p, F q    T(p ⊃ q) / F p, T q
(3) T(¬p) / F p    F(¬p) / T p
The explanation for the schemes is as follows: (1) collects decomposition rules in which components are related by conjunction (rules of type α); (2) collects decomposition rules in which components are related by disjunction (rules of type β); (3) concerns negation. The graphic presentation of rules (1)–(3) is shown in Fig. 2.3.

Tableaux are built recursively from the root. Suppose the process of building the tableau has reached a node N. We consider the path π(N) from the root to N. Then we can extend the path π(N) if on the path π(N) before N there is a formula to be decomposed; if the formula is of type (α), with α1 and α2 as immediate sub-formulae, then we extend the path π(N) by adding α1 and then α2 to the path π(N), so we obtain the path π(N) - α1 - α2. In case the formula is of type (β), with immediate sub-formulae β1, β2 forming a disjunction, the path π(N) splits into two paths: on one the successor to N is β1, on the other the successor to N is β2. This, clearly, realizes the distributive law. Like in the diagrammatic method, we obtain a branching tree. A branch is closed (marked X) when one finds on it a formula along with its negation, so the conjunction of formulae on the branch is unsatisfiable. Otherwise, the branch is open.
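The construction just described can be sketched as a short recursive procedure that expands signed formulae and collects the open branches. This is a minimal sketch under assumptions made here: signs are encoded as booleans (True for T, False for F), formulae are nested tuples, the name `open_branches` is hypothetical, and a branch is closed as soon as an atomic proposition occurs with both signs, which suffices for sentential logic.

```python
# Signed formulae are pairs (sign, formula) with sign True for T, False for F.
# Formulae: atoms are strings; ('not', a), ('and', a, b), ('or', a, b),
# ('imp', a, b) are compound formulae.

def open_branches(todo, lits=None):
    """Expand the signed formulae in todo; return valuations of open branches."""
    lits = dict(lits or {})                 # signed atoms on the current path
    todo = list(todo)
    while todo:
        sign, f = todo.pop()
        if isinstance(f, str):
            if lits.get(f, sign) != sign:   # T p and F p on one branch
                return []                   # the branch is closed
            lits[f] = sign
            continue
        if f[0] == 'not':                   # scheme (3)
            todo.append((not sign, f[1]))
            continue
        a, b = f[1], f[2]
        if f[0] == 'and':
            parts, alpha = [(sign, a), (sign, b)], sign
        elif f[0] == 'or':
            parts, alpha = [(sign, a), (sign, b)], not sign
        else:                               # 'imp'
            parts, alpha = [(not sign, a), (sign, b)], not sign
        if alpha:                           # scheme (1): extend the path
            todo.extend(parts)
        else:                               # scheme (2): the path splits
            return (open_branches(todo + [parts[0]], lits)
                    + open_branches(todo + [parts[1]], lits))
    return [lits]                           # nothing left: an open branch


# F[((p and q) imp r) imp ((p or q) imp r)]
psi = ('imp', ('imp', ('and', 'p', 'q'), 'r'), ('imp', ('or', 'p', 'q'), 'r'))
for valuation in open_branches([(False, psi)]):
    print(valuation)
```

Starting the sketch with a formula signed F returns the empty list exactly when every branch closes; otherwise each returned dictionary is a valuation falsifying the formula.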
Fig. 2.3 Forms of decomposition
The formula φ is unsatisfiable when all branches in its tableau are closed; when there exists an open branch in the tableau, then the formula is satisfiable. These facts are exploited when the signed formula at the root of a tableau is Fφ. The closed tableau witnesses unsatisfiability of Fφ, hence, validity of φ; on the contrary, the existence of an open branch is a witness to satisfiability of ¬φ.

Example 2.3 In Fig. 2.4, the tableau is shown for the signed formula Fφ : F[(p ⊃ (q ⊃ r)) ⊃ (q ⊃ (p ⊃ r))]. All three branches are closed, hence, the assumption of falsity of φ has led to contradiction, hence, φ is valid. On the contrary, in the tableau in Fig. 2.5 for the signed formula F[((p ∧ q) ⊃ r) ⊃ ((p ∨ q) ⊃ r)], we see two branches closed, marked X, and two branches open, which point to valuations falsifying the formula φ: V1 : p = 0, q = 1, r = 0 and V2 : p = 1, q = 0, r = 0. We sum up these facts.

Definition 2.35 (Hintikka sets) A Hintikka set (Hintikka [23], cf. Smullyan [20]) is a set Δ of formulae such that
(i) neither ⊥ nor any pair F p, T p for any atomic proposition p are in Δ;
(ii) for each formula φ ∧ ξ ∈ Δ, φ, ξ ∈ Δ;
(iii) for each formula φ ∨ ξ ∈ Δ, either φ ∈ Δ or ξ ∈ Δ.
We recall that a set of formulae is satisfiable if there exists an assignment which satisfies each formula in the set.

Theorem 2.13 Each Hintikka set of formulae is satisfiable.
Fig. 2.4 A closed tableau
Proof To satisfy atomic propositions, we let A(p) = 0 if F p ∈ Δ and A(p) = 1 if T p ∈ Δ. If neither F p nor T p occurs in Δ, then we assign arbitrarily either value 1 or value 0. The rest goes by structural induction. By condition (ii), for φ ∧ ξ ∈ Δ both φ and ξ are in Δ, hence they are satisfied by the hypothesis of induction, so φ ∧ ξ is satisfied; by condition (iii), for φ ∨ ξ ∈ Δ either φ or ξ is in Δ and is satisfied, so φ ∨ ξ is satisfied.

Corollary 2.2 Each open branch of a tableau is a Hintikka set, hence, it is satisfiable. In particular, an infinite branch of a tableau is satisfiable.

Definition 2.36 (Tableau provability) A formula φ is tableau provable if and only if each tableau for Fφ is closed.

Theorem 2.14 (On tableau completeness) If a formula φ is valid, then it has a tableau proof.

Proof Suppose that φ is not tableau provable, hence, Fφ has a tableau with an open branch which is satisfiable, and thus, φ is not valid.

It follows from the definition of validity of a sequent in Definition 2.30 that validity of a sequent Γ ⇒ Δ, where Γ = {γi : i ≤ n} and Δ = {δj : j ≤ m}, is equivalent
Fig. 2.5 An open tableau
to closedness of the tableau for the set of signed formulae Tγ1, Tγ2, ..., Tγn, Fδ1, Fδ2, ..., Fδm. Tableau completeness implies the following result:

Theorem 2.15 (Completeness of the sequent calculus) If a formula φ is valid, then the sequent ⇒ φ is provable.

Proof Suppose that the sequent ⇒ φ is not provable, hence, a tableau for the signed formula Fφ is not closed, hence, there is an open branch, which is satisfiable, and thus, φ is not valid.
2.9 Meta-Theory of Sentential Logic. Part I
We have already entered into this realm by proving the deduction theorem. Now, we address meta-properties of sentential logic. We apply analytic tableaux to this end. In the first place, we collect the basic properties of the relation Γ ⊢ φ. We recall that Σ_SL is the set of theorems of SL.
Theorem 2.16 The following are properties of the relation Γ ⊢ φ:
(i) Σ_SL ⊢ φ for each φ ∈ Σ_SL;
(ii) if φ ∈ Γ, then Γ ⊢ φ;
(iii) if Γ ⊆ Γ′ and Γ ⊢ φ, then Γ′ ⊢ φ;
(iv) if Γ ⊢ φ, then Γ′ ⊢ φ for a finite subset Γ′ of Γ;
(v) if Γ ∪ {φ} ⊢ ψ, then Γ ⊢ φ ⊃ ψ;
(vi) if Γ ⊢ φ and φ ⊃ ψ ∈ Σ_SL, then Γ ⊢ ψ.
Proof For (i): it follows by the definition of Σ_SL and validity preservation by detachment (MP), substitution (S), and replacement (R); For (ii): it follows as p ⊃ p is a theorem in Σ_SL (cf. Example 2.1); For (iii): it is obvious as each formula in Γ is in Γ′; For (iv): if Γ ⊢ φ, then there exists a set {ψ1, ψ2, ..., ψn} ⊆ Γ such that the formula ⋀_i ψi ⊃ φ is in Σ_SL; clearly, Γ′ = {ψ1, ψ2, ..., ψn} is as desired; For (v): it is the deduction theorem; For (vi): by adding the implication φ ⊃ ψ to a proof of φ, we obtain a proof of ψ from Γ.

Property (iv) is a compactness property. The consistency property means essentially that a set of formulae cannot prove all wffs; in particular, it cannot prove falsity.

Definition 2.37 (Consistency) A set Γ of formulae is consistent if and only if there is no formula φ such that both φ and ¬φ are provable from Γ. Equivalently, a set Γ is consistent if and only if it is not true that Γ ⊢ ⊥. Indeed, if Γ ⊢ φ and Γ ⊢ ¬φ for some formula φ, then the provable formula ¬φ ⊃ (φ ⊃ ⊥) (derivation 36 in [11]) implies that Γ ⊢ ⊥. The converse follows on similar lines.

Theorem 2.17 Consistency has the following basic properties.
(i) a set Γ is inconsistent if and only if Γ ⊢ φ for each φ;
(ii) a set Γ is consistent if and only if there is a formula φ not provable from Γ;
(iii) a set Γ is consistent if and only if each finite subset of Γ is consistent;
(iv) if Γ ⊢ φ, then the set Γ ∪ {¬φ} is inconsistent;
(v) if Γ ∪ {φ} is consistent, then ¬φ has no proof from Γ;
(vi) the set Σ_SL is consistent;
(vii) if Γ ∪ {¬φ} is inconsistent, then Γ ⊢ φ.
Proof For (i): From left to right. Inconsistency of Γ means Γ ⊢ φ and Γ ⊢ ¬φ for some φ. From the provable formula ¬φ ⊃ (φ ⊃ ψ) (derivation 36 in [11]) it follows by detachment applied twice that Γ ⊢ ψ for each formula ψ. The converse is obvious; (ii) is the negation of (i).
Property (iii) follows by the compactness property; for property (iv): suppose that Γ ⊢ φ; (L3) states that φ ⊃ (¬φ ⊃ q) is provable and the substitution q/⊥ yields the provable formula φ ⊃ (¬φ ⊃ ⊥). By detachment, we obtain that Γ ⊢ ¬φ ⊃ ⊥. If a sequence σ of formulae is a proof of ¬φ ⊃ ⊥ from Γ, then σ, ¬φ, ⊥ is a proof of ⊥ from Γ ∪ {¬φ}, hence, the latter set is inconsistent; property (v) is a transposition of (iv); for property (vi): as Σ_SL consists of provable formulae which are valid by validity-preserving properties of inference rules, no falsity can be inferred from Σ_SL; for property (vii): if Γ ∪ {¬φ} is inconsistent, then Γ ∪ {¬φ} ⊢ ⊥ and, by the deduction theorem, Γ ⊢ ¬φ ⊃ ⊥. From (¬φ ⊃ ⊥) ⊃ φ in Σ_SL, we infer that Γ ⊢ φ.

Consistency of a set Γ will be denoted by the symbol Con(Γ).

Definition 2.38 (Maximal consistency) A set Γ is maximal consistent if and only if Γ is consistent and there does not exist a consistent proper superset Γ∗ of Γ. Maximality of a consistent set Γ will be denoted by the symbol MaxCon(Γ).

Maximality has important consequences and therefore it is important to know that a maximal extension exists for each consistent set Γ. While the finite character of consistency (Theorem 2.17(iii)) allows for the application of the Teichmüller-Tukey lemma (Theorem 1.14(i)), we include, also for historic reasons, another argument known as the Lindenbaum Lemma.

Theorem 2.18 (The Lindenbaum Lemma) Each consistent set Γ of formulae has a maximal consistent extension Γ∗.

Proof As each formula in SL uses only finitely many atomic propositions, and their set is countable, the set of formulae is at most of the cardinality of the set of finite sequences over a countable set, hence, it is countable and we can arrange all formulae into a, possibly infinite, sequence φ0, φ1, ..., φn, .... We define a sequence Γ0, Γ1, Γ2, ..., Γn, ... of sets of formulae by letting
(i) Γ0 = Γ;
(ii) Γn+1 = Γn ∪ {φn} if Γn ∪ {φn} is consistent;
(iii) Γn+1 = Γn, otherwise.
We let Γ∗ = ⋃{Γn : n ≥ 0} and we claim that
(iv) Γ ⊆ Γ∗;
(v) Γ∗ is consistent;
(vi) Γ∗ is maximal consistent.
We easily prove by induction that each Γn is consistent. Then for (iv): it follows by (i) and the definition of Γ∗. For (v): by the Teichmüller-Tukey Lemma (Thm. 1.14(i)) it is sufficient to check that each finite subset Δ of Γ∗ is consistent. By finiteness of Δ, there exists n such that Δ ⊆ Γn. As Γn is consistent, Δ is consistent, hence, Γ∗ is consistent. For (vi): suppose that Γ∗ ⊂ Ω and Con(Ω). Let φ ∈ Ω \ Γ∗. Then φ is φn for some n. It follows that Γn ∪ {φn} is inconsistent, hence, Γ∗ ∪ {φn} is inconsistent
and as Γ∗ ∪ {φn} ⊆ Ω, the set Ω is inconsistent, a contradiction, i.e., Γ∗ is a maximal consistent extension of Γ. Proof is concluded.

Corollary 2.3 If Con(Γ) and, for each formula φ, Con(Γ ∪ {φ}) implies φ ∈ Γ, then MaxCon(Γ).

Theorem 2.19 The following are basic properties of maximal consistent sets. Suppose that MaxCon(Γ).
(i) Γ ⊢ φ if and only if φ ∈ Γ;
(ii) Σ_SL ⊆ Γ;
(iii) φ ∈ Γ if and only if ¬φ ∉ Γ;
(iv) φ ∧ ψ ∈ Γ if and only if φ ∈ Γ and ψ ∈ Γ;
(v) φ ∨ ψ ∈ Γ if and only if φ ∈ Γ or ψ ∈ Γ;
(vi) φ ⊃ ψ ∈ Γ if and only if φ ∈ Γ implies that ψ ∈ Γ.
Proof For (i): if Γ ⊢ φ and φ ∉ Γ, then Γ ∪ {φ} is inconsistent, hence, Γ ∪ {φ} ⊢ ⊥. As Γ ⊢ ⊥ is impossible, the proof of ⊥ from Γ ∪ {φ} must involve φ, and thus, Γ ⊢ φ along with Γ ∪ {φ} ⊢ ⊥ imply Γ ⊢ ⊥, a contradiction; For (ii): by definition of a proof, Γ proves all theorems, hence all theorems are in Γ, i.e., Σ_SL ⊆ Γ; For (iii): were φ, ¬φ ∈ Γ, we would have Γ ⊢ φ, Γ ⊢ ¬φ, a contradiction, hence, at most one of φ, ¬φ may be in Γ; For (iv): property (iv) follows by formulae φ ⊃ (ψ ⊃ φ ∧ ψ) and by φ ∧ ψ ⊃ φ and φ ∧ ψ ⊃ ψ ([11]); For (v): property (v) follows by provable formulae φ ⊃ φ ∨ ψ and ψ ⊃ φ ∨ ψ (derivation 72 in [11]) for the proof from right to left and by provable formulae φ ∨ ψ ⊃ (¬φ ⊃ ψ) and φ ∨ ψ ⊃ (¬ψ ⊃ φ) (derivations 84–86 in [11]) for the proof from left to right; For (vi): property (vi) follows on the strength of the provable formula (φ ⊃ ψ) ≡ ¬φ ∨ ψ (derivations 84–86 in [11]), since either ¬φ ∈ Γ or ψ ∈ Γ. If φ ∈ Γ, then ψ ∈ Γ.

Theorem 2.20 Each MaxCon(Γ) is a Hintikka set, hence, each maximal consistent set is satisfiable.

Proof By Theorem 2.13 and by Theorem 2.19(iii)–(v).
Corollary 2.4 Each consistent set of formulae is satisfiable.

Corollary 2.5 (i) For a consistent set Γ of formulae and a formula φ, Γ ⊢ φ if and only if φ ∈ Ω for each maximal consistent extension Ω of Γ; (ii) φ ∈ Σ_SL if and only if φ ∈ Ω for each MaxCon(Ω); (iii) Each formula in a consistent set Γ is provable from each maximal consistent extension of Γ.
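The step-by-step extension used in the proof of the Lindenbaum Lemma can be imitated on a finite list of formulae. The sketch below is an illustration under two assumptions that are not part of the proof itself: the enumeration of formulae is cut to a finite list, and the consistency test is replaced by a truth-table satisfiability test, which soundness together with Corollary 2.4 justify. The names `lindenbaum`, `satisfiable`, `value` and `atoms` are hypothetical.

```python
from itertools import product

# Formulae: atoms are strings; ('not', a), ('and', a, b), ('or', a, b),
# ('imp', a, b) are compound formulae.

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, a):
    """Truth value of the formula f under the assignment a (a dict)."""
    if isinstance(f, str):
        return a[f]
    if f[0] == 'not':
        return not value(f[1], a)
    x, y = value(f[1], a), value(f[2], a)
    return {'and': x and y, 'or': x or y, 'imp': (not x) or y}[f[0]]

def satisfiable(formulas):
    # Assumption: satisfiability stands in for consistency.
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    return any(all(value(f, dict(zip(names, bits))) for f in formulas)
               for bits in product([False, True], repeat=len(names)))

def lindenbaum(gamma, enumeration):
    """Add phi_0, phi_1, ... in turn whenever the result stays consistent."""
    extension = list(gamma)
    for phi in enumeration:
        if phi not in extension and satisfiable(extension + [phi]):
            extension.append(phi)
    return extension


gamma = [('or', 'p', 'q')]
enumeration = ['p', ('not', 'p'), 'q', ('not', 'q'),
               ('and', 'p', 'q'), ('not', ('and', 'p', 'q'))]
print(lindenbaum(gamma, enumeration))
```

On this finite fragment the result contains exactly one of φ, ¬φ for each enumerated pair, in agreement with Theorem 2.19(iii); a different order of enumeration may yield a different extension.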
The importance of Corollary 2.4 stems from its relation to completeness. We will see usages of this relation in the following chapters. This relation may be expressed in the following statement.

Theorem 2.21 (The strong completeness of SL) If each consistent set of formulae is satisfiable, then for each set Γ of formulae and each formula φ,
(SC) if Γ |= φ, then Γ ⊢ φ.

Proof Suppose the conclusion is false, i.e., for a set Γ of formulae and a formula φ we have that Γ |= φ but it is not true that Γ ⊢ φ. Then, the set Γ ∪ {¬φ} is consistent (by Theorem 2.17(vii)) but not satisfiable, because truth of Γ in a model would imply the truth of φ in the model, hence, the falsity of ¬φ. The property (SC) is called the strong completeness property.
2.10 Meta-Theory of Sentential Logic. Part II
We have pointed to the property of completeness in Theorem 2.12, save the provability of the equivalence φ ≡ CNF(φ). We have relied on this property when discussing the Lindenbaum-Tarski algebra in Definition 2.28, and finally we have proved strong completeness in Corollary 2.4 and in Theorem 2.21. In spite of this proof, we include a proof of completeness in a purely syntactical environment.

Two important characteristics of each deductive system are soundness and completeness. In the case of SL axiomatized by (L1)–(L3), soundness means that each formula provable from (L1)–(L3) by means of (MP), (S), and (R) is valid, and completeness of SL means that each valid formula is provable from (L1)–(L3) by means of (MP), (S), and (R). While soundness of SL results immediately from validity of all instances of axiom schemata (L1)–(L3) and from preservation of validity by rules of inference, a proof of completeness requires some additional work.

Many authors provided proofs of completeness for SL, from Emil Post [24], through Łukasiewicz [11], Church [8], Hilbert and Ackermann [7], among others. We recall here the 'classical' proof by Kalmár in [25], by elimination of variables, which has served as a pattern for some other proofs, like Rosser and Turquette's completeness proof for n-valued logic in Chap. 6. In this proof, some provable formulae are used, and we will point to derivations of those formulae in Łukasiewicz [11] to make the proof self-contained.

Theorem 2.22 The following formulae are provable.
(A) p ⊃ p;
(B) p ⊃ ¬¬p;
(C) ¬p ⊃ (p ⊃ q);
(D) (¬p ⊃ p) ⊃ p;
(E) p ⊃ (q ⊃ p);
(F) p ⊃ [¬q ⊃ ¬(p ⊃ q)];
(G) (p ⊃ q) ⊃ [(¬p ⊃ q) ⊃ q];
(H) (¬p ⊃ ¬q) ⊃ [(¬p ⊃ q) ⊃ p];
(J) [p ⊃ (q ⊃ r)] ⊃ [(p ⊃ q) ⊃ (p ⊃ r)].
(A) is proved in Example 2.1; for the other formulae, references to derivations in Łukasiewicz [11] will be given when they are met in the proof.

Theorem 2.23 SL is complete, i.e., each valid formula is provable: for each φ, if |= φ, then ⊢ φ.

Proof (Kalmár) Consider a valid formula φ. We list all atomic propositions in φ as p1, p2, ..., pn and their set will be denoted by P. For each assignment A, we denote by f(ψ, A) the truth value of a formula ψ under A, and we introduce formulae φ∗, p1∗, p2∗, ..., pn∗ as follows.
(1) φ∗ is φ if f(φ, A) = 1, else φ∗ is ¬φ;
(2) pj∗ is pj if f(pj, A) = 1, else pj∗ is ¬pj.
Observe that f(φ∗, A) = 1 and f(pj∗, A) = 1. We now prove the intermediary claim.

Claim. p1∗, p2∗, ..., pn∗ ⊢ φ∗.

Proof of Claim. It goes by structural induction on φ, i.e., we consider increasingly complex forms of φ, complexity being measured by size(φ), the number of connectives in φ. We begin with size(φ) = 0. In this case φ is reduced to an atomic proposition p1. As p1∗ ⊃ p1∗ is provable, it follows by the deduction theorem that p1∗ ⊢ p1∗, so either p1 ⊢ p1 or ¬p1 ⊢ ¬p1, depending on the assignment A. Now, we assume that the Claim is true for all formulae of size less than k. Consider a formula φ of size k and check the following cases.

Case 1. φ is ¬ψ, hence, ψ has (k − 1) connectives so, by the induction hypothesis,
(3) p1∗, p2∗, ..., pn∗ ⊢ ψ∗.
Suppose first that ψ is true under A, so ψ∗ is ψ and f(φ, A) = 0, hence, φ∗ is ¬φ, which is ¬¬ψ. We know that ψ ⊃ ¬¬ψ is provable (see derivation 40 in [11]), hence, p1∗, p2∗, ..., pn∗ ⊢ ¬¬ψ, i.e., p1∗, p2∗, ..., pn∗ ⊢ φ∗. Next, we assume that A makes ψ false, hence f(φ, A) = 1, and thus φ∗ is φ, while ψ∗ is ¬ψ. By (3), p1∗, p2∗, ..., pn∗ ⊢ ¬ψ, i.e., p1∗, p2∗, ..., pn∗ ⊢ φ∗. This concludes Case 1.

Case 2. φ is ψ ⊃ χ. As ψ and χ have fewer connectives than φ, the hypothesis of induction holds for them and we have
(4) p1∗, p2∗, ..., pl∗ ⊢ ψ∗,
where p1, p2, ..., pl are atomic propositions in ψ, and
(5) q1∗, q2∗, ..., qp∗ ⊢ χ∗,
where q1, q2, ..., qp are atomic propositions in χ. Then, p1, p2, ..., pl, q1, q2, ..., qp are atomic propositions in φ (possibly with repetitions); enlarged sets of variables preserve provability, hence,
(6) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ ψ∗, χ∗.
There are some sub-cases to be considered.

Sub-case 2.1. In this sub-case we assume that both ψ, χ are true under the assignment A: f(ψ, A) = 1 = f(χ, A), so ψ∗ is ψ and χ∗ is χ, which implies that φ∗ is φ. By (6), we infer
(7) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ ψ, χ.
It is sufficient now to invoke (E) (derivation 18 in [11]) in the form χ ⊃ (ψ ⊃ χ) and then detachment gives
(8) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ ψ ⊃ χ, i.e.,
(9) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ φ∗.

Sub-case 2.2. Now we assume that f(ψ, A) = 1 and f(χ, A) = 0, hence, ψ∗ is ψ and χ∗ is ¬χ. Also, f(ψ ⊃ χ, A) = 0, so φ∗ is ¬φ. We know by the hypothesis of induction that
(10) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ ψ, ¬χ.
The provable formula (F) (derivation 69 in [11]) becomes after substitutions (p/ψ; q/χ)
(11) ψ ⊃ [¬χ ⊃ ¬(ψ ⊃ χ)].
By applying detachment twice, we obtain
(12) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ φ∗.

Sub-case 2.3. As an implication with a false premise is true, it is enough to consider as the last case that f(ψ, A) = 0, so ψ∗ is ¬ψ and φ∗ is φ. By the hypothesis of induction we have
(13) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ ¬ψ.
By provable formula (C) (derivation 36 in [11]), we obtain
(14) ¬ψ ⊃ (ψ ⊃ χ)
and from (14) it follows by detachment
(15) p1∗, p2∗, ..., pl∗, q1∗, q2∗, ..., qp∗ ⊢ φ∗.
All cases concluded, the proof of the Claim is concluded.

We return to the proof of the completeness theorem. As φ is valid, φ∗ is φ for each assignment; the set P = {pi : i = 1, 2, ..., n} contains all atomic propositions in φ. Let M_P denote the set of all assignments (worlds) for the set P. We know by the Claim that
(16) p1∗, p2∗, ..., pn∗ ⊢ φ.
The idea now is to eliminate successively all pi∗ from the antecedent of (16), beginning with pn∗ down to p1∗. We begin with pn∗. We choose from the set M_P two assignments A+ and A− which agree on p1, ..., pn−1 and are such that f(pn, A+) = 1 and f(pn, A−) = 0. In case of A+, we have
(17) p1∗, p2∗, ..., pn−1∗, pn ⊢ φ.
Similarly, for A−:
(18) p1∗, p2∗, ..., pn−1∗, ¬pn ⊢ φ.
By the deduction theorem, applied to (17) and (18), we obtain
(19) p1∗, p2∗, ..., pn−1∗ ⊢ pn ⊃ φ,
(20) p1∗, p2∗, ..., pn−1∗ ⊢ ¬pn ⊃ φ.
We now invoke the provable formula (G) (derivation 120 in [11]), which after substitutions takes the form
(21) (pn ⊃ φ) ⊃ [(¬pn ⊃ φ) ⊃ φ].
Detachment applied twice to (21) yields
(22) p1∗, p2∗, ..., pn−1∗ ⊢ φ.
It suffices now to follow the above procedure with pn−1, ..., p1, and after removing p1 we are left with
(23) ⊢ φ
and the proof of the completeness theorem is concluded. By this theorem, syntactic and semantic consequences are the same.
In the above proof, we have witnessed provability in action. To prove a formula is often a tedious process, hence, completeness is a very useful meta-property, as checking validity is much easier. In particular, we may check that formulae (1)–(10) in Definition 2.28 are valid, hence, they are provable. We now reveal the interpolation property for sentential logic by stating and proving the Craig interpolation theorem.

Definition 2.39 (An interpolant) Consider a valid formula φ ⊃ ψ. An interpolant for this formula is a formula ξ with the properties that all atomic propositions in ξ occur in both φ and ψ, and both formulae φ ⊃ ξ and ξ ⊃ ψ are valid. This definition can be extended to sets of formulae.

Using the formalism of analytic tableaux, we consider the formula ξ : φ ⊃ ψ with its signed form Fξ, which is φ ∧ ¬ψ. We begin the tableau with φ and ¬ψ in the initial branch. We use the idea of biased formulae in Fitting [22]. In the case of our formula ξ, we refer to its original ancestry by denoting φ as left(φ) and ¬ψ as right(¬ψ). This denotation continues with the buildup of the tableau. For instance, descendants of right(¬(¬p ⊃ q)) will be right(¬p) and right(¬q), all of course in one branch. The definition of an interpolant can be extended to sets of biased formulae of the form Γ : {left(γ1), left(γ2), ..., left(γn), right(δ1), right(δ2), ..., right(δm)}: an interpolant for Γ is the interpolant for the formula
(I) ⋀_{i=1}^{n} γi ⊃ ⋁_{j=1}^{m} ¬δj.
This generalization is in agreement with the case of the formula ξ : φ ⊃ ψ, for which the closed tableau begins with the set Γ = {left(φ), right(¬ψ)}, hence, an interpolant for Γ according to the generalized definition is the interpolant for the formula φ ⊃ ¬(¬ψ), i.e., an interpolant for ξ.

Definition 2.40 (Rules for closed tableaux for biased signed formulae) We keep the notation of type α for conjunctions and of type β for disjunctions. We follow the generalized definition for an interpolant; remember that formulae denoted as left enter the conjunction in the antecedent and those denoted right enter the disjunction of negations in the consequent of the implication (I). If there is no right formula, then the consequent is false, as we meet disjunction over the empty set.
(i) {Γ, left(φ), left(¬φ)}; interpolant: ⊥
(ii) {Γ, right(ψ), right(¬ψ)}; interpolant: ⊤
(iii) {Γ, left(φ), left(¬φ)}; interpolant: ⊥
(iv) {Γ, left(φ), right(¬φ)}; interpolant: φ
(v) {Γ, left(¬φ), right(φ)}; interpolant: ¬φ
(vi) {Γ, left(⊥)}; interpolant: ⊥
(vii) {Γ, right(⊥)}; interpolant: ⊤.
We now state sentential rules.
(viii) Γ, left(⊤) and Γ, left(¬⊥) have the same interpolants;
(ix) Γ, left(⊥) and Γ, left(¬⊤) have the same interpolants;
(x) Γ, right(⊥) and Γ, right(¬⊤) have the same interpolants;
(xi) Γ, right(⊤) and Γ, right(¬⊥) have the same interpolants;
(xii) Γ, left(φ) and Γ, left(¬¬φ) have the same interpolants;
(xiii) Γ, right(φ) and Γ, right(¬¬φ) have the same interpolants.
The phrase 'have the same interpolants' is to be understood as 'if ξ is an interpolant for one, then ξ is an interpolant for the other'. We now state rules for types α and β.
(xiv) for α : α1 ∧ α2: if Γ, left(α1), left(α2) has an interpolant ξ, then Γ, left(α) has an interpolant ξ;
(xv) for α : α1 ∧ α2: if Γ, right(α1), right(α2) has an interpolant ξ, then Γ, right(α) has an interpolant ξ;
(xvi) for β : β1 ∨ β2: if Γ, left(βi) has an interpolant ξi for i = 1, 2, then Γ, left(β) has an interpolant ξ1 ∨ ξ2;
(xvii) for β : β1 ∨ β2: if Γ, right(βi) has an interpolant ξi for i = 1, 2, then Γ, right(β) has an interpolant ξ1 ∧ ξ2.
We justify rules (xvi) and (xvii) as a pattern for other rules. Let Γ = {left(γ1), left(γ2), ..., left(γn), right(δ1), right(δ2), ..., right(δm)}, hence, the corresponding formula is ⋀_{i=1}^{n} γi ⊃ ⋁_{j=1}^{m} ¬δj. For rule (xvi): Suppose ξ1 is an interpolant for Γ, left(β1) and ξ2 is an interpolant for Γ, left(β2), hence,
(a) β1 ∧ ⋀_{i=1}^{n} γi ⊃ ξ1;
(b) β2 ∧ ⋀_{i=1}^{n} γi ⊃ ξ2.
Then,
β ∧ ⋀_{i=1}^{n} γi ≡ (β1 ∨ β2) ∧ ⋀_{i=1}^{n} γi ≡ (β1 ∧ ⋀_{i=1}^{n} γi) ∨ (β2 ∧ ⋀_{i=1}^{n} γi) ⊃ ξ1 ∨ ξ2.
The other implication, from ξ1 ∨ ξ2 to ⋁_{j=1}^{m} ¬δj, follows obviously from the implications from ξ1, respectively from ξ2, to ⋁_{j=1}^{m} ¬δj. The requirement that all atomic propositions in ξ1 ∨ ξ2 have occurrences in both ⋀_{i=1}^{n} γi and ⋁_{j=1}^{m} δj is satisfied by virtue of assumptions about ξ1 and ξ2. For rule (xvii), we assume
(c) ξ1 ⊃ ⋁_{j=1}^{m} ¬δj ∨ ¬β1;
(d) ξ2 ⊃ ⋁_{j=1}^{m} ¬δj ∨ ¬β2.
Then,
ξ1 ∧ ξ2 ⊃ (⋁_{j=1}^{m} ¬δj ∨ ¬β1) ∧ (⋁_{j=1}^{m} ¬δj ∨ ¬β2) ≡ ⋁_{j=1}^{m} ¬δj ∨ (¬β1 ∧ ¬β2) ≡ ⋁_{j=1}^{m} ¬δj ∨ ¬(β1 ∨ β2) ≡ ⋁_{j=1}^{m} ¬δj ∨ ¬β.
That ⋀_{i=1}^{n} γi ⊃ ξ1 ∧ ξ2 follows obviously, and the requirement about occurrences of atomic propositions is also satisfied. After a closed tableau for F(φ ⊃ ψ), started with left(φ) and right(¬ψ), is constructed, we apply the rules bottom-up.

Example 2.4 Consider the formula ξ : (p ∧ q ∧ r) ⊃ ((p ⊃ q) ⊃ r). It is valid: if the antecedent is true, then p, q, r are valued true and the consequent is true, so the formula is true. The interpolant for ξ results from the following tableau presented in the schematic form of left and right biased components:
F[(p ∧ q ∧ r) ⊃ ((p ⊃ q) ⊃ r)]
left: p – q – r
right: ¬r – ¬p, q
Interpolant: ((¬p) ∨ q) ∧ r.
The interpolant induced from the tableau is ((¬p) ∨ q) ∧ r. Let us check it is so. Obviously, all atomic propositions p, q, r are in the antecedent and in the consequent of ξ. We consider first the implication (p ∧ q ∧ r) ⊃ (((¬p) ∨ q) ∧ r). If the antecedent is true, then the consequent is true, so the implication holds. We consider next the implication (((¬p) ∨ q) ∧ r) ⊃ ((p ⊃ q) ⊃ r). If the consequent is false, then r is valued false, but then the antecedent is false, so the implication holds.

Theorem 2.24 (Craig interpolation, Craig [26]) For each valid formula φ ⊃ ψ there exists an interpolant.

Proof Suppose that the formula φ ⊃ ψ has no interpolant. Consider a property C of a finite collection H of formulae: C(H) holds if there exists a partition H = H1 ∪ H2 such that the formula ⋀H1 ⊃ ¬⋀H2 has no interpolant. Then one checks that H is a Hintikka set, hence, satisfiable. Apply this to the valid formula φ ⊃ ψ and define the collection H as {φ, ¬ψ}. Partition the set {φ, ¬ψ} into {φ} and {¬ψ}. Had the formula φ ⊃ ¬¬ψ an interpolant, also the formula φ ⊃ ψ would have an interpolant, contrary to the assumption. It follows that the collection {φ, ¬ψ} is satisfiable, contradicting the validity of the formula φ ⊃ ψ. This concludes the proof.

Remark 2.1 The idea for the proof as well as the rules for interpolants come from Fitting [22].
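The conditions of Definition 2.39 are easy to check mechanically by truth tables, which gives an independent confirmation of Example 2.4. The sketch below is illustrative only; the tuple encoding of formulae and the names `is_interpolant`, `valid`, `value` and `atoms` are assumptions made here, and the check verifies a candidate interpolant without constructing one from a tableau.

```python
from itertools import product

# Formulae: atoms are strings; ('not', a), ('and', a, b), ('or', a, b),
# ('imp', a, b) are compound formulae.

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, a):
    """Truth value of the formula f under the assignment a (a dict)."""
    if isinstance(f, str):
        return a[f]
    if f[0] == 'not':
        return not value(f[1], a)
    x, y = value(f[1], a), value(f[2], a)
    return {'and': x and y, 'or': x or y, 'imp': (not x) or y}[f[0]]

def valid(f):
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

def is_interpolant(phi, psi, xi):
    """Check the three conditions of an interpolant xi for phi imp psi."""
    return (atoms(xi) <= atoms(phi) & atoms(psi)
            and valid(('imp', phi, xi))
            and valid(('imp', xi, psi)))


phi = ('and', ('and', 'p', 'q'), 'r')
psi = ('imp', ('imp', 'p', 'q'), 'r')
xi = ('and', ('or', ('not', 'p'), 'q'), 'r')
print(is_interpolant(phi, psi, xi))
```

Interpolants need not be unique: for the same pair of formulae the single atomic proposition r also passes the check, while p does not.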
2.11 Resolution, Logical Consequence
The rule of detachment can be expressed in the form of the fraction (φ, ¬φ ∨ ψ) / ψ, which can be interpreted as the cancellation of contradictory literals φ, ¬φ in the two clauses in the numerator of the fraction and moving the remaining literals from both clauses to the denominator. It has been generalized to arbitrary clauses with a pair of contradictory literals in Robinson [27].

Definition 2.41 (The resolution rule) For clauses C1 : ⋁_{i=1}^{k} l_1^i and C2 : ⋁_{j=1}^{m} l_2^j with contradictory literals l_1^i and l_2^j, the resolution rule is
(RR) (l_1^1 ∨ l_1^2 ∨ ... ∨ l_1^i ∨ ... ∨ l_1^k, l_2^1 ∨ l_2^2 ∨ ... ∨ l_2^j ∨ ... ∨ l_2^m) / (l_1^1 ∨ l_1^2 ∨ ... ∨ l̂_1^i ∨ ... ∨ l_1^k ∨ l_2^1 ∨ l_2^2 ∨ ... ∨ l̂_2^j ∨ ... ∨ l_2^m).
The symbol x̂ means that the literal so denoted is omitted; thus, (RR) acts on a pair of clauses by removing a pair of contradictory literals, one from each clause.

The resolution rule is valid, as from validity of clauses C1, C2 under an assignment A, the validity under A of the resulting clause, called the resolvent and denoted res(C1, C2), follows: if the removed literal of a clause is valued 0, then there is another literal valued 1 in this clause and it finds itself in the resolvent. Clearly, of the two contradictory literals one must have value 0. We denote by □ the empty clause, which is unsatisfiable, as existential quantification over the empty set of objects is false. The context of resolution is a set of clauses C = {C1, C2, ..., Ck}.

Definition 2.42 (Resolution refutation) A sequence of resolution steps from the set C of clauses to the empty clause □ is called a resolution refutation.

The existence of a resolution refutation points to unsatisfiability of the set C of clauses: were the set C of clauses satisfiable, satisfiability would be preserved along any sequence of resolution steps with clauses in C. We obtain the following results.

Theorem 2.25 Resolution is sound, i.e., if the initial set of clauses is satisfiable, then there does not exist a resolution refutation.

Completeness of resolution means that if the initial set of clauses C is unsatisfiable, then there exists a resolution refutation.

Theorem 2.26 Resolution is complete: if a set of clauses is unsatisfiable, then there exists a resolution refutation.
Proof Let 𝒞 be an unsatisfiable set of clauses. By Corollary 2.4, 𝒞 is inconsistent, hence, there exists a proof C1, C2, ..., Ck of falsum ⊥. The inference rule is detachment, i.e., a particular case of the resolution rule. Hence, the sequence C1, C2, ..., Ck is a resolution refutation.

Example 2.5 Consider the set of clauses 𝒞 which contains the clauses:
1. p ∨ q,
2. q ∨ ¬r,
3. ¬p ∨ ¬q ∨ ¬r,
4. p ∨ q ∨ r,
5. p ∨ ¬r,
6. r.
We decide whether this set is satisfiable or not by searching for a resolution refutation. Consider the sequence of resolution steps:

$$\frac{5,\,6}{7:\ p},\qquad \frac{3,\,6}{8:\ \neg p \vee \neg q},\qquad \frac{7,\,8}{9:\ \neg q},\qquad \frac{2,\,9}{10:\ \neg r},\qquad \frac{6,\,10}{\Box}.$$

The set 𝒞 of clauses is unsatisfiable.

Definition 2.43 (Logical consequence. Entailment) A set Δ of formulae is a logical consequence of a set Γ of formulae if and only if, for each assignment A, if A satisfies Γ, then A satisfies a formula in Δ. In the particular case when Δ = {φ}, we say that φ is entailed by Γ. We denote the fact of sentential consequence with the symbol Γ |= Δ.

Theorem 2.27 A set Δ of formulae is a sentential consequence of a set Γ of formulae if and only if the formula $(\bigwedge \Gamma) \wedge \bigwedge \neg\Delta$ is unsatisfiable, where ¬Δ = {¬δ : δ ∈ Δ}.

Proof Suppose that Γ |= Δ and that A is an assignment on atomic propositions in formulae in Γ ∪ Δ which satisfies Γ; then A satisfies a formula in Δ, hence, $\bigwedge \neg\Delta$ is false under A. As A is arbitrary, the formula $(\bigwedge \Gamma) \wedge \bigwedge \neg\Delta$ is unsatisfiable. The converse is proved along similar lines.

Example 2.6 In the conditions of Example 2.5, consider the clause 5: p ∨ ¬r. It is the negation of the formula φ : (¬p) ∧ r. As the whole set of clauses 1–6 is unsatisfiable, by Theorem 2.27 the formula φ is entailed by the set of clauses with nos. 1, 2, 3, 4, 6.

We meet here the two facets of resolution: it can be applied towards checking satisfiability or towards checking entailment. In both applications, resolution is sound and complete.
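Examples 2.5 and 2.6 can be replayed by saturating a clause set under the resolution rule. The sketch below is an illustration under stated assumptions, not an efficient prover: literals are encoded as nonzero integers (−n is the negation of n), the encoding p = 1, q = 2, r = 3 and the function names are chosen here, and tautological resolvents are skipped.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All non-tautological resolvents of two clauses (frozensets of ints)."""
    out = []
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            # Skip tautologies, i.e., clauses containing both l and -l.
            if not any(-x in res for x in res):
                out.append(frozenset(res))
    return out

def refutable(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:              # empty clause: refutation found
                    return True
                if r not in known:
                    new.add(r)
        if not new:                    # saturated without the empty clause
            return False
        known |= new

# Example 2.5 with the (assumed) encoding p = 1, q = 2, r = 3.
P, Q, R = 1, 2, 3
example_2_5 = [{P, Q}, {Q, -R}, {-P, -Q, -R}, {P, Q, R}, {P, -R}, {R}]
all_six = refutable(example_2_5)
without_5 = refutable(example_2_5[:4] + [example_2_5[5]])
print(all_six, without_5)
```

The first call concerns the full set of Example 2.5; the second drops clause 5, which is the situation of Example 2.6: the remaining clauses alone are not refutable, while adding the negation of φ (clause 5) makes them so.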
2.12 Horn Clauses. Forward and Backward Chaining

We comment briefly on two inference methods. A set of clauses is called, when applications are discussed, a Knowledge Base (KB). Inferences from Knowledge Bases can be obtained by means other than resolution. The form depends on the aim. If we want to check whether a given query follows from the Knowledge Base, then we may apply Backward Chaining besides Resolution. If we want to infer all consequences of the Knowledge Base, then our choice can be Forward Chaining. These labels convey the idea of each search in Knowledge Bases: Backward Chaining begins with the query and tries to verify it, Forward Chaining begins with the first fact in the Knowledge Base and collects in the top-down manner all facts encountered in the process.
It is understandable that these procedures may require a form of clauses other than the one used in resolution.

Definition 2.44 (Horn clauses) A Horn clause (Horn [28]) is any clause which contains at most one non-negated literal. A Horn clause can be in one and only one of the following forms: (i) a non-negated literal p, or (ii) a clause with one non-negated literal, $p \vee \bigvee_{i \in I} \neg p_i$, or (iii) a clause with only negated literals, $\bigvee_{j \in J} \neg p_j$. The clause $p \vee \bigvee_{i \in I} \neg p_i$ can be brought to the form $\bigwedge_{i \in I} p_i \supset p$, called a decision rule. The clause of the form p is called a fact. The third possibility, $\bigvee_{j \in J} \neg p_j$, can be written down in the form $\bigwedge_{j \in J} p_j \supset \bot$, called an integrity constraint. A Horn formula is a formula in CNF all of whose clauses are Horn.

Example 2.7 Backward Chaining The textual version of the Knowledge Base is as follows (Carroll [29]):
1. The only animals in this house are cats.
2. Every animal that loves to gaze at the moon is suitable for a pet.
3. When I detest an animal, I avoid it.
4. No animals are carnivorous unless they prowl at night.
5. No cat fails to kill mice.
6. No animals ever like me, except those that are in this house.
7. Kangaroos are not suitable for pets.
8. None but carnivorous animals kill mice.
9. I detest animals that do not like me.
10. Animals that prowl at night always love to gaze at the moon.
11. Query: Therefore, I always avoid a kangaroo.

In order to render this set of statements in symbolic form, we introduce some acronyms: AH = animal in house; C = cat; LGM = loves to gaze at the moon; Pet = pet; Det = detest; AV = avoid; CA = carnivorous; PR = prowl; K = kill; LI = likes; KNG = kangaroo. Clearly, we could encode these phrases with letters a, b, c etc., but then the reading would be more difficult.

With this set, statements 1–11 can be transformed into decision rules: (I) AH ⊃ C; (II) LGM ⊃ Pet; (III) Det ⊃ AV; (IV) CA ⊃ PR; (V) C ⊃ K; (VI) LI ⊃ AH; (VII) KNG ⊃ ¬Pet; (VIII) K ⊃ CA; (IX) ¬LI ⊃ Det; (X) PR ⊃ LGM; (XI) Query: KNG ⊃ AV.

In order to prove Query from (I)–(X), we assume that the premise KNG is true and we enter the backward reasoning from the consequent AV, which is our goal. The following chain of derivations leads from consequents to the antecedents which verify them. In this process, we often use the contraposition law, using ¬q ⊃ ¬p instead of p ⊃ q. We prefer this approach to rendering some of (I)–(X) in the contraposition form in order to preserve the syntax of the original text. The solution to the Query is a list of goals: AV to (III): Det to (IX): ¬LI to (VI): ¬AH to (I): ¬C to (V): ¬K to (VIII): ¬CA to (IV): ¬PR to (X): ¬LGM to (II): ¬Pet to (VII): KNG. The last goal KNG is the assumed premise of Query (XI), so the consequent AV has been derived from (I)–(X) and KNG, hence, Query is proved. The backward chaining process is illustrated with the chain of goals, in which each arrow points from a goal to the subgoal supplied by the indicated rule:

AV →(III) Det →(IX) ¬LI →(VI) ¬AH →(I) ¬C →(V) ¬K →(VIII) ¬CA →(IV) ¬PR →(X) ¬LGM →(II) ¬Pet →(VII) KNG.

Read from right to left, it is a chain of applications of the detachment rule leading from KNG to AV.
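The goal-driven search of Example 2.7 can be sketched in a few lines. The sketch rests on assumptions that should be kept in mind: statement 7 is read as KNG ⊃ ¬Pet and the query as "KNG entails AV"; rules that the example uses in contraposed form are stored already contraposed; negated atoms are encoded as separate strings such as `"notPet"`. The function name and data layout are hypothetical.

```python
def backward_chain(goal, rules, facts, path=(), trace=None):
    """Try to prove `goal`; every goal examined is appended to `trace`.

    rules maps a head atom to a list of alternative bodies (lists of atoms).
    """
    if trace is None:
        trace = []
    trace.append(goal)
    if goal in facts:
        return True
    if goal in path:                      # goal is already pending: avoid a cycle
        return False
    for body in rules.get(goal, []):      # try each rule with this consequent
        if all(backward_chain(sub, rules, facts, path + (goal,), trace)
               for sub in body):
            return True
    return False

# Rules (I)-(X), written as head <- body, contraposed where Example 2.7 does so.
rules = {
    "AV": [["Det"]],          # (III)
    "Det": [["notLI"]],       # (IX)
    "notLI": [["notAH"]],     # (VI), contraposed
    "notAH": [["notC"]],      # (I), contraposed
    "notC": [["notK"]],       # (V), contraposed
    "notK": [["notCA"]],      # (VIII), contraposed
    "notCA": [["notPR"]],     # (IV), contraposed
    "notPR": [["notLGM"]],    # (X), contraposed
    "notLGM": [["notPet"]],   # (II), contraposed
    "notPet": [["KNG"]],      # (VII), assumed reading: KNG implies not Pet
}
facts = {"KNG"}               # premise of the query

trace = []
proved = backward_chain("AV", rules, facts, trace=trace)
print(proved, " -> ".join(trace))
```

The recorded trace is the list of goals of the example, from AV down to KNG. Treating ¬Pet and Pet as unrelated atoms is a simplification that works here because no rule needs both.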
To introduce Forward Chaining, we enlarge our set of atomic propositions by allowing open predicate statements like Cat(Amber), which are treated like atomic propositions in sentential logic. They are called facts.

Example 2.8 Forward Chaining The textual version is as follows (folklore):
1. Amber is a cat.
2. Cats prowl at nights.
3. If a cat prowls at nights then it gazes at the moon.
4. Amber is black.
5. If a cat is black then it is not visible at nights.
6. If a cat is not visible at nights then it catches a big catch.
7. Amber is shy.
8. If a cat is shy then it goes into hiding during a day.
9. If a cat goes into hiding during a day then it is not known to neighbors.
10. If a cat is black and goes into hiding during a day then it does not cross our paths.

Acronyms for predicative statements are: PR = prowls at night; GM = gazes at the moon; NV = not visible at nights; CTCH = catches a big catch; GH = goes into hiding; NKN = not known; NCR = does not cross our paths. Let us convert the text into a symbolic form. Our atomic propositions are in predicative open form. The facts are: (I) Cat(Amber); (II) Black(Amber); (III) Shy(Amber). Decision rules extracted from the text are:

(IV) Cat(Amber) → PR(Amber);
(V) PR(Amber) → GM(Amber);
(VI) Black(Amber) → NV(Amber);
(VII) NV(Amber) → CTCH(Amber);
(VIII) Shy(Amber) → GH(Amber);
(IX) GH(Amber) → NKN(Amber);
(X) Black(Amber) ∧ GH(Amber) → NCR(Amber).

Derivations of new facts by detachment are:
(a) (I, IV) → (XI): PR(Amber);
(b) (V, XI) → (XII): GM(Amber);
(c) (II, VI) → (XIII): NV(Amber);
(d) (VII, XIII) → (XIV): CTCH(Amber);
(e) (III, VIII) → (XV): GH(Amber);
(f) (IX, XV) → (XVI): NKN(Amber);
(g) (II, X, XV) → (XVII): NCR(Amber).

Theorem 2.28 Backward chaining with Horn decision rules is sound and complete.
Proof Consider a set of decision rules R1, R2, ..., Rk and suppose we have obtained from them a derivation of the consequent qk of the rule Rk. In the process of derivation, we checked each antecedent pi in the rule Rk, finding a rule in which pi was the consequent and adding the antecedents of that rule to the queue of antecedents to be checked. Checking a premise can be regarded, if we consider rules in their clausal form, as executing the resolution rule. We consider the qk-saturated set Sk, defined as the smallest set containing the literals used in the process of validating qk except for qk. Let Ci be the clausal form of the rule Ri for i ≤ k. Let $C_i^k$ be the sub-clause of the clause Ci containing those literals in Ci which occur in the set Sk. Then: qk is validated if and only if resolution performed on the set of clauses $C_1^k, C_2^k, \ldots, C_k^k$ yields □. It follows, by soundness and completeness of resolution, that backward chaining is sound and complete.

Resolution has exponential complexity (Haken [30]).

Theorem 2.29 (Exponential complexity of resolution) There exists a sequence of formulae $(H_n)_{n=1}^{\infty}$ such that the set of clauses for $H_n$ is of cardinality of order $n^3$ and a resolution tree for $H_n$ contains of the order of $c^n$ clauses for some constant c.

It may be interesting to point to the idea for the proof. Formulae $H_n$ stem from the pigeon-hole principle of Dirichlet: n + 1 letters cannot be put into n mailboxes in such a way that each mailbox contains at most one letter. For a given n, the formula $H_n$ is constructed as follows:
(1) let $P_{ij}$ mean that the letter i is in the mailbox j;
(2) in order to express the fact that each letter is in a mailbox, we need the formula $\varphi : \bigwedge_{i=1}^{n+1} \bigvee_{j=1}^{n} P_{ij}$;
(3) in order to express the fact that two letters fall into some mailbox, we need the formula $\psi : \bigvee_{p=1}^{n} \bigvee_{i=1}^{n+1} \bigvee_{j=i+1}^{n+1} (P_{ip} \wedge P_{jp})$;
(4) the formula φ ⊃ ψ is the formula $H_n$.
The proof in Haken [30] is not reproduced here due to its length.
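To make the size claim of Theorem 2.29 concrete, the clause set refuted by resolution, i.e., the clausal form of ¬Hₙ = φ ∧ ¬ψ, can be generated and counted. The sketch below assumes an integer encoding of the atoms P_ij chosen here for illustration; the brute-force satisfiability test is usable only for very small n and is not part of Haken's argument.

```python
from itertools import combinations, product

def pigeonhole_clauses(n):
    """Clauses of the negation of H_n: n + 1 letters, n mailboxes.

    The positive integer var(i, j) stands for P_ij, 'letter i is in mailbox j'.
    """
    var = lambda i, j: (i - 1) * n + j            # i in 1..n+1, j in 1..n
    # phi: every letter is in some mailbox (n + 1 clauses).
    clauses = [[var(i, j) for j in range(1, n + 1)] for i in range(1, n + 2)]
    # not psi: no two letters share a mailbox (n * C(n+1, 2) clauses).
    for p in range(1, n + 1):
        for i, j in combinations(range(1, n + 2), 2):
            clauses.append([-var(i, p), -var(j, p)])
    return clauses

def satisfiable(clauses, n_vars):
    """Brute-force truth-table test; feasible only for very small n_vars."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

for n in (1, 2, 3):
    cl = pigeonhole_clauses(n)
    print(n, len(cl), satisfiable(cl, n * (n + 1)))
```

The number of clauses is (n + 1) + n·(n + 1)·n/2, which is of order n³, and each of the small instances is unsatisfiable, in agreement with the validity of Hₙ.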
2.13 Satisfiability, Validity and Complexity in Sentential Logic. Remarks on SAT Solvers

The satisfiability (SAT) decision problem in sentential logic consists in determining whether a given formula is satisfiable, which can be paraphrased as the question whether all clauses in a CNF of the formula are satisfied by a common assignment. The decision problem for the satisfiability property is theoretically decidable by brute force: given a formula φ, one has to construct the truth table for φ and check each assignment of truth values for φ. If we meet an assignment for which the value of φ is 1, then φ is satisfiable and this assignment yields the model. The validity problem is dual to the satisfiability problem, as a formula φ is valid if and only if the formula ¬φ is unsatisfiable, so checking unsatisfiability of ¬φ by means of the truth table is equivalent to checking validity of φ. Hence, the decision problem of validity is decidable. This approach has exponential complexity of $2^n$ in the case of formulae with n atomic propositions, hence, it is impractical.

In terms of computability, we formulate the decidability for SL by stating that the set of Gödel numbers of valid formulae is recursive, hence, the decision problem whether a formula is valid is recursively solvable (see Chap. 1 or Davis [31]).

It is known, cf. Theorem 1.53, that the problem CNF-SAT of checking satisfiability of a formula in CNF as well as the problem (3-CNF)-SAT of checking satisfiability of a formula in 3-CNF, i.e., in CNF in which each clause has exactly three literals, are NP-complete. Hence, the validity problem is co-NP-complete. Yet, satisfiability is a vital property in applications, so there are a number of algorithms to test the satisfiability property of formulae of sentential logic in a heuristic way. We present here some ideas on which SAT solvers are based.

Let us first comment on some variants of resolution. Let 𝒞 be a non-empty set of clauses. A unit resolution is the variant in which the rule (RR) is allowed only if at least one clause is a singleton (l). It is known that unit resolution is complete for sets of Horn clauses. We now consider some operations on sets of clauses with resolution in mind.

Definition 2.45 (A pre-processing of clauses)
(1) We have already observed that each clause containing a pair of contradictory literals can be removed from a set of clauses without affecting the issue of satisfiability.
Suppose that there exists a literal l such that the clause (l) ∈ 𝒞. Let 𝒞_l be the set {C ∈ 𝒞 : l ∈ C} and 𝒞′ = 𝒞 \ 𝒞_l. If 𝒞 is satisfiable in the world V, then V(l) = 1, hence each clause in 𝒞_l is valid at V. The test of satisfiability is then with 𝒞′; as V(¬l) = 0, the literal ¬l cannot contribute to the validity of any clause in 𝒞′, hence satisfiability of 𝒞 is equivalent to satisfiability of 𝒞′ with ¬l deleted from its clauses. Hence,
(2) if a set of clauses 𝒞 contains a singleton clause (l), then the clauses containing l can be removed and the literal ¬l can be removed from all remaining clauses;
(3) after (1) and (2) are performed, we remove all clauses containing orphaned literals, i.e., literals whose negations occur in no clause of the set (cf. Example 2.9).
Finally, consider a valid formula C ⊃ C′ between clauses in the set 𝒞 of clauses. Clearly, if C is valid under an assignment V, then C′ is valid under V. The impact of this remark is that
(4) in the set 𝒞 of clauses we can remove all non-minimal clauses with respect to set-theoretic inclusion on sets of literals in clauses.
We call a set 𝒞 of clauses pre-processed, with the result being the set of clauses $\mathcal{C}^{pre}$, if and only if we have applied to 𝒞 the reduction rules (1)–(4) in this order.
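The four reductions of Definition 2.45 can be sketched as one function. The reading adopted here is an assumption: a singleton clause (l) causes removal of clauses containing l and deletion of ¬l elsewhere, and an orphaned literal is one whose negation occurs in no clause. Literals are nonzero integers (−n negates n); the function name and the returned set of forced literals are introduced only for illustration.

```python
def preprocess(clauses):
    """Apply reductions (1)-(4) in order; return (clauses, forced literals)."""
    cs = {frozenset(c) for c in clauses}
    # (1) Drop clauses containing a pair of contradictory literals.
    cs = {c for c in cs if not any(-l in c for l in c)}
    # (2) Singleton clauses: drop clauses containing l, delete -l elsewhere.
    forced = set()
    while True:
        unit = next((c for c in cs if len(c) == 1), None)
        if unit is None:
            break
        (l,) = unit
        forced.add(l)
        cs = {c - {-l} for c in cs if l not in c}
    # (3) Orphaned literals: drop every clause containing one of them.
    while True:
        lits = {l for c in cs for l in c}
        orphans = {l for l in lits if -l not in lits}
        if not orphans:
            break
        forced |= orphans
        cs = {c for c in cs if not (c & orphans)}
    # (4) Keep only clauses minimal with respect to inclusion.
    cs = {c for c in cs if not any(d < c for d in cs)}
    return cs, forced

# Clauses (x ∨ y ∨ z), (y ∨ ¬z ∨ ¬w), (¬y ∨ t) with x, y, z, w, t = 1..5.
remaining, forced = preprocess([[1, 2, 3], [2, -3, -4], [-2, 5]])
print(remaining, forced)
```

On this input the sketch treats x, ¬w and also t as orphaned, so no clause remains and the set is reported satisfiable by setting the forced literals to 1. An empty clause left in the result would signal unsatisfiability.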
We now introduce the Davis-Putnam solver (Davis and Putnam [32]), which conveys the basic ideas of SAT solving.

Definition 2.46 (Davis-Putnam algorithm) Consider a pre-processed set of clauses 𝒞. Do the following:
(1) select non-deterministically a literal l and perform resolution on l with the set D of all clauses containing l or ¬l, adding the resolvents and after that removing all clauses in D;
(2) repeat (1) until either □ is obtained, in which case report unsatisfiable, or there is nothing left to resolve and □ is not obtained, in which case report satisfiable.

Example 2.9 (i) Consider the set of clauses: (x ∨ y ∨ z), (y ∨ ¬z ∨ ¬w), (¬y ∨ t).
1. There is no unit clause, so the prescription in Definition 2.45(2) is skipped;
2. There are orphaned literals x and ¬w and, by Definition 2.45(3), we remove the clauses containing them. A satisfying assignment A should have values A(x) = 1, A(w) = 0 and the pre-processed set of clauses is: (¬y ∨ t). The report is then: satisfiable.
(ii) Let us consider one more case of a set of clauses, in 2-SAT: (x ∨ y), (x ∨ ¬y), (¬x ∨ z), (¬x ∨ ¬z).
1. Resolve clauses with x and ¬x, add resolvents, remove clauses resolved; the remaining clauses are: (y ∨ z), (y ∨ ¬z), (¬y ∨ z), (¬y ∨ ¬z).
2. Resolve clauses with y and ¬y, add resolvents, remove clauses resolved; after dropping the tautological resolvents (z ∨ ¬z), the remaining clauses are: (z), (¬z).
3. Resolve the last two clauses: the resolvent is □, so the formula is not satisfiable; report unsatisfiable.

Definition 2.47 (The Davis-Logemann-Loveland (DLL/DPLL) algorithm) This algorithm uses backtracking along a search tree.

Example 2.10 Consider clauses (¬x ∨ ¬y), (¬z ∨ y), (¬x ∨ z ∨ ¬w), (z ∨ w).
The search tree begins with the root x and contains levels for y, z, w in that order; the left edge is labelled with 1, the right edge is labelled with 0, so we have 16 maximal branches and we list them as sequences of labels 0, 1 on edges of the tree along branches together with satisfiability results F (unsatisfiable), T (satisfiable):

1111F, 1110F, 1101F, 1100F, 1011F, 1010F, 1001F, 1000F, 0111T, 0110T, 0101T, 0100F, 0011F, 0010F, 0001T, 0000F

The set of clauses is satisfiable. In practical execution of the algorithm, we would begin with values x = 1, y = 1, which invalidate the first clause, hence, we backtrack to x = 1, y = 0, z = 1, which invalidate the second clause, so we backtrack to values x = 1, y = 0, z = 0, w = 1, which invalidate the third clause, hence, we backtrack to values x = 1, y = 0, z = 0, w = 0, which invalidate the fourth clause, so we backtrack to values x = 0, y = 1, which satisfy clauses 1, 2, 3 but do not decide clause 4, hence, we go down to values x = 0, y = 1, z = 1, which satisfy the set of clauses.

We now consider some additional cases of polynomial complexity. We mention the cases of sets of Horn clauses and of 2-CNF clauses.

Theorem 2.30 The satisfiability problem SAT-HORN for sets of Horn decision rules is solvable in time linear in the number of literals (Dowling and Gallier [33]). It is a PTIME-complete problem, see Greenlaw et al. [34].

Theorem 2.31 The 2-SAT satisfiability problem is solvable in linear time. It is an NL-complete problem (Aspvall et al. [35]).

Proof We outline an idea of a proof. Consider the problem 2-SAT, in which a formula in CNF consists of clauses with two literals. The APT-algorithm for solving the problem in linear time is recalled below. Linearity comes from Tarjan's algorithm for finding strongly connected components in a graph (Tarjan [36]).

APT-algorithm Represent the clauses of φ in the graph called the implication graph: the set of vertices W is the set of all literals in the formula and their negations. Edges are of the following types:
(i) if the clause is of the form u ∨ v, then the edges are ¬u → v, ¬v → u;
(ii) if the clause is of the form ¬u ∨ v, then the edges are u → v, ¬v → ¬u;
(iii) if the clause is of the form ¬u ∨ ¬v, then the edges are u → ¬v, v → ¬u.
Then the following are equivalent:
(i) the formula φ is satisfiable under an assignment A;
(ii) the satisfying assignment A assigns complementary values to complementary vertices and there is no path which leads from a vertex valued 1 to a vertex valued 0;
(iii) for no vertex u, both u and ¬u are in one strongly connected component of the graph.
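The implication-graph criterion (iii) can be turned into a short decision procedure. The sketch below is an illustration, not the APT-algorithm as published: it uses Kosaraju's two-pass method for strongly connected components in place of Tarjan's algorithm (both are linear-time), and the integer encoding of literals and the function name `two_sat` are assumptions made here.

```python
def two_sat(n, clauses):
    """Decide 2-SAT for variables 1..n; clauses are pairs of nonzero ints.

    Returns a satisfying assignment {variable: bool}, or None if unsatisfiable.
    """
    idx = lambda l: 2 * (abs(l) - 1) + (l < 0)    # vertex number of literal l
    size = 2 * n
    graph = [[] for _ in range(size)]
    rgraph = [[] for _ in range(size)]
    for a, b in clauses:                           # a ∨ b gives ¬a → b, ¬b → a
        for u, v in ((-a, b), (-b, a)):
            graph[idx(u)].append(idx(v))
            rgraph[idx(v)].append(idx(u))

    # First pass: vertices listed in order of DFS completion.
    order, seen = [], [False] * size
    for s in range(size):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(graph[s]))]
        while stack:
            v, it = stack[-1]
            w = next((w for w in it if not seen[w]), None)
            if w is None:
                order.append(v)
                stack.pop()
            else:
                seen[w] = True
                stack.append((w, iter(graph[w])))

    # Second pass on the reversed graph: components in topological order.
    comp, label = [-1] * size, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s], todo = label, [s]
        while todo:
            v = todo.pop()
            for w in rgraph[v]:
                if comp[w] == -1:
                    comp[w] = label
                    todo.append(w)
        label += 1

    assignment = {}
    for x in range(1, n + 1):
        if comp[idx(x)] == comp[idx(-x)]:
            return None                            # x and ¬x in one component
        assignment[x] = comp[idx(x)] > comp[idx(-x)]
    return assignment

# Example 2.9(ii) with x, y, z = 1, 2, 3.
result = two_sat(3, [(1, 2), (1, -2), (-1, 3), (-1, -3)])
print(result)
```

For the clause set of Example 2.9(ii) the literals x and ¬x fall into one component, so the function returns None, in agreement with the Davis-Putnam run of that example.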
2.14 Physical Realization of SL. Logic Circuits, Threshold Logic

Two-valued sentential logic can be implemented in a physical environment. Logical gates are one way of doing it. See Fig. 2.6 for a symbolic rendering of the gates AND, OR, NOT and Fig. 2.7 for the circuit which computes the formula (p → q) → [(q ↔ r) → (p → r)] in its CNF: (p ∨ q ∨ ¬p ∨ r) ∧ (¬q ∨ q ∨ ¬p ∨ r). The theory of circuits parallels the theory expounded above with a suitable change of terminology.

Fig. 2.6 Logical gates
Fig. 2.7 A logical circuit
Fig. 2.8 The artificial neuron

The other way is to turn to neural networks. The McCulloch-Pitts neuron (McCulloch and Pitts [37]) is a rendering of the physiological neuron described by Ramón y Cajal [38], see Fig. 2.8. In this figure, we see the soma (body), input synapses, each of which can convey a unit impulse, and the output. The symbol Θ denotes the threshold. Computations by the McCulloch-Pitts neuron follow the rule: if $\sum_i x_i \geq \Theta$, then y = 1, otherwise y = 0. Hence, this type of neuron computes the Heaviside function of the sum of its inputs, and the functions it computes are monotone. One can enhance its computing power by introducing excitatory and inhibitory inputs; a neuron endowed with a single inhibitory input and threshold Θ = 0 computes the negation NOT: the input x = 1 activates inhibition and the neuron outputs no value, i.e., 0, while the input x = 0 produces the output 1. Then a network of connected McCulloch-Pitts neurons can compute each Boolean function of finitely many atomic propositions.

In order to provide learning capabilities to neurons, one has to endow the neuron with weights on inputs. Then, one obtains a perceptron (Rosenblatt [39]). While weighted networks are equivalent to networks of McCulloch-Pitts neurons with inhibitory powers, the introduction of weights allows for a proof of the perceptron learning theorem. The computation rule for the perceptron is: if $\sum_i w_i x_i \geq \Theta$, then y = 1, otherwise y = 0.

In a learning process of a concept, a perceptron is given a sample S of positive (P) and negative (N) examples, each coded as an input vector x = [x1, x2, ..., xn]. Classification of the concept consists in a linear separation of positive from negative examples. If this is achieved, then we say that the perceptron has learned the concept. The criterion for the proper classification is: if an example is positive, then $\sum_i w_i x_i \geq \Theta$, where the vector x is coding the example, and, in the case of a negative example, the criterion is $\sum_i w_i x_i < \Theta$.

Theorem 2.32 (The perceptron learning theorem) If sets (P) and (N) of examples coded as sets of vectors in the Cartesian space can be linearly separated by a linear manifold, then the perceptron, beginning with a random set of non-zero weights, can apply a learning algorithm whose execution will terminate after finitely many steps with a set of weights which will properly classify the concept, i.e., they will separate linearly (P) from (N).

Proof We make a few simplifications. First, we can suppose that Θ = 0: this change will result only in a shift of (P) ∪ (N) by a vector, not affecting the relative positions of (P) and (N).
Next, a simplification consists in considering the set (−N) instead of (N): then the criterion for proper classification will be $\sum_i w_i x_i \geq 0$ for each vector x ∈ (P) ∪ (−N). Finally, we may assume that all vectors are normalized, i.e., the length (the norm) of each is 1, as this does not change the sign of the scalar product. The assumption that (P) and (N) are linearly separable means that there exists a weight vector w which correctly classifies all vectors in the sample. In the process of learning, two types of error can be made: (i) a vector x ∈ (P) can be classified as negative; (ii) a vector x ∈ (N) can be classified as positive. The learning algorithm has to account for and correct those errors. The remedy is to change weights. To simplify notation, we denote the vector of weights as w and we denote the scalar product $\sum_i w_i x_i$ as x · w.

The learning algorithm
Input: vectors in (P) ∪ (N), a random vector w0, Θ = 0;
Q = the queue of vectors to be tested on satisfiability of the criterion x · w ≥ 0;
t = time measured as steps in the execution of the algorithm; initially t = 0;
wt = the current value of the weight vector at time t;
xt = the vector dequeued at time t.
For a vector xt ∈ (P) ∪ (N):
1. if xt ∈ (P) and xt · wt ≥ 0, then enqueue xt, dequeue Q and test the dequeued vector xt+1;
2. if xt ∈ (P) and xt · wt < 0, then wt+1 = wt + xt, enqueue xt, dequeue Q and test the dequeued vector xt+1;
3. if xt ∈ (N) and xt · wt < 0, then enqueue xt, dequeue Q and test the dequeued vector xt+1;
4. if xt ∈ (N) and xt · wt ≥ 0, then wt+1 = wt − xt, enqueue xt, dequeue Q and test the dequeued vector xt+1.

Suppose that an error was committed at some time t and the vector wt+1 was computed as wt + xt (due to the convention that (N) was replaced with (−N)). Suppose that we compute the distance from w to wt+1. We denote by ||v|| the norm of the vector v and we assume that ||w|| = 1. As distance can be measured by the cosine of the angle between vectors, we compute

(i) $\cos(w_{t+1}, w) = \dfrac{w_{t+1} \cdot w}{\|w_{t+1}\|}$.

Let η = min{x · w : x ∈ (P) ∪ (−N)}; clearly, η > 0. Then

(ii) $w_{t+1} \cdot w = w_t \cdot w + x_t \cdot w \geq w_t \cdot w + \eta$.

By recurrence, we obtain

(iii) $w_{t+1} \cdot w \geq w_0 \cdot w + (t + 1) \cdot \eta$.

On the other hand, as $x_t \cdot w_t < 0$ when an error is committed and $\|x_t\| = 1$,

(iv) $\|w_{t+1}\| = \sqrt{(w_t + x_t) \cdot (w_t + x_t)} = \sqrt{\|w_t\|^2 + \|x_t\|^2 + 2 \cdot x_t \cdot w_t} \leq \sqrt{\|w_t\|^2 + 1}$,

so that, by recurrence, $\|w_{t+1}\| \leq \sqrt{\|w_0\|^2 + t + 1}$.

By (iii) and (iv), cos(wt+1, w) → ∞ as t → ∞, which is impossible, hence, time is bounded from above. The estimate for the maximum number of steps is given by $1 \geq \sqrt{t + 1} \cdot \eta$, i.e., $t \sim \frac{1}{\eta^2}$. It follows that the number of steps depends on the distance between the sets (P) and (N) measured by η.
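The four cases of the learning algorithm can be sketched as follows. This is an illustration under assumptions made here: the queue is replaced by repeated passes over a fixed list of samples, the threshold is Θ = 0, the random initial weights are drawn from [−1, 1], and the function names, the `max_passes` guard and the toy data are hypothetical.

```python
import random

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def train_perceptron(positive, negative, max_passes=1000, seed=0):
    """Perceptron learning with threshold 0; returns (weights, corrections)."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(len(positive[0]))]
    queue = [(x, True) for x in positive] + [(x, False) for x in negative]
    corrections = 0
    for _ in range(max_passes):
        changed = False
        for x, is_positive in queue:
            s = dot(x, w)
            if is_positive and s < 0:            # case 2: add the vector
                w = [wi + xi for wi, xi in zip(w, x)]
            elif not is_positive and s >= 0:     # case 4: subtract the vector
                w = [wi - xi for wi, xi in zip(w, x)]
            else:                                # cases 1 and 3: no error
                continue
            corrections += 1
            changed = True
        if not changed:                          # a full pass without errors
            return w, corrections
    raise RuntimeError("no separating weights found within max_passes")

# Toy sample, linearly separable through the origin (e.g., by w = (1, 1)).
P = [(1.0, 0.2), (0.8, 0.6)]
N = [(-1.0, -0.3), (-0.5, -0.9)]
weights, corrections = train_perceptron(P, N)
print(weights, corrections)
```

The function returns only after a complete pass with no correction, so the returned weights satisfy x · w ≥ 0 on (P) and x · w < 0 on (N). The vectors of the toy sample are not normalized; normalization matters for the estimate in the proof, not for termination on separable data.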
2.15 Problems Problem 2.17 was picked up from (Gallier J. H.: Logic for Computer Science. Foundations of Automatic Theorem Proving. Dover Publications (2015), 3.5.9) and (Chang, C. C., Keisler, J. H.: Model Theory. Elsevier Science Publ., Amsterdam (1992), Chap. 4.6).
Problem 2.1 (Logical consequence) (after Smullyan [20]). We recall that Γ |= Δ denotes logical consequence (11.6). For a set of formulae Γ and formulae φ, ψ, prove the following: if Γ, φ |= ψ and Γ, φ |= ¬ψ, then Γ |= ¬φ.

Problem 2.2 (Sentential compactness, consistency) (after Smullyan [20], II.4). Prove: If Γ |= φ, then ⋀Δ ⊃ φ is valid for a finite Δ ⊆ Γ.

Problem 2.3 (Consistency) (after Beth [21], 89, p. 262). A set Γ of formulae is called complete if and only if for each formula φ either φ ∈ Γ or ¬φ ∈ Γ. Prove: (i) the Lindenbaum theorem: each consistent set has a complete and consistent extension; (ii) each set Γ of formulae which is consistent and complete is maximally consistent.

Problem 2.4 (Consistency. The Tarski theorem) (after Beth [21], 89, p. 262). Prove: Each consistent set Γ is the intersection of all its consistent and complete extensions.

Problem 2.5 (Replacement theorem) (after Fitting [22], 2.5.1). Consider a formula φ(p) in which p is a singled out atomic proposition (there may be more of them in φ) along with formulae ψ and χ and an assignment A. Apply structural induction in order to prove: if A(ψ) = A(χ), then A(φ(p/ψ)) = A(φ(p/χ)).

Problem 2.6 (Tableaux) (after Fitting [22], 3.1.2). Prove by means of an analytic tableau that the function XOR(p, q) = ((¬p) ∧ q) ∨ (p ∧ (¬q)) is associative.

Problem 2.7 (Validity) (after Fitting [22], 3.6.6). Prove that if for formulae φ and ψ which have no atomic proposition in common the formula φ ⊃ ψ is valid, then either ¬φ is valid or ψ is valid.

Problem 2.8 (Validity. Dilemmatic mood) (after Church [8], 15.9). Prove validity of the following moods; for simplicity we use symbols for atomic propositions.
Simple constructive dilemmatic mood: If p, then q. If r, then q. p or r. Therefore q.
Simple destructive dilemmatic mood: If p, then q. If p, then r. Not q, not r. Therefore not p.
Complex constructive dilemmatic mood: If p, then q. If r, then s. p or r. Therefore q or s.
Complex destructive dilemmatic mood: If p, then q. If r, then s. Not q, not s. Therefore not p, not r.
Problem 2.9 (Sentential connectives: converse non-implication) (after Church [8], 24, p. 155). The connective of converse non-implication ⊂n is defined as p ⊂n q = (¬p) ∧ q = ¬(q ⊃ p). Prove: ⊃, ⊂n constitute a complete system of independent connectives, i.e., together they define all other connectives.

Problem 2.10 (Normal forms: implicative normal form) (after Church [8], 15.4). A formula φ(p1, p2, ..., pn) is in implicative normal form if and only if φ is of the form (i) $C_1 \supset (C_2 \supset (\ldots (C_{m-1} \supset (C_m \supset \bot))))$, where (ii) each $C_i$ is of the form $C_i^1 \supset (C_i^2 \supset (\ldots (C_i^n \supset \bot)))$, where (iii) each $C_i^k$ is either $p_k$ or $\neg p_k$; (iv) formulae $C_i$ are arranged in the lexicographic order of their $C_i^k$'s with respect to superscripts k, with precedence of $p_k$ over $\neg p_k$, i.e., suppose that the sequence $C_i^1, C_i^2, \ldots, C_i^{k-1} = C_j^1, C_j^2, \ldots, C_j^{k-1}$ and $C_i^k = p_k$ and $C_j^k = \neg p_k$; then $C_i$ precedes $C_j$. Prove: for each formula φ there exists a unique implicative normal form φ⊃ which contains the same atomic propositions as φ and such that φ ≡ φ⊃. [Hint: Recall CNF and the valid equivalence ¬p ≡ p ⊃ ⊥].

Problem 2.11 (Normal forms: negation normal form) A negation normal form is a form of a formula in which negation signs are applied only to atomic propositions. Prove that each formula can be transformed into an equivalent negation normal form. [Hint: Apply the replacement Theorem 2.5].

Problem 2.12 (Independence) A set Γ of formulae is independent if and only if for each formula φ ∈ Γ, it is not the case that Γ \ {φ} ⊢ φ. Prove: for each finite set Γ, there exists Δ ⊆ Γ such that Δ is independent and Δ ⊢ φ for each φ ∈ Γ.

Problem 2.13 (Resolution) (after Fitting [22], 3.8.4). A set Γ of clauses is resolution saturated if and only if with each clause C it contains the results of any application to C of the Resolution Rule (RR). Prove: A set Γ of clauses which is resolution saturated and unsatisfiable contains the empty clause.

Problem 2.14 (Robinson sets) (after Fitting [22], 3.8.5, 3.8.6). For a set Γ of formulae, Γ is a Robinson set if and only if (i) for each φ ∈ Γ, Γ contains the clauses of its clausal expansion; (ii) Γ contains all results of applications of the Resolution Rule (RR) to any pair of its clauses; (iii) Γ does not contain the empty clause. Prove: Each Robinson set is satisfiable.
Problem 2.15 (Craig interpolation theorem) Decide the form of the Craig interpolant for the valid formula φ ⊃ ψ which satisfies the conditions of Problem 2.7.

Problem 2.16 (Haken formulae. Complexity of resolution) (after Haken [30], Theorem 2.29). Write down the formula H2 for two pigeon-holes and 3 letters and check its validity either by the method of analytic tableaux or by the method of resolution.

Problem 2.17 (Reduced products) We recall the notion of a proper filter (see Sect. 1.9) as a family F of subsets of a set S such that (i) ∅ ∉ F; (ii) A, B ∈ F ⊃ A ∩ B ∈ F; (iii) A ∈ F ∧ A ⊆ B ⊆ S ⊃ B ∈ F. By the Zorn maximal principle, each proper filter extends to a maximal proper filter, called an ultrafilter. Recall: a family of sets is finitely centered if each finite subfamily has a non-empty intersection.
(a) Prove: each finitely centered family of sets H extends to a filter on the set ⋃H;
(b) consider an infinite set S and the family I of all finite subsets of S. For each s ∈ S, consider the set I(s) = {A ∈ I : s ∈ A}, hence, each I(s) is a collection of finite subsets of S, and we let J = {I(s) : s ∈ S}; J is a family of subsets of the set I. Prove: J has the finite intersection property, hence, J extends to a proper filter F_J and then to an ultrafilter U_J;
(c) consider a non-empty set S and a filter F on S along with a set A_s ≠ ∅ for each s ∈ S. Let $C = \prod_s A_s$ be the Cartesian product of the family {A_s : s ∈ S} of sets. Elements of C are functions $f : S \to \bigcup_s A_s$ such that f(s) ∈ A_s for each s ∈ S. Consider a relation f ∼_F g on functions in C, defined as follows: f ∼_F g if and only if {s ∈ S : f(s) = g(s)} ∈ F. Prove: The relation ∼_F is an equivalence relation on C;
(d) with reference to (c), denote by the symbol [f]∼_F the class of the function f in C and let C_F = {[f]∼_F : f ∈ C}; this set is called the reduced product of the family {A_s : s ∈ S}. Prove: for the set P of atomic propositions, for a collection {A_p : p ∈ P} of assignments, and for a filter F on P, C_F |= p if and only if {q ∈ P : A_q |= p} ∈ F. [Hint: Use the reduced product and observe that it defines an assignment.]
References
1. Łukasiewicz, J.: Aristotle's Syllogistic from the Standpoint of Modern Formal Logic, 2nd edn. enlarged. Oxford University Press (1957)
2. Bocheński, I.M.: Ancient Formal Logic. North Holland, Amsterdam (1951)
3. Bobzien, S.: Ancient Logic. In: Zalta, E.N. (ed.) SEP. https://plato.stanford.edu/archives/sum2020/entries/logic-ancient
4. Frege, G.: Begriffsschrift, eine der mathematischen nachgebildete Formelsprache des reinen Denkens. Nebert, L. Halle, A. S. (1879). (also in: Van Heijenoort, J. (ed.): From Frege to Gödel. A Source Book in Mathematical Logic 1879–1931, Harvard University Press, Cambridge MA (1967))
5. Łukasiewicz, J.: From history of sentential logic (in Polish). Przegląd Filozoficzny 37, (1934). (also in Erkenntnis 5, 111–131 (1935-36) and in Borkowski, L. (ed.). Jan Łukasiewicz. Selected Works. North Holland P.C. Amsterdam-Polish Scientific Publishers (PWN). Warsaw (1970))
6. Kleene, S.C.: Mathematical Logic. Dover Publications, Mineola, N.Y., USA (2002)
7. Hilbert, D., Ackermann, W.: Principles of Mathematical Logic. Chelsea (1950). (Luce, R. (ed.), Hammond, L.M., Leckie, G.G., Steinhard, F. translation of Grundzüge der Theoretischen Logik. Julius Springer, Berlin (1938))
8. Church, A.: Introduction to Mathematical Logic. Princeton University Press, Princeton NJ (1956)
9. Meredith, C.A.: Single axioms for the systems (C, N), (C, O) and (A, N) of the two-valued sentential calculus. J. Comput. Syst. 1, 155–164 (1953)
10. Łukasiewicz, J., Tarski, A.: Untersuchungen über den Aussagenkalkül. C.R. Soc. Sci. Lett. Varsovie, Cl. III, 23, 39–50 (1930). (also in: Borkowski, L. (ed.): J. Łukasiewicz: Selected Works. Studies in Logic and the Foundations of Mathematics. North-Holland Publ. Amsterdam and Polish. Sci. Publ. (PWN), Warszawa (1970))
11. Łukasiewicz, J.: Elements of Mathematical Logic. Pergamon Press, Oxford and Polish Scientific Publishers (PWN), Warsaw (1966). (reprinted from mimeographed notes by students of Warsaw University (1929))
12. Herbrand, J.: Recherches sur la théorie de la démonstration. Ph.D Thesis at the Paris University. Travaux Soc. Sci. Lett. Varsovie cl. III, pp. 128 (1930). (also in: Van Heijenoort, J.: From Frege to Gödel. A Source Book in Mathematical Logic 1879–1931, Harvard University Press, Cambridge MA, pp. 525–581 (1967))
13. Tarski, A.: Über einige fundamentale Begriffe der Metamathematik (1930). (As: 'On some fundamental concepts of metamathematics' in: Tarski, A.: Logic, Semantics, Metamathematics, pp. 30–37. Oxford University Press, New York (1956))
14. Bernays, P.: Axiomatische Untersuchung des Aussagenkalküls der 'Principia Mathematica'. Mathematische Zeitschrift, XXV (1926)
15. Rasiowa, H., Sikorski, R.: The Mathematics of Metamathematics. Polish Scientific Publishers (PWN), Warszawa (1963)
16. Jaśkowski, S.: Teoria dedukcji oparta na dyrektywach założeniowych (in Polish) (Theory of deduction based on suppositional directives). In: Księga Pamiątkowa I Polskiego Zjazdu Matematycznego. Uniwersytet Jagielloński, Kraków (1929)
17. Jaśkowski, S.: On the rules of suppositions in formal logic. Stud. Logica 1, 5–32 (1934). (Also in: McCall, S. (ed.). Polish Logic 1920–1939. Oxford University Press, pp. 232–258 (1967))
18. Gentzen, G.: Untersuchungen über das Logische Schliessen, I, II. Math. Z. 39, 176–210, 405–431 (1934/5)
19. Indrzejczak, A.: Sequents and Trees. Springer Nature Switzerland, Cham, Switzerland (2021)
20. Smullyan, R.M.: First Order Logic. Dover, Mineola N.Y. (1996)
21. Beth, E.W.: The Foundations of Mathematics. A Study in the Philosophy of Science. Harper & Row Publishers, New York (1966)
22. Fitting, M.: First-Order Logic and Automated Theorem Proving. Springer, New York (1996)
23. Hintikka, K.J.J.: Form and content in quantification theory. Acta Philosophica Fennica 8, 7–55 (1955)
24. Post, E.L.: Introduction to a general theory of elementary propositions. Am. J. Math. 43(3), 163–185 (1921). https://doi.org/10.2307/2370324
25. Kalmár, L.: Über die Axiomatisierbarkeit des Aussagenkalküls. Acta Sci. Math. 7, 222–243 (1935)
26. Craig, W.: Linear reasoning. A new form of the Herbrand-Gentzen theorem. J. Symb. Logic 22, 250–268 (1957)
110
2 Sentential Logic (SL)
27. Robinson, J.A.: A machine oriented logic based on the resolution principle. J. ACM 12(1), 23–41 (1965) 28. Horn, A.: On sentences which are true of direct unions of algebras. J. Symb. Log. 16(1), 14–21 (1951) 29. Carroll, L.: Complete Works. Symbolic Logic, vol. 60. Vintage Books, New York (1976) 30. Haken, A.: The intractability of resolution. Theoret. Comput. Sci. 39, 297–308 (1985) 31. Davis, M.: Computability and Unsolvability. McGraw-Hill Book Co., New York (1958) 32. Davis, M., Putnam, H.: A computing procedure for quantification theory. J. ACM 7, 201–215 (1960) 33. Dowling, W.F., Gallier, J.H.: Linear-time algorithms for testing the satisfiability of sentential Horn formulae. J. Logic Progr. 1(3), 267–284 (1984). https://doi.org/10.1016/07431066(84)90014-1 34. Greenlaw, R., Hoover, J., Ruzzo, W.: Limits to Parallel Computation. P-Completeness Theory. Oxford University Press, Oxford, UK (1995) 35. Aspvall, B., Plass, M.F., Tarjan, R.E.: A linear-time algorithm for testing the truth of certain quantified boolean formulas. Inf. Process. Lett. 8(3), 121–123 (1979). https://doi.org/10.1016/ 0020-0190(79)90002-4 36. Tarjan, R.E.: Depth-first search and linear graph algorithm., SIAM J. Comput. 1(2), 146–160 (1972). https://doi.org/10.1137/0201010 37. McCulloch, W., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 9, 127–147 (1943) 38. Ramón y Cajal, S.: New Ideas on the Structure of the Nervous System in Man and Vertebrates. MIT Press, Cambridge, MA (1990). (1st ed. Paris (1894)) 39. Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958)
Chapter 3
Rudiments of First-Order Logic (FO)
In this chapter we present basic results on the second classical logic, first-order logic (FO). In many of the topics discussed here it is reduced to predicate logic, i.e., logic without function symbols; the full power of FO is used when the deepest results are discussed: the Gödel completeness theorem, the Gödel incompleteness theorems, the Rosser incompleteness theorem, and the Tarski theorem on non-definability of truth.
3.1 Introduction to Syntax of FO
First-order logic (FO) adds to propositional logic the possibility of expressing properties of individuals collectively by using the quantified phrases ‘for all’ and ‘for some’. In this, it follows in the footsteps of Aristotle’s Syllogistic, which introduced those expressions into its syllogisms. In order to express properties, FO uses relations rendered symbolically in the form of predicates encoded as relational symbols. The term ‘predicate’ is derived from the Latin ‘praedicatum’, meaning a declared property of an object/subject. As predicates are interpreted as relations, we will use the name ‘relational symbol’ in place of ‘predicate’, but we keep the traditional name of predicate logic for FO without function symbols.
For instance, when we want to state that ‘John loves each animal’, we need the binary predicate loves(John, x) and the unary predicate animal(x); the predicate ‘loves’ expresses a relation of being in love between two beings, and the unary predicate ‘animal’ renders the property of being an animal. In propositional logic, we could express our statement, at least partially, by listing all animals, at least those accessible to John, as a1, a2, ..., an, ... and forming the formula loves(John, a1) ∧ loves(John, a2) ∧ ... with each loves(John, a) as an atomic proposition. In predicate logic, we would write down the formula (∀x.(animal(x) ⊃ loves(John, x))).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_3
The expression ∀ is the universal quantifier meaning ‘for all’. Similarly, if we wanted to convey to someone the information that there are no unicorns, we would write down the formula (¬∃x.unicorn(x)), in which the predicate ‘unicorn’ expresses the property of being a unicorn and the existential quantifier ∃ means ‘for some’, ‘exists’. Let us see some more examples. ‘Every animal that prowls at night loves to gaze at the moon’ is rendered as (∀x.((animal(x) ∧ prowl-at-night(x)) ⊃ loves-gaze-at-moon(x))).
‘Anyone who knows Mary loves her’ is rendered as (∀x.(knows(x, Mary) ⊃ loves(x, Mary))). ‘If some animal loves me, then it is an animal in the house’ is rendered as (∃x.(animal(x) ∧ loves(x, m) ⊃ in-house(x))). In the above examples, ‘m’, meaning myself, and ‘Mary’ are constants. ‘If a prime number divides the product of two numbers, then it divides at least one of them’ is rendered as (∀x, y, z.((prime(x) ∧ x|yz) ⊃ (x|y ∨ x|z))). In the last formula, we meet specialized predicates which express some arithmetic properties; in such cases we say that the statement belongs to the theory of arithmetic, a specialized case of FO. In the absence of specialized predicates, we simply speak of FO, and in the absence of function symbols and of the relational symbol = of identity, we speak of predicate logic.
From the above examples it follows that, in order to render such complex statements in symbols, we may need symbols for individual variables like ‘x’, symbols for individual constants like ‘m’ or ‘Mary’, symbols for functions and predicates, quantifier symbols, symbols for propositional connectives, and auxiliary symbols like parentheses and commas. We now define formally the syntax of first-order logic. We distinguish between logical and non-logical symbols. Logical symbols are those for individual variables, logical connectives, quantifiers, and auxiliary symbols like punctuation marks and parentheses. Non-logical symbols are constant, relational and function symbols, which are subject to interpretation.
List of symbols $L = L_v \cup L_c \cup L_q \cup L_a$, where
(i) $L_v = \{x_1, x_2, \ldots, x_n, \ldots\}$ is the countably infinite set of individual variables, often expressed in formulae as x, y, z, ...;
(ii) $L_c = \{\vee, \wedge, \supset, \equiv, \neg\}$ is the set of logical connectives of SL;
(iii) $L_q = \{\forall, \exists\}$ is the set of quantifier symbols ‘for all’ and ‘exists’;
(iv) $L_a$, consisting of the comma ‘,’, the parentheses ‘(’ and ‘)’, and the brackets ‘[’ and ‘]’, is the set of auxiliary symbols: parentheses and punctuation marks.
The relational vocabulary for FO is the set Q = P ∪ C ∪ F, where
(v) $P = \bigcup_{n=1}^{\infty} P_n$, $P_n = \{P_1^n, P_2^n, \ldots, P_k^n, \ldots\}$, is the set of relational symbols and each $P_n$ is the set of relational symbols of arity n;
(vi) $C = \{c_1, c_2, \ldots, c_n, \ldots\}$ is a countably infinite set of constant symbols;
(vii) $F = \bigcup_{n=0}^{\infty} F_n$, $F_n = \{f_1^n, f_2^n, \ldots, f_k^n, \ldots\}$, is the set of function symbols and each $F_n$ is the set of symbols for functions of arity n.
The vocabulary may also contain the special relational symbol = of identity. Formulae of FO are built from elementary sequences of symbols: terms and atomic formulae.
Definition 3.1 (Terms) Terms are defined by the following conditions:
(i) constants and variables are terms;
(ii) if function symbols are present, $f_j^n$ is a function symbol and $t_1, t_2, \ldots, t_n$ are terms, then $f_j^n(t_1, t_2, \ldots, t_n)$ is a term;
(iii) each term is obtained either as in (i) or as in (ii).
Definition 3.2 (Atomic formulae) Atomic formulae are defined as follows:
(i) for each relational symbol $P_n^k$ and terms $t_1, t_2, \ldots, t_k$, the expression $P_n^k(t_1, t_2, \ldots, t_k)$ is an atomic formula;
(ii) if the identity symbol = is present, then for each pair $t_1, t_2$ of terms, the expression $t_1 = t_2$ is an atomic formula;
(iii) each atomic formula is obtained either as in (i) or as in (ii).
Definition 3.3 (Formulae of FO) The set of formulae is defined by rules (i)–(iv). Each formula obtained by means of rules (i)–(iii) is well-formed (wf). We use the shortcut wff for the phrase ‘well-formed formula’:
(i) each atomic formula is a wff;
(ii) if φ, ψ are wffs, and ◦ is one of ∨, ∧, ⊃, ≡, then φ ◦ ψ and ¬φ are wffs;
(iii) for each wff φ and an individual variable x, the expressions ∀x.φ, ∃x.φ are wffs;
(iv) each wff is obtained solely by means of (i)–(iii).
Rules (i)–(iii) are rules of formation and their consecutive applications lead from atomic formulae to the formula. The formation tree is the visualization of the process of application of formation rules. Example 3.1 (The formation tree) The formation tree for the formula φ : ∀x.∀y.(P(x, y) ⊃ ∃z.¬R(x, y, z)) is presented in Fig. 3.1. In nodes of the tree we have sub-formulae of the formula φ. The set Sub(φ) contains sub-formulae of φ.
Fig. 3.1 A formation tree
Definition 3.4 (Sub-formulae) For each formula φ, we define the set of its sub-formulae Sub(φ) as follows:
(i) Sub(φ) = {φ} for each atomic formula φ;
(ii) Sub(φ ◦ ψ) = Sub(φ) ∪ Sub(ψ) ∪ {φ ◦ ψ} for ◦ ∈ {∨, ∧, ⊃, ≡};
(iii) Sub(¬φ) = {¬φ} ∪ Sub(φ);
(iv) Sub(∀x.φ) = {∀x.φ} ∪ Sub(φ); Sub(∃x.φ) = {∃x.φ} ∪ Sub(φ).
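The recursive clauses of Definition 3.4 can be read off as a short program. The following is a minimal sketch, not part of the original text: the class names Atom, Not, Bin, Quant and the function sub are hypothetical choices made for illustration, and terms are simplified to variable names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Atom:            # atomic formula, e.g. P(x, y); args are variable names
    name: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Not:             # negation of a formula
    body: object


@dataclass(frozen=True)
class Bin:             # binary connective: op is one of "∨", "∧", "⊃", "≡"
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Quant:           # quantified formula: q is "∀" or "∃"
    q: str
    var: str
    body: object


def sub(phi):
    """Return Sub(phi) as a Python set, clause by clause as in Definition 3.4."""
    if isinstance(phi, Atom):                    # clause (i)
        return {phi}
    if isinstance(phi, Bin):                     # clause (ii)
        return sub(phi.left) | sub(phi.right) | {phi}
    if isinstance(phi, (Not, Quant)):            # clauses (iii) and (iv)
        return {phi} | sub(phi.body)
    raise TypeError(f"not a formula: {phi!r}")


# The formula of Example 3.1: ∀x.∀y.(P(x, y) ⊃ ∃z.¬R(x, y, z))
R = Atom("R", ("x", "y", "z"))
phi = Quant("∀", "x", Quant("∀", "y",
            Bin("⊃", Atom("P", ("x", "y")), Quant("∃", "z", Not(R)))))
print(len(sub(phi)))   # prints 7
```

The seven elements are exactly the formulae that label the nodes of the formation tree: φ itself, ∀y.(...), the implication, P(x, y), ∃z.¬R(x, y, z), ¬R(x, y, z) and R(x, y, z).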
Definition 3.5 (Scope, free and bound occurrences, independence of terms) For formulae ∀x.φ(x), ∃x.φ(x), the scope of the quantifier is the formula φ(x). For instance, in the formula φ of Example 3.1, the scope of ∀x is the sub-formula ∀y.(P(x, y) ⊃ ∃z.¬R(x, y, z)), and the scope of ∃z is the sub-formula ¬R(x, y, z). An individual variable x is bound in a formula if it occurs in the scope of a quantifier ∀x or ∃x; in the contrary case the variable is free. In the formula φ in Example 3.1, all variables are bound. A formula is open if all variables that occur in it are free; a formula is closed if all variables in it are bound. Closed formulae are called sentences because, with respect to truth, they behave like formulae of propositional logic.
A term t is independent of a variable x in a formula φ if no free occurrence of x in φ falls into the scope of any quantifier ∀y or ∃y such that y is a variable in the term t. If we check that t is independent of x in φ, then the substitution (x/t) in φ will not bind any variable in t. In particular, independence of a variable y of a variable x in a formula φ means that no free occurrence of x in φ occurs in the scope of ∀y or ∃y. If so, the substitution (x/y) preserves free occurrences of the variable.
The closure of a formula φ, denoted $\overline{\phi}$, is obtained by prefixing φ with universal quantifiers for every variable that occurs in φ as free. For instance, the closure of the formula [P(x) ⊃ Q(x)] ∨ ∃y.[R(y, z) ∨ S(y)] is ∀z.∀x.([P(x) ⊃ Q(x)] ∨ ∃y.[R(y, z) ∨ S(y)]).
Any open formula is obtained by means of steps (i) and (ii) in Definition 3.3, which implies the following relation between predicate and propositional logics.
Theorem 3.1 Each open formula of predicate logic is the result of a substitution of atomic formulae into a formula of propositional logic.
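Free variables and the closure of a formula can be computed by a direct recursion on the formula. The sketch below is an illustration only: formulae are encoded as nested tuples, the tags "atom", "not", "or", "and", "imp", "iff", "forall", "exists" and the function names free_vars and closure are assumptions of this sketch, and every argument of an atom is treated as a variable.

```python
def free_vars(phi):
    """Set of variables with a free occurrence in phi."""
    tag = phi[0]
    if tag == "atom":                       # ("atom", name, (v1, ..., vk))
        return set(phi[2])
    if tag == "not":                        # ("not", body)
        return free_vars(phi[1])
    if tag in ("or", "and", "imp", "iff"):  # (tag, left, right)
        return free_vars(phi[1]) | free_vars(phi[2])
    if tag in ("forall", "exists"):         # (tag, variable, body)
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown tag: {tag}")


def closure(phi):
    """Prefix phi with a universal quantifier for each free variable."""
    for x in sorted(free_vars(phi)):        # sorted only to make the result deterministic
        phi = ("forall", x, phi)
    return phi


# The example from the text: [P(x) ⊃ Q(x)] ∨ ∃y.[R(y, z) ∨ S(y)]
phi = ("or",
       ("imp", ("atom", "P", ("x",)), ("atom", "Q", ("x",))),
       ("exists", "y", ("or", ("atom", "R", ("y", "z")), ("atom", "S", ("y",)))))

print(sorted(free_vars(phi)))               # prints ['x', 'z']
print(closure(phi)[:2], closure(phi)[2][:2])  # prints ('forall', 'z') ('forall', 'x')
```

The variable y does not appear among the free variables because its only occurrences lie in the scope of ∃y; the closure therefore quantifies x and z only.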
3.2 Introduction to Semantics of FO
It is obvious from our discussion of the syntax of predicate logic that the notion of truth for predicate logic should be defined in a more complex environment than was the case for propositional logic: we should accommodate individual variables, constants, function symbols, relational symbols and quantifiers, i.e., the various grammatical categories of the syntax of first-order logic. The semantic theory for predicate logic which we will present goes back to Tarski [1].
Definition 3.6 (Domain, interpretation) A structure for FO is a pair M = (D, I). The set D of things is called the domain of the structure and I is an interpretation of the non-logical constructs of the logic in the set D:
(i) for a relational symbol $P = P_j^k$, the interpretation $P^I \subseteq D^k$ is a relation of arity k on the domain D;
(ii) for a function symbol $f = f_j^k$, the interpretation is a function $f^I : D^k \to D$;
(iii) for a constant symbol $c = c_j$, the interpretation $c^I \in D$ is a fixed element of the domain D.
It remains to settle the case of individual variables. Their meanings are decided by assignments.
Definition 3.7 (Assignments) An assignment is a mapping A of the set $L_v$ of individual variables into the domain D, which assigns to each individual variable $x_i$ an element $x_i^A \in D$.
Definition 3.8 (Substitution) Assume that variables are ordered into a sequence $\sigma : (x_i)_{i=1}^{\infty}$. For an assignment A, an individual variable $x_i$ and an element d ∈ D, a substitution σ(i, d) is a mapping σ(i, d)(A) on the set of A-assignments to individual variables defined as follows:
(i) $\sigma(i, d)(x_j^A) = x_j^A$ in case $j \neq i$;
(ii) $\sigma(i, d)(x_i^A) = d$.
Definition 3.9 (A-structures) An A-structure is a triple $M_{I,A} = (D, I, A)$, where (D, I) is a structure M and A is an assignment. Interpretations in the A-structure are as follows: $P^{I,A} = P^I$, $f^{I,A} = f^I$, $c^{I,A} = c^I$, $x_i^{I,A} = x_i^A$ and, more generally, interpretations of terms are denoted $t^{I,A}$.
We now address the issue of satisfiability and we define the notion of satisfaction of a formula in an A-structure. Definition 3.10 (Satisfaction) For an A-structure over a fixed domain D, the following conditions define the relation I, A |= φ of formula satisfaction:
(i) for each atomic formula $\phi : P^k(t_1, t_2, \ldots, t_k)$, $I, A \models \phi$ if and only if $(t_1^{I,A}, t_2^{I,A}, \ldots, t_k^{I,A}) \in P^I$;
(ii) $I, A \models (t_1 = t_2)$ if and only if $t_1^{I,A} = t_2^{I,A}$;
(iii) $I, A \models f(t_1, t_2, \ldots, t_k) = t$ if and only if $f^I(t_1^{I,A}, t_2^{I,A}, \ldots, t_k^{I,A}) = t^{I,A}$;
(iv) $I, A \models \phi \vee \psi$ if and only if either $I, A \models \phi$ or $I, A \models \psi$;
(v) $I, A \models \phi \wedge \psi$ if and only if $I, A \models \phi$ and $I, A \models \psi$;
(vi) $I, A \models \neg\phi$ if and only if it is not the case that $I, A \models \phi$;
(vii) $I, A \models \phi \supset \psi$ if and only if either it is not the case that $I, A \models \phi$ or $I, A \models \psi$;
(viii) $I, A \models \forall x_i.\phi$ if and only if for each d ∈ D, $I, \sigma(i, d)(A) \models \phi$;
(ix) $I, A \models \exists x_i.\phi$ if and only if for some d ∈ D, $I, \sigma(i, d)(A) \models \phi$.
Please observe that conditions (viii) and (ix) make satisfaction for closed formulae (sentences) independent of σ.
Definition 3.11 (Validity, satisfiability) A formula φ is true in an A-structure $M_{I,A}$ if and only if I, A |= φ. In this case the formula φ is satisfiable and the A-structure $M_{I,A}$ is a model for the formula φ. A formula φ is true in a structure M, in symbols M |= φ, if and only if it is true in the A-structure $M_{I,A}$ for all assignments A. Finally, the formula φ is valid, which is denoted |= φ, if and only if φ is true in each structure M. If there is no A-structure in which a formula φ is true, then the formula φ is unsatisfiable. Clearly, a formula is valid if and only if its negation is unsatisfiable. These notions extend to sets of formulae: for a set Γ of formulae, I, A |= Γ if and only if I, A |= φ for each φ ∈ Γ. The same convention is obeyed in the cases of M |= Γ and |= Γ.
Definition 3.12 (Logical consequence. Entailment) A set Γ of formulae entails a set Δ of formulae (Δ is a logical consequence of Γ) if and only if for each structure M, if M |= Γ, then M |= ψ for some formula ψ ∈ Δ. If Δ = {φ}, then we say that φ is entailed by Γ.
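Over a finite domain, the clauses of Definition 3.10 can be executed directly. The following sketch is illustrative only and makes several assumptions not found in the text: formulae are nested tuples with hypothetical tags, function symbols and identity are omitted, the interpretation I is a dictionary mapping constants to elements and relational symbols to sets of tuples, and the assignment A is a dictionary from variable names to elements. The update {**A, x: d} plays the role of the substitution σ(i, d)(A).

```python
def value(term, I, A):
    """A variable gets its value from the assignment A, a constant from I."""
    return A[term] if term in A else I[term]


def satisfies(phi, D, I, A):
    """Decide I, A |= phi over a finite domain D, following Definition 3.10."""
    tag = phi[0]
    if tag == "atom":                       # clause (i)
        _, name, terms = phi
        return tuple(value(t, I, A) for t in terms) in I[name]
    if tag == "or":                         # clause (iv)
        return satisfies(phi[1], D, I, A) or satisfies(phi[2], D, I, A)
    if tag == "and":                        # clause (v)
        return satisfies(phi[1], D, I, A) and satisfies(phi[2], D, I, A)
    if tag == "not":                        # clause (vi)
        return not satisfies(phi[1], D, I, A)
    if tag == "imp":                        # clause (vii)
        return (not satisfies(phi[1], D, I, A)) or satisfies(phi[2], D, I, A)
    if tag == "forall":                     # clause (viii)
        _, x, body = phi
        return all(satisfies(body, D, I, {**A, x: d}) for d in D)
    if tag == "exists":                     # clause (ix)
        _, x, body = phi
        return any(satisfies(body, D, I, {**A, x: d}) for d in D)
    raise ValueError(f"unknown tag: {tag}")


# A small structure for the examples of Sect. 3.1 (all names are illustrative).
D = {"john", "rex", "tom"}
I = {
    "John": "john",                                   # constant symbol
    "animal": {("rex",), ("tom",)},                   # unary relation
    "loves": {("john", "rex"), ("john", "tom")},      # binary relation
    "unicorn": set(),                                 # empty unary relation
}
loves_all = ("forall", "x",
             ("imp", ("atom", "animal", ("x",)), ("atom", "loves", ("John", "x"))))
no_unicorns = ("not", ("exists", "x", ("atom", "unicorn", ("x",))))

print(satisfies(loves_all, D, I, {}))     # prints True
print(satisfies(no_unicorns, D, I, {}))   # prints True
```

Both formulae are sentences, so the empty assignment suffices; removing the pair ("john", "tom") from the relation "loves" makes the first sentence false in the modified structure.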
3.3 Natural Deduction: Sequents
Sequents, introduced in Gentzen [2], responded to the call for a formalization of natural deduction, whose first formalization was due to Jaśkowski [3, 4]. In case Γ and Δ are finite sets of formulae, $\Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_k\}$, $\Delta = \{\delta_1, \delta_2, \ldots, \delta_l\}$, we call the ordered pair ⟨Γ, Δ⟩ a sequent.
Definition 3.13 The meaning of the pair is: if all formulae in Γ are true in an interpretation M, then some formula in Δ is true in the interpretation M; this understanding of the pair ⟨Γ, Δ⟩ is stressed by the notation: we denote sequents by the symbol Γ ⇒ Δ, and the truth condition in any structure M is $M \models \bigwedge_{i=1}^{k} \gamma_i \supset \bigvee_{j=1}^{l} \delta_j$. This condition is equivalent to saying that Δ is a logical consequence of Γ.
Before we list the rules, we comment on some limiting cases involving empty sets of formulae.
(i) in case of a sequent ∅ ⇒ Δ (written in Gentzen [2] as ⇒ Δ), the conjunction of the empty set of formulae is valid, hence the sequent is valid if at least one formula in Δ is valid;
(ii) in case of a sequent Γ ⇒ (we already apply the Gentzen notation), the disjunction of the empty set Δ is false, hence the sequent is valid if and only if at least one formula in Γ is invalid;
(iii) in the extreme case, the sequent ⇒ is unsatisfiable, as there is neither a valid formula in Δ nor an invalid formula in Γ.
We may now list the rules of the sequent calculus pertaining to predicate logic (i.e., without function and identity symbols). The reader may check their validity by means of our definition of a sequent as an implication. Each rule is traditionally presented as a fraction with premisses in the numerator and consequents in the denominator. We know from Chap. 1 that the connectives ∨, ¬ define all other connectives, and, similarly, by the definition of semantics, the equivalence ∀x.φ ≡ ¬∃x.¬φ is valid; thus, we need only, e.g., the quantifier ∃. Accordingly, the rules below employ only the connectives ∨, ¬ and the quantifier ∃. In case we want to prove a formula with other connectives or quantifiers, we use replacement rules.
Definition 3.14 (Rules for the sequent calculus) Of the several modifications of the original Gentzen system (cf., e.g., Indrzejczak [5]), we adopt the system $G_0$ plus the quantifier rules in Smullyan [6]. First we recall the propositional part.
(Rule 1) (the axiom) $\Gamma, \phi \Rightarrow \phi, \Delta$;
(Rule 2) (introduction of right negation) $\dfrac{\Gamma, \phi \Rightarrow \Delta}{\Gamma \Rightarrow \Delta, \neg\phi}$;
(Rule 3) (introduction of left negation) $\dfrac{\Gamma \Rightarrow \Delta, \phi}{\Gamma, \neg\phi \Rightarrow \Delta}$;
(Rule 4) (introduction of right disjunction) $\dfrac{\Gamma \Rightarrow \Delta, \phi, \psi}{\Gamma \Rightarrow \Delta, \phi \vee \psi}$;
(Rule 5) (introduction of left disjunction) $\dfrac{\Gamma, \phi \Rightarrow \Delta \qquad \Gamma, \psi \Rightarrow \Delta}{\Gamma, \phi \vee \psi \Rightarrow \Delta}$;
(Rule 6) (introduction of right conjunction) $\dfrac{\Gamma \Rightarrow \Delta, \phi \qquad \Gamma \Rightarrow \Delta, \psi}{\Gamma \Rightarrow \Delta, \phi \wedge \psi}$;
(Rule 7) (introduction of left conjunction) $\dfrac{\Gamma, \phi, \psi \Rightarrow \Delta}{\Gamma, \phi \wedge \psi \Rightarrow \Delta}$;
(Rule 8) (introduction of right implication) $\dfrac{\Gamma, \phi \Rightarrow \Delta, \psi}{\Gamma \Rightarrow \Delta, \phi \supset \psi}$;
(Rule 9) (introduction of left implication) $\dfrac{\Gamma \Rightarrow \Delta, \phi \qquad \Gamma, \psi \Rightarrow \Delta}{\Gamma, \phi \supset \psi \Rightarrow \Delta}$.
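Read from the consequent up to the premisses, Rules 1–9 give a terminating proof search for propositional sequents, since every backward step removes one connective. The sketch below illustrates this reading under stated assumptions: formulae are nested tuples with the hypothetical tags "atom", "not", "and", "or", "imp", sequents are pairs of Python lists, and the function name provable is an invention of this sketch. It covers the propositional rules only, not the quantifier rules.

```python
def provable(gamma, delta):
    """Backward proof search for the sequent gamma => delta using Rules 1-9."""
    gamma, delta = list(gamma), list(delta)
    if any(f in delta for f in gamma):            # Rule 1 (axiom)
        return True
    for i, f in enumerate(gamma):                 # rules for the left side
        tag, rest = f[0], gamma[:i] + gamma[i + 1:]
        if tag == "atom":
            continue
        if tag == "not":                          # Rule 3 read upwards
            return provable(rest, delta + [f[1]])
        if tag == "and":                          # Rule 7
            return provable(rest + [f[1], f[2]], delta)
        if tag == "or":                           # Rule 5 (two premisses)
            return (provable(rest + [f[1]], delta)
                    and provable(rest + [f[2]], delta))
        if tag == "imp":                          # Rule 9 (two premisses)
            return (provable(rest, delta + [f[1]])
                    and provable(rest + [f[2]], delta))
        raise ValueError(f"unknown connective: {tag}")
    for i, f in enumerate(delta):                 # rules for the right side
        tag, rest = f[0], delta[:i] + delta[i + 1:]
        if tag == "atom":
            continue
        if tag == "not":                          # Rule 2
            return provable(gamma + [f[1]], rest)
        if tag == "or":                           # Rule 4
            return provable(gamma, rest + [f[1], f[2]])
        if tag == "and":                          # Rule 6 (two premisses)
            return (provable(gamma, rest + [f[1]])
                    and provable(gamma, rest + [f[2]]))
        if tag == "imp":                          # Rule 8
            return provable(gamma + [f[1]], rest + [f[2]])
        raise ValueError(f"unknown connective: {tag}")
    return False                                  # only atoms left, no axiom


p, q = ("atom", "p"), ("atom", "q")
peirce = ("imp", ("imp", ("imp", p, q), p), p)    # ((p ⊃ q) ⊃ p) ⊃ p
print(provable([], [peirce]))                     # prints True
print(provable([], [("imp", p, q)]))              # prints False
```

Committing to the first compound formula found is safe in this sketch because each of the propositional rules is invertible: the consequent is valid exactly when all premisses are.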
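Now, we add rules for quantification.
(Rule 10) (introduction of left universal quantifier) $\dfrac{\Gamma, \phi(d) \Rightarrow \Delta}{\Gamma, \forall x.\phi(x) \Rightarrow \Delta}$;
(Rule 11) (introduction of right universal quantifier) $\dfrac{\Gamma \Rightarrow \Delta, \phi(d)}{\Gamma \Rightarrow \Delta, \forall x.\phi(x)}$ under proviso: d has no occurrence in the consequent;
(Rule 12) (introduction of left existential quantifier) $\dfrac{\Gamma, \phi(d) \Rightarrow \Delta}{\Gamma, \exists x.\phi(x) \Rightarrow \Delta}$ under proviso: d has no occurrence in the consequent;
(Rule 13) (introduction of right existential quantifier) $\dfrac{\Gamma \Rightarrow \Delta, \phi(d)}{\Gamma \Rightarrow \Delta, \exists x.\phi(x)}$.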
Example 3.2 We prove the formula ∀x.α(x) ⊃ ∃x.α(x).
(i) α(d) ⇒ α(d): axiom (Rule 1);
(ii) ∀x.α(x) ⇒ α(d): introduction of left universal quantifier (Rule 10);
(iii) ∀x.α(x) ⇒ ∃x.α(x): introduction of right existential quantifier (Rule 13);
(iv) ⇒ ∀x.α(x) ⊃ ∃x.α(x): introduction of right implication (Rule 8).
A proof of a formula φ is a sequence $S_0, S_1, \ldots, S_k$ of sequents such that $S_0$ is an axiom, $S_{i+1}$ for $0 \le i < k$ is obtained from $S_i$ by one of the rules, and $S_k$ is ⇒ φ. A formula φ is provable if and only if it has a proof. The sequent calculus is sound: the axiom is valid, validity is preserved by each rule from the premiss to the consequent, and hence it reaches the result φ given in the form ⇒ φ. Hence, each provable formula is valid. We postpone a discussion of completeness of the sequent calculus to Sect. 3.6.
In Gentzen [2], some additional rules were proposed, like thinning:
$\dfrac{\Gamma \Rightarrow \Delta}{\Gamma, \phi \Rightarrow \Delta}, \qquad \dfrac{\Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta, \phi}.$
The Gentzen ‘Hauptsatz’ theorem concerns the elimination of the Cut Rule:
$\dfrac{\Gamma_1, \phi \Rightarrow \Delta_1 \qquad \Gamma_2 \Rightarrow \phi, \Delta_2}{\Gamma_1, \Gamma_2 \Rightarrow \Delta_1, \Delta_2};$
each formula proved with usage of the Cut Rule can be proved without using it. We return to this topic in Sect. 3.5.
3.4 Natural Deduction: Diagrams of Formulae
We return now to Chap. 2, in which we presented the ‘diagrammatic method’ for validity checking. We recall this idea for the case of predicate logic, along the lines of Rasiowa and Sikorski [7]. Diagrams of formulae are constructed on the same ideas as sequents: they are based on valid inference rules, but they apply them in the opposite way. Each diagram is a fraction in which premisses are below the line and consequents are above the line.
Definition 3.15 (Diagrams of formulae) Each diagram is either of the form $\dfrac{\Gamma}{\Gamma_0}$ (type (β)) or $\dfrac{\Gamma}{\Gamma_0 ; \Gamma_1}$ (type (α)). In case of type (β), the rule $\Gamma_0 \supset \Gamma$, and in case of type (α), the rule $\Gamma_0 \wedge \Gamma_1 \supset \Gamma$, are valid inference rules. As with propositional diagrams, the comma ‘,’ denotes disjunction and the semicolon ‘;’ denotes conjunction. We suppose that individual variables and constants are ordered into an infinite sequence σ without repetitions. With these provisos, diagrams are the following:
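(∨)(1) $\dfrac{\Gamma_0, \alpha \vee \beta, \Gamma_1}{\Gamma_0, \alpha, \beta, \Gamma_1}$
(¬∨)(2) $\dfrac{\Gamma_0, \neg(\alpha \vee \beta), \Gamma_1}{\Gamma_0, \neg\alpha, \Gamma_1 ; \ \Gamma_0, \neg\beta, \Gamma_1}$
(∧)(3) $\dfrac{\Gamma_0, \alpha \wedge \beta, \Gamma_1}{\Gamma_0, \alpha, \Gamma_1 ; \ \Gamma_0, \beta, \Gamma_1}$
(¬∧)(4) $\dfrac{\Gamma_0, \neg(\alpha \wedge \beta), \Gamma_1}{\Gamma_0, \neg\alpha, \neg\beta, \Gamma_1}$
(⊃)(5) $\dfrac{\Gamma_0, \alpha \supset \beta, \Gamma_1}{\Gamma_0, \neg\alpha, \beta, \Gamma_1}$
(¬⊃)(6) $\dfrac{\Gamma_0, \neg(\alpha \supset \beta), \Gamma_1}{\Gamma_0, \alpha, \Gamma_1 ; \ \Gamma_0, \neg\beta, \Gamma_1}$
(¬¬)(7) $\dfrac{\Gamma_0, \neg\neg\alpha, \Gamma_1}{\Gamma_0, \alpha, \Gamma_1}$
(∃)(8) $\dfrac{\Gamma_0, \exists x.\alpha(x), \Gamma_1}{\Gamma_0, \alpha(c), \Gamma_1, \exists x.\alpha(x)}$
Condition for (8): c is a term in the sequence σ with the property that the formula α(c) does not appear at any earlier step of decomposition, including the consequent of the currently considered rule.
(¬∃)(9) $\dfrac{\Gamma_0, \neg\exists x.\alpha(x), \Gamma_1}{\Gamma_0, \forall x.\neg\alpha(x), \Gamma_1}$
(∀)(10) $\dfrac{\Gamma_0, \forall x.\alpha(x), \Gamma_1}{\Gamma_0, \alpha(y), \Gamma_1}$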
Condition for (10): y is a variable which has no occurrence in any formula in the consequent.
(¬∀)(11) $\dfrac{\Gamma_0, \neg\forall x.\alpha(x), \Gamma_1}{\Gamma_0, \exists x.\neg\alpha(x), \Gamma_1}$
Fig. 3.2 A diagram of decomposition
A formula is indecomposable if and only if it is a clause, i.e., a disjunction of literals, i.e., atomic formulae and their negations. A sequence Γ is final in case it consists of indecomposable formulae, and it is valid in case it contains an indecomposable formula along with its negation. Due to the
valid disjunction p ∨ ¬p, the disjunction Γ is valid. As with propositional logic, this disjunction is provable. The procedure for a formula φ begins with the formula φ at the root and then a tree is grown: each time a type β is met, the tree branches according to the distributive law. There are two outcomes possible: either the obtained tree is finite and all formulae at the nodes are indecomposable, or the tree is infinite; in the latter case the formula φ is unsatisfiable, in the former case the result is a conjunction of disjunctions of indecomposable formulae. When all disjunctions are valid, the resulting formula is valid; moreover, it is provable.
Theorem 3.2 A formula φ is a valid formula of predicate logic if and only if the diagram for φ is finite and all conjuncts are valid.
As each of them is a substitution of atomic formulae and their negations into a tautology, all conjuncts are provable (see Chap. 2). We obtain
Theorem 3.3 (A form of completeness theorem for predicate logic) For each valid formula of predicate logic, the formula resulting from decomposition is provable.
Example 3.3 Consider Fig. 3.2, which represents the diagram for the formula φ : ∀x.(P(x) ⊃ (Q(x) ⊃ P(x))). By rule (10) and two applications of rule (5), the final sequence is ¬P(y), ¬Q(y), P(y), i.e., the disjunction ¬P(y) ∨ ¬Q(y) ∨ P(y), which proves validity of φ.
Theorem 3.4 The following are valid formulae of predicate logic.
A. Formulae involving negation
(1) ∀x.φ(x) ⊃ ¬∃x.¬φ(x) (duality for quantifiers, De Morgan laws);
(2) ¬∀x.φ(x) ⊃ ∃x.¬φ(x) (duality for quantifiers, De Morgan laws);
(3) ¬∃x.φ(x) ⊃ ∀x.¬φ(x) (duality for quantifiers, De Morgan laws);
(4) ∃x.φ(x) ⊃ ¬∀x.¬φ(x) (duality for quantifiers, De Morgan laws).
B. Formulae involving implication
In formulae (5)–(8), ψ is a formula in which the individual variable x is not free. For the sake of conciseness, we present these formulae as equivalences, so each equivalence replaces the forth and back implications.
(5) ∀x.(φ(x) ⊃ ψ) ≡ (∃x.φ(x) ⊃ ψ);
(6) ∀x.(ψ ⊃ φ(x)) ≡ (ψ ⊃ ∀x.φ(x));
(7) ∃x.(φ(x) ⊃ ψ) ≡ (∀x.φ(x) ⊃ ψ);
(8) ∃x.(ψ ⊃ φ(x)) ≡ (ψ ⊃ ∃x.φ(x));
(9) ∀x.(φ(x) ⊃ ψ(x)) ⊃ (∀x.φ(x) ⊃ ∀x.ψ(x));
(10) ∀x.(φ(x) ⊃ ψ(x)) ⊃ (∃x.φ(x) ⊃ ∃x.ψ(x)).
Formulae (5)–(8) allow for pulling out quantifier symbols.
C. Formulae involving disjunctions and conjunctions
In formulae (11)–(14), the individual variable x is not free in ψ. We state equivalences which represent the forth and back implications. These formulae also allow for pulling out quantifier symbols.
(11) ∀x.(ψ ∧ φ(x)) ≡ (ψ ∧ ∀x.φ(x));
(12) ∀x.(ψ ∨ φ(x)) ≡ (ψ ∨ ∀x.φ(x));
(13) ∃x.(ψ ∧ φ(x)) ≡ (ψ ∧ ∃x.φ(x));
(14) ∃x.(ψ ∨ φ(x)) ≡ (ψ ∨ ∃x.φ(x)).
D. Distributivity laws
(15) ∀x.(φ(x) ∧ ψ(x)) ≡ (∀x.φ(x) ∧ ∀x.ψ(x));
(16) ∃x.(φ(x) ∨ ψ(x)) ≡ (∃x.φ(x) ∨ ∃x.ψ(x));
(17) (∀x.φ(x) ∨ ∀x.ψ(x)) ⊃ ∀x.(φ(x) ∨ ψ(x));
(18) ∃x.(φ(x) ∧ ψ(x)) ⊃ (∃x.φ(x) ∧ ∃x.ψ(x)).
E. Generalization and specification rules (19) ∀x.φ(x) ⊃ φ(t), where t is an arbitrary term; (20) (Gen) φ(x) ⊃ ∀x.φ(x).
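A brute-force search over small finite structures cannot establish validity, but it can refute a candidate law and serves as a quick sanity check of the one-directional formulae above. The sketch below is illustrative only: φ and ψ are represented by subsets P and Q of a finite non-empty domain {0, ..., n−1}, and the names subsets, find_countermodel, law_16, law_17 and converse_17 are assumptions of this sketch.

```python
from itertools import combinations, product


def subsets(domain):
    """All subsets of a finite domain, as Python sets."""
    items = list(domain)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def find_countermodel(law, max_size=3):
    """Search domains of size 1..max_size for (n, P, Q) falsifying the law."""
    for n in range(1, max_size + 1):
        domain = range(n)
        for P, Q in product(subsets(domain), repeat=2):
            if not law(domain, P, Q):
                return n, P, Q
    return None                               # no countermodel found (not a proof)


def law_16(D, P, Q):
    # ∃x.(φ(x) ∨ ψ(x)) ≡ (∃x.φ(x) ∨ ∃x.ψ(x))
    left = any(x in P or x in Q for x in D)
    right = any(x in P for x in D) or any(x in Q for x in D)
    return left == right


def law_17(D, P, Q):
    # (∀x.φ(x) ∨ ∀x.ψ(x)) ⊃ ∀x.(φ(x) ∨ ψ(x))
    left = all(x in P for x in D) or all(x in Q for x in D)
    right = all(x in P or x in Q for x in D)
    return (not left) or right


def converse_17(D, P, Q):
    # ∀x.(φ(x) ∨ ψ(x)) ⊃ (∀x.φ(x) ∨ ∀x.ψ(x)), which is not valid
    left = all(x in P or x in Q for x in D)
    right = all(x in P for x in D) or all(x in Q for x in D)
    return (not left) or right


print(find_countermodel(law_16))        # prints None
print(find_countermodel(law_17))        # prints None
print(find_countermodel(converse_17))   # prints (2, {0}, {1})
```

The countermodel for the converse of (17) is the two-element domain in which φ holds of 0 only and ψ holds of 1 only, which shows why (17) is stated as an implication and not as an equivalence.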
3.5 Natural Deduction: Tableaux
As with propositional logic, tableaux decompose signed formulae into a tree with sub-formulae in its nodes. They closely resemble diagrams in their idea, but differ in the details. We extend the set of types of formulae beyond the propositional types: (α) is the type of conjunctive formulae p ∧ q, (β) is the type of disjunctive formulae p ∨ q; we add the type (γ) of universally quantified formulae ∀x.φ, ¬∃x.φ and the type (δ) of existentially quantified formulae ∃x.φ, ¬∀x.φ. The symbol φ(x/d) will denote the result of uniform substitution of a constant d ∈ D (D is the domain of a structure) for the individual variable x free in φ. In the following tableau rules, the semicolon ‘;’ means conjunction and the comma ‘,’ means disjunction; in the definition of types, as well as in tableau analysis, we follow the analysis in Smullyan [6]. The tableaux we discuss are called analytic in Smullyan [6]; they follow the semantic tableaux in Beth [8] and the ‘block’ tableaux in Hintikka [9]. The formulae in analytic tableaux are signed, i.e., as in the propositional version, prefixed with T meaning ‘true’ or F meaning ‘false’.
Definition 3.16 (Tableau rules)
A. A general form of rules
(i) For type (α): $\dfrac{\phi}{\phi_1 ; \phi_2}$;
(ii) For type (β): $\dfrac{\phi}{\phi_1 , \phi_2}$;
(iii) For type (γ): $\dfrac{\phi}{\phi(a)}$;
(iv) For type (δ): $\dfrac{\phi}{\phi(d)}$, on condition that d is a constant not occurring in the preceding steps of tableau development.
B. Explicit tableau rules for signed formulae of types (γ), (δ)
(v) For type (γ): $\dfrac{T(\forall x.\phi)}{T(\phi(x/d))}$; $\dfrac{F(\exists x.\phi)}{F(\phi(x/d))}$;
(vi) For type (δ): $\dfrac{T(\exists x.\phi)}{T(\phi(x/d))}$; $\dfrac{F(\forall x.\phi)}{F(\phi(x/d))}$
on condition in (vi) that in both cases d has not been employed before. In Fig. 3.3, we present diagrams for signed formulae of types (γ) and (δ).
Fig. 3.3 Types of decomposition
Example 3.4 Consider the formula ξ : [∀x.(φ(x) ⊃ ψ(x)) ⊃ (∀x.φ(x) ⊃ ∀x.ψ(x))]. We show in Fig. 3.4 the tableau for the signed formula Fξ. Please observe that both branches are closed: the left branch contains the contradictory atomic formulae P(a), ¬P(a), and the right branch contains the contradictory atomic formulae Q(a), ¬Q(a). This means that ξ is valid (we know that, actually, this is the valid formula (9)). Contrariwise, Fig. 3.5 for the signed formula F[¬∃x.((¬P(x) ⊃ P(x)) ⊃ P(x))] presents a tableau all three of whose branches are open, i.e., not contradictory, which points to satisfying structures. In effect, the formula [∃x.((¬P(x) ⊃ P(x)) ⊃ P(x))] is valid.
Fig. 3.4 A closed predicate tableau
Fig. 3.5 An open predicate tableau
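The tableau rules for signed formulae can be sketched as a small recursive procedure. What follows is an illustration under explicit assumptions, not the procedure of Smullyan [6]: formulae are nested tuples with hypothetical tags, signed formulae are pairs (sign, formula), fresh constants are named d0, d1, ... (the input is assumed not to use such names), type (γ) formulae are instantiated with every constant introduced so far, and a step budget named fuel cuts off potentially infinite developments, in which case the tableau is reported as not closed.

```python
def flip(sign):
    return "F" if sign == "T" else "T"


def subst(phi, x, c):
    """phi(x/c): replace the free occurrences of variable x by constant c."""
    tag = phi[0]
    if tag == "atom":
        return ("atom", phi[1], tuple(c if a == x else a for a in phi[2]))
    if tag == "not":
        return ("not", subst(phi[1], x, c))
    if tag in ("and", "or", "imp"):
        return (tag, subst(phi[1], x, c), subst(phi[2], x, c))
    # quantifier: x is no longer free below a quantifier that rebinds it
    return phi if phi[1] == x else (tag, phi[1], subst(phi[2], x, c))


GAMMA = (("forall", "T"), ("exists", "F"))        # signed formulae of type (gamma)


def closes(branch, consts=(), used=frozenset(), fuel=50):
    """True if the tableau grown from `branch` closes within `fuel` steps."""
    signed = set(branch)
    if any((flip(s), f) in signed for s, f in branch):
        return True                                # T f and F f on the branch
    if fuel == 0:
        return False                               # budget exhausted: not closed
    for s, f in branch:                            # types alpha, beta, delta, negation
        tag = f[0]
        if tag == "atom" or (tag, s) in GAMMA:
            continue
        rest = [sf for sf in branch if sf != (s, f)]
        if tag == "not":
            return closes(rest + [(flip(s), f[1])], consts, used, fuel - 1)
        if tag in ("forall", "exists"):            # type (delta): fresh constant
            d = f"d{len(consts)}"
            inst = (s, subst(f[2], f[1], d))
            return closes(rest + [inst], consts + (d,), used, fuel - 1)
        first = (flip(s), f[1]) if tag == "imp" else (s, f[1])
        parts = [first, (s, f[2])]
        if (s == "T") == (tag == "and"):           # type (alpha): both components
            return closes(rest + parts, consts, used, fuel - 1)
        return all(closes(rest + [p], consts, used, fuel - 1) for p in parts)
    pool = consts or ("d0",)                       # type (gamma): all known constants
    new, seen = [], set(used)
    for s, f in branch:
        if (f[0], s) in GAMMA:
            for c in pool:
                if (s, f, c) not in seen:
                    seen.add((s, f, c))
                    new.append((s, subst(f[2], f[1], c)))
    if not new:
        return False                               # saturated open branch
    return closes(branch + new, pool, frozenset(seen), fuel - 1)


P, Q = ("atom", "P", ("x",)), ("atom", "Q", ("x",))
xi = ("imp", ("forall", "x", ("imp", P, Q)),
      ("imp", ("forall", "x", P), ("forall", "x", Q)))
print(closes([("F", xi)]))                                               # prints True
print(closes([("F", ("imp", ("exists", "x", P), ("forall", "x", P)))]))  # prints False
```

The first call reproduces the closed tableau of Example 3.4 with P, Q in place of φ, ψ: after the fresh constant is introduced for F(∀x.Q(x)), both type (γ) formulae are instantiated with it and the two resulting branches close. The second call ends in a saturated open branch containing T P(d0) and F P(d1), which describes a two-element structure falsifying ∃x.P(x) ⊃ ∀x.P(x).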
3.6 Meta-Theory of Predicate Logic. Part I
We pursue topics of consistency, compactness, completeness, and soundness, not necessarily in this order, using tableaux. We have met a form of completeness when discussing diagrams. Yet, we use tableaux in a proof of completeness.
We address completeness, compactness, the countable model property, and consistency first. Tableaux prove to be a very convenient milieu for proving those properties of predicate logic.
3.6.1 Tableau-Completeness of Predicate Logic
Definition 3.17 (Hintikka sets for predicate logic) A set Γ of formulae is a Hintikka set for a structure M = (D, I) if and only if the following conditions are fulfilled:
(H0) The set Γ does not contain simultaneously an atomic formula and its negation; it does not contain ⊥. In the realm of signed formulae this means that Γ cannot contain both T P and F P for any atomic formula P;
(H1) If Γ contains a formula φ : φ1 ∧ φ2 of conjunctive type (α), then it contains both formulae φ1 and φ2;
(H2) If Γ contains a formula φ : φ1 ∨ φ2 of type (β), then it contains either φ1 or φ2;
(H3) If Γ contains a formula φ of type (γ), then it contains the formula φ(d) for each d ∈ D;
(H4) If Γ contains a formula φ of type (δ), then it contains the formula φ(d) for an element d ∈ D;
(H5) If Γ contains a formula ¬¬φ, then Γ contains the formula φ.
It follows by structural induction, as in the case of propositional logic, that Γ is satisfied in M.
Theorem 3.5 Each Hintikka set for the structure M is satisfied in M.
Proof As announced, the proof goes by structural induction. For atomic formulae, we assign to P the value 1 (truth) if T P ∈ Γ, and the value 0 (falsity) if F P ∈ Γ; otherwise, if {T P, F P} ∩ Γ = ∅, we assign a truth value arbitrarily. Consider now formulae of size > 0. If φ is φ1 ∧ φ2, then φ1, φ2 ∈ Γ by (H1), and, by the hypothesis of induction, φ1, φ2 are satisfied in M; hence, φ is satisfied in M. The same argument works for φ which is φ1 ∨ φ2, with (H2) in place of (H1). If φ ∈ Γ is of the form ∀x.ψ (i.e., of type (γ)), then, by (H3), ψ(d) ∈ Γ for each d ∈ D; hence, by the hypothesis of induction, each ψ(d) is satisfied in M, which is exactly what satisfaction of φ in M requires. Finally, if φ ∈ Γ is ∃x.ψ, then, by (H4), ψ(d) ∈ Γ for some d ∈ D, which implies by the hypothesis of induction that ψ(d) is satisfied in M, witnessing satisfaction of φ in M. Similarly for (H5).
The point is now in considering the tableau for a formula φ with validity of φ in mind. Let us observe that if a tableau for a formula Fφ contains an open branch, then (H0), (H1) and (H2) are satisfied for some open extension; the same holds for (H4), (H5), and only (H3) requires more attention, as it can lead to an open infinite branch. With some judicious choice of a strategy for expanding nodes, one can
satisfy (H3) in order to make some open extension, possibly infinite, a Hintikka set, i.e., satisfiable. It follows that if a formula φ is valid, then the tableau for Fφ is closed, i.e., all branches are closed, and the closure of each branch must be detected in a finite number of steps, hence the tableau is finite. A theorem obtains.

Theorem 3.6 (Tableau-completeness) If a formula φ is valid, then the tableau for Fφ is finite and closed, hence φ is tableau-provable.

Countable model property: the Löwenheim-Skolem theorem; compactness property

Another observation related to the above discussion is that an open Hintikka branch witnesses the satisfiability of the formula and this branch is countable, which implies the Löwenheim theorem (Löwenheim [10]).

Theorem 3.7 (The Löwenheim theorem) If a formula is satisfiable in an interpretation, then it is satisfiable in an interpretation with a countable domain.

The Löwenheim theorem was generalised in Skolem [11] to the following result.

Theorem 3.8 (The Skolem-Löwenheim theorem) Each countable set of formulae that is jointly satisfiable in a common interpretation is jointly satisfiable in a common interpretation over a countable domain.

Proof For a countable set Γ of formulae {φn : n ≥ 1}, suppose that the formulae are listed in that order, and modify the tableau by beginning it with φ1. After φj for j < n have been used, attach φn to the end of each open branch and continue. By induction, a tableau is built. Were the tableau closed, it would prove joint unsatisfiability of a finite set of formulae from Γ, a contradiction. Therefore, there exists an open branch proving joint satisfiability of formulae from Γ. The branch is countable.

Corollary 3.1 (Compactness property) A countable set Γ of formulae is jointly satisfiable if and only if each finite subset Δ of it is jointly satisfiable.
Proof The condition that each finite subset of Γ is jointly satisfiable implies that no tableau built from Γ as in the proof of Theorem 3.8 can close; hence, by the König Theorem 1.43, there exists in it an open infinite branch witnessing satisfiability of Γ. The converse is manifest.

Consistency: existence of interpretations

We cannot analyse consistency as in the case of propositional logic: as yet, we have not introduced any syntactic proof mechanism. Still, we may define consistency by means of properties of consistent sets, established in Chap. 2, and well suited for the tableau paradigm.
Definition 3.18 (Consistent sets) A set Γ of formulae is consistent if and only if it satisfies the following conditions:
(i) for no atomic formula P, Γ contains both T P and F P; Γ does not contain ⊥;
(ii) if a formula φ : φ1 ∧ φ2 of type (α) is in Γ, then the set Γ ∪ {φ1, φ2} is consistent;
(iii) if a formula φ : φ1 ∨ φ2 of type (β) is in Γ, then either Γ ∪ {φ1} is consistent or Γ ∪ {φ2} is consistent;
(iv) if a formula φ of type (γ) is in Γ, then the set Γ ∪ {φ(x/d)} is consistent;
(v) if a formula φ of type (δ) is in Γ, then the set Γ ∪ {φ(x/d)} is consistent, where d has no occurrence in Γ.
By properties (i)–(v), any tableau for the set Γ must contain an open branch. By a repetition of arguments in Theorem 3.8, we conclude that

Theorem 3.9 (Consistency implies satisfiability) Any consistent set of formulae has an interpretation with a countable domain, which can be stated as follows: each consistent set is satisfiable.

We know from Chap. 2 that this implies strong completeness.

Theorem 3.10 Predicate logic is tableau-strongly complete.
3.7 Analytic Tableaux Versus Sequents

From the definition of validity of sequents it follows that a sequent S : {γ1, γ2, ..., γk} ⇒ {δ1, δ2, ..., δm} is valid if and only if the set of signed formulae {T γ1, T γ2, ..., T γk, Fδ1, Fδ2, ..., Fδm} is tableau-unsatisfiable. Therefore a proof of unsatisfiability of the set {T γ1, T γ2, ..., T γk, Fδ1, Fδ2, ..., Fδm} supplies the proof of validity of the sequent S. The following result obtains by tableau-completeness.

Theorem 3.11 (Completeness of sequent calculus) Sequent calculus for predicate logic is complete: each valid sequent is provable.

We can render sequent rules in tableaux, as the following examples show. We denote as TΓ the sequence T γ1, ..., T γk and as FΔ the sequence Fδ1, ..., Fδm (Table 3.1).

Tableau-completeness supplies also a short argument in favor of the Hauptsatz in Smullyan [6]. Gentzen's Hauptsatz concerns the Cut Rule:

$$\frac{\Gamma_2 \Rightarrow \varphi, \Delta_2 \qquad \Gamma_1, \varphi \Rightarrow \Delta_1}{\Gamma_1, \Gamma_2 \Rightarrow \Delta_1, \Delta_2}$$
Table 3.1 Sequent rules rendered as tableau rules
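Sequent rules | As tableau rules
Γ, φ ⇒ Δ, φ | TΓ, Tφ, FΔ, Fφ
(Γ, φ, ψ ⇒ Δ) / (Γ, φ ∧ ψ ⇒ Δ) | (TΓ, Tφ, Tψ, FΔ) / (TΓ, T(φ ∧ ψ), FΔ)
(Γ ⇒ Δ, φ, ψ) / (Γ ⇒ Δ, φ ∨ ψ) | (TΓ, FΔ, Fφ, Fψ) / (TΓ, FΔ, F(φ ∨ ψ))
In each rule, the entry before the slash is the upper line and the entry after the slash is the lower line.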
and the possibility of proving sequents without the Cut Rule. The Hauptsatz states that each sequent provable with use of the Cut Rule is provable without it. The argument for the Hauptsatz, formulated in the framework of the tableau counterpart to sequents, runs as follows: suppose that tableaux for Γ, φ and for Γ, ¬φ are closed; hence, both Γ, φ and Γ, ¬φ are unsatisfiable, which forces the conclusion that Γ is unsatisfiable, hence, by tableau-completeness, the tableau for Γ is closed. This means that φ is eliminable. In (Gentzen, op.cit.), a constructive proof of the Hauptsatz is given, by means of a primitive recursive function which estimates the complexity of the closed tableau for Γ in terms of the complexities of closed tableaux for Γ, φ and for Γ, ¬φ. Smullyan [6] gives the proof along these lines, which encompasses the Hauptsatz for tableaux and sequents.
3.8 Normal Forms

Laws of predicate logic allow for presentation of formulae in some specialized forms. As with propositional logic, we meet in FO the negative-normal form, CNF and DNF, and some forms which we discuss later on: the prenex normal form, Skolem normal forms and Herbrand normal forms. A good introduction to this topic is the negative-normal form. We recall that a literal is either an atomic formula or its negation.

Definition 3.19 (Negative-normal forms) A formula φ is in a negative-normal form if and only if negation signs are applied to atomic formulae only and the other connectives are ∨ and ∧, along with quantifiers ∀, ∃.

Duality laws of predicate logic secure the existence of a negative-normal form for each formula. Consider, e.g., the formula
φ : ¬(∀x.(ψ(x) ⊃ (ξ(x) ⊃ ψ(x))))
The sequence of transformations
∃x.(¬(¬ψ(x) ∨ ¬ξ(x) ∨ ψ(x)))
∃x.(ψ(x) ∧ ξ(x) ∧ ¬ψ(x))
yields a negative-normal form for φ. Let us observe that negative-normal forms are not defined uniquely.

Definition 3.20 (Renaming) In formulae 3.4(5)–(8), we find sub-formulae ψ in which there are no free occurrences of individual variables; examples of such sub-formulae are sentences like ∀x.ξ(x), ∃y.ξ(y). It is obvious that the truth of sentences does not depend on the variable used, so we are allowed to rename variables in closed formulae in order to apply relevant laws from among 3.4(1)–(20). For instance, consider:
(∗) (∀x.(ξ(x) ∧ ¬φ(c))) ∧ (φ(c) ∨ ∃x.ξ(x))
rename x as y in the first conjunct: (∀y.(ξ(y) ∧ ¬φ(c))) ∧ (φ(c) ∨ ∃x.ξ(x))
pull out ∃x: (∀y.(ξ(y) ∧ ¬φ(c))) ∧ ∃x.(φ(c) ∨ ξ(x))
pull out ∃x once more: ∃x.((∀y.(ξ(y) ∧ ¬φ(c))) ∧ (φ(c) ∨ ξ(x)))
Finally, we obtain:
(∗∗) ∃x.∀y.((ξ(y) ∧ ¬φ(c)) ∧ (φ(c) ∨ ξ(x)))
The form (∗∗) is yet another normal form: the prenex form.

Definition 3.21 (The prenex normal form) A formula φ is in the prenex normal form if φ is Q1 x1 Q2 x2 ... Qk xk.ψ, where the prefix Q1, Q2, ..., Qk contains all quantifier symbols in φ and the matrix ψ is quantifier-free. In other words, all quantifier symbols are pulled out and they form the prefix of φ.

Renaming, laws (1)–(20) of Theorem 3.4 and equivalences allow for transforming each formula into a prenex form. An example is formula (∗∗), the prenex form of formula (∗). A formal proof of the existence of a prenex form can be given by structural induction:
(i) literals are already in prenex form;
(ii) if a connective ◦ is ∨ or ∧ (and they suffice), φ is ψ ◦ ξ and, by the induction hypothesis, ψ, ξ are in prenex forms, then renaming and laws (12)–(15) give a prenex form for φ;
(iii) if φ is ¬ψ and ψ is in a prenex form, e.g., ∀x.ξ(x), then duality laws (1)–(4) transform φ into ∃x.¬ξ(x) and, by the induction hypothesis, ¬ξ(x) is in a prenex form, hence φ is in the prenex form;
(iv) if φ is Qx.ψ and ψ is in a prenex form, then φ is in prenex form.
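The passage to a negative-normal form by duality laws can be sketched in a few lines of Python. This is only an illustration under stated assumptions: formulae are encoded as nested tuples (`'atom'`, `'not'`, `'and'`, `'or'`, `'imp'`, `'forall'`, `'exists'`), an encoding chosen here and not taken from the text, and the names `nnf`, `psi`, `xi`, `phi` are hypothetical.

```python
def nnf(f):
    """Push negations inward until they stand only in front of atoms."""
    op = f[0]
    if op == 'atom':
        return f
    if op == 'imp':                      # a ⊃ b  is rewritten as  ¬a ∨ b
        return ('or', nnf(('not', f[1])), nnf(f[2]))
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ('forall', 'exists'):
        return (op, f[1], nnf(f[2]))
    # remaining case: op == 'not'
    g = f[1]
    if g[0] == 'atom':
        return f
    if g[0] == 'not':                    # double negation
        return nnf(g[1])
    if g[0] == 'imp':                    # ¬(a ⊃ b)  becomes  a ∧ ¬b
        return ('and', nnf(g[1]), nnf(('not', g[2])))
    if g[0] in ('and', 'or'):            # De Morgan laws
        dual = 'or' if g[0] == 'and' else 'and'
        return (dual, nnf(('not', g[1])), nnf(('not', g[2])))
    # duality of quantifiers: ¬∀ becomes ∃¬, ¬∃ becomes ∀¬
    dual = 'exists' if g[0] == 'forall' else 'forall'
    return (dual, g[1], nnf(('not', g[2])))


# The formula ¬∀x.(ψ(x) ⊃ (ξ(x) ⊃ ψ(x))) discussed in this section.
psi = ('atom', 'psi', ('x',))
xi = ('atom', 'xi', ('x',))
phi = ('not', ('forall', 'x', ('imp', psi, ('imp', xi, psi))))
result = nnf(phi)
```

On this input the sketch returns the encoding of ∃x.(ψ(x) ∧ (ξ(x) ∧ ¬ψ(x))), which agrees with the negative-normal form obtained by hand up to bracketing of the conjunction.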
Elimination of existential quantifier symbols from a formula can be effected by means of a technique due to Skolem [12].
Definition 3.22 (Skolem normal forms) For a formula φ in prenex form, a Skolem sequence of quantifier symbols in the prefix of φ is a sequence Sk : ∀x_{i1} ∀x_{i2} ... ∀x_{ik} ∃y. The Skolem function symbol associated with the sequence Sk is f(x_{i1}, x_{i2}, ..., x_{ik}), where the function symbol f has no occurrence in φ, and the Skolem substitution (Skolemization) is the substitution y/f(x_{i1}, x_{i2}, ..., x_{ik}). Particular cases are:
(i) for a formula φ : ∃y.ψ(y): the Skolem function symbol becomes a function of arity 0, i.e., a Skolem constant f, and the formula φ becomes ψ(f);
(ii) for a formula φ : ∀x.∃y.ψ(x, y): the Skolem function symbol is f(x) and φ becomes ∀x.ψ(x, f(x)).
Skolem normal form is a prenex form in which, for all Skolem sequences, existential quantifiers have been replaced with Skolem function symbols. We denote by the symbol ∀FO the set of all formulae in FO in prenex forms whose prefixes consist solely of universal quantifier symbols and whose matrices contain only bound individual variables. A formula in a Skolem normal form is in ∀FO.

Theorem 3.12 A formula φ is satisfiable if and only if its Skolem normal form is satisfiable.

Proof Suppose that φ is Sk ψ, with Sk of Definition 3.22, the free variables in ψ are x1, x2, ..., xn, y and φ is closed. If φ is satisfied in an A-structure M_{I,A} = (D, I, A), then, for each sequence d = (d1, d2, ..., dn), each di ∈ D, there exists a_d ∈ D such that ψ(x1/d1, x2/d2, ..., xn/dn, y/a_d) holds in M_{I,A}. For the Skolem normal form, with the substitution y/f(x1, x2, ..., xn), we let f(d1, d2, ..., dn) = a_d for each substitution (x1/d1, x2/d2, ..., xn/dn), and then ψ(x1/d1, x2/d2, ..., xn/dn, y/f(d1, d2, ..., dn)) holds in M_{I,A}. Conversely, if the Skolem normal form is satisfied in a structure, then the values f(d1, d2, ..., dn) supply the witnesses required by ∃y, hence φ is satisfied in that structure.

A formula and its Skolem normal form need not be equivalent: introducing, e.g., a constant reduces the class of possible interpretations for the Skolem normal form.
A remedy is to add to the Skolem normal form a set of Skolem axioms, see (Boolos et al. [13]).

Example 3.5 We find the Skolem normal form for φ : ∀x.∃y.[(P(x, y) ⊃ R(y)) ∨ ∃z.S(z)]
(i) We introduce the Skolem constant symbol f for z: ∀x.∃y.[(P(x, y) ⊃ R(y)) ∨ S(f)];
(ii) We introduce the Skolem function symbol g(x) for y: ∀x.[(P(x, g(x)) ⊃ R(g(x))) ∨ S(f)].

A finite diagram in Example 3.3 for a formula φ yields a CNF form of φ, i.e., a conjunction of disjunctions. Also, a tableau for Fφ results in a DNF of ¬φ, viz., this DNF is the disjunction, over the open branches, of the conjunctions of the formulae on each branch. Negating this DNF, we obtain the CNF form for φ.
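The particular cases (i) and (ii) of Definition 3.22 can be replayed mechanically. The sketch below assumes a formula already in prenex form, given as a prefix (a list of quantifier and variable pairs) and a quantifier-free matrix built from nested tuples; the encoding, the helper names and the generated symbols `sk1`, `sk2`, ... are assumptions of this illustration, and the generated symbols are assumed not to occur in the input formula.

```python
from itertools import count


def substitute(term, env):
    """Apply a mapping variable -> term to a term (variables are strings)."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(substitute(a, env) for a in term[1:])


def subst_formula(f, env):
    """Apply the mapping to every term of a quantifier-free formula."""
    if f[0] == 'atom':
        return ('atom', f[1], tuple(substitute(t, env) for t in f[2]))
    if f[0] == 'not':
        return ('not', subst_formula(f[1], env))
    return (f[0], subst_formula(f[1], env), subst_formula(f[2], env))


def skolemize(prefix, matrix):
    """Replace each existential variable by a Skolem term over the
    universal variables that precede it in the prefix."""
    universals, env, fresh = [], {}, count(1)
    for quantifier, var in prefix:
        if quantifier == 'forall':
            universals.append(var)
        else:  # 'exists': a term ('skN', x_i1, ..., x_ik); arity 0 gives a constant
            env[var] = ('sk%d' % next(fresh),) + tuple(universals)
    return [('forall', v) for v in universals], subst_formula(matrix, env)


# Case (i): ∃y.ψ(y) becomes ψ(sk1), with sk1 a Skolem constant.
case_i = skolemize([('exists', 'y')], ('atom', 'psi', ('y',)))
# Case (ii): ∀x.∃y.ψ(x, y) becomes ∀x.ψ(x, sk1(x)).
case_ii = skolemize([('forall', 'x'), ('exists', 'y')],
                    ('atom', 'psi', ('x', 'y')))
```

The sketch works on the prefix only, so it does not reproduce the choice made in Example 3.5, where ∃z is eliminated by a constant because its scope does not depend on x.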
Definition 3.23 (Conjunctive and disjunctive normal forms) Conjunctive normal form (CNF) of an open formula is ⋀_i D_i, where D_i is ⋁_j l^i_j for each i, and literals l^i_j are atomic formulae or their negations. Sub-formulae of the form ⋁_j l^i_j are clauses. Disjunctive normal forms (DNF's) are ⋁_i C_i, where C_i is ⋀_j l^i_j for each i. Sub-formulae of the form ⋀_j l^i_j are prime implicants.

By means of tautologies ¬(p ∨ q) ≡ (¬p) ∧ (¬q), ¬(p ∧ q) ≡ (¬p) ∨ (¬q), (p ⊃ q) ≡ ((¬p) ∨ q), and distributivity laws ((p ∧ q) ∨ r) ≡ ((p ∨ r) ∧ (q ∨ r)), ((p ∨ q) ∧ r) ≡ ((p ∧ r) ∨ (q ∧ r)), each open formula can be transformed into an equivalent conjunctive normal form. As negation converts CNF's into DNF's, each formula can be expressed equivalently as a DNF via the CNF of its negation. It follows that CNF's for open formulae are obtained exactly as in the case of propositional logic. For a formula φ of predicate logic, CNF can be obtained in the following steps:
1. Introduce Skolem constant and function symbols for each Skolem sequence;
2. Drop universal quantifiers in front of sub-formulae by virtue of Theorem 3.3(19), using renaming if necessary;
3. Transform the resulting open formula to a CNF.

Example 3.6 Consider the formula φ: (¬∀x.((P(x) ⊃ ∃y.(Q(y) ∧ R(x, y))) ⊃ ∃z.T(z))). Its CNF form is obtained in the following steps.
(1) duality ¬∀ ≡ ∃¬: (∃x.¬((P(x) ⊃ ∃y.(Q(y) ∧ R(x, y))) ⊃ ∃z.T(z)));
(2) Skolemization: Skolem constants f for x, g for z, h for y: (¬(P(f) ⊃ (Q(h) ∧ R(f, h))) ⊃ T(g));
(3) removal of implications: (¬¬(¬P(f) ∨ (Q(h) ∧ R(f, h))) ∨ T(g));
(4) removal of inner negations: (¬(P(f) ∧ (¬Q(h) ∨ ¬R(f, h))) ∨ T(g));
(5) removal of outer negation: ((¬P(f) ∨ (Q(h) ∧ R(f, h))) ∨ T(g));
(6) application of a distribution law: (((¬P(f) ∨ Q(h)) ∧ (¬P(f) ∨ R(f, h))) ∨ T(g));
(7) application of a distribution law: (¬P(f) ∨ Q(h) ∨ T(g)) ∧ (¬P(f) ∨ R(f, h) ∨ T(g)).
We obtain two clauses.
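The last stage, passing from an open formula in negative-normal form to clauses by distributivity, can be sketched as follows. The encoding is an assumption of this illustration: literals are plain strings (a leading `~` marks negation), compound formulae are tuples `('and', a, b)` or `('or', a, b)`, and a clause is a `frozenset` of literals.

```python
def cnf(f):
    """Return the set of clauses of a formula in negative-normal form."""
    if isinstance(f, str):               # a literal is a one-element clause
        return {frozenset([f])}
    op, a, b = f
    clauses_a, clauses_b = cnf(a), cnf(b)
    if op == 'and':                      # clauses of a conjunction: union
        return clauses_a | clauses_b
    # op == 'or': distribute, ((p ∧ q) ∨ r) ≡ ((p ∨ r) ∧ (q ∨ r))
    return {x | y for x in clauses_a for y in clauses_b}


# Step (5) of Example 3.6: ((¬P(f) ∨ (Q(h) ∧ R(f, h))) ∨ T(g)).
step5 = ('or', ('or', '~P(f)', ('and', 'Q(h)', 'R(f,h)')), 'T(g)')
clauses = cnf(step5)
```

Applied to step (5), the sketch yields the two clauses of step (7). Representing clauses as sets merges repeated literals, which is harmless for resolution.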
3.9 Resolution in Predicate Calculus

In Chap. 2, we have discussed resolution in propositional logic. We recall that resolution was a procedure for testing satisfiability for sets of formulae in clausal form CNF. The procedure was based on Robinson's Resolution Rule (RRR) which for two clauses C1, C2 with contradictory literals l1 ∈ C1, l2 ∈ C2 produces the clause res(C1, C2) = (C1 ∪ C2) \ {l1, l2}, called the resolvent. In the case of predicate logic, due to the presence of individual variables and constants, an application of the resolution rule becomes possible after some preliminary pre-processing of formulae.

We refer to Chap. 2 for a discussion of principles of resolution; we would like only to recall the equivalence: a set Δ of formulae is a logical consequence of a set Γ of formulae if and only if the formula (⋀Γ) ∧ (⋁¬Δ) is unsatisfiable, which is the foundation for resolution refutation. To see the need for pre-processing of formulae, consider for instance the literals P(a, b) and ¬P(x, y); we may apply the resolution rule only after the procedure of unification: after we make the substitution (x/a; y/b) in the latter formula, the formulae become contradictory literals and we may apply the rule (RRR) to them. It follows that we have to consider the unification procedure more closely. The unifying substitution is called the unifier.

Definition 3.24 (Unification) A unifier for a set of atomic formulae is a set of substitutions which make all formulae equiform.

An abstract case here is that we have a finite set of formulae Γ with the finite set T(Γ) of terms in Γ and a countable set of terms T from which we select substitutions; as each set of substitutions is finite, we have a countable set of possible substitutions for terms in T(Γ). Among them are substitutions of the same minimal cardinality. We call them minimal unifiers. Consider a minimal unifier U and any unifier V.
As the cardinality of U is not greater than that of V, for each substitution x/t by V there exists the substitution x/t′ by U, hence we obtain a set of substitutions of the form t′/t, which we denote by Q. Then U ◦ Q = V. A substitution U such that for each substitution V there exists a set of substitutions Q such that U ◦ Q = V is called the most general unifier (m.g.u.). We therefore see that each minimal unifier is an m.g.u. We obtain

Theorem 3.13 (Robinson [14]) For any set Γ of finitely many formulae, there exists a most general unifier.

There is a substantial literature on unification, especially in the context of automated theorem proving, see (Baader et al. [15]). In order to give an example, we reinstate function symbols f, g and we consider the set of literals {P(f(x, g(z)), a), P(f(a, b, a), g(a))}. We substitute: g(a)/a, x/a, g(z)/b. This is a minimal substitution, hence an m.g.u.

A ground clause is a clause in which the only terms are constants.
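For orientation, here is a compact sketch of syntactic unification in the usual textbook style, in which only variables are replaced and an occurs check is performed. It is not the construction via minimal unifiers described above. The term encoding is an assumption of the sketch: a variable is a string, a constant is a one-element tuple such as `('a',)`, and a compound term or atomic formula is a tuple `(symbol, arg1, ..., argn)`.

```python
def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occurs check: does var appear in term under the bindings subst?"""
    term = walk(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, subst) for arg in term[1:])


def unify(t1, t2, subst=None):
    """Return a dict of bindings unifying t1 and t2, or None if none exists.
    Bindings are kept in triangular form: a bound term may itself
    contain variables that are bound elsewhere in the dict."""
    subst = {} if subst is None else subst
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, subst)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None                      # clash of symbols or arities
    for a, b in zip(t1[1:], t2[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst


# The atoms P(a, b) and P(x, y) from the discussion of pre-processing.
mgu = unify(('P', ('a',), ('b',)), ('P', 'x', 'y'))
```

For the pair P(a, b), P(x, y) the sketch returns the bindings x/a, y/b, the substitution used in the text before the resolution rule is applied.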
Theorem 3.14 Suppose that C1, C2 are two ground clauses with contradictory literals l ∈ C1, l^c ∈ C2. Let C = (C1 \ {l}) ∪ (C2 \ {l^c}) be the resolvent res(C1, C2). If clauses C1, C2 are satisfiable, then the clause res(C1, C2) is satisfiable.

Proof Suppose that M = (D, I) is the structure in which C1, C2 are true. Either l^M = 1 or (l^c)^M = 1. As the two cases are symmetric, we suppose that l^M = 1, hence (l^c)^M = 0. As the clause C2 is true in M, there exists a literal l′ ∈ C2, distinct from l^c, such that (l′)^M = 1; as l′ belongs to res(C1, C2), this secures satisfiability of res(C1, C2).

We denote the empty clause by the symbol □. Obviously, the empty clause is ground and unsatisfiable.

Corollary 3.2 (Soundness of ground resolution) If for a set of clauses, after a finite number of resolution rule applications, the empty clause is obtained, then the set of clauses is unsatisfiable.

We may observe that, as open clauses are substitutions into propositional formulae, resolution is actually the propositional resolution whose completeness has been proved in Chap. 2.

Theorem 3.15 Predicate resolution is complete, i.e., if a set of open clauses is unsatisfiable, then the resolution procedure yields the empty clause.

Example 3.7 Consider the formula φ: ¬[(∃w.∃x.∃y.P(x, w, y)) ⊃ (∀w.∀x.P(x, w, x))]. We obtain the conjunction (∃w.∃x.∃y.P(x, w, y)) ∧ (∃w.∃x.¬P(x, w, x)). We introduce Skolem constants: into the first conjunct w/a, x/b, y/c to obtain the atomic formula P(b, a, c), and into the second conjunct w/e, x/f to obtain the literal ¬P(f, e, f). The unifier {b/f, c/f, a/e} leads to □.
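Ground resolution as described by Theorem 3.14 and Corollary 3.2 is simple enough to run by brute force. The sketch below saturates a finite set of ground clauses under the resolution rule and reports whether the empty clause appears. The encoding (literals as strings, negation marked by a leading `~`, clauses as sets) and the function names are assumptions of this illustration.

```python
from itertools import combinations


def neg(literal):
    """Return the contradictory literal."""
    return literal[1:] if literal.startswith('~') else '~' + literal


def resolvents(c1, c2):
    """All clauses (C1 minus {l}) union (C2 minus {l^c}) with l in C1, l^c in C2."""
    return {(c1 - {l}) | (c2 - {neg(l)}) for l in c1 if neg(l) in c2}


def refutable(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            new |= resolvents(c1, c2)
        if frozenset() in new:
            return True
        if new <= known:                 # nothing new: saturation reached
            return False
        known |= new


# Ground literals of Example 3.7 after the unifier has been applied.
example_3_7 = refutable([{'P(f,e,f)'}, {'~P(f,e,f)'}])
```

Saturation terminates because only finitely many clauses can be formed from a finite stock of literals; it is exponential in the worst case and is meant only to make the rule concrete.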
3.10 Horn Clauses and SLD-Resolution

Definition 3.25 (Horn clauses) A Horn clause (Horn [16]) is any clause which contains at most one non-negated (alias positive) literal. A Horn clause is definite if and only if it contains a positive literal. In that case, a Horn clause ⋁_i ¬P_i(x_i) ∨ Q(y) can be rewritten into the equivalent form of a decision rule: ⋀_i P_i(x_i) → Q(y). Horn clauses without any positive literal are called negative clauses and positive clauses contain only positive (non-negated) literals, hence any positive Horn clause consists of a unique non-negated literal.
Definition 3.26 (SLD-resolution) There are some restricted forms of resolution, discussed mostly in connection with logic programming. The positive resolution (P-resolution) is the variant in which one of any two clauses that undergo resolution is to be positive, i.e., without negated literals; similarly, the negative resolution applies the resolution rule to pairs of clauses of which at least one is negative, i.e., with only negated literals. A linear resolution for φ is one for which there exists a sequence C1, C2, ..., Cn of clauses with C1, the base, a clause in φ, Cn the empty clause, and each Ci resolved with either a clause D in φ or with one of the earlier Cj's (called the side clause). These three variants of resolution are complete (see Schöning [17] for proofs). SLD-resolution is a linear resolution with a negative clause (called in this case a goal clause) as a base, endowed with a strategy for selecting the next side clause, which should be non-negative, i.e., having a positive literal, i.e., being a decision rule with possibly more than one decision value. SLD-resolution and linear resolution are complete on Horn clauses, see Gallier [18].

Example 3.8 We consider a set of clauses C = {P, Q, ¬R ∨ ¬Q, ¬P ∨ R}. Clauses P, Q, ¬P ∨ R are definite, the clause ¬R ∨ ¬Q is the goal clause. The SLD-refutation is shown in Fig. 3.6.

Definition 3.27 (Logic programs) A set of clauses C along with a query Q is said to be a logic program. In applications, the set of clauses C is often called a Knowledge Base (KB). Ground clauses of the form P are called facts. We give an example.

Example 3.9 Consider the following textual form of a knowledge base (see Russell and Norvig [19]):
1. Jack owns a dog.
2. Every dog owner is an animal lover.
3. No animal lover kills an animal.
4. Either Curiosity or Jack killed the cat named Tuna.
Fig. 3.6 The pattern for SLD resolution
After Skolemization, the knowledge base consists of the following clauses:
I. Facts:
1. dog(D).
2. owns(Jack, D).
3. cat(Tuna).
II. Definite clauses:
4. ¬dog(D) ∨ ¬owns(Jack, D) ∨ animal-lover(Jack).
5. kills(Curiosity, Tuna) ∨ kills(Jack, Tuna).
6. ¬cat(Tuna) ∨ animal(Tuna).
III. Goal clauses:
7. ¬animal-lover(Jack) ∨ ¬animal(Tuna) ∨ ¬kills(Jack, Tuna).
8. Negation of the Query Q: ∃y.kills(y, Tuna) in clausal form: ¬kills(y, Tuna).
In Fig. 3.7 we show the resolution steps leading to the solution y = Curiosity to the Query.
Fig. 3.7 The resolution tree for the Example 3.9
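The pattern of SLD-resolution on propositional definite clauses, as in Example 3.8, can be sketched in Python. The representation is an assumption of this illustration: the program maps each head to the list of bodies of its definite clauses (a fact has the empty body), the goal clause is the list of its atoms, the leftmost atom is always selected, and a depth bound (a hypothetical safeguard, not part of the method) stops runaway recursion on cyclic programs.

```python
def sld_refute(program, goal, depth=50):
    """Return True iff the empty goal is reachable from the goal clause.

    program: dict mapping a head atom to a list of bodies (lists of atoms);
    goal: list of atoms A1, ..., An standing for the clause ¬A1 ∨ ... ∨ ¬An.
    """
    if not goal:
        return True                      # empty clause derived
    if depth == 0:
        return False                     # give up on this branch
    selected, rest = goal[0], goal[1:]
    for body in program.get(selected, []):
        # resolving the goal with the side clause  body -> selected
        if sld_refute(program, body + rest, depth - 1):
            return True
    return False


# Example 3.8: facts P and Q, the rule P -> R, and the goal clause ¬R ∨ ¬Q.
program = {'P': [[]], 'Q': [[]], 'R': [['P']]}
refuted = sld_refute(program, ['R', 'Q'])
```

The successive goals in this run are ¬R ∨ ¬Q, then ¬P ∨ ¬Q, then ¬Q, then the empty clause, one side clause per step as in a linear refutation.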
3.11 Meta-Theory of FO. Part II

3.11.1 Undecidability of Satisfiability and Validity Decision Problems

We propose to consider the following result in (Salomaa [20], Theorem I.9.6).

Theorem 3.16 There exists a language L which is non-recursive and recursively enumerable.

Proof We regard a language L as a set of finite words over an alphabet A, i.e., a subset of the set A∗ of all finite words over A. By Thm. 1.37, in order to demonstrate that L is not recursive, it suffices to prove that the complement −L = A∗ \ L is not recursively enumerable. We consider the alphabet A = {a}, i.e., the single-letter alphabet. We may assume that L is generated by a type 0 grammar, i.e., by a Turing machine, and we can list all computations which yield all languages over A as C0, C1, C2, .... We use the symbol C_j → a^j in case the word a^j is listed by C_j. Now, the language L is defined as

(L) L = {a^n : C_n → a^n}

We apply a diagonal method of reasoning: were −L recursively enumerable, it would be listed by some C_j; then a^j ∈ −L would hold if and only if C_j → a^j, i.e., if and only if a^j ∈ L, a contradiction. Hence, −L is not recursively enumerable and L is not recursive.

The argument for recursive enumerability of L consists in representing each C_j as the sequence of steps (i, j), where we may assume that if a procedure terminates at some i0 then steps after i0 simply repeat what was obtained at i0; hence, pairs (i, j) run over all pairs of natural numbers. It is well-known that the function

$$f(i, j) = \binom{i + j + 1}{2} + i$$

establishes a bijection between the set of pairs (i, j) and the set N of natural numbers (Theorem 1.8). We can thus list all words in L by examination of all pairs (i, j) in the order indicated by the ordering f(i, j), adding a^j to L in case it is listed in C_j.

Corollary 3.3 The membership decision problem for type-0 grammars is undecidable.

Let us give a few necessary facts about type 0 grammars.
Each grammar G over a vocabulary V generates from a word X 0 called an axiom, the language L(G) containing words over V obtained by means of a finite set P of productions; each production in case of type 0 grammars is an expression P : Q ⇒ R, where Q and R are words over V . The action of a production Q ⇒ R consists in rewriting a word of the form T QW into the word T RW . A word Z is in L(G) if there exists a proof of it from X 0 , i.e., a finite sequence of productions P1 , P2 , . . . Pk such that the premiss to P1 is X 0 , each Pi+1 has premiss obtained as the consequent of the production Pi
and the consequent of the production Pk is Z. It is usually expressed by the formula X0 ⇒∗ Z, where ⇒∗ is the transitive closure of ⇒. Consider then the grammar G of type 0 with the production set P which generates the language L(G) = L of Theorem 3.16, which has the undecidable membership problem. Let Q be a binary predicate defined as follows

(Q) ∃x1, x2, ..., xk, y0, y1, ..., yk.(X0, y0) ∧ (y0, x1) ∧ (x1, y1) ∧ (y1, x2) ∧ ... ∧ (xk, yk),
where (x, y) is satisfied in M if and only if there exist words (x/T, y/W) such that T ⇒ W ∈ P. Then

Theorem 3.17 A word Z is in L(G) if and only if Q is satisfied in M with yk/Z.

Theorem 3.18 The satisfiability problem for predicate logic is undecidable. Hence, the validity problem for predicate logic is undecidable.

Proof Were satisfiability of Q decidable for each word Z, the membership problem for L(G) would be decidable. As it is not, the satisfiability problem for predicate logic is undecidable. As validity of a formula is equivalent to unsatisfiability of its negation, the decision problem for validity is undecidable for predicate logic.

We can render these results in terms of computation theory by recalling the Church theorem (Church [21]).

Theorem 3.19 (Church) The set of Gödel numbers of valid formulae of predicate logic is recursively enumerable but not recursive. Hence the validity problem for predicate calculus is recursively unsolvable (undecidable).

Proof (Floyd [22] in Manna [23]). We exploit here the undecidability of PCP in Theorem 3.22, below. We consider the PCP over the alphabet {0, 1}. Let A = {(u1, v1), (u2, v2), ..., (uk, vk)} be an instance of PCP. The proof consists in the construction of a formula F with the property that F is valid if and only if the instance A has a solution. Models for F will have as domain the set of words {0, 1}∗, i.e., the set of finite binary sequences, and a vocabulary consisting of a binary predicate symbol P, two unary function symbols f0 and f1, and a 0-ary function symbol, i.e., a constant symbol c. For a word u = b1 b2 ... bm over {0, 1}, f_u(x) abbreviates f_{bm}(... f_{b2}(f_{b1}(x)) ...). The formula F is defined as follows:

$$F : \Big[\bigwedge_{i=1}^{k} P(f_{u_i}(c), f_{v_i}(c)) \wedge \forall x, y.\Big(P(x, y) \supset \bigwedge_{i=1}^{k} P(f_{u_i}(x), f_{v_i}(y))\Big)\Big] \supset \exists z.P(z, z).$$

Claim. The PCP instance A has a solution if and only if F is valid.
Proof of Claim. Let the structure M = (D, c, f0, f1, P) be defined specifically as follows: D = {0, 1}∗, c = ε, i.e., the empty word, f0(x) = x0, f1(x) = x1, and P(x, y) holds if and only if there exists a non-empty sequence i1, i2, ..., im of indices such that x = u_{i1} u_{i2} ... u_{im} and y = v_{i1} v_{i2} ... v_{im}. The assignment in the structure M is defined as sending a variable to a sequence of the form either u_{i1} u_{i2} ... u_{ij} for some j or v_{i1} v_{i2} ... v_{ip} for some p.

Suppose that F is valid; then, in particular, F holds in the structure M. It is easy to see that the functions f_{ui}, f_{vi} return on c simply ui, respectively vi, hence the predicate P rendered in M is true on these values. For this reason, both sub-formulae ⋀_{i=1}^{k} P(f_{ui}(c), f_{vi}(c)) and ∀x, y.(P(x, y) ⊃ ⋀_{i=1}^{k} P(f_{ui}(x), f_{vi}(y))) hold in M, hence the consequent ∃z.P(z, z) holds in M, which means the existence of a sequence i1, i2, ..., iq of indices with the property that u_{i1} u_{i2} ... u_{iq} = v_{i1} v_{i2} ... v_{iq}, which is a solution to PCP.

Conversely, suppose that the instance A has a solution u_{i1} u_{i2} ... u_{iq} = v_{i1} v_{i2} ... v_{iq}. In any model in which the antecedent of F holds, repeated application of its second conjunct yields P(f_w(c), f_w(c)) for the word w = u_{i1} u_{i2} ... u_{iq}; this satisfies the consequent ∃z.P(z, z), hence F is valid.

Let us compare (see Sect. 1.6):

Theorem 3.20 The set of Gödel numbers of proofs in predicate logic is recursive. The set of Gödel numbers of closed formulae is recursive. The set of Gödel numbers of provable formulae is recursively enumerable.

Let us mention some undecidable decision problems.

Theorem 3.21 The following are undecidable:
(i) the decision problem for the Kleene predicate ∃y.T(x, x, y), see Theorem 1.6.2;
(ii) the halting problem, see Sect. 1.6;
(iii) the acceptance problem whether a Turing machine accepts a given word w.

Proof We recall that (i) and (ii) are proved in Sect. 1.6.
For (iii): If a Turing machine halts on w, then add a new state q_accept and let the machine make one more move to the state q_accept, so the word w is accepted. If the machine does not halt, w is not accepted. Hence, the acceptance problem is equivalent to the halting problem.

Definition 3.28 (The Post Correspondence Problem (PCP)) The problem introduced in Post [24] is a quadruple PCP = (A, n, U, V), where A is an alphabet, n is a positive natural number, U = (a1, a2, ..., an) and V = (b1, b2, ..., bn) are sequences of words over A. A solution to PCP is a sequence of indices i1, i2, ..., ik such that a_{i1} a_{i2} ... a_{ik} = b_{i1} b_{i2} ... b_{ik}. An example of an instance of PCP is

(∗) U = <bc, a, ba, b>, V = <c, ab, a, bb>

with the solution abcbba, given by the indices 2, 1, 4, 3.

It is known that PCP is undecidable: there is no algorithm which would decide, for each instance of PCP, whether it has a solution (for a proof, see Post [24], also (Manna [23], p. 60), (Salomaa [20], VIII)). The alphabet A can be reduced to two symbols only (Salomaa [20], VIII), hence A can be the binary alphabet {0, 1}, and a, b are then binary sequences.
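Although PCP is undecidable, solutions of bounded length can be searched for exhaustively, which is a convenient way to check small instances such as (∗). The sketch below tries all index sequences up to a given length; the function name and the bound `max_len` are assumptions of this illustration, and a failed bounded search says nothing about longer solutions.

```python
from itertools import product


def pcp_search(U, V, max_len):
    """Return the first index sequence (1-based) of length <= max_len whose
    top and bottom concatenations coincide, or None if there is none."""
    n = len(U)
    for k in range(1, max_len + 1):
        for indices in product(range(n), repeat=k):
            top = ''.join(U[i] for i in indices)
            bottom = ''.join(V[i] for i in indices)
            if top == bottom:
                return tuple(i + 1 for i in indices)
    return None


# Instance (*) of Definition 3.28.
U = ['bc', 'a', 'ba', 'b']
V = ['c', 'ab', 'a', 'bb']
shortest = pcp_search(U, V, 4)


def word(indices, words):
    """Concatenate the words selected by a 1-based index sequence."""
    return ''.join(words[i - 1] for i in indices)
```

On instance (∗) the search stops at the index sequence 2, 1, which spells abc on both sides; the sequence 2, 1, 4, 3 spells abcbba on both sides, so a solved instance always has longer solutions obtained by concatenation.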
Post's proof relies on the undecidable decision problem of membership for normal grammars (see Salomaa [20] for the definition of a normal grammar); Salomaa [20] exploits the analogous problem for type 0 grammars; Scott [25] reduces the PCP problem to the Halting Problem for Post Machines (see, e.g., Manna [23], pp. 24–28); an exhaustive analysis of computing machinery is given in Friedman [26] with applications to undecidable problems, in particular PCP is reduced there to the Halting Problem for Shepherdson-Sturgis Machines and RAM Machines. We deem that a visualization of the proof by Scott and others is the version given in Sipser [27]. PCP in Sipser [27] is reduced to the acceptance problem for Turing Machines, cf. Theorem 3.21(iii). As with many other undecidable problems, it applies domino tiles. Each domino is of the form [w / w′], with w at the upper side and w′ at the lower side. For any word w, a computation is described which matches the sequence of top sides with the sequence of bottom sides if and only if w is accepted.

Theorem 3.22 The PCP problem is undecidable.

Proof (Sipser [27]). In this proof, given a word w over the alphabet of a Turing Machine TM, an accepting computation is searched for; the instance of PCP is constructed from instructions of the machine, with the antecedent filling the top side and the consequent filling the bottom side; as instructions are repeated one by one, antecedents and consequents occupy alternately top and bottom sides. A match is achieved if and only if the word w has an accepting computation. The construction proceeds in steps.
Step 1. The initial domino tile is [# / #q0 a1 a2 ... ak #];
Step 2. In this step, for each instruction (q, a → q′, b, right), where q ≠ q_reject, add the tile [qa / bq′];
Step 3. In this step, for each instruction (q, a → q′, b, left), where q ≠ q_reject, add the tile [cqa / q′cb], where c is a tape symbol, for each c;
Step 4. For each tape symbol a, add the tile [a / a];
Step 5. Add tiles [# / #] and [# / B#], where B is blank;
Step 6. For each tape symbol a, add tiles [a q_accept / q_accept] and [q_accept a / q_accept];
Step 7. The last tile added is [q_accept ## / #].

In order to illustrate this computation, we assume that consecutive instructions for a TM with tape symbols 0, 1, 2, B and word w = 0100 have been:
(i) q0 0 → q5 2 right, so the tile added is [q0 0 / 2 q5];
(ii) step 4 adds tiles [0 / 0], [1 / 1], [2 / 2], [B / B];
(iii) ....
At this stage, we get the following partial match: [#q0 0100# / #q0 0100#]. Upon reaching a halting state which accepts, the tiles added in step 7 are applied, leaving the top and bottom sequences as [#q_accept ## / #q_accept ##].

We have omitted some technical issues; the interested Reader will please consult the quoted sources.
3.12 Complexity Issues for Predicate Logic

For some classes of formulae, the satisfiability problem has high complexity: the Bernays-Schönfinkel-Ramsey SAT(BSR) problem is NEXPTIME-complete (see Sect. 1.8), the SAT(TQBF) problem is PSPACE-complete (see Sect. 1.8), SAT(Σ_i^p) is Σ_i^p-complete (see Sect. 1.8). It follows that any level of the polynomial hierarchy is accessible with formulae of predicate logic.
3.13 Monadic Predicate Logic

Monadic predicate logic allows for unary predicates only. The main property of monadic logic is the following theorem about the finite model property.

Theorem 3.23 For each formula φ(P1, P2, ..., Pn, x1, x2, ..., xk) of monadic predicate logic, if M |= φ for a structure M, then there is a sub-structure M∗ ⊆ M such that the cardinality of M∗ is not greater than 2^n and M∗ |= φ.

Proof Let φ be in the prenex form: Q1 x1 Q2 x2 ... Qm xm β(P1, P2, ..., Pn, x1, x2, ..., xk). Consider the set of binary sequences of length n and for each such sequence σ form the set M(σ) = {a ∈ M : A(Pi(a)) = σ(i) for i = 1, 2, ..., n}. From each non-empty set M(σ), select an element a(σ) and let M∗ = {a(σ) : σ ∈ {0, 1}^n, M(σ) ≠ ∅}. Then M∗ |= φ.

It follows from Theorem 3.23 that monadic predicate calculus is decidable: to check the truth of φ one can build a truth table for φ over structures with at most 2^n elements.

We have seen some close encounters between SL and FO and now we arrive at the Herbrand theory (Herbrand [28]) which canonizes these relations.
3.14 The Theory of Herbrand

We consider an FO language L with non-empty countable sets of individual variables, predicate symbols, function symbols, and constant symbols.

Definition 3.29 (Herbrand structures) We recall that a term is either an individual variable, or a constant, or an expression of the form $f(t_1, \ldots, t_k)$, where f is a k-ary function symbol and all $t_i$ are terms. A term is closed if and only if it contains no individual variables. For the logic L, we consider the set T(L) of all closed terms obtained from constructs in L. T(L) is the Herbrand universe.
The Herbrand structure for L is the pair $M_H(L) = (T(L), I_{T(L)})$, where $I_{T(L)}(t) = t$ for each t ∈ T(L). Thus, in the Herbrand interpretation, the logic L interprets, in a sense, itself. We recall the types of quantified formulae known from tableau theory: γ for universally quantified formulae ∀xφ, ¬∃xφ, and δ for existentially quantified formulae ∃xφ, ¬∀xφ.

Theorem 3.24 (Satisfaction conditions) For a formula of type γ, respectively of type δ, $M_H \models \gamma$ (respectively $M_H \models \delta$) if and only if $M_H \models \gamma(t)$ for each t ∈ T(L) (respectively $M_H \models \delta(t)$ for some t ∈ T(L)).

Definition 3.30 (Relative Herbrand models) For a closed formula φ of the FO language L, the Herbrand model for φ is built on the Herbrand universe T(φ) of all closed terms built from function and constant symbols in φ.

Example 3.10 We consider φ: ∀x.P(c, f(a, x), h(x, c)) with the matrix P(c, f(a, x), h(x, c)). T(φ) is: {a, c, f(a, a), f(a, c), h(a, c), h(c, c), f(a, f(a, c)), f(f(a, c), a), f(f(a, c), f(a, c)), h(f(a, a), c), ...}.

It may happen that a formula φ has no occurrences of any constant symbols; in such a case we add a constant symbol c in order to build a non-empty Herbrand universe.

Definition 3.31 (The Herbrand normal form (the validity functional form)) The closed formula φ (denoted $\psi^H$) is the Herbrand normal form of the closed formula ψ if φ is ¬Sk(¬ψ), where Sk(α) is the Skolem normal form of α.

Please observe that $\psi^H$ is valid if and only if ψ is valid: suppose that ψ is valid, hence ¬ψ is unsatisfiable, thus, by Theorem 3.12, Sk(¬ψ) is unsatisfiable, hence ¬(Sk(¬ψ)) is valid. The converse is proved by reversing the direction of these inferences.

Example 3.11 Consider φ: ∀z.∃w.∀x.((∀u.Q(x, u)) ⊃ Q(w, z)). Then ¬φ is: ∃z.∀w.∃x.¬((∀u.Q(x, u)) ⊃ Q(w, z)). Skolemization of ¬φ yields: ∀w.¬((∀u.Q(f(w), u)) ⊃ Q(w, c)), where c is a Skolem constant symbol and f is a Skolem function symbol. Negation of the Skolem form of ¬φ yields $\varphi^H$: ∃w.((∀u.Q(f(w), u)) ⊃ Q(w, c)).
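A Herbrand universe is in general infinite, but its finite layers can be enumerated level by level. The following is a minimal sketch under the assumption that terms are represented as plain strings; the function name `herbrand_universe` and the cut-off parameter `depth` are illustrative and do not come from the text.

```python
from itertools import product


def herbrand_universe(constants, functions, depth):
    """Closed terms of nesting depth at most `depth`.

    `functions` maps each function symbol to its arity.
    """
    terms = set(constants)
    for _ in range(depth):
        next_terms = set(terms)
        for symbol, arity in functions.items():
            # Apply the symbol to every tuple of terms built so far.
            for args in product(terms, repeat=arity):
                next_terms.add(f"{symbol}({', '.join(args)})")
        terms = next_terms
    return terms


# Symbols of Example 3.10: constants a, c and binary f, h.
level1 = herbrand_universe(["a", "c"], {"f": 2, "h": 2}, 1)
level2 = herbrand_universe(["a", "c"], {"f": 2, "h": 2}, 2)
# One constant c and one unary function symbol f.
chain = herbrand_universe(["c"], {"f": 1}, 3)
print(sorted(level1))
print(sorted(chain, key=len))
```

With two constants and two binary function symbols the first layer already has 2 + 4 + 4 = 10 terms, which shows how quickly the layers grow.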
The function symbol f and the constant symbol c induce the Herbrand universe T(φ): {c, f(c), f(f(c)), f(f(f(c))), . . .}. This is the feature of the Herbrand normal form: it begins with the existential quantifier (it is essentially existential).

Theorem 3.25 Each closed formula φ is satisfiable if and only if it is satisfiable in the Herbrand interpretation $M_H$.
Proof It suffices to prove the implication to the right. Suppose then that φ is satisfiable in an interpretation M = (D, I). We assign to each term $t^* \in D_H$ the value assigned to t in M, and to each atomic formula $P^*(t_1^*, t_2^*, \ldots, t_k^*)$ we assign the value which M assigns to the atomic formula $P(t_1, t_2, \ldots, t_k)$. Then φ is satisfiable in $M_H$.

Definition 3.32 (The Herbrand domain) The Herbrand domain for a closed formula φ is any non-empty finite subset $T_H \subseteq T(\varphi)$. For φ of Example 3.11, an example of a domain is $T_H$: {c, f(c)}.

We now give the definition of the crucial notion of a Herbrand expansion. Given a Herbrand domain, one interprets quantified closed formulae as well as propositional formulae in this domain. We recall that α is the type of conjunctions and β is the type of disjunctions.

Definition 3.33 (Herbrand expansions) For a closed formula φ and a finite set $T = \{t_1, t_2, \ldots, t_n\}$ of closed terms, the Herbrand expansion of φ over T, in symbols HE(φ, T), is defined by structural induction for all types of formulae as follows. As before, γ, δ are generic symbols for a formula of type γ, respectively of type δ.

(i) $HE(l, T) = l$ for each propositional literal l;
(ii) $HE(\alpha_1 \wedge \alpha_2, T) = HE(\alpha_1, T) \wedge HE(\alpha_2, T)$;
(iii) $HE(\beta_1 \vee \beta_2, T) = HE(\beta_1, T) \vee HE(\beta_2, T)$;
(iv) $HE(\forall x.\gamma, T) = \bigwedge_{t_j \in T} HE(\gamma(t_j), T)$;
(v) $HE(\exists x.\delta, T) = \bigvee_{t_j \in T} HE(\delta(t_j), T)$.
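The inductive clauses of Definition 3.33 translate directly into a recursive procedure. The sketch below is only an illustration under stated assumptions: formulae are taken in negation normal form and encoded as nested tuples, and the names `expand`, `instantiate` and `is_tautology` are hypothetical. As input it uses the Herbrand normal form of Example 3.11, rewritten in negation normal form as ∃w.((∃u.¬Q(f(w), u)) ∨ Q(w, c)).

```python
from itertools import product

# Assumed encoding (not from the text):
#   ("lit", sign, pred, args), ("and", A, B), ("or", A, B),
#   ("forall", var, A), ("exists", var, A).
# A term is a string (variable or constant) or a tuple (function, arg, ...).


def subst(term, var, value):
    if isinstance(term, str):
        return value if term == var else term
    return (term[0],) + tuple(subst(a, var, value) for a in term[1:])


def instantiate(formula, var, value):
    tag = formula[0]
    if tag == "lit":
        _, sign, pred, args = formula
        return ("lit", sign, pred, tuple(subst(a, var, value) for a in args))
    if tag in ("and", "or"):
        return (tag, instantiate(formula[1], var, value),
                instantiate(formula[2], var, value))
    _, bound, body = formula
    if bound == var:            # the variable is rebound: leave untouched
        return formula
    return (tag, bound, instantiate(body, var, value))


def expand(formula, domain):
    """Herbrand expansion over a non-empty finite domain, clauses (i)-(v)."""
    tag = formula[0]
    if tag == "lit":
        return formula
    if tag in ("and", "or"):
        return (tag, expand(formula[1], domain), expand(formula[2], domain))
    _, var, body = formula
    parts = [expand(instantiate(body, var, t), domain) for t in domain]
    op = "and" if tag == "forall" else "or"
    result = parts[0]
    for part in parts[1:]:
        result = (op, result, part)
    return result


def atoms(f):
    return {(f[2], f[3])} if f[0] == "lit" else atoms(f[1]) | atoms(f[2])


def evaluate(f, valuation):
    if f[0] == "lit":
        truth = valuation[(f[2], f[3])]
        return truth if f[1] else not truth
    left, right = evaluate(f[1], valuation), evaluate(f[2], valuation)
    return (left and right) if f[0] == "and" else (left or right)


def is_tautology(f):
    keys = sorted(atoms(f), key=repr)
    return all(evaluate(f, dict(zip(keys, bits)))
               for bits in product([False, True], repeat=len(keys)))


phi_h = ("exists", "w",
         ("or",
          ("exists", "u", ("lit", False, "Q", (("f", "w"), "u"))),
          ("lit", True, "Q", ("w", "c"))))
print(is_tautology(expand(phi_h, ["c"])))
print(is_tautology(expand(phi_h, ["c", ("f", "c")])))
```

Over the one-element domain {c} the expansion is not a tautology, while over {c, f(c)} it is, because it contains both ¬Q(f(c), c) and Q(f(c), c).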
Definition 3.34 (The Herbrand expansion) For a closed formula φ, the Herbrand expansion of φ is the Herbrand expansion of the Herbrand normal form $\varphi^H$ over a Herbrand domain.

Example 3.12 For φ and $\varphi^H$ of Example 3.11 and the domain T = {c, f(c)}, the Herbrand expansion HE(φ, T) is

¬Q(f(c), c) ∨ Q(c, c) ∨ ¬Q(f(c), f(c)) ∨ Q(f(c), c) ∨ ¬Q(f(f(c)), c) ∨ ¬Q(f(f(c)), f(c))

and it is valid, as this disjunction contains the pair of contradictory literals ¬Q(f(c), c) and Q(f(c), c).

The Herbrand theorem states that a closed formula φ of the language L is valid if and only if there exists a Herbrand expansion of φ which is valid as a propositional formula, thus reducing validity in L to validity in PL. We precede a proof of this theorem with a more detailed look at consistency and satisfiability via Hintikka sets.

Definition 3.35 (FO consistency) Concerning Definition 3.18, where the notion of consistency was introduced, we restrict ourselves in the cases of γ and δ types of closed formulae to closed terms, hence we modify the corresponding conditions in the notion of consistency. At the same time we define consistency for collections of sets.
Let Γ be a collection of sets of closed formulae of L. The collection Γ is said to be an FO consistency property if and only if each set Δ ∈ Γ, in addition to the properties for α and β types of propositional formulae, possesses the following properties for γ and δ types of quantified formulae:

(i) if γ ∈ Δ, then Δ ∪ {γ(t)} ∈ Γ for each closed term t of L;
(ii) if δ ∈ Δ, then Δ ∪ {δ(t)} ∈ Γ for some closed term t of L which has no occurrence in Δ.

We now follow the path already delineated in Part I of the meta-theory of FO as well as in propositional logic: we extend Δ to a maximal consistent set. The construction is on the lines of the Lindenbaum Lemma, with some modifications due to the FO case. We suppose that we have a countably infinite set of terms; if this is not the case, then we introduce a countably infinite set of constants, called parameters, disjoint from the set of closed terms of L. We will denote these parameters with the generic symbol p. They are used in the context of δ-formulae, when there is no closed term of L available which has not been used yet. We will use the uniform symbol t in all cases, assuming that for each particular δ-type φ a record is kept of the term or parameter used, in order to eliminate parameters in the end. Elimination of parameters is discussed at the end of the section.

Theorem 3.26 Each set Δ ∈ Γ is extendable to a maximal set Δ* which is FO consistent.

Proof The proof is on the lines of the proof of Lindenbaum's Lemma: let $(\varphi_n)_{n=1}^{\infty}$ be an enumeration of all closed formulae of L. We define a new sequence $(\Delta_i^*)_{i=1}^{\infty}$:

(i) $\Delta_1^* = \Delta$;
(ii) $\Delta_{n+1}^* = \Delta_n^*$ if $\Delta_n^* \cup \{\varphi_n\} \notin \Gamma$, else $\Delta_{n+1}^* = \Delta_n^* \cup \{\varphi_n\}$ if $\varphi_n$ is not of type δ;
(iii) $\Delta_{n+1}^* = \Delta_n^* \cup \{\varphi_n\} \cup \{\varphi_n(t)\}$ if $\Delta_n^* \cup \{\varphi_n\} \in \Gamma$, $\varphi_n$ is of type δ and t is a closed term not used yet (which is possible as we have countably many parameters).

Then, as in the propositional case, $\Delta^* = \bigcup_n \Delta_n^*$ is maximal consistent.
Definition 3.36 (The Hintikka set) We add to the conditions for propositional Hintikka sets new conditions for γ and δ types of formulae, and we appropriately modify conditions (H3) and (H4) in Definition 3.17. The FO Hintikka set H for a language L satisfies, in addition to the propositional case, the conditions:

H(γ) if γ ∈ H, then γ(t) ∈ H for each closed term t of L;
H(δ) if δ ∈ H, then δ(t) ∈ H for some closed term t of L.

Hintikka sets provide a link to Herbrand models via the following result. It is a specialization of Theorem 3.5.
Theorem 3.27 For a closed formula φ of the language L and an FO Hintikka set H, if φ ∈ H, then φ is valid in a Herbrand model.

Proof First, we have to specify the Herbrand model $M_H = (D_H, I_H)$. As $D_H$, we adopt the set of all closed terms of L. Then $I_H(t) = t$ for each closed term t of L. For any atomic formula $P(t_1, t_2, \ldots, t_k)$, we let $I_H(P(t_1, t_2, \ldots, t_k))$ be true if and only if $P(t_1, t_2, \ldots, t_k) \in H$. The proof is by structural induction. Sentential cases are as in sentential logic. Any atomic formula $P(t_1, t_2, \ldots, t_k) \in H$ is true in $M_H$ by definition, hence it is valid in $M_H$. For a formula γ ∈ H, γ(t) ∈ H for each closed term t in $D_H$, hence, by the hypothesis of induction, γ(t) is valid in $M_H$ for each t, which means that γ is valid in $M_H$. An analogous argument settles the case of δ.

A particular case of a consistency property is the Herbrand consistency.

Definition 3.37 (The Herbrand consistency property) Consider the language L along with the set T of all closed terms obtained from L. A collection Γ of sets Δ of closed formulae of L is a Herbrand consistency property if the following conditions hold:

(i) all sets Δ are finite;
(ii) all formulae in all sets Δ are essentially universally quantified;
(iii) $\neg HE(\bigwedge \Delta, D)$ is not valid for each finite domain D ⊆ T.

For HE please see Definition 3.33. It turns out that we get no new consistency property.

Theorem 3.28 The Herbrand consistency property is an FO consistency property.

Proof Let T be the collection of all closed terms of L. We check the conditions for an FO consistency property. First, as an example for the propositional cases, we consider a formula β. Suppose then that Δ ∈ Γ, β ∈ Δ, but $\Delta \cup \{\beta_i\} \notin \Gamma$ for i = 1, 2 (recall that β is $\beta_1 \vee \beta_2$). By condition (iii), $\neg HE(\bigwedge \Delta \wedge \beta_i, D_i)$ is valid for some finite $D_i \subseteq T$, for i = 1, 2. We let $D = D_1 \cup D_2$, so $\neg HE(\bigwedge \Delta \wedge \beta_i, D)$ is valid for i = 1, 2. As

$HE(\bigwedge \Delta \wedge \beta, D) \equiv HE(\bigwedge \Delta \wedge (\beta_1 \vee \beta_2), D) \equiv HE(\bigwedge \Delta \wedge \beta_1, D) \vee HE(\bigwedge \Delta \wedge \beta_2, D)$,

it follows that $\neg HE(\bigwedge \Delta \wedge \beta, D) \equiv \neg HE(\bigwedge \Delta \wedge \beta_1, D) \wedge \neg HE(\bigwedge \Delta \wedge \beta_2, D)$ is valid, a contradiction, as β ∈ Δ ∈ Γ.

By condition (ii), only the case of γ type requires a proof. Suppose that Δ ∈ Γ, γ ∈ Δ and, for a closed term t, $\Delta \cup \{\gamma(t)\} \notin \Gamma$. Hence, there exists a finite D ⊆ T such that $\neg HE(\bigwedge \Delta \wedge \gamma(t), D)$ is valid.
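As γ has γ(t) as one of its conjuncts over any domain containing t, it follows that

$HE(\bigwedge \Delta \wedge \gamma, D \cup \{t\}) \equiv HE(\bigwedge \Delta, D \cup \{t\}) \wedge HE(\gamma, D \cup \{t\})$.

This implies

$HE(\bigwedge \Delta, D \cup \{t\}) \wedge HE(\gamma(t), D \cup \{t\}) \equiv HE(\bigwedge \Delta \wedge \gamma(t), D \cup \{t\})$.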
As $\neg HE(\bigwedge \Delta \wedge \gamma(t), D \cup \{t\})$ is valid, so is $\neg HE(\bigwedge \Delta \wedge \gamma, D \cup \{t\})$, a contradiction, as γ ∈ Δ ∈ Γ.

Corollary 3.4 If Γ is a Herbrand consistency property, then each set Δ ∈ Γ is satisfiable.

Theorem 3.29 (Herbrand) Each closed formula φ is valid as a formula of the language L if and only if there exists a valid (tautological) Herbrand expansion of φ.

Proof We use the Herbrand normal form, hence the formulae are essentially existentially quantified.

Claim 1. We suppose that the formula φ is essentially existentially quantified. If φ is valid, then HE(φ, D) is valid for some Herbrand domain D.

Proof of Claim 1. Suppose that HE(φ, D) is not valid for any domain D. Thus, {¬φ} belongs to a Herbrand consistency property, as it satisfies Definition 3.37(i)-(iii): notice in particular that ¬φ is essentially universally quantified and that ¬HE(¬φ, D) ≡ ¬¬HE(φ, D) ≡ HE(φ, D) is not a tautology for any domain D. Hence, ¬φ is satisfiable, a contradiction. The only thing left to do is to remove parameters. Let s be a substitution: for each parameter p, let s(p) be a closed term of L; as HE(φ, s(p)) is a tautology, the proof of Claim 1 is concluded.

Claim 2. For each closed formula φ, if φ is essentially existentially quantified and D is a finite set of closed terms, then the formula HE(φ, D) ⊃ φ is valid.

Proof of Claim 2. The proof is by structural induction. It suffices to consider the δ case. Suppose, in virtue of the hypothesis of induction, that for each $t_i \in D$ the formula $HE(\delta(t_i), D) \supset \delta(t_i)$ is valid. Then $HE(\delta, D) = \bigvee_{t_i \in D} HE(\delta(t_i), D) \supset \bigvee_{t_i \in D} \delta(t_i) \supset \delta$. It follows that if HE(φ, D) is valid, then φ is valid. This concludes the proof of the Herbrand theorem.

The impact of the Herbrand theorem is that Herbrand expansions form a recursively enumerable set, and if, on reading elements of this set, we come to a propositionally valid formula, then the formula φ is valid; otherwise our search would go on ad infinitum. The Herbrand theorem does express a form of the completeness property of FO. It is only fair to state that the analysis of the Herbrand theory here has been influenced by the exposition in Fitting [29]. The idea of FO consistency comes from Fitting [29] as well.
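The search described above can be sketched for one fixed formula. The code below is an illustration specialized to the Herbrand normal form ∃w.((∀u.Q(f(w), u)) ⊃ Q(w, c)) of Example 3.11; it is not a general procedure. It assumes the domains $D_k = \{c, f(c), \ldots, f^k(c)\}$, encodes the term $f^i(c)$ as the integer i, and uses a cut-off `max_k` only so that the loop terminates; all names are hypothetical.

```python
def expansion_literals(k):
    """Literals of the Herbrand expansion over D_k = {c, f(c), ..., f^k(c)}.

    The expansion is a disjunction of literals: for every w in D_k it
    contributes not-Q(f(w), u) for each u in D_k, and Q(w, c).
    A literal is a triple (sign, first_argument, second_argument).
    """
    domain = range(k + 1)
    literals = set()
    for w in domain:
        for u in domain:
            literals.add((False, w + 1, u))   # the literal not-Q(f(w), u)
        literals.add((True, w, 0))            # the literal Q(w, c)
    return literals


def is_valid_disjunction(literals):
    """A disjunction of literals is a tautology exactly when it contains
    a literal together with its negation."""
    return any((not sign, a, b) in literals for (sign, a, b) in literals)


def search(max_k):
    """Read the expansions over D_0, D_1, ... and stop at the first valid one."""
    for k in range(max_k + 1):
        if is_valid_disjunction(expansion_literals(k)):
            return k
    return None   # nothing found up to the cut-off; this proves nothing


print(search(5))
```

For a valid formula the unbounded version of this loop halts; for a formula that is not valid it would run forever, which is why the cut-off result `None` carries no information.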
3.15 The Gödel Completeness Theorem. The Henkin Proof

The proof of the completeness theorem in Gödel [30] was followed by a novel idea of a proof in Henkin [31], which creates models for sets of first-order formulae as maximal consistent sets. Simplified in Hasenjaeger [32], the proof is known as the Henkin-Hasenjaeger proof. From the historic perspective, our exposition in this chapter goes back in historic time: we have seen already the idea of maximal consistency as the implication for the model existence, we have met the Löwenheim-Skolem theorem, and we have witnessed the introduction of auxiliary constants called parameters, used to satisfy existentially quantified formulae. On the other hand, in the Henkin proof we find the Lindenbaum Lemma. The Henkin proof initiated a long series of results based on His idea. We will see it in non-classical logics, modal and temporal, for example.

Theorem 3.30 (Gödel) First-order logic is complete.

Proof (Henkin [31]). The alphabet of the first-order logic language L is standard: it consists of symbols ( ) f = ⊥ ⊃; the symbol f 'false' is used, following Church [33], to denote falsity. The acronym 'wf' means 'well-formed (formula)', and the acronym 'cf' means 'closed (well-formed formula)', used instead of the original acronym 'cwff'. Well-formed formulae are defined as usual: (1) if φ, ψ are wfs, then φ ⊃ ψ is wf; (1') if φ is wf, then ∀x.φ is wf. As usual, the negation of φ is φ ⊃ f, and ∃x.φ is introduced as (∀x.(φ ⊃ f)) ⊃ f. The axiom system of the language L consists of the following schemes. We only consider cfs.

(2) φ ⊃ (ψ ⊃ φ);
(3) (φ ⊃ ψ) ⊃ ((ψ ⊃ ξ) ⊃ (φ ⊃ ξ));
(4) ((φ ⊃ f) ⊃ f) ⊃ φ;
(5) if x is not free in φ, then (∀x.(φ ⊃ ψ)) ⊃ (φ ⊃ ∀x.ψ);
(6) (∀x.φ) ⊃ ψ, where ψ results from the substitution of y for free occurrences of x in φ, under the proviso that x is independent of y (i.e., no free occurrence of x falls into a sub-formula of φ of the form ∀y.ξ).
Rules of inference of the system L are:

(7) detachment (MP): ψ is inferred from φ and φ ⊃ ψ;
(8) generalization (Gen): ∀x.φ is inferred from φ for any individual variable x.

Deduction theorem. If Γ, φ ⊢ ψ, then Γ ⊢ φ ⊃ ψ (please see Sect. 3.21). The derived formulae to be applied in the proof are as follows.

(9) (φ ⊃ f) ⊃ (φ ⊃ ψ);
(10) φ ⊃ ((ψ ⊃ f) ⊃ ((φ ⊃ ψ) ⊃ f));
(11) (∀x.(φ ⊃ f)) ⊃ ((∃x.φ) ⊃ f);
(12) ((∀x.φ) ⊃ f) ⊃ ∃x.(φ ⊃ f);
(13) The replacement rule. Here wf formulae are allowed. If Γ is a set of wf formulae each of which contains no free occurrence of the individual variable y, and ψ results by substitution in a wf φ of x for all free occurrences of y, under the proviso that none of these occurrences becomes bound in ψ, then Γ ⊢ φ implies Γ ⊢ ψ.
The crucial notion in the proof is consistency, defined in the standard way: a set Γ of formulae is consistent if it is not true that Γ ⊢ f; otherwise Γ is inconsistent. We already know that any consistent set is satisfiable, and we obtained this result by the method of Henkin applied earlier. As a side result, a generalization of the Löwenheim-Skolem theorem is also obtained.

Theorem 3.31 If a set Γ of cfs of L is consistent, then Γ is satisfiable in a domain of the same cardinality as the cardinality of the vocabulary of L.

Proof In order to be able to satisfy all existential formulae which will appear in the constructs of the proof, one introduces a countably infinite family {P(i) : i ∈ N} of sets, each $P(i) = \{p_{ij} : j \in N\}$, each $p_{ij}$ a constant called a parameter. Please observe that no $p_{ij}$ has an occurrence in formulae of L, therefore any substitution with its use cannot cause any inconsistency. Moreover, distinct $p_{ij}$'s are independent of one another. We will use the Lindenbaum Lemma.

(I) For $\Gamma = \Gamma_0$, order the cfs of L into a sequence $(\varphi_{1j})_j$ and apply the Lindenbaum Lemma to obtain a maximal consistent set $\Delta_0$ with $\Gamma_0 \subseteq \Delta_0$; then, add the parameters from the set P(1).

(II) Order the cfs in $\Delta_0$ into a sequence, select the sub-sequence $(\psi_q)_q$ of formulae of the form $\exists x.\chi_q(x)$, and for the formula $\psi_q$ substitute $p_{1q}$ for each free occurrence of x in $\chi_q(x)$; then add $\chi_q(x/p_{1q})$ to $\Delta_0$.

Claim 1. $\Delta_0 \cup \{\chi_q(x/p_{1q})\}$ is consistent.

Proof of Claim 1. Suppose, to the contrary, that $\Delta_0 \cup \{\chi_q(x/p_{1q})\}$ is not consistent, hence $\Delta_0 \cup \{\chi_q(x/p_{1q})\} \vdash f$, and by the Deduction rule, $\Delta_0 \vdash \chi_q(x/p_{1q}) \supset f$. By the replacement rule (13), $\Delta_0 \vdash \chi_q(x) \supset f$, and by (Gen), $\Delta_0 \vdash \forall x.(\chi_q(x) \supset f)$, hence, by (11) and (MP), $\Delta_0 \vdash (\exists x.\chi_q(x)) \supset f$. Yet, $\Delta_0 \vdash \exists x.\chi_q(x)$ by the maximal consistency of $\Delta_0$, hence, by (MP), $\Delta_0 \vdash f$, a contradiction. This ends the proof of Claim 1.
(III) After all existential formulae in $\Delta_0$ have undergone step (II), enlarge the resulting set to a maximal consistent set $\Delta_1$.

(IV) Repeat the steps (II), (III) with $\Delta_1$, using parameters from the set P(2), to obtain a maximal consistent set $\Delta_2$, and then, in the same manner, sets $\Delta_i$ with the use of P(i) for i = 3, 4, . . ..

Finally, let $\Delta^* = \bigcup_{i=1}^{\infty} \Delta_i$. The set Δ* is maximal consistent and each formula ∃x.χ(x) in Δ* is in Δ* along with its substitution instance $\chi(x/p_{ij})$ for an appropriate $p_{ij}$.

The model proposed for Δ* is actually the Herbrand model: its domain D is the set of constants of $L \cup \bigcup_i P(i)$ and the interpretation I sends each constant c to itself: $c^I = c$. For relational symbols, we assign their extent, i.e., to $R^n$ the set of tuples $(a_1, a_2, \ldots, a_n)$ having the property that $\Delta^* \vdash R^n(a_1, a_2, \ldots, a_n)$. Propositional symbols p are assigned the value T or F depending on whether Δ* ⊢ p.

Claim 2. For each cf φ of $L \cup \bigcup_i P(i)$, the value assigned to φ is T if and only if Δ* ⊢ φ, and it is in agreement with the semantic value of φ.

Proof of Claim 2. The proof is by structural induction on sub-formulae. For formulae χ of type φ ⊃ ψ, we need to consider some cases.

(i) The assigned value of ψ is T; hence, Δ* ⊢ ψ. By the axiom scheme (2) and (MP), Δ* ⊢ φ ⊃ ψ, hence the value assigned to χ is T, which is the semantic value of χ.

(ii) The assigned value of φ is F, hence Δ* ⊢ φ does not hold; then, by the maximality of Δ*, Δ* ∪ {φ} ⊢ f, and by the Deduction rule Δ* ⊢ φ ⊃ f, and, by the scheme (9) and (MP), Δ* ⊢ φ ⊃ ψ, hence the value assigned to χ is T, which agrees with the semantic value of χ.

(iii) The assigned values are T for φ and F for ψ; hence, Δ* ⊢ φ and it is not true that Δ* ⊢ ψ, i.e., as in (ii), Δ* ⊢ ψ ⊃ f. By the scheme (10) and (MP) applied twice, Δ* ⊢ (φ ⊃ ψ) ⊃ f. Were Δ* ⊢ φ ⊃ ψ true, we would obtain by (MP) that Δ* ⊢ f, which is impossible as Δ* is consistent. It follows that the assigned value of χ is F and this agrees with the semantic value of χ.

For a formula χ of type ∀x.φ, suppose that Δ* ⊢ χ. Then, by the axiom scheme (6) and (MP), Δ* ⊢ φ(x) for each x, hence, by the hypothesis of induction, the assigned value of φ(x) is T for each x, and this means that χ has the semantic value T, which agrees with the assigned value. On the other hand, if Δ* ⊢ χ does not hold, then Δ* ⊢ χ ⊃ f, and, by the scheme (12) and (MP), Δ* ⊢ ∃x.(φ ⊃ f). It follows that for an appropriate $p_{ij}$, after the substitution $x/p_{ij}$ into φ, the resulting formula $\varphi(p_{ij})$ has the property that $\Delta^* \vdash \varphi(p_{ij}) \supset f$. This excludes the possibility that $\Delta^* \vdash \varphi(p_{ij})$, as such a fact would imply by (MP) that Δ* ⊢ f, which is impossible. Thus, φ(x) is not satisfied for some x, hence the formula χ has the semantic value F, which agrees with the assigned value.

Suppose that the cardinality of the alphabet of L is κ and that λ is the first ordinal of that cardinality; replace in the above proof countable sequences with transfinite sequences defined on the set λ, and well-order, by the Zermelo theorem, sets of cf formulae and sets of parameters. The result obtained is the generalization of the Löwenheim-Skolem theorem: any consistent set Γ of cf formulae has a model of the cardinality κ of the set of its symbols.

The conclusion: though we know already that satisfiability of consistent sets of L means completeness, yet it would be only fair to allow Master to conclude His proof, especially as it is a bit different from the one already used by us. Suppose that a cf formula φ of L is valid. Thus, φ ⊃ f is false, hence not satisfiable, hence not consistent. It follows that (φ ⊃ f) ⊢ f, hence, by the Deduction rule, ⊢ (φ ⊃ f) ⊃ f, and by the axiom scheme (4) along with (MP), the conclusion ⊢ φ follows. The case of a valid wf formula φ follows by passing to the closure $\bar{\varphi}$, which is valid, hence provable: $\vdash \bar{\varphi}$. From there, we prove ⊢ φ by means of the axiom scheme (6) and (MP).
3.16 The Tarski Theorem on Inexpressibility of Truth

Deep philosophical and ontological consequences of the theorems of Tarski and Gödel force their discussion at this point. In Sect. 1.7, a scheme has been included for those theorems, which is now going to be filled with details in the framework of a formal system of arithmetic. We are influenced by the exposition in Smullyan [34].

Definition 3.38 (Syntax of the arithmetic system $L_E$) The alphabet of the system $L_E$ consists of the following symbols:

0 ′ o , v ( ) ≤ = # ¬ ⊃ ∀

The intended meaning of the symbols is as follows: 0 explains itself; the prime ′ denotes the successor function, hence 1 is 0′, 2 is 0′′ and so on; o means operation, and the comma serves to denote arithmetic operations: o, is +, o,, is ·, o,,, is exp (exponentiation). The same device is applied to the symbol v standing for 'individual variable': v, stands for $v_1$, v,, for $v_2$ and so on. The meanings of = and ≤ are obvious. The remaining symbols are those for logical connectives and the universal quantifier; this form of symbolics comes from Smullyan [34]. By this usage of primes and commas, the countably infinite universe is encoded by means of 13 symbols.

The notion of a term requires an explication. We recall that the shortcut cf means closed formula.

(1) variables and numerals are terms;
(2) if $t_1, t_2$ are terms, then $t_1'$, $t_1 + t_2$, $t_1 \cdot t_2$, $exp(t_1, t_2)$ are terms.

Formulae are defined by structural induction as follows:
(3) an atomic formula is any expression of the form $t_1 \leq t_2$ or of the form $t_1 = t_2$, for terms $t_1, t_2$;
(4) if φ, ψ are formulae, then φ ⊃ ψ, ¬φ, $\forall v_i.\varphi$ are formulae for each variable $v_i$.

As before, any occurrence of any $v_i$ in terms and non-quantified formulae is free, and each occurrence of $v_i$ in a formula $\forall v_i.\varphi$ is bound. A formula is closed (abbreviated cf) if all variables in it are bound.

The formal method of introducing numerals, with possibly many primes appended to 0, is impractical, and we return to the symbolics known from Ch. I: we denote the numeral of the natural number n as $\bar{n}$. For a formula $P(v_1, v_2, \ldots, v_k)$ we denote by $P(\bar{n}_1, \bar{n}_2, \ldots, \bar{n}_k)$ the result of the substitution of $\bar{n}_1, \ldots, \bar{n}_k$ for, respectively, $v_1, v_2, \ldots, v_k$; clearly, indices at v's may be like $i_1, i_2$, and so on, and $\bar{n}_1, \ldots, \bar{n}_k$ is the space-saving notation for $(\bar{n}_1, \ldots, \bar{n}_k)$.

A closed term is any term with no occurrences of variables. Such terms are: $\bar{n}$ designating the natural number n, for each n; $\bar{n}_1 + \bar{n}_2$ designating $n_1 + n_2$; $\bar{n}_1 \cdot \bar{n}_2$ designating $n_1 \cdot n_2$; $exp(\bar{n}_1, \bar{n}_2)$ designating $n_1^{n_2}$; and $\bar{n}'$ designating n + 1.

Definition 3.39 (Semantics of $L_E$) We recall the notion of the size of a formula as the number of logical connectives and quantifiers in it. We define the notion of truth by structural induction. Let c denote any closed term. Then we have:

(5) an atomic cf $c_1 = c_2$ is true if and only if $c_1$ and $c_2$ designate the same natural number n;
(6) an atomic cf $c_1 \leq c_2$ is true if and only if $c_i$ designates a natural number $n_i$ for i = 1, 2 and $n_1 \leq n_2$;
(7) a cf ¬φ is true if and only if the cf φ is not true;
(8) a cf φ ⊃ ψ is true if and only if either φ is not true or ψ is true;
(9) a cf $\forall v_i.\varphi$ is true if and only if for each natural number n, the cf $\varphi(\bar{n})$ is true.

The notion of a valid formula follows: $\varphi(v_1, v_2, \ldots, v_k)$ is valid if and only if $\varphi(\bar{n}_1, \bar{n}_2, \ldots, \bar{n}_k)$ is true for each tuple $(n_1, n_2, \ldots, n_k)$.
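Clauses (5)-(8) of the truth definition can be turned into a small evaluator for closed quantifier-free formulae. The sketch below assumes an encoding by nested tuples with the hypothetical tags "succ", "add", "mul", "exp", "eq", "le", "not", "imp", and it represents the numeral of n simply by the Python integer n. Clause (9) quantifies over all natural numbers and is left out on purpose, since it cannot be checked by finite enumeration.

```python
def value(term):
    """Natural number designated by a closed term."""
    if isinstance(term, int):          # the numeral of n
        return term
    tag = term[0]
    if tag == "succ":                  # t' designates the successor
        return value(term[1]) + 1
    left, right = value(term[1]), value(term[2])
    if tag == "add":
        return left + right
    if tag == "mul":
        return left * right
    if tag == "exp":                   # exp(t1, t2) designates t1 to the power t2
        return left ** right
    raise ValueError(f"unknown term tag: {tag}")


def is_true(formula):
    """Truth of a closed quantifier-free formula, clauses (5)-(8)."""
    tag = formula[0]
    if tag == "eq":
        return value(formula[1]) == value(formula[2])
    if tag == "le":
        return value(formula[1]) <= value(formula[2])
    if tag == "not":
        return not is_true(formula[1])
    if tag == "imp":
        return (not is_true(formula[1])) or is_true(formula[2])
    raise ValueError(f"unknown formula tag: {tag}")


# (2 + 3 <= exp(2, 2)) implies (0 = 1): true, because the antecedent is false.
sample = ("imp", ("le", ("add", 2, 3), ("exp", 2, 2)), ("eq", 0, 1))
print(is_true(sample))
```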
Substitution in $L_E$ is governed by the following rules. We assume the formula $\varphi(v_1)$ and consider the substitution of $v_i$ for $v_1$.

(10) if the variable $v_i$ is not bound in $\varphi(v_1)$, then we substitute $v_i$ for each free occurrence of $v_1$ in φ;
(11) if $v_i$ has bound occurrences in φ, then we pick the first $v_j$ not in φ, we substitute $v_j$ for all occurrences of $v_i$ in φ, and into the obtained formula we substitute $v_i$ for free occurrences of $v_1$ to obtain the formula $\varphi(v_i)$.

This definition extends to formulae of several variables. The symbol $\varphi(v_1)$ will denote a formula with the only free variable $v_1$.

Definition 3.40 (Formulae expressing sets of natural numbers) A formula $\varphi(v_1)$ expresses a set Q of natural numbers if and only if the following equivalence holds: $\varphi(\bar{n})$ is true if and only if n ∈ Q.
In the case of a formula of more than one free variable, a formula $\varphi(v_1, v_2, \ldots, v_k)$ expresses a relation Q of arity k on k-tuples of natural numbers if and only if the following equivalence holds: $\varphi(\bar{n}_1, \bar{n}_2, \ldots, \bar{n}_k)$ is true if and only if $Q(n_1, n_2, \ldots, n_k)$ holds.

Definition 3.41 (Arithmetic sets and relations) A set of natural numbers, or a relation on them, is Arithmetic if it is expressed by a formula of the language $L_E$; in case the formula has no symbol exp, the set or relation is said to be arithmetic. This notion extends to functions: a function $f(n_1, n_2, \ldots, n_k)$ is Arithmetic if and only if the relation $Q(n_1, n_2, \ldots, n_k, n_{k+1})$, which holds if and only if $f(n_1, n_2, \ldots, n_k) = n_{k+1}$, is Arithmetic.

Now, we introduce a Gödel style numbering of symbols, terms and formulae of $L_E$. This numbering differs from the Gödel numbering in Chap. 1, as we do not use prime numbers; instead, it is constructed at a fixed base. Such a numbering with 9 symbols was proposed in Quine [35], and in Smullyan [34] it was modified to 13 symbols. We present this numbering.

Definition 3.42 (Gödel numbering) We consider a base b greater than 1 and define the concatenation $m \circ_b n$ as m expressed in base b followed by n expressed in base b. For example, let b = 2, m = 32, n = 16; hence, m in base 2 is 100000, n in base 2 is 10000, and $m \circ_2 n$ = 10000010000 = 1040. Let us observe that the length of n in base 2 is 5, and $32 \cdot 2^5 + 16 = 1040$. The general formula is $m \circ_b n = m \cdot b^{|n|_b} + n$, where $|n|_b$ is the length of n in base b. We will use symbols x, y, ... as variables for natural number arguments.

Theorem 3.32 The following relations are Arithmetic, including the relation $x \circ_b y = z$. Alongside the relations, proofs that they are Arithmetic are included:

(i) let E(x, y, b) denote that x is b to the power of y. This relation is Arithmetic: $E(x, y, b) \equiv x = b^y$;
(ii) the relation E(x, b): 'x is a power of b' is Arithmetic: $E(x, b) \equiv \exists y.\, x = b^y$.
Let us observe that, for x > 0, $b^{|x|_b}$ is the smallest power of b greater than x;

(iii) the relation gr(x, y), read 'y is the smallest power of b greater than x', is Arithmetic: $gr(x, y) \equiv E(y, b) \wedge \neg(x = y) \wedge x \leq y \wedge \forall z.((E(z, b) \wedge \neg(x = z) \wedge x \leq z) \supset y \leq z)$;
(iv) the relation $b^{|x|_b} = y$ is Arithmetic: $b^{|x|_b} = y \equiv [(x = 0 \wedge y = b) \vee (\neg(x = 0) \wedge gr(x, y))]$;
(v) $x \circ_b y = z \equiv x \cdot b^{|y|_b} + y = z$ is Arithmetic.

By induction, $x_1 \circ_b x_2 \circ_b \ldots \circ_b x_n = y$ is Arithmetic.

Definition 3.43 (Numbering of primitive symbols of $L_E$) We assign, following the scheme in Smullyan [34], to primitive symbols of the language $L_E$ numbers in the following way (we present this assignment in pairs [language symbol, assigned value]):

[0 1], [′ 0], [( 2], [) 3], [o 4], [, 5], [v 6], [¬ 7], [⊃ 8], [∀ 9], [= 10], [≤ 11], [# 12],

where the values 10, 11, 12 are treated as single digits in base 13; the digit 12 assigned to # is written X.
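The concatenation of Definition 3.42 and the observation that $b^{|x|_b}$ is the smallest power of b greater than x are easy to check numerically. The following sketch uses the hypothetical function names `length`, `concat` and `smallest_power_greater`; the convention $|0|_b = 1$ is the one implied by case (iv) of Theorem 3.32.

```python
def length(n, b):
    """Number of digits of n in base b; by convention the length of 0 is 1."""
    digits = 1
    while n >= b:
        n //= b
        digits += 1
    return digits


def concat(m, n, b):
    """m followed by n in base b, i.e. m * b**|n|_b + n."""
    return m * b ** length(n, b) + n


def smallest_power_greater(x, b):
    """Smallest power of b that is strictly greater than x."""
    power = 1
    while power <= x:
        power *= b
    return power


print(concat(32, 16, 2))                                   # worked example of Definition 3.42
print(13 ** length(200, 13), smallest_power_greater(200, 13))
```

For x = 0 the two quantities differ (the length convention gives b, while the smallest power of b greater than 0 is 1), which is why case (iv) treats x = 0 separately.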
In order to assign the Gödel number to a string of symbols, we first represent the string by the sequence of values of consecutive symbols in the listing (A) and then we read the obtained sequence in base 13. For instance, the natural number n is represented as the string 0′...′ of 0 followed by n accent signs, which is represented as the sequence of 1 followed by n zeros, i.e., in base 13 as $13^n$. By $E_0$ we denote the expression ′; for n > 0, $E_n$ denotes the expression with the Gödel number n. It follows that the concatenation $E_n E_m$ of expressions has the Gödel number $n \circ_{13} m$.

Definition 3.44 (Gödel sentences) A Gödel sentence for a set Q of natural numbers is a closed formula (for short: cf) φ of $L_E$ with the property that if φ is true then its Gödel number is in Q, and if φ is false then its Gödel number is not in Q.

Following Tarski and Smullyan, we consider the formula $\varphi(v_1)$, where $v_1$ is the only free variable in φ, and we define the cf $\varphi(\bar{n})$. The latter is equivalent to the formula $\forall v_1.(v_1 = \bar{n} \supset \varphi(v_1))$, denoted in what follows $\varphi[\bar{n}]$. We generalize the latter notation: for any expression E, the symbol $E[\bar{n}]$ will denote the expression $\forall v_1.(v_1 = \bar{n} \supset E)$, which is a formula if E is a formula. We define a function Λ whose value Λ(x, y) is the Gödel number of the expression $E_x[\bar{y}]$, which is $\forall v_1.(v_1 = \bar{y} \supset E_x)$. We recall the symbol gn(...) meaning 'the Gödel number of (...)'.

Theorem 3.33 The function Λ(x, y) is Arithmetic.

Proof Indeed, $\Lambda(x, y) = k \circ_{13} 13^y \circ_{13} 8 \circ_{13} x \circ_{13} 3$, where k is the Gödel number of the initial string $\forall v_1.(v_1 =$; hence, the relation Λ(x, y) = z is Arithmetic.

Definition 3.45 (The diagonal function) The diagonal function is d(x) = Λ(x, x).

Theorem 3.34 The diagonal function is Arithmetic.

Proof Indeed, $d(x) = gn(E_x[\bar{x}])$.
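The base-13 numbering and the function Λ can be tried out on concrete strings. The sketch below rests on several assumptions that are not fixed by the text: the prime is typed as the ASCII apostrophe, the symbols =, ≤, # receive the digits 10, 11, 12 as in Smullyan's scheme, and the initial string of $E[\bar{n}]$ is spelled "∀v,(v,=" with v, standing for $v_1$. All function names are hypothetical.

```python
# Assumed digit table (apostrophe stands for the prime symbol).
DIGIT = {"0": 1, "'": 0, "(": 2, ")": 3, "o": 4, ",": 5, "v": 6,
         "¬": 7, "⊃": 8, "∀": 9, "=": 10, "≤": 11, "#": 12}
PREFIX = "∀v,(v,="      # illustrative spelling of the initial string of E[n]


def gn(expression):
    """Goedel number: the digits of the symbols read as a base-13 numeral."""
    number = 0
    for symbol in expression:
        number = 13 * number + DIGIT[symbol]
    return number


def length13(n):
    digits = 1
    while n >= 13:
        n //= 13
        digits += 1
    return digits


def concat13(m, n):
    return m * 13 ** length13(n) + n


def numeral(n):
    """The numeral of n: the symbol 0 followed by n primes."""
    return "0" + "'" * n


def lam(x, y):
    """Lambda(x, y) computed by concatenation, as in Theorem 3.33."""
    result = gn(PREFIX)
    for part in (13 ** y, 8, x, 3):   # numeral of y, the symbol ⊃, E_x, the symbol )
        result = concat13(result, part)
    return result


def diagonal(x):
    return lam(x, x)


expression = "v,=v,"                  # a sample expression E_x
x = gn(expression)
print(gn(numeral(3)), lam(x, 2))
```

The identity between `lam` and the directly computed Gödel number relies on the expression $E_x$ not starting with the prime, whose digit 0 would be lost as a leading zero.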
For a set Q of natural numbers, we let Q* denote the set $d^{-1}(Q)$, i.e., n ∈ Q* if and only if d(n) ∈ Q.

Theorem 3.35 For an Arithmetic set Q, the set Q* is Arithmetic.

Proof Indeed, x ∈ Q* ≡ ∃y.(d(x) = y ∧ y ∈ Q). Let Q be expressed by the formula $\varphi(v_1)$. The relation d(x) = y is Arithmetic, hence it is expressed by a formula, say, $\delta(v_1, v_2)$. Then Q* is expressed by the formula $\exists v_2.(\delta(v_1, v_2) \wedge \varphi(v_2))$.

Theorem 3.36 For each Arithmetic set Q there exists a Gödel sentence.
Proof Suppose that Q is Arithmetic; then Q* is Arithmetic, hence there exists a formula expressing Q*. Let it be $E(v_1)$ and let e be its Gödel number. Then $E[\bar{e}]$ is true if and only if e ∈ Q*, i.e., if and only if d(e) ∈ Q. As d(e) is $gn(E[\bar{e}])$, the sentence $E[\bar{e}]$ is a Gödel sentence for Q.

Let T be the set of Gödel numbers of all true Arithmetic sentences (cfs) of $L_E$. The theorem by Tarski [1] settles the question of whether T is Arithmetic in the negative.

Theorem 3.37 (Tarski). The set T is not Arithmetic.

Proof For an Arithmetic set Q, its complement −Q is Arithmetic: if $\varphi(v_1)$ expresses Q, then $\neg\varphi(v_1)$ expresses −Q. Were T Arithmetic, −T would be Arithmetic too and it would have a Gödel sentence; but, as −T consists of Gödel numbers of untrue sentences, such a Gödel sentence would be true if and only if it were untrue, a contradiction, showing that the set T is not Arithmetic.
3.17 Gödel Incompleteness Theorems

We continue with the setting from Sect. 3.16. We address the Gödel incompleteness theorem (Gödel [36]) following Smullyan [34]. We begin with the axiomatization of Arithmetic known as Peano Arithmetic. As in Sect. 3.16, we include exponentiation among arithmetic operations, we keep the system of Gödel numbering adopted there, and we denote the system to be presented as L_PE. In this case we meet both logical axioms for 1st-order logic FO and arithmetic axioms. We begin with logical axioms due to Kalish and Montague [37].

Definition 3.46 (Axiom schemes for L_PE)
Logical axiom schemes
(A1) (p ⊃ (q ⊃ p));
(A2) ((p ⊃ (q ⊃ r)) ⊃ ((p ⊃ q) ⊃ (p ⊃ r)));
(A3) ((¬q ⊃ ¬p) ⊃ (p ⊃ q));
(A1)-(A3) are axiom schemes for PL.
(A4) (∀vi.(φ ⊃ ψ) ⊃ ((∀vi.φ) ⊃ ∀vi.ψ));
(A5) (φ ⊃ ∀vi.φ) if vi has no occurrence in φ;
(A6) (∃vi.(vi = t)) provided vi has no occurrence in the term t;
(A7) (vi = t) ⊃ (Φ1 ⊃ Φ2), where Φ1, Φ2 are atomic formulae, Φ1 is an arbitrary φ1 vi φ2, and Φ2 is φ1 t φ2: it follows that only one occurrence of vi can be replaced by t.
Arithmetic axiom schemes (v′ denotes the successor of v)
(A8) ((v1′ = v2′) ⊃ (v1 = v2));
(A9) (¬(0 = v1′));
(A10) ((v1 + 0) = v1);
(A11) ((v1 + v2′) = (v1 + v2)′);
(A12) ((v1 · 0) = 0);
(A13) ((v1 · v2′) = (v1 · v2 + v1));
(A14) ((v1 ≤ 0) ≡ (v1 = 0));
(A15) ((v1 ≤ v2′) ≡ (v1 ≤ v2 ∨ v1 = v2′));
(A16) (v1 ≤ v2 ∨ v2 ≤ v1);
(A17) (exp(v1, 0) = 0′);
(A18) (exp(v1, v2′) = (exp(v1, v2) · v1));
(A19) (φ(0) ⊃ (∀v1.(φ(v1) ⊃ φ(v1′)) ⊃ ∀v1.φ(v1))) (the principle of mathematical induction).
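As an informal illustration (not part of the formal system), the recursion equations among the arithmetic axioms can be read as a recipe for computing +, ·, exp and ≤ from 0 and the successor alone. The sketch below assumes the usual reading of these axioms with the successor symbol, e.g. (v1 + v2′) = (v1 + v2)′; the Python names succ, add, mul, power and leq are illustrative and do not come from the text.

```python
def succ(n: int) -> int:
    """The successor n′ of n."""
    return n + 1


def add(x: int, y: int) -> int:
    # (A10) x + 0 = x;  (A11) x + y′ = (x + y)′
    return x if y == 0 else succ(add(x, y - 1))


def mul(x: int, y: int) -> int:
    # (A12) x · 0 = 0;  (A13) x · y′ = x · y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)


def power(x: int, y: int) -> int:
    # (A17) exp(x, 0) = 0′;  (A18) exp(x, y′) = exp(x, y) · x
    return succ(0) if y == 0 else mul(power(x, y - 1), x)


def leq(x: int, y: int) -> bool:
    # (A14) x ≤ 0 ≡ x = 0;  (A15) x ≤ y′ ≡ (x ≤ y ∨ x = y′)
    return x == 0 if y == 0 else (leq(x, y - 1) or x == y)
```

Each function unfolds exactly one axiom per recursive call, so the sketch is only practical for small arguments.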
A formula is refutable if and only if its negation is provable. Our discussion assumes that the axiomatization is sound: what is provable is valid.

Definition 3.47 (Gödel numbering for L_PE) We recall from Sect. 3.16 the notation x ◦b y and the base b = 13. We go now a bit further into this topic. We denote by the symbol (x)b the representation of the number x in base b. We define the predicate pre(x, y) which holds if the representation (x)b is a prefix of the representation (y)b; similarly, suf(x, y) holds when (x)b is a suffix of (y)b.

Lemma 3.1 The following relations are Arithmetic. We recall that exp(x, y) denotes the expression x^y.
(1) pre(x, y) and suf(x, y):
pre(x, y) ≡ (x = y) ∨ (x ≠ 0 ∧ ∃z ≤ y.∃w ≤ y.(E(w, b) ∧ (x · w) ◦b z = y));
suf(x, y) ≡ (x = y) ∨ (∃z ≤ y.z ◦b x = y);
(2) the relation of a segment: seg(x, y) ≡ ∃z ≤ y.(pre(z, y) ∧ suf(x, z));
(3) sequences-related relations: from a standard k-tuple (φ1, ..., φk), we induce the sequence written down formally as the string σ: #φ1#φ2# ... #φk# and gn(σ), which is called the sequence number. This new construct induces some new relations. We recall that gn(#) = X. Relations are:
(a) Seq(x), x is gn(σ) for some σ: Seq(x) ≡ pre(X, x) ∧ suf(X, x) ∧ (X ≠ x) ∧ ¬seg(XX, x) ∧ ∀y ≤ x.(seg(X0y, x) ⊃ pre(X, y));
(b) In(x, y), x is an element of a sequence σ with gn(σ) = y: In(x, y) ≡ Seq(y) ∧ seg(XxX, y) ∧ ¬seg(X, x);
(c) Prec(x, y, z), z = gn(σ), x, y in σ, x precedes y in σ: Prec(x, y, z) ≡ In(x, z) ∧ In(y, z) ∧ ∃w.(pre(w, z) ∧ In(x, w) ∧ ¬In(y, w)).
(4) relations which define terms and formulae: the relation F(t1, t2, t3) holds if and only if t3 is one of t1 + t2, t1 · t2, exp(t1, t2), t1′. Then, an expression t is a term if and only if there exists a sequence t1, t2, ..., tk containing t, where each ti is a variable or a numeral or F(tj, tl, ti) holds with j, l < i. For formulae, we let G(φ1, φ2, φ3) hold in cases where φ3 is one of ¬φ1, ¬φ2, φ1 ⊃ φ2, or, for some vi, ∀vi.φ1, ∀vi.φ2. A φ is a formula if it is an element of a sequence φ1, φ2, ..., φk with each φi either an atomic formula or such that G(φj, φl, φi) holds with j, l < i.
These relations are Arithmetic; proof is given below. For an expression E, the symbol E_x denotes that x = gn(E).

We embark on the second part of arithmetization of L_PE. Forming sequences serves the purpose of replacing inductive definitions of terms or formulae with formal proof-like definitions. Let F(t1, t2, t3) be a relation which holds if t3 is either t1 + t2, or t1 · t2, or exp(t1, t2), or t1′. Then we say that a sequence t1, t2, ..., tn is a term-forming sequence if each ti is either a variable or a numeral, or there exists an instance F(tk, tl, ti) with k, l < i. The expression t is a term if there exists a term-forming sequence with t as its element. We treat formulae in a similar way: let G(φ1, φ2, φ3) hold if and only if φ3 is one of: ¬φ1, or ¬φ2, or φ1 ⊃ φ2, or ∀vi.φ1, or ∀vi.φ2, for some vi. A formula-forming sequence is a sequence φ1, φ2, ..., φm such that each φi is an atomic formula or it satisfies G(φk, φl, φi) for some k, l < i. A φ is a formula if it is an element of a formula-forming sequence.

We list arithmetic and logical expressions and symbols for their Gödel numbers. We consider expressions E_x, E_y and we list Gödel numbers for:
(i) gn(E_x ⊃ E_y) is denoted gn_⊃(x, y);
(ii) gn(¬E_x) is denoted gn_¬(x);
(iii) gn(E_x + E_y) is denoted gn_+(x, y);
(iv) gn(E_x · E_y) is denoted gn_·(x, y);
(v) gn(exp(E_x, E_y)) is denoted gn_exp(x, y);
(vi) gn(E_x′) is denoted gn_′(x);
(vii) gn(E_x = E_y) is denoted gn_=(x, y);
(viii) gn(E_x ≤ E_y) is denoted gn_≤(x, y);
(ix) gn(z = o(x, y)) is denoted gn_o(x, y).
Expressions (i)–(ix) are Arithmetic, for instance, gn_=(x, y) = 2 ◦13 x ◦13 η ◦13 y ◦13 3.

Now, we continue the list of relations necessary for the proof. In order to facilitate the task of arithmetization, we borrow a trick: instead of In(x, y) we will write xεy. This will allow us to write ∀yεx in some formulae below, instead of wrestling with formulae of the form ∀y.In(y, x) ∧ ..., which would call for ⊃.
(5) ssub(x), x is a sequence of commas (subscripts) (recall that gn(,) = 5): (∃y.seg(y, x)) ⊃ seg(5, x);
(6) var(x), E_x is a variable: ∃y.(ssub(y) ∧ x = 2 ◦13 6 ◦13 y ◦13 3);
(7) num(x), E_x is a numeral: exp(x, 13);
(8) F(x, y, z), the relation F(E_x, E_y, E_z) holds: z = gn_+(x, y) ∨ z = gn_·(x, y) ∨ z = gn_exp(x, y) ∨ z = gn_′(x);
(9) Fseq(x), E_x is a term-forming sequence: Seq(x) ∧ ∀yεx.(var(y) ∨ num(y) ∨ ∃z, w.(Prec(w, y, x) ∧ F(z, w, y)));
(10) term(x), E_x is a term: ∃y.(Fseq(y) ∧ xεy);
(11) aform(x), E_x is an atomic formula: ∃y.∃z.(term(y) ∧ term(z) ∧ [(x = gn_=(y, z)) ∨ (x = gn_≤(y, z))]);
(12) fall(x, y), E_y is E_x quantified universally: ∃z.(var(z) ∧ y = 9 ◦13 z ◦13 x);
(13) G(x, y, z), the relation G(E_x, E_y, E_z) holds: z = gn_⊃(x, y) ∨ z = gn_¬(x) ∨ fall(x, z);
(14) Gseq(x), E_x is a formula-forming sequence: Seq(x) ∧ ∀yεx.(aform(y) ∨ ∃z, w.(Prec(z, w, y) ∧ G(z, w, y)));
(15) form(x), E_x is a formula: ∃y.(Gseq(y) ∧ xεy);
(16) mp(x, y, z), E_z follows from E_x and E_y by means of (MP), i.e., E_y is E_x ⊃ E_z: y = gn_⊃(x, z);
(17) der(x, y, z), E_z follows from E_x and E_y by means of (MP) or from E_x by means of fall(x, z): mp(x, y, z) ∨ fall(x, z);
(18) proof(x), E_x is a proof: Seq(x) ∧ ∀yεx.(A(y) ∨ ∃z, wεx.(Prec(z, w, y) ∧ der(z, w, y)));
(19) prov(x), E_x is provable: ∃y.(proof(y) ∧ xεy);
(20) ref(x), E_x is refutable: prov(gn_¬(x)).
The term A(y) denotes that y is the Gödel number of an instance of an axiom scheme. We leave the arithmetization of axiom schemes off. The Reader may fill this gap in at their will, and we assume that A(y) is arithmetized.
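The relations pre, suf and seg of Lemma 3.1 can be tried out numerically. The following is a minimal sketch, assuming that x ◦b y concatenates the base-b digit strings of x and y, that E(w, b) means "w is a power of b", and that the first disjunct of pre requires x ≠ 0; the helper names length, concat and powers_upto are illustrative. Bounded quantifiers are realized by brute-force search, so only small numbers are practical.

```python
B = 13  # base of the Gödel numbering recalled in the text


def length(y: int, b: int = B) -> int:
    """Number of base-b digits of y (the number 0 has one digit)."""
    n = 1
    while y >= b:
        y //= b
        n += 1
    return n


def concat(x: int, y: int, b: int = B) -> int:
    """x ◦b y: the digits of x followed by the digits of y, read in base b."""
    return x * b ** length(y, b) + y


def powers_upto(y: int, b: int = B):
    """All powers of b not exceeding y (assumed reading of E(w, b))."""
    w = 1
    while w <= y:
        yield w
        w *= b


def pre(x: int, y: int, b: int = B) -> bool:
    """(x = y) ∨ (x ≠ 0 ∧ ∃z ≤ y ∃w ≤ y (E(w, b) ∧ (x·w) ◦b z = y))."""
    if x == y:
        return True
    return x != 0 and any(
        concat(x * w, z, b) == y
        for w in powers_upto(y, b)
        for z in range(y + 1)
    )


def suf(x: int, y: int, b: int = B) -> bool:
    """(x = y) ∨ ∃z ≤ y (z ◦b x = y)."""
    return x == y or any(concat(z, x, b) == y for z in range(y + 1))


def seg(x: int, y: int, b: int = B) -> bool:
    """∃z ≤ y (pre(z, y) ∧ suf(x, z))."""
    return any(pre(z, y, b) and suf(x, z, b) for z in range(y + 1))


# 432 has the base-13 digits 2, 7, 3
print(pre(2, 432), suf(3, 432), seg(7, 432), seg(5, 432))
```

The factor w in pre accounts for digit strings in which the part following the prefix begins with zeros, e.g. 169 has the digits 1, 0, 0 and pre(1, 169) is witnessed by w = 13, z = 0.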
It follows that all relations listed above are Arithmetic. We denote by the symbol P the set of all Gödel numbers of provable formulae of L_PE and the symbol R will stand for the set of Gödel numbers of refutable formulae.

Theorem 3.38 (Gödel [36]) The system L_PE is incomplete.

Proof As sets P and R are Arithmetic, both are expressed in L_PE: P by a formula, say, π(v1), R by a formula, say, ρ(v1). The formula ¬π(v1) expresses the complement −P. By Theorem 3.35, there exists a formula Q_P(v1) which expresses the set (−P)∗. For its Gödel number q_p, the diagonalization Q_P(q_p) is a Gödel sentence for the set −P, i.e., it is true if and only if it is not provable. Were Q_P(q_p) false, it would be provable, contrary to soundness; hence, Q_P(q_p) is true but not provable. The sentence ¬Q_P(q_p) is false and, by soundness, not provable.
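The chain of equivalences behind the proof of Theorem 3.38 can be spelled out as follows; the overline marking the numeral of q_p is added here for readability only, and d and gn are the diagonal function and the Gödel numbering of Sect. 3.16.

```latex
\begin{align*}
Q_P(\overline{q_p}) \text{ is true}
  &\iff q_p \in (-P)^{*} \\
  &\iff d(q_p) \in -P \\
  &\iff gn\bigl(Q_P(\overline{q_p})\bigr) \notin P \\
  &\iff Q_P(\overline{q_p}) \text{ is not provable.}
\end{align*}
```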
This proof which we owe to Smullyan [34] is probably the simplest. The original proof in Gödel [36] was based on the notion of ω-consistency: a system is ω-inconsistent if there exists a formula φ(x) such that the formula ∃x.φ(x) is provable but for each n, the formula φ(n) is refutable. A system is ω-consistent if it is not ω-inconsistent. Gödel’s formulation of incompleteness theorem was: ‘if Peano Arithmetic is ω-consistent, then it is incomplete’. Rosser [38] proved incompleteness of Peano Arithmetic under assumption of mere consistency.
3.18 The Rosser Incompleteness Theorem. The Rosser Sentence

We consider an axiomatic system Σ about which it is assumed that it is consistent.

Definition 3.48 (Representation) A formula Φ represents a set A if and only if A = {n : Φ(n̄) is provable}. We recall that the symbol n̄ encodes the natural number n in Σ.

Definition 3.49 We adopt or recall the following notation:
(P) the symbol P denotes the set of Gödel numbers of provable formulae;
(R) the symbol R denotes the set of Gödel numbers of refutable formulae (i.e., formulae whose negations are provable);
(E_n) the symbol E_n denotes the expression E whose Gödel number is n;
(E_n(n̄)) the symbol E_n(n̄) denotes the diagonalization of E_n, whose Gödel number is d(n) (cf. the diagonal function in Sect. 3.16);
(P∗) the symbol P∗ denotes the set {n : E_n(n̄) is provable};
(R∗) the symbol R∗ denotes the set {n : E_n(n̄) is refutable}.

Definition 3.50 (Separability) For sets A, B, a formula Ψ(v1) separates A from B if and only if for each n ∈ A the formula Ψ(n̄) is provable and for each n ∈ B the formula Ψ(n̄) is refutable.

Definition 3.51 (Enumerability) A set A is enumerable if and only if there exists a formula A(x, y) such that n ∈ A if and only if there exists k such that A(n̄, k̄) is provable.

We recall after Smullyan [34] an axiomatization due to R. Robinson [39].

Definition 3.52 (The Robinson axiomatization for Σ) Axiom schemes (I)–(V) are the following:
(I) the scheme n̄ + m̄ = k̄ for all triples (n, m, k) such that n + m = k;
(II) the scheme n̄ · m̄ = k̄ for all triples (n, m, k) such that n · m = k;
(III) the scheme ¬(n̄ = m̄) for all pairs n, m of distinct numbers;
(IV) the scheme v1 ≤ n̄ ≡ (v1 = 0̄ ∨ v1 = 1̄ ∨ ... ∨ v1 = n̄);
(V) the scheme (v1 ≤ n̄) ∨ (n̄ ≤ v1).

The first incompleteness result is the following (cf. [34]).

Theorem 3.39 Under assumptions about Σ, if a formula ¬F(v1) represents the set P∗ and f = gn(F(v1)), then F(f̄) is neither provable nor refutable.

Proof Since ¬F(v1) represents P∗, F(n̄) is refutable if and only if n ∈ P∗, for each n. In particular, F(f̄) is refutable if and only if f ∈ P∗. On the other hand, by definition of P∗, F(f̄) is provable if and only if f ∈ P∗. Thus, F(f̄) is provable if and only if it is refutable, and we have two cases: (a) F(f̄) is provable and refutable; (b) F(f̄) is neither provable nor refutable. The case (a) must be dismissed by the consistency assumption, hence, (b) remains.

Theorem 3.40 Suppose that the formula Ψ(v1) separates the set A from the set B in Σ. Then, Ψ(v1) represents a set C which contains A and is disjoint from B.

Proof The set C = {n : Ψ(n̄) is provable} contains A by definition of separability. By consistency of Σ, no formula Ψ(n̄) is both provable and refutable, hence, C cannot intersect B.
Theorem 3.41 If Ψ(v1) separates P∗ from R∗ and p is the Gödel number of Ψ(v1), then the formula Ψ(p̄) is undecidable.

Proof By Theorem 3.40 and consistency of Σ, Ψ(v1) represents a set Q of which, by symmetry with respect to P∗ and R∗, we may assume that R∗ ⊆ Q and P∗ ∩ Q = ∅. Then, for p = gn(Ψ(v1)), (i) Ψ(p̄) is provable if and only if p ∈ Q, by definition of representation; (ii) Ψ(p̄) is provable if and only if p ∈ P∗, by definition of P∗. It follows that p ∈ Q ≡ p ∈ P∗, and, as P∗ ∩ Q = ∅, the only case possible is that neither p ∈ Q nor p ∈ P∗, which implies that also p ∉ R∗. This means that Ψ(p̄) is neither provable nor refutable in Σ.

It is now desirable to have an explicit formula which separates two sets. Let us observe that for given sets A, B it is sufficient to separate sets A \ B and B \ A.

Theorem 3.42 (The Rosser separation lemma) Suppose that all instances of axiom schemes (IV) and (V) are provable in Σ and sets A, B are enumerable. Then sets A \ B and B \ A are separable.

Proof Suppose that the set A is enumerated by the formula A(x, y) and the set B is enumerated by the formula B(x, y).
Claim. The formula ∀y.(A(x, y) ⊃ ∃z ≤ y.B(x, z)) separates B \ A from A \ B.
Proof of Claim. Suppose that n ∈ B \ A; as n ∈ B, there exists k such that (i) B(n̄, k̄) is provable (in Σ). As n ∉ A, A(n̄, m̄) is refutable for each m, hence, by (IV), the sentence ∀(y ≤ k̄)¬A(n̄, y) is provable, and the formula y ≤ k̄ ⊃ ¬A(n̄, y) is provable, which is equivalent to provability of A(n̄, y) ⊃ ¬(y ≤ k̄). By (V), the formula (ii) A(n̄, y) ⊃ (k̄ ≤ y) is provable. By (i) and (ii), the formula (iii) A(n̄, y) ⊃ ((k̄ ≤ y) ∧ B(n̄, k̄)) is provable. From (iii), we obtain the provable formula (iv) A(n̄, y) ⊃ ∃z ≤ y.B(n̄, z); the inference rule (Gen) yields from (iv) the formula (v) ∀y.(A(n̄, y) ⊃ ∃z ≤ y.B(n̄, z)).
Suppose now that n ∈ A \ B, hence, there exists k such that A(n̄, k̄) is provable and for each m the formula B(n̄, m̄) is refutable. By (IV), the formula (vi) ∀(z ≤ k̄)¬B(n̄, z) is provable. This implies provability of A(n̄, k̄) ∧ ∀(z ≤ k̄)¬B(n̄, z) and provability of (vii) ¬(A(n̄, k̄) ⊃ ∃(z ≤ k̄).B(n̄, z)), hence, (viii) A(n̄, k̄) ⊃ ∃(z ≤ k̄).B(n̄, z) is refutable, and so is the sentence ∀y.(A(n̄, y) ⊃ ∃z ≤ y.B(n̄, z)), of which (viii) is an instance. Finally, the formula ∀y.(A(x, y) ⊃ ∃z ≤ y.B(x, z)) separates B \ A from A \ B.
Theorem 3.43 If the system Σ is consistent, sets P∗ and R∗ are enumerable and all instances of axiom schemes (IV) and (V) are provable in Σ, then Σ is incomplete and the incompleteness is witnessed by the Rosser sentence ∀y.(A(h̄, y) ⊃ ∃(z ≤ y).B(h̄, z)), where A(x, y) enumerates P∗ and B(x, y) enumerates R∗, and h = gn(∀y.(A(x, y) ⊃ ∃(z ≤ y).B(x, z))).

Proof By consistency of Σ, the sets P∗ and R∗ are disjoint, hence, by Theorem 3.42, they are separated by the formula ∀y.(A(x, y) ⊃ ∃(z ≤ y).B(x, z)), and, by Theorem 3.41, the sentence ∀y.(A(h̄, y) ⊃ ∃(z ≤ y).B(h̄, z)) is undecidable.
The Rosser sentence is often paraphrased as the statement ‘if the given formula is provable, then there is a shorter (i.e., with a smaller Gödel number) proof of its negation’.

Theorem 3.44 If Peano Arithmetic is consistent, then it is incomplete.
3.19 Finite FO Structures

Definition 3.53 (Finite structures. 1st-order definability) We assume a vocabulary V = {c1, c2, ..., cl, P1, P2, ..., Pm} consisting of constant symbols ci and relational symbols Pj of specified arity. The realization of V is a structure M = (D, I, A), where D is the domain of M, i.e., a non-empty set of things, I is an interpretation which sends each ci to an element c^I ∈ D, and each relational symbol P to a relation P^I ⊆ D^{a(P)}, where a(P) is the arity of P. The third ingredient is an assignment A which sends each individual variable x to an element A(x) ∈ D, for each x in a countable set of individual variables (xi)i. A structure M is finite when the domain D is finite. We recall that
(i) individual variables and constants are terms; terms are denoted as t, s, ...;
(ii) atomic formulae are expressions of the form t = s or Pi(t1, t2, ..., t_{a(Pi)}); a(Pi) is the arity of Pi;
(iii) formulae are atomic formulae and expressions φ ∨ ψ, φ ∧ ψ, φ ⊃ ψ, ¬ψ, ∀x.φ, ∃x.φ in case φ and ψ are formulae.
The notion of satisfaction under a given interpretation I and assignment A in a structure M = (D, I, A) is defined in the manner standard for FO. The definition of truth for a structure is also standard for FO. If a formula is true in a structure M, then M is a model for the formula. A formula true in all models is valid. Otherwise it is invalid.

Definition 3.54 (Pointed structures) For each (possibly infinite) sequence (di), i = 1, 2, ..., of elements of the domain D, we call the pair (M, {di : i = 1, 2, ...}), usually denoted as (M, d⃗), a pointed structure.

Definition 3.55 (Interpretations, assignments and satisfaction in pointed structures) The interpretation of an expression e in a pointed structure Md = (M, d⃗) will be denoted as e^Md. We list the rules for interpretations:
(i) for an individual variable xi: xi^Md is di;
(ii) for a constant symbol c, c^Md is c^I;
(iii) an atomic formula t = s is interpreted as t^Md = s^Md;
(iv) a relational symbol Pi is interpreted as Pi^I;
(v) an atomic formula Pi(t1, t2, ..., tn) is interpreted as Pi^I(t1^Md, t2^Md, ..., tn^Md).
The notion of satisfaction for pointed structures is defined in the standard way; the symbol x/d means substitution of an element d ∈ D for all free occurrences of the variable x in a formula.
(vi) |=Md (t = s) if and only if t^Md = s^Md holds;
(vii) |=Md Pi(t1, t2, ..., tn) if and only if (t1^Md, t2^Md, ..., tn^Md) ∈ Pi^I holds;
(viii) |=Md φ ∨ ψ if and only if either |=Md φ or |=Md ψ;
(ix) |=Md φ ∧ ψ if and only if |=Md φ and |=Md ψ;
(x) |=Md ¬φ if and only if it is not the case that |=Md φ;
(xi) |=Md ∀x.φ if and only if for each d ∈ D, |=Md φ(x/d);
(xii) |=Md ∃x.φ if and only if for some d ∈ D, |=Md φ(x/d).
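For a finite domain, the satisfaction clauses above translate directly into a recursive evaluator. The sketch below is only an illustration under assumed conventions that are not from the text: formulae are encoded as nested tuples such as ("exists", "x", ("rel", "E", ["x", "y"])), the interpretation I is a dictionary mapping constant symbols to elements and relational symbols to sets of tuples, and the assignment is a dictionary env.

```python
def value(t, I, env):
    """Interpret a term: a variable is looked up in env, a constant in I."""
    return env[t] if t in env else I[t]


def holds(phi, D, I, env=None):
    """Decide whether phi is satisfied in the finite structure (D, I) under env."""
    env = env or {}
    op = phi[0]
    if op == "eq":                       # t = s
        return value(phi[1], I, env) == value(phi[2], I, env)
    if op == "rel":                      # P(t1, ..., tn)
        return tuple(value(t, I, env) for t in phi[2]) in I[phi[1]]
    if op == "not":
        return not holds(phi[1], D, I, env)
    if op == "or":
        return holds(phi[1], D, I, env) or holds(phi[2], D, I, env)
    if op == "and":
        return holds(phi[1], D, I, env) and holds(phi[2], D, I, env)
    if op == "forall":                   # substitute every d in D for the variable
        return all(holds(phi[2], D, I, {**env, phi[1]: d}) for d in D)
    if op == "exists":                   # substitute some d in D for the variable
        return any(holds(phi[2], D, I, {**env, phi[1]: d}) for d in D)
    raise ValueError(f"unknown connective: {op}")


# A 4-cycle with a symmetric edge relation E and a constant c
D = {0, 1, 2, 3}
I = {"E": {(i, (i + 1) % 4) for i in D} | {((i + 1) % 4, i) for i in D}, "c": 0}
every_vertex_has_neighbour = ("forall", "x", ("exists", "y", ("rel", "E", ["x", "y"])))
has_loop = ("exists", "x", ("rel", "E", ["x", "x"]))
print(holds(every_vertex_has_neighbour, D, I), holds(has_loop, D, I))
```

Implication and equivalence are left out on purpose, since they reduce to the connectives already handled.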
Counterparts for φ ⊃ ψ and φ ≡ ψ follow by well-known valid equivalences: (φ ⊃ ψ) if and only if ¬φ ∨ ψ, and φ ≡ ψ if and only if (φ ⊃ ψ) ∧ (ψ ⊃ φ).

Definition 3.56 (An isomorphism between structures) Two structures M1 = (D1, I1) and M2 = (D2, I2) are isomorphic if there exists a bijective mapping h from D1 onto D2 such that
(i) for each relation P^I1 of arity n and each tuple (d1, d2, ..., dn) of elements of D1, the equivalence holds: (d1, d2, ..., dn) ∈ P^I1 if and only if (h(d1), h(d2), ..., h(dn)) ∈ P^I2;
(ii) h(c^I1) = c^I2 for each constant symbol c.
An isomorphism of structures M1 and M2 is denoted M1 ≅ M2. Isomorphic structures satisfy the same formulae: M1 |= φ(d1, d2, ..., dn) if and only if M2 |= φ(h(d1), h(d2), ..., h(dn)). In case φ is a closed formula, we have simply M1 |= φ if and only if M2 |= φ.

Definition 3.57 (Class of models Cls-mod(φ)) Yet another form of expressing the existence of an isomorphism between structures is by means of the notion of the class of models for a formula φ, denoted Cls-mod(φ). For each closed formula (sentence) φ over a relational vocabulary V, if M1 ≅ M2, then M1 ∈ Cls-mod(φ) if and only if M2 ∈ Cls-mod(φ).

Definition 3.58 (Partial isomorphism between structures) Two structures M1 and M2 are partially isomorphic if and only if there exists a bijection f from a set dom(f) ⊆ D1 onto a set rng(f) ⊆ D2 such that
(i) for each relation P^I1 of arity n and each tuple (d1, d2, ..., dn) of elements of dom(f), P^I1(d1, d2, ..., dn) holds if and only if P^I2(f(d1), f(d2), ..., f(dn)) holds;
(ii) for each constant symbol c, if c^I1 ∈ dom(f) then f(c^I1) = c^I2.
Domain dom(f) and range rng(f) of a partial isomorphism f are isomorphic under the isomorphism established by f. A particular case is when we select for some k elements d_1^1, d_2^1, ..., d_k^1 in D1 and d_1^2, d_2^2, ..., d_k^2 in D2 such that the mapping f: d_i^1 → d_i^2 for i ≤ k and f(c^I1) = c^I2 for each constant symbol c is a partial isomorphism, which case is denoted (d^1)_1^k ≅_f (d^2)_1^k.

The notions of an isomorphism and a partial isomorphism relate two structures on the level of relations and constant symbols. We need relations between structures which deal with formulae. Such a relation is induced by the notion of a quantifier rank.

Definition 3.59 (Quantifier rank of a formula) For a formula φ of FO, the quantifier rank of φ, qr(φ), is defined by structural induction:
(i) qr(xi) = 0 = qr(c) for each xi and c, hence, qr(φ) = 0 for each atomic formula φ;
(ii) qr(φ ∨ ψ) = qr(φ ∧ ψ) = qr(φ ⊃ ψ) = max{qr(φ), qr(ψ)};
(iii) qr(¬φ) = qr(φ);
(iv) qr(Qx.φ) = qr(φ) + 1, where Q = ∀, ∃.
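The inductive definition of the quantifier rank can be sketched as a short recursive function. The tuple encoding of formulae used here ("rel", "eq", "not", "or", "and", "implies", "forall", "exists") is an assumption made for the illustration and is not notation from the text.

```python
def qr(phi) -> int:
    """Quantifier rank of a formula encoded as a nested tuple."""
    op = phi[0]
    if op in ("eq", "rel"):                  # atomic formulae
        return 0
    if op == "not":
        return qr(phi[1])
    if op in ("or", "and", "implies"):
        return max(qr(phi[1]), qr(phi[2]))
    if op in ("forall", "exists"):           # ("exists", variable, body)
        return qr(phi[2]) + 1
    raise ValueError(f"unknown connective: {op}")


# ∃x.∃y.E(x, y): two nested quantifiers
nested = ("exists", "x", ("exists", "y", ("rel", "E", ["x", "y"])))
# (∃x.P(x)) ∧ (∃y.Q(y)): two quantifiers, but not nested
flat = ("and",
        ("exists", "x", ("rel", "P", ["x"])),
        ("exists", "y", ("rel", "Q", ["y"])))
print(qr(nested), qr(flat))
```

The two sample formulae contain the same number of quantifier symbols but differ in rank, because the rank measures the depth of nesting.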
It follows that the quantifier rank of a formula is the maximal depth of nesting of quantifiers in it; for a formula in prenex form, this is the number of its quantifier symbols. The class Q_m collects formulae of quantifier rank not greater than m. These classes are instrumental in definitions of relations between structures.

Definition 3.60 (The m-equivalence) Two structures M1 and M2 are m-equivalent if and only if for each closed formula φ with qr(φ) ≤ m, M1 |= φ if and only if M2 |= φ. The relation of m-equivalence between structures is denoted M1 ≡m M2.

We now define the crucial notion of FO-definability, opening the way up to asserting whether a given class of structures can be axiomatized within FO. We recall that Cls-mod(φ) is the collection of all structures which are models for φ.

Definition 3.61 (FO-definability) A class C of structures is FO-definable if and only if there exists a formula φ of FO such that C = Cls-mod(φ).

The following theorem brings forth the methodology for answering the FO-definability question in the negative.

Theorem 3.45 For a class C of finite structures, if for each natural number m there exist structures M1 and M2 such that: (i) M1 ≡m M2; (ii) M1 ∈ C and M2 ∉ C, then C is not FO-definable.

Proof Suppose that, to the contrary, C is FO-definable and let φ be the formula with the property that C = Cls-mod(φ). Let m = qr(φ) and let M1, M2 be structures satisfying (i) and (ii) for this m. As, by (i), M1 ≡m M2, and, by (ii), M1 |= φ, it follows by Definition 3.60 that M2 |= φ, hence, M2 ∈ C, a contradiction.

In the light of Theorem 3.45, it is important to have a test for the relation ≡m. This is supplied by Ehrenfeucht games (Ehrenfeucht [40]).
3.20 Ehrenfeucht Games

Definition 3.62 (The Ehrenfeucht game) In the notation of Sect. 3.19, we consider pointed structures (M1, (a)_1^k) and (M2, (b)_1^k). The Ehrenfeucht game G_m(M1, (a)_1^k; M2, (b)_1^k) is played by two players: player p1 (called Spoiler) and player p2 (called Duplicator). In each play of the game, each player makes m moves, beginning with the first move by Spoiler, who selects an element either from D1 or from D2, to which Duplicator responds with a choice of an element from the set not chosen by Spoiler in that move. This pattern is repeated m − 1 times and the play is terminated, leaving players with the pointed structures (M1, (a)_1^k (u)_1^m) and (M2, (b)_1^k (w)_1^m), where (u)_1^m are the elements chosen by players from the set D1 and (w)_1^m are the elements chosen by players from the set D2. Duplicator wins the play if and only if the mapping f: (a)_1^k (u)_1^m → (b)_1^k (w)_1^m is a partial isomorphism. Duplicator has a winning strategy in the game G_m(M1, (a)_1^k; M2, (b)_1^k) if and only if it can win every play of the game, whatever the moves of Spoiler.

An example of the winning strategy for Duplicator, which has a better position as it always makes the second move in each round, is when structures M1 and M2 are isomorphic under an isomorphism h, because then if Spoiler selects an element u ∈ D1, then Duplicator responds with h(u) ∈ D2, and, for a choice of w ∈ D2 by Spoiler, Duplicator responds with h⁻¹(w) ∈ D1. This remark establishes

Theorem 3.46 If M1 ≅ M2, then Duplicator has a winning strategy in the game G_m(M1, ∅; M2, ∅).

By Definition 3.62, Duplicator has the winning strategy in the game G_0(M1, (a)_1^k; M2, (b)_1^k) if and only if the mapping f: (a)_1^k → (b)_1^k is a partial isomorphism. For m > 0, we have the following properties to observe.

Theorem 3.47 The following statement holds for the Ehrenfeucht game; please observe that the subscript m means m moves to be made: (i) if Duplicator has the winning strategy in the game G_m(M1, (a)_1^k; M2, (b)_1^k), then for each j < m the position (a)_1^k (u)_1^j; (b)_1^k (w)_1^j establishes a partial isomorphism f_j: (a)_1^k (u)_1^j → (b)_1^k (w)_1^j.

Corollary 3.5 At each stage j < m of each play of the game G_m, if Duplicator has the winning strategy for the game G_m, then there are moves extending the winning strategy to G_{j−1}, viz.
(i) (forth) for each a ∈ D1, there exists b ∈ D2 such that the pair (a, b) added to the position of G_j provides a partial isomorphism at the position of G_{j−1};
(ii) (back) for each b ∈ D2, there exists a ∈ D1 such that the pair (a, b) added to the position of G_j provides a partial isomorphism at the position of G_{j−1}.

The next result (Ehrenfeucht [40]) paves the way from existence of the winning strategy for Duplicator to m-equivalence of structures.

Theorem 3.48 If Duplicator has the winning strategy in the game G_m(M1, (a)_1^k; M2, (b)_1^k), then for each formula φ(x1, x2, ..., xk) of quantifier rank less than or equal to m, M1 |= φ(a1, a2, ..., ak) if and only if M2 |= φ(b1, b2, ..., bk).
Proof It is by induction on m.
Case 1. m = 0: Duplicator wins the game G_0 in position (a)_1^k; (b)_1^k if and only if f_0: (a)_1^k → (b)_1^k is a partial isomorphism and the thesis follows by definition of a partial isomorphism.
Case 2. m > 0: as the class of formulae satisfying the thesis is closed under Boolean connectives, the only case to consider is that of quantified formulae, so consider a formula φ of the form ∃y.ψ(x1, x2, ..., xk, y). Suppose that M1 |= φ(a1, a2, ..., ak), hence, M1 |= ψ(a1, a2, ..., ak, a) for some a ∈ D1. As Duplicator has the winning strategy in G_m, by Corollary 3.5 (i), there exists b ∈ D2 such that Duplicator wins G_{m−1} in position (a)_1^k a; (b)_1^k b. As qr(ψ) ≤ m − 1, by the hypothesis of induction, M2 |= ψ(b1, b2, ..., bk, b), i.e., M2 |= φ(b1, b2, ..., bk). By symmetry, the converse follows as well.

Corollary 3.6 (Ehrenfeucht [40]) Duplicator has the winning strategy in the game G_m(M1, ∅; M2, ∅) if and only if M1 ≡m M2.

Ehrenfeucht games are also called Ehrenfeucht-Fraïssé games due to an earlier result of Fraïssé [41] about equivalence of the notion of m-equivalence and that of an m-isomorphism.

Definition 3.63 (The notion of an m-isomorphism) Structures M1 and M2 are m-isomorphic if and only if there exists a sequence {I_j : j ≤ m} of sets of partial isomorphisms between M1 and M2 such that the following conditions hold:
(i) for each 0 < j ≤ m, for each a ∈ D1, and each h ∈ I_j, there exists g ∈ I_{j−1} such that g extends h and a ∈ dom(g); (i) is the forth property;
(ii) for each 0 < j ≤ m, for each b ∈ D2, and each h ∈ I_j, there exists g ∈ I_{j−1} such that g extends h and b ∈ rng(g); (ii) is the back property.
We denote the fact of an m-isomorphism between M1 and M2 due to (I_j)_0^m as (I_j)_0^m : M1 ∼ M2. The theorem by Fraïssé (Fraïssé [41]) completes the picture.

Theorem 3.49 (Fraïssé) Structures M1 and M2 are m-isomorphic if and only if M1 and M2 are m-equivalent.

Proof We outline the proof. First, we assume that structures M1, M2 are m-isomorphic via (I_j)_{j≤m}. We consider n ≤ m.
Claim. For φ with qr(φ) ≤ n and h ∈ I_n, M1 |= φ(a0, a1, ..., ak) if and only if M2 |= φ(h(a0), h(a1), ..., h(ak)).
The proof of Claim is by structural induction. For an atomic formula P(t1, t2, ..., tk) we use the forth and back properties of the sets I_j to build tuples (a1, a2, ..., ak) of elements of D1 and (b1, b2, ..., bk) of elements of D2 such that the correspondence (ai ↔ bi) is a partial isomorphism and M1 |= P^I1(a1, a2, ..., ak) if and only if M2 |= P^I2(b1, b2, ..., bk). For propositional formulae it is sufficient to consider formulae of the form ¬ψ and ψ ∨ ξ, where, by the hypothesis of induction, ψ and ξ satisfy Claim. Suppose that φ is ¬ψ. Then, M1 |= φ ≡ M1 |= ¬ψ ≡ ¬(M1 |= ψ) ≡ ¬(M2 |= ψ) ≡ M2 |= ¬ψ ≡ M2 |= φ. In case of φ which is ψ ∨ ξ: M1 |= φ ≡ (M1 |= ψ ∨ M1 |= ξ) ≡ (M2 |= ψ ∨ M2 |= ξ) ≡ M2 |= φ.
The case of a quantified formula φ: it is sufficient to assume that φ is ∃x.ψ, where x may be assumed to be an individual variable xi. As qr(ψ) ≤ n − 1, Claim holds for ψ with I_{n−1}. Suppose that M1 |= φ(a0, a1, ..., ak), i.e., M1 |= ψ(a0, a1, ..., ak, a) for some a ∈ D1. By the forth property, there is g ∈ I_{n−1} which extends h and has a in its domain, hence, M2 |= ψ(h(a0), h(a1), ..., h(ak), g(a)), i.e., M2 |= φ(h(a0), h(a1), ..., h(ak)). The converse follows in the same way by the back property.
Suppose now that M1 ≡m M2. By Corollary 3.6, Duplicator has the winning strategy in the game G_m, and, by Theorem 3.47, for each position G_m(M1, (a1, a2, ..., ak); M2, (b1, b2, ..., bk)) reached under this strategy, the correspondence h: ai ↔ bi is a partial isomorphism. Let G be the set of all pairs of tuples and partial isomorphisms resulting from the winning strategy for Duplicator. Then, for each j ≤ m, the sub-collection of partial isomorphisms in G obtained at positions with j moves left to be made is I_j.

The corollary to theorems of Ehrenfeucht and Fraïssé is the equivalence of all three notions involved in the above discussion: (i) existence of the winning strategy for Duplicator in the game G_m on structures M1, M2; (ii) m-equivalence of M1, M2; (iii) m-isomorphism of M1, M2. We now apply the above results in proving non-definability of some typical problems in FO.

Example 3.13 The class EVEN of sets of even cardinality is not FO-definable.

Proof The relational vocabulary of EVEN is empty.
For a given m, let us consider sets A and B with |A| = 2k > m and |B| = 2k + 1. Obviously, A ∈ EVEN, B ∉ EVEN. In the game G_m(A, B), Duplicator has the winning strategy: we prove it by induction on the number of moves. Suppose that the position is after the i-th move by each player and Spoiler selects its (i + 1)-st element, a_{i+1}, from, say, the set A. In case a_{i+1} has been already selected as an element a_j, Duplicator selects the already selected element b_j ∈ B; otherwise, Duplicator selects any element in the set B \ {already selected elements of B}, which is non-empty as |B| > m. By Corollary 3.6, A ≡m B, hence, by Theorem 3.45, the class EVEN is not FO-definable.
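For small finite structures, the existence of a winning strategy for Duplicator can be checked by exhaustive search over the forth and back moves. The sketch below is a brute-force illustration rather than an efficient algorithm; it assumes a vocabulary with a single binary relation E and no constants, and the names partial_iso and duplicator_wins are illustrative. Pure sets, as in Example 3.13, are obtained by taking E empty.

```python
def partial_iso(pairs, E1, E2) -> bool:
    """Check that a_i ↦ b_i is a well-defined injective map preserving E."""
    for a, b in pairs:
        for c, d in pairs:
            if (a == c) != (b == d):                 # equality is preserved both ways
                return False
            if ((a, c) in E1) != ((b, d) in E2):     # E is preserved both ways
                return False
    return True


def duplicator_wins(m, D1, E1, D2, E2, pairs=()) -> bool:
    """Does Duplicator win the game with m rounds left from the position `pairs`?"""
    if not partial_iso(pairs, E1, E2):
        return False
    if m == 0:
        return True
    # forth: every Spoiler move in D1 has an answer in D2
    forth = all(
        any(duplicator_wins(m - 1, D1, E1, D2, E2, pairs + ((a, b),)) for b in D2)
        for a in D1
    )
    # back: every Spoiler move in D2 has an answer in D1
    back = all(
        any(duplicator_wins(m - 1, D1, E1, D2, E2, pairs + ((a, b),)) for a in D1)
        for b in D2
    )
    return forth and back


# Sets of sizes 4 and 5 (k = 2), three rounds
print(duplicator_wins(3, range(4), set(), range(5), set()))
```

The search is exponential in the number of rounds, so it is meant only for structures with a handful of elements.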
Example 3.14 The class EVEN(LO) of linear finite ordered sets of even cardinality is not FO-definable.

Proof The relational vocabulary of EVEN(LO) = {<}.

[...]

of elements of M with the property that the element N_{j1,j2} of the product N = M_{i1} M_{i2} ... M_{ik} is equal to 0. Prove: the problem of mortal matrices is undecidable by reducing the Post correspondence problem to it; precisely, for a given PCP, construct the system M such that PCP has a solution if and only if the constructed M is (3, 2)-mortal. [Hint: eventually, consult (Manna [23], Example 1-24) for construction of M].

Problem 3.7 (Reduced products of models for FO) (see Chang and Keisler [45], 4.1.6, and Problem 2.17). Let L be a first-order language over a vocabulary Σ. Let I be a non-empty set, F a proper filter on I, and Mi be a model for L with the domain Ai for each i ∈ I. We recall that the equivalence ∼F is defined on Πi Ai as follows: f ∼F g if and only if {i ∈ I : f(i) = g(i)} ∈ F. The symbol [f]∼ denotes the class of f with respect to the equivalence ∼F. We define a model for L in the reduced product ΠF Mi: its domain is ΠF Ai; the interpretation of symbols of L in ΠF Mi is the following:
(i) each predicate symbol Pn of arity n in Σ is interpreted as a relation Pn^i on Ai; the predicate Pn is interpreted in ΠF Mi as the relation Pn^F which satisfies the instance Pn^F([f1]∼, [f2]∼, ..., [fn]∼) if and only if {i ∈ I : Pn^i(f1(i), f2(i), ..., fn(i))} ∈ F, for each n-tuple (f1, f2, ..., fn) of elements (threads) of Πi Ai;
(ii) each function symbol hn of arity n is interpreted in each Mi as the function hn^i from Ai^n into Ai; the function symbol hn is interpreted in ΠF Mi as the function hn^F which is defined as follows: hn^F([f1]∼, [f2]∼, ..., [fn]∼) = [⟨hn^i(f1(i), f2(i), ..., fn(i))⟩_{i∈I}]∼;
(iii) each constant symbol c is interpreted in Mi as ci ∈ Ai for each i ∈ I; the constant symbol c is interpreted in ΠF Mi as the element [⟨ci⟩_{i∈I}]∼ of ΠF Ai.
We assume now that F is an ultrafilter (i.e., a maximal proper filter; in this case ΠF Mi is called the ultraproduct). Prove:
(a) definitions of Pn^F and hn^F factor through equivalence classes of ∼F: if f1 ∼F g1, f2 ∼F g2, ..., fn ∼F gn for any pair of tuples (f1, f2, ..., fn) and (g1, g2, ..., gn), then {i ∈ I : Pn^i(f1(i), f2(i), ..., fn(i))} ∈ F if and only if {i ∈ I : Pn^i(g1(i), g2(i), ..., gn(i))} ∈ F, and ⟨hn^i(f1(i), f2(i), ..., fn(i))⟩_{i∈I} ∼F ⟨hn^i(g1(i), g2(i), ..., gn(i))⟩_{i∈I};
(b) for each closed formula φ of L, ΠF Mi |= φ if and only if {i ∈ I : Mi |= φ} ∈ F.
3.23 Problems
The following three problems concern Ehrenfeucht’s games and come from (Kolaitis, Ph. G.: On the expressive power of logics on finite models. In: Finite Model Theory and its Applications. Springer (2007)).

Problem 3.8 (Ehrenfeucht’s games) The context: consider the cyclic graph G on 4 vertices a, b, c, d (for convenience, imagine it as the border of a square with a in the left upper corner, and other vertices in the clockwise order) and let the structure A have the domain A = {a, b, c, d} and the relational vocabulary {E}. Let the graph H be constructed from two copies of the graph G: G1 with vertices u, v, w, t and G2 with vertices u1, v1, w1, t1 in the same locations and order as in G. Graph H is obtained by identifying pairs of vertices: w = v1 and t = u1. The structure B has the domain B = {u, v, w = v1, t = u1, w1, t1} and the relational vocabulary {E}, with the relational symbol E interpreted in both structures as the edge relation. Prove that A and B do not satisfy the relation ≡3 by considering the formula
φ: ∃x1.∃x2.∃x3.(⋀_{i≠j} (xi ≠ xj) ∧ ⋀_{i≠j} ¬E(xi, xj)).
Problem 3.9 (Ehrenfeucht’s games) Continuing Problem 3.8, deduce from it that in the 3-round game on A, B, Spoiler has a winning strategy. Outline a winning strategy for Spoiler.
Problem 3.10 (Ehrenfeucht’s games) Continuing Problem 3.9, prove that in the 2-round game on A, B, Duplicator has a winning strategy.
Problem 3.11 (Ehrenfeucht’s games) We generalize the context of Problems 3.8, 3.9, 3.10. The construct H of Problem 3.8 will be called the 2-ladder and denoted L_2; we define the (n + 1)-ladder L_{n+1} as the construct obtained from the n-ladder L_n and the graph G by the construction of Problem 3.8, i.e., by gluing together the base edge of L_n with the top edge of G. Given k, n, m, explore winning strategies in the k-round Ehrenfeucht game on the domains L_n, L_m.
Problem 3.12 (BDD; Binary Decision Diagrams) BDD is a method for symbolic model checking and formal verification. Its operations are the following. For a formula φ and c ∈ {0, 1}, we denote by the symbol φ_{x_i/c} the formula whose value on the argument tuple a^n is φ(a_1, a_2, ..., a_{i−1}, c, a_{i+1}, ..., a_n). The Boole-Shannon expansion of the formula φ is the formula φ′ : (¬x_i ∧ φ_{x_i/0}) ∨ (x_i ∧ φ_{x_i/1}). Prove: φ is semantically equivalent to φ′.
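The claim of Problem 3.12 can be tested by brute force before it is proved. The following is a minimal sketch, assuming that formulae are modelled as Python functions on 0/1 arguments; the names `cofactor`, `shannon`, `equivalent` and the sample formula `phi` are illustrative choices, not notation from the text.

```python
from itertools import product

def cofactor(phi, i, c):
    """Return phi_{x_i/c}: phi with its i-th argument (0-based) fixed to c."""
    return lambda *a: phi(*a[:i], c, *a[i + 1:])

def shannon(phi, i):
    """Return the Boole-Shannon expansion of phi with respect to x_i."""
    lo, hi = cofactor(phi, i, 0), cofactor(phi, i, 1)
    # (not x_i and phi_{x_i/0}) or (x_i and phi_{x_i/1})
    return lambda *a: int((not a[i] and lo(*a)) or (a[i] and hi(*a)))

def equivalent(f, g, n):
    """Compare two n-ary formulae on all 2^n argument tuples."""
    return all(bool(f(*a)) == bool(g(*a)) for a in product((0, 1), repeat=n))

# Hypothetical sample formula: (x0 and x1) or (not x2).
phi = lambda x0, x1, x2: int((x0 and x1) or not x2)

# The expansion agrees with phi for every choice of the variable x_i.
shannon_ok = all(equivalent(phi, shannon(phi, i), 3) for i in range(3))
print(shannon_ok)
```

An exhaustive check over one formula is of course no proof; it only confirms that the expansion needs the negated literal ¬x_i in its first disjunct.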
Problem 3.13 (BDD) For formulae φ, ψ, the operation of composition φ_{x_i/ψ}(a^n) is defined as φ(a_1, a_2, ..., a_{i−1}, ψ(a^n), a_{i+1}, ..., a_n). Prove: φ_{x_i/ψ} is semantically equivalent to the formula (ψ ∧ φ_{x_i/1}) ∨ ((¬ψ) ∧ φ_{x_i/0}). BDD allows for a form of quantification: ∀x_i.φ is defined as φ_{x_i/0} ∧ φ_{x_i/1} and ∃x_i.φ is defined as φ_{x_i/0} ∨ φ_{x_i/1}. Prove: φ is satisfiable if and only if ∃x_i.φ is satisfiable, for each x_i with i = 1, 2, ..., n.
Problem 3.14 (Generalized quantification) We finally define generalized existential quantification over sets of variables by the following recurrent set of conditions: (i) ∃∅.φ is φ; (ii) ∃({x_i} ∪ X).φ is ∃x_i.(∃X.φ). Analogous conditions define the generalized universal quantification ∀X.φ. Check which properties of FO quantification are preserved by the generalized quantifications.
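The identities of Problem 3.13 can likewise be explored on small examples. The sketch below assumes, as an illustration only, that formulae are Python functions on 0/1 arguments; the helper names and the sample formulae `phi` and `psi` are hypothetical.

```python
from itertools import product

def cofactor(phi, i, c):
    """phi_{x_i/c}: fix the i-th argument (0-based) of phi to the constant c."""
    return lambda *a: phi(*a[:i], c, *a[i + 1:])

def compose(phi, i, psi):
    """phi_{x_i/psi}: feed the value of psi into the i-th argument of phi."""
    return lambda *a: phi(*a[:i], psi(*a), *a[i + 1:])

def exists(phi, i):
    """Existential quantification: phi_{x_i/0} or phi_{x_i/1}."""
    return lambda *a: int(bool(cofactor(phi, i, 0)(*a)) or bool(cofactor(phi, i, 1)(*a)))

def forall(phi, i):
    """Universal quantification: phi_{x_i/0} and phi_{x_i/1}."""
    return lambda *a: int(bool(cofactor(phi, i, 0)(*a)) and bool(cofactor(phi, i, 1)(*a)))

def satisfiable(phi, n):
    return any(phi(*a) for a in product((0, 1), repeat=n))

# Hypothetical sample formulae over three variables.
phi = lambda x0, x1, x2: int(x0 and not x1 and x2)
psi = lambda x0, x1, x2: int(x0 or x2)
tuples = list(product((0, 1), repeat=3))

# Composition identity: phi_{x_1/psi} == (psi and phi_{x_1/1}) or (not psi and phi_{x_1/0}).
rhs = lambda *a: int((psi(*a) and cofactor(phi, 1, 1)(*a))
                     or (not psi(*a) and cofactor(phi, 1, 0)(*a)))
composition_ok = all(compose(phi, 1, psi)(*a) == rhs(*a) for a in tuples)

# phi is satisfiable iff exists x_i . phi is satisfiable, for each i.
sat_ok = all(satisfiable(phi, 3) == satisfiable(exists(phi, i), 3) for i in range(3))

# forall x_i . phi implies phi pointwise.
forall_ok = all(forall(phi, i)(*a) <= phi(*a) for i in range(3) for a in tuples)
print(composition_ok, sat_ok, forall_ok)
```

A real BDD package computes the same cofactors on a shared graph instead of re-evaluating functions; the functional model is used here only because it keeps the definitions of the problem visible.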
References
1. Tarski, A.: Der Wahrheitsbegriff in den formalisierten Sprachen. Stud. Philos. 1, 261–405 (1936). (Also in: Tarski, A.: Logic, Semantics, Metamathematics. Oxford University Press (1956, 1983))
2. Gentzen, G.: Untersuchungen über das logische Schliessen. Math. Z. 39, 176–210, 405–431 (1934)
3. Jaśkowski, S.: Teoria dedukcji oparta na dyrektywach założeniowych (in Polish) (Theory of deduction based on suppositional directives). In: Księga Pamiątkowa I Polskiego Zjazdu Matematycznego. Uniwersytet Jagielloński, Kraków (1929)
4. Jaśkowski, S.: On the rules of suppositions in formal logic. Stud. Logica 1, 5–32 (1934). (Also in: McCall, S. (ed.): Polish Logic 1920–1939, pp. 232–258. Oxford U. P. (1967))
5. Indrzejczak, A.: Sequents and Trees. Springer Nature Switzerland, Cham, Switzerland (2021)
6. Smullyan, R.M.: First Order Logic. Dover, Mineola N.Y. (1996)
7. Rasiowa, H., Sikorski, R.: The Mathematics of Metamathematics. Polish Scientific Publishers (PWN), Warsaw (1963)
8. Beth, E.W.: The Foundations of Mathematics: A Study in the Philosophy of Science. Harper & Row Publishers, New York (1966)
9. Hintikka, K.J.J.: Form and content in quantification theory. Acta Philosophica Fennica 8, 7–55 (1955)
10. Löwenheim, L.: Über Möglichkeiten im Relativkalkül. Math. Ann. 76(4), 447–470 (1915). https://doi.org/10.1007/bf01458217. (Also in: Van Heijenoort, J. (ed.): From Frege to Gödel. A Source Book in Mathematical Logic, 1879–1931, pp. 228–251. Harvard U. Press, Cambridge MA (1967))
11. Skolem, T.A.: Logico-combinatorial investigations in the satisfiability or provability of mathematical propositions: A simplified proof of a theorem by L. Löwenheim and generalizations of the theorem. In: Van Heijenoort, J. (ed.) From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, pp. 252–263. Harvard University Press, Cambridge MA (1967)
12. Skolem, T.A.: Selected Works in Logic. Universitetsforlaget, Oslo (1970)
13. Boolos, G.S., Burgess, J.P., Jeffrey, R.C.: Computability and Logic. Cambridge University Press, Cambridge UK (2002)
14. Robinson, J.A.: A machine oriented logic based on the resolution principle. J. ACM 12(1), 23–41 (1965)
15. Baader, F., Snyder, W., Narendran, P., Schmidt-Schauss, M., Schulz, K.: Unification theory. In: Robinson, A., Voronkov, A. (eds.): Handbook of Automated Reasoning, vol. 1, Ch. 8, pp. 447–533. Elsevier, Amsterdam (2001)
16. Horn, A.: On sentences which are true of direct unions of algebras. J. Symb. Log. 16(1), 14–21 (1951)
17. Schöning, U.: Logic for Computer Scientists. Springer Science+Business Media, New York (1989)
18. Gallier, J.H.: Logic for Computer Science. Foundations of Automatic Theorem Proving. Longman (2003)
19. Russell, S., Norvig, P.: Artificial Intelligence. A Modern Approach, 4th edn. Pearson (2020)
20. Salomaa, A.: Formal Languages. Academic Press, New York (1973)
21. Church, A.: An unsolvable problem of elementary number theory. Am. J. Math. 58, 345–363 (1936)
22. Floyd, R.W.: in [Manna, 2-1.6]
23. Manna, Z.: Mathematical Theory of Computation. McGraw-Hill, New York (1974)
24. Post, E.: A variant of a recursively unsolvable problem. Bull. Am. Math. Soc. 52 (1946)
25. Scott, D.: Outline of Mathematical Theory of Computation. In: 4th Annual Princeton Conference on Information Sciences & Systems, pp. 169–176 (1970)
26. Friedman, J.: Lecture Notes on Foundations of Computer Science. Technical Report CS 99, Stanford University (1968)
27. Sipser, M.: Introduction to the Theory of Computation. PWS Publ. Co., Boston MA (1997)
28. Herbrand, J.: Logical Writings. Harvard University Press, Cambridge MA (1971)
29. Fitting, M.: First-Order Logic and Automated Theorem Proving. Springer, New York (1996)
30. Gödel, K.: Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik 37, 349–360 (1930)
31. Henkin, L.: The completeness of the first-order functional calculus. J. Symb. Log. 14, 159–166 (1949)
32. Hasenjaeger, G.: Eine Bemerkung zu Henkin’s Beweis für die Vollständigkeit des Prädikatenkalküls der ersten Stufe. J. Symb. Log. 18(1), 42–48 (1953). https://doi.org/10.2307/2266326
33. Church, A.: Introduction to Mathematical Logic. Princeton University Press, Princeton NJ (1956)
34. Smullyan, R.M.: Gödel’s Incompleteness Theorems. Oxford University Guides. Oxford University Press, New York-Oxford (1992)
35. Quine, W.V.O.: Concatenation as basis for Arithmetic. J. Symb. Logic 11, 105–114 (1946)
36. Gödel, K.: Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme 1. Monatshefte für Mathematik und Physik 38, 173–198 (1931)
37. Kalish, D., Montague, R.: On Tarski’s formalization of predicate logic with identity. Arch. f. Math. Logik und Grundl. 7, 81–101 (1964)
38. Rosser, J.B.: Extensions of some theorems of Gödel and Church. J. Symb. Log. 1(3), 87–91 (1936). https://doi.org/10.2307/2269028
39. Robinson, R.: An essentially undecidable axiom system. Proc. Int. Congress Math. 1, 729–730 (1950)
40. Ehrenfeucht, A.: An application of games to the completeness problem for formalized theories. Fundam. Math. 49, 129–141 (1961)
41. Fraïssé, R.: Sur quelques classifications des systèmes de relations. Université d’Alger, Publications Scientifiques, Série A 1, 35–182 (1954)
42. Mendelson, E.: Introduction to Mathematical Logic. CRC Press, Taylor and Francis Group, Boca Raton FL (2015)
43. Craig, W.: Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory. J. Symb. Logic 22(3), 269–285 (1957)
44. Halvorson, H.: https://www.princeton.edu/hhalvors/teaching/phi312_s2013/craig.pdf
45. Chang, C.C., Keisler, J.H.: Model Theory. Elsevier Science Publication, Amsterdam (1992)
Chapter 4
Modal and Intuitionistic Logics
4.1 Sentential Modal Logic (SML)
Modal logics address uncertainty about truth values of statements by introducing and discussing the notion of possibility of truth in addition to the necessity of truth. In this, modal logics cross the boundary between the realm of the dichotomy true-false and the less transparent realm of certainly true-possibly true. As the latter is less rigorous about truth, and the notions of necessity and possibility are open to various interpretations, the result is the existence of many variants of modal logics. In this chapter we will follow a line of increasingly complex interpretations of necessity and possibility as well as their mutual relations.
The tradition of modal logics is as old as the tradition of sentential logic. In Aristotle, we find the modalities N (necessity), P (possibility), C (contingency: may be possible and may be not possible). Some combinations of these modalities were added as prefixes to the assertoric operators A, E, O, I to form modal syllogisms. For instance, to the A’s in Barbara Aristotle would add one N and one P to form the modal syllogism N Aab ∧ P Abc ⊃ Aac (cf. Smith [1]). Aristotle also considered duality between N and P as ¬N ≡ P ∨ ¬P. The Stoic school (Diodorus Cronus, Philo, Chrysippus) considered four modalities: necessity, non-necessity, possibility, impossibility as attributes of assertoric statements. The value of an attribute depended on the statement as well as on the analysis of contingencies in the context of the statement. They regarded the pairs necessity-impossibility and possibility-non-necessity as contradictory (see Bobzien [2]). The theory of modalities was lively in medieval times, often influenced by theological discourse, in approaches by Abelard, Thomas Aquinas, Duns Scotus, Buridan, Ockham, among others (see Knuuttila [3]), leading to many views on modalities.
Modern modal logic owes its inception to Lewis [4], who proposed the first axiomatic system for this logic. With Carnap’s [5] revival of Leibniz’s idea of possible worlds and the possible worlds semantics of Kripke [6], modal logic entered its modern phase of development.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_4
Modal logics described here play important roles as a basis for models of reasoning in system analysis, model checking, software verification, epistemic and doxastic logics, and agent systems. We adopt L as the symbol for necessity and M as the symbol for possibility, departing from the very often used symbols of box □ and diamond ◇. Basic modal operators express necessity (symbol L) and possibility (symbol M); the formula Lφ reads ‘it is necessary that φ’ and the formula Mφ reads ‘it is possible that φ’.
Definition 4.1 (Syntactic constituents of SML) Sentential modal logic contains sentential logic at its core. Hence, any modal logic we consider contains the countable, possibly infinite, set of atomic propositions p_1, p_2, ..., p_n, ..., and we will use generic symbols p, q, r, ... to denote atomic propositions put into our formulae. Obviously, modal logics employ the sentential connectives ∨, ∧, ¬, ⊃, ≡. With a slight freedom of language we may say that sentential logic is the smallest modal logic, which we denote as before by SL. SML applies the necessity symbol L and falsum ⊥.
Definition 4.2 (Dualities) The set of syntactic constituents of SML contains the possibility symbol M defined by duality as Mφ ≡ ¬L¬φ and the verum symbol ⊤ defined by duality as ⊤ ≡ ¬⊥. The formal buildup of SML begins with formulae. We define the recurrent set of well-formed formulae, and the symbol wf will denote that a formula is well-formed.
Definition 4.3 (Well-formed (wf) formulae of SML) The set of wf formulae of modal logic (wffs) is defined as the least set which satisfies the following conditions:
(i) p_i is a wff for each atomic proposition p_i;
(ii) ⊥ is a wff;
(iii) if φ is a wff, then ¬φ is a wff;
(iv) if φ is a wff and ψ is a wff, then φ ⊃ ψ is a wff;
(v) if φ is a wff, then Lφ is a wff.
From conditions (iii), (iv), (v), we infer that
(vi) if φ and ψ are wffs, then φ ∨ ψ, φ ∧ ψ, Mφ are wffs.
Definition 4.4 (Inference rules) The set of inference rules contains detachment (MP): from φ and φ ⊃ ψ infer ψ, written (φ, φ ⊃ ψ) / ψ, and necessitation (N): from φ infer Lφ, written φ / Lφ.
As necessity and possibility are notions admitting various interpretations, they are defined by means of various axiom systems. We begin with the simplest, not counting SL, modal logic K.
Definition 4.5 (The axiom system K) Axiom schemes for the system K are the following:
(K1) all sentential formulae valid in SL, called tautologies of SML;
(K2) the scheme (K) L(φ ⊃ ψ) ⊃ (Lφ ⊃ Lψ);
(K3) the scheme Mφ ≡ ¬L¬φ.
Definition 4.6 (Normal modal logics) A modal logic X is normal if and only if it contains all instances of the axiom schemes (K1), (K2), (K3) and is closed under the inference rules (MP) and (N).
Definition 4.7 (The notion of a proof) A proof of a formula φ in K is a finite sequence φ_1, φ_2, ..., φ_n of formulae, each of which is an axiom instance or is obtained from preceding formulae in the sequence by means of an inference rule, and φ_n is φ.
Theorem 4.1 Modal logic K is normal. The following formulae are provable in K:
(i) L(φ ∧ ψ) ≡ (Lφ) ∧ (Lψ);
(ii) M(φ ∨ ψ) ≡ (Mφ) ∨ (Mψ);
(iii) Lφ ⊃ (Lψ ⊃ Lφ);
(iv) (Lφ) ∨ (Lψ) ⊃ L(φ ∨ ψ);
(v) M(φ ∧ ψ) ⊃ (Mφ) ∧ (Mψ).
In Table 4.1 we prove the formula (i). In Table 4.2, we give a proof of the formula (iii).
Table 4.1 Proof of (i)
(1) p ∧ q ⊃ p                                   SL
(2) φ ∧ ψ ⊃ φ                                   Substitution
(3) L(φ ∧ ψ ⊃ φ)                                Necessitation
(4) L(φ ∧ ψ) ⊃ Lφ                               (K), detachment
(5) L(φ ∧ ψ) ⊃ Lψ                               As in (2)–(4) with ψ for φ
(6) L(φ ∧ ψ) ⊃ (Lφ) ∧ (Lψ)                      SL (the one-way proof)
(7) φ ⊃ (ψ ⊃ φ ∧ ψ)                             SL, substitution
(8) L[φ ⊃ (ψ ⊃ φ ∧ ψ)]                          Necessitation
(9) Lφ ⊃ L[ψ ⊃ (φ ∧ ψ)]                         (K), detachment
(10) L(ψ ⊃ φ ∧ ψ) ⊃ (Lψ ⊃ L(φ ∧ ψ))             (K)
(11) Lφ ⊃ [Lψ ⊃ L(φ ∧ ψ)]                       (9), (10), SL
(12) (Lφ) ∧ (Lψ) ⊃ L(φ ∧ ψ)                     (11), SL (the complete two-way proof)
Table 4.2 Proof of (iii)
(1) p ⊃ (q ⊃ p)                                 SL
(2) φ ⊃ (ψ ⊃ φ)                                 Substitution
(3) L[φ ⊃ (ψ ⊃ φ)]                              Necessitation
(4) Lφ ⊃ (Lψ ⊃ Lφ)                              (K), detachment
4.2 Semantics of the Modal System K
Semantics for K is the possible worlds semantics due to Saul Kripke (the Kripke semantics) (Kripke [6]). The notion of a possible world goes back to Leibniz, and possible world semantics was initiated in Carnap [5], also in connection with the famous problem posed by Frege of ‘the Morning Star’ and ‘the Evening Star’. In Kripke semantics, we observe a fusion of the idea of a possible world with the Tarski semantics. Kripke structures correspond in a sense to interpretations in predicate logic.
Definition 4.8 (Kripke frames) A Kripke frame for any modal logic is a pair S = (W, R), where
(i) W is a non-empty set of possible worlds;
(ii) R ⊆ W × W is a binary relation called the accessibility relation.
If R(w, w′) holds, then w is the world of an agent who recognizes w′ as a possible world, in a sense, with a similar logical structure. One may say that whoever is in w knows the logical status at w′, accepts it, and consults w′ about their own decisions. A pointed frame is a pair (F, w), where F = (W, R) is a frame and w ∈ W. Given a frame F = (W, R), we define for each world w ∈ W its F-neighborhood N_F(w).
Definition 4.9 (F-neighborhoods) The F-neighborhood of w is N_F(w) = {w′ : R(w, w′)}.
Definition 4.10 (An assignment) Given a frame F = (W, R), an assignment is a mapping A which assigns a truth value to each pair (w, p) of a world w and an atomic proposition p. Given A, a logical status at each world is determined. The triple M = (W, R, A) is a structure. Consequently, the pair (M, w) for w ∈ W is a pointed structure.
Definition 4.11 (The satisfaction relation) The satisfaction relation, denoted |= (the Kleene symbol), is defined by structural induction for pointed structures of the form (F, A, w), where F is a frame, A an assignment and w ∈ W, by the following conditions:
(i) M, w |= p if and only if A(w, p) = 1;
(ii) M, w |= ⊥ for no world w;
(iii) M, w |= ¬φ if and only if it is not true that M, w |= φ;
(iv) M, w |= φ ⊃ ψ if and only if either it is not true that M, w |= φ or M, w |= ψ;
(v) M, w |= φ ∧ ψ if and only if M, w |= φ and M, w |= ψ;
(vi) M, w |= φ ∨ ψ if and only if either M, w |= φ or M, w |= ψ;
(vii) M, w |= Lφ if and only if M, w′ |= φ for each w′ ∈ N_F(w);
(viii) M, w |= Mφ if and only if M, w′ |= φ for some w′ ∈ N_F(w).
The formula φ is true at the world w in the structure M if and only if M, w |= φ. The formula φ is true in the structure M if and only if it is true at every world in that structure, and we mark this fact as M |= φ. In this case M is said to be a model for φ. Finally, the formula φ is valid, which is denoted |= φ, if and only if φ is true in every structure proper for the given logic. While prescriptions (i)–(vi) are familiar from the sentential calculus, (vii) and (viii) are new as they tie the truth of a formula to the properties of the accessibility relation. The question about the direction of the relation R (why from w to w′) can be answered to the effect that the instance R(w, w′) occurs because the observer at the world w has access to the logical status at the world w′ and regards this status as plausible from their point of view (whatever this means). In plain words, one can compare this situation to one in everyday life, when we seek advice on a certain decision from people we know and whose cases we know to have been similar to the one we face.
Example 4.1 Assume that a world w in a model M is such that N_F(w) = ∅. Then the formula Lp ⊃ p is not true at w whenever A(w, p) = 0: p is false at w while Lp is true at w, as universal quantification of any formula over the empty set yields truth. The same argument shows the falsity of the formula Lp ⊃ Mp at w: one may say that ‘what is necessary is impossible’.
Theorem 4.2 The formula (K) is valid.
Proof Consider a pointed structure (M, w). Suppose that the premises L(φ ⊃ ψ) and Lφ are true at w. Then for each world w′ in the neighborhood N_F(w), the formulae φ ⊃ ψ and φ are true at w′; hence, by detachment, the formula ψ is true at w′, and this implies that Lψ is true at w. So the formula (K) is true at w, and arbitrariness of (M, w) testifies to the validity of (K).
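The satisfaction conditions of Definition 4.11 translate directly into a recursive evaluator. The sketch below is one possible encoding, not part of the text: formulae are nested tuples such as ("imp", ("L", p), p), a structure is a triple (worlds, R, A) with R a set of pairs, and the name `holds` is an assumption made for illustration. It replays Example 4.1 on a single world with an empty neighborhood.

```python
def holds(model, w, f):
    """Decide M, w |= f by the clauses of Definition 4.11."""
    worlds, R, A = model
    op = f[0]
    if op == "atom":
        return A.get((w, f[1]), 0) == 1              # clause (i)
    if op == "bot":
        return False                                 # clause (ii)
    if op == "not":
        return not holds(model, w, f[1])             # clause (iii)
    if op == "imp":
        return (not holds(model, w, f[1])) or holds(model, w, f[2])
    if op == "and":
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "or":
        return holds(model, w, f[1]) or holds(model, w, f[2])
    neighbours = [v for v in worlds if (w, v) in R]  # N_F(w)
    if op == "L":
        return all(holds(model, v, f[1]) for v in neighbours)   # clause (vii)
    if op == "M":
        return any(holds(model, v, f[1]) for v in neighbours)   # clause (viii)
    raise ValueError(f"unknown connective: {op!r}")

p = ("atom", "p")
# Example 4.1: one world, empty accessibility relation, p false at that world.
dead_end = (["w"], set(), {("w", "p"): 0})

box_p = holds(dead_end, "w", ("L", p))                       # vacuously true
t_instance = holds(dead_end, "w", ("imp", ("L", p), p))      # Lp ⊃ p
d_instance = holds(dead_end, "w", ("imp", ("L", p), ("M", p)))  # Lp ⊃ Mp
print(box_p, t_instance, d_instance)
```

Python's `all` over an empty list returns `True`, which mirrors the remark in Example 4.1 that universal quantification over the empty set yields truth.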
Rules of inference: detachment (MP) and necessitation (N) as well as substitution preserve validity and this fact implies Theorem 4.3 Provable formulae of logic K are valid: Logic K is sound. Example 4.2 We give exemplary proofs of validity for formulae (i),(ii) below. (i) L(φ ⊃ ψ) ⊃ (Mφ ⊃ Mψ); (ii) M(φ ⊃ ψ) ⊃ (Lφ ⊃ Mψ).
Proof We prove (i). Suppose that L(φ ⊃ ψ) is true at a world w in a pointed structure (M, w). Then the formula φ ⊃ ψ is true at each w′ ∈ N_F(w). Suppose that the formula Mφ is true at w; hence, there exists a world w′ ∈ N_F(w) such that the formula φ is true at w′. As the implication φ ⊃ ψ is true at w′, the formula ψ is true at w′ by detachment, which implies that the formula Mψ is true at w. So the formula (i) is true at (M, w) and, in consequence, it is valid. Now, we prove (ii). Suppose that the formula M(φ ⊃ ψ) is true at the world w, which implies that the formula φ ⊃ ψ is true at some w′ ∈ N_F(w). Suppose that the formula Lφ is true at w; hence, the formula φ is true at each world in N_F(w), in particular at w′. By detachment, the formula ψ is true at w′, which shows that the formula Mψ is true at w.
There are schemes that are not valid in K.
Theorem 4.4 The following schemes are not valid in the modal system K:
(i) L(φ ∨ ψ) ⊃ (Lφ) ∨ (Lψ);
(ii) (Mφ) ∧ (Mψ) ⊃ M(φ ∧ ψ);
(iii) φ ⊃ LMφ;
(iv) Mφ ⊃ LMφ;
(v) MLφ ⊃ LMφ.
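Counterexamples of this kind can be checked mechanically. The sketch below encodes the three-world structure S1 and the four-world structure S2 used in the proof of Theorem 4.4 and evaluates instances of schemes (i)–(v) at the world w1. The tuple encoding of formulae, the function `holds`, and the choice of instances (φ = p, ψ = q in (i) and (ii), φ = q in (iii) and (iv), φ = p in (v)) are assumptions made for illustration.

```python
def holds(model, w, f):
    """Evaluate a tuple-encoded modal formula at world w of (worlds, R, A)."""
    worlds, R, A = model
    op, args = f[0], f[1:]
    if op == "atom":
        return A.get((w, args[0]), 0) == 1
    if op == "and":
        return all(holds(model, w, g) for g in args)
    if op == "or":
        return any(holds(model, w, g) for g in args)
    if op == "imp":
        return (not holds(model, w, args[0])) or holds(model, w, args[1])
    succ = [v for v in worlds if (w, v) in R]
    if op == "L":
        return all(holds(model, v, args[0]) for v in succ)
    if op == "M":
        return any(holds(model, v, args[0]) for v in succ)
    raise ValueError(op)

def L(f): return ("L", f)
def M(f): return ("M", f)
def imp(f, g): return ("imp", f, g)

p, q = ("atom", "p"), ("atom", "q")

# S1: worlds 1, 2, 3 stand for w1, w2, w3.
S1 = ([1, 2, 3],
      {(1, 2), (1, 3), (2, 3), (3, 3)},
      {(1, "p"): 1, (3, "p"): 1, (1, "q"): 1, (2, "q"): 1,
       (2, "p"): 0, (3, "q"): 0})
# S2: any assignment works; here every atom is false everywhere.
S2 = ([1, 2, 3, 4], {(1, 2), (1, 3), (4, 2), (4, 3)}, {})

scheme_i = imp(L(("or", p, q)), ("or", L(p), L(q)))
scheme_ii = imp(("and", M(p), M(q)), M(("and", p, q)))
scheme_iii = imp(q, L(M(q)))
scheme_iv = imp(M(q), L(M(q)))
def scheme_v(f): return imp(M(L(f)), L(M(f)))

falsified_at_w1 = {name: not holds(S1, 1, f) for name, f in
                   [("i", scheme_i), ("ii", scheme_ii),
                    ("iii", scheme_iii), ("iv", scheme_iv)]}
v_true_in_S1 = all(holds(S1, w, scheme_v(a)) for w in S1[0] for a in (p, q))
v_false_in_S2 = not holds(S2, 1, scheme_v(p))
print(falsified_at_w1, v_true_in_S1, v_false_in_S2)
```

Note that (iii) with φ = p happens to be true at w1 in S1, so the instance φ = q is the one that witnesses the failure.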
Proof Consider the set of possible worlds W = {w_1, w_2, w_3} with the assignment A: A(w_1, p) = A(w_3, p) = A(w_1, q) = A(w_2, q) = 1 and A(w_2, p) = A(w_3, q) = 0; the relation R has instances (w_1, w_2), (w_1, w_3), (w_2, w_3), (w_3, w_3). Then, in the structure S_1 = (W, R, A), taking φ = p and ψ = q in (i) and (ii), and φ = q in (iii) and (iv): (i) is not true at w_1, (ii) is equivalent to (i) by duality, (iii) is not true at w_1, (iv) is not true at w_1. In this structure (v) is satisfied. In order to falsify (v), we consider the structure S_2 = (W′, R′, A′) with W′ = {w_1, w_2, w_3, w_4}, R′ = {(w_1, w_2), (w_1, w_3), (w_4, w_2), (w_4, w_3)} and any assignment A′. Then (v) is not true at w_1: the world w_2 has an empty neighborhood, so Lφ is true and Mφ is false at w_2.
From the above it follows that properties of a logic depend on properties of accessibility relations. We already know that logic K does not set any demands on accessibility relations in its models. For other logics, properties of the relation R are instrumental. We gather below in pairs formulae of modal logic alongside types of accessibility relations which make the formulae valid.
Theorem 4.5 The following are pairs of particular formulae and accessibility relations which define Kripke structures in which the formulae are true, hence, valid:
(i) (T) Lφ ⊃ φ: formula (T) is true in all structures in which the accessibility relation R is reflexive, i.e., R(w, w) for each w. Such structures are said to be reflexive;
(ii) (B) φ ⊃ LMφ: formula (B) is true in all structures in which the accessibility relation is symmetric, i.e., if R(w, v) then R(v, w) for each v, w ∈ W. Such structures are said to be symmetric;
(iii) (4) Lφ ⊃ LLφ: formula (4) is true in all structures in which the accessibility relation is transitive, i.e., if R(w, v) and R(v, u) then R(w, u) for each w, v, u ∈ W. Such structures are said to be transitive;
(iv) (D) Lφ ⊃ Mφ: formula (D) is true in all structures in which the accessibility relation is serial, i.e., for each w there exists v such that R(w, v). Such structures are called serial;
(v) (5) Mφ ⊃ LMφ: formula (5) is true in all structures in which the accessibility relation is Euclidean, i.e., if R(w, v) and R(w, u) then R(v, u) for each w, v, u ∈ W. Such structures are said to be Euclidean;
(vi) (DC) Mφ ⊃ Lφ: formula (DC) is true in all structures in which the accessibility relation is partly functional, i.e., if R(w, v) and R(w, u) then v = u, or functional, i.e., there exists v such that R(w, v) and for each u, if R(w, u) then u = v, for each w, v, u ∈ W. Such structures are called partly functional/functional;
(vii) (4C) LLφ ⊃ Lφ: formula (4C) is true in all structures in which the accessibility relation is (weakly) dense, i.e., for each pair w, v, if R(w, v), then there exists u such that R(w, u) and R(u, v). Such structures are said to be dense;
(viii) (G) MLφ ⊃ LMφ: formula (G) is true in all structures in which the accessibility relation is directed, i.e., for each triple w, v, u, if R(w, v) and R(w, u) then there exists t such that R(v, t) and R(u, t). Such structures are called directed.
Proof We prove statements (i)–(viii). For (i): consider a frame F = (W, R) in which the accessibility relation R is reflexive. Suppose that the formula Lφ is true at w ∈ W, which means that if w′ ∈ N_F(w), then w′ |= φ. As R(w, w), we have w ∈ N_F(w), hence, w |= φ.
For (ii), suppose that a frame F = (W, R) is symmetric and the formula φ is true at w ∈ W. We need to prove the truth of the formula LMφ at w, which means that if w′ ∈ N_F(w) then there exists w* ∈ N_F(w′) such that w* |= φ. By symmetry, we can take w* = w, which proves the truth of the formula.
For (iii), suppose that Lφ is true at w ∈ W and the frame (W, R) is transitive. Truth of LLφ at w means that if w′ ∈ N_F(w) and w* ∈ N_F(w′) then w* |= φ. By transitivity of R, w* ∈ N_F(w) and, by truth of Lφ at w, we have w* |= φ, which proves the case.
Concerning (iv), suppose that a frame F = (W, R) is serial and consider the formula (D): Lφ ⊃ Mφ. To prove its truth at w ∈ W, assume that the formula Lφ is true at w, which means that if w′ ∈ N_F(w), then w′ |= φ. By the seriality property of R, there exists w* such that w* ∈ N_F(w), hence, w* |= φ, by which we conclude that w |= Mφ.
For (v), consider a Euclidean frame F = (W, R) and suppose that the formula Mφ is true at w ∈ W. There exists w′ with R(w, w′) and w′ |= φ. If R(w, w*), then R(w*, w′), hence, w* |= Mφ, and this holds true for each w* such that R(w, w*); thus w |= LMφ.
For (vi), assume that a frame F = (W, R) is functional and the formula Mφ is true at w ∈ W. It suffices to consider the only w′ ∈ N_F(w) satisfying the definition of functionality. If w |= Mφ then w′ |= φ and, w′ being the only neighbor of w, w |= Lφ.
To treat the case (vii), assume a dense frame F = (W, R) and truth of the premise LLφ at w ∈ W. Suppose that w′ ∈ N_F(w). Consider w* such that w* ∈ N_F(w) and w′ ∈ N_F(w*). As w |= LLφ, we have w* |= Lφ and hence w′ |= φ, by which w |= Lφ.
For (viii), assume a directed frame F = (W, R) and suppose that w |= MLφ for a world w ∈ W. There exists w′ ∈ N_F(w) such that w′ |= Lφ. Consider any w″ ∈ N_F(w); as the relation R is directed, there exists t ∈ N_F(w′) ∩ N_F(w″) and t |= φ, hence, w″ |= Mφ and w |= LMφ. This concludes the case.
It turns out that the statements converse to (i)–(viii) above are true.
Theorem 4.6 In cases (i)–(viii) in Theorem 4.5, the converse is true, i.e., if the formula is true for the frame, then the corresponding relation property holds in the frame.
Proof The best way to prove the theorem is to argue in each case by reductio ad absurdum (by contradiction), i.e., to assume that the relation R in the frame F = (W, R) does not have the required property and prove that in that frame the formula is not true. To this end, in each case we construct a falsifying structure. We consider an atomic proposition p as φ and we endow our frames with appropriate falsifying assignments. Consider the structure S_3 = (W, R, A) with W = {w_1, w_2, w_3} and R consisting of the instances (w_1, w_2), (w_3, w_1), (w_3, w_2); the frame (W, R) is neither reflexive, nor symmetric, nor serial, nor Euclidean, nor functional. For the assignment A with A(w_1, p) = 1, A(w_2, p) = A(w_3, p) = 0, the structure S_3 falsifies (i), (ii), (iv), (v), (vi). Consider the structure S_4 = (W, R, A), with W = {w_1, w_2, w_3, w_4}, R consisting of the instances (w_1, w_2), (w_2, w_3), (w_2, w_4), (w_3, w_4), and an assignment A which satisfies the conditions A(w_2, p) = 1 = A(w_4, p), A(w_3, p) = 0. The frame is neither transitive, nor dense, nor directed, and the structure S_4 falsifies (iii), (vii), (viii).
Corollary 4.1 The following equivalences relate relational structures and modal formulae:
(i) For any frame F, F |= Lφ ⊃ φ if and only if F is reflexive;
(ii) For any frame F, F |= φ ⊃ LMφ if and only if F is symmetric;
(iii) For any frame F, F |= Lφ ⊃ LLφ if and only if F is transitive;
(iv) For any frame F, F |= Lφ ⊃ Mφ if and only if F is serial;
(v) For any frame F, F |= Mφ ⊃ LMφ if and only if F is Euclidean;
(vi) For any frame F, F |= Mφ ⊃ Lφ if and only if F is partly functional;
(vii) For any frame F, F |= LLφ ⊃ Lφ if and only if F is dense;
(viii) For any frame F, F |= MLφ ⊃ LMφ if and only if F is directed.
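The correspondences between formulae and relation properties can be confirmed exhaustively on small frames. The following sketch enumerates all 512 accessibility relations on a three-element set and compares, for each scheme, validity of its instance with φ = p under every assignment against the matching first-order property. Testing the single atomic instance suffices for this comparison, since every set of worlds is the truth set of p under some assignment. All names are illustrative assumptions, and the property paired with Mφ ⊃ Lφ is partial functionality (at most one successor).

```python
from itertools import product

def holds(R, n, val, w, f):
    """Truth of f at world w; worlds are 0..n-1, val is the truth set of p."""
    op = f[0]
    if op == "p":
        return w in val
    if op == "imp":
        return (not holds(R, n, val, w, f[1])) or holds(R, n, val, w, f[2])
    succ = [v for v in range(n) if (w, v) in R]
    if op == "L":
        return all(holds(R, n, val, v, f[1]) for v in succ)
    if op == "M":
        return any(holds(R, n, val, v, f[1]) for v in succ)
    raise ValueError(op)

def frame_valid(R, n, f):
    """True iff f holds at every world under every assignment to p."""
    for bits in product((False, True), repeat=n):
        val = {w for w in range(n) if bits[w]}
        if not all(holds(R, n, val, w, f) for w in range(n)):
            return False
    return True

def properties(R, n):
    W = range(n)
    return {
        "T": all((w, w) in R for w in W),                                   # reflexive
        "B": all((v, w) in R for (w, v) in R),                              # symmetric
        "4": all((w, u) in R for (w, v) in R for (x, u) in R if x == v),    # transitive
        "D": all(any((w, v) in R for v in W) for w in W),                   # serial
        "5": all((v, u) in R for (w, v) in R for (x, u) in R if x == w),    # Euclidean
        "DC": all(v == u for (w, v) in R for (x, u) in R if x == w),        # partly functional
        "4C": all(any((w, u) in R and (u, v) in R for u in W)
                  for (w, v) in R),                                         # dense
        "G": all(any((v, t) in R and (u, t) in R for t in W)
                 for (w, v) in R for (x, u) in R if x == w),                # directed
    }

p = ("p",)
def L(f): return ("L", f)
def M(f): return ("M", f)
def imp(f, g): return ("imp", f, g)

SCHEMES = {"T": imp(L(p), p), "B": imp(p, L(M(p))), "4": imp(L(p), L(L(p))),
           "D": imp(L(p), M(p)), "5": imp(M(p), L(M(p))), "DC": imp(M(p), L(p)),
           "4C": imp(L(L(p)), L(p)), "G": imp(M(L(p)), L(M(p)))}

n = 3
pairs = [(w, v) for w in range(n) for v in range(n)]
mismatches = []
for bits in product((0, 1), repeat=len(pairs)):
    R = {pr for pr, b in zip(pairs, bits) if b}
    prop = properties(R, n)
    for name, scheme in SCHEMES.items():
        if frame_valid(R, n, scheme) != prop[name]:
            mismatches.append((name, sorted(R)))
print(len(mismatches))
```

The enumeration covers only frames with three worlds, so it illustrates the equivalences rather than proving them.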
Results in Corollary 4.1 reveal the close bond between modal logics and Kripke structures. Properties of relations are not independent, and their relationships allow us to establish relationships among logics. We offer some examples.
Theorem 4.7 Any frame F is a frame for (K). Hence, K is contained in any modal logic.
Theorem 4.8 Any reflexive relation is serial. Hence, any frame for the formula (T) is a frame for the formula (D). Hence, T extends D.
Theorem 4.9 Any relation R which is symmetric and Euclidean is transitive. Hence, any model for (B5) is a model for (4). Hence (B5) extends (4).
Proof Consider a relation R on a set W of worlds which is symmetric and Euclidean, along with worlds w, w′, w* ∈ W and instances of the relation R: R(w, w′), R(w′, w*). By symmetry, the instances R(w′, w), R(w*, w′) hold. Consider the pair of instances R(w′, w), R(w′, w*); as the relation R is Euclidean, the instance R(w, w*) holds true, which proves that the relation R is transitive. We argue as above to conclude that (B5) extends (4).
Theorem 4.10 The following are equivalent for any relation R:
(i) Relation R is reflexive, symmetric and transitive, i.e., it is an equivalence relation;
(ii) Relation R is reflexive and Euclidean;
(iii) Relation R is serial, symmetric and Euclidean;
(iv) Relation R is serial, symmetric and transitive;
(v) Relation R is reflexive, transitive and Euclidean.
Proof (i) implies (ii): given w with R(w, w′), R(w, w*), we have by symmetry that R(w′, w), R(w, w*), and transitivity implies R(w′, w*); (ii) implies (iii): as R is reflexive, it is serial; consider R(w, w*). From the instances R(w, w*), R(w, w), it follows by the Euclidean property of R that R(w*, w) holds true; (iii) implies (iv): proved in Theorem 4.9; (iv) implies (v): consider w; as R is serial, there exists w* with R(w, w*). By symmetry, R(w*, w), and transitivity yields from the last two instances that R(w, w), i.e., reflexivity. If R(w, w*), R(w, w′), then symmetry yields R(w*, w), which along with R(w, w′) yields by transitivity R(w*, w′), proving the Euclidean property of R; (v) implies (i): only symmetry needs a proof, which is simple: from R(w, w*), R(w, w), R(w*, w*) we obtain by the Euclidean property that R(w*, w).
For a_1, a_2, ..., a_k ∈ {K, T, B, D, 4, 5}, we denote by F_{a_1 a_2 ... a_k} the frame which satisfies the logics a_1, a_2, ..., a_k.
Corollary 4.2 For any frame F, the following are equivalent:
(i) F = F_{KTB4};
(ii) F = F_{KT5};
(iii) F = F_{KDB5};
(iv) F = F_{KDB4};
(v) F = F_{KT45}.
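Theorems 4.9 and 4.10 are statements about binary relations only, so they can be sanity-checked by enumerating every relation on a small set. The sketch below does this for sets with one, two and three elements; the function names are illustrative and the check is a finite experiment, not a substitute for the proofs.

```python
from itertools import product

def relations(n):
    """Yield every binary relation on {0, ..., n-1} as a set of pairs."""
    pairs = [(a, b) for a in range(n) for b in range(n)]
    for bits in product((0, 1), repeat=len(pairs)):
        yield {pr for pr, b in zip(pairs, bits) if b}

def reflexive(R, n):
    return all((w, w) in R for w in range(n))

def serial(R, n):
    return all(any((w, v) in R for v in range(n)) for w in range(n))

def symmetric(R, n):
    return all((v, w) in R for (w, v) in R)

def transitive(R, n):
    return all((w, u) in R for (w, v) in R for (x, u) in R if x == v)

def euclidean(R, n):
    return all((v, u) in R for (w, v) in R for (x, u) in R if x == w)

def conditions(R, n):
    """Truth values of conditions (i)-(v) of Theorem 4.10 for the relation R."""
    r, d = reflexive(R, n), serial(R, n)
    s, t, e = symmetric(R, n), transitive(R, n), euclidean(R, n)
    return [r and s and t, r and e, d and s and e, d and s and t, r and t and e]

sizes = (1, 2, 3)
# Theorem 4.10: the five conditions always agree.
theorem_4_10 = all(len(set(conditions(R, n))) == 1
                   for n in sizes for R in relations(n))
# Theorem 4.9: symmetric and Euclidean implies transitive.
theorem_4_9 = all(transitive(R, n)
                  for n in sizes for R in relations(n)
                  if symmetric(R, n) and euclidean(R, n))
print(theorem_4_10, theorem_4_9)
```

Seriality cannot be dropped from condition (iii): the empty relation on a non-empty set is symmetric and Euclidean but not reflexive.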
From a plethora of possible combinations among relations and formulae of modal logic, we single out some which may be called ‘canonical’. These are modal systems K, T, S4, S5. We can now introduce explicitly axiom schemes for logics T, S4, S5. To this end, we list some formulae which provide axiom schemes for these logics.
Theorem 4.11 The following formulae are pairwise non-equivalent:
(1) (K) L(φ ⊃ ψ) ⊃ (Lφ ⊃ Lψ);
(2) (T) Lφ ⊃ φ;
(3) (4) Lφ ⊃ LLφ;
(4) (5) Mφ ⊃ LMφ.
Proof (K) holds true unconditionally in all frames, (T) requires reflexive frames, (4) requires transitivity in addition to reflexivity, and (5) additionally requires symmetry; simple examples show that no two of these classes of frames coincide as relational systems.
Definition 4.12 (Normal system T of modal logic) Axiomatization of the system T rests on the following axiom schemes (we will not mention the replacement Mφ ≡ ¬L¬φ, which is present in all axiomatizations):
(SL) All tautologies (i.e., valid formulae of SL);
(K) L(φ ⊃ ψ) ⊃ (Lφ ⊃ Lψ);
(T) Lφ ⊃ φ.
Please observe that the scheme (T) implies the scheme (T^n): L^{n+1}φ ⊃ L^nφ, where L^1 is L and L^{n+1} is L(L^n) for n ≥ 1. System T is in our notational convention the system KT. Formulae of T are valid in reflexive frames; we denote by the symbol F^r the class of reflexive frames. Valid formulae of T are also φ ⊃ Mφ and, more generally, M^nφ ⊃ M^{n+1}φ.
Definition 4.13 (Normal system S4 of modal logic) System S4 adds to system T the axiom scheme (4) Lφ ⊃ LLφ. Hence, in S4, we have Lφ ≡ LLφ, more generally L^nφ ≡ L^{n+1}φ, with dual formulae M^{n+1}φ ≡ M^nφ for n ≥ 1. System S4 is in our notational convention the system KT4; hence, formulae of S4 are valid in reflexive transitive frames, and the class of these frames is denoted F^{rt}.
Definition 4.14 (System S5 of modal logic) S5 is S4 augmented by the scheme (5) Mφ ⊃ LMφ. The dual formula is MLφ ⊃ Lφ. System S5 is in our notation the system KT45; hence, frames for S5 are reflexive, symmetric and transitive, i.e., their accessibility relations are equivalence relations, and their class is denoted F^{eq}.
4.3 Natural Deduction: Analytic Tableaux for Modal Logics Analytic tableaux for modal logics differ essentially from tableaux for sentential and predicate logics. The reason is obvious: in case of modal logics semantics is defined via Kripke structures in which the main role is played by relations, hence, their properties bear on the form of tableaux. We have therefore to discuss separately particular cases. Definition 4.15 (Tableaux for the modal logic K) In case of K, no restrictions are imposed on the accessibility relation R. Suppose we develop the signed analytic tableau and we proceed from the current development at word w to the R-connected world w . If we have the formula TLφ at w, then we can write Tφ at the child w . If we have the formula TMφ at w then again w may satisfy T φ along with consequences of the formulae up the tree on the branch from w to w : if TLφ is at w, then we should have T φ at w prior to the expansion of T Mφ; similarly, if FLφ occurs on the branch before w , we should add Fφ to w before expanding TMφ. The rules for tableaux for K are then: T Lφ F Lφ T Mφ F Mφ ; Fφ ; T φ ; Fφ Tφ
with proviso about the order of expansions stated in Definition 4.15.

Example 4.3 We consider the formula φ : L(ψ ∧ ξ) ⊃ (Lψ) ∧ (Lξ). The tableau develops as follows.
(1.) F [L(ψ ∧ ξ) ⊃ (Lψ) ∧ (Lξ)];
(2.) T [L(ψ ∧ ξ)]; at this step, we add the condition F [(Lψ) ∧ (Lξ)], which initiates two branches into which we expand (2.);
(3.) we develop the left branch:
(4.) T [ψ ∧ ξ]; we continue development;
(5.) T ψ;
(6.) T ξ: at this point we add the consequent of F Lψ; (6.') we initiate the right branch by adding F ξ to (1.-6.);
(7.) F ψ; (7.') F ξ;
(8.) X: the left branch closes; X: the right branch closes.

Definition 4.16 (Tableaux for the modal logic T) In case of T, Kripke structures are reflexive. This means that any world w at which we are with our development of a tableau inherits consequents of the signed formulae
at the node for w, along with expansions of preceding formulae exactly as in case of the modal logic K. The rules for T-tableaux are as follows (each written premise / consequent): T Lφ / T φ; F Mφ / F φ; F Lφ / F φ; T Mφ / T φ
Example 4.4 We propose to discuss the tableau for the T-formula φ : LLψ ⊃ Lψ. The tableau develops as follows.
(1.) F [LLψ ⊃ Lψ];
(2.) T LLψ;
(3.) F Lψ: at this node, we first develop the content of line (2.);
(4.) T Lψ;
(5.) T ψ;
(6.) F ψ: we develop the content of line (3.);
(7.) X: the branch closes.
Definition 4.17 (Tableaux for the modal logic S4) Kripke structures for S4 are reflexive and transitive. Transitivity requires that we should keep formulae T Lφ and F Mφ at the considered branch, neglecting other signed formulae, before expanding the current node. Rules for S4-tableaux are (each written premise / consequent): T Lφ / T φ; F Mφ / F φ; F Lφ / F φ; T Mφ / T φ
Example 4.5 The tableau for the S4-formula φ : Lψ ⊃ L[(Lψ) ∨ (Lξ)].
(1.) F [Lψ ⊃ L[(Lψ) ∨ (Lξ)]];
(2.) T Lψ: it is kept for further usage;
(3.) F L[(Lψ) ∨ (Lξ)]: from (1.);
(4.) F [(Lψ) ∨ (Lξ)]: (1.) and (3.) are forgotten (may be crossed out);
(5.) F Lψ: by (4.) and rules for ∨;
(6.) F Lξ: by (4.) and rules for ∨;
(7.) F ψ: by (5.);
(8.) F ξ: by (6.);
(9.) T ψ: by (2.);
(10.) X: by (7.), (9.): the branch closes.
Example 4.6 In Figs. 4.1, 4.2, and 4.3 below, we sketch tableau trees for K-, T-, and S4-formulae.

The above rules for K-, T-, and S4-tableaux come from (Fitting [7–9]). It will be useful for the sequel to adopt notation already established in Fitting [9], viz., writing each rule as premise / consequent, the rules T Mφ / T φ and F Lφ / F φ are called π-rules, with the consequents in them denoted as π0, and the rules T Lφ / T φ and F Mφ / F φ are called ν-rules, with consequents denoted ν0. For a set Γ of modal formulae, we denote by Γ∗ the set which is {ν0 : ν ∈ Γ} in case of K and T and {ν : ν ∈ Γ} in case of S4; this last definition reflects transitivity of Kripke structures for S4.
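The π- and ν-rules together with the operation Γ∗ suggest a simple decision procedure in the set-based style of tableaux. The sketch below is only an illustration under stated assumptions: signed formulae are encoded as pairs (sign, formula) with formulae as nested tuples, the names `components`, `closes` and `provable` are hypothetical, and the check for a repeated world on a branch is an addition made here so that the S4 search terminates.

```python
def components(sf):
    """Classify a signed formula and return (kind, consequents)."""
    sign, f = sf
    if isinstance(f, str):
        return 'atom', ()
    op = f[0]
    if op == 'not':
        return 'alpha', ((not sign, f[1]),)
    if op == 'and':
        return ('alpha' if sign else 'beta'), ((sign, f[1]), (sign, f[2]))
    if op == 'or':
        return ('beta' if sign else 'alpha'), ((sign, f[1]), (sign, f[2]))
    if op == 'imp':
        return ('beta' if sign else 'alpha'), ((not sign, f[1]), (sign, f[2]))
    if op == 'L':                                # T L is of type nu, F L of type pi
        return ('nu' if sign else 'pi'), ((sign, f[1]),)
    if op == 'M':                                # T M is of type pi, F M of type nu
        return ('pi' if sign else 'nu'), ((sign, f[1]),)
    raise ValueError(f"unknown connective {op!r}")

def closes(S, logic, path=frozenset()):
    """True if every branch of the tableau started from the set S closes."""
    S = set(S)
    todo = list(S)
    while todo:                                  # saturate the current world
        kind, parts = components(todo.pop())
        if kind == 'alpha' or (kind == 'nu' and logic in ('T', 'S4')):
            new = [p for p in parts if p not in S]   # reflexivity: nu0 stays at this world
            S.update(new)
            todo.extend(new)
        elif kind == 'beta' and not any(p in S for p in parts):
            return all(closes(S | {p}, logic, path) for p in parts)
    if any((not sign, f) in S for (sign, f) in S):
        return True                              # a pair of conjugate formulae
    key = frozenset(S)
    if key in path:
        return False                             # repeated world: branch stays open
    star = {x if logic == 'S4' else components(x)[1][0]
            for x in S if components(x)[0] == 'nu'}
    for sf in S:                                 # pass to a new world by a pi-rule
        kind, parts = components(sf)
        if kind == 'pi' and closes(star | {parts[0]}, logic, path | {key}):
            return True
    return False

def provable(f, logic):
    """Tableau provability of f in logic 'K', 'T' or 'S4'."""
    return closes({(False, f)}, logic)
```

With this encoding, the formulae of Examples 4.3, 4.4 and 4.5 can be tested as `provable(f, 'K')`, `provable(f, 'T')` and `provable(f, 'S4')`, respectively.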
Fig. 4.1 A modal closed K-tableau
Fig. 4.2 A modal closed T-tableau
Definition 4.18 (Semi-analytic tableaux for the modal logic S5) The case of S5, as well as of any logic whose Kripke structures are symmetric, is different from the cases of K, T, S4. The condition of symmetry of the accessibility relation induces new rules in addition to the already standard ones. Consider a world w such that w |= T φ, hence, w |= φ; for a world w′ such that R(w, w′), also R(w′, w), hence, w′ |= T Mφ. Similarly, in the notation of the last paragraph, if w |= F φ, then w′ |= F Lφ. We add these two rules to our set of rules; to collect these rules under a common label, we
Fig. 4.3 A modal closed S4-tableau
call them ε-rules, in addition to the types α, β, γ, δ known from Ch. 3. We have therefore the following rules (each written premise / consequent): T Lφ / T φ; F Mφ / F φ; F Lφ / F φ; T Mφ / T φ, and the ε-rules: T φ / T Mφ; F φ / F Lφ.
Example 4.7 We insert an example of a tableau for S5. We consider the formula φ : LMξ ⊃ Mξ.
(1.) F [LMξ ⊃ Mξ];
(2.) T LMξ;
(3.) F Mξ: here we apply an ε-rule;
(4.) F LMξ;
(5.) X: the tableau closes: expansion of (2.) yields in two steps T ξ and expansion of (4.) yields in two steps F ξ on the single branch.
4.4 Meta-Theory of Modal Logics: Part I: K,T,S4

Definition 4.19 (Provability, satisfiability) Let L be one of K, T, S4. A formula φ is tableau-provable if each tableau for Fφ is closed. A set Γ of formulae is satisfiable if and only if there exists a Kripke structure M and a world w in M such that M, w |= Γ. A tableau is satisfiable if and only if there exists a branch whose set of formulae is satisfiable.
Theorem 4.12 (The extension lemma) For L in {K, T, S4}, if an L-tableau for φ is satisfiable and it is modified by an application of a tableau rule for L, then the resulting tableau is satisfiable.

Proof The proof is by structural induction. The cases of different logics can be settled by slightly different arguments. If a rule of type α or β is applied, then the proof goes as in the sentential case. Suppose that the rule applied is, e.g., the π-rule with premise T Mφ and consequent T φ. Let w be the world such that w |= T Mφ; then there exists w′ such that R(w, w′) and w′ |= T φ, so the extended branch is satisfiable. Other cases are decided in a similar manner.

Theorem 4.13 (Soundness property) If a formula φ of K, T, or S4 is tableau-provable, then φ is valid.

Proof Provability of φ means that each tableau for Fφ is closed. Were Fφ satisfiable, Theorem 4.12 would yield a satisfiable, hence not closed, tableau for Fφ; hence, Fφ is unsatisfiable and φ is valid.

The completeness property of the tableau calculus requires additional tools, to which we now proceed, with the completeness theorem at the end.

Definition 4.20 (Hintikka sets, Hintikka consistency property for modal logics K, T, S4) It turns out that it is convenient to consider families of sets of formulae. Hintikka sets from Chap. 3 undergo modifications adapting them to modal contexts. Clearly, the sentential context remains unchanged. We define H∗ as follows: H∗ = {ν0 : ν ∈ H} for K, T and H∗ = {ν : ν ∈ H} for S4. In (H0)-(H5), α and β are names for types α, β and for formulae of those respective types. A family ℋ of sets of modal formulae is a Hintikka consistent family if
(H0) if H ∈ ℋ, then H contains no pair of conjugate formulae;
(H1) if H ∈ ℋ, then H contains neither T ⊥ nor F ⊤;
(H2) if H ∈ ℋ and α ∈ H, then H ∪ {α1, α2} ∈ ℋ;
(H3) if H ∈ ℋ and β ∈ H, then either H ∪ {β1} ∈ ℋ or H ∪ {β2} ∈ ℋ;
(H4) for K, T, S4: if H ∈ ℋ and π ∈ H, then H∗ ∪ {π0} ∈ ℋ;
(H5) for T, S4: if H ∈ ℋ and ν ∈ H, then H ∪ {ν0} ∈ ℋ.
We call each member of a Hintikka consistent family a Hintikka consistent set.

Theorem 4.14 Each Hintikka consistent set is satisfiable.

Proof The steps in the proof are the following. Consider a family ℋ of Hintikka sets. First, use, e.g., the Lindenbaum Lemma (recalled below) to extend each Hintikka set to a maximal Hintikka set. Consider the collection MaxCon(ℋ) of maximal Hintikka sets for ℋ. Accept maximal Hintikka sets as possible worlds in a Kripke frame ℋ+ and complete the frame by a definition of the accessibility relation R_H: for two maximal Hintikka sets W and W′, let R_H(W, W′) if and only if W∗ ⊆ W′. Then, a lemma follows.
Lemma 4.1 The accessibility relation R_H is subject to no conditions for K; it is reflexive for T and transitive for S4.

Proof Reflexivity of R_H for T: consider a world W being a maximal Hintikka consistent set along with ν0 ∈ W∗. There exists a ν ∈ W and, by definition, W ∪ {ν0} ∈ ℋ+. As W is maximal, ν0 ∈ W; hence, W∗ ⊆ W, i.e., R_H(W, W). The argument for S4 follows by reflexivity proved above and the relation W∗ = (W∗)∗. The lemma is proved.

It follows by the Lemma that ℋ+ is a frame, respectively, for K, T, S4. It remains to produce a Kripke structure satisfying sets of formulae in ℋ by submitting a definition of an assignment A. One verifies that the property of canonical models, viz., if φ ∈ W then W |= φ, holds for each maximal Hintikka consistent set W. This is done by structural induction beginning with atomic formulae T p, F p: if T p ∈ W then A(W, p) = 1; if F p ∈ W then A(W, p) = 0; if neither T p nor F p is in W, then we may assign any value, be it 1 or 0, to p. Please observe that maximality implies that each maximal Hintikka consistent set W contains immediate sub-formulae of its formulae. For instance, if α ∈ W, then W ∪ {α1, α2} ∈ ℋ and, as W is maximal, simply α1, α2 ∈ W. So, if W |= α1, α2, then W |= α. Similarly for β ∈ W.

Consider the case ν ∈ W. One has to prove that if R_H(W, W′), i.e., W∗ ⊆ W′, then ν0 ∈ W′. In case of K, T, ν0 ∈ W∗; hence, ν0 ∈ W′. By arbitrary choice of W′, W |= ν. In case of S4, with the above notation, ν ∈ W∗; hence, ν ∈ W′ and, by Definition 4.20 (H5), W′ ∪ {ν0} ∈ ℋ; hence, ν0 ∈ W′ and, as in case of K, T, W |= ν.

Finally, the case π ∈ W is to be considered. If π ∈ W, then by Definition 4.20 (H4), W∗ ∪ {π0} ∈ ℋ; hence, there exists a maximal Hintikka consistent set W′ which extends W∗ ∪ {π0}, and then π0 ∈ W′. By the hypothesis of induction, W′ |= π0 and, as R_H(W, W′), it follows that W |= π.
By the above, each Hintikka consistent set H in ℋ is satisfied in the canonical model ℋ+; the world satisfying H is the maximal consistent extension of H. A tableau for a finite set of signed formulae begins with a list of the formulae in the set and proceeds as usual. For those sets, we have

Theorem 4.15 A finite set H of formulae is Hintikka consistent in a logic X = K, T, S4 if and only if no X-tableau for H closes.

Proof It suffices to prove that the family of finite sets with the property that no tableau for them closes is a Hintikka consistent family ℋ. Suppose H is a set with this property and α ∈ H. Suppose that H ∪ {α1, α2} ∉ ℋ. Then there is a closed tableau which contains a vertical sequence -α-α1-α2-Θ. Hence, before propagation of α, we have a fragment -α-Θ of the tableau that closes. Thus, a contradiction. Similarly for the other cases of β, ν, π.

In consequence of the above considerations, we obtain the completeness theorem for K, T, S4 tableaux.
Theorem 4.16 (The tableau-completeness theorem for K,T,S4) If a formula is valid, then it is tableau provable.

Proof It is a proof by contradiction. Suppose that a valid formula φ is not tableau provable; hence, there exists an open branch in a tableau for Fφ, and then Fφ is Hintikka-consistent, hence satisfiable, which contradicts the validity of φ.

This concludes our excursion into the realm of modal tableaux for K, T, S4. We now pass to the case of S5.

Definition 4.21 (The case of S5) As before, S5, due to symmetry, involves ε-rules; transitivity and reflexivity of its structures take over all that was prescribed for K and T. As a result, for any set Γ of S5-formulae, the set Γ∗ is {ν : ν ∈ Γ} ∪ {π : π ∈ Γ}.

Theorem 4.17 S5 is weakly complete, i.e., it is complete when enhanced with the ε-rules.

Proof We augment our definition of a Hintikka set with conditions:
(H6) if H ∈ ℋ and ν ∈ H, then H ∪ {ν0} ∈ ℋ;
(H7) if H ∈ ℋ and π ∈ H, then H∗ ∪ {π0} ∈ ℋ;
(H8) if H ∈ ℋ and π0 ∈ H, then H ∪ {π} ∈ ℋ.
We then proceed as in the cases of K, T, S4 to build maximal consistent Hintikka sets as a canonical model and repeat the argument leading to the completeness proof.
4.5 Natural Deduction: Modal Sequent Calculus

We have met sequent calculus in Chaps. 2 and 3 and now we propose to meet it in case of modal logics. We present the system for S4 developed in (Ohnishi and Matsumoto [10]). A sequent in modal logic is an ordered pair <Γ, Δ> of sets of modal formulae, which we write down as Γ ⇒ Δ. A sequent is valid if, in each case when all formulae in Γ are valid, a formula in Δ is valid. In particular, if a sequent <∅, φ> is valid, then φ is valid. The same concerns provability: if a sequent <∅, φ> is provable, then φ is provable. This is exactly the case of entailment (logical consequence), and a sequent <Γ, Δ> is equivalent to the formula ⋀Γ ⊃ ⋁Δ. We have been applying the idea of Smullyan of relating sequents to tableaux: a sequent <Γ, Δ> is equivalent to the set of signed formulae {T γ_i : γ_i ∈ Γ} ∪ {F δ_j : δ_j ∈ Δ}. We now recall the aforementioned sequent system for S4.

Definition 4.22 The modal sequent system for S4:
1. Axioms: Γ, φ ⇒ φ, Δ;
Though the rules for the sentential part are already given in Chs. 2 and 3, we recall them here for our convenience.
2. Rules for sentential connectives (each rule is written premise(s) / conclusion)
2.1 (left ∧) Γ, φ, ψ ⇒ Δ / Γ, φ ∧ ψ ⇒ Δ; (right ∧) Γ ⇒ Δ, φ and Γ ⇒ Δ, ψ / Γ ⇒ Δ, φ ∧ ψ
2.2 (left ∨) Γ, φ ⇒ Δ and Γ, ψ ⇒ Δ / Γ, φ ∨ ψ ⇒ Δ; (right ∨) Γ ⇒ Δ, φ, ψ / Γ ⇒ Δ, φ ∨ ψ
2.3 (left ⊃) Γ ⇒ Δ, φ and Γ, ψ ⇒ Δ / Γ, φ ⊃ ψ ⇒ Δ; (right ⊃) Γ, φ ⇒ Δ, ψ / Γ ⇒ Δ, φ ⊃ ψ
2.4 (left ¬) Γ ⇒ Δ, φ / Γ, ¬φ ⇒ Δ; (right ¬) Γ, φ ⇒ Δ / Γ ⇒ Δ, ¬φ
3. Rules for modal connectives
We recall that Γ∗ = {ν : ν ∈ Γ} and we add Γ∗∗ = {π : π ∈ Γ}.
3.1 (left L) Γ, φ ⇒ Δ / Γ, Lφ ⇒ Δ; (right L) Γ∗ ⇒ Δ∗∗, φ / Γ ⇒ Δ, Lφ
3.2 (left M) Γ∗, φ ⇒ Δ∗∗ / Γ, Mφ ⇒ Δ; (right M) Γ ⇒ Δ, φ / Γ ⇒ Δ, Mφ
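The sentential rules 2.1-2.4 can be read backwards, from conclusion to premises, which gives a simple proof search. The following sketch covers only the sentential part of the system; the tuple encoding of formulae and the name `provable` are illustrative assumptions, and formulae beginning with L or M are treated as opaque atoms, so the modal rules 3.1-3.2 are not implemented here.

```python
def provable(gamma, delta):
    """Backward proof search for the sequent gamma => delta with rules 2.1-2.4."""
    gamma, delta = list(gamma), list(delta)
    if any(f in delta for f in gamma):           # axiom: Γ, φ ⇒ φ, Δ
        return True
    for i, f in enumerate(gamma):                # left rules
        if isinstance(f, str):
            continue
        rest = gamma[:i] + gamma[i + 1:]
        if f[0] == 'and':
            return provable(rest + [f[1], f[2]], delta)
        if f[0] == 'or':
            return provable(rest + [f[1]], delta) and provable(rest + [f[2]], delta)
        if f[0] == 'imp':
            return provable(rest, delta + [f[1]]) and provable(rest + [f[2]], delta)
        if f[0] == 'not':
            return provable(rest, delta + [f[1]])
    for i, f in enumerate(delta):                # right rules
        if isinstance(f, str):
            continue
        rest = delta[:i] + delta[i + 1:]
        if f[0] == 'and':
            return provable(gamma, rest + [f[1]]) and provable(gamma, rest + [f[2]])
        if f[0] == 'or':
            return provable(gamma, rest + [f[1], f[2]])
        if f[0] == 'imp':
            return provable(gamma + [f[1]], rest + [f[2]])
        if f[0] == 'not':
            return provable(gamma + [f[1]], rest)
    return False                                 # only atoms left and no axiom instance
```

Because each sentential rule is invertible, it is enough to decompose the first compound formula found; no backtracking over the choice of rule is needed in this fragment.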
Example 4.8 We give a sequent proof and a parallel tableau proof for the formula Lφ ∧ Lψ ⊃ L(φ ∧ ψ).

The sequent proof
(S.i) φ, ψ ⇒ φ;
(S.ii) φ, ψ ⇒ ψ: axiom instances;
(S.iii) φ ∧ ψ ⇒ φ;
(S.iv) φ ∧ ψ ⇒ ψ;
(S.v) Lφ ∧ ψ ⇒ φ;
(S.vi) Lφ ∧ ψ ⇒ ψ;
(S.vii) Lφ ∧ Lψ ⇒ φ ∧ ψ;
(S.viii) Lφ ∧ Lψ ⇒ L(φ ∧ ψ);
(S.ix) ⇒ Lφ ∧ Lψ ⊃ L(φ ∧ ψ).

The tableau proof
(T.i) F ((Lφ ∧ Lψ) ⊃ L(φ ∧ ψ));
(T.ii) T Lφ ∧ Lψ;
(T.iii) F L(φ ∧ ψ);
(T.iv) T Lφ, T Lψ; F φ ∧ ψ;
(T.v) T Lφ, T Lψ; F φ; (T.v') T Lφ, T Lψ, F ψ: branching;
(T.vi) T φ, T ψ, F φ; (T.vi') T φ, T ψ, F ψ;
(T.vii) X: left branch closes; X: right branch closes.
Please notice that the tableau closes with the same sets of formulae as in the axiom instances for the sequent proof: the instance (S.i) in the Smullyan correspondence is {T φ, T ψ, F φ} and the instance (S.ii) in that correspondence is {T φ, T ψ, F ψ}. This was the sequent set of rules for S4 modal logic. For logics K and T, modal rules apply different sets Γ∗ and Γ∗∗.
Definition 4.23 (Modal sequent rules for K and T) Modal sequent rules for K and T have the form of the modal rules of Definition 4.22, with the sets Γ∗ = {ν0 : ν ∈ Γ} and Γ∗∗ = {π0 : π ∈ Γ}.

Completeness of the modal sequent calculus follows from the completeness property of modal tableaux via the Smullyan correspondence: a sequent Γ ⇒ Δ is valid if and only if the tableau for the set T Γ, F Δ closes.
4.6 Meta-Theory of Modal Logics. Part II

Let us observe that the necessitation rule (N) allows for an extension of the formula (K) to the rule (RK):

Definition 4.24 (The rule (RK)) (RK): from γ1 ⊃ (γ2 ⊃ (. . . ⊃ (γk ⊃ φ))) infer Lγ1 ⊃ (Lγ2 ⊃ (. . . ⊃ (Lγk ⊃ Lφ))).

To prove the rule (RK), observe that for k = 1 it suffices to apply necessitation and then the scheme (K); the rest follows by induction on k. We denote by Σ = a1 a2 . . . ak the signature of the logic in question, where a ::= K | T | D | B | 4 | 5. The symbol Σ will also denote the logic with the signature Σ. In our case, Σ is a generic symbol for the signatures K, KT, KT4, KT45, KB. The symbol ⊢_Σ φ, equivalently, ∅ ⊢_Σ φ, denotes that φ has a proof in the logic Σ. There is a counterpart relative to sets of formulae, Γ ⊢_Σ.

Definition 4.25 (Deducibility) A formula φ is deducible from a set of formulae Γ within a logic with the signature Σ if there exist formulae γ1, γ2, . . . , γn ∈ Γ such that the formula ⋀_i γi ⊃ φ is provable in the logic Σ (we omit here some instances of axiom schemas).

Theorem 4.18 We list below some properties of the relations ⊢_Σ and Γ ⊢_Σ, which denote provability and provability from a set Γ.
(i) If ⊢_SL φ, then Γ ⊢_Σ φ. This is a symbolic rendering of the fact that SL is a subset of any modal logic;
(ii) If φ ∈ Γ, then Γ ⊢_Σ φ. This fact comes by our notion of a proof and the first step in it, or by the tautology p ⊃ p;
(iii) If Γ ⊢_Σ ψ and {ψ} ⊢ φ is a consequence of a tautology, then Γ ⊢_Σ φ. This is an instance of the tautology (p ⊃ q) ⊃ [(q ⊃ r) ⊃ (p ⊃ r)];
(iv) Γ ⊢_Σ ⊥ if and only if Γ ⊢_Σ φ ∧ ¬φ for any formula φ. By (iii) and the tautology ⊥ ≡ φ ∧ ¬φ;
(v) Γ ⊢_Σ (φ ⊃ ψ) if and only if Γ ∪ {φ} ⊢_Σ ψ. The one-way implication (to the right) follows by detachment. The reverse way is the deduction theorem, and the proof of it follows the lines of the proof for sentential logic;
(vi) If Γ ⊢_Σ φ, then Δ ⊢_Σ φ for some finite subset Δ of Γ. This is the compactness property of deduction and it follows from the finiteness of a proof;
(vii) Deducibility is monotone with respect to Σ and Γ, i.e., if Γ ⊢_Σ φ and either Σ′ extends Σ or Γ ⊆ Γ′, then Γ ⊢_Σ′ φ, respectively, Γ′ ⊢_Σ φ.

Definition 4.26 (Consistency) A set Γ is consistent with respect to Σ if and only if it is not true that Γ ⊢_Σ ⊥, where ⊥ means the unsatisfiable formula ('falsum'). The symbol Con_Σ(Γ) denotes the fact that the set Γ is consistent with respect to Σ. The negation is denoted noC_Σ(Γ). Let us observe that the tautology p ⊃ (q ⊃ p ∧ q), together with Theorem 4.18(iv), implies that a Σ-consistent Γ cannot contain both φ and ¬φ for any formula φ.

Definition 4.27 (Maximal consistency) A consistent set Γ is maximal consistent when there does not exist a consistent set Δ such that Γ ⊂ Δ. Each consistent set is contained in a maximal consistent set. This fact is known as the Lindenbaum Lemma. We repeat the proof here for the Reader's convenience.

Theorem 4.19 (The Lindenbaum Lemma) Each Σ-consistent set Γ of formulae extends to a maximal Σ-consistent set of formulae Γ+.

Proof We repeat the proof from Ch. 2. The set of all formulae of Σ is countable; hence, its formulae can be enumerated as φ0, φ1, . . . , φn, . . .. We define a sequence Γ0, Γ1, . . . , Γn, . . ., where (i) Γ0 is Γ; (ii) Γn+1 is Γn ∪ {φn} if this set is Σ-consistent, else Γn+1 is Γn. We set Γ+ to be the union ⋃_{n≥0} Γn. Clearly, Γ+ is Σ-consistent, as every finite subset of Γ+ is contained in some Γn, which is consistent by definition. For each formula φn, exactly one of φn, ¬φn is by construction a member of Γ+; hence, Γ+ is maximal consistent.

We denote a maximal consistent set Γ by the symbol MaxCon_Σ(Γ). Properties of consistent sets result from properties of deducibility. We list the main ones.

Theorem 4.20 The following are properties of consistency.
(i) If Con_Σ(Γ), then noC_Σ(Γ ∪ {φ ∧ ¬φ}). No contradiction can be an element of a consistent Γ, by Theorem 4.18(iv);
(ii) Con_Σ(Γ) if and only if there exists a formula φ not deducible from Γ. Otherwise, Γ ⊢_Σ ⊥. By (i);
(iii) Con_Σ is anti-monotone, i.e., if Con_Σ(Γ) and Δ ⊆ Γ, then Con_Σ(Δ);
(iv) Con_Σ(Γ) if and only if Con_Σ(Δ) for each finite subset Δ of Γ. This follows from the compactness property of deducibility;
(v) If MaxCon_Σ(Γ) and φ is a formula, then either φ ∈ Γ or ¬φ ∈ Γ. Indeed, assume that neither φ nor ¬φ is in Γ. Then, by maximality, Γ ∪ {φ} ⊢_Σ ⊥; hence, by the deduction theorem, Γ ⊢_Σ φ ⊃ ⊥ and, analogously, Γ ⊢_Σ ¬φ ⊃ ⊥. From the tautology (p ⊃ q) ⊃ ((¬p ⊃ q) ⊃ q) with (p/φ), (q/⊥) it follows that Γ ⊢_Σ ⊥, a contradiction;
(vi) If MaxCon_Σ(Γ) and Γ ⊢_Σ φ, then φ ∈ Γ. By property (v), as ¬φ cannot be in Γ;
(vii) If MaxCon_Σ(Γ), then φ ∧ ψ ∈ Γ if and only if ψ ∈ Γ and φ ∈ Γ. If φ ∧ ψ ∈ Γ, then by the tautology ψ ∧ φ ⊃ ψ and property (vi), ψ ∈ Γ; analogously, φ ∈ Γ. The converse: if ψ, φ ∈ Γ, then by Theorem 4.18(ii), Γ ⊢_Σ φ and Γ ⊢_Σ ψ. By the tautology p ⊃ (q ⊃ p ∧ q) with (p/φ) and (q/ψ), and by Theorem 4.18(iii), Γ ⊢_Σ φ ∧ ψ; hence, φ ∧ ψ ∈ Γ by (vi);
(viii) If MaxCon_Σ(Γ), then φ ∨ ψ ∈ Γ if and only if either φ ∈ Γ or ψ ∈ Γ. Indeed, if φ ∨ ψ ∈ Γ and neither φ ∈ Γ nor ψ ∈ Γ, then by property (v), ¬φ ∈ Γ and ¬ψ ∈ Γ; hence, ¬φ ∧ ¬ψ ∈ Γ, i.e., ¬(φ ∨ ψ) ∈ Γ, a contradiction;
(ix) If MaxCon_Σ(Γ), then (φ ⊃ ψ) ∈ Γ if and only if φ ∉ Γ or ψ ∈ Γ. By the tautology (φ ⊃ ψ) ≡ ¬φ ∨ ψ and property (viii). This condition is, by property (v), equivalent to the condition: if φ ∈ Γ then ψ ∈ Γ.

We consider modal logics containing the logic K, so in addition to the necessitation rule, also the axiomatic scheme (K) and the rule (RK) as well as all tautologies are present. We begin preparations for a proof of completeness for modal logics. The proof we are going to present exploits the idea in Henkin [11] of extending a consistent set to a maximal consistent one and using the extension as a model for its formulae. We record a consequence of the rule (RK).

Theorem 4.21 If Γ ⊢_Σ φ, then LΓ ⊢_Σ Lφ, where LΓ is the set {Lψ : ψ ∈ Γ}. Equivalently, if {ψ : Lψ ∈ Γ} ⊢_Σ φ, then Γ ⊢_Σ Lφ.

Proof Deducibility of φ from Γ is witnessed by a sequence ψ1, ψ2, . . . , ψk of formulae of Γ; hence, the formula ψ1 ⊃ (. . . ⊃ (ψk ⊃ φ)) is provable, and the rule (RK) implies that Lψ1 ⊃ (. . . ⊃ (Lψk ⊃ Lφ)) is provable, i.e., the sequence Lψ1, . . . , Lψk is a deducibility witness of Lφ from LΓ.

We state this new rule:

Definition 4.28 (Rule (LRK)) (LRK): from ψ1 ⊃ (. . . ⊃ (ψk ⊃ φ)) infer Lψ1 ⊃ (. . . ⊃ (Lψk ⊃ Lφ)).
The following fact (Chellas [12]) is of importance for our forthcoming discussion.

Theorem 4.22 For a MaxCon_Σ(Γ), if a formula φ belongs to each MaxCon_Σ(Δ) with the property that {ψ : Lψ ∈ Γ} ⊆ Δ, then Lφ ∈ Γ.

Proof Suppose that the assumption of the theorem is true. It follows that φ belongs to each maximal consistent set which contains the set {ψ : Lψ ∈ Γ}. We state a Claim.

Claim. For a set Γ of formulae and a formula φ, if φ ∈ Δ for each MaxCon_Σ(Δ) which contains Γ, then Γ ⊢_Σ φ, for any normal Σ.
Proof of Claim. Suppose that Γ ⊢_Σ φ is not true. Then Γ′ = Γ ∪ {¬φ} is consistent and this set extends to a MaxCon_Σ(Δ). Then, ¬φ ∈ Δ; hence, φ ∉ Δ. The Claim is proved.

We return to the proof of the theorem. It follows from the Claim that {ψ : Lψ ∈ Γ} ⊢_Σ φ. By (LRK), Γ ⊢_Σ Lφ, and maximality of Γ implies that Lφ ∈ Γ.

Theorem 4.22 has a dual.

Theorem 4.23 For a MaxCon_Σ(Γ) and a formula ψ, Mψ ∈ Γ if and only if there exists a MaxCon_Σ(Δ) with the property that {Mψ : ψ ∈ Δ} ⊆ Γ and ψ ∈ Δ.

Proof By duality between L and M. Suppose Mψ ∈ Γ, i.e., ¬L¬ψ ∈ Γ; hence, L¬ψ ∉ Γ. By Theorem 4.22, there exists a MaxCon_Σ(Δ) such that (i) {ψ : Lψ ∈ Γ} ⊆ Δ and (ii) ¬ψ ∉ Δ. For MaxCon_Σ(Γ) and MaxCon_Σ(Δ), the following are equivalent: (i) {ψ : Lψ ∈ Γ} ⊆ Δ; (ii) {Mψ : ψ ∈ Δ} ⊆ Γ. From (i) to (ii): suppose {ψ : Lψ ∈ Γ} ⊆ Δ and ψ ∈ Δ. Then ¬ψ ∉ Δ, i.e., L¬ψ ∉ Γ, which implies that ¬L¬ψ ∈ Γ, i.e., Mψ ∈ Γ. The converse from (ii) to (i) is proved on the same lines in the opposite direction. We return to the proof of the theorem. The condition (i), in virtue of this equivalence, can be rewritten as (iii) {Mψ : ψ ∈ Δ} ⊆ Γ. Moreover, ¬ψ ∉ Δ, i.e., ψ ∈ Δ. The converse implication is immediate: if ψ ∈ Δ and {Mψ : ψ ∈ Δ} ⊆ Γ, then Mψ ∈ Γ.
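The extension of a consistent set to a maximal one, used in the proof of the Claim and stated as the Lindenbaum Lemma (Theorem 4.19), can be illustrated on a finite fragment. The sketch below rests on loud assumptions: the countable enumeration is replaced by a finite list, and Σ-consistency is replaced by sentential satisfiability checked with truth tables, which is only a stand-in for the relation "not Γ ⊢_Σ ⊥". The names `consistent` and `lindenbaum` are hypothetical.

```python
from itertools import product

def atoms(f):
    """Atomic propositions occurring in f."""
    return {f} if isinstance(f, str) else set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    """Truth value of a sentential formula f under the valuation v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == 'not':
        return not value(f[1], v)
    if f[0] == 'and':
        return value(f[1], v) and value(f[2], v)
    if f[0] == 'or':
        return value(f[1], v) or value(f[2], v)
    raise ValueError(f"unknown connective {f[0]!r}")

def consistent(gamma):
    """Assumed stand-in for consistency: some valuation satisfies all of gamma."""
    names = sorted(set().union(*(atoms(f) for f in gamma)))
    return any(all(value(f, dict(zip(names, bits))) for f in gamma)
               for bits in product((False, True), repeat=len(names)))

def lindenbaum(gamma, enumeration):
    """Γ0 = gamma; Γn+1 = Γn ∪ {φn} if consistent, else Γn; returns the final set."""
    current = set(gamma)
    for phi in enumeration:
        if consistent(current | {phi}):
            current.add(phi)
    return current
```

For instance, starting from Γ = {p ∨ q} and the enumeration p, ¬p, q, ¬q, p ∧ q, ¬(p ∧ q), the construction keeps p, q and p ∧ q and rejects their negations.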
4.7 Model Existence Theorem and Strong Completeness

We continue with the assumption that our logics are normal. This concerns in particular the logics K, T, S4, S5. We apply maximal consistent sets in order to define models for modal logics.

Definition 4.29 (Strong completeness) A modal logic Σ is strongly complete with respect to a class of frames F if, for every set Γ of formulae and a formula φ, if Γ |=_F φ then Γ ⊢_Σ φ. A working paraphrase of this definition is given below.

Theorem 4.24 A modal logic Σ is strongly complete with respect to a class F of frames if and only if each Σ-consistent set Γ of formulae is true on some frame F ∈ F.

An argument justifying the paraphrase runs as follows: assume that Σ is not strongly complete, so by definition there are a set Γ and a formula φ such that Γ |=_F φ but not Γ ⊢_Σ φ; hence, the set Γ ∪ {¬φ} is consistent but it is not true on any frame in F. This argument appeared already in Chaps. 2 and 3.
This fact paves our way toward a proof of completeness: the idea is to build a frame such that each world w in it would have the property that w ⊢ φ if and only if w |= φ. Such a property is immanent to maximal consistent sets, and this determines their crucial role in proofs of completeness, as they provide worlds for canonical structures.

Definition 4.30 (Canonical frames and structures) A canonical frame for a normal modal logic Σ is a pair F_Σ = (W_Σ, R_Σ), where
(i) W_Σ = {Γ : MaxCon_Σ(Γ)};
(ii) R_Σ(Γ, Γ′) ≡ (Lφ ∈ Γ ⊃ φ ∈ Γ′) for each φ; equivalently, by duality, if φ ∈ Γ′ then Mφ ∈ Γ for each φ;
(iii) A canonical structure for a normal modal logic Σ is a pair M_Σ = (F_Σ, A_Σ), where A_Σ(p) = {Γ : p ∈ Γ}.
We denote worlds in canonical frames with the capital letter W, possibly endowed with apostrophes, stars, etc. The condition in Definition 4.30(ii) tells the meaning of the relation R_Σ: if R_Σ(W, W′), then formulae valid at W′ are possibly valid, i.e., plausible, at W. This condition admits its converse in a sense.

Theorem 4.25 In conditions of Definition 4.30, for any formula φ, if Mφ ∈ Γ, then there exists a MaxCon_Σ(Γ′) such that φ ∈ Γ′ and R_Σ(Γ, Γ′).

Proof Suppose that Mφ ∈ Γ. The set Δ = {ψ : Lψ ∈ Γ} ∪ {φ} is consistent: were it not, we would have formulae γ1, . . . , γn with Lγi ∈ Γ from which ¬φ is deducible, and by necessitation, (K), and maximality of Γ, we would have L¬φ ∈ Γ, which by duality would be equivalent to ¬Mφ ∈ Γ, a contradiction. By the Lindenbaum Lemma, the set Δ extends to a MaxCon_Σ(Γ′) which contains φ; as {ψ : Lψ ∈ Γ} ⊆ Γ′, we have R_Σ(Γ, Γ′).

The last step is to check that axioms of any normal modal logic Σ are valid in canonical models. This is true for atomic propositions by definition; also tautologies are valid, as they are valid in all models. For a formula ψ : Mφ, validity at a world Γ follows from Theorem 4.25.

Theorem 4.26 (Strong completeness theorem for the modal logic K) K is strongly complete with respect to the class of all frames.

Proof Let Γ be a K-consistent set of formulae.
Let MaxCon_K(Γ+) be an extension of Γ. We have Γ+ |= Γ. This argument extends to normal modal logics with signatures of the form KA, where A is a sequence of a_i's.

Theorem 4.27 Logic KT is strongly complete with respect to reflexive frames.

Proof We show that the canonical structure for KT is reflexive. For a canonical world MaxCon_KT(Γ), assume that φ ∈ Γ. As Γ contains the formula φ ⊃ Mφ, equivalent to the scheme Lφ ⊃ φ, detachment yields Mφ ∈ Γ, which implies that R_KT(Γ, Γ), i.e., that R_KT is reflexive.
Theorem 4.28 Logic KB is strongly complete with respect to symmetric frames.

Proof Let M_KB be a canonical structure for KB and let R_KB(Γ, Γ′). Consider a formula φ ∈ Γ. As the world Γ contains the formula (B): φ ⊃ LMφ, it contains by detachment the formula LMφ; hence, Mφ ∈ Γ′, i.e., R_KB(Γ′, Γ).

Theorem 4.29 Logic K4 is strongly complete with respect to the class of transitive frames.

Proof By the pattern of preceding proofs, it suffices to show that the canonical structure for K4 is transitive. Suppose that R_K4(Γ, Γ′) and R_K4(Γ′, Γ∗) hold. Suppose that φ ∈ Γ∗; hence, Mφ ∈ Γ′ and MMφ ∈ Γ. As Γ contains the formula MMφ ⊃ Mφ, detachment yields Mφ ∈ Γ, which witnesses that R_K4(Γ, Γ∗), i.e., transitivity of R_K4.

Corollary 4.3 Logic S4 is strongly complete with respect to the class of reflexive and transitive Kripke frames; logic S5 is complete with respect to the class of Kripke frames whose accessibility relations are equivalences. Indeed, logic S4 is KT4 and logic S5 is KT45, which contains B.

This construction of proofs of completeness goes back to Henkin [11].
4.8 Small Model Property, Decidability

The main result on this topic for sentential modal logics is the finite model property (Ladner [13]). Roughly speaking, it means that if a formula φ is true at some world w in a structure M, then it is true at some world in a finite structure. On this occasion, we will be introduced to a classical method in modal theory, the filtration.

Definition 4.31 (The idea of filtration) We first show this method and the result on finite structures in the simplest case: we consider the so-called basic modal logic, in which we consider formulae such as Mφ with only occurrences of the modal functor M. An additional simplification is that checking validity of the formula Mφ in a structure M at a world w requires finding a world v such that R(w, v) and M, v |= φ, which makes analysis of truth especially convenient. The method of filtration consists in identifying worlds having identical sets of true formulae. One more notion we will need is the notion of a sub-formula of a formula φ. We recall the definition.

Definition 4.32 (Sub-formulae) (i) if χ ∨ ψ, χ ∧ ψ, χ ⊃ ψ are sub-formulae of φ then, in each case, χ and ψ are sub-formulae of φ;
(ii) if ¬ψ is a sub-formula of φ, then ψ is a sub-formula of φ, and, if Mψ is a sub-formula of φ, then ψ is a sub-formula of φ; (iii) the formula φ itself is a sub-formula of φ.

This notion extends to sets of formulae. A set Γ of basic modal formulae is closed on sub-formulae if, for each formula in it, all its sub-formulae are in the set Γ. We denote the set of sub-formulae of a collection Γ of formulae by the symbol Sub(Γ). For instance, if Γ = {M(p ⊃ q) ⊃ (p ∧ q), (r ∨ s) ⊃ (Mr ∨ Ms)}, then the sub-formula-closed set is the closure C(Γ) of Γ, i.e., the set {M(p ⊃ q) ⊃ (p ∧ q), M(p ⊃ q), (p ∧ q), (p ⊃ q), p, q, (r ∨ s) ⊃ (Mr ∨ Ms), (r ∨ s), r, s, (Mr ∨ Ms), Mr, Ms}.

Definition 4.33 (Filtered structures) We can now present the idea of a filtration of a structure. Consider a structure M = (W, R, A). Filtration introduces an equivalence relation ≈ on worlds in M by letting w ≈ v if and only if M, w |= φ ≡ M, v |= φ for each formula φ in the sub-formula-closed set Γ. We denote by the symbol [w]≈ the equivalence class of the world w. The structure M filtered through the relation ≈ becomes the structure M≈ = (W≈, R≈, A≈), where
(i) W≈ = {[w]≈ : w ∈ W};
(ii) R≈([w]≈, [v]≈) if and only if there exist w1, v1 such that w1 ≈ w, v1 ≈ v and R(w1, v1);
(iii) one more requirement: if R≈([w]≈, [v]≈), then for each sub-formula of the form Mφ in Γ, if M, v |= φ then M, w |= Mφ;
(iv) A≈([w]≈, p) = 1 if and only if M, w |= p.
The comparison of the set of worlds W to the set of worlds W≈ shows that distinct worlds in the filtered model have distinct sets of formulae of Γ true in them; hence, the cardinality of the set W≈ is not greater than the number of subsets of Γ, which is 2^|Γ|. The next task is to prove that both structures, the structure M and the filtered structure M≈, satisfy the same formulae from the set Γ.
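The construction of Definition 4.33 can be carried out mechanically on a finite structure. The following is a minimal sketch under stated assumptions: formulae of the basic modal language are nested tuples, a valuation maps each atom to the set of worlds where it is true, R≈ is taken as in Definition 4.33(ii), and the names `sub`, `holds` and `filtrate` are hypothetical. The sample structure is a chain of six worlds chosen only for illustration.

```python
def sub(f):
    """Sub(f): the set of sub-formulae of f (Definition 4.32)."""
    if isinstance(f, str):
        return {f}
    return {f}.union(*(sub(g) for g in f[1:]))

def holds(f, w, R, val):
    """Truth of a basic modal formula f at world w."""
    if isinstance(f, str):
        return w in val[f]
    if f[0] == 'not':
        return not holds(f[1], w, R, val)
    if f[0] == 'imp':
        return (not holds(f[1], w, R, val)) or holds(f[2], w, R, val)
    if f[0] == 'M':
        return any(holds(f[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"unknown connective {f[0]!r}")

def filtrate(W, R, val, gamma):
    """Filter (W, R, val) through the sub-formula-closed set gamma."""
    # Worlds are equivalent when they satisfy the same formulae of gamma.
    sig = {w: frozenset(f for f in gamma if holds(f, w, R, val)) for w in W}
    cls = {w: frozenset(v for v in W if sig[v] == sig[w]) for w in W}
    Wf = set(cls.values())
    Rf = {(cls[u], cls[v]) for (u, v) in R}          # Definition 4.33(ii)
    valf = {p: {cls[w] for w in val[p]} for p in gamma if isinstance(p, str)}
    return Wf, Rf, valf, cls

# Sample structure: a chain 0 -> 1 -> ... -> 5 with p true at even worlds.
W = range(6)
R = {(i, i + 1) for i in range(5)}
val = {'p': {0, 2, 4}}
phi = ('imp', ('M', 'p'), ('M', ('M', 'p')))
gamma = sub(phi)
Wf, Rf, valf, cls = filtrate(W, R, val, gamma)
```

On this sample, the six worlds collapse to four classes, and each formula of Γ has the same truth value at w in M as at [w]≈ in the filtered structure, which is the content of the theorem that follows.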
Theorem 4.30 For each formula φ ∈ Γ, the following equivalence takes place, under the properties assumed in Definition 4.33(i)-(iv) of the filtered model: M, w |= φ if and only if M≈, [w]≈ |= φ.

Proof The proof goes by structural induction. The first step is when φ is an atomic proposition p. As, by Definition 4.33(iv), the valuations A and A≈ agree on p at w and [w]≈, the theorem holds for p. Hence, the theorem is true for sentential formulae and it remains to consider the case of a basic modal formula, say Mψ in Γ. Suppose first that M, w |= Mψ. There exists a world v in M with properties: (a) R(w, v); (b) M, v |= ψ. As R(w, v) holds, by Definition 4.33(ii), we have that R≈([w]≈, [v]≈). By the inductive assumption, as ψ ∈ Γ, we have that M≈, [v]≈ |= ψ and thus M≈, [w]≈ |= Mψ. The converse is proved along similar lines by use of the condition (iii): suppose that M≈, [w]≈ |= Mψ. There exists a world [v]≈ such that R≈([w]≈, [v]≈)
and M≈, [v]≈ |= ψ; again, by the inductive assumption, M, v |= ψ. By condition (iii), M, w |= Mψ.

Definition 4.34 (The size of a formula) As already pointed out, both structures satisfy the same formulae of Γ. In order to estimate the size of models, we define the size of a formula φ, denoted ||φ||, by the following rules: (i) ||p|| = 1 for each atomic proposition p; (ii) for formulae φ, ψ and a binary operator o ∈ {∨, ∧, ⊃}, ||φ o ψ|| = 1 + ||φ|| + ||ψ||; (iii) for a formula φ and a unary operator o ∈ {¬, L, M}, ||oφ|| = 1 + ||φ||. Recalling the definition of sub-formulae, we prove by structural induction that |Sub(φ)| ≤ ||φ||, where Sub(φ) is the set of sub-formulae of φ.

Theorem 4.31 (Decidability of basic modal logic) If a formula φ of basic modal logic is satisfiable, then it is satisfiable in a finite model. Therefore, basic modal logic is decidable.

All of this was under the proviso that we have a relation R≈ which satisfies conditions (ii) and (iii) of Definition 4.33. It is necessary now to construct this relation. There are three possible ways (Blackburn et al. [14]): in addition to Definition 4.33(ii), which defines the relation which we now denote as R≈_I, and to Definition 4.33(iii), which defines the relation now denoted R≈_II, we may consider the following condition (after Definition 4.33(iv)): (v) R≈_III([w]≈, [v]≈) if and only if for each modal basic formula Mψ ∈ Γ, if M, v |= ψ ∨ Mψ then M, w |= Mψ.

Theorem 4.32 All three candidates for filtration accessibility relations satisfy conditions (ii), (iii) of Definition 4.33.

Proof Clearly, R≈_I satisfies condition (ii). For (iii), suppose that R≈_I([w]≈, [v]≈) and suppose that M, v |= ψ. Then there exist w1 ≈ w and v1 ≈ v such that R(w1, v1) and M, v1 |= ψ; hence, M, w1 |= Mψ and thus M, w |= Mψ. Proofs for R≈_II and R≈_III go along the same lines.
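The size ||φ|| of Definition 4.34 and the bound |Sub(φ)| ≤ ||φ|| are easy to compute for a concrete formula. The sketch below assumes formulae encoded as nested tuples; the function names `size` and `sub` are illustrative.

```python
def size(f):
    """||f||: 1 for an atom, 1 plus the sizes of the arguments otherwise."""
    return 1 if isinstance(f, str) else 1 + sum(size(g) for g in f[1:])

def sub(f):
    """Sub(f): the set of sub-formulae of f."""
    if isinstance(f, str):
        return {f}
    return {f}.union(*(sub(g) for g in f[1:]))

# Sample formula: Mp ⊃ (p ∨ Mp)
phi = ('imp', ('M', 'p'), ('or', 'p', ('M', 'p')))
```

Here ||φ|| counts every occurrence of a symbol, while Sub(φ) counts repeated sub-formulae such as Mp and p only once, which is why the inequality may be strict.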
It is in addition obvious that R≈_I preserves reflexivity and symmetry, while R≈_II preserves reflexivity of the relation R. R≈_III preserves transitivity; in consequence, we obtain
Theorem 4.33 (i) Any formula of the form p ⊃ M p (logic T) has a finite reflexive satisfying structure; (ii) Any formula of the form M¬M¬ p ⊃ p (logic B) has a finite symmetric satisfying structure;
(iii) Any formula of the form M M p ⊃ M p (logic 4) has a finite transitive satisfying structure.
Theorem 4.34 Logics K, T, S4, B are decidable.
It remains to discuss the case of S5. Our notion here is that of bisimulation, which constitutes the next step in factorization of a modal structure.
Definition 4.35 (Bisimulation) We address here the technique of bisimulation in the case of S5, i.e., the accessibility relation in this case is an equivalence ∼. We consider mainly pointed models of the form M, w where M = (W, ∼, A) and w ∈ W. The technique we are going to introduce has some affinities with filtration, viz., the first step in reducing a structure M, w is to restrict it to the equivalence class of w, i.e., the restricted structure is now M∼ = ([w]∼, R|[w]∼, V|[w]∼). Clearly, for a formula φ of S5 and for w, w′ ∈ [w]∼: (i) M, w |= φ if and only if M∼, w |= φ; (ii) M, w′ |= φ if and only if M∼, w′ |= φ. A bisimulation between two structures M and M′ is a relation Λ ⊆ W × W′ which satisfies the conditions:
(i) Λ is right-closed: for all w, w′ ∈ W, for each v ∈ W′, if Λ(w, v) and R(w, w′), then there exists v′ ∈ W′ such that Λ(w′, v′) and R′(v, v′);
(ii) Λ is left-closed: for all v, v′ ∈ W′, for each w ∈ W, if (*) Λ(w, v) and R′(v, v′), then there exists w′ ∈ W such that Λ(w′, v′) and R(w, w′);
(iii) if Λ(w, v), then A(w) = A′(v), where A(w) = {p : w |= p}.
For a reduced structure M∼ = ([w]∼, V∼ = V|[w]∼) (no need to mention ∼ explicitly), we carry the reduction further by grouping together worlds of [w]∼ that have the same values of A∼: w′ ≡ w* if and only if A∼(w′) = A∼(w*). We denote classes of ≡ by the symbol [.]≡. Now, we establish a bisimulation between the reduced structure and its reduction by ≡ (the latter is also called a simple model in Halpern-Moses [15]). The following theorem follows immediately from definitions.
Theorem 4.35 For each reduced structure M, there exists a simple model M′ and a bisimulation Λ.
Moreover, as the simple model is constructed within the reduced model, the relation Λ has as its domain the world-set of M and as its range the world-set of M′.
Proof We indicate the construction of M′ = (W′, A∼) and that of Λ. Worlds in W′ are classes of ≡, and all worlds of W′ are in the relation R′. For a world w of M and a world w′ of M′, we define Λ(w, w′) if and only if w belongs to the class w′. Properties of right- and left-closedness follow straightforwardly, hence Λ is a bisimulation.
Now, we begin with a simple model M and a world w of it, and, for a formula φ, we construct a simple model M* with the world-set W*: (i) w ∈ W*; (ii) for each sub-formula of φ of the form γ: Lψ, add to W* a world wγ such that ¬ψ is valid at wγ, in case Lψ is not valid at (M, w).
Theorem 4.36 For each sub-formula ψ of φ and each v ∈ W*, M, v |= ψ if and only if M*, v |= ψ.
Proof It suffices to check the claim in the case of a sub-formula of the form Lψ. The proof is by structural induction. Suppose that M, v |= Lψ but M*, v does not satisfy Lψ. There is, by (ii), z ∈ W* such that ¬ψ is valid at z. By the inductive assumption, M, z does not satisfy ψ, hence M, v does not satisfy Lψ, a contradiction. The proof of the converse is straightforward.
Theorem 4.37 Logic S5 is decidable: to check validity of φ it suffices to check a finite set of worlds in the simple model.
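The quotient construction behind Theorem 4.35 can be sketched in a few lines. The sketch assumes a reduced S5 structure given as a set of worlds with a dictionary of the atoms true at each world; the function names `simple_model` and `is_bisimulation` and the example data are hypothetical. It collapses worlds with equal atom sets, defines Λ(w, c) as membership of w in the class c, and checks the three conditions of Definition 4.35 directly.

```python
def simple_model(worlds, atoms_at):
    """Collapse worlds with equal atom sets; return (W2, A2, Lam)."""
    by_atoms = {}
    for w in worlds:
        by_atoms.setdefault(frozenset(atoms_at[w]), set()).add(w)
    W2 = {frozenset(c) for c in by_atoms.values()}        # classes of the relation
    A2 = {frozenset(c): a for a, c in by_atoms.items()}   # atoms true in a class
    Lam = {(w, c) for c in W2 for w in c}                 # Lam(w, c) iff w in c
    return W2, A2, Lam


def is_bisimulation(W, R, A, W2, R2, A2, Lam):
    """Check conditions (i)-(iii) of the bisimulation definition."""
    for (w, v) in Lam:
        if frozenset(A[w]) != A2[v]:                      # condition (iii)
            return False
        for w1 in W:                                      # right-closed
            if (w, w1) in R and not any(
                    (w1, v1) in Lam and (v, v1) in R2 for v1 in W2):
                return False
        for v1 in W2:                                     # left-closed
            if (v, v1) in R2 and not any(
                    (w1, v1) in Lam and (w, w1) in R for w1 in W):
                return False
    return True


# Hypothetical reduced structure: four mutually accessible worlds.
W = {0, 1, 2, 3}
A = {0: {"p"}, 1: {"p"}, 2: {"q"}, 3: set()}
R = {(u, v) for u in W for v in W}
W2, A2, Lam = simple_model(W, A)
R2 = {(c, d) for c in W2 for d in W2}
print(len(W2), is_bisimulation(W, R, A, W2, R2, A2, Lam))
```

Worlds 0 and 1 carry the same atoms and are merged, so the simple model has three worlds; the domain of Λ is all of W and its range is all of W2, as stated after Theorem 4.35.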
4.9 Satisfiability, Complexity
Theorem 4.38 (Ladner [13]) The problem SAT(S5) of satisfiability for logic S5 is NP-complete.
Proof As SL is contained in S5, by the Cook Theorem 1.53, SAT(S5) is NP-hard. To prove its NP-completeness, one should provide an NP procedure for determining satisfiability. Suppose φ is a formula of S5 and P(φ) is the set of atomic propositions in φ. Guess a model M = (W, R, A) with |W| ≤ ||φ|| and guess valuations V(p, w) for p ∈ P(φ) and w ∈ W. Please note that we restrict ourselves to simple models, so the relation R is immaterial (all worlds are connected via R). For p ∉ P(φ) let by default V(p, w) = 1. This guessing takes O(||φ||²) steps because both the number of worlds and the number of atomic propositions in P(φ) are bounded by ||φ||. Next, we check whether φ is satisfied at some world in W. We describe the labelling procedure common to many problems (e.g., in model checking). We list all sub-formulae of φ in the order of increasing length. For each sub-formula ψ and a world w, we label w either with ψ or ¬ψ depending on which is valid at w. For Lψ we check each world for validity of ψ, and for Mψ we check worlds to find whether some of them validates ψ. Complexity of labelling is O(||φ||²). φ is satisfiable if there is a guess showing φ valid at some world. As proved in Ladner [13], SAT(K), SAT(T), SAT(S4) are PSPACE-complete.
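The labelling procedure from the proof can be sketched as follows. This is an illustration under assumptions, not the procedure of Ladner [13]: formulae are nested tuples, a simple model is a list of distinct valuations, and the nondeterministic guess is replaced by a deterministic enumeration of all simple models with at most ||φ|| worlds, which takes exponential time. The names `size`, `label` and `s5_satisfiable` are hypothetical.

```python
from itertools import combinations, product


def size(f):
    """||f|| of Definition 4.34."""
    return 1 if f[0] == "p" else 1 + sum(size(g) for g in f[1:])


def subformulas(f):
    """Sub(f): f together with the sub-formulae of its arguments."""
    out = {f}
    for g in f[1:]:
        if isinstance(g, tuple):
            out |= subformulas(g)
    return out


def label(worlds, phi):
    """Label every world of a simple S5 model with the truth value of each
    sub-formula of phi, shorter sub-formulae first. `worlds` is a list of
    frozensets of atoms; every world is accessible from every world."""
    lab = {}
    n = range(len(worlds))
    for f in sorted(subformulas(phi), key=size):
        for i in n:
            op = f[0]
            if op == "p":
                val = f[1] in worlds[i]
            elif op == "not":
                val = not lab[i, f[1]]
            elif op == "and":
                val = lab[i, f[1]] and lab[i, f[2]]
            elif op == "or":
                val = lab[i, f[1]] or lab[i, f[2]]
            elif op == "imp":
                val = (not lab[i, f[1]]) or lab[i, f[2]]
            elif op == "L":
                val = all(lab[j, f[1]] for j in n)
            elif op == "M":
                val = any(lab[j, f[1]] for j in n)
            else:
                raise ValueError(op)
            lab[i, f] = val
    return lab


def s5_satisfiable(phi):
    """Deterministic stand-in for the guess: try every simple model with at
    most ||phi|| worlds (distinct valuations of the atoms of phi)."""
    atoms = sorted({f[1] for f in subformulas(phi) if f[0] == "p"})
    valuations = [frozenset(a for a, bit in zip(atoms, bits) if bit)
                  for bits in product([0, 1], repeat=len(atoms))]
    for k in range(1, min(size(phi), len(valuations)) + 1):
        for worlds in combinations(valuations, k):
            lab = label(list(worlds), phi)
            if any(lab[i, phi] for i in range(k)):
                return True
    return False


p = ("p", "p")
print(s5_satisfiable(("and", ("M", p), ("M", ("not", p)))))
print(s5_satisfiable(("and", ("M", p), ("L", ("not", p)))))
```

Sorting by size guarantees that the labels of all proper sub-formulae are available when a formula is labelled, which is the point of listing sub-formulae in the order of increasing length.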
4.10 Quantified Modal Logics (QML)
There is a huge bulk of literature on the so-called de re and de dicto readings of a sentence. Various interpretations abound (Nelson [16]).
Example 4.9 We consider two sentences: (i) Sally believes that some people are good;
(ii) Some people are such that Sally believes that they are good. On the surface, both sentences carry the same message: Sally thinks that some people are good. However, in the (i)-sentence, we have a modal operator ‘believes’ (which we may interpret as ‘it is possible for Sally’) which prefixes the statement ‘some people are good’, which is existentially quantified, while in the (ii)-sentence the quantified phrase ‘some people are such that’ precedes the modal phrase ‘Sally believes that ...’. In the case of the (i)-sentence, the modal attitude depends on the reading of the dictum, i.e., what was said to be believed. This form is the de dicto reading. In the case of the (ii)-sentence, the case is brought forth first and presented to be believed or not; it is res/re and the reading is de re.
Definition 4.36 (The Barcan formulae) Sentences (i) and (ii) give rise to the implication (i) ⊃ (ii) which, converted into logical form, yields the formula (1) M ∃x good(x) ⊃ ∃x M good(x); the dual form, up to the sign of good(x), is (2) ∀x L ¬good(x) ⊃ L ∀x ¬good(x). General forms of (1) and (2), with a generic φ replacing good(x), respectively, ¬good(x), are the Barcan formulae (Marcus [17]): (i) M∃x φ ⊃ ∃x Mφ; the dual form is (ii) ∀x Lφ ⊃ L∀xφ.
Definition 4.37 (The converse Barcan formulae) In these formulae the antecedent and the consequent of the Barcan formulae change places: (3) ∃x Mφ ⊃ M∃xφ; (4) L∀xφ ⊃ ∀x Lφ. Formulae (3) and (4) are converse Barcan formulae; clearly, they are dual to each other.
The question now is to introduce proper syntax and semantics for interpreting such formulae. Interpretations may vary, bordering on the intensional approach, which would demand that each possible world has its own domain of interpretation. We begin with the assumption that the domain D for the sentential part of the syntax is one and the same for all possible worlds.
Definition 4.38 (Syntax of quantified modal logic) We have to knit tightly two structures: the structure for the sentential modal logic and the structure for predicate logic. A possible way of building a structure which could accommodate both modal sentential and predicate structures was shown in Kripke [18], and a discussion of relevant aspects of model choice can be found in (Fitting, Mendelson [19]).
There are two main kinds of models for quantified modal logic, which depend on the choice of the kind of domain assignment to possible worlds. One is called the constant domain model, and here a chosen domain D is assigned to each possible world; the other is called the variable domain model, in which a functional assignment gives distinct domains to distinct worlds. We will present quantified modal logic based on constant domain models: the variable domain model requires some technical changes in the exposition. We denote by the symbol D a fixed domain which is a non-empty set of beings. We impose a linear ordering (a_i), i = 1, 2, ..., on the domain D, calling it σ. We will keep a fixed σ without mentioning it explicitly. We add the usual components of the sentential modal logic and predicate logic in this new context:
(i) a set W of possible worlds, each w ∈ W a possible world;
(ii) an accessibility relation R ⊂ W × W;
(iii) a set P^n of countably many relation symbols for each arity n ≥ 1; P = ⋃_n P^n;
(iv) a countable set X of individual variables {x1, x2, ..., xn, ...};
(v) an interpretation I : W × P → 2^(⋃_n D^n); for each pair <w, Q>, where the arity of Q is n, I(w, Q) ⊂ D^n is a relation of arity n;
(vi) an assignment A : X → D; A(xi) is an element ai of the domain D;
(vii) symbols L, M of modal operators and ∀, ∃ of quantifiers, as well as auxiliary symbols.
A structure M for quantified modal logic is a quadruple <W, R, D, A>, the pair F = <W, R> is the frame of M, and D is the domain of the frame F. An atomic formula is an expression of the form Q(x1, x2, ..., xn) where Q is an n-ary relation symbol. Formulae are built in the usual way by means of sentential connectives and of generalization and necessitation rules, as we have witnessed in the case of Barcan formulae.
Definition 4.39 (The notion of satisfiability) Given a structure M, a world w ∈ W, an interpretation I and an assignment A, a formula φ is satisfied (true) at M, w, A, which is denoted M, w |=_{I,A} φ, when the following conditions are fulfilled for specific forms of φ:
(i) M, w |=_{I,A} Q(x1, x2, ..., xn) if and only if I(w, Q)(A(x1), A(x2), ..., A(xn)) holds;
(ii) M, w |=_{I,A} φ ∧ ψ if and only if M, w |=_{I,A} φ and M, w |=_{I,A} ψ;
(iii) M, w |=_{I,A} φ ∨ ψ if and only if M, w |=_{I,A} φ or M, w |=_{I,A} ψ;
(iv) M, w |=_{I,A} ¬φ if and only if it is not the case that M, w |=_{I,A} φ;
(v) M, w |=_{I,A} φ ⊃ ψ if and only if either M, w |=_{I,A} ¬φ or M, w |=_{I,A} ψ;
(vi) M, w |=_{I,A} Lφ if and only if M, w′ |=_{I,A} φ for each w′ such that R(w, w′);
(vii) M, w |=_{I,A} Mφ if and only if M, w′ |=_{I,A} φ for some w′ such that R(w, w′);
(viii) M, w |=_{I,A} ∀x.φ if and only if M, w |=_{I,A(x/a)} φ for each a ∈ D, where A(x/a) is the assignment A with a substituted for x;
(ix) M, w |=_{I,A} ∃x.φ if and only if M, w |=_{I,A(x/a)} φ for some a ∈ D.
Definition 4.40 (Validity) A formula φ is true in a structure M if and only if it is satisfied at every world in W, and φ is valid if it is true in every structure.
Example 4.10 The converse Barcan formula (3) of Definition 4.37 is valid. Suppose that: (i) M, w |=_{I,A} ∃x Mφ(x). Then: (ii) for some a ∈ D, M, w |=_{I,A(x/a)} Mφ(a). Then: (iii) for some world w′ ∈ W such that R(w, w′), it holds that M, w′ |=_{I,A(x/a)} φ(a). Now, we consider the consequent M∃x φ(x). Its satisfaction at w requires that: (iv) for some w″ ∈ W such that R(w, w″) and for some a′ ∈ D, M, w″ |=_{I,A(x/a′)} φ(a′). (v) Then, letting a′ be a and w″ be w′ makes the consequent satisfied, and the converse Barcan formula (3) is valid.
The essential assumption for the proof has been the constancy of the domain. A simple generalization is the condition of monotonicity of models; let D(w) stand for the domain at the world w; then, increasing monotonicity means that if R(w, w′), then D(w) ⊆ D(w′). Obviously, constant domain models satisfy this condition.
Example 4.11 The converse Barcan formula (3) is not valid when the monotonicity condition fails. Suppose that W = {w1, w2}, D(w1) \ D(w2) ≠ ∅, R(w1, w2) is the only instance of R, and for some c ∈ D(w1) \ D(w2), P(c) is valid in w2 while P(d) is valid in w2 for no d ∈ D(w2); then w1 |= ∃x M P(x) but w1 fails M∃x P(x).
We recall facts about monotonicity of structures related to Barcan and converse Barcan formulae. A structure is monotonically decreasing when R(w, w′) implies that D(w′) ⊆ D(w).
Theorem 4.39 A frame F is monotonically increasing if and only if the converse Barcan formula is true in each F-based structure.
Proof We can modify the proof in Example 4.10 to prove that in the monotonically increasing case the converse Barcan formula is valid, and Example 4.11 shows that when a structure is not monotonically increasing, then the converse Barcan formula is not true. This proves the theorem.
Let us observe that reversing the arrows of the relation R in the structure of Example 4.11 shows falsity of the Barcan formula M∃x P(x) ⊃ ∃x M P(x). In general, the converse Barcan formula p ⊃ q is valid in a monotonically increasing model if and only if the corresponding Barcan formula q ⊃ p is valid in the dual model obtained by reversing the arrows of the relation R. This implies that a frame F is monotonically decreasing if and only if the Barcan formula is valid in each F-based structure.
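The dependence of the Barcan and converse Barcan formulae on the domains can be checked mechanically on two-world structures. The sketch below is an illustration under assumptions: formulae are nested tuples, quantifiers at a world w range over D(w), predicates may be interpreted at a world by elements outside its domain (as in Example 4.11), and the evaluator `holds` and the three example structures are hypothetical names. In the structure with reversed arrows, P(c) is placed at w1, which is an assumption of this sketch rather than something stated in the text.

```python
def holds(model, w, f, env=None):
    """Satisfaction at world w; quantifiers range over the domain D[w]."""
    W, R, D, I = model
    env = env or {}
    op = f[0]
    if op == "atom":
        _, pred, args = f
        return tuple(env[x] for x in args) in I.get((w, pred), set())
    if op == "not":
        return not holds(model, w, f[1], env)
    if op == "imp":
        return (not holds(model, w, f[1], env)) or holds(model, w, f[2], env)
    if op == "M":
        return any(holds(model, v, f[1], env) for v in W if (w, v) in R)
    if op == "L":
        return all(holds(model, v, f[1], env) for v in W if (w, v) in R)
    if op == "ex":
        return any(holds(model, w, f[2], {**env, f[1]: a}) for a in D[w])
    if op == "all":
        return all(holds(model, w, f[2], {**env, f[1]: a}) for a in D[w])
    raise ValueError(op)


Px = ("atom", "P", ("x",))
barcan = ("imp", ("M", ("ex", "x", Px)), ("ex", "x", ("M", Px)))
converse = ("imp", ("ex", "x", ("M", Px)), ("M", ("ex", "x", Px)))

W = {"w1", "w2"}
# Example 4.11: c lies in D(w1) but not in D(w2), and P holds of c at w2.
shrinking = (W, {("w1", "w2")}, {"w1": {"c", "d"}, "w2": {"d"}},
             {("w2", "P"): {("c",)}})
# Same frame and interpretation, but with a constant domain.
constant = (W, {("w1", "w2")}, {"w1": {"c", "d"}, "w2": {"c", "d"}},
            {("w2", "P"): {("c",)}})
# Arrows reversed, with P(c) moved to w1 (an assumption of this sketch).
reversed_arrows = (W, {("w2", "w1")}, {"w1": {"c", "d"}, "w2": {"d"}},
                   {("w1", "P"): {("c",)}})

print(holds(shrinking, "w1", converse))
print(holds(constant, "w1", converse))
print(holds(reversed_arrows, "w2", barcan))
```

The first structure is not monotonically increasing and refutes the converse Barcan formula at w1; the constant domain structure satisfies it; the third structure is not monotonically decreasing and refutes the Barcan formula at w2.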
4.11 Natural Deduction: Tableaux for Quantified Modal Logics
Definition 4.41 (Tableaux in the case of constant domain) We assume the modal logic K, for which we recall the modal tableau rules of Definition 4.15. Each rule is written as premise / conclusion.
(i) T Lφ / T φ;
(ii) F Lφ / F φ;
(iii) T Mφ / T φ;
(iv) F Mφ / F φ.
We recall for the reader's convenience the rules for predicate logic (Chap. 3). We recall that ; stands for conjunction and , denotes disjunction.
(v) For type α:
T(p ∧ q) / T p; T q
F(p ∨ q) / F p; F q
F(p ⊃ q) / T p; F q
(vi) For type β:
F(p ∧ q) / F p, F q
T(p ∨ q) / T p, T q
T(p ⊃ q) / F p, T q
(vii) For negation:
T(¬p) / F p
F(¬p) / T p
(viii) For type γ:
T(∀x.φ) / T(φ(x/p))
F(∃x.φ) / F(φ(x/p))
(ix) For type δ:
T(∃x.φ) / T(φ(x/p))
F(∀x.φ) / F(φ(x/p))
In the rules in (ix), the parameter p must be new to the branch. Tableaux for quantified modal logic merge rules for both predicate and modal logics.
Example 4.12 We give the tableau proof for the Barcan formula ∀x Lφ(x) ⊃ L∀xφ(x). Parameters are additional variables, denoted p, q, r, ... in distinction to quantifiable variables x, y, z, ..., which enter tableaux and replace variables x, y, z, ... in steps in which the last quantifiers or modal operators are removed from a branch. The announced tableau follows.
(1.) F [∀x Lφ(x) ⊃ L∀x φ(x)];
(2.) T [∀x Lφ(x)]: from 1;
(3.) F [L∀xφ(x)]: from 1;
(4.) T Lφ(x): from 2;
(5.) F ∀xφ(x): from 3;
(6.) F φ(p): from 5, use of a new parameter p, as required by the rule in (ix);
(7.) T φ(p): from 4, use of the same parameter p, as the rule for T ∀ is universal;
(8.) X: by 6, 7, tableau closes.
This tableau is constructed for the constant domain models. Tableaux for variable domain models require more attention. Obviously they contain constant domain cases. We apply the device of prefixed tableaux Fitting [20] cf. Goré [21]. Prefixes
are finite sequences of natural numbers of the form σ.n.m....k, where σ names a world and σ.n etc. names a world accessible from that named σ. We reproduce the prefixed rules. A comma denotes disjunction of formulae and a semicolon denotes conjunction of formulae. Formulae separated by a comma at a node initiate forking of the branch into two extensions, one formula in each extension; semicolon-separated formulae both extend the current branch. With this notation, we state the prefixed tableau rules, each written as premise / conclusion.
Definition 4.42 (Prefixed sentential logic tableau rules)
(i) σ φ∧ψ / σ φ; σ ψ
(ii) σ φ∨ψ / σ φ, σ ψ
(iii) σ ¬(φ∧ψ) / σ ¬φ, σ ¬ψ
(iv) σ ¬(φ∨ψ) / σ ¬φ; σ ¬ψ
(v) σ φ⊃ψ / σ ¬φ, σ ψ
(vi) σ ¬(φ⊃ψ) / σ φ; σ ¬ψ
(vii) σ ¬¬φ / σ φ
(viii) σ φ≡ψ / σ φ⊃ψ; σ ψ⊃φ
Definition 4.43 (Prefixed predicate logic tableau rules)
(ix) σ ∀xφ(x) / σ φ(pσ)
(x) σ ¬∃xφ(x) / σ ¬φ(pσ)
(xi) σ ∃xφ(x) / σ φ(pσ)
(xii) σ ¬∀xφ(x) / σ ¬φ(pσ)
In these rules, pσ is a parameter associated with the world name σ; in rules (xi) and (xii), the parameter pσ must be new to the branch; this proviso does not apply to rules (ix) and (x).
Definition 4.44 (Prefixed modal tableau rules)
(xiii) σ Lφ(x) / σ.n φ(x)
(xiv) σ ¬Mφ(x) / σ.n ¬φ(x)
(xv) σ ¬Lφ(x) / σ.n ¬φ(x)
(xvi) σ Mφ(x) / σ.n φ(x)
In rules (xiii) and (xiv), the prefix σ.n should be applied if it is already on the branch, in rules (xv) and (xvi), the prefix σ.n should be new. Example 4.13 We give a proof tableau over K for the Barcan formula M∃xφ(x) ⊃ ∃x Mφ(x).
Each row lists the prefix, the signed formula, and the step number.
1 F [M∃xφ(x) ⊃ ∃x Mφ(x)] (1);
1 T M∃xφ(x) (2);
1 F ∃x Mφ(x) (3);
1.1 T ∃xφ(x) (4) from (2);
1.1 T φ(p1.1) (5) from (4);
1 F Mφ(x) (6) from (3);
1.1 F φ(p1.1) (7) from (6) by rule (xiv).
Tableau closes: X
Definition 4.45 (Hintikka family of sets for prefixed tableaux) We have defined Hintikka sets in previous cases; now, we define them for quantified modal logic. We assume the modal logic K. A Hintikka family of sets H consists of sets, denoted generically by the symbol H, such that each member of H satisfies the conditions:
(H0) for any formula φ(x), at most one of φ(x), ¬φ(x) is in H;
(H1) if σ¬¬φ ∈ H, then σφ ∈ H;
(H2) if σφ ∧ ψ ∈ H, then σφ ∈ H and σψ ∈ H;
(H3) if σφ ∨ ψ ∈ H, then σφ ∈ H or σψ ∈ H;
(H4) if σ Lφ ∈ H, then σ.n φ ∈ H for each name σ.n ∈ H; same for σ¬Mφ;
(H5) if σ Mφ ∈ H, then σ.n φ ∈ H for some name σ.n; same for σ¬Lφ;
(H6) if σ∀x.φ ∈ H, then σφ(pσ) ∈ H for each parameter pσ of σ; same for σ¬∃x.φ(x);
(H7) if σ∃x.φ(x) ∈ H, then σφ(pσ) ∈ H for some parameter pσ; same for σ¬∀x.φ(x).
Theorem 4.40 Each Hintikka set H in a Hintikka family H is satisfiable. Proof Consider a Hintikka set H . The canonical model for H has the set of possible worlds W defined as the set of prefixes of the formulae in H ; the accessibility relation R is the set of pairs < σ, σ.n > for each prefix σ ∈ W , the domain function maps each world σ ∈ W to the set of parameters pσ ∈ H . The interpretation I sends each pair < P, σ >, where P is an n-place relation symbol, into the set {< p1 , p2 , . . . , pn >: σ P( p1 , p2 , . . . , pn ) ∈ H }. It remains to define an assignment A on the model M =< W, R, I >: A( pσ ) = pσ for each parameter pσ . Now, we have to verify that M, σ |= A φ for each φ ∈ H . This is done by structural induction. By the very definition, each atomic formula in H is satisfied. Conditions (H1),(H2),(H3) settle sentential cases. Quantified formulae are satisfied by (H6) and (H7) and modal formulae are satisfied by (H4),(H5) by virtue of the model construction. Negated formulae are satisfied due to (H0) and (H1). Suppose now that a formula φ is valid in all variable domain K-models and there exists in the set T of all tableaux for Fφ a tableau T which is not closed. Hence, there exists in T an open branch which is a Hintikka set, hence, satisfiable, which shows that φ is not valid, a contradiction. This leads to completeness theorem for tableaux.
Theorem 4.41 (Tableau completeness for quantified modal logic K) If a formula φ is true in all variable domain K-models, then each tableau for φ is closed, hence, φ is provable.
4.12 Sentential Intuitionistic Logic (SIL)
SIL arose from intuitionism, a paradigm in the foundations of mathematics and logic initiated by Luitzen E.J. Brouwer, for whom mathematics was a creation of the mind, truth meant provability, and existence meant effective construction. In particular, the law of excluded middle p ∨ ¬p, accepted in classical logic and mathematics, was regarded as unacceptable from the intuitionistic point of view, as there are statements such that neither they nor their negations have been given any proof. In a similar vein, a part of mathematics could not be accepted by intuitionists. Regardless of that, logic embraced the intuitionistic ideas exposed in Heyting [22] when Gödel [23] demonstrated that intuitionistic logic can be interpreted in models for modal logic S4. In the Problems section we outline the mapping of S4 into SIL due to Schütte [24]. Kripke [25] proposed classical models for intuitionistic logic, so we may continue the topic of that logic within modal logics. In accordance with Gödel's insight and Kripke's model, we adopt the S4 frame. Hence, the accessibility relation R is reflexive and transitive. We assume the sentential syntax, with a countable set of atomic propositions p, q, r, ..., connectives ∨, ∧, ¬, ⊃, and some auxiliary symbols like parentheses and punctuation marks. Well-formed formulae as well as sub-formulae and their sizes are defined in an already familiar manner.
Definition 4.46 (Sentential intuitionistic frame) It is a frame F = (W, R) where W is a set of worlds and R is an S4-accessibility relation, i.e., it is reflexive and transitive. We recall our notion of an F-neighborhood of a world: N_F(w) = {w′ : R(w, w′)}.
Definition 4.47 (Intuitionistic sentential model. Satisfiability) We define the relation |= for a sentential intuitionistic pointed structure (W, R, A, w), denoted for short as (M, w) with M = (W, R, A); as the intuitionistic meaning differs from the classical one, we adopt the name ‘accepts’ for the symbol |=. The pair F = (W, R) is, as before, a frame.
(i) for each formula φ: if M, w |= φ, then M, w′ |= φ for each w′ ∈ N_F(w);
(ii) M, w |= ⊤;
(iii) M, w |= φ ∧ ψ if and only if M, w |= φ and M, w |= ψ;
(iv) M, w |= φ ∨ ψ if and only if M, w |= φ or M, w |= ψ;
(v) M, w |= ¬φ if and only if for each w′ ∈ N_F(w), it is not the case that M, w′ |= φ;
(vi) M, w |= φ ⊃ ψ if and only if for each w′ ∈ N_F(w), if M, w′ |= φ, then M, w′ |= ψ.
It follows from (ii) and (v) that w |= ⊥ for no world w. Please take heed of the fact that truth of a formula at a world is coupled with its truth at all accessible worlds.
Definition 4.48 (Validity) A formula φ is true in a frame (W, R) if and only if M, w |= φ for each w ∈ W. A formula φ is valid if and only if φ is true in each sentential intuitionistic frame.
The status of signed formulae needs a comment and a definition of their relations to negation. By Definition 4.47(v), asserting the negation of a formula at w needs a verification that the formula is accepted by none of the neighbors w′ of w, which would mean a disproof of the formula. On the other hand, w |= Fφ means that no proof of φ is possible from w. These statements are somewhat vague, so let us include formulae which point to differences between ¬ and F, where w′ ∈ N_F(w): (i) if M, w |= ¬φ, then M, w′ |= ¬φ, by Definition 4.47(i) (see a proof after Theorem 4.42); (ii) if M, w′ |= Fφ, then M, w |= Fφ.
It turns out that intuitionistic tautologies are sentential tautologies. Indeed, this is shown by a simple argument to the contrary. Suppose that φ is not a sentential tautology, so A(φ) = 0 for some assignment A; consider the simplest one-world model (W = {w}, R(w, w)) with w |= p if and only if A(p) = 1. Then, for each formula ψ, A(ψ) = 1 if and only if w |= ψ, hence it is not the case that w |= φ. A simple example of W = {w1, w2} with instances of accessibility R(w1, w2), R(w1, w1) and R(w2, w2), and with p accepted at w2 only, shows that w1 |= p ∨ ¬p does not hold, hence p ∨ ¬p is not an intuitionistic tautology. It follows that intuitionistic sentential tautologies are a proper subset of sentential tautologies. Properties of intuitionistic satisfiability are distinct from those for previously studied logics.
This is evident in the treatment of negation in (v) above and the treatment of implication in (vi) above: in order to assert the truth of a negated or implicational formula at a world w, we need to check the truth or falsity of a formula at each world in the neighborhood N_F(w). For a set of formulae Γ, we introduce the notation Γ_T = {Tφ : Tφ ∈ Γ} and Γ_F = {Fφ : Fφ ∈ Γ}. Yet another difference between classical and intuitionistic sentential logics is that in the intuitionistic case each connective is independent of all others (Wajsberg [26]). One can construct simple examples showing this (see, e.g., Fitting [9]). However, negation and implication are related in a way similar to previous cases. The truth is uniform over neighborhoods of worlds.
Theorem 4.42 The following are true in intuitionistic sentential logic: (i) if M, w |= φ, then M, w′ |= φ for each world w′ ∈ N_F(w) and each formula φ; (ii) if M, w |= Γ, then M, w′ |= Γ_T for each world w′ ∈ N_F(w); (iii) if M, w′ |= Γ, then M, w |= Γ_F, where w′ ∈ N_F(w).
Proof Statement (i) is secured by the condition in Definition 4.47(i). For (ii), suppose that M, w |= Γ; as Γ_T ⊂ Γ, it follows that M, w |= Γ_T and by Definition 4.47(i),
M, w′ |= Γ_T. For (iii), it follows by Definition 4.47(v) by an argument similar to that of (ii), with T replaced by F as the subscript of Γ.
A comment may be advisable now concerning Definition 4.47(i). We have bypassed here the standard way of requiring (i) for atomic propositions and extending it to formulae, as in our (i), by structural induction. While it needs no comment for disjunction and conjunction, the case of negation may be in need of some explanation. Let M, w |= ¬φ; then, by Definition 4.47(v), each w′ ∈ N_F(w) satisfies Fφ; by transitivity, N_F(w′) ⊆ N_F(w), hence no world w″ ∈ N_F(w′) satisfies φ, and this, by Definition 4.47(v), witnesses that w′ |= ¬φ.
Fitting [7] gives a useful interpretation of the intuitionistic scheme of Definition 4.47. One may regard each world w as a repository of current knowledge and R(w, w′) as the expectation that the world w′, visited in the future, may bring some additional knowledge. In this sense, intuitionistic logic contains some elements of temporal aspects as well as epistemic aspects. For instance, w |= ¬φ witnesses that the world w imposes ¬φ if no other knowledge gathered in the future can impose φ. This is certainly motivated by temporal aspects of mathematical science, in which some problems have been decided after centuries.
In Definition 4.47, a distinction is observable in the behavior of connectives with respect to satisfaction: formulae of type α in the signed form, i.e., T(φ ∧ ψ), F(φ ∨ ψ), F(φ ⊃ ψ), F(¬φ), and formulae of type β, i.e., T(φ ∨ ψ), F(φ ∧ ψ), T(φ ⊃ ψ), T(¬φ), whose main connective is either ∨ or ∧, behave differently from formulae of types α and β whose main connective is either ¬ or ⊃. We call the former case ‘positive’ and the latter case ‘negative’. For a formula φ, formulae φ1 and φ2 are immediate sub-formulae, which in the case of ¬ are identical: for F¬φ they are Tφ, Tφ and in the case of T¬φ they are Fφ, Fφ. From Definition 4.47, the following code of behavior can be extracted.
Definition 4.49 (Code of behavior) The rules are:
(i) for a positive formula φ of type α and each w ∈ W, w |= φ ≡ w |= φ1 and w |= φ2;
(ii) for a positive formula φ of type β and each w ∈ W, w |= φ ≡ w |= φ1 or w |= φ2;
(iii) for a negative formula φ of type α and each w ∈ W, w |= φ if and only if w′ |= φ1 and w′ |= φ2 for some w′ ∈ N_F(w);
(iv) for a negative formula φ of type β and each w ∈ W, w |= φ if and only if w′ |= φ1 or w′ |= φ2 for each w′ ∈ N_F(w).
We also add conditions which follow from an analysis in Beth [27] and in Fitting [9]:
(v) for each atomic proposition p and each w ∈ W, if w |= T p, then w′ |= T p for each w′ ∈ N_F(w); moreover, w |= ⊤, it is not the case that w |= ⊥, and exactly one of the following holds true: w |= T p, w |= F p.
Now, we proceed to tableaux.
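Before moving on, the acceptance relation can be illustrated by a small evaluator. This is a sketch under assumptions: formulae are nested tuples, a model is a triple (W, R, V) with R reflexive and transitive and V upward closed along R, and the function name `accepts` is hypothetical. It implements the clauses for ∧, ∨, ¬ and ⊃ of Definition 4.47 and evaluates the two-world example with accessibility R(w1, w1), R(w1, w2), R(w2, w2) and p accepted at w2 only.

```python
def accepts(model, w, f):
    """Intuitionistic acceptance M, w |= f for the connectives and, or, not, imp.
    R must be reflexive and transitive; V must be upward closed along R."""
    W, R, V = model
    later = [v for v in W if (w, v) in R]          # the neighborhood N_F(w)
    op = f[0]
    if op == "p":
        return w in V.get(f[1], set())
    if op == "and":
        return accepts(model, w, f[1]) and accepts(model, w, f[2])
    if op == "or":
        return accepts(model, w, f[1]) or accepts(model, w, f[2])
    if op == "not":
        return all(not accepts(model, v, f[1]) for v in later)
    if op == "imp":
        return all((not accepts(model, v, f[1])) or accepts(model, v, f[2])
                   for v in later)
    raise ValueError(op)


p = ("p", "p")
M2 = ({"w1", "w2"},
      {("w1", "w1"), ("w1", "w2"), ("w2", "w2")},
      {"p": {"w2"}})
excluded_middle = ("or", p, ("not", p))
double_neg = ("imp", ("not", ("not", p)), p)

print(accepts(M2, "w1", excluded_middle))
print(accepts(M2, "w2", excluded_middle))

# Acceptance is inherited along R for every formula tried here.
formulas = [p, ("not", p), ("not", ("not", p)), excluded_middle, double_neg]
monotone = all(accepts(M2, v, f) or not accepts(M2, u, f)
               for (u, v) in M2[1] for f in formulas)
print(monotone)
```

At w1 neither p nor ¬p is accepted, since p is not accepted at w1 but is accepted at the accessible world w2; the same model also fails to accept ¬¬p ⊃ p at w1.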
4.13 Natural Deduction: Tableaux for SIL
The system we propose to discuss comes from Beth [27], with modifications in Fitting [7] which introduce signed formulae and the standard, in Smullyan's sense, usage of trees.
Definition 4.50 (Tableau rules) We state the rules for intuitionistic sentential tableaux. We use the symbol Γ to denote the set of formulae that are non-active in a rule. Each rule is written as premise / conclusion.
(∧)
(T∧) Γ, T(φ ∧ ψ) / Γ, Tφ; Tψ
(F∧) Γ, F(φ ∧ ψ) / (Γ, Fφ), (Γ, Fψ)
(∨)
(T∨) Γ, T(φ ∨ ψ) / (Γ, Tφ), (Γ, Tψ)
(F∨) Γ, F(φ ∨ ψ) / Γ, Fφ; Fψ
(¬)
(T¬) Γ, T¬φ / Γ, Fφ
(F¬) Γ, F¬φ / Γ_T, Tφ
(⊃)
(T⊃) Γ, T(φ ⊃ ψ) / (Γ, Fφ), (Γ, Tψ)
(F⊃) Γ, F(φ ⊃ ψ) / Γ_T, (Tφ; Fψ)
We recall that Γ_T is defined as {Tφ : Tφ ∈ Γ}.
Example 4.14 A one-way De Morgan law (p ∨ q) ⊃ ¬(¬p ∧ ¬q) has a tableau proof.
1 F [(p ∨ q) ⊃ ¬(¬p ∧ ¬q)];
2 T (p ∨ q), F ¬(¬p ∧ ¬q);
3 T (p ∨ q), T (¬p ∧ ¬q);
4 T (p ∨ q), T ¬p, T ¬q;
5 T (p ∨ q), F p, T ¬q;
6 T (p ∨ q), F p, F q;
7 left branch: T p, F p, F q X: branch closes;
8 right branch: T q, F p, F q X: branch closes.
We now consider the converse formula ¬(¬p ∧ ¬q) ⊃ (p ∨ q). An attempted tableau is the following.
1 F [¬(¬p ∧ ¬q) ⊃ (p ∨ q)];
2 T ¬(¬p ∧ ¬q), F (p ∨ q);
3 F (¬p ∧ ¬q), F (p ∨ q);
4 F (¬p ∧ ¬q), F p, F q;
5 left branch: F ¬p, F p, F q;
6 left branch continues by rule (F¬), which passes on only the T-signed formulae Γ_T: T p; the formulae F p, F q are dropped and the branch stays open;
7 right branch: F ¬q, F p, F q;
8 right branch continues by rule (F¬): T q; the branch stays open.
This tableau does not close. Indeed, the converse law is not intuitionistically valid: in the model with W = {w1, w2}, R(w1, w1), R(w1, w2), R(w2, w2) and p accepted at w2 only, the world w1 accepts ¬(¬p ∧ ¬q) but does not accept p ∨ q.
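The rules of Definition 4.50 can be turned into a small search procedure. The sketch below is an illustration under assumptions and not the system of Beth [27] or Fitting [7]: it reads the rules literally, so the active formula is removed when a rule is applied, it represents a branch as a frozenset of signed formulae, and it tries every order of rule application. The names `expansions`, `closes` and `provable` are hypothetical.

```python
def expansions(S, sf):
    """Branches obtained by applying the rule for the signed formula sf in S,
    or None for a signed atom. The active formula is removed, and the rules
    for F-not and F-imp keep only the T-signed part of the rest."""
    sign, f = sf
    rest = S - {sf}
    rest_T = frozenset(x for x in rest if x[0] == "T")
    op = f[0]
    if op == "and":
        a, b = f[1], f[2]
        return ([rest | {("T", a), ("T", b)}] if sign == "T"
                else [rest | {("F", a)}, rest | {("F", b)}])
    if op == "or":
        a, b = f[1], f[2]
        return ([rest | {("T", a)}, rest | {("T", b)}] if sign == "T"
                else [rest | {("F", a), ("F", b)}])
    if op == "not":
        return ([rest | {("F", f[1])}] if sign == "T"
                else [rest_T | {("T", f[1])}])
    if op == "imp":
        a, b = f[1], f[2]
        return ([rest | {("F", a)}, rest | {("T", b)}] if sign == "T"
                else [rest_T | {("T", a), ("F", b)}])
    return None


def closes(S):
    """True if some order of rule applications closes every branch from S."""
    S = frozenset(S)
    if any(("T", f) in S and ("F", f) in S for (_, f) in S):
        return True
    for sf in S:
        branches = expansions(S, sf)
        if branches is not None and all(closes(b) for b in branches):
            return True
    return False


def provable(f):
    """A formula is provable if a tableau for F f closes."""
    return closes({("F", f)})


p, q = ("p", "p"), ("p", "q")
neg_both = ("and", ("not", p), ("not", q))
one_way = ("imp", ("or", p, q), ("not", neg_both))
converse = ("imp", ("not", neg_both), ("or", p, q))
print(provable(one_way), provable(converse))
```

Each rule is sound, so a True answer indicates an intuitionistically valid formula. A False answer only means that this literal reading found no closed tableau: for instance, the sketch returns False for ¬¬(p ∨ ¬p), which is intuitionistically valid, because the T-signed formula is discarded once it has been used.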
Theorem 4.43 The tableau system of proof is sound: each provable formula is valid. We recall that a formula φ is provable if the tableau for Fφ closes.
Proof To the contrary, suppose that a formula φ is provable but not valid. Then the formula Fφ is satisfiable, i.e., there is a pointed frame (F, w) which satisfies Fφ. The crux of the proof is in the realization that each step in the expansion of a tableau for Fφ preserves satisfiability, which is verified by checking each of the tableau rules; for instance, if at a level (n) of the tableau the satisfiable formula is T(ψ ∧ ξ), then the next level (n+1) contains formulae Tψ and Tξ, both satisfiable. When the tableau branches conclude, the tableau is open, contrary to the assumption that φ is provable.
In order to attempt a completeness proof for the intuitionistic tableau system, we enter the already known track and define the Hintikka family for this case. We require these families to be consistent, i.e., no tableau for any of their sets closes.
Definition 4.51 (The Hintikka family of sets) Let H denote this family; it is a Hintikka intuitionistic family of sets if it is consistent and if it satisfies the following conditions. Γ and Δ are generic symbols for members of H.
(H0) no set Γ in H contains both Tφ and Fφ for some formula φ;
(H1) if T(φ ∧ ψ) in Γ, then Tφ in Γ and Tψ in Γ;
(H2) if F(φ ∧ ψ) in Γ, then Fφ in Γ or Fψ in Γ;
(H3) if T(φ ∨ ψ) in Γ, then Tφ in Γ or Tψ in Γ;
(H4) if F(φ ∨ ψ) in Γ, then Fφ in Γ and Fψ in Γ;
(H5) if T¬φ in Γ, then Fφ in Γ;
(H6) if T(φ ⊃ ψ) in Γ, then Fφ in Γ or Tψ in Γ;
(H7) if F(φ ⊃ ψ) in Γ, then there exists Δ in H such that Γ_T ⊂ Δ, Tφ in Δ and Fψ in Δ;
(H8) if F¬φ in Γ, then there exists Δ in H such that Γ_T ⊂ Δ and Tφ in Δ.
Theorem 4.44 Each Hintikka family is satisfiable.
Proof We take as the set W of worlds in a model the sets in the Hintikka family and we define the accessibility relation R(Γ, Δ) as Γ_T ⊂ Δ. The frame of the model is F = (H, R). It remains to define the satisfaction relation |=. In order to follow the Henkin idea of canonical models, we would like to secure the condition
(H) if Tφ ∈ Γ ∈ H, then Γ |= φ
along with its dual condition
(-H) if Fφ ∈ Γ, then it is not the case that Γ |= φ.
We apply the idea already used with tableaux, viz., we begin with atomic formulae, and for an atomic formula P we let Γ |= P if and only if P ∈ Γ, and we extend |= by structural induction.
We provide an example: T(φ ⊃ ψ) ∈ Γ ≡ ∀Δ((Γ_T ⊆ Δ) ⊃ T(φ ⊃ ψ) ∈ Δ) ≡ ∀Δ((Γ_T ⊆ Δ) ⊃ (Fφ ∈ Δ ∨ Tψ ∈ Δ)) ≡ ∀Δ((Γ_T ⊆ Δ) ⊃ (¬(Δ |= φ) ∨ Δ |= ψ)) ≡ ∀Δ((Γ_T ⊆ Δ) ⊃ ((Δ |= φ) ⊃ (Δ |= ψ))) ≡ Γ |= φ ⊃ ψ.
Theorem 4.45 The tableau system is complete: each valid formula is provable.
Proof The scheme of the proof is standard by now. Suppose a formula φ is valid but not provable. There is a model M for φ. Non-provability of φ provides an open tableau for Fφ with open branches satisfying the properties of Hintikka sets, hence satisfiable, contrary to validity of φ.
4.14 Consistency of Intuitionistic Sentential Logic
Following the lines of previous cases, with necessary modifications, we recall the property of consistency introduced in Fitting [7]. Some remarks on notions to be used are in order. Connectives ∨, ∧ are called regular and connectives ¬, ⊃ are called special. This distinction is caused by the distinct behavior of the two sets with respect to satisfaction, as stated in Definition 4.47. A formula of type α is regular or special depending on whether the main connective in it is regular or special. The symbol C_T stands for {Tφ : Tφ ∈ C}.
Definition 4.52 (Consistency property) A family 𝒞 of sets of signed formulae is a consistency property if and only if each set C in it has the following properties:
(i) C contains no pair of conjugate formulae and neither F⊤ nor T⊥;
(ii) for a type β formula φ ∈ C, either C ∪ {φ1} in 𝒞 or C ∪ {φ2} in 𝒞;
(iii) for a regular type α formula φ ∈ C, C ∪ {φ1, φ2} in 𝒞;
(iv) for a special type α formula φ ∈ C, C_T ∪ {φ1, φ2} in 𝒞.
We now follow the line initiated in Henkin [11]. By applying an appropriately modified Lindenbaum Lemma, each set C ∈ 𝒞 can be enlarged to a maximal consistent set MaxCon(C). Definition 4.53 (The canonical structure) We let W be the family of all sets MaxCon(C) for C ∈ 𝒞. The accessibility relation R is defined by letting R(MaxCon(C), MaxCon(C′)) if and only if C_T ⊆ C′. In line
with previous developments, for any signed atomic formula p we let MaxCon(C) |= p if and only if p ∈ MaxCon(C). By means of structural induction which employs the rules in Definition 4.47, we verify that if φ ∈ MaxCon(C), then MaxCon(C) |= φ, for each C ∈ 𝒞 and each signed formula φ. This defines all ingredients of the structure M = <W, R, |=>. Theorem 4.46 Each consistent set of signed formulae is satisfiable. This implies strong completeness of the tableau proof system.
4.15 First-Order Intuitionistic Logic (FOIL)
First-order intuitionistic logic shares syntactic features with classical predicate logic: a set of countably many individual variables denoted x, y, z, . . ., a countable set of n-ary predicate symbols P_1^n, P_2^n, . . . for each n ≥ 1, sentential connective symbols, quantifier symbols, and auxiliary symbols of parentheses and punctuation marks. In addition, as usual in the case of prefixed tableaux, there is a countable set IP of individual parameters, i.e., non-quantifiable variables. Atomic formulae are P_j^n(a_1, a_2, . . . , a_n), each a_i an individual variable or individual parameter. Formulae are built in the standard recursive way, beginning from atomic formulae, by means of sentential connectives and quantifiers. Models for first-order intuitionistic logic (FOIL) presented here are due to Kripke [25]. The notation follows, with some modifications, the one adopted in Fitting [7–9].
Definition 4.54 (First-order intuitionistic models. Satisfiability, validity) A frame for FOIL is a triple (W, R, P), where (W, R) is an S4 frame and P is a mapping from W into the collection of non-empty sets of parameters. A first-order intuitionistic structure over a frame F = (W, R, P) is the tuple (W, R, P, |=), where |= is a satisfaction relation. The mapping P is monotone: if R(w, w′), then P(w) ⊆ P(w′).
Definition 4.55 (The first-order satisfaction relation) The relation |= fulfills the following conditions. For w ∈ W, the symbol F_IP(w) denotes the set of formulae which contain only parameters from P(w).
(i) for each atomic formula α and each w ∈ W: if w |= α, then α ∈ F_IP(w);
(ii) for w, w′ ∈ W and for each atomic formula α: if R(w, w′) and w |= α, then w′ |= α;
(iii) for w ∈ W: w |= φ ∧ ψ if and only if w |= φ and w |= ψ;
(iv) for w ∈ W: w |= φ ∨ ψ if and only if φ ∨ ψ ∈ F_IP(w) and (w |= φ or w |= ψ);
(v) for w ∈ W: w |= ¬φ if and only if ¬φ ∈ F_IP(w) and, for each w′ ∈ N_F(w), it is not the case that w′ |= φ;
(vi) for w ∈ W: w |= φ ⊃ ψ if and only if φ ⊃ ψ ∈ F_IP(w) and, for each w′ ∈ N_F(w), if w′ |= φ, then w′ |= ψ;
(vii) for w ∈ W: w |= ∃xφ(x) if and only if w |= φ(p) for some p ∈ P(w);
(viii) for w ∈ W: w |= ∀xφ(x) if and only if w′ |= φ(p) for each w′ ∈ N_F(w) and each p ∈ P(w′).
The monotonicity condition on the mapping P may be interpreted as a reflection of properties of mathematical knowledge: a truth once established remains true in the future, and knowledge is collected incrementally. Conditions (v) and (vi) reflect the intuitionistic point of view on negation and implication: their truth is preserved by future developments. Condition (viii) requires of universal quantification the confirmation of truth for all parameters met in the future.
Theorem 4.47 For w ∈ W and a formula φ: (a) if w |= φ, then φ ∈ F_IP(w); (b) if w |= φ, then w′ |= φ for each w′ ∈ N_F(w). Proof (a) is encoded in conditions (iv)–(viii) and it may be verified by structural induction. Verification of (b) may be done by structural induction beginning with condition (ii).
For a structure S = (W, R, P, |=), a formula φ is true in S if and only if w |= φ for each w ∈ W such that φ ∈ F_IP(w). A formula φ is valid if and only if it is true in all structures. FOIL excludes as invalid some formulae valid in FO: an example is the formula ∀x(p ∨ φ(x)) ⊃ (p ∨ ∀xφ(x)).
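The failure of ∀x(p ∨ φ(x)) ⊃ (p ∨ ∀xφ(x)) can be illustrated on a small finite structure. The following Python sketch is only an illustration under stated assumptions: the two-world structure, the names W, R, P, ATOMS, and the tuple encoding of formulae are all hypothetical choices, N_F(w) is read as the set of worlds accessible from w, and the side conditions on F_IP(w) are omitted because every formula is evaluated only with parameters available at the world in question.

```python
# Hypothetical two-world FOIL structure (all names are illustrative assumptions).
W = [0, 1]
R = {(0, 0), (0, 1), (1, 1)}       # reflexive and transitive: an S4 frame
P = {0: {"a"}, 1: {"a", "b"}}      # monotone: P(0) is a subset of P(1)
ATOMS = {                          # atomic facts, preserved along R
    0: {("phi", "a")},
    1: {("phi", "a"), ("p",)},
}


def successors(w):
    """Worlds accessible from w (w itself included, since R is reflexive)."""
    return [v for v in W if (w, v) in R]


def forces(w, f):
    """Decide w |= f for formulae encoded as nested tuples."""
    op = f[0]
    if op == "atom":
        return tuple(f[1:]) in ATOMS[w]
    if op == "and":
        return forces(w, f[1]) and forces(w, f[2])
    if op == "or":
        return forces(w, f[1]) or forces(w, f[2])
    if op == "not":      # the formula fails at every accessible world
        return all(not forces(v, f[1]) for v in successors(w))
    if op == "imp":      # checked at every accessible world
        return all(not forces(v, f[1]) or forces(v, f[2]) for v in successors(w))
    if op == "exists":   # witness among the parameters of w
        return any(forces(w, f[1](a)) for a in P[w])
    if op == "forall":   # all parameters of all accessible worlds
        return all(forces(v, f[1](a)) for v in successors(w) for a in P[v])
    raise ValueError(f"unknown connective: {op}")


p = ("atom", "p")
phi = lambda x: ("atom", "phi", x)
antecedent = ("forall", lambda x: ("or", p, phi(x)))
consequent = ("or", p, ("forall", phi))
formula = ("imp", antecedent, consequent)

print(forces(0, antecedent))   # True
print(forces(0, consequent))   # False
print(forces(0, formula))      # False
```

At world 0 the antecedent holds because the new parameter b appears only at world 1, where p is already true, while the consequent fails: p is not yet true at 0 and φ(b) never becomes true.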
4.16 Natural Deduction: FOIL-Tableaux
Definition 4.56 (FOIL tableau rules) We recall a proof of tableau-completeness and soundness of FOIL. The first thing is to introduce the tableau rules for FOIL in addition to the rules given for the sentential case. The symbol p denotes a parameter, Γ means a set of formulae, and the symbol Γ_T denotes the set {Tφ : Tφ ∈ Γ}. All formulae are signed. The new rules for FOIL are as follows (premise above the line, conclusion below it, written here as premise / conclusion):
(i) (T∃): Γ, T∃xφ(x) / Γ, Tφ(p) (parameter p new);
(ii) (F∃): Γ, F∃xφ(x) / Γ, Fφ(p);
(iii) (T∀): Γ, T∀xφ(x) / Γ, Tφ(p);
(iv) (F∀): Γ, F∀xφ(x) / Γ_T, Fφ(p), p new.
While in FO the duality between ∃ and ∀ via negation holds without any restriction on ∀, in FOIL, as witnessed by rules (i) and (iv), the case of F∀ is restricted by the change of Γ into Γ_T: this reflects the need to verify the truth of ∀ with respect to future developments which may bring some changes to our current knowledge; to be on the safe side, we restrict ourselves to true formulae, which will stay true.
Example 4.15 The formula ∀xφ(x) ⊃ ¬∃x¬φ(x) has a tableau proof.
(1) F[∀xφ(x) ⊃ ¬∃x¬φ(x)];
(2) T∀xφ(x);
(3) F¬∃x¬φ(x);
(4) Tφ(p): by (2) and 15.1(iii);
(5) T∃x¬φ(x): by (3);
(6) T¬φ(p): by 15.1(i);
(7) Fφ(p): by (6);
(8) Tφ(p): by (2) and 15.1(iii), no restrictions on p so we may use p introduced in (6);
(9) X: the branch closes.
We approach the completeness theorem for FOIL. We include a proof modelled on the proof in Kripke [25] (see also Fitting [7]).
Definition 4.57 (FOIL notion of consistency) A finite set C of signed FOIL formulae is consistent if no tableau for it closes; a set of FOIL formulae is consistent if and only if each of its finite subsets is consistent.
Definition 4.58 (Hintikka families) We define the FOIL version of a Hintikka family ℋ of sets of formulae. We denote by the symbol P(H) the set of constants and parameters used in formulae in H. We regard elements of ℋ as worlds and define the accessibility relation R as R(H1, H2) if and only if (H1)_T ⊆ H2 and P(H1) ⊆ P(H2). A family ℋ is a Hintikka family if the following conditions hold for each set H ∈ ℋ:
(H0) H is consistent;
(H1) if Tφ ∧ ψ ∈ H, then Tφ ∈ H and Tψ ∈ H;
(H2) if Fφ ∧ ψ ∈ H, then Fφ ∈ H or Fψ ∈ H;
(H3) if Tφ ∨ ψ ∈ H, then Tφ ∈ H or Tψ ∈ H;
(H4) if Fφ ∨ ψ ∈ H, then Fφ ∈ H and Fψ ∈ H;
(H5) if T¬φ ∈ H, then Fφ ∈ H;
(H6) if Tφ ⊃ ψ ∈ H, then Fφ ∈ H or Tψ ∈ H;
(H7) if Fφ ⊃ ψ ∈ H, then for some H1 ∈ ℋ such that R(H, H1): Fψ ∈ H1 and Tφ ∈ H1;
(H8) if F¬φ ∈ H, then for some H1 ∈ ℋ such that R(H, H1): Tφ ∈ H1;
(H9) if T∀xφ(x) ∈ H, then Tφ(p) ∈ H for each p ∈ P(H);
(H10) if F∀xφ(x) ∈ H, then there exists H1 ∈ ℋ such that R(H, H1) and Fφ(a) ∈ H1 for some a ∈ P(H1);
(H11) if T∃xφ(x) ∈ H, then Tφ(a) ∈ H for some a ∈ P(H);
(H12) if F∃xφ(x) ∈ H, then Fφ(a) ∈ H for each a ∈ P(H).
We prove that each Hintikka family ℋ is satisfiable in a Hintikka structure, which in this case is M = (W, R, P, |=) such that each H ∈ ℋ satisfies (i) if Tφ ∈ H, then H |= φ and (ii) if Fφ ∈ H, then H |= φ does not hold. A Hintikka structure which satisfies (i) and (ii) is called a Hintikka model.
Theorem 4.48 Each Hintikka family has a Hintikka model. Proof Building a model begins with the definitions of R and P as above and, for atomic φ, letting for each H ∈ ℋ that H |= φ if and only if Tφ ∈ H. Then |= is extended by structural induction to all formulae. As an example, suppose that T∀xφ(x) ∈ H. The steps in the verification that H |= ∀xφ(x) are as follows:
(i) T∀xφ(x) ∈ H;
(ii) for each H′ such that R(H, H′): T∀xφ(x) ∈ H′ (by 14.2(viii));
(iii) for each H′ such that R(H, H′): Tφ(a) ∈ H′ for each a ∈ P(H′);
(iv) for each H′ such that R(H, H′): H′ |= φ(a) for each a ∈ P(H′);
(v) H |= ∀xφ(x).
In the case of F∀xφ(x) ∈ H, we follow the same lines except that we replace line (ii) with line (ii’): there exists H′ such that R(H, H′) and Fφ(a) ∈ H′ for some a ∈ P(H′); we replace line (iii) with line (iii’): there exists H′ such that R(H, H′) and it is not the case that H′ |= φ(a) for some a ∈ P(H′); and we replace line (v) with line (v’): H |= ∀xφ(x) does not hold. We proceed by analogy in the case of existential quantification. Sentential cases were considered for the intuitionistic sentential logic. The symbol Par denotes the countable (infinite) set of parameters.
Definition 4.59 (Hintikka saturated sets) (cf. Fitting [7–9]). A set of signed formulae H is a Hintikka saturated set relative to a set Par of parameters (is a Par-saturated set) if and only if the following hold:
(i) if Tφ ∧ ψ ∈ H, then Tφ ∈ H and Tψ ∈ H;
(ii) if Fφ ∧ ψ ∈ H, then Fφ ∈ H or Fψ ∈ H;
(iii) if Tφ ∨ ψ ∈ H, then Tφ ∈ H or Tψ ∈ H;
(iv) if Fφ ∨ ψ ∈ H, then Fφ ∈ H and Fψ ∈ H;
(v) if T¬φ ∈ H, then Fφ ∈ H;
(vi) if Tφ ⊃ ψ ∈ H, then either Fφ ∈ H or Tψ ∈ H;
(vii) if T∀xφ(x) ∈ H, then Tφ(p) ∈ H for each p ∈ Par;
(viii) if F∃xφ(x) ∈ H, then Fφ(p) ∈ H for each p ∈ Par;
(ix) if T∃xφ(x) ∈ H, then Tφ(p) ∈ H for some p ∈ Par.
Theorem 4.49 Each consistent set of signed formulae H, with P being the set of parameters in formulae of H and Q a countably infinite set of parameters ordered as (q_n)_{n=1}^∞ and disjoint from P, extends to a (P ∪ Q)-saturated set H*.
Proof The proof of Theorem 4.49 resembles the proof of the Lindenbaum Lemma: a saturated extension is built inductively. Formulae in H can be made into a sequence Φ: φ_1, φ_2, . . . and parameters in Q form a sequence Q: q_1, q_2, . . .. Disjointness of Q from P secures preservation of consistency after substitution of any q_i into formulae of the form T∃xφ(x) ∈ H. The steps in the construction of the saturation are as follows.
(i) initialization: H_0 = H;
(ii) inductive step: suppose that H_n is already defined. Pick the first φ_j which is of the form T∃xψ(x) and for which there is no p ∈ P such that Tψ(p) ∈ H_n; for q_n, form the set H_n ∪ {Tψ(q_n)} and extend it to MaxCon_{P∪{q_1,...,q_n}}; call it H_{n+1};
(iii) define H* = ⋃_{n≥0} H_n.
We let R = P ∪ Q and R_n = P ∪ {q_1, q_2, . . . , q_n}. By construction, H* is maximal consistent relative to R. If a formula ξ: T∃xφ(x) ∈ H*, then, for some n, ξ appears at step n and Tφ(q_n) ∈ H*, so H* is saturated.
Theorem 4.50 H* supports a Hintikka family. Proof (Fitting [7]) Split R into countably many pairwise disjoint, countably infinite sets Z_i = {p_j^i : j = 1, 2, . . .}, i = 1, 2, . . ., and let Z_n* = ⋃_{i≤n} Z_i. Declare Δ_n as the family of sets of formulae which are Z_n*-saturated. Claim. The family ℋ = {Δ_n : n ≥ 1} is a Hintikka family. Indeed, consider Ω ∈ Δ_n for some n. Parameters of Ω form the set Z_n. We check that conditions (H0)-(H12) in Definition 4.58 are fulfilled.
As Ω is Z_n-saturated, hence maximal consistent, the following hold:
(i) if Tφ ∧ ψ ∈ Ω, then Tφ ∈ Ω and Tψ ∈ Ω;
(ii) if Fφ ∧ ψ ∈ Ω, then Fφ ∈ Ω or Fψ ∈ Ω;
(iii) if Tφ ∨ ψ ∈ Ω, then Tφ ∈ Ω or Tψ ∈ Ω;
(iv) if Fφ ∨ ψ ∈ Ω, then Fφ ∈ Ω and Fψ ∈ Ω;
(v) if Tφ ⊃ ψ ∈ Ω, then Fφ ∈ Ω or Tψ ∈ Ω;
(vi) if T¬φ ∈ Ω, then Fφ ∈ Ω;
(vii) if T∀xφ(x) ∈ Ω, then Tφ(p) ∈ Ω for each p ∈ Z_n;
(viii) if F∃xφ(x) ∈ Ω, then Fφ(p) ∈ Ω for each p ∈ Z_n;
(ix) if T∃xφ(x) ∈ Ω, then Tφ(p) ∈ Ω for some p ∈ Z_n (by construction of H*);
(x) if F¬φ ∈ Ω, then the set Ω_T ∪ {Tφ} is consistent and it extends to a Z_{n+1}-saturated set Z, so R(Ω, Z) and Tφ ∈ Z;
(xi) if Fφ ⊃ ψ ∈ Ω, then as in (x), we find Z such that R(Ω, Z), Tφ ∈ Z, Fψ ∈ Z;
(xii) if F∀xφ(x) ∈ Ω, then, for some p ∈ Z_{n+1}, the set Ω_T ∪ {Fφ(p)} is consistent, so it extends to some Z_{n+1}-saturated set W and, as in (x), R(Ω, W) and Fφ(p) ∈ W.
The claim is proved.
Theorem 4.51 (The Completeness Theorem for FOIL) FOIL is complete: what is valid, is provable.
Proof Consider a valid formula φ and suppose that φ is not provable. Then the set {Fφ} is Z_n*-consistent for some n, hence it extends to some Z_n*-saturated set U, so U ∈ ℋ and Fφ ∈ U. By Theorem 4.48, φ is then not satisfied at U in the Hintikka model, a contradiction with validity of φ.
4.17 Problems
Problem 4.1 (The formula G(k, l, m, n)) (after Chellas [12], 3.3). For given natural numbers k, l, m, n, the formula G(k, l, m, n) is: M^k L^l → L^m M^n. Prove: formulae (D), (B), (T), (4), (5) are special cases of G(k, l, m, n).
Problem 4.2 ((k, l, m, n)-directed structures) Prove: the formula G(k, l, m, n) is valid in frames in which the accessibility relation R is (k, l, m, n)-directed: if R^k(u, w) and R^m(u, v), then there exists z such that R^l(w, z) and R^n(v, z).
Problem 4.3 (The formula L(m, n)) Prove: the formula L(m, n): L^m φ ⊃ L^n φ encompasses formulae (4) and (T).
Problem 4.4 (L(m, n)-linear structures) Prove: the formula L(m, n) is valid in frames in which the accessibility relation R satisfies the condition: if R^n(u, v), then R^m(u, v).
Problem 4.5 (Schemes B_c, 5_c) (after Chellas [12], 3.39). The formula (B_c) is LMp ⊃ p and the formula (5_c) is LMp ⊃ Mp. Prove: (a) the formula (B_c) is satisfied in structures with the accessibility relation R fulfilling the condition: for each w ∈ W there exists v ∈ W such that R(w, v) and if R(v, u), then w = u; (b) the formula (5_c) is satisfied in structures in which the accessibility relation R has the property that for each w ∈ W there exists v ∈ W such that R(w, v) and if R(v, u), then v = u.
Problem 4.6 (Transitive closure of accessibility) Recall that the transitive closure of a relation R is TC(R) = ⋃_{n≥0} R^n, where R^0 = id, R^{n+1} = R^n ∘ R. We define an operator L^ω as M, w |= L^ω φ ≡ ∀v.(TC(R)(w, v) ⊃ M, v |= Lφ). Prove: (a) in structures with accessibility defined as TC(R) for any relation R, formulae (K), (T), (4) are valid; (b) if a relation R is symmetric, then under TC(R) the formula (B) is valid, a fortiori, the structure is a model for S5; (c) define semantics for the operator M^ω φ = ¬(L^ω ¬φ).
Problem 4.7 (Intuitionistic connections of the system B) Prove: (a) the formula (B): p ⊃ LMp is equivalent to the formula (B’): p ⊃ ¬M¬Mp; (b) in the formula (B’) of (a), replace the double symbol ¬M with the intuitionistic negation sign ∼ and prove that the obtained formula p ⊃ ∼∼p is the double negation law of intuitionistic logic.
Problem 4.8 (Truth and falsity modeled) It is tacitly assumed that each world is endowed with full SL (we do not mention it explicitly). Taking this into account, prove: (a) the formula L⊤ is valid in any normal system; (b) the formula ¬(M⊥) is valid in any normal system.
Problem 4.9 Prove that in any normal modal system the formula M⊤ ≡ (Lp ⊃ Mp) is valid.
Problem 4.10 (The Leśniewski erasure mapping ι) The mapping ι is defined on formulae of sentential modal logic as follows:
(i) ι(¬φ) is ¬(ι(φ));
(ii) ι(φ ◦ ψ) is ι(φ) ◦ ι(ψ) for ◦ ∈ {∨, ∧, ⊃, ≡};
(iii) ι(Lφ) = ι(φ) = ι(Mφ);
(iv) ι(⊥) = ⊥.
The mapping ι erases all symbols of modal operators from formulae of modal logic. Prove that ι transforms valid formulae of modal logic into valid formulae of sentential logic and inference rules of modal logic into inference rules of sentential logic. Apply this result in order to prove consistency of logics K, T, 4, 5, B, D. Problems 4.11–4.19 are already in (Chellas [12]).
Problem 4.11 (Less discussed modal logics: KB) Prove: (a) a normal modal logic is KB if it contains as valid the formula L(Mp ⊃ q) ⊃ (p ⊃ Lq); (b) the following is a rule of inference of KB: (p ⊃ q) / (Mp ⊃ Lq); (c) the scheme (4) is valid in KB if and only if the scheme (5) is valid in KB. Deduce that KB4 is equivalent to KB5.
Problem 4.12 (Less discussed modal logics: KT5) Prove that the following logics are identical: KT5, KTB4, KDB4, KDB5. [Hint: Prove that KT5 contains B, 4, D.]
Problem 4.13 (Less discussed modal logics: KDB4) Prove that the scheme T is valid in the modal logic KDB4.
Problem 4.14 (Less discussed modal logics: KD4, KD5) Prove that the formula LMp ≡ LMLMp is valid in the normal modal logic KD4. Prove that the formula LLp ≡ MLp is valid in the normal modal logic KD5.
Problem 4.15 (Less discussed modal logics: KTG4) Prove that KTG4 contains KT4 and is contained in KT5.
Problem 4.16 (Modalities of normal modal logic S5) A reduction law is an equivalence which reduces the number of modal symbols in a formula. Prove that the following reductions are valid for the normal modal logic S5:
(i) LLp ≡ Lp;
(ii) MMp ≡ Mp;
(iii) MLp ≡ Lp;
(iv) LMp ≡ Mp.
A modality is a sequence of modal operator symbols. Apply reduction laws (i)-(iv) in a proof that S5 has the following modalities: L, M, ¬L, ¬M (one may add formally the empty modality ∅ and its negation ¬).
Problem 4.17 (Modalities vs. equivalence of logics: S5 vs. KD45) Prove that KD45 has the same set of modalities as S5 but the formula Lp ⊃ p is valid in S5 and it is not valid in KD45.
Problem 4.18 (Modalities of KT4) Prove the following reduction laws for KT4:
(i) LLp ≡ Lp;
(ii) MMp ≡ Mp;
(iii) LMLMp ≡ LMp;
(iv) MLMLp ≡ MLp.
Deduce that KT4 has the following modalities: L, M, LM, ML, LML, MLM and their negations (one may add the empty sequence and its negation).
Problem 4.19 (Modalities of K5) (after Chellas [12]). Prove the following reduction laws for K5:
(i) LLLp ≡ LLp; MMMp ≡ MMp;
(ii) LMLp ≡ LLp; MLMp ≡ MMp;
(iii) MLLp ≡ MLp; LMMp ≡ LMp;
(iv) MMLp ≡ MLp; LLMp ≡ LMp.
Prove: the normal modal logic K5 has the following modalities: L, M, LL, MM, LM, ML, their negations, and the empty modality and its negation ¬.
Problem 4.20 (Quantified modal logic) Consider formulae: (a) M∀xφ(x) ⊃ ∀xMφ(x); (b) ∃xLφ(x) ⊃ L∃xφ(x). For formulae (a), (b), check that they form a pair like the Barcan and converse Barcan formulae and check their validity.
Problem 4.21 (SIL) Prove or disprove validity of the following formulae of SIL: (a) p ∨ ∼p; (b) ∼∼p ⊃ p; (c) ∼∼(p ∨ ∼p); (d) (p ∨ q) ⊃ ∼(∼p ∧ ∼q); (e) ∼∼(p ∨ q) ⊃ (∼∼p ∨ ∼∼q).
Problem 4.22 (Truth sets) (cf. Smullyan [28], Fitting [7]). An SL-truth set T satisfies the following conditions:
(a) φ ∨ ψ ∈ T ≡ φ ∈ T ∨ ψ ∈ T;
(b) φ ∧ ψ ∈ T ≡ φ ∈ T ∧ ψ ∈ T;
(c) ¬φ ∈ T ≡ φ ∉ T;
(d) φ ⊃ ψ ∈ T ≡ φ ∉ T ∨ ψ ∈ T.
Prove: (i) a formula φ of sentential logic is valid if and only if φ belongs to every truth set; (ii) apply the notion of truth set and (a) to prove: each valid formula of intuitionistic sentential logic is a valid formula of sentential logic.
Problem 4.23 (SIL into S4) (Schütte [24], cf. Fitting [7]). Translation of formulae of SIL into formulae of S4 is effected by means of the following function P:
(a) P(φ) = Lφ for each atomic formula φ;
(b) P(φ ∨ ψ) = P(φ) ∨ P(ψ);
(c) P(φ ∧ ψ) = P(φ) ∧ P(ψ);
(d) P(∼φ) = L¬P(φ);
(e) P(φ ⊃ ψ) = L(P(φ) ⊃ P(ψ)).
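Clauses (a)-(e) describe a structural recursion, which can be sketched in Python as follows. This is only an illustrative sketch: the tuple encoding of formulae and the names translate and show are assumptions made here, not notation from the text.

```python
def translate(f):
    """Map a SIL formula (nested tuples) to its S4 image under P."""
    op = f[0]
    if op == "atom":                      # (a) P(phi) = L phi for atomic phi
        return ("box", f)
    if op in ("or", "and"):               # (b), (c) commute with P
        return (op, translate(f[1]), translate(f[2]))
    if op == "neg":                       # (d) P(~phi) = L not P(phi)
        return ("box", ("not", translate(f[1])))
    if op == "imp":                       # (e) P(phi > psi) = L(P(phi) > P(psi))
        return ("box", ("imp", translate(f[1]), translate(f[2])))
    raise ValueError(f"unknown connective: {op}")


SYMBOLS = {"or": "∨", "and": "∧", "imp": "⊃"}


def show(f):
    """Render a modal formula as a string, with L for the box operator."""
    def wrap(g):
        # atoms and unary formulae need no parentheses
        return show(g) if g[0] in ("atom", "box", "not") else "(" + show(g) + ")"
    op = f[0]
    if op == "atom":
        return f[1]
    if op == "box":
        return "L" + wrap(f[1])
    if op == "not":
        return "¬" + wrap(f[1])
    return wrap(f[1]) + " " + SYMBOLS[op] + " " + wrap(f[2])


p, q = ("atom", "p"), ("atom", "q")
print(show(translate(("imp", p, q))))     # L(Lp ⊃ Lq)
print(show(translate(("neg", p))))        # L¬Lp
print(show(translate(("or", p, ("neg", p)))))
```

The last call renders the image of p ∨ ∼p, which is the S4 formula one would examine when relating Problem 4.21(a) to this translation.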
Prove: a formula φ of sentential intuitionistic logic is valid in this logic if and only if the formula P(φ) is a valid formula of the modal system S4.
Problem 4.24 (FOIL) Prove: for each model (W, R, P, |=) for FOIL and any w ∈ W, w |= φ if and only if φ ∈ F_IP(w).
Problem 4.25 (FOIL) Prove: for each model (W, R, P, |=) for FOIL, if w |= φ and w* ∈ N_F(w), then w* |= φ.
Problem 4.26 (FOIL) Prove that the valid formula of FO: ¬¬[∀x.(φ(x) ∨ ¬φ(x))] is not valid as a formula of intuitionistic logic.
Problem 4.27 (FOIL) Prove that the valid formula of FO: ∀x.[(p ∨ φ(x)) ⊃ (p ∨ ∀x.φ(x))], where p contains no occurrence of x, is not valid as a formula of intuitionistic logic. [Hint: in both cases, construct falsifying FOIL models].
References
1. Smith, R.: Aristotle’s Logic. The Stanford Encyclopedia of Philosophy (Fall 2020 Edition), Zalta, E.N. (ed.). https://plato.stanford.edu/archives/fall2020/entries/aristotle-logic
2. Bobzien, S.: Stoic logic. In: Inwood, B. (ed.) The Cambridge Companion to Stoic Philosophy. Cambridge University Press (2003)
3. Knuuttila, S.: Medieval Theories of Modality. The Stanford Encyclopedia of Philosophy (Summer 2021 Edition), Zalta, E.N. (ed.). https://plato.stanford.edu/archives/sum2021/entries/modality-medieval/
4. Lewis, C.I.: A Survey of Symbolic Logic. Berkeley: University of California Press; reprinted: Dover Publications (1960) (without Chs. 5, 6)
5. Carnap, R.: Meaning and Necessity. University of Chicago Press (1947, 1956)
6. Kripke, S.: Semantical analysis of modal logic I: Normal modal propositional calculi. Z. Math. Logik und Grundlagen der Mathematik 9(5-6), 67–96 (1963). https://doi.org/10.1002/malq.19630090502
7. Fitting, M.C.: Intuitionistic Logic Model Theory and Forcing. North-Holland Publishing Co., Amsterdam (1969)
8. Fitting, M.C.: Model existence theorems for modal and intuitionistic logics. J. Symb. Logic 38, 613–627 (1973)
9. Fitting, M.C.: Proof Methods for Modal and Intuitionistic Logics. Springer Science+Business Media, Dordrecht (1983)
10. Ohnishi, M., Matsumoto, K.: Gentzen method for modal calculi I, II. Osaka Math. J. 9, 113–130 (1957); 11, 115–120 (1959)
11. Henkin, L.: The completeness of the first-order functional calculus. J. Symb. Logic 14(3), 159–166 (1949)
12. Chellas, B.F.: Modal Logic. An Introduction. Cambridge University Press, Cambridge UK (1980)
13. Ladner, R.E.: The computational complexity of provability in systems of modal propositional logic. SIAM J. Comput. 6(3), 467–480 (1977)
14. Blackburn, P., de Rijke, M., Venema, Y.: Modal Logic. Cambridge University Press, Cambridge UK (2001)
15. Halpern, J., Moses, Y.O.: A guide to completeness and complexity for modal logics of knowledge and belief. Artif. Intell. 54(2), 319–379 (1992)
16. Nelson, M.: Propositional Attitude Reports. The Stanford Encyclopedia of Philosophy (Spring 2022 Edition), Zalta, E.N. (ed.). https://plato.stanford.edu/archives/spr2022/entries/prop-attitude-reports
17. Marcus, R.B.: A functional calculus of first order based on strict implication. J. Symb. Logic 11, 1–16 (1946)
18. Kripke, S.: Semantical considerations on modal logic. Acta Philosophica Fennica 16, 83–94 (1963)
19. Fitting, M.C., Mendelson, R.L.: First-Order Modal Logic. Springer Business+Media B.V, Dordrecht (1998)
20. Fitting, M.C.: Tableau methods of proof for modal logics. Notre Dame J. Form. Logic 13, 237–247 (1972)
21. Goré, R.: Tableau methods for modal and temporal logics. In: D’Agostino, M., et al. (eds.) Handbook of Tableau Methods. Kluwer, Dordrecht (1998)
22. Heyting, A.: Die formalen Regeln der intuitionistischen Logik, Preussischen Akademie der Wissenschaften. Physikalisch-mathematische Klasse, pp. 42–56, 57–71 & 158–169 (1930)
23. Gödel, K.: Eine Interpretation des intuitionistischen Aussagenkalküls. Ergebnisse eines mathematischen Kolloquiums 4, 39–40 (1933). (English transl.: Interpretation of the intuitionistic sentential logic. In: Hintikka, J.K.K. (ed.): The Philosophy of Mathematics, Oxford University Press, 128–129 (1969))
24. Schütte, K.: Vollständige Systeme modaler und intuitionistischer Logik. Springer, Berlin (1968)
25. Kripke, S.: Semantical analysis of intuitionistic logic I, Formal Systems and Recursive Functions. In: Proceedings of the Eighth Logic Colloquium, Oxford 1963, pp. 92–130. North-Holland Publishing Co. (1965)
26. Wajsberg, M.: Untersuchungen über den Aussagenkalkül von A. Heyting. Wiadomości Matematyczne 4, 45–101 (1938). (also in: Logical Works by Mordechaj Wajsberg, Surma, S. (ed.), Polish Acad. Sci. 132, 171, (1977))
27. Beth, E.W.: The Foundations of Mathematics. A Study in the Philosophy of Science. Harper & Row Publishers, New York (1966)
28. Smullyan, R.M.: First Order Logic. Dover, Mineola, N.Y. (1996)
Chapter 5
Temporal Logics for Linear and Branching Time and Model Checking
Temporal expressions have been with humanity from the beginning. Once oral speeches or written texts began, it was possible to record reflections on time for contemporary and future generations. The Flood was perceived as a time-related event: after it, time began anew. Time as a main factor was mentioned in the earliest philosophy, e.g., in the philosophy of Heraclitus of Ephesus: time flows, bringing cycles of opposites, life and death, way upward and way downward. In the Bible, we see recognition of God’s time, eternal and not changing, and the limited time of man, without past and without future: ‘To every thing there is a season, and a time to every purpose under the heaven; A time to be born, and a time to die ...’ (Kohelet [1], 3:1, 3:2). Megaric and Stoic schools (Diodorus Cronus, Zeno of Elea) disputed the nature of motion, trying to resolve paradoxes they created which implicitly involved time as a sequence of events. Time came into logic with Aristotle, in connection with the problem of future contingencies: ‘there will be a sea battle tomorrow; there will not be any sea battle tomorrow’. Analysis of this problem in Aristotle led Jan Łukasiewicz to the invention of many-valued logic (Łukasiewicz [2]). Some medieval thinkers involved time in emerging dynamics (Nicolas d’Oresme). Avicenna (Ibn Sina), at the turn of the 10th and 11th centuries CE, considered temporal qualifications to syllogisms: ‘at all times’, ‘at most times’, ‘at some time’.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_5
5.1 Temporal Logic of Prior
Modern temporal logic began with the introduction of a formalized notion of time into the analysis of language, the work of Arthur Prior in his ‘Tense Logic’ (Prior [3]). Prior’s tense operators, listed below, reflect expressions for time events used in natural language.
Definition 5.1 (The Prior tense operators) They are divided into past time operators and future time operators.
Prior’s past time operators:
(i) P: ‘... it has at some time been the case that...’;
(ii) H: ‘... it has always been the case that ...’.
Prior’s future time operators:
(iii) F: ‘... it will at some time be the case that...’;
(iv) G: ‘... it will always be the case that ...’.
Pairs P, H and F, G are dual to each other:
(i) Pφ ≡ ¬H¬φ;
(ii) Fφ ≡ ¬G¬φ;
(iii) Hφ ≡ ¬P¬φ;
(iv) Gφ ≡ ¬F¬φ.
Time may be modelled as continuous, but it may as well be modelled as discrete. In Pnueli [4], a modal logic of time was proposed as a tool in verification of programs and in model checking, and the linear time model (LTL) was proposed. Later on, more complex models of time emerged, for instance, tree-like structures, in which case we speak of branching time models (BTL), of which Computation Tree Logic (CTL) and its extension CTL* are most often discussed. These models find their use in modelling various systems. In temporal logics one discusses long sequences of moments of time called paths, which are modelled as infinite sequences of states connected by instances of a transition relation. In applications, temporal logics are often coupled with automata and languages; hence, we collected in Chap. 1 basic facts about grammars, formal languages and automata. In the final part of this chapter, we give an account of model checking with automata for linear and branching temporal logics. We begin with the linear discrete model of time.
5.2 Linear Temporal Logic (LTL) Introduction of time aspect into computer science problems is owed to Kamp [5] and Pnueli who discussed linear time which led to emergence of Linear Temporal Logic (LTL). Linear temporal logic studies basically the behavior of systems over a single
sequence of time moments modelled as an infinite path. LTL uses as the set of worlds W, in the simplest case, the discrete set of natural numbers N. Let us call these numbers states. So we have the infinite sequence of states s_0, s_1, s_2, . . . , s_n, . . .. This means that LTL is concerned with events present and future. By a path, denoted σ, we mean any infinite subsequence of states s_{i_0}, s_{i_1}, . . .. However, LTL can be enriched by the addition of mirror operators, symmetric with respect to the state s_0 to the future time operators. This variant of LTL is denoted LTL+PAST; a canonical model of time for LTL+PAST is the set Z of integers.
Definition 5.2 (Syntax of LTL) (i) LTL contains a countable set of atomic propositions of sentential logic SL; in LTL oriented towards model checking, i.e., to computer science applications, this set is denoted AP and it may consist of formulae, descriptors, etc.; (ii) LTL makes use of temporal operators X (next), F (eventually), G (always), U (until); the variant LTL+PAST employs additionally operators P (previous) and S (since). As P mirrors X and S mirrors U, we exclude past operators from our present discussion.
Definition 5.3 (Formulae of LTL) Well-formed formulae (wffs) of LTL are defined by structural complexity as follows:
(i) all atomic propositions p ∈ AP are wffs;
(ii) falsum ⊥ (falsity) is a wff;
(iii) if φ is a wff, then Fφ, Gφ, Xφ are wffs;
(iv) if φ and ψ are wffs, then φUψ is a wff;
(v) if φ and ψ are wffs, then φ ⊃ ψ and ¬φ are wffs.
From Definition 5.3 it follows in a standard way that if φ, ψ are wffs, then φ ∨ ψ, φ ∧ ψ are wffs, and verum ⊤, defined as ¬⊥, is a wff. We call valid formulae of sentential logic SL tautologies in order to distinguish them from proper formulae of LTL.
Definition 5.4 (Linear Kripke structures for LTL) LTL semantics borrows from modal logic the notion of a Kripke structure as a set of worlds together with an accessibility relation and an assignment on AP, which assigns to each world the set of atomic propositions true in that world. For the purposes of LTL, this scheme has to be adapted to the already mentioned structure of an infinite sequence of moments of time. Let σ = <s_0, s_1, . . . , s_j, . . .> be an infinite sequence of states and let Σ be the set of elements of σ. The accessibility relation is the successor relation Succ with instances of the form s_i → s_{i+1} for i ≥ 0. The functional space (2^AP)^Σ is the space of all assignments s_i ↦ X_i ⊆ AP for i ≥ 0. Each element of the space (2^AP)^Σ is a mapping f : Σ → 2^AP called a trace. Hence, f(Σ) is an ω-sequence (ω-word) over 2^AP. We denote the linear structure defined by the mapping f by the symbol M_f.
Definition 5.5 (The relation of satisfaction) The pointed structure is the pair (M_f, s_i), where s_i ∈ Σ. We define the relation M_f, s_i |= φ by structural induction.
(i) M_f, s_i |= ⊤ for each f and each s_i;
(ii) for each atomic proposition p, M_f, s_i |= p if and only if p ∈ f(s_i);
(iii) M_f, s_i |= φ ⊃ ψ if and only if either it is not true that M_f, s_i |= φ or M_f, s_i |= ψ;
(iv) M_f, s_i |= φ ∨ ψ if and only if either M_f, s_i |= φ or M_f, s_i |= ψ;
(v) M_f, s_i |= φ ∧ ψ if and only if M_f, s_i |= φ and M_f, s_i |= ψ;
(vi) M_f, s_i |= Xφ if and only if M_f, s_{i+1} |= φ;
(vii) M_f, s_i |= φUψ if and only if there exists j ≥ i such that M_f, s_j |= ψ and M_f, s_k |= φ for each i ≤ k < j;
(viii) M_f, s_i |= Fφ if and only if there exists j ≥ i such that M_f, s_j |= φ;
(ix) M_f, s_i |= Gφ if and only if for each j ≥ i, M_f, s_j |= φ.
Conditions (v)–(ix) in Definition 5.5 imply dependencies among temporal operators.
Theorem 5.1 The following equivalences hold.
(i) M_f, s_i |= Gφ ≡ M_f, s_i |= ¬F¬φ;
(ii) M_f, s_i |= X¬φ ≡ M_f, s_i |= ¬Xφ;
(iii) M_f, s_i |= Fφ ≡ M_f, s_i |= ⊤Uφ;
(iv) M_f, s_i |= Fφ ≡ M_f, s_i |= φ ∨ XFφ;
(v) M_f, s_i |= Gφ ≡ M_f, s_i |= φ ∧ XGφ;
(vi) M_f, s_i |= φUψ ≡ M_f, s_i |= ψ ∨ (φ ∧ X(φUψ)).
Proof For example, we prove (iii). Suppose that M_f, s_i |= Fφ. Hence, there exists j ≥ i such that M_f, s_j |= φ. By Definition 5.5(i), M_f, s_k |= ⊤ for each k. By Definition 5.5(vii), M_f, s_i |= ⊤Uφ. The converse is proved along the same lines.
Definition 5.6 (Validity) A formula φ is true in the structure M_f if and only if M_f, s_0 |= φ; a formula φ is valid if and only if it is true in each structure M_f. As usual, we denote truth in a structure M_f as M_f |= φ and validity as |= φ.
We denote by σ_j the prefix <s_0, s_1, . . . , s_j> and by σ^j the suffix <s_j, s_{j+1}, . . .> of σ. In accordance with our notational scheme, Σ_j is the set of elements in σ_j and Σ^j contains the elements of σ^j. For f ∈ (2^AP)^Σ, we denote by the symbol f^j the restriction f|Σ^j and, correspondingly, we define the structure M_{f^j}. Similarly, we define the prefix f_j as the mapping f|Σ_j and we define the structure M_{f_j}. We now list some additional valid formulae of LTL. In checking validity, please remember the detachment rule of inference.
5.2 Linear Temporal Logic (LTL)
Theorem 5.2 The following formulae are among valid formulae of LTL.
I. Formulae involving F and G
(I.i) FFφ ≡ Fφ;
(I.ii) GGφ ≡ Gφ;
(I.iii) FGFGφ ≡ FGφ;
(I.iv) FGFφ ≡ GFφ;
(I.v) GFGFφ ≡ GFφ;
(I.vi) GFGφ ≡ FGφ;
(I.vii) FGφ ⊃ GFφ;
(I.viii) φ ⊃ Gφ;
(I.ix) (φ ⊃ ψ) ⊃ (Gφ ⊃ Gψ).
II. Formulae involving U
(II.i) φU(φUψ) ≡ φUψ;
(II.ii) (φUψ)Uψ ≡ φUψ;
(II.iii) φU(φUψ) ≡ (φUψ)Uψ;
(II.iv) φUψ ≡ (ψ ∨ (φ ∧ X(φUψ))).
III. Distributive formulae
(III.i) F(φ ∨ ψ) ≡ F(φ) ∨ F(ψ);
(III.ii) F(φ ∧ ψ) ⊃ F(φ) ∧ F(ψ);
(III.iii) G(φ ∧ ψ) ≡ G(φ) ∧ G(ψ);
(III.iv) G(φ) ∨ G(ψ) ⊃ G(φ ∨ ψ).
IV. Formulae involving G, X, F and G, F, U
(IV.i) φ ∧ G(φ ⊃ XFφ) ⊃ GFφ;
(IV.ii) Gφ ∧ Fψ ⊃ φUψ;
(IV.iii) X(φ ⊃ ψ) ⊃ (Xφ ⊃ Xψ);
(IV.iv) GXφ ⊃ XGφ;
(IV.v) G(φ ⊃ ψ ∧ Xφ) ⊃ (φ ⊃ Gψ).
We prove some typical cases.
Proof For (I.i): M_f |= FFφ if and only if there exists j ≥ 0 such that M_{f^j} |= Fφ if and only if there exist j ≥ 0 and k ≥ j such that M_{f^k} |= φ if and only if there exists k ≥ 0 such that M_{f^k} |= φ, i.e., M_f |= Fφ.
For (I.iv): please observe that GFφ means that φ occurs at states of Σ infinitely often, so adding F once more does not change this behavior of φ. Formally, M_f |= FGFφ if and only if there exists j ≥ 0 such that M_{f^j} |= GFφ if and only if there exists j ≥ 0 such that for each k ≥ j there exists m ≥ k such that M_{f^m} |= φ if and only if for each p ≥ 0 there exists q ≥ p such that M_{f^q} |= φ if and only if M_f |= GFφ.
For (IV.ii): M_f |= Gφ if and only if φ is true at each s_j. M_f |= Fψ if and only if M_f, s_i |= ψ for some s_i; hence, M_f, s_j |= φ for each j < i and M_f, s_i |= ψ, which implies M_f, s_0 |= φUψ.
5 Temporal Logics for Linear and Branching Time and Model Checking
Temporal logic LTL allows us to define some properties of systems.
Example 5.1 Consider the system of traffic lights with green, yellow and red lights. We let AP = {green, yellow, red}. Let the mapping f assign to σ the sequence (green, yellow, red)^ω. Then the following formulae are true in M_f:
(i) GF(green), respectively, GF(red), GF(yellow); the meaning is ‘infinitely often green’, and the same for red and yellow;
(ii) yellow ⊃ F(red): ‘after yellow, at some time red’;
(iii) G(yellow ⊃ X red): ‘always, after yellow, next red’.
Definition 5.7 (Some properties of systems expressible in LTL)
(i) liveness: G(action initiated) → F(action completed): always action initiated and then at some time action completed;
(ii) fairness: GF(action initiated) → GF(action completed): action initiated and completed infinitely often;
(iii) safety: G(true) (called also invariant), ¬F(failure): always correct, never fails;
(iv) stability: FG(true): after some time, always correct;
(v) response: p → Fq: ‘after p, at some time q’;
(vi) precedence: p → qUr: ‘p, then q, and after q, r’;
(vii) causality: Fp → Fq: if at some time p, then at some time q.
We may have other temporal operators. Examples are operators R and W.
Definition 5.8 (Operators R and W) W (called Weak Until) weakens the condition φUψ: if always φ, then ψ need not happen; the rendering is: φWψ is equivalent to Gφ ∨ φUψ. R (called Release) is defined by: φRψ if and only if ¬(¬φU¬ψ). The reader will please disentangle the meaning. [Hint: a plausible meaning is Gψ ∨ ψU(φ ∧ ψ).]
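The satisfaction relation of Definition 5.5 can be executed on ultimately periodic words, i.e., words of the form prefix · loop^ω. The following is a minimal sketch only: the tuple encoding of formulae and the names evaluate and holds are assumptions made for this illustration, and the lights are assumed to cycle in the order green, yellow, red. The operator U is computed as a least fixed point of the unfolding ψ ∨ (φ ∧ X(φUψ)); F, G, W and R are reduced to U as in Theorem 5.1 and Definition 5.8.

```python
# Formulae are nested tuples (an encoding assumed for this sketch only):
# ("true",), ("ap", name), ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g),
# ("X", f), ("U", f, g), ("F", f), ("G", f), ("W", f, g), ("R", f, g)

def evaluate(formula, prefix, loop):
    """Truth values of `formula` at every position of the word prefix + loop^omega.

    `prefix` and `loop` are lists of sets of atomic propositions (the values f(s_i));
    `loop` must be non-empty.
    """
    states = list(prefix) + list(loop)
    n = len(states)

    def succ(i):
        # the last position of the loop steps back to the first loop position
        return i + 1 if i + 1 < n else len(prefix)

    def ev(f):
        op = f[0]
        if op == "true":
            return [True] * n
        if op == "ap":
            return [f[1] in s for s in states]
        if op == "not":
            return [not a for a in ev(f[1])]
        if op == "and":
            return [a and b for a, b in zip(ev(f[1]), ev(f[2]))]
        if op == "or":
            return [a or b for a, b in zip(ev(f[1]), ev(f[2]))]
        if op == "imp":
            return [(not a) or b for a, b in zip(ev(f[1]), ev(f[2]))]
        if op == "X":
            a = ev(f[1])
            return [a[succ(i)] for i in range(n)]
        if op == "U":
            a, b = ev(f[1]), ev(f[2])
            sat = list(b)            # positions where the second argument holds
            changed = True
            while changed:           # least fixed point of b or (a and X sat)
                changed = False
                for i in range(n):
                    if not sat[i] and a[i] and sat[succ(i)]:
                        sat[i] = True
                        changed = True
            return sat
        if op == "F":
            return ev(("U", ("true",), f[1]))
        if op == "G":
            return ev(("not", ("F", ("not", f[1]))))
        if op == "W":
            return ev(("or", ("G", f[1]), ("U", f[1], f[2])))
        if op == "R":
            return ev(("not", ("U", ("not", f[1]), ("not", f[2]))))
        raise ValueError(f"unknown operator: {op}")

    return ev(formula)


def holds(formula, prefix, loop):
    """Truth of `formula` at the initial state s_0."""
    return evaluate(formula, prefix, loop)[0]


green, yellow, red = ("ap", "green"), ("ap", "yellow"), ("ap", "red")
lights = [{"green"}, {"yellow"}, {"red"}]      # the loop; the prefix is empty

print(holds(("G", ("F", red)), [], lights))
print(holds(("imp", yellow, ("F", red)), [], lights))
print(holds(("G", ("imp", yellow, ("X", red))), [], lights))
```

Such an evaluator only tests a formula on one periodic word at a time; it can refute a candidate validity by a counterexample, but it does not prove validity.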
5.3 Computational Tree Logic (CTL)
Branching time logics model more sophisticated approaches to time in which one allows that time may ramify in various directions, e.g., when at a given state one has a non-deterministic choice of progression along distinct paths. Such cases are modelled by transition systems. We recall here the notion of a transition system.
Definition 5.9 (Transition systems) A transition system is a tuple T = (S, AP, →_T, I, L), where
(i) S is a set of states which may be finite or countably infinite;
(ii) AP is a set of atomic propositions;
(iii) →_T is a transition relation which maps a given state s to a set S(s) of successors of s, i.e., →_T : S → 2^S;
(iv) I is a set of initial states; all actions of a transition system begin in one of the initial states;
(v) L is an assignment of sets of atomic propositions to states, i.e., L : S → 2^AP.
The choice of L determines the logical language L(s) at each particular state s. L may also be called a valuation or a labelling. A transition system is actually a directed graph which may possess various properties of directed graphs, e.g., be acyclic or cyclic, be a tree or a forest. A transition system T is rooted when it has a unique initial state, and it is tree-like when it is a tree. A transition system is total when the transition relation →_T is serial, i.e., each state s has a successor s′ ∈ →_T(s). We assume that transition systems considered here are total and rooted.
Definition 5.10 (Paths, traces) A walk in a transition system T is a sequence of states (s_q)_{q∈Q} over a well-ordered set Q of indices such that each s_q other than the last one is followed in the sequence by a unique successor s_{q′}, where s_{q′} ∈ →_T(s_q). A walk may be infinite or finite; it is initial when its first state is the initial state and it is maximal when it is not extendable to a proper super-walk. An infinite walk is called a path. For each state s, we denote by the symbol Path(s) the set of paths which begin with s. Then s is the root for the fragment of the transition system defined by Path(s). For each π ∈ Path(s), s is denoted as first(π). For each path π, the trace Trace(π) is the sequence (L(s))_{s∈π} (cf. Mazurkiewicz [6]).
We now introduce CTL: Computational Tree Logic, which belongs to the family of Branching Time Logics, whose structures allow each state to begin distinct paths, i.e., time ramifies in those models. This calls for another type of structures than the linear structures we have met in the case of LTL.
Definition 5.11 (Syntax of CTL) Formulae of CTL are divided into two categories: state formulae and path formulae. State formulae describe the statuses of states and path formulae describe the behavior of paths.
In addition to the temporal operators X, U, F, G known from LTL, we find in CTL two path operators: A meaning ‘for all paths’ and E meaning ‘there exists a path’.
State formulae
State formulae are denoted as φ_s, ψ_s, ..., and the subscript s means ‘at state s’.
(i) for each state s: each p ∈ AP induces the state formula p_s;
(ii) for each state s: falsum ⊥ induces the state formula ⊥_s;
(iii) if φ_s and ψ_s are state formulae, then φ_s ⊃ ψ_s is a state formula;
(iv) if φ_s is a state formula, then ¬φ_s is a state formula;
(v) if α_p is a path formula, then Aα_p is a state formula;
(vi) if α_p is a path formula, then Eα_p is a state formula.
Path formulae
Path formulae are denoted as φ_p, etc., and the subscript p means ‘for path p’.
(vii) if φ_s is a state formula, then Xφ_s is a path formula;
(viii) if φ_s, ψ_s are state formulae, then φ_s U ψ_s is a path formula.
From the above rules it follows that there is an interplay between state and path formulae: path formulae prefixed with either A or E become state formulae, and state formulae prefixed with temporal operators become path formulae. Let us also observe that CTL does not allow for blocks of temporal operators known from LTL: in CTL, each temporal operator must be prefixed with either A or E. Clearly, we have a greater stock of formulae in CTL than listed in Definition 5.11(i)–(viii).
Theorem 5.3 If φ_s, ψ_s are state formulae, then (φ ∨ ψ)_s, (φ ∧ ψ)_s, (φ ≡ ψ)_s, ⊤_s are state formulae, and AFφ_s, AGφ_s, EFφ_s, EGφ_s are state formulae as well.
Proof The part for sentential connectives follows from laws of sentential logic: (p ∨ q) ≡ (¬p ⊃ q) and (p ∧ q) ≡ ¬(¬p ∨ ¬q), and from the duality ⊤_s ≡ ¬⊥_s. For the temporal part, AFφ_s is defined as A(⊤_s U φ_s), EFφ_s is defined as E(⊤_s U φ_s); AGφ_s is defined as ¬EF¬φ_s, EGφ_s is defined as ¬AF¬φ_s.
For instance, AF(p_s ∨ q_s) ∧ EG(p_s ⊃ q_s) is a well-formed state formula of CTL (we do not discuss its validity as yet). Now comes the time for the semantics of CTL. We assume a rooted and total transition system T = (S, AP, →_T, I, L).
Definition 5.12 (Semantics of CTL) For a path π, we denote by the symbol π[i] the i-th state in π. By the symbol T, s |= φ, we denote the fact that the state s in the transition system T satisfies the formula φ, and the analogous symbol T, π |= φ will denote satisfaction by a path π. The notation for a path is π : π[0], π[1], ..., π[i], ..., where π[0] is first(π).
Satisfaction relation for state formulae
(i) T, s |= p if and only if p ∈ L(s);
(ii) T, s |= ⊤_s;
(iii) T, s |= ¬φ_s if and only if it is not true that T, s |= φ_s;
(iv) T, s |= φ_s ⊃ ψ_s if and only if T, s |= ¬φ_s or T, s |= ψ_s;
(v) T, s |= Eφ_p if and only if there exists a path π ∈ Path(s) such that T, π |= φ_p;
(vi) T, s |= Aφ_p if and only if for each path π ∈ Path(s), T, π |= φ_p.
Satisfaction relation for path formulae
(vii) T, π |= Xφ_s if and only if T, π[1] |= φ_s;
(viii) T, π |= φ_s U ψ_s if and only if there exists i ≥ 0 such that T, π[i] |= ψ_s and T, π[j] |= φ_s for each 0 ≤ j < i;
(ix) T, s |= AXφ_s if and only if for each π ∈ Path(s), T, π[1] |= φ_s;
(x) T, s |= EFφ_s if and only if there exist π ∈ Path(s) and j ≥ 0 such that T, π[j] |= φ_s;
(xi) T, s |= AGφ_s if and only if for each π ∈ Path(s) and for each j ≥ 0, T, π[j] |= φ_s.
Definition 5.13 (Satisfaction relation for transition systems) Semantics for CTL also encompasses global satisfaction by transition systems. For a transition system T, we let
(i) T |= φ_s if and only if T, s |= φ_s for each initial s ∈ I; under our assumptions, there is a unique such s, which we may denote root, so T |= φ_s if and only if T, root |= φ_s;
(ii) T |= AXφ_s if and only if for each path π ∈ Path(root), T, π[1] |= φ_s;
(iii) T |= A(φ_s U ψ_s) if and only if for each path π ∈ Path(root), there exists i_π ≥ 0 such that T, π[i_π] |= ψ_s and T, π[j] |= φ_s for each 0 ≤ j < i_π.
Counterparts to (ii), (iii) with A replaced by E define T |= EXφ_s and T |= E(φ_s U ψ_s) when we replace the phrase ‘for each π ∈ Path(root)’ with the phrase ‘for some π ∈ Path(root)’.
Example 5.2 We list some additional CTL-valid formulae:
(i) E(φUψ) ≡ [ψ ∨ (φ ∧ EXE(φUψ))];
(ii) A(φUψ) ≡ [ψ ∨ (φ ∧ AXA(φUψ))];
(iii) AG((φ ∨ EXψ) ⊃ ψ) ⊃ (EFφ ⊃ ψ);
(iv) AG((φ ∨ AXψ) ⊃ ψ) ⊃ (AFφ ⊃ ψ);
(v) A(φ ∧ ψ) ⊃ (Aφ) ∧ (Aψ);
(vi) E(φ ∨ ψ) ≡ (Eφ) ∨ (Eψ);
(vii) AX(φ ⊃ ψ) ⊃ (AXφ ⊃ AXψ);
(ix) AGφ ≡ φ ∧ AXAGφ;
(x) EGφ ≡ φ ∧ EXEGφ;
(xi) A(φUψ) ≡ ¬[EG¬ψ ∨ E(¬ψ U (¬ψ ∧ ¬φ))].
Formula (xi) may seem hard to unveil, but this is only the first impression. By the equivalence ¬EG¬ ≡ AF, it reads (xi’): A(φUψ) ≡ AFψ ∧ ¬E(¬ψ U (¬ψ ∧ ¬φ)). The term AFψ asserts that ψ appears on each path at some state; the term ¬E(¬ψ U (¬ψ ∧ ¬φ)) asserts that on no path the formula ¬ψ U (¬ψ ∧ ¬φ) holds. Consider an arbitrary path π. Let s_i be the first state on π at which ψ holds; it exists by AFψ. Then, at all preceding states, ¬ψ holds, hence φ holds at each of them, as otherwise π would satisfy ¬ψ U (¬ψ ∧ ¬φ). This witnesses A(φUψ). The converse is equally simple.
5.4 Full Computational Tree Logic (CTL*)
CTL* subsumes both LTL and CTL: it applies operators A and E, which are in CTL but not in LTL, and it allows for blocks of temporal operators, as in LTL but not in CTL. As with CTL, formulae of CTL* are divided into state formulae and path formulae.
Definition 5.14 (Syntax of CTL*) We give separate rules for state and path formulae.
State formulae
(i) for each p ∈ AP, p_s is a state formula;
(ii) ⊥_s is a state formula;
(iii) if φ_s, ψ_s are state formulae, then φ_s ⊃ ψ_s is a state formula;
(iv) if φ_s is a state formula, then ¬φ_s is a state formula;
(v) if φ_p is a path formula, then Eφ_p is a state formula.
Path formulae
(vi) each state formula is a path formula;
(vii) if φ_p, ψ_p are path formulae, then φ_p ⊃ ψ_p is a path formula;
(viii) if φ_p is a path formula, then ¬φ_p is a path formula;
(ix) if φ_p is a path formula, then Xφ_p is a path formula;
(x) if φ_p, ψ_p are path formulae, then φ_p U ψ_p is a path formula.
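The difference between the two syntaxes can be made concrete by a pair of mutually recursive recognizers. The following is a minimal sketch: the tuple encoding and the function names are assumptions of this illustration, only the primitive operators X and U are handled, and the quantifier A is accepted in CTL* as the abbreviation ¬E¬.

```python
# Formulae are nested tuples (an encoding assumed for this sketch only):
# ("ap", name), ("bot",), ("imp", f, g), ("not", f), ("E", f), ("A", f), ("X", f), ("U", f, g)

def is_ctl_state(f):
    """State formulae of CTL: rules (i)-(vi) of the CTL syntax."""
    op = f[0]
    if op in ("ap", "bot"):
        return True
    if op == "not":
        return is_ctl_state(f[1])
    if op == "imp":
        return is_ctl_state(f[1]) and is_ctl_state(f[2])
    if op in ("A", "E"):
        return is_ctl_path(f[1])
    return False

def is_ctl_path(f):
    """Path formulae of CTL: a single X or U applied to state formulae."""
    op = f[0]
    if op == "X":
        return is_ctl_state(f[1])
    if op == "U":
        return is_ctl_state(f[1]) and is_ctl_state(f[2])
    return False

def is_ctlstar_state(f):
    """State formulae of CTL*: rules (i)-(v), with A read as not-E-not."""
    op = f[0]
    if op in ("ap", "bot"):
        return True
    if op == "not":
        return is_ctlstar_state(f[1])
    if op == "imp":
        return is_ctlstar_state(f[1]) and is_ctlstar_state(f[2])
    if op in ("A", "E"):
        return is_ctlstar_path(f[1])
    return False

def is_ctlstar_path(f):
    """Path formulae of CTL*: rules (vi)-(x); temporal operators may be nested."""
    if is_ctlstar_state(f):
        return True
    op = f[0]
    if op in ("not", "X"):
        return is_ctlstar_path(f[1])
    if op in ("imp", "U"):
        return is_ctlstar_path(f[1]) and is_ctlstar_path(f[2])
    return False

p, q = ("ap", "p"), ("ap", "q")
a_until = ("A", ("U", p, q))        # A(p U q)
e_next_next = ("E", ("X", ("X", p)))  # E(XX p): a block of temporal operators

print(is_ctl_state(a_until), is_ctlstar_state(a_until))
print(is_ctl_state(e_next_next), is_ctlstar_state(e_next_next))
```

The formula E(XXp) is rejected by the CTL recognizer, because the inner X is not prefixed with a path quantifier, and accepted by the CTL* recognizer.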
Theorem 5.4 In CTL*, Fφ_p, Gφ_p, Aφ_p, φ_s ∨ ψ_s, φ_s ∧ ψ_s are formulae.
Proof In CTL*, which allows for blocks of temporal operators along with A, E, we can define F, G as in LTL, i.e., Fφ_p is ⊤Uφ_p, Gφ_p is ¬F¬φ_p, and Aφ_p is ¬E¬φ_p. Sentential formulae follow in the standard way.
CTL* allows for formulae like AFGE(p ∨ Xq).
Definition 5.15 (Semantics of CTL*) Semantics for CTL* is defined along the lines of semantics for LTL. In particular, we recall that for a path π, the symbol π^j denotes the j-th suffix of π, i.e., the sub-path π[j], π[j+1], .... Semantics is defined for state formulae and for path formulae.
Satisfaction relations for state formulae
(i) T, s |= p if and only if p ∈ L(s);
(ii) T, s |= ¬φ_s if and only if it is not true that T, s |= φ_s;
(iii) T, s |= φ_s ⊃ ψ_s if and only if either T, s |= ¬φ_s or T, s |= ψ_s;
(iv) T, s |= Eφ_p if and only if there exists a path π ∈ Path(s) such that T, π |= φ_p.
Satisfaction conditions for φ_s ∨ ψ_s, φ_s ∧ ψ_s, Aφ_p follow from their definitions by means of (ii), (iii) and (iv).
Satisfaction relations for path formulae
(v) T, π |= φ_s if and only if T, π[0] |= φ_s;
(vi) T, π |= ¬φ_p if and only if it is not true that T, π |= φ_p;
(vii) T, π |= φ_p ⊃ ψ_p if and only if either T, π |= ¬φ_p or T, π |= ψ_p;
(viii) T, π |= Xφ_p if and only if T, π^1 |= φ_p;
(ix) T, π |= φ_p U ψ_p if and only if there exists j ≥ 0 such that T, π^j |= ψ_p and T, π^i |= φ_p for each 0 ≤ i < j.
Satisfaction conditions for F and G follow by the equivalences in Theorem 5.4. We list some CTL*-valid formulae.
Theorem 5.5 The following are valid formulae of CTL*.
(i) Aφ_s ⊃ φ_s;
(ii) AXφ ⊃ XAφ;
(iii) AGφ ⊃ GAφ;
(iv) A(φ ⊃ ψ) ⊃ (Aφ ⊃ Aψ);
(v) Aφ ⊃ AAφ;
(vi) Aφ ⊃ AEφ.
5.5 Meta-theory of Temporal Logics
We begin with the linear temporal logic LTL and its decidability, starting with preliminaries. We recall the notion of a linear Kripke structure for LTL: let σ = <s_0, s_1, ..., s_j, ...> be an infinite sequence of states with Σ the set of elements of σ. The accessibility relation is the successor relation Succ with instances of the form s_i → s_{i+1} for i ≥ 0. The set AP consists of atomic propositions. The functional space (2^AP)^Σ is the space of all assignments s_i → X_i ⊆ AP for i ≥ 0. Each element of the space (2^AP)^Σ is a mapping f : Σ → 2^AP. Hence, f(Σ) is an ω-sequence (ω-word) of subsets of AP. We denote the linear structure defined by the mapping f by the symbol M_f. We denote by σ_j the prefix <s_0, s_1, ..., s_j> and by σ^j the suffix <s_j, s_{j+1}, ...> of σ. For f ∈ (2^AP)^Σ, we denote by the symbol f^j the restriction f|Σ^j. Similarly, we define the mapping f_j as f|Σ_j.
Definition 5.16 (Periodic linear structures) A sequence σ of states is periodic if and only if there exist natural numbers k ≥ 1 and j_0 such that s_j = s_{j+k} for each j ≥ j_0. Each such number k is an ultimate period of σ, and the smallest k with this property is called the ultimate period of σ. The number j_0 is the prefix length and k is the loop length. The same definition concerns infinite words.
The set Sub(φ) of sub-formulae of a formula φ is defined by structural induction:
(i) Sub(⊥) = {⊥} and Sub(⊤) = {⊤};
(ii) for any binary operator or connective ◦, Sub(φ ◦ ψ) = Sub(φ) ∪ Sub(ψ) ∪ {φ ◦ ψ};
(iii) for any unary operator or connective ∗, Sub(∗φ) = Sub(φ) ∪ {∗φ}.
The cardinality |Sub(φ)| does not exceed the size ||φ|| of φ, defined as the number of nodes in the parse tree of φ.
We recall the negation ∼ which toggles between a formula and its negation:
(i) ∼φ is ψ if φ is ¬ψ;
(ii) ∼φ is ¬φ if φ is not of the form ¬ψ.
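The inductive definition of Sub(φ), the size ||φ|| and the toggling negation ∼ translate directly into recursive functions. The sketch below uses a tuple encoding of formulae and the function names sub, size and neg, all of which are assumptions made for this illustration.

```python
# Formulae are nested tuples (an encoding assumed for this sketch only):
# ("ap", name), ("bot",), ("top",), unary ("not", f), ("X", f),
# binary ("and", f, g), ("or", f, g), ("imp", f, g), ("U", f, g)

def sub(f):
    """The set Sub(f) of sub-formulae of f, by structural induction."""
    result = {f}
    for child in f[1:]:
        if isinstance(child, tuple):   # skip the name stored in ("ap", name)
            result |= sub(child)
    return result

def size(f):
    """The size ||f||: the number of nodes in the parse tree of f."""
    return 1 + sum(size(child) for child in f[1:] if isinstance(child, tuple))

def neg(f):
    """The toggling negation ~: strips a leading negation, adds one otherwise."""
    return f[1] if f[0] == "not" else ("not", f)

p, q = ("ap", "p"), ("ap", "q")
phi = ("U", ("and", p, q), ("X", ("and", p, q)))   # (p and q) U X(p and q)

print(len(sub(phi)), size(phi))
print(neg(("not", p)) == p, neg(neg(q)) == q)
```

In this example the sub-formula p ∧ q occurs twice in the parse tree but only once in Sub(φ), which is why |Sub(φ)| may be strictly smaller than ||φ||.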
We recall the notion of the Fischer-Ladner closure of a formula φ (Fischer, Ladner [7]).
Definition 5.17 (The Fischer-Ladner closure) The Fischer-Ladner closure of φ, denoted FLC(φ), is the smallest set which contains all sub-formulae of φ and their negations by ∼. We observe that |FLC(φ)| ≤ 2 · |Sub(φ)|. Moreover,
(i) if ψ ∈ Sub(φ), then {ψ, ∼ψ} ⊆ FLC(φ);
(ii) if ¬ψ ∈ FLC(φ), then ψ ∈ Sub(φ).
A set Γ of formulae of LTL is closed if and only if Γ = FLC(Γ) = ⋃{FLC(φ) : φ ∈ Γ}. We define the consistency notion for a set Γ of formulae of LTL.
Definition 5.18 (SL-consistency) A set Γ of formulae is SL-consistent if and only if the following requirements are met:
(i) Γ contains neither the pair ⊥, ¬⊤, nor any pair φ, ¬φ, nor any pair ψ, ∼ψ;
(ii) if ¬¬φ ∈ Γ, then φ ∈ Γ;
(iii) if φ ∧ ψ ∈ Γ, then φ ∈ Γ and ψ ∈ Γ;
(iv) if φ ∨ ψ ∈ Γ, then φ ∈ Γ or ψ ∈ Γ;
(v) if ¬(φ ∧ ψ) ∈ Γ, then ∼φ ∈ Γ or ∼ψ ∈ Γ;
(vi) if ¬(φ ∨ ψ) ∈ Γ, then ∼φ ∈ Γ and ∼ψ ∈ Γ.
In particular, FLC(φ) is SL-consistent.
Definition 5.19 (Maximal consistent sets) For a formula φ, a set Γ of formulae is maximally consistent with respect to FLC(φ), which is denoted MaxCon_φ(Γ), if the following conditions are fulfilled:
(i) Γ contains neither the pair ⊥, ¬⊤, nor any pair φ, ¬φ, nor any pair ψ, ∼ψ;
(ii) for each ψ ∈ FLC(φ), the dichotomy holds: either ψ ∈ Γ or ∼ψ ∈ Γ;
(iii) for each ¬¬ψ ∈ FLC(φ), ψ ∈ Γ if and only if ¬¬ψ ∈ Γ;
(iv) for each ψ ∧ ξ ∈ FLC(φ), ψ ∧ ξ ∈ Γ if and only if ψ ∈ Γ and ξ ∈ Γ;
(v) for each ψ ∨ ξ ∈ FLC(φ), ψ ∨ ξ ∈ Γ if and only if either ψ ∈ Γ or ξ ∈ Γ.
Clearly, the number of maximal consistent sets does not exceed 2^{|Sub(φ)|}, hence, it does not exceed 2^{|φ|}. As with the Henkin idea of treating maximal consistent sets as elements of a model, applied in Chap. 4, we define the successor relation between MaxCons.
Definition 5.20 (Transition relation between maximal consistent sets) The transition relation Γ_1 →_T Γ_2 between MaxCon(Γ_1) and MaxCon(Γ_2) holds if:
(i) if Xφ ∈ Γ_1 then φ ∈ Γ_2;
(ii) if ¬Xφ ∈ Γ_1 then ∼φ ∈ Γ_2;
(iii) if ψ_1Uψ_2 ∈ Γ_1 then either ψ_2 ∈ Γ_1 or (ψ_1 ∈ Γ_1 and ψ_1Uψ_2 ∈ Γ_2);
(iv) if ¬(ψ_1Uψ_2) ∈ Γ_1 then ∼ψ_2 ∈ Γ_1 and (∼ψ_1 ∈ Γ_1 or ¬(ψ_1Uψ_2) ∈ Γ_2).
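For small formulae, the Fischer-Ladner closure, the maximal consistent sets and the transition relation between them can be enumerated by brute force. The sketch below makes several assumptions of its own: the tuple encoding of formulae, the function names, and the reading of the first consistency condition as forbidding each of ⊥ and ¬⊤ in Γ. The enumeration over all subsets of FLC(φ) is exponential and meant for illustration only.

```python
from itertools import combinations

# Formulae are nested tuples (an encoding assumed for this sketch only):
# ("ap", name), ("bot",), ("top",), ("not", f), ("and", f, g), ("or", f, g), ("X", f), ("U", f, g)

def sub(f):
    """Sub-formulae of f."""
    return {f}.union(*(sub(c) for c in f[1:] if isinstance(c, tuple)))

def neg(f):
    """The toggling negation ~."""
    return f[1] if f[0] == "not" else ("not", f)

def flc(f):
    """Fischer-Ladner closure: sub-formulae together with their ~-negations."""
    s = sub(f)
    return s | {neg(g) for g in s}

def is_maxcon(gamma, closure):
    """Conditions (i)-(v) of maximal consistency with respect to `closure`."""
    if ("bot",) in gamma or ("not", ("top",)) in gamma:      # assumed reading of (i)
        return False
    if any(neg(g) in gamma for g in gamma):                  # no pair psi, ~psi
        return False
    for g in closure:
        if g not in gamma and neg(g) not in gamma:           # (ii) dichotomy
            return False
        if g[0] == "not" and g[1][0] == "not" and (g in gamma) != (g[1][1] in gamma):
            return False                                     # (iii)
        if g[0] == "and" and (g in gamma) != (g[1] in gamma and g[2] in gamma):
            return False                                     # (iv)
        if g[0] == "or" and (g in gamma) != (g[1] in gamma or g[2] in gamma):
            return False                                     # (v)
    return True

def maxcons(f):
    """All maximal consistent subsets of FLC(f), by exhaustive enumeration."""
    closure = sorted(flc(f), key=repr)
    return [frozenset(c) for r in range(len(closure) + 1)
            for c in combinations(closure, r) if is_maxcon(frozenset(c), closure)]

def step(g1, g2):
    """The transition relation between maximal consistent sets g1 and g2."""
    for f in g1:
        if f[0] == "X" and f[1] not in g2:
            return False
        if f[0] == "not" and f[1][0] == "X" and neg(f[1][1]) not in g2:
            return False
        if f[0] == "U" and not (f[2] in g1 or (f[1] in g1 and f in g2)):
            return False
        if f[0] == "not" and f[1][0] == "U":
            u = f[1]
            if not (neg(u[2]) in g1 and (neg(u[1]) in g1 or f in g2)):
                return False
    return True

p, q = ("ap", "p"), ("ap", "q")
phi = ("U", p, q)
sets = maxcons(phi)
g1 = frozenset({p, ("not", q), phi})          # p holds, q postponed
g2 = frozenset({("not", p), q, phi})          # q holds now
g3 = frozenset({("not", p), ("not", q), phi}) # p U q promised but impossible
print(len(sets), step(g1, g2), any(step(g3, g) for g in sets))
```

For φ = pUq the closure has six elements and there are eight maximal consistent sets, in agreement with the bound 2^{|Sub(φ)|}; the set g3 has no successor at all, since it contains pUq but neither p nor q.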
An important example of a maximal consistent set is related to computations in the structure M_f. We define the relative closure of φ modulo M_{f^i} as FLC_{M_{f^i}}(φ) = {ψ ∈ FLC(φ) : M_{f^i} |= ψ}.
Theorem 5.6 The following hold true by semantic laws of LTL and Definition 5.20. In the forthcoming case, we denote the transition →_T by the symbol →_φ.
(i) MaxCon_φ(FLC_{M_{f^i}}(φ));
(ii) for i ≥ 0, the relation FLC_{M_{f^i}}(φ) →_φ FLC_{M_{f^{i+1}}}(φ) holds.
By Theorem 5.6, the sequence S: (FLC_{M_{f^i}}(φ) : i ≥ 0) is a linear structure consisting of maximal consistent sets. Now, we state the crucial fact about small models for a formula φ of LTL (Sistla and Clarke [8]). It is obvious that, as the set FLC(φ) is finite, there exist minimal k, l such that k is the prefix length and l is the loop length, i.e., the sequence S is an ultimately periodic one. The problem consists in finding values of the prefix length and the loop length depending on φ. This problem may be resolved to the effect stated in the theorem of (Sistla, Clarke [8]) that follows.
Theorem 5.7 For each formula φ of LTL, if φ is satisfiable in M_f, i.e., M_f |= φ, then it is satisfiable in a periodic model M_{f*} with the prefix length at most 2^{|φ|} and the loop length at most |φ| · 2^{|φ|}.
Proof For the prefix of the periodic model, the number of sub-formulae of φ is bounded from above by 2^{|φ|}, hence, i = 2^{|φ|} is sufficient for the length of the prefix. The loop has to accommodate the U-formulae, whose number is bounded from above by |φ|. Any two states satisfying distinct U-formulae can be separated in the loop by all remaining sub-formulae, their number bounded from above by 2^{|φ|}, hence, the sufficient number of states in the loop is l = |φ| · 2^{|φ|}. The size of the periodic model is then bounded by 2^{|φ|} + |φ| · 2^{|φ|}.
By Theorem 5.7, if a formula φ of LTL is satisfiable, i.e., if M_f, σ[0] |= φ for some f, σ, then there exists the transition system ({Γ_j : j ≤ i + l}, →_φ) where each Γ_j is maximal consistent and φ ∈ Γ_0. Hence, φ is satisfiable in an ultimately periodic structure and LTL has the small model property. The conclusion is
Theorem 5.8 The problem of satisfiability of formulae in LTL is decidable. As a formula φ is valid if and only if ¬φ is unsatisfiable, the validity problem is decidable as well.
We address the problem of model checking for LTL. The model checking problem consists in deciding whether, for a given transition system (T, s, →_T, L) and an LTL formula φ, T, s |= φ. The idea for solving this decision problem is to apply the already existing witness for satisfiability in the form of the ultimately periodic model, adapted to the new setting of a transition system and augmented with the necessary additional ingredients taking into account states in T. Formally, one considers the existence of a sequence ((Γ_j, s_j) : j ≤ i + l) for some i and l, such that this sequence consists of a prefix and a loop of an ultimately periodic sequence. Then the following are satisfied.
(i) i ≤ |T| · 2^{|φ|} and l ≤ |T| · |φ| · 2^{|φ|};
(ii) sets Γ_j are maximal consistent;
(iii) Γ_i = Γ_{i+l};
(iv) Γ_j →_φ Γ_{j+1} holds for each j;
(v) each formula ψ : ψ_1Uψ_2 ∈ ⋃{Γ_j : j < i + l} is fulfilled before i + l;
(vi) s_0 = s and s_j →_T s_{j+1} for j < i + l;
(vii) for each atomic proposition p ∈ AP, p ∈ Γ_j if and only if p ∈ L(s_j).
The factor |T| in (i) is due to the fact that we try to find a witness that the transition system satisfies φ, hence, we explore various paths. The main result, obtained exactly along the lines of Theorem 5.7, is
Theorem 5.9 (Demri, Goranko, Lange [9]) There exists a path σ(s) beginning at s ∈ T such that σ(s) |= φ if and only if there exists a witness ((Γ_j, s_j) : j ≤ i + l) satisfying (i)–(vii) with φ ∈ Γ_0.
By Theorem 5.9, in order to check that a transition system (T, s, →_T, L) satisfies φ, one may examine ultimately periodic models ((Γ_j, s_j) : j ≤ i + l) of length bounded by |T| · (2^{|φ|} + |φ| · 2^{|φ|}).
We address the problem of complexity of the satisfiability problem SAT(LTL). In this respect, in (Sistla and Clarke [8]) the following result was obtained.
Theorem 5.10 The satisfiability problem SAT(LTL) is in PSPACE, hence, the decision problem of validity is in co-PSPACE = PSPACE.
Proof We know about the existence of a sequence Γ_0, Γ_1, ..., Γ_i, ..., Γ_{i+l} such that
(i) each Γ_j is a maximal consistent subset of FLC(φ);
(ii) Γ_j →_φ Γ_{j+1} for each j;
(iii) all formulae of the form ψUξ enter Γ_{i+l} if and only if they have been verified before i + l;
(iv) φ ∈ Γ_0;
(v) i ≤ 2^{|φ|}, l ≤ |φ| · 2^{|φ|}.
In order to control U-formulae, one keeps the record Rec(U) of U-formulae which are waiting for a validity witness and the record Rec(U)* of U-formulae already verified. As proved in (Sistla, Clarke [8]), M_f |= φ if and only if φ has a witnessing system with the sets Γ as states and →_φ as the transition relation. One can use the witnessing system to subsequently guess:
(i) i_0 ∈ [0, 2^{|φ|}];
(ii) j_0 ∈ [1, |φ| · 2^{|φ|}];
(iii) Γ_0, a maximal consistent set in FLC(φ) which contains φ;
(iv) for 0 < k < i_0, guess a maximal consistent Γ_k such that Γ_{k−1} →_φ Γ_k;
(v) let Γ* be the last set guessed for k;
(vi) for i_0 < j < i_0 + j_0, guess a maximal consistent Γ_j as in (iv) for j − 1 and register the sets Rec(U_j) = Rec(U_{j−1}) ∪ {ψ_1Uψ_2 ∈ Γ_j} and Rec(U_j)* = Rec(U_{j−1})* ∪ {ψ_1Uψ_2 ∈ Γ_j : ψ_2 ∈ Γ_j};
(vii) let Γ** be the last set Γ_j and Rec(U)* be the last Rec(U_j), with Rec(U)** the last Rec(U_j)*;
(viii) if Γ* = Γ** and Rec(U)* ⊆ Rec(U)** then accept else reject.
There are finitely many guesses; checking that the witnessing sequence consists of maximal consistent sets, checking that the transition relation satisfies its conditions, and checking inclusions can be done in space polynomial in the size of φ, hence, non-deterministic polynomial space is sufficient. By the Savitch Theorem 1.59, the problem is in PSPACE.
Satisfiability and complexity for CTL and CTL*
The satisfiability problem for CTL is known to be EXPTIME-complete (Fischer, Ladner [7]). The satisfiability problem for CTL* is known to be 2EXPTIME-complete (Vardi, Stockmeyer [10]). Model checking for CTL is in PTIME. We sketch the proof given in Markey [11].
Theorem 5.11 Model checking for CTL is in PTIME.
Proof It is known that the temporal operators EX, EU, AF define the other operators, and all formulae of CTL can thus be expressed by means of them and of sentential connectives. Consider as a model a Kripke structure M = (W, R, A) along with a pointed structure (M, w_0) for some world w_0 ∈ W. Let φ be a formula of CTL. The proof uses the labelling procedure.
(i) for each atomic proposition p and each world w, the world w is labelled with p if and only if w ∈ A(p);
(ii) for Boolean formulae: w is labelled with φ ∧ ψ, respectively with φ ∨ ψ, if and only if w is labelled with φ and ψ, respectively, w is labelled either with φ or with ψ; w is labelled with ¬φ if and only if w is not labelled with φ;
(iii) for a sub-formula EXψ of φ, each world w is labelled with EXψ if and only if there exists a world w′ such that R(w, w′) and w′ is labelled with ψ; correctness follows by the semantics of EX;
(iv) for a sub-formula E(ψUξ) of φ, we label with E(ψUξ) all worlds already labelled with ξ; next, we label with E(ψUξ) each world w labelled with ψ such that there exists a world w′ with R(w, w′) and w′ labelled with E(ψUξ); clearly, this is a monotone operation on the complete lattice of all subsets of W, so it reaches the fixed point by the Knaster-Tarski Theorem 1.2; correctness of this labelling follows by the valid formula of CTL: E(ψUξ) ≡ ξ ∨ (ψ ∧ EXE(ψUξ));
(v) for a sub-formula AFψ of φ, we label with AFψ each world w either already labelled with ψ or such that each world w′ with R(w, w′) is labelled with AFψ; correctness of this labelling follows by the valid formula of CTL: AFψ ≡ ψ ∨ AXAFψ; as in (iv), the Knaster-Tarski theorem ensures that the labelling reaches the fixed point;
(vi) M, w_0 |= φ if and only if w_0 is labelled with φ.
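The labelling procedure can be sketched as a recursive function that returns, for each sub-formula, the set of worlds labelled with it. The encoding of formulae as tuples, the function name check and the representation of R and A as dictionaries are assumptions of this sketch; the relation R is assumed to be total. The loops for EU and AF iterate the unfoldings until nothing changes, i.e., until the least fixed point is reached.

```python
# Formulae are nested tuples (an encoding assumed for this sketch only):
# ("top",), ("ap", name), ("not", f), ("and", f, g), ("or", f, g),
# ("EX", f), ("EU", f, g) for E(f U g), ("AF", f)

def check(formula, worlds, R, A):
    """Set of worlds labelled with `formula`.

    worlds: iterable of worlds; R: dict world -> set of successors (total);
    A: dict atomic proposition -> set of worlds at which it holds.
    """
    W = set(worlds)

    def label(f):
        op = f[0]
        if op == "top":
            return set(W)
        if op == "ap":
            return set(A.get(f[1], set())) & W
        if op == "not":
            return W - label(f[1])
        if op == "and":
            return label(f[1]) & label(f[2])
        if op == "or":
            return label(f[1]) | label(f[2])
        if op == "EX":
            target = label(f[1])
            return {w for w in W if R[w] & target}
        if op == "EU":
            hold, goal = label(f[1]), label(f[2])
            labelled = set(goal)
            while True:
                # worlds satisfying f[1] with some successor already labelled
                new = {w for w in hold - labelled if R[w] & labelled}
                if not new:
                    return labelled
                labelled |= new
        if op == "AF":
            labelled = set(label(f[1]))
            while True:
                # worlds all of whose successors are already labelled
                new = {w for w in W - labelled if R[w] <= labelled}
                if not new:
                    return labelled
                labelled |= new
        raise ValueError(f"unknown operator: {op}")

    return label(formula)


# A three-world example: 0 -> 1, 0 -> 2, 1 -> 1, 2 -> 0; p holds at world 1 only.
worlds = {0, 1, 2}
R = {0: {1, 2}, 1: {1}, 2: {0}}
A = {"p": {1}}
p = ("ap", "p")

print(check(("EX", p), worlds, R, A))
print(check(("EU", ("top",), p), worlds, R, A))            # EF p
print(check(("AF", p), worlds, R, A))
print(check(("not", ("AF", ("not", p))), worlds, R, A))    # EG p
```

Each fixed-point loop adds at least one world per round, so every sub-formula is processed in time polynomial in the size of the structure, which is the source of the PTIME bound.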
Table 5.1 Complexity of satisfiability decision problem
Logic        Complexity
LTL          PSPACE
CTL          EXPTIME
CTL*         2EXPTIME
LTL(U)       PSPACE
LTL(XG)      PSPACE
LTL(F)       NP
LTL(X)       NP
LTL+PAST     PSPACE
In order to conclude that model checking for CTL is PTIME-complete, one has to show its PTIME-hardness. In Markey [11] the proof is proposed by a reduction of the CIRCUIT-VALUE problem, known to be PTIME-complete under log-space reductions (Ladner [12]), to the problem of model checking for CTL. The corresponding problem for CTL* is PSPACE-complete (Emerson and Lei [13]).
Theorem 5.12 CTL* model checking is PSPACE-complete.
The idea for a proof: as for each world w in a Kripke structure, w |=_LTL φ if and only if w |=_CTL* Aφ, the idea is to apply LTL model checking along with a reduction of formulae of CTL* to those of LTL. In that respect, please see Schnoebelen [14].
Some fragments of logics have also been explored for model checking, e.g., the model checking problem for the logic L(F) is NP-complete (Sistla, Clarke [8]). We include in Table 5.1 some results from the literature.
5.6 Linear Time Logic with PAST (LTL+PAST)
The temporal logic LTL+PAST is LTL augmented with past operators which are, in a sense, mirrored LTL operators. While it is possible to consider them in the standard model of an infinite path, it is more natural to enlarge the standard path by going from the set of natural numbers N to the set of integers Z. For each state s in Z, we denote by the symbol (←, s] the set of states preceding s in the set Z ordered by the natural ordering ≤, and we denote by the symbol [s, →) the set of states following s with respect to the natural ordering ≤. We preserve temporal operators X, U, F, G with their semantics and we add new operators: Y, which mirrors X, S, which mirrors U, and operators F^{−1}, G^{−1} as past counterparts to F and G.
Definition 5.21 (Syntax of LTL+PAST) Formulae of LTL+PAST are defined as follows:
(i) falsum ⊥ is a formula;
(ii) atomic propositions of SL are formulae;
(iii) if φ is a formula, then ¬φ is a formula;
(iv) if φ, ψ are formulae, then φ ⊃ ψ is a formula;
(v) if φ, ψ are formulae, then Xφ, Yφ, φUψ, φSψ are formulae.
As we already know from the discussion of LTL, (ii), (iii), (iv) imply that formulae of sentential logic are formulae of LTL+PAST, (i) implies that verum ⊤ is defined as ¬⊥, and F^{−1}φ, G^{−1}φ are formulae of LTL+PAST, with F^{−1}φ defined as ⊤Sφ and G^{−1}φ defined as ¬F^{−1}¬φ.
Definition 5.22 (Semantics of LTL+PAST) The LTL semantics of formulae of sentential logic remains unchanged; for temporal operators, we define it in the new framework of the set Z of integers.
(i) Z, s |= Xφ if and only if Z, s + 1 |= φ;
(ii) Z, s |= φUψ if and only if there exists s′ ∈ [s, →) such that Z, s′ |= ψ and Z, s* |= φ for each s ≤ s* < s′;
(iii) Z, s |= Yφ if and only if Z, s − 1 |= φ;
(iv) Z, s |= φSψ if and only if there exists s′ ∈ (←, s] such that Z, s′ |= ψ and Z, s* |= φ for each s′ < s* ≤ s;
(v) Z, s |= F^{−1}φ if and only if there exists s′ ∈ (←, s] such that Z, s′ |= φ;
(vi) Z, s |= G^{−1}φ if and only if for each s′ ∈ (←, s], Z, s′ |= φ.
Please observe that for each state s ∈ Z, Z, s |= φ ≡ Z, s |= YXφ ≡ Z, s |= XYφ. This allows us to eliminate Y; in consequence, LTL+PAST is also denoted L(U,S,X) (Sistla, Clarke [8]); as proved therein, satisfiability and model checking for L(U,S,X) by means of periodic models are in PSPACE.
5.7 Properties of Systems, Model Checking by Means of Automata
We refer to Chap. 1, which brings notions and facts about transition systems, finite and infinite automata, and regular expressions and languages. We give here some insights into the vast area of model checking. The model checking problem consists of verifying whether a transition system T satisfies a given formula φ. The properties of systems mentioned in Definition 5.7 are expressed by LTL formulae, and those properties can be represented by automata. We give simple examples of LTL properties modeled by Büchi automata. In that case the alphabet consists of formulae of LTL.
Example 5.3 The automaton B1 in Table 5.2 specifies the formula Gp, and the language L(B1) is p^ω. This is a safety property. The accepting state is in bold. The initial state is marked with ∗.
Table 5.2 Automaton B1 (accepting state: q0)
States   p     ¬p    ⊤
q0∗      q0    q1    ∅
q1       ∅     ∅     q1

Table 5.3 Automaton B2 (accepting state: q1)
States   p     ¬p    ⊤
q0∗      q1    q0    ∅
q1       ∅     ∅     q1

Table 5.4 Transition system TS
States   Transitions   Labeling L
s0       s1            ¬on
s1       s0            on

Table 5.5 Automaton B3 (accepting state: q1)
States   ¬on   ⊤
q0       q1    q0
q1       q1    ∅
The automaton B2 in Table 5.3 specifies the formula Fp. The language L(B2) is (¬p)* p ⊤^ω. This is the property ‘eventually p’, a particular case of the liveness property. The accepting state is in boldface. The initial state is marked with ∗.
Example 5.4 (Model checking of a liveness property) We encode the property ‘infinitely many times on’ in a transition system TS in Table 5.4. We encode the complement of the property ‘infinitely many times on’, i.e., the property ‘from some time on, always ¬on’, in a Büchi automaton B3 shown in Table 5.5. The accepting state is printed in boldface. The initial state is q0. We define the synchronous product TS ⊗ B3. We recall that the signature of TS is (S, Act, →_TS, I, AP, L) and the signature of B3 is (Q, 2^AP, →_B3, Q_0, F). The synchronous product has as its components
(i) the set of states S × Q;
(ii) the transition relation → defined as the smallest of relations satisfying the following rule (28): if s_1 →_TS s_2 and q_1 →_B3 q_2 under the label L(s_2), then (s_1, q_1) → (s_2, q_2);
(iii) the set of initial states {(s_0, q) : s_0 ∈ I and there exists q_0 ∈ Q_0 such that q_0 →_B3 q under the label L(s_0)};
(iv) the set of accepting states S × F.
Table 5.6 Product TS ⊗ B3
States        (s0, q0)    (s0, q1)    (s1, q0)                  (s1, q1)
Transitions   (s1, q0)    ∅           {(s0, q1), (s0, q0)}      (s0, q1)
Let us observe that the demand that transitions in the synchronized product be possible under labels of parallel transitions of TS forces the condition that any accepting run of B3 induces a path in TS which negates the condition ‘infinitely many times on’, i.e., TS does not satisfy it. In Table 5.6 the product TS ⊗ B3 is shown. There is no infinite path in the product which meets any of the accepting states (s0, q1), (s1, q1) infinitely many times. This proves that TS satisfies the property ‘infinitely many times on’.
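The construction of Example 5.4 can be replayed mechanically: build the reachable part of the synchronous product and look for an accepting state lying on a cycle. The sketch below rests on assumptions that are not stated explicitly in the tables: s0 is taken as the initial state of TS, the empty label stands for ¬on, and B3 is read as having transitions q0 → q1 and q1 → q1 under ¬on and q0 → q0 under any label. All identifiers are hypothetical names chosen for this illustration.

```python
# Transition system TS (assumed reading of Table 5.4); the empty label means "not on".
ts_succ = {"s0": {"s1"}, "s1": {"s0"}}
label = {"s0": frozenset(), "s1": frozenset({"on"})}
ts_init = {"s0"}                      # assumption: s0 is the initial state

# Automaton B3 (assumed reading of Table 5.5): state -> list of (guard, target).
def always(lab):
    return True

def not_on(lab):
    return "on" not in lab

b3 = {"q0": [(not_on, "q1"), (always, "q0")], "q1": [(not_on, "q1")]}
b3_init = {"q0"}
b3_acc = {"q1"}

def b3_step(q, lab):
    """Successors of automaton state q under the label lab."""
    return {target for guard, target in b3[q] if guard(lab)}

def product():
    """Initial states and transitions of the reachable part of the product."""
    init = {(s, q) for s in ts_init for q0 in b3_init for q in b3_step(q0, label[s])}
    succ, todo = {}, list(init)
    while todo:
        s, q = todo.pop()
        if (s, q) in succ:
            continue
        # rule (28): move in TS and let B3 read the label of the target state
        succ[(s, q)] = {(s2, q2) for s2 in ts_succ[s] for q2 in b3_step(q, label[s2])}
        todo.extend(succ[(s, q)])
    return init, succ

def reachable_from(start, succ):
    """States reachable from `start` in one or more steps."""
    seen, todo = set(), list(succ[start])
    while todo:
        x = todo.pop()
        if x not in seen:
            seen.add(x)
            todo.extend(succ[x])
    return seen

def has_accepting_run():
    """True if some reachable accepting product state lies on a cycle."""
    init, succ = product()
    return any(q in b3_acc and (s, q) in reachable_from((s, q), succ) for (s, q) in succ)

init, succ = product()
print(sorted(init))
print(has_accepting_run())
```

Under these assumptions the reachable accepting state (s0, q1) has no successor, so no run visits an accepting state infinitely often, in line with the conclusion drawn from Table 5.6.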
5.8 Natural Deduction: Tableaux for LTL
We have met tableaux in Chaps. 2–4 for, respectively, sentential, predicate, and modal and intuitionistic logics, and now we embark on tableau satisfaction checking for temporal logics. We begin with the tableau method for LTL. Tableau construction for LTL differs markedly from the constructions discussed so far: first, LTL deals with paths; next, among its operators are operators like φUψ and ¬Gφ whose satisfaction is verified in the future. These features pose new problems for their rendering in tableaux. We begin with a list of LTL-equivalences which will be useful in tableau constructions.
Theorem 5.13 We recall the basic equivalences among operators of LTL.
(i) Gφ ≡ φ ∧ XGφ;
(ii) ¬Gφ ≡ ¬φ ∨ X¬Gφ;
(iii) φUψ ≡ ψ ∨ (φ ∧ X(φUψ));
(iv) ¬(φUψ) ≡ ¬ψ ∧ (¬φ ∨ ¬X(φUψ));
(v) X¬φ ≡ ¬Xφ.
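The equivalences of Theorem 5.13 can be sanity-checked on ultimately periodic ('lasso') words. The sketch below is a minimal Python evaluator, assuming formulae encoded as nested tuples such as ("U", p, q); the encoding and the function names are illustrative. A brute-force comparison over short lasso words is a test, not a proof.

```python
from itertools import product


def holds(f, word, loop):
    """Positions of a lasso word at which formula f holds.

    `word` is a sequence of sets of atoms; after the last position the
    word continues at index `loop`, i.e. it is word[:loop] (word[loop:])^omega.
    """
    n = len(word)
    nxt = [i + 1 if i + 1 < n else loop for i in range(n)]
    op = f[0]
    if op == "ap":
        return {i for i in range(n) if f[1] in word[i]}
    if op == "not":
        return set(range(n)) - holds(f[1], word, loop)
    if op == "and":
        return holds(f[1], word, loop) & holds(f[2], word, loop)
    if op == "or":
        return holds(f[1], word, loop) | holds(f[2], word, loop)
    if op == "X":
        sub = holds(f[1], word, loop)
        return {i for i in range(n) if nxt[i] in sub}
    if op == "G":  # greatest fixpoint of Z = sub & X Z
        result = holds(f[1], word, loop)
        while True:
            keep = {i for i in result if nxt[i] in result}
            if keep == result:
                return result
            result = keep
    if op == "U":  # least fixpoint of Z = right | (left & X Z)
        left, result = holds(f[1], word, loop), holds(f[2], word, loop)
        while True:
            new = {i for i in left if nxt[i] in result} - result
            if not new:
                return result
            result |= new
    raise ValueError(f"unknown operator {op!r}")


def equivalent(f, g, atoms=("p", "q"), length=3):
    """Compare f and g at every position of every lasso word of this length."""
    letters = [frozenset(a for a, bit in zip(atoms, bits) if bit)
               for bits in product((0, 1), repeat=len(atoms))]
    return all(holds(f, word, loop) == holds(g, word, loop)
               for word in product(letters, repeat=length)
               for loop in range(length))


p, q = ("ap", "p"), ("ap", "q")
Gp, pUq = ("G", p), ("U", p, q)
checks = {
    "i": equivalent(Gp, ("and", p, ("X", Gp))),
    "ii": equivalent(("not", Gp), ("or", ("not", p), ("X", ("not", Gp)))),
    "iii": equivalent(pUq, ("or", q, ("and", p, ("X", pUq)))),
    "iv": equivalent(("not", pUq),
                     ("and", ("not", q),
                      ("or", ("not", p), ("not", ("X", pUq))))),
    "v": equivalent(("X", ("not", p)), ("not", ("X", p))),
}
```

All five entries of `checks` come out true on the words tried, whereas a non-equivalence such as G p versus p is detected by the same routine.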
We also recall the classification of formulae of any logic into four types: two types for sentential formulae and two types for formulae of logics with operators. These types are α, β, γ, δ. Types α, γ are conjunctive, meaning that formulae of those types decompose into conjunctions; types β, δ are disjunctive, i.e., formulae of those types decompose into disjunctions. Figure 5.1 recollects sentential formulae of types α and β. The equivalences in Theorem 5.13 yield decomposition patterns of types γ (conjunctive) and δ (disjunctive); these patterns are shown in Fig. 5.2. One more type of formulae are the successor formulae Xφ and X¬φ, which propagate, respectively, as φ and ¬φ. We call these formulae type ε.
Fig. 5.1 Sentential decomposition forms: A reminder
Fig. 5.2 Decomposition forms for LTL
We recall the notion of sub-formulae of a formula. We call immediate sub-formulae of a formula φ the children of the root φ in the formation tree for φ. For example, for the formula Gφ, the immediate sub-formulae are φ and XGφ, of conjunctive type according to Theorem 5.13(i).
Definition 5.23 (Decomposition components) The set of decomposition components, dc(φ) for short, of a formula φ contains φ and all formulae which are immediate
sub-formulae of φ as well as immediate sub-formulae of each formula ψ ∈ dc(φ). For instance,
(i) dc(Gφ) = {Gφ, φ, XGφ};
(ii) dc(φUψ) = {φUψ, ψ, φ ∧ X(φUψ), φ, X(φUψ)}.
Among decomposition components are formulae which come from the disjunctive types β and δ as well as from the conjunctive types α and γ. The laws of distribution allow us to separate these formulae into sets called decomposition implicants.
Definition 5.24 (Decomposition implicants) A decomposition implicant for a formula φ is any prime implicant of the DNF form of dc(φ) from which φ is removed. We denote decomposition implicants for φ by the generic symbol ι_φ. For example, by Theorem 5.13, the decomposition implicants of the formula (φUψ) ∧ Gξ are
(i) ι_φ^1 : {φUψ, Gξ, ξ, XGξ, ψ};
(ii) ι_φ^2 : {φUψ, Gξ, ξ, XGξ, φ, X(φUψ)}.
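The two implicants of the example can be obtained by exhaustively applying the conjunctive and disjunctive decomposition patterns. The Python sketch below assumes formulae encoded as nested tuples and treats literals and X-formulae as non-decomposable. It neither removes contradictory sets nor minimises the result, so it is only a first approximation of Definition 5.24; all names are illustrative.

```python
def branches(f):
    """One-step decomposition of f as a list of alternatives.

    Each alternative is a list of components that must hold together.
    Returns None for atoms, negated atoms and X-formulae.
    """
    op = f[0]
    if op == "and":
        return [[f[1], f[2]]]
    if op == "or":
        return [[f[1]], [f[2]]]
    if op == "G":                      # G a  ==  a & X G a
        return [[f[1], ("X", f)]]
    if op == "U":                      # a U b  ==  b | (a & X(a U b))
        return [[f[2]], [f[1], ("X", f)]]
    if op == "not":
        g = f[1]
        if g[0] == "not":
            return [[g[1]]]
        if g[0] == "and":
            return [[("not", g[1])], [("not", g[2])]]
        if g[0] == "or":
            return [[("not", g[1]), ("not", g[2])]]
        if g[0] == "G":                # ~G a  ==  ~a | X ~G a
            return [[("not", g[1])], [("X", f)]]
        if g[0] == "U":                # ~(a U b) == ~b & (~a | X ~(a U b))
            return [[("not", g[2]), ("not", g[1])],
                    [("not", g[2]), ("X", f)]]
        if g[0] == "X":                # ~X a  ==  X ~a
            return [[("X", ("not", g[1]))]]
    return None


def implicants(formulas):
    """All fully expanded sets of formulae obtained from `formulas`."""
    results = set()

    def go(pending, done):
        if not pending:
            results.add(frozenset(done))
            return
        f, rest = pending[0], pending[1:]
        if f in done:
            go(rest, done)
            return
        alternatives = branches(f)
        if alternatives is None:
            go(rest, done | {f})
        else:
            for alt in alternatives:
                go(list(alt) + rest, done | {f})

    go(list(formulas), frozenset())
    return results


phi, psi, xi = ("ap", "phi"), ("ap", "psi"), ("ap", "xi")
until, always = ("U", phi, psi), ("G", xi)
root = ("and", until, always)
# Remove the root formula itself, as in Definition 5.24.
found = {s - {root} for s in implicants([root])}
```

For the formula (φUψ) ∧ Gξ, `found` contains exactly the two sets listed in the example above.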
5.9 Tableaux Construction for Linear Temporal Logic
Let us consider a simple example. Suppose φ is G p. Semantic analysis would lead to the following development of a tableau.
(i) {G p}; initial node;
(ii) {p, XG p}; decomposition implicant;
(iii) {G p}; extension to the next node.
If we want to produce a witnessing tableau for φ, then node (iii) is redundant: we treat it as an external semantic reminder about φ. In the final version, we omit it. Our tableau becomes
(iv) {p, XG p};
(v) {p, XG p}; goto (iv).
In this way we produce an infinite path validating φ. The moral is that we should first produce an initial tableau with semantic nodes like (iii), which serve for prolonging paths, and then remove them from the paths on which they occur. We call those semantic states auxiliary and denote them Aux. In the final tableau these auxiliary states are omitted: if a state is connected to an auxiliary state which in turn is connected to the next state, then we bypass the auxiliary state and connect the preceding state to the successor state. The root of a tableau is an auxiliary state which contains the initial set of formulae, and it is omitted in the final tableau as well.
Example 5.5 We show in Fig. 5.3 the initial tableau for the formula (G p) ∧ ¬(pUq). We owe some explanations pertaining to the construction. We denote
Fig. 5.3 The initial LTL-tableau
states as si for 0 ≤ i ≤ 3 and we denote auxiliary states as AUX. We make use of Theorem 5.13(i)–(v) in creating decompositions. In particular, (iv), in the form ¬(pUq) ≡ ¬q ∧ (¬p ∨ (¬q ∧ ¬X(pUq))), is responsible for the children s0, s1 of the root Aux. The loop from s3 to the second AUX arises because the auxiliary state which would be obtained from s3 is identical with this AUX. Crosses under s0 and s2 mean, as usual, that paths through these states are contradictory, hence closed: both states contain p and ¬p. We obtain the final tableau by elimination of auxiliary states: if for an auxiliary state AUX there exists a state s with a transition arrow s → AUX, then add arrows s → s′ for each state s′ for which the arrow AUX → s′ exists, and remove AUX together with all arrows into and out of it.
Fig. 5.4 The final LTL-tableau
The rules for elimination of states are:
(i) if a state s has no successor state, then remove s;
(ii) if a formula φUψ is in a state s and there is no finite sequence of successor states s = s0 → s1 → ... → sn with ψ ∈ sn, where n ≥ 0, and φ ∈ si for each i < n, then remove s from the tableau Init(φ);
(iii) remove states with contradictory sets of formulae.
When these measures are taken, what remains is the final tableau for φ.
Example 5.6 In Fig. 5.4, the final tableau for the formula (G p) ∧ ¬(pUq) is shown, obtained from the initial tableau for (G p) ∧ ¬(pUq). States s0 and s2 are removed due to inconsistencies, and the loop at s3 is added in accordance with the rule for elimination of auxiliary states.
We recall that in tableaux from Chaps. 1–3, open branches of a tableau were Hintikka sets and they were satisfiable. We need a counterpart to Hintikka sets from Chaps. 1–3; in this case these sets represent labels of paths in the tableau, hence they are called Hintikka traces. A Hintikka trace is a trace of a transition system modelled on the structure N with the successor function n ↦ n + 1.
Definition 5.25 (Hintikka traces) Let Γ be a dc-closed set of formulae, i.e., Γ = dc(Γ). A Hintikka trace H for the set Γ is a sequence (H(i) : i ∈ N) of subsets of Γ with the following properties:
(H1) each H(i) is a decomposition implicant;
(H2) if a formula φ ∈ H(i) is of type (ε), then its successor component (ψ for φ = Xψ, ¬ψ for φ = X¬ψ) is in H(i + 1), for each i ∈ N;
(H3) if φUψ ∈ H(i), then there exists j ≥ 0 such that ψ ∈ H(i + j) and φ ∈ H(i + k) for 0 ≤ k < j, for each i ∈ N;
(H4) if Gφ ∈ H(i), then φ ∈ H(i + j) for each j ≥ 0, for each i ∈ N;
(H5) if φUψ ∈ H(i), then either ψ ∈ H(i), or φ ∈ H(i) and φUψ ∈ H(i + 1), for each i ∈ N;
(H6) if ¬(φUψ) ∈ H(i), then ¬ψ ∈ H(i) and (¬φ ∈ H(i) or ¬(φUψ) ∈ H(i + 1)), for each i ∈ N;
(H7) if ¬(φUψ) ∈ H(i), then either ¬ψ ∈ H(i + j) for each j ≥ 0, or there exists j ∈ N such that ¬φ ∈ H(i + j) and ¬ψ ∈ H(i + k) for each 0 ≤ k ≤ j.
Conditions (H6) and (H7) offer two readings of ¬(φUψ) ∈ H(i), both easy to comprehend. One may see that the definition of Hintikka traces is fully consistent with the semantics of LTL, and, basically, we construct a transition system in which the labelling is given by Hintikka traces. In order to induce a Hintikka trace from a formula φ, we need, according to (H1)–(H7), to find the decomposition components of φ. Let us consider a trace σ : Σ → 2^AP, where σ and AP are components of a transition system over LTL, and, for each n, we consider the set dc(φ)_n = {ψ ∈ dc(φ) : σ[n] |= ψ}. Then, we verify that the following holds.
Theorem 5.14 The sequence (dc(φ)_n)_{n=0}^∞ is a Hintikka trace, and φ is satisfiable if and only if φ ∈ dc(φ)_n for some n.
Proof That (dc(φ)_n)_{n=0}^∞ is a Hintikka trace follows by Definition 5.25. If for some n, φ ∈ dc(φ)_n, then σ, n |= φ. Suppose now that φ is satisfiable. The existence of a Hintikka trace (H(n) : n ∈ N) with φ ∈ H(0) can be proved by structural induction: by Definition 5.25(H1), the thesis follows for a sentential formula φ, and (H2) and (H3) imply the thesis in the cases of Xφ and ψ1Uψ2.
A final tableau for a formula φ is open if and only if the tableau contains an infinite path with φ in some state. Otherwise, the tableau is closed. Let us observe that each path in the tableau is a Hintikka trace for some sub-formula of φ.
Theorem 5.15 (Tableau-completeness of LTL) LTL is tableau-complete, i.e., if a tableau for φ is open, then φ is satisfiable.
Proof (a sketch of the idea) Each path in a tableau is a Hintikka trace by Definition 5.25(H1)–(H7) and the rules for tableau construction. 
Each formula in a state of the path is satisfiable by Theorem 5.14, as satisfiability goes down each path, hence, if φ is in some state of the path, then φ is satisfiable. Tableaux for LTL go back to Wolper [15].
5.10 Tableaux for Computational Tree Logic CTL
We recall that in CTL the temporal operators X, F, U, G are prefixed with the path operators A, E and can occur in well-formed formulae only in the prefixed form, and that structures for CTL are branching-time models, e.g., trees. We recall that the basic operators of CTL are EG, EF, E(φUψ), A(φUψ), EXφ, AXφ and their negations. We recall the CTL rules for decomposition of formulae.
Theorem 5.16 The following equivalences hold for CTL:
(i) EGφ ≡ φ ∧ EXEGφ;
(ii) ¬EGφ ≡ ¬φ ∨ ¬EXEGφ;
(iii) EFφ ≡ φ ∨ EXEFφ;
(iv) ¬EFφ ≡ ¬φ ∧ ¬EXEFφ;
(v) E(φUψ) ≡ ψ ∨ (φ ∧ EXE(φUψ));
(vi) ¬E(φUψ) ≡ ¬ψ ∧ (¬φ ∨ ¬EXE(φUψ));
(vii) A(φUψ) ≡ ψ ∨ (φ ∧ AXA(φUψ));
(viii) ¬A(φUψ) ≡ ¬ψ ∧ (¬φ ∨ ¬AXA(φUψ));
(ix) EXφ, AXφ decompose into φ;
(x) ¬EXφ, ¬AXφ decompose into ¬φ.
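The unfolding equivalences can be checked on a concrete finite structure by computing the operators as fixpoints over sets of states. The following Python sketch uses a small hypothetical Kripke structure with a total transition relation; the structure, the valuation of p and q, and the function names are assumptions made for illustration only.

```python
def ex(edges, target):
    """EX: states with at least one successor in `target`."""
    return {s for s, succs in edges.items() if succs & target}


def ax(edges, target):
    """AX: states all of whose successors are in `target`."""
    return {s for s, succs in edges.items() if succs <= target}


def eg(edges, sub):
    """EG as the greatest fixpoint of Z = sub & EX Z."""
    result = set(sub)
    while True:
        keep = result & ex(edges, result)
        if keep == result:
            return result
        result = keep


def until(pre, left, right):
    """Least fixpoint of Z = right | (left & pre(Z)); pre is EX or AX."""
    result = set(right)
    while True:
        new = (left & pre(result)) - result
        if not new:
            return result
        result |= new


# Hypothetical structure: every state has a successor.
edges = {0: {1, 2}, 1: {1}, 2: {3}, 3: {0, 3}}
states = set(edges)
p, q = {0, 1, 3}, {2}

eg_p = eg(edges, p)
eu = until(lambda z: ex(edges, z), p, q)   # E(p U q)
au = until(lambda z: ax(edges, z), p, q)   # A(p U q)

checks = {
    "i": eg_p == p & ex(edges, eg_p),
    "ii": states - eg_p == (states - p) | (states - ex(edges, eg_p)),
    "v": eu == q | (p & ex(edges, eu)),
    "vi": states - eu == (states - q) & ((states - p) | (states - ex(edges, eu))),
    "vii": au == q | (p & ax(edges, au)),
    "viii": states - au == (states - q) & ((states - p) | (states - ax(edges, au))),
}
```

On this structure EG p holds in states 0, 1, 3, E(pUq) in states 0, 2, 3, and A(pUq) only in state 2; each entry of `checks` compares both sides of the corresponding equivalence as sets of states.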
Local decomposition rules for types (γ), (δ), (ε) are provided in Fig. 5.5. Basically, constructions of tableaux for formulae and sets of formulae of CTL follow the lines of the constructions for LTL, with auxiliary states, states, rules for decomposition, and elimination of auxiliary states and states.
Example 5.7 Consider η: EGφ ∧ E(φUψ). Figure 5.6 presents the initialization Init(η) of the tableau for η and Fig. 5.7 shows the final tableau for η.
Understandably, Hintikka sets for CTL are more developed. One of the main reasons is that CTL is a logic of branching time, hence structures in this case are transition systems. We consider a dc-closed set Γ of CTL formulae along with a transition system H = (S, →T, H), where H is a labelling of (S, →T). H is a Hintikka structure if the following conditions hold true.
Fig. 5.5 Decomposition forms for CTL
Fig. 5.6 The initial CTL-tableau
Fig. 5.7 The final CTL-tableau
Definition 5.26 (CTL Hintikka structure) Each Hintikka structure (S, →, H) satisfies the following conditions:
(H1) each set H(s) is a decomposition implicant;
(H2) if φ: AXψ ∈ H(s), respectively, ¬EXψ ∈ H(s), then ψ, respectively, ¬ψ, is in H(s′) for each s′ ∈ →(s), for each s ∈ S;
(H3) if φ: EXψ ∈ H(s), respectively, ¬AXψ ∈ H(s), then ψ, respectively, ¬ψ, is in H(s′) for some s′ ∈ →(s), for each s ∈ S;
(H4) if EFφ ∈ H(s), then φ ∈ H(s′) for some s′ reachable from s by the transitive closure →*;
(H5) if E(φUψ) ∈ H(s), then for some s′ with s →* s′ and some path π(s, s′) from s to s′, ψ ∈ H(s′) and φ ∈ H(s*) for each s* on the path π(s, s′) from s to the predecessor of s′;
(H6) if ¬EGφ ∈ H(s), then for each path π ∈ PATH(s), there exists j such that ¬φ ∈ H(π[j]);
(H7) if A(φUψ) ∈ H(s), then for each π ∈ PATH(s) there exists j such that ψ ∈ H(π[j]) and φ ∈ H(π[k]) for 0 ≤ k < j.
By (H1)–(H7) and the semantic definitions, other conditions are fulfilled by Hintikka structures. One obtains those conditions from the decomposition equivalences.
Theorem 5.17 (Tableau-completeness for CTL) As with LTL, CTL is tableau-complete; the proof goes along the lines outlined for LTL.
From our discussion of the small model property for LTL it follows that CTL also has the small model property. Tableaux for CTL go back to (Emerson, Halpern [16]). For tableaux for CTL*, please see (Reynolds [17, 18]). Model checking by means of tableaux for LTL, CTL and CTL* is discussed in (Clarke et al. [19]) and in (Demri et al. [9]). There are a number of automated tools for LTL-satisfiability verification: NuSMV2 (Cimatti et al. [20]), Aalta (Li et al. [21]), Leviathan (Bertello et al. [22]).
5.11 Non-deterministic Automata on Infinite Words
In Sect. 5.7, some Büchi automata are shown which model some properties of reactive systems. Now, we discuss in a more systematic way the relations between temporal logics and automata. We begin with a survey of automata on infinite words.
Definition 5.27 (Non-deterministic Büchi automata on words) We recall that a Büchi automaton is defined as a tuple B = (Σ, Q, I, δ, F), where Σ is a finite alphabet, Q is a finite set of states, I ⊆ Q is a set of initial states, δ ⊆ Q × Σ × Q is a transition relation, and F ⊆ Q is a set of accepting states.
We assume that δ is serial, i.e., for each pair (q, a) there exists q′ such that (q, a, q′) ∈ δ. B is deterministic if and only if for each pair (q, a) there exists a unique q′ such that (q, a, q′) ∈ δ. In this case, δ : Q × Σ → Q is a mapping, and we write δ(q, a) = q′. Otherwise, B is non-deterministic.
A run of B is any sequence $\rho = (q_i, a_i)_{i=0}^{\infty}$ such that (i) $q_0 \in I$, (ii) $(q_i, a_i, q_{i+1}) \in \delta$ for each i ≥ 0. The sequence $(a_i)_{i=0}^{\infty}$ is the label of the run ρ, denoted l(ρ). For each run ρ and each state symbol q, we define the set O(q, ρ) = {i : q_i = q}; this is the set of positions in ρ at which the state symbol is q. The set inf(ρ) is the set {q : |O(q, ρ)| = ω}. The symbol |.| denotes the cardinality of a set; ω in this case denotes the cardinality of the set of natural numbers N. As runs are infinite, their labels are infinite words over the alphabet Σ, i.e., ω-words. We denote them with symbols w, v, u, ..., sometimes primed. Non-deterministic Büchi automata on infinite words will be marked as NBAω.
Definition 5.28 (Acceptance conditions. Languages defined by automata) A run ρ is Büchi accepted if and only if inf(ρ) ∩ F ≠ ∅; equivalently, there exists q ∈ F with |O(q, ρ)| = ω. For an accepted run ρ, the label l(ρ) is a word in the language generated by NBAω; the language L(NBAω) is the set of labels of runs accepted by NBAω.
Definition 5.29 (Generalized Büchi automata) A generalized Büchi automaton differs from the standard one in that it has a finite set of sets of accepting states {F1, F2, ..., Fk} for some natural number k. The generalized Büchi acceptance condition is inf(ρ) ∩ Fi ≠ ∅ for each 1 ≤ i ≤ k.
It turns out that the generating power of generalized Büchi automata is the same as that of standard Büchi automata, but passing from a generalized to a standard Büchi automaton costs an increase in the number of states.
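For an ultimately periodic run, the set inf(ρ) is simply the set of states on the repeated part, so the acceptance conditions of Definitions 5.28 and 5.29 become finite checks. The Python sketch below assumes a run given as a finite prefix followed by a non-empty cycle of (state, letter) pairs; the example automaton, which accepts words with infinitely many b, and all names are illustrative.

```python
def is_run(delta, initial, prefix, cycle):
    """Check conditions (i) and (ii) for the run prefix . cycle^omega.

    `delta` is a set of triples (q, a, q2); `prefix` and `cycle` are
    lists of (state, letter) pairs and `cycle` must be non-empty.
    """
    steps = prefix + cycle + cycle[:1]      # close the loop once
    if steps[0][0] not in initial:
        return False
    return all((q, a, q2) in delta
               for (q, a), (q2, _) in zip(steps, steps[1:]))


def inf_states(cycle):
    """inf(rho): the states occurring infinitely often in the run."""
    return {q for q, _ in cycle}


def buchi_accepts(cycle, accepting):
    """Buchi acceptance: inf(rho) meets F."""
    return bool(inf_states(cycle) & accepting)


def generalized_buchi_accepts(cycle, families):
    """Generalized Buchi acceptance: inf(rho) meets every F_i."""
    return all(inf_states(cycle) & f for f in families)


# Hypothetical automaton over {a, b}: q1 is entered exactly on reading b.
delta = {("q0", "a", "q0"), ("q0", "b", "q1"),
         ("q1", "a", "q0"), ("q1", "b", "q1")}
initial, accepting = {"q0"}, {"q1"}

# Run on a (ba)^omega and run on a^omega.
run1 = ([("q0", "a")], [("q0", "b"), ("q1", "a")])
run2 = ([], [("q0", "a")])
```

Here `run1` is a valid run with inf(ρ) = {q0, q1} and is accepted, while `run2` is a valid run with inf(ρ) = {q0} and is rejected.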
Lemma 5.1 For a generalized Büchi non-deterministic automaton B with n states and k acceptance sets, there exists a non-deterministic Büchi automaton B∗ with n · k states such that L(B) = L(B∗).
Indeed, for B = (Σ, Q, I, δ, F1, F2, ..., Fk), define the automaton B∗ = (Σ, Q′, I′, δ′, F′), where
(i) Q′ = Q × {1, 2, ..., k};
(ii) I′ = I × {1};
(iii) δ′((q, i), a) = δ(q, a) × {i} in case q ∉ Fi, and δ′((q, i), a) = δ(q, a) × {(i mod k) + 1} in case q ∈ Fi;
(iv) F′ = F1 × {1}.
Runs of the automaton B∗ begin in the copy Q × {1} and, if successful, visit each Fi in turn, returning to F1, and so on. Runs which visit F1 × {1} infinitely often visit infinitely many times each Fj for 1 < j ≤ k. There are many other ideas for automata on infinite words, especially in terms of acceptance conditions; let us mention a few.
(i) The Rabin acceptance condition is a set {(Fi, Gi) : i = 1, 2, ..., k}, where Fi, Gi ⊆ Q for each i, and a run ρ is accepted if and only if there exists i such that inf(ρ) ∩ Fi ≠ ∅ and inf(ρ) ∩ Gi = ∅;
(ii) The Parity acceptance condition is a sequence F1 ⊆ F2 ⊆ ... ⊆ Fk = Q, and a run ρ is accepted if and only if the first i such that inf(ρ) ∩ Fi ≠ ∅ is even;
(iii) The Muller acceptance condition is a set {F1, F2, ..., Fk} of subsets of Q, and a run ρ is accepted if and only if there exists i such that inf(ρ) = Fi.
A discussion of relations among those classes of automata can be found, e.g., in Kupferman [23] in (Clarke et al. [24]) and in (Perrin, Pin [25]). We embark now on some properties of non-deterministic Büchi automata termed closure properties, as these properties concern closure of the class of those automata under union, intersection (cf. Choueka [26]) and complement operations.
Theorem 5.18 (Choueka) For non-deterministic Büchi automata B and B∗ with, respectively, n and m states, there exists a non-deterministic Büchi automaton B^∪ with n + m states such that L(B^∪) = L(B) ∪ L(B∗).
Proof Let B be (Σ, Q1, I1, δ1, F1) and B∗ be (Σ, Q2, I2, δ2, F2), where we may assume that Q1 ∩ Q2 = ∅. Then, B^∪ is defined as (Σ, Q1 ∪ Q2, I1 ∪ I2, δ, F1 ∪ F2), where δ(q, a) = δi(q, a) for q ∈ Qi, for i = 1, 2. It follows that B^∪ has an accepting run if and only if either B or B∗ has an accepting run.
A similar argument proves closure under intersection (Choueka [26]).
Theorem 5.19 (Choueka) For non-deterministic Büchi automata B and B∗ with, respectively, n and m states, there exists a non-deterministic Büchi automaton B^∩ with 2 · n · m states such that L(B^∩) = L(B) ∩ L(B∗).
Proof This time the product of two automata must be defined in a more complex way. We use the notation of Theorem 5.18. 
We let B^∩ = (Σ, Q1 × Q2 × {1, 2}, I1 × I2 × {1}, δ, F1 × Q2 × {1}), where δ((q1, q2, j), a) = δ1(q1, a) × δ2(q2, a) × {f(q1, q2, j)} for the function f defined as follows: f(q1, q2, j) = 1 if [either (j = 1) ∧ (q1 ∉ F1) or (j = 2) ∧ (q2 ∈ F2)]
and f(q1, q2, j) = 2 if [either (j = 1) ∧ (q1 ∈ F1) or (j = 2) ∧ (q2 ∉ F2)]. Thus, accepting runs of the product automaton oscillate between the two copies infinitely often.
The problem of complementation is more difficult. Its solutions involve some additional tools, including graph-theoretical tools like Ramsey's theorem. We outline a proof in (Kupferman, Vardi [27]) which exploits a visualization of automata in the form of the graph of runs (grG) defined for any automaton and each ω-word w.
Definition 5.30 (The graph grG of runs of an automaton NBAω) The graph grG for an automaton B = (Σ, Q, I, δ, F) and a word $w = (a_i)_{i=1}^{\infty}$ is defined as follows:
(i) the set of vertices is $V = \bigcup_{i=0}^{\infty} Q_i \times \{i\}$, where the sequence $(Q_i)_{i=0}^{\infty}$ is defined by induction as $Q_0 = I$, $Q_{i+1} = \bigcup_{q \in Q_i} \delta(q, a_{i+1})$;
(ii) the set of edges E is $\{((q, i), (q', i+1)) : q' \in \delta(q, a_{i+1})\}$, i.e.,
$$E \subseteq \bigcup_{i=0}^{\infty} (Q_i \times \{i\}) \times (Q_{i+1} \times \{i+1\}).$$
The accepting condition is $\bigcup_{i=0}^{\infty} F \times \{i\}$: a run $(q_i, a_{i+1})_{i=0}^{\infty}$ is accepted if and only if $q_i \in F$ for infinitely many i's. The graph grG for B and w represents all runs of B on w. It follows that acceptance by B is equivalent to acceptance by grG. We assume that our automaton has n states.
Definition 5.31 (The ranking function on grG) An accepted run must stabilize on an even rank. Hence, the dichotomy accepted/non-accepted is equivalent to the dichotomy even/odd for runs. The ranking f is odd if and only if all runs are odd. In this case, B is rejecting. It turns out that the converse is true as well. The main result in (Kupferman, Vardi [27]) establishes the effect of odd runs.
Theorem 5.20 B is rejecting if and only if all runs are odd.
Proof One direction has been observed. We have to prove that if B is rejecting then all path rankings are odd. Suppose to the contrary that there is a run ρ not having the odd ranking. Then ρ stabilizes on an even rank, hence infinitely many vertices on ρ fall into F, i.e., ρ is accepting, a contradiction.
Definition 5.32 (A level ranking) A level ranking is a function g : Q → {0, 1, ..., 2n, ⊥} such that an odd value of g(q) means q ∉ F. For two level rankings g, g′ and q, q′ ∈ Q, g′ refines g if g′(q′) ≤ g(q) whenever q′ ∈ δ(q, a) for some a. Thus, a level ranking g is constant on levels of the graph grG for the given automaton. As with ranking functions, level rankings diminish as they go to higher valued levels. The set of even values of a level ranking g is denoted par(g).
We now outline the proof in (Kupferman, Vardi [27]) of the existence of the complementary automaton B− to the given B = (Σ, Q, I, δ, F).
Theorem 5.21 (Kupferman, Vardi) For a given NBAω B with n states, there exists an NBAω B− with at most $2^{O(n \log n)}$ states such that Σ^ω \ L(B) = L(B−).
Proof Let B− = (Σ, Q′, I′, δ′, F′).
(i) Let Lrank be the set of level rankings on grG and let the variable P run over sets in 2^Q. Then Q′ = {(g, P) : g ∈ Lrank, P ∈ 2^Q};
(ii) I′ = {(g0, ∅)}, where g0(q) = 2n if q ∈ I, else g0(q) = ⊥;
(iii) if P ≠ ∅, then δ′((g, P), a) = {(g′, P′) : g′ refines g, P′ = {q′ : q′ ∈ δ(q, a) for some q ∈ P and g′(q′) is even}}; δ′((g, ∅), a) = {(g′, P′) : g′ refines g, P′ = {q′ : g′(q′) is even}};
(iv) F′ = Lrank × {∅}.
On reading the consecutive k-th symbol a_k of the input word w, the automaton B− guesses the level ranking for the k-th level of grG under the condition that it refines the ranking of the (k − 1)-st level of grG. The set P stores states with even ranks. This keeps track of those states until paths through them reach odd ranks, which empties the set P; then P is initialized anew with the states of even rank. The acceptance condition checks that there are infinitely many levels at which P is empty. For the estimate $2^{O(n \log n)}$, observe that the number of sets P is $2^n$ and the number of rankings is $n^{O(n)} = 2^{O(n \log n)}$, hence the number of states is at most $2^{O(n \log n)}$.
5.12 Decision Problems for Automata
Definition 5.33 (Three main decision problems) These decision problems for a non-deterministic Büchi automaton B are:
(i) The non-emptiness problem: whether L(B) ≠ ∅;
(ii) The non-universality problem: whether L(B) ≠ Σ^ω;
(iii) The containment problem: for B and B∗, whether L(B) ⊆ L(B∗).
Let us observe that (i) and (ii) are related in the sense that if L(B) = ∅, then L(B′) = Σ^ω for B′ in which the set of accepting states is the complement of the set of accepting states in B. Problem (iii) reduces to (i): L(B) ⊆ L(B∗) if and only if L(B) ∩ L((B∗)−) = ∅, where (B∗)− is the complementary automaton to B∗. For these reasons we discuss (i) only. We remark only that complexities are distinct in each of these three problems. The emptiness problem was decided in (Vardi, Wolper [28]) as follows.
Theorem 5.22 The emptiness problem is decidable in linear time. It is an NL-complete problem.
Proof Consider an NBAω (Σ, Q, I, δ, F). Recall the graph G with V = Q and E = {(q, q′) : ∃a ∈ Σ. q′ ∈ δ(q, a)}. There exists an accepting run if and only if there exists a path π in G beginning at some q0 ∈ I, reaching some qf ∈ F, and continuing with a loop from qf back to qf; in other words, π may be taken to be ultimately periodic, due to the finiteness of F and the infinite length of π. Hence, the emptiness problem is equivalent to a reachability problem for the graph G. Let us observe that inf(ρ) is contained in a strongly connected component (SCC) of G, i.e., a maximal strongly connected subset of V, which means that each pair of vertices in the SCC is connected by a path. Strongly connected components in a directed graph can be found in linear time O(|V| + |E|) by two depth-first searches, one on the graph and one on its transpose, see, e.g., (Cormen et al. [29]). The reachability problem solution consists in selecting an SCC1 which contains an initial state and an SCC2 which contains an accepting state, and in guessing a path from SCC1 to SCC2 in the reduced graph. This requires logarithmic space: one has to keep in memory the initial state, the accepting state, the currently guessed state, and the current number of steps, all requiring logarithmic space. Actually, the non-emptiness problem is NL-complete. The dual problem of non-universality as well as the problem of containment are decidable in exponential time and they are PSPACE-complete, cf. (Sistla, Vardi, Wolper [30]).
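The SCC-based argument in the proof can be sketched in Python as follows. The sketch assumes that the letter-free graph G is given as a dictionary mapping every state to its set of successors, and it uses the two-pass depth-first scheme (often called Kosaraju's algorithm) rather than any specific routine from the cited sources. The recursive first pass is adequate for small examples only; all names are illustrative.

```python
def sccs(graph):
    """Strongly connected components via DFS on the graph and its transpose."""
    order, seen = [], set()

    def visit(v):                      # first pass: record finishing order
        seen.add(v)
        for w in graph[v]:
            if w not in seen:
                visit(w)
        order.append(v)

    for v in graph:
        if v not in seen:
            visit(v)

    transpose = {v: set() for v in graph}
    for v, succs in graph.items():
        for w in succs:
            transpose[w].add(v)

    components, assigned = [], set()
    for v in reversed(order):          # second pass on the transpose
        if v in assigned:
            continue
        component, stack = set(), [v]
        assigned.add(v)
        while stack:
            u = stack.pop()
            component.add(u)
            for w in transpose[u]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(component)
    return components


def reachable(graph, sources):
    """All states reachable from `sources` in zero or more steps."""
    seen, stack = set(sources), list(sources)
    while stack:
        for w in graph[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_nonempty(graph, initial, accepting):
    """L(B) is non-empty iff a reachable SCC with a cycle meets F."""
    reach = reachable(graph, initial)
    for component in sccs(graph):
        has_cycle = len(component) > 1 or any(v in graph[v] for v in component)
        if has_cycle and component & accepting and component & reach:
            return True
    return False


graph1 = {"q0": {"q0", "q1"}, "q1": {"q1"}}
graph2 = {"a": {"b"}, "b": {"a", "c"}, "c": set(), "d": {"c"}}
```

With initial state q0 and accepting state q1, `graph1` has a non-empty language. In `graph2`, with initial states a, c and accepting states c, d, the only component with a cycle is {a, b}, which contains no accepting state, so the language is empty.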
5.13 Alternation. Alternating Automata on Infinite Words
Definition 5.34 (Alternation. The idea) Alternation may be viewed as an attempt at introducing regularity into non-determinism (Chandra, Kozen, Stockmeyer [31]). Consider an automaton on the set Q of states along with the collection Bool_{∨,∧}(Q) of sentential formulae over Q written using only disjunction and conjunction, i.e., formulae in Bool_{∨,∧}(Q) are monotone. The extremal cases are the disjunction $\bigvee_q q$ and the conjunction $\bigwedge_q q$. In the first case we have the fully existential case; in the second case we have the fully universal case. Between these cases there is a plethora of intermediate cases like (q1 ∧ q2) ∨ q3. In the realm of automata, the application of the transition relation δ(q, a) can be defined as the condition (q1 ∧ q2) ∨ q3, which would mean that given a word w induced at the step q, the word wa will be accepted either at q1 and q2 or at q3. Therefore, the appropriate notation for a transition will be δ(q, w) = (q1 ∧ q2) ∨ q3. A consequence of alternation introduced into an automaton is that in order to realize the idea as above, we need the structure of labeled trees.
Definition 5.35 (Labeled trees) Trees are defined and discussed in Chap. 1. An abstract tree T# over a set T is a set T# of finite sequences of elements of T such that with each sequence s the set T# contains each prefix of s, including the empty prefix ε called the root of T#. If all sequences in T# are finite, then the tree is finite in
case the upper bound of lengths of all sequences is finite, otherwise the tree contains sequences of any finite length. For a sequence s, its length is the distance from the root, and sequences of the same distance from the root form levels of the tree. A collection of sequences such that the prefix relation orders this collection into a linear order, defined as s ≤ s′ if and only if s is a prefix of s′, is called a path in the tree. In particular, {ε} is a path. A maximal path is called a branch. A successor to a path p is a path p′ such that p ≤ p′ and the length of p′ is the length of p plus 1. Therefore, p′ = px for an element x ∈ T. If p′ = px is a branch, then the element x is a leaf. Let T′ = {x ∈ T : for some paths p, p′, p′ = px}; elements of T′ along with the root are vertices of the tree T#, and the set of vertices of the tree T# is denoted V(T#). A labeling of a tree T# is a mapping L : V(T#) → A, where A is a set of symbols. The pair (T#, L) is a labeled tree.
Definition 5.36 (Alternating automata on infinite words) An alternating automaton A is defined as a tuple (Σ, Q, {q0}, δ, F); please notice the singleton {q0} as the set of initial states, corresponding to the root of a tree. The components of the tuple are: Σ and Q, which are finite sets of, respectively, alphabet symbols and states; the acceptance condition F ⊆ Q^ω; the transition function δ : Q × Σ → Bool_{∨,∧}(Q); and q0, the single initial state, which corresponds to the root of the tree T#. For a formula φ in Bool_{∨,∧}(Q) and δ(q, a) → φ, an implicant is a minimal subset of Q satisfying φ. If φ is converted into DNF, then implicants of the DNF are implicants in the current sense. For instance, consider the transition δ(q, a) → φ : [(q1 ∨ q2 ∨ q3) ∧ (q4 ∨ q5)]. The automaton has to make a non-deterministic choice from {q1, q2, q3} and a non-deterministic choice from {q4, q5}, and it has to perform two runs, one for each choice. 
The DNF of φ is $\bigvee_{i=1,2,3} (q_i \wedge q_4) \vee \bigvee_{i=1,2,3} (q_i \wedge q_5)$, which prompts the automaton to a non-deterministic choice of one of the implicants, which is equivalent to the former choice.
Definition 5.37 (Runs of alternating automata) A run of A on a word $w = (a_i)_{i=0}^{\infty} \in \Sigma^{\omega}$ consists in drawing in a tree T# a dag G representing the flow of transitions. As with trees, we will observe levels corresponding to isoclines of the same distance from the root, which is the pair (q0, 0). Vertices of G are pairs (q, l), where l is a level number. Formally:
(i) the set of vertices V(G) ⊆ Q × N;
(ii) the set of edges $E \subseteq \bigcup_{l=0}^{\infty} [(Q \times \{l\}) \times (Q \times \{l+1\})]$;
(iii) each (q, l) with l > 0 has a predecessor (q′, l − 1), i.e., ((q′, l − 1), (q, l)) ∈ E;
(iv) for each vertex (q, l) and the transition δ(q, a_l) → φ, the set of chosen states {q′ : ((q, l), (q′, l + 1)) ∈ E} satisfies φ.
A run is accepting if and only if it meets the acceptance condition F, i.e., the word w ∈ F. The language L(A) is the set of accepted words. Acceptance conditions can be distinct, as with Büchi automata:
(i) alternating Büchi acceptance: F ⊆ Q and inf(w) ∩ F ≠ ∅;
(ii) alternating generalized Büchi acceptance: a finite number of sets F1, F2, ..., Fk, with inf(w) ∩ Fi ≠ ∅ for i = 1, 2, ..., k.
Example 5.8 For our purpose in what follows it will be useful to give a small example of the graph G in Definition 5.37. For simplicity, our Σ = {a}. The transition relation δ is as follows: δ(q0, a) = (q1 ∧ q2) ∨ q3; δ(q1, a) = q2 ∨ q4; δ(q2, a) = q2 ∧ q3; δ(q3, a) = q1 ∧ q2 ∧ q4; δ(q4, a) = ⊤.
Alternating automata, due to their universal and non-deterministic components corresponding to universal and existential temporal operators, are better suited to LTL than Büchi automata. However, their expressive power is the same as that of non-deterministic Büchi automata on infinite words. The reduction of A to an NBAω B comes at the cost of an exponential blow-up (Miyano, Hayashi [32]).
Theorem 5.23 (Miyano, Hayashi) For any alternating Büchi automaton A with n states, there exists a non-deterministic Büchi automaton B with a number of states exponential in n such that L(A) = L(B).
Proof (an outline, [9, 23]) The automaton B, at each state of its run, keeps the whole corresponding level of the graph G of the run of A, and when passing to the next state, it guesses the whole next level of G. Hence, the number of all states of B is of order $2^{O(n)}$. B also has to convert the acceptance condition for A into the Büchi acceptance condition for B. This requires checking visits to accepting states of A, to make sure that each infinite path visits infinitely many of them. The device which ensures this is the splitting of the set of all states {q : (q, l) ∈ V(G)} into disjoint sets A, B, where A is the set of states still waiting to meet an accepting state. When a path through such a state meets an accepting state, that state is moved to the set B; when A = ∅, which means that all paths have met an accepting state at least once, the roles are reversed and the states from B are moved to A to wait again for an accepting state. 
Formally: for A = (Σ, Q, {q0}, δ, F), we let B = (Σ, 2^Q × 2^Q, {({q0}, ∅)}, δ′, F′ = {∅} × 2^Q), where the transition relation δ′ is defined as follows:
(i) when A = ∅, then δ′((∅, B), a) = {(X \ F, X ∩ F) : X ⊆ Q, X |= δ(q, a) for each q ∈ B};
(ii) when A ≠ ∅, then δ′((A, B), a) = {(X \ F, X′ ∪ (X ∩ F)) : X, X′ ⊆ Q, X |= δ(q, a) for each q ∈ A, X′ |= δ(q, a) for each q ∈ B}.
It follows that L(A) = L(B).
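The relation X |= δ(q, a) used above, and the implicants of Definition 5.36, are easy to compute for small state sets. The Python sketch below assumes positive Boolean formulae encoded as state names (strings), the constants True and False, or tuples ("and", ...) and ("or", ...); the brute-force enumeration of subsets is exponential and meant for illustration only.

```python
from itertools import combinations


def satisfies(x, formula):
    """X |= formula for a positive Boolean formula over state names."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):
        return formula in x
    op, *args = formula
    if op == "and":
        return all(satisfies(x, f) for f in args)
    if op == "or":
        return any(satisfies(x, f) for f in args)
    raise ValueError(f"unknown connective {op!r}")


def minimal_models(formula, universe):
    """Implicants: the minimal subsets of `universe` satisfying `formula`.

    Subsets are enumerated by increasing size, so a satisfying set is
    minimal exactly when no previously found model is contained in it.
    """
    models = []
    for size in range(len(universe) + 1):
        for subset in combinations(sorted(universe), size):
            candidate = set(subset)
            if satisfies(candidate, formula) and not any(
                    m <= candidate for m in models):
                models.append(candidate)
    return models


states = {"q1", "q2", "q3", "q4", "q5"}
# The transition from Definition 5.36: (q1 | q2 | q3) & (q4 | q5).
phi = ("and", ("or", "q1", "q2", "q3"), ("or", "q4", "q5"))
implicants_phi = minimal_models(phi, states)
# The transition delta(q0, a) from Example 5.8: (q1 & q2) | q3.
implicants_q0 = minimal_models(("or", ("and", "q1", "q2"), "q3"), states)
```

For φ the sketch returns the six two-element sets {qi, qj} with i ∈ {1, 2, 3} and j ∈ {4, 5}, in agreement with the DNF given in Definition 5.36; for δ(q0, a) it returns {q3} and {q1, q2}.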
5.14 From LTL to Büchi Automata on Infinite Words
We consider the translation from LTL to a non-deterministic Büchi automaton (Vardi, Wolper [28]). This translation employs the extended closure of a formula. We recall it here.
Definition 5.38 (The extended closure) The extended closure ECl(φ) is subject to the following conditions:
(i) φ ∈ ECl(φ);
(ii) ψ ∈ ECl(φ) if and only if ¬ψ ∈ ECl(φ);
(iii) ψ ∧ χ ∈ ECl(φ) if and only if ψ ∈ ECl(φ) and χ ∈ ECl(φ);
(iv) if Xψ ∈ ECl(φ), then ψ ∈ ECl(φ);
(v) if ψUχ ∈ ECl(φ), then ψ ∈ ECl(φ) and χ ∈ ECl(φ).
Definition 5.39 (The generalized non-deterministic Büchi automaton) For an LTL formula φ, we define the automaton B(φ) whose set of states Q is the set of all maximal consistent (MaxCon) subsets of ECl(φ). The alphabet Σ is the power set 2^AP, where AP is the set of atomic propositions. We recall that the language L(φ) is the set {π ∈ (2^AP)^ω : π |= φ}. The transition relation δ is defined as follows: for two MaxCon sets Γ, Δ and a ∈ Σ, i.e., a ⊆ AP, Δ ∈ δ(Γ, a) if and only if
(i) a = Γ ∩ AP;
(ii) if Xψ ∈ ECl(φ), then Xψ ∈ Γ if and only if ψ ∈ Δ;
(iii) if ψUχ ∈ ECl(φ), then ψUχ ∈ Γ if and only if either χ ∈ Γ or ψ ∈ Γ and ψUχ ∈ Δ.
Initial states are all MaxCon sets Γ such that φ ∈ Γ. For each formula ψUχ ∈ ECl(φ), there is an acceptance set which consists of all MaxCon sets Γ such that either χ ∈ Γ or ¬(ψUχ) ∈ Γ. As ECl(φ) is of cardinality O(|φ|), B(φ) has 2^O(|φ|) states. The number of acceptance sets is O(|φ|), and we know that the language accepted by a generalized automaton is equal to the language of an ordinary Büchi automaton. One checks that L(φ) = L(B(φ)).
5 Temporal Logics for Linear and Branching Time and Model Checking
The exponential size of B(φ) can be avoided if LTL is translated into an alternating Büchi automaton. To that end, formulae of LTL are expressed in the positive normal form, i.e., negation is applied exclusively to atomic propositions. This prompts the recalling of the Release operator ψRχ defined as ¬(¬ψU¬χ). The semantics of R is as follows: for each path π and each k ≥ 1, if π[k] does not satisfy χ, then there exists 1 ≤ i < k such that π[i] |= ψ. For instance, please check that the formula GFp is equivalent to the formula ⊥R(⊤Up).

Theorem 5.24 ([23]) For each LTL formula φ, there exists an alternating Büchi automaton Aφ with O(|φ|) states and such that L(Aφ) = L(φ).

Proof This time we apply the closure of the formula φ, which we denote Cl(φ). The closure satisfies the following conditions: (i) φ ∈ Cl(φ); (ii) for ◦ ∈ {∨, ∧, U, R}, if ψ ◦ χ ∈ Cl(φ), then ψ, χ ∈ Cl(φ); (iii) if Xψ ∈ Cl(φ), then ψ ∈ Cl(φ). The automaton Aφ is defined as follows:
(i) the alphabet Σ = 2^AP;
(ii) the set of states Q = Cl(φ);
(iii) the initial state q0 = φ;
(iv) the transition relation δ is defined for particular cases in the following way:
(a) δ(p, a) = ⊤ if p ∈ a, else ⊥;
(b) δ(¬p, a) = ¬δ(p, a);
(c) δ(ψ ∧ χ, a) = δ(ψ, a) ∧ δ(χ, a);
(d) δ(ψ ∨ χ, a) = δ(ψ, a) ∨ δ(χ, a);
(e) δ(ψUχ, a) = δ(χ, a) ∨ (δ(ψ, a) ∧ (ψUχ));
(f) δ(ψRχ, a) = δ(χ, a) ∧ (δ(ψ, a) ∨ (ψRχ));
(v) acceptance F = {ψRχ : ψRχ ∈ Cl(φ)}.
One proves by structural induction on φ that if Aφ is in state ψ, then it accepts solely all paths that satisfy ψ. It follows that L(Aφ) = L(φ).
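The clauses (a)-(f) of the transition relation are directly executable. The Python sketch below is an illustration only: formulae in positive normal form are encoded as nested tuples, states occurring in the result are wrapped as ("state", ψ), and the helpers `conj` and `disj` simplify away ⊤ and ⊥; all these names and the encoding are assumptions made for this example. The clause for X (δ(Xψ, a) = ψ) is the standard one and is added here although it is not listed among (a)-(f).

```python
def conj(x, y):
    """Conjunction of positive Boolean formulae with True/False simplified."""
    if x is False or y is False:
        return False
    if x is True:
        return y
    if y is True:
        return x
    return ("and", x, y)

def disj(x, y):
    """Disjunction of positive Boolean formulae with True/False simplified."""
    if x is True or y is True:
        return True
    if x is False:
        return y
    if y is False:
        return x
    return ("or", x, y)

def delta(f, a):
    """Transition of the alternating automaton from state f on letter a (a set of atoms)."""
    kind = f[0]
    if kind == "ap":                      # clause (a)
        return f[1] in a
    if kind == "not":                     # clause (b), negation of an atom only
        return f[1][1] not in a
    if kind == "and":                     # clause (c)
        return conj(delta(f[1], a), delta(f[2], a))
    if kind == "or":                      # clause (d)
        return disj(delta(f[1], a), delta(f[2], a))
    if kind == "X":                       # standard clause, assumed here
        return ("state", f[1])
    if kind == "U":                       # clause (e)
        return disj(delta(f[2], a), conj(delta(f[1], a), ("state", f)))
    if kind == "R":                       # clause (f)
        return conj(delta(f[2], a), disj(delta(f[1], a), ("state", f)))
    raise ValueError(f"unknown connective: {kind}")

p, q = ("ap", "p"), ("ap", "q")
until = ("U", p, q)
release = ("R", p, q)
```

For pUq, reading a letter that contains q discharges the obligation (⊤), reading a letter with p only sends the automaton back to the state pUq, and reading the empty letter leads to ⊥. For pRq, a letter with q only keeps the automaton in the state pRq, which is why R-formulae form the accepting set.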
5.15 LTL Model Checking The problem of LTL model checking consists in verification whether a given structure for LTL satisfies a given LTL formula φ. Concerning the structure, it is a Kripke structure defined for this purpose as a tuple M = (AP, W, R, I, L), where
(i) AP is the set of atomic propositions;
(ii) W is the set of worlds;
(iii) I is the set of initial worlds;
(iv) R ⊆ W × W is an accessibility relation;
(v) L : W → 2^AP is a labeling mapping. It establishes a logic at each world.
A computation by M is a sequence (L(wi)) for i = 0, 1, 2, ..., where w0 ∈ I and R(wi, wi+1) for each i ≥ 0. We denote by π the path (wi) for i = 0, 1, 2, .... Semantics of the Kripke structure is defined in Chap. 4. We keep our standing assumption about the seriality of the relation R. We recall in this new framework the result in (Sistla, Clarke [8]) already mentioned by us.
Theorem 5.25 The LTL model checking problem is PSPACE-complete.

We outline the passage from a Kripke structure M to a non-deterministic Büchi automaton B_M. The automaton B_M is defined as a tuple (2^AP, W, δ, I, F = W), where the transition relation δ emulates the accessibility relation R as follows: (i) δ(w, a) = {w′ : R(w, w′)} if a = L(w), else δ(w, a) = ∅. As all states are accepting, L(B_M) is the set L(M) of computations by M. Then, model checking consists in verification whether L(M) ⊆ L(B(φ)), where B(φ) is the Büchi automaton for φ of Sect. 5.14; this question is presented also in the form of disjointness of L(M) and the complement of L(B(φ)), i.e., the language L(B(¬φ)). The complexity of the disjointness problem is |M| · 2^O(|φ|).
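The passage from a Kripke structure to an automaton is short enough to be written down. The Python sketch below is a minimal illustration; the function name `kripke_to_buchi`, the dictionary representation of the labeling, and the two-world structure are hypothetical choices made for this example, not notation of the text.

```python
def kripke_to_buchi(W, R, I, L):
    """Return (states, initial, accepting, delta) for the automaton built from M.

    delta(w, a) is the set of R-successors of w when a equals the label L(w),
    and the empty set otherwise; every state is accepting.
    """
    successors = {w: frozenset(v for (u, v) in R if u == w) for w in W}

    def delta(w, a):
        return successors[w] if frozenset(a) == L[w] else frozenset()

    return set(W), set(I), set(W), delta

# Hypothetical structure: w0 is labelled {p}, w1 carries the empty label.
W = {"w0", "w1"}
R = {("w0", "w1"), ("w1", "w1")}
I = {"w0"}
L = {"w0": frozenset({"p"}), "w1": frozenset()}
states, initial, accepting, delta = kripke_to_buchi(W, R, I, L)
```

The only letter readable at w0 is its own label {p}; any other letter blocks the run, so the accepted words are exactly the label sequences of paths starting in I.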
5.16 Model Checking for Branching-time Logics We now outline the problem of model checking for branching-time logics. In this case, the automata are tree automata. Hence, we begin with them. Our exposition is based on (Vardi, Wilke [33]) and also (Kupferman, Vardi, Wolper [35]).

Definition 5.40 (Alternating tree automata) We consider infinite trees ordered by the successor relation of which we assume seriality, i.e., each vertex has a successor. We may assume that the tree is built as a subset of N∗, i.e., its nodes are finite sequences of natural numbers, the set of nodes is closed on prefixes, and nodes are ordered by the immediate prefix relation. We assume a star number function Δ which defines star numbers of vertices: for a vertex x, Δ(x) ∈ N is the number of successors to the vertex x. We assume that the width of the tree is bounded from above, so the set of values of Δ is finite and all levels of the tree are finite. For a tree T and a set Σ, a Σ-labeling of T is a mapping L : T → Σ which assigns to each vertex x a symbol L(x) in Σ. Usually, Σ is taken as the set 2^AP of sets of atomic propositions, and L establishes a logic at each vertex. We recall the known notions of a path in the tree and a branch of the tree. We have already met the notion of an alternating automaton and the set Bool∨,∧(Q) of sentential formulae over Q written with use of only ∨ and ∧ connectives. We denote by [m] the set of natural numbers smaller than m, i.e., [m] = {0, 1, 2, ..., m − 1}.
An alternating automaton on infinite trees is represented by a tuple A = (Σ, Q, q0, δ, Δ, F) (Thomas [36], Muller, Schupp [37]), where Σ, Q have the already established meaning, q0 is the initial state, Δ is the star number function, and F is the acceptance condition. We denote by D the maximal value of Δ. The transition function δ is defined as δ : Q × Σ × {1, 2, ..., D} → Bool∨,∧(N × Q); given a triple (q, a, n), δ(q, a, n) ∈ Bool∨,∧([n] × Q). The acceptance condition F may be the Büchi acceptance F ⊆ Q, or the already mentioned by us Rabin acceptance, or the Parity acceptance, or the modified Büchi acceptance F ⊆ Q^ω.

Definition 5.41 (Runs of alternating tree automata) For a labeled tree (T, L) and an alternating tree automaton A = (Σ, Q, q0, δ, Δ, F), a run of A is a tree (T′, L′) with nodes labeled by pairs (x, q), where x ∈ N∗ is a node of T and q ∈ Q. The initial node is labeled (ε, q0). On reading a node x in T in a state q, the automaton applies the transition rule δ(q, L(x), Δ(x)) = φ ∈ Bool∨,∧(N × Q). Then a non-deterministic choice of an implicant I of φ of the form (c1, q1) ∧ (c2, q2) ∧ ... ∧ (cn, qn) such that I |= φ yields the successors of the node (x, q): they are labeled (xci, qi) for i = 1, 2, ..., n. A run is accepting if all its infinite paths satisfy the acceptance condition. For instance, given F ⊆ Q, a path π is accepted if and only if there is q ∈ F such that infinitely many nodes in π are of the form (x, q). This is the Büchi acceptance. An automaton A accepts a tree if and only if there is a run of it accepting the tree. The language L(A) consists of all accepted trees. We now define weak alternating tree automata after (Muller, Saoudi, Schupp [38]).

Definition 5.42 (Weak alternating tree automata) A variant of A, which we denote A∗, called a weak alternating tree automaton, refines the Büchi acceptance condition by partitioning Q into pairwise disjoint subsets Q1, Q2, ..., Qn with the property that (i) Qi ⊆ F or (ii) Qi ∩ F = ∅ for each 1 ≤ i ≤ n.
In case (i), Qi is accepting, in case (ii), Qi is rejecting. On the set {Q1, Q2, ..., Qn} an ordering ≤ is imposed: if q′ occurs in δ(q, a, k), q ∈ Qi and q′ ∈ Qj, then Qj ≤ Qi. Each infinite path ultimately gets into one final Qi, which determines its acceptance or rejection.

Definition 5.43 (The Kripke structures treefied) Kripke structures, which we know from modal logics, undergo some augmentation aimed at giving them a form plausible for cooperation with automata. Therefore, the Kripke structure is a tuple K = (AP, W, R, w0, L, Δ), where W and R are the familiar set of worlds and an accessibility relation, w0 is the initial world, and L : W → 2^AP is a labeling. There is also a star number function Δ : W → N, defined and with properties as above in Definition 5.40. The structure of the tree (T, V) carried by K develops from V(ε) = w0 by recurrence: for v ∈ T such that w1, w2, ..., wn are the R-successors of the node V(v), we have vi ∈ T and V(vi) = wi for 1 ≤ i ≤ n.

Definition 5.44 (The model checking problem for branching-time logics) The model checking problem is as for LTL: given a Kripke structure K = (AP, W, R, w0, L, Δ) and a formula φ, determine whether K |= φ. The idea of this checkup is as follows: build an alternating automaton A(φ, Δ) which accepts exactly all trees with the function Δ which satisfy φ. Then build the product of treefied K with A(φ, Δ). Check whether the Kripke tree (T, V) in Definition 5.43 belongs to L(A(φ, Δ)), i.e., whether L(A(φ, Δ)) ∩ {(T, V)} is non-empty. If it is so, then K |= φ; if not, K does not satisfy φ. This procedure leads to the non-emptiness problem. We now recall the construction of the automaton A(φ, Δ) in [35].

Theorem 5.26 (CTL model checking) There exists a weak alternating automaton A(φ, Δ) whose language consists of all and only trees with the parameter Δ that satisfy the given formula φ.

Proof The construction. We recall Cl(φ), the closure of φ consisting of all subformulae of φ and their negations. By an R-formula, we mean a formula involving the Release operator R. We recall that ψRχ is equivalent to ¬(¬ψU¬χ) and its semantics is that for any path π either χ holds always or there is some state k on the path π such that π[k] satisfies ψ and χ is satisfied at each state π[i] for i ≤ k. The signature of A(φ, Δ), which we denote for short as Aφ,Δ, is (2^AP, Cl(φ), δ, Δ, φ, F), with the set of states Cl(φ) and the initial state φ, where the accepting set F consists of all elements ψRχ ∈ Cl(φ).
Other elements are:
(i) the partition of Q = Cl(φ) consists of sets {ψ} for ψ ∈ Cl(φ); the ordering on the partition is: {ψ} ≤ {χ} if and only if ψ ∈ Cl(χ);
(ii) the acceptance condition F is the set of all formulae ψRχ ∈ Cl(φ);
(iii) the transition function δ is specified for the following cases:
(a) δ(p, a, k) = ⊤ if and only if p ∈ a for p ∈ AP, otherwise δ(p, a, k) = ⊥; δ(¬p, a, k) = ¬δ(p, a, k);
(b) δ(ψ ∧ χ, a, k) = δ(ψ, a, k) ∧ δ(χ, a, k);
(c) δ(ψ ∨ χ, a, k) = δ(ψ, a, k) ∨ δ(χ, a, k);
(d) δ(AXψ, a, k) = ⋀_{c=0}^{k−1} (c, ψ);
(e) δ(EXψ, a, k) = ⋁_{c=0}^{k−1} (c, ψ);
(f) δ(AψUχ, a, k) = δ(χ, a, k) ∨ [δ(ψ, a, k) ∧ ⋀_{c=0}^{k−1} (c, AψUχ)];
(g) δ(EψUχ, a, k) = δ(χ, a, k) ∨ [δ(ψ, a, k) ∧ ⋁_{c=0}^{k−1} (c, EψUχ)];
(h) δ(AψRχ, a, k) = δ(χ, a, k) ∧ [δ(ψ, a, k) ∨ ⋀_{c=0}^{k−1} (c, AψRχ)];
(j) δ(EψRχ, a, k) = δ(χ, a, k) ∧ [δ(ψ, a, k) ∨ ⋁_{c=0}^{k−1} (c, EψRχ)].
By structural induction on φ, one proves that for each accepting run r of Aφ,Δ on the Kripke tree (T_K, V_K), the run tree (T_r, r) has the property that for every node (x, ψ) of T_r, V_K(x) |= ψ. It follows that for the initial node (ε, φ), ε satisfies φ. This means soundness of Aφ,Δ. To prove completeness, suppose that (T, V) is a tree with the spread Δ such that (T, V) |= φ. It is to be proved that Aφ,Δ accepts (T, V). An accepting run (T_r, r) begins with ε and r(ε) = (ε, φ). Throughout the run r, the property is kept that for each run node (x, ψ), V(x) |= ψ. This property is carried to successors of a node by the definition of δ, which reflects the semantics of CTL. So-called eventualities related to R, which make up the acceptance condition F, are eventually reached by infinite paths due to appropriate parts of the definition of δ. Finally, we have the theorem due to (Kupferman, Vardi, Wolper [35]).

Theorem 5.27 The language of Aφ,Δ is non-empty if and only if the Kripke tree structure with the parameter Δ in Definition 5.43 satisfies φ.

We kindly refer the reader to ([35]) for the proof.
5.17 Symbolic Model Checking: OBDD (Ordered Binary Decision Diagram) Symbolic model checking by means of OBDD is a sequel to BDD outlined in Problem 3.12. It is a form of representation of sentential formulae by means of graphs. An OBDD diagram (or, a graph), is built over leaves representing truth values 0 and 1 (sketched as boxes) and of vertices representing atomic propositions in a formula. Atomic propositions are ordered and this ordering is kept along each path from the root to a leaf. Each non-leaf vertex v is representing an atomic proposition x and it defines a sub-graph representing a sentential function in conformance with the Boole-Shannon expansion (see Chap. 3). Each vertex issues two edges, one for the 0 value and one for the value 1. Usually, cf. Bryant [39], the edge labelled 0 is sketched with a dashed line, and it is called lo(v) and the edge labelled 1 is drawn as a solid line and it is called hi(v). Then, for a vertex v representing an atomic proposition x, the formula φv defined by v is expressed by recurrence as (x ∧ φhi(v) ) ∨ ((¬x) ∧ φlo(v) ). In Fig. 5.8, we sketch an OBDD for the formula φ : (x1 ∧ x2 ∧ x3 ) ∨ ((¬x1 ) ∧ x3 ). In Fig. 5.8, we see dashed and solid lines and the ordering x1 > x2 > x3 which repeats itself on each path from the root to leaves. Implementation of OBDD requires proper representations of sets of states and transition relations as models of systems are Kripke structures K = (S, R, L), where S is a set of states, R ⊆ S × S is a
transition relation and L : S → 2^AP is an assignment of sets of atomic propositions to states.

Fig. 5.8 BDD(φ)

Definition 5.45 (OBDD representations of sets and transitions) Each subset X ⊆ S is defined by its characteristic function χ_X(s) which takes the value 1 if and only if s ∈ X, otherwise the value is 0. We assume that AP = {p1, p2, ..., pk}. An open complete description with respect to AP is a conjunction ⋀_{i=1}^{k} l_i, where each literal l_i is either p_i or ¬p_i. For each s ∈ S, we let f(s) = ⋀_{i=1}^{k} l_i(s), where l_i(s) is p_i in case p_i ∈ L(s), otherwise l_i(s) = ¬p_i. The mapping f can be extended over sets of states: for X ⊆ S, we let f(X) be ⋁_{s∈X} f(s). Then χ_X is represented by ⋁_{s∈X} f(s). We now introduce BDD operations which produce graphs for sentential formulae; the graph for a formula φ is denoted BDD(φ). These operations are (cf. Chaki, Gurfinkel [40]):
(i) ¬BDD(φ) yields BDD(¬φ);
(ii) BDD(φ) ∧ BDD(ψ) yields BDD(φ ∧ ψ);
(iii) BDD(φ) ∨ BDD(ψ) yields BDD(φ ∨ ψ);
(iv) ∃x.BDD(φ) yields BDD(∃x.φ).
The operation of quantification has been defined in Problem 3.12. BDD supports substitution; to this end, one creates a copy AP′ = {p1′, p2′, ..., pk′} of AP. For a formula φ(x_i1, x_i2, ..., x_ij), BDD(φ[x_i1, x_i2, ..., x_ij / x′_i1, x′_i2, ..., x′_ij]) represents the formula φ(x′_i1, x′_i2, ..., x′_ij). Substitution allows for definitions of quantifications, which we recall from Sect. 3.22: ∀x_i.φ is φ[x_i/1] ∧ φ[x_i/0], ∃x_i.φ is φ[x_i/1] ∨ φ[x_i/0]. Formal treatment of formulae representations is done via an API which we exemplify on the case of our formula in Fig. 5.8: (x1 ∧ x2 ∧ x3) ∨ ((¬x1) ∧ x3). We use standard notation: NOT for negation, AND for conjunction, OR for disjunction, VAR to denote an atomic proposition. Then, we can write (cf. [39]):
(1.) φ1 is VAR(1);
(2.) φ2 is VAR(2);
(3.) φ3 is VAR(3);
(4.) φ4 is AND(φ1, φ2);
(5.) φ5 is AND(φ4, φ3);
(6.) φ6 is NOT(φ1);
(7.) φ7 is AND(φ6, φ3);
(8.) φ8 is OR(φ5, φ7).
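The eight steps above can be mimicked in Python. The sketch below is an illustration under stated assumptions: VAR, AND, OR, NOT are modelled as plain function combinators, and `build_obdd` builds a reduced OBDD by exhaustive Boole-Shannon expansion with a table of unique vertices. This is not the apply algorithm of Bryant [39]; the exhaustive expansion is exponential in the number of variables and is meant for small formulae only. The names `build_obdd` and `evaluate` are hypothetical.

```python
def VAR(i):
    return lambda env: env[i]

def NOT(g):
    return lambda env: not g(env)

def AND(g, h):
    return lambda env: g(env) and h(env)

def OR(g, h):
    return lambda env: g(env) or h(env)

def build_obdd(f, order):
    """Build a reduced OBDD of f for the given variable order.

    Returns (root, nodes); nodes maps a vertex id to (variable, lo, hi),
    and the leaves are the integers 0 and 1.
    """
    unique = {}  # (variable, lo, hi) -> vertex id, so equal sub-graphs are shared
    nodes = {}

    def mk(var, lo, hi):
        if lo == hi:  # both edges lead to the same graph: the test is redundant
            return lo
        key = (var, lo, hi)
        if key not in unique:
            unique[key] = len(unique) + 2  # ids 0 and 1 are reserved for leaves
            nodes[unique[key]] = key
        return unique[key]

    def expand(i, env):
        if i == len(order):
            return int(bool(f(env)))
        x = order[i]
        lo = expand(i + 1, {**env, x: False})
        hi = expand(i + 1, {**env, x: True})
        return mk(x, lo, hi)

    return expand(0, {}), nodes

def evaluate(root, nodes, env):
    """Follow lo/hi edges from the root down to a leaf."""
    v = root
    while v not in (0, 1):
        var, lo, hi = nodes[v]
        v = hi if env[var] else lo
    return v

phi1, phi2, phi3 = VAR(1), VAR(2), VAR(3)
phi8 = OR(AND(AND(phi1, phi2), phi3), AND(NOT(phi1), phi3))
root, nodes = build_obdd(phi8, [1, 2, 3])
```

With the order x1, x2, x3 the diagram obtained here has three non-leaf vertices: on the branch x1 = 0 the test of x2 is skipped as redundant, and the vertex for x3 is shared by both branches.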
It remains to represent the transition relation R. We exploit to this end the mapping f(s) into open complete descriptions and substitutions of primed atomic propositions: for f(s), we denote by f(s)′ the formula f(s)[p1, p2, ..., pk / p1′, p2′, ..., pk′]. With this convention, for (s, t) ∈ R, we let f(s, t) = f(s) ∧ f(t)′. As with sets of states, the characteristic function of a transition relation R is given by f(R), which is ⋁_{(s,t)∈R} f(s, t). BDD(f(R)) is denoted R; similarly, BDD(f(X)) is denoted X.

Definition 5.46 (Images and Preimages) In model checking, the dominating idea is to explore the space of states by checking the set of successors as well as the set of predecessors of a given set of states. For a set X ⊆ S of states, the image of X under the transition relation R is the set Im(X, R) = {t ∈ S : ∃s ∈ X.(s, t) ∈ R}. We need one more operation of rev.prime (reverse prime), shortly, rp, which consists in substitution of p_i for p_i′. The BDD rendering of the image Im(X, R) is the formula BDDIm(X, R): (∃{p1, p2, ..., pk}.X ∧ R)^rp. The preimage is PREIm(Y, R) = {t : ∃s ∈ Y.(t, s) ∈ R}. The BDD rendering of the preimage PREIm(Y, R) is BDDPREIm(Y, R): ∃{p1′, p2′, ..., pk′}.R ∧ Y′.

Example 5.9 It is time to show examples on computing BDD objects defined in Definition 5.45 and in Definition 5.46. Please consider a Kripke structure K in Fig. 5.9.

Fig. 5.9 Kripke's structure K

Let us compute some constructs.
f(s0) is p ∧ q;
f(s1) is p;
For X = {s0, s1}, f(X) is ((p ∧ q) ∨ p) ≡ p;
For the transition relation R, f(R) is [(p ∧ q ∧ p′) ∨ (p ∧ p′)].
BDDImage for X = {s1} is BDDIm(X, R) = (∃p, q.p ∧ p ∧ p′)^rp ≡ (p′)^rp ≡ p = f(s1).
BDDPreimage for X = {s1} is BDDPREIm(X, R) = ∃p′, q′.((p ∧ q ∧ p′) ∨ (p ∧ p′)) ≡ (p ∧ q) ∨ p = f({s0, s1}); the conjunct X′ = p′ is already present in both disjuncts of f(R).
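The results of Example 5.9 can be cross-checked on explicit sets of states instead of BDDs. The Python sketch below is an illustration only: it assumes, reading off f(R), that the transitions are (s0, s1) and (s1, s1); the figure itself is not reproduced here, so this relation is an assumption, and the names `image` and `preimage` are hypothetical.

```python
def image(X, R):
    """Im(X, R): all t with (s, t) in R for some s in X."""
    return {t for (s, t) in R if s in X}

def preimage(Y, R):
    """PREIm(Y, R): all s with (s, t) in R for some t in Y."""
    return {s for (s, t) in R if t in Y}

# Transition relation assumed from f(R) = (p ∧ q ∧ p′) ∨ (p ∧ p′).
R = {("s0", "s1"), ("s1", "s1")}
successors_of_s1 = image({"s1"}, R)
predecessors_of_s1 = preimage({"s1"}, R)
```

Under this assumption, the image of {s1} is {s1} and its preimage is {s0, s1}, in agreement with the formulae f(s1) and f({s0, s1}) computed symbolically.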
Definition 5.47 (Symbolic model checking for CTL via OBDD) We know that a base for temporal CTL operators may be provided by EX, EG, EU, which proclaim the existence of a path on which X, respectively, G or U takes place. All these cases may be model checked by beginning with the set of all states of a Kripke structure and using BDDPREIm in order to find recursively preceding states from which a path is initiated on which the formula for X, or G, or U happens to be true. An exception is the operator AG expressing the safety property (always true), whose model checking requires checking all reachable states, so the checking procedure begins with the initial state s0 and progresses through all states by means of BDDIm in order to find successors to already checked states. Please recall that boldface denotes BDD formulae.

OBDD model checking of a safety property
MODEL_CHECKING_SAFETY_PROPERTY (Kripke structure K = (S, R, L), s0 ∈ S, CTL formula AGp)
S′ = BDD for the set of already explored states; N = BDD for the set of new states, i.e., successors to states in S′ which are not yet in S′.
Initialization: S′ = 0, N = {s0}.
1. while N ≠ 0 do
2. S′ := S′ ∨ N;
3. N := BDDIm(S′, R) ∧ ¬S′;
4. return (S′ ∧ ¬p) = 0.

Remarks on model checking CTL EX, EG, EU formulae (Chaki, Gurfinkel [40]). Recall that for the Kripke structure K, K |= φ if and only if K, s0 |= φ. Therefore the function CHECK_CTL(K, s0, φ) can be recorded by means of the function SAT(CTL)(K, φ): it returns the result of the test S0 ∧ ¬S = 0, where S0 is the BDD for the initial states and S = SAT(CTL)(K, φ) is the BDD for the set of states which satisfy φ. SAT(CTL)(K, φ) is a general name for specialized variants depending on the operator. The following list ([40]) contains all specializations.
(1.) if φ is p, then SAT(CTL)(K, φ) is p.
(2.) if φ is ¬ψ, then SAT(CTL)(K, φ) is ¬SAT(CTL)(K, ψ).
(3.) if φ is ψ ∧ χ, then SAT(CTL)(K, φ) is SAT(CTL)(K, ψ) ∧ SAT(CTL)(K, χ).
(4.) if φ is EXψ, then SAT(CTL)(K, φ) is EX(SAT(CTL)(K, ψ)).
(5.) if φ is EGψ, then SAT(CTL)(K, φ) is EG(SAT(CTL)(K, ψ)).
(6.) if φ is E(ψUχ), then SAT(CTL)(K, φ) is EU(SAT(CTL)(K, ψ), SAT(CTL)(K, χ)).
The function EX computes simply BDDPREIm of the set S(φ) of states which satisfy φ, i.e., it returns BDDPREIm(S(φ), R). The two remaining functions are more complex as they have to compute fixed points (see the Knaster-Tarski theorem in Chap. 1 and a discussion of fixed point logics in Chap. 8).

EG(K, S(φ))
1. S := S(φ) (S is the current set of candidate states, all satisfying φ)
2. repeat
3. S′ := S;
4. S := S(φ) ∧ BDDPREIm(S, R)
5. until S = S′
6. return S
The returned S is the greatest fixed point of the operator which intersects S(φ) with the set of predecessors.

EU(K, S1, S2)
1. S := 0
2. repeat
3. S′ := S;
4. S := S2 ∨ (S1 ∧ BDDPREIm(S, R))
5. until S = S′
6. return S

Tableaux for LTL and CTL in Sects. 5.9 and 5.10 serve as well as tools in symbolic model checking (cf. [40] for a discussion of the LTL case).
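The fixed point computations behind AG, EG and EU can be sketched with Python sets standing in for BDDs: union plays the role of ∨, intersection of ∧, set difference of ∧¬. This is an illustration under that substitution, not a BDD implementation; the function names and the four-state relation are hypothetical.

```python
def image(X, R):
    return {t for (s, t) in R if s in X}

def preimage(Y, R):
    return {s for (s, t) in R if t in Y}

def check_ag(initial, R, P):
    """Safety check AG p: explore reachable states, then test S ∧ ¬p = 0."""
    explored, new = set(), set(initial)
    while new:
        explored |= new
        new = image(explored, R) - explored
    return not (explored - set(P))

def ex(S_phi, R):
    """EX: states with some successor in S_phi."""
    return preimage(S_phi, R)

def eg(S_phi, R):
    """EG: greatest fixed point of S -> S_phi ∧ PREIm(S)."""
    S = set(S_phi)
    while True:
        nxt = set(S_phi) & preimage(S, R)
        if nxt == S:
            return S
        S = nxt

def eu(S1, S2, R):
    """E(.U.): least fixed point of S -> S2 ∨ (S1 ∧ PREIm(S))."""
    S = set()
    while True:
        nxt = set(S2) | (set(S1) & preimage(S, R))
        if nxt == S:
            return S
        S = nxt

# Hypothetical structure: 0 -> 1 -> 2 -> 2 and an isolated loop 3 -> 3.
R = {(0, 1), (1, 2), (2, 2), (3, 3)}
```

On this structure, eg({0, 1, 3}, R) shrinks in two rounds to {3}, the only state with an infinite path staying inside the set, while eu({0, 1}, {2}, R) grows from {2} to {0, 1, 2}.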
5.18 Problems Problem 5.1 (Time modalities) Due to their meanings, the pair G, F of temporal operators is equivalent to the pair L , M of modal operators. Please write the formulae (K),(T),(4),(5),(B), (D), (L(n,m)), (G(k,l,n,m)),(Bc ), (5c ) from Sect. 4.16 in the temporal context, i.e., by replacing L by G and M by F and applying temporal equivalences. Check validity of the obtained formulae and write valid temporal formulae defining temporal variants of logics K, T, S4, S5. Problem 5.2 (Discretization of continuous time) Time is often perceived as continuous, hence, its discretization makes it possible to pass to discrete time.
We consider the structure (R+ , 0, δn = [an−1 , an ). Let P be the countable set of atomic propositions and P x, then T (x1 , y) ≥ T (x, y) for every y ∈ [0, 1]; T (x, 0) = 0 and T (x, 1) = x.
6.2 T-Norms, T-Co-norms
Each T-norm T is then symmetric (T1), associative (T2), coordinate-wise increasing (T3), and obeying boundary conditions (T4). It is obvious that besides the logic 3_L with values in the set {0, 1/2, 1}, the connectives defined in Theorem 6.1 are applicable as well in the sets
{0, 1/(n−1), ..., (n−2)/(n−1), 1}, producing the n-valued logic n_L for each natural number n > 2;
Q ∩ [0, 1] = {m/n : 0 ≤ m ≤ n, n ≠ 0} for natural numbers m, n, i.e., the set of rational numbers in the unit interval [0, 1], producing the infinite-valued logic Q_L;
[0, 1], producing the infinite-valued logic [0,1]_L.
The same concerns logics induced by Gödel's and Goguen's T-norms. Each T-norm T introduces a relative pseudo-complement ⇒T (called also the residuum of T) by means of the generalization of (6.1a): x ⇒T y ≥ z if and only if T(x, z) ≤ y.

Definition 6.4 (Residual implications) For a T-norm T, the residual implication induced by T (T-residuum), denoted ⇒T, is defined by means of the duality: x ⇒T y ≥ z ≡ T(x, z) ≤ y. In case of a continuous T-norm T, the T-residuum can be defined as x ⇒T y = max{z : T(x, z) ≤ y}. For each T-norm T, the T-residuum is the semantic interpretation of the implication ⊃T.

Theorem 6.1 The following hold for residua of T-norms. In case x ≤ y, all residua take on the value 1. We consider the case when x > y.
(i) for the Łukasiewicz T-norm, the value of the residuum x ⇒L y is the value of the Łukasiewicz implication: min{1, 1 − x + y};
(ii) for the product T-norm, the value of the residuum x ⇒P y is y/x;
(iii) for the minimum T-norm, the value of the residuum x ⇒M y is y.
Residuum allows us to define the negation operator: ¬T(x) = x ⇒T 0. In Definition 6.2(ii), we have given the value of the strong disjunction ∨ as min{1, A∗(φ) + A∗(ψ)}. This function comes as dual to & via the equivalence φ ∨ ψ ≡ ¬(¬φ & ¬ψ). This duality applied to any T-norm yields the dual function S_T called T-co-norm: S_T(x, y) = 1 − T(1 − x, 1 − y).
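The duality between a T-norm and its residuum, as well as the derived negation and T-co-norm, can be checked numerically. The Python sketch below uses exact rational arithmetic on a finite grid in [0, 1]; the function names and the choice of the grid are assumptions made for this illustration, and a check on a grid is an experiment, not a proof.

```python
from fractions import Fraction

ONE, ZERO = Fraction(1), Fraction(0)

def t_lukasiewicz(x, y):
    return max(ZERO, x + y - 1)

def t_goedel(x, y):
    return min(x, y)

def t_goguen(x, y):
    return x * y

def r_lukasiewicz(x, y):
    return min(ONE, 1 - x + y)

def r_goedel(x, y):
    return ONE if x <= y else y

def r_goguen(x, y):
    return ONE if x <= y else y / x

def negation(residuum):
    """Negation induced by a residuum: x => 0."""
    return lambda x: residuum(x, ZERO)

def conorm(tnorm):
    """Dual T-co-norm: S(x, y) = 1 - T(1 - x, 1 - y)."""
    return lambda x, y: 1 - tnorm(1 - x, 1 - y)

GRID = [Fraction(i, 10) for i in range(11)]

def residuation_holds(tnorm, residuum):
    """Check T(x, z) <= y  iff  z <= (x => y) on the whole grid."""
    return all((tnorm(x, z) <= y) == (z <= residuum(x, y))
               for x in GRID for y in GRID for z in GRID)
```

For instance, negation(r_goedel) returns 1 at 0 and 0 at every positive argument, whereas negation(r_lukasiewicz) is x ↦ 1 − x; conorm(t_lukasiewicz) agrees with min{1, x + y} on the grid.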
6 Finitely and Infinitely Valued Logics
Table 6.4 T-norms, related residua and t-co-norms
T-norm                            Residuum                  t-co-norm
Łukasiewicz: max{0, x + y − 1}    min{1, 1 − x + y}         min{1, x + y}
Gödel: min{x, y}                  1 for x ≤ y, else y       max{x, y}
Goguen: x · y                     1 for x ≤ y, else y/x     x + y − x · y

Table 6.5 T-norms, related negation operators
T-norm          Negation
Łukasiewicz:    ¬x = 1 − x
Gödel:          ¬0 = 1, ¬x = 0 for x > 0
Goguen:         ¬0 = 1, ¬x = 0 for x > 0
Obviously, any T-norm can be defined in terms of its T-co-norm: T(x, y) = 1 − S_T(1 − x, 1 − y). By this formula, any co-norm S defines the corresponding T-norm T_S. Properties of co-norms are dual to properties of T-norms: in particular, any co-norm S is symmetric, associative and monotone increasing in each coordinate; the only change is in boundary conditions: S(x, 0) = x and S(x, 1) = 1. In Table 6.4, we give principal T-norms with their residua and T-co-norms, and in Table 6.5 we insert the corresponding negation operators. We may observe the drastic negation for Gödel and Goguen T-norms. As already observed by us, the semantic value of the Łukasiewicz implication is given by the residuum of the Łukasiewicz T-norm. The scheme outlined in Sect. 6.18 for the logic 3_L may be generalized to the case of an arbitrary T-norm T and its residuum ⇒T. The resulting logic is called the Basic Logic (BL-logic) in Hájek [5].
6.3 Basic Logic (BL) Before proceeding with the logic 3_L, we give an account of the minimal many-valued logic based on the above scheme, i.e., BL. As with sentential logic, semantics of many-valued logic is determined by assignments. Each assignment creates a world in which a formula can be true; the designated value for truth is 1. A formula true in all worlds is valid. A formula is satisfiable if and only if its value is 1 in at least one world, otherwise the formula is unsatisfiable. Let us observe that if a formula is valid, then its negation is unsatisfiable; with more than two truth values, the converse may fail. In BL, the implication is ⊃T whose value is the residuum of T, and strong conjunction &T is just T. The constant 0 is interpreted as 0. Other connectives are defined from ⊃T and &T by means of equivalences:
(i) ¬T φ ≡ φ ⊃T 0;
(ii) φ ∧ ψ ≡ φ &T (φ ⊃T ψ);
(iii) φ ∨ ψ ≡ [(φ ⊃T ψ) ⊃T ψ] ∧ [(ψ ⊃T φ) ⊃T φ];
(iv) for the strong disjunction ∨, dual to &T: φ ∨ ψ ≡ ¬T(¬T φ &T ¬T ψ).
Definition 6.5 (Axiomatization of BL) The axiom system for BL comes from Hájek [5]. We reproduce it along with some provable formulae of BL, referring to Hájek [5] for a more complete account. Axiom schemes for BL are the following (φ ≡ ψ is defined as (φ ⊃ ψ) & (ψ ⊃ φ)):
(A1) (φ ⊃T ψ) ⊃T ((ψ ⊃T ξ) ⊃T (φ ⊃T ξ));
(A2) φ &T ψ ⊃T φ;
(A3) φ &T ψ ⊃T ψ &T φ;
(A4) (φ &T (φ ⊃T ψ)) ⊃T (ψ &T (ψ ⊃T φ));
(A5) (φ ⊃T (ψ ⊃T ξ)) ⊃T ((φ &T ψ) ⊃T ξ);
(A6) ((φ ⊃T ψ) ⊃T ξ) ⊃T (((ψ ⊃T φ) ⊃T ξ) ⊃T ξ);
(A7) 0 ⊃T φ.
The inference rule in BL is detachment (MP).

Theorem 6.2 All axiom schemes are valid. BL is sound.

Proof It consists in checking that the value of each scheme is 1. We prove this for (A2) in the case of the Łukasiewicz T-norm. We have A∗((φ &T ψ) ⊃T φ) = min{1, 1 − max{0, A∗(φ) + A∗(ψ) − 1} + A∗(φ)}; we have two cases: if A∗(φ) + A∗(ψ) ≤ 1, then A∗(φ &T ψ ⊃T φ) = min{1, 1 + A∗(φ)} = 1, else A∗(φ &T ψ ⊃T φ) = min{1, 2 − A∗(ψ)} = 1.

Theorem 6.3 The following selected formulae are provable, hence, valid in BL.
(i) φ ⊃T φ;
(ii) φ ⊃T (ψ ⊃T φ);
(iii) (φ ⊃T (ψ ⊃T ξ)) ⊃T (ψ ⊃T (φ ⊃T ξ));
(iv) (φ &T (φ ⊃T ψ)) ⊃T ψ;
(v) φ ⊃T (ψ ⊃T φ &T ψ);
(vi) φ &T (ψ &T ξ) ≡T (φ &T ψ) &T ξ;
(vii) (φ &T ψ) ⊃T φ ∧T ψ;
(viii) φ ⊃T (φ ∨T ψ);
(ix) (φ ⊃T ψ) ∨T (ψ ⊃T φ);
(x) (φ ⊃T ψ) ⊃T (¬T ψ ⊃T ¬T φ);
(xi) φ ⊃T (¬φ ⊃T ψ);
(xii) φ ⊃T ¬¬φ;
(xiii) (φ &T ¬φ) ⊃T 0;
(xiv) ¬T(φ ∨T ψ) ≡T (¬T φ ∧T ¬T ψ);
(xv) ¬T(φ ∧T ψ) ≡T (¬T φ ∨T ¬T ψ);
(xvi) [(φ ⊃T ψ) &T (χ ⊃T ξ)] ⊃T [(φ &T χ) ⊃T (ψ &T ξ)].
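Validity of the axiom schemes can be tested mechanically for the three principal T-norms. The Python sketch below evaluates each scheme on a finite grid of rational truth values with exact arithmetic; it reads (A1) as transitivity of implication, (φ ⊃ ψ) ⊃ ((ψ ⊃ ξ) ⊃ (φ ⊃ ξ)), and adds formula (ix) of Theorem 6.3 with ∨ interpreted as maximum. The names `SEMANTICS`, `schemes`, `failures` and the grid are assumptions of this illustration, and passing the test on a grid is evidence, not a proof of validity.

```python
from fractions import Fraction
from itertools import product

ONE, ZERO = Fraction(1), Fraction(0)

# (T-norm, residuum) pairs for the three principal T-norms.
SEMANTICS = {
    "Lukasiewicz": (lambda x, y: max(ZERO, x + y - 1),
                    lambda x, y: min(ONE, 1 - x + y)),
    "Goedel": (lambda x, y: min(x, y),
               lambda x, y: ONE if x <= y else y),
    "Goguen": (lambda x, y: x * y,
               lambda x, y: ONE if x <= y else y / x),
}

def schemes(t, imp):
    """Truth value of each scheme as a function of the values p, q, r."""
    return {
        "A1": lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
        "A2": lambda p, q, r: imp(t(p, q), p),
        "A3": lambda p, q, r: imp(t(p, q), t(q, p)),
        "A4": lambda p, q, r: imp(t(p, imp(p, q)), t(q, imp(q, p))),
        "A5": lambda p, q, r: imp(imp(p, imp(q, r)), imp(t(p, q), r)),
        "A6": lambda p, q, r: imp(imp(imp(p, q), r),
                                  imp(imp(imp(q, p), r), r)),
        "A7": lambda p, q, r: imp(ZERO, p),
        "ix": lambda p, q, r: max(imp(p, q), imp(q, p)),
    }

GRID = [Fraction(i, 4) for i in range(5)]

def failures():
    """List the (semantics, scheme) pairs which take a value below 1 on the grid."""
    bad = []
    for name, (t, imp) in SEMANTICS.items():
        for label, scheme in schemes(t, imp).items():
            if any(scheme(p, q, r) != ONE
                   for p, q, r in product(GRID, repeat=3)):
                bad.append((name, label))
    return bad
```

Calling failures() is expected to return an empty list; replacing a scheme by a non-valid formula, e.g. φ ⊃ (φ & φ), makes the Łukasiewicz and Goguen semantics appear in the result.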
Theorem 6.4 (Deduction theorem for Basic Logic) If Γ ∪ {φ} ⊢BL ψ, then Γ ⊢BL φ^n ⊃T ψ for some natural number n, where φ^n is defined by recurrence as φ &T (φ^(n−1)), φ^1 being φ.

Proof Consider a proof α1, α2, ..., αn of ψ from Γ ∪ {φ}. One proves by induction that Γ ⊢BL φ^(n_j) ⊃T αj for some n_j, for j = 1, 2, ..., n. For α1 it is true as it is an axiom instance, an element of Γ, or φ. The essential case is when some αj results by detachment from already proved αk and αk ⊃T αj, and then, by the induction hypothesis, Γ ⊢BL φ^n ⊃T αk and Γ ⊢BL φ^m ⊃T (αk ⊃T αj). Then by (xvi), Γ ⊢BL φ^(n+m) ⊃T αk &T (αk ⊃T αj), which by (iv) yields Γ ⊢BL φ^(n+m) ⊃T αj. We define the constant 1 as 0 ⊃T 0.
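The exponent n in the deduction theorem cannot be dropped in general. A small numeric illustration under the Łukasiewicz semantics: from φ one derives φ & φ by Theorem 6.3(v) and two detachments, yet φ ⊃ (φ & φ) takes the value 1/2 at A∗(φ) = 1/2, so by soundness it is not provable, while φ^2 ⊃ (φ & φ) is trivially valid. The Python sketch below computes both values; the function names are hypothetical.

```python
from fractions import Fraction

def t_luk(x, y):
    """Łukasiewicz T-norm, the value of strong conjunction."""
    return max(Fraction(0), x + y - 1)

def imp_luk(x, y):
    """Łukasiewicz residuum, the value of implication."""
    return min(Fraction(1), 1 - x + y)

def power(x, n):
    """Value of φ^n, i.e., φ & (φ & (... & φ)) with n occurrences of φ."""
    result = x
    for _ in range(n - 1):
        result = t_luk(x, result)
    return result

half = Fraction(1, 2)
with_n_1 = imp_luk(power(half, 1), t_luk(half, half))  # value of φ ⊃ (φ & φ)
with_n_2 = imp_luk(power(half, 2), t_luk(half, half))  # value of φ^2 ⊃ (φ & φ)
```

Here with_n_1 equals 1/2 and with_n_2 equals 1, so n = 2 is the least exponent that works for this derivation.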
6.4 Meta-Theory of BL We recall a proof of completeness of BL couched in algebraic terms. The algebraic approach to logics goes in its origins to George Boole and was continued by Lindenbaum and Tarski, Tarski and McKinsey, Stone, Henkin, Rasiowa and Sikorski (see Rasiowa and Sikorski [6]), Chang [7], among others. Algebraic methods for proving completeness use the prime filter theorem (Chap. 1, Theorem 1.68). We include the proof for BL in Hájek [5]. First, some preliminary definitions.

Definition 6.6 (Residuated lattices) A residuated lattice is (abstractly) a tuple L = < A, ∩, ∪, &, ⇒, 0, 1 >, where
(i) the substructure < A, ∩, ∪, 0, 1 > is a lattice (see Definition 1.82);
(ii) the substructure < A, &, 1 > is a commutative semigroup with the unit 1, i.e., x & 1 = x for each x ∈ A, and & is associative and commutative; in our case it is strong conjunction which is rendered by the continuous T-norm T;
(iii) the duality x & z ≤ y ≡ x ⇒ y ≥ z holds for all triples x, y, z ∈ A.
Definition 6.7 (BL-algebras) A residuated lattice is a BL-algebra if it satisfies in addition two conditions:
(iv) x & (x ⇒ y) = x ∩ y;
(v) (x ⇒ y) ∪ (y ⇒ x) = 1.
We postpone a discussion of properties of T-norms to Sect. 6.9; at this point, we give a list of properties essential for this discussion.
(vi) (x &T (x ⇒ y)) ≤ y;
(vii) x ≤ (y ⇒ (x &T y));
(viii) if x ≤ y, then (x &T z) ≤ (y &T z);
(ix) if x ≤ y, then (z ⇒T x) ≤ (z ⇒T y) and (y ⇒T z) ≤ (x ⇒T z);
(x) (x ∪ y) &T z = (x &T z) ∪ (y &T z);
(xi) (x ∪ y) = [(x ⇒T y) ⇒T y] ∩ [(y ⇒T x) ⇒T x].
Definition 6.8 (Linearly ordered lattices) A lattice L is linearly ordered if and only if x ∩ y = x or x ∩ y = y for each pair x, y ∈ A. By duality, x ∪ y = y or x ∪ y = x. Any linearly ordered BL-algebra satisfies x ∩ y = x &T (x ⇒T y) with the converse also true.

Definition 6.9 (BL-algebras as models for BL) We consider assignments on atomic propositions of BL in a BL-algebra L. For each atomic proposition p, an assignment A assigns an element A(p) ∈ L. The assignment A is extended by structural induction to the valuation A∗ on formulae of BL. We then have:
(i) A∗(φ ⊃T ψ) = A∗(φ) ⇒T A∗(ψ);
(ii) A∗(φ &T ψ) = A∗(φ) & A∗(ψ);
(iii) A∗(φ ∨T ψ) = A∗(φ) ∪ A∗(ψ); A∗(φ ∧T ψ) = A∗(φ) ∩ A∗(ψ);
(iv) A∗(0) = 0; A∗(¬T φ) = A∗(φ) ⇒T 0.
Definition 6.10 (L-Validity) A formula φ of BL is L-valid if and only if A∗(φ) = 1 for each assignment A on atomic propositions of BL.

Definition 6.11 (Tarski-Lindenbaum algebra of BL) For each formula φ of BL, we define the class of provably equivalent formulae [φ]τ by letting: ψ ∈ [φ]τ if and only if the formula φ ≡T ψ is provable in BL; we denote this relation by the symbol τ. We denote by TL the collection of all classes of τ. The factorization is canonical, i.e.,
(i) [φ]τ &τ [ψ]τ = [φ &T ψ]τ;
(ii) [φ]τ ⇒τ [ψ]τ = [φ ⊃T ψ]τ;
(iii) [φ]τ ∪τ [ψ]τ = [φ ∨T ψ]τ;
(iv) [φ]τ ∩τ [ψ]τ = [φ ∧T ψ]τ;
(v) 0τ = [0T]τ; 1τ = [1T]τ.
It follows that factorization preserves properties of a BL-algebra, i.e., TL is a BL-algebra.

Definition 6.12 (Filters on residuated lattices) For a residuated lattice L, a filter F (see Definition 1.9.4) is a subset of the domain A such that the following are satisfied: (i) if x, y ∈ F, then x & y ∈ F; (ii) if x ∈ F and x ≤ y, then y ∈ F. A filter F is prime if and only if x ⇒ y ∈ F or y ⇒ x ∈ F for each pair x, y ∈ A. For each pair x, y ∈ A, x and y are F-equivalent if and only if both x ⇒ y, y ⇒ x ∈ F, which is denoted symbolically as x ∼F y. The relation ∼F is a congruence, i.e., [x]F ⊗F [y]F = [x ⊗ y]F for each operation ⊗, and the same holds for 0, 1. We denote by L/∼F the collection of equivalence classes of ∼F. The relation ≤F on L/∼F is defined as: [x]F ≤F [y]F if and only if x ⇒ y ∈ F.
Theorem 6.5 L/∼F is linearly ordered if and only if the filter F is prime.

Proof If the filter F is prime, then for each pair x, y either x ⇒ y ∈ F, which means that [x]F ≤F [y]F, or y ⇒ x ∈ F, i.e., [y]F ≤F [x]F; hence, [x]F ∩ [y]F = [x]F or [x]F ∩ [y]F = [y]F, proving linearity. Conversely, if [x]F ≤F [y]F, then x ⇒ y ∈ F, and if [y]F ≤F [x]F, then y ⇒ x ∈ F; hence, the filter F is prime.

We now come to the principal tool in algebraic proofs of completeness: the filter separation theorem (cf. Theorem 1.68). We recall it in the current setting.

Theorem 6.6 (The filter separation theorem) For a BL-algebra L and x ≠ 1, there exists a prime filter F with x ∉ F.

For a proof, see the proof of Theorem 1.68.

Definition 6.13 (Direct products of BL-algebras) For a BL-algebra L, we consider the set ℱ of all prime filters and for each F ∈ ℱ, we consider the factored algebra L/F. The direct product of the family {L/F : F ∈ ℱ} is the Cartesian product Π∗ = Π{L/F : F ∈ ℱ} (see Chap. 1 for Cartesian products). We define an embedding E : L → Π∗ by letting E(x) = ([x]F : F ∈ ℱ); indeed, if x ≠ y, then, e.g., x ≤ y fails, i.e., x ⇒ y ≠ 1, so for some prime filter F, x ⇒ y ∉ F and [x]F ≠ [y]F. As operations in Π∗ are defined coordinate-wise, E is an isomorphic embedding.

Definition 6.14 (Translation of BL into L) We translate BL by assigning to each atomic proposition pi a variable xi and by translating ⊃T into ⇒, &T into &, ∨ into ∪, ∧ into ∩, 0 into 0, 1 into 1. Each formula φ of BL translates into the formula < φ > of L.

Theorem 6.7 A formula φ of BL is L-valid if and only if A∗(< φ >) = 1 holds in L. Indeed, the claim holds by Definition 6.11.

Theorem 6.8 (The completeness theorem for BL) If for each BL-algebra L, φ is L-valid, then φ is BL-provable.

Proof If φ satisfies the assumption, then in the Tarski-Lindenbaum algebra TL, φ is TL-valid, hence the equivalence φ ≡ 1 is BL-provable, thus, φ is BL-provable.
Remark 6.1 From the embedding E into Π ∗ in Definition 6.13, it follows that each formula φ is valid in each BL-algebra if and only if it is valid in each linearly ordered BL-algebra. We have learned the fabric of algebraic proofs of completeness. We will meet it again when we give, e.g., the Chang proof of completeness for the infinite-valued logic [0,1] L .
6.5 Meta-Theory of 3_L

Axiomatization of logic 3_L and the proof of its completeness were provided in Wajsberg [8]. Wajsberg's axiom schemes are:

(W1) φ ⊃ (ψ ⊃ φ);
(W2) (φ ⊃ ψ) ⊃ [(ψ ⊃ ξ) ⊃ (φ ⊃ ξ)];
(W3) [(φ ⊃ ¬φ) ⊃ φ] ⊃ φ;
(W4) (¬ψ ⊃ ¬φ) ⊃ (φ ⊃ ψ).
We denote this system by the symbol (W). Inference rules are detachment and uniform substitution. A formula is provable if it has a proof from axioms by means of inference rules. The relation between syntax and semantics is provided by the properties of soundness and completeness, discussed below. It is easy to check that axiom schemes (W1)–(W4) are valid and inference rules preserve validity; hence, the system (W) is sound, i.e., each provable formula is valid.

We say that a set of formulae Γ proves a formula φ, Γ ⊢ φ, if and only if there exists a proof of φ from Γ, i.e., a sequence φ_0, φ_1, ..., φ_n, where (i) φ_0 ∈ Γ or φ_0 is an instance of an axiom scheme; (ii) φ_n is φ; (iii) each φ_i, 0 < i ≤ n, is the result of substitution into an axiom scheme, or is in Γ, or is obtained by means of detachment from some φ_r, φ_s with r, s < i, where φ_s is φ_r ⊃ φ_i.

The original proof of completeness of the system (W) was provided in Wajsberg [8]. We recall the proof provided in Goldberg et al. [9], carried out by the method of canonical models, on the lines of the Henkin completeness proof in Chap. 3. We comment on the detachment rule: we will apply it in the form (D); the syntactic consequence is denoted by ⊢.

(D) If Γ ⊢ φ and Γ ⊢ φ ⊃ ψ, then Γ ⊢ ψ.

In particular, if formulae φ and φ ⊃ ψ are provable, then the formula ψ is provable.

Definition 6.15 (Consistent and maximal consistent sets of formulae) A set Γ is syntactically consistent if and only if there is no formula φ such that Γ ⊢ φ and Γ ⊢ ¬φ; otherwise, Γ is inconsistent. A set Γ is maximal consistent if and only if it is syntactically consistent and, for each formula φ, if Γ ∪ {φ} is syntactically consistent then Γ ⊢ φ.

As with sentential logic, the proof of completeness requires a preparation in the form of proofs of some necessary theses. We list below provable formulae established in Wajsberg [8] to be used in the proof of completeness.

(W5) φ ⊢ ψ ⊃ φ;
(W6) ¬φ ⊃ (φ ⊃ ψ);
(W7) (φ ⊃ ψ) ⊢ [(ψ ⊃ ξ) ⊃ (φ ⊃ ξ)];
(W8) {φ ⊃ ψ, ψ ⊃ ξ} ⊢ φ ⊃ ξ;
(W9) ¬¬φ ⊃ (ψ ⊃ φ);
(W10) [(φ ⊃ ψ) ⊃ ψ] ⊃ (¬φ ⊃ ψ);
(W11) (¬¬φ) ⊃ φ;
(W12) φ ⊃ (¬¬φ);
(W13) φ ⊃ φ;
(W14) (φ ⊃ ψ) ⊃ (¬ψ ⊃ ¬φ);
(W15) (φ ⊃ ψ) ⊃ ((¬¬φ) ⊃ ψ);
(W16) [(¬¬φ) ⊃ ψ] ⊃ [(¬¬φ) ⊃ (¬¬ψ)];
(W17) (φ ⊃ ψ) ⊃ [(¬¬φ) ⊃ (¬¬ψ)];
(W18) [φ ⊃ (φ ⊃ ¬φ)] ⊃ (φ ⊃ ¬φ);
(W19) [(φ ⊃ ¬φ) ⊃ ¬(φ ⊃ ¬φ)] ⊃ φ;
(W20) [¬(φ ⊃ ψ)] ⊃ φ;
(W21) [¬(φ ⊃ ψ)] ⊃ ¬ψ;
(W22) φ ⊃ [¬ψ ⊃ ¬(φ ⊃ ψ)];
(W23) (φ ⊃ ¬φ) ⊃ [(¬ψ ⊃ ¬¬ψ) ⊃ (φ ⊃ ψ)];
(W24) [φ ⊃ (φ ⊃ (ψ ⊃ ξ))] ⊃ [(φ ⊃ (φ ⊃ ψ)) ⊃ (φ ⊃ (φ ⊃ ξ))];
(W25) ⊥ ≡ ¬(φ ⊃ φ) is unsatisfiable.
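The validity of the axiom schemes (W1)–(W4) can be checked by brute force over the three truth values. The sketch below assumes the usual Łukasiewicz truth functions, min{1, 1 − x + y} for ⊃ and 1 − x for ¬; the helper names are illustrative only. As a contrast that is not taken from the text, it also evaluates the classical contraction law (φ ⊃ (φ ⊃ ψ)) ⊃ (φ ⊃ ψ), which fails in 3_L.

```python
from fractions import Fraction
from itertools import product

VALUES = [Fraction(0), Fraction(1, 2), Fraction(1)]

def imp(x, y):
    """Lukasiewicz implication: min(1, 1 - x + y)."""
    return min(Fraction(1), 1 - x + y)

def neg(x):
    """Lukasiewicz negation: 1 - x."""
    return 1 - x

def is_valid(formula, arity):
    """True when the formula takes value 1 under every assignment."""
    return all(formula(*vals) == 1 for vals in product(VALUES, repeat=arity))

W1 = lambda p, q: imp(p, imp(q, p))
W2 = lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r)))
W3 = lambda p: imp(imp(imp(p, neg(p)), p), p)
W4 = lambda p, q: imp(imp(neg(q), neg(p)), imp(p, q))
# Contraction is classically valid; it is included only as a contrast.
CONTRACTION = lambda p, q: imp(imp(p, imp(p, q)), imp(p, q))

print(is_valid(W1, 2), is_valid(W2, 3), is_valid(W3, 1), is_valid(W4, 2))
print(is_valid(CONTRACTION, 2))
```

The failure of contraction at the value 1/2 is one way to see why provability from an added premise is stated below with the doubled antecedent φ ⊃ (φ ⊃ ψ).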
We will use (W1)–(W4) and (W5)–(W25) in the proof of completeness of the three-valued logic 3_L. First, we establish properties of consistent and maximal consistent sets of formulae and of the syntactic consequence ⊢.

Theorem 6.9 The following are basic properties of provability Γ ⊢ which we recall.

(i) monotonicity: if Γ ⊢ φ and Γ ⊆ Γ′ then Γ′ ⊢ φ;
(ii) compactness: if Γ ⊢ φ then Γ′ ⊢ φ for a finite subset Γ′ of Γ;
(iii) if Γ ⊢ φ and Γ ⊢ (φ ⊃ ψ) then Γ ⊢ ψ;
(iv) if φ ∈ Γ, then Γ ⊢ φ;
(v) if ⊢ φ, then Γ ⊢ φ for each Γ; in particular, Γ ⊢ φ for each φ in (W5)–(W25);
(vi) if φ is an instance of an axiom scheme, then Γ ⊢ φ for each Γ.
Theorem 6.10 (The deduction theorem) If Γ ∪ {φ} ⊢ ψ then Γ ⊢ φ ⊃ (φ ⊃ ψ).

Proof Suppose that Γ ∪ {φ} ⊢ ψ and let ψ_1, ψ_2, ..., ψ_n be a proof of ψ from Γ ∪ {φ}. We prove by induction on i the following Claim.

Claim. Γ ⊢ φ ⊃ (φ ⊃ ψ_i) for each i ≤ n.

Suppose then that Claim is true for i < j. Consider the following cases.

Case 1. ψ_j is φ. The formula φ ⊃ (φ ⊃ φ) follows from axiom scheme (W1) by substitution (ψ/φ).

Case 2. ψ_j is an instance of an axiom scheme or an element of Γ. Then Γ ⊢ ψ_j by Theorem 6.9 (iv), (vi). By (W1), Γ ⊢ ψ_j ⊃ (φ ⊃ ψ_j). By Theorem 6.9 (iii), Γ ⊢ φ ⊃ ψ_j. Substitutions into (W1) of (φ/φ ⊃ ψ_j; ψ/φ) yield Γ ⊢ (φ ⊃ ψ_j) ⊃ (φ ⊃ (φ ⊃ ψ_j)), whence, by Theorem 6.9 (iii), Γ ⊢ φ ⊃ (φ ⊃ ψ_j).

Case 3. ψ_j is obtained by detachment from some ψ_k, k < j, and ψ_k ⊃ ψ_j. By inductive assumption, the following are true:
(a) Γ ⊢ φ ⊃ (φ ⊃ ψ_k); (b) Γ ⊢ φ ⊃ (φ ⊃ (ψ_k ⊃ ψ_j)).

To prove Case 3, it is sufficient to apply (W24) with Theorem 6.9 (iii), (a) and (b). The proof is concluded.

Theorem 6.11 The following are basic properties of consistent and inconsistent sets.
(i) a set of formulae Γ is syntactically inconsistent if and only if Γ ⊢ φ for each formula φ;
(ii) a set Γ of formulae is syntactically inconsistent if and only if Γ ⊢ ⊥;
(iii) inconsistency of Γ ∪ {φ} implies that Γ ⊢ φ ⊃ ¬φ;
(iv) inconsistency of Γ ∪ {φ ⊃ ¬φ} implies that Γ ⊢ φ.

Proof For (i): Suppose that Γ is inconsistent. Then Γ ⊢ φ and Γ ⊢ ¬φ for some formula φ. We apply (W6) in order to obtain Γ ⊢ (¬φ) ⊃ (φ ⊃ ψ), and detachment applied twice yields Γ ⊢ ψ. The converse is obvious: if Γ ⊢ φ for any φ, then, for any φ, Γ ⊢ φ and Γ ⊢ ¬φ.
For (ii): if Γ ⊢ ¬(φ ⊃ φ), then Γ is inconsistent because of (W13) and Theorem 6.9 (v). The converse is obvious by (i).
For (iii): suppose that Γ ∪ {φ} is inconsistent; then Γ ∪ {φ} ⊢ ¬φ. By the deduction theorem, Γ ⊢ φ ⊃ (φ ⊃ ¬φ), and (W18) with detachment yields Γ ⊢ φ ⊃ ¬φ.
For (iv): by (iii), Γ ⊢ (φ ⊃ ¬φ) ⊃ ¬(φ ⊃ ¬φ) and, by (W19) and detachment, Γ ⊢ φ.

Definition 6.16 (Semantic consistency) We call a set Γ of formulae semantically consistent if and only if there exists an assignment A such that A*(φ) = 1 for each φ ∈ Γ. We denote this fact as A*(Γ) = 1. We say that a formula ψ is a semantic consequence of the set of formulae Γ if and only if, for each assignment A, if A*(Γ) = 1, then A*(ψ) = 1. We denote this fact by the symbol Γ |= ψ.

Theorem 6.12 (The strong completeness property) For each formula φ and a set of formulae Γ of logic 3_L, Γ ⊢ φ if and only if Γ |= φ.

Proof From Γ ⊢ φ to Γ |= φ it is straightforward: axiom schemes are valid and derivation rules preserve validity. The proof of the converse rests on the following Claim: if a set Γ of formulae is syntactically consistent, then Γ is semantically consistent. For the proof, we recall (L), the Lindenbaum Lemma.
Each syntactically consistent set Γ of formulae can be extended to a maximal syntactically consistent set Γ ∗ .
Lindenbaum Lemma can be proved exactly as in Chap. 2, by ordering all formulae of 3_L into a sequence φ_1, φ_2, ... and by defining the sequence (Γ_n)_{n=0}^∞ by letting Γ_0 = Γ and Γ_{n+1} = Γ_n ∪ {φ_{n+1}} if the latter set is consistent and Γ_{n+1} = Γ_n otherwise. Then Γ* = ⋃_{n=0}^∞ Γ_n is maximal consistent and Γ ⊆ Γ*.

Definition 6.17 (The canonical assignment) We apply Γ* in the definition of a canonical assignment A on atomic propositions; for notational convenience, we use value 2 instead of 1/2. Γ* is the maximal consistent set containing Γ obtained in (L).
(i) A(p) = 1 if and only if Γ* ⊢ p;
(ii) A(p) = 0 if and only if Γ* ⊢ ¬p;
(iii) A(p) = 2, otherwise.

We now check the following properties of the valuation A* on formulae.

Theorem 6.13 The following holds.
(i) A*(α) = 1 if and only if Γ* ⊢ α;
(ii) A*(α) = 0 if and only if Γ* ⊢ ¬α;
(iii) A*(α) = 2, otherwise.

Proof The proof is by structural induction; we refer to the statement (i)–(iii) as Sub-claim. The case of atomic propositions is already settled in Definition 6.17 (i)–(iii). We have some cases to consider.

Case 1. α is ¬β and Sub-claim holds for β. If Γ* ⊢ ¬β, then by inductive assumption A*(β) = 0, hence, A*(¬β) = 1. If Γ* ⊢ ¬¬β, then A*(¬β) = 0. In both cases Sub-claim holds for α. If neither of the two sub-cases holds, then, by Sub-claim (iii) for β, A*(β) = 2 and A*(α) = 2.

Case 2. α is β ⊃ γ and Sub-claim is true for β and γ. There are three sub-cases to consider.

Sub-case 2.1. Γ* ⊢ (β ⊃ γ). To obtain that A*(β ⊃ γ) = 1, we have to discuss three possibilities.
2.1.1 If A*(γ) = 1 or A*(β) = 0, then A*(α) = 1;
2.1.2 If A*(γ) = 0, then Γ* ⊢ (¬β), so A*(α) = 1;
2.1.3 The third sub-case is when none of β, ¬β, γ, ¬γ has a derivation from Γ*; hence, A*(β) = 2 = A*(γ) and A*(α) = 1.

Case 3. Next, we consider Γ* ⊢ ¬(β ⊃ γ). By (W20) and (W21), Γ* ⊢ β and Γ* ⊢ ¬γ; hence, A*(β) = 1, A*(γ) = 0, so A*(α) = 0.

Case 4. Finally, we consider the case when neither Γ* ⊢ (β ⊃ γ) nor Γ* ⊢ ¬(β ⊃ γ). By (W6), Γ* ⊢ ¬β is not true, and (W1) implies that Γ* ⊢ γ is not true; hence, A*(β) ≠ 0 and A*(γ) ≠ 1.

Sub-case 4.1. First, let A*(β) = 1, so Γ* ⊢ β. By (W22), Γ* ⊢ ¬γ is not true, so A*(γ) ≠ 0 and finally A*(γ) = 2, so A*(α) = 2; next, let A*(β) = 2, i.e., Γ* ⊢ β does not hold.
By maximality of Γ*, Γ* ∪ {β} is not syntactically consistent and, by Theorem 6.11(iii), Γ* ⊢ (β ⊃ ¬β); was A*(γ) = 2, we would have, by Theorem 6.11 again, that Γ* ⊢ ¬γ ⊃ ¬¬γ and, by (W23), we would have that Γ* ⊢ (β ⊃ γ), a contradiction. It follows that A*(γ) = 0 and A*(α) = 2.

All cases considered, Sub-claim is established. As Γ ⊆ Γ*, if α ∈ Γ, then A*(α) = 1, meaning that Γ is semantically consistent.

Now, let us suppose that Γ |= α. If A*(Γ) = 1, then A*(α) = 1; hence, A*(α ⊃ ¬α) = 0, i.e., Γ ∪ {α ⊃ ¬α} is semantically inconsistent. By the Claim, Γ ∪ {α ⊃ ¬α} is then syntactically inconsistent, and Theorem 6.11(iv) tells us that Γ ⊢ α. This concludes the proof of the strong completeness theorem.

A concise interlude about some other proposals for 3- and 4-valued logics follows.
6.6 Some Other Proposals for 3, 4-Valued Logics

We recall here 3-valued logics of Kleene and Bochvar and the 4-valued modal logic of Łukasiewicz.

Definition 6.18 (The logic 3_K of Kleene) The Kleene 3-valued logic 3_K (Kleene [10]), whose values we denote as {0, 1/2, 1}, has the values of sentential connectives defined as follows; for atomic propositions p, q, x denotes the value of p and y denotes the value of q: negation ¬p is defined as 1 − x, the value of conjunction p ∧ q is defined as min{x, y}, the value of disjunction p ∨ q is max{x, y}. Values of implication ⊃_3K are given in Table 6.6. In 3_K, the implication x ⊃_3K y from the value x to the value y is given by the formula max{1 − x, y}, which corresponds to the expression ¬p ∨ q for p ⊃_3K q.

Table 6.6 The Kleene implication
p \ q      q = 1    q = 0    q = 1/2
p = 0      1        1        1
p = 1      1        0        1/2
p = 1/2    1        1/2      1/2

Definition 6.19 (The Bochvar logics) The Bochvar three-valued external logic 3_BE (Bochvar [11]) treats values of 1/2 and 0 in the same way, hence truth tables for this logic are simplified by omitting the value 1/2. Hence, the negation ¬_BE is the classical negation ¬, and truth tables for ∨_BE, ∧_BE, ⊃_BE are identical with truth tables for classical ∨, ∧, ⊃. Observe that values of formulae in the logic 3_BE are only 0 or 1. The class of formulae valid in the logic 3_BE coincides with the class of formulae valid in sentential logic SL; hence, the class of formulae unsatisfiable in the logic 3_BE coincides with the class of formulae unsatisfiable in the sentential logic SL.

The Bochvar three-valued internal logic 3_BI (Bochvar [11]) is defined by the truth table for negation ¬_BI, identical with the truth table for the Łukasiewicz 3-valued negation, so its truth function is 1 − x; the truth table for conjunction ∧_BI is shown in Table 6.7. The following formulae (i), (ii) define disjunction and implication:
(i) (p ∨_BI q) ≡ ¬_BI(¬_BI p ∧_BI ¬_BI q);
(ii) (p ⊃_BI q) ≡ (¬_BI p ∨_BI q).

Table 6.7 The truth table for conjunction ∧_BI
p \ q      q = 1    q = 0    q = 1/2
p = 0      0        0        1/2
p = 1      1        0        1/2
p = 1/2    1/2      1/2      1/2

Definition 6.20 (The Łukasiewicz 4-valued logic) The Łukasiewicz 4-valued modal logic 4_LM (Łukasiewicz [12]) adopts as atomic propositions pairs (p, q) of atomic propositions of sentential logic; hence, formulae of 4_LM are pairs of formulae of SL. Connectives of 4_LM are defined as coordinate-wise actions of connectives of SL, i.e.,
(i) (p, q) ⊃_LM (r, s) is ((p ⊃ r), (q ⊃ s));
(ii) ¬_LM(p, q) is (¬p, ¬q).
The set of values is T = {(0, 0), (0, 1), (1, 0), (1, 1)}. Valid formulae are those whose value is constantly (1, 1). Table 6.8 brings the truth table for the logic 4_LM.

Table 6.8 The truth table for 4_LM
⊃_LM      (1, 1)    (1, 0)    (0, 1)    (0, 0)    ¬_LM
(1, 1)    (1, 1)    (1, 0)    (0, 1)    (0, 0)    (0, 0)
(1, 0)    (1, 1)    (1, 1)    (0, 1)    (0, 1)    (0, 1)
(0, 1)    (1, 1)    (1, 0)    (1, 1)    (1, 0)    (1, 0)
(0, 0)    (1, 1)    (1, 1)    (1, 1)    (1, 1)    (1, 1)

In addition to ¬ = N, with N(0) = 1, N(1) = 0, one can define three more unary functions: V(0) = 1, V(1) = 1; F(0) = 0, F(1) = 0; S(0) = 0, S(1) = 1. This allows for definitions of 16 binary functions on pairs of values 0, 1. Of them, Łukasiewicz singled out two as representing modal operators L, M:
(i) M(v(p), v(q)) is (S(v(p)), V(v(q))), where v(p) is the value of p;
(ii) L(v(p), v(q)) is N M N(v(p), v(q)).
Check that Tables 6.9 and 6.10 below present truth tables for modalities M and L.
Table 6.9 The truth table for modality M
(v(p), v(q))       (1, 1)    (1, 0)    (0, 1)    (0, 0)
M(v(p), v(q))      (1, 1)    (1, 1)    (0, 1)    (0, 1)

Table 6.10 The truth table for modality L
(v(p), v(q))       (1, 1)    (1, 0)    (0, 1)    (0, 0)
L(v(p), v(q))      (1, 0)    (1, 0)    (0, 0)    (0, 0)
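Tables 6.9 and 6.10 can be recomputed from the coordinate-wise definitions. The sketch below takes M as it is read off Table 6.9 (the first coordinate is kept, the second is set to 1) and L as N M N. It also evaluates the modal schemes under their usual readings, which are an assumption here: (K) as L(p ⊃ q) ⊃ (Lp ⊃ Lq), (T) as Lp ⊃ p, (4) as Lp ⊃ LLp, and (5) as Mp ⊃ LMp.

```python
from itertools import product

PAIRS = [(1, 1), (1, 0), (0, 1), (0, 0)]
TOP = (1, 1)  # the only designated value

def imp_lm(x, y):
    """Coordinate-wise classical implication."""
    return (max(1 - x[0], y[0]), max(1 - x[1], y[1]))

def neg_lm(x):
    """Coordinate-wise classical negation."""
    return (1 - x[0], 1 - x[1])

def M(x):
    """Possibility as read off Table 6.9: (a, b) goes to (a, 1)."""
    return (x[0], 1)

def L(x):
    """Necessity as N M N."""
    return neg_lm(M(neg_lm(x)))

table_M = {x: M(x) for x in PAIRS}
table_L = {x: L(x) for x in PAIRS}

accepted = {
    "K": all(imp_lm(L(imp_lm(x, y)), imp_lm(L(x), L(y))) == TOP
             for x, y in product(PAIRS, repeat=2)),
    "T": all(imp_lm(L(x), x) == TOP for x in PAIRS),
    "4": all(imp_lm(L(x), L(L(x))) == TOP for x in PAIRS),
    "5": all(imp_lm(M(x), L(M(x))) == TOP for x in PAIRS),
}
print(table_M, table_L, accepted)
```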
With respect to the criterion of acceptance, one may check that modal formulae (K), (T), (4) are accepted but the formula (5) is rejected.

We now proceed with the general finite-valued Łukasiewicz logic. The value set L_n = {0, 1/(n−1), ..., (n−2)/(n−1), 1} makes with operations &, ∨, ∧, ⊃, ¬ and constants 0, 1 a finite algebra, the Łukasiewicz residuated lattice (see Chap. 1 for basic algebraic structures).
6.7 The n-Valued Logic n_L: The Rosser-Turquette Theory

Consider a natural number n > 2. We recall that the set of values for n-valued logic is {0, 1/(n−1), 2/(n−1), ..., (n−2)/(n−1), 1}. We denote these values by symbols 1, 2, ..., n − 1, n. For the two-valued (Boolean) sentential logic we have designated 1 as the truth value and 0 as the falsity value, but for the case of n values, we may choose a number 1 ≤ s < n and declare a formula α accepted when its truth value is less than or equal to s for all assignments of truth values to sentential variables in α. Otherwise, α is rejected. This convention, due to Rosser and Turquette [13], calls as well for a modification of truth functions so that the smaller truth value corresponds to the greater degree of acceptance: regardless of the choice of the threshold s, the truth value 1 signifies the certainty of truth while the truth value n signifies certainty of falsity.

We denote by A the truth assignment on atomic propositions, and A* denotes the extension of A to a valuation on formulae. Thus, we let
(i) A*(φ ⊃ ψ) = max{1, A*(ψ) − A*(φ) + 1};
(ii) A*(¬φ) = 1 − A*(φ) + n.

In addition to truth functions for implication, negation, disjunction and conjunction, the scheme proposed in Rosser and Turquette [13] makes use of unary functors J_k for k = 1, 2, ..., n, whose truth functions j_k satisfy the following conditions, where v is a truth value (see Rosser and Turquette [13] for details of construction):
(iii) j_k(v) = 1 if and only if v = k;
(iv) j_k(v) = n when v ≠ k.
Other functors are expressed as:
(v) φ ∨ ψ is (φ ⊃ ψ) ⊃ ψ;
(vi) φ ∧ ψ is ¬(¬φ ∨ ¬ψ).
The truth value function for ∨ is min{A*(φ), A*(ψ)}. The truth value function for ∧ is max{A*(φ), A*(ψ)}. The inverted values of truth functions are in agreement with the convention that value 1 means truth and value n means falsity. Given the acceptance threshold s, φ ∧ ψ is accepted only if both φ, ψ are accepted, while φ ∨ ψ is accepted when at least one of φ, ψ is accepted.

To facilitate the comprehension of the axiom system in [13], we add that F_1, F_2, ... are connectives of which F_1 is the implication ⊃ and F_2 is negation ¬ as defined above, while other connectives serve purposes beyond the proof of completeness for L_n, like the Słupecki connective F_3 whose truth value function is constantly 2 and which makes n_L functionally complete (meaning that all truth tables can be realized by these three functors). Each connective F_i has arity b_i, i.e., its full symbol is F_i(φ_1, ..., φ_{b_i}). The value s < n is the acceptance threshold.

We have seen in the deduction theorem that many-valued logics call for formulae longer than those used previously; hence, we introduce some shortcuts for long formulae. We recall the symbol Γ_u^v which is defined as follows:
(vii) Γ_u^v φ_i ψ is ψ when u > v;
(viii) Γ_u^v φ_i ψ is φ_v ⊃ Γ_u^{v−1} φ_i ψ when u ≤ v.
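The recursion in clauses (vii), (viii) can be unrolled mechanically. The function below is a minimal sketch that builds the chain of implications as a string; the name gamma and the callback phi (mapping an index i to the text of φ_i) are illustrative choices, not notation from [13].

```python
def gamma(u, v, phi, psi):
    """Unroll the chain Γ_u^v φ_i ψ following clauses (vii) and (viii).

    phi maps an index i to the text of φ_i; psi is the text of ψ.
    """
    if u > v:                            # clause (vii): empty chain
        return psi
    inner = gamma(u, v - 1, phi, psi)    # clause (viii): peel off φ_v
    return f"({phi(v)} ⊃ {inner})"

chain = gamma(1, 4, lambda i: f"p{i}", "q")
print(chain)  # (p4 ⊃ (p3 ⊃ (p2 ⊃ (p1 ⊃ q))))
```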
Thus, (vii), (viii) are shortcuts for chains of implications, e.g., Γ_1^4 p_i q is p_4 ⊃ (p_3 ⊃ (p_2 ⊃ (p_1 ⊃ q))). Axiom schemes proposed in Rosser and Turquette [13] are the following.

Definition 6.21 (Rosser, Turquette) Axiom schemes for n-valued logic are:

(A1) ψ ⊃ (φ ⊃ ψ);
(A2) [φ ⊃ (ψ ⊃ ξ)] ⊃ [ψ ⊃ (φ ⊃ ξ)];
(A3) (φ ⊃ ψ) ⊃ [(ψ ⊃ ξ) ⊃ (φ ⊃ ξ)];
(A4) [J_k(φ) ⊃ (J_k(φ) ⊃ ψ)] ⊃ (J_k(φ) ⊃ ψ) for k = 1, 2, ..., n;
(A5) Γ_{k=1}^n (J_k(φ) ⊃ ψ) ψ;
(A6) J_k(φ) ⊃ φ for k = 1, 2, ..., s;
(A7) Γ_{k=1}^b J_{p_k}(P_k) J_f(F_i(P_1, P_2, ..., P_b)), where b is the arity of F_i, i runs over indices of operators F_i, each p_j is the value of the atomic proposition P_j, an argument for F_i, and f is the value f_i(p_1, p_2, ..., p_b) for the value function f_i of F_i and values p_j of P_j.
The derivation rule is the detachment rule: if p, p ⊃ q are accepted, then q is accepted.
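The inverted truth scale can be exercised numerically. The sketch below assumes the truth functions max{1, A*(ψ) − A*(φ) + 1} for ⊃ and n + 1 − A*(φ) for ¬ on the values 1, ..., n, together with j_k(v) = 1 for v = k and j_k(v) = n otherwise. It checks that the defined ∨ and ∧ compute min and max, and that the schemes (A1)–(A4) always take the value 1. The function names are illustrative.

```python
from itertools import product

def make_logic(n):
    """Truth functions on values 1..n, where 1 is truth and n is falsity."""
    imp = lambda a, b: max(1, b - a + 1)
    neg = lambda a: n + 1 - a
    lor = lambda a, b: imp(imp(a, b), b)            # φ ∨ ψ is (φ ⊃ ψ) ⊃ ψ
    land = lambda a, b: neg(lor(neg(a), neg(b)))    # φ ∧ ψ is ¬(¬φ ∨ ¬ψ)
    j = lambda k, a: 1 if a == k else n             # truth function of J_k
    return imp, neg, lor, land, j

def check(n):
    imp, neg, lor, land, j = make_logic(n)
    vals = range(1, n + 1)
    ok = all(lor(a, b) == min(a, b) and land(a, b) == max(a, b)
             for a, b in product(vals, repeat=2))
    # (A1)-(A3) take value 1 under every assignment.
    ok &= all(imp(b, imp(a, b)) == 1 for a, b in product(vals, repeat=2))
    ok &= all(imp(imp(a, imp(b, c)), imp(b, imp(a, c))) == 1
              and imp(imp(a, b), imp(imp(b, c), imp(a, c))) == 1
              for a, b, c in product(vals, repeat=3))
    # (A4): contraction holds for antecedents of the form J_k(φ).
    ok &= all(imp(imp(j(k, a), imp(j(k, a), b)), imp(j(k, a), b)) == 1
              for k, a, b in product(vals, repeat=3))
    return ok

print([check(n) for n in (3, 4, 5)])
```

Contraction is restricted to J_k-antecedents in (A4) because j_k only takes the extreme values 1 and n, on which the implication behaves classically.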
Theorem 6.14 (The completeness theorem for n-valued logic n_L) For a formula φ of n-valued logic, the formula is accepted if and only if it is provable from axiom schemes (A1)–(A7), i.e., |= φ ≡ ⊢ φ.

Proof It is based on a few claims. Its idea is analogous to the Kalmár proof of SL completeness in Chap. 2: elimination of atomic propositions. It will be convenient to use small letters p, q, r, ... to denote atomic propositions; the symbol V(p) will then denote the value of p.

Claim 1. (q ⊃ r) ⊃ [(p ⊃ q) ⊃ (p ⊃ r)]. Proof consists in substituting p/q ⊃ r, q/p ⊃ q, r/p ⊃ r in the scheme (A2) and then applying detachment with the scheme (A3).

Claim 2. q ⊃ q. Proof consists in substitution into the scheme (A2) of p/q, q/p, r/q, which yields
(i) [q ⊃ (p ⊃ q)] ⊃ [p ⊃ (q ⊃ q)]
from which by the scheme (A1) and detachment we get
(ii) p ⊃ (q ⊃ q).
Substitution p/A1 along with detachment brings q ⊃ q.

Claim 3. (p ⊃ q) ⊃ (Γ_1^n r_i p ⊃ Γ_1^n r_i q). Proof by induction: in case n = 0 we get Claim 2. Suppose Claim 3 to be true for n = k, consider the case n = k + 1. By Claim 1 we have
(iii) (Γ_1^k r_i p ⊃ Γ_1^k r_i q) ⊃ (Γ_1^{k+1} r_i p ⊃ Γ_1^{k+1} r_i q).
Apply to (iii) the scheme (A3) and the inductive assumption.

Claim 4. Consider a sequence q_1, q_2, ..., q_k of formulae in which sentential variables p_1, p_2, ..., p_l occur at least once each. Then
(i) Γ_1^l p_i r ⊃ Γ_1^k q_j r.
Proof is by induction on k. In case k = 0 also l = 0 and we get r ⊃ r by Claim 2. Suppose Claim 4 true for k = m and let k = m + 1. Some cases are to be considered.

Case 1. l = 0, hence, r ⊃ Γ_1^m q_j r. From the scheme (A1) we get
(ii) Γ_1^m q_j r ⊃ Γ_1^{m+1} q_j r
and from the last two facts we get Claim 4 by applying the scheme (A3).

Case 2. l > 0. As p_l is some q_j, by inductive assumption, we get
(iii) Γ_1^{l−1} p_i r ⊃ Γ_{j+1}^{m+1} q_i (Γ_1^{j−1} q_i r);
Claim 1 gives
(iv) Γ_1^l p_i r ⊃ (q_j ⊃ (Γ_{j+1}^{m+1} q_i (Γ_1^{j−1} q_i r))).
Now, some sub-cases:

Sub-case 2.1 j = m + 1, then Claim 4 follows from (iv).

Sub-case 2.2 j < m + 1. From scheme (A2), we obtain
(v) (q_j ⊃ (Γ_{j+1}^{m+1} q_i (Γ_1^{j−1} q_i r))) ⊃ (q_{m+1} ⊃ (q_j ⊃ (Γ_{j+1}^m q_i (Γ_1^{j−1} q_i r)))).
By inductive assumption,
(vi) (q_j ⊃ Γ_{j+1}^m q_i (Γ_1^{j−1} q_i r)) ⊃ (Γ_1^m q_i r).
From (vi) and Claim 1, it follows that
(vii) (q_{m+1} ⊃ (q_j ⊃ Γ_{j+1}^m q_i (Γ_1^{j−1} q_i r))) ⊃ (Γ_1^{m+1} q_i r).
Now, Claim 4 follows by (vii), (v), (iv) and scheme (A3).

Claim 5. (J_k p ⊃ (q ⊃ r)) ⊃ ((J_k p ⊃ q) ⊃ (J_k p ⊃ r)). Proof uses the scheme (A2) which yields
(i) (J_k p ⊃ (q ⊃ r)) ⊃ (q ⊃ (J_k p ⊃ r)).
Claim 1 yields
(ii) (q ⊃ (J_k p ⊃ r)) ⊃ ((J_k p ⊃ q) ⊃ (J_k p ⊃ (J_k p ⊃ r))).
By scheme (A4) we get
(iii) (J_k p ⊃ (J_k p ⊃ r)) ⊃ (J_k p ⊃ r).
From (iii) and Claim 1, we obtain
(iv) ((J_k p ⊃ q) ⊃ (J_k p ⊃ (J_k p ⊃ r))) ⊃ ((J_k p ⊃ q) ⊃ (J_k p ⊃ r)).
Now, (i), (ii), (iv) together with scheme (A3) prove Claim 5.

Claim 6. (Γ_1^p J_{V(p_r)}(p_r)(r ⊃ s)) ⊃ ((Γ_1^p J_{V(p_r)}(p_r) r) ⊃ (Γ_1^p J_{V(p_r)}(p_r) s)),
where V(p_r) denotes the truth value of p_r. Proof is by induction on p.

Case 1. p = 0. Apply Claim 2.

Case 2. p > 0. Suppose Claim 6 is true for p = k and consider p = k + 1. By hypothesis of induction and Claim 1, we have
(i) (J_{V(p_{k+1})}(p_{k+1}) ⊃ (Γ_1^k J_{V(p_r)}(p_r)(r ⊃ s))) ⊃ (J_{V(p_{k+1})}(p_{k+1}) ⊃ ((Γ_1^k J_{V(p_r)}(p_r) r) ⊃ (Γ_1^k J_{V(p_r)}(p_r) s))).
Claim 5 implies
(ii) (J_{V(p_{k+1})}(p_{k+1}) ⊃ ((Γ_1^k J_{V(p_r)}(p_r) r) ⊃ (Γ_1^k J_{V(p_r)}(p_r) s))) ⊃ ((Γ_1^{k+1} J_{V(p_r)}(p_r) r) ⊃ (Γ_1^{k+1} J_{V(p_r)}(p_r) s)).
Now, from (i) we obtain
(iii) (Γ_1^{k+1} J_{V(p_r)}(p_r)(r ⊃ s)) ⊃ (J_{V(p_{k+1})}(p_{k+1}) ⊃ ((Γ_1^k J_{V(p_r)}(p_r) r) ⊃ (Γ_1^k J_{V(p_r)}(p_r) s))).
We prove Claim 6 by applying scheme (A3) together with (ii), (iii).

Claim 7. Suppose φ is a formula on sentential variables q_1, q_2, ..., q_p and f_φ(V(q_1), V(q_2), ..., V(q_p)) = V is its truth function value as a function of truth values V(q_i) of the q_i's. Then Γ_1^p J_{V(q_i)}(q_i) J_V(φ). Proof is by induction on the number of symbols in φ. Suppose Claim 7 is true for formulae with fewer than k symbols and consider a formula φ with k symbols. There are some cases to be considered.

Case 1. φ is q_i, so V is V(q_i). By Claim 4, we get
(i) (J_{V(q_i)}(φ) ⊃ J_{V(q_i)}(φ)) ⊃ (Γ_1^p J_{V(q_i)}(q_i) J_V(φ)).
Claim 7 follows by Claim 2.

Case 2. Suppose φ is F(q_1, q_2, ..., q_a), where each q_i occurs in F with truth value V_i for each i, and f(V_1, V_2, ..., V_a) = V, the truth function of φ. Clearly, a = arity(F). Hypothesis of induction supplies us with
(ii) Γ_1^a J_{V_i}(q_i) J_{V_j}(q_j).
From scheme (A7), we get
(iii) Γ_1^a J_{V_j}(q_j) J_V(φ).
Now, by (iii) and Claim 3, we obtain
(iv) (Γ_1^a J_{V_j}(q_j) J_{V_a}(q_a)) ⊃ (Γ_1^a J_{V_j}(q_j)(Γ_1^{a−1} J_{V_j}(q_j) J_V(φ))).
From (ii) we obtain
(v) (Γ_1^a J_{V_j}(q_j) J_{V_{a−1}}(q_{a−1})) ⊃ (Γ_1^{a−2} J_{V_j}(q_j) J_V(φ)).
Claim 6 applied to (v) and then (ii) applied to the result yield
(vi) Γ_1^a J_{V_j}(q_j)(J_{V_{a−2}}(q_{a−2}) ⊃ Γ_1^{a−3} J_{V_j}(q_j) J_V(φ)).
Claim 7 is obtained by repeating arguments leading to (vi) from (v).

Claim 8. If φ is accepted at all valuations (is valid), then ⊢ φ (the completeness theorem). First, by scheme (A6) and Claim 3,
(i) (Γ_1^a J_{V_j}(q_j) J_V(φ)) ⊃ (Γ_1^a J_{V_j}(q_j) φ).
From (i), by Claim 7, we obtain
(ii) J_{V_a}(q_a) ⊃ Γ_1^{a−1} J_{V_j}(q_j) φ.
From (ii) and scheme (A5), we get
(iii) Γ_1^{a−1} J_{V_i}(q_i) φ.
From (iii) we get, for a − 1,
(iv) J_{V_{a−1}}(q_{a−1}) ⊃ Γ_1^{a−2} J_{V_j}(q_j) φ.
By the repetition of this argument, we obtain ⊢ φ and Claim 8 is proved.

Claim 8 demonstrates that any accepted formula is provable from axiom schemes (A1)–(A7); hence, (A1)–(A7) axiomatize n_L completely.
6.8 The Post Logics

Definition 6.22 (The m-valued Post logic) For m ≥ 2, the Post logic P_m (Post [14]) has as the set of truth values the set {0, 1/(m−1), 2/(m−1), ..., (m−2)/(m−1), 1}. Values of connectives ¬_P, ∨_P, ∧_P are defined as follows: ¬_P(0) = 1, ¬_P(v) = v − 1/(m−1) for v ≠ 0, u ∨_P v = max{u, v}, u ∧_P v = min{u, v}. Post logic gave impulse for Post algebras (see Rasiowa and Sikorski [6]).
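The Post negation is cyclic rather than involutive. The following sketch, assuming the reading ¬_P(0) = 1 and ¬_P(v) = v − 1/(m−1) for v ≠ 0, iterates the negation and shows that it runs through all m truth values before returning to the start. The function names are illustrative.

```python
from fractions import Fraction

def post_values(m):
    """Truth values 0, 1/(m-1), ..., 1 of the Post logic P_m."""
    return [Fraction(i, m - 1) for i in range(m)]

def post_neg(v, m):
    """Cyclic Post negation: 0 goes to 1, any other v goes one step down."""
    return Fraction(1) if v == 0 else v - Fraction(1, m - 1)

def iterate_neg(v, m, times):
    """Apply the Post negation the given number of times."""
    for _ in range(times):
        v = post_neg(v, m)
    return v

m = 5
orbit = [iterate_neg(Fraction(1), m, k) for k in range(m)]
print(orbit)
```

For m = 2 the same definition gives the classical negation on {0, 1}.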
6.9 Infinite-Valued Logics

We now continue our discussion of many-valued logics with infinite-valued logics. As proved by McNaughton [15], the set of values for infinite-valued logic should be dense in the interval [0, 1] and contain 0 and 1. We will focus on the interval [0, 1] as it plays an important role in fuzzy logic applications. It will help in what follows if we have a deeper knowledge of T-norms and their residua.

Definition 6.23 (Strict and Archimedean T-norms, T-co-norms) For a T-norm T, we define the T-co-norm as S_T(x, y) = 1 − T(1 − x, 1 − y). A T-norm T is strict if it is strictly increasing: T(z, y) > T(x, y) whenever z > x and T(x, z) > T(x, y) whenever z > y. One can see that T_L is not strict. A T-norm T is Archimedean if and only if T(x, x) < x for x ∈ (0, 1).

The T-norm T_L is Archimedean: T_L(x, x) = max{0, 2x − 1} takes the value 0 for x ≤ 1/2 and the value 2x − 1 for x > 1/2. In the first case T_L(x, x) = x only for x = 0, in the second case T_L(x, x) = x implies x = 1. Clearly, it is always true that T(x, x) ≤ T(x, 1) = x. Each strict T-norm is Archimedean: T(x, x) < T(x, 1) = x for x ≠ 0, 1.

In order to produce a strict T-norm, one has to turn to the product T-norm T_P(x, y) = x · y. The corresponding product T-co-norm is S_P(x, y) = 1 − (1 − x) · (1 − y) = x + y − x · y. The third classical T-norm is the minimum T-norm T_M(x, y) = min{x, y} with the adjoint T-co-norm S_{T_M}(x, y) = max{x, y}. The Łukasiewicz T-norm T_L and the product T-norm T_P are Archimedean. They have some specific properties. We establish some basic properties of Archimedean T-norms.
Theorem 6.15 (Ling [16]) We define the sequence T^n(x): T^0(x) = 1; T^{n+1}(x) = T(T^n(x), x) = T(x, T^n(x)). This sequence has the following properties:

(i) The sequence (T^n(x))_n is non-increasing for each x ∈ [0, 1]: T^{n+1}(x) = T(T^n(x), x) ≤ T(T^n(x), 1) = T^n(x);

(ii) For 0 < y < x < 1, there exists n such that T^n(x) < y, which implies that lim_n T^n(x) = 0. Suppose, to the contrary, that 0 < y and y ≤ T^n(x) for each n. Let z = inf_n T^n(x), so z = lim_n T^n(x) and y ≤ z. Then T(z, z) = lim_n T(T^n(x), T^n(x)) = lim_n T^{2n}(x) = z, a contradiction;

(iii) The set T_n(x) = {y : T^n(y) = x} is closed and non-empty, by continuity of T^n(y) and the boundary values T^n(0) = 0, T^n(1) = 1.
(*) Let r_n(x) = inf T_n(x) ∈ T_n(x). It exists by property (iii);

(iv) r_n(x) is continuous and decreasing: by property (iii), T^n(r_n(x)) = x, so r_n(x) is the right inverse to T^n(x), hence it is continuous and decreasing;

(v) If y < x and 0 < T^n(x), then T^n(y) < T^n(x). Proof by induction;

(vi) T(x, y) < min{x, y}: T(x, y) ≤ T(1, y) = y. Was T(x, y) = y, we would have by induction that T(T^n(x), y) = y, yet by (ii), T^{n_0}(x) < y for some n_0; hence, y = T(T^{n_0}(x), y) ≤ T(y, y) < y, a contradiction;

(vii) r_n(x) < r_{n+1}(x) for 0 < x < 1. Suppose that s > r_n(x) for each n. We have x = T^n(r_n(x)) < T^n(s) for each n. Was s < 1, we would have T^n(s) < 1 for each n and, by property (ii), we would find some m such that T^{mn}(s) = T^m(T^n(s)) < x, a contradiction.
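The iterates T^n(x) and the Archimedean condition T(x, x) < x can be probed numerically for the three classical T-norms. The sketch below evaluates them on a finite grid of rational points in (0, 1); a check on a grid is only an illustration, not a proof, and the names t_luk, t_prod, t_min and power are illustrative.

```python
from fractions import Fraction

def t_luk(x, y):
    """Lukasiewicz T-norm max{0, x + y - 1}."""
    return max(Fraction(0), x + y - 1)

def t_prod(x, y):
    """Product T-norm."""
    return x * y

def t_min(x, y):
    """Minimum T-norm."""
    return min(x, y)

def power(t, n, x):
    """T^n(x) with T^0(x) = 1 and T^(n+1)(x) = T(T^n(x), x)."""
    acc = Fraction(1)
    for _ in range(n):
        acc = t(acc, x)
    return acc

GRID = [Fraction(i, 20) for i in range(1, 20)]   # sample points in (0, 1)

def archimedean_on_grid(t):
    """T(x, x) < x at every sample point."""
    return all(t(x, x) < x for x in GRID)

def non_increasing_on_grid(t, steps=8):
    """Property (i): T^(n+1)(x) <= T^n(x) at every sample point."""
    return all(power(t, n + 1, x) <= power(t, n, x)
               for x in GRID for n in range(steps))

print(archimedean_on_grid(t_luk), archimedean_on_grid(t_prod),
      archimedean_on_grid(t_min))
```

On this grid the Łukasiewicz and product T-norms pass the Archimedean test and the minimum T-norm fails it, since min{x, x} = x.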
Theorem 6.16 The superposition T^m ∘ r_n depends only on the fraction m/n, i.e., T^m ∘ r_n = T^{km} ∘ r_{kn} for each k ≥ 1.

Proof Suppose to the contrary that T^m ∘ r_n ≠ T^{km} ∘ r_{kn} for some k, m, n. Let y = r_n(x), z = r_{kn}(x); then T^k(z) ≥ y by definition (*). Hence, by our assumption, T^k(z) > y. Continuity of T^k implies that y = T^k(w) for some w < z; hence, T^{kn}(w) = T^n(y) = x, i.e., w ∈ T_{kn}(x), a contradiction with w < z and z = r_{kn}(x).

Let a ∈ (0, 1). Define a function h on the set of rational numbers in (0, 1): h(m/n) = T^m(r_n(a)). As a superposition of continuous and decreasing functions, h is continuous and decreasing. The function h satisfies the functional equation T(h(x), h(y)) = h(x + y): let x = m/n, y = k/l. For the common denominator d, write x = c/d, y = e/d; then T(h(x), h(y)) = T(T^c(r_d(a)), T^e(r_d(a))) = T^{c+e}(r_d(a)) = h((c + e)/d) = h(x + y).

By density of the set of rational numbers and continuity of h, the function h extends in a unique way to a continuous and decreasing function g on [0, 1]. Clearly, T(g(x), g(y)) = g(x + y). Let b satisfy g(b) = 0. Then g maps the interval [0, b] onto [0, 1]; hence, there exists a continuous decreasing inverse function f : [0, 1] → [0, b]. Letting x = f(u), y = f(v), we obtain the representation for T: T(u, v) = g(f(u) + f(v)).

Theorem 6.17 (Ling [16]) Each Archimedean T-norm T admits the Hilbert-style representation T(x, y) = g(f(x) + f(y)), where g : [0, a] → [0, 1] is a continuous decreasing function and f is its inverse (called also a pseudo-inverse, when extended to minus ∞ on the left and to ∞ on the right).

The function g is called the generator for T. We find, for instance, the generator for the Łukasiewicz T-norm.

Theorem 6.18 The generator for the T-norm T_L is g(x) = 1 − x with f(y) = 1 − y.

Proof Suppose f(1) = a. Then g(x) = 1 for x ∈ [0, a]; hence, 1 = T_L(1, 1) = g(f(1) + f(1)) = g(2a), so 2a ∈ [0, a], i.e., a = 0 and g maps [0, 1] onto [1, 0]; this makes g(x) = 1 − x a candidate for the generator with f(y) = 1 − y. Inserting these hypothetical functions into the functional equation g(f(x) + f(y)), we get 1 − [(1 − x) + (1 − y)] = x + y − 1 if x + y ≥ 1 and 0 if x + y < 1, i.e., we get T_L(x, y).

The supposition that T is Archimedean is essential for the representation above, as witnessed by the case of the minimum T-norm T_M.
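The representation T(x, y) = g(f(x) + f(y)) can be checked numerically. The sketch below uses g(t) = 1 − t cut off at 0 (the pseudo-inverse convention) and f(y) = 1 − y for the Łukasiewicz T-norm. As a second example that is not taken from the text, it uses the well-known pair f(x) = −ln x, g(t) = exp(−t) for the product T-norm, evaluated in floating point on (0, 1].

```python
import math
from fractions import Fraction

def g_luk(t):
    """Generator 1 - t, cut off at 0 for arguments above 1."""
    return max(Fraction(0), 1 - t)

def f_luk(y):
    """Inverse of the generator on [0, 1]."""
    return 1 - y

def t_luk(x, y):
    """Lukasiewicz T-norm max{0, x + y - 1}."""
    return max(Fraction(0), x + y - 1)

GRID = [Fraction(i, 10) for i in range(11)]
luk_ok = all(g_luk(f_luk(x) + f_luk(y)) == t_luk(x, y)
             for x in GRID for y in GRID)

# Product T-norm on (0, 1]: f(x) = -ln x, g(t) = exp(-t), compared up to rounding.
prod_ok = all(math.isclose(math.exp(-(-math.log(x) - math.log(y))), x * y)
              for x in (0.2, 0.5, 0.9) for y in (0.3, 0.7, 1.0))
print(luk_ok, prod_ok)
```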
Theorem 6.19 In any representation min{x, y} = g(f(x) + f(y)), the function f is neither continuous nor decreasing.

Proof As g ∘ f = id, f is injective. Let f(0) = a, f(1) = b, and suppose a > b. Since 0 = min{x, 0} = g(f(x) + f(0)) for every x, was f continuous, we would have g constantly 0 on the interval [a + b, 2a]. For x close enough to 0, we would have a + b < f(x) + f(x) ≤ 2a, so x = min{x, x} = g(f(x) + f(x)) = 0, a contradiction when x ≠ 0. Hence, f cannot be continuous.

To show that f cannot be decreasing, we apply a theorem of Riemann which states that any monotone function on a closed interval may have at most countably many points of discontinuity (there is a jump at each such point and the family of disjoint open intervals so obtained must be countable). Was f decreasing, it would have a point of continuity a ∈ (0, 1). Let the sequence (a_n)_n be increasing and let lim_n a_n = a. Then the sequence (f(a_n))_n is decreasing and lim_n f(a_n) = f(a). As f(a) > 0, we find a_n such that
(i) f(a) < f(a_n) < 2 · f(a).
Applying to (i) the function g, we obtain a > a_n > g(f(a) + f(a)) = min{a, a} = a, a contradiction. Thus, f cannot be decreasing.

Definition 6.24 (Residua of continuous T-norms) The residuum ⇒_T of a continuous T-norm T is defined as: x ⇒_T y = max{r : T(x, r) ≤ y}.

We recall the already mentioned residua. Suppose x ≤ y. Then T(x, r) ≤ T(x, 1) = x ≤ y; hence, x ⇒_T y = 1. Now let x > y. For the Łukasiewicz T-norm, x + r − 1 ≤ y, i.e., r ≤ 1 − x + y, i.e., x ⇒_L y = 1 − x + y is the Łukasiewicz implication in case x > y. For the product T-norm T_P(x, y) = x · y, the residuum x ⇒_P y takes on the value 1 when x ≤ y and the value y/x when x > y. This implication is called the Goguen implication. For the minimum T-norm T_M(x, y) = min{x, y}, the residuum x ⇒_M y takes on the value 1 when x ≤ y and the value y when x > y. It is the Gödel implication.
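The three residua can be compared against the defining condition x ⇒_T y = max{r : T(x, r) ≤ y}. The sketch below works on a finite grid of rational points, so the search for the maximum is only exact when the true residuum lies on the grid; this holds for the Łukasiewicz and Gödel implications but not in general for the Goguen implication. The dictionary names and helper functions are illustrative.

```python
from fractions import Fraction

ONE = Fraction(1)
GRID = [Fraction(i, 10) for i in range(11)]

T_NORMS = {
    "lukasiewicz": lambda x, y: max(Fraction(0), x + y - 1),
    "product": lambda x, y: x * y,
    "minimum": lambda x, y: min(x, y),
}
RESIDUA = {
    "lukasiewicz": lambda x, y: min(ONE, 1 - x + y),
    "product": lambda x, y: ONE if x <= y else y / x,      # Goguen
    "minimum": lambda x, y: ONE if x <= y else y,          # Goedel
}

def residuum_by_search(t, x, y):
    """max{r : T(x, r) <= y}, restricted to the sample grid."""
    return max(r for r in GRID if t(x, r) <= y)

def adjoint(name):
    """Check that T(x, r) <= y holds exactly when r <= (x => y), on the grid."""
    t, res = T_NORMS[name], RESIDUA[name]
    return all((t(x, r) <= y) == (r <= res(x, y))
               for x in GRID for y in GRID for r in GRID)

print({name: adjoint(name) for name in T_NORMS})
```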
Theorem 6.20 For each T-norm T, the residuum ⇒_T satisfies the following properties:

(i) T(x, y) ≤ z ⇒_T u ≡ x ≤ T(y, z) ⇒_T u;
(ii) y ⇒_T (z ⇒_T u) = T(y, z) ⇒_T u;
(iii) x ⇒_T y = 1 ≡ x ≤ y;
(iv) x ≤ (y ⇒_T u) ≡ y ≤ (x ⇒_T u);
(v) Let us fix a ∈ (0, 1). The function T(x, a) does preserve suprema;
(vi) ⇒_T is non-increasing in the first coordinate and non-decreasing in the second coordinate;
(vii) For a fixed a ∈ (0, 1), a ⇒_T x preserves infima and x ⇒_T a changes suprema into infima;
(viii) x ≤ (x ⇒_T y) ⇒_T y;
(ix) T(x, y) ⇒_T 0 = x ⇒_T (y ⇒_T 0).
Proof As T(x, y) is associative, we obtain a sequence
(a) x ≤ y ⇒_T (z ⇒_T u) ≡ T(x, y) ≤ z ⇒_T u ≡ T(T(x, y), z) ≤ u ≡ T(x, T(y, z)) ≤ u ≡ x ≤ T(y, z) ⇒_T u.
Sequence (a) of equivalences proves (i) and (ii). If x ⇒_T y = 1, then T(x, 1) ≤ y, i.e., x ≤ y, which proves (iii). Commutativity of T implies:
(b) x ≤ y ⇒_T u ≡ T(x, y) ≤ u ≡ T(y, x) ≤ u ≡ y ≤ x ⇒_T u,
which proves (iv). For (v), consider a set S with s = sup S. Then T(s, a) ≥ T(x, a) for each x ∈ S. Thus,
(c) T(sup S, a) ≥ sup_{x ∈ S} T(x, a).
Also,
(d) T(sup S, a) ≤ u ≡ sup S ≤ a ⇒_T u ≡ ∀x ∈ S. x ≤ a ⇒_T u ≡ ∀x ∈ S. T(x, a) ≤ u.
Now, letting u = sup_{x ∈ S} T(x, a), we get T(sup S, a) ≤ sup_{x ∈ S} T(x, a).

For (vi), we address for example the case of the second coordinate, as the proof for the first coordinate goes along the same lines. Let a ≤ b and x ≤ y ⇒_T a, so T(x, y) ≤ a ≤ b and x ≤ y ⇒_T b. Arbitrariness of x proves that y ⇒_T a ≤ y ⇒_T b. Concerning (vii), it follows by (v) and the duality between T and ⇒_T. For (viii), we begin with the obvious y ⇒_T z ≤ y ⇒_T z, so T(y ⇒_T z, y) ≤ z, thus T(y, y ⇒_T z) ≤ z and y ≤ (y ⇒_T z) ⇒_T z. Property (ix) is proved on similar lines: z ≤ T(x, y) ⇒_T 0 ≡ T(z, T(x, y)) ≤ 0, i.e., T(T(z, x), y) ≤ 0; hence, z ≤ x ⇒_T (y ⇒_T 0).

We now begin an interlude on topological properties of T-norms and residua. Properties expressed in terms of infima and suprema can be rendered in the topological language of continuity properties. For a sequence (x_n)_n, the lower limit lim inf x_n is the infimum of limits of all convergent sub-sequences (x_{n_k}); the upper limit lim sup x_n is the supremum of limits of all convergent sub-sequences.
A function f is lower semi-continuous at x_0 if and only if for each sequence (x_n) with lim_n x_n = x_0, f(x_0) ≤ lim inf f(x_n); dually, f is upper semi-continuous at x_0 if and only if f(x_0) ≥ lim sup f(x_n). A simple and effective characterization is as follows: a function f is upper semi-continuous (respectively, lower semi-continuous) if and only if the set {x : f(x) ≥ r} is closed (respectively, the set {x : f(x) ≤ r} is closed) for each r.
Theorem 6.21 (i) Each T-norm T is lower semi-continuous and, conversely, each commutative and associative function T with T(x, 0) = 0, T(x, 1) = x which is lower semi-continuous is a T-norm. (ii) Each residuum is upper semi-continuous and each function which is non-increasing in the first coordinate, non-decreasing in the second coordinate and upper semi-continuous is a residuum for some T-norm.
Proof Let us fix our attention on the case of a T-norm T. Choose a number r and consider the set {(x, y) : T(x, y) ≤ r}. Suppose that there exists a sequence (x_n, y_n)_n with the limit (x_0, y_0) and that T(x_0, y_0) > r, T(x_n, y_n) ≤ r for each n. As T is monotone, either x_n < x_0 or y_n < y_0 for each n. Suppose that (x_n)_n increases to x_0 and (y_n)_n decreases to y_0 (we may assume this by passing to sequences (x_n^+ = max{x_1, x_2, ..., x_n})_n and (y_n^− = min{y_1, y_2, ..., y_n})_n). Then sup T(x_n, y_n) = T(x_0, y_1) ≤ r, which implies by monotonicity of T that T(x_0, y_0) ≤ r, a contradiction. Proofs of the remaining claims go on similar lines.
Further results rest on theorems of (Mostert and Shields [17]) and Faucett [18] concerning characterizations of T-norms T_L and T_P. An element x ≠ 0 is nilpotent with respect to a T-norm T if and only if T^n(x) = 0 for some n, where T^n(x) is the n-fold T-power of x. For instance, for the Łukasiewicz T-norm T_L, we have T_L^n(x) = max{0, nx − n + 1}, hence each x ≤ 1 − 1/n is nilpotent with respect to the given n.
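Nilpotency is easy to observe numerically. The sketch below, with hypothetical helper names t_power and nilpotency_index and exact rational arithmetic, iterates a T-norm on a single argument (taking T^1(x) = x), compares the result with the closed form max{0, nx − n + 1} for T_L, and confirms that the product T-norm has no nilpotent element among the tested values.

```python
from fractions import Fraction

def t_lukasiewicz(x, y):
    return max(Fraction(0), x + y - 1)

def t_product(x, y):
    return x * y

def t_power(t_norm, x, n):
    """n-fold T-power of x: T^1(x) = x, T^(k+1)(x) = T(T^k(x), x)."""
    result = x
    for _ in range(n - 1):
        result = t_norm(result, x)
    return result

def nilpotency_index(t_norm, x, max_n=100):
    """Least n <= max_n with T^n(x) = 0, or None if there is none."""
    for n in range(1, max_n + 1):
        if t_power(t_norm, x, n) == 0:
            return n
    return None
```

For x = 3/4 the least such n under T_L is 4, in agreement with the bound x ≤ 1 − 1/n.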
Theorem 6.22 (Mostert, Shields) Each T-norm T having x = 0, 1 as the only solutions to the equation T(x, x) = x and at least one nilpotent element is equivalent to T_L.
Theorem 6.23 (Faucett) Each T-norm T having x = 0, 1 as the only solutions to the equation T(x, x) = x and having no nilpotent element is equivalent to the T-norm T_P.
We define the notion of an equivalence: T-norms T, T∗ are equivalent if and only if there exists an automorphism Φ : [0, 1] → [0, 1] such that T∗(x, y) = Φ⁻¹(T(Φ(x), Φ(y))). Then, x ⇒_{T∗} y = Φ⁻¹(Φ(x) ⇒_T Φ(y)). We state now an important result on a property of the T-norm T_L (Menu and Pavelka [19]).
Theorem 6.24 (Menu, Pavelka) For each T-norm T, if the residuum ⇒_T is continuous, then the T-norm T is continuous and it is equivalent to the Łukasiewicz T-norm T_L.
Proof Suppose that ⇒_T is continuous. First we check that the following statement holds true.
Claim. y = (y ⇒_T a) ⇒_T a for y ∈ [a, 1] and a ∈ [0, 1).
By Theorem 6.20(viii), y ≤ (y ⇒_T a) ⇒_T a. To prove the converse inequality, observe that the function h(x) = x ⇒_T a is continuous and decreasing with h(a) = 1, h(1) = a, hence, there exists z ≥ a such that y = z ⇒_T a. Then
(a) y = z ⇒_T a ≥ ((z ⇒_T a) ⇒_T a) ⇒_T a = (y ⇒_T a) ⇒_T a
by the fact that the superposition (x ⇒_T a) ⇒_T a is increasing, which implies that y ≥ (y ⇒_T a) ⇒_T a and proves the Claim. The corollary to this result is that the function x ⇒_T a is injective on [a, 1]. From the Claim, it follows for a = 0 that T(x, y) = (T(x, y) ⇒_T 0) ⇒_T 0 = (x ⇒_T (y ⇒_T 0)) ⇒_T 0, which proves that the T-norm T is definable in terms of its residuum and as such is continuous.
Now, for the second part, consider the equation T(x, x) = x. Suppose there is a solution x_0 ≠ 0, 1 and choose c, d such that c ≤ x_0 ≤ d and c = T(x_0, u) for some u. Then T(x_0, c) = T(x_0, T(x_0, u)) = T(T(x_0, x_0), u) = T(x_0, u) = c and c = T(x_0, c) ≤ T(d, c) ≤ T(1, c) = c, so T(d, c) = c. This means that for c < x, the function x ⇒_T c is not injective, a contradiction witnessing that no x ≠ 0, 1 solves the equation T(x, x) = x. Now one can refer to the theorems by Mostert - Shields and Faucett: as only T_L has a continuous residuum, the T-norm T is equivalent to T_L.
6.10 Infinite-Valued Logic [0,1]_L of Łukasiewicz
This logic is based on the Łukasiewicz T-norm and its residuum and on the Łukasiewicz negation. Anticipating the problem of completeness, we note that the proof of completeness for this logic is an intricate one. The announcement in Wajsberg [20] was without proof, and the first proof in (Rose and Rosser [21]) made use of linear functionals on real vector spaces which delineated sets of formulae, an approach close to that of McNaughton referred to below. We prefer to include the main lines of a proof by Chang [7] which makes use of the algebraic technique of MV-algebras. The Łukasiewicz truth functions for negation A∗(¬p) = 1 − A(p) and for implication A∗(p ⊃ q) = min{1, 1 − A(p) + A(q)} form the basis for semantics of the infinite-valued logic [0,1]_L. The set of values W for this logic is a subset of the unit interval [0,1]. As shown by McNaughton, the set W should be a dense subset of the unit interval [0,1] with 0, 1 ∈ W. The obvious candidates are the set Q∗ = Q ∩ [0, 1] of the rational numbers in [0,1] and the whole unit interval [0,1]. We focus on [0,1] as
the set of values and we choose 1 as the designated value of acceptance. As already pointed out, other functors are defined from negation and implication. The symbol T_L denotes the Łukasiewicz T-norm.
Definition 6.25 (Functors of infinite-valued logic [0,1]_L)
(i) p&q is the strong conjunction with the value A∗(p&q) = max{0, A(p) + A(q) − 1};
(ii) p ∧ q is the conjunction with the value A∗(p ∧ q) = min{A(p), A(q)};
(iii) p ∨ q is the disjunction with the value A∗(p ∨ q) = max{A(p), A(q)};
(iv) p ⊻ q is the strong disjunction with the value A∗(p ⊻ q) = min{1, A(p) + A(q)};
(v) p ≡ q is equivalence with the value A∗(p ≡ q) = min{1 − A(p) + A(q), 1 − A(q) + A(p)};
(vi) if x ≤ y, then T_L(x, x ⇒_{T_L} y) = x;
(vii) if x > y, then T_L(x, x ⇒_{T_L} y) = y;
(viii) T_L(x, x ⇒_{T_L} y) = min{x, y}.
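The truth functions of Definition 6.25 translate directly into code. The sketch below uses exact rationals and hypothetical function names (neg, implies, strong_and, and so on); it also lets one confirm that the derived functors agree with their definitions from negation and implication, e.g., that the strong conjunction has the value of ¬(p ⊃ ¬q) and the disjunction the value of (p ⊃ q) ⊃ q.

```python
from fractions import Fraction

ONE = Fraction(1)
ZERO = Fraction(0)

# Primitive truth functions: Lukasiewicz negation and implication.
def neg(a):
    return 1 - a

def implies(a, b):
    return min(ONE, 1 - a + b)

# Functors of Definition 6.25, given by their truth functions.
def strong_and(a, b):
    return max(ZERO, a + b - 1)

def weak_and(a, b):
    return min(a, b)

def weak_or(a, b):
    return max(a, b)

def strong_or(a, b):
    return min(ONE, a + b)

def equiv(a, b):
    return min(1 - a + b, 1 - b + a)

def grid(n=12):
    """Rational truth values 0, 1/n, ..., 1."""
    return [Fraction(k, n) for k in range(n + 1)]
```

For instance, strong_and(Fraction(3, 4), Fraction(1, 2)) evaluates to 1/4, while weak_and on the same arguments gives 1/2.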
The standard definition of equivalence is (p ⊃ q) ∧ (q ⊃ p). The strong conjunction φ&ψ can be expressed as ¬(φ ⊃ ¬ψ).
Definition 6.26 (Syntactic aspects of [0,1]_L) As usual, we use the 'turnstile' symbol ⊢ to signal that a formula is provable in the system, i.e., it admits a proof from axiom schemes plus possibly a set Γ of formulae, in which case we use the symbol Γ ⊢ .... The derivation rule is that of detachment. Łukasiewicz conjectured in (Łukasiewicz and Tarski [22]) a famous set of axiom schemes which would make [0,1]_L a completely axiomatizable system. These axiom schemes are:
(L1) φ ⊃ (ψ ⊃ φ);
(L2) (φ ⊃ ψ) ⊃ [(ψ ⊃ ξ) ⊃ (φ ⊃ ξ)];
(L3) (φ ∨ ψ) ⊃ (ψ ∨ φ). Equivalently: ((φ ⊃ ψ) ⊃ ψ) ⊃ ((ψ ⊃ φ) ⊃ φ);
(L4) (¬φ ⊃ ¬ψ) ⊃ (ψ ⊃ φ);
(L5) (φ ⊃ ψ) ∨ (ψ ⊃ φ).
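That each scheme (L1)–(L5) takes the designated value 1 under every assignment can be spot-checked on a grid of rational truth values. The sketch below is an illustration under stated assumptions: (L1) is read as φ ⊃ (ψ ⊃ φ), the form in which it is derived from BL in Theorem 6.25(i), the disjunction is computed through its definiens (φ ⊃ ψ) ⊃ ψ, and the names AXIOMS and failures are hypothetical.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)

def neg(a):
    return 1 - a

def imp(a, b):
    return min(ONE, 1 - a + b)

def lor(a, b):
    # Disjunction via its definiens (a ⊃ b) ⊃ b; its value is max{a, b}.
    return imp(imp(a, b), b)

# Truth functions of the axiom schemes; p, q, r stand for φ, ψ, ξ.
AXIOMS = {
    "L1": lambda p, q, r: imp(p, imp(q, p)),
    "L2": lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
    "L3": lambda p, q, r: imp(lor(p, q), lor(q, p)),
    "L4": lambda p, q, r: imp(imp(neg(p), neg(q)), imp(q, p)),
    "L5": lambda p, q, r: lor(imp(p, q), imp(q, p)),
}

def failures(n=8):
    """Pairs (scheme name, assignment) on the grid where the value is not 1."""
    values = [Fraction(k, n) for k in range(n + 1)]
    return [
        (name, v)
        for name, scheme in AXIOMS.items()
        for v in product(values, repeat=3)
        if scheme(*v) != 1
    ]
```

An empty list returned by failures() is consistent with soundness of the schemes; it is a finite check, not a substitute for the argument that the schemes are valid on the whole interval [0,1].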
The scheme (L5) was shown redundant in Meredith [23] and in Chang [24]. As shown in Hájek [5], the Łukasiewicz system (L1)–(L4) is equivalent to the system BL + ¬¬φ ⊃ φ, which we will denote by BLDN. Let us consider relations between BL and [0,1]_L on the level of axiom systems in Hájek [5].
Theorem 6.25 The following are relations among axiom schemes for BL and axiom schemes for [0,1]_L.
(i) BL implies (L1): the axiom schema (A2) gives (φ&ψ) ⊃ φ; the axiom schema (A5): [(φ&ψ) ⊃ φ] ⊃ (φ ⊃ (ψ ⊃ φ)) and detachment yield (L1).
(ii) (L2) is (A1).
(iii) For (L3): as BL ⊢ φ ⊃ ¬¬φ (cf. Hájek [5], 2.2.(17)), it follows that BLDN ⊢ φ ≡ ¬¬φ and by (Hájek [5], 2.2.(18')) BL ⊢ (φ ⊃ ψ) ⊃ (¬ψ ⊃ ¬φ).
(iv) For (L4): from the axiom schema (A4) we get ¬φ&(¬φ ⊃ ¬ψ) ⊃ ¬ψ&(¬ψ ⊃ ¬φ), hence, ¬φ&(ψ ⊃ φ) ⊃ ¬ψ&(φ ⊃ ψ), so we get ¬(¬ψ&(φ ⊃ ψ)) ⊃ ¬(¬φ&(ψ ⊃ φ)), which yields ((φ ⊃ ψ) ⊃ ψ) ⊃ ((ψ ⊃ φ) ⊃ φ).
In particular, the deduction theorem proved for BL is valid for [0,1]_L. Axioms (L1)–(L4) were shown to axiomatize completely the system [0,1]_L in (Rose and Rosser [21]) and in Chang [7]; the announcement of the completeness for (L1)–(L5) was given in Wajsberg [20]. We will present a proof on the lines of Chang's, due to (Cignoli and Mundici [25]).
Theorem 6.26 The following inferences and formulae are among valid inferences and provable formulae of the system [0,1]_L.
(i) If Γ ⊢ p ⊃ q, Γ ⊢ q ⊃ r, then Γ ⊢ p ⊃ r; It follows by axiom schema (L2) and detachment.
(ii) If Γ ⊢ (p ≡ q) and Γ ⊢ (q ≡ r), then Γ ⊢ p ≡ r; By Theorem 6.25(vii) and definition of equivalence.
(iii) p ⊃ q, q ⊃ r ⊢ p ⊃ r; By (i).
(iv) (p ∨ q) ≡ (q ∨ p); By (L3).
(v) If Γ ⊢ (p ≡ q), then Γ ⊢ [(p ≡ r) ≡ (q ≡ r)]; By (L2) and detachment, we obtain (a) p ⊃ q ⊢ (q ⊃ r) ⊃ (p ⊃ r) and (b) q ⊃ p ⊢ (p ⊃ r) ⊃ (q ⊃ r).
(vi) p ⊃ (q ∨ p);
By substitution (q/q ⊃ p) into (L1);
(vii) p ⊃ p ∨ q; By (vi) and (L3).
(viii) (p ⊃ (q ⊃ r)) ⊃ (q ⊃ (p ⊃ r)); By (vii), we obtain q ⊃ (q ∨ r) and by (L2) we obtain (c) ((q ∨ r) ⊃ (p ⊃ r)) ⊃ (q ⊃ (p ⊃ r)). Substitution (q/q ⊃ r) into (c) and an application of (L2) yield (d) (p ⊃ (q ⊃ r)) ⊃ ((q ∨ r) ⊃ (p ⊃ r)). From (c) and (d), (viii) follows by means of (L1);
(ix) (q ⊃ r) ⊃ [(p ⊃ q) ⊃ (p ⊃ r)]; By (viii) and (L2);
(x) Γ ⊢ (p ≡ q) implies Γ ⊢ (r ⊃ p) ≡ (r ⊃ q); Substitute (p/q; q/p) into (viii) to obtain (e) (p ⊃ (q ⊃ r)) ≡ (q ⊃ (p ⊃ r)) and then substitute (p/q; q/r; r/p) into (L2);
(xi) p ⊃ p; Substitute (q/p) into (viii) and apply (L1) to obtain (f) q ⊃ (p ⊃ p). Substitute for q any already proved formula and apply detachment;
(xii) p ≡ p; By (xi).
(xiii) (p q) ≡ (p ⊃ q); By (L1), into which the substitutions (p/q; q/p) have been made, and by (vii), in which the functor ∨ was replaced by its definiens (p ⊃ q) ⊃ q;
(xiv) q ≡ (q ∨ q); Substitute (p/q ⊃ q) into (xiii) and apply (xi);
(xv) p ⊃ r, q ⊃ r ⊢ (p ∨ q) ⊃ r; By (L2), (g) ((q ⊃ r) ⊃ (p ⊃ r)) ⊃ ((p ∨ r) ⊃ (q ∨ r)) and (L2) applied to (g) yields (h) (p ⊃ q) ⊃ ((p ∨ r) ⊃ (q ∨ r)) and, by (L3), we obtain (j) (p ⊃ q) ⊃ ((r ∨ p) ⊃ (r ∨ q)).
This implies
(k) (p ⊃ r) ⊃ ((p ∨ q) ⊃ (r ∨ q))
and
(l) (q ⊃ s) ⊃ ((r ∨ q) ⊃ (r ∨ s)).
We digress here to recall from (Rosser and Turquette [13], 6.12. Claim 3) the valid formula
(∗) (q ⊃ r) ⊃ (Γ_{i=1}^m p_i q ⊃ Γ_{i=1}^m p_i r).
From (∗), we obtain
(∗∗) Γ_{i=1}^m p_i q, Γ_{i=1}^n s_i (q ⊃ r) ⊢ Γ_{i=1}^m p_i Γ_{i=1}^n s_i r.
Indeed, suppose (ι) Γ_{i=1}^n s_i (q ⊃ r) and obtain via (viii) (κ) q ⊃ Γ_{i=1}^n s_i r. Then (∗) yields
Γ_{i=1}^m p_i q ⊃ Γ_{i=1}^m p_i Γ_{i=1}^n s_i r.
Now, by (∗∗), we obtain from (k) and (l) that
(m) (p ⊃ r) ⊃ ((p ∨ q) ⊃ ((q ⊃ s) ⊃ (r ∨ s))),
which by (viii) yields
(γ) (p ⊃ r) ⊃ ((q ⊃ s) ⊃ ((p ∨ q) ⊃ (r ∨ s))).
By applying (xiv) to (γ), we conclude the proof;
(xvi) ¬¬p ⊃ p; Begin with (L1) to obtain (α): (¬¬p) ⊃ ((¬¬q) ⊃ (¬¬p)) by substitutions (p/¬¬p; q/¬¬q). Substitute in (L4) (p/¬p; q/¬q) and apply to (α) to obtain (β): (¬¬p) ⊃ ((¬p) ⊃ (¬q)). Apply again (L4) to (β) to obtain (δ): (¬¬p) ⊃ (q ⊃ p). (ix) applied to (δ) yields (η): q ⊃ (¬¬p ⊃ p) and by substitution in (η) of (xi) for q, i.e., (q/p ⊃ p), we obtain (xvi) by detachment.
(xvii) p ⊃ ¬¬p; Apply (ix) to (xvi) to obtain (λ): (p ⊃ ¬q) ⊃ ((¬¬p) ⊃ (¬q)) and apply (L4) to (λ) to obtain (μ): (p ⊃ ¬q) ⊃ (q ⊃ ¬p). Substitution (p/q; q/p) in (xvi) and repetition of the above steps yield (ν): (q ⊃ ¬p) ⊃ (p ⊃ ¬q), and (μ) and (ν) yield (π): (p ⊃ ¬q) ≡ (q ⊃ ¬p). Substitution (q/¬p) in (π) yields p ⊃ ¬¬p.
(xviii) p ≡ ¬¬p; By (xvi) and (xvii);
(xix) (p ⊃ q) ≡ (¬q ⊃ ¬p);
Obtain (ρ): (p ⊃ q) ≡ (p ⊃ ¬¬q) by (xviii) and (x). Then go to the proof of (xvii) and substitute (q/¬q) in formula (π) to obtain (σ): (p ⊃ ¬¬q) ≡ ((¬q) ⊃ (¬p)). From (ρ) and (σ) derive (xix).
(xx) Γ ⊢ p ≡ q implies Γ ⊢ (¬p) ≡ (¬q); By (xix).
6.11 Wajsberg Algebras
Definition 6.27 (Wajsberg algebras) A Wajsberg algebra, see (Font et al. [26]), is an algebra W = (L, ⊃, ¬, 1) which satisfies the following conditions:
(WA1) 1 ⊃ x = x;
(WA2) (x ⊃ y) ⊃ [(y ⊃ z) ⊃ (x ⊃ z)] = 1;
(WA3) (x ⊃ y) ⊃ y = (y ⊃ x) ⊃ x;
(WA4) (¬x ⊃ ¬y) ⊃ (y ⊃ x) = 1.
(WA1) and (WA4) are direct renderings of Wajsberg's axioms for 3_L.
Theorem 6.27 The following are properties of Wajsberg algebras.
(i) (x ⊃ x) = 1; By (WA1), 1 ⊃ 1 = 1, by (WA2), 1 ⊃ (x ⊃ x) = 1, and the substitution (x/x ⊃ x) in (WA1) yields x ⊃ x = 1 ⊃ (x ⊃ x), hence, x ⊃ x = 1;
(ii) If (x ⊃ y) = (y ⊃ x) = 1, then x = y; (WA1) and (WA3) yield x = 1 ⊃ x = (y ⊃ x) ⊃ x = (x ⊃ y) ⊃ y = 1 ⊃ y = y. This property reveals the origins of the Wajsberg algebra as the Tarski-Lindenbaum algebra of the 3-valued logic axiomatized by Wajsberg.
(iii) x ⊃ 1 = 1; By (WA3), then by (WA1) and (i), we get (x ⊃ 1) ⊃ 1 = (1 ⊃ x) ⊃ x = x ⊃ x = 1. Using the obtained identity (x ⊃ 1) ⊃ 1 = 1, by (WA1) and (WA2), we obtain 1 = (1 ⊃ x) ⊃ [(x ⊃ 1) ⊃ (1 ⊃ 1)] = x ⊃ [(x ⊃ 1) ⊃ 1] = x ⊃ 1.
(iv) x ⊃ (y ⊃ x) = 1; By (WA1), (WA2), (iii), in some order, we get 1 = (y ⊃ 1) ⊃ [(1 ⊃ x) ⊃ (y ⊃ x)] = 1 ⊃ [x ⊃ (y ⊃ x)] = x ⊃ (y ⊃ x). This is yet another of Wajsberg's axiom schemes.
(v) If x ⊃ y = 1 and y ⊃ z = 1, then x ⊃ z = 1; Suppose that x ⊃ y = 1 = y ⊃ z and apply (WA2) to get (x ⊃ y) ⊃ [(y ⊃ z) ⊃ (x ⊃ z)] = 1, i.e., 1 = 1 ⊃ [1 ⊃ (x ⊃ z)], hence, by (WA1), 1 = 1 ⊃ (x ⊃ z), and (WA1) applied again yields 1 = x ⊃ z;
(vi) If x ⊃ (y ⊃ z) = 1 then y ⊃ (x ⊃ z) = 1; Suppose that x ⊃ (y ⊃ z) = 1. Substitution (y/y ⊃ z) in (WA2) yields 1 = [x ⊃ (y ⊃ z)] ⊃ {[(y ⊃ z) ⊃ z] ⊃ (x ⊃ z)} = 1 ⊃ {[(y ⊃ z) ⊃ z] ⊃ (x ⊃ z)}. By applying (WA1) and (WA3), we obtain [(z ⊃ y) ⊃ y] ⊃ (x ⊃ z) = 1. (iv) yields y ⊃ [(z ⊃ y) ⊃ y] = 1. Substitute in (v): (x/y; y/(z ⊃ y) ⊃ y; z/x ⊃ z). We obtain y ⊃ (x ⊃ z) = 1;
(vii) (x ⊃ y) ⊃ [(z ⊃ x) ⊃ (z ⊃ y)] = 1; By (vi) and (WA2).
(viii) x ⊃ (y ⊃ z) = y ⊃ (x ⊃ z); (WA3) and (iv) yield (a) y ⊃ [(y ⊃ z) ⊃ z] = y ⊃ [(z ⊃ y) ⊃ y] = 1. By (vii), (b) [(y ⊃ z) ⊃ z] ⊃ {[x ⊃ (y ⊃ z)] ⊃ (x ⊃ z)} = 1. Application of (v) obtains from (a), (b): (c) y ⊃ {[x ⊃ (y ⊃ z)] ⊃ (x ⊃ z)} = 1. By (vi), (d) [x ⊃ (y ⊃ z)] ⊃ [y ⊃ (x ⊃ z)] = 1 follows. By symmetry, (e) [y ⊃ (x ⊃ z)] ⊃ [x ⊃ (y ⊃ z)] = 1. The result follows from (d) and (e) by (ii).
These are basic conclusions from (WA1)–(WA3). As (WA4) is concerned with negation, we now recall some properties involving negation. The basic ones are:
(ix) (¬1) ⊃ x = 1; By (iv), (¬1) ⊃ (¬x ⊃ ¬1) = 1, hence, by (WA4), (¬1) ⊃ (1 ⊃ x) = 1 and (WA1) yields (¬1) ⊃ x = 1;
(x) (¬x) = x ⊃ ¬1;
By (WA1) and (WA4), (a) (¬x ⊃ ¬1) ⊃ x = (¬x ⊃ ¬1) ⊃ (1 ⊃ x) = 1. (iv) implies (b) (¬x) ⊃ (¬¬1 ⊃ ¬x) = 1. By (WA4), we obtain (c) ((¬¬1) ⊃ ¬x) ⊃ (x ⊃ ¬1) = 1. From (b) and (c), by (v), one obtains (d) (¬x) ⊃ (x ⊃ ¬1) = 1. By (viii), (d) implies (e) x ⊃ ((¬x) ⊃ ¬1) = 1. (ii) and (a) together with (e) yield (f) x = (¬x) ⊃ ¬1. Now, (WA1), (WA3), (f) and (ix) yield (x ⊃ ¬1) = ((¬x ⊃ ¬1) ⊃ ¬1) = (¬1 ⊃ ¬x) ⊃ ¬x = 1 ⊃ ¬x = ¬x.
(xi) ¬¬x = x; By (x), (WA1) and (WA3), (¬¬x) = (x ⊃ ¬1) ⊃ ¬1 = (¬1 ⊃ x) ⊃ x = 1 ⊃ x = x;
(xii) x ⊃ y = ¬y ⊃ ¬x; (WA4) and (xi) imply 1 = (¬¬x ⊃ ¬¬y) ⊃ (¬y ⊃ ¬x) = (x ⊃ y) ⊃ (¬y ⊃ ¬x). The converse implication is (WA4) and then the thesis follows by (ii).
MV-algebras provide an environment for the Chang completeness proof. We will see their important interrelations with Wajsberg algebras in what follows.
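The unit interval with x ⊃ y = min{1, 1 − x + y} and ¬x = 1 − x is the motivating example of a Wajsberg algebra. The sketch below checks (WA1)–(WA4), and a few of the derived properties of Theorem 6.27, on a grid of rationals. It assumes (WA4) in the form (¬x ⊃ ¬y) ⊃ (y ⊃ x) = 1, which is how the axiom is applied in the proof of (ix); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)

def imp(x, y):
    return min(ONE, 1 - x + y)

def neg(x):
    return 1 - x

def triples(n=6):
    values = [Fraction(k, n) for k in range(n + 1)]
    return product(values, repeat=3)

def wajsberg_axioms_hold(n=6):
    """Check (WA1)-(WA4) for all grid triples (x, y, z)."""
    for x, y, z in triples(n):
        wa1 = imp(ONE, x) == x
        wa2 = imp(imp(x, y), imp(imp(y, z), imp(x, z))) == 1
        wa3 = imp(imp(x, y), y) == imp(imp(y, x), x)
        wa4 = imp(imp(neg(x), neg(y)), imp(y, x)) == 1
        if not (wa1 and wa2 and wa3 and wa4):
            return False
    return True

def derived_properties_hold(n=6):
    """Check items (iii), (iv), (viii), (x), (xi), (xii) of Theorem 6.27."""
    for x, y, z in triples(n):
        checks = (
            imp(x, ONE) == 1,                          # (iii)
            imp(x, imp(y, x)) == 1,                    # (iv)
            imp(x, imp(y, z)) == imp(y, imp(x, z)),    # (viii)
            neg(x) == imp(x, neg(ONE)),                # (x)
            neg(neg(x)) == x,                          # (xi)
            imp(x, y) == imp(neg(y), neg(x)),          # (xii)
        )
        if not all(checks):
            return False
    return True
```

Both functions are expected to return True on every grid; a failure would point to a misread axiom rather than to a defect of the standard algebra.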
6.12 MV-Algebras
Definition 6.28 (MV algebras) An MV-algebra is an algebra (A, ⊕, ¬, 0) with a binary ⊕, a unary ¬ and a constant 0. This algebra should satisfy the following conditions, see Chang [7], Mundici [27], Mundici et al. [28].
(MV1) x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z;
(MV2) x ⊕ y = y ⊕ x;
(MV3) x ⊕ 0 = x;
(MV4) ¬¬x = x;
(MV5) x ⊕ (¬0) = ¬0;
(MV6) ¬(¬x ⊕ y) ⊕ y = ¬(¬y ⊕ x) ⊕ x.
We are interested in MV-algebras thanks to Chang's proof of completeness for the infinite-valued logic of Łukasiewicz. For this reason, an especially interesting example of an MV-algebra is the unit interval [0,1] with ⊕ as min{1, x + y}, i.e., the Łukasiewicz T-co-norm S_L, ¬ as 1 − x, i.e., the Łukasiewicz negation, and with 0 as 0 ∈ [0, 1]. In [0,1], we have x ⊙ y = max{0, x + y − 1}, the residuum x ⇒ y = min{1, 1 − x + y}, x ⊖ y = max{0, x − y}, the ordering x ≤ y being the natural ordering on the unit interval [0,1], and 0, 1 ∈ [0, 1].
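The claim that [0,1] with the truncated sum and the negation 1 − x satisfies (MV1)–(MV6) can be spot-checked as follows. This is a sketch under the assumption that the binary operation is min{1, x + y}, as stated for the unit interval; the names oplus, odot, ominus and mv_axioms_hold are hypothetical, and the derived operations are computed from the primitive ones rather than from their closed forms.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)
ZERO = Fraction(0)

def oplus(x, y):
    """Truncated sum: the Lukasiewicz T-co-norm on [0, 1]."""
    return min(ONE, x + y)

def neg(x):
    return 1 - x

def odot(x, y):
    """Derived product: negation of the truncated sum of negations."""
    return neg(oplus(neg(x), neg(y)))

def ominus(x, y):
    """Derived difference: product of x with the negation of y."""
    return odot(x, neg(y))

def mv_axioms_hold(n=6):
    values = [Fraction(k, n) for k in range(n + 1)]
    for x, y, z in product(values, repeat=3):
        checks = (
            oplus(x, oplus(y, z)) == oplus(oplus(x, y), z),                   # (MV1)
            oplus(x, y) == oplus(y, x),                                       # (MV2)
            oplus(x, ZERO) == x,                                              # (MV3)
            neg(neg(x)) == x,                                                 # (MV4)
            oplus(x, neg(ZERO)) == neg(ZERO),                                 # (MV5)
            oplus(neg(oplus(neg(x), y)), y) == oplus(neg(oplus(neg(y), x)), x),  # (MV6)
        )
        if not all(checks):
            return False
    return True
```

On the grid, odot and ominus agree with the closed forms max{0, x + y − 1} and max{0, x − y} quoted for the unit interval.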
It would be desirable to introduce into MV-algebras other constructs, already announced, known from many-valued logics of Łukasiewicz.
Definition 6.29 (Secondary operators in MV-algebras) The following operators augment MV-algebras.
(OT) x ⊙ y = ¬(¬x ⊕ ¬y);
(OM) x ⊖ y = x ⊙ ¬y;
(U) 1 = ¬0.
In our particular case, x ⊙ y = max{0, x + y − 1}, i.e., it is the Łukasiewicz T-norm T_L, denoted earlier as L, and x ⊖ y = max{0, x − y}, and 1 ∈ [0, 1].
Theorem 6.28 We have further relations.
(A) ¬1 = 0; By (MV4).
(B) x ⊕ y = ¬(¬x ⊙ ¬y); By (MV4).
(C) x ⊕ ¬x = 1; By (MV6) and 1 = ¬0.
An ordering in an MV-algebra is introduced via
(D) x ≤ y if and only if ¬x ⊕ y = 1.
Theorem 6.29 The following are equivalent.
(i) ¬x ⊕ y = 1;
(ii) x ⊙ ¬y = 0;
(iii) y = x ⊕ (y ⊖ x);
(iv) y = x ⊕ z for some z.
Proof Suppose (i) holds. By (OT) and (MV4),
¬x ⊕ y = ¬(¬¬x ⊙ ¬y) = ¬(x ⊙ ¬y).
By (A), x ⊙ ¬y = 0, which is (ii). The proof of the converse consists in reading backwards the proof from (i) to (ii). Suppose (ii) and rewrite (MV6) as (x ⊖ y) ⊕ y = (y ⊖ x) ⊕ x. As x ⊖ y = 0, we obtain y = x ⊕ (y ⊖ x), i.e., (iii) holds true. (iv) follows from (iii), and (iv) implies directly (i).
Theorem 6.30 The ordering ≤ is a partial ordering, i.e., it is reflexive, weakly anti-symmetric and transitive.
Proof Reflexivity means x ≤ x, i.e., ¬x ⊕ x = 1, which is (C). Weak anti-symmetry follows by (MV6) and the identities x ⊖ y = 0 = y ⊖ x. Transitivity follows by (iv) in Theorem 6.29.
Partial ordering ≤ implies that the complement ¬x is the element z that satisfies the system of equations: (i) z ⊙ x = 0, (ii) z ⊕ x = 1. Indeed, in the language of partial ordering ≤, these two conditions come down to a double inequality ¬x ≤ z ≤ ¬x, hence z = ¬x.
Theorem 6.31 We collect here important properties of the partial ordering ≤.
(i) x ≤ y if and only if ¬y ≤ ¬x;
(ii) x ≤ y implies x ⊕ z ≤ y ⊕ z and x ⊙ z ≤ y ⊙ z;
(iii) x ⊙ y ≤ z if and only if x ≤ ¬y ⊕ z.
Proof For (i), by (MV4),
x ≤ y ≡ ¬x ⊕ y = 1 ≡ ¬x ⊕ ¬¬y = 1 ≡ ¬y ≤ ¬x.
(ii) follows by (iv) in Theorem 6.29. For (iii), by (OT),
x ⊙ y ≤ z ≡ ¬(x ⊙ y) ⊕ z = 1 ≡ ¬x ⊕ ¬y ⊕ z = 1,
which means by definition that x ≤ ¬y ⊕ z.
Theorem 6.31(iii) reminds us of the duality between T-norms and their residua. Each MV-algebra bears the structure of a lattice. The join and the meet in an MV-algebra are expressed in the following theorem.
Theorem 6.32 The join in any MV-algebra is expressed as (i) x ∨ y = (x ⊖ y) ⊕ y and the meet is rendered as (ii) x ∧ y = ¬(¬x ∨ ¬y).
Proof The proof should show that x ∨ y is the least upper bound of {x, y} under the ordering ≤. By Theorem 6.29, y ≤ (x ⊖ y) ⊕ y and x ≤ (x ⊖ y) ⊕ y as, by (MV6), (x ⊖ y) ⊕ y = (y ⊖ x) ⊕ x. Let x ≤ z and y ≤ z. By Theorem 6.29, (a) ¬x ⊕ z = 1 and (b) z = (z ⊖ y) ⊕ y. The rest is a tedious calculation: we need to show that ¬[(x ⊖ y) ⊕ y] ⊕ z = 1. This is obtained by substituting the right-hand side of (b) for z, applying (MV6) together with definitions of operators and finally using (a). The proof for the meet follows by duality.
Each MV-algebra possesses distributivity properties.
Theorem 6.33 The following hold in each MV-algebra. (i) x ⊙ (y ∨ z) = (x ⊙ y) ∨ (x ⊙ z); (ii) x ⊕ (y ∧ z) = (x ⊕ y) ∧ (x ⊕ z).
The algebraic completeness proof by Chang exploits the separation theorem and it is immaterial whether it is formulated in the language of filters or ideals.
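In the MV-algebra on [0,1], the lattice operations of Theorem 6.32 reduce to max and min, which gives a quick sanity check of the formulas. The sketch below assumes the standard operations on the unit interval (truncated sum, truncated difference, product max{0, x + y − 1}); the names join, meet and lattice_laws_hold are hypothetical.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)
ZERO = Fraction(0)

def oplus(x, y):
    return min(ONE, x + y)

def neg(x):
    return 1 - x

def odot(x, y):
    return max(ZERO, x + y - 1)

def ominus(x, y):
    return max(ZERO, x - y)

def join(x, y):
    """Join as in Theorem 6.32(i): truncated difference added back to y."""
    return oplus(ominus(x, y), y)

def meet(x, y):
    """Meet as in Theorem 6.32(ii): dual of the join under negation."""
    return neg(join(neg(x), neg(y)))

def lattice_laws_hold(n=6):
    values = [Fraction(k, n) for k in range(n + 1)]
    for x, y, z in product(values, repeat=3):
        checks = (
            join(x, y) == max(x, y),
            meet(x, y) == min(x, y),
            odot(x, join(y, z)) == join(odot(x, y), odot(x, z)),     # distributivity over join
            oplus(x, meet(y, z)) == meet(oplus(x, y), oplus(x, z)),  # distributivity over meet
        )
        if not all(checks):
            return False
    return True
```

For example, join(Fraction(1, 3), Fraction(2, 3)) evaluates to 2/3 and meet on the same pair to 1/3.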
6.13 Ideals in MV-Algebras
Definition 6.30 (MV-ideals) An ideal in an MV-algebra (A, ⊕, ¬, 0) is a set I ⊆ A such that
(I1) 0 ∈ I;
(I2) If y ∈ I and x ≤ y, then x ∈ I;
(I3) If x, y ∈ I then x ⊕ y ∈ I.
A filter F is a dual to the ideal I, as F = {x ∈ A : ¬x ∈ I}. It follows that it is immaterial in what language one is carrying the discussion, ideals or filters. The intersection of a family of ideals is an ideal; in particular, for a subset X ⊆ A, there exists a minimal ideal I(X) containing X. Its explicit form is given as
I(X) = {y ∈ A : ∃x_1, x_2, ..., x_k ∈ X. y ≤ x_1 ⊕ x_2 ⊕ ... ⊕ x_k}
for some natural number k. We denote by kz the expression z ⊕ z ⊕ ... ⊕ z in which the symbol z occurs k times. Then, the principal ideal I(z) is given as I(z) = {y ∈ A : y ≤ kz for some k}. Combining the last two results, we get the formula for a minimal ideal J(I, x) containing an ideal I and an element x ∈ A:
J(I, x) = {y ∈ A : y ≤ (mx) ⊕ z : z ∈ I, for some m}.
An ideal I is proper if and only if 1 ∉ I. An ideal I is prime if and only if it is proper and x ⊖ y ∈ I or y ⊖ x ∈ I for each pair x, y ∈ A. The separation property by ideals is dual to this property for filters. It is the crucial property of ideals/filters, essential in algebraic proofs of completeness.
Theorem 6.34 (The separation property) For each proper ideal I, if an element x is not in I, then there exists a prime ideal J such that I ⊆ J and x ∉ J.
Proof By the Zorn maximal principle, there exists an ideal J maximal with respect to the properties I ⊆ J and x ∉ J. Suppose that J is not prime, hence, neither z ⊖ y ∈ J nor y ⊖ z ∈ J for some y, z. Then x ∈ J(J, z ⊖ y) and x ∈ J(J, y ⊖ z), hence, x ≤ k(z ⊖ y) ⊕ u and x ≤ l(y ⊖ z) ⊕ w for some u, w ∈ J. Then, t = u ⊕ w ∈ J and x ≤ t ⊕ m(z ⊖ y) and x ≤ t ⊕ m(y ⊖ z) for any m ≥ k, l, hence,
x ≤ [t ⊕ m(z ⊖ y)] ∧ [t ⊕ m(y ⊖ z)] = t ⊕ [m(y ⊖ z) ∧ m(z ⊖ y)] = t,
so x ∈ J, a contradiction.
An MV-algebra is an MV-chain if the universe A is linearly ordered by the ordering ≤. In MV-chains, prime ideals are especially well positioned.
Theorem 6.35 In each MV-chain, every proper ideal I is prime.
Proof As x ≤ y or y ≤ x, we have x ⊖ y = 0 ∈ I or y ⊖ x = 0 ∈ I.
Definition 6.31 (Homomorphisms of MV-algebras) A homomorphism h from an MV-algebra W into an MV-algebra T is a mapping h : W → T preserving operations and constants, i.e., (h1) h(x ⊕ y) = h(x) ⊕ h(y); (h2) h(¬x) = ¬h(x); (h3) h(0) = 0. It follows that the image h(W) is an MV-sub-algebra of T. From standard set-theoretic facts the following follows.
Theorem 6.36
(i) The kernel Ker(h) = h⁻¹(0) is an ideal;
(ii) more generally, if J is an ideal in T then h⁻¹(J) is an ideal in W;
(iii) x ⊖ y ∈ Ker(h) if and only if h(x) ≤ h(y);
(iv) The ideal Ker(h) is prime if and only if T ≠ {0} and the image h(W) is an MV-chain.
Definition 6.32 (Congruences modulo ideal) The congruence ∼_I induced on an MV-algebra W by an ideal I is defined as
(Cong) (x ∼_I y) ≡ (x ⊖ y) ⊕ (y ⊖ x) ∈ I.
The congruence ∼_I is an equivalence relation (reflexivity and symmetry are quite obvious, transitivity follows by the property (x ⊖ z) ≤ (x ⊖ y) ⊕ (y ⊖ z)), which satisfies additional properties
(Cong1) if x ∼_I y, then x ⊕ z ∼_I y ⊕ z;
(Cong2) if x ∼_I y, then ¬x ∼_I ¬y;
(Cong3) if x ∼_I 0, then x ∈ I.
We denote by the symbol [x]∼_I the equivalence class of x and by A/∼_I the collection of all equivalence classes. By the congruence property, the canonical mapping h : A → A/∼_I which sends x ∈ A to [x]∼_I ∈ A/∼_I is a homomorphism.
6.14 The Chang Representation Theorem
We begin with the notion of a Cartesian product PF of a family F = {A_s : s ∈ S} of MV-algebras. We denote this product by the symbol P A_s for short. Elements of the product are threads x_S = ⟨x_s⟩_{s∈S} where x_s ∈ A_s for each s ∈ S. The MV-operations are performed coordinate-wise: ⟨x_s⟩_{s∈S} ⊕ ⟨y_s⟩_{s∈S} = ⟨x_s ⊕_s y_s⟩_{s∈S} and analogously for ⊙, ⊖, ¬. The projection π_s onto the coordinate s is given by π_s(⟨x_s⟩_{s∈S}) = x_s.
Definition 6.33 (Sub-direct products) We say that an MV-algebra A is a sub-direct product of the family F = {A_s : s ∈ S} if there exists an injective homomorphism h : A → P A_s such that the composition π_s ◦ h : A → A_s is a surjective homomorphism for each s.
A criterion for a sub-direct representation of an MV-algebra is given below.
Theorem 6.37 If ℐ is a family of ideals such that (i) A is isomorphic to A/∼_I for each I ∈ ℐ, (ii) ⋂ℐ = {0}, then A is a sub-direct product of quotient MV-algebras A/∼_I. The converse is also true.
Proof Suppose that ℐ is a family of ideals that satisfy (i), (ii). Denote by h_I : A → A/∼_I the isomorphism existing by (i) for each I and let h : A → P A/∼_I be defined as h(x) = ⟨h_I(x)⟩_I. By (ii), h is injective.
We may now state the Chang representation theorem, Chang [7].
Theorem 6.38 (Chang) Each MV-algebra A is a sub-direct product of MV-chains.
Proof By Theorem 6.36(iv) and the separation theorem, it is enough to take as the family of ideals the set P(A) of all prime ideals on A as, by the separation theorem, ⋂P(A) = {0}.
Let us observe that each MV-chain is a distributive lattice with the meet x ∧ y = min{x, y} and the join x ∨ y = max{x, y}. Theorem 6.38 tells us that any equation holds in all MV-algebras if and only if it holds in all MV-chains. The next result is a breakthrough: any equation holds in all MV-algebras if and only if it holds in the MV-algebra on [0,1]. We state the crucial result by Chang. The proof comes from (Cignoli and Mundici [25]).
6.15 The Chang Completeness Theorem Theorem 6.39 (Chang [7]) For each equation t = 0 in the language of MV-algebras, it holds true in each MV-algebra if and only if it holds in the MV-algebra [0,1] which is the Łukasiewicz residuated lattice. We begin with algebraic prerequisites essential for the proof.
Definition 6.34 (Lattice-ordered abelian groups (ℓ-groups)) For such a group G and 0 < u, u ∈ G, [0, u] denotes the segment {x ∈ G : 0 ≤ x ≤ u}. For x, y ∈ [0, u], we define operators: x ⊕ y = u ∧ (x + y), ¬x = u − x. Then, ([0, u], ⊕, ¬, 0) is an MV-algebra, denoted Γ(G, u); in particular, Γ(R, 1) is the Łukasiewicz MV-algebra [0,1]. Additional constants and operators in Γ(G, u) are: 1 = u, x ⊙ y = ¬(¬x ⊕ ¬y) = (x + y − u) ∨ 0, x ⊖ y = x ⊙ ¬y = (x − y) ∨ 0, x ≤ y if and only if ¬x ⊕ y = u (cf. Theorem 6.29).
Definition 6.35 (Good sequences) For an MV-algebra A, a sequence a = (a_1, a_2, ..., a_n, ...) is good if and only if a_i ⊕ a_{i+1} = a_i for each i ≥ 1 and a_n = 0 for n > q, for some q. An example of a good sequence is (a_1 ⊕ a_2, a_1 ⊙ a_2). A sum a + b, where a = (a_1, a_2, ..., a_n) and b = (b_1, b_2, ..., b_m), is c = (c_1, c_2, ..., c_{n+m}), where
c_i = a_i ⊕ (a_{i−1} ⊙ b_1) ⊕ ... ⊕ (a_1 ⊙ b_{i−1}) ⊕ b_i.
Good sequences look especially simple in MV-chains: as in any MV-chain A, x ⊕ y = x if and only if x = 1 or y = 0, any good sequence is of the form (1^m, a) for some m and a ∈ A. The sum (1^m, a) + (1^n, b) = (1^{n+m}, a ⊕ b, a ⊙ b). The ordering: b ≤ a if and only if b_i ≤ a_i for each i if and only if there exists c such that b + c = a.
Definition 6.36 (The Chang group G_A of an MV-algebra A) We denote by the symbol M_A the set of all good sequences of elements of A. The group G_A is the set of equivalence classes of pairs of good sequences from the set M_A. The equivalence relation ∼ on the set M_A^2 of pairs of good sequences on A is defined as follows: (a, b) ∼ (c, d) if and only if a + d = c + b. The neutral element in G_A is 0 = ((0), (0)), group operations are (a, b) + (c, d) = (a + c, b + d), −(a, b) = (b, a), and (a, b) ≤ (c, d) if and only if (c, d) − (a, b) ∈ M_A, where M_A = {(a, (0))}.
Proof (Proof of the Chang completeness theorem) Let τ = τ(x_1, x_2, ..., x_n) be a term, i.e., a word over the alphabet (x_1, x_2, ..., x_n, 0, ⊕, ¬). In the framework of the Chang group G_A, τ is rendered as the word τ^{G_A} = τ(x_1, x_2, ..., x_n, y) over the alphabet (x_1, x_2, ..., x_n, y, 0, −, +, ∨, ∧, (, )), where ¬x = y − x, x ⊕ z = y ∧ (x + z). For an MV-chain A, elements a_1, a_2, ..., a_n ∈ A and an MV-term τ(x_1, x_2, ..., x_n), we have τ^A(a_1, a_2, ..., a_n) = τ^{G_A}((a_1), (a_2), ..., (a_n), (1)) ∈ G_A. Suppose that the equation τ(x_1, x_2, ..., x_n) = 0 does not hold in an MV-algebra A; by Theorem 6.38, we may assume that A is an MV-chain. We can find a_1, a_2, ..., a_n ∈ A such that τ(a_1, a_2, ..., a_n) > 0; the corresponding relation in the Chang group G_A is
0 < τ^{G_A}((a_1), (a_2), ..., (a_n), (1)) ≤ (1).
We consider a free group Σ = Z(1) + Z(a_1) + ... + Z(a_n) = Z^{n+1}. We render respectively 1, a_1, a_2, ..., a_n as elements Z_0, Z_1, ..., Z_n ∈ Z^{n+1}. Let P be the set of non-negative elements in Z^{n+1}. Then x ≤_P y if and only if y − x ∈ P. We list all sub-terms of τ^{G_A}: τ_0 = y, τ_1 = x_1, ..., τ_n = x_n, τ_{n+1}, ..., τ_k = τ^{G_A}. We extend the mapping y → Z_0, x_1 → Z_1, ..., x_n → Z_n to a mapping τ_j → Z_j of sub-terms of τ^{G_A} into elements of the linear group Z^{n+1} ordered by ≤_P. We have by our assumption that 0 ≤_P Z_1, Z_2, ..., Z_n ≤_P Z_0; 0 ≤_P Z_k ≤_P Z_0; 0 ≠ Z_k = τ(Z_1, Z_2, ..., Z_n, Z_0). Let π be a permutation on the set {0, 1, 2, ..., k} leading to the ordering Z_{π(0)} ≤_P Z_{π(1)} ≤_P ... ≤_P Z_{π(k)}. Consider vectors v_j = Z_{π(j)} − Z_{π(j−1)}; embed Z^{n+1} into R^{n+1} and consider the sub-space P∗ = {Σ_{j=1}^k λ_j v_j : λ_j ≥ 0}. We let also N∗ = −P∗; then P∗ ∩ N∗ = {0}, P∗ ∩ P = P∗ ∩ Z^{n+1}. Concerning the ordering ≤_P, we have Z_i ≤_P Z_j if and only if Z_j − Z_i ∈ P∗.
Lemma 6.1 Sets P∗ and N∗ can be separated by a hyperplane defined by a vector w ∈ R^{n+1}.
Consider the sphere S = {t : ||t|| = 1} ⊂ R^{n+1}. Were it the case that for each vector t the hyperplane t^⊥ intersected both N∗ and P∗ in sets different from {0}, we would have S disconnected, a contradiction. Therefore, for some vector w = (γ_1, γ_2, ..., γ_{n+1}), the hyperplane Π_w : w · v = 0 is as wanted. We write Π_w^+ = {(η_i) : Σ_i η_i γ_i ≥ 0}; then P∗ ⊆ Π_w^+ and N∗ ⊆ −Π_w^+. Letting P′ = Π_w^+ ∩ Z^{n+1}, we obtain a linearly ordered abelian group H = (Z^{n+1}, ≤_{P′}). We have Z_i ≤_P Z_j if and only if Z_j − Z_i ∈ P∗ if and only if Z_j − Z_i ∉ N∗ if and only if Z_i ≤_{P′} Z_j. Therefore, (i) 0 ≤_{P′} Z_1, ..., Z_n ≤_{P′} Z_0, (ii) 0 ≤_{P′} Z_k ≤_{P′} Z_0, (iii) 0 ≠ Z_k = τ^H(Z_1, Z_2, ..., Z_n, Z_0). H is isomorphic to the sub-group W = Zγ_1 + Zγ_2 + ... + Zγ_{n+1} ⊂ R via the isomorphism ι : (b_1, b_2, ..., b_{n+1}) → Σ_i b_i γ_i. If we let k_0 = ι(Z_0), k_1 = ι(Z_1), ..., k_n = ι(Z_n), k_k = ι(Z_k), then we have 0 ≤ k_1, k_2, ..., k_n ≤ k_0 and 0 < k_k ≤ k_0. For k_0 = 1, we obtain k_k = τ^W(k_1, k_2, ..., k_n, 1) > 0. For the MV-algebra Γ(W, 1), we have τ(k_1, k_2, ..., k_n) ≠ 0, i.e., the equation τ = 0 is not satisfied in [0,1]. Chang's theorem is proved.
6.16 Wajsberg Algebras, MV-Algebras and the Łukasiewicz [0,1]_L Logic
We recall that the Łukasiewicz residuated lattice A_L on [0, 1] has operations x ⊙ y = max{0, x + y − 1}, residuum x → y = min{1, 1 − x + y}, x ⊕ y = min{1, x + y}, ¬x = 1 − x, x ⊖ y = max{0, x − y}, and the ordering ≤ as the natural ordering ≤ in the interval [0,1].
Theorem 6.40 (Wajsberg algebras vs. MV-algebras) Given an MV-algebra (A, ⊕, ¬, 0), letting x ⊃ y = ¬x ⊕ y and 1 = ¬0 introduces in A the structure of a Wajsberg algebra. Conversely, given a Wajsberg algebra (A, ⊃, ¬, 1), letting x ⊕ y = ¬x ⊃ y, 0 = ¬1 introduces in A the structure of an MV-algebra.
Definition 6.37 (Formulae over MV) We are thus justified in considering MV-algebras with operations ⊃, 1, and ¬. For an MV-algebra A, we define the set Form of formulae as follows:
(Form1) each of countably many variables x_0, x_1, x_2, ..., x_k, ... is in Form;
(Form2) if α ∈ Form then ¬α ∈ Form;
(Form3) if α ∈ Form and β ∈ Form then α ⊃ β ∈ Form.
Definition 6.38 (Assignments) An assignment V_A : Form → A satisfies the following conditions.
(i) V_A(¬α) = ¬V_A(α);
(ii) V_A(α ⊃ β) = V_A(α) ⊃ V_A(β).
An assignment V_A satisfies a formula α if V_A(α) = 1. We denote this case as V_A |= α. A formula α is valid (is a tautology) if and only if V_A |= α for every A-assignment V_A. In this case, we write |= α. Semantic equivalence of formulae α and β means that formulae α ⊃ β and β ⊃ α are valid.
Definition 6.39 (MV-terms) More general than formulae are MV-terms defined in terms of variables. The inductive definition of a term parallels that of a formula.
(i) each variable x_i and the constant 0 are terms;
(ii) if τ is a term, then ¬τ is a term;
(iii) if τ and σ are terms, then τ ⊕ σ is a term.
For a term τ (x1 , x2 , . . . , xk ) and elements a1 , a2 , . . . , ak of the MV-algebra A, by substitutions (xi /ai ) for each i, we obtain τ A (a1 , a2 , . . . , ak ) ∈ A. This defines a function called the τ -function τ A : Ak → A. An MV-algebra A satisfies a term equation τ = σ if and only if τ A = σ A . Returning to V A assignments on A-formulae, we record a relation between term functions and V A assignments. Theorem 6.41 For an MV-algebra A, a formula α(x1 , x2 , . . . , xk ) and an A-assignment V A , the following holds: V A (α) = α A (V A (x1 ), V A (x2 ), . . . , V A (xk )).
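The operations of the standard MV-algebra on [0,1] and the term functions of Definition 6.39 can be sketched in a few lines of Python. This is only an illustrative sketch: the tuple encoding of terms and the function names (`term_function`, `oplus`, and so on) are assumptions made here, not notation from the text, and exact rationals are used to avoid rounding.

```python
from fractions import Fraction

# Standard MV-algebra on [0, 1]; Fractions keep the arithmetic exact.
def neg(x):
    return 1 - x

def oplus(x, y):
    # x ⊕ y = min{1, x + y}
    return min(Fraction(1), x + y)

def otimes(x, y):
    # x ⊙ y = max{0, x + y - 1}
    return max(Fraction(0), x + y - 1)

def implies(x, y):
    # residuum: min{1, 1 - x + y}
    return min(Fraction(1), 1 - x + y)

# Terms as nested tuples (a hypothetical encoding chosen for this sketch):
#   ("var", i), ("zero",), ("neg", t), ("oplus", t, s)
def term_function(term, values):
    """Evaluate the term function at the point `values` (a list indexed by variable)."""
    tag = term[0]
    if tag == "var":
        return values[term[1]]
    if tag == "zero":
        return Fraction(0)
    if tag == "neg":
        return neg(term_function(term[1], values))
    if tag == "oplus":
        return oplus(term_function(term[1], values),
                     term_function(term[2], values))
    raise ValueError(f"unknown term constructor: {tag}")
```

For instance, the term ¬x_0 ⊕ x_1 evaluated by `term_function` agrees with `implies` on every pair of arguments, which is the identity x ⊃ y = ¬x ⊕ y used in Theorem 6.40.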
6.17 The Completeness Theorem for Infinite-Valued Sentential Logic [0,1]_L

Definition 6.40 (The notion of a proof) A proof of a formula α is a sequence γ_0, γ_1, ..., γ_n = α of formulae with γ_0 an instance of an axiom scheme and each γ_i either an instance of an axiom scheme or obtained by detachment from earlier formulae γ_j and γ_j ⊃ γ_i.

We recall that the Tarski-Lindenbaum algebra is the quotient algebra by the relation ∼ defined as: α ∼ β ≡ (α ⊃ β) ∧ (β ⊃ α). That ∼ is an equivalence relation stems from the symmetry α ∼ β ≡ β ∼ α and the reflexivity α ∼ α, as α ⊃ α is a valid formula in Łukasiewicz’s algebra; transitivity is a consequence of the axiom scheme (L2). The relation ∼ is a congruence, i.e.,
(i) α ∼ β implies α ⊃ γ ∼ β ⊃ γ;
(ii) α ∼ β implies ¬α ∼ ¬β.
Property (i) follows by the axiom scheme (L2); property (ii) follows by the valid formulae β ≡ ¬¬β and (α ⊃ ¬¬β) ⊃ (¬β ⊃ ¬α), and by (L4).

We denote by the symbol [α]∼ the equivalence class of the formula α. The quotient algebra Form/∼ carries the structure of a Wajsberg algebra as well as that of an MV-algebra.

Theorem 6.42 Form/∼ bears the structure of a Wajsberg algebra under the interpretation:
(i) [α]∼ ⊃ [β]∼ = [α ⊃ β]∼;
(ii) ¬[α]∼ = [¬α]∼;
(iii) 1 = [Provable]∼, where Provable is the class of provable formulae.

A parallel result is

Theorem 6.43 Form/∼ bears the structure of an MV-algebra under the interpretation:
(i) [α]∼ ⊕ [β]∼ = [¬α ⊃ β]∼;
(ii) ¬[α]∼ = [¬α]∼;
(iii) 0 = ¬[Provable]∼.

The MV-algebra (Form/∼, ⊕, ¬, 0) is the Lindenbaum-Tarski algebra of the Łukasiewicz infinite-valued MV-algebra A_L. The soundness of [0,1]_L is shown along standard lines: axiom schemes are valid and detachment preserves validity, hence each provable formula is valid. The converse is
Theorem 6.44 (The completeness theorem for [0,1]_L) Each valid formula in [0,1]_L is provable.

Proof Let α(x_1, x_2, ..., x_k) be a valid formula. We shorten [0,1]_L to L* for the sake of conciseness and readability, and we write A for the MV-algebra Form/∼. Clearly, (i) [α]∼ = α^A([x_1]∼, [x_2]∼, ..., [x_k]∼). Suppose α is not provable. Then [α]∼ ≠ 1, hence, by (i), α^A([x_1]∼, [x_2]∼, ..., [x_k]∼) ≠ 1. This means that α is not valid in the MV-algebra A, and the Chang completeness theorem implies that α is not valid in [0,1], a contradiction.
6.18 Remarks on Goguen and Gödel Infinite-Valued Logics

Definition 6.41 (The infinite-valued logic of Goguen) In this logic, the semantics of the implication ⊃ is given by the formula A*(p ⊃ q) = 1 in case A(p) ≤ A(q) and A*(p ⊃ q) = A(q)/A(p) in case A(p) > A(q). Negation ¬p is valued as A*(¬p) = A*(p ⊃ ⊥) = (A(p) ⊃ 0), which is 1 for A(p) = 0 and 0 for A(p) > 0. The strong conjunction p&q is valued as A*(p&q) = A(p) · A(q); the strong disjunction p ⊻ q is valued as A*(p ⊻ q) = A(p) + A(q) − A(p) · A(q). Axiom schemes for this logic are the axiom schemes A1-A7 of BL plus two specific axioms, see Hájek [5]:
(G1) ¬¬ξ ⊃ ((φ&ξ ⊃ ψ&ξ) ⊃ (φ ⊃ ψ)),
(G2) φ ∧ ¬φ ⊃ ⊥.
These axioms completely axiomatize Goguen’s logic, see Hájek [5].

Definition 6.42 (The infinite-valued logic of Gödel) The implication is the Gödel implication, valued A*(p ⊃ q) = 1 in case A(p) ≤ A(q) and A*(p ⊃ q) = A(q) in case A(p) > A(q). The strong conjunction & is given by A*(p&q) = min{A(p), A(q)}, the strong disjunction is given by A*(p ⊻ q) = max{A(p), A(q)}, and it follows that in Gödel’s logic we have
(i) (φ ∧ ψ) ≡ (φ&ψ);
(ii) (φ ∨ ψ) ≡ (φ ⊻ ψ).
Axiom schemes for Gödel’s logic are those of BL and (Gl) φ ⊃ φ&φ, see Hájek [5]. A consequence of (i) is that the deduction theorem holds in the classical form: if φ ⊢ ψ, then ⊢ φ ⊃ ψ; the reason is that φ ≡ φ^n for each natural number n ≥ 1. Gödel’s logic is completely axiomatizable, cf. Hájek [5].
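The truth functions of Definitions 6.41 and 6.42 are short enough to be written out directly. The following Python sketch is illustrative only; the function names are assumptions introduced here, and exact rationals stand in for values in [0,1].

```python
from fractions import Fraction

def goguen_implies(x, y):
    # 1 if x <= y, otherwise the quotient y / x (x > y >= 0, so x != 0)
    return Fraction(1) if x <= y else y / x

def goedel_implies(x, y):
    # 1 if x <= y, otherwise y
    return Fraction(1) if x <= y else y

def goguen_and(x, y):
    # strong conjunction in Goguen's logic: the product
    return x * y

def goedel_and(x, y):
    # strong conjunction in Gödel's logic: the minimum
    return min(x, y)

def negation(implies, x):
    # in both logics, ¬p is valued as p ⊃ ⊥
    return implies(x, Fraction(0))
```

The sketch makes the remark about the deduction theorem concrete: `goedel_and(x, x)` always returns `x`, so φ and φ&φ have the same value in Gödel's logic, whereas `goguen_and(x, x)` is `x * x`, which differs from `x` for every value strictly between 0 and 1.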
6.19 Complexity of Satisfiability Decision Problem for Infinite-Valued Logics

We denote by the symbol SAT the satisfiability problem for sentential logic and by SAT(Gödel), SAT(Goguen), SAT(Luk) the satisfiability problems for the logics of, respectively, Gödel, Goguen, and Łukasiewicz. Following the idea in (Hájek [5], 6.2.2), we consider a set P = {p_1, p_2, ..., p_n} of atomic propositions and, for an assignment A on P, we denote by N(A) the set {p_i ∈ P : A(p_i) = 0}. Formulae undergo reduction: to each formula φ its reduction φ* is assigned as follows (119):
(i) ⊥* is ⊥, ⊤* is ⊤;
(ii) p_i* is ⊥ if p_i ∈ N(A), p_i* is p_i otherwise;
(iii) (⊥ ⊃ φ)* is ⊤;
(iv) (φ ⊃ ⊥)* is ⊥ if φ* ≠ ⊥;
(iv′) (φ ⊃ ψ)* is φ* ⊃ ψ* in cases other than (iii) and (iv);
(v) (φ&⊥)* is ⊥, (φ&ψ)* is φ*&ψ* in other cases.

It follows from (i)–(v) above that the following holds.

Theorem 6.45 For the logics of Goguen and Gödel, for each formula φ and each assignment A, either φ* is ⊥ or φ* contains no symbol ⊥.

Proof By structural induction starting with (i) and (ii).
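The reduction φ → φ* can be sketched as a recursive function on formula trees. The sketch below is a reading of clauses (i)–(v), not the author's code: it assumes that the clauses are applied bottom-up to already reduced subformulae, and that the conjunction clause is symmetric in its two arguments. The tuple encoding and the names `reduce_formula` and `contains_bot` are hypothetical.

```python
BOT, TOP = ("bot",), ("top",)

# Formulae as nested tuples (an encoding assumed for this sketch):
#   ("bot",), ("top",), ("var", name), ("imp", f, g), ("and", f, g)
def reduce_formula(phi, zero_vars):
    """Return the reduction of phi, where zero_vars plays the role of N(A)."""
    tag = phi[0]
    if tag in ("bot", "top"):
        return phi                                   # clause (i)
    if tag == "var":
        return BOT if phi[1] in zero_vars else phi   # clause (ii)
    left = reduce_formula(phi[1], zero_vars)
    right = reduce_formula(phi[2], zero_vars)
    if tag == "imp":
        if left == BOT:
            return TOP                               # antecedent reduces to falsum
        if right == BOT:
            return BOT                               # consequent reduces to falsum
        return ("imp", left, right)
    if tag == "and":
        if left == BOT or right == BOT:
            return BOT                               # clause (v), read symmetrically
        return ("and", left, right)
    raise ValueError(f"unknown connective: {tag}")

def contains_bot(phi):
    """True if the symbol for falsum occurs anywhere in phi."""
    return phi == BOT or any(
        contains_bot(sub) for sub in phi[1:] if isinstance(sub, tuple)
    )
```

Under these assumptions the output of `reduce_formula` is either `BOT` itself or a formula in which `contains_bot` is false, which is the dichotomy stated in Theorem 6.45.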
Theorem 6.46 A*(φ) = A*(φ*), and A*(φ*) = 0 if and only if φ* is ⊥.

A consequence is the following (Hájek [5]):

Theorem 6.47 SAT, SAT(Gödel), SAT(Goguen) are equivalent, hence SAT(Gödel) and SAT(Goguen) are NP-complete.

Proof If φ is in SAT, then φ is in SAT(Gödel) and in SAT(Goguen). If φ is in SAT(Gödel) or in SAT(Goguen), then for some assignment A we have A*(φ) > 0, hence φ* ≠ ⊥, i.e., φ* does not contain ⊥. Now, consider a 0-1 assignment on φ; then A*(φ) = 1, so φ ∈ SAT.

For SAT(Luk), see Mundici [29], where a proof is given that SAT(Luk) is NP-complete. We may here only outline the main ideas of this approach; the rest is just calculations. The starting point is McNaughton’s result (McNaughton [15]) about the representation of formulae of [0,1]_L by continuous piecewise linear functions on an n-cube (we are close here to the proof of completeness of [0,1]_L in Rose and Rosser [21]).

Theorem 6.48 Consider [0,1]_L with operators defined as above. For a formula q(p_1, p_2, ..., p_n), where the p_i are atomic propositions, we denote by Q(x_1, x_2, ..., x_n) the value of q with x_1, x_2, ..., x_n being the values of p_1, p_2, ..., p_n. We regard x_1, x_2, ..., x_n as coordinate values in the cube [0,1]^n. The question posed by McNaughton was
whether there is a correspondence between functions of the form Q(x_1, x_2, ..., x_n) and continuous functions of some form on [0,1]^n. The result was two theorems.

Theorem A. For each function of the form

f(x_1, x_2, ..., x_n) = min{1, max{0, b + Σ_{i=1}^{n} m_i x_i}},

where b and all m_i are integers, there exists a formula q(p_1, p_2, ..., p_n) such that Q(x_1, x_2, ..., x_n) = f(x_1, x_2, ..., x_n). The piecewise structure of the function f is explained in Theorem B below.

Theorem B. Q(x_1, x_2, ..., x_n) = f(x_1, x_2, ..., x_n) holds if the function f : [0,1]^n → [0,1] is continuous and there exists a finite collection of monomials h_j(x_1, x_2, ..., x_n) = b_j + Σ_{i=1}^{n} m_i^j x_i with integer coefficients such that for each (x_1, ..., x_n) there exists h_j with f(x_1, ..., x_n) = h_j(x_1, ..., x_n).

We denote the function f corresponding to a formula Q by the symbol f_Q. Elementary considerations show that the n-cube is divided into closed convex regions, with h_j defined on the region R_j, and that two regions are either disjoint or have a common border of dimension n − 1 on which the two monomials agree. In this sense the function f_Q is piecewise linear. We recall that for McNaughton functions,
(i) f_Q ⊙ f_P = max{0, f_Q + f_P − 1};
(ii) f_Q ⊕ f_P = min{1, f_Q + f_P};
(iii) f_Q ∨ f_P = max{f_Q, f_P};
(iv) f_Q ∧ f_P = min{f_Q, f_P};
(v) f_Q ⊃ f_P = min{1, 1 − f_Q + f_P};
(vi) for the Łukasiewicz [30] symbol APQ ≡ (P ⊃ Q) ⊃ Q, f_{APQ} = f_P ∨ f_Q;
(vii) for the Łukasiewicz symbol KPQ ≡ ¬(P ⊃ ¬Q), f_{KPQ} = f_P ⊙ f_Q (remark: in Mundici [29], K is denoted L after McNaughton [15]).

We need a lemma.

Lemma 6.2 (The Hadamard inequality) (cf. Hadamard [31]). For a matrix M with n rows and n columns, all of whose entries are bounded in absolute value by a constant C, the determinant det(M) satisfies the inequality |det(M)| ≤ C^n · n^{n/2}.

We recall that for a formula φ the size* of φ, size*(φ), is the number of symbols in φ. The following propositions come from Mundici [29].

Definition 6.43 (A function and a formula) The following defines a formula ξ_{n,t} and the function f_{n,t} = f_{ξ_{n,t}}. We denote by h^t the function h combined with itself t times, i.e., h ⊙ ... ⊙ h (t times).
We define consecutive functions for i, n ≥ 1 and t ≥ 2:
(i) φ_i is Ax_i¬x_i, with f_{φ_i} = x_i ∨ (1 − x_i);
(ii) ψ_{i,t} is (Aφ_i)^t φ, with the function f_{ψ_{i,t}} = (x ∨ (1 − x))^t : [0,1] → [0,1];
(iii) ξ_{n,t} is Kψ_{1,t}Kψ_{2,t} ... Kψ_{n−2,t}Kψ_{n−1,t}ψ_{n,t}, with the McNaughton function f_{ξ_{n,t}} = f_{ψ_{1,t}} ⊙ f_{ψ_{2,t}} ⊙ ... ⊙ f_{ψ_{n,t}} = (x_1 ∨ (1 − x_1))^t ⊙ ... ⊙ (x_n ∨ (1 − x_n))^t.
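To see what the function f_{ξ_{n,t}} does, it can help to evaluate it at a few points. The sketch below assumes that the power h^t is the t-fold ⊙-product, as in the McNaughton function of ξ_{n,t} given above; the names `power` and `f_xi` are introduced here for illustration only.

```python
from fractions import Fraction
from functools import reduce

def otimes(x, y):
    # x ⊙ y = max{0, x + y - 1}
    return max(Fraction(0), x + y - 1)

def power(x, t):
    # x ⊙ x ⊙ ... ⊙ x with t factors (t >= 1)
    return reduce(otimes, [x] * t)

def f_xi(point, t):
    """Value at `point` of the ⊙-product of (x_i ∨ (1 - x_i))^t over all coordinates."""
    factors = [power(max(x, 1 - x), t) for x in point]
    return reduce(otimes, factors)
```

At every vertex of the cube each factor x_i ∨ (1 − x_i) equals 1, so `f_xi` returns 1 there, while a coordinate equal to 1/2 already forces the value 0 for t = 2. This is the sense in which ξ_{n,t} singles out the Boolean points of [0,1]^n.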
Theorem 6.49 A formula φ is valid in sentential logic if and only if the formula ξ_{n,t} ⊃ φ is valid in the infinite-valued logic [0,1]_L.

We recall that by SAT we mean the satisfiability problem for sentential logic. As we know, SAT is NP-complete.

Theorem 6.50 SAT(Luk) is NP-hard: the reduction SAT ≤_p SAT(Luk) is provided by the mapping φ → ¬ξ_{n,t} ⊃ ¬φ.

Concerning the NP-membership of SAT(Luk), considerations from piecewise linear geometry, along with the Hadamard inequality, yield the following estimates. Their form comes from McNaughton’s Theorems A and B (Theorem 6.48): if a formula φ(x_1, x_2, ..., x_n) is satisfiable, then the corresponding McNaughton function f_φ takes a positive value at some point (x_1, x_2, ..., x_n). As (x_1, x_2, ..., x_n) is an element of a convex compact region R, which may be assumed to be a simplex and on which f_φ is affine, f_φ takes its maximal, hence positive, value at one of the vertices, say r. The vertex r is described as the solution to a system M of n affine equations, hence r = (a_1/b, ..., a_n/b) with b given as the determinant of the system M, whose value may be estimated by means of the Hadamard inequality. As computed in Mundici [29], the coefficients of the equations which express the facets of the simplex are bounded by 2 · size*(φ), and clearly n ≤ size*(φ). These considerations lead to the following theorem.

Theorem 6.51 For a McNaughton function f_φ of a satisfiable φ, there exists in [0,1]^n a rational point r = [a_1/b, ..., a_n/b] with f_φ(r) > 0 and b < 2^{4·size*(φ)^2}.

As shown in Mundici [29], guessing r along with b satisfying the bound in Theorem 6.51 can be done non-deterministically in polynomial time. This shows that SAT(Luk) is NP-complete.
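As a plausibility check of the shape of the bound in Theorem 6.51, one can combine the Hadamard inequality of Lemma 6.2 with the two estimates quoted above. The computation below is only a sketch written for this purpose, with s standing for size*(φ); it is not claimed to be the computation carried out in Mundici [29].

```latex
% Assumptions: entries of M bounded by C = 2s, and n \le s, where s = size^*(\varphi) \ge 1.
\begin{align*}
|\det(M)| &\le C^{n}\, n^{n/2} \le (2s)^{s}\, s^{s/2} = 2^{s}\, s^{3s/2}
           = 2^{\,s + \frac{3s}{2}\log_2 s} \\
          &\le 2^{\,s + \frac{3s}{2}(s-1)} = 2^{\,\frac{3}{2}s^{2} - \frac{1}{2}s}
           < 2^{\,4s^{2}},
\end{align*}
% using \log_2 s \le s - 1 for every integer s \ge 1.
```

In particular, the denominator b can be written with at most 4·s^2 binary digits, which is polynomial in the size of φ.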
An analogous reasoning carried out for any logic n_L with n > 2 proves that SAT(n_L) is NP-complete, see Mundici [27].
6.20 Problems

Problem 6.1 (Kleene’s logic 3_K) Prove that in the logic 3_K: (i) (p ∨ q) ≡ ¬(¬p ∧ ¬q); (ii) (p ⊃ q) ≡ ¬(p ∧ ¬q).

Problem 6.2 (Kleene’s logic 3_K) Prove that the detachment rule (p ∧ (p ⊃ q)) ⊃ q is valid in 3_K.

Problem 6.3 (Kleene’s logic 3_K) Prove that if a set of formulae Γ entails a formula φ in Kleene’s logic, then Γ entails φ in sentential logic. Verify that the formula ¬(p ≡ q) entails in sentential logic the formula (p ≡ r) ∨ (q ≡ r), but that this entailment does not hold in the logic 3_K.

Problem 6.4 (Łukasiewicz’s logic 3_L) Prove: each formula valid in the logic 3_L is valid in sentential logic.

Problem 6.5 (Łukasiewicz’s logic 3_L) Prove that the following formulae valid in sentential logic are not valid in the logic 3_L: (i) p ∨ ¬p; (ii) ¬(p ∧ ¬p); (iii) (p ⊃ (q ⊃ r)) ⊃ ((p ⊃ q) ⊃ (p ⊃ r)).

Problem 6.6 (Łukasiewicz’s logic 3_L) Prove that entailment in the logic 3_L holds in sentential logic and that the detachment rule holds in 3_L.

Problem 6.7 (Bochvar’s internal logic 3_BI) Prove: none of the binary connectives of the logics 3_K and 3_L can be defined in the logic 3_BI.

Problem 6.8 (Bochvar’s internal logic 3_BI) Prove: no formula with an occurrence of the negation sign is valid in the logic 3_BI.

Problem 6.9 (Łukasiewicz’s logic 4_LM) Verify whether the modal formulae (D), (B), (DC), (4C), (G) are valid in the logic 4_LM.

Problem 6.10 (Łukasiewicz’s logic [0,1]_L) (after Meredith [23]). Consider the original set (L1)–(L5) of Łukasiewicz’s axiom schemes for the infinite-valued logic [0,1]_L and verify that the following sequence of formulae in Polish notation is a proof of (L5) from (L1)–(L4). The following symbol is used: Apq is CCpqq (i.e., (p ⊃ q) ⊃ q) (disjunction).
(i) (L2); substitute into (L2): {p/Cpq; q/Cqp; r/p}; it obtains
(ii) CCCpqCqpCApqCCpqp; apply 10.3(iv); substitute into (L2): {p/CCpqCqp; q/CApqCCpqr; r/CCqCpqCqp}; it obtains
(iii) CCCApqCCpqpCCqCpqCqpCCCpqCqpCCqCpqCqp; apply 10.3(iv) to obtain
(iv) CCpqCrq ≡ CCqprp; substitute into (iv): {p/Np; q/Nq; r/Nr}; it obtains by 10.3(xix)
(v) CCqpCqr ≡ CCpqCpr; substitute into (v): {p/q; q/Cpq; r/p}; it obtains
(vi) CCCpqCqpCCqCpqCqp; apply 10.3(viii) and (L1); it obtains
(vii) ACpqCqp, i.e., (L5).

Problem 6.11 (Natural implications) Consider the set T = {0, 1/2, 1} and let the set W ⊆ T be either {1} or {1, 1/2}. The mapping i(x, y) : T × T → T is called a natural implication if and only if (i) the restriction i|{0, 1} is the classical implication of sentential logic; (ii) if i(x, y) ∈ W and x ∈ W, then y ∈ W; (iii) for x, y ∈ T, if x ≤ y, then i(x, y) ∈ W. Prove: for W = {1}, there exist six distinct natural implications, and for W = {1, 1/2} there exist twenty-four natural implications.

Problem 6.12 (The Łukasiewicz logic [0,1]_L) For the sentential logic detachment formula (p ∧ (q ⊃ p)) ⊃ p, determine its truth function in [0,1]_L and decide whether it is valid.

Problem 6.13 (Archimedean t-norms) Find formulae for the functions f, g in the Hilbert-style representation g(f(x) + f(y)) of the product t-norm P(x, y) = x · y.

Problem 6.14 (Łukasiewicz’s logic [0,1]_L) Prove: for each formula φ of the logic [0,1]_L, φ is valid in [0,1]_L if and only if φ is valid as a formula of sentential logic.
References
1. Łukasiewicz, J.: Über den Satz von Widerspruch bei Aristoteles. Bulletin Internationale de l’Académie des Sciences de Cracovie, Classe de Philosophie (1910), 15–38. Also: On the principle of contradiction in Aristotle. In: Review of Metaphysics 24 (1970/71), 485–509
2. Łukasiewicz, J.: Farewell Lecture at the University of Warsaw, March 7, 1918. The Polish Review 13(3), 45–47 (1968). University of Illinois Press
3. Łukasiewicz, J.: On three-valued logic (in Polish). Ruch Filozoficzny 5 (1920), 170–171. English translation. In: Borkowski, L. (ed.) Jan Łukasiewicz. Selected Works. North Holland - Polish Scientific Publishers, Amsterdam-Warsaw (1970)
4. Menger, K.: Statistical Metrics. Proc. Natl. Acad. Sci. 28, 535–537 (1942)
5. Hájek, P.: Metamathematics of Fuzzy Logic. Springer Science+Business Media, Dordrecht (1998)
6. Rasiowa, H., Sikorski, R.: The Mathematics of Metamathematics. Polish Scientific Publishers (PWN), Warszawa (1963)
7. Chang, C.C.: A new proof of the completeness of the Łukasiewicz axioms. Trans. Amer. Math. Soc. 93, 74–80 (1959)
8. Wajsberg, M.: Axiomatization of the three-valued sentential calculus (in Polish, German summary). C.R. Soc. Sci. Lettr. Varsovie 24, 126–148 (1931)
9. Goldberg, H., Leblanc, H., Weaver, G.: A strong completeness theorem for 3-valued logic. Notre Dame J. Formal Logic 15, 325–332 (1974)
10. Kleene, S.C.: J. Symb. Logic 3, 150–155 (1938)
11. Bochvar, D.A.: Mat. Sbornik 4, 287–308 (1938)
12. Łukasiewicz, J.: Aristotle’s Syllogistic from the Standpoint of Modern Formal Logic, 2nd edn. enlarged. Oxford University Press (1957)
13. Rosser, J.B., Turquette, A.R.: Many-Valued Logics. North-Holland Publishing Co., Amsterdam (1952)
14. Post, E.L.: Introduction to a general theory of elementary propositions. Am. J. Math. 43(3), 163–185 (1921). https://doi.org/10.2307/2370324
15. McNaughton, R.: A theorem about infinite-valued sentential logic. J. Symb. Logic 16, 1–13 (1951)
16. Ling, C.-H.: Representation of associative functions. Publ. Math. Debrecen 12, 189–212 (1965)
17. Mostert, P.S., Shields, A.L.: On the structure of semigroups on a compact manifold with boundary. Ann. Math. 65, 117–143 (1957)
18. Faucett, W.M.: Compact semigroups irreducibly connected between two idempotents. Proc. Amer. Math. Soc. 6, 741–747 (1955)
19. Menu, J., Pavelka, J.: A note on tensor products on the unit interval. Comment. Math. Univ. Carol. 17(1), 71–83 (1976)
20. Wajsberg, M.: Beiträge zum Metaaussagenkalkül. Monat. Math. Phys. 42, 221–242 (1935)
21. Rose, A., Rosser, J.B.: Fragments of many-valued statement calculi. Trans. Amer. Math. Soc. 87, 1–53 (1958)
22. Łukasiewicz, J., Tarski, A.: Untersuchungen über den Aussagenkalkül. C.R. Soc. Sci. Lettr. Varsovie, Cl. III, 23, 39–50 (1930); also in: Borkowski, L. (ed.): J. Lukasiewicz: Selected Works. Studies in Logic and the Foundations of Mathematics. North-Holland Publisher, Amsterdam and Polish Scientific Publishers (PWN), Warszawa (1970)
23. Meredith, C.A.: The dependence of an axiom of Lukasiewicz. Trans. Amer. Math. Soc. 87, 54 (1958)
24. Chang, C.C.: Proof of an axiom of Łukasiewicz. Trans. Amer. Math. Soc. 87, 55–56 (1958)
25. Cignoli, R.L.O., Mundici, D.: An elementary proof of Chang’s completeness theorem for the infinite-valued calculus of Łukasiewicz. Stud. Logica. 58, 79–97 (1997)
26. Font, J.M., Rodriguez, A.J., Torrens, A.: Wajsberg algebras. Stochastica 8(1), 5–23 (1984)
27. Mundici, D.: MV-algebras. A short tutorial (2007)
28. Cignoli, R.L., d’Ottaviano, I.M., Mundici, D.: Algebraic Foundations of Many-Valued Reasoning. Springer Science + Business Media, B.V. Dordrecht (2000)
29. Mundici, D.: Satisfiability in many-valued sentential logic is NP-complete. Theoret. Comput. Sci. 52, 145–153 (1987)
30. Łukasiewicz, J.: Elements of Mathematical Logic. Pergamon Press, Oxford and Polish Scientific Publishers (PWN), Warsaw (1966). (Reprinted from mimeographed notes by students of Warsaw University (1929))
31. Hadamard, J.: Résolution d’une question relative aux determinants. Bull. Sci. Math. 17, 240–246 (1893)
Chapter 7
Logics for Programs and Knowledge
In this chapter we meet sentential dynamic logic (SDL), epistemic logics, logics of approximate containment of concepts couched in terms of mereology, and elements of Data Analysis in the form of Boolean reasoning in the environment of data along with the logic for functional dependence and the information logic.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. T. Polkowski, Logic: Reference Book for Computer Scientists, Intelligent Systems Reference Library 245, https://doi.org/10.1007/978-3-031-42034-4_7

7.1 Sentential Dynamic Logic (SDL)

SDL is a logic of programs/actions based on modal logic as an archetype. We begin with regular PDL, whose syntax is governed by regular expressions. Please see Chap. 1 for an account of regular expressions. PDL renders actions or programs in terms of input-output relations, disregarding intermediate stages. Nevertheless, it is able to formally absorb the block commands of structured programming as well as earlier formalizations like Hoare’s triples.

Definition 7.1 (Syntax of regular SDL) Owing to the need to formalize programs/actions, SDL adds to SL a separate formalization of programs, related to the sentential part by the actions of programs on formulae.

Atomic propositions and atomic programs. The construction of SDL syntax begins with atomic propositions and atomic programs. We are given a countable set of atomic propositions Π_0 = {p_n : n ≥ 0}, usually denoted p, q, r, ..., and a countable set Φ_0 = {a_n : n ≥ 0} of atomic programs, usually denoted a, b, c, .... We denote by ⊃ the sentential implication, and the symbol ⊥ is falsum, representing the truth value 0.

Purely sentential formulae. The set Π_0 of purely sentential formulae (ppfs) is obtained from atomic propositions by means of the sentential connectives ∨, ∧, ⊃, ≡, ¬:
(i) each atomic proposition is a ppf;
(ii) if φ and ψ are ppfs, then φ ∨ ψ, φ ∧ ψ, φ ⊃ ψ, φ ≡ ψ, ¬φ are ppfs;
(iii) falsum ⊥ is a ppf. As with SL, verum ⊤ is the formula ¬⊥, hence a ppf.

Non-sentential connectives acting on ppfs. These connectives act on ppfs:
(iv) the necessity connective [ . ];
(v) the test connective ?.

Connectives acting on programs.
(vi) the connective of composition of programs (.) ; (.);
(vii) the connective of non-deterministic choice between programs (.) ∪ (.);
(viii) the connective of non-deterministic program iteration (.)*.

The set of formulae. The set Π of formulae is defined as the smallest set closed under the conditions:
(ix) all ppfs are formulae, i.e., Π_0 ⊆ Π;
(x) for a program π and a formula φ, the expression [π]φ is a formula;
(xi) for a program π and a formula φ, the expression ⟨π⟩φ, equivalent to ¬[π]¬φ, is a formula.

The set of programs. The set Φ of programs is the smallest set closed under the conditions:
(xii) each atomic program is a program, i.e., Φ_0 ⊆ Φ;
(xiii) if π and ρ are programs, then π; ρ is a program;
(xiv) if π and ρ are programs, then π ∪ ρ is a program;
(xv) if π is a program, then π* is a program;
(xvi) for a formula φ, the test φ? is a program.

Conditions (x), (xi), (xvi) witness the interplay of formulae and programs. SDL is regular as operations in it are defined by regular expressions. Before we formally introduce semantics for regular SDL, we convey intuitive interpretations of the introduced constructs:
(i) the box [.] resembles the necessity operator of modal logic; [π]φ means that φ is necessarily true after the program π executes;
(ii) ‘(.);(.)’ denotes sequential execution; π; ρ means that ρ is executed after π is executed;
(iii) ‘∪’ denotes random choice; π ∪ ρ means the non-deterministic choice between π and ρ;
(iv) ‘(.)*’ denotes non-deterministic iteration; π* means ‘;’ applied to π a non-deterministically chosen number of times;
(v) ‘(.)?’ denotes the test of (.); φ? means: if φ is true, then continue, otherwise fail.
Definition 7.2 (Semantics of SDL) The semantics of SDL is modelled on the Kripke semantics of possible worlds. There is a difference in terminology in comparison to modal logics: instead of worlds, we speak of states, so we assume a set of states S. The accessibility relation now connects input states to output states, in the sense that a pair of states (s, t) being an instance of the relation R_π means that the program π executed in the state s can terminate in the state t. Hence, we can represent the relation R_π as an assignment A(π) ⊆ S^2 on pairs of states: for a pair (s, t) of states, A(π)(s, t) = 1 means that (s, t) ∈ R_π. In SDL, we also have to account for valuations on atomic propositions; for an atomic proposition p, we define V(p) as the set of all states at which p is valid, i.e., V(p) ⊆ S.

SDL frames. By an SDL frame, simply a frame, we understand a triple (S, A, V), where S is a non-empty set of states, A is an assignment on atomic programs, i.e., for each atomic program a, A(a) ⊆ S^2, and V is a valuation on atomic propositions, i.e., V(p) ⊆ S. Each choice of a frame determines the semantics for formulae and programs. Meanings in that semantics are assigned along lines close to those we have met in the logics SL, FO, SML. Assume a frame (S, A, V).

Valuations on formulae
(i) V(¬φ) = S \ V(φ);
(ii) V(φ ∨ ψ) = V(φ) ∪ V(ψ);
(iii) V(φ ⊃ ψ) = [S \ V(φ)] ∪ V(ψ);
(iv) V(φ ∧ ψ) = V(φ) ∩ V(ψ);
(v) V(⊥) = ∅, V(⊤) = S;
(vi) V([π]φ) = {s ∈ S : ∀t ∈ S.[if (s, t) ∈ A(π) then t ∈ V(φ)]}. This renders the meaning: ‘after π is executed, it is necessary that φ is true’;
(vii) V(⟨π⟩φ) = {s ∈ S : ∃(s, t) ∈ A(π). t ∈ V(φ)}. This renders the meaning: ‘there is an execution of π which terminates at a state in which φ may be true’.

Assignments on programs
(viii) A(π; ρ) = {(s, t) ∈ S^2 : ∃w ∈ S : (s, w) ∈ A(π) ∧ (w, t) ∈ A(ρ)} = A(π) ◦ A(ρ), where ◦ denotes composition;
(ix) A(π ∪ ρ) = A(π) ∪ A(ρ);
(x) A(π*) = [A(π)]^{r,t}. The latter symbol denotes the reflexive and transitive closure of A(π), i.e., the relation ⋃_{n≥0} A(π)^n, with A(π)^n denoting the composition of n copies of A(π);
(xi) A(φ?) = {(s, s) : s ∈ V(φ)}.

Definition 7.3 (Satisfiability and validity) We denote a frame (S, A, V) by the symbol F; then
(i) a pointed frame is a pair (F, s) with s ∈ S. A formula φ is true at a pointed frame (F, s), which is denoted F, s |= φ, if and only if s ∈ V(φ);
(ii) a formula φ is true at a frame F if and only if F, s |= φ for each s ∈ S, which is denoted as F |= φ;
(iii) a formula φ is valid if and only if φ is true in each frame. This fact is denoted as |= φ;
(iv) a formula φ is satisfiable if and only if there exists a frame F such that F |= φ;
(v) a set Γ of SDL formulae is true at a frame F if and only if F |= φ for each φ ∈ Γ, and it is valid if it is true at each frame.

We state some basic valid formulae of SDL. Owing to the analogy between the SDL operators of necessity and possibility and those operators in modal logics, laws known from SML transfer to SDL with the same proofs, which we are thus at liberty to omit.

Theorem 7.1 The following SDL formulae are valid.
(i) [π](φ ∧ ψ) ≡ ([π]φ) ∧ ([π]ψ);
(ii) ⟨π⟩(φ ∨ ψ) ≡ (⟨π⟩φ) ∨ (⟨π⟩ψ);
(iii) PDL(K): [π](φ ⊃ ψ) ⊃ ([π]φ ⊃ [π]ψ);
(iv) ([π]φ) ∨ ([π]ψ) ⊃ [π](φ ∨ ψ);
(v) ⟨π⟩(φ ∧ ψ) ⊃ (⟨π⟩φ) ∧ (⟨π⟩ψ).
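The clauses of Definition 7.2 translate almost literally into a small evaluator over finite frames. The following Python sketch is illustrative only: the tuple encoding of formulae and programs and the names `truth_set` and `program_relation` are assumptions made here, with a frame represented as a triple (set of states, dict of atomic program relations, dict of atomic proposition truth sets).

```python
def compose(r1, r2):
    """Relational composition: (s, t) whenever (s, w) in r1 and (w, t) in r2."""
    return {(s, t) for (s, w) in r1 for (w2, t) in r2 if w == w2}

def closure(rel, states):
    """Reflexive and transitive closure of rel on a finite set of states."""
    result = {(s, s) for s in states}
    while True:
        bigger = result | compose(result, rel)
        if bigger == result:
            return result
        result = bigger

def program_relation(prog, frame):
    """The relation A(prog): atom, seq (;), choice (∪), star (*), test (?)."""
    states, atomic, _ = frame
    tag = prog[0]
    if tag == "atom":
        return atomic.get(prog[1], set())
    if tag == "seq":
        return compose(program_relation(prog[1], frame),
                       program_relation(prog[2], frame))
    if tag == "choice":
        return program_relation(prog[1], frame) | program_relation(prog[2], frame)
    if tag == "star":
        return closure(program_relation(prog[1], frame), states)
    if tag == "test":
        return {(s, s) for s in truth_set(prog[1], frame)}
    raise ValueError(f"unknown program constructor: {tag}")

def truth_set(phi, frame):
    """The set V(phi) of states at which phi is true."""
    states, _, props = frame
    tag = phi[0]
    if tag == "prop":
        return props.get(phi[1], set())
    if tag == "bot":
        return set()
    if tag == "not":
        return states - truth_set(phi[1], frame)
    if tag == "or":
        return truth_set(phi[1], frame) | truth_set(phi[2], frame)
    if tag == "and":
        return truth_set(phi[1], frame) & truth_set(phi[2], frame)
    if tag == "imp":
        return (states - truth_set(phi[1], frame)) | truth_set(phi[2], frame)
    if tag in ("box", "dia"):
        rel = program_relation(phi[1], frame)
        target = truth_set(phi[2], frame)
        quantifier = all if tag == "box" else any
        return {s for s in states
                if quantifier(t in target for (u, t) in rel if u == s)}
    raise ValueError(f"unknown formula constructor: {tag}")
```

On a three-state frame with A(a) = {(0,1), (1,2)} and V(p) = {2}, the evaluator gives V([a]p) = {1, 2} (state 2 satisfies the box vacuously) and V(⟨a*⟩p) = {0, 1, 2}. Such a check on one frame illustrates Theorem 7.1(i) but of course does not prove validity.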
We now recall an axiomatic system for SDL due to Segerberg [1], together with its inference rules.

Definition 7.4 (Axiom schemes for SDL)
Axiom schemes
(A1) contains all valid formulae of SL, called in this case tautologies in order to distinguish them from proper formulae of SDL;
(A2) ⟨π⟩⊥ ≡ ⊥. The dual (box) form is: [π]⊤ ≡ ⊤;
(A3) ⟨π ∪ ρ⟩φ ≡ ⟨π⟩φ ∨ ⟨ρ⟩φ. The dual form is: [π ∪ ρ]φ ≡ [π]φ ∧ [ρ]φ;
(A4) ⟨π⟩(φ ∨ ψ) ≡ ⟨π⟩φ ∨ ⟨π⟩ψ. The dual form is: [π](φ ∧ ψ) ≡ [π]φ ∧ [π]ψ;
(A5) ⟨π; ρ⟩φ ≡ ⟨π⟩⟨ρ⟩φ. The dual form is: [π; ρ]φ ≡ [π][ρ]φ;
(A6) ⟨π*⟩φ ≡ φ ∨ ⟨π⟩⟨π*⟩φ. The dual form is: [π*]φ ≡ φ ∧ [π][π*]φ;
(A7) ⟨π*⟩φ ⊃ φ ∨ ⟨π*⟩(¬φ ∧ ⟨π⟩φ). The dual form is: [φ ∧ [π*](φ ⊃ [π]φ)] ⊃ [π*]φ.
(A7) is called the axiom of induction: indeed, it states that if φ holds and any iteration of π derives [π]φ from φ, then any iteration of π terminates with φ.

Inference rules
(D) Detachment: if φ and φ ⊃ ψ, then ψ;
(Gen) Generalization: if φ, then [π]φ.

Completeness of SDL. We recall the Kozen-Parikh proof of completeness (Kozen and Parikh [2]). The proof follows the idea, already met in the previous chapters, of verifying that each consistent formula is satisfiable. This proof covers the case of PDL without the test operator. It begins with the notion of the Fischer-Ladner closure (Fisher and Ladner [3]).
Definition 7.5 (The Fischer-Ladner closure (FLC)) Consider a consistent formula φ. The FLC for φ, denoted FLC(φ), is the smallest set containing φ and closed under the following rules:
(FLC1) If ξ ∨ ψ ∈ FLC(φ), then ξ ∈ FLC(φ) and ψ ∈ FLC(φ);
(FLC2) If ¬ξ ∈ FLC(φ), then ξ ∈ FLC(φ);
(FLC3) If ⟨π⟩ξ ∈ FLC(φ), then ξ ∈ FLC(φ);
(FLC4) If ⟨π ∪ ρ⟩ξ ∈ FLC(φ), then ⟨π⟩ξ ∈ FLC(φ) and ⟨ρ⟩ξ ∈ FLC(φ);
(FLC5) If ⟨π; ρ⟩ξ ∈ FLC(φ), then ⟨π⟩⟨ρ⟩ξ ∈ FLC(φ);
(FLC6) If ⟨π*⟩ξ ∈ FLC(φ), then ⟨π⟩ξ ∈ FLC(φ) and ⟨π⟩⟨π*⟩ξ ∈ FLC(φ).
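Rules (FLC1)–(FLC6) can be read as a worklist procedure. The sketch below is a hedged illustration of that reading: it assumes formulae built from atomic propositions with ¬, ∨ and ⟨π⟩, test-free programs, and that FLC(φ) contains φ itself; the tuple encoding and the name `flc` are introduced here.

```python
# Formulae: ("prop", p), ("not", f), ("or", f, g), ("dia", prog, f)
# Programs: ("atom", a), ("choice", p, q), ("seq", p, q), ("star", p)
def flc(phi):
    """Close {phi} under rules (FLC1)-(FLC6) with a worklist."""
    closed, todo = set(), [phi]
    while todo:
        f = todo.pop()
        if f in closed:
            continue
        closed.add(f)
        tag = f[0]
        if tag == "or":                                   # (FLC1)
            todo += [f[1], f[2]]
        elif tag == "not":                                # (FLC2)
            todo.append(f[1])
        elif tag == "dia":
            prog, body = f[1], f[2]
            todo.append(body)                             # (FLC3)
            if prog[0] == "choice":                       # (FLC4)
                todo += [("dia", prog[1], body), ("dia", prog[2], body)]
            elif prog[0] == "seq":                        # (FLC5)
                todo.append(("dia", prog[1], ("dia", prog[2], body)))
            elif prog[0] == "star":                       # (FLC6)
                todo += [("dia", prog[1], body), ("dia", prog[1], f)]
    return closed
```

For φ = ⟨a*⟩p the procedure returns the four formulae ⟨a*⟩p, p, ⟨a⟩p and ⟨a⟩⟨a*⟩p. The formula ⟨a⟩⟨a*⟩p contributes nothing new, because its body ⟨a*⟩p is already in the set; this is the mechanism that keeps the closure finite.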
Theorem 7.2 FLC(φ) is finite for each consistent formula φ.

Proof It goes by structural induction on the set of sub-formulae.
(i) FLC(p) = {p} for each atomic proposition p;
(ii) FLC(φ ⊃ ψ) = {φ ⊃ ψ} ∪ FLC(φ) ∪ FLC(ψ);
(iii) FLC(⊤) = {⊤};
(iv) FLC(¬ψ) = FLC(ψ ⊃ ⊥) = {ψ ⊃ ⊥} ∪ FLC(ψ) ∪ {⊥};
(v) FLC(ψ ∨ ξ) = FLC(¬ψ ⊃ ξ);
(vi) FLC(ψ ∧ ξ) = FLC(¬(ψ ⊃ ¬ξ));
(vii) FLC([a]ψ) = {[a]ψ} (a an atomic program);
(viii) FLC([π]ψ) = {[π]ψ} ∪ FLC(ψ); FLC(⟨π⟩ψ) = FLC(¬[π]¬ψ);
(ix) FLC([π ∪ ρ]ψ) = {[π ∪ ρ]ψ} ∪ FLC([π]ψ) ∪ FLC([ρ]ψ); FLC(⟨π ∪ ρ⟩ψ) follows as in (viii);
(x) FLC([π; ρ]ψ) = {[π; ρ]ψ} ∪ FLC([π][ρ]ψ) ∪ FLC([ρ]ψ); FLC(⟨π; ρ⟩ψ) follows;
(xi) FLC([π*]ψ) = {[π*]ψ} ∪ FLC([π][π*]ψ);
(xii) FLC([ψ?]ξ) = {[ψ?]ξ} ∪ FLC(ψ).

The corresponding formulae for the possibility operator ⟨.⟩ are obtained as in (viii). The size of a formula ψ, denoted |ψ|, is defined as the number of non-parenthesis symbols in ψ. The size |π| of a program π is defined analogously. One then employs (i)–(xii) to prove by structural induction on sub-formulae that |FLC(φ)| ≤ |φ| and |FLC([π]ψ)| ≤ |π|.

Owing to the finiteness of FLC(φ), it can be written down as a list of formulae {ψ_1, ψ_2, ..., ψ_n}. We meet here the notion of an open complete description, already met in previous chapters and to be met again in Chap. 8, though here it goes by another name.

Definition 7.6 (FLC(φ)-atomic formulae) An FLC(φ)-atomic formula is a consistent conjunction σ : ψ′_1 ∧ ψ′_2 ∧ ... ∧ ψ′_n, where each ψ′_i is either ψ_i or ¬ψ_i. The set of FLC(φ)-atomic formulae is denoted as A(φ). Then:
(i) For each ψi ∈ F LC(φ) and σ ∈ A(φ), σ ψi ∨ σ ¬ψi . By the fact that either ψi or ¬ψi occurs in σ and of the tautology p ∧ q → p; (ii) σψ i σ ≡ ψi ; (iii) ≡ σ. Theorem 7.3 There exists an FLC(φ)-atomic formula σ such that σ ⊃ φ. Proof Indeed, as φ is consistent, it extends to a maximal consistent set C and then C ∪ { } φ so by deduction theorem for SL, C ⊃ φ and as is consistent, and, by 1.8(iii), ≡ all σ σ, there exists at least one consistent σ such that σ ⊃ φ. Definition 7.7 (The Kozen-Parikh frame) The Kozen-Parikh frame is defined as follows: (i) the set of states S is defined as the set of FLC(φ)-atomic formulae, {σ1 , σ2 , . . . , σ2n }; (ii) for each atomic program a, the assignment A(a) = {(σi , σ j ) : σi ∧ a σ j is consistent}; (iii) for each atomic proposition p, the set V ( p) of states is given as {σi : σi ⊃ p}. Theorem 7.4 SDL is complete. Proof (Kozen, Parikh). The proof is split into three lemmas. Lemma 7.1 For each program π, if σi ∧ π σ j is consistent, then (σi , σ j ) ∈ A(π). The proof is by structural induction and it rests on the axiom schema (A7): π ∗ φ ⊃ φ ∨ π ∗ (¬φ ∧ π φ), or, in the dual form (A7*): φ ∧ [π∗](φ ⊃ [π]φ) ⊃ [π ∗ ]φ. The first step hs been made in Definition 7.7(ii) for atomic programs a. Next, one has to consider the three cases: (i) π is ρ ξ; (ii) π is ρ; ξ; (iii) π is ρ∗ . For case (i): suppose that σi ∧ ρ ξ σ j is consistent, hence, by (A3), (a) (σi ∧ ρ σ j ) ∨ (σi ∧ ξ σ j ) is consistent, i.e., either (b) σi ∧ ρ σ j
7.1 Sentential Dynamic Logic (SDL)
339
is consistent or (c)σi ∧ ξ σ j is consistent, i.e., by hypothesis of induction, either (σi , σ j ) ∈ A(ρ) or (σi , σ j ) ∈ A(ξ), hence, (σi , σ j ) ∈ A(ρ ξ). For case (ii): suppose that σi ∧ ρ; ξ σ j is consistent, hence, by (A5), (d)σi ∧ ρ ( ξ σ j ) is consistent. In order to introducean intermediate state between σi and σ j , we introduce into (d) in the form of σ σ and then the axiom schema (A4) allows us to write (d) in the form σi ∧ ρ (σ ∧ ξ σ j ), (e) σ
so there exists σk such that ( f ) σi ∧ ρ (σk ∧ ξ σ j ) is consistent. By (A2), (g) σk ∧ ξ σ j is consistent and, by 1.2(ii), (h) σi ∧ ρ σk is consistent. By hypothesis of induction, (σi , σk ) ∈ A(ρ) and (σk , σ j ) ∈ A(ξ), hence, (σi , σ j ) ∈ A(ρ; ξ). For case (iii): suppose that σi ∧ ρ∗ σ j is consistent. Consider the smallest set Γ ⊆ A of FLC(φ)-atomic formulae with the properties: (*) if σ ∈ Γ and σ ∧ ρ σ is consistent, then σ ∈ Γ ; (**) σi ∈ Γ . By hypothesis of induction, for each pair (σ, σ ) ∈ Γ 2 , if σ ∧ ρ σ , then (σ, σ ) ∈ A(ρ). It follows by induction that for each σ ∈ Γ , (σi , σ) ∈ A(ρ∗ ). Hence, one has to show that σj ∈ Γ . Consider γ : Γ . Then ( j) γ ∧ ρ ¬γ is inconsistent as it is equivalent by (A4) to (k)
$$\bigvee_{X \in \Gamma,\; Y \notin \Gamma} X \wedge \langle\rho\rangle Y$$
and each term X ∧ ρ Y is inconsistent as Y ∈ / Γ . It follows that γ ⊃ [ρ]γ and, by (Gen), [ρ∗ ]γ ⊃ [ρ]γ. As σi ⊃ γ, one obtains by (A7) that σi ⊃ γ, hence, σi ∧ ρ∗ ¬γ is inconsistent, hence, σ j ∈ Γ and (σi , σ j ) ∈ A(ρ∗ ). This proves Lemma 7.1. Lemma 7.2 For each formula π ψ ∈ F LC(φ) and each FLC(φ)-atomic formula σ, the following equivalence holds: σ ⊃ π ψ if and only if there exists an FLC(φ)atomic formula σ such that (σ, σ ) ∈ A(π) and σ ⊃ ψ. Proof From left to right it is straightforward, if we remember that ψ is one of ψi ’s making F LC(φ) and σ, σ are conjuncts containing in each position i either ψi or ¬ψi . From σ ⊃ π ψ it follows that for some σ , it is true that σ ∧ π σ is consistent and σ ⊃ ψ. The opposite direction is proved by structural induction on π. Supposethat (σ, σ ) ∈ A(π) and σ ⊃ ψ. There are four cases: (a) an atomic a (b) (ρ ξ) (c) (ρ; ξ) (d) ρ∗ . For case (a): if π is a, an atomic program, then (σ, σ ) ∈ A(a) means that σ ∧ a σ is consistent, and σ ⊃ ψ implies that σ ∧ a ψ is consistent, therefore, σ ⊃ a ψ. For case (b): suppose that (σ, σ ) ∈ A(ρ ξ), so either (σ, σ ) ∈ A(ρ) or (σ, σ ) ∈ A(ξ). Suppose it is the former. By rule (FLC(4)), ρ ψ ∈ F LC(φ) so, by hypothesis of induction, σ ⊃ ρ ψ, and σ ⊃ ρ ξ ψ. For case (c): suppose that (σ, σ ) ∈ A(ρ; ξ), so (σ, σ ∗ ) ∈ A(ρ) and (σ ∗ , σ ) ∈ A(ξ) for some σ ∗ . As in case(b), σ ∗ ⊃ ξ ψ. By rules (FLC(5)) and (FLC(3)), ξ ψ ∈ F LC(φ) and, by hypothesis of induction, σ ⊃ ρ ξ ψ, hence, by (A5), σ ⊃ ρ; ξ ψ. For case (d): suppose that (σ, σ ) ∈ A(ρ∗ ) so for some sequence σ1 = σ, σ2 , . . . , σn = σ of states we have (σi , σi+1 ) ∈ A(ρ) for i = 1, 2, . . . , n − 1. Reasoning by backward induction from σn we prove that σ ⊃ ρ∗ ψ. This concludes the proof of Lemma 7.2. Lemma 7.3 For an LFC-atomic sub-formula σ and a formula ψ ∈ L FC(φ), σ |= ψ if and only if σ ⊃ ψ. Proof It is by structural induction. For an atomic proposition p, the thesis follows by definition of valuation V . 
Cases to consider are: (i) ψ is α ∨ β; (ii) ψ is ¬α; (iii) ψ is π α. For (i): follows by properties of |= and ; For (ii): follows by (FLC(1)), (FLC(2)); For (iii): σ π α if and only if, by Lemma 7.2., for some σ ∗ , (σ, σ ∗ ) ∈ A(π) and σ ∗ ⊃ α, hence, by hypothesis of induction, σ ∗ |= α but this means that σ |= π α.
As already observed, for the formula φ, there exists σ with the property that σ ⊃ φ, hence, σ |= φ. This proves completeness of SDL. An upshot of this proof of completeness is the fact that φ has the small model property: F LC(φ) is finite. Hence, SDL is decidable. A more exact evaluation of cardinality of a small model is obtained by the method of filtration. Filtration. Small model property for SDL The method of filtration, known from modal logics, reminds of the LindenbaumTarski approach by forming quotient Kripke structures. It leads as we know from φ discussion of modal logics, to small structures of size at most 22 for each formula φ satisfiable in a given original structure. The number of quotient structures is φ bounded by 22 and this doubly exponential size makes it impractical but theoretically important. Definition 7.8 (Filtration) We consider a Kripke structure = (S, A, V ) and a formula φ of SDL. We define an equivalence relation ∼ on states in S by letting: ∼ (s, s ) if and only if s |= ψ if and only if s |= ψ for each formula ψ ∈ F LC(φ) The relation ∼ is an equivalence relation on the set S, and we denote by the symbol S/ ∼ the set of equivalence classes [s]∼ of states in S: S/ ∼= {[s]∼ : s ∈ S}, where [s]∼ = {s :∼ (s, s )}. The relation ∼ factors in a natural way through an assignment A and valuation V: (i) (A/ ∼)(a) = {([s], [s ]) : (s, s ) ∈ A(a)} for each atomic program a; (ii) (V / ∼)( p) = {[s]∼ : s ∈ V ( p)}. A/ ∼ and V / ∼ defined in (i) and (ii), extend in the familiar way to programs and formulae. The quotient Kripke structure ∼ = (S/ ∼, A/ ∼, V / ∼) is the filtration of by F LC(φ). The first task is to check that this quotient structure behaves correctly with respect to formulae in F LC(φ). Postulates for a correct extension to F LC(φ) are brought for in Theorem 7.5 Let = (S, A, V ) be a Kripke structure and φ a formula of SDL. 
In the notation of Definition 7.8, for states s, s ∈ S: (i) s ∈ V (ψ) if and only if [s]∼ ∈ V / ∼ (ψ) for each ψ ∈ F LC(φ);
(ii) for each formula [π]ψ ∈ FLC(φ), if (s, s′) ∈ A(π), then ([s]∼, [s′]∼) ∈ A/∼(π); (iii) for each formula [π]ψ ∈ FLC(φ), if ([s]∼, [s′]∼) ∈ A/∼(π) and s ∈ V([π]ψ), then s′ ∈ V(ψ).
The proof of Theorem 7.5 applies structural induction, checking all possible cases. We omit the proof and advise interested readers to consult (Harel et al. [4], II.6.2). The upshot of Theorem 7.5 is the theorem on existence of small structures for SDL (Fischer and Ladner [3]).
Theorem 7.6 (The small structure (model) theorem) For a formula φ of SDL, if φ is satisfiable, then there exists a filtered structure of size not greater than 2^{|φ|} in which φ is satisfiable.
Proof Take a Kripke model for φ and a state s in it such that s |= φ; then, in the filtered structure, [s]∼ |= φ, by Theorem 7.5(i). In the quotient structure, each state can be identified with a truth assignment to the formulae of FLC(φ), of which there are at most |φ|, hence, the size of the quotient structure is not greater than 2^{|φ|}, |φ| being the number of symbols in φ, i.e., the size of φ.
Decidability, validity, complexity of SDL
Theorem 7.7 SDL is decidable. Indeed, this follows from the small model theorem: for each formula φ, it suffices to check all Kripke structures of sizes not greater than 2^{|φ|} in order to ascertain whether φ is satisfiable. Satisfiability in SDL is in NTIME (Fischer and Ladner [3]).
Theorem 7.8 Satisfiability of a formula φ can be checked in NTIME(c^{|φ|}) for some constant c.
The algorithm for checking satisfiability of a formula φ proposed in [3] is the following: (1) guess a structure (S, A, V) of size not greater than c^{|φ|}; (2) guess a state s in S; (3) check whether s |= φ. Model checking in (3) can be done in time polynomial in the sizes of φ and of the structure (cf. [3]). It follows that the complexity of validity checking is co-NTIME(c^{|φ|}).
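Step (3) of the algorithm above, checking s |= φ in a given finite structure, can be illustrated by a small sketch. It is only an illustration under our own assumptions: formulae and programs are encoded as nested tuples, the names `compose`, `star`, `relation` and `holds` are hypothetical, only the program constructors discussed in this section (atomic programs, union, composition, iteration) are covered, and the naive recursion is not tuned to meet the polynomial bound.

```python
def compose(r1, r2):
    """Relational composition: (s, u) whenever (s, t) in r1 and (t, u) in r2."""
    return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}


def star(r, states):
    """Reflexive-transitive closure of r on the given set of states."""
    closure = {(s, s) for s in states}
    frontier = set(closure)
    while frontier:
        frontier = compose(frontier, r) - closure
        closure |= frontier
    return closure


def relation(prog, states, A):
    """Relation A(prog) for programs built from atoms, union, ';' and '*'."""
    kind = prog[0]
    if kind == "atom":
        return set(A[prog[1]])
    if kind == "union":
        return relation(prog[1], states, A) | relation(prog[2], states, A)
    if kind == "seq":
        return compose(relation(prog[1], states, A), relation(prog[2], states, A))
    if kind == "star":
        return star(relation(prog[1], states, A), states)
    raise ValueError(f"unknown program constructor: {kind}")


def holds(formula, s, states, A, V):
    """Decide s |= formula in the structure (states, A, V)."""
    kind = formula[0]
    if kind == "prop":
        return s in V[formula[1]]
    if kind == "not":
        return not holds(formula[1], s, states, A, V)
    if kind == "or":
        return holds(formula[1], s, states, A, V) or holds(formula[2], s, states, A, V)
    if kind == "dia":  # <prog> psi: some prog-successor of s satisfies psi
        rel = relation(formula[1], states, A)
        return any(holds(formula[2], t, states, A, V) for (u, t) in rel if u == s)
    raise ValueError(f"unknown formula constructor: {kind}")


# A toy structure (an assumption for illustration): 0 -a-> 1 -a-> 2, p true at 2.
S = {0, 1, 2}
A = {"a": {(0, 1), (1, 2)}}
V = {"p": {2}}
a = ("atom", "a")
p = ("prop", "p")
```

In this toy structure, `holds(("dia", ("star", a), p), 0, S, A, V)` evaluates the formula ⟨a*⟩p at state 0. The box operator [π]ψ can be encoded as `("not", ("dia", prog, ("not", psi)))`.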
7.2 Epistemic Logics The case of a single reasoning agent In this section, we propose a discussion of epistemic logics, a clone of modal logics in which operators of necessity and possibility appear as operators of knowledge and belief. Yet, it is not any direct translation consisting in mere symbol changing as
notions of knowledge and belief have a wide spectrum of applications, for instance, in reasoning by teams of agents. We begin with epistemic logics for a single reasoning agent. Epistemic logic in its present formulation is constructed on the basis of modal logics. It considers notions of knowledge and belief, usually modelled by following the modal logic modelling of necessity and possibility. However, one can model knowledge and belief separately, which leads to strictly epistemic logics concerned with the notion of knowledge (from epistêmê = knowledge in Greek) and to doxastic logics (from doxa = belief in Greek) concerned with the notion of belief. From Plato and Aristotle, epistêmê has been regarded as the knowledge of things that ‘cannot be otherwise’, and this feature would exclude all possibilities of deliberating about them, contrary to techné, which encompassed changeable aspects of reality allowing ‘calculations’ about them. Doxa encompassed opinions about reality not founded on its exact features, a kind of ‘knowledge for the laymen’. Relations between knowledge and belief are studied in logical theories of knowledge and we will give some account of basics in this area. We assume here that the reader has read the chapter on modal logics and is familiar with its notation and technical content. In epistemic logic, the necessity operator L is denoted by the symbol K and the formula Kφ is read ‘it is known that φ’. The dual operator ¬K¬ is denoted by the symbol ⟨K⟩. In doxastic logic, the belief operator is denoted by B and its dual is denoted by ⟨B⟩. Thus, contrary to the usage in modal logics, operators K and B are not dual to each other but have their own dual forms and, though it is possible that the two occur jointly in some contexts, they form parallel theories. Both epistemic and doxastic logics are more difficult to formalize than modal logics due to the subjective understanding of the notions of knowledge and belief.
It was an idea in Hintikka [5] that to be positive about validity of a statement in a certain state one should check its validity in all achievable states and only the validity in all those states could corroborate validity in the given state. Though the notion of a possible state has a long history associated with names like Leibniz and Carnap, yet the introduction of semantics of possible worlds in Kripke [6] opened up venues for interpretations of notions of knowledge and belief in rigorous ways. We concentrate on technical aspects of epistemic and doxastic logics. We recall modal logics K, T, S4, S5. As knowledge and belief logics are built on the lines of modal logics, we refer to Chap. 4 for formal introduction to syntax and semantics of modal logics and we adapt them to the epistemic content by introducing epistemic logics, where the symbol (SL) denotes all valid formulae of sentential logic. Notwithstanding this reservation, we shortly recall the syntax of epistemic logics. Definition 7.9 (Syntax of epistemic logic I) Syntactic constructs of epistemic logic encompass (i) atomic propositions of sentential logic SL: p1 , p2 , . . . of which we have a countable set P and of the symbol ⊥ for falsum (unsatifiable); (ii) connectives of sentential logic: ∨, ∧, ⊃, ≡, ¬. Relational vocabulary consists of the epistemic operator K .
Well-formed formulae of epistemic logics (wf’s) are defined as follows, where wfφ denotes that φ is well-formed: (iv) all valid propositions of SL are wf; their set is denoted (PL); (v) if wfφ and wfψ, then wfφ ∨ ψ, wfφ ∧ ψ, wfφ ⊃ ψ, wfφ ≡ ψ, wf¬φ; (vi) if wfφ, then wfKφ. In particular, verum defined as ¬⊥ is wf. We have the analogues of modal logics for the epistemic case.
Definition 7.10 (Syntax of epistemic logic II. Logic EK) EK satisfies the axiomatic schemes (SL) and (EK): (i) (EK) K(φ ⊃ ψ) ⊃ (Kφ ⊃ Kψ). Let us notice that (EK) is often introduced in the form (ii) (EK) (Kφ ∧ K(φ ⊃ ψ)) ⊃ Kψ. Thus, in the logic EK knowledge satisfies the statement ‘if it is known that φ implies ψ, then if it is known that φ then it is known that ψ’.
Definition 7.11 (Epistemic logic ET) The axiomatic schemes for ET are (SL), (EK) and (ET): (i) (ET) Kφ ⊃ φ. In the logic ET knowledge satisfies the statement: ‘If it is known that φ then φ is true’, or, in a paraphrase, ‘what is known is true’.
Definition 7.12 (Epistemic logic ES4) The logic ES4 is endowed with axiomatic schemes (SL), (EK), (ET) and (E4): (i) (E4) Kφ ⊃ KKφ. In logic ES4 knowledge observes the statement: ‘if it is known that φ then it is known that it is known that φ’ - the statement witnessing the ‘positive introspection’: ‘they know that they know.’
Definition 7.13 (Epistemic logic ES5) We add one more axiomatic schema (E5) to (SL), (EK), (ET), (E4): (i) (E5) ¬K¬φ ⊃ K¬K¬φ. In logic ES5 knowledge obeys the property: ‘if it is not known that not φ then it is known that it is not known that not φ’ - the statement witnessing the ‘negative introspection’.
Definition 7.14 (Rules of inference) They are modal rules of inference with necessity operator L replaced by the epistemic operator K:
(i) Detachment rule (Modus Ponens, MP): if φ and φ ⊃ ψ, then ψ; (ii) Epistemic necessitation rule (EN): if φ, then Kφ.
In the same way we construct doxastic logics; here, we omit the interpretations, as they mimic the interpretations for knowledge logics with the word ‘knows’ replaced by the word ‘believes’. When we replace the epistemic operator K with the doxastic operator B, we obtain logics DK, DT, DS4, DS5 with axiomatic schemas paralleling the schemas for epistemic logic. Rules of inference for doxastic logics are analogous to those for epistemic logics, the only difference being that the epistemic operator K is replaced with the doxastic operator B, e.g., the rule (EN) $\frac{\phi}{K\phi}$ becomes the rule (DN) $\frac{\phi}{B\phi}$.
Definition 7.15 (Semantics for epistemic and doxastic logics) Semantics is defined on the lines introduced in Chap. 4, i.e., it is the Kripke semantics of possible worlds. We recall that a frame F is a pair (W, R), where W is a set of possible worlds and R is a binary accessibility relation on the set W. The meaning of the instance R(w, w′) is that, being in the world w, we check the status of a statement φ in worlds w′ which are possible for us to inspect. A structure for epistemic logic is a pair (F, A) of a frame F and an assignment A: P × W → {0, 1}. Thus, the assignment A sends each pair (p, w) of an atomic proposition p and a world w into a Boolean value 0 or 1. Once the assignment is defined, all sentential formulae in all worlds gain a value of truth or falsity. We denote by the generic symbol M the structure (F, A), i.e., (W, R, A). Given a world w ∈ W, a pointed structure is a pair (M, w) where M is the structure (W, R, A).
A Kripke structure for doxastic logic is defined on the same lines, except that we use another symbol, say S, for the accessibility relation; this is important in cases when we discuss a simultaneous occurrence of epistemic and doxastic logics in a common set of possible worlds. Definition 7.16 (The notion of satisfaction) Satisfaction conditions are exactly the same as for modal logics with already mentioned change of operator symbols. Epistemic as well as doxastic logics defined above are normal in the sense of Chap. 4 and they inherit all properties of modal logics on whose lines they have been defined. In particular, Theorem 7.9 Epistemic logics EK, ET, ES4, ES5 are strongly complete in, respectively, frames that are universal, reflexive, transitive, equivalent. The same holds for doxastic logics DK, DT, DS4, DS5. Theorem 7.10 All epistemic and doxastic logics defined above have the small structure (model) property and are decidable. Theorem 7.11 Satisfiability decision problems for EK, DK, ET, DT, ES4, DS4 are PSPACE-complete, satisfiability problems for ES5, DS5 are NP-complete.
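The satisfaction condition for K in a structure (W, R, A) can be made concrete by a short sketch. The encoding of formulae as nested tuples and the name `sat` are our assumptions for illustration only; the example checks one instance of the scheme (ET) Kφ ⊃ φ in a frame that is not reflexive and in its reflexive extension.

```python
def sat(formula, w, R, A):
    """Truth of `formula` at world w; R is a set of pairs, A maps (p, w) to 0 or 1."""
    kind = formula[0]
    if kind == "prop":
        return A.get((formula[1], w), 0) == 1
    if kind == "not":
        return not sat(formula[1], w, R, A)
    if kind == "imp":
        return (not sat(formula[1], w, R, A)) or sat(formula[2], w, R, A)
    if kind == "K":  # K phi: phi holds in every world accessible from w
        return all(sat(formula[1], v, R, A) for (u, v) in R if u == w)
    raise ValueError(f"unknown constructor: {kind}")


# A toy structure (an assumption for illustration): p is false at 1, true at 2.
W = {1, 2}
A = {("p", 1): 0, ("p", 2): 1}
T_instance = ("imp", ("K", ("prop", "p")), ("prop", "p"))

R_plain = {(1, 2), (2, 2)}         # world 1 does not see itself
R_reflexive = R_plain | {(1, 1)}   # reflexive extension of R_plain

fails_at_1 = not sat(T_instance, 1, R_plain, A)
holds_everywhere = all(sat(T_instance, w, R_reflexive, A) for w in W)
```

In the non-reflexive frame, Kp holds at world 1 although p does not, so the instance of (ET) fails there; in the reflexive frame it holds at both worlds. The doxastic operator B would be evaluated in the same way over the relation S.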
Proofs for most NP-statements can be found in Chap. 4. Proofs for PSPACEcompleteness are to be found in Ladner [7] and in (Halpern and Moses [8]). Epistemic-doxastic logics It is of course interesting to consider the interaction of both types of logics in common models. For a structure M = (W, R, S, A), and a world w ∈ W , we recall from Chap. 4 the notion of an R-neighborhood N F (w) = {w : R(w, w )}, where F is a frame (W, R). We define in the same manner the notion of an H -neighborhood for the frame H = (W, S). We recall that truth in case of epistemic or doxastic logic is understood as respective conditions (i) M, w |= K φ if and only if M, w |= φ for each world w ∈ N F (w); Similarly, in case of a doxastic logic we have the condition (ii) M, w |= Bφ if and only if M, w |= φ for each world w ∈ N H (w). Basic interrelations between the two types of logics can be expressed in some logics. We mention hybrid logics due to Hintikka [5] and Lenzen [9]. Definition 7.17 (Epistemic-doxastic logics KB1-KB4) Logics KB1-KB4 are characterized, respectively, by the following axiom schemes (KB1)-(KB4). (KB1) K φ ⊃ Bφ; This means that one believes in what they know. (KB2) Bφ ⊃ K Bφ; Each believer knows that they believe (‘positive introspection’); (KB3) Bφ ⊃ B K φ; Anyone who believes, believes that they know; (KB4) (¬K φ) ⊃ B¬K φ; Anyone who does not know believes that they do not know (‘negative introspection’). 
Theorem 7.12 Mixed logics (KB1)-(KB4) are satisfied under respective conditions: (i) A sufficient condition for satisfiability of (KB1) in a frame (W, R, S) is S ⊆ R, i.e., N H (w) ⊆ N R (w) for w ∈ W ; (ii) A sufficient condition for satisfiability of (KB2) in a frame (W, R, S) is that R, S satisfy the property RS-Euclidean: if R(w, v) then N H (v) ⊆ N H (w) for all pairs w, v ∈ W ; (iii) A sufficient condition for satisfiability of (KB3) in a frame (W, R, S) is that R, S satisfy the property SR-Euclidean: if S(w, v) then N R (v) ⊆ N H (w) for all pairs w, v ∈ W ; (iv) A sufficient condition for satisfiability of (KB4) in a frame (W, R, S) is that R, S satisfy the property: if S(w, u) and R(w, v) then R(v, u) for all triples w, v, u ∈ W . We have of course all modalities from Chap. 4 at our disposal. For the purpose of quoting the next result we recall axiom schemes:
(E5) ¬K φ ⊃ K ¬K φ; (D5)¬Bφ ⊃ B¬Bφ (ED) K φ ⊃ ¬K ¬φ; (DD)Bφ ⊃ ¬B¬φ Combinations of the above properties are possible as well. An interesting example is quoted in Blackburn et al. [10]. Theorem 7.13 There exists a structure in which if knowledge satisfies (E5) and belief satisfies (DD) then together with (KB1) they imply that the formula B K φ ⊃ K φ is satisfied, i.e., ‘if one believes that knows then they know’. Proof We outline a proof on semantic lines. Consider the accessibility relation R for epistemic and S for the doxastic part, with R a Euclidean and S a serial accessibility relations. Then each model with the union R ∪ S for those R and S satisfies (5) and (D). In order to satisfy (KB1), there must be S ⊆ R, so R ∪ S = R. We select a model in which R = S. Then models with R, S = R satisfy B K φ ⊃ K φ. From Definition 7.17 it follows that the addition of (KB3) to axiom schemes in the outlined in Theorem 7.13 class of structures, satisfies the schema K φ ≡ Bφ which means the collapse of knowledge as knowledge and belief coincide. Epistemic logics for n agents A simultaneous occurrence of knowledge and belief logics in a structure involves two accessibility relations, R and S, and we could introduce two beings r, s with r responsible for R and s for S. In such cases we speak of epistemic agents (agents, for short). Given n agents, which we denote as 1, 2, . . . , n, we introduce n accessibility relations R1 , R2 , . . . , Rn , the relation Ri pertaining to the agent i, for i ≤ n. Kripke structures for this case are tuples of the form Mn = (W, R1 , R2 , . . . , Rn , A) where W is a set of worlds, A is an assignment, i.e., A maps the Cartesian product P × W into the set of truth values {0, 1}, hence, A( p, w) ∈ {0, 1} for each pair ( p, w) ∈ P × W . Definition 7.18 (Epistemic logic EKn) This logic extends the one-agent epistemic logic EK. Like EK, it consists of two axiomatic schemes and two rules of inference. 
Axiomatic schemes are (i) all valid (tautological) schemes of sentential logic SL; (ii) (EKi) K i (φ ⊃ ψ) ⊃ [(K i φ) ⊃ (K i ψ)] equivalently (K i φ ∧ K i (φ ⊃ ψ)) ⊃ K i ψ for i ≤ n. Rules of inference are: (iii) detachment: if φ and φ ⊃ ψ, then ψ; (iv) epistemic necessitations (ENi): if φ, then K i φ for i ≤ n. Theorem 7.14 The epistemic logic EKn is sound.
Proof One proves on the lines for one-agent epistemic (modal) logics that the system EKn is sound: given a structure Mn , one checks that (i) Mn |= φ when φ is a valid formula of SL; (ii) Mn |= ψ whenever Mn |= φ and Mn |= φ ⊃ ψ; (iii) Mn |= K i (φ ⊃ ψ) ⊃ (K i φ ⊃ K i ψ) for i ≤ n; (iv) Mn |= K i φ whenever Mn |= φ. We prove (iii) and (iv) as an example of reasoning in EKn. For (iii), suppose that Mn , w |= K i (φ ⊃ ψ), hence, Mn , v |= φ ⊃ ψ at each v such that Ri (w, v). Suppose that M, w |= K i φ so that M, v |= φ. By detachment, M, v |= ψ, hence, M, w |= K i ψ. For (iv), suppose that Mn |= φ then, for each world w ∈ W , Mn , w |= φ, in particular if Ri (w, v) then Mn , v |= φ, hence, Mn , w |= K i φ. Completeness of n-agent logics We now address the completeness property for EKn. We consider the completeness proof for EKn as an archetype for completeness proofs for other epistemic logics. We recall from Chap. 4 some necessary notions and results proved there. Definition 7.19 (Consistency of EKn) A formula φ of EKn is consistent if and only if its negation ¬φ is not provable in EKn. A set Γ of formulae of EKn is consistent if and only if the formula Δ is consistent for each finite Δ ⊆ Γ . Consistency of Γ is denoted Con(Γ ). A set Γ of formulae is maximal consistent, which is denoted MaxCon(Γ ), if there is no consistent set Δ such that Γ ⊂ Δ. Technically, MaxCon(Γ ) if and only if adding to Γ any formula φ ∈ / Γ results in inconsistency of the set Γ ∪ {φ}. Theorem 7.15 We know from Chap. 
4 that MaxCon(Γ) has the following properties: (i) if MaxCon(Γ), then either φ ∈ Γ or ¬φ ∈ Γ for each formula φ of EKn; (ii) if MaxCon(Γ), then φ ∧ ψ ∈ Γ if and only if φ ∈ Γ and ψ ∈ Γ for each pair φ, ψ of formulae of EKn; (iii) if MaxCon(Γ), then if φ ∨ ψ ∈ Γ, then either φ ∈ Γ or ψ ∈ Γ for each pair φ, ψ of formulae of EKn; (iv) if MaxCon(Γ), then if φ ∈ Γ and φ ⊃ ψ ∈ Γ, then ψ ∈ Γ for each pair φ, ψ of formulae of EKn; (v) if MaxCon(Γ), then if Γ ⊢ φ, then φ ∈ Γ for each formula φ of EKn; (vi) if MaxCon(Γ), then ⊤ ∈ Γ.
In order to facilitate the perception of these facts, we include standard proofs of (i)–(vi). These proofs are based on the definition of consistency. Suppose that for a formula φ neither φ nor ¬φ is in Γ. Then both Γ ∪ {φ} and Γ ∪ {¬φ} are inconsistent and then Γ ∪ {φ ∨ ¬φ} is inconsistent, a contradiction because φ ∨ ¬φ is valid in PL, hence Γ would be inconsistent. This is at the heart of the Lindenbaum Lemma: adding successively φ or ¬φ to build a consistent set yields a maximal consistent set.
If φ ∧ ψ ∈ Γ and, was, say, φ ∈ / Γ , it would imply ¬φ ∈ Γ and would yield inconsistency of Γ . If φ, ψ ∈ Γ then we must have φ ∧ ψ ∈ Γ else ¬(φ ∧ ψ) ∈ Γ , i.e., (¬φ ∨ ¬ψ) ∈ Γ and,say, ¬ψ ∈ Γ , an inconsistency. (iii) is provable on the same lines. If φ, (φ ⊃ ψ) ∈ Γ then (¬φ ∨ ψ) ∈ Γ and by (iii) it must be ψ ∈ Γ as ¬φ ∈ Γ would cause inconsistency. (vi) holds true as ¬ ≡ ⊥ cannot be in Γ . (v) follows as φ ∈ / Γ would mean ¬φ ∈ Γ hence along with φ, Γ would prove falsity. Completeness of EKn means that each valid formula of EKn is provable. Suppose a formula φ is valid but not provable. Then the formula ¬¬φ is not provable and then ¬φ is consistent. If it was known that each consistent formula is satisfiable in some model Mn then ¬φ would be satisfiable in some model Mn contrary to validity of φ. Hence,in order to prove completeness of EKn, it suffices to prove Theorem 7.16 If a formula φ of EKn is consistent, then it is satisfied in some structure Mn . Proof As each consistent formula belongs in some maximal consistent set, the way of proof is to use the already known to us from Chap. 4 the technique of canonical models due to Henkin in which worlds are maximal consistent sets. We recall from Chap. 4 the rule (RK) which we reproduce here in the epistemic guise for each K i : (RKi):
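$$\frac{\gamma_1 \supset (\gamma_2 \supset (\ldots \supset (\gamma_k \supset \phi)))}{K_i\gamma_1 \supset (K_i\gamma_2 \supset (\ldots \supset (K_i\gamma_k \supset K_i\phi)))}$$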
We denote by Γ /K i the set {φ : K i φ ∈ Γ }. The canonical structure Mnc is a triple (W c , R1c , R2c , . . . , RnC , Ac ), where (C1) W c = {Γ : MaxCon(Γ )}; (C2) Ric (Γ, Γ ) if and only if Γ /K i ⊆ Γ ; (C3) Ac ( p, Γ ) = 1 if and only if p ∈ Γ . The main result is Lemma 7.4 For each MaxCon(Γ ) and a formula φ of EKn, (Mnc , Γ ) |= φ if and only if φ ∈ Γ . Proof of Lemma The proof is by structural induction. For a sentential variable p, Lemma is true by (C3). For a formula ¬φ, if ¬φ ∈ Γ then, by Theorem 7.15(i), φ∈ / Γ so, by assumption of induction, φ is invalid, hence, ¬φ is valid, the converse goes along the same lines. Proof for the case of φ ∧ ψ is analogous. The only harder case is that of ψ of the form of K i φ. The theorem is true for φ by hypothesis of induction. Suppose that ψ ∈ Γ for a MaxCon(Γ ). Then φ ∈ Γ /K i and it follows that if Ric (Γ, Γ ) then Mnc , Γ |= φ, hence, Γ |= ψ. For the converse, suppose that (Mnc , Γ ) |= K i φ for some MaxCon(Γ ). Then Γ /K i ∪ {¬φ} is inconsistent: each maximal consistent set Γ such that Ric (Γ, Γ ) contains φ and was Γ /K i ∪ {¬φ} consistent, we would have some MaxCon(Γ ∗ ) with Γ /K i ∪ {¬φ} ⊆ Γ ∗ ,i.e, Ric (Γ, Γ ∗ ), a contradiction.
Thus, there is a proof of ¬¬φ, i.e, of φ from Γ /K i : ψ1 , ψ2 , . . . , ψk , φ and by deduction theorem in PL we have (∗) ψ1 ⊃ (ψ2 ⊃ (. . . ⊃ (ψk ⊃ φ))) from which, by (RKi), we obtain (∗∗) K i ψ1 → (K i ψ2 → (. . . → (K i ψk → K i φ))) As {K i ψ1 , K i ψ2 , . . . , K i ψk } ⊆ Γ it follows from (**) that K i φ ∈ Γ . This concludes the proof of completeness of EKn.
The case of logics ETn, ES4n, ES5n, EKD45n
Definition 7.20 (Epistemic logics ETn, ES4n, ES5n, EKD45n) Following known cases of logics ET, ES4, ES5, and modal logic D, we extend them to the case of n agents in the same way in which EK has been extended to EKn, so our Kripke structures will be of the form Mn = (W, R1, R2, . . . , Rn, V) and K_i will be the epistemic operator for the agent i. By analogy with modal cases, we assign some rules to operators K_i for i ≤ n:
(Tni) K_i φ ⊃ φ;
(4ni) K_i φ ⊃ K_i K_i φ;
(5ni) ¬K_i φ ⊃ K_i ¬K_i φ;
(Dni) ¬K_i ⊥.
Remark 7.1 Let us notice that the rule (Dni) is equivalent to the rule known to us: K_i φ ⊃ ¬K_i ¬φ. We offer a proof:
1. K_i φ → ¬K_i ¬φ;
2. K_i ¬φ → ¬K_i φ; PL
3. φ/⊥; substitution
4. K_i ⊤ → ¬K_i ⊥; verum as ¬⊥
5. K_i ⊤; valid
6. ¬K_i ⊥; detachment
For the converse:
7. ¬K_i ⊥;
8. ¬K_i (φ ∧ ¬φ); PL
9. ¬[(K_i φ) ∧ (K_i ¬φ)]; property of the system K
10. ¬(K_i φ) ∨ (¬K_i ¬φ); PL
11. K_i φ → ¬K_i ¬φ.
With rules (Tni), (4ni), (5ni), and, (Dni), we define other epistemic logics for n agents as satisfying the sets of axiomatic schemes.
Definition 7.21 The axiom schemes for logics ETn, ES4n, ES5n, EKD45n: The logic ETn satisfies axiom schemes (Kni),(Tni) for i ≤ n. The logic ES4n satisfies axiom schemes (Kni, (Tni), (4ni) for i ≤ n. The logic ES5n satisfies axiom schemes (Kni), (Tni), (4ni), (5ni) for i ≤ n. The logic EKD45n satisfies axiom schemes (Kni, (4ni), (5ni), (Dni) for i ≤ n. In analogy to the modal cases of Chap. 4, we check that corresponding results are true for epistemic logics for n agents. Theorem 7.17 Let symbols MnT,c , Mn4,c , Mn5,c , MnD,c denote that the canonical model Mnc , validates, respectively, all instances of ETn, ES4n, ES5n, EKD45n. Then (i) MnT,c is reflexive, i.e., Ric (Γ, Γ ) for each MaxCon(Γ ) and i ≤ n; (ii) Mn4,c is transitive, i.e, Ric (Γ, Γ ) and Ric (Γ , Γ ∗ ) imply Ric (Γ, Γ ∗ ) for each MaxCon{Γ, Γ , Γ ∗ } and i ≤ n; (iii) Mn5,c is Euclidean, i.e., if Ric (Γ, Γ ) and Ric (Γ, Γ ∗ ) then Ric (Γ , Γ ∗ ); (iv) MnD,c is serial,transitive and Euclidean, i.e., for each i ≤ n and for each world Γ , there exists a world Γ ∗ such that Ric (Γ, Γ ∗ ) and conditions in (ii), (iii) are also satisfied. Proof We recall that S5 contains the formula (B) φ ⊃ K ¬K ¬φ so if we prove that the canonical model for E Bn is symmetric, then (ii) along with (i) will show that the canonical model Mn5,c is Euclidean. Let us observe that (i) the canonical accessibility relation Ric can be defined equivalently as follows: Ric (w, v) if and only if ¬K i ¬φ ∈ v implies φ ∈ w. Suppose then that Ric (Γ, Γ ∗ ) and φ ∈ Γ . Detachment from the formula (B) for i in Γ yields K i ¬K i ¬φ ∈ Γ , hence, ¬K i ¬φ ∈ Γ ∗ which by Remark I means that Ric (Γ ∗ , Γ ). For (i), suppose that the canonical model Mnc satisfies all instances of (Tni),i.e., at each Γ the formula K i φ ⊃ φ is valid. If Γ |= K i φ then Γ |= φ, i.e., Γ /K i ⊆ Γ which means that Ric (Γ, Γ ), i.e., Mnc is reflexive which can be also denoted Mnc,r . 
For (ii), suppose that the canonical model Mnc satisfies all instances of S4ni, i.e., at each Γ the formula K i φ ⊃ K i K i φ is valid. Then for MaxCon{Γ , Γ , Γ ∗ } we have that if Γ |= K i φ then Γ |= K i K i φ, hence, if Ric (Γ, Γ ) then Γ |= K i φ and if Ric (Γ , Γ ∗ ) then Γ ∗ |= φ, i.e.,φ ∈ Γ ∗ , so Mnc is transitive. We denote Mnc in this case as Mnc,tr . For (iv), we should consider the formula (ED). Suppose that Γ |= ¬K i ⊥. Consider Γ /K i . Claim. Γ /K i is consistent. Indeed, suppose not. Then there is a proof of ⊥ from Γ /K i : φ1 ⊃ (φ2 ⊃ (. . . ⊃ (φk ⊃ ⊥))) and the rule (R K i ) yields K i φ1 ⊃ (K i φ2 ⊃ (. . . ⊃ (K i φk ⊃ K i ⊥))), a contradiction. It follows that Γ /K i can be extended to a MaxCon(Γ ∗ ) and Ric (Γ, Γ ∗ ) holds true which proves seriality of Mnc denoted now Mnc,s . c,eq We now denote by the symbol Mnc,r , respectively Mnc,tr , Mn , Mnc,e,s,tr canonical models in which, respectively, all instances of ETn, ES4n, ES5n, EKD45n are
valid. Then, ETn, ES4n, ES5n, EKD45n are sound and complete with respect to, respectively, M_n^{c,r}, M_n^{c,tr}, M_n^{c,eq}, M_n^{c,e,s,tr}. Proofs of these facts parallel the proof for EKn, with necessary modifications. We have justified
Theorem 7.18 Epistemic logics ETn, ES4n, ES5n, EKD45n are complete.
This model of multi-agent knowledge treats agents as independent from one another: each agent i is endowed with a knowledge operator K_i and each reasons on their own. However, there are inter-relations, as distinct agents can have access to common possible worlds. There is also the possibility of agents sharing knowledge and knowing the knowledge of one another.
Group knowledge and common knowledge
Definition 7.22 (Group and common knowledge) There are two basic kinds of collective knowledge for a group of agents: the group knowledge, when each agent knows a formula φ, expressed as
$$(E)\quad E\phi \equiv \bigwedge_{i \le n} K_i\phi,$$
when we discuss n agents, or, more generally, as
$$(EG)\quad E_G\phi \equiv \bigwedge_{i \in G} K_i\phi \quad \text{for a group } G \text{ of agents from among } 1, 2, \ldots, n,$$
and the common knowledge expressed as
$$(C)\quad C\phi \equiv \bigwedge_{k \ge 1} E^k\phi,$$
where as usual E^k means the sequence of length k of symbols E. Thus, Cφ means Eφ ∧ EEφ ∧ EEEφ ∧ . . . . The reading of Cφ is ‘everyone knows and everyone knows that everyone knows and ...’. Concerning Cφ, we observe that for a frame Mn = (W, R1, R2, . . . , Rn), reaching a world v from a world w in order to satisfy E would mean that each relation R_i should be allowed, so the global accessibility relation in that case should be R^E = ⋃_{i≤n} R_i. Then, the global accessibility relation for E^k is R(k) = (R^E)^k, hence, the accessibility relation for C, R^C, is the transitive closure Cl(R^E) = R^C = ⋃_{k≥1} (R^E)^k. We have thus two kinds of frames: frames F^E = (W, R^E) and frames F^C = (W, R^C) derived from frames Mn = (W, R1, R2, . . . , Rn).
Definition 7.23 (Satisfaction for group and common knowledge) Rules for satisfaction for group as well as for common knowledge are as follows:
(i) Mn, w |= Eφ if and only if Mn, w |= K_i φ for each i ≤ n; (ii) Mn, w |= Cφ if and only if Mn, w |= E^k φ for each k ≥ 1. Satisfiability in terms of accessibility relations is expressed in the following forms: (iii) Mn, w |= Eφ if and only if Mn, v |= φ for each world v such that R^E(w, v); (iv) Mn, w |= Cφ if and only if Mn, v |= φ for each world v such that R(k)(w, v) for some k ≥ 1, i.e., for each v reachable from w by one or more applications of R^E. From (iii) and (iv), some properties of E and C follow.
Theorem 7.19 The following hold true for each structure Mn: (i) Mn |= (Cφ ≡ E(φ ∧ Cφ)); (ii) if Mn |= (φ ⊃ E(φ ∧ ψ)), then Mn |= (φ ⊃ Cψ).
Proof For (i): suppose Cφ holds at w. This means that Mn, v |= φ for each world v that is reached by some R(k) from w. Then, for k = 1, if v is reached by R^E from w then φ holds at v, hence Eφ holds at w. Clearly, if some world t is reached from v by some R(k) then t is reached from w by R(k + 1), so Cφ holds at v, hence, E(φ ∧ Cφ) holds at w. For the converse, suppose that E(φ ∧ Cφ) holds at w and consider any v which can be reached from w by some R(k). Take the world t on the path from w to v which is the successor to w. Clearly, φ ∧ Cφ holds at t, so φ holds at v whether v = t or v is reached from t. As v was arbitrary, Cφ holds at w.
For (ii), suppose Mn |= φ ⊃ E(φ ∧ ψ). Let w be a world in Mn and Mn, w |= φ. We apply induction on the length of a path from w. If R^E(w, v) then φ ∧ ψ holds at v. Suppose that φ ∧ ψ holds at each world that can be reached from w by R(k) and let t be reached from w by R(k + 1). Let z be the predecessor to t on the path from w so, by assumption of induction, φ ∧ ψ holds at z. As φ ⊃ E(φ ∧ ψ) holds in Mn, t inherits φ ∧ ψ from z. We conclude that Mn, w |= Cψ, so φ ⊃ Cψ holds true in Mn.
We can interpret Theorem 7.19(i) by stating that each agent having common knowledge about φ knows that everyone knows φ and knows that everyone has common knowledge.
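The reading of E and C in terms of the relations R^E and R^C can be tried out on a small frame. The sketch below is an illustration under our own assumptions: formulae are represented by their truth sets (sets of worlds), the function names are hypothetical, and the two-agent frame is a toy example; it verifies the equivalence of Theorem 7.19(i) on that frame only.

```python
def reachable(w, R_E):
    """Worlds reachable from w by one or more R_E-steps (the relation R^C)."""
    seen, stack = set(), [w]
    while stack:
        u = stack.pop()
        for (x, v) in R_E:
            if x == u and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def everyone_knows(truth_set, w, R_E):
    """E phi at w: phi holds at every R_E-successor of w."""
    return all(v in truth_set for (u, v) in R_E if u == w)


def common_knowledge(truth_set, w, R_E):
    """C phi at w: phi holds at every world reachable from w."""
    return reachable(w, R_E) <= truth_set


# A toy frame with two agents (an assumption for illustration).
W = {1, 2, 3, 4}
R1 = {(1, 2)}
R2 = {(1, 3), (3, 4)}
R_E = R1 | R2          # R^E is the union of the agents' relations
phi = {2, 3}           # worlds at which phi holds

C_phi = {w for w in W if common_knowledge(phi, w, R_E)}
E_phi_and_C_phi = {w for w in W if everyone_knows(phi & C_phi, w, R_E)}
```

At world 1 everyone knows φ, since φ holds at 2 and 3, yet φ is not common knowledge there, because world 4 is reachable in two steps and φ fails at it. The sets `C_phi` and `E_phi_and_C_phi` coincide, as Theorem 7.19(i) requires.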
The property of Theorem 7.19(i), along with the definition of $E$, serves as an additional axiom scheme for common knowledge.

Theorem 7.20 The following are implied by Theorem 7.19(i), (ii): (i) $E\varphi \equiv \bigwedge_{i \le n} K_i\varphi$; (ii) $C\varphi \equiv E(\varphi \wedge C\varphi)$.

Theorem 7.19(ii) provides an additional inference rule (IC):

(IC) if $\varphi \supset E(\varphi \wedge \psi)$, then $\varphi \supset C\psi$.

If we add Theorem 7.20(i), (ii), and (IC) to EKn, ETn, ES4n, ES5n, EKD45n, then we obtain systems of common knowledge ECKn, ECTn, ECS4n, ECS5n, ECKD45n,
respectively. One can expect that these systems are sound and complete with respect to models $M_n$, and it is so.

Definition 7.24 Completeness is demonstrated in axiomatic systems which contain: (1) axiom schemes proper for a given logic, e.g., $K_i(\varphi \supset \psi) \supset (K_i\varphi \supset K_i\psi)$ for ECKn and corresponding schemes for ECTn, ECS4n, ECS5n, ECKD45n; (2) the axiom scheme $E\varphi \equiv \bigwedge_{i \le n} K_i\varphi$; (3) the axiom scheme $C\varphi \equiv E(\varphi \wedge C\varphi)$; (4) inference rules: (IC) if $\varphi \supset E(\varphi \wedge \psi)$, then $\varphi \supset C\psi$, together with detachment and necessitation.

Theorem 7.21 Systems ECKn, ECTn, ECS4n, ECS5n, ECKD45n are complete.

We insert an outline of a proof for the system ECKn as a pattern for proofs for other common knowledge systems.

Proof The idea for a proof is already familiar: one proves that if a formula $\varphi$ is consistent then it is satisfiable, and the means for the proof is to construct maximal consistent sets from $\varphi$ in a manner resembling the Ladner construction of Chap. 4. Some details of the proof come from (Halpern and Moses [8]). For a formula $\varphi$, we denote by $Sub_C(\varphi)$ the set

$Sub(\varphi) \cup \{E(\psi \wedge C\psi),\ \psi \wedge C\psi,\ K_i(\psi \wedge C\psi) : C\psi \in Sub(\varphi),\ i \le n\} \cup \{K_i\psi : i \le n,\ E\psi \in Sub(\varphi)\}$

and we let $Sub_C^*(\varphi) = Sub_C(\varphi) \cup \{\neg\xi : \xi \in Sub_C(\varphi)\}$. Clearly, $Sub_C^*(\varphi)$ is finite. We consider maximal consistent subsets of $Sub_C^*(\varphi)$ and we define the set $MaxCon_C^*(\varphi)$ as the set of all maximal consistent sets in $Sub_C^*(\varphi)$. For $MaxCon(\Gamma)$ and $MaxCon(\Delta)$, we let, as in 2.19, $\Gamma/K_i = \{\varphi : K_i\varphi \in \Gamma\}$, and we define the relation $R_i(\Gamma, \Delta)$ as $\Gamma/K_i \subseteq \Delta$. The assignment $A$ on atomic propositions is defined in the standard way: $A(p, \Gamma) = 1$ if and only if $p \in \Gamma$. We should prove the claim:

Claim. $(MaxCon_C^*(\varphi), \Gamma) \models \psi$ if and only if $\psi \in \Gamma$.

As usual, it is proved by structural induction. By properties of maximal consistent sets, the claim holds for atomic propositions, conjunctions, and disjunctions. In case of a formula $E\psi$, by axiom scheme (2) of Definition 7.24, $E\psi \in \Gamma$ if and only if $K_i\psi \in \Gamma$ for each $i \le n$; then for each $\Delta$ with $R_i(\Gamma, \Delta)$ we have $\psi \in \Delta$ and, by the hypothesis of induction, $\Delta \models \psi$, hence $\Gamma \models K_i\psi$ for each $i \le n$ and finally $\Gamma \models E\psi$. The converse holds by maximality of $\Gamma$.

Finally, the case of $C\psi$ remains. Suppose that $C\psi \in \Gamma$. By axiom scheme (3) of Definition 7.24 and detachment, $E(\psi \wedge C\psi) \in \Gamma$. Notice that if $\Delta$ is reached from $\Gamma$ in one step of $R_E$, then $\psi \wedge C\psi \in \Delta$. As $\Delta$ is maximal consistent, $\psi, C\psi \in \Delta$. We extend this observation to an arbitrary $\Delta$ reached in $k$ steps from $\Gamma$. Suppose that for $k$ and $\Delta$, if $\Delta$ is reached from $\Gamma$ in $k$ steps of $R_E$, then $\psi, C\psi \in \Delta$, and consider the
case of $k + 1$. Let $\Theta$ be on the path from $\Gamma$ to $\Delta$, with $\Delta$ at one step from $\Theta$. By the hypothesis of induction, $\psi, C\psi \in \Theta$ and, by the case $k = 1$, $\psi, C\psi \in \Delta$. By the structural induction hypothesis, and by the principle of mathematical induction, $\Delta \models \psi$ for any $\Delta$ reachable from $\Gamma$, hence $\Gamma \models C\psi$.

Suppose now that $\Gamma \models C\psi$. Consider the finite set $\Omega$ of maximal consistent sets $\Delta$ reachable from $\Gamma$. In each such $\Delta$, $\psi$ holds true, hence, by the hypothesis of induction, $\psi \in \Delta$ for $\Delta \in \Omega$. As each $\Delta \in \Omega$ is on some path from $\Gamma$, by repeating the above reasoning, we arrive at the conclusion that $C\psi \in \Delta$ for each $\Delta \in \Omega$. By restraining ourselves to one step from $\Delta$, we obtain that $E(\psi \wedge C\psi)$ holds in each $\Delta$ reachable from $\Gamma$. Suppose now that $C\psi \notin \Gamma$, hence $\neg C\psi \in \Gamma$, hence $\Gamma \models \neg C\psi$, i.e., for some $\Delta$ reachable from $\Gamma$ we would have $\neg\psi \in \Delta$, a contradiction. Thus $C\psi \in \Gamma$. The proof is concluded.

Satisfiability, validity, decidability of epistemic logics

In case of one agent, results on satisfiability are like those for classical modal logics: as shown by Ladner [7], SAT(S5) and SAT(KD45) are NP-complete and SAT(K), SAT(T), SAT(S4) are PSPACE-complete. These results were extended in [8]: it was shown there that SAT(EKn), SAT(ETn), and also SAT(ES4n) are PSPACE-complete regardless of the number $n$ of agents, while SAT for ES5n and EKD45n is PSPACE-complete for $n \ge 2$. Passing to common knowledge, we enter the realm of EXPTIME: such is the complexity of SAT for ECKn, ECTn for $n \ge 1$ and of SAT for ECS4n, ECS5n, ECKD45n for $n \ge 2$, see [8]. We mention that the validity problem of deciding whether a formula is valid is in the co-class for the class of each logic, as a formula is valid if and only if its negation is unsatisfiable. The decidability problem for epistemic logic is addressed in a way similar to that for modal logics. We sketch the line of reasoning in the case of epistemic logics. First, we recall the algorithm for checking satisfiability in finite models.
For a finite model $M_n$, the size of $M_n$, denoted $|M_n|$, is defined as the sum of the number of worlds and of the number of instances of accessibility relations. We recall the following theorem, cf. [8].

Theorem 7.22 For a finite model $M$ of size $|M|$, checking satisfiability of a formula $\varphi$ can be done in time $O(|M| \cdot |\varphi|)$, where $|\varphi|$ is the length of $\varphi$.

Proof We form the set $SF(\varphi)$ of sub-formulae of $\varphi$, which we can list in the order of increasing length. The cardinality $|SF(\varphi)|$ is not greater than $|\varphi|$. We now begin the labelling procedure: to each world $w$ in $M$ we assign, for each $\psi \in SF(\varphi)$, either $\psi$ or $\neg\psi$, depending on which holds at $w$. In case of $K_i\psi \in SF(\varphi)$, we have to check all worlds connected to $w$ by $R_i$. Finally, we check $\varphi$ at each world. The complexity is of order $|M| \cdot |\varphi|$.

The next step is to show the existence of finite models for EKn-consistent formulae (Halpern and Moses [8]).
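The labelling procedure from the proof of Theorem 7.22 can be sketched as follows. This is a minimal illustration, not the algorithm of [8] verbatim: the tuple encoding of formulae, the dictionary encoding of the relations, and the two-world model are assumptions made for the example, and only atoms, negation, conjunction and the operators K_i are handled.

```python
def subformulas(phi):
    """List the sub-formulae of phi, children before parents, without repeats."""
    seen, order = set(), []

    def visit(f):
        if f in seen:
            return
        for child in f[1:]:
            if isinstance(child, tuple):   # skip atom names and agent indices
                visit(child)
        seen.add(f)
        order.append(f)

    visit(phi)
    return order

def label(worlds, relations, valuation, phi):
    """Map each sub-formula of phi to the set of worlds at which it holds."""
    worlds = set(worlds)
    sat = {}
    for f in subformulas(phi):
        op = f[0]
        if op == "atom":
            sat[f] = {w for w in worlds if f[1] in valuation[w]}
        elif op == "not":
            sat[f] = worlds - sat[f[1]]
        elif op == "and":
            sat[f] = sat[f[1]] & sat[f[2]]
        elif op == "K":                     # ("K", agent, subformula)
            successors = relations[f[1]]
            sat[f] = {w for w in worlds
                      if all(v in sat[f[2]] for v in successors.get(w, ()))}
        else:
            raise ValueError(f"unknown operator: {op}")
    return sat

# Hypothetical model: agent 1 cannot tell s from t, agent 2 can.
WORLDS = {"s", "t"}
RELATIONS = {1: {"s": {"s", "t"}, "t": {"t"}},
             2: {"s": {"s"}, "t": {"t"}}}
VALUATION = {"s": {"p"}, "t": set()}

P = ("atom", "p")
PHI = ("and", ("K", 2, P), ("not", ("K", 1, P)))   # K_2 p and not K_1 p
SAT = label(WORLDS, RELATIONS, VALUATION, PHI)
```

Each sub-formula is labelled once, and each K_i step inspects the R_i-successors of every world, which is where the product $|M| \cdot |\varphi|$ in the bound comes from. The sketch makes no attempt to match that bound exactly.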
Theorem 7.23 (Finite model existence theorem) If a formula $\varphi$ is EKn-consistent, then it is satisfiable in a model $M_n$ with the number of worlds not greater than $2^{|\varphi|}$, on the condition that all sentential variables in $M_n$ not occurring in $\varphi$ are assigned permanent falsity in order to make them immaterial.

Proof We augment the set $SF(\varphi)$ of sub-formulae of $\varphi$ with negations of elements of $SF(\varphi)$ to obtain the set $SF2(\varphi)$. We repeat the process of constructing canonical models from consistent subsets of $SF2(\varphi)$. For each consistent set $\Gamma$, we can apply the Teichmüller-Tukey lemma (see Sect. 1.1.1) to extend $\Gamma$ to a maximal consistent set $\Gamma^+$ (as consistency is a property of finite character). The cardinality of the set of maximal consistent sets is not greater than the cardinality of the set of all subsets of $SF(\varphi)$ (as for each $\psi \in SF(\varphi)$ each maximal consistent set contains exactly one element of the pair $(\psi, \neg\psi)$). Exactly like in case of canonical models for Kn, one proves that $\Gamma^+ \models \varphi$ if and only if $\varphi \in \Gamma^+$.

Corollary 7.1 Satisfiability, validity and provability problems for EKn are decidable.

Some specializations allow us to extend this result to ETn, ES4n, ES5n, EKD45n; viz., preserving the basic construction, we have to introduce specialized accessibility relations:
In case of ETn, $R_i(\Gamma, \Gamma^*)$ if and only if $\Gamma/K_i \subseteq \Gamma^*$.
In case of ES4n, $R_i(\Gamma, \Gamma^*)$ if and only if $\Gamma/K_i \subseteq \Gamma^*/K_i$.
In case of ES5n, $R_i(\Gamma, \Gamma^*)$ if and only if $\Gamma/K_i = \Gamma^*/K_i$.
In case of EKD45n, $R_i(\Gamma, \Gamma^*)$ if and only if $\Gamma/K_i = \Gamma^*/K_i$ and $\Gamma/K_i \subseteq \Gamma^*$.
Then, one checks that the respective models are, as required by the logic in question, reflexive, transitive, Euclidean, or serial. Hence, problems of satisfiability, validity and provability are decidable for logics ETn, ES4n, ES5n, EKD45n.
7.3 Mereology Based Logic for Granular Computing and Fuzzy Reasoning

We now begin a discussion of many-valued logic in terms of partial containment among concepts which are sets of things, in particular granules of knowledge. The theory of concepts which we apply is mereology due to Leśniewski [11], see also [12]. Mereology is a theory of concepts/sets whose primitive notion is that of a part. This choice of the primitive notion follows in the footsteps of Aristotle with his Syllogistics as well as of medieval scholars like Thomas Aquinas, Duns Scotus and others. Mereology of Leśniewski is defined for individual things, and the notion of an individual thing is defined in the Leśniewski Ontology, Leśniewski [13], by means of the Axiom of Ontology (AO). The form of (AO) uses the copula is, which we meet in Syllogistics, written often also as 'esti' ε, meaning also is,
(AO) (aεb) ≡ ∃c.(cεa) ∧ ∀d, e.[(dεa) ∧ (eεa) ⊃ (dεe)] ∧ ∀f.[(fεa) ⊃ (fεb)].

From (AO), one can ascertain the characteristics of an individual thing: it is non-vacuous (∃c.(cεa)) and one-element ((dεa) ∧ (eεa) ⊃ (dεe), hence d = e). One can introduce the relation = of identity by letting d = e if and only if (dεe) ∧ (eεd); in particular, a = a can be equivalently defined as aεa for each a. We define on a non-empty domain Ω of individual things the primitive notion of mereology, the relation of a part π, as the single element of the relational vocabulary of mereology. In addition, we introduce a countable set of individual variables denoted in practice as x, y, z, . . ..

Mereology

We give here the basic introduction to mereology: the theory of concepts based on the notion of a part. We adhere to the classical scheme of mereology due to Leśniewski with some slight modifications. This theory concerns a binary predicate π read as 'being a part of'. Inference rules are detachment and substitution.

Definition 7.25 (Part predicate) The part predicate π satisfies the conditions: (P1) ¬π(x, x); (P2) π(x, y) ∧ π(y, z) ⊃ π(x, z); (P3) for each pair x, y of individual things, π(x, y) ⊃ ¬π(y, x).

(P3) follows by (P1), (P2). The part predicate expresses the notion of a proper part. We postulate the notion of identity of things in terms of the part relation.

Definition 7.26 (Identity) For things x, y, x = y if and only if ∀z.(π(z, x) ≡ π(z, y)).

Then by (P1)-(P3), the predicate = satisfies the usual properties of reflexivity, symmetry and transitivity, which follows by (P2). In addition to the predicate of part, we introduce two secondary predicates: those of ingredient and overlap. Both are essential in the construction of the theory of mereology.

Definition 7.27 (Ingredients, Overlap) The predicate of ingredient, denoted I(x, y), and the predicate of overlap, denoted Ov(x, y), are defined as follows: (i) I(x, y) ≡ π(x, y) ∨ (x = y); (ii) Ov(x, y) ≡ ∃z.(I(z, x) ∧ I(z, y)).

We introduce one more postulate.
Definition 7.28 (Postulate (P4)) ∀x, y.[∀z.(I (z, x) ⊃ ∃w.I (w, y) ∧ Ov(z, w))] ⊃ I (x, y). Theorem 7.24 The following properties hold. (i) (x = y) ≡ I (x, y) ∧ I (y, x); (ii) I (x, x); (iii) I (x, y) ∧ I (y, z) ⊃ I (x, z);
(iv) (x = y) ≡ ∀z.(Ov(z, x) ≡ Ov(z, y)); (v) Ov(x, x); (vi) Ov(x, y) ≡ Ov(y, x).

Proof For (i). By Definition 7.27(i), I(x, y) ∧ I(y, x) ≡ [π(x, y) ∨ (x = y)] ∧ [π(y, x) ∨ (y = x)], which in turn is equivalent to (π(x, y) ∧ π(y, x)) ∨ (π(x, y) ∧ (y = x)) ∨ ((x = y) ∧ π(y, x)) ∨ (x = y). The first three implicants are false by (P1) and (P2), hence I(x, y) ∧ I(y, x) ≡ (x = y). For (ii). By Definition 7.27(i). For (iii). By transitivity of π (Definition 7.25 (P2)). For (iv). Suppose that Ov(z, x) ≡ Ov(z, y) for each thing z. Assume Ov(z, x). There exists a thing t with I(t, z) and I(t, x). But also Ov(t, x), hence Ov(t, y) and, by (P4), I(x, y). By symmetry, I(y, x) follows, and by (i), x = y. For (v), (vi): evident by Definition 7.27(ii).

Definition 7.29 (The notion of a class) For a non-empty collection F of things in Ω, the class Cls(F) obeys the conditions: (Cl1) ∀x.(x ∈ F ⊃ I(x, Cls(F))); (Cl2) ∀x.[I(x, Cls(F)) ⊃ ∀y.(I(y, x) ⊃ ∃w ∈ F.Ov(y, w))].

Definition 7.30 (The class existence axiom (P5)) For each non-empty collection F of things in any mereological space (Ω, π), there exists a class Cls(F).

Theorem 7.25 For each non-empty collection F of things in any mereological space (Ω, π), there exists the unique class Cls(F).

Proof Assume that for some collection F there exist two classes Cl_1 and Cl_2. Consider a thing t with I(t, Cl_1). By condition (Cl2), there exists a thing z such that Ov(t, z) and I(z, Cl_2). By postulate (P4), I(Cl_1, Cl_2) and, by symmetry, I(Cl_2, Cl_1), hence, by Theorem 7.24(i), Cl_1 = Cl_2.

Theorem 7.26 For each thing x, x = Cls({y : I(y, x)}).

Proof Suppose that I(z, x), hence I(z, Cls({u : I(u, x)})), hence I(x, Cls({u : I(u, x)})). Conversely, if I(z, Cls({u : I(u, x)})), then there exist w, t with I(w, z), I(w, t), I(t, x), hence I(w, x) and I(Cls({u : I(u, x)}), x), hence x = Cls({u : I(u, x)}).
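A concrete finite model helps in reading postulates (P1)-(P5). The sketch below is an illustration under stated assumptions, not a construction from the text: things are the non-empty subsets of a three-element set, π is proper inclusion, Python equality plays the role of identity, and Cls(F) is taken to be the union of F.

```python
from itertools import combinations

ATOMS = (0, 1, 2)
# Things: all non-empty subsets of ATOMS (there is no empty thing).
OMEGA = [frozenset(c) for r in range(1, len(ATOMS) + 1)
         for c in combinations(ATOMS, r)]

def part(x, y):
    """pi(x, y): x is a proper part of y (proper inclusion in this model)."""
    return x < y

def ingredient(x, y):
    """I(x, y): part or identical, Definition 7.27(i)."""
    return part(x, y) or x == y

def overlap(x, y):
    """Ov(x, y): some thing is an ingredient of both, Definition 7.27(ii)."""
    return any(ingredient(z, x) and ingredient(z, y) for z in OMEGA)

def cls(things):
    """Cls(F) for a non-empty collection F: the union, in this model."""
    return frozenset().union(*things)

def p1_p2_hold():
    irreflexive = all(not part(x, x) for x in OMEGA)
    transitive = all(part(x, z) for x in OMEGA for y in OMEGA for z in OMEGA
                     if part(x, y) and part(y, z))
    return irreflexive and transitive

def p4_holds():
    for x in OMEGA:
        for y in OMEGA:
            premise = all(any(ingredient(w, y) and overlap(z, w) for w in OMEGA)
                          for z in OMEGA if ingredient(z, x))
            if premise and not ingredient(x, y):
                return False
    return True

def class_conditions_hold(things):
    """Conditions (Cl1) and (Cl2) of Definition 7.29 for the collection."""
    c = cls(things)
    cl1 = all(ingredient(x, c) for x in things)
    cl2 = all(any(overlap(y, w) for w in things)
              for x in OMEGA if ingredient(x, c)
              for y in OMEGA if ingredient(y, x))
    return cl1 and cl2

# All non-empty collections of things (127 of them in this model).
COLLECTIONS = [list(c) for r in range(1, len(OMEGA) + 1)
               for c in combinations(OMEGA, r)]
```

In this model the checks `p1_p2_hold()`, `p4_holds()` and `class_conditions_hold(F)` for every collection F succeed, and Theorem 7.26 can be confirmed by comparing each thing x with `cls` of its ingredients. A single model illustrates the postulates; it does not establish them in general.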
Definition 7.31 (The universal class) Let Ω be the collection of all things considered. We define the universal class V by letting V = Cls({u : u ∈ Ω}). Then x ∈ Ω ≡ I (x, V ). Equivalently, V = Cls{x : x = x ∧ x ∈ Ω}.
Theorem 7.27 The universal class V has the following properties: (i) I(x, V) holds for each thing x ∈ Ω; (ii) for each non-vacuous property F, the relation instance I(Cls F, V) holds.

Definition 7.32 (The notion of an element) A thing x is an element of a thing y, which is denoted el(x, y), if and only if there exists a collection F such that y = Cls F and x ∈ F.

Theorem 7.28 For things x, y, the equivalence holds: el(x, y) ≡ I(x, y).

Proof If I(x, y), then x ∈ F = {z : I(z, y)} and Cls F = y, hence el(x, y) holds. If el(x, y), then there exists F such that y = Cls F and x ∈ F, hence I(x, y).

It follows that each thing is its own element, so there are no empty things in mereological universes. The notion of a subset in mereology is defined as follows.

Definition 7.33 (The notion of a subset) A thing x is a subset of a thing y, denoted sub(x, y), if and only if the condition holds: ∀z.(I(z, x) ⊃ I(z, y)).

Theorem 7.29 For each pair x, y of things, the equivalence holds: sub(x, y) ≡ I(x, y).

Proof Suppose that sub(x, y) holds; then for z = x, we obtain I(x, x) ⊃ I(x, y) and, as I(x, x) holds by Theorem 7.24(ii), I(x, y) holds also. Conversely, if I(x, y) holds, then for each thing z, from I(z, x), I(z, y) follows by Theorem 7.24(iii), hence sub(x, y) holds.

It follows that the notions of an ingredient, an element, and a subset are equivalent. As for classes, we have the proposition which holds by the class definition and (P4).

Theorem 7.30 (i) If ∅ ≠ F ⊆ G, then I(Cls F, Cls G); (ii) if F = G, then Cls F = Cls G. In the language of properties, if F ⊃ G, then I(Cls F, Cls G), and if F ≡ G, then Cls F = Cls G.

We now offer a glimpse into topology of mereological spaces by presenting notions of an exterior thing, relative complement and complement.

Definition 7.34 (The notion of an exterior thing) A thing x is exterior to a thing y, in symbols extr(x, y), if and only if ¬Ov(x, y).
The predicate extr has the property (E): if, for a non-vacuous collection F, a thing x is exterior to each thing z such that z ∈ F, then extr(x, Cls F).

Definition 7.35 (The notion of a relative complement) For things z, y such that I(y, z), a thing x is the relative complement to y with respect to z if and only if x = Cls{t : I(t, z) ∧ extr(t, y)}. The complement is denoted by the symbol comp(y, z). This notion is defined if π(y, z) holds.
Definition 7.36 (The complement) For a thing x, the complement −x to x is comp(x, V).

We now follow Tarski [14] in the presentation of the complete Boolean algebra without the null element defined in the mereological universe Ω. First, we need additional notions.

Definition 7.37 (The Tarski Boolean mereological algebra B_TM(π)) For things x, y, we let
(+) x + y = Cls({u : I(u, x) ∨ I(u, y)});
(·) x · y = Cls({u : I(u, x) ∧ I(u, y)});
(−) −x = comp(x, V);
(1) 1 = V;
(0) 0 is not defined.
Theorem 7.31 The universe Ω with the unit V and operations +, ·, − is a complete Boolean algebra without the null element, as mereology does not admit the empty thing.

Proof It follows by straightforward checking of requirements for Boolean algebra (see Chap. 1). Completeness follows by the class existence.

Definition 7.38 (The mereological implication) For things x, y, we let x → y ≡ −x + y. The implication x → y is valid if and only if −x + y = V.

Theorem 7.32 The following are equivalent: (i) I(x, y); (ii) x · y = x; (iii) x → y is valid.

Proof (i) ≡ (ii): if I(x, y), then I(x, x) and I(x, y), hence I(x, x · y). Suppose now that I(z, x · y), hence there exist t, w such that I(t, z), I(t, w) and Ov(w, x · y), hence Ov(w, x) and, by (P4), I(x · y, x), hence x · y = x. (i) ≡ (iii): Assume I(x, y), hence x · y = x, thus x → y = −x + y = −(x · y) + y = (−x) + (−y) + y = (−x) + V = V. Conversely, if −x + y = V, then (−x + y) · x = x · y = V · x = x.

Corollary 7.2 The following are equivalent: I(x, y), x · y = x, x → y = V.

Consistency of Mereology was proved by Leśniewski (cf. Lejewski [15]).
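The three equivalent conditions of Theorem 7.32 can be compared on a small example. The following sketch assumes a set model: things are the non-empty subsets of a three-element set, + is union, · is intersection, −x is the set complement, and the empty frozenset `THETA` is a stand-in for results that are not things at all. These choices are illustrative and are not taken from the text.

```python
from itertools import combinations

ATOMS = frozenset({0, 1, 2})
V = ATOMS                      # the universal class
THETA = frozenset()            # stand-in for "no thing"; not a member of OMEGA
OMEGA = [frozenset(c) for r in range(1, len(ATOMS) + 1)
         for c in combinations(sorted(ATOMS), r)]

def plus(x, y):
    """x + y: class of the ingredients of x or of y."""
    return x | y

def times(x, y):
    """x · y: class of the common ingredients (THETA when there are none)."""
    return x & y

def neg(x):
    """-x = comp(x, V); yields THETA for x = V, where the text leaves it undefined."""
    return V - x

def implies(x, y):
    """The mereological implication x -> y = -x + y."""
    return plus(neg(x), y)

def ingredient(x, y):
    return x <= y

def theorem_7_32_holds():
    """I(x, y), x · y = x, and validity of x -> y agree on every pair."""
    for x in OMEGA:
        for y in OMEGA:
            a = ingredient(x, y)
            b = times(x, y) == x
            c = implies(x, y) == V
            if not (a == b == c):
                return False
    return True
```

For instance, `implies(frozenset({0}), frozenset({0, 1}))` evaluates to `V`, in agreement with {0} being an ingredient of {0, 1} in this model.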
7.4 Rough Mereology. Rough Inclusions

Rough mereology (Polkowski [16], Polkowski and Skowron [17]) adds to mereology a ternary predicate μ(x, y, r) defined for triples (x, y, r), where x, y ∈ Ω, r ∈ (0, 1], read as 'x is a part of y to a degree of at least r'. The predicate μ, called a rough inclusion, is defined by means of the conditions: (RM1) μ(x, y, 1) ≡ I(x, y); (RM2) μ(x, y, 1) ⊃ ∀z, r.(μ(z, x, r) ⊃ μ(z, y, r)); (RM3) μ(x, y, r) ∧ (s < r) ⊃ μ(x, y, s).

Łukasiewicz [18] (see also Borkowski [19]) gave a logical rendering of classical probability calculus by assigning to a sentential unary formula on a finite domain a weight defined as the fraction of the number of elements satisfying the formula to the number of elements in the universe. We follow up on the idea by considering the notion of a mass assignment on a universe of things and constructing in this environment a rough inclusion. The following theory is an abstract rendering of many-valued logic and its applications in fuzzy computing. For a more detailed look at rough mereology and its applications, consult (Polkowski, L.: Approximate Reasoning by Parts. An Introduction to Rough Mereology, Springer Verlag (2011)) and (Polkowski, L.: Mereology in engineering and computer science. In: Calosi, C., Graziani, P. (eds.): Mereology and the Sciences: Parts and Wholes in Contemporary Scientific Context. Synthese Library 371. Springer Intl. Publishing, 217-293 (2014)).

We define the basic notion of a mass assignment, Polkowski [20].

Definition 7.39 (Mass assignment) Given a mereological space (Ω, π) over the relational vocabulary {π}, we define a mass assignment m which satisfies the conditions: (M1) ∀x ∈ Ω.m(x) ∈ (0, 1]. We introduce a constant symbol Θ denoting the empty thing, not in Ω, in order to be able to assign the mass 0: (M2) m(Θ) = 0.

Definition 7.40 (Axiom schemes for the mass assignment) The following are axiom schemes, Polkowski [20].
(M3) (x = V) ≡ (m(x) = 1);
(M4) (x = Θ) ≡ (m(x) = 0);
(M5) (x → y) ⊃ [m(y) = m(x) + m((−x) · y)].

Theorem 7.33 The following are provable consequences of schemes (M1)-(M5).
(T1) I(x, y) ≡ x · y = x;
(T2) I(x, y) ≡ x → y;
(T3) (x · y = x) ≡ (x → y);
(T4) (x = y) ⊃ (m(x) = m(y));
(T5) m(x + y) = m(x) + m((−x) · y);
(T6) x · y = Θ ⊃ m(x + y) = m(x) + m(y);
(T7) m(x) + m(−x) = 1;
(T8) m(y) = m(x · y) + m((−x) · y);
(T9) m(x + y) = m(x) + m(y) − m(x · y);
(T10) I(x, y) ⊃ m(x) ≤ m(y);
(T11) (m(x + y) = m(x) + m(y)) ⊃ x · y = Θ;
(T12) I(x, y) ≡ m(x → y) = 1;
(T13) I(y, x) ⊃ y · (−x) = Θ;
(T14) I(y, x) ⊃ m(x → y) = 1 − m(x) + m(y);
(T15) m(x → y) = 1 − m(x − y).
Proof
For (T1), (T2), (T3): already proved in Theorem 7.32;
For (T4): x = y implies x → y and y → x, hence, by (M5), m(x) = m(y) + m((−y) · x), and thus m(x) ≥ m(y). By symmetry, m(y) ≥ m(x) and finally m(x) = m(y);
For (T5): substituting x + y for y in (M5) yields x → x + y true, hence m(x + y) = m(x) + m((−x) · (x + y)), i.e., m(x + y) = m(x) + m((−x) · y);
For (T6): since x · y = Θ, (−x) · y = y, hence, by (T5), m(x + y) = m(x) + m(y);
For (T7): by (T6), as x · (−x) = Θ, m(x) + m(−x) = m(V) = 1;
For (T8): as x + (−x) = V and x · (−x) = Θ, y = x · y + (−x) · y, hence m(y) = m(x · y) + m((−x) · y);
For (T9): by (T5), m(x + y) = m(x) + m((−x) · y) and, by (T8), m(y) = m(x · y) + m((−x) · y), hence (T9);
For (T10): I(x, y) ≡ x → y by Theorem 7.32, hence, by (M5), m(y) = m(x) + m((−x) · y), and by (M1), m(x) ≤ m(y);
For (T11): if m(x + y) = m(x) + m(y), then, by (T9), m(x · y) = 0, and by (M4), x · y = Θ;
For (T12): by Theorem 7.32, I(x, y) ≡ x → y, which is equivalent to x → y = V and thus, by (M3), to m(x → y) = 1;
For (T13): I(y, x) ≡ x · y = y, hence y · (−x) = (y · x) · (−x) = y · (x · (−x)) = y · Θ = Θ;
For (T14): by (T13), y · (−x) = Θ, hence m(x → y) = m(−x + y) and, by (T6), m(−x + y) = m(−x) + m(y) and, by (T7), m(−x + y) = 1 − m(x) + m(y);
For (T15): m(x → y) = m(−x + y) = m(V − (x − y)). As V = (x − y) + (V − (x − y)), we obtain 1 = m(x − y) + m(V − (x − y)), and then m(V − (x − y)) = 1 − m(x − y).

Let us observe that (T14) yields an abstract form of the Łukasiewicz implication in case m(x) ≥ m(y), complemented by (T12) in case m(x) ≤ m(y). (T15) is a new formula valid for non-linear structures. (T7) defines the Łukasiewicz negation.

We now introduce the notion of a rough inclusion: a similarity measure for mass concepts based on approximate containment. It is granular computing on an abstract level.
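Several of the consequences (T1)-(T15) can be checked numerically. The sketch below assumes the simplest mass assignment in the spirit of the Łukasiewicz weights recalled in this section: things are the non-empty subsets of a four-element set and m(x) is the fraction of elements in x. The model, the function names, and the reading of x − y as x · (−y) are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations

ATOMS = frozenset(range(4))
V = ATOMS
THETA = frozenset()            # the empty thing, carrying mass 0 (M2)
OMEGA = [frozenset(c) for r in range(1, len(ATOMS) + 1)
         for c in combinations(sorted(ATOMS), r)]

def mass(x):
    """m(x): fraction of the universe covered by x (exact arithmetic)."""
    return Fraction(len(x), len(V))

def neg(x):
    return V - x

def implies(x, y):
    """x -> y = -x + y."""
    return neg(x) | y

def check_mass_laws():
    """Check (M1), (M3), (M5), (T7), (T9), (T14), (T15) on every pair of things."""
    for x in OMEGA:
        if not (0 < mass(x) <= 1):                                    # (M1)
            return False
        if (x == V) != (mass(x) == 1):                                # (M3)
            return False
        if mass(x) + mass(neg(x)) != 1:                               # (T7)
            return False
        for y in OMEGA:
            if mass(x | y) != mass(x) + mass(y) - mass(x & y):        # (T9)
                return False
            if mass(implies(x, y)) != 1 - mass(x - y):                # (T15)
                return False
            if x <= y and mass(y) != mass(x) + mass(neg(x) & y):      # (M5)
                return False
            if y <= x and mass(implies(x, y)) != 1 - mass(x) + mass(y):  # (T14)
                return False
    return True
```

As an example of (T14), for x = {0, 1, 2} and y = {0} the value of m(x → y) is 1 − 3/4 + 1/4 = 1/2, which is the value of the Łukasiewicz implication on the truth degrees 3/4 and 1/4.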
Definition 7.41 (Independence of things) We say that things $x, y$ are independent, $Ind(x, y)$ in symbols, if and only if $m(x \cdot y) = m(x) \times m(y)$.

Mass based rough inclusions on concepts

We define a rough inclusion (Polkowski [21]).

(RI1) $\mu^m(x, y, r) \equiv \frac{m(x \cdot y)}{m(x)} \ge r$.

The maximal rough inclusion $\mu^m_1$ is defined as follows:

(RI2) $\mu^m_1(x, y)$ is $\mathrm{argmax}_r\, \mu^m(x, y, r) = \frac{m(x \cdot y)}{m(x)}$.

Theorem 7.34 The following are provable consequences of (M1)-(M5) and (RI1)-(RI2).
(T16) $Ind(x, y) \equiv Ind(-x, y)$;
(T17) $I(x, y) \supset \mu^m_1(x, y) = 1$;
(T18) $\mu^m_1(x, y) = 1 \supset I(x, y)$;
(T19) $I(x, y) \equiv \mu^m_1(x, y) = 1 \equiv x \to y = V$;
(T20) $\mu^m(x, y, 1) \wedge \mu^m(z, x, r) \supset \mu^m(z, y, r)$;
(T21) $\mu^m_1(x, -y) = 1 - \mu^m_1(x, y)$;
(T22) $[m(x) + m((-x) \cdot y) = m(y)] \supset I(x, y)$;
(T23) $\mu^m_1(x, y) = \frac{m(y)\,\mu^m_1(y, x)}{m(x)}$ (a simple Bayes' formula);
(T24) $\frac{\mu^m_1(x, y)}{\mu^m_1(y, x)} = \frac{m(y)}{m(x)}$;
(T25) $I(x, y) \supset \mu^m_1(y, x) = \frac{m(x)}{m(y)}$;
(T26) $\frac{\mu^m_1(x, y)}{\mu^m_1(y, x)} \cdot \frac{\mu^m_1(y, z)}{\mu^m_1(z, y)} = \frac{\mu^m_1(x, z)}{\mu^m_1(z, x)}$;
(T27) The Bayes theorem: $(+_{i \ne j}\, y_i \cdot y_j = \Theta) \wedge (+_i\, y_i = V) \supset \mu^m_1(z, x) = \frac{m(x)\,\mu^m_1(x, z)}{\sum_i m(y_i)\,\mu^m_1(y_i, z)}$.
Proof
For (T16): by (T8), $m(y) = m(x \cdot y) + m((-x) \cdot y) = m(x) \times m(y) + m((-x) \cdot y)$, hence $m((-x) \cdot y) = m(y) - m(y) \times m(x) = m(y) \times (1 - m(x)) = m(y) \times m(-x)$, i.e., $Ind(x, y) \equiv Ind(-x, y)$;
For (T17): by (T1), $I(x, y)$ implies $x \cdot y = x$;
For (T18): by (T8), $m(x) = m(x \cdot y) + m(x \cdot (-y))$, hence, by the assumption $\mu^m_1(x, y) = 1$, $m(x \cdot (-y)) = 0$, and thus $x \cdot (-y) = \Theta$, i.e., $I(x, y)$;
For (T19): by (T12) and (T18);
For (T20): by (T19), $\mu^m(x, y, 1) \equiv I(x, y)$, hence, by (T1), $x \cdot y = x$, hence $(x \cdot z) \cdot (y \cdot z) = (x \cdot y) \cdot z = x \cdot z$, hence $I(x \cdot z, y \cdot z)$, which implies by (T10) that $m(x \cdot z) \le m(y \cdot z)$, so $\mu^m(z, x, r)$ implies $\mu^m(z, y, r)$;
For (T21): by (T8), $m(x) = m(x \cdot y) + m(x \cdot (-y))$, hence $m(x \cdot (-y)) = m(x) - m(x \cdot y)$, thus $\mu^m_1(x, -y) = 1 - \mu^m_1(x, y)$;
For (T22): by (T8), $m(y) = m(x \cdot y) + m((-x) \cdot y)$. From the premise in (T22) we infer that $m(x) = m(x \cdot y)$, hence $\mu^m_1(x, y) = 1$ and, by (T18), $I(x, y)$.

We now give an abstract version of the form of the Bayes theorem established by Łukasiewicz. We begin with a simple formulation (T23).

For (T23): $\mu^m_1(x, y) = \frac{m(x \cdot y)}{m(x)}$, hence $m(y)\,\mu^m_1(y, x) = m(x \cdot y) = m(x)\,\mu^m_1(x, y)$, hence (T23) follows;
For (T24): a paraphrase of (T23);
For (T25): $I(x, y)$ implies $\mu^m_1(x, y) = 1$ and (T24) yields the result;
For (T26): a straightforward computation.

We now address the Bayes theorem (T27). First some explanations: for a finite set $Y$ of things, $+Y$ means the sum of things in $Y$.

Lemma 7.5 Let $Y = \{y_i : i \le k\}$ be a maximal set of pairwise disjoint things in $U$, i.e., $y_i \cdot y_j = \Theta$ for $i \ne j$. Then $+Y = V$.

Claim 1. For each thing $z \in U$, $z = Cls(\{z \cdot y_i : y_i \in Y\})$.

Proof of Claim 1. Consider $I(x, z)$. There exist $y, w$ such that $I(y, x)$, $I(y, w)$, $w = z \cdot y_i$ for some $y_i$, hence $Ov(y, z \cdot y_i)$, thus $I(z, Cls(\{z \cdot y_i : y_i \in Y\}))$. The proof of the converse goes along similar lines.

Claim 2. $Cls(\{z \cdot y_i : y_i \in Y\}) = +_{y_i \in Y}\, z \cdot y_i$.

From Claims 1, 2 we obtain
(T28) $z = +_{y_i \in Y}\, z \cdot y_i$.
Hence,
(T29) $m(z) = \sum_{y_i \in Y} m(z \cdot y_i) = \sum_{y_i \in Y} m(y_i)\,\mu^m_1(y_i, z)$.
Since $\mu^m_1(z, x) = \frac{m(x \cdot z)}{m(z)} = \frac{m(x)\,\mu^m_1(x, z)}{m(z)}$ by (RI2), substituting (T29) for $m(z)$ yields (T27).
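The mass-based rough inclusion and the Bayes theorem (T27) can be illustrated numerically. The following sketch assumes the counting mass on the non-empty subsets of a six-element set and a hypothetical three-block partition of V; none of these specifics comes from the text.

```python
from fractions import Fraction
from itertools import combinations

ATOMS = frozenset(range(6))
V = ATOMS
OMEGA = [frozenset(c) for r in range(1, len(ATOMS) + 1)
         for c in combinations(sorted(ATOMS), r)]

def mass(x):
    """Counting mass m(x) = |x| / |V| (an assumed mass assignment)."""
    return Fraction(len(x), len(V))

def mu1(x, y):
    """Maximal rough inclusion (RI2): m(x · y) / m(x), for a non-empty thing x."""
    return mass(x & y) / mass(x)

def mu(x, y, r):
    """Rough inclusion (RI1): x is a part of y to a degree of at least r."""
    return mu1(x, y) >= r

# Hypothetical partition of V into pairwise disjoint things y_i.
PARTITION = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]

def bayes_rhs(z, x, partition):
    """Right-hand side of (T27): m(x) mu1(x, z) / sum_i m(y_i) mu1(y_i, z)."""
    total = sum(mass(y) * mu1(y, z) for y in partition)   # equals m(z) by (T29)
    return mass(x) * mu1(x, z) / total

def bayes_holds(partition):
    return all(mu1(z, x) == bayes_rhs(z, x, partition)
               for z in OMEGA for x in OMEGA)

def simple_bayes_holds():
    """(T23): mu1(x, y) = m(y) mu1(y, x) / m(x)."""
    return all(mu1(x, y) == mass(y) * mu1(y, x) / mass(x)
               for x in OMEGA for y in OMEGA)
```

With z = {1, 2, 5} and x = {0, 1}, both sides of (T27) evaluate to 1/3: the numerator is (2/6)(1/2) = 1/6 and the denominator sums to m(z) = 1/2, as (T29) predicts.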
The Stone theorem for mereological spaces

We now refer to the Stone representation of complete Boolean algebras, which we apply to our case of mereological spaces, Polkowski [21]. The following section requires from the reader some knowledge of notions of topology, which may be found in Chap. 1. Our discussion is carried out in the language of mereology. We do not introduce any topology in mereological universes; topology enters our discussion via the Stone representation. We recall now the Stone representation theorem, which represents a complete Boolean algebra B as a topological space of ultrafilters on B. In Chap. 1, we introduced the Stone space and here we show that it can be defined in the mereological setting.

Definition 7.42 (Filters) By a proper filter on the mereological space (Ω, π), we understand a collection F of things such that (i) if x, y ∈ F, then x · y ∈ F; (ii) if x ∈ F and I(x, y), then y ∈ F; (iii) Θ ∉ F.

Definition 7.43 (Ultrafilters) An ultrafilter is a filter which is not contained properly in any other filter. By the Zorn maximal principle, each filter extends to an ultrafilter, i.e., a maximal filter with respect to containment.

Lemma 7.6 I(x, y) ≡ x + y = y.
Proof I(x, y) is equivalent to I(−y, −x), hence to (−y) · (−x) = (−y), hence to −(x + y) = (−y), i.e., x + y = y.

The following properties of any ultrafilter F are of importance to us.

Theorem 7.35 Any ultrafilter F satisfies the following properties: (i) for each thing x ∈ Ω, either x ∈ F or −x ∈ F; (ii) F is prime, i.e., if x + y ∈ F, then either x ∈ F or y ∈ F.

For completeness' sake, we offer a proof.

Proof For (i), assume that for some thing x ∈ Ω, x ∉ F and −x ∉ F. For y ∈ F, were x · y = Θ, then I(y, −x) and −x ∈ F, a contradiction. It follows that x · y ≠ Θ for each y ∈ F, hence the collection F ∪ {x} extends to a filter containing F properly, a contradiction. For (ii), it follows by (i): were x ∉ F and y ∉ F, we would have by (i) that −x ∈ F and −y ∈ F, hence (−x) · (−y) = −(x + y) ∈ F, a contradiction.

We introduce some useful notions.

Definition 7.44 (Stone sets) (i) The Stone space St(Ω) is the collection of all ultrafilters on Ω; (ii) for x ∈ Ω, S(x) = {F ∈ St(Ω) : x ∈ F}; (iii) S(Ω) = {S(x) : x ∈ Ω}.

We now state the Stone theorem and we recall the (Gleason [22]) theorem which states that the space (St(Ω), S(Ω)) is extremely disconnected (see Chap. 1 for definition). We recall proofs of those results, rendering them in the language of mereology.

Theorem 7.36 (The Stone topology) The collection S(Ω) is an open-and-closed base which induces on the set St(Ω) a Hausdorff compact zero-dimensional topology.

Proof S(x) ∩ S(y) = S(x · y), hence S(Ω) has the properties of a base. Each set S(x) is clopen: S(x) = St(Ω) \ S(−x). Hence, St(Ω) is zero-dimensional. St(Ω) is compact: let B, a collection of sets of the form S(x) for x ∈ Δ ⊆ Ω, be centered, i.e., for each finite sub-collection X = {x_1, x_2, . . . , x_k} of Δ, there exists an ultrafilter F with X ⊆ F. Let us consider the set Γ = Δ ∪ {z ∈ Ω : there exists x ∈ Δ with Π(x, z)}. Then Γ extends to an ultrafilter Ψ and Ψ ∈ ⋂B, i.e., St(Ω) is compact.
St(Ω) is Hausdorff: let F ≠ G for ultrafilters F, G. Assume, for definiteness, that x ∈ F \ G for some thing x. Hence, −x ∈ G and F ∈ S(x), G ∈ S(−x), and S(x) ∩ S(−x) = ∅.

Theorem 7.37 The Stone space (St(Ω), S(Ω)) is extremely disconnected.

We recall that Cl denotes the closure operator.
Proof Consider an open set G = ⋃{S(x) : x ∈ A ⊆ Ω}. For the class Cls A, consider the clopen set S(Cls A). As Π(x, Cls A) for each x ∈ A, it follows that G ⊆ S(Cls A), hence (i) Cl G ⊆ S(Cls A). We claim that S(Cls A) ⊆ Cl G. Let us assume that, to the contrary, S(Cls A) \ Cl G ≠ ∅. Let an ultrafilter F belong to S(Cls A) \ Cl G; hence, (ii) Cls A ∈ F. There exists a neighborhood S(z) of F, i.e., (iii) z ∈ F, such that (iv) S(z) ∩ S(x) = ∅ for each x ∈ A. It follows that (v) z · x = Θ for each x ∈ A, hence (vi) Π(x, −z) for each x ∈ A. By (vi), Π(Cls A, −z), which implies that (vii) −z ∈ F, contradicting (iii). Thus, S(Cls A) ⊆ Cl G and finally we obtain that Cl G = S(Cls A), i.e., Cl G is open.

Consequences of mereological Stone representation for mereological spaces

By compactness of St(Ω), there exist things x_1, x_2, . . . , x_k ∈ Ω for a natural number k with the property that St(Ω) = ⋃{S(x_i) : i ≤ k}. Let K = {x_1, x_2, . . . , x_k}. For each thing x ∈ Ω, there exists a set I(x) ⊆ {1, 2, . . . , k} with the property that S(x) ∩ S(x_i) ≠ ∅ if and only if i ∈ I(x).

Theorem 7.38 Each x = +_{i∈I(x)} x · x_i, where + and · are operations in the Tarski algebra of mereology.

Proof Consider an arbitrary thing y with Π(y, x). Let F(y) be an ultrafilter containing y; hence, x ∈ F(y). Let x_i ∈ K be such that F(y) ∈ S(x_i). Then, y · x_i ≠ Θ and i ∈ I(x). As Π(y · x_i, x · x_i), it follows Π(x, +_i x · x_i) by M3. Contrariwise, assume for an arbitrary thing z that Π(z, +_i x · x_i), hence Π(z, x · x_i) for some i ∈ I(x) and thus Π(z, x); by M3, Π(+_i x · x_i, x) and finally x = +_i x · x_i.

We call the set K = {x_1, x_2, . . . , x_k} a base in Ω. As x · x_j = Θ for j ∉ I(x), we can represent the thing x as x = +_{i=1}^{k} x · x_i, or simply, x = +_i x · x_i.

Theorem 7.39 (A compactness theorem) There exists a finite base in Ω consisting of things x_1, x_2, . . . , x_k for some k with the property that each thing x ∈ Ω admits the representation x = +_i x · x_i.
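In a finite mereological space the Stone construction can be carried out by brute force. The sketch below is an illustration under assumptions: things are the non-empty subsets of a three-element set, filters are found by enumerating all collections of things, and the base K is taken to be the set of singletons. None of these choices is prescribed by the text.

```python
from itertools import combinations

ATOMS = (0, 1, 2)
OMEGA = [frozenset(c) for r in range(1, len(ATOMS) + 1)
         for c in combinations(ATOMS, r)]
V = frozenset(ATOMS)

def is_filter(coll):
    """Proper filter (Definition 7.42): closed under products, which must stay
    in OMEGA and hence are never the empty thing, and closed upwards under I."""
    meets = all((x & y) in coll for x in coll for y in coll)
    upward = all(y in coll for x in coll for y in OMEGA if x <= y)
    return meets and upward

# Enumerate all non-empty collections of things and keep the filters.
FILTERS = [frozenset(c) for r in range(1, len(OMEGA) + 1)
           for c in combinations(OMEGA, r) if is_filter(frozenset(c))]
# Ultrafilters: filters not properly contained in another filter.
ULTRAFILTERS = [f for f in FILTERS if not any(f < g for g in FILTERS)]

def stone_set(x):
    """S(x): the set of ultrafilters containing x (Definition 7.44)."""
    return {f for f in ULTRAFILTERS if x in f}

BASE = [frozenset({a}) for a in ATOMS]    # assumed finite base K

def representation(x):
    """x as the sum of x · x_i over those base things with S(x) ∩ S(x_i) ≠ ∅."""
    pieces = [x & b for b in BASE if stone_set(x) & stone_set(b)]
    return frozenset().union(*pieces)
```

In this model there are exactly three ultrafilters, one for each singleton, the sets S(x_i) for the base things cover the whole Stone space, and `representation(x)` returns x for every thing x, which is the content of Theorems 7.38 and 7.39 in this special case.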
We have an abstract formulation of inclusion-exclusion principle obtained by repeated application of the property T9. Letting yi = xi · j